Big data

What it is and what it might mean to investment managers

Read the article and learn about:

  • What is big data and what is it not
  • What big data means to investment managers
  • A brief technology summary
  • Getting started with big data
  • Some big data solutions in practice
  • What it takes to win with big data

About the author:

Anders Kirkeby is Director for Technical Product Management, SimCorp, London, UK

Anders Kirkeby is currently based in London leading a couple of teams charged with managing the future of the technical platform, operations, and user experience in the SimCorp Dimension® investment management software product.

With the era of big data upon us, few in the investment management industry doubt its disruptive force – only how to apply it effectively. This article outlines the big data concept and questions its practical implications for investment managers. We set the parameters for assessing asset management systems and technology to prepare and prosper as big data takes hold.

As one of the most discussed applied IT industry topics, big data has a lot of hype surrounding it. By all accounts, big data is to the current technology discussion what the cloud was just a few years back. A key question that is often raised in all the talk is whether or not big data technologies have any profound implications for the investment management industry. Can such technologies actually prove of benefit to investment managers and provide competitive advantage?

This article paints a broad-brush picture of the big data concept, and what it means for investment managers. First, we provide a brief summary of the concept; then we relate it to examples of its successful application in other industries. We go on to conjecture why it may be relevant to the investment management industry – and why it may not. Finally, we outline an approach for evaluating systems and practices to gear up and thrive in the disruptive big data era.

Big data: a brief technology summary

Big data as a concept is about combining data, in particular unstructured data, from multiple sources to produce new insights that were previously impossible or hard to attain. The enabler for big data is the combination of cheap, highly scalable computational processing capability and a rapidly growing set of new digital data streams, with data often expressed in self-describing, standards-based formats.

The inherent challenges in big data are often summarized as three Vs for Volume, Velocity, and Variety (see Figure 1). Sometimes these are supplemented with at least one additional V for Veracity since it is very important to manage the quality of the data.

Figure 1. Diagram illustrating the big data concept in action. Source: Datameer, Inc., 2014.

Some of the core technologies in play include Apache™ Hadoop™, an open source implementation modeled on Google’s data storage and processing infrastructure.1 Hadoop draws on a programming model called MapReduce, whose ideas were already in use in the 1970s in functional programming languages for cases where not all data could be loaded into memory (see Figure 2). But that is just one of the more important technologies. Recent years have seen many new technologies enter the market to support various aspects of big data needs.2

Figure 2. Figure illustrating how several of the new technologies associated with big data fit together. Source: ‘Exploring NoSQL, Hadoop and HBase’, Pettine, R. and Wadie, K., 2013.
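To make the MapReduce programming model mentioned above more concrete, the following is a minimal, single-machine sketch in Python. It is not Hadoop code and does not use any Hadoop API; the function names and the sample trades are hypothetical and only illustrate the three steps of the model: map each record to key/value pairs, group the pairs by key, and reduce each group to one result.

```python
from collections import defaultdict
from functools import reduce


def map_phase(record):
    """Emit (key, value) pairs for one input record: here (ticker, quantity)."""
    ticker, quantity = record
    yield ticker, quantity


def shuffle(pairs):
    """Group emitted values by key, as a framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single result (a net quantity)."""
    return key, reduce(lambda a, b: a + b, values)


def map_reduce(records):
    """Run the map, shuffle, and reduce steps over an iterable of records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical sample data: (ticker, signed quantity traded).
trades = [("AAA", 100), ("BBB", 50), ("AAA", -30), ("CCC", 10), ("BBB", 25)]
totals = map_reduce(trades)
print(totals)
```

Because each record is mapped independently and each key is reduced independently, a real framework can spread both steps across many machines, which is what makes the model attractive when the data does not fit in the memory of one computer.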

These technologies often run on cloud platforms with elastic demand pricing models. They combine with new query technologies that support time-based query clauses, NoSQL stores, engines that can query a multitude of schemas simultaneously by inferring sufficient commonality between them, new dense data formats, and on-demand data sources such as web services.

However, this does not spell the end of the relational database. Each technology has its own pros and cons, and most of the new big data-oriented technologies are poorly equipped to provide the transactional integrity offered by the regular database platforms that are at the core of many investment management processes.

Big data as a phenomenon is made possible by new and cheaper technologies. However, as with any project, there is a danger in letting a focus on a particular technology take the place of a concrete business problem.

Simply put, you need to have a reasonably clear idea as to what big data should do for you before you invest in it. Yet some industry commentators are advocating acquiring technology simply to be technologically prepared.3 But big data is still at an early stage in the industry and as a consequence, it will be hard and risky to acquire technology without first identifying where the value is.

New insights – two approaches

Big data is about insight – getting new insights that were hard to come by before. There are essentially two approaches to produce such insights. One is to start with a hypothesis, figure out which kind of data can validate or invalidate the hypothesis, and then build and run the query.

The hypothesis approach is reasonably cost-effective since it is targeted and can be used for one-off research and for specific reproducible analytical results. In the past few years, several national governments have introduced open data programs to let citizens explore and combine national and municipal data to promote democratic values.

The alternative approach is the ‘build-it-and-they-will-come’ approach. Here you may make a series of informed guesses as to which data sources and structures may hold insights, and then provide a platform that data scientists or ‘amateurs’ can use to explore the data to discover new insights at will.

Results have probably been less impressive than some may have hoped for, but there are several interesting successes too. Most of the successes are found among online services that take the hypothesis approach, with reproducible results, to provide a very narrow but useful service.

What is big data and what is it not?

For example, a new generation of expenses system might correlate data from individual receipts with the reasons all users enter for their claims and with data from their calendars to make increasingly good guesses and ultimately automate the process of filling in expense claims. This is big data because:

  • the solution combines multiple sources of data in different formats;
  • some of the data is not yet digital;
  • some of the data is not structured; and
  • the results are provided on a best-effort basis – there is no notion of a complete dataset.

Closer to home in the area of securities trading, a matching add-on solution that uses past matching results to make increasingly good guesses and ultimately automate the matching processes could also be viewed as a big data solution because it:

  • combines structured data that has yet to be processed with machine learning on top of past results; and
  • provides results on a best-effort basis – hence, rules can be introduced to grade the guesses as a guideline.
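As an illustration of how past matching results could be used to grade new guesses, here is a deliberately simple Python sketch. It is an assumption-laden toy, not a description of any vendor's matching engine: the field names, the thresholds, and the functions `learn` and `grade` are all hypothetical. It records how often each pattern of field agreement led to a confirmed match in the past, and then grades a new candidate pair by that historical rate.

```python
# Hypothetical fields compared between two sides of a trade.
FIELDS = ("isin", "quantity", "price", "settle_date")


def agreement(a, b):
    """Return which fields agree between two records, as a tuple of booleans."""
    return tuple(a[f] == b[f] for f in FIELDS)


def learn(history):
    """Count confirmed matches per agreement pattern.

    history is a list of (record_a, record_b, was_match) from past results.
    Returns {pattern: (confirmed_matches, total_seen)}.
    """
    stats = {}
    for a, b, was_match in history:
        pattern = agreement(a, b)
        matched, total = stats.get(pattern, (0, 0))
        stats[pattern] = (matched + int(was_match), total + 1)
    return stats


def grade(stats, a, b, auto=0.95, review=0.5):
    """Grade a candidate pair; the thresholds are illustrative rules only."""
    matched, total = stats.get(agreement(a, b), (0, 0))
    if total == 0:
        return "review", None  # never seen this pattern: leave it to a person
    rate = matched / total
    if rate >= auto:
        return "auto-match", rate
    if rate >= review:
        return "review", rate
    return "no-match", rate


def rec(isin, quantity, price, settle_date):
    """Build one hypothetical trade record."""
    return {"isin": isin, "quantity": quantity,
            "price": price, "settle_date": settle_date}


base = rec("XS0000000001", 100, 99.5, "2014-06-03")
price_off = rec("XS0000000001", 100, 99.6, "2014-06-03")
other_isin = rec("XS0000000002", 100, 99.5, "2014-06-03")

history = [
    (base, base, True), (base, base, True), (base, base, True),
    (base, price_off, True), (base, price_off, False),
    (base, other_isin, False), (base, other_isin, False),
]
stats = learn(history)
print(grade(stats, base, base))
print(grade(stats, base, price_off))
print(grade(stats, base, other_isin))
```

The grades play the role of the rules mentioned in the list above: only high-confidence guesses are automated, while uncertain ones are routed to a person, and the historical rates improve as more results accumulate.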

Alpha generation based on some types of market sentiment would also be big data. If such a service is used as an automated or semi-automatic portfolio management strategy based on sense-making and text-mining in Twitter and news feeds, it could definitely be considered as a big data solution. Such a service could be very powerful if it were combined with the right data but the most valuable data would in most cases not be reachable. Bloomberg already provides such a sentiment data service.
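To give a flavor of what text-mining for sentiment involves at its very simplest, the sketch below scores headlines against two small word lists and turns the average score into a coarse signal. Everything in it is hypothetical: the word lists, the threshold, and the headlines are invented for illustration, and a production service would rely on far richer language models and data. It is not a description of how any commercial sentiment feed works.

```python
import re

# Tiny, hypothetical sentiment lexicons; real systems use far larger models.
POSITIVE = {"beats", "upgrade", "growth", "record", "strong"}
NEGATIVE = {"misses", "downgrade", "loss", "fraud", "weak"}


def score(text):
    """Score one text between -1 (all negative words) and +1 (all positive)."""
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos + neg == 0:
        return 0.0  # no lexicon words found: treat as neutral
    return (pos - neg) / (pos + neg)


def signal(headlines, threshold=0.25):
    """Turn the average headline score into a coarse signal."""
    if not headlines:
        return "neutral"
    average = sum(score(h) for h in headlines) / len(headlines)
    if average >= threshold:
        return "positive"
    if average <= -threshold:
        return "negative"
    return "neutral"


# Invented headlines about a fictional company.
headlines = [
    "Acme beats forecasts on strong growth",
    "Analysts upgrade Acme",
    "Acme reports small loss",
]
print([score(h) for h in headlines], signal(headlines))
```

Even this toy shows why the choice of data matters more than the arithmetic: the signal is only as good as the texts that feed it and the vocabulary used to read them.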

Big data in investment management

Big data could come into play for the investment management industry through the practical application of big data analytics in the compliance and regulation functions, as well as in promoting business by obtaining advanced insights into clients, operations, and financial markets.4

Disruptive market forces including greater regulatory complexity and scrutiny, more competition, globalization, shifting investor preferences, and tighter operational budgets are spurring firms to integrate their governance, risk and compliance (GRC) management functions.5 A recent example of this trend has seen several firms signing up for big data-based services to capture environmental, social, and governance aspects of clients, investments, and partners alike, primarily as a means to handle reputational risk.6

These forces, including the strategic impact of greatly increased reporting requirements, raise a major data challenge. Agile operating platforms need to be in place to collect, aggregate, and report data across multiple regulatory regimes, regions, and formats – all in real time. Here big data analytics has an important role to play as a driver of change in how investment data is managed.7

Other practical uses include fraud detection. Firms can harness and integrate different datasets to detect or deter fraud by employees or hackers who exploit disparate databases to their advantage.

Lastly, with profits still lagging in the industry, it is important to increase efficiency in order to reduce costs or enable new revenue-driving activities. Investment management enterprise architectures are complex, with multiple data flows that are hard to measure, which makes it difficult to establish effective metrics.

Here another use of big data methods is to apply process mining to data and logs. The aim is to produce actionable facts to reduce costs or to increase performance. Previous studies at dozens of investment management firms applied big data methodology to highlight which parts of the investment management processes could benefit the most from an overhaul. These studies consistently found substantial performance improvements after addressing the identified bottlenecks – typically in the range of 20 to 250 basis points (bps).8
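The idea of mining logs for bottlenecks can be sketched in a few lines of Python. The example below is a simplified illustration, not the methodology of the studies cited above: the event log, the activity names, and the functions `transition_minutes` and `bottleneck` are hypothetical. It reconstructs each trade's path from timestamped log entries and reports the handover between two steps that takes the longest on average.

```python
from collections import defaultdict
from datetime import datetime


def transition_minutes(events):
    """Average minutes spent between consecutive activities.

    events is an iterable of (case_id, activity, ISO timestamp) log entries.
    Returns {(from_activity, to_activity): mean minutes}.
    """
    by_case = defaultdict(list)
    for case_id, activity, stamp in events:
        by_case[case_id].append((datetime.fromisoformat(stamp), activity))

    durations = defaultdict(list)
    for steps in by_case.values():
        steps.sort()  # order each case's steps by time
        for (t0, a0), (t1, a1) in zip(steps, steps[1:]):
            durations[(a0, a1)].append((t1 - t0).total_seconds() / 60)

    return {k: sum(v) / len(v) for k, v in durations.items()}


def bottleneck(events):
    """Return the slowest transition and its mean duration in minutes."""
    means = transition_minutes(events)
    return max(means.items(), key=lambda item: item[1])


# Hypothetical event log for two trades, T1 and T2.
log = [
    ("T1", "order", "2014-06-02T09:00:00"),
    ("T1", "execute", "2014-06-02T09:05:00"),
    ("T1", "confirm", "2014-06-02T09:50:00"),
    ("T1", "settle", "2014-06-02T10:00:00"),
    ("T2", "order", "2014-06-02T10:00:00"),
    ("T2", "execute", "2014-06-02T10:03:00"),
    ("T2", "confirm", "2014-06-02T11:10:00"),
    ("T2", "settle", "2014-06-02T11:15:00"),
]
print(bottleneck(log))
```

In this invented log the handover from execution to confirmation stands out, which is the kind of actionable fact that would point to where an overhaul is likely to pay off.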

Getting started with big data

Convinced there may be something to big data, how do you get started? Lots of column space has been spent on how ‘data scientist’ is the new job title to have. The idea is that data scientists will sift through data, make connections where data islands exist today, and ultimately spot new patterns, trends, and correlations.

Data scientists will be tasked to produce new insights. Since the insights derive from the unique combination of your particular data and the skills of the data scientist, new and original insights will emerge to add value to the business and provide scope for competitive growth. This will be a good recipe for creating new product ideas or service concepts and potentially even a unique selling point for your firm.

The point here is that you may or may not have the data already, but the insights concealed in the data do not appear on their own. They have to be teased out by someone with the right statistical skills, business understanding, and instinct. Without a data scientist or other specialists leading this task, big data will in all likelihood fall short of delivering on its promise.

Some big data solutions in practice

Several vendors of general database and business intelligence tools are marketing their big data solutions. Systems may be required but they have to be tailored very carefully to your particular data landscape and business needs. Even then you will need somebody charged with distilling actionable insight and driving action.

One example of big data solutions for the investment management industry is in the area of securities trading. With growth mushrooming in the volumes, sources, and complexities (including unstructured data) of trading information, firms need to access and analyze data as accurately and swiftly as possible – preferably in real time – in order for it to be of value.

To handle the data management challenges that such growing volumes entail, firms need to bring their securities trading data infrastructure under control with the help of big data analytics. Doing so will deliver competitive advantage. This and other cases illustrate the game-changing potential as seen in other industries, such as consumer staples, e-commerce, healthcare, and telecommunications, where big data analytics is already in full play.9

Another interesting example of machine learning that leverages big data methods to sift through reams of unstructured data comes from the venture capital space. A Hong Kong VC firm has effectively appointed a machine intelligence to its board. The new board member will continuously analyze anything related to a very specific life sciences space to identify and rank potential investments.10

The VC example hints at something that could prove disruptive for the investment management industry. Similar agents could soon be employed in investment management to move from the relatively simple algorithms used in automated trading systems to automated big data-based asset allocation in active portfolio management.

What it takes to win with big data

State Street concludes in a recent report: “It’s not a surprise that so many executives in our survey agree that data is now a source of competitive advantage... The pace of technological change combined with the disruptive industry trends described in this report mean that organizations must continue to invest to retain a data advantage… Institutional investors that can extract deep insights from data will have a huge advantage.”11

With exponential growth in data, increasingly competitive markets, and other disruptive market forces in play, investment managers must respond to the big data challenge and adapt to gain every advantage available. But having access to the right data and the computational power to process big data does not by itself produce meaningful results. To cut through the hype and produce real results that shift the bottom line, investment managers need to allocate the talent to identify the right potential insights and then invest in technology to help produce these insights.

While big data analytics can serve as the basis for initiatives to drive the business forward, making analytics operational involves automating and embedding it directly into business processes, allowing it to inform, prescribe, and facilitate decision-making.

Investment managers need to identify and carefully examine the requirements for laying a solid technical foundation that is capable of supporting big data analytics at the operational level. What is crucial is for investment managers to understand not only the big data concept in principle but also how the technology can tangibly benefit their own firms and help create competitive advantage.


1 – http://www.datameer.com/product/big-data.html

2 – ‘Exploring NoSQL, Hadoop and HBase’, Pettine, R. and Wadie, K., 2013.

3 – ‘Turning Big Data into a Dashboard for Investment Managers’, Wall Street & Technology: http://www.wallstreetandtech.com/data-management/turning-big-data-into-a-dashboard-for-investment-managers/a/d-id/1266018

4 – http://www.datameer.com/solutions/financial-services.html

5 – http://www.risk.net/operational-risk-and-regulation/advertisement/2363604/governance-risk-and-compliance-survey-2014

6 – http://en.wikipedia.org/wiki/RepRisk

7 – ‘Chasing Alpha: How Data and Analytics Help Alternative Asset Managers to Outperform the Pack’, State Street, 2014.

8 – ‘The Investment Book of Record: Can You Compete Without One’, SimCorp StrategyLab, 2014.

9 – ‘Blueprints for Big Data Success: Succeeding with Four Common Scenarios’, Pentaho, 2014.

10 – ‘Deep Knowledge Venture’s Appoints Intelligent Investment Analysis Software VITAL as Board Member’: http://www.prweb.com/releases/2014/05/prweb11847458.htm

11 – ‘Leader or Laggard: How Data Drives Competitive Advantage in the Investment Community’, State Street, 2013.