Machine learning enables automation, insights, and data governance

Harvesting unstructured data and enabling STP for alternatives
Machine learning for alternatives
Hugues Chabanis
Product Portfolio Manager, Alternative Investments at SimCorp
Harald Collet
CEO at Alkymi

Read the article and learn how machine learning can:

  • Empower operations
  • Increase STP for alternative investments
  • Facilitate data insourcing by keeping control of data and governing its value
  • Help meet changing expectations of the workforce

Institutional investors are buckling under the operational constraint of processing hundreds of data streams from unstructured data sources such as email, PDF documents, and spreadsheets. These data formats bury employees in low-value ‘copy-paste’ workflows and block firms from capturing valuable data. This article demonstrates how machine learning (ML) paired with a better operational workflow can enable firms to more quickly extract insights for informed decision-making, and help govern the value of data.


According to McKinsey, the average professional spends 28% of the workday reading and answering an average of 120 emails – on top of the 19% spent on searching and processing data. The issue is even more pronounced in information-intensive industries such as financial services, as valuable employees are also required to spend needless hours every day processing and synthesizing unstructured data.  Transformational change, however, is finally on the horizon. Gartner research estimates that by 2022, one in five workers engaged in mostly non-routine tasks will rely on artificial intelligence (AI) to do their jobs. And embracing ML will be a necessity for digital transformation demanded both by the market and the changing expectations of the workforce.

For institutional investors operating in an environment of ongoing volatility, tighter competition, and economic uncertainty, using ML to transform operations and back-office processes offers a unique opportunity. In fact, institutional investors can capture efficiency gains of up to 15-30% by applying ML and intelligent process automation in operations (Boston Consulting Group, 2019), which in turn creates ‘operational alpha’ through improved customer service and agile processes redesigned front-to-back.


Operationalizing machine learning workflows

ML has finally reached a point of maturity at which it can deliver on these promises. In fact, AI has flourished for decades, but the deep learning breakthroughs of the last decade have played a major role in the current AI boom. When it comes to understanding and processing unstructured data, deep learning solutions provide much higher levels of potential automation than traditional machine learning or rule-based solutions. Rapid advances in open source ML frameworks and tools – including natural language processing (NLP) and computer vision – have made ML solutions more widely available for data extraction.

However, the first wave of ML investments frequently disappointed, with many early adopters yet to reap the rewards. A recent survey of more than 2,500 senior executives, conducted by MIT Sloan Management Review and BCG, highlights the challenges of deploying ML solutions into production. More than 7 in 10 executives found that ML had not delivered the expected business results, and 40% of organizations making significant investments in ML have yet to report business gains from ML.

The critical gap has been in planning for how to operationalize ML for specific workflows. ML solutions should be designed collaboratively with business and process owners and target narrow and well-defined use cases that can successfully be put into production. To paraphrase BCG, successful ML deployments are 10% about algorithms, 20% about technology, and 70% about business application. When ML is used to automate manual workflows in operations, the manual task must be automated end-to-end and implemented with a ‘human-in-the-loop’ design that routes lower-confidence exceptions to employees, generating a critical learning feedback loop so that models keep improving. The user experience (UX) should give employees an intuitive way to accelerate their specific workflow, with transparency, visibility, and reporting at every step along the way.
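The ‘human-in-the-loop’ routing described above can be illustrated with a minimal Python sketch. It is not a vendor implementation: the class names, the field names, and the 0.90 threshold are all hypothetical, chosen only to show how high-confidence extractions pass straight through while the rest are queued for an employee, whose corrections are kept as feedback for later model improvement.

```python
from dataclasses import dataclass, field


@dataclass
class Extraction:
    """One field pulled from a document by a (hypothetical) ML model."""
    document_id: str
    field_name: str
    value: str
    confidence: float  # model score between 0 and 1


@dataclass
class ReviewQueue:
    """Holds low-confidence items and the corrections employees make."""
    pending: list = field(default_factory=list)
    corrections: list = field(default_factory=list)

    def submit_correction(self, item: Extraction, corrected_value: str) -> None:
        # Each correction is stored as a labeled example for later retraining.
        self.pending.remove(item)
        self.corrections.append((item, corrected_value))


def route(items, queue, threshold=0.90):
    """Return items that pass straight through; queue the rest for review."""
    straight_through = []
    for item in items:
        if item.confidence >= threshold:
            straight_through.append(item)
        else:
            queue.pending.append(item)
    return straight_through


# Illustrative data only: two fields from one made-up document.
queue = ReviewQueue()
items = [
    Extraction("notice-001", "call_amount", "1,250,000.00", 0.98),
    Extraction("notice-001", "due_date", "2O20-03-15", 0.61),  # misread character
]
accepted = route(items, queue)
queue.submit_correction(queue.pending[0], "2020-03-15")
print(len(accepted), len(queue.pending), len(queue.corrections))
```

In this sketch the amount is accepted automatically, while the misread date goes to an employee and its correction is retained. In practice, the threshold would be set per field by the business and process owners.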

Asset class deep-dive: Machine learning applied to Alternative investments

In a 2019 industry survey conducted by InvestOps, data collection (46%) and efficient processing of unstructured data (41%) were cited as the top two challenges European investment firms faced when supporting Alternatives.

This is no surprise, as Alternative assets present an acute data management challenge and are costly, difficult, and complex to manage, largely due to the unstructured nature of Alternatives data. This data is typically received by investment managers as emails with a variety of PDF documents or Excel templates that require significant operational effort and human understanding to interpret, capture, and utilize. For example, transaction data is typically received by investment managers as a PDF document via email or an online portal. In order to make use of this mission-critical data, the investment firm has to manually retrieve, interpret, and process documents in a multi-level workflow involving 3-5 employees on average.

The exceptionally low straight-through-processing (STP) rates already suffered by investment managers working with alternative investments are a problem that will worsen as Alternatives become an increasingly important asset class, predicted by Preqin to rise to $14 trillion AUM by 2023 from $10 trillion today.

Specific challenges faced by investment managers dealing with manual Alternatives workflows are:

  • The process is slow and expensive: manual data search and entry are time-consuming, unpredictable, and not scalable.
  • There is a higher incidence of lost or unused data, because email is brittle and the copy-paste process is error-prone.
  • The process is fragmented: business data is scattered across repositories, with limited visibility across the organization.

Within the Alternatives industry, various attempts have been made to use templates or standardize the exchange of data. However, these attempts have so far failed, or are progressing very slowly.

Applying ML to process the unstructured data will enable workflow automation and real-time insights for institutional investment managers today, without needing to wait for a wholesale industry adoption of a standardized document type like the ILPA template.

To date, the lack of straight-through-processing (STP) in Alternatives has either resulted in investment firms putting in significant operational effort to build out an internal data processing function, or reluctantly going down the path of adopting an outsourcing workaround.

However, applying a digital approach, more specifically ML, to workflows in the front, middle and back office can drive a number of improved outcomes for investment managers, including:

  • Providing real-time access to high-value data insights to enable more informed investment decision-making, faster reporting, and better client service
  • Driving a deeper understanding of portfolio risk and exposure by allowing investment managers to make use of more data
  • Enabling highly efficient end-to-end workflows that make data visible, accessible, and available the second it enters the enterprise
  • Eliminating the need for outsourcing, enabling firms to maintain exclusive control and security over the proprietary data that forms their vital edge

Trust and control are essential when automating critical data processing workflows. They are achieved with a ‘human-in-the-loop’ design that puts the employee squarely in the driver’s seat, with features such as confidence scoring thresholds, randomized sampling of the output, and second-line verification of all STP data extractions. Validation rules on every data element can ensure that high-quality output data is generated and normalized to a specific data taxonomy, making data immediately available for action. In addition, processing documents with computer vision can allow all extracted data to be traced to the exact source location in the document (such as a footnote in a long quarterly report).
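The controls listed above could be combined along the following lines. This is a hedged Python sketch under stated assumptions: the field names, the two validation rules, the threshold, and the sampling rate are hypothetical. Source traceability is represented simply as a page number and bounding box stored with each value, and second-line verification is not modeled.

```python
import random
from dataclasses import dataclass
from datetime import date
from decimal import Decimal, InvalidOperation


@dataclass(frozen=True)
class SourceLocation:
    """Where in the document a value was found (assumed representation)."""
    page: int
    bbox: tuple  # (x0, y0, x1, y1) in page coordinates


@dataclass(frozen=True)
class ExtractedField:
    name: str
    raw_value: str
    confidence: float
    source: SourceLocation


def parse_amount(raw):
    """Normalize an amount to Decimal; return None if it is invalid."""
    try:
        amount = Decimal(raw.replace(",", ""))
    except InvalidOperation:
        return None
    return amount if amount >= 0 else None


def parse_iso_date(raw):
    """Normalize a date to ISO format; return None if it is invalid."""
    try:
        return date.fromisoformat(raw)
    except ValueError:
        return None


# Hypothetical taxonomy: one validation rule per data element.
VALIDATORS = {"call_amount": parse_amount, "due_date": parse_iso_date}


def review_reason(field, threshold=0.90, sample_rate=0.05, rng=random):
    """Return why a field needs human review, or None for straight-through."""
    validator = VALIDATORS.get(field.name)
    if validator is None or validator(field.raw_value) is None:
        return "failed validation"
    if field.confidence < threshold:
        return "low confidence"
    if rng.random() < sample_rate:
        return "random sample"  # spot check of otherwise accepted output
    return None


# Illustrative data only.
fields = [
    ExtractedField("call_amount", "1,250,000.00", 0.98, SourceLocation(2, (72, 540, 210, 556))),
    ExtractedField("due_date", "15 March 2020", 0.97, SourceLocation(2, (72, 500, 180, 516))),
    ExtractedField("call_amount", "980,000.00", 0.55, SourceLocation(14, (90, 88, 200, 102))),
]
for f in fields:
    print(f.name, "page", f.source.page, "->", review_reason(f, sample_rate=0.0))
```

Because each field keeps its source location, a reviewer can be taken directly to the page and region where a flagged value was read, which supports the traceability the paragraph describes.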


Reverse outsourcing to govern the value of your data

Big data is often called “the new oil” or “information power”, and many third-party service providers stand ready to help institutional investors extract and organize the ever-increasing volume of unstructured data that is not easily accessible, either because of its format (emails, PDFs, etc.) or its location (web traffic, satellite images, etc.). However, engaging third-party service providers carries risks.

While outsourcing data processing lifts a heavy manual burden from investment firms, it creates another complication, which can be even more costly: giving a third party access to the proprietary data of an institutional investor. Turning over a firm’s data processing to an outside entity means giving it access to the “information power” that fuels the firm’s investment decisions, that is, its alpha generation. What is more, this data will have even greater value to the service provider because it will be pooled with data from many other investment firms, leaving the third party with the power of commingled data while the individual firm only has access to its own universe. If ever there were an appropriate time for the term caveat emptor, this is it.


Embracing ML and unleashing its potential

Investment managers should think of ML as a co-pilot that can help employees in various ways. First, it is fast: documents are processed instantly and, when confidence levels are high, processed data requires only minimal review. Second, ML serves as an initial set of eyes, initiating the proper workflows based on the documents received. Third, instead of collecting just the minimum data required, ML can collect everything, giving users the option to gather and reconcile data that might otherwise have been ignored and lost for lack of resources. Finally, ML will not forget the format of any historical document – from yesterday or 10 years ago – safeguarding institutional knowledge that is commonly lost during cyclical employee turnover.

ML has reached a level of maturity at which it can be applied to automate narrow and well-defined cognitive tasks and can help transform how employees work in financial services. However, many early adopters have paid a price for focusing too much on the ML technology and not enough on the end-to-end business process and workflow.

The critical gap has been in planning for how to operationalize ML for specific workflows. ML solutions should be designed collaboratively with business owners and target narrow and well-defined use cases that can successfully be put into production.

Alternative assets are costly, difficult, and complex to manage, largely due to the unstructured nature of Alternatives data. Processing unstructured data with ML is a use case that generates high levels of STP through the automation of manual data extraction and data processing tasks in operations.

Using ML to automatically process unstructured data for institutional investors will generate ‘operational alpha’: the level of automation necessary to make data-driven decisions, reduce costs, and become more agile.