Read this article and learn about:
- Why buy-side firms (and regulators) increasingly focus on data lineage
- How data lineage concretely impacts buy-side firms
- Why firms must combine multiple approaches to improve the status quo
- What best practices can be leveraged to start building a culture of trust in your data
Olivier Kenji Mathurin, Head of Strategic Research, AIM Software
Data has become as complex as it is vital to a financial institution's success. Asset management firms rely on the accuracy, quality and correlation of their data for trading, accounting and risk mitigation.
A typical situation in an organization
Picture this: an auditor calls and asks, “Can you justify the prices used for last week's NAV calculation?” The analyst must trace the process backwards, starting from the NAV report delivered last week, through the multiple systems involved – reporting, data warehouse, portfolio management, accounting – down to the system that collects asset valuations from different pricing sources and selects the correct price.
The information is all there, but it takes time to investigate.
On that particular day, the Hong Kong stock market was closed due to a typhoon striking the entire city[1]; a pricing exception was raised for each instrument domiciled there. The team of analysts therefore decided to export the suspect records from the system into a spreadsheet, copy in the valuations from the most liquid market, and reimport the corrected values back into the system – ensuring the NAV cut-off deadline was respected.
The analyst reports back to the auditor, having spent several hours investigating, calling, emailing and reviewing technical log files to understand what happened with that price, on that day.
This type of request is far from unusual. All recent regulations, from Dodd-Frank and EMIR to IFRS, AIFMD, Solvency II and MiFID II, have put a major focus on data transparency. Requests from regulators and client auditors who demand “as-of” determination – how the data was arrived at, and the sources and calculation methods used – have become part of the daily routine. One global asset manager I work with reported more than 14 due-diligence meetings conducted every year by one European regulator – each one requiring a deep-dive review of the pricing methods in use and of historical information.
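To make the idea of “as-of” determination concrete, here is a minimal sketch – not the system described above, and with purely illustrative names, dates and values – of the kind of structured lineage record that would let an analyst answer the auditor's question in minutes rather than hours:

```python
# Minimal sketch of a lineage-aware price record. All names, dates and values
# are illustrative assumptions, not data from a real system.
from dataclasses import dataclass, field
from datetime import date
from typing import List

@dataclass
class LineageEvent:
    step: str    # e.g. "collected", "exception raised", "manually overridden"
    system: str  # application or spreadsheet where the step happened
    actor: str   # user, feed or rule that performed the step
    detail: str  # source, rule or justification applied

@dataclass
class PriceRecord:
    instrument: str
    as_of: date
    value: float
    lineage: List[LineageEvent] = field(default_factory=list)

def explain(record: PriceRecord) -> str:
    """Reconstruct, step by step, how a price was arrived at."""
    header = f"{record.instrument} = {record.value} as of {record.as_of}:"
    steps = [f"  - {e.step} in {e.system} by {e.actor}: {e.detail}"
             for e in record.lineage]
    return "\n".join([header] + steps)

# The typhoon scenario above, captured as lineage rather than buried in log files.
price = PriceRecord("Illustrative HK-listed equity", date(2024, 9, 6), 67.10, [
    LineageEvent("collected", "pricing hub", "vendor feed",
                 "exchange closed, no price delivered"),
    LineageEvent("exception raised", "pricing hub", "stale-price rule",
                 "price older than one business day"),
    LineageEvent("manually overridden", "spreadsheet", "pricing analyst",
                 "valuation copied from the most liquid market"),
])
print(explain(price))
```

With a record like this, the answer to “can you justify this price?” becomes a simple query rather than a forensic exercise across log files and emails.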
Why is it so hard?
Understanding data lineage is typically hampered by three main issues:
- Distance to access the information: Massive amounts of market data flow into a buy-side organization. The data is manipulated, extracted and reworked by different functions and within different systems. Although the information exists, it is often spread across different systems in different organizational silos. It is also captured in technical log files and databases that require IT staff to extract it before it becomes usable for business users (see the sketch after this list).
- Gaps in ownership of data, standards and enforcement processes: With dozens of applications and myriad repositories and data models, the challenge of data lineage is more than daunting. Without mature data governance to close these gaps, access to data lineage information will not improve.
- Usage of spreadsheets: Spreadsheets are popular due to their ease of use when executing complex business processes. However, they are also difficult to control, and typically run outside of data management processes and controls. The risk that mission-critical information is lost or altered remains a real concern to operational managers and, increasingly, to regulators.
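As an illustration of the first issue, the sketch below – with hypothetical system names, log layouts and timestamps – shows how lineage fragments scattered across silos might be consolidated into a single business-readable timeline for one data element, removing the dependency on IT staff and technical log files:

```python
# Minimal sketch of consolidating lineage fragments scattered across silos.
# System names, log layouts, timestamps and the merge key are illustrative
# assumptions, not a real integration.
from datetime import datetime
from typing import List

# Fragments as they might sit in each silo's technical logs (made-up examples).
pricing_hub_log = [
    {"ts": "2024-03-07T17:02:00", "instrument": "XYZ", "event": "vendor price received"},
    {"ts": "2024-03-07T17:05:00", "instrument": "XYZ", "event": "stale-price exception raised"},
]
spreadsheet_audit = [
    {"ts": "2024-03-07T17:40:00", "instrument": "XYZ", "event": "value overridden by analyst"},
]
accounting_log = [
    {"ts": "2024-03-07T18:10:00", "instrument": "XYZ", "event": "price used in NAV calculation"},
]

def consolidated_lineage(instrument: str) -> List[str]:
    """Merge per-system fragments into one chronological, readable timeline."""
    fragments = [
        ("pricing hub", pricing_hub_log),
        ("spreadsheet", spreadsheet_audit),
        ("accounting", accounting_log),
    ]
    events = [
        (datetime.fromisoformat(row["ts"]), system, row["event"])
        for system, rows in fragments
        for row in rows
        if row["instrument"] == instrument
    ]
    return [f"{ts:%Y-%m-%d %H:%M} [{system}] {event}" for ts, system, event in sorted(events)]

print("\n".join(consolidated_lineage("XYZ")))
```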
The visible cost of a lack of data lineage is the amount of time spent on data forensics. A case study from IDC[i], Data Lineage Management: Impact and Value, shows that data stewards can spend 30-50% of their time on data forensics when responding to requests from business users.
A critical need is to answer these requests very fast – to explain what happened to a particular portfolio or for a specific client.
One thing is clear: new regulations and revisions will continue to pressure firms for more granular transparency on the data reported – thus increasing costs for financial services firms.
- ESMA has already announced[ii] that upcoming revisions of EMIR and MiFID will look to improve the quality of the data reported
- Basel III’s FRTB will require firms using internal models to maintain high-quality, granular historical data and to keep track of long risk-factor histories
- IFRS 9 will introduce comprehensive data requirements, with a specific need to source loan-origination information.
The hidden costs of unknown data provenance
Beyond the direct costs of data forensics, consider the impacts on the organization – and the related costs and risks – when data provenance and data quality controls are not known:
- Redundant data control activities: The same or duplicate controls are often performed several times in different departments, because there is no shared view of the controls already applied to the data element received.
- Incorrect bookings: One asset manager I know of spends several hours per month correcting bookings because the data is not fit for purpose. These corrections also involve further data forensics activities.
- Data quality streams in analytics or reporting initiatives: Projects such as risk data aggregation, customer analytics, data warehouses, reporting or IBOR will often embed a data quality stream to ensure, at the point of consumption, that data is quality-controlled prior to usage – even though it has probably already been controlled upstream.
- Increased data costs: Different departments often decide to acquire data directly from the vendor, even when that data is probably already available to them.
- Market data usage and compliance risk: With stricter usage agreements, data vendors demand increasingly detailed information on data usage and distribution. Inability to relate data provenance and usage exposes the firm to difficult contract negotiations and compliance risks that can incur massive additional data costs.
- Accuracy of analytics and models: Difficulty investigating why models produce sub-optimal outcomes – in particular when back-testing data contains look-ahead bias.
- Client reporting: Inability to determine the provenance of reported values, or delays in doing so, can lead to client risk and reputational risk.
- Slowdown of growth and M&A initiatives: Lack of data provenance information makes it significantly harder to integrate data sets from another entity.