So you know you need a Data Warehouse. Now what? Build or buy?

The majority of investment managers understand the need for a data warehouse solution, but the decision whether to build it themselves or buy it from their investment management solution provider is hazier. It doesn’t have to be.

As someone who has previously worked at a major asset manager and been a consultant in the industry, I have seen the benefits and pitfalls of the different approaches to bringing a data warehouse into an organization.

I understand the potential appeal of building your own data warehouse. It is a bit like the appeal of designing and building your own house: you get to choose the materials and decide on all aspects of the design. But, as with every building project, there is a risk that the house will not be delivered on time and that the costs will be higher than anticipated. The same goes for building data warehouses.

Decision drivers

Before making the decision to build or buy, there are a lot of aspects to consider. What are your organizational and internal competencies? Can you even build your own data warehouse, or will you need to hire experts or consultants? And what about your IT landscape and strategy: does it require you to use a specific tool, or to have as few providers as possible?

A crucial point is your requirement for time-to-value. When do you actually need to start using your data warehouse? Are you looking at a six-month timeframe, or three years? Finally, how complex are your reporting and data requirements compared to the market? Do you have unique requirements, or could a standard solution cover the vast majority of your needs?

These are all important questions, and they need to be answered before you embark on this journey so that you can choose the best option.

Building versus buying a data warehouse

Building blocks

If we stick with the house analogy, there are elements in the process that you could choose to either buy or build. For the data warehouse I see three elements: the tool, which is the software used to build the solution; the model, which reflects the business entities and design; and the extracts, which take data from your source systems.

The tool

Do you want to build your ETL mechanism yourself? There is an operational risk involved in being your own application provider, not only because it might not be one of your core capabilities, but also because it will require updating every time your other solutions are upgraded. It’s important to note that when you buy from an established vendor, you are leveraging years of experience that they have built up in this area across a broad range of clients. If your data load is very complex, building it yourself is not recommended.

If you do choose to build your own tool to load data, you must also expect a significant wait before you can even start building your data warehouse, and even more time before it is functional. On the other hand, one of the benefits is that you get full transparency into what is happening in your code, and thus control over what is happening behind the user interface.

The model

Almost everyone, myself included, underestimates how long it takes to design and structure the business entities in a data warehouse, often assuming that it will take only a few months. In fact, in the investment management industry it can take a couple of years to design the full spectrum of the business domain, because it is an iterative process involving several stakeholders.

When you buy, you are also buying other clients’ input and experience, all of which enhances the product compared to doing it solo. Building, however, gives you the flexibility to design a setup that better reflects your unique business model. But at the end of the day, how unique are your data requirements compared to your peers? A standard model might still let you add customizations if necessary.

The extracts

How well do you know the structure of your primary data sources? It usually requires a lot of in-depth knowledge of the software applications to fetch the right data from the right database tables. However, if you have this knowledge in your organization, building your own extracts might give you a full overview of what is going on.

On the other hand, when you buy, you get better integration and a more holistic solution, as the vendor can take the data warehouse into account before application changes are implemented. For example, Release 6.3 of SimCorp Dimension next year will impact the underlying data structure, and manual changes to your own data warehouse will be needed if you have chosen to build it yourself.

Rather than having two separate products that follow different roadmaps and focus areas, it makes a huge difference to have the same vendor focusing on the bigger picture and improving both at the same time. This of course requires that your software vendor has a well-built data warehouse that can support your needs. If they do, then the argument is strong: buy it from them and enjoy the short- and long-term benefits that will come.

Total delivery from your software vendor

One thing that I would add (and that has come to mind since I started working at SimCorp) is the significant benefit of having the software vendor (in this case SimCorp) deliver a Data Warehouse solution as a standard offering.

SimCorp knows what happens to the tables the software modules rely on, which makes it much easier to build the right reporting solution from the start. Having the same people who design the software modules be part of the team that decides how the data should flow to the Data Warehouse makes an enormous difference.

To sum up using the house analogy, this corresponds to having one vendor responsible for delivering your dream home, making sure that all the parts fit together and that you can move in on time, on budget and with few surprises.