Capital markets big data ‘still at an early stage’ says Thomson Reuters
Most capital markets firms are still not using big data and even those that do often lack a concerted strategy, according to a new report commissioned by Thomson Reuters.
The survey, which was carried out by analyst firm Aite, reported that only 5% of the 423 firms contacted felt willing or able to talk about their big data programmes, and that those with an active programme were usually focused only on specific, narrow areas. The findings suggest that relatively little has changed since the Centre for Economics and Business Research released a report in June 2013 which said that current accounting methods do not capture the importance of big data, and that a lack of awareness of data’s potential hampers policy decision-making.
Big data is defined as a strategy or technology that deals with data problems that are too large, too fast or too complex for conventional database or processing technology. The concept is often linked to velocity – the speed of data delivery and processing; volume – the amount of data that must be managed or processed; and variety – the range of different data sets that must be dealt with, including structured and unstructured data. Big data is often measured in petabytes (10¹⁵ bytes). For example, eBay runs two data warehouses storing 87 petabytes of data in total.
Current investments in big data are largely focused on revenue generation at the front office. Firms are looking to big data for insights, speed of response and future scalability. It should come as no surprise, therefore, that the most popular use cases are analytics for trading and quantitative research. The report notes that in the capital markets, areas such as tick data and high-frequency trading (HFT) provide good examples of large data sets and high-velocity data.
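To give a sense of what tick-data analytics involves, here is a minimal sketch in Python (the field names and values are hypothetical, for illustration only) that computes a volume-weighted average price from a handful of ticks – in practice the same calculation would run across billions of records:

```python
# A minimal sketch: computing volume-weighted average price (VWAP)
# from tick data. Fields and values are hypothetical examples.
ticks = [
    {"price": 100.10, "volume": 500},
    {"price": 100.12, "volume": 1200},
    {"price": 100.08, "volume": 800},
]

# VWAP = sum(price * volume) / sum(volume)
vwap = sum(t["price"] * t["volume"] for t in ticks) / sum(
    t["volume"] for t in ticks
)
print(f"VWAP: {vwap:.4f}")
```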
One of the challenges facing capital markets firms is combining siloed data sets, such as those held for different asset classes, and cross-referencing the data in near real time. The need to do so is being driven by regulations such as EMIR, MiFID II and Dodd-Frank, all of which mandate greater transparency, and by changing economic realities and shifting flows following the financial crisis. However, the Thomson Reuters report warns that inadequate technical knowledge is currently holding back progress.
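As an illustration of the cross-referencing problem, a minimal Python sketch (using the pandas library; the data sets and field names here are hypothetical) joins trade records from two asset-class silos into a single counterparty-level exposure view:

```python
import pandas as pd

# Hypothetical siloed data sets: trades captured separately per asset class
equity_trades = pd.DataFrame({
    "counterparty_id": ["CP1", "CP2", "CP3"],
    "equity_notional": [1_200_000, 850_000, 430_000],
})
derivative_trades = pd.DataFrame({
    "counterparty_id": ["CP1", "CP3", "CP4"],
    "derivative_notional": [5_000_000, 2_100_000, 900_000],
})

# Cross-reference the silos on a shared identifier to build one
# counterparty-level view, of the kind transparency rules demand
exposure = equity_trades.merge(
    derivative_trades, on="counterparty_id", how="outer"
).fillna(0)
exposure["total_notional"] = (
    exposure["equity_notional"] + exposure["derivative_notional"]
)
print(exposure)
```

The hard part in production is not the join itself but doing it continuously, at scale, across dozens of systems that were never designed to share identifiers.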
About half of the firms that responded to the survey had hired, or planned to hire, a data scientist within the next two years, and about the same proportion had already made some investment in big data. But the report emphasised that data scientists need more than strong mathematical ability: they also need communication skills and business aptitude, as data questions become more important within financial institutions.
“If the data scientist does not understand the industry dynamics and drivers, then there is much higher chance of data misinterpretation as they are unable to properly sanity check the models and assumptions they are working with,” warns Aite in the report.
Big data can also be combined with cloud technology, as is often the case outside the capital markets. For example, Cloudera was founded in 2008 to deliver Apache Hadoop, a Java-based framework widely associated with big data processing. The main advantages are the cost reduction and scalability that come with cloud, because the model largely eliminates upfront IT investment and the ongoing cost of maintaining hardware and software. However, capital markets firms remain cautious about using public cloud technology in commercially sensitive areas. Large sell-side firms tend to prefer private clouds, but these are generally more costly, and Aite points out that they may not bring down the costs of big data support “in any significant manner”.
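To give a flavour of the Hadoop programming model, here is a minimal Hadoop Streaming sketch in Python (the record layout is hypothetical) that aggregates traded volume per symbol. Hadoop runs the mapper over the input in parallel across the cluster and feeds its sorted output to the reducer:

```python
#!/usr/bin/env python3
# mapper.py -- emits (symbol, volume) pairs from tick records on stdin
import sys

for line in sys.stdin:
    # Hypothetical record layout: timestamp,symbol,price,volume
    fields = line.strip().split(",")
    if len(fields) == 4:
        print(f"{fields[1]}\t{fields[3]}")
```

```python
#!/usr/bin/env python3
# reducer.py -- sums volumes per symbol (input arrives sorted by key)
import sys

current_symbol, total = None, 0
for line in sys.stdin:
    symbol, volume = line.strip().split("\t")
    if symbol != current_symbol:
        if current_symbol is not None:
            print(f"{current_symbol}\t{total}")
        current_symbol, total = symbol, 0
    total += int(volume)
if current_symbol is not None:
    print(f"{current_symbol}\t{total}")
```

Illustratively, the pair would be submitted via the Hadoop Streaming jar with the mapper and reducer scripts and input/output paths as arguments; the appeal of the model is that the same two small scripts scale from megabytes to petabytes as nodes are added.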
Despite the challenges, the report suggests that the future is likely to see much more focus on client retention, compliance function support, and enterprise risk management and governance. In particular, as tough new regulations increase the emphasis on transparency, data management is expected to become more important. Education and viable use cases are also expected to drive adoption in the years ahead.
“A good example of a revenue generating focus for big data is in the area of sentiment analysis,” said the report. “Big data can be used to drive front office trading strategies, as well as to determine the valuation of individual securities. Using this information [including Twitter and blogs], traders are able to determine whether various market participants are bullish or bearish and formulate investment strategies accordingly.”
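A minimal sketch of how such a signal might be formed follows; the scores and thresholds below are hypothetical, and in practice the inputs would come from natural-language scoring of news and social media at scale:

```python
from statistics import mean

# Hypothetical pre-scored sentiment items for one security
# (score in [-1, 1]: negative = bearish, positive = bullish)
news_scores = [0.6, 0.2, -0.1, 0.8]    # e.g. scored news headlines
tweet_scores = [0.3, -0.4, 0.5, 0.7]   # e.g. scored tweets

def sentiment_signal(scores, bullish=0.2, bearish=-0.2):
    """Map an average sentiment score to a simple trading stance."""
    avg = mean(scores)
    if avg > bullish:
        return "bullish", avg
    if avg < bearish:
        return "bearish", avg
    return "neutral", avg

stance, avg = sentiment_signal(news_scores + tweet_scores)
print(f"aggregate sentiment {avg:+.2f} -> {stance}")
```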
The report also adds that sentiment and news can be used to produce a price for a security, which can then be compared with the market value to see whether it is undervalued or overvalued, providing an opportunity for arbitrage. Big data can also be used to better meet regulatory compliance requirements. For example, Dodd-Frank requires financial institutions to respond to regulatory investigations within a 72-hour period – an obligation that often entails trade reconstruction and reporting. A big data strategy that helps a large sell-side firm organise its data better, for example by consolidating a network of 100 data warehouses into a more manageable system, would make such responses considerably easier. Firms also need to recognise that their data storage strategy must change through a focus on tiered storage, Aite added, placing the most important data sets on faster devices while keeping other sets less readily accessible but more cheaply stored.
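A minimal sketch of such a tiered-storage policy, assuming hypothetical metadata about each data set and purely illustrative thresholds:

```python
# Hypothetical catalogue of data sets with access-frequency metadata
DATASETS = [
    {"name": "live_ticks",      "accesses_per_day": 50000, "age_days": 1},
    {"name": "eod_prices_2014", "accesses_per_day": 120,   "age_days": 90},
    {"name": "trades_2009",     "accesses_per_day": 2,     "age_days": 1800},
]

def assign_tier(ds):
    """Place hot data on fast devices, cold data on cheap storage."""
    if ds["accesses_per_day"] > 1000:
        return "tier-1: in-memory / SSD"
    if ds["accesses_per_day"] > 10 or ds["age_days"] < 365:
        return "tier-2: local disk"
    return "tier-3: archival storage"

for ds in DATASETS:
    print(f'{ds["name"]:18} -> {assign_tier(ds)}')
```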
“Big data in capital markets is a long way off being considered mature,” said the report. “It has yet to reach a tipping point in usage. A key part of the future growth of big data strategy within the capital markets will be the continued education of staff about its benefits and uses. Delivering tangible returns on investment in insights, speed of response and future scalability will enable further investment in big data across other areas of business. Big data has no end point, but must be continually refined over time.”