Why is it so hard to know who you’re doing business with?
Every bank wants to achieve a single view of a customer, but no-one has found an easy, cost-effective, scalable way to do it. The simple fact is that there is too much customer information, entering the bank from too many places, to align and maintain as a single view. An effective solution would reduce costs, allow the bank to generate revenue faster, streamline KYC processes and meet regulatory requirements, write Steve Goldstein and Alan Samuels
Disparate silos, duplicate records, dirty data and frequent errors all cost banks money. It would be easy if every entity had one, and only one, unique identifier: alignment and maintenance would then be trivial. Unfortunately, this is not the case. Market data providers, regulators, rating agencies and banks themselves each use a different code to identify an entity. The Legal Entity Identifier (LEI) was established to address this issue, but adoption has been slow and only a few regulatory initiatives mandate its use. With MiFID delayed, reaching a critical mass of entities with LEIs is even further down the road.
Accuracy in uniquely identifying customers and counterparties is critical, not only to satisfy regulatory requirements, but also to provide a better customer experience. With accurate customer information, you are no longer asking for the same information multiple times but, rather, you understand the business the customer actually does with your firm.
It’s very clear that the proliferation of entity identifiers presents an enormous challenge in mapping, cross-referencing and de-duping data in order for a financial institution to make sense of customer relationships. The problem is seen everywhere – trading desks, investment banking departments, corporate finance and wealth management. Data in individual silos is flawed, and even when cleaned, it’s practically impossible to align the different silos.
Ideally, in a central place within the financial institution, data from all sources, public and internal, would be continuously collected, assessed and integrated back into the “golden copy.” Here, processes to understand data lineage, match entity names and address the dynamic nature of entity data would be deployed to improve overall data quality. We find these are the three causes of substandard entity data: uncertain data lineage, challenges in matching entity names, and data that changes over time.
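To make the shape of such a central process concrete, the sketch below shows, in Python, one way a “golden copy” refresh might fold in source records while recording lineage. The record structure, source names and naive matching rule are our own illustrative assumptions, not a description of any particular system.

    from dataclasses import dataclass

    @dataclass
    class SourceRecord:
        source: str        # e.g. "GLEIF" or "internal_crm" (illustrative names)
        entity_name: str
        attributes: dict   # identifiers, addresses, regulatory status, ...

    def refresh_golden_copy(golden: dict, incoming: list) -> dict:
        """Fold new source records into the golden copy, keeping lineage."""
        for record in incoming:
            # Naive matching on a lower-cased name; real matching is the
            # subject of the name-matching discussion below.
            key = record.entity_name.strip().lower()
            entity = golden.setdefault(key, {})
            for attr, value in record.attributes.items():
                # Store which source supplied each value (data lineage).
                entity[attr] = {"value": value, "source": record.source}
        return golden

Even this toy version makes the lineage point: every attribute carries a record of where it came from, which is exactly what trust-based survivorship rules depend on.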
Uncertain data lineage
Not all data sources are created equal, so they should not all be treated as equal. Yet over and over again we see banks do exactly that. For example, the GLEIF (Global Legal Entity Identifier Foundation) oversees the assignment of the LEI, and the IRS oversees the assignment of the GIIN (Global Intermediary Identification Number). Both are important components of an entity data record and are required identifiers for regulatory reporting. Both are published by recognised market participants. However, LEIs are vetted by the individual LOUs (Local Operating Units), each with its own quality standards, so the data can be considered “trustworthy”. GIINs are not vetted by anyone and, therefore, in our view, are not “trustworthy”.
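Lineage only pays off if it drives decisions. Below is a hedged sketch of source-aware survivorship, building on the toy golden copy above: when two sources disagree on an attribute, the value from the more trusted source wins. The trust tiers are purely illustrative; a real bank would calibrate them per attribute.

    # Illustrative trust tiers, not a recommendation for any real feed.
    SOURCE_TRUST = {
        "GLEIF": 3,          # LEIs vetted by LOUs -> high trust
        "IRS_FFI_LIST": 1,   # GIINs are unvetted -> low trust
        "internal_crm": 2,   # hypothetical internal system
    }

    def survive(current, candidate):
        """Keep the attribute value supplied by the higher-trust source."""
        if current is None:
            return candidate
        if SOURCE_TRUST.get(candidate["source"], 0) > \
                SOURCE_TRUST.get(current["source"], 0):
            return candidate
        return current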
When comparing these two data sets, you get many entities that don’t match – but should. The most basic data attribute – the name of the entity – is frequently a problem. In some cases there are duplicates; in others, two different entities have confusingly similar names. We have seen, for example, five records with very different names, each entered into the bank’s database by a different employee, that all turned out to refer to the same entity. Only with a combination of software and experienced researchers can a financial institution effectively deal with data like this.
Challenges in matching entity names
The simplest test of whether a database is clean is whether or not a given entity appears in the database once, with the correct name. We looked at how many different ways fifteen companies in the Dow Jones Industrial Average were represented in ten major data vendor databases, examining choice of name, spelling, abbreviations and punctuation. The most variations were for Goldman Sachs, which was referred to in seven different ways across the ten sources.
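Much of this variation is mechanical and can be collapsed in software. The sketch below shows a simple name normalisation pass in Python; the variant strings and suffix list are illustrative (they are not the seven variants we found), and real matching also needs fuzzy comparison and experienced human reviewers.

    import re

    # Tokens commonly dropped when normalising names (illustrative list).
    LEGAL_SUFFIXES = {"the", "inc", "incorporated", "co", "corp", "corporation",
                      "grp", "group", "ltd", "limited", "plc", "llc"}

    def normalise_name(raw: str) -> str:
        """Lower-case, strip punctuation and drop common legal suffixes."""
        tokens = re.sub(r"[^\w\s]", " ", raw.lower()).split()
        return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)

    variants = ["The Goldman Sachs Group, Inc.", "GOLDMAN SACHS GRP INC",
                "Goldman Sachs Group", "Goldman, Sachs & Co."]
    print({normalise_name(v) for v in variants})   # {'goldman sachs'}

All four variants collapse to a single normalised key, which is what makes de-duplication tractable at scale.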
For a bank consuming data from these sources, this has the potential to create a tremendous number of duplicate entries – especially for less well-known entities. This is exacerbated by multiple departments in a bank onboarding the same entity independently, another prevalent problem.
Data changes over time
Entity data and entity attributes are not static. It can be shocking to see how often changes that are in the public domain are missed by financial institutions, or applied only long after the fact.
According to the FT, over 120 public companies changed their registered office address in the last year. The most critical type of address change is when it triggers a change in jurisdiction, because jurisdiction is a key component of practically every bank’s risk-rating engine. You really need to know if your client has moved from London or New York to the Cayman Islands.
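A change-detection rule for this can be very simple; the sketch below flags a re-rating whenever the jurisdiction on a client record changes. The field names and the high-risk list are our assumptions for illustration, not any bank’s actual policy.

    # Illustrative high-risk jurisdictions (ISO country codes).
    HIGH_RISK_JURISDICTIONS = {"KY", "VG"}   # Cayman Islands, British Virgin Islands

    def jurisdiction_alert(old_record: dict, new_record: dict):
        """Return an alert string when a client's jurisdiction changes."""
        old_j = old_record.get("jurisdiction")
        new_j = new_record.get("jurisdiction")
        if old_j == new_j:
            return None
        if new_j in HIGH_RISK_JURISDICTIONS:
            return f"ESCALATE: {old_j} -> {new_j}; re-run risk rating now"
        return f"REVIEW: jurisdiction changed {old_j} -> {new_j}"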
Regulatory status is also a key component of risk rating and of determining how much work is needed to onboard an entity. Entities regulated by a recognised regulator typically require lighter-touch KYC and are often viewed as low risk. Yet every year, hundreds of regulatory status changes across the universe of regulated entities go unprocessed by financial institutions.
Most critically, every week there are about 10,000 updates to the data sources that track sanctions, politically exposed persons (PEPs) and adverse news. Clients and counterparties must be checked against these databases during onboarding but, given the volume of changes, ongoing monitoring is recommended. Financial institutions should be alerted to any event that might change the risk rating of a client or counterparty, such as a change in PEP status or new adverse news, and must know immediately if they have to cease doing business with a firm because it appears on a sanctions list.
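Given roughly 10,000 watchlist updates a week, one practical pattern is to re-screen the client book against each day’s changes rather than the entire list. A minimal sketch follows, assuming a hypothetical feed format and naive exact-name matching; production systems use fuzzy matching and case-management workflow on top of this.

    def screen_against_updates(clients, watchlist_updates):
        """Re-screen the client book against only today's watchlist changes."""
        updated = {u["name"].lower(): u for u in watchlist_updates}
        alerts = []
        for client in clients:
            hit = updated.get(client["name"].lower())
            if hit is not None:
                alerts.append({
                    "client": client["name"],
                    "list": hit["list"],   # "sanctions", "pep" or "adverse_news"
                    # Sanctions hits mean stopping business; others, a re-rate.
                    "action": "block" if hit["list"] == "sanctions" else "re-rate",
                })
        return alerts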
The ultimate goal for entity data management and know-your-customer compliance is the single view of a customer. Investing in the data, people and processes to make this a reality will have an exceptional ROI. Only with trusted, accurate (and maintained) entity data can it be achieved.