Clarifying the Data Cleansing Process


I sit in on a lot of sales calls, and in a recent meeting a prospective client asked whether we “cleanse” market data as part of our service offering. I hear this question a lot, and unfortunately it indicates a good deal of confusion in the market about what data cleansing really means. In this case, the client was really asking whether we correct market data issued by vendors such as Platts, GFI, Argus, and ICE. The short answer is no: we don’t “cleanse” market data in that sense, and in truth nobody is, or should be, allowed to independently change third-party market data on behalf of a client. There is, however, a process for handling data issues and errors.

When it comes to market data from major providers such as Platts, Argus, or ICE, no one other than the source providers themselves should cleanse or correct the data (or even has the right to) before passing it on to clients, and with good reason. Millions of transactions are based on these numbers, none of which would be possible if companies or data aggregators arbitrarily changed them to suit their needs. Data is a critical element of any business process, whether you are looking at market trends, creating derivative curves for your end-of-day process, or settling and invoicing against a counterparty. It is extremely important that potential errors be caught and that the numbers collected are accurate, while at the same time the process remains transparent and auditable.

The good news is that the industry is collectively very efficient at detecting and correcting erroneous data. Thousands of companies, including ZE, track and validate data as it is generated, and when any data is identified as suspect, these companies notify the data provider, who either confirms that the data is correct or, as part of their responsibility, issues a correction that everyone in the market receives at the same time. This is the only way to ensure that everyone in the market is working from the same data and that there is an audit trail for those needing to understand when and why the data underpinning a transaction changed.

Still, companies are very concerned about the quality of the market data underlying their forecasts, analysis, and transactions, and many are also very interested in the specific validation rules or routines that ensure the data is correct and meaningful. Moreover, these companies want to be able to identify extreme data points, or outliers, so they can investigate whether or not the data is correct.
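To make the idea of an outlier rule concrete, here is a minimal sketch of one common approach: flag a new value when it deviates from the recent history by more than a chosen number of standard deviations. This is purely illustrative (the function name, threshold, and method are my own assumptions, not any vendor's actual validation logic):

```python
# Illustrative sketch only -- not any vendor's actual validation routine.
# Flags a new price as an outlier if it sits more than `threshold`
# standard deviations away from the mean of the recent history.
from statistics import mean, stdev

def is_outlier(history, new_value, threshold=3.0):
    """Return True if new_value deviates from the historical mean
    by more than `threshold` standard deviations."""
    if len(history) < 2:
        return False  # not enough history to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return new_value != mu  # flat history: any change is suspect
    return abs(new_value - mu) / sigma > threshold

prices = [71.2, 71.5, 70.9, 71.1, 71.4, 71.0]
print(is_outlier(prices, 71.3))  # an ordinary tick near the mean
print(is_outlier(prices, 95.0))  # a spike well outside recent range
```

In practice a flagged value would not be changed; it would simply trigger an alert so someone can confirm it with the data provider.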

Leading the way in this area, the ZEMA Data Suite, and specifically its Data Validation module, provides a method of tracking the quality of a company’s market or internal data. The application provides a visual status of the data in real time, along with email alerts for data points that are suspicious, incomplete, or considered extreme or anomalous. Many of the validation routines are client-configurable, and market data revisions are imported, tracked, and archived where necessary, providing a full audit trail of how market data is changing and how those changes affected historical analysis. Like other modules in the ZEMA Suite of applications, it removes the workload and the worry without removing any of the responsibility or transparency around the process.
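The key design point behind revision tracking is that corrections never overwrite history. A toy sketch of that idea (the class and method names here are hypothetical, not the ZEMA API) might look like this:

```python
# Illustrative sketch only -- hypothetical names, not the ZEMA API.
# Each correction is appended as a new revision rather than replacing
# the old value, so the full audit trail of a data point survives.
from datetime import datetime, timezone

class AuditedSeries:
    def __init__(self):
        self._revisions = {}  # date -> list of (received_at, value)

    def record(self, date, value):
        """Append a new revision for `date`, timestamped on arrival."""
        entry = (datetime.now(timezone.utc), value)
        self._revisions.setdefault(date, []).append(entry)

    def current(self, date):
        """The latest revision is the value in force."""
        return self._revisions[date][-1][1]

    def history(self, date):
        """Every revision ever received -- the audit trail."""
        return list(self._revisions[date])

series = AuditedSeries()
series.record("2013-05-01", 71.20)   # original vendor publication
series.record("2013-05-01", 71.25)   # vendor-issued correction
print(series.current("2013-05-01"))  # the corrected value
print(len(series.history("2013-05-01")))  # both revisions on file
```

Keeping every revision is what lets you answer, after the fact, exactly when and why the number underpinning a settlement changed.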

In the end, data cleansing is about reporting issues, importing corrections from the market, providing status updates, archiving changed data points, and notifying systems and users. Data cleansing doesn’t mean that a service provider can change your mission-critical numbers on their own. Nobody can, nor should you want them to.
