You Want to Do What With Your Data?
By Andrew Sohn, SVP - Global Digital and Analytics Services, Crawford & Company
Any business that’s over a few months old has a lot of data. Businesses that are several years old have tons of data. Large, established businesses and digital native companies are drowning in data, and are having a hard time just figuring out how to store and manage everything in a cost-efficient manner. So, with all of the “data is the new oil” whitepapers and “data is our biggest corporate asset” keynote speeches, you would think the amount of data a company has must correlate to how much they can increase the business revenue and drive up shareholder value. Right? Unfortunately, that is not the case.
These days, CIOs and CDOs are routinely instructed by their management and business partners to “monetize the data” and “use our data for a competitive advantage.” The thinking is that since the company has been in business for a long time and has a plethora of data, it must be able to manage this information and derive value from it better than a company that’s only been in business for a few years. Oftentimes, management will offer support by bringing in some data scientists, consultants and other smart people to try to do something quickly and gain an advantage over the competition.
There are many challenges in this type of situation. The first and foremost is that even though an organization may have an abundance of data, it typically wasn’t collected, managed or processed for the requested purposes. In the past, just enough data may have been captured or moved from system to system to fulfill an order, manage inventory, process a transaction, or record a specific event. Because storage used to be expensive, the detailed data that would be needed today was often not kept, and only summary or aggregate data was retained.
Another challenge is that much of the detail-level information that is available is stored in formats that are not easy to access or analyze. This includes content such as unstructured and unmanaged spreadsheets, unstructured text-based documents or scanned images. And in many cases systems within the company were developed independently of one another, so being able to share and combine data between systems was not included in their design.
It’s not only long-established companies that encounter these issues. Many digital-native companies are equally challenged. Whether it’s because of the need for speed-to-market or a dependence on software-as-a-service applications where the focus was just to get the system up and running quickly, their data is in a state where it is not easy for them to take additional actions on it.
The data pipelines at these companies were set up to handle one set of requirements. So, when an organization decides to change or expand what it wants to do with the data, it is often not feasible to fully use the historical data for the new requirements. When it is possible, it usually requires expensive, time-consuming manual effort to prep the data. Furthermore, in order to enable future data to be used for concepts like data monetization and new data products, changes to the status quo need to be made. These changes touch not only the technology, but processes and culture as well. They can be complex, pervasive and disruptive.
Here are a few of the barriers in a company’s data that create difficulties:
• No common language was established, nor were detailed instructions adhered to, so the data was not captured in a structured way with consistent fields and databases;
• Data was captured in a free-form text field or on written forms that a processing clerk read as part of performing a task;
• Those forms are stored with little or no metadata and the context is not searchable. The only context available comes from the transactions they are associated with;
• Systems within the company were developed independently of one another, so the ability to share and combine data between systems wasn’t a consideration. The same customer could be assigned different reference numbers in different systems. In a worst-case scenario, the same customer could be aggregated in one system and called out as multiple entities in another.
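To make the last barrier concrete, here is a minimal Python sketch of a customer cross-reference built across two systems that assign their own reference numbers. Everything in it is a hypothetical illustration rather than a description of any real system: the system names (`billing`, `claims`), the record fields, the IDs, and the crude matching rule (normalized name plus postal code) are all assumptions.

```python
from __future__ import annotations

import re
from collections import defaultdict

# Hypothetical legal-form suffixes to ignore when comparing names.
SUFFIXES = {"inc", "llc", "ltd", "co", "corp"}


def match_key(name: str, postal_code: str) -> tuple[str, str]:
    """Build a crude matching key from a customer name and postal code."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())  # drop punctuation
    words = [w for w in cleaned.split() if w not in SUFFIXES]
    return (" ".join(words), postal_code.replace(" ", "").upper())


def build_crosswalk(systems: dict[str, list[dict]]) -> dict:
    """Group each system's customer IDs under a shared matching key."""
    crosswalk = defaultdict(lambda: defaultdict(list))
    for system_name, records in systems.items():
        for rec in records:
            key = match_key(rec["name"], rec["postal_code"])
            crosswalk[key][system_name].append(rec["customer_id"])
    return {key: dict(ids) for key, ids in crosswalk.items()}


# Made-up sample data: one customer, three reference numbers.
systems = {
    "billing": [
        {"customer_id": "B-1001", "name": "Acme Supply, Inc.", "postal_code": "30301"},
    ],
    "claims": [
        {"customer_id": "C-77", "name": "ACME Supply", "postal_code": "30301"},
        {"customer_id": "C-912", "name": "Acme Supply Inc", "postal_code": "30301"},
    ],
}

crosswalk = build_crosswalk(systems)
for key, ids in crosswalk.items():
    print(key, ids)
```

In this made-up data, the single billing customer shows up as two separate entities in the claims system. A rule this simple will also produce false matches and misses on real data, so any crosswalk it generates would need review by people who know the business.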
But aren’t new technologies supposed to solve all of these challenges? That’s what the airline magazines say, as does every consultant and tool vendor that comes knocking on our doors. Can’t we use text analytics to surface all of the data in the unstructured documents? Won’t machine learning take all of this data and come up with algorithms to determine what the missing data should be? Can’t intelligent master data management tools synchronize the data in all of these systems? And won’t putting all of this data into “a data lake in the cloud” solve all of our data consolidation issues?
The answer to these questions is that these technologies will help, but they can’t work magic and they can be very costly. If data was never captured, it can’t be used for analysis. If subject matter expertise is not available to interpret the historical data and put context around it, the tools may infer incorrect relationships, if they can infer anything at all. If the data is wrong or internally inconsistent, the output from these technologies will be wrong. These tools do little to overcome the law of garbage in, garbage out.
Of course, there are some companies that have made headway in leveraging their data. A company may have pockets of historical data that are dynamic enough and of sufficient quality to have value beyond their initial purpose. But for the most part, companies driving significant value from their data have made substantial, multi-year investments in technology, talent and change management.
When faced with this mandate, CIOs and CDOs need to take a multi-temporal approach. For dealing with the historical data, they have to look for opportunities within the piles of data that will contain valuable insights, either for internal or external customers. There may be a lot of dead ends and results that are not statistically significant, but there will most likely be some kernels of knowledge that can be uncovered. It’s up to each organization to determine how many resources it wants to spend on this exploration and whether the potential ROI is worth it.
The other path is to focus on a go-forward plan, where a comprehensive data strategy is critical. Cradle-to-grave data management that treats data as an asset is key. All of the aspects of data quality (accuracy, completeness, timeliness, etc.) need to be embedded in systems and processes. This will allow the data going forward to create the most value for your company.