Data Prep Demystified

A Data Prep Perspective on Infonomics

Doug Laney’s Infonomics is perhaps the most thought-provoking book I’ve read on the topic of Information Management. As a Data Prep practitioner, I’d like to share my personal take-aways from the three sections of the book: Monetizing, Managing, and Measuring Information.

Monetizing

This section of the book was eye-opening for me because of the number of real use cases where companies are monetizing their information. These examples dispel the myth that the only way of monetizing information is by selling it. Companies are encouraged to think beyond their internal data sources, to look to external data sources for more comprehensive insights, and to form strategic partnerships along their supply chain and information ecosystem.

Regardless of your objectives, for-profit vs not-for-profit, one of the first practical steps to deriving value from your information assets is to prepare them for use, which means Data Preparation. In my second blog post I provided my own definition of Data Preparation, “…the process of making data user friendly and consumable…”. I was pleased to see a dedicated paragraph with bullet points on Data Preparation in the Monetizing section of Infonomics.

There’s just no escaping the ‘grunt’ work that needs to be done to make data ‘user friendly and consumable’ for an objective, which in this case is monetizing it.

Managing

Information Management done correctly can help an organization to accomplish more with its information (data) assets. By the way, Doug does a brilliant job of squashing the whole nuanced argument on ‘information vs data’. My personal and enjoyable take-aways from this section were the following:

Inventory of Information Assets: If we don’t know what information assets we have, how can we derive value from them? Many organizations tend to have better insight into which physical and financial assets they have, but not so much into their information assets. Admittedly, building that inventory is no small feat, and I believe that it needs to be more Strategic and Enterprise-Value-Aligned than a traditional routine process of creating a Data Catalog.

Enterprise Content Management (ECM): Doug highlights several different disciplines and how their approaches can be borrowed in the overall discipline of Information Management. Well, this is one that really struck a chord with me. As a Certified Information Professional (CIP) and proud member of AIIM, ECM is always top of mind for me. ECM primarily deals with unstructured content, and Data Management (DM) primarily deals with structured content. Well, this is all information, structured or unstructured, and unfortunately many organizations have different departments managing these assets in silos. According to Doug, “Very few Chief Data Officers oversee both ECM and DM, but makes sense that they should”. I agree 100%.

How does Data Prep tie into all of this? Data Prep can be likened to an important thread making its way through the fabric of Information Management projects. For example, creating an Inventory of information assets will require Data Prep to help acquire, clean, and transform data (and metadata) from disparate data sources. Similarly, Data Prep can be used to blend usage data from databases and content management systems, putting together a holistic view of all the information assets a department is using while identifying other sources it should consider using.
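To make the inventory example a little more concrete, here is a minimal Python sketch of my own (it is not from the book). It assumes two made-up metadata exports, one from a database catalog and one from a content management system, and every field name in it is hypothetical. The point is only to show the acquire, clean, and transform steps producing a single list of information assets.

```python
# Hypothetical metadata exports; the field names and values are invented
# for illustration and do not come from any real catalog or ECM product.
database_tables = [
    {"TABLE_NAME": " Customers ", "OWNER": "sales", "ROW_COUNT": "1200"},
    {"TABLE_NAME": "ORDERS", "OWNER": "Sales", "ROW_COUNT": "56000"},
]
ecm_documents = [
    {"title": "Customer Contracts", "department": "LEGAL", "files": 340},
]


def build_inventory(tables, documents):
    """Clean both metadata sources and merge them into one inventory."""
    inventory = []
    for t in tables:
        inventory.append({
            "asset": t["TABLE_NAME"].strip().lower(),  # trim and normalize case
            "owner": t["OWNER"].strip().lower(),
            "kind": "structured",
            "size": int(t["ROW_COUNT"]),               # text to number
        })
    for d in documents:
        inventory.append({
            "asset": d["title"].strip().lower(),
            "owner": d["department"].strip().lower(),
            "kind": "unstructured",
            "size": int(d["files"]),
        })
    # Sort by owner, then asset name, so each department's assets sit together.
    return sorted(inventory, key=lambda r: (r["owner"], r["asset"]))


inventory = build_inventory(database_tables, ecm_documents)
for row in inventory:
    print(row["owner"], row["kind"], row["asset"], row["size"])
```

A real inventory would of course pull this metadata from the systems themselves and carry far more attributes, but the shape of the work (normalize names, fix types, merge structured and unstructured sources) is the same.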

In short, Data Prep can help to support the overall mission of Information Management.

Measuring

The last section of the book is on Measuring Information Value, which is an absolute necessity, but a very challenging exercise. Here are my thoughts from this section:

No Formal, Agreed-Upon Methodology for Valuing Information: I believe that all Information Professionals will agree with the phrase “Information is an asset”. Where opinions will vary is in answering the question: “What is the value of this (intangible) asset we call information?” Doug provides great insight, interesting historical context, and some very compelling use cases where the value of information and its measurement are the focus. Doug also provides some useful methods to measure information value. My personal take-away here is that there is currently no formal, agreed-upon way of measuring information value. I don’t see this as a bad thing, just that it is a challenging endeavor that is still worth undertaking, with a variety of approaches. I look forward to the day that the Accounting Profession will formally include Information as an (intangible) asset on the Balance Sheet. Until then I will wait patiently with bated breath.

Data Quality Directly Impacts Information Value: The bottom line here is that the higher the quality of a company’s information assets, the closer we are to the intrinsic value of those information assets. For example, diamonds are intrinsically valuable in their natural and raw state, but it is not until we have done the work to refine them that we get closer to their real valuation. It is the same with information. Even at rest and unrefined, information still has its unearthed intrinsic value. Improving its quality helps get us closer to that valuation. Fortunately, there are formal methods of measuring information quality, addressing root causes within a process that can impact quality, and validating information quality within context and expected outcomes. However, data quality alone will not help us to measure the value of our information assets.

Finally, how does Data Prep connect to the Measurement of Information Value? I believe that Data Prep is most relevant to the Data Quality aspect of measuring the value of information. Again, Data Prep is all about acquiring data, cleansing it, transforming it, and finally consuming it for an objective. The end result here will be to validate the data within business context. A typical use case of Data Prep in a Data Quality project is comparing data from two different systems in order to determine which system has more reliable information, and then understanding the root cause of the differences. Data Prep supports the work of Data Quality, which in turn impacts the value of information. I’m going to do a separate blog post about Data Prep and Data Quality because this is one topic that warrants a dedicated discussion.
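As a rough illustration of the two-system comparison described above, here is a small Python sketch. It is my own simplified example, not a method from Infonomics: the "CRM" and "billing" systems, the customer IDs, and the field names are all hypothetical. It reports records that exist in only one system and field-level differences for records that exist in both, which is usually the starting point for a root-cause conversation.

```python
# Hypothetical extracts from two systems, keyed by customer ID.
# The systems, IDs, and fields are invented for illustration only.
crm = {
    "C001": {"email": "ann@example.com", "city": "Toronto"},
    "C002": {"email": "bob@example.com", "city": "Ottawa"},
    "C003": {"email": "", "city": "Calgary"},
}
billing = {
    "C001": {"email": "ann@example.com", "city": "Toronto"},
    "C002": {"email": "bob@example.org", "city": "Ottawa"},
    "C004": {"email": "dee@example.com", "city": "Halifax"},
}


def compare_systems(a, b):
    """Return keys found only in a, keys found only in b, and field mismatches."""
    only_a = sorted(a.keys() - b.keys())
    only_b = sorted(b.keys() - a.keys())
    mismatches = []
    for key in sorted(a.keys() & b.keys()):
        # Check every field that appears on either side for this record.
        for field in sorted(a[key].keys() | b[key].keys()):
            value_a, value_b = a[key].get(field), b[key].get(field)
            if value_a != value_b:
                mismatches.append((key, field, value_a, value_b))
    return only_a, only_b, mismatches


only_crm, only_billing, mismatches = compare_systems(crm, billing)
print("Only in CRM:", only_crm)
print("Only in billing:", only_billing)
for key, field, crm_value, billing_value in mismatches:
    print(f"{key}.{field}: CRM={crm_value!r} billing={billing_value!r}")
```

The comparison itself does not say which system is more reliable. That judgment still needs business context, for example checking the mismatched email for C002 against a trusted source.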

In closing, Infonomics is a great read for information professionals, and from my perspective it validates, both directly and indirectly, the importance of Data Preparation.