All Things Data Prep

Unveiling Data Prep

Embracing the Reality of Disparity

“And the truth will set you free” - John 8:32

We live in a world where the volume, variety, and velocity of data are growing exponentially. Today I want to focus on the variety of data, or its disparate nature. The truth is that data will always be disparate. In all my years of working, and I suspect yours as well, I have never seen a Utopian environment where ‘all the data’ exists in one place, be it a Data Warehouse, a Data Lake, or a Data Mart. Similarly, with respect to content, I have never seen an environment where all content (documents, images, records) was filed in one single Enterprise Content Management (ECM) system.

So if this is the truth, and we know it is, then why are more companies not building Enterprise Information Strategies with this in mind? Why are there still so many projects trying to deliver a Utopian world of ‘all the data’ in one place? Or all your documents in one system?

When you embrace the reality of the disparity of information sources, you have a better chance of crafting a more practical and intelligent strategy for information management. Here are two takeaways for developing such a strategy:

  1. Establish a United Enterprise Information Management Strategy: Break down the glass wall between Traditional Data Management and Traditional Enterprise Content Management. Managing both structured and unstructured data as one cohesive program will give you the opportunity to identify all information across the enterprise, manage it with economies of scale, and better understand business challenges.

  2. Strive for Standardization But Be Flexible Towards Exceptions: I believe that the benefits of standardizing on data management and content services platforms far outweigh the perceived risk of being ‘locked in’ to a vendor. How many types of database platforms do you really need? How many content services platforms do you really need? How many skilled IT resources do you need to manage each platform? How easily can people navigate and access information from all these different sources? Of course there are always exceptions, but do you really need 20 Data Marts? How can we streamline and standardize that down to 5 Data Marts? How can we address the challenge of network shared drives and encourage employees to use the right Content Services application ecosystem?

As Data Prep empowers users to work with data in a variety of formats and from a variety of sources, it quickly reveals the infrastructure challenges, if any, with regard to the accessibility of data and content. As such, Data Prep can play a key role in how an Information Management Strategy seeks to practically provide all data to the business.

Baba Majekodunmi