The Data Prep Job Market
Where is the data prep job market? I believe it is coming soon, very soon. Today what we have is the demand for Data Prep Skills, but not the demand for dedicated “Data Prep Professionals”.
Data Prep Skills + Data Prep Professionals
Currently there is no clear and specific job market for data prep experts and leaders. Most data-related jobs expect data prep to be covered as part of the job description or job function, which is not unreasonable. However, research leads us to believe that anyone working with data could be spending up to 80% of their time cleaning and preparing it. This means that “Data Prep Skills” are essential for some of these common roles:
Accountant
Business Analyst
Business Intelligence Analyst
Data Analyst
Data Architect
Data Engineer
Data Scientist
Database Administrator
ETL Developer
HRIS Analyst
Healthcare Data Analyst
Marketing Analyst
Operations Analyst
Given the enormity of the challenge of cleaning and preparing data and making it user-friendly for its intended use, a case can be made for dedicated Data Prep Professionals. Naturally, I am biased, but doesn’t it make sense to have specialized professionals who are experts at the cleaning and preparation work that can consume up to 80% of a data worker’s time?
Demand for these individuals will strengthen as data prep tools and platforms continue to proliferate and Power Users arrive. Data Prep Professionals are not only Power Users; they also think strategically about data prep and how it helps to fuel Data Strategy. They are also prime candidates to be administrators of Data Prep tools. Data Prep Leaders are Power Users who can teach, train, mentor, and develop other Data Prep Professionals, as well as implement data prep strategically, aligned with Data Strategy.
With time the job market will better articulate the need for Data Prep Professionals in positions such as the following:
Data Prep Analyst
Data Prep Manager
Data Prep Project Manager
Data Prep Program Manager
Director of Data Prep
VP of Data Prep
I believe these roles belong in the Chief Data and Analytics Officer’s (CDAO) division, with the highest level (VP/Director of Data Prep) reporting directly to the CDAO. These roles will help ensure Data Prep evolves from a disorganized and siloed Self-Service initiative to an Enterprise Enabler for Data Strategy.
Why isn’t the Data Prep job market here?
I think there are three reasons why it has not emerged yet, and three reasons why it will arrive eventually:
Infancy vs Maturity: It is still early days for the Data Prep Industry. It has only been officially recognized as a separate discipline in the last 5 years or so. Over the next 5 years more companies will realize the benefits of Data Prep tools and invest in them. Those companies will be able to deliver on many of their strategic data initiatives faster than those that don’t.
Resisting vs Embracing: Unfortunately there are organizations, and teams within organizations, that really don’t believe 80% of their staff’s time is spent preparing data, and so don’t see why they should invest in empowering their staff with Data Prep skills and tools. Sometimes the resistance comes from IT because Data Prep is disrupting the traditional ETL process. Sometimes individuals in the Business or IT are scared of change or of learning something new, so they are content with manual, inefficient processes (as far as they are concerned, if their process works, why change it?). Those organizations that do invest in and embrace Data Prep will see a marked increase in the productivity and data literacy of their end users.
Self-Service Perception vs Strategic Enterprise Alignment & Adoption: Data Prep tools were created primarily to address the self-service needs of the business. Today when most people hear “data prep” they think of “self-service”. However, the value of Data Prep to a company as a whole warrants a shift from a Self-Service mindset to Enterprise-Wide adoption. In other words, you still have self-service, but under the umbrella of a governed and scalable platform. Data Prep enables the Business to get insight and use data now while collaborating with IT to eventually ‘productionize’ if necessary.