Co-Authored by: Jeff Moore and Benjamin Diez
In today’s data-driven world, and with all the buzz about AI, organizations increasingly recognize the immense value of data and the need for a well-defined data strategy. A robust data strategy acts as a compass, enabling businesses to harness the power of data to drive informed decision-making, gain a competitive edge, and fuel innovation. In this first of two articles on the topic, we delve into the critical components of a successful data strategy and explore how organizations can develop and implement a data-driven approach to achieve their goals.
What is a data strategy and why do I need it?
Simply put, a data strategy refers to a comprehensive plan or framework that organizations develop to effectively manage and maximize the value of their data assets. It outlines the goals and guidelines for handling data throughout its lifecycle, from acquisition and storage to analysis and decision-making. A well-defined data strategy aligns with the overall business strategy and enables organizations to derive insights, make informed decisions, and gain a competitive edge in the digital age. It’s also a critical foundation needed to harness the power of AI in your business. Starting implementation of AI without a good data strategy can create more problems than it solves.
For a data strategy to work, everybody needs to play along
One of the most common pitfalls is the emergence of so-called data silos: data that is only available to certain departments or people. This usually happens when departments use their own internal tools and data collection processes without sharing the information beyond their direct stakeholders. It is not uncommon that data you are thinking about buying from external providers already exists in your organization; you just don’t know it yet. Of course, some data is sensibly kept within one department, such as most things Legal and HR related, where silos exist for a reason.
A practical example from the food industry would be a supply chain department that keeps a database with all the food and paper costs for your products but does not share it across the organization. Without this information being available to accounting or controlling, profit cannot be calculated.
Therefore, one of the first and most important steps when planning a data strategy is a detailed overview of your data landscape. As the figure illustrates, this overview shows where and how data is stored, how it flows from point A to point B, which tools are used, and where data feeds dashboards. With multiple departments and at least as many sources and flows, in reality this chart will most likely be too big for a single PowerPoint slide (unless you have the vision of a falcon).
The approach to designing a data landscape starts with identifying and interviewing data champions in every single department of your company. They are the ones who know best which data flows, sources and tools their team is using. The next step is to draw up a local data map of their area of expertise, where you link those components in a flow chart and characterize them: Is the data raw, aggregated or manually entered? What kind of database is it? By combining these maps piece by piece, you will start to understand how the processes are intertwined between departments and what the overall data flow in your company looks like.
It is an iterative process which can be rather tedious, as you have to re-evaluate your map with all the data champions as you go along to see if you made the right connections between systems and covered every area. The upside is that you will end up with an exhaustive inventory of every data aspect in your company. Also, through the interviews you conducted along the way, you will have gathered valuable insights into the departments’ data-related business problems, which will prove helpful when aligning your data strategy with your business strategy.
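The local data maps described above can also be captured in a simple data structure instead of a slide. The following is a minimal sketch, not a prescribed tool: the class names, field names and the example assets (`food_paper_costs`, `profit_report`, `pos_sales`) are hypothetical and only illustrate the idea of merging departmental maps and spotting sources nobody has described yet.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataAsset:
    """One node on the map: a database, spreadsheet, dashboard, etc."""
    name: str
    department: str
    kind: str     # assumed labels: "raw", "aggregated" or "manual"
    storage: str  # e.g. "SQL database", "spreadsheet", "BI tool"


@dataclass
class DataMap:
    """A local or company-wide map: assets plus directed flows between them."""
    assets: dict = field(default_factory=dict)  # asset name -> DataAsset
    flows: set = field(default_factory=set)     # (source name, target name)

    def add_asset(self, asset):
        self.assets[asset.name] = asset

    def add_flow(self, source, target):
        self.flows.add((source, target))


def merge_maps(*local_maps):
    """Combine the local maps of several departments into one map."""
    merged = DataMap()
    for local in local_maps:
        merged.assets.update(local.assets)
        merged.flows |= local.flows
    return merged


def unresolved_assets(data_map):
    """Names used in a flow that no data champion has described yet."""
    referenced = {name for flow in data_map.flows for name in flow}
    return sorted(referenced - set(data_map.assets))


def cross_department_flows(data_map):
    """Flows whose two described ends belong to different departments."""
    result = []
    for source, target in sorted(data_map.flows):
        src = data_map.assets.get(source)
        dst = data_map.assets.get(target)
        if src is not None and dst is not None and src.department != dst.department:
            result.append((source, target))
    return result


# Hypothetical local maps from two interviews.
supply_chain = DataMap()
supply_chain.add_asset(
    DataAsset("food_paper_costs", "Supply Chain", "raw", "SQL database")
)

accounting = DataMap()
accounting.add_asset(DataAsset("profit_report", "Accounting", "aggregated", "BI tool"))
accounting.add_flow("food_paper_costs", "profit_report")
accounting.add_flow("pos_sales", "profit_report")  # source not yet documented

company_map = merge_maps(supply_chain, accounting)
print(cross_department_flows(company_map))
print(unresolved_assets(company_map))
```

In this sketch, the list of unresolved assets doubles as the agenda for the next round of interviews, which mirrors the iterative nature of the exercise.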
The central part of a data strategy is the data landscape. It allows you to sketch what data is being collected, processed and entered, and how and where in your organization this happens. The challenge often is that for every source you understand, two more come up, so it is not unusual to see the map of your landscape grow and grow.
Of course, you also need to know what happens to the respective data. Is it used straight away, enriched with additional information, or processed and stored for use in a BI tool? In our experience this is quite an iterative exercise, as you constantly have to check with the respective stakeholders to make sure you have the data flows lined up correctly. As tedious as this appears, the final result will be an important foundation for all your further data endeavors.
Data Governance: The garbage pitfall.
We all know the saying “Garbage in, garbage out”: if the data you use is bad or wrong, you cannot expect to gain helpful insights from it. That is why an estimated three quarters of the time in data projects is spent on cleaning up data. By defining and enforcing clear standards and responsibilities, you make sure that your data is clean, complete and taken care of by professionals. This is achieved by establishing best practices on how to handle empty values and outliers, and on what format to use for attributes.
This seems logical, but in reality there are often huge gaps between what is and what should be. Consider this example: we want to combine transaction data from two sources, our US and our German websites. While the US website saves dates in the format MM/DD/YYYY, the German website uses DD.MM.YYYY. To use this data in the same table, we need to align on a common format, ideally the widely used YYYY-MM-DD (ISO 8601) format.
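The date alignment from this example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the source labels `us` and `de`, the field names and the sample transactions are hypothetical, and returning `None` for empty values is just one possible convention that a governance standard would need to define explicitly.

```python
from datetime import datetime

# Assumed mapping from data source to the date format it uses.
SOURCE_DATE_FORMATS = {
    "us": "%m/%d/%Y",  # MM/DD/YYYY
    "de": "%d.%m.%Y",  # DD.MM.YYYY
}


def to_iso_date(raw, source):
    """Convert a source-specific date string to YYYY-MM-DD.

    Empty or missing values are returned as None (an assumed convention).
    Strings that do not match the source's format raise ValueError, so bad
    records are surfaced instead of being silently misread.
    """
    if raw is None or not raw.strip():
        return None
    parsed = datetime.strptime(raw.strip(), SOURCE_DATE_FORMATS[source])
    return parsed.date().isoformat()


# Hypothetical transactions: identical digits, different meaning per source.
transactions = [
    {"source": "us", "order_date": "03/04/2023", "amount": 19.99},
    {"source": "de", "order_date": "03.04.2023", "amount": 24.50},
]

unified = [
    {**row, "order_date": to_iso_date(row["order_date"], row["source"])}
    for row in transactions
]

for row in unified:
    print(row["source"], row["order_date"])
```

The US record becomes 2023-03-04 and the German one 2023-04-03, which shows why the format has to be tied to the known source rather than guessed from the string itself.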
Navigating the chicken and egg challenge of a rapidly evolving technology landscape
Will I postpone my data projects for another year, hoping the technology is even better by then, or will I do them now, risking that everything we build is outdated by the time we’re done?
Many decision makers are hesitant to act, hoping that the product will be much better if they wait another 6 months or a year. After all, with technology developing rapidly, the outcome might be even better then.
As plausible as this might seem, in the end, not acting is almost always the worst decision. The key to a great data strategy is a solid foundation, with clean, cataloged data and well-documented processes. You can never start working on this early enough because those are processes that need to be established anyway. Once this setup is in place, you are incredibly flexible to react to new challenges.
New data sources? We have a process to link them to our existing infrastructure. New method of insight generation? We have the data ready to go. And so on. This also gives you the comfort to revisit your processes in regular cycles (e.g. every year), evaluate whether there are ways to improve them and, if so, adapt them to the new challenges.
Expectation Management: Another common pitfall.
Expectations about the timelines of data projects often differ significantly between data experts and business users. While the first group knows that exploring sources, gathering data, cleaning and verifying it takes time, the latter group rarely sees it the same way.
“Just click download, save as and boom, you’re done. It’s not that hard” is still a favorite quote.
So, unless you already have your data cataloged, cleaned and ready to use, gathering and preparing the right data for your use cases will most likely take a few months or longer. Getting alignment on realistic timelines upfront is therefore a key to success.
Stay tuned for Part 2 on data strategy, where we’ll provide a perspective on talent strategy and how it can unlock your data strategy.