Ryan Wakefield | 03/17/2016 | Data and Analytics
Data is one of the most critical corporate assets for any organization. What’s more, organizations are generating data at staggering rates: analysts predict roughly 50% year-over-year data growth in some industries, and media and entertainment is exceeding these estimates because of large amounts of unstructured data. According to a Gartner survey, nearly 75% of organizations are investing or planning to invest in big data in the next two years. Despite these figures, many organizations do not know whether their data is being properly captured, integrated across lines of business and accurately represented.
And the cost of poor data quality can be staggering, especially at scale. In 2014, the USPS published a report on its internal audit of Undeliverable as Addressed (UAA) mail. Of the 158 billion pieces of mail processed in FY13, almost 4.3%, or 6.8 billion pieces, was classified as UAA. The total cost of processing these 6.8 billion pieces of mail was $1.5 billion, made up of costs associated with forwarding mail, returning mail, general waste and administrative costs. In other words, $1.5 billion is wasted because of five attributes with incorrect data: name, address, city, state, and zip code. Extrapolate this example across the tens of thousands of medium and large-sized businesses in the US and the actual cost of poor data quality for US-based organizations is in the billions, with some speculating it could be as high as several trillions of dollars per year.
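As a quick sanity check on the USPS figures quoted above, the arithmetic can be worked through in a few lines. This is only a sketch using the numbers as stated in the text; the per-piece cost is a derived average, not a figure from the USPS report.

```python
# Figures as quoted in the USPS example above.
total_pieces = 158e9   # pieces of mail processed in FY13
uaa_pieces = 6.8e9     # pieces classified as Undeliverable as Addressed
uaa_cost = 1.5e9       # total cost of processing UAA mail, in dollars

uaa_rate = uaa_pieces / total_pieces     # share of all mail that was UAA
cost_per_piece = uaa_cost / uaa_pieces   # derived average cost of one UAA piece

print(f"UAA rate: {uaa_rate:.1%}")
print(f"Average cost per UAA piece: ${cost_per_piece:.2f}")
```

On these numbers, each undeliverable piece costs roughly 22 cents on average, which shows how a small per-record error becomes a large cost at scale.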
Data Governance is at the heart of this problem.
What is Data Governance?
Data Governance is a term with an evolving definition, but what it clearly is not is a collection of ad-hoc, tactically-focused data correction projects. Data Governance is the overarching policies and processes that govern the management of enterprise data assets. The purpose of data governance is to identify what data and information is important, establish the processes to manage it and measure the effectiveness of the effort in achieving business objectives.
Data Governance is achieved through the establishment of a focused team composed of technology and business stakeholders. This team oversees data by documenting policies and controlling how pieces of data are captured, defined, stored and distributed across the enterprise.
There are six foundational components that help Data Governance programs achieve success.
1. Executive Sponsorship
The need for Data Governance starts in the boardroom. Organizations are striving to be more innovative and adaptable to economic realities. This often manifests in the form of executive leadership frequently changing the strategic and tactical direction of the organization. As business needs change so do data needs; therefore, it is critical that senior leadership understand the importance of being data-driven, which means taking strategic action as a result of what the data indicates rather than making intuitive guesses.
Although the senior and mid-level analysts who work with data daily may be the people who most want to drive data governance initiatives, their efforts will only succeed if the executive leadership team is directly involved. Unfortunately, it is common for data quality projects to become an executive-level imperative only when the organization needs to comply with regulations or during a merger or acquisition. This type of initiative is often reduced to a one-off data integration or correction project assigned to IT to implement and maintain.
To motivate executive understanding and involvement in building a data-driven culture, it is necessary to start with a business case. Leadership must see the strategic benefit of doing business through data-driven initiatives, such as how data can drive more reliable sales and marketing campaigns or how data can help improve business processes and reduce operational costs.
When leadership is on board with the value of these initiatives, a strategy and roadmap should be created that shows how the business will transform to the new standard over time. These efforts should not be seen as a single project, but a gradual increase in maturity that is achieved through each initiative.
2. Policies and Standards
Defining the standards and policies that the rest of the organization will follow is the responsibility of the data governance team.
An essential piece of the puzzle is the definition, allowed values, and restrictions of each data element. Using our USPS example above, a key data element is the postal address of a customer. This piece of data can be named, defined, and entered in multiple ways, so an authoritative standard must be set to maintain consistency. Without consistency, data cannot be integrated throughout the organization. An organization might, for example, send mail to an out-of-date customer record, resulting in undeliverable mail and wasted time and money. Ideally, each data element or grouping of data elements is assigned a data steward who documents the standards and ensures quality.
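To make the idea of a documented standard concrete, here is a minimal sketch of how a definition, allowed values, and restrictions for a data element might be recorded and checked. The `DataElementStandard` class, its field names, and the sample rules are all hypothetical illustrations, not part of any particular governance tool.

```python
import re
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class DataElementStandard:
    """Hypothetical record of what a data steward documents for one element."""
    name: str
    definition: str
    steward: str
    pattern: Optional[str] = None                 # format restriction as a regex
    allowed_values: FrozenSet[str] = frozenset()  # empty means no fixed list

    def violations(self, value: str) -> List[str]:
        """Return the ways a value breaks this standard (empty list if none)."""
        if not value or not value.strip():
            return ["value is missing"]
        problems = []
        if self.allowed_values and value not in self.allowed_values:
            problems.append(f"{value!r} is not an allowed value")
        if self.pattern and not re.fullmatch(self.pattern, value):
            problems.append(f"{value!r} does not match the required format")
        return problems


# Illustrative standards only; the allowed-value list is a deliberately tiny subset.
state = DataElementStandard(
    name="state",
    definition="Two-letter postal abbreviation of the customer's mailing state",
    steward="customer data steward",
    allowed_values=frozenset({"CA", "OR", "WA"}),
)
zip_code = DataElementStandard(
    name="zip_code",
    definition="Five-digit ZIP code, optionally followed by a four-digit extension",
    steward="customer data steward",
    pattern=r"\d{5}(-\d{4})?",
)

for element, value in [(state, "WA"), (state, "Wash."), (zip_code, "9810")]:
    print(element.name, repr(value), element.violations(value))
```

Keeping the definition, the steward, and the rules in one record means every entry point can be checked against the same standard rather than each application enforcing its own.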
Data stewards examine how data enters the system through all the possible entry points. For example, data might originate through a point-of-sale application, a website, an iOS or Android application, a mailed-in form, a call center, or other channels. A full data workflow must be created, mapping all of the channels and entry points where errors or inconsistencies might occur. Any transformations the data goes through also need to be documented and distributed to all consumers of the data.
Data can be defined and captured in numerous ways, with different choices benefiting different lines of business. Each standard and policy that is produced will result in a tradeoff between the overall corporate needs and the individual line of business needs. It is critical that data stewards across all lines of business have healthy debates and strike a good balance in the structure and content of every policy and standard.
3. Master Data Management
Master Data Management (MDM) provides a single, authoritative point of reference to ensure that all data stored in enterprise applications is consistently managed. Some organizations adopt Master Data Management solutions hoping that the technology alone will solve their data quality issues. However, MDM does not ensure data quality without human guidance and mature business processes, so organizations are often disappointed when MDM solutions fail to deliver the expected ROI. You still need data governance.
Ideally, an organization establishes data governance before attempting to implement complex data initiatives like MDM and does so to satisfy a specific business strategy. But in reality, the situation is often reversed, resulting in some overlap of effort or missing data standards that were immaterial to the specific challenge MDM was implemented to solve.
The risks of not having a sound MDM strategy are well-known: no “single version of the truth,” data inconsistencies across the enterprise, complex data integration efforts and potential negative business impact due to inaccurate data. This often materializes in the business as operational inefficiencies, lost customer opportunities and decreased revenues.
Organizations should not approach MDM as a “boil the ocean” strategy, but instead focus on the most critical data elements that need to be mastered. For example, retail organizations may want to have a robust MDM solution around their customer data, but may not have as much focus on employee data or vendor data. Additional focus may also be needed; “customer” can be defined many ways (e.g. consumer, business-to-business, user, etc.) and it is best to focus energy on solving critical business problems that add tremendous overall value to the organization first.
4. Data Ownership with the Business
One of the first tasks for an organization’s data governance body is to clearly establish data ownership. Having the business own its data is critical to ensuring a long-term commitment to the overall quality of data and to ensuring that the data aligns with overall business needs. Organizations in which many of the data owners are IT stakeholders should re-evaluate their governance strategy.
Once data ownership has been assumed by the business, data stewards are elected to document procedures and guidelines on how data is defined, transformed, accessed and used. If low data quality leads to poor business decisions, data stewards are held accountable. For example, if revenue is lost by sending the wrong type of customer discount offers, or if you can’t deliver your product because of problems with inventory data, the data owner should be responsible.
Assuming ownership does not mean assuming 100% of the work, however. There are many roles within a high-performing data governance organization and each has its own area of responsibility:
- Chief Data Officer – Provides overall guidance and makes overarching decisions across the entire organization; sets the vision and ensures executive leadership sees value.
- Data Governance Leader – Navigates politics, briefs executives and guides the overall data governance program; the chief lieutenant for the CDO.
- Data Governance Council (Data Owners) – Typically a cross-functional representation of the data owners aligned by their respective business areas. These leaders ensure their own program tracks are progressing and elevate corporate issues as they arise. The council must empower data stewards.
- Data Stewards – Business-aligned resources that set the business process rules, data definitions and help define the standards and policies; actively monitor data quality.
- Data Custodians – The technical resources (often IT-aligned) that ensure data follows the standards and policies defined by the governance council and data stewards.
- Data Consumers and Producers – The main users of the data and the first to be impacted by changes in policies and standards; one of the main stakeholders to consider.
Providing data management functions to data stewards creates impetus for quality, removes silos, and limits redundancy. There must be recognition that certain individuals in the organization are the authoritative resource on a specific subset of data. Most of these individuals already work with this data day to day, so assigning them ownership is simply recognizing their expertise and their accountability. The best people to become data stewards are usually not in IT.
5. Structure and Cadence
Another critical component of a successful Data Governance program is a well-defined structure in which teams meet regularly to discuss critical issues. There are three forms of data governance organizations: decentralized, centralized and hybrid.
- Decentralized: Functional areas operate with complete autonomy, while attempting to maintain global standards to meet specific enterprise requirements.
- Centralized: Data Governance provides a single point of control at the enterprise level for decision-making, with little or no responsibility in the functional areas.
- Hybrid: Data Governance provides a single point of control and decision-making at the enterprise level as well as governance structures in the functional areas.
Each of these structures has pros and cons, and depending on the maturity and complexity of the organization, each can be successful. For example, a decentralized model works well for fast decision making in an individual line of business, whereas a highly centralized model takes a long time to make decisions but is very inclusive of needs across all lines of business.
It is best to examine your business needs and organization structure to determine what works best for your specific use case. Most importantly, once you commit to a structure, your data governance organization should set regular meetings and establish a good cadence and momentum.
The following diagram is an example of how a Data Governance organization might be structured.
6. Measurement and Regular Auditing
One of the most important aspects of data governance is measurement. Data quality must be checked against clearly defined and measurable metrics for the business to assess the result of their data governance effort. At a high level, there are two categories for metrics: quantitative and qualitative.
Quantitative Metrics:
Quantitative metrics are direct measurements that assess the data itself. Some examples of quantitative metrics include completeness, validity, and accuracy:
- Completeness is the presence of a data value within a field, such as a postal address that is not missing street names, according to the rules established for that data element.
- Validity references whether the value is correct within a limited context of reference, such as whether it matches a value in a master data repository.
- Accuracy refers to whether or not data is correct in a real-world context, such as whether the postal address entered is a real address where mail can be delivered—according to, for example, the USPS database.
These metrics constitute hard numbers that assess data quality. For example, a data steward might discover that postal address data is 99% complete, 87% valid and 66% accurate.
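The three quantitative metrics above can be sketched as simple ratios over a set of records. This is a minimal illustration under stated assumptions: the function name `quality_metrics` is hypothetical, `master_addresses` stands in for a master data repository, and `deliverable_addresses` stands in for an external reference such as the USPS database. A real check would normalize addresses before comparing them.

```python
def quality_metrics(addresses, master_addresses, deliverable_addresses):
    """Return completeness, validity and accuracy as fractions of all records."""
    total = len(addresses)
    if total == 0:
        raise ValueError("no records to measure")

    # Completeness: a value is present in the field.
    complete = [a for a in addresses if a and a.strip()]
    # Validity: the value matches an entry in the master data repository.
    valid = [a for a in complete if a in master_addresses]
    # Accuracy: the value is confirmed by a real-world reference source.
    accurate = [a for a in complete if a in deliverable_addresses]

    return {
        "completeness": len(complete) / total,
        "validity": len(valid) / total,
        "accuracy": len(accurate) / total,
    }


# Made-up sample data for illustration only.
addresses = ["1 Main St", "2 Oak Ave", "9 Elm Rd", "7 Pine Ct", None]
master_addresses = {"1 Main St", "2 Oak Ave", "9 Elm Rd"}
deliverable_addresses = {"1 Main St", "2 Oak Ave"}

metrics = quality_metrics(addresses, master_addresses, deliverable_addresses)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

Measuring all three against the same denominator makes the numbers directly comparable, which is what lets a steward report figures like those in the example above side by side.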
Qualitative Metrics:
Qualitative metrics are more indirect. These metrics are established by the business to measure soft objectives like improved customer satisfaction, customer loyalty, business opportunity, regulations compliance, or team collaboration. For example, data collected from surveys, social media reviews, or comment cards can be used to measure customer satisfaction, but the way the business chooses to evaluate this data is subjective.
It is important that all metrics are relevant to the business objectives and establish what success looks like. For some data, this may be a hard number or percentage that the business is trying to hit. For other data, there may be levels of progress or maturity that are measured according to stated objectives and supported by other metrics.
Auditing:
Once measurements for data quality are established, regular audits must be done to ensure compliance. When data quality issues are identified, the source causing the quality issue must be discovered and fixed. For example, a company might discover many misspelled customer names in its database. Rather than just fixing the spellings, the data steward should investigate how the names were entered incorrectly. He or she might determine that most of the misspellings originated in the call center where intake operators guessed how to spell names heard over the phone. This could be corrected by implementing a process where call center representatives are required to ask the customer for a unique identifier, such as an account number or social security number, so that the system can pull the correct customer record.
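One way to trace an issue to its source, as in the call center example above, is to compare error rates across entry channels. The sketch below assumes each audited record has already been tagged with its entry channel and a flag saying whether it failed the check; the function name, channel names, and sample data are hypothetical.

```python
from collections import Counter


def error_rate_by_channel(records):
    """Given (channel, has_error) pairs, return the error rate per channel."""
    totals = Counter()
    errors = Counter()
    for channel, has_error in records:
        totals[channel] += 1
        if has_error:
            errors[channel] += 1
    return {channel: errors[channel] / totals[channel] for channel in totals}


# Made-up audit results for illustration only.
audited = [
    ("call_center", True), ("call_center", True),
    ("call_center", True), ("call_center", False),
    ("website", True), ("website", False),
    ("website", False), ("website", False),
    ("point_of_sale", False), ("point_of_sale", False),
]

rates = error_rate_by_channel(audited)
worst_channel = max(rates, key=rates.get)

for channel, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{channel}: {rate:.0%} of audited records had errors")
print(f"Investigate first: {worst_channel}")
```

Ranking channels by error rate rather than raw error count keeps a high-volume channel from hiding a low-volume one with a much worse process.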
Conclusion
We live in a data-driven economy. Organizations will continue to collect tremendous amounts of data regardless of whether that data is effectively managed. It is essential for businesses to understand that data quality initiatives are not done for the sake of data but for the benefit of the business. At AIM Consulting, we provide both the strategy and resources needed to help businesses establish data governance and optimize management of enterprise data throughout the entire data lifecycle, leading to more reliable business decisions and enhanced productivity and efficiency.
Building a data-driven culture requires a paradigm shift that prioritizes consistency and accountability through data governance. The result should be data that is trustworthy and available to the entire organization. When businesses come to realize the importance of owning and caring for their data, they will cease to fear their data and start profiting from it.