Information is the lifeblood of industry and government, but can your organization trust the quality of the data used to carry out day-to-day business operations? IT departments can spend significant resources implementing and supporting complex business applications, from customer relationship management (CRM) and enterprise resource planning (ERP) to data warehousing and business intelligence (BI), but all too often the data in these systems is inaccurate, incomplete and inconsistent. One reason data quality problems have been difficult to mitigate is that they have been left solely in the hands of IT departments. At Informatica we take the view that data quality is a strategic issue that should be driven by senior management and information consumers in co-operation with the IT department. Data quality should not be the sole preserve of IT.
The cost of poor data quality
Data quality, or information quality, is a growing concern for organizations that are increasingly reliant on technology. Information-intensive applications such as BI, ERP and CRM deliver value only if the data they use is reliable, complete and accurate. Process automation means that data is the foundation of all critical business operations, where poor data quality can lead to breakdowns in the supply chain and poor business decisions. At the same time, in this era of heightened public sensitivity, customers have come to expect better service; misspelling a name or not knowing the status of an order can quickly alienate hard-won customers.
There are endless examples of how defective data can hamper the performance of business applications and cost organizations money. Every year firms waste millions of pounds, dollars and euros on direct marketing effort by mailing expensive materials to undeliverable and duplicate addresses. Poor-quality data also impedes customer segmentation and analysis – according to one analyst firm, three-quarters of financial services providers make business decisions based on sub-optimal data.
Implementing an enterprise application without addressing data quality is a waste of time and resources. Left unaddressed, problems such as redundant data, incorrect mailing addresses, unnecessary printing, postage and staffing costs, and ultimately bad corporate decisions based on bad data lead to a slow but steady erosion of an organization's credibility among customers, suppliers and staff. Because CRM, ERP and BI are central to an organization's ability to do business, everyone who uses these systems – from production staff, the sales force and call center personnel to executive-level professionals and marketing teams – relies on the quality of the data they contain. If users have reason to doubt the currency and quality of the information, the application will fall into disuse and ultimately fail to deliver value to the business.
What is Data Quality?
Good data quality is defined in a number of different ways, but ultimately it is about the data meeting the needs of the information consumer. For example, a marketing manager can still execute a marketing campaign even if 20 per cent of the data used contains duplicate entries or incomplete address information. The campaign may be costly, inefficient and wasteful of resources, but it is unlikely to fail outright. On the other hand, a bank calculating risk exposures for Basel II compliance or an insurance company implementing Solvency II could not work with data of such poor quality.
The marketing manager may have different data needs from the chief risk officer at a bank or the policy-underwriting department at an insurance company, but all will benefit significantly from improved levels of data quality. In a recent article on data quality published in The Data Warehousing Institute’s journal What Works in Data Integration, Neil Hershaw, information management officer at M&S Money in the UK, wrote: “there was obviously the need to satisfy the data quality elements of Basel II, but also taking this enhanced overall approach to data management would give us greater confidence in our activities to create sales and therefore help drive competitive advantage in the market.”
Causes of poor data quality
Data quality problems can occur in many different ways, and their root causes are often organizational rather than purely technical.
Experience in the United States Department of Defense, for example, revealed some time ago that the majority of data errors can be attributed to process problems. Data quality problems often stem from system deficiencies rooted in poorly documented modifications, incomplete user training and user manuals, or systems and data that have been extended beyond their original intent.
This final point is a significant problem in today's business environment, where organizations are keen to gain competitive advantage from their information resources. Businesses are using more data from more sources in more systems. New IT initiatives often mean that data collected for one purpose, such as billing or manufacturing materials management, is being used in other applications such as business intelligence, supply chain management, risk management and customer relationship management. But data collected for use in operational systems may not be suitable for these more data-intensive activities. At the same time, fragmented and distributed IT systems can lead to data duplication, lack of conformity across systems and other discrepancies.
Data quality metrics and KPIs
Defining data quality and then reporting and monitoring progress are necessary to ensure that best practice is embedded within the organization culture. By measuring and tracking how data quality problems arise, you can provide the information needed to change business processes and work practices. Producing and publishing regular data quality reports and scorecards also provides the basis for incentivizing managers and data collectors to provide data that really meets the needs of the organization.
Informatica uses six key dimensions to define and measure good data quality: completeness, conformity, consistency, accuracy, duplication and integrity.
These six dimensions cover the multitude of sins most commonly associated with poor-quality data: data entry errors, misapplied business rules, duplicate records, and missing or incorrect data values. Duplication is probably one of the most serious and persistent problems: duplicate records in a data warehouse, for example, make it difficult to analyze customer habits and to segment customers properly, whether by market or down to the level of household or subsidiary. Inaccurate data results in poor targeting, budgeting, staffing, unreliable financial projections and so on.
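As a rough illustration of how these dimensions translate into measurable checks, a simple profiling pass over a customer table might look like the sketch below. This is a minimal example only – the record layout, field names, reference values and matching rules are assumptions for illustration, not a description of Informatica's product.

```python
# Hypothetical customer records; field names and values are assumed for illustration.
records = [
    {"id": 1, "name": "Ann Lee",  "postcode": "SW1A 1AA", "country": "UK"},
    {"id": 2, "name": "Ann Lee",  "postcode": "SW1A 1AA", "country": "UK"},  # duplicate
    {"id": 3, "name": "Bob Kerr", "postcode": "",         "country": "United Kingdom"},
    {"id": 4, "name": None,       "postcode": "EH1 2NG",  "country": "UK"},
]

def completeness(recs, field):
    """Share of records where the field is present and non-empty."""
    return sum(1 for r in recs if r.get(field)) / len(recs)

def conformity(recs, field, allowed):
    """Share of records whose value matches an agreed reference set."""
    return sum(1 for r in recs if r.get(field) in allowed) / len(recs)

def duplication(recs, key_fields):
    """Share of records that repeat an earlier record on the key fields."""
    seen, dupes = set(), 0
    for r in recs:
        key = tuple(r.get(f) for f in key_fields)
        if key in seen:
            dupes += 1
        seen.add(key)
    return dupes / len(recs)

# A simple scorecard across some of the six dimensions.
scorecard = {
    "name completeness":     completeness(records, "name"),
    "postcode completeness": completeness(records, "postcode"),
    "country conformity":    conformity(records, "country", {"UK"}),
    "duplication rate":      duplication(records, ("name", "postcode")),
}
for metric, value in scorecard.items():
    print(f"{metric}: {value:.0%}")
```

Published regularly, even percentages as simple as these give managers something concrete to track and improve against.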
Data quality for life
To date most organizations that have tackled the data quality issue have tended to implement tactical solutions to improve quality within a single department or within a single business process. While this approach may mitigate the problem for part of the organization in the short term, such limited initiatives generally fail to achieve long-term data quality improvement on a broad scale.
Like a pet, data quality is not just for the holidays – it’s for life. It is not an exercise that can be completed once and then forgotten. The problem with data is that its quality quickly degenerates over time: according to PricewaterhouseCoopers, two per cent of the records in a CRM system can become obsolete in just one month as customers move, divorce, marry or die. Added to this, defective data entry and collection processes introduce incorrect information and duplicate entries into databases on an ongoing basis.
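To see how quickly that figure compounds, a back-of-the-envelope calculation helps. This is an illustration only, assuming the two-per-cent monthly rate stays constant and applies independently each month:

```python
monthly_obsolescence = 0.02  # PwC's figure: roughly 2% of CRM records per month

def still_current(months, rate=monthly_obsolescence):
    """Fraction of records still current after the given number of months,
    assuming a constant, independent monthly obsolescence rate."""
    return (1 - rate) ** months

obsolete_after_year = 1 - still_current(12)
print(f"Obsolete after 12 months: {obsolete_after_year:.1%}")
# → Obsolete after 12 months: 21.5%
```

In other words, under this simple model more than a fifth of an untended customer database decays within a single year.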
So how does an organization tackle the problem of poor-quality data, and who should take responsibility for data quality within the organization? The answer is threefold: clean up the data already in use; stop low-quality data from getting into systems; and remember that data quality degrades once the information is collected, so data quality is a never-ending mission. Solving the data quality issue requires an enterprise-wide approach that combines technology with organizational, cultural and process change.
This can be achieved on a phased basis, but for data quality improvement to become a reality from which your organization reaps dramatic benefits, a long-term, business-led view must be taken. Why business-led? Because there is growing recognition within the data quality market that the focus has historically fallen too heavily on IT requirements such as structure, and not enough on those of the business, which are all about content. Data quality should not be the sole preserve of IT – IT-only efforts tend to be ineffective. It is a strategic issue that should be driven by senior management and business information consumers in co-operation with the IT department.
No organization can be expected to solve all its data quality problems in one go. Ensuring that accurate, consistent and timely data is delivered to the CRM system requires a long-term, step-by-step data quality management program that eventually encompasses all company data. Data quality improvement is not just about fixing data; it involves process and cultural change within your organization to ensure that high standards are maintained, and, like any quality initiative, it is important to put programs in place for continuous improvement.
The reporting and monitoring practices described above are therefore among the key tools for keeping data quality embedded within the organization's culture throughout the life of the program.
The true value of data
Despite nearly 40 years of information technology advances, it is only now that some organizations are realizing the true value of their corporate data. Increasing reliance on automated business processes, more stringent regulation and the ever-increasing pitch of competition are driving organizations to focus on the data they use to run their business as they strive to improve customer service, comply with government regulations and customer mandates, and streamline global operations.
To support today’s business processes and goals, all corporate data needs to be accessible, reliable, accurate, timely and fit for purpose. Organizations need to know more about what is in their source systems; they need to be able to integrate data from multiple systems into new, more productive data-intensive applications; and they need to be able to cleanse and enhance data, as well as to monitor and manage its quality as it is used across different applications.
TDWI and Informatica present: Data Quality Management – Assessing Business Impacts and Data Quality Requirements
Date/Time: January 9, 2008, 12:00 PM, EST
Please join David Loshin, President, Knowledge Integrity, Inc., to learn about the methods and techniques employed to assess business and data quality requirements, and the tools and techniques for ensuring the validity, completeness and consistency of data, enabling the organization to achieve its business objectives.