data quality audit


Upload: uniserv

Post on 20-Aug-2015


WHITE PAPER: DQ AUDIT

WHITE PAPER / Uniserv Data Quality Audit: Does the quality of company data meet the requirements of data consumers?

Before suitable measures for improving the quality of data are implemented, it should be ensured that the actions (a) meet the requirements of data consumers and (b) increase the efficiency of the company. The Data Quality Audit is used to determine the status quo of the data quality.

The focus is not only on the data itself. More importantly, the requirements of data consumers with regard to the data are considered, and the data and information creation processes are examined.

The Data Quality Audit has a modular structure, and each module has its own area of focus. Possible concepts for optimization of the data quality can be prepared in conjunction with the specialist departments.

All company and product names and logos used in this document are trade names and/or registered trademarks of the respective companies.

© Uniserv GmbH / +49 7231 936-1000 / All rights reserved.


Contents

The quality of data – what lies behind it?

“Single View of Customer” versus “Single View of Data”

The quality of company master data

Status quo of the quality of the company data: the Uniserv Data Quality Audit can help

Looking ahead

List of references



The quality of data – what lies behind it?

Land, labor and capital are regarded as the conventional production factors in classical economics. However, “knowledge” or “information” is increasingly referred to as one of the important production factors (Wikipedia, Bauer & Günzel 2009). In this context, the high importance of information derived from data becomes apparent.

Another line of thought relates the production of information directly to the manufacture of any desired product (Ballou et al. 1998). These considerations were preceded by the concept of Total Quality Management (TQM). This concentrates on the maximum satisfaction of the requirements for a product and refers to all the processes and departments involved in the production and therefore the entire company (Wikipedia).

The correlation between the quality of data and products can therefore be expressed very simply as follows: if “information” is considered as a product, certain requirements of the users of this product can also be defined and provided as specifications in production processes.

However, how can the requirements for information and knowledge be defined? How can it be verified that the standards specified in the production process are complied with?

Consider the production of a car: it is assumed that the end product has four wheels and that the doors and windows actually open. The engine fits in the body and the safety standards comply with the requirements. All these and more specifications were repeatedly checked during the actual production process and therefore correspond to the previously specified requirements of the subsequent driver of the car. If defects occur during production, it is immediately stopped, and fault tracing and correction are started.


The production of information should not be any different.

THE FOLLOWING EXAMPLE HELPS TO PROVIDE A BETTER UNDERSTANDING OF THE TERMS DATA QUALITY, INFORMATION QUALITY AND KNOWLEDGE:

– 01000010 + 01001101 + 01010111 --> DATA

– char(66) + char(77) + char(87) --> INFORMATION?

– BMW --> Three letters! --> INFORMATION?

– BMW --> Binary Moving Window --> INFORMATION?

– BMW --> Beer with Water --> INFORMATION?

– BMW --> Bayerische Motoren Werke --> INFORMATION?

– Bayerische Motoren Werke --> KNOWLEDGE!

It becomes clear that data is at the beginning of the chain. Information is generated from the data.

But the respective background information is required to provide the meaning of this information and to enable the information to be put in the right context. In the end, the correct conclusions can only be drawn, and new knowledge therefore generated, if the data and information at the beginning of this chain are correct.
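The chain from data to information to knowledge can be sketched in a few lines of Python. This is only an illustration: it assumes that the first bit pattern is the 8-bit code 01000010 for the letter B, and the `MEANINGS` table and the `interpret` function are hypothetical names introduced here.

```python
# Data: three raw 8-bit patterns, as in the example above.
bit_patterns = ["01000010", "01001101", "01010111"]

# Reading each pattern as a character code is a first step towards information.
codes = [int(pattern, 2) for pattern in bit_patterns]
letters = "".join(chr(code) for code in codes)

# Hypothetical context table: the same three letters mean different things
# depending on the background of the reader.
MEANINGS = {
    ("BMW", "signal processing"): "Binary Moving Window",
    ("BMW", "automotive"): "Bayerische Motoren Werke",
}


def interpret(abbreviation: str, context: str) -> str:
    """Return the meaning of an abbreviation in a given context, if known."""
    return MEANINGS.get((abbreviation, context), "unknown")


print(codes)                              # [66, 77, 87]
print(letters)                            # BMW
print(interpret(letters, "automotive"))   # Bayerische Motoren Werke
```

Without the context argument, the lookup cannot decide between the candidate meanings, which mirrors the point that background information is needed before knowledge can be derived.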

Several authors have taken a close look at this subject in order to give the term “data quality” a more tangible form. The starting point is the term “quality” itself. Following the standard EN ISO 9000:2005, quality is the degree to which a product (goods or service) complies with the existing requirements (Wikipedia). Quality is therefore good or bad depending on whether or not the product meets the requirements of the user.

In their studies, Wang and Strong (1996) asked consumers of data to define the properties of good quality data. The German Association for Information and Data Quality (DGIQ) took up this idea and described the categories and dimensions of data quality in an easy to understand manner (Rohweder et al. 2008).


According to this, the quality of data can be divided into four categories and 15 dimensions.

Category: Dimensions
System: Accessibility, Editability
Content: Highly regarded, Freedom from errors, Objectivity, Credibility
Display: Comprehensibility, Clarity, Standard display, Unambiguous interpretability
Use: Up-to-dateness, Added value, Completeness, Reasonable extent, Relevance

Overview of the categories and dimensions of data quality (after Rohweder et al. 2008)

If each of the dimensions stated here is considered to be “good”, the data quality can be assumed to be optimal. Many of the stated dimensions cannot be evaluated by the system with simple performance indicators; instead, the consumers of the respective data always have to decide whether the quality of the data is good.

Larry English (1999) takes a similar basic approach but makes a fundamental distinction between the quality of the data contents (correctness of the data) and the pragmatic quality of the data (data presentation).

It is inherent to both basic approaches that the focus of attention is on the data consumers who receive the data, so that they can carry out their tasks in a satisfactory manner. Or, to express it in the words of Wang & Strong (1996), data quality is defined as “data that are fit for use by data consumers”.


“Single View of Customer” versus “Single View of Data”

It is generally assumed that the “Single View of Customer” represents one of the highest levels of data quality in the consideration of customer master data. In the simplest case, the term “Single View of Customer” refers to a duplicate-free customer master database. With regard to the data quality dimensions stated above, the “unambiguous interpretability” has been considered here. Nevertheless, the “Single View of Customer” should not be equated with the “Single View of Data”.

The illustration below makes the difference clear: whereas the “Single View of Customer” puts the data itself at the focus, the “Single View of Data” refers to the different requirements of the various company departments for the data. If each individual user group were asked about its requirements for the data, there would certainly be different answers or lists of shortcomings.

There is therefore no “Single View of Data” if the requirements for the data vary. A “Differing View of Data” should be referred to instead.

Figure: master data as seen by the departments Marketing, Sales, Finance, Support and Development.

“Single View of Customer” vs. “Single View of Data”: a “Differing View of Data” should be referred to in a company with different departments which have different requirements for the customer master data.


The quality of company master data

There are many indications of possible data quality problems, particularly in the case of customer master data.

THE FOLLOWING ARE MENTIONED HERE BY WAY OF EXAMPLE:

– There is a high proportion of returns in mailing campaigns because of undeliverable addresses

– Customers complain because they receive advertising material several times

– Invoices are not paid, because they never arrived on account of an incorrect address

– Sales and marketing forecast analyses prove to be unreliable, since the prospects of success were booked to duplicate records of various prospective customers

– Customers say that they are dissatisfied with the support, because the employees take too long to find all the relevant data in the system

However, the phenomena described above only concern the initial or most obvious symptoms of poor data quality. Several fundamental requirements for customer master data can be derived from this.

The postal address should be correct and every customer should only be represented in the customer master database once (points 1 to 4).

Furthermore, all the relevant data should be available to the personnel (point 5). The list of requirements for customer master data could probably be extended indefinitely, but in the end, it would be discovered that the requirements for the data are different for each department.


For example, the marketing department attaches high importance to a correct address for mailing campaigns, whereas the staff in customer support depend on the up-to-dateness and completeness of the respective customer products, which should be displayed in a clearly arranged manner. The quality of the data can therefore only be assessed as good or bad by comparing it with the requirements of the respective data consumers.
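The idea that one and the same record can be fit for use for one department and unfit for another can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the record, the field names, the one-year freshness limit and the two requirement functions are invented and do not describe a Uniserv product.

```python
from datetime import date

# One customer record, invented for illustration.
record = {
    "name": "Erika Mustermann",
    "street": "Musterstrasse 1",
    "postal_code": "75179",
    "city": "Pforzheim",
    "products": [],                     # no product data maintained
    "last_updated": date(2012, 3, 1),
}


def marketing_requirement(rec, today):
    """Marketing needs a complete postal address for mailing campaigns."""
    return all(rec.get(f) for f in ("name", "street", "postal_code", "city"))


def support_requirement(rec, today):
    """Support needs complete product data that is at most one year old."""
    age_in_days = (today - rec["last_updated"]).days
    return bool(rec.get("products")) and age_in_days <= 365


# Hypothetical requirement profiles, one per data consumer group.
REQUIREMENTS = {
    "marketing": marketing_requirement,
    "support": support_requirement,
}


def assess(rec, today):
    """Evaluate the same record against each department's requirements."""
    return {dept: check(rec, today) for dept, check in REQUIREMENTS.items()}


result = assess(record, date(2015, 8, 20))
print(result)  # {'marketing': True, 'support': False}
```

The record itself does not change between the two checks; only the requirements do, which is why a single quality verdict per record would be misleading.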

Many of the requirements for data can be automatically tested using appropriate analysis software. Some of the deficiencies can also be corrected in short-term one-off cleansing operations. However, even if the symptoms stated above are brought under control with cleansing operations, the reason why the quality of the data is poor in the first place has neither been found nor eliminated. Nor has it been guaranteed that the cleansing operation meets the requirements of all the data consumers.

A Data Quality Audit should be carried out in order to be able to make a statement about the status quo of the quality of the company data. In the audit, the respective data is not only analyzed by means of software but, more importantly, the requirements of the data consumers are considered.

Not until the results of the audit are known can statements actually be made about which of the company data meets the requirements of the data users and which does not. The “perceived” status of the quality of the data can be verified (or refuted) with objective numbers. Furthermore, appropriate activities for a long-term improvement in the data quality can be considered.


Status quo of the quality of the company data: the Uniserv Data Quality Audit can help

Agreement must be reached about which criteria the data quality should be measured against, in order to carry out an appropriate assessment of the quality of company data. Many of the requirements can be checked by means of suitable analysis tools. The data consumers must be asked about their requirements with regard to other qualities. Finally, the creation process of the “information” product should also be carefully considered. There should be clarity about who requires which data for what purpose.

Uniserv GmbH offers a comprehensive Data Quality Audit, in order to be able to answer these questions and objectively assess the status quo of the quality of the data. The Data Quality Audit has a modular structure; the modules build on each other.

EACH MODULE HAS ITS MAIN AREA OF FOCUS ON ONE OF THE POINTS MENTIONED ABOVE:

– Requirements for the data and their compliance which can be verified by means of analysis software. Data quality dimensions such as completeness or freedom from errors are a major concern here.

– Requirements for the data and their compliance, about which the data consumers can provide information. Data quality dimensions such as comprehensibility or clarity are verified here. The data consumers can also submit their assessments of the credibility or the reputation of the data, and they can provide valuable information about the editability or the accessibility of the data.

– Analysis of the data / information creation processes, in order to be able to identify any weak points. A fundamental understanding of the creation history is important, since the data creation processes frequently go through many individual stages such as different software applications and individual processes which in turn concern different business areas. Knowledge of the processes plays an important role if long-term measures for optimization of the data quality are to be specified.

THE UNISERV DATA QUALITY AUDIT IS THEREFORE DIVIDED INTO THREE MODULES:

MODULE 1: DATA QUALITY CHECK

The Data Quality Check provides an initial view of the customer master data of a company. In this phase, a representative extract of the data is analyzed by means of the Data Quality Batch Suite. In this respect, particular importance is attached to the completeness and the presence of the name elements. The postal correctness of the address elements is verified and a duplicate check is carried out.

The following requirements for the data are assumed in the Data Quality Check:

– All the “must” fields of every data record are filled

– The address elements correspond to a valid address and are therefore correct

– The “Single View of Customer” applies, i.e. the data extract is duplicate-free, or so-called “desired” duplicates are marked

The results of the Data Quality Check are subsequently presented and made available to the customer.
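Two of the requirements assumed in the Data Quality Check, filled “must” fields and a duplicate-free extract, can be sketched on a small delimited sample. This is a simplified illustration and not the method of the Data Quality Batch Suite: the column names, the choice of must fields and the sample rows are assumptions, and duplicates are found only by exact comparison after normalizing case and whitespace, whereas real duplicate detection usually relies on error-tolerant matching. Postal validation is left out because it needs reference data.

```python
import csv
import io
from collections import defaultdict

# Invented sample extract in a semicolon-delimited format.
SAMPLE = """id;name;street;postal_code;city
1;Erika Mustermann;Musterstrasse 1;75179;Pforzheim
2;ERIKA MUSTERMANN ;Musterstrasse 1;75179;Pforzheim
3;Max Beispiel;;75179;Pforzheim
"""

# Assumed "must" fields; the real list is agreed with the customer.
MUST_FIELDS = ("name", "street", "postal_code", "city")


def missing_must_fields(row):
    """Return the must fields that are empty or contain only whitespace."""
    return [f for f in MUST_FIELDS if not (row.get(f) or "").strip()]


def duplicate_key(row):
    """Build a comparison key that ignores case and redundant whitespace."""
    return tuple(" ".join((row.get(f) or "").lower().split()) for f in MUST_FIELDS)


def check(rows):
    """Collect incomplete records and groups of records with identical keys."""
    incomplete = {}
    groups = defaultdict(list)
    for row in rows:
        missing = missing_must_fields(row)
        if missing:
            incomplete[row["id"]] = missing
        else:
            groups[duplicate_key(row)].append(row["id"])
    duplicates = [ids for ids in groups.values() if len(ids) > 1]
    return incomplete, duplicates


rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter=";"))
incomplete, duplicates = check(rows)
print(incomplete)   # {'3': ['street']}
print(duplicates)   # [['1', '2']]
```

Incomplete records are kept out of the duplicate comparison in this sketch, since a key with empty parts would produce misleading matches.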

DATA QUALITY CHECK – TECHNICAL DETAILS

The following are required:

– A data file, ideally in a delimited format

– Definition and meaning of the headers

– Definition of any keys

– Definition of any value ranges

– Character encoding: UTF-8 or ISO Latin-1

– All addresses come from one country

– Maximum of 100,000 addresses
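Some of these preconditions can be tested before a file is handed over. The following sketch makes several assumptions that are not part of the requirements above: the file uses a semicolon as delimiter, it has a column named `country`, and the function name `precheck` is invented. Note that ISO Latin-1 accepts any byte sequence, so the fallback cannot prove that a file really is Latin-1; it only shows that it is not valid UTF-8.

```python
import csv
import io

MAX_ADDRESSES = 100_000  # limit stated in the technical details


def decode(raw: bytes):
    """Try UTF-8 first, then fall back to ISO Latin-1."""
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return raw.decode("latin-1"), "latin-1"


def precheck(raw: bytes, delimiter=";", country_field="country"):
    """Return the detected encoding and a list of violated preconditions."""
    text, encoding = decode(raw)
    rows = list(csv.DictReader(io.StringIO(text), delimiter=delimiter))
    countries = {row[country_field].strip().upper() for row in rows}
    problems = []
    if len(rows) > MAX_ADDRESSES:
        problems.append(f"too many addresses: {len(rows)}")
    if len(countries) > 1:
        problems.append(f"more than one country: {sorted(countries)}")
    return encoding, problems


# Invented sample: Latin-1 encoded, with addresses from two countries.
raw = "name;city;country\nJürgen Müller;Pforzheim;DE\nAnne Martin;Paris;FR\n".encode("latin-1")
encoding, problems = precheck(raw)
print(encoding)   # latin-1
print(problems)   # ["more than one country: ['DE', 'FR']"]
```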


MODULE 2: DATA QUALITY ANALYSIS

The Data Quality Check in Module 1 primarily validates an extract from the customer master data in a relatively simple process. Particular importance is attached to the name elements and the address elements.

The Data Quality Analysis goes a big step further. The entirety of the company data can be considered here. Very specific data, such as telephone numbers, customer sales, persistence, attached data concerning other transactions, etc., can be checked here as required. Compliance with specific business and plausibility rules can also be verified. If required, the customer master data can even be checked against sanction lists at this point. (A check of the in-house customer master data against the sanction lists is generally recommended, in order to prevent contravention of the relevant anti-terrorism regulations. Details can be found in the White Paper on Compliance.)
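How business and plausibility rules might be expressed and applied can be sketched as follows. The rules, field names and records are hypothetical; in an audit they would come from the specialist departments. The sanction list is an invented placeholder, and the comparison is a plain normalized name match, which is far simpler than a real sanction list check.

```python
import re
from datetime import date


def phone_is_plausible(rec, today):
    # Optional leading "+", then 6 to 20 digits, spaces or common separators.
    return re.fullmatch(r"\+?[0-9 /()-]{6,20}", rec["phone"]) is not None


def sales_not_negative(rec, today):
    return rec["annual_sales"] >= 0


def customer_since_not_in_future(rec, today):
    return rec["customer_since"] <= today


# Hypothetical rule set: (label, rule function).
RULES = [
    ("phone has a plausible format", phone_is_plausible),
    ("sales are not negative", sales_not_negative),
    ("customer-since date is not in the future", customer_since_not_in_future),
]

# Invented placeholder entries, not a real sanction list.
SANCTIONED_NAMES = {"john doe"}


def normalize(name):
    """Lower-case a name and collapse redundant whitespace."""
    return " ".join(name.lower().split())


def violations(rec, today):
    """Return the labels of all rules the record fails."""
    failed = [label for label, rule in RULES if not rule(rec, today)]
    if normalize(rec["name"]) in SANCTIONED_NAMES:
        failed.append("name matches sanction list")
    return failed


today = date(2015, 8, 20)
good = {"name": "Erika Mustermann", "phone": "+49 7231 936-0",
        "annual_sales": 1200.0, "customer_since": date(2010, 5, 1)}
bad = {"name": "John  DOE", "phone": "n/a",
       "annual_sales": -5, "customer_since": date(2030, 1, 1)}

print(violations(good, today))  # []
print(violations(bad, today))
```

Keeping each rule as a labeled function makes it straightforward to report, per rule, how many records fail, which is the kind of objective number the audit aims for.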

Specific requirements for the data to be validated are identified in an opening workshop with the heads of the specialist departments concerned. A comparison of the technical analyses and the evaluations of the workshop will indicate the extent to which the requirements defined by the specialist departments correspond with the actual and the “perceived” quality level.

After the analyses and evaluations have been completed, the results are presented in a closing workshop. It is recommended that the heads of all the specialist departments concerned are invited, in order to take account of the “Differing View of Data”.

If any measures for optimization of the data quality are to be adopted, it is indispensable that the consumers of the data are included in the decision-making process. Only in this way will the implementation of the measures be widely accepted. Needless to say, the results of the Data Quality Analysis are also provided in writing.


DATA QUALITY ANALYSIS – TECHNICAL AND ORGANIZATIONAL DETAILS

The following are required:

– Several files or databases

– Description of the metadata:

  – Definition and meaning of the headers

  – Definition of any keys

  – Definition of any value ranges

  – Character encoding: UTF-8 or ISO Latin-1

– Information about the business and/or plausibility rules to be verified

– Contact persons from the various departments


MODULE 3: DATA QUALITY PROCESS ANALYSIS

After the status quo of the data quality has been determined in both the previous modules, Module 3 deals with the creation of the data in the company and its actual efficiency of use for the data consumers. The following questions are focused on here:

– How are the processes for the creation, change and deletion of the data described?

– Are these processes up-to-date and are they put into practice?

– Do the data and information generated by the processes enable the consumers to work as efficiently as possible?

The processes are analyzed with regard to the previously prepared requirements for the data. Any weak points in respect of the data quality are identified. The data consumers are also asked for their assessment of the quality of the data. The emphasis in these interviews is on whether the contents and form of the data are presented in such a way that the daily work can be carried out with a high degree of efficiency. These requirements apply both to the operative business and to analytical business areas.

Data quality dimensions which can only be assessed with great difficulty by means of analysis software are examined in the interviews with the data consumers. These include, for example, the dimensions credibility, accessibility, editability and objectivity. Since each of the consumer groups concerned should be included in the interviews, the various views and requirements for the data can be considered once more.
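Interview results of this kind could be condensed into comparable figures per consumer group and dimension. The sketch below is one possible way to do this and rests on assumptions: the 1 to 5 rating scale (5 being best), the threshold of 3.0 and all sample answers are invented.

```python
from collections import defaultdict
from statistics import mean

# Invented interview answers: (consumer group, dimension, rating from 1 to 5).
ANSWERS = [
    ("marketing", "credibility", 4),
    ("marketing", "accessibility", 5),
    ("support", "credibility", 2),
    ("support", "accessibility", 3),
    ("support", "credibility", 3),
    ("finance", "editability", 2),
]


def average_ratings(answers):
    """Average the ratings per (consumer group, dimension) pair."""
    collected = defaultdict(list)
    for group, dimension, rating in answers:
        collected[(group, dimension)].append(rating)
    return {key: mean(values) for key, values in collected.items()}


def weak_points(averages, threshold=3.0):
    """Return the pairs whose average rating lies below the threshold."""
    return sorted(key for key, value in averages.items() if value < threshold)


averages = average_ratings(ANSWERS)
print(averages[("support", "credibility")])  # 2.5
print(weak_points(averages))
# [('finance', 'editability'), ('support', 'credibility')]
```

Keeping the consumer group in the key preserves the differing views of the departments instead of averaging them away.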

DATA QUALITY PROCESS ANALYSIS – TECHNICAL AND ORGANIZATIONAL DETAILS

The following are required:

– The relevant processes and workflow descriptions

– Contact persons (at least 2 to 3 data consumers) from the departments concerned


ALL THE ANALYSIS RESULTS ARE PUT INTO A CONTEXT AT THE END OF THE DATA QUALITY AUDIT:

– The evaluations made in the Data Quality Analysis

– The requirements of the specialist departments for the data

– Assessments of the data quality and individual requirements of the data consumers

– The status quo of the information-generating processes with regard to the identified requirements for the data

The knowledge gained thereby is presented to the specialist departments and data consumers concerned in a workshop. Discussions on the possible causes of inadequate data quality are encouraged. Optimization measures and customization of the processes to improve the quality of the data can also be discussed.

The results of Module 3 and the findings of the discussions conducted in the workshop are made available in writing.


List of references

– Wikipedia: http://de.wikipedia.org/wiki/Produktionsfaktor

– Wikipedia: http://de.wikipedia.org/wiki/Total-Quality-Management

– Wikipedia: http://de.wikipedia.org/wiki/Qualität

– Bauer, A., Günzel, H. 2009. Begriffliche Einordnung. In: Bauer, A., Günzel, H. (eds.) Data Warehouse Systeme. Architektur, Entwicklung, Anwendung. dpunkt.verlag. p. 6.

– Ballou, D., Wang, R., Pazer, H., Tayi, G.K. 1998. Modeling Information Manufacturing Systems to Determine Information Product Quality. Management Science, Vol. 44, p. 462-484.

– Wang, R. Y., Strong, D. M. 1996. Beyond Accuracy. What Data Quality Means to Data Consumers. Journal of Management Information Systems, Vol. 12, p. 5-34.

– Rohweder, J.P., Kasten, G., Malzahn, D., Piro, A., Schmid, J. 2008. Informationsqualität - Definitionen, Dimensionen und Begriffe. In: Hildebrand, K., Gebauer, M., Hinrichs, H., Mielke, M. (eds.) Daten- und Informationsqualität. Auf dem Weg zur Information Excellence. Vieweg & Teubner. p. 25-45.

– English, L. P. 1999. Improving Data Warehouse and Business Information Quality. Methods for reducing costs and increasing profit. Wiley Computer Publishing. 518 pp.

Looking ahead

The status of the quality of the company data has been determined and initial discussions about measures for its improvement have been conducted. What is the next step?

Irrespective of the areas of the company in which optimizations are to be implemented and the measures which have been considered, you will find the right contact partner at Uniserv GmbH.

As a solution-oriented provider covering all aspects of data quality, Uniserv offers support in the implementation and optimization of operative and analytical business applications. Uniserv is also an expert partner in the areas of direct marketing, compliance / block lists and data migrations.

Individual solution concepts which improve the quality of the data and information in the long term are developed together with the customers.

As a result, the day-to-day business can be carried out more efficiently, business numbers are reliable and strategies for the future are successful.

For further information about the Uniserv Data Quality Audit, please visit our web page www.uniserv.com/Audit or contact us directly.

We look forward to advising and supporting you throughout your project.


Uniserv

Uniserv is the largest specialised supplier of data quality solutions in Europe with an internationally usable software portfolio and services for the quality assurance of data in business intelligence, CRM applications, data warehousing, eBusiness and direct and database marketing.

With several thousand installations worldwide, Uniserv supports hundreds of customers in their endeavours to map the Single View of Customer in their customer database. Uniserv employs more than 110 people at its headquarters in Pforzheim and its subsidiary in Paris, France, and serves a large number of prestigious customers in all sectors of industry and commerce, such as ADAC, Allianz, BMW, Commerzbank, DBV Winterthur, Deutsche Bank, Deutsche Börse Group, France Telecom, Greenpeace, GEZ, Heineken, Johnson & Johnson, Nestlé, Payback, PSA Peugeot Citroën as well as Time Life and Union Investment.

Further information is available at www.uniserv.com

Experience: OVER 40 YEARS

Market position: LARGEST EUROPEAN SUPPLIER

Employees: MORE THAN 110 PEOPLE

Figure labels: DIRECT MARKETING, BI/BDW, CPM, CRM, ERP, E-COMMERCE, DATA MIGRATION, PROJECTS, SOA, ON-PREMISE/ON-DEMAND, MDM/CDI, COMPLIANCE

Contact: +49 7231 936-0

Uniserv GmbH • Rastatter Straße 13 • 75179 Pforzheim • Germany • T +49 7231 936-0 • F +49 7231 936-3002 • E [email protected] • www.uniserv.com

© Copyright Uniserv • Pforzheim/Germany • All rights reserved.