data quality dashboards
DESCRIPTION
A short deck on how to build a data quality dashboard
TRANSCRIPT
DATA QUALITY DASHBOARDS
HOW TO BUILD ONE AND HOW TO MAINTAIN IT
BILL SHARP
FIRST THING, WELL, FIRST
1. QUALITY IS AN AMBIGUOUS TERM, SO YOU NEED TO DEFINE IT THROUGH YOUR CUSTOMERS' EYES
• TO DO THIS, CONFIRM WHAT THEY CARE ABOUT AND, MORE IMPORTANTLY, WHY THEY CARE ABOUT IT
2. I HAVE SEEN CLIENTS SPEND A LOT OF TIME ON DASHBOARD DESIGN AND WIDGETS, AND NOT ENOUGH TIME ON WHAT THE DASHBOARDS ARE DRIVING
DATA QUALITY DASHBOARD COMPONENTS
• DIMENSIONS & METRICS
• THESE FORM THE BASIC FRAMEWORK FOR A DASHBOARD
• SHOULD BE PURPOSE FIT
• SOME METRICS ARE MORE APPLICABLE FOR CERTAIN ACTIVITIES
• DUPLICATION IS A PURPOSE FIT DIMENSION FOR MDM
• CONFORMITY IS A PURPOSE FIT DIMENSION FOR A MIGRATION EFFORT
• TARGETS & TRENDS
• THESE GIVE STAKEHOLDERS THE ABILITY TO CUSTOMIZE DASHBOARDS
• SHOULD ALSO BE PURPOSE FIT
• TARGETS ARE RELATIVE TO THE METRICS THEY ARE ASSOCIATED WITH
• TRENDING IS A VERY INSIGHTFUL AND QUICK WAY TO GAUGE PROGRESS
DATA QUALITY DIMENSIONS & METRICS
COMMONLY ACCEPTED DIMENSIONS OF DATA QUALITY
1. COMPLETENESS
IS REQUIRED DATA PRESENT?
2. CONFORMITY
IS DATA ADHERING TO DEFINED RULES?
3. CONSISTENCY
IS DATA REPRESENTED THE SAME ACROSS THE ENTERPRISE?
4. DUPLICATION
IS DATA REPRESENTED ONCE AND ONLY ONCE?
5. INTEGRITY
ARE DATA RELATIONSHIPS DEFINED AND ENFORCED?
6. ACCURACY
IS DATA CORRECT? (TYPICALLY REFERENCE DATA LIKE CODES / ADDRESSES / ETC)
DATA QUALITY DIMENSIONS: COMPLETENESS
• IS ALL THE REQUIRED INFORMATION PRESENT?
• IMPLIES THAT THE REQUIRED INFORMATION IS A KNOWN AND THAT IT CAN BE PACKAGED INTO A RULE
• SOME EXAMPLES FROM MY PAST:
• EVERY CUSTOMER MUST HAVE A LAST NAME, ADDRESS LINE ONE AND ZIP CODE PRESENT BECAUSE THIS IS THE ESSENTIAL INFORMATION REQUIRED TO MAIL AN INVOICE
• THIS RULE IS ROOTED IN DATA ELEMENTS AND TIED TO A MEANINGFUL AND VALUE ADDED BUSINESS OBJECTIVE
• THAT’S A GOOD METRIC!
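• A MINIMAL PYTHON SKETCH OF THE INVOICE-MAILING RULE ABOVE, EXPRESSED AS A CHECK THAT YIELDS THE PERCENTAGE OF RECORDS VIOLATING IT. THE FIELD NAMES AND SAMPLE RECORDS ARE ASSUMPTIONS FOR ILLUSTRATION, NOT TAKEN FROM A REAL SYSTEM.

```python
# Hypothetical field names; the rule itself comes from the slide:
# last name, address line one and ZIP code are needed to mail an invoice.
REQUIRED_FIELDS = ("last_name", "address_line_1", "zip_code")


def is_complete(record):
    """Return True when every required field is present and non-blank."""
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            return False
    return True


def violation_percentage(records, rule):
    """Percentage of records that fail the rule (0.0 for an empty list)."""
    if not records:
        return 0.0
    failures = sum(1 for record in records if not rule(record))
    return 100.0 * failures / len(records)


# Made-up sample data for illustration only.
customers = [
    {"last_name": "Ng", "address_line_1": "1 Main St", "zip_code": "02139"},
    {"last_name": "", "address_line_1": "9 Elm Ave", "zip_code": "60601"},
    {"last_name": "Ruiz", "address_line_1": "4 Oak Rd", "zip_code": None},
    {"last_name": "Shah", "address_line_1": "7 Pine Ct", "zip_code": "94105"},
]

completeness_violations = violation_percentage(customers, is_complete)
print(f"Completeness violations: {completeness_violations:.1f}%")
```

• THE SAME PERCENTAGE HELPER CAN BE REUSED FOR ANY RULE THAT RETURNS TRUE OR FALSE PER RECORD.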
DATA QUALITY DIMENSIONS: CONFORMITY
• DOES THE DATA MATCH THE REQUIRED DATA TYPE?
• IMPLIES THAT THE REQUIRED DATA TYPE IS A KNOWN AND THAT IT CAN BE PACKAGED INTO A RULE
• SOME EXAMPLES FROM MY PAST:
• ALL INVOICE AMOUNTS ARE TO BE STORED IN US DOLLARS BECAUSE THERE ARE CALCULATIONS DOWNSTREAM THAT CONVERT THESE AMOUNTS TO OTHER CURRENCIES WHEN REQUIRED
• THIS RULE IS ROOTED IN DATA ELEMENTS AND TIED TO A MEANINGFUL AND VALUE ADDED BUSINESS OBJECTIVE
• THAT’S A GOOD METRIC!
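• ONE WAY THE US DOLLAR RULE ABOVE COULD BE CHECKED, SKETCHED IN PYTHON. IT ASSUMES EACH INVOICE CARRIES A CURRENCY CODE AND AN AMOUNT; THOSE FIELD NAMES AND THE SAMPLE INVOICES ARE HYPOTHETICAL.

```python
from decimal import Decimal, InvalidOperation


def conforms(invoice):
    """True when the amount is a finite decimal number stored in US dollars."""
    if invoice.get("currency") != "USD":
        return False
    try:
        amount = Decimal(str(invoice.get("amount")))
    except InvalidOperation:
        # Not parseable as a number (for example "12,50" or a missing value).
        return False
    return amount.is_finite()


# Made-up sample data for illustration only.
invoices = [
    {"id": 1, "amount": "120.00", "currency": "USD"},
    {"id": 2, "amount": "85.50", "currency": "EUR"},
    {"id": 3, "amount": "12,50", "currency": "USD"},
    {"id": 4, "amount": 40, "currency": "USD"},
    {"id": 5, "amount": "19.99", "currency": "USD"},
]

non_conforming_ids = [inv["id"] for inv in invoices if not conforms(inv)]
conformity_violations = 100.0 * len(non_conforming_ids) / len(invoices)
print(non_conforming_ids, f"{conformity_violations:.1f}%")
```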
DATA QUALITY DIMENSIONS: CONSISTENCY
• IS DATA REPRESENTED THE SAME WAY IN MULTIPLE SYSTEMS?
• IMPLIES THAT THERE IS ONE WAY TO REPRESENT THE DATA IN ALL SYSTEMS, THAT THIS IS A KNOWN AND THAT IT CAN BE PACKAGED INTO A RULE
• SOME EXAMPLES FROM MY PAST:
• ARE ASSETS ASSIGNED TO THE SAME CUSTOMER IN INVENTORY, BILLING AND CRM SYSTEMS?
• THIS RULE IS ROOTED IN DATA ELEMENTS AND TIED TO A MEANINGFUL AND VALUE ADDED BUSINESS OBJECTIVE
• THAT’S A GOOD METRIC!
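• A ROUGH PYTHON SKETCH OF THE CROSS-SYSTEM CHECK ABOVE. IT ASSUMES EACH SYSTEM CAN BE EXPORTED AS A MAPPING OF ASSET ID TO CUSTOMER ID; THE IDS AND THE THREE SAMPLE EXTRACTS ARE INVENTED FOR ILLUSTRATION.

```python
def inconsistent_assets(*systems):
    """Return assets whose customer differs, or is missing, between systems.

    Each system is a dict mapping asset ID to customer ID. The result maps
    each mismatched asset to the tuple of owners, in the order given.
    """
    all_assets = set().union(*(system.keys() for system in systems))
    mismatches = {}
    for asset in sorted(all_assets):
        owners = tuple(system.get(asset) for system in systems)
        if len(set(owners)) > 1:
            mismatches[asset] = owners
    return mismatches


# Made-up extracts for illustration only.
inventory = {"A1": "C100", "A2": "C200", "A3": "C300"}
billing = {"A1": "C100", "A2": "C250", "A3": "C300"}
crm = {"A1": "C100", "A2": "C200"}

mismatches = inconsistent_assets(inventory, billing, crm)
for asset, owners in mismatches.items():
    print(asset, owners)
```

• AN ASSET MISSING FROM ONE SYSTEM SHOWS UP AS None IN THAT POSITION, SO IT COUNTS AS A MISMATCH.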
DATA QUALITY DIMENSIONS: DUPLICATION
• IS INFORMATION REPRESENTED ONCE AND ONLY ONCE?
• IMPLIES THAT HOW TO BREAK DOWN INFORMATION INTO COMPONENTS THAT NEED TO BE REPRESENTED ONLY ONCE IS A KNOWN
• SOME EXAMPLES FROM MY PAST:
• A CUSTOMER, DEFINED BY NAME AND ADDRESS, SHOULD ONLY HAVE ONE ACTIVE RECORD ACROSS THE ENTERPRISE DATA LANDSCAPE
• THIS RULE IS ROOTED IN DATA ELEMENTS AND TIED TO A MEANINGFUL AND VALUE ADDED BUSINESS OBJECTIVE
• THAT’S A GOOD METRIC!
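• A SIMPLE PYTHON SKETCH OF THE ONE-ACTIVE-RECORD RULE ABOVE. THE MATCHING KEY HERE (LOWERCASED NAME AND ADDRESS WITH WHITESPACE COLLAPSED) IS AN ASSUMPTION; REAL MATCHING USUALLY NEEDS MORE CAREFUL STANDARDIZATION. FIELD NAMES AND SAMPLE RECORDS ARE HYPOTHETICAL.

```python
from collections import defaultdict


def match_key(record):
    """Build a crude matching key from name and address.

    Assumption: lowercasing and collapsing whitespace is good enough
    for this illustration.
    """
    name = " ".join(record["name"].lower().split())
    address = " ".join(record["address"].lower().split())
    return (name, address)


def duplicate_groups(records):
    """Return lists of IDs of active records that share a matching key."""
    groups = defaultdict(list)
    for record in records:
        if record["active"]:
            groups[match_key(record)].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Made-up sample data for illustration only.
records = [
    {"id": 1, "name": "Ann Lee", "address": "1 Main St", "active": True},
    {"id": 2, "name": "ann  lee", "address": "1 MAIN ST", "active": True},
    {"id": 3, "name": "Ann Lee", "address": "1 Main St", "active": False},
    {"id": 4, "name": "Bo Chen", "address": "5 Lake Dr", "active": True},
]

duplicates = duplicate_groups(records)
print(duplicates)
```

• INACTIVE RECORDS ARE IGNORED BECAUSE THE RULE ONLY LIMITS ACTIVE RECORDS.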
DATA QUALITY DIMENSIONS: INTEGRITY
• ARE THERE TRANSACTIONAL ORPHANS PRESENT IN THE SYSTEM?
• IMPLIES THAT THE REQUIRED INFORMATION IS A KNOWN AND THAT IT CAN BE PACKAGED INTO A RULE
• SOME EXAMPLES FROM MY PAST:
• EVERY UNIQUE CUSTOMER MUST BE ASSOCIATED WITH AT LEAST ONE ADDRESS
• THIS RULE IS ROOTED IN DATA ELEMENTS AND TIED TO A MEANINGFUL AND VALUE ADDED BUSINESS OBJECTIVE
• THAT’S A GOOD METRIC!
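• A SHORT PYTHON SKETCH OF THE CUSTOMER-TO-ADDRESS RULE ABOVE, PLUS THE REVERSE CHECK FOR ORPHANED ADDRESS ROWS. THE IDS, FIELD NAMES AND SAMPLE ROWS ARE ASSUMPTIONS FOR ILLUSTRATION.

```python
def customers_without_address(customer_ids, addresses):
    """Customer IDs that have no address row at all (rule violations)."""
    ids_with_address = {row["customer_id"] for row in addresses}
    return sorted(cid for cid in customer_ids if cid not in ids_with_address)


def orphan_addresses(customer_ids, addresses):
    """Address IDs that point at a customer that does not exist."""
    known = set(customer_ids)
    return sorted(
        row["address_id"] for row in addresses if row["customer_id"] not in known
    )


# Made-up sample data for illustration only.
customer_ids = ["C1", "C2", "C3"]
addresses = [
    {"address_id": "A1", "customer_id": "C1"},
    {"address_id": "A2", "customer_id": "C1"},
    {"address_id": "A3", "customer_id": "C3"},
    {"address_id": "A4", "customer_id": "C9"},
]

missing = customers_without_address(customer_ids, addresses)
orphans = orphan_addresses(customer_ids, addresses)
print(missing, orphans)
```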
DATA QUALITY DIMENSIONS: ACCURACY
• IS THE DATA VALID/TRUE?
• IMPLIES THAT THE REQUIRED INFORMATION IS A KNOWN AND THAT IT CAN BE PACKAGED INTO A RULE
• SOME EXAMPLES FROM MY PAST:
• EVERY CUSTOMER MUST HAVE A DELIVERABLE ADDRESS
• THIS RULE IS ROOTED IN DATA ELEMENTS AND TIED TO A MEANINGFUL AND VALUE ADDED BUSINESS OBJECTIVE
• THAT’S A GOOD METRIC!
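• A HEDGED PYTHON SKETCH OF THE DELIVERABLE-ADDRESS RULE ABOVE. DELIVERABILITY CAN ONLY BE JUDGED AGAINST TRUSTED REFERENCE DATA, SO THIS ASSUMES A REFERENCE SET OF KNOWN DELIVERABLE ADDRESSES IS AVAILABLE. THAT REFERENCE SET, THE FIELD NAMES AND THE SAMPLE CUSTOMERS ARE ALL HYPOTHETICAL.

```python
def normalize(address_line, zip_code):
    """Uppercase the address, collapse whitespace, and trim the ZIP code."""
    return (" ".join(address_line.upper().split()), zip_code.strip())


def is_deliverable(customer, reference):
    """True when the customer's address appears in the reference set."""
    key = normalize(customer["address_line_1"], customer["zip_code"])
    return key in reference


# Hypothetical reference data standing in for a postal reference source.
deliverable_reference = {
    normalize("1 Main St", "02139"),
    normalize("7 Pine Ct", "94105"),
}

# Made-up sample data for illustration only.
customers = [
    {"id": 1, "address_line_1": "1 main st", "zip_code": "02139"},
    {"id": 2, "address_line_1": "99 Nowhere Ln", "zip_code": "00000"},
    {"id": 3, "address_line_1": "7  Pine Ct", "zip_code": " 94105"},
]

undeliverable_ids = [
    c["id"] for c in customers if not is_deliverable(c, deliverable_reference)
]
print(undeliverable_ids)
```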
TARGETS & TRENDS
TRAFFIC LIGHT TARGET SETTING
• PERCENTAGES REPRESENT THE PERCENTAGE OF RECORDS THAT VIOLATE THE RULE
• HELPS QUICKLY HIGHLIGHT WHAT NEEDS TO BE PRIORITIZED (REDS) AND WHAT IS GOING WELL (GREEN)
• YOU PROBABLY ONLY CARE ABOUT THE RED CATEGORY METRICS
• HIGHLY DEPENDENT ON A GOOD DEFINITION OF WHAT PERCENTAGES ARE GREEN, YELLOW AND RED
• TAKES SOME TWEAKING TO GET IT RIGHT
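• A SMALL PYTHON SKETCH OF TRAFFIC LIGHT TARGET SETTING AS DESCRIBED ABOVE. THE THRESHOLD VALUES AND MEASURED PERCENTAGES ARE INVENTED; EACH METRIC GETS ITS OWN THRESHOLDS BECAUSE TARGETS ARE RELATIVE TO THE METRIC THEY BELONG TO.

```python
def traffic_light(violation_pct, yellow_from, red_from):
    """Map a violation percentage to GREEN, YELLOW or RED.

    A value at or above red_from is RED, at or above yellow_from is
    YELLOW, and anything lower is GREEN.
    """
    if yellow_from > red_from:
        raise ValueError("yellow threshold must not exceed red threshold")
    if violation_pct >= red_from:
        return "RED"
    if violation_pct >= yellow_from:
        return "YELLOW"
    return "GREEN"


# Hypothetical per-metric thresholds: (yellow_from, red_from), in percent.
thresholds = {
    "completeness": (2.0, 5.0),
    "duplication": (0.5, 1.0),
    "integrity": (1.0, 3.0),
}

# Made-up measured violation percentages for illustration only.
measured = {"completeness": 3.1, "duplication": 1.4, "integrity": 0.2}

statuses = {
    metric: traffic_light(pct, *thresholds[metric])
    for metric, pct in measured.items()
}
reds = sorted(metric for metric, status in statuses.items() if status == "RED")
print(statuses, reds)
```

• KEEPING THE THRESHOLDS IN DATA RATHER THAN IN CODE MAKES THE TWEAKING MENTIONED ABOVE EASIER.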
TRENDING: PROGRESS INDICATOR
• YOU PROBABLY CARE ABOUT TRENDS MORE THAN ANYTHING ELSE
• THIS IS THE MEASURE OF REMEDIATION PROGRAM EFFECTIVENESS
• YOU PROBABLY ONLY CARE ABOUT WHAT’S DECLINING OR REMAINING THE SAME (QUALITY IS SUPPOSED TO GET BETTER)
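• A MINIMAL PYTHON SKETCH OF A TREND INDICATOR. IT ASSUMES EACH METRIC HAS A HISTORY OF VIOLATION PERCENTAGES, OLDEST FIRST, SO A FALLING NUMBER MEANS QUALITY IS IMPROVING. THE TOLERANCE VALUE AND THE SAMPLE HISTORIES ARE ASSUMPTIONS FOR ILLUSTRATION.

```python
def trend(history, tolerance=0.1):
    """Classify quality movement between the first and last measurement.

    history holds violation percentages, oldest first. Lower is better,
    so a rising percentage means quality is DECLINING.
    """
    if len(history) < 2:
        return "NO DATA"
    change = history[-1] - history[0]
    if change < -tolerance:
        return "IMPROVING"
    if change > tolerance:
        return "DECLINING"
    return "FLAT"


# Made-up violation percentages per reporting period, oldest first.
histories = {
    "completeness": [6.0, 5.2, 4.1],
    "duplication": [1.0, 1.0, 1.05],
    "integrity": [0.8, 1.1, 1.9],
}

trends = {metric: trend(values) for metric, values in histories.items()}

# Per the slide, only declining or unchanged metrics need attention.
needs_attention = sorted(
    metric for metric, label in trends.items() if label in ("DECLINING", "FLAT")
)
print(trends, needs_attention)
```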