best practice data quality

37
Best Practices for Data Quality

Upload: knjhajpr

Post on 07-Apr-2015

150 views

Category:

Documents


4 download

TRANSCRIPT

Page 1: Best Practice Data Quality

Best Practices for Data Quality

March 2009

Page 2: Best Practice Data Quality

Agenda

Business Driver

Best Practices Overview

Importance of Data Quality

Data Quality Management

– Data Culture, Analyze, Plan, Standardize, Clean & Enrich,

Integrate & Automate, Maintain

Tools and Resources

Additional Information: Data Considerations

– De-duping, Merging, Migration, Integrations & Mapping,

Reporting, IDs

Page 3: Best Practice Data Quality

Business Driver

All organizations buy a CRM tool to derive clear

quantitative metrics on their business. Having bad data

causes user frustration, poor adoption, and may lead to

bad decisions due to inaccurate reports/metrics. The

drive to have accurate data for an organization is critical

since it can provide better and accurate visibility to

increase revenue, reduce costs, increase customer

profitability, and usage. It is important to understand Data

Quality Management best practices using Salesforce.

Page 4: Best Practice Data Quality

Best Practices Overview

Every successful implementation of Salesforce should

have accurate data quality as a CRM goal. This is the

key in generating the right metrics and truly

understanding your customer. This presentation

touches on all of the aspects of creating and

maintaining good data quality.

Page 5: Best Practice Data Quality

Importance of Data QualityPitfalls of Bad Data

Inaccurate report metrics

Bad information wastes users time and effort

Marketing wastes money and effort pursuing bad prospects

Understanding your “customer” is impossible

IT wastes time sifting through information and trying to make

sense of it

Operations has difficulty reconciling data against financial and

other backend information

User get frustrated, you lose valuable buy-in and adoption

Analysts rate bad data as one of the top 3 reasons for CRM failure

Page 6: Best Practice Data Quality

75% of commercial businesses believe that they are losing as much as 73% of revenue due to poor data quality

Experian - QASU.S. Business Losing Revenue Through

Poorly Managed Customer Data

Importance of Data QualityThe Cost of Bad Data

75% ofrespondents

41% ofrespondents

Poor data quality costs U.S. businesses more than $600 billion annually

Data Warehousing Institute.”“

Page 7: Best Practice Data Quality

Data Quality Management Best Practices

Page 8: Best Practice Data Quality

Data Quality Management Best Practices

Data Culture

Analyze

Plan

Standardize, Clean & Enrich

Integrate & Automate

Maintain

Page 9: Best Practice Data Quality

Installing a Culture of Data Quality

IntroductionAnything goes, adoption before data integrity

IntroductionAnything goes, adoption before data integrity

AdaptationRecognize usage trends,Adapt standards to reality

AdaptationRecognize usage trends,Adapt standards to reality

StandardizationTrain to common « best practices »

StandardizationTrain to common « best practices »

Reward / RepressionReinforce best practices,with a carrot AND a stick

Reward / RepressionReinforce best practices,with a carrot AND a stick

IntegrationBuild tools to help multidepartment tasks / processes

IntegrationBuild tools to help multidepartment tasks / processes

AutomationMake everybody’s job easier,and make the company more efficient

AutomationMake everybody’s job easier,and make the company more efficient

1 2 3

456

Page 10: Best Practice Data Quality

Analyze: Data Profiling

Understand your data sources

– Where is everything coming from

Understand your data’s weaknesses

– Rate your data; consider completeness, accuracy, validity,

relevance, integrity, level of standardization and duplication

– Pinpoint your problems and find ways of improving this

Understand your mapping and usage of data

– Entity Level Mapping (Account, Opportunity, Contact)

– Field Level Mapping (state, city etc)

– Don’t duplicate information between entities

Page 11: Best Practice Data Quality

Data Quality Analysis Example: Phone Numbers

Not valid

Not standardized

Not complete

Page 12: Best Practice Data Quality

Plan: Data Quality Management Strategy

Create your Data Quality Plan

Identify and Prioritize Goals

Define Reports and Dashboards

Find Sponsors and Owners

Establish Budget

Select Tools (i.e. for De-Duplication)

Commit Resources

Create Communication Plan

Provide Rewards and Disincentives

Page 13: Best Practice Data Quality

Standardize, Clean & Enrich

Names Company Name & Address

Identify, Match & Score

Load to Sandbox

Find & Replace

1 2 4 5

StandardizeStandardize CleanseCleanse Enrich (Optional)Enrich (Optional) De-dupeDe-dupe ValidateValidate

US, U.S, U.S.A -> USA Acme-Widgets-453

Acme Inc HQAcme UK

J. Smith, John Smith – 80%

Hot HighCold Low

Data Transformation

Hierarchy Data

Demographics Re-parent Child Records

acme incorp.-> Acme Inc

Account: Division, Opportunity, Contact

Naming Conventions

Addresses Merge

Mergers, acquisitions, spin-offs

3

Postal Standards

J. Smith, John Smith -> John Smith

Archiving & Filtering

Validate & Modify

Load to Production

Page 14: Best Practice Data Quality

Standardize Create naming conventions and data standards and train all users

Enforce standards with validation rules and pick-lists

Implement procedures to standardize data before mass-importing

Examples:

– Accounts names: Inc vs. Incorp., INC, incorporated; Ltd vs LTD, Limited

– Opportunity names: i.e. Name – Product: “Acme – 250 Tschotchkes”

– Country/State: use validation to standardize TX vs Texas, USA vs. U.S.

– Postal Code: use validation rules for proper format in US/CAN: xxxxx-xxxx

– Contact info: use pick lists for roles, titles, department: Marketing vd. Mktg

Look for useful validation rules in Help & Training!

Page 15: Best Practice Data Quality

Cleanse Cleanse your Data

– Correct inaccuracies and inconsistencies

– Find and replace bad or missing data

– Remove or merge duplicates

– Leverage all users to fix data (it’s their data)

– Archive irrelevant and old data

– Leverage automated routines/tools

– Routinely reconcile Salesforce data against other data points/systems

Prioritize your data control process

– Fix high visibility/usage information first (duplicates, addresses, emails)

– Fix business specific information next (opportunity types, stages etc)

– Remove duplicate fields (don’t repeat account info on contact)

– Remove irrelevant fields

Page 16: Best Practice Data Quality

Enrich: Data Augmentation

Add missing information from 3rd party sources

– Phone, emails, address info, executive contact information,

– Company demographics, i.e. SIC, Industry, Revenue, Employees, Company Overview, Competitors, Fiscal Year

Understand what data would provide additional value

– Poll your sales and marketing users and see what is needed

Add internally available account intelligence

– Order history

– Purchasing Pattern

– Up-sell opportunity, i.e. products not yet owned

Page 17: Best Practice Data Quality

Integrate

Understand your Masters

– Account Master (Unique ID stored on all other systems)

– Product Master

Avoid stale and bad information from spreading

– Integrated solutions make it easier for users and more reliable for customers

– Create links or integrated apps to avoid duplicates in many systems

– Use and monitor ‘review dates’ for key objects, i.e. account plans

– Archive or flag old/irrelevant data, i.e. contacts not updated in last x months

– Use workflow/approval processes before updating key fields

Create a true “360” view of your customer

– Link order entry, fulfillment apps to Salesforce.com

Make some information read only

– Use processes like “case submission” to update account master information

Product Pricing

SFA

IntegrationTools

Internet

Internet

Accounts

Data Enrichment

Internet

Data Warehouse Leads/Oppty

• Catapult• IMI• Volume• View Central

• Quick Arrow• View Central• Siebel

• SAP• Oracle (Custom)

• SAP

EAI/Middleware• Tibco, WebMethods (Alcatel)• BizTalk

ETL• Assorted

• ???

Standards based Integration• SOA/Web Services• XML

Acct Master based on lifecycle

Page 18: Best Practice Data Quality

: Five paths to integration success

SalesforceAppExchange

Native Desktop

Connectors

Integration Partners

Developer Toolkits

Native ERP Connectors

1 2 3 4 5

A comprehensive family of technologies built on top of the Force.com Web Services API

Page 19: Best Practice Data Quality

Automate Salesforce.com partners can help!

– Leverage 3rd parties such as D&B, Hoovers and others to periodically import and automatically update account records

– Inside Scoop or other partners to augment and cleanse information

Workflow can help!– Emails requesting missing information automatically sent to owner when a record is

incomplete

Force.com can help!– Generate your own alerts through the API

– Script adds missing information

– Script updates erroneous information

Create integration points– Account Master/Product Master/Address Masters

– Address Cleansing

– Keep Relationships automated

Page 20: Best Practice Data Quality

Low Complexity Medium Complexity

High Complexity

Composite Apps• Enterprise Mash-ups• Rich user interface

Application Integration• Real-time integration•Multi-step integration• Human workflow

Data Integration• Data migration• Data replication• Bulk Data Transfers

Data Cleansing• Data de-duplication• Data assessment

4

Scontrol

Data Management ApplicationsForce.com Appexchange app considerations list not all encompassing

Page 21: Best Practice Data Quality

Data Quality Management Best PracticesNative tools for managing data quality

Web-to-X

Leverage tools to prevent duplicates before passing to Salesforce real-time

Import data from various file sources

Features

Use Validation Rules and Workflow

Data Quality Analytics

Use reports and dashboards to measure data quality

Analyze and cleanse data

Excel Connector Data Loader

Page 22: Best Practice Data Quality

Maintain your Data

Data quality decays rapidly & enterprises should follow a methodology that includes regular measurement of data quality with goals for improvement & deployment of process improvements & technology”“

Safeguard your cleansed data and prevent future deterioration

TrainTrain

•User Training•Naming Conventions•Address Conventions•Dupe. Prevention Process•Data Importing Policies

•Required Fields•Default Values•Data Validation Rules•Workflow Field Updates•Web-to-Lead Restrictions

•Data Quality Dashboards•Data Quality Reassessment•AppExchange Tools

EnforceEnforce MonitorMonitor

Page 23: Best Practice Data Quality

Maintain Data Quality: Train and Communicate

Users are trained that data integrity is a collective responsibility

Users are trained on how data will be used (establish reasons for why data

needs to be clean and accurate)

Communicate data quality goals and progress updates

Communicate policies and procedures

Data is always changing so Data Quality processes are on-ongoing!

Page 24: Best Practice Data Quality

Maintain Data Quality: Enforce

Make sure Data Ownership and Sharing is accurate

– Critical to keep data in the right peoples hands

– Designate i.e. super user or geography lead to own regional data quality

– Make sure your hierarchy, groups, teams etc are kept up to date

– Proactively have meetings with management and stakeholders to understand org changes

Define your CRUD rights on each profile

– Give users access rights to only the information they should have

Page 25: Best Practice Data Quality

Maintain Data Quality: Monitor

Use Reports and Dashboards to monitor and identify

erroneous/missing data

Data Quality owners spot check and monitor data on a regular

basis

Create Alerts and workflow to monitor data

Define centralized processes for mass loads

Implement Procedures and Policies

Enlist everyone and hold them accountable

Exception reports run monthly to find incomplete records or records

with incorrect pick list values

Page 26: Best Practice Data Quality

Improvement Checklist

Do you understand what data you have in Salesforce?

– Where is it coming from? What is wrong? What is the business impact?

Have you cleaned your data?

– Identify data owners, ensure permissions are up to date (CRUD)

– Remove duplicates (manually and through tools or partners)

Have you integrated and automated your data?

– Do your applications tie together?

– Are you using workflow for notifications? Are validation rules in place?

Have you augmented your data?

– Have you added information to help your sales users?

Do you monitor your data?

– Get the reports, dashboards and automation in place to monitor the health of your data

Do you have a good data quality culture?

– Is everyone trained and contributing to your data quality? Do users trust the data?

Page 27: Best Practice Data Quality

Tools & Resources

AppExchange - Data Quality tools and offerings

– Data Quality Analysis Dashboards

– Integration & Data Management

– Data Cleansing

– De-duplication Tools - Search term “Data Quality”

Salesforce.com Data Tools

– Apex Data Loader and Excel Connector

Dreamforce Data Quality Sessions

– Data, Data Everywhere

– No More Bad Data

– Wrangle Data & Pump up the Configuration

– Turning Around your Data Quality Dilemma,

– Data Data Data: Start your Spring Cleaning Now

Salesforce Professional Services

– Data Quality Assessment and Cleansing Solutions

Page 28: Best Practice Data Quality

Thank You

Page 29: Best Practice Data Quality

Additional Information

Page 30: Best Practice Data Quality

Data Considerations Addressing duplicate records

– There will most likely be overlapping/duplicate data

– De-dupe either before or after you import the data from one system into the other

• Prior to importing into master account– Export both data sets, merge into one and identify duplicates

– Merge/delete duplicates, import clean file

• After importing into master account– Leverage de-dupe tools in salesforce.com

– Leverage de-dupe tools from partners (www.salesforce.com/appexchange)

– Use a custom field to flag each records source system

• Establish controls and processes to minimize dupe creation and to remove dupes on an ongoing basis

• Consider existing integrations and system of record for your data

Develop rules for merging data– When there are two records for the same entity (i.e., Account), which one ‘wins’?

• Newest record? Most complete record? Record from one of the databases? Most recently updated?

– Determine who will own the records if there are duplicates

• Impacts sharing rules, reporting, etc.

• Leverage for data cleansing that will ensue

Page 31: Best Practice Data Quality

Data Considerations

Establish plan for migrating data

– Determine when master system becomes live/system of record (i.e.,

stop entering data into other system)

– Set date when you will extract all data from the system being merged

– How long will the merge take? How will you deal with interim data? New

data blackout dates? Temporary data ID? How will you communicate to

users?

– Ensure you have a complete copy of both data sets before attempting

any merging … just in case!

Note – if you have not done this type of work before, it is challenging.

Page 32: Best Practice Data Quality

Data Considerations

Create mapping tables

– Every record in Salesforce is assigned a unique 18-digit alpha-

numeric, case sensitive id by salesforce.com

– Relationships between records are established based on these IDs

(i.e., Activity related to a Contact)

– These IDs will change when you import data from one system to

another, as the system will assign it a new ID

– In order to re-create the relationships between records (i.e., import

Activities and associate to the appropriate Contact), you need to

create a mapping table that will allow you to associate the OLD

Contact ID with the new one

Page 33: Best Practice Data Quality

Data Considerations

Create Mapping Tables (cont.)

– Create a temporary/mapping field on each object you will need to map for the

old id (i.e., OLD ACCOUNT ID, LEGACY ID)

– Export all your data from the instance to be retired

• You can do this via the Weekly Export service, reports, the API, Excel

Connector, AppExchange Data Loader or request a one-time full extract

from customer support

• Don’t forget about attachments and Documents!

– Consider “dumping” these to a file server with a unique naming strategy and

use Custom Links from the salesforce.com objects to access

– When importing the data into the master Account, map the Account Id to the

OLD ACCOUNT ID field

– You will then be able to export the new Account Id, OLD ACCOUNT ID and

Account Name to act as your mapping table

Page 34: Best Practice Data Quality

Data Considerations

Created Dates– All records imported/migrated will have a Created Date = to when the import occurs

– To retain original dates, create a custom field to import into (i.e., Original Create Date)

– If you are updating via the API, the new 7.0 version will allow you to set the Created and Last Modified Dates: http://www.sforce.com/resources/tn-17.jsp

Note: You must contact Salesforce support to enable this feature.

History Tables– Stage History for Opportunities / Case History for Cases

– Data cannot be migrated into these tables, this information must be stored elsewhere if you bring it over (“Note” field is not Reportable, so custom field is recommended)

Unique Ids (system generated)– Record Ids are unique and cannot be imported

– Imported records are assigned new Id, it is a good idea to import the old Id into a custom field for mapping purposes

– Features that reference (i.e., Custom Links) unique ids of other objects (i.e., a report) must also be updated

Page 35: Best Practice Data Quality

Data Considerations

Reports

– When reporting on migrated data, date filters must take into account standard

and custom date fields (i.e., Create Date and Original Create Date)

– Other filters on existing reports must be reviewed to ensure they are still

relevant/apply to all data

Record Types (EE/UE only)

– If one of the salesforce.com instances leverages record types, all records

added from the other instance must be assigned a Record Type

– Record Types can be updated through the API, not through the import wizard

– Record Type assignment must also be aligned with user Profiles

Page 36: Best Practice Data Quality

Data Considerations

What if data is inadvertently…

– Deleted• Restore from the Recycle Bin (retained for 30 days)• Restore missing data from backups

– Merged• There is no way to “un-merge” data• Clean up/work with merged records, OR• Delete and restore from back ups

– Imported incorrectly• Mass transfer (if you can)

• Delete and re-import into proper area

• Consider tagging batches with a custom field indicating the load/batch number in case you need to reverse

Page 37: Best Practice Data Quality

Advanced Data Quality