IST722 Data Warehousing
Master Data Management and Data Governance
Michael A. Fudge, Jr.
Two sides of the same coin…
MDM DG
What is Master Data Management?
• Creating a single “Reference Copy” of key business entities.
• Examples: Customers, Vendors, Products, Employees.
• Offers an organization a single version of what “Customer” means.
• Helps reduce inconsistencies in data distributed throughout the enterprise.
• Aims to provide clean, reliable data.
• MDM systems can provide automated rules and utilities to maintain “golden records” for business entities.
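One way to picture how a “golden record” might be maintained is a per-attribute source-priority rule. The sketch below is only an illustration: the system names (registrar, admissions, alumni), the attributes, and the priority order are all hypothetical, and a real MDM product would supply its own rule engine.

```python
# Hypothetical source-priority rules: for each attribute, which system wins.
# In practice these rules would come from data governance, not from code.
SOURCE_PRIORITY = {
    "name": ["registrar", "admissions", "alumni"],
    "email": ["alumni", "registrar", "admissions"],
}


def build_golden_record(records_by_source):
    """Merge one entity's records from several systems into a golden record.

    records_by_source maps a source-system name to that system's copy of
    the entity (a dict of attribute -> value). Missing or empty values are
    skipped so that a lower-priority source can fill the gap.
    """
    golden = {}
    for attribute, sources in SOURCE_PRIORITY.items():
        for source in sources:
            value = records_by_source.get(source, {}).get(attribute)
            if value:  # first non-empty value in priority order wins
                golden[attribute] = value
                break
    return golden


# The same person as two (hypothetical) systems currently describe them.
person = {
    "admissions": {"name": "Bob Smith", "email": "bob@example.com"},
    "registrar": {"name": "Robert Smith", "email": ""},
}
golden = build_golden_record(person)
```

Here the name is taken from the registrar, while the email falls through to admissions because the higher-priority sources have no value for it.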
What is Data Governance?
• A committee or person acting as the administrator for the data.
• Decides what is and is not master data.
• Cares for the data and certifies its accuracy.
• Builds relationships and hierarchies among the data not found in existing systems.
• Decides on “rules” for the population of MDM golden records.
Some Problems Can Only Be Solved With…
Master Data Management
1. Unifying disparate sources of the same data.
2. Cleaning up dirty data.
3. Establishing data hierarchies and relationships not found in source systems.
4. Completing weighting factors in many-to-many (M-M) dimensions.
Data Governance
1. Which components from each source are the authority?
2. How should the data be cleaned?
3. What are the rules we use to establish these hierarchies / relationships?
4. What formula / criteria should be used for weighting factors?
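As a rough illustration of item 4, the sketch below fills in weighting factors for a many-to-many bridge table using the simplest possible formula: an equal split, so each group's weights sum to 1. The equal-split rule, the function name, and the sample rows are assumptions made for illustration; choosing the actual formula is the data governance decision.

```python
from fractions import Fraction


def equal_weights(bridge_rows):
    """Assign each (group, member) row a weight of 1/n within its group."""
    counts = {}
    for group, _member in bridge_rows:
        counts[group] = counts.get(group, 0) + 1
    return [
        (group, member, Fraction(1, counts[group]))
        for group, member in bridge_rows
    ]


# Hypothetical student-to-course bridge rows.
bridge = [("bob", "IST722"), ("bob", "IST552"), ("amy", "IST722")]
weighted = equal_weights(bridge)

# Check that the weights for each student add up to exactly 1.
totals = {}
for group, _member, weight in weighted:
    totals[group] = totals.get(group, 0) + weight
```

Using Fraction rather than float keeps the per-group totals exact, which makes the "weights sum to 1" check reliable.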
Example: “Students”
• Applicants become Students. When? How?
• Students become Alumni. When? How?
• Students have Campus and Home addresses. Which one do I use?
• A Student goes through a legal name change. Where do I change it?
• Some Students are Alumni.
• Some Applicants are Students.
• How do you deal with the complexity of who is and is not a student? And when are they, and when are they not, a student?
• Who makes these decisions?
• Where is the authoritative source for this data?
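To make the "When? How?" questions concrete, here is a minimal sketch of what a governance rule for roles could look like once it is written down. The cut-over rules (applicant until the enrollment date, student until the graduation date, alumni afterward), the field names, and the dates are all illustrative assumptions, not an actual institutional policy.

```python
from datetime import date


def roles_on(programs, as_of):
    """Return the set of roles a person holds on a given date.

    Each program is a dict with 'applied', 'enrolled', and 'graduated'
    dates (None if the event has not happened). A person can hold
    several roles at once because they may have several programs.
    """
    roles = set()
    for p in programs:
        applied, enrolled, graduated = p["applied"], p["enrolled"], p["graduated"]
        if graduated and graduated <= as_of:
            roles.add("alumni")
        elif enrolled and enrolled <= as_of:
            roles.add("student")
        elif applied and applied <= as_of:
            roles.add("applicant")
    return roles


# Hypothetical person: finishes one program, then applies to a second.
bob = [
    {"applied": date(2010, 1, 15), "enrolled": date(2010, 8, 30),
     "graduated": date(2014, 5, 11)},
    {"applied": date(2014, 2, 1), "enrolled": date(2014, 8, 25),
     "graduated": None},
]
spring_roles = roles_on(bob, date(2014, 3, 1))
fall_roles = roles_on(bob, date(2014, 9, 1))
```

Returning a set rather than a single status reflects the overlap noted above: in this example Bob is both a student and an applicant in the spring, and both an alumnus and a student in the fall.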
How does this relate to data warehousing?
MDM and DG initiatives will help us create conformed dimensions in our data warehouse, specifically when the dimensional data is sourced from multiple systems.
[Diagram: without MDM, OLTP1, OLTP2, and OLTP3 each feed the Dimension directly; with MDM, OLTP1, OLTP2, and OLTP3 feed MDM, which feeds the Dimension.]
Who decides which OLTP the dimension row comes from?
IBM MDM in Action: https://www.youtube.com/watch?v=cH9r60Lru6A
Group Activity: DreamSpark Downloads
• Our Goal: Track DreamSpark downloads back to the course which requires it.
• The Challenge: The download is associated with 1 student, but 1 student enrolls in many courses.
• How can MDM and DG solve this problem?
• Outline a strategy.
[Diagram: Download (SQL Svr.) M … 1 Student (Bob) 1 … M Enrollment (Bob IST722, Bob IST552) M … 1 Course (IST722, IST522). Which course gets credit for this download?]
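One possible strategy, sketched under stated assumptions: data governance maintains a mapping of which courses require which software, and the download is credited to those of the student's enrolled courses that require it, splitting the credit evenly if more than one matches. The mapping contents, the function name, and the even-split rule are hypothetical, and this is only one of several defensible answers to the activity.

```python
# Hypothetical governance-maintained mapping of course -> required software.
# Which courses actually require SQL Server is an assumption for this sketch.
REQUIRED_SOFTWARE = {
    "IST722": {"SQL Server"},
    "IST552": set(),
}


def credit_download(software, enrolled_courses):
    """Split credit for one download across the enrolled courses requiring it.

    Returns a dict of course -> share of the download. An empty dict means
    no enrolled course requires the software, so the download would need
    to be reviewed or left unattributed.
    """
    candidates = [
        course for course in enrolled_courses
        if software in REQUIRED_SOFTWARE.get(course, set())
    ]
    if not candidates:
        return {}
    share = 1 / len(candidates)
    return {course: share for course in candidates}


# Bob is enrolled in two courses and downloads SQL Server.
credit = credit_download("SQL Server", ["IST722", "IST552"])
```

In this example only IST722 requires the software, so it receives the full credit for Bob's download.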