Copyright © 2008 Initiate Systems, Inc.
Implementing Enterprise Information Programs with MDM-CDI & SOA
Larry Dubov, Sr. Director, Sales Consulting & Architecture
New York, NY
May 15, 2008
DAMA-NY 2008
2
Agenda
Definitions
Why MDM and EDM Now and Key Challenges
MDM and Data Hub Capabilities
Data Stewardship Framework and Information Quality
SOA and Data Services: Strengths and Weaknesses
Information Management Methodology
Lessons Learned and Accelerators
What are the next disruptive things in EDM and MDM?
3
What is Master Data Management (MDM)?
Master Data Management (MDM) is a framework of processes and technologies aimed at creating and maintaining an authoritative, reliable, sustainable, accurate, and secure data environment that represents a “single version of the truth,” an accepted system of record used both intra- and inter-enterprise across a diverse set of application systems, lines of business, and user communities.
Master data are those data that are foundational to business processes and usually widely distributed; when well managed they contribute directly to the success of an organization, and when not well managed they pose the most risk.
4
Customer Data Integration (CDI) is the “Entry Point” to MDM
[Figure: MDM and CDI]
CDI Focus Is on Individual & Organizational Entities: Customers, Prospects, Patients, People, Citizens, Employees, Vendors, Suppliers, Trading Partners
MDM Expands the Problem to Include New Entities: Products, Equipment, Financial Assets, Vessels, Containers, Weapons, Locations, Drugs, Vehicles
Domains: Party (CDI), Product (PIM)
5
Key Challenges
Very complex, multidimensional, and multi-disciplined; can be risky
Difficult to sell data initiatives to the business and executive management
No single vendor provides a comprehensive solution
These factors mandate development of, and reliance on, sound models, open integration standards, and methodologies for building holistic solutions from multiple best-of-breed components
[Figure: solution components] New Customer & Relationship Centric Business Processes; CDI Consuming Applications; Customer Identification, Correlation & Grouping; Service Oriented Architecture; Business & Operational Reporting; Exception Capture & Processing; Data Acquisition, Distribution & Synchronization (Batch & Real Time); Visibility, Security, Confidentiality, Compliance; Metadata Management; Data Governance & Standards; Information Quality; External Data Providers
6
Enterprise Customer Data Managed by Lines of Business
Enterprises are organized by lines of business and manage customer information in a product-centric model with overlapping customer domains
[Figure: Business Lines 1 through 5, each with its own Business Line Products and Business Line Customers]
7
So, What’s the MDM-CDI Focus? The Need to Transition from Product/Account to Party… It’s a Big Deal
Current State: Account Grouping
Account 123: Product 1; E-Statement; KYC Acct & Client Docs Approval
Account 456: Product 2; E-Statement; KYC Acct. & Client Docs Approval
Account 789: Product 3; Paper Statement; KYC Acct & Client Docs Approval
Future State: Household Client Grouping
Account 123: Product 1; Acct. Attr. Only
Account 456: Product 2; Acct. Attr. Only
Account 789: Product 3; Acct. Attr. Only
Clients: Joe, Mary (Spouse); E-Statement, KYC Client Doc Approval; Paper Statement, KYC Client Doc Approval
Account roles: Owner, Owner, Beneficiary, Owner, Owner (Joint)
[Figure labels: Derived; Joe, Mary, Joe, Mary, Mary; Joe, Mary 1, Mary 2]
8
Drivers
Business Area: Drivers
Business Development, Sales & Marketing: Cross sell/up sell to existing customers; Effectiveness of marketing campaigns; Recurring revenue from existing customers; Retain “good” customers by reducing attrition rates; Recognize “bad” customers
Customer Service: Account setup time; Customer service time; Customer intelligence and level of service; Consolidated statements
Risk, Privacy, Compliance & Control: Risk Management; Accurate Books & Records; Compliance with AML & KYC Regulations; Compliance with corporate standards and policies; Regulatory fines and penalties
Operational Efficiency: Account setup costs; Customer acquisition costs; Administrative overhead of redundant data entry; Operation costs (duplication, redundancies, transaction errors, data processing errors & exceptions); Failed tactical initiatives; Reduced costs of planned initiatives due to CDI
9
MDM Is Adding Value When You…
… Purchase Software
… Pick Up Your Prescription
… Apply for a Loan
… Check In & Earn Points
… Identify Risks
… File Insurance Claims
… Register a Patient
11
Single Version of The Truth
Source systems: Sales, Customer Support, E.R.
Source record: Name: James Jones; Address: 35 W. 15th Street; Address 2: Toledo, OH
Source record: Name: Jim Jones; Address: 35 West 15th St.; Address 2: Toledo, OH
Source record: Name: J. Jones; Address: 35 West 15th Street; Address 2: Toledo, OH
Single version of the truth: First: James; Last: Jones; Address: 35 West 15th Street; City: Toledo; State: OH; Gender: M; Age: 30
Merge and Persist or Composite View
12
Single Version of Truth: Commercial Customers
Provides accurate, real-time access to complete customer or entity data across disparate sources, systems and networks
Sources: D&B, AIU, Back Office
Source record: Name: ABC Incorporated; Addr: 9146 E VIA DEL SOL, NETOWN, CA 45883; Cont.: Joe Smith; Phone: 480-473-5620
Source record: Name: ABCC Incorp.; Cont.: Joseph Smithe; Phone: 304-473-5602
Source record: Name: ABC Inc.; Cont.: Will Jones; Phone: 480-473-5620
Source record: Name: AB&C; Addr: 9146 VIA DEL SOL, NETOWN, CA 45883; Phn: 480-473-5620
Trusted System of Record: Name: ABC Inc.; Addr: 9146 VIA DEL SOL, NETOWN, CA 45883; Cont.: Joseph Smith; Cont.: Will Jones; Phone: 480-473-5620
13
Why Do We Need MDM? (DW is Only as Good as Its Dimensions)
[Figure: data warehouse dimensions Customer, Product, Time]
Can we really ‘slice and dice?’ Traditional Deterministic ETL may not be sufficient…
This is where Probabilistic MDM enabled by Data Hubs comes in
14
Star Schema Hubs
If your MDM solution is BI driven, align your MDM solution with complex DW dimensions
Hubs: Customer Hub, Account Hub, Branch Hub, Product Hub
Dimensions: Customer, Account, Branch, Product, Time, Status
Facts
15
The Initiate MDS Solution in an Enterprise Architecture
Initiate Master Data Service™
Capabilities: Transaction Support; Security & Audit Trail; Data Stewardship; Batch Support; Matching Accuracy; Performance, Scalability; Master Data Views; Relationships, Hierarchies
Interfaces: Web Services; APIs; Loads & Extracts; Orchestrated Services; Messaging; Events & Alerts
Connected applications: Orders, CRM, Call Center, Sales, Profiles, Web, Self-Service, Marketing
16
Implementation Styles
Consolidation Style: Link, Co-Exist, Combine
Integration Style: Batch, Near Real-Time, Real-Time
Ownership Style: Registry/Slave, Hybrid, Transaction/Master
17
More on Matching and Linking
Step 1: Optimizes data for statistical comparisons
• Normalizes & compacts data, creates derived data layer; source data remains intact
• Phonetic equivalences, tokenization, nicknames, etc.
Step 2: Finds all the potential matches
• Casts a wide net – all matches on current or historical attributes; prevents misses
• Partial matches, reversals, anonymous values, etc.
Step 3: Scores accurately via probabilistic statistics
• Compares attributes one-by-one and produces a weighted score (likelihood ratio) for each pair of records
• Frequency weights specific to your business
• Edit distance, proximity of match
• Allows custom deterministic rules, e.g. false positive filters
Step 4: Custom threshold settings
• Single or dual threshold models
• Link, don’t link, don’t know – “learns” from manual input
• Manage cost/quality trade-offs
• Manage the linkages, workflow review
[Figure: score scale from lowest possible score to highest possible score. Below the lower threshold: don’t link; between the lower and upper thresholds: manual review; above the upper threshold: link. Labels: Should not be linked, Should be linked]
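The scoring and threshold steps above can be sketched roughly as follows. This is a minimal illustration, not the Initiate algorithm: the equivalence table, the m/u probabilities, the similarity cutoff, and the two thresholds are all assumed values chosen for the example, and candidate selection (Step 2) is omitted.

```python
import math
from difflib import SequenceMatcher

# Hypothetical equivalence table (nicknames, abbreviations) used to build
# the derived comparison layer; the source values are never modified.
EQUIVALENTS = {"jim": "james", "w": "west", "st": "street"}

# Assumed per-attribute probabilities:
# m = P(attribute agrees | same entity), u = P(attribute agrees | different entities)
WEIGHTS = {
    "name": {"m": 0.95, "u": 0.01},
    "address": {"m": 0.90, "u": 0.05},
}


def normalize(value):
    """Step 1: lower-case, strip punctuation, and expand known equivalents."""
    tokens = value.lower().replace(".", "").replace(",", "").split()
    return " ".join(EQUIVALENTS.get(t, t) for t in tokens)


def agrees(a, b, cutoff=0.85):
    """Approximate (edit-distance style) agreement on normalized values."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= cutoff


def score(rec_a, rec_b):
    """Step 3: sum of per-attribute log likelihood ratios for one record pair."""
    total = 0.0
    for attr, p in WEIGHTS.items():
        if not rec_a.get(attr) or not rec_b.get(attr):
            continue  # a missing value contributes no evidence either way
        if agrees(rec_a[attr], rec_b[attr]):
            total += math.log2(p["m"] / p["u"])
        else:
            total += math.log2((1 - p["m"]) / (1 - p["u"]))
    return total


def decide(pair_score, lower=0.0, upper=8.0):
    """Step 4: dual-threshold model with a manual review band in between."""
    if pair_score >= upper:
        return "link"
    if pair_score < lower:
        return "don't link"
    return "manual review"


a = {"name": "James Jones", "address": "35 W. 15th Street"}
b = {"name": "Jim Jones", "address": "35 West 15th St."}
print(decide(score(a, b)))
```

Widening the gap between the two thresholds sends more pairs to data stewards and fewer to automatic decisions, which is the cost/quality trade-off named in Step 4.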
18
Hierarchy Management
The term hierarchy is used only in the sense of a simple hierarchy: one and only one root, and only one parent for each node within one hierarchy
Typically one hierarchy is selected as the Master and used as a foundation (e.g. D&B or custom)
There is a notion of source precedence / tree of truth
High performance match to build the hierarchy: 40MM records in 15 minutes
One original system record (member) or single version of truth record (entity) can belong to multiple hierarchies (e.g. corporate for D&B and geography with territories, regions, etc.)
A data steward can edit the hierarchy manually (e.g. if a merger is known)
When the merger update later arrives from the Master source, the data steward can reconcile the source merge with the node previously created manually
Hierarchy query and navigation use methods that navigate to a node, to the node’s immediate children and the whole sub-tree below it, or from the node up and across (to be checked)
The product can export a hierarchy (e.g. to build a DW dimension)
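A simple hierarchy in the sense used here (one root, one parent per node) and its basic navigation can be sketched as below. The class and method names are hypothetical and are not the product's API; the node IDs follow the D&B reference example used later in the deck.

```python
class Hierarchy:
    """Simple hierarchy: exactly one root and at most one parent per node."""

    def __init__(self, root):
        self.root = root
        self.parent = {root: None}
        self.children = {root: []}

    def add(self, node, parent):
        # Enforce the single-parent rule within this hierarchy.
        if node in self.parent:
            raise ValueError(f"{node!r} already belongs to this hierarchy")
        if parent not in self.parent:
            raise KeyError(f"unknown parent {parent!r}")
        self.parent[node] = parent
        self.children[node] = []
        self.children[parent].append(node)

    def subtree(self, node):
        """All descendants of a node, in depth-first (pre-order) order."""
        out = []
        stack = list(reversed(self.children[node]))
        while stack:
            current = stack.pop()
            out.append(current)
            stack.extend(reversed(self.children[current]))
        return out

    def ancestors(self, node):
        """Navigate up from a node to the root."""
        out = []
        current = self.parent[node]
        while current is not None:
            out.append(current)
            current = self.parent[current]
        return out


# Node 1 is the root; 2, 3, 4, 5 report to 1; 6 reports to 2.
h = Hierarchy(1)
for node, parent in [(2, 1), (3, 1), (4, 1), (5, 1), (6, 2)]:
    h.add(node, parent)
print(h.children[1], h.subtree(1), h.ancestors(6))
```

A record that belongs to several hierarchies (for example corporate and geographic) would simply appear as a node in several `Hierarchy` instances.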
19
Understand & Visualize Customer Relationships
Establish business & consumer hierarchies
Resolve logical master from multiple internal or third party source hierarchies
Rule based hierarchy & relationship creation and management
Maintain individual & organizational hierarchies through web application to support active data stewardship
Hierarchy: Management & Services
Hierarchy Source Data
Customer Data
20
Initial Hierarchy Harmonization:Original State of Customer Records
ID | Source | Bill To | Name | Address | Phone
10 | SAP | | Harley-Davidson Inc | 3700 W Juneau Ave, Milwaukee, WI 53208-2865 | 4143424680
20 | SAP | | Harley-Davidson Motor Co | 3700 W Juneau Ave, Milwaukee, WI 53208-2865 | 4145353500
40 | SAP | 20 | Andy’s Harley-Davidson | Business Highway 81 N, Grand Forks, ND 58203 | 7017756098
60 | SAP | 20 | Shumate Harley Davidson | 6815 E TRENT AVE, Spokane Valley, WA 99212-1252 | 5099286811
30 | .COM | | Buell Motorcycle Co | 2815 BUELL DR, East Troy, WI 53120-1366 | 2626422020
80 | SAP | 30 | Buell Motorcycles | 8272 Gateway Blvd E, El Paso, TX 79907-1511 | 9155925804
Customer’s Organization Records are Fragmented
Utility of hierarchies housed in SAP & other apps is inconsistent
Disclaimer: (Example Data Only)
[Figure: record nodes 10, 20, 30, 40, 50, 60, 70, 80, 90. Legend: Existing (SAP) Relationships, Shipping Location, Pricing]
21
Initial Hierarchy Harmonization: DNB Reference Source
ID | Source | Parent | Name | Address | Phone
1 | DNB | | Harley-Davidson Inc | 3700 W Juneau Ave, Milwaukee, WI 532082865 | 4143424680
2 | DNB | 1 | Harley-Davidson Financial Services, Inc | 150 S Wacker Dr, Chicago, IL 606064103 | 3123689501
6 | DNB | 2 | Harley-Davidson Credit Corp | 4150 Technology Way, Carson City, NV 897062009 | 7758863393
3 | DNB | 1 | Buell Motorcycle Company | 2815 Buell Dr, East Troy, WI 531201366 | 2626422020
4 | DNB | 1 | Harley-Davidson Europe LTD | 6000 Garsington Rd, Oxford, Oxfordshire OX4 2DQ | 1865719000
5 | DNB | 1 | Harley-Davidson Motor Company Inc | 3700 W Juneau Ave, Milwaukee, WI 532082865 | 4143424680
[Figure: DNB hierarchy nodes: 1; then 2, 3, 4, 5; then 6]
(Records 1 and 5 are repeated in a callout on the slide.)
22
Initial Hierarchy Harmonization: Target State
[Figure: source nodes 10, 20, 30, 40, 50, 60, 70, 80, 90 shown together with DNB nodes 1, 2, 3, 4, 5, 6]
ID | Source | Parent | Name | Address | Phone
1 | DNB | | Harley-Davidson Inc. | 3700 W. Juneau Ave., Milwaukee, WI 532082865 | 4143424680
10 | .COM | | Harley-Davidson Inc. | 3700 W. Juneau Ave., Milwaukee, WI 53208-2865 | 4143424680
ID | Source | Parent | Name | Address | Phone
5 | DNB | 1 | Harley-Davidson Motor Company Inc. | 3700 W. Juneau Ave., Milwaukee, WI 532082865 | 4143424680
20 | SAP | | Harley-Davidson Motor Co. | 3700 W. Juneau Ave., Milwaukee, WI 53208-2865 | 4145353500
ID | Source | Parent | Name | Address | Phone
3 | DNB | 1 | Buell Motorcycle Company | 2815 Buell Dr., East Troy, WI 531201366 | 2626422020
30 | SAP | | Buell Motorcycle Co. | 2815 BUELL DR, East Troy, WI 53120-1366 | 2626422020
23
Resolve & Rationalize Hierarchies for Immediate Impact
Accurate Relationships Will Drive Revenue: DNB members that are not yet Customers
Misaligned Pricing or Territories: Unassigned or incorrect track codes and rep assignments
Incomplete Customer Profiles: Matched & organized records required for accurate analytics
Will be properly included and targeted in marketing campaigns and in sales activity
Improved Customer Satisfaction, Improved Sales Coverage, Improved Sales Operations
Deliver complete customer relationships to Data Warehouse and Marketing Analytics apps
ID – Pricing Code: 20 – KFDRR; 70 – [missing]
Harley Davidson: Global Account Info: Customer Locations / Bus Units: 9; Current Yearly Purchasing: $900K; Added Potential: $300K
[Figure: source nodes 10, 20, 30, 40, 50, 60, 70, 80, 90 and DNB nodes 1 through 6; nodes 20 and 70 and DNB nodes 2, 4, 6 are repeated in callouts]
24
New Account Creation Scenarios
ID | Source | Parent | Name | Address | Phone
A | New | | Andya Harley Davidson | HWY 81 N, Grand Forks, ND 58206 |
B | New | | Buell Motor Cycles | 8272 Gateway Boulevard, El Paso, TX 77907 | 592-5804
C | New | | Schumate Harley Davidson | 7001 E Trent Ave, Spokane, WA 99212 |
[Figure: new account nodes A, B, C]
25
Improve Account Creation Process & Data Quality
Duplicate prevention
Pricing alignment
Territory assignment
ID | Source | Parent | Name | Address | Phone
40 | SAP | 20 | Andy’s Harley-Davidson | Business Highway 81 N., Grand Forks, ND 58203 | 7017756098
A | [new] | | Andya Harley Davidson | HWY 81 N., Grand Forks, ND 58206 |
Duplicate: DO NOT ADD!
ID | Source | Parent | Name | Address | Track Code
6 | SAP | 2 | Buell Motorcycles | 8272 Gateway Blvd. E., El Paso, TX 79907-1511 | N / A
B | [new] | | Buell Motor Cycles | 8272 Gateway Boulevard, El Paso, TX 77907 |
Assign Correct Track Code to New Account: KFFDE
ID | Source | Parent | Name | Address | Territory
90 | .COM | 30 | Shumate Harley Davidson | 6815 E. TRENT AVE., Spokane Valley, WA 99212-1252 | 840123
C | [new] | | Schumate Harley Davidson | 7001 E. Trent Ave., Spokane, WA 99212 |
Assign Correct Territory to New Account: 840123
[Figure: source nodes 10 through 90, DNB nodes 1 through 6, and new nodes A, B, C. Pricing Code: KFFDE; Sales Territory: 840123]
26
Relationship Management
Relationship is a much looser construct than a hierarchy. A relationship can be used to associate people with groups, products with categories, etc.
Relationships support one-to-many and many-to-many associations
Relationships can be symmetric or asymmetric
For each relationship type, its cardinality (one-to-many or many-to-many) is defined along with its symmetry (symmetric or asymmetric)
When a relationship type is defined, the types of records that can be related are defined as well
[Figure: relationship examples: Many-to-Many, One-to-Many, Asymmetric, Symmetric]
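One way to picture a relationship type definition (cardinality, symmetry, and the record types it may connect) is the sketch below. The class names, the example types `category_of` and `sibling_of`, and the record IDs are illustrative assumptions, not a product schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationshipType:
    name: str
    many_to_many: bool  # False means one-to-many (one left record, many right records)
    symmetric: bool     # True means A-to-B implies B-to-A
    left_type: str      # record type allowed on the left side
    right_type: str     # record type allowed on the right side


class RelationshipStore:
    def __init__(self):
        self.links = set()  # (relationship name, left id, right id)

    def relate(self, rtype, left, right):
        """left and right are (record_type, record_id) pairs."""
        if (left[0], right[0]) != (rtype.left_type, rtype.right_type):
            raise TypeError(f"{rtype.name} cannot relate {left[0]} to {right[0]}")
        if not rtype.many_to_many:
            # One-to-many: a right-hand record may have only one left-hand record.
            for name, l_id, r_id in self.links:
                if name == rtype.name and r_id == right[1] and l_id != left[1]:
                    raise ValueError(f"{right[1]} is already related via {rtype.name}")
        self.links.add((rtype.name, left[1], right[1]))
        if rtype.symmetric:
            self.links.add((rtype.name, right[1], left[1]))

    def related(self, rtype, record_id):
        return sorted(r for name, l, r in self.links
                      if name == rtype.name and l == record_id)


CATEGORY = RelationshipType("category_of", many_to_many=False, symmetric=False,
                            left_type="category", right_type="product")
SIBLING = RelationshipType("sibling_of", many_to_many=True, symmetric=True,
                           left_type="person", right_type="person")

store = RelationshipStore()
store.relate(SIBLING, ("person", "joe"), ("person", "mary"))
store.relate(CATEGORY, ("category", "bikes"), ("product", "p1"))
print(store.related(SIBLING, "mary"), store.related(CATEGORY, "bikes"))
```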
28
Approaches to Information Quality
“Upstream”, at the point of entry of customer information
• Better validation
• Change in business process is likely required
• Change in applications, workflows and data flows
“Downstream”
• Focus on ETL
• Includes data stewardship
• Less invasive – does not require changes in business processes
• Less effective – always working against the flow of dirty data
A combination of “Upstream” and “Downstream” approaches is required for best results
29
Moving Resolution of IQ Issues Closer to Point of Entry: Account Opening & Client On-boarding
[Figure: Account-centric vs. Customer-centric account opening. Both panels show Product 1 (Accounts 1 and 2), Product 2 (Account 3), and Product 3 (Accounts 4 and 5), each account with Account Attributes and Customer Attributes. Other labels: Customer Information File, Householding System, Data Entry, Data Enrichment Vendor, Data Hub]
30
Enterprise Data Stewardship Framework
Data Governance: Policies, standards, processes, roles, responsibilities, metrics, & controls
Information Technology: Improves data entry validation; configures data quality task generation & reports
Data Stewards: Performs ongoing data quality task resolution
Data Quality Improvement Loop
Technology Supports Customizable Data Quality Task Resolution Queues: Validity of Identifiers; Updates overlaying identity; Link; Merge; Conflicts between match & hierarchy / relationships associations; Summary reports on data quality metrics
Systems
Service Oriented Architecture and Data Services: Strengths and Weaknesses
36
Service Oriented Architecture
Software design and implementation architecture characterized by the following:
Logical View – abstraction of loosely coupled, reusable business needs and functions
Message Orientation – exchange between provider and requester
Description Orientation – machine-processable metadata to support interoperability
Granularity – combination of coarse-grained and fine-grained
Network Orientation – typically used over network
Platform Neutral
Software architecture in which software components are exposed as services on the network, which can be reused as necessary for different applications
When implemented with web services, offers a standard foundation for functionality reuse and data access within a federated enterprise and services provided externally
What is in common between SOA initiatives and EDM or MDM? Data Services
37
Data Services: Format Agnostic & Source Agnostic Data Management
End users (Executive, Manager, Analyst) interface with the system by requesting a data-source-agnostic Business/Data Service
Example business services: Find Client, Report Premium Revenue, Create New Account, Create New Report
Data Services Metadata translates the coarse-grained business service call into an orchestrated set of data-location-aware service calls. The Metadata processes the parameters of the request, as well as the security access eligibility and data visibility parameters of the requestor.
Data services: Transform Data, CRUD Record, Capture Exception, Calculate
Get Best Source Data and Process It
Execute and Return to Requestor
Note: Metadata insulates user applications and business services from data sources. Thus the data sources can be changed or replaced seamlessly with no changes in user interfaces and user experience
[Figure: flow numbered 1 through 4 connecting end users, Data Services Metadata, Data Sources, and List Reports]
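The role of the data services metadata can be illustrated with a small sketch: the caller names a business service, and the metadata decides which source-aware calls run, who may call, and which fields the caller may see. Every name here (`SERVICE_METADATA`, `crm_lookup`, `billing_lookup`, the roles, and the field names) is a hypothetical stand-in, not part of any product.

```python
# Hypothetical source-aware calls; real ones would query specific systems.
def crm_lookup(params):
    return {"client": params["name"], "segment": "retail", "tax_id": "XX-XXXXXXX"}


def billing_lookup(params):
    return {"premium_revenue": 1200}


# Metadata: for each coarse-grained business service, which source calls to
# orchestrate, which roles may call it, and which fields to hide per role.
SERVICE_METADATA = {
    "FindClient": {
        "allowed_roles": {"executive", "manager", "analyst"},
        "source_calls": [crm_lookup],
        "hidden_fields": {"analyst": {"tax_id"}},
    },
    "ReportPremiumRevenue": {
        "allowed_roles": {"executive", "manager"},
        "source_calls": [crm_lookup, billing_lookup],
        "hidden_fields": {},
    },
}


def call_service(name, role, params):
    """Resolve a source-agnostic request through the metadata."""
    meta = SERVICE_METADATA[name]
    if role not in meta["allowed_roles"]:
        raise PermissionError(f"role {role!r} may not call {name}")
    result = {}
    for source_call in meta["source_calls"]:  # orchestrated source-aware calls
        result.update(source_call(params))
    hidden = meta["hidden_fields"].get(role, set())  # data visibility rules
    return {k: v for k, v in result.items() if k not in hidden}


print(call_service("FindClient", "analyst", {"name": "ABC Inc."}))
```

Because callers only know the service name, replacing `crm_lookup` with a call to a different system changes the metadata entry and nothing on the consumer side.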
38
Data Hub: SOA Architecture Viewpoint
Service Consumers – Business Applications
External (Exposed) Services Layer: Individual Identification & Recognition; Privacy Preference Capture & Notification; Identification of Associations, Roles & Relationships; Customer Grouping Management; Compliance Notification & Reporting; Customer Information Maintenance; Customer Insight Reporting
Internal Services Layer, Key Services: Third-Party Data Interface Service; Data Synchronization & Queue Management; Data Archival & Versioning; Coordination & Orchestration; Visibility, Entitlements, Privacy & Security; Rules / Workflow Administration; Metadata Management; Transaction Logging & Auditing; Content Management & Caching; Event / Notification Management; Error Management
CDI Customer Hub
Data Provider Service Interface: Legacy Data Store, Other Data Store
39
Reference Architecture (ILLUSTRATIVE)
Business Processes Layer: Process Mgmt.; Orchestration; Contact Mgmt.; Campaign Mgmt.; Relationship Mgmt.; Document Mgmt.
Hub Data Management Layer: Party Mgmt.; Client/Suspect Identification; Profile Mgmt.; Grouping Mgmt.; Enrichment & Sustaining
Hub Data Rules Layer: Rules Capture & Mgmt.; Synchronization Rules; Identity Matching; Aggregation & Split Rules; Visibility Rules; Transformation Rules
Hub Data Quality Layer: Data Quality Mgmt.; GUID Mgmt.; Address Standardization; Reporting; Transformation & Lineage
Hub Systems Services Layer: Transaction & State Mgmt.; Security Visibility; Services Orchestration; Legacy Connectivity; Persistence Synchronization
Data Sources
40
Pros & Cons of SOA: When to Use or Not to Use SOA
Pros
Reduced data redundancy
Business, data governance and data stewards can define SLA via services
Data services provide a level of abstraction – no need to work at the data and data model levels
Standardized interaction within the enterprise and with external and vendor-provided services
Increased development productivity and agility to support evolving requirements
Cons
Performance problems if multiple pieces of information are to be joined
If implemented with web services, solutions do not support transactional integrity for synchronous processes; compensating transactions are required
SOA initiatives don’t meet expectations if not supported by a data strategy
Use of data services requires strong governance and a new culture for the data governors, data stewards, and testers
When implemented properly, SOA and data services provide significant benefits for MDM and EDM
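The compensating transactions mentioned among the cons can be sketched as follows: each service call is paired with an undo action, and when a later call fails, the completed calls are undone in reverse order. The helper name and the in-memory `hub` store are assumptions made for the example; real compensations would call the corresponding services.

```python
def run_with_compensation(steps):
    """Run (action, compensation) pairs; on failure, undo completed steps in reverse."""
    completed = []
    try:
        for action, compensation in steps:
            action()
            completed.append(compensation)
    except Exception:
        for compensation in reversed(completed):
            compensation()
        return False
    return True


# Illustrative in-memory stand-in for a service that holds a customer record.
hub = {}


def create_in_hub():
    hub["42"] = "ABC Inc."


def remove_from_hub():
    hub.pop("42", None)


def create_in_crm():
    # Simulates a web service call that fails after the hub update succeeded.
    raise RuntimeError("CRM service unavailable")


ok = run_with_compensation([
    (create_in_hub, remove_from_hub),
    (create_in_crm, lambda: None),
])
print(ok, hub)
```

Unlike a database rollback, the intermediate state (the record in the hub) is briefly visible to other consumers, which is part of why the slide lists this as a cost of web-service-based SOA.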
41
Testing Data Hub (Not Just Data But Also Services)
Testing SOAP messages
Testing WSDL files and using them for test plan generation
Web service consumer and producer emulation
Testing the publish, find, and bind capabilities of a SOA
Testing the asynchronous capabilities of Web services
Testing dynamic run-time capabilities of Web services
Web services orchestration testing
Web service versioning testing
[Figure: Data Hub]
43
Importance of Information Management Methodology
Implementation of enterprise information management projects (DW, ODS, MDM, CDI, etc.) requires a well-structured methodology
Methodologies used for data intensive projects are different from traditional application development methodologies
44
Methodology – Need and Overview
MIKE2.0 (Method for an Integrated Knowledge Environment) is an open source methodology for Enterprise Information Management (www.openmethodology.org)
Developed by BearingPoint
Available to the open source community since December 2006
Transition to Open Source “Creative Commons” license completed in May 2007
Over 2000 online pages and growing
Contains Phases, Activities and Tasks
Open source MIKE2.0 allows the global community to use this methodology right now for their Information Development initiatives
Organizations and individuals can sign up to become contributing members of MIKE2.0
46
MIKE2.0 SAFE Architecture
SAFE (Strategic Architecture for the Federated Enterprise) provides the technology solution framework for MIKE2.0.
48
The Three Dimensional Socialization Roadmap
Dimensions: Level of Involvement; Stakeholders; Lifecycle Phases/Releases
Level of Involvement:
Ownership: Demonstrated commitment to the change and accountability
Buy-in: Agreement with the concepts and ideas & expressed support
Understanding: Internalizing the concepts and ideas and grasping the implications of the change
Awareness: Becoming cognizant and developing a sense of appreciation for the change
Stakeholders: Security, Front Office, Back Office, Senior Management, Technology & Infrastructure, Legacy Systems
Lifecycle Phases/Releases: Planning, Development, Testing, Training
Helps Program Managers Build Communications Plan
49
Typical Implementation Work Streams: Organizing for Success & Breaking the Problem Down
It is much easier to discuss, define and plan MDM-CDI when the problem is broken down into more manageable areas and specialty domains
Master Entity Identification
Entity groups & relationships
Data governance, standards, quality, & compliance
Data architecture
Metadata management & administrative applications
Initial data load
Inbound data processing (batch & real-time)
Outbound data processing (batch & real-time)
Changes to legacy systems & applications
Visibility & security
Exception processing
Infrastructure
Data Hub applications
Reporting requirements of a stratified user community
Testing
Release management
Deployment
Training
Helps Program Managers Build Project Plan
50
Complexity vs. Manageability
Helps Program Managers Define Phases & Releases
[Figure: Manageability vs. Complexity, with labels Critical Point and Plan a Release Here]
51
Fastest ROI
[Figure: Business Value & ROI (Low to High) vs. Time in Months (6, 12, 18, 24). Labels: CDI Start, Initial Phase, Initial ROI, Potential End State. Stages: Resolve… Relevant Relationships in the Data; Synchronize… Data, Systems, Processes & People; Master… Your Data. Two paths: Start w/ Resolve, Start w/ Master]
52
Implementation Continuum
Stages: Cross Reference Management; Customer Data Access; Customer Data Synchronization; Customer Transaction Management; Customer Process Management
Create linkages amongst all records
Prepare data for new systems
Establish and maintain a trusted source for analysis
Provide people with on demand search
Provide bi-directional update between sources of customer information through messaging, APIs and other integration methods
Transactional applications built on top of customer definition across sources
Transference of record ownership – Hub owned and maintained
Manage business process associated with customer data/transaction management
Axes: Batch/Analytical to Real-time/Operational; Pure Registry to Mastered
53
Develop Repeatable Initiative On-boarding Processes & Templates
Program Initiation: Program Planning & Definition
Business Case & Value Proposition
Business Requirements
Target State Solution
Detailed Roadmap
Data Governance, Standards, Data Quality
Architectural Principles
Program Initiated
Enterprise Information Program: DW DEVELOPMENT; CDI-MDM; DATA PROFILING & DATA QUALITY; DATA SERVICES FRAMEWORK; DATA GOVERNANCE & STEWARDSHIP; DATA MODELING
On-boarding Initiative 1, On-boarding Initiative 2, On-boarding Initiative 3
Global Geography
54
Develop Repeatable MDM Systems on-boarding Processes and Templates
In the first year and first implementation phase, the number of legacy systems integrated in the scope of MDM-CDI is limited (typically 2-3)
How can on-boarding of new systems be accelerated in subsequent phases, given that it is not unusual for 20-50 systems to be in scope of MDM-CDI integration?
A well-defined set of system on-boarding standards and procedures determines common rules that each legacy system should comply with to be integrated into the evolving MDM-CDI solution
Enables a repeatable on-boarding process and sustainable, accelerated solution growth in terms of the number of systems and LOBs
Preserves integrity and consistency of the MDM-CDI solution
Improved data governance
Enables highly sustainable pace of
55
Two Schools of Thought on Hub Data Model
Hub with Out-of-the-box Data Model
Pros
Seems attractive to have the “right” data model out-of-the-box
The product has some pre-built coarse-grain business transactions
Cons
How flexible is it to support on-going changes?
Overhead of having multiple entities and attributes that are never used by your specific solution
Data Model Agnostic Hub Product
Pros
Flexible to accommodate any data model and its changes
Can generate fine-grain services on top of any data model
Cons
Development work required to build coarse-grain services to support composite transactions
Possible performance and maintenance impact due to additional metadata lookups
56
Hub Implementation: Buy vs. Build vs. Data Enrichment Partner
Traditional “buy or build” question is typically resolved in favor of “buy”
An additional consideration is the use of an External Data Enrichment vendor
Can we outsource the primary function of the CDI hub and do customer matching externally?
Use of an External Data Enrichment partner has its own pros and cons
Pros
Higher match accuracy based on the Knowledge Base (US NCOA and other libraries)
Ability to recognize new customers and prospects
Additional data from the Knowledge Base – “data enrichment”
Cons
Need to share customer data with external vendor
Capabilities and Knowledge Base quality depend on the country – domestic better than international
Additional cost
57
Focus on Data Mapping
Data Mapping is an activity that “maps” the legacy system attributes to the new customer-centric model and vice versa
This activity is performed by business analysts
The produced data maps are used by ETL and EAI developers
The mapping process is time-consuming, can cause numerous errors and can be on the critical path of development
A data mapping vendor product can help accelerate delivery:
Drag-and-drop interface
Open source mapping metadata
Ability to integrate the mapping metadata with ETL, EAI and EII tools and share the metadata rules
Ability to reverse the transformation rules when possible
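Keeping the data map as metadata, rather than burying it in ETL code, is what makes it shareable and sometimes reversible. The sketch below is one possible shape for such metadata; the legacy attribute names (`CUST_NM`, `SEX_CD`), the hub attribute names, and the gender code table are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical code table; its inverse makes this mapping reversible.
GENDER = {"1": "M", "2": "F"}
GENDER_REVERSE = {v: k for k, v in GENDER.items()}


@dataclass
class Mapping:
    source: str                                       # legacy system attribute
    target: str                                       # customer-centric model attribute
    forward: Callable[[str], str]                     # legacy -> hub transformation
    reverse: Optional[Callable[[str], str]] = None    # hub -> legacy, when possible


MAPPINGS = [
    # Trimming whitespace loses information, so no reverse rule is declared.
    Mapping("CUST_NM", "party_name", forward=str.strip),
    # A one-to-one code translation can be reversed.
    Mapping("SEX_CD", "gender", forward=GENDER.get, reverse=GENDER_REVERSE.get),
]


def to_hub(legacy_record):
    return {m.target: m.forward(legacy_record[m.source])
            for m in MAPPINGS if m.source in legacy_record}


def to_legacy(hub_record):
    # Only mappings with a declared reverse rule can be applied backwards.
    return {m.source: m.reverse(hub_record[m.target])
            for m in MAPPINGS if m.reverse and m.target in hub_record}


hub_record = to_hub({"CUST_NM": "  Jones  ", "SEX_CD": "1"})
print(hub_record, to_legacy(hub_record))
```

The same `MAPPINGS` list could be exported to ETL, EAI, or EII tooling, which is the metadata-sharing point made above.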
58
Creation and Protection of Test Data
Sensitive Customer data must be anonymized (obfuscated, cloaked) to disguise it from unauthorized personnel in test and development environments.
Some anonymization techniques are as follows:
Masking Data
Substitution
Shuffling records
Number Variance
Gibberish Generation
Encryption / Decryption
Key challenges in using data anonymization techniques include:
• Ability to preserve logic potentially embedded in data to ensure that application logic continues to function
• Need to provide consistent transformation outcome for the same data
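Both challenges can be addressed, at least partly, with keyed deterministic transformations. The sketch below assumes an HMAC-based approach, which the slides do not prescribe: the same input always yields the same output, and digit masking keeps length and punctuation so format-dependent application logic still works. The key and the substitute name list are placeholders.

```python
import hashlib
import hmac

# Placeholder key for the example; a real key would be managed outside the code.
SECRET = b"test-environment-key"
FAKE_NAMES = ["Alex Reed", "Sam Cole", "Pat Lane", "Kim Hart"]


def _digest(value):
    """Keyed hash: deterministic for a given key and input value."""
    return hmac.new(SECRET, value.encode("utf-8"), hashlib.sha256).digest()


def substitute_name(name):
    """Substitution: always replaces the same real name with the same fake name."""
    return FAKE_NAMES[_digest(name)[0] % len(FAKE_NAMES)]


def mask_digits(value):
    """Masking: replace each digit, keep length, separators, and letters."""
    digest = _digest(value)
    out = []
    used = 0
    for ch in value:
        if ch.isdigit():
            out.append(str(digest[used % len(digest)] % 10))
            used += 1
        else:
            out.append(ch)
    return "".join(out)


print(substitute_name("Joe Smith"), mask_digits("480-473-5620"))
```

Because the mapping is deterministic, a customer who appears in several source extracts is anonymized to the same values in each, so matching and linking can still be tested on the disguised data.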
59
Some Key Reasons for Project Failure to Avoid at Project Kick-off
Lack of executive support and budgetary commitment
Lack of cooperation and/or coordination between business and technology
Lack of consuming applications – “if we build, they will come…”
Lack of end-user adoption
Underestimation of legacy impact
Insufficient socialization throughout the enterprise to include all stakeholders at the right level
Underestimation of the need for layered architecture provided by SOA
Gaps in data governance, stewardship, and information quality strategy
Miscalculated staffing needs
60
What are the Next Disruptive Things in MDM and EDM?
Match and link evolution: From entities to relationships
Integration of MDM with Business Rules Engines and Work Flows
Data Stewardship Framework
Metadata Integration
Externalization of data visibility and security