Cloud Data Management System (CDMS)
Wiqar Chaudry, Solutions Engineer – Senior Advisor
CDMS Overview
The OpenStack cloud data management system features a canonical data modeling framework designed to broker context-sensitive data to distributed applications. Key features include:
• Tunable consistency and availability guarantees based on transaction types
• Demand-based replication of canonical data to various data management systems (Cassandra, Hadoop, Mongo, MySQL, etc.)
• Dynamic scalability for cloud-scale applications
CDMS - Components
• CDMS manager: A network-level resource manager that monitors all physical and virtual compute resources participating in a CDMS. This component is responsible for federating all requests to and from CDMS agents.
• CDMS agent: Compute-level agents manage all data and logic processing on physical or virtual hosts.
• Supported databases and data management systems: Columnar (read optimized), Relational (write optimized), Document (dynamically structured), <Variable>
• Canonical data and canonical persistence (block storage): Canonical data and persistence maintain a golden copy of all data in a consistent state.
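The manager/agent split can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names `Manager` and `Agent`, the request dictionaries, and the routing policy (reads go to a columnar store, writes to a relational store, least-loaded agent wins) are all hypothetical and are not taken from the CDMS itself.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical compute-level agent bound to one kind of data store."""
    host: str
    store_kind: str  # e.g. "columnar", "relational", "document"
    handled: list = field(default_factory=list)

    def handle(self, request: dict) -> str:
        # A real agent would run data and logic processing on its host.
        self.handled.append(request)
        return f"{self.host} handled {request['op']} on {self.store_kind}"


class Manager:
    """Hypothetical manager that tracks agents and federates requests."""

    def __init__(self):
        self.agents = {}  # store kind -> list of registered agents

    def register(self, agent: Agent) -> None:
        self.agents.setdefault(agent.store_kind, []).append(agent)

    def route(self, request: dict) -> str:
        # Assumed policy: reads go to columnar, writes go to relational,
        # unless the request names a store kind explicitly.
        kind = request.get("store_kind") or (
            "columnar" if request["op"] == "read" else "relational"
        )
        candidates = self.agents.get(kind)
        if not candidates:
            raise LookupError(f"no agent registered for {kind!r}")
        # Pick the agent that has handled the fewest requests so far.
        agent = min(candidates, key=lambda a: len(a.handled))
        return agent.handle(request)


manager = Manager()
manager.register(Agent("host-a", "columnar"))
manager.register(Agent("host-b", "relational"))
read_result = manager.route({"op": "read"})
write_result = manager.route({"op": "write"})
```

Keeping the routing decision in the manager means agents only need to know about their own host and store.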
CDMS: Fundamental Building Blocks
Metadata objects: Data Schemas, Data Connections, Mappings, Canonical Schemas, Attributes, Data Rule, Logic Rule, Workflow Rule
• A logically grouped collection of the above metadata objects represents a canonical mapping object.
• The persisted state of these objects enables static and dynamic analysis of the CDMS environment.
• Functional metadata enables flexibility and reusability.
Metadata: Sources and Targets
• Connections are reusable artifacts that capture the information required to connect to a source or target. Users can store this connection information and reuse it when extracting or loading data.
• Schemas define the physical layout, format, and data types of data within a source/target object. Schema artifacts are also stored and can be reused.
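As a rough illustration of storing connections and schemas as reusable artifacts, the sketch below keeps both in a small in-memory catalog. The field names, the `ArtifactStore` class, and the example host are assumptions made for this example, not part of the CDMS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """Reusable connection artifact (field names are assumptions)."""
    name: str
    kind: str      # e.g. "mysql", "cassandra", "file"
    location: str  # host, URL, or path


@dataclass(frozen=True)
class Schema:
    """Reusable schema artifact: ordered (field name, data type) pairs."""
    name: str
    fields: tuple


class ArtifactStore:
    """Hypothetical in-memory catalog so artifacts can be saved and reused."""

    def __init__(self):
        self._items = {}

    def save(self, artifact) -> None:
        # Artifacts are keyed by their class name plus their own name.
        self._items[(type(artifact).__name__, artifact.name)] = artifact

    def get(self, kind: str, name: str):
        return self._items[(kind, name)]


store = ArtifactStore()
store.save(Connection("crm-db", "mysql", "db.example.internal:3306"))
store.save(Schema("customer", (("fname", "string"), ("lname", "string"), ("sms", "string"))))

# The same stored connection is reused for an extract and for a load.
extract_conn = store.get("Connection", "crm-db")
load_conn = store.get("Connection", "crm-db")
```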
Metadata: Canonical Attribute Artifacts
Artifacts: Schema Mapping Sets, Data Table Schema, Attribute Schema
• Schema mapping sets define the relationship between physical source schemas and internal canonical data objects. Schema mapping sets can have a one-to-many relationship between physical and logical schemas.
• A logical data table is a collection of one or more attributes that defines a data table within the CDMS.
• An attribute schema is the collection of metadata required to define a managed attribute for use within a data table schema.
*Attributes, for all intents and purposes, are simple key-value pairs.
Metadata: Application Logic
• A Logic Rule is a reusable object that contains transformation logic.
• A Workflow Rule contains logic that replicates, moves, or makes data available within a requested context.
• A Data Rule is a reusable object that contains both logic and workflow rules.
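One way to picture how the three rule types compose is a Data Rule that bundles transformation logic with a workflow step. The sketch below is a minimal illustration; the method names (`transform`, `deliver`, `apply`), the target names, and the `sinks` dictionary standing in for downstream stores are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LogicRule:
    """Reusable transformation logic applied to one record."""
    name: str
    transform: Callable[[dict], dict]


@dataclass
class WorkflowRule:
    """Makes a record available to a set of targets (names are assumed)."""
    name: str
    targets: list

    def deliver(self, record: dict, sinks: dict) -> None:
        for target in self.targets:
            sinks.setdefault(target, []).append(record)


@dataclass
class DataRule:
    """Bundles logic rules with a workflow rule."""
    name: str
    logic: list
    workflow: WorkflowRule

    def apply(self, record: dict, sinks: dict) -> dict:
        # Run each transformation in order, then hand off to the workflow.
        for rule in self.logic:
            record = rule.transform(record)
        self.workflow.deliver(record, sinks)
        return record


upper_last = LogicRule("upper-last", lambda r: {**r, "last": r["last"].upper()})
replicate = WorkflowRule("replicate", ["columnar", "document"])
rule = DataRule("customer-sync", [upper_last], replicate)

sinks = {}
result = rule.apply({"first": "Ada", "last": "Lovelace"}, sinks)
```

Because each rule is a separate object, the same `LogicRule` can be attached to several Data Rules.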
Attribute Schema Associations
Data Table *,* Attribute Schema
Attribute Schema fields:
• System Name: Created and managed by the system.
• Attribute Name: User-defined name that uniquely identifies an attribute within a folder.
• Display Name: User-defined name that uniquely identifies an attribute within a data table.
• CDMS Type: A collection of validation criteria that can be applied to attributes as a template.
CDMS Type fields: Type Name, Primitive Data Type, Field Formatting (email, SSN, phone), Constraints (min/max or allowed values)
• Compound Type: A collection of attributes and CDMS Types that defines a complete or partial record as a single compound type.
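A CDMS Type, as described here, behaves like a validation template combining a primitive type, a field format, and constraints. The sketch below is one hedged way to express that idea; the `CdmsType` class, its field names, and the simplified email pattern are assumptions for illustration only.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class CdmsType:
    """Validation template: primitive type, optional format, optional constraints."""
    type_name: str
    primitive: type
    pattern: Optional[str] = None   # field formatting as a regex (str values only)
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    allowed: Optional[frozenset] = None

    def validate(self, value) -> bool:
        # Check the primitive type first so later checks can rely on it.
        if not isinstance(value, self.primitive):
            return False
        if self.pattern is not None and not re.fullmatch(self.pattern, value):
            return False
        if self.minimum is not None and value < self.minimum:
            return False
        if self.maximum is not None and value > self.maximum:
            return False
        if self.allowed is not None and value not in self.allowed:
            return False
        return True


# Illustrative pattern only; real email validation is more involved.
email = CdmsType("email", str, pattern=r"[^@\s]+@[^@\s]+\.[^@\s]+")
age = CdmsType("age", int, minimum=0, maximum=150)
status = CdmsType("status", str, allowed=frozenset({"active", "closed"}))
```

The same template instance can then be applied to any attribute that shares the type, for example `email.validate(record["Home Email"])`.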
Data Rule Associations
[Diagram; recoverable labels only]
• Rule: Parameter Name, Rule Logic, O-CAP Type
• Rule Association: Parameter Name, Display Name, Table/List Name
• Data Input, Data Output, Constraint, CDMS Type, Data Type
• Canonical Data, Correlation Rules, External Data
• Name, Attributes, O-CAP Type, Display Name, System Name
• Data Table List, containing Data Tables
CDMS Canonical Data
• Sources and targets: Files, Databases, Cloud Applications
• Canonical Map
• Master catalog of all metadata objects
• Canonical data foundational structures: Customer Data Table, Transaction Data Table, Dictionary Data Table, Transaction Detail Data Table
Object types:
• Object: Top-level objects (accounts, customers, etc.)
• Object transactions: Aggregate summaries
• Object transaction details: Object details (logs)
• Object references: Miscellaneous reference and relationship data
*All objects contain one or more attributes.
Canonical Object Map Attribute Associations
[Diagram; recoverable labels only]
• Sources: Applications (files, web forms, etc.), Databases (MySQL, Cassandra, etc.)
• Source Schema Attributes (key-value pairs): fname, lname, sms; first, last, mobile
• Mapped attributes: First Name, Last Name, Home Email, Work Email, SMS
• Attribute fields: System Name, Display Name, Name, O-CAP Type
• Processing: CDC/MDM Logic, Constraint Logic, Validation Logic
• Outputs: Data Tables (ID, First Name, Last Name), Reject Records
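The attribute labels on this slide (fname/lname/sms and first/last/mobile on the source side; First Name, Last Name, SMS on the canonical side) suggest a mapping along the following lines. This is a sketch under assumptions: the source names "crm" and "webform", the pairing of source keys to canonical attributes, and the handling of unmapped keys as rejects are illustrative guesses rather than documented behavior.

```python
# Hypothetical mapping set: (source schema, source attribute) -> canonical attribute.
MAPPING_SET = {
    ("crm", "fname"): "First Name",
    ("crm", "lname"): "Last Name",
    ("crm", "sms"): "SMS",
    ("webform", "first"): "First Name",
    ("webform", "last"): "Last Name",
    ("webform", "mobile"): "SMS",
}


def to_canonical(source: str, record: dict):
    """Split a source record into canonical attributes and rejected keys."""
    canonical, rejected = {}, {}
    for key, value in record.items():
        target = MAPPING_SET.get((source, key))
        if target is None:
            rejected[key] = value  # no mapping: treat as a reject
        else:
            canonical[target] = value
    return canonical, rejected


crm_row, crm_rejects = to_canonical(
    "crm", {"fname": "Ada", "lname": "Lovelace", "sms": "555-0100"}
)
web_row, web_rejects = to_canonical(
    "webform", {"first": "Ada", "last": "Lovelace", "fax": "n/a"}
)
```

Two differently named source schemas end up with the same canonical attribute names, which is the point of the canonical map.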
Canonical Data Model
K Key: auto-generated, managed by the system, uniquely identifies a logical data model.
O Object: a logical collection of one or more attributes.
T Transaction Object: manages aggregations and summaries of one or more objects.
D Transaction Details: manages details that might roll up into a transaction object.
R Reference Object: manages logical references and relationship data between the other object types within the system.
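The K/O/T/D/R legend can be read as a small type system. The sketch below models it in Python as an illustration only: the class names, the use of a UUID as the system-generated key, and the example objects are assumptions, and the check that every object has at least one attribute follows the note on the canonical data slide.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum


class ObjectKind(Enum):
    OBJECT = "O"
    TRANSACTION = "T"
    TRANSACTION_DETAIL = "D"
    REFERENCE = "R"


@dataclass
class CanonicalObject:
    kind: ObjectKind
    name: str
    attributes: dict

    def __post_init__(self):
        # Every object carries at least one attribute.
        if not self.attributes:
            raise ValueError("an object needs at least one attribute")


@dataclass
class CanonicalModel:
    objects: list = field(default_factory=list)
    # K: system-generated key for the model; uuid4 is an assumed stand-in.
    key: str = field(default_factory=lambda: uuid.uuid4().hex)

    def add(self, obj: CanonicalObject) -> None:
        self.objects.append(obj)

    def of_kind(self, kind: ObjectKind) -> list:
        return [o for o in self.objects if o.kind is kind]


model = CanonicalModel()
model.add(CanonicalObject(ObjectKind.OBJECT, "customer", {"First Name": "Ada"}))
model.add(CanonicalObject(ObjectKind.TRANSACTION, "orders", {"order_count": 2}))
model.add(CanonicalObject(ObjectKind.TRANSACTION_DETAIL, "order_lines", {"amount": 10}))
model.add(CanonicalObject(ObjectKind.REFERENCE, "customer_orders", {"from": "customer", "to": "orders"}))
```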
Why CDMS?
• Single canonical representation of data across public and private cloud environments
• Context-sensitive bidirectional replication of data
• Object- and collection-level consistency tuning
• Enables collaborative data management strategies across enterprises
• High availability and elastic scalability