Cloud Data Management System (CDMS) - Rackspace


Cloud Data Management System (CDMS)

Wiqar Chaudry

Solutions Engineer – Senior Advisor


CDMS

Overview

The OpenStack cloud data management system features a canonical data modeling framework designed to broker context-sensitive data to distributed applications. Key features include:

§ Tunable consistency and availability guarantees based on transaction types

§ Demand-based replication of canonical data to various data management systems (Cassandra, Hadoop, Mongo, MySQL, etc.)

§ Dynamic scalability for cloud-scale applications


CDMS - Components

• CDMS manager: a network-level resource manager that monitors all physical and virtual compute resources participating in a CDMS. This component is responsible for federating all requests to and from CDMS agents.

• CDMS agent: compute-level agents manage all data and logic processing on physical or virtual hosts.

• Supported databases and data management systems: Columnar (read optimized), Relational (write optimized), Document (dynamically structured), <Variable>.

• Canonical data and canonical persistence (block storage): maintain a golden copy of all data in a consistent state.


CDMS: Fundamental Building Blocks

Metadata objects: Data Schemas, Data Connections, Mappings, Canonical Schemas, Attributes, Data Rule, Logic Rule, Workflow Rule

• A logically grouped collection of the above metadata objects represents a canonical mapping object.

• The persisted state of these objects enables static and dynamic analysis of the CDMS environment.

• Functional metadata enables flexibility and reusability.


Metadata: Sources and Targets

• Connections are reusable artifacts that capture the information required to connect to a source or target. Users can store this connection information and reuse it when extracting or loading data.

• Schemas define the physical layout, format, and data types of data within a source/target object. Schema artifacts are also stored and can be reused.
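A minimal sketch of how connections and schemas might be stored once and reused follows. The class names, the in-memory `ArtifactStore`, and the host and column values are all hypothetical, chosen only for illustration; the slides do not describe how CDMS actually persists these artifacts.

```python
from dataclasses import dataclass, field


@dataclass
class Connection:
    """What is needed to reach a source or target (illustrative fields)."""
    name: str
    kind: str                      # e.g. "mysql" or "file"; values are assumptions
    settings: dict = field(default_factory=dict)


@dataclass
class Schema:
    """Physical layout of a source/target object: column name -> data type."""
    name: str
    columns: dict


class ArtifactStore:
    """In-memory stand-in for wherever the metadata would be persisted."""

    def __init__(self):
        self._items = {}

    def save(self, artifact):
        # Key by artifact class and name so both kinds can share one store.
        self._items[(type(artifact).__name__, artifact.name)] = artifact
        return artifact

    def load(self, kind, name):
        return self._items[(kind.__name__, name)]


store = ArtifactStore()
store.save(Connection("crm_db", "mysql", {"host": "db.example.internal", "port": 3306}))
store.save(Schema("crm_customer", {"fname": "string", "lname": "string", "email": "string"}))

# The same stored connection is looked up for both extracting and loading.
extract_conn = store.load(Connection, "crm_db")
load_conn = store.load(Connection, "crm_db")
```

Keying the store by artifact type and name is one simple way to let a single connection definition serve several extract and load jobs.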


Metadata: Canonical Attribute Artifacts

• Schema mapping sets define the relationship between physical source schemas and internal canonical data objects. Schema mapping sets can have a one-to-many relationship between physical and logical schemas.

• A logical data table is a collection of one or more attributes that defines a data table within the CDMS.

• An attribute schema is the collection of metadata required to define a managed attribute for use within a data table schema.

*Attributes, for all intents and purposes, are simple key-value pairs.

[Diagram labels: Schema Mapping Sets, Data Table Schema, Attributes Schema]


Metadata: Application Logic

• A Logic Rule is a reusable object that contains transformation logic.

• A Workflow Rule contains logic that replicates, moves, or makes data available within a requested context.

• A Data Rule is a reusable object that contains both logic and workflow rules.

[Diagram: a Data Rule containing Logic and Workflow]
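One way to picture the three rule types is the sketch below, which follows the diagram in treating a Data Rule as a pairing of a Logic rule and a Workflow rule. The class layout, the `apply` method, and the email example are assumptions made for illustration and are not taken from CDMS itself.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LogicRule:
    """Reusable transformation logic."""
    name: str
    transform: Callable[[dict], dict]


@dataclass
class WorkflowRule:
    """Logic that replicates, moves, or exposes data in some context."""
    name: str
    deliver: Callable[[dict], None]


@dataclass
class DataRule:
    """Reusable pairing of a logic rule and a workflow rule."""
    name: str
    logic: LogicRule
    workflow: WorkflowRule

    def apply(self, record: dict) -> dict:
        result = self.logic.transform(record)   # transform first
        self.workflow.deliver(result)           # then hand off the result
        return result


replica = []  # stands in for a target data store

normalize_email = LogicRule(
    "normalize_email",
    lambda rec: {**rec, "email": rec["email"].strip().lower()},
)
replicate = WorkflowRule("replicate_to_target", replica.append)
customer_rule = DataRule("customer_email", normalize_email, replicate)

out = customer_rule.apply({"id": 1, "email": "  Ada@Example.COM "})
```

Because each rule is a separate object, the same `normalize_email` logic could be paired with a different workflow rule without being rewritten.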


Attribute Schema Associations

Data Table *,* Attribute Schema

Attribute Schema

• System Name: created and managed by the system.

• Attribute Name: user-defined name that uniquely identifies an attribute within a folder.

• Display Name: user-defined name that uniquely identifies an attribute within a data table.

• CDMS Type: a collection of validation criteria that can be applied to attributes as a template.

CDMS Type

• Type Name

• Primitive Data Type

• Field Formatting (email, SSN, phone)

• Constraints (min/max or allowed values)

Compound Type

• A collection of attributes and CDMS Types that defines a complete or partial record as a single compound type.
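The idea of a CDMS Type as a reusable validation template can be sketched as follows. The field names, the regular expression standing in for "field formatting", and the error strings are hypothetical; the slides name the ingredients (primitive type, formatting, constraints) but not how they are represented.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class CdmsType:
    """Validation template: primitive type, formatting, and constraints."""
    type_name: str
    primitive: type
    pattern: Optional[str] = None        # field formatting, as a regex (assumption)
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    allowed: Optional[frozenset] = None

    def validate(self, value) -> list:
        """Return a list of problems; an empty list means the value passes."""
        if not isinstance(value, self.primitive):
            return [f"expected {self.primitive.__name__}"]
        errors = []
        if self.pattern and isinstance(value, str) and not re.fullmatch(self.pattern, value):
            errors.append("bad format")
        if self.minimum is not None and value < self.minimum:
            errors.append("below minimum")
        if self.maximum is not None and value > self.maximum:
            errors.append("above maximum")
        if self.allowed is not None and value not in self.allowed:
            errors.append("not an allowed value")
        return errors


@dataclass
class AttributeSchema:
    system_name: str      # system-managed in CDMS; hard-coded here
    attribute_name: str
    display_name: str
    cdms_type: CdmsType


EMAIL = CdmsType("email", str, pattern=r"[^@\s]+@[^@\s]+\.[^@\s]+")
AGE = CdmsType("age", int, minimum=0, maximum=150)

# One template applied to two different attributes.
home_email = AttributeSchema("attr_0001", "home_email", "Home Email", EMAIL)
work_email = AttributeSchema("attr_0002", "work_email", "Work Email", EMAIL)
```

Calling `home_email.cdms_type.validate("ada@example.com")` returns an empty list, while a value with no `@` yields `["bad format"]`.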


Data Rule Associations

[Diagram labels: Rule; Parameter Name; Rule Logic; O-CAP Type; Rule Association; Parameter Name; Display Name; Table/List Name; Data Output; Data Input; Constraint; CDMS Type; Data Type; Canonical Data; Correlation Rules; External Data; Name; Attributes; O-CAP Type; Display Name; System Name; Data Table List; Data Table (four instances)]


CDMS Canonical Data

Sources and targets

• Files

• Databases

• Cloud Applications

Master catalog of all metadata objects

• Canonical Map

• Data tables: Customer Data Table, Transaction Data Table, Dictionary Data Table, Transaction Detail Data Table

Canonical data foundational structures

• Object: top-level objects (accounts, customers, etc.)

• Object transactions: aggregate summaries

• Object transaction details: logs

• Object references: miscellaneous reference and relationship data

*All objects contain one or more attributes.


Canonical Object Map Attribute Associations

• Sources: Applications (files, web forms, etc.) and Databases (MySQL, Cassandra, etc.)

• Source schema fields: fname, lname, email, sms; first, last, email, mobile

• Attributes (key-value pairs): First Name, Last Name, Home Email, Work Email, SMS

• System Name: Name, Name, Email, SMS

• Display Name: Email, Email, Email

• O-CAP Type

• Logic: CDC/MDM Logic, Constraint Logic, Validation Logic

• Outputs: Data Tables (ID, First Name, Last Name) and Reject Records
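The mapping from source schema fields to canonical attributes can be illustrated with a short sketch. The field and attribute names come from the slide; which source uses which field set, which email maps to "Home Email" versus "Work Email", and the required-field check that decides between data tables and reject records are all assumptions made for this example.

```python
# Source field -> canonical attribute, one mapping per source schema.
# Assigning "Home Email" and "Work Email" this way is an assumption.
MAPPINGS = {
    "application": {"fname": "First Name", "lname": "Last Name",
                    "email": "Home Email", "sms": "SMS"},
    "database": {"first": "First Name", "last": "Last Name",
                 "email": "Work Email", "mobile": "SMS"},
}

REQUIRED = ("First Name", "Last Name")  # hypothetical validation rule


def to_canonical(source: str, record: dict) -> dict:
    """Rename mapped source fields to canonical attributes; drop the rest."""
    mapping = MAPPINGS[source]
    return {mapping[k]: v for k, v in record.items() if k in mapping}


def route(source: str, records: list) -> tuple:
    """Split records into accepted canonical rows and rejected originals."""
    accepted, rejected = [], []
    for record in records:
        canonical = to_canonical(source, record)
        if all(canonical.get(name) for name in REQUIRED):
            accepted.append(canonical)
        else:
            rejected.append(record)
    return accepted, rejected


accepted, rejected = route("database", [
    {"first": "Ada", "last": "Lovelace", "email": "ada@example.com", "mobile": "555-0100"},
    {"first": "", "last": "Unknown", "email": "x@example.com"},
])
```

Here the first record lands in `accepted` under canonical attribute names, and the second, which has an empty first name, is kept unchanged in `rejected`.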


Canonical Data Model

[Diagram: nodes labeled K, O, T, D, and R, as defined in the legend]

K  Key: auto-generated, managed by the system; uniquely identifies a logical data model.

O  Object: a logical collection of one or more attributes.

T  Transaction Object: manages aggregations and summaries of one or more objects.

D  Transaction Details: manages details that might roll up into a transaction object.

R  Reference Object: manages logical references and relationship data between the other object types within the system.
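The legend can be sketched as a small data structure. The enum, the class names, the use of a UUID for the key, and the sample objects are hypothetical; the slides only state that the key is auto-generated and that every object holds one or more attributes.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum


class ObjectKind(Enum):
    OBJECT = "O"        # logical collection of attributes
    TRANSACTION = "T"   # aggregations and summaries
    DETAIL = "D"        # details that may roll up into a transaction
    REFERENCE = "R"     # references and relationships between objects


@dataclass
class CanonicalObject:
    kind: ObjectKind
    name: str
    attributes: dict

    def __post_init__(self):
        # Every object must carry one or more attributes.
        if not self.attributes:
            raise ValueError("an object needs at least one attribute")


@dataclass
class CanonicalModel:
    # K: auto-generated key identifying the logical data model.
    # Using a UUID here is an assumption.
    key: str = field(default_factory=lambda: uuid.uuid4().hex)
    objects: list = field(default_factory=list)

    def add(self, kind, name, **attributes):
        obj = CanonicalObject(kind, name, attributes)
        self.objects.append(obj)
        return obj

    def of_kind(self, kind):
        return [o.name for o in self.objects if o.kind is kind]


model = CanonicalModel()
model.add(ObjectKind.OBJECT, "customer", first_name="Ada")
model.add(ObjectKind.TRANSACTION, "monthly_total", amount=42.0)
model.add(ObjectKind.DETAIL, "order_line", sku="A-1", qty=2)
model.add(ObjectKind.REFERENCE, "customer_orders", source="customer", target="order_line")
```

Enforcing the "one or more attributes" rule in `__post_init__` means an empty object is rejected at construction time instead of being discovered later.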


Why CDMS?

• Single canonical representation of data across public and private cloud environments

• Context-sensitive bi-directional replication of data

• Object- and collection-level consistency tuning

• Enables collaborative data management strategies across enterprises

• High availability and elastic scalability