preparing your data for the cloud

27
1 Preparing Your Data For Cloud Narinder Kumar Inphina Technologies

Upload: inphina-technologies

Post on 24-May-2015

1.593 views

Category:

Technology


3 download

DESCRIPTION

More and more Enterprises are moving their IT infrastructure to Cloud platforms. Out of the entire components, Data Storage still remains a tricky part of the puzzle. I would like to present an overview of the choices, their advantages and limitations, we as Software Developers have currently. Based upon the choices, we may need to think about the design and architecture of the data-manipulation components of the application, we plan to put on Cloud. Following is an overview of the proposed agenda: Existing “Cloud Capable” and “Cloud Native” Relational DBMS Existing “Cloud Capable” and “Cloud Native” Non-Relational DBMS Main differences between Relational and Non-Relational DBMS’s Advantages and Limitations of Relational DBMS on Cloud Platforms Advantages and Limitations of Non-Relational DBMS on Cloud Platforms Design Patterns while using Non-Relational DBMS in the application Code Walk-through showing Integration of “Cloud Capable” and “Cloud Native” Non-Relational DBMS with a Web-Application

TRANSCRIPT

Page 1: Preparing your data for the cloud

1

Preparing Your Data For Cloud

Narinder KumarInphina Technologies

Page 2: Preparing your data for the cloud

2

Agenda

Relational DBMS's : Pros & Cons Non-Relational DBMS's : Pros & Cons Types of Non-Relational DBMS's Current Market State Applicability of Different Data-Bases in

different environments

Page 3: Preparing your data for the cloud

3

Relational DBMS - Pros

Data Integrity ACID Capabilities High Level Query Model Data Normalization Data Independence

Page 4: Preparing your data for the cloud

4

Relational DBMS - Cons

Scaling Issues By Duplication (Master-Slave)

By Sharding/Division (Not transparent)

Fixed Schema Mostly disk-oriented (Performance) May fair poorly with large data

Page 5: Preparing your data for the cloud

5

Non-Relational DBMS - Pros

Scalability

Replication / Availability

Performance

Deployment Flexibility

Modelling Flexibility

Faster Development (?)

Page 6: Preparing your data for the cloud

6

Non-Relational DBMS - Cons

Lack of Transactional Support Data Integrity is Application's responsibility Data Duplication / Application Dependent Eventually Consistent (mostly) No Standardization New Technology

Page 7: Preparing your data for the cloud

7

RDBMS's and Cloud

Page 8: Preparing your data for the cloud

8

Cloud Capable RDBMS

s

Almost every RDBMS can run in a IAAS Cloud Platform

Page 9: Preparing your data for the cloud

9

Cloud Native RDBMS

Page 10: Preparing your data for the cloud

10

Types of Non-Relational DBMS

Key Value Stores

Document Stores

Column Stores

Graph Stores

Page 11: Preparing your data for the cloud

11

Key Value Data-Bases

Object is completely Opaque to DB Mostly GET, PUT & DELETE operations are

supported There may be limits on size of Objects

Inspired by Amazon Dynamo Paper

Page 12: Preparing your data for the cloud

12

Key Value DataBases & Cloud

Project Voldemort

MemCachedDB Tokyo Tyrant

Page 13: Preparing your data for the cloud

13

Document Data-Bases Object is not completely opaque to DB Every Object has it's own schema

FirstName="Bob", Address="5 Oak St.", Hobby="sailing".

FirstName="Jonathan", Children=("Michael,10", "Jennifer,8")

Can perform queries based on Object's attributes

Possible to describe relationships between Objects

Joins and Transactions are not supported Good for XML or JSON objects

Page 14: Preparing your data for the cloud

14

Document Data-Bases & Cloud

Page 15: Preparing your data for the cloud

15

Column-Store Data-Bases

Richer than Document Stores Multi-Dimensional Map

Tables Row Column Time-Stamp

Supports Multiple Data Types Usually use an Underlying DFS

Inspired by Google Big Table Paper

Page 16: Preparing your data for the cloud

16

Column-Store Data-Bases & Cloud

Page 17: Preparing your data for the cloud

17

Key Factors while Making a Choice

Application Architecture Requirements Platform choices Non-Functional Requirements

Consistency

Availability

Partition

Security

Data Redemption

Page 18: Preparing your data for the cloud

18

Different Requirements = Different Solutions

Page 19: Preparing your data for the cloud

19

Scenario 1

Feature First Corporate Data

Consistency Requirements

Business Intelligence

Legacy Application

RDBMS on Amazon Cloud, RackSpace (IaaS) or

Microsoft Azure/Amazon RDS (PaaS)

Page 20: Preparing your data for the cloud

20

Scenario 2

Consumer Facing Application Big Files (Images, BLOB's, Files)

Geographically Distributed

Mostly writes

Not heavy requirement on Rich Queries

Key-Value Data Stores (Amazon S3, Project Voldemort, Redis)

Page 21: Preparing your data for the cloud

21

Scenario 3

Hundreds Of Government Documents with different schemas Need to serve on Web

Data Mining

Document Data-Stores (Amazon SimpleDB, Apache Couche DB, MongoDB)

Page 22: Preparing your data for the cloud

22

Scenario 4

Scale First Huge Data-Set

Analytical Requirements

Consumer Facing

High Availability over Consistency

Column Data-Stores (Google App Engine, Hbase, Cassandra)

Page 23: Preparing your data for the cloud

23

Mix & Match of Earlier Scenarios

Polyglot Persistence RDBMS for low-volume

and high value

Key-Value DB for large files with little queries

Memcached DB for short-lived Data

Column DB for Analytics

Page 24: Preparing your data for the cloud

24

CAP Theorem

Page 25: Preparing your data for the cloud

25

Conclusions

One Size does not Fit all

Many choices

No-SQL DB's providing Alternatives

RDBMS serve useful purpose

Page 26: Preparing your data for the cloud

26

[email protected]

www.inphina.comhttp://thoughts.inphina.com

Page 27: Preparing your data for the cloud

27

References http://nosql-database.org/

http://www.drdobbs.com/database/218900502

http://perspectives.mvdirona.com/2009/11/03/OneSizeDoesNotFitAll.aspx

http://blog.nahurst.com/visual-guide-to-nosql-systems

http://blog.heroku.com/archives/2010/7/20/nosql/

http://www.vineetgupta.com/2010/01/nosql-databases-part-1-landscape.html

http://project-voldemort.com/

http://code.google.com/p/redis/

http://memcachedb.org/

http://aws.amazon.com/simpledb/

http://couchdb.apache.org/

http://www.mongodb.org

http://riak.basho.com/