Webinar: From Relational Databases to MongoDB - What You Need to Know

Post on 01-Nov-2014

DESCRIPTION

Relational databases weren't designed to cope with the scale and agility challenges that face modern applications. MongoDB can offer scalability, performance and ease of use, but proper design is critical to that success. We'll dive into how MongoDB works to better understand what non-relational design is, why we might use it, and what advantages it gives us. We'll develop schema designs by example and consider strategies for scale-out.

TRANSCRIPT

Page 1: Webinar: From Relational Databases to MongoDB - What You Need to Know

Engineer

Bryan Reinero

@blimpyacht

Relational to MongoDB

Page 2: Webinar: From Relational Databases to MongoDB - What You Need to Know

Unhelpful Terms

• NoSQL

• Big Data

• Distributed

What’s the data model?

Page 3: Webinar: From Relational Databases to MongoDB - What You Need to Know

MongoDB

• Non-relational

• Scalable

• Highly available

• Full featured

• Document database

Page 4: Webinar: From Relational Databases to MongoDB - What You Need to Know

Terminology

RDBMS ➜ MongoDB
Table, View ➜ Collection
Row ➜ Document
Index ➜ Index
Join ➜ Embedded Document
Foreign Key ➜ Reference
Partition ➜ Shard

Page 5: Webinar: From Relational Databases to MongoDB - What You Need to Know

Sample Document

{
  maker : "M.V. Agusta",
  type : "sportbike",
  rake : 7,
  trail : 3.93,
  engine : {
    type : "internal combustion",
    layout : "inline",
    cylinders : 4,
    displacement : 750
  },
  transmission : {
    type : "cassette",
    speeds : 6,
    pattern : "sequential",
    ratios : [ 2.7, 1.94, 1.34, 1, 0.83, 0.64 ]
  }
}

Page 6: Webinar: From Relational Databases to MongoDB - What You Need to Know

Relational DBs

• Attribute columns are valid for every row

• Duplicate rows are not allowed

• Every column has the same type and same meaning

As a document store, MongoDB supports a flexible schema

Page 10: Webinar: From Relational Databases to MongoDB - What You Need to Know

1st Normal Form: No repeating groups

• Can't use equality to match elements

• Must use regular expressions to find data

• Aggregate functions are difficult

• Updating a specific element is difficult

Product_id  Name    Maker    Categories
1234        Lumia   Nokia    "electronics,hand held, smart phone"
5678        iPad    Apple    "PDA,tablet"
91011       Galaxy  Samsung  "smart phone,tablet"
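The first two bullets can be illustrated with a small sketch in plain JavaScript, with no database involved. The `rows` array is a hypothetical stand-in for the table on this slide, and the ids and field names are assumptions chosen for illustration. It suggests why an equality test against a delimited `categories` string finds nothing, and why a pattern match is needed instead.

```javascript
// Hypothetical rows modeled on the slide's table: categories is one delimited string.
const rows = [
  { productId: 1234, name: "Lumia", categories: "electronics,hand held, smart phone" },
  { productId: 5678, name: "iPad", categories: "PDA,tablet" },
  { productId: 91011, name: "Galaxy", categories: "smart phone,tablet" },
];

// Equality compares against the whole string, so no row matches a single category.
const byEquality = rows.filter((row) => row.categories === "tablet");

// A regular expression has to account for the delimiters and stray whitespace.
const tabletPattern = /(^|,)\s*tablet\s*(,|$)/;
const byPattern = rows.filter((row) => tabletPattern.test(row.categories));

console.log(byEquality.map((row) => row.name)); // no matches
console.log(byPattern.map((row) => row.name));
```

The pattern is also fragile: it must be rewritten for every delimiter or spacing convention that creeps into the column.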

Page 13: Webinar: From Relational Databases to MongoDB - What You Need to Know

The Tao of MongoDB

{
  _id : ObjectId(),
  maker : "Nokia",
  name : "Lumia",
  categories : [ "electronics", "handheld", "smart phone" ]
}

// querying is easy
db.products.find( { "categories" : "handheld" } );

// can be indexed
db.products.ensureIndex( { "categories" : 1 } );

Page 14: Webinar: From Relational Databases to MongoDB - What You Need to Know

The Tao of MongoDB

{
  _id : ObjectId(),
  maker : "Nokia",
  name : "Lumia",
  categories : [ "electronics", "handheld", "smart phone" ]
}

// Updates are easy
db.products.update(
  { "categories" : "electronics" },
  { $set : { "categories.$" : "consumer electronics" } }
);
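As a rough illustration of what the positional `$` operator does in the update on this slide, the plain JavaScript sketch below rewrites the first array element that matches the query value. The helper name `setFirstMatch` and the in-memory `product` object are assumptions made for this example. It models the observable behavior only and is not how MongoDB implements it.

```javascript
// Hypothetical in-memory copy of the product document from the slide.
const product = {
  maker: "Nokia",
  name: "Lumia",
  categories: ["electronics", "handheld", "smart phone"],
};

// Model of { $set: { "categories.$": replacement } } combined with a query on `match`:
// only the first matching array element is rewritten.
function setFirstMatch(doc, field, match, replacement) {
  const index = doc[field].indexOf(match);
  if (index === -1) return false; // the query did not match, so nothing is updated
  doc[field][index] = replacement;
  return true;
}

const updated = setFirstMatch(product, "categories", "electronics", "consumer electronics");
console.log(updated, product.categories);
```

The other elements of the array are left untouched, which is the point of the slide: a single element can be addressed without rewriting the whole field.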

Page 17: Webinar: From Relational Databases to MongoDB - What You Need to Know

The Tao of MongoDB

{
  _id : ObjectId(),
  maker : "Nokia",
  name : "Lumia",
  categories : [ "electronics", "handheld", "smart phone" ]
}

db.products.aggregate( [
  // Unwind the array
  { $unwind : "$categories" },
  // Tally the occurrences
  { $group : { "_id" : "$categories", "counts" : { "$sum" : 1 } } }
] );

Page 18: Webinar: From Relational Databases to MongoDB - What You Need to Know

The Tao of MongoDB

db.products.aggregate( [
  { $unwind : "$categories" },
  { $group : { "_id" : "$categories", "counts" : { "$sum" : 1 } } }
] );

"result" : [
  { "_id" : "smart phone", "counts" : 1589 },
  { "_id" : "handheld", "counts" : 2403 },
  { "_id" : "electronics", "counts" : 4767 }
]
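To make the two pipeline stages concrete, here is a plain JavaScript sketch of what unwinding and grouping amount to, run against a tiny in-memory array. The `products` data is hypothetical and far smaller than the collection behind the counts on this slide. The code models the stages and does not call MongoDB.

```javascript
// Hypothetical product documents; only the categories field matters here.
const products = [
  { name: "Lumia", categories: ["electronics", "handheld", "smart phone"] },
  { name: "iPad", categories: ["electronics", "tablet"] },
  { name: "Galaxy", categories: ["electronics", "smart phone", "tablet"] },
];

// Model of $unwind: emit one record per array element.
const unwound = products.flatMap((doc) =>
  doc.categories.map((category) => ({ ...doc, categories: category }))
);

// Model of $group with { $sum: 1 }: count the records that share each category.
const counts = {};
for (const record of unwound) {
  counts[record.categories] = (counts[record.categories] || 0) + 1;
}

console.log(unwound.length);
console.log(counts);
```

Each product appears once per category after the unwind step, so the tally counts category occurrences rather than products.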

Page 19: Webinar: From Relational Databases to MongoDB - What You Need to Know

Meh, big deal…. Right?

Aren’t nested structures just a pre-joined schema?

• I could use an adjacency list

• I could use an intersection table

Page 20: Webinar: From Relational Databases to MongoDB - What You Need to Know

Goals of Normalization

• Model data in an understandable form

• Reduce fact redundancy and data inconsistency

• Enforce integrity constraints

Performance is not a primary goal

Page 27: Webinar: From Relational Databases to MongoDB - What You Need to Know

Normalize or Denormalize

Commonly held that denormalization is faster

• Normalization can be fast, right?
  Requires proper indexing, and indexing affects write performance

• Does denormalization commit me to a join strategy?
  Indexing overhead is a commitment too

• Does denormalization improve a finite set of queries at the cost of several others?
  MongoDB works best in service to an application

Page 28: Webinar: From Relational Databases to MongoDB - What You Need to Know

Object–Relational Impedance Mismatch

• Inheritance hierarchies

• Polymorphic associations

Page 29: Webinar: From Relational Databases to MongoDB - What You Need to Know

Table Per Subclass

Vehicles: vin, registration, maker

Motorcycle: engine, rake, trail

Racebike: racing number, class, team, rider

Page 30: Webinar: From Relational Databases to MongoDB - What You Need to Know

Table Per Subclass

Vehicles
- electric
  - car
  - bus
  - motorcycle
- internal combustion
  - motorcycle
  - aircraft
- human powered
  - bicycle
  - skateboard
- horse-drawn

Page 32: Webinar: From Relational Databases to MongoDB - What You Need to Know

Table Per Concrete Class

• Each class is mapped to a separate table

• Inherited fields are present in each class’ table

• Can’t support polymorphic relationships

SELECT maker FROM Motorcycles WHERE Motorcycles.country = 'Italy'
UNION
SELECT maker FROM Automobiles WHERE Automobiles.country = 'Italy'

Page 36: Webinar: From Relational Databases to MongoDB - What You Need to Know

Table Per Class Family

• Classes mapped to a single table

• Discriminator column to identify class

• Many empty columns, nullability issues

Vehicle_id  Name       Maker        Type
1234        F4         M.V. Agusta  sportbike
5678        A104       M.V. Agusta  helicopter
91011       Triton 95  Triton       submarine

maker = "M.V. Agusta", type = "sportbike", num_doors = 0, wing_area = 0, maximum_depth = 0 ???

Page 38: Webinar: From Relational Databases to MongoDB - What You Need to Know

The Tao of MongoDB

{
  maker : "M.V. Agusta",
  type : "sportbike",
  engine : {
    type : "internal combustion",
    cylinders : 4,
    displacement : 750
  },
  rake : 7,
  trail : 3.93
}

{
  maker : "M.V. Agusta",
  type : "helicopter",
  engine : {
    type : "turboshaft",
    layout : "axial",
    massflow : 1318
  },
  blades : 4,
  undercarriage : "fixed"
}

Discriminator column

Page 39: Webinar: From Relational Databases to MongoDB - What You Need to Know

The Tao of MongoDB

{
  maker : "M.V. Agusta",
  type : "sportbike",
  engine : {
    type : "internal combustion",
    cylinders : 4,
    displacement : 750
  },
  rake : 7,
  trail : 3.93
}

{
  maker : "M.V. Agusta",
  type : "helicopter",
  engine : {
    type : "turboshaft",
    layout : "axial",
    massflow : 1318
  },
  blades : 4,
  undercarriage : "fixed"
}

Shared indexing strategy

Page 40: Webinar: From Relational Databases to MongoDB - What You Need to Know

The Tao of MongoDB

{
  maker : "M.V. Agusta",
  type : "sportbike",
  engine : {
    type : "internal combustion",
    cylinders : 4,
    displacement : 750
  },
  rake : 7,
  trail : 3.93
}

{
  maker : "M.V. Agusta",
  type : "helicopter",
  engine : {
    type : "turboshaft",
    layout : "axial",
    massflow : 1318
  },
  blades : 4,
  undercarriage : "fixed"
}

Polymorphic attributes
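A short plain JavaScript sketch may help show how one collection can hold both shapes of document. The `vehicles` array and the helpers `getPath` and `findBy` are assumptions introduced for this example; they stand in for a collection and for simple equality queries, and they are not MongoDB APIs.

```javascript
// Hypothetical in-memory "collection" holding both documents from the slide.
const vehicles = [
  { maker: "M.V. Agusta", type: "sportbike",
    engine: { type: "internal combustion", cylinders: 4, displacement: 750 },
    rake: 7, trail: 3.93 },
  { maker: "M.V. Agusta", type: "helicopter",
    engine: { type: "turboshaft", layout: "axial", massflow: 1318 },
    blades: 4, undercarriage: "fixed" },
];

// Resolve a dotted path such as "engine.type"; missing fields yield undefined.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

// Keep the documents whose field at `path` equals `expected`.
function findBy(collection, path, expected) {
  return collection.filter((doc) => getPath(doc, path) === expected);
}

const byMaker = findBy(vehicles, "maker", "M.V. Agusta");          // shared field
const helicopters = findBy(vehicles, "type", "helicopter");        // discriminator field
const turboshafts = findBy(vehicles, "engine.type", "turboshaft"); // nested field
const fourBlades = findBy(vehicles, "blades", 4);                  // absent on the sportbike

console.log(byMaker.length, helicopters.length, turboshafts.length, fourBlades.length);
```

Fields that exist only on some documents simply fail to match elsewhere, so no placeholder zeros or nulls are needed.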

Page 42: Webinar: From Relational Databases to MongoDB - What You Need to Know

Relaxed ACID

• Atomic operations at the Document level

• Consistency – strong / eventual

Page 43: Webinar: From Relational Databases to MongoDB - What You Need to Know

Replication

Page 45: Webinar: From Relational Databases to MongoDB - What You Need to Know

Relaxed ACID

• Atomic operations at the Document level

• Consistency – strong / eventual

• Isolation - read lock, write lock / logical database

• Durability – write ahead journal, replication

Page 46: Webinar: From Relational Databases to MongoDB - What You Need to Know

The Tao of MongoDB

• Document database

• Flexible schema

• Relaxed ACID

This favors denormalization. What’s the consequence?

Page 47: Webinar: From Relational Databases to MongoDB - What You Need to Know

Scaling MongoDB

Client Application

Single Instance or Replica Set

MongoDB

Sharded cluster

Page 48: Webinar: From Relational Databases to MongoDB - What You Need to Know

Partitioning

• User defines shard key

• Shard key defines range of data

• Key space is like points on a line

• Range is a segment of that line
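The "points on a line" picture can be sketched in plain JavaScript. The chunk boundaries, the shard names, and the helper `shardFor` below are all assumptions chosen for illustration; real chunk ranges are determined by the cluster, not hard-coded like this.

```javascript
// Hypothetical chunk map: each shard owns a half-open key range [min, max).
const chunks = [
  { shard: "Shard 1", min: -Infinity, max: 2000 },
  { shard: "Shard 2", min: 2000, max: 3000 },
  { shard: "Shard 3", min: 3000, max: 4000 },
  { shard: "Shard 4", min: 4000, max: Infinity },
];

// Find the chunk whose segment of the key space contains the shard key value.
function shardFor(vehicleId) {
  const chunk = chunks.find((c) => vehicleId >= c.min && vehicleId < c.max);
  return chunk.shard;
}

const placement = [1234, 2345, 3456, 4567, 5678].map((id) => [id, shardFor(id)]);
console.log(placement);
```

Because the ranges are contiguous and non-overlapping, every key value falls into exactly one segment.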

Page 49: Webinar: From Relational Databases to MongoDB - What You Need to Know

The Mechanism of Sharding

Complete Data Set

Define shard key on vehicle id

1234 2345 3456 4567 5678

Page 50: Webinar: From Relational Databases to MongoDB - What You Need to Know

The Mechanism of Sharding

Chunk Chunk

Define shard key on vehicle id

1234 2345 3456 4567 5678

Page 51: Webinar: From Relational Databases to MongoDB - What You Need to Know

The Mechanism of Sharding

Chunk Chunk Chunk Chunk

Define shard key on vehicle id

1234 2345 3456 4567 5678

Page 52: Webinar: From Relational Databases to MongoDB - What You Need to Know

Chunk Chunk Chunk Chunk

Shard 1 Shard 2 Shard 3 Shard 4

1234 2345 3456 4567 5678

Define shard key on vehicle id

Page 53: Webinar: From Relational Databases to MongoDB - What You Need to Know

Shard 1 Shard 2 Shard 3 Shard 4

Targeted Operations

Client

mongos
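A minimal sketch of the routing decision, in plain JavaScript: a query that names the shard key can be sent to a single shard, while a query without it has to go to every shard. The routing table, the key name `vehicleId`, and the function `targetShards` are assumptions for this example and do not reflect the router's actual internals.

```javascript
// Hypothetical routing table keyed on the shard key "vehicleId"; ranges are [min, max).
const routingTable = [
  { shard: "Shard 1", min: -Infinity, max: 2000 },
  { shard: "Shard 2", min: 2000, max: 3000 },
  { shard: "Shard 3", min: 3000, max: 4000 },
  { shard: "Shard 4", min: 4000, max: Infinity },
];

// Model of the router's choice for an equality query.
function targetShards(query) {
  if (typeof query.vehicleId !== "number") {
    // No shard key in the query: every shard has to be asked.
    return routingTable.map((entry) => entry.shard);
  }
  // Shard key present: only the shard owning that key range is contacted.
  return routingTable
    .filter((entry) => query.vehicleId >= entry.min && query.vehicleId < entry.max)
    .map((entry) => entry.shard);
}

const targeted = targetShards({ vehicleId: 3456 });
const broadcast = targetShards({ maker: "M.V. Agusta" });
console.log(targeted, broadcast);
```

This is one reason the choice of shard key matters: queries that include it stay cheap as the number of shards grows.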

Page 54: Webinar: From Relational Databases to MongoDB - What You Need to Know

Shard 1 Shard 2 Shard 3 Shard 4

Data Growth

Page 55: Webinar: From Relational Databases to MongoDB - What You Need to Know

Shard 1 Shard 2 Shard 3 Shard 4

Load Balancing

Page 56: Webinar: From Relational Databases to MongoDB - What You Need to Know

Relational if you need to

• Enforce data constraints

• Service a broad set of queries

• Minimize redundancy

Page 57: Webinar: From Relational Databases to MongoDB - What You Need to Know

The Tao of MongoDB

• Avoid ad-hoc queries

• Model data for use, not storage

• Index effectively, index efficiently

Page 58: Webinar: From Relational Databases to MongoDB - What You Need to Know

Engineer, 10gen

Bryan Reinero

@blimpyacht

Thank You