Concurrency Patterns with MongoDB

Upload: yann-cluchey

Post on 04-Jul-2015

Page 1: Concurrency Patterns with MongoDB

CONCURRENCY PATTERNS WITH MONGODB

Yann Cluchey

CTO @ Cogenta

• Real-time retail intelligence

• Gather products and prices from web

• MongoDB in production

• Millions of updates per day, 3K/s peak

• Data in SQL, Mongo, ElasticSearch

Concurrency Patterns: Why?

• MongoDB: atomic updates, no transactions

• Need to ensure consistency & correctness

• What are my options with Mongo?

• Shortcuts

• Different approaches

Concurrency Control Strategies

• Pessimistic
  • Suited for frequent conflicts
  • http://bit.ly/two-phase-commits
• Optimistic
  • Efficient when conflicts are rare
  • http://bit.ly/isolate-sequence
• Multi-version
  • All versions stored, client resolves conflict
  • e.g. CouchDB

Optimistic Concurrency Control (OCC)

• No locks

• Prevent dirty writes

• Uses timestamp or a revision number

• Client checks & replays transaction
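The check-and-replay cycle can be sketched without a database. This is a minimal in-memory illustration, not MongoDB code: `store`, `readDoc`, `compareAndSwap` and `editWithRetry` are hypothetical names, and the `Map` merely stands in for a collection.

```javascript
// In-memory stand-in for a collection (assumption: one document per _id).
const store = new Map([[23, { _id: 23, rev: 1, comment: "The quick brown fox…" }]]);

// Return a copy so callers cannot mutate the stored document directly.
function readDoc(id) {
  const doc = store.get(id);
  return doc ? { ...doc } : null;
}

// Write only if the stored revision still matches the one the client read.
function compareAndSwap(id, expectedRev, changes) {
  const current = store.get(id);
  if (!current || current.rev !== expectedRev) return false; // conflict
  store.set(id, { ...current, ...changes, rev: expectedRev + 1 });
  return true;
}

// Client side: read, apply the edit, and replay if another writer got in first.
function editWithRetry(id, edit, maxAttempts = 5) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const doc = readDoc(id);
    if (!doc) return null; // nothing to edit
    if (compareAndSwap(id, doc.rev, edit(doc))) return readDoc(id);
  }
  throw new Error("too many conflicts");
}
```

No lock is held between the read and the write; a stale writer simply gets `false` back and has to re-read before trying again.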

Example

Original

{ _id: 23, comment: "The quick brown fox…" }

Edit 1

{ _id: 23, comment: "The quick brown fox prefers SQL" }

Edit 2

{ _id: 23, comment: "The quick brown fox prefers MongoDB" }

Example

Edit 1

db.comments.update({ _id: 23 },
  { _id: 23, comment: "The quick brown fox prefers SQL" })

Edit 2

db.comments.update({ _id: 23 },
  { _id: 23, comment: "The quick brown fox prefers MongoDB" })

Outcome: one update is silently lost, and the surviving one may have been based on stale data
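The lost update can be reproduced with a few lines of plain JavaScript. This is an in-memory sketch only: `comments` and `replaceDoc` are hypothetical stand-ins for the collection and for an unconditional full-document replacement.

```javascript
// Hypothetical in-memory collection holding the comment from the example.
const comments = new Map([[23, { _id: 23, comment: "The quick brown fox…" }]]);

// Unconditional replacement: whatever arrives last wins.
function replaceDoc(id, doc) {
  comments.set(id, { ...doc });
}

// Both clients read the same original before either of them writes.
const seenByClient1 = { ...comments.get(23) };
const seenByClient2 = { ...comments.get(23) };

replaceDoc(23, { ...seenByClient1, comment: "The quick brown fox prefers SQL" });
replaceDoc(23, { ...seenByClient2, comment: "The quick brown fox prefers MongoDB" });

// Edit 1 has been overwritten and neither client was told.
console.log(comments.get(23).comment); // prints "The quick brown fox prefers MongoDB"
```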

OCC Example

Original

{ _id: 23, rev: 1, comment: "The quick brown fox…" }

Update a specific revision (edit 1)

db.comments.update(
  { _id: 23, rev: 1 },
  { _id: 23, rev: 2, comment: "The quick brown fox prefers SQL" })

OCC Example

Edit 2

db.comments.update(
  { _id: 23, rev: 1 },
  { _id: 23, rev: 2, comment: "The quick brown fox prefers MongoDB" })

…fails: no document matches rev 1 any more, so nothing is written

{ updatedExisting: false, n: 0, err: null, ok: 1 }

• Caveat: only works if all clients follow the convention

Update Operators in Mongo

• Avoid full document replacement by using operators

• Powerful operators such as $inc, $set, $push

• Many operators can be grouped into single atomic update

• More efficient (data over wire, parsing, etc.)

• Use as much as possible

• http://bit.ly/update-operators
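As an illustration of grouping operators, the sketch below builds one update document that combines `$set`, `$inc` and `$push`, then applies it with a deliberately simplified in-memory helper. The field names (`status`, `history`) and the `applyUpdate` helper are assumptions made for this example; the helper handles top-level fields only and is not how MongoDB itself applies updates.

```javascript
// One update document combining several operators (field names are illustrative).
const update = {
  $set: { status: "shipped" },
  $inc: { rev: 1 },
  $push: { history: "shipped" },
};

// Simplified applier: top-level fields only, no dotted paths or modifiers.
function applyUpdate(doc, upd) {
  const next = { ...doc };
  for (const [field, value] of Object.entries(upd.$set || {})) {
    next[field] = value;
  }
  for (const [field, delta] of Object.entries(upd.$inc || {})) {
    next[field] = (next[field] || 0) + delta;
  }
  for (const [field, item] of Object.entries(upd.$push || {})) {
    next[field] = [...(next[field] || []), item];
  }
  return next; // all three changes land together or not at all
}

const before = { _id: 1, status: "paid", rev: 3, history: ["paid"] };
const after = applyUpdate(before, update);
```

In the shell, an update document of this shape would be passed as the second argument to `update`, so the server applies all three changes to the matched document in a single operation.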

Still Need OCC?

A hit counter

{ _id: 1, hits: 5040 }

Edit 1

db.stats.update({ _id: 1 }, { $set: { hits: 5045 } })

Edit 2

db.stats.update({ _id: 1 }, { $set: { hits: 5055 } })

Still Need OCC?

Edit 1

db.stats.update({ _id: 1 }, { $inc: { hits: 5 } })

Edit 2

db.stats.update({ _id: 1 }, { $inc: { hits: 10 } })

• Sequence of updates might vary

• Outcome always the same

• But what if sequence is important?
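The difference between the two hit-counter variants can be checked with a small in-memory sketch. `applySet` and `applyInc` are hypothetical helpers that mimic the effect of `$set` and `$inc` on the `hits` field; no database is involved.

```javascript
const start = { _id: 1, hits: 5040 };

// $set writes a total the client computed earlier; $inc sends only the delta.
const applySet = (doc, hits) => ({ ...doc, hits });
const applyInc = (doc, by) => ({ ...doc, hits: doc.hits + by });

// The same two $set updates arriving in either order give different totals.
const setThenSet = applySet(applySet(start, 5045), 5055).hits; // 5055
const setReversed = applySet(applySet(start, 5055), 5045).hits; // 5045

// The two $inc updates give the same total whichever arrives first.
const incThenInc = applyInc(applyInc(start, 5), 10).hits; // 5055
const incReversed = applyInc(applyInc(start, 10), 5).hits; // 5055
```

Increments commute, which is why the counter needs no revision check; operations that do not commute still do.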

Still Need OCC?

• Operators can offset need for concurrency control

• Support for complex atomic manipulation

• Depends on use case

• You’ll need it for
  • Opaque changes (e.g. text)
  • Complex update logic in app domain (e.g. changing a value affects some calculated fields)
  • Cases where the sequence is important and can’t be inferred

Update Commands

• Update
  • Specify query to match one or more documents
  • Use { multi: true } to update multiple documents
  • Must call Find() separately if you want a copy of the doc
• FindAndModify
  • Update single document only
  • Find + Update in single hit (atomic)
  • Returns the doc before or after update
  • Whole doc or subset
• Upsert (update or insert)
  • Important feature. Works with OCC..?

Consistent Update Example

• Have a customer document

• Want to set the LastOrderValue and return the previous value

db.customers.findAndModify({
  query: { _id: 16, rev: 45 },
  update: {
    $set: { lastOrderValue: 699 },
    $inc: { rev: 1 }
  },
  new: false
})

Consistent Update Example

• Result is null: customer has since been updated, or doesn’t exist
  • Client should replay

null

• Result is a document: intended version of customer successfully updated
  • Original version is returned (new: false)

{ _id: 16, rev: 45, lastOrderValue: 145 }

• Useful if client has got partial information and needs the full document
• A separate Find() could introduce inconsistency
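The two outcomes can be mimicked in memory. This is a sketch under stated assumptions: `customers` is a `Map` standing in for the collection, and `setLastOrderValue` is a hypothetical helper that imitates this one findAndModify call (match on `_id` and `rev`, update, return the pre-image). It is not driver code.

```javascript
// Hypothetical in-memory customers collection.
const customers = new Map([[16, { _id: 16, rev: 45, lastOrderValue: 145 }]]);

// Imitates findAndModify with new: false for this one query shape:
// match on _id and rev, set lastOrderValue, bump rev, return the old document.
function setLastOrderValue(id, expectedRev, value) {
  const current = customers.get(id);
  if (!current || current.rev !== expectedRev) return null; // no match
  customers.set(id, { ...current, lastOrderValue: value, rev: current.rev + 1 });
  return current; // the version as it was before the update
}

// First caller holds the right revision and gets the previous value back.
const first = setLastOrderValue(16, 45, 699);

// Second caller still thinks rev is 45: null tells it to re-read and replay.
const second = setLastOrderValue(16, 45, 750);
```

Because the match, the write and the returned copy come from one step, the previous value the client sees is guaranteed to be the one it replaced.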

Independent Update with Upsert

• Keep stats about customers

• Want to increment NumOrders and return new total

• Customer document might not be there

• Independent operation still needs protection

db.customerStats.findAndModify({
  query: { _id: 16 },
  update: {
    $inc: { numOrders: 1, rev: 1 },
    $setOnInsert: { name: "Yann" }
  },
  new: true,
  upsert: true
})

Independent Update with Upsert

• First run, document is created

{ _id: 16, numOrders: 1, rev: 1, name: "Yann" }

• Second run, document is updated

{ _id: 16, numOrders: 2, rev: 2, name: "Yann" }
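The create-then-update behaviour can be traced with an in-memory sketch. `customerStats` and `recordOrder` are hypothetical names; the helper imitates only this particular upsert (increment two counters, write the `$setOnInsert` fields on creation, return the post-update document) and is not a general implementation.

```javascript
// Hypothetical in-memory stats collection, initially empty.
const customerStats = new Map();

// Imitates the upsert above: the increments always apply, while the
// setOnInsert fields are written only when the document is created.
function recordOrder(id, setOnInsert) {
  const existing = customerStats.get(id);
  const base = existing || { _id: id, ...setOnInsert, numOrders: 0, rev: 0 };
  const updated = { ...base, numOrders: base.numOrders + 1, rev: base.rev + 1 };
  customerStats.set(id, updated);
  return updated; // like new: true, the post-update document
}

const firstRun = recordOrder(16, { name: "Yann" });
const secondRun = recordOrder(16, { name: "ignored on update" });
```

On the second run the name stays "Yann", because the insert-only fields are skipped once the document exists.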

Subdocuments

• Common scenario
  • e.g. Customer and Orders in single document
• Clients like having everything
• Powerful operators for matching and updating subdocuments
  • $elemMatch, $, $addToSet, $push
• Alternatives to “Fat” documents:
  • Client-side joins
  • Aggregation
  • MapReduce

Concurrency Control and Subdocuments

• Patterns described here still work, but might be impractical
  • Docs are large
  • More collisions
• Solve with scale?

Subdocument Example

• Customer document contains orders

• Want to independently update orders

• Correct order #471 value to £260

{
  _id: 16,
  rev: 20,
  name: "Yann",
  orders: {
    "471": { id: 471, value: 250, rev: 4 }
  }
}

Subdocument Example

db.customers.findAndModify({
  query: { "orders.471.rev": { $lte: 4 } },
  update: {
    $set: { "orders.471.value": 260 },
    $inc: { rev: 1, "orders.471.rev": 1 },
    $setOnInsert: {
      name: "Yann",
      "orders.471.id": 471
    }
  },
  new: true,
  upsert: true
})

Subdocument Example

• First run, order updated successfully

• Could create if not exists

{
  _id: 16,
  rev: 21,
  name: "Yann",
  orders: {
    "471": { id: 471, value: 260, rev: 5 }
  }
}

Subdocument Example

• Second conflicting run

• Query didn’t match revision, duplicate document created

{
  _id: ObjectId("533bf88a50dbb55a8a9b9128"),
  rev: 1,
  name: "Yann",
  orders: {
    "471": { id: 471, value: 260, rev: 1 }
  }
}

Subdocument Example

• Solve with unique index (good idea anyway)

db.customers.ensureIndex(
  { "orders.id": 1 },
  { "name": "orderids", "unique": true })

Subdocument Example

Client can handle findAndModify result accordingly:

• Successful update

{ updatedExisting: true }

• New document created

{ updatedExisting: false, n: 1 }

• Conflict, need to replay

{ errmsg: "exception: E11000 duplicate key error index: db.customers.$orderids dup key…" }
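A client-side dispatcher for these three outcomes might look like the sketch below. `classifyOutcome` is a hypothetical helper, and it assumes the result object is shaped like the fragments above; real drivers may nest or rename these fields, or throw an exception instead of returning an error object, so treat the field access as an assumption to verify against the driver in use.

```javascript
// Classify an outcome shaped like the fragments above.
// Assumption: the caller has flattened the driver's response into this shape.
function classifyOutcome(result) {
  const isDuplicateKey =
    result.code === 11000 ||
    (typeof result.errmsg === "string" && result.errmsg.includes("E11000"));
  if (isDuplicateKey) return "conflict"; // unique index rejected the upsert: replay
  if (result.updatedExisting === true) return "updated";
  if (result.updatedExisting === false && result.n === 1) return "created";
  return "unknown";
}

const outcomes = [
  classifyOutcome({ updatedExisting: true }),
  classifyOutcome({ updatedExisting: false, n: 1 }),
  classifyOutcome({ errmsg: "exception: E11000 duplicate key error index" }),
];
```

Here `outcomes` holds "updated", "created" and "conflict" in that order; only the last one requires the client to re-read and replay.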

Final Words

• Don’t forget deletes

• Gotchas about subdocument structure

orders: [ { id: 471 }, … ]
orders: { "471": { }, … }
orders: { "471": { id: 471 }, … }

• Coming in 2.6.0 stable

$setOnInsert: { _id: .. }

• Sharding..?