Queue in the Cloud with MongoDB
DESCRIPTION
Leveraging MongoDB as queue infrastructure. This talk describes a low-overhead approach to building a queue-as-a-service on top of MongoDB, covering durability, long-running processes, and distributed processing.
TRANSCRIPT
QUEUE IN THE CLOUD WITH MONGODB
MONGODB LA 2013
NURI HALPERIN
QUEUE
USAGE
Ordered execution
Buffering consumer/producer
Work distribution
GOALS OF PROJECT
Leverage Mongo
• Reduce ops overhead by reusing infrastructure
• Map queue semantics to Mongo’s strengths
Reliable
• Durable – supports long-running processes
• Resilient to machine failure
• Narrows the window of failure / data loss
Centralized queue, distributed clients:
• Multiple producers
• Multiple consumers
ITERATION 0
Capped collection – not the perfect choice
• A tailable cursor seems attractive, but…
• Needs external synchronization to avoid double-consume
• Secondary indexes and updates are an anti-pattern on capped collections
Relaxing FIFO is OK
• No guarantee that the first item popped is the first one finished
• The multi-client benefit is negated if clients must synchronize on execution order
• The race condition on queue insertion has the same effect
Conclusion: Project doesn’t use capped collection and relaxes FIFO.
PARANOID BY DESIGN
Network dies
Process dies
DB dies
Machine dies
Poison letter
Dead letter
ITERATION 1
db.q4foo.save({v:{f:1}})
db.q4foo.findAndModify({query: {}, sort: {_id:1}, remove: true})
Hot: quick and simple
Not: a dead client loses the item in transit, with no trace.
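The shell calls above pop destructively: findAndModify atomically removes the oldest document and hands it to the consumer. A minimal in-memory sketch of these semantics (plain Node.js, not the MongoDB API) shows why a dead client loses the item:

```javascript
// In-memory sketch of Iteration 1 semantics (illustrative, not the MongoDB driver).
// push: insert at the tail; pop: atomically remove the head.
class DestructivePopQueue {
  constructor() { this.items = []; }
  push(v) { this.items.push({ v }); }          // db.q4foo.save({v: ...})
  pop() { return this.items.shift() || null; } // findAndModify(..., remove: true)
}

const q = new DestructivePopQueue();
q.push({ f: 1 });
const item = q.pop();
// If the consumer crashes here, the item is gone: the queue keeps no copy
// and no record that it was ever handed out ("dead in transit, no trace").
console.log(item.v.f); // 1
console.log(q.pop());  // null -- nothing left to retry
```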
ARE WE THERE YET?
Network dies
Process dies
DB dies
Machine dies
Poison letter
Dead letter
QUEUE SEMANTICS

Local / Memory     Distributed
Push               Put
Pop                Get << visibility >>
<< exception >>    Release << retry >>
                   Delete
                   << exception >>
ITERATION 2
db.q4foo.save({v: {f: 1}, dq: null})
db.q4foo.findAndModify({query: {dq: null}, sort: {_id: 1}, update: {$set: {dq: later(60)}}})
If processing succeeded => delete the item.
Hot: If client dies, item remains in queue. Data not lost.
Not: the index on _id becomes less useful at high volume, since the query filters on dq.
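The dq ("dequeued until") field gives each consumed item a visibility deadline instead of deleting it; only a successful consumer deletes the document. A hedged in-memory sketch of the same logic (plain Node.js, not the MongoDB API; later() mirrors the helper on the slide):

```javascript
// In-memory sketch of Iteration 2 semantics: consume sets a visibility
// deadline (dq) instead of removing the item; delete happens only on success.
function later(seconds) { return Date.now() / 1000 + seconds; }

class VisibilityQueue {
  constructor() { this.items = []; this.nextId = 1; }
  put(v) { this.items.push({ _id: this.nextId++, v, dq: null }); }

  // findAndModify({query: {dq: null}, sort: {_id: 1}, update: {$set: {dq: later(60)}}})
  get() {
    const item = this.items.find(i => i.dq === null); // items already in _id order
    if (item) item.dq = later(60);
    return item || null;
  }

  // On success, the consumer deletes the item by _id.
  delete(id) { this.items = this.items.filter(i => i._id !== id); }
}

const q = new VisibilityQueue();
q.put({ f: 1 });
const item = q.get();
console.log(q.get()); // null -- item is invisible while dq is set
// If the consumer dies here, the document is still in the collection:
// data is not lost, and it can be recovered (Iteration 3 adds release).
q.delete(item._id);
```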
ARE WE THERE YET?
Network dies
Process dies
DB dies
Machine dies
Poison letter
Dead letter
ITERATION 3
db.q4foo.save({v: {f: 1}, dq: null, pc: 0})
db.q4foo.findAndModify({query: { dq: null, pc:{$lt:3}}, sort: {_id:1}, update:{$set:{dq:later(60)},$inc:{pc:1}}}) // consume
db.q4foo.findAndModify({query: {_id:"..."}, update:{$set:{dq: null}}}) // release
Hot: an item can be retried automatically (via pc) after being released. An exhausted item remains in the queue.
Not: no longer strict FIFO.
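The pop count (pc) caps how many times an item is handed out, and release() returns a failed item to the queue by clearing dq. A hedged in-memory sketch of these semantics (plain Node.js, not the MongoDB API):

```javascript
// In-memory sketch of Iteration 3 semantics: get() increments a pop count (pc)
// and only matches items under the retry limit; release() clears dq.
class RetryQueue {
  constructor(maxRetries = 3) { this.items = []; this.nextId = 1; this.max = maxRetries; }
  put(v) { this.items.push({ _id: this.nextId++, v, dq: null, pc: 0 }); }

  // findAndModify({query: {dq: null, pc: {$lt: 3}}, sort: {_id: 1},
  //                update: {$set: {dq: later(60)}, $inc: {pc: 1}}})
  get() {
    const item = this.items.find(i => i.dq === null && i.pc < this.max);
    if (item) { item.dq = Date.now() / 1000 + 60; item.pc += 1; }
    return item || null;
  }

  // findAndModify({query: {_id: ...}, update: {$set: {dq: null}}})
  release(id) {
    const item = this.items.find(i => i._id === id);
    if (item) item.dq = null;
  }
}

const q = new RetryQueue();
q.put({ f: 1 });
for (let attempt = 0; attempt < 3; attempt++) {
  const item = q.get();  // consume...
  q.release(item._id);   // ...and simulate a failure each time
}
console.log(q.get());        // null -- retries exhausted (pc reached 3)
console.log(q.items.length); // 1 -- exhausted item remains in the queue
```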
ARE WE THERE? YES.
Network dies
Process dies
DB dies
Machine dies
Poison letter
Dead letter
ITERATION 4
Ensure your queue writes use applicable durability
• db.q4foo.save() + getLastError(…)
• db.q4foo.findAndModify() + getLastError(…)
Use replica sets for durability only – they add no capacity or speed gain.
OTHER THOUGHTS
Create admin jobs to monitor queues:
• Growth
• Retries exhausted
Consider TTL risks (e.g., a client failing before calling Release())
Consider idempotent operations when possible
Design clients to back off polling
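Since consumers poll with findAndModify, an idle queue would otherwise generate constant load. One common shape for backing off is exponential delay with a cap; this sketch is illustrative (the delay values and the nextDelayMs helper are hypothetical, not from the talk):

```javascript
// Illustrative exponential back-off for a polling consumer (values are
// hypothetical): double the delay on each empty poll, up to a cap, and
// reset to zero as soon as an item arrives so the consumer polls eagerly.
function nextDelayMs(current, { initial = 100, max = 30000 } = {}) {
  return current === 0 ? initial : Math.min(current * 2, max);
}

let delay = 0;
const seen = [];
for (let emptyPolls = 0; emptyPolls < 10; emptyPolls++) {
  delay = nextDelayMs(delay); // queue was empty: wait longer next time
  seen.push(delay);
}
console.log(seen.join(", "));
// 100, 200, 400, 800, 1600, 3200, 6400, 12800, 25600, 30000
delay = 0; // an item arrived: reset so the next poll happens quickly
```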
Separate queue vs. extra “topic” field
Consider dedicated DB for write-lock scope
Capped vs. regular collection – capped collections can now have an _id index and support in-place updates.