Databases and Queries: Matching Performance and Reliability
DATABASES AND QUERIES: MATCHING PERFORMANCE AND RELIABILITY
Dave Smith
VP, Engineering
@dizzyd
SURVEY
• Who has hit problems scaling RDBMS?
• Who is using non-relational databases?
QUERIES
• Relational
• Key/value (document)
• Text retrieval (full-text search)
• Graph
• Time-series
• Geospatial
QUERIES (CONT.)
• What questions are you asking of your data?
  • Get a record by a key
  • Find records based on a relationship
  • Find all documents with a given term
  • Apply operation to metrics within a timeframe
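As a rough illustration of the four questions above, here is a minimal in-memory sketch. The data (`users`, `docs`, `metrics`) and all function names are hypothetical and stand in for whatever store actually holds the data; nothing here comes from the talk itself.

```python
from collections import defaultdict

# Hypothetical sample data standing in for a real store.
users = {
    "u1": {"name": "Ann", "follows": ["u2"]},
    "u2": {"name": "Bob", "follows": []},
}
docs = {
    "d1": "riak is a key value store",
    "d2": "postgres is a relational store",
}
metrics = [(100, 5.0), (160, 7.0), (220, 9.0)]  # (unix_time, value)


def get_by_key(key):
    """Get a record by a key."""
    return users.get(key)


def followers_of(key):
    """Find records based on a relationship: ids of users who follow `key`."""
    return sorted(uid for uid, user in users.items() if key in user["follows"])


def build_index(documents):
    """Build an inverted index mapping each term to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def docs_with_term(index, term):
    """Find all documents with a given term."""
    return sorted(index.get(term.lower(), set()))


def mean_in_window(points, start, end):
    """Apply an operation (here: mean) to metrics with start <= time < end."""
    values = [value for t, value in points if start <= t < end]
    return sum(values) / len(values) if values else None
```

Each function scans or indexes the data differently, which is the point: the question being asked determines which data layout is cheap to serve.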
It is possible to rewrite most queries in other forms.
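One way to picture this: the same "orders for a customer" question asked relationally and as plain key/value lookups. This is a sketch under assumptions; the `orders` table, the `order:<id>` and `customer:<name>:orders` key layout, and the function names are all invented for illustration. SQLite (from the Python standard library) stands in for a relational database and a `dict` stands in for a key/value store.

```python
import sqlite3

# Hypothetical rows: (order id, customer, total)
orders = [(1, "alice", 30), (2, "bob", 15), (3, "alice", 12)]


def orders_for_relational(rows, customer):
    """Relational form: the database filters rows with a WHERE clause."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total INTEGER)"
        )
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT id, total FROM orders WHERE customer = ? ORDER BY id",
            (customer,),
        )
        return cursor.fetchall()
    finally:
        conn.close()


def build_kv(rows):
    """Key/value form: the application maintains its own secondary key."""
    kv = {}
    for order_id, customer, total in rows:
        kv[f"order:{order_id}"] = {"customer": customer, "total": total}
        kv.setdefault(f"customer:{customer}:orders", []).append(order_id)
    return kv


def orders_for_kv(kv, customer):
    """Two-step lookup: fetch the id list, then fetch each order by key."""
    ids = kv.get(f"customer:{customer}:orders", [])
    return [(order_id, kv[f"order:{order_id}"]["total"]) for order_id in sorted(ids)]
```

Both forms return the same answer. What changes is who does the work: the relational version leaves filtering to the database, while the key/value version pushes index maintenance into the application at write time.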
PERFORMANCE
• Access patterns
  • Read/write mix
  • Sequential vs. Pareto vs. uniformly random
• Throughput - how many requests/sec?
• Latency - how long does it take to service a single request?
  • Always a distribution! Mean is meaningless…
• Data size
  • Total size of dataset
  • Size per item in dataset
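To make the "latency is a distribution" point concrete, here is a small sketch with made-up numbers: 98 requests at 10 ms and 2 requests at 2000 ms. The sample data and the nearest-rank `percentile` helper are assumptions for illustration, not measurements from any real system.

```python
import math
import statistics


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it (p in 1..100)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: mostly fast, with a slow tail.
latencies_ms = [10] * 98 + [2000] * 2

mean_ms = statistics.mean(latencies_ms)  # pulled up by the two slow requests
p50_ms = percentile(latencies_ms, 50)    # what a typical request sees
p99_ms = percentile(latencies_ms, 99)    # what the slow tail sees
```

Here the mean is about 49.8 ms, a latency that no request actually experienced: the median is 10 ms and the 99th percentile is 2000 ms. Reporting percentiles keeps the tail visible.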
RELIABILITY
• How can databases fail?
  • Disks -> integrity checking
  • Nodes -> replication
  • Network -> versioning
  • Software -> (all of above)
  • Overload -> elasticity
• Key questions
  • How well does the system tolerate failure?
  • How well does the system deal with unexpected load?
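Two of the failure/mitigation pairs above can be sketched briefly. The slide does not say which mechanisms are meant, so both choices here are assumptions: a CRC32 checksum stored with each record for integrity checking, and version vectors (a counter per node) for versioning. The function names and record layout are hypothetical.

```python
import zlib


def pack(payload: bytes) -> bytes:
    """Prefix the payload with its CRC32 as 4 big-endian bytes."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def unpack(record: bytes) -> bytes:
    """Return the payload, or raise ValueError if the stored checksum does not match."""
    stored = int.from_bytes(record[:4], "big")
    payload = record[4:]
    if zlib.crc32(payload) != stored:
        raise ValueError("checksum mismatch: record is corrupt")
    return payload


def compare(a: dict, b: dict) -> str:
    """Compare two version vectors (node name -> counter).

    Returns "a_newer", "b_newer", "equal", or "concurrent". "concurrent"
    means each side has seen a write the other has not, for example after
    a network partition, so the values must be reconciled.
    """
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"
```

For example, `compare({"n1": 2}, {"n2": 1})` returns `"concurrent"`: neither write descends from the other, so picking a winner silently would lose data.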
It can be impossible to distinguish between a slow node and a failed node.
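A small simulation of why this is so, from the caller's point of view. The three "nodes" are hypothetical handler functions running in threads, and the 0.2 second timeout and 1 second delay are arbitrary values chosen for the sketch.

```python
import queue
import threading
import time


def call_node(handler, timeout_s):
    """Send a request to a simulated node and wait up to timeout_s for a reply.

    Returns the reply, or None if nothing arrived in time.
    """
    replies = queue.Queue()
    threading.Thread(target=handler, args=(replies,), daemon=True).start()
    try:
        return replies.get(timeout=timeout_s)
    except queue.Empty:
        return None


def healthy(replies):
    replies.put("pong")  # answers immediately


def slow(replies):
    time.sleep(1.0)      # alive, but answers after the caller has given up
    replies.put("pong")


def dead(replies):
    pass                 # never answers


results = {
    "healthy": call_node(healthy, timeout_s=0.2),
    "slow": call_node(slow, timeout_s=0.2),
    "dead": call_node(dead, timeout_s=0.2),
}
```

The caller gets `None` for both `slow` and `dead`. With only a timeout to go on, it cannot tell whether the node has failed or would have answered had it waited longer.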
UGLY TRUTHS
• All databases require tuning
• Failure is hard to test — most people don’t bother
• Networks fail — especially under high load
• The more your database does, the more ways it can fail
  • More code == more bugs
CHOICES, CHOICES…
• MySQL, Postgres, Oracle
• CouchDB, MongoDB, RethinkDB
• Riak, Cassandra
• HBase, Hypertable
• MemSQL, CouchBase
• ElasticSearch, SOLR
• Neo4J, Titan