NoSQL - Life Beyond the Outer Join
DESCRIPTION
This talk was given at the Canberra Java User Group in July 2010. You can download the source to go with these slides from http://bitbucket.org/glen_a_smith/cjug-nosql-examples. The NoSQL (Not Only SQL) movement has been gaining a lot of press over the last year as a means of scaling massive data storage, complex relationships and lightning-fast retrieval for the Web's biggest sites. This month we're taking a trip to the big end of town and looking at some of the backend technologies that are powering sites like Twitter, Facebook, LinkedIn, Reddit, Digg and Google. We'll be looking at popular Java clients and servers that play in the NoSQL space and have a brief survey of the following popular NoSQL platforms: Document Databases (CouchDB), Sophisticated Key/Value Stores (Voldemort), Graph Databases (Neo4j), and simple Key/Value stores (Memcached). It'll be a lightning tour of what each technology offers, some source code on how it works, and lots of headshifts about how to store data such that you don't ever need another Left Outer Join!
TRANSCRIPT
Survey the landscape of NoSQL offerings
Learn some of the terminology
Look at some of the Java offerings in the space
Take away source to play with
Be able to ask questions (but you may not get answers)
Objectives
(N)ot (O)nly SQL, not “Anti SQL”
Movement more than “one” technology
Distributed Storage System
Much weaker queries
Scale across many machines
Much larger data, much faster queries
What is NoSQL?
Inspired by Distributed Data Storage problems
Scale easily by adding servers
Not suited to all problem types, but super-suited to certain large problem types
High-write situations (eg activity tracking or timeline rendering for millions of users)
A lot of relational uses are really dumbed down (eg fetch by PK with update)
Why NoSQL?
Nothing ;-)
To scale RDBMS, your approach is typically:
Shard your datasource
Put in a bunch of read replicas
Put memcached in front of those
What could possibly go wrong?
Complex. Custom caching. Partitioning. Migrating of shards. Tons of moving parts.
What’s wrong with RDBMS?
Atomic (it happens or not, no partial completes)
Consistent (DB internals, referential integrity, field validation)
Isolated (Can’t modify uncommitted data)
Durable (written to disk/transaction log)
But in a distributed db, life is not so simple...
How can I live w/o ACID?
In a distributed system, when you have state on more than one machine, pick any two:
Consistency (easy in read-only states – copy!)
Availability (can you get at your data? Is it up?)
Partition Tolerance (3 machines on one network, 3 on another, with a broken link between them. How do you accept updates when you can’t keep everyone up to date? What if the two sides don’t agree on what’s up?)
The CAP theorem
Basically big distributed hashtables
Push all logic into the write (update two lists – one for userId, one for email)
Things don’t happen transactionally. These are two writes.
There is no free lunch. The programmer is now handling consistency problems.
You were thinking about query optimisation before, and now even more so.
How do these NoSQL things work?
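The "push all logic into the write" idea can be sketched in plain Java. This is a minimal illustration only: `UserDirectory` and its two maps are hypothetical stand-ins for a distributed key/value store, not the API of any product named in this talk. The point is that looking up a user by email becomes two key reads, paid for by a second, non-transactional write.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a distributed key/value store.
class UserDirectory {
    // Primary data: userId -> serialized user record.
    private final Map<String, String> usersById = new HashMap<>();
    // Hand-maintained index: email -> userId.
    private final Map<String, String> userIdByEmail = new HashMap<>();

    // Two independent writes. Nothing wraps them in a transaction, so on a
    // real cluster the second write could fail after the first succeeded.
    void save(String userId, String email, String record) {
        usersById.put(userId, record);     // write 1: the record itself
        userIdByEmail.put(email, userId);  // write 2: the lookup list
    }

    String findById(String userId) {
        return usersById.get(userId);
    }

    // "Query by email" is two key lookups, with no join involved.
    String findByEmail(String email) {
        String userId = userIdByEmail.get(email);
        return userId == null ? null : usersById.get(userId);
    }
}
```

If the second write is lost, `findByEmail` returns null for a user that exists, which is exactly the kind of consistency problem the programmer now has to handle.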
Digg – 3 TB
Facebook Inbox – 50 TB
eBay – 2 PB
Think about Twitter’s issues: billions of queries a second over terabytes of data.
How big are we talking?
Key-Value In-Memory stores (Memcached, Redis)
Key-Value “Eventually Consistent” stores (“Dynamo Clones” like Cassandra, Voldemort, Riak)
Document stores (CouchDB, MongoDB, JCR)
Graph Databases (Neo4j)
Tabular (“BigTable clones” like Hadoop/HBase)
The NoSQL Taxonomy
Developed for the original LiveJournal site
LRU, distributed hashtable
Logic is in both client and server
Used in Google App Engine, Facebook, Twitter
Ehcache now has a similar service
Good for things that outlive an app server
Memcached
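The LRU (least recently used) eviction mentioned above can be sketched with the JDK's `LinkedHashMap` in access order. This is only an illustration of the eviction policy under the assumption of a fixed entry count; the real memcached server is written in C and manages memory quite differently. `LruCache` is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU cache: once capacity is exceeded, the entry that was
// touched least recently is dropped.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        // accessOrder = true: get() moves an entry to the "most recent" end.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each put(); returning true evicts the eldest entry.
        return size() > capacity;
    }
}
```

With a capacity of 2, putting `a` and `b`, reading `a`, then putting `c` evicts `b`, because `a` was used more recently.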
Clients know how to: Send items to servers (consistent hashing)
What to do when a server fails
How to fetch keys from servers
Can “weight” servers by capacity
Servers know how to: Store items they receive
Expire them from the cache
No inter-server comms – servers are unaware of each other
How does it work?
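The client-side routing described above can be sketched as a consistent hash ring. This is a simplified illustration, not the code of any particular memcached client: `HashRing`, the MD5-based hash and the figure of 100 ring points per unit of weight are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Client-side consistent hashing: servers never talk to each other,
// the client alone decides which server owns a key.
class HashRing {
    private static final int POINTS_PER_WEIGHT = 100; // assumed value
    private final TreeMap<Long, String> ring = new TreeMap<>();

    // A higher weight gives a server more points, so it receives more keys.
    void addServer(String server, int weight) {
        for (int i = 0; i < weight * POINTS_PER_WEIGHT; i++) {
            ring.put(hash(server + "#" + i), server);
        }
    }

    // On failure, drop the server's points; its keys fall to the next point.
    void removeServer(String server) {
        ring.values().removeIf(server::equals);
    }

    // Walk clockwise from the key's hash to the first server point.
    String serverFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no servers available");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    // First 8 bytes of the MD5 digest, packed into a long.
    private static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (d[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The useful property is that removing one server only moves the keys that server owned; every other key keeps its existing home, so most of the cache stays warm.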
Sample Code
Less than Memcached, but also more!
Not a cache, but a distributed key/value store
Developed by LinkedIn
Works on distributed hashmap w/failover
Logic can be in client/server or just server
Pluggable storage (mysql,bdb,mock)
Pluggable serialization (JSON, Google PB, etc)
Voldemort
Eventual consistency – data will come into sync but not immediately on the write. In practice “pretty soon” is milliseconds later
We are actually used to this – eg Google indexes update every so often.
Guarantees to read your own writes (eg your profile on LinkedIn)
Tuneable to better performance/weaker consistency
“Relaxed” Consistency
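One common way Dynamo-style stores expose this tuning is through three numbers: how many replicas hold each key (N), how many must answer a read (R) and how many must acknowledge a write (W). When R + W > N, every read overlaps at least one replica that saw the latest write. The sketch below only does that arithmetic; `QuorumSettings` and its method names are hypothetical and are not Voldemort's configuration API.

```java
// Quorum arithmetic for a replicated key/value store (illustrative only).
class QuorumSettings {
    final int replicas;       // N: copies kept of each key
    final int requiredReads;  // R: replicas that must answer a read
    final int requiredWrites; // W: replicas that must acknowledge a write

    QuorumSettings(int replicas, int requiredReads, int requiredWrites) {
        if (replicas < 1 || requiredReads < 1 || requiredWrites < 1
                || requiredReads > replicas || requiredWrites > replicas) {
            throw new IllegalArgumentException("need 1 <= R, W <= N");
        }
        this.replicas = replicas;
        this.requiredReads = requiredReads;
        this.requiredWrites = requiredWrites;
    }

    // True when any read quorum must intersect the latest write quorum.
    boolean readsSeeLatestWrite() {
        return requiredReads + requiredWrites > replicas;
    }

    // Replicas that may be down while writes still succeed.
    int writeFailuresTolerated() {
        return replicas - requiredWrites;
    }

    // Replicas that may be down while reads still succeed.
    int readFailuresTolerated() {
        return replicas - requiredReads;
    }
}
```

For example, N=3, R=2, W=2 keeps reads overlapping the latest write while tolerating one failed replica, whereas N=3, R=1, W=1 is faster but can return stale data.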
Data is automatically replicated
Partitioning ensures each server holds only a subset of the data
Server failure is handled transparently
Data is rebalanced when servers added/removed
Serialization is pluggable
Apache License
What’s attractive?
“We were able to move applications that needed to handle hundreds of millions of reads and writes per day from over 400ms to under 10ms while simultaneously increasing the amount of data we store.”
Impressive Performance
Performance Info
http://www.slideshare.net/bhupeshbansal/hadoop-user-group-jan2010
Starting the server (or deploy as a .war):
bin\voldemort-server.bat config\single_node_cluster
Starting the console:
bin\voldemort-shell.bat test tcp://localhost:6666
Run some queries:
put "hello" "world"
get "hello"
put "hello" "world 2.0"
delete "hello"
Sample Script
Sample Code
Document-Oriented Db – No Schema
Written in Erlang (!) by a Notes Dev (!!!)
Everything is stored as JSON, with a RESTful API
Clever replication concepts – works in disconnected settings
Every write creates a new document version
Map/Reduce baked in
Apache License
CouchDB
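The "every write is a new version" rule can be sketched as an optimistic revision check: an update must name the revision it was based on, and a stale revision is rejected (CouchDB answers such a request with an HTTP 409 conflict). The class below is a hypothetical in-memory model; real CouchDB revisions are strings rather than the plain integers assumed here.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory model of revision-checked document updates (illustrative only).
class VersionedStore {
    static final class Doc {
        final int rev;
        final String body;

        Doc(int rev, String body) {
            this.rev = rev;
            this.body = body;
        }
    }

    private final Map<String, Doc> docs = new HashMap<>();

    // expectedRev is the revision the caller last read; use 0 to create.
    // Returns the new revision number, or throws on a conflict.
    synchronized int put(String id, int expectedRev, String body) {
        Doc current = docs.get(id);
        int currentRev = (current == null) ? 0 : current.rev;
        if (expectedRev != currentRev) {
            throw new IllegalStateException(
                    "conflict: document is at revision " + currentRev);
        }
        int newRev = currentRev + 1;
        docs.put(id, new Doc(newRev, body));
        return newRev;
    }

    synchronized Doc get(String id) {
        return docs.get(id);
    }
}
```

Two clients that both read revision 1 cannot silently overwrite each other: the first update wins and the second gets a conflict it must resolve.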
Schemaless operation – ad hoc data
Incremental replication (great for disconnected settings)
Great fault-tolerance (with versioned conflicts)
Fast query with flexibility (MapReduce)
What’s attractive?
Popularized by Google’s MapReduce paper (2004)
Map functions collect documents matching criteria and create a B-Tree
Reduce functions operate on the B-Tree
Everything happens in parallel on many machines
Example: distributed grep
So what is this Map/Reduce thing?
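The distributed grep example can be approximated on a single JVM. In this sketch a parallel stream stands in for "many machines", which is an assumption made purely for illustration: the map step emits a document name for every matching line and the reduce step counts the emissions per document. `GrepMapReduce` is a hypothetical name and this is not CouchDB's view engine.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Grep expressed as a map step followed by a reduce step.
class GrepMapReduce {
    // docs: document name -> lines of text.
    // Returns document name -> number of lines containing the needle.
    static Map<String, Long> countMatches(Map<String, List<String>> docs,
                                          String needle) {
        return docs.entrySet().parallelStream()
                // Map: emit the document name once per matching line.
                .flatMap(doc -> doc.getValue().stream()
                        .filter(line -> line.contains(needle))
                        .map(line -> doc.getKey()))
                // Reduce: group the emitted names and count each group.
                .collect(Collectors.groupingBy(
                        name -> name, TreeMap::new, Collectors.counting()));
    }
}
```

Because each document is mapped independently, the work can be split across workers in any order; only the final grouping needs to bring results together.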
http://127.0.0.1:5984/
http://127.0.0.1:5984/_all_dbs
http://127.0.0.1:5984/mydb (PUT)
http://127.0.0.1:5984/_utils/ (Futon)
The Naked Couch
You lose some of the joy of schema-less
But you do get lots of boilerplate ;-)
Oh, and strong typing.
Mapping Couch with Ektorp
You write a map function to extract data
You always return a key/value pair
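function(doc) {
  // Guard: skip documents that have no title field
  if (doc.title && doc.title.indexOf("Hi!") > -1) {
    // Emit a key/value pair: the title as key, the whole document as value
    emit(doc.title, doc);
  }
}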
Writing a Couch MapReduce
Stores data in a graph of nodes and relationships
Can handle billions of nodes per machine
Means you can query on relationships!
Supports ACID transactions
One 500 KB jar (!)
Dual-licensed GPL/Commercial
Neo4j
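To see what "query on relationships" means, here is a plain-Java sketch of a friends-of-friends lookup over an adjacency map. It deliberately does not use the Neo4j API; `SocialGraph` and its methods are hypothetical, and a graph database would persist the nodes and relationships rather than hold them in a `HashMap`.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Tiny in-memory graph: nodes are names, relationships are "knows" edges.
class SocialGraph {
    private final Map<String, Set<String>> knows = new HashMap<>();

    // "Knows" is treated as symmetric, so store the edge in both directions.
    void addKnows(String a, String b) {
        knows.computeIfAbsent(a, k -> new LinkedHashSet<>()).add(b);
        knows.computeIfAbsent(b, k -> new LinkedHashSet<>()).add(a);
    }

    // People exactly two hops away: friends of friends who are neither
    // the person themselves nor already a direct friend.
    Set<String> friendsOfFriends(String person) {
        Set<String> direct = knows.getOrDefault(person, Collections.emptySet());
        Set<String> result = new TreeSet<>();
        for (String friend : direct) {
            for (String candidate : knows.getOrDefault(friend, Collections.emptySet())) {
                if (!candidate.equals(person) && !direct.contains(candidate)) {
                    result.add(candidate);
                }
            }
        }
        return result;
    }
}
```

In a relational schema the same question needs a self-join on a friendship table for each hop; in a graph model it is a walk along edges from the starting node.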
Sample Code
http://blogs.bytecode.com.au/glen
http://twitter.com/glen_a_smith
http://grailspodcast.com/
Download all the source from today:
http://bitbucket.org/glen_a_smith/cjug-nosql-examples
Blogvertising
Looking for a good book?
Q & A