CQL: SQL for Cassandra
Cassandra NYC, December 6, 2011
Eric Evans, [email protected], @jericevans, @acunu

Page 1: CQL: SQL In Cassandra

CQL: SQL for Cassandra

Cassandra NYC
December 6, 2011

Eric Evans
[email protected]

@jericevans, @acunu

Page 2: CQL: SQL In Cassandra

● Overview, history, motivation

● Performance characteristics

● Coming soon (?)

● Drivers status

Page 3: CQL: SQL In Cassandra

What?

● Cassandra Query Language
● aka CQL
● aka /ˈsēkwəl/

● Exactly like SQL (except where it's not)
● Introduced in Cassandra 0.8.0
● Ready for production use

Page 4: CQL: SQL In Cassandra

SQL? Almost.

-- Inserts or updates
INSERT INTO Standard1 (KEY, col0, col1) VALUES (key, value0, value1)

vs.

-- Inserts or updates
UPDATE Standard1 SET col0=value0, col1=value1 WHERE KEY=key

Page 5: CQL: SQL In Cassandra

SQL? Almost.

-- Get columns for a row
SELECT col0,col1 FROM Standard1 WHERE KEY=key

-- Range of columns for a row
SELECT col0..colN FROM Standard1 WHERE KEY=key

-- First 10 results from a range of columns
SELECT FIRST 10 col0..colN FROM Standard1 WHERE KEY=key

-- Invert the sorting of results
SELECT REVERSED col0..colN FROM Standard1 WHERE KEY=key

Page 6: CQL: SQL In Cassandra

Why?

Page 7: CQL: SQL In Cassandra

Interface Instability

Page 8: CQL: SQL In Cassandra

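(Un)ease of use

import java.nio.ByteBuffer;
import java.util.*;
import org.apache.cassandra.thrift.*;

// One column: name, value, timestamp
Column col = new Column(ByteBuffer.wrap("name".getBytes()));
col.setValue(ByteBuffer.wrap("value".getBytes()));
col.setTimestamp(System.currentTimeMillis());

// Wrap the column in a ColumnOrSuperColumn...
ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
cosc.setColumn(col);

// ...and wrap that in a Mutation
Mutation mutation = new Mutation();
mutation.setColumn_or_supercolumn(cosc);

List<Mutation> mutations = new ArrayList<Mutation>();
mutations.add(mutation);

// row key -> column family name -> mutations
Map<ByteBuffer, Map<String, List<Mutation>>> mutations_map = new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
Map<String, List<Mutation>> cf_map = new HashMap<String, List<Mutation>>();
cf_map.put("Standard1", mutations);
mutations_map.put(ByteBuffer.wrap("key".getBytes()), cf_map);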

Page 9: CQL: SQL In Cassandra

CQL

INSERT INTO Standard1 (KEY, col0) VALUES (key, value0)

Page 12: CQL: SQL In Cassandra

Why? How about...

● Better stability guarantees
● Easier to use (you already know it)
● Better code readability / maintainability
● Irritates the NoSQL purists
● (Still) irritates the SQL purists

Page 13: CQL: SQL In Cassandra
Page 14: CQL: SQL In Cassandra

Performance

Page 15: CQL: SQL In Cassandra
Page 16: CQL: SQL In Cassandra

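Thrift RPC

import java.nio.ByteBuffer;
import java.util.*;
import org.apache.cassandra.thrift.*;

// One column: name, value, timestamp
Column col = new Column(ByteBuffer.wrap("name".getBytes()));
col.setValue(ByteBuffer.wrap("value".getBytes()));
col.setTimestamp(System.currentTimeMillis());

// Wrap the column in a ColumnOrSuperColumn...
ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
cosc.setColumn(col);

// ...and wrap that in a Mutation
Mutation mutation = new Mutation();
mutation.setColumn_or_supercolumn(cosc);

List<Mutation> mutations = new ArrayList<Mutation>();
mutations.add(mutation);

// row key -> column family name -> mutations
Map<ByteBuffer, Map<String, List<Mutation>>> mutations_map = new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
Map<String, List<Mutation>> cf_map = new HashMap<String, List<Mutation>>();
cf_map.put("Standard1", mutations);
mutations_map.put(ByteBuffer.wrap("key".getBytes()), cf_map);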

Page 17: CQL: SQL In Cassandra

Your query, it's a graph

Page 18: CQL: SQL In Cassandra

CQL

INSERT INTO Standard1 (KEY, col0) VALUES (key, value0)

Page 21: CQL: SQL In Cassandra

Hotspot
Quoted string literals

UPDATE table SET 'name' = 'value' WHERE KEY = 'somekey'

● Anything that appears between quotes
● Inlined Java constructs a StringBuilder to store the contents (not fast)
● Incurred multiple times per statement
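To make the cost of this hotspot concrete, here is a minimal sketch of how a scanner might collect the contents of a quoted literal into a StringBuilder. This is an assumption for illustration, not the code Cassandra's grammar actually generates: the class name QuotedLiteralSketch and the method readLiteral are hypothetical, and the doubled-quote escape follows the usual SQL and CQL convention.

```java
final class QuotedLiteralSketch {
    // Hypothetical helper: reads the literal whose opening quote is at `start`
    // and returns its contents without the surrounding quotes.
    static String readLiteral(String stmt, int start) {
        if (stmt.charAt(start) != '\'') {
            throw new IllegalArgumentException("no opening quote at index " + start);
        }
        // A fresh StringBuilder per literal: this is the per-term allocation
        // and character-by-character copy that the slide calls out.
        StringBuilder sb = new StringBuilder();
        int i = start + 1;
        while (i < stmt.length()) {
            char c = stmt.charAt(i);
            if (c == '\'') {
                // Two quotes in a row stand for one literal quote character.
                if (i + 1 < stmt.length() && stmt.charAt(i + 1) == '\'') {
                    sb.append('\'');
                    i += 2;
                    continue;
                }
                return sb.toString(); // closing quote reached
            }
            sb.append(c);
            i++;
        }
        throw new IllegalArgumentException("unterminated string literal");
    }

    public static void main(String[] args) {
        String stmt = "UPDATE table SET 'name' = 'value' WHERE KEY = 'somekey'";
        // The statement has three literals, so this work happens three times.
        System.out.println(readLiteral(stmt, stmt.indexOf('\'')));
    }
}
```

The statement on this slide contains three quoted terms, so a scanner written this way pays the allocation and copy three times per statement.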

Page 24: CQL: SQL In Cassandra

Hotspot
Marshalling

UPDATE table SET 'clear' = 'abffaadd10' WHERE KEY = 'acfe12ff'

● Terms are marshalled to bytes by type
● String.getBytes is slow (AsciiType)
● Hex conversion is faster (BytesType)
● Incurred multiple times per statement

ascii blob
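The two marshalling paths can be sketched roughly as follows. This is an illustration under assumptions, not Cassandra's AsciiType or BytesType code: the class MarshalSketch and both method names are hypothetical. An ascii term goes through a charset encoder, while a blob term is decoded two hex digits at a time.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class MarshalSketch {
    // AsciiType-style path (assumed): encode the term's characters via a charset.
    static ByteBuffer asciiToBytes(String term) {
        return ByteBuffer.wrap(term.getBytes(StandardCharsets.US_ASCII));
    }

    // BytesType-style path (assumed): every two hex digits become one byte.
    static ByteBuffer hexToBytes(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits: " + hex);
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex string: " + hex);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return ByteBuffer.wrap(out);
    }

    public static void main(String[] args) {
        // The three terms of the UPDATE statement on this slide.
        System.out.println(asciiToBytes("clear").remaining());    // 5 bytes
        System.out.println(hexToBytes("abffaadd10").remaining()); // 5 bytes
        System.out.println(hexToBytes("acfe12ff").remaining());   // 4 bytes
    }
}
```

Each term in a statement is marshalled separately, which is why the cost is paid several times per statement.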

Page 25: CQL: SQL In Cassandra

Hotspot
Copying / Conversion

execute_cql_query(ByteBuffer query, enum compression)

● Query is binary to support compression (is it worth it?)
● And don't forget the String → ByteBuffer conversion on the client-side
● Incurred only once per statement!
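The round trip described above can be sketched as a pair of conversions. This is only a sketch: the class QueryBufferSketch and its methods are hypothetical, UTF-8 is an assumed encoding, and no compression is applied.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class QueryBufferSketch {
    // Client side (assumed): the query text is copied into a byte buffer
    // before it is passed to execute_cql_query.
    static ByteBuffer encode(String query) {
        return ByteBuffer.wrap(query.getBytes(StandardCharsets.UTF_8));
    }

    // Server side (assumed): the bytes are copied back into a String
    // before the statement can be parsed.
    static String decode(ByteBuffer buf) {
        // duplicate() leaves the caller's buffer position untouched.
        return StandardCharsets.UTF_8.decode(buf.duplicate()).toString();
    }

    public static void main(String[] args) {
        String query = "INSERT INTO Standard1 (KEY, col0) VALUES (key, value0)";
        ByteBuffer wire = encode(query);
        System.out.println(decode(wire).equals(query));
    }
}
```

Unlike the literal and marshalling costs, this pair of copies happens once per statement regardless of how many terms it contains.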

Page 26: CQL: SQL In Cassandra

Achtung!
(These tests weren't perfect)

● Unneeded String → ByteBuffer → String
● No query compression implemented
● Co-located client and server

Page 27: CQL: SQL In Cassandra

Insert 20M rows, 5 columns

       Avg rate          Avg latency
RPC    20,953/s          1.6ms
CQL    19,176/s (-8%)    1.7ms (+9%)

Page 28: CQL: SQL In Cassandra

Insert 10M rows, 5 cols (indexed)

       Avg rate          Avg latency
RPC    9,850/s           5.3ms
CQL    9,290/s (-6%)     5.5ms (+4%)

Page 29: CQL: SQL In Cassandra

Counts, 10M rows, 5 cols

       Avg rate          Avg latency
RPC    18,052/s          1.7ms
CQL    17,635/s (-2%)    1.7ms

Page 30: CQL: SQL In Cassandra

Reading 20M rows, 5 cols

       Avg rate          Avg latency
RPC    22,726/s          2.0ms
CQL    20,272/s (-11%)   2.3ms (+10%)
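The rate percentages in the four benchmark tables can be checked with a few lines of arithmetic: each is the CQL rate's change relative to the RPC rate, rounded to a whole percent. The sketch below assumes that definition and reads the RPC read rate as 22,726/s; the class name RelativeChange is hypothetical. The latency percentages cannot be reproduced this way from the rounded millisecond figures shown, so they were presumably computed from unrounded measurements.

```java
final class RelativeChange {
    // Change of `value` relative to `baseline`, rounded to the nearest whole percent.
    static long percentChange(double baseline, double value) {
        return Math.round((value - baseline) / baseline * 100.0);
    }

    public static void main(String[] args) {
        // {RPC rate, CQL rate} in operations per second, copied from the tables.
        double[][] rates = {
            {20953, 19176}, // insert 20M rows, 5 columns
            {9850, 9290},   // insert 10M rows, 5 cols (indexed)
            {18052, 17635}, // counts, 10M rows, 5 cols
            {22726, 20272}, // reading 20M rows, 5 cols
        };
        for (double[] r : rates) {
            System.out.println(percentChange(r[0], r[1]) + "%");
        }
    }
}
```

Run as written, this prints -8%, -6%, -2% and -11%, matching the rate columns.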

Page 31: CQL: SQL In Cassandra

In Summary

Don't step over dollars to pick up pennies!

Page 32: CQL: SQL In Cassandra

Coming Soon(ish)

Page 33: CQL: SQL In Cassandra

Roadmap

● Prepared statements (CASSANDRA-2475)

● Compound columns (CASSANDRA-2474)

● Custom transport / protocol (CASSANDRA-2478)

● Performance testing (CASSANDRA-2268)

● Schema introspection (CASSANDRA-2477)

● Multiget support (CASSANDRA-3069)

Page 34: CQL: SQL In Cassandra

Drivers

Page 35: CQL: SQL In Cassandra

Drivers

● Hosted on Apache Extras (Google Code)
● Tagged cassandra and cql
● Licensed using Apache License 2.0
● Conforming to a standard for database connectivity (if applicable)
● Coming soon, automated testing and acceptance criteria

Page 36: CQL: SQL In Cassandra

Drivers

Driver             Platform   Status
cassandra-jdbc     Java       Good
cassandra-dbapi2   Python     Good
cassandra-ruby     Ruby       New
cassandra-pdo      PHP        New
cassandra-node     Node.js    Good

http://code.google.com/a/apache-extras.org/hosting/search?q=label%3aCassandra

Page 37: CQL: SQL In Cassandra

The End