introduction to .net driver
TRANSCRIPT
The DataStax Drivers for Cassandra
• Currently Available
– C# (.NET)
– Python
– Java
– NodeJS
– Ruby
– C++
– PHP
• Will Probably Happen
– Scala
– JDBC
• Early Discussions
– Go
– Rust
2
• Open source, Apache 2 licensed, available on GitHub
– https://github.com/datastax/
The DataStax Drivers for Cassandra
Language Bootstrapping Code
C# Cluster cluster = Cluster.Builder().AddContactPoint("127.0.0.1").Build(); ISession session = cluster.Connect("killrvideo");
Python
from cassandra.cluster import Cluster cluster = Cluster(contact_points=['127.0.0.1']) session = cluster.connect('killrvideo')
Java Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build(); Session session = cluster.connect("killrvideo");
NodeJS
var cassandra = require('cassandra-driver'); var client = new cassandra.Client({ contactPoints: ['127.0.0.1'], keyspace: 'killrvideo' });
4
A video sharing web application built on DataStax
Enterprise and Microsoft Azure www.killrvideo.com
.NET and Cassandra
• Available via NuGet
• Bootstrap using the Builder and then reuse the ISession object
Cluster cluster = Cluster.Builder() .AddContactPoint("127.0.0.1") .Build(); ISession session = cluster.Connect("killrvideo");
5
.NET and Cassandra
• Executing CQL with SimpleStatement
• Sync and Async API available for executing statements
• Use Async API for executing queries in parallel
var videoId = Guid.NewGuid(); var statement = new SimpleStatement("SELECT * FROM videos WHERE videoid = ?", videoId); RowSet rows = await session.ExecuteAsync(statement);
6
.NET and Cassandra
• Getting values from a RowSet is easy
• Rowset is a collection of Row (IEnumerable<Row>)
RowSet rows = await _session.ExecuteAsync(statement); foreach (Row row in rows) { var videoId = row.GetValue<Guid>("videoid"); var addedDate = row.GetValue<DateTimeOffset>("added_date"); var name = row.GetValue<string>("name"); }
7
CQL 3 Data Types to .NET Types
• Full listing available in driver docs (http://www.datastax.com/docs)
CQL 3 Data Type .NET Type
bigint, counter long
boolean bool
decimal, float float
double double
int int
uuid, timeuuid System.Guid
text, varchar string (Encoding.UTF8)
timestamp System.DateTimeOffset
varint System.Numerics.BigInteger
Use Prepared Statements
• Performance optimization for queries you run repeatedly
• Pay the cost of preparing once (causes roundtrip to Cassandra)
• KillrVideo: looking a user’s credentials up by email address
• Save and reuse the PreparedStatement instance after preparing
9
PreparedStatement prepared = session.Prepare( "SELECT * FROM user_credentials WHERE email = ?");
Use Prepared Statements
• Bind variable values when ready to execute
• Execution only has to send variable values over the wire
• Cassandra doesn’t have to reparse the CQL string each time
• Remember: Prepare once, bind and execute many
10
BoundStatement bound = prepared.Bind("[email protected]"); RowSet rows = await _session.ExecuteAsync(bound);
Statement Options
• Options like Consistency Level and Retry Policy are available at
the Statement level
• If not set on a statement, driver will fallback to defaults set when
building/configuring the Cluster
11
IStatement bound = prepared.Bind("[email protected]") .SetPageSize(100) .SetConsistencyLevel(ConsistencyLevel.LocalOne) .SetRetryPolicy(new DefaultRetryPolicy()) .EnableTracing();
Batch Statements: Use and Misuse
• You can mix and match Simple/Bound statements in a batch
• Batches are Logged (atomic) by default
• Use when you want a group of mutations (statements) to all
succeed or all fail (denormalizing at write time)
• Large batches are an anti-pattern (Cassandra will warn you)
• Not a performance optimization for bulk-loading data
12
KillrVideo: Update a Video’s Name with a Batch
13
public class VideoCatalogDataAccess { public VideoCatalogDataAccess(ISession session) { _session = session; _prepared = _session.Prepare( "UPDATE user_videos SET name = ? WHERE userid = ? AND videoid = ?"); } public async Task UpdateVideoName(UpdateVideoDto video) { BoundStatement bound = _prepared.Bind(video.Name, video.UserId, video.VideoId); var simple = new SimpleStatement("UPDATE videos SET name = ? WHERE videoid = ?", video.Name, video.VideoId); // Use an atomic batch to send over all the mutations var batchStatement = new BatchStatement(); batchStatement.Add(bound); batchStatement.Add(simple); RowSet rows = await _session.ExecuteAsync(batch); } }
Lightweight Transactions when you need them
• Use when you don’t want writes to step on each other
– Sometimes called Linearizable Consistency
– Similar to Serial Isolation Level from RDBMS
• Essentially a Check and Set (CAS) operation using Paxos
• Read the fine print: has a latency cost associated with it
• The canonical example: unique user accounts
14
KillrVideo: LWT to create user accounts
• Returns a column called [applied] indicating success/failure
• Different from relational world where you might expect an
Exception (i.e. PrimaryKeyViolationException or similar)
15
string cql = "INSERT INTO user_credentials (email, password, userid)" +
"VALUES (?, ?, ?) IF NOT EXISTS";
var statement = new SimpleStatement(cql, user.Email, hashedPassword, user.UserId);
RowSet rows = await _session.ExecuteAsync(statement);
var userInserted = rows.Single().GetValue<bool>("[applied]");
Automatic Paging
• The Problem: Loading big result sets into memory is a recipe
for disaster (OutOfMemoryExceptions, etc.)
• Better to load and process a large result set in pages (chunks)
• Automatic Paging makes paging on a large RowSet
transparent
Automatic Paging
• Set a page size on a statement
• Iterate over the resulting RowSet
• As you iterate, new pages are fetched transparently when the
Rows in the current page are exhausted
• Will allow you to iterate until all pages are exhausted
boundStatement = boundStatement.SetPageSize(100); RowSet rows = await _session.ExecuteAsync(boundStatement); foreach (Row row in rows) { }
Mapping Rows to Objects – Mapper Component
• Micro ORM: Write CQL queries, RowSets are mapped to POCOs
• Mappings are based on conventions, can be configured via code (fluent-style interface) or attributes on your POCOs
public class User { public Guid UserId { get; set; } public string Name { get; set; } } // Create a mapper from your session object var mapper = new Mapper(session); // Get a user by id from Cassandra or null if not found var user = client.SingleOrDefault<User>( "SELECT userid, name FROM users WHERE userid = ?", someUserId);
18
Mapping Rows to Objects – LINQ Provider
• Write LINQ queries instead of CQL, results mapped to POCOs
[Table("users")] public class User { [Column("userid"), PartitionKey] public Guid UserId { get; set; } [Column("name")] public string Name { get; set; } } var user = session.GetTable<User>() .SingleOrDefault(u => u.UserId == someUserId) .Execute();
19
Questions? @LukeTillman
https://www.linkedin.com/in/luketillman/
https://github.com/LukeTillman/
20