
Upload: datastax-academy

Post on 13-Apr-2017


Introduction to the .NET Driver

Luke Tillman Technical Evangelist

@LukeTillman

The DataStax Drivers for Cassandra

• Currently Available

– C# (.NET)

– Python

– Java

– NodeJS

– Ruby

– C++

– PHP

• Will Probably Happen

– Scala

– JDBC

• Early Discussions

– Go

– Rust


• Open source, Apache 2 licensed, available on GitHub

– https://github.com/datastax/

The DataStax Drivers for Cassandra

Language    Bootstrapping Code

C#

    Cluster cluster = Cluster.Builder().AddContactPoint("127.0.0.1").Build();
    ISession session = cluster.Connect("killrvideo");

Python

    from cassandra.cluster import Cluster
    cluster = Cluster(contact_points=['127.0.0.1'])
    session = cluster.connect('killrvideo')

Java

    Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
    Session session = cluster.connect("killrvideo");

NodeJS

    var cassandra = require('cassandra-driver');
    var client = new cassandra.Client({
        contactPoints: ['127.0.0.1'],
        keyspace: 'killrvideo'
    });


A video sharing web application built on DataStax Enterprise and Microsoft Azure

www.killrvideo.com

.NET and Cassandra

• Available via NuGet

• Bootstrap using the Builder and then reuse the ISession object

    Cluster cluster = Cluster.Builder()
        .AddContactPoint("127.0.0.1")
        .Build();
    ISession session = cluster.Connect("killrvideo");

.NET and Cassandra

• Executing CQL with SimpleStatement

• Sync and Async API available for executing statements

• Use Async API for executing queries in parallel

    var videoId = Guid.NewGuid();
    var statement = new SimpleStatement("SELECT * FROM videos WHERE videoid = ?", videoId);
    RowSet rows = await session.ExecuteAsync(statement);
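The point about running queries in parallel with the Async API can be sketched as follows. This is a minimal illustration, not code from the talk: the ParallelQueries class and the GetVideosAsync helper are hypothetical names, and the query is the one from the slide above. It starts every query without awaiting, then awaits them together with Task.WhenAll.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Cassandra;

public static class ParallelQueries
{
    // Hypothetical helper (not part of the driver): fetches several videos concurrently.
    public static async Task<RowSet[]> GetVideosAsync(ISession session, IEnumerable<Guid> videoIds)
    {
        // Start every query without awaiting, so all of them are in flight at once
        List<Task<RowSet>> pending = videoIds
            .Select(id => session.ExecuteAsync(
                new SimpleStatement("SELECT * FROM videos WHERE videoid = ?", id)))
            .ToList();

        // Await them together; results are returned in the same order as videoIds
        return await Task.WhenAll(pending);
    }
}
```

For a query that runs this often, a prepared statement would usually be the better choice than a SimpleStatement per call.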


.NET and Cassandra

• Getting values from a RowSet is easy

• RowSet is a collection of Row (IEnumerable<Row>)

    RowSet rows = await _session.ExecuteAsync(statement);
    foreach (Row row in rows)
    {
        var videoId = row.GetValue<Guid>("videoid");
        var addedDate = row.GetValue<DateTimeOffset>("added_date");
        var name = row.GetValue<string>("name");
    }

CQL 3 Data Types to .NET Types

• Full listing available in driver docs (http://www.datastax.com/docs)

    CQL 3 Data Type      .NET Type
    ---------------      ---------
    bigint, counter      long
    boolean              bool
    decimal              decimal
    float                float
    double               double
    int                  int
    uuid, timeuuid       System.Guid
    text, varchar        string (Encoding.UTF8)
    timestamp            System.DateTimeOffset
    varint               System.Numerics.BigInteger

Use Prepared Statements

• Performance optimization for queries you run repeatedly

• Pay the cost of preparing once (causes a round trip to Cassandra)

• KillrVideo: looking up a user’s credentials by email address

• Save and reuse the PreparedStatement instance after preparing

    PreparedStatement prepared = session.Prepare(
        "SELECT * FROM user_credentials WHERE email = ?");

Use Prepared Statements

• Bind variable values when ready to execute

• Execution only has to send variable values over the wire

• Cassandra doesn’t have to reparse the CQL string each time

• Remember: Prepare once, bind and execute many

    BoundStatement bound = prepared.Bind("[email protected]");
    RowSet rows = await _session.ExecuteAsync(bound);

Statement Options

• Options like Consistency Level and Retry Policy are available at the Statement level

• If not set on a statement, the driver will fall back to the defaults set when building/configuring the Cluster

    IStatement bound = prepared.Bind("[email protected]")
        .SetPageSize(100)
        .SetConsistencyLevel(ConsistencyLevel.LocalOne)
        .SetRetryPolicy(new DefaultRetryPolicy())
        .EnableTracing();
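The cluster-level defaults that statements fall back to could be configured along these lines. This is a sketch under assumptions: the ClusterSetup class is a hypothetical name, and the consistency level and page size are illustrative values rather than recommendations from the talk.

```csharp
using Cassandra;

public static class ClusterSetup
{
    // Hypothetical helper: builds a Cluster whose defaults apply to any
    // statement that does not set its own options.
    public static ISession Connect()
    {
        Cluster cluster = Cluster.Builder()
            .AddContactPoint("127.0.0.1")
            // Default consistency level and page size (illustrative values)
            .WithQueryOptions(new QueryOptions()
                .SetConsistencyLevel(ConsistencyLevel.LocalQuorum)
                .SetPageSize(500))
            // Default retry policy for statements that do not set one
            .WithRetryPolicy(new DefaultRetryPolicy())
            .Build();

        return cluster.Connect("killrvideo");
    }
}
```

A statement that calls SetConsistencyLevel or SetPageSize itself overrides these values for that execution only.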

Batch Statements: Use and Misuse

• You can mix and match Simple/Bound statements in a batch

• Batches are Logged (atomic) by default

• Use when you want a group of mutations (statements) to all succeed or all fail (denormalizing at write time)

• Large batches are an anti-pattern (Cassandra will warn you)

• Not a performance optimization for bulk-loading data


KillrVideo: Update a Video’s Name with a Batch

    public class VideoCatalogDataAccess
    {
        private readonly ISession _session;
        private readonly PreparedStatement _prepared;

        public VideoCatalogDataAccess(ISession session)
        {
            _session = session;
            // Prepare once, then reuse for every call
            _prepared = _session.Prepare(
                "UPDATE user_videos SET name = ? WHERE userid = ? AND videoid = ?");
        }

        public async Task UpdateVideoName(UpdateVideoDto video)
        {
            BoundStatement bound = _prepared.Bind(video.Name, video.UserId, video.VideoId);
            var simple = new SimpleStatement(
                "UPDATE videos SET name = ? WHERE videoid = ?", video.Name, video.VideoId);

            // Use an atomic batch to send over all the mutations
            var batchStatement = new BatchStatement();
            batchStatement.Add(bound);
            batchStatement.Add(simple);
            RowSet rows = await _session.ExecuteAsync(batchStatement);
        }
    }

Lightweight Transactions when you need them

• Use when you don’t want writes to step on each other

– Sometimes called Linearizable Consistency

– Similar to the Serializable isolation level in an RDBMS

• Essentially a Compare and Set (CAS) operation using Paxos

• Read the fine print: has a latency cost associated with it

• The canonical example: unique user accounts


KillrVideo: LWT to create user accounts

• Returns a column called [applied] indicating success/failure

• Different from the relational world, where you might expect an Exception (e.g. PrimaryKeyViolationException or similar)

    string cql = "INSERT INTO user_credentials (email, password, userid) " +
                 "VALUES (?, ?, ?) IF NOT EXISTS";
    var statement = new SimpleStatement(cql, user.Email, hashedPassword, user.UserId);
    RowSet rows = await _session.ExecuteAsync(statement);
    var userInserted = rows.Single().GetValue<bool>("[applied]");

Automatic Paging

• The Problem: Loading big result sets into memory is a recipe for disaster (OutOfMemoryExceptions, etc.)

• Better to load and process a large result set in pages (chunks)

• Automatic Paging makes paging on a large RowSet transparent

Automatic Paging

• Set a page size on a statement

• Iterate over the resulting RowSet

• As you iterate, new pages are fetched transparently when the Rows in the current page are exhausted

• Will allow you to iterate until all pages are exhausted

    boundStatement = boundStatement.SetPageSize(100);
    RowSet rows = await _session.ExecuteAsync(boundStatement);
    foreach (Row row in rows)
    {
        // Process each row; the next page is fetched when the current one is exhausted
    }

Mapping Rows to Objects – Mapper Component

• Micro ORM: Write CQL queries, RowSets are mapped to POCOs

• Mappings are based on conventions, can be configured via code (fluent-style interface) or attributes on your POCOs

    public class User
    {
        public Guid UserId { get; set; }
        public string Name { get; set; }
    }

    // Create a mapper from your session object
    var mapper = new Mapper(session);

    // Get a user by id from Cassandra or null if not found
    var user = mapper.SingleOrDefault<User>(
        "SELECT userid, name FROM users WHERE userid = ?", someUserId);

Mapping Rows to Objects – LINQ Provider

• Write LINQ queries instead of CQL, results mapped to POCOs

    [Table("users")]
    public class User
    {
        [Column("userid"), PartitionKey]
        public Guid UserId { get; set; }

        [Column("name")]
        public string Name { get; set; }
    }

    var user = session.GetTable<User>()
        .SingleOrDefault(u => u.UserId == someUserId)
        .Execute();

Questions? @LukeTillman

https://www.linkedin.com/in/luketillman/

https://github.com/LukeTillman/
