Apache Cassandra and Go
DESCRIPTION
Al Tobey (@AlTobey) is an Open Source Mechanic at DataStax. Prior to working at DataStax, Al was a Tech Lead of Compute and Data Services at Ooyala, which has been using Apache Cassandra since version 0.4 and these days uses Go in production. Al will be presenting a brief introduction to Go (#golang) and Cassandra, and how they are a great fit for each other. This talk will include code samples and a live demo.
TRANSCRIPT
©2013 DataStax Confidential. Do not distribute without consent.
@AlTobey Open Source Mechanic @ DataStax
Cassandra and Go
Open Source Mechanic
• Officially, Open Source Evangelist for Apache Cassandra
• Educate
• Ask (the right) questions
• Fix things
• Get the word out
Cassandra
• AP distributed database
• Dynamo x BigTable
• Robust data model
• Built-in multi-datacenter replication
• At-rest encryption
• Easy to use
Cassandra 2.0
• Lightweight Transactions (a.k.a. CAS)
• Results paging for CQL (cursors)
• Eager retries
• Many optimizations throughout
• Experimental triggers
• Note: Java 7 is required
CREATE TABLE demo (id UUID PRIMARY KEY, name VARCHAR, dob TIMESTAMP);

INSERT INTO demo (id, name, dob) VALUES (…, 'Al Tobey', '2013-10-29');

SELECT name FROM demo;

DESCRIBE TABLE demo;

USE system;

SELECT columnfamily_name FROM schema_columnfamilies WHERE keyspace_name = 'demo';
Look familiar?
#golang
• Tiny language
• Strongly typed
• Simple concurrency primitives
• First-class functions
• Native, statically-linked code
• Garbage collection
• Excellent standard library
• Great community
• Open Source
package main

import (
    "fmt"
)

func main() {
    for i := 0; i < 100; i++ {
        fmt.Printf("%d: ", i)

        if i%3 == 0 {
            fmt.Printf("fizz")
        }

        if i%5 == 0 {
            fmt.Printf("buzz")
        }

        fmt.Printf("\n")
    }
}
Fizz Buzz
Chocolate and Peanut Butter
• Cassandra is distributed and can handle highly concurrent load
• Go makes concurrency trivial
• A CQL native protocol driver already exists for Go (gocql)
• A Thrift driver is also available (gossie)
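To make the concurrency point concrete, here is a minimal sketch of fanning work out to goroutines and waiting for all of them. It uses only the standard library; `fetch` is a hypothetical stand-in for a database query, not a gocql call, so the example runs without a Cassandra node.

```go
package main

import (
	"fmt"
	"sync"
)

// fetch is a hypothetical stand-in for a database query. It only
// formats a string so the example runs without a Cassandra node.
func fetch(id int) string {
	return fmt.Sprintf("row-%d", id)
}

// fetchAll starts one goroutine per id and returns the results in
// input order. Each goroutine writes to its own slice slot, so no
// mutex is needed.
func fetchAll(ids []int) []string {
	out := make([]string, len(ids))
	var wg sync.WaitGroup
	for i, id := range ids {
		wg.Add(1)
		go func(i, id int) {
			defer wg.Done()
			out[i] = fetch(id)
		}(i, id)
	}
	wg.Wait() // block until every goroutine has called Done
	return out
}

func main() {
	fmt.Println(fetchAll([]int{1, 2, 3}))
}
```

In a real service each `fetch` would be a query against a shared session, which is where a database built for concurrent load pays off.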
CREATE KEYSPACE demo WITH REPLICATION = {
    'class' : 'SimpleStrategy',
    'replication_factor' : 1
};

USE demo;

CREATE TABLE subscribers (
    id UUID PRIMARY KEY,
    email VARCHAR,
    created TIMESTAMP,
    updated TIMESTAMP,
    deleted TIMESTAMP
);

INSERT INTO subscribers (id, email, created, updated, deleted)
VALUES (
    4e7cd544-a72c-41a1-a004-74ea34e1e932,
    '[email protected]',
    '2013-10-24T00:00:00+0000',
    '2013-10-26T00:00:00+0000',
    '9999-12-31T00:00:00+0000'
);

SELECT id, email, created, updated, deleted FROM subscribers;
Schema
package main

import (
    "fmt"
    "time"

    "tux21b.org/v1/gocql"
    "tux21b.org/v1/gocql/uuid"
)

// Record mirrors one row of the subscribers table.
type Record struct {
    Id      uuid.UUID
    Email   string
    Created time.Time
    Updated time.Time
    Deleted time.Time
}
Querying
func main() {
    cluster := gocql.NewCluster("127.0.0.1")
    cluster.Keyspace = "demo" // the keyspace created on the Schema slide
    cluster.Consistency = gocql.Quorum

    cass, err := cluster.CreateSession()
    if err != nil {
        panic(fmt.Sprintf("Error creating session: %v", err))
    }
    defer cass.Close()

    sub := Record{}

    q := cass.Query(
        `SELECT id, email, created, updated, deleted FROM subscribers`,
    )

    // Scan reads the first row into the fields of sub.
    if err := q.Scan(&sub.Id, &sub.Email, &sub.Created, &sub.Updated, &sub.Deleted); err != nil {
        panic(fmt.Sprintf("Error reading row: %v", err))
    }

    fmt.Printf("Record: %v\n", sub)
}
Querying
Go Web Services
• Go seems to be particularly popular for web services
• With good reason:
• The standard library’s HTTP server is quite good
• Deploy a single binary
• High concurrency
• Light resource usage
package main

import (
    "fmt"
    "net/http"
)

func main() {
    http.HandleFunc("/index.txt", func(w http.ResponseWriter, r *http.Request) {
        fmt.Fprintf(w, "Ohai!\n")
    })

    http.ListenAndServe(":8080", nil)
}
Hello WWW
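A handler like this can be exercised without opening a port. The sketch below pulls the handler out into a named function (`hello`, a name chosen here for illustration) and calls it through `net/http/httptest`; the `call` helper is likewise hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// hello is the handler from the slide, written as a named function
// so it can be called directly.
func hello(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "Ohai!\n")
}

// call invokes a handler with an in-memory GET request and returns
// the recorded status code and body. No network socket is opened.
func call(h http.HandlerFunc, path string) (int, string) {
	req := httptest.NewRequest("GET", path, nil)
	rec := httptest.NewRecorder()
	h(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := call(hello, "/index.txt")
	fmt.Printf("%d %q\n", code, body)
}
```

The same function can still be registered with `http.HandleFunc("/index.txt", hello)` for the real server.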
Skeezy: don’t do this at home
• Written specifically for this talk
• No (known) heinous crimes
• https://github.com/tobert/skeezy
package main

import ( … )

func main() {
    // connect to Cassandra (same as before)

    // set up routes (with a twist)

    // start the web service (same as before)
}
skeezy
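http.Handle("/img/",
    http.StripPrefix("/img/",
        http.FileServer(http.Dir("./public/img/")),
    ),
)

http.HandleFunc("/c/", func(w http.ResponseWriter, r *http.Request) {
    // id is the post UUID taken from the request; its parsing is elided on the slide
    q := cass.Query(`SELECT id FROM c WHERE postId=?`, id.Bytes())
    iq := q.Iter()

    // the column list is abbreviated on the slide; it has to match the Scan targets
    c := Comment{}
    for iq.Scan(&c.Id, &c.ParentId, &c.Created) {
        fmt.Fprintf(w, "%v", c)
    }
})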
skeezy - setting up routes
package main

import (…)

var cass *gocql.Session

func main() {
    // …
    var err error
    cass, err = cluster.CreateSession()
    // …

    http.HandleFunc("/c/", func(…) {
        q := cass.Query(…)
        // …
    })
}
skeezy - globals?
func main() {
    // …
    cass, err := cluster.CreateSession()
    // …

    // the handler closes over cass, so no global is needed
    http.HandleFunc("/c/", func(…) {
        q := cass.Query( … )
    })
}
skeezy - closure
func main() {
    // …
    cass, err := cluster.CreateSession()
    // …

    // the session is passed explicitly to the function that needs it
    http.HandleFunc("/c/", func(…) {
        cc := skeezy.ListPosts(cass, …)
    })
}
skeezy - responsible passing
func main() {
    // …
    cass, err := cluster.CreateSession()
    // …

    http.HandleFunc("/c/", func(…) {
        cc := skeezy.ListPosts(cass, …)
    })
}

// in a separate file:
package skeezy

func ListPosts(…) []Comment {
    // length 0: make([]Comment, 1) would leave an empty Comment at index 0
    ret := make([]Comment, 0)
    for … {
        ret = append(ret, c)
    }
    return ret
}
skeezy - returning data
func ListComments(cass *gocql.Session, id uuid.UUID) (chan *Comment) {
    cc := make(chan *Comment)

    go func() {
        iq := cass.Query(`… WHERE postId=?`, id.Bytes()).Iter()
        for {
            c := Comment{}
            if iq.Scan(&c.Id, &c.ParentId, &c.Created) {
                cc <- &c
            } else {
                break
            }
        }
        if err := iq.Close(); err != nil {
            log.Fatal(err)
        }
        close(cc)
    }()

    return cc
}
skeezy - channels & goroutines
// a list of comment ids
http.HandleFunc("/c/", func(…) {
    cc := skeezy.ListComments(cass, getId(r, "/c/"))

    for comment := range cc {
        js, _ := json.Marshal(comment)
        w.Write(js)
    }
})
skeezy - channel receiver
Overview
• Cassandra and Go are both great for high-concurrency applications
• Go is a simple and powerful language
• Both are great choices for your next application
• Learn more
• http://golang.org
• http://planetcassandra.com/
• http://tux21b.org/gocql/
• Questions?