The Matrix and DataStax

Upload: datastax

Post on 26-Jan-2015

DESCRIPTION

By: Hayato Shimizu

TRANSCRIPT

Page 1: The Matrix and DataStax
Page 2: The Matrix and DataStax
Page 3: The Matrix and DataStax
Page 4: The Matrix and DataStax
Page 5: The Matrix and DataStax

What is Cassandra?

•  Apache Cassandra™ is a massively scalable NoSQL database.

•  Cassandra is designed to handle big data workloads across multiple data centers with no single point of failure, providing enterprises with continuous availability without compromising performance.

Page 6: The Matrix and DataStax

Why Cassandra

•  Fast / Linear scalability
•  Elastic
•  No single point of failure
•  Very few moving parts
•  Enterprise / multi-data center / cloud data distribution
•  Location independence – read and write anywhere
•  Dynamic / Flexible data structure
•  Tunable data consistency (per operation)
•  Data compression
•  Cloud ready
•  Familiar SQL-like language – CQL
•  Easy setup
•  No special hardware needed
•  No special caching layer needed

Page 7: The Matrix and DataStax
Page 8: The Matrix and DataStax

[Diagram: ring of four nodes, numbered 1 to 4; data1 and data2 each appear on two nodes]

Page 9: The Matrix and DataStax

Network Topology

Page 10: The Matrix and DataStax

Data Consistency

Writes

•  Any
•  One
•  Quorum
•  Local_Quorum
•  Each_Quorum
•  All

Reads

•  One
•  Quorum
•  Local_Quorum
•  Each_Quorum
•  All
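The consistency levels listed above trade latency against how many replicas must acknowledge an operation. The sketch below is a simplified illustration for a single data center only, so it covers One, Quorum and All and leaves out Any and the multi-data-center levels. The function names are hypothetical and not part of any Cassandra driver. It uses the standard majority rule: a quorum is floor(RF / 2) + 1 replicas, and a read is guaranteed to see the latest acknowledged write when the read and write replica counts together exceed the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that form a quorum: a strict majority of the replicas."""
    return replication_factor // 2 + 1


def replicas_required(level: str, replication_factor: int) -> int:
    """Replicas that must acknowledge an operation (single data center assumed)."""
    level = level.upper()
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return quorum(replication_factor)
    if level == "ALL":
        return replication_factor
    raise ValueError(f"level not covered by this sketch: {level}")


def read_sees_latest_write(write_level: str, read_level: str,
                           replication_factor: int) -> bool:
    """True when every read replica set overlaps every write replica set (R + W > RF)."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor
```

With a replication factor of 3, writing and reading at Quorum overlaps on at least one replica, while writing and reading at One does not.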

Page 11: The Matrix and DataStax

Durable Writes

INSERT INTO…

Commit log    memtable

SSTable
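The write path on this slide (commit log, then memtable, then SSTable) can be illustrated with a toy in-memory model. This is a minimal sketch, not how Cassandra is implemented: the class name, the `flush_threshold` parameter and the use of Python lists in place of on-disk files are all assumptions made for illustration.

```python
class ToyNode:
    """Toy model of the write path: commit log -> memtable -> SSTable."""

    def __init__(self, flush_threshold: int = 2):
        self.commit_log = []      # append-only list standing in for the on-disk log
        self.memtable = {}        # in-memory view of recent writes
        self.sstables = []        # immutable, key-sorted tables (newest last)
        self.flush_threshold = flush_threshold  # hypothetical flush trigger

    def write(self, key, value):
        # Append to the commit log first so the write survives a crash,
        # then apply it to the memtable.
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Persist the memtable as a sorted, immutable table; the log entries
        # it covered are no longer needed for recovery.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable.clear()
        self.commit_log.clear()

    def read(self, key):
        # Check the memtable first, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for k, v in table:
                if k == key:
                    return v
        return None
```

A write is acknowledged once it is in both the log and the memtable; the flush to an SSTable happens later and in bulk, which is why the write itself stays cheap.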

Page 12: The Matrix and DataStax

Data Structure

Keyspace: Matrix      replication_factor: 3

Column Family: character_locations

day1    morphius:<timeuuid>: coordinates    neo:<timeuuid>: coordinates

day1:neo    <timeuuid>: coordinates

day1:morph    <timeuuid>: coordinates    <timeuuid>: coordinates

<timeuuid>: coordinates

Column Family: character_information

neo    DOB: 2600-06-27    Actor: Keanu Reeves    email1: Neo@matrix

email2: [email protected]
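The layout above (keyspace, column family, row key, then named columns) can be pictured as nested maps. The sketch below is only an illustration of that shape: the helper names are hypothetical, the coordinate values are placeholders, and plain integers stand in for the timeuuid column names so that the ordering is easy to see.

```python
from collections import defaultdict


def new_keyspace():
    """column family -> row key -> {column name: value}"""
    return defaultdict(lambda: defaultdict(dict))


def insert(keyspace, column_family, row_key, column, value):
    """Set one column in one row; rows may hold different sets of columns."""
    keyspace[column_family][row_key][column] = value


def get_row(keyspace, column_family, row_key):
    """Return the row's columns ordered by column name, as a wide row keeps them."""
    return sorted(keyspace[column_family][row_key].items())


matrix = new_keyspace()

# Wide row: one column per observation, named by a time-ordered identifier.
insert(matrix, "character_locations", "day1:neo", 2, (10.0, 20.0))
insert(matrix, "character_locations", "day1:neo", 1, (0.0, 0.0))

# Static-style row: a handful of named attributes.
insert(matrix, "character_information", "neo", "Actor", "Keanu Reeves")
insert(matrix, "character_information", "neo", "email1", "Neo@matrix")
```

Splitting the row key by day and character, as in `day1:neo`, keeps each row bounded in size while still letting one lookup return a time-ordered slice.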

Page 13: The Matrix and DataStax

Overview of DataStax

•  Founded in April 2010
•  Commercial leader in Apache Cassandra™
•  300+ customers (including 20 of the Fortune 100)
•  100+ employees
•  Home to Apache Cassandra Chair & most committers
•  Headquartered in San Mateo
•  Funded by prominent venture firms

Page 14: The Matrix and DataStax

DataStax Enterprise Architecture

Page 15: The Matrix and DataStax
Page 16: The Matrix and DataStax

DataStax Cassandra

•  Kerberos authentication
•  Encrypted data at rest
•  Auditing
•  iSEC Partners validated

Page 17: The Matrix and DataStax
Page 18: The Matrix and DataStax

<schema name="wikipedia" version="1.1">
  <types>
    <fieldType name="string" class="solr.StrField"/>
    <fieldType name="text" class="solr.TextField">
      <analyzer><tokenizer class="solr.WikipediaTokenizerFactory"/></analyzer>
    </fieldType>
  </types>
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="name" type="text" indexed="true" stored="true"/>
    <field name="body" type="text" indexed="true" stored="true"/>
    <field name="title" type="text" indexed="true" stored="true"/>
    <field name="date" type="string" indexed="true" stored="true"/>
  </fields>
  <defaultSearchField>body</defaultSearchField>
  <uniqueKey>id</uniqueKey>
</schema>

Page 19: The Matrix and DataStax

Searching Data

HTTP

curl "http://localhost:8983/solr/wiki.solr/select?\
q=title%3Anatio%2A%20AND%20title%3A%5B2000%20TO%202010%5D"

CQL3

use wiki;
select title from solr where solr_query='title:natio* AND title:[2000 TO 2010]';
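The HTTP request and the CQL3 statement above carry the same Solr query; the HTTP form is simply percent-encoded. As a small sketch using only the Python standard library, the encoded URL can be rebuilt from the readable query string. The function name `build_select_url` is hypothetical; the host, port and core name are the ones shown on the slide.

```python
from urllib.parse import quote, unquote

BASE_URL = "http://localhost:8983/solr/wiki.solr/select"
SOLR_QUERY = "title:natio* AND title:[2000 TO 2010]"


def build_select_url(query: str) -> str:
    """Percent-encode the Solr query and attach it as the q parameter."""
    # safe="" also encodes ':' and '/', matching the encoding on the slide.
    return f"{BASE_URL}?q={quote(query, safe='')}"


url = build_select_url(SOLR_QUERY)
print(url)

# Decoding the q parameter gives back the readable query used in CQL3.
decoded = unquote(url.split("?q=", 1)[1])
print(decoded)
```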

Page 20: The Matrix and DataStax

Workload Isolation

[Diagram: one cluster containing C* nodes and Solr nodes, with arrows labeled "Solr Queries" and "Cassandra Queries"]

Page 21: The Matrix and DataStax
Page 22: The Matrix and DataStax

Hive

•  {LEFT|RIGHT|FULL} [OUTER] JOIN
•  GROUP BY
•  {SORT|DISTRIBUTE|CLUSTER|ORDER} BY
•  UNION
•  Sub Queries

Page 23: The Matrix and DataStax

Hive -> Cassandra Example

DROP TABLE IF EXISTS StockHist;
CREATE EXTERNAL TABLE StockHist(row_key string, column_name string, value double)
STORED BY 'org.apache.hadoop.hive.cassandra.CassandraStorageHandler'
WITH SERDEPROPERTIES ("cassandra.ks.name" = "PortfolioDemo",
  "cassandra.cf.validatorType" = "UTF8Type,UTF8Type,DoubleType");

Page 24: The Matrix and DataStax

Pig

cassandra_data = LOAD 'cassandra://<keyspace>/<CF>'
  USING CassandraStorage() AS (name, columns: bag {T: tuple(score, value)});

total_scores = FOREACH cassandra_data GENERATE name, COUNT(columns.score), LongSum(columns.score) as total PARALLEL 3;

ordered_scores = ORDER total_scores BY total DESC PARALLEL 3;

STORE ordered_scores INTO 'cfs:///final_scores.txt' USING PigStorage();
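To make the data flow of the Pig script easier to follow, here is a rough single-process equivalent in Python. It is only a sketch of the same three steps (per-row count and sum of scores, then ordering by total, descending); it does not touch Cassandra or CFS, and the function name and the sample rows used in the usage line are made up for illustration.

```python
def total_scores(rows):
    """Mirror the FOREACH ... GENERATE and ORDER ... DESC steps.

    rows: iterable of (name, columns), where columns is a list of
    (score, value) tuples, like the bag declared in the LOAD statement.
    """
    totals = [
        (name, len(columns), sum(score for score, _ in columns))
        for name, columns in rows
    ]
    # ORDER total_scores BY total DESC
    return sorted(totals, key=lambda row: row[2], reverse=True)


# Hypothetical sample input: two rows with made-up scores.
sample = [("row_a", [(1, "x"), (2, "y")]), ("row_b", [(5, "z")])]
ordered = total_scores(sample)
```

In the Pig version, `PARALLEL 3` spreads the same aggregation and sort across three reduce tasks; the Python version runs it in one process.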

Page 25: The Matrix and DataStax

Workload Isolation

[Diagram: one cluster containing C* nodes, Solr nodes and H* (Hadoop) nodes, with arrows labeled "Solr Queries", "Cassandra Queries" and "Hadoop Analytics"]

Page 26: The Matrix and DataStax
Page 27: The Matrix and DataStax
Page 28: The Matrix and DataStax
Page 29: The Matrix and DataStax

Cassandra Roadmap

Page 30: The Matrix and DataStax