Cassandra at Umbel
DESCRIPTION
Umbel presents on their internal bitmap index that is backed by Cassandra.
TRANSCRIPT
September 2014
Cassandra at Umbel
Umbel
Empower companies to convert people-based data into addressable, actionable relationships.
Cassandra at Umbel
1. Segmentation of people-based data
2. Pilosa
3. Cassandra Persistent Storage
Umbel | Unmask your data. www.umbel.com © 2014 Umbel Corp. All Rights Reserved 5
SDKs
Capture
S3
Decentralized, Distributed Bitmap Index & Query Engine
Pilosa
SDKs
Capture
S3
Pilosa
[Figure: bitmaps drawn as regions of 1 bits and 0 bits]
[Figure: bitmaps of 1 and 0 bits combined with AND]
[Figure: bitmaps of 1 and 0 bits combined with OR]
[Figure: bitmaps of 1 and 0 bits combined with OR NOT]
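The AND, OR, and OR NOT slides can be sketched with plain Python integers standing in for bitmaps. The 8-bit width, the sample values, and the helper name `count` are assumptions made for illustration; reading "OR NOT" as `a | ~b` is also an assumption, since the slides show only the label.

```python
WIDTH = 8                       # illustrative bitmap width (assumption)
MASK = (1 << WIDTH) - 1         # keeps NOT results inside WIDTH bits

a = 0b11110000                  # sample bitmap (assumption)
b = 0b00111100                  # sample bitmap (assumption)

def count(bitmap):
    """Number of set bits, i.e. how many profiles are in the segment."""
    return bin(bitmap).count("1")

both = a & b                    # AND: profiles present in both bitmaps
either = a | b                  # OR: profiles present in at least one bitmap
a_or_not_b = a | (~b & MASK)    # OR NOT: a combined with the complement of b

for name, value in [("AND", both), ("OR", either), ("OR NOT", a_or_not_b)]:
    print(f"{name:7s} {value:08b} count={count(value)}")
```

Masking after `~b` matters because Python integers are unbounded, so an unmasked NOT would yield a negative number rather than a fixed-width bitmap.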
[Diagram: the id space from 0 to 2^64 divided into slices; labels: Slice, Fragment, Chunk, Block]
Slice: 65,536 bits
Chunk: 2048 bits (chunks 0 to 31)
Block: 64 bits (blocks 0 to 31)

profile id: 150,000 = 0010 0100 1001 1111 0000
÷ 2^16: quotient 0010 = 2, remainder 0100 1001 1111 0000
÷ 2^11: quotient 0 1001 = 9, remainder 001 1111 0000
÷ 2^6: quotient 0 0111 = 7, remainder 11 0000 = 48
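The successive divisions for profile id 150,000 can be sketched as three `divmod` calls. The function name `locate` and the constant names are hypothetical; the sizes (2^16, 2^11, 2^6) come from the slides.

```python
SLICE_BITS = 16   # 65,536 bits per slice
CHUNK_BITS = 11   # 2048 bits per chunk
BLOCK_BITS = 6    # 64 bits per block

def locate(profile_id):
    """Split a profile id into (slice, chunk, block, bit within block)."""
    slice_no, rest = divmod(profile_id, 1 << SLICE_BITS)
    chunk, rest = divmod(rest, 1 << CHUNK_BITS)
    block, bit = divmod(rest, 1 << BLOCK_BITS)
    return slice_no, chunk, block, bit

print(locate(150_000))   # (2, 9, 7, 48)
```

Because every divisor is a power of two, the same result could be obtained with shifts and masks; `divmod` is used here only because it mirrors the quotient and remainder steps on the slides.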
v 2.1
Cassandra Persistent Storage
CREATE TABLE bitmap (
    bitmap_id bigint,
    db varchar,
    frame varchar,
    slice int,
    chunkkey int,
    blockindex int,
    block bigint,
    PRIMARY KEY ((bitmap_id, db, frame, slice), chunkkey, blockindex)
)
PQL: set(88, d, 0, 150000)

cqlsh:pilosa> select * from bitmap;

 bitmap_id | db   | frame | slice | chunkkey | blockindex | block           | filter
-----------+------+-------+-------+----------+------------+-----------------+--------
        88 | test |     d |     2 |       -1 |          0 |               1 |      0
        88 | test |     d |     2 |        9 |          7 | 281474976710656 |      0

(2 rows)
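As a rough illustration of how a set call could end up as the second row above, the sketch below keeps an in-memory dictionary keyed like the table's primary key. The dictionary, the function name `set_bit`, and its argument order are assumptions; this is not Pilosa's storage code, and the row with chunkkey -1 is not modeled because the slides do not explain it.

```python
from collections import defaultdict

# Hypothetical stand-in for the bitmap table:
# (bitmap_id, db, frame, slice) -> {(chunkkey, blockindex): block}
table = defaultdict(dict)

def set_bit(bitmap_id, db, frame, profile_id):
    """Set one profile's bit and return the row that would be written."""
    slice_no, rest = divmod(profile_id, 1 << 16)   # 65,536 bits per slice
    chunkkey, rest = divmod(rest, 1 << 11)         # 2048 bits per chunk
    blockindex, bit = divmod(rest, 1 << 6)         # 64 bits per block
    partition = (bitmap_id, db, frame, slice_no)   # partition key columns
    clustering = (chunkkey, blockindex)            # clustering columns
    # OR the new bit into whatever the block already holds.
    block = table[partition].get(clustering, 0) | (1 << bit)
    table[partition][clustering] = block
    return partition, clustering, block

print(set_bit(88, "test", "d", 150_000))
# ((88, 'test', 'd', 2), (9, 7), 281474976710656)
```

The block value is 2^48 because the profile's bit lands at position 48 of block 7. A real writer would also need to handle bit 63, since a CQL bigint is a signed 64-bit value.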
Slice
Frame
Frame Types
• Default
• Time-based
• Top-n
Default
foo
set(88, foo, 150000)
[Diagram: bitmap 88 with profile 150000 set]
Time-based
bar.t
set(88, bar.t, 150000, 2014-09-18T19:00)
fn(88, 2014-09-18T19:00) => [88.y, 88.m, 88.d, 88.h]
bitmap_id: 64 bits
• id: 44 bits
• date/hour: 20 bits (60 years with hours, 2010-2070)
[Diagram: bitmaps 88.y, 88.m, 88.d, and 88.h, each with profile 150000 set]
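The 44-bit id plus 20-bit date/hour split can be checked with a small sketch. The slides do not give the actual bit layout or how years, months, and days are encoded, so everything here is an assumption: the id is placed in the high bits, the time field holds hours since 2010-01-01, and the names `pack` and `unpack` are hypothetical.

```python
from datetime import datetime, timedelta

ID_BITS = 44
TIME_BITS = 20
EPOCH = datetime(2010, 1, 1)    # assumed origin, from the 2010-2070 range

def pack(base_id, when):
    """Combine a base id and an hour-aligned timestamp into one 64-bit id."""
    hours = int((when - EPOCH).total_seconds() // 3600)
    if not 0 <= base_id < (1 << ID_BITS):
        raise ValueError("base id does not fit in 44 bits")
    if not 0 <= hours < (1 << TIME_BITS):
        raise ValueError("timestamp does not fit in 20 bits of hours")
    return (base_id << TIME_BITS) | hours

def unpack(bitmap_id):
    """Inverse of pack: recover the base id and the timestamp."""
    hours = bitmap_id & ((1 << TIME_BITS) - 1)
    return bitmap_id >> TIME_BITS, EPOCH + timedelta(hours=hours)

# Hours between 2010 and 2070, to confirm that 20 bits are enough.
HOURS_IN_RANGE = (datetime(2070, 1, 1) - EPOCH).days * 24

packed = pack(88, datetime(2014, 9, 18, 19))
print(unpack(packed), HOURS_IN_RANGE, 1 << TIME_BITS)
```

Sixty years of hours is 525,960 values, which fits comfortably in the 1,048,576 values that 20 bits provide.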
Time-based
bar.t
get(88, bar.t, 2014-08-01T00:00, 2014-09-04T03:00)
• 88.m.2014-08
• 88.d.2014-09-01
• 88.d.2014-09-02
• 88.d.2014-09-03
• 88.h.2014-09-04T00
• 88.h.2014-09-04T01
• 88.h.2014-09-04T02
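The way the get call expands into month, day, and hour bitmaps can be sketched as a greedy cover of the time range: at each step, take the largest calendar unit that starts at the current time and ends on or before the range end. The function names are hypothetical, the range is assumed to be hour-aligned with an exclusive end, and the `.y.` naming for whole years is an assumption (the slide shows only month, day, and hour names).

```python
from datetime import datetime, timedelta

def next_month(t):
    """First instant of the following month (t must be on day 1)."""
    if t.month == 12:
        return t.replace(year=t.year + 1, month=1)
    return t.replace(month=t.month + 1)

def cover(bitmap_id, start, end):
    """List the bitmaps that together cover [start, end)."""
    names = []
    t = start
    while t < end:
        at_midnight = t.hour == 0
        at_month_start = at_midnight and t.day == 1
        at_year_start = at_month_start and t.month == 1
        if at_year_start and t.replace(year=t.year + 1) <= end:
            names.append(f"{bitmap_id}.y.{t:%Y}")
            t = t.replace(year=t.year + 1)
        elif at_month_start and next_month(t) <= end:
            names.append(f"{bitmap_id}.m.{t:%Y-%m}")
            t = next_month(t)
        elif at_midnight and t + timedelta(days=1) <= end:
            names.append(f"{bitmap_id}.d.{t:%Y-%m-%d}")
            t += timedelta(days=1)
        else:
            names.append(f"{bitmap_id}.h.{t:%Y-%m-%dT%H}")
            t += timedelta(hours=1)
    return names

for name in cover(88, datetime(2014, 8, 1, 0), datetime(2014, 9, 4, 3)):
    print(name)
```

For the range on the slide this yields one month bitmap, three day bitmaps, and three hour bitmaps, so the query touches seven bitmaps instead of several hundred hourly ones.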
Top-n
brands.n
set(88, brands.n, 150000)
• only one fragment
• sorted by count
• configurable limit (50,000)
• compares count and loads from Cassandra
[Diagram: bitmap 88 with profile 150000 set]
cqlsh:pilosa> select * from bitmap;

 bitmap_id | db       | frame | slice | chunkkey | blockindex | block
-----------+----------+-------+-------+----------+------------+-----------------
        88 | brands.n |     d |     2 |       -1 |          0 |               1
        88 | brands.n |     d |     2 |        9 |          7 | 281474976710656

(2 rows)
Top-n
brands.n
top-n(get(222, foo), brands.n)
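One way to read the top-n query is: intersect a source bitmap with every bitmap in the frame and rank the results by bit count. The slides do not spell out how Pilosa evaluates it, so the sketch below is an assumption throughout, including the function names and the sample data; it ignores the count-sorted cache and the Cassandra loading mentioned on the earlier slide.

```python
import heapq

def popcount(bitmap):
    """Number of set bits in an integer bitmap."""
    return bin(bitmap).count("1")

def top_n(source, frame_bitmaps, n):
    """Rank a frame's bitmaps by how many bits they share with source."""
    scored = ((popcount(source & bitmap), bitmap_id)
              for bitmap_id, bitmap in frame_bitmaps.items())
    return [(bitmap_id, hits) for hits, bitmap_id in heapq.nlargest(n, scored)]

# Sample data (assumption): the source segment and three brand bitmaps.
source = 0b1111
brands = {1: 0b0011, 2: 0b1110, 3: 0b10000}

print(top_n(source, brands, 2))   # [(2, 3), (1, 2)]
```

Keeping the frame's bitmaps sorted by their own counts, as the slide describes, would let a real implementation stop early once no remaining bitmap could beat the current n-th best intersection.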
Thank You!
umbel.com/engineering