Webinar: Couchbase 104 - Views and Indexing

Upload: couchbase

Post on 31-May-2015

DESCRIPTION

Learn the architecture and use of Views, the structure of Map-Reduce functions, design documents, querying views and view query parameters, primary aggregate reduces and grouping, eventual consistency of indexes and strategies of use.

What will be covered during this training:
• What are Indexes
• What is Map-Reduce
• Understanding Design Documents
• Admin Console Overview
• Anatomy of Map Functions
• Batch Processing
• Range Querying, Index-Key Querying, Set Querying
• RDBMS Queries vs. Map-Reduce Queries
• Grouping and Group Level
• Eventual Consistency and Stale Parameter
• Tips for Creating Views and Sandboxing Tests

TRANSCRIPT

Page 1: Webinar   Couchbase 104 - Views and Indexing
Couchbase 104: Views and Indexing

Jasdeep Jaitla, Technical Evangelist
twitter: @scalabl3
email: [email protected]

WHAT  IS  A  VIEW?

Views are Indexes

• Indexes are methodologies to speed up access to information
• Examples:
  - Dewey Decimal System
  - Card Catalogs
  - Hierarchical File Folders

• In databases, Indexes are specialized structures for searching for data, typically on one or two key fields

Indexing Subsystem

• Storing data and Indexing data are separate systems in all databases

• In explicit schema scenarios (RDBMS), Indexes are optimized based on the data type(s)

• In flexible schema scenarios Map-Reduce is used to create indexes

What is Map-Reduce?

• Map-Reduce is a technique designed for dealing with Big Data and processing in parallel in distributed systems

• Map-Reduce is also specifically designed for dealing with unstructured or semi-structured data

• Map functions identify data within collections, process it, and output transformed values

• Reduce functions take the output of Map functions and perform numeric aggregate calculations on them
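
As a rough illustration of the two phases outside any database, the sketch below runs a map step and a reduce step over an in-memory array. The documents and the helper names (`mapDoc`, `sumReduce`) are made up for this example and are not part of Couchbase.

```javascript
// Hypothetical in-memory documents; not Couchbase data.
const docs = [
  { type: "beer", name: "Pale Ale", abv: 5.0 },
  { type: "brewery", name: "Example Brewing" },
  { type: "beer", name: "Stout", abv: 7.5 },
];

// Map: pick the documents of interest and output [key, value] rows.
function mapDoc(doc) {
  return doc.type === "beer" ? [[doc.name, doc.abv]] : [];
}

// Reduce: numeric aggregate over the mapped values.
function sumReduce(values) {
  return values.reduce((total, v) => total + v, 0);
}

const rows = docs.flatMap(mapDoc);
const totalAbv = sumReduce(rows.map(([, value]) => value));
console.log(rows.length, totalAbv); // 2 12.5
```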

Views: Map-Reduce Indexes

• In Couchbase, Map-Reduce is specifically used to create Indexes

• Map functions are applied to JSON documents and they output or "emit" data that is organized in an Index form

[Diagram: CRUD Operations, MAP(), emit(), (processed)]

Sample View

• Creates an Index of Beer Names (doc.name) and the Alcohol By Volume values (doc.abv)

- Filters Documents
  • Only JSON Documents with json key doc.type == "beer"
  • and doc.brewery_id is non-null
  • and doc.name is non-null

- Outputs
  • Beer Name (doc.name) [searchable]
  • Beer Alcohol By Volume (doc.abv) [row value]

function (doc, meta) {
  if (doc.type == "beer" && doc.brewery_id && doc.name) {
    emit(doc.name, doc.abv);
  }
}
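
To see the filtering in action without a server, the sample map function can be run locally against a stand-in `emit()`. This is only a sketch: the `emit` stub, the `index` array, and the three documents are assumptions made for illustration, since the real `emit()` is supplied by the view engine.

```javascript
// Stand-in for the emit() that the view engine provides to map functions.
const index = [];
function emit(key, value) {
  index.push({ key, value });
}

// The map function from the slide, with plain quotes.
function map(doc, meta) {
  if (doc.type == "beer" && doc.brewery_id && doc.name) {
    emit(doc.name, doc.abv);
  }
}

// Hypothetical documents for illustration.
map({ type: "beer", brewery_id: "b1", name: "Stout", abv: 7.5 }, { id: "beer_1" });
map({ type: "beer", name: "No Brewery Ale", abv: 4.0 }, { id: "beer_2" }); // filtered: no brewery_id
map({ type: "brewery", name: "Example Brewing" }, { id: "brewery_1" });    // filtered: wrong type

console.log(index); // one row: key "Stout", value 7.5
```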

ARCHITECTURE

Storage to Index

[Diagram: Application Server, storage ops, Couchbase Server (EP Engine, RAM Cache, Disk Write Queue, Replication Queue, View Engine, Indexers), Replica Couchbase Cluster Machine]

Views: Eventual Consistency

[Diagram: Application Server, storage ops, get, Couchbase Server (EP Engine, RAM Cache, Disk Write Queue, Replication Queue, View Engine, Indexers), Replica Couchbase Cluster Machine, Time 1, Time 2]

Why Use Map-Reduce Indexes?

• Index (Find) Documents by different JSON Values
• Query Documents by JSON Values
• Create Statistics and Aggregates

When are Indexes Necessary?

• Documents are Keyed by Random Properties (UUID, GUID, etc.)
• Iterating through Lists of Documents with Random Keys
• Iterating through Lists of Documents on different JSON Properties (e.g. all User docs, all Product docs, by Timestamp, etc.)

ANATOMY  OF  A  VIEW

Buckets >> Design Documents >> Views

[Diagram: Couchbase Bucket containing Design Document 1 and Design Document 2, each containing Views]

• Indexers Are Allocated Per Design Doc
• Views in a Design Document Are All Updated at Same Time
• Can Only Access Data in the Bucket Namespace
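
As a sketch of how views sit inside a design document, the object below follows the JSON shape Couchbase uses for design documents: a `views` object with one entry per view, each holding a `map` function as a string and optionally a `reduce`. The view names `by_name` and `by_brewery` and their map functions are hypothetical.

```javascript
// Hypothetical design document with two views; names are illustrative.
const designDoc = {
  views: {
    by_name: {
      map: 'function (doc, meta) { if (doc.type == "beer" && doc.name) { emit(doc.name, doc.abv); } }',
    },
    by_brewery: {
      map: 'function (doc, meta) { if (doc.type == "beer" && doc.brewery_id) { emit(doc.brewery_id, doc.abv); } }',
      reduce: "_stats",
    },
  },
};

// Map functions are stored as strings inside the JSON body.
const body = JSON.stringify(designDoc, null, 2);
console.log(Object.keys(designDoc.views)); // [ 'by_name', 'by_brewery' ]
```

Because both views live in one design document, they would be updated together, which is the reason the deck suggests separate design documents for experiments.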

Map() Function => Index

function(doc, meta) {
  emit(doc.username, doc.email)
}

• Every Document passes through View Map() functions
• doc: json doc
• meta: doc metadata
• emit(): create row
• doc.username: indexed key
• doc.email: output value(s)

Single Element Keys (Text Key)

function(doc, meta) {
  emit(doc.email, doc.points)
}

• doc.email: text key

meta.id   doc.email           doc.points
u::1      [email protected]   1000
u::35     [email protected]   1200
u::20     [email protected]   900

Compound Keys (Array)

function(doc, meta) {
  emit(dateToArray(doc.timestamp), 1)
}

• dateToArray(doc.timestamp): array key

Array Based Index Keys get sorted as Strings, but can be grouped by array elements

meta.id   dateToArray(doc.timestamp)   value
u::20     [2012,10,9,18,45]            1
u::1      [2012,9,26,11,15]            1
u::35     [2012,8,13,2,12]             1
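
The grouping by array elements can be sketched locally. In the example below, `toDateArray` is a stand-in written for this illustration (it is not Couchbase's `dateToArray`, and it returns five elements in UTC only to match the keys in the table), and `countByLevel` imitates what a count-style reduce grouped on the first N key elements would return. The timestamps are chosen to reproduce the table's keys.

```javascript
// Stand-in for dateToArray(); returns [year, month, day, hour, minute] in UTC
// to match the shape of the keys in the table above.
function toDateArray(isoString) {
  const d = new Date(isoString);
  return [d.getUTCFullYear(), d.getUTCMonth() + 1, d.getUTCDate(), d.getUTCHours(), d.getUTCMinutes()];
}

// Group rows by the first `level` elements of the array key and add up the values.
function countByLevel(rows, level) {
  const counts = new Map();
  for (const { key, value } of rows) {
    const prefix = JSON.stringify(key.slice(0, level));
    counts.set(prefix, (counts.get(prefix) || 0) + value);
  }
  return counts;
}

const rows = [
  "2012-10-09T18:45:00Z",
  "2012-09-26T11:15:00Z",
  "2012-08-13T02:12:00Z",
].map((ts) => ({ key: toDateArray(ts), value: 1 }));

console.log(countByLevel(rows, 1)); // one group, [2012], with count 3
console.log(countByLevel(rows, 2)); // three groups, one per month
```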

QUERYING  VIEWS

View Query Parameters

• key = "" - used for exact match of index-key
• keys = [] - used for matching set of index-keys
• startkey/endkey = "" - used for range queries on index-keys
• startkey_docid/endkey_docid = "" - used for range queries on meta.id
• stale = [false, update_after, ok] - used to decide indexer behavior from client
• group/group_level - used with reduces to aggregate with grouping
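
A minimal sketch of turning these parameters into a view query URL follows. It assumes the view REST endpoint shape `/<bucket>/_design/<design doc>/_view/<view>` on port 8092; the host, bucket, design document, and view names are placeholders, and `buildViewUrl` is a helper invented for this example. Key-type parameters are JSON-encoded so that string keys keep their quotes.

```javascript
// Parameters whose values must be JSON-encoded (strings keep their quotes).
const JSON_PARAMS = new Set(["key", "keys", "startkey", "endkey"]);

// Build a view query URL. Host, bucket, design document and view names
// passed in below are placeholders.
function buildViewUrl(base, bucket, designDoc, view, params) {
  const query = Object.entries(params)
    .map(([name, value]) => {
      const raw = JSON_PARAMS.has(name) ? JSON.stringify(value) : String(value);
      return `${name}=${encodeURIComponent(raw)}`;
    })
    .join("&");
  return `${base}/${bucket}/_design/${designDoc}/_view/${view}?${query}`;
}

const url = buildViewUrl("http://localhost:8092", "beer-sample", "beer", "by_name", {
  startkey: "a",
  endkey: "b",
  stale: "false",
});
console.log(url);
```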

Most Common Queries Are Ranges

doc.email           meta.id
[email protected]   u::1
[email protected]   u::7
[email protected]   u::2
[email protected]   u::5
[email protected]   u::6
[email protected]   u::4
[email protected]   u::3

?startkey="b1"&endkey="zZ"
?startkey="bz"&endkey="zn"

Pulls the Index-Keys between the UTF-8 Range specified by the startkey and endkey.
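
A range query can be pictured as a slice of the sorted index. The sketch below uses made-up email keys and a plain string comparison, which only approximates the server's collation for lowercase ASCII keys; `rangeQuery` is a hypothetical helper, and it treats both ends of the range as inclusive.

```javascript
// Hypothetical index rows (key = email, id = document key), kept sorted by key.
const index = [
  { key: "alice@example.com", id: "u::1" },
  { key: "bob@example.com", id: "u::7" },
  { key: "carol@example.com", id: "u::2" },
  { key: "zed@example.com", id: "u::5" },
];

// Return rows with startkey <= key <= endkey. Plain string comparison is used
// here, which only approximates the server's collation for lowercase ASCII keys.
function rangeQuery(rows, startkey, endkey) {
  return rows.filter((row) => row.key >= startkey && row.key <= endkey);
}

console.log(rangeQuery(index, "b", "d").map((row) => row.id)); // [ 'u::7', 'u::2' ]
```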

Index-Key Matching

doc.email           meta.id
[email protected]   u::1
[email protected]   u::7
[email protected]   u::2
[email protected]   u::5
[email protected]   u::6
[email protected]   u::4
[email protected]   u::3

?key="[email protected]"

Match a Single Index-Key

Index-Key Set Matches

?keys=["[email protected]", "[email protected]"]

Query Multiple in the Set (Array Notation)
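
The difference between `key` and `keys` can be sketched as two lookups over the same index. The rows and the helper names `queryKey` and `queryKeys` are assumptions for illustration; the result order here simply follows the order of the local array.

```javascript
// Hypothetical index rows keyed by email.
const index = [
  { key: "alice@example.com", id: "u::1" },
  { key: "bob@example.com", id: "u::7" },
  { key: "carol@example.com", id: "u::2" },
];

// key: exact match on one index-key.
function queryKey(rows, key) {
  return rows.filter((row) => row.key === key);
}

// keys: match any index-key in the given set.
function queryKeys(rows, keys) {
  const wanted = new Set(keys);
  return rows.filter((row) => wanted.has(row.key));
}

console.log(queryKey(index, "bob@example.com").map((row) => row.id)); // [ 'u::7' ]
console.log(queryKeys(index, ["alice@example.com", "carol@example.com"]).map((row) => row.id)); // [ 'u::1', 'u::2' ]
```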

Understanding Collation Order

Unicode Collation: 1234567890 < aAbBcCdDeEfFgGhHiIjJkKlLmM...
                   a < á < A < Á < b

Byte Order: 1234567890 < A-Z < a-z

With Unicode Collation gets both y and Y:
startkey="y"&endkey="z"

If it were Byte Order, 2 Queries Merged:
startkey="y"&endkey="z" merged with startkey="Y"&endkey="Z"
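
The effect on a range query can be sketched with two comparators. `simpleCollation` below is a deliberately simplified stand-in for Unicode collation that only handles ASCII letter case (case-insensitive first, lowercase before uppercase on ties); it is not the server's implementation. The keys are invented for the example.

```javascript
// Byte-order style comparison: for ASCII, "Y" (0x59) sorts before "y" (0x79).
function byteOrder(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Simplified stand-in for Unicode collation on ASCII letters: compare
// case-insensitively first, then put lowercase before uppercase on ties.
function simpleCollation(a, b) {
  const la = a.toLowerCase();
  const lb = b.toLowerCase();
  if (la !== lb) return la < lb ? -1 : 1;
  return a === b ? 0 : a > b ? -1 : 1; // lowercase has the higher code unit
}

// Keep keys with startkey <= key <= endkey under the given comparator.
function range(keys, compare, startkey, endkey) {
  return keys.filter((k) => compare(k, startkey) >= 0 && compare(k, endkey) <= 0).sort(compare);
}

const keys = ["Yak", "yam", "Yeti", "yew", "zoo"];
console.log(range(keys, simpleCollation, "y", "z")); // [ 'Yak', 'yam', 'Yeti', 'yew' ]
console.log(range(keys, byteOrder, "y", "z"));       // [ 'yam', 'yew' ]
```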

Understanding Stale

stale = UPDATE_AFTER (default if nothing is specified)
• always get fastest response
• can take two queries to read your own writes

stale = OK
• auto update will trigger eventually
• might not see your own writes for a few minutes
• least frequent updates -> least resource impact

stale = FALSE
• Use with Persistence observe if data needs to be included in view results
• BUT be aware of the delay it adds, only use when really required
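
The three settings can be compared with a toy model. `ToyView` is invented for this sketch and leaves out everything that makes the real system interesting (persistence to disk, background index updates, multiple nodes); it only shows when the index is refreshed relative to answering the query.

```javascript
// Toy model: `stored` holds written documents, `indexed` is what the view has seen.
class ToyView {
  constructor() {
    this.stored = [];
    this.indexed = [];
  }
  write(doc) {
    this.stored.push(doc); // the index is NOT updated at write time
  }
  updateIndex() {
    this.indexed = [...this.stored];
  }
  query(stale) {
    if (stale === "false") this.updateIndex();        // update first, then answer
    const result = [...this.indexed];
    if (stale === "update_after") this.updateIndex(); // answer first, then update
    return result;                                    // "ok": answer, no update triggered here
  }
}

const view = new ToyView();
view.write({ id: "u::1" });
console.log(view.query("update_after").length); // 0: first query misses the write
console.log(view.query("update_after").length); // 1: second query sees it
view.write({ id: "u::2" });
console.log(view.query("ok").length);           // 1: still the old index
console.log(view.query("false").length);        // 2: index updated before answering
```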

Built-In Reduces

• Are faster than creating your own reduces for the same information

- _count
  • gives count for number of items in Index
- _sum
  • sums value parameters (for numeric values only)
- _stats
  • gives sum, count, min, max and sum of squares for statistics
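
For reference, the quantities the built-in reduces compute can be written out in plain JavaScript. These helpers are local illustrations, not the server's code, and the input values are hypothetical; the field names in `stats` follow the sum, count, min, max, and sum-of-squares list above, with `sumsqr` assumed as the name of the last field.

```javascript
// Plain JavaScript equivalents of what the built-in reduces compute.
const count = (values) => values.length;
const sum = (values) => values.reduce((total, v) => total + v, 0);
function stats(values) {
  return {
    sum: sum(values),
    count: count(values),
    min: Math.min(...values),
    max: Math.max(...values),
    sumsqr: sum(values.map((v) => v * v)),
  };
}

const abvValues = [4, 5, 8]; // hypothetical row values
console.log(stats(abvValues)); // { sum: 17, count: 3, min: 4, max: 8, sumsqr: 105 }
```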

Custom Reduces

• Are a bit tricky at first, it's a skill!
• Learn about them through our docs and practice first. The most common problem in custom reduces is that they don't "reduce" the data
• Can be creatively used!
• Always do it in a separate Design Document to sandbox it from your existing Views. If you have a logic problem or error, it won't interrupt existing Views
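
One common shape for a custom reduce is a function taking `(keys, values, rereduce)`. The sketch below counts rows and handles the rereduce pass, where the inputs are earlier reduce outputs rather than raw row values. The way the calls are chained at the bottom is a simplified simulation written for this example, not how the view engine schedules them.

```javascript
// Custom reduce that counts rows. When rereduce is true, `values` holds the
// outputs of earlier reduce calls, so they are summed instead of counted.
function countReduce(keys, values, rereduce) {
  if (rereduce) {
    return values.reduce((total, partial) => total + partial, 0);
  }
  return values.length;
}

// Simulate reducing two batches and then combining the partial results.
const partialA = countReduce(["a", "b"], [1.0, 2.0], false);
const partialB = countReduce(["c"], [3.0], false);
const total = countReduce(null, [partialA, partialB], true);
console.log(total); // 3
```

The output is a single number no matter how many rows go in, which is what it means for a reduce to actually "reduce" the data.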

BEER  SAMPLE  VIEW

Beer Sample Database Example

doc:

{
  "name": "Aventinus Weizenstarkbier / Doppel Weizen Bock",
  "abv": 8.2,
  "ibu": 0,
  "srm": 0,
  "upc": 0,
  "type": "beer",
  "brewery_id": "110f1f2012",
  "updated": "2010-07-22 20:00:20",
  "description": "Dark-ruby, almost black-colored and streaked with fine top-fermenting yeast, this beer has a compact and persistent head. This is a very intense wheat doppelbock with a complex spicy chocolate-like arome with a hint of banana and raisins. On the palate, you experience a soft touch and on the tongue it is very rich and complex, though fresh with a hint of caramel. It finishes in a rich soft and lightly bitter impression.",
  "style": "South German-Style Weizenbock",
  "category": "German Ale"
}

meta:

{
  "id": "110f37fa30",
  "rev": "1-000000000",
  "expiration": 0,
  "flags": 0,
  "type": "json"
}

• doc.abv: alcohol by volume (abv)
• doc.brewery_id: brewery_id (key)
• meta.id: document key

Map Function - Index Definition

[Image: Map function, annotated with +row, indexed key, value(s)]

Result Set - Brewery IDs by Beer

[Image: result set, annotated with brewery_id, document key (of the beer), alcohol by volume (abv)]

Reduce Values (doc.abv) with _stats

[Image: view definition, annotated with "add _stats built-in reduction"]

Query with Group and Reduce

Find average alcohol by volume per brewery.

• add _stats built-in reduction
• set group=true & reduce=true

Groups Brewery_IDs, Reduces for Stats

Brewery IDs are Grouped, and _stats collected (Reduced)

[Image: result set for group=true & reduce=true, annotated with number of beers by this brewery, max abv, min abv]
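
The grouped result can be approximated locally to show where the average comes from. The rows below are hypothetical and stand for what a map function emitting (doc.brewery_id, doc.abv) would produce; `groupStats` is a helper written for this sketch that collects _stats-style aggregates per key. The average itself is derived afterwards as sum divided by count.

```javascript
// Hypothetical rows as a map function emitting (doc.brewery_id, doc.abv) would produce.
const rows = [
  { key: "brewery_a", value: 5 },
  { key: "brewery_a", value: 7 },
  { key: "brewery_b", value: 4 },
];

// Group by key (like group=true) and collect _stats-style aggregates per group.
function groupStats(rows) {
  const groups = new Map();
  for (const { key, value } of rows) {
    const s = groups.get(key) || { sum: 0, count: 0, min: Infinity, max: -Infinity, sumsqr: 0 };
    s.sum += value;
    s.count += 1;
    s.min = Math.min(s.min, value);
    s.max = Math.max(s.max, value);
    s.sumsqr += value * value;
    groups.set(key, s);
  }
  return groups;
}

// The average is not part of the stats; derive it on the client as sum / count.
for (const [breweryId, s] of groupStats(rows)) {
  console.log(breweryId, s.sum / s.count); // brewery_a 6, then brewery_b 4
}
```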

INTERFACE  DEMO

Q  &  A