Elasticsearch for SQL Users

Upload: all-things-open

Post on 16-Feb-2017

Page 1: Elasticsearch for SQL Users

Shaunak Kashyap Developer at Elastic @shaunak

Elasticsearch for SQL users

Page 2: Elasticsearch for SQL Users

The Elastic Stack

• Store, Index & Analyze

• Ingest

• User Interface

• Plugins

• Hosted Service

Page 3: Elasticsearch for SQL Users

Agenda

1. Search queries

2. Data modeling

3. Architecture

Page 4: Elasticsearch for SQL Users

Agenda

1. Search queries

2. Data modeling

3. Architecture

Page 5: Elasticsearch for SQL Users

Agenda

1. Search queries

2. Data modeling

3. Architecture

Page 6: Elasticsearch for SQL Users

Search Queries

https://www.flickr.com/photos/samhames/4422128094

Page 7: Elasticsearch for SQL Users

Schemas

CREATE TABLE IF NOT EXISTS emails (
  sender VARCHAR(255) NOT NULL,
  recipients TEXT,
  cc TEXT,
  bcc TEXT,
  subject VARCHAR(1024),
  body MEDIUMTEXT,
  datetime DATETIME
);

CREATE INDEX emails_sender ON emails(sender);
CREATE FULLTEXT INDEX emails_subject ON emails(subject);
CREATE FULLTEXT INDEX emails_body ON emails(body);

curl -XPUT 'http://localhost:9200/enron' -d'
{
  "mappings": {
    "email": {
      "properties": {
        "sender": { "type": "keyword" },
        "recipients": { "type": "keyword" },
        "cc": { "type": "keyword" },
        "bcc": { "type": "keyword" },
        "subject": { "type": "text", "analyzer": "english" },
        "datetime": { "type": "date" }
      }
    }
  }
}'

Page 8: Elasticsearch for SQL Users

Loading the data

Page 9: Elasticsearch for SQL Users

[LIVE DEMO]

• Search for text in a single field

• Search for text in multiple fields

• Search for a phrase

https://github.com/ycombinator/es-enron
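The three demo searches can be sketched as Elasticsearch query bodies. This is a minimal sketch and not the demo's actual code: the helper names and the search strings are hypothetical, and the field names `subject` and `body` are taken from the schema slide. The query types used (`match`, `multi_match`, `match_phrase`) are standard Elasticsearch full-text queries.

```python
import json


def single_field_query(field, text):
    """Search for text in a single field (a `match` query)."""
    return {"query": {"match": {field: text}}}


def multi_field_query(fields, text):
    """Search for text in multiple fields (a `multi_match` query)."""
    return {"query": {"multi_match": {"query": text, "fields": list(fields)}}}


def phrase_query(field, phrase):
    """Search for the words of a phrase in order (a `match_phrase` query)."""
    return {"query": {"match_phrase": {field: phrase}}}


if __name__ == "__main__":
    # Hypothetical search strings; each body would be sent to POST enron/_search.
    for body in (
        single_field_query("subject", "meeting"),
        multi_field_query(["subject", "body"], "meeting"),
        phrase_query("body", "conference call"),
    ):
        print(json.dumps(body))
```

Each function only builds the JSON body; sending it to the cluster is left to whichever HTTP client or Elasticsearch client library is in use.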

Page 10: Elasticsearch for SQL Users

Other Search Features

• Stemming: Jump, jumped, jumping

• Synonyms: Queen, monarch

• Did you mean?: Monetery => Monetary

Page 11: Elasticsearch for SQL Users

Data Modeling

https://www.flickr.com/photos/samhames/4422128094

https://www.flickr.com/photos/ericparker/7854157310

Page 12: Elasticsearch for SQL Users

To analyze (text) or not to analyze (keyword)?

PUT cities/city/1
{ "city": "Raleigh", "population": 431746 }

PUT cities/city/2
{ "city": "New Albany", "population": 8829 }

PUT cities/city/3
{ "city": "New York", "population": 8406000 }

QUERY

POST cities/_search
{ "query": { "match": { "city": "New Albany" } } }

Documents + query = ?

Page 13: Elasticsearch for SQL Users

To analyze (text) or not to analyze (keyword)?

PUT cities/city/1
{ "city": "Raleigh", "population": 431746 }

PUT cities/city/2
{ "city": "New Albany", "population": 8829 }

PUT cities/city/3
{ "city": "New York", "population": 8406000 }

Term      Document IDs
albany    2
new       2, 3
raleigh   1
york      3
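The term table above can be reproduced with a toy inverted index. This is a simplified sketch, not how Elasticsearch is implemented: the `analyze` function (lowercase, then split on whitespace) is an assumed stand-in for the real analyzer, and all function names are hypothetical. It also shows why a `match` query for "New Albany" on an analyzed field finds New York too: by default, `match` returns documents containing any of the query terms.

```python
from collections import defaultdict

# The three city values from the slide, keyed by document ID.
DOCS = {1: "Raleigh", 2: "New Albany", 3: "New York"}


def analyze(text):
    """Toy analyzer (assumption): lowercase, then split on whitespace."""
    return text.lower().split()


def build_index(docs, analyzer):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        for term in analyzer(value):
            index[term].add(doc_id)
    return index


def match(index, analyzer, text):
    """Return IDs of documents containing ANY analyzed query term."""
    hits = set()
    for term in analyzer(text):
        hits |= index.get(term, set())
    return sorted(hits)


text_index = build_index(DOCS, analyze)

if __name__ == "__main__":
    for term in sorted(text_index):
        print(term, sorted(text_index[term]))
    print(match(text_index, analyze, "New Albany"))  # [2, 3]
```

The sketch ignores relevance scoring; in Elasticsearch both documents come back, with the one matching more terms ranked higher.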

Page 14: Elasticsearch for SQL Users

To analyze (text) or not to analyze (keyword)?

MAPPING

PUT cities
{
  "mappings": {
    "city": {
      "properties": {
        "city": { "type": "keyword" }
      }
    }
  }
}

Term         Document IDs
New Albany   2
New York     3
Raleigh      1
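With a keyword mapping the whole field value is stored as a single term, so lookups are exact. A minimal sketch of that behaviour, using a plain dictionary as a stand-in for the index (the function names are hypothetical):

```python
# The three city values from the slide, keyed by document ID.
DOCS = {1: "Raleigh", 2: "New Albany", 3: "New York"}


def build_keyword_index(docs):
    """Index each value as one unmodified term (no lowercasing, no splitting)."""
    index = {}
    for doc_id, value in docs.items():
        index.setdefault(value, []).append(doc_id)
    return index


def exact_lookup(index, value):
    """Return IDs of documents whose whole value equals the query string."""
    return index.get(value, [])


keyword_index = build_keyword_index(DOCS)

if __name__ == "__main__":
    print(exact_lookup(keyword_index, "New Albany"))  # [2]
    print(exact_lookup(keyword_index, "New"))         # []
    print(exact_lookup(keyword_index, "new albany"))  # []
```

The trade-off is visible in the last two lookups: partial and differently-cased input no longer matches, which suits identifiers and exact filters better than free-text search.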

Page 15: Elasticsearch for SQL Users

Relationships: Application-side joins

PUT blog/author/1
{ "name": "John Doe", "bio": "..." }

PUT blog/post/1
{ "author_id": 1, "title": "...", "body": "..." }

PUT blog/post/2
{ "author_id": 1, "title": "...", "body": "..." }

PUT blog/post/3
{ "author_id": 1, "title": "...", "body": "..." }

QUERY 1

POST blog/author/_search
{ "query": { "match": { "name": "John" } } }

QUERY 2

POST blog/post/_search
{ "query": { "match": { "author_id": <each id from query 1 result> } } }
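The two-query flow can be sketched in application code. This is an illustration only: `fake_search` is a hypothetical in-memory stand-in for the two `_search` calls, and the second author and fourth post are made-up data added so the join has something to filter out.

```python
# In-memory stand-ins for the blog/author and blog/post documents.
AUTHORS = {
    "1": {"name": "John Doe", "bio": "..."},
    "2": {"name": "Jane Roe", "bio": "..."},  # hypothetical extra author
}
POSTS = {
    "1": {"author_id": 1, "title": "...", "body": "..."},
    "2": {"author_id": 1, "title": "...", "body": "..."},
    "3": {"author_id": 1, "title": "...", "body": "..."},
    "4": {"author_id": 2, "title": "...", "body": "..."},  # hypothetical extra post
}


def fake_search(doc_type, field, value):
    """Hypothetical stand-in for POST blog/<doc_type>/_search with a match query."""
    store = AUTHORS if doc_type == "author" else POSTS
    hits = []
    for doc_id, source in store.items():
        field_value = source[field]
        if isinstance(field_value, str):
            # Text fields: match if the word appears in the value.
            matched = str(value).lower() in field_value.lower().split()
        else:
            # Numeric fields: match on equality.
            matched = field_value == value
        if matched:
            hits.append({"_id": doc_id, "_source": source})
    return hits


def posts_by_author_name(search, name):
    """Join in the application: find authors first, then their posts."""
    authors = search("author", "name", name)       # query 1
    posts = []
    for author in authors:                         # query 2, once per author ID
        posts.extend(search("post", "author_id", int(author["_id"])))
    return posts


if __name__ == "__main__":
    print([hit["_id"] for hit in posts_by_author_name(fake_search, "John")])  # ['1', '2', '3']
```

The cost of this approach is one extra round trip per matching author, which is why it tends to work best when the first query returns few results.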

Page 16: Elasticsearch for SQL Users

Relationships: Data denormalization

PUT blog/post/1
{ "author_name": "John Doe", "title": "...", "body": "..." }

PUT blog/post/2
{ "author_name": "John Doe", "title": "...", "body": "..." }

PUT blog/post/3
{ "author_name": "John Doe", "title": "...", "body": "..." }

QUERY

POST blog/post/_search
{ "query": { "match": { "author_name": "John" } } }
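Denormalization trades a single query at search time for extra work at write time. The sketch below assumes the posts start out with an `author_id`, as on the application-side joins slide; the function names and the dictionary standing in for the index are hypothetical. It shows the author's name being copied into each post, and what a rename then costs.

```python
AUTHORS = {1: {"name": "John Doe", "bio": "..."}}
POSTS = {
    1: {"author_id": 1, "title": "...", "body": "..."},
    2: {"author_id": 1, "title": "...", "body": "..."},
    3: {"author_id": 1, "title": "...", "body": "..."},
}


def denormalize(post, authors):
    """Return a copy of the post with the author's name embedded."""
    doc = {key: value for key, value in post.items() if key != "author_id"}
    doc["author_name"] = authors[post["author_id"]]["name"]
    return doc


def rename_author(author_id, new_name, authors, posts, index):
    """Renaming an author means re-indexing every post that embeds the name."""
    authors[author_id]["name"] = new_name
    updated = 0
    for post_id, post in posts.items():
        if post["author_id"] == author_id:
            index[post_id] = denormalize(post, authors)
            updated += 1
    return updated


# Dictionary standing in for the blog/post documents in Elasticsearch.
index = {post_id: denormalize(post, AUTHORS) for post_id, post in POSTS.items()}

if __name__ == "__main__":
    print(index[1]["author_name"])
    print(rename_author(1, "John Q. Doe", AUTHORS, POSTS, index))
```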

Page 17: Elasticsearch for SQL Users

Relationships: Nested objects

PUT blog/author/1
{
  "name": "John Doe",
  "bio": "...",
  "blog_posts": [
    { "title": "...", "body": "..." },
    { "title": "...", "body": "..." },
    { "title": "...", "body": "..." }
  ]
}

QUERY

POST blog/author/_search
{ "query": { "match": { "name": "John" } } }

Page 18: Elasticsearch for SQL Users

Relationships: Parent-child documents

PUT blog
{
  "mappings": {
    "author": {},
    "post": { "_parent": { "type": "author" } }
  }
}

PUT blog/author/1
{ "name": "John Doe", "bio": "..." }

PUT blog/post/1?parent=1
{ "title": "...", "body": "..." }

PUT blog/post/2?parent=1
{ "title": "...", "body": "..." }

PUT blog/post/3?parent=1
{ "title": "...", "body": "..." }

QUERY

POST blog/post/_search
{
  "query": {
    "has_parent": {
      "parent_type": "author",
      "query": { "match": { "name": "John" } }
    }
  }
}

Page 19: Elasticsearch for SQL Users

Architecture

https://www.flickr.com/photos/samhames/4422128094

https://www.flickr.com/photos/haribote/4871284379/

Page 20: Elasticsearch for SQL Users

RDBMS Triggers

[Diagram: steps 1, 2]

database by Creative Stall from the Noun Project

Page 21: Elasticsearch for SQL Users

Async replication to Elasticsearch

[Diagram: steps 1, 2, 3, with an ESSynchronizer component]

flow by Yamini Ahluwalia from the Noun Project
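The slide does not say how the ESSynchronizer detects changes. One common approach, assumed here, is to poll the database for rows modified since the last checkpoint. The sketch below uses an in-memory SQLite table and a dictionary as a stand-in for the Elasticsearch index; the `id` and `updated_at` columns and the function name `sync_once` are hypothetical.

```python
import sqlite3


def sync_once(conn, index, last_seen):
    """Copy rows changed since `last_seen` into `index`; return the new checkpoint."""
    rows = conn.execute(
        "SELECT id, sender, subject, updated_at FROM emails "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    for row_id, sender, subject, updated_at in rows:
        # Stand-in for indexing the document into Elasticsearch.
        index[row_id] = {"sender": sender, "subject": subject}
        last_seen = updated_at
    return last_seen


# Set up a small example table (assumed schema, not from the slides).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emails ("
    "id INTEGER PRIMARY KEY, sender TEXT, subject TEXT, updated_at INTEGER)"
)
conn.execute("INSERT INTO emails VALUES (1, 'a@example.com', 'hello', 10)")

index = {}
checkpoint = sync_once(conn, index, 0)  # first pass picks up row 1

# Later writes to the database are picked up by the next pass.
conn.execute("INSERT INTO emails VALUES (2, 'b@example.com', 'again', 20)")
conn.execute("UPDATE emails SET subject = 'hello!', updated_at = 21 WHERE id = 1")
checkpoint = sync_once(conn, index, checkpoint)
```

Because the copy happens after the database write, the search index lags the database by up to one polling interval. Deleted rows are not handled in this sketch.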

Page 22: Elasticsearch for SQL Users

Async replication to Elasticsearch with Logstash

[Diagram: steps 1, 2, 3]

Page 23: Elasticsearch for SQL Users

Forked writes from application

[Diagram: steps 1, 2]

Page 24: Elasticsearch for SQL Users

Forked writes from application (more robust)

[Diagram: steps 1, 2, 3, 4, with a queue and an ESSynchronizer component]

queue by Huu Nguyen from the Noun Project
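One plausible reading of this diagram, assumed here because the slide text does not spell out the steps, is that the application writes to the database (1) and puts a change notice on a queue (2), and the ESSynchronizer later consumes the queue (3) and indexes the document into Elasticsearch (4). The sketch uses dictionaries as stand-ins for the database and the index; all names are hypothetical.

```python
import queue

database = {}           # stand-in for the RDBMS
search_index = {}       # stand-in for Elasticsearch
changes = queue.Queue()  # stand-in for the message queue


def save_post(post_id, doc):
    """Application write path."""
    database[post_id] = doc   # step 1: write to the primary store
    changes.put(post_id)      # step 2: enqueue a change notice


def drain(changes, database, search_index):
    """ESSynchronizer stand-in: consume queued changes and index them."""
    processed = 0
    while True:
        try:
            post_id = changes.get_nowait()                # step 3: read from the queue
        except queue.Empty:
            return processed
        search_index[post_id] = dict(database[post_id])  # step 4: index the document
        changes.task_done()
        processed += 1


save_post(1, {"title": "first", "body": "..."})
save_post(2, {"title": "second", "body": "..."})
indexed = drain(changes, database, search_index)
```

The queue is what makes this variant more robust: if indexing is slow or unavailable, change notices wait in the queue instead of failing the application's write.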

Page 25: Elasticsearch for SQL Users

Forked writes from application (more robust with Logstash)

[Diagram: steps 1, 2, 3, 4]

Page 26: Elasticsearch for SQL Users

Questions?

@shaunak

https://www.flickr.com/photos/nicknormal/2245559230/