Introduction to Elasticsearch
DESCRIPTION
Elasticsearch is a powerful, distributed, open-source search technology. By integrating Elasticsearch into your application, you can search large amounts of data very quickly. Elasticsearch has a RESTful API, scales well, is very fast, can be customized with plugins, and much more. In this talk I go over the basics of setting up Elasticsearch, creating a search index, importing your data, and doing some basic searching. I also touch on a few advanced topics that show the flexibility of this awesome service.
TRANSCRIPT
Introduction to Elasticsearch
Jason Austin - @jason_austin
The Problem
• You are building a website to find beers
• You have a huge database of beers and breweries to sift through
• You want simple keyword-based searching
• You also want structured searching, like finding all beers > 7% ABV
• You want to run some analytics on what beers are in your dataset
Enter Elasticsearch
• Lucene based
• Distributed
• Fast
• RESTful interface
• Document-Based with JSON
Install Elasticsearch
• Download from http://elasticsearch.org
• Requires Java to run
Run Elasticsearch
• From the install directory:
./bin/elasticsearch -d
http://localhost:9200/
Communicating
• Elasticsearch listens to RESTful HTTP requests
• GET, POST, PUT, DELETE
• curl works just fine
ES Structure
Relational DB    Elasticsearch
Databases        Indices
Tables           Types
Rows             Documents
Columns          Fields
ES Structure
Elasticsearch    Example
Indices          phpbeer
Types            beer
Documents        Pliny the Elder
Fields           ABV, Name, Desc
Create an Index
curl -XPOST 'http://localhost:9200/phpbeer'
What to Search?
• Define the types of things to search
‣ Beer
‣ Brewery - Maybe later
Define a Beer
• Name
• Style
• ABV
• Brewery
‣ Name
‣ City
Beer JSON
{
  "name": "Pliny the Elder",
  "style": "Imperial India Pale Ale",
  "abv": 7.0,
  "brewery": {
    "name": "Russian River Brewing Co.",
    "city": "Santa Rosa",
    "state": "California"
  }
}
Saving The Beer
curl -XPOST 'http://localhost:9200/phpbeer/beer/1' -d '{
  "name": "Pliny the Elder",
  "style": "Imperial India Pale Ale",
  "abv": 7.0,
  "brewery": {
    "name": "Russian River Brewing Co.",
    "city": "Santa Rosa",
    "state": "California"
  }
}'
Getting a Beer
curl -XGET 'http://localhost:9200/phpbeer/beer/1?pretty'
Updating a Beer
curl -XPOST 'http://localhost:9200/phpbeer/beer/1' -d '{
  "name": "Pliny the Elder",
  "style": "Imperial India Pale Ale",
  "abv": 8.0,
  "brewery": {
    "name": "Russian River Brewing Co.",
    "city": "Santa Rosa",
    "state": "California"
  }
}'
POST vs PUT
• POST
‣ No ID - Creates new doc, assigns ID
‣ With ID - Updates or creates new doc
• PUT
‣ No ID - Error
‣ With ID - Creates the doc, or replaces it if it already exists
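The POST and PUT cases above can be sketched with curl, assuming the same local node and phpbeer index used in these slides and the same Elasticsearch version. The sample document and both function names are made up for illustration.

```shell
# Sample document; this beer is invented for illustration.
DOC='{ "name": "Example Night Porter", "style": "Porter", "abv": 5.5 }'

# POST without an ID: Elasticsearch creates the doc and generates an ID,
# which comes back in the "_id" field of the response.
create_with_generated_id() {
  curl -s -XPOST 'http://localhost:9200/phpbeer/beer?pretty' -d "$DOC"
}

# PUT with an ID: stores the doc under that ID, replacing any doc already there.
# $1 is the document ID to write to.
put_with_id() {
  curl -s -XPUT "http://localhost:9200/phpbeer/beer/$1?pretty" -d "$DOC"
}
```

With a node running, call create_with_generated_id first, then put_with_id 2, and compare the "_id" values in the two responses.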
Delete a Beer
curl -XDELETE 'http://localhost:9200/phpbeer/beer/1'
Finally! Searching!
curl -XGET 'http://localhost:9200/_search?pretty&q=pliny'
Specific Field Searching
curl -XGET 'http://localhost:9200/_search?pretty&q=style:pliny'
curl -XGET 'http://localhost:9200/_search?pretty&q=style:imperial'
Alternate Approach
• Search using DSL (Domain Specific Language)
• JSON in request body
DSL Searching
curl -XGET 'http://localhost:9200/_search?pretty' -d '{
  "query" : {
    "match" : { "style" : "imperial" }
  }
}'
DSL = Query + Filter
• Query - “How well does the document match”
• Filter - Yes or No question on the field
Query DSL
• match
‣ Full-text query for a string on a field (query the _all field to cover every field)
• match_phrase
‣ Used to query an exact phrase
• match_all
‣ Matches all documents
• multi_match
‣ Runs the same match query on multiple fields
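A minimal sketch of request bodies for the query types listed above, assuming the beer documents indexed earlier in these slides. The variable names, the search terms, and the search_beers helper are illustrative, not part of the talk.

```shell
# match_phrase: the words must appear together and in this order.
PHRASE_QUERY='{ "query" : { "match_phrase" : { "style" : "india pale ale" } } }'

# match_all: no conditions, so every document is a hit.
ALL_QUERY='{ "query" : { "match_all" : {} } }'

# multi_match: the same match query run against several fields at once.
MULTI_QUERY='{ "query" : { "multi_match" : { "query" : "russian", "fields" : [ "name", "brewery.name" ] } } }'

# Hypothetical helper: send one of the bodies above to the phpbeer index.
search_beers() {
  curl -s -XGET 'http://localhost:9200/phpbeer/_search?pretty' -d "$1"
}
```

For example, search_beers "$PHRASE_QUERY" sends the phrase query to a node on localhost:9200.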
Filter DSL
• term
‣ Exact match on a field
• range
‣ Match numbers over a specified range
• exists / missing
‣ Match based on the existence of a value for a field
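The filters above might look like this as request bodies. This is a sketch that follows the top-level "filter" shape these slides use in their own search examples; newer Elasticsearch releases express filters differently. The variable names and the run_filter helper are assumptions for illustration.

```shell
# term: compares against the indexed tokens. On an analyzed string field the
# default analyzer lowercases words, so the value here is lowercase.
TERM_FILTER='{ "filter" : { "term" : { "style" : "imperial" } } }'

# range: beers from 4% to 6% ABV, inclusive at both ends.
RANGE_FILTER='{ "filter" : { "range" : { "abv" : { "gte" : 4, "lte" : 6 } } } }'

# exists: only documents that have a value for brewery.city.
EXISTS_FILTER='{ "filter" : { "exists" : { "field" : "brewery.city" } } }'

# Hypothetical helper: send one of the bodies above to the phpbeer index.
run_filter() {
  curl -s -XGET 'http://localhost:9200/phpbeer/_search?pretty' -d "$1"
}
```

Because a filter is a yes-or-no test, the hits it lets through are not re-scored by it.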
More Complex Search
• Find beers whose style includes “Pale Ale” and that are under 7% ABV
Match + Range
curl -XGET 'http://localhost:9200/_search?pretty' -d '{
  "query" : {
    "match" : { "style" : "pale ale" }
  },
  "filter" : {
    "range" : { "abv" : { "lt" : 7 } }
  }
}'
Embedded Field Search
curl -XGET 'http://localhost:9200/_search?pretty' -d '{
  "query" : {
    "match" : { "brewery.state" : "California" }
  }
}'
Highlighting Search Results
curl -XGET 'http://localhost:9200/_search?pretty' -d '{
  "query" : {
    "match" : { "style" : "pale ale" }
  },
  "highlight" : {
    "fields" : { "style" : {} }
  }
}'
Aggregations
• Collect analytics on your documents
• Two main types
‣ Bucketing - Produce a set of buckets with documents in them
‣ Metric - Compute metrics over a set of documents
Bucketing Aggregations
Metric Aggregations
• How many beers exist of each style?
• What is the average ABV of beers for each style?
• How many beers exist that are brewed in California?
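The two counting questions above can be sketched as aggregation bodies, assuming the beer documents from earlier in these slides. The aggregation names, variable names, and the run_aggregation helper are illustrative.

```shell
# How many beers exist of each style? A terms aggregation makes one bucket per
# value, each with a doc_count. "size": 0 drops the hits and keeps only buckets.
# On an analyzed string field the buckets are individual words, not whole styles.
PER_STYLE='{ "size" : 0, "aggs" : { "beers_per_style" : { "terms" : { "field" : "style" } } } }'

# How many beers are brewed in California? A filter aggregation yields a single
# bucket. The value is lowercase because term compares against indexed tokens.
IN_CALIFORNIA='{ "size" : 0, "aggs" : { "california_beers" : { "filter" : { "term" : { "brewery.state" : "california" } } } } }'

# Hypothetical helper: send one of the bodies above to the phpbeer index.
run_aggregation() {
  curl -s -XGET 'http://localhost:9200/phpbeer/_search?pretty' -d "$1"
}
```

The counts come back under "aggregations" in the response, keyed by the aggregation names chosen here.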
What is the average ABV of beers for each style?
curl -XGET 'http://localhost:9200/_search?pretty' -d '{
  "aggs" : {
    "all_beers" : {
      "terms" : { "field" : "style" },
      "aggs" : {
        "avg_abv" : {
          "avg" : { "field" : "abv" }
        }
      }
    }
  }
}'
Mappings
• Define how ES searches
• Completely optional
• Must re-index after defining mapping
Create Index with Mapping
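# Drop the existing index first; its documents must be re-indexed afterwards
curl -XDELETE localhost:9200/phpbeer
# Recreate the index with "style" stored as a single, un-analyzed value
curl -XPOST localhost:9200/phpbeer -d '{
  "mappings" : {
    "beer" : {
      "_source" : { "enabled" : true },
      "properties" : {
        "style" : { "type" : "string", "index" : "not_analyzed" }
      }
    }
  }
}'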
What is the average ABV of beers for each style?
curl -XGET 'http://localhost:9200/_search?pretty' -d '{
  "aggs" : {
    "all_beers" : {
      "terms" : { "field" : "style" },
      "aggs" : {
        "avg_abv" : {
          "avg" : { "field" : "abv" }
        }
      }
    }
  }
}'
Non-Analyzed Fields
curl -XGET 'http://localhost:9200/_search?pretty&q=style:imperial'
curl -XGET 'http://localhost:9200/_search?pretty&q=style:hefeweizen'
Flexibility
• Mixing aggregations, filters and queries all together
• Which beers have the word “night” in the name and are between 4% and 6% ABV, broken down by style?
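One way to express that question in a single request is sketched below. It assumes the "filtered" query from the same Elasticsearch generation as these slides, which is not shown in the talk; it is used here because a top-level "filter" is applied after aggregations are computed, so the style buckets would otherwise ignore the ABV range. The names COMBINED, by_style, and night_beers_by_style are illustrative.

```shell
# Query (name contains "night") + filter (4% to 6% ABV) + aggregation (per style).
# Wrapping query and filter in "filtered" means the aggregation only sees
# beers that pass both.
COMBINED='{
  "query" : {
    "filtered" : {
      "query"  : { "match" : { "name" : "night" } },
      "filter" : { "range" : { "abv" : { "gte" : 4, "lte" : 6 } } }
    }
  },
  "aggs" : {
    "by_style" : { "terms" : { "field" : "style" } }
  }
}'

# Hypothetical helper: run the combined search against the phpbeer index.
night_beers_by_style() {
  curl -s -XGET 'http://localhost:9200/phpbeer/_search?pretty' -d "$COMBINED"
}
```

The response lists the matching beers under "hits" and the per-style counts under "aggregations".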
Elasticsearch and PHP
• Elasticsearch PHP Lib - https://github.com/elasticsearch/elasticsearch-php
• Elastica - http://elastica.io/
Other Awesome ES Features
• Search analyzers
• Geo-based searching
• Elasticsearch Plugins
• kopf - http://localhost:9200/_plugin/kopf
Questions?
• @jason_austin
• http://www.pintlabs.com
• https://joind.in/10821