elasticsearch & elastica in symfony2 - sflive 2015
TRANSCRIPT
What is it?
● “Distributed, RESTful, Search Engine built on top of Apache Lucene”
● Easy to install : aptitude install elasticsearch
● Easy to use, you will love JSON
● Denormalizing your data
Features
- Scoring : Calculate relevance, boost, Score Scripting
- Analyzers : a Tokenizer with TokenFilters and CharFilters
- GeoLocation
- Facets => Aggregations
- Highlighting
- Scripting
- Percolator : Prospective search
- 3 layers cache
- Plugin (attachment type, River …)
- Suggester : autocompletion and more
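To make the "Analyzers" item concrete, here is a toy sketch of the three stages (char filter, tokenizer, token filters). It is plain Python written for illustration only, not Lucene's or Elasticsearch's implementation. The function names and the stopword list are assumptions.

```python
import re

# Hypothetical stopword list, only for this illustration.
STOPWORDS = {"the", "a", "of"}

def strip_html(text):
    """Char filter: runs on the raw string, before tokenizing."""
    return re.sub(r"<[^>]+>", " ", text)

def tokenize(text):
    """Tokenizer: split the string into word tokens."""
    return re.findall(r"\w+", text)

def analyze(text):
    """Chain the stages in the order an analyzer applies them."""
    tokens = tokenize(strip_html(text))
    tokens = [t.lower() for t in tokens]               # lowercase token filter
    return [t for t in tokens if t not in STOPWORDS]   # stop token filter

print(analyze("<b>The</b> Art of Babysitting"))
```

The same chain runs at index time and at search time, which is why a query for "art" can match a document containing "Art".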
Why ElasticSearch
● For the search engine: we reached the efficiency and functional limits of SQL
● An easy solution for a first approach to a search engine
● Denormalize our data for search
● Used in: Search Form, Cron, SEO page, Business Metrics...
Elastica / ElasticaBundle
● Persistence automatic provider, Doctrine/Propel/MongoDB
● Pagination, PagerFanta/KNPpaginator
● Persistence listener CallBack (only Doctrine)
● Populate
In the end we no longer use it for these features; we only keep it for index config and services
Index, Type, Finder, Client
Search
curl -XGET http://localhost:9200/[INDEX]/[TYPE]/_search -d '{
  "query": {
    "query_string": {
      "query": "foobar"
    }
  },
  "filter": {
    "numeric_range": {
      "price": {
        "lte": 42
      }
    }
  },
  "sort": {
    "created_at": {
      "order": "desc"
    }
  }
}'
Query:
- Relevance
- Scoring
Filter:
- Discriminate
- Cached
- Fast
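The split between query and filter can be sketched by building the same request body in code. This is a minimal Python sketch that mirrors the curl request above; the function name `build_search_body` is an assumption, and the DSL keys are copied from the slide as-is.

```python
import json

def build_search_body(text, max_price):
    """Build the search body from the slide as a plain dict."""
    return {
        # Scored part: decides how relevant each hit is.
        "query": {"query_string": {"query": text}},
        # Yes/no part: only discriminates, no scoring, cacheable.
        "filter": {"numeric_range": {"price": {"lte": max_price}}},
        "sort": {"created_at": {"order": "desc"}},
    }

body = build_search_body("foobar", 42)
print(json.dumps(body, indent=2))
```

Anything that does not need to influence relevance (price limits, status flags, dates) is a good candidate for the filter part.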
ETL
● Extract all ads from SQL, Transform them, then Load them into ElasticSearch
● Don’t use “Populate” for large projects
● Still in PHP and Symfony2, so that we can use our Model layer (or not...)
● DoctrineListener as AMQP publisher for live indexing
● Needs to be fast: PDO & Curl: 10 types, 500 000 ads, 5 min
● Next: decoupling outside Symfony with Console Components
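The "Load" step can be sketched as follows. The slides do not say how the documents are sent; this sketch assumes Elasticsearch's `_bulk` endpoint, which takes newline-delimited JSON (one action line, then one document line). The index name `ads`, the type `sitter`, and the helper names are hypothetical.

```python
import json

def bulk_payload(index, doc_type, rows):
    """Turn already-transformed rows into a _bulk request body."""
    lines = []
    for row in rows:
        action = {"index": {"_index": index, "_type": doc_type, "_id": row["id"]}}
        lines.append(json.dumps(action))  # action line
        lines.append(json.dumps(row))     # document line
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"

def chunks(rows, size):
    """Yield fixed-size batches so each request stays small."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]

rows = [{"id": 1, "title": "Evening sitter"}, {"id": 2, "title": "Weekend sitter"}]
for batch in chunks(rows, 1000):
    payload = bulk_payload("ads", "sitter", batch)
    print(payload, end="")
```

Sending a few large batches instead of one request per ad is the usual reason a raw SQL and curl pipeline ends up much faster than an ORM-based populate.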
Usage
SitterForm
SitterSearch
SitterQuery extends ElasticaQuery
QueryFactory
ResultSet
PagerFantaElasticaAdapter
SearchManager
A Good FullText Search
● MultiMatch Query : Search text in multiple fields
● Highlighting : Highlight words in documents
● Suggester : Do autocompletion
● Find a compromise between relevance and quantity
Multi Match Query
subfields, for fullText search : my_field.fr and my_field.en
“regular” field “my_field”
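A minimal sketch of such a request body, assuming the field layout named on the slide (`my_field` plus the `my_field.fr` and `my_field.en` subfields). The function name is hypothetical, and the highlight section is added only to show how it pairs with the same fields.

```python
import json

def multi_match_body(text, field="my_field", langs=("fr", "en")):
    """Search the regular field and its per-language subfields at once."""
    fields = [field] + ["%s.%s" % (field, lang) for lang in langs]
    return {
        "query": {"multi_match": {"query": text, "fields": fields}},
        # Ask for highlighted fragments on every searched field.
        "highlight": {"fields": {name: {} for name in fields}},
    }

body = multi_match_body("nanny")
print(json.dumps(body, indent=2))
```

Each subfield would typically be analyzed with its own language analyzer, so the same text can match through French or English stemming.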
Percolator
● Index the user’s search query in a “percolator index”
● When an ad is registered, send it to the regular index and to the percolator
● The names of the matching percolator queries are returned
● You can then alert the user that an ad matching their alert has just been registered
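The two calls involved can be sketched as request descriptions. This assumes the Elasticsearch 1.x percolator API (queries stored under the `.percolator` type, documents sent to `_percolate`); later versions changed this API. The index, type, and alert names are hypothetical.

```python
def register_alert(index, alert_id, query):
    """Store a user's saved search as a percolator query."""
    return ("PUT", "/%s/.percolator/%s" % (index, alert_id), {"query": query})

def percolate_ad(index, doc_type, ad):
    """Ask which stored queries match a new ad (the ad is not indexed here)."""
    return ("GET", "/%s/%s/_percolate" % (index, doc_type), {"doc": ad})

alert = register_alert("alerts", "user-42-paris", {"match": {"city": "Paris"}})
check = percolate_ad("alerts", "ad", {"city": "Paris", "title": "Evening sitter"})
print(alert)
print(check)
```

The response to the second call lists the ids of the matching alerts, which is what lets the application decide whom to notify.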
Score Scripting
in /etc/elasticsearch/scripts/grade.groovy :
doc['average_grade'].value > 3.5 ? _score * doc['average_grade'].value : _score
in /etc/elasticsearch/scripts/login.groovy :
doc['lastLogin'].value < minLastLogin ? _score * 0.5 : _score
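To see what these two scripts do to a score, here is a plain Python transliteration of their arithmetic. It is only an illustration, not code that runs inside Elasticsearch. `min_last_login` stands for the `minLastLogin` value, which is presumably passed to the script as a parameter.

```python
def grade_score(score, average_grade):
    """grade.groovy: multiply the score by the grade when it is above 3.5."""
    return score * average_grade if average_grade > 3.5 else score

def login_score(score, last_login, min_last_login):
    """login.groovy: halve the score of users who have not logged in recently."""
    return score * 0.5 if last_login < min_last_login else score

print(grade_score(2.0, 4.0))       # well-graded profile is boosted
print(grade_score(2.0, 3.0))       # below the threshold: unchanged
print(login_score(2.0, 100, 200))  # stale login: penalized
```

Both scripts leave the original relevance in place and only scale it, so text matching still drives the ranking.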