Elasticsearch in Production: New York Meetup at Twitter, October 2014

Elasticsearch in production!

Konrad Beiske konrad@found.no

@beiske

Senior software engineer at Found AS

Working with Elasticsearch for 2 years

Herding hundreds of Elasticsearch clusters

Agenda

• Anti-patterns

• Memory / Resource Usage

• Distributed problems

• Security

• Client concerns

• Changing a cluster

found.no/foundation

Snapshot / Restore

Circuit breakers

Document values

Aggregations

Distributed percolation

Suggesters

Anti-Patterns

Arbitrary Keys

• “Schema Free”

• One field per value

• Ever-growing cluster state

acls:
  1234: READ
  42: WRITE
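A minimal sketch of one way around this, assuming the keys of the `acls` object above are user IDs: moving the arbitrary key into a value keeps the mapping at a fixed set of fields. The field names `user_id` and `permission` are hypothetical and do not come from the talk.

```python
def acls_to_entries(acls):
    """Turn {user_id: permission} into a list of fixed-field objects.

    With arbitrary keys, every new user ID becomes a new field in the
    mapping. With this shape the mapping only ever contains the two
    (hypothetical) fields `user_id` and `permission`.
    """
    return [
        {"user_id": int(user_id), "permission": permission}
        for user_id, permission in sorted(acls.items(), key=lambda kv: int(kv[0]))
    ]


doc = {"acls": acls_to_entries({1234: "READ", 42: "WRITE"})}
print(doc)
```

To match a user and a permission within the same entry, the `acls` field would probably need to be mapped as `nested`, since plain object arrays are flattened at index time.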

Heavy Updating

• Update = Delete + Reindex

• Be careful with counters
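Because every update is a delete plus a reindex, incrementing a counter on each event rewrites the whole document each time. One possible mitigation, sketched here under the assumption that slightly stale counters are acceptable, is to accumulate increments on the client and send one update per document per batch. The class name `CounterBuffer` and the `flush_every` parameter are hypothetical; sending the deltas to Elasticsearch is left out.

```python
from collections import Counter


class CounterBuffer:
    """Accumulate counter increments in memory and hand them out in batches."""

    def __init__(self, flush_every=100):
        self.flush_every = flush_every
        self.pending = Counter()
        self.seen = 0

    def increment(self, doc_id, amount=1):
        """Record an increment; return the deltas once a batch is full, else None."""
        self.pending[doc_id] += amount
        self.seen += 1
        if self.seen >= self.flush_every:
            return self.flush()
        return None

    def flush(self):
        """Return the accumulated delta per document and reset the buffer."""
        deltas = dict(self.pending)
        self.pending.clear()
        self.seen = 0
        return deltas


buf = CounterBuffer(flush_every=3)
buf.increment("page-1")
buf.increment("page-1")
# Three events collapse into two updates, one per document.
print(buf.increment("page-2"))
```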

Slow queries

• WHERE foo ILIKE '%bar%'

• {"query_string": {"query": "foo:*bar*"}}

Arbitrary searches

query:
  filtered:
    filter:
      term:
        user_id: 42
    query: [user's query here]
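The structure above can be written out as a request body. This is a sketch assuming the `filtered` query of the Elasticsearch 1.x query DSL that was current at the time of the talk; the function name `restrict_to_user` and the `title` field are hypothetical.

```python
import json


def restrict_to_user(user_query, user_id):
    """Wrap a user's query in a `filtered` query with a mandatory term filter."""
    return {
        "query": {
            "filtered": {
                "filter": {"term": {"user_id": user_id}},
                "query": user_query,
            }
        }
    }


body = restrict_to_user({"match": {"title": "elasticsearch"}}, 42)
print(json.dumps(body, indent=2))
```

The filter only limits which documents can match. Whatever goes into the inner `query` is still executed as given, which is why passing a user's arbitrary query through is listed here as an anti-pattern.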

Time Bomb

Memory

• Field caches

• Filter caches

• Page caches

• Aggregations

• Index building

Page Cache

• Keeping index pages in memory

• Can’t have too much

• Outgrow: Gradual slowdown

Heap Space

• Memory used by Elasticsearch process

• Field / Filter caches

• Aggregations

Time Bomb

OutOfMemoryError

Woah there

I ate all the memories

Your cluster may or may not work any more

OutOfMemory

• Growing too big

• Selecting too large a time span in Kibana

• Document ingestion peak

Preventing OOMs

• Have enough memory :-)

• Understand your search’s memory profile

• Bulk / Circuit breaker settings

• Monitoring

• Document values
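For the bulk bullet, one simple way to keep an ingestion peak from turning into a few huge requests is to cap the number of documents per bulk request on the client. This is only a sketch: the helper name `batches` and the default of 500 are assumptions, and the right size has to be found by testing.

```python
def batches(docs, max_docs=500):
    """Yield lists of at most `max_docs` documents from any iterable."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == max_docs:
            yield batch
            batch = []
    if batch:
        # Emit the final, partially filled batch.
        yield batch


for batch in batches(({"n": i} for i in range(5)), max_docs=2):
    print(len(batch), batch)
```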

Marvel (/_stats)
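Without Marvel, a rough heap check can be scripted against the stats APIs. The sketch below assumes a response shaped like that of the node stats API (`/_nodes/stats`), with `jvm.mem.heap_used_in_bytes` and `jvm.mem.heap_max_in_bytes` per node. The function name, the 75% threshold and the sample data are made up for illustration.

```python
def nodes_over_heap_threshold(node_stats, threshold=0.75):
    """Return {node name: heap fraction} for nodes using more heap than `threshold`."""
    hot = {}
    for node in node_stats["nodes"].values():
        mem = node["jvm"]["mem"]
        used = mem["heap_used_in_bytes"] / mem["heap_max_in_bytes"]
        if used > threshold:
            hot[node["name"]] = round(used, 2)
    return hot


# Hand-written sample in the assumed response shape, not real cluster output.
sample = {
    "nodes": {
        "id-a": {"name": "node-a",
                 "jvm": {"mem": {"heap_used_in_bytes": 900, "heap_max_in_bytes": 1000}}},
        "id-b": {"name": "node-b",
                 "jvm": {"mem": {"heap_used_in_bytes": 300, "heap_max_in_bytes": 1000}}},
    }
}
print(nodes_over_heap_threshold(sample))
```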

Document Values

"my_field": {
  "type": "string",
  "fielddata": {
    "format": "doc_values"
  }
}

Sizing

• Test, don’t guess

• Start big, scale down

• Index, search, monitor

Glitch Meltdown

• Tie-breaker can be a cheap master-node

• Applies to data centers / availability zones too
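The tie-breaker point comes down to majority arithmetic. As a sketch, assuming the usual majority rule behind the `discovery.zen.minimum_master_nodes` setting of Elasticsearch 1.x, the quorum and the number of master-eligible nodes that can be lost look like this (the function name is hypothetical):

```python
def minimum_master_nodes(master_eligible):
    """Smallest majority of the given number of master-eligible nodes."""
    return master_eligible // 2 + 1


for n in (2, 3, 4, 5):
    quorum = minimum_master_nodes(n)
    print("master-eligible=%d quorum=%d tolerated losses=%d" % (n, quorum, n - quorum))
```

With two nodes the quorum is two, so losing either one leaves no majority. A third, cheap master-only node raises the tolerated losses to one without adding a third copy of the data.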

Data-only nodes

Master-only nodes

Jepsen

• Kyle Kingsbury’s series on distributed systems

• Distributed systems are hard

• aphyr.com

Security

• “Not my job!” – Elasticsearch

• That’s fine!

Dynamic Scripts

• Scoring

• Aggregations

• Updating

Dynamic Scripts

Runtime.getRuntime().exec(…)

Security

• Disable dynamic scripts

• Mind index patterns

• Even then, don’t accept arbitrary requests
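One way to read the last two bullets: the application decides which index a request may touch, and anything resembling a pattern is refused. The sketch below assumes a fixed set of index names; `ALLOWED_INDICES` and `validate_index` are hypothetical.

```python
ALLOWED_INDICES = {"products", "articles"}  # hypothetical whitelist


def validate_index(index):
    """Return `index` if it is an exact whitelisted name, else raise ValueError.

    Exact matching also rejects wildcards, `_all` and comma-separated
    lists, because none of those are names in the whitelist.
    """
    if index not in ALLOWED_INDICES:
        raise ValueError("index not allowed: %r" % (index,))
    return index


for candidate in ("products", "_all", "prod*", "products,secrets"):
    try:
        print("ok:", validate_index(candidate))
    except ValueError as err:
        print("rejected:", err)
```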

Client Concerns

• Connection pools

• Idempotent requests

• Have sane syncing/indexing strategies
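A request becomes safe to retry when sending it twice has the same effect as sending it once. For indexing, one way to get there is to derive the document ID from the record instead of letting Elasticsearch generate one, so that a retry overwrites rather than duplicates. This sketch assumes immutable records such as log events; the function name and the sample event are hypothetical.

```python
import hashlib
import json


def deterministic_id(record):
    """Derive a stable document ID from the record's content.

    Keys are sorted before hashing, so the same record always yields the
    same ID regardless of key order.
    """
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


event = {"host": "web-1", "ts": "2014-10-01T12:00:00Z", "msg": "started"}
retry = {"msg": "started", "ts": "2014-10-01T12:00:00Z", "host": "web-1"}
print(deterministic_id(event) == deterministic_id(retry))
```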

BOOM!

Cluster changes

• Make new nodes join existing cluster

• No rolling restarts

• Easy rollback if things go bad

v1.0.0 v1.0.1

Cluster changes

• Test first

• Mind the recover_* settings (e.g. gateway.recover_after_nodes)

Multi-Cluster Workflows

• Snapshot/Restore

• Operations across clusters

• Swap clusters!

• Works well with good syncing strategy

• Same JVM

• ulimits

• Unicast and cluster name

• SSD? noop-scheduler

@foundsays

Learn More!

found.no/foundation

Follow @beiske
