sf elasticsearch meetup 2013.04.06 - monitoring

Post on 06-May-2015

Category:

Technology

DESCRIPTION

Using Zabbix for systems-level monitoring of ElasticSearch, and SPM (http://sematext.com/spm/elasticsearch-performance-monitoring/index.html) for ElasticSearch-specific monitoring. These tools were crucial for optimizing both index building performance and query performance. Also includes some general tips for index building and query performance.

TRANSCRIPT

Monitoring tools for ElasticSearch

SF Meetup 2013.03.06

Sushant Shankar
Shyam Kuttikkad

• Why and how we use ElasticSearch
• Monitoring
  – Tools
  – Index Building
  – Query Performance

Who is 33Across

• Social Sharing and Content Discovery platform
  – We help >600,000 publishers with content distribution, user engagement, and advertising monetization
  – 450 Fortune 1000 brand marketers leverage our unique social signals to deliver impactful advertising
• We develop Machine Learning algorithms operating on Big Data to:
  – Provide content sharing insights to Publishers
  – Build customized audience segments for advertising campaigns
  – Extract actionable insights out of social and interest data

www.33Across.com
www.tynt.com

Data firehose of 30B monthly events, 1.25B cookies

- Interaction with web content
- Shares – images, copies
- Searches

Social Audiences
Behavior
Context
Knowledge

Real-time view

Build, understand, analyze

ElasticSearch!

Production ElasticSearch cluster

Build index using MR job and Bulk API

Hardware
• 6 nodes, 24GB RAM
• 16GB for ES service
• 4 cores
• 3x 1.5TB drive

Index
• >1TB/index (replicated)
• ~300M documents
• ~5KB / document
• ~3 hours
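
A back-of-envelope check of the index figures above, as a minimal sketch. It assumes that "~3 hours" is the time to build the whole index and reuses the 10,000-documents-per-request batch size from the Index Building learnings later in the deck. All values are the approximate ones from the slides, so the results only show orders of magnitude.

```python
# Approximate figures from the slides.
DOCS = 300_000_000      # ~300M documents
DOC_SIZE_KB = 5         # ~5KB per document
BUILD_HOURS = 3         # assumption: "~3 hours" is the full index build time
BULK_BATCH = 10_000     # documents per bulk indexing request

raw_size_tb = DOCS * DOC_SIZE_KB / 1e9         # KB -> TB (decimal units)
docs_per_second = DOCS / (BUILD_HOURS * 3600)  # sustained indexing rate
bulk_requests = DOCS // BULK_BATCH             # bulk requests per build

print(f"raw document volume: {raw_size_tb:.1f} TB")
print(f"sustained rate: {docs_per_second:,.0f} docs/s")
print(f"bulk requests per build: {bulk_requests:,}")
```

Under these assumptions the cluster has to sustain tens of thousands of documents per second for the whole build, which is why the tuning below focuses on batch size, shard count, replicas, and refresh.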

System monitoring using Zabbix

Index Build

ElasticSearch specific monitoring using SPM

Scalable Performance Monitoring (http://sematext.com/spm/index.html)

• Index stats – Total/Refreshed/Merged documents
• Shards – Total/Active/Relocating/Initializing
• Search – Request rate and latency
• Cache – {Filter, field} cache {count, evictions, size}
• Machine – CPU, Memory, JVM, GC, Network, Disk
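
To make the shard states in this list concrete, here is a minimal sketch that summarizes the shard counters found in an ElasticSearch cluster health response (`GET /_cluster/health`). It is not how SPM collects its metrics; the sample response and the `summarize_shards` helper are hypothetical and for illustration only.

```python
import json

# Hypothetical sample of a cluster health response; the values are made up.
SAMPLE_HEALTH = json.loads("""
{
  "status": "yellow",
  "active_shards": 40,
  "relocating_shards": 2,
  "initializing_shards": 3,
  "unassigned_shards": 5
}
""")

def summarize_shards(health):
    """Pick out the shard counters and flag whether the cluster has settled."""
    counts = {
        "active": health["active_shards"],
        "relocating": health["relocating_shards"],
        "initializing": health["initializing_shards"],
        "unassigned": health["unassigned_shards"],
    }
    # "Settled" here means no shard is moving, starting up, or waiting for a node.
    counts["settled"] = (
        counts["relocating"] == 0
        and counts["initializing"] == 0
        and counts["unassigned"] == 0
    )
    return counts

summary = summarize_shards(SAMPLE_HEALTH)
print(summary)
```

Watching these counters during an index build shows when shards are still initializing or relocating, which is worth knowing before comparing timings across runs.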

Index Building Optimization using Zabbix and SPM

Amount bulk indexed

# Shards

Time taken
CPU util.
Mem util.
Disk I/O
Network

in practice…

Debugging and Validating using SPM

Index Building: Learnings

• 2 shards / CPU
• 10,000 documents (users) per indexing request
• Bulk API for our use case
• No replicas
• Refresh off (index.refresh_interval = -1)
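
The learnings above can be sketched as follows. This is a minimal illustration, not the speakers' MapReduce job: it only builds the index settings and the newline-delimited Bulk API request bodies and does not talk to a cluster. The index name, type name, and document fields are hypothetical, and the `_type` field reflects ElasticSearch versions of that era (recent versions no longer use it).

```python
import json
from itertools import islice

# Settings for the build phase, per the slide: no replicas, refresh off.
BUILD_SETTINGS = {"index": {"number_of_replicas": 0, "refresh_interval": "-1"}}

def batches(docs, size=10_000):
    """Yield lists of at most `size` documents (10,000 per request on the slide)."""
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

def bulk_body(index, doc_type, batch):
    """Build a Bulk API body: one action line, then one source line, per document."""
    lines = []
    for doc in batch:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc["id"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline

# Hypothetical documents: one per user.
users = ({"id": i, "interests": ["example"]} for i in range(25_000))
bodies = [bulk_body("users", "user", b) for b in batches(users)]
print(len(bodies), "bulk requests")
```

Each body would be sent as one `POST /_bulk` request after applying `BUILD_SETTINGS` to the index; replicas and refresh are then turned back on for querying.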

Query Performance: Learnings

• 1-2 replicas (also for reliability)
• Turn refresh on again (5s default)
• Warm-up effect (Index Warm up API 0.20+)
• Optimize API
• Simulate multiple users
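
A minimal sketch of the "simulate multiple users" point, together with the serving-time settings from the list above. The `search` callable is a stand-in: `fake_search` below is hypothetical and only sleeps, so a real test would replace it with a function that sends the query to the cluster. The 5s refresh interval is the value given on the slide.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Settings for the query phase, per the slide: replicas on, refresh on again.
SERVE_SETTINGS = {"index": {"number_of_replicas": 1, "refresh_interval": "5s"}}

def simulate_users(search, queries, users=8):
    """Run `queries` through `search` from `users` concurrent threads and time each call."""
    def timed(query):
        start = time.perf_counter()
        search(query)
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=users) as pool:
        latencies = sorted(pool.map(timed, queries))

    return {
        "count": len(latencies),
        "median_s": latencies[len(latencies) // 2],
        "max_s": latencies[-1],
    }

def fake_search(query):
    """Hypothetical stand-in for a real search request."""
    time.sleep(0.001)
    return {"hits": []}

stats = simulate_users(fake_search, [f"user:{i}" for i in range(40)], users=8)
print(stats)
```

Running the same query set before and after warm-up, and with different `users` values, separates the warm-up effect from contention between concurrent requests.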

QUERIES?

Sushant Shankar
sushant.shankar@33across.com

Shyam Kuttikkad
shyam.kuttikkad@33across.com

Why we really need a search engine

… …

Batch! Good for complicated tasks (Machine Learning, Graph Algorithms, etc.)

Warm Up: load into memory and cache

Other cool features

• Custom Scoring functions
• Scripts – MVEL, Python
• Facets
• Exploring:
  – Real-time indexing
  – Indexing images, files, etc.
  – Parent-child relationships
