An Introduction to Elasticsearch for Beginners


Upload: amir-sedighi

Post on 02-Jul-2015


DESCRIPTION

This is an introduction to Elasticsearch, based on Alex Brasetvik's presentations "Elasticsearch from the Bottom Up" and "Elasticsearch in Production".

TRANSCRIPT

Page 1: An Introduction to Elasticsearch for Beginners


Elasticsearch

Amir Sedighi

Twitter: @amirsedighi

Blog: http://hexican.com

Email: [email protected]

Oct 2014


References

● http://elasticsearch.org/

● https://www.found.no/foundation/elasticsearch-in-production/

● https://www.found.no/foundation/sizing-elasticsearch/

● https://www.found.no/foundation/elasticsearch-as-nosql/

● https://www.found.no/foundation/elasticsearch-from-the-bottom-up/


● Thanks to Alex Brasetvik (@alexbrasetvik) from @foundsays, for the slides.

● Thanks to Leslie Hawthorn (@lhawthorn) from @elasticsearch, for the stickers.


Search Tools Powered by Lucene

● 1999: Lucene (Doug Cutting)

● 2003: Nutch (Doug Cutting)

● 2004: Solr (Yonik Seeley)

● 2010: Elasticsearch (Shay Banon)


Lucene

● Full-Text Search Library

● Free & Open-Source

● Features:

– Indexes & Analyzes Data

– Tokenizing

– Filtering

– Wildcards

– Aggregation

– Sorting


Elasticsearch

● Free and Open-Source

● Java (Cross-platform)

● Real-Time Analytical Search Engine

● Distributed

● Highly Available

● RESTful


Shard
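
A shard is one slice of an index, and each document lands on exactly one primary shard. The sketch below illustrates the routing idea under a simple hash-modulo assumption. Elasticsearch uses its own hash function internally; `zlib.crc32` is only a stand-in here, and `shard_for` and the sample IDs are hypothetical names for illustration.

```python
import zlib


def shard_for(doc_id: str, num_primary_shards: int) -> int:
    """Pick a primary shard for a document by hashing its routing key."""
    # crc32 is a stand-in hash, not the function Elasticsearch itself uses.
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


if __name__ == "__main__":
    for doc_id in ["user-1", "user-2", "user-3"]:
        print(doc_id, "-> shard", shard_for(doc_id, 5))
```

Because the result depends on the shard count, the same ID would map to a different shard if the count changed, which is why the number of primary shards is chosen when the index is created.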


Inverted Index
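
An inverted index maps each term to the documents that contain it, so a search becomes a lookup plus a set intersection. The following is a minimal sketch of that idea in plain Python, not Lucene's actual data structure; the tokenizer is deliberately naive (lowercase, split on non-alphanumerics, no stemming) and all function names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Map each term to the set of document IDs that contain it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index: Dict[str, Set[int]], query: str) -> Set[int]:
    """Return IDs of documents that contain every term of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


if __name__ == "__main__":
    docs = {1: "The quick brown fox", 2: "The lazy dog", 3: "Quick brown dogs"}
    index = build_inverted_index(docs)
    print(search(index, "quick brown"))
```

Note that "dog" and "dogs" are different terms here; collapsing them is the job of the analysis step (tokenizing and filtering) listed among Lucene's features.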


One Index per Day
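
With one index per day, time-based data such as logs can be queried by naming only the days of interest, and old data can be dropped by deleting whole indices. The sketch below only computes index names; the `logs-YYYY.MM.DD` pattern, the `logs` prefix, and the helper names are assumptions for illustration.

```python
from datetime import date, timedelta
from typing import Iterable, List


def daily_index_name(day: date, prefix: str = "logs") -> str:
    """Build the index name for one day, e.g. logs-2014.10.01."""
    return f"{prefix}-{day:%Y.%m.%d}"


def indices_for_range(start: date, end: date, prefix: str = "logs") -> List[str]:
    """List the daily indices covering start..end, both inclusive."""
    days = (end - start).days
    return [daily_index_name(start + timedelta(days=i), prefix) for i in range(days + 1)]


def expired_indices(existing: Iterable[str], today: date, keep_days: int,
                    prefix: str = "logs") -> List[str]:
    """Return existing daily indices that fall outside the retention window."""
    keep = set(indices_for_range(today - timedelta(days=keep_days - 1), today, prefix))
    return sorted(name for name in existing
                  if name.startswith(prefix + "-") and name not in keep)


if __name__ == "__main__":
    print(indices_for_range(date(2014, 10, 30), date(2014, 11, 1)))
```

A query for the last three days would then target only those three index names, and a retention job would delete whatever `expired_indices` returns.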


A Partial Query


The Filtered Query Graph
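
A filtered query combines a scoring query with a non-scoring filter that narrows the candidate documents first. The sketch below builds such a request body using the `filtered` query syntax of the Elasticsearch 1.x releases current when these slides were written; later releases express the same idea with the `filter` clause of the `bool` query. The field names and values are hypothetical.

```python
import json


def filtered_query(text_field: str, text: str,
                   filter_field: str, filter_value: str) -> dict:
    """Build a 1.x-style filtered query: a term filter plus a match query."""
    return {
        "query": {
            "filtered": {
                # The query part is scored for relevance.
                "query": {"match": {text_field: text}},
                # The filter part only answers yes/no and can be cached.
                "filter": {"term": {filter_field: filter_value}},
            }
        }
    }


if __name__ == "__main__":
    body = filtered_query("message", "timeout", "status", "error")
    print(json.dumps(body, indent=2))
```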


Question

● Can ES be used as a "NoSQL"-database?


Production and Deployment

● Keeping End-users Happy.

● Tracking Quality of Service and Cluster Health.


Agenda

● Memory (Performance and Reliability)

● Security

● Networking (Reliability)


Memory

● Search engines have a great appetite for memory!

– Caches, caches, caches

● Field and filter caches

● Index building


Comparison

● RDBMSs are built to store data. They keep hot data in memory and fall back to disk when memory runs out.

– Slower, but still working.

– Timeouts are the client's concern.

● Search engines are built for speed.

– Either running fast or not running at all.

– Assumption: you've provided enough memory.


Question

● What if you don't provide them enough memory?


Out Of Memory

● In the best case:

– Your indexing or search request simply fails.

● In worse cases:

– The cluster state gets corrupted.

– Netty crashes.

● Just don't end up there in your production cluster.


Warning Signs

● ES provides many endpoints that give you insight into its state.

– Resource Usage

● Cache Sizes

● Heap Space

● There are Monitoring Tools.

– Profile your queries and optimize them.
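
One way to watch heap space is to read the `jvm.mem.heap_used_percent` field that the nodes stats endpoint (`GET /_nodes/stats`) reports for each node. The sketch below works on an already-parsed response; the sample dictionary is hand-written and heavily trimmed for illustration rather than real cluster output, and the 75% threshold is an assumption.

```python
from typing import List, Tuple


def nodes_over_heap_threshold(stats: dict,
                              threshold_percent: int = 75) -> List[Tuple[str, int]]:
    """Return (node name, heap used %) for nodes at or above the threshold."""
    hot = []
    for node_id, node in stats.get("nodes", {}).items():
        used = node["jvm"]["mem"]["heap_used_percent"]
        if used >= threshold_percent:
            hot.append((node.get("name", node_id), used))
    # Worst offenders first.
    return sorted(hot, key=lambda item: item[1], reverse=True)


# Hand-written, trimmed sample shaped like a nodes stats response.
SAMPLE_STATS = {
    "nodes": {
        "a1": {"name": "node-1", "jvm": {"mem": {"heap_used_percent": 62}}},
        "b2": {"name": "node-2", "jvm": {"mem": {"heap_used_percent": 91}}},
        "c3": {"name": "node-3", "jvm": {"mem": {"heap_used_percent": 78}}},
    }
}

if __name__ == "__main__":
    for name, used in nodes_over_heap_threshold(SAMPLE_STATS):
        print(f"{name}: heap {used}% used")
```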


Marvel


Try it in the cloud with http://found.no


BigDesk


Paramedic


Memory Constraints

● Large heaps are expensive to garbage collect.

– The JVM can no longer use compressed object pointers once the heap goes beyond about 32 GB.

– Keep heap < 32 GB

● A single machine with a huge amount of memory and SSDs:

– Run multiple nodes on one very fast machine with SSDs and lots of RAM. (Note: replicas on the same machine share a single point of failure.)

● Scale-Out
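
The heap limit can be turned into simple sizing arithmetic. The sketch below caps the heap just under 32 GB as the slide suggests; giving the heap only half of the machine's RAM (leaving the rest to the operating system's file cache) is a commonly cited rule of thumb and an assumption here, not something stated in the slides. The function names and the 31 GB cap are likewise illustrative.

```python
def suggested_heap_gb(ram_gb: float, max_heap_gb: float = 31.0) -> float:
    """Half of RAM for the heap, capped below the ~32 GB pointer limit."""
    return min(ram_gb / 2, max_heap_gb)


def nodes_per_machine(ram_gb: float, max_heap_gb: float = 31.0) -> int:
    """How many capped-heap nodes fit if half of RAM is reserved for heaps."""
    return max(1, int((ram_gb / 2) // max_heap_gb))


if __name__ == "__main__":
    for ram in (16, 64, 128, 256):
        print(f"{ram} GB RAM: heap {suggested_heap_gb(ram)} GB per node, "
              f"{nodes_per_machine(ram)} node(s)")
```

On a 128 GB machine this suggests two nodes with 31 GB heaps rather than one node with a 64 GB heap, which is the "multiple nodes on one big machine" option above.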


Security

● Everyone is most welcome.

● Authentication and authorization aren't ES's business.

– You are the gatekeeper.

● Based on the user's role, limit requests by applying filters.

– Running out of memory is a critical issue, and an attack vector.

– Unfiltered or unnecessary queries consume a lot of memory.
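
The gatekeeper role can be sketched as a small function that sits between users and the cluster. It wraps whatever query the user sends in a role-specific filter, caps the result size, and rejects facet or aggregation requests. Everything here is hypothetical: the roles, the `team` field, the size cap, and the function name. The request body uses the 1.x-era `filtered` query syntax.

```python
# Hypothetical mapping from role to the filter that role must always carry.
ROLE_FILTERS = {
    "support": {"term": {"team": "support"}},
    "sales": {"term": {"team": "sales"}},
}

MAX_SIZE = 100  # Hypothetical cap on hits returned per request.


def guard_request(role: str, body: dict) -> dict:
    """Rewrite a user's search body so it cannot escape the role's filter."""
    if role not in ROLE_FILTERS:
        raise PermissionError(f"unknown role: {role}")
    # Facets and aggregations can be memory hungry, so refuse them here.
    if any(key in body for key in ("facets", "aggs", "aggregations")):
        raise PermissionError("facets and aggregations are not allowed")
    user_query = body.get("query", {"match_all": {}})
    return {
        "size": min(int(body.get("size", 10)), MAX_SIZE),
        "query": {"filtered": {"query": user_query, "filter": ROLE_FILTERS[role]}},
    }


if __name__ == "__main__":
    print(guard_request("sales", {"query": {"match": {"title": "invoice"}}, "size": 5000}))
```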


Security: Shield is coming soon


Networking

● ES works great on a single node.

● ES is impressively easy to use for being a distributed system.

● ES Supports lots of different network topologies.


Networking in a Log Manager


Suggestions

● Have enough memory to keep your nodes reliable.

● Require a majority of nodes (a quorum).

● Favor filters over matching queries.

● Have an eye on the cluster (Health).

● Don't let users run faceted queries freely, or at least reduce their frequency.
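
The majority suggestion is read here as the usual quorum rule: a master may only be elected when more than half of the master-eligible nodes can see each other, which in the Elasticsearch releases of that period was configured through the `discovery.zen.minimum_master_nodes` setting. A minimal sketch of the arithmetic, with hypothetical function names:

```python
def quorum(master_eligible_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def can_elect_master(visible_nodes: int, master_eligible_nodes: int) -> bool:
    """True if this side of a network partition holds a majority."""
    return visible_nodes >= quorum(master_eligible_nodes)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(f"{n} master-eligible node(s): quorum = {quorum(n)}")
```

With three nodes split two against one, only the side with two nodes may elect a master, so the cluster cannot end up with two masters at once.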


Questions?