All About Elasticsearch Language Clients Karel Minařík @karmiq

Upload: enterprise-search-warsaw-meetup

Post on 25-Jul-2015



The Motivation
- Clients are part of the experience
- Fragmentation and inconsistency
- Solving the same problems over and over again
- Lack of support for the full breadth of Elasticsearch's APIs
- Different assumptions about exposing the APIs

Exhibit A: The Tire client for Ruby
- Incomplete support for query and filter types
- Mixing together a Ruby API and Rails integration
- Still widely used and liked

The Groundwork
1. The REST API specification
2. The common integration test suite
3. The implementation sketch

{
  "index": {
    "documentation": "http://www.elasticsearch.org/guide/en/elasticsearch/reference/master/docs-index_.html",
    "methods": ["POST", "PUT"],
    "url": {
      "path": "/{index}/{type}",
      "paths": ["/{index}/{type}", "/{index}/{type}/{id}"],
      "parts": {
        "id": {
          "type" : "string",
          "description" : "Document ID"
        },
        "index": {
          "type" : "string",
          "required" : true,
          "description" : "The name of the index"
        },
        "type": {

https://github.com/elasticsearch/elasticsearch/blob/master/rest-api-spec/api/index.json
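One way a client could use the "paths" list from this specification is to pick the most specific URL template whose placeholders are all supplied by the caller. The following is a minimal Ruby sketch of that idea, not the code used by the official clients. The `build_path` helper and the inline `paths` array (copied from the slide) are hypothetical, and URL escaping is left out.

```ruby
# Path templates copied from the "index" API entry shown on the slide.
paths = ['/{index}/{type}', '/{index}/{type}/{id}']

# Hypothetical helper: pick the template with the most placeholders that
# can all be filled from `parts`, then substitute the values.
def build_path(paths, parts)
  candidates = paths.select do |template|
    template.scan(/\{(\w+)\}/).flatten.all? { |name| parts.key?(name) }
  end
  raise ArgumentError, 'no path matches the supplied parts' if candidates.empty?

  best = candidates.max_by { |template| template.scan(/\{\w+\}/).size }
  best.gsub(/\{(\w+)\}/) { parts[Regexp.last_match(1)].to_s }
end

puts build_path(paths, { 'index' => 'my_index', 'type' => 'my_type', 'id' => 1 })
puts build_path(paths, { 'index' => 'my_index', 'type' => 'my_type' })
```

With an ID the longer template is chosen; without one the sketch falls back to the shorter template, which corresponds to the two entries in "paths".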

client.index index: 'my_index', type: 'my_type', id: '1', body: { title: 'Hello World!' }

Ruby

client.index( index='my_index', doc_type='my_type', id=1, body={'title':'Hello World!'} )

Python

$client->index([
  'index' => 'my_index',
  'type' => 'my_type',
  'id' => 1,
  'body' => [ 'title' => 'Hello World!' ]
]);

PHP

client.index({
  index: 'myindex',
  type: 'mytype',
  id: '1',
  body: { title: 'Hello World!' }
})
.then(function (response) {
  //...
})
.catch(function (error) {
  //...
});

JavaScript

setup:
  - do:
      index:
        index: test
        type: test
        id: testing_document
        body:
          body: Amsterdam meetup
  - do:
      indices.refresh: {}

---
"Basic tests for suggest API":

  - do:
      suggest:
        body:
          test_suggestion:
            text: "The Amsterdma meetpu"
            term:
              field: body

  - match: {test_suggestion.1.options.0.text: amsterdam}

https://github.com/elasticsearch/elasticsearch/blob/master/rest-api-spec/test/suggest/10_basic.yaml
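A test runner for this suite has to resolve dotted paths such as `test_suggestion.1.options.0.text` against the JSON response. The sketch below shows one way that could be done in Ruby. The `lookup` and `match?` helpers are hypothetical names, and the sample response is made up only to fit the path used in the test; it is not captured output.

```ruby
# Walk a dotted path through nested hashes and arrays.
# Segments are treated as array indexes whenever the current node is an Array.
def lookup(response, dotted_path)
  dotted_path.split('.').reduce(response) do |node, segment|
    case node
    when Array then node[Integer(segment)]
    when Hash  then node[segment]
    else raise KeyError, "cannot descend into #{node.inspect} with #{segment.inspect}"
    end
  end
end

# Evaluate one `match` step: every path must resolve to the expected value.
def match?(response, expectations)
  expectations.all? { |path, expected| lookup(response, path) == expected }
end

# Hypothetical response, shaped only to illustrate the lookup.
response = {
  'test_suggestion' => [
    { 'text' => 'the',       'options' => [] },
    { 'text' => 'amsterdma', 'options' => [{ 'text' => 'amsterdam' }] },
    { 'text' => 'meetpu',    'options' => [{ 'text' => 'meetup' }] }
  ]
}

puts match?(response, { 'test_suggestion.1.options.0.text' => 'amsterdam' })
```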

The Implementation Sketch
- Ruby and Python
- Focus on functionality and naming, not abstractions
- Easy to change and reason about
- Working code: no room for elaborate diagrams, endless speculation or abstract discussions

High Level Architecture

[Diagram: CLIENT, TRANSPORT, CONNECTION POOL, SELECTOR (RANDOM, ROUND ROBIN), SNIFFER, three CONNECTION boxes, HTTP LIBRARY 1, HTTP LIBRARY 2, HTTP LIBRARY 3]
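The layering named in the diagram can be sketched in a few lines of Ruby. This is an illustrative sketch under assumptions, not the structure of the actual gems: all class and method names below are hypothetical, the "HTTP library" is a stand-in lambda that performs no network I/O, and the sniffer is left out.

```ruby
# One node, reached through some HTTP library. `adapter` is any callable
# standing in for a real HTTP library binding (assumption for this sketch).
class Connection
  attr_reader :host

  def initialize(host, adapter)
    @host = host
    @adapter = adapter
  end

  def request(method, path)
    @adapter.call(method, "http://#{host}#{path}")
  end
end

# The two selection strategies named in the diagram.
class RoundRobinSelector
  def initialize
    @index = -1
  end

  def select(connections)
    @index = (@index + 1) % connections.size
    connections[@index]
  end
end

class RandomSelector
  def select(connections)
    connections.sample
  end
end

# Holds the connections and delegates the choice to a selector.
class ConnectionPool
  def initialize(connections, selector)
    @connections = connections
    @selector = selector
  end

  def next_connection
    @selector.select(@connections)
  end
end

# The transport picks a connection for every request.
class Transport
  def initialize(pool)
    @pool = pool
  end

  def perform_request(method, path)
    @pool.next_connection.request(method, path)
  end
end

# Fake "HTTP library" that only echoes what it was asked to do.
echo = ->(method, url) { "#{method} #{url}" }
connections = %w[localhost:9200 localhost:9201].map { |h| Connection.new(h, echo) }
transport = Transport.new(ConnectionPool.new(connections, RoundRobinSelector.new))

3.times { puts transport.perform_request('GET', '/_cluster/health') }
```

Keeping the selector and the HTTP adapter as separate, swappable objects is what lets the same transport work with different HTTP libraries and different node selection strategies.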

Tracer

client = Elasticsearch::Client.new trace: true

client.index index: 'my_index', type: 'my_type', id: '1', body: { title: 'Hello World!' }

curl -X PUT 'http://localhost:9200/my_index/my_type/1?pretty' -d '{ "title":"Hello World!" }'

# 2015-03-10T07:55:37-07:00 [201] (0.270s)
#
# {
#   "_index":"my_index",
#   "_type":"my_type",
#   "_id":"1",
#   "_version":1,
#   "created":true
#
# }
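The idea behind the tracer, logging each request in a form that can be pasted into a terminal, can be approximated as follows. This is a rough sketch, not the gem's tracer: `to_curl` is a hypothetical helper, it always appends `?pretty` (so it assumes the URL has no query string yet), and it does not escape single quotes in the body.

```ruby
require 'json'

# Render a request as a curl command line.
# Only suitable for simple payloads: quotes in the body are not escaped.
def to_curl(method, url, body = nil)
  command = "curl -X #{method} '#{url}?pretty'"
  command += " -d '#{JSON.generate(body)}'" if body
  command
end

puts to_curl('PUT', 'http://localhost:9200/my_index/my_type/1', { title: 'Hello World!' })
```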

Selector
- Customization for different cluster topologies
- Example: the “local rack” selector

http://www.rubydoc.info/gems/elasticsearch-transport#Connection_Selector

class RackIdSelector
  include Elasticsearch::Transport::Transport::Connections::Selector::Base

  def select(options={})
    connections.select do |c|
      # Try selecting the nodes with a `rack_id:x1` attribute first
      c.host[:attributes] && c.host[:attributes][:rack_id] == 'x1'
    end.sample || connections.to_a.sample
  end
end

Elasticsearch::Client.new hosts: ['x1.search.org', 'x2.search.org'], selector_class: RackIdSelector

Sniffer
- Make use of Elasticsearch's dynamic nature
- Reuse the cluster state information
- Add and remove nodes dynamically
- Reload the node list on failure or periodically

Elasticsearch::Client.new hosts: ['localhost:9200', 'localhost:9201'], reload_on_failure: true

http://www.rubydoc.info/gems/elasticsearch-transport#Reloading_Hosts
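Reloading boils down to turning cluster node information into a fresh host list and comparing it with the current one. The sketch below illustrates that step only. The payload shape is a simplifying assumption (each node carries a plain "http_address" of the form "host:port"); the real node information format differs between Elasticsearch versions. The `hosts_from` and `reload` helpers are hypothetical.

```ruby
# Extract "host:port" strings from a nodes-info style payload.
# ASSUMPTION: every node entry has a plain "http_address" like "host:port".
def hosts_from(nodes_info)
  nodes_info.fetch('nodes', {}).values
            .map { |node| node['http_address'] }
            .compact
            .sort
end

# Compare the current host list with a freshly sniffed one.
def reload(current_hosts, nodes_info)
  sniffed = hosts_from(nodes_info)
  { hosts: sniffed, added: sniffed - current_hosts, removed: current_hosts - sniffed }
end

# Hypothetical payload: the node on port 9201 left, a node on 9202 joined.
payload = {
  'nodes' => {
    'a1' => { 'http_address' => 'localhost:9200' },
    'c3' => { 'http_address' => 'localhost:9202' }
  }
}

p reload(['localhost:9200', 'localhost:9201'], payload)
```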

randomize_hosts
- By default, the client round-robins across the node list
- Prevent the “sequential load” effect in multi-process/threaded environments
- Why not [host1, host2].shuffle?
- Educate users about this fact and increase usability
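The “sequential load” effect can be illustrated with a small simulation: when many processes each create a client with the same ordered host list, every one of them sends its first request to the same node. The sketch below is illustrative only and does not show how the `randomize_hosts` option is implemented; `first_hosts` is a hypothetical helper and the fixed seed is there only to keep runs repeatable.

```ruby
# Simulate `process_count` freshly started clients and record which host
# each one would contact first.
def first_hosts(process_count, hosts, randomize: false, seed: 42)
  rng = Random.new(seed)
  (1..process_count).map do
    list = randomize ? hosts.shuffle(random: rng) : hosts
    list.first # round-robin starts at the head of the list
  end
end

hosts = %w[host1 host2 host3]

# Without randomization every process starts with host1.
p first_hosts(6, hosts).tally

# With a shuffled list per process the first requests are spread out.
p first_hosts(6, hosts, randomize: true).tally
```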

An Elasticsearch client is much more than a “just HTTP and JSON” wrapper

Thank you! Questions?