elasticsearch for data mining

Post on 17-Jul-2015

ElasticSearch

Wm. Barrett Simms

barrett@wbsimms.com

@wbsimms

About Me

Software Developer

Agile Team Member

Team Lead

Agile Advocate

SDLC Implementer

SDLC

Big Data

“Big data is an all-encompassing term for any collection of data sets so large and complex that it becomes difficult to process using traditional data processing applications.”

- Wikipedia

The 3 Vs

• Volume
  • A few Gigabytes -> Petabyte

• Velocity
  • Arrives quickly

• Variety
  • Multiple types of data

What is ElasticSearch?

• You know, for search…

• Elasticsearch is a search server based on Lucene. It provides a distributed, multitenant-capable full-text search engine with a RESTful web interface and schema-free JSON documents. Elasticsearch is developed in Java and is released as open source under the terms of the Apache License.

Let’s break that down…

• Distributed
  • Run on multiple servers simultaneously

• Multitenant
  • The same system serving different groups of data

• REST
  • Web-based programming interface

• NoSQL for storage
  • Uses JSON

• Open Source
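The REST and JSON points above can be made concrete with a small sketch. It builds, but does not send, the HTTP request that would store one JSON document. The host and port (`localhost:9200`, the default HTTP port), the index name `talks`, the type name `slide`, and the document fields are all assumptions chosen for illustration. The `/index/type/id` path follows the terminology used in these slides; newer releases have moved away from types, so treat the path as an assumption too.

```python
import json
from urllib.request import Request

# Assumption: a node listening on the default HTTP port of a local machine.
BASE_URL = "http://localhost:9200"


def build_index_request(index, doc_type, doc_id, document):
    """Build (but do not send) a PUT request that stores one JSON document."""
    url = f"{BASE_URL}/{index}/{doc_type}/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Schema-free: the document is just a dictionary of fields (hypothetical data).
doc = {"speaker": "Barrett Simms", "topic": "ElasticSearch"}
request = build_index_request("talks", "slide", 1, doc)

print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing `request` to `urllib.request.urlopen` would send it to a running node; the sketch stops short of that so it can be read and run without a server.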

So what is ElasticSearch?

• It’s a search engine

• Stores data on multiple machines

• Stores multiple types of data

• Stores in JSON format

• REST interface
  • There are managed and unmanaged programming interfaces

• .NET • Java • Node.js • JavaScript • Scala • Clojure

• PHP • Perl • Python • Ruby • Haskell • Erlang

• ColdFusion • Smalltalk • OCaml • Command line • EventMachine • Go
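Whichever client library is used, a search ends up as a JSON body sent to the REST interface. The sketch below builds the URL and body for a full-text `match` query. The index name `talks`, the field name `topic`, and the base URL are hypothetical; nothing is sent over the network.

```python
import json
from urllib.parse import quote


def build_search(index, field, text, base_url="http://localhost:9200"):
    """Return the URL and JSON body for a full-text `match` query."""
    url = f"{base_url}/{quote(index)}/_search"
    body = {"query": {"match": {field: text}}}
    return url, json.dumps(body)


# Hypothetical index and field names, used only for illustration.
url, body = build_search("talks", "topic", "search engine")
print(url)
print(body)
```

A client library mostly wraps this step: it serializes the query, sends it to the `_search` endpoint, and parses the JSON response.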

Administration Tools

• CURL
  • Command-line REST interface

• Marvel

Definitions

• Cluster
  • One or more nodes

• Document
  • A stored record

• Field
  • A document has a list of fields, or key-value pairs

• Index
  • Think of this as a database

• Term
  • An exact value to be matched: “FOO”, “Foo”, and “foo” are not the same term

• Type
  • Similar to a database table

• Text
  • A field value
  • Analyzed into terms
  • The terms are what is stored in the index
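The relationship between text, terms, and the index can be sketched with a toy example. The analyzer below only lowercases and splits on non-alphanumeric characters; it is a simplified stand-in, not the analyzer Lucene or ElasticSearch actually uses, and the documents are made up.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer: lowercase the text and split it into word terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents, field):
    """Map each term to the set of document ids whose field contains it."""
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        for term in analyze(doc[field]):
            index[term].add(doc_id)
    return index


# Hypothetical documents: each one is a set of field/value pairs.
documents = {
    1: {"title": "Foo fighters"},
    2: {"title": "The FOO bar"},
}
index = build_index(documents, "title")

# Term lookup is exact: only the lowercased term "foo" was stored.
print(sorted(index.get("foo", set())))  # [1, 2]
print(sorted(index.get("Foo", set())))  # []
```

This is why a term is an exact value: the text "Foo" and "FOO" both became the term "foo" at analysis time, so a lookup has to use that same form to match.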

ElasticSearch Resources

• ElasticSearch
  • elasticsearch.org

• ElasticSearch NEST
  • .NET client
  • nest.azurewebsites.net

Installation

• Get the binaries

• Unzip

• Run elasticsearch.bat
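Once the batch file is running, one way to confirm the node is up is to request its root URL over HTTP. The sketch below assumes the default HTTP port 9200 on the local machine; `node_url` and `is_running` are hypothetical helper names, not part of any ElasticSearch tooling.

```python
import http.client
import json
from urllib.request import urlopen


def node_url(host="localhost", port=9200):
    """Root URL of a node; 9200 is the default HTTP port (an assumption here)."""
    return f"http://{host}:{port}/"


def is_running(host="localhost", port=9200, timeout=2.0):
    """Return True if something answers with JSON at the node's root URL."""
    try:
        with urlopen(node_url(host, port), timeout=timeout) as response:
            json.loads(response.read().decode("utf-8"))
            return True
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return False


print("node up:", is_running())
```

The check only tells you that a JSON-speaking HTTP service answered on that port, which is usually enough right after a local install.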

Contact Me

Barrett Simms

barrett@wbsimms.com

http://wbsimms.com

Twitter: @wbsimms

Phone: 781.405.4686
