Elasticsearch for Pharo Smalltalk
Full-text search in Smalltalk
Sho Yoshida / @newapplesho SORABITO Inc.
2016/01/29, 84th Smalltalk Study Meeting (Smalltalk勉強会)
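About
• Sho Yoshida
• I work at SORABITO Inc.
• We build ALLSTOCKER (https://allstocker.com), an international online marketplace for working machinery
Thanks to our users, we receive traffic from more than 160 countries
These are the kinds of things we handle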
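From the 75th Smalltalk Study Meeting
• At the 75th Smalltalk Study Meeting, I listed "full-text search" as one of the things I wanted to do next
• http://www.smalltalk-users.jp/Home/gao-zhi/dai75kaismalltalkbenkyoukai
• We use RDS PostgreSQL, so full-text search in Japanese is not available
• We have to support full-text search in both Japanese and English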
https://www.elastic.co/products/elasticsearch
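What is Elasticsearch?
• A full-text search engine and analytics server based on Apache Lucene
• Schemaless and document-oriented
• Operated over REST
• Designed with clustering in mind, so basic setup is easy?
• Licensed under the Apache License v2
• Implemented in Java
• Supports facets and highlighted search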
Use cases
• GitHub
• foursquare
• SoundCloud
• stackoverflow
• ALLSTOCKER
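Full-text search and Elasticsearch
Elasticsearch uses an inverted index: a dictionary that maps each word to the IDs of the documents containing it, and searches are lookups in that dictionary.
For details, see:
https://speakerdeck.com/johtani/introduction-elasticsearch-and-elk-elasticsearchmian-qiang-hui-in-nagoya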
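The inverted index idea can be sketched in a few lines of plain Pharo. This is only a toy illustration, not how Elasticsearch or Lucene is implemented: the three documents and their IDs are made up, and splitting on spaces plus lowercasing stands in for a real analyzer.

```smalltalk
"A toy inverted index: each word maps to the set of IDs of the documents containing it.
 The documents and IDs below are made up for illustration."
| documents index lookup |
documents := Dictionary new.
documents at: 1 put: 'red tuna sushi'.
documents at: 2 put: 'fatty tuna belly'.
documents at: 3 put: 'grilled eel sushi'.

index := Dictionary new.
documents keysAndValuesDo: [ :id :text |
	(text substrings: ' ') do: [ :word |
		(index at: word asLowercase ifAbsentPut: [ Set new ]) add: id ] ].

"Searching is a dictionary lookup instead of a scan over every document."
lookup := [ :word |
	(index at: word asLowercase ifAbsent: [ Set new ]) asSortedCollection asArray ].

lookup value: 'tuna'.    "=> #(1 2)"
lookup value: 'sushi'.   "=> #(1 3)"
lookup value: 'salmon'.  "=> #()"
```

Evaluating the snippet in a Playground and inspecting `index` shows the word-to-IDs dictionary directly.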
Data structure
RDB = Database -> Tables -> Rows -> Columns
Elasticsearch = Index -> Types -> Documents -> Fields
They are fundamentally different things, so comparing them is a little odd, but...
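One place where this hierarchy becomes visible is the REST path: in Elasticsearch 1.x a single document is addressed as /index/type/id. The sketch below only builds that URL as a string. The host and port assume the local default instance used in this talk, and the index name, type name and document ID are the ones that appear on the later slides.

```smalltalk
"Build the REST address of one document from its index, type and ID.
 Assumes a local Elasticsearch 1.x instance on the default port."
| baseUrl indexName typeName documentId documentUrl |
baseUrl := 'http://localhost:9200'.
indexName := 'st_study'.
typeName := 'store'.
documentId := 'AVKMOVs3-FeOW1ziNoOb'.

documentUrl := baseUrl , '/' , indexName , '/' , typeName , '/' , documentId.
documentUrl.   "=> 'http://localhost:9200/st_study/store/AVKMOVs3-FeOW1ziNoOb'"
```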
Elasticsearch for Pharo Smalltalk
• A fork of the Elasticsearch project created by Paul DeBruicker
• http://ss3.gemtalksystems.com/ss/Elasticsearch.html
• Authors: @umejava, @newapplesho
• The latest repository is on GitHub
• https://github.com/newapplesho/elasticsearch-smalltalk
• Elasticsearch 1.x is supported; 2.x is not yet supported
Elasticsearch for Pharo Smalltalk
• Improved on the original, which was hard to extend
• Reworked the Aggregation and Query classes
Installing Elasticsearch
1. wget https://download.elasticsearch.org/….
2. tar -xf elasticsearch-1.7.2.tar.gz
3. ./elasticsearch-1.7.2/bin/elasticsearch
Installing Kuromoji
A morphological analysis engine for Japanese
bin/plugin install elasticsearch/elasticsearch-analysis-kuromoji/2.5.0
Installing the Elasticsearch-inquisitor plugin
bin/plugin -install polyfractal/elasticsearch-inquisitor
http://localhost:9200/_plugin/inquisitor/#/
Elasticsearch-inquisitor: a plugin that lets you run queries from a GUI
Checking that it is running
$ curl localhost:9200
{
  "status" : 200,
  "name" : "Lilandra Neramani",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "1.7.2",
    "build_hash" : "e43676b1385b8125d647f593f7202acbd816e8ec",
    "build_timestamp" : "2015-09-14T09:49:53Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  },
  "tagline" : "You Know, for Search"
}
Installing Elasticsearch for Pharo Smalltalk
Gofer new
    url: 'http://ss3.gemtalksystems.com/ss/Elasticsearch';
    package: 'ConfigurationOfElasticsearch';
    load.
(Smalltalk at: #ConfigurationOfElasticsearch) load.
or
Metacello new
    baseline: 'Elasticsearch';
    repository: 'github://newapplesho/elasticsearch-smalltalk:v1.1.3/pharo-repository';
    load.
Creating the index and configuring the mapping
curl -XPUT 'localhost:9200/st_study' -d @sushi.json
Checking that Kuromoji works
curl -XPOST 'http://localhost:9200/st_study/_analyze?analyzer=analyzer&pretty=true' -d '油圧ショベルは建設機械'
Result of the Kuromoji check
{
  "tokens" : [
    { "token" : "油圧", "start_offset" : 0, "end_offset" : 2, "type" : "word", "position" : 1 },
    { "token" : "ショベル", "start_offset" : 2, "end_offset" : 6, "type" : "word", "position" : 2 },
    { "token" : "は", "start_offset" : 6, "end_offset" : 7, "type" : "word", "position" : 3 },
    { "token" : "建設", "start_offset" : 7, "end_offset" : 9, "type" : "word", "position" : 4 },
    { "token" : "機械", "start_offset" : 9, "end_offset" : 11, "type" : "word", "position" : 5 }
  ]
}
Sample Data
"properties": {
  "title": { "type": "string", "store": "yes", "index": "not_analyzed" },
  "description": { "type": "string", "store": "yes", "index": "analyzed" },
  "price": { "type": "integer", "store": "yes" }
}
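The slides show only the "properties" part of the mapping, so here is a hedged sketch of where it could sit in a complete index-creation body. This is not the actual sushi.json from the talk. The assumptions are: the custom analyzer is named "analyzer" (to match the `analyzer=analyzer` parameter in the Kuromoji check), it uses the Kuromoji plugin's `kuromoji_tokenizer`, and the mapping type is "store" (the type used when adding documents). The body is built as nested dictionaries and rendered with STON, which ships with Pharo.

```smalltalk
"Hypothetical index-creation body; only the 'properties' part comes from the slides."
| analysis properties body json |
analysis := { 'analyzer' -> { 'analyzer' -> {
	'type' -> 'custom'.
	'tokenizer' -> 'kuromoji_tokenizer' } asDictionary } asDictionary } asDictionary.

properties := {
	'title' -> { 'type' -> 'string'. 'store' -> 'yes'. 'index' -> 'not_analyzed' } asDictionary.
	'description' -> { 'type' -> 'string'. 'store' -> 'yes'. 'index' -> 'analyzed' } asDictionary.
	'price' -> { 'type' -> 'integer'. 'store' -> 'yes' } asDictionary } asDictionary.

body := {
	'settings' -> { 'analysis' -> analysis } asDictionary.
	'mappings' -> { 'store' -> { 'properties' -> properties } asDictionary } asDictionary } asDictionary.

"Render as JSON text, which could be saved to a file and passed to curl with -d @file."
json := STON toJsonStringPretty: body.
```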
Sample (Seaside Example Sushi Store)
#('Akami Maguro' 'Red Tuna' 'The lean meat near the spine of the tuna fish. It comes in various shades of red--with the lighter, shinier varieties being the best. For dieters, however, the redder the better. Easy on the palatte. The least expensive of the three types of maguro.' 150)
Adding a document
"index is an ESIndex, obtained with ESIndex indexNamed: 'st_study' as in the search examples"
neta := JsonObject new.
neta
    title: 'Aji';
    description: 'This fish is pink-grey and shiny. When it''s fresh, the flesh is almost transparent. The texture is slippery and easy on the tongue--it should melt in your mouth. Aji is often eaten with soy sauce containing onion, ginger and garlic.'.
esDocument := ESDocument new
    type: 'store';
    content: neta;
    yourself.
index addDocument: esDocument.
Deleting a document
esDocument := ESDocument new
    id: 'AVKMOVs3-FeOW1ziNoOb';
    type: 'store';
    content: neta;
    yourself.
esDocument deleteFromIndex: index.
Deleting the index
index delete.
Searching all documents
"Match All"
index := ESIndex indexNamed: 'st_study'.
search := ESSearch new index: index.
query := ESMatchAllQuery new.
search query: query.
results := search search.
results explore.
Searching all documents (paging)
"Match All"
index := ESIndex indexNamed: 'st_study'.
search := ESSearch new index: index.
query := ESMatchAllQuery new.
search query: query.
results := search searchFrom: 0 size: 2.
results explore.
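The paging call takes an offset and a page size, so moving through pages is a matter of computing the offset. A minimal sketch of that arithmetic follows; the block name `fromForPage` and the 1-based page numbers are illustrative choices, not part of the library.

```smalltalk
"Translate a 1-based page number into the offset passed as the 'from' argument.
 pageSize matches the size: 2 used on the slide."
| pageSize fromForPage |
pageSize := 2.
fromForPage := [ :page | (page - 1) * pageSize ].

fromForPage value: 1.   "=> 0, the searchFrom: 0 size: 2 call shown on the slide"
fromForPage value: 2.   "=> 2"
fromForPage value: 3.   "=> 4"
```

The result would then be passed along as `search searchFrom: (fromForPage value: page) size: pageSize`.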
Match Query
"Match"
index := ESIndex indexNamed: 'st_study'.
search := ESSearch new index: index.
query := ESMatchQuery new.
query query: 'aji'.
search query: query.
results := search search.
results explore.
Term Query
"ESTermQuery"
index := ESIndex indexNamed: 'st_study'.
search := ESSearch new index: index.
query := ESTermQuery new
    field: 'title';
    query: 'Aji';
    yourself.
search query: query.
results := search search.
results explore.
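The match example searches for 'aji' while the term example searches for 'Aji', and the difference comes from the mapping: title is not_analyzed, description is analyzed. The sketch below imitates the two behaviours in plain Pharo. It is a rough simplification, not the real analyzer: splitting on spaces and lowercasing stands in for the analysis step, and the shortened description text is illustrative.

```smalltalk
| title description termMatches tokens matchMatches |
title := 'Aji'.
description := 'Aji is often eaten with soy sauce'.

"not_analyzed field: the indexed term is the whole value, unchanged,
 so a term query has to equal it exactly, including case."
termMatches := [ :queryString | title = queryString ].

"analyzed field: roughly, the text is split into lowercased tokens,
 and a match query sends its query string through the same steps."
tokens := ((description substrings: ' ') collect: [ :each | each asLowercase ]) asSet.
matchMatches := [ :queryString | tokens includes: queryString asLowercase ].

termMatches value: 'Aji'.    "=> true"
termMatches value: 'aji'.    "=> false"
matchMatches value: 'aji'.   "=> true"
matchMatches value: 'AJI'.   "=> true"
```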
Sorting
"sort"
index := ESIndex indexNamed: 'st_study'.
search := ESSearch new index: index.
query := ESTermQuery new
    field: 'title';
    query: 'Aji';
    yourself.
sort := ESSortCriteria new
    fieldName: 'title';
    sortDescending;
    yourself.
search query: query.
search addSortCriteria: sort.
results := search search.
results explore.
Aggregations
"min Aggregations"
index := ESIndex indexNamed: 'st_study'.
search := ESSearch new index: index.
query := ESMatchAllQuery new.
aggregation := ESMinAggregation new field: 'price'.
search query: query.
search addAggregation: aggregation.
result := search aggregate.
Aggregations
"max Aggregations"
index := ESIndex indexNamed: 'st_study'.
search := ESSearch new index: index.
query := ESMatchAllQuery new.
aggregation := ESMaxAggregation new field: 'price'.
search query: query.
search addAggregation: aggregation.
result := search aggregate.
Aggregations
"avg Aggregations"
index := ESIndex indexNamed: 'st_study'.
search := ESSearch new index: index.
query := ESMatchAllQuery new.
aggregation := ESAvgAggregation new field: 'price'.
search query: query.
search addAggregation: aggregation.
result := search aggregate.
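To make clear what the three aggregations compute, here is the same arithmetic done locally over a small collection. The price list is hypothetical: only 150 appears in the sample data, the other values are made up. In the real setup Elasticsearch computes these values on the server over the documents matched by the query.

```smalltalk
"min, max and avg over the 'price' field, computed by hand.
 Hypothetical prices; only 150 comes from the sample data."
| prices minPrice maxPrice avgPrice |
prices := #(150 200 250).

minPrice := prices inject: prices first into: [ :smallest :each | smallest min: each ].
maxPrice := prices inject: prices first into: [ :largest :each | largest max: each ].
avgPrice := (prices inject: 0 into: [ :sum :each | sum + each ]) / prices size.

minPrice.   "=> 150"
maxPrice.   "=> 250"
avgPrice.   "=> 200"
```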
DEMO
Future plans
• Support for Elasticsearch 2.0 is planned
Everything is ready. Now let's write some Smalltalk.
paul bica https://www.flickr.com/photos/dexxus/5820866907/