piwik elasticsearch kibana at osc tokyo 2016 spring
Piwik fluentd
YAMAMOTO [email protected]
@yamachan5593
Piwik Japan Team
Feb 27th, 2016 at Open Source Conference
Tokyo
• OpenSolaris: https://osdn.jp/projects/jposug/
• Piwikjapan /OSC: https://osdn.jp/projects/piwik-fluentd/
• Piwik: sample Piwik tracker request in the web server access log
125.54.155.180 - - [21/Feb/2016:08:46:13 +0900] "GET
/piwik.php?action_name=example.com%2F%E5%A0%B1%E5%91
&idsite=1&rec=1&r=047899&h=23&m=46&s=16
&url=http%3A%2F%2Fjpvlad.com%2Findex.php%3Ftopic%3Deventresult_ja
&_id=4e5ded8520370239&_idts=1435710334&_idvc=387
&_idn=0&_refts=0&_viewts=1455979574&send_image=0
&pdf=1&qt=0&realp=1&wma=1&dir=1&fla=1&java=1&gears=0
&ag=1&cookie=1&res=1366x768 HTTP/1.1" 204 -
"http://jpvlad.com/index.php?topic=eventresult_ja"
"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36
(KHTML, like Gecko) Chrome/28.0.1500.63 Safari/537.36"
elasticsearch kibana
Piwik Tracker Piwik
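• From the web server log: host (IP address), user agent, referer
• From the Piwik Tracker request:
  • idsite: Piwik website ID
  • action_name: web page title
  • _id: visitor ID
  • res: screen resolution
  • pdf: can the browser display PDF?
  • java: is Java available?
  • fla: is Flash available?
  • cookie: are cookies enabled?
  • _viewts: timestamp of the previous visit
• See "Supported Query Parameters" [1]
[1] http://developer.piwik.org/api-reference/tracking-api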
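The parameter list above can be illustrated with a short Python sketch that splits a tracker request into its query parameters. The request string is shortened from the sample log line; because the action_name value is truncated on the slide, a placeholder value is used here, and the function name parse_tracker_request is hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

# Request target shortened from the sample access log line.
# action_name is a placeholder because the slide truncates the real value.
request_path = (
    "/piwik.php?action_name=example.com&idsite=1&rec=1"
    "&url=http%3A%2F%2Fjpvlad.com%2Findex.php%3Ftopic%3Deventresult_ja"
    "&_id=4e5ded8520370239&_viewts=1455979574"
    "&pdf=1&java=1&fla=1&cookie=1&res=1366x768"
)


def parse_tracker_request(path):
    """Return the tracker query parameters as a flat dict of strings."""
    query = urlsplit(path).query
    # parse_qs returns a list per key; each key appears once here.
    return {key: values[0] for key, values in parse_qs(query).items()}


params = parse_tracker_request(request_path)
print(params["idsite"], params["_id"], params["res"])
print(params["url"])
```

In the pipeline described in this deck the same extraction is done inside fluentd, not in Python; the sketch only shows which values are available in one request.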
1. Piwik, fluentd, elasticsearch, kibana
2. Piwik Piwik
   • Piwik PHP
   • GET
3. Piwik fluentd elasticsearch
   • elasticsearch
   • fluentd URL decode
4. kibana elasticsearch
• RedHat7: CentOS7, Scientific Linux 7
• RedHat6: RedHat6
• RedHat6 · · · CentOS6, Scientific Linux 6
• Piwik
• Piwik Web [2]
• fluentd, elasticsearch, kibana
• Piwik
[2] http://www.piwikjapan.org/ /3985
fluentd ∼ 1
• fluentd td-agent
• td-agent 2.x 1.x
• ruby RPM
• fluentd ruby
• RedHat6: ruby 1.9.3
• RedHat7: ruby 2.0
• td-agent 2.x: ruby 2.2
• fluentd fluentd
• RPM elasticsearch
fluentd ∼ 2
• ruby 2.2.4
1. ruby RedHat
   • CentOS, Scientific Linux
   • 6 7
2. td-agent RPM
3. Install the tools needed to build an rpm from an SRPM
$ sudo yum groupinstall "Development tools"
4. Get ruby223.spec from "CentOS 6 ruby RPM" [3]
5. Start an RPM build and stop it with Ctrl+C; this creates ~/rpmbuild
$ rpmbuild -bp ruby223.spec    # then press Ctrl+C
$ mv ruby223.spec rpmbuild/SPECS/ruby224.spec    # rename 223 to 224
[3] http://www.torutk.com/projects/swe/wiki/CentOS 6 ruby RPM
fluentd ∼ 3
• ruby 2.2.4
1. Edit ~/rpmbuild/SPECS/ruby224.spec
%define rubyver 2.2.4
2. Download ruby-2.2.4.tar.bz2 from "Ruby 2.2.4" [4]
3. Put ruby-2.2.4.tar.bz2 in ~/rpmbuild/SOURCES
4. Build and install the RPM
$ cd ~/rpmbuild/SPECS
$ rpmbuild -ba ruby224.spec
$ sudo rpm -ivh \
~/rpmbuild/RPMS/x86_64/ruby-2.2.4-1.el7.x86_64.rpm
On RedHat6 the package name contains el6 instead of el7.
$ ruby -v
ruby 2.2.4p230 (2015-12-16 revision 53155) [x86_64-linux]
[4] https://www.ruby-lang.org/ja/news/2015/12/16/ruby-2-2-4-released/
fluentd ∼ 4
1. Enable the epel repository
• RedHat7
$ sudo yum install \
http://ftp-srv2.kddilabs.jp/Linux/distributions/fedora/epel/7/x86_64/e/epel-release-7-5.noarch.rpm
• RedHat6
$ sudo yum install \
http://ftp-srv2.kddilabs.jp/Linux/distributions/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm
2. Install the packages needed for the build
$ sudo yum install gecode gecode-devel fakeroot
fluentd ∼ 5
1. RedHat6: git
$ wget http://dl.marmotte.net/rpms/redhat/el6/x86_64/\
git-1.8.3.1-3.el6/git-1.8.3.1-3.el6.src.rpm
$ cp git-1.8.3.1-3.el6.src.rpm ~/rpmbuild/SRPMS/
$ rpmbuild --rebuild \
~/rpmbuild/SRPMS/git-1.8.3.1-3.el6.src.rpm
$ sudo yum install perl-TermReadKey
$ sudo rpm -ivh \
~/rpmbuild/RPMS/x86_64/git-1.8.3.1-3.el6.x86_64.rpm
• git 1.8 "-c"
• git 1.8
• epel
fluentd ∼ 6
• ruby fluentd
1. Install bundler
$ sudo gem install bundler
2. Clone from GitHub
$ cd ~
$ git clone \
git@github.com:treasure-data/omnibus-td-agent.git
$ cd ~/omnibus-td-agent
3. treasure-data/omnibus-td-agent [5]
multipart-post Gemfile
[5] https://github.com/treasure-data/omnibus-td-agent
fluentd ∼ 7
• multipart-post
• ~/omnibus-td-agent/Gemfile: gem 'pedump' · · · [6]
source 'https://rubygems.org'
# Use Berkshelf for resolving cookbook dependencies
gem 'berkshelf', '~> 3.0'
gem 'pedump', git: 'https://github.com/ksubrama/pedump',
  branch: 'patch-1'
# Install omnibus software
#gem 'omnibus', '~> 5.0'
[6] https://github.com/piwikjapan/omnibus-td-agent/blob/master/Gemfile
fluentd ∼ 8
• elasticsearch, record-reformer, norikra RPM
• norikra
• ~/omnibus-td-agent/plugin_gems.rb
download "fluent-plugin-norikra", "0.2.2"
download "fluent-plugin-elasticsearch", "1.3.0"
download "fluent-plugin-record-reformer", "0.8.0"
fluentd ∼ 9
• norikra
• norikra
• norikra-client msgpack-rpc-over-http rack 2.x 1.6.4
• ~/omnibus-td-agent/core_gems.rb
download "rack", "1.6.4"
download "norikra-client", "1.3.1"
fluentd ∼ 10
• Create the build directories [7]
$ sudo mkdir -p /opt/td-agent /var/cache/omnibus
$ sudo chown yamachan:yamachan /opt/td-agent
$ sudo chown yamachan:yamachan /var/cache/omnibus
• Replace yamachan:yamachan with your own user and group (check with id)
[7] https://github.com/treasure-data/omnibus-td-agent
fluentd ∼ 11:
1. Build the package [8]
$ cd ~/omnibus-td-agent
$ bundle install --binstubs
sudo
$ bin/gem_downloader core_gems.rb
$ bin/gem_downloader plugin_gems.rb
$ bin/omnibus build td-agent2
[8] https://github.com/treasure-data/omnibus-td-agent
fluentd ∼
1. Install the RPM created under pkg
$ cd ~/omnibus-td-agent/pkg
$ sudo yum install td-agent-2.3.1-0.el7.x86_64.rpm
2. RedHat6: td-agent-2.3.1-0.el6.x86_64.rpm
elasticsearch
1. RedHat7, RedHat6
$ sudo yum install \
https://download.elasticsearch.org/elasticsearch/\
release/org/elasticsearch/distribution/\
rpm/elasticsearch/2.2.0/elasticsearch-2.2.0.rpm
2. Install the kuromoji plugin
$ sudo /usr/share/elasticsearch/bin/plugin \
install analysis-kuromoji
kibana
1. Build the RPM
$ cd ~
$ git clone git@github.com:piwikjapan/kibana-rpm-packaging.git
$ cd kibana-rpm-packaging
$ cp kibana.sysconfig kibana.service ~/rpmbuild/SOURCES
$ cp kibana.spec ~/rpmbuild/SPECS
$ wget -P ~/rpmbuild/SOURCES \
https://download.elastic.co/kibana/kibana/\
kibana-4.4.1-linux-x64.tar.gz
$ rpmbuild -ba ~/rpmbuild/SPECS/kibana.spec
2. Install the RPM
$ sudo rpm -ivh ~/rpmbuild/RPMS/x86_64/\
kibana-4.4.1-1.x86_64.rpm
RedHat7
1. norikra 26578/tcp
$ sudo firewall-cmd --zone=public \
--add-port=26578/tcp --permanent # norikra web
$ sudo firewall-cmd --zone=public \
--add-port=5601/tcp --permanent # kibana web
$ sudo firewall-cmd --zone=public \
--add-port=24224/udp --permanent # fluentd heartbeat
$ sudo firewall-cmd --zone=public \
--add-port=24224/tcp --permanent # fluentd data
RedHat6
1. norikra 26578/tcp
2. In /etc/sysconfig/iptables, after the line
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
add
-A INPUT -m multiport -p tcp -m tcp --dports 26578,5601,24224 -j ACCEPT
-A INPUT -m multiport -p udp -m udp --dports 24224 -j ACCEPT
3. Reload iptables
$ sudo service iptables reload
td-agent
• Piwik elasticsearch, kibana
1. Piwik server elasticsearch server
2. Piwik server elasticsearch server forward
td-agent ∼ Piwik 1
• Piwik elasticsearch
• td-agent
• /etc/td-agent/td-agent.conf
• "Piwik elasticsearch" [10]
[10] https://osdn.jp/projects/piwik-fluentd/wiki/FrontPage
td-agent ∼ Piwik 2
• Piwik
• Piwik
• tag piwiktracker.apache.access
<source>
type tail
format apache
time_format %d/%b/%Y:%H:%M:%S %z
pos_file /var/log/td-agent/access_log.pos
path /var/log/httpd/access_log
tag piwiktracker.apache.access
</source>
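As a rough illustration of what the tail source with format apache and the time_format above extracts from each line, here is a Python sketch. The regular expression is a simplified stand-in written for this example, not the pattern fluentd itself uses, and the sample line is shortened from the log line shown earlier.

```python
import re
from datetime import datetime

# Simplified pattern for the Apache combined log format (an assumption
# for illustration; fluentd's built-in apache parser may differ).
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<code>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?$'
)

# Same format string as time_format in the <source> block.
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

line = (
    '125.54.155.180 - - [21/Feb/2016:08:46:13 +0900] '
    '"GET /piwik.php?action_name=example.com&idsite=1&rec=1 HTTP/1.1" 204 - '
    '"http://jpvlad.com/index.php?topic=eventresult_ja" '
    '"Mozilla/5.0 (Windows NT 6.1)"'
)


def parse_line(text):
    """Split one access log line into named fields, or return None."""
    match = LOG_PATTERN.match(text)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], TIME_FORMAT)
    return record


record = parse_line(line)
print(record["host"], record["code"], record["time"].isoformat())
```

The path field is the one the later filter and record_reformer steps work on.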
td-agent ∼ Piwik 3
• Piwik
• host
<match piwiktracker.apache.access>
  type forward
  send_timeout 60s
  recover_wait 300s
  heartbeat_interval 1s
  phi_threshold 16
  hard_timeout 60s
  <server>
    name fluentd
    # address of your elasticsearch server, e.g. 10.x.x.x
    host your_elasticsearch_server
    port 24224
    weight 100
  </server>
</match>
td-agent ∼ Piwik 4
• elasticsearch
• Tracker
1. Piwik
2. Piwik API
3. filter match piwiktracker.apache.access
<filter piwiktracker.apache.access>
  type grep
  regexp1 path /piwik\.php\?action_name=.*\&idsite=\d+
</filter>
<match piwiktracker.apache.access>
  type record_reformer
  tag piwiktracker.apache.access.urldecode
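The effect of the grep filter on this slide can be sketched in Python: only records whose path looks like a Piwik tracker request are kept. The sample records and the function name keep_tracker_records are made up for illustration.

```python
import re

# Same pattern as the regexp1 line of the grep filter.
TRACKER_PATTERN = re.compile(r"/piwik\.php\?action_name=.*&idsite=\d+")


def keep_tracker_records(records):
    """Keep only records whose path matches the tracker pattern."""
    return [r for r in records if TRACKER_PATTERN.search(r.get("path", ""))]


# Hypothetical records as they might come out of the apache parser.
records = [
    {"path": "/piwik.php?action_name=example.com&idsite=1&rec=1"},
    {"path": "/index.php?module=API&method=API.get"},
    {"path": "/piwik.js"},
]

kept = keep_tracker_records(records)
print(len(kept), kept[0]["path"])
```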
td-agent ∼ Piwik 5
• elasticsearch
• fluentd "Supported Query Parameters" [11]
• " " "id"
• piwiktracker.apache.access.urldecode
<match piwiktracker.apache.access>
  type record_reformer
  tag piwiktracker.apache.access.urldecode
  # website ID
  idsite ${path[/piwik\.php\?action_name=.*\&idsite=(\d+)/,1]}
  # visitor ID
  piwikid ${path[/piwik\.php\?action_name=.*\&_id=([a-z\d]+)/,1]}
  # is Flash available?
  fla ${path[/piwik\.php\?action_name=.*\&fla=(\d+)/,1] == "1" ? "true" : "false" }
</match>
[11] http://developer.piwik.org/api-reference/tracking-api
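The three record_reformer expressions on this slide each pull one capture group out of the request path. A minimal Python equivalent, assuming the same three fields and a shortened sample path (the function name reform is hypothetical), looks like this:

```python
import re


def reform(path):
    """Extract idsite, piwikid and fla from a tracker request path."""

    def first_group(pattern):
        # Mirrors Ruby's str[/regex/, 1]: the first capture group or None.
        match = re.search(pattern, path)
        return match.group(1) if match else None

    return {
        "idsite": first_group(r"/piwik\.php\?action_name=.*&idsite=(\d+)"),
        "piwikid": first_group(r"/piwik\.php\?action_name=.*&_id=([a-z\d]+)"),
        "fla": "true"
        if first_group(r"/piwik\.php\?action_name=.*&fla=(\d+)") == "1"
        else "false",
    }


# Shortened from the sample log line; action_name is a placeholder.
sample_path = (
    "/piwik.php?action_name=example.com&idsite=1&rec=1"
    "&_id=4e5ded8520370239&_idts=1435710334&fla=1&java=1"
)
fields = reform(sample_path)
print(fields)
```

Note that `&_id=` does not match `&_idts=`, because the pattern requires the equals sign directly after `_id`.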
td-agent ∼ Piwik 6
• elasticsearch
• Decode the URL-encoded fields in fluentd
• piwiktracker.apache.access.store
<match piwiktracker.apache.access.urldecode>
type uri_decode
tag piwiktracker.apache.access.store
key_names action_name,ref,url,urlref
</match>
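What the uri_decode step does to the fields named in key_names can be sketched in Python as follows. The sample record is an assumption: its url value comes from the sample log line, and its action_name is a shortened placeholder because the slide truncates the real value.

```python
from urllib.parse import unquote

# Same field names as key_names in the uri_decode block.
KEY_NAMES = ("action_name", "ref", "url", "urlref")


def uri_decode(record, key_names=KEY_NAMES):
    """Return a copy of record with the listed fields percent-decoded."""
    decoded = dict(record)
    for key in key_names:
        if key in decoded:
            decoded[key] = unquote(decoded[key])
    return decoded


encoded_record = {
    "idsite": "1",
    "action_name": "example.com%2F%E5%A0%B1",
    "url": "http%3A%2F%2Fjpvlad.com%2Findex.php%3Ftopic%3Deventresult_ja",
}
decoded_record = uri_decode(encoded_record)
print(decoded_record["url"])
```

Fields not listed in key_names, such as idsite, pass through unchanged.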
td-agent ∼ Piwik 7:
• elasticsearch
• store elasticsearch
<match piwiktracker.apache.access.store>
type copy
<store>
type elasticsearch
type_name access_log
host 127.0.0.1
port 9200
logstash_format true
logstash_prefix apache-log
logstash_dateformat %Y%m%d
include_tag_key true
tag_key @log_name
flush_interval 10s
</store>
</match>
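With logstash_format true, the index name is built from logstash_prefix and logstash_dateformat. The Python sketch below shows the resulting name for the sample log line. Whether the date is taken in UTC or local time depends on the plugin configuration; UTC is assumed here as the default, so treat that part as an assumption to verify against the fluent-plugin-elasticsearch documentation. The function name logstash_index is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def logstash_index(event_time, prefix="apache-log", dateformat="%Y%m%d", utc=True):
    """Build an index name of the form <prefix>-<formatted date>."""
    if utc:
        # Assumption: the date part is computed in UTC by default.
        event_time = event_time.astimezone(timezone.utc)
    return f"{prefix}-{event_time.strftime(dateformat)}"


# Timestamp of the sample log line: 21/Feb/2016:08:46:13 +0900
jst = timezone(timedelta(hours=9))
event_time = datetime(2016, 2, 21, 8, 46, 13, tzinfo=jst)

index_utc = logstash_index(event_time)
index_local = logstash_index(event_time, utc=False)
print(index_utc, index_local)
```

The two results differ by one day because 08:46 JST still falls on the previous day in UTC.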
td-agent ∼ Piwik 1
• Piwik elasticsearch
• td-agent
• /etc/td-agent/td-agent.conf
• " "
• "Piwik elasticsearch" [12]
[12] https://osdn.jp/projects/piwik-fluentd/wiki/FrontPage
td-agent ∼ Piwik 2:
• Piwik elasticsearch
• " "
• " " Piwik forward
<source>
tag piwiktracker.apache.access
</source>
<match piwiktracker.apache.access>
tag piwiktracker.apache.access.urldecode
</match>
<match piwiktracker.apache.access.urldecode>
tag piwiktracker.apache.access.store
</match>
<match piwiktracker.apache.access.store>
</match>
elasticsearch 2 ∼
• Elasticsearch supports the following simple field types [13]:
  • String: string
  • Whole number: byte, short, integer, long
  • Floating-point: float, double
  • Boolean: boolean
  • Date: date
[13] https://www.elastic.co/guide/en/elasticsearch/guide/current/mapping-intro.html
elasticsearch 3 ∼
• Json [14]
• "elasticsearch mapping" [16]
[14] MySQL elasticsearch
[16] https://osdn.jp/projects/piwik-fluentd/wiki/elasticsearch#h2-elasticsearch.20.E3.81.AE.20mapping.20.E8.A8.AD.E5.AE.9A
elasticsearch 4 ∼ Json
• "template": "apache-log-*", [17]
  The mapping applies to the indices named by td-agent.conf:
  logstash_prefix apache-log
  logstash_dateformat %Y%m%d
  that is, "apache-log-" followed by the date
• "settings": { index
  kuromoji: "Elasticsearch kuromoji" [18]
[17] DB
[18] http://tech.gmo-media.jp/post/70245090007/elasticsearch-kuromoji-japanese-fulltext-search
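How the "apache-log-*" template pattern relates to the index names produced by td-agent can be sketched with a simple wildcard match in Python. fnmatch is used here only as an approximation of Elasticsearch's wildcard matching (fnmatch also gives special meaning to ? and [ ], which index names here do not contain); the index names are examples, not taken from the slides.

```python
from fnmatch import fnmatchcase

# Pattern from the "template" key of the mapping JSON.
TEMPLATE_PATTERN = "apache-log-*"


def template_applies(index_name, pattern=TEMPLATE_PATTERN):
    """Return True if the index name matches the template pattern."""
    return fnmatchcase(index_name, pattern)


# Example index names (assumed for illustration).
matches_daily_index = template_applies("apache-log-20160221")
matches_other_index = template_applies("nginx-log-20160221")
print(matches_daily_index, matches_other_index)
```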
elasticsearch 5 ∼ Json
• "mappings": { "access_log": {
  "access_log" corresponds to type_name access_log in td-agent.conf [19]
[19] "_default_"
elasticsearch 6 ∼ Json
• _source, _all ("enabled" is "false" in this example; the slide also notes true)
"mappings": {
  "access_log": {
    "_source": {
      "enabled": "false"
    },
    "_all": {
      "enabled": "false"
    },
elasticsearch 7 ∼ Json
• "@log_name": see tag_key in td-agent.conf
"mappings": {
  "access_log": {
    "properties": {
      "@log_name": {
        "type": "string",
        "store": "true",
        "index": "not_analyzed"
      },
elasticsearch 8 ∼ Json
• "ref": see td-agent.conf
"ref": {
  "type": "multi_field",
  "fields": {
    "ref": {
      "type": "string",
      "index": "analyzed",
      "store": "true"
    },
    "full": {
      "type": "string",
      "index": "not_analyzed",
      "store": "true"
    }
  }
},
elasticsearch 9: ∼ Json
"action_name": {
  "type": "string",
  "analyzer": "kuromoji_analyzer",
  "store": "true"
},
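The three field definitions shown on the last slides can be put together into one mapping document. The Python sketch below assembles only those fragments and serializes them as JSON; it is not the complete template from the project wiki. In particular the "settings" section that defines kuromoji_analyzer is not shown on the slides and is left out, so this output would need that section added before Elasticsearch could accept it.

```python
import json

# Field definitions copied from the slides (underscores restored).
properties = {
    "@log_name": {"type": "string", "store": "true", "index": "not_analyzed"},
    "ref": {
        "type": "multi_field",
        "fields": {
            "ref": {"type": "string", "index": "analyzed", "store": "true"},
            "full": {"type": "string", "index": "not_analyzed", "store": "true"},
        },
    },
    # kuromoji_analyzer must be defined under "settings" (not shown here).
    "action_name": {
        "type": "string",
        "analyzer": "kuromoji_analyzer",
        "store": "true",
    },
}

template = {
    "template": "apache-log-*",
    "mappings": {"access_log": {"properties": properties}},
}

template_json = json.dumps(template, indent=2)
print(template_json)
```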
• Start td-agent, elasticsearch, and kibana
# service td-agent start
# service elasticsearch start
# service kibana start
• kibana: http://your_elasticsearch_server:5601/