fluentd and embulk game server 4
![Page 1: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/1.jpg)
Masahiro Nakagawa
Apr 18, 2015
Game Server meetup #4
Fluentd / Embulk
For reliable transfer
![Page 2: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/2.jpg)
Who are you?
> Masahiro Nakagawa
  > github/twitter: @repeatedly
> Treasure Data, Inc.
  > Senior Software Engineer
  > Fluentd / td-agent developer
> Living at OSS :)
  > D language - Phobos committer
  > Fluentd - Main maintainer
  > MessagePack / RPC - D and Python (only RPC)
  > The organizer of several meetups (Presto, DTM, etc…)
  > etc…
![Page 3: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/3.jpg)
Structured logging !
Reliable forwarding !
Pluggable architecture
http://fluentd.org/
![Page 4: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/4.jpg)
What’s Fluentd?
> Data collector for unified logging layer
  > Streaming data transfer based on JSON
  > Written in Ruby
> Gem based various plugins
  > http://www.fluentd.org/plugins
> Working in production
  > http://www.fluentd.org/testimonials
![Page 5: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/5.jpg)
Background
![Page 6: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/6.jpg)
Data Analytics Flow
Collect Store Process Visualize
Data source
Reporting
Monitoring
![Page 7: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/7.jpg)
Data Analytics Flow
Store Process
Cloudera
Horton Works
Treasure Data
Collect Visualize
Tableau
Excel
R
easier & shorter time
???
![Page 8: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/8.jpg)
TD Service Architecture
Time to Value
Send query result Result Push
Acquire Analyze Store
Plazma DB Flexible, Scalable, Columnar Storage
Web Log
App Log
Sensor
CRM
ERP
RDBMS
Treasure Agent(Server) SDK(JS, Android, iOS, Unity)
Streaming Collector
Batch / Reliability
Ad-hoc /Low latency
KPI$
KPI Dashboard
BI Tools
Other Products
RDBMS, Google Docs, AWS S3, FTP Server, etc.
Metric Insights
Tableau, Motion Board, etc.
POS
REST API, ODBC / JDBC (SQL, Pig)
Bulk Uploader
Embulk,TD Toolbelt
SQL-based query
@AWS or @IDCF
Connectivity
Economy & Flexibility Simple & Supported
![Page 9: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/9.jpg)
Dive into Concept
![Page 10: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/10.jpg)
Divide & Conquer & Retry
error retry
error retry retry
retry
Batch
Stream
Other stream
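The "Divide & Conquer & Retry" idea can be sketched in a few lines: split the stream into small chunks, send each chunk on its own, and retry only the chunk that failed. This is a minimal illustration, not Fluentd's implementation; `split_into_chunks`, `send_with_retry`, `transfer` and their parameters are hypothetical names, and the exponential backoff policy is an assumption.

```python
import time
from typing import Callable, Iterable, List


def split_into_chunks(events: Iterable[dict], chunk_size: int) -> List[List[dict]]:
    """Divide a stream of events into fixed-size chunks (batches)."""
    chunks, current = [], []
    for event in events:
        current.append(event)
        if len(current) == chunk_size:
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks


def send_with_retry(chunk, send: Callable, max_retries=3, base_wait=0.01, sleep=time.sleep) -> bool:
    """Try to send one chunk; on IOError wait (exponential backoff) and retry."""
    for attempt in range(max_retries + 1):
        try:
            send(chunk)
            return True
        except IOError:
            if attempt == max_retries:
                return False
            sleep(base_wait * (2 ** attempt))
    return False


def transfer(events, send: Callable, chunk_size=2, **retry_opts) -> List[List[dict]]:
    """Send every chunk independently and return the chunks that still failed."""
    failed = []
    for chunk in split_into_chunks(events, chunk_size):
        if not send_with_retry(chunk, send, **retry_opts):
            failed.append(chunk)
    return failed
```

Because each chunk is retried separately, one error delays a small batch rather than the whole stream.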
![Page 11: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/11.jpg)
Application
・・・
Server2
Application
・・・
Server3
Application
・・・
Server1
FluentLog Server
High latency! Must wait for a day...
Before…
![Page 12: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/12.jpg)
Application
・・・
Server2
Application
・・・
Server3
Application
・・・
Server1
Fluentd Fluentd Fluentd
Fluentd Fluentd
In streaming!
After…
![Page 13: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/13.jpg)
Why JSON / MessagePack? (1)
> Schema on Write (Traditional MPP DB)
  > Data is written against a schema to improve query performance
> Pros
  > Minimum query overhead
> Cons
  > Schema and workload must be designed beforehand
  > Data load is an expensive operation
![Page 14: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/14.jpg)
Why JSON / MessagePack? (2)
> Schema on Read (Hadoop)
  > Data is written without a schema; the schema is mapped at query time
> Pros
  > Robust against schema and workload changes
  > Data load is a cheap operation
> Cons
  > High overhead at query time
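The trade-off between the two approaches can be made concrete with a small sketch. The `SCHEMA` table definition and all function names below are hypothetical and only illustrate the idea: schema on write validates and converts at load time, while schema on read appends raw JSON and does the mapping work when a query runs.

```python
import json

SCHEMA = {"user": int, "action": str}  # hypothetical table schema


def write_with_schema(table: list, record: dict) -> None:
    """Schema on write: validate and convert at load time, reject anything unexpected."""
    if set(record) != set(SCHEMA):
        raise ValueError(f"record does not match schema: {sorted(record)}")
    table.append(tuple(SCHEMA[name](record[name]) for name in SCHEMA))


def write_schemaless(log: list, record: dict) -> None:
    """Schema on read: store the raw JSON; loading is a cheap append."""
    log.append(json.dumps(record))


def read_column(log: list, name: str, cast=lambda v: v, default=None) -> list:
    """Schema on read: parse and map fields at query time (the query pays the cost)."""
    values = []
    for line in log:
        record = json.loads(line)
        values.append(cast(record[name]) if name in record else default)
    return values
```

A record with a new field breaks `write_with_schema` but is simply stored by `write_schemaless`, which is the robustness that the slide attributes to JSON-style records.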
![Page 15: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/15.jpg)
Features
![Page 16: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/16.jpg)
Core
> Divide & Conquer
> Buffering & Retrying
> Error handling
> Message routing
> Parallelism
Plugins
> Read / receive data
> Parse data
> Filter data
> Buffer data
> Format data
> Write / send data
![Page 17: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/17.jpg)
Core (Common Concerns)
> Divide & Conquer
> Buffering & Retrying
> Error handling
> Message routing
> Parallelism
Plugins (Use Case Specific)
> Read / receive data
> Parse data
> Filter data
> Buffer data
> Format data
> Write / send data
![Page 18: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/18.jpg)
Event structure (log message)
✓ Time
> default second unit
> from data source
✓ Tag
> for message routing
> where is it from?
✓ Record
> JSON format
> MessagePack internally
> schema-free
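As a rough illustration of the three parts of an event, the sketch below models an event as a (tag, time, record) triple and serializes it as a `[tag, time, record]` array. It uses JSON from the standard library so that it stays self-contained; Fluentd itself uses MessagePack internally, as the slide notes. `Event`, `make_event` and `to_json` are names invented for this example.

```python
import json
import time
from typing import NamedTuple, Optional


class Event(NamedTuple):
    tag: str      # used for routing; says where the event came from
    time: int     # Unix time in seconds
    record: dict  # schema-free key/value data


def make_event(tag: str, record: dict, now: Optional[int] = None) -> Event:
    """Build an event, defaulting the time to the current second."""
    return Event(tag, int(time.time()) if now is None else now, dict(record))


def to_json(event: Event) -> str:
    """Serialize as a [tag, time, record] array (JSON stand-in for MessagePack)."""
    return json.dumps([event.tag, event.time, event.record])
```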
![Page 19: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/19.jpg)
Architecture (v0.12 or later)
Engine (not pluggable)
Input
> Forward
> File tail
> ...
Parser
Filter
> grep
> record_transformer
> …
Buffer
> File
> Memory
Formatter
Output
> Forward
> File
> ...
![Page 20: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/20.jpg)
Configuration and operation
> No central / master node
  > @include helps configuration sharing
> Operation depends on your environment
  > Use your daemon / deploy tools
  > Treasure Data uses Chef
> Apache-like syntax
![Page 21: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/21.jpg)
How to use
![Page 22: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/22.jpg)
Setup fluentd (e.g. Ubuntu)
$ apt-get install ruby
$ gem install fluentd
$ edit fluent.conf
$ fluentd -c fluent.conf
http://docs.fluentd.org/articles/faq#w-what-version-of-ruby-does-fluentd-support
![Page 23: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/23.jpg)
Treasure Agent (td-agent)
> Treasure Data distribution of Fluentd
  > includes Ruby, popular plugins, etc.
> Treasure Agent 2 is the current stable
  > v2 is recommended over v1
  > rpm, deb and dmg
> Latest version is 2.2.0 with fluentd v0.12
![Page 24: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/24.jpg)
Setup td-agent
$ curl -L http://toolbelt.treasuredata.com/sh/install-redhat-td-agent2.sh | sh
$ edit /etc/td-agent/td-agent.conf
$ sudo service td-agent start
See: http://docs.fluentd.org/categories/installation
![Page 25: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/25.jpg)
Apache to Mongo
tail
insert
event buffering routing
127.0.0.1 - - [11/Dec/2014:07:26:27] "GET / ...
127.0.0.1 - - [11/Dec/2014:07:26:30] "GET / ...
127.0.0.1 - - [11/Dec/2014:07:26:32] "GET / ...
127.0.0.1 - - [11/Dec/2014:07:26:40] "GET / ...
127.0.0.1 - - [11/Dec/2014:07:27:01] "GET / ...
...
Fluentd
Web Server
2014-02-04 01:33:51 apache.log
{ "host": "127.0.0.1", "method": "GET", ... }
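The step from a raw access-log line to a structured event can be sketched as follows. This is only an illustration of what a tail input plus parser does conceptually: the regular expression, the `parse_line` name and the complete sample line used with it are assumptions, since the slide truncates its log lines.

```python
import calendar
import re
from datetime import datetime

LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*"'
)


def parse_line(line: str, tag: str = "apache.access"):
    """Turn one access-log line into (tag, time, record), or None if it does not parse."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    # Assumption: timestamps carry no zone offset (as on the slide) and are UTC.
    when = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S")
    record = {k: m.group(k) for k in ("host", "user", "method", "path")}
    return tag, calendar.timegm(when.timetuple()), record
```

Lines that do not match return `None`, so a caller can decide whether to drop them or route them elsewhere.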
![Page 26: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/26.jpg)
Plugins - use rubygems
$ fluent-gem search -rd fluent-plugin
$ fluent-gem search -rd fluent-mixin
$ fluent-gem install fluent-plugin-mongo
In td-agent: /usr/sbin/td-agent-gem install fluent-plugin-mongo
![Page 27: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/27.jpg)
# receive events via HTTP
<source>
  @type http
  port 8888
</source>

# read logs from a file
<source>
  @type tail
  path /var/log/httpd.log
  format apache
  tag apache.access
</source>

# save access logs to MongoDB
<match apache.access>
  @type mongo
  database apache
  collection log
</match>

# save alerts to a file
<match alert.**>
  @type file
  path /var/log/fluent/alerts
</match>

# forward other logs to servers
<match **>
  @type forward
  <server>
    host 192.168.0.11
    weight 20
  </server>
  <server>
    host 192.168.0.12
    weight 60
  </server>
</match>

@include http://example.com/conf
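Tag-based routing in a configuration like this one can be pictured as "first matching pattern wins". The sketch below is a simplified model, not Fluentd's matcher: it assumes `*` matches exactly one tag part and `**` matches zero or more parts, and it ignores other pattern forms. `tag_matches` and `route` are hypothetical helper names.

```python
from typing import List, Optional, Sequence, Tuple


def tag_matches(pattern: str, tag: str) -> bool:
    """Simplified tag matching: '*' is exactly one tag part, '**' is zero or more parts."""
    def match(p: Sequence[str], t: Sequence[str]) -> bool:
        if not p:
            return not t
        if p[0] == "**":
            # '**' may swallow 0..len(t) parts
            return any(match(p[1:], t[i:]) for i in range(len(t) + 1))
        if not t:
            return False
        return p[0] in ("*", t[0]) and match(p[1:], t[1:])
    return match(pattern.split("."), tag.split("."))


def route(rules: List[Tuple[str, str]], tag: str) -> Optional[str]:
    """Return the output of the first rule whose pattern matches, in rule order."""
    for pattern, output in rules:
        if tag_matches(pattern, tag):
            return output
    return None
```

With rules in the order `apache.access`, `alert.**`, `**`, the catch-all pattern only receives events that no earlier rule claimed.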
![Page 28: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/28.jpg)
Filter
> Apply filtering routine to event stream
> No more tag tricks!

v0.10:
<match access.**>
  @type record_reformer
  tag reformed.${tag}
</match>

<match reformed.**>
  @type growthforecast
</match>

v0.12:
<filter access.**>
  @type record_transformer
  …
</filter>

<match access.**>
  @type growthforecast
</match>
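The point of the v0.12 filter stage is that records can be changed or dropped without rewriting the tag, so the same `<match>` pattern still applies afterwards. A minimal sketch of such a chain follows; `grep`, `add_fields` and `apply_filters` are illustrative helpers written for this example and are not the actual plugin implementations.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, object]
Filter = Callable[[str, Record], Optional[Record]]


def grep(key: str, expected: str) -> Filter:
    """Keep only records whose `key` equals `expected`; drop the rest (return None)."""
    return lambda tag, record: record if record.get(key) == expected else None


def add_fields(**fields: object) -> Filter:
    """Return a copy of the record with extra fields added."""
    return lambda tag, record: {**record, **fields}


def apply_filters(filters: List[Filter], tag: str, record: Record) -> Optional[Record]:
    """Run filters in order; the tag never changes, so a later match still applies."""
    for f in filters:
        record = f(tag, record)
        if record is None:
            return None
    return record
```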
![Page 29: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/29.jpg)
Before
![Page 30: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/30.jpg)
After
or Embulk
![Page 31: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/31.jpg)
Nagios
MongoDB
Hadoop
Alerting
Amazon S3
Analysis
Archiving
MySQL
Apache
Frontend
Access logs
syslogd
App logs
System logs
Backend
Databases
buffering / processing / routing
M x N → M + N
![Page 32: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/32.jpg)
Roadmap
> v0.10 (old stable)
> v0.12 (current stable)
  > Filter / Label / At-least-once
> v0.14 (spring - early summer, 2015)
  > New plugin APIs, ServerEngine, Time…
> v1 (summer - fall, 2015)
  > Fix new features / APIs
https://github.com/fluent/fluentd/wiki/V1-Roadmap
![Page 33: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/33.jpg)
Use-cases
![Page 34: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/34.jpg)
Simple forwarding
![Page 35: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/35.jpg)
# logs from a file
<source>
  type tail
  path /var/log/httpd.log
  pos_file /tmp/pos_file
  format apache2
  tag backend.apache
</source>

# logs from client libraries
<source>
  type forward
  port 24224
</source>

# store logs to MongoDB
<match backend.*>
  type mongo
  database fluent
  collection test
</match>
![Page 36: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/36.jpg)
Client libraries
# Ruby (fluent-logger gem)
require 'fluent-logger'
Fluent::Logger::FluentLogger.open('myapp')
Fluent::Logger.post('login', {'user' => 38})
#=> 2014-12-11 07:56:01 myapp.login {"user":38}
> Ruby
> Java
> Perl
> PHP
> Python
> D
> Scala
> ...
![Page 37: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/37.jpg)
Less Simple Forwarding
- At-most-once / At-least-once
- HA (failover)
- Load-balancing
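The difference between at-most-once and at-least-once delivery shows up when either the data or its acknowledgement is lost. The simulation below is a generic sketch under stated assumptions and does not reproduce Fluentd's forward protocol; `LossyLink`, `deliver` and their parameters are invented for illustration.

```python
class LossyLink:
    """Simulated receiver behind a network that loses chosen transmissions.

    `lose_request` / `lose_ack` hold 1-based send counts whose request or
    acknowledgement is dropped. Both names are made up for this sketch.
    """

    def __init__(self, lose_request=(), lose_ack=()):
        self.lose_request = set(lose_request)
        self.lose_ack = set(lose_ack)
        self.received = []
        self.sends = 0

    def send(self, chunk_id: str) -> bool:
        """Return True only if the sender sees an acknowledgement."""
        self.sends += 1
        if self.sends in self.lose_request:
            return False                        # data never arrived
        self.received.append(chunk_id)          # data arrived ...
        return self.sends not in self.lose_ack  # ... but the ack may be lost


def deliver(link: LossyLink, chunk_id: str, at_least_once: bool, max_attempts: int = 3) -> int:
    """Send one chunk and return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        acked = link.send(chunk_id)
        if acked or not at_least_once:
            return attempt                      # at-most-once never resends
    raise RuntimeError("no acknowledgement after %d attempts" % max_attempts)
```

At-most-once can silently lose a chunk, while at-least-once can deliver the same chunk twice when only the acknowledgement is lost, so the receiving side has to tolerate duplicates.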
![Page 38: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/38.jpg)
All data
Near realtime and batch combo!
Hot data
![Page 39: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/39.jpg)
# logs from a file
<source>
  type tail
  path /var/log/httpd.log
  pos_file /tmp/pos_file
  format apache2
  tag web.access
</source>

# logs from client libraries
<source>
  type forward
  port 24224
</source>

# store logs to ES and HDFS
<match web.*>
  type copy
  <store>
    type elasticsearch
    logstash_format true
  </store>
  <store>
    type webhdfs
    host namenode
    port 50070
    path /path/on/hdfs/
  </store>
</match>
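The "near realtime and batch combo" relies on handing every event to more than one destination. A minimal sketch of that fan-out is shown below; `copy_output` is a hypothetical helper that only models the idea, and each store is just a callable.

```python
from typing import Callable, List

Store = Callable[[str, int, dict], None]


def copy_output(stores: List[Store]) -> Store:
    """Build an output that hands each event to every store, in order."""
    def emit(tag: str, time: int, record: dict) -> None:
        for store in stores:
            store(tag, time, dict(record))  # copy so stores cannot affect each other
    return emit
```

Each store receives its own copy of the record, so a store that adds fields for indexing does not change what the archive store writes.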
![Page 40: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/40.jpg)
CEP for Stream Processing
Norikra is a SQL based CEP engine: http://norikra.github.io/
![Page 41: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/41.jpg)
Container Logging
![Page 42: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/42.jpg)
Fluentd on Kubernetes / GCE
> Kubernetes
> Google Compute Engine
  > https://cloud.google.com/logging/docs/install/compute_install
![Page 43: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/43.jpg)
Treasure Data
Frontend
Job Queue
Worker
Hadoop
Presto
Fluentd
Applications push metrics to Fluentd (via local Fluentd)
Datadog for realtime monitoring
Treasure Data for historical analysis
Fluentd sums up data per minute (partial aggregation)
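Partial aggregation means reducing many raw metric events to one total per time window before sending them on. A minimal sketch, assuming events arrive as `(unix_time, metric_name, value)` tuples (a shape chosen for this example, not taken from the slides):

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def sum_per_minute(events: Iterable[Tuple[int, str, float]]) -> Dict[Tuple[int, str], float]:
    """Sum metric values into one total per (minute start, metric name)."""
    totals: Dict[Tuple[int, str], float] = defaultdict(float)
    for unix_time, name, value in events:
        minute_start = unix_time - unix_time % 60
        totals[(minute_start, name)] += value
    return dict(totals)
```

Downstream systems then receive one data point per metric per minute instead of every individual event.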
![Page 44: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/44.jpg)
hundreds of app servers
sends event logs
sends event logs
sends event logs
Rails app td-agent
td-agent
td-agent
Google Spreadsheet
Treasure Data
MySQL
Logs are available after several mins.
Daily/Hourly Batch
KPI visualization
Feedback rankings
Rails app
Rails app
✓ Unlimited scalability
✓ Flexible schema
✓ Realtime
✓ Less performance impact
Cookpad
✓ Over 100 RoR servers (2012/2/4)
![Page 45: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/45.jpg)
Slideshare
http://engineering.slideshare.net/2014/04/skynet-project-monitor-scale-and-auto-heal-a-system-in-the-cloud/
![Page 46: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/46.jpg)
Log Analysis System And its designs in LINE Corp. 2014 early
![Page 47: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/47.jpg)
Line BusinessConnect
http://developers.linecorp.com/blog/?p=3386
![Page 48: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/48.jpg)
Eco-system
![Page 49: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/49.jpg)
fluent-bit
> Made for Embedded Linux
  > OpenEmbedded & Yocto Project
  > Intel Edison, RasPi & BeagleBone Black boards
  > https://github.com/fluent/fluent-bit
> Standalone application or Library mode
> Built-in plugins
  > input: cpu, kmsg, output: fluentd
> First release at the end of Mar 2015
![Page 50: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/50.jpg)
fluentd-forwarder
> Forwarding agent written in Go
  > Focuses on log forwarding to Fluentd
  > Works on Windows
> Bundles TCP input/output and TD output
  > No flexible plugin mechanism
  > We plan to add some inputs/outputs
> Similar products
  > fluent-agent-lite, fluent-agent-hydra, ik
![Page 51: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/51.jpg)
fluentd-ui
> Manage Fluentd instance via Web UI
  > https://github.com/fluent/fluentd-ui
![Page 53: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/53.jpg)
The problems at Treasure Data
> Treasure Data Service on the Cloud
> Customers want to try Treasure Data, but
  > SEs write scripts to bulk load their data. Hard work :(
> Customers want to migrate their big data, but
  > Hard work :(
> Fluentd solved streaming data collection, but
  > bulk data loading is another problem.
![Page 54: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/54.jpg)
Embulk
> Bulk Loader version of Fluentd
> Pluggable architecture
  > JRuby, JVM languages
> High performance parallel processing
> Share your script as a plugin
  > https://github.com/embulk
![Page 55: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/55.jpg)
The problems of bulk load
> Data cleaning (normalization)
  > How to normalize broken records?
> Error handling
  > How to remove broken records?
> Idempotent retrying
  > How to retry without duplicated loading?
> Performance optimization
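Three of these problems (normalization, removing broken records, retrying without duplicates) can be shown together in a small sketch. This is one generic technique, a keyed upsert plus a reject list, and it is not a description of how Embulk or its plugins work; the record shape and the names `normalize` and `bulk_load` are assumptions for the example.

```python
def normalize(raw: dict) -> dict:
    """Clean one record; raise if it cannot be repaired."""
    return {"id": int(raw["id"]), "comment": str(raw.get("comment", "")).strip()}


def bulk_load(target: dict, raw_records) -> list:
    """Upsert cleaned records by id and return the broken ones instead of aborting."""
    rejected = []
    for raw in raw_records:
        try:
            record = normalize(raw)
        except (KeyError, ValueError, TypeError):
            rejected.append(raw)
            continue
        target[record["id"]] = record  # keyed write: loading twice leaves one copy
    return rejected
```

Because writes are keyed by `id`, running the same load again after a failure leaves the target unchanged, which is what makes the retry idempotent.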
![Page 56: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/56.jpg)
HDFS
MySQL
Amazon S3
Embulk
CSV Files
SequenceFile
Salesforce.com
Elasticsearch
Cassandra
Hive
Redis
✓ Parallel execution
✓ Data validation
✓ Error recovery
✓ Deterministic behaviour
✓ Idempotent retrying
Plugins Plugins
bulk load
http://www.embulk.org/plugins/
![Page 57: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/57.jpg)
![Page 58: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/58.jpg)
![Page 59: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/59.jpg)
How to use
![Page 60: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/60.jpg)
Setup embulk (e.g. Linux/Mac)
$ curl --create-dirs -o ~/.embulk/bin/embulk -L "http://dl.embulk.org/embulk-latest.jar"
$ chmod +x ~/.embulk/bin/embulk
$ echo 'export PATH="$HOME/.embulk/bin:$PATH"' >> ~/.bashrc
$ source ~/.bashrc
![Page 61: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/61.jpg)
Try example
$ embulk example ./try1
$ embulk guess ./try1/example.yml -o config.yml
$ embulk preview config.yml
$ embulk run config.yml
![Page 62: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/62.jpg)
Guess format & schema

# install
$ wget http://dl.embulk.org/embulk-latest.jar -O embulk.jar
$ chmod 755 embulk.jar

# guess
$ vi example.yml
$ ./embulk.jar guess example.yml -o config.yml

example.yml:

in:
  type: file
  path_prefix: /path/to/sample_
out:
  type: stdout

config.yml (guessed by guess plugins):

in:
  type: file
  path_prefix: /path/to/sample_
  decoders:
  - {type: gzip}
  parser:
    charset: UTF-8
    newline: CRLF
    type: csv
    delimiter: ','
    quote: '"'
    skip_header_lines: 1
    columns:
    - {name: id, type: long}
    - {name: account, type: long}
    - {name: time, type: timestamp, format: '%Y-%m-%d %H:%M:%S'}
    - {name: purchase, type: timestamp, format: '%Y%m%d'}
    - {name: comment, type: string}
out:
  type: stdout
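To give a feel for what guessing a schema from sample data involves, here is a deliberately small heuristic. It is not Embulk's guess logic: the two timestamp formats are simply the ones that appear in the config on this slide, the sample is assumed to have a header row and at least one data row, and `guess_type` / `guess_columns` are hypothetical names.

```python
import csv
import io
from datetime import datetime

TIMESTAMP_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y%m%d")  # formats seen in the slide's config


def guess_type(values):
    """Guess one column's type from sample strings: timestamp, long or string."""
    for fmt in TIMESTAMP_FORMATS:
        try:
            # Round-trip check so that short digit strings are not taken for dates.
            if all(datetime.strptime(v, fmt).strftime(fmt) == v for v in values):
                return {"type": "timestamp", "format": fmt}
        except ValueError:
            pass
    if all(v.lstrip("-").isdigit() for v in values):
        return {"type": "long"}
    return {"type": "string"}


def guess_columns(sample: str):
    """Guess name and type for every column of a CSV sample with a header row."""
    rows = list(csv.reader(io.StringIO(sample)))
    header, body = rows[0], rows[1:]
    return [{"name": name, **guess_type([row[i] for row in body])}
            for i, name in enumerate(header)]
```

A heuristic like this can be wrong (an 8-digit number looks like a `%Y%m%d` date), which is why a preview step before the real run is useful.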
![Page 63: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/63.jpg)
Preview & fix config

# install
$ wget http://dl.embulk.org/embulk-latest.jar -O embulk.jar
$ chmod 755 embulk.jar

# guess
$ vi example.yml
$ ./embulk.jar guess example.yml -o config.yml

# preview
$ ./embulk.jar preview config.yml
$ vi config.yml # if necessary

+-------------------------+----------+-------------+
| time:timestamp          | uid:long | word:string |
+-------------------------+----------+-------------+
| 2015-01-27 19:23:49 UTC |   32,864 | embulk      |
| 2015-01-27 19:01:23 UTC |   14,824 | jruby       |
| 2015-01-28 02:20:02 UTC |   27,559 | plugin      |
| 2015-01-29 11:54:36 UTC |   11,270 | fluentd     |
+-------------------------+----------+-------------+
![Page 64: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/64.jpg)
Deterministic run

# install
$ wget http://dl.embulk.org/embulk-latest.jar -O embulk.jar
$ chmod 755 embulk.jar

# guess
$ vi example.yml
$ ./embulk.jar guess example.yml -o config.yml

# preview
$ ./embulk.jar preview config.yml
$ vi config.yml # if necessary

# run
$ ./embulk.jar run config.yml -o config.yml

exec: {}
in:
  type: file
  path_prefix: /path/to/sample_
  decoders:
  - {type: gzip}
  parser:
    charset: UTF-8
    newline: CRLF
    type: csv
    delimiter: ','
    quote: '"'
    skip_header_lines: 1
    columns:
    - {name: id, type: long}
    - {name: account, type: long}
    - {name: time, type: timestamp, format: '%Y-%m-%d %H:%M:%S'}
    - {name: purchase, type: timestamp, format: '%Y%m%d'}
    - {name: comment, type: string}
  last_path: /path/to/sample_001.csv.gz
out:
  type: stdout
![Page 65: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/65.jpg)
Repeat

# install
$ wget http://dl.embulk.org/embulk-latest.jar -O embulk.jar
$ chmod 755 embulk.jar

# guess
$ vi example.yml
$ ./embulk.jar guess example.yml -o config.yml

# preview
$ ./embulk.jar preview config.yml
$ vi config.yml # if necessary

# run
$ ./embulk.jar run config.yml -o config.yml

# repeat
$ ./embulk.jar run config.yml -o config.yml
$ ./embulk.jar run config.yml -o config.yml

exec: {}
in:
  type: file
  path_prefix: /path/to/sample_
  decoders:
  - {type: gzip}
  parser:
    charset: UTF-8
    newline: CRLF
    type: csv
    delimiter: ','
    quote: '"'
    skip_header_lines: 1
    columns:
    - {name: id, type: long}
    - {name: account, type: long}
    - {name: time, type: timestamp, format: '%Y-%m-%d %H:%M:%S'}
    - {name: purchase, type: timestamp, format: '%Y%m%d'}
    - {name: comment, type: string}
  last_path: /path/to/sample_01.csv.gz
out:
  type: stdout
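Repeated runs stay safe because the config written back by each run remembers how far the previous run got (`last_path`). The sketch below models that bookkeeping in a simplified way, assuming files are processed in lexicographic order and anything sorting after `last_path` is still pending; `files_to_load` and `run` are names invented for this example, not Embulk functions.

```python
from typing import List, Optional, Tuple


def files_to_load(all_paths: List[str], path_prefix: str, last_path: Optional[str]) -> List[str]:
    """Pick files under the prefix that sort after last_path, in lexicographic order."""
    candidates = sorted(p for p in all_paths if p.startswith(path_prefix))
    if last_path is None:
        return candidates
    return [p for p in candidates if p > last_path]


def run(all_paths: List[str], config: dict) -> Tuple[List[str], dict]:
    """Return the pending files together with the config for the next run."""
    pending = files_to_load(all_paths, config["path_prefix"], config.get("last_path"))
    next_config = dict(config)
    if pending:
        next_config["last_path"] = pending[-1]
    return pending, next_config
```

Running again with the returned config loads nothing until a new file appears, so the same command can be scheduled repeatedly.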
![Page 66: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/66.jpg)
Use-cases
![Page 67: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/67.jpg)
Quipper from GDS slide
![Page 68: Fluentd and Embulk Game Server 4](https://reader033.vdocuments.site/reader033/viewer/2022042505/55a625bb1a28ab0c3c8b4804/html5/thumbnails/68.jpg)
Other cases
> Treasure Data
  > Embulk worker for automatic import
> Web services
  > Send existing logs to Elasticsearch
> Business / Batch systems
  > Database to Database
> etc…