Back end analytics_platform_2013_v1.0
Agenda
Targets
Approaches
Analytic platforms
Map/Reduce
GNT Game analytic system (current & testing)
Appendix : Hadoop/Hadoop components, HBase, Collectors (FlumeNG/Chukwa/Scribe)
Targets
Analytic systems
KPI
Monitoring (access log, error log)
Real-time analytics / batch processing analytical tasks for Client
Depends on GNT infrastructure, Scalability
Approaches
Refer to the log platform architectures of Facebook, Twitter, ...
Community reviews of each component
Adapt to needs
Analytic platform(1/10)
Tracker :
Action log, error log (nginx)
Web log (Play framework)
Game user activities log (event-driven logs)
Database log (Cassandra, Redis, commit log)
Page tagging / logfile analytics
Collector
ETL
Analyzer
Reporter
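The Tracker -> Collector -> ETL -> Analyzer -> Reporter stages can be sketched as a chain of small functions. This is only an illustration, not the GNT implementation: the sample lines, the regular expression (written for nginx's default "combined" access-log format) and all function names are assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical tracker output: nginx "combined" access-log lines (made-up data).
SAMPLE_LOG = [
    '10.0.0.1 - - [01/Mar/2013:10:00:00 +0000] "GET /game/start HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '10.0.0.2 - - [01/Mar/2013:10:00:01 +0000] "POST /game/score HTTP/1.1" 500 0 "-" "Mozilla/5.0"',
    'not a log line',
    '10.0.0.1 - - [01/Mar/2013:10:00:02 +0000] "GET /game/start HTTP/1.1" 200 498 "-" "Mozilla/5.0"',
]

LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def collect(lines):
    """Collector: hand over raw lines one by one, without interpreting them."""
    for line in lines:
        yield line.rstrip("\n")


def etl(raw_lines):
    """ETL: parse each raw line into a record and drop lines that do not parse."""
    for raw in raw_lines:
        match = LINE_RE.match(raw)
        if match:
            yield {
                "ip": match.group("ip"),
                "path": match.group("path"),
                "status": int(match.group("status")),
            }


def analyze(records):
    """Analyzer: count hits and server errors (status >= 500) per path."""
    hits, errors = Counter(), Counter()
    for record in records:
        hits[record["path"]] += 1
        if record["status"] >= 500:
            errors[record["path"]] += 1
    return hits, errors


def report(hits, errors):
    """Reporter: render one summary line per path, sorted by path."""
    return [f"{path} hits={hits[path]} errors={errors[path]}" for path in sorted(hits)]


if __name__ == "__main__":
    for line in report(*analyze(etl(collect(SAMPLE_LOG)))):
        print(line)
```

Each stage only consumes the output of the previous one, so a stage can be replaced (for example, a file reader swapped for a network collector) without touching the others.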
Analytic platform(2/10) Facebook
Facebook Web -> Scribe -> Ptail -> Puma -> HBase
http://www.slideshare.net/slrash/2011-0630hadoopsummit-v5-8469751
=> Collection layer (Flume/Scribe) → Filter layer (Flume) → Batching layer (Coprocessor)
Analytic platform(5/10) Facebook
Ptail = Parallel Tail
Concurrent read : HDFS 2.0 adds sync, which lowers write-to-read latency
Ptail : reads data blocks while they are still being written, < 10s latency
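The idea of reading data while it is still being written can be illustrated on local files. The sketch below is not Facebook's Ptail and does not use HDFS: it is a simplified polling reader, with hypothetical function names, that returns only complete lines from several files and leaves a half-written last line for the next poll.

```python
import os
import tempfile


def read_new_lines(path, offset):
    """Return (complete_lines, new_offset) for bytes appended after `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read()
    # Consume only up to the last newline; a partial trailing line stays unread.
    end = chunk.rfind(b"\n")
    if end == -1:
        return [], offset
    lines = chunk[:end].decode("utf-8", errors="replace").split("\n")
    return lines, offset + end + 1


def poll_all(offsets):
    """Poll every tracked file once. `offsets` maps path -> byte offset
    and is updated in place so the next poll resumes where this one stopped."""
    out = []
    for path in sorted(offsets):
        lines, offsets[path] = read_new_lines(path, offsets[path])
        out.extend((path, line) for line in lines)
    return out


def demo():
    """Write a file in two steps and poll after each step."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "access.log")
        with open(path, "wb") as f:
            f.write(b"GET /a\nGET /b")  # second line is still being written
        offsets = {path: 0}
        first = [line for _, line in poll_all(offsets)]
        with open(path, "ab") as f:
            f.write(b"?x=1\n")  # the writer finishes the second line
        second = [line for _, line in poll_all(offsets)]
        return first, second


if __name__ == "__main__":
    print(demo())
```

Keeping a byte offset per file is what lets a reader follow many files in turn; the latency seen by the consumer is then bounded by the polling interval plus the time the writer takes to flush.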
GNT Game Analytic System(2/2) current
Limitation :
- JavaScript implementation : limited to 1 JS execution per server at a time
- Scalability : does not scale except when sharding
Improving : integrate Mongo + Hadoop
http://www.slideshare.net/iammutex/the-elephant-in-the-room-mongo-db-hadoop
GNT Game Analytic System(2/4) testing FlumeNG
flume.conf : 192.168.30.183
t-game-web183.sources = tail-nginx tail-play
t-game-web183.sinks = avro-sink-nginx183 avro-sink-play183
t-game-web183.channels = mem-channel-nginx183 mem-channel-play183
t-game-web183.sources.tail-nginx.type = exec
t-game-web183.sources.tail-nginx.command = tail -F /var/log/nginx/access.log
t-game-web183.sources.tail-nginx.channels = mem-channel-nginx183
t-game-web183.channels.mem-channel-nginx183.type = memory
t-game-web183.sinks.avro-sink-nginx183.type = avro
t-game-web183.sinks.avro-sink-nginx183.hostname = 192.168.30.185
t-game-web183.sinks.avro-sink-nginx183.port = 10183
t-game-web183.sinks.avro-sink-nginx183.channel = mem-channel-nginx183
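A configuration like this wires sources and sinks to channels purely by name, so a typo goes unnoticed until the agent starts. The sketch below is a rough offline check, not Flume's own validation: it parses the properties text and reports declared components that have no type, and channel references that point at undeclared channels. The helper names are hypothetical; the configuration text is the excerpt shown on this slide.

```python
CONF = """
t-game-web183.sources = tail-nginx tail-play
t-game-web183.sinks = avro-sink-nginx183 avro-sink-play183
t-game-web183.channels = mem-channel-nginx183 mem-channel-play183
t-game-web183.sources.tail-nginx.type = exec
t-game-web183.sources.tail-nginx.command = tail -F /var/log/nginx/access.log
t-game-web183.sources.tail-nginx.channels = mem-channel-nginx183
t-game-web183.channels.mem-channel-nginx183.type = memory
t-game-web183.sinks.avro-sink-nginx183.type = avro
t-game-web183.sinks.avro-sink-nginx183.hostname = 192.168.30.185
t-game-web183.sinks.avro-sink-nginx183.port = 10183
t-game-web183.sinks.avro-sink-nginx183.channel = mem-channel-nginx183
"""


def parse_properties(text):
    """Parse 'key = value' lines into a dict, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_agent(props, agent):
    """Return a list of human-readable wiring problems for one agent."""
    kinds = ("sources", "sinks", "channels")
    declared = {k: props.get(f"{agent}.{k}", "").split() for k in kinds}
    channels = set(declared["channels"])
    problems = []
    # Every declared component should at least have a type.
    for kind in kinds:
        for name in declared[kind]:
            if f"{agent}.{kind}.{name}.type" not in props:
                problems.append(f"{kind[:-1]} {name}: no type")
    # Sources list one or more channels; each must be declared.
    for name in declared["sources"]:
        for ch in props.get(f"{agent}.sources.{name}.channels", "").split():
            if ch not in channels:
                problems.append(f"source {name}: unknown channel {ch}")
    # Sinks read from exactly one channel; it must be declared.
    for name in declared["sinks"]:
        ch = props.get(f"{agent}.sinks.{name}.channel")
        if ch is not None and ch not in channels:
            problems.append(f"sink {name}: unknown channel {ch}")
    return problems


if __name__ == "__main__":
    for problem in check_agent(parse_properties(CONF), "t-game-web183"):
        print(problem)
```

Run on the excerpt above, the check flags the Play components (tail-play, avro-sink-play183, mem-channel-play183), which are declared but whose settings are not shown on the slide.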
GNT Game Analytic System(3/4) testing FlumeNG
flume.conf : 192.168.30.185
t-game-cass185.sinks.hdfs-sink-nginx183.type = hdfs
t-game-cass185.sinks.hdfs-sink-nginx183.hdfs.path = hdfs://namenode/flume/webdata/nginx183
Mixed Solutions
Case 1 : old system : Mongo + Hadoop
Case 2 : FlumeNG + Hadoop + HBase
Case 3 : Batch processing : Hadoop HDFS (without FlumeNG)
Flume/FlumeNG(2/10) Concepts
Network stream : Avro/Syslog/Netcat
Source / Channel / Sink
Decorator
Flow
Event
Flume agent
Flume avro client / log4j Appender
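The Source / Channel / Sink relationship can be modelled in a few lines. This is a toy model in plain Python, not Flume code: the class and function names are made up for illustration, and a full channel simply rejects the event here, whereas a real agent has its own error handling and transactions.

```python
from collections import deque


class MemoryChannel:
    """Toy bounded in-memory buffer sitting between a source and a sink."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._events = deque()

    def put(self, event):
        """Store an event; return False if the channel is full."""
        if len(self._events) >= self.capacity:
            return False
        self._events.append(event)
        return True

    def take(self, max_events):
        """Remove and return up to `max_events` events in arrival order."""
        batch = []
        while self._events and len(batch) < max_events:
            batch.append(self._events.popleft())
        return batch


def run_source(lines, channel):
    """Source: wrap each line in an event (headers + body) and put it on the
    channel. Returns how many events were rejected because it was full."""
    dropped = 0
    for line in lines:
        event = {"headers": {}, "body": line}
        if not channel.put(event):
            dropped += 1
    return dropped


def run_sink(channel, batch_size, deliver):
    """Sink: drain the channel in batches and pass each batch of bodies to
    `deliver`. Returns the number of events delivered."""
    delivered = 0
    while True:
        batch = channel.take(batch_size)
        if not batch:
            return delivered
        deliver([event["body"] for event in batch])
        delivered += len(batch)


if __name__ == "__main__":
    channel = MemoryChannel(capacity=3)
    print("dropped:", run_source(["a", "b", "c", "d"], channel))
    batches = []
    print("delivered:", run_sink(channel, 2, batches.append), batches)
```

The channel decouples the two sides: the source never needs to know where events end up, and the sink can batch its writes independently of the arrival rate.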
Flume/FlumeNG(4/10)
Flume Sink : HDFS, Avro, Logger, …, FileRoll, Custom
Flume Channel : Memory, JDBC channel, Recoverable memory channel
Flume/FlumeNG(10/10) plugin (decorator)
TODO : Flume with HBase sink
https://groups.google.com/a/cloudera.org/group/cdh-user/browse_thread/thread/5ee135ad0e720ea9/c5bffc83f97fdd3c?hl=vi&lnk=gst&q=flume-ng#c5bffc83f97fdd3c
Future issues
Manage analytic jobs
Message queue : Kafka, ZeroMQ
Monitoring : memory, flume agent, hadoop cluster, ...
Scalability