fluentd casual talks lt #fluentd #fluentdcasual

Getting Started with Casual, Real-Time Recommendation Using Fluentd and Redis. GMO Media, Inc., Business Promotion Office, System Architect, Hitoshi Asai

Upload: hitoshi-asai

Post on 25-May-2015

Category:

Technology


TRANSCRIPT

  • 1. Fluentd and Redis. GMO Media, Inc., Business Promotion Office, System Architect, Hitoshi Asai

  • 2. Hitoshi ASAI. TWITTER: @hito_asa. machida, tokyo, jp
      WORK: Scala, JP backend system, data reporting, aws
      FAVORITE PLUGINS: input: tail, buffer: memory, output: exec
  • 3. TODAY'S AGENDA: Web
  • 4. DEMO
  • 5. (no transcribed text)
  • 6. SYSTEM ARCHITECTURE (diagram; labels: EXEC, LOG FILE, FORWARD, FLUENTD, SCRIPT (RUBY), PHP, WORKER, WEB, SCALA, REDIS, API)
  • 7. (no transcribed text)
  • 8. SYSTEM SCALE: scale of service
      133 MILLION REQUESTS / DAY
      7.6 MILLION PICTURES
      6.5 MILLION USERS
  • 9. SYSTEM SCALE: scale of event log
      7 GB / DAY
      1500 EVENTS / SEC
  • 10. SYSTEM SCALE: scale of servers
      18 WEB SRVS
      6 REDIS SRVS
      4 API SRVS
      4 WORKER SRVS
  • 11. (no transcribed text)
  • 12. RECOMMENDATION ALGORITHM: sample access log line; the last two numbers are annotated as USER ID and PIC ID
      210.153.84.41 - - [10/May/2012:20:52:08 +0900]"GET /JP_r0.prcm.jp/default/pic/index/" 200 57 "-"ip_fdop_docomo2 5Cb3mxt" "51402666 14235104"
  • 13. RECOMMENDATION ALGORITHM: Redis LPUSH of each view onto a per-user list
      key            value (list)
      vlist_user1    pic5 pic4 pic3 pic2 pic1
      vlist_user2    pic7 pic2 pic4 pic1 pic2
      vlist_user3    pic4 pic6 pic2 pic3 pic1
  • 14. RECOMMENDATION ALGORITHM: LRANGE of the top 3 entries of the list
      key            value (list)
      vlist_user1    pic5 pic4 pic3 pic2 pic1
      pic4, pic5: +2
      pic3, pic5: +1
  • 15. RECOMMENDATION ALGORITHM: Redis HINCRBY on a per-picture hash
      key: rel_pic4 (hash)
        field    value
        pic5     5 (3+2, HINCRBY)
        pic6     3
      key: rel_pic3 (hash)
        field    value
        pic5     2 (1+1, HINCRBY)
        pic2     1
  • 16. (no transcribed text)
  • 17. NON-REALTIME GENERATION: SCP & MapReduce
      WEB SERVERS -> scp (hourly) -> Hadoop -> map-reduce (hourly) -> REDIS
  • 18. NON-REALTIME GENERATION: 60 - 70%, 20 - 30% (chart; labels not transcribed)
  • 19. (no transcribed text)
  • 20. REALTIME GENERATION: 60 - 70% and 70 - 90%; 20 - 30% and 50 - 70% (chart; labels not transcribed)
  • 21. IMPLEMENTATION: recommendation script (ruby), 43 LINES
        #!/bin/env ruby
        require 'logger'
        require 'redis'
        require 'redis/distributed'

        VIEW_PREFIX = 'r_user_gazo_view_'
        RELATION_PREFIX = 'r_gazo_relation_'
        LOGGER = Logger.new("#{ENV['FLUENTD_HOME']}/logs/gazo_recommend.log", 1, 100 * 1024 * 1024)
        # REDISM receives all writes; REDISS is used for reads.
        REDISM = Redis.new(:host => "xxx", :port => 6380)
        REDISS = Redis::Distributed.new(["redis://xxx:6380", "redis://xxx:6380", "redis://xxx:6380", "redis://xxx:6380", "redis://xxx:6380"])

        def calc(time, user_id, gazo_id)
          time_s = time.strftime('%Y-%m-%d %H:%M:%S')
          view_key = VIEW_PREFIX + user_id.to_s
          key_org = RELATION_PREFIX + gazo_id.to_s
          # Skip reloads of the same picture. Redis returns strings, so compare as strings.
          if REDISS.lrange(view_key, 0, 0) != [gazo_id.to_s] then
            REDISM.lpush(view_key, gazo_id)
            REDISM.expire(view_key, 7200)
            REDISM.expire(key_org, 28800)
            # Up to two other pictures from the head of the user's view list.
            rel = REDISS.lrange(view_key, 0, 3).find_all{|r| r != gazo_id.to_s}.uniq.reverse.take(2).reverse
            rel.each do |r|
              key_rel = RELATION_PREFIX + r.to_s
              # Increment the relation score in both directions.
              score = REDISM.hincrby(key_rel, gazo_id.to_s, 1)
              REDISM.expire(key_rel, 28800)
              LOGGER.info("#{time_s} - #{user_id.to_s} : #{r.to_s} => #{gazo_id.to_s} score: #{score}")
              score = REDISM.hincrby(key_org, r.to_s, 1)
              LOGGER.info("#{time_s} - #{user_id.to_s} : #{gazo_id.to_s} => #{r.to_s} score: #{score}")
            end
          end
        end

        # Each input line is one tab-separated event.
        while line = gets
          begin
            l = line.encode!('UTF-8', 'UTF-8', :invalid => :replace).split("\t")
            if l[1] == 'default' && l[2] == 'pic' && l[3] == 'index' then
              time = l[0]
              user_id = l[5]
              gazo_id = l[6]
              if user_id && gazo_id then
                calc Time.at(time.to_i), user_id.to_i, gazo_id.to_i rescue nil
              end
            end
          rescue => ex
            LOGGER.error("error: #{ex}")
          end
        end
  • 22. (no transcribed text)
  • 23. LOAD OF AGENT PROCESS: load average (graph)
  • 24. LOAD OF AGENT PROCESS: memory usage (rss), 0.10.15 => 0.10.19 (graph)
  • 25. LOAD OF AGENT PROCESS: memory usage (rss) - weekly (graph)
  • 26. (no transcribed text)
  • 27. TROUBLES: in_tail
      2012-03-27 03:10:27 +0900: fluent/parser.rb:85:parse: pattern not match: "...(a part of log line)"
      https://github.com/fluent/fluentd/pull/44
  • 28. TROUBLES
  • 29. (no transcribed text)
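
The scoring idea on slides 13 to 15 can be tried without a Redis server. The following is only a minimal sketch, not the production script: plain Ruby hashes stand in for the Redis lists (LPUSH / LRANGE) and hashes (HINCRBY), the class and method names (CoViewRecommender, record_view, recommend, score) are hypothetical, and the weights +2 / +1 are taken from slide 14. Unlike the script on slide 21, it updates the relation in one direction only, as drawn on slide 15.

```ruby
# In-memory stand-in for the Redis structures on slides 13 to 15.
# All names here are illustrative assumptions, not part of the original script.
class CoViewRecommender
  # Slide 14: +2 for the picture viewed just before, +1 for the one before that.
  WEIGHTS = [2, 1].freeze

  def initialize
    # Plays the role of vlist_<user>: newest view first, like LPUSH.
    @views = Hash.new { |h, k| h[k] = [] }
    # Plays the role of rel_<pic>: field => score, like HINCRBY.
    @relations = Hash.new { |h, k| h[k] = Hash.new(0) }
  end

  # Record one picture view and update the relation scores.
  def record_view(user_id, pic_id)
    history = @views[user_id]
    return if history.first == pic_id # ignore a reload of the same picture

    history.unshift(pic_id)
    # The pictures viewed just before this one (like LRANGE on the list head).
    previous = history[1, WEIGHTS.size].reject { |p| p == pic_id }.uniq
    previous.each_with_index do |prev, i|
      @relations[prev][pic_id] += WEIGHTS[i]
    end
  end

  # Pictures most strongly related to pic_id, highest score first.
  def recommend(pic_id, limit = 3)
    @relations[pic_id].sort_by { |pic, score| [-score, pic] }.first(limit).map(&:first)
  end

  # Current score of the relation from one picture to another (0 if none).
  def score(from, to)
    @relations[from][to]
  end
end

rec = CoViewRecommender.new
%w[pic3 pic4 pic5].each { |pic| rec.record_view("user1", pic) }
p rec.score("pic4", "pic5") # 2
p rec.score("pic3", "pic5") # 1
p rec.recommend("pic3")     # ["pic4", "pic5"]
```

Keeping only the head of each user's list in play mirrors the slides: scores grow only between pictures viewed close together, which is what lets the real system update recommendations per event instead of in an hourly batch.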