Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015
TRANSCRIPT
![Page 1: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/1.jpg)
STYLIGHT.COM
Helping Data Teams with Puppet
Sergii Khomenko, Data Scientist, sergii.khomenko@stylight.com, @lc0d3r
![Page 2: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/2.jpg)
Agenda
• Who? What? Why?
• Setting up your BI with Puppet
• Small tips and tricks
• Puppet your ranking
![Page 3: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/3.jpg)
Data scientist at STYLIGHT, one of the biggest fashion communities. Data analysis and visualization hobbyist. Speaker at Berlin Buzzwords 2014 and ApacheCon Europe 2014. Founder of and speaker at Munich Golang UG and Munich Tableau UG. Speaker at Munich UseR Group, Munich Search UG and Munich Quantified Self UG.
Sergii Khomenko
Milos Radovanovic
Passionate about DevOps stuff:
1. microservices
2. Docker
3. 12-factor apps
4. continuous integration/deployment
![Page 4: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/4.jpg)
![Page 5: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/5.jpg)
![Page 6: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/6.jpg)
STYLIGHT – international community
Live in 12 countries
![Page 7: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/7.jpg)
Setting up your BI with puppet.
![Page 8: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/8.jpg)
Tableau - reporting and ad-hocs
Python / Talend ETL tools
Minimum Viable BI
![Page 9: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/9.jpg)
Running Puppet in a standalone mode
Minimum Viable BI
We use Puppet for *nix servers and can't merge that setup with the Windows machines.
Standalone mode for Puppet:
– easier to start and develop
– Windows machines are separated from *nix ones
![Page 10: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/10.jpg)
Running Puppet in a standalone mode
Minimum Viable BI

    cd c:\folder\with\our-bi
    git pull origin master
    IF %ERRORLEVEL% NEQ 0 set context=GIT_FAILURE && goto error_handler
    rem puppet is a .bat file on Windows, so use call to return to this script
    call puppet apply --modulepath=puppet\modules puppet\win-node-name.net.pp
    IF %ERRORLEVEL% NEQ 0 set context=PUPPET_FAILURE && goto error_handler
    goto end

![Page 11: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/11.jpg)
Running Puppet in a standalone mode
Minimum Viable BI

    :error_handler
    echo entering error_handler
    EVENTCREATE /T ERROR /L APPLICATION /SO Puppet_Scheduler /ID 100 /D "EXECUTION FAILED REASON %context%"
    goto end
    :end
    echo DONE

![Page 12: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/12.jpg)
Minimum Viable BI
Running Puppet in a standalone mode
Standalone mode for Puppet:
– configuration is totally separated
– custom modules: --modulepath=puppet\modules
– GitHub-hosted configuration
– error handling via the Windows event log
![Page 13: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/13.jpg)
Minimum Viable BI
Scheduling is important

    node 'win-node-name.net' {
      scheduled_task { 'refresh-1':
        ensure    => present,
        enabled   => true,
        command   => 'C:\path\to\your\script.bat',
        arguments => 'some args',

![Page 14: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/14.jpg)
Minimum Viable BI
Scheduling is important

        user      => 'your-user',
        password  => 'your-password',
        trigger   => {
          schedule   => daily,
          start_time => '06:00',
        }
      }
    }

![Page 15: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/15.jpg)
Minimum Viable BI
Sync my configuration every 15 min

    # Can't use Puppet's scheduled_task as it does not support running the scheduled task every 5 minutes.

https://github.com/sdliangzhihua/windows-puppet-example/blob/master/manifest.pp#L68
![Page 16: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/16.jpg)
Minimum Viable BI
Sync my configuration every 15 min

    $cmd      = 'C:\Windows\system32\cmd.exe'
    $job_name = 'sync_code'

    exec { 'CreateCodeSyncScheduledTask':
      command => "${cmd} /C schtasks /create /sc MINUTE /mo 15 /tn ${job_name} /tr C:\\your\\puppet.bat /ru administrator /f",
      onlyif  => ["${cmd} /C schtasks /query /tn ${job_name} & if errorlevel 1 (exit /b 0) else exit /b 1"],
    }

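The inverted `onlyif` check on this slide can probably be written more directly with the `unless` attribute of `exec`, which skips the command when the check exits with 0. What follows is a sketch of an equivalent form under that assumption, not the manifest used in the talk. It reuses `$cmd`, `$job_name` and the placeholder path `C:\your\puppet.bat` from the slide; the resource title is hypothetical.

```puppet
$cmd      = 'C:\Windows\system32\cmd.exe'
$job_name = 'sync_code'

# Sketch: create the task only when `schtasks /query` cannot find it.
# `schtasks /query /tn <name>` exits non-zero when the task does not exist,
# so `unless` lets the create command run only in that case.
exec { 'CreateCodeSyncScheduledTaskUnlessPresent':  # hypothetical title
  command => "${cmd} /C schtasks /create /sc MINUTE /mo 15 /tn ${job_name} /tr C:\\your\\puppet.bat /ru administrator /f",
  unless  => "${cmd} /C schtasks /query /tn ${job_name}",
}
```

Either form keeps the resource idempotent: once the task exists, later Puppet runs leave it alone.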
![Page 17: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/17.jpg)
Small tips and tricks
Do not repeat yourself and other tricks
![Page 18: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/18.jpg)
Minimum Viable BI
Scheduling is important

    node 'win-node-name.net' {
      scheduled_task { 'refresh-1':
        ensure    => present,
        enabled   => true,
        command   => 'C:\path\to\your\script.bat',
        arguments => 'some args',

![Page 19: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/19.jpg)
Small tips and tricks

    class job_scheduler (
      $ensure      = $job_scheduler::params::ensure,
      $enabled     = $job_scheduler::params::enabled,
      $user        = $job_scheduler::params::user,
      $password    = $job_scheduler::params::password,
      $working_dir = $job_scheduler::params::working_dir,
    ) inherits job_scheduler::params {
    }

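The `job_scheduler::params` class that these parameters default to is not shown on the slides. A minimal sketch of what it might contain is given here; every value is a placeholder borrowed from earlier examples in the deck, not the real configuration.

```puppet
# Hypothetical defaults class; all values are placeholders.
class job_scheduler::params {
  $ensure      = 'present'
  $enabled     = true
  $user        = 'your-user'
  $password    = 'your-password'
  $working_dir = 'c:\folder\with\our-bi'
}
```

With a class like this in place, a declaration such as `class { 'job_scheduler': user => 'bi-service' }` (the user name is made up) would override one default and keep the rest.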
![Page 20: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/20.jpg)
Small tips and tricks

    define job_scheduler::job (
      $arguments     = 'tableau_adobe.py',
      $command       = 'c:\Py27-32\python.exe',
      $schedule_type = 'daily',
      $start_time    = '08:15',
      $day_of_week   = 'every',
    ) {

![Page 21: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/21.jpg)
Small tips and tricks

    define job_scheduler::tableau_job (
      $arguments     = 'default-tableau',
      $command       = 'c:\folder\tableau.bat',
      $schedule_type = 'daily',
      $start_time    = '21:00',
      $day_of_week   = 'every',
    ) {

![Page 22: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/22.jpg)
Small tips and tricks

    # Params with default values for the tableau job
    # that might be changed in a job definition
    #
    # 1. $arguments     = 'default-argument',
    # 2. $command       = 'c:\folder\script.bat',
    # 3. $schedule_type = 'daily',
    # 4. $start_time    = '21:00',
    # 5. $day_of_week   = 'every',
    ####################

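The slides show only the parameter lists of the defined types, not their bodies. The sketch below is one way a body could wrap the built-in `scheduled_task` type, assuming the shared settings come from the `job_scheduler` class. The branching on `$schedule_type` and the way `$day_of_week` is used are assumptions, not the author's implementation.

```puppet
# Hypothetical body for the defined type; only the parameter list is from the slides.
define job_scheduler::tableau_job (
  $arguments     = 'default-tableau',
  $command       = 'c:\folder\tableau.bat',
  $schedule_type = 'daily',
  $start_time    = '21:00',
  $day_of_week   = 'every',
) {
  include job_scheduler

  # A weekly trigger needs the list of days; for other schedules
  # $day_of_week is ignored in this sketch.
  if $schedule_type == 'weekly' {
    $trigger = {
      schedule    => 'weekly',
      start_time  => $start_time,
      day_of_week => $day_of_week,
    }
  } else {
    $trigger = {
      schedule   => $schedule_type,
      start_time => $start_time,
    }
  }

  # The resource title becomes the name of the Windows scheduled task.
  scheduled_task { $title:
    ensure      => $job_scheduler::ensure,
    enabled     => $job_scheduler::enabled,
    command     => $command,
    arguments   => $arguments,
    working_dir => $job_scheduler::working_dir,
    user        => $job_scheduler::user,
    password    => $job_scheduler::password,
    trigger     => $trigger,
  }
}
```

Keeping credentials and the working directory in the class means each job declaration only has to state what differs from the defaults.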
![Page 23: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/23.jpg)
Small tips and tricks

    job_scheduler::tableau_job {
      'some job':
        start_time => '01:00',
        arguments  => 'args';
      'default refresh-1':
        start_time => '06:00';
      'default refresh-2':
        start_time => '10:00';
      'weekly update':
        start_time    => '03:35',
        arguments     => 'weekly-update',
        schedule_type => weekly,
        day_of_week   => ['mon'];
    }

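If the job list keeps growing, the same declarations could also be driven from a hash with the built-in `create_resources` function. This is a sketch of that idea, which the talk does not show. The variable names are hypothetical; the job titles and values are copied from the slide.

```puppet
# Hypothetical data-driven variant of the declaration on this slide.
$tableau_jobs = {
  'default refresh-1' => { 'start_time' => '06:00' },
  'default refresh-2' => { 'start_time' => '10:00' },
  'weekly update'     => {
    'start_time'    => '03:35',
    'arguments'     => 'weekly-update',
    'schedule_type' => 'weekly',
    'day_of_week'   => ['mon'],
  },
}

# Third argument: defaults applied to every job that does not set the key itself.
$tableau_job_defaults = { 'arguments' => 'default-tableau' }

create_resources('job_scheduler::tableau_job', $tableau_jobs, $tableau_job_defaults)
```

This is an alternative to the explicit declaration, not an addition to it: declaring the same job titles both ways would produce duplicate resources.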
![Page 24: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/24.jpg)
Small tips and tricks

    job_scheduler::redshift_job {
      'RS tagged products':
        start_time => '00:40',
        params     => '..\datasources\something.tds';
      'RS another job':
        start_time => '00:50',
        params     => '..\datasources\else.tds';
    }

![Page 25: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/25.jpg)
Puppet your ranking
Lean, flexible, powerful
![Page 26: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/26.jpg)
A ranking is a relationship between a set of items such that, for any two items, the first is either 'ranked higher than', 'ranked lower than' or 'ranked equal to' the second.
![Page 27: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/27.jpg)
Ranking specifics:
• Seasonal influence
• Trends
• Cold start of new countries, shops
• Multiple dimensions of ranking model
![Page 28: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/28.jpg)
Lean approach to Ranking
Multiple points of evaluation
Requirements:
• Decreasing time to implement new ranking model
• Keeping working infrastructure alive
• A/B testing without changing entire infrastructure
• Performance level - “still fast” and “transparent”
![Page 29: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/29.jpg)
Common search infrastructure
[Diagram: JBoss, a Solr load balancer, and three nginx + Solr nodes]
![Page 30: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/30.jpg)
Updated infrastructure
[Diagram: a front-end load balancer; JBoss with a Solr load balancer and three nginx + Solr nodes; a second JBoss with a Solr load balancer and one nginx + Solr node]
![Page 31: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/31.jpg)
Lean approach to Ranking

    q = +brand:adidas shop:monshowroom^3
    q = +adidas monshowroom
    defType = dismax
    qf = brand shop^3
    sort = user_ratings desc, score desc
    qq = adidas
    q = {!boost b=$b defType=dismax v=$qq}
    b = prod(popularity, clicks)

![Page 32: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/32.jpg)
Lean approach to Ranking
solr0x.node.company.pp

    include nginx
    nginx::config { "solr_dev": }
    nginx::solr-ranking { "delta2":
      urls => [
        "/some.thing?gender=women&brand=2271&tag=1161&tag=877&tag=468",
        "/some.thing?gender=men&brand=11235&tag=10203&tag=10299&tag=10326"
      ],

![Page 33: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/33.jpg)
Lean approach to Ranking
nginx/templates/conf/solr-rewrites.conf.erb

    <% urls.each do |url| -%>
    if ($args ~* <% if url['gender'] > 0 -%>gender_id%3A<%= url['gender'] %>.*<% end -%><% url['tags'].each do |tag| -%>tag_id%3A<%= tag %>.*<% end -%><% if url['brand'] > 0 -%>brand_id%3A%28<%= url['brand'] %>%29<% end -%>) {
      set $orig $args;
      set $args "q={!boost+b=%24b+defType=dismax+v=%24qq}&qq=id:*";
      rewrite ^(.*)$ "$1?$orig" break;
    }
    <% end -%>

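The defined type that renders this template is not shown in the deck. A minimal sketch of how it might look is given below, with several assumptions. The name uses an underscore because hyphens are not valid in type names in newer Puppet versions. The target file path is made up. `Service['nginx']` is assumed to be declared by the `nginx` class. The manifest on the previous slide passes URL strings while the template reads the hash keys `gender`, `tags` and `brand`; the conversion step is not shown, so this sketch assumes `$urls` already holds hashes.

```puppet
# Hypothetical defined type; name, file path and service reference are assumptions.
define nginx::solr_ranking (
  $urls = [],
) {
  # Each element of $urls is assumed to be a hash such as
  # { 'gender' => 1, 'brand' => 2271, 'tags' => [1161, 877, 468] },
  # which is the shape the template iterates over.
  file { "/etc/nginx/solr-rewrites-${title}.conf":
    ensure  => file,
    content => template('nginx/conf/solr-rewrites.conf.erb'),
    notify  => Service['nginx'],
  }
}
```

Because the rendered file contains `if`, `set` and `rewrite` directives, it would have to be pulled in with an `include` directive from a `server` or `location` block rather than from the top-level `http` context.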
![Page 34: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/34.jpg)
Lean approach to Ranking
Multiple points of evaluation
Stages to evaluate a model:
• R ranking model
• Independent Solr node
  1. For internal use cases
  2. Testing on some of the pages
  3. A/B rollout for % of users
• Production rollout
![Page 35: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/35.jpg)
Thanks for your attention!
![Page 36: Helping Data Teams with Puppet / Puppet Camp London - Apr 13, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030318/5a6774bd7f8b9aa3028b49f3/html5/thumbnails/36.jpg)
Sergii Khomenko, Data Scientist
STYLIGHT GmbH
sergii.khomenko@stylight.com
@lc0d3r
Nymphenburger Straße 86, 80636 Munich, Germany