Down with cron! Long live Oozie! (A bas Cron ! Vive Oozie !)
![Page 1: A bas Cron ! Vive Oozie !](https://reader035.vdocuments.site/reader035/viewer/2022062412/58d06fab1a28abc9788b4d2d/html5/thumbnails/1.jpg)
DOWN WITH CRON! LONG LIVE OOZIE!
“possibly the most underrated component of the Hadoop stack”
![Page 2: A bas Cron ! Vive Oozie !](https://reader035.vdocuments.site/reader035/viewer/2022062412/58d06fab1a28abc9788b4d2d/html5/thumbnails/2.jpg)
Fast forward
• Cron's problems
• Hadoop/YARN architecture refresher
• Oozie architecture
• Workflow
• Coordinator
• Execution
• Control, SLA, etc.
• Perl?
![Page 3: A bas Cron ! Vive Oozie !](https://reader035.vdocuments.site/reader035/viewer/2022062412/58d06fab1a28abc9788b4d2d/html5/thumbnails/3.jpg)
Crond
• Well suited to local tasks
• Uncomfortable with everything else
• Horizontal scalability?
• Dependencies?
• Control, reruns, etc.
• Nothing but trouble!
![Page 4: A bas Cron ! Vive Oozie !](https://reader035.vdocuments.site/reader035/viewer/2022062412/58d06fab1a28abc9788b4d2d/html5/thumbnails/4.jpg)
Hadoop: YARN
![Page 5: A bas Cron ! Vive Oozie !](https://reader035.vdocuments.site/reader035/viewer/2022062412/58d06fab1a28abc9788b4d2d/html5/thumbnails/5.jpg)
Oozie
• 1 server (or 2, for HA)
• 1 database backend (vendor-agnostic)
• 1 Hadoop cluster (HDFS + YARN)
• That's all!
![Page 6: A bas Cron ! Vive Oozie !](https://reader035.vdocuments.site/reader035/viewer/2022062412/58d06fab1a28abc9788b4d2d/html5/thumbnails/6.jpg)
Workflow intro
• The basic unit
• Defined in XML
• + job.properties
• + options (jars, lib, etc.)
• Upload to HDFS
• oozie job -run -config job.properties
• Done!
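As a rough illustration of the pieces listed on this slide, the Python sketch below writes a minimal workflow.xml and job.properties to a local directory. The workflow name, the `echo` command, the host names, ports and paths are all placeholders invented for the example, and the schema versions shown are assumptions to check against the Oozie release in use.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Minimal workflow: start -> one shell action -> end, with a kill node on error.
# Names, command and schema versions are assumptions made for this sketch.
WORKFLOW_XML = """\
<workflow-app name="hello-wf" xmlns="uri:oozie:workflow:0.5">
    <start to="say-hello"/>
    <action name="say-hello">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>echo</exec>
            <argument>hello</argument>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Shell action failed</message>
    </kill>
    <end name="end"/>
</workflow-app>
"""

# Hypothetical cluster endpoints; ${...} placeholders are resolved by Oozie.
JOB_PROPERTIES = """\
nameNode=hdfs://namenode.example.com:8020
jobTracker=resourcemanager.example.com:8032
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/${user.name}/apps/hello-wf
"""


def write_app(target_dir):
    """Write workflow.xml and job.properties into target_dir and return its Path."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    ET.fromstring(WORKFLOW_XML)  # raises ParseError if the XML is malformed
    (target / "workflow.xml").write_text(WORKFLOW_XML)
    (target / "job.properties").write_text(JOB_PROPERTIES)
    return target
```

Once workflow.xml has been uploaded to the HDFS path named in `oozie.wf.application.path`, the command from the slide, `oozie job -run -config job.properties`, submits and starts the job. The CLI also needs to know the server address, through the `-oozie` option or the `OOZIE_URL` environment variable.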
![Page 7: A bas Cron ! Vive Oozie !](https://reader035.vdocuments.site/reader035/viewer/2022062412/58d06fab1a28abc9788b4d2d/html5/thumbnails/7.jpg)
Workflows, DAG
![Page 8: A bas Cron ! Vive Oozie !](https://reader035.vdocuments.site/reader035/viewer/2022062412/58d06fab1a28abc9788b4d2d/html5/thumbnails/8.jpg)
Workflow guts
• The equivalent of cron's “command line”
• DAG
• External / internal actions
• Decision nodes
• Forks
• Communication between steps
• Distributed execution, fault tolerance
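To make the DAG idea concrete, here is a small Python sketch that models a workflow as a transition table, with a decision node and a fork/join pair, and checks that it really is acyclic. The node names are invented, and this is only an illustration of the concept, not how Oozie validates or schedules workflows internally.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical workflow: each node maps to the nodes it may transition to.
TRANSITIONS = {
    "start": ["check-input"],
    "check-input": ["fork", "end"],  # decision node: process the data or skip
    "fork": ["load-a", "load-b"],    # fork: both branches run in parallel
    "load-a": ["join"],
    "load-b": ["join"],
    "join": ["report"],              # join: waits for both branches
    "report": ["end"],
    "end": [],
}


def execution_order(transitions):
    """Return one valid ordering of the nodes, or raise ValueError on a cycle."""
    # TopologicalSorter expects node -> predecessors, so invert the edges.
    predecessors = {node: set() for node in transitions}
    for node, targets in transitions.items():
        for target in targets:
            predecessors.setdefault(target, set()).add(node)
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError as exc:
        raise ValueError(f"workflow is not a DAG: {exc.args[1]}") from exc
```

Calling `execution_order(TRANSITIONS)` returns an ordering in which `fork` precedes both `load-a` and `load-b`, and both precede `join`. A table with a loop, such as `{"a": ["b"], "b": ["a"]}`, raises `ValueError`.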
![Page 9: A bas Cron ! Vive Oozie !](https://reader035.vdocuments.site/reader035/viewer/2022062412/58d06fab1a28abc9788b4d2d/html5/thumbnails/9.jpg)
Coordinator secrets
• The equivalent of cron's “frequency specification”
• Parent of the workflow
• The master of time
• Abstraction: “nominal time”
• From the workflow to “coordinator actions”
• The master of dependencies
• UTC!
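The “nominal time” abstraction can be sketched in a few lines of Python: given a start, an end and a frequency, each coordinator action is tied to the time it was scheduled for, not the time it happens to run. The function names are hypothetical, and treating the end as exclusive is an assumption of this sketch rather than a statement about Oozie's exact behaviour.

```python
from datetime import datetime, timedelta, timezone


def nominal_times(start, end, every):
    """Return the nominal times start, start + every, ... that fall before end."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("use timezone-aware datetimes")
    if every <= timedelta(0):
        raise ValueError("frequency must be positive")
    # Normalise to UTC so daylight-saving changes never shift the schedule.
    current = start.astimezone(timezone.utc)
    end_utc = end.astimezone(timezone.utc)
    times = []
    while current < end_utc:  # end treated as exclusive (assumption)
        times.append(current)
        current += every
    return times


def format_oozie(moment):
    """Format a datetime in the yyyy-MM-ddTHH:mmZ style used in coordinator XML."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%MZ")
```

For example, a start of 2015-01-01 00:00 UTC, an end one day later and a six-hour frequency give four nominal times. A rerun of the second one still sees `2015-01-01T06:00Z`, whenever it is actually executed.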
![Page 10: A bas Cron ! Vive Oozie !](https://reader035.vdocuments.site/reader035/viewer/2022062412/58d06fab1a28abc9788b4d2d/html5/thumbnails/10.jpg)
Execution
• Fire and forget
• Worker map + child
• Worker calls home
• Optional callbacks
• Flags on HDFS
![Page 11: A bas Cron ! Vive Oozie !](https://reader035.vdocuments.site/reader035/viewer/2022062412/58d06fab1a28abc9788b4d2d/html5/thumbnails/11.jpg)
Pros & C(r)ons
• Yes, it's a terrible pun
• Parallelism, retries
• The precious abstraction of time
• Constraint: idempotency
• SLA
• UTC
• => WTH?
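The idempotency constraint follows from retries and reruns: the same action may execute more than once for the same nominal time. One way to satisfy it, sketched here in Python with a local directory standing in for HDFS, is to derive the output location from the nominal time only and to overwrite rather than append. The function name, directory layout and file name are assumptions made for the example.

```python
from datetime import datetime, timezone
from pathlib import Path


def run_for(nominal_time, out_root):
    """Hypothetical job body: (re)compute the output for one nominal time."""
    # The output location depends only on the nominal time, never on "now".
    out_dir = Path(out_root) / nominal_time.strftime("%Y/%m/%d/%H")
    out_dir.mkdir(parents=True, exist_ok=True)
    final = out_dir / "part-0.txt"
    tmp = out_dir / "part-0.txt.tmp"
    # Write to a temporary file, then replace: a rerun overwrites the
    # previous result instead of appending to it or creating a duplicate.
    tmp.write_text(f"report for {nominal_time.isoformat()}\n")
    tmp.replace(final)
    return final
```

Running `run_for` twice with the same nominal time leaves exactly one output file with identical content, which is what makes a retry or a rerun safe.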
![Page 12: A bas Cron ! Vive Oozie !](https://reader035.vdocuments.site/reader035/viewer/2022062412/58d06fab1a28abc9788b4d2d/html5/thumbnails/12.jpg)
Perl & Oozie
• REST: listen, parse, introspect
• Monitor, submit, start, stop, query
• Shell action outside the JVM
• STDIN/OUT/ERR captured
• Use WebHDFS
• TRANSFORM in Hive
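The REST operations on this slide boil down to a handful of HTTP requests, which any client can issue, including Perl's LWP. The sketch below uses Python for illustration only: it builds the URLs for listing jobs, inspecting one job and changing its state. The server address is a placeholder, the helper names are invented, and only a few of the available parameters are shown.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

OOZIE_URL = "http://oozie.example.com:11000/oozie"  # hypothetical server


def jobs_url(base, status=None, length=50):
    """URL listing jobs, optionally filtered by status (e.g. RUNNING)."""
    params = {}
    if status:
        params["filter"] = f"status={status}"
    params["len"] = length
    return f"{base}/v1/jobs?{urlencode(params)}"


def job_info_url(base, job_id):
    """URL returning the JSON description of a single job."""
    return f"{base}/v1/job/{quote(job_id, safe='')}?show=info"


def job_action_request(base, job_id, action):
    """Build (without sending) the PUT request that changes a job's state."""
    if action not in {"start", "suspend", "resume", "kill"}:
        raise ValueError(f"unsupported action: {action}")
    url = f"{base}/v1/job/{quote(job_id, safe='')}?action={action}"
    return Request(url, method="PUT")


def fetch_json(url):
    """GET a URL and decode the JSON body (needs a reachable Oozie server)."""
    with urlopen(url) as response:
        return json.load(response)
```

With a live server, `fetch_json(jobs_url(OOZIE_URL, "RUNNING"))` would return the running jobs as a dictionary, and passing the result of `job_action_request(...)` to `urlopen` would apply the state change. Authentication, which many clusters require, is left out of this sketch.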
![Page 13: A bas Cron ! Vive Oozie !](https://reader035.vdocuments.site/reader035/viewer/2022062412/58d06fab1a28abc9788b4d2d/html5/thumbnails/13.jpg)
Perl & Oozie
• @booking.com
![Page 14: A bas Cron ! Vive Oozie !](https://reader035.vdocuments.site/reader035/viewer/2022062412/58d06fab1a28abc9788b4d2d/html5/thumbnails/14.jpg)
Thank you!
• Questions?