Snowplow: where we came from and where we are going - March 2016
![Page 1: Snowplow: where we came from and where we are going - March 2016](https://reader036.vdocuments.site/reader036/viewer/2022081520/58738c8c1a28ab272d8b6ebf/html5/thumbnails/1.jpg)
Where we came from and where we’re going
March 2016
![Page 2: Snowplow: where we came from and where we are going - March 2016](https://reader036.vdocuments.site/reader036/viewer/2022081520/58738c8c1a28ab272d8b6ebf/html5/thumbnails/2.jpg)
Snowplow was born in 2012
Web data: rich but GA / SiteCatalyst are limited
• Marketing, not product analytics
• Siloed: can’t join with other customer data
“Big data” tech
• Open source frameworks
• Cloud services
Snowplow
• Open source click stream data warehouse
• Event level: any query
• Built on top of CloudFront / EMR / Hadoop
![Page 3: Snowplow: where we came from and where we are going - March 2016](https://reader036.vdocuments.site/reader036/viewer/2022081520/58738c8c1a28ab272d8b6ebf/html5/thumbnails/3.jpg)
The plan: spend 6 months building a pipeline…
…then get back to using the data
![Page 4: Snowplow: where we came from and where we are going - March 2016](https://reader036.vdocuments.site/reader036/viewer/2022081520/58738c8c1a28ab272d8b6ebf/html5/thumbnails/4.jpg)
So what went wrong?
![Page 5: Snowplow: where we came from and where we are going - March 2016](https://reader036.vdocuments.site/reader036/viewer/2022081520/58738c8c1a28ab272d8b6ebf/html5/thumbnails/5.jpg)
Increased project scope
• Click stream data warehouse -> Event analytics platform
• Collect events from anywhere, not just the web
• Make event data actionable in real-time
• Support more in-pipeline processing steps (enrichment and modeling)
• Support more storage targets (where your data lives has big implications for what you can do with it)
![Page 6: Snowplow: where we came from and where we are going - March 2016](https://reader036.vdocuments.site/reader036/viewer/2022081520/58738c8c1a28ab272d8b6ebf/html5/thumbnails/6.jpg)
Track events from anywhere
• Events
• Entities
![Page 7: Snowplow: where we came from and where we are going - March 2016](https://reader036.vdocuments.site/reader036/viewer/2022081520/58738c8c1a28ab272d8b6ebf/html5/thumbnails/7.jpg)
Make event data actionable in real-time
• Personalization
• Marketing automation
• Content analytics
![Page 8: Snowplow: where we came from and where we are going - March 2016](https://reader036.vdocuments.site/reader036/viewer/2022081520/58738c8c1a28ab272d8b6ebf/html5/thumbnails/8.jpg)
Today, Snowplow is an event data pipeline
![Page 9: Snowplow: where we came from and where we are going - March 2016](https://reader036.vdocuments.site/reader036/viewer/2022081520/58738c8c1a28ab272d8b6ebf/html5/thumbnails/9.jpg)
What makes Snowplow special?
• Data pipeline evolves with your business
• Channel coverage
• Flexibility: where your data is delivered
• Flexibility: how your data is processed (enrichment and modeling)
• Data quality
• Speed
• Transparency
![Page 10: Snowplow: where we came from and where we are going - March 2016](https://reader036.vdocuments.site/reader036/viewer/2022081520/58738c8c1a28ab272d8b6ebf/html5/thumbnails/10.jpg)
Used by 100s (1000s?) of companies…
…to answer their most important business questions
![Page 11: Snowplow: where we came from and where we are going - March 2016](https://reader036.vdocuments.site/reader036/viewer/2022081520/58738c8c1a28ab272d8b6ebf/html5/thumbnails/11.jpg)
But there’s still much more to build!
• Improve automation around schema evolution
Iglu: machine-readable schema registry
• Make modeling event data easier, more robust, more performant
Data modeling in Spark
• Support more storage targets
Druid, BigQuery, graph databases
• Make it easier to act on event data
Analytics SDKs, Sauna
![Page 12: Snowplow: where we came from and where we are going - March 2016](https://reader036.vdocuments.site/reader036/viewer/2022081520/58738c8c1a28ab272d8b6ebf/html5/thumbnails/12.jpg)
Questions?
• Can take questions now or after the other talks