Scaling Data Science: Engineering a Platform

Upload: datascience

Post on 20-Mar-2017

TRANSCRIPT

Page 1: Scaling Data Science: Engineering a Platform

Engineering a Platform

Scaling DataScience

Page 2: Scaling Data Science: Engineering a Platform

Who is DataScience?

● Do you need…
  ● …insights about your data?
  ● …to hire or train data scientists to provide actionable insights?
  ● …a platform for consuming cutting-edge data science and publishing within your org, without having data scientists doing ops work?
  ● …to focus your data scientists on your core business, while we provide models for LTV, pricing, retention and more?
● Visit www.datascience.com!

Page 3: Scaling Data Science: Engineering a Platform

Two Primary Challenges

● Rapid Team Growth
  ● Team member onboarding requires a low-risk, standardized toolchain. Time to first contribution is measured in hours, not days or weeks.
● Dynamic Tooling Landscape
  ● Best-of-breed data tools are always changing. Our culture and platform encourage experimentation and evaluation of new tools and techniques.

Page 4: Scaling Data Science: Engineering a Platform

Rapid Onboarding

● A packaged virtual development environment
● No wrestling with complex system dependencies and version compatibilities
● A clean starting point to quickly retreat to at any time
● Monitoring and diagnostics
● Scripted automation for customizing per user

Page 5: Scaling Data Science: Engineering a Platform

Rapid Onboarding

● Upgrades are vetted prior to wide release

● Continuous integration provides automated feedback

● Group chat pushes institutional knowledge out into a searchable, company-wide record.

● A culture of sharing and demonstration

Page 6: Scaling Data Science: Engineering a Platform

Dynamic Landscape

● Configuration management to quickly compose systems based on requirements

● Software tools are constantly evolving. A robust virtual environment promotes experimentation and iteration.

● Think in terms of categories of tools, not specific technologies.
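
One way to picture "categories, not specific techs" is a small lookup that resolves a requirement (a category) to whichever tool currently fills it. The sketch below is only an illustration under stated assumptions: the `CATALOG` contents, the tool names, and the `compose_environment` helper are all hypothetical and do not come from the talk.

```python
# Hypothetical catalog mapping a tool category to the currently preferred tool.
# The entries are illustrative assumptions, not the presenters' actual stack.
CATALOG = {
    "notebook": "jupyter",
    "dataframe": "pandas",
    "version_control": "git",
}


def compose_environment(required_categories, overrides=None):
    """Resolve each required category to a concrete tool.

    `overrides` lets an experiment swap the tool for a category
    without touching the requirements themselves.
    """
    chosen = dict(CATALOG)
    chosen.update(overrides or {})
    missing = [c for c in required_categories if c not in chosen]
    if missing:
        raise KeyError(f"no tool registered for: {missing}")
    return {c: chosen[c] for c in required_categories}
```

Because requirements name categories, trying a new tool is a one-line override, for example `compose_environment(["dataframe"], overrides={"dataframe": "polars"})`, while everything that depends on the category stays unchanged.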

Page 7: Scaling Data Science: Engineering a Platform

Dynamic Landscape

● Version control: Track changes to analysis over time. Promote reproducibility.

● Automated testing: An engineering approach to analysis quality

● Integrated publishing: A publishing workflow that closely follows the underlying analysis
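
As a minimal sketch of what automated testing of an analysis could look like, assume a hypothetical `retention_rate` function kept under version control next to its tests. The function, its definition of retention, and the test names are assumptions made for illustration; the talk does not show its own code.

```python
def retention_rate(active_start, active_end):
    """Fraction of users active at the start who are still active at the end."""
    start = set(active_start)
    if not start:
        raise ValueError("no users at start of period")
    return len(start & set(active_end)) / len(start)


def test_retention_rate_basic():
    # Four users at the start, two of them still present at the end.
    assert retention_rate(["a", "b", "c", "d"], ["a", "b", "x"]) == 0.5


def test_retention_rate_rejects_empty_cohort():
    # An empty starting cohort should fail loudly instead of dividing by zero.
    try:
        retention_rate([], ["a"])
    except ValueError:
        return
    raise AssertionError("expected ValueError")
```

A test runner such as pytest would collect the `test_` functions automatically, so a continuous integration job can flag a change that silently alters a metric's definition.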

Page 8: Scaling Data Science: Engineering a Platform

Thank you. Questions?

We’re hiring.

Page 9: Scaling Data Science: Engineering a Platform

Appendix: Rapid Onboarding

Page 10: Scaling Data Science: Engineering a Platform

Appendix: Dynamic Landscape

Page 11: Scaling Data Science: Engineering a Platform

Thank you.