A Statistician Walks into a Tech Company
R at a rapidly scaling healthcare technology startup
Sandy Griffith
Twitter: @sgrifter
[email protected]
My story
Academic biostatistics
© 2016 Flatiron Health, Inc. Proprietary and confidential.
Academic biostatistics → Healthcare tech
Our Mission
Flatiron’s mission is to serve cancer patients and our partners by dramatically improving treatment and accelerating research.
Flatiron Processes EHR Data At Scale
Diagram: from the Electronic Health Record to Research-Grade Data
● Structured Data: Demographics, Diagnosis, Visits, Labs, e-Prescribing
● Unstructured Data: Pathology Report, Discharge Notes, Radiology Report, Physician Notes
● Outside Practice: Hospital, Lab
● Processing steps: Structured Data Processing, Unstructured Data Processing
● Data labels: Standard EHR Data, Research-Grade Data
Rapidly Scaling
January 2015
● Flatiron: ~140
● Software Engineers: ~50
● Quantitative Sciences team: 1
Now: We are a team of 262
All Flatiron data and tools are collaboratively built, implemented, and maintained by a cross-disciplinary team that includes oncology, engineering, and quantitative sciences.
We include…
● 9 Medical oncologists and nurses
● 70 Software engineers
● 10 Quantitative scientists
● 5 Medical informaticists
● + more!
We come from…
Primary Language: time of hire
Proficiency with R: time of hire
A decision point early on
Cultivate R culture
1. Internal R Package
2. User group
3. Slack channel
4. Trainings
5. Hiring
Proficiency with R
Chart panels: Time of hire, Now
Now we have R users, but when should we use R?
Three scenarios:
1. R for prototyping → !R in production
2. R as a long-term solution
3. R and !R in parallel
R for prototyping → !R in production
Example: Linking external mortality data
Prototype
● One-time linkage
● Small cohort (tens of thousands)
● RecordLinkage R package
● Probabilistic linkage method using EM algorithm
Production
● Repeated daily at scale
● Large cohort (~5 million patients)
● Code maintained by different team
● Deterministic logic in SQL
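To make the contrast between the two linkage styles concrete, here is a minimal sketch, written in Python purely for illustration (the talk's prototype used the RecordLinkage R package and production used SQL). The field names, the sample records, and the m/u probabilities are all assumptions; an EM-based method would estimate those probabilities from the data, which this sketch does not do, and the "all fields agree" rule is only one possible deterministic rule, not the production logic.

```python
import math

# Hypothetical comparison fields; the real linkage keys are not given in the talk.
FIELDS = ("last_name", "birth_date", "ssn_last4")

# Assumed m/u probabilities (illustrative values only). An EM-based approach
# would estimate these from the data instead of fixing them by hand.
M_PROB = {"last_name": 0.95, "birth_date": 0.97, "ssn_last4": 0.99}
U_PROB = {"last_name": 0.01, "birth_date": 0.001, "ssn_last4": 0.0001}


def agreement(rec_a, rec_b):
    """Return field -> True/False for exact agreement on non-missing values."""
    return {
        f: rec_a.get(f) is not None and rec_a.get(f) == rec_b.get(f)
        for f in FIELDS
    }


def deterministic_match(rec_a, rec_b):
    """Rule-based link: every field must agree exactly."""
    return all(agreement(rec_a, rec_b).values())


def probabilistic_weight(rec_a, rec_b):
    """Fellegi-Sunter style log2 match weight, summed over fields."""
    weight = 0.0
    for field, agrees in agreement(rec_a, rec_b).items():
        m, u = M_PROB[field], U_PROB[field]
        if agrees:
            weight += math.log2(m / u)
        else:
            weight += math.log2((1 - m) / (1 - u))
    return weight


# Made-up records for demonstration only.
patient = {"last_name": "rivera", "birth_date": "1950-02-11", "ssn_last4": "1234"}
death_record = {"last_name": "rivera", "birth_date": "1950-02-11", "ssn_last4": "1234"}
near_miss = {"last_name": "rivero", "birth_date": "1950-02-11", "ssn_last4": "1234"}

print(deterministic_match(patient, death_record))
print(deterministic_match(patient, near_miss))
print(round(probabilistic_weight(patient, near_miss), 2))
```

The deterministic rule rejects the near miss outright, while the weighted score still ranks it as a likely match; that is the kind of trade-off between transparency and flexibility the slide describes.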
Why this made sense:
● Stable method: no longer needed rapid iteration
● Tuning parameters
● Similar performance, more transparency
● No R users on the team that would be maintaining the code
R as a long-term solution
Example: Rmarkdown QA report
Early version (Jan 2015)
● bash commands for extracting data run from R script using ETL tool
● R script run via command line
● parameters in metafiles manually updated
● Runs a series of Rmd files and renders HTML output
Current version (April 2016)
● linked to data pipeline maintained by software engineering
● metafile generated dynamically
● Plotly survival curves
● Flatly bootstrap theme
● Plan to continue using R indefinitely
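The move from manually updated metafiles to a dynamically generated one can be sketched roughly as follows. This is an illustration in Python, not the report's actual tooling (which the talk describes as R Markdown tied to an engineering-maintained pipeline). The JSON format, the function name `build_metafile`, and the parameter names are all hypothetical.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def build_metafile(cohort, data_cutoff, output_dir):
    """Write report parameters to a JSON metafile and return its path.

    The parameter names here are hypothetical; the talk does not list them.
    """
    params = {
        "cohort": cohort,
        "data_cutoff": data_cutoff.isoformat(),
        "generated_on": date.today().isoformat(),
    }
    path = Path(output_dir) / f"{cohort}_meta.json"
    path.write_text(json.dumps(params, indent=2))
    return path


# Generate the metafile from run-time values instead of editing it by hand.
with tempfile.TemporaryDirectory() as tmp:
    meta_path = build_metafile("example_cohort", date(2016, 3, 31), tmp)
    loaded = json.loads(meta_path.read_text())
    print(loaded["cohort"], loaded["data_cutoff"])
```

A report script would then read the metafile at render time, so each pipeline run produces its own parameters without a manual edit step.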
Why this made sense:
● Mature product and team
● Quantitative science members remain embedded in team
● Strong support and collaboration with software engineering
● Requirements are dynamic: continued need for rapid prototyping
R and !R in parallel
Example: Some external collaborations
● Specific research questions
● 2 people code independently in Python/SQL and R
● Compare results
● Language sometimes incidental, more about 2 different perspectives
Why this made sense:
● High stakes or low error tolerance
● Complicated concepts
● Custom projects often involve novel problems
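The "compare results" step might look something like the sketch below, assuming each analyst exports one numeric value per patient. The function name `compare_results`, the tolerance, and the sample values are hypothetical and only illustrate the idea of reconciling two independent implementations.

```python
def compare_results(results_a, results_b, tolerance=1e-9):
    """Return keys missing from either side and keys whose values differ."""
    keys_a, keys_b = set(results_a), set(results_b)
    only_a = sorted(keys_a - keys_b)
    only_b = sorted(keys_b - keys_a)
    mismatched = sorted(
        k for k in keys_a & keys_b
        if abs(results_a[k] - results_b[k]) > tolerance
    )
    return {"only_in_a": only_a, "only_in_b": only_b, "mismatched": mismatched}


# Made-up per-patient follow-up times (months) from two independent analysts.
from_r = {"p1": 12.0, "p2": 7.5, "p3": 30.0}
from_python_sql = {"p1": 12.0, "p2": 8.0, "p4": 3.0}

report = compare_results(from_r, from_python_sql)
print(report)
```

Each discrepancy becomes a prompt for the two analysts to trace where their interpretations of the research question diverged, which fits the slide's point that the value lies in the two perspectives more than in the two languages.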
Thank you
● Melissa Curtis
● Josh Kraut
● Kathi Seidl-Rathkopf
● Cindy Revol
● Rachael Sorg
● Jay Rughani
● Paul You
● Aracelis Torres
● Alphan Kirayoglu
● Ben Birnbaum
● Ann Jaskiw
● James Gippetti
Join our Team!
Drop me a note at [email protected], @sgrifter, or visit flatiron.com/careers