Designing for self-serve science

Daniel Halperin

How much time “handling data” vs “doing science”?

“I sort both my spreadsheets on Gene ID, then I copy matches into a new one”
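The manual procedure in this quote is a join on Gene ID. As a minimal sketch, assuming two small hypothetical tables (`expression` and `annotation`, with made-up rows) in place of the scientist's spreadsheets, the same result can be obtained with one SQL statement run through Python's built-in `sqlite3` module:

```python
import sqlite3

# Hypothetical rows standing in for the two spreadsheets (not from the talk).
expression = [("g1", 2.5), ("g2", 0.7), ("g4", 1.1)]
annotation = [("g1", "kinase"), ("g3", "transporter"), ("g4", "ligase")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE expression (gene_id TEXT, level REAL)")
conn.execute("CREATE TABLE annotation (gene_id TEXT, label TEXT)")
conn.executemany("INSERT INTO expression VALUES (?, ?)", expression)
conn.executemany("INSERT INTO annotation VALUES (?, ?)", annotation)

# "Sort both on Gene ID and copy the matches" is a single join.
matches = conn.execute(
    "SELECT e.gene_id, e.level, a.label "
    "FROM expression AS e JOIN annotation AS a ON e.gene_id = a.gene_id "
    "ORDER BY e.gene_id"
).fetchall()

print(matches)
```

Only genes present in both tables appear in `matches`, which is what the copy-the-matches step produces by hand.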

We are the problem

[Chart: Benchmark 1 and Benchmark 2, comparing Old system, Your system, and Our system]

[Chart: Benchmark 1 and Benchmark 2, comparing Old system, Your system, Our system, and What people use]

[Figure: complexity scale marking “What we build” and “What they need”, annotated “Design for here”]

Steve Jurvetson https://www.flickr.com/photos/jurvetson/7408464122

sutton-images.com http://biser3a.com/formula-1/f1-airboxes-all-you-need-to-know/

terms: http://sutton-images.com/terms.asp

Lowering barrier to entry

Developing a new language

• SQL: 3 great features for science
  • THE language of data management!
  • We know how to scale it
  • Scientists can learn it

• MyriaL is better
  • Imperative & declarative: easy to write
  • Iteration & recursion!
  • Lots of practical extensions
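To illustrate the kind of iterative computation the “Iteration & recursion” bullet refers to, here is a minimal sketch in Python. It is not MyriaL syntax; the function name `reachable` and the edge data are hypothetical. It repeats a join-like step until no new results appear, i.e. until a fixpoint is reached:

```python
def reachable(edges, start):
    """Return every node reachable from `start` by following edges."""
    reached = {start}
    frontier = {start}
    while frontier:
        # One iteration: "join" the current frontier with the edge table.
        next_nodes = {dst for (src, dst) in edges if src in frontier}
        # Keep only nodes not seen before; stop when there are none.
        frontier = next_nodes - reached
        reached |= frontier
    return reached


# Hypothetical edge table.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")]
print(sorted(reachable(edges, "a")))
```

The loop terminates because each pass either adds at least one new node or empties the frontier.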

Giving users insight

Diagnosing problems

[Figure: diagnostic visualization; the only surviving label is “Destination node”]

Automating the ‘CS parts’

• Do work on the user’s behalf (Ratul Mahajan’s Buffet Principle):

• Infer indexes and constraints!

• Aggressively reuse computation

• Speculatively apply queries to data

• Key enabler: science data is (mostly) read-only
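One way to picture “aggressively reuse computation” is a result cache keyed on the query text, which is safe only because the underlying data is assumed to be read-only. The sketch below is an illustration under that assumption, not the system described in the talk; `CachingRunner` and the `reads` table are hypothetical names.

```python
import sqlite3


class CachingRunner:
    """Run SQL against a read-only database, reusing earlier results."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}  # query text -> list of result rows
        self.hits = 0    # how many times a cached result was reused

    def run(self, sql):
        key = sql.strip()
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        rows = self.conn.execute(sql).fetchall()
        # Caching is valid only while the data never changes.
        self.cache[key] = rows
        return rows


# Hypothetical data set.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reads (sample TEXT, n INTEGER)")
conn.executemany("INSERT INTO reads VALUES (?, ?)", [("s1", 10), ("s2", 20)])

runner = CachingRunner(conn)
first = runner.run("SELECT SUM(n) FROM reads")
second = runner.run("SELECT SUM(n) FROM reads")  # served from the cache
print(first, second, runner.hits)
```

If the data could be updated, every write would have to invalidate affected cache entries, which is why the read-only property is the key enabler.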

Enable authoring & sharing

• “Autocomplete for science” - predict query snippets as users work. (Nodira Khoussainova)

• Natural language interface: queries → English questions → queries. Example: “Compute the fraction of CGs that are methylated in the oyster genome.”
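As a sketch of the query that such an English question might correspond to, assume a hypothetical table `cg_sites` with one row per CG site and a 0/1 `methylated` flag. The schema and the rows are invented for illustration; the talk does not describe the actual data layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per CG site, methylated is 1 or 0.
conn.execute("CREATE TABLE cg_sites (position INTEGER, methylated INTEGER)")
conn.executemany(
    "INSERT INTO cg_sites VALUES (?, ?)",
    [(101, 1), (250, 0), (391, 1), (472, 1)],
)

# The average of a 0/1 flag is the fraction of rows where the flag is 1.
(fraction,) = conn.execute("SELECT AVG(methylated) FROM cg_sites").fetchone()
print(fraction)
```

Because the flag is stored as 0 or 1, `AVG` directly yields the methylated fraction without a separate count of each group.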

Improve their state of the art

• “You just did in 1 minute what took me a week”

• “Replaced 100 lines of Python with 1 line of SQL”

• “That 5-line MyriaL program was 100x faster than my R cluster, and much simpler”

Trust, but Verify (& Support)
