from data to knowledge with workflows & provenance
DESCRIPTION
NCSA colloquium on Sept 12, 2014: http://illinois.edu/calendar/detail/1435?eventId=32072828&calMin=201409&cal=20140209&skinId=160
TRANSCRIPT
![Page 1: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/1.jpg)
From Data to Knowledge with Workflows & Provenance
Bertram Ludäscher
Graduate School of Library and Information Science (GSLIS)
Affiliate: National Center for Supercomputing Applications (NCSA), Department of Computer Science (CS @ Illinois)
![Page 2: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/2.jpg)
NCSA Colloquium Sep 12, 2014 Data to Knowledge w/ Scientific Workflows & Provenance B. Ludäscher
Outline
• About Yours Truly
  – … where I’m coming from
  – … strange loops …
• From Data To Knowledge …
• … Scientific Workflows (CI “Upper-Ware”)
• … and Provenance (part of CI “Underware”)
• Other Research Interests & Projects
  – Reprise (… me not)
  – Sept. 19: CIRSS Seminar @ GSLIS (Reasoning about Taxonomies)
  – Sept. 23: (Oct 7) Yahoo!-DAIS Seminar @ CS (First-order Provenance Games)
![Page 3: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/3.jpg)
Some Personal Provenance …
• Studies of Computer Science at Uni Karlsruhe (TH)
  – … my Alma Mater now defunct!?? :-(
  – … deus ex machina: K.I.T. (Karlsruhe Institute of Technology) :-)
  – Fridericiana Polytechnic (1825) ... TU Karlsruhe (1865) ... KIT (2009)
• Undergrad work: Task-Setup Service (TSS)
  – part of HECTOR (HEterogeneous Computers TOgetheR, IBM & U-KA), top-layer above DACNOS (Distributed Academic Network Operating System)
  – early “upper-ware”!
• … (scientific) workflows!!
![Page 4: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/4.jpg)
Sacred Scrolls … Prophesying the Grid (DACNOS) & workflows (TSS)
Foerster, Cora. "Controlling Distributed User Tasks in Heterogeneous Networks." In HECTOR: Heterogeneous Computers Together. A Joint Project of IBM and the University of Karlsruhe. Springer Berlin Heidelberg, 1988.
“All this has happened before, and all this will happen again”
![Page 5: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/5.jpg)
… too much C hacking … on to AI & Logic!
• Workflows? Hacking?
  – Boring…
• Databases??
  – Boooring!!
• AI, Logic Programming?
  – Sounds good!
  – Non-monotonic reasoning
    • Well-founded semantics
    • Stable models (now ASP)
• MSc (Diplom)
  – First-order theorem prover (BDD variant)
“All this has happened before, and all this will happen again”
![Page 6: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/6.jpg)
… and onto (logic) databases!
• PhD at University of Freiburg
“All this has happened before, and all this will happen again”
![Page 7: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/7.jpg)
… fast forward to the present (back to the future!)
• Datalog becomes popular again:
  – Datalog 2.0 in Oxford and Vienna: The resurgence of Datalog in academia and industry
• Statelog is in demand again
  – The Declarative Imperative: Experiences and Conjectures in Distributed Logic. Joe Hellerstein. PODS Keynote, 2010.
• LogicBlox Inc. (Atlanta)
  – Re-invent how enterprise software is built
  – Under the hood: LogiQL
    • … a high-performance Datalog engine
![Page 8: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/8.jpg)
Vision (Molham Aref, LogicBlox): Re-invent how enterprise software is built
Unified Runtime based on Datalog
Language: Datalog Plus
• Skolem functions
• Existentials in the head
• Meta-Programming layer
• Integration with LP Solvers
• Expressive constraints
• ...
Execution Engine
Cloud:
• Cost-based optimizer
• Versioned data-structures
• Full serializability
Browser:
• Compiled to Javascript
![Page 9: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/9.jpg)
… down by the sea!
• 1998-2004 SDSC & CSE Dept
  – NARA, digital libraries
    • w/ Reagan Moore
  – Data Integration research
  – Started Kepler
    • w/ Matt Jones, Ilkay Altintas, …
    • Head start: Ptolemy II (open source)
      – EECS @ Berkeley (E.A. Lee)
  – Naming things is fun!
    • Mediation of Information in XML (MIX)
    • Blended Browsing & Querying (BBQ)
    • Knowledge-based Information Integration of Neuroscience Data (KIND)
    • Ptolemy.. Copernicus … Kepler!
    • Neon… Geosciences … Network … GEON!
![Page 10: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/10.jpg)
… from SoCal to NorCal … to the Midwest!
• 2004-2014 UC Davis, Department of Computer Science
• Major projects (finished)
  – Kepler/CORE, pPOD, ChIP-chip, COMET, SDM, REAP
• Ongoing & new:
  – FilteredPush
  – Euler, Exploring Taxon Concepts
  – DataONE
  – Kurator
• Research themes (& names :-)
  – Scientific data mgmt, workflows, provenance, KR&R, data curation …
  – Kepler/COMAD, X-CSR, Euler …
![Page 11: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/11.jpg)
The 4th Paradigm
• CI, e-Science • bioinformatics • ecoinformatics • geoinformatics
• Big Data • Data Science • Information Science • Digital Humanities …
![Page 12: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/12.jpg)
Scientific Workflows: Cyberinfrastructure “Upperware”
Underware
Middleware
Upper Middleware
Upperware
NSF/SEEK ITR collaboration (2002-2008): SDSC, UCSB, UC Davis, UNM, UK, …
![Page 13: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/13.jpg)
Problem: Stitching together Tools and Databases
• Tool Integration
  – local, remote, tools, services, databases, applications
  – BLAST on myPC?
  – My R script on the cluster?
• Data Handling
  – Where’s the data? Access methods?
  – A.out doesn’t fit B.in
  – Many runs, experiments
• Automate, optimize, scale, reuse, share wfs
• “Explain” results
![Page 14: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/14.jpg)
“Integration Technologies” for Data, Tools, Models
• State of the art in tool integration often involves plumbing, stitching, and stapling …
![Page 15: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/15.jpg)
Scientific Workflows: ASAP!
• Automation
  – wfs to automate computational aspects of science
  – batch processing, scripting
• Scaling (exploit and optimize machine cycles)
  – wfs should make use of parallel compute resources
    • dataflow-orientation avoids von Neumann bottleneck
    • use parallel MoCs when deploying on cluster, cloud
  – wfs should be able to handle large data
• Abstraction, Evolution, Reuse (human cycles)
  – wfs should be easy to change, evolve, share, reuse
• Provenance
  – wfs should capture processing history, data lineage ⇒ traceable data- and wf-evolution ⇒ Reproducible Science
![Page 16: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/16.jpg)
WATERS: Workflow for Alignment, Taxonomy, Ecology of Ribosomal Sequences (Amber Hartman; Eisen Lab; UC Davis)
Diagram labels:
• Find OTUs (OTUHunter)
• Assign Taxonomy (STAP)
• Profile alignment (STAP or Infernal)
• Build phylogenetic tree (RAxML or Quicktree)
• View tree: Dendroscope
• UniFrac: tree & environment file
• Assembled contigs
• Chimera check (Mallard)
• Diversity statistics
  – Text: OTU list, Chao1, Shannon
  – Graphs: rarefaction curves, rank-abundance curves
• Visualization tools: Cytoscape networks & Heat map
• Optional execution targets: +/- cipres (one step), +/- cluster (three steps)
![Page 17: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/17.jpg)
Executable WATERS Workflow in Kepler
![Page 18: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/18.jpg)
Example Bioinformatics Workflow: Motif-Catcher
Marc Facciotti et al. UC Davis Genome Center
![Page 19: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/19.jpg)
Motif-Catcher workflow, implemented in Kepler
S Köhler et al. Improved Motif Detection in Large Sequence Sets with Random Sampling in a Kepler workflow, ICCS-WS, 2012
![Page 20: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/20.jpg)
A Data-Streaming Workflow over Sensor Data
![Page 21: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/21.jpg)
Kepler Workflows & Decision Making (Kruger Natl. Park, South Africa)
SANParks Matt Jones, NCEAS @ UC Santa Barbara
![Page 22: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/22.jpg)
Scientific workflows: a(nother) silver bullet?
Beware of the Turing tar-pit in which everything is possible but nothing of interest is easy.
—Alan Perlis
![Page 23: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/23.jpg)
Scientific Workflow Design: Some Challenges
“And the graphical UI makes our scientific workflows so much easier to develop, understand and maintain!”
![Page 24: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/24.jpg)
Human Cycles vs Machine Cycles
• Traditional Computer Science and HPC focus:
  – optimize algorithms, save machine cycles
  – massively parallelize execution
• The most expensive cycles:
  – Human cycles!
  – Big scalability issues …
    • cf. Bernie’s “Big Data” ~ big problems with data!
• Not either one or the other:
  – … better together! (cf. BSG)
![Page 25: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/25.jpg)
Overview: My Scientific Workflow Research
Modeling & Design
Provenance
Parallel Execution
Fault-Tolerance, Crash Recovery
![Page 26: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/26.jpg)
• Monitor and control supercomputer simulations
  – 50+ composite actors (subworkflows)
  – 4 levels of hierarchy
  – 1000+ atomic (Java) actors
43 actors, 3 levels
196 actors, 4 levels 30 actors
206 actors, 4 levels
137 actors 33 actors
150 123 actors
66 actors 12 actors
243 actors, 4 levels
Norbert Podhorszki ORNL (then: UC Davis)
Programming in the large?
![Page 27: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/27.jpg)
"Structured Plumbing" in Kepler
Cabellos et al. Computer Physics Communications 182, 2011
![Page 28: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/28.jpg)
Modeling & Design: Die Grenzen meiner Sprache bedeuten die Grenzen meiner Welt
• Vanilla Process Network
• Functional Programming Dataflow Network
• XML Transformation Network
• Collection-oriented Modeling & Design framework (COMAD)
– “Look Ma: No Shims!”
![Page 29: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/29.jpg)
Problems with [too many] Shims and Wires
• Shims need to be placed and connected
  – Tedious, error-prone
• Distract from scientifically meaningful actors
  – Non-descriptive workflows
  – worth sharing?
• Data organization is encoded in workflow structure
  – Not robust to data changes
• Shims often lead to complex designs
  – Imagine all previous “design-patterns” intertwined
  – GOTO-programming
COMAD/VDAL: Raising the level of abstraction
• Localized control-flow
• Data management not done via wires
• Actors are coupled not by wire but by data!
![Page 30: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/30.jpg)
Pipelined Collection-Oriented Workflows
Collection-Oriented Modeling & Design (COMAD)
  – fully embrace the assembly line metaphor
  – data = tagged nested collections
  – e.g. represented as flattened, pipelined (XML) token streams
Actors (like assembly line workers) pass on what they don’t work on
T McPhillips, S Bowers, D Zinn, B Ludäscher
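The assembly-line idea can be illustrated with a minimal Python sketch. This is not the COMAD implementation in Kepler: the `actor` helper, the `(tag, payload)` token encoding, and the sample tags are all assumptions made for illustration. Each actor reacts only to tokens in its scope, forwards every token it sees, and adds its result to the stream.

```python
from typing import Callable, Iterable, Iterator, Tuple

# Hypothetical token encoding: (tag, payload), standing in for flattened XML tokens.
Token = Tuple[str, object]


def actor(scope_tag: str, func: Callable[[object], object], out_tag: str):
    """Build an actor that works only on tokens tagged `scope_tag`."""
    def run(stream: Iterable[Token]) -> Iterator[Token]:
        for tag, payload in stream:
            # Like an assembly-line worker: always pass the item downstream.
            yield (tag, payload)
            if tag == scope_tag:
                # Put the result back into the stream, next to its input.
                yield (out_tag, func(payload))
    return run


# A flattened nested collection: open/close markers delimit the collection.
stream = [
    ("open", "Project"),
    ("DNASeq", "ACGT"),
    ("Note", "field sample 7"),
    ("DNASeq", "GGCA"),
    ("close", "Project"),
]

gc_content = actor("DNASeq", lambda s: sum(c in "GC" for c in s) / len(s), "GC")
length = actor("DNASeq", len, "Length")

# Actors are coupled by the data they match, not by dedicated wires.
result = list(length(gc_content(stream)))
```

Adding a third actor, or a new kind of token such as `Note`, does not require rewiring the pipeline, which is the point the slide makes about shims.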
![Page 31: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/31.jpg)
Layers in COMAD / VDAL Pipelines
WF Graph
Configurations (white-box)
Scientific Functions (black-boxes)
CipresRAxML
  In: DNASeq+
  Thres: Float
  Method: String
  Out: (t:Tree, s:score)+
• Access data in XML stream
• Call Scientific Functions (Services)
• Put results back into stream
![Page 32: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/32.jpg)
COMAD/VDAL Actor Execution Semantics
![Page 33: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/33.jpg)
Two different workflow designs
• Hardwiring vs. configurable data/collection management
• brittle vs. change-resilient designs
• scientist can recognize napkin drawing/conceptual model
• Human cycles are expensive
![Page 34: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/34.jpg)
ADIOS in Kepler
![Page 35: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/35.jpg)
ADIOS in COMAD
![Page 36: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/36.jpg)
Conceptual Pipeline w/ Scopes & Types
Daniel Zinn et al. ICDE’09
![Page 37: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/37.jpg)
Optimizing Execution Schedules: Paral·lel
Paral·lel (Barcelona Metro)
![Page 38: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/38.jpg)
X-CSR (“XML Scissor”): Cut-Ship-Reassemble
Daniel Zinn et al. ICDE’09
![Page 39: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/39.jpg)
Workflow Execution Analysis and Optimization
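[Figure: execution schedules for Actor A and Actor B connected by a queue (invocations A:1, B:1, B:1:1 to B:1:3) over data items d1, d2, d3 in a collection <C> … </C> with COMAD layers; panels compare COMAD, Kepler PN, and the optimal schedule.]
Analysis + Data mining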
Sven Köhler
![Page 40: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/40.jpg)
Dataflow Network (generic) and Views
![Page 41: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/41.jpg)
Kahn Process Networks
Kahn, Gilles & David MacQueen. "Coroutines and networks of parallel processes." (1976).
Kahn, Gilles. "The semantics of a simple language for parallel programming." (1974)
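As a rough illustration of the Kahn model (processes connected by unbounded FIFO channels, with blocking reads and non-blocking writes), here is a small Python sketch using threads and queues. The three-stage network and all names in it are assumptions chosen for the example, not something taken from the talk or from Kepler's PN director.

```python
import threading
from queue import Queue

DONE = object()  # end-of-stream marker (an assumption of this sketch)


def producer(out: Queue, n: int) -> None:
    """Write 0..n-1 to the output channel; writes never block (unbounded queue)."""
    for i in range(n):
        out.put(i)
    out.put(DONE)


def square(inp: Queue, out: Queue) -> None:
    """Read one token at a time (blocking read) and emit its square."""
    while True:
        x = inp.get()
        if x is DONE:
            out.put(DONE)
            return
        out.put(x * x)


def collect(inp: Queue, sink: list) -> None:
    """Drain the channel into a list."""
    while True:
        x = inp.get()
        if x is DONE:
            return
        sink.append(x)


def run_network(n: int) -> list:
    a, b = Queue(), Queue()  # unbounded FIFO channels
    sink: list = []
    threads = [
        threading.Thread(target=producer, args=(a, n)),
        threading.Thread(target=square, args=(a, b)),
        threading.Thread(target=collect, args=(b, sink)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sink


squares = run_network(5)
```

Because each process only blocks on reads from a single channel, the sequence of tokens on every channel is the same no matter how the threads are scheduled, which is the determinacy property that makes this model attractive for workflows.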
![Page 42: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/42.jpg)
Synchronous Dataflow (SDF)
Lee, Edward A., and David G. Messerschmitt. "Synchronous data flow." Proc. of the IEEE 75, no. 9 (1987): 1235-1245.
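In synchronous dataflow every actor produces and consumes a fixed number of tokens per firing, so a periodic schedule can be derived before execution by solving the balance equations r[src] · produced = r[dst] · consumed for each edge. The sketch below computes the resulting repetition vector; the function name, the edge encoding, and the example rates are assumptions for illustration, and the sketch does not construct the actual firing order.

```python
from fractions import Fraction
from math import lcm


def repetition_vector(actors, edges):
    """Solve the SDF balance equations for a connected graph.

    actors: list of actor names
    edges:  list of (src, dst, produced, consumed) tuples with fixed token rates
    Returns the smallest positive integer number of firings per actor.
    """
    rate = {actors[0]: Fraction(1)}
    pending = list(edges)
    while pending:
        progressed = False
        for edge in list(pending):
            src, dst, produced, consumed = edge
            if src in rate and dst in rate:
                # Both ends known: the edge must be balanced.
                if rate[src] * produced != rate[dst] * consumed:
                    raise ValueError("inconsistent rates: no periodic schedule")
            elif src in rate:
                rate[dst] = rate[src] * produced / consumed
            elif dst in rate:
                rate[src] = rate[dst] * consumed / produced
            else:
                continue  # neither end reached yet; retry later
            pending.remove(edge)
            progressed = True
        if not progressed:
            raise ValueError("graph is not connected")
    # Scale the rational solution to integers.
    scale = lcm(*(r.denominator for r in rate.values()))
    return {a: int(r * scale) for a, r in rate.items()}


# Hypothetical three-actor chain: A emits 2 tokens per firing, B consumes 3, and so on.
firings = repetition_vector(
    ["A", "B", "C"],
    [("A", "B", 2, 3), ("B", "C", 1, 2)],
)
```

For this chain, one schedule period fires A three times, B twice, and C once, after which every queue is back at its initial fill level.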
![Page 43: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/43.jpg)
Workflow Recovery in SDF
![Page 44: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/44.jpg)
Idea: “Rescue DAG” (cf. Condor/DAGMan)
Sven Köhler et al. Improving Workflow Fault Tolerance through Provenance-Based Recovery. SSDBM 2011
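The rescue-DAG idea can be sketched in a few lines of Python: given the task graph and the set of tasks that the provenance record shows as completed, only the remaining tasks are scheduled again. This is a simplified illustration, not the recovery algorithm of the cited paper and not DAGMan's file format; the task names and the `rescue_plan` helper are hypothetical.

```python
from graphlib import TopologicalSorter


def rescue_plan(dag, completed):
    """Return the tasks that still have to run after a crash, in dependency order.

    dag:       maps each task to the set of tasks it depends on
    completed: tasks recorded as finished in the provenance trace
    A task counts as done only if all of its dependencies are done too.
    """
    done, todo = set(), []
    for task in TopologicalSorter(dag).static_order():
        if task in completed and all(dep in done for dep in dag.get(task, ())):
            done.add(task)
        else:
            todo.append(task)
    return todo


# Hypothetical workflow; the crash happened while "tree" was running.
dag = {
    "fetch": set(),
    "align": {"fetch"},
    "tree": {"align"},
    "stats": {"align"},
    "report": {"tree", "stats"},
}
completed = {"fetch", "align", "stats"}

plan = rescue_plan(dag, completed)
```

Here only `tree` and `report` are rerun; the results of `fetch`, `align`, and `stats` would be taken from the recorded outputs instead of being recomputed.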
![Page 45: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/45.jpg)
COMAD
![Page 46: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/46.jpg)
VisTrails [Juliana Freire, et al]
![Page 47: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/47.jpg)
Restflow (Tim McPhillips)
![Page 48: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/48.jpg)
So many MoCs, so little time …
![Page 49: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/49.jpg)
Outline
• About Yours Truly – … where I’m coming from – … strange loops …
• From Data To Knowledge … • … Scientific Workflows (CI “Upper-Ware”) • … and Provenance (part of CI “Underware”)
• Other Research Interests & Projects – Reprise (… me not) – Sept. 19: CIRSS Seminar @ GSLIS (Reasoning about Taxonomies) – Sept. 23: Yahoo!-DAIS Seminar @ CS (First-order Provenance Games)
![Page 50: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/50.jpg)
From “Climate Gate” to Reproducible Science
Capturing provenance is crucial for transparency, interpretation, debugging, …
=> repeatable experiments
=> reproducible science
=> need workflow-system agnostic model
![Page 51: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/51.jpg)
Data & Provenance Management: Model Chains
![Page 52: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/52.jpg)
The Data Life Cycle
![Page 53: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/53.jpg)
From Data Life-Cycle to Curation Life-Cycle
Uncanny Resemblance: Eye of Jupiter (“Vision Thing”?)
DCC Curation Lifecycle
![Page 54: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/54.jpg)
Common Uses of Provenance Data in Science
• Audit trail: trace data generation and possible errors
• Attribution: determine ownership and responsibility for data and scientific results
• Data quality: from quality of input data, computations
• Discovery: enable searching of data, methodologies and experiments
• Replication: facilitate repeatable derivation of data to maintain currency
⇒ Reproducible Science
But: different MoCs imply different Observables (and “Knowables”) ⇒ different MoPs
![Page 55: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/55.jpg)
The Executable Paper
Executable Paper Grand Challenge, International Conference on Computational Science, ICCS 2011
The Collage Authoring Environment
Piotr Nowakowski et al.
![Page 56: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/56.jpg)
Motivation: Virtual Joint Experiments
• How do we ensure that Charlie gets a complete account of the history of WC’s outputs?
• How do we ensure that Alice gets her due (partial) credit when Charlie uses Bob’s data v?
⇒ traces TA and TB will be critical
⇒ need to compose them to obtain TC
We can view the composition WC as a new, virtual workflow
[Figure: (1) Alice develops WA; (2) run RA, with data x and z; (3) Bob develops WB; (4) data sharing, with f relating z and u; (5) run RB, with data u and v; (6) Charlie inspects provenance (traces TA, TB); (7) understand, generate WC := composition of WA, WS, WB.]
![Page 57: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/57.jpg)
Provenance Composition: the Data Tree of Life (DToL)
• We can formulate our questions in terms of provenance of the datasets produced by virtual workflow WC:
  – What is the complete provenance of v?
    • Answering the question requires tracing v’s derivation all the way to x
    • But, to achieve this, we need to ensure:
      – TA and TB are properly connected
      – Provenance queries run seamlessly over and across TA and TB
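A toy Python sketch of why the traces must be connected before a lineage query can reach x. The encoding of a trace as a mapping from each item to the things it was derived from, the run names, and the explicit `link` that records u as derived from Alice's z are all assumptions made for illustration; they are not the data model used in the talk.

```python
def compose(*traces):
    """Merge several traces (item -> set of items it was derived from)."""
    merged = {}
    for trace in traces:
        for node, parents in trace.items():
            merged.setdefault(node, set()).update(parents)
    return merged


def lineage(item, trace):
    """Return everything `item` transitively depends on according to `trace`."""
    seen, stack = set(), [item]
    while stack:
        node = stack.pop()
        for parent in trace.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# TA: Alice's run derives z from x.
trace_a = {"z": {"run_A"}, "run_A": {"x"}}
# TB: Bob's run derives v from u.
trace_b = {"v": {"run_B"}, "run_B": {"u"}}
# Assumed connection created by the data-sharing step: u comes from z.
link = {"u": {"z"}}

only_bob = lineage("v", trace_b)
full = lineage("v", compose(trace_a, trace_b, link))
```

Queried against TB alone, the lineage of v stops at u; only the composed trace reaches x, and with it Alice's contribution.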
[Figure: (1) Alice develops WA; (2) run RA, with data x and z; (3) Bob develops WB; (4) data sharing, with f relating z and u; (5) run RB, with data u and v; (6) Charlie inspects provenance (traces TA, TB); (7) understand, generate WC := composition of WA, WS, WB.]
![Page 58: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/58.jpg)
Scientific Workflow Provenance in Action
WF Engine
ProvExplorer
ReproZip DataONE
ReproZip
WF Engine
![Page 59: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/59.jpg)
Data Quality & Curation Workflows
• Collections & occurrence data is all over the map
  – … literally (off the map!)
• Issues:
  – Lat/Long transposition, coordinate & projection issues
  – Data entry/creation, “fuzzy” data, naming issues, bit rot, data conversions and transformations, schema mappings, … (you name it)
• Filtered-Push Collaboration
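One of the issues listed, lat/long transposition, lends itself to a simple automated check. The sketch below is an illustrative assumption, not code from FilteredPush or Kurator: it flags a record as possibly transposed when the values are out of range as given but valid once swapped.

```python
def check_coordinates(lat: float, lon: float) -> str:
    """Classify a decimal-degree coordinate pair.

    Returns "ok", "possibly transposed", or "invalid".
    """
    if -90 <= lat <= 90 and -180 <= lon <= 180:
        return "ok"
    # Out of range as given, but valid if the two values are swapped.
    if -90 <= lon <= 90 and -180 <= lat <= 180:
        return "possibly transposed"
    return "invalid"


# Hypothetical occurrence records: (latitude, longitude)
records = [(38.5, -121.7), (-121.7, 38.5), (200.0, 500.0)]
flags = [check_coordinates(lat, lon) for lat, lon in records]
```

A range check like this only catches swaps that push the latitude outside ±90°; a swapped pair that stays in range needs more context, such as the stated country or locality, which is where a human in the loop comes in.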
![Page 60: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/60.jpg)
![Page 61: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/61.jpg)
Filtered-Push: Kurator (Data Curation Workflows)
Tianhong Song
Lei Dou (former member)
Sven Köhler
![Page 62: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/62.jpg)
Data Curation Pipeline (w/ your friends in the loop)
[SPHNC’2011]
![Page 63: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/63.jpg)
Curation Workflow: Features
• Human-in-the-loop
  – You “wrapped” your buddies/experts into the workflow!
• Uses Open Authorization
• Certain changes captured in the data
  – ... by workflow developer/engineer
  – Highlighted in the spreadsheets (cf. “duplicate records”)
• Automatic capture of provenance information
  – data lineage and processing history
• Provenance information
  – can be visualized, browsed, and queried
[SPHNC’2011]
![Page 64: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/64.jpg)
Koogle: Google Cloud + Kepler
[SPHNC’2011]
![Page 65: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/65.jpg)
Koogle Kuration package: Kepler + Google cloud (esp. spreadsheet) services
Actors and their functions:
• importer: import data to a spreadsheet
• exporter: export data from a spreadsheet
• copy: copy a spreadsheet from a template
• share: share the spreadsheet with another user
• query: query data from the spreadsheet
• auditor: allow human interaction during the execution of the workflow
[SPHNC’2011]
![Page 66: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/66.jpg)
You’ve got Mail! (Two curation requests)
[SPHNC’2011]
![Page 67: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/67.jpg)
Inspect, edit (if necessary), submit!
[SPHNC’2011]
![Page 68: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/68.jpg)
… second request
[SPHNC’2011]
![Page 69: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/69.jpg)
DONE! Summary message…
[SPHNC’2011]
![Page 70: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/70.jpg)
[SPHNC’2011]
![Page 71: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/71.jpg)
http://www.youtube.com/watch?v=DEkPbvLsud0
[SPHNC’2011]
![Page 72: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/72.jpg)
FilteredPush Curation Provenance (Spreadsheet View)
![Page 73: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/73.jpg)
… and then there is One More Thing …
![Page 74: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/74.jpg)
An End-to-End Climate Workflow
[Figure: workflow diagram with the boxes Configure Climate Model; Data Repository; Search Data; Process Data; Model Inputs; Build Climate Model; Run Climate Model; Model Outputs; Exploration, Visualization, & Analysis; Uncertainty Quantification; Diagnostics Generation; Exploratory Analysis; Model Benchmarking; Archive Data; Repository]
Src: Yaxing Wei, ORNL (EVA WG)
![Page 75: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/75.jpg)
Model Benchmarking using UV-CDAT
[Figure panels: Workflow, Result]
Src: Yaxing Wei, ORNL (EVA WG)
![Page 76: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/76.jpg)
DataONE Provenance & Semantics Use Case
The North American Carbon Program Multi-Scale Synthesis and Terrestrial Model Intercomparison Project
D. N. Huntzinger, C. Schwalm, A. M. Michalak, K. Schaefer, A. W. King, Y. Wei, A. Jacobson, S. Liu, R. B. Cook, W. M. Post, G. Berthier, D. Hayes, M. Huang, A. Ito, H. Lei, C. Lu, J. Mao, C. H. Peng, S. Peng, B. Poulter, D. Ricciuto, X. Shi, H. Tian, W. Wang, N. Zeng, F. Zhao, and Q. Zhu
Provenance
• Externally facing
• Internally facing
![Page 77: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/77.jpg)
D-OPM: DataONE version of OPM for sci-wf
D-OPM (DataONE ProvWG)
OPM-W: Daniel Garijo, Yolanda Gil
![Page 78: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/78.jpg)
Structural Integrity: Traces → Workflows
Structural integrity
Implied temporal constraints
Temporal constraint declaration
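One way to read the two kinds of constraint, sketched under assumptions: a trace is structurally consistent if every dependency edge in it maps onto an edge of the workflow graph, and each dependency also implies that its source did not happen later than its target. The function names, the `actor_of` mapping, and the tiny example workflow are hypothetical and are not taken from D-OPM itself.

```python
def structural_violations(trace_edges, actor_of, workflow_edges):
    """Trace edges whose endpoints do not map onto a workflow edge."""
    return [(src, dst) for src, dst in trace_edges
            if (actor_of[src], actor_of[dst]) not in workflow_edges]


def temporal_violations(trace_edges, timestamp):
    """Trace edges whose source is recorded later than its target."""
    return [(src, dst) for src, dst in trace_edges
            if timestamp[src] > timestamp[dst]]


# Hypothetical workflow: read -> clean -> plot
workflow_edges = {("read", "clean"), ("clean", "plot")}

# Hypothetical trace: one invocation per actor
actor_of = {"r1": "read", "c1": "clean", "p1": "plot"}
timestamp = {"r1": 1, "c1": 2, "p1": 3}

good_trace = [("r1", "c1"), ("c1", "p1")]
bad_trace = [("r1", "p1")]        # skips the clean step
backwards = [("p1", "r1")]        # target recorded before source

print(structural_violations(good_trace, actor_of, workflow_edges))
print(structural_violations(bad_trace, actor_of, workflow_edges))
print(temporal_violations(backwards, timestamp))
```

A rule-based analyzer would typically state the same two checks as integrity constraints over trace and workflow relations; the list comprehensions here are the procedural equivalent.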
![Page 79: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/79.jpg)
Logic / Rule-based Provenance Analyzer
Related: Prov-WG
Saumen Dey
![Page 80: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/80.jpg)
From Models of Computation to Models of Provenance
M. Anand, S. Bowers, et al., SSDBM’09
![Page 81: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/81.jpg)
Fine-grained, Data & MoC-aware MoP
M. Anand, S. Bowers, et al., SSDBM’09
![Page 82: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/82.jpg)
Hamming Numbers (executable Kepler workflow)
Compute Hamming numbers H in order, where H = { 2^i · 3^j · 5^k | i, j, k ≥ 0 }, a.k.a. regular numbers or 5-smooth numbers (numbers whose prime divisors are less than or equal to 5).
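For readers who want to check the sequence by hand, here is a minimal sketch of the classic three-pointer formulation: the stream is scaled by 2, 3, and 5, and the three scaled streams are merged without duplicates. It is plain Python, not the Kepler workflow from the talk, and the function name `hamming` is just an illustrative choice.

```python
from itertools import islice, takewhile


def hamming():
    """Yield the numbers 2^i * 3^j * 5^k in increasing order."""
    h = [1]                # numbers produced so far
    i2 = i3 = i5 = 0       # read positions of the x2, x3, x5 streams
    while True:
        yield h[-1]
        x2, x3, x5 = 2 * h[i2], 3 * h[i3], 5 * h[i5]
        nxt = min(x2, x3, x5)
        h.append(nxt)
        # Advance every stream that produced nxt, so duplicates are dropped.
        if nxt == x2:
            i2 += 1
        if nxt == x3:
            i3 += 1
        if nxt == x5:
            i5 += 1


first = list(islice(hamming(), 15))
up_to_1000 = list(takewhile(lambda n: n <= 1000, hamming()))
print(first)
print(len(up_to_1000))
```

The three `if` statements (rather than `elif`) matter: a value such as 6 arrives on both the ×2 and the ×3 stream and must be consumed from both.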
Babylonian clay tablet with annotations. The diagonal displays an approximation of the square root of 2 in four sexagesimal figures, which is about six decimal figures. 1 + 24/60 + 51/60^2 + 10/60^3 = 1.41421296...
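The sexagesimal arithmetic on the tablet can be checked exactly with rational numbers. This is a small sketch; the variable names are illustrative.

```python
from fractions import Fraction
from math import sqrt

# Sexagesimal figures read off the tablet: 1; 24, 51, 10
digits = [1, 24, 51, 10]

# Exact value of 1 + 24/60 + 51/60^2 + 10/60^3
approx = sum(Fraction(d, 60 ** i) for i, d in enumerate(digits))
error = abs(float(approx) - sqrt(2))

print(approx, float(approx), error)
```

The error comes out below one part in a million, which matches the slide's "about six decimal figures".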
![Page 83: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/83.jpg)
Two Hamming workflow variants: H1 vs. H3
![Page 84: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/84.jpg)
It's Quiz-Time again!
[Figure: two diagrams, each with the node labels X2, X3, X5, S2, S3, S5, Q1 to Q8, M1, M2]
Hamming Trace
Does it match Hamming Workflow H1?
… or Hamming Workflow H3 ??
![Page 85: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/85.jpg)
Hamming Traces – "Debugged"
[Figure: two trace graphs, each with nodes labeled by the Hamming numbers up to 1000 (1, 2, 3, 5, 4, 6, 10, 9, 15, 25, 8, 12, 20, ..., 972, 512, 768)]
Trace of H1 ("Fish") Trace of H3 ("Sail")
![Page 86: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/86.jpg)
Provenance & Privacy (ProPub: Provenance Publisher)
Saumen Dey, UC Davis
![Page 87: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/87.jpg)
Meet Prof. Nico Franz: Curator of Insects @ ASU
![Page 88: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/88.jpg)
From Tool Users to Tool Makers
Screen capture… back to the original definition
![Page 89: From Data to Knowledge with Workflows & Provenance](https://reader033.vdocuments.site/reader033/viewer/2022042715/558b5bd0d8b42a2d478b473a/html5/thumbnails/89.jpg)
Conclusion: Better Together
• Human & Machine Cycles
– Better information and workflow modeling (COMAD/VDAL)
– and more scalable execution (X-CSR, tagged dataflow, …)
• Theory & Practice
– Experimental theory (CS problems + ASP + Info Vis)
  • e.g. rediscovering Dedekind numbers via taxonomy debugging
    – D(N) = |monotone Boolean functions over N variables|
– Information Science & Software-Carpentry
  • Support tool makers!
• Big Data, Data Science, and all the rest!
– Excited to work at the intersection of GSLIS & NCSA & CS!
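The definition of D(N) can be checked for small N by brute force. The sketch below enumerates every Boolean function over N variables and counts the monotone ones; it is an illustration of the definition, not the taxonomy-debugging approach mentioned in the talk, and it is only practical up to about N = 4.

```python
from itertools import product


def dedekind(n):
    """Count monotone Boolean functions over n variables by enumeration."""
    points = list(product((0, 1), repeat=n))
    index = {p: i for i, p in enumerate(points)}

    # Covering pairs (lo, hi): hi is lo with a single 0 flipped to 1.
    # Monotonicity on these pairs implies monotonicity everywhere.
    covers = []
    for p in points:
        for k in range(n):
            if p[k] == 0:
                q = p[:k] + (1,) + p[k + 1:]
                covers.append((index[p], index[q]))

    count = 0
    # Each f is a truth table: one output bit per input point.
    for f in product((0, 1), repeat=len(points)):
        if all(f[lo] <= f[hi] for lo, hi in covers):
            count += 1
    return count


values = [dedekind(n) for n in range(5)]
print(values)
```

The number of truth tables doubles with every additional input point, so N = 5 already means 2^32 candidates; that growth is why smarter methods are needed beyond toy sizes.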