JupyterHub for Interactive Data Science Collaboration

Post on 19-Jan-2017

For Interactive

Data Science Collaboration

CineGrid December 10, 2015

HELLO

CAROL WILLING

➤ Python Software Foundation, Director

➤ Project Jupyter, Contributor

➤ Fab Lab San Diego, Geek in Residence

WRITER

MANAGER AND

ANALYST

ENGINEER

ARTIST

TEACHER

WONDER AND

CURIOSITY

PROJECT JUPYTER

Just the Facts

JUPYTER NOTEBOOK

The Notebook: “Literate Computing”

Computational Narratives

❖ Computers deal with code and data.

❖ Humans deal with narratives that communicate.

Literate Computing (not Literate Programming)

Narratives anchored in a live computation that communicate a story based on data and results.

Cf. Mathematica, Maple, MuPAD, Sage…
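As a rough illustration of a narrative anchored in a computation, the sketch below builds a minimal notebook document by hand: one Markdown cell carrying the narrative and one code cell carrying the computation. It assumes the version 4 notebook JSON layout; the helper name `make_notebook` and the sample cell contents are hypothetical and not taken from the slides.

```python
import json


def make_notebook(narrative, code):
    """Return a minimal notebook dict: one Markdown cell, one code cell.

    Assumes the nbformat 4 layout; make_notebook is a hypothetical helper.
    """
    return {
        "nbformat": 4,
        "nbformat_minor": 0,
        "metadata": {},
        "cells": [
            # The narrative: prose for human readers.
            {"cell_type": "markdown", "metadata": {}, "source": narrative},
            # The computation: code that has not been executed yet.
            {
                "cell_type": "code",
                "execution_count": None,
                "metadata": {},
                "outputs": [],
                "source": code,
            },
        ],
    }


notebook = make_notebook(
    "# Mean of a sample\nThe cell below averages three measurements.",
    "data = [1.0, 2.0, 3.0]\nsum(data) / len(data)",
)

# Serialized form; saved with an .ipynb extension it could be opened as a notebook.
notebook_json = json.dumps(notebook, indent=1)
```

Keeping prose and code as sibling cells in one document is what lets the story and the results travel together.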

“Project Jupyter serves not only the academic and scientific communities but also a much broader constituency of data scientists in research, education, industry and journalism…”

- Fernando Pérez, UC Berkeley

“…we see uses of our tools that range from high school education in programming to the nation’s supercomputing facilities and the leaders of the tech industry.”

- Fernando Pérez, UC Berkeley

“More than a million people are currently using Jupyter for everything from…”

- Prof. Brian Granger, Cal Poly

“…analyzing massive gene sequencing datasets to processing images from the Hubble Space Telescope and developing models of financial markets.”

- Prof. Brian Granger, Cal Poly

“We are excited by the potential of Project Jupyter to reach even wider audiences and to contribute to increased cross-disciplinary collaboration in the sciences.”

- Betsy Fader, Helmsley Charitable Trust

“Jupyter Notebook… will enable data exploration, visualization, and analysis in a way that encourages sound science and speeds progress.”

- Chris Mentzel, The Gordon and Betty Moore Foundation

DATA CHALLENGES Constraints or Opportunities?

SCALE

SPEED

CHOICES

CONNECTIONS

OPPORTUNITIES

Use our strengths

“The purpose of computing is insight, not numbers.”

- Hamming, '62

The Lifecycle of a Scientific Idea (schematically)

1. Individual exploratory work

2. Collaborative development

3. Parallel production runs (HPC, cloud, ...)

4. Publication & communication (reproducibly!)

5. Education

6. Goto 1.

JUPYTERHUB

and Project Jupyter ecosystem

EDUCATION

nbviewer: seamless notebook sharing

❖ Zero-install reading of notebooks

❖ Just share a URL

❖ nbviewer.ipython.org
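The "just share a URL" idea can be sketched as a pair of small helpers that turn a notebook's location into a viewer link. This is a sketch under stated assumptions: the host is the one named on the slide (the service may since have moved), the `/github/` and `/url/` or `/urls/` path layouts are assumed from nbviewer's URL scheme, and the function names, user, repository and file names are all hypothetical.

```python
from urllib.parse import urlsplit

# Host as named on the slide; the service may be reachable under a newer name.
NBVIEWER_HOST = "nbviewer.ipython.org"


def nbviewer_url_for_github(user, repo, path, ref="master"):
    """Build a viewer link for a notebook stored in a GitHub repository."""
    return f"https://{NBVIEWER_HOST}/github/{user}/{repo}/blob/{ref}/{path}"


def nbviewer_url_for_link(notebook_url):
    """Build a viewer link for a notebook hosted at an arbitrary web address."""
    parts = urlsplit(notebook_url)
    # Assumed convention: "urls" marks an https source, "url" an http one.
    prefix = "urls" if parts.scheme == "https" else "url"
    return f"https://{NBVIEWER_HOST}/{prefix}/{parts.netloc}{parts.path}"


# Hypothetical user, repository and file names.
github_link = nbviewer_url_for_github("example-user", "example-repo", "notebooks/demo.ipynb")
direct_link = nbviewer_url_for_link("https://example.org/nb/demo.ipynb")
```

The reader needs only a browser; nothing is installed or executed on their side.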

Executable books

❖ Springer hardcover book

❖ Chapters: IPython Notebooks

❖ Posted as a blog entry

❖ All available as a GitHub repo

Python for Signal Processing, by José Unpingco

University Courses

These are just some we are aware of!

A collaborative MOOC on OpenEdX

http://lorenabarba.com/news/announcing-practical-numerical-methods-with-python-mooc

❖ Lorena Barba at George Washington University, USA.

❖ Ian Hawke at Southampton, UK

❖ Carlos Jerez at Pontifical Catholic University of Chile.

❖ All materials on GitHub.

Changing the scientific culture

http://www.nature.com/news/interactive-notebooks-sharing-the-code-1.16261

Executable papers: the future?

http://www.nature.com/news/ipython-interactive-demo-7.21492?article=1.16261

Notebook Workflows: The Big Picture

Image credit: Joshua Barratt

Lots more! The IPython Gallery

https://github.com/ipython/ipython/wiki/A-gallery-of-interesting-IPython-Notebooks

GOVERNMENT

Shreyas Cholia & Oliver Ruebel

NERSC Data & Analytics Services Group

Jupyterhub Day, July 17 2015

Jupyterhub at NERSC and OpenMSI

NERSC is the Production HPC & Data Facility for DOE Office of Science Research

Bio Energy, Environment

Computing

Materials, Chemistry, Geophysics

Particle Physics, Astrophysics

Largest funder of physical science research in U.S.

Nuclear Physics

Fusion Energy, Plasma Physics

ART

BUSINESS

Quantopian: algorithmic trading

Karen Rubin

Dir. Product Management at Quantopian

Quantopian Research Post Fortune.com

Microsoft: Python Tools for Visual Studio

Shahrokh Mortazavi, Dino Viehland, Wenming Ye, Dennis Gannon.

Microsoft Azure: Notebooks in the Cloud

Google CoLaboratory

Kayur Patel, Kester Tong, Mark Sanders, Corinna Cortes @ Google

Matt Turk @ NCSA/UIUC

IBM Watson

SCIENCE

JupyterHub: multiuser support

❖ Out of the box

❖ Unix accounts

❖ Local single-user notebooks

❖ Customizable

❖ Authentication: OAuth, LDAP, etc.

❖ Subprocess control: Docker, VMs, etc.
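The customization points listed above could look roughly like the following `jupyterhub_config.py` fragment. It is a sketch, not a tested deployment: it assumes the separately installed `oauthenticator` and `dockerspawner` packages (neither is named on the slide), the environment variable names and the container image are illustrative choices, and a Docker-based setup typically needs further network settings that are omitted here. The file is loaded by the `jupyterhub` command rather than run directly, because `get_config()` is supplied by the configuration loader.

```python
# jupyterhub_config.py: loaded by the `jupyterhub` command, not run directly.
import os

c = get_config()  # noqa: F821  (injected by the configuration loader)

# Authentication: OAuth through GitHub (assumes the oauthenticator package).
c.JupyterHub.authenticator_class = "oauthenticator.GitHubOAuthenticator"
# Environment variable names below are illustrative choices.
c.GitHubOAuthenticator.oauth_callback_url = os.environ["OAUTH_CALLBACK_URL"]
c.GitHubOAuthenticator.client_id = os.environ["GITHUB_CLIENT_ID"]
c.GitHubOAuthenticator.client_secret = os.environ["GITHUB_CLIENT_SECRET"]

# Subprocess control: one Docker container per user (assumes dockerspawner).
c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"
# Illustrative image name; any image that starts a single-user server would do.
c.DockerSpawner.image = "jupyter/base-notebook"
```

Leaving both classes unset falls back to the out-of-the-box behaviour: login against the machine's Unix accounts and a local single-user notebook process per user.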

JupyterHub in Education @ Berkeley

https://developer.rackspace.com/blog/deploying-jupyterhub-for-education

❖ Computationally intensive course, ~220 students

❖ Fully hosted environment, zero-install

❖ Homework management and grading (with B. Granger)

Jess Hamrick @ Cal

K. Kelley @ Rackspace

M. Ragan-Kelley @ Cal

B. Granger @ Cal Poly

COLLABORATION

Why?

A ten year journey.

Optimism and hope for the future.

IMAGINE THE POSSIBILITIES

TRY.JUPYTER.ORG

WE’RE OPEN FOR YOU.

THANK YOU

try.jupyter.org

www.jupyter.org

numfocus.org ipython.org

CREDITS AND ATTRIBUTION

➤ Sources

➤ Jupyter website www.jupyter.org [11, 31, 65, 66, 69]

➤ Fernando Pérez [12, 28, 29, 33-40, 48-52, 53-55] http://fperez.org/ BIDS http://bids.berkeley.edu/

➤ Cal Poly and UC Berkeley Press Releases http://calpolynews.calpoly.edu/news_releases/2015/July/jupyter.html, http://bids.berkeley.edu/news/project-jupyter-gets-6m-expand-collaborative-data-science-software [14-19]

➤ Jupyterhub at NERSC and OpenMSI, S. Cholia and O. Ruebel, Jupyterhub Day presentation, July 17, 2015 [42, 43]

➤ music21 website http://web.mit.edu/music21/ [45]

➤ Jeremy Freeman http://jeremyfreeman.net/ PyData Talk NYC Winter 2015 https://github.com/freeman-lab/talk-nyc-winter-2015 [56, 57, 58]

➤ CodeNeuro website http://codeneuro.org/ [59-60]

➤ Binder website http://mybinder.org/ [61]

➤ Images

➤ [2, 10, 21, 27, 30, 62, 64] Galaxy

➤ [23] Hummingbird https://flic.kr/p/mo5pa1

➤ [25] Netflix Prize Christopher Hefele https://flic.kr/p/6LWT6K

➤ [3-7, 8 (artwork FabLab interns), 9, 20, 22, 24, 26, 42, 43, 46, 57, 63] Carol Willing. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

➤ For additional information

➤ Jupyter www.jupyter.org

➤ Python Software Foundation www.python.org

➤ Carol Willing, willingc@willingconsulting.com, @willingcarol, GitHub: willingc
