Keiichiro Ono, UCSD Trey Ideker Lab, Cytoscape Core Team
Lab Meeting Aug 4, 2015
Building Reproducible Network Data Analysis / Visualization Workflows
Problems We are Trying to Solve
- Complex software stack for data analysis
  - Setting up an environment for data analysis is not trivial, and it is time-consuming
  - Python 3.x or 2.x/NumPy/SciPy/Cython Modules
  - R/Bioconductor/packages
  - OS version, etc.
- Automation
  - Point-and-Click operations are not reproducible
  - Applying different layouts to 100 networks by hand is possible, but ridiculous
  - Sharing Recipes (= common workflows) is hard
- Integration with external computing resources
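To make the automation point concrete, here is a minimal sketch of applying one layout to every loaded network through cyREST instead of by hand. It assumes Cytoscape is running with cyREST at its default address (`http://localhost:1234/v1`) and that the `networks` and `apply/layouts/{algorithm}/{networkId}` resources behave as in the cyREST documentation. The helper names (`fetch_json`, `layout_url`, `apply_layout_to_all`) are hypothetical and not part of any library.

```python
import json
from typing import Callable, List
from urllib.request import urlopen

# Assumed default cyREST address; adjust if Cytoscape listens elsewhere.
BASE_URL = "http://localhost:1234/v1"


def fetch_json(url: str):
    """Send a GET request and decode the JSON response body."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def layout_url(base_url: str, algorithm: str, network_suid: int) -> str:
    """Build the URL that asks cyREST to lay out one network."""
    return f"{base_url}/apply/layouts/{algorithm}/{network_suid}"


def apply_layout_to_all(
    algorithm: str,
    base_url: str = BASE_URL,
    get_json: Callable[[str], object] = fetch_json,
) -> List[int]:
    """Apply the same layout to every network in the current session.

    Returns the list of network SUIDs that were processed.
    """
    suids = list(get_json(f"{base_url}/networks"))
    for suid in suids:
        get_json(layout_url(base_url, algorithm, suid))
    return suids
```

With Cytoscape running, `apply_layout_to_all("circular")` would replace 100 rounds of point-and-click with one call that can be saved in a notebook and rerun. The `get_json` parameter exists only so the loop can be exercised without a live Cytoscape instance.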
Goal: Reproducible, Scalable Dry Experiments
- Docker - Data analysis environment in a portable container
- GitHub - Source code sharing
- Jupyter Notebook - Your electronic lab notebook
- cyREST - RESTful API module for Cytoscape
Data Preparation, Analysis, Visualization
Scenario 1: Everything on your Workstation
Notebook Server
Your Jupyter Notebook
Scenario 2: Workstation + Cloud
Notebook Server
Your Jupyter Notebook
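One way to keep a single notebook usable in both scenarios is to read the cyREST address from the environment rather than hard-coding it. The sketch below assumes the notebook reaches Cytoscape over HTTP on cyREST's default port 1234. The variable names `CYREST_HOST` and `CYREST_PORT` and the function `cyrest_base_url` are hypothetical, and the transcript does not say which machine hosts which component in Scenario 2.

```python
import os
from typing import Mapping, Optional


def cyrest_base_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the cyREST base URL, defaulting to a local Cytoscape instance.

    CYREST_HOST and CYREST_PORT are illustrative variable names,
    not settings defined by Cytoscape or cyREST.
    """
    env = os.environ if env is None else env
    host = env.get("CYREST_HOST", "localhost")
    port = int(env.get("CYREST_PORT", "1234"))
    return f"http://{host}:{port}/v1"
```

In Scenario 1 nothing needs to be set and the function returns `http://localhost:1234/v1`. In Scenario 2 the notebook server would be started with `CYREST_HOST` pointing at the machine that runs Cytoscape.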
Example: Community-Detection + Edge-Weighted Layout
Source Code: bit.ly/1P4LUFU
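The linked notebook is not reproduced in this transcript, so the following is only a sketch of one common way to combine the two steps, not the author's code. It assumes community membership has already been computed by some community-detection algorithm, then gives edges inside a community a larger weight than edges between communities so that an edge-weighted layout can pull community members together. The function name `community_edge_weights` and the weight values are illustrative assumptions.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

Edge = Tuple[str, str]


def community_edge_weights(
    edges: Iterable[Edge],
    membership: Dict[str, Hashable],
    within: float = 10.0,
    between: float = 1.0,
) -> List[dict]:
    """Assign a weight to each edge based on community membership.

    Edges whose endpoints share a community get `within`,
    all other edges get `between`. Both values are arbitrary defaults.
    """
    rows = []
    for source, target in edges:
        same_community = membership[source] == membership[target]
        rows.append(
            {
                "source": source,
                "target": target,
                "weight": within if same_community else between,
            }
        )
    return rows


# Toy input: two communities joined by a single edge (c-d).
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
membership = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1}
rows = community_edge_weights(edges, membership)
```

The resulting `weight` values could then be stored as an edge column in Cytoscape and selected as the weight attribute of a layout that honors edge weights.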
Demo
TODO
- Integration to Cyberinfrastructure (CI)
- R Wrapper
  - https://github.com/tmuetze/Bioconductor_RCy3_the_new_RCytoscape
- More realistic workflows / pipelines
Resources
- cyREST
  - http://apps.cytoscape.org/apps/cyrest
- py2cytoscape
  - https://pypi.python.org/pypi/py2cytoscape
- RCy3
  - https://github.com/tmuetze/Bioconductor_RCy3_the_new_RCytoscape