Genomics on the Half Shell: Making Science more Open

Steven B. Roberts, Associate Professor, School of Aquatic and Fishery Sciences, University of Washington (robertslab.info)

Upload: sr320

Post on 06-May-2015

Category: Education

DESCRIPTION

Abstract

Technology has significantly changed how research is done in biology. Along with this shift, it is increasingly easy and advantageous to operate in an open science framework. In this presentation I will begin by providing an overview of our research efforts, with particular attention to challenges in data analysis. Research in our lab focuses on characterizing physiological responses of shellfish to environmental change, examining impacts and adaptive potential from the nucleotide to the organism level. A core component of this is investigating the functional relationship of genetics, epigenetics, and transcription. In our research we leverage several computing infrastructure solutions that I will describe. In addition, our lab practices Open Notebook Science. I will describe the practical aspects of how we accomplish this, including some of the concerns and realized advantages. Beyond online lab notebooks, we are continually experimenting with different ways to use online resources to engage with a larger audience and improve science communication. I have found this is a complex balance of time and effort versus impact, and I will discuss how our lab group attempts to reach this balance.

Bio

Steven Roberts is an Associate Professor in the School of Aquatic and Fishery Sciences, where his research centers on characterizing the response of aquatic organisms to environmental change. Prior to coming to the University of Washington, in 2007 he was at the Marine Biological Laboratory in Woods Hole, Massachusetts; he received his PhD from the University of Notre Dame. In graduate school he spent most of his time transferring agarose gels, and now he spends most of his time transferring files.

TRANSCRIPT

Page 1: Genomics on the Half Shell: Making Science more Open

Genomics on the Half Shell: Making Science more Open

Steven B. Roberts, Associate Professor

School of Aquatic and Fishery Sciences, University of Washington

robertslab.info

Page 2: Genomics on the Half Shell: Making Science more Open

Open Science

• You are free to Share!

• Our lab practices open notebook science

• Slides and more available @ oystergen.es/data

Page 3: Genomics on the Half Shell: Making Science more Open

Biology

Environment

Molecular

Data Analysis

eScience

iPlant Galaxy

Notebooks

Rationale

Platforms

Open Science

Data

everything else...

Page 4: Genomics on the Half Shell: Making Science more Open

Page 5: Genomics on the Half Shell: Making Science more Open
Page 6: Genomics on the Half Shell: Making Science more Open
Page 7: Genomics on the Half Shell: Making Science more Open
Page 8: Genomics on the Half Shell: Making Science more Open

disease resistance

Page 9: Genomics on the Half Shell: Making Science more Open

disease resistance

Transcriptome, Proteome, DNA Methylation

Page 10: Genomics on the Half Shell: Making Science more Open

disease resistance

Elevated pCO2 causes developmental delay in early larval Pacific oysters, Crassostrea gigas. Timmins-Schiffman et al. 2012

Ocean Acidification

Page 11: Genomics on the Half Shell: Making Science more Open

disease resistance

Ocean Acidification

Shotgun Proteomics

10.1093/conphys/cot009

Page 12: Genomics on the Half Shell: Making Science more Open

disease resistance

Ocean Acidification

Shotgun Proteomics

eagle.fish.washington.edu/emma

Page 13: Genomics on the Half Shell: Making Science more Open

disease resistance

Transcriptome, Proteome, DNA Methylation

Page 14: Genomics on the Half Shell: Making Science more Open

Function?

Page 15: Genomics on the Half Shell: Making Science more Open

mosaic

associated with gene bodies

Photo credit: Flickr, Creative Commons, dkeats

Page 16: Genomics on the Half Shell: Making Science more Open

HiSeq lane: 70G, mapping: 60G

table

Page 17: Genomics on the Half Shell: Making Science more Open

Stochastic Variation

10.1093/bfgp/elt054

10.6084/m9.figshare.880763

Page 18: Genomics on the Half Shell: Making Science more Open

Page 19: Genomics on the Half Shell: Making Science more Open
Page 20: Genomics on the Half Shell: Making Science more Open

raw: 70G, mapping: 60G, tables: 40G ........

Page 21: Genomics on the Half Shell: Making Science more Open

Genome: Primary Data Table Groupings (figure labels)

Genomic Data Types (dynamic)
• Gene Expression: Expressed Sequence Tags, RNA-Sequencing, Expression Microarrays
• Genetic Variation: Single Nucleotide Polymorphisms, Simple Sequence Repeats, Amplified Fragment Length Polymorphisms
• Epigenetic Features: DNA Methylation, Histone Modification, miRNA Expression

Data Tables (static)
• Gene Annotations, Sequence Motifs, Gene Ontologies, Pathways, Orthologs, Transposable Elements, Interactions, Other species genomes, CpG statistics, Structural Elements, Transcription Factor Binding Sites, Publications

Other labels: transcripts; Size, Growth, Location, Environment, Stage, Treatment, Tissue, Trait, Strain

Page 22: Genomics on the Half Shell: Making Science more Open

Phenotype (figure labels)

Genetics
• Amplified Fragment Length Polymorphisms
• Simple Sequence Repeats
• Single Nucleotide Polymorphisms

Epigenetics
• miRNA Expression
• Histone Modifications
• DNA Methylation Patterns

Environment: Temperature, Diet

Gene Expression

Phenotype: Increased Growth Rate, Tissue Quality, Disease Resistance, Appearance, Fecundity, Yield

Page 23: Genomics on the Half Shell: Making Science more Open

Page 24: Genomics on the Half Shell: Making Science more Open
Page 25: Genomics on the Half Shell: Making Science more Open

Page 26: Genomics on the Half Shell: Making Science more Open
Page 27: Genomics on the Half Shell: Making Science more Open

Use Cases

• Joining on Annotations

• File Conversion

• Querying Gene Tables
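The "Joining on Annotations" and "Querying Gene Tables" use cases can be sketched as a plain SQL join. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than the platform shown in the talk, and the table names, column names, and gene identifiers are all hypothetical.

```python
import sqlite3


def join_on_annotations(expression_rows, annotation_rows):
    """Left-join a gene expression table to an annotation table on gene_id.

    Table and column names here are hypothetical, chosen for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE expression (gene_id TEXT, fold_change REAL)")
    conn.execute("CREATE TABLE annotation (gene_id TEXT, description TEXT)")
    conn.executemany("INSERT INTO expression VALUES (?, ?)", expression_rows)
    conn.executemany("INSERT INTO annotation VALUES (?, ?)", annotation_rows)
    # LEFT JOIN keeps genes that have no annotation (description is NULL/None).
    rows = conn.execute(
        "SELECT e.gene_id, e.fold_change, a.description "
        "FROM expression AS e "
        "LEFT JOIN annotation AS a ON e.gene_id = a.gene_id "
        "ORDER BY e.gene_id"
    ).fetchall()
    conn.close()
    return rows


# Made-up example data, not results from the talk.
expression = [("gene_001", 2.5), ("gene_002", -1.2), ("gene_003", 0.4)]
annotation = [
    ("gene_001", "heat shock protein"),
    ("gene_003", "DNA methyltransferase"),
]
joined = join_on_annotations(expression, annotation)
for row in joined:
    print(row)
```

A left join is used so that unannotated genes stay in the result, which is usually what is wanted when annotating a gene table.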

Page 28: Genomics on the Half Shell: Making Science more Open

Page 29: Genomics on the Half Shell: Making Science more Open
Page 30: Genomics on the Half Shell: Making Science more Open
Page 31: Genomics on the Half Shell: Making Science more Open

Page 32: Genomics on the Half Shell: Making Science more Open

Page 33: Genomics on the Half Shell: Making Science more Open

github.com/sr320/qdod/wiki

Page 34: Genomics on the Half Shell: Making Science more Open
Page 35: Genomics on the Half Shell: Making Science more Open
Page 36: Genomics on the Half Shell: Making Science more Open

Page 37: Genomics on the Half Shell: Making Science more Open

Page 38: Genomics on the Half Shell: Making Science more Open
Page 39: Genomics on the Half Shell: Making Science more Open
Page 40: Genomics on the Half Shell: Making Science more Open
Page 41: Genomics on the Half Shell: Making Science more Open
Page 42: Genomics on the Half Shell: Making Science more Open
Page 43: Genomics on the Half Shell: Making Science more Open
Page 44: Genomics on the Half Shell: Making Science more Open
Page 45: Genomics on the Half Shell: Making Science more Open
Page 46: Genomics on the Half Shell: Making Science more Open

eagle.fish.washington.edu

Page 47: Genomics on the Half Shell: Making Science more Open

The Evolution of My Lab Notebook

Page 48: Genomics on the Half Shell: Making Science more Open

Open Notebook Science

... there is a URL to a laboratory notebook that is freely available and indexed on common search engines. It does not necessarily have to look like a paper notebook but it is essential that all of the information available to the researchers to make their conclusions is equally available to the rest of the world.

—Jean-Claude Bradley

Page 49: Genomics on the Half Shell: Making Science more Open

Open Notebook Science

Page 50: Genomics on the Half Shell: Making Science more Open

Open Notebook Science

Page 51: Genomics on the Half Shell: Making Science more Open

Open Notebook Science

Page 52: Genomics on the Half Shell: Making Science more Open

Open Notebook Science

carlboettiger.info/lab-notebook

Page 53: Genomics on the Half Shell: Making Science more Open

Open Notebook Science

genefish.wikispaces.com

Page 54: Genomics on the Half Shell: Making Science more Open

Open Notebook Science

genefish.wikispaces.com

Page 55: Genomics on the Half Shell: Making Science more Open

Open Notebook Science

evernote.com/pub/che625/che625snotebook

Page 56: Genomics on the Half Shell: Making Science more Open

Open Notebook Science

Page 57: Genomics on the Half Shell: Making Science more Open

Open Notebook Science

Set some variables

blast

convert file format

upload to SQLShare (python client)

join in SQLShare - download

read in pandas

matplotlib generates graph of GOslim
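The notebook steps listed on this slide can be sketched end to end in a few lines. This is a minimal sketch under stated assumptions, not the notebook from the talk: the standard library stands in for SQLShare, pandas, and matplotlib, the BLAST result is assumed to be in the default 12-column tabular format (-outfmt 6), and the contig names, accessions, and GO slim terms are hypothetical.

```python
import csv
import io
from collections import Counter

# Default column order of BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(blast_text):
    """Return {query: subject}, keeping the highest-bitscore hit per query."""
    best = {}
    reader = csv.DictReader(
        io.StringIO(blast_text), fieldnames=BLAST_COLUMNS, delimiter="\t"
    )
    for row in reader:
        score = float(row["bitscore"])
        query = row["qseqid"]
        if query not in best or score > best[query][1]:
            best[query] = (row["sseqid"], score)
    return {query: subject for query, (subject, _) in best.items()}


def goslim_counts(hits, goslim_by_subject):
    """Join hits to GO slim terms (the 'join' step) and count queries per term."""
    counts = Counter()
    for subject in hits.values():
        term = goslim_by_subject.get(subject)
        if term is not None:
            counts[term] += 1
    return counts


# Made-up BLAST rows and a made-up accession-to-GO-slim mapping.
blast_text = (
    "contig_1\tP12345\t98.0\t100\t2\t0\t1\t100\t1\t100\t1e-50\t200\n"
    "contig_1\tQ99999\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-20\t90\n"
    "contig_2\tQ99999\t91.0\t100\t9\t0\t1\t100\t1\t100\t1e-40\t150\n"
    "contig_3\tP12345\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-45\t180\n"
)
goslim = {"P12345": "stress response", "Q99999": "transport"}

hits = best_hits(blast_text)
counts = goslim_counts(hits, goslim)
for term, n in counts.most_common():
    print(term, n)
```

The resulting per-term counts are the kind of table that would then be passed to a plotting library to draw the GO slim graph.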

Page 58: Genomics on the Half Shell: Making Science more Open

Open Notebook Science

Page 59: Genomics on the Half Shell: Making Science more Open

Open Notebook Science

Page 60: Genomics on the Half Shell: Making Science more Open

Open Notebook Science

a very new experiment

Page 61: Genomics on the Half Shell: Making Science more Open

Open Notebook Science

a very new experiment

sr320.info

Page 62: Genomics on the Half Shell: Making Science more Open

Open Notebook Science

a very new experiment

sr320.info

Page 63: Genomics on the Half Shell: Making Science more Open

Open Science

Page 64: Genomics on the Half Shell: Making Science more Open

Open Science

web-native scholarship

Page 65: Genomics on the Half Shell: Making Science more Open

Photo credit: Flickr, Creative Commons, speechless

Sharing

Page 66: Genomics on the Half Shell: Making Science more Open

Example

Page 67: Genomics on the Half Shell: Making Science more Open

Example

Page 68: Genomics on the Half Shell: Making Science more Open

Example

Page 69: Genomics on the Half Shell: Making Science more Open

http://ivory.idyll.org/blog/

Page 70: Genomics on the Half Shell: Making Science more Open

http://ivory.idyll.org/blog/

Page 71: Genomics on the Half Shell: Making Science more Open
Page 72: Genomics on the Half Shell: Making Science more Open
Page 73: Genomics on the Half Shell: Making Science more Open

robertslab.info

Page 74: Genomics on the Half Shell: Making Science more Open

Open Science Philosophy

Transparency with limited effort

Page 75: Genomics on the Half Shell: Making Science more Open

Open Science Philosophy

Transparency with limited effort

will try just about anything

Page 76: Genomics on the Half Shell: Making Science more Open
Page 77: Genomics on the Half Shell: Making Science more Open

computationalproteomic.blogspot.com

Yasset Perez-Riverol on Wednesday, February 19, 2014

Page 78: Genomics on the Half Shell: Making Science more Open

Start them early

Page 79: Genomics on the Half Shell: Making Science more Open
Page 80: Genomics on the Half Shell: Making Science more Open

Acknowledgements

Emma Timmins-Schiffman

Mackenzie Gavery

Claire Olson

Sam White

Brent Vadopalas

Jake Heare

Bill Howe

Dan Halperin

EPA STAR

Aquaculture Program

Saltonstall-Kennedy

acidification

DNA methylation

oystergen.es/data

Page 81: Genomics on the Half Shell: Making Science more Open