genomics on the half shell: making science more open

Post on 06-May-2015

Category:

Education

DESCRIPTION

Abstract: Technology has significantly changed how research is done in biology. Along with this shift, it is increasingly easy and advantageous to operate in an open science framework. In this presentation I will begin by providing an overview of our research efforts, with particular attention to challenges in data analysis. Research in our lab focuses on characterizing physiological responses of shellfish to environmental change, examining impacts and adaptive potential from the nucleotide to the organism level. A core component of this is investigating the functional relationship of genetics, epigenetics, and transcription. In our research we leverage several computing infrastructure solutions that I will describe. In addition, our lab practices Open Notebook Science. I will describe the practical aspects of how we accomplish this, including some of the concerns and realized advantages. Beyond online lab notebooks, we are continually experimenting with different ways to use online resources to engage with a larger audience and improve science communication. I have found that this is a complex balance of time and effort versus impact, and I will discuss how our lab group attempts to reach this balance.

Bio: Steven Roberts is an Associate Professor in the School of Aquatic and Fishery Sciences, where his research centers on characterizing the response of aquatic organisms to environmental change. Prior to coming to the University of Washington in 2007, he was at the Marine Biological Laboratory in Woods Hole, Massachusetts, and he received his PhD from the University of Notre Dame. In graduate school he spent most of his time transferring agarose gels, and now he spends most of his time transferring files.

TRANSCRIPT

Genomics on the Half Shell: Making Science more Open

Steven B. Roberts, Associate Professor

School of Aquatic and Fishery Sciences, University of Washington

robertslab.info

Open Science

• You are free to Share!

• Our lab practices open notebook science

• Slides and more available @ oystergen.es/data

Biology

Environment

Molecular

Data Analysis

eScience

iPlant Galaxy

Notebooks

Rationale

Platforms

Open Science

Data

everything else...

disease resistance

Transcriptome, Proteome, DNA Methylation

Elevated pCO2 causes developmental delay in early larval Pacific oysters, Crassostrea gigas. Timmins-Schiffman et al. 2012

Ocean Acidification

Shotgun Proteomics

10.1093/conphys/cot009

eagle.fish.washington.edu/emma

Function?

mosaic

associated with gene bodies

Photo credit: Flickr, Creative Commons, dkeats

HiSeq lane: 70G, mapping: 60G

table

Stochastic Variation

10.1093/bfgp/elt054

10.6084/m9.figshare.880763

raw: 70G, mapping: 60G, tables: 40G, ...

Figure: Primary Data Table Groupings

Genome

dynamic (Genomic Data Types): Gene Expression, Genetic Variation, Epigenetic Features

Expressed Sequence Tags, transcripts, RNA-Sequencing, Expression Microarrays, miRNA Expression, Single Nucleotide Polymorphisms, Simple Sequence Repeats, Amplified Fragment Length Polymorphisms, DNA Methylation, Histone Modification

static (Data Tables): Gene Annotations, Sequence Motifs, Gene Ontologies, Pathways, Orthologs, Transposable Elements, Interactions, Other species genomes, CpG statistics, Structural Elements, Transcription Factors Binding Sites, Publications

Size, Growth, Location, Environment, Stage, Treatment, Tissue, Trait, Strain

Figure: Phenotype

Phenotype: Increased Growth Rate, Tissue Quality, Disease Resistance, Appearance, Fecundity, Yield

Genetics: Amplified Fragment Length Polymorphisms, Simple Sequence Repeats, Single Nucleotide Polymorphisms

Epigenetics: miRNA Expression, Histone Modifications, DNA Methylation Patterns

Environment: Temperature, Diet

Gene Expression

Use Cases

• Joining on Annotations

• File Conversion

• Querying Gene Tables
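As a rough illustration of the first and third use cases, the sketch below joins a per-gene measurement table to an annotation table and then filters the result. It is not the lab's actual workflow: the talk mentions joining in SQLShare, whereas this uses Python's built-in sqlite3 as a stand-in, and every table name, column name, gene ID, and value here is hypothetical.

```python
import sqlite3

# Hypothetical example rows; gene IDs, descriptions, and values are invented.
ANNOTATIONS = [
    ("gene_001", "heat shock protein 70"),
    ("gene_002", "DNA methyltransferase 1"),
    ("gene_003", "uncharacterized protein"),
]
METHYLATION = [
    ("gene_001", 0.12),
    ("gene_002", 0.81),
    ("gene_004", 0.45),  # no annotation available for this gene
]


def join_on_annotations(annotations, measurements):
    """Left-join per-gene measurements to annotations, ordered by gene ID."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE annotation (gene_id TEXT PRIMARY KEY, description TEXT)"
        )
        conn.execute("CREATE TABLE methylation (gene_id TEXT, ratio REAL)")
        conn.executemany("INSERT INTO annotation VALUES (?, ?)", annotations)
        conn.executemany("INSERT INTO methylation VALUES (?, ?)", measurements)
        rows = conn.execute(
            """
            SELECT m.gene_id, m.ratio, a.description
            FROM methylation AS m
            LEFT JOIN annotation AS a ON a.gene_id = m.gene_id
            ORDER BY m.gene_id
            """
        ).fetchall()
    finally:
        conn.close()
    return rows


def query_gene_table(rows, min_ratio):
    """Return the joined rows whose measurement is at least min_ratio."""
    return [row for row in rows if row[1] >= min_ratio]


if __name__ == "__main__":
    joined = join_on_annotations(ANNOTATIONS, METHYLATION)
    for gene_id, ratio, description in query_gene_table(joined, 0.4):
        print(gene_id, ratio, description or "(no annotation)")
```

A left join is used so that genes without an annotation stay in the result with an empty description instead of silently dropping out.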

github.com/sr320/qdod/wiki

eagle.fish.washington.edu

The Evolution of My Lab Notebook

Open Notebook Science

... there is a URL to a laboratory notebook that is freely available and indexed on common search engines. It does not necessarily have to look like a paper notebook but it is essential that all of the information available to the researchers to make their conclusions is equally available to the rest of the world.

—Jean-Claude Bradley

carlboettiger.info/lab-notebook

genefish.wikispaces.com

evernote.com/pub/che625/che625snotebook

Set some variables

blast

convert file format

upload to SQLShare (python client)

join in SQLShare - download

read in pandas

matplotlib generates graph of GOslim
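Two of the notebook steps above, converting the file format and summarizing GO slim terms, can be sketched as follows. This is a minimal sketch under stated assumptions, not the notebook's actual code: it assumes the BLAST output is in the default 12-column tabular layout, it uses only the Python standard library instead of SQLShare, pandas, and matplotlib, and the hits and the GO slim lookup table are invented for illustration.

```python
import csv
import io
from collections import Counter

# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]

# Invented hits and GO slim lookup, for illustration only.
EXAMPLE_HITS = (
    "contig_1\tsubject_A\t98.5\t200\t3\t0\t1\t200\t5\t204\t1e-50\t350\n"
    "contig_2\tsubject_B\t91.0\t150\t13\t0\t1\t150\t10\t159\t1e-30\t210\n"
    "contig_3\tsubject_A\t88.0\t120\t14\t0\t1\t120\t7\t126\t1e-20\t150\n"
    "contig_4\tsubject_C\t85.0\t100\t15\t0\t1\t100\t2\t101\t1e-10\t90\n"
)
EXAMPLE_GOSLIM = {"subject_A": "stress response", "subject_B": "transport"}


def blast_tab_to_csv(tab_text):
    """Convert tab-delimited BLAST hits to CSV text with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(BLAST_COLUMNS)
    for line in tab_text.splitlines():
        if line.strip():  # skip blank lines
            writer.writerow(line.split("\t"))
    return out.getvalue()


def count_goslim(csv_text, goslim_by_subject):
    """Count GO slim terms over the hits, skipping subjects without a term."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        term = goslim_by_subject.get(row["sseqid"])
        if term is not None:
            counts[term] += 1
    return counts


if __name__ == "__main__":
    csv_text = blast_tab_to_csv(EXAMPLE_HITS)
    for term, n in count_goslim(csv_text, EXAMPLE_GOSLIM).most_common():
        print(term, n)
```

The per-term counts are the kind of table that could then be read into pandas and plotted with matplotlib, as the last two steps describe.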

a very new experiment

sr320.info

Open Science

web-native scholarship

Photo credit: Flickr, Creative Commons, speechless

Sharing

Example

http://ivory.idyll.org/blog/

robertslab.info

Open Science Philosophy

Transparency with limited effort

will try just about anything

computationalproteomic.blogspot.com

Yasset Perez-Riverol on Wednesday, February 19, 2014

Start them early

Acknowledgements

Emma Timmins-Schiffman

Mackenzie Gavery

Claire Olson

Sam White

Brent Vadopalas

Jake Heare

Bill Howe

Dan Halperin

EPA STAR

Aquaculture Program

Saltonstall-Kennedy

acidification

DNA methylation

oystergen.es/data
