Page 1: Future of Research Communication 2011

The Future of Research Communication

Judith Blake

The Jackson Laboratory

Page 2: Future of Research Communication 2011

Rocky ISBC

My Perspective

Future open access to digital data will speed discovery

Data generation is getting easier, data analysis is getting harder, we are drowning in data

Key to scientific discourse is the ability to reproduce and verify results - currently difficult for computational results that do not include code and data upon publication

Current ‘academic’ publication and rewards systems are inadequate for measuring scientific contributions

Digital data repositories, open access publications, electronic journals, and semantic web enhancements will all contribute to the future success of science communication

12/8/11

Page 3: Future of Research Communication 2011

Outline

1. Improving knowledge communication
- Vision: What are the communication functionalities needed?
- Technology: What are the tools for doing this?

2. Impacting our world
- Social Aspects: How do we quantify impact of use/reward system?
- Coolness: How do we make it attractive to do/use?

3. Overcoming obstacles
- Financial Considerations: How do we make it sustainable?
- Getting the ball rolling: How do we start?

Page 4: Future of Research Communication 2011

Elsevier, Wiley, ISI, (Highwire), PLoS, 14 Universities, European Commission, UK Royal Society

“Where Computer Scientists Meet”

Page 5: Future of Research Communication 2011

The Reasoned Argument

Page 6: Future of Research Communication 2011

Roxy Laybourne and others, photo by Chip Clark

Managing Biological Information is Nothing New

Bird Collections at the Smithsonian Natural History Museum

Page 7: Future of Research Communication 2011

TCTCTCCCCCGCCCCCCAGGCTCCCCCGGTCGCTCTCCTCCGGCGGTCGCCCGCGCTCGGTGGATGTGGC

TGGCAGCTGCCGCCCCCTCCCTCGCTCGCCGCCTGCTCTTCCTCGGCCCTCCGCCTCCTCCCCTCCTCCT

TCTCGTCTTCAGCCGCTCCTCTCGCCGCCGCCTCCACAGCCTGGGCCTCGCCGCGATGCCGGAGAAGAGG

CCCTTCGAGCGGCTGCCTGCCGATGTCTCCCCCATCAACTACAGCCTTTGCCTCAAGCCCGACTTGCTGG

ACTTCACCTTCGAGGGCAAGCTGGAGGCCGCCGCCCAGGTGAGGCAGGCGACTAATCAGATTGTGATGAA

TTGTGCTGATATTGATATTATTACAGCTTCATATGCACCAGAAGGAGATGAAGAAATACATGCTACAGGA

TTTAACTATCAGAATGAAGATGAAAAAGTCACCTTGTCTTTCCCTAGTACTCTGCAAACAGGTACGGGAA

CCTTAAAGATAGATTTTGTTGGAGAGCTGAATGACAAAATGAAAGGTTTCTATAGAAGTAAATATACTAC

CCCTTCTGGAGAGGTGCGCTATGCTGCTGTAACACAGTTTGAGGCTACTGATGCCCGAAGGGCTTTTCCT

TGCTGGGATGAGCCTGCTATCAAAGCAACTTTTGATATCTCATTGGTTGTTCCTAAAGACAGAGTAGCTT

TATCAAACATGAATGTAATTGACCGGAAACCATACCCTGATGATGAAAATTTAGTGGAAGTGAAGTTTGC

CCGCACACCTGTTATGTCTACATATCTGGTGGCATTTGTTGTGGGTGAATATGACTTTGTAGAAACAAGG

TCAAAAGATGGTGTGTGTGTCCGTGTTTACACTCCTGTTGGCAAAGCAGAGCAAGGAAAATTTGCGTTAG

AGGTTGCTGCTAAAACCTTGCCTTTTTATAAGGACTACTTCAATGTTCCTTATCCTCTACCTAAAATTGA

TCTCATTGCTATTGCAGACTTTGCAGCTGGTGCCATGGAGAACTGGGGCCTTGTTACTTATAGGGAGACT

GCATTGCTTATTGATCCAAAAAATTCCTGTTCTTCATCCCGCCAGTGGGTTGCTCTGGTTGTGGGACATG

AACTCGCCCATCAATGGTTTGGAAATCTTGTTACTATGGAATGGTGGACTCATCTTTGGTTAAATGAAGG

TTTTGCATCCTGGATTGAATATCTGTGTGTAGACCACTGCTTCCCAGAGTATGATATTTGGACTCAGTTT

GTTTCTGCTGATTACACCCGTGCCCAGGAGCTTGACGCCTTAGATAACAGCCATCCTATTGAAGTCAGTG

TGGGCCATCCATCTGAGGTTGATGAGATATTTGATGCTATATCATATAGCAAAGGTGCATCTGTCATCCG

AATGCTGCATGACTACATTGGGGATAAGGACTTTAAGAAAGGAATGAACATGTATTTAACCAAGTTCCAA

CAAAAGAATGCTGCCACAGAGGATCTCTGGGAAAGTTTAGAAAATGCTAGTGGTAAACCTATAGCAGCTG

GTTTCTGCTGATTACACCCGTGCCCAGGAGCTTGACGCCTTAGATAACAGCCATCCTATTGAAGTCAGTG

TGGGCCATCCATCTGAGGTTGATGAGATATTTGATGCTATATCATATAGCAAAGGTGCATCTGTCATCCG

AATGCTGCATGACTACATTGGGGATAAGGACTTTAAGAAAGGAATGAACATGTATTTAACCAAGTTCCAA

CAAAAGAATGCTGCCACAGAGGATCTCTGGGAAAGTTTAGAAAATGCTAGTGGTAAACCTATAGCAGCTG

From the birth of the field of genetics until a decade ago, it was generally assumed that the parental origin of a gene could have no effect on its function. In the vast majority of studies carried out during the last 90 years, this paradigm has appeared to hold true. However, with increasingly sophisticated genetic and embryological investigations in the mouse, important exceptions to this rule have been uncovered over the last decade. First, the results of nuclear transplantation experiments carried out with single-cell fertilized embryos have demonstrated an absolute requirement for both a maternally-derived and a paternally-derived pronucleus to allow full-term development (McGrath and Solter, 1983). Second, in animals that receive both homologs of certain chromosomes or subchromosomal regions from one parent and not the other (through the mating of translocation heterozygotes as described in Section 5.2.3), dramatic effects on development can be observed including enhanced or retarded growth and outright lethality (Cattanach and Kirk, 1985). Third, either of two deletions that cover a small region of mouse chromosome 17 can be transmitted normally from a father to his offspring, but these same deletions cause prenatal lethality when they are maternally transmitted (Johnson, 1974; Winking and Silver, 1984). Fourth, similar parent-of-origin effects have been observed on the phenotypes expressed by animals that carry a targeted knock-out allele at the Igf2 locus (DeChiara et al., 1991). Finally, molecular techniques have been used to directly demonstrate the expression of transcripts from one parental allele and not the other at the Igf2r locus (Barlow et al., 1991) and the H19 locus (Bartolomei et al., 1991).
The accumulated data indicate that a subset of mouse genes (on the order of 0.2%) will function differently in normal embryos depending on whether they have been inherited through the male or the female gamete, such that one allele will be expressed and the other will be silent. Genomic imprinting is the term that has been coined to describe this situation in which the phenotype expressed by a gene varies depending on its parental origin (Sapienza, 1989). Further experiments have demonstrated that, in general, the "imprint" is erased and regenerated during gametogenesis so that the function of an imprintable gene is fully determined by the sex of its progenitor alone, and not by earlier ancestors.

“The trouble with facts is that there are so many of them.”

Samuel Crothers, The Gentle Reader (1903)

Page 8: Future of Research Communication 2011

Manual (mostly) curation of the biomedical literature

Page 9: Future of Research Communication 2011

Curators use controlled terms from structured vocabularies (ontologies) to annotate complex biological systems described in the literature

The knowledge is in the details
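
As a minimal sketch of what annotating with controlled terms can look like, the Python below uses a toy vocabulary. The term identifiers (`EX:...`), the `is_a` links, the gene symbol `geneA`, and the reference label are all hypothetical placeholders, not entries from any real ontology or database.

```python
from dataclasses import dataclass

# Toy controlled vocabulary: term id -> (label, parent ids via "is_a").
# All identifiers and labels are hypothetical placeholders.
ONTOLOGY = {
    "EX:0001": ("biological process", []),
    "EX:0002": ("growth", ["EX:0001"]),
    "EX:0003": ("embryonic growth", ["EX:0002"]),
}


@dataclass(frozen=True)
class Annotation:
    gene: str       # gene symbol being annotated
    term_id: str    # controlled term from the vocabulary
    reference: str  # the paper that supports the statement


def annotate(gene, term_id, reference):
    """Create an annotation, rejecting terms outside the vocabulary."""
    if term_id not in ONTOLOGY:
        raise ValueError(f"unknown term: {term_id}")
    return Annotation(gene, term_id, reference)


def ancestors(term_id):
    """Return the term plus every more general term reachable via is_a."""
    seen = set()
    stack = [term_id]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(ONTOLOGY[current][1])
    return seen


def genes_for(term_id, annotations):
    """Find genes annotated to a term or to any more specific term."""
    return sorted({a.gene for a in annotations if term_id in ancestors(a.term_id)})


annotations = [annotate("geneA", "EX:0003", "PMID:placeholder")]
print(genes_for("EX:0002", annotations))
```

Because the annotation is made to the most specific term, a query on the broader term still finds the gene; a free-text label that is not in the vocabulary is rejected outright.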

Page 10: Future of Research Communication 2011

Crash Blossoms and other semantic ambiguities

translating what we say into what we mean: data, words and knowledge

“Violinist Linked to JAL Crash Blossoms”

“MacArthur Flies Back to Front”

“Squad Helps Dog Bite Victim”

“Red Tape Holds Up New Bridge.”

Page 11: Future of Research Communication 2011

Today there are many biomedical ontologies…

Open Biomedical Ontologies

http://www.obofoundry.org/

The icons link to the term request trackers for the listed ontologies.

Page 12: Future of Research Communication 2011

Something very important and very weird is happening to the book right now: It’s shedding its papery corpus and transmigrating into a bodiless digital form, right before our eyes. We’re witnessing the bibliographical equivalent of the rapture. If anything we may be lowballing the weirdness of it all. Lev Grossman, NYTimes Book Review Sept 4, 2011

Page 13: Future of Research Communication 2011

Semantic Web

Semantic Web Layer Cake (Berners-Lee, 2000)

Page 14: Future of Research Communication 2011

…..into the Future

The current back-propagation of semantic integration onto the biomedical literature using ontologies does not scale to the level of granularity and context needed

A key element of data integration is the mark-up of data at the time of generation

Reasoned Argument communication includes providing methods and data to enable reproducibility, and requires open access to the semantically enriched discussion, machine-readable metadata, accessible datasets, peer review discussions, and the possibility of testing for reproducibility
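
One way to picture mark-up at the time of generation is a small machine-readable record written alongside the data. The sketch below is an illustration only: the field names, the sample data, and the `describe_dataset` and `verify` helpers are assumptions made for this example, not an established metadata standard.

```python
import hashlib
import json
from datetime import datetime, timezone


def describe_dataset(data: bytes, title: str, method: str, license_name: str) -> dict:
    """Build a metadata record for a dataset at the moment it is generated.

    The field names are illustrative assumptions, not a standard schema.
    """
    return {
        "title": title,
        "method": method,
        "license": license_name,
        "created": datetime.now(timezone.utc).isoformat(),
        "size_bytes": len(data),
        # The checksum lets a later reader confirm they hold the same bytes.
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def verify(data: bytes, record: dict) -> bool:
    """Check that the data still matches the checksum in its record."""
    return hashlib.sha256(data).hexdigest() == record["sha256"]


# Hypothetical sample data, standing in for real measurements.
raw = b"sample_id,value\ns1,0.42\n"
record = describe_dataset(raw, "example measurements", "hypothetical assay", "CC0")
print(json.dumps(record, indent=2))
print(verify(raw, record))
```

A record like this travels with the dataset, so anyone trying to reproduce a result can first confirm that they are working from the same input.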

Page 15: Future of Research Communication 2011

1 - Improving Knowledge Communication

What are the functionalities needed?

detail methods, both wet and dry

provide data with appropriate metadata

support interactive results, i.e., tables and figures

track metrics of utility, usage, and impact

Technology

electronic lab books that are easy and functional

data collection and repositories that provide standards and persistence

new models for data interconnections

real-time metrics available

Page 16: Future of Research Communication 2011

What’s Changing - 2

Peer Review

Journal Impact (the Myth of Impact Factors)

Supplementary Material (i.e., the DATA) missing and/or incomplete

Metrics of Impact of Research

Persistence of Data

Page 17: Future of Research Communication 2011

Peer Review – bad review 1

hard to engage expert reviews for multi-component research

The Scientist

Page 18: Future of Research Communication 2011

Slide from Carol Bult

Large computational analyses require multiple coordinated reviews

Page 19: Future of Research Communication 2011

Peer Review – bad review 2

The verity of a computational analysis depends on its data input

Faculty of 1000

Ascertainment bias refers to a systematic distortion in measuring the true frequency of a phenomenon due to the way in which the data are collected.
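
A small worked example can make the definition concrete. The numbers below are invented for illustration (a population of 1000 with 100 carrying a trait, and assumed sampling rates); the point is only that unequal sampling rates shift the observed frequency away from the true one.

```python
from fractions import Fraction


def observed_frequency(n_with, n_without, p_sample_with, p_sample_without):
    """Expected trait frequency in a sample when the two groups
    are collected at different rates."""
    sampled_with = n_with * p_sample_with
    sampled_without = n_without * p_sample_without
    return sampled_with / (sampled_with + sampled_without)


# Hypothetical population: 100 of 1000 individuals carry the trait.
true_freq = Fraction(100, 1000)

# Biased collection: carriers are assumed five times more likely to be sampled.
biased = observed_frequency(100, 900, Fraction(1, 2), Fraction(1, 10))

# Unbiased collection: both groups are sampled at the same rate.
unbiased = observed_frequency(100, 900, Fraction(1, 10), Fraction(1, 10))

print(float(true_freq), float(biased), float(unbiased))
```

With equal sampling rates the observed frequency matches the true 10%; with the biased rates it rises to 5/14, about 36%, even though nothing about the population has changed.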

Page 20: Future of Research Communication 2011

We need to enable reproducibility

Page 21: Future of Research Communication 2011

Page 22: Future of Research Communication 2011

Research data is simply not available

50 highest impact research journals

1st 10 original research articles of 2009

88% of journals had ‘some’ statement on sharing of data

50% of articles did not meet journal standards

9% of articles in full compliance

algorithms and meta-data required for reproducibility not required by any journal

Alsheikh-Ali et al., PLoS One 2011;6(9):e24357. Epub 2011 Sep 7.

Page 23: Future of Research Communication 2011

Changing Incentives to Publish - 1

Page 24: Future of Research Communication 2011

Changing Incentives to Publish - 2

Nature Medicine 16, 744 (2010) doi:10.1038/nm0710-744

Page 25: Future of Research Communication 2011

2 - Impacting Our World

Open Access to data within supportive environment accelerates knowledge discovery

Social Aspects: How do we quantify impact of use/reward system?

Coolness: How do we make it attractive to do/ use?

Page 26: Future of Research Communication 2011

Sharing Data – Crowd Sourcing

Page 27: Future of Research Communication 2011

Interactive Communication

Imbedded links to data for figures

Immediate access to referenced material

Interactive community

Blogs and commentary

Analytics of impact

Page 28: Future of Research Communication 2011

Initiatives to Access Data

uvm biomedical figuresearch

Page 29: Future of Research Communication 2011

Interactive Publications

Jonathan Eisen's Blogspot – 9/6/11

Page 30: Future of Research Communication 2011

Metrics of Paper Impact -1

Page 31: Future of Research Communication 2011

3 – Overcoming Obstacles

New Publication Models

Data Preservation –

Page 32: Future of Research Communication 2011

Publish your data – datadryad.org

Page 33: Future of Research Communication 2011

Key Points

Communication of research methods, data and results is changing.

Access to research is limited outside northern hemisphere - institutional access missing – thus limiting the globalization of science.

Utility of research depends on inter-connections and access to data and results; cloud-sourcing of science imminent.

The reward mechanism (tenure /cash) via publication record is changing, but slowly for most of us.

The business model of scientific publishing is under great stress.

Government investment mechanisms for support and sharing of science endeavors are under intense discussion.

New research communication mechanisms are coming; some are already here.

Page 34: Future of Research Communication 2011

Summary

We must continue to support the Reasoned Argument

- in context of massive amounts of digital data, this includes comprehensive data access to maximize data integration and enable reproducibility

The necessary upheaval in scientific communication requires both technological and sociological innovations

YOU can be part of this sea change in research communication

Page 35: Future of Research Communication 2011

Gene Ontology Consortium

Mouse Genome Informatics

FoRCe Workshop

Page 36: Future of Research Communication 2011

Acknowledgements

MGI PIs: Carol Bult, Janan Eppig, Jim Kadin, Joel Richardson, Martin Ringwald

GO Consortium PIs and Council: Michael Ashburner, Mike Cherry, Suzanna Lewis, Paul Thomas, Paul Sternberg; Rolf Apweiler, Rex Chisholm, Eva Huala

GO @ MGI: Alex Diehl, Mary Dolan, David Hill, Li Ni, Harold Drabkin, Dmitry Sitnikov

Funding: NIH-NHGRI P-41 grants to MGI and GOC; GM080646 to PRO
