Society for Biocuration panel discussion, April 2013

Biocuration and scholarly communication cycle: roles and opportunities for biocurators

Panel Discussion

Theo Bloom, Editorial Director for Biology, PLOS

Hinxton, April 2013

Page 1:

Biocuration and scholarly communication cycle: roles and opportunities for biocurators

Panel Discussion

Theo Bloom, Editorial Director for Biology, PLOS

Hinxton, April 2013

Page 2:

Take-home / talking points / provocation

• The needs and motivations of authors and ‘users’ of the literature differ

• Some studies output structured data well

• Many/most studies don’t, and here’s our biggest problem

• We need to move towards universal solutions and away from bespoke ones

• In the meantime there is a lot of help needed

Page 3:

What authors want

Publication credit

Kudos - “good journal”

First / fast

Easy

Compliant

Page 4:

What readers/users want

Reusability

Thorough

Complete

Replicable

Compliant

Page 5:

Growth in the cost of traditional publishing

Page 6:

• PLOS Biology: October 2003

• PLOS Medicine: October 2004

• PLOS Community Journals: June-September 2005; October 2007

• PLOS ONE: December 2006

Page 7:

Growth of PLOS journals and of Open Access

[Chart, 2000 to 2012. Series: PLOS; Open access total (secondary axis). Axis scales: 0 to 30,000 and 0 to 120,000.]

Page 8:

PLOS ONE’s Key Innovation: the editorial process

Editorial criteria
• Scientifically rigorous
• Ethical
• Properly reported
• Conclusions supported by the data

Editors and reviewers do not ask
• How important is the work?
• Which is the relevant audience?

Everything that deserves to be published will be published

Page 9:

Two types of study generating data

Type 1: the structured data is the output
• e.g. DNA sequence, protein structure, clinical trial results
• Often large-scale / high-throughput
• Curators and databases support these really well, even small-scale studies

Type 2: the “paper” is the output
• No structured database exists / no widespread agreement on standards

Page 10:

Provocation: the solutions we’re all proposing don’t deal with the main problems

• Adding more steps and checks to publication makes it slow (unpopular with authors)

• Assuming an expert editor handling each article makes it slow and expensive

• Some authors are moving towards preprints, blog-style publications, and definitely away from traditional journals (e.g. PLOS ONE)

• We need to fix the problems at the time the studies are done

Page 11:

Do some science

Write a description

Store some of the data somewhere…

Page 12:

Do some science

Write a narrative description that is inextricably linked to the data and methods

Integrated collection of methods, results, data, metadata

Store all of the data somewhere useful and link to publication

Page 13:

Where should data go?

• Curated, subject-specific, open access, long-term databases (GenBank, ArrayExpress)

• General non-specific repositories: Dryad, figshare, institutional (bigger is better? Can we have a ‘kite-marked’ list?)

• Supplementary files with the article (heterogeneous, poorly formatted, hard to collate/mine)

• NOT: the author’s website or file drawer

Page 14:

Steps towards better data handling - 1

Partnership with Dryad (www.datadryad.org)
• Unstructured data ‘packages’ associated with published articles
• Freely available - CC0
• A unique identifier (DOI) for each package
• Statistics for access
• Seamless tying together of article and data

Partnership with figshare (www.figshare.org)
• figshare widget displays Supporting Information files directly in the article
• search, magnify, download singly or as a package

What to do with ‘homeless’ data?

Page 15:

Steps towards better data handling - 2

Planning in hand for ‘data papers’
• Describes reusable dataset to support reuse
• Publishes associated metadata
• Structured data cross-referenced to its “natural home” (e.g., protein structures to PDB)
• Unstructured data in PLOS Dataverse instance
• Ensures valuable data actionable for reuse
  • actionable formats
  • curated to reasonable standard
  • accessible in a recognized, stable repository
• Inherently reusable data
  • Valid experimental / observational design
  • Good quality control, ethical experiments
  • Data perceived to have “standalone” value

Page 16:

Publish your Big Paper

Send it to Science Exchange to reproduce

Independent scientists attempt to reproduce the study

Success! Science Exchange issues a certificate of validation, which is posted on the paper

Reproduction is published in PLOS ONE and data is stored at figshare

Hopefully publish in PLOS ONE although this is not required

Failure! Authors think long and hard about what they’ve done

Reproducibility Initiative

Page 17:

PLOS + partnerships:

Page 18:

Take-home / talking points / provocation

• The needs and motivations of authors and ‘users’ of the literature differ

• Some studies output structured data well

• Many/most studies don’t, and here’s our biggest problem

• We need to move towards universal solutions and away from bespoke ones

• In the meantime there is a lot of help needed

Page 19:

[email protected]

Open Access