Open Notebook Science: Does Transparency Work? April 6, 2011 HUBzero Conference Jean-Claude Bradley Department of Chemistry Drexel University

Upload: jean-claude-bradley

Post on 18-Dec-2014

DESCRIPTION

Jean-Claude Bradley presents "Open Notebook Science: Does Transparency Work?" at the HUBzero conference on April 6, 2011.

TRANSCRIPT

Page 1: Open Notebook Science HUBzero 2011

Open Notebook Science: Does Transparency Work?

April 6, 2011

HUBzero Conference

Jean-Claude Bradley

Department of Chemistry, Drexel University

Page 2: Open Notebook Science HUBzero 2011

The current state of transparency in scientific communication

Case study of melting point data

Page 3: Open Notebook Science HUBzero 2011

The Chemical Information Validation Sheet

567 curated and referenced measurements from Fall 2010 Chemical Information Retrieval course

Page 4: Open Notebook Science HUBzero 2011

The Chemical Information Validation Explorer

(Andrew Lang)

Page 5: Open Notebook Science HUBzero 2011

Discovering outliers for melting points (stdev/average)
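The stdev/average screen named on this slide can be sketched in a few lines. This is a minimal illustration, not the Validation Explorer's actual code: the function name `flag_outliers`, the 0.05 threshold, and the sample values are all assumptions made for the example. Temperatures are taken in kelvin here because the ratio is unstable when the average sits near zero, as it can in degrees Celsius.

```python
from statistics import mean, stdev


def flag_outliers(measurements, threshold=0.05):
    """Return {compound: ratio} where stdev/|average| exceeds the threshold.

    `measurements` maps a compound name to a list of reported melting points.
    The threshold is an arbitrary illustrative choice, not a value from the talk.
    """
    flagged = {}
    for compound, values in measurements.items():
        if len(values) < 2:
            continue  # spread is undefined for a single measurement
        avg = mean(values)
        if avg == 0:
            continue  # ratio is undefined when the average is zero
        ratio = stdev(values) / abs(avg)
        if ratio > threshold:
            flagged[compound] = ratio
    return flagged


# Hypothetical melting points in kelvin, for illustration only
sample = {
    "compound_a": [353.0, 354.0, 353.5],  # sources agree closely
    "compound_b": [420.0, 491.0, 422.0],  # one source disagrees strongly
    "compound_c": [300.0],                # single report, cannot be screened
}
flagged = flag_outliers(sample)
```

A compound that is flagged this way is only a candidate for follow-up: the ratio says the sources disagree, not which source is wrong.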

Page 6: Open Notebook Science HUBzero 2011

Investigating the m.p. inconsistencies of EGCG

Page 7: Open Notebook Science HUBzero 2011

Investigating the m.p. inconsistencies of cyclohexanone

Page 8: Open Notebook Science HUBzero 2011

Sigma-Aldrich, Acros and Wolfram Alpha apparently use the same sources for melting points

Page 9: Open Notebook Science HUBzero 2011

Sigma-Aldrich, Acros and Wolfram Alpha apparently use the same sources for boiling points

Page 10: Open Notebook Science HUBzero 2011

Sigma-Aldrich, Acros and Wolfram Alpha apparently DO NOT use the same sources for flash points

Page 11: Open Notebook Science HUBzero 2011

Most popular data sources

Page 12: Open Notebook Science HUBzero 2011

Alfa Aesar donates melting points to the public

Page 13: Open Notebook Science HUBzero 2011

Open Melting Point Explorer

Page 14: Open Notebook Science HUBzero 2011

Outliers

MDPI dataset

EPI (via ChemSpider)

Page 15: Open Notebook Science HUBzero 2011

Outliers

Alfa Aesar

Page 16: Open Notebook Science HUBzero 2011

Inconsistencies and SMILES problems within MDPI dataset

Page 17: Open Notebook Science HUBzero 2011

MDPI Dataset labeled with High Trust Level

Page 18: Open Notebook Science HUBzero 2011

Open Melting Point Datasets

Page 19: Open Notebook Science HUBzero 2011

Open Random Forest modeling of Open Melting Point data using CDK descriptors

(Andrew Lang)

R² = 0.78; TPSA and nHdon most important
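For readers unfamiliar with the metric, the coefficient of determination can be computed as below. This is a generic sketch: the slide does not say how its R² was obtained (for example, on a held-out set or from out-of-bag predictions), and the function name and sample numbers are hypothetical.

```python
def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    mean_measured = sum(measured) / len(measured)
    # Residual sum of squares: error left over after the model's predictions
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    # Total sum of squares: spread of the measurements around their mean
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1 - ss_res / ss_tot


# Hypothetical melting points (arbitrary units), for illustration only
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
score = r_squared(measured, predicted)
```

A value of 0.78 would mean the model accounts for roughly 78% of the variance in the measured melting points under this definition.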

Page 20: Open Notebook Science HUBzero 2011

Melting point prediction service

Page 21: Open Notebook Science HUBzero 2011

Using melting point for temperature dependent solubility prediction

Page 22: Open Notebook Science HUBzero 2011

Motivation: Faster Science, Better Science

Page 23: Open Notebook Science HUBzero 2011

There are NO FACTS, only measurements embedded within assumptions

Open Notebook Science maintains the integrity of data provenance by making assumptions explicit

Page 24: Open Notebook Science HUBzero 2011

TRUST

PROOF

Page 25: Open Notebook Science HUBzero 2011

Strategy for an Open Notebook:

First record, then abstract structure

To be discoverable, use Google-friendly formats (simple HTML, no login)

To be replicable, use free hosted tools (Wikispaces, Google Spreadsheets)

Page 26: Open Notebook Science HUBzero 2011

Crowdsourcing Solubility Data

Page 27: Open Notebook Science HUBzero 2011

Data provenance: From Wikipedia to…

Page 28: Open Notebook Science HUBzero 2011

…the lab notebook and raw data

Page 29: Open Notebook Science HUBzero 2011

Calculations Made Public on Google Spreadsheets

Page 30: Open Notebook Science HUBzero 2011

Interactive NMR spectra using JSpecView and JCAMP-DX

Page 31: Open Notebook Science HUBzero 2011

Raw Data As Images

Splatter?

Some liquid

Page 32: Open Notebook Science HUBzero 2011

YouTube for demonstrating experimental set-up

Page 33: Open Notebook Science HUBzero 2011

The importance of raw data availability

Missed in a prior publication on solubility for this compound

Page 34: Open Notebook Science HUBzero 2011

Solubilities collected in a Google Spreadsheet

Page 35: Open Notebook Science HUBzero 2011

Rajarshi Guha’s Live Web Query using Google Viz API

Page 36: Open Notebook Science HUBzero 2011

Web services for summary data

(Andrew Lang)

Page 37: Open Notebook Science HUBzero 2011

Web service calls from within a Google Spreadsheet for solubility measurement and prediction

(Andrew Lang)

Page 38: Open Notebook Science HUBzero 2011

Integration of Multiple Web Services to Recommend Solvents for Reactions

(Andrew Lang)

Page 39: Open Notebook Science HUBzero 2011
Page 40: Open Notebook Science HUBzero 2011
Page 41: Open Notebook Science HUBzero 2011
Page 42: Open Notebook Science HUBzero 2011

Reaction Attempts Book

Page 43: Open Notebook Science HUBzero 2011

Reaction Attempts Book: Reactants listed Alphabetically

Page 44: Open Notebook Science HUBzero 2011

ONS Challenge Solubility Book cited for nanotechnology application

Page 45: Open Notebook Science HUBzero 2011

Lulu.com Data Disks

Page 46: Open Notebook Science HUBzero 2011

Visualizing molecule-researcher connection maps reveals link between 2 Open Notebooks (Todd and Bradley)

(Don Pellegrino)

Page 47: Open Notebook Science HUBzero 2011

All ONS web services

Page 48: Open Notebook Science HUBzero 2011

For all Formats of ONS Projects

Page 49: Open Notebook Science HUBzero 2011

Conclusions

• Our current system of publication is not as transparent as it could be

• Open Notebook Science offers an efficient way to make research transparent and discoverable