Towards a Data Model for the Australian Microbial Resources Information Network (AMRiN) Lynette Woodburn Atlas of Living Australia Version: 0.03 17/09/2010


Page 1: Towards a  Data  Model  for  the  Australian Microbial Resources Information Network (AMRiN)

Towards a Data Model

for the

Australian Microbial Resources Information Network

(AMRiN)

Lynette Woodburn

Atlas of Living Australia

Version: 0.03

17/09/2010

Page 2:

TIP

Each slide in this presentation comes with accompanying Notes.

You can’t see them if you display this presentation in ‘Slide Show’ mode.

If you’d like to see the Notes

• view the presentation in ‘Normal’ mode, and

• expand the pane below the slide (the Notes pane) to see extra text.

Only then will you have a chance of understanding all the crazy diagrams.

Page 3:

Towards a data model for AMRiN

Requirement

a standard set of data fields for all micro-organisms

. to support the sharing and integration of data through AMRiN

. to pre-configure BioloMICS

Options

. choose an existing set

. develop something new

Recommendation

. surprise!

Page 4:

1. Requirements

2. Options

3. Recommendation

Page 5:

AMRiN

AMRiN community

Page 6:

AMRiN

AMRiN community

Page 7:

AMRiN

AMRiN community

Page 8:

1. Requirements

2. Options

- existing

CABRI

MCL

3. Recommendation

Page 9:

Common Access to Biological Resources and Information (CABRI)

http://www.cabri.org/

a European organization of partner collections

who contribute data to searchable ‘catalogues’ covering

• bacteria & archaea

• fungi & yeasts

• animal & human cell lines

• plant cell lines

• hybridomas

• phages

• plasmids

• plant cell viruses

• genomic libraries

Page 10:

CABRI’s sets of data elements

elements per set

• bacteria & archaea: 26

• fungi & yeasts: 23

• animal cell lines: 29

• plant cell lines: 17

• hybridomas: 15

• phages: 33

• plasmids: 30

• plant cell viruses: 12

• genomic libraries: 7

Original_host_plant

Doubling_time

Lysogenicity

Isolated_from

Morphology

Page 11:

Common Access to Biological Resources and Information (CABRI)

For each different kind of biological resource,

CABRI defines nested sets of data elements

Mandatory Recommended Full

Page 12:

CABRI : bacteria & archaea

Mandatory: Strain_number, Other_collection_numbers, Restrictions, Organism_type, Name, Infrasubspecific_names, Status, History, Conditions_for_growth, Form_of_supply

Recommended: Serovar, Other_names, Isolated_from, Geographic_origin, Mutant, Genotype, Literature

Full: Sexual_state, Pathogenicity, Enzyme_production, Metabolite_production, Applications, Catalogue_entry, Remarks, Price_code, Plasmids

Page 13:

CABRI : fungi & yeasts

Mandatory: Strain_number, Other_collection_numbers, Name, Status, Organism_type, History, Restrictions, Form_of_supply, Conditions_for_growth

Recommended: Misapplied_names, Race, Substrate, Geographic_origin, Literature, Applications, Mutant, Sexual_state

Full: Price_code, Remarks, Pathogenicity, Metabolite_production, Enzyme_production, Genotype
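The two element sets above can be compared directly. The following is a minimal sketch, using only the element names listed on the bacteria & archaea and fungi & yeasts slides; the variable names are illustrative and not part of CABRI. Note that "only in" here means only within this pair of sets, not across all nine CABRI sets.

```python
# CABRI element names as listed on the two slides above.
bacteria_archaea = {
    "Strain_number", "Other_collection_numbers", "Restrictions", "Organism_type",
    "Name", "Infrasubspecific_names", "Status", "History", "Conditions_for_growth",
    "Form_of_supply", "Serovar", "Other_names", "Isolated_from", "Geographic_origin",
    "Mutant", "Genotype", "Literature", "Sexual_state", "Pathogenicity",
    "Enzyme_production", "Metabolite_production", "Applications", "Catalogue_entry",
    "Remarks", "Price_code", "Plasmids",
}
fungi_yeasts = {
    "Strain_number", "Other_collection_numbers", "Name", "Status", "Organism_type",
    "History", "Restrictions", "Form_of_supply", "Conditions_for_growth",
    "Misapplied_names", "Race", "Substrate", "Geographic_origin", "Literature",
    "Applications", "Mutant", "Sexual_state", "Price_code", "Remarks",
    "Pathogenicity", "Metabolite_production", "Enzyme_production", "Genotype",
}

# Elements used by both sets, and elements used by just one of the pair.
shared = bacteria_archaea & fungi_yeasts
only_bacteria_archaea = bacteria_archaea - fungi_yeasts
only_fungi_yeasts = fungi_yeasts - bacteria_archaea

print(len(shared), "shared elements")
print("only in bacteria & archaea:", sorted(only_bacteria_archaea))
print("only in fungi & yeasts:", sorted(only_fungi_yeasts))
```

These two sets overlap heavily, which is not typical of CABRI as a whole.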

Page 14:

CABRI : animal & human cell lines

Mandatory: Accession_number, Cell_line_name, Brief_description, Description, Depositor, Bibliographic_references, Morphology, Culture_conditions, Viruses, Properties, Release_conditions, Hazard, Passage_number

Recommended: Species_validation

Full: Tumorigenicity, Karyology, Freezing_medium, Sterility, Validation_assays, Further_bibliography, Comments, Storage, Doubling_time, Mycoplasma, Fingerprint, Cytogenetics, Karyotype, Comments, Research_council_deposit, BIOMED_1

Page 15:

CABRI’s sets of data elements

• bacteria & archaea: 26

• fungi & yeasts: 23

• animal cell lines: 29

• plant cell lines: 17

• hybridomas: 15

• phages: 33

• plasmids: 30

• plant cell viruses: 12

• genomic libraries: 7

Total: 192

Page 16:

Sharing data about one kind of biological resource is easy

eg. phages

Page 17:

eg. plasmids

Sharing data about one kind of biological resource is easy

Page 18:

Sharing data about multiple kinds of biological resources is hard

Other_culture_collection_numbers

Other_collection_numbers

Page 19:

What is the prospect of deriving a common model from CABRI

for describing several different kinds of biological resources ?

133 distinct data elements …

… distributed across 9 sets

bacteria & archaea

fungi & yeasts

animal cell lines

plant cell lines

hybridomas

phages

plasmids

plant cell viruses

genomic libraries

Page 20:

CABRI as a common model ?

each of 92 elements is found in only one set

only 41 elements are found in more than one set

Page 21:

CABRI as a common model ?

27 data elements are found in two sets

10 ….. in three

4 ….. in four

No elements are found in more than 4 sets
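The counts quoted so far are mutually consistent, which is worth checking before drawing conclusions from them. A minimal sketch of that arithmetic, using only figures from the slides (the variable names are illustrative):

```python
# Number of distinct CABRI elements found in exactly k sets, from the slides.
elements_by_set_count = {1: 92, 2: 27, 3: 10, 4: 4}

# Distinct elements overall, and those found in more than one set.
distinct_elements = sum(elements_by_set_count.values())
shared_elements = distinct_elements - elements_by_set_count[1]

# An element found in k sets contributes k entries to the per-set totals.
total_entries = sum(k * n for k, n in elements_by_set_count.items())

print(distinct_elements, shared_elements, total_entries)
```

The three results match the 133 distinct elements, the 41 elements found in more than one set, and the 192 entries obtained by summing the nine per-set counts.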

Page 22:

Distribution of data elements across CABRI sets

Count of data elements found in one set only

• bacteria & archaea: 6

• fungi & yeasts: 3

• animal cell lines: 22

• hybridomas: 7

• phages: 14

• plant cell lines: 12

• plant cell viruses: 9

• plasmids: 13

• genomic libraries: 6

Counts of data elements found in two, three or four sets: 11 4 12 2 1 2 2 1 1 1 3 1

Page 23:

CABRI data element ‘themes’

• bacteria & archaea

• fungi & yeasts

• animal cell lines

• plant cell lines

• hybridomas

• phages

• plasmids

• plant cell viruses

• genomic libraries

ID of item in collection

Name / classification of item

item admin

handling & distribution regulations

care / maintenance

characteristics

literature

….origin

Page 24:

CABRI : comparison of elements across sets

• different names, same meaning (definition)

Accession_number, Strain_number

History, History_of_deposit

Bibliographic_references, Reference_paper, Literature, Reference, Further_bibliography

Restricted_distribution, Release_conditions, Restrictions, Distribution

Morphology, Morphology_and_growth

….
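One way to handle elements with different names but the same meaning is a lookup from each name to a single agreed label. The sketch below uses the five groups listed on this slide; taking the first name in each group as the agreed label is purely an assumption for illustration, not something CABRI defines.

```python
# Groups of CABRI element names that share a meaning, as listed on the slide.
synonym_groups = [
    ["Accession_number", "Strain_number"],
    ["History", "History_of_deposit"],
    ["Bibliographic_references", "Reference_paper", "Literature",
     "Reference", "Further_bibliography"],
    ["Restricted_distribution", "Release_conditions", "Restrictions", "Distribution"],
    ["Morphology", "Morphology_and_growth"],
]

# Assumption: the first name in each group stands in as the agreed label.
canonical_name = {name: group[0] for group in synonym_groups for name in group}


def canonical(element_name):
    """Return the agreed label for an element, or the name itself if ungrouped."""
    return canonical_name.get(element_name, element_name)


print(canonical("Literature"))
print(canonical("Serovar"))
```

A lookup like this only resolves the naming problem. It does nothing for elements that share a name but differ in meaning.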

Page 25:

CABRI : comparison of elements across sets

• same name, different meanings

Type

phages: type of element (phage, transposon, minitransposon, IS element, …)

plasmids: type of element (plasmid, phasmid, cosmid, shuttle vector, transposon, minitransposon, IS element, …)

genomic libraries: type of library (PAC, BAC, YAC, PI, cDNA, …)

Brief_description

hybridomas: listing of species, strain, antibody specificity

animal cell lines: listing of species, strain, tissue, tumour, pathology, transformed/transfected

Page 26:

CABRI : comparison of data element sets

• varying levels of scope

Conditions_for_growth (bacteria & archaea, fungi & yeasts) covers:

. culture medium

. atmospheric and light conditions

. temperature conditions

. additional remarks on cultivation

Medium (plasmids, phages)

Medium_1 (plant cell lines)

Light_regime (plant cell lines)

Light_conditions (plant cell lines)

Temperature (plant cell lines)

Humidity (plant cell lines)

Page 27:

CABRI : fitness for our purpose

• 9 sets of data elements (but does not cover algae)

. good for sharing information about one kind of organism

• few elements common to several sets

. hard to share information about more than one kind of organism

• does not lend itself to the derivation of a common set

. elements of ‘different names, same meaning’

. elements of ‘same name, different meanings’

. elements with meanings of varying scope

• has international acceptance / presence (but no longer funded?)

Page 28:

1. Requirements

2. Options

- existing

CABRI

MCL

3. Recommendation

Page 29:

Microbiological Common Language (MCL)

• a new data exchange standard for microbiological information

Research in Microbiology, 161(6), 439-445

http://www.straininfo.net/projects/mcl

• a pluggable framework, easily extended

• has the same ancestor as CABRI (MINE)

• underpins StrainInfo (www.straininfo.net)

“ a world-wide, virtual catalog integrating the information from BRC [Biological Resource Centres] catalogs with related information”

Page 30:

CABRI compared with MCL

CABRI: partitioned by kind of biological resource

MCL: partitioned by workflow step

Page 31:

The abstract model of Microbiological Common Language (MCL)

… follows the logical flow from sampling to subsequent deposits

Diagram labels: Sample, Isolation, Culture, Deposit, Medium, Publication, Strain

Page 32:

mcl : Sample

Sample

sampleDate

sampleCultureStrainNumber

sampleCollector, sampleCollectorInstitute

comments

sampleDescription, sampleLocationDescription

sampleLocationCountry, sampleLocationPlace

sampleAlt, sampleLat, sampleLong

sampleHabitatEnvoTerm, sampleHabitat

sampleCulture

Page 33:

mcl : Culture

Culture

id

cultureLastUpdateDate

strainNumber

[otherStrainNumbers]: otherStrainNumber

catalogURL

speciesName

history, isolationDate, isolator, isolatorInstitute, isolationMethod

typeStrainOf, typeStrainOfSpecies, typeStrainOfGenus

comments

[growthTemperature]: minimalGrowthTemperature, optimalGrowthTemperature, maximalGrowthTemperature

oxygenRelationship

publication, nomenclaturalPublication, environmentPublication, historyPublication, taxonomicPublication

hasSample, recommendMedium

Page 34:

some Object Properties

Culture

hasSample → Sample

recommendMedium → Medium

publication, nomenclaturalPublication, environmentPublication, historyPublication, taxonomicPublication → Publication
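Object Properties link one entity to another, which is what holds the MCL model together. The following is a rough sketch of that idea with Python dataclasses. The field names are taken from the MCL slides, but the choice of fields, the cardinalities (one Sample, lists of Medium and Publication) and all example values are assumptions, and this is not MCL's own serialization.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sample:
    sampleDate: Optional[str] = None
    sampleLocationCountry: Optional[str] = None
    sampleHabitat: Optional[str] = None


@dataclass
class Medium:
    mediumName: str
    mediumNumber: Optional[str] = None


@dataclass
class Publication:
    title: str  # corresponds to dc:title on the Publication slide


@dataclass
class Culture:
    strainNumber: str
    speciesName: Optional[str] = None
    # Object Properties: references to other entities rather than plain values.
    hasSample: Optional[Sample] = None
    recommendMedium: List[Medium] = field(default_factory=list)
    publication: List[Publication] = field(default_factory=list)


# Hypothetical example record; none of these values come from a real catalogue.
culture = Culture(
    strainNumber="EXAMPLE 0001",
    speciesName="Escherichia coli",
    hasSample=Sample(sampleDate="2010-09-17", sampleLocationCountry="Australia"),
    recommendMedium=[Medium(mediumName="Nutrient agar")],
)

print(culture.hasSample.sampleLocationCountry)
print([m.mediumName for m in culture.recommendMedium])
```

Because Sample, Medium and Publication are separate entities, several cultures can point at the same one without repeating its details.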

Page 35:

mcl : Medium mcl : Publication

Medium

mediumName, mediumNumber, mediumURL, mediumDescription, comments

Publication

dcterms:bibliographicCitation, dc:title, dc:creator, prism:publicationName, prism:volume, prism:number, prism:startingPage, prism:pageRange, dcterms:issued

Page 36:

MCL : fitness for our purpose

• MCL offers a broadly-applicable suite of data elements

. data elements are grouped according to workflow steps, not organism type

. applicable to algae and cyanobacteria

. the Strain concept supports the logical linking of related cultures

• the model is modular and easily extensible

. model cohesion is achieved through Object Properties

. links easily with genomic standards (see StrainInfo)

• born and raised in Europe (StrainInfo), but now going global

. Asian biorepositories network is considering adoption

. we’re invited to contribute to ongoing development

• primarily devised (custom-built) as a data exchange standard

Page 37:

1. Requirements

2. Options

3. Recommendation

Page 38:

Recommendation : dip a toe into the water

• MCL, custom-built for describing microbiological data, deserves consideration

Proposal

undertake a pilot, involving a small group of AMRiN participants,

to assess the suitability of MCL for AMRiN’s purpose.

Page 39:

AMRiN

AMRiN community

Page 40:

AMRiN participants’ input

map local elements to MCL elements

Note: some MCL elements may not have a local equivalent

identify local elements to be kept ‘private’

identify other local elements to be shared;

provide English definitions to enable reconciliation with other participants’ elements
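The input asked of each participant could be recorded as three small tables. This is a sketch only: the local field names, the definition text and the short list of MCL elements are hypothetical placeholders, with the MCL names taken from the Sample and Culture slides.

```python
# Hypothetical local field names mapped to MCL element names.
local_to_mcl = {
    "accession_no": "strainNumber",
    "species": "speciesName",
    "collected_on": "sampleDate",
}

# Local elements the participant wants to keep 'private' (hypothetical).
private_elements = {"storage_location"}

# Other local elements to be shared, each with an English definition (hypothetical).
shared_local_elements = {
    "host_plant": "Plant species from which the organism was isolated.",
}

# A small subset of MCL elements, enough to show the 'no local equivalent' case.
mcl_elements = {"strainNumber", "speciesName", "sampleDate",
                "isolationDate", "oxygenRelationship"}


def classify(local_name):
    """Say how a local element would be treated under the pilot."""
    if local_name in private_elements:
        return "private"
    if local_name in local_to_mcl:
        return "mcl:" + local_to_mcl[local_name]
    if local_name in shared_local_elements:
        return "shared-local"
    return "unmapped"


# MCL elements for which this participant has no local equivalent.
mcl_without_local = sorted(mcl_elements - set(local_to_mcl.values()))

print(classify("species"), classify("storage_location"), classify("host_plant"))
print(mcl_without_local)
```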

Page 41:

Pilot assessment

• Coverage?

• What additional common elements exist amongst the set to be shared?

How much orange overlaps purple?

How much purple overlaps purple?

• Other assessment criteria?
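Two of these questions can be put in numbers once participants have supplied their mappings. The sketch below assumes one possible reading: coverage is the share of MCL elements a participant can populate, and additional common elements are shared local elements offered by more than one participant. The participant names, element lists and the small MCL subset are all hypothetical, and the extra element names are assumed to have been reconciled already via their English definitions.

```python
from collections import Counter

# A small subset of MCL elements used for illustration.
mcl_elements = {"strainNumber", "speciesName", "isolationDate", "history"}

# Hypothetical pilot input: MCL elements each participant can populate,
# plus the additional local elements they are willing to share.
participants = {
    "collection_a": {
        "mapped": {"strainNumber", "speciesName"},
        "extra": {"host_plant", "pathogenicity"},
    },
    "collection_b": {
        "mapped": {"strainNumber", "speciesName", "isolationDate"},
        "extra": {"pathogenicity"},
    },
}


def mcl_coverage(mapped):
    """Fraction of the MCL elements that a participant can populate."""
    return len(mapped & mcl_elements) / len(mcl_elements)


coverage = {name: mcl_coverage(p["mapped"]) for name, p in participants.items()}

# Additional elements offered by more than one participant.
extra_counts = Counter(e for p in participants.values() for e in p["extra"])
common_extras = {e for e, n in extra_counts.items() if n > 1}

print(coverage)
print(common_extras)
```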

Page 42:

Pulling the pieces together

Please consider the foregoing proposal.

Does it seem reasonable to you?

Do you think there’s a better way?