Biodiversity Information Standards: are we going wrong, or just not quite right? Jim Croft Australian National Herbarium

Post on 24-Feb-2016

Page 1: Biodiversity Information Standards: are we going wrong, or just not quite right?

Biodiversity Information Standards: are we going wrong, or just not quite right?

Jim Croft

Australian National Herbarium

Page 2

Australian National Herbarium

Centre for Plant Biodiversity Research

Australian National Botanic Gardens

Parks Australia

Taxonomy Research and Information Network

Parks Australia

Department of the Environment, Water, Heritage and the Arts

Page 3

TDWG IN AUSTRALIA

Page 4

TDWG in Australia

[Map of Australia. City labels: Adelaide, Alice Springs, Armidale, Brisbane, Canberra, Darwin, Devonport, Gosford, Hobart, Launceston, Lismore, Maroochydore, Melbourne, Orange, Perth, Sydney, Townsville. Panel titles: INSTITUTIONS – Northern Territory; INSTITUTIONS – Queensland.]

Institutions listed:

• Australian National Insect Collection (CSIRO)
• Australian National Herbarium (CSIRO)
• Australian National Wildlife Collection (CSIRO)
• GAUBA Herbarium
• Australian Biological Resources Study

Page 5

Australian examples

• Australian Plant Name Index
  – Australian Plant Census
• Australian Fauna Directory
• Australia’s Virtual Herbarium
• Online Zoological Catalogue of Australian Museums
• Flora of Australia On-line
• Atlas of Living Australia
• Identify Life
• Taxonomy Research and Information Network

Page 7

HISCOM

• Herbarium Information Systems Committee
  – Representatives at TDWG 2008
    – Ben Richardson, Alex Chapman (PERTH)
    – Bill Barker (AD)
    – Alison Vaughan (MEL)
    – Karen Wilson (NSW)
    – Donna Lewis (DNA)
    – Jerry Cooper (CHR, NZ)
    – Helen Thompson (ABRS)
    – Greg Whitbread, Jim Croft (CANB)
  – The crucible of biodiversity informatics creativity

Page 8

TDWG principle # 0

• A good idea has a thousand fathers

• A bad one is a bastard

Page 9

TDWG: making anarchy chaos the standard

Page 10

TDWG principle # VI-a

“Before the beginning of great brilliance, there must be chaos.

Before a brilliant person begins something great, they must look foolish in the crowd.”

- I Ching

Page 11

TDWG: the art of herding cats

Page 12

TDWG: changing standards, or making change the standard?

Page 13

TDWG: Standardizing stuff...

or stuffing standards?

Page 14

Outline

• What is TDWG?
• TDWG and ‘Standards’
• Where TDWG Standards are needed
• Some TDWG projects
• TDWG Standards compliance
• Tensions for TDWG
• Future

Page 15

WHAT IS TDWG?

Page 16

TDWG Mission

• Develop, adopt and promote standards and guidelines for the recording and exchange of data about organisms

• Promote the use of standards through the most appropriate and effective means and

• Act as a forum for discussion through holding meetings and through publications

Page 18

Who are we?

‘TDWG is us’

Page 19

Who are we?

• Intersection of specimens, taxonomy, knowledge, information management

• Biologists, taxonomists, computer scientists
  – Each with an interest in the other’s domains
  – Each with something to offer each other’s domains

Page 20

Who are we?

• If TDWG did not exist, we would have to invent it

• Successful
  – Enduring
  – Popular
  – Moderately well recognized

Page 21

When are we?

• Phases of TDWG
  – Phase 0 (1985)
    • Seemed like a good idea at the time
  – Phase 1 (first decade)
    • Data dictionaries, data models
  – Phase 2 (second decade)
    • E-R models, DiGIR, DwC, XML, etc.
  – Phase 3 (nowish)
    • Schemas, ontologies, RDF
  – Phase 4 (?)
    • ?

Page 22

Why are we?

• Collaboration and sharing is essential
  – Taxonomy has become too big
  – Too diverse
  – Too complex
  – No one person can do it all
  – A ‘complete’ treatment requires collaboration
  – Collaboration requires consistency, standards

Page 23

Biodiversity Tower of Babel

Page 24

Why are we?

• Untangle the ‘biodiversity Babel’

• Develop common communication

• Harness efficiency of collaboration

• Economic pressures to reduce duplication

Page 25

Why are we?

• Science of information meets science of information technology

• Take advantage of new technology

• Taxonomy needs to be seen to be evolving

• “Business as usual is not an option”

Page 26

Why are we?

• An annual excuse to meet in warm places when it is cold elsewhere?

Page 27

Where do we fit?

[Diagram, credited to xkcd.com, with labels: taxonomists, computerists, informaticists, TDWG]

Page 28

Where have we come from?

• Frustrated taxonomists
  – Looking for a better way
  – Largely self-taught
• Bored computer scientists
  – Looking for excitement, challenge
• Misfits and visionaries
  – In search of a ‘Brave New World’
• Egomaniacs
  – In search of glory, fame, power, riches

Page 29

What are we now?

• Frustrated taxonomists
  – Looking for a better way
  – Largely self-taught
• Bored computer scientists
  – Looking for excitement, challenge
• Misfits and visionaries
  – In search of a ‘Brave New World’
• Egomaniacs
  – In search of glory, fame, power, riches

Page 30

Where are we going?

?

Page 31

Where are we going?

• Did we go wrong?
  – Where did we go wrong?
  – Why did we go wrong?
• Lost the plot?
  – Regain credibility?
    • Our community?
    • Our funders?
    • Ourselves?

Page 32

Where are we going?

• Perceptions of TDWG?
  – First decade
    • Taxonomists organizing their domain
    • Content focused
    • Understandable by taxonomists
  – Second decade
    • Taxonomists reaching limitations
    • Engaging technologists
    • Protocol and systems focussed
    • Opaque to taxonomists
  – Third decade?

Page 33

Where are we going?

• Perceptions of TDWG?
  – First decade
    • Content
    • Data dictionaries
    • Lists, vocabularies
  – Second decade
    • Protocols
    • Formats, structure
    • Applications
  – Third decade?
    • Ontologies?
    • Semantics?

Page 34

Where are we going?

• What should TDWG be about?

– The data?

– The technology?

– The applications?

– The community?

Page 35

TDWG Impediments

• Resources, funds
• Time
• Impetus, will, drive
• Complexity, domain knowledge
• Conservatism
• Rivalry
• Intellectual property, revenue advantage

Page 36

THE TDWG VISION

Page 37

A vision for TDWG

• Our domain in biodiversity?
  – Taxonomy?
  – Systematics?
  – Collections?
  – Biodiversity?
  – Publications?
  – Knowledge Management?
  – Knowledge discovery?
  – All of the above?

Page 38

A vision for TDWG

• Our Community?
  – Herbaria and museums?
  – Researchers?
  – Government and policy?
  – Conservation agencies? NGOs?
  – Natural resource management?
  – Education?
  – Public?
  – All of the above?

Page 39

A vision for TDWG

• Our questions?
  – What is it? How can I find out?
  – What does it look like?
  – Where does it occur?
  – Was it still there? When?
  – What occurs there with it?
  – What might occur there with it?
  – What is it related to?
  – Who says so?
  – How? Why?
  – All of the above?

Page 40

A vision for TDWG

• Our Products?
  – Data content standards?
  – Data storage standards?
  – Data communications protocols?
  – Data management applications?
  – Data management infrastructure?
  – Data visualization applications?
  – Data analysis applications?
  – All of the above?

Page 41

Knowledge pyramid

[Pyramid diagram with labels: The Real World, Samples, Data, Information, Knowledge, Wisdom]

Page 42

TDWG AND STANDARDS

Page 43

What is a standard?

• In common English:
  – A flag
  – An upright pole or beam
  – A backing for currency
  – American automobile
  – A bush on a long stalk
  – An ideal to be judged against
  – Model of authority or excellence
  – A basis for comparison
  – 1,980 board feet of wood
  – A newspaper
  – An established norm

Page 44

What is a standard?

• Rarely implies:
  – Requirement
  – Obligation
  – Compulsion
  – Compliance
  – ‘The law’
• But not so ‘technical standards’
  – Specify behaviour
  – Mandate behaviour

Page 45

What is a standard?

• “an explicit set of requirements to be satisfied by a material, product, or service”

- (ASTM International)

Page 46

TDWG STANDARDS

Page 47

TDWG Standards categories

• Technical specification (TS) (3)
  – Protocol, service, procedure, format
• Applicability statement (AS) (1 draft)
  – How a tech. spec. might be applied
• Best current practice (BCP) (0)
  – A description of good behaviour
• Data standard (DS) (0)
  – Content or controlled vocabularies

Page 48

TDWG Standards status

• Current standard
  – (3)
• Current 2005 Standard
  – (3?)
• Draft Standard
  – (3)
• Prior Standard
  – (7 tech specs; 6 data standards)
• Retired Standard
  – (0)

Page 49

THE STANDARDS PROCESS

Page 50

ISO Standards process

• ISO standards are:
  – Consensus
  – Industry wide
  – Voluntary

Page 51

ISO Standards process

• 0 preliminary
  – Study period underway
• 1 proposal
  – New project under consideration
• 2 preparatory
  – Working draft(s) under consideration
• 3 committee
  – Committee draft(s) under consideration
• 4 approval
  – Final draft standard under consideration
• 5 publication
  – Standard prepared for publication

Page 52

TDWG Standards process

• TDWG standards are:
  – Consensus
  – Community wide (+/-)
  – Voluntary

Page 53

TDWG Standards Process

Page 54

TDWG STANDARDS PRESENT

Page 55

Page 56

TDWG standards – present

• ABCD
  – Access to Biological Collections Data
• SDD
  – Structured Descriptive Data
• TCS
  – Taxon Concept Schema

Not bad for 22 years’ work...

Page 57

TDWG STANDARDS PAST

Page 58

TDWG standards - past

• ‘Prior Standards’
• Technical Specs (protocol stds):
  – HISPID 3 (now on v.5)
  – POSS (Plant Occurrence and Status)
  – Economic Botany Data Collection Std
  – Plant Names in Botanical Databases
  – XDF – language for definition and exchange
  – ITF – Botanic Gardens Records
  – DELTA

Page 59

TDWG standards - past

• ‘Prior Standards’
• Data standards (Content stds)
  – Authors of Plant Names
  – World Geographic Scheme for Plant Distributions
  – Botanico-Periodicum-Huntianum
  – Index Herbariorum
  – Floristic Regions of the World
  – TL2 – Taxonomic Literature and suppl.

Page 60

TDWG STANDARDS FUTURE

Page 61

TDWG standards – future

• ‘Draft standards’
  – Real soon now
• Standards documentation spec.
  – The standard way to do standards
• LSID Applicability Statement
  – How to do LSIDs
• NCD
  – Natural Collections Description
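
For readers who have not met LSIDs: a Life Science Identifier is a URN of the general shape urn:lsid:authority:namespace:object, with an optional trailing revision. The sketch below splits one into its parts in Python. It assumes only that general shape; the function name parse_lsid, the Lsid tuple and the example identifier are invented for illustration and are not taken from the applicability statement.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    """The parts of an LSID (hypothetical helper type for this sketch)."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text: str) -> Lsid:
    """Split urn:lsid:authority:namespace:object[:revision] into its parts."""
    parts = text.split(":")
    # Expect the two fixed prefixes plus three or four further components.
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    if any(part == "" for part in parts[2:5]):
        raise ValueError(f"empty component in LSID: {text!r}")
    # The revision is optional; treat a missing or empty one as absent.
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return Lsid(parts[2], parts[3], parts[4], revision)


# Made-up identifier, for illustration only.
example = parse_lsid("urn:lsid:example.org:names:12345")
print(example.authority, example.namespace, example.object_id, example.revision)
```

A real resolver does far more than this (discovering the authority's service, then fetching data and metadata separately); the sketch covers the identifier syntax only.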

Page 62

TDWG standards – future

• Watch this space?
• Observation data
  – Occurrence without specimens?
  – Ecological metadata language
• Phylogenetics data
  – Phylogeny repositories
  – Trees of life
  – Phylocode

Page 63

TDWG standards – future

• Watch this space?

• SPM – Species Profile Model
  – Online Journals; On-line Floras
  – Interactive Keys

• Images and multimedia

• Ethnobotany ontology

Page 64

TDWG standards – future

• How are we going to manage this?
  – Activities straddle many standards
  – Potential for duplication, conflict
• Technical Architecture Group
  – Ontologies
  – Vocabularies
  – Conflict identification, resolution
  – Evaluation, advice, recommendations

Page 65

WHERE TDWG STANDARDS ARE NEEDED

Page 66

Where are TDWG standards needed?

• Nomenclature
• Taxonomy
• Bibliographic
• Specimens
• Identification
• Description
• Images
• Multimedia
• Occurrence
• Spatial
• Observation
• Molecular
• Phylogeny
• People
• Institutions
• etc.

Page 67

Where are TDWG standards needed?

• The problem:
• TDWG activities have been activity and discipline based
  – ABCD as an example
    • Names, taxa, specimens, places, people, etc.
• Need to look at data from an ontological perspective
  – Data based
    • Not activity based

Page 68

TDWG – the 3-legged stool

• (definition of ‘stool’?)

• GUIDs
• Ontologies
• Exchange protocols

Page 69

Page 70

TDWG – the 3-legged stool

• Management cliche
• Planning
• Money
• Management
---
• Production
• Marketing
• Administration
---
• etc

Page 71

TDWG – the 3-legged stool

Page 72

TDWG STANDARDS COMPLIANCE

Page 73

TDWG standards compliance

• Pretty poor
  – Within institutions / projects
  – Between institutions / projects

• Partial compliance is not compliance

• Enhancement is not compliance

• Extension is not compliance

Page 74

TDWG standards compliance

• Why not?
  – Too complicated?
  – Inappropriate?
  – Deficient?
  – Too costly to implement?
  – Conservatism?
  – Apathy?
  – Individual arrogance?
  – Institutional arrogance?

Page 75

TDWG standards compliance

• Need for stability

• TDWG has a reputation
  – Pursuing the ‘bleeding edge’
  – “Keeping up with the Joneses”
  – Introducing new recommendations before old ones settled
  – Frustrating users
    • Especially smaller institutions

Page 76

TDWG standards compliance

• Total cost of ownership
  – Ultra technical solutions
    • Rare specialist skills
    • Expensive contractors
  – Maintenance costs
  – Upgrade costs
  – Migration costs
  – Users get stuck

Page 77

TDWG standards compliance

• What can be done?
  – Rationalization of standards?
  – More control of standards process?
  – Seek ‘appropriate technology’?
    • Not necessarily the best
  – Seek cheaper solutions?
  – Focus on the ontologies, not activities?
  – Apply institutional pressure?
  – Institutional mentorship and support?

Page 78

THE TENSIONS FOR TDWG

Page 79

Tensions in TDWG

• Taxonomy / technology
• Innovation / stability
• Innovation / conservatism
• Names / taxonomy
• Names / specimens
• Names / names
• Authority / credit
• Ownership / responsibility
• Data / metadata

Page 80

Why not?

• Why not web 2.0 / 3.0?

• Why not annotations?

• Why not Wikipedia?

• Why not microformatting?

Page 81

Disconnects

• Free access / ownership
  – Licensing, attribution, IP, credit
• Taxonomy / specimens
  – The big lie
• Concepts / names
  – Another big lie
• Linking taxa through basionyms
  – Another big lie
• Data / metadata
• Distributed systems vs cache

Page 82

Metadata

• So-called ‘data about data’

• “One man’s data is another’s metadata”

• Not a good or inspiring look

• Need a common and agreed understanding in TDWG domain

Page 83

Metadata

• Problem of LSID byte persistence
  – Applies to data
  – Does not apply to metadata
  – Redefine data as metadata?
  – Sophistry?
  – Distorting our ontologies?
• Need to sort this out
• Need to communicate the result

Page 84

Metadata

Yesterday upon the stair
Metadata wasn't there
It wasn't there again today
How I wish it would go away

Page 85

The 3 big lies

• Names and specimens
  – That there is some real connection between specimens bearing the same name
  – That distribution maps of specimens bearing the same name are meaningful
  – That identifications bearing the same name represent the same taxon
  – The ‘taxon concept problem’
  – Concept not explicit

Page 86

The 3 big lies

• Names and concepts
  – That names somehow imply an unambiguous taxon concept
  – That a taxon concept can be inferred from a name
  – An assumption
  – The ‘taxon concept problem’
  – Concept not explicit

Page 87

The 3 big lies

• Names and types
  – That if we are talking about names based on the same type they are the same taxon concept
  – That lists of names and synonyms based on the same type can be automatically merged
  – The ‘taxon concept problem’
  – Concept not explicit

Page 88

The 3 big lies

• What can we do?
  – Taxon reporting not unambiguous
  – Our results are at best indicative
    • Users assume or infer concepts
  – Perhaps biggest problem in taxonomy and biodiversity informatics
  – Be absolutely rigorous in talking about names and named concepts
  – Educate taxonomists
  – Educate clients
    • Limitations of data, applications
    • Implications of using data, limitations
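
The difference between a name and a named concept can be made concrete with a small sketch. The Python below groups identifications first by name string alone, then by name plus the treatment whose circumscription was applied (the ‘sec.’ reference). All records, the placeholder name "Aus bus", the treatment labels and the function names are invented for illustration; this is one possible way to model the point, not a TDWG schema.

```python
from collections import defaultdict
from typing import NamedTuple


class Identification(NamedTuple):
    specimen: str
    name: str          # the name string written on the label
    according_to: str  # treatment whose circumscription was applied ("sec.")


# Invented records: one name string, two different circumscriptions.
records = [
    Identification("specimen-1", "Aus bus", "Smith 1990"),
    Identification("specimen-2", "Aus bus", "Smith 1990"),
    Identification("specimen-3", "Aus bus", "Jones 2005"),
]


def group_by_name(idents):
    """Group specimens on the name string alone (the implicit-concept shortcut)."""
    groups = defaultdict(list)
    for ident in idents:
        groups[ident.name].append(ident.specimen)
    return dict(groups)


def group_by_named_concept(idents):
    """Group specimens on name plus the treatment that defines the concept."""
    groups = defaultdict(list)
    for ident in idents:
        groups[(ident.name, ident.according_to)].append(ident.specimen)
    return dict(groups)


by_name = group_by_name(records)
by_concept = group_by_named_concept(records)
print(len(by_name), "group by name;", len(by_concept), "groups by named concept")
# prints "1 group by name; 2 groups by named concept"
```

Grouping on the name alone silently merges specimens identified under different circumscriptions, which is why a distribution map built that way is at best indicative.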

Page 89

TDWG value for money

• Are we worth it?
  – This meeting cost c. $1,000,000
    • Airfares, accommodation, salaries, etc.
  – What did we accomplish?
    • Tangibles?
    • Intangibles?
  – What have we produced so far?
    • 3 standards, several +/- standards
    • Compliance?
    • A ‘state of mind’?

Page 90

TDWG value for money

• Can we do it better?
  – Can we do it cheaper, faster?
    • Use the wiki/listserv better
  – Accomplish more?
    • New standards
    • Better standards
  – Produce more?
    • New standards?
    • Retire standards?
    • Rationalize standards?

Page 91

WHERE TO FROM HERE

Page 92

Where to from here?

• Tools at our disposal
  – TDWG Executive
  – Technical Architecture Group
  – TDWG working groups
  – On-line forums, lists
  – Web and Wiki
  – On-line Journal

Page 93

Page 94

Page 95

Page 96

Where to from here?

• Increase TDWG Profile
  – ‘Market penetration’
  – Greater implementation, compliance
  – Attention to smaller institutions
    • ‘the long tail’
  – Multilingual standards
  – Strengthen partnerships, collaboration
    • GBIF, EoL, etc.
    • National initiatives

Page 97

Where to from here?

• TAG
  – Coordination of standards
  – Ontologies
  – Resolve metadata issues
  – Retire or deprecate standards
• ‘Us’
  – Participation
  – Implementation
  – Compliance

Page 98

Where to from here?

xkcd.com

Page 99

TDWG – a glass half full

• TDWG has a lot to do
• But it has accomplished a lot
• Without the foundation of TDWG there could be:
  – No AVH
  – No ALA
  – No GBIF
  – No EoL
  – No [name your biodiversity acronym]

Page 100

TDWG – a glass half full

• TDWG has strong participant support
  – c. 200 participants in TDWG 2008
• Key institutional engagement
  – International
  – National
  – Regional
  – Local
• Increasing demand for products
  – Global change, habitat depletion, etc.

Page 101

TDWG Mission

• Develop, adopt and promote standards and guidelines for the recording and exchange of data about organisms

• Promote the use of standards through the most appropriate and effective means and

• Act as a forum for discussion through holding meetings and through publications

Page 102

TDWG?

Page 103