

On-line Flora of Australia

Structured data management in modern on-line Flora treatments

Jim Croft, Centre for Plant Biodiversity Research, Australian National Herbarium & Australian National Botanic Gardens

Helen Thompson; Scott Payne, Australian Biological Resources Study, Environment Australia

Greg Whitbread, Centre for Plant Biodiversity Research, Australian National Herbarium & Australian National Botanic Gardens

• Jim Croft – initial schema design, proof of concept

• Helen Thompson – project management, schema design, MS Word macros for XML markup, image scanning

• Scott Payne – database and application design, Java

• Greg Whitbread – initial database and application design, data import, Java

Outline

• Background to the project
• Structure of a Flora
• Structure of a Flora in XML
• Structure of a Flora in a database
• Using structured Flora data
• Flora of Australia examples
• A general structure for Floras?
• Concept for Flora production
• Issues for Flora production
• Suggestions / Recommendations

Background

Flora of Australia pteridophytes

• Three separate volumes in the Flora of Australia
• The Electronic Pteridophyte Flora of Australia will integrate all these treatments into a single on-line resource.

Volume 48 includes all ferns and fern allies from mainland Australia

Volume 49 covers the oceanic islands of Norfolk Is and Lord Howe Is.

Volume 50 covers the oceanic islands of Christmas Is and Macquarie Is.

Electronic Flora of Australia

• Design schema for published Flora

Using XML Schema; accommodate all data elements

• Markup published Flora

MS Word macros to replace style with XML tags

• Design relational database

Structured tables in Oracle; accommodate all data elements

• Import Flora XML files

Oracle; accommodate all data elements
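
The import step can be sketched roughly as follows. This is a minimal illustration, not the project's code: the slides name Oracle and Java, whereas this sketch uses Python with the standard-library sqlite3 module as a stand-in, and the `<flora>` wrapper element, the table names and the column names are all assumptions made for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical marked-up Flora file; the <flora> wrapper is an assumption.
FLORA_XML = """
<flora>
  <taxon>
    <name>Platyzoma microphyllum</name><author>R.Br.</author>
    <synonym><name>Gleichenia platyzoma</name><author>F.Muell.</author></synonym>
  </taxon>
</flora>
"""

def import_flora(conn, xml_text):
    """Load each <taxon> and its <synonym> children into two linked tables."""
    conn.execute(
        "CREATE TABLE taxon (id INTEGER PRIMARY KEY, name TEXT, author TEXT)"
    )
    conn.execute(
        "CREATE TABLE synonym ("
        "taxon_id INTEGER REFERENCES taxon(id), name TEXT, author TEXT)"
    )
    for taxon in ET.fromstring(xml_text).iter("taxon"):
        cur = conn.execute(
            "INSERT INTO taxon (name, author) VALUES (?, ?)",
            (taxon.findtext("name"), taxon.findtext("author")),
        )
        # Each synonym row points back to the taxon row just inserted.
        for syn in taxon.findall("synonym"):
            conn.execute(
                "INSERT INTO synonym VALUES (?, ?, ?)",
                (cur.lastrowid, syn.findtext("name"), syn.findtext("author")),
            )
    conn.commit()

conn = sqlite3.connect(":memory:")
import_flora(conn, FLORA_XML)
rows = conn.execute(
    "SELECT t.name, s.name FROM taxon t JOIN synonym s ON s.taxon_id = t.id"
).fetchall()
print(rows)
```

Keeping synonyms in their own table, linked by key, is one way to let the database "accommodate all data elements" without flattening the hierarchy of the XML.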

Electronic Flora of Australia

• Test, adjust schema, correct markup, re-import

Refinement of structure, data and process


Electronic Flora of Australia

• Design query interface

Using HTML forms

• Design output formats

HTML for maximum compatibility

• Test output results

For structure and completeness

• Adjust, correct code, correct markup, rerun

etc., etc., etc…

Structure of a Flora

A Flora structure in XML

What is XML?

XML, or Extensible Markup Language

• A text markup language using simple and intuitive embedded plain-text coding tags
• Similar in appearance to the Hypertext Markup Language, HTML, used on the World Wide Web
• HTML controls the appearance of text delivered to Internet browsers
• XML describes and controls the structure and content of a document
• XSL and other style sheets control the appearance of text based on its structure and content
• Hierarchical sets of rules known as XML Schema define that structure, so that a marked-up word-processed text document becomes a structured, internally consistent, flexible database
• XML can be imported into other databases or modern XML-enabled browsers and computer applications to present selected views of the data in different ways.

What does XML look like?

Example in HTML

<p><b>Platyzoma microphyllum</b> R.Br., <i>Prodr.</i> 160 (1810)</p>
<p><i>Gleichenia platyzoma</i> F.Muell., <i>Veg. Chatham.-Isl.</i> 63 (1864). T: Facing Island, Qld, <i>R.Brown Iter Austral. 102</i>; lecto: BM.</p>
<p>Illus.: S.B.Andrews…</p>
<p>Rhizome short-creeping… Sporangia in zones in distal half of frond. Fig. 55</p>
<p>Widespread across northern Australia… Grows in sandy or swampy soils.... Map 135.</p>
<p>W.A.: 14.4 km NW of Mt…</p>

Example in XML

<taxon>
  <name>Platyzoma microphyllum</name> <author>R.Br.</author>,
  <publication><title>Prodr.</title> <page>160</page> <date>1810</date></publication>
  <synonym>
    <name>Gleichenia platyzoma</name> <author>F.Muell.</author>
    <publication><title>Veg. Chatham.-Isl.</title> <page>63</page> <date>1864</date></publication>
    <type>T: Facing Island, Qld, …</type>
  </synonym>
  <illustration>Illus.: S.B.Andrews…</illustration>
  <description>Rhizome short-creeping… Sporangia in zones in distal half of frond.</description>
  <figure>Fig. 55</figure>
  <locality>Widespread across northern Australia…</locality>
  <habitat>Grows in sandy or swampy soils...</habitat>
  <map>Map 135.</map>
  <specimens>W.A.: 14.4 km NW of Mt…</specimens>
</taxon>
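
Because the XML version labels each part of the treatment, a program can pick out individual fields by name, which the HTML version does not allow. The following is a minimal sketch in Python using the standard-library ElementTree parser; the record is an abbreviated, well-formed copy of the example above, and the function name `summarise` is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Abbreviated, well-formed copy of the taxon record from the slide.
TAXON_XML = """
<taxon>
  <name>Platyzoma microphyllum</name> <author>R.Br.</author>
  <publication><title>Prodr.</title> <page>160</page> <date>1810</date></publication>
  <synonym>
    <name>Gleichenia platyzoma</name> <author>F.Muell.</author>
  </synonym>
  <habitat>Grows in sandy or swampy soils...</habitat>
  <map>Map 135.</map>
</taxon>
"""

def summarise(xml_text):
    """Return selected fields of one taxon record as a plain dictionary."""
    taxon = ET.fromstring(xml_text)
    return {
        # Paths are relative to <taxon>, so "name" is the accepted name,
        # not the name nested inside <synonym>.
        "name": taxon.findtext("name"),
        "author": taxon.findtext("author"),
        "year": taxon.findtext("publication/date"),
        "synonyms": [s.findtext("name") for s in taxon.findall("synonym")],
        "habitat": taxon.findtext("habitat"),
    }

summary = summarise(TAXON_XML)
print(summary["name"], summary["author"], summary["year"])
```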

Fl. Australia XML Schema fragment

XML Schema is itself a structured XML document
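
One consequence of the schema being ordinary XML is that it can be read with the same tools as the data it describes. The sketch below parses a small schema with Python's standard-library ElementTree and lists the elements it declares. The schema fragment is hypothetical, written for this example; it is not the actual Flora of Australia schema.

```python
import xml.etree.ElementTree as ET

# Namespace prefix ElementTree uses for W3C XML Schema elements.
XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema fragment, NOT the real Flora of Australia schema.
SCHEMA_XML = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="taxon">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="author" type="xs:string"/>
        <xs:element name="habitat" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

schema = ET.fromstring(SCHEMA_XML)

# Walk the schema like any other XML document, in document order.
declared = [el.get("name") for el in schema.iter(XS + "element")]
optional = [
    el.get("name")
    for el in schema.iter(XS + "element")
    if el.get("minOccurs") == "0"
]
print(declared, optional)
```

This only inspects the schema; validating a Flora file against it would need a schema-aware validator, which the Python standard library does not include.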

A Flora structure in a database

Fl. Australia database structure

Using structured Flora data

Interactive Plant Identification

Flora of Australia Examples

Telopea speciosissima (Proteaceae)

the ‘Waratah’

State flower of New South Wales

A general structure for Floras?

Floras

Monographs

Serial Floras

Potential for a virtual multi Flora

• Using the ‘Virtual Herbarium’ model / DIGIR
• A shared generalized Flora schema
• Structured Flora in on-line databases
• Common gateways to access Flora data
• XML packaging of data
• Integrating portal to pull in data from several sources

• View taxa according to various search criteria
• View taxa in different sort orders
• View restricted subset of taxon information
• Compare different Flora treatments of the same taxon with a single query
• Add contemporary information from other data sources

• Taxonomy; species lists; illustrations; maps
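
The idea of comparing several Flora treatments of one taxon with a single query can be sketched as below. This is a toy illustration under stated assumptions: the two sources are in-memory lists standing in for separate on-line Flora databases reached through gateways, the names "Flora A" and "Flora B" are hypothetical, and the treatment text is placeholder text rather than real data.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Treatment:
    flora: str   # which Flora the treatment comes from
    taxon: str   # accepted name, used here as the search key
    text: str    # treatment text (placeholder in this sketch)

# Hypothetical stand-ins for separate on-line Flora databases.
SOURCES = {
    "Flora A": [
        Treatment("Flora A", "Doodia aspera", "placeholder treatment text A"),
        Treatment("Flora A", "Telopea speciosissima", "placeholder treatment text A"),
    ],
    "Flora B": [
        Treatment("Flora B", "Doodia aspera", "placeholder treatment text B"),
    ],
}

def compare_treatments(taxon, sources):
    """Collect every treatment of one taxon across all sources, sorted by Flora."""
    hits = [t for records in sources.values() for t in records if t.taxon == taxon]
    return sorted(hits, key=lambda t: t.flora)

for t in compare_treatments("Doodia aspera", SOURCES):
    print(f"{t.flora}: {t.text}")
```

In a real portal each source would be queried over the network and would return XML in the shared schema; only the merging step is shown here.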

Potential for a virtual multi Flora (diagram; labels: Clients, Common Web portals, Gateways, Databases)

Concept for Flora production

An old process of publication (diagram; labels: Botanist, W-P file, Editors, W-P file, Publisher, C-R Copy, Book, etc.)

A new process of publication (diagram; labels: Botanist, W-P file, Editors, W-P file, Publisher, C-R Copy, Book, etc., XML file, Database, XML file, Outputs)

A future process of publication (diagram; labels: Botanist, Editors, Publisher, C-R Copy, Book, etc., XML file, Database, Outputs)

Issues for Flora production

Issues & implications for Flora publication

• Application of XML, relational database and Internet technology for Flora publication:
  • Compilation
  • Production
  • Query and delivery process

• It offers considerable benefit in terms of:
  • Data management, integrity
  • Flexibility
  • Productivity

Issues & implications for Flora publication

• Compilation of Floras in a database context:
  • greater control over consistency and completeness
  • with the high standards of the printed product in mind.

• XML transformation capabilities:
  • greater flexibility in how information is displayed
  • maintains editorial style and standards
  • production of the traditional printed product.
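
As a rough illustration of this flexibility, the sketch below renders one structured taxon record in two ways: as an HTML line in the printed Flora's style and as a plain-text citation. It is written in Python with the standard library and is only a stand-in for a real style-sheet transformation; the function names and the abbreviated record are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Abbreviated taxon record using the element names from the XML example.
RECORD = """
<taxon>
  <name>Platyzoma microphyllum</name> <author>R.Br.</author>
  <publication><title>Prodr.</title> <page>160</page> <date>1810</date></publication>
</taxon>
"""

def _fields(xml_text):
    """Extract the fields both renderings need, stripped of stray whitespace."""
    taxon = ET.fromstring(xml_text)
    paths = {
        "name": "name",
        "author": "author",
        "title": "publication/title",
        "page": "publication/page",
        "date": "publication/date",
    }
    return {key: (taxon.findtext(path) or "").strip() for key, path in paths.items()}

def taxon_to_html(xml_text):
    """Render the record as an HTML paragraph in the printed Flora's style."""
    f = {key: escape(value) for key, value in _fields(xml_text).items()}
    return (
        f"<p><b>{f['name']}</b> {f['author']}, "
        f"<i>{f['title']}</i> {f['page']} ({f['date']})</p>"
    )

def taxon_to_text(xml_text):
    """Render the same record as a short plain-text citation."""
    f = _fields(xml_text)
    return f"{f['name']} {f['author']} ({f['date']})"

html_line = taxon_to_html(RECORD)
text_line = taxon_to_text(RECORD)
print(html_line)
print(text_line)
```

Both outputs come from the same stored record, so editorial style lives in the rendering functions rather than in the data.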

Issues & implications for Flora publication

• Internet access to Flora information:
  • will facilitate access to a wider audience

• Management within an on-line database framework:
  • will facilitate on-going maintenance
  • will reduce the progressive aging of Flora treatments.

• Access to information on demand:
  • will enable local production of tailored and more focused treatments based on local, regional, taxonomic and other requirements.

Availability

• The on-line Flora of Australia will be freely available.
  • ABIF-Flora
  • Australia’s Virtual Herbarium

• XML is already freely available.
  • http://w3c.org/

• XML development tools are available, but not free
  • e.g. XMLSpy, XML Authority

• Browsers that use XML are freely available
  • IE6, NS6

• Other products are becoming XML enabled
  • MS Word
  • MS Excel, etc.

• The ABIF-Flora DTD/Schema will be freely available.

Recommendations

• Floras can apply database and XML technology to compile and manage past, present and future Flora treatments

• Floras could use existing treatments to evaluate and test this technology and obtain permission to make the data, images and maps from these treatments available on the Internet.

• Floras can start doing it now…

• Concept being considered by Flora of Paraguay, Flora of Taiwan. ?Flora Malesiana? ?FNA?

Doodia aspera

(Blechnaceae)