Best Practices for Dataset Metadata in Ecological Metadata Language (EML)
Version 3
November 2017
This document is managed and distributed by the Environmental Data Initiative (EDI)
https://environmentaldatarepository.org
Please cite as:
Best Practices for Dataset Metadata in Ecological Metadata Language (EML Best Practices V3). 2017. Environmental Data Initiative.
ORGANIZATION AND CONVENTIONS OF THIS DOCUMENT

AUDIENCE
This document is intended for data managers. It assumes that readers are familiar with
● the basic structure of an XML document, and the ability to edit in an XML editor like Oxygen XML or XMLSpy.
● the process for contributing data to a repository. If you reached this document from a repository’s help page, contact them for more information.

FONTS AND TYPEFACE
Numbered examples of EML nodes are in fixed-width font:
<?xml version="1.0" encoding="UTF-8"?>
XML element and attribute names, XPath and references to element names in text are in boldface. Single element names are surrounded by angle brackets, as they appear in XML.
<dataTable>
/eml:eml/@packageId
Some recommendations have special context, e.g., an XML element or attribute may be requested by a community (e.g., LTER), or required by the EDI repository (but not by other repositories).
Context notes: Recommendations for EML usage in a specific context are called “context notes”, and are placed in separate paragraphs, in italic.

DEFINITIONS
EML preparer: the person responsible for “building” the EML metadata record. Generally, this is a data manager working with a project or physical site that produces data.
Contributor: the research project contributing the data package, e.g., an LTER or OBFS site, or a Macrosystems project. Generally, the “EML preparer” works with or for the “Contributor.”
Data package: the EML metadata together with its entity or entities. This is generally the unit housed in repositories. We use this term to avoid confusion with the EML element “dataset”.
OTHER EML RESOURCES
Some sections refer to further information or tools. These can be found on the EDI website, under “Resources and How To...”, at https://environmentaldatarepository.org
Table of Contents
Organization and conventions of this document
    Audience
    Fonts and typeface
    Definitions
    Other EML Resources
I. Introduction
    History
    General Recommendations
    Metadata Distribution
    Data Package Identifiers
    High-priority Elements
II. Content recommendations For Elements and Attributes
    The root element: <eml:eml>
    @schemaLocation (XML attribute)
    @packageId (XML attribute)
    Top Level Elements
    id, system and scope (XML attribute-group)
    access
    alternateIdentifier
    title (dataset)
    People and Organizations (Parties)
    pubDate
    abstract
    keywordSet and keyword
    intellectualRights
    distribution
    coverage
    maintenance
    methods
    project
    [entity] = dataTable, spatialRaster, spatialVector, storedProcedure, view, otherEntity
    attributeList
    constraint
    additionalMetadata
III. Descriptions of EML sample files provided with this document
INDEX
I. INTRODUCTION
This document contains current "Best Practice" recommendations for EML content for metadata related to ecological and environmental data. It is intended to augment the EML schema documentation (normative documents) for a less-technical audience. The current version (v3, 2017) is one component of several resources available to EML preparers. The recommendations are directed towards the following goals:
● Provide guidance and clarification in the implementation of EML for data packages
● Minimize heterogeneity of EML documents to simplify development and re-use of software built to ingest them
● Maximize interoperability of EML documents to facilitate data synthesis
At the time of this document's publication (late 2017), the version of EML in production was EML 2.1.1. EML 2.2.0 is anticipated within the next year. Contact EDI for more information.

HISTORY
EML Best Practice recommendations have evolved over time. The most active contributors have been members of the LTER Information Managers Committee in multiple working groups and workshops. EML has been widely used for several years with multiple applications written against it, and the community has had the opportunity to observe the consequences of many content patterns. As much as possible, recommendations have been aligned with those experiences, as well as with the capability of data contributors.
Timeline and Previous Revisions
• 2017 Best Practices for Dataset Metadata in EML v3 (this document)
• 2016 EDI inception, see http://environmentaldatainitiative.org
• 2011 EML Best Practices for LTER sites v2
• 2008 EML 2.1 release
• 2004 EML Best Practices for LTER sites
• 2003 LTER adopts EML as network exchange standard
Contributors, including LTER EML Best Practices Working Groups and workshops in 2003, 2004, 2010 (alphabetical order): Dan Bahauddin, Barbara Benson, Emery Boose, James Brunt, Duane Costa, Corinna Gries, Don Henshaw, Margaret O’Brien, Ken Ramsey, Inigo San Gil, Mark Servilla, Wade Sheldon, Philip Tarrant, Theresa Valentine, John Vande Castle, Kristin Vanderbilt, Jonathan Walsh, Yang Xia
GENERAL RECOMMENDATIONS
Following are general best practices for handling EML dataset metadata:

Metadata Distribution
Do not publicly distribute EML documents containing elements with incorrect information, e.g., as a workaround for missing metadata or to meet validation requirements. Pre-publication drafts, or EML produced for demonstration or testing purposes, should be clearly identified as such and not contributed to public archives, because these are passed on to large-scale clearinghouses. For previews of drafts or handling test and demonstration data packages, consult your repository to learn about options.

Data Package Identifiers
Metadata and dataset versioning are controlled by the contributor, and so identifiers are tied to local systems. Many repository systems that accept EML-described data support principles of immutable metadata and data entity versioning. EML has elements to contain package identifiers, although these may also be assigned externally. It is the responsibility of the submitters to understand the practices of their intended repository when using identifiers.

High-priority Elements
● To support locating data by time, geographic location, and taxonomically, metadata should provide as much information as possible for the data package, in the three <coverage> elements of <temporalCoverage> (when), <geographicCoverage> (where) and <taxonomicCoverage> (what).
● For a potential user to evaluate the relevance and usability of the data package for their research study or synthesis projects, metadata should include detailed descriptions in the <project>, <methods>, <protocol>, and <intellectualRights> elements.
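As a rough illustration of the three coverage elements (when, where, what), the following sketch shows one <coverage> tree for the fictitious arthropod dataset used in this document's examples. The place description, bounding coordinates, dates, and taxon are invented for illustration and are not taken from a real data package.

```xml
<!-- Sketch only: all values below are hypothetical -->
<coverage>
  <!-- where -->
  <geographicCoverage>
    <geographicDescription>Ficity, FI, USA: ten trap locations in and around the city</geographicDescription>
    <boundingCoordinates>
      <westBoundingCoordinate>-112.5</westBoundingCoordinate>
      <eastBoundingCoordinate>-111.5</eastBoundingCoordinate>
      <northBoundingCoordinate>34.0</northBoundingCoordinate>
      <southBoundingCoordinate>33.0</southBoundingCoordinate>
    </boundingCoordinates>
  </geographicCoverage>
  <!-- when -->
  <temporalCoverage>
    <rangeOfDates>
      <beginDate>
        <calendarDate>1998-01-01</calendarDate>
      </beginDate>
      <endDate>
        <calendarDate>2003-12-31</calendarDate>
      </endDate>
    </rangeOfDates>
  </temporalCoverage>
  <!-- what -->
  <taxonomicCoverage>
    <taxonomicClassification>
      <taxonRankName>Phylum</taxonRankName>
      <taxonRankValue>Arthropoda</taxonRankValue>
    </taxonomicClassification>
  </taxonomicCoverage>
</coverage>
```

Filling all three branches, even coarsely, lets a search by date range, bounding box, or taxon find the package.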
II. CONTENT RECOMMENDATIONS FOR ELEMENTS AND ATTRIBUTES

The root element: <eml:eml>
This element is the root element in all EML documents. The XPath notation is:
/eml:eml
The root element holds two important parts, both of which are optional, but recommended.

@schemaLocation (XML attribute)
This attribute is found at this location (XPath):
/eml:eml/@schemaLocation
The schemaLocation attribute tells a processor the name of the schema to which the EML document belongs and where to find it. Most repositories check schema compliance when data packages are deposited, but it is highly recommended that data managers know how and where to specify the schema that their metadata document should adhere to. This way, they can validate their own work in progress, e.g., through an XML editor like Oxygen XML.
@packageId (XML attribute)
This attribute is found at this location (XPath):
/eml:eml/@packageId
As outlined elsewhere, EML preparers should manage unique identifiers and versioning at the local level (see @system discussion below). The packageId attribute can be used to contain the same identifier as is used by the repository.
See Section III for other information about EML documents in Metacat.
Context Note: The packageId attribute is required in all EML documents submitted to EDI. It is entered into the repository software, and the format is standardized to three parts: scope, package-number, revision. The scope should be “edi” unless another scope is justified by prior arrangement. See Example 1.
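A minimal sketch of a root element carrying an EDI-style packageId, assuming the three-part format described in the context note (scope, package number, revision). The identifier edi.100.2, the system value, and the local schema path eml.xsd are hypothetical placeholders; confirm actual values with the repository before submitting.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: packageId, system, and the schema path "eml.xsd" are hypothetical -->
<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.1"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="eml://ecoinformatics.org/eml-2.1.1 eml.xsd"
         packageId="edi.100.2"
         system="edi">
  <!-- packageId parts: scope "edi", package number 100, revision 2 -->
  <dataset>
    <title>Long-term Ground Arthropod Monitoring Dataset at Ficity, USA from 1998 to 2003</title>
    <creator>
      <organizationName>Fictitious LTER Site</organizationName>
    </creator>
    <contact>
      <positionName>Information Manager</positionName>
    </contact>
  </dataset>
</eml:eml>
```

A later revision of the same package would keep the scope and package number and change only the last part, e.g. edi.100.3.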
Top Level Elements
An EML dataset is composed of up to three elements under the root element (<eml:eml>):
<access>
<dataset>
<additionalMetadata>

id, system and scope (XML attribute-group)
This attribute group can be used on these EML elements:
<access>
<dataset>
<creator>
<associatedParty>
<contact>
<metadataProvider>
<publisher>
<coverage>
<geographicCoverage>
<temporalCoverage>
<taxonomicCoverage>
<distribution>
<software>
<citation>
<protocol>
<project>
<dataTable>
<otherEntity>
<spatialRaster>
<spatialReference>
<spatialVector>
<storedProcedure>
<view>
<attribute>
<constraint>
These three attributes are found as a group and are usually optional. The primary use of the id attribute is as an internal reference, hence each id must be unique within one EML document. E.g., a <creator> must have a different id than a <dataTable>. And if the same person appears in several places (at dataset/creator, protocol/creator or project/creator), the same id cannot be repeated, so either the content of the id must be changed or a reference used for repeated instances.
This restriction can cause problems when content is drawn from a system with IDs (e.g. a personnel database), and is under consideration by the EML developers. Ideally the three attributes would work together. The scope attribute can have one of two values, “system” or “document”. It is preferred that when the scope is set to “system”, the system attribute defines the ID-system and the id attribute content is (presumably) from that system.
Currently, a reasonable general practice is to define a system on the <eml:eml> element and set it to the site (but not set the system attribute at any other level), and to set scope=“document” on elements other than <eml:eml>.
Example 1: attributes packageId, id, system, and scope
<?xml version="1.0" encoding="UTF-8"?>
<eml:eml xmlns:ds="eml://ecoinformatics.org/dataset-2.1.0"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:eml="eml://ecoinformatics.org/eml-2.1.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:stmml="http://www.xml-cml.org/schema/stmml"
    xsi:schemaLocation="eml://ecoinformatics.org/eml-2.1.0 https://nis.lternet.edu/eml-2.1.0/eml.xsd"
    packageId="knb-lter-fls.21.3" system="FLS" scope="system">
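The uniqueness rule for id has a practical consequence when one person fills several roles. One way to handle it, sketched below under the assumption that the person is described once and pointed to afterwards, is the <references> element. The id value pers-1 and the person are illustrative only, and other required dataset content is omitted.

```xml
<!-- Sketch only: ids and names are hypothetical; other dataset elements omitted -->
<dataset id="FLS-1" scope="document">
  <title>Long-term Ground Arthropod Monitoring Dataset at Ficity, USA from 1998 to 2003</title>
  <!-- Full description appears once and carries the id -->
  <creator id="pers-1" scope="document">
    <individualName>
      <givenName>Joe</givenName>
      <surName>Ecologist</surName>
    </individualName>
  </creator>
  <!-- Repeated instance points back to the id instead of repeating it -->
  <contact>
    <references>pers-1</references>
  </contact>
</dataset>
```

Because the second instance carries no id of its own, the document keeps every id unique while still stating that creator and contact are the same person.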
access
This element is found at these locations (XPath):
/eml:eml/access
/eml:eml/dataset/[entity]/physical/distribution/access
<access> contains a list of rules defining permissions for this metadata record and its data entity. Values must be applicable by the system where data is stored. Many repositories follow the KNB system of using an access control format that conforms to the LDAP “distinguishedName (dn)” for an individual, as in “uid=FLS,o=LTER,dc=ecoinformatics,dc=org”.
As of EML 2.1.0, <access> trees are allowed at two places: as the first child of the <eml:eml> root element (a sibling to <dataset>) for controlling access to the entire document, and in a physical/distribution tree for controlling access to the resource URL. With the exception of certain sensitive information, metadata should be publicly accessible. The <access> element is optional, and if omitted, the repository may presume that only the dataset submitter will be allowed access.

Example 2: access
<access authSystem="knb" order="allowFirst" scope="document">
  <allow>
    <principal>uid=FLS,o=lter,dc=ecoinformatics,dc=org</principal>
    <permission>all</permission>
  </allow>
  <allow>
    <principal>public</principal>
    <permission>read</permission>
  </allow>
</access>
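For the second placement, a sketch of an <access> tree inside an entity's physical/distribution tree might look like the following. The URL is a hypothetical placeholder, and how a given repository interprets entity-level rules is an assumption to verify with that repository.

```xml
<!-- Sketch only: the URL is hypothetical; this fragment sits inside an entity's <physical> element -->
<distribution>
  <online>
    <url function="download">https://example.org/data/arthropods.csv</url>
  </online>
  <!-- Only the site account is granted access to this data entity; no public rule is given -->
  <access authSystem="knb" order="allowFirst" scope="document">
    <allow>
      <principal>uid=FLS,o=lter,dc=ecoinformatics,dc=org</principal>
      <permission>all</permission>
    </allow>
  </access>
</distribution>
```

Combined with a document-level tree that allows public read, this pattern keeps the metadata discoverable while the data entity itself stays restricted.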
dataset
This element is found at this location (XPath):
/eml:eml/dataset
Under <dataset>, the following elements are available. Some are optional, but if they appear, this order is enforced by the schema. Generally, the recommendations are presented here in this order, with the exception of elements related to people and organizations, which are grouped together so that the distinctions between the uses of those elements are clear. Elements that can appear at different levels within an EML file are discussed at their first appearance, or highest level.
<alternateIdentifier>
<shortName>
<title>
<creator>
<metadataProvider>
<associatedParty>
<pubDate>
<language>
<series>
<abstract>
<keywordSet>
<additionalInfo>
<intellectualRights>
<distribution>
<coverage>
<purpose>
<maintenance>
<contact>
<publisher>
<pubPlace>
<project>
These elements are then followed by one or more elements for the data entity (or entities), designated by choosing:
[dataTable | spatialRaster | spatialVector | storedProcedure | view | otherEntity]
alternateIdentifier
The alternateIdentifier element is found at these locations (XPath):
/eml:eml/dataset/alternateIdentifier
/eml:eml/dataset/[entity]/alternateIdentifier
The contributing organization’s local dataset identifier should be listed as the EML <alternateIdentifier>, particularly when it differs from the “packageId” attribute in the <eml:eml> element. The <alternateIdentifier> should also be used to denote that a package belongs to more than one contributing organization by including each individual ID in a separate <alternateIdentifier> tag. At the entity level, the <alternateIdentifier> should contain an alternate name for the data table (or other entity) itself (see additional comments under entities, below).
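A short sketch of the multi-organization case, assuming two contributing organizations that each keep their own local identifier. The second organization ("ABC") and both identifier values are invented for illustration; the optional system attribute is used here to say which organization issued each ID.

```xml
<!-- Sketch only: "ABC" and the identifier values are hypothetical; other dataset elements omitted -->
<dataset>
  <!-- One alternateIdentifier per contributing organization -->
  <alternateIdentifier system="FLS">FLS-1</alternateIdentifier>
  <alternateIdentifier system="ABC">ABC-arthropods-07</alternateIdentifier>
  <title>Long-term Ground Arthropod Monitoring Dataset at Ficity, USA from 1998 to 2003</title>
</dataset>
```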
title (dataset)
The title element is found at these locations (XPath):
/eml:eml/dataset/title
/eml:eml/dataset/methods/methodStep/protocol/title
/eml:eml/dataset/project/title
The dataset <title> should be descriptive and should mention the data collected, geographic context, research site, and time frame (what, where, and when).

Example 3: dataset, alternateIdentifier, shortName, title
<dataset id="FLS-1" system="FLS" scope="system">
  <alternateIdentifier>FLS-1</alternateIdentifier>
  <shortName>Arthropods</shortName>
  <title>Long-term Ground Arthropod Monitoring Dataset at Ficity, USA from 1998 to 2003</title>
People and Organizations (Parties)
People and organizations are all described using a “ResponsibleParty” group of elements, which is found at these locations (XPath):
/eml:eml/dataset/creator
/eml:eml/dataset/contact
/eml:eml/dataset/metadataProvider
/eml:eml/dataset/associatedParty
/eml:eml/dataset/publisher
/eml:eml/dataset/project/creator
/eml:eml/dataset/methods/methodStep/protocol/creator
General recommendations: When using <individualName> elements anywhere within an EML document, names should be constructed with English alphabetization in mind. Many sites have found that maintaining full contact information for every creator is impractical; however, a few important pieces of contact information should be kept up to date (see below). If a name includes a suffix, it should be included in the <surName> element after the last name.
It is recommended to include complete contact information for a permanent role that is independent of the person holding that position. For example, for an information manager or site contact, pay careful attention to the phone number and use an e-mail alias that can be passed on (see below, under <contact>).
With the advent of general identifiers such as ORCIDs, the text in the <address>, <phone>, and <onlineUrl> elements may become unnecessary for individuals and so is optional if an individual’s ORCID is included. <electronicMailAddress> is recommended to simplify contacting responsible parties. See the <userId> field. ORCID identifiers are not yet available for organizations, so <address>, <phone>, and <onlineUrl> elements should be included for them. In the examples, these elements are included for completeness.

<userId>
This element is found at these locations (XPath):
/eml:eml/dataset/creator/userId
/eml:eml/dataset/contact/userId
/eml:eml/dataset/metadataProvider/userId
/eml:eml/dataset/associatedParty/userId
/eml:eml/dataset/publisher/userId
/eml:eml/dataset/project/creator/userId
/eml:eml/dataset/methods/methodStep/protocol/creator/userId
The optional <userId> field holds identifiers for responsible parties from other systems. This element is repeatable so that multiple systems can be referenced. EML preparers should contact the system they plan to use to learn their preferences for inclusion in metadata. The examples here are for ORCID identifiers, and that organization has asked that its full URI be used as both the system attribute, and as the head of the identifier itself.
Example 4: creator
<creator id="org-1" system="FLS" scope="system">
  <organizationName>Fictitious LTER Site</organizationName>
  <address>
    <deliveryPoint>Department for Ecology</deliveryPoint>
    <deliveryPoint>Fictitious State University</deliveryPoint>
    <deliveryPoint>PO Box 111111</deliveryPoint>
    <city>Ficity</city>
    <administrativeArea>FI</administrativeArea>
    <postalCode>11111-1111</postalCode>
  </address>
  <phone phonetype="voice">(999) 999-9999</phone>
  <electronicMailAddress>[email protected]</electronicMailAddress>
  <onlineUrl>http://www.fsu.edu/</onlineUrl>
  <userId system="https://orcid.org">https://orcid.org/0000-0000-0000-0000</userId>
</creator>
<creator id="pos-1" system="FLS" scope="system">
  <positionName>FLS Lead PI</positionName>
  <address>
    <deliveryPoint>Department for Ecology</deliveryPoint>
    <deliveryPoint>Fictitious State University</deliveryPoint>
    <deliveryPoint>PO Box 111111</deliveryPoint>
    <city>Ficity</city>
    <administrativeArea>FI</administrativeArea>
    <postalCode>11111-1111</postalCode>
  </address>
  <phone phonetype="voice">(999) 999-9999</phone>
  <electronicMailAddress>[email protected]</electronicMailAddress>
  <onlineUrl>http://www.fsu.edu/</onlineUrl>
  <userId system="https://orcid.org">https://orcid.org/0000-0000-0000-0000</userId>
</creator>
<creator id="pers-1" system="FLS" scope="system">
  <individualName>
    <salutation>Dr.</salutation>
    <givenName>Joe</givenName>
    <givenName>T.</givenName>
    <surName>Ecologist Jr.</surName>
  </individualName>
  <organizationName>FSL LTER</organizationName>
  <address>
    <deliveryPoint>Department for Ecology</deliveryPoint>
    <deliveryPoint>Fictitious State University</deliveryPoint>
    <deliveryPoint>PO Box 111111</deliveryPoint>
    <city>Ficity</city>
    <administrativeArea>FI</administrativeArea>
    <postalCode>11111-1111</postalCode>
  </address>
  <phone phonetype="voice">(999) 999-9999</phone>
  <electronicMailAddress>[email protected]</electronicMailAddress>
  <onlineUrl>http://www.fsu.edu/~jecologist</onlineUrl>
  <userId system="https://orcid.org">https://orcid.org/0000-0000-0000-0000</userId>
</creator>
creator
This element is found at this location (XPath):
/eml:eml/dataset/creator
The <creator> is considered to be the author of the data package, i.e. the person(s) responsible for intellectual input into its creation. <surName> and <givenName> elements are used to build citations, so these should be completed fully for credit to be understandable. For long-term data, e.g., from an LTER Site, preparers should include the organization (using the <organizationName>) or current principal investigator (PI, using <positionName>). It should be kept in mind that in the past, different approaches have led to confusion over how to best search for long-term data, and searchers frequently default to searches using the PI’s last name. Therefore it is a reasonable practice to include more creators rather than fewer, even if it blurs the credit for long-term data.
metadataProvider
This element is found at this location (XPath):
/eml:eml/dataset/metadataProvider
The <metadataProvider> element lists the person or organization responsible for producing or providing the metadata content. For primary datasets generated by LTER sites, the LTER site should typically be listed under <metadataProvider> using the <organizationName> element. For acquired datasets, where the <creator> or <associatedParty> are not the same people who produced the metadata content, the actual metadata content provider should be listed instead (see Example below).

Example 5: metadataProvider
<metadataProvider>
  <organizationName>Fictitious LTER Site</organizationName>
  <address>
    <deliveryPoint>Department of Ecology</deliveryPoint>
    <deliveryPoint>Fictitious State University</deliveryPoint>
    <deliveryPoint>PO Box 111111</deliveryPoint>
    <city>Ficity</city>
    <administrativeArea>FI</administrativeArea>
    <postalCode>11111-1111</postalCode>
  </address>
  <phone phonetype="voice">(999) 999-9999</phone>
  <electronicMailAddress>[email protected]</electronicMailAddress>
  <onlineUrl>http://www.fsu.edu/</onlineUrl>
  <userId system="https://orcid.org">https://orcid.org/0000-0000-0000-0000</userId>
</metadataProvider>
associatedParty
This element is found at this location (XPath):
/eml:eml/dataset/associatedParty
List other people who were involved with the data in some way (field technicians, student assistants, etc.) as <associatedParty>. All <associatedParty> trees require a <role> element. The parent university, institution, or agency could also be listed as an <associatedParty> using a <role> of “owner” when appropriate.

Example 6: associatedParty
<associatedParty id="12010" system="FLS" scope="system">
  <individualName>
    <givenName>Ima</givenName>
    <surName>Testuser</surName>
  </individualName>
  <organizationName>FSL LTER</organizationName>
  <address>
    <deliveryPoint>Department for Ecology</deliveryPoint>
    <deliveryPoint>Fictitious State University</deliveryPoint>
    <deliveryPoint>PO Box 111111</deliveryPoint>
    <city>Ficity</city>
    <administrativeArea>FI</administrativeArea>
    <postalCode>11111-1111</postalCode>
  </address>
  <phone phonetype="voice">(999) 999-9999</phone>
  <electronicMailAddress>[email protected]</electronicMailAddress>
  <onlineUrl>http://search.lternet.edu/directory_view.php?personid=12010&amp;query=itestuser</onlineUrl>
  <userId system="https://orcid.org">https://orcid.org/0000-0000-0000-0000</userId>
  <role>Technician</role>
</associatedParty>
contact
This element is found at this location (XPath):
/eml:eml/dataset/contact
A <contact> element is required in all EML metadata records. Full contact information should be included for the position of data manager or other designated contact, and should be kept current and independent of personnel changes. If several contacts are listed (e.g. both a data and site manager) all should be kept current. Technicians who performed the work belong under <associatedParty> rather than <contact>. Complete the <address>, <phone>, <electronicMailAddress>, and <onlineUrl> elements for the <contact> element.

Example 7: contact
<contact id="pos-4">
  <positionName>Information Manager</positionName>
  <address>
    <deliveryPoint>Department for Ecology</deliveryPoint>
    <deliveryPoint>Fictitious State University</deliveryPoint>
    <deliveryPoint>PO Box 111111</deliveryPoint>
    <city>Ficity</city>
    <administrativeArea>FI</administrativeArea>
    <postalCode>11111-1111</postalCode>
  </address>
  <phone phonetype="voice">(999) 999-9999</phone>
  <electronicMailAddress>[email protected]</electronicMailAddress>
  <onlineUrl>http://www.fsu.edu/</onlineUrl>
  <userId system="https://orcid.org">https://orcid.org/0000-0000-0000-0000</userId>
</contact>
publisher
This element is found at this location (XPath):
/eml:eml/dataset/publisher
The organization producing the EML metadata (e.g., an LTER site or field station) should be placed in the <publisher> element. Spell out the organization’s name (<organizationName>). Complete the <address>, <phone>, <electronicMailAddress>, and <onlineUrl> elements for each publisher element. Some citation displays may use this element, although typically, the repository becomes the publisher in citations.

Example: publisher
<publisher>
  <organizationName>Fictitious LTER site</organizationName>
</publisher>
pubDate
This element is found at this location (XPath):
/eml:eml/dataset/pubDate
The year of public release of data online should be listed as the <pubDate> element. Because this element may be used in constructing citations, the pubDate also should reflect the 'recentness' of a package, with pubDate updated along with significant revision or data additions (e.g., corrected data, or additions to an ongoing time series). There is an argument for pubDate referring to the original date of release, but this is probably only useful for static data packages, or if the only metadata changes are to enhance discovery.

abstract
This element is found at these locations (XPath):
/eml:eml/dataset/abstract
/eml:eml/dataset/project/abstract
For a dataset, the abstract element can appear at the resource level or the project level. The <abstract> element will be used for full-text searches, and it should be rich with descriptive text. In particular, descriptions should include information that does not fit into structured metadata, and focus on the “what”, “when”, and “where” information, general taxonomic information, as well as whether the dataset is ongoing or completed. Some general methods description is appropriate, and broad classes of measured parameters should also be included. For a large number of parameters, use categories instead of listing all parameters (e.g. use the term “nutrients” instead of nitrate, phosphate, calcium, etc.), in combination with the parameters that seem most relevant for searches.
keywordSet and keyword
This element is found at these locations (XPath):
/eml:eml/dataset/keywordSet
/eml:eml/dataset/project/keywordSet
It is recommended that meaningful sets of keywords each be contained within a <keywordSet> tag. Use one <keywordSet> for a group of terms identifying the contributing organization(s), e.g., the LTER or OBFS site, LTREB or Macrosystems project, which is especially useful if data are co-funded or funding is leveraged. Meaningful geographic place names also are appropriate (e.g. state, city, county). If groups of keywords are from a specific vocabulary, its name belongs in the optional tag <keywordThesaurus>.
Context: Communities sometimes have specific requests for keywords to assist in searches. E.g., the LTER requests that keywords include LTER core research area(s), the network acronym (LTER, ILTER, etc.), the three-letter site acronym and the site name. In addition to specific keywords, relevant conceptual keywords should also be included, e.g., from the LTER Controlled Vocabulary.

Example: pubDate, abstract, keywordSet, keyword
<pubDate>2014</pubDate>
<abstract>
  <para>Ground arthropod communities are monitored in different habitats in a rapidly changing environment. The arthropods are collected in traps four times a year in ten locations and determined as far as possible to family, genus or species.</para>
</abstract>
<keywordSet>
  <keyword keywordType="place">City</keyword>
  <keyword keywordType="place">State</keyword>
  <keyword keywordType="place">Region</keyword>
  <keyword keywordType="place">County</keyword>
  <keyword keywordType="theme">FLS</keyword>
  <keyword keywordType="theme">Fictitious LTER Site</keyword>
  <keyword keywordType="theme">LTER</keyword>
  <keyword keywordType="theme">Arthropods</keyword>
  <keyword keywordType="theme">Richness</keyword>
  <keywordThesaurus>FLS site thesaurus</keywordThesaurus>
</keywordSet>
<keywordSet>
  <keyword keywordType="theme">ecology</keyword>
  <keyword keywordType="theme">biodiversity</keyword>
  <keyword keywordType="theme">population dynamics</keyword>
  <keyword keywordType="theme">terrestrial</keyword>
  <keyword keywordType="theme">arthropods</keyword>
  <keyword keywordType="theme">pitfall trap</keyword>
  <keyword keywordType="theme">monitoring</keyword>
  <keyword keywordType="theme">abundance</keyword>
  <keywordThesaurus>LTER controlled vocabulary</keywordThesaurus>
</keywordSet>
<keywordSet>
  <keyword keywordType="theme">populations</keyword>
  <keywordThesaurus>LTER core research areas</keywordThesaurus>
</keywordSet>
intellectualRights
This element is found at this location (XPath):
/eml:eml/dataset/intellectualRights
<intellectualRights> are controlled at the source; however, it is recommended that data be released with as few restrictions as possible. Each data package should contain a data access policy, plus a description of any deviation from the general policy specific for this particular package (e.g. restricted-access packages). The time frame for release should be included as well.
Context: If no <intellectualRights> element is included, EDI will insert text that releases data under “CC-0” (shown in example). The LTER Network-wide default policy is “CC-by”. Please consult those organizations for more information and more details.

Example: intellectualRights
<intellectualRights>
  <section>
    <title>Data Policy</title>
    <para>This data package is released to the “public domain” under Creative Commons CC0 1.0 “No Rights Reserved” (see: https://creativecommons.org/publicdomain/zero/1.0/). It is considered professional etiquette to provide attribution of the original work if this data package is shared in whole or by individual components. A generic citation is provided for this data package on the website https://portal.edirepository.org (herein “website”) in the summary metadata page. Communication (and collaboration) with the creators of this data package is recommended to prevent duplicate research or publication. This data package (and its components) is made available “as is” and with no warranty of accuracy or fitness for use. The creators of this data package and the website shall not be liable for any damages resulting from misinterpretation or misuse of the data package or its components. Periodic updates of this data package may be available from the website. Thank you.</para>
  </section>
</intellectualRights>
distribution
This element is found at these locations (XPath):
/eml:eml/dataset/distribution
/eml:eml/dataset/[entity]/physical/distribution
The <distribution> element can appear at both the dataset and entity levels. At the entity level, it contains information on how that specific data entity can be accessed. The <distribution> element has one of three children for describing the location of the resource: <online>, <offline>, and <inline>.
Offline Data: Use the <offline> element to describe restricted-access data or data that are not available online. If the <offline> element is used, it should include at least the <mediumName> tag.
Inline Data: The <inline> element contains data that are stored directly within the EML document. Data included as text or strings will be parsed. If data are not to be parsed, encode them as “CDATA sections” by surrounding them with “<![CDATA[” and “]]>” tags.
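The CDATA wrapping described for inline data can be sketched in Python with the standard library. This is only an illustration: the helper name inline_cdata and the sample text are assumptions made for this sketch, not part of EML or of any repository tool.

```python
from xml.dom import minidom
import xml.etree.ElementTree as ET


def inline_cdata(data: str) -> str:
    """Return an <inline> element whose content is wrapped in a CDATA section.

    Hypothetical helper for illustration only.
    """
    if "]]>" in data:
        # A CDATA section cannot contain its own terminator.
        raise ValueError("data contains ']]>' and cannot be stored in one CDATA section")
    doc = minidom.Document()
    inline = doc.createElement("inline")
    inline.appendChild(doc.createCDATASection(data))
    return inline.toxml()


# Sample text with characters (< and &) that would otherwise be parsed as markup.
raw = "site,note\nA1,depth<5 & shaded\n"
xml_text = inline_cdata(raw)

# Reading the element back returns the original text unchanged.
round_trip = ET.fromstring(xml_text).text
```

Because the markup characters sit inside the CDATA section, they do not need to be escaped by hand, and a parser hands back the text exactly as it was written.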
Online Data: The <online> element has two subelements, <url> and <onlineDescription> (optional). A <url> tag may have an optional attribute named function, which may be set to either “download” or “information”. If the function attribute is omitted, then “download” is implied.
@function="download": accessing the URL directly returns the data stream.
@function="information": the URL leads to a data catalog, intended-use page, or other page that provides information about downloading the object but does not directly return the data stream.
Context: For an EML data package to be accepted into the EDI repository, it must include at least one URL at the entity level (e.g., for a dataTable, at /eml:eml/dataset/dataTable/physical/distribution/online/url). The URL must carry the function attribute with the value “download”, or leave the attribute out or empty so that it defaults to “download”.
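A metadata preparer could pre-check the URL rules above before submitting a package. The following is a minimal sketch using only the Python standard library; the function names and the sample document are assumptions made for illustration, and this is not the check that the EDI repository itself runs.

```python
import xml.etree.ElementTree as ET

# The six EML entity types that can hold a physical/distribution tree.
ENTITY_TYPES = ("dataTable", "spatialRaster", "spatialVector",
                "storedProcedure", "view", "otherEntity")


def effective_function(url_elem):
    """Return the url's function attribute; missing or empty means 'download'."""
    return url_elem.get("function") or "download"


def has_entity_download_url(dataset):
    """True if any entity has a non-empty online url whose function is 'download'."""
    for entity_type in ENTITY_TYPES:
        path = f"{entity_type}/physical/distribution/online/url"
        for url in dataset.findall(path):
            if effective_function(url) == "download" and (url.text or "").strip():
                return True
    return False


# Illustrative fragment: an information URL at the dataset level and a
# download URL (function omitted) at the entity level.
SAMPLE = """
<dataset>
  <distribution><online>
    <url function="information">http://www.fsu.edu/lter/data/fls-1.htm</url>
  </online></distribution>
  <dataTable><physical><distribution><online>
    <url>http://www.fsu.edu/lter/data/fls-1.csv</url>
  </online></distribution></physical></dataTable>
</dataset>
"""
dataset = ET.fromstring(SAMPLE)
ok = has_entity_download_url(dataset)
```

The sketch takes the <dataset> element as its starting point, since the elements below the eml:eml root carry no namespace prefix.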
When used at the entity level, an alternative to <url> is available, called <connection>. This element is discussed under data entities, below.
As of EML 2.1, there is also an optional <access> element in a <distribution> tree at the data entity level (/eml:eml/dataset/[entity]/physical/distribution/access). This element is intended specifically for controlling access to the data entity itself. For more information on the <access> tree, see above, under the general access discussion.
Example: distribution
<distribution>
  <online>
    <onlineDescription>fls-1 Data Web Page</onlineDescription>
    <url function="information">http://www.fsu.edu/lter/data/fls-1.htm</url>
  </online>
</distribution>
<dataTable>
  <physical>
    …
    <distribution>
      <online>
        <onlineDescription>fls-1 Data Web Page</onlineDescription>
        <url function="download">http://www.fsu.edu/lter/data/fls-1.csv</url>
      </online>
    </distribution>
  </physical>
</dataTable>
coverage
This element is found at these locations (XPath):
/eml:eml/dataset/coverage
/eml:eml/dataset/methods/sampling/studyExtent/coverage
/eml:eml/dataset/methods/sampling/spatialSamplingUnits/coverage
/eml:eml/dataset/[entity]/coverage
/eml:eml/dataset/[entity]/methods/sampling/studyExtent/coverage
/eml:eml/dataset/[entity]/methods/sampling/spatialSamplingUnits/coverage
/eml:eml/dataset/[entity]/attributeList/attribute/coverage
/eml:eml/dataset/[entity]/attributeList/attribute/methods/sampling/studyExtent/coverage
/eml:eml/dataset/[entity]/attributeList/attribute/methods/sampling/spatialSamplingUnits/coverage
/eml:eml/dataset/project/studyAreaDescription/coverage
The <coverage> element can appear at the dataset, methods, entity and attribute levels, and contains three elements for describing the coverage in terms of space, taxonomy, and time: <geographicCoverage>, <taxonomicCoverage>, and <temporalCoverage>. Populating these elements as recommended enables advanced searches and understanding. Because they appear at many XPaths, there are many options for how coverage elements can be used.
geographicCoverage
General Information: The <geographicCoverage> element describes locations of research sites and areas related to the data, and is intended for general placement of points on a map. It is recommended to use the element at different levels for different types of information. The cardinality of the <geographicCoverage> element is one-to-many. The minimum requirement under <geographicCoverage> is two elements, a <geographicDescription> and <boundingCoordinates> with a bounding box containing N, S, E, W limits.
At the dataset level (/eml:eml/dataset/coverage), one <geographicCoverage> element should be included whose <boundingCoordinates> describe the extent of the data. As a default, this could be the nominal boundaries of a sampling area. A more accurate extent (recommended) would be the maximum extent of the data in each of the east, west, north and south directions.
Additional <geographicCoverage> elements should be included if there are significant distances between study sites and grouping them in one bounding box would be misleading or confusing. For example, a cross-site study should have bounding boxes for each site.
Example: geographicCoverage at the dataset level
<coverage>
  <geographicCoverage>
    <geographicDescription>Ficity, FI metropolitan area, USA</geographicDescription>
    <boundingCoordinates>
      <westBoundingCoordinate>-112.373614</westBoundingCoordinate>
      <eastBoundingCoordinate>-111.612936</eastBoundingCoordinate>
      <northBoundingCoordinate>33.708829</northBoundingCoordinate>
      <southBoundingCoordinate>33.298975</southBoundingCoordinate>
      <boundingAltitudes>
        <altitudeMinimum>300</altitudeMinimum>
        <altitudeMaximum>600</altitudeMaximum>
        <altitudeUnits>meter</altitudeUnits>
      </boundingAltitudes>
    </boundingCoordinates>
  </geographicCoverage>
</coverage>
If sampling took place at discrete point locations, those sites should also appear, with or without a bounding box. Individual sampling sites may also be entered under <spatialSamplingUnits>, each site in a separate coverage element (see below).
Example: geographicCoverage under spatialSamplingUnits
<spatialSamplingUnits>
  <coverage>
    <geographicDescription>site number 1</geographicDescription>
    <boundingCoordinates>
      <westBoundingCoordinate>-112.2</westBoundingCoordinate>
      <eastBoundingCoordinate>-112.2</eastBoundingCoordinate>
      <northBoundingCoordinate>33.5</northBoundingCoordinate>
      <southBoundingCoordinate>33.5</southBoundingCoordinate>
    </boundingCoordinates>
  </coverage>
  <coverage>
    <geographicDescription>site number 2</geographicDescription>
    <boundingCoordinates>
      <westBoundingCoordinate>-111.7</westBoundingCoordinate>
      <eastBoundingCoordinate>-111.7</eastBoundingCoordinate>
      <northBoundingCoordinate>33.6</northBoundingCoordinate>
      <southBoundingCoordinate>33.6</southBoundingCoordinate>
    </boundingCoordinates>
  </coverage>
  <coverage>
    <geographicDescription>site number 3</geographicDescription>
    <boundingCoordinates>
      <westBoundingCoordinate>-112.1</westBoundingCoordinate>
      <eastBoundingCoordinate>-112.1</eastBoundingCoordinate>
      <northBoundingCoordinate>33.7</northBoundingCoordinate>
      <southBoundingCoordinate>33.7</southBoundingCoordinate>
    </boundingCoordinates>
  </coverage>
</spatialSamplingUnits>
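The recommended dataset-level bounding box (the maximum extent of the data in each direction) can be derived from the individual point sites. A minimal Python sketch follows; the function name is hypothetical, the three points are the sites from the example above, and it assumes the study area does not cross the 180° meridian.

```python
def bounding_box(points):
    """Return (west, east, north, south) for an iterable of (longitude, latitude) pairs.

    Assumes signed decimal degrees and a study area that does not span
    the 180-degree meridian.
    """
    points = list(points)
    if not points:
        raise ValueError("at least one point is required")
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return min(lons), max(lons), max(lats), min(lats)


# The three point sites from the spatialSamplingUnits example, as (lon, lat).
sites = [(-112.2, 33.5), (-111.7, 33.6), (-112.1, 33.7)]
west, east, north, south = bounding_box(sites)
```

The four values map directly onto <westBoundingCoordinate>, <eastBoundingCoordinate>, <northBoundingCoordinate> and <southBoundingCoordinate>.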
Latitudes and longitudes should all be in the same, commonly used datum (e.g., all values in WGS84 or NAD83) and expressed to at least six decimal places (the EML 2.1 schema enforces decimal content). International convention dictates that longitudes east of the prime meridian and latitudes north of the equator be prefixed with a plus sign (+), or by the absence of a minus sign (-), and that west longitudes and south latitudes be prefixed with a minus sign (-). See the geographicCoverage examples in this section, and the EML specification for more information and other examples.
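Field notes often record coordinates with a hemisphere letter rather than a sign. The sign convention and six-decimal recommendation above can be applied with a small conversion helper; this Python sketch and its function name are illustrative assumptions, not part of any EML tool.

```python
def signed_decimal(value, hemisphere):
    """Convert an unsigned decimal-degree value plus hemisphere letter (N, S, E, W)
    to a signed string with six decimal places.

    West longitudes and south latitudes become negative; north and east
    values carry no sign.
    """
    hemisphere = hemisphere.upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    limit = 90 if hemisphere in ("N", "S") else 180
    if not 0 <= value <= limit:
        raise ValueError(f"value {value} is out of range for hemisphere {hemisphere}")
    sign = -1 if hemisphere in ("S", "W") else 1
    # Adding 0.0 turns a negative zero into a plain zero before formatting.
    return f"{sign * value + 0.0:.6f}"


west = signed_decimal(112.373614, "W")
north = signed_decimal(33.5, "n")
```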
<geographicDescription>: The description is a string. It should be comprehensive so that searches can be run against it, and include the country, state, county or province, city, general topography, landmarks, rivers and other relevant information. The method for determining <boundingCoordinates>, <boundingAltitudes>, coordinates, datums, etc., should be included with the <geographicDescription>, since those elements do not encode this information.
The <datasetGPolygon> element may be included when the required bounding box does not adequately describe the study location, for example, if an irregular polygon is necessary to describe the study area, or there is an area within the bounding box that is excluded. This element is optional, and has two subelements.
<datasetGPolygonOuterGRing>: This is the outer part of the polygon shape that encompasses the broadest area of coverage. It can be created either by a gRing (list of points) or by a series of <gRingPoint> elements. Documentation for an FGDC G-Ring states that four points are required to define a polygon, and the first and last should be identical. However, this is not enforceable in XML Schema, and so in EML a minimum of three <gRingPoint>s is required to define the polygon, and it can be assumed that, since a polygon is closed, the last point can be joined to the first.
<datasetGPolygonExclusionGRing>: This is the closed, nonintersecting boundary of a void area (or hole in an interior area). This could be the center of the doughnut shape created by the <datasetGPolygon>. It can be created either by a gRing (list of points) or one or more <gRingPoint>s. This is used if there is an internal polygon to be excluded from the outer polygon, e.g., a lake to be excluded from the broader geographic coverage.
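The difference between the FGDC convention (first point repeated at the end) and the EML minimum (three points, closure implied) can be handled by normalizing a ring in either direction. The following Python sketch uses hypothetical helper names and made-up coordinates purely for illustration.

```python
def open_ring(points):
    """Return the ring without a repeated closing point (closure is implied).

    Raises ValueError if fewer than three points remain.
    """
    ring = list(points)
    if len(ring) > 1 and ring[0] == ring[-1]:
        ring = ring[:-1]
    if len(ring) < 3:
        raise ValueError("a polygon needs at least three points")
    return ring


def closed_ring(points):
    """Return the ring in FGDC style, with the first point repeated at the end."""
    ring = open_ring(points)
    return ring + [ring[0]]


# Illustrative (latitude, longitude) vertices of a triangular study area.
triangle = [(33.3, -112.4), (33.7, -112.4), (33.7, -111.6)]
fgdc_style = closed_ring(triangle)
eml_style = open_ring(fgdc_style)
```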
There are alternative methods for including location information with EML, especially when it is intended for use in an external application. GIS shapefiles, Keyhole Markup Language (KML or KMZ), or EML spatial modules can be included as data entities (see additional resources for different data file types at EDI).
temporalCoverage
The <temporalCoverage> element represents the period of time over which the data were collected, not the year the study was conducted if it uses retrospective or historical data. Most commonly, <singleDateTime> or <rangeOfDates> elements are used. Sometimes an <alternativeTimeScale> is more appropriate, such as the use of “years before present”, e.g., for long-term tree ring chronology dating back hundreds of years. Two date formats are allowed: either a 4-digit year, or a date in ISO format, YYYY-MM-DD.
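The two date formats named above can be checked before a value is written into the metadata. The following is a minimal Python sketch under the assumption that only a 4-digit year or a full YYYY-MM-DD date is acceptable; the function name is hypothetical and the check is deliberately stricter than a full schema validation.

```python
import re
from datetime import date

_YEAR = re.compile(r"\d{4}")
_ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_valid_calendar_date(text):
    """True if text is a 4-digit year or a real calendar date in YYYY-MM-DD form."""
    if _YEAR.fullmatch(text):
        return True
    if _ISO_DATE.fullmatch(text):
        try:
            # Rejects impossible dates such as 2003-02-30.
            date.fromisoformat(text)
        except ValueError:
            return False
        return True
    return False


checks = {text: is_valid_calendar_date(text)
          for text in ("1998", "1998-11-12", "11/12/1998", "2003-02-30")}
```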
In some cases, a package may be considered "ongoing", i.e., data are planned to be added at intervals. It is not currently valid to leave an empty <endDate> tag in EML. Further, EML is intended to house “snapshots” of data which can be immutable (if the repository supports this). So for a package which is planned to be ongoing, the best solution is to populate the <endDate> element with the end of the current data range and to update this metadata field along with data updates, so that the <endDate> tag reflects only the data that have already been included. It is better to state an end date that guarantees that data are present up to that date, with more data possibly being available, than an end date in the future that includes a period of time for which no data are yet available. Use the <maintenance> tag (below) to describe the update frequency. The methods/sampling tree should be used to describe the ongoing nature of the data collection.
Example: temporalCoverage
<temporalCoverage>
  <rangeOfDates>
    <beginDate>
      <calendarDate>1998-11-12</calendarDate>
    </beginDate>
    <endDate>
      <calendarDate>2003-12-31</calendarDate>
    </endDate>
  </rangeOfDates>
</temporalCoverage>
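For an ongoing package, one way to keep <endDate> tied to the data already included is to derive the range from the observation dates each time the package is updated. This Python sketch is only an illustration; the function name and the sample dates are assumptions, chosen to reproduce the example above.

```python
import xml.etree.ElementTree as ET
from datetime import date


def temporal_coverage(observation_dates):
    """Build a <temporalCoverage> element whose range spans the given dates.

    The end date is the latest observation actually present in the data,
    never a planned future date.
    """
    observation_dates = list(observation_dates)
    if not observation_dates:
        raise ValueError("at least one observation date is required")
    begin, end = min(observation_dates), max(observation_dates)
    coverage = ET.Element("temporalCoverage")
    date_range = ET.SubElement(coverage, "rangeOfDates")
    for tag, value in (("beginDate", begin), ("endDate", end)):
        holder = ET.SubElement(date_range, tag)
        ET.SubElement(holder, "calendarDate").text = value.isoformat()
    return coverage


# Hypothetical sampling dates read from the data table.
dates = [date(2001, 6, 30), date(1998, 11, 12), date(2003, 12, 31)]
coverage = temporal_coverage(dates)
xml_text = ET.tostring(coverage, encoding="unicode")
```

Regenerating the element with each data update keeps the metadata and the data range in step without manual editing.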
taxonomicCoverage
The <taxonomicCoverage> element should be used to document taxonomic information for all organisms relevant to the study. The lowest available level, preferably the species binomial and common name, should always be included, but higher-level taxa should also be included to support broader taxonomic searches. Blocks of <taxonomicClassification> elements should be hierarchically nested within a single <taxonomicCoverage> element rather than repeated at the same level. The <generalTaxonomicCoverage> element could a) describe the general procedure by which the taxonomy was determined (keys used, etc.), b) give a general textual description of all flora/fauna in the study (scope), and c) denote how finely grained the taxonomy is, for example to “family” or “genus and species.”
Note that it is allowable to combine elements in the hierarchy under like <taxonRankName> entries to create a taxonomic “tree” (not illustrated), but this practice may impede combining and re-using <taxonomicClassification> information from multiple documents, so it should be considered carefully.
The optional taxonomicCoverage/taxonomicSystem trees may be used to detail the taxonomic identification resources used and the identification process. <classificationSystem> should be used to list authoritative taxonomic databases (such as ITIS, IPNI, NCBI, Index Fungorum, or USDA Plants) or classification systems used for taxonomic identification. Documentation and relevant literature regarding the authoritative sources used, including URLs pointing to these sources, should be listed in <classificationSystemCitation>. Exceptions to, or deviations from, the authoritative sources used should be explained in <classificationSystemModification>.
Methods and protocols used for taxonomic classification should be detailed using the <identifierName> and <taxonomicProcedures> tags. Examples of methods that should be listed in <taxonomicProcedures> are details of specimen processing, keys, and chemical or genetic analyses. <taxonomicCompleteness> may be used to document the status, estimated importance, and reason for incomplete identifications.
Example: taxonomicCoverage
<taxonomicCoverage>
  <taxonomicSystem>
    <classificationSystem>
      <classificationSystemCitation>
        <title>Integrated Taxonomic Information System (ITIS)</title>
        <creator>
          <organizationName>Integrated Taxonomic Information System</organizationName>
          <onlineUrl>http://www.itis.gov/</onlineUrl>
        </creator>
        <generic>
          <publisher>
            <organizationName>Integrated Taxonomic Information System</organizationName>
            <onlineUrl>http://www.itis.gov/</onlineUrl>
          </publisher>
        </generic>
      </classificationSystemCitation>
    </classificationSystem>
    <identifierName>
      <references>pers-1</references>
    </identifierName>
    <taxonomicProcedures>All individuals were identified and stored in alcohol, except for one voucher specimen for each species which was tagged and pinned.</taxonomicProcedures>
  </taxonomicSystem>
  <generalTaxonomicCoverage>Orthopteran insects (grasshoppers) were identified to species</generalTaxonomicCoverage>
  <taxonomicClassification>
    <taxonRankName>Kingdom</taxonRankName>
    <taxonRankValue>Animalia</taxonRankValue>
    <taxonomicClassification>
      <taxonRankName>Phylum</taxonRankName>
      <taxonRankValue>Mollusca</taxonRankValue>
      <taxonomicClassification>
        <taxonRankName>Class</taxonRankName>
        <taxonRankValue>Gastropoda</taxonRankValue>
        <taxonomicClassification>
          <taxonRankName>Order</taxonRankName>
          <taxonRankValue>Basommatophora</taxonRankValue>
          <taxonomicClassification>
            <taxonRankName>Genus</taxonRankName>
            <taxonRankValue>Detracia</taxonRankValue>
            <taxonomicClassification>
              <taxonRankName>Species</taxonRankName>
              <taxonRankValue>Detracia floridana</taxonRankValue>
              <commonName>Florida Melampus</commonName>
            </taxonomicClassification>
          </taxonomicClassification>
        </taxonomicClassification>
      </taxonomicClassification>
    </taxonomicClassification>
  </taxonomicClassification>
  <taxonomicClassification>
    <taxonRankName>Kingdom</taxonRankName>
    <taxonRankValue>Animalia</taxonRankValue>
    <taxonomicClassification>
      <taxonRankName>Phylum</taxonRankName>
      <taxonRankValue>Mollusca</taxonRankValue>
      <taxonomicClassification>
        <taxonRankName>Class</taxonRankName>
        <taxonRankValue>Bivalvia</taxonRankValue>
        <taxonomicClassification>
          <taxonRankName>Order</taxonRankName>
          <taxonRankValue>Filibranchia</taxonRankValue>
          <taxonomicClassification>
            <taxonRankName>Genus</taxonRankName>
            <taxonRankValue>Geukensia</taxonRankValue>
            <taxonomicClassification>
              <taxonRankName>Species</taxonRankName>
              <taxonRankValue>Geukensia demissa</taxonRankValue>
              <commonName>Ribbed Mussel</commonName>
            </taxonomicClassification>
          </taxonomicClassification>
        </taxonomicClassification>
      </taxonomicClassification>
    </taxonomicClassification>
  </taxonomicClassification>
</taxonomicCoverage>
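Writing deeply nested <taxonomicClassification> blocks by hand is error-prone, so a preparer might generate them from a flat lineage. The following Python sketch shows one possible approach; the function name is hypothetical, and the lineage is the ribbed mussel classification from the example above.

```python
import xml.etree.ElementTree as ET


def nested_classification(ranks, common_name=None):
    """Build nested <taxonomicClassification> elements from (rank, value) pairs.

    Pairs must be ordered from the highest rank to the lowest; the optional
    common name is attached to the lowest rank.
    """
    ranks = list(ranks)
    if not ranks:
        raise ValueError("at least one (rank, value) pair is required")
    root = None
    parent = None
    for rank_name, rank_value in ranks:
        if parent is None:
            node = ET.Element("taxonomicClassification")
            root = node
        else:
            node = ET.SubElement(parent, "taxonomicClassification")
        ET.SubElement(node, "taxonRankName").text = rank_name
        ET.SubElement(node, "taxonRankValue").text = rank_value
        parent = node
    if common_name:
        ET.SubElement(parent, "commonName").text = common_name
    return root


lineage = [
    ("Kingdom", "Animalia"),
    ("Phylum", "Mollusca"),
    ("Class", "Bivalvia"),
    ("Order", "Filibranchia"),
    ("Genus", "Geukensia"),
    ("Species", "Geukensia demissa"),
]
tree = nested_classification(lineage, common_name="Ribbed Mussel")
```

Each species gets its own complete block from kingdom down, which matches the recommendation to nest blocks hierarchically and keeps them easy to combine across documents.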
maintenance
This element is found at this location (XPath):
/eml:eml/dataset/maintenance
The dataset/maintenance/description element should be used to document changes to the data tables or metadata, including update frequency. The change history can also be used to describe alterations in static documents. The description element (TextType) can contain both formatted and unformatted text blocks.
Example: maintenance
<maintenance>
  <description>
    <para>Data are updated annually at the end of the calendar year.</para>
  </description>
</maintenance>
methods
This element is found at these locations (XPath):
/eml:eml/dataset/methods
/eml:eml/dataset/[entity]/methods
/eml:eml/dataset/[entity]/attributeList/attribute/methods
General Information: In early EML versions, both "<method>" and "<methods>" elements were found, which caused confusion. In EML 2.1.0, the elements were standardized to "<methods>".
The <methods> tree appears at the dataset, entity, and attribute levels, and its content is generally regarded as human-readable, not machine-readable. As a ‘rule of thumb’, methods are descriptive and protocols are prescriptive, i.e., the methods describe what was done when collecting data, and protocols are a set of procedures or prescribed actions. A method often includes or follows a particular protocol. As a minimum, a reference to an external protocol should be given at the dataset level. However, detailed text methods at this level are preferable so that their content can be perused in a browser or indexed for searching. If further refinement is needed, methods can be defined for individual data entities or even individual <attribute> elements, although these may not be indexed. The scope of the method defined can be tailored to match the EML document level where it is applied. For example, methods at the dataset level describe the study; for a <dataTable>, methods might include pre-/post-processing steps; and at the attribute level, quality control. The use of methods refinement varies, and keeping all methods in one place and at one level (dataset) is simpler to manage. Since they are mostly for human consumption, one detailed description of all steps taken at the dataset level is frequently sufficient and more user friendly.
A description of methods contains the elements <methodStep>, <sampling>, and/or <qualityControl>.
methodStep
At least one <methodStep> is required under <methods>, and each step is a logical portion of the methods, for example, field, lab and statistical. All textual methods descriptions belong here, using <description> and TextType tags.
At a minimum, to describe an external document, two tags can be used: <citation> for a referral to a published document or paper, or <protocol>. At a minimum, the <protocol> requires <title>, <creator> and <distribution> tags, where the <distribution> tree may be used to refer to an online document; see the recommendations above for using that tree. Alternatively, the entire protocol may be written into EML under protocol/methodStep.
instrumentation
The <instrumentation> tag should contain a full description of the instruments used, including manufacturer, model, calibration dates and accuracy. Changes in instrumentation and dates of changes should be mentioned earlier under the <description>.
dataSource
The optional <dataSource> tag is for nesting an EML dataset that is input to a <methodStep> of the data being described, e.g., calibration information for an instrument or input parameters for a model. It also may hold the source (provenance) data when describing a derived dataset.
Context Note: The <dataSource> element is used by the EDI repository’s provenance tracking system for linking between derived and source data packages. For more information, see additional data repository resources from EDI.
sampling
This optional tree can contain valuable and very specific information about the study site, coverage and frequency in addition to that listed at other levels.
<studyExtent>: Provides specific information about the temporal and geographic extent of the study, such as domains of interest, in addition to the geographic, temporal, and taxonomic coverage of the study site. <studyExtent> can be a surrogate for the <studyAreaDescription> under <project>. Descriptions can be given either as simple text using <description> or by including detailed temporal or geographic <coverage> elements describing discrete time periods sampled or multiple sub-regions sampled within the overall geographic bounding box that was described at the dataset level.
Context Note: In the past, LTER requested that individual sampling locations be listed here (under studyExtent/spatialSamplingUnits), and some LTER sites may have applications that specifically use that XPath. However, in general use, the dataset-level geographicCoverage elements are more practical. See EDI “Other Resources” for more information about how indexers typically handle EML.
<samplingDescription>: A text-based version, similar to the sampling methods section in a journal article.
qualityControl
Like other trees under <methods>, <qualityControl> can be used at the dataset, entity or attribute level, whichever is appropriate. At its most basic, use the <description> element. Tags are also available for a <citation> or <protocol>.
Example: methods
<methods>
  <methodStep>
    <description>
      <section>
        <title>Pitfall trap sampling for ground arthropod biodiversity monitoring</title>
        <para>Supplies used: pitfall traps (P-16 plastic Solo cups with lids) metal spades and large bulb planters (to dig holes in which to put traps) 70% ethanol (to preserve specimens) Qorpak glass jars with lids from the VWR Corporation, 120ml (4oz), cap size 58-400 (comes included), Qorpak no. 7743C, VWR catalog no. 16195-703.</para>
        <para>Between 10 and 21 traps are placed at each site in suitable locations.</para>
        <para>All trapped taxa counted and measured (body length), most taxa identified to Family, ants to Genus</para>
      </section>
    </description>
    <instrumentation>SBE MicroCAT 37-SM (S/N 1790); manufacturer: Sea-Bird Electronics (model: 37-SM MicroCAT); parameter: Conductivity (accuracy: 0.0003 S/m, readability: 0.00001 S/m, range: 0 to 7 S/m); last calibration: Feb 28, 2001</instrumentation>
    <instrumentation>SBE MicroCAT 37-SM (S/N 1790); manufacturer: Sea-Bird Electronics (model: 37-SM MicroCAT); parameter: Pressure (water) (accuracy: 0.2m, readability: 0.0004m, range: 0 to 20m); last calibration: Feb 28, 2001</instrumentation>
    <instrumentation>SBE MicroCAT 37-SM (S/N 1790); manufacturer: Sea-Bird Electronics (model: 37-SM MicroCAT); parameter: Temperature (water) (accuracy: 0.002°C, readability: 0.0001°C, range: -5 to 35°C); last calibration: Feb 28, 2001</instrumentation>
  </methodStep>
  <sampling>
    <studyExtent>
      <description>
        <para>Arthropod pit fall traps are placed in three different locations four times a year</para>
      </description>
    </studyExtent>
    <samplingDescription>
      <para>Six traps were set in a transect at each location.</para>
    </samplingDescription>
    <spatialSamplingUnits>
      <coverage>
        <geographicDescription>site number 1</geographicDescription>
        <boundingCoordinates>
          <westBoundingCoordinate>-112.234566</westBoundingCoordinate>
          <eastBoundingCoordinate>-112.234566</eastBoundingCoordinate>
          <northBoundingCoordinate>33.534566</northBoundingCoordinate>
          <southBoundingCoordinate>33.534566</southBoundingCoordinate>
        </boundingCoordinates>
      </coverage>
      <coverage>
        <geographicDescription>site number 2</geographicDescription>
        <boundingCoordinates>
          <westBoundingCoordinate>-111.745677</westBoundingCoordinate>
          <eastBoundingCoordinate>-111.745677</eastBoundingCoordinate>
          <northBoundingCoordinate>33.64577</northBoundingCoordinate>
          <southBoundingCoordinate>33.64577</southBoundingCoordinate>
        </boundingCoordinates>
      </coverage>
      <coverage>
        <geographicDescription>site number 3</geographicDescription>
        <boundingCoordinates>
          <westBoundingCoordinate>-112.167899</westBoundingCoordinate>
          <eastBoundingCoordinate>-112.167899</eastBoundingCoordinate>
          <northBoundingCoordinate>33.76799</northBoundingCoordinate>
          <southBoundingCoordinate>33.76799</southBoundingCoordinate>
        </boundingCoordinates>
      </coverage>
    </spatialSamplingUnits>
  </sampling>
  <qualityControl>
    <description>
      <para>All specimens are archived for future reference. Quality control during data entry is achieved with standard database techniques of pulldowns that prevent typos and constraints. Scientists inspect standard data summary statistics after data entry.</para>
    </description>
  </qualityControl>
</methods>
Example: methods, with dataSource
<methods>
  <methodStep>
    <description>
      <section>
        <para>We utilize NPP data collected from 1906 to 2006 from the ONL LTER site. The ONL NPP data unit definition is kg/m^2/yr. This unit does not require conversion.</para>
      </section>
    </description>
    <dataSource>
      <title>NPP data from ONL 1906 to 2006</title>
      <creator>
        <organizationName>ONL LTER</organizationName>
      </creator>
      <distribution>
        <online>
          <url>http://metacat.lternet.edu/knb/metacat/knb-lter-onl.23.1</url>
        </online>
      </distribution>
      <contact>
        <organizationName>ONL LTER</organizationName>
        <positionName>ONL Information Manager</positionName>
        <electronicMailAddress>[email protected]</electronicMailAddress>
      </contact>
    </dataSource>
  </methodStep>
</methods>
project
This element is found at this location (XPath):
/eml:eml/dataset/project
General information: EML is one of the few specifications with a detailed tree dedicated to projects, which can be nested using <relatedProject>. At its simplest, a <project> tree can hold a general description of the project sponsoring the data package, with smaller sub-projects nested inside it where needed. Minimally, the description of a project should include <title>, <personnel> and <abstract>, with the study area description and mission statement. The <distribution> tree should link to the project’s home page, or alternatively could link to a publication describing the project. As stated earlier, elements that are reused (e.g., XML types) are discussed where they first appear, so the descriptions for these three elements (<title>, <personnel> and <abstract>) can be found above, under <dataset>. Two elements are unique to the <project> tree: <fundingSource> and <studyAreaDescription>.
<fundingSource> should contain the agency and grant number. It is not optional.
The <studyAreaDescription> tree and its accompanying <citation> tree are optional, and may be used to describe non-coverage characteristics of the study area such as climate, geology or disturbances, or references to citable biological or geophysical classification systems such as the Bailey Ecoregions or the Holdridge Life Zones. The studyAreaDescription tree also supports multiple <coverage> elements that can be used to describe the geographic boundaries of individual study sites within the larger area. These can be referenced by the studyExtent/spatialSamplingUnits/referencedEntityId. The sibling <descriptor> tag can be used for text descriptions of the site.
Example: project
<project>
  <title>FLS basic monitoring program</title>
  <personnel id="pers-30" system="FLS">
    <individualName>
      <salutation>Dr.</salutation>
      <givenName>Eva</givenName>
      <givenName>M.</givenName>
      <surName>Scientist</surName>
    </individualName>
    <address>
      <deliveryPoint>Department of Ecology</deliveryPoint>
      <deliveryPoint>Fictitious State University</deliveryPoint>
      <deliveryPoint>PO Box 111111</deliveryPoint>
      <city>Ficity</city>
      <administrativeArea>FI</administrativeArea>
      <postalCode>11111-1111</postalCode>
    </address>
    <role>principalInvestigator</role>
  </personnel>
  <personnel id="pers-130" system="FLS">
    <individualName>
      <givenName>Monica</givenName>
      <givenName>D.</givenName>
      <surName>Techy</surName>
    </individualName>
    <address>
      <deliveryPoint>Department for Ecology</deliveryPoint>
      <deliveryPoint>Fictitious State University</deliveryPoint>
      <deliveryPoint>PO Box 111111</deliveryPoint>
      <city>Ficity</city>
      <administrativeArea>FI</administrativeArea>
      <postalCode>11111-1111</postalCode>
    </address>
    <role>principalInvestigator</role>
  </personnel>
  <abstract>
    <para>The FLS basic monitoring program consists of monitoring of arthropod populations, plant net primary productivity, and bird populations. Monitoring takes place at 3 locations, 4 times a year. Climate parameters are continuously measured at all stations.</para>
  </abstract>
</project>
[entity]=dataTable,spatialRaster,spatialVector,storedProcedure,view,otherEntityThiselementisfoundatthislocation(XPath):/eml:eml/dataset/dataTable/eml:eml/dataset/spatialRaster/eml:eml/dataset/spatialVector/eml:eml/dataset/storedProcedure/eml:eml/dataset/view/eml:eml/dataset/otherEntityGeneralinformation:Ifatallpossible,donotpublishdataindated,proprietary,binaryformatssuchasMS-Excel,andinstead,exporttoplaintextrepresentationssuchascsv.Theentitytypes<dataTable>,<otherEntity>and<view>covermanycommonlyencountereddatastructuresandarecoveredhere.<spatialRaster>,<spatialVector>,<storedProcedure>)willbeaddressedinmoredepthinafutureversionofthisdocument.Table1givesthegeneralfeaturesofEML’ssixentitytypes,toassistinselection.
Table 1. Summary of the six entities in EML 2, including the type of data entity typically described with that element, how they are created and a brief description of its metadata.
Element name | Used for | Created from | Metadata features
dataTable | Static ASCII tables | export from code, RDBMS or spreadsheets | columns/rows named and defined, e.g., measurement and storage typing
otherEntity | Binary files, images, maps, KML, KMZ, code | applications | type of entity
spatialRaster | grid, raster cell data, remote sensing data | applications, stylesheet conversions. See "Other Resources" | spatial organization of the raster cells, their data values, and if derived via imaging sensors, characteristics about the image and its individual bands
spatialVector | lines, points, polygons, KML (if converted), ESRI shapefiles | applications, stylesheet conversions. See "Other Resources" | information about the vector's geometry type, count and topology level
view | Data returned from a database query | RDBMS | similar to dataTable, plus description of the query
storedProcedure | Data returned from a stored procedure in a database | RDBMS | similar to dataTable, plus procedure's parameters
Every EML data entity has a set of elements in common, called the EntityGroup tree, which describes general information about any data resource. Other elements are provided which are unique to each entity type. The elements in the EntityGroup appear first, and are
<alternateIdentifier>
<entityName>
<entityDescription>
<physical> (including optional <access>)
<coverage>
<methods>
<additionalInfo>
<alternateIdentifier> (optional): The primary identifier belongs in the id attribute of the entity element (e.g., <dataTable id="xxx">), but this tag can accommodate additional identifiers that might be used, possibly from different data management systems. It is used similarly to the <alternateIdentifier> element at the dataset level, above.
<entityName> (required): the name of the table, file or database table. In the early phases of EML adoption, this was often the original ASCII file name. However, a better analogy is that the <entityName> is a class, e.g., "FLS time series of air temperature at field station", with its instantiation (file name) in the <objectName> element (see below).
Context: The EDI repository requires that <entityName>s be unique within the dataset.
<entityDescription> This should be a longer, more descriptive explanation of the data in the entity. Like all descriptions, it is human-readable, and should help a reader determine whether the entity is appropriate for a particular use.
The <physical> tree (/eml:eml/dataset/[entity]/physical) further describes the physical format of the data.
<objectName> should be the name of the file when downloaded, or exported as text from a database. The <objectName> often is the file name of a file in a file system or that is accessible on the network.
<externallyDefinedFormat> For data entities in prescribed formats (e.g., NetCDF, KML, Excel), name that format here. It is recommended that, where possible, formats follow MIME type (e.g., "image/jpeg"). Descriptions that are software-specific should include manufacturer, program, and version, e.g. "Microsoft Excel 2003". A KML file of sampling locations can be declared here as either "KML" or "KMZ".
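As a hedged illustration of the MIME-type recommendation, the fragment below sketches a <physical> tree for an image file. The file name is hypothetical and is not part of the FLS example files.

```xml
<!-- Sketch only: plot1.jpg is a hypothetical file name. -->
<physical>
  <objectName>plot1.jpg</objectName>
  <dataFormat>
    <externallyDefinedFormat>
      <!-- MIME type used as the format name, as recommended above -->
      <formatName>image/jpeg</formatName>
    </externallyDefinedFormat>
  </dataFormat>
</physical>
```

For a software-specific format, the <formatName> would instead carry manufacturer, program and version, e.g. "Microsoft Excel 2003".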
<distribution> provides information on how the resource is distributed. The contents of this tree were generally covered at the dataset level; however, a few points are reiterated here.
The content of a <url> element at the entity level should deliver data, and not point to another application or use page. The <url>'s attribute, "function", should have the value "download". This is implied if the "function" attribute is omitted.
As of EML 2.1, there is also an optional <access> element in a <distribution> tree at the entity level. This element is intended specifically for controlling access to the data entity separately from the metadata. For more information on using the <access> tree, refer to the general access discussion above.
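A minimal sketch of an entity-level <distribution> tree that combines a download <url> with its own <access> rules might look like the following. The authSystem value is a placeholder, and the "public" principal is an assumed convention; substitute whatever your repository actually uses.

```xml
<distribution>
  <online>
    <!-- function="download": the URL returns the data itself -->
    <url function="download">http://www.fsu.edu/lter/data/fls-1.csv</url>
  </online>
  <!-- Placeholder authSystem; "public" is an assumed principal name -->
  <access authSystem="https://example.org/authentication" order="allowFirst">
    <allow>
      <principal>public</principal>
      <permission>read</permission>
    </allow>
  </access>
</distribution>
```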
<coverage> provides information on the geographic, temporal and taxonomic coverages used in this [entity]. See the discussion at the dataset level for more information.
<methods> provides information on the specific methods used to collect information in this [entity]. Please see the discussion at the dataset level for more information.
<additionalInfo> is a text field for any material that cannot be characterized by the other elements for the data type.
Example: The elements in the EntityGroup, showing the <dataTable> entity.
<dataTable>
  <entityName>arthro_hab</entityName>
  <entityDescription>habitat description for the sampling locations</entityDescription>
  <physical>
    <objectName>fls-1.csv</objectName>
    <dataFormat>
      <textFormat>
        <numHeaderLines>1</numHeaderLines>
        <numFooterLines>0</numFooterLines>
        <recordDelimiter>\r</recordDelimiter>
        <numPhysicalLinesPerRecord>1</numPhysicalLinesPerRecord>
        <attributeOrientation>column</attributeOrientation>
        <simpleDelimited>
          <fieldDelimiter>,</fieldDelimiter>
        </simpleDelimited>
      </textFormat>
    </dataFormat>
    <distribution>
      <online>
        <onlineDescription>fls-1 Data File</onlineDescription>
        <url function="download">http://www.fsu.edu/lter/data/fls-1.csv</url>
      </online>
    </distribution>
  </physical>
Each data type has a specific set of elements that follow the common elements. Table 2 shows the specific trees that apply to each data type.
Table 2. Elements specific to each of the six entity types.
Entity Type | Typical Uses | Elements following EntityGroup
<dataTable> | Static ASCII tables | <attributeList> <constraint> <caseSensitivity> <numberOfRecords>
<view> | Data returned from a database query | <attributeList> <constraint> <queryStatement>
<storedProcedure> | Data returned from a stored procedure in a database | <attributeList> <constraint> <parameter>
<otherEntity> | Binary files, images, maps, KML, KMZ, code | <attributeList> <constraint> <entityType>
<spatialRaster> | Grid, raster cell data, remote sensing data | <attributeList> <constraint> <spatialReference> <georeferenceInfo> <horizontalAccuracy> <verticalAccuracy> <cellSizeXDirection> <cellSizeYDirection> <numberOfBands> <rasterOrigin> <rows> <columns> <verticals> <cellGeometry> <toneGradation> <scaleFactor> <offset> <imageDescription>
<spatialVector> | Lines, points, polygons, KML (if converted), ESRI shapefiles | <attributeList> <constraint> <geometry> <geometricObjectCount> <topologyLevel> <spatialReference> <horizontalAccuracy> <verticalAccuracy>
attributeList
This element tree is found at (XPath):
/eml:eml/dataset/dataTable/attributeList
/eml:eml/dataset/view/attributeList
/eml:eml/dataset/storedProcedure/attributeList
/eml:eml/dataset/spatialRaster/attributeList
/eml:eml/dataset/spatialVector/attributeList
/eml:eml/dataset/otherEntity/attributeList
The <attributeList> tree is required for all data types except for <otherEntity>. It describes all variables in a data entity in individual <attribute> elements. The description includes the name and definition of each attribute, its domain, definitions of coded values, and other pertinent information.
<attributeName> is typically the name of a field in a data table. This is often short and/or cryptic. It is recommended that attributeNames be suitable for use as a variable, e.g., composed of ASCII characters, and that the <attributeName>s match the column headers of a CSV or other text table.
Context: in the EDI repository, <attributeName>s must be unique within a data entity.
<attributeLabel> (optional): is used to provide a less ambiguous or less cryptic alternative identification than what is provided in <attributeName>. <attributeLabel> is likely to be used as a column or row header in an HTML display.
<attributeDefinition> gives a precise and complete definition of the attribute being documented. It explains the contents of the attribute fully so that a data user can interpret the attribute accurately.
<storageType> may be system specific, as for an RDBMS, e.g., a Microsoft SQL varchar or Oracle datetime. This field represents a 'hint' to processing systems as to how the attribute might be represented in a system or language, but is distinct from the actual expression of the domain of the attribute. Non system-specific values include float, integer and string.
<measurementScale> indicates the type of scale from which values are drawn for the attribute. EML's attribute-unit model is described in detail elsewhere; see "Other Resources". One of the 5 scale types must be used: nominal, ordinal, interval, ratio, or dateTime, as follows:
Non-numeric types: The <nominal> scale is used to represent named categories. Values are assigned to distinguish them from other observations. This would include a list of coded values (e.g. 1=male, 2=female), or plain text descriptions. Columns that contain strings or simple text are nominal. Example: plot1, plot2, plot3.
<ordinal> values are categories that have a logical or ordered relationship to one another, but the magnitude of the differences between the values is not defined or meaningful. Example: Low, Medium, High.
Both the nominal and ordinal scales are <nonNumericDomain> types, and can be either text or an enumerated list. The <enumeratedDomain> applies to coded values, and requires a <codeDefinition> or a referenced entity containing the code explanations. For <textDomain> an optional pattern may describe the text, e.g., a US telephone number can be described by the format "\d\d\d-\d\d\d-\d\d\d\d".
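The telephone-number case can be sketched as a nominal attribute whose <textDomain> carries the pattern. The attribute id, name and definition below are hypothetical.

```xml
<!-- Sketch only: id, name and definition are hypothetical. -->
<attribute id="contacts.phone">
  <attributeName>phone</attributeName>
  <attributeDefinition>Telephone number of the site contact</attributeDefinition>
  <measurementScale>
    <nominal>
      <nonNumericDomain>
        <textDomain>
          <definition>US telephone number</definition>
          <!-- optional pattern describing the allowed text -->
          <pattern>\d\d\d-\d\d\d-\d\d\d\d</pattern>
        </textDomain>
      </nonNumericDomain>
    </nominal>
  </measurementScale>
</attribute>
```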
Numeric types: <interval> measurements are ordinal, but in addition, use equal-sized units on a scale between values. Because the units are equal sized, these measurements are numeric. However, the starting point is arbitrary, so a value of zero is not meaningful. For example, the Celsius temperature scale uses degrees which are equally spaced, but where zero does not represent "absolute zero" (i.e., the temperature at which molecular motion stops), and 20 C is not "twice as hot" as 10 C.
<ratio> measurements have a meaningful zero point, and ratio comparisons between values are legitimate. For example, the Kelvin scale reflects the amount of kinetic energy of a substance (i.e., zero is the point where a substance transmits no thermal energy), and so temperature measured in kelvin units is a ratio measurement. Concentration is also a ratio measurement because a solution at 10 micromolePerLiter has twice as much substance as one at 5 micromolePerLiter.
The numeric types <interval> and <ratio> scales require additional tags describing the <unit>, <numericDomain>, and <precision>.
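To make the interval/ratio distinction concrete, the sketch below describes the same kind of temperature measurement twice: in Celsius as <interval> and in kelvin as <ratio>. The attribute ids, names and precision values are hypothetical; "celsius" and "kelvin" are assumed to be taken from the EML standard unit dictionary.

```xml
<attributeList>
  <!-- Celsius: equal-sized units, arbitrary zero, hence interval -->
  <attribute id="met.air_temp_c">
    <attributeName>air_temp_c</attributeName>
    <attributeDefinition>Air temperature at the field station</attributeDefinition>
    <measurementScale>
      <interval>
        <unit>
          <standardUnit>celsius</standardUnit>
        </unit>
        <precision>0.1</precision>
        <numericDomain>
          <numberType>real</numberType>
        </numericDomain>
      </interval>
    </measurementScale>
  </attribute>
  <!-- Kelvin: meaningful zero point, hence ratio -->
  <attribute id="met.air_temp_k">
    <attributeName>air_temp_k</attributeName>
    <attributeDefinition>Air temperature at the field station</attributeDefinition>
    <measurementScale>
      <ratio>
        <unit>
          <standardUnit>kelvin</standardUnit>
        </unit>
        <precision>0.1</precision>
        <numericDomain>
          <numberType>real</numberType>
        </numericDomain>
      </ratio>
    </measurementScale>
  </attribute>
</attributeList>
```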
<unit> Units should be described in correct physical units. Terms which describe data but are not units should be used in <attributeDefinition>. For example, for data describing "milligrams of Carbon per square meter", "Carbon" belongs in the <attributeDefinition>, while the <unit> is "milligramPerMeterSquared".
<standardUnit> and <customUnit>: Unit names must be either <standardUnit>, from the unit dictionary included with EML (http://knb.ecoinformatics.org/software/eml/eml-2.1.0/eml-unitTypeDefinitions.html#StandardUnitDictionary) or <customUnit> and defined in the <additionalMetadata>.
For general purposes, the following guidelines (from ISO recommendations) apply to <customUnit>s: Units should be written out, not abbreviated. Unit modifiers, such as "squared", should follow the unit being modified. For example, meterSquared is preferred, while squareMeter is improper. Units should be singular, such as "meter", and not plural, such as "meters".
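The carbon example above might be encoded roughly as follows. This is a sketch: the attribute id, name and precision are hypothetical, and milligramPerMeterSquared is treated here as a <customUnit>, which is an assumption; a custom unit also needs a matching definition in <additionalMetadata>.

```xml
<!-- Sketch only: id, name and precision are hypothetical. -->
<attribute id="soil.carbon">
  <attributeName>carbon</attributeName>
  <!-- "Carbon" is described here, not in the unit name -->
  <attributeDefinition>Mass of carbon per unit area of soil surface</attributeDefinition>
  <measurementScale>
    <ratio>
      <unit>
        <!-- written out, singular, modifier after the unit it modifies -->
        <customUnit>milligramPerMeterSquared</customUnit>
      </unit>
      <precision>0.1</precision>
      <numericDomain>
        <numberType>real</numberType>
      </numericDomain>
    </ratio>
  </measurementScale>
</attribute>
```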
Context: EDI has adopted the LTER Unit Registry and recommends that the <customUnit> element be used for all units with content pulled from the Unit Registry, even when the unit is already listed in the standard unit dictionary.
<numericDomain> This tag includes elements specifying the <numberType> and the minimum and maximum allowable values of a numeric attribute. A measurement's <numberType> should be defined as real, natural, whole or integer as explained in the EML handbook (see "Other Resources"). The <bounds> are theoretical or allowable minimum and maximum values (prescriptive), rather than the actual observed range in a dataset (descriptive). The <bounds> tree is optional.
<precision> describes the number of decimal places for the attribute. Currently, EML does not allow more than one precision value for a column. For example, a column containing lengths of fish may be measured to a precision of .01 meter for one species of fish (e.g., large), and .001 meter for a different species, but all the data on "fish length" are collected into one attribute and are measured using their appropriate precision values. For these cases precision can be omitted, but the variable precision information should be described in detail in methods/methodStep. Together, the information in <numericDomain> and <precision> is sufficient to decide upon an appropriate system-specific data type for representing a particular attribute. For example, an attribute with a numeric domain from 0-50,000 and a precision of 1 could be represented in the C language using a 'long' value, but if the precision is changed to '0.5' then a 'float' type would be needed.
The <measurementScale> element <dateTime> is a date-time value from the Gregorian calendar, and it is recommended that these be expressed in a format that conforms to the ISO 8601 standard. An example of an allowable ISO date-time is "YYYY-MM-DD", as in 2004-06-25, or, more fully, "YYYY-MM-DDThh:mm:ssTZD" (e.g., 1997-07-16T19:20:30.45Z). The ISO standard is quite strict about the structure of date components. Since legacy data often contain non-standard dates, and existing equipment (e.g., sensors) may still be producing non-standard dates, the EML authors have provided additional allowable formats. See the EML documentation for a complete list. It is important to note that the dateTime field should not be used for recording time durations. In that case, use a unit such as second, nominalMinute or nominalDay, that defines the duration in terms of its relationship to the SI second.
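The difference between a point in time and a duration can be sketched with two attributes: the first uses <dateTime> with an ISO 8601 format string, the second records a duration as a <ratio> measurement in nominalDay. The attribute ids, names and definitions are hypothetical.

```xml
<attributeList>
  <!-- A point in time: dateTime scale with an ISO 8601 format string -->
  <attribute id="obs.sample_time">
    <attributeName>sample_time</attributeName>
    <attributeDefinition>Date and time at which the sample was taken</attributeDefinition>
    <measurementScale>
      <dateTime>
        <formatString>YYYY-MM-DDThh:mm:ss</formatString>
      </dateTime>
    </measurementScale>
  </attribute>
  <!-- A duration: numeric scale with a time unit, not dateTime -->
  <attribute id="obs.incubation">
    <attributeName>incubation</attributeName>
    <attributeDefinition>Length of the incubation period</attributeDefinition>
    <measurementScale>
      <ratio>
        <unit>
          <standardUnit>nominalDay</standardUnit>
        </unit>
        <precision>1</precision>
        <numericDomain>
          <numberType>whole</numberType>
        </numericDomain>
      </ratio>
    </measurementScale>
  </attribute>
</attributeList>
```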
The <missingValueCode> is optional, but should be included to describe any missing value codes present in the dataset (e.g. NA, NaN, ND, 9999). The missing value code is a string, not a value, which means that the content of this field must exactly match what appears in place of data values for it to be correctly interpreted. For example, if data are output with precision .01 and with missing values formatted to "-9999.00", then the content of the <missingValueCode> element must be "-9999.00" not "-9999".
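A sketch of the "-9999.00" case follows. The attribute id, name, definition and explanation text are hypothetical; the point is that the <code> content reproduces the string exactly as it appears in the data file.

```xml
<!-- Sketch only: id, name and definition are hypothetical. -->
<attribute id="soil.moisture">
  <attributeName>moisture</attributeName>
  <attributeDefinition>Volumetric water content of the soil, as a fraction</attributeDefinition>
  <measurementScale>
    <ratio>
      <unit>
        <standardUnit>dimensionless</standardUnit>
      </unit>
      <precision>0.01</precision>
      <numericDomain>
        <numberType>real</numberType>
      </numericDomain>
    </ratio>
  </measurementScale>
  <missingValueCode>
    <!-- must match the file character for character: "-9999.00", not "-9999" -->
    <code>-9999.00</code>
    <codeExplanation>value not recorded</codeExplanation>
  </missingValueCode>
</attribute>
```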
The examples show two attribute trees. The first was generated from an SQL system with a defined storage type. The second <attributeList> includes tags for <customUnit>, with the unit defined in the <additionalMetadata> tree.
Example: attributeList/attribute dataTable
<attributeList>
  <attribute id="soil_chemistry.site_id">
    <attributeName>site_id</attributeName>
    <attributeDefinition>Site id as used in sites table</attributeDefinition>
    <storageType typeSystem="http://www.w3.org/2001/XMLSchema-datatypes">string</storageType>
    <measurementScale>
      <nominal>
        <nonNumericDomain>
          <textDomain>
            <definition>Site id as used in sites table</definition>
          </textDomain>
        </nonNumericDomain>
      </nominal>
    </measurementScale>
  </attribute>
  <attribute id="soil_chemistry.pH">
    <attributeName>pH</attributeName>
    <attributeDefinition>pH of soil solution</attributeDefinition>
    <storageType typeSystem="http://www.w3.org/2001/XMLSchema-datatypes">float</storageType>
    <measurementScale>
      <ratio>
        <unit>
          <standardUnit>dimensionless</standardUnit>
        </unit>
        <precision>0.01</precision>
        <numericDomain>
          <numberType>real</numberType>
        </numericDomain>
      </ratio>
    </measurementScale>
  </attribute>
  <attribute id="pass2001.q110">
    <attributeName>q110</attributeName>
    <attributeDefinition>Q110-Preference for front yard landscape</attributeDefinition>
    <storageType typeSystem="http://www.w3.org/2001/XMLSchema-datatypes">float</storageType>
    <measurementScale>
      <ordinal>
        <nonNumericDomain>
          <enumeratedDomain>
            <codeDefinition>
              <code>1.00</code>
              <definition>1-A desert landscape</definition>
            </codeDefinition>
            <codeDefinition>
              <code>2.00</code>
              <definition>2-Mostly lawn</definition>
            </codeDefinition>
            <codeDefinition>
              <code>3.00</code>
              <definition>3-Some lawn</definition>
            </codeDefinition>
          </enumeratedDomain>
        </nonNumericDomain>
      </ordinal>
    </measurementScale>
  </attribute>
  <attribute id="att.2">
    <attributeName>Year</attributeName>
    <attributeDefinition>Calendar year of the observation from years 1990 - 2010</attributeDefinition>
    <storageType>integer</storageType>
    <measurementScale>
      <dateTime>
        <formatString>YYYY</formatString>
        <dateTimePrecision>1</dateTimePrecision>
        <dateTimeDomain>
          <bounds>
            <minimum exclusive="false">1993</minimum>
            <maximum exclusive="false">2003</maximum>
          </bounds>
        </dateTimeDomain>
      </dateTime>
    </measurementScale>
  </attribute>
  <attribute id="att.7">
    <attributeName>Count</attributeName>
    <attributeDefinition>Number of individuals observed</attributeDefinition>
    <storageType>integer</storageType>
    <measurementScale>
      <interval>
        <unit>
          <standardUnit>number</standardUnit>
        </unit>
        <precision>1</precision>
        <numericDomain>
          <numberType>whole</numberType>
          <bounds>
            <minimum exclusive="false">0</minimum>
          </bounds>
        </numericDomain>
      </interval>
    </measurementScale>
    <missingValueCode>
      <code>NaN</code>
      <codeExplanation>value not recorded or invalid</codeExplanation>
    </missingValueCode>
  </attribute>
  <attribute id="att.8">
    <attributeName>cond</attributeName>
    <attributeLabel>Conductivity</attributeLabel>
    <attributeDefinition>measured with SeaBird Electronics CTD-911</attributeDefinition>
    <storageType>float</storageType>
    <measurementScale>
      <ratio>
        <unit>
          <customUnit>siemensPerMeter</customUnit>
        </unit>
        <precision>0.0001</precision>
        <numericDomain>
          <numberType>real</numberType>
          <bounds>
            <minimum exclusive="false">0</minimum>
            <maximum exclusive="false">40</maximum>
          </bounds>
        </numericDomain>
      </ratio>
    </measurementScale>
  </attribute>
</attributeList>
The examples below show complete entity trees for <spatialVector> and <spatialRaster> converted via XSLT (stylesheet) from ESRI metadata format. For details see "Other Resources".
Example: Entity and attribute information for spatialVector
<spatialVector id="Landuse for Ficity in 1955">
  <entityName>Landuse for Ficity in 1955</entityName>
  <entityDescription>This GIS layer represents a reconstructed generalized landuse map for the area of current Ficity around the time period of 1955.</entityDescription>
  <physical>
    <objectName>fls-20.zip</objectName>
    <dataFormat>
      <externallyDefinedFormat>
        <formatName>Shapefile</formatName>
      </externallyDefinedFormat>
    </dataFormat>
    <distribution>
      <online>
        <onlineDescription>fls-20 Zipped Shapefile File</onlineDescription>
        <url function="download">http://www.fsu.edu/lter/data/fls-20.zip</url>
      </online>
    </distribution>
  </physical>
  <attributeList id="Landuse for Ficity in 1955.attributeList">
    <attribute id="Landuse for Ficity in 1955.FID">
      <attributeName>FID</attributeName>
      <attributeDefinition>Internal feature number.</attributeDefinition>
      <storageType typeSystem="http://www.esri.com/metadata/esriprof80.html">OID</storageType>
      <measurementScale>
        <nominal>
          <nonNumericDomain>
            <textDomain>
              <definition>Sequential unique whole numbers that are automatically generated.</definition>
            </textDomain>
          </nonNumericDomain>
        </nominal>
      </measurementScale>
    </attribute>
    <attribute id="Landuse for Ficity in 1955.Shape">
      <attributeName>Shape</attributeName>
      <attributeDefinition>Feature geometry.</attributeDefinition>
      <storageType typeSystem="http://www.esri.com/metadata/esriprof80.html">Geometry</storageType>
      <measurementScale>
        <nominal>
          <nonNumericDomain>
            <textDomain>
              <definition>Coordinates defining the features.</definition>
            </textDomain>
          </nonNumericDomain>
        </nominal>
      </measurementScale>
    </attribute>
    <attribute id="Landuse for Ficity in 1955.Z955">
      <attributeName>Z955</attributeName>
      <attributeDefinition>This field signifies the landuse value for each polygon.</attributeDefinition>
      <storageType typeSystem="http://www.w3.org/2001/XMLSchema-datatypes">string</storageType>
      <measurementScale>
        <nominal>
          <nonNumericDomain>
            <enumeratedDomain>
              <codeDefinition>
                <code>Agriculture</code>
                <definition>Agricultural land use</definition>
              </codeDefinition>
              <codeDefinition>
                <code>Urban</code>
                <definition>Urbanized area</definition>
              </codeDefinition>
              <codeDefinition>
                <code>Desert</code>
                <definition>Unmodified area</definition>
              </codeDefinition>
              <codeDefinition>
                <code>Recreation</code>
                <definition>Recreational land use</definition>
              </codeDefinition>
            </enumeratedDomain>
          </nonNumericDomain>
        </nominal>
      </measurementScale>
    </attribute>
  </attributeList>
  <geometry>Polygon</geometry>
  <geometricObjectCount>78</geometricObjectCount>
  <spatialReference>
    <horizCoordSysName>NAD_1927_UTM_Zone_12N</horizCoordSysName>
  </spatialReference>
</spatialVector>
Example: Entity and attribute information for spatialRaster
<spatialRaster id="fi_24k">
  <entityName>fi_24k</entityName>
  <entityDescription>Fictitious State 7.5 Minute Digital Elevation Model</entityDescription>
  <physical>
    <objectName>fls-30.zip</objectName>
    <dataFormat>
      <externallyDefinedFormat>
        <formatName>Esri Grid</formatName>
      </externallyDefinedFormat>
    </dataFormat>
    <distribution>
      <online>
        <onlineDescription>fls-30 zipped raster data File</onlineDescription>
        <url function="download">http://www.fsu.edu/lter/data/fls-30.zip</url>
      </online>
    </distribution>
  </physical>
  <attributeList id="fi_24k.attributeList">
    <attribute id="fi_24k.ObjectID">
      <attributeName>ObjectID</attributeName>
      <attributeDefinition>Internal feature number.</attributeDefinition>
      <storageType typeSystem="http://www.esri.com/metadata/esriprof80.html">OID</storageType>
      <measurementScale>
        <nominal>
          <nonNumericDomain>
            <textDomain>
              <definition>Sequential unique whole numbers that are automatically generated.</definition>
            </textDomain>
          </nonNumericDomain>
        </nominal>
      </measurementScale>
    </attribute>
    <attribute id="fi_24k.Cell Value">
      <attributeName>Cell Value</attributeName>
      <attributeDefinition>Elevation Value</attributeDefinition>
      <storageType typeSystem="http://www.esri.com/metadata/esriprof80.html">Integer</storageType>
      <measurementScale>
        <ratio>
          <unit>
            <standardUnit>meter</standardUnit>
          </unit>
          <precision/>
          <numericDomain>
            <numberType>integer</numberType>
            <bounds>
              <minimum exclusive="true">-5193.000000</minimum>
              <maximum exclusive="true">14785.000000</maximum>
            </bounds>
          </numericDomain>
        </ratio>
      </measurementScale>
    </attribute>
    <attribute id="fi_24k.Count">
      <attributeName>Count</attributeName>
      <attributeDefinition>Count</attributeDefinition>
      <storageType typeSystem="http://www.esri.com/metadata/esriprof80.html">Integer</storageType>
      <measurementScale>
        <ratio>
          <unit>
            <standardUnit>number</standardUnit>
          </unit>
          <precision/>
          <numericDomain>
            <numberType>whole</numberType>
          </numericDomain>
        </ratio>
      </measurementScale>
    </attribute>
  </attributeList>
  <spatialReference>
    <horizCoordSysName>NAD_1927_UTM_Zone_12N</horizCoordSysName>
  </spatialReference>
  <horizontalAccuracy>not available</horizontalAccuracy>
  <verticalAccuracy>not available</verticalAccuracy>
  <cellSizeXDirection>30.0</cellSizeXDirection>
  <cellSizeYDirection>30.0</cellSizeYDirection>
  <numberOfBands>1</numberOfBands>
  <rasterOrigin>Upper Left</rasterOrigin>
  <rows>21092</rows>
  <columns>18136</columns>
  <verticals>1</verticals>
  <cellGeometry>matrix</cellGeometry>
</spatialRaster>
constraint
This element tree is found at (XPath):
/eml:eml/dataset/dataTable/constraint
/eml:eml/dataset/view/constraint
/eml:eml/dataset/spatialRaster/constraint
/eml:eml/dataset/spatialVector/constraint
/eml:eml/dataset/storedProcedure/constraint
The <constraint> tree is for describing any integrity constraints between entities (e.g. tables), as they would be maintained in a relational management system. Use of the <constraint> tree is encouraged when data elements contain integrity constraints from a relational database. Example TO-DO shows the constraints for the <attributeList> in Example TO-DO. If there are constraints in which several columns are involved, these should be described in methods/qualityControl, since EML is not currently equipped to handle keys defined by multiple columns. When the <constraint> tree is used, all of the entities that may be referenced should be in the same package. There are six child elements:
<primaryKey> is an element which declares the primary key in the entity to which the defined constraint pertains.
<uniqueKey> is an element which represents a unique key within the referenced entity. This is different from a primary key in that it does not form any implicit foreign key relationships to other entities; however, it is required to be unique within the entity.
<notNullConstraint> defines a constraint that indicates that no null values should be present for an attribute in this entity.
<checkConstraint> defines a constraint which checks a conditional clause within an entity. The condition is given as an SQL statement or other language implementation. Generally this provides a means for constraining the values within and among entities.
<foreignKey> defines a foreign key relationship to another entity, which is named in <entityReference>. It also provides the means to meaningfully link to a table for explanation of codes (de-normalization).
<joinCondition> defines a foreign key relationship among entities which relates this entity to another's primary key.
The <primaryKey>, <uniqueKey> and <notNullConstraint> require an additional <key> tag defining the attribute to which this constraint applies, referenced by its id attribute (described in another area). All <ConstraintType> entities require additional <constraintName> and <attributeReference> tags.
Example: constraint
<constraint id="soil_chemistry.PRIMARY">
  <primaryKey>
    <constraintName>PRIMARY</constraintName>
    <key>
      <attributeReference>soil_chemistry.ID</attributeReference>
    </key>
  </primaryKey>
</constraint>
<constraint id="soil_chemistry.FK_soil_chemistry_sites">
  <foreignKey>
    <constraintName>FK_soil_chemistry_sites</constraintName>
    <key>
      <attributeReference>soil_chemistry.site_id</attributeReference>
    </key>
    <entityReference>sites</entityReference>
  </foreignKey>
</constraint>
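The remaining single-column key constraints follow the same shape as the example above. The sketch below adds a unique key and a not-null constraint for the same table; the constraint ids and names and the sample_code attribute are hypothetical, and the element is spelled notNullConstraint on the assumption that this is the name used in the EML schema.

```xml
<!-- Sketch only: constraint ids/names and sample_code are hypothetical. -->
<constraint id="soil_chemistry.UNIQUE_sample_code">
  <uniqueKey>
    <constraintName>UNIQUE_sample_code</constraintName>
    <key>
      <!-- points at the id attribute of the constrained <attribute> -->
      <attributeReference>soil_chemistry.sample_code</attributeReference>
    </key>
  </uniqueKey>
</constraint>
<constraint id="soil_chemistry.NOTNULL_site_id">
  <notNullConstraint>
    <constraintName>NOTNULL_site_id</constraintName>
    <key>
      <attributeReference>soil_chemistry.site_id</attributeReference>
    </key>
  </notNullConstraint>
</constraint>
```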
additionalMetadata
This element tree is found at (XPath):
/eml:eml/additionalMetadata
<additionalMetadata> is a flexible field for including any other relevant metadata that pertains to the resource being described. Its content must be valid XML. A unit used as a <customUnit> must be described in this tree.
<describes> (optional) is a pointer to an "id" attribute on an EML element ("id" described in another area). This pointer must be identical to the attribute it is pointing at, so that automated processes are able to associate <additionalMetadata> with the described element. If the <describes> element is omitted, it is assumed that the <additionalMetadata> content applies to the entire EML document.
<metadata> contains the additional metadata to be included in the document. The contents can be any valid XML. This element should be used for extending EML to include metadata that is not already available in another part of the EML specification, or to include site- or system-specific extensions that are needed beyond the core metadata. The additional metadata contained in this field describes the element referenced in the <describes> element preceding it. If <describes> is not used, either <metadata> must contain sufficient information to define the association between the <additionalMetadata> and the element it describes, or the <additionalMetadata> can be presumed to apply to the entire data package.
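A sketch of <describes> in use: the pointer repeats the id of the pH attribute from the attributeList example, and <metadata> carries a site-specific extension. The <sensorNote> element and its content are hypothetical; any valid XML could appear there.

```xml
<additionalMetadata>
  <!-- must match the id attribute of the described element exactly -->
  <describes>soil_chemistry.pH</describes>
  <metadata>
    <!-- hypothetical site-specific element -->
    <sensorNote>pH probe recalibrated before each sampling campaign</sensorNote>
  </metadata>
</additionalMetadata>
```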
An example of "sufficient information to define the association" is the definition of a <customUnit>. The EML Parser expects to find the description of a <customUnit> in the id attribute of a <unit> element in a <unitList>, i.e., at /eml:eml/additionalMetadata/metadata/unitList/unit. For example, stmml:unit id="siemensPerMeter" points at the <customUnit> "siemensPerMeter". The EML Parser is available from GitHub, with the EML project. For descriptions of custom units see "Other Resources".
Example: additionalMetadata custom unit
<additionalMetadata>
  <metadata>
    <stmml:unitList>
      <stmml:unit id="siemensPerMeter" name="siemensPerMeter" abbreviation="S/m" unitType="conductance" parentSI="siemen" multiplierToSI="1" constantToSI="0">
        <stmml:description>conductivity unit</stmml:description>
      </stmml:unit>
    </stmml:unitList>
  </metadata>
</additionalMetadata>
III. DESCRIPTIONS OF EML SAMPLE FILES PROVIDED WITH THIS DOCUMENT
Example 1: Complete EML from which the examples in this document were derived can be found in EDI's GitHub repository https://github.com/EDIorg/dm-best-practices
INDEX
Tbd.