aat lod microthesauri

Download AAT LOD Microthesauri

If you can't read please download the document

Post on 03-Nov-2014

319 views

Category:

Data & Analytics

1 download

Embed Size (px)

DESCRIPTION

Create Linked Open Data (LOD) Microthesauri using Art & Architecture Thesaurus (AAT) LOD. View and manage options by a non-techy person. Everyone can use, create, derive from, & map to AAT microthesauri and make the digital collection become LOD-ready dataset.

TRANSCRIPT

  • 1. AAT LOD MicrothesauriCreate Linked Open Data (LOD) Microthesauriusing Art & Architecture Thesaurus (AAT) LODMarcia Lei ZengAAT International Terminology Working Group (ITWG) meetingSeptember 5-7, 2014Dresden, Germany

2. 1. DefinitionMicrothesaurus: designated subset of a thesaurus thatis capable of functioning as a complete thesaurus.-- ISO25964-2:2013Microthesauri aredifferent from: Derived vocabulariesS(source)SSSNewNewN - - N e w - -NDerivation/Modeling adaptation modification expansion partial adaptation translation 3. 12334AAT-basedVocabularies 56Full ATT orAAT MicrothesauiOther Non-LODVocabsThe need to use, create, derive from, map to AAT& go to LOD2. Overview: Situations and decisionsfor an art and architecturedigital collectionthat wants to become a LOD dataset 4. 3. Can a microthesaurus be madefrom an existing thesaurus?Structure ExampleYES Classificatorystructure EUROVOC Chinese Classified Thesaurus [English Heritage Thesauri]YES Faceted structure AAT FAST (Faceted Application of SubjectTerminology)YES/MaybeDeep hierarchies(family trees) AAT NASA Thesaurus INSPEC ThesaurusNO/Not-directlyflat structure[alphabeticallyorganized] LCSH many thesauriMicrothesaurus: designated subset of a thesaurus that is capableof functioning as a complete thesaurus. -- ISO25964-2:2013 5. Example: Eurovoc "EuroVoc is split into 21domains and 127microthesauri.Each domain is divided into anumber of microthesauri.A microthesaurus isconsidered as a conceptscheme with a subset of theconcepts that are part of thecomplete EuroVocthesaurus."Source:http://eurovoc.europa.eu/drupal/?q=node/555 6. CHIN listed890+recommendedresources.AAT's facetsandhierarchiesthat are listedseparately.Canadian Heritage Information Network (CHIN)Source: Search "AAT" from http://www.pro.rcip-chin.gc.ca/ressources-resources/index-eng.jsp 7. From: Getty Vocabularies: Linked Open DataSemantic Representation.Section 2.3.4 Top Conceptshttp://vocab.getty.edu/doc/#The_Getty_Vocabularies_and_LOD4. AATStructure'sSemanticRepresentation(Go to nextslide for non-techyview.) 8. Art and Architecture Thesaurus (AAT)Facet:ObjectsHierarchy:Furnishing and EquipmentConcept:containers (receptacles)Guide term:concept:vessels (containers)concept:rhyta(cont.) AATStructure'sSemanticRepresentation 9. Facet:ObjectsHierarchy:Furnishing and EquipmentConcept:containers (receptacles)Guide term:concept:vessels (containers)concept:rhytaWhat are special inAATFacetsSub-facets(Indicated bynode labels)Art and Architecture Thesaurus (AAT)[large] Hierarchies(full coverage, deep layer)The units wererecommended to useby projects such asThe CanadianHeritage InformationNetwork (CHIN) 10. What are usuallyavailable in a flatstructured LODconceptconcept:ConceptBTNTSource: http://id.loc.gov/authorities/subjects/sh85142374.skos.rdfthesaurus 11. so are in AAT;conceptconcept:ConceptBTResults are obtained by entering the following in NThttp://vocab.getty.edu/sparql :# 5.1.10 Find Subject by Exact English PrefLabelselect * {?subj gvp:prefLabelGVP/xl:literalForm"rhyta"@en} 12. Facet:ObjectsHierarchy:Furnishing and EquipmentConcept:containers (receptacles)Guide term:concept:vessels (containers)concept:rhyta but AAT LOD has more:FacetsArt and Architecture Thesaurus (AAT)[large] Hierarchies(full coverage, deep layer)Sub-facets(Indicated bynode labels) 13. 5. An example-- Use a to obtainall concept URIsin a facet or hierarchyPart 1. Get Data 14. Steps:After choosing a facet or ahierarchy from AAT...1. Get the ID2. Go to SPARQL Endpointnext slide 15. Step 2. Go to Getty Vocab SPARQL Endpoint: http://vocab.getty.edu/sparql 16. http://vocab.getty.edu/sparqlStep 3. Choose "Descendants of aGiven Parent" from the template,click. The template's text will showon the top Query box. 17. Steps4. Replace the ID (e.g., 300117143) in theQuery template[you may modify to add more requests]5. Submit6. Get all URIs and labels under this guideterm.Note: I replaced the aat ID, also inserted a line to get thelabels, and sort by label. Here is the text of the query:select * {?x gvp:broaderExtended aat:300117143.?x gvp:prefLabelGVP [xl:literalForm ?l]; skos:inScheme aat:} order by ?l 18. It gave me the results in 2seconds: 19. (I checked to make sure thatthe results are from multiplelevels in the hierarchy. ) 20. Step 7. Download JSON formatdata.Download Options:(1) JSON*(2) XML*JSON (JavaScript ObjectNotation) is a lightweight data-interchangeformat. 21. select * {?x gvp:broaderExtendedaat:300117143.?x gvp:prefLabelGVP [xl:literalForm ?l];skos:inScheme aat:} order by ?lResults of the JSON file.Descendants of a Given Parent: 22. (cont.) 5. An example-- Use a to obtain allconcept URIsin a facet or hierarchyPart 2. Viewing the dataset bya non-techy personAcknowledgement: Thanks to aVisiting Scholar En-bo Jiang forhelping the testing. 23. How to manage it by a non-techy person?Non-techy person's wish:I can see what are in the dataset;I can use a spreadsheet to open and manage it.Techy-person can prepare the fileas:1. From a JSON* file convert toCSV** file (can be opened asspreadsheet) using an open sourceconverter*JSON = (JavaScript ObjectNotation), a lightweight data-interchangeformat.**CSV = Comma SeparatedValue file format 24. "Form" view onlineUsing an online converter, turn JSON to CSV.http://codebeautify.org/view/jsonviewer 25. "Tree" view onlinehttp://codebeautify.org/view/jsonviewer 26. (cont.) How to manage it by a non-techy person?Non-techy person's wish:I can see what are in the dataset;I can use a spreadsheet to open and manage it.Techy-person can prepare thefile as:1. From a JSON* file convert toCSV** file (can be opened asspreadsheet) using an opensource converter, or2. From a JSON file Managefrom OpenRefine (open sourcesystem) or export to aspreadsheet 27. When uploaded the JSONfile to OpenRefine,highlight the first enter inorder for the software to tellthe structure. 28. Establish a 'Project',then ready to edit.Note: OpenRefine can beused for many otherfunctions for management,clean up, reconcile, etc. 29. Export 30. Open the JSON filefrom spreadsheet onmy laptopTo do: need to double check if all nodelabels and preferred terms are in. 31. If open the XML file fromspreadsheet, it looks like: 32. The least techy-wayis to copy-paste to aspreadsheet. 33. Summary of the processes1. Choose the facet or hierarchy you like to start;2. Find the ID of that concept.3. Use this template to get the URIs and labels: Replace the ID in the Querytemplate Submit Get the URIs and labels inunder this guide term. Sort by order (column x)# 5.1.2 Descendants of a Given Parentselect * {?x gvp:broaderExtendedaat:300117143.?x gvp:prefLabelGVP [xl:literalForm ?l];skos:inScheme aat:} order by ?l4. Use a tool that can treat JSON to view andmanage.5. Additional ideas: Use other templates to obtain needed data foryour microthesauri. (See next slide.)6. Additional ideas: Using RelFinder to Visualizehttp://www.visualdataweb.org/relfinder.php 34. More examplesUse other templates to obtain needed data for your microthesauri. Find AAT URIs and labels according to aContributor:#5.1.3 Subjects by Contributor Idselect * {?x a gvp:Subject; dct:contributoraat_contrib:10000178.?x gvp:prefLabelGVP [xl:literalForm ?l]} Find, within this set of data, only thoseinvolving a particular contributor, e.g.,by CDBP-DIBAM (Direccin deBibliotecas, Archivos y Museos;Santiago, Chile), id:300117143.)select ?x ?l ?contrib {?x gvp:broaderExtended aat:300117143.?x gvp:prefLabelGVP [xl:literalForm ?l].?x dcterms:contributor aat_contrib:10000131.} Click to view and get alldata related to an URI 35. & go to LOD6. ConclusionLOD AAT Microthesauri use, create, derive from, & map tohttp://marciazeng.slis.kent.edu/ http://lod-lam.slis.kent.edu/