
International Journal of Digital Information and Wireless Communications (IJDIWC) 1(1): 230-246
The Society of Digital Information and Wireless Communications, 2011 (ISSN 2225-658X)


Structured Wiki with Annotation for Knowledge Management: an Application to Cultural Heritage

Éric Leclercq and Marinette Savonnet

LE2I Laboratory UMR CNRS 5158 - Université de Bourgogne
9, Avenue Alain Savary - 21078 Dijon

ABSTRACT

In this paper, we highlight how semantic wikis can be relevant solutions for building cooperative data-driven applications in domains characterized by a rapid evolution of knowledge. We point out the semantic capabilities of annotated databases and structured wikis to provide better quality of content, to support complex queries and finally to accommodate different types of users. Then we compare database application development with wikis for domains that encompass evolving knowledge. We detail the architecture of WikiBridge, a semantic wiki, which integrates template forms and allows complex annotations as well as consistency checking. We describe the archaeological CARE project, and explain the conceptual modeling approach. A specific section is dedicated to ontology design, which is the compulsory foundational knowledge for the application. We finally report related work on the use of semantic wikis for archaeological projects.

KEYWORDS

Annotated database, Cultural heritage application, eScience, Ontology engineering, Semantic wiki.

1 INTRODUCTION

Scientists produce more and more data; for example, Galperin and Cochrane [1] have counted 1330 databases containing more than two petabytes of data covering different aspects of cell and molecular biology. This data-intensive science, called eScience, takes a data-driven approach, where knowledge emerges from data, as opposed to a more traditional knowledge-driven approach that examines hypothesized patterns expected from data [2]. To meet the needs of eScience applications we must take into account several characteristics (figure 1):

1) Domain knowledge that can be represented by thesauri, domain ontologies (for example Gene Ontology in biology or CIDOC-CRM in cultural heritage), standards (for example FuGE¹ in functional genomics) and recommendations. Domain ontology is specialized into application ontology. Knowledge is continually changing, requiring a flexible data structure;

2) Know-how that is expressed by the business process. Generally, a scientific process can be described using five steps: a) data acquisition and modeling, b) collaboration, c) analysis and data mining, d) dissemination and sharing, e) archiving and preservation;

3) Technical basis that often takes the form of collaborative platforms with sophisticated technologies (e.g. the ability to integrate other services such as visualization tools, or spatial analysis tools).

¹ FuGE: Functional Genomics Experiment, http://fuge.sourceforge.net/

Figure 1. Working environment.

Archaeology is on the verge of embracing eScience [3]: excavations generate exponentially more massive datasets, spurred by the increasing use of imaging tools. Documents (archives, excavation reports) are the basis of the archaeologist's work. To meet the needs of archaeologists, we have developed a semantic wiki, WikiBridge, where knowledge takes the form of an annotated database coupled with a triple store and a wiki as user interface.

The rest of the paper is organized as follows: section 2 gives an overview of annotation and its implementation in two major kinds of systems, section 3 describes the requirements and the WikiBridge architecture, and section 4 describes our annotation model, which is compared to other annotation systems. Section 5 gives an overview of the CARE project and describes semantic tools for archaeology, and section 6 describes related work. Finally, section 7 concludes the paper.

2 STATE OF THE ART

Annotations of resources (documents, images, data, web resources, etc.) can be created using different models, from simple text to RDF graphs. New web-based applications such as semantic blogs, social networks, semantic wikis, or data set providers (for biological, georeferenced, climate data) make extensive use of annotation systems. Moreover, applications use annotations in different ways [4], for example for finding communities in social networks or as a basis for semantic queries in big data applications. Currently no unified model exists for all these kinds and usages of annotations. In this section we first study annotation models and then two kinds of applications that make extensive use of annotations.

2.1 Annotations

Generally speaking, the term annotation refers to a piece of data associated with another piece of data. In software applications the term annotation can denote either the process of annotating a resource or the result of that process [4]. In this article we use annotation for the result of the process.

In document management systems, annotations can be used at different levels: from the whole document to the word level. Annotations can be manual (i.e. made by a person), semi-automatic (i.e. based upon suggestions) or fully automated. Annotations can be associated with a group of users (experts, novices, etc.) and shared with the same group or with other groups.

Annotation creates a relationship between resources denoted by URIs. It establishes a typed relation between the annotated data and the annotating data. The set of all annotations related to the same resource takes the form of a graph structure.

The different models of annotation used in web-based applications share a common organization in a three-dimensional space: a subject (the annotated data), a predicate (the typed relationship between the annotated data and the annotating data), and an object (the annotating data). This conceptual model can be implemented using RDF triples, binary predicates of first-order logic, conceptual graphs or semantic networks.
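As a minimal sketch of this subject/predicate/object organization, the following Python fragment stores annotations as triples and groups them by subject, which yields the graph structure formed by all annotations of a resource. The `example.org` URIs, the predicate names and the helper `build_graph` are illustrative assumptions, not part of any system discussed here.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # the annotated data (a URI)
    predicate: str  # the typed relationship
    obj: str        # the annotating data (a URI or a literal)


def build_graph(triples):
    """Group annotations by subject: subject -> list of (predicate, object) edges."""
    graph = defaultdict(list)
    for t in triples:
        graph[t.subject].append((t.predicate, t.obj))
    return dict(graph)


# Hypothetical annotations; URIs and predicate names are made up for illustration.
annotations = [
    Triple("http://example.org/wiki/Moulis", "region", "Medoc"),
    Triple("http://example.org/wiki/Moulis", "date", "1993"),
    Triple("http://example.org/wiki/Medoc", "country", "France"),
]

graph = build_graph(annotations)
for subject, edges in graph.items():
    print(subject, edges)
```

A dictionary of edge lists is only one possible encoding; a triple store plays the same role at scale.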

Oren et al. [4] differentiate three types of annotations: informal, formal and ontological. Informal annotations do not use a formal language and thus are not machine-readable. Formal annotations use formal languages that are machine-readable but do not refer to common knowledge and thus are not machine-understandable. Ontological annotations use ontology terms that correspond to the conceptualization of shared knowledge. Therefore, ontology-based annotations are both machine-readable and machine-understandable.

2.2 Annotated Database

In scientific projects, there is a growing need to associate annotations with the corresponding data. Several works on integrating annotations into relational databases have been carried out during the last decade.

The DBNotes system [5] proposes an annotation mechanism for relational databases where each attribute in a relation has a corresponding attribute to hold annotations. DBNotes also extends the SQL language with a PROPAGATE clause, which allows users to specify how to propagate the annotations along with the query answers.

The MONDRIAN system [6] introduces an annotation model based on blocks to annotate a set of values. It defines an algebra that allows querying data based on annotations.

The bdbms system [7] allows annotations to be defined at multiple levels of granularity (relation, tuple, column and value levels). Moreover, it allows a user relation to have multiple annotation relations attached to it. Each annotation is attached to a region (i.e. defined by a bounding box) that represents the area covered by the annotation. SQL has also been extended to take the annotations into account in the result of a query.

Curated databases are a specific kind of annotated database [8]. They are expensive to establish, because experts must manually check each data item. Data, including past versions, are recorded as well as data provenance (the source of the data) and annotations describing the opinions of the experts. The added value of curated databases lies in their quality and organization.

2.3 Semantic Wiki

In a traditional wiki, semantics is implicitly described by links between pages and by the context of the link (surrounding text). A semantic wiki is a wiki that makes the semantics explicit: it includes semantic web technologies to enable annotation of resources. Semantic wikis can be built on top of an existing wiki or created from scratch. In [9], the authors identify two approaches to semantic wikis: 1) wiki-centric approaches use the wiki to organize knowledge, i.e. the ontology emerges from the wiki through categories and links (wikis for ontologies); 2) ontology-based approaches allow importing an existing ontology and using it in the annotation process (ontologies for wikis).

In the most popular wiki, MediaWiki, categories are the simplest form of annotation. They are used to classify wiki pages. Semantic MediaWiki [10] is a project that extends MediaWiki and provides new features such as: 1) relations to describe relationships between two pages by assigning annotations to existing links and 2) attributes that allow users to specify relationships between pages and literals. Table 1 gives an example of a page in MediaWiki using links and categories and the same page using Semantic MediaWiki capabilities.

MediaWiki:

The city of Moulis is located on the [[Medoc region]]. The building of the High [[Middle Ages]] was discovered in 1993 under the present parish church, largely Romane, surrounded by a parish cemetery until 1901, then transformed in the public square...
[Category:ArchaelogicalSite]

Semantic MediaWiki:

The city of Moulis is located on the [[region::Medoc]] region. The building of the High [[Middle Ages]] was discovered in [[date::1993]] under the present [[building-type::parish church]], largely Romane, surrounded by a parish cemetery until [[date::1901]], then transformed in the public square...
[Category:ArchaelogicalSite]

Table 1. Concepts, relationships and attributes in MediaWiki and Semantic MediaWiki.
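To illustrate how the typed links of the Semantic MediaWiki version in Table 1 carry machine-readable annotations, the sketch below extracts `[[property::value]]` pairs with a regular expression and turns them into triples. It is not the Semantic MediaWiki parser: the pattern, the function `extract_annotations` and the use of the page title "Moulis" as subject are assumptions made for illustration.

```python
import re

# Matches typed links such as [[date::1993]]; plain links such as
# [[Middle Ages]] contain no "::" and are therefore skipped.
TYPED_LINK = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)\]\]")


def extract_annotations(page_title, wikitext):
    """Return (subject, property, value) triples found in the wikitext."""
    return [
        (page_title, m.group(1).strip(), m.group(2).strip())
        for m in TYPED_LINK.finditer(wikitext)
    ]


text = (
    "The city of Moulis is located on the [[region::Medoc]] region. "
    "The building of the High [[Middle Ages]] was discovered in "
    "[[date::1993]] under the present [[building-type::parish church]]."
)

triples = extract_annotations("Moulis", text)
for triple in triples:
    print(triple)
```

The untyped link `[[Middle Ages]]` produces no triple, which mirrors the difference between the two versions of the page.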

Semantic MediaWiki facilitates data entry through the Semantic Forms extension². Moreover, the Semantic MediaWiki engine can load ontologies and present them as wiki pages. OntoWiki [11] has been developed with the main objective of facilitating the acquisition and presentation of data. It offers forms and includes the RDF triples directly in the text using an appropriate syntax. AceWiki [12] follows a different knowledge acquisition strategy using a controlled language, ACE (Attempto Controlled English). Sentences in ACE are automatically translated into OWL and/or SWRL. AceWiki integrates the OWL reasoner Pellet and ensures that the ontology is always consistent.

IkeWiki and KnowWE are two examples of the second category. IkeWiki [13] is a tool for collaborative knowledge management that requires a pre-existing ontology. The knowledge base is stored using the Jena RDF framework, and a SPARQL engine allows querying it. Two editors are available: one for metadata with an auto-completion mechanism and a WYSIWYG editor for content. In addition, IkeWiki supports importing existing content from Wikipedia. Similarly, KnowWE [14], built on top of JSPWiki, uses the Sesame RDF storage.

SweetWiki [9] allows users to tag pages (called social tagging) and also integrates external ontologies. The set of user tags generates a folksonomy. In addition, SweetWiki adds a WYSIWYG editor for managing content and metadata, and a reasoning engine used for querying the wiki content.

Some semantic wikis have been developed for specific domains, for example BOWiki for biomedicine and SWiM for mathematics. BOWiki [15] gives access to several ontologies such as the Gene Ontology and ontologies about cell types or anatomy. SWiM [16] extends IkeWiki, adding support for the OpenMath language, an XML language for expressing the logical structure of mathematical formulas.

² An extension for MediaWiki: http://www.mediawiki.org/wiki/Extension:Semantic_Forms

In short, the first category of semantic wikis can be used to present knowledge by structuring concepts through pages, categories and links. The second category of semantic wikis, based on pre-existing ontologies, can be used as a platform to build applications that require a global consensus over knowledge in order to maintain the quality of data.

2.4 Comparison

The annotation tools and structured content provided by template forms [17] found in semantic wikis place them between conventional wikis and databases. A semantic wiki is more structured than a wiki, but at the same time the structure is dynamic and extensible. In a database-centric approach, the database schema is built upon entities identified in the first step of analysis, and is thus based on a snapshot of knowledge. In domains characterized by a rapid evolution of knowledge, such as biology or archaeology, a static database schema is not suitable and can be ruled out by the cost of evolution. The process of generating the structure is also different. In a database, domain modeling is usually done by a few experts; in a semantic wiki it is a collaborative, dynamic and evolutionary process.

Compared to a traditional database, a semantic wiki makes it possible: 1) to expand the structure of document content; 2) to let a data model emerge from usage; and 3) to support collaborative, distributed workflows and processes (figure 2).

Figure 2. Semantic wikis between wikis and databases.

3 ARCHITECTURE OF WIKIBRIDGE

In a survey authored by Uren et al. [18], the authors study semantic annotations, identify a number of requirements, and review some semantic annotation systems. WikiBridge's design principles follow the seven requirements given by Uren et al.: easy-to-use interface, user collaborative design, support for different user skills, support for heterogeneous formats, compatibility with Semantic Web standards, annotation capabilities and storage, and support for reasoning. In the next subsection we develop the most important requirements with regard to the architectural design of WikiBridge.

3.1 Requirements

In a knowledge engineering process, it is common for non-technical domain experts to work together with experienced knowledge engineers. To support different levels of user skills, certain advanced functionalities should be hidden from novice users but made available to experienced users. Thus, we use an Access Control List (ACL) mechanism to describe privilege control depending on user identity and group affiliation. Advanced users can define forms to help users structure wiki articles; each part of a form automatically generates annotations. During the annotation process, a wizard suggests terms of the ontology according to the highlighted section in the form.

To be able to exchange data with other applications (e.g. ontology editors, Web services, other wikis), compliance with Semantic Web standards is required. WikiBridge is purely based on existing Semantic Web standards such as the Web Ontology Language OWL for describing ontologies and W3C's RDF for annotations.

We consider reasoning one of the most important functionalities as it makes it possible: 1) to bring out knowledge that is not explicit in the data; 2) to check the meaning of annotations with regard to the context of the annotation; and 3) to enhance navigation and search.

3.2 Architecture

One of the most famous semantic wikis is Semantic MediaWiki (SMW), which is based on MediaWiki [19]. In 2009, when we started the project, complex annotation and consistency checking were identified as mandatory functionalities. At that time, SMW did not provide complex annotation and did not have consistency checking in its roadmap.

We started by extending MediaWiki with the following semantic components: a form-based acquisition interface with automatic annotation, an annotation wizard, annotation validation based on the context of a document, semantic rules and a query engine.

Figure 3. WikiBridge's architecture.

Figure 3 presents the architecture of WikiBridge. The semantic components are structured in three layers; third-party components are in blue boxes, and brown boxes include Web services for managing interactions between inner services or for adding query capabilities for external applications.

The user interaction layer is covered by MediaWiki, and the structured data control needed for inexperienced users is managed by Semantic Forms. This extension provides advanced users with a specific description language for defining new forms (figure 4). Modules corresponding to the interaction layer are represented at the top of figure 3.

The semantic layer manages annotations and query processing. To improve the quality of information during the input and annotation processes, we propose three semantic components in WikiBridge Core (white boxes in figure 3) developed on top of third-party tools (RAP - RDF API for PHP, Pellet and Jena).



Figure 4. User interaction layer

The syntax and the semantics of annotations, made by experts, are guaranteed by the application ontology. An annotation wizard helps users to construct simple or complex annotations by selecting ontology terms in lists and giving them properties and values (figure 5). Ontology terms are retrieved from ontology concepts through SPARQL queries. Annotation construction is a context-sensitive process: the initial terms displayed to users are directly connected to the fields of forms. Each document is identified by its URL within the wiki, and annotations use this URL as a base for identifying fragments of content.

A simple annotation annotates a subject by describing one of its properties using a literal or a reference to an ontology term. A complex annotation annotates a subject with two or more simple annotations and references to other elements (subjects). For example, we can annotate an altar with its dimensions, its building material, and its location in the nave. The nave is detailed in another part of the document.

Figure 5. Annotation wizard.

Annotation consistency checking is performed by a set of specific Java components interacting with RAP, Pellet and Jena. Only a subset of first-order logic constraints is checked. The WikiBridge extension (top of figure 3) connects to the Java constraint checker by means of a Web service. Moreover, rules can be added to query the ontology and annotations in order to test new facts and thus to produce new knowledge that can be inserted in the set of semantic constraints. Two kinds of constraints can be checked by using the ontology knowledge: 1) domain values of properties using ABox capabilities; and 2) structural consistency of properties using TBox capabilities (for instance, a cathedral can have a nave but cannot have an atrium). Nevertheless, some domain-dependent constraints cannot be embedded in the structure. For example, "In France, there is no church with rammed earth wall for the studied period; however, this technique is used in Ireland and in other countries" can be translated into the following constraint, which must remain consistent:

$$
\mathit{hasRammedWall}(?x) \wedge \mathit{edificeType}(?x,?t) \wedge \mathit{edificeCountry}(?x,?c) \wedge c = \text{'France'} \Rightarrow t \neq \text{'church'}
$$
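The rammed earth constraint can be read as a rule over the annotations of a single edifice. The following Python sketch checks it on a few hypothetical records; it is not the Java constraint checker nor the Jena rule used by WikiBridge, and the dictionary layout and the function `violates_rammed_wall_constraint` are assumptions introduced only for illustration.

```python
def violates_rammed_wall_constraint(edifice):
    """True if the edifice is a church in France annotated with a rammed earth wall."""
    return (
        edifice.get("hasRammedWall", False)
        and edifice.get("edificeCountry") == "France"
        and edifice.get("edificeType") == "church"
    )


# Hypothetical annotated edifices (identifiers and values are illustrative).
edifices = [
    {"id": "x1", "hasRammedWall": True, "edificeType": "church", "edificeCountry": "France"},
    {"id": "x2", "hasRammedWall": True, "edificeType": "church", "edificeCountry": "Ireland"},
    {"id": "x3", "hasRammedWall": False, "edificeType": "church", "edificeCountry": "France"},
]

inconsistent = [e["id"] for e in edifices if violates_rammed_wall_constraint(e)]
print(inconsistent)
```

Only the first record is reported: the Irish church is allowed to have a rammed earth wall, and the third record has no such wall.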

The persistence layer includes four types of storage (bottom of figure 3): document content, semantic annotations, ontology, and constraints. The content of documents is stored in the MediaWiki-specific database. Annotations are stored as triples in the RAP triple store and can be retrieved by the WikiBridge user layer, to display annotations with icons and colors in a document, or by the SPARQL query engine.

The ontology, imported as an OWL file, is stored in a specific schema managed by RAP. Ontology terms can then be queried using SPARQL, and the results can generate wiki pages.

Constraints are stored in plain text using the Jena rules syntax. A type attribute specifies whether the rule is applied to check the ontology structure or to check annotation consistency.

Information access has been designed taking into account some features of the users. We have thus identified a usage typology according to 1) the kind of usage (reader, investigator, and annotator); 2) the degree of domain knowledge (domain specialists such as historian researchers, and non-specialists).

Figure 6. Query interface.

To handle these different types of users, we offer three types of queries:
- Faceted browsing allows users to explore by filtering available information with the ontology structure (figure 6);
- Form-based searching provides semantic search by filling in parameters of parametric queries identified during the analysis of requirements;
- Aggregate view: for each article, all annotations related to the article are displayed in a factbox.

Nevertheless, all types of queries rely on the SPARQL query engine, which also makes it possible to process inline queries in wiki pages in order to summarize information.

To perform spatial and temporal analysis on annotations, a set of Web services has been developed. Some specific services allow retrieving objects and their coordinates according to a set of conjunctive properties (figure 7). Moreover, a generic Web service has been developed to handle SPARQL queries.
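The retrieval of objects and their coordinates according to a set of conjunctive properties can be sketched as a filter over annotation triples. The fragment below is an in-memory illustration only, not the WikiBridge Web service: the function names, the `edifice/...` identifiers and the placeholder coordinates are all assumptions.

```python
def find_subjects(triples, required):
    """Return the subjects that carry every (predicate, object) pair in `required`."""
    required = set(required)
    matched = {}
    for s, p, o in triples:
        if (p, o) in required:
            matched.setdefault(s, set()).add((p, o))
    return sorted(s for s, pairs in matched.items() if pairs == required)


def objects_with_coordinates(triples, coordinates, required):
    """Pair each matching subject with its coordinates, when they are known."""
    return [(s, coordinates[s]) for s in find_subjects(triples, required) if s in coordinates]


# Hypothetical annotations and placeholder coordinates.
triples = [
    ("edifice/A", "edificeType", "church"),
    ("edifice/A", "edificeCountry", "France"),
    ("edifice/B", "edificeType", "church"),
    ("edifice/B", "edificeCountry", "Ireland"),
    ("edifice/C", "edificeType", "baptistery"),
    ("edifice/C", "edificeCountry", "France"),
]
coordinates = {"edifice/A": (0.0, 0.0), "edifice/B": (1.0, 1.0), "edifice/C": (2.0, 2.0)}

result = objects_with_coordinates(
    triples, coordinates, [("edificeType", "church"), ("edificeCountry", "France")]
)
print(result)
```

In a deployed setting the same conjunction would more naturally be expressed as a SPARQL query with one triple pattern per required property.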



Figure 7. OpenLayers interface interacting with a Web service of WikiBridge.

4 ANNOTATION MODEL

In this section we describe the annotation model used in WikiBridge and then compare WikiBridge capabilities with other annotation-based systems according to Oren's criteria.

4.1 WikiBridge annotation model

Three kinds of annotation can be set in WikiBridge: automatic annotations are provided by template forms, assisted annotations are set by users using the annotation wizard, and fully manual annotations can be written by experienced users directly in documents using the wiki syntax. These three kinds of annotation share the same basic construct:

A = (s, p, o)

where s is the subject (the annotated data), p is the predicate (the type of the link) and o is the object (the annotating data). The types of each component are the following:
- s is a URI/URL that refers to the document (i.e. an article in the wiki or a part of an article);
- p is a URI that refers to an ontology concept or property;
- o is a literal, a URI that refers to an individual in the ontology, a URI outside the scope of the wiki, or null.

Figure 8. Construction of recursive, n-ary annotation with the annotation wizard.

Our model of annotation allows defining three basic structures of annotation: simple, complex, and recursive.
- A simple annotation has the type (s,p,o). s and p cannot be null. If o is null and p refers to a concept, the annotation specifies the type of the subject. It can be viewed as a constraint that is a restriction of an attribute domain. If o is not null, it must refer to a literal or to an individual that belongs to the concepts specified by p. It can be viewed as a database constraint that checks that an attribute value is in an enumerated list of values;
- A complex or n-ary annotation (noted A-cplx in table 2) is a list of simple annotations related to the same subject (figure 8). All the predicates used in the list must be different;
- A recursive annotation (noted A-rec in table 2) is an annotation based on a previous one, used to give details on the object. A recursive annotation has different levels. An annotation of level i explains the object o of the parent annotation (i.e. from level i-1). If all the annotations of level i are in a list of annotations, then all annotations in the list share the same subject (o). The recursive form of annotation is based on the semantic value model defined by Sciore and Rosenthal in [20].
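The three structures can be sketched with a small Python data model. This is an illustration under stated assumptions, not WikiBridge's implementation: the class `Annotation`, the helpers `complex_annotation` and `add_detail`, the fragment URLs and the `care:Nave` concept name are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Annotation:
    subject: str                 # s: URI/URL of an article or of a part of it
    predicate: str               # p: ontology concept or property
    obj: Optional[str] = None    # o: literal, individual, external URI, or None (null)
    # Level i+1 annotations that explain `obj` (recursive annotation).
    details: List["Annotation"] = field(default_factory=list)


def complex_annotation(subject, pairs):
    """Build an n-ary annotation: simple annotations sharing one subject."""
    predicates = [p for p, _ in pairs]
    if len(set(predicates)) != len(predicates):
        raise ValueError("all predicates of a complex annotation must be different")
    return [Annotation(subject, p, o) for p, o in pairs]


def add_detail(parent, predicate, obj=None):
    """Attach a level i+1 annotation whose subject is the parent's object."""
    if parent.obj is None:
        raise ValueError("a null object cannot be detailed")
    child = Annotation(parent.obj, predicate, obj)
    parent.details.append(child)
    return child


# An altar annotated with its material and its location in the nave.
altar = complex_annotation(
    "wiki/Moulis#altar",
    [("material", "stone"), ("locatedIn", "wiki/Moulis#nave")],
)
# Recursive step: the nave (object of the second annotation) is typed by a concept.
nave_type = add_detail(altar[1], "care:Nave")
print(nave_type)
```

Here `nave_type` has a null object and a concept as predicate, so it states the type of its subject, as described for simple annotations.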



According to the model, the abstract syntax depicted in table 2 is used in the wiki annotation syntax.

annotation := A-simple | A-cplx | A-rec
A-simple   := (s, p, o)
A-cplx     := (..., ...)
A-rec      := ((..., ..., ...))
s          := URI | URL
p          := ontology concept | ontology property
o          := ontology individual | literal | URI | URL | null

Table 2. A simplified version of the abstract syntax used for WikiBridge's annotations.

An annotation is defined with regard to a context. The context has two dimensions: the common knowledge dimension (i.e. the structure of the ontology and the rules over the concepts) and the local knowledge dimension (i.e. the set of all the annotations and sub-annotations which share the same root subject). An annotation is valid only if it is consistent in the two contexts.

Annotation consistency in the common knowledge dimension is checked by evaluating the ontology's constraints instantiated by annotations (an example is given in section 3.2). The annotation is inconsistent if it violates a constraint that uses one of the predicates or objects of the annotation.

Annotation consistency in the local knowledge dimension is checked by evaluating intrinsic annotation model constraints. For example, a subject cannot be associated with two different objects using the same predicate.
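The local constraint given as an example (one object per subject and predicate) can be checked with a single pass over the annotations sharing a root subject. The sketch below is a minimal illustration; the function `local_conflicts` and the sample triples are assumptions, not WikiBridge code.

```python
def local_conflicts(triples):
    """Return (subject, predicate, first_object, other_object) for each clash."""
    seen = {}
    conflicts = []
    for s, p, o in triples:
        key = (s, p)
        if key in seen and seen[key] != o:
            conflicts.append((s, p, seen[key], o))
        else:
            seen.setdefault(key, o)
    return conflicts


# Hypothetical annotations of one document fragment.
annotations = [
    ("wiki/Moulis#altar", "material", "stone"),
    ("wiki/Moulis#altar", "material", "wood"),    # clashes with the line above
    ("wiki/Moulis#altar", "locatedIn", "wiki/Moulis#nave"),
    ("wiki/Moulis#altar", "material", "stone"),   # exact repetition, no clash
]

conflicts = local_conflicts(annotations)
print(conflicts)
```

An annotation would then be accepted only if this local check and the ontology-level constraints both succeed.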

4.2 Evaluation Criteria

Oren et al. [4] have combined several criteria from the literature as well as their own criteria to classify annotation systems precisely using six properties:
1. Association refers to the way the annotation is associated with the annotated resource (embedded or externally stored);
2. Subject granularity indicates the scope of the annotation (a whole document, a section, a sentence, a word, a numerical value, etc.);
3. Representation distinction indicates whether the annotation can distinguish objects or concepts in a document, using a reference system, from values;
4. Terminology reuse is related to the level of interoperability of annotations and indicates whether the annotations are ad hoc or use terms from ontologies;
5. Object type indicates the type of the annotation (literal, textual, structured, ontological);
6. Context indicates metadata for an annotation (when it was made, by whom, provenance, etc.).

Property \ System  | Annotated DBs                        | Semantic Wikis              | WikiBridge
Association        | embedded                             | current page embedded       | current page embedded and triple store
Granularity        | relation, tuple, attribute and value | document and fragment       | document and fragment
Representation     | yes/no                               | yes/no                      | yes
Terminology reuse  | yes/no                               | yes/no                      | yes
Object type        | literal                              | literal, ontology term, URI | literal, ontology term, URI
Context            | no/yes                               | no/yes                      | yes

Table 3. Comparison of WikiBridge with well-established annotation systems.



A comparison between annotated databases, semantic wikis and WikiBridge is shown in table 3. It shows that WikiBridge provides a better representation of semantics with levels of granularity and context representation. Furthermore, WikiBridge supports strictly typed annotations through association, representation and object type, which enables consistency checking.

5 PROJECT OVERVIEW

The aim of the international project CARE (Corpus Architecturae Religiosae Europeae) is to set up a corpus describing Christian edifices in Europe. Italy, Spain, Czech Republic, Poland, Slovakia, France and Croatia joined the project four years ago. Each edifice is described in a document that focuses on the definition of states of evolution from the 4th century to the 11th century. The French corpus focuses on the 7th and 8th centuries, with very rich decades in terms of number of monuments (http://care.u-bourgogne.fr).

Archaeology is the science of destruction, in which the process of excavation definitively removes the sedimentary and cultural context of artifacts and architecture. Thus, the accurate recording of contexts and artifacts is crucial. Representing and managing knowledge in cultural heritage requires a deep understanding of specific concepts. Building a collaborative platform brings out some challenging characteristics: 1) complexity of data (heterogeneous, incomplete, uncertain, inconsistent, spatial, and temporal); 2) domain knowledge barrier; 3) evolving knowledge; and 4) skills of actors.

5.1 The CARE Community

From an organizational perspective, the CARE project takes the form of an expert network collecting and providing information on edifices, analyzing historical sources, filling in documents and collaborating in the exploitation of the corpus through smaller research groups. Additionally, the project involves undergraduate students who help in collecting information but lack the expertise required to interpret complex data.

In France, more than sixty researchers are collecting and analyzing data concerning approximately 2700 monuments.

Two key characteristics outline the CARE community:
- Multi-disciplinarity: The data collecting process involves archaeologists, historians, art historians, topographers, and draftsmen. It is designed as a collaborative process which merges information from various disciplines;
- Inter-disciplinarity: The interpretation of data brings together all the actors, who also enrich their respective practices through the confrontation of methods or problems.

5.2 Conceptual Modeling for the Foundational Knowledge

Linster [21] shows that the interaction among domain experts, knowledge engineers and tools creates knowledge. In addition, he has shown that the process of elaborating a knowledge-based system is a constructive model-building process that includes a discussion process between knowledge engineers and domain experts as well as the construction of a conceptual model (i.e. a general and abstract framework). Thus, the knowledge engineering activity encompasses the design of two kinds of models: models to make sense and models to implement systems.

We have applied Linster's guidelines to the CARE project in order to initiate a foundational knowledge from the corpus of documents. The first stage is the identification of salient concepts. The key concept is the edifice, whose changes it is essential to model. All constituent elements of a building need to be described. They can delimit space or define a religious function (baptismal, funerary, etc.). Any change of space or religious function determines a new edifice state. Edifices and their evolutions are described in a set of documents. Salient concepts allow us to build a conceptual model in which three groups of elements have been identified (figure 9):

1. Spatial concepts without temporal relationship (light grey): the concept EGS refers to an edifice, a group of edifices or a space inside an edifice such as a nave or apse. Composition relationships can be identified between EGS;

2. Spatio-temporal concepts (grey), called SEGS, represent variations of spatial concepts in time. Spatio-temporal concepts are linked to a date or a period. Dating elements can be determined by documents, or described by methods such as C14, thermoluminescence or stratigraphy;

3. Vocabulary terms for the project domain (black). While the concepts EGS and SEGS are used to structure the descriptions, terms are used to describe specific elements including properties of edifices, religious functions or manufacturing techniques.

Figure 9. Conceptual model of the CARE corpus.

The description of the design of the ontology is detailed in the next section.

5.3 Offering Semantic Tools For Archaeology: The CARE Ontology

Cultural heritage collections can be annotated with different thesauri. About twenty thesauri are described at the page http://tinyurl.com/5u8bjer.

The CIDOC Conceptual Reference Model (CRM) provides an extensible ontology for concepts and relationships in the cultural heritage domain (http://www.cidoc-crm.org). Since 2006, it has been an ISO standard (21127:2006) for the exchange of cultural heritage information. CIDOC-CRM aims at treating all types of material collected and/or displayed by museums: sites, monuments as well as collections of fine and applied arts. It is intended to encompass the detailed description both of individual materials and of groups of materials as a whole. It also covers contextual information: the historical and geographical context in which materials are placed and which gives them much of their significance and value. The event notion in CIDOC-CRM is represented by the concept of event. An event describes the environment of the material over time and what could happen to it. This central notion is complemented by: 1) the TimeSpan concept, which describes the moment it happened; 2) the concept of Place; 3) who did it (Actor); and 4) what is being described. CRM offers notions of Physical Objects, which can be natural or have been manufactured by man, and Conceptual Objects.

Since CRM is a reference in the field, we use it as a starting point to establish the CARE ontology. Viewing the CARE ontology as a specialization of CRM allows us to relate to a standard and comply with it. The archaeologist uses two sources of information. First, there are material data such as pottery, clothing and architectural items like walls, columns and floors. Second, there are descriptive data that help to associate material data found in a location with their context. Descriptive data can be measurements (height, width and length but also mass, density), direction, and association. Association concerns the position of the material data in relation to its surroundings. Accordingly, the CARE ontology has four parts: 1) religious concepts, 2) their spatial relationships, 3) their characteristics and 4) a timeline to track evolutions (figure 10). Each sub-tree from THING may be considered as an ontology of a particular domain. We call a branch of the ontology an is_a hierarchy with one root. To cover a wider field of knowledge, it is necessary to consider the relationships between several sub-trees. Grenon et al. [22] propose the definition of three kinds of relationships:
- Intra-ontology: a relationship between two concepts of the same part of an ontology;
- Trans-ontology: a relationship between a concept of one sub-tree and a concept of another sub-tree. For instance, a building is consecrated to a saint; in DL we can write:

Building ⊑ ∃isConsecrated.Saint

• Meta-ontology: a relationship between a concept of an ontology and another ontology (considered as a whole).
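The difference between intra-ontology and trans-ontology relationships can be illustrated with a small sketch. The is_a fragment below is hypothetical (the concept names ReligiousConcept, Altar and Actor, and both helper functions, are assumptions made for illustration and are not taken from the actual CARE ontology). A relationship is classified by comparing the sub-tree roots of its two concepts; meta-ontology relationships, which involve an ontology taken as a whole, are not covered.

```python
# Hypothetical is_a fragment (child -> parent); THING is the common root.
# These names are illustrative assumptions, not the real CARE hierarchy.
IS_A = {
    "ReligiousConcept": "THING",
    "Building": "ReligiousConcept",
    "Altar": "ReligiousConcept",
    "Actor": "THING",
    "Saint": "Actor",
}


def branch_root(concept: str) -> str:
    """Return the root of the sub-tree (direct child of THING) holding `concept`."""
    current = concept
    while IS_A[current] != "THING":
        current = IS_A[current]
    return current


def relationship_kind(source: str, target: str) -> str:
    """Classify a relationship between two concepts by their sub-trees."""
    if branch_root(source) == branch_root(target):
        return "intra-ontology"
    return "trans-ontology"


# A building consecrated to a saint links two different sub-trees.
kind_consecrated = relationship_kind("Building", "Saint")
# A building and an altar belong to the same sub-tree.
kind_contains = relationship_kind("Building", "Altar")
```

With this fragment, the isConsecrated example links the sub-tree of religious concepts to the sub-tree of actors, which is what makes it a trans-ontology relationship.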

Figure 10. Branches of the CARE ontology (the CIDOC-CRM ontology is shown in blue; concepts with the EXX prefix are CIDOC-CRM concepts).

Modeling religious concepts

Religious concepts in CARE are edifices, with their decomposition into constituent elements (nave, transept, apse, etc.), liturgical installations (altar, ambo, ciborium, etc.) and burials. These concepts have been placed under the concept E24 Physical Man-Made Thing of CIDOC-CRM. Indeed, CIDOC-CRM defines this concept as all persistent physical items that are purposely created by human activity.

Modeling characteristics of an edifice

To detail the parts of an edifice, we introduced the concept of architectural elements. It describes masonry, floors, openings and inscriptions, as well as construction techniques, dimensions, colors, etc.

Modeling spatial relationships in archaeology

The geometry underlying the analyzed textual descriptions is complex. Indeed, these descriptions do not refer to an absolute and orthonormal space: they rather refer to a perceived or cognitive space whose structure is largely based on the functional aspects and the objects described, and on the perspective of the archaeologist. From the analysis of textual descriptions of religious concepts (descriptions of position and shape) given by archaeologists, we have found four types of spatial properties:

1. Orientation properties: forward, back, bottom, next to, on one side, under, below, at a lower level, at the same level, above, right, left, center, prior, and the cardinal directions;

2. Boundary properties: outside, inside;

3. Distance properties: near, far, next to, around;

4. Topological properties: flank, join, open on the side, link to, stand against, surround, isolated. To represent the topological properties, we used the work of Egenhofer and Herring [23]. The authors have defined a minimum set of eight relations (disconnected, externally connected, partially overlap, equal, tangential proper part, non-tangential part, tangential proper part inverse, non-tangential part inverse) describing the relations between two regions. This set serves as a basis for defining other topological relations by composition.
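As an illustration of how the four types of spatial properties could be used when annotating a description, the sketch below stores the vocabulary listed above in a dictionary and looks up the property types of a term. The data structure and the function name are assumptions made for this example, and the cardinal directions are expanded to north, south, east and west. Note that a term such as "next to" appears in two lists, so the lookup returns every matching type.

```python
from typing import Dict, List, Set

# Vocabulary taken from the four lists above; the dictionary layout is an
# assumption made for this sketch.
SPATIAL_PROPERTIES: Dict[str, Set[str]] = {
    "orientation": {
        "forward", "back", "bottom", "next to", "on one side", "under",
        "below", "at a lower level", "at the same level", "above", "right",
        "left", "center", "prior", "north", "south", "east", "west",
    },
    "boundary": {"outside", "inside"},
    "distance": {"near", "far", "next to", "around"},
    "topological": {
        "flank", "join", "open on the side", "link to", "stand against",
        "surround", "isolated",
    },
}


def property_types(term: str) -> List[str]:
    """Return, in alphabetical order, every property type listing `term`."""
    key = term.strip().lower()
    return sorted(
        kind for kind, terms in SPATIAL_PROPERTIES.items() if key in terms
    )


types_next_to = property_types("next to")
types_inside = property_types("Inside")
types_unknown = property_types("diagonal")
```

A term that returns an empty list is outside the vocabulary and would need to be reviewed before being accepted as an annotation.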

Modeling temporal relationships to track evolutions

When writing an excavation report, the archaeologist graphically summarizes the results obtained with a timeline that is often organized by anterior/posterior relationships: materials are considered in relation to each other. In the CARE project, the time model is based on the following criteria: some absolute benchmarks and a relative chronology based on intervals. We have established a convention for century division and boundaries. Centuries start at year 1 and end at year 100. We have also established subdivision terms such as early (1-32), mid (33-66) and late (67-100). These century divisions are placed under E52 TimeSpan. CIDOC-CRM offers specific concepts related to time [24, 25], and some of Allen's relationships [26] are used as properties. The CARE project aims to follow the evolutions (creation, modification, deletion) of an edifice and its constituent elements. The concept of activity is important for the CARE project because a state may be characterized by an activity in the CIDOC-CRM ontology. The concept E7 Activity is defined as follows: "The action or sequence of actions intentionally carried out by Actors that result in changes of state in the cultural, social, material systems of interest. This notion includes both complex and long-lasting actions such as building an edifice, as well as simple and short-lived actions." Following the concepts of the CIDOC-CRM ontology, we use seven concepts, specializations of E7 Activity, to model the states of edifices: E6 Destruction, E11 Modification, E81 Transformation, E63 Beginning of Existence, E64 End of Existence, E79 Part Addition and E80 Part Removal.

The CARE ontology has been designed using Protégé and currently encompasses 124 classes and 715 individuals.
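The century convention used in the time model (centuries running from year 1 to year 100, with early, mid and late subdivisions) can be made concrete with a short sketch. The function names are assumptions made for this example, only centuries of the common era are handled, and the discrete reading of Allen's "before" and "meets" relations over inclusive year intervals is our own simplification.

```python
from typing import Optional, Tuple

Interval = Tuple[int, int]  # inclusive (first year, last year)

# Subdivision bounds within a century, as given in the text.
SUBDIVISIONS = {"early": (1, 32), "mid": (33, 66), "late": (67, 100)}


def century_interval(century: int, part: Optional[str] = None) -> Interval:
    """Year interval of a century, or of its early/mid/late subdivision."""
    if century < 1:
        raise ValueError("only centuries of the common era are handled")
    first, last = SUBDIVISIONS[part] if part else (1, 100)
    base = (century - 1) * 100
    return (base + first, base + last)


def before(a: Interval, b: Interval) -> bool:
    """Allen's 'before' on inclusive years: a gap separates a from b."""
    return a[1] + 1 < b[0]


def meets(a: Interval, b: Interval) -> bool:
    """Allen's 'meets' on inclusive years: b starts the year after a ends."""
    return a[1] + 1 == b[0]


early_5th = century_interval(5, "early")
late_5th = century_interval(5, "late")
whole_6th = century_interval(6)
```

Under this convention, the early fifth century is the interval 401 to 432, and the late fifth century meets the sixth century, which starts in 501.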

6 RELATED WORKS

Several semantic wikis have been developed or used specifically for cultural heritage applications.



Witte et al. [27] present an approach to cultural heritage data management which integrates different technologies: a wiki user interface, text mining support using a Natural Language Processing (NLP) framework, and ontologies based on OWL and RDF. The authors have implemented these ideas for the German Handbuch der Architektur, a comprehensive multi-volume encyclopedia of architecture. One volume (506 pages) of the encyclopedia was converted into wiki pages. The authors capture two sub-domains with ontologies: the domain of document management (e.g. sentence, noun, page number) and the architectural domain (e.g. wall, building material). NLP makes it possible to connect architectural concepts with document-specific ones, e.g. sentences that mention construction elements of a certain material. A public version is available at http://www.semanticsoftware.info/durm.

The HermesWiki [28] is a German-language semantic wiki in the historical domain. Its main objective is to provide an overview of Ancient Greek history for teaching undergraduate students. The wiki consists of three parts: a collection of about twenty essays giving a comprehensive domain walk-through, translations of the corresponding ancient sources, and a glossary. The entries in the glossary are tagged. It has been implemented as a plugin for KnowWE, reusing as much of the core components as possible. A public version is available at http://hermeswiki.informatik.uni-wuerzburg.de.

NavEditOW is a framework for ontology-driven web sites. It has been used to support the semantic description of two projects. The first is a web portal and a set of advanced services supporting the sharing of knowledge about Prehistory and Protohistory in the Italian context [29]. In particular, one of the services is a digital library in which entries (i.e. bibliographic descriptions of publications) are ontologically described. The system is currently online at http://www.archeoserver.it. The second is the SilkRoDE (Silk Roads in the Digital Area) project, which aims to collect, structure and disseminate knowledge about the Cultural Heritage of Central Asia from fields such as archaeology, geography or history [30]. The ontological approach provides the expressiveness and flexibility required to support rich forms of navigation among stored contents. The framework integrates a wiki engine for rendering documents stored in the ontological tier. In the same vein, MANTIC is a web application that provides a portal for archaeological information about the city of Milan [31]. MANTIC integrates different data sources, and its global schema is based on CIDOC-CRM.

Our semantic wiki approach is directed towards scientific application domains that contribute to producing knowledge [32]. These kinds of applications rely on core ontologies that act as a consensus. Querying and analyzing data enhance knowledge: new concepts can emerge and new constraints can be discovered. As a result, ontologies can be modified dynamically, and semantic checks are necessary to find annotations that are inconsistent with the ontology.
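To make the idea of a semantic check more tangible, here is a minimal sketch of how annotations could be re-validated after an ontology change. It is not the WikiBridge implementation: the triple representation, the function name and the example data are all assumptions, and the check compares domain and range by exact match, without any subsumption reasoning.

```python
from typing import Dict, List, Set, Tuple

Annotation = Tuple[str, str, str]  # (subject concept, property, object concept)
Signature = Tuple[str, str]        # (expected domain, expected range)


def find_inconsistent(
    annotations: List[Annotation],
    concepts: Set[str],
    signatures: Dict[str, Signature],
) -> List[Tuple[Annotation, str]]:
    """Return each annotation that conflicts with the ontology, with a reason."""
    problems = []
    for annotation in annotations:
        subject, prop, obj = annotation
        if prop not in signatures:
            problems.append((annotation, "unknown property"))
        elif subject not in concepts or obj not in concepts:
            problems.append((annotation, "unknown concept"))
        elif signatures[prop] != (subject, obj):
            problems.append((annotation, "domain/range mismatch"))
    return problems


# Hypothetical ontology state and annotations, for illustration only.
concepts = {"Building", "Saint", "Altar"}
signatures = {"isConsecrated": ("Building", "Saint")}
annotations = [
    ("Building", "isConsecrated", "Saint"),
    ("Building", "isConsecrated", "Altar"),
    ("Building", "isConsecrated", "Martyr"),
    ("Building", "hasCrypt", "Saint"),
]
problems = find_inconsistent(annotations, concepts, signatures)
```

Running such a check each time the ontology is modified would flag the annotations that need to be revised by a domain expert, rather than silently leaving them in the wiki.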

7 CONCLUSION



Wiki solutions meet the requirements of a web platform with collaborative capabilities. Easy setup and rich editing support are primary reasons for the widespread adoption of wikis. Users can enter text and other types of data (pictures, video) and connect content through hyperlinks. Most wikis also provide a versioning system to track content changes and a full-text search engine for querying wiki pages. The narrative structure is one advantage of the document-centric approach of wikis compared to a database-centric approach. Nevertheless, a mere document management system is not sufficient to capture interdependent structures of knowledge. Adding semantic annotation capabilities to documents allows different levels of interpretation and can sustain: 1) knowledge evolution, by keeping track of the successive annotations; 2) better quality in the query evaluation process; and 3) results displayed in a form adapted to user skills. Annotations can be defined at a coarse-grained level or at a fine-grained level. An ontology must be associated with the annotation system to provide semantics for annotation terms according to domain knowledge.

Semantic wiki solutions meet the requirements of an annotation system and of knowledge description. Adding semantics yields two dimensions of enhancement to a wiki: 1) adding a more formal structure to the wiki; 2) exporting, integrating and reusing information through the adoption of standard semantic technologies. WikiBridge thus seems to combine the best of two worlds: structure from databases, and expandability and collaboration capabilities from wiki systems.

ACKNOWLEDGMENTS

This research is supported by French ANR funding (ANR-07-CORP-011).

8 REFERENCES

1. Galperin, M., Cochrane, G.: The 2011 Nucleic Acids Research Database Issue and Online Database Collection. In Nucleic Acids Research, vol. 39 (Database issue), pp. 1--6, (2011).

2. Newman, H., Ellisman, M., Orcutt, J.: Data-intensive e-science frontier research. In Communications of the ACM, vol. 46(11), pp. 68--77, (2003).

3. Petrovic, V., Gidding, A., Wypych, T., Kuester, F., DeFanti, T., Levy, T.: Dealing with Archaeology's Data Avalanche. In Computer, vol. 44(7), pp. 56--60, IEEE, (2011).

4. Oren, E., Möller, K., Scerri, S., Handschuh, S., Sintek, M.: What are Semantic Annotations? Technical report, DERI Galway, (2006).

5. Chiticariu, L., Tan, W., Vijayvargiya, G.: DBNotes: A Post-It System for Relational Databases based on Provenance. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 942--944, (2005).

6. Geerts, F., Kementsietsidis, A., Milano, D.: MONDRIAN: Annotating and Querying Databases through Colors and Blocks. In Proceedings of the 22nd International Conference on Data Engineering (ICDE), pp. 82--91, (2006).

7. Eltabakh, M., Aref, W., Elmagarmid, A., Ouzzani, M., Silva, Y.: Supporting Annotations on Relations. In 12th International Conference on Extending Database Technology (EDBT), pp. 379--390, (2009).

8. Buneman, P., Cheney, J., Tan, W., Vansummeren, S.: Curated databases. In PODS, pp. 1--12, (2008).

9. Buffa, M., Gandon, F., Erétéo, G., Sander, P., Faron, C.: SweetWiki: A semantic wiki. In Journal of Web Semantics, vol. 6(1), pp. 84--97, (2008).

10. Krötzsch, M., Vrandecic, D., Völkel, M.: Semantic MediaWiki. In 5th International Semantic Web Conference, pp. 935--942, (2006).

11. Auer, S., Dietzold, S., Riechert, T.: OntoWiki - A Tool for Social, Semantic Collaboration. In 5th International Semantic Web Conference, pp. 736--749, (2006).



12. Kuhn, T.: How Controlled English can Improve Semantic Wikis. In 4th Semantic Wiki Workshop (SemWiki), (2009).

13. Schaffert, S.: IkeWiki: A Semantic Wiki for Collaborative Knowledge Management. In 15th IEEE International Workshops on Enabling Technologies: Infrastructures for Collaborative Enterprises (WETICE), pp. 388--396, (2006).

14. Reutelshöfer, J., Haupt, F., Lemmerich, F., Baumeister, J.: An Extensible Semantic Wiki Architecture. In 4th Semantic Wiki Workshop (SemWiki), (2009).

15. Hoehndorf, R., Bacher, J., Backhaus, M., Gregorio, G., Loebe, F., Prüfer, K., Uciteli, A., Visagie, J., Herre, H., Kelso, J.: BOWiki: an ontology-based wiki for annotation of data and integration of knowledge in biology. In Proceedings of BMC Bioinformatics, vol. 10(s-5), (2009).

16. Lange, C.: SWiM - A Semantic Wiki for Mathematical Knowledge Management. In 5th European Semantic Web Conference (ESWC), pp. 832--837, (2008).

17. Haake, A., Lukosch, S., Schümmer, T.: Wiki-templates: adding structure support to wikis on demand. In International Symposium on Wikis, pp. 41--51, (2005).

18. Uren, V., Cimiano, P., Iria, J., Handschuh, S., Vargas-Vera, M., Motta, E., Ciravegna, F.: Semantic Annotation for Knowledge Management: Requirements and a Survey of the State of the Art. In Web Semantics: Science, Services and Agents on the World Wide Web, vol. 4(1), pp. 14--28, (2006).

19. Völkel, M., Krötzsch, M., Vrandecic, D., Haller, H., Studer, T.: Semantic wikipedia. In Proceedings of the 15th international conference on World Wide Web, pp. 585--594, (2006).

20. Sciore, E., Siegel, M., Rosenthal, A.: Using semantic values to facilitate interoperability among heterogeneous information systems. In ACM Trans. Database Syst., vol. 19(2), pp. 254--290, (1994).

21. Linster, M.: Linking Modeling to Make Sense and Modeling to Implement Systems in an Operational Modeling Environment. In EKAW, pp. 55--74, (1992).

22. Grenon, P., Smith, B.: SNAP and SPAN: Upper level ontology and framework for formalization. In Spatial cognition and computation, vol. 4(1), (2004).

23. Egenhofer, M., Herring, J.: Categorizing Binary Topological Relations Between Regions, Lines and Points in Geographic Databases. Technical report, National Center for Geographic Information and Analysis, CA, (1991).

24. Binding, C.: Implementing Archaeological Time Periods Using CIDOC CRM and SKOS. In Proceedings of the 7th Extended Semantic Web Conference (ESWC), pp. 273--287, (2010).

25. Doerr, M., Kritsotaki, A., Stead, S.: Thesauri of Historical Periods - A Proposal for Standardization. In Proceedings of CIDOC Conference, (2005).

26. Allen, J. F.: Maintaining Knowledge about Temporal Intervals. In Communications of the ACM, vol. 26(11), pp. 832--843, (1983).

27. Witte, R., Krestel, R., Kappler, T., Lockemann, P.C.: Converting a Historical Architecture Encyclopedia into a Semantic Knowledge Base. In IEEE Intelligent Systems, vol. 25(1), pp. 58--67, (2010).

28. Reutelshoefer, J., Lemmerich, F., Baumeister, J., Wintjes, J., Haas, L.: Taking OWL to Athens: Semantic Web technology takes Ancient Greek history to students. In Proceedings of the 7th Extended Semantic Web Conference (ESWC), pp. 333--347, (2010).

29. Bonomi, A., Mantegari, G., Vizzari, G.: A Framework for Ontological Description of Archaeological Scientific Publications. In SWAP, (2006).

30. Bonomi, A., Mosca, A., Palmonari, M., Vizzari, G.: Integrating a Wiki in an Ontology Driven Web Site: Approach, Architecture and Application in the Archaeological Domain. In Proceedings of the Third Workshop on Semantic Wikis: The Wiki Way of Semantics (SemWiki), (2008).

31. Mantegari, G., Palmonari, M., Vizzari, G.: Rapid Prototyping a Semantic Web Application for Cultural Heritage: The Case of MANTIC. In Proceedings of the 7th Extended Semantic Web Conference (ESWC), pp. 406--410, (2010).

32. Leclercq, E., Savonnet, M.: Access and annotation of archaelogical corpus via a semantic wiki. In Fifth Workshop on Semantic Wikis - Linking Data and People (Semwiki), (2010).