
    International Journal of Digital Information and Wireless Communications (IJDIWC) 1(1): 230-246. The Society of Digital Information and Wireless Communications, 2011 (ISSN 2225-658X)


    Structured Wiki with Annotation for Knowledge Management: an Application to Cultural Heritage

    Éric Leclercq and Marinette Savonnet

    LE2I Laboratory UMR CNRS 5158 - Université de Bourgogne, 9, Avenue Alain Savary - 21078 Dijon, [email protected], [email protected]

    ABSTRACT

    In this paper, we highlight how semantic wikis can be relevant solutions for building cooperative data-driven applications in domains characterized by a rapid evolution of knowledge. We point out the semantic capabilities of annotated databases and structured wikis to provide better quality of content, to support complex queries and, finally, to support different types of users. Then we compare database application development with wikis for domains that encompass evolving knowledge. We detail the architecture of WikiBridge, a semantic wiki that integrates template forms and allows complex annotations as well as consistency checking. We describe the archaeological CARE project and explain the conceptual modeling approach. A specific section is dedicated to ontology design, which is the compulsory foundational knowledge for the application. We finally report related work on the use of semantic wikis for archaeological projects.

    KEYWORDS

    Annotated database, Cultural heritage application, eScience, Ontology engineering, Semantic wiki.

    1 INTRODUCTION

    Scientists produce more and more data. For example, Galperin and Cochrane [1] counted 1330 databases containing more than two petabytes of data covering different aspects of cell and molecular biology. This data-intensive science, called eScience, takes a data-driven approach, where knowledge emerges from data, as opposed to a more traditional knowledge-driven approach that examines hypothesized patterns expected from data [2]. To meet the needs of eScience applications we must take into account several characteristics (figure 1):

    1) Domain knowledge that can be represented by thesauri, domain ontologies (for example Gene Ontology in biology or CIDOC-CRM in cultural heritage), standards (for example FuGE¹ in functional genomics) and recommendations. Domain ontology is specialized into application ontology. Knowledge is continually changing, which requires a flexible data structure;

    2) Know-how that is expressed by the business process. Generally, a scientific process can be described using five steps: a) data acquisition and modeling, b) collaboration, c) analysis and data mining, d) dissemination and sharing, e) archiving and preservation;

    3) Technical basis that often takes the form of collaborative platforms with sophisticated technologies (e.g. the ability to integrate other services such as visualization tools, or spatial analysis tools).

    ¹ FuGE: Functional Genomics Experiment, http://fuge.sourceforge.net/

    Figure 1. Working environment.

    Archaeology is on the verge of embracing eScience [3]: excavations generate exponentially more massive datasets, spurred by the increasing use of imaging tools. Documents (archives, excavation reports) are the basis of the archaeologist's work. To meet the needs of archaeologists, we have developed a semantic wiki, WikiBridge, where knowledge takes the form of an annotated database coupled with a triple store and a wiki as user interface.

    The rest of the paper is organized as follows: section 2 gives an overview of annotation and its implementation in two major kinds of systems, section 3 describes the requirements and the WikiBridge architecture, section 4 describes our annotation model, which is compared to other annotation systems. Section 5 gives an overview of the CARE project and describes semantic tools for archaeology, and section 6 describes related work. Finally, section 7 concludes the paper.

    2 STATE OF THE ART

    Annotations of resources (documents, images, data, web resources, etc.) can be created using different models, from simple text to RDF graphs. New web-based applications such as semantic blogs, social networks, semantic wikis, or data set providers (for biological, georeferenced, climate data) make extensive use of annotation systems. Moreover, applications use annotations in different ways [4], for example for finding communities in social networks or as a basis for semantic queries in big data applications. Currently no unified model exists for all these kinds and usages of annotations. In this section we first study annotation models and then two kinds of applications that make extensive use of annotations.

    2.1 Annotations

    Generally speaking, the term annotation refers to a piece of data associated with another piece of data. In software applications the term annotation can denote either the process of annotating a resource or the result of the process [4]. In this article we use annotation for the result of the process.

    In document management systems, annotations can be used at different levels: from the whole document to the word level. Annotations can be manual, i.e. made by a person, semi-automatic, i.e. based upon suggestions, or fully automated. Annotations can be associated with a group of users (experts, novices, etc.) and shared with the same group or with other groups.

    Annotation creates a relationship between resources denoted by URIs. It establishes a typed relation between the annotated data and the annotating data.


    The set of all annotations related to the same resource takes the form of a graph structure.

    The different models of annotation used in web-based applications share a common basic organization in a three-dimensional space: a subject (the annotated data), a predicate (the typed relationship between the annotated data and the annotating data), an object (the annotating data). This conceptual model can be implemented using RDF triples, binary predicates of first-order logic, conceptual graphs or semantic networks.
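
    As an illustration only, the subject/predicate/object model can be sketched in a few lines of Python. The URIs and property names below are hypothetical and are not taken from any of the systems discussed; the sketch only shows how triples about the same resource group into a small graph.

```python
from collections import defaultdict
from typing import NamedTuple


class Annotation(NamedTuple):
    subject: str    # URI of the annotated data
    predicate: str  # URI naming the typed relationship
    obj: str        # annotating data: a literal or another URI


def group_by_subject(annotations):
    """Index annotations by subject: each resource maps to its outgoing edges."""
    graph = defaultdict(list)
    for a in annotations:
        graph[a.subject].append((a.predicate, a.obj))
    return dict(graph)


# Hypothetical URIs, used only for the example.
PAGE = "http://example.org/wiki/Moulis"
annotations = [
    Annotation(PAGE, "http://example.org/onto#region", "Medoc"),
    Annotation(PAGE, "http://example.org/onto#discovered", "1993"),
]
graph = group_by_subject(annotations)
```

    Here graph maps the page URI to its two (predicate, object) edges, which is the graph structure mentioned above restricted to a single resource.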

    Oren et al. [4] differentiate three types of annotations: informal, formal and ontological. Informal annotations do not use a formal language and thus are not machine-readable. Formal annotations use formal languages that are machine-readable but do not refer to common knowledge and thus are not machine-understandable. Ontological annotations use ontology terms that correspond to the conceptualization of shared knowledge. Therefore, ontology-based annotations are both machine-readable and machine-understandable.

    2.2 Annotated Database

    In scientific projects, there is a growing need to associate annotations with the corresponding data. Several works on annotation integration in relational databases have been carried out during the last decade.

    The DBNotes system [5] proposes an annotation mechanism for relational databases where each attribute in a relation has a corresponding attribute to hold annotations. DBNotes also extends the SQL language with a PROPAGATE clause, which allows users to specify how to propagate the annotations along with the query answers.

    The MONDRIAN system [6] introduces an annotation model based on blocks to annotate a set of values. It defines an algebra that allows querying data based on annotations.

    The bdbms system [7] allows annotations to be defined at multiple levels of granularity (relation, tuple, column and value levels). Moreover, it allows a user relation to have multiple annotation relations attached to it. Each annotation is attached to a region (i.e. defined by a bounding box) that represents the area covered by the annotation. SQL has also been extended to take the annotations into account in the result of a query.

    Curated databases are a specific kind of annotated database [8]. They are expensive to establish, because experts must manually check each piece of data. Data, including past versions, are recorded as well as data provenance (source of the data) and annotations describing the opinions of the experts. The added value of curated databases lies in their quality and organization.

    2.3 Semantic Wiki

    In a traditional wiki, semantics is implicitly described by links between pages and by the context of the link (surrounding text). A semantic wiki is a wiki that makes the semantics explicit: it includes semantic web technologies to enable annotation of resources. Semantic wikis can be built on top of an existing wiki or created from scratch. In [9], the authors identify two approaches to semantic wikis: 1) wiki-centric approaches use the wiki to organize knowledge, i.e. the ontology emerges from the wiki through categories and links (wikis for ontologies); 2) ontology-based approaches allow importing an existing ontology and using it in the annotation process (ontologies for wikis).

    In the most popular wiki, MediaWiki, categories are the simplest form of annotation. They are used to classify wiki pages. Semantic MediaWiki [10] is a project that extends MediaWiki and provides new features such as: 1) relations to describe relationships between two pages by assigning annotations to existing links and 2) attributes that allow users to specify relationships between pages and literals. Table 1 gives an example of a page in MediaWiki using links and categories and the same page using Semantic MediaWiki capabilities.

    MediaWiki:
    The city of Moulis is located on the [[Medoc region]]. The building of the High [[Middle Ages]] was discovered in 1993 under the present parish church, largely Romane, surrounded by a parish cemetery until 1901, then transformed in the public square...
    [Category:ArchaelogicalSite]

    Semantic MediaWiki:
    The city of Moulis is located on the [[region::Medoc]] region. The building of the High [[Middle Ages]] was discovered in [[date::1993]] under the present [[building-type::parish church]], largely Romane, surrounded by a parish cemetery until [[date::1901]], then transformed in the public square...
    [Category:ArchaelogicalSite]

    Table 1. Concepts, relationships and attributes in MediaWiki and Semantic MediaWiki.
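
    A minimal sketch, assuming only the simplified [[property::value]] syntax shown in Table 1, of how such inline annotations could be turned into triples. The function name and the regular expression are illustrative assumptions; the actual Semantic MediaWiki parser handles many more cases.

```python
import re

# Matches [[property::value]]. Plain links such as [[Middle Ages]] contain
# no "::" separator and are therefore ignored.
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)\]\]")


def extract_annotations(page_title, wikitext):
    """Return (page, property, value) triples found in the wikitext."""
    return [
        (page_title, prop.strip(), value.strip())
        for prop, value in ANNOTATION.findall(wikitext)
    ]


text = (
    "The city of Moulis is located on the [[region::Medoc]] region. "
    "The building of the High [[Middle Ages]] was discovered in [[date::1993]] "
    "under the present [[building-type::parish church]]."
)
triples = extract_annotations("Moulis", text)
```

    With this input, triples holds three entries (region, date and building-type), while the plain link to Middle Ages produces none.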

    Semantic MediaWiki facilitates data entry through the Semantic Forms extension². Moreover, the Semantic MediaWiki engine allows ontologies to be loaded and consulted as wiki pages. OntoWiki [11] has been developed with the main objective of facilitating the acquisition and presentation of data. It offers forms and includes the RDF triples directly in the text using an appropriate syntax. AceWiki [12] follows a different knowledge acquisition strategy using a controlled language, ACE (Attempto Controlled English). Sentences in ACE are automatically translated into OWL and/or SWRL. AceWiki integrates the OWL reasoner Pellet and ensures that the ontology is always consistent.

    IkeWiki and KnowWE are two examples of the second category. IkeWiki [13] is a tool for collaborative knowledge management that requires a pre-existing ontology. The knowledge base is stored using the Jena RDF framework, and a SPARQL engine allows querying it. Two editors are available: one for metadata with an auto-completion mechanism and a WYSIWYG editor for content. In addition, IkeWiki supports importing existing content from Wikipedia. Similarly, KnowWE [14], built on top of JSPWiki, uses the Sesame RDF storage.

    SweetWiki [9] allows users to tag pages (called social tagging) and also integrates external ontologies. The set of users' tags generates a folksonomy. In addition, SweetWiki adds a WYSIWYG editor for managing content and metadata, and a reasoning engine used for querying the wiki content.

    Some semantic wikis have been developed for specific domains, for example BOWiki for biomedicine and SWiM for mathematics. BOWiki [15] gives access to several ontologies such as the Gene Ontology and ontologies about cell types or anatomy. SWiM [16] extends IkeWiki, adding support for the OpenMath language, an XML language for expressing the logical structure of mathematical formulas.

    ² An extension for MediaWiki, http://www.mediawiki.org/wiki/Extension:Semantic_Forms

    In short, the first category of semantic wikis can be used to present knowledge by structuring concepts through pages, categories and links. The second category of semantic wikis, based on pre-existing ontologies, can be used as a platform to build applications that require a global consensus over knowledge in order to maintain the quality of data.

    2.4 Comparison

    The annotation tools and structured content provided by template forms [17] found in semantic wikis place them between conventional wikis and databases. A semantic wiki is more structured than a wiki, but at the same time the structure is dynamic and extensible. In a database-centric approach, the database schema is built upon entities identified in the first step of analysis, and is thus based on knowledge at a given instant. In domains characterized by a rapid evolution of knowledge, such as biology or archaeology, a static database schema is not suitable and can be ruled out by the cost of evolution. The process of generating the structure is also different. In a database, domain modeling is usually done by a few experts; in a semantic wiki it is a collaborative, dynamic and evolutionary process.

    Compared to a traditional database, a semantic wiki makes it possible: 1) to expand the structure of document content; 2) to let a data model emerge from usage; and 3) to support collaborative, distributed workflows and processes (figure 2).

    Figure 2. Semantic wikis between wikis and databases.

    3 ARCHITECTURE OF WIKIBRIDGE

    In a survey, Uren et al. [18] study semantic annotations, identify a number of requirements, and review some semantic annotation systems. WikiBridge's design principles follow the seven requirements given by Uren et al.: easy-to-use interface, user collaborative design, support of different user skills, support of heterogeneous formats, compatibility with Semantic Web standards, annotation capabilities and storage, and support for reasoning. In the next subsection we develop the most important requirements with regard to the architectural design of WikiBridge.

    3.1 Requirements

    In a knowledge engineering process, it is common that non-technical domain experts work together with experienced knowledge engineers. To support different levels of user skills, certain advanced functionalities should be hidden from novice users but made available to experienced users. Thus, we use an Access Control List (ACL) mechanism to describe privilege control depending on user identity and group affiliation. Advanced users can define forms to help users structure wiki articles; each part of a form automatically generates annotations. During the annotation process, a wizard suggests terms of the ontology according to the highlighted section in the form.

    To be able to exchange data with other applications (e.g. ontology editors, Web services, other wikis), compliance with Semantic Web standards is required. WikiBridge is purely based on existing Semantic Web standards such as the Web Ontology Language OWL for describing ontologies and W3C's RDF for annotations.

    We consider reasoning as one of the most important functionalities as it makes it possible: 1) to bring out knowledge that is not explicit in the data; 2) to check the meaning of annotations with regard to the context of the annotation; and 3) to enhance navigation and search.

    3.2 Architecture

    One of the most famous semantic wikis is Semantic MediaWiki (SMW), which is based on MediaWiki [19]. In 2009, when we started the project, complex annotation and consistency checking were identified as mandatory functionalities. At that time, SMW did not provide complex annotation and did not have consistency checking in its roadmap.

    We started by extending MediaWiki with the following semantic components: form-based acquisition interface with automatic annotation, annotation wizard, annotation validation based on the context of a document, semantic rules and a query engine.

    Figure 3. WikiBridge's architecture.

    Figure 3 presents the architecture of WikiBridge. The semantic components are structured in three layers; third-party components are in blue boxes, and brown boxes include Web services for managing interactions between inner services or for adding query capabilities for external applications.

    The user interaction layer is covered by MediaWiki, and the structured data control needed for inexperienced users is managed by Semantic Forms. This extension provides advanced users with a specific description language that allows them to define new forms (figure 4). Modules corresponding to the interaction layer are represented at the top of figure 3.

    The semantic layer manages annotations and query processing. To improve the quality of information during the input and annotation processes, we propose three semantic components in WikiBridge Core (white boxes in figure 3) developed on top of third-party tools (RAP - RDF API for PHP, Pellet and Jena).


    Figure 4. User interaction layer

    The syntax and the semantics of annotations made by experts are guaranteed by the application ontology. An annotation wizard helps users construct simple or complex annotations by selecting ontology terms in lists and giving them properties and values (figure 5). Ontology terms are retrieved from ontology concepts through SPARQL queries. Annotation construction is a context-sensitive process: the initial terms displayed to users are directly connected to fields of forms. Each document is identified by its URL within the wiki, and annotations use this URL as a base for identifying fragments of content.

    A simple annotation annotates a subject by describing one of its properties using a literal or a reference to an ontology term. A complex annotation annotates a subject with two or more simple annotations and references to other elements (subjects). For example, we can annotate an altar with its dimension, its building material and its location in the nave. The nave is detailed in another part of the document.

    Figure 5. Annotation wizard.

    Annotation consistency checking is performed by a set of specific Java components interacting with RAP, Pellet and Jena. Only a subset of first-order logic constraints is checked. The WikiBridge extension (top of figure 3) connects to the Java constraint checker by means of a Web service. Moreover, rules can be added to query the ontology and annotations in order to test new facts and thus to produce new knowledge that can be inserted in the set of semantic constraints. Two kinds of constraints can be checked by using the ontology knowledge: 1) domain values of properties using ABox capabilities; and 2) structural consistency of properties using TBox capabilities (for instance, a cathedral can have a nave but cannot have an atrium). Nevertheless, some domain-dependent constraints cannot be embedded in the structure. For example, "In France, there is no church with rammed earth wall for the studied period; however, this technique is used in Ireland and in other countries" can be translated by the following constraint, which must remain consistent:

    hasRammedWall(?x) ∧ edificeType(?x,?t) ∧ edificeCountry(?x,?c) ∧ c='France' ⇒ t ≠ 'church'
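
    To make the reading of this constraint concrete, the following sketch evaluates it over a few hand-written facts. It is written in Python for brevity and is not the Java/Jena checker used by WikiBridge; the edifice identifiers and the dictionary layout are hypothetical.

```python
def violates_rammed_wall_rule(facts):
    """True when an edifice has a rammed earth wall, is in France, and is a church."""
    return (
        facts.get("hasRammedWall", False)
        and facts.get("edificeCountry") == "France"
        and facts.get("edificeType") == "church"
    )


# Hypothetical facts, as they might be derived from annotations.
edifices = {
    "edifice-1": {"hasRammedWall": True, "edificeType": "church", "edificeCountry": "France"},
    "edifice-2": {"hasRammedWall": True, "edificeType": "church", "edificeCountry": "Ireland"},
    "edifice-3": {"hasRammedWall": False, "edificeType": "church", "edificeCountry": "France"},
}

inconsistent = [name for name, facts in edifices.items() if violates_rammed_wall_rule(facts)]
```

    Only edifice-1 is reported: the Irish church is allowed by the domain statement, and the French church without a rammed earth wall does not trigger the rule.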

    The persistency layer includes four types of storage (bottom of figure 3): document content, semantic annotations, ontology, and constraints. The content of documents is stored in the MediaWiki-specific database. Annotations are stored as triples in the RAP triple store, and they can be retrieved by the WikiBridge user layer to display annotations with icons and colors in a document, or by the SPARQL query engine.

    The ontology, imported as an OWL file, is stored in a specific schema managed by RAP. Ontology terms can then be queried using SPARQL, and results can generate wiki pages.

    Constraints are stored in plain text using the Jena rules syntax. A type attribute specifies whether the rule can be applied to check the ontology structure or to check annotation consistency.

    Information access has been designed by taking into account some features of users. We have thus identified a usage typology according to 1) the kind of usage (reader, investigator, and annotator); 2) the degree of knowledge of the domain (domain specialists such as historian researchers, and non-specialists).

    Figure 6. Query interface.

    To handle these different types of users, we offer three types of queries:

    • Faceted browsing allows users to explore by filtering available information with the ontology structure (figure 6);

    • Form-based searching provides semantic search by filling in parameters of parametric queries identified during the analysis of requirements;

    • Aggregate view: for each article, all annotations related to the article are displayed in a factbox.
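
    The faceted browsing idea can be sketched as a filter over stored triples. This is a simplified illustration under the assumption that annotations are available as (subject, predicate, object) tuples; WikiBridge itself answers such requests through its SPARQL engine, and the data below is invented for the example.

```python
def facet_counts(triples, predicate):
    """Count how many annotations carry each value of the given predicate."""
    counts = {}
    for _, p, o in triples:
        if p == predicate:
            counts[o] = counts.get(o, 0) + 1
    return counts


def facet_filter(triples, selected):
    """Return subjects whose annotations include every selected (predicate, object) pair."""
    facts = {}
    for s, p, o in triples:
        facts.setdefault(s, set()).add((p, o))
    return sorted(s for s, pairs in facts.items() if selected <= pairs)


# Invented sample data.
triples = [
    ("edifice-A", "type", "church"),
    ("edifice-A", "country", "France"),
    ("edifice-B", "type", "church"),
    ("edifice-B", "country", "Italy"),
    ("edifice-C", "type", "baptistery"),
    ("edifice-C", "country", "France"),
]
counts = facet_counts(triples, "type")
matches = facet_filter(triples, {("type", "church"), ("country", "France")})
```

    The counts would typically label the facet choices offered to the user, and each selection narrows the list of matching articles.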

    Nevertheless, all types of queries rely on the SPARQL query engine, which also allows inline queries to be processed within wiki pages in order to summarize information.

    To perform spatial and temporal analysis on annotations, a set of Web services has been developed. Some specific services allow retrieving objects and their coordinates according to a set of conjunctive properties (figure 7). Moreover, a generic Web service has been developed to handle SPARQL queries.


    Figure 7. OpenLayers interface interacting with a Web service of WikiBridge.

    4 ANNOTATION MODEL

    In this section we describe the annotation model used in WikiBridge and then we compare WikiBridge capabilities with other annotation-based systems according to Oren's criteria.

    4.1 WikiBridge annotation model

    Three kinds of annotation can be set in WikiBridge: automatic annotations are provided by template forms, assisted annotations are set by users using the annotation wizard, and fully manual annotations can be written by experienced users directly in documents using the wiki syntax. These three kinds of annotation share the same basic construct:

    A = (s, p, o)

    where s is the subject (the annotated data), p is the predicate (the type of the link) and o is the object (the annotating data). The types of each component are the following:

    • s is a URI/URL that refers to the document (i.e. an article in the wiki or a part of an article);

    • p is a URI that refers to an ontology concept or property;

    • o is a literal, or a URI that refers to an individual in the ontology, or a URI outside the scope of the wiki, or null.

    Figure 8. Construction of recursive, n-ary annotation with the annotation wizard.

    Our model of annotation allows defining three basic structures of annotation: simple, complex, and recursive.

    • A simple annotation has the type (s,p,o). s and p cannot be null. If o is null and p refers to a concept, the annotation specifies the type of the subject. It can be viewed as a constraint that is a restriction of an attribute domain. If o is not null it must refer to a literal or to an individual that belongs to the concepts specified by p. It can be viewed as a database constraint that checks that an attribute value is in an enumerated list of values;

    • A complex or n-ary annotation (noted A-cplx in table 2) is a list of simple annotations related to the same subject (figure 8). All the predicates used in the list must be different;

    • A recursive annotation (noted A-rec in table 2) is an annotation based on a previous one, used to give details on the object. A recursive annotation has different levels. An annotation of level i explains the object o of the parent annotation (i.e. from level i-1). If all the annotations of level i are in a list of annotations, then all annotations in the list share the same subject (o). The recursive form of annotation is based on the semantic value model defined by Sciore and Rosenthal in [20].
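
    The three structures above can be sketched as follows. This is a minimal illustration in Python, not WikiBridge's implementation: the class and function names, the prefixes wiki: and onto:, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SimpleAnnotation:
    subject: str                # URI/URL of the annotated document or fragment
    predicate: str              # URI of an ontology concept or property
    obj: Optional[str] = None   # literal, URI, or None (null)
    details: List["SimpleAnnotation"] = field(default_factory=list)

    def __post_init__(self):
        # s and p cannot be null.
        if not self.subject or not self.predicate:
            raise ValueError("subject and predicate cannot be null")


def complex_annotation(subject, pairs):
    """Build an n-ary annotation: simple annotations sharing one subject,
    with pairwise different predicates."""
    predicates = [p for p, _ in pairs]
    if len(set(predicates)) != len(predicates):
        raise ValueError("predicates of a complex annotation must differ")
    return [SimpleAnnotation(subject, p, o) for p, o in pairs]


def detail(parent, pairs):
    """Attach level i annotations whose shared subject is the parent's object."""
    if parent.obj is None:
        raise ValueError("a null object cannot be detailed")
    parent.details = complex_annotation(parent.obj, pairs)
    return parent


# Hypothetical example: an altar described by two properties, then its nave detailed.
typed = SimpleAnnotation("wiki:Edifice#altar", "onto:Altar")  # null object: types the subject
altar = complex_annotation(
    "wiki:Edifice#altar",
    [("onto:material", "stone"), ("onto:locatedIn", "wiki:Edifice#nave")],
)
detail(altar[1], [("onto:state", "preserved")])
```

    The nested details list is one possible way to represent levels; a flat list of triples with the parent's object reused as subject would carry the same information.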


    According to the model, the abstract syntax depicted in table 2 is used in the wiki annotation syntax.

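    <annotation> := <A-simple> | <A-cplx> | <A-rec>
    <A-simple>   := (<subject>, <predicate>, <object>)
    <A-cplx>     := (,)
    <A-rec>      := ((,,))
    <subject>    := URI | URL
    <predicate>  := ontology concept | ontology property
    <object>     := ontology individual | literal | URI | URL | null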

    Table 2. A simplified version of the abstract syntax used for WikiBridge's annotations.

    An annotation is defined with regard to a context. The context has two dimensions: the common knowledge dimension (i.e. the structure of the ontology and the rules over the concepts) and the local knowledge dimension (i.e. the set of all the annotations and sub-annotations which share the same root subject). An annotation is valid only if it is consistent in the two contexts.

    Annotation consistency in the common knowledge dimension is checked by evaluating the ontology's constraints instantiated by annotations (an example is given in section 3.2). The annotation is inconsistent if it violates one constraint that uses one of the predicates or objects of the annotation.

    Annotation consistency in the local knowledge dimension is checked by evaluating intrinsic annotation model constraints. For example, a subject cannot be associated with two different objects using the same predicate.
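
    The local consistency rule given as an example can be sketched as a single pass over the annotations. The function name and the sample identifiers are hypothetical; this is an illustration of the rule, not the checker used in WikiBridge.

```python
def local_conflicts(annotations):
    """Report (subject, predicate, first_object, other_object) for every
    annotation that reuses a (subject, predicate) pair with a different object."""
    seen = {}
    conflicts = []
    for s, p, o in annotations:
        key = (s, p)
        if key in seen and seen[key] != o:
            conflicts.append((s, p, seen[key], o))
        else:
            seen.setdefault(key, o)
    return conflicts


# Hypothetical annotations sharing the same root subject.
annotations = [
    ("wiki:Edifice#altar", "onto:material", "stone"),
    ("wiki:Edifice#altar", "onto:locatedIn", "wiki:Edifice#nave"),
    ("wiki:Edifice#altar", "onto:material", "wood"),
]
conflicts = local_conflicts(annotations)
```

    The third annotation is reported because it assigns a second, different material to the same altar.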

    4.2 Evaluation Criteria

    Oren et al. [4] have combined several criteria from the literature as well as their own criteria to classify annotation systems precisely using six properties:

    1. Association refers to the way the annotation is associated with the annotated resource (embedded or externally stored);

    2. Subject granularity indicates the scope of the annotation (a whole document, a section, a sentence, a word, a numerical value, etc.);

    3. Representation distinction indicates whether the annotation can distinguish objects or concepts in a document, identified through a reference system, from values;

    4. Terminology reuse is related to the level of interoperability of annotations and indicates whether the annotations are ad hoc or use terms from ontologies;

    5. Object type indicates the type of the annotation (literal, textual, structured, ontological);

    6. Context indicates metadata for an annotation (when it was made, by whom, provenance, etc.).

    Property          | Annotated DBs                        | Semantic Wikis            | WikiBridge
    Association       | embedded                             | current page embedded     | current page embedded and triple store
    Granularity       | relation, tuple, attribute and value | document and fragment     | document and fragment
    Representation    | yes/no                               | yes/no                    | yes
    Terminology reuse | yes/no                               | yes/no                    | yes
    Object type       | literal                              | literal, ontology term, URI | literal, ontology term, URI
    Context           | no/yes                               | no/yes                    | yes

    Table 3. Comparison of WikiBridge with well-established annotation systems.


    A comparison between annotated databases, semantic wikis and WikiBridge is shown in table 3. It shows that WikiBridge provides a better representation of semantics through levels of granularity and context representation. Furthermore, WikiBridge supports strictly typed annotation through association, representation and object type, which enables consistency checking.

    5 PROJECT OVERVIEW

    The aim of the international project CARE (Corpus Architecturae Religiosae Europeae) is to set up a corpus describing Christian edifices in Europe. Italy, Spain, Czech Republic, Poland, Slovakia, France and Croatia joined the project four years ago. Each edifice is described in a document that focuses on the definition of its states of evolution from the 4th century to the 11th century. The French corpus focuses on the 7th and 8th centuries, with very rich decades in terms of number of monuments (http://care.u-bourgogne.fr).

    Archaeology is the science of destruction, in which the process of excavation definitively removes the sedimentary and cultural context of artifacts and architecture. Thus, the accurate recording of contexts and artifacts is crucial. Representing and managing knowledge in cultural heritage requires a deep understanding of specific concepts. Building a collaborative platform brings out some challenging characteristics: 1) complexity of data (heterogeneous, incomplete, uncertain, inconsistent, spatial, and temporal); 2) domain knowledge barrier; 3) evolving knowledge; and 4) skills of actors.

    5.1 The CARE Community

    From an organizational perspective, the CARE project takes the form of an expert network collecting and providing information on edifices, analyzing historical sources, filling in documents and collaborating in the exploitation of the corpus through smaller research groups. Additionally, the project involves undergraduate students who help in collecting information but lack the expertise required to interpret complex data. In France, more than sixty researchers are collecting and analyzing data concerning approximately 2700 monuments.

    Two key characteristics outline the CARE community:

    • Multi-disciplinarity: The data collecting process involves archaeologists, historians, art historians, topographers, draftsmen. It is designed as a collaborative process which merges information from various disciplines;

    • Inter-disciplinarity: The interpretation of data brings together all the actors, who also enrich their respective practices through the confrontation of methods or problems.

    5.2 Conceptual Modeling for the Foundational Knowledge

    Linster [21] shows that the interaction among domain experts, knowledge engineers and tools creates knowledge. In addition, he has shown that the process of elaborating a knowledge-based system is a constructive model-building process that includes a discussion process between knowledge engineers and domain experts as well as the construction of a conceptual model
    (i.e. a general and abstract framework). Thus, the knowledge engineering activity encompasses the design of two kinds of models: models to make sense and models to implement systems.

    We have applied Linster's guidelines to the CARE project in order to initiate a foundational knowledge from the corpus of documents. The first stage is the identification of salient concepts. The key concept is the edifice, whose changes are essential to model. All constituent elements of a building need to be described. They can delimit space or define a religious function (baptismal, funerary, etc.). Every change of space or religious function determines a new edifice state. Edifices and their evolutions are described in a set of documents. Salient concepts let us build a conceptual model in which three groups of elements have been identified (figure 9):

    1. Spatial concepts without temporal relationship (light grey): the concept EGS refers to an edifice, a group of edifices or a space inside an edifice such as a nave or an apse. Composition relationships can be identified between EGS;

    2. Spatio-temporal concepts (grey), called SEGS, represent variations of spatial concepts in time. Spatio-temporal concepts are linked to a date or a period. Dating elements can be determined by documents, or described by methods such as C14, thermoluminescence or stratigraphy;

    3. Vocabulary terms for the project domain (black): while the concepts EGS and SEGS are used to structure the descriptions, terms are used to describe specific elements including properties of edifices, religious functions or manufacturing techniques.

    Figure 9. Conceptual model of the CARE corpus.
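
    The three groups of the conceptual model can be pictured as plain data structures. The following is only a minimal sketch: the class names EGS and SEGS come from the text, while the field names, the Term class and the helper states_of are hypothetical and do not reflect the actual WikiBridge implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Term:
    """Vocabulary term (black group), e.g. a religious function."""
    label: str


@dataclass
class EGS:
    """Spatial concept without temporal relationship (light grey group)."""
    name: str
    parts: List["EGS"] = field(default_factory=list)  # composition between EGS

    def add_part(self, part: "EGS") -> "EGS":
        """Attach a sub-space (e.g. a nave) and return it for convenience."""
        self.parts.append(part)
        return part


@dataclass
class SEGS:
    """Spatio-temporal concept (grey group): one state of an EGS in time."""
    egs: EGS
    start_year: int
    end_year: int
    dating_method: Optional[str] = None  # hypothetical field, e.g. "C14"
    terms: List[Term] = field(default_factory=list)


def states_of(egs: EGS, all_states: List[SEGS]) -> List[SEGS]:
    """Return the states of one EGS in chronological order."""
    own = [s for s in all_states if s.egs is egs]
    return sorted(own, key=lambda s: s.start_year)


# Illustrative data only, not taken from the CARE corpus.
church = EGS("church")
nave = church.add_part(EGS("nave"))
state_1 = SEGS(church, 601, 632, "document", [Term("baptismal")])
state_2 = SEGS(church, 733, 766, "C14", [Term("funerary")])
ordered = states_of(church, [state_2, state_1])
```

    In this sketch a change of religious function is represented by a new SEGS attached to the same EGS, which mirrors the idea that every change determines a new edifice state.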

    The design of the ontology is detailed in the next section.

    5.3 Offering Semantic Tools for Archaeology: The CARE Ontology

    Cultural heritage collections can be annotated with different thesauri. About twenty thesauri are described at the page http://tinyurl.com/5u8bjer.

    The CIDOC Conceptual Reference Model (CRM) provides an extensible ontology for concepts and relationships in the cultural heritage domain (http://www.cidoc-crm.org). Since 2006, it has been an ISO standard (21127:2006) for the exchange of cultural heritage information. CIDOC-CRM aims at treating all types of material collected and/or displayed by museums: sites, monuments as well as collections of fine and applied arts. It is intended to encompass the detailed description both of individual materials and of groups of materials as a whole. It also covers contextual information: the historical and geographical context in which materials are placed and which gives them much of their significance and value. The notion of event in CIDOC-CRM is represented by the concept of event. An
    event describes the environment of the material over time and what could happen to it. This central notion is complemented by: 1) the TimeSpan concept, which describes the moment it happened; 2) the concept of Place; 3) who did it (Actor); and 4) what is being described. CRM offers the notions of Physical Objects, which can be natural or manufactured by man, and Conceptual Objects.

    Since CRM is a reference in the field, we use it as a starting point to establish the CARE ontology. Viewing the CARE ontology as a specialization of CRM allows us to relate to a standard and comply with it. The archaeologist uses two sources of information. First, there are material data like pottery, clothing and architectural items like walls, columns, floors. Second, there are descriptive data that help to associate material data found in a location with their context. Descriptive data can be measurements (height, width and length but also mass, density), direction, and association. Association seeks the position of the material data in relation to its surroundings. Accordingly, the CARE ontology has four parts: 1) religious concepts; 2) their spatial relationships; 3) their characteristics; and 4) a timeline to track evolutions (figure 10). Each sub-tree from THING may be considered as an ontology of a particular domain. We call a branch of the ontology an is_a hierarchy with one root. To cover a wider field of knowledge, it is necessary to consider the relationships between several sub-trees. Grenon et al. [22] propose the definition of three kinds of relationships:

    - Intra-ontology: relationship between two concepts of the same part of an ontology;

    - Trans-ontology: relationship between a concept of one sub-tree and a concept of another sub-tree. For instance, a building is consecrated to a saint; in DL we can write:

      $\mathrm{Building} \sqsubseteq \exists\,\mathrm{isConsecrated}.\mathrm{Saint}$

    - Meta-ontology: relationship between a concept of an ontology and another ontology (considered as a whole).
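
    The three kinds of relationships can be told apart mechanically once each concept is attached to its branch. The sketch below assumes a toy is_a hierarchy; the concept names other than THING, Building and Saint, the placement of Saint under Actor, and the function names are hypothetical and are not taken from the CARE ontology.

```python
from typing import Dict, Optional

# Hypothetical is_a hierarchy: concept -> parent concept (None for the root).
IS_A: Dict[str, Optional[str]] = {
    "THING": None,
    "ReligiousConcept": "THING",
    "Building": "ReligiousConcept",
    "Nave": "ReligiousConcept",
    "Actor": "THING",
    "Saint": "Actor",
}


def branch_root(concept: str) -> str:
    """Walk up the is_a links until reaching a direct child of THING."""
    current = concept
    while IS_A[current] not in (None, "THING"):
        current = IS_A[current]
    return current


def relationship_kind(source: str, target: str,
                      target_is_ontology: bool = False) -> str:
    """Classify a relationship following the three kinds listed above.

    target_is_ontology marks the case where the target stands for a whole
    sub-tree rather than for a single concept.
    """
    if target_is_ontology:
        return "meta-ontology"
    if branch_root(source) == branch_root(target):
        return "intra-ontology"
    return "trans-ontology"


kind_consecrated = relationship_kind("Building", "Saint")
kind_composition = relationship_kind("Building", "Nave")
kind_meta = relationship_kind("Building", "Actor", target_is_ontology=True)
```

    With this toy hierarchy, the isConsecrated relationship between Building and Saint crosses two sub-trees and is therefore classified as trans-ontology.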

    Figure 10. Branches of the CARE ontology (in blue, the CIDOC-CRM ontology; concepts with an EXX prefix are CIDOC-CRM concepts).

    Modeling religious concepts

    Religious concepts in CARE are edifices, with their decomposition into different constituent elements (nave, transept, apse, etc.), liturgical installations (altar, ambo, ciborium, etc.) and burials. These concepts have been placed under the CIDOC-CRM concept E24 Physical Man-Made Thing. Indeed, CIDOC-CRM defines this concept as all persistent physical items that are purposely created by human activity.

    Modeling characteristics of an edifice

    To detail the parts of an edifice, we introduced the concept of architectural elements. It describes masonry, floors, openings and inscriptions, as well as construction techniques, dimensions, colors, etc.

    Modeling spatial relationships in archaeology

    The geometry used in the textual descriptions that are analyzed is a complex geometry. Indeed, these
    descriptions do not refer to an absolute and orthonormal space: it is rather a perceived or cognitive space whose structure is largely based on the functional aspects and objects described, and on the perspective of the archaeologist. From the analysis of textual descriptions of religious concepts (description of the position and shape) given by archaeologists, we have found four types of spatial properties:

    1. Orientation properties: forward, back, bottom, next to, on one side, under, below, at a lower level, at the same level, above, right, left, center, prior, and the cardinal directions;

    2. Boundary properties: outside, inside;

    3. Distance properties: near, far, next to, around;

    4. Topological properties: flank, join, open on the side, link to, stand against, surround, isolated. To represent the topological properties, we used the work of Egenhofer and Herring [23]. The authors have defined a minimum set of eight relations (disconnected, externally connected, partially overlap, equal, tangential proper part, non-tangential proper part, tangential proper part inverse, non-tangential proper part inverse) describing the relations between two regions. This set serves as a basis for defining other topological relations by composition.
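
    To make the eight topological relations concrete, the sketch below computes them for the simplest possible regions: axis-aligned rectangles with positive area. This is an illustrative simplification, not the method used in CARE, where the space is cognitive rather than metric; the Rect class, the function names and the coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Closed axis-aligned rectangle, assumed to have positive area."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def _within(a: Rect, b: Rect) -> bool:
    """True when a is included in b (boundaries may touch)."""
    return (b.xmin <= a.xmin and a.xmax <= b.xmax
            and b.ymin <= a.ymin and a.ymax <= b.ymax)


def _strictly_within(a: Rect, b: Rect) -> bool:
    """True when a lies in the interior of b (no shared boundary)."""
    return (b.xmin < a.xmin and a.xmax < b.xmax
            and b.ymin < a.ymin and a.ymax < b.ymax)


def topological_relation(a: Rect, b: Rect) -> str:
    """Return which of the eight relations holds between a and b."""
    if a == b:
        return "equal"
    closures_meet = (a.xmin <= b.xmax and b.xmin <= a.xmax
                     and a.ymin <= b.ymax and b.ymin <= a.ymax)
    if not closures_meet:
        return "disconnected"
    interiors_meet = (a.xmin < b.xmax and b.xmin < a.xmax
                      and a.ymin < b.ymax and b.ymin < a.ymax)
    if not interiors_meet:
        return "externally connected"
    if _within(a, b):
        if _strictly_within(a, b):
            return "non-tangential proper part"
        return "tangential proper part"
    if _within(b, a):
        if _strictly_within(b, a):
            return "non-tangential proper part inverse"
        return "tangential proper part inverse"
    return "partially overlap"


# Illustrative floor plan only.
church = Rect(0, 0, 10, 20)
nave = Rect(2, 2, 8, 15)
apse = Rect(3, 20, 7, 24)     # shares only a wall with the church
chapel = Rect(0, 5, 3, 9)     # built against the inner west wall
porch = Rect(8, -2, 12, 3)    # straddles the south-east corner
tower = Rect(30, 30, 35, 35)
```

    A textual property such as "stand against" could then be approximated by "externally connected", although mapping the archaeologists' vocabulary onto these relations remains a modeling decision.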

    Modeling temporal relationships to track evolutions

    When writing his excavation report, the archaeologist graphically summarizes the results obtained with a timeline that is often organized by anterior/posterior relationships: materials are considered in relation to each other. In the CARE project, the time model is based on the following criteria: some absolute benchmarks and a relative chronology based on intervals. We have established a convention for century division and boundaries. Centuries start at year 1 and end at year 100. We have also established subdivision terms such as early (1-32), mid (33-66) and late (67-100). These century divisions are placed under E52 TimeSpan. CIDOC-CRM offers specific concepts related to time [24, 25]. Some of Allen's relationships [26] are used as properties.

    The CARE project aims to follow the evolutions (creation, modification, deletion) of an edifice and its constituent elements. The concept of activity is important for the CARE project because a state may be characterized by an activity in the CIDOC-CRM ontology. The concept E7 Activity is defined as follows: the action or sequence of actions intentionally carried out by Actors that result in changes of state in the cultural, social or material systems of interest. This notion includes both complex and long-lasting actions, such as building an edifice, and simple and short-lived actions. Following the concepts of the CIDOC-CRM ontology, we use seven concepts, specializations of E7 Activity, to model states of edifices: E6 Destruction, E11 Modification, E81 Transformation, E63 Beginning of Existence, E64 End of Existence, E79 Part Addition and E80 Part Removal.

    The CARE ontology has been designed using Protégé and it currently encompasses 124 classes and 715 individuals.
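
    The century convention described above translates directly into year intervals, and Allen's relationships [26] can then be evaluated on those intervals. The sketch below is an assumption-laden illustration: the function names are hypothetical, and the text does not state which of Allen's thirteen relations are actually used as properties in the CARE ontology.

```python
from typing import Optional, Tuple

Interval = Tuple[int, int]  # (first_year, last_year), both inclusive

# Subdivision terms from the CARE convention, as offsets within a century.
SUBDIVISIONS = {"early": (1, 32), "mid": (33, 66), "late": (67, 100)}


def century_span(century: int, part: Optional[str] = None) -> Interval:
    """Years covered by a century (starting at year 1) or by one subdivision."""
    first, last = SUBDIVISIONS[part] if part else (1, 100)
    base = (century - 1) * 100
    return (base + first, base + last)


def allen_relation(a: Interval, b: Interval) -> str:
    """Allen relation of a with respect to b.

    Inclusive year ranges are converted to half-open intervals so that two
    consecutive periods (e.g. 601-632 and 633-666) are reported as "meets".
    """
    a_start, a_end = a[0], a[1] + 1
    b_start, b_end = b[0], b[1] + 1
    if a_end < b_start:
        return "before"
    if b_end < a_start:
        return "after"
    if a_end == b_start:
        return "meets"
    if b_end == a_start:
        return "met-by"
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    return "overlaps" if a_start < b_start else "overlapped-by"


early_7th = century_span(7, "early")
mid_7th = century_span(7, "mid")
late_8th = century_span(8, "late")
```

    Under this convention the 7th century covers the years 601 to 700, so an edifice state dated "early 7th century" is represented by the interval 601 to 632.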

    6 RELATED WORKS

    Several semantic wikis have been developed or used specifically for cultural heritage applications.

    Witte et al. [27] present an approach to cultural heritage data management which integrates different technologies: a wiki user interface, text mining support using a Natural Language Processing (NLP) framework and ontologies based on OWL and RDF. The authors have implemented these ideas for the German Handbuch der Architektur, a comprehensive multi-volume encyclopedia of architecture. One volume (506 pages) of the encyclopedia is converted into wiki pages. The authors capture two sub-domains with ontologies: the domain of document management (e.g. sentence, noun, page number) and the architectural domain (e.g. wall, building material). NLP allows connecting architectural concepts with document-specific ones, e.g. sentences that mention construction elements of a certain material. A public version is available at http://www.semanticsoftware.info/durm.

    The HermesWiki [28] is a German-language semantic wiki in the historical domain. Its main objective is to provide an overview of Ancient Greek history for teaching undergraduate students. The wiki consists of three parts: a collection of about twenty essays giving a comprehensive domain walk-through, translations of the corresponding ancient sources, and a glossary. The entries in the glossary are tagged. It has been implemented as a plugin for KnowWE, reusing as many of the core components as possible. A public version is available at http://hermeswiki.informatik.uni-wuerzburg.de.

    NavEditOW is a framework for ontology-driven web sites. It has been exploited to support semantic description in two projects: 1) a web portal and a set of advanced services supporting the sharing of knowledge about Prehistory and Protohistory in the Italian context [29]. In particular, one of the services is a digital library, in which entries (i.e. bibliographic descriptions of publications) will be ontologically described. The system is currently on-line at http://www.archeoserver.it; and 2) the SilkRoDE (Silk Roads in the Digital Era) project, which aims to collect, structure and disseminate all knowledge about the cultural heritage of Central Asia from fields such as archaeology, geography or history [30]. The ontological approach provides the required expressiveness and flexibility to support rich forms of navigation among stored contents. The framework integrates a wiki engine for rendering documents stored in the ontological tier. In the same vein, MANTIC is a web application that provides a portal for archaeological information about the city of Milan [31]. MANTIC integrates different data sources and its global schema is based on CIDOC-CRM.

    Our approach to semantic wikis is directed towards scientific application domains, which contribute to producing knowledge [32]. These kinds of applications rely on core ontologies that act as a consensus. Querying and analyzing data enhances knowledge: new concepts can emerge and new constraints can be discovered. As a result, ontologies can be modified dynamically, and semantic checks are necessary to find annotations that are inconsistent with the ontology.

    7 CONCLUSION

    Wiki solutions meet the requirements of a web platform with collaborative capabilities. Easy setup and rich editing support are primary reasons for the widespread adoption of wikis. Users can enter text and other types of data (pictures, video) and connect content through hyperlinks. Most wikis also provide a versioning system to track content changes and a full-text search engine for querying wiki pages. The narrative structure is one advantage of the document-centric approach of wikis, compared to a database-centric approach. Nevertheless, a mere document management system is not sufficient to capture interdependent structures of knowledge. Adding semantic annotation capabilities to documents allows different levels of interpretation and can sustain: 1) knowledge evolution, by keeping track of the successive annotations; 2) better quality in the query evaluation process; and 3) results displayed according to user skills. Annotations can be defined at a coarse-grained level or at a fine-grained level. An ontology must be associated with the annotation system to provide semantics for annotation terms according to domain knowledge.

    Semantic wiki solutions meet the requirements of an annotation system and of knowledge description. Adding semantics yields two dimensions of enhancement to a wiki: 1) adding a more formal structure to the wiki; 2) exporting, integrating and reusing information through the adoption of standard semantic technologies. WikiBridge thus seems to combine the best of two worlds: structure from databases as well as expandability and collaboration capabilities from wiki systems.

    ACKNOWLEDGMENTS

    This research is supported by the French ANR funding (ANR-07-CORP-011).

    8 REFERENCES

    1. Galperin, M., Cochrane, G.: The 2011 Nucleic Acids Research Database Issue and Online Database Collection. In Nucleic Acids Research, vol. 39 (Database issue), pp. 1--6, (2011).

    2. Newman, H., Ellisman, M., Orcutt, J.: Data-intensive e-science frontier research. In Communications of the ACM, vol. 46(11), pp. 68--77, (2003).

    3. Petrovic, V., Gidding, A., Wypych, T., Kuester, F., DeFanti, T., Levy, T.: Dealing with Archaeology's Data Avalanche. In Computer, vol. 44(7), pp. 56--60, IEEE, (2011).

    4. Oren, E., Möller, K., Scerri, S., Handschuh, S., Sintek, M.: What are Semantic Annotations? Technical report, DERI Galway, (2006).

    5. Chiticariu, L., Tan, W., Vijayvargiya, G.: DBNotes: A Post-It System for Relational Databases based on Provenance. In Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 942--944, (2005).

    6. Geerts, F., Kementsietsidis, A., Milano, D.: MONDRIAN: Annotating and Querying Databases through Colors and Blocks. In Proceedings of the 22nd International Conference on Data Engineering (ICDE), pp. 82--91, (2006).

    7. Eltabakh, M., Aref, W., Elmagarmid, A., Ouzzani, M., Silva, Y.: Supporting Annotations on Relations. In 12th International Conference on Extending Database Technology (EDBT), pp. 379--390, (2009).

    8. Buneman, P., Cheney, J., Tan, W., Vansummeren, S.: Curated databases. In PODS, pp. 1--12, (2008).

    9. Buffa, M., Gandon, F., Erto, G., Sander, P., Faron, C.: SweetWiki: A semantic wiki. In Journal of Web Semantics, vol. 6(1), pp. 84--97, (2008).

    10. Krötzsch, M., Vrandecic, D., Völkel, M.: Semantic MediaWiki. In 5th International Semantic Web Conference, pp. 935--942, (2006).

    11. Auer, S., Dietzold, S., Riechert, T.: OntoWiki - A Tool for Social, Semantic Collaboration. In 5th International Semantic Web Conference, pp. 736--749, (2006).

    12. Kuhn, T.: How Controlled English can Improve Semantic Wikis. In 4th Semantic Wiki Workshop (SemWiki), (2009).

    13. Schaffert, S.: IkeWiki: A Semantic Wiki for Collaborative Knowledge Management. In 15th IEEE International Workshops on Enabling Technologies: Infrastructures for Collaborative Enterprises (WETICE), pp. 388--396, (2006).

    14. Reutelshöfer, J., Haupt, F., Lemmerich, F., Baumeister, J.: An Extensible Semantic Wiki Architecture. In 4th Semantic Wiki Workshop (SemWiki), (2009).

    15. Hoehndorf, R., Bacher, J., Backhaus, M., Gregorio, G., Loebe, F., Prüfer, K., Uciteli, A., Visagie, J., Herre, H., Kelso, J.: BOWiki: an ontology-based wiki for annotation of data and integration of knowledge in biology. In Proceedings of BMC Bioinformatics, vol. 10(s-5), (2009).

    16. Lange, C.: SWiM - A Semantic Wiki for Mathematical Knowledge Management. In 5th European Semantic Web Conference (ESWC), pp. 832--837, (2008).

    17. Haake, A., Lukosch, S., Schümmer, T.: Wiki-templates: adding structure support to wikis on demand. In International Symposium on Wikis, pp. 41--51, (2005).

    18. Uren, V., Cimiano, P., Iria, J., Handschuh, S., Vargas-Vera, M., Motta, E., Ciravegna, F.: Semantic Annotation for Knowledge Management: Requirements and a Survey of the State of the Art. In Web Semantics: Science, Services and Agents on the World Wide Web, vol. 4(1), pp. 14--28, (2006).

    19. Völkel, M., Krötzsch, M., Vrandecic, D., Haller, H., Studer, R.: Semantic Wikipedia. In Proceedings of the 15th International Conference on World Wide Web, pp. 585--594, (2006).

    20. Sciore, E., Siegel, M., Rosenthal, A.: Using semantic values to facilitate interoperability among heterogeneous information systems. In ACM Trans. Database Syst., vol. 19(2), pp. 254--290, (1994).

    21. Linster, M.: Linking Modeling to Make Sense and Modeling to Implement Systems in an Operational Modeling Environment. In EKAW, pp. 55--74, (1992).

    22. Grenon, P., Smith, B.: SNAP and SPAN: Upper level ontology and framework for formalization. In Spatial Cognition and Computation, vol. 4(1), (2004).

    23. Egenhofer, M., Herring, J.: Categorizing Binary Topological Relations Between Regions, Lines and Points in Geographic Databases. Technical report, National Center for Geographic Information and Analysis, CA, (1991).

    24. Binding, C.: Implementing Archaeological Time Periods Using CIDOC CRM and SKOS. In Proceedings of the 7th Extended Semantic Web Conference (ESWC), pp. 273--287, (2010).

    25. Doerr, M., Kritsotaki, A., Stead, S.: Thesauri of Historical Periods - A Proposal for Standardization. In Proceedings of CIDOC Conference, (2005).

    26. Allen, J. F.: Maintaining Knowledge about Temporal Intervals. In Communications of the ACM, vol. 26(11), pp. 832--843, (1983).

    27. Witte, R., Krestel, R., Kappler, T., Lockemann, P.C.: Converting a Historical Architecture Encyclopedia into a Semantic Knowledge Base. In IEEE Intelligent Systems, vol. 25(1), pp. 58--67, (2010).

    28. Reutelshoefer, J., Lemmerich, F., Baumeister, J., Wintjes, J., Haas, L.: Taking OWL to Athens: Semantic Web technology takes Ancient Greek history to students. In Proceedings of the 7th Extended Semantic Web Conference (ESWC), pp. 333--347, (2010).

    29. Bonomi, A., Mantegari, G., Vizzari, G.: A Framework for Ontological Description of Archaeological Scientific Publications. In SWAP, (2006).

    30. Bonomi, A., Mosca, A., Palmonari, M., Vizzari, G.: Integrating a Wiki in an Ontology Driven Web Site: Approach, Architecture and Application in the Archaeological Domain. In Proceedings of the Third Workshop on Semantic Wikis: The Wiki Way of Semantics (SemWiki), (2008).

    31. Mantegari, G., Palmonari, M., Vizzari, G.: Rapid Prototyping a Semantic Web Application for Cultural Heritage: The Case of MANTIC. In Proceedings of the 7th Extended Semantic Web Conference (ESWC), pp. 406--410, (2010).

    32. Leclercq, E., Savonnet, M.: Access and annotation of archaeological corpus via a semantic wiki. In Fifth Workshop on Semantic Wikis - Linking Data and People (SemWiki), (2010).