

Expert Systems with Applications 39 (2012) 3213–3222

journal homepage: www.elsevier.com/locate/eswa

Towards summarizing knowledge: Brief ontologies

Julián Garrido*, Ignacio Requena
Dep. Computer Science and A.I., ETSI Informática y Telecomunicaciones, University of Granada, Daniel Saucedo Aranda s/n, 18071 Granada, Spain

Article info

Keywords: Brief ontology; Knowledge representation; Knowledge mobilization; Environmental impact assessment

0957-4174/$ - see front matter © 2011 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2011.09.008

* Corresponding author. E-mail addresses: [email protected] (J. Garrido), [email protected] (I. Requena).

Abstract

There are currently two major tendencies in ontology development. The first is typical of specialized knowledge areas such as genetics, biology, medicine, etc., and involves building large ontologies in order to include all the concepts in the knowledge field. In contrast, the second type of ontology development contains fewer concepts, and is more focused on the semantic completeness of concept definitions rather than on the number of concepts. As an innovative solution to both problems, this paper presents an algorithm that permits the extraction of brief ontologies from large ontologies, thus reducing the number of concepts and the semantic complexity of their definitions. In addition, this algorithm has the advantage of being a user-centered tool, capable of automatically building brief ontologies, as exemplified in a case study in the area of environmental science.

© 2011 Elsevier Ltd. All rights reserved.

1. Introduction

Ontologies, as collections of data, have become increasingly large and complex. This has the disadvantage of obliging users who are only interested in one part of the ontology to use all of it. Various authors have studied how this problem can be solved in a fairly short time and with a low computational cost. Many related approaches, such as the one described in the following sections, also target this problem.

One solution is to divide large ontologies into subdivisions with fewer member concepts. This would make them more manageable and less time consuming to handle. Still another possibility is to find a way to partially reuse ontologies without importing the whole ontology.

Nevertheless, the problem not only lies in the number of concepts, but also in the complexity inherent in the semantic definitions of the concepts. For instance, a concept may be defined in terms of other concepts from very heterogeneous knowledge sources. However, the user may not be interested in all of the definition, but only in the portion of it that is related to a particular subject field.

For example, John can be regarded as an instance of the class PERSON, whose definition contains his physical properties (size, age, weight, eye color, etc.), family relationships (children, parents, siblings, etc.), possessions owned (house, car, land, etc.), and job information (work type, company, salary, etc.). Nonetheless, the user of the ontology might only be interested in John's blood relatives and the company where he works because the user wishes to study John in relation to his relatives who are employed by the company.

This paper deals with both problems, and presents an approach to building ontologies with the relevant knowledge by reducing the number of concepts and simplifying the complexity of definitions. These ontologies are called brief ontologies.

In general, brief ontologies have a wide range of advantages when, for whatever reason, the user or application does not wish to deal with the whole ontology. Sometimes users may not be interested in accessing so much information, or they may not have a tool capable of handling such a large resource. Mobile devices are an example of this because they may be limited by their autonomy, bandwidth, and/or computational restrictions.

If the mobile device uses a brief ontology, it is not obliged to store a complete copy of the ontology when it only needs to use a small part of it. This avoids requesting, sending, and managing unnecessary information. The application thus saves on power consumption, time, and bandwidth. It can even improve reasoning time because of the reduced amount of knowledge.

Brief ontologies also improve knowledge mobilization because they make knowledge available, and at the same time, take into account user needs and context. In this respect, environmental impact assessment (EIA) is an example of an application where knowledge mobilization, context, and brief ontologies can be used together. This is accomplished by requesting EIA brief ontologies from mobile applications in real time.

The remainder of this paper is organized as follows. Section 2 briefly summarizes relevant research work. Section 3 then describes basic concepts related to ontologies and description logic, in particular, the concept of brief ontology. After an explanation of the terminology and notation used in this paper, Section 4 introduces brief ontology extraction, and presents the algorithm and graph underlying this process. This is followed by a description of the ontology building architecture and the user interface in Section 5. Finally, Section 6 presents a case study in environmental science which analyzes a knowledge model and application. The final sections give the conclusions derived from this research and the references cited.

2. Antecedents

One of the first approaches to reducing an ontology to a manageable size is described in Gu, Perl, Geller, Halper, and Singh (1999). In this study, the authors partitioned a large, complex semantic network into separate smaller sub-networks. They made the vocabulary more understandable to users by using a graphical representation of the meaning units. The vocabulary was reduced because, when a vocabulary is very large, it loses its intuitive appeal.

Delgado, Pérez-Pérez, and Requena (2005) introduced the concepts of re-addressable and brief ontologies for the first time. Their article describes a system for knowledge mobilization based on ontologies, web services, and a multi-agent architecture to address requests until the desired knowledge is obtained. Mobile devices contain re-addressable ontologies. This allowed them to communicate with the brief ontologies in the server, and thus access the information in a larger, generic ontology.

Kim, Caralt, and Hilliard (2007) made a study of pruning methods applied to bio-ontologies, and Pathak, Johnson, and Chute (2009) reviewed techniques based on logical formalisms or graph theories applied to modular ontologies. Even more relevant to our study, Stuckenschmidt and Schlicht (2009) divided a large ontology into a set of modules, each with a set of semantically related concepts though without strong dependencies between modules. The ontology was automatically translated into a weighted graph where nodes represent concepts and links represent dependencies (structural properties), which are weighted depending on the strength of the dependency. According to these authors, this approach is particularly helpful for the visualization of large ontologies and the extraction of key topics.

Noy and Musen (2009) developed the concept of Traversal View for ontologies, which is similar in meaning to database views. The two mechanisms proposed for its implementation are, firstly, the definition of starter concepts for a traversal algorithm, and secondly, the specification of meta-information in the ontology to describe how and from which perspective the concepts and relationships should appear. Furthermore, ontology views can also be built the same way as in databases by specifying a query in some ontology-query language (query-based approach).

Certain approaches, such as the one previously mentioned, extract modules with traversal algorithms. However, the algorithm presented in this paper goes far beyond previous research because it creates copies of the selected concepts and explores their definitions or structural relationships at the same time as it prunes these definitions and relationships. For this reason, it includes only relevant knowledge and generates a self-contained ontology.

3. Brief ontology

Description logics (DLs) (Baader, Calvanese, McGuinness, Nardi, & Patel-Schneider, 2003) have frequently been used for knowledge representation because they provide a precise characterization of a knowledge base and reasoning capabilities.

Logics based on Attributive Language belong to the family of description logics, which have an informal naming convention for describing the operators. For example, ALC stands for Attributive Language with Complements of complex concepts. It corresponds to first-order logic with the syntax restricted to formulas with two variables (ALC with role transitivity is also known as S). Another case, SHOIN(D), is an ALC logic with role transitivity, extended with role hierarchies (H), nominals (O), inverses on roles (I), unqualified number restrictions (N), and datatypes (D). SHOIN(D) forms the core of OWL-DL (Ontology Web Language-Description Logic), and is almost its equivalent. Consequently, this paper focuses on this logic and on OWL-DL.

An ontology is composed of a TBox, RBox, and ABox (KT, KR, KA). The Terminological Box (TBox) contains intensional knowledge (elements defined by the allowed operators). The Assertional Box (ABox) contains the extensional knowledge (assertions about individuals), and the Role Box (RBox) contains a finite set of axioms for roles.

The concepts may be defined using atomic primitives (operators) in ALC logic. However, the more operators in the logic, the more expressive it is. In accordance with the W3C (World Wide Web Consortium), the application works with OWL ontologies. Table 1 shows the OWL-DL operators, their equivalent operators in SHOIN(D) logic, and their semantic descriptions. SHOIN(D) semantics is defined by an interpretation I. It consists of the interpretation domain (Δ^I) and an interpretation function, where atomic concepts are interpreted as subsets of the interpretation domain, and atomic roles are interpreted as binary relations belonging to Δ^I × Δ^I. According to Table 1, the semantics of other constructors can be specified by defining the individuals represented by each construct (Baader et al., 2003). For each constructor, the semantics of the resulting class is defined in terms of the semantics of its components (Horrocks & Patel-Schneider, 2004).
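To make the set-theoretic semantics concrete, the following sketch (illustrative Python, not part of the paper; the domain, concept, and role are invented for the example) evaluates two of the constructors of Table 1 over a small finite interpretation:

```python
# A small finite interpretation I: a domain, one atomic concept C (a
# subset of the domain), and one atomic role R (a set of pairs).
domain = {1, 2, 3, 4}
C = {2, 3}                      # C^I
R = {(1, 2), (1, 4), (3, 3)}    # R^I, a subset of domain x domain

def some_values(R, C, domain):
    """(∃R.C)^I = {x | there is y with (x, y) in R^I and y in C^I}."""
    return {x for x in domain if any((x, y) in R and y in C for y in domain)}

def all_values(R, C, domain):
    """(∀R.C)^I = {x | every y with (x, y) in R^I is in C^I}."""
    return {x for x in domain if all(y in C for (z, y) in R if z == x)}

print(some_values(R, C, domain))  # {1, 3}
print(all_values(R, C, domain))   # {2, 3, 4}: 2 and 4 have no R-successors
```

Note that elements with no R-successors satisfy the universal restriction vacuously, exactly as in the table's semantics.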

In our proposal, a brief ontology OB is built from another ontology O. This brief ontology contains only the essential knowledge of O for a particular purpose, in terms of axioms as well as intensional and extensional knowledge. This concept is formally described in Definition 1.

Definition 1. For an ontology O = (KT, KR, KA), a brief ontology is an ontology OB = (KT^B, KR^B, KA^B) such that KT^B ⊑ KT ∧ KR^B ⊑ KR ∧ KA^B ⊑ KA, and for every concept C ∈ KT with an equivalent concept CB ∈ KT^B, the definition of C exactly matches the definition of CB, or the definition of CB is a generalization of the definition of C.

4. Brief ontology extracting process

The goal of the extraction process is to build a new ontology (brief ontology, OB) without the huge quantity of information contained in the original (large ontology, O). For this reason, a mechanism is needed that enables this reduction.

The algorithm used to build OB is a recursive process over O which explores individuals, concepts, and their axioms. However, only the relevant ones will be part of the new OB. For this reason, the structure of concepts and their expressions (i.e. how they are defined) is particularly significant.

Each concept C is defined by a concept expression D, which may be a complete definition in the case of defined concepts (C ≡ D) or an incomplete definition in the case of primitive concepts (C ⊑ D).

Although some axioms or concept expressions can be as simple as ∃r.C, others can be combinations of negations (¬C), unions (o1 ⊔ o2), or intersections of concepts (C1 ⊓ … ⊓ Cn), individuals, primitive values, or concept subexpressions. In all cases, concept expressions are composed of the primitive operators in Table 1.
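Since the extraction algorithm repeatedly asks which roles, concepts, and individuals occur in a concept expression, it helps to see one possible encoding. The sketch below is illustrative only (the nested-tuple encoding is ours, not the paper's); it represents an expression of the form C2 ⊓ ∃r1.C3 ⊓ C10 as a tree and collects what it mentions:

```python
# Illustrative encoding (not from the paper): a concept expression is a
# nested tuple whose head names the constructor.
EXPR = ("and",
        ("concept", "C2"),
        ("some", "r1", ("concept", "C3")),   # ∃r1.C3
        ("concept", "C10"))

def mentions(expr):
    """Collect the roles, concepts, and individuals occurring in an expression."""
    roles, concepts, individuals = set(), set(), set()
    kind = expr[0]
    if kind == "concept":
        concepts.add(expr[1])
    elif kind == "individual":
        individuals.add(expr[1])
    elif kind in ("some", "all"):            # ∃r.C / ∀r.C
        roles.add(expr[1])
        r, c, i = mentions(expr[2])
        roles |= r; concepts |= c; individuals |= i
    elif kind in ("and", "or", "not", "one-of"):
        for sub in expr[1:]:
            r, c, i = mentions(sub)
            roles |= r; concepts |= c; individuals |= i
    return roles, concepts, individuals

# mentions(EXPR): role r1; concepts C2, C3, C10; no individuals
```

Such a traversal is all the extraction algorithm needs in order to decide which elements of an expression are candidates for the brief ontology.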

Page 3: Towards summarizing knowledge: Brief ontologies

Table 1
Syntax and semantics of SHOIN(D) (Horrocks & Patel-Schneider, 2004).

OWL fragment | SHOIN(D) syntax | Semantics
A, OWL class | A | A^I ⊆ Δ^I
D, datatype | D | D^D ⊆ Δ_D
R, OWL object property | R | R^I ⊆ Δ^I × Δ^I
T, OWL datatype property | T | T^I ⊆ Δ^I × Δ_D
o, OWL individual | o | o^I ∈ Δ^I
v, OWL data value | v | v^I = v^D
intersectionOf(C1 … Cn) | C1 ⊓ … ⊓ Cn | (C1 ⊓ … ⊓ Cn)^I = C1^I ∩ … ∩ Cn^I
unionOf(C1 … Cn) | C1 ⊔ … ⊔ Cn | (C1 ⊔ … ⊔ Cn)^I = C1^I ∪ … ∪ Cn^I
complementOf(C) | ¬C | (¬C)^I = Δ^I \ C^I
oneOf(o1 … on) | {o1, …, on} | {o1, …, on}^I = {o1^I, …, on^I}
restriction(R r1 … rn) | Rr1 ⊓ … ⊓ Rrn | (Rr1 ⊓ … ⊓ Rrn)^I = (Rr1)^I ∩ … ∩ (Rrn)^I
restriction(R someValuesFrom(C)) | ∃R.C | (∃R.C)^I = {x ∈ Δ^I | ∃y. (x, y) ∈ R^I and y ∈ C^I}
restriction(R allValuesFrom(C)) | ∀R.C | (∀R.C)^I = {x ∈ Δ^I | ∀y. (x, y) ∈ R^I → y ∈ C^I}
restriction(R hasValue(o)) | ∃R.o | (∃R.o)^I = {x ∈ Δ^I | (x, o^I) ∈ R^I}
restriction(R cardinality(n)) | ≥nR ⊓ ≤nR | (≥nR ⊓ ≤nR)^I = (≥nR)^I ∩ (≤nR)^I
restriction(R maxCardinality(n)) | ≤nR | (≤nR)^I = {x | #({y | (x, y) ∈ R^I}) ≤ n}
restriction(R minCardinality(n)) | ≥nR | (≥nR)^I = {x | #({y | (x, y) ∈ R^I}) ≥ n}
restriction(T r1 … rn) | Tr1 ⊓ … ⊓ Trn | (Tr1 ⊓ … ⊓ Trn)^I = (Tr1)^I ∩ … ∩ (Trn)^I
restriction(T someValuesFrom(D)) | ∃T.D | (∃T.D)^I = {x ∈ Δ^I | ∃y. (x, y) ∈ T^I and y ∈ D^D}
restriction(T allValuesFrom(D)) | ∀T.D | (∀T.D)^I = {x ∈ Δ^I | ∀y. (x, y) ∈ T^I → y ∈ D^D}
restriction(T hasValue(v)) | ∃T.v | (∃T.v)^I = {x ∈ Δ^I | (x, v^D) ∈ T^I}
restriction(T cardinality(n)) | ≥nT ⊓ ≤nT | (≥nT ⊓ ≤nT)^I = (≥nT)^I ∩ (≤nT)^I
restriction(T maxCardinality(n)) | ≤nT | (≤nT)^I = {x | #({y | (x, y) ∈ T^I}) ≤ n}
restriction(T minCardinality(n)) | ≥nT | (≥nT)^I = {x | #({y | (x, y) ∈ T^I}) ≥ n}
oneOf(v1 … vn) | {v1, …, vn} | {v1, …, vn}^I = {v1^D, …, vn^D}


Before further describing the algorithm, it is necessary to define preliminary concepts such as restrictive role, relevant role, and main concept.

Definition 2. For an ontology O = (KT, KR, KA), a restrictive role is a role r ∈ KR which is used during the brief ontology building in order to limit the knowledge included and stop the recursive process.

Definition 3. For an ontology O = (KT, KR, KA) and a brief ontology OB = (KT^B, KR^B, KA^B) of O, a relevant role is a role rB ∈ KR^B such that there is a role r ∈ KR equivalent to rB, where r is defined as a restrictive role in O to build OB.

Definition 4. For an ontology O = (KT, KR, KA), a main concept is a concept C ∈ KT which is used in the brief ontology building as the starting point of the recursive process.

4.1. Algorithm

This subsection describes the algorithms needed in the extraction process of the brief ontologies. Algorithm 1 builds the brief ontology from the original, given a previously defined set of main concepts and a set of restrictive roles. It then makes a call to Algorithm 2 for each main concept. Algorithm 2 (createTaxonomy) creates a copy of the concept from the original ontology in the brief ontology, and if its concept expression is considered valid, it makes a recursive call to itself to add the individuals or concepts used in the concept expression to the brief ontology. Finally, Algorithm 3 verifies the validity of a concept expression.

With regard to Algorithm 2, given a concept C ∈ O and its concept expression D, if C is added to OB, then every concept or individual in D is a candidate for inclusion in OB. If the concept is added, its concept expression is then explored in order to find new candidates.

Moreover, the roles used in a concept definition express part of the semantics. In particular, restrictive roles permit the knowledge incorporated in OB to be controlled by the selection of suitable concepts and individuals for the particular purpose of the ontology OB.

Restrictive roles are used in the process of candidate selection. This is because only concepts and individuals directly related to relevant roles in their concept expressions are added to OB. These roles are also used during the pruning process. Consequently, the knowledge in the brief ontology is delimited by specifying the restrictive roles and the main concepts where the recursive process begins.

Given A, R, o, the set of concepts, roles, and individuals, respectively, in O, and AB, RB, oB, the corresponding ones in OB, Algorithms 1–3 summarize the brief ontology extraction process in terms of the following notation and considerations:

≅ represents the equivalence of a concept, role, or individual in OB to another in O.
⊑ represents the subsumption of a concept by another concept.
∈ represents the membership of an element in a specific set.
⊲ represents the presence of a specific concept, individual, or role in a concept expression (e.g., P ⊲ E).

Most OWL APIs provide a set of operations to edit and manage the ontologies. Regarding the Protégé owl-api, the operation Create a concept in Algorithm 2 corresponds to the method createOWLNamedClass. Accordingly, it creates a new concept with the specified name and superclass. The operation Retrieve all concepts that subsume C is the same as the method getSuperclasses, and the operation Subsumption of a concept C by another one is accomplished with the method addSuperclass. The operation Create an individual corresponds to the method createOWLIndividual, but requires the previous creation of other individuals, which are used as resources by means of relevant properties.

Each expression in the algorithms preceded by the symbol ▹ represents an assertion that is true at that point of execution.

Algorithm 1. A new brief ontology OB is built by taking as inputs the original ontology O, a set of main concepts MC, and a set of restrictive roles RR belonging to O:

1. To create roles in OB equivalent to the restrictive roles in O: the relevant roles.
   ▹ ∀r ∈ RR | r ∈ R ⇒ ∃rB ∈ RB ∧ r ≅ rB
2. For each concept C | C ∈ O ∧ C ∈ MC: createTaxonomy(C) (Algorithm 2).
3. To create the concept MCB (main concept) in OB.
4. To subsume the main concepts under MCB.
   ▹ ∀C ∈ MC ⇒ ∃C′ ∈ AB ∧ C′ ⊑ MCB ∧ C′ ≅ C

Algorithm 2. createTaxonomy(C), recursive algorithm to create a brief concept CB in OB, equivalent to C. Nothing happens if the concept already exists:

1. If ∃CB ∈ OB | C ≅ CB Then: End Algorithm 2.
2. To create a brief concept CB in OB equivalent to C.
   ▹ CB ∈ OB | C ≅ CB ∧ C ∈ A
3. To retrieve all concepts that subsume C.
   ▹ AC = {C′ | C′ ∈ A ∧ C ⊑ C′}
4. For each concept C′ ∈ AC:
   (a) createTaxonomy(C′) (Algorithm 2).
       ▹ ∃C″ ∈ AB | C″ ≅ C′ ∧ C′ ∈ A
   (b) To subsume CB by C″.
       ▹ CB ⊑ C″ ∧ C″ ∈ AB
5. To retrieve all concept expressions of C: GE.
6. For each concept expression E ∈ GE:
   If validExpression(E) (Algorithm 3) Then
   (a) To retrieve all concepts that appear in E, together with the concepts whose individuals appear in E.
       ▹ GP = {P1 ∈ O | P1 ∈ A ∧ P1 ⊲ E} ∪ {P2 ∈ O | P2 ∈ A ∧ u ∈ o ∧ u ∈ P2 ∧ u ⊲ E}
   (b) For each concept P ∈ GP:
       (i) createTaxonomy(P) (Algorithm 2).
   EndIf
7. For each concept expression E ∈ GE:
   If validExpression(E) (Algorithm 3) Then
   (a) To retrieve all individuals in E.
       ▹ IE = {u ⊲ E | u ∈ o}
   (b) For each individual u ∈ IE:
       (i) To create an equivalent individual uB.
           ▹ ∃uB ∈ oB | u ≅ uB ∧ u ∈ o
   (c) If C is subsumed by E and E is not a concept in A Then
       ▹ C ⊑ E ∧ E ∉ A
       (i) To build an equivalent concept expression EB ∈ OB | EB ≅ E ∧ E ∈ O:
           ▹ ∀P1 ⊲ E | P1 ∈ A ⇒ ∃P1B ⊲ EB | P1B ∈ AB ∧ P1 ≅ P1B
           ▹ ∀u ⊲ E | u ∈ o ⇒ ∃uB ⊲ EB | uB ∈ oB ∧ u ≅ uB
       (ii) To subsume concept CB by EB.
           ▹ CB ⊑ EB
   (d) ElseIf E is a complete definition of C and E is not a concept in A Then
       ▹ C ≡ E ∧ E ∉ A
       (i) To build an equivalent concept expression EB ∈ OB | EB ≅ E ∧ E ∈ O.
           ▹ ∀P1 ⊲ E | P1 ∈ A ⇒ ∃P1B ⊲ EB | P1B ∈ AB ∧ P1 ≅ P1B
           ▹ ∀u ⊲ E | u ∈ o ⇒ ∃uB ⊲ EB | uB ∈ oB ∧ u ≅ uB
       (ii) To set EB as the complete definition of CB.
           ▹ CB ≡ EB
   EndIf
   EndIf

Algorithm 3. validExpression(E), algorithm to evaluate whether a concept expression is valid, taking as input an expression E:

1. To retrieve all restrictive roles RR in O.
2. If all roles in E are in RR Then
   (a) E is a valid expression.
       ▹ ∀r ⊲ E | r ∈ R ⇒ r ∈ RR ∧ ∃rB ∈ RB ∧ r ≅ rB

4.2. Graph representation

Let us describe an ontology O using a graph representation that focuses on which concepts or individuals are related by the definition in the concept expressions, rather than by IS-A relationships. For this reason, it does not include data values (e.g. float, string), operators, or the IS-A relationship. Instead, it only includes the relevant elements for the algorithm: concepts, roles, and individuals.

The graph representation is interpreted as follows:

- An elliptical node with a label represents a concept.
- A rectangular node with no label is called a blank node. It represents a concept expression which defines a concept, such as a one-of operator or an intersection of concept subexpressions. More than one blank node might be associated with a concept because it can be defined by several complete definitions and one incomplete definition.
- A rectangular node with a label represents an individual. If there is no blank node between a concept and an individual, this means that the individual is an instance of the concept.
- An arc with a label represents a role used in the concept expression at the blunt end of the arc. The label on the arc identifies the role. The node at the sharp end of the arc represents a concept or individual linked to the concept expression through the role.
- An arc without a label between an elliptical node and a blank node represents the fact that the concept is defined by the expression identified by the blank node.
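As a sketch of this representation (illustrative Python; the node and edge encoding is ours, not the paper's), each concept expression becomes a blank node, with role-labeled arcs to the elements it mentions:

```python
# Each expression of a concept becomes one blank node; arcs carry the
# role label (or None for the unlabeled concept-to-blank-node arc).
# Encoding: expressions[c] = one dict per expression, mapping each
# referenced concept/individual to the role that links it.
expressions = {
    # C3 has two defining expressions: ∃r3.C5 and ∃r4.C4 ⊓ ∃r4.C6
    "C3": [{"C5": "r3"},
           {"C4": "r4", "C6": "r4"}],
}

def to_graph(expressions):
    edges = []
    for concept, exprs in expressions.items():
        for i, targets in enumerate(exprs):
            blank = f"_:{concept}/{i}"            # one blank node per expression
            edges.append((concept, None, blank))  # unlabeled defining arc
            for target, role in targets.items():
                edges.append((blank, role, target))
    return edges

edges = to_graph(expressions)
blank_nodes = {e[2] for e in edges if e[1] is None}
print(len(blank_nodes))  # 2: C3 is defined by two concept expressions
```

This mirrors the figure's convention: the two blank nodes attached to C3 correspond to its two definitions, and the labeled arcs carry r3 and r4.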


Fig. 1. Ontology representation.


4.3. Example

Let us take a small part of an ontology as an example, in which the concepts do not have exhaustive definitions, in order to graphically explain the results and effects of the algorithm. It includes the following common types of expression:

C1 ≡ C2 ⊓ ∃r1.C3 ⊓ C10
C2 ≡ ∀r2.C4
C3 ≡ ∃r3.C5
C3 ≡ ∃r4.C4 ⊓ ∃r4.C6
C4 ≡ {v1, v2}
C5 ≡ ∃r7.C7 ⊓ ∃r8.C8
C6 ≡ ∃r7.C9
C10 ≡ ∃r8.v1 ⊔ ∃r8.v2

Fig. 1 shows the graphical representation of the preceding example, where the IS-A relationships between concepts are omitted for the sake of simplicity. It thus focuses on the concepts or individuals used in the definition of the concept expression. Furthermore, this representation makes it possible to identify at a glance the set of roles used in a concept.

Let us now examine concept C3, which is defined by two concept expressions. Although the second concept expression of C3 is composed of an intersection of existential clauses, this information does not appear in Fig. 1. C3 is exclusively related to three blank nodes. The first node represents the concept expression in which C3 is used, whereas the second and third blank nodes represent the concept expressions which define C3.

In contrast, C4 is related to four elements in the graph. The first is a blank node, which means that C2 has a concept expression where the role r2 links C4 to C2. In other words, C2 is defined exclusively in terms of r2 and C4. The second is a blank node with several arcs, which means that C4 and r4, as well as others, are used in the definition of C3. The last two rectangles represent the fact that v1 and v2 are individuals of the concept C4.

Fig. 2 shows the graphical representation after applying the algorithm in two cases where concept C1 is the starting point (MC).

In the first case, Fig. 2a represents a brief ontology with the relevant roles r1 and r8. More specifically, v1 and v2 belong to the brief ontology because they are connected to C10 by r8 (a relevant role). Concept C4 is also added because v1 and v2 are its individuals.

In the second case, Fig. 2b represents another brief ontology in which r1, r4, and r7 are the relevant roles. The blank boxes of C2 and C10 are not included because roles r2 and r8 are not in the set of restrictive roles.

Concepts C4 and C6 are added because they are connected to a blank node of C3 by the role r4. Regarding C3, despite the fact that two blank boxes are connected to it, only one of them has been added to the brief ontology. This is because all the roles in the first blank box of C3 belong to the set of restrictive roles (r4), whereas r3 does not belong to the set.

It is worth mentioning that v1 and v2 were not added because they are not directly connected to any blank box in the brief ontology, despite the fact that C4 is in Fig. 2b.

Fig. 2. Brief ontologies of Fig. 1.
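To check the extraction against Fig. 2, the following sketch re-implements it over this example ontology. It is illustrative Python, not the paper's code: the data encoding is ours, and we read Algorithm 3 as requiring an expression to mention at least one restrictive role, a reading which reproduces the behaviour described above (e.g., v1 and v2 not being added in Fig. 2b):

```python
# Each expression is (roles_used, concepts_mentioned, individuals_mentioned).
EXPRESSIONS = {
    "C1":  [({"r1"}, {"C2", "C3", "C10"}, set())],
    "C2":  [({"r2"}, {"C4"}, set())],
    "C3":  [({"r3"}, {"C5"}, set()), ({"r4"}, {"C4", "C6"}, set())],
    "C4":  [(set(), set(), {"v1", "v2"})],        # C4 = {v1, v2}
    "C5":  [({"r7", "r8"}, {"C7", "C8"}, set())],
    "C6":  [({"r7"}, {"C9"}, set())],
    "C10": [({"r8"}, set(), {"v1", "v2"})],
}
INSTANCES = {"C4": {"v1", "v2"}}                   # v1, v2 are individuals of C4

def extract(main_concepts, restrictive_roles):
    concepts, individuals = set(), set()

    def valid(roles):  # our reading of Algorithm 3 (see lead-in)
        return bool(roles) and roles <= restrictive_roles

    def create_taxonomy(c):  # Algorithm 2, pruning invalid expressions
        if c in concepts:
            return
        concepts.add(c)
        for roles, cs, inds in EXPRESSIONS.get(c, []):
            if not valid(roles):
                continue                           # this definition is pruned
            for p in cs:
                create_taxonomy(p)
            for u in inds:
                individuals.add(u)
                for owner, inst in INSTANCES.items():
                    if u in inst:                  # concepts whose individuals occur in E
                        create_taxonomy(owner)

    for c in main_concepts:                        # Algorithm 1, step 2
        create_taxonomy(c)
    return concepts, individuals

concepts_a, individuals_a = extract({"C1"}, {"r1", "r8"})
# Fig. 2a: concepts C1, C2, C3, C10 and C4; individuals v1, v2 (via C10 and r8)
concepts_b, individuals_b = extract({"C1"}, {"r1", "r4", "r7"})
# Fig. 2b: C4 and C6 are added via C3's second expression; no individuals
```

Under this reading, the first call yields exactly the concepts of Fig. 2a with individuals v1 and v2, while the second adds C6 (and C9, through C6's definition and r7) instead, and no individuals.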

5. Brief ontology builder

This section describes the brief ontology builder in a less theoretical way, with emphasis on its tools, languages, and architecture.

At this point, there is a change in the terminology referring to the ontology elements (see column 1 of Table 1). Concepts and roles will be called classes and properties, respectively. Properties are divided into datatype properties, whose range is chosen from a set of standard data types (i.e. RDF literals or schema data types), and object properties, which are relationships between two classes.

Additionally, each complete definition for defined concepts or incomplete definition for primitive concepts has a relationship of class equivalence (C ≡ D) or subsumption (C ⊑ D), respectively.
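The two axiom kinds can be captured in a small data model; the enum and method names below are our own illustration, not part of Table 1 or the BRONER API:

```java
// Illustrative data model for the OWL-level terminology: the two property
// kinds, and the two axiom kinds that attach a concept expression to a class.
public class OwlTerms {
    enum PropertyKind { OBJECT, DATATYPE }      // object: class-to-class link;
                                                // datatype: class-to-literal link
    enum AxiomKind { EQUIVALENCE, SUBSUMPTION } // complete vs. incomplete definition

    // Render an axiom the way the paper writes it: C ≡ D for defined
    // concepts, C ⊑ D for primitive ones.
    static String render(String cls, AxiomKind kind, String expr) {
        String op = (kind == AxiomKind.EQUIVALENCE) ? "≡" : "⊑";
        return cls + " " + op + " " + expr;
    }

    public static void main(String[] args) {
        System.out.println(render("C2", AxiomKind.EQUIVALENCE, "∃r2.C4"));
        System.out.println(render("LeachatePond", AxiomKind.SUBSUMPTION, "WasteManagement"));
    }
}
```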

The concept expressions are defined by the operators and restrictions, which are in the second and third groups of Table 1: intersections, unions, restrictions, etc. In addition, Fig. 3 shows the layered architecture of the brief ontology builder, as described in the following paragraphs.

3218 J. Garrido, I. Requena / Expert Systems with Applications 39 (2012) 3213–3222

The BRief ONtology buildER (BRONER) framework was developed as an API for the Java language because of Java's facilities, and also the large number of Java tools, including but not limited to the JENA (McBride, 2002) and SESAME (Broekstra, Kampman, & Harmelen, 2002) APIs, D2RQ (Bizer & Seaborne, 2004), KAON2 (Oberle, Eberhart, Staab, & Volz, 2004), reasoners (Bobillo & Straccia, 2008; Sirin, Parsia, Grau, Kalyanpur, & Katz, 2007), and visual representation tools (Dokulil & Katreniakov, 2009; Goyal & Westenthaler, 2009). Some of these tools will make it possible to expand BRONER's functionality in future versions.

In particular, the ontology builder was developed with the Java API provided by Noy, Sintek, Crubézy, Fergerson, and Musen (2001) to manage OWL ontologies, and was built above the reasoner layer.

The BRONER API has been tested with the Pellet (Sirin et al., 2007) and Racer (Möller, 2007) reasoners, although any other DIG reasoner could have been used instead, since the DIG interface provides uniform access to description logic reasoners.

The BRONER UI (user interface) is at the topmost layer in Fig. 3. The user interface is based on the Protégé framework, whereas its functionalities are based on the brief ontology builder API.

Protégé provides a plugin framework to define all user interfaces as an extension of Protégé. This includes tabs, slots, back-end, etc. For this reason, the BRONER UI is distributed as a tab-plugin encapsulated in Protégé.

Fig. 4 shows a screen capture of the Protégé BRONER plugin. The screen is divided into six sections. The first section permits the management of one of the required inputs for building the brief ontology. It allows the user to explore the hierarchy of the original ontology and select the main concepts, which will be the starting point of the algorithm.

The second section contains the list of concepts which have been selected by the user as the starting point for the algorithm. It also includes the button to start the building process for the new brief ontology.

Once the building process has finished, the third section displays the hierarchy for the brief ontology. Detailed information about this ontology can be displayed in a new window by clicking on a concept.

Moreover, if the plugin is working in a verbose mode, the last section contains text messages from the building process. These messages provide information about how the brief ontology was built. For instance, a new message is attached when a concept is added to the brief ontology, or another message is displayed if the concept is rejected. Such a rejection can occur when a concept expression contains a property which does not belong to the set of restrictive properties.
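The rejection rule behind these messages can be sketched as a simple predicate; the class name and the message wording below are our own illustration of the idea, not BRONER's actual output:

```java
import java.util.*;

// Sketch of the verbose-mode check: a concept expression is rejected when it
// uses a property outside the restrictive set, and a message records why.
public class VerboseCheck {
    static String check(String concept, Set<String> usedProps, Set<String> restrictive) {
        List<String> offending = new ArrayList<>();
        for (String p : usedProps)
            if (!restrictive.contains(p)) offending.add(p); // cause of rejection
        return offending.isEmpty()
                ? "added: " + concept
                : "rejected: " + concept + " (uses non-restrictive " + offending + ")";
    }

    public static void main(String[] args) {
        System.out.println(check("C2", Set.of("r2"), Set.of("r1", "r4", "r7")));
        System.out.println(check("C3", Set.of("r4"), Set.of("r1", "r4", "r7")));
    }
}
```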

The fifth and sixth sections contain the lists of restrictive object and datatype properties. Each 'add' button opens a window that displays the hierarchies for the object and datatype properties, respectively. These properties can be explored and selected to be added to their respective list, as shown in Fig. 4.

Fig. 3. Brief ontology builder architecture.

Furthermore, the user can switch to another ontology by filling in a form with a new URI (uniform resource identifier) from the main toolbar of the plugin.

5.1. Additional methodology

As previously mentioned, BRONER can be used to extract relevant knowledge from an ontology. However, the user may need to extract unconnected parts from the same ontology.

This problem is solved by a suitable selection of the main concepts before applying the algorithm to the ontology. Nevertheless, this is not feasible if the user wants to use a different set of properties to retrieve every part of the ontology.

One solution to this problem is to enhance the model by adding the possibility of assigning different sets of restrictive properties to the main concepts. Although this makes the algorithm more flexible, it also makes it more complicated and less intuitive. Nevertheless, the main problem with this type of solution is that it requires a less efficient algorithm along with a deeper comprehension of the ontology. This means that it would be more difficult to build the brief ontology on the first try.

A second possible solution is an iterative methodology that allows the easy extraction of different parts of the ontology, based on trial and error. Fig. 5 shows the work flow for this approach. First of all, the user must select and open the ontology from which he wishes to extract the relevant knowledge for his problem. Secondly, he must examine and select the main concepts and restrictive roles. BRONER needs this information so that it can begin to extract the relevant knowledge. The brief ontology is created in the third step.

After the creation of the brief ontology, the user can inspect it and check whether it matches the expected outcome. Depending on whether or not the user is satisfied, he can either discard or save the brief ontology (steps 5 and 6). If the user wishes to generate another brief ontology, he can return to the second step and start a new iteration. Once there is no further need for iteration, all the stored brief ontologies are merged to produce a new one that contains the relevant parts of the original ontology. The matching and merging process is quite simple because all the brief ontologies are parts of the same general ontology.
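Because every brief ontology is a fragment of the same source, shared names denote the same entities, so the final merging step amounts to a set union. The sketch below reduces each brief ontology to a set of axiom strings for illustration; the `BriefMerge` class and axiom spellings are our own assumptions, not BRONER's API:

```java
import java.util.*;

// Sketch of the final merging step of the iterative workflow: brief
// ontologies taken from the same source ontology merge by simple union,
// since identical axioms denote the same statements.
public class BriefMerge {
    static Set<String> merge(List<Set<String>> briefOntologies) {
        Set<String> merged = new LinkedHashSet<>();
        briefOntologies.forEach(merged::addAll); // identical axioms collapse
        return merged;
    }

    public static void main(String[] args) {
        Set<String> a = Set.of("Landfill ⊑ rootBriefModel", "Impact ⊑ Thing");
        Set<String> b = Set.of("LeachatePond ⊑ rootBriefModel", "Impact ⊑ Thing");
        // The shared axiom appears once in the merged result
        System.out.println(merge(List.of(a, b)).size());
    }
}
```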

6. Case study: Environmental impact assessment

This section describes a case study in the area of environmental impact assessment (EIA). It shows a real scenario where BRONER is used. A description of the problem is given as well as key points regarding the decisions that the user must make.

From a technical perspective, EIA is an analytical process that identifies cause–effect relationships. Furthermore, it also quantifies, assesses, and prevents the potential negative environmental impact of a project (Gómez, 2003).

Methods for performing EIA include impact identification, affected environment description, impact prediction, impact assessment, etc. According to the precautionary, prevention, and correction principles defined by the European Union (E.U., 1997), the main target of EIA is to avoid failures, accidents, and environmental deterioration, which would otherwise lead to the difficult and costly repair of the environment (Conesa, 2000).

Currently, EIA uses information technology (IT) techniques, such as geographic information systems (GIS) (Antunes, Santos, & Jordao, 2001) and multi-criteria methods (Boufateh, Perwuelz, Rabenasolo, & Jolly-Desodt, 2009) to make it more effective. Other less frequent approaches use fuzzy systems (Duarte Velasco, Delgado, & Requena, 2006; Duarte Velasco, Requena, & Rosario, 2007). However, this case study is based on knowledge representation.

Fig. 4. Brief ontology builder screen capture.

The EIA problem is frequently addressed by multi-disciplinary teams, and for this reason, very heterogeneous data are produced, which have to be integrated in some way. Ontologies are commonly used as a mechanism to integrate information from different sources because they provide shared knowledge.

An EIA ontology enables the analysis of domain knowledge, the separation of domain knowledge from operational knowledge, the explicit statement of domain assumptions, the reuse of domain knowledge, and the sharing of structural information (Noy & McGuinness, 2005).

The starting point for this example is the EIA ontology (Garrido & Requena, 2009; Garrido & Requena, 2010), whose knowledge representation is suitable for the algorithm described in Section 4. This ontology contains the relevant concepts in EIA, their relationships, and the concept expressions which allow concepts or individuals to be linked by relations other than the conventional IS-A operator.

6.1. Knowledge model

This subsection describes part of the EIA ontology because, before understanding how the application works, the developer needs to understand the knowledge model used in the application.

Fig. 5. Incremental methodology for BRONER.

Although a comprehensive description (Garrido & Requena, 2010) of the knowledge model in this example is beyond the scope of this paper, Fig. 6 shows some of the concepts in the ontology, and gives an idea of its content. Concepts are represented by boxes, whereas the relevant properties used in their concept expressions are represented by arcs. This figure includes only some of the root concepts in the IS-A hierarchy of the ontology and the relevant properties for this example, both of which are described below.

The IndustrialActivities box represents the set of industrial activities that should be environmentally controlled, according to EU legislation. The ImpactingActions box represents a set of human actions that can cause environmental impacts, but which are less generic than those in IndustrialActivities. The PreventiveAction box represents a set of preventive actions, which, as defined in the ontology, avoid or reduce environmental impacts.

The Impact box represents a set of environmental impacts which refer to adverse as well as positive changes in the environment. The Indicators&MeasureUnits box represents a set of environmental indicators or units used to quantify an environmental impact. The ImpactAssessment box represents a set of concepts used to indicate the assessment of a concept (e.g. positive, negative, or synergic). The ImpactedElement box represents a set of environmental elements which may suffer impacts.


The hasPreventiveAction property is used in the definitions of industrial activities and impacting actions, and represents their preventive actions. The produceImpact property appears in the definitions of industrial activities and impacting actions to represent their possible impacts. The properties hasIndicator&MeasureUnit, hasImpactAssessment, and impactIn are used in the definitions of impacts to represent their indicators or units, their possible assessment assignations, and the environmental factors affected by the impact, respectively.

6.2. Environmental application

The previous ontology contains a wide range of knowledge about environmental assessment, which is generally considered to be an advantage. In contrast, if the amount of knowledge is excessive or more than expected or reasonable, this is regarded as unfavorable.

In general, EIA methodology involves determining the variables and factors that have to be measured. In fact, there are many examples in the literature which propose methods for specific problems, such as construction, wastewater management, landfills, etc.

The BRONER API affords the possibility of developing EIA methodologies that depend on the human activity evaluated, while the application avoids dealing with unnecessary information.

This case study starts from the assumption that the user needs to know the environmental indicators to assess a specific installation or human activity, and the environmental factors affected by this activity, in order to develop a methodology for its EIA.

Once the knowledge source has been selected (the EIA ontology), the user must select the restrictive roles and main concepts, because these determine how the algorithm spreads. Moreover, this selection process is the mechanism used to define the relevant knowledge.

Fig. 6 shows that actions are indirectly connected to indicators and environmental factors, respectively, by tracing a path. Impacting actions are defined by the impacts they produce. This is done by using the role produceImpact. The impacts are defined by their indicators and sensitive environmental factors. The roles used for this purpose are hasIndicator&MeasureUnit and impactIn, respectively.

For this reason, all the concepts which are not on the path going from the actions to the indicators and the environmental factors are irrelevant knowledge for this tailor-made EIA methodology.

As a result, the system builds a specific ontology for the problem. Its contents are the actions selected by the user, the impacts of these actions, the sensitive environmental factors, and the environmental indicators.
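Reading the schema of Fig. 6 as a labelled graph, this path-based notion of relevance can be sketched as a reachability test restricted to the chosen roles. The sketch below is our own illustration: the simplified one-target-per-role schema and the role spellings (e.g. `hasIndicatorAndMeasureUnit`) are assumptions, not the ontology's actual encoding:

```java
import java.util.*;

// Sketch of relevance in the EIA case: a concept is relevant iff it lies on a
// path from the selected actions that uses only the chosen restrictive roles.
// The schema is simplified to one target concept per role.
public class EiaRelevance {
    static Set<String> relevant(Map<String, Map<String, String>> schema,
                                String start, Set<String> roles) {
        Set<String> seen = new LinkedHashSet<>(List.of(start));
        Deque<String> todo = new ArrayDeque<>(seen);
        while (!todo.isEmpty()) {
            schema.getOrDefault(todo.pop(), Map.of()).forEach((role, target) -> {
                if (roles.contains(role) && seen.add(target)) todo.push(target);
            });
        }
        return seen;
    }

    // Toy schema following Fig. 6
    static Map<String, Map<String, String>> sample() {
        return Map.of(
            "ImpactingActions", Map.of(
                "produceImpact", "Impact",
                "hasPreventiveAction", "PreventiveAction"),
            "Impact", Map.of(
                "hasIndicatorAndMeasureUnit", "Indicators&MeasureUnits",
                "impactIn", "ImpactedElement",
                "hasImpactAssessment", "ImpactAssessment"));
    }

    public static void main(String[] args) {
        Set<String> roles = Set.of("produceImpact", "hasIndicatorAndMeasureUnit", "impactIn");
        // PreventiveAction and ImpactAssessment stay out of the brief ontology
        System.out.println(relevant(sample(), "ImpactingActions", roles));
    }
}
```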

Fig. 7 depicts an example of a brief ontology for the activities, landfills and leachate ponds. Leachate is a highly contaminated liquid produced by decomposition in landfills. Most leachate is the result of runoff that infiltrates the landfill and comes into contact with decomposing garbage. It can cause serious environmental pollution problems, and for this reason, it is stored and kept in impermeable ponds.

Fig. 6. Ontology schema.

This ontology was generated by selecting the main concepts, landfills and LeachatePond. Such concepts are always subsumed by the concept rootBriefModel in the brief ontology, and this permits the user to know how the ontology was built.

As previously explained, the selection of produceImpact, hasIndicatorAndMeasureUnit and impactIn as restrictive roles and the initial selection of the main concepts produce a brief ontology with the impacts for these activities, their indicators, and the possible environmental factors affected by these impacts. Fig. 7 shows the relevant taxonomies for these concepts, although the indicators are not included because there are too many of them (40 indicators and measure units). It is worth mentioning that for this example no datatype property was included.

The brief ontology for landfills and leachate ponds does not contain the full taxonomy for the impacts, environmental factors, and indicators. It only contains the relevant ones for this particular EIA. Likewise, their concepts and individuals do not contain references to concepts or individuals which are not relevant.

As an example, the following is the definition of the concept LeachatePond before the extraction of the brief ontology, according to the syntax shown in Table 1:

⊑ WasteManagement
∃ hasPreventiveAction.AppropriateDikeSize
∃ hasPreventiveAction.AppropriateCapacity
∃ hasPreventiveAction.CompactingDike
∃ produceImpact.LandscapeQualityChange
∃ produceImpact.VisualImpact
∃ produceImpact.SmellAccumulation
∃ produceImpact.GroundwaterQualityChanges
∃ produceImpact.SurfaceWaterQualityChanges
∃ produceImpact.SurfaceWaterAffection
∃ produceImpact.SoilPollution
∃ usePollutantElement.Leachate
∃ usePollutantElement.OrganicCompounds
∃ usePollutantElement.HeavyMetal
∃ usePollutantElement.SulphurCompounds

After the extraction of the brief ontology, the definition of this concept does not contain any reference to the object properties hasPreventiveAction and usePollutantElement, since they were not included in the relevant object properties set. This results in the following definition:

⊑ rootBriefModel
∃ produceImpact.LandscapeQualityChange
∃ produceImpact.VisualImpact
∃ produceImpact.SmellAccumulation
∃ produceImpact.GroundwaterQualityChanges
∃ produceImpact.SurfaceWaterQualityChanges
∃ produceImpact.SurfaceWaterAffection
∃ produceImpact.SoilPollution
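This per-definition pruning amounts to filtering the restrictions of a concept by the relevant role set. The sketch below is our own illustration (the `Restriction` record and the abbreviated definition are assumptions, not BRONER's data structures):

```java
import java.util.*;

// Sketch of the selective pruning of a single concept definition:
// restrictions whose role is outside the relevant set are dropped.
public class DefinitionPruning {
    record Restriction(String role, String filler) {}

    // Keep only restrictions whose role belongs to the relevant set.
    static List<Restriction> prune(List<Restriction> def, Set<String> relevantRoles) {
        List<Restriction> kept = new ArrayList<>();
        for (Restriction r : def)
            if (relevantRoles.contains(r.role())) kept.add(r);
        return kept;
    }

    public static void main(String[] args) {
        // Abbreviated LeachatePond definition from the case study
        List<Restriction> leachatePond = List.of(
            new Restriction("hasPreventiveAction", "AppropriateDikeSize"),
            new Restriction("produceImpact", "VisualImpact"),
            new Restriction("produceImpact", "SoilPollution"),
            new Restriction("usePollutantElement", "Leachate"));
        // Only the produceImpact restrictions survive the pruning
        for (Restriction r : prune(leachatePond, Set.of("produceImpact")))
            System.out.println("∃ " + r.role() + "." + r.filler());
    }
}
```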

With regard to the restrictive roles, the selection of roles was encapsulated for this case study because the selected roles are always the same for this problem. For this reason, a specific user interface was developed where the selection of roles is hidden and thus not visible to the end user. Similarly, the knowledge source does not show the whole ontology, only the relevant part (impacting actions and industrial activities). The simplified user interface is called EIATab.

Because of these changes, the environmental expert does not need to study the ontology before using the application. He can just use it. The user only has to select the relevant human actions from the ones in the EIA ontology. Once he has selected the human actions, the system immediately builds a new ontology that is specific to the activity. Otherwise, the user would have to analyze how the knowledge is organized and study the domain and range of each property in order to select an appropriate set of roles and main concepts.

Fig. 7. Brief ontology for EIA in landfills and leachate ponds.

7. Conclusions

This paper introduces the concept of brief ontology and an algorithm for creating such ontologies. It also presents a Java API and a plugin for Protégé that allow users to build brief ontologies with very little effort.

The algorithm used is a traversal algorithm because it explores the ontology by starting from a concept or group of concepts, and continues with the concepts that are related by structural properties or relationships.

Furthermore, the algorithm extracts not only a relevant part of the ontology in terms of concepts, but it also accomplishes a selective pruning of each concept definition. For this reason, the user obtains a relevant portion of the ontology which is self-contained, and can be used independently of the original ontology.

BRONER has been satisfactorily applied to environmental sciences, in particular, to enhance and facilitate the creation of methodologies for EIA. However, an additional plugin specifically for EIA was developed so that environmental experts would not have to deal with the inspection of OWL syntax. This plugin only allows the user to explore part of the original ontology for the selection of main concepts, and does not require the selection of restrictive roles because this selection is encapsulated.

Acknowledgements

This work has been partially supported by research projects (CICE) P07-TIC-02913, P08RNM-03584, and TIN06-15041-C04-01 funded by the Andalusian Regional and Spanish Governments.

References

Antunes, P., Santos, R., & Jordao, L. (2001). The application of geographical information systems to determine environmental impact significance. Environmental Impact Assessment Review, 21(6), 511–535.

Baader, F., Calvanese, D., McGuinness, D., Nardi, D., & Patel-Schneider, P. (2003). The description logic handbook: Theory, implementation and applications. Cambridge University Press.

Bizer, C., & Seaborne, A. (2004). D2RQ – Treating non-RDF databases as virtual RDF graphs (poster). In The Semantic Web – ISWC.

Bobillo, F., & Straccia, U. (2008). fuzzyDL: An expressive fuzzy description logic reasoner. In IEEE international conference on fuzzy systems (pp. 923–930).

Boufateh, I., Perwuelz, A., Rabenasolo, B., & Jolly-Desodt, A. (2009). Multiple criteria decision making for environmental impacts optimization. Computers and Industrial Engineering, 606–611.

Broekstra, J., Kampman, A., & Harmelen, F. (2002). Sesame: A generic architecture for storing and querying RDF and RDF schema. In The Semantic Web – ISWC (Vol. 2342, pp. 54–68).

Conesa, V. (2000). Methodological guide for environmental impact assessment. Mundi-Prensa (in Spanish).

Delgado, M., Pérez-Pérez, R., & Requena, I. (2005). Knowledge mobilization through re-addressable ontologies. In EUSFLAT conf. (pp. 154–158).

Dokulil, J., & Katreniakov, J. (2009). RDF visualization – Thinking big. In Proceedings of the international workshop on database and expert systems applications, DEXA (pp. 459–463).

Duarte Velasco, O., Delgado, M., & Requena, I. (2006). An arithmetic approach for the computing with words paradigm. International Journal of Intelligent Systems, 21(2), 121–142.

Duarte Velasco, O., Requena, I., & Rosario, Y. (2007). Fuzzy techniques for environmental-impact assessment in the mineral deposit of Punta Gorda (Moa, Cuba). Environmental Technology, 28, 659–669.

E.U. (1997). Treaty of Amsterdam amending the treaty on European Union, the treaties establishing the European communities and certain related acts (OJ C 340, 10.11.1997, p. 93).

Garrido, J., & Requena, I. (2009). Knowledge representation in environmental impact assessment – A case of study with high level requirements in validation. In Proceedings of the international conference on knowledge engineering and ontology development, KEOD (pp. 412–415).

Garrido, J., & Requena, I. (2010). Proposal of ontology for environmental impact assessment. An application with knowledge mobilization. Expert Systems with Applications.

Gómez, D. (2003). Environmental impact assessment: A preventive tool for environmental management. Mundi-Prensa (in Spanish).

Goyal, S., & Westenthaler, R. (2009). RDF Gravity (RDF graph visualization tool). <http://www.semweb.salzburgresearch.at/apps/rdf-gravity>.

Gu, H., Perl, Y., Geller, J., Halper, M., & Singh, M. (1999). A methodology for partitioning a vocabulary hierarchy into trees. Artificial Intelligence in Medicine, 15(1), 77–98.

Horrocks, I., & Patel-Schneider, P. (2004). Reducing OWL entailment to description logic satisfiability. Web Semantics, 1(4), 345–357.

Kim, J., Caralt, J., & Hilliard, J. (2007). Pruning bio-ontologies. In Proceedings of the annual Hawaii international conference on system sciences.

McBride, B. (2002). Jena: A semantic web toolkit. IEEE Internet Computing, 6(6), 55–58.

Möller, R. (2007). Building a commercial OWL reasoner with Lisp. In ILC '07: Proceedings of the 2007 international Lisp conference (p. 1). New York, NY, USA: ACM.

Noy, N. F., & McGuinness, D. L. (2005). A guide to creating your first ontology. Stanford University.

Noy, N., & Musen, M. (2009). Traversing ontologies to extract views. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 55, 245–260.

Noy, N. F., Sintek, M., Decker, S., Crubézy, M., Fergerson, R. W., & Musen, M. A. (2001). Creating semantic web contents with Protégé-2000. IEEE Intelligent Systems and their Applications, 16(2), 60–71.

Oberle, D., Eberhart, A., Staab, S., & Volz, R. (2004). Developing and managing software components in an ontology-based application server. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3231, 459–477.

Pathak, J., Johnson, T., & Chute, C. (2009). Survey of modular ontology techniques and their applications in the biomedical domain. Integrated Computer-Aided Engineering, 16(3), 225–242.

Sirin, E., Parsia, B., Grau, B., Kalyanpur, A., & Katz, Y. (2007). Pellet: A practical OWL-DL reasoner. Web Semantics, 5(2), 51–53.

Stuckenschmidt, H., & Schlicht, A. (2009). Structure-based partitioning of large ontologies. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 5445, 187–210.