THE EFFECT OF DATA STRUCTURES MODIFICATIONS ON ALGORITHMS
FOR REASONING OPERATIONS USING A CONCEPTUAL GRAPHS
KNOWLEDGE BASE
BY
HEATHER DAY PFEIFFER, B.S., M.S.
A dissertation submitted to the Graduate School
in partial fulfillment of the requirements
for the degree
Doctor of Philosophy
Subject: Computer Science
New Mexico State University
Las Cruces, New Mexico
December 2007
Copyright © 2007 by Heather Day Pfeiffer, B.S., M.S.
“The Effect of Data Structures Modifications On Algorithms for Reasoning Operations
Using a Conceptual Graphs Knowledge Base,” a dissertation prepared by Heather Day
Pfeiffer, B.S., M.S. in partial fulfillment of the requirements for the degree, Doctor of
Philosophy, has been approved and accepted by the following:
Linda Lacey
Dean of the Graduate School
Roger T. Hartley
Chair of the Examining Committee
Date
Committee in charge:
Dr. Roger T. Hartley, Chair
Dr. Desh Ranjan
Dr. Clinton Jeffery
Dr. Jeanine Cook
DEDICATION
This Dissertation is dedicated to my husband, Dr. Joseph J. Pfeiffer, Jr., who has
supported me through “thick and thin”; my children, Joseph “Joel” III and Rebecca
“Becca”, who have seen “Mom” work on a degree all their lives; my parents, Lloyd
and Barbara Day, who have always believed in education and instilled that belief in
their children; and my in-laws (may they rest in peace), Joe and Mary Elizabeth “Betty”
Pfeiffer.
ACKNOWLEDGMENTS
David J. Benn, from the University of South Australia at Adelaide, for working to help
integrate his ‘pCG’ system with the CPE “Operations” module, and for his help in testing
and debugging comparison tests with CPE and pCG.
Dr. John F. Sowa, who gave me some very lively discussions on growing ideas of
Conceptual Structures and especially Conceptual Graphs. Also, for allowing me to work
with and expand on his original CGIF format.
Dr. Jean-François Baget and Dr. Madalina Croitoru, who have taught me much about
Simple Conceptual Graphs (SCGs) and how relation hierarchies make great Supports
for SCGs. Also, for evaluating and discussing some of the theoretical findings of this
dissertation.
All the past and current AI graduate students at New Mexico State University, in
particular, Dr. Melanie Martin, Nemecio “Chito” Chavez, Jr., Dr. Dan Tappan and Dr. Tom
O’Hara.
The hard work of my committee, in particular, Dr. Clinton Jeffery, who carefully looked
at both content and formatting of all the chapters and traveled all the way back from
Idaho, and Dr. Jeanine Cook, who kept me “on track” and over the bumps in the road.
VITA
February 11, 1955 Born in Dallas, Texas, USA
June 1977 B.S. in Microbiology/Biology from University of Washington
1980-1984 Systems Analyst at The Boeing Company in Seattle, Washington
May 1988 M.S. in Computer Science from New Mexico State University
1987-2007 Computer Consultant based in Las Cruces, New Mexico
2005-2006 Senior Computer Scientist at Horton Technical Associates, Inc. in Las Cruces, New Mexico
Professional Societies
Association for Computing Machinery (ACM)
IEEE Computer Society
The American Society for Information Systems and Technology (ASIS&T)
New Mexico Network for Women in Science and Engineering (NMNWSE)
Publications
H.D. Pfeiffer and R.T. Hartley. Semantic additions to conceptual programming. In Proc. of the Fourth Annual Workshop on Conceptual Structures, Detroit, MI, 1989.
M.J. Coombs, R.T. Hartley, H.D. Pfeiffer, and B. Kilgore. How to become immune to facts. In Proc. Rocky Mountain Conference on Artificial Intelligence, Las Cruces, NM, June 1990.
H.D. Pfeiffer and R.T. Hartley. Additions for set representation and processing to conceptual programming. In Proc. of the Fifth Annual Workshop on Conceptual Structures, pages 131–140, Boston & Stockholm, 1990.
H.D. Pfeiffer and R.T. Hartley. The Conceptual Programming Environment, CP: Reasoning representation using graph structures and operations. In Proc. of IEEE Workshop on Visual Languages, Kobe, Japan, 1991.
M.J. Coombs, H.D. Pfeiffer, and R.T. Hartley. e-MGR: an Architecture for Symbolic Plasticity. In the special issue of International Journal of Man-Machine Studies on Symbolic Problem Solving in Noisy, Novel, and Uncertain Task Environments, 36:1–17, 1992.
C.A. Fields, H.D. Pfeiffer, and T.C. Eskridge. Knowledge representation and control in gm1, an automated DNA sequence analysis system based on the MGR architecture. In International Journal of Man-Machine Studies, 34:549–573, 1992.
R.T. Hartley, H.D. Pfeiffer, and D. Qui. Representation for Viewgen: Structures and Reasoning. In Workshop on Propositional Knowledge Representation, Stanford, CA, 1992.
H.D. Pfeiffer and R.T. Hartley. The Conceptual Programming Environment, CP. In T.E. Nagle, J.A. Nagle, L.L. Gerholz, and P.W. Ekland, editors, Conceptual Structures: Current Research and Practice, Ellis Horwood Workshops. Ellis Horwood, 1992.
H.D. Pfeiffer and R.T. Hartley. Temporal, spatial, and constraint handling in the Conceptual Programming Environment, CP. Journal of Experimental and Theoretical AI, 4(2):167–182, 1992.
H.D. Pfeiffer and T.E. Nagle, editors. Conceptual Structures: Theory and Implementation, volume 754 of LNAI. Springer-Verlag, Heidelberg, W. Germany, 1993.
H.D. Pfeiffer and B.J. Waltar. Automated message analysis using the Conceptual Programming Environment, CP. In G. Ellis and P. Ekland, editors, Supp. Proc. of the 3rd International Conference On Conceptual Structures, Santa Cruz, CA, 1995.
H.D. Pfeiffer and R.T. Hartley. Visual CP representation of knowledge. In G. Stumme, editor, Working with Conceptual Structures - Contributions to ICCS 2000, Shaker-Verlag, pages 175–188, 2000.
H.D. Pfeiffer and R.T. Hartley. ARCEdit - CG editor. In CGTools Workshop Proceedings in connection with ICCS 2001, Stanford, CA, 2001. [Online Access: July 2001] URL: http://www.cs.nmsu.edu/ hdp/CGTOOLS/proceedings/index.html.
H.D. Pfeiffer and R.T. Hartley, editors. CGTools Workshop Proceedings in connection with ICCS 2001, Stanford, CA, 2001. [Online Access: July 2001] URL: http://www.cs.nmsu.edu/ hdp/CGTOOLS/proceedings/index.html.
R.T. Hartley and H.D. Pfeiffer. Data models for Conceptual Structures. In Foundations and Applications of Conceptual Structures, Contributions to ICCS 2002. ICCS2002, 2002.
K.E. Wolff, H.D. Pfeiffer, and H.S. Delugach, editors. Conceptual Structures at Work, volume 3127 of LNAI. ICCS2004, Springer, July 2004.
H.D. Pfeiffer, K.E. Wolff, and H.S. Delugach, editors. Conceptual Structures at Work, Contributions to ICCS 2004, Aachen, July 2004. ICCS2004, Shaker Verlag.
H.D. Pfeiffer. An exportable CGIF module from the CP environment: A pragmatic approach. In K.E. Wolff, H.D. Pfeiffer, and H.S. Delugach, editors, Conceptual Structures at Work, volume 3127 of LNAI, pages 319–332. ICCS2004, Springer, July 2004.
M.A. Keeler and H.D. Pfeiffer. Collaboratory testbed partnerships as a knowledge capture challenge. In P. Clark and G. Schreiber, editors, Proceedings of the Third International Conference on Knowledge Capture, pages 203–204. KCAP’05, ACM Press, October 2005.
M.A. Keeler and H.D. Pfeiffer. Games of inquiry for collaborative concept structuring. In F. Dau, M-L Mugnier, and G. Stumme, editors, Conceptual Structures: Common Semantics for Sharing Knowledge, ICCS2005, pages 396–410, Berlin, Springer-Verlag, LNAI 3596, July 2005.
H.D. Pfeiffer. Games for co-evolution of digital resources and knowledge tools. In Information Realities: Shaping the Digital Future for All, ASIS&T 2006, Austin, TX, November 2006.
M.A. Keeler and H.D. Pfeiffer. Building a pragmatic methodology for KR tool research and development. In H. Scharfe, P. Hitzler, and P. Ohrstrom, editors, Conceptual Structures: Inspiration and Application, ICCS2006, pages 314–330, Berlin, Springer-Verlag, LNAI 4068, July 2006.
H.D. Pfeiffer and R.T. Hartley. A comparison of different conceptual structures projection algorithms. In U. Priss, S. Polovina, and R. Hill, editors, Conceptual Structures: Knowledge Architectures for Smart Applications, ICCS’07, pages 165–178, Berlin Heidelberg, Springer-Verlag, LNAI 4604, July 2007.
H.D. Pfeiffer and J.J. Pfeiffer, Jr. Representation levels within knowledge representation. In U. Priss, S. Polovina, and R. Hill, editors, Conceptual Structures: Knowledge Architectures for Smart Applications, ICCS’07, pages 484–487, Berlin Heidelberg, Springer-Verlag, LNAI 4604, July 2007.
H.D. Pfeiffer, N.R. Chavez, Jr., and J.J. Pfeiffer, Jr. CPE design considering interoperability. In H.D. Pfeiffer, A. Kabbaj, and D.J. Benn, editors, CS-TIW 2007 Second Conceptual Structures Tool Interoperability Workshop, pages 71–75, 2007.
H.D. Pfeiffer, A. Kabbaj, and D.J. Benn, editors. CS-TIW 2007 Second Conceptual Structures Tool Interoperability Workshop. Research Press International, 2007.
Field of Study
Major field: Artificial Intelligence
Conceptual Structures
ABSTRACT
THE EFFECT OF DATA STRUCTURES MODIFICATIONS ON ALGORITHMS
FOR REASONING OPERATIONS USING A CONCEPTUAL GRAPHS
KNOWLEDGE BASE
BY
HEATHER DAY PFEIFFER, B.S., M.S.
Doctor of Philosophy
New Mexico State University
Las Cruces, New Mexico, 2007
Dr. Roger T. Hartley, Chair
Knowledge representation (KR) is used to store and retrieve meaningful
information. Meaning cannot be directly stored in the computer; therefore, a series of
levels of representation transforms knowledge to a format that a computer can process.
This transformed knowledge is saved using dynamic data structures that are suitable
for the style of KR being implemented, and through the KR the system manipulates
the knowledge in the data using reasoning operations. The data structure, together with
the contents of the transformed knowledge, is called the knowledge base (KB). An
algorithm and the associated data structures make up the reasoning operation, and the
performance of this operation is dependent on the KB it uses.
In this work, the basic reasoning operations for knowledge management will
be explored using a particular style of KR called Conceptual Graphs (CGs). These
operations, projection and maximal join, are the foundation for query/answer and
hypothesis generation (abduction) systems, respectively. It is believed that changing a
reasoning operation’s algorithm and providing adequate data structures for it can
improve the implementation of the operation for use in intelligent systems, making
the operations faster and more efficient. The execution times of different algorithms
and data structures are analyzed over the most general form of CG knowledge base,
showing that flexible, fast and efficient operations can improve higher-level systems.
TABLE OF CONTENTS
LIST OF ALGORITHMS
LIST OF TABLES
LIST OF FIGURES
1 INTRODUCTION
1.1 Knowledge and Knowledge Representation
1.1.1 Representation Levels
1.1.2 Speed and Efficiency in Processing
1.2 Foundational Information
1.2.1 Basis of Subgraph Isomorphism
1.2.2 Overview of Unification/Matching
1.2.3 Database vs Knowledge Base
1.3 Organization of Dissertation
2 ONTOLOGY, KNOWLEDGE AND REPRESENTATION
2.1 Ontology
2.1.1 Abstract Hierarchies
2.1.2 Relationships
2.1.2.1 Compositional
2.1.2.2 Quantification
2.1.2.3 Qualitative
2.2 Knowledge
2.2.1 Types
2.2.1.1 Declarative Knowledge
2.2.1.2 Procedural Knowledge
2.2.2 Operations
2.2.2.1 Terminological
2.2.2.2 Assertional
2.2.2.3 Generalization
2.2.2.4 Specialization
2.3 Representation
2.3.1 Knowledge
2.3.1.1 Logic
2.3.1.2 Rule-Bases
2.3.1.3 Semantic Network
2.3.2 Internal Representation
2.3.2.1 Predicate Calculus
2.3.2.2 IF..THEN
2.3.2.3 Conceptual Structures
3 DEFINITIONS
3.1 Graph Theory
3.1.1 Digraph and Bigraph
3.1.2 Walk, Path and Connected
3.2 Types and Hierarchies
3.2.1 Concept Type Hierarchy
3.2.2 Support
3.3 FOL
3.4 Conceptual Graphs
3.4.1 Graph Theory Relationships
3.4.2 Formation Rules
3.4.3 Simple Conceptual Graphs (SCGs)
3.4.4 Conceptual Graphs Interchange Format (CGIF)
3.5 Data Structures
3.5.1 Arrays
3.5.2 Hash Tables
3.5.2.1 Perfect Hashing
3.5.2.2 Hash Table/Hash Tables
4 REASONING OPERATIONS
4.1 Operators
4.1.1 Project
4.1.2 Join
4.2 Graph and Subgraph Isomorphism
4.2.1 Graph Isomorphism
4.2.2 Subgraph Isomorphism
4.2.2.1 Non-labeled nodes and undirected edges
4.2.2.2 Labeled nodes and undirected edges
4.2.3 Subtree Isomorphism
4.2.3.1 Hamiltonian Path
4.2.3.2 Subforest Isomorphism
4.2.4 Subbipartite Isomorphism
4.2.5 Projection
4.2.5.1 Historical Algorithms
4.2.5.2 Proposed Algorithm
4.2.6 Maximal Join
4.2.6.1 Historical Algorithms
4.2.6.2 Proposed Algorithm
4.3 Operations
4.3.1 Projection
4.3.2 Maximal Join
4.3.3 Over Knowledge bases
5 ALGORITHMS AND ANALYSIS
5.1 Foundational Algorithms
5.1.1 SCG Projection
5.1.2 SCG Relation Projection
5.1.3 Polyprojection
5.1.4 Notio Projection
5.2 New Algorithms
5.2.1 Supporting Information
5.2.1.1 Variables and Given values
5.2.1.2 Actual Supporting Routines
5.2.1.3 Worst Case Analysis for Support Routines
5.2.2 New Projection
5.2.2.1 Actual Algorithm
5.2.2.2 Execution Time
5.2.2.3 Worst Case Analysis for Projection
5.2.3 New Maximal Join
5.3 Typical Scenario Analysis for Projection Algorithms
5.3.1 Projection Algorithms using SCG
5.3.1.1 SCG Projection
5.3.1.2 SCG Relation Projection
5.3.2 Notio Projection
5.3.3 New Projection
5.3.3.1 Typical Case for Support Routines
5.3.3.2 Typical Case for New Projection Algorithm
6 SYSTEMS/ENVIRONMENTS AND IMPLEMENTATIONS
6.1 Semantic Network Systems
6.1.1 KL-ONE
6.1.2 SNePS
6.1.3 SNAP
6.1.4 CS Initial Project - PEIRCE
6.2 Conceptual Graphs Environments
6.2.1 CoGITaNT
6.2.2 Amine
6.2.3 pCG
6.2.4 CPE
6.2.4.1 Basic Architecture for the Environment
6.2.4.2 Data Flow within the Environment
6.2.4.3 Data Structures used by the Environment
6.3 ADT Implementations
6.3.1 Logical
6.3.2 Basic Data Structures
6.3.3 Object
6.4 Experiment Systems Implementation
6.4.1 pCG - Original Notio
6.4.2 CP Environment (CPE)
6.4.2.1 Array (Vectors)
6.4.2.2 Hash Tables
7 PROJECTION EXPERIMENTS, RESULTS AND ANALYSIS
7.1 Domain Problem - ‘Blocks World’
7.2 Tests
7.2.1 Single Appearance of Relation within Graph
7.2.1.1 Increase # of Graphs in KB
7.2.1.2 Increase # of Nodes in Graphs in KB
7.2.1.3 Increase # of Nodes in Query Graph
7.2.2 Multiple Appearance of Relation with a Graph
7.2.2.1 Increase # of Nodes in Graphs in KB
7.2.2.2 Increase # of Nodes in Query Graph
7.3 Results of Each Experiment Systems
7.3.1 pCG - Original Notio
7.3.2 CP Environment
7.3.2.1 Array (Vector)
7.3.2.2 Hash Tables
7.4 Results of Each # of Nodes in KB
7.4.1 5 nodes in KB graphs
7.4.2 11 nodes in KB graphs
7.4.3 21 nodes in KB graphs
7.4.4 31 nodes in KB graphs
7.4.5 53 nodes in KB graphs
7.4.6 73 nodes in KB graphs
7.5 Analysis of Results
7.5.1 Change # of Graphs in KB
7.5.2 Change # of Nodes in KB Graphs
7.5.3 Change # of Nodes in Query Graph
7.5.4 Change # of Identical Relations in Graph
8 CONCLUSIONS AND FUTURE WORK
8.1 Evaluation of Four Projection Algorithms
8.1.1 Strengths
8.1.2 Weaknesses
8.2 Data Structures and Algorithms Effectiveness Comparison for Implemented Algorithms
8.2.1 Strengths
8.2.2 Weaknesses
8.3 Significance of Work
8.3.1 Full Conceptual Graphs
8.3.2 Finds All Valid Projections
8.3.3 Data Structure Integration in Algorithm over Large KB and Graphs
8.4 Future Work
8.4.1 Experiments and Analysis of Maximal Join Algorithm
8.4.2 KB Stored From and To Standard Relational DB
8.4.3 Time and Space Constraints
8.4.3.1 Heuristics
8.4.3.2 Time
8.4.3.3 Space
8.4.4 Different Domain Problems and Interoperability
APPENDICES
A PROGRAMMING LANGUAGE CRITERIA
A.1 Language Evaluation
A.1.1 Visual Basic .Net
A.1.2 Java™
A.1.3 C
A.1.4 C++
A.2 Language Comparison
A.2.1 C++ to C
A.2.2 C++ to Java™
A.2.3 C++ to Prolog
A.2.4 C++ to Visual Basic 6.0
B DOCUMENTATION OF CGIF - VERSION 2001
B.1 Added Definitions For CGIF Categories
B.2 Lexical Categories
B.3 Syntactic Categories
C DOCUMENTATION OF SYSTEMS
C.1 pCG (CGP Programs)
C.2 CP Environment, CPE
C.2.1 CPE Module Documentation
C.2.1.1 CP_Graph Reasoning Operations
C.2.1.2 CP_Graph Reasoning Internal Operations
C.2.1.3 CGHash_Graph and CG_Graph Public Functions
C.2.2 CPE Class Documentation
C.2.2.1 cp_graph Class Reference
C.2.2.2 cghash_graph Class Reference
C.2.2.3 cg_graph Class Reference
D DATA COLLECTED FROM SAMPLE TESTS
D.1 Data Collected for Computing Each Experimental Results Test Set - 53 nodes in KB Graphs
D.2 Error Bar Data - 53 nodes in KB Graphs
D.3 Validation of Correct Projection
D.3.1 11 nodes in KB graphs - Unique Relation Results
D.3.2 13 nodes in KB graphs - Multi-Instances Relation Results
REFERENCES
LIST OF ALGORITHMS
5.1 Π is a General Projection from T to G
5.2 Π Modified as an Injective Projection from T to G
5.3 Notio Projection
5.4 Supporting Projection Routines
5.5 Supporting Projection Routines (Cont1)
5.6 Supporting Projection Routines (Cont2)
5.7 New Projection
5.8 New Maximal Join
LIST OF TABLES
1.1 Brachman and Guarino Classification Levels and Main Features (Adapted from [[45], Figure 6]).
3.1 Execution Times For Single Element with Set of Size n.
4.1 Related Problem Classes.
7.1 KB Single Relation Graph Files.
7.2 Single Relation: Query Graph Size Run vs Number of Nodes in KB Graphs.
7.3 Multi-Relation: Query Graph Size Run vs Number of Nodes in KB Graphs.
7.4 Number of Projections Found: Query Graph Size vs KB Graph Size.
8.1 Comparison of Four Algorithms.
C.1 CGP Program Files.
D.1 Average Data Values for 53 nodes KB with 1000 Graphs.
D.2 Average Data Values for 53 nodes KB with 2500 Graphs.
D.3 Average Data Values for 53 nodes KB with 5000 Graphs.
D.4 Fast/Slow Values for 53 nodes KB with 1000 Graphs.
D.5 Fast/Slow Values for 53 nodes KB with 2500 Graphs.
D.6 Fast/Slow Values for 53 nodes KB with 5000 Graphs.
D.7 Error Bar Data Values for 53 nodes KB with 1000 Graphs.
D.8 Error Bar Data Values for 53 nodes KB with 2500 Graphs.
D.9 Error Bar Data Values for 53 nodes KB with 5000 Graphs.
LIST OF FIGURES
1.1 Levels of Representations.
1.2 Abstract Data Type (ADT).
1.3 Unifier U, Projs U → G1 and U → G2, Unification G is Found (Adapted from [[136], Figure 5]).
2.1 Time Chart.
2.2 Logic Example.
2.3 Meaning Triangle for Symbols, Concepts, and Referents (Based on [[129], Figure 1]).
2.4 Peirce’s Triadic Relation.
3.1 A Graph to Illustrate Graph Theory Concepts (Adapted from [[46], Figure 2.9]).
3.2 A Digraph that is a Bipartite Graph.
3.3 A Type Hierarchy.
3.4 An Animal Concept Hierarchy.
3.5 Support Using a Relation Hierarchy (Based on [[5], Figure 1]).
3.6 Basic Abstract Conceptual Graph.
3.7 Basic Abstract Conceptual Graph in Digraph Format that is Bipartite.
3.8 Basic Conceptual Graph with Actor.
3.9 Action Function For Basic Actor Graph.
3.10 Basic Detached Conceptual Graph.
3.11 Simple Basic Conceptual Graph.
3.12 Second Concept Type Hierarchy.
3.13 Simple Restricted Basic Conceptual Graph.
3.14 Simple Conceptual Graph (SCG).
4.1 Project (Mp (Q, H) = P) (Adapted from [[92], Figure 3]).
4.2 Join (MJ (Q, H) = J) (Adapted from [[92], Figure 2]).
4.3 Query Graph.
4.4 KB Graph with Type Hierarchy.
4.5 Projection Results.
4.6 Join of P1 and P2 Graphs.
4.7 Common Graph of Basic Graphs.
4.8 Join of Detached Basic and Simple Basic Graphs.
6.1 A KL-ONE Diagram of a Simple ‘Blocks-World’ Arch (Based on [[141], Figure 1]).
6.2 A SNePS Representation of “A on B on a Table” (Based on [[110], Figure 12]).
6.3 SNAP Semantic Network of “USC in LA, CA” (Based on [[72], Figure 2]).
6.4 PEIRCE Schema for Age (Based on [[119], Figure 6.5]).
6.5 Current CP Environment (From [[87], Figure 1, page 322]).
7.1 Part 1: Example of Blocks World Benchmark File.
7.2 Part 2: Example of Blocks World Benchmark File.
7.3 Part 3: Example of Blocks World Benchmark File.
7.4 Part 4: Example of Blocks World Benchmark File.
7.5 A Picture of the Benchmark File.
7.6 5 nodes in KB of 1000 Graphs.
7.7 5 nodes in KB of 2500 Graphs.
7.8 5 nodes in KB of 5000 Graphs.
7.9 11 nodes in KB of 1000 Graphs.
7.10 11 nodes in KB of 2500 Graphs.
7.11 11 nodes in KB of 5000 Graphs.
7.12 21 nodes in KB of 1000 Graphs.
7.13 21 nodes in KB of 2500 Graphs.
7.14 21 nodes in KB of 5000 Graphs.
7.15 31 nodes in KB of 1000 Graphs.
7.16 31 nodes in KB of 2500 Graphs.
7.17 31 nodes in KB of 5000 Graphs.
7.18 53 nodes in KB of 1000 Graphs.
7.19 53 nodes in KB of 2500 Graphs.
7.20 53 nodes in KB of 5000 Graphs.
7.21 73 nodes in KB of 1000 Graphs.
7.22 73 nodes in KB of 2500 Graphs.
7.23 73 nodes in KB of 5000 Graphs.
8.1 Interval Time Relationships.
8.2 A Simple Time Map.
8.3 Time Chart for a Bouncing Ball.
8.4 Conceptual Space Diagram for a Bouncing Ball.
B.1 The Display Format for ‘A person is between a rock and a hard place.’
C.1 Part 1: Example of CGP Program from pCG.
C.2 Part 2: Example of CGP Program from pCG.
C.3 Part 3: Example of CGP Program from pCG.
C.4 Part 4: Example of CGP Program from pCG.
C.5 Inheritance Diagram for Class ‘cp_graph’.
D.1 KB for Verifying 3 nodes Query onto 11 nodes KB.
D.2 Query Graph for Verifying 3 nodes Query onto 11 nodes KB.
D.3 Projection Verifying 3 nodes Query onto 11 nodes KB.
D.4 Query Graph for Verifying 5 nodes Query onto 13 nodes KB.
D.5 KB for Verifying 5 nodes Query onto 13 nodes KB.
D.6 Projections Verifying 5 nodes Query onto 13 nodes KB.
CHAPTER 1
INTRODUCTION
Knowledge representation (KR) is used to store and retrieve meaningful information that cannot be stored directly in a computer. This work therefore develops a series of levels of representation that transform knowledge into a format that a computer can use to process this information. The transformed knowledge is saved using dynamic data structures that are suitable for the style of KR being implemented, and through the KR the system manipulates the knowledge in the data using reasoning operations.
The data structure, together with the contents of the transformed knowledge, is called the knowledge base (KB). An algorithm and its associated data structure make up a reasoning operation, and the performance of the operation depends on the associated KB. In this work, the basic reasoning operations for knowledge management will be explored using a particular style of KR called Conceptual Graphs (CGs). These operations, projection and maximal join, are the foundation for query/answer and hypothesis generation (abduction) systems, respectively. It will be shown that changing a reasoning operation’s algorithm and providing adequate data structures for it can improve the implementation of the operation for use in an intelligent system, making it faster and more efficient. The execution times of different algorithms and data structures are analyzed over the most general form of CG knowledge base, showing that flexible, fast, and efficient operations can improve a higher-level system.
1.1 Knowledge and Knowledge Representation
Artificial Intelligence (AI) emerged in the 1960s, and can be characterized as the process of describing a problem in such a way that a machine could find a solution. AI uses general reasoning techniques that develop along the lines believed to be used by an intelligent human [12, 65, 106]. AI systems, therefore, need to represent knowledge in the computer so that these reasoning techniques can be applied to the problem. First, consider what knowledge is and then how to represent it to the computer. According to on-line dictionaries, knowledge is “the range of one’s information or understanding; the circumstance or condition of apprehending truth or fact through reasoning; the fact or state of knowing; the perception of fact or truth; clear and certain mental apprehension” [60, 59]. However, there are two types of knowledge that human beings deal with every day: 1) knowledge that defines ideas or concepts and their relationships [120], and 2) knowledge that gives understanding of time, space, or constraints in connection with these definitions [3, 26, 4]. So knowledge allows us to have a definition or understanding of the events and acts around us; knowledge allows us to describe our world. Second, for the computer, the description of the problem that it is to solve has become known as knowledge representation. The representation consists of a set of syntactic and semantic rules to describe a problem domain [1]. Given that syntax studies the grammar rules for expressing the arrangement of symbols [119], and semantics “is the scientific study of the relations between signs or symbols and what they denote or mean” [[139] page 41], knowledge representation, when abstractly described, may appear very informal and without concrete structure. It seems informal because the syntactic rules perform symbol manipulation, while the semantic rules define a mapping that gives an interpretation of the representation in terms of another representation.
The term “semantics” (meaning) has come to be associated with many different types of processing of relationships. Two key relationship types (discussed as links in [139]) are 1) structural links, which set up parts of propositions and are definitional relationships within a network of concepts, and 2) assertion links, which assert something about the world and are basic relations that hold between concepts (e.g. part-of, a-kind-of). Structural links give the definition of knowledge, whereas assertion links define facts. However, the processing of each of these links does not imply semantic meaning. Meaning can be defined in terms of axioms of basic propositions, or truth maintenance with correctness of assertion [82, 71, 123]. Semantic interpretations and procedural semantics are used in determining these meanings [139]. One misuse of the term semantics is in the area of semantic inferences [139]. Semantic inferences refer to inferences that cross the boundary between symbol and referent; however, not all steps of the process are semantic. If a step of the process involves parsing or processing a structural link, then it is a syntactic operation.
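The distinction between the two link types can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the proposition identifier sit-1, and the link labels agent and location are hypothetical and do not come from [139] or from the systems discussed in this work.

```python
from dataclasses import dataclass
from enum import Enum


class LinkKind(Enum):
    STRUCTURAL = "structural"  # definitional: sets up part of a proposition
    ASSERTION = "assertion"    # asserts a relation that holds between concepts


@dataclass(frozen=True)
class Link:
    kind: LinkKind
    label: str
    source: str
    target: str


# Hypothetical network: one proposition ("sit-1") plus two facts about concepts.
links = [
    Link(LinkKind.STRUCTURAL, "agent", "sit-1", "Cat"),
    Link(LinkKind.STRUCTURAL, "location", "sit-1", "Mat"),
    Link(LinkKind.ASSERTION, "a-kind-of", "Cat", "Animal"),
    Link(LinkKind.ASSERTION, "part-of", "Tail", "Cat"),
]


def links_of_kind(all_links, kind):
    """Return only the links of the requested kind."""
    return [link for link in all_links if link.kind is kind]


# Walking the structural links is a syntactic step: it assembles the parts of a
# proposition without consulting what the symbols denote.
proposition_parts = {
    link.label: link.target for link in links_of_kind(links, LinkKind.STRUCTURAL)
}
asserted_facts = [
    (link.source, link.label, link.target)
    for link in links_of_kind(links, LinkKind.ASSERTION)
]
```

Neither traversal assigns meaning by itself; both only move symbols around, which is the point made above about processing a link not implying semantic meaning.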
Many knowledge representations used by computers have been developed, including semantic networks, logic, frames, and rule-based representations. Within AI, knowledge representations have been built into different working applications, some of which are referred to as software information systems [134].
Knowledge representation systems are built to help find solutions to problems. Often the knowledge representation (KR) is divided into a processing language and a knowledge base (KB; see Section 1.2.3 for discussion) that has special data structures and operations that process the data. Some systems address only particular problem domains, e.g. neural networks for pattern matching, while other systems attempt to process large amounts of diverse data, e.g. the CYC KB from Cycorp [62]. Also, historically, many KRs and KBs worked as standalone systems, while newer systems are being constructed as a group of modules, each handling a specific aspect of the problem-solving process [124]. Sometimes these are actually different modules within a single system [87]; others are designed as agents in a multi-agent environment [31].
1.1.1 Representation Levels
For Newell, intelligent systems (AI systems) need both a symbol level and a knowledge level to perform reasoning [80]. The symbol level is where representations of knowledge are processed. This is the level where the data structures are defined and acted upon. The knowledge level has no physical structure, only a general functional equation for knowledge. The symbol or program level is where the physical structure or environment is defined for the knowledge level. Within the symbol level, computational mechanisms are defined for the environment of knowledge.
Some of the confusion in the field of knowledge representation, and in particular semantic networks, concerns which rules, syntactic or semantic, are defined at each of these levels of representation. In many readings, it is not made clear what knowledge can be processed directly by the computer as a machine code representation, and what must be transformed (mapped) into another representation level. It should be noted that, in general, abstract representations are too informal for machine processing. Therefore, most knowledge representations must be translated to a more concrete representation in order to be coded for the machine, and for execution and analysis to be performed by the computer.
In 1971, Shapiro [109] attempted to divide all representations defined by semantic networks into the following two levels:
• item - conceptual level of a semantic network.
• system - structural level of interconnection that ties the structured assertions of
facts represented in the network to items participating in those facts.
This levelization looks only at the actual semantic network represented on the page. It does not consider the semantics defined by the network or how this knowledge representation would be coded for machine processing. The item level is concerned with the nodes that appear in the network. These nodes are both concepts and relations, and have some definition represented within the semantic network. The system level, according to Shapiro, attempts to define the links that are present between the nodes in the network.
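As a rough illustration of the two levels, the following Python sketch keeps the items (nodes) separate from the structure that ties an asserted fact to the items participating in it. The item names, the fact identifier f1, and the helper functions are hypothetical; this is not Shapiro’s own notation or implementation.

```python
# Item level: the nodes of the network, both concepts and relations.
items = {"John": "concept", "Mary": "concept", "loves": "relation"}

# System level: the interconnection tying each asserted fact to its items.
facts = {"f1": {"relation": "loves", "arguments": ("John", "Mary")}}


def items_in_fact(fact_id):
    """Return every item (relation and arguments) participating in a fact."""
    fact = facts[fact_id]
    return (fact["relation"], *fact["arguments"])


def facts_about(item):
    """Return the ids of all facts in which the given item participates."""
    return sorted(fid for fid in facts if item in items_in_fact(fid))


participants = items_in_fact("f1")
mary_facts = facts_about("Mary")
```

Note that nothing here says what “loves” means; the sketch records only which nodes exist and how they are connected, which is all this levelization considers.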
In 1979, Brachman [11] tried to address the confusion about representations of knowledge by defining levels for different types of semantic network representations. In this way, Brachman was describing one representation in terms of another. When levels are defined by other levels and representations are defined by other representations, confusion [12] is produced in the field of knowledge representation. When knowledge representations have this interpretation, one can see it as a “levelization” of representations. Historically, this levelization of representations has been looked at mainly when discussing the specific knowledge representation scheme known as semantic networks [123], but it could be applied to most representation schemes. In his paper, Brachman defines a “level” as a distinctive type of node or link. These are conceptual levels, and a network’s notation can be analyzed in terms of any of these levels. The levels are the following:
1. implementation level - a network is only a data structure.
2. logical level - in a network, links represent logical relationships such as:
• ∀ (for all)
• ∃ (there exists)
• ¬ (not)
• ∧ (and)
• ∨ (or)
• → (implication)
• ≡ (if and only if)
3. epistemological level (Brachman’s missing level) - in a network, links give formal structure to conceptual units and create a set of their interrelationships as conceptual units.
4. conceptual level - in a network, links represent semantic or conceptual relationships.
5. linguistic level - in a network, primitive elements are language-specific and links stand for arbitrary relationships that exist in the world.
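The logical relationships named at the logical level have standard truth-functional readings. The short Python sketch below spells them out over Boolean values and a small finite domain; the helper names and the example domain are assumptions chosen only for illustration.

```python
from itertools import product


def implies(p, q):
    """p -> q is false only when p is true and q is false."""
    return (not p) or q


def iff(p, q):
    """p ≡ q holds when both sides have the same truth value."""
    return p == q


# Quantifiers over a small, finite, hypothetical domain.
domain = [1, 2, 3, 4]
for_all_positive = all(x > 0 for x in domain)   # ∀x. x > 0
exists_even = any(x % 2 == 0 for x in domain)   # ∃x. x is even

# Check two standard identities over every truth assignment.
identities_hold = all(
    (not (p and q)) == ((not p) or (not q))      # De Morgan: ¬(p ∧ q) ≡ ¬p ∨ ¬q
    and iff(p, q) == (implies(p, q) and implies(q, p))
    for p, q in product([False, True], repeat=2)
)
```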
Brachman’s levels are defined types of network nodes and links. He states:
“It should be clear, then, that one of the main problems with many of the older formalisms was their lack of a clear notion of what level they were designed for” [[11] page 32].
For the five levels given above, Brachman saw the implementation level as the lowest level; that is, the most basic type of network. This level has only data structures associated with it; there are really no semantics related to the network. The logical level is seen as needing the semantics of the basic logical operators. The conceptual level is similar to Shapiro’s item level discussed above. However, this level defines the semantics of the concepts being included within the level. The linguistic level is very abstract and is used to define, for the network, a level that has an open concept.
The epistemological level is seen by Brachman as a missing level, located between the logical level and the conceptual level. Brachman then uses all these levels to define semantic networks in terms of cases (or roles) with slots (or sets of fillers), by looking at the types of links needed when processing the network. Currently, this type of representation is known as “frames” and in some circles is a knowledge representation in its own right (see Section 6.1.1 for a discussion of a Frame system).
On evaluation of the main feature of the epistemological level, this author would place it between the conceptual and linguistic levels. The reason for the move is that it is similar to the system level discussed by Shapiro, and is very much concerned with the interrelationships of concepts and conceptual units. Guarino, like Brachman, also saw missing information in the levels, but rather than moving a level he argues for adding an ontological level to Brachman’s classification levels. The ontological level gives a foundation for the knowledge engineering process and depicts a set of features for the computational properties of each level (see Table 1.1) [45]. In Guarino’s view, the ontological level should be introduced between the epistemological and conceptual levels; it is neutral with respect to the epistemological level, although not every epistemological formalism is necessarily adequate. For Brachman and Guarino, all the levels are processed as part of the knowledge representation.
Table 1.1: Brachman and Guarino Classification Levels and Main Features (Adapted from [[45], Figure 6]).
Level            Primitive concepts           Main feature       Interpretation
Implementation   are pointers                 Concrete           Objective
Logical          are predicates               Formalization      Arbitrary
Epistemological  are structuring primitives   Structure          Arbitrary
Ontological      satisfy meaning postulates   Meaning            Constrained
Conceptual       are cognitive primitives     Conceptualization  Subjective
Linguistic       are linguistic primitives    Language           Subjective
Brachman did not try to look at representation from a computer processing point of view. Then in 1982, Newell [80] began the redefinition of a “level” from this new point of view. He defined a level in the following way:
“a level consists of a medium that is to be processed, components that provide primitive processing, laws of composition that permit components to be assembled into systems, and laws of behavior that determine how system behavior depends on the component behavior and the structure of the system” [[80] page 92].
Newell referred to computer systems levels as going through the following bottom (highest) to top (lowest) sequence:
• device level
• circuit level
• logic level (sub-levels - combinatorial and sequential circuits)
• register-transfer level and symbol (program) level
• configuration level
• knowledge level (new level)
As a third sibling just below the configuration level, Newell added a new level known as the knowledge level. For each of the levels, the following aspects need to be defined: the medium, the components, the assembly of the components into a system, the composition laws, and the behavior laws. In looking at knowledge and representation, Newell’s symbol level and knowledge level are the most important. Each of these levels has been defined according to the above aspects. The medium for the symbol level is symbols and/or expressions. The components include memories and operations. The components are assembled into systems known as computer systems. In the aspect of laws, composition is built on designation and/or association, while behavior is sequential interpretation. When looking at the knowledge level, the medium is knowledge. The components are goals, actions, and bodies (physical code). The composition laws are a set of actions, a set of goals, and a body (code) for the system that is referred to as the agent. Lastly, the behavior law is the principle of rationality: “Actions are selected to attain the agent’s goals”. This principle provides a general functional equation for the knowledge medium to act on. However, the agent is very abstract and has no real physical structure. The medium definition shows that knowledge is very open and has a potential for generating an action. The knowledge level is an approximation, and there are no guarantees on the system’s behavior.
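The knowledge-level components and the principle of rationality can be pictured with a deliberately small Python sketch. It is an assumption-laden toy, not Newell’s formulation: goals and world state are modeled as sets of facts, each action is modeled by the facts it makes true, the class itself stands in for the body, and the names push_door and wave are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    goals: set      # facts the agent wants to hold
    actions: dict   # action name -> set of facts the action makes true (assumed model)

    def select(self, state):
        """Principle of rationality: pick actions that attain a goal not yet met."""
        unmet = self.goals - state
        return sorted(
            name for name, effects in self.actions.items() if effects & unmet
        )


agent = Agent(
    goals={"door_open"},
    actions={"push_door": {"door_open"}, "wave": {"greeted"}},
)

chosen_before = agent.select(set())           # goal not yet attained
chosen_after = agent.select({"door_open"})    # goal already attained
```

The sketch fixes one concrete selection rule, which the knowledge level itself does not; as noted above, that level is only an approximation and gives no guarantees about the system’s behavior.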
In 2002, the Object Management Group (OMG), dealing with relating legacy systems to business modeling, continued to define a Model Driven Architecture (MDA) [114, 112]. Within this architecture, the modeling space transforms into the code space, where the representation of the business process/rules transforms all the way to the code to be deployed. MDA-enabled tools do this transformation using a set of levels to move from the business model, through an intermediate level that represents the aspects of the model that need to be coded, on to the actual generation of the code. Even though this is discussed in terms of an architecture instead of a knowledge representation, the transitioning of representation from an abstract level to a concrete code level still applies.
This work expands on Newell’s computer processing level idea, in particular investigating what could be the possible computational mechanisms or physical structures of the symbol level (representations), while seeing level relationships more from the point of view of Brachman’s definition [11]. This work defines level as:
“There is a level of processing of representations that sees the lowest level to be a very abstract representation and then, as levels increase, the representation becomes more concrete or machine like.”
The highest level of representation would then be processed directly by a computer (see Figure 1.1) because it is the actual implementation that is compiled or interpreted as machine code.
When one discusses semantic networks, it is not clear what rules, syntactic or semantic, are defined at each of these processing levels of representation. In many readings, it is not indicated clearly what rules can or should be processed directly by the computer at each knowledge representation level. In general, abstract representations are too informal for machine processing and need to be translated to another, more concrete representation. Therefore, for all forms of knowledge representation, translation to a more concrete representation allows coding, and later execution and analysis, to be performed with the computer.
Therefore, now consider representation in an AI system to be a series of these
processing levels. Encapsulating the knowledge representation (KR) is the level of
ontological information [81], level 0. This level would be considered the knowledge
level under Newell’s levels, part of the linguistic level for Brachman, and would be a
relocation of Guarino’s ontological level. The information represented is not actually
part of the structure of the domain knowledge and is the most abstract of all the levels
of representation and implementation. In fact, it is more of a hierarchy of conceptual
information than knowledge, so will be called “ontology” (see Section 2.1).
Figure 1.1: Levels of Representations. Level 0: Ontology; Level 1: Knowledge
Representation (defining representation); Level 2: Internal Representation (declaring
ADT); Level 3: Defining ADT; Level 4: Storing Representation (implementing ADT).
This ontology level contains more general information than what is found within
the KR [61]; this might also contain any meta data needed to be stored for the
knowledge representation. Within the ontology level any particular system may use an
abstract hierarchy. These hierarchies define relationships between the conceptual units
within the knowledge representation and information outside the KR, such as group
membership. Therefore, defined hierarchies are to be considered part of level 0 in our
representation levels.
KR will start processing at level 1. It should be noted that the semantics at this
level are declarative and/or procedural in terms of their interpretation to a second
representation, and therefore are not concrete. For Newell this level would be part of the
symbol level, very close to the knowledge level. This is where the representation of
the knowledge medium would begin. In Brachman’s levels this would encompass part
of the conceptual level and all of the epistemological level. The epistemological level
as defined clearly should be placed between the linguistic level and conceptual level as
opposed to where Brachman placed it in his work [11].
The second level of representation, level 2, is an internal representation that
could be viewed as a virtual machine. When comparing this level to the MDA architec-
ture, this would be the platform-independent modeling level. Within the representation
of KR, this is where the declaration of an abstract data type (ADT) is performed (see
Section 1.1.2). This syntactic representation is more formal and can be used in the def-
inition and implementation of the declared ADT. The syntactic rules are concrete and
define a mapping of symbols to operators. However, in order to implement the ADT
declared by this level of representation, there must be a third level of definitions giving
more structure to the representation.
This level, level 3, consists of the actual semantic definition of the ADT declared
in level 2. The semantic rules are also concrete, and define a mapping of operations to
functions. This representation level can be used to implement code for the computer to
store and retrieve knowledge. It defines the algorithms to be performed, and theoretical
time/space analysis can be performed on these algorithms. There is a strong connection
between level 2 and level 3 because the concrete rules of the representation in level 2
will work over the algorithms of level 3 during the implementation of the data structures
at the next level.
The innermost level of representation, level 4, is the actual implementation of
the ADT definition and the implementation within the MDA architecture. This level is
where all the data structures come together. It is at this level that a computer
programming language (see Appendix A), such as C, Prolog, Lisp, or a newly defined language,
is chosen [134]. This is also the level at which the coding of data structures and algorithms
will be performed, and any time/space analysis is done. This representation is the most
concrete. Level 4 is the representation where the domain knowledge being worked on
will tie into the computer language used for the implementation.
1.1.2 Speed and Efficiency in Processing
An abstract data type (ADT) (see Figure 1.2) can be broken down into two
parts: 1) specification and 2) implementation. The specification, which is abstract,
includes the definition of data types, including their structure and values, and supporting
operations for those data types; this half of an ADT will be referred to as a data model.
Figure 1.2: Abstract Data Type (ADT). An ADT divides into a SPECIFICATION (abstract:
set of values, set of operations) and an IMPLEMENTATION (concrete: data representation,
algorithm code bodies).
The data model provides a mapping from general knowledge to the abstract
element of the ADT. The implementation, which is concrete, contains the data repre-
sentation used by the algorithms and algorithmic code bodies of the operations. This is
the second half of the ADT, and connects the knowledge to the algorithms being used
for implementation.
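The two halves of an ADT described above can be sketched in code. The following is only an illustrative sketch in Python, not the implementation used in this work: the names `ConceptStore`, `ListConceptStore` and `HashConceptStore` are hypothetical. The abstract class plays the role of the specification (the data model), and each subclass supplies one possible data representation together with the code bodies of the operations.

```python
from abc import ABC, abstractmethod


class ConceptStore(ABC):
    """Specification (abstract half): the operations and their meaning."""

    @abstractmethod
    def add(self, label: str) -> None:
        """Insert a concept label; adding a duplicate has no effect."""

    @abstractmethod
    def contains(self, label: str) -> bool:
        """Report whether the label has been added."""


class ListConceptStore(ConceptStore):
    """Implementation (concrete half) using an unsorted list."""

    def __init__(self) -> None:
        self._items = []  # data representation

    def add(self, label: str) -> None:
        if label not in self._items:  # linear scan over the list
            self._items.append(label)

    def contains(self, label: str) -> bool:
        return label in self._items


class HashConceptStore(ConceptStore):
    """Same specification, different data representation (a hash set)."""

    def __init__(self) -> None:
        self._items = set()

    def add(self, label: str) -> None:
        self._items.add(label)

    def contains(self, label: str) -> bool:
        return label in self._items
```

Code written against `ConceptStore` does not change when one implementation is swapped for the other; only the time and space behavior changes.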
Many issues come into play when defining an ADT for a processor. Probably
one of the most important is the efficiency of data structures when implementing the
operations. When examining the data in terms of their time and space requirements, some
data structures are better than others. Well-designed and well-defined data structures
can certainly help in these respects, whereas poorly defined structures lead to
inefficiencies. The data structures directly affect the efficiency of both aspects of the ADT
because the data model deals with the data types, while the data representation is part
of the implementation. Modification of the data structures to give faster access times
to the data types and representations can help in the efficiency of the algorithms being
implemented for the operations to be performed. In this way the algorithms are being
implemented to work towards their best possible execution time, whereas, if the data
structures are not optimized, there will be a higher probability that the worst possible
execution time is seen.
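As a small illustration of how the data structure alone changes access cost, the sketch below counts the comparisons needed to find the same label in an unsorted scan and in a binary search over sorted data. It is a hedged example only; the function names and the generated labels are hypothetical and are not taken from the systems studied in this work.

```python
def linear_lookup(items, target):
    """Scan a list front to back; return (found, comparisons made)."""
    comparisons = 0
    for item in items:
        comparisons += 1
        if item == target:
            return True, comparisons
    return False, comparisons


def binary_lookup(sorted_items, target):
    """Binary search over a sorted list; return (found, probes made)."""
    lo, hi = 0, len(sorted_items) - 1
    probes = 0
    while lo <= hi:
        probes += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return True, probes
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return False, probes


# 1024 zero-padded labels, so list order is also sorted order.
labels = [f"concept{i:04d}" for i in range(1024)]
```

Looking up the last label costs one comparison per element in the linear scan, while the binary search needs at most about log2(n) + 1 probes on the same data.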
One important aspect in processing the underlying knowledge represented is
how to communicate that knowledge to other systems and applications. This may affect
the speed and efficiency of the implemented data storage.
1.2 Foundational Information
The following sections on subgraph isomorphism, unification and data/knowledge
bases give some foundational information as building blocks for working with the
reasoning operations that are the basis of this work.
1.2.1 Basis of Subgraph Isomorphism
When dealing with basic graph operations (see Section 3.1 for definitions), the
efficiency of some of the algorithms has been investigated by many researchers. In
looking at these algorithms and their efficiency, it is important to understand relation-
ships between the different complexity classes that they may fall into:
$P \Rightarrow NP \Rightarrow \text{NP-Complete} \Rightarrow \text{NP-Hard}$
P: problems that can be solved in polynomial time; NP: used here for problems that are
in NP, but not known to be either in P or NP-Complete; NP-Complete: decision problems
in NP to which every problem in NP can be reduced in polynomial time; and NP-Hard:
problems that are at least as hard as an NP-Complete problem, but need not be decision
questions and so need not themselves be in NP.
At the core of graph isomorphism (see Section 4.2.1 for full definition and
example), the problem is to find a mapping, f, of graph G to graph H, such that G and
H are identical. Discovering if two graphs are isomorphic is not known to be an
NP-Complete or P problem [42]. Given that $P \neq NP$, it is placed in the complexity
class between P and NP-Complete. For this discussion, it will be called the class ‘NP’.
However, in most cases involving reasoning operations, given graphs G and H,
a more important question than whether they are identical is knowing whether a small
pattern graph in G, a subgraph, is isomorphic to H. This is known as subgraph isomorphism.
Because this question can be restricted to the well known NP-complete problem of a
CLIQUE (see page 64 of Garey and Johnson [42]) by allowing only instances for which
H is a complete graph, it is known to be an NP-complete problem [42].
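The restriction to CLIQUE can be made concrete with a brute-force check. The sketch below is an illustration under stated assumptions, not an algorithm from this work: graphs are undirected and stored as dictionaries from a vertex to its set of neighbours, and the function names are hypothetical. It tries every injective assignment of the vertices of H to vertices of G, which is why its running time grows factorially with the size of the graphs.

```python
from itertools import permutations


def has_subgraph_isomorphic_to(g, h):
    """Return True if G contains a (not necessarily induced) copy of H.

    g, h: dict mapping each vertex to the set of its adjacent vertices.
    """
    h_vertices = list(h)
    h_edges = [(u, v) for u in h for v in h[u]]
    # Every ordered choice of distinct G vertices is a candidate mapping.
    for image in permutations(g, len(h_vertices)):
        mapping = dict(zip(h_vertices, image))
        if all(mapping[v] in g[mapping[u]] for u, v in h_edges):
            return True
    return False


def complete_graph(k):
    """Complete graph on vertices 0..k-1, i.e. a clique of size k."""
    return {i: {j for j in range(k) if j != i} for i in range(k)}


# G: a triangle a-b-c with one extra vertex d attached to c.
g = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
```

Asking whether `g` contains `complete_graph(3)` is exactly the CLIQUE question for k = 3; here the answer is yes, while for `complete_graph(4)` it is no.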
When sub-problems (“special cases”) of the subgraph isomorphic question are
analyzed, some are found to be solvable in polynomial time. One of these sub-problems
is subtree isomorphism [42]; this is when both G and H are trees (a graph, G1, is a tree
if and only if every two distinct vertices of G1 are connected by a unique path of G1
[Theorem 3.4 on page 69 of [13]]). A polynomial time algorithm for this sub-problem
was shown by Reyner [103].
When labels are added to graphs, such as in bipartite graphs, these can be factored
into an isomorphism algorithm. The two label types can significantly speed up
subgraph matching by allowing a pruning of some possibilities through separating the
vertices into two groups.
Another sub-problem that produces a polynomial time algorithm, besides just
labeling the vertices, is to require that a class or group of the vertices may only have a
specific number of edges [2], such as in feature term graphs (see Section 3.4.3). This
process constrains the problem to bring it into polynomial time.
It should be mentioned that all of these sub-problems concern two graphs
and consider the running time based on the number of vertices in the graphs,
n [68].
1.2.2 Overview of Unification/Matching
As was discussed in Martelli [67], “unification was first introduced by Robinson
[104] as the central step of the inference rule called resolution.” Resolution became a
single rule that could replace all the axioms and inference rules of first-order predicate
calculus and be used in designing mechanical theorem provers. Unification can be
expressed in the following way: given two terms containing some variables, find, if one
exists, the simplest substitution (assignment of some term to every variable)
which makes the two terms equal. This substitution becomes a matching of the terms
based on variable binding assignments and therefore is a unifier. There may be many
ways to unify a pair of terms, but there will be at most one most general unifier, MGU;
the other unifiers add extra bindings for sub-terms which are variables in the original
terms. If a unifier, U, is the MGU of a set of expressions, then any other unifier, V, can
be expressed as V = UW, where W is another substitution.
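The definition above can be sketched as a short recursive procedure. This is a minimal illustration, not Robinson's or Martelli's algorithm as published and not the code used in this work. It assumes a hypothetical encoding in which variables are strings beginning with "?", constants are other strings, and compound terms are tuples of the form (functor, arg1, ..., argN). It includes a check that a variable is never bound to a term containing itself.

```python
def is_var(t):
    """Variables are strings starting with '?' (an assumed convention)."""
    return isinstance(t, str) and t.startswith("?")


def walk(t, subst):
    """Follow variable bindings until an unbound variable or a non-variable."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t


def occurs(var, t, subst):
    """True if var appears anywhere inside term t under subst."""
    t = walk(t, subst)
    if t == var:
        return True
    if isinstance(t, tuple):
        return any(occurs(var, arg, subst) for arg in t[1:])
    return False


def unify(a, b, subst=None):
    """Return a most general unifier as a dict of bindings, or None."""
    subst = {} if subst is None else subst
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return None if occurs(a, b, subst) else {**subst, a: b}
    if is_var(b):
        return unify(b, a, subst)
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None  # clash between different constants or functors


def resolve(t, subst):
    """Apply the bindings in subst throughout term t."""
    t = walk(t, subst)
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve(arg, subst) for arg in t[1:])
    return t
```

For example, unifying `("name", "?x", "smith")` with `("name", ("child", "?y"), "?z")` binds `?x` to `("child", "?y")` and `?z` to `"smith"`, leaving `?y` free; any other unifier only adds a binding for `?y`.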
As discussed in Myaeng and Lopez-Lopez [77], graph matching has been recognized
as a central problem across many application areas. Many researchers have
attempted to reduce the computational complexity by developing application-specific
matches [78, 79]. As discussed above, while the general subgraph isomorphism prob-
lem is known to be NP-complete, matching graphs containing conceptual information
appears to be computationally tractable [77]. This is because conceptual graphs are
connected (acyclic or cyclic), bipartite (the vertices can be separated into two distinct
groups) and directed (finite), and, for the reasons given in Section 1.2.1, these properties
make the general subgraph isomorphism problem easier. However, adding labels to the
conceptual information [77] is essential in extending the plain graphs to be tractable.
If graph matching is reduced to a unification problem, then one should be able
to check a set $U$ of finite terms over a set of function symbols and a countable set of
variables, where there is defined a finite set of pairs of terms,
$\{\langle u_i, v_i \rangle \mid i \in I\}$. The question is now to determine if there
exists a substitution $\sigma = \{(x_j \rightarrow t_j) \mid t_j \in U,\ j > 0\}$
such that $\sigma(u_i) = \sigma(v_i)$ for $i \in I$ [84]. However, most unification
algorithms that can be done in linear time require that the graph is acyclic [84]. The
reason the graphs cannot contain cycles is the occurs check. This is a feature of
implementations of unification which causes substitution to fail if the structure S being
unified against contains the variable, V, being substituted [133]. If the occurs check is
not evaluated, then unsound inference could occur. Some implementations could go into
an infinite loop if a cycle appears in the structure; therefore, it is disallowed [84].
However, if the relationship is functional as in feature term graphs (see Section
3.4.3), then the unification can be performed even with cycles. Figure 1.3, from Willems’
paper on projection and unification with conceptual graphs [136], shows the unification
of two projections into a single graph even when one graph contains a cycle. Figure
1.3 will be explained more fully in Section 4.3.2.
U:  [Person:*y] -> (Name) -> [Word:*x].
G1: [Man] ->
      (Name) -> [Word:*x].
      (Child) -> [Girl:*y] -> (Name) -> [Word:*x].
G2: [Person:*y] -> (Name) -> [Word:*x ‘Smith’].
G:  [Man] ->
      (Name) -> [Word:*x ‘Smith’].
      (Child) -> [Girl:*y] -> (Name) -> [Word:*x ‘Smith’].
Figure 1.3: Unifier U, Projections U −→ G1 and U −→ G2, Unification G is Found (Adapted from [[136], Figure 5]).
1.2.3 Database vs Knowledge Base
As defined by Wikipedia: “a database is a collection of records stored in a
computer.” These records are fields of data that contain information that is queried
to answer questions and make decisions. This data is stored in files by records. A
“database management system” is used to access and query the fields and records of
the data. However, databases can only retrieve data that is explicitly stored
in their structures. No information that is not factual can be retrieved.
A knowledge base is like a database, but it contains more than fields and records
of data. Most knowledge bases also contain some kind of inference engine that uses
reasoning operations over the structure of the data stored in the records to infer more
information. The knowledge base, as described by Tappan, “operates over a framework
of objects, properties, and relations towards the goal of supporting reasoning” [128].
The framework will be different depending on the “goal” of the domain of knowledge.
Knowledge bases, like databases, use a management system, but this system adds
structural information to the data (many times called meta-data) to help in the discovery
of additional information. This meta-data gives organization to the data and allows
the deduction of contextual knowledge from implied semantics of the inference engine
[74, 128]. The new contextual knowledge deduced by implicit semantics may or may
not be factual, but can be presented to the user of the knowledge base to see if it should
be added to the stored data information. Then this new data can be used to answer
queries and help in making decisions.
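The difference between the two kinds of store can be illustrated with a toy example. The sketch below is hypothetical and greatly simplified: facts are (subject, relation, object) triples, the "database" query returns only what is explicitly stored, and the "knowledge base" first applies a single transitivity rule until no new facts appear. The type names are chosen only for illustration.

```python
facts = {
    ("Girl", "is_a", "Child"),
    ("Child", "is_a", "Person"),
    ("Person", "is_a", "Animate"),
}


def database_query(facts, subject, relation):
    """Retrieve only the explicitly stored objects for (subject, relation)."""
    return {o for s, r, o in facts if s == subject and r == relation}


def infer_transitive(facts, relation):
    """Forward-chain one transitivity rule over `relation` to a fixed point."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for s1, r1, o1 in list(known):
            for s2, r2, o2 in list(known):
                if r1 == r2 == relation and o1 == s2:
                    derived = (s1, relation, o2)
                    if derived not in known:
                        known.add(derived)
                        changed = True
    return known
```

Querying the stored facts for what a `Girl` is returns only `Child`; querying the facts after inference also returns `Person` and `Animate`. In a system like the one described above, such derived facts would be offered to the user before being added to the stored data.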
In the future, it is hoped that more and more users will use knowledge bases
over databases; however, the speed of retrieval from a knowledge base is slower than
a database because of the added structural information and because of the built in in-
ference engine. In this work, some of the advances that have been found in database
algorithms and data structures will be applied to knowledge bases, in the hope of improving
some of those retrieval speed problems.
1.3 Organization of Dissertation
This work will begin by looking at ontology, knowledge and representation in
Chapter 2. This will include looking at how ontology can be processed, knowledge
types and operation, and moving knowledge through different types of representations.
Above the outermost level of representation as seen in Figure 1.1 (one could describe
this level as being the top) would be a zero level of macro information. The informa-
tion represented is not actually part of the domain knowledge and is even more abstract
than the first level of representation. In fact, it is more of a hierarchy of conceptual
information than knowledge, so it will be referred to as abstract hierarchies; this
zero level is discussed in Section 2.1.1. Different ADT representations of knowledge are
used for implementing a semantic network KR. These internal representations (see Section
2.3.2) use different formal approaches for syntactic processing, such as: 1) proposi-
tional logic, 2) predicate calculus, and 3) graph grammar (with set theory). Some of
the representations of knowledge used by semantic networks will be discussed in more
detail in Section 2.3.1.3. However, within a propositional logic approach, propositions
and logical operators use arbitrary conceptual units and expression links to define nodes
and arcs with semantic descriptions and context. Predicate calculus is built on top of
a propositional approach and also incorporates the use of predicates with quantification
over variables. A graph grammar or set theoretic approach is not only built
above the predicate calculus and quantified variables approach, but also uses primitive
objects and actions with procedural operators to help define the semantics of the
network. Graph grammars built on top of graph theory instead of set theory also give a
visual representation which can be more expressive (see Section 3.1). Several different
types of knowledge representations were originally investigated, but a detailed example
(see Section 2.3.1.3) will be given for semantic networks.
Chapter 3 gives definitions for several elements that will be used throughout
the thesis, so that the reader has a frame of reference. Data structures that are rele-
vant to the implementation of the problem are defined and their basic running times are
examined. Reasoning operations, projection and maximal join, are then explained in
Chapter 4. Chapter 5 presents a new projection algorithm after explaining and analyzing
the foundational projection algorithms; it then goes on to theoretically analyze the
new algorithm and show how it compares in a “typical case” with the other algorithms.
Some example environments/systems will be discussed in Chapter 6. KL-ONE,
SNePS, SNAP, PEIRCE, CoGITaNT, Amine, pCG and CPE are all semantic network
knowledge representation systems. In Chapter 6 each of these systems will be
discussed, evaluating the different ADT representations that are used in each case. Chapter
6 also gives an evaluation of possible data structures to be used in the implementation of
the new algorithm given in Chapter 5. While implementing different ADTs, different data
structures will be explored, and their efficiency in storage and speed of algorithmic
execution will be analyzed. This leads into the practical element of this dissertation.
An important aspect of the dissertation (the practical element) is presented in
Chapter 7, where changes in the data structures and algorithms are shown to affect speed,
efficiency, flexibility and space needs. One can see how a change in the data
structures can affect the system speed and efficiency. The aim is a system fast enough to
retrieve and process thousands of graphs in a reasonable amount of time for simple
query processing, and therefore a usable system. As well as tying the
algorithms to the data structures to improve the system’s functionality, the new system
was designed with flexibility in mind, such that even a sub-part of the system,
a module, can connect to and be used by another standalone application. Also, the new
algorithm can find results that the baseline system was not able to process. These last
two features are important contributions of this dissertation. Chapter 8 draws
conclusions and describes future work. Different implementation languages were examined
to find the fastest system; they are discussed in Appendix A. Next, Appendix B gives
the actual CGIF format for the 2001 version. Appendix C gives documentation for how
pCG programs work and for the implementation of the CPE systems. Appendix D gives
sample data of the averages and error spreads that are shown in the experimental results.
It also shows verification that the CPE algorithm for projection gives correct results for
both single and multiple projections.
CHAPTER 2
ONTOLOGY, KNOWLEDGE AND REPRESENTATION
For use later in declaring an ADT for an internal representation, this chapter begins
by evaluating the interplay between hierarchies, relationships and operations. These
elements have impact on how the higher levels of representation will be designed, de-
fined and implemented, and an understanding of each of these elements is necessary to
clarify the different representation level interactions.
2.1 Ontology
Unlike the definition of a knowledge base given in the Tappan thesis [128]
(discussed in Section 1.2.3), the knowledge base here is more than just its ontology; it also
includes the higher levels of representation. Hierarchies and relationships are the abstract
elements of ontology, and are more informal and open in their presentation of the
representation. One can see them as the building blocks of the ontology; therefore, they will be
discussed in this section. Operations are more processing oriented and give a more
concrete representation. These depict how information blocks are put together, so will
be discussed later (see Section 2.2.2). Below are defined some of the elements of the
knowledge that are needed to process the representation levels. As discussed in [80],
the knowledge level gives the general functional expression that builds the notation at
the symbol or programming level. Evaluating the function of an ontology will reveal
some basic elements of this expression.
In order to look at the ontological elements of knowledge representation, let us
define some basic entities: object and act. An object is a thing, for example a subject of
a sentence, and is commonly considered to be a physical object such as a ball, a book,
a person, etc. An object has size, shape, mass, color, temperature, speed, etc. Basically,
an object exists as a physical thing. An act is something performed, i.e. a verb in a
sentence. An act has properties such as rate, acceleration, direction, orientation, etc. Each
of these entities is important in understanding overall concepts about representations, and
both not only have related term information, but also exist in time and space.
As defined in [120, 123], ontology comes from the Greek words onto, being,
and logos, and means the study of being or the basic categories for existence. An
ontology is a synonym for the arrangement of a generalization hierarchy that classifies the
categories or concept types of the hierarchy. The ontology also looks at the
relationships, operations and constraints that are essential to help define the nature (knowledge)
of our world or reality [106]. This general knowledge defines an informal list of
concepts that are part of the domain. These concepts will be seen as terms (see Section
2.1.2 for more of a discussion on terms) within the ontology, and they may be defined
by categories [106] in which they are members. The next section will begin by looking
at different abstract hierarchies, and later tie the hierarchies to the categories of objects
within a domain.
2.1.1 Abstract Hierarchies
Abstract hierarchies can be used to define membership in various categories, or
give macro definitions about the categories. The actual structure of hierarchies will be
discussed in Section 3.2 within Chapter 3 containing definitions.
2.1.2 Relationships
Relationships can be divided into different categories: compositional,
quantitative and/or qualitative [43]. Each of these relationships is involved in the construction
and propagation of information within sentences or expressions. Quantification deals
with fuzzy quantifiers like most, as they relate to the classical universal and existential
quantifiers; qualification looks at fuzzy probabilities [43]. Sentences from a logic point
of view may be simple predicates with an arity of n term arguments that return either
a true or false value. Sentences also may be more complex and return more complex
information within structures. The first step is to look at simple sentences and how they
are built using compositional operations.
2.1.2.1 Compositional
Within simple sentences there are term arguments [65]. Terms may be of three
different types: constant symbols, variable symbols and function expressions. The con-
stant symbols are symbols that do not change; two known constant symbols are the
truth symbols, true and false. These symbols may also be things such as numbers, 1, 2,
etc. Each of these symbols has a known interpretation as specific objects or acts in the
world. They are also members of a specific category within the defined world. Variable
symbols are used to designate general classes of objects or properties in the world [65].
Variables are not constant, and as seen later, they may be substituted. Function sym-
bols have an attached arity indicating the number of elements of the domain mapped
onto each element of the range. A function expression consists of a function symbol
followed by the number of terms indicated in the function symbol’s arity.
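The three kinds of term can be written down as simple record types. The sketch below is only one possible encoding and the class names are hypothetical; it shows the arity rule from the paragraph above, namely that a function expression is well formed only when the number of argument terms equals the arity of its function symbol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constant:
    name: str  # e.g. "true", "false", "1", "john"


@dataclass(frozen=True)
class Variable:
    name: str  # stands for a general class of objects; may be substituted


@dataclass(frozen=True)
class FunctionSymbol:
    name: str
    arity: int  # number of argument terms the symbol expects


@dataclass(frozen=True)
class FunctionExpression:
    symbol: FunctionSymbol
    args: tuple  # tuple of Constant, Variable or FunctionExpression

    def __post_init__(self):
        # Enforce the arity rule when the expression is built.
        if len(self.args) != self.symbol.arity:
            raise ValueError(
                f"{self.symbol.name} expects {self.symbol.arity} argument(s), "
                f"got {len(self.args)}"
            )


# A unary function symbol applied to one constant term.
father_of = FunctionSymbol("father_of", 1)
expression = FunctionExpression(father_of, (Constant("john"),))
```

Building `FunctionExpression(father_of, (Constant("john"), Variable("X")))` raises a `ValueError`, because two terms are supplied to a symbol of arity one.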
The terms are built into sentences using connectives. Thereare different types of
Boolean connectives that are used when mathematically working with sets or equations,
for example: conjunction, disjunction, negation, implication and equivalence. These
Boolean connectives can be used to create sentences or composite sentences by treating
the connectives as compositional relationships (or functions). Each of these connectives
operates as follows: the Conjunction (“and”) operator forms a ‘collective’1 set, where
each member of the set is “anded” with the other members of the set; the Disjunction
(“or”) operator forms a ‘distributive’2 set, where each member is “ored” with the other
members of the set; the Negation (“not”) operator forms an ‘opposite’ set, where each
member is the opposite of what it is in the set; the Implication (“if A then B”) operator
is used in an equation where, when A is true, the truth value assigned to the implication
is the truth value of B; otherwise, the implication is always true; and the Equivalence
(“equals”) operator forms an ‘identity’ set, where all members within the set satisfy the
following properties: 1) Reflexivity: a ≡ a; 2) Symmetry: if a ≡ b then b ≡ a; and
3) Transitivity: if a ≡ b and b ≡ c then a ≡ c.
1 Refers to a generic assemblage of items.
2 Refers to a generic bag of items.
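As an illustration only, the connectives described above can be sketched as truth functions over Boolean values. The function names (conj, disj, neg, implies, equiv) are hypothetical and are not taken from any system discussed in this work; the sketch simply checks the stated behaviour of implication and the three properties of equivalence over all Boolean assignments.

```python
from itertools import product

# Each connective written as a truth function over Python booleans.
def conj(a, b):
    return a and b

def disj(a, b):
    return a or b

def neg(a):
    return not a

def implies(a, b):
    # When a is true the result is the truth value of b; otherwise it is true.
    return (not a) or b

def equiv(a, b):
    return a == b

BOOLS = (False, True)

# Truth value of "if A then B" for every assignment of A and B.
implication_rows = {(a, b): implies(a, b) for a, b in product(BOOLS, repeat=2)}

# The three properties of equivalence, checked over every assignment.
reflexive = all(equiv(a, a) for a in BOOLS)
symmetric = all(implies(equiv(a, b), equiv(b, a))
                for a, b in product(BOOLS, repeat=2))
transitive = all(implies(conj(equiv(a, b), equiv(b, c)), equiv(a, c))
                 for a, b, c in product(BOOLS, repeat=3))
```

Under these assumptions the only assignment that makes the implication false is A true and B false.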
These simple and complex sentences or expressions can be expanded to use
variable symbols and function expressions. This introduces more complex relationships
where properties are applied to a whole set of terms or a collection. These relationships
may be either quantitative or qualitative in nature. A quantitative relationship
propagates information by performing quantification of values and variables to provide
an interpretation or meaning for a symbol or expression [65]; a qualitative relationship
is based on qualitative physics and propagates through both moments in time when acts
occur, and locations of objects in space [47]. Both types of relationship will be
used in the next section when discussing constraints. First, each type is
examined more closely.
2.1.2.2 Quantification
Quantification allows variables to be replaced with numeric values, so that
numbers and arithmetic operations can be used within a reasoning process [92]. When
there is a fixed number of constant symbols with only a finite number of substitution
possibilities, a truth value assignment can be determined as either true or false for
each substitution of the quantification. These truth value assignments are then collected
into a truth table, giving an interpretation for expressions or sentences over a domain.
A truth table can be used to exhaustively test all possible assignments of member
values [12].
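The exhaustive test described above can be sketched as follows. This is a minimal illustration, assuming that a sentence is supplied as a Python function over an assignment dictionary; the helper name truth_table and the example sentence are hypothetical.

```python
from itertools import product

def truth_table(variables, sentence):
    """Evaluate `sentence` under every assignment of True/False to `variables`."""
    rows = []
    for values in product((False, True), repeat=len(variables)):
        assignment = dict(zip(variables, values))
        rows.append((assignment, sentence(assignment)))
    return rows

# Hypothetical example sentence: (A and B) implies A.
def example_sentence(v):
    return (not (v["A"] and v["B"])) or v["A"]

table = truth_table(["A", "B"], example_sentence)

# The sentence holds under every assignment if each row evaluates to True.
holds_everywhere = all(value for _, value in table)
```

With n variables the table has 2^n rows, which is why this approach is only practical when the number of substitution possibilities is finite and small.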
However, a more common use of a quantitative relationship is with variables.
Variables may be quantified in two ways: 1) universally or 2) existentially. A variable
is universally quantified when the sentence is true for every constant of the intended
interpretation that is substituted for the variable. The symbol indicating this universal
quantifier is ∀. Universal quantification introduces problems in computing truth value
assignments for a complete sentence. There is now an infinite number of possible
substitutions, which makes the creation of a complete truth table impossible.
Exhaustively testing all substitutions computationally is an undecidable problem
[65]. At the same time, the quantitative relationships (or functions) allow a larger mapping
of information in a knowledge base, which can be more powerful, as seen later.
The second form of quantification is existential. In this case at
least one substitution is true for the variable across the interpretation of the domain.
The symbol for an existential quantifier is ∃. Existential quantification is no easier to
compute than universal quantification, because of the infinite number of possibilities.
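For a finite domain, both quantifiers can be evaluated by enumeration. The sketch below is an illustration under that assumption only; the names for_all and exists and the example domain are hypothetical. Over an infinite domain the same enumeration may never terminate, which is the difficulty described above.

```python
def for_all(domain, predicate):
    """Universal quantification: the predicate holds for every element."""
    return all(predicate(x) for x in domain)

def exists(domain, predicate):
    """Existential quantification: the predicate holds for at least one element."""
    return any(predicate(x) for x in domain)

# Hypothetical finite domain of constants.
domain = range(10)

def is_even(x):
    return x % 2 == 0

def is_non_negative(x):
    return x >= 0

some_even = exists(domain, is_even)             # at least one substitution is true
all_even = for_all(domain, is_even)             # fails on the first odd constant
all_non_negative = for_all(domain, is_non_negative)
```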
For quantification of variables, the scope of the quantified variable is indicated
by enclosing the quantified occurrences of the variable in parentheses. Quantification
allows one to look at infinite possibilities within one instance in time and space; when
a time or space continuum is introduced then qualitative relationships (or functional
mappings) are needed.
2.1.2.3 Qualitative
Qualitative relationships are used in qualitative physics. This area of knowledge
representation is concerned with constructing a logical, non-numeric theory of objects
and acts [106]. This theory defines relationships that process operators of time and
space. In order to define these relationships, one must define what entities and relationships
are relevant to time and space. An entity in the domain of time is called a moment
or instant [47], while an entity in a spatial domain is a location. A relationship over
time is an interval and over space is a region. When these aspects of entities are related
to objects and acts within the world and to each other, interesting things start to occur.
When one looks at the properties of an object for the current point in time and
space, the object is said to have a state [47]. If that object is in relationship with other
objects, a partial ordering of states can be produced. When the properties of an act
for the current point in time and space are examined, there is a process; that act in
relationship with other acts gives a partial ordering of processes.
When one now starts to look at the interconnection between objects and acts, if
a set of objects is in a spatial relationship for a single moment in time, the objects are
said to have a schematic. As these relationships are looked at over a set of moments in time,
one gets a partial ordering of schematics. Also, when there is a set of acts in a temporal
relationship for a single region in space, it has a chronicle. Extending this to a set of
regions, one gets a partial ordering of chronicles.
When qualitative relationships are executed as functions the above defined par-
tial orderings are processed. For example, a ball in one state may be at the top of a
bounce, and in the next state at the bottom. However, the interesting part is that when
moving between the two states the ball also moved from point A to point B, which
was a forward direction. Here a time and space progression is seen being performed
within an operation. There are many more temporal and spatial qualitative functions
(see Hartley’s work [47] for a much more complete list).
Now, if one looks at a set of objects that are participating in a single act, this
is said to be an event. If one looks at a set of acts with a single object, this can be
described as an experience. Both an event and an experience are atomic units to the
single act or object, respectively. It should be noted that events are time-independent
and experiences are space-independent [47]. Unlike the functions defined above, these
do not produce a partial ordering across entities. This is because there is only one act
or object present. However, these events and experiences can be linked together in a
set to form related knowledge structures. These structures are similar to standard case
relations already available within knowledge representations [47].
Temporal and spatial operators allow one to discuss time and space relationships.
However, the drawback lies in how to represent the knowledge, and in the time and
space it takes to process the partial orderings.
2.2 Knowledge
First, consider more closely the actual definition of knowledge and learning from
Piaget [98],
“in each act of understanding, some degree of invention is involved; in
development, the passage from one stage to the next is always characterized
by the formation of new structures which did not exist before,
either in the external world or in the subject’s mind” [98, page 70]
and the types of knowledge that make up this definition, as discussed previously in
Section 1.1. If one learns information and then keeps it in mind so that it can be
understood, obviously that information, or knowledge, must be stored. However, the
representation in which it is actually stored is still a mystery. Whatever the representation, the
mind is able to recall the information at will.
Second, ontological information can add additional structure to the representation
by placing a macro level of knowledge for the defined world, outside of the types
of conceptual knowledge that have been defined. This additional structure can work
through knowledge operations to use the representation levels previously defined (see
Section 1.1.1) to store information in a knowledge base.
2.2.1 Types
Therefore, to examine more closely what representation the mind might use to
store knowledge, the types of knowledge will be discussed. Knowledge can be thought
of as Declarative or Procedural; the following sections will define what is entailed in
each type.
2.2.1.1 Declarative Knowledge
The first type of knowledge is known as declarative knowledge, describing a
collection of definitions about the world. Throughout history, language has been used
to describe knowledge and conceptual relationships. In many instances it is easier to
describe definitions of concepts and their relationships in words, for example: a cat is an
animal with four legs and a long tail. In this example, a definition is being given
that assigns attributes and characteristics to a cat; that is, an attribute of four legs and a
characteristic of a long tail. Domain information could also be given, that is, a
cat is an animal, but this will be discussed further under ontologies. It can be noted that in
the definition of a cat, it has been declared that this animal has four legs and a tail. In
some other world of declarative knowledge, a cat may have only three legs and no tail.
2.2.1.2 Procedural Knowledge
The second type of knowledge is procedural knowledge, describing the temporal,
spatial, and constraint aspects for the above definitions. It is believed that there is a
duality between these two types of knowledge [107], but one type is inadequate without
the other. If the simple example given above is expanded to include a location for the
cat, it can now be defined that: a cat is an animal with four legs and a tail and the cat
is located on a mat. Within this expanded example both types of knowledge are being
used: 1) cats have attributes and characteristics, and 2) spatially, the cat is located on a
mat. If this statement is then slightly changed to say that a cat with four legs and a tail
sat on a mat, it declares not only definitional information about the cat (four legs and
a tail), but also spatial information about the location (a mat) and temporal information
(sat, that is, this moment in time). Sometimes, written language is not an easy tool to use
to describe all knowledge information. Suppose the example is changed to add one more
temporal wrinkle: A rat sat on the mat before a cat sat on the mat. Assuming that the cat
is the one already defined in our world knowledge, there can now be two interpretations
of this idea: 1) the rat is sitting in front of the cat on the mat at the same time, or 2) the
rat sat on the mat prior to the time the cat sat on the mat. Here a picture or a time diagram
(see Figure 2.1 from [95] on page 176) can help display the correct interpretation. Again,
both types of knowledge are being used: 1) cats located on mats and rats located on mats;
2) spatially, the cat on the mat and the rat on the mat; and 3) temporally, the rat on the
mat before the cat on the mat or the rat and cat on the mat at the same time.
Figure 2.1: Time Chart.
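The two readings of “a rat sat on the mat before a cat sat on the mat” can be separated once time and space are recorded explicitly. The following sketch is only an illustration of that idea and is not the time chart of Figure 2.1; the Sitting record, its integer moments and positions, and both predicates are assumptions introduced here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sitting:
    who: str
    start: int     # first moment of the interval (hypothetical integer clock)
    end: int       # last moment of the interval
    position: int  # location along the mat; a smaller value is nearer the front

def temporally_before(a, b):
    """Reading 2: the interval of a ends before the interval of b begins."""
    return a.end < b.start

def spatially_in_front(a, b):
    """Reading 1: a and b overlap in time and a is nearer the front."""
    overlap = a.start <= b.end and b.start <= a.end
    return overlap and a.position < b.position

# Temporal reading: the rat leaves the mat before the cat arrives.
rat_earlier = Sitting("rat", start=0, end=2, position=0)
cat_later = Sitting("cat", start=3, end=5, position=0)

# Spatial reading: both sit at the same time, the rat in front of the cat.
rat_front = Sitting("rat", start=0, end=5, position=0)
cat_behind = Sitting("cat", start=0, end=5, position=1)
```

Each pair of records satisfies exactly one of the two predicates, which is the distinction that written language alone leaves open.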
2.2.2 Operations
Besides using the relationships just defined above, an example system uses different
types of operations to process the internal knowledge being stored in the knowledge
base and the hierarchies of ontology information being applied to the data structures.
2.2.2.1 Terminological
Terminological operations work over terms or concepts and are designed to facilitate
the expression of definitions [66]. Some common operations are: subsumption,
inheritance, completion and coherence. Let us look at each operation briefly [141].
Subsumption, as defined in Section 3.2, is when a term is subsumed by another term.
When all appropriate subsumption relations are identified for a given set of terms, then
the terms are said to be classified. Inheritance is the operation of identifying the appropriate
subsumption relations, and completion is the process of identifying and recording
all conditions that should be applied to a term so it can be classified. Lastly, the coherence
operation is finding a model in which the term’s denotation is not empty. These
terminological operations work above the knowledge base when trying to actually process
rules or predicates.
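A minimal sketch of subsumption and classification over a small term hierarchy is given below. It assumes that each term has a single, explicitly recorded parent; the PARENT table, its terms, and the function names are hypothetical and serve only to illustrate the operations named above.

```python
# Hypothetical hierarchy: each term maps to its immediate, more general term.
PARENT = {
    "cat": "mammal",
    "rat": "mammal",
    "mammal": "animal",
    "animal": "Top",
}

def ancestors(term):
    """All terms reachable by repeatedly following the parent link."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain

def subsumes(general, specific):
    """True when `general` is `specific` itself or one of its ancestors."""
    return general == specific or general in ancestors(specific)

def classify(terms):
    """Record every (general, specific) subsumption pair among distinct terms."""
    return {(g, s) for g in terms for s in terms if g != s and subsumes(g, s)}

classified = classify(["cat", "mammal", "animal"])
```

Here a set of terms counts as classified once every subsumption pair among them has been identified, following the description in the text.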
2.2.2.2 Assertional
Assertional operations try to state constraints or facts that apply to a particular
domain or world [66]. The most common assertional operation is realization. Realization
is the process of identifying all concepts that have been instantiated [141]. Once
a concept has been instantiated, it can be entered into the domain as a fact. One very
important aspect of this operation is whether or not the closed world assumption is being
made, under which only definitions or facts defined within the world can be operated on
[141]. Most systems no longer make this assumption.
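The practical effect of the closed world assumption can be sketched with a small fact store. This is an illustration only; the FACTS set and both query functions are hypothetical. Under the assumption an unrecorded fact is treated as false, whereas without it the same fact is merely unknown.

```python
# Hypothetical facts entered into the domain after realization.
FACTS = {
    ("cat", "located-on", "mat"),
    ("cat", "is-a", "animal"),
}

def holds_closed_world(fact):
    """Closed world: anything not recorded in the world is taken to be false."""
    return fact in FACTS

def holds_open_world(fact):
    """Open world: a missing fact is unknown (None) rather than false."""
    return True if fact in FACTS else None

recorded = ("cat", "located-on", "mat")
unrecorded = ("rat", "located-on", "mat")
```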
2.2.2.3 Generalization
The simplification operator generalizes an entity by taking it to a more general
form [119]. This generalization sometimes removes part of a conceptual idea that has
more information to take the idea to a more general form. When generalization is
performed on hierarchies the concepts are moved upward in the hierarchy from the
bottom to the top. The top (⊤) is the most generalized.
2.2.2.4 Specialization
The join operator (see Section 4.1.2) allows the specialization of entities by
performing unification (see Section 1.2.2) between two entities. When unification is
performed, a substitution is made in one entity by another [106]. If it is a concept that
is being unified, then the concept may go from a general form to a more specific one.
Specialization on hierarchies moves the concepts from the top toward the bottom. The
bottom (⊥) is the most specialized.
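Movement within a hierarchy under generalization and specialization can be sketched as follows. The sketch assumes a single chain of concepts between the top and the bottom, which is a simplification; it does not implement the join or simplification operators themselves, and the CHAIN list and function names are hypothetical.

```python
# Hypothetical single chain ordered from most general to most specialized.
CHAIN = ["Top", "entity", "animal", "cat", "Bottom"]

def generalize(concept):
    """Move one step toward the top; the top generalizes to itself."""
    i = CHAIN.index(concept)
    return CHAIN[max(i - 1, 0)]

def specialize(concept):
    """Move one step toward the bottom; the bottom specializes to itself."""
    i = CHAIN.index(concept)
    return CHAIN[min(i + 1, len(CHAIN) - 1)]

def generalize_fully(concept):
    """Repeat generalization until no more general concept exists."""
    while generalize(concept) != concept:
        concept = generalize(concept)
    return concept
```

A real type hierarchy is a partial order rather than a chain, so a concept may have several more general or more specific neighbours.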
2.3 Representation
As was discussed in Section 1.1.1 of the introduction, conceptual ideas can be
transformed through representation levels to a form that can be processed by a com-
puter. In this section, examples will be discussed from level 1 and level 2 of those
representation levels.
2.3.1 Knowledge
Level 1 from Figure 1.1 discussed the concept of knowledge representation
(KR) at a beginning syntactic level. This section will present three KRs: Logic, Rule-Base,
and Semantic Network, and their basic representation of knowledge.
2.3.1.1 Logic
Logic as a knowledge representation looks at representation of knowledge in
two parts: implicit and explicit [63]. The implicit part allows knowledge to be repre-
sented within a closed world assumption; that is, it contains a set of sentences of the
form (s ≠ t) for any two terms in the universe that have not already been explicitly
defined. This allows the user of the world to know what is “not true” for the universe.
This part of the knowledge, when speaking about the processing level of knowledge
representation, relates to the ontology (as discussed in Section 2.1) of the Logic KR.
The explicit part is a collection of first-order sentences (a subset are called Horn
clauses) of the form:
$\forall x_1 \cdots x_n \, [P_1 \wedge \cdots \wedge P_m \supset P_{m+1}]$, where $m \geq 0$ and each $P_i$ is atomic.
If m = 0 and the arguments to the predicates, P, are all constants, then there
is nothing more than a relational database of facts. However, this may be a first-order
logic, FOL (see Section 3.3), sentence. These first-order sentences define what
is “known” about the universe, and give the syntax of the Logic KR. Logic must be
mapped to the next machine processing level using some ADT (see Section 2.3.2).
The computational part of Logic KR is the execution, or inference, of the logic
system. This can be seen as a form of semantics for logic. The inference engine also
uses an ADT declaration to interface to a machine representation. See Figure 2.2 for
a simple example of a knowledge representation (KR) and internal representation (IR)
that uses logic. Within this example, at the KR level one can see a FOL (see Section
3.3 for definition) sentence where there is a red block on top of a yellow block which
is on the table. When translating this to the IR level, the KR single sentence translates
into 13 triples of relationship information.
KR level
( Table( table-1 ) ∧
  (( Block( block-1 ) ∧ Color( Yellow ) ) ∧ Supported-by( block-1, table-1 )) ∧
  (( Block( block-2 ) ∧ Color( Red ) ) ∧ Supported-by( block-2, block-1 )) )
IR level
(inst table-1 table)
(inst block-1 block)
(color yellow block-1)
(and and1 block-1 yellow)
(supported-by sup1 block-1 table-1)
(and and2 and1 sup1)
(inst block-2 block)
(color red block-2)
(and and3 block-2 red)
(supported-by sup2 block-2 block-1)
(and and4 and3 sup2)
(and and5 and2 and4)
(and and6 and5 table-1)
Figure 2.2: Logic Example.
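The IR level of Figure 2.2 can be held directly as a list of tuples. The sketch below copies the entries from the figure; the storage as Python tuples and the helper functions select and supporter_of are assumptions made for illustration and do not describe the ADT used later in this work.

```python
# The IR entries of Figure 2.2, stored as tuples (predicate first).
IR = [
    ("inst", "table-1", "table"),
    ("inst", "block-1", "block"),
    ("color", "yellow", "block-1"),
    ("and", "and1", "block-1", "yellow"),
    ("supported-by", "sup1", "block-1", "table-1"),
    ("and", "and2", "and1", "sup1"),
    ("inst", "block-2", "block"),
    ("color", "red", "block-2"),
    ("and", "and3", "block-2", "red"),
    ("supported-by", "sup2", "block-2", "block-1"),
    ("and", "and4", "and3", "sup2"),
    ("and", "and5", "and2", "and4"),
    ("and", "and6", "and5", "table-1"),
]

def select(predicate):
    """All entries whose first element is the given predicate."""
    return [entry for entry in IR if entry[0] == predicate]

def supporter_of(item):
    """The object directly supporting `item`, or None if none is recorded."""
    for _, _, above, below in select("supported-by"):
        if above == item:
            return below
    return None
```

Counting the entries confirms the 13 items of relationship information mentioned in the text, and a query such as supporter_of("block-2") can be answered without returning to the KR-level sentence.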
2.3.1.2 Rule-Bases
Rule-Base knowledge representations are procedural schemes that represent
knowledge as a set of instructions for solving a problem. The instructions are in the
form of if ... then ... rules and may be interpreted as a procedure for solving a goal
in a problem domain. At the heart of the system is a knowledge base that holds the
instructions. An inference engine takes the rules (knowledge) from the knowledge base
and applies them in the correct order to produce a solution (goal) to an actual problem.
This is a recognize-act control cycle, and procedures that implement the control cycle
are separate from the rules in the knowledge base. The procedures can be seen as the
semantics of the system and they produce a very simple ADT for operation by the inference
engine for processing the rules. Rule-Base systems are the basis of expert systems,
and an expert provides the rules for the system. These systems focus on a narrow set of
problems in which knowledge is extracted from a specialist in this area.
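A recognize-act control cycle can be sketched as a simple forward-chaining loop in which the rules are data and the control procedure is kept separate from them. The rules, fact names, and the function recognize_act below are hypothetical and are intended only as a minimal illustration, not as the control strategy of any particular expert system.

```python
# Hypothetical rules: (set of conditions, conclusion). The rules are plain data,
# kept separate from the procedure that applies them.
RULES = [
    ({"has_fur", "says_meow"}, "is_cat"),
    ({"is_cat"}, "is_animal"),
]

def recognize_act(initial_facts, rules):
    """Repeat the recognize-act cycle until no rule adds a new fact."""
    facts = set(initial_facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # Recognize: every condition holds and the conclusion is new.
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)  # Act: assert the conclusion as a fact.
                changed = True
    return facts

derived = recognize_act({"has_fur", "says_meow"}, RULES)
```

Because the loop only ever adds facts and stops when a full pass adds nothing, it terminates for any finite rule set.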
2.3.1.3 Semantic Network
A semantic network is an example of a knowledge representation that is dis-
played as a discrete graphical structure of vertices and arcs [61]. Within the graphical
structure, the vertices are called nodes and may be displayed as circles or boxes. The
arcs are called links and are displayed as lines with arrows between the nodes. The
nodes are related to each other through their links, where the links are assigned a one-
to-one correspondence with a conceptual meaning defining the relationship [108].
The nodes are sometimes called conceptual units and may be seen as objects
within the network. These objects may be of many different types including entities,
attributes, events or even states. Syntactically, each object is just a symbol (normally
text within a box or circle), in the graphical structure. On top of the semantic network,
abstract hierarchies are organized according to levels of generalization for the concep-
tual units. These hierarchies were discussed in Section 2.1.1. The links of the network
form relational connections between the conceptual units, such that the valence (or parity)
of the relational connection is the number of units that are connected to a particular
unit with a link. In a semantic network links are usually dyadic (binary) connecting two
conceptual units together.
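A semantic network with dyadic links can be sketched as a list of labelled links between node names. The link labels and node names below are hypothetical examples; the representation as (source, label, target) tuples is an assumption made for illustration.

```python
# Hypothetical dyadic links: (source node, link label, target node).
LINKS = [
    ("cat", "is-a", "animal"),
    ("cat", "located-on", "mat"),
    ("rat", "is-a", "animal"),
]

def nodes(links):
    """Every conceptual unit that appears at either end of a link."""
    return {n for source, _, target in links for n in (source, target)}

def outgoing(node, links):
    """The labelled links leaving `node`, as (label, target) pairs."""
    return {(label, target) for source, label, target in links if source == node}

all_nodes = nodes(LINKS)
cat_links = outgoing("cat", LINKS)
```

Each tuple connects exactly two conceptual units, which corresponds to the binary links described above.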
The syntax of the semantic network is a set of the grammatical rules that express
how the symbols of the network can be combined within the graphical structure. In
this way, the syntax of the network is very abstract. The semantics of the network
is the abstract meaning of the links and their nodes. Because the semantic network’s
representation, in the abstract, appears informal, its semantics is an interpretation
of the objects displayed within the graphical structure. This creates a transformation
from one representation level to the next. Therefore, the interpretation of the network
defines a modeling of the relational connections between conceptual units using an
abstract and generative form of semantics, and has the characteristic notion of a set of
links which connect individual conceptual units, referred to as facts, into a total basic
network structure. In this way, the representation of knowledge or implementation of
the knowledge representation is at a different level of representation than the semantic
network.
Elements of semantic networks appeared as early as the late nineteenth century
in works by Alfred Kempe in 1886 and Charles Peirce in 1897 [86, 61, 37, 121].
Both gentlemen used a graphical structure of conceptual units to diagram meaning [86].
However, semantic networks were not introduced for use with computers until 1956 by
R.H. Richens in a system called ‘NUDE’ [53]. This system was used for machine
translation of Russian to English by going through a neutral conceptual language. This
procedure actually operates over the innermost level of representation produced by the
translation of the semantic network to the storage representation of knowledge. The
actual natural language, Russian, is mapped onto a semantic network knowledge representation
for natural language processing. This KR is then mapped onto an internal representation,
which is really a new language declaration for a new conceptual language.
It uses the nodes and arcs within the semantic network to map to the new language.
The virtual machine internal representation, at level 2 (as seen in Figure 1.1), is not the
semantics of the network, but the representation produced by applying the semantics of
the network through a mapping to the new representation language. After the semantic
network has been translated to an internal representation, the new language is mapped
through the definition level with the new data structures onto the implementation of
the algorithm for processing that data structure, so the innermost (highest) level can be
executed and perform reasoning operations and analysis.
Examples of applications where semantic networks have been used are natural
language understanding, planning, machine translation, deductive databases, and ex-
pert systems [61]. However, in order for a semantic network to be a good knowledge
representation for an application, the network must be interpreted in terms of a repre-
sentation that algorithmically or procedurally can process the network’s meaning and
perform reasoning. Interpretation requires that the representation be translated from
the abstract to a more concrete representation. For any semantic network, different
representations of knowledge, levels 2 - 4 (as seen in Figure 1.1), may be used for
implementing the storage representation. These representations use different formal
approaches for syntactic processing, such as: 1) propositional logic, 2) predicate cal-
culus, and 3) graph grammar (set theory). Some of the representations of knowledge
used by semantic networks will be discussed in more detail in Section 2.3.2. However,
within a propositional logic approach, propositions, logical operators, and abstract hi-
erarchies use arbitrary conceptual units and expression links to define nodes and arcs
with semantic descriptions and context. Predicate calculus is built on top of a propo-
sitional approach and also incorporates the use of predicates. A graph grammar or set
theoretic approach not only is built above the predicate calculus approach, but also uses
primitive entities and actions with procedural operators to help define the semantics of
the network.
A specific type of semantic network, or a knowledge representation in its own
right, is a frame representation. The frame is a named data object with an unbounded
collection of named slots (attributes or fields) which can have values [61]. The value
of a slot in a frame can be a pointer to another frame, thereby producing a network
of frames (hence the representation’s name). A frame is an object represented by a
node with a set of slots; a slot is information about the object and may be represented
by a pointer to another node, restrictions on attribute values, a pointer to an attached
procedure for calculating a value, an actual simple value, or a set of values [63]. Frames
collect explicit information about an individual object at a node level.
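A frame with named slots can be sketched as a small data object. The Frame class, its get method, and the example slots are hypothetical and cover only three of the slot forms listed above: a simple value, a pointer to another frame, and an attached procedure for calculating a value.

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    name: str
    slots: dict = field(default_factory=dict)  # unbounded collection of named slots

    def get(self, slot):
        """Return a slot value, running an attached procedure if one is stored."""
        value = self.slots[slot]
        return value(self) if callable(value) else value

def describe(frame):
    # Attached procedure: calculates a value from other slots of the same frame.
    return f"{frame.name} with {frame.slots['legs']} legs"

animal = Frame("animal")
cat = Frame("cat", {
    "is_a": animal,           # pointer to another frame
    "legs": 4,                # simple value
    "tail": "long",           # simple value
    "description": describe,  # attached procedure
})
```

Following the is_a slot from cat leads to the animal frame, which is how a network of frames arises.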
2.3.2 Internal Representation
Each of these internal representations is at the next higher level than the KR
used (see Level 2 from Figure 1.1). Even though each internal representation will be
discussed as being the ADT for a “best fit” with a particular knowledge representation,
any of these ADTs could be used with any knowledge representation, as stated in
Brachman’s work [11] when he was discussing semantic networks.
2.3.2.1 Predicate Calculus
Elements within the syntactic representation of the semantic network knowledge
representation can be grouped into structures. These structures have predefined
reductions (meaning) to an ADT. The structures are propositions, predicates, logical
operators and procedural operators. Let us look at each of these structures and how they
affect building an ADT.
Propositions will be discussed in Section 2.3.2.2. From that discussion it can
be seen that predicates not only generalize propositions, but also define relationships.
They can be separated into their intensional and extensional characteristics [139]. The
extension of the predicate refers to the set of things that this concept denotes, while
the intension of the predicate defines the meaning for the concept. Both characteristics
define the semantics of the concept. However, the intension of a predicate gives an
abstract function which can be assigned to the extension of the predicate, the concept
itself.
Logical operators use model-theoretic semantics with the basic operators being
conjunction, disjunction, negation, and existential and universal quantifiers. How each
of the operators is used in relationships was looked at more closely in Section 2.1.2.
So, let us consider the question: “what is model-theoretic semantics?” The word model
has multiple meanings, three of them being [119]:
• Simulation - simplified system that simulates some significant characteristics of
some other system.
• Realization - a set of axioms as a data structure in which these axioms are true.
• Prototype - an ideal or standard for a system.
Theory, on the other hand, is a proved hypothesis. Therefore, logical operators are modeling
a proved hypothesis by the conjunction of true propositions containing existing
objects and conjoined predicates and relations [61]. Systems that exclusively use
logical operators answer reasoning questions by using theorem proving in FOL (see
Section 3.3).
Procedural operators define procedures that actively interpret the semantic net-
work and operate over it [119]. Use of these operators in defining the semantics of the
network sets up a controversy, procedural vs declarative.
The procedural semantics assume that knowledge of the world, or meaning, can
be represented by knowing how a concept operates; declarative semantics assumes that
knowledge can be represented by knowing that a concept is defined by a collection of
facts [119]. This controversy will appear throughout the discussion of the systems in
Section 6.1.
Each of these structures will need to be examined in building an ADT for the internal
representation. Figure 2.2 shows an example of a logic knowledge representation
that is mapped to an internal predicate calculus representation. This internal representation
can then be used to help define an ADT for processing the structures. As one can
see, the connectives get turned into predicates and are called to instantiate the objects
from the knowledge representation level.
2.3.2.2 IF..THEN
For some knowledge representations, in particular rule-based representation, a
data structure that consists of just the propositional query can be used. A proposition
comes from mathematical logic and is a simple statement which may have a truth value,
TRUE or FALSE, associated with it. These simple statements can generate, manipulate,
and/or relate concepts through logical functions. Propositions are always intentional,
define concepts, and do not consider relationships or dependencies between concepts.
They also only use quantitative relationships which allow the application of heuristics
to reduce the search space, but do not function in the areas of time or space.
The IF..THEN rule construct can be defined directly in most programming languages
and makes for a very simple ADT for defining the inference engine. However,
because of the simplicity of the data structure, only simple questions can be answered.
It is for this reason that this internal representation is not used for knowledge
representations such as logic or semantic networks.
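As a rough illustration of how little structure an IF..THEN representation needs, the following Python sketch stores each rule as a set of IF propositions plus one THEN proposition and fires the rules until nothing new becomes TRUE. The rule base, the proposition names, and the function name forward_chain are hypothetical and are not taken from this work.

```python
# Hypothetical rule base: each rule is (set of IF propositions, THEN proposition).
RULES = [
    ({"has_feathers"}, "is_bird"),
    ({"is_bird", "can_fly"}, "can_migrate"),
]

def forward_chain(facts, rules):
    """Fire IF..THEN rules repeatedly until no new proposition becomes TRUE."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all of its IF propositions are already known.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

derived = forward_chain({"has_feathers", "can_fly"}, RULES)
```

Because a proposition here is only a name with a truth value, the sketch cannot express relationships between concepts, which matches the limitation described above.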
2.3.2.3 Conceptual Structures
Even though there are multiple semantic network representations available, the
representation that has flexibility in its use of the above approaches is conceptual
structures. Conceptual Structures, CS, are a logic-based representation of C.S. Peirce’s
existential graphs [86] developed by John Sowa [119]. Graphical diagrams that are built
out of the logic building blocks of conceptual structures are conceptual graphs, CG (see
Section 3.4).
Semantic networks play a very important role in the use of conceptual graphs.
Sowa claims that “a conceptual graph has no meaning in isolation. Only through the
semantic network are its concepts and relations linked to context, language, emotion,
and perception”. Such concepts as TOMATO or DOG are easier to understand and
define than abstract concepts such as PEACE or JUSTICE. In order to capture the
meaning of abstract concepts, these concepts must be hooked up through a vast network
of relationships which will eventually link them to concrete concepts. The philosopher
A. R. White [135] defined the meaning of a concept as follows:
“To discover the logical relations of a concept is to discover the nature
of that concept. For concepts are, in this respect, like points; they have
no quality except position. Just as the identity of a point is given by its
coordinates, that is, its position relative to other points and ultimately to
a set of axes, so the identity of a concept is given by its position relative
to other concepts and ultimately to the kind of material to which it is
intensively applicable. A concept is that which is logically related to
others just as a point is that which is spatially related to others” [135].
In his paper [129], Tepfenhart stated that the conceptual grounding for conceptual
structures is based on the meaning triangle for the relationships between symbols,
concepts, and referents (see Figure 2.3).
Figure 2.3: Meaning Triangle for Symbols, Concepts, and Referents (Based on [[129], Figure 1]).
Peirce [86] actually had a different relationship triangle (see Figure 2.4); it
aligns its sign relation with Tepfenhart’s symbol, while the concept stayed the same.
For Tepfenhart, a referent was the instantiation of the concept in the meaning triangle,
while Peirce saw the object as the instantiation of the concept. This makes Peirce’s
triangle more general to all conceptual logics, not just conceptual structures.
Conceptual Structures (CS) are the development of human “concepts” in such a way that
they can be processed by machines. The structures give meaning in the computer for the
conceptual ideas [119].
Figure 2.4: Peirce’s Triadic Relation.
Returning to language as a mechanism for communicating human concepts: over time,
the foundation of conceptual structures within knowledge representation has
changed. Chomsky maintained that traditional grammars, which are syntactic, carried
the structure needed to process sentences in computers, and that each sentence was a
single structure [16]. However, he clarified in 1965 that these structures were an
abstract theory of competence, which is an idealized knowledge of language, as opposed
to a performance structure, which is the actual use of natural language [17]. Jackendoff
maintains that the meaning of a sentence, which is semantic, in natural human language
actually has separate semantic structures for each element of the sentence [52].
John Sowa took both of these ideas and blended them together to develop a
graph diagrammatic representation for the structure called Conceptual Graphs, CG
[119, 121]. Section 3.4 defines conceptual graphs. Later, Bernhard Ganter and Rudolf
Wille realized that they had developed a similar, but simpler lattice representation for
conceptual structures that had a mathematical foundation called Formal Concept
Analysis, FCA [41]. FCA is a mathematical formalism [41] that handles concepts with
attributes in a lattice format. These mathematical structures can be traversed as in a
type hierarchy to discover super and sub-type relationships between concepts. They
can also be easily stored in a relational database [6, 7]. Their latest research has been in
the area of adding temporal attributes to the lattices to handle time relationships [138].
CHAPTER 3
DEFINITIONS
This chapter gives definitions for several concepts that will be developed throughout
this work. These definitions are complete for the knowledge needed for this work, but
are not a complete coverage of all of these areas of study.
3.1 Graph Theory
Graph theory, unlike logic, is not built on sentences of predicates that evaluate
to TRUE and FALSE, but is based on the visual elements of drawings. A graph is
G = {V, E}, where V is a finite nonempty set of points (or vertices) and E is the set
of all the links (or edges) between adjacent points [46]. An edge x = {u, v} is said
to join vertices u and v [46]. The example in Figure 3.1 is a graph G where
V = {v1, v2, v3, v4, v5} and E = {{v1, v2}, {v1, v3}, {v2, v3}, {v2, v4}, {v3, v4},
{v3, v5}, {v4, v5}}. However, even though graphs must have at least one vertex, they
do not have to have any edges. Graphs are very useful for discovering whether a finite
number of objects (vertices) are in relationship (edges) with each other. The next
sections give graph theory definitions that are important for the details discussed
later in this work.
Figure 3.1: A Graph to Illustrate Graph Theory Concepts (Adapted from [[46], Figure 2.9]).
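The definition of G can be written down almost literally as a data structure. The Python sketch below is one possible encoding, not the representation used later in this work: V and E are copied from the example above, each unordered edge {u, v} is modeled as a frozenset, and the helper names adjacent and neighbours are assumptions made for illustration.

```python
# Vertices and edges of graph G as listed in the text for Figure 3.1.
# A frozenset models an unordered edge {u, v}.
V = {"v1", "v2", "v3", "v4", "v5"}
E = {frozenset(e) for e in [
    ("v1", "v2"), ("v1", "v3"), ("v2", "v3"), ("v2", "v4"),
    ("v3", "v4"), ("v3", "v5"), ("v4", "v5"),
]}

def adjacent(u, v, edges=E):
    """True if some edge joins u and v; the order of u and v does not matter."""
    return frozenset((u, v)) in edges

def neighbours(u, vertices=V, edges=E):
    """All vertices joined to u by an edge."""
    return {v for v in vertices if adjacent(u, v, edges)}
```

A graph with vertices but no edges is simply the case where E is the empty set.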
3.1.1 Digraph and Bigraph
A directed graph (or digraph), H = {V, E}, is a graph where V is a finite nonempty set
of vertices and E is a set of ordered pairs, the directed edges between adjacent
vertices [13, 46]. For these edges, a directed pair x = (u, v) joins u and v in an
irreflexive binary relation, and the direction is from u to v. When the edges are drawn
in a graph, an arrow indicates the direction [13, 46]. As can be seen in Figure 3.2,
graph H contains V = {v1, v2, v3, u1, u2} and E = {(v1, u1), (u1, v2), (v2, u2), (u2, v3)}.
For each of the pairs in E, the arrow points from the first vertex to the second vertex
and touches the second; e.g., for (v1, u1) the arrow runs from v1 to u1 and touches u1.
A bipartite graph, B = {V, E}, is a graph with the distinction that all vertices in
V can be divided into two subsets V1 and V2, or colors, such that every edge in E
connects an element of V1 to an element of V2 and there are no edges between
vertices within the same subset (that is, with the same color) [46]. If Figure 3.2 is
examined again, it is seen that besides being a digraph this is also a bigraph
(bipartite graph) where V1 = {v1, v2, v3} and V2 = {u1, u2}.
Figure 3.2: A Digraph that is a Bipartite Graph.
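The bipartite condition can be checked mechanically. The following Python sketch takes the digraph H and the two subsets V1 and V2 from the text and tests that every directed edge crosses from one subset to the other; the function name is_bipartition is a hypothetical helper.

```python
# Digraph H from the text: directed edges are ordered pairs (u, v).
V1 = {"v1", "v2", "v3"}
V2 = {"u1", "u2"}
E = [("v1", "u1"), ("u1", "v2"), ("v2", "u2"), ("u2", "v3")]

def is_bipartition(part_a, part_b, edges):
    """True if the parts are disjoint and every edge joins one part to the other."""
    if part_a & part_b:
        return False
    return all(
        (u in part_a and v in part_b) or (u in part_b and v in part_a)
        for u, v in edges
    )

h_is_bipartite = is_bipartition(V1, V2, E)
```

Adding an edge between two vertices of the same subset, for example (v1, v2), makes the check fail.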
3.1.2 Walk, Path and Connected
A walk of a graph is an alternating sequence of vertices and edges where the walk
begins and ends at a vertex, and each edge is incident on the vertices immediately
before and after it [46]. In the previous example, Figure 3.1, a simple walk has
vertices v1 v2 v3 v4 v5 with the edges in the following order:
{v1, v2}{v2, v3}{v3, v4}{v4, v5}. This walk does not include all the edges, but does
include all the vertices. Since all edges in this walk are distinct, it is called a
trail. There are other kinds of walks; for example, when the first and last nodes are
the same, the walk is a cycle [46].
Using Figure 3.1 again, a cycle would be vertices v1 v2 v4 v5 v3 v1 with edge ordering
{v1, v2}{v2, v4}{v4, v5}{v5, v3}{v3, v1}. This second example is also a nontrivial trail
that is closed. This kind of trail is referred to as a circuit [13]. Another example of a
trail, that is a cycle, would be v2 v4 v2 with edges {v2, v4}{v4, v2} (because the edges
are not directed). However, this is not a circuit because it is a trivial trail.
A path for a graph, in graph theory terms [46], is a walk in which all the vertices
on the walk are distinct except in one special case. If the path creates a cycle then the
path will come back to the starting vertex. Recall that the first walk encountered in
this section (v1 v2 v3 v4 v5 in Figure 3.1) was a path, but not a cycle.
For a graph to be connected, every pair of vertices must be joined by a path [46]. Both
Figures 3.1 and 3.2 are connected graphs.
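The definitions of walk, trail, path, and connected can be sketched as small checks over the graph of Figure 3.1. This Python sketch assumes the vertex and edge sets listed in Section 3.1 and represents a walk only by its vertex sequence, since the edges follow from consecutive vertices; all function names are illustrative.

```python
from collections import deque

V = {"v1", "v2", "v3", "v4", "v5"}
E = {frozenset(e) for e in [
    ("v1", "v2"), ("v1", "v3"), ("v2", "v3"), ("v2", "v4"),
    ("v3", "v4"), ("v3", "v5"), ("v4", "v5"),
]}

def edges_of(seq):
    """Edges traversed by a vertex sequence, in order."""
    return [frozenset(pair) for pair in zip(seq, seq[1:])]

def is_walk(seq):
    """Every vertex exists and every consecutive pair is joined by an edge."""
    return bool(seq) and all(v in V for v in seq) and all(e in E for e in edges_of(seq))

def is_trail(seq):
    """A walk in which all edges are distinct."""
    used = edges_of(seq)
    return is_walk(seq) and len(used) == len(set(used))

def is_path(seq):
    """A walk with distinct vertices; a closed path may return to its start."""
    closed = len(seq) > 1 and seq[0] == seq[-1]
    inner = seq[:-1] if closed else seq
    return is_walk(seq) and len(inner) == len(set(inner))

def is_connected():
    """Breadth-first search: every vertex is reachable from an arbitrary start."""
    start = min(V)
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in V:
            if v not in seen and frozenset((u, v)) in E:
                seen.add(v)
                queue.append(v)
    return seen == V

simple_walk = ["v1", "v2", "v3", "v4", "v5"]
closed_walk = ["v1", "v2", "v4", "v5", "v3", "v1"]
```

Both example sequences from the text pass is_walk, is_trail, and is_path; only the second one is closed.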
3.2 Types and Hierarchies
A type is a label that represents an idea with an underlying perceived object or
entity; these are type labels. These entities within the world are in relationship with
each other. The entities can be axiomatic, that is, primitive and not made up of any
other defining entities, or they can be defined, meaning that they are built of more
than one axiom [125, 90]. The relationships can be seen as a hierarchy and can be
broken down into two functions:
1. The function ctype maps a finite set of vertices called concept nodes onto a set,
TC, of type labels. Each type label in TC is specified as axiomatic or defined.
Examples of type labels from Figure 3.4 are Bird, Cat, Dog, etc.
2. The function rtype maps a finite set of vertices called conceptual relation nodes
onto a set, TR, of type labels. Each type label in TR is specified as axiomatic or
defined. Examples of these type labels from Figure 3.5 are member, works-with, etc.
A type hierarchy is a partially ordered set of type labels, TH. Type hierarchies can be
used to define membership in various categories of entities. An example of a four level
hierarchy can be seen in Figure 3.3.
Figure 3.3: A Type Hierarchy.
The levels are counted from the TOP of the hierarchy to the BOTTOM. In each
level of the hierarchy, there are entities that are members of the hierarchy. These mem-
bers are organized into a partial ordering with the symbol≥ being used to designate the
ordering from top to bottom of the hierarchy. An example of a partial ordering from
Figure 3.3 would be TOP ≥ C ≥ D ≥ A ≥ L ≥ BOTTOM.
Members at the top of the hierarchy are considered to be more general; the
members at the bottom are more specific. More general members in the partial ordering
are said to subsume the more specific members, and the more specific members inherit
information from the more general ones. As stated in MacGregor:
“a concept C subsumes a concept D if any individual satisfying
the definition for D necessarily satisfies the definition of C”
[[66] page 388].
Through this process of moving down the hierarchy to gain more specific information,
concepts are classified based on a relationship known as subsumption [140].
As seen in these hierarchies, there is a partial ordering between the members.
However, when this membership is extended such that for any two elements x and y of
the partially ordered set L, the set {x, y} has both a least upper bound and a greatest
lower bound, then a lattice exists [44]. When the elements are types, these are
referred to as type lattices.
3.2.1 Concept Type Hierarchy
With these type lattices, the concept type hierarchies are organized into partially
ordered hierarchies according to the level of generality of the types. Using a more
concrete example in Figure 3.4, there is given a set of labels {Animal, Mammal, Bird,
Cat, Dog, Human}. If Mammal ≤ Animal then Mammal is called a subtype of Animal and
Animal is called a supertype of Mammal, written Animal ≥ Mammal. If Cat ≤ Animal
and Cat ≤ Mammal then Cat is called a common subtype, ∩, of Mammal and Animal.
If Animal ≥ Mammal and Animal ≥ Cat then Animal is called a common supertype, ∪,
of Mammal and Cat.
Figure 3.4: An Animal Concept Hierarchy.
Extending the type lattice definition as a type hierarchy plus the operators ∪ and
∩, it can be seen that the minimal common supertype of a and b, written a ∪ b, has the
property that for any type t, if t ≥ a and t ≥ b, then t ≥ a ∪ b. The maximal common
subtype of a and b, written a ∩ b, has the property that for any type t, if t ≤ a and
t ≤ b, then t ≤ a ∩ b. In order to make the lattice complete, the labels ⊥ and ⊤ are
introduced such that for any type t, ⊥ ≤ t ≤ ⊤. The levels from ⊤ to ⊥ in the hierarchy
go from general to specialized for the types (e.g. Animal to Cat). Relationships that
hold for all objects of a given type are inherited through the hierarchy by all subtypes.
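The operators ∪ and ∩ can be illustrated on the animal labels of Figure 3.4. The Python sketch below is only an approximation under stated assumptions: the text confirms Mammal ≤ Animal and Cat ≤ Mammal, while placing Bird directly under Animal and Dog and Human directly under Mammal is an assumed reading of the figure. TOP and BOTTOM stand for ⊤ and ⊥, and all function names are hypothetical.

```python
# Immediate supertypes of each label. Only Mammal <= Animal and Cat <= Mammal
# are stated in the text; the remaining links are assumed for illustration.
PARENTS = {
    "TOP": set(),
    "Animal": {"TOP"},
    "Mammal": {"Animal"},
    "Bird": {"Animal"},
    "Cat": {"Mammal"},
    "Dog": {"Mammal"},
    "Human": {"Mammal"},
    "BOTTOM": {"Cat", "Dog", "Human", "Bird"},
}

def supertypes(t):
    """All types s with t <= s, including t itself."""
    result = {t}
    for parent in PARENTS[t]:
        result |= supertypes(parent)
    return result

def is_subtype(a, b):
    """True if a <= b in the partial order."""
    return b in supertypes(a)

def minimal_common_supertype(a, b):
    """a ∪ b: the common supertype that lies below every other common supertype."""
    common = supertypes(a) & supertypes(b)
    return next(t for t in common if all(is_subtype(t, s) for s in common))

def maximal_common_subtype(a, b):
    """a ∩ b: the common subtype that lies above every other common subtype."""
    common = {t for t in PARENTS if is_subtype(t, a) and is_subtype(t, b)}
    return next(t for t in common if all(is_subtype(s, t) for s in common))
```

With these assumed links, Cat ∪ Dog is Mammal and Cat ∪ Bird is Animal, while Cat ∩ Dog falls through to ⊥ because no label is a subtype of both.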
3.2.2 Support
Per the definition given by Baget and Mugnier [5], a support is defined as a
4-tuple S = (TC, TR, I, τ) (see Figure 3.5 for an example). TC and TR are two partially
ordered finite sets of concept types and relation types, respectively. TR is partitioned
into subsets of hierarchies, TR^1 . . . TR^k, of relation types of arity 1 . . . k where
k ≥ 1, respectively. Both orders on TC and TR are denoted by x ≤ y, which means that x
is a subtype (or specialization (see Section 2.2.2.4)) of y. I is the set of individual
markers (or referents), and τ is a mapping from I to TC.
As can be seen in Figure 3.5, TC and TR are written in more of a shorthand for
type hierarchies; they do not include the ⊥ and ⊤ labels, even though they are implied.
Also, as discussed above, the relation hierarchy is broken down into a set of hierarchies
using the rtype function (as defined earlier in this section), noted by the arity of the
relations within the hierarchy. The individual markers, J. and K., are mapped to the
concept Researcher through the mapping function τ. The ‘. . .’ in each membership list
indicates that there are more elements to each of these lists.
[Figure 3.5 content: TC with concept types Office, Person, Project, Researcher, Manager,
HeadOfProject; TR = {TR^2} with relation types member, works-with, geographical-relation,
in, near, adjoin; I = {J., K., . . .}; τ = {(J., Researcher), (K., Researcher), . . .}]
Figure 3.5: Support Using a Relation Hierarchy (Based on [[5], Figure 1]).
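The 4-tuple S = (TC, TR, I, τ) maps naturally onto a small record type. The Python sketch below is a hedged illustration rather than the data structure of this work: the type labels, markers, and τ come from the text, but the subtype links in concept_order are assumptions, because the ordering inside the figure cannot be read from the text alone. The class and method names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Support:
    concept_types: set    # T_C
    concept_order: set    # pairs (x, y) read as "x is a direct subtype of y"
    relation_types: dict  # arity k -> set of relation types T_R^k
    markers: set          # I, the individual markers
    tau: dict             # mapping from each marker in I to a concept type in T_C

    def is_subtype(self, x, y):
        """True if x <= y by following direct subtype links (assumes no cycles)."""
        if x == y:
            return True
        return any(a == x and self.is_subtype(b, y) for a, b in self.concept_order)

S = Support(
    concept_types={"Office", "Person", "Project",
                   "Researcher", "Manager", "HeadOfProject"},
    # Assumed links, for illustration only.
    concept_order={("Researcher", "Person"), ("Manager", "Person"),
                   ("HeadOfProject", "Researcher"), ("HeadOfProject", "Manager")},
    relation_types={2: {"member", "works-with", "geographical-relation",
                        "in", "near", "adjoin"}},
    markers={"J.", "K."},
    tau={"J.": "Researcher", "K.": "Researcher"},
)
```

Keying relation_types by arity mirrors the partition of TR into TR^1 . . . TR^k; in this example only the arity 2 hierarchy is populated.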
3.3 FOL
First Order Logic, FOL, is a well-understood form of symbolic reasoning
pioneered by Boole, Frege, and C.S. Peirce [51]. Each sentence that appears in FOL
contains a predicate and a subject in variable form. The predicate can either define
or modify the subject, but the resolution of the predicate is defined only for the
logical truth values, TRUE and FALSE. When these sentences are combined, they must
adhere to the rules of Boolean algebra. These sentences “only have variables for
first-order objects (and these expressions such as “∀x” and “∃x” apply only to the
elements of a structure), so will be call a first-order language” [[35] page 9].
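Over a finite structure, the quantifiers ∀x and ∃x reduce to checking a predicate against every element. The Python sketch below shows this under the assumption of a tiny hypothetical domain; the individuals, the Bird and Animal sets, and the helper names forall and exists are invented for illustration.

```python
# Hypothetical finite structure: a domain of individuals and two unary predicates.
DOMAIN = {"tweety", "felix", "rex"}
BIRD = {"tweety"}
ANIMAL = {"tweety", "felix", "rex"}

def forall(predicate, domain=DOMAIN):
    """∀x: the predicate evaluates to TRUE for every element of the structure."""
    return all(predicate(x) for x in domain)

def exists(predicate, domain=DOMAIN):
    """∃x: the predicate evaluates to TRUE for at least one element."""
    return any(predicate(x) for x in domain)

# ∀x (Bird(x) → Animal(x)), with the implication written as (not A) or B.
all_birds_are_animals = forall(lambda x: (x not in BIRD) or (x in ANIMAL))
# ∃x Bird(x)
some_bird_exists = exists(lambda x: x in BIRD)
# ∀x Bird(x)
everything_is_a_bird = forall(lambda x: x in BIRD)
```

The variables range only over the elements of DOMAIN, never over the predicates themselves, which is what makes the language first-order.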
FOL is considered part of first order predicate calculus, FOPC, but for FOPC
there is also a finite number of axioms. As restrictions are relaxed, and one looks at
the full area of logic considered FOPC, predicates can extend beyond just TRUE and
FALSE [12], where there are predicates that can be proven neither TRUE nor FALSE
[50]. With FOPC, λ-expressions using the predicate axioms with an infinite sequence
can be expressed. This allows the use of predicates such as exists, forall, iff, etc.
Building up these axioms to represent sentence descriptions leads to set theory.
3.4 Conceptual Graphs
In his book [119], John Sowa states: “Conceptual graphs form a knowledge
representation language based on linguistics, psychology, and philosophy” [[119] page
69]. The representation contains a graph, as defined in Section 3.1, and operates
according to graph theory rules using graph diagrams that are built out of the logic
building blocks of conceptual structures (see Section 2.3.2.3). The definitions for some
of the blocks are presented beginning with the type block:
Definition 3.4.1 A type is a labeling for an abstract idea which is either
a conceptual unit or a relationship. These types are members of a set, T,
that may form several structures including hierarchy trees, lattices, and
other related structures. When the structure is a type hierarchy lattice,
the set is labeled TC, and the function ctype maps a conceptual unit to
the type label in the structure. When the structure is a relation hierarchy
tree, the set is labeled TR, and the function rtype maps a relationship to
the type label in the structure.
A referent block would have the following definition:
Definition 3.4.2 A referent is an abstract conceptual unit that has been
instantiated with a factual value.
Therefore, a conceptual graph, CG, applies the following definition:
Definition 3.4.3 A conceptual graph is a bipartite, connected, directed
graph G = (V, E), such that V, all vertices in G, is partitioned into two
disjoint sets VC and VR. The vertices are labeled, and the set VC is called
the concept nodes and the set VR is called the conceptual relation nodes.
Thus, e ∈ E is an ordered pair that connects an element of VC to an
element of VR using a directed edge which will be called an arc.
The label of a concept node is a pair, c = <type, referent>. The type
is an element of the set TC, that may be defined in a type lattice (see
Section 2.1.1). The referent (if present) contains the individual instantiation
for the type field; however, if it is not present then c = <type, empty>, or
just written c = <type>.
The label of a conceptual relation node is a pair, cr = <type, signature>,
where type is an element of the set TR, and the signature is a pair,
s = <I, O>, where I is the arcs that are directed into the conceptual
relation and O is the arcs that are directed out from the conceptual
relation. The signature is further defined by its subset category of either
relation or actor. The relation is a tuple, r = <type, c1, c2, ..., cn>, where
type is defined above and in the signature I ⊆ VC and O ∈ VC. The
number of concepts in the tuple is the valence of the relation. A
conceptual relation of valence n is said to be n-adic, and all signatures
must be at least 1-adic. The actor is a slightly different tuple,
a = <type, c1, c2, ..., {..., cn−1, cn}>, where type is defined above and in the
signature I ⊆ VC and O ⊆ VC.
Figure 3.6 shows a basic conceptual graph in traditional format with nine nodes.
Figure 3.6: Basic Abstract Conceptual Graph.
Figure 3.7 shows a conceptual structure with nine nodes in the mathematical
digraph and bigraph format. Within the CS community, it is felt that the typical display
format of Figure 3.6 makes the conceptual relationships easier to read and follow.
Figure 3.7: Basic Abstract Conceptual Graph in Digraph Format that is Bipartite.
In Figure 3.6, four nodes are concepts (seen in display mode as rectangles) and five
nodes are relations (seen in display mode as ovals). In this example, VC = {c1, c2, c3, c4}
and VR = {r1, r2, r3, r4, r5}. There is not a type hierarchy, but the four concepts are
c1 = <C1>, c2 = <C2>, c3 = <C3>, c4 = <C4>, and the relations are r1 = <R1, <c1, c2>>,
r2 = <R2, <c1, c3>>, r3 = <R3, <c1, c4>>, r4 = <R4, <c2, c4>>, r5 = <R5, <c3, c4>>.
As can be seen, the “R1” relation has the signature <c1, c2>, which indicates that r1 is
a 2-adic (or binary) relationship where c1 is the input concept, or the argument to the
relationship, and c2 is the output concept, or the output for the relationship.
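The node labels of this example can be written down as two small tables, from which the arcs follow. The Python sketch below is an illustration under one assumption that should be read carefully: it takes the output concept of r3, r4, and r5 to be c4, the fourth and last concept in VC. The names arcs, valence, and is_bipartite_cg are hypothetical helpers.

```python
# Concept labels <type, referent>; None models an empty referent.
CONCEPTS = {"c1": ("C1", None), "c2": ("C2", None),
            "c3": ("C3", None), "c4": ("C4", None)}

# Relation labels <type, signature>; the last concept of a signature is the output.
# Assumption: r3, r4, and r5 all have c4 as their output concept.
RELATIONS = {
    "r1": ("R1", ("c1", "c2")),
    "r2": ("R2", ("c1", "c3")),
    "r3": ("R3", ("c1", "c4")),
    "r4": ("R4", ("c2", "c4")),
    "r5": ("R5", ("c3", "c4")),
}

def arcs(relations):
    """Yield directed arcs: input concept -> relation, then relation -> output concept."""
    for name, (_type, signature) in relations.items():
        *inputs, output = signature
        for concept in inputs:
            yield (concept, name)
        yield (name, output)

def valence(name, relations=RELATIONS):
    """Number of concepts in the relation's signature (2 means 2-adic)."""
    return len(relations[name][1])

def is_bipartite_cg(relations):
    """Every arc must join exactly one known concept node and one relation node."""
    return all(
        (a in CONCEPTS) != (b in CONCEPTS) and (a in relations) != (b in relations)
        for a, b in arcs(relations)
    )

ARCS = list(arcs(RELATIONS))
```

Five binary relations give ten arcs, and a signature that names a concept outside VC makes is_bipartite_cg fail, which is a cheap consistency check on hand-entered graphs.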
Sowa has shown how unknown objects (nodes with no referent field) can be
computed by an actor node [119]. Actor nodes (displayed as diamond-shaped boxes)
are connected to concept nodes with dashed lines because an actor can best be thought
of as a “functional relation”, where there is a semantics (performed by the procedure)
being represented graphically between two objects. Figure 3.8 is a functional
relationship, displayed with the diamond shape, between CATCHING and PERSON, CATCH,
and BALL.
A functional relationship has directionality from one node to another, but there
are both optional inputs and multiple possible outputs of conceptual information. In
order to know how to handle the data being processed through this kind of relationship,
an action function (see Figure 3.9 for example) is attached to the relation. Therefore,
each functional relationship is called an actor in the conceptual graph representation
because it can perform actions on its data.
Figure 3.8: Basic Conceptual Graph with Actor.
void Catching(String person, String catch, String &ball){
    // process in knowledge base so this person now has
    // possession of a specific ball
}
Figure 3.9: Action Function For Basic Actor Graph.
If this internal representation were to be used at a higher level for a logic
knowledge representation, one would need to map the intension of the predicate to a
type, and the extension of the predicate to a referent, in the internal conceptual
structure representation.
3.4.1 Graph Theory Relationships
This conceptual structure encodes knowledge using concepts and conceptual
relations. Concepts are “blocks” of typed information and conceptual relations are the
“linkages” between the blocks. This knowledge is then transformed into a graphical
structure such that a conceptual graph contains two kinds of nodes: concepts and
relations. The lines are the arcs between these two kinds of nodes. It is this duality
that makes the graph a bigraph, or bipartite graph. It should be noted that conceptual
relations can also be of two kinds: direct relationships and functional relationships.
Also, unlike a general graph, the pairs of nodes defining the arcs are ordered, or
directed. The arrows on the arcs show the directionality of information movement from
one node to another. As can be seen in Figure 3.6, relation R2 is in a direct relationship
from concept C1 to concept C3. R2 receives conceptual input data from concept C1,
and produces output data and sends it to concept C3 (instantiates C3). Therefore, these
nodes are connected in a triple relationship.
The walk for a conceptual graph must not only alternate between nodes and
arcs, but the kind of the nodes must alternate between concepts and relations [136].
When a walk is just a trail such as in Figure 3.6, an example trail would be
C1 → R2 → C3 → R5 → C4. In a walk, since the arcs are incident to the nodes, each
concept node has a relationship number to each relation node (i.e. C1 has the
relationship number 1 to R2, and C3 has the relationship number 2 to R2). From now on,
this relationship number will be referred to as the ith edge of the relation in respect
to each linked concept.
For there to be a path in a graph all the vertices must be unique. However, for
a conceptual graph only the concept nodes must be unique [136]. If the path is closed,
that is a cycle, then the first concept, c1, would be equal to cn. Since a conceptual
graph is directed, one can follow the arcs through the graph to create a path. A
conceptual graph without a cycle, that does not contain a functional relation, is called
a tree [136], and the path followed will lead to a leaf node. Examining Figure 3.6, a
path reaching all the concept nodes (but not all the relations) would be
C1 → R2 → C3 → R5 → C4 → R4 → C2 → R1 → C1. Note, R3 is not reached and C1 is repeated;
therefore, this is a path, but not a tree.
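The alternation and uniqueness conditions can be sketched as checks over the example of Figure 3.6. This Python sketch makes two assumptions that are not stated in the text: the relations R3, R4, and R5 are taken to end at C4, the fourth and last concept, and two nodes count as linked whenever an arc joins them in either direction, so that the closed example path can be followed. The function names are hypothetical.

```python
CONCEPTS = {"C1", "C2", "C3", "C4"}
# Relation -> (input concept, output concept); endpoints at C4 are assumed.
RELATIONS = {
    "R1": ("C1", "C2"), "R2": ("C1", "C3"), "R3": ("C1", "C4"),
    "R4": ("C2", "C4"), "R5": ("C3", "C4"),
}

def linked(a, b):
    """True if a concept and a relation are joined by an arc (direction ignored)."""
    if a in RELATIONS and b in CONCEPTS:
        return b in RELATIONS[a]
    if a in CONCEPTS and b in RELATIONS:
        return a in RELATIONS[b]
    return False  # two concepts, or two relations, are never adjacent

def is_cg_walk(seq):
    """Consecutive nodes are linked, which also forces concepts and relations to alternate."""
    return all(linked(a, b) for a, b in zip(seq, seq[1:]))

def is_cg_path(seq):
    """A walk whose concept nodes are unique; a closed path may return to its start."""
    concepts = [n for n in seq if n in CONCEPTS]
    if len(concepts) > 1 and concepts[0] == concepts[-1]:
        concepts = concepts[:-1]
    return is_cg_walk(seq) and len(concepts) == len(set(concepts))

trail = ["C1", "R2", "C3", "R5", "C4"]
closed_path = ["C1", "R2", "C3", "R5", "C4", "R4", "C2", "R1", "C1"]
```

Only the concept nodes are tested for uniqueness, so a relation could in principle appear twice on a path, in line with the relaxed condition for conceptual graphs.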
3.4.2 Formation Rules
Not every combination of concepts and conceptual relations makes sense in
a meaningful way; therefore, conceptual graphs that do represent meaning will be
considered well-formed, and other combinations with no meaning will be called
ill-formed [118]. When working with well-formed CGs, three formation rules can be
applied repetitively [118]:
1. Copy - An exact copy of a well-formed CG is well-formed.
2. Detach - All CGs that remain when any conceptual relation is removed from a
well-formed CG are also well-formed.
3. Restrict - If a is a concept in a well-formed CG G, then for any concept c ≤ a
from the concept type hierarchy (see Section 3.2) of G, the graph obtained by
substituting c for a is well-formed.
Examples can be presented to show how each of these formation rules can be applied.
The ‘copy’ formation rule is fairly straightforward: the graph G in Figure 3.7 can be
copied to graph H in Figure 3.6, where both graphs are equivalent and well-formed,
just displayed in a different way.
If one starts with the graph H in Figure 3.6 and applies the ‘detach’ formation
rule twice, first removing conceptual relation R3 and second removing conceptual
relation R5, then graph H′ shown in Figure 3.10 will be produced. The graph H′ is still
well-formed even though two of the conceptual relations have been detached.
Figure 3.10: Basic Detached Conceptual Graph (concept nodes C1, C2, C3, C4; relation nodes R1, R2, R4).
To explain the restrict rule clearly, graph H′, created by the detach
formation rule applications in the paragraph above, and graph G, shown in Figure
3.11, will be used in connection with a new concept type hierarchy shown in Figure
3.12.
Figure 3.11: Simple Basic Conceptual Graph (concept nodes C5, C2, C6; relation nodes R1, R6).
Figure 3.12: Second Concept Type Hierarchy (types shown, top to bottom: T; C5; C2, C1; C3, C4).
Graph G contains the nodes C5, R1, C2, R6, and C6, in which the C5 node can
be restricted using the second concept type hierarchy (see Figure 3.12) to C1, because
C1 is a subtype of node C5 (note: other restrictions could also be performed).
This restriction will produce the well-formed graph in Figure 3.13.
Figure 3.13: Simple Restricted Basic Conceptual Graph (concept nodes C1, C2, C6; relation nodes R1, R6).
3.4.3 Simple Conceptual Graphs (SCGs)
Researchers M. Chein and M.-L. Mugnier [15] from the LIRMM group at the
Université Montpellier and others [5, 22] have done research on a subset of conceptual
graphs known as simple conceptual graphs, SCGs (see Sowa 3.1.2 [119]). As
explained in Baget and Mugnier [5], these SCGs are connected, bipartite graphs where
the arcs are labeled and finite but not directed, SG = ((Vc, Vr), U, λ). Figure 3.14 is an
example of an SCG. Vc and Vr are the concept and relation nodes, respectively. U is the
set of edges, where edges incident on a relation node are totally ordered (that is, they
are numbered from 1 to the degree of the node). λ is a labeling function of the nodes
and edges [75].
Figure 3.14: Simple Conceptual Graph (SCG) (relation node R2; concept nodes C1, C3, C4; edges numbered 1 to 3).
Examining U further, an edge numbered i between a relation node r and a concept
node c can be labeled by (r, i, c) and is unique in U; all edges within U will be
stored in this triplet format. As an example, from Figure 3.14, (r2, 2, c1) would be an
element of U.
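As a rough illustration of the triplet format, the edge set U can be held as a set of (r, i, c) tuples. In the sketch below only the triple (r2, 2, c1) comes from the text; the other two triples and the helper names are assumptions made for the example.

```python
from collections import defaultdict

# Edge set U: each edge is a triple (relation node, position i, concept node).
# Only ("r2", 2, "c1") is given in the text; the other two triples are assumed.
U = {("r2", 1, "c3"), ("r2", 2, "c1"), ("r2", 3, "c4")}


def neighbours(edges, r):
    """Concept nodes around relation r, in the total order given by the edge numbers."""
    return [c for (rel, i, c) in sorted(edges) if rel == r]


def well_numbered(edges):
    """Check that the edges on each relation node are numbered 1..degree."""
    positions = defaultdict(list)
    for rel, i, _ in edges:
        positions[rel].append(i)
    return all(sorted(p) == list(range(1, len(p) + 1)) for p in positions.values())
```

Because each triple is hashable, membership in U can be tested directly, which reflects the statement that every edge is unique in U.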
Every node also has a label defined by the mapping λ. A relation node’s label
is its (type(r), arity(r)) (defined in Section 3.2.2), and a concept node’s label is its (type(c), marker(c))
(defined in Section 3.2.2). The directionality is removed to simplify the reasoning
processing of the graphs. Because there is no directionality, there are no
conceptual relations that are functional (which excludes actors).
However, an extension of SCGs that does allow directionality and cycles (this
will be discussed further with the actual algorithms) is feature term graphs, ω-terms,
introduced by Ait-Kaci [2]. A conceptual graph G is a feature term graph if it obeys the
following conditions [136]:
• the relations are all binary: for any relation r, only arg1(r) and arg2(r) are defined,
• the relations are functional: for any relations r and r′ ∈ A, arg1(r) = arg1(r′) and
type(r) = type(r′) implies that r = r′, and
• there is a head concept h ∈ C such that for all c ∈ C there is a path (c_1, r_1, ..., r_{n-1},
c_n) with arg1(r_i) = c_i and arg2(r_i) = c_{i+1} such that c_1 = h and c_n = c. Note that
when n = 1 this includes the case c = h.
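The three conditions above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical encoding in which each relation is a pair (type, args) and args is a tuple of concept names; it is not the representation used later in this work.

```python
from collections import deque


def is_feature_term_graph(concepts, relations, head):
    """relations: list of (type, args), where args is a tuple of concept names."""
    # 1. All relations are binary: only arg1 and arg2 are defined.
    if any(len(args) != 2 for _, args in relations):
        return False
    # 2. Functional: at most one relation of a given type leaves a given concept.
    seen = set()
    for rtype, (src, _) in relations:
        if (rtype, src) in seen:
            return False
        seen.add((rtype, src))
    # 3. Every concept is reachable from the head along arg1 -> arg2 links.
    reached, queue = {head}, deque([head])
    while queue:
        c = queue.popleft()
        for _, (src, dst) in relations:
            if src == c and dst not in reached:
                reached.add(dst)
                queue.append(dst)
    return reached == set(concepts)
```

The reachability test uses a breadth-first search, so cycles in the graph do not cause the check to loop.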
3.4.4 Conceptual Graphs Interchange Format (CGIF)
The conceptual graph interchange format (CGIF)¹ is a representation for conceptual
graphs intended for transmitting CGs across networks and between IT systems
that use different internal representations. The CGIF syntax ensures that all necessary
syntactic and semantic information about a symbol is available before the symbol is
used; therefore, all translations can be performed during a single pass through the input
stream. Part of this information is reproduced here in the appendices (see Appendix
B) to give a concrete definition of a conceptual graph and to indicate how CGs were
transmitted between the systems during the testing discussed later in this work.
¹ The current archived copy of CGIF from the ICCS 2001 workshop is located at: http://www.cs.nmsu.edu/~hdp/CGTools/cgstand/cgstandnmsu.html#Header_44
The CGIF format was originally developed by John Sowa for a possible International
Standard [126]. It was then modified to the format seen at the CGTools
Workshop, which was held at the International Conference on Conceptual Structures in
2001 [97]. Since that time, it has been completely revised and incorporated, as Annex B, into a
larger standardization effort known as the “International Standard for Common Logic”
[33].
3.5 Data Structures
In order to evaluate the array and hash table data structures over the graph structure,
one looks at how long it takes to store and retrieve a single relationship (see Definition
3.4.3 for a CG) within the graph given the specified data structure. Note: this
does not examine or account for any support (see Section 3.2.2) or hierarchy (see Section
3.2) processing. The evaluation assumes that more retrievals than stores will be done on the
knowledge base, so it is more important to optimize the retrieval of relationship
elements of the graph than the time and space needed to store that information.
Table 3.1 indicates the time to store and retrieve a relationship from a set of n relationships
within a graph for certain data structures. The following sections define how
these values were reached and any related constraints or constants.
Table 3.1: Execution Times for a Single Element with a Set of Size n.
Data Structure           Storage    Retrieve
Array (sorted)           O(n)       O(log(n))
Array (unsorted)         O(1)       O(n)
Hash Table               O(1)       O(1+α)
Perfect Hash (single)    O(n)       O(1)
Perfect Hash (double)    O(n^2)     O(1)
3.5.1 Arrays
When arrays are used as data structures, the time an array takes to store elements
depends on whether the array of values is sorted. When the data is not sorted,
but just appended to the end of the array, the storage of data is very quick, O(1),
but retrieving the data can take as long as O(n) because one has to look through
the whole array. When the data is sorted on storage, it can take O(n) time to place
the data, but with a binary search on a tree structure it takes O(log(n)) time to retrieve
it.
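The four array costs in Table 3.1 can be made concrete with a short Python sketch. The function names are hypothetical, and a Python list stands in for the array; `bisect` from the standard library performs the binary search.

```python
import bisect


def store_unsorted(arr, x):
    arr.append(x)                    # O(1) amortised: append at the end


def retrieve_unsorted(arr, x):
    for i, v in enumerate(arr):      # O(n): may scan the whole array
        if v == x:
            return i
    return -1


def store_sorted(arr, x):
    bisect.insort(arr, x)            # O(n): later elements are shifted


def retrieve_sorted(arr, x):
    i = bisect.bisect_left(arr, x)   # O(log n): binary search
    return i if i < len(arr) and arr[i] == x else -1
```

Relationships stored as tuples such as ("r2", 2, "c1") compare lexicographically in Python, so they can be kept in sorted order without a custom comparison.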
However, if the sorted data comes from a directed cyclic graph being placed into a knowledge base
structure, the execution time for storage equals the time needed for retrieval. This
can be shown as follows: for building the array, the execution time for a single graph is
O(n), where C * n = #vertices + #edges and #edges = 1/2 #vertices = 1/2 n, so C = 3/2 (see
Cormen90 [20]). For retrieving the element from the graph (for example, doing a
direct match), the whole array may need to be checked again, giving the execution
time of O(n).
3.5.2 Hash Tables
Storage of an element in a hash table data structure has an expected storage
time of O(1) plus the time it takes to compute the hash value for the key, h(k), and,
in the case of a collision, to store the element in the secondary data structure. If the secondary
structure is an unsorted linked list, then the element can just be placed at the head of
the list (the most common data structure [20]). On retrieval, even when there are collisions
between key hash values, there are still far fewer than n of them, where n is the number of nodes in
the graph. The hashing function will produce more than one value, so not all hash keys will
collide. The expected time for retrieval with a hash table is O(1 + α), where α is the
time to retrieve the element if there was a collision at storage, plus the time to compute
the hash value for the key, h(k).
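A hash table with chaining can be sketched as follows. The class and method names are hypothetical, and a Python list stands in for the unsorted linked list. In the standard textbook analysis of chaining, α is the load factor (stored elements divided by buckets), which is the expected chain length that a retrieval has to scan.

```python
class ChainedHashTable:
    """Hash table with one unsorted chain per bucket (illustrative only)."""

    def __init__(self, buckets=8):
        self.buckets = [[] for _ in range(buckets)]
        self.count = 0

    def _chain(self, key):
        # h(k): Python's built-in hash reduced to a bucket index.
        return self.buckets[hash(key) % len(self.buckets)]

    def store(self, key, value):
        # Placed at the head of the chain; assumes the key is not already stored.
        # With a real linked list this step is O(1); list.insert(0, ...) is not.
        self._chain(key).insert(0, (key, value))
        self.count += 1

    def retrieve(self, key):
        for k, v in self._chain(key):   # scans only the colliding entries
            if k == key:
                return v
        return None

    def load_factor(self):
        return self.count / len(self.buckets)
```

With 10 elements in 4 buckets the load factor is 2.5, so a retrieval inspects a few entries on average rather than all n.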
3.5.2.1 Perfect Hashing
In true perfect hashing, there are no collisions on key values, so retrieval time
now becomes O(1) (note: there is a constant because of the execution of the hashing
function to find the key) [24]. However, creation of the perfect hashing function given a
set of dynamic input data can be costly on storage. There has been research on finding
the perfect hashing function: a hash function description (“program”) for a set of
size n occupies O(n) words and can be constructed in expected O(n) time [83]. Work
has also been done on finding a universal hash function [131] or a quasi-perfect hash
function [23] (as opposed to a perfect hash function) that can be constructed in time
O(1 + α), where α is again not close to n.
3.5.2.2 Hash Table/Hash Tables
When a hash table is the value element of a hash table data structure,
extra storage space is needed to hold the overhead of the second hash
table. Because this hash table is embedded in another hash table with its own
overhead, double the amount of overhead space is being used. However,
if both hash tables are perfect hash tables, then the retrieval time for finding the
sub-value becomes O(1) * O(1), or O(1) (constant). After the constant overhead retrieval time
is accounted for, the retrieval time is O(1).
Besides the extra overhead space required, the time to store the double hash
tables would be at most O(n^2). This assessment
is reached by looking at two hash tables in which one table holds n elements and
the other table holds m elements. Since m ≤ n, the evaluation can
be performed using the size n. Pagh’s algorithm [83], discussed above,
shows that storing a perfect hash table for one set of hashes takes O(n) time;
therefore, storing two perfect hash tables would take O(n) time at each element in
the first hash table to store the second hash table, or O(n^2) for both tables.
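The two-level lookup can be illustrated with nested Python dictionaries. This is only a sketch: Python's built-in `dict` is an ordinary hash table rather than a perfect hash, and the labels and function names below are hypothetical.

```python
# Outer key: a concept label; inner key: a relation label; value: the linked concept.
kb = {}


def store(kb, concept, relation, target):
    # The inner table is created on demand the first time a concept is seen.
    kb.setdefault(concept, {})[relation] = target


def retrieve(kb, concept, relation):
    inner = kb.get(concept)          # first hash lookup
    if inner is None:
        return None
    return inner.get(relation)       # second hash lookup
```

Each retrieval performs exactly two hash lookups regardless of how many entries are stored, which is the constant-time behaviour described above for the perfect-hash case.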
CHAPTER 4
REASONING OPERATIONS
This chapter first describes the operator ‘project’ and how it relates
to the operator ‘join’. The second section describes graph isomorphism relationships,
and the last part of the chapter describes how all these elements are connected
within reasoning operations.
4.1 Operators
Using the knowledge representation described in Section 3.4, two operators,
project and join, manipulate conceptual graphs using the rules that incorporate type
hierarchy subsumption [48]. These operators are duals (i.e., intersection and union);
therefore, the description of project is, in some sense, the dual of the description of
join.
The following set of correspondences is sufficient to indicate how project and
join compare:
Project ←→ Join
Min. Supertype ←→ Max. Subtype
Intersection ←→ Union
4.1.1 Project
The project operator is defined through a mapping π : u → v, where πu is a
sub-element of v. When u and v are defined to be conceptual graphs, for graph u to be a
subgraph of graph v, all of the nodes and arcs of u are in v [46], and the project operator
π holds to the following rules [119, 136]:
• Type preserving: For each concept c in u, πc is a concept in πu where type(πc)
≤ type(c), and ≤ is the subtype relation. If c is an individual, that is, an actual
instance of an object, then referent(c) = referent(πc).
• Structure preserving: For each conceptual relation r in u, πr is a conceptual relation
in πu where type(πr) = type(r). If the ith edge of r is linked to a concept c
in u, the ith edge of πr must be linked to πc in πu.
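The two rules can be read as a checker for a candidate mapping π. The sketch below assumes a hypothetical dictionary encoding of graphs and of the type hierarchy (type to direct supertype); the type names in the usage example are illustrative and are not taken from Figure 3.3.

```python
def is_subtype(t, s, supertype):
    """True if t <= s in a hierarchy given as type -> direct supertype."""
    while t is not None:
        if t == s:
            return True
        t = supertype.get(t)
    return False


def is_projection(u, v, pi_c, pi_r, supertype):
    """u, v: {'concepts': {id: (type, referent or None)},
              'relations': {id: (type, tuple of concept ids in edge order)}}."""
    # Type preserving: type(pi c) <= type(c); individuals keep their referent.
    for c, (ctype, referent) in u["concepts"].items():
        if pi_c.get(c) not in v["concepts"]:
            return False
        ptype, preferent = v["concepts"][pi_c[c]]
        if not is_subtype(ptype, ctype, supertype):
            return False
        if referent is not None and preferent != referent:
            return False
    # Structure preserving: same relation type, and the ith edge goes to pi c.
    for r, (rtype, args) in u["relations"].items():
        if pi_r.get(r) not in v["relations"]:
            return False
        ptype, pargs = v["relations"][pi_r[r]]
        if ptype != rtype or tuple(pi_c[c] for c in args) != tuple(pargs):
            return False
    return True


# Usage with assumed types: Ball <= Object, and an individual Color: blue.
sup = {"Ball": "Object", "Object": "T", "Color": "T"}
u = {"concepts": {"x": ("Object", None), "y": ("Color", "blue")},
     "relations": {"r": ("prop", ("x", "y"))}}
v = {"concepts": {"b": ("Ball", None), "k": ("Color", "blue")},
     "relations": {"p": ("prop", ("b", "k"))}}
ok = is_projection(u, v, {"x": "b", "y": "k"}, {"r": "p"}, sup)
```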
The example in Figure 4.1 shows project with the general forms of graphs.
Figure 4.1: Project (Mp (Q, H) = P) (Adapted from [[92], Figure 3]). (Q: nodes A, F, G; H: nodes A, B, I, J; P: nodes A, I, J.)
This example uses the hierarchy example (see Figure 3.3) from Section 3.2. One
can see that the node A from the first graph, which will be called Q, is projected onto
the second graph, which will be called H, with a match at its node A. This is the only
exact match in the project. Then, using the hierarchy, F is the supertype of I, so when
Q is projected onto graph H, I, the common subtype of I and F, forms a new node in
the projection graph P, and this node is linked to A. Lastly, nodes G from graph Q and
J from graph H have a common subtype of J, so that is formed as a new node in the
projection graph P, giving the resulting project graph P = {V, E} where V = {A, I, J}
and E = {{A, I}, {A, J}}. Note that, using this hierarchy, more than just this one
project is possible.
If join (see Section 4.1.2) is likened to set union, in that all nodes not joined
are just left alone and come along for the ride, then project is like set intersection.
All nodes that are not projected are simply dropped from the resultant graph, and their
associated relation nodes are detached.
4.1.2 Join
With an elementary join between two graphs, U1 and U2, that are not necessarily
distinct, let c1, c2 be two concept vertices belonging respectively to U1 and U2 and having
the same type or subtype; then the result of the join of U1 and U2 would be U3, with
the restriction (see Section 3.4.2) of concept c1 with c2 and, in U3, the linking to c2 of all the edges
that had been linked to c1 [15, 119].
In join MJ (see Figure 4.2), the labels may be restricted by replacement with
a label of any subtype, and graphs will be merged on the maximum number of nodes.
Figure 4.2 again uses the hierarchy example (see Figure 3.3) from Section 3.2. One
can see that the node A from the first graph, which will be called Q, is joined with the
second graph, which will be called H, with a match at its node A. This is again the
only exact match in the join. Then, using the hierarchy, I is the subtype of D, so when
Q is joined with graph H, D is restricted to I and I forms a new node in the join graph
J, and this node is linked to A. Lastly, node K from graph Q is linked into the new
join graph J, giving the resulting join graph J = {V, E} where V = {A, I, K, B, F} and
E = {{A, I}, {A, K}, {A, B}, {A, F}, {I, F}}.
Figure 4.2: Join (MJ (Q, H) = J) (Adapted from [[92], Figure 2]). (Q: nodes A, D, K; H: nodes A, B, F, I; J: nodes A, B, F, I, K.)
4.2 Graph and Subgraph Isomorphism
Table 4.1 shows how each of these problems and sub-problems falls within
the problem classes discussed in basic subgraph isomorphism reasoning (see Section
1.2.1).
4.2.1 Graph Isomorphism
For two graphs to be identical, the vertices in G must map onto the vertices in
H through a mapping f such that (x, y) is an edge of G iff (f(x), f(y)) is an edge in H, therefore giving
isomorphic graphs. However, if the graphs are labeled, that is, the vertices have actual
labels as opposed to variables, then given graph G = (Vg, Eg) and graph H = (Vh, Eh)
such that they are identical, that is, (x, y) ∈ Eg iff (x, y) ∈ Eh, they can be defined to
be isomorphic. As already stated in Section 1.2.1, graph isomorphism is in the problem
class NP (see first row of Table 4.1), even though there are known algorithms when the
graphs are labeled that have a polynomial time solution.
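The difference between the labeled and non-labeled cases can be seen in a small sketch. For labeled graphs the test reduces to comparing vertex and edge sets; for non-labeled graphs the naive approach shown here tries every mapping f, which is exponential and is given only for contrast (it is not one of the algorithms cited in this work). The function names and the undirected-edge encoding are assumptions.

```python
from itertools import permutations


def normalise(edges):
    # Undirected edges: store each as a frozenset so (x, y) equals (y, x).
    return {frozenset(e) for e in edges}


def identical_labeled(g, h):
    """g, h: (vertex labels, edge list). Expected linear time using hashing."""
    (vg, eg), (vh, eh) = g, h
    return set(vg) == set(vh) and normalise(eg) == normalise(eh)


def isomorphic_unlabeled(g, h):
    """Brute force over all bijections f from the vertices of g to those of h."""
    (vg, eg), (vh, eh) = g, h
    vg, vh = list(vg), list(vh)
    if len(vg) != len(vh):
        return False
    target = normalise(eh)
    for perm in permutations(vh):
        f = dict(zip(vg, perm))
        if {frozenset((f[x], f[y])) for x, y in eg} == target:
            return True
    return False
```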
4.2.2 Subgraph Isomorphism
As discussed in Ullman’s paper of 1976 [132] and used in basic subgraph isomorphism
reasoning (see Section 1.2.1), looking for all the isomorphisms between a
given graph G = (Vg, Eg) and subgraphs of a further graph H = (Vh, Eh) allows the detection
of related objects within the two graphs. This subgraph isomorphism helps to
find if two structural patterns within the graphs are related.
Table 4.1: Related Problem Classes.
Problem Class              Graph Description                                 Graph to Graph       References
                                                                             (worst case time)
Graph Isomorphism          nodes - non-labeled                               NP                   [42]
                           edges - undirected
Subgraph Isomorphism       nodes - non-labeled                               NP-Complete          [42, 132, 77, 68]
                           edges - undirected
Subgraph Isomorphism       nodes - labeled                                   P (n^2)              [132, 77, 68]
                           edges - undirected; labeled
Isomorphism                nodes - non-labeled                               P (n^2.5)            [103, 42, 111]
                           edges - undirected
                           graphs are both trees
Subtree Isomorphism        nodes - non-labeled                               NP-Complete          [42]
                           edges - undirected
                           query graph is a tree
Subforest Isomorphism      nodes - non-labeled                               NP-Complete          [42]
                           edges - undirected
                           query graph is a forest; search graph is a tree
Subbipartite Isomorphism   nodes - bipartite; non-labeled except type        NP-Complete          [38, 39, 40]
                           edges - undirected
Projection                 nodes - bipartite; non-labeled except type        NP-Hard              [119, 48, 74, 22]
                           edges - labeled; undirected
Proposed Projection        nodes - bipartite; labeled                        NP-Hard              dissertation defined algorithm
                           edges - non-labeled; directed
Maximal Join               nodes - bipartite; non-labeled except type        NP-Hard              [84, 119, 48, 74, 77]
                           edges - labeled; undirected
Proposed Maximal Join      nodes - bipartite; labeled                        NP-Hard              dissertation defined algorithm
                           edges - non-labeled; directed
4.2.2.1 Non-labeled nodes and undirected edges
This is where one wishes to discover if G1 contains a subgraph isomorphic to
G2. For this class of problem, the vertices are general non-labeled nodes and the edges
are non-labeled, undirected links. According to Garey and Johnson [42], and other
references, this problem can be restricted to the known NP-Complete class problem of
CLIQUE and therefore has the complexity of NP-Complete. Per the reasoning given
above for graph isomorphism, the complexity of comparing a graph, G2, to all the graphs in a knowledge
base is also NP-Complete.
4.2.2.2 Labeled nodes and undirected edges
This sub-problem of the subgraph isomorphism problem discussed above is
shown by Ullman [132] and others to be solvable in P (polynomial time). In
their paper of 2000 [68], Messmer and Bunke show that the time complexity can be improved by dividing the subgraph
question into two parts: 1) decomposing the graph, and 2) querying the subgraph isomorphism
question on the smaller graph. In fact,
producing unique labels for the decomposed graph parts (down to the single nodes)
allows both parts to run in polynomial time.
4.2.3 Subtree Isomorphism
One of the sub-problems of subgraph isomorphism is subtree isomorphism. This
is when both G and H are trees (“a tree is a connected acyclic graph” [[46] page 32]).
A polynomial time algorithm for this sub-problem was shown by Reyner [103], where
the running time was O(n_1 * n_2^{1.5}), where n_1 is the number of vertices in the input graph
and n_2 is the number of vertices in the knowledge base graph. This polynomial time
algorithm extends to m * O(n^{2.5}), where m is the number of graphs in the knowledge base
and O(n^{2.5}) is the polynomial running time for the input graph times the knowledge
base graph. In the P algorithm, n is the maximum number of nodes in the largest
graph. It should be noted that Reyner’s algorithm used maximal matching in a bipartite
graph and therefore considered the trees to be bipartite [103].
4.2.3.1 Hamiltonian Path
When the query graph, H, is a tree and the knowledge base graph, G, is unknown,
then H contains a HAMILTONIAN PATH as a sub-problem and hence is NP-Complete
[according to Garey and Johnson [42] page 104].
4.2.3.2 Subforest Isomorphism
When the knowledge base graph, G, is a tree, then the query graph, H, must be
acyclic. If it is not a tree, then it may be a forest. However, Garey and Johnson [[42]
page 105] show that this sub-problem is also NP-Complete.
4.2.4 Subbipartite Isomorphism
This sub-problem can be defined as a subgraph isomorphism search using bipartite
graphs. This would be the class of problem most closely related to the reasoning
operations projection and maximal join as defined in Sowa’s 1984 book [119]. This
sub-problem of subgraph isomorphism answers the decision question: is there a subgraph
of the knowledge base graph, G, that is isomorphic to the query graph, H, where
G and H are bipartite graphs? According to the Eppstein 1994 work [38], this subgraph
isomorphism question on bipartite graphs can be answered in the best case in
polynomial time. This comes about because the number of edges is reduced through
the relationship between the nodes and a natural set of labels that are added because
of the types on the nodes. However, it should be stated that these labels are not totally
unique, and therefore, Ettinger [40] clearly states that for the worst case running time over
a whole knowledge base, where the labels turn out to be duplicated across nodes, the
problem is still NP-Complete. The labels may not be totally unique even though
they are separated into two groups, because the labels in the nodes must only be of two
different types; within a type the label on all nodes may be the same. Therefore, this
sub-problem can be reduced to the Maximal CLIQUE problem, which is known to be
NP-Complete.
4.2.5 Projection
Projection involves both the subclass of problems defined above as subbipartite
isomorphism and a new subclass of problem that looks at defining rules for type lattice
(many times called ‘trees’) subsumption.
4.2.5.1 Historical Algorithms
The projection sub-problem can be considered constructive as well as isomorphic
because of the way the rules are applied. The construction comes from the generalization
that can be applied when a node, through the application of subsumption with
the type lattice, is built into a new node of the output graph as part of the projection
of an isomorphic subgraph. Therefore, the output of projection is not simply a logical
true or false, but a newly constructed graph containing the subgraph structure from the
knowledge base graph with possibly new nodes constructed through the application of
the subsumption rules. The nodes in the graph are only labeled to the same extent as
the bipartite graphs described in the above class, so, when evaluating the running time
of the algorithm, if no rules are applied from a type lattice, the running times are just
the same as for a subgraph isomorphism using bipartite graphs. The output graph in
this case is the subgraph from the knowledge base graph that was being projected onto.
However, the worst case running time when rules for the type lattice are applied
must take into account that projection is a problem known to be in NP (Sowa [119],
Hartley and Coombs [48], Mugnier and Chein [74], and Croitoru and Compatangelo
[22]), and is constructive, so NP-hard.
4.2.5.2 Proposed Algorithm
This sub-problem of the projection given above is newly defined in this dissertation.
It makes the following two modifications to the maximal projection problem:
1) all nodes are uniquely labeled, and 2) the edges are non-uniquely labeled, but do
have some implicit labeling because they are directed. Tests were also performed using
different data structures at implementation time. It is believed that through the
use of different data structures, the execution time will reflect the running time of the
subgraph ‘labeled’ isomorphism problem as opposed to the subgraph isomorphism on
bipartite graphs. The change in data structures also allows a change in how concepts
versus conceptual relations are searched for within the graph structure through the use
of the ‘labels’. Through this shift in sub-problem of subgraph isomorphism, there is an
improvement in the running time for the first part of the projection problem (not having
the application of rules from the type lattice).
However, because the overall problem is still constructive as opposed to a decision
problem, and the application of type rules has a worst case running time in NP
(Mugnier and Chein [74], and Croitoru and Compatangelo [22]), the worst case running
time for this sub-problem is still NP-hard.
4.2.6 Maximal Join
Maximal join is a sub-problem of projection, in that, the maximal join algorithm
is a join on compatible projections. These projections are maximally extended from the
common generalization of two graphs which are bipartite graphs [119].
4.2.6.1 Historical Algorithms
In performing a join, the time complexity includes the time to find the subbi-
partite isomorphism(s) of the two graphs, and then the matching (or joining) of these
projections again in a constructive manner to produce the largest extended constructed
graph from the subgraph of the knowledge base graph with the query graph.
Graph matching can be reduced to a unification problem and, by doing so, in
many cases where the graphs are acyclic, can be performed in linear time (Myaeng and
Lopez-Lopez [77] and Paterson and Wegman [84]). Therefore, the overall complexity
of a maximal join in the best case (when no type rules are applied in the projection)
is still polynomial, O(n^4); however, in the worst case (when the projection does apply
type rules) it is an NP-Hard problem.
4.2.6.2 Proposed Algorithm
Like the proposed new projection algorithm, this new algorithm is a sub-problem
of maximal join with modifications. The modifications are unique labels on all the
nodes of the graphs and non-labeled directed edges in the graph. Here, it is seen that
the data structures being modified at implementation time again drive the projection to
have the time complexity of a labeled subgraph isomorphism, without type lattice rules.
These changes also drive the matching in the join to be linear even when the graphs are
cyclic. Therefore, the best case time complexity is O(n^3). However, the worst case,
with type rules being applied, is still NP-hard. During experimentation, it is hoped that
it can be shown that the worst case is not reached very often.
4.3 Operations
Two basic operations are necessary for CG reasoning processes:
1) projection and 2) maximal join. These operations use the project and join operators,
respectively, and apply the CG KB algorithms over them. These algorithms are based
on the subgraph isomorphism class of problems defined in the section above.
4.3.1 Projection
A projection operation combines three things: the project operator, which is a matching
on a graph morphism; graph data structures carrying either the support information
(for SCGs) or the hierarchies (for full CGs); and the actual projection algorithm. As
stated in Baget and Mugnier, “the elementary reasoning operation, projection, is a kind
of graph homomorphism that preserves the partial order defined on labels” [5, page 428].
Projection uses not only a project operator (see its definition in Section 4.1.1) but also
either the support S of the graph (for an SCG) or the defined type hierarchy (for a CG),
and it produces a generalization subgraph. During the projection of the query graph onto
the match graph, the match graph is generalized, and structure is removed by conceptual
relations being detached [37].
For the rest of this work, the projection operation evaluation and comparison
will be restricted to injective projection. In general, the projection mapping is not
necessarily one-to-one; that is, a concept or relation in u may have more than one
concept or relation in v for which $\pi_u$ is a valid mapping. In this respect, there can be
more than one valid projection from u to v.
When the projection operation is performed using the query graph from Figure 4.3
(see footnote 1) onto the KB graph and hierarchy of Figure 4.4, the two projections
discovered, P1 and P2, are displayed in Figure 4.5. Using the type hierarchy, both object
and ball are matches; note that if no hierarchy were present, there would be only one
projection. This is a simple injective projection because the graphs are small; however,
it can become complex very quickly.
[Figure 4.3: Query Graph. Nodes shown: concept Object, relation prop, concept Color: blue.]
1 The figures in this section were generated by CharGer [32].
[Figure 4.4: KB Graph with Type Hierarchy. The graph CubeBetweenBalls contains the concepts Object, Ball, Ball, Color: blue, and Cube: A and the relations prop, prop, between, and ontop; the type hierarchy contains T, Object, Cube, and Ball.]
[Figure 4.5: Projection Results. P1 contains Object, prop, and Color: blue; P2 contains Ball, prop, and Color: blue.]
4.3.2 Maximal Join
With conceptual graphs, the join operation is always maximal. Maximal
join is therefore defined as: “a join on compatible (common) projections that are
maximally extended from the common generalization, L, of two conceptual graphs, Q and
G” (see Sowa 3.5.8 [119] page 102). The join is locally maximal because there may
be more than one group of compatible projections from two graphs that are maximally
extended (see Figure 4.2). In this way, structure is added or concepts are made more
specific [37]. Since restrictions are allowed, two nodes are join-able as
part of a maximal join operation if they contain types that have a maximal common
subtype using the support S (in the case of SCGs) or the type hierarchy (for CGs).
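The join-ability test on types can be sketched as follows. This is only an illustration under stated assumptions: the PARENTS table, the bottom type named Absurd, and the rule that a pair whose only common subtype is the bottom type is not join-able are choices made for the sketch, not definitions taken from this work.

```python
# Hypothetical type hierarchy: each type maps to its direct supertypes.
PARENTS = {
    "T": [],
    "Object": ["T"],
    "Ball": ["Object"],
    "Cube": ["Object"],
    "Absurd": ["Ball", "Cube"],  # assumed bottom type
}


def ancestors(t, parents):
    """All supertypes of t, including t itself."""
    seen, stack = set(), [t]
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(parents[cur])
    return seen


def maximal_common_subtypes(a, b, parents):
    """Common subtypes of a and b that have no proper supertype
    which is itself a common subtype."""
    common = {
        t for t in parents
        if a in ancestors(t, parents) and b in ancestors(t, parents)
    }
    return {
        t for t in common
        if not any(u != t and u in ancestors(t, parents) for u in common)
    }


def joinable(a, b, parents, bottom="Absurd"):
    """Assumed rule: join-able when a maximal common subtype other than
    the bottom type exists."""
    return bool(maximal_common_subtypes(a, b, parents) - {bottom})
```

Under this table, Object and Ball share the maximal common subtype Ball and are join-able, while Ball and Cube meet only at the assumed bottom type and are not.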
Papers [15, 92, 91, 48] contain many examples, but to clarify the maximal
join operation three examples will be shown here. First, the two projections found in
the previous example for the projection operation could be joined into a single graph
because object is a generalization of ball and cube (see Figure 4.6). Basically, the
Object concept from graph P1 would be restricted to concept Ball, and then relation R1
would be detached; this produces a graph that is just a copy of the graph P2. Because
these two graphs could be fully joined into a completely compatible (common) graph,
with no nodes left that were not join-able, they are considered compatible
projections. When graphs are specialized, they are maximally joined on compatible
projections of a more general graph [119]; therefore, the joined graph from Figure 4.6
could then be joined back to the original graph seen in Figure 4.4 to produce parcel
models. Within these models, the consideration that the second ball that is part of the
‘between’ relation is also colored blue will be shown.
[Figure 4.6: Join of P1 and P2 Graphs. Nodes shown: concept Ball, relation prop, concept Color: blue.]
The second example relates back to Figure 1.3 given in Section 1.2.2 of the
Introduction (see Chapter 1). From that example it can be seen that the graph
U is the common projection graph between graphs G1 and G2. When graphs G1 and
G2 are maximally joined, this common graph becomes the merged nodes within the
resulting graph G. In order for graph U to be the merging ‘piece’ between graphs G1
and G2, it is assumed that a hierarchy indicating that Girl ≤ Person is available.
It is this subtype relationship that allows the restrict rule to produce the available
join.
The last example used to clarify the maximal join operation arises
when the graphs in Figures 3.10 and 3.11 (see Chapter 3, Section 3.4.2) are maximally
joined. It has already been seen, within that section, that the graph in Figure
3.11 can be restricted and detached to produce the graph in Figure 3.13. Using the common
graph seen in Figure 4.7, the graph J in Figure 4.8 is produced with just one step after
restriction.
[Figure 4.7: Common Graph of Basic Graphs. Nodes shown: C1, R1, C2.]
[Figure 4.8: Join of Detached Basic and Simple Basic Graphs. Nodes shown: C1, C2, C3, C4, C6, R1, R2, R4, R6.]
4.3.3 Over Knowledge Bases
As discussed in Section 1.2.1, all the subgraph isomorphism problems discussed
so far are from a two-graph perspective. However, for knowledge bases there may be
more than one graph within the KB that will match the input (query) graph [68].
Looking at the operations above, when they are performed over a knowledge
base of graphs $\mathcal{G}$, even though the two-graph operation in the typical situation can
be solved in P, the functionality of the operation over the whole database gives the
following results.
Projection’s functionality over a set of graphs $\mathcal{G}$ is:
$$\text{projection}: \mathcal{G} \times \mathcal{G} \to 2^{\mathcal{G}}$$
As described above, there can be more than one valid projection between two
graphs, hence the powerset notation on the set of all graphs $\mathcal{G}$.
The functionality of maximal join over a set of graphs $\mathcal{G}$ is:
$$\text{maximal join}: \mathcal{G} \times \mathcal{G} \to 2^{\mathcal{G}}$$
There can be more than one maximal join, hence the powerset notation on the
set of all graphs $\mathcal{G}$. Join is a binary operation, but multiple graphs can be joined by
composing it with itself. Unfortunately, there is good reason to believe that join is not
commutative when semantic considerations come into play [91], but for now it will
be assumed that there is no problem.
CHAPTER 5
ALGORITHMS AND ANALYSIS
As discussed in Section 4.3.2, the maximal join operation is an algorithm that involves
the joining of compatible projections that are maximally extended; however, not much
analysis and implementation has been performed on the join operation. Therefore, in
the first section on foundational algorithms, only projection algorithms will be
explored. Later, when the newly developed algorithms are discussed, any variations on
maximal extension of graphs and joining will be addressed.
5.1 Foundational Algorithms
In general, the matching part of both the projection and join algorithms is
unification (discussed previously in section 1.2.2) [19], and there are known linear
unification algorithms for acyclic (tree) graphs [84]. Also, SCGs have been evaluated as both
graph homomorphism and graph isomorphism. In their original paper from 1992 [74],
Mugnier and Chein looked at general projection running times and injective projection.
However, CGs and SCGs are not necessarily trees and only part of the algorithms
presented next apply to injective projection, so these linear algorithms give guidance, but
do not always directly apply.
As discussed in the Messmer and Bunke paper [68], a naive strategy with
forward-checking for establishing a subgraph isomorphism is Ullman’s backtracking in a search
tree algorithm [132]. Since Messmer and Bunke regard it as a common technique and
a good baseline subgraph isomorphism algorithm, the Ullman algorithm and its known
complexity (from [132, 68]) will be reiterated here to define a basis for investigating
projection algorithms. The basic idea of Ullman’s algorithm is to take one vertex of the
input (query) graph at a time and map it onto a model (a graph from the KB)
such that the resulting mapping represents a subgraph isomorphism for a subgraph of
the model (KB graph) projected from the input graph (query graph) (see pages 307 and
322 of Messmer and Bunke [68]). If at some point the mapping being built does not
represent a subgraph isomorphism, then the algorithm backtracks and tries a different
mapping. This process is continued until all vertices $v_1, \dots, v_M$ in $V_I$ of the input graph
are successfully mapped onto $V$ of the model. This either produces a subgraph
isomorphism from $G$ to $G_I$ or stops when a vertex in $V_I$ cannot be mapped to at least one
vertex in $V$. In the second case, the algorithm backtracks to a new $v_1$ in $V$ or $v_{n-1}$ in $V$
and tries to remap the subgraph isomorphism.
Even though this basic algorithm works well for small model and input graphs,
it performs poorly as the graphs become larger. This is because all checks are being
done locally. Ullman added a forward-checking procedure to detect when it is not
possible for $v_n$ to be mapped onto an available vertex in $V_I$ (see page 322 in Messmer and
Bunke [68]), so that the algorithm can backtrack immediately and save computational
steps. In the best case Ullman’s algorithm is bounded by $O(NIM)$, where $N$ = # model
graphs, $I$ = # labeled vertices in the input graph, which come from the $M$ set of labels, and
$M$ = # labeled vertices in the model graph that are unique. In the worst case the
algorithm is bounded by $O(N I^M M^2)$, where $N$ = # model graphs, $I$ = # vertices in the input
graph that are not labeled, and $M$ = # vertices in the model graph that are not labeled.
With this general algorithm, labeling of vertices greatly improves the efficiency of the
algorithm. However, it should be noted that this algorithm does not take into account
any support or hierarchy knowledge information.
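The backtracking idea described above can be sketched in Python. This is a simplified illustration in the spirit of Ullman's algorithm, not a transcription of it: the dictionary-based graph encoding and the function name are assumptions, labels must match exactly (no support or hierarchy), and the forward check is a plain label-availability test rather than Ullman's refinement procedure.

```python
def find_subgraph_isomorphism(input_graph, model_graph):
    """Map each input vertex onto a distinct model vertex, preserving
    labels and directed edges. Each graph is a dict of the form
    {"labels": {vertex: label}, "edges": set of (u, v)}.
    Returns one mapping {input_vertex: model_vertex} or None."""
    in_vertices = list(input_graph["labels"])
    in_labels, m_labels = input_graph["labels"], model_graph["labels"]
    in_edges, m_edges = input_graph["edges"], model_graph["edges"]

    def consistent(v, target, mapping):
        # Labels must agree and the target must not already be used.
        if in_labels[v] != m_labels[target] or target in mapping.values():
            return False
        # Every input edge between v and a mapped vertex must exist in the model.
        for u, w in mapping.items():
            if (u, v) in in_edges and (w, target) not in m_edges:
                return False
            if (v, u) in in_edges and (target, w) not in m_edges:
                return False
        return True

    def forward_check(next_index, mapping):
        # Simplified look-ahead: each unmapped input vertex still needs
        # at least one unused model vertex with the same label.
        used = set(mapping.values())
        return all(
            any(in_labels[v] == m_labels[t] for t in m_labels if t not in used)
            for v in in_vertices[next_index:]
        )

    def extend(index, mapping):
        if index == len(in_vertices):
            return dict(mapping)
        v = in_vertices[index]
        for target in m_labels:
            if consistent(v, target, mapping):
                mapping[v] = target
                if forward_check(index + 1, mapping):
                    result = extend(index + 1, mapping)
                    if result is not None:
                        return result
                del mapping[v]  # backtrack and try the next target
        return None

    return extend(0, {})
```

Because the label test is strict equality, a query vertex labeled Object would not match a model vertex labeled Ball here, which is exactly the limitation noted above about support and hierarchy information.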
5.1.1 SCG Projection
This section explains the projection algorithm found in Marie-Laure
Mugnier and Michel Chein’s 1992 work [74]. Note that the base-level polynomial algorithm
discussed is for SCGs in which the graph being projected has no loops (cycles), that is,
trees; this is the foundation for improving projection between two SCGs with a support (see
sections 3.4.3 and 3.2.2).
Before discussing the general and injective projection algorithms, some basic
definitions are given which will help the reader understand each algorithm. 1) Using
the projection operation provided in section 4.3.1, the following additional rules on
labels will be added to the graph morphism (from [74] page 240):
Definition 5.1.1 Given two simple conceptual graphs $G$ and $G'$, a projection
$\Pi$ from $G$ to $G'$ is an ordered pair of mappings from $(R_G, C_G)$ to
$(R_{G'}, C_{G'})$, such that:
(i) For all edges $rc$ of $G$ with label $i$, $\Pi(r)\Pi(c)$ is an edge of $G'$ with
label $i$.
(ii) $\forall r \in R_G$, $\mathrm{type}(\Pi(r)) = \mathrm{type}(r)$; $\forall c \in C_G$, $\mathrm{type}(\Pi(c)) = \mathrm{type}(c)$.
There is a general projection from $G$ to $G'$ if and only if $G'$ can be derived from $G$ by
the elementary specialization rules [119, 15].
2) The set of the numbers on edges between r and c (refer to section 3.4.3 on
SCGs) holds the following definition:
Definition 5.1.2 For $c$ a neighbour of $r$, let $P_r[c]$ be the class of the
partition of $P_r$ which corresponds to $c$.
3) Injective projection definition:
Definition 5.1.3 Injective projection is a restricted form of projection
where the image of $G$ in $G'$ is a subgraph of $G'$ isomorphic to $G$.
The projection from a tree to a graph in the general case, as defined on pages 245-246
of Mugnier and Chein's work [74], where there is a concept vertex $a$ in $T$ and a
concept vertex $c$ in $G$, is given in Algorithm 5.1.
Algorithm 5.1 Π is a General Projection from T to G
 1: function PROJ-ROOT(a, E)                      ⊲ a ∈ C_T
 2:     E ← {c ∈ E | label(a) ≥ label(c)}
 3:     if E = ∅ or a is a leaf, return E
 4:     for all r successors of a do              ⊲ Move through the neighbours
 5:         for all c ∈ E do
 6:             W_{c,r} ← {r′ neighbour of c | type(r) = type(r′) and P_r[a] ⊆ P_{r′}[c]}
 7:         end for
 8:         E_r ← ∪ {W_{c,r}}_{c ∈ E}
 9:         E_r ← PROJ-R(r, E_r)
10:         for all c ∈ E do
11:             V_{c,r} ← W_{c,r} ∩ E_r
12:         end for
13:         E_r ← {c ∈ E | V_{c,r} ≠ ∅}
14:     end for
15:     return E                                  ⊲ Project of graph
16: end function
17: function PROJ-R(r, E)                         ⊲ r ∈ R
18:     E ← {r′ ∈ E | P_r is thinner than P_{r′}}
19:     if E = ∅ or |P| = 1, return E
20:     for all a_i successors of r do            ⊲ Move through the hierarchy
21:         E_i ← ∪ {c_{r′} | P_r[a_i] ⊆ P_{r′}[c_{r′}]}_{r′ ∈ E}
22:         E_i ← PROJ-ROOT(a_i, E_i)
23:         E ← {r′ ∈ E | c_{r′} ∈ E_i}
24:     end for
25:     return E                                  ⊲ Projection up relation hierarchy
26: end function
For this general algorithm to compute the projection from T to G, it is broken
into two parts. The first function is used to determine the PROJ-ROOT part of the
definition. As seen in line 4, the function looks for the projection from T to G by
comparing the relation vertices connected to concept vertex a in T to the relation vertices
connected to concept c in G.
The second function, at line 17, is used to determine the PROJ-R part of the
definition. This function looks for possible mappings at each concept vertex by examining
sub-trees. The complexity of this general algorithm, as proved on page 247 of Mugnier
and Chein [74], is $O(m_T \times m_G)$, where $m$ denotes the number of edges. The problem
class related to this algorithm is in the NP class of problems.
This should be recognized as a single graph-to-graph project operator; for
a projection operation (see section 4.3.1), an injective projection is necessary in order
to produce the projection graph. If each graph is a tree, then one has a tree-to-tree
projection, which is known to have a polynomial time algorithm [42], but conceptual
graphs are not necessarily trees.
Therefore, Mugnier and Chein [74] modify their algorithm into
Algorithm 5.2 to actually return the image of the new projected graph. Within this algorithm
they use the function PROJ-R to continue to look for possible mappings at each
concept vertex, but they modify the PROJ-ROOT routine to return the projection image.
Even though this is a locally injective projection, on page 249 they prove that if T is a
conceptual tree and G is a cyclic conceptual graph, then the decision problem
being solved by this algorithm is still an NP-complete problem.
Algorithm 5.2 Π Modified as an Injective Projection from T to G
 1: function PROJ-ROOT(a, E)                      ⊲ a ∈ C_T
 2:     E ← {c ∈ E | label(a) ≥ label(c)}
 3:     if E = ∅ or a is a leaf, return E
 4:     for all r successors of a do              ⊲ Move through the neighbours
 5:         for all c ∈ E do
 6:             W_{c,r} ← {r′ neighbour of c | type(r) = type(r′) and P_r[a] = P_{r′}[c]}
 7:         end for
 8:         E_r ← ∪ {W_{c,r}}_{c ∈ E}
 9:         E_r ← PROJ-R(r, E_r)
10:         for all c ∈ E do
11:             V_{c,r} ← W_{c,r} ∩ E_r
12:         end for
13:         E_r ← {c ∈ E | V_{c,r} ≠ ∅}
14:     end for
15:     for all c ∈ E do
16:         Build the bipartite graph (A, B, U) such that:
17:         A = {sons of a}, B = {neighbors of c}
18:         (B can also be defined as ∪ {V_{c,a_i}, a_i ∈ A})
19:         U = {a_i v | v ∈ V_{c,a_i}}
20:         If this graph admits a matching with cardinality |A|,
21:         c is a solution
22:     end for
23:     return all c-vertices which are solutions of 22   ⊲ Projection of the subgraph
24: end function
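The test in lines 15 to 22 of Algorithm 5.2, whether the bipartite graph admits a matching of cardinality |A|, can be checked with a standard augmenting-path method. The sketch below uses Kuhn's algorithm as one convenient choice; the source does not say which matching procedure Mugnier and Chein use, and the dictionary encoding and function names are assumptions.

```python
def max_bipartite_matching(candidates):
    """candidates maps each vertex of A to the vertices of B it may be
    matched with (the edge set U). Returns a dict {b: a} describing a
    maximum matching found by repeated augmenting-path search."""
    match_of_b = {}

    def try_assign(a, visited):
        for b in candidates[a]:
            if b in visited:
                continue
            visited.add(b)
            # b is free, or its current partner can be moved elsewhere.
            if b not in match_of_b or try_assign(match_of_b[b], visited):
                match_of_b[b] = a
                return True
        return False

    for a in candidates:
        try_assign(a, set())
    return match_of_b


def admits_full_matching(candidates):
    """True when every vertex of A is matched to a distinct vertex of B."""
    return len(max_bipartite_matching(candidates)) == len(candidates)
```

For instance, with A = {a1, a2}, where a1 may go to r1 or r2 and a2 only to r1, a full matching exists (a1 to r2, a2 to r1); if both may only go to r1, it does not, and the candidate c would be rejected.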
5.1.2 SCG Relation Projection
Madalina Croitoru’s new projection algorithm is based on SCGs as described in
her two 2004 papers [22, 21]. This algorithm starts from the foundational
algorithm by Mugnier and Chein [74] given in section 5.1.1. The decision question
associated with this new algorithm is the same as was stated in the Mugnier and Chein
1992 work [74] and is in the class of problems that are NP-complete. The significant
change applied to Algorithm 5.2 is to split the algorithm into two parts and to add a
preprocessing algorithm for each graph pair that looks for a matching graph, as defined by
Definition 4.1 (in [22], page 8). Before defining the matching graph, some added
definitions are needed:
Definition 5.1.4 1) $\lambda$ is a labeling of the nodes of an SCG $G$ with
elements from the support $S$ (see Section 3.2.2).
2) $d$ is the degree (or arity) of each node in the SCG $G$.
3) $N$ denotes the neighbour sets for the relation node (see Section 5.1.1).
Now for the actual definition:
Definition 5.1.5 Let $SG = (G, \lambda_G)$ and $SH = (H, \lambda_H)$ be two SCGs
without isolated concept vertices defined on the same support $S$.
The matching graph of $SG$ and $SH$ is the graph $M_{G \to H} = (V, E)$ where:
- $V \subseteq V_R(G) \times V_R(H)$ is the set of all pairs $(r, s)$ such that $r \in V_R(G)$,
$s \in V_R(H)$, $\lambda_G(r) \ge \lambda_H(s)$ and for each $i \in \{1, \dots, d_G(r)\}$, $\lambda_G(N^i_G(r)) \ge \lambda_H(N^i_H(s))$.
- $E$ is the set of all 2-sets $\{(r, s), (r', s')\}$, where $r \ne r'$, $(r, s), (r', s') \in V$
and for each $i \in \{1, \dots, d_G(r)\}$ and $j \in \{1, \dots, d_G(r')\}$ such that
$N^i_G(r) = N^j_G(r')$ we have $N^i_H(s) = N^j_H(s')$.
These matching graphs indicate which relation vertices should be used as potential
candidates for projection, thereby reducing the search space for the related search
problem. By using this preprocessing with the matching graphs, the projection of G →
H in its reduced form belongs to a class of problems in which finding the maximum
clique can be solved in polynomial time [22]. Therefore, the execution of the algorithm
gives a polynomial time algorithm for the NP-hard search problem.
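The construction in Definition 5.1.5 can be made concrete with a short Python sketch. The encoding is an assumption made for illustration: each SCG is a dictionary of concept labels and relation entries, the support order is a child-to-parent table, and relation pairs of different arity are skipped, which the definition leaves implicit.

```python
from itertools import combinations


def geq(general, specific, supertypes):
    """True if `general` equals `specific` or is one of its supertypes."""
    while specific is not None:
        if specific == general:
            return True
        specific = supertypes.get(specific)
    return False


def matching_graph(g, h, supertypes):
    """g and h have the form
    {"concepts": {id: label}, "relations": {id: (label, [concept ids])}}.
    Returns (V, E) following the two conditions of the definition."""
    vertices = []
    for r, (r_label, r_args) in g["relations"].items():
        for s, (s_label, s_args) in h["relations"].items():
            if len(r_args) != len(s_args) or not geq(r_label, s_label, supertypes):
                continue
            # The i-th neighbour of r must generalize the i-th neighbour of s.
            if all(geq(g["concepts"][cr], h["concepts"][cs], supertypes)
                   for cr, cs in zip(r_args, s_args)):
                vertices.append((r, s))

    edges = set()
    for (r, s), (r2, s2) in combinations(vertices, 2):
        if r == r2:
            continue
        r_args, s_args = g["relations"][r][1], h["relations"][s][1]
        r2_args, s2_args = g["relations"][r2][1], h["relations"][s2][1]
        # Wherever r and r2 share a neighbour, s and s2 must share one
        # at the same positions.
        compatible = all(
            s_args[i] == s2_args[j]
            for i in range(len(r_args))
            for j in range(len(r2_args))
            if r_args[i] == r2_args[j]
        )
        if compatible:
            edges.add(frozenset([(r, s), (r2, s2)]))
    return vertices, edges
```

A clique in the resulting graph that covers every relation of G then corresponds to a consistent choice of images for the relations, which is the search space reduction described above.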
5.1.3 Polyprojection
This section presents Mark Willems’s algorithm, explaining polyprojection and how it
relates to a CG projection algorithm, from his 1995 paper [136]. A polyprojection (from
Definition 5 in [136], page 282) is:
Definition 5.1.6 Consider two (conceptual) graphs $G = (C, R, \mathrm{type}, \mathrm{referent}, \arg_1, \dots, \arg_m)$
and $G' = (C', R', \mathrm{type}', \mathrm{referent}', \arg'_1, \dots, \arg'_m)$. A polyprojection
$\mu$ from $G$ to $G'$ is a pair of Cartesian product subsets $\mu_C \subseteq C \times C'$
and $\mu_R \subseteq R \times R'$ that are:
1. Type preserving: for all concepts $c \in C$ and $c' \in C'$, $c \mu_C c'$ only if
$\mathrm{type}(c) \ge \mathrm{type}'(c')$, and $\mathrm{referent}(c) = *$ or $\mathrm{referent}(c) \ge \mathrm{referent}'(c')$,
2. Type preserving: for all relations $r \in R$ and $r' \in R'$, $r \mu_R r'$ only if
$\mathrm{type}(r) \ge \mathrm{type}'(r')$,
3. Structure preserving: $\mu_R \circ \arg'_i = \arg_i \circ \mu_C$.
4. Non Empty: for all concepts $c \in C$ there is a concept $c' \in C'$ such that
$c \mu_C c'$.
It is said that $G'$ is structurally similar to $G$ if there is a polyprojection $\mu$ between $G'$
and $G$, and this will be written $G' \mu G$.
Given this definition, Willems goes on to show that a polyprojection can be
found by a polynomial algorithm. The algorithm is divided into two parts. The first
part computes steps 1 and 2 from Definition 5 and finds the structure
Type-preserving(G, G′) (see Algorithm 1 in [136], page 283). The second part computes
step 3 from Definition 5 and finds a polyprojection through the use of
Structure-preserving(M) (see Algorithm 2 in [136], page 284), where $M_0 \subseteq$ Type-preserving(G, G′),
and determines a pair of sets $M \subseteq M_0$ that is structure-preserving; that is,
$M = (\{(c_1, c'_1), \dots, (c_n, c'_m)\}, \{(r_1, r'_1), \dots, (r_o, r'_p)\})$, where n = # of concept vertices in G,
m = # of concept vertices in G′, o = # of relation vertices in G, and p = # of relation
vertices in G′. The actual execution time of the algorithm is not given; the only statement
is that it is a polynomial result.
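One plausible reading of this two-part scheme is a type filter followed by iterative pruning to a fixed point. The Python sketch below follows that reading only; it is not a transcription of Willems's Algorithms 1 and 2. Referents are ignored, the pruning rules are an interpretation of the structure-preserving condition, and the graph encoding and all function names are assumptions.

```python
def make_geq(supertypes):
    """Build a `general >= specific` test from a child -> parent table."""
    def geq(general, specific):
        while specific is not None:
            if specific == general:
                return True
            specific = supertypes.get(specific)
        return False
    return geq


def type_preserving(g, g2, geq):
    """Part 1: all concept pairs and relation pairs compatible by type.
    Graphs have the form
    {"concepts": {id: type}, "relations": {id: (type, [concept ids])}}."""
    mu_c = {(c, c2)
            for c, t in g["concepts"].items()
            for c2, t2 in g2["concepts"].items() if geq(t, t2)}
    mu_r = {(r, r2)
            for r, (t, args) in g["relations"].items()
            for r2, (t2, args2) in g2["relations"].items()
            if len(args) == len(args2) and geq(t, t2)}
    return mu_c, mu_r


def structure_preserving(g, g2, mu_c, mu_r):
    """Part 2: prune pairs until relation pairs and concept pairs support
    each other. Returns (mu_c, mu_r), or None if some concept of g is
    left without a partner (the non-empty condition fails)."""
    mu_c, mu_r = set(mu_c), set(mu_r)
    changed = True
    while changed:
        changed = False
        # A relation pair survives only if all its argument pairs survive.
        for r, r2 in list(mu_r):
            args, args2 = g["relations"][r][1], g2["relations"][r2][1]
            if any((a, a2) not in mu_c for a, a2 in zip(args, args2)):
                mu_r.discard((r, r2))
                changed = True
        # A concept pair (c, c2) survives only if each relation touching c
        # at position i is paired with a relation touching c2 at position i.
        for c, c2 in list(mu_c):
            for r, (_, args) in g["relations"].items():
                for i, a in enumerate(args):
                    if a == c and not any(
                        (r, r2) in mu_r and g2["relations"][r2][1][i] == c2
                        for r2 in g2["relations"]
                    ):
                        mu_c.discard((c, c2))
                        changed = True
    if {c for c, _ in mu_c} != set(g["concepts"]):
        return None
    return mu_c, mu_r
```

Each pass removes at least one pair or stops, so the loop ends after a polynomial number of passes in the sizes of the two graphs, which is consistent with the polynomial result stated above.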
The algorithm described above is reminiscent of the one given in Reyner’s work
[103] (see page 284 in [136]). Therefore, if both G and G′ are trees, the polyprojection
GµG′ is a projection of G onto G′ by Corollary 8 (see page 283 in [136]). Willems
goes on to state in Theorem 10 (see page 285 in [136]) that if there is a polyprojection
TµG′ where T is a tree, then there is a projection T → G′. This is significant because
Garey and Johnson, on pages 104-106 of [42], indicate that the sub-problem of sub-tree
isomorphism called sub-forest isomorphism is NP-complete. The sub-forest
isomorphism problem is: given two graphs G and H, determine whether H is isomorphic to a
subgraph in G, where G is required to be a tree, but H is a forest. However, in this
case H may be a cyclic graph, and given that G is a tree, a polynomial time algorithm
can be determined. Willems’s demonstration that a polynomial time algorithm can be found for
detecting the structure of a projection graph helped in the design of the new algorithm
seen in section 5.2.2.
5.1.4 Notio Projection
The Notio project is a conceptual graph implementation with a well-defined API
[117]. It is currently being used by several projects [30, 10, 99] for working with basic
reasoning operations with a CG KB. Algorithm 5.3 is this author’s theoretical algorithm
derived from the Notio implementation code [117, 115] for Southey’s injective
projection algorithm (note: Southey never wrote any analysis papers or documentation
on the actual implemented algorithm).
It should be noted that in Algorithm 5.3 the vertices are all labeled, but the edges
are directed. Also, for the analysis of the execution times, the following
definitions of variables hold:
Definition 5.1.7 Variable definitions:
$|M_c|$ = # of concepts in the KB graph
$|M_r|$ = # of relations in the KB graph
$|Q_c|$ = # of concepts in the query graph
$|Q_r|$ = # of relations in the query graph
$|Q_e|$ = # of edges in the query graph
$|N|$ = # of graphs in the KB
$|KB_c|$ = # of concepts in the whole KB
As can be seen in the stated algorithm, in step 1 Notio collects all the concept and
relation vertices from both the KB graph and query graph. This takes $O(|M_c| + |M_r| + |Q_c| + |Q_r|)$.
In step 2, Notio attempts to see if any of the concept vertices from
the KB graph maps to a concept vertex in the query graph, and in this way tests whether
there is any possible subgraph isomorphism of the KB graph onto the query graph.
In the best case this step is bounded by $O(|M_c||Q_c|)$; in the worst case by
$O(|M_c||Q_c||KB_c|)$; and in the expected case by $O(|M_c||Q_c||\log(KB_c)|)$.
In step 13, Notio (if a possible mapping was indicated from step
2) will attempt to match all the relation vertices from the KB graph (along with their
neighboring concepts along their edges) onto query graph vertices with the same edge
relationships. Once a match is found for a relation vertex in the query graph, only the
query relation vertices not yet mapped are examined from then on. At the end of this step,
it is checked that all relation vertices for the query graph were mapped. In the best case
this step is bounded by $O(|M_r||Q_r||M_c||Q_c| + |Q_e|)$, with the arity being binary (so it
is just a constant).
Algorithm 5.3 Notio Projection
 1: Get all concept and relation vertices from the KB and Query graphs
 2: for i ← 0, numfirstconcepts do                ⊲ all concepts in KB graph
 3:     for j ← 0, numsecondconcepts do           ⊲ all concepts in Query graph
 4:         foundmatch ← false
 5:         if (type(c_i) == type(c_j)) || (supertype(c_i) == type(c_j)) then
 6:             if (individ(c_i) == individ(c_j)) || (individ(c_j) == ∅) then
 7:                 foundmatch ← true             ⊲ match all concepts in query graph
 8:             end if
 9:         end if
10:
11:     end for
12: end for
13: if foundmatch == true then
14:     for i ← 0, numfirstrelations do           ⊲ all relations in KB graph
15:         for j ← 0, numsecondrelations do      ⊲ all relations in Query graph
16:             if (!relation[j].mapped) && (type(r_i) == type(r_j)) then
17:                 if match from r_j to match to each of its concepts then
18:                     relation[j].mapped = true ⊲ repeat line 2 for all
19:                 end if
20:             end if
21:
22:         end for
23:     end for
24:     foundmatch ← true
25:     for j ← 0, numsecondrelations do
26:         if !relation[j].mapped then
27:             foundmatch ← false
28:         end if
29:     end for
30: end if
31: if foundmatch == true then
32:     P ← build new subgraph projection
33:     return P                                  ⊲ return new projection
34: else
35:     return ∅                                  ⊲ no projection returned
36: end if
In the worst case step 13 is bounded by $O(|M_r||Q_r||M_c||Q_c||KB_c| + |Q_e||Q_r|)$,
when the query graph is fully connected; and in the expected case by
$O(|M_r||Q_r||M_c||Q_c||\log(KB_c)| + |Q_e|)$, again with the arity being binary and only
having to go to the hierarchy the height number of times. In step 31, if a projection is
found, it is returned.
Therefore, step 13 is the leading step in the overall running time, so the best case for
finding a projection for all the graphs in the KB, where the number of graphs is $|N|$,
would have a lower bound of $O((|M_r||Q_r||M_c||Q_c| + |Q_e|)\,|N|)$. Consequently, when the
number of graphs in the KB is small, the number of vertices in the KB graphs is small, and
the number of vertices in the query graph is small, the execution time would move towards
$O(n^3)$, where $n$ = avg # of nodes in the KB graphs. As the KB grows in size and as
the number of vertices in the KB graph and query graph increases, the expected run-time
becomes explosive even though not out of P. However, the worst case bound for the
whole KB is very close to the worst case bound given for Ullman’s algorithm above:
$O((|M_r||Q_r||M_c||Q_c||KB_c| + |Q_e||Q_r|)\,|N|)$.
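To get a feel for how the three step-13 bounds diverge, the expressions inside the O-notation can be evaluated for sample sizes. The sketch below is plain arithmetic on those expressions, not a timing model; the function names, the sample sizes, and the use of the natural logarithm (the source does not state a base) are assumptions.

```python
import math


def notio_step13_bounds(m_c, m_r, q_c, q_r, q_e, kb_c):
    """Evaluate the best, expected, and worst case expressions for step 13
    using the variables of Definition 5.1.7 (sizes, not times)."""
    base = m_r * q_r * m_c * q_c
    best = base + q_e
    expected = base * math.log(kb_c) + q_e  # log base is an assumption
    worst = base * kb_c + q_e * q_r
    return best, expected, worst


def over_whole_kb(bound, n_graphs):
    """Scale a per-graph expression by the number of graphs |N| in the KB."""
    return bound * n_graphs


# Hypothetical sizes: a KB graph with 5 concepts and 4 relations, a query
# with 2 concepts, 1 relation, and 2 edges, and 100 concepts in the whole KB.
best, expected, worst = notio_step13_bounds(5, 4, 2, 1, 2, 100)
```

For these sample sizes the best case expression evaluates to 42 and the worst case to 4002 per graph, so even a modest hierarchy lookup cost separates the two by about two orders of magnitude before multiplying by |N|.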
5.2 New Algorithms
After examining the above algorithms it was discovered that even though the
running times were acceptable with small graphs and small numbers of graphs,
the actual algorithms were either not truly general, as with SCGs, or had very poor
execution times with large data sets. With an SCG set of graphs, the user was confined
by what parts of a valid conceptual graph could be present in the data. The desire to
allow the user to give a directed, connected, bipartite conceptual graph (see Definition
3.4.3) that was cyclic and contained actors prompted new projection and maximal join
algorithms to be designed.
5.2.1 Supporting Information
In order to produce new algorithms, new data structures and supporting routines
were needed. Because the author believes that the connection between the algorithm
and data structures in the KB is critical, the new data structures and variables need to
be designed around the actual supporting routines.
5.2.1.1 Variables and Given values
Evaluating all the past projection algorithms, and looking at the data
structures used for each knowledge base, the author has discovered that handling conceptual
graphs as triples, as opposed to vectors or linked lists, makes the operation of projection
much easier and cleaner to process. This author is not the first researcher to think about
using triples. Kabbaj and Moulin in 2001 [58] looked at CG operations using a
bootstrapping step. It was at this time that they also looked at defining the join operation
using triples as part of the matching data structure. Even as recently as 2006, Skipper and
Delugach [113] looked at using triples again in the data structure for the storage of
graphs. However, in both cases, they did not look at exploiting the triples in the actual
algorithm of the operation.
All conceptual graphs in the KB and the query graph are stored not only with
the general conceptual graph information, but also with a C-R-C list and C-A-C list in
a cs-triple format. Their definitions are given below:
• cs-triple is a 3-tuple, $T = \langle c_i, b, c_j \rangle$, where $c_i, c_j$ are concept nodes and $i \neq j$;
$b$ is a conceptual relation (either a relation or actor node),
$(c_i, b) \in E$ and $(b, c_j) \in E$, and $c_i$ and $c_j$ are members of the signature of $b$.
• defining labels are all elements in a data structure that hold a unique label; that
includes concepts, relations, actors, and cs-triples
• c-r-c list is a concept-relation-concept list that holds cs-triple information in
which the ‘b’ in the 3-tuple is a relation node
• c-a-c list is a concept-actor-concept list that holds cs-triple information in which
the ‘b’ in the 3-tuple is an actor node
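The definitions above can be pictured with a small data-structure sketch. This is only an illustration under stated assumptions: the class and field names (`Node`, `CSTriple`, `StoredGraph`, `link_is_actor`) are hypothetical and are not taken from the implemented systems, and the signature check on $b$ is left out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Node:
    label: str      # unique defining label, e.g. "c1" (hypothetical naming)
    type_name: str  # concept, relation, or actor type


@dataclass(frozen=True)
class CSTriple:
    label: str      # cs-triples carry a defining label as well
    source: Node    # c_i
    link: Node      # b: a relation node or an actor node
    target: Node    # c_j
    link_is_actor: bool = False

    def __post_init__(self) -> None:
        # The definition requires i != j, i.e. two distinct concept nodes.
        if self.source.label == self.target.label:
            raise ValueError("cs-triple needs two distinct concept nodes")


@dataclass
class StoredGraph:
    crc_list: List[CSTriple] = field(default_factory=list)  # b is a relation
    cac_list: List[CSTriple] = field(default_factory=list)  # b is an actor

    def add(self, triple: CSTriple) -> None:
        # Route the triple by the kind of its middle node.
        if triple.link_is_actor:
            self.cac_list.append(triple)
        else:
            self.crc_list.append(triple)
```

Keeping the two lists separate means a routine that only cares about relations can scan the c-r-c list without filtering out actor triples.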
During the performance of the projection operation, two added data structures are used.
One data structure, called the match list, holds the matching possibilities of the query
concepts with the KB graph concepts, and the second structure, called the anchor list,
holds the matching triples from the KB graph for each concept in the query graph. These
data structures improve performance by making preprocessed information available at
the time of creating and building the actual projection graphs. The implementation of
these data structures will be defined when describing the experimentation systems (see
Section 6.4).
5.2.1.2 Actual Supporting Routines
Because the conceptual information is the structural foundation of a conceptual
graph and because the relationships between the concepts define the meaning of
the graph, the new supporting routines defined in Algorithms 5.4, 5.5 and
5.6 have been designed around the cs-triple relationship of C-R-C. The main supporting
routines are: MatchHierarchy, MatchConcept, MatchConcepts, MatchTriple, and Projection.
They are the foundation behind the projection operation, and these routines
will help in determining the projection operation’s worst case and typical case execution
time.
5.2.1.3 Worst Case Analysis for Support Routines
Using the support routines defined in Algorithms 5.4, 5.5 and 5.6, the worst
case execution time will be evaluated.
MatchHierarchy:
The type hierarchy is depicted as a tree of relationships, such that the maximum depth
of the tree is just all concepts from the top, ⊤, to the bottom, ⊥. Therefore, in the worst
case, matching the given input concept type requires traversing the whole tree, which is
linear time, $O(n)$, where $n$ is the number of concepts in the type hierarchy.
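A minimal sketch of the hierarchy check may make the linear bound concrete. It assumes the type hierarchy is stored as a child-to-parent dictionary (a hypothetical layout, not the one used in the implemented systems) and it omits the match-list update that MatchHierarchy performs in Algorithm 5.4. Each call walks at most one root-ward path, so a degenerate hierarchy that forms a single chain costs $O(n)$ steps.

```python
from typing import Dict, Optional


def is_supertype(candidate: str, concept_type: str,
                 parent: Dict[str, Optional[str]]) -> bool:
    """Walk from concept_type toward the top, looking for candidate."""
    current = parent.get(concept_type)
    while current is not None:
        if current == candidate:
            return True
        current = parent.get(current)
    return False


def match_hierarchy(q_type: str, n_type: str,
                    parent: Dict[str, Optional[str]],
                    check_supertype: bool = True) -> Optional[str]:
    """Return n_type on a hierarchy match, otherwise None."""
    if check_supertype:
        found = is_supertype(q_type, n_type, parent)   # check up the hierarchy
    else:
        found = is_supertype(n_type, q_type, parent)   # q is a subtype of n
    return n_type if found else None
```

With a hypothetical hierarchy `{"Entity": None, "Object": "Entity", "Block": "Object", "RedBlock": "Block"}`, the call `match_hierarchy("Object", "RedBlock", parent)` returns `"RedBlock"` because `Object` lies on the path from `RedBlock` to the top.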
Algorithm 5.4 Supporting Projection Routines
1: function MATCHHIERARCHY(q_i, n_j)    ⊲ q ∈ Q and n ∈ G
2:     foundmatch = false
3:     if check flag for supertype then
4:         check to see if q_i is a supertype of n_j    ⊲ check up hierarchy
5:         if q_i is supertype of n_j then
6:             foundmatch = true
7:         end if
8:     else
9:         check to see if q_i is a subtype of n_j    ⊲ check down hierarchy
10:         if q_i is subtype of n_j then
11:             foundmatch = true
12:         end if
13:     end if
14:     if foundmatch = true then
15:         add to match list
16:         return n_j    ⊲ return n_j as a match
17:     else
18:         return NULL    ⊲ return NULL as no match
19:     end if
20: end function    ⊲ Check if concept match in hierarchy
21: function MATCHCONCEPT(q_i, n_j)    ⊲ q ∈ Q and n ∈ G
22:     if check match list for q, n match then
23:         return n_j    ⊲ return n_j as a match
24:     else
25:         if type(q_i) == type(n_j) then
26:             M ← {q_i, n_j} as match
27:             return n_j    ⊲ return n_j as a match
28:         else
29:             return MatchHierarchy(q_i, n_j)    ⊲ Check if match in hierarchy
30:         end if
31:     end if
32: end function    ⊲ Check if concepts match
33: function MATCHCONCEPTS(q_i, G)    ⊲ q ∈ Q and G ∈ KB
34:     for each n_j ∈ L, where j = 1 to c(G) do    ⊲ L is a list in G
35:         C ← MatchConcept(q_i, n_j)
36:     end for
37:     return C    ⊲ All matching concepts from KB graph to Query graph concept
38: end function
Algorithm 5.5 Supporting Projection Routines (Cont1)
1: function MATCHTRIPLE(t_a, s_b, directionp)    ⊲ t ∈ Q, s ∈ G and
2:     ⊲ directionp is a BOOLEAN
3:     if (directionp == true) && ((direction from t_a) == -1) then
4:         match ← false
5:     end if
6:     match ← Compare relation type of t_a to relation type of s_b
7:     if match == true then
8:         match ← Compare MatchConcept(t_a, c_b, s_b, c_b) != NULL
9:     else
10:         match ← false
11:     end if
12:     if ((match == true) && (directionp == false)) then
13:         match ← Compare (direction from t_a == direction from s_b)
14:     else
15:         match ← false
16:     end if
17:     if match == true then
18:         return true    ⊲ Indicate two triples are a match
19:     else
20:         return false    ⊲ No triple match
21:     end if
22: end function
MatchConcept:
This routine must first check whether the pair of the query concept, $q_i$, and the match
concept, $n_j$, is already in the match list; in the worst case this takes time $O(c \cdot m)$, where $c$ is the
number of concepts in the query graph, $Q$, and $m$ is the number of concepts in the
match graph, $G$. If this check fails, the next step is to compare $q_i$ and $n_j$ for a match in
both concept type and referent. This is a constant time operation. If this succeeds, then
adding to the match list is in the worst case $O(c \cdot m)$; if not, then the worst case running
time will be $O(n)$, which is the height of the type hierarchy tree. Overall, the total worst
case running time for this routine would be $O(c \cdot m + n)$.
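The three cost terms in this analysis can be seen in a short sketch of the routine. It is an illustration only, under several assumptions: concepts are `(label, type)` pairs, the match list is a plain Python list (so the membership test is a linear scan over at most $c \cdot m$ pairs), the hierarchy is a child-to-parent dictionary, referents are ignored, and only the "query type is a supertype" direction is checked. None of these representation choices come from the implemented systems.

```python
from typing import Dict, List, Optional, Tuple

Concept = Tuple[str, str]  # (defining label, type name); hypothetical layout


def match_concept(q: Concept, n: Concept,
                  match_list: List[Tuple[Concept, Concept]],
                  parent: Dict[str, Optional[str]]) -> Optional[Concept]:
    # 1) Scan of the match list: up to c*m stored pairs.
    if (q, n) in match_list:
        return n
    # 2) Constant-time comparison of the concept types.
    if q[1] == n[1]:
        match_list.append((q, n))
        return n
    # 3) Fallback: walk up the hierarchy from n's type, at most n steps.
    current = parent.get(n[1])
    while current is not None:
        if current == q[1]:
            match_list.append((q, n))
            return n
        current = parent.get(current)
    return None
```

A set or dictionary keyed on the pair would make step 1 constant time on average; the list is used here only because it mirrors the $O(c \cdot m)$ term in the analysis.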
Algorithm 5.6 Supporting Projection Routines (Cont2)
1: function PROJECTION(i, W, G, Pset)    ⊲ i, W ∈ Q
2:     t ← Number of elements in the q_i list ∈ W
3:     z ← Size of Pset
4:     if (i == 1) then
5:         for each s_a ∈ q_i, where a = 1 to t do
6:             Pset ← AddNewProjection(s_a, G, Pset)    ⊲ Starts Projection Graph
7:         end for
8:     else if (t == 1) then
9:         s_1 ← only element of q_i list
10:         AddToExistingProjection(s_1, G, Pset)    ⊲ Add to existing Projection Graph
11:     else
12:         Pset′ ← ∅
13:         for each s_a ∈ q_i, where a = 1 to t do
14:             Pset′ ← ProcessProjection(s_a, G, Pset, Pset′)    ⊲ Process Proj Triple
15:         end for
16:         Pset ← Pset ∪ Pset′
17:     end if
18:     return Pset    ⊲ Return created and modified Projection Graphs
19: end function
MatchConcepts:
This routine will process all the concepts in the match graph, $G$, where $m$ is the number
of concepts in $G$. Since the routine MatchConcept is called to process each concept, and
its worst case running time is known to be $O(c \cdot m + n)$, the total worst case time
for this routine would be $O(m \cdot (c \cdot m + n))$.
MatchTriple:
Within this routine, the driving step would be step 8. This step of the algorithm
calls the routine MatchConcept, whose worst case running time is known to be
$O(c \cdot m + n)$. Therefore, in the worst case this routine would also be $O(c \cdot m + n)$.
Projection:
This is the routine for creating and building the new projection graphs where there is
a structural match after finding the matching cstriples between the two graphs. Within
this routine are three major steps that depend on the processing of the anchor list: 1)
when processing the first concept in the anchor list; 2) when only one related triple matches
for the concept in the anchor list; and 3) when neither of the first two conditions exists. The
driving section of the algorithm in this routine is this third type of processing. As
can be seen at step 14 of the algorithm, this step calls the routine ProcessProjection.
ProcessProjection checks to see if a new projection graph has to be started by copying
an existing projection or if an existing projection graph can just add the current cstriple
being processed in the For Loop. The easier of the two functions is to add to an existing
projection, but time must be taken to find which projection graph to add to; from the
algorithm it can be seen that this takes time $z$, which is the size of Pset, or the number of projections.
The more complex modification would be to copy an existing projection graph
in order to add the new cstriple being processed. It was just seen that to add a cstriple
is time $z$, but at each step within this processing there is also the time needed to
copy a projection graph, which will be called $d$, multiplied by the number of projection graphs
that must be copied, which is $t$. Therefore the worst case time for this step would be
$O(z \cdot t \cdot d)$. It should be noted that the size of Pset, which is $z$, would be growing much
faster than the time needed for copying, $d$; therefore, $d$ can be dropped from the running
time, leaving $O(z \cdot t)$.
There is a relationship between $t$, the number of triple matches for this concept
in the query graph, and $z$, the size of Pset; that is, in the worst case $z = t^{i-1}$. During the
processing of this routine, if all triple matches lead to a new projection graph, then the
number of projection graphs currently in Pset will be the number of all triple matches
currently processed from the anchor list, or $t^{i-1}$. On replacement of $z$, one gets a new
worst case running time of $O(t^{i-1} \cdot t)$, or just $O(t^i)$.
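The growth of Pset in this worst case can be reproduced with a toy sketch. It assumes, purely for illustration, that every triple match of every anchor-list entry forces a copy of every existing projection; a projection is modeled as a tuple of triple labels, and the function name `grow_projection_set` is hypothetical.

```python
from typing import List, Sequence, Tuple


def grow_projection_set(anchor_list: Sequence[Sequence[str]]) -> List[Tuple[str, ...]]:
    """Worst case: each match extends a fresh copy of each projection."""
    pset: List[Tuple[str, ...]] = []
    for i, matches in enumerate(anchor_list, start=1):
        if i == 1:
            # First anchor-list entry: every match starts a new projection.
            pset = [(s,) for s in matches]
        else:
            # Later entries: copy each of the z projections once per match,
            # which is the O(z * t) step from the analysis.
            pset = [p + (s,) for p in pset for s in matches]
    return pset
```

With $t = 3$ matches per entry, the set holds 3, 9, 27, and 81 projections after entries 1 through 4, matching $z = t^{i-1}$ before entry $i$ is processed and $t^i$ afterwards.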
5.2.2 New Projection
As seen in Algorithm 5.7, the new projection of the query graph onto the KB
is based on looking at all triples that are in the query graph and checking for a complete
subgraph match of the query graph onto the KB graph. Because each triple in the query
graph is unique, even if the node type is not, all projections can be found in the KB
graph.
Algorithm 5.7 New Projection
1: function NEWPROJECTION(Q, KB)    ⊲ Query and KB graphs
2:     P = ∅
3:     for each G ∈ KB do    ⊲ All graphs in KB
4:         W ← A list from Q    ⊲ Preprocessing
5:         for each q_i ∈ W, where i = 1 to c(W) do
6:             if ((M ← MatchConcepts(q_i, G)) > ∅) then
7:                 for each n_j ∈ M, where j = 1 to M do
8:                     match = false
9:                     for each t_a ∈ Q do
10:                         ⊲ where a = 1 to the # of cs-triples in crc list for q_i
11:                         for each s_b ∈ G do
12:                             ⊲ where b = 1 to the # of cs-triples in crc list for n_j
13:                             if MatchTriple(t_a, s_b, true) == true then
14:                                 add (n_j, (s_b, t_a)) to q_i ∈ W
15:                                 match = true
16:                             end if
17:                         end for
18:                     end for
19:                     if match == false then
20:                         break out of loop and start next graph in KB
21:                     end if
22:                 end for
23:             else
24:                 break out of loop and start next graph in KB
25:             end if
26:         end for
27:         Pset = ∅    ⊲ Projection processing
28:         for each q_i ∈ W, where i = 1 to c(W) do
29:             Pset = Projection(i, W, G, Pset)
30:         end for
31:         P ← P ∪ Pset
32:     end for
33:     return P    ⊲ Set of projections from query onto KB
34: end function
5.2.2.1 Actual Algorithm
The overall algorithm (see Algorithm 5.7) for the projection of the query graph
onto the KB checks for a complete subgraph match of the query graph onto the KB
graph during preprocessing. Because each triple in the query graph is unique, even if the
node type is not, all projections can be found in the KB graph. Then, after all matches of
conceptual units and triples are found, the actual projection graphs are built. However,
because the temporary data structures are saved from the preprocessing, matching does
not have to happen again at build time. The actual projection just uses the match list
and anchor list already created to build up or create the new projection graphs. Because
the anchor list contains all available projections, both injective and non-injective
(homomorphism) projections are found.
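The split between matching and building can be illustrated with a simplified sketch. It is not the NewProjection algorithm itself: here the anchor list is kept per query triple rather than per query concept, concept types must be equal (no hierarchy lookup), direction handling is reduced to the order of the triple, and all names (`build_anchor_list`, `build_projections`) are hypothetical. What it does show is that the build phase reuses the saved candidates and never re-runs the type matching.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (source concept label, relation type, target concept label)
Mapping = Dict[str, str]       # query concept label -> KB concept label


def build_anchor_list(q_triples: List[Triple], q_types: Dict[str, str],
                      g_triples: List[Triple], g_types: Dict[str, str]
                      ) -> Optional[List[List[Triple]]]:
    """Matching phase: candidate KB triples for every query triple."""
    anchor_list = []
    for qs, qr, qt in q_triples:
        candidates = [
            (gs, gr, gt) for gs, gr, gt in g_triples
            if gr == qr and g_types[gs] == q_types[qs] and g_types[gt] == q_types[qt]
        ]
        if not candidates:
            return None  # stop early: no projection is possible
        anchor_list.append(candidates)
    return anchor_list


def build_projections(q_triples: List[Triple],
                      anchor_list: List[List[Triple]],
                      injective: bool = False) -> List[Mapping]:
    """Build phase: extend partial mappings using only the saved candidates."""
    partial: List[Mapping] = [{}]
    for (qs, _, qt), candidates in zip(q_triples, anchor_list):
        extended = []
        for mapping in partial:
            for gs, _, gt in candidates:
                # Skip candidates that contradict an earlier assignment.
                if mapping.get(qs, gs) != gs or mapping.get(qt, gt) != gt:
                    continue
                new_mapping = {**mapping, qs: gs, qt: gt}
                if injective and len(set(new_mapping.values())) < len(new_mapping):
                    continue
                extended.append(new_mapping)
        partial = extended
    return partial
```

For a blocks-world KB graph with triples `a Above b` and `b Above c`, the two-triple query `x Above y`, `y Above z` yields the single mapping `{x: a, y: b, z: c}`. A query whose relation type does not occur in the KB graph is rejected in the matching phase, before any projection is built.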
5.2.2.2 Execution Time
Now that the algorithm is split into two sections, there is a running time for
answering the decision question of whether or not there is a projection, which will be called
the matching algorithm, and a running time for the actual projection. For the new
algorithms, three modifications have been made that affect the execution time of the
projection operation: 1) all nodes and triples are uniquely labeled, 2) the edges are not
labeled, but do have implied labeling through their directionality within the triples, and
3) the triples are not only part of the data structure of the KB, but also directly affect
the actual projection algorithm. The labeling drives the execution time of the matching
algorithm, when doing an injective projection, toward the running time for a subgraph
‘labeled’ isomorphism problem, which can be solved in polynomial time, as opposed
to a straight subgraph isomorphism problem, which is known to be NP-complete. The
triples allow the matching algorithm to stop sooner when no projection is possible.
For the actual projection creation, the number of triples in the query graph drives
the amount of time needed for the actual projection. The size of the graphs in the KB
affects the base of the execution time, but the number of times the Projection function
is executed is based on the number of triples in the query graph.
5.2.2.3 Worst Case Analysis for Projection
The actual projection operation algorithm is broken down into two steps: Preprocessing
(mapping of concepts) and Projection (structural build of new projection
graphs). Within the preprocessing step, the ‘forward’ concepts from the query graph,
$H$, that are in the anchor list, $W$, are unified (or matched) to concepts in the match graph,
$G$ (see steps 9 and 11 of Algorithm 5.7). Because in the worst case the number of ‘forward’ concepts in $H$
is equal to the total number of concepts in C minus 1, from now on in this analysis the
number of elements in $W$ will be seen as the number of concepts in $H$. Since in the
worst case the number of concepts in $H$ is equal to the number of concepts in $G$,
the number of concepts in $H$ will be called $m$. For the rest of the
preprocessing step, it will be recognized that there are four nested For Loops, each
being connected to the value of $m$. Two of the four loops will be executed $m$
times with constant time internal processing. The second For Loop involves
a call to MatchConcepts (step 6), which has already been seen to have the worst case running
time of $O(m \cdot (c \cdot m + n))$. Assuming in the worst case that $c = m$, expanding this
time gives $O(m^3 + mn)$, or just $O(m^3)$ because $n$ can never be greater than $m$. The
fourth For Loop calls the routine MatchTriple, which has the worst case running time of
$O(c \cdot m + n)$, or $O(m^2)$ by the previous reasoning. This would give a worst case
running time for the matching processing of $O(m^8)$.
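The simplification steps in this paragraph can be written out as a short worked derivation. The first two lines follow directly from the stated assumptions $c = m$ and $n \le m$. The last line is only one possible accounting that reproduces the stated $O(m^8)$ bound; the way the four loop factors combine is an assumption made here, since the text does not spell it out.

```latex
\begin{align*}
T_{\mathrm{MatchConcepts}} &= O\bigl(m \cdot (c \cdot m + n)\bigr)
   = O(m^3 + mn) = O(m^3)
   && (c = m,\ mn \le m^2) \\
T_{\mathrm{MatchTriple}} &= O(c \cdot m + n)
   = O(m^2 + n) = O(m^2)
   && (c = m,\ n \le m) \\
T_{\mathrm{matching}} &= \underbrace{O(m) \cdot O(m)}_{\text{two constant-body loops}}
   \cdot \underbrace{O(m^3)}_{\text{MatchConcepts}}
   \cdot \underbrace{O(m) \cdot O(m^2)}_{\text{loop over MatchTriple}}
   = O(m^8)
   && \text{(assumed combination)}
\end{align*}
```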
The actual projection part loops around the support routine Projection. This
routine was discussed as having the worst case running time of $O(t^i)$, where $i = m$
when called from NewProjection. Given that the actual projection will loop through all
$m$ concepts, in the worst case the actual projection is $O(m \cdot t^m)$. Therefore, with the
overall NewProjection algorithm, the worst case is driven by the building of the actual
projection with the exponential factor on the number of concepts in the query graph.
5.2.3 New Maximal Join
As described in the Maximal Join operation section (see Section 4.2.6), more
than one node (or group of nodes) can be joined between two graphs. When these joins
happen, the two graphs are composed into a new graph with possibly more information
than the original input graph. However, the joining of the input graph across the KB
to produce maximal join graphs is not commutative [90] when semantic considerations
come into play. As with the projection algorithm, the overall algorithm (see Algorithm
5.8) is split into two parts.
Algorithm 5.8 New Maximal Join
1: function NEWMAXIMALJOIN(I, KB)    ⊲ Input and KB graphs
2:     J = ∅
3:     for each G ∈ KB do    ⊲ All graphs in KB
4:         foundmatch = false
5:         W ← A list from I    ⊲ Preprocessing
6:         for each q_i ∈ W, where i = 1 to c(W) do
7:             if ((q_i != null) && ((X ← MatchConcepts(q_i, G)) > ∅)) then
8:                 foundmatch = true
9:                 for each n_j ∈ X, where j = 1 to X do
10:                     for each t_a ∈ I do
11:                         ⊲ where a = 1 to the # of cs-triples in crc list for q_i
12:                         for each s_b ∈ G do
13:                             ⊲ where b = 1 to the # of cs-triples in crc list for n_j
14:                             if MatchTriple(t_a, s_b, false) == true then
15:                                 add (n_j, (s_b, t_a)) to q_i ∈ W
16:                             end if
17:                         end for
18:                     end for
19:                 end for
20:             end if
21:         end for
22:         Jset = ∅    ⊲ Join processing
23:         if foundmatch == true then
24:             for each n_j ∈ M, where j = 1 to |M| do
25:                 Jset = MaximalJoin(j, W, I, G, Jset)
26:             end for
27:         end if
28:         J ← J ∪ Jset
29:     end for
30:     return J    ⊲ Set of joins from input onto KB
31: end function
This new algorithm has the matching algorithm (checking for possible joins)
happening first, and then the actual joining of the two graphs to build the new maximal
join graph being performed second. This work will actually proceed as future work
using this algorithm as the starting point.
5.3 Typical Scenario Analysis for Projection Algorithms
Unlike the worst case analysis just evaluated for the projection algorithms, with
a typical query sent to a query-answer system, the query graph is much smaller than
the graphs in the knowledge base [100]. Basically, this comes about because the user is
trying to find a specific piece of data. Looking at the “blocks world” domain area (later
to be tested on implemented systems as seen in Chapter 7), one has a knowledge base
of graphs that represent blocks on a table. The user wishes to know information like “Is
there a red block in the graph?”, or “Is there a blue block above a red block?”. These
are very small graphs compared to the graphs in the knowledge base, which describe all the
blocks on a table and their relationships to each other, as well as all the
characteristics of the blocks on the table. Blocks world is a well
known planning problem [100].
5.3.1 Projection Algorithms using SCG
Both the SCG injective projection algorithms, Mugnier and Chein, and Croitoru,
have a direct tie between the matching part of the algorithm and the building part of the
algorithm. Also, both of these algorithms are built from the relation perspective, and there are
typically fewer relation nodes than concept nodes.
5.3.1.1 SCG Projection
On evaluation of the injective projection algorithm by Mugnier and Chein, given
a typical scenario of a much smaller query graph on few graphs in the KB, the execution
time is still bound by the fact that the matching of relations and their related concepts,
and the building of the image structure, are not separated. Therefore, the execution time
for the searching and building of the projection in this typical scenario has to match
every relation from the query graph onto all relations in the match graph at all iterations.
However, if there is no match, the structure of the subgraph does not have to be checked
any further from that root evaluation. When the typical scenario is very small and the
support depth is shallow, this algorithm performs well, but it quickly degrades as the
number of valid projections and support depth increase because of the re-evaluation of
the match each time.
5.3.1.2 SCG Relation Projection
Croitoru has a preprocessing phase to her algorithm to look for matches, and
then executes the build phase separately based on the number of relations in the query
graph. By doing the preprocessing phase with the matching through the search space,
the number of relations from $G$ that are candidates for projection is pruned. Therefore
the execution time for the building of the projection graph in this typical scenario is
$O(q_r \times g'_r)$, where $q_r$ = # of relations in $Q$ and $g'_r$ = the # of relations from $G$ that were
viable candidates.
5.3.2 Notio Projection
This typical case would match up to the analysis of the lower bound, $O(n^3)$,
for Notio as discussed in Section 5.1.4. Notio does the matching and building of
the projection in the same step without pruning the tree. However, Notio only finds
a single projection because all relations within a graph must be unique. Even though
Notio can work over full CGs, this constraint does reduce the search space during the
Notio algorithm execution.
5.3.3 New Projection
With this typical case, the new projection algorithm moves towards the best case
results possible from the algorithm. To evaluate the typical case using this algorithm,
first the support routines will be evaluated and then the new projection algorithm will
be looked at.
5.3.3.1 Typical Case for Support Routines
Using the support routines defined in Algorithms 5.4, 5.5 and 5.6, a foundation for
the typical case can be established by first examining these routines:
MatchHierarchy:
The type hierarchy is depicted as a tree of relationships, such that the maximum depth
of the tree is just all concepts from the top, ⊤, to the bottom, ⊥; but in a typical case
the tree is a broad tree and the depth of the tree is normally $\log(n)$, where $n$ is the
number of concepts in the type hierarchy. Therefore, in the typical case the time to
match to the given input concept type is $O(\log(n))$.
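The difference between the worst-case chain and a broad hierarchy can be checked with a small experiment. The sketch below assumes a child-to-parent dictionary over integer type ids and uses a complete binary tree as a stand-in for a "broad" hierarchy; both the layout and the helper names are hypothetical, and a real type hierarchy need not be balanced.

```python
from typing import Dict, Optional


def chain_hierarchy(n: int) -> Dict[int, Optional[int]]:
    """Degenerate hierarchy: every type has exactly one subtype."""
    return {k: (k - 1 if k > 0 else None) for k in range(n)}


def balanced_hierarchy(n: int) -> Dict[int, Optional[int]]:
    """Complete binary tree in heap order: parent of k is (k - 1) // 2."""
    return {k: ((k - 1) // 2 if k > 0 else None) for k in range(n)}


def steps_to_root(parent: Dict[int, Optional[int]], node: int) -> int:
    """Number of parent links followed from node up to the top type."""
    steps = 0
    while parent[node] is not None:
        node = parent[node]
        steps += 1
    return steps
```

For 1023 types, the deepest type in the chain is 1022 links from the top, while the deepest type in the balanced tree is 9 links away, which is the gap between the $O(n)$ and $O(\log(n))$ bounds.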
MatchConcept:
In the typical case the only step that would not be constant time would be matching
to the hierarchy. Since it was just shown that this running time is $O(\log(n))$, the
running time for this routine would be the same.
MatchConcepts:
Since to process the concepts the routine MatchConcept is called and its typical case
running time is known to be $O(\log(n))$, the total time for this routine would be
$O(m \cdot \log(n))$.
MatchTriple:
Again within this routine, the driving step would be step 8. This step of the algorithm
calls the routine MatchConcept, whose running time is shown to be $O(\log(n))$, which
would also be this routine’s typical running time.
Projection:
Since it was seen in the worst case analysis that this routine’s running time is connected
to the number of triples in each element of the anchor list, if only one match is available
for each query graph concept then only one projection would be produced and the running
time for this routine becomes linear in the number of concepts in the query graph.
5.3.3.2 Typical Case for New Projection Algorithm
In a typical query-answer scenario, where the query graph would normally
contain one to four triples compared to possibly a thousand in the KB graph,
this algorithm takes into account that the query graph is small. Because of that, the
time to do thousands of graphs in a KB is only multiplied by a constant based on the
maximum number of triples in a KB graph that the small query graph is projected onto.
The preprocessing part is again based on the number of concepts in the query
graph. However, for a typical scenario these would be small; probably not more than
eight concepts. Now if the four For Loops are evaluated, two of the loops become
constant time. The second For Loop involves a call to MatchConcepts (step 6),
which has running time of $O(m \cdot \log(n))$. The fourth For Loop calls the routine
MatchTriple with running time of $O(\log(n))$. Since, as stated before, $n$ would never be
greater than $m$, this would give a typical case running time for the matching processing
of $O(m^2 \cdot \log^2(m))$.
The actual projection part of the algorithm is multiplicative in the number of
projections available with this query graph. Since in the most common case there is
only one projection, the actual projection creation algorithm becomes polynomial (in
fact linear as seen in the Projection routine analysis).
The preprocessing part now becomes the driving step in the algorithm and shifts
the execution of the problem to one that is polynomial. Through this shift in search
problem performance, the running time for the projection operation for a typical
scenario within a query-answer application shows improvement.
CHAPTER 6
SYSTEMS/ENVIRONMENTS AND IMPLEMENTATIONS
This chapter discusses each example system’s basic features, as well as how it
is used in one or more of the previously defined knowledge representations, ontology
elements and ADTs.
6.1 Semantic Network Systems
For each semantic network system, the good points and features will be brought forward,
along with the drawbacks of each system. An attempt will be made to define each of
these good and bad features in a factual way.
6.1.1 KL-ONE
The KL-ONE language was originally formulated in Ron Brachman’s Ph.D.
dissertation at Harvard [66]. It was built into a system at Bolt Beranek and Newman
(BBN) by Woods and Schmolze [141].
The KL-ONE system was designed originally around the classic frame system.
As stated earlier, “frames” could be defined as a knowledge representation type all to
itself, but for this work they are classified as a sub-type of semantic networks. Typically
a frame will include an “isa” or “ako” pointer to a more general frame from which
additional slots can inherit [141]. KL-ONE forms a taxonomy hierarchy out of multiple
links of this type, therefore forming a partial ordering of concepts for inheritance.
Taxonomies were discussed in Section 3.2.
KL-ONE is made up of concepts, roles, and fillers. Structured concepts are elements
standing in specific relationships to each other [141]; roles are the entity names
for the relationships; and fillers are the structural conditions of the roles. Concepts are
represented in the semantic network by ovals, roles by circled squares, and structural
conditions by double ovals attached to a diamond-shaped lozenge (see Figure 6.1).
[Figure 6.1 diagram; recoverable node labels: Block, Arch, Noncontact, Support, Supporters, Supported Objects, Lintel (# = 1), Upright (# = 2), and two V/R links.]
Figure 6.1: A KL-ONE Diagram of a Simple ‘Blocks-World’ Arch (Based on [141], Figure 1).
Concepts can be generalized from other concepts. These super-concepts specify
a class of which the defined concept is a subclass. In this way, KL-ONE structures
depict a mapping of inheritance. An example from Woods’s work [141] is a concept
[appreciable debt obligation] which has super-concept links to two parents, [debt obligation]
and [appreciable asset]. This example illustrates the utility of multiple parents;
it is also directly represented within the semantic network.
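A small sketch can show how super-concept links with multiple parents form a partial ordering. It is not KL-ONE code: the dictionary layout and function names are hypothetical, and the grandparent concepts `obligation` and `asset` are invented for illustration, since only the three concepts named above come from the example in [141].

```python
from typing import Dict, List, Set

# concept -> its direct super-concepts
SUPER_CONCEPTS: Dict[str, List[str]] = {
    "appreciable debt obligation": ["debt obligation", "appreciable asset"],
    "debt obligation": ["obligation"],   # hypothetical parent
    "appreciable asset": ["asset"],      # hypothetical parent
    "obligation": [],
    "asset": [],
}


def ancestors(concept: str, supers: Dict[str, List[str]]) -> Set[str]:
    """All concepts reachable by following super-concept links upward."""
    seen: Set[str] = set()
    stack = list(supers.get(concept, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(supers.get(current, []))
    return seen


def has_super_concept(specific: str, general: str,
                      supers: Dict[str, List[str]]) -> bool:
    """True if general lies above specific in the link structure."""
    return general in ancestors(specific, supers)
```

Here `appreciable debt obligation` inherits through both parents, while `debt obligation` and `appreciable asset` remain unordered with respect to each other, which is what makes the ordering partial rather than total.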
Concepts may also be primitive in definition. That means that the collection of
super-concepts, roles and structural conditions is necessary, but not sufficient, to define
the concept. These concepts are indicated in the semantic network representation by
putting an asterisk by the oval [141]. Concepts may also be individual; that is, they are a
member of a set and not the set itself. Many times they are the instantiation of a generic
concept and are represented in the semantic network by diagonal shading inside the
concept.
Roles also have different forms of structure. Value restrictions on roles are concepts
that characterize constraints on possible role fillers [141]; they are shown by
an arrow coming from a role to the concept that applies the constraint. Number
restrictions may also be applied for the maximum and minimum number of allowed
fillers. These are seen in Figure 6.1 by the use of “# = <value>”; they may also be
a range such as “# <lower number or variable>, <upper number or variable>”. Roles
may also be “chained” together to produce an access path from the concept being defined
to the intended filler [141]; this is depicted by using a small triangle
between the structural condition diamond and the role. The chain is necessary because
it constrains the filler of the specific role. If roles are just linked together, a square is
used for the intermediate roles.
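The value and number restrictions described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not KL-ONE's actual representation: the class names `Role` and `Concept`, the helper `satisfies`, and the value-restriction concept "Block" are all hypothetical (Figure 6.1 only gives the number restrictions for Lintel and Upright).

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    name: str
    value_restriction: str  # concept every filler must belong to (the V/R link); assumed here
    min_fillers: int        # lower bound of the number restriction
    max_fillers: int        # upper bound of the number restriction


@dataclass
class Concept:
    name: str
    supers: list = field(default_factory=list)  # names of super-concepts
    roles: list = field(default_factory=list)


def satisfies(concept, fillers, type_of):
    """Check role fillers against the number and value restrictions of a concept.

    fillers: dict mapping a role name to a list of individual names
    type_of: dict mapping an individual name to its concept name
    Simplification: the value restriction is tested by name equality only,
    so subsumption between concepts is ignored.
    """
    for role in concept.roles:
        found = fillers.get(role.name, [])
        if not (role.min_fillers <= len(found) <= role.max_fillers):
            return False
        if any(type_of[f] != role.value_restriction for f in found):
            return False
    return True


# The arch of Figure 6.1: exactly one lintel and exactly two uprights.
block_arch = Concept(
    "BlockArch",
    roles=[Role("Lintel", "Block", 1, 1), Role("Upright", "Block", 2, 2)],
)
```

A candidate individual with one lintel and two uprights passes the check, while one with a single upright fails the number restriction.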
As discussed previously in this section, taxonomic structures are built into the
semantic structure of KL-ONE. This means that at the internal representation level, sub-
sumption and other terminological operations must be considered and at the ADT level
these operations must be available. Putting the taxonomy hierarchy inside the semantic
network was a deliberate act [141], but as discussed earlier, the taxonomy is part of
level 0 and is not supposed to be part of the semantics of the actual network. Therefore,
the classification operation is used to place new descriptions into the taxonomy at their
correct position [141], and the internal representation must be able to interact with the
semantic network representation when editing the network.
Within the internal representation level, KL-ONE makes a distinction between
terminological components and assertional components. The terminological components
are called “T-Boxes” and the assertional components are called “A-Boxes”. The T-Box is
responsible for specialized types of reasoning that follow from the structure of the terms, that
is, definitional information, whereas the A-Box is responsible for general reasoning and
provides factual information to the system. Later systems based on KL-ONE allowed
“hybrids” between these components [141]. Given the three types of ADT defined in
Section 6.3, the logical ADT would best represent this system.
An important goal of KL-ONE was to make useful KR services available, and
as the system was developed the expressive power of the system increased [141]. With
the development of the roles and fillers, the quantitative relationships were fully implemented;
however, this system did not provide for qualitative relationships. In fact,
frame-based systems are severely limited when dealing with procedural (qualitative
relationship) knowledge [137].
6.1.2 SNePS
SNePS is a system designed for representing the beliefs of a natural-language-using
intelligent system [110]. At the semantic network knowledge representation
level, it consists of nodes and labeled, directed arcs. The nodes are the terms or concepts
of the network and the arcs are like grammatical punctuation. All entities in all the
versions of SNePS are nodes [110]; the nodes are of four basic types: base nodes, variable
nodes, molecular nodes and pattern nodes. Base nodes represent some particular entity
within the network, while variable nodes represent arbitrary individuals, propositions,
etc. that are distinct from the rest of the network. Neither base nor variable nodes have
output arcs. Molecular nodes represent propositions, rules and “structured individuals”,
while pattern nodes are like open sentences or functional terms with free variables. Both
molecular and pattern nodes have input and output arcs and are structurally defined by
the arcs. Every node has an identifier and base nodes may be identified by the user (all
others are system generated identifications).
The arcs were defined differently for different versions of SNePS. Within the
current version, there are two types of arcs: descending and ascending. The arcs represent
relationships. The current system also has a belief revision system as a standard
feature. As part of this system, assertion tags ‘!’ are appended onto asserted nodes. For
an example of how the semantic network representation looks, see Figure 6.2.
[Figure 6.2 diagram not reproduced; labels appearing in the diagram: A, B, put, table, stack, nodes M10 through M19 (M19 carries the assertion tag ‘!’), and the arc labels lex, plan, action, act, object1, object2 and snsequence.]
Figure 6.2: A SNePS Representation of “A on B on a Table” (Based on [[110], Figure 12]).
Internal to the SNePS system are incorporated some theoretical decisions [110]:
• the system will not build a new node where there is already a node in the structure.
• two variables in one rule can not be instantiated to the same term.
• the universal quantifier is only supported on a proposition whose main connective
is one of the following: and, or, min/max, or thresh.
Given these restrictions, SNePS is not much more than an intensional propositional
representation; however, the inference package, SNIP, is a direct part of SNePS and
adds to the capabilities of the system.
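The first decision in the list above, that no new node is built when an equivalent node already exists, can be sketched as a lookup keyed on node structure. This is a minimal illustration and not SNePS code: the class `Network`, its methods, and the `M<n>` identifier scheme are assumptions made for the example, and each arc label is assumed to point to a single node.

```python
class Network:
    """Toy node store that never builds the same node twice."""

    def __init__(self):
        self._by_structure = {}  # structural key -> node identifier
        self._counter = 0        # used for system-generated identifiers

    def base(self, name):
        # Base nodes are identified by the user-supplied name.
        return self._intern(("base", name), name)

    def molecular(self, arcs):
        # A molecular node is structurally defined by its outgoing arcs,
        # given here as a dict from arc label to target node identifier.
        key = ("molecular", frozenset(arcs.items()))
        return self._intern(key, None)

    def _intern(self, key, ident):
        if key not in self._by_structure:
            if ident is None:
                self._counter += 1
                ident = f"M{self._counter}"
            self._by_structure[key] = ident
        return self._by_structure[key]


net = Network()
a, b = net.base("A"), net.base("B")
first = net.molecular({"object1": a, "object2": b})
again = net.molecular({"object2": b, "object1": a})   # same structure, same node
other = net.molecular({"object1": b, "object2": a})   # different structure, new node
```

Because the key is built from the arcs rather than from the order in which they were supplied, `first` and `again` name the same node.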
SNIP must be able to interpret rules properly because it is a separate system
and because operator-based formulations may be added on top of SNIP. The belief
revision system is also built above SNIP. Therefore, when looking at a possible internal
representation for SNePS, one would need the functionality of predicate calculus. This
would also mean that the logical ADT would need to be chosen.
SNePS is a very straightforward representation. It has only nodes and arcs, and
puts everything into the semantic network structure. There is no hierarchy being applied
to the network or even structurally incorporated into the network, thereby keeping
it very simple. Belief processing is available through assertion tags, and operator-based
formulations may be added on top of the system through procedures. However, only
universal quantification is available, therefore limiting the knowledge that can be represented.
Also, no qualitative relationships are possible.
6.1.3 SNAP
SNAP stands for Semantic Network Array Processor and was implemented at
the University of Southern California. It is a parallel computer architecture with a semantic
network representation of the permanent knowledge being stored [72]. The actual model
is one of marker-passing and the knowledge-base does not do much more than just
general production rule processing (see Figure 6.3 for an example).
[Figure 6.3 diagram not reproduced; node labels: California, city, Los Angeles, university, USC; arc labels: is-in, is-a.]
Figure 6.3: SNAP Semantic Network of “USC in LA, CA” (Based on [[72], Figure 2]).
The permanent knowledge for the knowledge-base is stored at start-up time.
Nodes are terms or concepts and the arcs are the labeled relations between the nodes.
For each new relationship within the knowledge-base an instruction is created by the
controller of the machine, transformations are performed and node assignments are
done, and then commands are broadcast to specific array processors for storage of the
knowledge [72].
The temporary knowledge is where the markers are processed. Markers are
flags that travel around a distributed intelligent network. Marking a node indicates that
it is relevant to the current action. Markers may also have attributes associated with
them.
The inference engine controls the two knowledge areas, but the job of the infer-
ence engine is controlled by the controller on the machine and the intelligent network.
The markers are controlled by the inference engine and spread the searches and queries.
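Marker passing can be sketched sequentially on the small network of Figure 6.3. The following is only an illustration: the arc list is one assumed reading of the figure, the function name `propagate` is hypothetical, and real SNAP hardware spreads markers in parallel across array processors rather than with a queue.

```python
from collections import deque

# Assumed reading of Figure 6.3 as (source, label, target) arcs.
ARCS = [
    ("USC", "is-a", "university"),
    ("USC", "is-in", "Los Angeles"),
    ("Los Angeles", "is-a", "city"),
    ("Los Angeles", "is-in", "California"),
]


def propagate(start, label, arcs=ARCS):
    """Place a marker on `start` and pass it along every arc carrying `label`.

    Returns the set of marked nodes once no marker can travel further.
    """
    marked = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for src, arc_label, dst in arcs:
            if src == node and arc_label == label and dst not in marked:
                marked.add(dst)
                queue.append(dst)
    return marked


# "Is USC in California?" becomes: does the is-in marker from USC reach California?
reached = propagate("USC", "is-in")
```

A query is then answered by checking whether the node of interest carries the marker when propagation stops.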
Because of the simplicity of the actual semantic network knowledge representation,
the internal representation can just be basic data structures and the basic ADT
for the IF .. THEN structure can be used. This does not give much expressive power to
the semantic network, but it does allow parallel processing across an intelligent network,
which provides much potential for the future.
6.1.4 CS Initial Project - PEIRCE
The PEIRCE project is named after the American philosopher and logician
Charles Sanders Peirce [37]. In 1883, Peirce developed the first linear notation for
first-order logic [86]; however, he felt that the predicate notation for logic was unduly
complex [121]. Then in 1897, Peirce invented existential graphs [86, 25] with the simple
mechanism of graphs within a context that were parts of larger graphical notations
[121, 37]. John Sowa then used these existential graphs as his foundation for his Con-
ceptual Graph theory [119].
The PEIRCE project is designed to be built out of conceptual graphs [37]. It
originated as a joint effort for different systems being built out of conceptual graphs
across the world to work together [37]. Over time, it became a project at the PEIRCE
Foundation being built by its director Gerard Ellis in Australia. An example of a graph
within the PEIRCE system is given in Figure 6.4.
[Figure 6.4 diagram not reproduced; labels appearing in the diagram: the header “schema for Age(x) is”, Person: ∀, *x, Age, Date, Birth, Chrc, Ptim, DT-Birth, Diff-DT.]
Figure 6.4: PEIRCE Schema for Age (Based on [[119], Figure 6.5]).
These graphs are made out of conceptual structures which were discussed in
Section 2.3.2.3. The PEIRCE system is divided up into the following modules [37]:
• Programming standards
• Database storage and retrieval
• Linear notation input and output
• Massively parallel hardware
• Graphical editor and display
• Conceptual catalogs (ontologies)
• Programming in conceptual graphs with constraints
• Inference/theorem-proving mechanism
• Learning mechanism
• Natural language parsers and generators
• Information systems engineering
• Vision system
The following modules are the only ones that are within the scope of this work: Database
storage and retrieval; Graphical editor and display; and Programming in conceptual
graphs with constraints. Because of the difficulty of collaboration with many people,
none of these original modules made it past the design phase, but this work was very
important, as these original designs were used within other tools that have been developed
for the conceptual structures community.
The database storage and retrieval module was responsible for storing conceptual
graphs. It was to use a C++ ADT for graph operations and generalization hierarchy
operations. These were to incorporate the fundamental operations of graph matching
and unification (maximal join [119]). They were also to perform generalization and
specialization operations on the hierarchy. As stated in the Ellis work [37], large knowledge bases
were being created for processing, but it has taken some time to deliver these to the
community.
A graphical editor and display that was constructed in X-Windows and executed
on all versions of Unix available (including Linux for PCs) was one of the foundational
modules. This same module runs under Windows. Growing out of this effort is the
very complete graphical editor CharGer developed by Harry Delugach [29, 30]. To go
along with the editor would be a compiled language that will allow programming in
conceptual graphs with constraints.
This actual system would be available once the ADT has been coded and bootstrapped
into a compiler for conceptual graphs. Two systems, Amine and FMF, have
grown out of this effort and given Prolog compilers that include CGs as part of the language
[55, 56, 124]. The system is mainly a set of concepts and tools; however, it will
address all quantitative and qualitative relationships and generalization and specialization
operations when functioning.
6.2 Conceptual Graphs Environments
6.2.1 CoGITaNT
CoGITaNT has several useful utilities: a set of library routines in C++ for conceptual
modeling, some knowledge bases in conceptual graphs, and an XML specification
for CGXML [64]. All documentation is in French and none is available in English
(including the installation instructions). In the future, documentation should be available
in English, which will allow this author to test and evaluate this very complete
system.
6.2.2 Amine
Amine is actually a “platform” as opposed to an environment [55]. Its main
processing is a multilingual system for ontologies [54]. It was originally built on a
conceptual structures internal representation, with a storage representation compiled
through Prolog [57]. Now that it has been converted to a platform, it is written in Java.
At the present time, only the ontological hierarchies have been converted, but all the
storage representation will soon be made available. Amine is using CGs as an internal
representation for machine translation from French to English.
6.2.3 pCG
pCG is “a process operating upon a CG”. It was developed at the University of
South Australia by David Benn under the direction of Dan Corbett [10, 9, 8]. This is
based on the work of Guy Mineau at the Université Laval [69, 70]. This system implements
its process mechanism by using the Java library routines of Notio developed by
Finnegan Southey [117].
pCG had several design goals: 1) making concepts, graphs, actors and processes
first class citizens within the pCG language; 2) easy extensibility; 3) rapid development;
4) portability; and 5) minimality [10]. Of these goals, the first, fourth and fifth were
the most interesting to this author. By making all value types first class types in this
language, every type can be passed as a parameter to functions for execution. Portability
was available by using Java as the language and the ANTLR¹ construction tool for
parsing. This system was designed and constructed with as few constraints and built-in
keywords as possible. Therefore many functions that are already available within the
Notio system are directly accessible from pCG.
“The pCG language is multi-paradigm, since apart from its object-based characteristic,
pCG supports imperative (variables, assignment, operators, selection, iteration),
functional (higher order functions, value, recursion), and declarative styles of
programming” [10]. This created the opportunity for interoperability between pCG
and other systems.
¹ http://www.antlr.org
6.2.4 CPE
The Conceptual Programming system (CP) was originally developed as a single,
standalone application [92, 93] that handles temporal, spatial and constraint information
[47, 94] using a knowledge base of Conceptual Graphs (CGs) [119]. CP was
a knowledge representation development environment with a graphical visualization
framework that had a set of tools that used graph structures and operations over those
structures to do knowledge reasoning.
All knowledge within the system is stored and operated on as a graph. These
graphs are implementations of Sowa’s Conceptual Graphs [119], but also retain many
of the features of graph theory [46]. Although there exists a mapping from CGs to
formulae in first-order predicate calculus (FOPC), the operations used in the CP system
take advantage of the graphical representation; therefore, the data structures and
operations over the graphs use graph theory [46] instead of FOPC.
The original system was a single application written in Lisp and ran only on a
Symbolics machine. The data structures were CGs defined using linked lists of structure
elements, where the structures held the node information and the links were the edges
of the graphs. All graphs had to be entered directly into the environment’s editor, and
each graph was stored into the environment’s knowledge base. The CP inference engine
would then operate over these data structures, sometimes creating new graphs or partial
models of conceptual graphs and storing them into the environment’s knowledge base.
In the old environment, there was no way to import or export any of the graphs
or models. This prompted investigation into alternative data structures and models to
allow other applications and systems to communicate with the CP application [95, 96,
49]. Harry Delugach’s invited talk at ICCS2003 [31] outlined a framework for building
active knowledge systems. By 2004 the Conceptual Programming Environment (CPE)
had been introduced with its new modular, multi-component design to increase the flex-
ibility of the environment and to allow modules to be used outside of the environment
by other systems [87]. The viewpoint of the redesign was to make the CP Environment
be the “heaven” displayed in Delugach’s framework. At that time, the main form
of interoperability was by using the CGIF interchange format (see Section 3.4.4). John
Sowa, in a paper published in 2002 as part of a “Special Issue on Artificial Intelligence”
of the IBM Systems Journal [124], proposed a modular framework as an architecture
for intelligent systems because of the flexibility in communication and interoperability
it provides. This flexible modular framework (FMF) allows different applications
in different memory spaces to communicate using a blackboard architecture of message
passing between applications. FMF would be very useful in implementing the
reference framework discussed in Aldo de Moor’s RENISYS specification methodol-
ogy [34] because FMF handles interprocess communication across computers as well
as processes, and it would also be useful in developing the intelligent agent operations
from Delugach’s framework [31]. However, the modularization of CP is at a module
component level, rather than the FMF process communication level, so that the module
can be directly “tied-in” to another application. The modular design at the component
level also allows modules to be interchanged as units, as in modular furniture, to get
the most flexibility from the environment.
The modularization of the CP Environment allows parts of the environment, the
actual modules, to be both interfaced and interacted with by outside systems or applications.
It also has a specific module, CGIF, that creates a mechanism to import and
export CGs created from execution of the environment’s inference engine modules and
storage in the environment’s knowledge base. CPE included simple wrapper modules
to allow other languages, besides C and C++, to use the CGIF module.
6.2.4.1 Basic Architecture for the Environment
Figure 6.5 depicts the new directionality of the CP Environment. The very light
gray background area indicates what is actually part of the environment. The light gray
oval depicts applications, e.g. the pCG reasoning and language system. The medium
gray rounded-corner square represents editors that are available for CGs, e.g. ARCEdit;
these editors should be able to import/export CGIF formatted files. The light gray
trapezoid and drum shapes indicate data that is not necessarily graphical in nature, but
may be part of a domain of information that a user wishes to process (note: the data in
the database need not necessarily be textual and may be graphical or any visual form).
The very dark gray shapes are modules that are part of the CP Environment and use
the environment’s internal data structures. All solid arrowed lines in the figure indicate
data or processing that is currently available; dashed arrowed lines indicate where an
interface, connection, interaction, and/or translation should be available between these
elements, but is not currently present.
Figure 6.5: Current CP Environment (From [[87], Figure 1, page 322]).
6.2.4.2 Data Flow within the Environment
Because the architecture is set up as a set of modules, each module is built as a
DLL (under Windows) or a library (under Unix or Linux), depending on the operating
system. It also has a specific module, CGIF, that creates a mechanism to import and
export CGs created from execution of the environment’s inference engine modules and
storage in the environment’s knowledge base. This mechanism can be “plugged-in” to
other applications by using the CGIF module’s API specification to call the module’s
implementation code level [117]. All the modules have available APIs to allow their
library routines to be called by other applications. Also, because all data structures can
be stored to a CGIF formatted file, graphs can be transferred to other applications through
the graphs in the CGIF file.
6.2.4.3 Data Structures used by the Environment
When originally conceived, CP was just an implementation of conceptual graphs
algorithms without considering how the data structures affected implementation. In
2000, this system began to change to allow it to be more of a foundational environment
that could be used as the underpinning of multiple reasoning systems. When this
environment was first conceived, it used a doubly linked list data structure. On redesign,
new data structures were investigated; these have been and will be discussed in other
chapters.
6.3 ADT Implementations
What follows is a discussion of three implementations of the internal representation
ADT definitions discussed in Section 2.3.2. These are just basic ideas of how each of
the ADTs might be implemented. Each of the definitions has been given in pseudo-code
that looks like C++, but that does not have anything to do with the programming
language that it might be implemented in.
6.3.1 Logical
This ADT could best be implemented in either Prolog or Lisp. The basic structure
of the ADT is one of predicates. If the predicate and tree structure of Prolog is
used, then implementation is straightforward. The syntax and semantics as seen in the
example in Figure 2.2 could be directly mapped onto this ADT. Within the implementation
of the ‘query’ procedure, unification and resolution would be performed over the
knowledge-base records. This would be performed by using the ‘SupportClauses’ that
will be saved during processing. If there is a network present, then the routines that
are needed to perform terminological operations would also be executed. The theorem
prover would need to use not only the ‘SupportClauses’, but also the stored knowledge-base
from the ‘Calculus’ class. Note: the ‘Logical’ class is where reasoning is performed
by use of its function (this is the inference engine), and the ‘Calculus’ class is for storing
the knowledge-base.
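The unification step mentioned above can be illustrated with a short sketch. It is written in Python rather than Prolog or Lisp purely for illustration, and it is not the ‘query’ procedure of the ADT: the term encoding (tuples for compound terms, strings beginning with "?" for variables) and the function names are assumptions, and the occurs check is omitted.

```python
def is_var(term):
    # Assumed convention: variables are strings that start with "?".
    return isinstance(term, str) and term.startswith("?")


def walk(term, subst):
    # Follow variable bindings until an unbound variable or a non-variable is found.
    while is_var(term) and term in subst:
        term = subst[term]
    return term


def unify(a, b, subst=None):
    """Return a substitution that makes a and b equal, or None if there is none."""
    subst = {} if subst is None else subst
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}
    if is_var(b):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None


# on(?x, table) unified with on(A, ?y)
bindings = unify(("on", "?x", "table"), ("on", "A", "?y"))
```

A resolution-based prover would call such a function repeatedly to match a goal against the heads of stored clauses.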
6.3.2 Basic Data Structures
When implementing the basic data structures that are often needed for
simple rule-based systems, languages such as Lisp or C come to mind. This implementation
needs to store records of information (knowledge-base) that can be separated out
as “conditional” information and “rule” information. The rule would be used for processing,
or ‘fired’, when the conditional is found to be true. If implemented in Lisp, then a
list representation could be used where the “car” and the “cdr” can give the conditional
or rule back from the record. If one used C, then a structure holding the elements of the
IF .. THEN record would be used and functions would need to be defined to retrieve
the conditional and rule parts of the structure.
The inference engine would be implemented in the ‘query’ function. It would
apply the actual knowledge that had been stored in the knowledge-base in order to inter-
pret the conditional [65]. This function is also where the actual reasoning is performed.
If there is any network or hierarchy processing to be performed, it would be imple-
mented in the inference engine, i.e. marker-passing operations are implemented in this
module.
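A minimal sketch of such an IF .. THEN record and a ‘query’ function is given below, in Python for illustration only. The record layout, the accessor names `conditional_of` and `rule_of` (playing the part of “car” and “cdr”), the string encoding of facts, and the forward-chaining strategy are all assumptions, not the ADT defined in Section 2.3.2.

```python
from typing import NamedTuple


class Record(NamedTuple):
    conditional: frozenset  # facts that must all hold (the IF part)
    rule: str               # fact asserted when the record fires (the THEN part)


def conditional_of(record):
    return record.conditional


def rule_of(record):
    return record.rule


def query(goal, facts, knowledge_base):
    """Fire records until nothing new can be asserted, then test the goal."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for record in knowledge_base:
            if conditional_of(record) <= known and rule_of(record) not in known:
                known.add(rule_of(record))  # the rule is 'fired'
                changed = True
    return goal in known


kb = [
    Record(frozenset({"USC is-in Los Angeles", "Los Angeles is-in California"}),
           "USC is-in California"),
]
facts = {"USC is-in Los Angeles", "Los Angeles is-in California"}
answer = query("USC is-in California", facts, kb)
```

Keeping the accessors separate from the record type mirrors the C approach described above, where retrieval functions hide the layout of the structure.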
6.3.3 Object
For object manipulation, and in particular graph manipulation, more information
is needed. To work with graphs, there are not only record types of information
about the objects, but the structure of the graph has bearing on both the syntax and
the semantics, or meaning, of the graph and must be stored as part of the representation
ADT. As can be seen by the ADT definition, more basic information needs to
be stored. Because Java and C++ are object-oriented, these languages work well for
the implementation of Conceptual Structures. The implementation must not only know
which conceptual units are linked to which relationship, but must have directionality.
By the use of the ‘3WayTable’ data structure, knowing which is the starting conceptual
unit and which is the ending concept of the relationship is possible. Also, by evalua-
tion of the fetched links, the structure of the physical graph can be known. Through
this knowledge, syntax and semantics of the internal representation can be mapped and
stored.
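The role of a table that records the start concept, the relationship and the end concept of every link can be sketched as follows. This is a guess at the spirit of the ‘3WayTable’ and not its actual definition: the class name `ThreeWayTable`, the methods `link` and `fetch`, and the row layout are hypothetical.

```python
class ThreeWayTable:
    """Toy table of directed links: (start concept, relation, end concept)."""

    def __init__(self):
        self._rows = []

    def link(self, start, relation, end):
        # Direction is kept by position: the first entry is the starting
        # conceptual unit and the last entry is the ending one.
        self._rows.append((start, relation, end))

    def fetch(self, start=None, relation=None, end=None):
        # Any argument left as None acts as a wildcard.
        return [
            row for row in self._rows
            if (start is None or row[0] == start)
            and (relation is None or row[1] == relation)
            and (end is None or row[2] == end)
        ]


table = ThreeWayTable()
table.link("Person", "CHRC", "Hair: brown")
table.link("Person", "CHRC", "Eyes: blue")

outgoing = table.fetch(start="Person")  # links that leave [Person]
incoming = table.fetch(end="Person")    # links that arrive at [Person]
```

Evaluating the fetched rows in both directions is what lets the physical structure of the graph be reconstructed.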
When this basic ADT is built upon, qualitative functions can be performed by
using the ‘query’ procedure and adding information to the knowledge base about time
and space. The ‘Graph’ class would also work with any hierarchy that is used with the
knowledge base. In order to do reasoning, several other graph manipulation procedures
and functions have been added. Given a specific system, it is possible that this is not a
complete ADT and more functions will need to be defined.
6.4 Experiment Systems Implementation
The experimental systems were chosen because they were able to handle full
Conceptual Graphs and did not have the restrictions of SCGs described in Section 3.4.3. Even
though the SCG algorithm by Mugnier and Chein has been implemented in the CoGITaNT
system, there is no English documentation in order to work with the system,
and the Croitoru algorithm has not been implemented. Lastly, the author of the pCG
system, David J. Benn, addressed any errors or problems that arose within the pCG
code.
6.4.1 pCG - Original Notio
The pCG system, as discussed previously, is built on top of the Notio library. It, like
Notio, was written in Java for portability and used the ANTLR parsing system to read and
process the CGIF format. It was mainly designed to use Notio for the actual matching,
projection and join algorithms, while developing a language for inputting and outputting
simple programs to do analysis. After examining several currently available
systems (see above), pCG was chosen as the most general of the currently available
systems. After working with this system, it was discovered that it, like several of the
other systems, had the following limitations:
• Only a single copy of a relation could be present in a graph, i.e., if a person
had two characteristics of brown hair and blue eyes, creating a graph with both
characteristics was not valid.
[Person]->(CHRC)->[Hair:brown]->(CHRC)->[Eyes:blue]
• It also only found a single projection of a query into a graph even if others were
present.
However, it was possible to work with the pCG programs (see Section C.1) to directly
use many of the same test sets of CG graphs that would test CPE’s data structure
variations.
The data structures used within pCG/Notio are vector arrays for processing
within an operation. More specifically, they are used for matching nodes, structural
analysis of graphs, and evaluating the search space during the projection operation.
Throughout all of these processes, array data structures are used. However, at the very
end of the projection operation, the graphs in the KB are translated to be stored in a
hash table, even though they will be loaded into an array data structure again if the
projection operation is performed on the KB again. The hash table is only used
when doing a direct retrieval of a graph from the KB, not during operations.
An interesting design feature of pCG can be related to the ‘typical case’ analysis
from Chapter 5 (see Section 5.3). The pCG implementation computes the projection
by using a two-part algorithm; however, these parts are not the same as those of the SCG
Relation algorithm of Croitoru. The first part is actually more a part of the storage of
the graphs. In preparation for the projection operation, this algorithm performs an
Assertion phase which commutes the structure of the graph and re-aligns the labels on
the elements of the graphs to improve the matching later during the projection.
Therefore, as the number of graphs in the KB increases, this Assertion takes an amount
of time proportional to the size of the KB. Since in the typical case the query graph
has a small number of nodes and the KB size is small, the Assertion does not have a
significant effect on the results; but as the size of the graphs in the KB and the
number of graphs in the KB both increase, this Assertion part should have more of
an effect.
6.4.2 CP Environment (CPE)
This system has been developed since 1988 at New Mexico State University
[93, 94] and has never had the two limitations listed for the pCG system. Apart from
this difference, however, the two systems are very comparable.
As discussed in Section 6.2.4, CP was originally developed to use doubly linked
lists. When a linked list is sorted but only singly linked, the execution time for retrieval
from it is the same as that of an array data structure. While CP was being re-designed
to use new algorithms for the projection and maximal join operations, an
investigation was made into which data structures would work well with these new
algorithms. An array data structure was originally chosen to test with the
projection algorithm because, when the array is not sorted, storage is just an append
at the end of the list, and one does not have to use a sorted or doubly linked
list. After carefully looking at other data structures, hash tables were also chosen to be
investigated.
There are four variables that hold a direct link between the algorithms and data
structures for the system. By changing their underlying data structure, it is believed
that the projection operation execution time will be altered. These variables are c-r-c
and c-a-c, which are part of the CG graph data structure, and match list and anchor list,
which hold internal data that is used to move information from the matching
pre-processing part of the algorithm to the actual projection building of the query
graph onto the KB graph. These variables were defined in Section 5.2.1; their actual
implementations are defined here for use with each test representation.
6.4.2.1 Array (Vectors)
The array implementation for these critical variables is discussed first (c-a-c
is not shown in this implementation or the next because the blocks world
benchmark did not use actors, but its implementation is very close to the c-r-c data
structure). In the following descriptive examples, ‘[]’ indicates indices and ‘()’ indicates
structures.
c-r-c
[1] -> (GC1, ([1] -> GT1, (GT1, GC1, R1, GC2, 1)
              [2] -> GT2, (GT2, GC1, R2, GC3, -1)))
[2] -> (GC2, ([1] -> GT1, (GT1, GC2, R1, GC1, -1)))
[3] -> (GC3, ([1] -> GT2, (GT2, GC3, R2, GC1, 1)))
This data structure would be an array that is part of the cg graph class. The first
part of each entry is the unique concept identifier; for example, at index 3 the key
would be “GC3”. At every index in the array there is also an array of cstriple unique
identifiers (for example, at index 1 of the array for “GC3” the key would be “GT2”),
each of which retrieves a node structure. This node structure contains the cstriple,
forward concept, relation, backward concept and direction. The direction is either ‘1’
or ‘-1’, indicating whether the cstriple, in display format, proceeds with a directed arrow
from forward concept to backward concept or vice versa. The node structure within the
example just given would be cstriple - “GT2”, forward concept - “GC3”, relation - “R2”,
backward concept - “GC1”, direction - ‘1’.
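As an illustration only, the array layout above can be sketched in Python. The names Node, crc_array and find_node are hypothetical and do not come from CPE; Python lists stand in for the arrays, so a lookup is a linear scan over the concept entries and then over the cstriple entries.

```python
from typing import List, NamedTuple, Optional, Tuple


class Node(NamedTuple):
    """Node structure: cstriple, forward concept, relation, backward concept, direction."""
    cstriple: str
    forward: str
    relation: str
    backward: str
    direction: int  # 1 or -1


# Outer array: one (concept id, inner array) entry per KB concept.
# Inner array: one (cstriple id, node structure) entry per triple.
CrcArray = List[Tuple[str, List[Tuple[str, Node]]]]

crc_array: CrcArray = [
    ("GC1", [("GT1", Node("GT1", "GC1", "R1", "GC2", 1)),
             ("GT2", Node("GT2", "GC1", "R2", "GC3", -1))]),
    ("GC2", [("GT1", Node("GT1", "GC2", "R1", "GC1", -1))]),
    ("GC3", [("GT2", Node("GT2", "GC3", "R2", "GC1", 1))]),
]


def find_node(crc: CrcArray, concept_id: str, cstriple_id: str) -> Optional[Node]:
    """Scan the outer array for the concept, then its inner array for the triple."""
    for cid, triples in crc:
        if cid == concept_id:
            for tid, node in triples:
                if tid == cstriple_id:
                    return node
            return None
    return None
```

With this sketch, find_node(crc_array, "GC3", "GT2") returns the node structure spelled out in the text for the “GC3” entry.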
match-list
[1] -> (GC1, ([1] -> QC1
              [2] -> QC2))
[2] -> (GC2, ([1] -> QC2
              [2] -> QC1))
[3] -> (GC3, ([1] -> QC3))
[4] -> (GC4, NULL)
This list holds the matching concepts between the KB graph and the query graph. In
this example structure, an array would hold all the concepts found in the KB graph,
each with a link to an array of all the matching concepts in the query graph. Until a
matching concept is found, the second array is NULL.
anchor-list
[1] -> (QC1 -> ([1] -> (GC1, ([1] -> GT1,QT1))
                [2] -> (GC2, ([1] -> GT2,QT2))))
[2] -> (QC2 -> ([1] -> (GC3, ([1] -> GT1,QT1))))
[3] -> (QC3 -> ([1] -> (GC4, ([1] -> GT2,QT2))))
The anchor list holds the matching KB concepts that also structurally have the cstriple
relationships found in the query graph. By holding both the matching concepts for
each query graph concept and the related triples, the anchor list can simply be
traversed at build time to create the new projection graphs. This example finds two
projections: one includes concepts GC1 and GC3, using the GT1 cstriple node; the
second includes concepts GC2 and GC4, using the GT2 cstriple node. All data
structures here are arrays.
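One simple way to read the two projections off this toy anchor list is to group its entries by KB cstriple identifier. The sketch below does only that; it is an assumption made for illustration, not the build procedure used in CPE, and the names anchor_list and group_by_kb_triple are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (query concept, [(KB concept, [(KB cstriple, query cstriple)])]), as in the example.
AnchorList = List[Tuple[str, List[Tuple[str, List[Tuple[str, str]]]]]]

anchor_list: AnchorList = [
    ("QC1", [("GC1", [("GT1", "QT1")]), ("GC2", [("GT2", "QT2")])]),
    ("QC2", [("GC3", [("GT1", "QT1")])]),
    ("QC3", [("GC4", [("GT2", "QT2")])]),
]


def group_by_kb_triple(anchors: AnchorList) -> Dict[str, Dict[str, str]]:
    """Traverse the anchor list once, collecting query->KB concept pairs per KB cstriple."""
    groups: Dict[str, Dict[str, str]] = defaultdict(dict)
    for query_concept, kb_entries in anchors:
        for kb_concept, triples in kb_entries:
            for kb_triple, _query_triple in triples:
                groups[kb_triple][query_concept] = kb_concept
    return dict(groups)
```

For the example, the group for GT1 holds GC1 and GC3 and the group for GT2 holds GC2 and GC4, which matches the two projections named in the text.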
6.4.2.2 Hash Tables
The important change in the data structures for the critical variables given above
came when it was seen that perfect hash tables (as discussed in Section 3.5.2.1) can
improve the overall projection time by greatly improving the actual projection step,
i.e., the building of the projection graph, in the second step of processing. In the
following descriptive examples, ‘<>’ indicates hash tables and ‘()’ indicates structures.
c-r-c
<GC1, (<GT1, (<GT1direction, (GT1, GC1, R1, GC2, 1)>)
        GT2, (<GT2direction, (GT2, GC1, R2, GC3, -1)>)>)
 GC2, (<GT1, (<GT1direction, (GT1, GC2, R1, GC1, -1)>)>)
 GC3, (<GT2, (<GT2direction, (GT2, GC3, R2, GC1, 1)>)>)>
This data structure would be a perfect hash table of <key, value> pairs that is part of
the cghash graph class. The first part of the structure is the unique concept identifier;
for example, the key GC3 would be a perfect hash value for “GC3”. At every key in the
hash table there is another perfect hash table of cstriple unique identifiers; for example,
the key GT2 would be a perfect hash value for “GT2”. This second hash table has a value
that is a node structure. The node structure is also stored in a perfect hash table, using
the cstriple unique identifier and direction as the key, in which the two values together
create a perfect indexing value for the hash table. The value of the last hash table is the
same as the value in the above array implementation.
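The three nested levels can be sketched with Python dictionaries. This is an illustration under stated assumptions: a Python dict is a general-purpose hash table rather than a perfect hash table, the “GT1direction” key is read here as the pair (cstriple identifier, direction), and the names crc_hash and lookup_node are hypothetical.

```python
from typing import Dict, Optional, Tuple

NodeT = Tuple[str, str, str, str, int]  # cstriple, forward, relation, backward, direction
CrcHash = Dict[str, Dict[str, Dict[Tuple[str, int], NodeT]]]

# concept id -> cstriple id -> (cstriple id, direction) -> node structure
crc_hash: CrcHash = {
    "GC1": {
        "GT1": {("GT1", 1): ("GT1", "GC1", "R1", "GC2", 1)},
        "GT2": {("GT2", -1): ("GT2", "GC1", "R2", "GC3", -1)},
    },
    "GC2": {"GT1": {("GT1", -1): ("GT1", "GC2", "R1", "GC1", -1)}},
    "GC3": {"GT2": {("GT2", 1): ("GT2", "GC3", "R2", "GC1", 1)}},
}


def lookup_node(crc: CrcHash, concept_id: str, cstriple_id: str,
                direction: int) -> Optional[NodeT]:
    """Three keyed lookups replace the two linear scans of the array version."""
    by_triple = crc.get(concept_id, {})
    by_direction = by_triple.get(cstriple_id, {})
    return by_direction.get((cstriple_id, direction))
```

The node values are the same tuples as in the array implementation; only the way they are reached differs.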
match-list
<GC1, (<QC1, QC2>)
 GC2, (<QC1, QC2>)
 GC3, (<QC1>)
 GC4, NULL>
This hash table holds the matching concepts between the KB graph and the query graph.
In this example structure, a perfect hash table would hold all the concepts found in the
KB graph, each with a link to a perfect hash table of all the matching concepts in the
query graph. As before, the KB graph concept that did not have a match would have a
NULL in its value parameter.
anchor-list
<QC1, (<GC1, (<GT1,(GT1,QT1)>)
        GC2, (<GT2,(GT2,QT2)>)>)
 QC2, (<GC3, (<GT1,(GT1,QT1)>)>)
 QC3, (<GC4, (<GT2,(GT2,QT2)>)>)>
This example finds the same two projections discovered from the array implementation;
however, all data structures here are perfect hash tables and each unique label would
produce its own unique index in order to have constant time retrieval.
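The idea that each unique label yields its own index can be illustrated with a toy mapping. The hash function actually used is the subject of Section 3.5.2.1 and is not reproduced here; the suffix-based function label_index below is a hypothetical stand-in that works only because the example labels are numbered consecutively.

```python
from typing import List, Optional


def label_index(label: str) -> int:
    """Map a label such as 'GC3' to a zero-based slot using its numeric suffix."""
    digits = "".join(ch for ch in label if ch.isdigit())
    return int(digits) - 1


labels = ["GC1", "GC2", "GC3", "GC4"]
slots: List[Optional[str]] = [None] * len(labels)
for label in labels:
    # Each label lands in a distinct slot, so retrieval needs no search or probing.
    slots[label_index(label)] = label
```

Because no two labels collide, a retrieval is a single index computation followed by a single array access.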
CHAPTER 7
PROJECTION EXPERIMENTS, RESULTS AND ANALYSIS
This chapter includes a discussion of the “blocks world” domain problem and
the actual experiments tested with it. Each of these experiments uses a set of reasoning
graphs in the KB for projecting queries against a solution to the “blocks world”
problem. These are extended graphs from the benchmark set of conceptual graphs from
the CGTools workshop of ICCS 2001.
It also gives all the timing results for the cross matrix of test runs discussed
in Section 7.2.1, including a simple analysis of the overall execution times of each test
set. The test sets are compared and contrasted by analyzing the amount of execution
time, overhead time, and space requirements.
7.1 Domain Problem - ‘Blocks World’
Back in 2001, a group of tool developers began the process of truly making
conceptual graph systems interoperable. A set of benchmark files of conceptual graphs
that can be used by reasoning systems to work with the blocks world domain was
developed in CGIF format (see Section 3.4.4). During the 2001 Conceptual Graphs Tools
Workshop¹, a set of benchmark graphs was defined and placed in files with increasing
difficulty to process the CGIF format [96]. Figures 7.1, 7.2, 7.3, and 7.4 show the
contents of the file ‘final_graphs_level2.cgf’, which all tools submitted to the workshop
were able to process.
¹ The web location http://www.cs.nmsu.edu/~hdp/CGTools/ holds the resources for the workshop and the proceedings.
(GT [TypeLabel: "Entity"] [TypeLabel: "Block"];A block is an entity; ).
(GT [TypeLabel: "Entity"] [TypeLabel: "Hand"];A hand is an entity; ).
(GT [TypeLabel: "Entity"] [TypeLabel: "Location"];A location is an entity; ).
(GT [TypeLabel: "Act"] [TypeLabel: "Pickup"];Pickup is an action; ).
(GT [TypeLabel: "Act"] [TypeLabel: "Putdown"];Putdown is an action; ).
(GT [TypeLabel: "Act"] [TypeLabel: "MoveHand"];MoveHand is an action; ).
(GT [TypeLabel: "Act"] [TypeLabel: "MoveBlock"];MoveBlock is an action; )
Figure 7.1: Part 1: Example of Blocks World Benchmark File.
The Part 1 example (see Figure 7.1) is the type hierarchy definition for the
concepts used in the benchmark. Entity and Act are directly below the top, ⊤, concept
of the hierarchy, and the other seven concepts, Block, Hand, Location, Pickup, Putdown,
MoveHand and MoveBlock, are directly above the bottom, ⊥, concept. This is not a very
deep hierarchy.
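The seven GT statements of Figure 7.1 can be held as a child-to-parent map to check subtype relations and the depth of the hierarchy. This is a sketch that assumes single inheritance, which holds for this benchmark file; the names GT_PAIRS, PARENT, is_subtype and depth are illustrative and are not taken from any of the tested systems.

```python
from typing import Dict, List, Tuple

# (supertype, subtype) pairs, in the order they appear in Figure 7.1.
GT_PAIRS: List[Tuple[str, str]] = [
    ("Entity", "Block"), ("Entity", "Hand"), ("Entity", "Location"),
    ("Act", "Pickup"), ("Act", "Putdown"),
    ("Act", "MoveHand"), ("Act", "MoveBlock"),
]

# Child -> parent map; Entity and Act have no entry because they sit under the top.
PARENT: Dict[str, str] = {sub: sup for sup, sub in GT_PAIRS}


def is_subtype(sub: str, sup: str) -> bool:
    """True if sub equals sup or sup is reached by walking up from sub."""
    current = sub
    while current is not None:
        if current == sup:
            return True
        current = PARENT.get(current)
    return False


def depth(label: str) -> int:
    """Number of GT links between a label and its topmost declared ancestor."""
    steps = 0
    while label in PARENT:
        label = PARENT[label]
        steps += 1
    return steps
```

Every declared subtype has depth 1 in this sketch, which reflects how shallow the benchmark hierarchy is.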
In Part 2 (see Figure 7.2), one can find the definition graphs for the Entity
individualized as a Block, and for the Acts individualized as Pickup, Putdown, MoveHand
and MoveBlock. The concepts Entity and Act are dominant concepts (see Subsection
B.1 for definition) with internal structure. The <type, referent> pair is the externally
scoped concept definition for the instantiation of the dominant concept. During model
processing these individualized definitions can be joined with a reference to the subtype
from the hierarchy. Concepts Hand and Location have a concept type and a location in
the type hierarchy; however, they do not have any internal structure to be considered.
[Entity:'Block'
    (ATTR [Block*b] [Color])
    (CHRC ?b [Shape])
    ;Each block has a color and shape; ].
[Act:'Pickup'
    (PTNT [Pickup*p] [Block*b])
    (INST ?p [Hand*h])
    (RSLT ?p [Situation: (GRASP ?h ?b)])
    ;Each block is picked up using a hand; ].
[Act:'Putdown'
    (PTNT [Putdown*p] [Block*b])
    (DEST ?p [Location*l])
    (INST ?p [Hand])
    (RSLT ?p [Situation: (Top ?b ?l)])
    ;Each block is put down at a location from the hand; ].
[Act:'MoveHand'
    (DEST [MoveHand*m] [Location*l])
    (PTNT ?m [Hand*h])
    (RSLT ?m [Situation: (At ?h ?l)])
    ;This action moves the hand to a location; ].
[Act:'MoveBlock'
    (DEST [MoveBlock*m] [Location*l])
    (PTNT ?m [Block*b])
    (INST ?m [Hand])
    (RSLT ?m [Situation: (At ?b ?l)])
    ;This action moves the block to a location; ]
Figure 7.2: Part 2: Example of Blocks World Benchmark File.
Part 3 (see Figure 7.3) contains the relation hierarchy definition for the
relations At, Above, OnTable, Top and EmptyHand being used in the benchmark. It also
gives the internal structure of the dominant concept Relation for each relationship,
because these relations are not axioms to CGs. When these referenced relations appear
in other CGs, the definitional graphs can be joined to them.
The last part of the file, Part 4 (see Figure 7.4), gives the factual graphs
contained in the knowledge base. This section of the file describes three cubical
blocks with colors ‘Red’, ‘Blue’ and ‘Green’ that are on a table at two locations. Block
#1 is above Block #3, which is located directly on the table. Both of these blocks are
located at Location #5, and Block #2 is at Location #6. The hand is empty and is holding
no blocks. Either Block #1 or Block #2 must be Blue in color, and both can be. The file
contents can be seen in picture form in Figure 7.5.
(GT [RelationLabel: "Relation"] [RelationLabel: "At"]; Relation At ;).
(GT [RelationLabel: "Relation"] [RelationLabel: "Above"]; Relation Above ;).
(GT [RelationLabel: "Relation"] [RelationLabel: "OnTable"]; Relation OnTable ;).
(GT [RelationLabel: "Relation"] [RelationLabel: "Top"]; Relation Top ;).
(GT [RelationLabel: "Relation"] [RelationLabel: "EmptyHand"]; Relation EmptyHand ;).
[Relation:'At'
    (POS [Entity] [Location])
    ;An entity is positioned at a location; ].
[Relation:'Top'
    (OnTable [Block*b1] [Location])
    ~[(Above [Block*b2] ?b1)]
    ;A block on top is at a location and has no blocks above it; ].
[Relation:'EmptyHand'
    ~[(GRASP [Hand] [Block])]
    ;A hand is empty when no blocks are in it; ].
[Relation:'OnTable'
    (At [Block*b] [Location])
    ~[(GRASP [Hand] ?b)]
    ;A block on the table is at a location and not in the hand; ].
[Relation:'Above'
    (OnTable [Block*b1] [Location*l])
    (OnTable [Block*b2] ?l)
    ;The first block is above the second block at the same location; ]
Figure 7.3: Part 3: Example of Blocks World Benchmark File.
[Block:#1].
[Block:#2].
[Block:#3].
[Hand:#4].
[Location:#5].
[Location:#6].
[Block:@3].
;Block #1 is red;
(ATTR [Block:#1] [Color:'Red']).
;Block #2 is blue;
(ATTR [Block:#2] [Color:'Blue']).
;Block #3 is green;
(ATTR [Block:#3] [Color:'Green']).
(OnTable [Block:#1] [Location:#5]).
(OnTable [Block:#2] [Location:#6]).
(OnTable [Block:#3] [Location:#5]).
;Block #1 is above block #3, and block #2 is at a different location;
(Above [Block:#1] [Block:#3]).
;All the blocks are on the table and not in the hand;
(Emptyhand [Hand:#4]).
[Either: [Or: (ATTR [Block:#1] [Color:'Blue'])]
         [Or: (ATTR [Block:#2] [Color:'Blue'])]].
;All blocks are cubical;
(CHRC [Block:@every] [Shape:'Cubical'])
Figure 7.4: Part 4: Example of Blocks World Benchmark File.
[Figure: picture of the blocks, labeled Green, Blue and Red.]
Figure 7.5: A Picture of the Benchmark File.
7.2 Tests
The tests were performed not only to validate that the new projection
algorithm produced the correct projection of the query onto the knowledge base graphs,
but also to evaluate how different parameters affect the running of that algorithm given
the data structures used. The benchmark data file described in Section 7.1 was
modified to create larger knowledge bases and larger graphs in terms of the
number of nodes (concepts and relations) in the graphs.
Each of the tests was run on a single computer running the Windows XP
operating system. The machine had 2 gigabytes of memory, and all systems
were set up to use all virtual memory. No other applications were executed while the
tests were being performed. There were also 80 gigabytes of disk space, so no
space limitations were imposed.
7.2.1 Single Appearance of Relation within Graph
Because it turned out that pCG was not able to process more than one instance
of a relation type within a graph, two sets of tests were performed. Table 7.1 gives
all the files that were tested by all three systems. Knowledge bases with 1, 1000, 2500,
and 5000 graphs were each stored in a file; those numbers are across the top of the table.
A file was created for each of these KB sizes with graphs of size 5, 11, 21, 31, 53, and
73 nodes; those numbers are down the first column. Within each graph in these knowledge
bases, all relation types were unique.
Table 7.1: KB Single Relation Graph Files.
Nodes   1                 1000                 2500                 5000
5       graphs_5_1.cgf    graphs_5_1000.cgf    graphs_5_2500.cgf    graph_5_5000.cgf
11      graphs_11_1.cgf   graphs_11_1000.cgf   graphs_11_2500.cgf   graphs_11_5000.cgf
21      graphs_21_1.cgf   graphs_21_1000.cgf   graphs_21_2500.cgf   graphs_21_5000.cgf
31      graphs_31_1.cgf   graphs_31_1000.cgf   graphs_31_2500.cgf   graphs_31_5000.cgf
53      graphs_53_1.cgf   graphs_53_1000.cgf   graphs_53_2500.cgf   graphs_53_5000.cgf
73      graphs_73_1.cgf   graphs_73_1000.cgf   graphs_73_2500.cgf   graphs_73_5000.cgf
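The cross matrix of Table 7.1 can be enumerated with a short sketch. The pattern graphs_<nodes>_<graphs>.cgf is inferred from the table entries; the names GRAPH_SIZES, KB_SIZES, kb_file_name and test_matrix are hypothetical, and the one entry the table spells as graph_5_5000.cgf is not special-cased here.

```python
from typing import Dict, Tuple

GRAPH_SIZES = [5, 11, 21, 31, 53, 73]  # nodes per KB graph (rows of Table 7.1)
KB_SIZES = [1, 1000, 2500, 5000]       # graphs per KB (columns of Table 7.1)


def kb_file_name(nodes: int, graphs: int) -> str:
    """File name for one cell of the matrix, following the inferred pattern."""
    return f"graphs_{nodes}_{graphs}.cgf"


# One file per (graph size, KB size) combination.
test_matrix: Dict[Tuple[int, int], str] = {
    (nodes, graphs): kb_file_name(nodes, graphs)
    for nodes in GRAPH_SIZES
    for graphs in KB_SIZES
}
```

The matrix has 6 x 4 = 24 cells, one for each file listed in the table.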
7.2.1.1 Increase # of Graphs in KB
One parameter being examined was how the number of graphs within the
knowledge base affected the running time. Therefore, 1, 5, 100, 1000, 2500 and 5000
graphs were stored in the knowledge base for each graph size. However, as will be seen
in the Results section below (see Section 7.4), the times for 1, 5 and 100 graphs in a
knowledge base were so low that there was no significant difference between the
systems for evaluation. Therefore, only the KBs with 1000, 2500 and 5000 graphs will be
analyzed.
7.2.1.2 Increase # of Nodes in Graphs in KB
Another parameter that was believed to affect the actual execution time of the
projection of the query into the knowledge base was how many nodes were present
in each graph of the knowledge base. This was somewhat arbitrary because a real world
knowledge base would not have a fixed number of nodes in every graph. In fact, the
sizes of the graphs would be small for factual data, medium for definitional data, but
larger for partial and complete model data. As seen in Table 7.1 above, the number of
nodes in the graphs of the KBs was increased in the following way: 5, 11, 21, 31, 53,
and 73.
A sample graph from the single graph KB for each node size is:
5-nodes:
(OnTable [Block*b] [Table])(NAME ?b [Number])
11-nodes:
(ATTR [Block*b] [Color])(NAME ?b [Number])(CHRC ?b [Shape])(LOC ?b [Place])(OnTable ?b [Table])
21-nodes:
(Above [Block*b2] [Block*b1])(OnTable ?b1 [Table])(ATTR1 ?b1 [Color])(NAME1 ?b1 [Number])(CHRC1 ?b1 [Shape])(LOC1 ?b1 [Place])(ATTR2 ?b2 [Color])(NAME2 ?b2 [Number])(CHRC2 ?b2 [Shape])(LOC2 ?b2 [Place])
31-nodes:
(Above1 [Block*b2] [Block*b1])(Above2 [Block*b3] ?b2)(OnTable ?b1 [Table])
(ATTR1 ?b1 [Color])(NAME1 ?b1 [Number])(CHRC1 ?b1 [Shape])(LOC1 ?b1 [Place])
(ATTR2 ?b2 [Color])(NAME2 ?b2 [Number])(CHRC2 ?b2 [Shape])(LOC2 ?b2 [Place])
(ATTR3 ?b3 [Color])(NAME3 ?b3 [Number])(CHRC3 ?b3 [Shape])(LOC3 ?b3 [Place])
53-nodes:
(Above1 [Block*b2] [Block*b1])(Above2 [Block*b3] ?b2)(OnTable1 ?b1 [Table*t1])
(OnTable2 [Block*b4] ?t1)(Above3 [Block*b5] ?b4)(NAMET ?t1 [Number])
(ATTR1 ?b1 [Color])(NAME1 ?b1 [Number])(CHRC1 ?b1 [Shape])(LOC1 ?b1 [Place])
(ATTR2 ?b2 [Color])(NAME2 ?b2 [Number])(CHRC2 ?b2 [Shape])(LOC2 ?b2 [Place])
(ATTR3 ?b3 [Color])(NAME3 ?b3 [Number])(CHRC3 ?b3 [Shape])(LOC3 ?b3 [Place])
(ATTR4 ?b4 [Color])(NAME4 ?b4 [Number])(CHRC4 ?b4 [Shape])(LOC4 ?b4 [Place])
(ATTR5 ?b5 [Color])(NAME5 ?b5 [Number])(CHRC5 ?b5 [Shape])(LOC5 ?b5 [Place])
73-nodes:
(Above1 [Block*b2] [Block*b1])(Above2 [Block*b3] ?b2)(OnTable1 ?b1 [Table*t1])
(OnTable2 [Block*b4] ?t1)(Above3 [Block*b5] ?b4)(Above4 [Block*b6] ?b5)
(Above5 [Block*b7] ?b6)(NAMET ?t1 [Number])
(ATTR1 ?b1 [Color])(NAME1 ?b1 [Number])(CHRC1 ?b1 [Shape])(LOC1 ?b1 [Place])
(ATTR2 ?b2 [Color])(NAME2 ?b2 [Number])(CHRC2 ?b2 [Shape])(LOC2 ?b2 [Place])
(ATTR3 ?b3 [Color])(NAME3 ?b3 [Number])(CHRC3 ?b3 [Shape])(LOC3 ?b3 [Place])
(ATTR4 ?b4 [Color])(NAME4 ?b4 [Number])(CHRC4 ?b4 [Shape])(LOC4 ?b4 [Place])
(ATTR5 ?b5 [Color])(NAME5 ?b5 [Number])(CHRC5 ?b5 [Shape])(LOC5 ?b5 [Place])
(ATTR6 ?b6 [Color])(NAME6 ?b6 [Number])(CHRC6 ?b6 [Shape])(LOC6 ?b6 [Place])
(ATTR7 ?b7 [Color])(NAME7 ?b7 [Number])(CHRC7 ?b7 [Shape])(LOC7 ?b7 [Place])
It should be noted that, starting with the 21-node KB graph, relation types needed to
be repeated. Therefore, a number was added to the relation type
in order to make it unique.
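The node counts and the uniqueness of relation types in the sample graphs can be checked with a small sketch. It assumes the flat CGIF style used in these samples, where each relation opens with “(Name” and each concept is one bracketed box, and it does not handle nested contexts; the function names are hypothetical.

```python
import re
from typing import List


def relation_types(cgif: str) -> List[str]:
    """Relation type names, taken from the text following each opening parenthesis."""
    return re.findall(r"\((\w+)", cgif)


def node_count(cgif: str) -> int:
    """Concept boxes plus relations; bound references such as ?b add no new node."""
    concepts = re.findall(r"\[[^\]]*\]", cgif)
    return len(concepts) + len(relation_types(cgif))


def has_unique_relation_types(cgif: str) -> bool:
    names = relation_types(cgif)
    return len(names) == len(set(names))


GRAPH_5 = "(OnTable [Block*b] [Table])(NAME ?b [Number])"
GRAPH_21 = (
    "(Above [Block*b2] [Block*b1])(OnTable ?b1 [Table])"
    "(ATTR1 ?b1 [Color])(NAME1 ?b1 [Number])(CHRC1 ?b1 [Shape])(LOC1 ?b1 [Place])"
    "(ATTR2 ?b2 [Color])(NAME2 ?b2 [Number])(CHRC2 ?b2 [Shape])(LOC2 ?b2 [Place])"
)
```

Applied to the 21-node sample, the sketch counts 11 concepts and 10 relations, and the numbered relation types ATTR1 and ATTR2 keep every type unique.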
7.2.1.3 Increase # of Nodes in Query Graph
Returning to the ‘typical case’ discussed in Section 5.3, it was proposed that
smaller query graphs would take less time to project onto a KB with larger graphs, i.e.,
graphs with more nodes. Therefore, query graphs ranging in size from 3 nodes all the
way to 73 nodes were tested, given the constraint that no query graph was larger in size
than the KB graph. This is so that the projection was always an injective projection, as
explained in Section 5.1.1.
Examples of several of the query graphs for the projection are given below in
CGIF:
3-nodes:
(ATTR [Block] [Color])
5-nodes:
(ATTR1 [Block*b] [Color])(NAME1 ?b [Number])
7-nodes:
(ATTR [Block*b] [Color])(NAME ?b [Number])(OnTable ?b [Table])
9-nodes:
(ATTR [Block*b] [Color])(NAME ?b [Number])(CHRC ?b [Shape])(LOC ?b [Place])
15-nodes:
(Above [Block*b2] [Block*b1])(OnTable ?b1 [Table])
(ATTR1 ?b1 [Color])(NAME1 ?b1 [Number])(CHRC1 ?b1 [Shape])(LOC1 ?b1 [Place])
(ATTR2 ?b2 [Color])
27-nodes:
(Above1 [Block*b2] [Block*b1])(Above2 [Block*b3] ?b2)(OnTable ?b1 [Table])
(ATTR1 ?b1 [Color])(NAME1 ?b1 [Number])(CHRC1 ?b1 [Shape])(LOC1 ?b1 [Place])
(ATTR2 ?b2 [Color])(NAME2 ?b2 [Number])(CHRC2 ?b2 [Shape])(LOC2 ?b2 [Place])
(ATTR3 ?b3 [Color])(NAME3 ?b3 [Number])
43-nodes:
(Above1 [Block*b2] [Block*b1])(Above2 [Block*b3] ?b2)(OnTable1 ?b1 [Table*t1])
(OnTable2 [Block*b4] ?t1)(Above3 [Block*b5] ?b4)
(ATTR1 ?b1 [Color])(NAME1 ?b1 [Number])(CHRC1 ?b1 [Shape])(LOC1 ?b1 [Place])
(ATTR2 ?b2 [Color])(NAME2 ?b2 [Number])(CHRC2 ?b2 [Shape])(LOC2 ?b2 [Place])
(ATTR3 ?b3 [Color])(NAME3 ?b3 [Number])(CHRC3 ?b3 [Shape])(LOC3 ?b3 [Place])
(ATTR4 ?b4 [Color])(NAME4 ?b4 [Number])
(ATTR5 ?b5 [Color])(NAME5 ?b5 [Number])
Not all of the query graphs are given as examples, especially if the graph structure
is defined in the subsection above for the KB. It should be noted that some of the
3-node query graphs were not exactly the one given, because several of the query
graphs had to be slightly modified to make the relations match. This happened not only
with the 3-node queries, but with several of the others as well. At this time, pCG did
not have a relation hierarchy to account for relation types that were actually
specializations of other relations, so CPE was not given one either.
The set of queries evaluated with each KB attempted to test ‘typical’
queries that a user might ask when using a query-answer system. This
is why not all queries that met the injective projection requirement were tested against
all KBs. This is very obvious when examining the data in Table 7.2, where the query
graph with 7 nodes is only used when testing the 11-node KB graph. Looking at the
structure of the 11-node KB graph, one can see that the “OnTable” relation without an
“Above” relation is only used in this graph. Therefore, a query looking for a block with
a name and a color that is directly on the table, without a block above it, would only
appear in these graphs. That is why this query graph was tested only with
this KB graph structure.
Table 7.2: Single Relation: Query Graph Size Run vs Number of Nodes in KB Graphs.
Query graph sizes (nodes): 3 5 7 9 11 15 21 27 31 43 53 63 73
KB graph nodes   Query sizes run
5                X X
11               X X X X X
21               X X X X X X
31               X X X X X X X
53               X X X X X X X X X X
73               X X X X X X X X X X X X
7.2.2 Multiple Appearance of Relation with a Graph
Because some systems are not able to process multiple relations of
the same type within a single graph, and because this ability is perceived to be
necessary for any system working as a general query-answer system, tests were
performed on only the two data structure versions of CPE to validate that this projection
algorithm is in fact able to handle this type of data. As in the above section, multiple
sizes of graphs within the knowledge base were tested, as well as multiple sizes of query
graphs. Because these tests were for validation and not for measuring execution time,
multiple KB sizes were not tested.
7.2.2.1 Increase # of Nodes in Graphs in KB
Each of these graphs in the KB is designed to test that a query graph that is
contained in more than one subgraph will produce all valid projections. In order to test
several different query graphs with several different node sizes, KB graphs with 13, 23,
33 and 55 nodes were tested.
A sample graph from the single graph KB for each node size is:
13-nodes:
[Block*b1][Block*b2](Above ?b1 ?b2)(OnTable ?b2 [Table])(ATTR ?b1 [Color])(NAME ?b1 [Number])(ATTR ?b2 [Color])(NAME ?b2 [Number])
23-nodes:
(Above [Block*b1] [Block*b2])(OnTable ?b2 [Table*t1])(NAME ?t1 [Number])(ATTR ?b1 [Color])(NAME ?b1 [Number])(CHRC ?b1 [Shape])(LOC ?b1 [Place])(ATTR ?b2 [Color])(NAME ?b2 [Number])(CHRC ?b2 [Shape])(LOC ?b2 [Place])
33-nodes:
(Above [Block*b2] [Block*b1])(Above [Block*b3] ?b2)(OnTable ?b1 [Table*t1])(NAME ?t1 [Number])(ATTR ?b1 [Color])(NAME ?b1 [Number])(CHRC ?b1 [Shape])(LOC ?b1 [Place])(ATTR ?b2 [Color])(NAME ?b2 [Number])(CHRC ?b2 [Shape])(LOC ?b2 [Place])(ATTR ?b3 [Color])(NAME ?b3 [Number])(CHRC ?b3 [Shape])(LOC ?b3 [Place])
55-nodes:
(Above [Block*b2] [Block*b1])(Above [Block*b3] ?b2)(OnTable ?b1 [Table*t1])(OnTable [Block*b4] ?t1)(Above [Block*b5] ?b4)(NAME ?t1 [Number])(ATTR ?t1 [Legs])(ATTR ?b1 [Color])(NAME ?b1 [Number])(CHRC ?b1 [Shape])(LOC ?b1 [Place])(ATTR ?b2 [Color])(NAME ?b2 [Number])(CHRC ?b2 [Shape])(LOC ?b2 [Place])(ATTR ?b3 [Color])(NAME ?b3 [Number])(CHRC ?b3 [Shape])(LOC ?b3 [Place])(ATTR ?b4 [Color])(NAME ?b4 [Number])(CHRC ?b4 [Shape])(LOC ?b4 [Place])(ATTR ?b5 [Color])(NAME ?b5 [Number])(CHRC ?b5 [Shape])(LOC ?b5 [Place])
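The node counts used as labels for these sample graphs can be checked mechanically. The following is a minimal sketch and is not part of CPE or pCG. It assumes the flat CGIF style of the listings above, where each bracketed box defines one concept node, each parenthesized tuple is one relation node, and a bound reference such as ?b1 adds no new node. The function name count_nodes is hypothetical.

```python
def count_nodes(cgif: str) -> int:
    """Count concept boxes plus relation tuples in a flat CGIF string."""
    concepts = cgif.count("[")   # every [Type] or [Type*label] defines a concept node
    relations = cgif.count("(")  # every (REL arg1 arg2) is a relation node
    return concepts + relations


# The 13-node KB graph and the 5-node query graph, copied from this chapter.
KB_13 = (
    "[Block*b1][Block*b2](Above ?b1 ?b2)(OnTable ?b2 [Table])"
    "(ATTR ?b1 [Color])(NAME ?b1 [Number])"
    "(ATTR ?b2 [Color])(NAME ?b2 [Number])"
)
QUERY_5 = "(ATTR [Block*b] [Color])(NAME ?b [Number])"

print(count_nodes(KB_13))    # 7 concepts + 6 relations
print(count_nodes(QUERY_5))  # 3 concepts + 2 relations
```

Under this counting convention the 13-node graph has 7 concepts and 6 relations, which is where its label comes from.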
7.2.2.2 Increase # of Nodes in Query Graph
When examining Table 7.3, it is seen that not as many variations of query graphs
were examined. This is because the interest here was in validating that multiple
projection graphs could be found within the KB graphs. There was only a limited
number of nodes that did in fact appear in some form of multiple projection; after that,
as the number of nodes in the query graph grew, the projection operation could only
find a single subgraph projection from the query graph onto the KB graph.
Table 7.3: Multi-Relation: Query Graph Size Run vs Number of Nodes in KB Graphs.
KB \ Query    3    5    9   11
13            X    X
23            X    X    X
33            X    X    X    X
55            X    X    X    X
The query graph node structure was designed to produce multiple projections
given the KB graph. The actual query graphs for the projection are given below in CGIF:
3-nodes:
(ATTR [Block] [Color])
5-nodes:
(ATTR [Block*b] [Color])(NAME ?b [Number])
9-nodes:
(ATTR [Block*b] [Color])(NAME ?b [Number])(CHRC ?b [Shape])(LOC ?b [Place])
11-nodes:
(ATTR [Block*b] [Color])(NAME ?b [Number])(CHRC ?b [Shape])(LOC ?b [Place])(OnTable ?b [Table])
Each of these query graphs will produce multiple projection graphs when using the
KBs discussed in Section 7.2.2.1. As an example of how this gives multiple
projection graphs, if the 5-node query graph is projected onto the 13-node KB graph,
it results in the following two projections:
1. (ATTR [Block*b1] [Color])(NAME ?b1 [Number])
2. (ATTR [Block*b2] [Color])(NAME ?b2 [Number])
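The two projections listed above can be reproduced with a small brute-force search. This is only an illustrative sketch, not the CPE projection algorithm of Chapter 5. It assumes exact matching of concept type labels (no type hierarchy), and the identifiers t, c1, n1, c2 and n2 are made-up labels for concepts that are unlabeled in the CGIF listing. The function name find_projections is hypothetical.

```python
from itertools import permutations


def find_projections(q_types, q_triples, kb_types, kb_triples):
    """Return every injective map from query concepts to KB concepts that
    keeps concept types equal and sends each query triple onto a KB triple."""
    kb_set = set(kb_triples)
    q_ids = list(q_types)
    found = []
    for image in permutations(kb_types, len(q_ids)):
        mapping = dict(zip(q_ids, image))
        if any(q_types[q] != kb_types[k] for q, k in mapping.items()):
            continue  # a concept type does not match
        if all((rel, mapping[a], mapping[b]) in kb_set for rel, a, b in q_triples):
            found.append(mapping)
    return found


# 13-node KB graph from Section 7.2.2.1 (t, c1, n1, c2, n2 are made-up labels).
kb_types = {"b1": "Block", "b2": "Block", "t": "Table",
            "c1": "Color", "n1": "Number", "c2": "Color", "n2": "Number"}
kb_triples = [("Above", "b1", "b2"), ("OnTable", "b2", "t"),
              ("ATTR", "b1", "c1"), ("NAME", "b1", "n1"),
              ("ATTR", "b2", "c2"), ("NAME", "b2", "n2")]

# 5-node query graph: (ATTR [Block*b] [Color])(NAME ?b [Number])
q_types = {"b": "Block", "c": "Color", "n": "Number"}
q_triples = [("ATTR", "b", "c"), ("NAME", "b", "n")]

for m in find_projections(q_types, q_triples, kb_types, kb_triples):
    print(m)
```

The search tries every injective assignment, so it is only practical for graphs of this size; it is meant to show what counts as a projection, not how to find one efficiently.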
7.3 Results of Each Experiment System
This section begins with an overview of the results seen for each of the
experiment systems.
7.3.1 pCG - Original Notio
As will be seen and discussed in the sections below, the pCG system is very
stable when the query graphs are small and when there are few graphs in
the knowledge base. As the size of the query graphs grows towards the size of the graphs
within the knowledge base, and as the number of graphs within the knowledge base grows
larger, the error span increases (see error bar data in Section D.2 of Appendix D)
and the system becomes very unstable.
7.3.2 CP Environment
The new projection algorithm presented in Chapter 5, Section 5.2.2, gave
interesting results in both forms of the tested data structures. The array implementation
did very well on the typical case (as it was designed to do); the hash table
implementation did not come on strong until the size of the graphs within the KBs was
increased. More information is given below on why it is believed that these results were
seen.
7.3.2.1 Array (Vector)
As laid out in Chapter 6, the data structures here were all arrays. These data
structures were stored unsorted, except that each concept, relation, and cstriple
had a unique label and was stored according to its appearance
within the CGIF formatted graphs in the file. This often caused the most basic concept
node in the graph to be stored first in the list, and therefore to be
quickly retrieved during the projection operation. However, as the arrays became longer
and some of the concept nodes had an equal number of links, it can be seen that the time
needed to check the structure and build the projection increased. But the increase and
the shape of the resulting polynomial never went outside of the predicted analysis of
the algorithm given in Chapter 5, Section 5.3.3.
7.3.2.2 Hash Tables
This data structure implementation behaved as expected. In using a perfect
hash, 1) extra time was needed for storage; 2) more space was needed in order for the
KB to be resident in memory; and 3) there was extra overhead in processing the hash
tables. However, even though the projection of small query graphs onto small KB
graphs did not give excellent results, as the size of the graphs within the KBs increased
and the number of graphs within the KB increased, the simple linear regression [73]
showed that the execution time was linear as the size of the query graphs increased. It
is believed that the reason these results were not seen with the more ‘typical case’
is that the hash tables were designed and implemented so that no collisions would
happen within the tables. This added both to the amount of space needed for the KB
to be resident in memory and to the overhead during processing. However, when
the execution time needed for the actual projection in the other implementation reached
the overhead for this implementation, this implementation became more efficient.
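The trade-off between the two storage schemes can be illustrated with a small sketch. This is not the CPE code. The class names ArrayStore and HashStore and the labels c0 to c999 are hypothetical, and a Python dict stands in for the perfect hash: a dict is not a perfect hash and resolves collisions internally, so the single step reported for it is nominal rather than measured.

```python
class ArrayStore:
    """Unsorted list of (label, node) pairs kept in order of appearance."""

    def __init__(self):
        self.items = []

    def add(self, label, node):
        self.items.append((label, node))

    def find(self, label):
        """Linear scan; returns (node, number of labels compared)."""
        for steps, (key, node) in enumerate(self.items, start=1):
            if key == label:
                return node, steps
        return None, len(self.items)


class HashStore:
    """Label-keyed table; a Python dict stands in for the perfect hash."""

    def __init__(self):
        self.table = {}

    def add(self, label, node):
        self.table[label] = node

    def find(self, label):
        """Single keyed lookup, reported as one step regardless of table size."""
        return self.table.get(label), 1


labels = [f"c{i}" for i in range(1000)]
array_store, hash_store = ArrayStore(), HashStore()
for i, label in enumerate(labels):
    array_store.add(label, i)
    hash_store.add(label, i)

print(array_store.find("c0"))    # first-stored label is found immediately
print(array_store.find("c999"))  # last-stored label needs a full scan
print(hash_store.find("c999"))   # keyed lookup does not depend on position
```

The sketch shows only the lookup side. The extra storage time and memory that the text attributes to the hash tables are not modeled here.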
7.4 Results of Each # of Nodes in KB
As discussed previously in this chapter, each of the graph sizes was placed in a
knowledge base of size 1, 5, 100, 1000, 2500 and 5000 graphs. Timings were collected
on runs against part of the graph files, but all of the relevant query graphs. However,
until there were at least 1000 graphs in the KB, there was really no separation in the
acquired execution times; therefore, for each graph size below only the timings for 1000,
2500 and 5000 graphs will be given and discussed.
7.4.1 5 nodes in KB graphs
Here are the result charts for the test runs performed using the knowledge bases
containing graphs with 5 nodes. First is Figure 7.6, containing the results for
the projection of query graphs against a KB containing 1000 graphs, all with 5 nodes.
As discussed in Section 7.2.1.2, the 5-node KB graph holds the name of the block
that is on the table in the ‘blocks world’ domain. Second is Figure 7.7, containing
the results for the projection of query graphs against a KB containing 2500 graphs,
all with 5 nodes.
[Chart: time in milliseconds (0 to 250) vs. # of nodes in query graph (2 to 6). Series: pCG, CPE, CPEHash; trend lines: Poly. (pCG), Poly. (CPE), Linear (CPEHash).]
Figure 7.6: 5 nodes in KB of 1000 Graphs.
Third is Figure 7.8, containing the results for the projection of query graphs
against a KB containing 5000 graphs, all with 5 nodes. Looking at this set of 3 charts,
there really is not enough information to indicate what the real growth curve is for the
projection of the query graphs onto the KB graphs. Therefore, the tests were expanded
to include more nodes and more cstriples.
[Chart: time in milliseconds (0 to 600) vs. # of nodes in query graph (2 to 6). Series: pCG, CPE, CPEHash; trend lines: Poly. (pCG), Poly. (CPE), Linear (CPEHash).]
Figure 7.7: 5 nodes in KB of 2500 Graphs.
[Chart: time in milliseconds (0 to 1200) vs. # of nodes in query graph (2 to 6). Series: pCG, CPE, CPEHash; trend lines: Poly. (pCG), Poly. (CPE), Linear (CPEHash).]
Figure 7.8: 5 nodes in KB of 5000 Graphs.
7.4.2 11 nodes in KB graphs
Here are the result charts for the test runs performed using the knowledge bases
containing graphs with 11 nodes in them. First is Figure 7.9 containing the results for
the projection of query graphs against a KB containing 1000 graphs all with 11 nodes.
These 11 node graphs from the KB, as seen in Section 7.2.1.2, contain the block on the
table as well as the name, color, shape and location of the block. Second is Figure 7.10
containing the results for the projection of query graphs against a KB containing 2500
graphs all with 11 nodes.
[Chart: time in milliseconds (0 to 250) vs. # of nodes in query graph (2 to 12). Series: pCG, CPE, CPEHash; trend lines: Poly. (pCG), Poly. (CPE), Linear (CPEHash).]
Figure 7.9: 11 nodes in KB of 1000 Graphs.
[Chart: time in milliseconds (0 to 800) vs. # of nodes in query graph (2 to 12). Series: pCG, CPE, CPEHash; trend lines: Poly. (pCG), Poly. (CPE), Linear (CPEHash).]
Figure 7.10: 11 nodes in KB of 2500 Graphs.
Third is Figure 7.11, containing the results for the projection of query graphs
against a KB containing 5000 graphs, all with 11 nodes. Because there are more nodes
and cstriples in the KB graphs, more query graphs can be projected onto these graphs,
giving more of a gradation in the set of 3 charts. The slope and shape of the curves
become more distinct as the number of nodes in the query graphs increases. Even with this
smaller number of query graphs being tested, CPEHash is showing a linear slope and
CPE (array format) performs faster than the other two systems.
[Chart: time in milliseconds (0 to 1600) vs. # of nodes in query graph (2 to 12). Series: pCG, CPE, CPEHash; trend lines: Poly. (pCG), Poly. (CPE), Linear (CPEHash).]
Figure 7.11: 11 nodes in KB of 5000 Graphs.
7.4.3 21 nodes in KB graphs
Here are the result charts for the test runs performed using the knowledge bases
containing graphs with 21 nodes. First is Figure 7.12, containing the results
for the projection of query graphs against a KB containing 1000 graphs, all with 21
nodes. These 21-node graphs not only have the information for the block on the table,
including the name, color, shape and location of the block (see Section 7.2.1.2), but
also the same information for the block located above the first block. Second
is Figure 7.13, containing the results for the projection of query graphs against a KB
containing 2500 graphs, all with 21 nodes.
[Chart: time in milliseconds (0 to 400) vs. # of nodes in query graph (2 to 22). Series: pCG, CPE, CPEHash; trend lines: Poly. (pCG), Poly. (CPE), Linear (CPEHash).]
Figure 7.12: 21 nodes in KB of 1000 Graphs.
Third is Figure 7.14, containing the results for the projection of query graphs
against a KB containing 5000 graphs, all with 21 nodes. In all three of these charts, the
slopes of the execution time for projecting the query graph onto the KB graphs for each
system are the same. However, the execution time for projecting the query graph onto
the KB is definitely a function of the number of graphs in the KB. For the actual graph
isomorphism, that is, when the query graph is the same size as the KB graph, the
execution times actually come together.
[Chart: time in milliseconds (0 to 1000) vs. # of nodes in query graph (2 to 22). Series: pCG, CPE, CPEHash; trend lines: Poly. (pCG), Poly. (CPE), Linear (CPEHash).]
Figure 7.13: 21 nodes in KB of 2500 Graphs.
[Chart: time in milliseconds (0 to 2000) vs. # of nodes in query graph (2 to 22). Series: pCG, CPE, CPEHash; trend lines: Poly. (pCG), Poly. (CPE), Linear (CPEHash).]
Figure 7.14: 21 nodes in KB of 5000 Graphs.
7.4.4 31 nodes in KB graphs
Here are the result charts for the test runs performed using the knowledge bases
containing graphs with 31 nodes. First is Figure 7.15, containing the results
for the projection of query graphs against a KB containing 1000 graphs, all with 31
nodes. These 31-node graphs include the two blocks with their information, including
the name, color, shape and location of each block (see Section 7.2.1.2). They also indicate
that the first block is on the table, and that a third block with all of its information is on
top of the second block.
[Chart: time in milliseconds (0 to 600) vs. # of nodes in query graph (2 to 32). Series: pCG, CPE, CPEHash; trend lines: Poly. (pCG), Poly. (CPE), Linear (CPEHash).]
Figure 7.15: 31 nodes in KB of 1000 Graphs.
Second is Figure 7.16, containing the results for the projection of query graphs
against a KB containing 2500 graphs, all with 31 nodes, and third is Figure 7.17,
containing the results for the projection of query graphs against a KB containing 5000 graphs,
all with 31 nodes. These charts now show quite clearly that as the number of
nodes in both the query and KB graphs increases, the shape of the curves becomes clearer.
The curves are coming very close to crossing, indicating that with larger graphs some
algorithms perform better than with small graphs. In fact, the curves are very close
together when looking at a large KB.
[Chart: time in milliseconds (0 to 1400) vs. # of nodes in query graph (2 to 32). Series: pCG, CPE, CPEHash; trend lines: Poly. (pCG), Poly. (CPE), Linear (CPEHash).]
Figure 7.16: 31 nodes in KB of 2500 Graphs.
[Chart: time in milliseconds (0 to 3000) vs. # of nodes in query graph (2 to 32). Series: pCG, CPE, CPEHash; trend lines: Poly. (pCG), Poly. (CPE), Linear (CPEHash).]
Figure 7.17: 31 nodes in KB of 5000 Graphs.
7.4.5 53 nodes in KB graphs
Here are the result charts for the test runs performed using the knowledge bases
containing graphs with 53 nodes. First is Figure 7.18, containing the results for
the projection of query graphs against a KB containing 1000 graphs, all with 53 nodes.
These 53-node graphs (see Section 7.2.1.2) include all three of the blocks in one stack
on the table with their information, including the name, color, shape and location of
each block. Also, there is a second stack on the same table with two more blocks, including
their information.
[Chart: time in milliseconds (0 to 1000) vs. # of nodes in query graph (2 to 52). Series: pCG, CPE, CPEHash; trend lines: Poly. (pCG), Poly. (CPE), Linear (CPEHash).]
Figure 7.18: 53 nodes in KB of 1000 Graphs.
Second is Figure 7.19, containing the results for the projection of query graphs
against a KB containing 2500 graphs, all with 53 nodes. Third is Figure
7.20, containing the results for the projection of query graphs against a KB containing
5000 graphs, all with 53 nodes. Because the number of nodes in the KB graphs has
gotten large enough, the curves have now crossed, indicating that the overhead from
the hash tables no longer has as much effect on the overall execution time. The
CPEHash system continues to show a linear curve, with the cross-over points being
the same in all three charts.
[Chart: time in milliseconds (0 to 2500) vs. # of nodes in query graph (2 to 53). Series: pCG, CPE, CPEHash; trend lines: Poly. (pCG), Poly. (CPE), Linear (CPEHash).]
Figure 7.19: 53 nodes in KB of 2500 Graphs.
[Chart: time in milliseconds (0 to 6000) vs. # of nodes in query graph (2 to 53). Series: pCG, CPE, CPEHash; trend lines: Poly. (pCG), Poly. (CPE), Linear (CPEHash).]
Figure 7.20: 53 nodes in KB of 5000 Graphs.
7.4.6 73 nodes in KB graphs
Here are the result charts for the test runs performed using the knowledge bases
containing graphs with 73 nodes. First is Figure 7.21, containing the results for
the projection of query graphs against a KB containing 1000 graphs, all with 73 nodes.
These 73-node graphs (see Section 7.2.1.2) include six blocks in two stacks on the table
with the information for each block, including the name, color, shape and location of
the block. The name for the table is also part of each graph in the KB.
[Chart: time in milliseconds (0 to 1600) vs. # of nodes in query graph (0 to 75). Series: pCG, CPE, CPEHash; trend lines: Poly. (pCG), Poly. (CPE), Linear (CPEHash).]
Figure 7.21: 73 nodes in KB of 1000 Graphs.
Second is Figure 7.22, containing the results for the projection of query graphs
against a KB containing 2500 graphs, all with 73 nodes. Third is Figure 7.23, containing
the results for the projection of query graphs against a KB containing 5000 graphs, all
with 73 nodes. With these three system tests, it is seen that the 73-node graphs show the
clearest differences in results between pCG, CPE and CPEHash. As before, CPE does the best
with the smallest (fewest number of nodes) query graphs, but when graph isomorphism
is reached, that is, complete coverage of the full graph, the array vector implementation
causes a real slowdown. As with the 53-node charts, the cross-over of the curves happens
when testing the same query graph projection in each chart.
[Chart: time in milliseconds (0 to 4000) vs. # of nodes in query graph (0 to 75). Series: pCG, CPE, CPEHash; trend lines: Poly. (pCG), Poly. (CPE), Linear (CPEHash).]
Figure 7.22: 73 nodes in KB of 2500 Graphs.
[Chart: time in milliseconds (0 to 8000) vs. # of nodes in query graph (0 to 75). Series: pCG, CPE, CPEHash; trend lines: Poly. (pCG), Poly. (CPE), Linear (CPEHash).]
Figure 7.23: 73 nodes in KB of 5000 Graphs.
7.5 Analysis of Results
Each of the results given in the section above is laid out by the number of nodes in the KB
graphs. This appears to be the most direct way of evaluating the results received from
the tests.
7.5.1 Change # of Graphs in KB
Looking at the results above for the 1000, 2500 and 5000 graphs in each KB, the
curves in each chart are the same and just increase in milliseconds in proportion to the
number of graphs added to the KB. Adding large numbers of graphs to the KB puts
stress on the amount of memory space needed for processing because the KB needs to
stay resident in memory. However, on evaluation of each of the KBs by node size, the
shape of the curves in relationship to the three system implementations shows no
change.
7.5.2 Change # of Nodes in KB Graphs
As the number of nodes in the graphs found in the KB increased, the shape of
the curves in the result graphs became more pronounced. That is, as the graph sizes
increased and the problems moved closer to “real life”, the effects of the algorithm
changes and data structures were more prominent. When looking at the results from
the 5-node and 11-node KBs, nearly all the solutions looked the same, except that the
hash tables, because of their added overhead, took longer than both of the other solutions.
However, as the size of the graphs increased, the curves generated by the results took
on a polynomial, exponential or linear shape. By 53 nodes in the KB graphs, the
solutions had started crossing and taking on shape. In the 73-node KB results, the
same crossings seen at 53 nodes were present and the CPE hash table implementation
was definitely a linear result when tested with a simple linear regression [73].
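As a reminder of what a simple linear regression check involves, the sketch below fits a least-squares line and reports the coefficient of determination. It is not the analysis procedure used for these experiments, which is not shown in this chapter. The function name linear_fit is hypothetical, and the timing values are synthetic numbers generated from formulas, not measurements from this work.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept, plus R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Synthetic, made-up timings (NOT measurements from this work).
query_nodes = [3, 5, 9, 11, 21, 31]
linear_ms = [10.0 + 4.0 * n for n in query_nodes]   # exactly linear
cubic_ms = [0.05 * n ** 3 for n in query_nodes]     # polynomial growth

print(linear_fit(query_nodes, linear_ms))
print(linear_fit(query_nodes, cubic_ms))
```

An R^2 value close to 1 indicates that a straight line explains the series well; the cubic series gives a visibly lower value, which is the kind of difference such a test is meant to expose.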
7.5.3 Change # of Nodes in Query Graph
The query graphs projected onto the KB started small and then increased in number
of nodes until the subgraph isomorphism was in fact a graph isomorphism
of the KB graph. Because it was desired to only do an injective projection
operation, the number of nodes in the query graph never exceeded the number of nodes
in the KB graph. As the number of nodes in the query graph increased, the number of
concepts in the anchor list also increased. This created more matches of query concepts
to match graph concepts and therefore more processing of cstriples during the building
of projections phase. Therefore, when the number of nodes in the KB graphs increased,
the execution time of the implementation that was faster with small graphs started to increase.
By the time the KB size reached 53 nodes, even the smaller query graphs were showing
similar execution times. That is, for larger graphs in the KB, the smaller query
graph execution times were much closer together for all implementations than when the KB
graphs were small. Also, as the KB graphs increased in size, more variations in query
graph sizes could be tested, allowing the cross-over for the
CPE hash implementation to be seen. This implementation showed better results than the pCG system
at about 27 nodes in the query graphs and better results than the CPE array
implementation at about 41 nodes in the query graphs. These results were seen for both the
53-node and 73-node KB graphs.
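A cross-over point of this kind can be located by scanning two timing series for the first query size at which one falls below the other. The sketch below shows the idea only. The function name first_crossover is hypothetical, and both series are synthetic curves generated from formulas, chosen to mimic a high-overhead linear curve and a low-overhead cubic curve; they are not the measured timings and do not reproduce the cross-over sizes reported above.

```python
def first_crossover(sizes, candidate_ms, baseline_ms):
    """Return the first size at which candidate_ms is below baseline_ms,
    or None if the candidate is never faster in the tested range."""
    for size, candidate, baseline in zip(sizes, candidate_ms, baseline_ms):
        if candidate < baseline:
            return size
    return None


# Synthetic curves for illustration only (NOT the measured timings).
sizes = list(range(3, 54, 2))                     # odd query sizes 3..53
hash_like = [300.0 + 5.0 * n for n in sizes]      # high overhead, linear growth
array_like = [0.01 * n ** 3 for n in sizes]       # low overhead, cubic growth

print(first_crossover(sizes, hash_like, array_like))
```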
7.5.4 Change # of Identical Relations in Graph
As discussed previously in this chapter, the pCG system cannot process
identical relations within a single graph. Therefore the tests run with multiple instances of
the same relation were confined to validating that the CPE system, with both of its data
structures, was correctly finding all projections (see Section D.3.2 in Appendix D for
actual output). Table 7.4 shows how many projections were found when running the
validation tests. These tests gave the same results for both data structures implemented
in the CPE system.
Table 7.4: Number of Projections Found: Query Graph Size vs KB Graph Size.
KB \ Query    3    5    9   11
13            2    2
23            2    2    2
33            3    3    3    1
55            5    5    5    2
CHAPTER 8
CONCLUSIONS AND FUTURE WORK
8.1 Evaluation of Four Projection Algorithms
Four different, yet related, projection algorithms that use either full Conceptual
Graphs (CGs) or Simple Conceptual Graphs (SCGs) have been described (see Chapter
5). Using Table 8.1, comparisons between basic units, type of graphs, number of
possible projections found, projection question analysis, overall problem analysis of
projection operation algorithm execution time and actual projection creation execution
time will be evaluated.
Table 8.1: Comparison of Four Algorithms.
               M&C          Croitoru     Notio        New Proj
basic unit     relations    relations    relations    concepts
works over     SCGs         SCGs         CGs          CGs
projs found    all          # relations  1            all
proj question  NP-Complete  NP-Complete  NP-Complete  NP-Complete
problem        NP-Hard      NP-Hard      NP-Hard      NP-Hard
proj alg       non-impl     non-impl     n^3          n^3/n
The Mugnier and Chein and Croitoru algorithms use SCGs, and Notio and the
new algorithm work over full CGs. Looking back at the example shown when
discussing the projection operation, Notio would only find one projection because it was
only designed to look for a single projection graph. Croitoru’s algorithm includes a
stop mechanism such that the total number of relations in the query graph equals the
number of possible projections; therefore, at times it may not find all projections even
though the actual algorithm should find all projections.
It is not clear from the Mugnier and Chein 1992 work [74] whether they can handle
two concept pairs with the same relationship between them in a projection operation.
However, later work [75] indicates that the same relationship between
different concepts can be found and that multiple projections are possible between two CGs,
but the algorithm is based on SCGs, which do not use actors and are not directed graphs.
The Mugnier and Chein algorithm is also based on the relations found within the graph
and must traverse all of their signatures to discover if there is a subgraph morphism.
The new algorithm is based on the conceptual units, or concepts, within the graph and
can stop searching as soon as there is no match in the KB graph for a concept or
concept triple of one of the query graph’s conceptual units.
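The early stop described for the new algorithm can be pictured with a small matching sketch. This is not the CPE matching algorithm of Chapter 5. It assumes exact matching of concept type labels, borrows the term anchor list only loosely from Section 7.5.3, and uses the hypothetical function name build_anchor_list; the concept identifiers are made-up labels for the 13-node KB graph of Section 7.2.2.1.

```python
def build_anchor_list(q_types, kb_types):
    """For each query concept, collect the KB concepts of the same type.
    Returns None as soon as one query concept has no candidate."""
    anchors = {}
    for q_id, q_type in q_types.items():
        candidates = [k for k, k_type in kb_types.items() if k_type == q_type]
        if not candidates:
            return None  # early stop: no projection can exist
        anchors[q_id] = candidates
    return anchors


# Concepts of the 13-node KB graph (identifiers are made-up labels).
kb_types = {"b1": "Block", "b2": "Block", "t": "Table",
            "c1": "Color", "n1": "Number", "c2": "Color", "n2": "Number"}

# A query whose concepts all have candidates in the KB graph.
print(build_anchor_list({"b": "Block", "c": "Color"}, kb_types))

# A query with a Legs concept, which the 13-node graph does not contain.
print(build_anchor_list({"b": "Block", "l": "Legs"}, kb_types))
```

In the second call the search is abandoned at the first unmatched concept, without examining any relations or triples.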
Mugnier and Chein’s algorithm does the whole projection operation as a single
injective projection algorithm, whereas Croitoru, Notio and the new algorithm all use
some form of preprocessing. Notio and the new algorithm have a complete separation
between the preprocessing algorithm and projection, whereas Croitoru uses the
preprocessing algorithm inside of the actual projection, therefore giving the same running
time for both the overall algorithm and the actual projection. Notio does preprocessing
at storage time that helps in constructing the projection. However, the actual projection
search problem after the preprocessing is still NP-Hard.
The new algorithm splits the overall projection algorithm into two parts,
matching and projection construction. Data structures are then used between these two
algorithms so that the structure of the graphs helps in the projection process. In the
most common case the matching algorithm is the longest running part of the overall
algorithm because the actual projection execution is polynomial.
8.1.1 Strengths
All of these algorithms address decision problems that are in the class
NP-Complete and have search problems that are in the class NP-Hard; therefore,
the strength of the algorithms comes into play in how they handle ‘typical case’
situations where they would be used.
Since the amount of information available in database records and semantic web pages
is increasing yearly, algorithms that can work with knowledge
bases with large amounts of data will be at the forefront.
8.1.2 Weaknesses
Notio has a definite weakness in that it is actually designed to only find one
possible projection graph even when others are available. Also, the SCG Relation algo-
rithm of Croitoru has a weakness in the stop mechanism of the algorithm. This should
be modified to not exclude any possibilities.
The CPE algorithm potentially could take a large amount of time to process the
matching or preprocessing algorithm. However, in most cases discussed within this
work it does not operate on this end of the spectrum.
8.2 Data Structures and Algorithms Effectiveness Comparison for Implemented Algorithms
Since the Mugnier and Chein algorithm and the Croitoru algorithm were not
implemented, comparison will only be made between the Notio and CPE algorithms. The
pCG system that implemented the Notio algorithm used an added phase between storage
and projection to impose internal structure on the stored graphs. Even though this
added structural information helped the projection process when the graphs were small
in size and the KBs had few graphs, as graph sizes and KBs increased this added phase
became very costly in time.
The CPE algorithm, once the data structure changes were added, also behaved
differently when the graphs and KBs were small than when these elements
increased.
8.2.1 Strengths
The pCG algorithm was efficient in its size and memory usage. The graph was stored
in a very tight array and hash table structure. Many graphs could be processed before
the memory available had to be increased in order to process the projections.
The CPE array data structure was also efficient in its size and memory usage. In
fact, in all tests it never ran out of memory even when the 5000 graph KB was resident.
The hash table version of CPE had the advantage that it executed in linear time as
the number of nodes in the query graph increased.
8.2.2 Weaknesses
With the pCG algorithm, the Assertion phase became a real ‘bottleneck’ as the
number of graphs in the KB increased. Because it compared all the structures
of all the graphs within the KB when asserting them, it took over an hour of actual
execution time to assert the 73 node KB with both 2500 and 5000 graphs.
The CPE hash table implementation required a lot of resident memory for processing the
large KBs. This was because it used 10000 element hash table indices to be sure that
the labels for all elements (concepts, relations and triples) were unique. The
implementation could have been changed to add an extra part to the processing to redo the unique
identifiers after the graphs were stored, but it was not known how much execution time
would be added to the process.
8.3 Significance of Work
Both in a typical scenario where the query graph is small in size (number of
relationships between concepts) compared to the graphs in the knowledge base, and
in actual execution tests, the new injective projection algorithm: 1) performs
projections on full conceptual graphs, 2) finds all projections even when conceptual relations’
rtypes are not unique, 3) performs the projections faster over a complete KB than the
comparative system, and 4) gives good results when executing against a large KB (5000
graphs).
Data structure modifications, when directly integrated into the projection algorithm,
produced significant improvement when executed over larger KBs with larger size
graphs within the KB.
8.3.1 Full Conceptual Graphs
Even though much work has been done with SCGs, full conceptual graphs with
all their functionality are desirable. This new algorithm does not have the added
restrictions of SCGs and can even process functional relations. Because there are cases
of queries over time and space that require full CGs, this new algorithm is significant.
8.3.2 Finds All Valid Projections
This new algorithm finds all valid projections given a query. Because it is not
known which projection from the KB graph may supply the needed information, it is
necessary to produce all valid projections. Because of the data structure implementation,
finding all valid projections does not really cost any more time than finding one.
8.3.3 Data Structure Integration in Algorithm over Large KB and Graphs
The perfect hash table implementation is more efficient with large KBs and
large graphs within a KB, even though it required much more storage space and memory
allocation. Because the amount of record information within many standard databases is
increasing, and because one may wish to store semantic information from the
Semantic Web, being able to handle large amounts of data and knowledge is critical.
Being able to retrieve a projection onto this large KB is a significant improvement.
8.4 Future Work
As an extension to this dissertation’s work, some lines of research can be continued.
The work on the maximal join algorithm can be improved by using the information
found in this work. Collaboration with other researchers on this algorithm
analysis is also possible.
By evaluating the information about the use of the data structures, this work
can help to develop new ideas for storing knowledge base meta-structures in relational
databases for creating the ability to move factual information to a knowledge base and
then return more information back to the original database.
As new benchmark graphs become available within the research community,
more domains can be tested with this new algorithm and data structures. Time and
space constraints can also be tested with the new algorithm, while adding heuristics to
improve on the constraint processing.
8.4.1 Experiments and Analysis of Maximal Join Algorithm
In Section 5.2.3, an algorithm was presented to describe the maximal join operation
in the same terms as the new algorithm for the projection operation. Now that the author
of the Amine Platform, Adil Kabbaj, wishes to make his system interoperate with other
systems [56] and has implemented the full CG Maximal Join operation, the same
modifications made to the Projection support routine (see Section 5.2.1.3) will be
implemented, tested and analyzed. Amine can be used for comparison and to help in the
validation of the new algorithm.
8.4.2 KB Stored From and To Standard Relational DB
Investigation into possibly storing SCG from relational database records has
begun [130]. Given that the new data structure used in storing the CG, in this work, is
in a hash table format, this structure could be translated into a relational database record
structure. Then this meta-data could be used to store the full structure of the CG. Once
the CG is stored in the database, reconstructing it in a knowledge base would be
straightforward.
8.4.3 Time and Space Constraints
Constraints are divided into several groups. Some constraints work by modifying
and/or evaluating the elements of the domain that are actually processed. Some
constraints apply heuristics to decide what information seen during processing should
continue to be considered. These heuristics may be very simple, such as whether the current
domain element is TRUE or FALSE, maintaining basic truth, or they may be very complex.
In constraint-satisfaction problems quantification operations are used in a Prolog
type fashion to assign values to variables subject to a set of constraints [28].
Constraint specifications give a convenient form for expressing known knowledge while
allowing the system designer to focus on local relationships among entities within the
domain. The next sub-section will discuss heuristic constraints.
Other constraints are concerned with time and space relationships between the
domain elements and between the actual conceptual units within the semantic network.
These constraints use qualitative relationships to propagate over time and space. As
discussed in the Qualitative Section (see 2.1.2.3), these are interval relationships that
are set up “point to point”. In Figure 8.1, adapted from Allen’s 1991 paper (p. 346) [4],
seven of the basic interval relationships originally discussed in the 1983 Allen paper [3]
are shown. The six other relationships, which are the inverses of six of those shown,
are not depicted. In the sections below, these relationships will be discussed
as they relate to time and space.
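The seven basic relationships and their six inverses can be sketched in C++ as a classification over interval end points. This is only an illustration under stated assumptions: the Interval struct, the function name allenRelation, integer time points, and the requirement that start < end are all hypothetical choices, and the relation names follow the usual reading of Allen's 1983 paper rather than any labels in Figure 8.1.

```cpp
#include <string>

// Hypothetical interval on a shared time line; assumes start < end.
struct Interval {
    int start;
    int end;
};

// Names the relationship of interval a to interval b. The seven basic
// relationships are tested directly; any other case is the inverse of a
// basic relationship, found by swapping the arguments.
std::string allenRelation(const Interval& a, const Interval& b) {
    if (a.end < b.start) return "before";
    if (a.end == b.start) return "meets";
    if (a.start == b.start && a.end == b.end) return "equal";
    if (a.start == b.start && a.end < b.end) return "starts";
    if (a.end == b.end && a.start > b.start) return "finishes";
    if (a.start > b.start && a.end < b.end) return "during";
    if (a.start < b.start && a.end > b.start && a.end < b.end) return "overlaps";
    return "inverse of " + allenRelation(b, a);
}
```

Because "equal" is its own inverse, swapping the arguments yields exactly six additional relationships, which gives the thirteen in total.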
Figure 8.1: Interval Time Relationships.
8.4.3.1 Heuristics
Heuristics are criteria, methods, or principles for deciding which among several
alternative courses of action promises to be the most effective in order to achieve some goal
[85]. The idea here is to define a simple criterion that discriminates between good and
bad selections. One may choose a heuristic that is just a rule of thumb that guides to a
selection, or one may look to see if the outcome of applying a heuristic appears to take
one to a “stronger” position. When one has good heuristics, they provide a simple
means of indicating which course of action will lead to the preferred goal with a quick
path even if it is not the most effective [85]. In general, most reasoning problems are
very complex and have large numbers of cases to evaluate to find an answer. Heuristics
allow this number of cases to be reduced and a shorter, even though maybe not the
most direct, solution to be discovered within reasonable time constraints.
Heuristics use quantification operations to prune and shape the evaluation of
information. The information may be checked with heuristics for either its feasibility
to lead to a valuable solution or for its correctness in the world that the system knows
as reality.
8.4.3.2 Time
Time intervals over moments in time are processed using qualitative relationships.
Temporal reasoning requires that the knowledge representation be able to define
and process a snapshot of time. A snapshot is a constraint in the time interval where
only one moment in time, zero duration, is processed. Dean and McDermott [27] saw
time in terms of duration constraints. Figure 8.2 gives an example from the Dean and
McDermott 1987 paper (p. 41) [27] of how a time map can be designed, treating snapshots
as point to point, with the duration constraints encoded between the snapshots.
It is like a time slice across the current states and schematics, which will be called a
situation, of the objects being considered. The situations may be processed in a forward or
backward direction of snapshots.
Figure 8.2: A Simple Time Map.
As can be seen in Figure 8.3, an interval can be set up with a start and end
time and can be assigned as a property of an act. This property is the time duration
of the interval and can be viewed as either a fixed time scale or relative time. In
Figure 8.3 relative time is used because the time line does not have actual time values.
Time intervals when used with a relative scale require a way of knowing where to start
to investigate for information. Following the time map given, the events are suspended,
drop, then falling; a choice is now made between continuing in this time stretch with
bounce, rising, and stop, or with roll and rolling. If the choice is for the bounce, then as the
stop interval finishes, the falling event will return. If the roll event occurs, there will
be no circling back to the falling event. In many temporal reasoning systems, this is
done by time indexing of time tokens at insertion of events; however, as discussed in
the Mukerjee review [76], sometimes the “neat” durations are not available.
Now as one looks at the full time interval, each time slice for an object can be
seen as a constraint. If each of these constraints is viewed as its own act property,
then the full time interval picture will be modified depending on which time slices are
current and/or which constraints are satisfied.
Figure 8.3: Time Chart for a Bouncing Ball.
8.4.3.3 Space
Regions within space at a location are also processed using qualitative functionality.
Unlike temporal reasoning with a starting and ending time, space does not have
time line flow, but can be a multi-dimensional patchwork [76]. However, one can still
look at regions that are space sliced according to locations across processes and
chronicles, but one does not get a concept of input and output to the spatial relationships
[47].
In Figure 8.4, it can be seen that over time a ball that is bouncing appears in
different locations, bounce height, in space (left axes). When working with spatial
constraints [12], the key is to find objects that fall into some spatially organized category
such as a region and then sort according to the category. Because there is only one
object in this example it is harder to see, but if two balls were bouncing, being dropped
at different times, the constraints could be categorized by whether there are zero,
one or two balls in the same space slice.
Figure 8.4: Conceptual Space Diagram for a Bouncing Ball.
8.4.4 Different Domain Problems and Interoperability
The architecture for the CPE system was designed specifically to address
the need to communicate and interoperate with other systems [87, 89]. The next step
is to work with multiple domains of information and start to connect the modules to
as many applications as possible. Work has already been proceeding on using the CPE
knowledge base as the “back-end” for a Story Understanding System [14, 88, 89].
APPENDICES
APPENDIX A
PROGRAMMING LANGUAGE CRITERIA
A.1 Language Evaluation
Each of the applications/systems defined in Chapter 6 may or may not share the
same implementation language. So an added complication, besides different internal
data structures, is that an application may not be able to communicate at a function call
level with another application because they are not written in the same implementation
language.
When looking to design an application with flexible modules, the question
arises what implementation language should be used. Since it was desired that the
system work with conceptual structures as the internal representation, existing CS
systems were examined. First the CS editors that were currently available were evaluated.
These editors, for editing CGs and FCAs, turned out to have different implementation
languages. CharGer [29, 30] is based on the API/Implementation code of Notio which
is written in Java [117, 115]. ARCEdit is a plug-in to PowerPoint [96] and is written
in Visual Basic 6.0. ToscanaJ [6, 7] has an editor as part of its suite of tools written
in Java™. While Docco is actually based on a Conceptual Email Manager [18], written
in C++/QT, the commercial version of the manager [36] is a plug-in for Microsoft
Outlook.
Since Notio, written in Java, is already defined as an API/Implementation code
level and is available with extensible class definitions, the author considered using the
Notio interface for the CGIF (see Section 3.4.4) module. However, there are some
drawbacks in communications to Java (see Section A.2.2), and Notio is in hiatus and is
not currently being enhanced or developed [116].
All of the applications in the conceptual structures community employ different
implementation languages, such as Prolog, XML, Schema, RDF, etc., which made
it difficult to pass even simple syntactic representation data by linking languages in
modules. Files, streams, pipes, blackboards, etc. can be used to pass data information
without passing the actual data structures, but these mechanisms can be slow if there are
a large number of graphs or the graphs are extremely complex. Every time one application
process needs to talk to another, these mechanisms require multiple file descriptors
to be opened. If the applications or systems execute on different machines, the Flexible
Modular Framework, FMF, architecture designed by John Sowa [124] is a flexible way
of passing the syntactic data representation; but, if the applications and systems are able
to be executed on the same machine configuration, a good API/Implementation design
would be more advisable because the module can be linked directly into the existing
application. Communication by files and other stream devices may require a locking
mechanism to be set up, so that one application can know when it is safe to read the
input graphs from another application. The locking of records can cause a problem
when two applications communicate by way of shared databases or message passing
systems, such as MPI. If, on the other hand, an application/system can call another
application/system directly (or can link to it), processing can go more quickly.
However, connecting systems when the implementation languages are not the
same is more difficult, because a straightforward “call” to the other system’s functions
is not always possible. Each language implementation has its own calling specifica-
tions.
In order for data to be transferred between working tools, either all the tools
must be implemented within the same environment as a single application or there
must be an interchange format. When tools are developed through a single system, the
same data structure (or model) can be shared among all the tools so that data can be
stored and retrieved. However, when tools are not part of the same system, they do
not necessarily share the same internal data structure (or design model). To support
interoperability for applications [101], an interchange format must be defined. This
interchange format must be agreed upon by the whole working community. When this
standard format is used to move data between applications, standard benchmark tests
can be developed.
One module concentrated on, with respect to data structures for the processors,
is the use of the Conceptual Graph Interchange Format, CGIF, for communication [122,
126]. This is not to say that a processor is constrained in its internal structures or
implementation by CGIF, but it makes sense to examine the correspondence between CGIF
and the appropriate data types with a view to minimizing the difficulties of parsing and
generating CGIF syntax [87]. A definition of the actual CGIF syntax and semantic
interpretation used for Conceptual Graphs data structure can be seen in section 3.4.4.
A.1.1 Visual Basic .Net
If it were desirable to have the component modules available only for execution
under the Microsoft Windows OS, Visual Basic .Net would not have any of the
connection or interface problems discussed above. However, this would not allow the
components implementation to be moved off the Microsoft Windows OS; if the
modules are not implemented in Visual Basic .Net, then they can be made more widely
available under Linux operating systems and eventually under other operating systems
such as Unix.
A.1.2 Java™
Java has a very nice visual interface and is able to be transferred to many dif-
ferent operating systems. However, it is an interpreted language and takes more time to
execute than a language that is resident to the machine.
A.1.3 C
C was originally designed to be an operating system base language. Like Visual
Basic .Net, it is tied to the OS that it is running under. Because of this, it is much faster
than interpreted languages like Java, but is not as portable. It is also designed to be
coded “bottom up”, such that routines are built into libraries and then called from an
overall application.
A.1.4 C++
C++ is an Object-Oriented Programming (OOP) language, developed by Bjarne
Stroustrup [127] as an evolutionary extension of the language C. Even
though it accepts the C syntax, it improves on many features of the language. In
particular, programs written in C++ can be coded “top down” by designing what objects
are needed within the program and then how they relate. The actual code comes
directly from the design and specification of the program instead of linking existing
routines together.
A.2 Language Comparison
In order to know which language would be best for implementation of the new
environment’s modular components, so that they could possibly be used directly by
other applications, an evaluation is performed over how the C++ language interacts,
interfaces and communicates with other languages.
A.2.1 C++ to C
Interfacing implementation code between languages that are somewhat similar,
for example: C++ and C, is not as difficult as other communications between languages.
However, this connection may not be bidirectional. The calling sequence for the C
language is simpler than for the C++ language, because C++ does name mangling with
the name of the function, the types of the arguments and the return type of the function.
C does not do the same name mangling and uses only a modified form of the actual
name of the function.
Therefore when designing an API in C++, the interface routines should be ex-
ported as “C” functions as opposed to being methods for a class in C++; this will
prevent the routines from being name mangled by the C++ compiler. Wrapping C++
with standard C routines allows the internal implementation of the module to remain
C++ and use the classes and methods functionality from C++, while at the same time
using the simpler formulation of the name of the calling routine provided by C.
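As a minimal sketch of this wrapping technique, the C++ fragment below hides a class behind functions declared with C linkage. The class GraphStore and the function names kb_create, kb_add_graph, kb_size and kb_destroy are hypothetical names chosen for illustration; they are not the CPE interface.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical internal C++ class; callers never see this type.
class GraphStore {
public:
    void add(const std::string& cgif) { graphs_.push_back(cgif); }
    std::size_t size() const { return graphs_.size(); }
private:
    std::vector<std::string> graphs_;
};

// extern "C" gives these functions C linkage, so the compiler does not
// mangle their names and a C caller can link to them by plain name.
extern "C" {

typedef void* kb_handle;  // opaque handle passed across the C boundary

kb_handle kb_create(void) { return new GraphStore(); }

void kb_add_graph(kb_handle kb, const char* cgif) {
    static_cast<GraphStore*>(kb)->add(cgif);
}

unsigned long kb_size(kb_handle kb) {
    return static_cast<unsigned long>(static_cast<GraphStore*>(kb)->size());
}

void kb_destroy(kb_handle kb) { delete static_cast<GraphStore*>(kb); }

}  // extern "C"
```

Only C-compatible types (pointers, integers, character strings) appear in the exported signatures, so the module's internals can keep using classes and methods. A production wrapper would also need to keep C++ exceptions from propagating across the C boundary.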
A.2.2 C++ to Java™
Connecting C++ to Java is also possible, but is more difficult than communicating
with C. This connection is also not bidirectional, but for different reasons. Java is
a simpler language than C++ [102], but it is an interpreted language. This means that
Java can be byte-compiled, creating a smaller file to be moved across the web, but it is
not compiled to machine code. This allows Java to be platform independent; C++ is a
compiled language and is both platform dependent and operating system (OS) dependent.
However, because Java is interpreted instead of compiled, it cannot be linked and
called directly by a system (application) that is not written in Java. Java must start the
process, and then can call compiled code in some of the other compiled languages (for
example: C/C++). Therefore, Java can call interface functions written in C or C++, but
C++ cannot call Java directly.
A.2.3 C++ to Prolog
Connecting C++ to Prolog is very similar to connecting C++ to Java. Prolog
has foreign function routines for calling C++/C functions. Also, when using a particular
operating system and version of Prolog, communication may be provided by the Prolog
system (for example: Amzi! Prolog Logic Server and Microsoft C++) for integrating
C++ and Prolog routines. However, in general, like Java and Lisp (not discussed in this
paper), Prolog must start the process of executing the system and then call the C++
routines, but C++ cannot call directly to Prolog.
A.2.4 C++ to Visual Basic 6.0
The connection or interface from C++ to Visual Basic 6.0 is the most difficult
connection among the four languages discussed in this paper. One reason is that Visual
Basic 6.0 is a two part language: an event driven module part and a class module part.
The class module part is very similar to C++ and holds the characteristics that
are available in object-oriented languages. Class modules can also be compiled just
like C++ to native code for the machine. However, the event driven part executes Basic
code in response to an event. These event driven procedures (routines) are triggered by
a form or control which is hooked into the visual part of the language. The triggering
of a routine by an event is similar to the interpretation of a function call in languages
like Java. Because of the event driven part of the language, Visual Basic 6.0 can call
C++ or C API/Interface routines, but C++/C cannot trigger an event within the Visual
Basic code, so the event procedures (routines) are not executed outside of Visual Basic
code.
A second reason Visual Basic 6.0 is difficult to connect is that it has different
encoding of some of its data types than C, C++, or Java [105]. Character data is stored in
more bits by Visual Basic than by C. Therefore, to pass a character string as a parameter
from Visual Basic to C or vice versa, the character string must be converted to Unicode
first, then passed as a parameter, and then decoded from Unicode at the other end.
This makes passing character data much more cumbersome. Also, Visual Basic defines
different Boolean values than C; the “false” value is 0 in both languages, but the “true”
value for Visual Basic is -1 (negative) where in C it is 1 (positive). Therefore, in passing
Boolean values, the user must be careful when working with conditionals.
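One way to guard against the Boolean mismatch is to convert at the interface boundary instead of comparing against a literal "true" value. The sketch below is an illustration only: the alias VbBool and the helper names are hypothetical, and it assumes the Visual Basic 6.0 Boolean arrives as a 16-bit signed integer.

```cpp
#include <cstdint>

// Assumed wire type for a Visual Basic 6.0 Boolean: 16-bit signed,
// with False = 0 and True = -1.
using VbBool = std::int16_t;

// Treat any nonzero value as true, so both -1 (Visual Basic) and 1 (C)
// are accepted; a test such as "v == 1" would reject Visual Basic's True.
inline bool fromVbBool(VbBool v) { return v != 0; }

// Produce the value Visual Basic expects for True (-1), not C's 1.
inline VbBool toVbBool(bool b) { return b ? VbBool(-1) : VbBool(0); }
```

Testing against zero on the C++ side, and always emitting -1 for true on the way back, keeps conditionals on both sides consistent.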
APPENDIX B
DOCUMENTATION OF CGIF - VERSION 2001
B.1 Added Definitions For CGIF Categories
Context.
A context C is a concept whose designator is a nonblank conceptual graph g.
• The graph g is said to be immediately nested in C, and any concept c of g is said
to be immediately nested in C.
• A concept c is said to be nested in C, if either c is immediately nested in C or c is
immediately nested in some context D that is nested in C.
• Two concepts c and d are said to be co-nested if either c = d or there is some
context C in which c and d are immediately nested.
• If a concept x is co-nested with a context C, then any concept nested in C is said
to be more deeply nested than x.
• A concept d is said to be within the scope of a concept c if either d is co-nested
with c or d is more deeply nested than c.
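The nesting definitions above can be sketched as a few predicates over a parent pointer. This is only an illustrative Python model: the class and function names are hypothetical, and it assumes that concepts of the outermost graph (those with no enclosing context) count as co-nested with one another.

```python
class Concept:
    """A concept node; a context is simply a concept that other concepts name as parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # the context in which this concept is immediately nested


def nested_in(c, context):
    """True if c is immediately nested in context or in a context nested in it."""
    p = c.parent
    while p is not None:
        if p is context:
            return True
        p = p.parent
    return False


def co_nested(c, d):
    """True if c and d are the same concept or share the same immediate context."""
    # Assumption: two top-level concepts (parent None) are treated as co-nested.
    return c is d or c.parent is d.parent


def more_deeply_nested(d, x):
    """True if d is nested in some context that is co-nested with x."""
    p = d.parent
    while p is not None:
        if co_nested(x, p):
            return True
        p = p.parent
    return False


def within_scope(d, c):
    """True if d is within the scope of c."""
    return co_nested(d, c) or more_deeply_nested(d, c)


# A small nesting: a and C are top level; b and D are inside C; e is inside D.
a = Concept("a")
C = Concept("C")
b = Concept("b", parent=C)
D = Concept("D", parent=C)
e = Concept("e", parent=D)
```

With this layout, `within_scope(e, a)` holds because e is nested in C, which is co-nested with a, while `within_scope(a, e)` does not, since scope only extends inward.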
Coreference Set.
A coreference set C in a conceptual graph g is a set of one or more concepts selected
from g or from graphs nested in contexts of g.
• For any coreference set C, there must be one or more concepts in C, called the
dominant concepts of C, which include all concepts of C within their scope. All
dominant concepts of C must be co-nested.
• If a concept c is a dominant concept of a coreference set C, it may not be a
member of any other coreference set.
• A concept c may be a member of more than one coreference set C1, C2, ... provided
that c is not a dominant concept of any Ci.
• A coreference set C may consist of a single concept c, which is then the dominant
concept of C.
Referent.
Adding to the definition already seen in Definition 3.4.2, a referent of a concept is
specified by a quantifier, a designator, and a descriptor.
• Quantifier. A quantifier is one of two kinds: existential or defined.
• Designator. A designator is one of three kinds:
1. literal, which may be a number, a string, or an encoded literal;
2. locator, which may be an individual marker, an indexical, or a name;
3. undetermined.
• Descriptor. A descriptor is a conceptual graph, possibly blank, which is said to
describe the referent.
B.2 Lexical Categories
The CGIF lexical categories can be recognized by a finite-state tokenizer or
preprocessor. No characters of white space (blanks or other nonprinting characters)
are permitted inside any lexical item other than delimited strings (names, comments,
or quoted strings). Zero or more characters of white space may be inserted or deleted
between any lexical categories without causing an ambiguity or changing the syntactic
structure of CGIF. The only white space that should not be deleted is inside delimited
strings.
Comment.
A comment is a delimited string with a semicolon ";" as the delimiter.
Comment ::= DelimitedStr(";")
DelimitedStr(D).
A delimited string is a sequence of two or more characters that begins and ends with a
single character D called the delimiter. Any occurrence of D other than the first or last
character must be doubled.
DelimitedStr(D) ::= D (AnyCharacterExcept(D) | D D)* D
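The doubling rule can be sketched as a pair of helper functions. This is an illustrative Python sketch with hypothetical function names, not code from any CGIF tool; it handles one delimiter character at a time, as the rule does.

```python
def quote(text: str, d: str) -> str:
    """Wrap text in delimiter d, doubling every occurrence of d inside it."""
    return d + text.replace(d, d + d) + d


def unquote(token: str, d: str) -> str:
    """Undo quote(); raise ValueError if token is not a valid DelimitedStr(d)."""
    if len(token) < 2 or token[0] != d or token[-1] != d:
        raise ValueError("token must begin and end with the delimiter")
    body, out, i = token[1:-1], [], 0
    while i < len(body):
        if body[i] == d:
            # Inside the string, a delimiter is only legal as a doubled pair.
            if i + 1 >= len(body) or body[i + 1] != d:
                raise ValueError("undoubled delimiter inside the string")
            out.append(d)
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)


name_token = quote("it's", "'")        # a Name containing a single quote
comment_token = quote("a; b", ";")     # a Comment containing a semicolon
```

Because the only escape mechanism is doubling, a tokenizer can find the end of a delimited string by scanning for a delimiter that is not immediately followed by another one.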
Exponent.
An exponent is the letter E in upper or lower case, an optional sign ("+" or "-"), and an
unsigned integer.
Exponent ::= ("e" | "E") ("+" | "-")? UnsignedInt
Floating.
A floating-point number is a sign ("+" or "-") followed by one of three options: (1)
a decimal point ".", an unsigned integer, and an optional exponent; (2) an unsigned
integer, a decimal point ".", an optional unsigned integer, and an optional exponent; or
(3) an unsigned integer and an exponent.
Floating ::= ("+" | "-") ("." UnsignedInt Exponent? | UnsignedInt ("." UnsignedInt? Exponent? | Exponent))
Identifier.
An identifier is a string beginning with a letter or underscore "_" and continuing with
zero or more letters, digits, or underscores.
Identifier ::= (Letter | "_") (Letter | Digit | "_")*
Integer.
An integer is a sign ("+" or "-") followed by an unsigned integer.
Integer ::= ("+" | "-") UnsignedInt
Name.
A name is a delimited string with a single quote "'" as the delimiter.
Name ::= DelimitedStr("'")
Number.
A number is an integer or a floating-point number.
Number ::= Floating | Integer
QuotedStr.
A quoted string is a delimited string with a double quote '"' as the delimiter.
QuotedStr ::= DelimitedStr('"')
UnsignedInt.
An unsigned integer is a string of one or more digits.
UnsignedInt ::= Digit+
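Since the lexical categories are regular, the numeric and identifier rules can be written directly as regular expressions. The following Python sketch transcribes the rules exactly as printed in this appendix (so a sign is mandatory for Integer and Floating); the pattern names and the `classify` helper are hypothetical, and Letter and Digit are assumed to mean ASCII letters and digits.

```python
import re

UNSIGNED = r"[0-9]+"                      # UnsignedInt ::= Digit+
EXPONENT = rf"[eE][+-]?{UNSIGNED}"        # Exponent ::= ("e" | "E") ("+" | "-")? UnsignedInt
FLOATING = (                              # the three Floating options, after a sign
    rf"[+-](?:\.{UNSIGNED}(?:{EXPONENT})?"
    rf"|{UNSIGNED}(?:\.(?:{UNSIGNED})?(?:{EXPONENT})?|{EXPONENT}))"
)
INTEGER = rf"[+-]{UNSIGNED}"              # Integer ::= ("+" | "-") UnsignedInt
IDENTIFIER = r"[A-Za-z_][A-Za-z0-9_]*"    # Identifier ::= (Letter | "_") (Letter | Digit | "_")*

# Checked in this order; each pattern must match the whole token.
CATEGORIES = [
    ("Floating", FLOATING),
    ("Integer", INTEGER),
    ("UnsignedInt", UNSIGNED),
    ("Identifier", IDENTIFIER),
]


def classify(token: str):
    """Return the name of the first lexical category matching token, or None."""
    for name, pattern in CATEGORIES:
        if re.fullmatch(pattern, token):
            return name
    return None
```

Under these rules a bare digit string such as `42` is an UnsignedInt rather than an Integer, which is why individual markers are written with UnsignedInt.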
B.3 Syntactic Categories
The CGIF syntactic categories are defined by a context-free grammar that can be
processed by a recursive-descent parser. Zero or more characters of white space (blanks
or other nonprinting characters) are permitted between any two successive constituents
of any grammar rule that defines a syntactic category.
Actor.
An actor begins with "<" followed by a type. It continues with zero or more input arcs,
a separator "|", zero or more output arcs, and an optional comment. It ends with ">".
Actor ::= "<" Type(N) Arc* "|" Arc* Comment? ">"
The arcs that precede the vertical bar are called input arcs, and the arcs that follow the
vertical bar are called output arcs. The valence N of the actor type must be equal to the
sum of the number of input arcs and the number of output arcs.
Arc.
An arc is a concept or a bound label.
Arc ::= Concept | BoundLabel
BoundLabel.
A bound label is a question mark “?” followed by an identifier.
BoundLabel ::= "?" Identifier
CG.
A conceptual graph is a list of zero or more concepts, relations, actors, special contexts,
or comments.
CG ::= (Concept | Relation | Actor | SpecialContext | Comment)*
The alternatives may occur in any order provided that any bound coreference label must
occur later in the CGIF stream and must be within the scope of the defining label that
has an identical identifier. The definition permits an empty CG, which contains nothing.
An empty CG, which says nothing, is always true.
CGStream.
A conceptual graph stream is defined as a sequence of one or more CGs, each separated
by a period.
CGStream ::= CG ("." CG)*
Since a CG may itself be empty, the string "....." also qualifies as a CG stream, as
does an empty file.
Concept.
A concept begins with a left bracket "[" and an optional monadic type followed by
optional coreference links and an optional referent in either order. It ends with an
optional comment and a required "]".
Concept ::= "[" Type(1)? {CorefLinks?, Referent?} Comment? "]"
If the type is omitted, the default type is Entity. This rule permits the coreference labels
to come before or after the referent. If the referent is a CG that contains bound labels
that match a defining label on the current concept, the defining label must precede the
referent.
Conjuncts.
A conjunction list consists of one or more type terms separated by "&".
Conjuncts(N) ::= TypeTerm(N) ("&" TypeTerm(N))*
The conjunction list must have the same valence N as every type term.
CorefLinks.
Coreference links are either a single defining coreference label or a sequence of zero or
more bound labels.
CorefLinks ::= DefLabel | BoundLabel*
If a dominant concept node (as defined in subsection B.1) has any coreference label, it
must be either a defining label or a single bound label that has the same identifier as the
defining label of some co-nested concept.
DefLabel.
A defining label is an asterisk “*” followed by an identifier.
DefLabel ::= "*" Identifier
The concept in which a defining label appears is called the defining concept for that
label; a defining concept may contain at most one defining label and no bound coreference
labels. Any defining concept must be a dominant concept as defined in subsection
B.1.
Every bound label must be resolvable to a unique defining coreference label
within the same context or some containing context. When conceptual graphs are imported
from one context into another, however, three kinds of conflicts may arise:
1. A defining concept is being imported into a context that is within the scope of
another defining concept with the same identifier.
2. A defining concept is being imported into a context that contains some nested
context that has a defining concept with the same identifier.
3. Somewhere in the same module there exists a defining concept whose identifier
is the same as the identifier of the defining concept that is being imported, but
neither concept is within the scope of the other.
In cases (1) and (2), any possible conflict can be detected by scanning no further than
the right bracket "]" that encloses the context into which the graph is being imported.
Therefore, in those two cases, the newly imported defining coreference label and all its
bound labels must be replaced with an identifier that is guaranteed to be distinct. In
case (3), there is no conflict that could affect the semantics of the conceptual graphs
or any correctly designed CG tool; but since a human reader might be confused by the
similar labels, a CG tool may replace the identifier of one of the defining coreference
labels and all its bound labels.
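The replacement step described above (renaming a defining label together with all of its bound labels) can be sketched on the CGIF text itself. This Python sketch is an assumption-laden illustration: the function names are hypothetical, labels are found with a regular expression, and comments, quoted strings, and nesting scope are ignored, so it is not a substitute for renaming on a parsed graph.

```python
import re

LABEL = re.compile(r"[*?]([A-Za-z_][A-Za-z0-9_]*)")


def labels_in(cgif: str) -> set:
    """Collect the identifiers of all defining (*x) and bound (?x) labels."""
    return set(LABEL.findall(cgif))


def fresh_label(used: set, base: str = "x") -> str:
    """Return an identifier base1, base2, ... that is not in used."""
    n = 1
    while f"{base}{n}" in used:
        n += 1
    return f"{base}{n}"


def rename_label(cgif: str, old: str, new: str) -> str:
    """Rename the defining label *old and every bound label ?old to new."""
    # The lookahead keeps x1 from matching the front of a longer label such as x10.
    pattern = re.compile(r"([*?])" + re.escape(old) + r"(?![A-Za-z0-9_])")
    return pattern.sub(lambda m: m.group(1) + new, cgif)


imported = "(Betw [Rock] [Place *x1] [Person]) (Attr ?x1 [Hard])"
replacement = fresh_label(labels_in(imported))
renamed = rename_label(imported, "x1", replacement)
```

In a real import the set of used identifiers would be collected from the receiving context as well, so that the fresh label is distinct on both sides.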
Descriptor.
A descriptor is a structure or a nonempty CG.
Descriptor ::= Structure | CG
A context-free rule, such as this, cannot express the condition that a CG is only called
a descriptor when it is nested inside some concept.
Designator.
A designator is a literal, a locator, or a quantifier.
Designator ::= Literal | Locator | Quantifier
Disjuncts.
A disjunction list consists of one or more conjunction lists separated by "|".
Disjuncts(N) ::= Conjuncts(N) ("|" Conjuncts(N))*
The disjunction list must have the same valence N as every conjunction list.
FormalParameter.
A formal parameter is a monadic type followed by an optional defining label.
FormalParameter ::= Type(1) [DefLabel]
The defining label is required if the body of the lambda expression contains any matching
bound labels.
Indexical.
An indexical is the character “#” followed by an optional identifier.
Indexical ::= "#" Identifier?
The identifier specifies some implementation-dependent method that may be used to
replace the indexical with a bound label.
IndividualMarker.
An individual marker is the character “#” followed by an integer.
IndividualMarker ::= "#" UnsignedInt
The integer specifies an index to some entry in a catalog of individuals.
LambdaExpression(N).
A lambda expression begins with "(" and the keyword "lambda", continues with a signature
and a conceptual graph, and ends with ")".
LambdaExpression(N) ::= "(" "lambda" Signature(N) CG ")"
A lambda expression with N formal parameters is called an N-adic lambda expression.
The simplest example, represented "(lambda ())", is a 0-adic lambda expression with a
blank CG.
Literal.
A literal is a number or a quoted string.
Literal ::= Number | QuotedStr
Locator.
A locator is a name, an individual marker, or an indexical.
Locator ::= Name | IndividualMarker | Indexical
Negation.
A negation begins with a tilde “~” and a left bracket “[” followed by a conceptual graph
and a right bracket “]”.
Negation ::= "~[" CG "]"
A negation is an abbreviation for a concept of type Proposition with an attached relation
of type Neg. It has a simpler syntax, which does not permit coreference labels or attached
conceptual relations. If such options are required, the negation can be expressed
by the unabbreviated form with an explicit Neg relation.
Quantifier.
A quantifier consists of an at sign “@” followed by an unsigned integer or an identifier
and an optional list of zero or more quoted strings enclosed in braces.
Quantifier ::= "@" (UnsignedInt | Identifier ("{" (QuotedStr ("," QuotedStr)*)? "}")?)
The symbol @some is called the existential quantifier, and the symbol @every is called
the universal quantifier. If the quantifier is omitted, the default is @some.
Referent.
A referent (see subsection B.1 for added definitions) consists of a colon “:” followed
by an optional designator and an optional descriptor in either order.
Referent ::= ":" {Designator?, Descriptor?}
Relation.
A conceptual relation begins with a left parenthesis “(” followed by an N-adic type, N
arcs, and an optional comment. It ends with a right parenthesis “)”.
Relation ::= "(" Type(N) Arc* Comment? ")"
The valence N of the relation type must be equal to the number of arcs.
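Because the syntactic categories form a context-free grammar suited to recursive descent, a toy parser for a small subset is short. The Python sketch below is illustrative only and handles a deliberately reduced grammar: a CG made of concepts and relations, concepts with an optional type label and at most one coreference label, and arcs that are concepts or bound labels. Referents, comments, actors, contexts, and type expressions are left out, and all class and function names are hypothetical.

```python
import re

# One token: a bracket or parenthesis, a coreference label (*x or ?x), or an identifier.
TOKEN = re.compile(
    r"\s*(?:([\[\]()])|([*?][A-Za-z_][A-Za-z0-9_]*)|([A-Za-z_][A-Za-z0-9_]*))"
)


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1) or m.group(2) or m.group(3))
        pos = m.end()
    return tokens


def is_identifier(tok):
    return tok is not None and (tok[0].isalpha() or tok[0] == "_")


class Parser:
    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError(f"expected {expected!r}, got {tok!r}")
        self.i += 1
        return tok

    def parse_cg(self):
        # CG ::= (Concept | Relation)*   (subset of the full rule)
        nodes = []
        while self.peek() in ("[", "("):
            nodes.append(self.parse_concept() if self.peek() == "[" else self.parse_relation())
        return nodes

    def parse_concept(self):
        self.take("[")
        ctype = self.take() if is_identifier(self.peek()) else "Entity"  # default type
        tok = self.peek()
        label = self.take() if tok is not None and tok[0] in "*?" else None
        self.take("]")
        return ("concept", ctype, label)

    def parse_relation(self):
        self.take("(")
        if not is_identifier(self.peek()):
            raise ValueError("a relation needs a type label")
        rtype, arcs = self.take(), []
        while self.peek() not in (None, ")"):
            if self.peek() == "[":
                arcs.append(self.parse_concept())       # Arc ::= Concept
            elif self.peek()[0] == "?":
                arcs.append(("bound", self.take()))     # Arc ::= BoundLabel
            else:
                raise ValueError(f"unexpected token {self.peek()!r}")
        self.take(")")
        return ("relation", rtype, arcs)


def parse(text):
    parser = Parser(tokenize(text))
    cg = parser.parse_cg()
    if parser.peek() is not None:
        raise ValueError(f"trailing token {parser.peek()!r}")
    return cg


spaced = parse("(On [Cat *c] [Mat]) (Attr ?c [Black])")
compact = parse("(On[Cat*c][Mat])(Attr?c[Black])")
```

The two calls at the end produce the same structure, which reflects the rule that white space between constituents never changes the syntactic structure. A valence check would compare `len(arcs)` with the valence recorded for the relation type.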
Signature.
A signature is a parenthesized list of zero or more formal parameters separated by
commas.
Signature ::= "(" (FormalParameter ("," FormalParameter)*)? ")"
SpecialConLabel.
A special context label is one of five identifiers: “if”, “then”, “either”, “or”, and “sc”,
in either upper or lower case.
SpecialConLabel ::= "if" | "then" | "either" | "or" | "sc"
The five special context labels and the two identifiers "else" and "lambda" are reserved
words that may not be used as type labels.
SpecialContext.
A special context is either a negation or a left bracket, a special context label, a colon,
a CG, and a right bracket.
SpecialContext ::= Negation | "[" SpecialConLabel ":" CG "]"
Structure.
A structure consists of an optional percent sign “%” and identifier followed by a list of
zero or more arcs enclosed in braces.
Structure ::= ("%" Identifier)? "{" Arc* "}"
Type.
A type is a type expression or an identifier other than the reserved labels: "if", "then",
"either", "or", "sc", "else", "lambda".
Type(N) ::= TypeLabel(N) | TypeExpression(N)
A concept type must have valence N=1. A relation type must have valence N equal
to the number of arcs of any relation or actor of that type. The type label or the type
expression must have the same valence as the type.
TypeExpression.
A type expression is either a lambda expression or a disjunction list enclosed in parentheses.
TypeExpression(N) ::= LambdaExpression(N) | "(" Disjuncts(N) ")"
The type expression must have the same valence N as the lambda expression or the
disjunction list.
TypeLabel(N).
A type label is an identifier.
TypeLabel(N) ::= Identifier
The type label must have an associated valence N.
TypeTerm.
A type term is an optional tilde “~” followed by a type.
TypeTerm(N) ::= "~"? Type(N)
The type term must have the same valence N as the type.
Example.
For the English phrase: A person is between a rock and a hard place,
the display format, DF, of the corresponding CG can be seen in Figure B.1. The following
is a translation of Figure B.1 from DF to CGIF:
(Betw [Rock] [Place *x1] [Person]) (Attr ?x1 [Hard])
For more compact storage and transmission, all white space not contained in comments
or enclosed in quotes may be eliminated:
(Betw[Rock][Place*x1][Person])(Attr?x1[Hard])
[Figure B.1: The Display Format for ‘A person is between a rock and a hard place.’ Node labels in the diagram: Between, Person, Rock, Place, Attr, Hard.]
This translation takes the option of nesting all concept nodes inside the conceptual
relation nodes. A logically equivalent translation, which uses more coreference
labels, moves the concepts outside the relations:
[Rock *x1] [Place *x2] [Person *x3] (Betw ?x1 ?x2 ?x3)
[Hard *x4] (Attr ?x2 ?x4)
The concept and relation nodes may be listed in any order provided that every bound
label follows the defining node for that label.
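The ordering constraint on labels can be checked with a single left-to-right scan. The Python sketch below is a simplified illustration with a hypothetical function name: it works on the CGIF text, ignores comments, quoted strings, and context scope, and reports every bound label that is not preceded by a defining label with the same identifier.

```python
import re

LABEL = re.compile(r"([*?])([A-Za-z_][A-Za-z0-9_]*)")


def unresolved_bound_labels(cgif: str) -> list:
    """Return identifiers of bound labels (?x) that appear before any *x."""
    defined, missing = set(), []
    for kind, name in LABEL.findall(cgif):
        if kind == "*":
            defined.add(name)          # defining label seen; later ?name is fine
        elif name not in defined:
            missing.append(name)       # bound label used before its definition
    return missing


ok = unresolved_bound_labels("[Place *y] (Attr ?y [Hard])")
bad = unresolved_bound_labels("(Attr ?y [Hard]) [Place *y]")
```

An empty result means every bound label follows its defining node in the stream; a complete check would also confirm that the defining label is in a containing or identical context.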
APPENDIX C
DOCUMENTATION OF SYSTEMS
C.1 pCG (CGP Programs)
In order to use pCG to test the projection operation that was found at the Notio
level, a ‘cgp’ program file had to be generated. Table C.1 shows the cgp programs that
correspond to each of the tests given in section 7.2.1 of Chapter 7.
Table C.1: CGP Program Files.
CGP Program   Filename        # of KB files   # of Queries
5             graphs_5.cgp    6               2
11            graphs_11.cgp   6               5
21            graphs_21.cgp   6               6
31            graphs_31.cgp   6               8
53            graphs_53.cgp   6               10
73            graphs_73.cgp   6               12
The program file contained instructions to the pCG processor on what functions
are to be executed and what information is to be retrieved. Figures C.1, C.2, C.3, and
C.4 contain an example of one of the cgp program files.
# A test of the final graphs. Reads a final graph file, asserts
# each non type hierarchy related graph into pCG’s top-level
# knowledge base, and projects a filter over all these graphs,
# returning matches and sending them to standard output.

# Use a June 2001 CG Standard conformant CGIF parser and
# generator. The current (0.2.2) Notio defaults are based upon
# an older version of the standard.
option cgifparser = "cgp.translators.CGIFParser";
option cgifgen = "cgp.translators.CGIFGenerator";

# Get the file path separator for the current operating system.
sep = (_ENV.member("file.separator"))[2];

# Final file names.
graphFileNames = {"graphs_53_1.cgf", "graphs_53_5.cgf",
                  "graphs_53_100.cgf", "graphs_53_1000.cgf",
                  "graphs_53_2500.cgf", "graphs_53_5000.cgf"};
Figure C.1: Part 1: Example of CGP Program from pCG.
The cgp program is broken into several parts so that it can be displayed in several
figures. The first part indicates the parser and translator format being used by the
CGP program; this may be either the Notio original format or the CGIF format from
2001 [126]. The next section in this piece of the program indicates the KBs that should
be tested. Part 2 indicates the query graphs that will be tested against the KB.
The third part examines the parameters that the CGP program uses to
select the correct KB, the query graph to be tested, and the number of times to run the test.
It then reads in the KB from the indicated file and runs the “Assertion” phase of the CGP
program file to set up the knowledge base for the pCG system. The fourth part performs the
actual running of the projection algorithm and the printing of the time results.
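The overall shape of the timing run (project one query onto every KB graph, repeat a chosen number of times, and report elapsed milliseconds) can be sketched independently of pCG. The Python sketch below is not the cgp program: `time_projection` and `toy_project` are hypothetical names, the projection itself is passed in as a function, and the toy stand-in only tests for a substring so that the example is self-contained.

```python
import time


def time_projection(kb_graphs, query, project, repeats=1):
    """Run project(graph, query) over the whole KB `repeats` times.

    Returns one (elapsed_milliseconds, match_count) pair per repetition.
    """
    results = []
    for _ in range(repeats):
        start = time.perf_counter()
        matches = []
        for graph in kb_graphs:
            h = project(graph, query)
            if h is not None:           # None stands for "no projection found"
                matches.append(h)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        results.append((elapsed_ms, len(matches)))
    return results


def toy_project(graph, query):
    """Placeholder for a real projection: succeed if the query text occurs in the graph."""
    return graph if query in graph else None


kb = ["(OnTable1 [Block] [Table])", "(ATTR1 [Block] [Color])"]
timings = time_projection(kb, "OnTable1", toy_project, repeats=3)
```

Keeping the loading and assertion of the KB outside the timed region, as the cgp program does with its separate storage and assert timings, keeps the projection figure from being dominated by file I/O.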
# filter graphsgraphFilters = {‘(ATTR1 [Block] [Color])‘,‘(ATTR1 [Block*b] [Color])(NAME1 ?b [Number])‘,‘(Above1 [Block*b2] [Block*b1])(OnTable1 ?b1 [Table])(ATTR1 ?b1 [Color])(NAME1 ?b1 [Number])‘,
‘(Above1 [Block*b2] [Block*b1])(OnTable1 ?b1 [Table])(ATTR1 ?b1 [Color])(NAME1 ?b1 [Number])(ATTR2 ?b2 [Color])‘,
‘(Above1 [Block*b2] [Block*b1])(OnTable1 ?b1 [Table])(ATTR1 ?b1 [Color])(NAME1 ?b1 [Number])(CHRC1 ?b1 [Shape])(LOC1 ?b1 [Place])(ATTR2 ?b2 [Color])‘,
‘(Above1 [Block*b2] [Block*b1])(OnTable1 ?b1 [Table])(ATTR1 ?b1 [Color])(NAME1 ?b1 [Number])(CHRC1 ?b1 [Shape])(LOC1 ?b1 [Place])(ATTR2 ?b2 [Color])(NAME2 ?b2 [Number])(CHRC2 ?b2 [Shape])(LOC2 ?b2 [Place])‘,
‘(Above1 [Block*b2] [Block*b1])(Above2 [Block*b3] ?b2)(OnTable1 ?b1 [Table])(ATTR1 ?b1 [Color])(NAME1 ?b1 [Number])(CHRC1 ?b1 [Shape])(LOC1 ?b1 [Place])(ATTR2 ?b2 [Color])(NAME2 ?b2 [Number])(CHRC2 ?b2 [Shape])(LOC2 ?b2 [Place])(ATTR3 ?b3 [Color])(NAME3 ?b3 [Number])‘,
‘(Above1 [Block*b2] [Block*b1])(Above2 [Block*b3] ?b2) (OnTable1?b1 [Table])(ATTR1 ?b1 [Color])(NAME1 ?b1 [Number])(CHRC1 ?b1[Shape])(LOC1 ?b1 [Place])(ATTR2 ?b2 [Color])(NAME2 ?b2[Number])(CHRC2 ?b2 [Shape])(LOC2 ?b2 [Place])(ATTR3 ?b3[Color])(NAME3 ?b3 [Number])(CHRC3 ?b3 [Shape])(LOC3 ?b3 [Place])‘,
‘(Above1 [Block*b2] [Block*b1])(Above2 [Block*b3] ?b2)(OnTable1 ?b1 [Table*t1])(OnTable2 [Block*b4] ?t1)(Above3 [Block*b5] ?b4)(ATTR1 ?b1 [Color])(NAME1 ?b1 [Number])(CHRC1 ?b1 [Shape])(LOC1 ?b1 [Place])(ATTR2 ?b2 [Color])(NAME2 ?b2 [Number])(CHRC2 ?b2 [Shape])(LOC2 ?b2 [Place])(ATTR3 ?b3 [Color])(NAME3 ?b3 [Number])(CHRC3 ?b3 [Shape])(LOC3 ?b3 [Place])(ATTR4 ?b4 [Color])(NAME4 ?b4 [Number])(ATTR5 ?b5 [Color])(NAME5 ?b5 [Number])‘,
‘(Above1 [Block*b2] [Block*b1])(Above2 [Block*b3] ?b2)(OnTable1?b1 [Table*t1])(OnTable2 [Block*b4] ?t1)(Above3 [Block*b5] ?b4)(NAMET ?t1 [Number])(ATTR1 ?b1 [Color])(NAME1 ?b1 [Number])(CHRC1 ?b1 [Shape])(LOC1 ?b1 [Place])(ATTR2 ?b2 [Color])(NAME2?b2 [Number])(CHRC2 ?b2 [Shape])(LOC2 ?b2 [Place])(ATTR3 ?b3[Color])(NAME3 ?b3 [Number])(CHRC3 ?b3 [Shape])(LOC3 ?b3[Place])(ATTR4 ?b4 [Color])(NAME4 ?b4 [Number])(CHRC4 ?b4[Shape])(LOC4 ?b4 [Place])(ATTR5 ?b5 [Color])(NAME5 ?b5[Number])(CHRC5 ?b5 [Shape])(LOC5 ?b5 [Place])‘};
Figure C.2: Part 2: Example of CGP Program from pCG.
# Get optional graph file number from command-line.
# Defaults to 1.
gFileNum = 1;
if _ARGS.length > 0 and _ARGS.length <= 3 then
  gFileNum = (_ARGS[1]).toNumber();
end
if gFileNum > 6 or gFileNum < 1 then
  exit "Invalid file number.";
end
graphFileName = graphFileNames[gFileNum];

gFNum = 1;
if _ARGS.length > 1 and _ARGS.length <= 3 then
  gFNum = (_ARGS[2]).toNumber();
end
if gFNum > 10 or gFNum < 1 then
  exit "Invalid filter number.";
end

tNum = 1;
if _ARGS.length > 2 and _ARGS.length <= 3 then
  tNum = (_ARGS[3]).toNumber();
end

# open the CGF file
newF = file ("examples" + sep + "projection" + sep + graphFileName);
# open to get timings
u = new Util;

# Read and assert the graphs.
println "*** Asserting graphs into KB...";
println "";
startfull = u.getCurrentTimeInMillis();
graphs = newF.readGraphStream();
newF.close();
endtime1 = u.getCurrentTimeInMillis();
println "Storage time is " + (endtime1 - startfull) + " ";
startassert = u.getCurrentTimeInMillis();
t = 0;
foreach g in graphs do
  rels = g.relations;
  t.inc();
  if rels.member("GT") is undefined then
    assert g;
  end
end
endtime2 = u.getCurrentTimeInMillis();
print "Assert time is " + (endtime2 - startassert);
println " for " + t + " graphs.";
Figure C.3: Part 3: Example of CGP Program from pCG.
n = 0;
while n < tNum do
  projections = {};
  # Retrieve all graphs in the outer context containing an
  # OnTable relation.
  filter = graphFilters[gFNum];
  # print "Result of projecting " + filter
  # println " onto asserted graphs...";
  # println "";
  t = 0;
  startpart = u.getCurrentTimeInMillis();
  foreach g in _KB.graphs do
    h = g.project(filter);
    if not (h is undefined) then
      if (tNum <= 1) then
        endtimefull = u.getCurrentTimeInMillis();
        println h;
      end
      t.inc();
      projections.append(h);
    end
    if h is undefined then
      t.inc();
      println "Not found graph is number " + t + ".";
    end
  end
  if (tNum > 1) then
    endtimefull = u.getCurrentTimeInMillis();
  end
  endpart = (endtimefull - startpart);
  print "Actual Projection time is " + endpart + " for ";
  println t + " graphs";
  n.inc();
end
Figure C.4: Part 4: Example of CGP Program from pCG.
C.2 CP Environment, CPE
The CP Environment has both module documentation, showing how the
DDLs are designed, and class documentation for some of the data structures and
functions found in the classes of the systems.
C.2.1 CPE Module Documentation
The module documentation gives the top-level system API functions and the
internal general functions for both the CPE and CG modules. The CPE module contains the
most general routines available for the CPE system, and the CG module holds the
basic data structure for the conceptual graphs that store the knowledge base.
C.2.1.1 CP_Graph Reasoning Operations
These functions perform the basic reasoning operations from the API:
• CPE_API CPLPGraphs STDCALL CPE_projectionUnique (void)
– Note: only returns one projection even if more than one is available.
• CPE_API CPLPGraphs STDCALL CPE_projection (void)
– Projects the current query graph onto the current.
C.2.1.2 CP_Graph Reasoning Internal Operations
These functions perform the actual internal reasoning operations:
• CPLPGraphs cp_ops::CProjection(CPLPGraph, CPLPGraph)
– Actual graph to graph projection.
• CPLPGraphs cp_ops::CProject(CPLPKB, CPLPGraph)
– Knowledge base to query graph projection which processes all the graphs in the KB.
• BOOLEAN cp_ops::get_onlyOne(void)
– Check to see if only one projection needs to be found per KB graph.
• void cp_ops::set_onlyOne(BOOLEAN)
– Set if only one projection needs to be found per KB graph.
• CPLPGraph cp_ops::add_toornewprojections(CGLPChar, CGLPChar, CGLPChar,
CGLPChar, CGLPChar, CGLPNElement, CPLPGraph, CPLPGraph, CPLPGraphs)
– Check new matching concept or new projection line from current matching concept.
• BOOLEAN cp_ops::add_toexistprojections(CGLPChar, CGLPChar, CGLPNElement,
CPLPGraph, CPLPGraphs)
– Add the new query triple match to all related projection graphs.
• BOOLEAN cp_ops::add_copyprojections(CGLPChar, CGLPChar, CGLPNElement,
CPLPGraph, CPLPGraph, CPLPGraphs)
– Make a copy of a projection graph and add in the new triple for next round processing.
• BOOLEAN cp_ops::process_querytriple(CGLPChar, CGLPChar, CGLPChar,
CGLPNElement, CGLPCStr, CPLPGraph, CPLPGraphs, CPLPGraphs)
– Process multiple elements to anchorlist when not on first concept in query graph.
C.2.1.3 CGHash_Graph and CG_Graph Public Functions
These functions perform the basic graph operations from the API:
• bool addChild (CGLPChar)
– Adds a new child graph (nested graph) to the CG graph.
• bool addConcept(CGLPChar)
– Adds a new concept to the CG graph concepts list.
• bool addRelation (CGLPChar)
– Adds a new relation to the CG graph relations list.
• bool addTriple (CGLPChar)
– Adds a new triple name to the CG graph triples list.
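• bool isChild(CGLPChar)
– Checks whether the named graph is in the children list.
• bool isConcept(CGLPChar)
– Checks whether the named node is in the concepts list.
• bool isRelation(CGLPChar)
– Checks whether the named node is in the relations list.
• bool isTriple(CGLPChar)
– Checks whether the named triple is in the triples list.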
• CGLPNodes getNodes(short)
– Returns the node list for the type of node being searched for.
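The add and is functions listed above pair naturally, which can be illustrated with a minimal sketch. It assumes std::string in place of CGLPChar and assumes that each is function tests membership in the matching list; the class name SketchGraph and the use of std::set are hypothetical and are not the CPE implementation.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for a CG graph; the names and semantics here are
// assumptions for illustration, not the CPE classes themselves.
class SketchGraph {
public:
    // Each add function returns true only when the name was not yet present.
    bool addChild(const std::string& name)    { return children_.insert(name).second; }
    bool addConcept(const std::string& name)  { return concepts_.insert(name).second; }
    bool addRelation(const std::string& name) { return relations_.insert(name).second; }
    bool addTriple(const std::string& name)   { return triples_.insert(name).second; }

    // Each is function is assumed to test membership in the matching list.
    bool isChild(const std::string& name) const    { return children_.count(name) != 0; }
    bool isConcept(const std::string& name) const  { return concepts_.count(name) != 0; }
    bool isRelation(const std::string& name) const { return relations_.count(name) != 0; }
    bool isTriple(const std::string& name) const   { return triples_.count(name) != 0; }

private:
    std::set<std::string> children_;
    std::set<std::string> concepts_;
    std::set<std::string> relations_;
    std::set<std::string> triples_;
};
```

With this sketch, adding the concept C2062 twice succeeds only the first time, and isConcept then reports it as present while isRelation does not.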
C.2.2 CPE Class Documentation
The class documentation indicates how the hierarchy of class references is
set up in C++ for the modules.
C.2.2.1 cp_graph Class Reference
• Inheritance diagram for cp_graph is seen in Figure C.5.
[Inheritance diagram relating cp_graph to cghash_graph (left side) and cg_graph (right side).]
Figure C.5: Inheritance Diagram for Class ‘cp_graph’.
C.2.2.2 cghash_graph Class Reference
This class is the perfect hash implementation of a CG graph. The inheritance diagram
for cghash_graph is the left side of Figure C.5. As the base graph, cghash_graph is the
data structure that changes between graph implementations: when CGHASH is defined,
the graph is implemented as two hash tables, and all lists are hash tables whose keys
are unique numbers. Class-specific functions are:
• cghash_graph(void)
– Constructor function that makes sure most internal lists are built.
• cghash_graph(UINT)
– Constructor function that makes sure the internal lists and the triples lists are built.
• ~cghash_graph(void)
– Destructor function that cleans up at the end.
C.2.2.3 cg_graph Class Reference
This class is the array implementation of a CG graph. The inheritance diagram for
cg_graph is the right side of Figure C.5. As the base graph, cg_graph holds the
Conceptual Graph elements and is the data structure that changes between graph
implementations: when CG2DARR is defined, the graph is implemented as two
2-dimensional arrays, and all lists are lists of strings. Class-specific functions are:
• cg_graph(void)
– Constructor function that makes sure all internal lists are built.
• cg_graph(int)
– Constructor function that makes sure the internal lists and the triples lists are built.
• ~cg_graph(void)
– Destructor function that cleans up at the end.
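The practical difference between the two class descriptions is how a node is located in a list. The following sketch contrasts the two styles under stated assumptions: std::unordered_map stands in for the perfect hash table (it is an ordinary hash table, not a perfect hash), std::vector of strings stands in for the array lists, and the function keyFromLabel, which derives a numeric key from a label such as C2062, is hypothetical.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Array-style list: node labels stored as strings, found by a linear scan.
inline bool containsLabel(const std::vector<std::string>& list,
                          const std::string& label) {
    for (const std::string& item : list)
        if (item == label) return true;
    return false;
}

// Hash-style list: nodes keyed by a unique number, found by one table probe.
inline bool containsKey(const std::unordered_map<unsigned, std::string>& table,
                        unsigned key) {
    return table.find(key) != table.end();
}

// Hypothetical key scheme: the numeric part of a label such as "C2062".
inline unsigned keyFromLabel(const std::string& label) {
    return static_cast<unsigned>(std::stoul(label.substr(1)));
}
```

The scan costs time proportional to the list length, while the table probe is expected constant time but pays for hashing and table setup on every graph. Such a trade-off is consistent with, but not proven by, the hash table version being slowest for small queries and fastest for the largest ones in Appendix D.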
APPENDIX D
DATA COLLECTED FROM SAMPLE TESTS
This appendix will give an example of some of the data collected to produce the
results found in Chapter 7. It also gives output from the tested systems to verify that the
correct results were produced with the projections in both the unique relation instance
and multiple relations within graph instance.
D.1 Data Collected for Computing Each Experimental Results Test Set - 53 nodes in KB Graphs
The three Tables D.1, D.2 and D.3 are the average data values used to produce
the graphs found in subsection 7.4.5 of Chapter 7.
Table D.1: Average Data Values for 53 nodes KB with 1000 Graphs.
# of nodes in Query    pCG        CPE (array)    CPE (hash table)
3                      82.1       25.85          174.15
5                      105.45     49.05          181.35
9                      143.8      82.95          214.1
11                     161.75     90.85          228.8
15                     206.3      140.5          261.5
21                     279        177.15         305.45
27                     346.15     280.25         362
31                     424.25     320.4          392.9
43                     575.8      506.9          503.15
53                     700.85     662.15         595.35
These averages came from computing the average value over 48 runs for each
test case. A test case consisted of selecting the number of nodes in the KB graphs file,
and then selecting the query graph to be projected onto that KB of graphs. Before
computing the average value, the four lowest (fastest) times and the four highest (slowest)
times were dropped. The average values seen in the tables are for runs with the pCG
system, CPE with the array data structures and CPE with the hash table data structures.
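The averaging rule described above is a trimmed mean. A minimal sketch follows, assuming the trimming is applied once to the whole set of runs for a test case (the text also describes it per group of 12 runs, which may differ slightly); the function name trimmedMean and its parameters are illustrative and are not taken from the test harness.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Mean of the timings after discarding the `drop` fastest and `drop` slowest.
// The vector is taken by value so the caller's data is left unsorted.
double trimmedMean(std::vector<double> timings, std::size_t drop) {
    assert(timings.size() > 2 * drop);  // something must remain to average
    std::sort(timings.begin(), timings.end());
    auto first = timings.begin() + static_cast<std::ptrdiff_t>(drop);
    auto last = timings.end() - static_cast<std::ptrdiff_t>(drop);
    double sum = std::accumulate(first, last, 0.0);
    return sum / static_cast<double>(last - first);
}
```

For a test case with 48 timings, trimmedMean(timings, 4) would discard the four fastest and four slowest values and average the remaining 40.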
Table D.2: Average Data Values for 53 nodes KB with 2500 Graphs.
# of nodes in Query    pCG        CPE (array)    CPE (hash table)
3                      225.1      55.3           457.35
5                      291.35     122.4          466.1
9                      396.9      190.1          552.9
11                     465.8      251.05         600.8
15                     575.8      344.2          685.25
21                     742.25     510.75         786.85
27                     945.4      713.5          939.8
31                     1037.55    857.15         1052.45
43                     1494.6     1351.55        1303.65
53                     1814.95    1875           1514.8
The reason some of the timings collected were not used in computing the
averages is that timings on the machine used for all testing were only accurate to within 16
milliseconds, so some “spreading” of the timing was seen. How much spreading is
given in the error bar data below. It should be explained that the 16 milliseconds
accuracy came about because the tests were being executed on a 64 bit processing architecture,
but the clock values could only be retrieved with 32 bit accuracy. Therefore, the clock
timing values jumped by 16 milliseconds on time change.
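The clock step can be checked against the recorded timings themselves. The sketch below estimates it as the smallest positive gap between distinct observed values; the function name smallestStep is illustrative, and the approach assumes at least two readings fell on adjacent clock ticks.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Smallest positive gap between distinct observed timings, or 0 when all
// values are equal or fewer than two timings are given.
long smallestStep(std::vector<long> timings) {
    std::sort(timings.begin(), timings.end());
    long step = 0;
    for (std::size_t i = 1; i < timings.size(); ++i) {
        long gap = timings[i] - timings[i - 1];
        if (gap > 0 && (step == 0 || gap < step)) step = gap;
    }
    return step;
}
```

Applied to the pCG fastest times of Table D.4 (78, 94, 140, 156, 203, 265, 328, 406, 532, 656), the smallest gap is 16, in agreement with the 16 millisecond accuracy stated above.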
Table D.3: Average Data Values for 53 nodes KB with 5000 Graphs.
# of nodes in Query    pCG        CPE (array)    CPE (hash table)
3                      559.35     131.05         838.6
5                      660.95     226.9          944.9
9                      862.55     431.25         1144.65
11                     1010.15    536.8          1174.5
15                     1217.2     804.9          1340.4
21                     1578.1     1142.6         1589.7
27                     1976.6     1529.75        1774.1
31                     2220.3     1881.95        2005.35
43                     2991.45    2933.55        2429.5
53                     3786.7     3924.75        3039.35
Dropping part of the collected data is justified because the same rule was applied
consistently over all test runs for all systems being tested: for every 12 runs, the
highest and lowest values collected were always dropped.
D.2 Error Bar Data - 53 nodes in KB Graphs
Discussed above was the fact that there was some “spreading” of data values;
that is, not all the data fell cleanly in a small range of time values.
Tables D.4, D.5 and D.6 indicate the actual fastest and slowest values collected
for the 53 nodes in KB graph test set.
Table D.4: Fast/Slow Values for 53 nodes KB with 1000 Graphs.
# nodes in Q    pCG (f)    pCG (s)    CPE (af)    CPE (as)    CPE (hf)    CPE (hs)
3               78         94         0           48          125         217
5               94         110        16          78          156         218
9               140        172        31          127         141         298
11              156        188        31          142         217         279
15              203        219        63          219         202         341
21              265        359        140         234         250         362
27              328        407        188         407         312         422
31              406        641        251         438         343         452
43              532        797        424         607         403         598
53              656        906        576         751         468         736
The columns are laid out by each of the three systems, giving the best (fastest)
time for each set of runs followed by the worst (slowest) time for that set of runs. Therefore,
first seen is the fastest time for the pCG system run followed by the slowest time for
that same set of runs. Second is the CPE system using the array data structure with
its fastest execution time for the runs followed by the slowest time, and lastly the CPE
system using the hash table data structures, with its fastest time followed by the slowest.
The rows in each table are the number of nodes within the query graph being
projected. The query graph size is smaller than or equal in size to the graphs found in
the KB. In fact, the query graphs are built from the abstract (most general) version of
the graphs in the KB.
Table D.5: Fast/Slow Values for 53 nodes KB with 2500 Graphs.
# nodes in Q    pCG (f)    pCG (s)    CPE (af)    CPE (as)    CPE (hf)    CPE (hs)
3               218        235        16          79          391         517
5               281        312        78          188         359         531
9               390        407        141         232         468         627
11              453        484        203         298         515         673
15              562        656        249         438         548         763
21              719        843        390         583         667         849
27              937        1032       639         808         858         1052
31              985        1125       720         969         970         1124
43              1422       1969       1199        1475        1146        1441
53              1703       2344       1725        2008        1314        1673
Table D.6: Fast/Slow Values for 53 nodes KB with 5000 Graphs.
# nodes in Q    pCG (f)    pCG (s)    CPE (af)    CPE (as)    CPE (hf)    CPE (hs)
3               546        578        48          171         732         940
5               656        672        155         314         782         1077
9               844        875        328         533         1017        1296
11              1000       1032       433         641         1033        1281
15              1203       1235       655         908         1221        1441
21              1562       1656       980         1345        1437        1709
27              1953       2031       1418        1712        1607        2018
31              2141       3172       1682        2103        1836        2158
43              2859       4015       2714        3121        2368        2934
53              3593       4813       3699        4064        2839        3248
Tables D.7, D.8 and D.9 then display the ranges of data (or how far away from
the average value), which will be referred to as Error Bar Data, for all of the 53 nodes
in KB test set. Again this is laid out in the columns by system, with the distance from
the average to the lowest (fastest) value followed by the distance to the highest (slowest)
value for each system. The rows again are just the number of nodes in the query graph
being projected. It can be seen that as the number of graphs in the KB is increased, the
systems (especially pCG) become unstable when projecting a query graph that is close to
or actually the size of the KB graphs (see rows 43 and 53 in Table D.9).
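The Error Bar Data can be reproduced from the averages and the fast/slow values. The sketch below shows the arithmetic; the names ErrorBar and errorBar are illustrative and do not come from the test scripts.

```cpp
// Distances from the average to the two extremes of a set of runs.
struct ErrorBar {
    double low;   // average minus the fastest time
    double high;  // slowest time minus the average
};

// Computes the error bar pair for one system and one query size.
ErrorBar errorBar(double average, double fastest, double slowest) {
    return ErrorBar{average - fastest, slowest - average};
}
```

For the pCG system with a 3 node query on the 1000 graph KB, the average is 82.1 (Table D.1) and the fastest and slowest times are 78 and 94 (Table D.4), giving 82.1 - 78 = 4.1 and 94 - 82.1 = 11.9, which are the first two entries of Table D.7.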
Table D.7: Error Bar Data Values for 53 nodes KB with 1000 Graphs.
# nodes in Q    pCG (l)    pCG (h)    CPE (al)    CPE (ah)    CPE (hl)    CPE (hh)
3               4.1        11.9       25.85       22.15       49.15       42.85
5               11.45      4.55       33.05       28.95       25.35       36.65
9               3.8        28.2       51.95       44.05       73.1        83.9
11              5.75       26.25      59.85       51.15       11.8        50.2
15              3.3        12.7       77.5        78.5        59.5        79.5
21              14         80         37.15       56.85       55.45       56.55
27              18.15      60.85      92.25       126.75      50          60
31              18.25      216.75     69.4        117.6       49.9        59.1
43              43.8       221.2      82.9        100.1       100.15      94.85
53              44.85      205.15     86.15       88.85       127.35      140.65
Table D.8: Error Bar Data Values for 53 nodes KB with 2500 Graphs.
# nodes in Q    pCG (l)    pCG (h)    CPE (al)    CPE (ah)    CPE (hl)    CPE (hh)
3               7.1        9.9        39.3        23.7        66.35       59.65
5               10.35      20.65      44.4        65.6        107.1       64.9
9               6.9        10.1       49.1        41.9        84.9        74.1
11              12.8       18.2       48.05       46.95       85.8        72.2
15              13.8       80.2       95.2        93.8        137.25      77.75
21              23.25      100.75     120.75      72.25       119.85      62.15
27              8.4        86.6       74.5        94.5        81.8        112.2
31              52.55      87.45      137.15      111.85      82.45       71.55
43              72.6       474.4      152.55      123.45      157.65      137.35
53              111.95     529.05     150         133         200.8       158.2
Table D.9: Error Bar Data Values for 53 nodes KB with 5000 Graphs.
# nodes in Q    pCG (l)    pCG (h)    CPE (al)    CPE (ah)    CPE (hl)    CPE (hh)
3               13.35      18.65      83.05       39.95       106.6       101.4
5               4.95       11.05      71.9        87.1        162.9       132.1
9               18.55      12.45      103.25      101.75      127.65      151.35
11              10.15      21.85      103.8       104.2       141.5       106.5
15              14.2       17.8       149.9       103.1       119.4       100.6
21              16.1       77.9       162.6       202.4       152.7       119.3
27              23.6       54.4       111.75      182.25      167.1       243.9
31              79.3       951.7      199.95      221.05      169.35      152.65
43              132.45     1023.55    219.55      187.45      61.5        504.5
53              193.7      1026.3     225.75      139.25      200.35      208.65
D.3 Validation of Correct Projection
Shown within the next two subsections are the actual output data verifying that
the projections were correct. The output data is three figures, where the first figure is the
KB graph, the second is the query graph that was projected and the third is the projection
results found. Each of the graphs output gives several parts:
1. Basenode - this is the unique identifier for the single concept node that is
considered the basic node of the graph.
2. Concept nodes - these are the concepts in the graph, giving the unique identifier
as well as the type, referent (if any) and co-reference links (if any).
3. Relation nodes - these are the relations in the graph, giving the unique identifier
and the type value.
4. CRC list - displays the concept-relation-concept list by giving the unique
identifier for the node followed by the direction of the linkage into that node. If the
direction is indented on the next line, then that linkage is scoped within the unique
identifier displayed above.
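The CRC list in item 4 can be pictured as each concept-relation-concept link printed twice, once in each direction. The sketch below is a simplified illustration: the Triple structure and the function names are hypothetical, and it lists all forward lines before all reverse lines rather than grouping and indenting them under each concept as the actual output does.

```cpp
#include <string>
#include <vector>

// One concept-relation-concept link; every label is a unique node identifier.
struct Triple {
    std::string from;
    std::string relation;
    std::string to;
};

// Forward form, as printed for the concept the link leaves.
std::string forwardLine(const Triple& t) {
    return t.from + " -> " + t.relation + " -> " + t.to;
}

// Reverse form, as printed for the concept the link enters.
std::string reverseLine(const Triple& t) {
    return t.to + " <- " + t.relation + " <- " + t.from;
}

// All crc lines of a graph: the forward lines first, then the reverse lines.
std::vector<std::string> crcLines(const std::vector<Triple>& triples) {
    std::vector<std::string> lines;
    for (const Triple& t : triples) lines.push_back(forwardLine(t));
    for (const Triple& t : triples) lines.push_back(reverseLine(t));
    return lines;
}
```

For the link from C2062 through R9474 to C5665, this yields the two lines C2062 -> R9474 -> C5665 and C5665 <- R9474 <- C2062, matching the form of the output in the figures that follow.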
D.3.1 11 nodes in KB graphs - Unique Relation Results
The Figures D.1, D.2 and D.3 show the three elements of the test for the 11
nodes graph in KB being projected by a 3 nodes query graph. It should be noted that
the graph seen in the projection graph (see Figure D.3) has the unique identifying nodes
from the KB graph, but the structure of the query graph. Therefore, a subgraph was, in
fact, found within the KB that was isomorphic to the query graph.
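The check described above can be written down as two conditions: every triple of the projection graph occurs in the KB graph, and the projection has the same typed structure as the query graph. The sketch below illustrates this under simplifying assumptions: types must match exactly (no type hierarchy is consulted), and the names Graph, Triple, shape and looksLikeProjection are hypothetical rather than part of CPE or pCG.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <tuple>
#include <vector>

struct Triple {
    std::string from, relation, to;  // unique node labels
};

struct Graph {
    std::map<std::string, std::string> type;  // node label -> type name
    std::vector<Triple> triples;
};

using TypedTriple = std::tuple<std::string, std::string, std::string>;

// Sorted (source type, relation type, target type) list for all triples.
std::vector<TypedTriple> shape(const Graph& g) {
    std::vector<TypedTriple> out;
    for (const Triple& t : g.triples)
        out.emplace_back(g.type.at(t.from), g.type.at(t.relation), g.type.at(t.to));
    std::sort(out.begin(), out.end());
    return out;
}

// True when the KB graph contains exactly this labelled triple.
bool hasTriple(const Graph& g, const Triple& t) {
    for (const Triple& k : g.triples)
        if (k.from == t.from && k.relation == t.relation && k.to == t.to)
            return true;
    return false;
}

// The projection must use KB nodes and carry the query's typed structure.
bool looksLikeProjection(const Graph& kb, const Graph& query, const Graph& proj) {
    for (const Triple& t : proj.triples)
        if (!hasTriple(kb, t)) return false;
    return shape(proj) == shape(query);
}
```

For the graphs of Figures D.1 to D.3, the projection triple C2062, R9474, C5665 occurs in the KB graph and has the typed form Block, ATTR, Color, which is the typed form of the single query triple, so the check succeeds.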
Graph Graph*G1 built:
Basenode - C2062
CG Base graph
- Concepts in this graph:
Concept unique label is C2062
    with co-ref link *b and as Block
Concept unique label is C5665 as Color
Concept unique label is C0493 as Number
Concept unique label is C8990 as Place
Concept unique label is C3008 as Shape
Concept unique label is C6346 as Table
- Relations in this graph:
Relation unique label is R9474 as ATTR
Relation unique label is R0897 as NAME
Relation unique label is R9634 as LOC
Relation unique label is R1126 as CHRC
Relation unique label is R4954 as OnTable
- crc:
C2062 -> R9474 -> C5665
    -> R0897 -> C0493
    -> R9634 -> C8990
    -> R1126 -> C3008
    -> R4954 -> C6346
C5665 <- R9474 <- C2062
C0493 <- R0897 <- C2062
C8990 <- R9634 <- C2062
C3008 <- R1126 <- C2062
C6346 <- R4954 <- C2062
Figure D.1: KB for Verifying 3 nodes Query onto 11 nodes KB.
query graphs - 1 graph/s read
Graph Graph*G2 built:
Basenode - C6067
CG Base graph
- Concepts in this graph:
Concept unique label is C6067 as Block
Concept unique label is C1389 as Color
- Relations in this graph:
Relation unique label is R9447 as ATTR
- crc:
C6067 -> R9447 -> C1389
C1389 <- R9447 <- C6067
Figure D.2: Query Graph for Verifying 3 nodes Query onto 11 nodes KB.
projection graphs
Graph P30001 built:
Basenode - C2062
CG Base graph
- Concepts in this graph:
Concept unique label is C2062
    with co-ref link *b and as Block
Concept unique label is C5665 as Color
- Relations in this graph:
Relation unique label is R9474 as ATTR
- crc:
C2062 -> R9474 -> C5665
C5665 <- R9474 <- C2062
Figure D.3: Projection Verifying 3 nodes Query onto 11 nodes KB.
D.3.2 13 nodes in KB graphs - Multi-Instances Relation Results
The Figures D.5, D.4 and D.6 show the three elements of the test for the 13
nodes graph in KB being projected by a 5 nodes query graph. It should be noted that
the graphs seen in the projection graphs (see Figure D.6) show that two subgraphs are
found within the KB that are isomorphic to the query graph. Again the image of the
query graph is projected onto the KB graph, but the projection graphs have the nodes
from inside of the KB graph.
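The two projections can be reproduced with a small backtracking search, sketched below. This is a generic matcher written for illustration and is not the CPE or pCG projection algorithm: relation nodes are reduced to their type names, concept types must match exactly (no type hierarchy), two query concepts are not prevented from mapping to the same KB concept, and all names (Edge, Graph, Mapping, projections) are hypothetical.

```cpp
#include <iterator>
#include <map>
#include <string>
#include <vector>

struct Edge {
    std::string from, relType, to;  // concept label, relation type, concept label
};

struct Graph {
    std::map<std::string, std::string> conceptType;  // concept label -> type
    std::vector<Edge> edges;
};

using Mapping = std::map<std::string, std::string>;  // query concept -> KB concept
using ConceptIt = std::map<std::string, std::string>::const_iterator;

bool hasEdge(const Graph& g, const std::string& from,
             const std::string& relType, const std::string& to) {
    for (const Edge& e : g.edges)
        if (e.from == from && e.relType == relType && e.to == to) return true;
    return false;
}

// Every query edge whose two ends are already mapped must exist in the KB.
bool consistent(const Graph& kb, const Graph& query, const Mapping& m) {
    for (const Edge& e : query.edges) {
        auto a = m.find(e.from);
        auto b = m.find(e.to);
        if (a != m.end() && b != m.end() &&
            !hasEdge(kb, a->second, e.relType, b->second))
            return false;
    }
    return true;
}

// Try each KB concept of the right type for the next query concept.
void extend(const Graph& kb, const Graph& query, ConceptIt next,
            Mapping& m, std::vector<Mapping>& out) {
    if (next == query.conceptType.end()) {
        out.push_back(m);  // all query concepts mapped: one projection found
        return;
    }
    for (const auto& candidate : kb.conceptType) {
        if (candidate.second != next->second) continue;  // types must agree
        m[next->first] = candidate.first;
        if (consistent(kb, query, m)) extend(kb, query, std::next(next), m, out);
        m.erase(next->first);  // undo and try the next candidate
    }
}

std::vector<Mapping> projections(const Graph& kb, const Graph& query) {
    std::vector<Mapping> out;
    Mapping m;
    extend(kb, query, query.conceptType.begin(), m, out);
    return out;
}
```

Given the KB graph of Figure D.5 and the query graph of Figure D.4, this search returns two mappings, one sending the query Block C4124 to C2062 and one sending it to C9474, corresponding to the two graphs of Figure D.6.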
query graphs - 1 graph/s read
Graph Graph*G3 built:
Basenode - C4124
CG Base graph
- Concepts in this graph:
Concept unique label is C4124
    with co-ref link *b and as Block
Concept unique label is C1918 as Color
Concept unique label is C5682 as Number
- Relations in this graph:
Relation unique label is R7152 as ATTR
Relation unique label is R2455 as NAME
- crc:
C4124 -> R7152 -> C1918
    -> R2455 -> C5682
C1918 <- R7152 <- C4124
C5682 <- R2455 <- C4124
Figure D.4: Query Graph for Verifying 5 nodes Query onto 13 nodes KB.
Graph Graph*G1 built:
Basenode - C9474
CG Base graph
- Concepts in this graph:
Concept unique label is C2062
    with co-ref link *b1 and as Block
Concept unique label is C9474
    with co-ref link *b2 and as Block
Concept unique label is C0493 as Table
Concept unique label is C8990 as Color
Concept unique label is C3008 as Number
Concept unique label is C6346 as Color
Concept unique label is C3285 as Number
- Relations in this graph:
Relation unique label is R5665 as Above
Relation unique label is R0897 as OnTable
Relation unique label is R9634 as ATTR
Relation unique label is R1126 as NAME
Relation unique label is R4954 as ATTR
Relation unique label is R5963 as NAME
- crc:
C2062 -> R5665 -> C9474
    -> R9634 -> C8990
    -> R1126 -> C3008
C9474 -> R0897 -> C0493
    -> R4954 -> C6346
    -> R5963 -> C3285
    <- R5665 <- C2062
C0493 <- R0897 <- C9474
C8990 <- R9634 <- C2062
C3008 <- R1126 <- C2062
C6346 <- R4954 <- C9474
C3285 <- R5963 <- C9474
Figure D.5: KB for Verifying 5 nodes Query onto 13 nodes KB.
projection graphs
Graph P30001 built:
Basenode - C2062
CG Base graph
- Concepts in this graph:
Concept unique label is C2062
    with co-ref link *b1 and as Block
Concept unique label is C8990 as Color
Concept unique label is C3008 as Number
- Relations in this graph:
Relation unique label is R9634 as ATTR
Relation unique label is R1126 as NAME
- crc:
C2062 -> R9634 -> C8990
    -> R1126 -> C3008
C8990 <- R9634 <- C2062
C3008 <- R1126 <- C2062
Graph P30002 built:
Basenode - C9474
CG Base graph
- Concepts in this graph:
Concept unique label is C9474
    with co-ref link *b2 and as Block
Concept unique label is C6346 as Color
Concept unique label is C3285 as Number
- Relations in this graph:
Relation unique label is R4954 as ATTR
Relation unique label is R5963 as NAME
- crc:
C9474 -> R4954 -> C6346
    -> R5963 -> C3285
C6346 <- R4954 <- C9474
C3285 <- R5963 <- C9474
Figure D.6: Projections Verifying 5 nodes Query onto 13 nodes KB.
REFERENCES
[1] H. Aidinejad. Semantic networks as a unified model of knowledge representa-tion. MCCS-88-117, 1988.
[2] H. Ait-Kaci. An algebraic semantics approach to the effective resolution of typeequations.Theor. Comp. Sc., 45:293–351, 1986.
[3] J.F. Allen. Maintaining knowledge about temporal intervals. Communicationsof the ACM, 26(11):pp. 832–843, 1983.
[4] J.F. Allen. Time and time again: The many ways to represent time. InternationalJournal of Intelligent Systems, 6(4):pp. 341–355, July 1991.
[5] J.-F. Baget and M.-L. Mugnier. Extensions of simple conceptual graphs: thecomplexity of rules and constraints.Journal of Artificial Intelligence Research(JAIR), 16:425–465, 2002.
[6] P. Becker. ToscanaJ. Technical University of Darmstadt, Germany, 2004.http://toscanaj.sourceforge.net/.
[7] P. Becker and J.H. Correia. The ToscanaJ suite for implementing ConceptualInformation Systems. In G. Stumme, editor,Formal Concept Analysis – State ofthe Art, Berlin – Heidelberg – New York, 2004. Springer. To appear.
[8] D.J. Benn. Implementing conceptual graph processes. Master’s thesis, Uni-versity of South Australia, School of Computer and Information Science, April2001. http://members.ozemail.com.au/ djbenn/Masters/thesis/Thesis.pdf.
[9] D.J. Benn and D. Corbett. An application of the process mechanism to a roomallocation problem using the pCG language. In H.S. Delugachand G. Stumme,editors,Conceptual Structures: Broadening the Base, Springer-Verlag LectureNotes in Computer Science 2120, pages 360–374, 2001.
[10] D.J. Benn and D. Corbett. pCG: An implementation of the process mechanismand an extensible CG programming language. InCGTools Workshop Proceed-ings in connection with ICCS 2001, Stanford, CA, 2001. [Online Access: July2001] URL:http://www.cs.nmsu.edu/ hdp/CGTOOLS/proceedings/index.html.
[11] R.J. Brachman. On the epistemological status of semantic networks. In N.V.Findler, editor,Associative Networks: Representation and Use of KnowledgebyComputers, pages 3–50. Academic Press, New York, 1979.
[12] E. Charniak and D. McDermott.Introduction To Artifical Intelligence. Addison-Wesley, Reading, MA, 1985.
271
![Page 302: THE EFFECT OF DATA STRUCTURES MODIFICATIONS ON …hdp/PDF/dissertation.pdf · Structures Tool Interoperability Workshop.Research Press International, 2007. Field of Study Major field:](https://reader033.vdocuments.site/reader033/viewer/2022042321/5f0bbc887e708231d431f720/html5/thumbnails/302.jpg)
[13] G. Chartrand and L. Lesniak.Graphs & Digraphs. Mathematics Series.Wadsworth & Brooks/Cole, Pacific Grove, CA, second edition edition, 1986.
[14] N.R. Chavez, Jr. and R.T. Hartley. The Role of Object-Oriented Techniques andMulti-Agents in Story Understanding. InProceedings of the International Con-ference on Integration of Knowledge Intensie Multi-Agent Systems, Waltham,Mass, 2005. KIMAS 2005.
[15] M. Chein and M.-L. Mugnier. Conceptual graphs: Fundamental notions.Revued’Intelligence Artificielle, 6-4:365–406, 1992.
[16] N. Chomsky.Syntactic Structures. The Hague, Mouton, 1957.
[17] N. Chomsky. Aspects of the Theory of Syntax. MIT Press, Cambridge, MA,1965.
[18] R.J. Cole, P. Eklund, and G. Stumme. Document retrievalfor email search anddiscovery using Formal Concept Analysis.Applied Artificial Intelligence, 17(3),2003.
[19] D. Corbett. Reasoning and Unification over Conceptual Graphs. Kluwer Aca-demic/Plenum Plublishers, New York, 2003.
[20] T.H. Cormen, C.E. Leiserson, and R.L. Rivest.Introduction to Algorithms. TheMIT Press, 1990.
[21] M. Croitoru and E. Compatangelo. A combinatorial approach to conceptualgraph projection checking. InProc. of the 24th Int’l Conf. of the British Com-puter Society’s Specialist Group on Art’l Intell.AI’2004, Springer-Verlag, 2004.
[22] M. Croitoru and E. Compatangelo. On conceptual graph projection. TechnicalReport AUCS/TR0403, University of Aberdeen, UK, Department of ComputingScience, 2004.
[23] Z.J. Czech. Quasi-perfect hashing.The Computer Journal, 41(6):416–421, 1998.
[24] Z.J. Czech, G. Havas, and B.S. Majewski. Perfect hashing. Theoretical Com-puter Science, 182(1-2):1–143, 15 August 1997. Fundamental Study.
[25] F. Dau. Types and tokens for logic with diagrams. In K.E.Wolff, H.D. Pfeiffer,and H.S. Delugach, editors,Conceptual Structures at Work, 12th InternationalConference on Conceptual Structures, volume LNAI of3127, pages 62–93, Hei-delberg, July 2004. ICCS 2004, Springer.
[26] E. Davis. Representations of Commonsense Knowledge. Morgan Kaufmann,San Mateo, CA, 1990.
272
![Page 303: THE EFFECT OF DATA STRUCTURES MODIFICATIONS ON …hdp/PDF/dissertation.pdf · Structures Tool Interoperability Workshop.Research Press International, 2007. Field of Study Major field:](https://reader033.vdocuments.site/reader033/viewer/2022042321/5f0bbc887e708231d431f720/html5/thumbnails/303.jpg)
[27] T. Dean and D. McDermott. Temporal data base management. Artificial Intelli-gence, 32:pp. 1–55, 1987.
[28] R. Dechter and J. Pearl. Network-based heuristics for constraint-satisfactionproblems.Artificial Intelligence, 34:1–38, 1988.
[29] H.S. Delugach. CharGer: Some lessons learned and new directions. InG. Stumme, editor,Working with Conceptual Structures - Contributions to ICCS2000, pages 306–309, 2000. Shaker-Verlag.
[30] H.S. Delugach. CharGer: A graphical Conceptual Graph ed-itor. In CGTools Workshop Proceedings in connection withICCS 2001, Stanford, CA, 2001. [Online Access: July 2001]URL:http://www.cs.nmsu.edu/ hdp/CGTOOLS/proceedings/index.html.
[31] H.S. Delugach. Towards building active knowledge systems with conceptualgraphs. In A. de Moor, Wilfried Lex, and Bernhard Ganter, editors,ConceptualStructures for Knowledge Creation and Communications, volume 2745 ofLNAI,pages 296–308, Heidelberg, 2003. Springer-Verlag.
[32] H.S. Delugach. CharGer 3.3 - A Conceptual Graph Editor. University ofAlabama in Huntsville, Alabama, USA, 2004. http://www.cs.uah.edu/ delu-gach/CharGer.
[33] H.S. Delugach. Common logic standard. Located at http://cl.tamu.edu, Novem-ber 2006.
[34] A. deMoor. Applying conceptual graph theory to the user-driven specificationof network information systems. In D. Lukose, H.S. Delugach, M. Keeler,L. Searle, and J.F. Sowa, editors,Conceptual Structures: Fulfilling Peirce’sDream, Springer-Verlag Lecture Notes in Artificial Intelligence1257, pages536–550. ICCS, Springer, August 1997.
[35] H.-D. Ebbinghaus, J. Elum, and W. Thomas.Mathematical Logic. Springer-Verlag, Berlin, second edition, 1994.
[36] P. Ekland. Mail-Sleuth. Email Analysis Pty Ltd, Australia, 2004.http://www.mail-sleuth.com/.
[37] G. Ellis and R. Levinson. The birth of peirce: A conceptual graphs workbench.In G. Ellis and R. Levinson, editors,Proccedings of the 1st International Work-shop on PEIRCE, 1992.
273
![Page 304: THE EFFECT OF DATA STRUCTURES MODIFICATIONS ON …hdp/PDF/dissertation.pdf · Structures Tool Interoperability Workshop.Research Press International, 2007. Field of Study Major field:](https://reader033.vdocuments.site/reader033/viewer/2022042321/5f0bbc887e708231d431f720/html5/thumbnails/304.jpg)
[38] D. Eppstein. Arboricity and bipartite subgraph listing algorithms. Tech. Report94-11, University of California, Irvine, CA 92717, February 24 1994. Depart-ment of Information and Computer Science.
[39] D. Eppstein. Subgraph isomorphism in planar graphs andrelated problems.Jour-nal of Graph Algorithms and Applications, 3(3):1–27, 1999.
[40] J.M. Ettinger. The complexity of comparing reaction systems. Technical re-port, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA,November 2001.
[41] B. Ganter and R.Wille.Formal Concept Analysis: Mathematical Foundations.Springer-Verlag, Berlin Heildelberg New York, 1999.
[42] M.R. Garey and D.S. Johnson.Computers and Intractability A Guide to theTheory of NP-Completeness. W.H. Freeman and Company, New York, 1979.
[43] J.C. Giarratano and G. Riley.Expert Systems: Principles and Programming.PWS-KENT Publishing Company, Boston, 1989.
[44] G. Gratzer.Lattice Theory: First concepts and distributive lattices. W.H. Free-man, 1971.
[45] N. Guarino. Philosophy and the Cognitive Science, chapter The OntologicalLevel, pages 443–456. Holder-Pivhler-Tempsky, Vienna, 1994.
[46] F. Harary.Graph Theory. Addison-Wesley, Reading, MA, 1969.
[47] R.T. Hartley. A uniform representation for time and space and their mutualconstraints. In F. Lehmann, editor,Semantics Networks, Oxford, ENGLAND,1992.
[48] R.T. Hartley and M. Coombs. Reasoning with graph operations. In J.F. Sowa,editor,Principles of Semantic Networks: Explorations in the Representation ofKnowledge, San Mateo, CA, 1991. Morgan Kaufmann.
[49] R.T. Hartley and H.D. Pfeiffer. Data models for Conceptual Structures. InFoun-dations and Applications of Conceptual Structures, Contributions to ICCS 2002.ICCS2002, 2002.
[50] L. Henkin. The Completeness of the First-Order Functional Calculus.The Jour-nal of Symbolic Logic, 14, 1949.
[51] W. Hodges.The Blackwell Guide to Philosophical Logic, chapter 1, pages 9–32.Blackwell Publishing, 2001.
274
![Page 305: THE EFFECT OF DATA STRUCTURES MODIFICATIONS ON …hdp/PDF/dissertation.pdf · Structures Tool Interoperability Workshop.Research Press International, 2007. Field of Study Major field:](https://reader033.vdocuments.site/reader033/viewer/2022042321/5f0bbc887e708231d431f720/html5/thumbnails/305.jpg)
[52] R. Jackendoff.Semantics Structures. MIT-Press, Cambridge, UK, 1990.
[53] K.S. Jones.Early years in machine translation: memoirs and biographies ofpioneers. John Benjamins, Amsterdam, 2000.
[54] A. Kabbaj. Un systeme multi-paradigme pour la manipulation des connais-sances utilisant la theorie des Graphes Conceptuels. PhD thesis, Universite deMontreal, Departement d’Informatique et de Recherche Operationnelle, Canada,1996.
[55] A. Kabbaj.The Amine Platform, 2004. http://amine-platform.sourceforge.net.
[56] A. Kabbaj. CS-TIW 2007 Second Conceptual Structures Tool InteroperabilityWorkshop, chapter Interoperability: The next steps for Amine Platform, pages65–70. Research Press International, 2007.
[57] A. Kabbaj and M. Janta-Polcynzki. From Prolog++ to Prolog+CG: A CG object-oriented logic programming language. In B. Ganter and G. Mineau, editors,Conceptual Structures: Logical, Linguistic, and Computional Issues, pages 540–554, Berlin, 2000. Lecture Notes in Artificial Intelligence, vol. 1867, Springer-Verlag.
[58] A. Kabbaj and B. Moulin. An algorithmic definition of CG operations based ona bootstrap step. InProceedings of ICCS’01, 2001.
[59] knowledge. Dictionary.com unabridged (v 1.0.1). Available at Dictionary.comwebsite: http://dictionary.reference.com/browse/knowledge, November 2006.
[60] knowledge. Merriam-webster online dictionary. Available at web-site:http://www.m-w.com/dictionary/knowledge, November 2006.
[61] F. Lehmann, editor.Semantics Networks. Pergamon Press, Oxford, ENGLAND,1992.
[62] D. Lenat and R. Guha.Building Large Knowledge-Based Systems - Representa-tion and Inference in the Cyc Project. Addison-Wesley, Reading, MA, 1990.
[63] H. Levesque. A fundamental tradeoff in knowledge representation and reason-ing. In Proceedings of CSCSI-84, pages 141–152, London, 1984.
[64] LIRMM. GoCITaNT. Montpellier, France, 2004.http://cogitant.sourceforge.net/index.html.
[65] G. Luger and W. Stubblefield.Artifical Intelligence - Structures and Strategiesfor Complex Problem Solving. The Benjamin/Cummings Publishing Company,Inc., Redwood City, CA, 1993.
[66] R. MacGregor. The evolving technology of classification-based knowledge representation systems. In J.F. Sowa, editor, Principles of Semantic Networks: Explorations in the Representation of Knowledge, San Mateo, CA, 1991. Morgan Kaufmann.
[67] A. Martelli and U. Montanari. An efficient unification algorithm. ACM Transactions on Programming Languages and Systems, 4(2):258–282, April 1982.
[68] B.T. Messmer and H. Bunke. Efficient subgraph isomorphism detection: A decomposition approach. IEEE Transactions on Knowledge and Data Engineering, 12(2):307–323, March/April 2000.
[69] G.W. Mineau. From actors to process: The representation of dynamic knowledge using conceptual graphs. In Marie-Laure Mugnier and Michel Chein, editors, Conceptual Structures: Theory, Tools, and Applications, volume 1453 of Springer-Verlag Lecture Notes in Artificial Intelligence, pages 65–79, Heidelberg, August 1998. ICCS 1998, Springer.
[70] G.W. Mineau. Constraints on processes: Essential elements for the validation and execution of processes. In William Tepfenhart and Walling Cyre, editors, Conceptual Structures: Standards and Practices, volume 1640 of LNAI, pages 66–82, Heidelberg, July 1999. ICCS 1999, Springer.
[71] G.W. Mineau and O. Gerbe. Contexts: A formal definition of worlds of assertions. In D. Lukose, H.S. Delugach, M. Keeler, L. Searle, and J.F. Sowa, editors, Conceptual Structures: Fulfilling Peirce’s Dream, volume 1257 of LNAI, pages 80–94. ICCS 1997, Springer, August 1997.
[72] D. Moldovan, W. Lee, C. Lin, and M. Chung. SNAP parallel processing applied to AI. Computer, 25(5):39–49, May 1992.
[73] H. Motulsky. Intuitive Biostatistics. Oxford University Press, New York, 1995.
[74] M.-L. Mugnier and M. Chein. Polynomial algorithms for projection and matching. In H.D. Pfeiffer and T.E. Nagle, editors, Conceptual Structures: Theory and Implementation, volume 754 of LNAI, pages 239–251. ICCS, Springer-Verlag, July 1992.
[75] M.-L. Mugnier and M. Leclere. On querying simple conceptual graphs with negation. In Data and Knowledge Engineering. DKE, Elsevier, 2006. Revised version of R.R. LIRMM 05-051.
[76] A. Mukerjee. Computational Representation and Processing of Spatial Expressions, chapter Neat vs Scruffy: A review of Computational Models for Spatial Expressions, pages 1–37. Lawrence Erlbaum Associates, Mahwah, NJ, 1998.
[77] S.H. Myaeng and A. Lopez-Lopez. Conceptual graph matching: A flexible algorithm and experiments. Journal of Experimental and Theoretical AI, 4(2):107–126, 1992.
[78] T.E. Nagle, J.W. Esch, and G. Mineau. A notation for conceptual structures graph matchers. In Proceedings of the 5th Conceptual Structures Workshop, Boston, MA, 1990. Held in conjunction with AAAI-90.
[79] T.E. Nagle, J.A. Nagle, L.L. Gerholz, and P.W. Ekland, editors. Conceptual Structures: Current Research and Practice. Ellis Horwood Workshops. Ellis Horwood, 1992.
[80] A. Newell. The knowledge level. Artificial Intelligence, 18(1):87–127, 1982.
[81] P. Oehrstoem, J. Andersen, and H. Scharfe. What has happened to ontology. In F. Dau, M-L Mugnier, and G. Stumme, editors, Conceptual Structures: Common Semantics for Sharing Knowledge, volume 3596 of LNAI, pages 425–438. ICCS2005, Springer, July 2005.
[82] C.K. Ogden and I.A. Richards. The Meaning of Meaning. Harcourt, Brace, and World, New York, NY, 1946.
[83] R. Pagh. Hash and displace: Efficient evaluation of minimal perfect hash functions. In Algorithms and Data Structures: 6th International Workshop. WADS’99, LNCS, May 1999.
[84] M.S. Paterson and M.N. Wegman. Linear unification. J. Comput. Syst. Sci., 16(2):158–167, April 1978.
[85] J. Pearl. Heuristics. Addison-Wesley, Reading, MA, 1984.
[86] C.S. Peirce. Manuscripts on existential graphs. Peirce, 4:320–410, 1960.
[87] H.D. Pfeiffer. An exportable CGIF module from the CP environment: A pragmatic approach. In K.E. Wolff, H.D. Pfeiffer, and H.S. Delugach, editors, Conceptual Structures at Work, volume 3127 of LNAI, pages 319–332. ICCS2004, Springer, July 2004.
[88] H.D. Pfeiffer, N.R. Chavez, Jr., and R.T. Hartley. A generic interface for communication between story understanding systems and knowledge bases. In Richard Tapia Celebration of Diversity in Computing Conference, Albuquerque, NM, 2005.
[89] H.D. Pfeiffer, N.R. Chavez Jr., and J.J. Pfeiffer Jr. CPE design considering interoperability. In H.D. Pfeiffer, A. Kabbaj, and D.J. Benn, editors, CS-TIW 2007 Second Conceptual Structures Tool Interoperability Workshop, pages 71–75. Research Press International, 2007.
[90] H.D. Pfeiffer and R.T. Hartley. Semantic additions to conceptual programming. In Proc. of the Fourth Annual Workshop on Conceptual Structures, Detroit, MI, 1989.
[91] H.D. Pfeiffer and R.T. Hartley. Additions for set representation and processing to conceptual programming. In Proc. of the Fifth Annual Workshop on Conceptual Structures, pages 131–140, Boston & Stockholm, 1990.
[92] H.D. Pfeiffer and R.T. Hartley. The Conceptual Programming Environment, CP: Reasoning representation using graph structures and operations. In Proc. of IEEE Workshop on Visual Languages, Kobe, Japan, 1991.
[93] H.D. Pfeiffer and R.T. Hartley. The Conceptual Programming Environment, CP. In T.E. Nagle, J.A. Nagle, L.L. Gerholz, and P.W. Ekland, editors, Conceptual Structures: Current Research and Practice, Ellis Horwood Workshops. Ellis Horwood, 1992.
[94] H.D. Pfeiffer and R.T. Hartley. Temporal, spatial, and constraint handling in the Conceptual Programming Environment, CP. Journal of Experimental and Theoretical AI, 4(2):167–182, 1992.
[95] H.D. Pfeiffer and R.T. Hartley. Visual CP representation of knowledge. In G. Stumme, editor, Working with Conceptual Structures - Contributions to ICCS 2000, pages 175–188, 2000. Shaker-Verlag.
[96] H.D. Pfeiffer and R.T. Hartley. ARCEdit - CG editor. In CGTools Workshop Proceedings in connection with ICCS 2001, Stanford, CA, 2001. [Online Access: July 2001] URL: http://www.cs.nmsu.edu/~hdp/CGTOOLS/proceedings/index.html.
[97] H.D. Pfeiffer and R.T. Hartley, editors. CGTools Workshop Proceedings in connection with ICCS 2001, Stanford, CA, 2001. [Online Access: July 2001] URL: http://www.cs.nmsu.edu/~hdp/CGTOOLS/proceedings/index.html.
[98] J. Piaget. Genetic Epistemology. Columbia University Press, New York, 1970. Trans. E. Duckworth.
[99] S. Polovina and R. Hill. Enhancing the initial requirements capture of multi-agent systems through conceptual graphs. In F. Dau, M-L Mugnier, and G. Stumme, editors, Conceptual Structures: Common Semantics for Sharing Knowledge, volume 3596 of LNAI, pages 439–452. ICCS2005, Springer, July 2005.
[100] B. Prasad. A planning system for blocks-world domain. In AICCSA ’01: Proceedings of the ACS/IEEE International Conference on Computer Systems and Applications, page 59, Washington, DC, USA, 2001. IEEE Computer Society.
[101] A. Puder. Mapping of CGIF to operational interfaces. In Marie-Laure Mugnier and Michel Chein, editors, Conceptual Structures: Theory, Tools, and Applications, Springer-Verlag Lecture Notes in Computer Science 1453, pages 119–126, 1998.
[102] K. Radeck. C# and Java: Comparing Programming Languages. MSDN, October 2003. http://www.windowsfordevices.com/articles/AT2128742838.html.
[103] S.W. Reyner. An analysis of a good algorithm for the subtree problem. SIAM J. Comput., 6:730–732, 1977.
[104] J.A. Robinson. Machine Intelligence, volume 6, chapter Computational logic: The unification computation, pages 63–72. Edinburgh University Press, Edinburgh, Scotland, 1971.
[105] S. Roman. Win32 API Programming with Visual Basic. O’Reilly, first edition, 1999.
[106] S. Russell and P. Norvig. Artificial Intelligence - A Modern Approach. Prentice Hall, Upper Saddle River, NJ, 1995.
[107] G. Ryle. The Concept of Mind. Penguin Books, Harmondsworth, UK, 1949.
[108] L. Schubert. Extending the expressive power of semantic networks. Artificial Intelligence, 7:163–198, 1976.
[109] S.C. Shapiro. A net structure for semantic information storage, deduction, and retrieval. In Proceedings of the 2nd International Conference on Artificial Intelligence, pages 512–523, 1971.
[110] S.C. Shapiro and W.J. Rapaport. The SNePS family. Computers Math. Applic., 23(2-5):243–275, 1992.
[111] A. Shokoufandeh and S. Dickinson. Graph-Theoretical Methods in Computer Vision. Number 2292 in LNCS. Springer-Verlag, Berlin Heidelberg, 2002.
[112] J. Siegel. Making the case: OMG’s Model Driven Architecture (MDA). San Diego Times, 2002.
[113] D. Skipper and H. Delugach. OpenCG: An open source graph representation. In A. de Moor, S. Polovina, and H. Delugach, editors, First Conceptual Structures Tool Interoperability Workshop, pages 48–57. CS-TIW 2006, Aalborg Universitetsforlag, 2006.
[114] R. Soley. Model Driven Architecture. OMG, 11-05 2000. document.
[115] F. Southey. Notio and Ossa. In CGTools Workshop Proceedings in connection with ICCS 2001, Stanford, CA, 2001. [Online Access: July 2001] URL: http://www.cs.nmsu.edu/~hdp/CGTOOLS/proceedings/index.html.
[116] F. Southey. NOTIO, 2003. http://notio.lucubratio.org/index.html.
[117] F. Southey and J.G. Linders. NOTIO - a Java API for developing CG tools. In W. Tepfenhart and W. Cyre, editors, Conceptual Structures: Standards and Practices, pages 262–271, Berlin, 1999. Springer-Verlag. Lecture Notes in Artificial Intelligence, LNAI 1640.
[118] J.F. Sowa. Conceptual graphs for a data base interface. IBM Journal of Research and Development, 20(4):336–357, 1976.
[119] J.F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, Reading, MA, 1984.
[120] J.F. Sowa, editor. Principles of Semantic Networks: Explorations in the Representation of Knowledge. Morgan Kaufmann, San Mateo, CA, 1991.
[121] J.F. Sowa. Conceptual graphs as a universal knowledge representation. In F. Lehmann, editor, Semantic Networks, Oxford, England, 1992.
[122] J.F. Sowa. Conceptual graphs: Draft proposed American national standard. In W. Tepfenhart and W. Cyre, editors, Conceptual Structures: Standards and Practices, pages 1–65, Berlin, 1999. Springer-Verlag. Lecture Notes in Artificial Intelligence, LNAI 1640.
[123] J.F. Sowa. Knowledge Representation: Logical, Philosophical, and Computational Foundations. Brooks/Cole, 2000.
[124] J.F. Sowa. Architectures for intelligent systems. IBM Systems Journal, 41(3):331–349, 2002.
[125] J.F. Sowa, N.Y. Foo, and A. Rao, editors. Conceptual Graphs for Knowledge Systems. Addison Wesley, New York, NY, 1989.
[126] J.F. Sowa et al. Conceptual Graph Standard, American National Standard NCITS. et all, T2/ISO/JTC1/SC32 WG2 M 00 edition, 2001. [Access On-line: April 2001], URL: http://www.bestweb.net/~sowa/cg/cgstand.htm.
[127] B. Stroustrup. The C++ Programming Language. Addison-Wesley, 3rd edition, 2000.
[128] D. A. Tappan. Knowledge-Based Spatial Reasoning For Automated Scene Generation From Text Descriptions. PhD thesis, New Mexico State University, May 2004.
[129] W.M. Tepfenhart. Ontologies and conceptual structures. In Marie-Laure Mugnier and Michel Chein, editors, Conceptual Structures: Theory, Tools, and Applications, volume 1453 of LNAI, pages 334–348, Heidelberg, August 1998. ICCS 1998, Springer.
[130] R. Thomopoulos, J.-F. Baget, and O. Haemmerle. Conceptual graphs as cooperative formalism to build and validate a domain expertise. In U. Priss, S. Polovina, and R. Hill, editors, Conceptual Structures: Knowledge Architectures for Smart Applications, pages 112–125. ICCS2007, Springer, 2007.
[131] M. Thorup. Even strongly universal hashing is pretty fast. In SODA ’00: Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms, pages 496–497, Philadelphia, PA, USA, 2000. Society for Industrial and Applied Mathematics.
[132] J.R. Ullmann. An algorithm for subgraph isomorphism. J. of the Assoc. for Computing Machinery, 23(1):31–42, 1976.
[133] W.P. Weijland. Semantics for logic programs without occur check. Theoretical Computer Science, 71:155–174, 1990.
[134] C.A. Welty. An Integrated Representation for Software Development and Discovery. PhD thesis, Vassar College, 1995.
[135] A.R. White. Conceptual analysis. In C.J. Bontempo and S.J. Odell, editors, The Owl of Minerva, pages 103–117. McGraw-Hill, 1975.
[136] M. Willems. Projection and unification for conceptual graphs. In G. Ellis, R. Levinson, W. Rich, and J. Sowa, editors, Conceptual Structures: Applications, Implementation and Theory, volume 954 of LNAI, pages 278–282. ICCS 1995, Springer, August 1995.
[137] T. Winograd. Frame representations and the declarative/procedural controversy. In Readings in Knowledge Representation, pages 185–210. Morgan Kaufmann, 1975.
[138] K.E. Wolff. ’Particles’ and ’waves’ as understood by temporal concept analysis. In K.E. Wolff, H.D. Pfeiffer, and H.S. Delugach, editors, Conceptual Structures at Work, pages 126–141. ICCS2004, LNAI 3127, Springer, July 2004.
[139] W.A. Woods. What’s in a link: Foundations for semantic networks. In D.G. Bobrow and A.M. Collins, editors, Representation and Understanding: Studies in Cognitive Science, pages 35–82. Academic Press, 1975.
[140] W.A. Woods. Understanding subsumption and taxonomy. In J.F. Sowa, editor, Principles of Semantic Networks: Explorations in the Representation of Knowledge. Morgan Kaufmann, 1991.
[141] W.A. Woods and J.G. Schmolze. The KL-ONE family. Computers Math. Applic., 23(2-5):133–177, 1992.