hfoxwell dissertation

Upload: hfoxwell

Post on 30-May-2018

  • 8/14/2019 HFoxwell Dissertation

    1/196

    A Web-based System for Representing, Retrieving, and Visualizing Analogies

    A dissertation submitted in partial fulfillment of the requirements for the degree of

    Doctor of Philosophy at George Mason University

    By

    Harry J. Foxwell

    Master of Science in Applied Statistics, Villanova University, 1978

    Bachelor of Arts in Mathematics, Franklin & Marshall College, 1973

    Director: Daniel A. Menascé, Professor of Computer Science

    Spring 2003

    George Mason University

    Fairfax, Virginia

    Copyright 2003 Harry J. Foxwell

    All Rights Reserved

    DEDICATION

    To my teachers who inspired and believed in me, and

    to my family whose love sustained and supported me.

    ACKNOWLEDGEMENTS

    I thank my employer, Sun Microsystems, for financial support, and especially my former and current managers, Robert McCartin and Paul Tatum, for encouragement and flexibility in scheduling time for my courses and research. Deborah Lapeyre and Wendell Piez of Mulberry Technologies provided valuable advice in the use of XML and XSLT. Thanks to my doctoral committee, Nada Dabbagh, Peter Denning, Gheorghe Tecuci, and especially to my advisor, Daniel A. Menascé, for his patience and encouragement. Special thanks to Honglei Ruan for her excellent work on the Analogy Expression editor. And ultimately, thanks to my wife Eileen. I could not have completed this work without her love and support.

    TABLE OF CONTENTS

    Abstract
    1. Introduction
    1.1 Why are analogies important?
    1.2 What is an analogy?
    1.3 Why we want to use the Web to access analogy expressions
    1.4 What we want to do with analogies and why
    1.4.1 Representation
    1.4.2 Visualization
    1.4.3 Storage/Retrieval
    1.5 The Focus of this Research
    1.6 Overview of Contributions
    1.7 Dissertation Organization
    2. Background
    2.1 Web technologies Used in Our research
    2.1.1 XML
    2.1.2 XML Editors
    2.1.3 Java
    2.1.4 Jakarta Tomcat
    2.1.5 Apache Cocoon
    2.1.6 XML Parsers
    2.1.7 XSLT
    2.1.8 XPath
    2.1.9 SVG
    2.1.10 Querying XML Documents
    2.2 Related Research
    2.2.1 Analogy research in Computer Science
    2.2.2 Analogy Research in Cognitive Science
    2.2.3 Analogy Research in Education
    2.2.4 The Semantic Web and Knowledge Representation
    2.2.5 The MARVIN System
    3. The Structure and Components of Analogies
    3.1 Analogy Examples
    3.2 The Limitations of Analogies
    4. The Representation of Analogies
    4.1 Analogy Expression DTD Elements
    4.2 Creating an Analogy Expression XML Document
    4.3 Summary
    5. Visualizing Analogy Expressions
    5.1 Visualization Stylesheets
    5.2 Summary
    6. Storing, Retrieving, and Ranking Analogy Expressions
    6.1 Retrieval Queries
    6.1.1 Keyword Match Queries
    6.1.2 Keyword Generalization Queries
    6.2 Ordering Analogy Expression Query Result Sets
    6.3 Query Examples
    6.4 Summary
    7. MARVIN: A Prototype System for Representing, Retrieving, and Visualizing Analogy Expressions
    7.1 Design Goals
    7.2 System Architecture
    7.3 Creating MARVIN Analogy Expressions
    7.3.1 The MARVIN Analogy Editor
    7.4 Summary
    8. MARVIN System Evaluation
    8.1 Formative Evaluation
    8.1.1 Survey Results
    8.1.1.1 Discussion of User Survey Results
    8.1.1.2 Discussion of Author Survey Results
    8.2 Performance and Scalability
    8.2.1 MARVIN Archive Performance
    8.3 Summary
    9. Conclusions and Further Research
    References
    Appendix
    Curriculum Vitae

    LIST OF FIGURES

    Figure 1.1 The MARVIN System Architecture
    Figure 3.1 The Eye/Camera Analogy
    Figure 3.2 Galileo's Solar System Analogy
    Figure 3.3 The Rutherford Analogy
    Figure 3.4 The Bohr Liquid Drop Model of Nuclear Fission
    Figure 3.5 Historical Analogy: September 11 = Pearl Harbor
    Figure 3.6(a) ConceptSetMap
    Figure 3.6(b) PrimaryRelationStructureMap
    Figure 3.6(c) RelationToConceptStructureMap
    Figure 3.6(d) ConceptToRelationStructureMap
    Figure 3.6(e) RelationToRelationStructureMap
    Figure 3.7 Circulatory System Analogy
    Figure 4.1 Modified EBNF Definition of an Analogy Expression
    Figure 4.2 The Analogy Expression DTD
    Figure 4.3 A Rutherford analogy expression XML document
    Figure 4.4 A minimal Rutherford analogy expression XML document
    Figure 4.5 Creating a Rutherford XML file using the XML editor epcEdit
    Figure 5.1 Tabular visualization of Rutherford's analogy
    Figure 5.2 Graphical visualization of the Galileo analogy
    Figure 5.3 Generic tabular visualization of an analogy
    Figure 5.4 Process for creating and visualizing analogy expressions
    Figure 5.5 The Rutherford Analogy visualization using concept links to WebKB
    Figure 7.1 Integrating MARVIN into an Instructional or CBI Web Page
    Figure 7.2 The MARVIN System Architecture
    Figure 7.3 Sequence diagram: analogy expression transformation, XML to HTML
    Figure 7.4 Direct Web access of a MARVIN Analogy Expression
    Figure 7.5 Sequence Diagram for MARVIN archive query
    Figure 7.6 The MARVIN User Interface
    Figure 7.7 A tabular visualization of the Rutherford Analogy
    Figure 7.8 A graphical visualization of the Galileo analogy
    Figure 7.9 The MARVIN Archive Search interface
    Figure 7.10 The MARVIN Analogy Expression Editor
    Figure 8.1 The MARVIN User Evaluation Survey
    Figure 8.2 The MARVIN Author Evaluation Survey
    Figure 8.3 Question 1: Analogies help me understand complex topics
    Figure 8.4 Question 2: The MARVIN analogy visualizations will help me understand the analogies that are presented to me
    Figure 8.5 Question 3: The tabular analogy visualizations help me understand the analogies that are presented to me
    Figure 8.6 Question 4: The graphical analogy visualizations help me understand the analogies and the target subject
    Figure 8.7 Question 5: The ability to use the MARVIN system to look up example analogies is a useful feature
    Figure 8.8 Question 6: The ability to use the MARVIN system to look up alternate analogies is a useful feature
    Figure 8.9 User Survey response summary
    Figure 8.10 Question 1: Analogies are an important component of my instruction
    Figure 8.11 Question 2: I understand the components and structures that can occur in analogies
    Figure 8.12 Question 3: The MARVIN analogy visualizations will assist my students/readers in understanding the analogies that I present
    Figure 8.13 Question 4: The tabular visualizations will help my students understand the analogies and the target subject
    Figure 8.14 Question 5: The graphical visualizations will help my students understand the analogies and the target subject
    Figure 8.15 Question 6: The ability to use the MARVIN system to look up example analogies is a useful feature
    Figure 8.16 Question 7: The ability to use the MARVIN system to look up alternate analogies is a useful feature
    Figure 8.17 Average Query Execution Time (seconds)

    Abstract

    A WEB-BASED SYSTEM FOR REPRESENTING, RETRIEVING, AND VISUALIZING ANALOGIES

    Harry J. Foxwell, Ph.D.

    George Mason University, Spring 2003

    Dissertation Director: Dr. Daniel A. Menascé

    Analogies are essential in human cognition, reasoning, learning, communication, and problem solving. They can have a profound and broad effect on how we view and understand our world. In this dissertation we design, implement, and evaluate a Web-based system for representing, retrieving, and visualizing human-conceived analogies that provides a medium and a common language for analogy practitioners to share their analogies. To accomplish this, we review the components of analogies, and develop a general representation of their structure. We then develop a compact XML content model of this representation for use in Web-based environments, and show that the model is capable of representing a wide range of human-conceived analogies. We demonstrate, using XSLT, several example methods for visualizing analogy expressions that use our model. We demonstrate methods for storing and retrieving such expressions, and develop methods for ranking the retrieved expressions. We designed and implemented the MARVIN (Markup for Analogy Representation and Visualization for the InterNet) system to demonstrate these methods. A formative evaluation of the MARVIN system by analogy authors and end users was conducted; both author evaluators and user evaluators agreed that the MARVIN system analogy visualizations can assist them in their use of analogies, and that the system's ability to retrieve analogies and alternates is also of value.

    1. INTRODUCTION

    How do we ever understand anything? I think, by using one or another kind of analogy - that is, representing each new thing as though it resembles something we already know.

    - Marvin Minsky

    When Meriwether Lewis and William Clark returned in 1806 from their historic 18-month exploration of western North America, the most significant item they brought back was not the specimens of unknown plant and animal species, nor the accounts of the native inhabitants of that area. What they brought back, indeed, the primary purpose of their expedition, was a map, a map of hitherto unknown territory, a map for others to follow [VIAs, 1998]. Maps show the way, how to get from where you are to where you want to go. They represent key features and their relative locations, and allow you to orient yourself while traveling through unfamiliar territory. Lewis & Clark's map had a profound influence on the newly formed United States. It guided millions of settlers and explorers from the familiar land of the original Eastern colonies to the new lands of the West.

    Maps can be created by explorers for others to follow. But they can also serve as reminders. They can document significant discoveries as well as blind alleys and wrong turns. They can be a valuable record of the exploration process.

    Analogies are like maps of unfamiliar knowledge territories, created by the explorers of those territories for others to follow. They show you how to get from what you know to what you want to know. Like geographical maps, they represent key features -- concepts -- and their relationships, and they help you orient yourself while you are learning. And just as Lewis and Clark's map guided travelers from the known to the unknown, analogies have guided humans from existing ideas and assumptions to new knowledge.

    When Galileo wanted to lead people from the familiar Earth-centered world of Ptolemy to the unfamiliar Sun-centered world of Copernicus, he created a map -- an analogy map -- showing how the solar system of planets was like the Jovian system of satellites that he directly observed through his telescope [Galileo, 1632]. When Darwin tried to lead people from the creation-centered world of static biological species to the dynamic world of biological evolution, he created his famous analogical map showing how to get from the familiar process of animal and plant variation under domestication to the process of natural selection [Darwin, 1859].

    This dissertation first presents a map of what might be called Analogy Land, describing the points of interest and the routes you must travel to visit them. We then provide you with mapmaker's tools -- what you need to create and share maps of your own explorations, and to understand the analogy maps of other explorers.

    1.1 Why are analogies important?

    Analogies pervade all human communication and learning. They occur in an extraordinary scope and variety, ranging from the simplest ratio form, such as "hand is to arm as foot is to leg", through extended analogical essays proposing that a computer is like a brain [von Neumann, 1958], the Internet economy is like the England railroad boom of the 1800s [Arthur, 2002], and the mind of an autistic is like a Web browser [Grandin, 2000].

    Analogies are widely used when explaining ideas, especially in instructional contexts. Using analogies is one of the core processes of cognition [Forbus, 2001], and may be the primary process of all cognition and communication [Hofstadter, 2001]. Analogies are a key component of learning-by-example, or case-based reasoning [Schank, 2000], and are quite common in science education [Glynn, 1997], [Paris, 2000]. And, although analogies are generally

  • 8/14/2019 HFoxwell Dissertation

    14/196

    Analogies assist in acquiring new knowledge by attempting to map the structure of existing knowledge to new situations [Gentner, 1983]. For example, in a widely known and often cited analogy, Ernest Rutherford in 1910 proposed that an unfamiliar idea, the structure of the atom, is like the structure of a presumably familiar object, the solar system [Bohr, 1922][Gentner, 1983]. In this example the atom is the target analog of the analogy and the solar system is the source analog. This analogy suggests that we try to map what we currently know about the source, the solar system's parts and their interrelations, to the target, the parts of the atom, allowing us to make plausible inferences and predictions and to form hypotheses about the target object. Some of these predictions and inferences may turn out to be wrong, but the analogy provides a useful structure, or scaffold, for generating and considering them [Roblyer and Edwards, 2000][Bruner, 1986].
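
    The source-to-target mapping described above can be sketched as a small data structure. The following Python sketch is illustrative only: the concept names and the single "revolves around" relation are assumptions chosen for this example, and it is not the representation developed later in this dissertation.

```python
# Hypothetical illustration of mapping a source analog onto a target analog.
# Concept and relation names are assumptions made for this sketch.

# Concept correspondences: source concept -> target concept.
concept_map = {
    "sun": "nucleus",
    "planet": "electron",
}

# What we already know about the source, as (subject, relation, object) triples.
source_relations = [
    ("planet", "revolves around", "sun"),
]


def infer_target_relations(relations, mapping):
    """Carry each source relation over to the target as a candidate inference."""
    inferred = []
    for subject, relation, obj in relations:
        # Only carry over relations whose concepts both have a target counterpart.
        if subject in mapping and obj in mapping:
            inferred.append((mapping[subject], relation, mapping[obj]))
    return inferred


candidate_inferences = infer_target_relations(source_relations, concept_map)
for subject, relation, obj in candidate_inferences:
    print(f"Hypothesis: {subject} {relation} {obj}")
```

    Each carried-over relation is only a hypothesis about the target; as noted above, some such inferences may turn out to be wrong.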

    1.3 Why we want to use the Web to access analogy expressions

    The World Wide Web (WWW) is a system for storing, retrieving, and visualizing information authored by people anywhere in the world. It is generally built upon widely accepted technology standards, and consequently its accessibility can extend globally. Emerging technologies such as the Extensible Markup Language (XML) [W3C, 2000] are now used to improve and restructure Web-based information by allowing the separation of the content from the mechanisms of its presentation, thereby allowing multiple forms of access and display on diverse devices using the same source data. The Web and its technologies constitute an ideal environment for sharing and communicating human-conceived analogies given its ubiquity, global reach, and universal standards.

    One of the most important consequences of the Web's growth and scope is the formation of communities of practice centered on personal, social, and especially professional interests. The Web provides the medium for communication, and XML supports the development of a common language for each community. Businesses, government agencies, and academic disciplines are using XML to develop languages for exchanging data, ideas, and documents within their respective communities. For example, mathematicians have agreed upon a common language for representing mathematical expressions on the Web [W3C, 2001], chemists use Chemical Markup Language [Murray-Rust, 1998], and the U.S. Department of Justice is developing Justice XML, a set of projects to enable representation and sharing of data on criminal activities, biometric information, and driving records [USDOJ, 2002].

    Web-based communities of practice centered on education topics and issues have become an especially effective aid to teachers and students [Gordin and Gomez, 1996][Wenger, 1998]. Education community professionals have also used XML to define a common language for exchanging information and data about curriculum structure and content using the Learning Object Metadata Standard [IEEE, 2003].

    We suggest the need for a mechanism to represent, record, share, and retrieve human-conceived analogies in a structured and globally accessible form, for communities of analogy practitioners. Such Web-based communities are already forming [Ruhl, 2002]. We therefore require a compact and general representation for analogies that supports their expression and visualization using standard Web-based technologies, and that is relatively easy to understand, author, and extend. Our approach is to provide an augmentation of existing Web documents that contain analogies rather than embedding contextual markup within those documents. This approach recognizes the immense volume of existing HTML-based Web content that will persist for many years, and provides a mechanism for content authors to include easily created structured contextual information about analogies described in their Web documents.

    1.4 What we want to do with analogies and why

    In order to communicate about analogies using the Web, we need to represent them in a standard format, and provide tools to author, display, use, share, and reuse the analogies. Having recorded analogies in a standard format then permits the storage, retrieval, and comparison of analogies in educational and other explanatory contexts. The ability to reference multiple analogies for a given target has been shown to increase students' understanding of the target [Nottis and McFarland, 2001], and it has been recommended that teachers develop a repertoire of analogies [Thiele and Treagust, 1994]. Our system is therefore designed to assist analogy practitioners -- authors and users of analogies -- by providing a common language for representing, visualizing, and retrieving analogies, providing Web access to multiple alternate analogies, and providing an archive of interesting and useful human-conceived analogies.

    Chapter 7 describes MARVIN (Markup for Analogy Representation and Visualization for the InterNet), a system that defines an XML-based representation for analogies and demonstrates how it can be used to represent and visualize analogy expressions, and to store, retrieve, and share such expressions in local and Web-based archives. Figure 1.1 shows the general architecture of the MARVIN system.

    Figure 1.1 The MARVIN System Architecture

    [Figure 1.1 components: User Client and Browser (MARVIN User Interface with Instructional URL, MARVIN URL, and MARVIN Search Engine query and results); Network; MARVIN Programmable Proxy Server; WWW Remote and Local Instructional/Content Web Servers holding Analogy Expression(s); MARVIN Transformation & Search Engine (Analogy Expression DTD, XSLT Stylesheets, XML Transformation Engine, XML Search Engine, XML Analogy Expression Archive)]

    1.4.1 Representation

    We require a compact and intuitive representation of analogies that is easy to author with generally available tools, and is capable of expressing much of the range of analogies that humans are able to generate. Such an expression must be consistent with current theory and research findings on analogy components and structure, and must be programmatically useful to permit Web-based sharing, storage, retrieval, and manipulation of the expressions. The MARVIN system defines such an XML content model for the creation of analogy representations; with this model, analogy authors can create analogy expressions using XML editors or the Analogy Expression Editor described in Chapter 7.
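
    To give a rough sense of what a programmatically useful XML expression of an analogy might look like, the following Python sketch builds one with the standard library. The element and attribute names (analogy, source, target, conceptMap, map, from, to) are hypothetical placeholders invented for this sketch; they are not the elements of the Analogy Expression DTD.

```python
import xml.etree.ElementTree as ET


def build_analogy_expression(title, source, target, concept_pairs):
    """Build a small XML analogy document (hypothetical element names)."""
    analogy = ET.Element("analogy", {"title": title})
    ET.SubElement(analogy, "source").text = source
    ET.SubElement(analogy, "target").text = target
    concept_map = ET.SubElement(analogy, "conceptMap")
    for source_concept, target_concept in concept_pairs:
        # One <map> element per source-to-target concept correspondence.
        ET.SubElement(concept_map, "map",
                      {"from": source_concept, "to": target_concept})
    return analogy


rutherford = build_analogy_expression(
    "Rutherford",
    source="solar system",
    target="atom",
    concept_pairs=[("sun", "nucleus"), ("planet", "electron")],
)

# Serialize to text so the expression can be stored, shared, or transformed.
xml_text = ET.tostring(rutherford, encoding="unicode")
print(xml_text)
```

    Because the result is ordinary XML, it could in principle be authored in any XML editor and processed by any XML-aware tool, which is the property the requirement above is after.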

    1.4.2 Visualization

    Visualization is a visual/spatial display in which information is communicated by the spatial arrangement of elements in the representation [Hegarty, 2002]. Such displays are cognitive aids that promote memory and information processing [Tversky, et al., 2002]; visualizations of analogies facilitate learning and enhance the use of analogies in both instruction and problem solving [Craig, et al., 2002], [Paris, 2000]. We require a method for producing multiple visualization forms from our analogy expression that separates the process of producing the visualization from the expression of the content and structure of the analogy, and that allows for display of these visualizations using standard Web browser technologies.

    An analogy can be understood when the person hearing or reading it knows something about the source analog and is able to observe and map the components of the source to the target. Understanding of the analogy is significantly improved when the relational structures present in the analogy can be visualized in a tabular or graphical form [Paris, 2000][Matocha, Camp, and Hooper, 1998]. The MARVIN system provides this capability through the use of XSLT stylesheets, which transform analogy expressions into visualizations that can be displayed using standard Web browsers.
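
    The idea of deriving a tabular visualization from an analogy expression, while keeping the content separate from its presentation, can be sketched as follows. This is not XSLT: the Python standard library has no XSLT processor, so the sketch uses xml.etree.ElementTree as a stand-in for the transformation step. The XML element names are hypothetical and do not come from the Analogy Expression DTD.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical analogy expression; element names are assumptions for this sketch.
ANALOGY_XML = """<analogy title="Rutherford">
  <source>solar system</source>
  <target>atom</target>
  <conceptMap>
    <map from="sun" to="nucleus"/>
    <map from="planet" to="electron"/>
  </conceptMap>
</analogy>"""


def to_html_table(xml_text):
    """Render the concept correspondences as a two-column HTML table."""
    root = ET.fromstring(xml_text)
    source = html.escape(root.findtext("source", default=""))
    target = html.escape(root.findtext("target", default=""))
    # Header row names the source and target analogs.
    rows = [f"<tr><th>{source}</th><th>{target}</th></tr>"]
    for mapping in root.iter("map"):
        left = html.escape(mapping.get("from", ""))
        right = html.escape(mapping.get("to", ""))
        rows.append(f"<tr><td>{left}</td><td>{right}</td></tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


table_html = to_html_table(ANALOGY_XML)
print(table_html)
```

    A different rendering function applied to the same XML text could produce a graphical form instead, without any change to the analogy expression itself.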

    1.4.3 Storage/Retrieval

    We wish to make interesting and useful human-conceived analogies storable and retrievable in a standard structured format for computation and expression. An archive of such expressions would permit the recording of analogies conceived during the development of problem solutions in order to document, replicate, and share the thought processes of the problem solvers [Dunbar, 2001].

    For those who wish to use analogies for instructional or explanatory purposes, a searchable archive of analogy expressions may be queried to locate an appropriate analogy for the topic under discussion. Or, if a proposed analogy is not understood by the learner or reader, the archive may be queried to locate additional or alternate analogies better suited to the learner's background knowledge. Education researchers have recommended that teachers develop a repertoire of analogies for their instruction [Thiele and Treagust, 1994]; they have also found that presenting multiple analogies for a given target analog resulted in greater understanding of the target [Nottis and McFarland, 2001].

    The ability to retrieve several analogies of possible interest implies the need
    for analogical ranking methods that may be used to order the results of a
    query. Such methods, discussed in Chapter 6, permit a learner to select
    candidate analogies most appropriate to the learning task. The MARVIN
    system was therefore designed to implement storage, retrieval, and ranking
    of analogy expressions.

    1.5 The Focus of this Research

    Analogy research historically has focused on several basic and overlapping
    areas: understanding the cognitive processes of analogical reasoning
    [Hofstadter, 1995][Marshall, 1999][Gentner, 1989][Falkenhainer, 1989],
    computer simulation of human analogical reasoning [French, 1995][Gentner,
    1989][Forbus, Gentner, and Law, 1995], and using analogies in educational
    settings [Glynn, 1997][Schank, 2000][Paris, 2000]. While the computer
    simulation research has yielded great insight into the cognitive foundation of
    analogical reasoning, it is often limited to relatively small, well defined, and
    easily represented areas of knowledge known as microdomains. CopyCat
    [Hofstadter and Mitchell, 1994] and TableTop [French, 1995] are examples of
    this approach. As noted in [Forbus, 2001], such systems represent and test
    limited forms of analogy-related tasks, and such systems cannot possibly
    scale to handle the kinds of cognitive processing that human beings clearly
    do. The work presented here is focused on helping humans share and use
    analogies they have already conceived or discovered, rather than using
    computers to discover analogies or to perform analogical reasoning.

    Human analogical reasoning typically involves the following processes
    [Holyoak, et al., 2001]:

    - recall a source analog
    - map the components of the source to the target
    - generate plausible inferences about the target
    - evaluate the inferences about the target
    - accept (learn/remember) new knowledge about the target

    The system described in this dissertation focuses on helping humans with the
    first two steps (retrieving source analogs already perceived and described by
    others, and visualizing the mappings between the source and target analogs)
    and the final step (remembering the analogy). Generating and evaluating
    inferences suggested by the analogy remains the responsibility of the analogy
    author and users. That is, we are not concerned here with computer-aided
    analogy generation.

    1.6 Overview of Contributions

    The primary contribution of our work is the design and development of
    MARVIN (Markup for Analogy Representation and Visualization for the
    InterNet), a prototype Web-based system that enables authors and users of
    instructional content to record, retrieve, visualize, and query human-conceived
    analogy expressions. We demonstrate the usefulness of the system
    through a formative evaluation process by content experts and authors, and
    by end users. Additional contributions associated with the development of
    the MARVIN system are discussed below.

    We developed a compact, general representation of analogy expressions,
    using an XML content model for universality and Web-based access, and
    demonstrated through examples the power of this representation to express
    analogies from varied domains, such as science, history, medicine, religion,
    and literature.

    We developed multiple visualization methods, such as tabular and graphical
    displays, which are generated directly from the XML analogy representations
    using the Extensible Stylesheet Language (XSL) [W3C, 1999], and
    demonstrated a variety of example visualizations.

    We developed Web-based methods for storage and retrieval of analogy
    expressions, and demonstrated examples of retrieving alternate analogies
    and ranking the retrieved expressions.

    The above contributions use our model of the components and structure of
    analogies. This model is consistent with both common usage and formal
    characterizations of analogies from cognitive science research, from
    educational practice and research, and from work on Web-based knowledge
    representation.

    1.7 Dissertation Organization

    The remainder of this dissertation is organized as follows:

    Chapter 2 discusses the various Web technologies used in the design and
    implementation of the MARVIN system, and reviews prior research on
    analogies by computer scientists, cognitive scientists, and education
    researchers.

    Chapter 3 reviews and defines the components of analogies, discusses the
    basic structure of analogies, and introduces a graphical representation for
    several key structures commonly found in analogies. This chapter also
    presents and discusses several example analogies from various domains.

    Chapter 4 discusses the need for a general-purpose representation for
    analogies, discusses design goals for such representations, presents a formal
    definition for analogy expressions, and implements that definition using an
    XML content model.

    Chapter 5 discusses how analogies that are expressed using our XML content
    model may be visualized in several forms using Web-based tools such as XSL
    Transformations (XSLT) [W3C, 1999] and Scalable Vector Graphics (SVG)
    [W3C, 2001], discusses design goals for such visualizations, and presents
    several example visualizations.

    Chapter 6 discusses the storage and retrieval of analogy expressions, and
    introduces and demonstrates methods for ranking the results of queries of
    analogy expression archives.

    Chapter 7 discusses the design goals of the MARVIN system for recording,
    retrieving, and visualizing analogy expressions that use our XML model and
    stylesheets, and describes the components and system architecture.
    Performance characteristics of the MARVIN system are also discussed. The
    Analogy Expression Editor, used to aid authors in creating XML analogy
    expressions, is also described.

    Chapter 8 presents the results of formative evaluations of the MARVIN
    system by content authors and individual users.

    Chapter 9 provides a summary of the dissertation, and discusses future
    research suggested by this work in the area of analogy representation,
    visualization, and retrieval.

    2. Background

    2.1 Web Technologies Used in Our Research

    The work described in this dissertation employs several technologies
    developed by the World Wide Web Consortium (W3C) that were designed to
    provide structure and meaning to Web-based content and to provide
    programming languages and interfaces to access that content. These
    technologies include XML (Extensible Markup Language), XSLT (Extensible
    Stylesheet Language for Transformations), and related tools and languages
    such as Xerces, XPath, Xalan, Apache Tomcat, Jakarta Lucene, Apache
    Cocoon, and HTML. The following sections provide overviews of these
    technologies. Section 2.2 of this chapter reviews research concerning
    analogies.

    2.1.1 XML

    XML (Extensible Markup Language) is a meta-language designed to provide
    a universal format for structured documents and data on the Web [W3C,
    2000]. Virtually all Web pages are currently written and formatted using
    HTML (HyperText Markup Language), which was specifically designed for
    presentation and display of Web content [W3C, 1999a]. But HTML has no
    capability for attaching meaning or interpretation to the content. Using
    XML, content authors can create special-purpose descriptive languages that
    can be used to tag content structure and components. Such languages
    enable communities of practitioners to use a common language through the
    Web for effective sharing of data and ideas.

    XML tags are unique, case-sensitive labels for sections of content, delimited
    by angle brackets. Tags are used to encapsulate content elements and give
    them meaning. This enables search engines and other Web tools to reduce
    ambiguity in the search space. For example, tagging the word Brown in a
    Web document listing personal information, using

        <name>Brown</name> or <color>Brown</color>

    permits a search engine or other program to distinguish between Brown the
    name and Brown the color.
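
    As a minimal sketch of how such tagging lets a program tell the two uses of
    Brown apart, the following Python fragment parses a small record and
    retrieves the word only where it is tagged in a given way. The element names
    person, name, and color are assumptions chosen for illustration; they are not
    taken from the MARVIN content model.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; the element names are assumptions for illustration only.
RECORD = """
<person>
  <name>Brown</name>
  <color>Brown</color>
</person>
"""

def values_tagged(xml_text, tag):
    """Return the text of every element in xml_text that carries the given tag."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter(tag)]

names = values_tagged(RECORD, "name")    # Brown, the name
colors = values_tagged(RECORD, "color")  # Brown, the color
print(names, colors)
```

    A search restricted to one tag returns only the occurrences that carry the
    intended meaning, even though the tagged text is identical.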

    The XML specification describes strict rules for the tagging of content in
    order to simplify document parsing and to eliminate ambiguity. Content
    elements have start-tags and end-tags; an XML document is well formed if
    elements delimited by start-tags and end-tags nest properly within each
    other (that is, <b><i>text</b></i> is not permitted, while
    <b><i>text</i></b> is properly nested). All XML documents must be well
    formed.
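
    The nesting rule can be checked mechanically. The sketch below uses the
    XML parser in the Python standard library, which rejects any document that
    is not well formed; the tag names a and b are arbitrary placeholders.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML, otherwise False."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # Raised for crossed end-tags, unclosed elements, mismatched case, etc.
        return False

print(is_well_formed("<a><b>text</b></a>"))  # elements nest properly
print(is_well_formed("<a><b>text</a></b>"))  # end-tags cross
print(is_well_formed("<a>text</A>"))         # tag names are case-sensitive
```

    This checks well-formedness only. Validity against a DTD or schema is a
    separate test that requires a validating parser.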

    XML tags may be defined within the document content file itself, or may be
    defined in an associated file using the XML DTD (Document Type Definition)
    specification [W3C, 2000] or the XML Schema specification [W3C, 2001]. A
    valid XML file uses only the tags defined in its DTD or Schema file, and must
    also be well formed.

    2.1.2 XML Editors

    The MARVIN system requires the creation of XML files to represent
    analogies. There are numerous commercial and open-source XML editors,
    such as XML-Spy [XML-Spy, 2002], MS XML NotePad, Morphon [Morphon,
    2002], and epcEDIT [epcEDIT, 2002], to name only a few. Such editors enable
    the creation of valid, well-formed XML files that conform to a DTD or
    schema. Analogy authors using the MARVIN system can elect to use such
    editors to create analogy expression files, but using such editors requires
    knowledge of XML structure and syntax. We therefore designed and
    implemented an Analogy Expression Editor, written in Java, for use with the
    MARVIN system, which permits the creation and modification of XML files
    that conform to our content model, without requiring the author to know any
    XML. This editor is described further in Chapter 7.

    2.1.3 Java

    The Java programming language [Joy and Gosling, 2000] is ideal for Web-based
    applications because of its rich network APIs and its ability to run on
    diverse operating systems and architectures. Most of the technologies used
    in this dissertation use Java directly or indirectly, because they are written
    as Java applications or as Java servlets [Horstmann and Cornell, 2000][Sun,
    2003]. Servlets are used to extend the capabilities of Web servers by enabling
    programmers to produce interactive, dynamic Web content based on user
    actions and data content. Servlets run within a container that manages the
    servlet's interaction with the server's operating system and Web server.
    Servlet containers are typical components of commercial and open-source
    Web and application servers, and can also be implemented as stand-alone
    Web services.

    2.1.4 Jakarta Tomcat

    Jakarta Tomcat is an open-source Java servlet container that is the official
    reference implementation for Java Servlets [Apache, 2002]. It can be
    integrated with Web servers such as Apache [Apache, 2002], or run as a
    stand-alone Web service. The MARVIN prototype system described in
    Chapter 7 is implemented using Jakarta Tomcat to run the XML
    transformation and search servlets.

    2.1.5 Apache Cocoon

    Apache Cocoon [Apache, 2002] is an XML Web publishing framework that
    runs as a servlet within Apache Tomcat. It enables the development,
    management, and generation of dynamic Web content from XML source
    documents. It permits the separation of the representation of Web content
    from the processing necessary to generate multiple forms of display. The
    MARVIN system uses Cocoon to transform XML representations of analogies
    into a variety of visualization forms, and to interface with the retrieval and
    text search servlets.

    2.1.6 XML Parsers

    XML documents must be read and interpreted according to the XML
    specification. A program that performs this task is called an XML parser. A
    parser that enforces a document's compliance with a DTD or schema is called
    a validating parser. There are several commercial and open-source
    validating XML parsers available. For the work described in this
    dissertation, we use the Xerces Java Parser [Apache, 2001], which supports
    the XML 1.0 recommendation [W3C, 2000]; the Apache Cocoon servlet used
    in the MARVIN system uses Xerces to parse the XML analogy expressions.

    2.1.7 XSLT

    XSLT (Extensible Stylesheet Language for Transformations) is a language
    for transforming XML documents into other forms, including PostScript,
    Adobe PDF, HTML, Java, and alternate XML representations [W3C, 2001].
    XSLT is also a specification for such transformations. It is partially
    implemented in some browsers such as Mozilla [Mozilla, 2002], but these
    implementations are still immature and buggy. A more mature and complete
    XSLT processor is Xalan [Apache, 2002]. The Xalan processor operates on a
    parsed XML document and transforms it according to instructions contained
    in a stylesheet file. These instructions, called templates, specify
    transformations to be applied to selected node elements of the parsed XML
    document [Kay, 2001].

    The XSLT programming model is not procedural, driven by the program code.
    Rather, it is event-oriented, driven by the data, in this case by the XML
    document. When the XSLT processor reads the parsed XML document, it
    detects document node match events and processes the node data according
    to the template defined for that node.
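
    The data-driven model can be imitated in a few lines of Python. This is not
    XSLT and does not use an XSLT processor; it is only a sketch of the idea that
    each node selects the template registered for its name, in the manner of
    xsl:template and xsl:apply-templates. The analogy markup and its element
    names (analogy, mapping, source, target) are assumptions for illustration,
    not the MARVIN content model.

```python
import xml.etree.ElementTree as ET

# Hypothetical analogy markup; element names are assumptions for illustration.
ANALOGY = """
<analogy>
  <mapping><source>sun</source><target>nucleus</target></mapping>
  <mapping><source>planet</source><target>electron</target></mapping>
</analogy>
"""

def apply_templates(node, templates):
    """Select the template registered for this node's tag and apply it."""
    template = templates.get(node.tag, default_template)
    return template(node, templates)

def children(node, templates):
    """Apply templates to each child element in document order."""
    return "".join(apply_templates(child, templates) for child in node)

def default_template(node, templates):
    # No template matched: emit the node's own text, then process its children.
    return (node.text or "").strip() + children(node, templates)

# One template per element name; each builds one piece of the output document.
TEMPLATES = {
    "analogy": lambda n, t: "<table>" + children(n, t) + "</table>",
    "mapping": lambda n, t: "<tr>" + children(n, t) + "</tr>",
    "source": lambda n, t: "<td>" + n.text + "</td>",
    "target": lambda n, t: "<td>" + n.text + "</td>",
}

html = apply_templates(ET.fromstring(ANALOGY), TEMPLATES)
print(html)
```

    The order of the output follows the order of nodes in the document, not the
    order in which the templates were written, which is the sense in which the
    processing is driven by the data.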

    XSLT templates may be thought of as independently selectable processing
    instructions that bind to a node in the XML document when a match is
    encountered, analogous to the way messenger RNA binds to a segment of
    DNA during cell reproduction [Piez, 2002], [Foxwell, 2002]. Just as the
    messenger RNA provides instructions for constructing a specific protein, the
    template provides instructions for constructing a specific component of an
    output document. The Apache Cocoon servlet used in the MARVIN system
    uses the Xalan XSLT processor.

    2.1.8 XPath

    XPath is an XML language specification for referencing nodes of a parsed
    XML document [W3C, 1999b]. Its syntax is similar to that for computer file
    system directories, and permits reference to a document's element and
    attribute nodes, and to their parent and child nodes. XSLT templates contain
    XPath references to document nodes, and the instructions for processing the
    nodes. XSLT processors such as Xalan make use of XPath when referencing
    XML document nodes for the templates to process.
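
    The directory-like syntax can be tried with the Python standard library,
    whose ElementTree module implements only a limited subset of XPath. The
    sketch below is illustrative; the document and its element and attribute
    names are assumptions, not the MARVIN content model.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element and attribute names are assumptions.
DOC = ET.fromstring("""
<analogy id="rutherford">
  <mapping kind="concept"><source>sun</source><target>nucleus</target></mapping>
  <mapping kind="concept"><source>planet</source><target>electron</target></mapping>
</analogy>
""")

# Steps separated by "/" read like a file system path from the current node.
sources = [el.text for el in DOC.findall("./mapping/source")]

# A predicate in brackets selects the mapping whose target child is "electron".
match = DOC.find("./mapping[target='electron']")

# Attribute nodes can be tested in a predicate with the "@" prefix.
concept_mappings = DOC.findall(".//mapping[@kind='concept']")

print(sources, match.find("source").text, len(concept_mappings))
```

    A full XPath implementation, such as the one used by an XSLT processor,
    supports many more axes and functions than the subset shown here.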

    2.1.9 SVG

    SVG (Scalable Vector Graphics) is a language for describing two-dimensional
    graphics in XML [W3C, 2002], allowing the generation of lines, curves,
    images, and text from an XML document specification. SVG provides a
    compact and portable means for generating resizable and searchable Web-based
    images [Eisenberg, 2002]. Browsers such as MS IE and Mozilla are
    beginning to support the display of SVG graphics directly, but as with XSLT,
    these are still early implementations and do not completely support the full
    SVG specification. However, there are tools for converting SVG graphics to
    browser-displayable GIF or JPG format graphics. The Cocoon servlet used in
    the MARVIN system includes an SVG-to-JPG transformer; the SVG analogy
    visualizations produced by the MARVIN system can thus be converted to
    JPG format and displayed on any graphics-capable Web browser.
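
    Because SVG is itself XML, a drawing can be produced with ordinary XML
    tools. The following sketch builds a minimal SVG document that joins two
    labelled concepts with a line. The function name mapping_svg, the layout
    coordinates, and the choice of concepts are assumptions for illustration and
    do not reproduce the visualizations generated by MARVIN.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

def mapping_svg(source, target):
    """Return SVG text showing source and target labels joined by a line."""
    svg = ET.Element("svg", {"xmlns": SVG_NS, "width": "300", "height": "60"})
    ET.SubElement(svg, "text", {"x": "10", "y": "35"}).text = source
    ET.SubElement(svg, "line", {"x1": "80", "y1": "30", "x2": "200", "y2": "30",
                                "stroke": "black"})
    ET.SubElement(svg, "text", {"x": "210", "y": "35"}).text = target
    return ET.tostring(svg, encoding="unicode")

svg_text = mapping_svg("planet", "electron")
print(svg_text)
```

    The labels remain ordinary text inside the XML, which is what makes SVG
    images searchable in a way that GIF or JPG images are not.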

    2.1.10 Querying XML Documents

    There are various tools under development for searching the contents of XML
    documents. XQuery [W3C, 2002] is an XML query language currently under
    development by W3C, but the specification for this language is still in
    Working Draft status, and at this time there are few complete
    implementations. On the other hand, there are text search engines that can
    make use of XML markup in documents, providing the ability to perform a
    structured search for text within XML tagged fields. One such structure-aware
    search engine is Jakarta Lucene [Apache, 2002], written in Java, and
    implemented as a Java servlet. Lucene allows queries based on XML tag
    content; the MARVIN system uses Lucene as its search and retrieval
    component.
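
    The idea of a structured search, in which a term is matched only inside a
    named field, can be sketched without any search engine. The fragment below
    is plain Python and does not use or imitate the Lucene API; the archive, its
    file names, and the element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical archive; file names and element names are assumptions.
ARCHIVE = {
    "atom.xml": "<analogy><source>solar system</source><target>atom</target></analogy>",
    "heart.xml": "<analogy><source>pump</source><target>heart</target></analogy>",
    "cell.xml": "<analogy><source>leaf</source><target>solar cell</target></analogy>",
}

def search_field(archive, field, term):
    """Return sorted names of documents whose <field> text contains term."""
    hits = []
    for name, xml_text in archive.items():
        root = ET.fromstring(xml_text)
        # Case-insensitive substring match, restricted to the requested field.
        if any(term.lower() in (el.text or "").lower() for el in root.iter(field)):
            hits.append(name)
    return sorted(hits)

print(search_field(ARCHIVE, "source", "solar"))
print(search_field(ARCHIVE, "target", "solar"))
```

    The same term yields different results depending on the field searched,
    which an unstructured text search over the same files could not distinguish.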

    2.2 Related Research

    Research concerning analogies occurs in many disciplines. Computer
    scientists in the fields of Artificial Intelligence and Machine Learning build
    systems that attempt to model analogical reasoning; cognitive scientists also
    build such systems to investigate the underlying mental processes involved in
    memory, analogy perception, and analogical reasoning; and educators study
    the use, and abuse, of analogies in teaching. Because the computer is often
    used as a surrogate for studying and modeling the human mind, there is
    significant overlap between cognitive science research and computer science
    research into the workings of analogies.

    Additionally, historians analyze and debate the usefulness of historical
    analogies [Neustadt, 1988][Rourke and Taylor, 1995], and legal scholars and
    practitioners make extensive use of analogies in legal arguments [Ashley,
    1991]. Analogies are also quite common in journalism and editorial writing,
    although their overuse and oversimplification of important ideas has been
    criticized [Clark, 2002].

    In short, analogies are found in nearly all areas of human communication
    and learning, and are widely studied in the scientific and social disciplines.
    Communities of analogy practitioners and researchers use the Web
    extensively to share examples, ideas, and research results. The following
    sections review analogy research in Computer Science, Cognitive Science, and
    Education.

    2.2.1 Analogy Research in Computer Science

    Among the prominent early researchers in computer learning and reasoning
    by analogy is Patrick Winston of the Artificial Intelligence Laboratory at MIT
    [Winston, 1980]. He designed a LISP-based Frame Representation Language
    (FRL) system derived from Minsky's knowledge representation frames
    [Minsky, 1985]. FRL was applied to finding analogous story plots in
    literature samples that were already expressed in concept/relation structures.
    It used weighted matching criteria to compare concepts and relations, giving
    high weights to those that were of particular importance to plot structure.
    While Winston was able to find sub-plot analogies between sections of
    Shakespeare's Hamlet and Macbeth, for example, this research focused on
    finding analogies that were already known and in limited knowledge
    domains.

    Much analogy research explores relatively small, well-defined, and easily
    represented areas of knowledge known as microdomains. The ANALOGY
    program [Evans, 1968], for example, examined spatial analogies among
    geometric shapes. Decades later, microdomain programs such as Copycat
    [Mitchell, 1999], Tabletop [French, 1995], and IDA [Wolverton, 1994] continue
    to yield insight into the cognitive processes that cause the perception of
    analogies. Copycat, for example, examines analogies between character
    strings, and Tabletop generates action analogies using common objects on a
    kitchen table. Wolverton's IDA sought to generate engineering design
    analogies, and focused primarily on finding an efficient algorithm for
    searching a predefined problem space ontology.

    Some attempts at computer modeling of analogy retrieval use concept
    indexing, spreading activation, and network-matching approaches [Collins
    and Loftus, 1975]. These techniques were useful in finding semantically close
    analogies (target and source analogs taken from the same specialized
    knowledge domain), but they were not scalable to finding semantically
    distant analogies across multiple large knowledge domains (the type humans
    are good at creating). [Wolverton, 1994] proposed a refinement of spreading
    activation, called Knowledge-Directed Spreading Activation, consisting of
    multiple network-matching search agents that would persist when concept
    map fragments were matched and would reduce or stop activation when no
    fragment matches were found. Hofstadter and Mitchell's Copycat program
    [Mitchell, 1999] used a similar agent-based approach, with software agents
    performing multiple, random searches through the source knowledge archive;
    each agent would gain or lose resources for continued searching according to
    its success in finding candidate concept matches.
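
    A generic form of spreading activation can be sketched as follows. This is a
    textbook-style illustration only: it is not the model of [Collins and Loftus,
    1975], nor Wolverton's Knowledge-Directed Spreading Activation, and the
    concept network, the decay factor, and the threshold are all assumptions.

```python
from collections import defaultdict

# Hypothetical concept network; the links are assumptions for illustration.
LINKS = {
    "sun": ["star", "gravity"],
    "gravity": ["attraction"],
    "attraction": ["electric charge"],
    "electric charge": ["nucleus"],
    "star": [],
    "nucleus": [],
}

def spread(links, start, decay=0.5, threshold=0.1):
    """Propagate activation outward from start, attenuated by decay per link."""
    activation = defaultdict(float)
    activation[start] = 1.0
    frontier = [start]
    while frontier:
        next_frontier = []
        for node in frontier:
            passed = activation[node] * decay
            if passed < threshold:
                continue  # too weak to propagate any further
            for neighbor in links.get(node, []):
                # Keep only the strongest activation that reaches a concept.
                if passed > activation[neighbor]:
                    activation[neighbor] = passed
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return dict(activation)

activation = spread(LINKS, "sun")
print(activation)
```

    With these settings the activation that starts at sun dies out before it
    reaches nucleus, four links away. This mirrors the limitation noted above:
    nearby concepts are found readily, while semantically distant ones are not.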

    Because analogies are perceptions that need to be tested against reality,
    human interpretation and expertise should be used to validate analogies
    proposed by a computer. The Disciple system [Tecuci, 1998], for example,
    suggests problem solutions derived using analogy to a human expert, who
    accepts or rejects them and provides an explanation for the decision. The
    method used in Disciple illustrates an important idea in searching for
    analogies: replacing a concept in a knowledge expression with a
    generalization of the concept. Hofstadter calls this variablization, and
    Mitchell calls it conceptual slippage [Hofstadter, 1995]. A key characteristic
    of this process is that the relations among the concepts in the knowledge
    expression are preserved. [Gentner, 1983] calls this preservation of
    relational structure the systematicity principle.

    Another research effort in the modeling of human reasoning is the Cyc
    Project [Lenat and Guha, 1990]. Cyc is a large, general purpose knowledge

    The presentation and visualization of analogies for instructional purposes
    have been shown to significantly enhance learning and retention, particularly
    for novel or complex topics [Matocha, Camp, and Hooper, 1998], but
    educators recognize the limitations and dangers of misleading analogies.
    Some have attempted to develop metrics for analogical validity [Nottis and
    McFarland, 2001], while most have emphasized the need, during the
    analogical reasoning process, to indicate where analogies break down [Herr,
    2001].

    2.2.4 The Semantic Web and Knowledge Representation

    At present, most of the documents on the World Wide Web are visual and
    textual, marked with HTML tags for browser display formatting and for ease
    of navigation among documents. Tim Berners-Lee, credited with inventing
    the Web, envisions a richer form of content he calls the Semantic Web
    [Berners-Lee, Hendler, and Lassila, 2001], [W3C, 2002]. In the Semantic
    Web, every object (word or image) in a document is labeled with its meaning
    and context, and is linked to related objects according to the purpose of the
    document. Such labeling and linking, using XML metadata, may eventually
    enable intelligent software agents to access, interpret, and transform all Web
    content for other software agents as well as for human users.

    RDF (Resource Description Framework) is a framework that supports the
    Semantic Web project for describing and exchanging data about objects
    represented on the Web [W3C, 2003]. It describes all Web content in terms of
    resources (object location), properties (author, title, domain), and associations
    among resources, and is focused on providing computer-searchable metadata
    for identifying and locating resources. It presumes, however, that eventually
    all Web content will be re-created or at least annotated using RDF codes; the
    process and standards for this effort are still under development, and there is
    concern that authors of Web content will find the required markup to enable
    the Semantic Web vision too burdensome [Suter, 2003].

    Topic Maps [TopicMaps.Org, 2001] are a related Web technology explicitly
    designed to represent and display associations among terms within a
    document, permitting a content author to link pairs or groups of concepts and
    to describe the nature of the relationships among the concepts. Topic maps
    and Semantic Web technologies both require substantial markup within a
    Web document, however, and the resulting concept link structure is generally
    unique for each document. In this dissertation, we focus on analogies only,
    and present a general approach to representing the concepts and relations
    that compose an analogy, providing a compact markup structure for
    representing a wide range of analogies, and expressing that structure as a
    link attachment to existing Web content. We will show in Chapter 5 that this
    approach allows for representations of analogical concept relationships that
    can be transformed into several forms using XSL stylesheets, including
    tabular structures, graphical visualizations, and other knowledge
    representations including Topic Maps.

    As noted in [Dunn, 2002], the effort to provide useful and general context and
    meaning markup for Web documents is an enormous task, still too difficult
    for average users. This difficulty, along with the decentralized nature of the
    Web, encourages communities of practitioners to take a simpler approach,
    developing their own ontologies and metadata [Staab, 2002].

    2.2.5 The MARVIN System

    The MARVIN system described in this dissertation is designed to provide a
    Web-based, common-language environment for describing and sharing
    analogies already conceived by teachers, scientists, journalists, doctors, and
    other analogy practitioners. It is therefore directed at communities of
    analogy users, especially, but not exclusively, educators. And while it is not
    explicitly designed for interaction with the analogical reasoning engines used
    in cognitive science and computer science, the analogies produced by those
    systems can be represented and stored using the MARVIN system. Thus
    MARVIN may be useful to researchers in computer science, cognitive science,
    and education, by providing an environment for capturing and sharing
    interesting or unusual analogies, whether produced by humans or by
    machines.

    3. The Structure and Components of Analogies

    What is considered to be analogy or analogical reasoning varies substantially
    among practitioners and researchers in various fields, particularly education,
    cognitive science, and computer science. While most researchers agree that
    the mapping of relational structures is the defining characteristic of
    analogies, some use the term analogy more broadly to include simple
    mappings of concepts or of similar properties. We include here several
    examples from the analogy literature to illustrate and develop a terminology
    for the structure and components of analogies. We revisit these examples in
    Chapters 4, 5, and 6 to illustrate representation, visualization, and retrieval,
    respectively. Additional examples may be found on the author's website
    [Foxwell, 2002] and in the Appendix.

    The word analogy as used in both general language and in the research
    literature refers to a perceived level of similarity or sameness between the
    observed properties, concepts, and relations of two knowledge domains, one
    assumed to be known, the other partially known or unknown. A somewhat
    restrictive definition of analogy by Gentner in [Vosniadou and Ortony, 1989]
    that emphasizes the importance of relations, states:

    ...an analogy is a mapping of knowledge from one
    domain (the base) into another (the target), which
    conveys that a system of relations that holds among
    the base objects also holds among the target
    objects ...in interpreting an analogy, people seek to
    put the objects of the base in one-to-one
    correspondence with the objects in the target...

    The objects in the above definition may be words, sounds, images, processes,
    or other symbols representing perceived concepts, and the relations among
    them.

    In this dissertation, we define an analogy as

        a set of proposed similarity mappings between an
        unknown set of concepts and relations (the target
        analog) and a known set of concepts and relations
        (the source analog), used for instructional or
        explanatory purposes.

    The definition of concept depends on the context for its use. A concept may be

    the source to those of the target. Using the Rutherford analogy, for example,
    we can then form (and test) a hypothesis that electrical attraction causes the
    electron to orbit the nucleus in the same way that gravitational attraction
    causes the planet to revolve around the sun [Wilson, et al., 2001]. That is,
    the proposed analogy preserves in the target the higher order causal
    relationship perceived in the source.
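The transfer of a higher order relation from source to target can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nested-tuple encoding and the names `transfer` and `concept_map` are hypothetical and are not part of MARVIN.

```python
# Hypothetical sketch: carry a higher order relation from source to target.
# A structure is a nested tuple (subject, relation name, object).

def transfer(structure, concept_map):
    """Rewrite a source structure in target terms using a concept mapping."""
    if isinstance(structure, tuple):
        return tuple(transfer(part, concept_map) for part in structure)
    # Relation names and unmapped terms are carried over unchanged.
    return concept_map.get(structure, structure)

# Source-to-target concept mapping for the Rutherford analogy.
concept_map = {
    "planet": "electron",
    "sun": "nucleus",
    "gravitational attraction": "electrical attraction",
}

# Source: gravitational attraction causes (planet orbit sun).
source_relation = ("gravitational attraction", "causes", ("planet", "orbit", "sun"))

# The transferred structure is a hypothesis about the target, still to be tested.
hypothesis = transfer(source_relation, concept_map)
print(hypothesis)
```

The result is only a candidate statement about the target domain; as the text notes, it still has to be tested.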

    3.1 Analogy Examples

    In addition to the Rutherford analogy discussed above, we now examine
    several additional examples to illustrate the types of analogy structures that
    can occur. Note that our goal is to describe the analogy as its author presents
    it, without attempting to evaluate the correctness or completeness of the
    proposed comparisons.

    The Altoona List of Medical Analogies [Ruhl, 2002] lists a simple analogy
    comparing the eye to a camera in order to explain certain types of vision
    problems to patients. It first establishes a mapping of the parts of the eye to
    those of a camera, and then explains that a cataract in the eye is like a fogged
    lens in a camera. Continuing with the analogy, it explains that a detached
    retina is like a camera with wrinkled film. Once this analogy is established,
    the eye doctor can continue, perhaps with additional analogies, explaining
    the necessary procedures for treating the conditions. As presented, this
    analogy is primarily a mapping of known concepts (camera parts and their
    presumably understood functions) to unfamiliar concepts (eye anatomy and
    vision impairments). Figure 3.1 shows a representation of this mapping.

    Figure 3.1. The Eye/Camera Analogy

        Target             Source
        cornea             lens
        pupil              aperture
        iris               diaphragm
        retina             film
        cataract           fogged lens
        detached retina    wrinkled film

    Upon observing the moons of Jupiter, Galileo formed an analogy between the
    solar system and the Jovian system, proposing that the relationship of the
    planets to the Sun was the same as that of the moons to Jupiter [Galileo,

    associated with a source relation, and a similar relational structure is
    proposed in the target analog, as shown in Figure 3.3.

    Figure 3.3. The Rutherford Analogy

        Target: electrical attraction  --causes-->  (electron, orbit, nucleus)
        Source: gravitational attraction  --causes-->  (planet, orbit, sun)

    We see a similar type of structure in another example, the Bohr Liquid Drop
    model of Nuclear Fission [Koushiappas and Cohen, 1999]. In this analogy,
    we also see the mapping of a causal relationship: when the Coulomb
    repulsion between the two halves of a deformed liquid drop is greater than
    the surface tension between the two halves, that relationship causes the drop
    to split. The analogy proposes that the same type of causal relationship holds
    for the nucleus: when the Coulomb repulsion between two halves of a
    deformed nucleus exceeds the binding energy between the halves, this causes
    the nucleus to split. Moreover, the formulae for the calculation of the forces
    are also proposed to be analogous. In this example, shown in Figure 3.4, we
    see the structure of a relation (Coulomb repulsion greater than surface
    tension) mapped to a concept (fission). This is similar to the structure we saw
    in the Rutherford analogy, but with the directionality of the higher order
    relation reversed.

    Figure 3.4. The Bohr Liquid Drop Model of Nuclear Fission

        Target: (Coulomb repulsion, greater than, binding energy)  --causes-->  nuclear fission
        Source: (Coulomb repulsion, greater than, surface tension)  --causes-->  droplet fission

    Historical analogies can be created and used by politicians and journalists to
    sway public opinion or to suggest the inevitability of a decision or course of
    action. The September 11, 2001 attack on the World Trade Center and
    Pentagon has been compared to the Japanese Navy's attack on Pearl Harbor
    in 1941 [Cox, 2002]. Without evaluating the merits of the analogy, we see
    that the analogy can be partially expressed as a higher order relational map.
    The target consists of an action (al Qaeda attacks WTC) causing (or implying
    a desired decision) an action (US declares war on terrorists); the source is
    similarly structured, with an action (Japanese Navy attacks Pearl Harbor)
    causing (or implying a desired decision) an action (US declares war on
    Japan). Figure 3.5 illustrates this map.

    Figure 3.5. Historical analogy: September 11 = Pearl Harbor

        Target: (al Qaeda, attack, World Trade Center)  --caused/justified-->  (United States, declare war, terrorists)
        Source: (Japanese Navy, attack, Pearl Harbor)  --caused/justified-->  (United States, declare war, Japan)

    We observe that although analogies may be quite elaborate or complex
    [Arthur, 2002], [Grandin, 2000], they may be decomposed into a small
    number of structured component types, such as those illustrated above, within
    the target analog, each of which maps to an identically structured component
    within the source analog. We now provide the definitions of each of the five
    types of components that can appear in either the source analog or the
    target analog:

    Definition 1 (ConceptSet): A ConceptSet C = {c1, c2, ..., cn} is an ordered set of
    one or more concepts. Note that the order of the elements listed in a
    ConceptSet is significant since we map corresponding concepts in the target
    and source. In the Rutherford analogy, for example, a possible target
    ConceptSet is {electron, nucleus}, and a corresponding source ConceptSet is
    {planet, sun}.

    Definition 2 (PrimaryRelationStructure): A PrimaryRelationStructure P =
    (Ca, R, Cb) associates ConceptSets Ca and Cb through zero or more relations in
    the list of relations R. We say zero or more because relations can be implied
    in an analogy rather than explicitly named.

    In its most common form, a PrimaryRelationStructure consists of two
    concepts associated by a single relation, as in ({planet}, revolves, {sun}). Note
    that there may be multiple relations between the ConceptSets (e.g., sun is
    larger than a planet, sun is more massive than a planet, sun is hotter than a
    planet, etc.), and that there may be more than one element in each concept
    set (the concept set containing the single element planet could be replaced by
    the set {Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, Neptune,
    Pluto}, for example).

    Definition 3 (ConceptToRelationStructure): A ConceptToRelationStructure is
    a tuple of the form (C, R, P) where C is a ConceptSet associated by zero or
    more relations in the list R to the PrimaryRelationStructure P.

    Definition 4 (RelationToConceptStructure): A RelationToConceptStructure is
    a tuple of the form (P, R, C) where P is a PrimaryRelationStructure
    associated by zero or more relations in the list R to the ConceptSet C.

    A RelationToConceptStructure and a ConceptToRelationStructure associate
    relations with concepts or concepts with relations, respectively, depending on
    the directionality of the higher order relation being specified. Higher order
    relations can include causality, implication, and sequencing, for example. In
    the source analog of the Rutherford analogy, for example, we observe the
    ConceptToRelationStructure gravitational attraction causes the planet to
    revolve around the sun.

    Definition 5 (RelationToRelationStructure): A RelationToRelationStructure is
    a tuple of the form (P, R, P) that associates a PrimaryRelationStructure to
    another PrimaryRelationStructure through zero or more relations in list R.

    There are thus five types of maps, each of which maps a structure (e.g.,
    PrimaryRelationStructure) in the target analog to a structure of the same
    type in the source analog. Figures 3.6 (a) through 3.6 (e) illustrate the five
    types of maps discussed above.
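As a minimal sketch, the five component types and the same-type mapping rule could be modeled in Python as shown below. The class names follow Definitions 1 through 5; the field names, the `AnalogyMap` class, and the `orbit` helper are assumptions made for illustration and do not reflect the MARVIN implementation.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class ConceptSet:                       # Definition 1: ordered, non-empty
    concepts: Tuple[str, ...]

    def __post_init__(self):
        if not self.concepts:
            raise ValueError("a ConceptSet needs one or more concepts")

@dataclass(frozen=True)
class PrimaryRelationStructure:         # Definition 2: (Ca, R, Cb)
    left: ConceptSet
    relations: Tuple[str, ...]          # may be empty: relations can be implied
    right: ConceptSet

@dataclass(frozen=True)
class ConceptToRelationStructure:       # Definition 3: (C, R, P)
    concept: ConceptSet
    relations: Tuple[str, ...]
    structure: PrimaryRelationStructure

@dataclass(frozen=True)
class RelationToConceptStructure:       # Definition 4: (P, R, C)
    structure: PrimaryRelationStructure
    relations: Tuple[str, ...]
    concept: ConceptSet

@dataclass(frozen=True)
class RelationToRelationStructure:      # Definition 5: (P, R, P)
    first: PrimaryRelationStructure
    relations: Tuple[str, ...]
    second: PrimaryRelationStructure

Component = Union[ConceptSet, PrimaryRelationStructure,
                  ConceptToRelationStructure, RelationToConceptStructure,
                  RelationToRelationStructure]

@dataclass(frozen=True)
class AnalogyMap:
    target: Component
    source: Component

    def __post_init__(self):
        # A map pairs a target structure with a source structure of the same type.
        if type(self.target) is not type(self.source):
            raise TypeError("target and source must be the same component type")

def orbit(body, centre):
    """Hypothetical helper: (body) orbit (centre) as a PrimaryRelationStructure."""
    return PrimaryRelationStructure(ConceptSet((body,)), ("orbit",), ConceptSet((centre,)))

source = ConceptToRelationStructure(
    ConceptSet(("gravitational attraction",)), ("causes",), orbit("planet", "sun"))
target = ConceptToRelationStructure(
    ConceptSet(("electrical attraction",)), ("causes",), orbit("electron", "nucleus"))

rutherford = AnalogyMap(target=target, source=source)
```

Attempting `AnalogyMap(target=ConceptSet(("electron",)), source=source)` raises `TypeError` in this sketch, which mirrors the requirement that each map pair structures of the same type.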

    Consider the analogy that compares the human circulatory system to home
    plumbing, used by physicians to explain cardiovascular diseases to medical
    patients [Ruhl, 1999]. The target concepts heart, blood, and blood vessels are
    mapped to the source concepts pump, water, and pipes, respectively, forming
    a ConceptSetMap. When explaining congestive heart failure to a patient, the
    physician first describes the similarity of the concepts of the circulatory
    system to those of a home plumbing system, implicitly using the
    ConceptSetMap, and then describes how the concepts are related, creating
    and explaining relational structures and maps. For example, the physician
    explains that a clogged artery can cause the blood to back up into the lungs
    similar to the way that a clogged drain can cause the water to back up and
    overflow, thereby mapping the causal relation of the source analog to that of
    the target. Thus, we see that this analogy, as it is presented by its author,
    may be decomposed into several ConceptSetMaps and one or more
    PrimaryRelationStructureMaps. Note that we can create multiple
    ConceptSetMaps, grouped by type of concept, as shown in Figure 3.7.
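Because a ConceptSet is ordered, a ConceptSetMap can be sketched as a positional pairing of two equal-length lists. The function name `concept_set_map` and the particular grouping into two maps are illustrative assumptions; the concept names come from the circulatory system example.

```python
def concept_set_map(target, source):
    """Pair target and source concepts by position; both sets must align."""
    if len(target) != len(source):
        raise ValueError("target and source ConceptSets must have equal length")
    return list(zip(target, source))

# Two ConceptSetMaps grouped by type of concept (the grouping is illustrative).
components = concept_set_map(
    ["heart", "blood", "blood vessels"],
    ["pump", "water", "pipes"],
)
conditions = concept_set_map(
    ["blood pressure", "plaque"],
    ["water pressure", "scale/deposits"],
)

for target_concept, source_concept in components + conditions:
    print(f"{target_concept} is like {source_concept}")
```

Reordering either list would change which concepts are paired, which is why the order of elements in a ConceptSet is significant.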

    Figure 3.6 (c). RelationToConceptStructureMap

        [Diagram: matching Target and Source structures built from concepts
        c1, c2 and relations r1, r2, ...]

    Figure 3.6 (d). ConceptToRelationStructureMap

        [Diagram: matching Target and Source structures built from concepts
        c1, c2 and relations r1, r2, ...]

    Figure 3.6 (e). RelationToRelationStructureMap

        [Diagram: matching Target and Source structures built from concepts
        c1, c2 and relations r1, r2, ...]

    Figure 3.7. Circulatory System Analogy

        Target                                        Source
        heart                                         pump
        weak heart causes congestive heart failure    weak pump causes backup or overflow
        blood                                         water
        blood vessels                                 pipes
        blood pressure                                water pressure
        plaque                                        scale/deposits

    This analogy and its MARVIN visualization may be found on the author's
    Web site [Foxwell, 2002], along with additional analogy examples taken from
    a variety of knowledge domains.

    3.2 The Limitations of Analogies

    Analogies are like chainsaws: powerful and useful tools that can injure you
    if you misuse them. Like incorrect maps, bad analogies can lead you away
    from your destination or cause you to become entirely lost. The over-reliance
    on limited analogical models can impair discovery and the formation of useful
    hypotheses. The early history of science includes many instances of
    misleading and inaccurate analogies. For example, alchemists were
    constrained in their understanding of matter and chemistry by their
    adherence to analogies between elements and animal or human
    characteristics [Gentner and Jeziorsky, 1993]. Kepler's reliance on a divinely
    ordained, geometrically perfect model of the universe led him initially to an
    erroneous explanation of the orbits of the planets [Kepler, 1596]. Only when
    he reluctantly abandoned that model was he able to conceive a more accurate
    explanation of the planets' orbits.

    Practitioners of analogical explanations recognize the dangers of misleading
    analogies. A doctor who frequently uses analogies to explain medical
    concepts notes that his patients can transfer inappropriate knowledge from
    the source to the target of an analogy, and suggests that good analogies
    should be visual, should illustrate the necessary concepts, should use a
    familiar source, and should be clear and short [Ruhl, 1999]. Additionally,
    those explaining or teaching an idea should consider whether an analogy is
    even necessary. Analogies should be used primarily when the idea being
    taught is new and is hard for the learner to understand. The analogy's
    limitations should be discussed, and dependence on the analogy reduced as
    the learner progresses in understanding the target.

    The use of historical analogies to guide decisions on the use of military force
    has been widely criticized by historians [Record, 2002]. The lessons of
    history can become obsolete or irrelevant as time progresses and as political
    and social conditions differ from the source event. The so-called Munich
    analogy, that appeasement of aggression by totalitarian states leads to
    more aggression, has been invoked as an instructive model for military and
    foreign policy decisions by several countries, although the model's success
    and applicability have been questioned [Record, 1998].

    Analogy practitioners do indeed recognize the limitations of analogies, and
    usually include a final recommended step in the analogical reasoning process
    to indicate where the analogy breaks down [Klein and Milligan, 2002],
    [Glynn, 1991], although they note that this should generally be done by the
    proposer of the analogy. This suggests that the process of evaluating the
    analogy is separate from the depiction of the analogy itself, as we discuss in
    Chapter 4.

    Even those who suggest avoiding analogies can have difficulty explaining
    complex ideas without using analogy. For example, in criticizing the use of
    analogies in teaching computer concepts, [Halasz and Moran, 1982] end up
    simply using different analogies, replacing a filing cabinet model of file
    systems with a supposedly more general and less analogous tree model.
    Ironically, even the title of their paper, "Analogies Considered Harmful", is
    purposely analogous to the title of a more famous paper, "GOTO Statement
    Considered Harmful" [Dijkstra, 1968].

    In spite of their dangers and misuses, analogies remain an important and
    widely used tool for communication and learning. Our research focuses on
    how to represent and visualize some of the enormous range of human-created
    analogies. The final evaluation of an analogy's usefulness, however, is the
    responsibility of its proposer and users.

    4. The Representation of Analogies

    As discussed in Chapter 3, an analogy may be described as a set of maps.
    Each map pairs the concepts and relational structures of a target analog to
    corresponding components of a source analog. The correspondence of concept
    and relation components is determined by the analogy author's perception of
    similarity of property, form, or function, or by a hypothesis of similarity.

    Our goal here is to use this characterization of analogies to create an XML
    content model [W3C, 2000] capable of describing a wide variety of analogies.
    The design of such a model for analogies must meet several general criteria
    and must also meet the needs of content authors who select or create
    analogies, content readers who learn using the analogies, and program
    developers who use the model to create new ways to use the analogies.

    A representation is a symbolic surrogate for an object that facilitates human
    expression about the object and serves as a medium for computations with
    that object [Sowa, 2000]. Note that we distinguish between an analogy (the
    human's perception of sameness) and an analogy expression (the
    representation of that perception). The representation is necessarily a limited
    and imperfect model of the complete human perception, however. Our goal is
    to define a symbolic representation of analogies that captures a wide scope
    and variety of human-conceived analogies. Because the term analogy has a
    broad range of interpretations, our representation must reflect both the
    common usage of the term and the more formal usage in research
    fields such as Cognitive Science, Artificial Intelligence, and Education.
    Specifically, it must be able to express simple similarity of properties
    and concepts, and must preserve the relational structures that are the core of
    analogy perceptions [Gentner, 1983]. Because analogies are remembered,
    reused, and extended over time to generate new inferences [Hofstadter,
    2001][Keane and Costello, 2001], our representation must also be extendable.
    That is, it must be flexible enough to permit the addition of new analogy
    components as needed by the author, user, or developer.

    Authors of instructional or explanatory content frequently use analogies to
    assist the student or reader in understanding new ideas. An analogy
    representation must therefore be compact yet expressive enough to allow the
    author to record the essential components of the analogy, and must also
    permit some indication of the validity of the proposed comparisons. In our
    analogy expression, we therefore use simple words and short phrases as
    expression primitives along with expressions that describe the structure of
    the relations among the primitives. Because we use XML to define analogy
    expressions, authors can use XML editors or other tools to assist them in
    producing valid XML files. Chapter 7 describes an analogy expression editor
    created for this purpose.

    We want to provide access to analogy expressions using familiar, Web-based
    tools and technologies such as browsers. Moreover, we need to separate the
    representation of the structure of the analogy expression from its
    visualization. XML was designed specifically for this purpose [W3C, 2002],
    allowing multiple forms of visualization based on a common representation
    model.

    Researchers need programmatic access to knowledge expressions in order to
    store, retrieve, and manipulate them. The use of XML enables developers to
    access analogy expressions using standard Web programming tools and
    methods such as XML parsers, XSLT stylesheets [W3C, 2001] and processors,
    Java programs, HTML, and SVG [W3C, 2001] to visualize these expressions
    in a variety of forms. Our representation is implemented as an XML DTD
    (Document Type Definition), permitting a compact, common format for
    describing, searching, and transforming analogy expressions.
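To illustrate the kind of programmatic access described above, the following sketch reads a small analogy expression with Python's standard `xml.etree.ElementTree` parser. The element names (`analogy`, `conceptSetMap`, `targetConceptSet`, `sourceConceptSet`, `concept`) are hypothetical stand-ins chosen for this example; they are not the element names of the MARVIN DTD.

```python
import xml.etree.ElementTree as ET

# Element and attribute names below are hypothetical, not the MARVIN DTD.
ANALOGY_XML = """
<analogy title="Eye/Camera">
  <conceptSetMap>
    <targetConceptSet>
      <concept>retina</concept>
      <concept>cataract</concept>
    </targetConceptSet>
    <sourceConceptSet>
      <concept>film</concept>
      <concept>fogged lens</concept>
    </sourceConceptSet>
  </conceptSetMap>
</analogy>
"""

def concept_pairs(xml_text):
    """Return (target, source) concept pairs from every conceptSetMap."""
    root = ET.fromstring(xml_text.strip())
    pairs = []
    for cmap in root.iter("conceptSetMap"):
        targets = [c.text for c in cmap.find("targetConceptSet").iter("concept")]
        sources = [c.text for c in cmap.find("sourceConceptSet").iter("concept")]
        # Concepts correspond by position, as in an ordered ConceptSet.
        pairs.extend(zip(targets, sources))
    return pairs

pairs = concept_pairs(ANALOGY_XML)
print(pairs)
```

The same expression could equally be handed to an XSLT processor or rendered as SVG; the parser shown here is just one of the standard tools the text mentions.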

    Terminology used to describe the components and structure of analogies
    should be compact yet descriptive. In designing the DTD we stress the
    distinction between the target analog components and the source analog
    components by giving them separate but similar names. We also observe
    that in order to map only identical analogy structures, we must enumerate
    the possible structures rather than using more compact or recursive element
    definitions that would allow mapping of unlike components.
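The effect of enumerating the possible structures can be sketched as a lookup table in which each map type admits exactly one target child and one source child. A DTD would express this declaratively; the Python check below only imitates the idea, and the tag names it derives (for example `targetConceptSet` and `sourceConceptSet`) are assumptions rather than the names used in the MARVIN DTD.

```python
import xml.etree.ElementTree as ET

# The five structure types from Chapter 3; the tag spellings are hypothetical.
STRUCTURES = [
    "ConceptSet",
    "PrimaryRelationStructure",
    "ConceptToRelationStructure",
    "RelationToConceptStructure",
    "RelationToRelationStructure",
]

# Each map type lists exactly one allowed target child and one allowed source
# child, with separate but similar names, so unlike components cannot be paired.
ALLOWED_MAPS = {
    name + "Map": ("target" + name, "source" + name) for name in STRUCTURES
}

def is_well_paired(map_element):
    """True if the map holds one target and one source of the matching type."""
    expected = ALLOWED_MAPS.get(map_element.tag)
    actual = tuple(child.tag for child in map_element)
    return expected is not None and actual == expected

good = ET.fromstring(
    "<ConceptSetMap><targetConceptSet/><sourceConceptSet/></ConceptSetMap>")
bad = ET.fromstring(
    "<ConceptSetMap><targetConceptSet/>"
    "<sourcePrimaryRelationStructure/></ConceptSetMap>")

print(is_well_paired(good), is_well_paired(bad))
```

A single generic or recursive definition would be shorter, but it would accept the second element above, which pairs a ConceptSet with a PrimaryRelationStructure.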

    Knowledge representations defined using XML may be expressed using either
    DTDs or schemas. Like XML DTDs, XML Schemas are used to define the
    structure, content, and semantics of XML documents [W3C, 2001]. Schemas
    provide for strong data typing and validation, explicit cardinality controls,
    and constraints on attribute values [W3C, 2001]. But for representations
    that do not require strong data typing or cardinality controls, DTDs are
    sufficient and simpler [Mertz, 2001]. Tools for creating, validating, and
    transforming DTD-based XML files are mature and widely available [Cover,
    2002] while the XML schema standard and tools are still evolving [Garshol,
    2002]. Because the analogy expressions discussed in this document consist
    exclusively of a relatively small number of text-based elements with any
    number of repeated components, a DTD was developed. XML analogy
    expressions based on our DTD may be transformed as needed into other
    forms using XSL stylesheets or other tools, including transforming the DTD
    into an XML schema