
Mapping a Data Structure to the CIDOC Conceptual Reference Model

Martin Doerr (ICS-FORTH, Crete, Greece)

Heraklion, Crete, April 2, 2002


What Mapping One Schema to Another Means

Defining an (automated) transformation of each instance of schema 1 into an instance of schema 2 with the same meaning.

CRM approach: interpret schema 1 as a semantic model (nodes and links) and map each element of it to an equivalent CIDOC CRM path, such that each instance of an element of semantic model 1 can be converted into a valid instance of the CIDOC CRM with the same meaning.

This is the simplest theory. It works for well-designed structures.


Interpreting a Schema as Semantic Model

1. Interpreting tables, columns as entities

2. Interpreting records as entity instances

3. Interpreting fieldnames as relationships and entities

4. Interpreting field contents as entity instances

Each field is interpreted as entity-relationship-entity (e-r-e)

The whole schema is decomposed into e-r-e’s

Each e-r-e is mapped individually to the CRM.
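
The decomposition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `decompose`, the "has ..." naming of relationships, and the dictionary representation of a record are not part of the original material.

```python
# Hypothetical flat record with two fields (values from the example data).
record = {
    "ID": "1975-7309",
    "Category": "NRM - Railway furniture",
}

def decompose(record, subject="whole record"):
    """Turn each field into an entity-relationship-entity (e-r-e) triple.

    The record is the domain entity, the field name supplies the
    relationship, and the field contents are the range entity instance.
    """
    return [(subject, f"has {name}", value) for name, value in record.items()]

for triple in decompose(record):
    print(triple)
```

Each resulting triple can then be mapped to the CRM on its own, independently of the others.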


Interpreting a Schema as Semantic Model, Example

ID:            1975-7309
Category:      NRM - Railway furniture
Description:   Armchair, Upholstered in blue moquette with curved, buttoned back & scroll arms. Wooden legs
Item name(s):  armchairs (AAT Hierarchy: Furnishings)

Part     Aspect               Term           (AAT Hierarchy)
overall  physical descriptor  upholstering   Processes & techniques
overall  material             moquette       Materials
overall  colour               blue           Color
legs     material             wood           Materials
back     physical descriptor  buttoning      Processes & techniques
back     shape                curved         Physical attributes
arms     shape                scrolled arms  Components

The whole record corresponds to one entity: it stands for one object, which is not itself referred to in the record: Object 1975-7309

The field name stands for a relationship and the kind of contents: "has ID"

The field contents stand for an entity instance: 1975-7309

(data example from the Science Museum of London)


Mapping the First Element: Creating an Equivalent Proposition

Source schema interpretation:       Whole Record -- "has ID" --> ID
Instance, valid for both schemata:  Object 1975-7309 --> 1975-7309
CRM schema (maps to):               Man-Made Object -- is identified by --> Object Identifier

Possible mapping annotation:
  Whole Record = E22 Man-Made Object
  ID = E42 Object Identifier
  Whole Record -> ID = P47 is identified by

Possible CRM instance annotation:
  Object 1975-7309 (E22: Man-Made_Object) is_identified_by 1975-7309 (E42: Object_Identifier)
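
A mapping annotation of this kind can be applied mechanically. The sketch below stores the annotation as plain data and produces the CRM proposition for one record; the names `MAPPING` and `map_id` and the tuple layout are assumptions made for illustration, not a prescribed format.

```python
# Mapping annotation from the slide, written as plain data.
MAPPING = {
    "domain": "E22 Man-Made Object",      # Whole Record
    "property": "P47 is identified by",   # Whole Record -> ID
    "range": "E42 Object Identifier",     # ID
}

def map_id(record):
    """Convert the ID field of a source record into one CRM proposition."""
    subject = f"Object {record['ID']}"
    return (
        (subject, MAPPING["domain"]),
        MAPPING["property"],
        (record["ID"], MAPPING["range"]),
    )

proposition = map_id({"ID": "1975-7309"})
print(proposition)
```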


Mapping the Interpreted Schema to the CRM

Each entity-link-entity can be instantiated as a self-explanatory, context-independent proposition.

The mapping makes it possible to create sets of propositions equivalent to the meaning of each source document, but in terms of the CIDOC CRM.

As the CRM-compatible propositions are self-explanatory, they can be merged into huge knowledge pools and the document boundaries can be ignored.

Buzzwords: data warehouses, Semantic Web
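
Because each proposition carries its full meaning, merging documents reduces to a set union. The following is a minimal sketch, assuming propositions are represented as (subject, property, object) tuples and that the two source documents `doc_a` and `doc_b` are hypothetical.

```python
# Propositions derived from two hypothetical source documents.
# Each proposition is a self-contained (subject, property, object) triple.
doc_a = {
    ("Object 1975-7309", "P47 is identified by", "1975-7309"),
    ("Object 1975-7309", "P46 is composed of", "legs of 1975-7309"),
}
doc_b = {
    ("legs of 1975-7309", "P45 consists of", "wood"),
    ("Object 1975-7309", "P47 is identified by", "1975-7309"),  # repeated fact
}

# Set union merges the documents and drops the repeated proposition.
pool = doc_a | doc_b

# The pool can be queried without knowing which document a fact came from.
about_object = sorted(p for p in pool if p[0] == "Object 1975-7309")
print(len(pool), about_object)
```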


Interpreting a Schema: Advanced Stuff: Value Dependency


The whole row corresponds to one entity: it stands for one part.

The first field name ("Part") stands for a relationship and the kind of contents: Object 1975-7309 has part: ...

The field contents stand for an entity instance: legs of obj. 1975-7309

Mapping condition: if Part = "overall", the row stands for the whole object.


Mapping under Condition: Creating an Equivalent Statement

Source schema interpretation:       Whole Record -- "has Part" --> Row "Part"
Instance, valid for both schemata:  Object 1975-7309 --> legs of obj. 1975-7309
CRM schema (maps to):               Man-Made Object -- is composed of --> Man-Made Object
Condition:                          if Part /= "overall"

Possible mapping annotation:
  Whole Record = E22 Man-Made Object
  Row "Part" = E22 Man-Made Object
  If (in Row "Part", Part /= "overall") then
    Whole Record -> Row "Part" = P46 is composed of

Possible CRM instance annotation:
  Object 1975-7309 (E22: Man-Made_Object) is_composed_of legs of 1975-7309 (E22: Man-Made_Object)
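
The condition can be expressed as a guard in the mapping function. This is a sketch only: the function name `map_part` and the dictionary layout of a row are assumptions, while the class and property labels are those of the annotation above.

```python
def map_part(object_id, row):
    """Map one Part/Aspect/Term row to a 'P46 is composed of' proposition.

    Returns None when Part is "overall", because the row then describes
    the whole object and no part entity is created.
    """
    if row["Part"] == "overall":
        return None
    whole = (f"Object {object_id}", "E22 Man-Made Object")
    part = (f"{row['Part']} of {object_id}", "E22 Man-Made Object")
    return (whole, "P46 is composed of", part)

rows = [
    {"Part": "overall", "Aspect": "material", "Term": "moquette"},
    {"Part": "legs", "Aspect": "material", "Term": "wood"},
]
propositions = [p for p in (map_part("1975-7309", r) for r in rows) if p]
print(propositions)
```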


Interpreting a Schema: Advanced Stuff: Values as Properties


The contents of the field "Aspect" state a relationship: Object 1975-7309 has material: ...

The field contents stand for an entity instance: moquette

Value-based mapping, applied if Part = "overall" AND Aspect = "material".


Interpreting a Schema: Advanced Stuff: Mapping to Paths


The contents of the field "Aspect" state a relationship: Object 1975-7309 has physical descriptor: ...

The field contents stand for an entity instance: upholstering

Value-based mapping, applied if Part = "overall" AND Aspect = "physical descriptor".


Mapping to Paths: Introducing an Intermediate Node

Source schema interpretation:  Whole Record -- "has physical descriptor" --> Term
Instance of source:            Object 1975-7309 --> upholstering
CRM schema (maps to):          Man-Made Object -- was produced by --> Production -- used general technique --> Type
Instance of target:            Object 1975-7309 --> Obj. 1975-7309 Production --> upholstering
Condition:                     if Part = "overall" & Aspect = "physical descriptor"

Possible mapping annotation:
  Whole Record = E22 Man-Made Object
  Term = E55 Type
  If Part = "overall" & Aspect = "physical descriptor":
    Whole Record -> Term = P108 was produced by - E12 Production - P32 used general technique

Possible CRM instance annotation:
  Object 1975-7309 (E22: Man-Made_Object) was_produced_by Obj. 1975-7309 Production (E12: Production) used general technique upholstering (E55 Type)
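
A single source link that maps to a CRM path produces two propositions joined by a newly created node. The sketch below illustrates this under stated assumptions: the function name `map_physical_descriptor` and the row layout are hypothetical, and the naming of the intermediate Production instance follows the instance shown above.

```python
def map_physical_descriptor(object_id, row):
    """Expand one source field into a two-step CRM path.

    The source link "has physical descriptor" has no single CRM
    counterpart, so a Production event is introduced as an
    intermediate node between the object and the technique.
    """
    if row["Part"] != "overall" or row["Aspect"] != "physical descriptor":
        return []
    obj = (f"Object {object_id}", "E22 Man-Made Object")
    # Intermediate node: it has no counterpart in the source record.
    production = (f"Obj. {object_id} Production", "E12 Production")
    technique = (row["Term"], "E55 Type")
    return [
        (obj, "P108 was produced by", production),
        (production, "P32 used general technique", technique),
    ]

row = {"Part": "overall", "Aspect": "physical descriptor", "Term": "upholstering"}
path = map_physical_descriptor("1975-7309", row)
for step in path:
    print(step)
```

The range of the first proposition is the domain of the second, which is what makes the two propositions a path.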


Interpreting a Schema: Advanced Stuff: Nested Structures


The whole row corresponds to one entity: if Part /= "overall", it stands for one part.

The field contents stand for an entity instance: legs of obj. 1975-7309

The contents of the field "Aspect" state a relationship: if Aspect = "material", legs of obj. 1975-7309 has material: wood

Value-based mapping


Mapping Nested Structures: Continuing on a Range Entity

Source schema interpretation:       Row "Part" -- "has material" --> Term
Instance, valid for both schemata:  legs of obj. 1975-7309 --> wood
CRM schema (maps to):               Man-Made Object -- consists of --> Material
Condition:                          if Part /= "overall" & Aspect = "material"

Possible mapping annotation:
  Row "Part" = E22 Man-Made Object
  If Aspect = "material":
    Term = E57 Material
    Row "Part" -> Term = P45 consists of

Possible CRM instance annotation:
  Object 1975-7309 (E22: Man-Made_Object) is_composed_of legs of 1975-7309 (E22: Man-Made_Object) consists_of wood (E57 Material)
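
Continuing on a range entity means that the part created by "is composed of" becomes the domain of the next proposition. A minimal sketch follows; the function name `map_row` is hypothetical, and the handling of Part = "overall" (attaching the material directly to the whole object with P45) is an assumption that the slides do not spell out.

```python
def map_row(object_id, row):
    """Map a material row, continuing on the part entity when there is one."""
    if row["Aspect"] != "material":
        return []  # other aspects need their own mapping rules
    whole = (f"Object {object_id}", "E22 Man-Made Object")
    material = (row["Term"], "E57 Material")
    if row["Part"] == "overall":
        # Assumption: the material is attached to the whole object.
        return [(whole, "P45 consists of", material)]
    # The part is the range of "is composed of" and the domain of "consists of".
    part = (f"{row['Part']} of {object_id}", "E22 Man-Made Object")
    return [
        (whole, "P46 is composed of", part),
        (part, "P45 consists of", material),
    ]

legs = map_row("1975-7309", {"Part": "legs", "Aspect": "material", "Term": "wood"})
overall = map_row("1975-7309", {"Part": "overall", "Aspect": "material", "Term": "moquette"})
for proposition in legs + overall:
    print(proposition)
```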


Other Forms of Maps: Cases of Heterogeneity

Parallel to nested:
  Source schema interpretation:  A -- "a" --> B,  A -- "b" --> C
  CRM schema:                    D -- c --> E -- d --> F

Parallel to intermediate-parallel (frequent with events!):
  Source schema interpretation:  A -- "a" --> B,  A -- "b" --> C
  CRM schema:                    D -- c --> E,  E -- d --> F,  E -- e --> G


Other Mapping Forms: Cases of Heterogeneity

Compound contraction (frequent with addresses, species names etc.!):
  Source schema interpretation:  A -- "a" --> B,  A -- "b" --> C,  A -- "c" --> D
  CRM schema:                    D -- d --> E

B, C, D are parts of an identifier for one real-life thing.
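
Compound contraction can be sketched as joining several source fields into one value before mapping. Everything in this example is hypothetical: the function name `contract`, the field names, the separator, and the address itself are chosen only for illustration.

```python
def contract(record, fields, separator=", "):
    """Join several source fields into one identifier string.

    The fields are parts of a single identifier for one real-life
    thing, so they map to one target node rather than one node each.
    """
    parts = [record[name] for name in fields if record.get(name)]
    return separator.join(parts)

# Hypothetical address record; the field names are illustrative only.
address = {"street": "1 Example Street", "city": "Sampletown", "country": "Nowhere"}
identifier = contract(address, ["street", "city", "country"])
print(identifier)
```

Empty or missing fields are skipped so that an incomplete record still yields a usable identifier.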


Mapping to the CRM: Conclusions

Mapping to the CRM can serve simply as a guide to good-practice data structures.

It can be used to create a Semantic Web of cultural knowledge.

It can be used to preserve data in a neutral form.

Even though mapping can become weird, good data structures transform easily, and there are commercial tools.

No tool can guess all of the experts' intentions in a data structure: domain experts must assist the mapping.