Post on 20-Dec-2015

DDI 3.0 Conceptual Model

Chris Nelson

Why Have a Model

• Non-syntactic representation of the business domain

• Useful for identifying common constructs
– Identification, versioning etc.
– “patterns”

• A good basis for designing syntactic representation (e.g. XML) schemas, databases, and processing systems
– Industry tools support this process (e.g. EMF)

[Figure: module diagram. Boxes: Variable Scheme, Physical Data Product, Physical Instance, Archive, Study Unit, Category Scheme, Questions]

DDI 2.0

•Driven by the need to archive data

•Developed as an XML DTD

•No formal conceptual model

•No re-use of artifacts

DDI 3.0 design goals – life-cycle model

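[Figure: life-cycle stages labelled "The statistical production process", "(Secondary) use of data" and "Archiving", with the coverage of DDI 2.0 and DDI 3.0 marked as "2.0" and "3.0"]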


[Figure: DDI 3.0 module diagram. Boxes: Identification, Item Schemes, Item Scheme Associations; Component Schemes, Organisations; Group; Variable Scheme; NCube Record Layout; Physical Data Product; Physical Instance; Archive; Study Unit; Data/Metadata Resource; Metadata Report; Category Scheme; Concept Scheme; Data & Metadata Structure; Question Bank; Instrument. Grouping labels: DDI Base, Structural Metadata, Data/Metadata Management.]


[Figure: additional layout boxes in the DDI 3.0 module diagram: In Line NCube Record Layout, NCube TableLayout, NCubeLayout]

Instrumentation module

A module in DDI 3.0 to describe survey instruments in a system-independent way.

To be used to drive data-capture systems or to pick up the output from these systems.

• Important metadata is entered at this stage and should be carried forward to the end data product.
• Information about question flow, cues presented to the respondents etc. is important for the interpretation of the data.
• There are often complex relationships between questions and variables.

[Figure: one question (Q) linked to three variables (V)]

UML Constructs as used in the DDI Conceptual Model

Classes and Associations (1)

[Figure: Variable (from VariableStandard) associated with ConceptItem (from ConceptStandard); the association has the role +conceptualSemantic and cardinality 0..* at both ends]

Cardinalities:

0..*  zero or more
0..1  zero or one
1..*  one or more
1     one
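Each cardinality above can be read as a lower and an upper bound on the number of linked objects. The following is a minimal Python sketch of that reading, not part of the DDI model itself; the function names `parse_cardinality` and `satisfies` are illustrative assumptions.

```python
from typing import Optional, Tuple


def parse_cardinality(spec: str) -> Tuple[int, Optional[int]]:
    """Split a cardinality such as "0..*" into (lower, upper); upper None means unbounded."""
    if ".." in spec:
        low, high = spec.split("..")
    else:
        low = high = spec  # a single number such as "1" means exactly that many
    return int(low), (None if high == "*" else int(high))


def satisfies(spec: str, count: int) -> bool:
    """Return True if `count` linked objects is allowed by the cardinality `spec`."""
    lower, upper = parse_cardinality(spec)
    return count >= lower and (upper is None or count <= upper)


# A Variable with no ConceptItem is valid under 0..*, but would not be under 1..*.
print(satisfies("0..*", 0), satisfies("1..*", 0))
```

Only the four notations listed on the slide are handled; other UML multiplicity forms are outside the scope of this sketch.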

Classes and Associations (2)- Aggregates

[Figure: CategoryScheme aggregates CategoryItem; cardinalities shown: 1..*, 0..1]

CategoryItem is subordinate to and “belongs to” CategoryScheme.

[Legend: aggregate by reference; aggregate by value]

In the model diagrams in this presentation no distinction is made between aggregate by reference and aggregate by value. All aggregates are shown with an open diamond.

Classes and Associations (3)- Unidirectional

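[Figure: NCubeLogicalProduct, DataAttribute and Variable; DataAttribute has a unidirectional association to Variable with the role +takesSemanticFrom (cardinalities shown: 0..*, 1)]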

Variable is navigable from DataAttribute but not vice-versa

Sub Classes - Inheritance

[Figure: DimensionVariable drawn as a subclass of Variable; Variable has an association with the role +documentation to MetadataReport (from Metadata), cardinality 0..*]

DimensionVariable inherits from Variable (i.e. it is a “specialisation” of Variable). Therefore DimensionVariable can have an association to MetadataReport. However, any associations from DimensionVariable are specific to DimensionVariable and are not applicable to Variable.
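The inheritance rule can be illustrated with a small Python sketch. The class names come from the diagram; the field `dimension_ids` is a hypothetical stand-in for an association that only the specialisation has, and the sample values are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MetadataReport:
    title: str


@dataclass
class Variable:
    name: str
    # +documentation association, cardinality 0..*
    documentation: List[MetadataReport] = field(default_factory=list)


@dataclass
class DimensionVariable(Variable):
    # Hypothetical association that exists only on the specialisation.
    dimension_ids: List[str] = field(default_factory=list)


dim_var = DimensionVariable(name="age_group")
dim_var.documentation.append(MetadataReport("Definition of age groups"))  # inherited
dim_var.dimension_ids.append("D1")  # specific to DimensionVariable

plain_var = Variable(name="income")
print(isinstance(dim_var, Variable), hasattr(plain_var, "dimension_ids"))
```

The subclass picks up `documentation` from Variable, while `dimension_ids` never becomes available on a plain Variable.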

Abstract Classes

[Figure: abstract class IdentifiableArtefact with attributes id : String, uri : String, urn : String, and optional (0..1) associations +name and +description to InternationalString (from DDI_Base); Instrument (from DataCollection) is drawn as a subclass of IdentifiableArtefact]

An abstract class is drawn where it is a useful way of grouping classes and avoids a complex diagram with lots of association lines, but where the class is not foreseen to serve any other purpose (i.e. it is always implemented as one of its subclasses).

Here Instrument inherits the attributes id, uri and urn.

Instrument can have a multilingual name and description.
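As a rough illustration of the abstract-class idea, the sketch below uses Python's `abc` module. The class and attribute names follow the diagram; the `artefact_type` method, the language-keyed dictionary inside `InternationalString`, and all identifier values are assumptions made only for this example.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class InternationalString:
    """Text held in several languages, keyed by a language code (assumed layout)."""

    def __init__(self, texts: Optional[Dict[str, str]] = None) -> None:
        self.texts: Dict[str, str] = dict(texts or {})


class IdentifiableArtefact(ABC):
    """Abstract: groups the identification attributes shared by its subclasses."""

    def __init__(self, id: str, uri: str, urn: str,
                 name: Optional[InternationalString] = None,
                 description: Optional[InternationalString] = None) -> None:
        self.id, self.uri, self.urn = id, uri, urn
        self.name = name                # +name, 0..1
        self.description = description  # +description, 0..1

    @abstractmethod
    def artefact_type(self) -> str:
        """Hypothetical hook; its only job here is to keep the class abstract."""


class Instrument(IdentifiableArtefact):
    def artefact_type(self) -> str:
        return "Instrument"


instrument = Instrument(
    id="INS1",
    uri="http://example.org/instruments/INS1",  # placeholder value
    urn="urn:example:instrument:INS1",          # placeholder value
    name=InternationalString({"en": "Household survey", "fr": "Enquête ménages"}),
)
print(instrument.id, instrument.artefact_type())
```

Trying to instantiate `IdentifiableArtefact` directly raises a `TypeError`, which mirrors the statement that the abstract class is always implemented as one of its subclasses.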

Instrument - Simplified Class Diagram

[Figure: class diagram. Classes: SoftwarePackage, DataCollection, Instrument, ControlConstruct, Sequence, IfThenElse, Loop, QuestionConstruct, ComputationItem, StatementItem, DisplayText, QuestionItem, MetadataReport (from Metadata). Attributes shown: type : String; description : StructuredString. Association roles shown: +documentation, +thenCondition (1), +elseCondition (0..1). Note on the diagram: contains metadata attribute values for Methodology, CollectionEvent, Note, Universe, OtherMaterial.]
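To make the question-flow idea concrete, here is a minimal Python sketch using three of the control constructs named in the diagram. It is an illustration under stated assumptions, not the DDI schema: representing the condition as a Python callable, the field names, and the example questions are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


class ControlConstruct:
    """Base class for the elements that make up the question flow."""


@dataclass
class QuestionConstruct(ControlConstruct):
    question_id: str  # stands in for the link to a QuestionItem


@dataclass
class Sequence(ControlConstruct):
    children: List[ControlConstruct] = field(default_factory=list)


@dataclass
class IfThenElse(ControlConstruct):
    condition: Callable[[Dict[str, str]], bool]        # hypothetical representation
    then_construct: ControlConstruct                   # +thenCondition, exactly 1
    else_construct: Optional[ControlConstruct] = None  # +elseCondition, 0..1


def questions_asked(construct: ControlConstruct, answers: Dict[str, str]) -> List[str]:
    """Walk the flow and list the questions a respondent with these answers sees."""
    if isinstance(construct, QuestionConstruct):
        return [construct.question_id]
    if isinstance(construct, Sequence):
        asked: List[str] = []
        for child in construct.children:
            asked.extend(questions_asked(child, answers))
        return asked
    if isinstance(construct, IfThenElse):
        if construct.condition(answers):
            return questions_asked(construct.then_construct, answers)
        if construct.else_construct is not None:
            return questions_asked(construct.else_construct, answers)
    return []


flow = Sequence([
    QuestionConstruct("employed"),
    IfThenElse(
        condition=lambda a: a.get("employed") == "yes",
        then_construct=QuestionConstruct("hours_worked"),
        else_construct=QuestionConstruct("seeking_work"),
    ),
])
print(questions_asked(flow, {"employed": "yes"}))
print(questions_asked(flow, {"employed": "no"}))
```

Keeping the flow as data rather than code is what would let the same description drive a data-capture system and later explain which respondents were routed to which questions.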

Question Bank - Simplified Class Diagram (not in the schemas)

[Figure: class diagram. Classes: QuestionBank, QuestionItem, MultipleQuestionItem, QuestionText, ResponseDomain, MetadataReport (from Metadata), Variable (from VariableStandard), ConceptItem (from ConceptStandard). Attribute shown: sequence : Integer. Association roles shown: +sub question (1..*), +documentation (0..*), +conceptualSemantic (0..*). Note on the diagram: metadata for Question Intent, Visual Aid, Response Unit, Analysis Unit.]

Here the multiple question item must have at least one sub-item, which can be another multiple question item or a single question. At the bottom of the tree there will be only single questions.
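The rule that a multiple question item needs at least one sub-item, and that only single questions sit at the bottom of the tree, can be sketched in Python as follows. The class names follow the diagram; the `text` and `sub_questions` fields, the `single_questions` helper and the example questions are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class QuestionItem:
    text: str


@dataclass
class MultipleQuestionItem:
    text: str
    sub_questions: List[Union["MultipleQuestionItem", QuestionItem]]

    def __post_init__(self) -> None:
        # +sub question has cardinality 1..*: at least one sub-item is required.
        if not self.sub_questions:
            raise ValueError("a multiple question item needs at least one sub-item")


def single_questions(item: Union[MultipleQuestionItem, QuestionItem]) -> List[QuestionItem]:
    """Collect the single questions at the bottom of the tree, left to right."""
    if isinstance(item, QuestionItem):
        return [item]
    leaves: List[QuestionItem] = []
    for sub in item.sub_questions:
        leaves.extend(single_questions(sub))
    return leaves


household = MultipleQuestionItem("Household composition", [
    QuestionItem("How many adults live here?"),
    MultipleQuestionItem("Children", [
        QuestionItem("How many children under 5?"),
        QuestionItem("How many children aged 5 to 17?"),
    ]),
])
print([q.text for q in single_questions(household)])
```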

Variable - Simplified Class Diagram

[Figure: class diagram. Classes: VariableScheme, Variable, DerivedVariable, VariableConcatenation, VariableRepresentation, Representation, ConceptItem, QuestionItem, MetadataReport (from Metadata). Association roles shown: +documentation, +conceptualSemantic, +usedBy, +valueStoredIn. Notes on the diagram: metadata for Definition, Universe, Embargo, ResponseUnit, AnalysisUnit; metadata for Description, (Formal) Derivation Rules.]

Identifying potentially comparative data

• The grouping mechanism can be used to mark up families of studies that have been designed from the outset to be comparable,

• ...or families of studies that have been made comparable through a harmonization process.

• However, none of these mechanisms reaches beyond the limit of the DDI 3.0 wrapper that binds the family of studies together.

• One of the biggest challenges for DDI 3.0 has been to define a way to describe relationships between variables across DDI wrappers, collections and servers.

• Use case: “Give me more variables like this”, in other words the ability to identify potentially comparative variables across studies, collections, archives and locations.

Identifying potentially comparative data

• There is a mechanism in the existing DDI that to a certain degree will allow you to do this: the ability to assign concepts from external vocabularies to variables.

[Figure: Study 1 (variables V1, V2, V3) and Study 2 (variables V4, V5, V6) linked to an external vocabulary containing concept C1 with concepts C11, C12, C13; in the model, Variable is associated with ConceptItem through the role +conceptualSemantic, cardinality 0..* at both ends]
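The “give me more variables like this” use case can be sketched with concept assignments alone. In the Python example below the variable and concept labels are taken from the figure, but which concept is assigned to which variable is invented for illustration, and `more_like` is a hypothetical helper.

```python
from typing import Dict, List, Set

# Hypothetical concept assignments (+conceptualSemantic) for the variables in the figure.
concepts_of: Dict[str, Set[str]] = {
    "V1": {"C11"}, "V2": {"C12"}, "V3": {"C13"},         # Study 1
    "V4": {"C11"}, "V5": {"C13"}, "V6": {"C12", "C13"},  # Study 2
}


def more_like(variable: str, assignments: Dict[str, Set[str]]) -> List[str]:
    """Return the other variables that share at least one concept with `variable`."""
    wanted = assignments[variable]
    return sorted(v for v, concepts in assignments.items()
                  if v != variable and concepts & wanted)


print(more_like("V3", concepts_of))
```

A shared concept only marks variables as potentially comparable; it says nothing about how they differ, which is the gap the registry proposal on the next slide addresses.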

Identifying potentially comparative data

• In DDI 3.0 there will be a more elaborate solution to the same problem: a specification of an external, registry-like question bank or classification database that will allow you to register concepts, questions and variables.

• The specification can be used to set up local question banks or question banks that are global to many organizations.

• The specification will also support statements about differences between registered variables.

• The registry can be seen as an extension to a standard DDI document,

• ...but the specification might also include the interfaces to allow this to be set up and run as a proper registry on the Web.

[Figure: Study 1 (variables V1, V2, V3) and Study 2 (variables V4, V5, V6) linked to an external registry containing items I1, I2, I3, I4 and an element labelled "Diff"]

Registries

• Contain metadata that allows users/applications to find things
• The objects themselves do not need to be in the registry
– But must be accessible over the internet (preferably accessible by standardised queries and retrievable in a standardised format)
– E.g. questions in a question bank, category schemes, variables
• Registries can have repositories to store local content
• Registry standards exist and registry products are available
– But they need to be customised to support the domain (e.g. customised software that understands the DDI model and syntax implementation)
• If objects can be identified in a globally unique way, then they can be accessed and shared
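A registry in this sense stores descriptions of objects and where to find them, not the objects themselves. The Python sketch below shows one possible shape; the `RegistryEntry` fields, the method names and the URN and URL values are assumptions for illustration and do not come from any registry standard.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class RegistryEntry:
    urn: str          # globally unique identifier of the object
    object_type: str  # e.g. "QuestionItem", "CategoryScheme", "Variable"
    location: str     # where the object itself can be retrieved


class Registry:
    """Holds metadata about objects; the objects stay wherever they are published."""

    def __init__(self) -> None:
        self._entries: Dict[str, RegistryEntry] = {}

    def register(self, entry: RegistryEntry) -> None:
        if entry.urn in self._entries:
            raise ValueError(f"{entry.urn} is already registered")
        self._entries[entry.urn] = entry

    def find(self, object_type: str) -> List[RegistryEntry]:
        return [e for e in self._entries.values() if e.object_type == object_type]

    def locate(self, urn: str) -> str:
        return self._entries[urn].location


registry = Registry()
registry.register(RegistryEntry(
    "urn:example:question:Q1", "QuestionItem", "http://example.org/bank/Q1"))
registry.register(RegistryEntry(
    "urn:example:variable:V4", "Variable", "http://example.org/study2/V4"))
print([e.urn for e in registry.find("Variable")])
print(registry.locate("urn:example:question:Q1"))
```

Rejecting a second registration of the same URN reflects the last bullet: sharing only works if each identifier refers to exactly one object.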

Data Analysis

[Figure: module boxes Data & Metadata Structure, Physical Data Product, NCube Record Layout]

Cube Structure - Simplified

[Figure: class diagram. Classes: NCubeLogicalProduct, NCubeStructure, Dimension, Measure, DataAttribute, CoordinateGroup, AttachableArtefact, Constraint, Label, DimensionVariable (from VariableStandard), Variable (from VariableStandard), MetadataReport (from Metadata). Association roles shown: +takesSemanticFrom (1), +attachesTo (1), +documentation (0..*); some associations are marked {ordered}. Notes on the diagram: metadata for Universe, Definition; metadata for Universe, Definition, Imputation, ResponseUnit, AnalysisUnit, Purpose; can also link to a Variable.]

[Figure: overview diagram. Boxes: AttachableArtefact (Measure, Dimension, CoordinateGroup); DataAttribute, Dimension, Measure; MeasureItem, AttributeItem; ItemValue; ReferencedValue; EmbeddedDataValue; CubeCoordinate; Variable; CoordinateVariable (role +cubeCoordinateVariable, 1). Module labels: Data Structure, NCube Record Layout, NCube Logical Product, Physical Instance.]

Cube Data – Contains or Points to Data

[Figure: class diagram. Classes and attributes: EmbeddedDataValue (value : String); ValueLocation (startPosition : Integer, width : Integer, decimalPosition : Integer, decimalSeparator : String, groupingSeparator : String, delimiter : String, variableNamesSpecified : Boolean, explicitDataType : String); ColumnPosition (columnPosition : Integer); CubeCoordinate (number : Integer); ReferencedValue; ItemValue; MeasureItem; AttributeItem; CoordinateVariable; Variable (from VariableStandard); Dimension, DataAttribute and Measure (from LogicalProduct). Association roles shown: +cubeCoordinateVariable (1), +valueFor (1), +attribute (0..*), itemValue. Constraint shown: {TabularNCube}. Note on the diagram: link to the Cube Structure Definition.]

DDI 3.0 Metadata

• Metadata constructs that are fairly generic and can be attached at various places in the hierarchy.
• Examples:
– Coding instructions
– Description of time and geography
– Citation/Abstract
– Methodology etc.
• The DDI model contains a metamodel for metadata structures:
– Identifies the object types to which metadata can be attached
– Specifies the category/concept schemes that contain the list of valid identifiers for the object types
– Specifies the metadata reports that can be made (e.g. coding instructions, citation) in terms of
  • Attributes
  • Value domain (e.g. format) of the attributes
  • Reporting hierarchy of the attribute
– Identifies to which object types the metadata report can be associated
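One way to picture the metamodel is as a structure definition against which each metadata report is checked. The Python sketch below assumes a much-reduced structure (attribute names, a mandatory flag, and a set of allowed target object types); the field names, the `validate_report` helper and the sample Citation structure are hypothetical, and value domains and attribute hierarchy are left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class MetadataAttribute:
    name: str
    is_mandatory: bool = False


@dataclass
class ReportStructure:
    name: str                            # e.g. "Citation"
    attributes: List[MetadataAttribute]
    attaches_to: Set[str]                # object types the report may be attached to


def validate_report(structure: ReportStructure, target_type: str,
                    values: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the report fits the structure."""
    problems: List[str] = []
    if target_type not in structure.attaches_to:
        problems.append(f"{structure.name} cannot be attached to {target_type}")
    known = {a.name for a in structure.attributes}
    for attribute in structure.attributes:
        if attribute.is_mandatory and attribute.name not in values:
            problems.append(f"missing mandatory attribute {attribute.name}")
    for name in values:
        if name not in known:
            problems.append(f"unknown attribute {name}")
    return problems


# Hypothetical report structure: the attribute names and target types are invented.
citation = ReportStructure(
    name="Citation",
    attributes=[MetadataAttribute("title", is_mandatory=True), MetadataAttribute("creator")],
    attaches_to={"StudyUnit", "Group"},
)
print(validate_report(citation, "StudyUnit", {"title": "Labour Force Survey"}))
print(validate_report(citation, "Variable", {"creator": "NSI"}))
```

Because the structure is itself data, new report types such as coding instructions could be added without changing the validation code.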

[Figure: Metadata Structure Definition concept diagram. Boxes: Metadata Structure Definition, Object Identifier, Identifier Components, Metadata Report, Metadata Attributes, Target Object Type, Item Scheme, Concept Scheme, Concept, Value domain (Format and Permitted Value List). Annotations:
– uses defined concepts
– defines the object types to which metadata can be “attached”
– specifies to which object types the report can be “attached”
– identifies the value domain of the component
– concept defined in
– takes semantic and context from
– identifies target object type of the component
– can have hierarchy
– identifies target object type of the identifier
– specifies components for each Object (“key”)]

Metadata Structure

[Figure: class diagram. Classes: MetadataStructureDefinition, IdentifierGroup, IdentifierComponent, IdentifiableObject, AttachmentStatus (isMandatory : Boolean), DocumentableArtefact, ReportStructure, MetadataAttribute (isMandatory : Boolean), ConceptItem (from ConceptStandard), and, from DDI_Base, ObjectTypeScheme, ItemScheme, IdentifiableObjectType, Type, Representation. Association roles shown: +takesSemanticFrom, +targetClass (1), +valueDomain (0..1), +attachesTo (1..*), +structureFor, +localType (0..1), +localRepresentation (0..1), +parent (1) and +child (0..*) for the sub-structure, +specifies, +uses. Notes on the diagram: "this can be any object in the DDI model, including a specific DocumentableArtefact, thus allowing a report to be attached to, or referenced from, another report"; e.g. Citation, Coding Instructions, Universe, Abstract.]

Metadata Set – Contains Metadata Reports

[Figure: class diagram. Classes: MetadataSet, MetadataReport, MetadataAttributeValue, AttachmentKey, IdentifierComponentValue, MetadataAttribute, ReportStructure, IdentifierComponent, IdentifiableObject, and, from DDI_Base, MaintainableArtefact, IdentifiableArtefact. Association role shown: +objectIdentifier (1..*). Note on the diagram: shows the link to the metadata structure definition.]

Modularity and grouping as a way to handle comparative data

[Figure: a Group with shared modules (Questions, Study design, Variables) and one box per study: French study, German study, UK study, Spanish study, Italian study. Each study box is labelled Extensions and Local overrides; four of the five are also labelled Translation.]

Modularity and grouping as a way to handle multiple tables/cubes

[Figure: a Group with shared modules (Variables, Study description, Category Schemes) and five cubes, nCube1 to nCube5, each with its own Table description]

Group: Logical Combination of Artifacts

[Figure: class diagram. Classes: Group, GroupType (time : TimeType, instrument : InstrumentType, panel : PanelType, geography : GeographyType, dataset : DataSetType), Archive, ComparisonStandards, LogicalProduct, StudyUnit (from StudyUnit), Study (from StudyUnit), DataCollection (from DataCollection), QuestionItem (from DataCollection), ConceptualComponentSet (from ConceptStandard), ConceptScheme (from ConceptStandard), CategoryScheme (from CategoryScheme), Variable (from VariableStanda...), MetadataReport (from Metadata). Association roles shown: +parent (1) and +child (0..*) for subGroup. Notes on the diagram: metadata for Citation, Purpose, Abstract, Universe, OtherMaterial, Note; metadata for ItemDetails, Access, OtherMaterial, Note.]

Thank You
