toby burrows: vernacular classification: knowledge organization in the humanities networked...

Post on 29-Jul-2015


VERNACULAR CLASSIFICATION: HUMANITIES NETWORKED INFRASTRUCTURE (HUNI)

Toby Burrows, Department of Digital Humanities

HuNI (Humanities Networked Infrastructure)

•  Aggregates data from 30 different Australian humanities datasets

•  Data are defined as entities occurring in the source datasets: 740,000 entities in all

•  Harvested records are mapped to one of six basic categories

•  No imported relationships between entities

•  No de-duplication of entities

Challenges for HuNI

•  How to organize and link heterogeneous data for browsing – without entirely pre-determining the structure and relationships

•  How to make the aggregated data useful – without imposing too much of a conceptual framework

•  How to respect the different disciplinary perspectives reflected in the source datasets

•  Researchers need to be able to record and share their views about the data

Data sources by discipline and HuNI record category (diagram; more icons = more records)

•  Data sources: ADB, AFIRC, AMHD, APFA, AUSTLANG, AusStage, AustLit, AWAP, Bonza, CAARP, CircusOz, DAAO, EMEL, EOAS, F&C (x9), GO, LD, MAP, Mura, OA, PARADISEC, SAUL, WALL

•  Disciplines: Biography, Linguistics, Literature, Media, Performing arts, Social history, Visual arts

•  HuNI record categories: Concept, Event, Organisation, Person, Place, Work

PERSON A natural person

ORGANISATION A company, club, trust, gallery, political party, etc.

WORK A cultural artefact or “man-made” thing created by someone, that has some existence in its own right, either physical or digital

PLACE A real, spatial location

EVENT An activity that occurs in space and time and may involve people, organisations, places, works, etc.

CONCEPT Something whose existence is primarily mental

http://wiki.huni.net.au/display/DS/Data+Model
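
The six categories above can be sketched as a small data structure. This is only an illustration, not HuNI's actual schema: the `Category` values come from the slide, while the `Entity` field names, the `SOURCE_TYPE_TO_CATEGORY` lookup table, and the `map_record` helper are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """The six HuNI record categories named on the slide."""
    CONCEPT = "Concept"
    EVENT = "Event"
    ORGANISATION = "Organisation"
    PERSON = "Person"
    PLACE = "Place"
    WORK = "Work"


@dataclass(frozen=True)
class Entity:
    """A harvested record. Field names are hypothetical."""
    source: str       # source dataset, e.g. "AusStage"
    source_id: str    # identifier within that dataset
    label: str
    category: Category


# Hypothetical mapping from a source dataset's own record types to a category.
SOURCE_TYPE_TO_CATEGORY = {
    "contributor": Category.PERSON,
    "venue": Category.PLACE,
    "production": Category.EVENT,
}


def map_record(source: str, source_id: str, label: str, source_type: str) -> Entity:
    """Map a harvested record to exactly one of the six categories."""
    return Entity(source, source_id, label, SOURCE_TYPE_TO_CATEGORY[source_type])


# No de-duplication: the same name harvested from two datasets stays two entities.
a = map_record("AusStage", "123", "Example Person", "contributor")
b = map_record("DAAO", "abc", "Example Person", "contributor")
```

Because the aggregate does no de-duplication, `a` and `b` remain distinct records even though they share a label and a category.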

HuNI: creating collections

•  Users are able to create their own collections of data

•  They can create categories and classifications, and assign individual entities to them

•  Users can choose whether to make these collections public

•  The list of public collections can be seen and browsed

•  Individual entities show which public collections they belong to

•  The graph for each entity also shows its membership of a public collection
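
The collection behaviour listed above can be sketched roughly as follows. The `Collection` class, its field names, and the `public_collections_for` helper are hypothetical and only illustrate the private-by-choice and membership-lookup ideas. They are not HuNI's implementation.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class Collection:
    """A user-created grouping of entities. Field names are hypothetical."""
    name: str
    owner: str
    public: bool = False                            # users choose whether to publish
    members: Set[str] = field(default_factory=set)  # entity identifiers


def public_collections_for(entity_id: str, collections: Iterable[Collection]) -> List[str]:
    """Names of the public collections that an entity belongs to."""
    return sorted(c.name for c in collections if c.public and entity_id in c.members)


collections = [
    Collection("Circus performers", "alice", public=True, members={"e1", "e2"}),
    Collection("Private notes", "bob", public=False, members={"e1"}),
    Collection("Sydney theatre", "bob", public=True, members={"e1"}),
]
print(public_collections_for("e1", collections))
# prints ['Circus performers', 'Sydney theatre']
```

The private collection is left out of the result, which matches the idea that an entity shows only its public memberships.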

HuNI: socially-linked data

•  Users are also able to create links between entities

•  These links are public, by default

•  There are no pre-determined links between entities

•  Users can add to each other’s links, including disagreeing with them or contradicting them

•  Links can describe any kind of reciprocal relationship

•  There is no pre-determined ontology or vocabulary of relationships
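
One way to picture these socially-linked data is a store of user assertions with free-text labels. The sketch below is an assumption-laden illustration: the `Link` and `LinkStore` names, the `inverse_label` field used to model reciprocity, and the entity identifiers are all hypothetical, not HuNI's API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """A user-asserted relationship between two entities (hypothetical fields)."""
    source: str         # entity identifier
    label: str          # free text, no controlled vocabulary
    target: str         # entity identifier
    inverse_label: str  # how the relationship reads from the target's side
    author: str         # the user who asserted the link


class LinkStore:
    def __init__(self):
        self._by_entity = defaultdict(list)

    def add(self, link):
        # Every assertion is kept; conflicting links are never merged or overwritten.
        self._by_entity[link.source].append(link)
        if link.target != link.source:
            self._by_entity[link.target].append(link)

    def statements_about(self, entity_id):
        """Read each link from the point of view of entity_id."""
        out = []
        for link in self._by_entity.get(entity_id, []):
            if link.source == entity_id:
                out.append((link.label, link.target, link.author))
            else:
                out.append((link.inverse_label, link.source, link.author))
        return out


store = LinkStore()
store.add(Link("person:1", "taught", "person:2", "was taught by", "alice"))
store.add(Link("person:1", "never met", "person:2", "never met", "bob"))
print(store.statements_about("person:2"))
```

Both assertions survive side by side, so a reader of `person:2` sees that one user says they were taught by `person:1` while another says the two never met.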

HuNI: classification and categorization 1

•  Specific individual entities and phenomena are the focus of the HuNI data aggregate

•  There is as little pre-defined classification and categorization as possible

•  HuNI avoids hierarchical ontological structures (= “flat ontologies”?)

•  Entities are organized and presented primarily so that researchers can work with them and manipulate them – classifying entities into collections and creating links between individual entities

•  HuNI is not organizing and presenting the entities so as to reflect an authoritative classification or organization of knowledge

HuNI: classification and categorization 2

•  Not organizing the entities for structured or faceted search and retrieval

–  Only indexing them for a basic keyword search

•  Not organizing them into browsable semantic hierarchies

–  Providing only basic browsing via the six categories (and the list of source datasets)

•  HuNI is trying to find a middle ground between:

–  The linguistic and conceptual limitations of “search”

–  The imposition of a single “normative” ontology or classificatory semantic structure

HuNI: vernacular classification

•  The user-contributed collections and links give meaning to the data

•  Multiple interpretations and perceptions of relationships between entities are encouraged – even if these are contradictory

•  Users can express the relationships they see in the data – including classifications and categorizations

•  HuNI resists a single normative or expert interpretation or classification of the data

•  HuNI encourages the sharing of different perspectives by researchers and other users

Dr Toby Burrows, Marie Curie Fellow, Department of Digital Humanities, King’s College London, 26-29 Drury Lane, London WC2B 5RL, toby.burrows@kcl.ac.uk, @tobyburrows, tobyburrows.wordpress.com

Alternative approaches

•  Search – use ontologies to classify search results (facets)

•  Topic modeling – automatic generation of semantic categories and relations from text-based Natural Language Processing

•  Linked Data with light categorization for reasoning

–  Vocabularies & thesauri encoded for the Semantic Web (SKOS)

•  Social tagging or “folksonomies”

–  Tags are applied to entities

–  There is no formal classification or categorization of concepts

–  There are no relationships between tags (other than being used to tag the same entity)

–  Research into deriving ontologies from social tagging
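
In a folksonomy the only relationship between two tags is that they were applied to the same entity. A minimal sketch of that co-occurrence signal, which is one possible starting point for deriving structure from tags, might look like this. The `tag_cooccurrence` function and the sample entity identifiers are hypothetical; the tags are taken from the last.fm example.

```python
from collections import Counter
from itertools import combinations


def tag_cooccurrence(tags_by_entity):
    """Count how often each unordered pair of tags is applied to the same entity."""
    counts = Counter()
    for tags in tags_by_entity.values():
        # Sort and de-duplicate so each pair is counted once per entity.
        for pair in combinations(sorted(set(tags)), 2):
            counts[pair] += 1
    return counts


# Hypothetical entities tagged with labels from the last.fm example.
tags_by_entity = {
    "artist:1": ["trip hop", "trip-hop", "bristol"],
    "artist:2": ["trip hop", "bristol", "electronic"],
    "artist:3": ["trip-hop", "electronic"],
}
counts = tag_cooccurrence(tags_by_entity)
print(counts[("bristol", "trip hop")])
# prints 2
```

Note that "trip hop" and "trip-hop" are counted as different tags: nothing in the tagging itself says they mean the same thing.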

Massive Attack tags (last.fm)

00s, 80s, 90s, acid jazz, alternative, alternative dance, alternative rock, ambient, atmospheric, beautiful, bristol, bristol sound, british, chill, chill out, chillout, dance, dark, downbeat, downtempo, dub, easy listening, electro, electronic, electronica, england, english, experimental, favorite, favorites, favourite, female vocalists, hip hop, hip-hop, house, hypnotic, idm, indie, indie rock, industrial, instrumental, jazz, lounge, male vocalists, massive attack, mellow, pop, psychedelic, rap, relax, rock, sexy, soul, soundtrack, technotrance, trip hop, trip-hop, triphopuk
