ForgetIT Project, GA 600826

Towards Concise Preservation by Managed Forgetting

iPRES-2013 Conference, Lisbon, Portugal, 5 September 2013

Nattiya Kanhabua, Claudia Niederée, and Wolf Siberski

L3S Research Center / Leibniz Universität Hannover

Hannover, Germany

ForgetIT Project Consortium

An interdisciplinary team of experts in:

– Preservation, information management, information extraction

– Multimedia analysis, storage computing, cognitive psychology

Outline

Motivation & Vision

Pilot Applications

Approaches: Overview

Integration Framework

Inspiration

A computer that forgets? Intentionally? And in the context of preservation?

However, we are facing

– a dramatic increase in content creation (e.g. digital photography)

– information overload and changing professional + private lives

– increasing storage costs for long-term storage (>10 years)

– increasing use of mobile devices with restricted capacity

– inadvertent forgetting for lack of systematic preservation

And: Forgetting plays a crucial role for human remembering and life in general (focus, stress on important information, forgetting of details)

So: “Shouldn’t there be something like forgetting in digital memories as well?”

ForgetIT

Complementing Human Memory

V. Mayer-Schönberger. Delete: The Virtue of Forgetting in the Digital Age. Princeton University Press, 2009.

Motivation

Opportunities

– major progress in preservation technology

– maturing information extraction technology

– storage as a service (e.g. clouds)

Needs

– increasing amount of digital content handled over decades

– more or less systematic backup strategies used

– non-paper practices for long-term perspective required

Major Obstacles

– large gap for adoption

– high up-front cost

– no established practices

– lack of understanding of benefit

– reluctance to invest

Vision: Building a Bridge

ForgetIT as a bridge from opportunities and needs across the major obstacles (large gap for adoption, high up-front cost, no established practices, lack of understanding of benefit, reluctance to invest):

– Enabling smooth transition to preservation

– Creating immediate benefit + reducing effort

– Opening alternatives to “keep it all” and “forgetting by accident”

– Easing interpretation in the long run

– Taking inspiration from and complementing human memory

Building the Bridge

Managed Forgetting

• as opposed to the current “forgetting by accident”

• inspired by human forgetting

Contextualized Remembering

• bringing back information into active use in a meaningful way

Synergetic Preservation

• couples information management and preservation management

Simple Example: Holidays

Timeline: trip, +1 month, +1 year after trip, +5-10 years, +20 years

• Trip to Paris with friends

• Thousands of pictures

• High awareness of trip details

• Showing of pictures

• Sorting out redundant pictures

• Sub-grouping and sorting

• Life goes on

• Pictures go out of focus

• Creation of a small diverse subset for showing occasionally

• Creation of summary page

• Addition of context info (example annotation: February 2015, Paris; Team: Me, Mary Christine, Tom)

• Further reduction of redundancy

• Rest of pictures into archive

• Changes in life (e.g. marriage)

• Addition/update of context information (e.g. “girlfriend” becomes “wife”)

• Dealing with preservation issues

• Revisiting of trip photos

• Re-integration into overall photo collection (link into context)

Application: Personal Preservation

Starting point:

– Tremendous growth of information in the personal sphere

– Diversity and fast evolution of devices, platforms and formats

– Keeping info sustainably available: only ad hoc solutions for the mid term and the long term

ForgetIT approach:

– Preservation solution for the personal information space

– Based on the concept of the Semantic Desktop

– Consideration of social web content, multimedia content, other types of personal content, knowledge structures

– Additional short/mid-term benefit: de-cluttering the information space by managed forgetting

– Consideration of multi-level infrastructures (e.g. mobile, PC, cloud)

Dissemination/Exploitation: Personal preservation as a service (e.g. to customers of a telco company)

Application: Organizational Preservation

Starting point:

– Existing and popular CMS (TYPO3)

– Sophisticated workflows for content creation and publication

– But: separation of publication and preservation/archival

– Access to archived content is difficult and costly

– Obsolete and even outdated information stays online

ForgetIT approach:

– Preservation as integral part (binary model → gradual managed forgetting)

– Bolder attitude towards removing content possible

– Automated support of clean-up processes

– Support of many stages of archiving, e.g. offline but still in index, aggregates online/content in archive, only aggregates kept, etc.

Dissemination/Exploitation: Involvement of the TYPO3 community; TYPO3 with preservation extension as an open source project for the TYPO3 community
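The archiving stages listed above can be read as an ordered sequence that content moves through gradually, instead of a binary online/archived switch. The following is a minimal sketch under that assumption; the stage names and the `next_stage` helper are hypothetical illustrations, not part of TYPO3 or of the ForgetIT software.

```python
from enum import IntEnum


class ArchiveStage(IntEnum):
    """Hypothetical ordering of the archiving stages named on the slide."""
    ONLINE = 0
    OFFLINE_INDEXED = 1      # offline but still in index
    AGGREGATES_ONLINE = 2    # aggregates online, content in archive
    AGGREGATES_ONLY = 3      # only aggregates kept


def next_stage(stage: ArchiveStage) -> ArchiveStage:
    """Move one step further along the forgetting path, stopping at the last stage."""
    return ArchiveStage(min(stage + 1, ArchiveStage.AGGREGATES_ONLY))


def full_content_kept(stage: ArchiveStage) -> bool:
    """In this sketch, full content survives until only aggregates are kept."""
    return stage < ArchiveStage.AGGREGATES_ONLY
```

Modelling the stages as an ordered enumeration keeps each transition small and reversible in principle, which matches the idea of gradual rather than binary forgetting.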

Variables & Dimensions

Compared for two settings: Personal and Organization

Scenarios

• Personal events (years at school, holidays, social events, graduations, marriage, etc.)

• Public events

• Work-related events (project starts/closing, business trips, new products, etc.)

Data Type (Personal)

• Local: photos, contacts, SMS

• Online: user-generated content

• Features: 1. user behaviors 2. social context

Data Type (Organization)

• Local: textual documents, files

• Online: web pages

• Features: 1. user roles 2. policies

Interaction: search/retrieve, re-find, organize, explore

Action: summarization, aggregation, delete

Information Value Assessment

Memory Buoyancy: short-term relevance/interests, e.g. current meeting documents

Preservation Value: long-term interests, e.g. important life events

Subjective metrics: usage logs (views, edits, modifications), social context, influences

Objective metrics: diversity, coverage, quality
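To make the two values more concrete, here is a minimal sketch of how they might be computed. The slides do not give formulas, so everything below is an assumption for illustration: memory buoyancy is modelled as exponentially decayed usage (recent views and edits count more), and preservation value as a weighted mix of two objective metrics. The names `Resource`, `half_life_days`, `w_quality` and `w_coverage` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Resource:
    resource_id: str
    days_since_access: List[float]  # one entry per logged view/edit
    quality: float                  # objective metric in [0, 1]
    coverage: float                 # objective metric in [0, 1]


def memory_buoyancy(res: Resource, half_life_days: float = 30.0) -> float:
    """Short-term relevance: each access contributes less the older it is."""
    decay = math.log(2) / half_life_days
    score = sum(math.exp(-decay * d) for d in res.days_since_access)
    return 1.0 - math.exp(-score)  # squash the sum into [0, 1)


def preservation_value(res: Resource, w_quality: float = 0.5,
                       w_coverage: float = 0.5) -> float:
    """Long-term value: weighted mix of objective metrics (assumed weights)."""
    return w_quality * res.quality + w_coverage * res.coverage
```

A resource that was never accessed gets a buoyancy of 0, and repeated recent use pushes the value towards 1 without ever exceeding it.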

Managed Forgetting

Automatic deletion?

Managed Forgetting

Inspired by the central role of human forgetting

Aim:

– helping to identify and focus on relevant information

– supporting preservation content selection

Will replace inadvertent forgetting

Managed forgetting ≠ automatic deletion

Instead: a range of forgetting options, e.g.

– resource condensation

– change of indexing & ranking

– reduction of redundancy

Based on:

– careful information value assessment

– forgetting strategies via policies

– forgetting options to integrate final manual checking before deletion

– combination with multi-tier storage solution possible

[Figure: forgetting options applied with decreasing memory buoyancy; use of storage tiers]
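A forgetting policy of this kind could be sketched as a function that maps the two information values to one of several forgetting options. The thresholds, the action names and the ordering below are assumptions made for illustration; the slides only state that options range beyond deletion and that deletion should pass a manual check.

```python
from enum import Enum


class ForgettingAction(Enum):
    KEEP_ACTIVE = "keep in active use"
    LOWER_RANKING = "change of indexing & ranking"
    CONDENSE = "resource condensation"
    ARCHIVE = "move to archive tier"
    PROPOSE_DELETION = "propose deletion"


def choose_action(buoyancy: float, preservation_value: float,
                  high: float = 0.6, low: float = 0.2) -> ForgettingAction:
    """Pick a forgetting option; all thresholds are hypothetical."""
    if buoyancy >= high:
        return ForgettingAction.KEEP_ACTIVE
    if buoyancy >= low:
        return ForgettingAction.LOWER_RANKING
    # Low buoyancy: the preservation value decides what happens next.
    if preservation_value >= 0.5:
        return ForgettingAction.ARCHIVE
    if preservation_value >= 0.25:
        return ForgettingAction.CONDENSE
    return ForgettingAction.PROPOSE_DELETION


def requires_manual_check(action: ForgettingAction) -> bool:
    """Deletion is never automatic in this sketch: a person confirms it."""
    return action is ForgettingAction.PROPOSE_DELETION
```

Keeping the policy as a pure function of the two values makes it easy to swap in different strategies per resource type or per user.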

Contextualized Remembering

Aim:

– bringing back information into active use in a meaningful way even if a lot of time has passed

– aiming for semantic level of preservation

Based on:

– taking into account relevant parts of context when moving to archive

– increasing contextualization of preserved content

– considering context evolution over time (evolution-aware contextualization)

Evolution-aware Contextualization & Re-contextualization

[Figure: along a time axis t, the context of interpretation in the information system evolves from C to C', C'' and C''' (human forgetting, change in focus, structural changes; semantic, structural and terminology evolution). The archival information system holds Pres(D') with Pres(C'), later Pres(D') with Pres(C''). Steps shown for document D: contextualization, context-aware preservation, semantic evolution detection, evolution-aware contextualization, re-contextualization.]
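One way to picture evolution-aware contextualization is as replaying detected context changes on top of the context that was preserved with a document. The sketch below assumes a very simple model: the context is a dictionary of entity descriptions, and each detected change rewrites one description from a given year on. The classes, function names and the example entity are hypothetical; the girlfriend-to-wife change echoes the holiday example earlier in the deck.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ContextChange:
    entity: str
    old_value: str
    new_value: str
    year: int


def evolve_context(preserved: Dict[str, str], changes: List[ContextChange],
                   up_to_year: int) -> Dict[str, str]:
    """Apply all changes detected up to a year to the preserved context."""
    current = dict(preserved)  # never mutate the archived copy
    for change in sorted(changes, key=lambda c: c.year):
        if change.year <= up_to_year and current.get(change.entity) == change.old_value:
            current[change.entity] = change.new_value
    return current


def recontextualize(preserved: Dict[str, str], changes: List[ContextChange],
                    access_year: int) -> Dict[str, Tuple[str, str]]:
    """On access, report each entity whose meaning changed as (then, now)."""
    current = evolve_context(preserved, changes, access_year)
    return {entity: (old, current[entity])
            for entity, old in preserved.items() if current[entity] != old}
```

Returning both the preserved and the current description lets a reader interpret the old document in its original context while still linking it to the present one.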

Synergetic Preservation

smooth and step-wise transition between active information use and preservation

enables rich information flow in both directions

supports more informed preservation decisions

eases preservation adoption

ForgetIT Project, GA600826 - Kickoff Meeting, Hannover, February 2013

Preserve-or-Forget Framework

[Figure: Preserve-or-Forget Framework. Archival Information System: Ingest, Data Management (Descriptive Info), Archival Storage (AIPs), Access, Administration, Preservation Planning. Information Management System: Content Management, Access, Authoring, Administration. Framework components: Adapter Layer, Managed Forgetting (Information Assessment, Condensation), Synergetic Preservation, Extraction & Contextualization, Re-contextualization.]

Integration Framework

Information Management System (IMS)

• Resources + metadata:

• ResourceID

• Content (size, tags, aging, geo)

• Context (folder/file usage)

• Social features

• Resource neighbours (graph)

Forgettor

• Assessor calculates: Memory Buoyancy + Preservation Value

• Analyzer: 1. Classification of resources w.r.t. strategies 2. Triggers forgetting actions

• Stores: Strategies (forgetting strategies for different types of resources), Values, Statistics

Data flow

• IMS to Forgettor: resources' meta-info

• Input: strategy, meta-information (content, context, neighbours), previous values

• Processing resources based on strategies and information values

• Storing the new values and sending them back to the IMS (resources' values + decisions)

Archives

• Store & access data
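The data flow above can be sketched as a small pipeline: the IMS sends resource metadata, an assessor computes the two values (taking previous values into account), a per-type strategy turns them into a decision, and the results are stored and returned. The formulas, the strategy, and all class and field names below are assumptions for illustration, not the actual ForgetIT interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class ResourceMeta:
    resource_id: str
    resource_type: str          # e.g. "photo", "document"
    days_since_last_use: int    # context: usage
    neighbours: List[str]       # related resources (graph)


@dataclass
class Values:
    memory_buoyancy: float
    preservation_value: float


Strategy = Callable[[Values], str]


def assess(meta: ResourceMeta, previous: Optional[Values]) -> Values:
    """Assessor: toy formulas, smoothed with the previous values if any."""
    buoyancy = 1.0 / (1.0 + meta.days_since_last_use / 30.0)
    preservation = min(1.0, len(meta.neighbours) / 5.0)
    if previous is not None:
        buoyancy = 0.5 * previous.memory_buoyancy + 0.5 * buoyancy
        preservation = max(previous.preservation_value, preservation)
    return Values(buoyancy, preservation)


def photo_strategy(values: Values) -> str:
    """Hypothetical forgetting strategy for photos."""
    if values.memory_buoyancy >= 0.5:
        return "keep"
    return "archive" if values.preservation_value >= 0.5 else "condense"


class Forgettor:
    def __init__(self, strategies: Dict[str, Strategy]):
        self.strategies = strategies           # one strategy per resource type
        self.values: Dict[str, Values] = {}    # value store across runs

    def process(self, resources: List[ResourceMeta]) -> Dict[str, Tuple[Values, str]]:
        """Analyzer: assess each resource, decide, store, and report back."""
        results = {}
        for meta in resources:
            new_values = assess(meta, self.values.get(meta.resource_id))
            decision = self.strategies[meta.resource_type](new_values)
            self.values[meta.resource_id] = new_values
            results[meta.resource_id] = (new_values, decision)
        return results
```

Because the Forgettor keeps its own value store, repeated runs change the values gradually, which avoids a single quiet month pushing a resource straight into the archive.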

Thank you

http://ForgetIT-Project.eu/
