DURAARK presentation at DEDICATE final seminar, October 21st 2013, Michelle Lindlar
DESCRIPTION
Presentation of the DURAARK project at the final seminar of the DEDICATE project ("Design's Digital Curation for Architecture") held in Glasgow on October 21st, 2013. http://architecturedigitalcuration.blogspot.de/
TRANSCRIPT
1 / 23 21 / 10 / 13
DURAARK Preserving Architectural Knowledge
Michelle Lindlar (LUH / TIB)
DEDICATE – Final Seminar
Glasgow, October 21st 2013
2 / 23 21 / 10 / 13
TIB (Technische Informationsbibliothek) is the German National Library of Science and Technology
subjects: engineering, architecture, chemistry, computer science, mathematics and physics
Why architectural data?
Competence centre for non-textual materials (KNM)
2007 – 2011: DFG-funded PROBADO3D project
metadata and content-based search for digital architectural 3D models
http://www.probado.de/en_3d.html
Why digital preservation?
2009 – 2011: Goportis digital preservation pilot project, together with our Goportis partners ZB MED and ZBW
Since 2012: Goportis digital preservation system hosted by TIB
A few words about TIB
3 / 23 21 / 10 / 13
DURAARK (DURAble Architectural Knowledge)
FP7 – ICT – Digital Preservation (STReP)
February 2013 – January 2016
Goal
Develop methods and tools for sustainable long-term preservation of building data (3D and BIM models, metadata, related knowledge & Web data)
Scope
• address all layers of digital preservation (bit, logical, semantic)
• interlinked curation and preservation workflows
• focus on two file formats: IFC and E57
• incorporate existing OAIS compliant digital preservation system
Project overview
4 / 23 21 / 10 / 13
Tangible outcomes
Semantic enrichment: Vocabularies for description of built structures and enrichment techniques based on a unified and sustainable naming scheme
Tailored Workflows: Thoroughly investigate requirements of institutional stakeholders (libraries/archives) and SMEs on long-term archiving. Develop corresponding workflows.
Sustainability of file formats: Face the problem of digital decay by using Industry Foundation Classes (IFC) and E57 as open and already well-established file formats suited for long-term preservation. Ensure availability of characterization tools for those formats.
Goal and Tangible Outcomes
5 / 23 21 / 10 / 13
DURAARK – an interdisciplinary project
6 / 23 21 / 10 / 13
TUE, Department of the Built Environment, Eindhoven University of Technology
- WP3 leader, semantics & metadata
CITA, Center for Information Technology and Architecture Copenhagen
- WP7 leader, evaluation, test
LUH: German National Library of Science and Technology (TIB) & L3S Research Center Hannover
- Coordinator
- WP3 Semantic Enrichment
- WP6 leader, long-term preservation
Luleå University of Technology
- WP8 leader, dissemination/exploitation
Fraunhofer Austria
- WP2 leader, system specification & integration
UBO: Universität Bonn
- Technical Coordinator
- WP4/WP5: change management, shape recognition
Catenda, SME
- User perspective, market requirements, evaluation
Consortium
Jakob Beetz (Eindhoven University of Technology)
7 / 23 21 / 10 / 13
3 layers of a digital object
8 / 23 21 / 10 / 13
risks:
• media obsolescence
• technical failure
• human error
• DRM
possible actions:
• media migration, refreshing, replication
• technological redundancy, ideally with geographic spread
• error detection, monitoring, recovery & disaster planning
• controlled storage with regular maintenance
• security and trust
Solved through "good IT practice" (which, of course, needs to be implemented …)
1. Bit(stream) [Physical] preservation layer
http://commons.wikimedia.org/wiki/File:Compact_Floppy.jpg
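The "error detection" and "replication" actions listed for the bit layer are commonly implemented as fixity checks. The following is a minimal sketch, assuming SHA-256 as the checksum and a stored reference digest per object; the function names and the choice of algorithm are illustrative assumptions, not something the slides prescribe.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large scans or models need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_replicas(replicas: list[Path], expected: str) -> dict[str, bool]:
    """Map each replica path to True if its digest matches the stored value.

    `expected` is the digest recorded at ingest (hypothetical workflow step).
    """
    return {str(path): sha256_of(path) == expected for path in replicas}
```

A replica that reports False would be a candidate for recovery from one of the other copies, which is why redundancy and error detection are listed together.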
9 / 23 21 / 10 / 13
risks:
• software / file format obsolescence
• software / OS / hardware dependencies
• additionally: configuration / package dependencies
• lack of compliance with format standards ("malformed objects")
• DRM
possible actions:
• migration, emulation, normalization
• "hardware museum"
• data/information extraction
• extensive technical metadata capturing
• definition of significant properties (what to preserve)
Established basic processes … but they require adaptation for new formats.
2. Logical [object] preservation layer
http://www.flickr.com/photos/89771128@N02/8451172304/in/pool-2121762@N23
10 / 23 21 / 10 / 13
risks:
• terminology and concepts change over time
• context and provenance may be lost (purpose, setting, limitations, cultural context, related objects)
possible actions:
• semantic enrichment
• tracing of metadata
• audit trail capturing
• migration at semantic level
• documentation of context
• document intended meaning / interpretation
Least developed area of digital preservation
3. Semantic [interpretability] preservation layer
11 / 23 21 / 10 / 13
DURAARK Stack
12 / 23 21 / 10 / 13
Use Cases (1/2)
13 / 23 21 / 10 / 13
DURAARK stakeholders
producers
long-term data stewards
14 / 23 21 / 10 / 13
DURAARK stakeholders
consumers
long-term data stewards
15 / 23 21 / 10 / 13
Curation and Preservation
producer / consumer
long-term data steward
Actions need to meet requirements of
DCC Curation Lifecycle Model
http://www.dcc.ac.uk/resources/curation-lifecycle-model
Creates data to be preserved by
16 / 23 21 / 10 / 13
Consumer Use Cases
• result of stakeholder analysis
• describe desired use, re-use, access
• will be addressed in geometric and semantic enrichment processing layer
Knowing why something should be preserved helps us in evaluating the characteristics to be preserved
Use Cases (2/2)
17 / 23 21 / 10 / 13
OAIS: Information Object
http://public.ccsds.org/publications/archive/650x0m2.pdf
18 / 23 21 / 10 / 13
Metadata: Technical
"Metadata that describes the technical state of and process used to create a file. Often closely related either to its file format or the original software used to create the file, e.g. scanning equipment and settings used to create or modify a digital object."
http://www.digitalpreservation.gov/ndsa/ndsa-glossary.html
Information needed in order to maintain access to the file
Significant properties: criteria which an institution considers important factors of an object's quality, structure or behaviour, which should be preserved over time, i.e. over the course of digital preservation actions.
http://public.ccsds.org/publications/archive/650x0m2.pdf
Technical Metadata
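One way to read "preserved over the course of digital preservation actions" is as a before/after comparison: characterize the object, perform the action (for example a migration), characterize again, and check that the properties declared significant are unchanged. The sketch below assumes characterization results are available as plain dictionaries; the property names and values are hypothetical examples, not DURAARK output.

```python
from typing import Any


def changed_significant_properties(
    before: dict[str, Any],
    after: dict[str, Any],
    significant: list[str],
) -> dict[str, tuple[Any, Any]]:
    """Return significant properties whose value differs after an action.

    Keys missing from either characterization are treated as None.
    """
    return {
        name: (before.get(name), after.get(name))
        for name in significant
        if before.get(name) != after.get(name)
    }


# Hypothetical characterization of one model before and after a migration.
before = {"schema_version": "IFC2X3", "measurement_units": "metre", "number_of_stories": 4}
after = {"schema_version": "IFC4", "measurement_units": "metre", "number_of_stories": 3}

# The institution decides which properties count; schema_version is
# deliberately left out here because the migration is expected to change it.
report = changed_significant_properties(
    before, after, ["measurement_units", "number_of_stories"]
)
```

An empty report would indicate that the action kept everything the institution declared significant; here the changed storey count would be flagged for review.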
19 / 23 21 / 10 / 13
File format characterization
Existing tools for various file formats:
JHOVE, Tika, FIDO, FITS, DROID, …
Few existing tools for IFC and E57:
E57 validator, IFC validator
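The first step of characterization is identification. As a rough sketch of signature-based identification for the two formats in scope: E57 files begin with the ASCII signature "ASTM-E57", and IFC files in the STEP physical file encoding begin with "ISO-10303-21;". The function names and return labels below are illustrative assumptions; this only identifies a candidate format and does not validate the file.

```python
from pathlib import Path

E57_SIGNATURE = b"ASTM-E57"        # first 8 bytes of an E57 file
STEP_SIGNATURE = b"ISO-10303-21;"  # opening token of a STEP physical file


def sniff_format(first_bytes: bytes) -> str:
    """Guess the format from the leading bytes of a file."""
    if first_bytes.startswith(E57_SIGNATURE):
        return "E57"
    # STEP files are text; tolerate leading whitespace before the token.
    if first_bytes.lstrip().startswith(STEP_SIGNATURE):
        # Other STEP-based schemas share this signature, so this is only
        # a candidate until the FILE_SCHEMA header entry is inspected.
        return "STEP (possibly IFC)"
    return "unknown"


def sniff_file(path: Path) -> str:
    """Read the first bytes of a file and guess its format."""
    with path.open("rb") as handle:
        return sniff_format(handle.read(64))
```

Identification alone says nothing about well-formedness, which is where the dedicated E57 and IFC validators come in.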
20 / 23 21 / 10 / 13
National Library of Australia: Testing Software Tools of Potential Interest for Digital Preservation
http://www.openplanetsfoundation.org/system/files/Digital%20Preservation%20Project%20Report%20-%20Testing%20Software%20Tools.pdf
21 / 23 21 / 10 / 13
IFC extraction:
• geometry types
• schema version
• implementation level
• application
• version of application
• measurement units
• MVD
• geotagged
• gross area
• number of stories
• …
E57 extraction:
• geo-referenced (yes/no)
• total square metres
• number of floors
• resolution settings
• quality settings
• sensor model, sensor serial number, …
• total number of scans
• total number of points
• intensity (yes/no)
• colour (yes/no)
• reasons for spatial disturbance: distribution of detected elements
• sub quality parameters (positioning) – in %, e.g., distance error matched references; occupied quadrants
• sub quality parameters (references) – in %, e.g., point drift, longitudinal mismatch
• …
Potential candidates for technical metadata
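Some of the IFC candidates can be read from the header section of a STEP-encoded IFC file without parsing the model itself. The sketch below pulls the schema version from FILE_SCHEMA and the implementation level from FILE_DESCRIPTION. It also assumes the common exporter convention of writing the MVD as "ViewDefinition [...]" inside FILE_DESCRIPTION, which not every file follows. The sample header and the regular expressions are illustrative assumptions, not a full STEP parser.

```python
import re
from typing import Optional

# Hypothetical header, made up for illustration.
SAMPLE_HEADER = """ISO-10303-21;
HEADER;
FILE_DESCRIPTION(('ViewDefinition [CoordinationView]'),'2;1');
FILE_NAME('example.ifc','2013-10-21T10:00:00',('A. Author'),('Example Org'),
          'ExamplePreprocessor','ExampleCAD 1.0','');
FILE_SCHEMA(('IFC2X3'));
ENDSEC;
"""


def extract_header_metadata(text: str) -> dict[str, Optional[str]]:
    """Extract schema version, MVD and implementation level from a STEP header."""
    header = text.split("ENDSEC;", 1)[0]  # ignore the DATA section
    schema = re.search(r"FILE_SCHEMA\s*\(\s*\(\s*'([^']*)'", header)
    description = re.search(
        r"FILE_DESCRIPTION\s*\(\s*\((.*?)\)\s*,\s*'([^']*)'\s*\)", header, re.S
    )
    mvd = None
    level = None
    if description:
        # Assumed convention: the view definition appears in square brackets.
        view = re.search(r"ViewDefinition\s*\[([^\]]*)\]", description.group(1))
        mvd = view.group(1).strip() if view else None
        level = description.group(2)
    return {
        "schema_version": schema.group(1) if schema else None,
        "mvd": mvd,
        "implementation_level": level,
    }


metadata = extract_header_metadata(SAMPLE_HEADER)
```

Properties such as gross area or number of stories live in the model data rather than the header and would need an actual IFC parser.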
22 / 23 21 / 10 / 13
Currently developing stakeholder questionnaire covering the following areas:
– data holdings (formats, SW, produced internally / externally)
– data storage / management (data carriers, backup practices, archiving practices)
– access (when, for what reason)
– experience with data loss (yes/no, reasons)
Looking for interested institutions and multipliers!
Want to help?