Besser--CIR 2000, 5/5/00 1

Image Metadata: What users will want from mature interoperable image retrieval systems

Howard Besser
UCLA School of Education & Information

http://www.gseis.ucla.edu/~howard

Besser--CIR 2000, 5/5/00 2

Image Metadata: What users will want from mature interoperable image retrieval systems

- Developmental Stages
- Metadata background
- Merging Images from 7 Museums (MESL)
- Structural and Administrative Metadata (MOA2)
- Image Technical Information (NISO/DLF)
- Finding Image Origins
- Other Metadata & Issues (IPR, Moving Images, Complex Objects)

Besser--CIR 2000, 5/5/00 3

Developmental Stages

- Experiment with methods
- Build real operational systems
- Build interoperable operational systems

Besser--CIR 2000, 5/5/00 4

Traditional Digital Library Model

[Diagram: users, four digital libraries (DL), and four separate "search & presentation" layers, one per DL]

Besser--CIR 2000, 5/5/00 5

Ideal Digital Library Model

[Diagram: users, four digital libraries (DL), and a single shared "search & presentation" layer]

Besser--CIR 2000, 5/5/00 6

Developmental Stages

- Experiment with methods
- Build real operational systems
- Build interoperable operational systems
  – For DL Initiatives
  – For OPACs
  – For I & A Services
  – For Image Retrieval

Besser--CIR 2000, 5/5/00 7

Metadata is not just indexing terms

- CBIR attributes used for retrieval on color, shape, texture, etc.
- Structural attributes used for page-turning
- Administrative attributes used for managing a digital work over time
- IPR attributes to limit unauthorized use
- Identification attributes to determine what application software is needed to view a particular digital work
- Can be located anywhere

Besser--CIR 2000, 5/5/00 8

Merging Images from 7 Museums (MESL)

- Project Description
- Inconsistent Metadata Issues
- Strange Search Results
- User Needs Assessment

Besser--CIR 2000, 5/5/00 9

Samples from a MESL Site

Besser--CIR 2000, 5/5/00 10

Samples from a MESL Site

Besser--CIR 2000, 5/5/00 11

Creating New Image Sets (Views)

Besser--CIR 2000, 5/5/00 12

Fields in MESL Data Dictionary (1.1)

1. data agreement number
2. holding institution
3. accession number
4. accession method
5. credit line
6. label
7. object type/object class/object name
8. object title/caption
9. creator/maker - name
10. creator/maker - culture/nationality
11. creator/maker - role
12. creation place
13. creation begin date
14. creation end date
15. creation technique/method/process
16. material/medium
17. support
18. dimension/extent-quantity-unit
19. parts/pieces
20. marks/inscriptions
21. edition/state
22. associated events, people, organizations, places
23. concepts/subject
24. concepts/style-period
25. concepts/function
26. description
27. accompanying image - file name
28. accompanying image - caption
29. accompanying image - capture data
30. accompanying document - file name
31. accompanying document - type
32. version identification

Besser--CIR 2000, 5/5/00 13

Museum Collection Mgmt System

1. object title/caption
2. accession method
3. accession number
4. label
5. credit line
6. creation end date
7. object type/object class/object name
8. holding institution
9. ...
10. ...
...
...
99. creation begin date
100. data agreement number
101. creation
...

MESL Data Dictionary

1. data agreement number
2. holding institution
3. accession number
4. accession method
5. credit line
6. label
7. object type/object class/object name
8. object title/caption
9. creator/maker - name
10. creator/maker - culture/nationality
11. creator/maker - role
12. creation place
13. creation begin date
14. creation end date
15. creation technique/method/process
16. material/medium
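
The contrast on this slide, a local system's field order versus the MESL data dictionary order, amounts to a crosswalk. A minimal sketch of one, assuming a hypothetical local export keyed by field name: the local names in `LOCAL_TO_MESL`, the `to_mesl` function, and the sample record are illustrative assumptions, not the MESL project's actual tooling.

```python
# MESL data dictionary fields 1-8, numbered as on the slide.
MESL_FIELDS = {
    1: "data agreement number",
    2: "holding institution",
    3: "accession number",
    4: "accession method",
    5: "credit line",
    6: "label",
    7: "object type/object class/object name",
    8: "object title/caption",
}

# Assumed local field names; a real collection management system will differ.
LOCAL_TO_MESL = {
    "title": 8,
    "acc_method": 4,
    "acc_no": 3,
    "label": 6,
    "credit": 5,
    "institution": 2,
}


def to_mesl(local_record):
    """Reorder a local record into MESL field order.

    Returns (ordered, unmapped): ordered is a list of
    (field number, MESL field name, value) tuples; unmapped lists local
    fields with no MESL equivalent so they are reported, not dropped silently.
    """
    mapped = {}
    unmapped = []
    for name, value in local_record.items():
        number = LOCAL_TO_MESL.get(name)
        if number is None:
            unmapped.append(name)
        else:
            mapped[number] = value
    ordered = [(n, MESL_FIELDS[n], mapped[n]) for n in sorted(mapped)]
    return ordered, unmapped


# Invented sample record for illustration only.
ordered, unmapped = to_mesl(
    {"title": "Haystacks", "acc_no": "1983.1", "curator_note": "x"}
)
```

Reporting unmapped fields separately is a design choice: it makes visible which local data has no home in the shared dictionary.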

Besser--CIR 2000, 5/5/00 14

Authority control over artist name

- Goya y Lucinetes, Francisco de (Houston)
- Goya y Lucientes, Francisco Jose de (Harvard)
- Goya, Francisco de (NGA)
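
One way to reconcile such variants is an authority file that maps each known form to a single preferred heading. The sketch below is illustrative only: choosing the NGA form as the preferred heading is an arbitrary assumption, and `normalize` and `authorized_form` are hypothetical helper names.

```python
import unicodedata


def normalize(name):
    """Fold case, strip diacritics, and collapse whitespace so trivial variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_marks.casefold().split())


# Assumption: the NGA form serves as the preferred heading purely for illustration.
PREFERRED = "Goya, Francisco de"

# Variant forms exactly as they appear on the slide (including the Houston spelling).
AUTHORITY = {
    normalize("Goya y Lucinetes, Francisco de"): PREFERRED,
    normalize("Goya y Lucientes, Francisco Jose de"): PREFERRED,
    normalize("Goya, Francisco de"): PREFERRED,
}


def authorized_form(name):
    """Return the preferred heading for a known variant, or None if the name is not on file."""
    return AUTHORITY.get(normalize(name))
```

Normalization only absorbs case, accents, and spacing; spelling variants such as the Houston form still have to be entered in the file explicitly.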

Besser--CIR 2000, 5/5/00 15

MESL Technical Info

Museum         File Format     Largest File                                  Smallest File
NMAA           TIFF            1056x930, 2.8MB                               286x892, 800K
LOC            JPEG/JFIF       540x420, 39K compressed, 5:1 compression      552x420, 32K compressed, 7:1 compression
NGA            JPEG/JFIF       729x600, 305K, 76% compression                333x768, 42K, 17:1 compression
Houston        JPEG/JFIF       1284x1876, 2.3MB compressed, 7% compression   608x1802, 315K compressed, 10:1 compression
Fowler         JPEG/JFIF, GIF  1472x999, 1.45MB, 55% compression             1057x876, 32K compressed, 86:1 compression
Harvard        JPEG/JFIF       1024x672, 818K compressed, 59% compression    605x483, 31K compressed, 28:1 compression
Eastman House  PCD             16 Base: 3072x2048                            Base/16: 192x128
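
The compression figures can be sanity-checked from pixel dimensions and file size. The sketch below assumes 24-bit color (3 bytes per pixel) and treats "K" as 1000 bytes; neither assumption is stated on the slide, and the function names are illustrative.

```python
def uncompressed_bytes(width, height, bytes_per_pixel=3):
    """Size of the raw pixel data, assuming 24-bit color by default."""
    return width * height * bytes_per_pixel


def compression_ratio(width, height, file_bytes, bytes_per_pixel=3):
    """Ratio of raw size to stored size, e.g. 10.0 for 10:1."""
    return uncompressed_bytes(width, height, bytes_per_pixel) / file_bytes


def percent_saved(width, height, file_bytes, bytes_per_pixel=3):
    """Space saved by compression, as a percentage of the raw size."""
    raw = uncompressed_bytes(width, height, bytes_per_pixel)
    return 100 * (1 - file_bytes / raw)


# Harvard, smallest file: 605x483 pixels stored in about 31K.
harvard_small = compression_ratio(605, 483, 31 * 1000)
# Harvard, largest file: 1024x672 pixels stored in about 818K.
harvard_large = percent_saved(1024, 672, 818 * 1000)
```

Under these assumptions the Harvard figures come out a little above 28:1 and at roughly 60% saved, close to the 28:1 and 59% reported. Rows that do not fit a 24-bit assumption may describe grayscale images.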

Besser--CIR 2000, 5/5/00 16

Search Discrepancies

Besser--CIR 2000, 5/5/00 17

Query for “surreal”

University  Results  Notes
American    4        Two are photos of people with “Surreal” mentioned in the description.
Columbia    2        Same as others
Cornell     2        Same as others
Illinois    2        Same as others
Maryland    2        Case sensitive; yielded 0 results when capitalized.
Michigan    2        Same as others
Virginia    Varied   Subject search: 2 (same as others); Full text: 4 (same as American)

Besser--CIR 2000, 5/5/00 18

Query for “haystack”

University  Results  Notes
American    2        Searched on title (0 results for subject).
Columbia    1        Haystack in background (not main subject of work)
Cornell     5        Searched on full text
Illinois    2        Same as American
Maryland    3        One result different from any of the others.
Michigan    6        Searches on both subject field and “Quicksearch” yielded same results.
Virginia    Varied   Subject search: 1; Title search: 4; Full text search: 5

Besser--CIR 2000, 5/5/00 19

Query for “oil portraits of children”

University  Results  Notes
American    0        Compound search, tried multiple fields
Columbia    0
Cornell     82       Medium=oil, Subject=child
Illinois    6        Searched “child oil”
Maryland    0        Searched on Keyword
Michigan    31       Subject=child, medium=oil, majority of hits are relevant
Virginia    82       Subject=child, medium=oil

Besser--CIR 2000, 5/5/00 20

“Madonna” Query

- Columbia (99)
- Michigan (66)
- Virginia (66)
- Cornell (65)
- Illinois (65)
- Maryland (0)

Besser--CIR 2000, 5/5/00 21

“Africa” Query

- Illinois (273)
- Virginia (249)
- Cornell (195)
- Michigan (104)
- Columbia (99)
- Maryland (0)

Besser--CIR 2000, 5/5/00 22

Search Discrepancy -- What Happened?

- different mapping between original data fields and perceived user needs
- different ways in which the various search engines work
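
Both causes can be reproduced with a toy index. In the sketch below the records are invented sample data (not MESL records), and the four site configurations are hypothetical. It only shows how the same query returns different hit counts depending on which fields a site indexes and whether matching is case sensitive.

```python
# Invented sample records; these are not MESL data.
RECORDS = [
    {"title": "Haystacks at Sunset", "subject": "landscape",
     "description": "Oil on canvas."},
    {"title": "Farm Scene", "subject": "haystack; farm",
     "description": "A haystack stands in the background."},
    {"title": "Portrait", "subject": "child",
     "description": "Surreal double exposure."},
]


def search(records, term, fields, case_sensitive=False):
    """Return records where `term` occurs as a substring in any indexed field."""
    hits = []
    for record in records:
        for name in fields:
            text = record.get(name, "")
            if case_sensitive:
                found = term in text
            else:
                found = term.lower() in text.lower()
            if found:
                hits.append(record)
                break  # count each record once
    return hits


ALL_FIELDS = ["title", "subject", "description"]

# Hypothetical site configurations: same data, different indexing choices.
SITES = {
    "title only": {"fields": ["title"]},
    "subject only": {"fields": ["subject"]},
    "full text": {"fields": ALL_FIELDS},
    "full text, case sensitive": {"fields": ALL_FIELDS, "case_sensitive": True},
}


def hit_counts(records, term):
    """Number of hits per site configuration for one query term."""
    return {site: len(search(records, term, **config))
            for site, config in SITES.items()}
```

Calling `hit_counts(RECORDS, "haystack")` gives a different count for the full-text site than for the title-only or subject-only sites, and a lower-case query for "surreal" finds nothing on the case-sensitive site because the only occurrence is capitalized.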

Besser--CIR 2000, 5/5/00 23

Fields indexed from MESL Data Dictionary (1)

[Repeats the 32 MESL Data Dictionary fields listed under "Fields in MESL Data Dictionary (1.1)"; the slide's indication of which fields were indexed is not recoverable here.]

Besser--CIR 2000, 5/5/00 24

Fields indexed from MESL Data Dictionary (2)

[Repeats the 32 MESL Data Dictionary fields listed under "Fields in MESL Data Dictionary (1.1)"; the slide's indication of which fields were indexed is not recoverable here.]

Besser--CIR 2000, 5/5/00 25

Fields indexed from MESL Data Dictionary (3)

[Repeats the 32 MESL Data Dictionary fields listed under "Fields in MESL Data Dictionary (1.1)"; the slide's indication of which fields were indexed is not recoverable here.]

Besser--CIR 2000, 5/5/00 26

UCB Mellon Grant: Examining Faculty & Student Use & Usefulness

- What Faculty Do with Digital Images
- Major Issues for Faculty
- Faculty Concerns about teaching with Digital Images
- Faculty Concerns about Image Quality and Metadata

Besser--CIR 2000, 5/5/00 27

Faculty Use of Digital Images: Major Issues

- technical support
- training
- tools (software and hardware)
  – More than just query options
- time commitment

Besser--CIR 2000, 5/5/00 28

Faculty Concerns about Image Quality and Metadata

- Image quality is important, but the quality needed is contextual
- For these faculty, digital image quality was no worse than slides
- Any metadata delivered must be customizable by the faculty member

Besser--CIR 2000, 5/5/00 29

MESL Follow-on Projects

- Academic Image Cooperative -- http://www.academicimage.org/
- AMICO Project -- http://www.amn.org/AMICO/
- Museum Digital Licensing Consortium -- http://www.digitalmuseums.org/

Besser--CIR 2000, 5/5/00 30

Structural and Administrative Metadata (MOA2)

Special Collections Material DLF Metadata for Interoperability Testbed

- Administrative Metadata
- Structural Metadata

Besser--CIR 2000, 5/5/00 31

Making of America II

- R & D
- Distributed Repositories
- Transportation, 1869-1900
- Testbed Project
- Best Practices
- Structural and administrative metadata

Besser--CIR 2000, 5/5/00 32

MOA2 Goal is Interoperability

Book example

Besser--CIR 2000, 5/5/00 33

MOA II Classes of Objects

- Continuous Tone Photos
- Photo Albums
- Diaries, journals, letterpress books
- Ledgers
- Correspondence

Besser--CIR 2000, 5/5/00 34

MOA II Metadata

- Administrative Metadata
  – for enhancing resource management
- Structural Metadata
  – for reflecting internal hierarchies and relationships between parts
- Raw/Seared/Cooked

Besser--CIR 2000, 5/5/00 35

MOA II Behaviors

- Navigation
- Display/Print

Besser--CIR 2000, 5/5/00 36

NISO/DLF Image Metadata Workshop (4/99)

Image Technical Information: Possible Goals

- Metadata fields
- Rules for Field Contents (authority control)
- Core set of necessary fields
- Syntax for expressing fields and contents (headers)

Besser--CIR 2000, 5/5/00 37

Image Metadata

Focus on Metadata that may prove helpful for

- management
- use
- preservation
- ...

Besser--CIR 2000, 5/5/00 38

Image Metadata

Break-out Groups: Work Done

- Characteristics and Features of Images
- Image Production and Reformatting Features
- Image Identification and Integrity

Besser--CIR 2000, 5/5/00 39

Image Metadata Elements for Data Dictionary

Data Dictionary Entries

- Element Name
- Definition (short) of the element name
- Is the element required? (Identified as: Mandatory, Mandatory if Applicable, Recommended, Optional)
- How is the value of the element represented?
- Examples
- When is this data collected?
- What is the purpose of this data?
- Who would the identified users be?
- How is the metadata used?
- What other metadata standards reference it?
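
The questions above describe what each data dictionary entry has to record. A minimal sketch of such an entry as a Python dataclass follows. The attribute names and the sample "Compression" entry are assumptions made for illustration, not taken from the NISO/DLF draft; only the four obligation levels come from the slide.

```python
from dataclasses import dataclass, field, fields
from enum import Enum


class Obligation(Enum):
    """The four requirement levels named on the slide."""
    MANDATORY = "Mandatory"
    MANDATORY_IF_APPLICABLE = "Mandatory if Applicable"
    RECOMMENDED = "Recommended"
    OPTIONAL = "Optional"


@dataclass
class DataDictionaryEntry:
    element_name: str
    definition: str
    obligation: Obligation
    representation: str                                # how the value is represented
    examples: list = field(default_factory=list)
    collected_when: str = ""                           # when the data is collected
    purpose: str = ""                                  # what the data is for
    users: list = field(default_factory=list)          # identified users
    usage: str = ""                                    # how the metadata is used
    referenced_by: list = field(default_factory=list)  # other standards citing it

    def missing_answers(self):
        """Names of the questions that have not been answered yet."""
        return [f.name for f in fields(self) if getattr(self, f.name) in ("", [])]


# Hypothetical entry, for illustration only.
entry = DataDictionaryEntry(
    element_name="Compression",
    definition="Scheme used to compress the image data",
    obligation=Obligation.MANDATORY_IF_APPLICABLE,
    representation="string",
    examples=["JPEG"],
)
```

A helper like `missing_answers` makes it easy to see which parts of a draft entry still need work before the dictionary is circulated.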

Besser--CIR 2000, 5/5/00 40

Image Metadata Elements for Data Dictionary

Characteristics and Features Element List

- Format Issues:
- Resolution Issues:
- Encoding:
- Compression:
- Others:

Besser--CIR 2000, 5/5/00 41

Image Metadata Elements for Data Dictionary

Image Production Element List (Pertaining to the Image)

- In-image target(s):
- System target(s), associated with the object:
- Responsible agent
- Rationale:
- Hardware:
- Software:

Besser--CIR 2000, 5/5/00 42

Image Metadata Elements for Data Dictionary

Image Production Element List (Pertaining to the Process)

- Format of the image
- Intrinsic characteristics of the image
- Identification
- Provides a means for defining methodology including documentation and rationale
- Who is involved with the file?
- Who created the image file?
- Who commissioned the creation of the image file (i.e., the chartering entity), as opposed to: Who is the responsible agency? Who is the owner?
- Where
- What
- When: necessary dates including: capture date/time, modification
- Checksum
- Navigational aid
- Encoding tools

Besser--CIR 2000, 5/5/00 43

Image Metadata

NISO/DLF Image Metadata: In Progress

- Data Dictionary for both “Characteristics & Features” and for “Image Production Elements” due end of 6/00

Besser--CIR 2000, 5/5/00 44

Finding Image Origins

Besser--CIR 2000, 5/5/00 45

Identification/Provenance (Images)

- The number of variant forms of a work can be enormous
- Image Families
- A digital image frequently has many layers of parentage
- Information about the parentage that can indicate the quality and veracity of the image (Dublin Core "Source" and "Relation")
- How to deal with different versions derived from the same scan or different encoding schemes
- Vocabulary Standards to express this

Besser--CIR 2000, 5/5/00 46

The number of variant forms of a work can be enormous

- different views of the same object
- different scans of the same photo
- different resolutions
- different compression schemes
- different compression ratios
- different file storage formats
- different details of the same image
- ...

Image Families

Besser--CIR 2000, 5/5/00 48

Identification/Provenance

- How to deal with different versions (browse, hi-res, medium res) derived from the same scan or different encoding schemes (TIFF, PICT, JFIF)
- Vocabulary Standards to express this
  – VRA Surrogate Categories
  – CIMI's "Image Elements"
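
The idea of an image family with layers of parentage can be sketched as a chain of source links, loosely in the spirit of Dublin Core "Source". The identifiers, the `ImageVersion` class, and the sample family below are hypothetical, and the sketch does not implement the VRA or CIMI vocabularies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ImageVersion:
    identifier: str
    description: str
    source: Optional[str] = None  # identifier of the parent version, if any


def lineage(versions, identifier):
    """Follow source links from one version back to the root of its family."""
    by_id = {v.identifier: v for v in versions}
    chain = []
    seen = set()
    current = by_id.get(identifier)
    while current is not None:
        if current.identifier in seen:
            raise ValueError("cycle in source links at " + current.identifier)
        seen.add(current.identifier)
        chain.append(current.identifier)
        current = by_id.get(current.source) if current.source else None
    return chain


def root_of(versions, identifier):
    """Identifier of the original from which a version ultimately derives."""
    chain = lineage(versions, identifier)
    return chain[-1] if chain else None


# Hypothetical family: one photograph, one master scan, two derivatives.
FAMILY = [
    ImageVersion("photo-1", "original photograph"),
    ImageVersion("scan-a", "TIFF master scan", source="photo-1"),
    ImageVersion("hires-jfif", "hi-res JFIF derivative", source="scan-a"),
    ImageVersion("browse-jfif", "browse-size JFIF derivative", source="hires-jfif"),
]
```

Two retrieved images belong to the same family when `root_of` returns the same identifier for both, and the length of the lineage hints at how many processing steps separate a file from its original.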

Besser--CIR 2000, 5/5/00 49

Are some of the images I retrieved actually identical to each other?

- Canonical forms
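
One possible reading of "canonical forms" is to compare images by a digest of their decoded pixel content rather than their file bytes, so that the same pixels stored in different container formats still match. The sketch below assumes the images have already been decoded to 8-bit RGB. The function name is illustrative, and lossy derivatives will not match under this scheme.

```python
import hashlib
import struct


def canonical_digest(width, height, rgb_pixels):
    """SHA-256 over the dimensions plus raw 8-bit RGB pixel data.

    File headers, container format, and lossless encoding choices do not
    affect the result; any change to the pixels or dimensions does.
    """
    if len(rgb_pixels) != width * height * 3:
        raise ValueError("pixel data does not match the stated dimensions")
    header = struct.pack(">II", width, height)  # big-endian width and height
    return hashlib.sha256(header + bytes(rgb_pixels)).hexdigest()


# Two pixels (red, green) as they might be decoded from two different files.
pixels = bytes([255, 0, 0, 0, 255, 0])
digest_a = canonical_digest(2, 1, pixels)
digest_b = canonical_digest(2, 1, bytearray(pixels))
```

Including the dimensions in the digest keeps a 2x1 image from matching a 1x2 image that happens to contain the same byte sequence.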

Besser--CIR 2000, 5/5/00 50

Other Metadata & Issues

- For IPR management
- Approaches to Indexing Moving Image Materials
- Structural Metadata for Complex Objects

Besser--CIR 2000, 5/5/00 51

<Indecs>

- formal structure for describing and uniquely identifying intellectual property itself, the people and businesses involved in its trading, and the agreements which they make about it (primarily for publishing, music, and visual arts)
- will develop high-level specifications for the services that will be required to implement a global IP trading system based on this <indecs> generic data model
- focus is on encoding rights at a high level, not on resource discovery
- likely to involve metadata schema registration and directory to allow interoperation of personal identifiers for rightsholders and users
- supported by EEC DG-13
- First meeting July 1999
- http://www.indecs.org/

Besser--CIR 2000, 5/5/00 52

Indexing of Moving Image Materials

- Whole works vs. parts of Works
- MPEG 7
- Approaches to segmentation & thumbnail representation
- Closed caption indexing
- Audio description indexing
- Semiotics

Besser--CIR 2000, 5/5/00 53

Structural Metadata for Complex Objects

- MPEG 4
- SMIL

Besser--CIR 2000, 5/5/00 54

Synchronized Multimedia Integration Language (SMIL)

- For repurposing and reuse in different ways
- Use XML to reference various pieces in different ways
- Supported by Realmedia but not Microsoft or Macromedia
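
To illustrate how XML can reference the same media pieces in different ways, the sketch below builds two small SMIL presentations from one pool of clips using Python's standard `xml.etree.ElementTree`. The clip URLs are placeholders on example.org, and `build_smil` is a hypothetical helper. Only the basic `smil`, `body`, `seq`, `video`, and `img` elements with `src` and `dur` attributes are used.

```python
import xml.etree.ElementTree as ET


def build_smil(clips):
    """Return a SMIL document that plays the given clips one after another.

    clips is a list of (element name, source URL, duration) tuples.
    """
    smil = ET.Element("smil")
    body = ET.SubElement(smil, "body")
    seq = ET.SubElement(body, "seq")  # seq: children play in sequence
    for tag, src, dur in clips:
        ET.SubElement(seq, tag, {"src": src, "dur": dur})
    return ET.tostring(smil, encoding="unicode")


# Placeholder media pieces (assumed URLs, for illustration only).
INTRO = ("img", "http://example.org/title.jpg", "5s")
CLIP_A = ("video", "http://example.org/a.mpg", "10s")
CLIP_B = ("video", "http://example.org/b.mpg", "8s")

# The same pieces referenced in two different ways.
lecture_cut = build_smil([INTRO, CLIP_A, CLIP_B])
teaser_cut = build_smil([CLIP_B, INTRO])
```

Neither presentation copies the media; each is only a short XML file of references, which is what makes repurposing cheap.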

Besser--CIR 2000, 5/5/00 55

MPEG 4

- Object-oriented
- Very low level of granularity (even objects vs backgrounds)
- Scalable bandwidth use
- Binary Format for Scenes (BIFS) borrows concepts from VRML

Besser--CIR 2000, 5/5/00 56

What will we need in mature CBIR systems?

- Look at what users want
- Hooks to allow users to DO something with the images
- Interoperability
- Image Metadata & Standards

Besser--CIR 2000, 5/5/00 57

Image Metadata: What users will want from mature interoperable image retrieval systems

- http://www.gseis.ucla.edu/~howard
- http://sunsite.berkeley.edu/Imaging/Databases/1998mellon
- Spring 1999 special issue of Visual Resources
- http://www.gseis.ucla.edu/~howard/image-meta.html
- http://www.niso.org/image.html
- http://sunsite.Berkeley.EDU/Imaging/Databases/#standards
- http://sunsite.Berkeley.EDU/moa2/
- http://www.gseis.ucla.edu/~howard/Classes/287-moving.html
- http://sunsite.Berkeley.EDU/Longevity/
- http://www.nlc-bnc.ca/ifla/II/metadata.htm
- http://www.gseis.ucla.edu/~howard/Classes/287-mov-index-bib.html

Besser--CIR 2000, 5/5/00 58

Besser--CIR 2000, 5/5/00 59

Problems of Rich Media

- Complexity of formats (storage & compression)
- Synchronicity between media/streams
- Pieces and Boundaries
- Persistent IDs
- Interactivity
- Historical context
- Content
- Recontextualization (Postmodernism)

Besser--CIR 2000, 5/5/00 60

Opportunities--a scenario

- Huge stable online DB of rich media (Prelinger Archives)
- Creators create new works that consist mainly of links to and transitions between pieces of the rich media DB
- Works are not really assembled until run-time
- Securing IP permission may shift from capital-intensive producer to end-user
- Economics of media production may change drastically
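
The scenario of works that are "not really assembled until run-time" can be sketched as a list of references into a stable repository, resolved only when the work is played. Everything below is hypothetical: the repository is a plain dictionary of frame numbers standing in for media, the identifiers are invented, and nothing here reflects an actual Prelinger Archives interface.

```python
# Stand-in repository: persistent ID -> list of frames (integers as placeholders).
REPOSITORY = {
    "id-a": list(range(0, 100)),
    "id-b": list(range(200, 300)),
}

# A "work" holds no media, only (persistent ID, start frame, end frame) references.
WORK = [
    ("id-a", 10, 13),
    ("id-b", 0, 2),
]


def assemble(work, repository):
    """Resolve each reference against the repository and concatenate the pieces.

    Raises KeyError if a persistent ID no longer resolves, which is why the
    scenario depends on a stable database and persistent IDs.
    """
    frames = []
    for persistent_id, start, end in work:
        source = repository[persistent_id]
        frames.extend(source[start:end])
    return frames


playback = assemble(WORK, REPOSITORY)
```

Because the work is only a few tuples, rights could in principle be checked per reference at play time, which is one way to picture permission shifting toward the end-user.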

Besser--CIR 2000, 5/5/00 61
