Team Lightning presents: LAPL Photo Collection: A case study in Information Retrieval. Presented December 8th, 2009 by Dalena Hunter, Michael Mocciaro, Shelly Ray, Dan Schell, Chris Salvano, Teresa Soleau.

Page 1: Is276 Final Presentation

Team Lightning presents:

LAPL Photo Collection: A case study in Information Retrieval

Presented December 8th, 2009 by Dalena Hunter, Michael Mocciaro, Shelly Ray, Dan Schell, Chris Salvano, Teresa Soleau

Team Lightning: LAPL Photo Collection

Page 2: Is276 Final Presentation

LAPL Photo Collection: A case study in Information Retrieval

I. Background: About the photo collection and the system that organizes it.

II. Problem Statement: Three specific information retrieval problems and solutions:

1. Sessions timing out

2. Ranking search results

3. Interface issues

III. Going forward …

Page 3: Is276 Final Presentation

LAPL Photo Collection

Background on the collection: Materials and System

Page 4: Is276 Final Presentation

Background: What is the Los Angeles Public Library Photo Collection ?

Consists of: the Herald Examiner photo collection, Shades of LA, and the Security Pacific National Bank Collection.

The Security Pacific National Bank Collection comprises eight sub-collections:

a. Los Angeles Chamber of Commerce Collection;

b. Turn of the Century Los Angeles;

c. Hollywood Citizen News/Valley Times Newspaper Collection;

d. Central Library’s Historical California Photographs;

e. Portrait Collection;

f. Federal Writers Project;

g. Ralph Morris Archives;

h. William Reagh Collection

Page 5: Is276 Final Presentation

Background: The collection and the system

Collection is part of LAPL’s online catalog

Items are described using the MARC metadata schema

This results in truncated keyword search results.

Rich indexing and descriptive elements are only available to staff working with the items themselves.

Page 6: Is276 Final Presentation

Background: System constraints

IT department is stretched thin and unable to devote time to backend or UI capability issues.

Only one photo archivist working on the project

Processing memory is limited

Results in system crashes (on a weekly basis) and timeouts

This may affect any attempt to add information or functionality to the system.

Page 7: Is276 Final Presentation

LAPL Photo Collection

Problem Statement

Page 8: Is276 Final Presentation

Problem Statement

What are the impediments to good information retrieval?

Lots of them …

1. Session timeouts

2. Ranking of search results

3. User interface

Page 9: Is276 Final Presentation

LAPL Photo Collection

Problem #1: Session Timeouts

Page 10: Is276 Final Presentation

Problem: Session timeouts

Users are interrupted by a message that their session has “timed out”

A major disruption

When did they “time in”?

We suggest: Remove the automated time out feature and allow users to perform more elaborate, linked searches.

Page 11: Is276 Final Presentation

Problem: Session timeouts

Eliminating timeouts is #1 recommendation

This will enhance information retrieval by:

Allowing users to progress further in their search in the course of a session

Allowing for the addition of greater user interface capabilities, such as a "View Personal List" feature

Acting as a form of search memory, so that users do not have to remember or record their past searches
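
The "search memory" idea can be pictured with a minimal sketch. The `SearchSession` class, its method names, and the record identifier below are all hypothetical and are not part of the LAPL system; the sketch only assumes a session that keeps its state for as long as the user needs it.

```python
from dataclasses import dataclass, field


@dataclass
class SearchSession:
    """Hypothetical per-user session that does not expire on its own."""
    history: list = field(default_factory=list)        # past queries, oldest first
    personal_list: list = field(default_factory=list)  # saved record identifiers

    def search(self, query: str) -> None:
        # Remember every query so the user does not have to write it down.
        self.history.append(query)

    def save(self, record_id: str) -> None:
        # A "View Personal List" feature would read from this list.
        if record_id not in self.personal_list:
            self.personal_list.append(record_id)


session = SearchSession()
session.search("airport")
session.search("Raymond Chandler")
session.save("record-379")
session.save("record-379")  # saving the same record twice keeps one entry
```

With timeouts in place, both lists would be lost whenever the session expired, which is why the recommendation ties the personal list to removing the timeout.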

Page 12: Is276 Final Presentation

LAPL Photo Collection

Problem #2: Ranking search results

Page 13: Is276 Final Presentation

Problem: Ranking Search Results

The current ranking system (keyword searching):

Keyword search picks up hits in all descriptive fields of a photo’s metadata record

Favors “Subject” and “Summary,” often to the detriment of good recall and precision

Page 14: Is276 Final Presentation

Problem: Ranking Search Results

Example 1: “Airport” as keyword search

Page 15: Is276 Final Presentation

Comparative analysis of “Airport” returns: Records #1 and #379

Problem: Ranking Search Results

Page 16: Is276 Final Presentation

Problem: Ranking Search Results

Example 2: “Raymond Chandler” as keyword search

Page 17: Is276 Final Presentation

Comparative analysis of “Raymond Chandler” returns: Records #1 and #6

Problem: Ranking Search Results

Page 18: Is276 Final Presentation

What’s going on here?

A keyword search favors the “Summary” and “Subject” fields and sorts returned photos by reverse chronological order

Therefore, a photo with 1 “airport” hit in the “Summary” or “Subject” fields and a photo date will be returned ahead of a photo with 3 “airport” hits that does not have a photograph date (n.d.)
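
The behavior described above can be sketched roughly as follows. This is an illustration only, not the catalog's actual code: the `Photo` type is hypothetical, and placing undated photos last is an assumption inferred from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Photo:
    title: str
    year: Optional[int]   # None stands for an undated photo ("n.d.")
    keyword_hits: int     # how many times the query matched the record


def current_order(photos):
    """Newest first, undated photos last; the hit count is never consulted."""
    return sorted(photos, key=lambda p: (p.year is None, -(p.year or 0)))


photos = [
    Photo("Los Angeles International Airport", None, 3),
    Photo("George W. Bush", 1999, 1),
]
ordered = current_order(photos)
print([p.title for p in ordered])
```

The dated photo with a single hit sorts ahead of the undated photo with three hits, which is the problem the slide describes.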

How can Team Lightning bring some rationality to a keyword search?

Page 19: Is276 Final Presentation

Behold, the proposed ranking system…

Metadata Element | Metadata Value | Point Value
Click for Images: | Direct link to photo | --
Title(s): | Title of photograph | 3
Photographer: | Name of photographer | 1
Order Number: | Control number for ordering purposes | --
Filing Information: | Filing box location / name | 1
Publisher: | Date of photograph | --
Description: | Item’s physical description | --
Series: | Associated Series Name (Name files) | 1
Notes: | LAPL control number | --
Summary: | Photo description | 1
Subjects: | Controlled vocabulary (LCSH) | 2
Other Entries: | Other entry names associated with item | 2
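
One way to read the point table is as a per-field weight that is added whenever the query appears in that field. The sketch below takes that reading as an assumption: the `POINTS` dictionary and `score` function are illustrative names, each matching field is counted once, and the slides do not say whether repeated hits inside one field earn extra points.

```python
# Weights copied from the proposed table; fields marked "--" carry no points.
POINTS = {
    "Title(s)": 3,
    "Photographer": 1,
    "Filing Information": 1,
    "Series": 1,
    "Summary": 1,
    "Subjects": 2,
    "Other Entries": 2,
}


def score(record: dict, query: str) -> int:
    """Sum the weights of every weighted field that contains the query."""
    q = query.lower()
    return sum(
        pts
        for field_name, pts in POINTS.items()
        if q in record.get(field_name, "").lower()
    )


bush = {
    "Title(s)": "George W. Bush [graphic]",
    "Photographer": "Leonard, Gary",
    "Filing Information": "Portraits-Bush, George W.",
    "Summary": "Closeup view of George W. Bush, Republican presidential "
               "candidate, taken at the Los Angeles International Airport.",
    "Subjects": "Los Angeles International Airport; "
                "Airports--California--Los Angeles",
}
print(score(bush, "airport"))  # 3: Summary (1) + Subjects (2)
```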

Page 20: Is276 Final Presentation

The “Airport” example using Team Lightning’s Relevancy Ranking:

RECORD #1

Elements | Metadata Value | Point Value
Click for Images: | Link | --
Title(s): | George W. Bush [graphic] | --
Photographer: | Leonard, Gary | --
Filing Information: | Portraits-Bush, George W. | --
Publisher: | 1999 | --
Description: | 1 photograph : b&w | --
Summary: | Closeup view of George W. Bush, Republican presidential candidate, taken at the Los Angeles International Airport. Photo dated: September 1, 1999. | 1
Subjects: | Bush, George W. (George Walker), 1946-; Los Angeles International Airport; Presidential candidates--United States; Airports--California--Los Angeles; Westchester (Los Angeles, Calif.) | 2

Total Point Value = 3

Page 21: Is276 Final Presentation

The “Airport” example using Team Lightning’s Relevancy Ranking

RECORD #379

Elements | Metadata Value | Point Value
Click for Images: | Link | --
Title(s): | Los Angeles International Airport [graphic] | 3
Filing Information: | S-002-348.3 4x5 Transportation-Aviation-Airports-L.A. International Airport. | 1
Publisher: | [n.d.] | --
Description: | 1 photograph : b&w | --
Summary: | Aerial view of Los Angeles International Airport and surrounding area. | 2
Subjects: | Los Angeles International Airport and surrounding area; Aerial views; Airports--California--Los Angeles; Westchester (Los Angeles, Calif.) | 2

Total Point Value = 8

Analysis: This photo should appear before the photo of George W. Bush when doing a keyword search for “Airport”

Page 22: Is276 Final Presentation

The “Raymond Chandler” example using TL’s Relevancy Ranking

RECORD #1

Elements | Metadata Value | Point Value
Click for Images: | Link | --
Title(s): | Appian Way Apartments | --
Photographer: | Solomon, Cliff | --
Filing Information: | HE Box Raymond Chandler | 1
Publisher: | 1986 | --
Description: | 1 photograph : b&w | --
Series: | Herald Examiner Collection | --
Summary: | Front view of the Appian Way Apartments with windows and trim in need of a paint job. Possibly used for location shooting in Robert Altman's version of "The Long Goodbye". Photo dated: Jul. 18, 1986. | --
Subjects: | Marlowe, Philip (Fictitious character); Apartment houses--California--Los Angeles; Motion picture locations | --
Other Entries: | Altman, Robert; Chandler, Raymond | 2

Total Point Value = 3

Page 23: Is276 Final Presentation

The “Raymond Chandler” example using TL’s Relevancy Ranking

RECORD #6

Elements | Metadata Value | Point Value
Click for Images: | Link | --
Title(s): | Raymond Chandler [graphic] | 3
Filing Information: | HE Box… | --
Publisher: | 1939 | --
Description: | 1 photograph : b&w | --
Series: | 8389 Chandler, Raymond | 1
Summary: | Novelist Raymond Chandler in 1939 | 2
Subjects: | Chandler, Raymond, 1888-1959; Authors | 2

Total Point Value = 8

Analysis: Though photographs of filming locations of “The Long Goodbye” may be useful for a user, photos of Raymond Chandler should appear first in a search for “Raymond Chandler”

Page 24: Is276 Final Presentation

Problem: Ranking Search Results

Final Analysis:

Incorporating a metadata “point” system can help improve recall and precision (within a keyword search)

Search results should be based on content across all fields, irrespective of reverse chronological order
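
As a rough sketch of what ordering by points rather than by date would look like, the snippet below sorts the two “Airport” records by the totals given on the earlier slides. The dictionary layout and the `rank_by_points` name are assumptions made for illustration.

```python
# Totals taken from the "Airport" worked examples (Records #1 and #379).
results = [
    {"record": 1, "title": "George W. Bush [graphic]",
     "year": 1999, "points": 3},
    {"record": 379, "title": "Los Angeles International Airport [graphic]",
     "year": None, "points": 8},
]


def rank_by_points(results):
    """Highest total first; the photo date plays no part in the ordering."""
    return sorted(results, key=lambda r: r["points"], reverse=True)


ranked = rank_by_points(results)
print([r["record"] for r in ranked])  # [379, 1]
```

Because `sorted` is stable, records with equal totals keep their incoming order, so a date-based order could still serve as a tie-breaker if that were wanted.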

LAPL won’t fool me twice

Page 25: Is276 Final Presentation

LAPL Photo Collection

Problem #3: Interface issues

Page 26: Is276 Final Presentation

User Interface: Revised Main Search Screen

New Search Options

Subject Browse By Letter

Simplified Year Limit Options
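
The two options above could be backed by something as simple as the following sketch. The function names and the record layout are hypothetical; the subject headings and years are taken from the example records elsewhere in this presentation.

```python
from collections import defaultdict


def browse_by_letter(subjects):
    """Group distinct subject headings under their first letter."""
    index = defaultdict(list)
    for heading in sorted(set(subjects)):
        if heading:  # skip empty headings
            index[heading[0].upper()].append(heading)
    return dict(index)


def within_years(records, start=None, end=None):
    """Keep dated records whose year falls inside the optional limits."""
    return [
        r for r in records
        if r["year"] is not None
        and (start is None or r["year"] >= start)
        and (end is None or r["year"] <= end)
    ]


subjects = ["Authors", "Aerial views", "Motion picture locations",
            "Airports--California--Los Angeles"]
index = browse_by_letter(subjects)

records = [{"title": "Raymond Chandler", "year": 1939},
           {"title": "Appian Way Apartments", "year": 1986},
           {"title": "George W. Bush", "year": 1999},
           {"title": "Los Angeles International Airport", "year": None}]
recent = within_years(records, start=1980, end=1999)
```

Note that this sketch drops undated records whenever a year limit is applied; whether the revised screen should do the same is a design decision the slide leaves open.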

Page 27: Is276 Final Presentation

User Interface: Revised Advanced Search Screen

Advanced Search Options

Added Year Options

Added Boolean search options
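
To illustrate what Boolean options add to a keyword search, here is a minimal sketch over a tiny inverted index. The record keys (`bush`, `lax`, `chandler`) and function names are hypothetical; the summaries are shortened from the example records in this presentation.

```python
def build_index(records):
    """Map each lowercased word to the set of record keys containing it."""
    index = {}
    for key, text in records.items():
        for word in text.lower().split():
            index.setdefault(word.strip('.,;:"()'), set()).add(key)
    return index


def search_and(index, a, b):
    return index.get(a, set()) & index.get(b, set())


def search_or(index, a, b):
    return index.get(a, set()) | index.get(b, set())


def search_not(index, a, b):
    """Records matching a but not b."""
    return index.get(a, set()) - index.get(b, set())


records = {
    "bush": "Closeup view of George W. Bush taken at the "
            "Los Angeles International Airport.",
    "lax": "Aerial view of Los Angeles International Airport "
           "and surrounding area.",
    "chandler": "Novelist Raymond Chandler in 1939",
}
index = build_index(records)
print(sorted(search_not(index, "airport", "aerial")))  # ['bush']
```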

Page 28: Is276 Final Presentation

User Interface:

LAPL Results Screen

Page 29: Is276 Final Presentation

User Interface:

Google Life Results Screen

Page 30: Is276 Final Presentation

User Interface:

LAPL item listing

Very small image on initial record

Detailed summary provided

Can browse by Subject

Page 31: Is276 Final Presentation

User Interface:

Google Life item listing

Large picture on initial record

Limited metadata provided

Can browse related images

Can browse by “label”

One click to purchase screen

Page 32: Is276 Final Presentation

LAPL Photo Collection

Future enhancements

Conclusions

Page 33: Is276 Final Presentation

Going forward …

Future enhancements we recommend:

Dynamic term suggestion/real-time query expansion
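
In its simplest form, dynamic term suggestion is a prefix lookup against the controlled vocabulary as the user types. The sketch below assumes that reading; the `suggest` function is a hypothetical name, and the vocabulary is a handful of subject headings from the example records.

```python
def suggest(prefix, vocabulary, limit=5):
    """Return up to `limit` vocabulary terms that start with the typed prefix."""
    p = prefix.lower()
    matches = [t for t in sorted(vocabulary, key=str.lower)
               if t.lower().startswith(p)]
    return matches[:limit]


vocabulary = [
    "Airports--California--Los Angeles",
    "Aerial views",
    "Authors",
    "Apartment houses--California--Los Angeles",
    "Motion picture locations",
]
print(suggest("ai", vocabulary))  # ['Airports--California--Los Angeles']
```

A production version would query the catalog's own subject index on each keystroke rather than scan an in-memory list.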

Page 34: Is276 Final Presentation

Going forward …

Future enhancements we recommend:

Cross-walking to Dublin Core for inclusion in an aggregate
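
A crosswalk is essentially a mapping from MARC tags to Dublin Core elements. The sketch below shows the idea under stated assumptions: the `MARC_TO_DC` table is a small illustrative subset, not the complete or authoritative mapping, and the tag-to-value layout of the input record (including the `260c` key for the date subfield and the local `099` field) is hypothetical.

```python
# Illustrative subset of a MARC-to-Dublin-Core mapping (an assumption,
# not the full crosswalk).
MARC_TO_DC = {
    "100": "creator",
    "245": "title",
    "260c": "date",
    "520": "description",
    "600": "subject",
    "650": "subject",
}


def crosswalk(marc_record):
    """Collect values under Dublin Core element names; skip unmapped tags."""
    dc = {}
    for tag, values in marc_record.items():
        element = MARC_TO_DC.get(tag)
        if element is None:
            continue
        dc.setdefault(element, []).extend(values)
    return dc


marc_record = {
    "245": ["Raymond Chandler [graphic]"],
    "260c": ["1939"],
    "520": ["Novelist Raymond Chandler in 1939"],
    "600": ["Chandler, Raymond, 1888-1959"],
    "650": ["Authors"],
    "099": ["local call number (hypothetical)"],
}
dc = crosswalk(marc_record)
```

Several MARC tags collapse into one Dublin Core element here, which is typical of such crosswalks and means some of the finer MARC distinctions are lost in the aggregate.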

Page 35: Is276 Final Presentation

Going forward …

Page 36: Is276 Final Presentation

Going forward …

Page 37: Is276 Final Presentation

Going forward …

Page 38: Is276 Final Presentation

LAPL Photo Collection

Conclusions

Page 39: Is276 Final Presentation

LAPL Photo Collection

Questions??