automatic metadata generation charles duncan

Findings of the Automatic Metadata Generation Use Cases project Charles [email protected]

Find and Seek

• Hide and Seek?

What is metadata?

• Anything that aids the discovery and discrimination of resources

[Figure: a scatter of question marks labelled “known unknowns”, “unknown unknowns” and “too many results”]

Purposes of metadata

• Discovery (known unknowns)

• Discrimination (too many results)

• Recommendation (unknown unknowns)

Simplistic View

[Diagram labels: Information Archive; Deposit; Discovery/Curation; Metadata Generation; Metadata Use]

Closer to reality

[Diagram labels: Information Archive; Other Information Archives (shown three times); Deposit; Discovery; Discrimination; Recommendation; Metadata Generation; Just-in-case; Just-in-time; Metadata Use]

In-case v. In-time

Just-in-case

• For: Efficient if metadata is created once and used many times
• Against: May create and store metadata which might never be used

Just-in-time

• For: Allows great flexibility for new applications
• Against: Could require unreasonable processing times for a real-time service
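The trade-off above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `word_count` is a hypothetical stand-in for any automatic metadata service, and the two store classes are invented for this example rather than taken from the project.

```python
from typing import Callable, Dict


def word_count(text: str) -> int:
    """Hypothetical metadata generator: stands in for any automatic service."""
    return len(text.split())


class JustInCaseStore:
    """Generate metadata once, at deposit time, and store the result."""

    def __init__(self, generator: Callable[[str], int]) -> None:
        self.generator = generator
        self.metadata: Dict[str, int] = {}
        self.calls = 0  # how many times the generator has run

    def deposit(self, resource_id: str, text: str) -> None:
        self.calls += 1
        self.metadata[resource_id] = self.generator(text)

    def lookup(self, resource_id: str) -> int:
        # No processing at request time: the stored value is returned.
        return self.metadata[resource_id]


class JustInTimeStore:
    """Store only the resource; generate metadata whenever it is requested."""

    def __init__(self, generator: Callable[[str], int]) -> None:
        self.generator = generator
        self.resources: Dict[str, str] = {}
        self.calls = 0  # how many times the generator has run

    def deposit(self, resource_id: str, text: str) -> None:
        self.resources[resource_id] = text

    def lookup(self, resource_id: str) -> int:
        # Processing happens on every request, so a new generator can be
        # swapped in at any time, at the price of request-time cost.
        self.calls += 1
        return self.generator(self.resources[resource_id])
```

With one deposit and three lookups, the just-in-case store runs the generator once and the just-in-time store runs it three times. A resource that is deposited but never looked up costs the just-in-case store one wasted run.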

Use case - student

• Student on a history course gets a reading list from the VLE. The student selects an article and is offered additional information about geographic locations and historical characters mentioned in the article, a list of other articles by the same author that have the same highly ranked keywords, other articles that commonly appear on reading lists with this article, and books borrowed by students of a similar profile with matched keywords.
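One part of this use case, "other articles that commonly appear on reading lists with this article", could be approximated by counting co-occurrence across reading lists. The sketch below assumes each reading list is available as a set of article identifiers; the function name `co_listed` and the data shape are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def co_listed(article: str, reading_lists: Iterable[Set[str]]) -> List[Tuple[str, int]]:
    """Rank other articles by how many reading lists they share with `article`."""
    counts: Counter = Counter()
    for items in reading_lists:
        if article in items:
            # Every other item on this list co-occurs with the article once.
            counts.update(items - {article})
    # Most frequent first; ties broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
```

For example, with the lists `{"A", "B", "C"}`, `{"A", "B"}`, `{"B", "D"}` and `{"A", "D"}`, article `"A"` shares two lists with `"B"` and one each with `"C"` and `"D"`.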

Use case - depositor

• Researcher submits a paper for deposit in a repository. The PDF is analysed, and keywords and a classification are suggested. File type and size are detected. Author and journal names are detected, checked and disambiguated against an authoritative source. Page numbers and date of publication are extracted. All these metadata fields are completed automatically. Depositor puts the paper into two “collections”. All references are identified and related to this paper.
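The simplest step in this use case, detecting file type and size, can be sketched with the Python standard library alone. The function name `basic_facts` and the field names are assumptions made for this example, and guessing the type from the file extension is a simplification: a real service might inspect the file contents instead.

```python
import mimetypes
import os
from datetime import date
from typing import Dict


def basic_facts(path: str, depositor: str) -> Dict[str, object]:
    """Collect facts that need no human input (field names are illustrative)."""
    # guess_type works from the file extension only; it returns
    # (type, encoding), and type is None when the extension is unknown.
    file_type, _encoding = mimetypes.guess_type(path)
    return {
        "file_name": os.path.basename(path),
        "file_type": file_type,
        "file_size": os.path.getsize(path),  # size in bytes
        "depositor": depositor,
        "deposit_date": date.today().isoformat(),
    }
```

Keywords, author names and references need content analysis and authority lookups, which this sketch does not attempt.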

Types of automatic metadata gen

• Automatic recognition and extraction services
– Key word extraction
– Automatic classification
– Basic facts (date, depositor, file type, file size, etc.)

• Authoritative source services
– Name authority
– Journal title authority

• Translation services
– Conversion between metadata schemas
– Conversion between languages

• Metadata quality validation services
– Harvesting and validating metadata

• Activity aggregation services
– Usage (reading lists, library borrowing, search terms)

• Relationship services
– User-created collections
– Automatic term relation mapping (with “strength”)
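As an illustration of the first category, keyword extraction can be approximated by ranking terms by frequency. This is a deliberately naive sketch, not the method used by any tool the project reviewed: the stopword list, the minimum word length and the function name `suggest_keywords` are all assumptions.

```python
import re
from collections import Counter
from typing import List

# A tiny illustrative stopword list; a real service would use a fuller one.
STOPWORDS = frozenset(
    {"the", "a", "an", "and", "of", "in", "to", "is", "for", "on", "that", "with"}
)


def suggest_keywords(text: str, limit: int = 3) -> List[str]:
    """Suggest the most frequent non-stopword terms as candidate keywords."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    # Most frequent first; ties broken alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [word for word, _count in ranked[:limit]]
```

Suggestions like these would still be offered to the depositor for confirmation, as in the depositor use case.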

Types of automatic metadata gen

Just-in-case

Just-in-time

• Automatic recognition and extraction services
• Authoritative source services
• Translation services
• Metadata quality validation services
• Activity aggregation services
• Relationship services

Reports

• Synthesis Report (Use Cases)
• Guidance Report (Tools)
• Recommendations Report (JISC only)
• Specialist reports
– Subject metadata
– Name metadata
– Geospatial metadata
– Factual metadata
– Bibliographic metadata
– Usage metadata
– File format metadata
– Integrating automatic metadata services

http://www.intrallect.com/index.php/intrallect/knowledge_base/research_projects/automatic_metadata_generation_use_cases
