Noriko Kando, National Institute of Informatics. Presented at: Roadmap for Language Resources and Evaluation in a Multilingual Environment, organised by COCOSDA and WRITE, Genoa, 28 May 2006, in conjunction with LREC 2006. Roadmap for Language Resources in the viewpoint from Information Access Technology Evaluation


Page 1

Noriko Kando
National Institute of Informatics

Presented at:
Roadmap for Language Resources and Evaluation in a Multilingual Environment
Organised by COCOSDA and WRITE
Genoa, 28 May 2006
In Conjunction with LREC 2006

Roadmap for Language Resources in the viewpoint from Information Access Technology Evaluation

Page 2

Issues

• LR and Information Access
• Multi-linguality across Culture
• Emerging Areas: Genres, Opinion, Subjectivity, Community-based Ontology

Page 3

LR and Information Access

• Information Access (IA) technologies (Information Retrieval, Question Answering, Summarization, Text Mining, etc.) need better LR: coverage, richness, quality.
– Evaluation
– Development

Ex. The AQUAINT (Advanced QA) Program has supported resources (WordNet, gazetteers, etc.), component modules, and QA systems.

Page 4

LR and Information Access (cont’d)

• Extrinsic (Task-based) LR Evaluation
– So far, LR evaluation has emphasized intrinsic evaluation, e.g. accuracy, consistency, standards, etc.
– Extrinsic LR evaluation: how much does the LR improve the effectiveness of IA technologies?
– A good way to convey the social importance of LR: easy for non-experts and sponsors to understand.
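
The extrinsic question on this slide, how much an LR improves IA effectiveness, can be made concrete by scoring the same retrieval system with and without the resource. The sketch below is illustrative only and not from the talk: it assumes mean average precision (MAP) as the effectiveness measure, and the function names, topic IDs, document IDs, and rankings are all hypothetical.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against a set of relevant doc IDs."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant document
    return precision_sum / len(relevant)


def mean_average_precision(run: Dict[str, List[str]],
                           qrels: Dict[str, Set[str]]) -> float:
    """Mean of average precision over all judged topics."""
    scores = [average_precision(run.get(topic, []), rel)
              for topic, rel in qrels.items()]
    return sum(scores) / len(scores) if scores else 0.0


def extrinsic_gain(baseline: Dict[str, List[str]],
                   with_lr: Dict[str, List[str]],
                   qrels: Dict[str, Set[str]]) -> float:
    """Difference in MAP attributable to plugging the LR into the system."""
    return (mean_average_precision(with_lr, qrels)
            - mean_average_precision(baseline, qrels))


# Hypothetical relevance judgments and two runs of the same system:
# one without the LR and one with it (e.g. a thesaurus for query expansion).
qrels = {"q1": {"d1", "d3"}, "q2": {"d2"}}
baseline_run = {"q1": ["d2", "d1", "d3"], "q2": ["d1", "d2"]}
lr_run = {"q1": ["d1", "d3", "d2"], "q2": ["d2", "d1"]}

gain = extrinsic_gain(baseline_run, lr_run, qrels)
print(f"MAP gain from the LR: {gain:.3f}")
```

A single number of this kind ("the resource raised MAP by X") is the sort of result that non-experts and sponsors can read directly, whereas intrinsic figures such as annotation consistency need more explanation.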

Page 5

LR and Information Access (cont’d)

• LR can be enriched or created through usage in IA systems
Ex. Search engine query logs
Users’ relevance judgments, click-through, etc.
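
As one illustration of creating an LR from usage data, click-through records can be aggregated into implicit relevance judgments per query. This is a minimal sketch under stated assumptions, not a method described in the talk: the log format (pairs of query string and clicked document ID), the `min_clicks` threshold, and all names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Set, Tuple


def judgments_from_clicks(log: Iterable[Tuple[str, str]],
                          min_clicks: int = 2) -> Dict[str, Set[str]]:
    """Turn (query, clicked_doc_id) records into implicit relevance judgments.

    A document counts as relevant to a query only if it was clicked at least
    `min_clicks` times for that query, which filters out stray clicks.
    """
    # Normalize queries so that case and surrounding spaces do not split counts.
    counts = Counter((query.strip().lower(), doc_id) for query, doc_id in log)
    judged: Dict[str, Set[str]] = defaultdict(set)
    for (query, doc_id), n in counts.items():
        if n >= min_clicks:
            judged[query].add(doc_id)
    return dict(judged)


# Hypothetical click log.
click_log = [
    ("Tokyo hotel", "d1"),
    ("tokyo hotel", "d1"),
    ("tokyo hotel", "d2"),
    ("kyoto", "d3"),
]
implicit_qrels = judgments_from_clicks(click_log)
print(implicit_qrels)
```

Judgments derived this way are noisy, so in practice they would complement rather than replace explicit relevance judgments.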

Page 6

Multi-linguality

• Axes to characterize CLIA systems
– Languages
– Type of media
– Tasks and users
– Success criteria or relevance judgments
– Document genres
– Layers of CLIR technologies
– Information access process

[Kando 2002; Gey, Kando & Peters 2002]

Page 7

Layers of Cross-Lingual IA Technologies

pragmatic layer: cultural & social aspects
semantic layer: concept mapping
syntactic layer:
lexical layer: language identification, indexing
symbol layer: character codes
physical layer: network

[Kando 1999; 2002; Gey, Kando & Peters 2002]

Page 8

Multi-linguality in Pragmatic layer

• Pragmatic layer of CLIA technologies
– includes issues related to text structure and to intra- and inter-text relationships
– identifying differences in viewpoint across languages or cultures is also critical.

Page 9

Emerging Areas

• Especially in conjunction with the Web:
– Heterogeneous document genres
– Subjectivity, opinion, emotion, etc.
– Community-based or domain-specific ontology
– Multi-faceted ontology
– Interactivity
– Multi-modal

Page 10

Summary

• LR and Information Access
• Multi-linguality across Culture
• Emerging Areas: Genres, Opinion, Subjectivity, Community-based Ontology

Page 11

Thanks  Merci  Danke schön  Grazie  Gracias  Ta!  Tack  Köszönöm  Kiitos  Terima Kasih  Khap Khun  Ahsante  Tak  謝謝  ありがとう
