LIRICS mid-term review 1

WP1.3 - Quality Assessment

Dr Lee Gillam; Neil Newbold

University of Surrey
23rd May 2006


Overview

Task: to introduce, complementary to the ISO process of standards production, specific quality assessment steps

Vision: Longer term, a content management system for developing standards is envisaged, although this is beyond the scope of LIRICS

Methodology: evaluate the efficacy of UniS’ content analysis applications (System Quirk), developed in prior research including EU co-funded projects, to analyse language consistency and document coherence, identify unexplained terminology, and generate “understandability” metrics

Definition: Quality (ISO 9000): degree to which a set of inherent characteristics fulfils requirements


Overview

Structure of an ISO standard:
  ISO boilerplate, contents, foreword
  Introduction
  Scope
  Normative References: “The following referenced documents are indispensable for the application of this document.” - cross-document understanding
  Terms and Definitions, e.g. “3.2 code table: table of code elements (3.4) as part of a code” - may inherit from other documents
  [Content]


Overview

ISO Comments


Lines of Inquiry

Readability metrics: average word length; average sentence length; number of multiword expressions; reading age; ...

Clarity of message: Plain English Campaign; Simplified English (AECMA)

Terminology use: number of known terms; number of unknown terms

Automation? Hypermedia?
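The readability metrics listed above can be sketched in a few lines of Python. This is a minimal illustration, not the System Quirk implementation: the function name `readability_metrics` is hypothetical, the sentence and word splitting is deliberately crude, and the Automated Readability Index (ARI) is used as a stand-in for "reading age" because the slides do not name a specific formula.

```python
import re


def readability_metrics(text):
    """Return simple readability figures for a piece of text.

    Assumptions (not from the slides): sentences end at '.', '!' or '?',
    words are runs of letters or digits, and ARI approximates reading level.
    """
    # Crude sentence split; abbreviations such as "e.g." will be over-split.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z0-9]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")

    avg_word_length = sum(len(w) for w in words) / len(words)
    avg_sentence_length = len(words) / len(sentences)

    # Automated Readability Index: a US school grade level estimate
    # based on characters per word and words per sentence.
    ari = 4.71 * avg_word_length + 0.5 * avg_sentence_length - 21.43

    return {
        "avg_word_length": avg_word_length,
        "avg_sentence_length": avg_sentence_length,
        "ari": ari,
    }


if __name__ == "__main__":
    sample = (
        "The following referenced documents are indispensable "
        "for the application of this document."
    )
    for name, value in readability_metrics(sample).items():
        print(f"{name}: {value:.2f}")
```

Counting multiword expressions and known or unknown terms needs a term list, so it is left out of this sketch.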


Plain English Campaign

Advantages of plain English:
  Faster to write
  Faster to read
  You get your message across more often and in a friendlier way

Main ways to make writing clearer:
  Keep your sentences short
  Prefer active verbs
  Choose words appropriate for the reader
  Use positive language

The A to Z of alternative words


Plain English Campaign

Golden Bull Awards

Australian Taxation Office for its Goods and Services legislation:

‘For the purpose of making a declaration under this Subdivision, the Commissioner may:
a) treat a particular event that actually happened as not having happened; and
b) treat a particular event that did not actually happen as having happened and, if appropriate, treat the event as:
   i) having happened at a particular time; and
   ii) having involved particular action by a particular entity; and
c) treat a particular event that actually happened as:
   i) having happened at a time different from the time it actually happened; or
   ii) having involved particular action by a particular entity (whether or not the event actually involved any action by that entity).’


Plain English Campaign

Golden Bull Awards

Wanadoo for 'Wireless and Talk' terms and conditions

‘The failure to exercise or delay in exercising a right or remedy under this Agreement shall not constitute a waiver of the right or remedy or a waiver of any other rights or remedies and no single or partial exercise of any right or remedy under this Agreement shall prevent any further exercise of the right or remedy or the exercise of any other right or remedy. The rights and remedies contained in this Agreement are cumulative and not exclusive of any rights or remedies provided by law.’

A reorganisation announcement by Marconi's EMEA (Europe, Middle East, Africa and Australasia) division

'The benefit of having dedicated subject matter experts who are able to evangelise the attributes and business imperatives of their products is starting to bear fruit.'


Take Sheffield’s GATE

Implement Plain English lookup - prior efforts: MHA

Integrate with POS analysis

New draft of ISO 704: 1715 possible Plain English replacements identified

Manual evaluation required

Move towards automation
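A Plain English lookup of this kind can be sketched as a phrase-table search. The sketch below is an assumption-laden illustration only: it does not use GATE or POS analysis, the function name `suggest_replacements` is hypothetical, and the four entries in `ALTERNATIVES` are illustrative samples in the style of an alternative-words list, not the Campaign's actual A to Z.

```python
import re

# Illustrative sample entries only (hypothetical table, not the real A to Z).
ALTERNATIVES = {
    "in order to": "to",
    "prior to": "before",
    "commence": "start",
    "utilise": "use",
}


def suggest_replacements(text, alternatives=ALTERNATIVES):
    """Return (offset, matched text, suggested alternative) tuples.

    Matching is case-insensitive and restricted to whole words or phrases.
    Nothing is replaced automatically: each hit is only a candidate that
    still needs manual evaluation.
    """
    suggestions = []
    for phrase, plain in alternatives.items():
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            suggestions.append((match.start(), match.group(0), plain))
    # Report candidates in document order.
    return sorted(suggestions)


if __name__ == "__main__":
    draft = "Prior to testing, utilise the form in order to commence."
    for offset, found, plain in suggest_replacements(draft):
        print(f"{offset}: '{found}' -> '{plain}'")
```

Returning candidates rather than rewriting the text reflects the manual evaluation step noted above; a POS tagger could later filter hits where the matched word is used in a different sense.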


Automation

From words to patterns - “learning”


Known Terms


Incorporate known terms - ISO 16642; ISO 12620

“preferred term”? “deprecated”?

Overlaps between known terms and Plain English?

Explore “principle of substitutability” and hyperlinking
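The known-term check and the "preferred" / "deprecated" distinction can be sketched as a lookup against a small term base. Everything here is hypothetical: the `TermEntry` structure, the status strings and the sample entries are assumptions for illustration and are not taken from ISO 16642 or ISO 12620 (only "code table" appears in the slides, and "code list" is marked deprecated purely as an example).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TermEntry:
    term: str
    status: str                      # "preferred" or "deprecated" (assumed labels)
    preferred: Optional[str] = None  # substitute to offer for a deprecated term


# Hypothetical term base keyed by lower-cased term.
TERMBASE = {
    "code table": TermEntry("code table", "preferred"),
    "code element": TermEntry("code element", "preferred"),
    "code list": TermEntry("code list", "deprecated", preferred="code table"),
}


def check_terms(candidates, termbase=TERMBASE):
    """Sort candidate terms into known and unknown, and flag deprecated ones.

    For each deprecated term the preferred term is returned alongside it,
    as a candidate substitution in the spirit of the substitutability idea.
    """
    known, unknown, deprecated = [], [], []
    for candidate in candidates:
        entry = termbase.get(candidate.lower())
        if entry is None:
            unknown.append(candidate)
            continue
        known.append(candidate)
        if entry.status == "deprecated":
            deprecated.append((candidate, entry.preferred))
    return {"known": known, "unknown": unknown, "deprecated": deprecated}


if __name__ == "__main__":
    report = check_terms(["code table", "Code list", "language tag"])
    print("known terms:", len(report["known"]))
    print("unknown terms:", len(report["unknown"]))
    print("deprecated:", report["deprecated"])
```

The counts of known and unknown terms feed the terminology-use metrics, and each known term is a natural target for a hyperlink to its definition.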


Summary

Vision: Longer term, a content management system for developing standards is envisaged, although this is beyond the scope of LIRICS …. or is it?

Expanded methodology: Integration of existing systems and standards - GATE, System Quirk components, ISO 16642, ISO 12620 … linguistic annotation / lexical markup? - “Eat our own dog food”; “Drink our own Champagne”

Degree of automation: planned integration and evaluation of System Quirk components for automatic keyword analysis, ontology learning and indicative text summarisation. Significant evaluation and further development required.

Exploitation: Results useful to the standards community at large? To document authoring in general?


Recent Dissemination

Selection:

Lee Gillam, Debbie Garside, Chris Cox. (2006) "Information volumes and linguistic diversity: meeting the challenges for content management". 3rd International Conference on Terminology, Standardization and Technology Transfer, 25-26 August, Beijing, PRC. Accepted.

Khurshid Ahmad, Lee Gillam and David Cheng. (2006) Sentiments on a Grid: Analysis of Streaming News and Views. Proc. of 5th Intl. Conf. on Language Resources and Evaluation (LREC).

Lee Gillam and Khurshid Ahmad. (2006) Financial data tombs and nurseries: A grid-based text and ontological analysis. Proc. of 1st Intl. Workshop on Grid Technology for Financial Modeling and Simulation (Grid in Finance 2006). See http://www.gridinfinance.org/ for details

Lee Gillam. Sentiment Analysis and Financial Grids: presentation at the UK National Centre for Text Mining’s workshop on Bridging quantitative and qualitative methods for social sciences using text mining techniques, 28 April 2006.

Lee Gillam. No Place for Sentiments?: forthcoming Access Grid Seminar presentation for the UK’s National Centre for e-Social Science, 8 June 2006

Related activities:
  ISO 639-6: Committee Draft (CD) accepted; move towards Draft International Standard (DIS)
  ISO 639-4: Description of the Language Documentation Interchange Format (LDIF)