“Is the Sky Pure Today?” AwkChecker: An Assistive Tool for Detecting and Correcting Collocation Errors

Taehyun Park, Edward Lank, Pascal Poupart, Michael Terry
David R. Cheriton School of Computer Science
University of Waterloo, Waterloo, ON, Canada, N2L 3G1

ACM UIST 2008


Motivation

writing aids for non-native speakers

Non-native speakers can learn a foreign language's rules for spelling and grammar, but it is not easy to learn which words go together.

Ex. take their shoes down vs. take their shoes off (the more common expression)


AwkChecker detects collocation errors and suggests alternatives.


Contributions

Defines collocation errors as a function of the relative frequency of phrase usage within a corpus.

Presents algorithms for suggesting alternatives based on the specific types of errors made by non-native speakers (NNSs):

1. Insertion (I went to home → I went home)
2. Deletion (I am student → I am a student)
3. Transposition (he’s talking with his full mouth → he’s talking with his mouth full)
4. Substitution (pure sky → clear sky)
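The four error types above map naturally onto four word-level edit operations. The sketch below is only an illustration of that idea, not AwkChecker's actual algorithm: the function name `candidate_phrases` and the `vocabulary` argument are hypothetical, and a real system would restrict the vocabulary (for example to articles, prepositions, or synonyms) rather than try every word.

```python
def candidate_phrases(words, vocabulary):
    """Return every phrase that is one word-level edit away from `words`.

    Hypothetical helper: each branch undoes one of the four error types.
    """
    words = list(words)
    candidates = set()
    n = len(words)

    for i in range(n):
        # Insertion error (extra word): try removing the word at position i.
        candidates.add(tuple(words[:i] + words[i + 1:]))
        # Substitution error (wrong word): try each vocabulary word instead.
        for w in vocabulary:
            candidates.add(tuple(words[:i] + [w] + words[i + 1:]))

    for i in range(n + 1):
        # Deletion error (missing word): try adding a vocabulary word at i.
        for w in vocabulary:
            candidates.add(tuple(words[:i] + [w] + words[i:]))

    for i in range(n - 1):
        # Transposition error: try swapping two adjacent words.
        swapped = words[:i] + [words[i + 1], words[i]] + words[i + 2:]
        candidates.add(tuple(swapped))

    # The input phrase itself is not an alternative.
    candidates.discard(tuple(words))
    return candidates


alternatives = candidate_phrases(["pure", "sky"], {"clear", "a"})
print(("clear", "sky") in alternatives)
```

Each candidate would then be looked up in the corpus, so that only alternatives that are actually in common use are suggested.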


Detecting Collocation Errors

The acceptability A(e) of a phrase e is computed from:

g(e): frequency of the input phrase
g(c): frequency of an alternative phrase
f(e,c): edit distance between e and c

If A(e) is less than a user-customizable threshold, the phrase e is flagged as a collocation error.
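The formula for A(e) is not part of this transcript, so the sketch below uses a stand-in. It assumes acceptability is the input phrase's share of the total frequency, with each alternative's frequency discounted by its edit distance. That ratio, the discount `1 / (1 + f(e,c))`, the default threshold, and all function names are assumptions for illustration, not the paper's definition.

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # drop a word from a
                           cur[j - 1] + 1,           # add a word from b
                           prev[j - 1] + (x != y)))  # keep or replace
        prev = cur
    return prev[-1]


def acceptability(e, alternatives, g):
    """Stand-in A(e): share of frequency held by e (assumed form)."""
    g_e = g.get(e, 0)
    # Closer alternatives (small edit distance) count more strongly against e.
    rivals = sum(g.get(c, 0) / (1 + edit_distance(e, c))
                 for c in alternatives if c != e)
    total = g_e + rivals
    # With no corpus evidence at all, do not flag the phrase.
    return 1.0 if total == 0 else g_e / total


def is_collocation_error(e, alternatives, g, threshold=0.1):
    """Flag e when its acceptability falls below the user-set threshold."""
    return acceptability(e, alternatives, g) < threshold


# Toy frequencies, made up for the example.
g = {("pure", "sky"): 2, ("clear", "sky"): 98}
print(is_collocation_error(("pure", "sky"), [("clear", "sky")], g))
```

Raising the threshold makes the checker stricter (more phrases flagged), which is the knob the slide describes as user-customizable.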


Evaluation

- User Testing

five non-native speakers

had never seen such a tool before

positive reactions

used AwkChecker to check articles and prepositions

Ex. pass judgment (to/on) <noun>


Automatic Collocation Suggestion in Academic Writing

Jian-Cheng Wu (1), Yu-Chia Chang (1,*), Teruko Mitamura (2), Jason S. Chang (1)
(1) National Tsing Hua University, Hsinchu, Taiwan
(2) Carnegie Mellon University, Pittsburgh, United States

ACL 2010


Goals

automate suggestions for verb-noun lexical collocations

Verb-noun collocations are recognized as presenting the greatest challenge to students (Howarth, 1996; Liu, 2002).

Within these collocations, the choice of verb is considered the most difficult part for learners to master (Liu, 2002; Chang, 2008).


Collocation Inspector


Algorithm for Producing Suggestions


Collocation Extraction

Ex. We introduce a novel method for learning to find documents on the web.

We proposed that the web-based model would be more effective than corpus-based one.

Use a dependency parser (Stanford Parser)

dobj(introduce-2, method-5)
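A direct-object relation gives the verb and its object noun directly. As a rough sketch of the extraction step, the code below reads typed dependencies that have already been printed in the parser's `relation(governor-index, dependent-index)` text form and keeps only the `dobj` pairs. It does not call the Stanford Parser itself; the function name `verb_noun_pairs` and the sample lines are made up for illustration.

```python
import re

# Matches lines such as "dobj(introduce-2, method-5)"; a space before the
# opening parenthesis is tolerated.
DEP_PATTERN = re.compile(
    r"(\w+)\s*\(\s*(\S+?)-(\d+)\s*,\s*(\S+?)-(\d+)\s*\)"
)


def verb_noun_pairs(dependency_lines):
    """Return (verb, noun) pairs taken from direct-object relations."""
    pairs = []
    for line in dependency_lines:
        match = DEP_PATTERN.fullmatch(line.strip())
        if match and match.group(1) == "dobj":
            verb, noun = match.group(2), match.group(4)
            pairs.append((verb, noun))
    return pairs


parsed = [
    "nsubj(introduce-2, We-1)",
    "dobj(introduce-2, method-5)",
    "amod(method-5, novel-4)",
]
print(verb_noun_pairs(parsed))
```

Pairs collected this way over a whole corpus would form the verb-noun collocation inventory that the classifier is later trained on.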


Using a Classifier for the Suggestion task


Effective Feature Selection

Training algorithm: Maximum Entropy

- Use contextual features (head, n-gram)

Ex: We introduce a novel method for learning to find documents on the web.
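To make the two feature types concrete, here is a minimal sketch of turning one sentence into features for a maximum-entropy classifier. The slide only names "head" and "n-gram"; the details below (masking the verb, a two-word window, unigrams and bigrams, the feature-name strings) are assumptions, not the paper's feature set.

```python
def contextual_features(tokens, verb_index, head_noun, window=2):
    """Build a sparse feature dict for the verb slot at `verb_index`.

    The verb is masked as "<V>" because it is the class to be predicted.
    """
    features = {"head=" + head_noun.lower(): 1}

    masked = [t.lower() for t in tokens]
    masked[verb_index] = "<V>"

    lo = max(0, verb_index - window)
    hi = min(len(masked), verb_index + window + 1)

    for n in (1, 2):  # unigrams and bigrams inside the window
        for i in range(lo, hi - n + 1):
            gram = masked[i:i + n]
            if gram == ["<V>"]:
                continue  # the bare mask carries no information
            features["ngram=" + "_".join(gram)] = 1
    return features


sentence = "We introduce a novel method for learning to find documents on the web ."
print(contextual_features(sentence.split(), verb_index=1, head_noun="method"))
```

Feature dicts like this can be fed to any multinomial logistic regression implementation, which is the same model family as maximum entropy.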


Example

Input:

There are many investigations about wireless network communication, especially it is important to add Internet transfer calculation speeds.

Result


Experiment

Training corpus: CiteSeer (20,306 abstracts, 95,650 sentences)

790 verbal collocates are identified as tagged classes

Test data: 600 randomly selected sentences that do not overlap with the training set.


The YouTube Video Recommendation System

James Davidson, Benjamin Liebald, Junning Liu, Taylor Van Vleet, Palash Nandy

Google Inc.

ACM RecSys 2010


Personalized recommendations based on a user’s previous activity on the site


Goals

Help users find high-quality videos relevant to their interests.

Recommendations should be updated regularly and reflect a user’s recent activity on the site.

Maintain user privacy.


Challenges

Videos as they are uploaded by users often have no or very poor metadata (title, description).

Videos on YouTube are mostly short-form (under 10 minutes in length).

Many of the interesting videos on YouTube have a short life cycle.


System Design

[Figure: system design, with labels: user, seed, steps 1 and 2, top-N candidates.]

Videos are ranked using relevance and diversity.


Input Data (seed)

1. videos that were watched (potentially beyond a certain threshold)

2. videos that were explicitly favorited, “liked”, rated or added to playlists


Related Videos (candidates)

relatedness score of a video vj to a seed video vi, based on:

- total occurrence counts across all sessions for videos vi and vj
- global popularity of videos vi and vj

Threshold: overall view count

Top-N candidates of vi
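The score formula itself is not in this transcript. A minimal sketch, assuming the score is the number of sessions in which both videos appear, divided by the product of the two videos' occurrence counts (one simple way to normalize for global popularity). The function name, the `min_views` parameter, the use of occurrence count as a stand-in for overall view count, and the toy sessions are all assumptions.

```python
from collections import Counter
from itertools import combinations


def related_videos(sessions, top_n=2, min_views=1):
    """Map each video to its top-N related videos by a co-visitation score."""
    views = Counter()     # sessions in which each video occurs
    co_views = Counter()  # sessions in which a pair of videos co-occurs

    for session in sessions:
        unique = sorted(set(session))
        views.update(unique)
        co_views.update(combinations(unique, 2))

    scored = {}
    for (a, b), together in co_views.items():
        # Skip videos seen too rarely for the score to be reliable.
        if views[a] < min_views or views[b] < min_views:
            continue
        score = together / (views[a] * views[b])  # assumed normalization
        scored.setdefault(a, []).append((score, b))
        scored.setdefault(b, []).append((score, a))

    return {
        video: [other for _, other in
                sorted(pairs, key=lambda p: (-p[0], p[1]))[:top_n]]
        for video, pairs in scored.items()
    }


sessions = [["a", "b"], ["a", "b"], ["a", "c"], ["b", "c", "d"]]
print(related_videos(sessions)["a"])
```

Dividing by the product of the counts keeps very popular videos from appearing as "related" to everything simply because they occur in many sessions.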


Generating Recommendation Candidates

Candidate set:

S: seed set
R: related videos
n: distance from any video in the seed set
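The candidate-set formula is missing from the transcript, but the definitions suggest expanding the seed set along related-video links up to distance n. The sketch below does that with a breadth-first expansion; the function name, the `max_distance` parameter, and the choice to exclude the seed videos from the result are assumptions.

```python
def candidate_set(seed, related, max_distance=2):
    """Collect videos reachable from the seed set within `max_distance` hops.

    `related` maps a video to its list of related videos (R in the slide).
    """
    seed = set(seed)
    seen = set(seed)
    frontier = set(seed)

    for _ in range(max_distance):
        next_frontier = set()
        for video in frontier:
            next_frontier.update(related.get(video, ()))
        next_frontier -= seen          # keep only newly reached videos
        if not next_frontier:
            break
        seen |= next_frontier
        frontier = next_frontier

    # Videos the user has already interacted with are not candidates.
    return seen - seed


related = {"s": ["a", "b"], "a": ["c", "s"], "c": ["d"]}
print(sorted(candidate_set(["s"], related, max_distance=2)))
```

A larger distance broadens the recommendations at the cost of relevance, since each hop moves further from what the user actually watched.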


Ranking

- video quality (view count, commenting, sharing activity, …)
- user specificity (considers properties of the seed video)
- diversification (videos that are too similar to each other are removed)
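As a rough illustration of the ranking stage, the sketch below sorts candidates by a single combined score and then greedily drops any video that is too similar to one already kept. How quality and user specificity are combined into the score, the similarity measure, and the `max_similarity` cutoff are all left as caller-supplied assumptions; none of them come from the slides.

```python
def rank_and_diversify(candidates, score, similarity,
                       max_similarity=0.8, limit=10):
    """Rank candidates by `score`, skipping near-duplicates of kept videos."""
    ranked = sorted(candidates, key=score, reverse=True)
    selected = []
    for video in ranked:
        # Keep the video only if it is not too similar to any kept video.
        if all(similarity(video, kept) <= max_similarity for kept in selected):
            selected.append(video)
        if len(selected) == limit:
            break
    return selected


# Toy data: made-up scores, and "similar" means "same channel".
scores = {"v1": 0.9, "v2": 0.8, "v3": 0.5}
channel = {"v1": "x", "v2": "x", "v3": "y"}
same_channel = lambda a, b: 1.0 if channel[a] == channel[b] else 0.0

print(rank_and_diversify(scores, scores.get, same_channel))
```

In this toy run the second-best video is dropped because it shares a channel with the best one, so a lower-scored video from a different channel is shown instead.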


Evaluation


Text Cohesion Visualizer

Chakarida Nukoolkit, Praewphan Chansripiboon, Pornchai Mongkolnam, Richard Watson Todd*

Computer Science Program, School of Information Technology
School of Liberal Arts*

King Mongkut’s University of Technology Thonburi
Bangkok, 10140 Thailand

IEEE ICCSE 2011


Goals

design a prototype system that helps analyze the lexical coherence of essays

provide visualized output as writing feedback to users


System Flowchart

1. Preprocessing (Stanford Part-of-Speech tagger)
2. Matching keywords
3. Creating bond table


Matching keywords

count the number of matched words (links) between any two sentences

four types of matching:

1. repetition
2. complex repetition
3. paraphrase (synonyms, hypernyms)
4. pronoun
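A minimal sketch of link counting that covers only the first matching type, simple repetition. Complex repetition, paraphrase, and pronoun matching would need stemming, a thesaurus, and coreference handling, which are left out here. The stopword list and function names are made up for the example.

```python
import re

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in"}


def content_words(sentence):
    """Lowercased words of a sentence, without stopwords."""
    return {w for w in re.findall(r"[a-z]+", sentence.lower())
            if w not in STOPWORDS}


def link_counts(sentences):
    """Count shared content words (links) for every pair of sentences."""
    words = [content_words(s) for s in sentences]
    counts = {}
    for i in range(len(words)):
        for j in range(i + 1, len(words)):
            counts[(i, j)] = len(words[i] & words[j])
    return counts


essay = ["The tool checks essays.", "Essays need a tool.", "Birds sing."]
print(link_counts(essay))
```

The result is keyed by sentence-index pairs, which is the shape needed to fill in a bond table in the next step.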


Creating bond table

a table indicating whether or not there is a bond between each pair of sentences
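One plausible way to build such a table is to declare a bond whenever a sentence pair has at least some minimum number of links. The sketch below assumes that rule; the default of three links and the names `bond_table` and `min_links` are assumptions, since the transcript does not state the criterion.

```python
def bond_table(pair_links, n_sentences, min_links=3):
    """Build a symmetric boolean table: True where two sentences are bonded.

    `pair_links` maps (i, j) sentence-index pairs to their link counts.
    """
    table = [[False] * n_sentences for _ in range(n_sentences)]
    for (i, j), links in pair_links.items():
        if links >= min_links:
            table[i][j] = True
            table[j][i] = True
    return table


links = {(0, 1): 3, (0, 2): 1, (1, 2): 0}
for row in bond_table(links, n_sentences=3):
    print(row)
```

Sentences with no bonds to any other sentence are natural candidates to highlight as cohesion problems in the visual feedback.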


six types


Conclusion

We proposed an application that detects cohesion errors in text in agreement with those indicated by experts. The system’s accuracy is at an acceptable level according to expert opinion.

In future work, we first plan to improve the process of matching keywords for more accurate results by augmenting the existing process with more specific linguistic rules.


See-To-Retrieve: Efficient Processing of Spatio-Visual Keyword Queries

Chao Zhang, Lidan Shou, Ke Chen, Gang Chen

College of Computer Science
Zhejiang University, China

ACM SIGIR 2012


Spatio-Visual Keyword

Ex. a user searches for introductory information about a distant grand church within her eyesight.


Goals

visually conspicuous

semantically relevant

document space, physical space

WYRIWYS (What-You-Retrieve-Is-What-You-See)


Motivation

State-of-the-art spatial retrieval methods are mostly distance-based and overlook the visibility of objects.

Ex. query: Italian food


Visibility Analysis

System Flowchart

Ranking Mechanism


Experiment

Data sets:

1. street objects in Los Angeles (131,461 MBRs)
2. Gowalla (28,867 Web documents)


柏安

亞婷 家愷 冠中 ???