Post on 09-Aug-2018

Lecture 1: Introduction

Kai-Wei Chang

CS @ University of Virginia

kw@kwchang.net

Course webpage: http://kwchang.net/teaching/NLP16

CS6501 – Natural Language Processing

Announcements

Waiting list: Start attending the first few meetings of the class as if you are registered. Given that some students will drop the class, some space will free up.

We will use Piazza as an online discussion platform. Please enroll.

Staff

Instructor: Kai-Wei Chang

Email: nlp16@kwchang.net

Office: R412 Rice Hall

Office hour: 2:00 – 3:00, Tue (after class).

Additional office hour: 3:00 – 4:00, Thu

TA: Wasi Ahmad

Email: wua4nw@virginia.edu

Office: R432 Rice Hall

Office hour: 4:00 – 5:00, Mon

This lecture

Course Overview

What is NLP? Why is it important?

What will you learn from this course?

Course Information

What are the challenges?

Key NLP components

What is NLP

Wiki: Natural language processing (NLP) is a field of computer science, artificial intelligence, and computational linguistics concerned with the interactions between computers and human (natural) languages.

Go beyond keyword matching

Identify the structure and meaning of words, sentences, texts and conversations

Deep understanding of broad language

NLP is all around us

Machine translation

Facebook translation, image credit: Meedan.org

Statistical machine translation

Image credit: Julia Hockenmaier, Intro to NLP

Dialog Systems

Sentiment/Opinion Analysis

Text Classification

Other applications?

www.wired.com

Question answering

credit: ifunny.com

'Watson' computer wins at 'Jeopardy'

Question answering

Go beyond search

Natural language instruction

https://youtu.be/KkOCeAtKHIc?t=1m28s

Digital personal assistant

Semantic parsing – understand tasks

Entity linking – “my wife” = “Kellie” in the phone book
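The entity-linking step on this slide can be illustrated with a minimal sketch. The phone-book entries, the `relation` field, and the function name `link_entity` are assumptions made up for illustration; a real assistant would draw on far richer context.

```python
# Toy phone book; names, relations, and numbers are invented for illustration.
PHONE_BOOK = [
    {"name": "Kellie", "relation": "wife", "number": "555-0100"},
    {"name": "Sam", "relation": "brother", "number": "555-0101"},
]


def link_entity(mention, phone_book):
    """Resolve a mention such as "my wife" or "Kellie" to a phone-book entry."""
    text = mention.lower().strip()
    # Relational mentions ("my wife") are matched on the stored relation.
    if text.startswith("my "):
        relation = text[len("my "):]
        for entry in phone_book:
            if entry["relation"] == relation:
                return entry
    # Otherwise fall back to a case-insensitive name match.
    for entry in phone_book:
        if entry["name"].lower() == text:
            return entry
    return None  # the mention could not be linked


print(link_entity("my wife", PHONE_BOOK)["name"])  # Kellie
```

An unknown mention such as "my boss" returns `None`, which is the case a real system would have to handle by asking the user.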

credit: techspot.com

More on natural language instruction

Information Extraction

Unstructured text to database entries

Yoav Artzi: Natural language processing

Language Comprehension

Q: Who wrote Winnie the Pooh?

Q: Where did Chris live?

Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book

What will you learn from this course

The NLP Pipeline

Key components for understanding text

NLP systems/applications

Current techniques & limitations

Build realistic NLP tools

What’s not covered by this course

Speech recognition – no signal processing

Natural language generation

Details of ML algorithms / theory

Text mining / information retrieval

This lecture

Course Overview

What is NLP? Why is it important?

What will you learn from this course?

Course Information

What are the challenges?

Key NLP components

Overview

New course, first time being offered

Comments are welcome

Aimed at first- or second-year PhD students

Lecture + Seminar

No course prerequisites, but I assume:

programming experience (for the final project)

basics of probability calculus, and linear algebra (HW0)

Grading

No exam & HW -- hooray

Lectures & forum

Participate in discussion (additional credits)

Review quizzes (25%): 3 quizzes

Critical review report (10%)

Paper presentation (15%)

Final project (50%)

Quizzes

Format

Multiple choice questions

Fill-in-the-blank

Short answer questions

Each quiz: ~20 min in class

Schedule: see course website

Closed book, Closed notes, Closed laptop

Critical review report

1 page maximum

Pick one paper from the suggested list

Summarize the paper (use your own words)

Provide detailed comments

What can be improved

Potential future directions

Other related work

Some students will be selected to present their critical reviews

Paper presentation

Each group has 2~3 students

Pick one paper from the suggested readings, or your favorite paper

Cannot be the same as critical review report

Can be related to your final project

Register your choice early

15-min presentation + 2-min Q&A

Will be graded by the instructor, the TA, and other students

Final Project

Work in groups (2~3 students)

Project proposal

Written report, 2 page maximum

Project report (35%)

< 8 pages, ACL format

Due 2 days before the final presentation

Project presentation (15%)

5-min in-class presentation (tentative)

Late Policy

Credit of 48 hours for all the assignments

Including proposal and final project

No accumulation

No more grace period

No make-up exam

unless under emergency situation

Cheating/Plagiarism

No. Ask if you have concerns

UVA Honor Code:

http://www.virginia.edu/honor/

Lectures and office hours

Participation is highly appreciated!

Ask questions if you are still confused

Feedback is welcome

Lead the discussion in this class

Enroll in Piazza

https://piazza.com/virginia/fall2016/cs6501004

Topics of this class

Fundamental NLP problems

Machine learning & statistical approaches for NLP

NLP applications

Recent trends in NLP

What to Read?

Natural Language Processing: ACL, NAACL, EACL, EMNLP, CoNLL, Coling, TACL

aclweb.org/anthology

Machine learning: ICML, NIPS, ECML, AISTATS, ICLR, JMLR, MLJ

Artificial Intelligence: AAAI, IJCAI, UAI, JAIR

Questions?

This lecture

Course Overview

What is NLP? Why is it important?

What will you learn from this course?

Course Information

What are the challenges?

Key NLP components

Challenges – ambiguity

Word sense ambiguity

Challenges – ambiguity

Word sense / meaning ambiguity

Credit: http://stuffsirisaid.com

Challenges – ambiguity

PP attachment ambiguity

Credit: Mark Liberman, http://languagelog.ldc.upenn.edu/nll/?p=17711

Challenges – ambiguity

Ambiguous headlines:

Include your children when baking cookies

Hospitals are Sued by 7 Foot Doctors

Iraqi Head Seeks Arms

Safety Experts Say School Bus Passengers Should Be Belted

Challenges – ambiguity

Pronoun reference ambiguity

Credit: http://www.printwand.com/blog/8-catastrophic-examples-of-word-choice-mistakes

Challenges – language is not static

Language grows and changes

e.g., cyber lingo

LOL Laugh out loud

G2G Got to go

BFN Bye for now

B4N Bye for now

Idk I don’t know

FWIW For what it’s worth

LUWAMH Love you with all my heart
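A rough sketch of how a system might cope with the abbreviations above is a lookup table applied token by token. The function name `expand_lingo` and the tokenization regex are assumptions for illustration. The sketch also shows the weakness this slide points at: a fixed lexicon goes stale as soon as new abbreviations appear.

```python
import re

# Abbreviations copied from the slide; keys are lower-cased for matching.
CYBER_LINGO = {
    "lol": "laugh out loud",
    "g2g": "got to go",
    "bfn": "bye for now",
    "b4n": "bye for now",
    "idk": "I don't know",
    "fwiw": "for what it's worth",
    "luwamh": "love you with all my heart",
}


def expand_lingo(text, lexicon=CYBER_LINGO):
    """Replace each known abbreviation with its expansion; keep other tokens."""
    def replace(match):
        token = match.group(0)
        return lexicon.get(token.lower(), token)

    # Tokens are runs of letters and digits, so "g2g" and "b4n" stay whole.
    return re.sub(r"[A-Za-z0-9]+", replace, text)


print(expand_lingo("idk, g2g. BFN!"))  # I don't know, got to go. bye for now!
```

An abbreviation that is not in the table, for example `expand_lingo("brb")`, passes through unchanged, so the table needs constant updating as the language changes.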

Challenges – language is compositional

Carefully Slide

Challenges – language is compositional

小心: Carefully, Careful, Take Care, Caution

地滑: Slide, Landslip, Wet Floor, Smooth
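To see why word-by-word translation of the sign goes wrong, the candidate glosses listed above can be combined mechanically. This is only a sketch: the dictionary simply restates the slide, and the function name `candidate_translations` is made up for illustration.

```python
from itertools import product

# Candidate glosses for each word, as listed on the slide.
GLOSSES = {
    "小心": ["Carefully", "Careful", "Take Care", "Caution"],
    "地滑": ["Slide", "Landslip", "Wet Floor", "Smooth"],
}


def candidate_translations(words, glosses=GLOSSES):
    """Enumerate every word-by-word translation of a sequence of words."""
    options = [glosses[word] for word in words]
    return [" ".join(choice) for choice in product(*options)]


candidates = candidate_translations(["小心", "地滑"])
print(len(candidates))  # 16
print(candidates[0])    # Carefully Slide
```

Two words with four glosses each already give 16 combinations. "Carefully Slide" and "Caution Wet Floor" are both among them, and nothing in the word lists alone says which one fits a warning sign.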

Challenges – scale

Examples:

Bible (King James version): ~700K words

Penn Treebank: ~1M words from the Wall Street Journal

Newswire collection: 500M+ words

Wikipedia: 2.9 billion words (English)

Web: several billion words

This lecture

Course Overview

What is NLP? Why is it important?

What will you learn from this course?

Course Information

What are the challenges?

Key NLP components

Part of speech tagging

Syntactic (Constituency) parsing

Syntactic structure => meaning

Image credit: Julia Hockenmaier, Intro to NLP

Dependency Parsing

Semantic analysis

Word sense disambiguation

Semantic role labeling

Credit: Ivan Titov

Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book

Q: [Chris] = [Mr. Robin] ?

Slide modified from Dan Roth

Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book

Co-reference Resolution
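The output of co-reference resolution can be pictured as clusters of mentions that refer to the same entity. The sketch below hand-labels the passage under the assumed reading that "Mr. Robin" is Chris's father; the cluster labels and the function name `same_entity` are illustrative, not the output of any system.

```python
# Hand-labelled mention clusters for the passage (an assumed reading,
# not system output): each key names an entity, each list its mentions.
CLUSTERS = {
    "Christopher Robin": ["Christopher Robin", "He", "Chris", "him"],
    "Chris's father": ["his father", "Mr. Robin"],
}


def same_entity(mention_a, mention_b, clusters=CLUSTERS):
    """Return True if both mentions fall in the same co-reference cluster."""
    return any(
        mention_a in mentions and mention_b in mentions
        for mentions in clusters.values()
    )


print(same_entity("Chris", "Mr. Robin"))       # False
print(same_entity("his father", "Mr. Robin"))  # True
```

Under this reading the answer to "[Chris] = [Mr. Robin]?" is no, even though both mentions share the surname Robin. That is why surface string matching is not enough for this task.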

Questions?
