lecture 1: introduction - computer sciencekc2wc/teaching/nlp16/slides/01-intro.pdf · lecture 1:...


Lecture 1: Introduction

Kai-Wei Chang

CS @ University of Virginia

[email protected]

Course webpage: http://kwchang.net/teaching/NLP16

CS6501 – Natural Language Processing


Announcements

Waiting list: Start attending the first few meetings of the class as if you are registered. Given that some students will drop the class, some space will free up.

We will use Piazza as an online discussion platform. Please enroll.


Staff

Instructor: Kai-Wei Chang

Email: [email protected]

Office: R412 Rice Hall

Office hour: 2:00 – 3:00, Tue (after class).

Additional office hour: 3:00 – 4:00, Thu

TA: Wasi Ahmad

Email: [email protected]

Office: R432 Rice Hall

Office hour: 4:00 – 5:00, Mon


This lecture

Course Overview

What is NLP? Why is it important?

What will you learn from this course?

Course Information

What are the challenges?

Key NLP components


What is NLP

Wiki: Natural language processing (NLP) is a field of computer science, artificial intelligence, and computational linguistics concerned with the interactions between computers and human (natural) languages.


Go beyond keyword matching

Identify the structure and meaning of words, sentences, texts and conversations

Deep understanding of broad language

NLP is all around us


Machine translation

Facebook translation, image credit: Meedan.org

Statistical machine translation

Image credit: Julia Hockenmaier, Intro to NLP

Dialog Systems


Sentiment/Opinion Analysis


Text Classification

Other applications?

www.wired.com

Question answering

credit: ifunny.com

'Watson' computer wins at 'Jeopardy'

Question answering

Go beyond search


Natural language instruction

https://youtu.be/KkOCeAtKHIc?t=1m28s

More on natural language instruction

Digital personal assistant

Semantic parsing – understand tasks

Entity linking – “my wife” = “Kellie” in the phone book

credit: techspot.com

Information Extraction

Unstructured text to database entries

Yoav Artzi: Natural language processing

Language Comprehension

Q: Who wrote Winnie the Pooh?

Q: Where did Chris live?

Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book
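To see why questions like these call for more than keyword matching, here is a minimal sketch (not part of the original slides) that scores each sentence of the passage by word overlap with the first question. The sentences are split by hand, and `words` and `best_match` are hypothetical helper names chosen for illustration.

```python
import re

# Sentences of the passage, split by hand (naive splitting on "." would
# break on "Mr.").
SENTENCES = [
    "Christopher Robin is alive and well.",
    "He is the same person that you read about in the book, Winnie the Pooh.",
    "As a boy, Chris lived in a pretty home called Cotchfield Farm.",
    "When Chris was three years old, his father wrote a poem about him.",
    "The poem was printed in a magazine for others to read.",
    "Mr. Robin then wrote a book",
]

def words(text):
    """Return the lowercased set of word tokens in the text."""
    return set(re.findall(r"\w+", text.lower()))

def best_match(question, sentences):
    """Return the sentence that shares the most words with the question."""
    q = words(question)
    return max(sentences, key=lambda s: len(q & words(s)))

print(best_match("who wrote Winnie the Pooh?", SENTENCES))
```

The top-scoring sentence is the one that names the book, yet it does not say who wrote it. Getting closer to an answer means connecting "his father", "Mr. Robin" and "a book" across sentences, which word overlap alone cannot do.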


What will you learn from this course

The NLP Pipeline

Key components for understanding text

NLP systems/applications

Current techniques & limitations

Build realistic NLP tools


What’s not covered by this course

Speech recognition – no signal processing

Natural language generation

Details of ML algorithms / theory

Text mining / information retrieval


This lecture

Course Overview

What is NLP? Why is it important?

What will you learn from this course?

Course Information

What are the challenges?

Key NLP components


Overview

New course, first time being offered

Comments are welcome

Aimed at first- or second-year PhD students

Lecture + Seminar

No course prerequisites, but I assume

programming experience (for the final project)

basics of probability calculus, and linear algebra (HW0)


Grading

No exam & HW -- hooray

Lectures & forum

Participate in discussion (additional credits)

Review quizzes (25%): 3 quizzes

Critical review report (10%)

Paper presentation (15%)

Final project (50%)


Quizzes

Format

Multiple choice questions

Fill-in-the-blank

Short answer questions

Each quiz: ~20 min in class

Schedule: see course website

Closed book, Closed notes, Closed laptop


Critical review report

1 page maximum

Pick one paper from the suggested list

Summarize the paper (use your own words)

Provide detailed comments

What can be improved

Potential future directions

Other related work

Some students will be selected to present their critical reviews


Paper presentation

Each group has 2~3 students

Pick one paper from the suggested readings, or your favorite paper

Cannot be the same as critical review report

Can be related to your final project

Register your choice early

15-min presentation + 2-min Q&A

Will be graded by the instructor, the TA, and other students


Final Project

Work in groups (2~3 students)

Project proposal

Written report, 2 page maximum

Project report (35%)

< 8 pages, ACL format

Due 2 days before the final presentation

Project presentation (15%)

5-min in-class presentation (tentative)


Late Policy

Credit of 48 hours for all the assignments

Including proposal and final project

No accumulation

No more grace period

No make-up exam unless there is an emergency


Cheating/Plagiarism

No. Ask if you have concerns

UVA Honor Code:

http://www.virginia.edu/honor/


Lectures and office hours

Participation is highly appreciated!

Ask questions if you are still confused

Feedback is welcome

Lead the discussion in this class

Enroll in Piazza

https://piazza.com/virginia/fall2016/cs6501004


Topics of this class

Fundamental NLP problems

Machine learning & statistical approaches for NLP

NLP applications

Recent trends in NLP


What to Read?

Natural Language Processing: ACL, NAACL, EACL, EMNLP, CoNLL, Coling, TACL

aclweb.org/anthology

Machine learning: ICML, NIPS, ECML, AISTATS, ICLR, JMLR, MLJ

Artificial Intelligence: AAAI, IJCAI, UAI, JAIR


Questions?


This lecture

Course Overview

What is NLP? Why is it important?

What will you learn from this course?

Course Information

What are the challenges?

Key NLP components


Challenges – ambiguity

Word sense ambiguity


Challenges – ambiguity

Word sense / meaning ambiguity

Credit: http://stuffsirisaid.com

Challenges – ambiguity

PP attachment ambiguity

Credit: Mark Liberman, http://languagelog.ldc.upenn.edu/nll/?p=17711

Challenges – ambiguity

Ambiguous headlines:

Include your children when baking cookies

Hospitals are Sued by 7 Foot Doctors

Iraqi Head Seeks Arms

Safety Experts Say School Bus Passengers Should Be Belted


Challenges – ambiguity

Pronoun reference ambiguity

Credit: http://www.printwand.com/blog/8-catastrophic-examples-of-word-choice-mistakes

Challenges – language is not static

Language grows and changes

e.g., cyber lingo


LOL Laugh out loud

G2G Got to go

BFN Bye for now

B4N Bye for now

Idk I don’t know

FWIW For what it’s worth

LUWAMH Love you with all my heart
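As a rough illustration of why a static resource struggles here, the table above can be turned into a lookup-based normalizer. This is only a sketch: `CYBER_LINGO` and `expand_lingo` are hypothetical names, and the dictionary holds just the entries listed on the slide.

```python
import re

# Abbreviation table copied from the slide. A real system would need a much
# larger lexicon that is updated as the language changes.
CYBER_LINGO = {
    "lol": "laugh out loud",
    "g2g": "got to go",
    "bfn": "bye for now",
    "b4n": "bye for now",
    "idk": "I don't know",
    "fwiw": "for what it's worth",
    "luwamh": "love you with all my heart",
}

def expand_lingo(text):
    """Replace known abbreviations (case-insensitive) with their expansions."""
    def lookup(match):
        token = match.group(0)
        # Unknown tokens are returned unchanged.
        return CYBER_LINGO.get(token.lower(), token)
    # \w+ matches runs of letters and digits, so "B4N" and "G2G" stay whole.
    return re.sub(r"\w+", lookup, text)

print(expand_lingo("IDK, g2g. BFN!"))  # I don't know, got to go. bye for now!
print(expand_lingo("brb"))             # brb
```

The second call shows the limitation: any abbreviation coined after the table was written passes through untouched.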


Challenges – language is compositional

Carefully Slide

Challenges – language is compositional

小心: Carefully, Careful, Take Care, Caution

地滑: Slide, Landslip, Wet Floor, Smooth
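The candidate lists above suggest how a word-by-word translation can go wrong. The sketch below enumerates every combination of the listed glosses; it is illustrative only, `GLOSSES` and `word_by_word` are hypothetical names, and treating "Take Care" as a single gloss is an assumption about the slide layout.

```python
from itertools import product

# Candidate glosses as listed on the slide ("Take Care" is assumed to be one
# entry that was split across two lines).
GLOSSES = {
    "小心": ["Carefully", "Careful", "Take Care", "Caution"],
    "地滑": ["Slide", "Landslip", "Wet Floor", "Smooth"],
}

def word_by_word(parts):
    """Combine every gloss of each part, ignoring the meaning of the whole."""
    options = [GLOSSES[part] for part in parts]
    return [" ".join(choice) for choice in product(*options)]

candidates = word_by_word(["小心", "地滑"])
print(len(candidates))                    # 16
print("Carefully Slide" in candidates)    # True
print("Caution Wet Floor" in candidates)  # True
```

Both the mistranslation "Carefully Slide" and a sensible rendering such as "Caution Wet Floor" appear among the 16 candidates. Picking the right one depends on the meaning of the phrase as a whole, which per-word lookup does not capture.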


Challenges – scale

Examples:

Bible (King James version): ~700K words

Penn Treebank: ~1M words from the Wall Street Journal

Newswire collection: 500M+ words

Wikipedia: 2.9 billion words (English)

Web: several billions of words


This lecture

Course Overview

What is NLP? Why is it important?

What will you learn from this course?

Course Information

What are the challenges?

Key NLP components


Part of speech tagging


Syntactic (Constituency) parsing


Syntactic structure => meaning

Image credit: Julia Hockenmaier, Intro to NLP

Dependency Parsing


Semantic analysis

Word sense disambiguation

Semantic role labeling

Credit: Ivan Titov

Christopher Robin is alive and well. He is the same person that you read about in the book, Winnie the Pooh. As a boy, Chris lived in a pretty home called Cotchfield Farm. When Chris was three years old, his father wrote a poem about him. The poem was printed in a magazine for others to read. Mr. Robin then wrote a book

Q: [Chris] = [Mr. Robin] ?

Slide modified from Dan Roth


Co-reference Resolution
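A small sketch can show why the question [Chris] = [Mr. Robin] is not settled by surface string matching. The heuristic below is a deliberately naive stand-in, not a real co-reference algorithm; `MENTIONS` and `last_word_match` are hypothetical names, and the mention list is picked by hand from the passage.

```python
from itertools import combinations

# Mentions of people in the passage, in order of appearance (hand-picked).
MENTIONS = ["Christopher Robin", "He", "Chris", "his father", "him", "Mr. Robin"]

def last_word_match(a, b):
    """Naive heuristic: two mentions corefer if their last words are equal."""
    return a.split()[-1].lower() == b.split()[-1].lower()

# Every pair of mentions that the heuristic would link.
linked = [(a, b) for a, b in combinations(MENTIONS, 2) if last_word_match(a, b)]
print(linked)  # [('Christopher Robin', 'Mr. Robin')]
```

The only link the heuristic proposes joins "Christopher Robin" to "Mr. Robin", who on the natural reading of the passage is his father. It also misses that "Chris", "He" and "him" refer to Christopher Robin, so it makes both kinds of error.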

Questions?