natural language processing - hasso plattner …natural language processing machine learning...
TRANSCRIPT
![Page 1: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/1.jpg)
Natural Language ProcessingMachine Learning
Potsdam, 26 April 2012
Saeedeh MomtaziInformation Systems Group
![Page 2: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/2.jpg)
Introduction
Machine LearningField of study that gives computers the ability to learn withoutbeing explicitly programmed.
[Arthur Samuel, 1959]
Learning MethodsSupervised learning
Active learning
Unsupervised learningSemi-supervised learningReinforcement learning
Saeedeh Momtazi | NLP | 26.04.2012
2
![Page 3: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/3.jpg)
Outline
1 Supervised Learning
2 Semi-Supervised Learning
3 Unsupervised Learning
Saeedeh Momtazi | NLP | 26.04.2012
3
![Page 4: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/4.jpg)
Outline
1 Supervised Learning
2 Semi-Supervised Learning
3 Unsupervised Learning
Saeedeh Momtazi | NLP | 26.04.2012
4
![Page 5: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/5.jpg)
Supervised Learning
Renting budget: 1000 e
Size: 180 m2
Age: 2 years 7
Saeedeh Momtazi | NLP | 26.04.2012
5
![Page 6: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/6.jpg)
Supervised Learning
size
age
4
7?7
44
44
4
7
7
7
7
7
Saeedeh Momtazi | NLP | 26.04.2012
6
![Page 7: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/7.jpg)
Classification
Training
T1 → C1T2 → C2...
Tn → Cn
−−−−→
F1F2...
Fn
−−−−→�� ��Model(F,C)
Testing
Tn+1 →? −−−−→ Fn+1 −−−−→
←−−−−−−−−
Cn+1
Saeedeh Momtazi | NLP | 26.04.2012
7
![Page 8: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/8.jpg)
Applications
Problem Item Category
POS Tagging Word POSNamed Entity Recognition Word Named entityWord Sense Disambiguation Word The word’s senseSpam Mail Detection Document Spam/Not spamLanguage Identification Document LanguageText Categorization Document TopicInformation Retrieval Document Relevant/Not relevant
Saeedeh Momtazi | NLP | 26.04.2012
8
![Page 9: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/9.jpg)
Part Of Speech Tagging
“I saw the man on the roof.”
“ I[PRON] saw[V ] the[DET ] man[N] on[PREP] the[DET ] roof[N]. ”
[PRON] Pronoun[PREP] Preposition[DET] Determiner[V] Verb[N] Noun...
Saeedeh Momtazi | NLP | 26.04.2012
9
![Page 10: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/10.jpg)
Named Entity Recognition
“Steven Paul Jobs, co-founder of Apple Inc, was born in California.”
“ Steven Paul JobsPerson
, co-founder of Apple IncOrganization
, was born in CaliforniaLocation
.”
PersonOrganizationLocationDate...
Saeedeh Momtazi | NLP | 26.04.2012
10
![Page 11: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/11.jpg)
Word Sense Disambiguation
“Jim flew his plane to Texas.”
“Alice destroys the item with a plane.”
Saeedeh Momtazi | NLP | 26.04.2012
11
![Page 12: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/12.jpg)
Spam Mail Detection
Saeedeh Momtazi | NLP | 26.04.2012
12
![Page 13: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/13.jpg)
Language Identification
Saeedeh Momtazi | NLP | 26.04.2012
13
![Page 14: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/14.jpg)
Text Categorization
---------------------------------------------
---------------------------------------------
---------------------------------------------
---------------------------------------------
news
------------------------------------------------------------------------------------------
? ? ? ?
Culture Sport Politics Business
Saeedeh Momtazi | NLP | 26.04.2012
14
![Page 15: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/15.jpg)
Information Retrieval
Saeedeh Momtazi | NLP | 26.04.2012
15
![Page 16: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/16.jpg)
Classification
Training
T1 → C1T2 → C2...
Tn → Cn
−−−−→
F1F2...
Fn
−−−−→�� ��Model(F,C)
Testing
Tn+1 →? −−−−→ Fn+1 −−−−→
←−−−−−−−−
Cn+1
Saeedeh Momtazi | NLP | 26.04.2012
16
![Page 17: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/17.jpg)
Classification Algorithms
K Nearest NeighborSupport Vector MachinesNaïve BayesMaximum EntropyLinear RegressionLogistic RegressionNeural NetworksDecision TreesBoosting...
Saeedeh Momtazi | NLP | 26.04.2012
17
![Page 18: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/18.jpg)
K Nearest Neighbor
?
Saeedeh Momtazi | NLP | 26.04.2012
18
![Page 19: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/19.jpg)
K Nearest Neighbor
?
Saeedeh Momtazi | NLP | 26.04.2012
19
![Page 20: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/20.jpg)
K Nearest Neighbor
1 Nearest Neighbor
Saeedeh Momtazi | NLP | 26.04.2012
20
![Page 21: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/21.jpg)
K Nearest Neighbor
?
3 Nearest Neighbor
Saeedeh Momtazi | NLP | 26.04.2012
21
![Page 22: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/22.jpg)
K Nearest Neighbor
3 Nearest Neighbor
Saeedeh Momtazi | NLP | 26.04.2012
22
![Page 23: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/23.jpg)
K Nearest Neighbor
But this item is closer
3 Nearest Neighbor
Saeedeh Momtazi | NLP | 26.04.2012
23
![Page 24: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/24.jpg)
Support Vector Machines
Saeedeh Momtazi | NLP | 26.04.2012
24
![Page 25: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/25.jpg)
Support Vector Machines
Find a hyperplane in the vector space that separates the itemsof the two categories
Saeedeh Momtazi | NLP | 26.04.2012
25
![Page 26: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/26.jpg)
Support Vector Machines
There might be more than one possible separating hyperplane
Saeedeh Momtazi | NLP | 26.04.2012
26
![Page 27: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/27.jpg)
Support Vector Machines
?
There might be more than one possible separating hyperplane
Saeedeh Momtazi | NLP | 26.04.2012
27
![Page 28: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/28.jpg)
Support Vector Machines
?
Find the hyperplane with maximum marginVectors at the margins are called support vectors
Saeedeh Momtazi | NLP | 26.04.2012
28
![Page 29: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/29.jpg)
Support Vector Machines
Find the hyperplane with maximum marginVectors at the margins are called support vectors
Saeedeh Momtazi | NLP | 26.04.2012
29
![Page 30: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/30.jpg)
Naïve Bayes
Selecting the class with highest probability⇒ Minimizing the number of items with wrong labels
c = argmaxciP(ci)
The probability should depend on the to be classified data (d)
P(ci |d)
Saeedeh Momtazi | NLP | 26.04.2012
30
![Page 31: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/31.jpg)
Naïve Bayes
c = argmaxciP(ci)
c = argmaxciP(ci |d)
c = argmaxci
P(d |ci) · P(ci)
P(d) P(d) has no effect
c = argmaxciP(d |ci) · P(ci)
Saeedeh Momtazi | NLP | 26.04.2012
31
![Page 32: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/32.jpg)
Naïve Bayes
c = argmaxciP(d |ci) · P(ci)
Likelihood
ProbabilityPrior
Probability
Saeedeh Momtazi | NLP | 26.04.2012
32
![Page 33: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/33.jpg)
Maximum Entropy
Assigning a weight λj to each feature fjPositive weight: the feature is likely to be effectiveNegative weight: the feature is likely to be ineffective
Picking out a subset of data by each feature
Voting for each class based on the sum of weighted features
c = argmaxciP(ci |d , λ)
Saeedeh Momtazi | NLP | 26.04.2012
33
![Page 34: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/34.jpg)
Maximum Entropy
c = argmaxciP(ci |d , λ)
P(ci |d , λ) =exp
∑j λj · fj(c,d)∑
ciexp
∑j λj · fj(ci ,d)
The expectation of each feature is calculated as follows:
E(fi) =∑
(c,d)∈(C,D)
P(c,d) · fi(c,d)
Saeedeh Momtazi | NLP | 26.04.2012
34
![Page 35: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/35.jpg)
Classification
Training
T1 → C1T2 → C2...
Tn → Cn
−−−−→
F1F2...
Fn
−−−−→�� ��Model(F,C)
Testing
Tn+1 →? −−−−→ Fn+1 −−−−→
←−−−−−−−−
Cn+1
Saeedeh Momtazi | NLP | 26.04.2012
35
![Page 36: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/36.jpg)
Feature Selection
size
age
4
7
44
44
4
7
7
7
7
7
LocationNumber of roomsBalcony...
Saeedeh Momtazi | NLP | 26.04.2012
36
![Page 37: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/37.jpg)
Feature Selection
Bag-of-words:Each document can be represented by the set of words thatappear in the documentResult is a high dimensional feature spaceThe process is computationally expensive
SolutionUsing a feature selection method to select informative words
Saeedeh Momtazi | NLP | 26.04.2012
37
![Page 38: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/38.jpg)
Feature Selection Methods
Information GainMutual Informationχ-Square
Saeedeh Momtazi | NLP | 26.04.2012
38
![Page 39: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/39.jpg)
Information Gain
Measuring the number of bits required for category predictionw.r.t. the presence or absence of a term in the documentRemoving words whose information gain is less than apredefined threshold
IG(w) =−K∑
i=1
P(ci) log P(ci)
+ P(w)K∑
i=1
P(ci |w) log P(ci |w)
+ P(w)K∑
i=1
P(ci |w) log P(ci |w)
Saeedeh Momtazi | NLP | 26.04.2012
39
![Page 40: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/40.jpg)
Information Gain
P(ci) =NiN
P(w) = NwN P(ci |w) = Niw
Ni
P(w) = NwN P(ci |w) = Niw
Ni
N: # docsNi : # docs in category ciNw : # docs containing wNw : # docs not containing wNiw : # docs in category ci containing wNiw : # docs in category ci not containing w
Saeedeh Momtazi | NLP | 26.04.2012
40
![Page 41: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/41.jpg)
Mutual Information
Measuring the effect of each word in predicting the categoryHow much does its presence or absence in a documentcontribute to category prediction?
MI(w , ci) = logP(w , ci)
P(w) · P(ci)
Removing words whose mutual information is less than apredefined threshold
MI(w) = maxiMI(w , ci)
MI(w) =∑
i
P(ci) ·MI(w , ci)
Saeedeh Momtazi | NLP | 26.04.2012
41
![Page 42: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/42.jpg)
χ-square
Measuring the dependencies between words and categories
χ2(w , ci) =N · (NiwNiw − NiwNiw )
2
(Niw + Niw ) · (Niw + Niw ) · (Niw + Niw ) · (Niw + Niw )
Ranking words based on their χ-square measure
χ2(w) =K∑
i=1
P(ci) · χ2(w , ci)
Selecting the top words as features
Saeedeh Momtazi | NLP | 26.04.2012
42
![Page 43: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/43.jpg)
Feature Selection
These models perform well for document-level classificationSpam Mail DetectionLanguage IdentificationText Categorization
Word-level Classification might need another types of featuresPOS TaggingNamed Entity Recognition
(will be discussed later)
Saeedeh Momtazi | NLP | 26.04.2012
43
![Page 44: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/44.jpg)
Shortcoming
Data annotation is labor intensive
Solution:Using a minimum amount of annotated dataAnnotating further data by human, if they are very informative
Active Learning
Saeedeh Momtazi | NLP | 26.04.2012
44
![Page 45: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/45.jpg)
Active Learning
Saeedeh Momtazi | NLP | 26.04.2012
45
![Page 46: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/46.jpg)
Active Learning
Annotating a small amount of data
Saeedeh Momtazi | NLP | 26.04.2012
46
![Page 47: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/47.jpg)
Active Learning
MM
MM
LL
L
H H
H
HH
HM
Annotating a small amount of data
Calculating the confidence score of the classifier onunlabeled data
Saeedeh Momtazi | NLP | 26.04.2012
47
![Page 48: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/48.jpg)
Active Learning
Annotating a small amount of data
Calculating the confidence score of the classifier onunlabeled data
Finding the informative unlabeled data(data with lowest confidence)
Annotating the informative data by the human
Saeedeh Momtazi | NLP | 26.04.2012
48
![Page 49: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/49.jpg)
Active Learning
Amazon Mechanical Turk
Saeedeh Momtazi | NLP | 26.04.2012
49
![Page 50: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/50.jpg)
Outline
1 Supervised Learning
2 Semi-Supervised Learning
3 Unsupervised Learning
Saeedeh Momtazi | NLP | 26.04.2012
50
![Page 51: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/51.jpg)
Semi-Supervised Learning
Problem of data annotation
Solution:Using minimum amount of annotated dataAnnotating further data automatically, if they are easy to predict
Saeedeh Momtazi | NLP | 26.04.2012
51
![Page 52: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/52.jpg)
Semi-Supervised Learning
A small amount of labeled data
Saeedeh Momtazi | NLP | 26.04.2012
52
![Page 53: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/53.jpg)
Semi-Supervised Learning
A small amount of labeled data
A large amount of unlabeled data
Saeedeh Momtazi | NLP | 26.04.2012
53
![Page 54: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/54.jpg)
Semi-Supervised Learning
A small amount of labeled data
A large amount of unlabeled data
SolutionFinding the similarity between the labeled andunlabeled dataPredicting the labels of the unlabeled data
Saeedeh Momtazi | NLP | 26.04.2012
54
![Page 55: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/55.jpg)
Semi-Supervised Learning
Training the classifier usingThe labeled dataPredicted labels ofthe unlabeled data
Saeedeh Momtazi | NLP | 26.04.2012
55
![Page 56: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/56.jpg)
Shortcoming
Introducing a lot of noisy data to the system
Saeedeh Momtazi | NLP | 26.04.2012
56
![Page 57: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/57.jpg)
Shortcoming
Introducing a lot of noisy data to the system
SolutionAdding unlabeled data to the training set, if the predicted labelhas a high confidence
Saeedeh Momtazi | NLP | 26.04.2012
57
![Page 58: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/58.jpg)
Semi-Supervised Learning
Saeedeh Momtazi | NLP | 26.04.2012
58
![Page 59: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/59.jpg)
Related Books
Semi-Supervised Learning
by O. Chapelle, B. Schölkopf, A. ZienMIT Press
2006
Semisupervised Learning forComputational Linguistics
by S. AbneyChapman & Hall
2007
Saeedeh Momtazi | NLP | 26.04.2012
59
![Page 60: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/60.jpg)
Outline
1 Supervised Learning
2 Semi-Supervised Learning
3 Unsupervised Learning
Saeedeh Momtazi | NLP | 26.04.2012
60
![Page 61: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/61.jpg)
Unsupervised Learning
area
age
444
4
4
4
7
7
7
7
7
7
Saeedeh Momtazi | NLP | 26.04.2012
61
![Page 62: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/62.jpg)
Unsupervised Learning
area
age
777
77
7
7
7
7
7
7
7
Saeedeh Momtazi | NLP | 26.04.2012
62
![Page 63: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/63.jpg)
Clustering
Working based on the similarities between the data items
Assigning the similar data items to the same cluster
Saeedeh Momtazi | NLP | 26.04.2012
63
![Page 64: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/64.jpg)
Applications
Word ClusteringSpeech RecognitionMachine TranslationNamed Entity RecognitionInformation Retrieval...
Document ClusteringInformation Retrieval...
Saeedeh Momtazi | NLP | 26.04.2012
64
![Page 65: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/65.jpg)
Speech recognition
⇒ “Computers can recognize speech.”
“Computers can wreck a nice peach.”
Saeedeh Momtazi | NLP | 26.04.2012
65
![Page 66: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/66.jpg)
Machine Translation
“The cat eats ...” ⇒ “Die Katze frisst ...”“Die Katze isst ...”
Saeedeh Momtazi | NLP | 26.04.2012
66
![Page 67: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/67.jpg)
Language Modeling
Corpus Texts:“I have a meeting on Monday evening”“You should work on Wednesday afternoon”“The next session of the NLP lecture in on Thursday morning”
no observation in the corpus.
⇒ “The talk is on Monday morning.”
“The talk is on Monday molding.”
Monday
Tuesday
Wednesday
Thursday
Sunday
Saturday
Friday
afternoonmorning
evening
night
Saeedeh Momtazi | NLP | 26.04.2012
67
![Page 68: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/68.jpg)
Language Modeling
Corpus Texts:“I have a meeting on Monday evening”“You should work on Wednesday afternoon”“The next session of the NLP lecture in on Thursday morning”
⇒ [Week-day] [day-time]
Class-based Language Model
Saeedeh Momtazi | NLP | 26.04.2012
68
![Page 69: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/69.jpg)
Information Retrieval
“The first car was invented by Karl Benz.”
“Thomas Edison invented the first commercially practical light.”
“Alexander Graham Bell invented the first practical telephone.”
vehicleautomobile
car
Saeedeh Momtazi | NLP | 26.04.2012
69
![Page 70: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/70.jpg)
Information Retrieval
Saeedeh Momtazi | NLP | 26.04.2012
70
![Page 71: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/71.jpg)
Information Retrieval
Clustering documents
based on their similarities
Saeedeh Momtazi | NLP | 26.04.2012
71
![Page 72: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/72.jpg)
Clustering Algorithms
FlatK-means
HierarchicalTop-Down (Divisive)Bottom-Up (Agglomerative)
Single-linkComplete-linkAverage-link
Saeedeh Momtazi | NLP | 26.04.2012
72
![Page 73: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/73.jpg)
K-means
The best known clustering algorithmWorks well for many casesUsed as default / baseline for clustering documents
AlgorithmDefining each cluster center as the mean or centroid of the itemsin the cluster
~µ =1|c|
∑~x∈c
~x
Minimizing the average squared Euclidean distance of the itemsfrom their cluster centers
Saeedeh Momtazi | NLP | 26.04.2012
73
![Page 74: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/74.jpg)
K-meansInitialization: Randomly choose k items as initial centroidswhile stopping criterion has not been met do
for each item doFind the nearest centroidAssign the item to the cluster associated with the nearest centroid
end forfor each cluster do
Update the centroid of the cluster based on the average of all items in the clusterend for
end while
Iterating two steps:Re-assignment
Assigning each vector to its closest centroid
Re-computation
Computing each centroid as the average of the vectors that were assigned to it in re-assignment
Saeedeh Momtazi | NLP | 26.04.2012
74
![Page 75: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/75.jpg)
K-means
http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/
AppletKM.html
Saeedeh Momtazi | NLP | 26.04.2012
75
![Page 76: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/76.jpg)
Hierarchical AgglomerativeClustering (HAC)
Creating a hierarchy in the form of a binary tree
Saeedeh Momtazi | NLP | 26.04.2012
76
![Page 77: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/77.jpg)
Hierarchical AgglomerativeClustering (HAC)
Initial Mapping: Put a single item in each clusterwhile reaching the predefined number of clusters do
for each pair of clusters doMeasure the similarity of two clusters
end forMerge the two clusters that are most similar
end while
Measuring the similarity in three ways:Single-linkComplete-linkAverage-link
Saeedeh Momtazi | NLP | 26.04.2012
77
![Page 78: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/78.jpg)
Hierarchical AgglomerativeClustering (HAC)
Single-link / single-linkage clusteringBased on the similarity of the most similar members
Saeedeh Momtazi | NLP | 26.04.2012
78
![Page 79: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/79.jpg)
Hierarchical AgglomerativeClustering (HAC)
Complete-link / complete-linkage clusteringBased on the similarity of the most dissimilar members
Saeedeh Momtazi | NLP | 26.04.2012
79
![Page 80: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/80.jpg)
Hierarchical AgglomerativeClustering (HAC)
Average-link / average-linkage clusteringBased on the average of all similarities between the members
Saeedeh Momtazi | NLP | 26.04.2012
80
![Page 81: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/81.jpg)
Hierarchical AgglomerativeClustering (HAC)
http://home.dei.polimi.it/matteucc/Clustering/tutorial_html/
AppletH.html
Saeedeh Momtazi | NLP | 26.04.2012
81
![Page 82: Natural Language Processing - Hasso Plattner …Natural Language Processing Machine Learning Potsdam, 26 April 2012 Saeedeh Momtazi Information Systems Group Introduction Machine Learning](https://reader033.vdocuments.site/reader033/viewer/2022060223/5f07ec727e708231d41f7056/html5/thumbnails/82.jpg)
Further Reading
Introduction to Information Retrieval
C.D. Manning, P. Raghavan, H. Schütze Cambridge
University Press 2008
http://nlp.stanford.edu/IR-book/html/htmledition/irbook.html
Chapters 13,14,15,16,17
Saeedeh Momtazi | NLP | 26.04.2012
82