Chapter 6. Hidden Markov and Maximum Entropy Models
Presented by Jian-Shiun Tzeng 4/9/2009
Chapter 6. Hidden Markov and Maximum Entropy Models
Daniel Jurafsky and James H. Martin, 2008
Introduction
• Maximum Entropy (MaxEnt)
  – More widely known as multinomial logistic regression
• Begin from a non-sequential classifier
  – A probabilistic classifier
  – An exponential or log-linear classifier
  – Text classification
  – Sentiment analysis
    • Positive or negative opinion
  – Sentence boundary detection
Linear Regression
• x(j): a particular training instance
• y(j)_obs: the observed label of x(j) in the training set
• y(j)_pred: the value predicted by the linear regression model
• Cost: the sum-squared error, cost(W) = Σ_j (y(j)_pred − y(j)_obs)²
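The slide's sum-squared-error criterion can be sketched in a few lines. This is a minimal illustration, not the chapter's code; the toy data values are invented, and the least-squares solver stands in for whatever fitting method the slide assumed.

```python
import numpy as np

# Toy training set: each row is an instance x(j) with an intercept feature;
# y_obs holds the observed labels. All values are made up for illustration.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y_obs = np.array([1.1, 1.9, 3.2, 3.9])

# Least-squares fit: choose weights W minimizing the sum-squared error.
W, _, _, _ = np.linalg.lstsq(X, y_obs, rcond=None)

y_pred = X @ W                                # y(j)_pred for every instance
sse = float(np.sum((y_pred - y_obs) ** 2))    # cost(W) = sum of squared residuals
print(round(sse, 3))
```

The fitted weights minimize exactly the quantity the slide names: the sum over training instances of the squared difference between predicted and observed values.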
Logistic Regression – simplest case of binary classification
• Consider whether x is in class (1, true) or not (0, false)
• The linear score w·f ∈ (−∞, ∞)
• The odds p/(1−p) ∈ [0, ∞)
• The log odds (logit) ln(p/(1−p)) ∈ (−∞, ∞), which can be equated with w·f
• The probability p ∈ [0, 1]
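The range mismatch above is resolved by the logistic (sigmoid) function, which squashes the unbounded score w·f into [0, 1]. A minimal sketch; the weight and feature values below are invented for illustration:

```python
import math

def sigmoid(z):
    # Maps a score in (-inf, inf) to a probability in [0, 1].
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weight vector w and binary feature vector f (values invented).
w = [1.5, -0.5, 2.0]
f = [1.0, 1.0, 0.0]

z = sum(wi * fi for wi, fi in zip(w, f))   # w . f, unbounded
p = sigmoid(z)                             # P(y = 1 | x), in [0, 1]

print(p > 0.5)  # classify as class 1 when P(y=1|x) > 0.5, i.e. when w.f > 0
```

Note the decision rule in the last line: thresholding the probability at 0.5 is equivalent to checking the sign of w·f.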
Logistic Regression – Classification
Advanced: Learning in logistic regression
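Logistic regression weights are typically learned by maximizing the conditional log-likelihood of the training data, for example with gradient ascent. The sketch below (toy data, learning rate, and iteration count all invented) uses the standard gradient of the log-likelihood, Σ_j (y(j)_obs − p(j)) f(j):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy training data: (feature vector, label); separable on the first feature.
data = [([1.0, 1.0], 1), ([1.0, 0.0], 1), ([-1.0, 1.0], 0), ([-1.0, 0.0], 0)]

w = [0.0, 0.0]
eta = 0.5  # learning rate (arbitrary choice)

for _ in range(100):  # batch gradient ascent on the conditional log-likelihood
    grad = [0.0, 0.0]
    for f, y in data:
        p = sigmoid(sum(wi * fi for wi, fi in zip(w, f)))
        for i in range(len(w)):
            grad[i] += (y - p) * f[i]  # d(log-likelihood)/d(w_i)
    w = [wi + eta * gi for wi, gi in zip(w, grad)]

# After training, the model separates the two classes.
p_pos = sigmoid(sum(wi * fi for wi, fi in zip(w, [1.0, 0.0])))
p_neg = sigmoid(sum(wi * fi for wi, fi in zip(w, [-1.0, 0.0])))
print(p_pos > 0.5, p_neg < 0.5)
```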
Maximum Entropy Modeling
• Input: x (a word to be tagged, or a document to be classified)
  – Features, e.g.:
    • Ends in -ing
    • Previous word is “the”
  – Each feature f_i has an associated weight w_i
  – A particular class c
  – Z is a normalizing factor, used to make the probabilities sum to 1
Maximum Entropy Modeling
C = {c1, c2, …, cC}
Normalization
fi: a feature that takes only the values 0 and 1 is also called an indicator function
In MaxEnt, instead of the notation fi, we will often write fi(c, x), meaning feature i for a particular class c and a given observation x
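With the normalizing factor Z written out over the class set C, the model takes the standard MaxEnt form (reconstructed here from the definitions on this slide; the original equation image did not survive extraction):

```latex
P(c \mid x)
  = \frac{1}{Z}\exp\Bigl(\sum_{i} w_i\, f_i(c,x)\Bigr)
  = \frac{\exp\bigl(\sum_{i} w_i\, f_i(c,x)\bigr)}
         {\sum_{c' \in C} \exp\bigl(\sum_{i} w_i\, f_i(c',x)\bigr)}
```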
Maximum Entropy Modeling
Assume C = {NN, VB}
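A worked sketch of the two-class case, using the indicator features from the earlier slide. The weight values here are hypothetical, chosen only to make the computation concrete; the original slide's numbers did not survive extraction.

```python
import math

CLASSES = ["NN", "VB"]

def features(c, x):
    # f_i(c, x): indicator features, each tied to one class.
    return [
        1.0 if c == "VB" and x["word"].endswith("ing") else 0.0,
        1.0 if c == "NN" and x["prev_word"] == "the" else 0.0,
    ]

WEIGHTS = [0.8, 1.2]  # hypothetical weight w_i per feature

def maxent_prob(c, x):
    # exp(sum_i w_i f_i(c', x)) for every class, then normalize by Z.
    scores = {ci: math.exp(sum(w * f for w, f in zip(WEIGHTS, features(ci, x))))
              for ci in CLASSES}
    z = sum(scores.values())   # normalizing factor Z
    return scores[c] / z

x = {"word": "running", "prev_word": "the"}
p_nn = maxent_prob("NN", x)
p_vb = maxent_prob("VB", x)
print(round(p_nn + p_vb, 6))  # the class probabilities sum to 1
```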
Learning Maximum Entropy Model
HMM vs. MEMM
• An MEMM can condition on any useful feature of the input observation; in an HMM this isn’t possible
[Figure: HMM and MEMM graphical models over the word and class sequences]
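The contrast can be written as decoding equations (standard forms from the chapter, with q_i for the class/state sequence and o_i for the observed words):

```latex
\text{HMM:}\quad
\hat{Q} = \operatorname*{argmax}_{Q} P(Q \mid O)
        = \operatorname*{argmax}_{Q} \prod_{i} P(o_i \mid q_i)\prod_{i} P(q_i \mid q_{i-1})
\qquad
\text{MEMM:}\quad
\hat{Q} = \operatorname*{argmax}_{Q} \prod_{i} P(q_i \mid q_{i-1}, o_i)
```

The HMM uses Bayes' rule and models the likelihood of the observation; the MEMM models the posterior directly, which is what lets it condition on arbitrary features of o_i.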
Conditional Random Fields (CRFs)
• CRFs (Lafferty, McCallum, et al., 2001) constitute another conditional model based on maximum entropy
• Like MEMMs, CRFs can accommodate many possibly correlated features of the observation
• However, CRFs are better able to trade off decisions at different sequence positions
• MEMMs were found to suffer from the label bias problem
Label Bias
• The problem appears when the MEMM contains states with different out-degrees (numbers of outgoing transitions)
• Because the probabilities of the transitions out of any given state must sum to 1, transitions from low-degree states receive higher probabilities than transitions from high-degree states
• In the extreme case, a transition from a state with out-degree 1 always gets probability 1, effectively ignoring the observation
• CRFs do not have this problem because they define a single maximum-entropy distribution over the whole label sequence
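The extreme case above can be made concrete with a toy numeric sketch. All states, observations, and scores here are invented; the point is only that local (per-state) normalization makes a degree-1 state ignore the observation entirely:

```python
# Toy illustration of label bias: in an MEMM, each state's outgoing
# transition probabilities are normalized locally, so a state with a
# single outgoing transition assigns it probability 1 no matter what
# the observation is.

def memm_transition_probs(state, observation, raw_scores):
    # raw_scores: unnormalized score per successor state, possibly
    # depending on the observation; normalize locally over successors.
    scores = raw_scores(state, observation)
    z = sum(scores.values())
    return {s: v / z for s, v in scores.items()}

def raw_scores(state, observation):
    if state == "A":                      # out-degree-1 state
        return {"B": 2.0 if observation == "x" else 0.1}
    return {"B": 1.0, "C": 3.0}           # out-degree-2 state

# From the degree-1 state the observation is effectively ignored:
p1 = memm_transition_probs("A", "x", raw_scores)["B"]
p2 = memm_transition_probs("A", "y", raw_scores)["B"]
print(p1, p2)  # both 1.0, despite very different raw scores
```

A CRF avoids this by normalizing once, globally, over whole label sequences, so a strongly observation-dependent score at one position can still influence the overall decision.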