Discriminative Training of Chow-Liu Tree Multinet Classifiers
Discriminative Training of Chow-Liu Tree Multinet Classifiers
Huang, Kaizhu
Dept. of Computer Science and Engineering, CUHK
Outline
– Background
  » Classifiers: discriminative classifiers, generative classifiers
  » Bayesian Multinet Classifiers
– Motivation
– Discriminative Bayesian Multinet Classifiers
– Experiments
– Conclusion
Discriminative Classifiers
Directly maximize a discriminative function.
Example: SVM
Generative Classifiers
Estimate the distribution for each class, and then use the Bayes rule to perform classification:
P1(x|C1)
P2(x|C2)
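As a minimal sketch of this two-step scheme (the Gaussian densities, their parameters, and the priors below are illustrative assumptions, not from the slides), the Bayes rule picks the class with the larger posterior P(Ck | x) ∝ P(x | Ck) P(Ck):

```python
import numpy as np

# Hypothetical class-conditional densities P1(x|C1) and P2(x|C2),
# modeled here as 1-D Gaussians with unit variance (illustrative only).
def p1(x):
    """P1(x|C1): Gaussian with mean 0."""
    return np.exp(-0.5 * x ** 2) / np.sqrt(2 * np.pi)

def p2(x):
    """P2(x|C2): Gaussian with mean 2."""
    return np.exp(-0.5 * (x - 2) ** 2) / np.sqrt(2 * np.pi)

def classify(x, prior1=0.5, prior2=0.5):
    """Bayes rule: return the class with the larger posterior,
    P(Ck|x) proportional to P(x|Ck) * P(Ck)."""
    return 1 if p1(x) * prior1 >= p2(x) * prior2 else 2

print(classify(-0.5))  # near the mean of C1 -> 1
print(classify(2.3))   # near the mean of C2 -> 2
```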
Comparison
Example of Missing Information:
From left to right: Original digit, Cropped and resized digit, 50% missing digit, 75% missing digit, and occluded digit.
Comparison (Continued)
Discriminative classifiers cannot easily deal with missing-information problems.
Generative classifiers provide a principled way to handle missing-information problems.
When x_i is missing, we can use the marginalized P1 and P2 to perform classification. Starting from the full class-conditional distributions P1(x1, x2, ..., xm | C1) and P2(x1, x2, ..., xm | C2), sum out the missing feature:

Σ_{x_i} P1(x1, ..., x_{i-1}, x_i, x_{i+1}, ..., xm | C1) = P1(x1, ..., x_{i-1}, x_{i+1}, ..., xm | C1)
Σ_{x_i} P2(x1, ..., x_{i-1}, x_i, x_{i+1}, ..., xm | C2) = P2(x1, ..., x_{i-1}, x_{i+1}, ..., xm | C2)
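A minimal sketch of marginalizing out a missing feature for a discrete class-conditional distribution (the 2x2 table values are illustrative assumptions):

```python
import numpy as np

# Joint class-conditional distribution P1(x1, x2 | C1) over two binary
# features, stored as a 2x2 table (values are illustrative only).
P1_joint = np.array([[0.4, 0.2],   # x1 = 0; columns: x2 in {0, 1}
                     [0.1, 0.3]])  # x1 = 1

# If x2 is missing, sum it out:
# P1(x1 | C1) = sum over x2 of P1(x1, x2 | C1)
P1_marginal = P1_joint.sum(axis=1)
print(P1_marginal)  # [0.6 0.4]
```

Classification then compares these marginal probabilities across the classes using only the observed feature.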
Handling the Missing Information Problem
SVM
TJT: a generative model
Motivation
It seems that a good classifier should combine the strategies of discriminative classifiers and generative classifiers.
Our work trains one of the generative classifiers, the generative Bayesian Multinet Classifier, in a discriminative way.
Roadmap of Our Work
Classifiers
– Discriminative Classifiers
  » Support Vector Machines (SVM)
  » Others
– Generative Classifiers
  » Bayesian Network Classifiers (BNC): Naive Bayesian Classifiers, Tree-like Bayesian Classifiers, Bayesian Multinet Classifiers, Others
  » Gaussian Mixture Model (GMM)
  » Hidden Markov Model (HMM)
  » Other Models
How Does Our Work Relate to Other Work?
1. Combining discriminative classifiers and generative classifiers (Jaakkola and Haussler, NIPS 98)
   – Difference: Our method performs a reverse process: from generative classifiers to discriminative classifiers.
2. Discriminative training of HMM and GMM (Beaufays et al., ICASSP 99; Hastie et al., JRSS 96)
   – Difference: Our method is designed for Bayesian Multinet Classifiers, a more general classifier.
Classifiers
– Discriminative Classifiers
  » Support Vector Machines (SVM)
  » Others
– Generative Classifiers
  » Bayesian Network Classifiers (BNC): Bayesian Multinet Classifiers (Bayesian Chow-Liu tree Multinet, Others), Others
  » Gaussian Mixture Model (GMM)
  » Hidden Markov Model (HMM)
  » Other Models
Problems of Bayesian Multinet Classifiers
1. Start from a pre-classified dataset.
2. Split it into sub-dataset D1 for Class 1 and sub-dataset D2 for Class 2.
3. Estimate the distribution P1 to approximate D1 accurately.
4. Estimate the distribution P2 to approximate D2 accurately.
5. Use the Bayes rule to perform classification.
Comments: This framework discards the divergence information between classes.
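The per-class estimation step can be sketched with a Chow-Liu tree learner: compute pairwise mutual information on each class sub-dataset, then take a maximum-weight spanning tree over the features (the tiny binary sub-dataset D1 below is an illustrative assumption):

```python
import numpy as np
from itertools import combinations

def mutual_information(data, i, j):
    """Empirical mutual information between discrete columns i and j."""
    xi, xj = data[:, i], data[:, j]
    mi = 0.0
    for a in np.unique(xi):
        for b in np.unique(xj):
            p_ab = np.mean((xi == a) & (xj == b))
            p_a, p_b = np.mean(xi == a), np.mean(xj == b)
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

def chow_liu_edges(data):
    """Maximum-weight spanning tree over pairwise mutual information
    (Kruskal's algorithm with a simple union-find)."""
    m = data.shape[1]
    edges = sorted(((mutual_information(data, i, j), i, j)
                    for i, j in combinations(range(m), 2)), reverse=True)
    parent = list(range(m))
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u
    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree

# Generative BMC training fits one tree per class sub-dataset; D1 here
# is a tiny illustrative binary sub-dataset for Class 1.
D1 = np.array([[0, 0, 0], [1, 1, 0], [0, 0, 1], [1, 1, 1]])
print(chow_liu_edges(D1))  # features 0 and 1 are perfectly correlated
```

Fitting one such tree per class, independently on D1 and D2, is exactly the step that ignores the divergence between the classes.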
Our Training Scheme
Mathematical Explanation
Bayesian Multinet Classifiers (BMC)
Discriminative Training of BMC
Mathematical Explanation
Perror(1) = Σ_x P1(x) log [ P1(x) / P̂1(x) ]
Perror(2) = Σ_x P2(x) log [ P2(x) / P̂2(x) ]

where P̂1 and P̂2 denote the estimated distributions for Class 1 and Class 2.
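Divergence terms of the form Σ_x P(x) log[P(x) / P̂(x)] are KL divergences between a class distribution and its estimate. A minimal sketch of computing one (the distributions below are illustrative assumptions):

```python
import numpy as np

def kl_divergence(p, q):
    """KL(P || Q) = sum over x of P(x) * log(P(x) / Q(x))."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0  # terms with P(x) = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

P1 = [0.5, 0.3, 0.2]      # class distribution (illustrative)
P1_hat = [0.4, 0.4, 0.2]  # estimated distribution
print(kl_divergence(P1, P1_hat))  # positive: estimation error is penalized
```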
Finding P1 and P2
Experimental Setup
– Datasets
  » 2 benchmark datasets from the UCI machine learning repository: Tic-tac-toe and Vote
– Experimental environment
  » Platform: Windows 2000
  » Development tool: Matlab 6.5
Error Rate
Convergence Performance
Conclusion
– A discriminative training procedure for generative Bayesian Multinet Classifiers is presented.
– This approach significantly improves the recognition rate on two benchmark datasets.
– A theoretical exploration of the convergence performance of this approach is under way.