pattern recognition applied to biomedical signals - …fcruz/pdf/pattrecaug07.pdfpattern recognition...
Pattern Recognition applied to Biomedical Signals
Dr Philip de Chazal, Chief Technical Officer, BiancaMed, Dublin, Ireland
Outline
1st hour: Focus on pattern recognition basics
1. Pattern Recognition Overview
2. Classifiers: LDA, QDA, FFNN, HMM etc.
3. Performance Assessment
– Data splitting
– Performance measures (sensitivity, specificity etc.)
4. Features
– Transformations
– Missing values
– Feature selection
2nd hour: Case study on sleep apnea detection from the electrocardiogram
– Database
– Expert annotations
– Features
– Performance assessment
– Practical tips
1. Pattern Recognition Overview
Pattern Recognition
– Supervised learning
  – Parametric (plug-in parameters or distributed parameters): Discriminant Analysis, Bayesian Classifiers, Neural Networks (feedforward etc.), HMM
  – Non-parametric: Nearest neighbour (kNN), Decision Trees
– Unsupervised learning: Self organising feature maps, Cluster Analysis, Hebbian Learning, Vector Quantisation
Classifier Types: Parametric models with plug-in parameters
Define a parametric model between the input features and the output classes.
The model has adjustable parameters which are set using training data.
[Diagram: Features → Classifier → Classes, with the Adjustable Parameters feeding the Classifier.]
2. Classification Methods
Bayes theorem
Gaussian models
Linear discriminant analysis
– Derivation
– Covariance inversion
– Training Equations
– Classifying Equations
Quadratic discriminant analysis
Feedforward neural networks
Bayes theorem
[Figure: the curves p(x|C1)P(C1) and p(x|C2)P(C2) plotted against x, with decision regions ℜ1 and ℜ2.]
Bayes theorem
Optimally classify an object into one of c mutually exclusive classes given priors and class densities.
For a c-class problem, Bayes' rule states that the posterior probability of the kth class is related to its prior probability and its class density function by
$$p_k = \frac{\pi_k f_k(\mathbf{x}, \boldsymbol{\theta}_k)}{\sum_{l=1}^{c} \pi_l f_l(\mathbf{x}, \boldsymbol{\theta}_l)}$$
where $p_k$ is the posterior probability, $\pi_k$ is the prior probability, $f_k(\mathbf{x}, \boldsymbol{\theta}_k)$ is the class density and the denominator is the normalising component.
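As a rough illustration of Bayes' rule, the sketch below computes posteriors for a two-class problem with a single feature. The one-dimensional Gaussian class densities, the function names and the numbers are assumptions chosen for the example, not values from the slides.

```python
import math

def gaussian_pdf(x, mean, var):
    """One-dimensional Gaussian class density f_k(x) (assumed model)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def posteriors(x, priors, means, variances):
    """p_k = pi_k f_k(x) / sum_l pi_l f_l(x)."""
    joint = [p * gaussian_pdf(x, m, v) for p, m, v in zip(priors, means, variances)]
    total = sum(joint)  # the normalising component
    return [j / total for j in joint]

# x = 1.0 lies midway between the two class means, so the class densities
# are equal and the posteriors reduce to the priors.
post = posteriors(1.0, priors=[0.7, 0.3], means=[0.0, 2.0], variances=[1.0, 1.0])
```

When the class densities are equal at a point, the priors alone decide the classification, which is what the example shows.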
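Likelihood function
If we have $N = \sum_{k=1}^{c} N_k$ labelled data (d×1) vectors, $\mathbf{x}_n^{(k)},\ n = 1..N_k$, then a likelihood can be formed as follows:
$$l(\boldsymbol{\theta}) = \prod_{k=1}^{c} \prod_{n=1}^{N_k} \pi_k f_k\left(\mathbf{x}_n^{(k)}, \boldsymbol{\theta}_k\right)$$
Think of it as combined probabilities.
Our aim is to find the values of $\boldsymbol{\theta}$ for each class that maximise the likelihood $l(\boldsymbol{\theta})$. Equivalently we can find the values of $\boldsymbol{\theta}$ that maximise the log-likelihood:
$$\begin{aligned}
L(\boldsymbol{\theta}) = \log l(\boldsymbol{\theta}) &= \sum_{k=1}^{c} \sum_{n=1}^{N_k} \log\left(\pi_k f_k\left(\mathbf{x}_n^{(k)}, \boldsymbol{\theta}_k\right)\right) \\
&= \sum_{k=1}^{c} \sum_{n=1}^{N_k} \log f_k\left(\mathbf{x}_n^{(k)}, \boldsymbol{\theta}_k\right) + \sum_{k=1}^{c} N_k \log \pi_k
\end{aligned}$$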
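LDA: Gaussian parametric model for the class densities
The class densities are modelled with a Gaussian model (d-dimensional) with common covariance across all classes:
$$f_k\left(\mathbf{x}, \boldsymbol{\theta}_k = \{\boldsymbol{\mu}_k, \boldsymbol{\Sigma}\}\right) = (2\pi)^{-d/2} \left|\boldsymbol{\Sigma}\right|^{-1/2} \exp\left[-\tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_k)^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}_k)\right]$$
where $\boldsymbol{\mu}_k$ is the class mean and $\boldsymbol{\Sigma}$ is the common covariance.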
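LDA: Log-likelihood
For a training example
$$\log f_k\left(\mathbf{x}_n^{(k)}, \boldsymbol{\mu}_k, \boldsymbol{\Sigma}\right) = -\tfrac{d}{2} \log(2\pi) - \tfrac{1}{2} \log\left|\boldsymbol{\Sigma}\right| - \tfrac{1}{2} \left(\mathbf{x}_n^{(k)} - \boldsymbol{\mu}_k\right)^T \boldsymbol{\Sigma}^{-1} \left(\mathbf{x}_n^{(k)} - \boldsymbol{\mu}_k\right)$$
Hence
$$\sum_{k=1}^{c} \sum_{n=1}^{N_k} \log f_k\left(\mathbf{x}_n^{(k)}, \boldsymbol{\theta}_k\right) = -\tfrac{dN}{2} \log(2\pi) - \tfrac{N}{2} \log\left|\boldsymbol{\Sigma}\right| - \tfrac{1}{2} \sum_{k=1}^{c} \sum_{n=1}^{N_k} \left(\mathbf{x}_n^{(k)} - \boldsymbol{\mu}_k\right)^T \boldsymbol{\Sigma}^{-1} \left(\mathbf{x}_n^{(k)} - \boldsymbol{\mu}_k\right)$$
And the log-likelihood over all training examples
$$L\left(\boldsymbol{\theta} = \{\boldsymbol{\mu}_1, \ldots, \boldsymbol{\mu}_c, \boldsymbol{\Sigma}\}\right) = -\tfrac{dN}{2} \log(2\pi) - \tfrac{N}{2} \log\left|\boldsymbol{\Sigma}\right| - \tfrac{1}{2} \sum_{k=1}^{c} \sum_{n=1}^{N_k} \left(\mathbf{x}_n^{(k)} - \boldsymbol{\mu}_k\right)^T \boldsymbol{\Sigma}^{-1} \left(\mathbf{x}_n^{(k)} - \boldsymbol{\mu}_k\right) + \sum_{k=1}^{c} N_k \log \pi_k$$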
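LDA: Maximising the log-likelihood function
1. Class means:
$$\frac{\partial L}{\partial \boldsymbol{\mu}_k} = \text{[many steps omitted!]} = \boldsymbol{\Sigma}^{-1} \left(\sum_{n=1}^{N_k} \mathbf{x}_n^{(k)} - N_k \boldsymbol{\mu}_k\right) = 0
\quad\Rightarrow\quad
\boldsymbol{\mu}_k = \frac{1}{N_k} \sum_{n=1}^{N_k} \mathbf{x}_n^{(k)}$$
2. Common covariance:
$$\frac{\partial L}{\partial \boldsymbol{\Sigma}^{-1}} = \text{[many steps omitted!]} = \mathbf{H} \times \left(N \boldsymbol{\Sigma} - \sum_{k=1}^{c} \sum_{n=1}^{N_k} \left(\mathbf{x}_n^{(k)} - \boldsymbol{\mu}_k\right) \left(\mathbf{x}_n^{(k)} - \boldsymbol{\mu}_k\right)^T\right)$$
where × is element-wise multiplication and
$$\mathbf{H} = \begin{bmatrix} 0.5 & 1 & \cdots & 1 \\ 1 & 0.5 & & \vdots \\ \vdots & & \ddots & 1 \\ 1 & \cdots & 1 & 0.5 \end{bmatrix}$$
Setting the derivative to zero gives
$$\boldsymbol{\Sigma} = \frac{1}{N} \sum_{k=1}^{c} \sum_{n=1}^{N_k} \left(\mathbf{x}_n^{(k)} - \boldsymbol{\mu}_k\right) \left(\mathbf{x}_n^{(k)} - \boldsymbol{\mu}_k\right)^T$$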
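The two maximum-likelihood results for LDA (class means and a common covariance pooled over all classes and divided by N) can be sketched in a few lines. The function name `lda_fit`, the toy data, and the estimate of the priors as N_k/N are assumptions made for this illustration; the slides do not state a prior estimate explicitly.

```python
def lda_fit(X, labels):
    """Estimate class means, the common covariance and priors.

    X is a list of d-dimensional feature lists, labels gives the class of each row.
    """
    classes = sorted(set(labels))
    d, N = len(X[0]), len(X)
    means = {}
    for k in classes:
        rows = [x for x, y in zip(X, labels) if y == k]
        means[k] = [sum(col) / len(rows) for col in zip(*rows)]  # mu_k
    cov = [[0.0] * d for _ in range(d)]
    for x, y in zip(X, labels):
        diff = [xi - mi for xi, mi in zip(x, means[y])]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j] / N  # pooled over all classes
    priors = {k: labels.count(k) / N for k in classes}  # assumed estimate N_k / N
    return means, cov, priors

X = [[1.0, 2.0], [3.0, 2.0], [6.0, 5.0], [8.0, 7.0]]
labels = [0, 0, 1, 1]
means, cov, priors = lda_fit(X, labels)
```

Note the division by N rather than N − c: this is the maximum-likelihood estimate from the derivation, not the unbiased pooled estimate.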
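QDA: Separate Gaussian parametric model for the class densities
The class densities are modelled with a Gaussian model (d-dimensional) with a separate covariance for each class:
$$f_k\left(\mathbf{x}, \boldsymbol{\theta}_k = \{\boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k\}\right) = (2\pi)^{-d/2} \left|\boldsymbol{\Sigma}_k\right|^{-1/2} \exp\left[-\tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_k)^T \boldsymbol{\Sigma}_k^{-1} (\mathbf{x} - \boldsymbol{\mu}_k)\right]$$
where $\boldsymbol{\mu}_k$ is the class mean and $\boldsymbol{\Sigma}_k$ is the class-specific covariance.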
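QDA: Log-likelihood
For a training example
$$\log f_k\left(\mathbf{x}_n^{(k)}, \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k\right) = -\tfrac{d}{2} \log(2\pi) - \tfrac{1}{2} \log\left|\boldsymbol{\Sigma}_k\right| - \tfrac{1}{2} \left(\mathbf{x}_n^{(k)} - \boldsymbol{\mu}_k\right)^T \boldsymbol{\Sigma}_k^{-1} \left(\mathbf{x}_n^{(k)} - \boldsymbol{\mu}_k\right)$$
Hence
$$\sum_{k=1}^{c} \sum_{n=1}^{N_k} \log f_k\left(\mathbf{x}_n^{(k)}, \boldsymbol{\theta}_k\right) = -\tfrac{dN}{2} \log(2\pi) - \tfrac{1}{2} \sum_{k=1}^{c} N_k \log\left|\boldsymbol{\Sigma}_k\right| - \tfrac{1}{2} \sum_{k=1}^{c} \sum_{n=1}^{N_k} \left(\mathbf{x}_n^{(k)} - \boldsymbol{\mu}_k\right)^T \boldsymbol{\Sigma}_k^{-1} \left(\mathbf{x}_n^{(k)} - \boldsymbol{\mu}_k\right)$$
And the log-likelihood over all training examples
$$L\left(\boldsymbol{\theta} = \{\boldsymbol{\mu}_1, \ldots, \boldsymbol{\mu}_c, \boldsymbol{\Sigma}_1, \ldots, \boldsymbol{\Sigma}_c\}\right) = -\tfrac{dN}{2} \log(2\pi) - \tfrac{1}{2} \sum_{k=1}^{c} N_k \log\left|\boldsymbol{\Sigma}_k\right| - \tfrac{1}{2} \sum_{k=1}^{c} \sum_{n=1}^{N_k} \left(\mathbf{x}_n^{(k)} - \boldsymbol{\mu}_k\right)^T \boldsymbol{\Sigma}_k^{-1} \left(\mathbf{x}_n^{(k)} - \boldsymbol{\mu}_k\right) + \sum_{k=1}^{c} N_k \log \pi_k$$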
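QDA: Maximising the log-likelihood function
1. Class means:
$$\frac{\partial L}{\partial \boldsymbol{\mu}_k} = \text{[many steps omitted!]} = \boldsymbol{\Sigma}_k^{-1} \left(\sum_{n=1}^{N_k} \mathbf{x}_n^{(k)} - N_k \boldsymbol{\mu}_k\right) = 0
\quad\Rightarrow\quad
\boldsymbol{\mu}_k = \frac{1}{N_k} \sum_{n=1}^{N_k} \mathbf{x}_n^{(k)}$$
2. Class-specific covariances:
$$\frac{\partial L}{\partial \boldsymbol{\Sigma}_k^{-1}} = \text{[many steps omitted!]} = \mathbf{H} \times \left(N_k \boldsymbol{\Sigma}_k - \sum_{n=1}^{N_k} \left(\mathbf{x}_n^{(k)} - \boldsymbol{\mu}_k\right) \left(\mathbf{x}_n^{(k)} - \boldsymbol{\mu}_k\right)^T\right)$$
where × is element-wise multiplication and
$$\mathbf{H} = \begin{bmatrix} 0.5 & 1 & \cdots & 1 \\ 1 & 0.5 & & \vdots \\ \vdots & & \ddots & 1 \\ 1 & \cdots & 1 & 0.5 \end{bmatrix}$$
Setting the derivative to zero gives
$$\boldsymbol{\Sigma}_k = \frac{1}{N_k} \sum_{n=1}^{N_k} \left(\mathbf{x}_n^{(k)} - \boldsymbol{\mu}_k\right) \left(\mathbf{x}_n^{(k)} - \boldsymbol{\mu}_k\right)^T$$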
Covariance inversion
If a covariance matrix does not have full rank then it cannot be inverted.
Work around
– Identify the columns of the covariance (CV) matrix associated with zero eigenvalues
– Remove these columns from the CV matrix (equivalent to removing the corresponding features)
– Invert the submatrix
Training Equations
Define an associated (c×1) target vector $\mathbf{t}_n$, which has one element set to 1 and all other elements set to zero, for each of the N (d×1) training feature vectors $\mathbf{x}_n$. The position of the element with value 1 indicates the class, e.g. for a four class problem a target vector indicating that the associated training feature vector belongs to class 3 is $\mathbf{t} = [0\ 0\ 1\ 0]^T$.
Form a (d×N) matrix of feature vectors, $\mathbf{X} = [\mathbf{x}_1\ \mathbf{x}_2\ \ldots\ \mathbf{x}_N]$,
a (c×N) matrix of target vectors, $\mathbf{T} = [\mathbf{t}_1\ \mathbf{t}_2\ \ldots\ \mathbf{t}_N]$,
a (d×c) matrix of mean vectors, $\mathbf{M} = [\boldsymbol{\mu}_1\ \boldsymbol{\mu}_2\ \ldots\ \boldsymbol{\mu}_c]$,
and a (c×1) vector of prior probabilities, $\boldsymbol{\pi} = [\pi_1\ \pi_2\ \ldots\ \pi_c]^T$.
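Matlab implementation – Target vectors

| Vector elements | Class 1 | Class 2 | Class 3 | Class 4 | Class 5 |
|---|---|---|---|---|---|
| t1 | 1 | 0 | 0 | 0 | 0 |
| t2 | 0 | 1 | 0 | 0 | 0 |
| t3 | 0 | 0 | 1 | 0 | 0 |
| t4 | 0 | 0 | 0 | 1 | 0 |
| t5 | 0 | 0 | 0 | 0 | 1 |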
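The slide refers to a Matlab implementation that is not reproduced in the transcript. As a stand-in, here is a small Python sketch of how one-of-c target vectors and the (c×N) target matrix T might be built from class labels; the function names and the 1-based class indexing are assumptions for the example.

```python
def target_vector(class_index, n_classes):
    """Return the one-of-c target vector for a 1-based class index."""
    t = [0] * n_classes
    t[class_index - 1] = 1
    return t

def target_matrix(class_indices, n_classes):
    """Build the (c x N) matrix T whose n-th column is the target vector t_n."""
    columns = [target_vector(k, n_classes) for k in class_indices]
    return [list(row) for row in zip(*columns)]

# Four training cases belonging to classes 3, 1, 3 and 2 of a four-class problem.
T = target_matrix([3, 1, 3, 2], n_classes=4)
```

Each column of `T` contains exactly one 1, and summing a row of `T` gives the number of training cases in that class.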
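Training Equations
$$\mathbf{M} = \mathbf{X} \mathbf{T}^T \left(\mathbf{T} \mathbf{T}^T\right)^{-1}$$
LDA
$$\boldsymbol{\Sigma} = \mathbf{X} \left(\mathbf{X} - \mathbf{M} \mathbf{T}\right)^T / N$$
QDA
$$\boldsymbol{\Sigma}_k = \mathbf{X} \left(\left(\mathbf{X} \times \left(\mathbf{1}_{[d,1]} \mathbf{T}_{k.}\right)\right) - \boldsymbol{\mu}_k \mathbf{T}_{k.}\right)^T / N_k$$
$\mathbf{T}_{k.}$ means the kth row of matrix $\mathbf{T}$, $\mathbf{1}_{[d,1]}$ is a (d×1) vector of ones and × is element-wise multiplication.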
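As a numerical check of the matrix forms M = X Tᵀ (T Tᵀ)⁻¹ and Σ = X (X − M T)ᵀ / N, the sketch below evaluates them on a tiny two-class data set. The helper functions and the data are assumptions for the illustration; the inverse is taken directly because T Tᵀ is diagonal with the class counts N_k on its diagonal.

```python
def matmul(A, B):
    """Plain list-of-lists matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

X = [[1.0, 3.0, 6.0, 8.0],   # (d x N) feature matrix, one column per case
     [2.0, 2.0, 5.0, 7.0]]
T = [[1, 1, 0, 0],           # (c x N) target matrix
     [0, 0, 1, 1]]
N = len(X[0])

class_sums = matmul(X, transpose(T))   # X T^T, per-class feature sums (d x c)
counts = matmul(T, transpose(T))       # T T^T, diagonal matrix of class counts N_k
# T T^T is diagonal, so multiplying by its inverse is a division by N_k.
M = [[class_sums[i][k] / counts[k][k] for k in range(len(T))] for i in range(len(X))]

residual = [[x - m for x, m in zip(rx, rm)] for rx, rm in zip(X, matmul(M, T))]
Sigma = [[v / N for v in row] for row in matmul(X, transpose(residual))]
```

The columns of `M` are the class means and `Sigma` equals the pooled within-class covariance obtained by summing outer products case by case.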
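Processing – general form
$$\begin{aligned}
p_k &= \frac{\pi_k f_k(\mathbf{x}, \boldsymbol{\theta}_k)}{\sum_{l=1}^{c} \pi_l f_l(\mathbf{x}, \boldsymbol{\theta}_l)}
= \frac{\exp(K)\, \pi_k f_k(\mathbf{x}, \boldsymbol{\theta}_k)}{\sum_{l=1}^{c} \exp(K)\, \pi_l f_l(\mathbf{x}, \boldsymbol{\theta}_l)} \\
&= \frac{\exp(K) \exp\left(\log\left(\pi_k f_k(\mathbf{x}, \boldsymbol{\theta}_k)\right)\right)}{\sum_{l=1}^{c} \exp(K) \exp\left(\log\left(\pi_l f_l(\mathbf{x}, \boldsymbol{\theta}_l)\right)\right)}
= \frac{\exp\left(\log\left(\pi_k f_k(\mathbf{x}, \boldsymbol{\theta}_k)\right) + K\right)}{\sum_{l=1}^{c} \exp\left(\log\left(\pi_l f_l(\mathbf{x}, \boldsymbol{\theta}_l)\right) + K\right)} \\
&= \frac{\exp(y_k)}{\sum_{l=1}^{c} \exp(y_l)}, \quad \text{where } y_k = \log\left(\pi_k f_k(\mathbf{x}, \boldsymbol{\theta}_k)\right) + K
\end{aligned}$$
Here K is any constant that does not depend on the class.
Now
$$p_k = \frac{\pi_k f_k(\mathbf{x}, \boldsymbol{\theta}_k)}{\sum_{l=1}^{c} \pi_l f_l(\mathbf{x}, \boldsymbol{\theta}_l)}
\quad\Rightarrow\quad
\sum_{k=1}^{c} p_k = \frac{\sum_{k=1}^{c} \pi_k f_k(\mathbf{x}, \boldsymbol{\theta}_k)}{\sum_{l=1}^{c} \pi_l f_l(\mathbf{x}, \boldsymbol{\theta}_l)} = 1$$
$$\therefore\quad p_1 = 1 - \sum_{l=2}^{c} p_l, \qquad p_k = \frac{\exp(y_k)}{\sum_{l=1}^{c} \exp(y_l)}$$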
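The general form says that posteriors are a normalised exponential of the class scores y_k, and that adding the same constant K to every score changes nothing. A minimal sketch of that property follows; the function name `softmax` and the choice of K as the negative of the largest score (a common trick to avoid overflow) are assumptions of this example rather than something the slides prescribe.

```python
import math

def softmax(y):
    """p_k = exp(y_k) / sum_l exp(y_l), computed with K = -max(y) for stability."""
    shift = max(y)
    e = [math.exp(v - shift) for v in y]
    s = sum(e)
    return [v / s for v in e]

y = [1.0, 2.0, 3.0]
p = softmax(y)
# Adding a class-independent constant to every score leaves the posteriors unchanged.
p_shifted = softmax([v + 100.0 for v in y])
```

Because of this invariance, any term of log(π_k f_k) that is common to all classes can be dropped before the scores are exponentiated.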
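Processing - LDA
$$\begin{aligned}
\log\left(\pi_k f_k(\mathbf{x}, \boldsymbol{\theta}_k)\right) &= \log \pi_k - \tfrac{d}{2} \log 2\pi - \tfrac{1}{2} \log\left|\boldsymbol{\Sigma}\right| - \tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_k)^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}_k) \\
&= \log \pi_k - \tfrac{d}{2} \log 2\pi - \tfrac{1}{2} \mathbf{x}^T \boldsymbol{\Sigma}^{-1} \mathbf{x} + \boldsymbol{\mu}_k^T \boldsymbol{\Sigma}^{-1} \mathbf{x} - \tfrac{1}{2} \boldsymbol{\mu}_k^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_k - \tfrac{1}{2} \log\left|\boldsymbol{\Sigma}\right| \\
&= \log \pi_k + \boldsymbol{\mu}_k^T \boldsymbol{\Sigma}^{-1} \mathbf{x} - \tfrac{1}{2} \boldsymbol{\mu}_k^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_k + K, \quad \text{where } K = -\tfrac{d}{2} \log 2\pi - \tfrac{1}{2} \log\left|\boldsymbol{\Sigma}\right| - \tfrac{1}{2} \mathbf{x}^T \boldsymbol{\Sigma}^{-1} \mathbf{x}
\end{aligned}$$
K is the same for every class, so it cancels in the posterior and can be dropped:
$$y_k = \log \pi_k + \boldsymbol{\mu}_k^T \boldsymbol{\Sigma}^{-1} \mathbf{x} - \tfrac{1}{2} \boldsymbol{\mu}_k^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_k = \log \pi_k + \mathbf{a}_k^T \mathbf{x} + b_k$$
$$\mathbf{a}_k = \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_k, \qquad b_k = -\tfrac{1}{2} \boldsymbol{\mu}_k^T \boldsymbol{\Sigma}^{-1} \boldsymbol{\mu}_k$$
Linear equation in $\mathbf{x}$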
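To make the linear form concrete, the following sketch evaluates y_k = log π_k + a_kᵀx + b_k for a two-feature, two-class problem and converts the scores to posteriors. The explicit 2×2 inverse, the function names and the parameter values are assumptions made to keep the example self-contained.

```python
import math

def inv2(S):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = S
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def lda_scores(x, means, cov, priors):
    """Return y_k = log(pi_k) + a_k^T x + b_k for every class."""
    Si = inv2(cov)
    scores = []
    for mu, pi in zip(means, priors):
        a = [Si[0][0] * mu[0] + Si[0][1] * mu[1],
             Si[1][0] * mu[0] + Si[1][1] * mu[1]]   # a_k = Sigma^-1 mu_k
        b = -0.5 * (mu[0] * a[0] + mu[1] * a[1])     # b_k = -1/2 mu_k^T Sigma^-1 mu_k
        scores.append(math.log(pi) + a[0] * x[0] + a[1] * x[1] + b)
    return scores

def posteriors(scores):
    """Normalised exponential of the scores."""
    m = max(scores)
    e = [math.exp(s - m) for s in scores]
    total = sum(e)
    return [v / total for v in e]

means = [[2.0, 2.0], [7.0, 6.0]]
cov = [[1.0, 0.5], [0.5, 0.5]]
priors = [0.5, 0.5]
y = lda_scores([2.0, 2.0], means, cov, priors)   # x placed at the class-1 mean
p = posteriors(y)
```

In a deployed classifier a_k and b_k would be computed once after training, leaving only a dot product per class at run time.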
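Processing - QDA
$$\begin{aligned}
\log\left(\pi_k f_k(\mathbf{x}, \boldsymbol{\theta}_k)\right) &= \log \pi_k - \tfrac{d}{2} \log 2\pi - \tfrac{1}{2} \log\left|\boldsymbol{\Sigma}_k\right| - \tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_k)^T \boldsymbol{\Sigma}_k^{-1} (\mathbf{x} - \boldsymbol{\mu}_k) \\
&= \log \pi_k - \tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_k)^T \boldsymbol{\Sigma}_k^{-1} (\mathbf{x} - \boldsymbol{\mu}_k) - \tfrac{d}{2} \log 2\pi - \tfrac{1}{2} \log\left|\boldsymbol{\Sigma}_k\right| \\
&= \log \pi_k - \tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu}_k)^T \boldsymbol{\Sigma}_k^{-1} (\mathbf{x} - \boldsymbol{\mu}_k) - \tfrac{1}{2} \log\left|\boldsymbol{\Sigma}_k\right| - K, \quad \text{where } K = \tfrac{d}{2} \log 2\pi
\end{aligned}$$
$$y_k = 2 \log \pi_k - (\mathbf{x} - \boldsymbol{\mu}_k)^T \boldsymbol{\Sigma}_k^{-1} (\mathbf{x} - \boldsymbol{\mu}_k) - \log\left|\boldsymbol{\Sigma}_k\right|$$
Quadratic equation in $\mathbf{x}$
This $y_k$ is twice $\log(\pi_k f_k) + K$, so it ranks the classes correctly; halve it before using it in $p_k = \exp(y_k) / \sum_l \exp(y_l)$.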
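A one-feature sketch of the quadratic score, where the covariance reduces to a variance, shows both uses: picking the class with the largest score, and recovering posteriors after halving the score. The function names and parameter values are assumptions for the example.

```python
import math

def qda_score(x, prior, mean, var):
    """1-D QDA score: y_k = 2 log(pi_k) - (x - mu_k)^2 / var_k - log(var_k)."""
    return 2.0 * math.log(prior) - (x - mean) ** 2 / var - math.log(var)

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def normalise(values):
    total = sum(values)
    return [v / total for v in values]

priors, means, variances = [0.6, 0.4], [0.0, 3.0], [1.0, 4.0]
x = 1.5
scores = [qda_score(x, p, m, v) for p, m, v in zip(priors, means, variances)]

# Halving the score gives log(pi_k f_k) up to a class-independent constant.
p_from_scores = normalise([math.exp(0.5 * s) for s in scores])
p_from_bayes = normalise([p * normal_pdf(x, m, v)
                          for p, m, v in zip(priors, means, variances)])
predicted_class = scores.index(max(scores)) + 1   # 1-based class label
```

The two posterior vectors agree, which confirms that the factor of two only matters when the scores are turned into probabilities.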
Feedforward Neural Networks
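Multilayer perceptron artificial neural network
0 or more hidden layers
[Figure: network diagram. The input features $x_1, x_2, \ldots, x_d$ feed M hidden nodes with weights $w_{11}^{(1)} \ldots w_{Md}^{(1)}$, biases $b_1^{(1)} \ldots b_M^{(1)}$ and outputs $y_1^{(1)} \ldots y_M^{(1)}$. The hidden outputs feed c output nodes with weights $w_{11}^{(2)} \ldots w_{cM}^{(2)}$, biases $b_1^{(2)} \ldots b_c^{(2)}$ and outputs $y_1^{(2)} \ldots y_c^{(2)}$, which are the output classes.]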
Feedforward Neural Networks
Flexible linear or nonlinear mapping from features to classes
A feedforward neural network is a 'universal function approximator'
Except for linear networks, training requires numerical optimisation
Back propagation algorithm used for efficient training
Nodal equation
$$y_m^{(1)} = \varphi\left(b_m^{(1)} + \sum_{i=1}^{d} w_{mi}^{(1)} x_i\right)$$
where $y_m^{(1)}$ is the output, the bracketed term is the linearly summed inputs and $\varphi$ is the non-linearity.
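The nodal equation can be sketched directly. The slides do not name the non-linearity, so the use of tanh below is an assumption, as are the function names and the weight values.

```python
import math

def node_output(x, weights, bias, phi=math.tanh):
    """y_m = phi(b_m + sum_i w_mi * x_i) for a single node."""
    return phi(bias + sum(w * xi for w, xi in zip(weights, x)))

def layer_output(x, W, b, phi=math.tanh):
    """Apply the nodal equation to every node of a layer (one row of W per node)."""
    return [node_output(x, w_row, b_m, phi) for w_row, b_m in zip(W, b)]

# Two input features feeding two hidden nodes.
hidden = layer_output([1.0, -2.0], W=[[0.5, 0.25], [1.0, 1.0]], b=[0.0, 2.0])
```

A full network repeats `layer_output` once per layer, feeding each layer's outputs in as the next layer's inputs.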
3. Performance Measurement Methods
Two-way classification
– Prior probabilities
– Sensitivity
– Specificity
– Positive and negative predictivity
– Accuracy
Multiway classification
– Prior probabilities
– Sensitivity
– Specificity
– Positive and negative predictivity
– Accuracy
Data splitting
Two-way classification

| Diagnostic allocation | True status: Abnormal (D) | True status: Normal (~D) | Total |
|---|---|---|---|
| Abnormal (S) | a | b | a+b |
| Normal (~S) | c | d | c+d |
| Total | a+c | b+d | N |

a is the number of cases which were classified abnormal and were truly abnormal (TP).
b is the number of cases which were classified abnormal but were in fact normal (FP).
c is the number of cases which were classified normal but were in fact abnormal (FN).
d is the number of cases which were classified normal and were truly normal (TN).
N is the total number of cases.
Probability of having the disease (P) = $\frac{a+c}{N}$
Probability of not having the disease = $\frac{b+d}{N} = 1 - P$
Two-way classification
Using the counts a, b, c, d and N of the two-way table:
Sensitivity (Se) = p(S|D) = proportion of correctly classified abnormals = $\frac{a}{a+c}$
Specificity (Sp) = p(~S|~D) = proportion of correctly classified normals = $\frac{d}{b+d}$
Accuracy (A) = proportion of correctly classified cases = $\frac{a+d}{N}$
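The two-way measures follow directly from the four counts of the table. A minimal sketch, with a hypothetical function name and made-up counts:

```python
def two_way_measures(a, b, c, d):
    """Measures from a 2x2 table: a = TP, b = FP, c = FN, d = TN."""
    n = a + b + c + d
    return {
        "prevalence": (a + c) / n,      # P, probability of having the disease
        "sensitivity": a / (a + c),     # Se = p(S|D)
        "specificity": d / (b + d),     # Sp = p(~S|~D)
        "accuracy": (a + d) / n,        # A
    }

m = two_way_measures(a=40, b=10, c=20, d=130)
```

With these counts the accuracy (0.85) sits between the sensitivity (about 0.67) and the specificity (about 0.93), weighted by the prevalence of each class.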
Two-way classification
Using the counts a, b, c, d of the two-way table:
Predictive value of a positive test (PV+) = p(D|S) = proportion of positive allocations that are correct
$$PV^{+} = \frac{a}{a+b} = \frac{P \cdot Se}{P \cdot Se + (1-P)(1-Sp)}$$
Predictive value of a negative test (PV-) = p(~D|~S) = proportion of negative allocations that are correct
$$PV^{-} = \frac{d}{c+d} = \frac{(1-P) \cdot Sp}{P \cdot (1-Se) + (1-P) \cdot Sp}$$
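The predictive values can be computed either from the table counts or from prevalence, sensitivity and specificity via Bayes' rule. The sketch below checks that the two routes agree; the function name and the counts are assumptions for the example.

```python
def predictive_values(p, se, sp):
    """PV+ and PV- from prevalence P, sensitivity Se and specificity Sp."""
    pv_pos = p * se / (p * se + (1 - p) * (1 - sp))
    pv_neg = (1 - p) * sp / ((1 - p) * sp + p * (1 - se))
    return pv_pos, pv_neg

a, b, c, d = 40, 10, 20, 130          # TP, FP, FN, TN (made-up counts)
n = a + b + c + d
pv_pos, pv_neg = predictive_values(p=(a + c) / n, se=a / (a + c), sp=d / (b + d))
```

Unlike sensitivity and specificity, the predictive values depend on the prevalence P, so they change when the same test is applied to a population with a different disease rate.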
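Multiway classification
Rows give the true status and columns give the diagnostic allocation:

| True status | No disease | Disease 1 | Disease 2 | … | Disease n | Sum |
|---|---|---|---|---|---|---|
| No disease | N00 | N01 | N02 | … | N0n | N0. |
| Disease 1 | N10 | N11 | N12 | … | N1n | N1. |
| Disease 2 | N20 | N21 | N22 | … | N2n | N2. |
| ⋮ | | | | | | |
| Disease n | Nn0 | Nn1 | Nn2 | … | Nnn | Nn. |
| Sum | N.0 | N.1 | N.2 | … | N.n | N.. |

Prevalence of disease i = probability of having disease i: $P_i = \frac{N_{i.}}{N_{..}}$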
Multiway classification
Using the counts $N_{ij}$ of the multiway table (true status i, diagnostic allocation j):
Sensitivity for disease i = proportion of correctly classified cases with disease i: $Se_i = \frac{N_{ii}}{N_{i.}}$
Specificity = proportion of correctly classified normals = $\frac{N_{00}}{N_{0.}}$
Accuracy (A) = proportion of correctly classified cases = $\frac{\sum_i N_{ii}}{N_{..}}$
Multiway classification
Using the counts $N_{ij}$ of the multiway table (true status i, diagnostic allocation j):
Predictive value for disease i = proportion of cases classified disease i which are correct: $PV_i = \frac{N_{ii}}{N_{.i}}$
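The multiway measures are row-wise and column-wise ratios of the confusion table. A sketch with a hypothetical function name and made-up counts, taking index 0 as "no disease" and rows as true status:

```python
def multiway_measures(table):
    """table[i][j] = number of cases with true status i allocated to class j."""
    n = len(table)
    total = sum(sum(row) for row in table)                                   # N..
    row_sums = [sum(row) for row in table]                                   # N_i.
    col_sums = [sum(table[i][j] for i in range(n)) for j in range(n)]        # N_.j
    return {
        "prevalence": [row_sums[i] / total for i in range(n)],
        "sensitivity": {i: table[i][i] / row_sums[i] for i in range(1, n)},
        "specificity": table[0][0] / row_sums[0],
        "predictive_value": {i: table[i][i] / col_sums[i] for i in range(1, n)},
        "accuracy": sum(table[i][i] for i in range(n)) / total,
    }

table = [[80, 15, 5],    # true status: no disease
         [10, 35, 5],    # true status: disease 1
         [5, 10, 35]]    # true status: disease 2
m = multiway_measures(table)
```

The sketch does not guard against empty rows or columns; a class that is never allocated would make its predictive value undefined.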
Data splitting
Ideal
– Maximum data for training
– Maximum data for testing
– Training and test data independent
– Conflicting requirements
Resubstitution
– Train and test on same recordings
– Positively biased results
Holdout
– Train on one sample, test on remaining sample
Cross fold validation
Cross fold validation
[Figure: illustration of 5-fold cross validation using 10 ECG records, with each record marked as a training case or a testing case.]
The data is divided into 5 mutually exclusive folds and the classifier is trained and tested 5 times. Each time a different test fold is used and the remainder of the data is used for training.
Unbiased, computationally intensive
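The splitting scheme in the illustration can be sketched as follows. The generator name, the record identifiers and the way records are dealt into folds are assumptions; any assignment into mutually exclusive folds would serve.

```python
def k_fold_splits(records, k):
    """Yield (training records, test records) for each of k mutually exclusive folds."""
    folds = [records[i::k] for i in range(k)]   # deal the records into k folds
    for i in range(k):
        test = folds[i]
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train, test

records = [f"rec{n:02d}" for n in range(1, 11)]   # 10 hypothetical ECG records
splits = list(k_fold_splits(records, k=5))
```

Splitting by record rather than by individual epoch keeps all data from one recording on the same side of the train/test boundary.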
4. Features
Transformations
Missing values
Feature Selection
Transformations
Look at the histogram of each feature. If the distribution is skewed, try applying a transformation (for example a logarithm) that results in a less skewed distribution.
[Figure: histogram of original feature; histogram of log(feature).]
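One way to put a number on what the histograms show is the sample skewness before and after the transformation. The moment-based skewness formula and the synthetic feature values below are assumptions chosen for the illustration.

```python
import math

def skewness(values):
    """Sample skewness m3 / m2^1.5, using central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

feature = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]   # strongly right-skewed
logged = [math.log(v) for v in feature]             # evenly spaced after the log
```

Here the raw feature has a clearly positive skewness while the logged feature is symmetric. A log transform only applies to strictly positive features.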
Missing Values
Practical data sets often have missing feature values due to faulty measurements etc.
A majority of classifier models require all feature values to be present
What to do?
– Delete all cases with one or more missing feature values
– Estimate the missing feature values
  – Replace with the average (bad choice as it distorts the feature distribution)
  – Replace with a random value
  – Replace with the value from another case that is "similar" and has all features
See academic.uprm.edu/~eacuna/IFCS04r.pdf for a good summary
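Two of the options, case deletion and borrowing the value from a "similar" complete case, might be sketched as below. Marking missing values with `None` and measuring similarity by squared Euclidean distance over the observed features are assumptions of this example, not choices stated in the slides.

```python
def delete_incomplete(cases):
    """Keep only the cases that have every feature value."""
    return [c for c in cases if None not in c]

def impute_from_similar(cases):
    """Fill each missing value from the closest complete case (the donor)."""
    complete = delete_incomplete(cases)
    filled = []
    for case in cases:
        if None not in case:
            filled.append(list(case))
            continue
        known = [i for i, v in enumerate(case) if v is not None]
        # Similarity is judged on the observed features only (assumed metric).
        donor = min(complete, key=lambda c: sum((c[i] - case[i]) ** 2 for i in known))
        filled.append([donor[i] if v is None else v for i, v in enumerate(case)])
    return filled

cases = [[1.0, 10.0], [2.0, 20.0], [1.1, None], [None, 19.0]]
kept = delete_incomplete(cases)
filled = impute_from_similar(cases)
```

Deletion is simple but discards half of this toy data set, whereas imputation keeps every case at the cost of inventing values.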
Feature Selection
The aim is to find a subset of the available features that provides "acceptable" performance
– Fewer features means easier implementation
– There may exist subsets of the available features that provide higher classification performance than the full feature set
– Irrelevant and redundant features in general reduce classifier performance
– Methods to look for include "filter" and "wrapper" methods, forward selection, backward elimination, stepwise, exhaustive, beam search
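Of the listed methods, forward selection is the simplest to sketch: start with no features and repeatedly add the one that most improves a score, stopping when nothing helps. The function names and the toy scoring function are hypothetical; in a wrapper method the score would be something like cross-validated classifier accuracy.

```python
def forward_selection(n_features, score):
    """Greedy forward selection; score(subset) returns a number to maximise."""
    selected, best_score = [], float("-inf")
    remaining = list(range(n_features))
    while remaining:
        cand_score, cand = max((score(selected + [f]), f) for f in remaining)
        if cand_score <= best_score:
            break                      # no remaining feature improves the score
        selected.append(cand)
        remaining.remove(cand)
        best_score = cand_score
    return selected, best_score

def toy_score(subset):
    """Hypothetical score: features 0 and 2 help, every feature has a small cost."""
    useful = {0: 0.30, 2: 0.25}
    return 0.3 + sum(useful.get(f, 0.0) for f in subset) - 0.01 * len(subset)

selected, best = forward_selection(4, toy_score)
```

The greedy search evaluates far fewer subsets than an exhaustive search but can miss features that only help in combination.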
Part 2: Case Study: Sleep Apnea Detection using the Electrocardiogram
Case Study: Sleep Apnea Detection using the Electrocardiogram
Obstructive sleep apnea: 2-4% prevalence, disrupted sleep, treated with a Continuous Positive Airway Pressure (CPAP) mask.
Diagnosis using polysomnogram (multiple signals).
Diagnostic test costs $1500 and is carried out in hospital.
Only 15% of those with the disease have been diagnosed.
Study Objective
Determine whether a method can be found that reliably detects sleep apnea using the electrocardiogram (ECG)
Benefits
– Can do the test at home
– Low cost
– Reduce waiting lists in hospitals
Sleep Apnoea ECG database
Computers in Cardiology Conference 2000 Challenge
– Automated ECG apnoea detection.
Uses modified lead V2 ECG from a PSG database of patients at Philipps University in Germany (T. Penzel), which had been scored by sleep physiologists using the complete polysomnogram.
Supplied raw ECG waveform from a single lead and QRS detection times (unverified).
70 records in total (about 8 hrs each); 35 released for training, 35 for independent testing.
Epoch based scoring
PSG scored on an epoch-by-epoch basis.
Goal was to mimic the human scorer
– Each epoch labelled as Normal (NR) or Sleep disordered respiration (SDR)
– Ea
Splitting up of the Data
We have 70 overnight recordings of ECG
– Every minute annotated as 'normal' or 'sleep disordered breathing' by an expert
– Over 32000 labels
– 35 recordings available for training, 35 withheld for testing
Bradycardia/tachycardia patterns
• Guilleminault et al. were the first to report on the characteristic bradycardia/tachycardia pattern associated with obstructive apnoeas (Guilleminault C et al. Cyclical variation of heart rate in sleep apnea syndrome. Lancet 1984).
• Ichimaru reported on low-frequency heart rate fluctuations caused by Cheyne-Stokes respiration (Ichimaru Y, Yanaga T. Frequency characteristics of the heart rate variability produced by Cheyne-Stokes respiration during 24-hr ambulatory electrocardiographic monitoring. Comput Biomed Res 1989).
Brady/tachy patterns
[Figure from Stein et al., J. Cardiovasc. Electrophysiol., 2003.]
EDR signal
Modulation of chest lead ECG signal amplitude by respiration
Respiration
EDR(n)=area enclosed by the QRS complex(n)
ECG derived respiration (EDR)
A. Travaglini, et al, “Respiratory signal derived from eight-lead ECG,” in Computers in Cardiology, 1998.
B. G.B. Moody, et al, “Clinical Validation of the ECG-Derived Respiration (EDR) Technique,” in Computers in Cardiology, 1986.
Travaglini et al
Features

| RR interval time domain | RR interval frequency domain | EDR frequency domain |
|---|---|---|
| Serial correlation | 32 PSD features | 32 PSD features |
| SDNN | Record-based mean RR interval | Record-based mean EDR amplitude |
| Allan Factor at 5-25 secs time scale | Epoch-based st.dev. RR interval | Epoch-based st.dev. EDR amplitude |
| pNN50 | Epoch-based mean RR interval | Epoch-based mean EDR amplitude |
| NN50 | Record-based st.dev. RR interval | Record-based st.dev. EDR amplitude |

88 features in all
Experiments
Linear and quadratic discriminant classifiers
– Quick to train
– Wanted to focus the study on the features, not the classifiers
– Wanted a system that is readily implemented on microprocessors
Different combinations of feature groups:
– RR time domain and frequency domain
– EDR frequency domain
Feature Selection
Covariance regularisation (not discussed here)
Training – all features
35-fold cross validation. Each fold contained 1 record
– Removed intra-record bias
Training – Feature Selection
Use the best first feature selection strategy
Outer loop: 35-fold cross validation
Inner loop: 34-fold cross validation
As before, each fold contained 1 record
– Removed intra-record bias
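The nested arrangement can be sketched as a plan of record splits: each outer fold withholds one record for testing, and the inner folds are formed only from the remaining 34 records. Reading the inner loop as the place where feature subsets are compared is an interpretation of the slide, and the function names and record numbering are assumptions.

```python
def leave_one_record_out(records):
    """Yield (training records, held-out record), one fold per record."""
    for held_out in records:
        yield [r for r in records if r != held_out], held_out

def nested_splits(records):
    """Outer folds for performance estimation, inner folds built from outer training data."""
    for outer_train, outer_test in leave_one_record_out(records):
        inner = list(leave_one_record_out(outer_train))
        yield outer_train, outer_test, inner

records = list(range(1, 36))          # 35 training records, numbered for illustration
plan = list(nested_splits(records))
```

Because the outer test record never appears in any inner fold, decisions made in the inner loop cannot leak information about the record used for the final performance estimate.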
Performance Assessment 1
During feature selection, classifiers were compared using ‘Accuracy’
Accuracy = (TP+TN) / (TP+FN+FP+TN)
[Table: confusion matrix of Expert versus Predicted labels for the classes NR and SDR, with cells TP, FN, FP and TN]
![Page 53: Pattern Recognition applied to Biomedical Signals - …fcruz/pdf/PattRecAug07.pdfPattern Recognition applied to Biomedical Signals ... kn lf π == θ=∏∏ x θ Our ... c kkkk l l](https://reader034.vdocuments.site/reader034/viewer/2022052309/5abad65e7f8b9ad1768c0288/html5/thumbnails/53.jpg)
Performance Assessment 2
In addition, classifiers were assessed using Sensitivity and Specificity
[Table: confusion matrix of Expert versus Predicted labels for the classes NR and SDR, with cells TP, FN, FP and TN]
Sensitivity = TP / (TP+FN)
Specificity = TN / (TN+FP)
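As a worked example, the three measures can be computed from paired expert and predicted labels as sketched below. The sketch uses the standard definitions, sensitivity = TP / (TP+FN) and specificity = TN / (TN+FP), and assumes that SDR is the positive class; the function names and the toy labels are hypothetical.

```python
def confusion_counts(expert, predicted, positive="SDR"):
    """Count TP, FN, FP, TN, treating `positive` as the positive class."""
    tp = fn = fp = tn = 0
    for e, p in zip(expert, predicted):
        if e == positive and p == positive:
            tp += 1          # expert positive, predicted positive
        elif e == positive:
            fn += 1          # expert positive, predicted negative
        elif p == positive:
            fp += 1          # expert negative, predicted positive
        else:
            tn += 1          # expert negative, predicted negative
    return tp, fn, fp, tn


def summarise(tp, fn, fp, tn):
    """Accuracy, sensitivity and specificity from the four counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": tp / (tp + fn),   # fraction of expert positives found
        "specificity": tn / (tn + fp),   # fraction of expert negatives found
    }


expert = ["SDR", "SDR", "SDR", "NR", "NR", "NR", "NR", "NR"]
predicted = ["SDR", "SDR", "NR", "NR", "NR", "NR", "SDR", "NR"]
measures = summarise(*confusion_counts(expert, predicted))
```

Here TP = 2, FN = 1, FP = 1 and TN = 4, giving an accuracy of 0.75, a sensitivity of 2/3 and a specificity of 0.8. The sketch does not guard against a class being absent, in which case one of the divisions fails.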
![Page 54: Pattern Recognition applied to Biomedical Signals - …fcruz/pdf/PattRecAug07.pdfPattern Recognition applied to Biomedical Signals ... kn lf π == θ=∏∏ x θ Our ... c kkkk l l](https://reader034.vdocuments.site/reader034/viewer/2022052309/5abad65e7f8b9ad1768c0288/html5/thumbnails/54.jpg)
Results – cross validation
(a) All features, (b) Feature selection
• LDA better than QDA
• Feature selection improved QDA
• Best features were RR and EDR combined
![Page 55: Pattern Recognition applied to Biomedical Signals - …fcruz/pdf/PattRecAug07.pdfPattern Recognition applied to Biomedical Signals ... kn lf π == θ=∏∏ x θ Our ... c kkkk l l](https://reader034.vdocuments.site/reader034/viewer/2022052309/5abad65e7f8b9ad1768c0288/html5/thumbnails/55.jpg)
Results – withheld set
• Results on new data (the withheld set) were similar to the cross-validation results, which is an encouraging sign!
![Page 56: Pattern Recognition applied to Biomedical Signals - …fcruz/pdf/PattRecAug07.pdfPattern Recognition applied to Biomedical Signals ... kn lf π == θ=∏∏ x θ Our ... c kkkk l l](https://reader034.vdocuments.site/reader034/viewer/2022052309/5abad65e7f8b9ad1768c0288/html5/thumbnails/56.jpg)
Results – feature separation across classes
RR PSD features
– Good separation at low frequencies
EDR PSD features
– Good separation, particularly at low frequencies
![Page 57: Pattern Recognition applied to Biomedical Signals - …fcruz/pdf/PattRecAug07.pdfPattern Recognition applied to Biomedical Signals ... kn lf π == θ=∏∏ x θ Our ... c kkkk l l](https://reader034.vdocuments.site/reader034/viewer/2022052309/5abad65e7f8b9ad1768c0288/html5/thumbnails/57.jpg)
Practical tips
Always start with a simple system and add complexity if needed, e.g. start with LDA and progress to NN if needed
Focus on finding good discriminating features. If your features are poor then no fancy classifier will help
As best as possible, make sure your performance assessment is unbiased
– Be careful of training and testing with features from the same record
– Never make many performance assessments on the same data set then report the best, as this will be a positively biased result. Be particularly careful with feature selection, where many thousands of comparisons may be made.
Consider what the target device for the pattern recognition system is
– E.g. a low-power, low-computation device will influence your choice of features and classifier
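The warning about reporting the best of many assessments can be illustrated with a small simulation. Everything in it is hypothetical: the candidates are pure coin-flip "classifiers" with no skill at all, and the sample sizes and seed are arbitrary. Picking the best of many such candidates on one data set still gives an accuracy well above chance, while the same candidate scores close to 50% on fresh data.

```python
import random


def accuracy(pred, truth):
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def selection_bias_demo(n_samples=100, n_candidates=1000,
                        n_fresh=5000, seed=0):
    """Return (best accuracy on the reused set, accuracy on fresh data)."""
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n_samples)]

    # Assess many skill-free candidates on the SAME data and keep the best.
    best_acc = 0.0
    for _ in range(n_candidates):
        pred = [rng.randint(0, 1) for _ in range(n_samples)]
        best_acc = max(best_acc, accuracy(pred, labels))

    # The chosen candidate is still a coin flip, so on new data its
    # predictions are simulated as fresh random guesses.
    fresh_labels = [rng.randint(0, 1) for _ in range(n_fresh)]
    fresh_pred = [rng.randint(0, 1) for _ in range(n_fresh)]
    return best_acc, accuracy(fresh_pred, fresh_labels)


best_acc, fresh_acc = selection_bias_demo()
```

The gap between `best_acc` and `fresh_acc` is exactly the positive bias described above, which is why the feature selection in the case study is confined to an inner cross-validation loop.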
![Page 58: Pattern Recognition applied to Biomedical Signals - …fcruz/pdf/PattRecAug07.pdfPattern Recognition applied to Biomedical Signals ... kn lf π == θ=∏∏ x θ Our ... c kkkk l l](https://reader034.vdocuments.site/reader034/viewer/2022052309/5abad65e7f8b9ad1768c0288/html5/thumbnails/58.jpg)
Bibliography
Data splitting
– R. Kohavi, “A study of cross validation and bootstrap for accuracy estimation and model selection,” in Proc. of 14th Int. Joint Conference on Artificial Intelligence, 1995, pp. 1137-1143.
Performance estimating
– M. H. Zweig (1993), “Receiver Operator Characteristic (ROC) Plots: A Fundamental Evaluation Tool in Clinical Medicine,” Clin. Chem., vol. 39(4), pp. 561-577.
– J. Michaelis and J. L. Willems (1987), “Performance Evaluation of Diagnostic ECG Programs,” Proceedings of the Computers in Cardiology Conference, Leuven, Belgium, Sept 12-15, pp. 25-30, Edited by: K. L. Ripley, IEEE Computer Society.
Classifiers
– C. M. Bishop, Neural Networks for Pattern Recognition. New York: Oxford University Press, 1995.
– B.D. Ripley, Pattern Recognition and Neural Networks, Cambridge, England: Cambridge University Press, 1996.
– R.O. Duda, P.E. Hart, D.G. Stork, Pattern Classification, New York: John Wiley and Sons, 2001.
– Warren Sarle, SAS Institute, http://www.faqs.org/faqs/ai-faq/neural-nets/part1/
– Mike James, Classification Algorithms, John Wiley and Sons, 1985.
Sleep Apnea
– P. de Chazal, C. Heneghan, E. Sheridan, R.B. Reilly, P. Nolan, M. O’Malley (2003), “Automated Processing of the Single Lead Electrocardiogram for the Detection of Obstructive Sleep Apnea,” IEEE Transactions on Biomedical Engineering, Vol. 50, No. 6, June 2003, pp. 686-696.