8/13/2019 Boutellaa237
Face Verification Using Local Binary Patterns and
Maximum A Posteriori Vector Quantization Model
Elhocine Boutellaa (1,2), Farid Harizi (1), Messaoud Bengherabi (1), Samy Ait-Aoudia (2), and
Abdenour Hadid (3)
(1) Centre de Développement des Technologies Avancées (DZ), (2) École Nationale Supérieure d'Informatique (DZ),
(3) University of Oulu (FI)
Abstract. The popular Local binary patterns (LBP) have been highly success-
ful in representing and recognizing faces. However, the original LBP has some
problems that need to be addressed in order to increase its robustness and dis-
criminative power and to make the operator suitable for the needs of different
types of problems. In particular, a serious drawback of the LBP method concerns the
number of entries in the LBP histograms: too few bins fail
to provide enough discriminative information about the face appearance, while
too many bins may lead to sparse and unstable histograms. To overcome
this drawback, we propose an efficient and compact LBP representation
for face verification using a vector quantization maximum a posteriori adaptation
(VQ-MAP) model. In the proposed approach, a face is divided into equal blocks
from which LBP features are extracted. We then efficiently represent the face by a
compact feature vector issued by clustering LBP patterns in each block. Finally,
we model faces using VQ-MAP and use the mean squared error for similarity
score computation. We extensively evaluate our proposed approach on two pub-
licly available benchmark databases and compare the results against not only the
original LBP approach but also other LBP variants, demonstrating very promis-
ing results.
1 Introduction
It is widely believed that biometrics will become a significant component of the iden-
tification technology and it is already of universal interest. The goal of a biometric
system is to determine the identity of an individual using physical/biological charac-
teristics (i.e. biometric modalities). Biometric systems have many applications such
as criminal identification, airport checking, computer or mobile devices log-in, build-
ing gate control, digital multimedia access, transaction authentication, voice mail, or
secure teleworking. Various characteristics can be used: from the most conventional
biometric modalities such as face, voice, fingerprint, iris, hand geometry or signature,
to the so-called emerging biometric modalities such as gait, hand-grip, ear, body odour,
body salinity, electroencephalogram or DNA. Each modality has its strengths and drawbacks [1].
Biometric systems can run in two fundamentally distinct modes: (i) verification
(or authentication) and (ii) recognition (more popularly known as identification). In au-
thentication mode, the system aims to confirm or deny the identity claimed by a person
(one-to-one matching) while in recognition mode the system aims to identify an individ-
ual from a database (one-to-many matching). Because of its natural and non-intrusive
interaction, identity verification and recognition using facial information is among the
most active and challenging areas in computer vision research [1,2]. However, despite
the great deal of progress in recent years [2], face biometrics (that is, identifying
individuals based on their face information) is still a major area of research. A wide range
of viewpoints, aging of subjects and complex outdoor lighting remain challenges in
face recognition.
There are numerous ways to categorize different face description approaches. One
of the most widely used divisions is to distinguish whether the method is based on rep-
resenting the feature statistics of small local face patches (local) or computing features
directly from the entire image or video (global). Lately the local methods have proved
to be more effective in real world conditions whereas the other approaches have almost
disappeared. However the global methods have recently started to partially reappear to
complement the local descriptors. A survey on different face descriptions can be found
in [2].
Recent developments in face analysis showed that local binary patterns (LBP) [3]
provide excellent results in representing faces [4,5]. LBP is a gray-scale invariant
texture operator which labels the pixels of an image by thresholding the neighborhood of
each pixel with the value of the center pixel and considers the result as a binary number.
LBP labels can be regarded as local primitives such as curved edges, spots, flat areas
etc. The histogram of the labels can be then used as a face descriptor. Due to its dis-
criminative power and computational simplicity, the LBP methodology has attained an
established position in face analysis (see footnote 1) and has inspired plenty of new research on related
methods.
The original LBP has some problems that need to be addressed in order to increase
its robustness and discriminative power and to make the operator suitable for the needs
of different types of problems. For instance, a serious drawback of the LBP method concerns
the number of entries in the LBP histograms: too few bins
fail to provide enough discriminative information about the face appearance, while
too many bins may lead to sparse and unstable histograms. To overcome this
drawback, we propose an efficient and compact LBP representation for face verification.
The face is first divided into several regions from which LBP features are extracted. LBP
codes of each region are then quantized into a low-dimensional feature vector. The face
is represented by stacking the vectors of all the regions. Finally, we generate a reliable face
model using the VQ-MAP method [6]. We extensively evaluate our proposed approach on
two publicly available benchmark databases and compare the results against not only
the original LBP approach but also other LBP variants, demonstrating very promising
results.
The rest of this paper is organized as follows. Section 2 describes the original LBP-based
face representation. Section 3 introduces our proposed approach for an efficient and compact
LBP representation that overcomes the drawbacks of LBP (i.e. sparse and unstable histograms).
The experimental analysis is presented in Section 4 and conclusions are drawn
in Section 5.
1 See LBP bibliography at http://www.cse.oulu.fi/MVG/LBP_Bibliography
2 Face Representation Using Local Binary Patterns
The LBP texture analysis operator, introduced by Ojala et al.[3], is defined as a gray-
scale invariant texture measure, derived from a general definition of texture in a local
neighborhood. It is a powerful means of texture description and among its properties
in real-world applications are its discriminative power, computational simplicity and
tolerance against monotonic gray-scale changes.
The original LBP operator forms labels for the image pixels by thresholding the
3×3 neighborhood of each pixel with the center value and considering the result as a
binary number. Fig. 1 shows an example of an LBP calculation. The histogram of these
$2^8 = 256$ different labels can then be used as a texture descriptor.
Fig. 1. The basic LBP operator.
The operator has been extended to use neighborhoods of different sizes. Using a
circular neighborhood and bilinearly interpolating values at non-integer pixel coordinates
allow any radius and number of pixels in the neighborhood. The notation $(P, R)$
is generally used for pixel neighborhoods to refer to $P$ sampling points on a circle of
radius $R$. The calculation of the LBP codes can be easily done in a single scan through
the image. The value of the LBP code of a pixel $(x_c, y_c)$ is given by:
$$\mathrm{LBP}_{P,R} = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^p, \tag{1}$$
where $g_c$ corresponds to the gray value of the center pixel $(x_c, y_c)$, $g_p$ refers to the gray
values of $P$ equally spaced pixels on a circle of radius $R$, and $s$ defines a thresholding
function as follows:
$$s(x) = \begin{cases} 1, & \text{if } x \geq 0; \\ 0, & \text{otherwise.} \end{cases} \tag{2}$$
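To make Eqs. (1) and (2) concrete, the following Python sketch computes LBP codes for a circular (P, R) neighborhood with bilinear interpolation, using NumPy. It is only an illustration under stated assumptions: the function name `lbp_codes`, the starting angle and direction of the sampling points, and the choice to skip pixels closer than R to the border are not taken from the paper.

```python
import numpy as np

def lbp_codes(image, P=8, R=1.0):
    """Return LBP_{P,R} codes for the interior pixels of a 2-D gray image."""
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape
    m = int(np.ceil(R))                       # margin so every sample stays inside
    ys, xs = np.mgrid[m:h - m, m:w - m]       # centre coordinates (y_c, x_c)
    center = img[m:h - m, m:w - m]            # g_c for every interior pixel
    codes = np.zeros(center.shape, dtype=np.int64)
    for p in range(P):
        angle = 2.0 * np.pi * p / P
        # Position of sample p on the circle (rounded to suppress float noise).
        sy = ys - np.round(R * np.sin(angle), 10)
        sx = xs + np.round(R * np.cos(angle), 10)
        y0 = np.floor(sy).astype(int)
        x0 = np.floor(sx).astype(int)
        y1 = np.minimum(y0 + 1, h - 1)
        x1 = np.minimum(x0 + 1, w - 1)
        fy = sy - y0
        fx = sx - x0
        # Bilinear interpolation of the gray value g_p.
        top = img[y0, x0] + fx * (img[y0, x1] - img[y0, x0])
        bottom = img[y1, x0] + fx * (img[y1, x1] - img[y1, x0])
        g_p = top + fy * (bottom - top)
        # s(g_p - g_c) * 2^p, accumulated into the code (Eqs. 1 and 2).
        codes += (g_p >= center).astype(np.int64) << p
    return codes
```

On a perfectly flat image every comparison satisfies $g_p \geq g_c$, so every code equals $2^P - 1$; this is a quick sanity check for any implementation of the operator.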
Another extension to the original operator is the definition of the so-called uniform
patterns. This extension was inspired by the fact that some binary patterns occur more
commonly in texture images than others. A local binary pattern is called uniform if the
binary pattern contains at most two bitwise transitions from 0 to 1 or vice versa when
the bit pattern is traversed circularly. In the computation of the LBP labels, uniform
patterns are used so that there is a separate label for each uniform pattern and all the
non-uniform patterns are labeled with a single label. This yields the following notation
for the LBP operator: $\mathrm{LBP}^{u2}_{P,R}$. The subscript represents using the operator in a $(P, R)$
neighborhood. The superscript $u2$ stands for using only uniform patterns and labeling all
remaining patterns with a single label.
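The uniformity test described above can be sketched in a few lines of Python. The helper names (`transitions`, `is_uniform`, `u2_label_table`) and the ordering of the labels are choices made for this example only; the paper does not prescribe a particular label numbering.

```python
def transitions(code, P):
    """Number of 0/1 changes when the P-bit pattern is read circularly."""
    rotated = (code >> 1) | ((code & 1) << (P - 1))   # circular right shift by one bit
    return bin(code ^ rotated).count("1")

def is_uniform(code, P):
    """A pattern is uniform if it has at most two circular bit transitions."""
    return transitions(code, P) <= 2

def u2_label_table(P):
    """Map each uniform code to its own label; return (table, number of labels).

    Codes missing from the table are non-uniform and share the last label.
    """
    uniform = [c for c in range(2 ** P) if is_uniform(c, P)]
    table = {c: i for i, c in enumerate(uniform)}
    return table, len(uniform) + 1
```

For example, `0b00001111` has two transitions and is uniform, whereas `0b01010101` has eight and is not. Enumerating all codes is practical for P = 8 or 16; for P = 24 the label count is better obtained from the closed form given in Section 3.1.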
Each LBP label (or code) can be regarded as a micro-texton. Local primitives which
are codified by these labels include different types of curved edges, spots, flat areas etc.
The occurrences of the LBP codes in the image are collected into a histogram. The
classification is then performed by computing histogram similarities. For an efficient
representation, facial images are first divided into several local regions from which LBP
histograms are extracted and concatenated into an enhanced feature histogram.
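The block-wise description can be illustrated as follows. This is a minimal sketch, assuming the LBP labels are already available as a 2-D integer array with values in `[0, n_labels)`; the function name `block_histograms` and the decision to drop incomplete blocks at the image border are assumptions of this example.

```python
import numpy as np

def block_histograms(labels, block_shape, n_labels):
    """Concatenate the per-block label histograms of a 2-D label image."""
    bh, bw = block_shape
    h, w = labels.shape
    feats = []
    # Visit non-overlapping blocks row by row; incomplete border blocks are skipped.
    for top in range(0, h - bh + 1, bh):
        for left in range(0, w - bw + 1, bw):
            block = labels[top:top + bh, left:left + bw]
            feats.append(np.bincount(block.ravel(), minlength=n_labels))
    return np.concatenate(feats)
```

The length of the resulting vector is the number of blocks times `n_labels`, which is what makes the concatenated histogram grow quickly with the number of labels.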
3 Our Proposed Approach to Face Verification using LBP and
VQMAP
As mentioned above, a simple concatenation of all local block features in the original
LBP-based face recognition approach may be subject to the curse of dimensionality (e.g.
sparse and unstable histograms). To tackle this problem, we describe in this section a
solution based on vector quantization and MAP adaptation.
3.1 LBP Quantization
In the original LBP-based face representation and most of its variants, the histograms extracted
over a block are generally sparse. Most of the bins in the histogram are zero or close to zero,
particularly in the case of small blocks. Indeed, the number of LBP labels in a block
depends on its size. On the one hand, big blocks produce dense histograms that poorly represent
local face changes. On the other hand, small blocks are robust to local changes but
create unreliable sparse histograms, as the number of histogram bins exceeds by far the
number of LBP patterns in the block. Another problem with the LBP representation is that
the number of bins of the histogram is a function of the number of neighborhood sampling
points $P$. The number of histogram bins grows considerably when $P$ increases
(there are $P(P-1) + 3$ bins per block). Hence, a small neighborhood yields a compact
but poor representation, whereas a large neighborhood produces huge and unreliable
feature vectors.
Fig. 2. LBP-based face description: LBP is first applied to the face, which is then subdivided
into equal blocks, and finally the face codebook is computed using VQ.
Fig. 3. Face block description using LBP histogram and VQ-LBP codebook.
Furthermore, not all LBP labels are present in a given face region. Labels with
low occurrences can be considered as noise, and thus are useless for characterizing
the face region. Therefore, a block can be efficiently characterized by a more accurate
low-dimensional vector by ignoring those patterns.
To tackle these problems, we apply vector quantization to each block of the face
in order to dynamically obtain a more accurate feature vector that better represents the face.
The patterns of each block are clustered into a fixed number of groups and the
face is represented by the resulting codebook. Thus, only the relevant LBP labels of a given
block are represented while the other labels are ignored. This yields a feature built from
patterns that are face-specific and thus suitable for face representation. Figure 2
illustrates how a face is represented in our approach.
In our proposed approach, the clustering of LBP labels is achieved by the LBG algorithm [7].
The LBG algorithm is like a K-means clustering algorithm which takes a set of
vectors $S = \{x_i \in \mathbb{R}^d \mid i = 1, \ldots, n\}$ as input and generates a representative subset of
vectors $C = \{c_j \in \mathbb{R}^d \mid j = 1, \ldots, K\}$ with a user specified $K$
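A minimal Python sketch of LBG-style codebook training is given here for illustration. It follows the usual splitting scheme (start from the global centroid, split every code vector in two, refine with K-means iterations) and makes several assumptions that are not taken from the paper: the function name `lbg`, the perturbation `eps`, the fixed number of refinement iterations, and the restriction to K being a power of two. How LBP labels are converted into the input vectors is not specified in the surviving text, so `S` is simply any (n, d) array.

```python
import numpy as np

def lbg(S, K, eps=1e-3, n_iter=20):
    """Train a codebook of K vectors (K assumed to be a power of two) by splitting."""
    S = np.asarray(S, dtype=np.float64)
    C = S.mean(axis=0, keepdims=True)              # start from the global centroid
    while len(C) < K:
        C = np.vstack([C + eps, C - eps])          # split every code vector in two
        for _ in range(n_iter):                    # K-means style refinement
            d = ((S[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
            nearest = d.argmin(axis=1)             # index of the closest code vector
            for j in range(len(C)):
                members = S[nearest == j]
                if len(members):                   # an empty cell keeps its vector
                    C[j] = members.mean(axis=0)
    return C
```

The doubling of the codebook at each split matches the codebook sizes (2, 4, ..., 32) explored in the experiments, which is why a power-of-two K is assumed here.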
Fig. 4. LBP-VQ/MAP face verification system.
parameters to be adapted: mean vectors (centroids), covariance matrices, and weights.
The VQ-MAP model is motivated by the fact that accurate models can be obtained by
adapting only the mean vectors in the GMM-MAP approach. By reducing the number
of free parameters, the VQ-MAP model is simpler to implement and adapts much
faster. Moreover, the similarity computation for a given probe is
further simplified by replacing the log-likelihood ratio (LLR) computation with the mean
squared error (MSE) [6]. The speed gain in VQ-MAP originates mostly from the replacement
of the Gaussian density computations with squared distance computations,
leaving out the exponentiation and additional multiplications [8].
Figure 4 depicts our face verification system. To generate a user model, a universal
background model (UBM) is first created using the pool of training faces. After extracting
LBP codes from each face, we divide the faces into equal blocks. Then, we run the
VQ algorithm on the set of blocks at the same position from all training
faces. A codebook representing the background model is obtained. The user-specific
model is then inferred from the global model by applying the MAP adaptation technique.
Formally, the MAP paradigm is the process of finding the parameters $\theta$ that maximize
the posterior probability density function (pdf):
$$\theta_{MAP} = \arg\max_{\theta} P(\theta \mid X) \tag{3}$$
In our case, $\theta$ denotes the centroids $C_j$ of the codebook.
In the verification phase, the closest UBM vectors are searched for each block of
the probe face. For the face model, nearest neighbor search is performed on the corresponding
adapted vectors only. The match score is the difference of the UBM and target
quantization errors [9]:
$$\mathrm{score} = \mathrm{MSE}(X, \mathrm{UBM}) - \mathrm{MSE}(X, C) \tag{4}$$
where:
$$\mathrm{MSE}(X, Y) = \frac{1}{|X|} \sum_{x_i \in X} \min_{y_k \in Y} \| x_i - y_k \|^2 \tag{5}$$
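The scoring of Eqs. (4) and (5) can be sketched in Python as shown here. The functions `mse` and `match_score` follow the two equations directly. The function `map_adapt` is an assumption: the adaptation formula is not given in the surviving text, so the sketch uses a hard-assignment analogue of the mean-only relevance-factor update known from GMM-MAP [8], with a hypothetical relevance factor `r`. The sketch also handles a single set of vectors; how per-block scores are combined is not specified here.

```python
import numpy as np

def _sq_dists(X, Y):
    """Squared Euclidean distances between all rows of X and all rows of Y."""
    X = np.asarray(X, dtype=np.float64)
    Y = np.asarray(Y, dtype=np.float64)
    return ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)

def mse(X, Y):
    """Eq. (5): mean squared distance from each x_i to its nearest y_k."""
    return _sq_dists(X, Y).min(axis=1).mean()

def match_score(X, ubm, client):
    """Eq. (4): positive when the client codebook fits X better than the UBM."""
    return mse(X, ubm) - mse(X, client)

def map_adapt(ubm, X, r=16.0):
    """Assumed mean-only adaptation: move each UBM centroid towards its assigned data."""
    ubm = np.asarray(ubm, dtype=np.float64)
    X = np.asarray(X, dtype=np.float64)
    nearest = _sq_dists(X, ubm).argmin(axis=1)
    adapted = ubm.copy()
    for j in range(len(ubm)):
        members = X[nearest == j]
        n = len(members)
        if n:
            alpha = n / (n + r)        # more data means more trust in the data mean
            adapted[j] = alpha * members.mean(axis=0) + (1 - alpha) * ubm[j]
    return adapted
```

Because each adapted vector stays paired with the UBM vector it was derived from, a centroid that receives no enrolment data is left unchanged and contributes nothing to the score difference.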
Fig. 5. Example of face images from different sessions of the XM2VTS database.
4 Experimental Analysis
In this section, we use two public databases, namely XM2VTS and BANCA, to ex-
tensively evaluate the proposed approach and assess its performance. Moreover, we
compare our approach to similar as well as recent state of the art methods.
4.1 Databases
XM2VTS. The XM2VTS database [10] contains face videos from 295 subjects. The data
were collected in four different sessions separated by one-month intervals. The database
comprises 200 training clients, 25 evaluation impostors and 70 test impostors.
Figure 5 shows an example of one shot
from each session for a database subject.
We use both Lausanne Protocol configurations (LPI and LPII) defined for XM2VTS
to assess system performance in verification mode. The database is divided into three
subsets: train, evaluation and test. The training data serve for estimating the models. The
evaluation subset is used to tune the system parameters. Finally, system performance is
estimated on the test subset, using the parameters tuned on the evaluation subset.
BANCA. For the BANCA database [11], the English part is used for our tests. It contains
52 users (26 male and 26 female). Faces are collected through 12 different sessions with
various acquisition devices of different quality and in different environmental conditions:
controlled (high-quality camera, uniform background, controlled lighting), degraded
(web-cam, non-uniform background) and adverse (high-quality camera, arbitrary conditions).
Examples of the three conditions are shown in Figure 6. For each session, two
videos are recorded: a true client access and an impostor attack.
In the BANCA protocol, seven distinct configurations for the training and testing
policy have been defined. In our experiments, the configurations referred to as Matched
Controlled (Mc), Unmatched Degraded (Ud), Unmatched Adverse (Ua), Pooled Test
(P) and Grand Test (G) are used. All of the listed configurations, except protocol G, use
the same training conditions: each client is trained using images from the first recording
session of the controlled scenario. Testing is then performed on images taken from the
controlled scenario (Mc), adverse scenario (Ua), degraded scenario (Ud), while (P) does
the test for each of the previously described configurations. The protocol G uses training
images from the first recording sessions of the controlled, degraded and adverse scenarios.
Fig. 6. Example of BANCA face images from the three acquisition conditions: controlled,
degraded and adverse.
The database is divided into two groups, g1 and g2, used alternately for development
and evaluation.
4.2 Setups
In our evaluations, we use the same parameters for both databases. We cropped the faces
to a fixed size of 80 × 64 pixels using the provided eye positions. Faces are subdivided into equal
blocks of 8 × 8 pixels, yielding 80 blocks per face. No further preprocessing of the face
images was performed. We assess the performance of our approach using different
codebook sizes ($k = 2, 4, \ldots, 32$) produced by the LBG clustering algorithm and we report the
best results. To illustrate the effect on various LBP histograms we consider different
parameters: $(P, R) \in \{(8, 2), (16, 2), (24, 3)\}$. Finally, for the sake of comparison, all
experiments are also performed for the baseline LBP system.
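The arithmetic behind this setup can be checked with a short Python sketch. It uses the bin count P(P-1)+3 from Section 3.1 and the face and block sizes stated above; the function names are hypothetical, and the total VQ length assumes that the per-block feature size equals the codebook size k.

```python
def bins_per_block(P):
    """Number of uniform-pattern histogram bins per block, P(P-1)+3."""
    return P * (P - 1) + 3

def layout(face_h=80, face_w=64, block=8):
    """Return (number of blocks, maximum number of LBP codes per block)."""
    n_blocks = (face_h // block) * (face_w // block)
    return n_blocks, block * block

if __name__ == "__main__":
    n_blocks, codes_per_block = layout()
    k = 16  # example codebook size, assumed to be the per-block feature size
    for P in (8, 16, 24):
        bins = bins_per_block(P)
        print(f"P={P}: {bins} bins vs at most {codes_per_block} codes per block; "
              f"baseline length {n_blocks * bins}, VQ length {n_blocks * k}")
```

With 8 × 8 blocks there are at most 64 codes per block, so for P = 16 and P = 24 the histograms (243 and 555 bins) necessarily contain mostly empty bins, which is the sparsity problem discussed in Section 3.1.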
System performance is assessed using the half total error rate (HTER), which is the
mean of the false acceptance rate (FAR) and the false rejection rate (FRR) on the evaluation
set:
$$\mathrm{HTER} = \frac{\mathrm{FAR}(\tau) + \mathrm{FRR}(\tau)}{2} \tag{6}$$
The threshold $\tau$ corresponds to the optimal operating point of the development set,
defined by the minimal equal error rate (EER).
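The evaluation procedure can be sketched in Python as follows. The sketch assumes that higher scores indicate a better match and that a claim is accepted when the score reaches the threshold; the function names and the way the EER point is approximated (the candidate threshold with the smallest gap between FAR and FRR) are choices made for this example.

```python
import numpy as np

def far_frr(genuine, impostor, tau):
    """Return (FAR, FRR) when a claim is accepted if its score is >= tau."""
    genuine = np.asarray(genuine, dtype=np.float64)
    impostor = np.asarray(impostor, dtype=np.float64)
    far = float(np.mean(impostor >= tau))   # impostors wrongly accepted
    frr = float(np.mean(genuine < tau))     # clients wrongly rejected
    return far, frr

def eer_threshold(genuine, impostor):
    """Threshold on the development scores where FAR and FRR are closest."""
    candidates = np.unique(np.concatenate([genuine, impostor]))
    gaps = []
    for t in candidates:
        far, frr = far_frr(genuine, impostor, t)
        gaps.append(abs(far - frr))
    return float(candidates[int(np.argmin(gaps))])

def hter(genuine, impostor, tau):
    """Eq. (6): mean of FAR and FRR at a threshold fixed beforehand."""
    far, frr = far_frr(genuine, impostor, tau)
    return 0.5 * (far + frr)
```

In use, `eer_threshold` would be called on the development scores and the returned threshold passed unchanged to `hter` together with the evaluation scores, so that the reported error is not tuned on the data it is measured on.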
4.3 Results and discussion
For the sake of clarity and fair evaluation, we only report the best results along with
the best per-block feature vector size. Tables 1 and 2 summarize the results obtained
on the XM2VTS and BANCA databases, respectively. Compared to the baseline LBP, the
results show that our proposed approach yields not only shorter feature vectors
but also better verification performance. This is due to the fact that not all the information
present in the baseline LBP representation is discriminative. Indeed, most of
the bins in the baseline LBP histograms are close to zero and may represent noise.
It is worth noting that the gain (i.e. the decrease) in the feature vector length is more
significant with larger neighborhood sizes $P$.
In the case of the XM2VTS database (Table 1), our approach matches or outperforms the baseline
LBP in all the configurations for both protocols LPI and LPII. The best obtained HTERs
Table 1. HTER (%) and per-block feature vector size on the XM2VTS database using the
LPI and LPII protocols for different configurations of the LBP baseline and our proposed
method.
Method               LPI HTER   LPI feature size   LPII HTER   LPII feature size
LBP(8,2) Baseline    3.0        59                 2.2         59
LBP(8,2)/VQMAP       3.0        16                 0.8         16
LBP(16,2) Baseline   2.9        243                2.0         243
LBP(16,2)/VQMAP      2.3        16                 1.1         32
LBP(24,3) Baseline   3.9        555                2.9         555
LBP(24,3)/VQMAP      1.9        16                 1.0         16
Table 2. HTER (%) and per-block feature vector size (F. size) on the BANCA database using the Mc, Ud,
Ua, P and G protocols for different configurations of the LBP baseline and our proposed
method.
Method               Mc              Ud              Ua              P               G
                     HTER  F. size   HTER  F. size   HTER  F. size   HTER  F. size   HTER  F. size
LBP(8,2) Baseline    10.5  59        14.5  59        17.3  59        25.0  59        8.4   59
LBP(8,2)/VQMAP       4.0   16        11.6  32        14.9  8         16.6  32        4.9   32
LBP(16,2) Baseline   10.9  243       15.3  243       18.5  243       28.4  243       9.6   243
LBP(16,2)/VQMAP      3.8   32        18.5  4         18.8  8         20.7  16        6.4   32
LBP(24,3) Baseline   12.3  555       18.1  555       25.0  555       33.3  555       14.9  555
LBP(24,3)/VQMAP      4.8   32        18.4  16        18.2  16        20.4  16        5.9   32
are respectively 1.9% and 0.8% for LPI and LPII, with a per block feature vector size
of 16 for both cases.
Moreover, our approach shows more robustness to the different challenges present in
the BANCA database compared to the baseline LBP. In fact, our approach outperforms the baseline
LBP in almost all the configurations (Table 2). We also note that the best HTERs
for the five considered protocols are given by our approach (Table 2).
Our results confirm the effectiveness of the proposed approach over the
baseline LBP method. For the sake of comparison, we also report in Table 3 the main
recent state-of-the-art results achieved on the BANCA database. Our approach shows competitive
results in the case of the Mc and Ua protocols. For the remaining protocols (Ud, P and G),
average performance is obtained. This can be explained by the fact that our approach
does not use any preprocessing, while most of the compared methods use preprocessing
procedures (like the three-step preprocessing chain proposed by Tan and Triggs [15])
Table 3. HTER (%) for state-of-the-art methods on the BANCA database.
Method                   Mc     Ud     Ua     P      G
Our approach             3.8    11.6   14.9   16.8   5.9
LBP baseline             10.5   14.5   17.3   25.0   6.4
LBP/MAP [12]             7.3    10.7   22.1   19.2   5.0
LBP-KDE [13]             4.3    6.4    18.1   17.6   4.0
Weighted LBP-KDE [13]    3.7    6.6    15.1   11.6   3.0
Gabor DCT-GMM [14]       -      -      -      17.3   3.6
LGBPHS [14]              -      -      -      10.8   5.9
to boost the results on the low-quality BANCA images. It is also worth noting that, contrary
to the compared methods, our method inherits the simplicity and computational
efficiency of the baseline LBP approach.
5 Conclusion and future work
In this work, we presented a new generative model for face verification based on LBP
features. LBP is first applied to each face. Then, faces are divided into blocks to which
vector quantization is applied to create a robust and compact feature vector. Furthermore,
we generated a reliable face model by MAP adaptation from the background model,
which is obtained from the whole training data. At the verification stage, the LBP patterns
of the probe face are matched to the nearest code vectors of both the adapted model
and the background model, and the difference between the two quantization errors is
employed as the similarity between the probe and the claimed identity.
The main advantage of the proposed method is the reduction of the feature vector size compared
to classical concatenated LBP histograms. Indeed, competitive results are obtained
using a very compact feature vector. Hence, the computational complexity of the system
as well as the required storage space can be reduced. Furthermore, the robustness of the system
is enhanced by using MAP adaptation to generate the face model. The obtained error
rates and the comparison to recent state-of-the-art work demonstrate the efficiency of
the proposed approach.
It is worth noting that our proposed approach can also be applied to other LBP
variants, such as MSLBP, CLBP and LTP, which provide more accurate discrimination
but with larger feature vectors than the original LBP. Furthermore, due to the nature of LBP
codes, it is of interest to explore different metrics in both clustering and matching.
For instance, the use of the Hamming distance may yield performance improvements.
Acknowledgments
This work is supported in part by the project FNR/CDTA/ASM/BSM/26 and the CDTA/PNR/BIOVISA
project. The authors thank the MESRS and DGRSDT for their support.
References
1. Li, S.Z., Jain, A.K., eds.: Encyclopedia of Biometrics. Springer US (2009)
2. Li, S.Z., Jain, A.K., eds.: Handbook of Face Recognition, 2nd Edition. Springer (2011)
3. Ojala, T., Pietikäinen, M., Mäenpää, T.: Multiresolution gray-scale and rotation invariant
texture classification with local binary patterns. TPAMI 24 (2002) 971–987
4. Ahonen, T., Hadid, A., Pietikäinen, M.: Face description with local binary patterns:
Application to face recognition. TPAMI 28 (2006) 2037–2041
5. Pietikäinen, M., Hadid, A., Zhao, G., Ahonen, T.: Computer Vision Using Local Binary
Patterns. Springer (2011)
6. Hautamäki, V., Kinnunen, T., Kärkkäinen, I., Saastamoinen, J., Tuononen, M., Fränti, P.:
Maximum a posteriori adaptation of the centroid model for speaker verification. Signal
Processing Letters, IEEE 15 (2008) 162–165
7. Linde, Y., Buzo, A., Gray, R.: An algorithm for vector quantizer design. Communications,
IEEE Transactions on 28 (1980) 84–95
8. Reynolds, D.A., Quatieri, T.F., Dunn, R.B.: Speaker verification using adapted Gaussian
mixture models. Digital Signal Processing 10 (2000) 19–41
9. Kinnunen, T., Saastamoinen, J., Hautamäki, V., Vinni, M., Fränti, P.: Comparing maximum
a posteriori vector quantization and Gaussian mixture models in speaker verification. In:
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference
on. (2009) 4229–4232
10. Messer, K., Matas, J., Kittler, J., Jonsson, K.: XM2VTSDB: The extended M2VTS database. In:
Second International Conference on Audio and Video-based Biometric Person Authentication.
(1999) 72–77
11. Bailly-Baillière, E., Bengio, S., Bimbot, F., Hamouz, M., Kittler, J., Mariéthoz, J., Matas, J.,
Messer, K., Popovici, V., Porée, F., Ruiz, B., Thiran, J.P.: The BANCA database and evaluation
protocol. In Kittler, J., Nixon, M., eds.: Audio- and Video-Based Biometric Person Authentication.
Volume 2688 of Lecture Notes in Computer Science. Springer Berlin Heidelberg
(2003) 625–638
12. Rodriguez, Y., Marcel, S.: Face authentication using adapted local binary pattern histograms.
In Leonardis, A., Bischof, H., Pinz, A., eds.: Computer Vision – ECCV 2006. Volume 3954
of Lecture Notes in Computer Science. Springer Berlin Heidelberg (2006) 321–332
13. Ahonen, T., Pietikäinen, M.: Pixelwise local binary pattern models of faces using kernel
density estimation. In Tistarelli, M., Nixon, M., eds.: Advances in Biometrics. Volume 5558
of Lecture Notes in Computer Science. Springer Berlin Heidelberg (2009) 52–61
14. El Shafey, L., Wallace, R., Marcel, S.: Face verification using Gabor filtering and adapted
Gaussian mixture models. In: Biometrics Special Interest Group (BIOSIG), 2012 BIOSIG -
Proceedings of the International Conference of the. (2012) 397–408
15. Tan, X., Triggs, B.: Enhanced local texture feature sets for face recognition under difficult
lighting conditions. In Zhou, S., Zhao, W., Tang, X., Gong, S., eds.: Analysis and Modeling
of Faces and Gestures. Volume 4778 of Lecture Notes in Computer Science. Springer Berlin
Heidelberg (2007) 168–182