
Face Verification Using Local Binary Patterns and Maximum A Posteriori Vector Quantization Model

Elhocine Boutellaa (1,2), Farid Harizi (1), Messaoud Bengherabi (1), Samy Ait-Aoudia (2), and Abdenour Hadid (3)

(1) Centre de Développement des Technologies Avancées (DZ), (2) Ecole Nationale Supérieure d'Informatique (DZ), (3) University of Oulu (FI)

Abstract. The popular local binary patterns (LBP) have been highly successful in representing and recognizing faces. However, the original LBP has some problems that need to be addressed in order to increase its robustness and discriminative power and to make the operator suitable for the needs of different types of problems. In particular, a serious drawback of the LBP method concerns the number of entries in the LBP histograms: too small a number of bins fails to provide enough discriminative information about the face appearance, while too large a number of bins may lead to sparse and unstable histograms. To overcome this drawback, we propose an efficient and compact LBP representation for face verification using the vector quantization maximum a posteriori adaptation (VQ-MAP) model. In the proposed approach, a face is divided into equal blocks from which LBP features are extracted. We then efficiently represent the face by a compact feature vector obtained by clustering the LBP patterns in each block. Finally, we model faces using VQ-MAP and use the mean squared error for similarity score computation. We extensively evaluate our proposed approach on two publicly available benchmark databases and compare the results against not only the original LBP approach but also other LBP variants, demonstrating very promising results.

1 Introduction

It is widely believed that biometrics will become a significant component of identification technology, and it is already of universal interest. The goal of a biometric system is to determine the identity of an individual using physical or biological characteristics (i.e. biometric modalities). Biometric systems have many applications, such as criminal identification, airport checking, computer or mobile device log-in, building gate control, digital multimedia access, transaction authentication, voice mail, and secure teleworking. Various characteristics can be used: from the most conventional biometric modalities such as face, voice, fingerprint, iris, hand geometry or signature, to the so-called emerging biometric modalities such as gait, hand-grip, ear, body odour, body salinity, electroencephalogram or DNA. Each modality has its strengths and drawbacks [1].

Biometric systems can run in two fundamentally distinct modes: (i) verification (or authentication) and (ii) recognition (more popularly known as identification). In authentication mode, the system aims to confirm or deny the identity claimed by a person (one-to-one matching), while in recognition mode the system aims to identify an individual from a database (one-to-many matching). Because of its natural and non-intrusive interaction, identity verification and recognition using facial information is among the most active and challenging areas in computer vision research [1,2]. However, despite the great deal of progress in recent years [2], face biometrics (that is, identifying individuals based on their face information) is still a major area of research. A wide range of viewpoints, aging of subjects and complex outdoor lighting remain challenges in face recognition.

There are numerous ways to categorize face description approaches. One of the most widely used divisions distinguishes whether the method is based on representing the feature statistics of small local face patches (local) or on computing features directly from the entire image or video (global). Lately, the local methods have proved to be more effective in real-world conditions, whereas the other approaches have almost disappeared. However, the global methods have recently started to partially reappear to complement the local descriptors. A survey of different face descriptions can be found in [2].

Recent developments in face analysis have shown that local binary patterns (LBP) [3] provide excellent results in representing faces [4,5]. LBP is a gray-scale invariant texture operator which labels the pixels of an image by thresholding the neighborhood of each pixel with the value of the center pixel and considers the result as a binary number. LBP labels can be regarded as local primitives such as curved edges, spots, flat areas, etc. The histogram of the labels can then be used as a face descriptor. Due to its discriminative power and computational simplicity, the LBP methodology has attained an established position in face analysis (see footnote 1) and has inspired plenty of new research on related methods.

The original LBP has some problems that need to be addressed in order to increase its robustness and discriminative power and to make the operator suitable for the needs of different types of problems. For instance, a serious drawback of the LBP method concerns the number of entries in the LBP histograms: too small a number of bins fails to provide enough discriminative information about the face appearance, while too large a number of bins may lead to sparse and unstable histograms. To overcome this drawback, we propose an efficient and compact LBP representation for face verification. The face is first divided into several regions from which LBP features are extracted. The LBP codes of each region are then quantized into a low-dimensional feature vector. The face is represented by stacking the vectors of all the regions. Finally, we generate a reliable face model using the VQ-MAP [6] method. We extensively evaluate our proposed approach on two publicly available benchmark databases and compare the results against not only the original LBP approach but also other LBP variants, demonstrating very promising results.

The rest of this paper is organized as follows. Section 2 describes the original LBP-based face representation. In Section 3, our proposed approach for an efficient and compact LBP representation that overcomes the LBP drawbacks (i.e. sparse and unstable histograms) is introduced. Experimental analysis is presented in Section 4 and conclusions are drawn in Section 5.

1 See LBP bibliography at http://www.cse.oulu.fi/MVG/LBP_Bibliography



2 Face Representation Using Local Binary Patterns

The LBP texture analysis operator, introduced by Ojala et al. [3], is defined as a gray-scale invariant texture measure, derived from a general definition of texture in a local neighborhood. It is a powerful means of texture description, and among its properties in real-world applications are its discriminative power, computational simplicity and tolerance against monotonic gray-scale changes.

The original LBP operator forms labels for the image pixels by thresholding the $3 \times 3$ neighborhood of each pixel with the center value and considering the result as a binary number. Fig. 1 shows an example of an LBP calculation. The histogram of these $2^8 = 256$ different labels can then be used as a texture descriptor.

Fig. 1. The basic LBP operator.
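As an illustration of the thresholding step described above, the following minimal Python sketch computes the basic LBP code of a single pixel. The function name `lbp_3x3`, the clockwise bit ordering starting at the top-left neighbor, and the sample patch are assumptions made for this example; the text does not fix any of them.

```python
def lbp_3x3(image, row, col):
    """Return the basic 8-bit LBP code of the pixel at (row, col).

    `image` is a list of rows of gray values; the pixel must not lie on the border.
    """
    center = image[row][col]
    # Neighbor offsets, clockwise from the top-left corner (assumed ordering).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # Threshold each neighbor against the center value.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


# Hypothetical 3x3 patch of gray values.
patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
code = lbp_3x3(patch, 1, 1)

# Adding a constant to every gray value leaves the code unchanged.
shifted = [[v + 50 for v in row] for row in patch]
same_code = lbp_3x3(shifted, 1, 1)
```

Because only the sign of each difference to the center is kept, a uniform brightness shift does not alter the label, which is one instance of the tolerance to monotonic gray-scale changes mentioned above.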

The operator has been extended to use neighborhoods of different sizes. Using a circular neighborhood and bilinearly interpolating values at non-integer pixel coordinates allows any radius and number of pixels in the neighborhood. The notation $(P, R)$ is generally used for pixel neighborhoods to refer to $P$ sampling points on a circle of radius $R$. The calculation of the LBP codes can easily be done in a single scan through the image. The value of the LBP code of a pixel $(x_c, y_c)$ is given by:

$$LBP_{P,R} = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^p, \qquad (1)$$

where $g_c$ corresponds to the gray value of the center pixel $(x_c, y_c)$, $g_p$ refers to the gray values of $P$ equally spaced pixels on a circle of radius $R$, and $s$ defines a thresholding function as follows:

$$s(x) = \begin{cases} 1, & \text{if } x \geq 0; \\ 0, & \text{otherwise.} \end{cases} \qquad (2)$$
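Equations (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `bilinear` and `lbp_pr`, the angular convention for placing the sampling points, and the rounding of the offsets (used to keep axis-aligned samples exactly on the pixel grid) are assumptions of this example.

```python
import math


def bilinear(image, y, x):
    """Bilinearly interpolate the gray value at real-valued coordinates (y, x)."""
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1 = min(y0 + 1, len(image) - 1)
    x1 = min(x0 + 1, len(image[0]) - 1)
    fy, fx = y - y0, x - x0
    top = (1 - fx) * image[y0][x0] + fx * image[y0][x1]
    bottom = (1 - fx) * image[y1][x0] + fx * image[y1][x1]
    return (1 - fy) * top + fy * bottom


def lbp_pr(image, yc, xc, P=8, R=1.0):
    """LBP code of Eq. (1) for the pixel (xc, yc) with P samples on a circle of radius R."""
    gc = image[yc][xc]
    code = 0
    for p in range(P):
        angle = 2.0 * math.pi * p / P
        # Offsets are rounded so that axis-aligned samples fall exactly on pixels.
        dy = round(R * math.sin(angle), 9)
        dx = round(R * math.cos(angle), 9)
        gp = bilinear(image, yc - dy, xc + dx)
        if gp - gc >= 0:  # thresholding function s(x) of Eq. (2)
            code += 1 << p
    return code


# Hypothetical 5x5 test image: a horizontal ramp of gray values.
ramp = [[10 * c for c in range(5)] for _ in range(5)]
code_r1 = lbp_pr(ramp, 2, 2, P=8, R=1.0)
code_r2 = lbp_pr(ramp, 2, 2, P=8, R=2.0)
```

The pixel must lie at least $R$ pixels away from the image border; border handling is left out of this sketch.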

Another extension to the original operator is the definition of the so-called uniform patterns. This extension was inspired by the fact that some binary patterns occur more commonly in texture images than others. A local binary pattern is called uniform if the binary pattern contains at most two bitwise transitions from 0 to 1 or vice versa when the bit pattern is traversed circularly. In the computation of the LBP labels, uniform patterns are used so that there is a separate label for each uniform pattern and all the non-uniform patterns are labeled with a single label. This yields the following notation for the LBP operator: $LBP^{u2}_{P,R}$. The subscript represents using the operator in a $(P, R)$ neighborhood. The superscript $u2$ stands for using only uniform patterns and labeling all remaining patterns with a single label.
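The uniformity test and the resulting label mapping can be sketched as follows. The helper names `transitions` and `uniform_label_map` and the order in which labels are assigned are assumptions of this example; only the definition of a uniform pattern comes from the text.

```python
def transitions(code, P):
    """Count the circular 0/1 transitions in the P-bit pattern `code`."""
    count = 0
    for p in range(P):
        bit = (code >> p) & 1
        next_bit = (code >> ((p + 1) % P)) & 1
        if bit != next_bit:
            count += 1
    return count


def uniform_label_map(P):
    """Map every P-bit code to a label.

    Each uniform pattern gets its own label; all non-uniform patterns
    share one extra label. Returns (lookup table, number of labels).
    """
    uniform = [c for c in range(2 ** P) if transitions(c, P) <= 2]
    table = {c: i for i, c in enumerate(uniform)}
    shared = len(uniform)  # label reserved for all non-uniform patterns
    lookup = [table.get(c, shared) for c in range(2 ** P)]
    return lookup, shared + 1


lookup8, n_labels8 = uniform_label_map(8)
```

For $P = 8$ the mapping has 59 labels (58 uniform patterns plus the shared one). Enumerating all $2^P$ codes is practical for $P = 8$ or $16$; for $P = 24$ a closed-form or incremental construction would be preferable.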

Each LBP label (or code) can be regarded as a micro-texton. Local primitives which

are codified by these labels include different types of curved edges, spots, flat areas etc.

The occurrences of the LBP codes in the image are collected into a histogram. The

classification is then performed by computing histogram similarities. For an efficient

representation, facial images are first divided into several local regions from which LBP

histograms are extracted and concatenated into an enhanced feature histogram.
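The block-wise description above can be sketched as follows, assuming the image has already been converted to a 2-D array of LBP labels. The function name `block_histograms`, the non-overlapping tiling, and the toy label image are assumptions of this example.

```python
def block_histograms(labels, block_h, block_w, n_bins):
    """Concatenate the label histograms of non-overlapping blocks.

    `labels` is a 2-D list of integer LBP labels in the range [0, n_bins).
    """
    rows, cols = len(labels), len(labels[0])
    feature = []
    for top in range(0, rows - block_h + 1, block_h):
        for left in range(0, cols - block_w + 1, block_w):
            hist = [0] * n_bins
            for r in range(top, top + block_h):
                for c in range(left, left + block_w):
                    hist[labels[r][c]] += 1
            feature.extend(hist)  # one histogram per block, in scan order
    return feature


# Hypothetical 4x4 label image with 3 possible labels, split into 2x2 blocks.
toy_labels = [[0, 0, 1, 1],
              [0, 0, 1, 1],
              [2, 2, 0, 1],
              [2, 2, 1, 0]]
feature = block_histograms(toy_labels, 2, 2, 3)
```

The length of the resulting descriptor is the number of blocks times the number of bins, which is why the descriptor grows quickly with finer block grids and larger label sets.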

3 Our Proposed Approach to Face Verification Using LBP and VQ-MAP

As mentioned above, a simple concatenation of all local block features in the original LBP-based face recognition approach may be subject to the curse of dimensionality (e.g. sparse and unstable histograms). To tackle this problem, we describe in this section an elegant solution.

3.1 LBP Quantization

In the original LBP-based face representation and most of its variants, the histograms extracted over a block are generally sparse. Most of the bins in the histogram are zero or close to zero, particularly in the case of small blocks. Indeed, the number of LBP labels in a block depends on its size. On the one hand, big blocks produce dense histograms that poorly represent local face changes. On the other hand, small blocks are robust to local changes but create unreliable sparse histograms, as the number of histogram bins far exceeds the number of LBP patterns in the block. Another problem with the LBP representation is that the number of histogram bins is a function of the number of neighborhood sampling points $P$. The number of histogram bins grows considerably when $P$ increases (there are $P(P-1) + 3$ bins per block, i.e. 59, 243 and 555 bins for $P = 8$, $16$ and $24$, respectively). Hence, a small neighborhood yields a compact but poor representation, whereas a large neighborhood produces huge and unreliable feature vectors.

Fig. 2. LBP-based face description: LBP is first applied to the face, which is then subdivided into equal blocks, and finally the face codebook is computed using VQ.




Fig. 3. Face block description using the LBP histogram and the VQ-LBP codebook.

Furthermore, not all LBP labels are present in a given face region. Labels with low occurrences can be considered as noise and are thus useless for characterizing the face region. Therefore, by ignoring those patterns, a block can be efficiently characterized by a more accurate low-dimensional vector.

To tackle these problems, we apply vector quantization to each block of the face in order to dynamically obtain a more accurate feature vector that best represents the face. The patterns of each block are clustered into a fixed number of groups and the face is represented by the resulting codebook. Thus, only the relevant LBP labels of a given block are represented while the other labels are ignored. This yields a feature built from the patterns that are face-specific and thus suitable for face representation. Figure 2 illustrates how a face is represented in our approach.
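Clustering the patterns of a block into a fixed number of groups can be sketched with a splitting-based vector quantizer in the style of the LBG algorithm [7]. This is only a sketch under stated assumptions: the text does not specify how LBP codes are mapped to vectors, so the input here is a generic list of $d$-dimensional vectors, and the additive perturbation `eps`, the fixed number of refinement iterations, and the handling of empty clusters are choices made for this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec, codebook):
    """Index of the codeword closest to `vec`."""
    dists = [sq_dist(vec, c) for c in codebook]
    return dists.index(min(dists))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def lbg(vectors, k, eps=0.01, iters=20):
    """Grow a codebook to k codewords (k a power of two) by splitting and refining."""
    codebook = [mean(vectors)]
    while len(codebook) < k:
        # Split every codeword into a slightly perturbed pair.
        codebook = [[v + sign * eps for v in c]
                    for c in codebook for sign in (1, -1)]
        # Refine with nearest-neighbor assignment and centroid updates.
        for _ in range(iters):
            groups = [[] for _ in codebook]
            for vec in vectors:
                groups[nearest(vec, codebook)].append(vec)
            # An empty cluster keeps its previous codeword (assumed policy).
            codebook = [mean(g) if g else c for g, c in zip(groups, codebook)]
    return codebook


# Hypothetical 2-D data with two well-separated groups.
data = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
codebook2 = sorted(lbg(data, 2))
```

Doubling the codebook at each split naturally produces sizes that are powers of two.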

In our proposed approach, the clustering of LBP labels is achieved by the LBG algorithm [7]. The LBG algorithm is similar to the K-means clustering algorithm: it takes a set of vectors $S = \{x_i \in \mathbb{R}^d \mid i = 1, \ldots, n\}$ as input and generates a representative subset of vectors $C = \{c_j \in \mathbb{R}^d \mid j = 1, \ldots, K\}$ with a user-specified $K$




Fig. 4. LBP-VQ/MAP face verification system.

parameters to be adapted: mean vectors (centroids), covariance matrices, and weights. The VQ-MAP model is motivated by the fact that accurate models can be obtained by adapting only the mean vectors in the GMM-MAP approach. By reducing the number of free parameters, the VQ-MAP model is simpler to implement and adapts much faster. Moreover, the similarity computation for a given probe is further simplified by replacing the log-likelihood ratio (LLR) computation with the mean squared error (MSE) [6]. The speed gain in VQ-MAP originates mostly from the replacement of the Gaussian density computations with squared distance computations, leaving out the exponentiation and additional multiplications [8].
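To make the idea of adapting only the centroids concrete, the sketch below adapts a background codebook towards a set of user vectors. The update rule, a count-weighted interpolation between each cluster mean and the corresponding background centroid controlled by a relevance factor, mirrors the mean-only MAP update commonly used with adapted Gaussian mixture models; the text does not spell out the exact rule, so the rule, the parameter names `relevance` and `iters`, and their default values are assumptions of this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec, codebook):
    """Index of the codeword closest to `vec`."""
    dists = [sq_dist(vec, c) for c in codebook]
    return dists.index(min(dists))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def map_adapt(ubm, vectors, relevance=16.0, iters=2):
    """Adapt the background centroids `ubm` towards the user `vectors`."""
    model = [list(c) for c in ubm]
    for _ in range(iters):
        # Hard-assign each user vector to its nearest current centroid.
        groups = [[] for _ in model]
        for vec in vectors:
            groups[nearest(vec, model)].append(vec)
        adapted = []
        for prior, group in zip(ubm, groups):
            n = len(group)
            w = n / (n + relevance)  # more data -> trust the user data more
            local = mean(group) if group else prior
            adapted.append([w * a + (1 - w) * b for a, b in zip(local, prior)])
        model = adapted
    return model


# Hypothetical background codebook and user data.
ubm = [[0.0, 0.0], [10.0, 10.0]]
user = [[2.0, 0.0], [2.0, 0.0]]
adapted = map_adapt(ubm, user, relevance=2.0, iters=1)
```

Centroids that attract no user vectors stay at their background position, so the adapted model remains aligned with the background codebook.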

Figure 4 depicts our face verification system. To generate a user model, a universal background model (UBM) is first created using the pool of training faces. After extracting LBP codes from each face, we divide the faces into equal blocks. Then, we run the VQ algorithm on the pooled set of blocks at the same position in all training faces. A codebook representing the background model is obtained. The user-specific model is then inferred from the global model by applying the MAP adaptation technique. Formally, the MAP paradigm is the process of finding the parameters $\theta$ that maximize the posterior probability density function (pdf):

$$\theta_{MAP} = \arg\max_{\theta} P(\theta \mid X) \qquad (3)$$

In our case, $\theta$ denotes the centroids $C_j$ of the codebook.

In the verification phase, the closest UBM vectors are searched for each block of the probe face. For the face model, nearest neighbor search is performed on the corresponding adapted vectors only. The match score is the difference of the UBM and target quantization errors [9]:

$$\text{score} = MSE(X, UBM) - MSE(X, C) \qquad (4)$$

where

$$MSE(X, Y) = \frac{1}{|X|} \sum_{x_i \in X} \min_{y_k \in Y} \| x_i - y_k \|^2 \qquad (5)$$
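Equations (4) and (5) translate directly into code. The sketch below scores the vectors of a single block; how the per-block scores are combined into one face-level score is not stated in the text and is therefore left out. The function names and the toy data are assumptions of this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mse(X, Y):
    """Eq. (5): mean over X of the squared distance to the nearest vector in Y."""
    return sum(min(sq_dist(x, y) for y in Y) for x in X) / len(X)


def match_score(X, ubm, target):
    """Eq. (4): UBM quantization error minus target-model quantization error."""
    return mse(X, ubm) - mse(X, target)


# Hypothetical probe vectors, background codebook and claimed-identity codebook.
probe = [[1.0, 0.0], [3.0, 0.0]]
ubm = [[0.0, 0.0]]
target = [[1.0, 0.0], [3.0, 0.0]]
score = match_score(probe, ubm, target)
```

A positive score means the probe is quantized with a smaller error by the claimed identity's codebook than by the background codebook, which supports the identity claim.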




Fig. 5. Example of face images from different sessions of the XM2VTS database.

4 Experimental Analysis

In this section, we use two public databases, namely XM2VTS and BANCA, to extensively evaluate the proposed approach and assess its performance. Moreover, we compare our approach to similar as well as recent state-of-the-art methods.

4.1 Databases

XM2VTS The XM2VTS database [10] contains face videos from 295 subjects. The data were collected in four different sessions separated by one-month intervals. The database consists of a set of 200 training clients, 25 evaluation impostors and 70 test impostors. Figure 5 shows an example of one shot from each session for a database subject.

We use both Lausanne Protocol configurations (LPI and LPII) defined for XM2VTS to assess system performance in verification mode. The database is divided into three subsets: train, evaluation and test. The training data serves for estimating the models. The evaluation subset is used to tune the system parameters. Finally, system performance is estimated on the test subset, using the parameters tuned on the evaluation subset.

BANCA For the BANCA database [11], the English part is used for our tests. It contains 52 users (26 male and 26 female). Faces are collected through 12 different sessions with various acquisition devices of different quality and in different environmental conditions: controlled (high-quality camera, uniform background, controlled lighting), degraded (web-cam, non-uniform background) and adverse (high-quality camera, arbitrary conditions). Examples of the three conditions are shown in Figure 6. For each session, two videos are recorded: a true client access and an impostor attack.

In the BANCA protocol, seven distinct configurations for the training and testing policy have been defined. In our experiments, the configurations referred to as Matched Controlled (Mc), Unmatched Degraded (Ud), Unmatched Adverse (Ua), Pooled Test (P) and Grand Test (G) are used. All of the listed configurations, except protocol G, use the same training conditions: each client is trained using images from the first recording session of the controlled scenario. Testing is then performed on images taken from the controlled scenario (Mc), the adverse scenario (Ua) or the degraded scenario (Ud), while (P) does the test for each of the previously described configurations. Protocol G uses training images from the first recording sessions of the controlled, degraded and adverse scenarios.




Fig. 6. Example of BANCA face images from the three acquisition conditions: controlled, degraded and adverse.

The database is divided into two groups, g1 and g2, used alternately for development and evaluation.

4.2 Setups

In our evaluations, we use the same parameters for both databases. We cropped the faces to a fixed size of 80 x 64 pixels using the provided eye positions. Faces are subdivided into equal blocks of 8 x 8 pixels, yielding 80 blocks per face. No further preprocessing of the face images was performed. We assess the performance of our approach using different codebook sizes ($k = 2, 4, \ldots, 32$) produced by the LBG clustering algorithm, and we report the best results. To illustrate the effect on various LBP histograms, we consider different parameters: $(P, R) \in \{(8, 2), (16, 2), (24, 3)\}$. Finally, for the sake of comparison, all experiments are also performed for the baseline LBP system.

System performance is assessed using the half total error rate (HTER), which is the mean of the false acceptance rate (FAR) and the false rejection rate (FRR) on the evaluation set:

$$HTER = \frac{FAR(\tau) + FRR(\tau)}{2} \qquad (6)$$

The threshold $\tau$ corresponds to the optimal operating point of the development set, defined by the minimal equal error rate (EER).
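The evaluation procedure can be sketched as follows: the threshold is chosen on development scores where FAR and FRR are closest, and the HTER of Eq. (6) is then computed on evaluation scores at that fixed threshold. The function names, the rule "accept if score >= threshold", the candidate-threshold search, and the toy scores are assumptions of this example.

```python
def far_frr(genuine, impostor, threshold):
    """FAR and FRR when a claim is accepted if its score is >= threshold."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def eer_threshold(genuine, impostor):
    """Threshold at which FAR and FRR are closest, searched over observed scores."""
    def gap(t):
        far, frr = far_frr(genuine, impostor, t)
        return abs(far - frr)
    candidates = sorted(set(genuine) | set(impostor))
    return min(candidates, key=gap)


def hter(genuine, impostor, threshold):
    """Eq. (6): mean of FAR and FRR at a fixed threshold."""
    far, frr = far_frr(genuine, impostor, threshold)
    return (far + frr) / 2


# Hypothetical development and evaluation scores.
dev_genuine, dev_impostor = [0.9, 0.8, 0.7, 0.4], [0.1, 0.2, 0.3, 0.6]
eval_genuine, eval_impostor = [0.95, 0.5, 0.85, 0.75], [0.05, 0.65, 0.2, 0.3]

tau = eer_threshold(dev_genuine, dev_impostor)
result = hter(eval_genuine, eval_impostor, tau)
```

Fixing the threshold on the development data before touching the evaluation data keeps the reported HTER free of tuning on the scores it is measured on.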

4.3 Results and discussion

For the sake of clarity and fair evaluation, we only report the best results along with the best per-block feature vector size. Tables 1 and 2 summarize the results obtained on the XM2VTS and BANCA databases, respectively. Compared to the baseline LBP, the results clearly show that our proposed approach yields not only shorter feature vectors but also better verification performance. This is due to the fact that not all the information present in the baseline LBP representation is discriminative. Indeed, most of the bins in the baseline LBP histograms are close to zero and may represent noise. It is worth noting that the gain (i.e. the decrease) in the feature vector length is more significant with larger neighborhood sizes $P$.

In the case of the XM2VTS database (Table 1), our approach matches or outperforms the baseline LBP in all the configurations for both protocols LPI and LPII. The best obtained HTERs are 1.9% and 0.8% for LPI and LPII, respectively, with a per-block feature vector size of 16 in both cases.

Table 1. HTER (%) and per-block feature vector size on the XM2VTS database using the LPI and LPII protocols for different configurations of the LBP baseline and our proposed method.

| Method | LPI HTER | LPI feature size | LPII HTER | LPII feature size |
|---|---|---|---|---|
| LBP(8,2) Baseline | 3.0 | 59 | 2.2 | 59 |
| LBP(8,2)/VQMAP | 3.0 | 16 | 0.8 | 16 |
| LBP(16,2) Baseline | 2.9 | 243 | 2.0 | 243 |
| LBP(16,2)/VQMAP | 2.3 | 16 | 1.1 | 32 |
| LBP(24,3) Baseline | 3.9 | 555 | 2.9 | 555 |
| LBP(24,3)/VQMAP | 1.9 | 16 | 1.0 | 16 |

Table 2. HTER (%) and per-block feature vector size (F. size) on the BANCA database using the Mc, Ud, Ua, P and G protocols for different configurations of the LBP baseline and our proposed method.

| Method | Mc HTER | Mc F. size | Ud HTER | Ud F. size | Ua HTER | Ua F. size | P HTER | P F. size | G HTER | G F. size |
|---|---|---|---|---|---|---|---|---|---|---|
| LBP(8,2) Baseline | 10.5 | 59 | 14.5 | 59 | 17.3 | 59 | 25.0 | 59 | 8.4 | 59 |
| LBP(8,2)/VQMAP | 4.0 | 16 | **11.6** | 32 | **14.9** | 8 | **16.6** | 32 | **4.9** | 32 |
| LBP(16,2) Baseline | 10.9 | 243 | 15.3 | 243 | 18.5 | 243 | 28.4 | 243 | 9.6 | 243 |
| LBP(16,2)/VQMAP | **3.8** | 32 | 18.5 | 4 | 18.8 | 8 | 20.7 | 16 | 6.4 | 32 |
| LBP(24,3) Baseline | 12.3 | 555 | 18.1 | 555 | 25.0 | 555 | 33.3 | 555 | 14.9 | 555 |
| LBP(24,3)/VQMAP | 4.8 | 32 | 18.4 | 16 | 18.2 | 16 | 20.4 | 16 | 5.9 | 32 |

Moreover, our approach shows more robustness than the baseline LBP to the different challenges present in the BANCA database. In fact, our approach outperforms the baseline LBP in almost all the configurations (Table 2). We also note that the best HTERs for the five considered protocols are given by our approach (bold values in Table 2).

Our results confirm the effectiveness of the proposed approach over the baseline LBP method. For the sake of comparison, we also report in Table 3 the main recent state-of-the-art results achieved on the BANCA database. Our approach shows competitive results in the case of the Mc and Ua protocols. For the remaining protocols (Ud, P and G), average performance is obtained. This can be explained by the fact that our approach does not use any preprocessing, while most of the compared methods use preprocessing procedures (like the three-step preprocessing chain proposed by Tan and Triggs [15]) to boost the results on the poor-quality BANCA images. It is also worth noting that, contrary to the compared methods, our method inherits the simplicity and computational efficiency of the baseline LBP approach.

Table 3. HTER (%) for state-of-the-art methods on the BANCA database.

| Method | Mc | Ud | Ua | P | G |
|---|---|---|---|---|---|
| Our approach | 3.8 | 11.6 | 14.9 | 16.8 | 5.9 |
| LBP baseline | 10.5 | 14.5 | 17.3 | 25.0 | 6.4 |
| LBP/MAP [12] | 7.3 | 10.7 | 22.1 | 19.2 | 5.0 |
| LBP-KDE [13] | 4.3 | 6.4 | 18.1 | 17.6 | 4.0 |
| Weighted LBP-KDE [13] | 3.7 | 6.6 | 15.1 | 11.6 | 3.0 |
| Gabor DCT-GMM [14] | - | - | - | 17.3 | 3.6 |
| LGBPHS [14] | - | - | - | 10.8 | 5.9 |

5 Conclusion and future work

In this work, we presented a new generative model for face verification based on LBP features. LBP is first applied to each face. Then, faces are divided into blocks to which vector quantization is applied to create a robust and compact feature vector. Furthermore, we generated a reliable face model by MAP adaptation from the background model, which is obtained from the whole training data. At the verification stage, the LBP patterns of the probe face are matched to the nearest codewords of both the adapted model and the background model, and the difference between the two quantization errors is employed as the similarity between the probe and the claimed identity.

The main advantage of the proposed method is the reduction of the feature vector size compared to classical concatenated LBP histograms. Indeed, competitive results are obtained using a very compact feature vector. Hence, the computational complexity of the system as well as the required storage space can be reduced. Furthermore, the robustness of the system is enhanced by using MAP adaptation to generate the face model. The obtained error rates and the comparison with recent state-of-the-art work demonstrate the efficiency of the proposed approach.

It is worth noting that our proposed approach can also be applied to other LBP variants, such as MSLBP, CLBP and LTP, which provide more accurate discrimination but with larger feature vectors than the original LBP. Furthermore, due to the nature of LBP codes, it is of interest to explore different metrics in both clustering and matching. For instance, the use of the Hamming distance may yield performance improvements.




Acknowledgments

This work is supported in part by the project FNR/CDTA/ASM/BSM/26 and the CDTA/PNR/BIOVISA project. The authors are thankful to the MESRS and DGRSDT for their support.

References

1. Li, S.Z., Jain, A.K., eds.: Encyclopedia of Biometrics. Springer US (2009)

2. Li, S.Z., Jain, A.K., eds.: Handbook of Face Recognition, 2nd Edition. Springer (2011)

3. Ojala, T., Pietikäinen, M., Mäenpää, T.: Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. TPAMI 24 (2002) 971-987

4. Ahonen, T., Hadid, A., Pietikäinen, M.: Face description with local binary patterns: Application to face recognition. TPAMI 28 (2006) 2037-2041

5. Pietikäinen, M., Hadid, A., Zhao, G., Ahonen, T.: Computer Vision Using Local Binary Patterns. Springer (2011)

6. Hautamäki, V., Kinnunen, T., Kärkkäinen, I., Saastamoinen, J., Tuononen, M., Fränti, P.: Maximum a posteriori adaptation of the centroid model for speaker verification. Signal Processing Letters, IEEE 15 (2008) 162-165

7. Linde, Y., Buzo, A., Gray, R.: An algorithm for vector quantizer design. Communications, IEEE Transactions on 28 (1980) 84-95

8. Reynolds, D.A., Quatieri, T.F., Dunn, R.B.: Speaker verification using adapted Gaussian mixture models. Digital Signal Processing 10 (2000) 19-41

9. Kinnunen, T., Saastamoinen, J., Hautamäki, V., Vinni, M., Fränti, P.: Comparing maximum a posteriori vector quantization and Gaussian mixture models in speaker verification. In: Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on. (2009) 4229-4232

10. Messer, K., Matas, J., Kittler, J., Jonsson, K.: XM2VTSDB: The extended M2VTS database. In: Second International Conference on Audio and Video-based Biometric Person Authentication. (1999) 72-77

11. Bailly-Baillière, E., Bengio, S., Bimbot, F., Hamouz, M., Kittler, J., Mariéthoz, J., Matas, J., Messer, K., Popovici, V., Porée, F., Ruiz, B., Thiran, J.P.: The BANCA database and evaluation protocol. In Kittler, J., Nixon, M., eds.: Audio- and Video-Based Biometric Person Authentication. Volume 2688 of Lecture Notes in Computer Science. Springer Berlin Heidelberg (2003) 625-638

12. Rodriguez, Y., Marcel, S.: Face authentication using adapted local binary pattern histograms. In Leonardis, A., Bischof, H., Pinz, A., eds.: Computer Vision - ECCV 2006. Volume 3954 of Lecture Notes in Computer Science. Springer Berlin Heidelberg (2006) 321-332

13. Ahonen, T., Pietikäinen, M.: Pixelwise local binary pattern models of faces using kernel density estimation. In Tistarelli, M., Nixon, M., eds.: Advances in Biometrics. Volume 5558 of Lecture Notes in Computer Science. Springer Berlin Heidelberg (2009) 52-61

14. El Shafey, L., Wallace, R., Marcel, S.: Face verification using Gabor filtering and adapted Gaussian mixture models. In: Biometrics Special Interest Group (BIOSIG), 2012 BIOSIG - Proceedings of the International Conference of the. (2012) 397-408

15. Tan, X., Triggs, B.: Enhanced local texture feature sets for face recognition under difficult lighting conditions. In Zhou, S., Zhao, W., Tang, X., Gong, S., eds.: Analysis and Modeling of Faces and Gestures. Volume 4778 of Lecture Notes in Computer Science. Springer Berlin Heidelberg (2007) 168-182
