

An Analytic Performance Estimation Framework for Multibit Biometric Discretization Based on Equal-Probable Quantization and Linearly Separable Subcode Encoding

Meng-Hui Lim and Andrew Beng Jin Teoh, Senior Member, IEEE

Manuscript received April 21, 2011; revised March 08, 2012; accepted March 12, 2012. Date of publication April 03, 2012; date of current version July 09, 2012. This work was supported by the Korea Science and Engineering Foundation (KOSEF) grant funded by the Korea government (MEST) (2011-8-1095). The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Arun Ross.

M.-H. Lim is with the School of Electrical and Electronic Engineering, College of Engineering, Yonsei University, Seoul, 120-749, South Korea (e-mail: [email protected]).

A. B. J. Teoh is with the School of Electrical and Electronic Engineering, College of Engineering, Yonsei University, Seoul, 120-749, South Korea, and also with the Predictive Intelligence Research Cluster, Sunway University, Bandar Sunway, 46150, P.J. Selangor, Malaysia (e-mail: [email protected]).

Digital Object Identifier 10.1109/TIFS.2012.2191962

Abstract—Biometric discretization derives a binary string for each user based on an ordered set of real-valued biometric features. The false acceptance rate (FAR) and the false rejection rate (FRR) of a binary biometric-based system rely significantly on a Hamming distance threshold, which decides whether the errors in the query bit string will be rectified with reference to the template bit string. Kelkboom et al. have recently modeled a basic framework to estimate the FAR and the FRR of one-bit biometric discretization. However, as the demand for bit strings with higher entropy (informative length) rises, single-bit discretization is becoming less useful, since it cannot produce a bit string longer than the total number of extracted feature dimensions, rendering Kelkboom's model of restricted use. In this paper, we extend the analytical framework to multibit discretization for estimating the performance and the decision threshold for achieving a specified FAR/FRR based on equal-probable quantization and linearly separable subcode encoding. Promising estimation results on a synthetic data set with independent feature components and Gaussian measurements vindicate the analytical expressions of our framework. For experiments on two popular face data sets, however, deviations in the estimation results were observed, mainly due to the mismatch of the independency assumption of our framework. We hence fit the analytical probability mass functions (pmfs) to the experimental pmfs by estimating mean and variance parameters from the difference between the corresponding analytical and experimental curves, thereby alleviating such estimation inaccuracies on these data sets.

Index Terms—Biometric discretization framework, equal-probable quantization, linearly separable subcode (LSSC) encoding, performance estimation.

I. INTRODUCTION

BIOMETRIC cryptosystems such as the fuzzy commitment scheme [9], the helper data system [12], [14], the fuzzy extractor scheme [5], and the key generation scheme [1], [7], [20], [25] have gained increasing popularity in the past decade. However, their implementation is often not straightforward, since the continuous biometrics of a user (e.g., face, speech, retina) cannot be served directly as input to (binary-based) cryptographic systems. Apart from that, cryptographic schemes entailing exactness of the input are unable to tolerate any fuzziness of biometrics caused by their inherent intraclass variation. A promising solution to these challenges is to apply a discretization followed by an error correction to the continuous features in order to obtain an exact binary input [4], [11], [23]. In particular, discretization converts continuous features into their binary counterparts without substantially sacrificing the actual discrimination, while error correction eliminates bit errors of the binary counterpart with reference to a template bit string. This paper focuses on the former. The general block diagram of a biometric discretization-based binary string generator is illustrated in Fig. 1.

Fig. 1. Biometric discretization-based binary string generator.

Biometric discretization can be decomposed into two fundamental components: biometric quantization and feature encoding. These components are governed by a static or dynamic bit allocation algorithm, which determines whether the quantity of binary bits allocated to every feature dimension is fixed or varied, respectively. Typically, given an ordered set of extracted feature elements per user, each single-dimensional feature space is initially constructed and quantized into multiple nonoverlapping intervals according to a quantization fashion. The quantity of these intervals is determined by the corresponding number of bits assigned by the bit allocation algorithm. Each feature element captured by an interval is then mapped to a short binary string with respect to the label of the corresponding interval. Eventually, the short binary strings from all dimensions are concatenated to form the user's final bit string.

Apart from the above consideration, information about the constructed feature space for each dimension (i.e., interval cutpoints and interval quantity) is stored in the form of helper data to enable reproduction of the same binary string for each genuine user. However, it is required that such helper data, upon compromise, should neither leak any information about the output binary string (security concern) nor about the biometric feature itself (privacy concern) that would be beneficial to an attacker.

Due to the noisy nature of biometrics, the extracted query bit string during verification may contain errors (with reference to the template) and thus requires correction. Each extracted genuine binary string has to undergo an error-correction process with a "Bose, Ray-Chaudhuri, Hocquenghem" (BCH) error-correcting code, for instance, before it can be transformed into an exact input for use in the subsequent cryptographic applications. The error-correction mechanism is leveraged in a secure way by a fuzzy commitment or a fuzzy extractor scheme in order to ensure that the relative helper data will not jeopardize user privacy. Examples of such helper data are the Hamming offset between the binary representation of a user and a randomly chosen codeword from the error-correcting code in the fuzzy commitment scheme, and the syndrome of the user's binary representation for syndrome decoding in the fuzzy extractor scheme. This helper data needs to be stored during enrolment and is invoked during query to help correct errors in the genuine bit string if the Hamming distance of the extracted bit string from the template bit string is not more than a preset (Hamming distance) decision threshold (determined by the error-correcting capability of the error-correcting code).
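As a concrete illustration of the offset-style helper data described above, the following is a minimal sketch of the fuzzy commitment construction; the function names and the use of NumPy bit arrays are our own illustrative choices, not an implementation from this paper.

```python
import numpy as np

def commit(template_bits: np.ndarray, codeword: np.ndarray) -> np.ndarray:
    """Fuzzy commitment: store only the offset between the biometric
    template bit string and a randomly chosen error-correcting codeword."""
    return np.bitwise_xor(template_bits, codeword)  # helper data (offset)

def recover(query_bits: np.ndarray, offset: np.ndarray) -> np.ndarray:
    """Shift the query by the stored offset; the result equals the chosen
    codeword corrupted by the query's bit errors, which the error-correcting
    decoder can fix when the Hamming distance is within its capability."""
    return np.bitwise_xor(query_bits, offset)
```

The offset alone reveals neither the template nor the codeword, which is the privacy property the text refers to.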

A. Related Work

Earlier works on biometric discretization mainly focus on seeking an optimal quantization technique based on a (static) one-bit allocation per feature component. A fundamental discretization scheme by Monrose et al. [15], [16], Teoh et al. [20], and Verbitsky et al. [24] partitions each feature space into two intervals (labeled "0" and "1") based on a prefixed threshold. Tuyls et al. [23] and Kevenaar et al. [12] have used a similar 1-bit discretization technique, but instead of fixing the threshold, the mean of the background probability density function [(pdf) for modeling interclass variation] is selected as the threshold in each dimension. Further, reliable components are identified based on either the training bit statistics [23] or a reliability function [12], so that unreliable dimensions can be eliminated from bit extraction.

Steadily rising adversarial capability owing to technological advancement has stimulated many research efforts to solicit discretization techniques that extract a long (informative) binary stream. The basic idea is to increase the entropy (uncertainty) of the binary string so that malicious brute force attacks can be prevented. Unfortunately, the incapability of the above one-bit allocation schemes of producing a lengthy binary stream has been a setback in meeting the current high-entropy security standard. Although the quantity of extracted feature dimensions can be increased for one-bit allocation to allow a longer bit string to be extracted per user, performance degradation would probably occur due to the utilization of such additional (noisy) extracted dimensions.

Hao and Chan [7] have employed a multibit discretization scheme with a dynamic bit allocation. Each single-dimensional genuine interval taking the range $[\mu - k\sigma,\ \mu + k\sigma]$ is initially determined, where $\mu$ and $\sigma$ denote the mean and the standard deviation of the user distribution, with $k$ as a free parameter. The remaining intervals, with a constant width of $2k\sigma$, are constructed from the genuine interval outwards. Thus, there will be a different number of bits allocated to each feature dimension with distinct $\mu$ and $\sigma$. Chang et al. [1] have introduced a scheme very similar to Hao and Chan's scheme [7]. This scheme extends the real feature space of every dimension to account for the extra equal-width intervals, so that the total number of intervals mounts up to $2^n$. Both of these schemes utilize the $n$-bit direct binary representation (DBR) for encoding.

Chen et al. [3] demonstrated a likelihood-ratio-based multibit discretizer with a static bit allocation which derives multiple bits from each feature dimension. The involved quantization scheme constructs intervals in an equal-probable manner, where the background probability mass is equally distributed within each interval. The left-most and right-most intervals with insufficient background probability mass are wrapped into a single interval that is tagged with a common label. The binary reflected gray code (BRGC) [6] is used for encoding.

Yip et al. [25] have used another static multibit discretization scheme based on equal-width interval construction. Likewise, BRGC is adopted for their encoding.

Teoh et al. [19], [21] have proposed a standard deviation-based dynamic bit allocation scheme to search for an optimal bit length assignment for each user. Specifically, each single-dimensional feature space is initially partitioned into $2^n$ equal-width intervals, with $n$ denoting the intended number of bits to be allocated in each dimension. Twice the standard deviation of the estimated user pdf in each dimension is taken as the evaluation measure, determining whether the width of the constructed interval is sufficiently large to accommodate such a pdf in that dimension. With incremental $n$, these procedures are repeated iteratively until the optimal value of $n$ is found.

Chen et al. [4] have proposed a detection-rate-based dynamic bit allocation technique. Basically, the detection rate refers to the maximum user probability mass detected/captured by an interval on a one-dimensional feature space, where this interval is known as the genuine interval of that dimension. The scheme, known as "Detection Rate Optimized Bit Allocation" (DROBA), looks for an optimal set of bit assignments that maximizes the overall detection rate in extracting an $N$-bit binary string. The maximization of the overall detection rate signifies the maximization of the probability of all the features staying in their genuine intervals simultaneously, or the probability of achieving a zero bit error for each genuine user. Since a straightforward brute force search over all possible bit assignments may incur an extremely high computational complexity, the authors have proposed a dynamic programming approach and a greedy search approach with lower computational complexity to search for the optimal bit assignment.

Recently, Chen et al. [2] have developed another dynamic bit allocation algorithm based on optimizing a different bit allocation measure: the area under the FRR curve. Given the bit-error probability, the scheme allocates bits dynamically to every feature component in a similar way as DROBA, except that the analytical area under the FRR curve for Hamming distance evaluation is minimized instead of the detection rate being maximized.

Lim et al. [13] have highlighted the ineffectiveness of BRGC when it is deployed as an encoding scheme for classification, and subsequently introduced a class of encoding schemes known as linearly separable subcodes (LSSC) that is able to preserve the discrimination of the quantized features effectively in the Hamming domain.

B. Motivations and Contributions

Each genuine query binary string has to undergo an error-correction process before the exact template binary string can be recovered. In terms of Hamming distance, the predetermined decision threshold $t$ used to obtain a binary decision (accept or reject a user's query) based on the similarity between the query and template binary strings is typically determined by the error-correcting capacity of the error-correcting code. For instance, a query binary string belonging to a genuine user will be corrected and accepted if the Hamming distance between the query and the template bit string is less than or equal to the decision threshold $t$. Otherwise, if such a Hamming distance is greater than $t$, this will lead to a false rejection. On the other hand, a query binary string belonging to an imposter will not be corrected and will be rejected if the Hamming distance between the query and the template bit string is greater than $t$. Otherwise, if such a Hamming distance is less than or equal to $t$, this will result in a false acceptance. Hence, the false rejection rate (FRR) is the measure of the likelihood that a genuine access attempt will be incorrectly rejected, while the false acceptance rate (FAR) is the measure of the likelihood that an imposter access attempt will be incorrectly accepted by the system. Usually, there exists a trade-off between these two error rates.
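A minimal sketch of this thresholded decision rule (the function name is ours):

```python
import numpy as np

def verify(template: np.ndarray, query: np.ndarray, t: int) -> bool:
    """Accept iff the Hamming distance between the query and the template
    bit strings does not exceed the decision threshold t, which is set by
    the error-correcting capability of the underlying code."""
    return int(np.sum(template != query)) <= t
```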

The choice of $t$ is essential because it directly affects the FAR and the FRR of a biometric-based verification system. In fact, a different $t$ may be desired by different systems, based on the amount of FRR or FAR which the systems can afford to tolerate. However, there is no clear clue as to how $t$ can be determined beforehand with respect to a desired amount of FAR/FRR.

Kelkboom et al. have analytically expressed the genuine and imposter bit-error probabilities [10] and subsequently modeled a discretization framework [11] to analytically estimate the genuine and imposter Hamming distance probability mass functions (pmfs) of a biometric system. This model is based upon a static one-bit equal-probable discretization under the assumption that both intraclass and interclass variations are Gaussian distributed. With proper estimation of the genuine and imposter pmfs, an appropriate $t$ can consequently be estimated according to a prespecified FRR or FAR. The model is first constructed based on a homogeneous conjecture where a common intraclass variation is assumed for all users. This impractical assumption is later relaxed, and feature dependency consideration is further incorporated into the model. However, this model is restricted to estimating the performance of the basic one-bit discretization, and it is not directly extendable to estimating the more essential multibit discretization that is able to derive bit strings with higher entropy.

Motivated by such a limitation, we extend the framework to make the performance estimation applicable to multibit discretization based on equal-probable quantization and LSSC encoding. Similar to Kelkboom et al.'s model, our model is based on a Gaussian assumption on the data. With this framework, we are able to estimate the genuine and the imposter Hamming distance pmfs, the FAR and FRR curves, and the decision threshold for achieving a prespecified FAR/FRR from the analytically estimated bit-error probability, as long as the bit allocation in each dimension; the mean and standard deviation of every user pdf and the background pdf of each feature component; and the averaging factor (which will be defined in Section III-C) are specified. This also implies that our model can equally well be applied to a discretization scheme based on either static multibit allocation or dynamic bit allocation.

Our framework makes use of the statistics estimated directly from the training data to model the multibit discretization process. Since convolutions are involved in combining the pmfs of individual feature dimensions, which are assumed to be mutually independent (but in practice are not), we further extend the framework by optimizing and applying a collection of mean and variance correction parameters to both analytical pmfs with reference to the respective experimental pmfs, to alleviate such a dependency problem.

The organization of this paper is as follows. In Section II, we provide the details of the equal-probable quantization and the LSSC encoding upon which our framework is based. In Section III, we present the analytic expressions for estimating the genuine and imposter Hamming distance pmfs followed by the FAR and FRR curves, and subsequently fit our analytical pmfs to the experimental pmfs to alleviate the feature-component-independency mismatch of our framework. In Section IV, we validate these analytic expressions with a Gaussian synthetic data set and two face data sets (CMU PIE [18] and FERET [17]), and scrutinize the estimations under different settings of dimension, bit allocation per dimension, and feature averaging factor. Lastly, concluding remarks are drawn in Section V.

II. PRELIMINARIES

A. Equal-Probable Quantization

The interclass and intraclass variations in a specific feature dimension can be modeled by a background pdf and a user pdf, respectively. Both distributions are commonly assumed to be Gaussian.

An equal-probable quantization divides a single-dimensional feature space unevenly into a number of intervals based on the background distribution. Every quantization interval shares an equal portion of the background probability mass, as depicted in Fig. 2. Hence, a continuous feature falls into any specific interval with equal probability, and the feature is mapped to the discrete index associated with that interval. Alternatively, quantization can be viewed as a continuous-to-discrete mapping process.

Fig. 2. Background pdf-based construction of four equal-probable quantization intervals with their corresponding indices labeled on each interval.

Since only the Gaussian background pdf is relied upon in equal-probable quantization, one could simply store the mean and standard deviation of the background pdf (instead of the interval cutpoints) as the helper data, so that they can be invoked for quantization on the fly during verification.
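Since each interval encloses a background mass of $1/S$ (for $S$ intervals), the continuous-to-discrete mapping can be realized directly from the stored mean and standard deviation; a minimal sketch, with our own function name, is:

```python
import numpy as np
from scipy.stats import norm

def quantize_equal_probable(x: float, mu: float, sigma: float, S: int) -> int:
    """Map a feature value to one of S equal-probable intervals of a
    Gaussian background pdf N(mu, sigma^2). Because each interval holds
    background mass 1/S, the interval index is simply the background CDF
    value scaled by S."""
    idx = int(np.floor(S * norm.cdf(x, loc=mu, scale=sigma)))
    return min(idx, S - 1)  # guard the right boundary (CDF == 1.0)
```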

B. LSSC Encoding

Encoding is defined as a discrete-to-binary mapping process, where the resultant index of each dimension from the quantization process is mapped to a unique $n$-bit binary codeword of an encoding scheme. The size of a code, which refers to the number of codewords in the code, needs to be matched with the number of intervals in order to avoid any unnecessary entropy loss.

TABLE I. Collection of $n$-bit LSSCs for three code lengths, with $i$ indicating the codeword index.

LSSC [13] is a recent encoding scheme introduced for effective classification. Compared with BRGC and DBR, LSSC appears to be the only encoding scheme that is able to fully preserve the distance among the quantized feature points during the discrete-to-binary mapping process, and it consequently produces promising discretization performance. (See [13] for details on the superiority of LSSC over BRGC and DBR.)

Generally, an $n$-bit LSSC has a code size of $n + 1$. To generate the codewords of an $n$-bit LSSC, one could begin with an arbitrary $n$-bit codeword, say the all-zero codeword. The next $n$ codewords can then be sequentially derived by complementing one bit at a time from the lowest (right-most) to the highest (left-most) bit position. The resultant $n$-bit LSSCs for several code lengths are shown in Table I.

Perhaps the only disadvantage of the LSSC encoding scheme is the bit redundancy it incurs. However, this redundancy is in fact inevitable if a discretization scheme wishes to preserve the separation among the quantized feature points in the Hamming domain [13].
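The generation rule above amounts to a unary (thermometer-style) construction; the following sketch, with function names of our own choosing, generates the $n + 1$ codewords and checks the distance-preserving property that Section III relies on:

```python
import numpy as np

def lssc_codewords(n: int) -> np.ndarray:
    """Generate the n+1 codewords of an n-bit LSSC by starting from the
    all-zero codeword and complementing one bit at a time from the
    right-most to the left-most position."""
    codes = np.zeros((n + 1, n), dtype=np.uint8)
    for i in range(1, n + 1):
        codes[i] = codes[i - 1]
        codes[i, n - i] = 1          # flip the next bit from the right
    return codes

# For any two codewords, the Hamming distance equals the index difference,
# which is the property exploited throughout Section III:
C = lssc_codewords(3)                # [000], [001], [011], [111]
assert all(int(np.sum(C[a] != C[b])) == abs(a - b)
           for a in range(4) for b in range(4))
```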

III. ANALYTIC ESTIMATION OF FAR AND FRR

Suppose that, upon feature extraction, $D$ single-dimensional biometric features are extracted from an identity. An $n_i$-bit binary output is then allocated to the $i$th feature dimension for all $i \in \{1, \dots, D\}$ through discretization. By concatenating the $D$ individual binary outputs, an $N$-bit binary string with $N = \sum_{i=1}^{D} n_i$ is derived.

Let $\mathbf{b}^E$ and $\mathbf{b}^V$ denote the extracted binary string during enrolment and verification, respectively. The Hamming distance between $\mathbf{b}^E$ and $\mathbf{b}^V$ with the same bit allocation for the $D$ dimensions can be defined as the number of disagreeing bits, or bit errors, between them:

$$d_H(\mathbf{b}^E, \mathbf{b}^V) = \sum_{i=1}^{D} d_H(\mathbf{b}_i^E, \mathbf{b}_i^V) \qquad (1)$$

where $d_H(\cdot,\cdot)$ denotes the Hamming distance computation operator.

Since it is possible to get a $\kappa$-bit error out of an $n_i$-bit binary output, for $\kappa \in \{0, 1, \dots, n_i\}$, when the $i$th single-dimensional binary outputs of $\mathbf{b}^E$ and $\mathbf{b}^V$ are compared, the pmf of the single-dimensional Hamming distance is defined as

$$p_i(\kappa) = \Pr\big[d_H(\mathbf{b}_i^E, \mathbf{b}_i^V) = \kappa\big], \qquad \kappa \in \{0, 1, \dots, n_i\}. \qquad (2)$$


Since each dimension is treated independently in discretization, the overall probability distribution of multiple independent dimensions can be obtained through convolution of their individual distributions. When all $D$ dimensions are considered, (2) becomes

$$P(\kappa) = \big(p_1 * p_2 * \cdots * p_D\big)(\kappa) \qquad (3)$$

where $*$ denotes the convolution operator.

Let the subscripts "$G$" and "$I$" indicate belonging to the genuine and the imposter distributions, respectively. Here, the "imposter" is taken to be a passive nongenuine user without the capability of performing active attacks. The FRR with reference to a decision threshold $t$ is defined as the probability of attaining a Hamming distance greater than $t$ for a genuine comparison, that is,

$$\mathrm{FRR}(t) = \sum_{\kappa = t+1}^{N} P_G(\kappa). \qquad (4)$$

On the other hand, the FAR with reference to a decision threshold $t$ is the probability of attaining a Hamming distance not more than $t$ for an imposter comparison, that is,

$$\mathrm{FAR}(t) = \sum_{\kappa = 0}^{t} P_I(\kappa). \qquad (5)$$

Therefore, it is necessary to obtain expressions of the $P_G(\kappa)$ and $P_I(\kappa)$ pmfs for $\kappa \in \{0, \dots, N\}$ in order to solve (4) and (5).
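Given the two overall pmfs, (4) and (5) can be evaluated for every candidate threshold with a pair of cumulative sums; the sketch below assumes the pmfs are supplied as NumPy arrays indexed by Hamming distance:

```python
import numpy as np

def far_frr_curves(pmf_imposter: np.ndarray, pmf_genuine: np.ndarray):
    """Evaluate (4) and (5) for every threshold t = 0..N, given the genuine
    and imposter Hamming distance pmfs over distances 0..N."""
    far = np.cumsum(pmf_imposter)           # FAR(t) = sum_{k<=t} P_I(k)
    frr = 1.0 - np.cumsum(pmf_genuine)      # FRR(t) = sum_{k>t}  P_G(k)
    return far, frr
```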

A. Interval-Cutpoints Derivation for Equal-Probable Quantization

Before we start formulating $p_{i,j}^{I}(\kappa)$ and $p_{i,j}^{G}(\kappa)$, it is necessary to work out the expression for each interval cutpoint, since each interval is bounded by a consecutive pair of cutpoints. These precise locations of the cutpoints are in fact important when it comes to measuring the background and the user probability mass within some intervals during the derivation of $p_{i,j}^{I}(\kappa)$ and $p_{i,j}^{G}(\kappa)$.

Recall that when codewords are used for interval labeling, the code size and the number of quantization intervals must be equal. Thus, we use a common notation $S$ for both quantities. Suppose that we have $S$ quantization intervals denoted by $I_0, I_1, \dots, I_{S-1}$, as shown in Fig. 3. Intuitively, there is a total of $S + 1$ cutpoints, including the boundary cutpoints of the feature space, i.e., $-\infty$ and $+\infty$. For $j \in \{0, 1, \dots, S\}$, let the $j$th interval cutpoint of the $i$th dimension be denoted as $c_{i,j}$, with $c_{i,0} = -\infty$ and $c_{i,S} = +\infty$. This implies that the $j$th interval $I_j$ is bounded by $c_{i,j}$ and $c_{i,j+1}$.

Fig. 3. Equal-probable quantization with $S$ quantization intervals and $S + 1$ interval cutpoints.

Let the $i$th single-dimensional background distribution be represented by a Gaussian pdf $\mathcal{N}(v;\, \mu_i^{b},\, (\sigma_i^{b})^2)$ with mean $\mu_i^{b}$ and standard deviation $\sigma_i^{b}$. The equal probability mass enclosed by the $j$th interval can be expressed in integral form as

$$\int_{c_{i,j}}^{c_{i,j+1}} \mathcal{N}\big(v;\, \mu_i^{b},\, (\sigma_i^{b})^2\big)\, dv = \frac{1}{S}. \qquad (6)$$

Converting (6) to a cumulative probability expression, we have

$$\int_{-\infty}^{c_{i,j}} \mathcal{N}\big(v;\, \mu_i^{b},\, (\sigma_i^{b})^2\big)\, dv = \frac{j}{S} \qquad (7)$$

where $j$ signifies the number of intervals being considered. Note also that, when LSSC encoding is employed, the index difference between any two intervals on a feature space is equivalent to the Hamming distance between the two corresponding binary codewords, so the two quantities can be equated. Expressing the Gaussian integral through the error function, (7) becomes

$$\frac{1}{2}\left[1 + \mathrm{erf}\!\left(\frac{c_{i,j} - \mu_i^{b}}{\sqrt{2}\,\sigma_i^{b}}\right)\right] = \frac{j}{S} \qquad (8)$$

since $\int_{-\infty}^{x} \mathcal{N}(v;\, \mu,\, \sigma^2)\, dv = \frac{1}{2}\big[1 + \mathrm{erf}\big((x-\mu)/(\sqrt{2}\,\sigma)\big)\big]$. Rearranging (8) and solving for the cutpoint, we eventually have

$$c_{i,j} = \mu_i^{b} + \sqrt{2}\,\sigma_i^{b}\; \mathrm{erf}^{-1}\!\left(\frac{2j}{S} - 1\right) \qquad (9)$$

as the expression for the interval cutpoints of equal-probable quantization when $\mu_i^{b}$, $\sigma_i^{b}$, and $S$ are specified. In fact, $\mu_i^{b}$, $\sigma_i^{b}$, and $S$ are the required helper data for equal-probable quantization, enabling the same quantized feature space of the $i$th dimension to be reconstructed during verification.

B. $P_I(\kappa)$ Estimation

Since equal-probable quantization is employed, each of the $S = n_i + 1$ constructed intervals on the $i$th single-dimensional feature space encloses an equal portion of the background probability mass, thus yielding an equal probability of $1/S$ for every interval into which a feature may fall.

For $\kappa$ ranging within $\{0, 1, \dots, n_i\}$, a $\kappa$-bit imposter error probability $p_{i,j}^{I}(\kappa)$ is defined as the background probability mass, or the sum of background probability masses, enclosed by the interval(s) lying $\kappa$ interval(s) away from the genuine interval in the $i$th single-dimensional feature space. Due to LSSC encoding, "$\kappa$ intervals away" is in fact equivalent to "at Hamming distance $\kappa$" where comparison of binary codewords is concerned. For instance, $p_{i,j}^{I}(0)$ reflects the event where an imposter's feature is located in the same interval as the genuine user's feature in the $i$th dimension. Hence, both the genuine and the imposter's features are mapped to a common binary codeword, which eventually contributes to the FAR of the system.

Let $I_{j^*}$ be the $i$th single-dimensional genuine interval of the $j$th user, with index $j^* \in \{0, \dots, S-1\}$, and let $I_{j^* \pm \kappa}$ be an interval that is $\kappa$ intervals away from $I_{j^*}$ in the same dimension, such that $0 \le j^* - \kappa$ or/and $j^* + \kappa \le S - 1$. The $\kappa$-bit imposter error probability of the $j$th user can be formulated as

$$p_{i,j}^{I}(\kappa) = \Pr\big[\text{imposter feature falls within } I_{j^*-\kappa} \text{ or } I_{j^*+\kappa}\big]. \qquad (10)$$

At this step, the problem of (10) branches into two parts:

1) $\kappa = 0$: The first case is totally independent of the index $j^*$. Given an index $j^*$, there can only be one interval satisfying the zero-distance condition, which simply turns out to be $I_{j^*}$ itself. Thus, the (background) probability of a 0-bit error can be directly measured over the $j^*$th interval (the genuine interval), such that

$$\Pr\big[\text{imposter feature falls within } I_{j^*}\big] = \frac{1}{S}. \qquad (11)$$

Subsequently, it follows from (10) that

$$p_{i,j}^{I}(0) = \frac{1}{S}. \qquad (12)$$

2) $1 \le \kappa \le n_i$: Since LSSC is a linear subcode which reflects a linear relation between the Hamming distance and the index difference, the $\kappa$-bit-error probability is measured over the interval(s) lying $\kappa$ intervals to the left or/and right of the $j^*$th interval, depending on how $j^*$, $\kappa$, and $S$ are related. Hence, the measurement of the background probability over $I_{j^* \pm \kappa}$ can further be subdivided into three cases as follows:

a) If $j^* < \kappa \le S - 1 - j^*$, the probability of a $\kappa$-bit error is only measurable over the interval that is $\kappa$ intervals to the right of the $j^*$th interval.

b) If $\kappa \le \min(j^*,\, S - 1 - j^*)$, the probability of a $\kappa$-bit error is measurable over the intervals that are $\kappa$ intervals away on both the left and the right of the $j^*$th interval.

c) If $S - 1 - j^* < \kappa \le j^*$, the probability of a $\kappa$-bit error is only measurable over the interval that is $\kappa$ intervals to the left of the $j^*$th interval.

Thus, from (10), and recalling that every interval encloses a background probability mass of $1/S$, we have

$$p_{i,j}^{I}(\kappa) = \begin{cases} 1/S, & j^* < \kappa \le S-1-j^* \\ 2/S, & \kappa \le \min(j^*,\, S-1-j^*) \\ 1/S, & S-1-j^* < \kappa \le j^* \\ 0, & \text{otherwise.} \end{cases} \qquad (13)$$

Combining the results from (12) and (13), the following piecewise pmf is obtained as the $j$th user's single-dimensional imposter error pmf with respect to $\kappa \in \{0, 1, \dots, n_i\}$:

$$p_{i,j}^{I}(\kappa) = \begin{cases} 1/S, & \kappa = 0 \\ 2/S, & 1 \le \kappa \le \min(j^*,\, S-1-j^*) \\ 1/S, & \min(j^*,\, S-1-j^*) < \kappa \le \max(j^*,\, S-1-j^*) \\ 0, & \text{otherwise.} \end{cases} \qquad (14)$$


A convolution of the pmf in (14) over all $D$ dimensions yields the imposter pmf with respect to the $j$th user:

$$P_{I,j}(\kappa) = \big(p_{1,j}^{I} * p_{2,j}^{I} * \cdots * p_{D,j}^{I}\big)(\kappa). \qquad (15)$$

By taking the average over all $J$ users, the resultant imposter pmf

$$P_I(\kappa) = \frac{1}{J} \sum_{j=1}^{J} P_{I,j}(\kappa) \qquad (16)$$

can eventually be used to estimate the system FAR in (5).
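Because every interval carries background mass $1/S$, (14) reduces to counting the interval(s) at distance $\kappa$, and (15) to a chain of convolutions; the following sketch (with our own naming) illustrates both steps:

```python
import numpy as np

def imposter_pmf_1d(j_star: int, S: int) -> np.ndarray:
    """Single-dimensional imposter error pmf of (14): every interval holds
    background mass 1/S, so p(kappa) just counts how many intervals lie
    kappa intervals away from the genuine interval j_star."""
    p = np.zeros(S)                      # kappa = 0 .. n_i, with n_i = S-1
    for kappa in range(S):
        left = j_star - kappa >= 0
        right = j_star + kappa <= S - 1
        p[kappa] = (left + right) / S if kappa > 0 else 1 / S
    return p

def multi_dim_pmf(pmfs_1d) -> np.ndarray:
    """Convolve per-dimension pmfs, cf. (15): dimensions are treated as
    independent, so the overall Hamming distance pmf is their convolution."""
    total = np.array([1.0])
    for p in pmfs_1d:
        total = np.convolve(total, p)
    return total
```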

C. $P_G(\kappa)$ Estimation

A $\kappa$-bit genuine error probability $p_{i,j}^{G}(\kappa)$ is defined as the user probability mass, or the sum of user probability masses, enclosed by the interval(s) lying $\kappa$ interval(s) away from the genuine interval in the $i$th single-dimensional feature space. Similarly, the case where $\kappa = 0$ implies no bit error in the comparison of the $i$th single-dimensional template and query binary outputs.

Let the $i$th single-dimensional distribution of a user, say the $j$th user, be modeled by a Gaussian pdf with mean $\mu_{i,j}$ and standard deviation $\sigma_{i,j}$. To obtain better classification results, samples of each user can be averaged to reduce the intraclass variation if a biometric system cannot afford to tolerate an excessive FRR. When $m$ samples of the $j$th user are averaged, the intraclass variance decreases by a factor of $m$ compared with the case where no averaging is applied:

$$\sigma_{i,j,m}^{2} = \frac{\sigma_{i,j}^{2}}{m}. \qquad (17)$$

In addition, it is reasonable to assume a common $m$ when measurements are taken during enrolment and verification, since it is simpler and more convenient to work with a consistent $m$ in both phases.

It is worth noting that the user probability mass enclosed by each interval is not known a priori, since the intervals are not segmented according to the user pdf. For this reason, it is necessary to make use of the cutpoints in quantifying such a probability mass. Similar to the previous case, we let $I_{j^*}$ be the $i$th single-dimensional genuine interval of a user with index $j^* \in \{0, \dots, S-1\}$, and $I_{j^* \pm \kappa}$ be an interval that is $\kappa$ intervals away from $I_{j^*}$ in the same dimension, such that $0 \le j^* - \kappa$ or/and $j^* + \kappa \le S - 1$. With the cutpoints readily expressed as a function of $j$ in (9), we can now quantify $p_{i,j}^{G}(\kappa)$ of the $j$th user as

$$p_{i,j}^{G}(\kappa) = \Pr\big[\text{query feature falls within } I_{j^*-\kappa} \text{ or } I_{j^*+\kappa}\big] \qquad (18)$$

where the user pdf $\mathcal{N}\big(v;\, \mu_{i,j},\, \sigma_{i,j}^{2}/m\big)$ is integrated over the corresponding interval(s).

Based on how problem (10) branches, the problem of (18) can similarly be divided into the two following parts:

1) $\kappa = 0$: The probability of a 0-bit error is equivalent to the probability of the feature being located within the genuine interval during enrolment and verification. Thus, from (9), we have

$$p_{i,j}^{G}(0) = \int_{c_{i,j^*}}^{c_{i,j^*+1}} \mathcal{N}\big(v;\, \mu_{i,j},\, \sigma_{i,j}^{2}/m\big)\, dv = \Phi\!\left(\frac{c_{i,j^*+1} - \mu_{i,j}}{\sigma_{i,j}/\sqrt{m}}\right) - \Phi\!\left(\frac{c_{i,j^*} - \mu_{i,j}}{\sigma_{i,j}/\sqrt{m}}\right) \qquad (19)$$

where

$$\Phi(x) = \frac{1}{2}\left[1 + \mathrm{erf}\!\left(\frac{x}{\sqrt{2}}\right)\right] \qquad (20)$$

denotes the standard Gaussian cumulative distribution function.

2) $1 \le \kappa \le n_i$: Being dependent on $j^*$ and $\kappa$, the cases with (a) $j^* < \kappa \le S-1-j^*$ (right interval only), (b) $\kappa \le \min(j^*,\, S-1-j^*)$ (both left and right intervals), and (c) $S-1-j^* < \kappa \le j^*$ (left interval only) have to be taken into consideration, thus leading to

$$p_{i,j}^{G}(\kappa) = \begin{cases} \displaystyle\int_{c_{i,j^*+\kappa}}^{c_{i,j^*+\kappa+1}} \mathcal{N}\big(v;\, \mu_{i,j},\, \sigma_{i,j}^{2}/m\big)\, dv, & j^* < \kappa \le S-1-j^* \\[3mm] \displaystyle\int_{c_{i,j^*-\kappa}}^{c_{i,j^*-\kappa+1}} \mathcal{N}\big(v;\, \mu_{i,j},\, \sigma_{i,j}^{2}/m\big)\, dv + \int_{c_{i,j^*+\kappa}}^{c_{i,j^*+\kappa+1}} \mathcal{N}\big(v;\, \mu_{i,j},\, \sigma_{i,j}^{2}/m\big)\, dv, & \kappa \le \min(j^*,\, S-1-j^*) \\[3mm] \displaystyle\int_{c_{i,j^*-\kappa}}^{c_{i,j^*-\kappa+1}} \mathcal{N}\big(v;\, \mu_{i,j},\, \sigma_{i,j}^{2}/m\big)\, dv, & S-1-j^* < \kappa \le j^* \\[3mm] 0, & \text{otherwise.} \end{cases} \qquad (21)$$

By combining the results from (19) and (21), the $i$th single-dimensional genuine error pmf with respect to $\kappa \in \{0, 1, \dots, n_i\}$ can thus be written as

$$p_{i,j}^{G}(\kappa) = \begin{cases} \text{(19)}, & \kappa = 0 \\ \text{(21)}, & 1 \le \kappa \le n_i. \end{cases} \qquad (22)$$

A convolution of the pmf in (22) over all $D$ dimensions yields the $j$th user's overall genuine pmf:

$$P_{G,j}(\kappa) = \big(p_{1,j}^{G} * p_{2,j}^{G} * \cdots * p_{D,j}^{G}\big)(\kappa). \qquad (23)$$

By taking the average over all $J$ users, the resultant genuine pmf

$$P_G(\kappa) = \frac{1}{J} \sum_{j=1}^{J} P_{G,j}(\kappa) \qquad (24)$$

can eventually be used to estimate the system FRR in (4).

Since $p_{i,j}^{I}(\kappa)$ in (14) as well as $p_{i,j}^{G}(\kappa)$ in (22) is expressed as a user-specific function of the single-dimensional bit assignment $n_i$, the resultant analytic expressions of $P_I(\kappa)$ and $P_G(\kappa)$ in (16) and (24) apply equally well to static and dynamic discretization schemes, with fixed and varying $n_i$ for all $i \in \{1, \dots, D\}$, respectively.
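A sketch of the genuine-side computation in (17)-(22), assuming the cutpoints from (9) are available and using our own function names:

```python
import numpy as np
from scipy.stats import norm

def genuine_pmf_1d(j_star: int, S: int, c: np.ndarray,
                   mu_u: float, sigma_u: float, m: int) -> np.ndarray:
    """Single-dimensional genuine error pmf, cf. (18)-(22): the kappa-bit
    error probability is the user pdf mass (variance shrunk by the
    averaging factor m, cf. (17)) inside the interval(s) lying kappa
    intervals away from the genuine interval j_star. `c` holds the S+1
    cutpoints from (9)."""
    scale = sigma_u / np.sqrt(m)
    mass = norm.cdf(c[1:], loc=mu_u, scale=scale) \
         - norm.cdf(c[:-1], loc=mu_u, scale=scale)   # mass of each interval
    p = np.zeros(S)
    for kappa in range(S):
        idxs = {j_star - kappa, j_star + kappa}      # set: kappa=0 counted once
        p[kappa] = sum(mass[l] for l in idxs if 0 <= l <= S - 1)
    return p
```

The per-dimension pmfs returned here can be combined with the same convolution helper shown after (16) to realize (23).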

D. Pmf-Fitting Due to Mismatch of the Feature-Component-Independency Assumption

The convolutions in (15) and (23) can be applied perfectly when all feature components are mutually independent. However, this assumption, even when the feature extraction is performed through independent component analysis (ICA), often does not hold in practice, as most types of natural data (e.g., image data) do not have truly independent features to be sought [8]. To overcome such feature component dependencies in our model, we follow the strategy of [11] in estimating the dependency among the feature components from the experimental training data. However, we do not assume that the genuine and imposter pmfs are influenced by dependency to the same extent, since the differences between the analytical and the experimental pmfs for the genuine and imposter cases are not of equal magnitude in the case of multibit discretization, as will be illustrated in Section IV-C (experiment part II). Therefore, Kelkboom et al.'s strategy of deriving the curve-fitting parameter for estimating the spread of the genuine pmf from the experimental imposter pmf cannot be reliably adopted in our case.

As a solution, we estimate the dependency from the experimental $P_G^{\exp}(\kappa)$ and $P_I^{\exp}(\kappa)$ pmfs and apply it to the $P_G(\kappa)$ and $P_I(\kappa)$ pmf estimations, respectively. Specifically, we fit the analytical $P_G(\kappa)$ and $P_I(\kappa)$ pmfs to the $P_G^{\exp}(\kappa)$ and $P_I^{\exp}(\kappa)$ pmfs using Gaussian variance and mean fitting parameters. Thus, the Gaussian approximation of the genuine pmf for the $j$th user with mean and variance correction parameters $\Delta\mu_G$ and $\Delta v_G$, as well as the Gaussian approximation of the imposter pmf with mean and variance correction parameters $\Delta\mu_I$ and $\Delta v_I$, can be written, respectively, as

$$\hat{P}_{G,j}(\kappa) = \mathcal{N}\big(\kappa;\; \mu_{G,j} + \Delta\mu_G,\; v_{G,j} + \Delta v_G\big) \qquad (25)$$

and

$$\hat{P}_{I}(\kappa) = \mathcal{N}\big(\kappa;\; \mu_{I} + \Delta\mu_I,\; v_{I} + \Delta v_I\big) \qquad (26)$$

where $\mu_{G,j}$ and $v_{G,j}$ ($\mu_I$ and $v_I$) denote the mean and variance of the analytical pmf $P_{G,j}(\kappa)$ ($P_I(\kappa)$). For both approximations, the optimal parameters can be computed through minimizing the mean square error (MMSE), such that

$$\big(\Delta\mu_G^{*},\, \Delta v_G^{*}\big) = \arg\min_{\Delta\mu_G,\, \Delta v_G} \sum_{\kappa=0}^{N} \Big[\frac{1}{J}\sum_{j=1}^{J}\hat{P}_{G,j}(\kappa) - P_G^{\exp}(\kappa)\Big]^2 \qquad (27)$$

$$\big(\Delta\mu_I^{*},\, \Delta v_I^{*}\big) = \arg\min_{\Delta\mu_I,\, \Delta v_I} \sum_{\kappa=0}^{N} \Big[\hat{P}_{I}(\kappa) - P_I^{\exp}(\kappa)\Big]^2. \qquad (28)$$

By applying these optimal parameters into (25) and (26), we obtain a better estimation of the overall genuine and imposter pmfs:

$$\hat{P}_G(\kappa) = \frac{1}{J}\sum_{j=1}^{J} \mathcal{N}\big(\kappa;\; \mu_{G,j} + \Delta\mu_G^{*},\; v_{G,j} + \Delta v_G^{*}\big) \qquad (29)$$

$$\hat{P}_I(\kappa) = \mathcal{N}\big(\kappa;\; \mu_{I} + \Delta\mu_I^{*},\; v_{I} + \Delta v_I^{*}\big). \qquad (30)$$
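Under the additive-correction reading of (25)-(28) adopted above, the optimal parameters can be searched numerically; the following sketch uses a general-purpose optimizer rather than whatever closed-form or search procedure the authors actually employed:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def fit_pmf(analytic_pmf: np.ndarray, experimental_pmf: np.ndarray):
    """Fit a Gaussian approximation of the analytic pmf to the experimental
    pmf by searching mean and variance correction terms that minimize the
    mean square error, in the spirit of (25)-(28)."""
    k = np.arange(len(analytic_pmf))
    mu0 = float(np.sum(k * analytic_pmf))                 # analytic mean
    var0 = float(np.sum((k - mu0) ** 2 * analytic_pmf))   # analytic variance

    def mse(params):
        d_mu, d_var = params
        approx = norm.pdf(k, loc=mu0 + d_mu,
                          scale=np.sqrt(max(var0 + d_var, 1e-9)))
        return np.mean((approx - experimental_pmf) ** 2)

    res = minimize(mse, x0=[0.0, 0.0], method="Nelder-Mead")
    return res.x  # optimal (mean, variance) corrections
```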



IV. EXPERIMENTS AND ANALYSIS

A. Experiment Setup

A synthetically generated data set and two popular face data sets are used to evaluate the estimation accuracy in this section:

i) Synthetic: This data set contains 50 hundred-dimensional feature vector samples for each of 50 users, constituting a total of 2500 feature vectors. In each dimension, the user means $\mu_{i,j}$ of all users were independently generated based on a (background) Gaussian distribution, and the single-dimensional feature measurements of every user were independently generated based on another Gaussian distribution $\mathcal{N}(\mu_{i,j},\, \sigma_{i,j}^2)$, where $\sigma_{i,j}$ denotes a small random number.

ii) PIE: The adopted data set is a large subset of the CMU PIE face data set [18] that contains a total of 3808 images, with 56 images for each of 68 users. The images of each person were taken under different poses, different illumination conditions, and different expressions. Proper alignment is applied to the images based on standard face landmarks. Due to possible strong variation in hair style, only the face region is extracted for recognition, by cropping each raw image to the size of 32 × 32.

iii) FERET: The adopted data set is a large subset of the FERET face data set [17] in which the images were collected under a semicontrolled environment. It contains a total of 2400 images, with 12 images for each of 200 users. Proper alignment is applied to the images based on standard face landmarks. Due to possible strong variation in hair style, only the face region is extracted for recognition, by cropping each raw image to the size of 61 × 73.

For these data sets, half of each user's images are used for training while the remaining half are used for testing. For measuring the system FAR, each image of each user is matched against an image of every other user according to the image index, while for evaluating the system FRR, each image is matched against every other image of the same user, for every user. The total numbers of genuine and imposter matches are shown in Table II.

TABLE II. Experiment settings.

Generally, the experiments can be divided into three parts: The first part validates the analytic expressions of $P_{I,j}(\kappa)$ in (15), $P_G(\kappa)$ in (24), $\hat{P}_I(\kappa)$ in (30), $\hat{P}_G(\kappa)$ in (29), FAR$(t)$ in (5), and FRR$(t)$ in (4) using the Gaussian synthetic data set, and compares the pmf estimation of Kelkboom et al.'s model [11] with that of our model when $n_i = 1$. The second part examines the capability of the model in handling the same estimations using principal component analysis (PCA) [22] projected data of the PIE and the FERET face data sets, where the realistic concern regarding feature component dependency is present. The third part investigates the accuracy of the estimated equal error rate [(EER), the error rate at which FAR = FRR] and the respective threshold $t_{EER}$ with respect to an increasing number of dimensions, bits allocated to each dimension, as well as the averaging factor, based on the PIE data set. Note that static multibit discretization is applied throughout all parts of the experiment, with an equal number of bits allocated to each dimension.

B. Part I: Validation of Analytic Expressions on the Synthetic Data Set

Fig. 4 validates the estimation results of (a) $P_G(\kappa)$ and $P_I(\kappa)$; (b) FAR$(t)$ and FRR$(t)$; and (c) the corresponding detection error trade-off (DET) for (I) one-bit allocation and (II) multibit allocation, under the ideal scenario where the synthetically generated feature components are truly Gaussian and mutually independent. The experimental curves are tagged with "exp," while the analytic estimations with and without dependency incorporation are tagged with "an-dep" and "an," respectively.

Fig. 4. (I) Comparison of estimation results with Kelkboom's model with reference to the experimental results for one-bit allocation. (II) Validation of the analytic expressions for multibit allocation based on the synthetic data set: (a) the $P_G(\kappa)$ and $P_I(\kappa)$ validation, (b) the FAR and FRR validation, and (c) the corresponding DET curves.

In Fig. 4(Ia) and (IIa), we observe an excellent agreement between the experimental and the analytical pmfs of our model. Regardless of whether feature component dependency is integrated, both the $P_G$-an ($P_I$-an) and $P_G$-an-dep ($P_I$-an-dep) pmfs are found to match very well with $P_G$-exp ($P_I$-exp) under ideal conditions, justifying their corresponding expressions in Section III. With such highly accurate matching results, it is not surprising to further discover satisfying results for the FAR and FRR estimations in Fig. 4(Ib) and (IIb), and the corresponding DET estimations in Fig. 4(Ic) and (IIc). Since the DET curves combine both FAR and FRR, they are susceptible to estimation errors associated with FAR or/and FRR. Hence, the slight deviation of the an-dep curve from the exp curve in Fig. 4 could be due to overfitting of the corresponding training pmfs. However, it is found that the estimated EER and the corresponding threshold $t_{EER}$ differ rather insignificantly from the experimental results.

It is also worth noting that the estimation results of Kelkboom et al.'s single-bit model match very well with the experimental results in Fig. 4(Ia), (Ib), and (Ic), as their model can be regarded as a special case of ours.

C. Part II: Validation of Analytic Expressions on the PIE and FERET Face Data Sets

Fig. 5 illustrates the estimation results for (a) $P_G(\kappa)$ and $P_I(\kappa)$, (b) FAR$(t)$ and FRR$(t)$, and (c) the corresponding DET for static multibit discretization based on (I) the PIE and (II) the FERET face data sets. Different empirical settings of the dimension and the bit allocation for each data set are used to verify the consistency of our results. PCA is adopted for feature extraction for its capability of transforming raw features into features that are significantly more Gaussian [11].

Fig. 5. Validation of the analytic expressions based on the (I) PIE and (II) FERET data sets: (a) the $P_G(\kappa)$ and $P_I(\kappa)$ validation, (b) the FAR and FRR validation, and (c) the corresponding DET curves. The experimental curves are tagged with "exp," while the analytical curves with and without dependency incorporation are tagged with "an-dep" and "an," respectively.

In Fig. 5(Ia) and (IIa), we observe a rather neat agreement between the $P_I$-an and $P_I$-exp pmfs but a large discrepancy between the $P_G$-an and the $P_G$-exp pmfs. In particular, the mean of the $P_G$-an pmf deviates only slightly from that of the $P_G$-exp pmf, but the spread of the $P_G$-exp pmf appears to be much greater than that of the $P_G$-an pmf. This is primarily due to the existence of dependency among the feature components, which affects the $P_G$ and $P_I$ pmfs by different amounts, leading to less precise convolution outcomes in (15) and (23) with reference to the experimental results. The great difference in the dependency influence on the $P_G$ and $P_I$ pmfs explains why Kelkboom et al.'s strategy of fitting an optimal variance parameter estimated from the experimental imposter pmf into the analytic genuine pmf (to correct the spread of $P_G$-an resulting from the feature-independency assumption mismatch) cannot be applied to our case.

After the dependency correction is applied to the model, the $P_G$-an-dep and $P_I$-an-dep pmfs yield substantially improved estimations of the $P_G$-exp and $P_I$-exp pmfs. The slight remaining mismatch in the $P_G$-exp pmf estimation can be explained by the inaccuracy in the estimation of the Gaussian parameters for approximating every user's pdf.

The improvements pertaining to the $P_G$-an-dep and the $P_I$-an-dep pmfs are illustrated in Fig. 5(Ib) and (IIb) by the FRR-an-dep and the FAR-an-dep curves, respectively, with reference to those without feature dependency incorporation. This can be justified by the smaller discrepancy between the FRR-an-dep (FAR-an-dep) curve and the FRR-exp (FAR-exp) curve than that between the FRR-an (FAR-an) curve and the FRR-exp (FAR-exp) curve.

Similar observations can be made in Fig. 5(Ic) and (IIc), where the DET-exp curve is found to be much closer to the DET-an-dep curve than to the DET-an curve. Due to such minor discrepancies, the EER and the corresponding threshold can therefore be reliably estimated with the DET-an-dep curve. In particular, only a marginal EER difference with a 1-bit threshold difference, and a marginal EER difference with a 3-bit threshold difference, were observed with reference to the DET-exp curve in Fig. 5(Ic) and (IIc), respectively. This particularly vindicates the usefulness of the model in estimating the correct threshold for achieving the EER, despite the slight remaining discrepancy of the estimation.

D. Accuracy Evaluation of EER and $t_{EER}$ Estimation on the PIE Face Data Set

Fig. 6 examines the analytic estimation of (I) the EER and (II) the corresponding threshold $t_{EER}$ on the PIE data set with respect to different settings of (a) dimension $D$, (b) bit allocation per dimension $n_i$, and (c) averaging factor $m$. Evidently, the estimation errors of "an-dep" are quite negligible under all such settings. However, if the dependency among feature components is not treated by the model, some patterns of estimation errors can be recognized as we vary the values of each tested parameter. Hence, from these evaluations, we are interested in investigating the effect of feature dependency on the estimation error of "an."

Fig. 6. Analytic estimations of (I) the EER and (II) the corresponding threshold $t_{EER}$ based on the PIE data set.

In Fig. 6(Ia) and (IIa), it is discerned that the estimation errors of EER-an and $t_{EER}$-an tend to increase with the dimensionality $D$, causing such errors to peak at the largest evaluated $D$ for both the EER and $t_{EER}$ estimations. Since the value of $D$ appears to be the only factor that determines the number of convolutions to be performed in (15) and (23), it is anticipated that the estimation inaccuracies accumulate over these convolution operations.

As for the second evaluation, the EER-an and the EER-exp in Fig. 6(Ib) fluctuate insignificantly with respect to increasing bit allocation per dimension. This is particularly due to the saturating nature of the EER performance (at high $n_i$) of LSSC encoding-based discretization [13]. However, a steady increase in the estimation error of $t_{EER}$-an is observed in Fig. 6(IIb). This observation can be explained as follows. The approximately constant EER over $n_i$ stems from the large similarity of the corresponding $P_G$ and $P_I$ pmfs when they are plotted against the normalized Hamming distance, so the normalized $t_{EER}$-an and $t_{EER}$-exp remain nearly constant as $n_i$, and hence the final bit length $N$, grow. By subtracting the normalized $t_{EER}$-an from the normalized $t_{EER}$-exp, the differences between the two sets of thresholds are obtained as [0.0200, 0.0133, 0.0200, 0.0167, 0.0227]. However, when the actual Hamming distance is concerned, the normalized scale has to be multiplied by a constant factor of the final bit length $N$. As such, the difference between the actual threshold Hamming distance and the estimated threshold is amplified correspondingly as $n_i$ increases.

For the last evaluation of the estimation error with respect to the averaging factor $m$, it is noticed that the estimation error decreases for EER-an while it increases for $t_{EER}$-an with increasing $m$. Recall from (17) that the value of $m$ reduces the actual standard deviation of the user pdfs by a factor of $\sqrt{m}$, which eventually leads to better discretization performance. Furthermore, note in (14) that the expression of the $p_{i,j}^{I}(\kappa)$ pmf does not depend on $m$. Hence, the observations from Fig. 6(Ic) and (IIc) imply that when $m$ increases, the dependency among the feature components leads to slower width-shrinking and mean-shifting (to the left) of the $P_G$-exp pmf than of the $P_G$-an pmf. This sufficiently explains why EER-an approaches zero much earlier than EER-exp, while the estimation error of $t_{EER}$ enlarges with increasing $m$.
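As a purely hypothetical numerical illustration of the amplification effect discussed above (the bit lengths below are ours, not taken from the experiments), a fixed normalized threshold gap grows linearly with the final bit length $N$:

```latex
% Hypothetical illustration: a constant normalized threshold gap of 0.02
% translates into an actual Hamming distance gap of 0.02 * N bits.
\Delta t_{\mathrm{actual}} = 0.02 \times N =
\begin{cases}
1 \text{ bit}, & N = 50\\
4 \text{ bits}, & N = 200
\end{cases}
```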

V. CONCLUSION

In this paper, we have proposed a Gaussian analytical framework to estimate the genuine and imposter Hamming distance pmfs, the FRR and FAR curves, and the threshold that achieves a specified FAR or FRR of a multibit discretization based on equal-probable quantization and LSSC encoding. Since the analytical pmfs are modeled based on the bit allocation per individual dimension, this framework is well applicable to discretization schemes with static or dynamic bit allocation. We have validated these analytical expressions experimentally using a synthetic data set whose single-dimensional feature measurements of each individual are independently generated based on a Gaussian distribution. The results show that our model estimates the curves promisingly when the Gaussianity of the feature measurements and the feature-independency assumptions of our framework match the properties of the data set. However, significant estimation errors on the PIE and the FERET face data sets are obtained when dependencies among the feature components are present. Since the differences between the analytical and the experimental pmfs for the genuine and the imposter cases are not of equal magnitude in most cases of multibit discretization, Kelkboom et al.'s strategy of deriving the curve-fitting parameter for estimating the spread of the genuine pmf from the experimental imposter pmf cannot be reliably adopted. As a result, we suggest fitting the analytical pmfs to the experimental pmfs by estimating the mean and the variance parameters from the difference between the corresponding analytical and experimental curves; the experimental results on such real data sets then exhibit considerably smaller estimation deviations. Lastly, under different settings of dimension, bit allocation per dimension, and averaging factor, the experimental results on the PIE data set show that the final framework enables encouraging EER and $t_{EER}$ estimations.

REFERENCES

[1] Y. Chang, W. Zhang, and T. Chen, "Biometric-based cryptographic key generation," in Proc. IEEE Int. Conf. Multimedia and Expo (ICME 2004), 2004, vol. 3, pp. 2203–2206.
[2] C. Chen and R. Veldhuis, "Extracting biometric binary strings with minimal area under the FRR curve for the Hamming distance classifier," Signal Process., vol. 91, no. 4, pp. 906–918, 2011.
[3] C. Chen, R. Veldhuis, T. Kevenaar, and A. Akkermans, "Multi-bits biometric string generation based on the likelihood ratio," in Proc. IEEE Int. Conf. Biometrics: Theory, Applications and Systems (BTAS '07), 2007.
[4] C. Chen, R. Veldhuis, T. Kevenaar, and A. Akkermans, "Biometric quantization through detection rate optimized bit allocation," EURASIP J. Adv. Signal Process., vol. 2009, Article ID 784834, 16 pages, 2009.
[5] Y. Dodis, R. Ostrovsky, L. Reyzin, and A. Smith, "Fuzzy extractors: How to generate strong keys from biometrics and other noisy data," in Proc. EUROCRYPT 2004, 2004, vol. 3027, LNCS, pp. 523–540.
[6] F. Gray, "Pulse code communications," U.S. Patent 2 632 058, Mar. 17, 1953.
[7] F. Hao and C. W. Chan, "Private key generation from on-line handwritten signatures," Inf. Manage. Comput. Security, vol. 10, no. 4, pp. 159–164, 2002.
[8] M. Inki, "A model for analyzing dependencies between two ICA features in natural images," in Proc. 5th Int. Conf. Independent Component Analysis and Blind Signal Separation (ICA '04), 2004, vol. 3195, LNCS, pp. 914–921.
[9] A. Juels and M. Wattenberg, "A fuzzy commitment scheme," in Proc. 6th ACM Conf. Computer and Communications Security (CCS '99), 1999, pp. 28–36.
[10] E. J. C. Kelkboom, G. Garcia Molina, T. A. M. Kevenaar, R. N. J. Veldhuis, and W. Jonker, "Binary biometrics: An analytic framework to estimate the bit error probability under Gaussian assumption," in Proc. Biometrics: Theory, Applications and Systems (BTAS '08), 2008, pp. 1–6.
[11] E. J. C. Kelkboom, G. Garcia Molina, J. Breebaart, R. N. J. Veldhuis, T. A. M. Kevenaar, and W. Jonker, "Binary biometrics: An analytic framework to estimate the performance curves under Gaussian assumption," IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 40, no. 3, pp. 555–571, May 2010.
[12] T. A. M. Kevenaar, G. J. Schrijen, M. van der Veen, A. H. M. Akkermans, and F. Zuo, "Face recognition with renewable and privacy preserving binary templates," in Proc. 4th IEEE Workshop on Automatic Identification Advanced Technologies (AutoID '05), 2005, pp. 21–26.
[13] M.-H. Lim and A. B. J. Teoh, "Linearly separable subcode: A novel output label with high separability for biometric discretization," in Proc. 5th IEEE Conf. Industrial Electronics and Applications (ICIEA '10), 2010, pp. 290–294.
[14] J.-P. Linnartz and P. Tuyls, "New shielding functions to enhance privacy and prevent misuse of biometric templates," in Proc. 4th Int. Conf. Audio- and Video-Based Person Authentication (AVBPA 2003), 2003, vol. 2688, LNCS, pp. 238–250.
[15] F. Monrose, M. K. Reiter, Q. Li, and S. Wetzel, "Cryptographic key generation from voice," in Proc. IEEE Symp. Security and Privacy (S&P '01), 2001, pp. 202–213.
[16] F. Monrose, M. K. Reiter, Q. Li, and S. Wetzel, "Using voice to generate cryptographic keys," in Proc. Speaker Verification Workshop, 2001, pp. 237–242.
[17] P. J. Phillips, H. Moon, P. J. Rauss, and S. Rizvi, "The FERET evaluation methodology for face recognition algorithms," IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 10, pp. 1090–1104, Oct. 2000.
[18] T. Sim, S. Baker, and M. Bsat, "The CMU Pose, Illumination and Expression (PIE) database," in Proc. 5th IEEE Conf. Automatic Face and Gesture Recognition (FGR '02), 2002, pp. 46–51.
[19] A. B. J. Teoh, K.-A. Toh, and W. K. Yip, "$2^N$ discretisation of BioPhasor in cancellable biometrics," in Proc. 2nd Int. Conf. Biometrics (ICB '07), 2007, vol. 4642, LNCS, pp. 435–444.
[20] A. B. J. Teoh, D. C. L. Ngo, and A. Goh, "Personalised cryptographic key generation based on FaceHashing," Comput. Security, vol. 23, no. 7, pp. 606–614, 2004.
[21] A. B. J. Teoh, W. K. Yip, and S. Lee, "Cancellable biometrics and annotations on BioHash," Pattern Recognit., vol. 41, no. 6, pp. 2034–2044, 2008.
[22] M. Turk and A. Pentland, "Eigenfaces for recognition," J. Cognitive Neurosci., vol. 3, no. 1, pp. 71–86, 1991.
[23] P. Tuyls, A. H. M. Akkermans, T. A. M. Kevenaar, G.-J. Schrijen, A. M. Bazen, and R. N. J. Veldhuis, "Practical biometric authentication with template protection," in Proc. 5th Int. Conf. Audio- and Video-Based Biometric Person Authentication (AVBPA 2005), 2005, vol. 3546, LNCS, pp. 436–446.
[24] E. Verbitskiy, P. Tuyls, D. Denteneer, and J. P. Linnartz, "Reliable biometric authentication with privacy protection," in Proc. 24th Benelux Symp. Information Theory, 2003, pp. 125–132.
[25] W. K. Yip, A. Goh, D. C. L. Ngo, and A. B. J. Teoh, "Generation of replaceable cryptographic keys from dynamic handwritten signatures," in Proc. 1st Int. Conf. Biometrics (ICB '06), 2006, vol. 3832, LNCS, pp. 509–515.

Meng-Hui Lim received the B.Eng. degree from Multimedia University, Malaysia, in 2006, the M.Eng. degree from Dongseo University, South Korea, in 2009, and the Ph.D. degree from Yonsei University, Seoul, South Korea, in 2012. His research interests include pattern recognition, cryptography, biometric security, and discretization.

Andrew Beng Jin Teoh (M'06–SM'12) received the B.Eng. degree in electronics and the Ph.D. degree from the National University of Malaysia, in 1999 and 2003, respectively. He is currently an assistant professor in the Electrical and Electronics Department, College of Engineering, Yonsei University, Seoul, South Korea. His research interests are in biometrics security, pattern recognition, and computational intelligence. He has published around 180 international refereed journal and conference papers in his area.