
    3D Face Recognition using Normal Sphere and General Fourier Descriptor

    Andrea F. Abate, Michele Nappi, Daniel Riccio and Gabriele Sabatino

    University of Salerno, Via ponte don Melillo, 84084, Fisciano

    abate, mnappi, driccio, [email protected]

    Abstract

    Today, the face figures among the most promising biometrics, allowing
    people to be identified without any physical contact. In this research
    field, 3D data provides a significant improvement in recognition
    performance, but existing approaches show limitations in dealing with
    pose variations; indeed, 3D face surfaces need to be aligned before the
    matching operation. This paper proposes an approach that overcomes this
    limitation by projecting the 3D shape information onto the 2D surface of
    a normal sphere, while a rotation-invariant descriptor is used to extract
    key features from this surface. In addition, using a 2D descriptor
    reduces the computing time, which is a typical drawback of 3D methods.
    Experiments have been conducted on a proprietary face dataset to assess
    the robustness of the method with respect to a large set of facial
    expression and pose variations.

    1. Introduction

    The face is a particularly interesting biometric, being one of the most
    common cues humans use in their visual interactions. Furthermore, it is
    widely considered one of the most accepted biometrics among the existing
    ones, since it provides a high recognition rate without requiring any
    contact with the sensor surface. Many 2D algorithms are available in the
    literature [5], but none of them is free of limitations (e.g.
    illumination and pose changes). Thus 3D processing has been proposed as
    an alternative; it has the potential to improve performance by
    overcoming the shortcomings of 2D systems. Indeed, 2D data provides just
    a two-dimensional projection of facial features, while 3D data
    represents a discrete approximation of the face's three-dimensional
    geometry.

    In this way, it becomes easier to address some intra-class variations
    such as facial expression changes and the presence of glasses or a
    beard. Furthermore, exploiting the depth information allows us to take
    advantage of topological descriptors (e.g. local curvatures) for
    localizing face features.

    Early research on 3D face recognition was conducted over a decade ago,
    as reported by Bowyer et al. [3] in their recent survey on this topic,
    and many different approaches have been developed over time to address
    this challenging task. All the existing methods can be grouped into two
    classes according to the way in which they address the pose variation
    problem. The first class contains all the approaches working with
    pre-aligned models; two examples are given by [6] and [2]. In [6] the
    authors compare faces by measuring distances between any two coarsely
    aligned 3D facial surfaces by means of the Iterative Closest Point (ICP)
    method, while in [2] the matching operation is performed by combining
    similarity scores coming from comparisons of 3D and 2D pre-aligned
    profiles.

    We present a 3D face recognition method based on the normal sphere, a
    spherical surface representing the local curvature of a 3D polygonal
    mesh in terms of RGB color data (similar to the normal maps used in
    [1]). A 2D classification process is then used to save a meaningful part
    of the computational cost while preserving the rotation invariance
    property. The rest of the paper is organized as follows. Section 2
    presents the system overview in more detail. Section 3 discusses the
    indexing and retrieval techniques. Section 4 discusses some experimental
    results. Section 5 presents conclusions and directions for future
    research.

    2. A System Overview

    The key idea of the system consists in projecting the 3D shape
    information of the face onto the 2D surface of a normal sphere. The
    normal sphere is a 3D primitive that is recursively generated from an
    icosahedron, in which all sides are spatially equidistributed
    equilateral triangles.
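
    The recursive generation from an icosahedron can be illustrated with a
    minimal sketch. It assumes the usual midpoint subdivision scheme (each
    triangle split into four, new vertices pushed back onto the unit
    sphere); the paper does not state which scheme was used, and the names
    icosahedron, subdivide and icosphere are hypothetical. After projection
    the triangles are only approximately equilateral.

```python
import math

def normalize(v):
    """Scale a 3D vector to unit length so it lies on the unit sphere."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def icosahedron():
    """Return (vertices, faces) of an icosahedron inscribed in the unit sphere."""
    t = (1.0 + math.sqrt(5.0)) / 2.0  # golden ratio
    raw = [(-1, t, 0), (1, t, 0), (-1, -t, 0), (1, -t, 0),
           (0, -1, t), (0, 1, t), (0, -1, -t), (0, 1, -t),
           (t, 0, -1), (t, 0, 1), (-t, 0, -1), (-t, 0, 1)]
    faces = [(0, 11, 5), (0, 5, 1), (0, 1, 7), (0, 7, 10), (0, 10, 11),
             (1, 5, 9), (5, 11, 4), (11, 10, 2), (10, 7, 6), (7, 1, 8),
             (3, 9, 4), (3, 4, 2), (3, 2, 6), (3, 6, 8), (3, 8, 9),
             (4, 9, 5), (2, 4, 11), (6, 2, 10), (8, 6, 7), (9, 8, 1)]
    return [normalize(v) for v in raw], faces

def subdivide(verts, faces):
    """Split every triangle into four, reusing midpoints shared by two faces."""
    verts = list(verts)
    cache = {}  # edge (low index, high index) -> index of its midpoint vertex

    def midpoint(a, b):
        key = (min(a, b), max(a, b))
        if key not in cache:
            mid = tuple((verts[a][k] + verts[b][k]) / 2.0 for k in range(3))
            verts.append(normalize(mid))  # push the midpoint onto the sphere
            cache[key] = len(verts) - 1
        return cache[key]

    new_faces = []
    for a, b, c in faces:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        new_faces += [(a, ab, ca), (b, bc, ab), (c, ca, bc), (ab, bc, ca)]
    return verts, new_faces

def icosphere(levels):
    """Apply `levels` rounds of subdivision to the base icosahedron."""
    verts, faces = icosahedron()
    for _ in range(levels):
        verts, faces = subdivide(verts, faces)
    return verts, faces
```

    Each level multiplies the number of triangles by four, so the
    resolution of the sphere is controlled by a single integer.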

    Each triangle on this surface is characterized by the normal $v$ to its
    surface, and each normal has three components $(v_x, v_y, v_z)$.
    Although face rotations change the values of the normal components, the
    angle between the normals of contiguous triangles remains unchanged
    under different poses; in order to exploit this property, we store only
    the angles between these normals. The homogeneous distribution of these
    triangles still retains the spatial relationships between facial
    features and gives us a two-dimensional representation of the original
    face geometry.

    The 18th International Conference on Pattern Recognition (ICPR'06) 0-7695-2521-0/06 $20.00 2006

    Each band (Red, Green and Blue) is then processed independently by
    applying a 2D Fourier-based descriptor that produces a fixed number of
    coefficients, as shown in Figure 1. Finally, the global feature vector
    is obtained by concatenating the coefficients of all three bands.

    Figure 1. Enrolment workflow.

    2.1. Face Acquisition and Processing

    3D face acquisition is the first step of the enrolment process, and
    results in a polygonal surface representing the input face. We opted
    for a structured laser that scans the face, reconstructing vertices and
    polygons.

    Computational complexity represents a serious limitation for 3D
    applications, including 3D-based recognition systems. We therefore
    describe a way of sub-sampling the 3D information in a 2D space while
    still preserving most of the original information. Indeed, the local
    curvature of a polygonal mesh is faithfully represented by polygon
    normals, and the angles between pairs of normals can be represented as
    an RGB colour in a 2D space.

    To this aim, we first need to project vertex coordinates onto a 2D
    curvilinear space; this task can be thought of as the inverse of the
    well-known mapping technique. We use a spherical projection (re-adapted
    to the mesh size), because it fits the 3D shape of the face mesh well.
    In more detail, let $M = \{ v_i \in \mathbb{R}^3 \mid v_i = (x_i, y_i, z_i) \}$
    be a mesh; we want to relate each $v_i \in M$ with a point on the
    spherical surface represented by the ordered pair of polar coordinates
    $(\theta_i, \phi_i)$, where $0 \le \theta_i < 2\pi$ and
    $0 \le \phi_i \le \pi$. This is done by the following formulas:

    $$\theta_i = \tan^{-1}\left(\frac{y_i}{x_i}\right), \qquad \phi_i = \cos^{-1}\left(\frac{z_i}{r}\right) \qquad (1)$$

    where $r$ is the diameter of mesh $M$ (Figure 2 shows a spherical
    projection). Then, we can store the normals of mesh $M$ in a
    two-dimensional structure, namely the normal sphere $N$, by using the
    2D-projected vertex coordinates given by equation (1).
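
    Equation (1) can be sketched in a few lines. The sketch assumes that
    vertex coordinates are expressed relative to the centre of the
    projection sphere and that the mesh diameter is the largest distance
    between two vertices; neither detail is stated in the paper, and the
    function names are hypothetical. atan2 is used in place of a plain
    arctangent so that the azimuth covers the full range from 0 to 2*pi.

```python
import math

def mesh_diameter(vertices):
    """Largest pairwise vertex distance (quadratic time; assumed definition)."""
    return max(math.dist(a, b) for a in vertices for b in vertices)

def project_vertex(v, r):
    """Map a vertex (x, y, z) to the angles (theta, phi) of equation (1)."""
    x, y, z = v
    theta = math.atan2(y, x) % (2.0 * math.pi)      # azimuth wrapped to [0, 2*pi)
    phi = math.acos(max(-1.0, min(1.0, z / r)))      # inclination in [0, pi]
    return theta, phi

def project_mesh(vertices):
    """Project every vertex of a mesh using the mesh diameter as r."""
    r = mesh_diameter(vertices)
    return [project_vertex(v, r) for v in vertices]
```

    The clamp on z / r only guards against rounding error; with r taken as
    the mesh diameter the ratio already lies between -1 and 1 for centred
    coordinates.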

    Figure 2. Projecting the (a) 3D mesh vertices in (b)

    spherical coordinates.

    In order to obtain an optimized tessellation of the 3D geometry we have
    to reshape the triangles by fixing the vertex coordinates according to
    a regular structure recursively generated from an icosahedron (see
    Figure 3). There is thus only one normal for each point (triangle) on
    the surface of the normal sphere; each normal is characterized by its
    components, so that the spatial distribution of the normals can be
    represented in a two-dimensional structure $N$ by representing each
    normal as a point on the normal sphere surface. In detail, to each
    triangle $t_i$ in $N$ with vertex coordinates $(v_i^1, v_i^2, v_i^3)$
    we assign the RGB value obtained from the corresponding normal $n_i$ to
    the triangle $t_i$ on the mesh $M$. We refer to the resulting structure
    as the Normal Sphere $N$ of the mesh $M$.

    Figure 3. Mesh details before and after resampling.

    In order to nullify the effects of face rotations we consider the
    angles between the normals of contiguous triangles. In more detail, for
    each triangle $t_i$ on $N$ its three contiguous neighbors $t_j$, $t_k$,
    $t_l$ are considered, and the angles $\angle(t_i, t_j)$,
    $\angle(t_i, t_k)$ and $\angle(t_i, t_l)$ between their normals are
    computed and sorted ($\alpha_1$, $\alpha_2$, $\alpha_3$). Then
    $\alpha_1$, $\alpha_2$ and $\alpha_3$ are quantized with 8 bits in the
    range [0, 255] and taken as the R, G and B bands, so the resulting
    encoding can be seen as a 24-bit color (see Figure 1). This is the new
    facial shape to which the GFD descriptor is applied.
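
    The encoding of one triangle can be sketched as follows. The sketch
    assumes a linear quantization of angles from 0 to pi onto 0..255 and an
    ascending sort mapped to (R, G, B); the paper gives neither the
    quantization rule nor the sort direction, and the function names are
    hypothetical.

```python
import math

def angle_between(n1, n2):
    """Angle in radians between two normals (they need not be unit length)."""
    dot = sum(a * b for a, b in zip(n1, n2))
    norm = math.sqrt(sum(a * a for a in n1)) * math.sqrt(sum(b * b for b in n2))
    return math.acos(max(-1.0, min(1.0, dot / norm)))  # clamp rounding error

def quantize(angle, max_angle=math.pi):
    """Map an angle in [0, max_angle] to an 8-bit value (assumed linear)."""
    return min(255, max(0, round(255 * angle / max_angle)))

def triangle_color(normal, neighbor_normals):
    """Return the (R, G, B) code of a triangle from its three neighbours."""
    angles = sorted(angle_between(normal, n) for n in neighbor_normals)
    return tuple(quantize(a) for a in angles)
```

    Because only angles between normals enter the code, applying the same
    rotation to the triangle and to its neighbours leaves the colour
    unchanged, which is the property the paragraph above relies on.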

    3. Face Classification

    Turning the 3D classification problem into a 2D classification task
    allows us to save a meaningful part of the computational cost. However,
    we want the rotation invariance property to be preserved, so some care
    has to be taken in selecting a 2D face descriptor. In this case, we
    opted for a Fourier-based operator. The Fourier Transform (FT) is
    widely used in image processing, because working in the frequency
    domain often allows some limitations to be overcome (e.g. noise,
    shifts). In particular, we adopted an FT-based descriptor defined by
    Zhang in [7] for classifying binary trademarks, and readapted it to our
    purposes.

    Given an image $I$ in the Cartesian space $xOy$, we convert it into a
    polar space $rO\theta$ by relating $I(x, y)$ with $I(r, \theta)$,
    defining $r = \sqrt{(x - x_c)^2 + (y - y_c)^2}$ and
    $\theta = \operatorname{atan2}(y - y_c,\ x - x_c)$, where
    $O = (x_c, y_c)$ is the center of the Cartesian space. The polar
    Fourier Descriptor is defined on this polar space as follows:

    $$FD(\rho, \phi) = \sum_{r} \sum_{i} I(r, \theta_i)\, \exp\left[ j 2\pi \left( \frac{r}{R}\rho + \frac{2\pi i}{T}\phi \right) \right] \qquad (2)$$

    where $0 \le r < R$ and $\theta_i = i(2\pi/T)$, $0 \le i < T$;
    $0 \le \rho < R$ and $0 \le \phi < T$. Here $\rho$ and $\phi$ index the
    selected radial and angular frequencies. The FD descriptor is made
    rotation invariant by retaining only the magnitude of the coefficients,
    while robustness with respect to scaling is achieved by dividing the
    first coefficient by the area containing the image and all the
    remaining coefficients by the first one, as follows:

    $$FDV = \left\{ \frac{|FD(0,0)|}{area},\ \frac{|FD(0,1)|}{|FD(0,0)|},\ \ldots,\ \frac{|FD(m,n)|}{|FD(0,0)|} \right\} \qquad (3)$$
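
    A minimal sketch of equations (2) and (3) follows. It assumes the image
    has already been resampled on an R x T polar grid around the chosen
    centre, and it evaluates the angular exponent as theta_i * phi, that is,
    a plain discrete Fourier transform over the angle. This is an
    interpretation rather than a literal transcription of equation (2), and
    it is the form under which a rotation (a circular shift along the
    angular axis) changes only the phase of each coefficient. The names
    polar_fd, feature_vector and face_descriptor are hypothetical.

```python
import cmath
import math

def polar_fd(polar, m, n):
    """Coefficients FD[rho][phi] for 0 <= rho <= m, 0 <= phi <= n.

    polar[r][i] holds I(r, theta_i) on an R x T polar grid.
    """
    R, T = len(polar), len(polar[0])
    fd = []
    for rho in range(m + 1):
        row = []
        for phi in range(n + 1):
            acc = 0j
            for r in range(R):
                for i in range(T):
                    # assumed kernel: DFT in both radius and angle
                    arg = 2.0 * math.pi * (r * rho / R + i * phi / T)
                    acc += polar[r][i] * cmath.exp(1j * arg)
            row.append(acc)
        fd.append(row)
    return fd

def feature_vector(polar, m, n, area):
    """Normalised magnitudes as in equation (3); requires a non-zero FD(0,0)."""
    fd = polar_fd(polar, m, n)
    dc = abs(fd[0][0])
    vec = [dc / area]
    for rho in range(m + 1):
        for phi in range(n + 1):
            if (rho, phi) != (0, 0):
                vec.append(abs(fd[rho][phi]) / dc)
    return vec

def face_descriptor(bands, m, n, area):
    """Concatenate the feature vectors of the R, G and B polar images."""
    out = []
    for band in bands:
        out.extend(feature_vector(band, m, n, area))
    return out
```

    With this kernel, shifting every row of the polar image by the same
    number of angular samples yields the same feature vector up to
    floating-point error, so an in-plane roll of the face does not affect
    the descriptor.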

    The most important factor is the way in which the point
    $O = (x_c, y_c)$ is located. Indeed, imposing restrictions on it would
    place this method on a par with all the approaches requiring a
    pre-alignment of the face. We solve this problem by keeping in mind
    that the objects to be classified are faces, and extensive
    experimentation showed that we can locate the nose tip as the point of
    maximum curvature on the face surface. By fixing this point on the
    normal sphere as the center of the Cartesian space, left/right and
    bottom/up rotations are made irrelevant, while rolls (7th case in
    Figure 4) vanish thanks to the rotation/scaling invariance of the FD
    descriptor.

    4. Experimental Results

    We built a gallery of 120 subjects enrolled with a structured light
    scanner from Inspeck Corp. In this case the pose variations were
    acquired through multiple scans of the same individual. Acquisitions
    differ in pose and expression; in more detail, ten 3D models with
    different facial expressions were acquired for each subject, and 5
    poses per subject were considered. Figure 4 shows a subset of the poses
    and expressions we considered.

    Figure 4. Pose variations (1st, 2nd and 3rd neutral, 45° and 60° right
    side, bottom, and rolled orientation).

    In the first experiment we investigated which resolution provides the
    highest value of the CMS. The gallery set contains 120 images with
    neutral expression, one per subject, while the probe set consists of
    1080 images, that is, the remaining 9 expressions for the 120 subjects.
    Figure 5 shows that 16×16 pixels are too few, giving very poor results
    in terms of CMS. The best results are achieved at a resolution of
    32×32, while performance gets worse again when the resolution
    increases. This behavior can be explained by considering that high
    resolutions carry too much high-frequency information that is of no use
    in this case.
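
    The evaluation protocol can be sketched as below. The sketch assumes
    that CMS denotes the cumulative match score at a given rank and that
    feature vectors are compared with the Euclidean distance; the paper
    states neither the expansion of the acronym nor the metric, and the
    function name and data layout are hypothetical.

```python
import math

def cms(gallery, probes, rank=1):
    """Fraction of probes whose true identity is among the `rank` nearest
    gallery entries.

    gallery: dict mapping subject id -> feature vector (one per subject)
    probes:  list of (true subject id, feature vector) pairs
    """
    hits = 0
    for true_id, vec in probes:
        # order gallery subjects by assumed Euclidean distance to the probe
        ranked = sorted(gallery, key=lambda gid: math.dist(gallery[gid], vec))
        if true_id in ranked[:rank]:
            hits += 1
    return hits / len(probes)
```

    In the first experiment the gallery would hold the 120 neutral
    descriptors and the probe list the 1080 remaining ones, with the score
    recomputed for each image resolution.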

    The second experiment tests the robustness of the method with respect
    to pose variations. In particular, as discussed in Section 3,
    left/right and bottom/up face rotations are made irrelevant by locating
    the nose tip on the normal sphere; thus we considered here a gallery
    set of 20 aligned images, as in the first experiment, while the probe
    set consists of 180 images (9 per subject) with different facial
    expressions, rolled by a random angle $\theta \in [0, \pi/6]$. In
    Figure 6 we compare the performance of NS+GFD, NS+PCA and the normal
    map (NM) based method from [1] when 30° head rolling occurs. Results
    confirm the superiority of the proposed method with respect to both
    facial expression and pose variation.

    Figure 5. CMS for different image resolutions.

    5. Concluding remarks

    We presented a novel 3D face recognition approach based on the normal
    sphere, a 2D data structure representing the local curvature of the
    facial surface, aimed at biometric applications. The 2D classification
    task allows us to save a meaningful part of the computational cost
    while preserving the rotation invariance property. The method proved to
    be simple, invariant to pose variations, fast and with a high average
    recognition rate. As the normal sphere is a 2D mapping of mesh
    features, ongoing research will integrate additional 2D color
    information (texture) captured during the same enrolment session.
    Implementing a true multi-modal version of the basic algorithm, which
    correlates the texture and the normal image, could further enhance the
    discriminating power even for complex 3D recognition issues such as the
    presence of a beard, moustache, eyeglasses, etc.

    Figure 6. CMS on rolled faces.

    6. References

    [1] A. F. Abate, M. Nappi, S. Ricciardi, G. Sabatino, "Fast 3D Face
    Recognition Based On Normal Map", in Proc. of IEEE International
    Conference on Image Processing (ICIP05), Genova, Italy, 2005.

    [2] C. Beumier and M. Acheroy, "Face verification from 3D and grey
    level cues", in Pattern Recognition Letters, Vol. 22, No. 12, pp.
    1321-1329, 2001.

    [3] K. Bowyer, K. Chang, P. Flynn, "A survey of approaches to
    three-dimensional face recognition", in Proc. of 17th International
    Conference on Pattern Recognition (ICPR 2004), Vol. 1, pp. 358-361,
    Aug. 2004.

    [4] A. M. Bronstein, M. M. Bronstein, and R. Kimmel, "Expression
    invariant 3D face recognition", in Proc. of Audio- and Video-Based
    Person Authentication (AVBPA 2003), Guildford, UK, Lecture Notes in
    Computer Science, J. Kittler and M.S. Nixon, Vol. 2688, pp. 62-69,
    2003.

    [5] R. Chellappa, C.L. Wilson, S. Sirohey, "Human and machine
    recognition of faces: A Survey", in Proc. of the IEEE, Vol. 83, No. 5,
    pp. 705-740, 1995.

    [6] G. Medioni and R. Waupotitsch, "Face recognition and modeling in
    3D", in Proc. of IEEE Int'l Workshop on Analysis and Modeling of Faces
    and Gestures (AMFG 2003), Nice, France, pp. 232-233, Oct. 2003.

    [7] D. Zhang, G. Lu, "Shape-based image retrieval using generic Fourier
    descriptor", in Signal Processing: Image Communication, Vol. 17, pp.
    825-848, 2002.
