
Low-cost Automatic Inpainting for Artifact Suppression in Facial Images

André Sobiecki¹, Alexandru Telea¹, Gilson Giraldi², Luiz Antonio Pereira Neves³ and Carlos Eduardo Thomaz⁴

¹Scientific Visualization and Computer Graphics, University of Groningen, Nijenborgh 9, Groningen, The Netherlands
²Department of Computing, National Laboratory of Scientific Computation, Petrópolis, Brazil
³Department of Computing, Paraná Federal University, Curitiba, Brazil
⁴Department of Artificial Intelligence Applied on Automation, University Center of FEI, São Bernardo do Campo, Brazil

{a.sobiecki, a.c.telea}@rug.nl, [email protected], [email protected], [email protected]

Keywords: Image Inpainting, Face Reconstruction, Statistical Decision, Image Quality Index.

Abstract: Facial images are often used in applications that need to recognize or identify persons. Many existing facial recognition tools have limitations with respect to facial image quality attributes such as resolution, face position, and artifacts present in the image. In this paper we describe a new low-cost framework for preprocessing low-quality facial images in order to render them suitable for automatic recognition. For this, we first detect artifacts based on the statistical difference between the target image and a set of pre-processed images in the database. Next, we eliminate artifacts by an inpainting method which combines information from the target image and similar images in our database. Our method has low computational cost and is simple to implement, which makes it attractive for usage in low-budget environments. We illustrate our method on several images taken from public surveillance databases, and compare our results with existing inpainting techniques.

1 INTRODUCTION

Facial recognition is the process of determining the identity of a person based on information obtained from facial appearance (Ayinde and Yang, 2002). Compared to other person identification methods such as biometric fingerprints, retina scans, and voice recognition, facial recognition has the advantage that it can be used in contexts where the collaboration of the person to be identified is not possible (Zhao et al., 2003; Hjelmas and Low, 2001). One such context is the identification of missing people based on existing photographs thereof.

In the last decades, automatic face recognition has received considerable attention from the scientific and commercial communities. However, several open issues remain in this area. One such issue is that facial recognition tools are in general not effective for poor-quality facial images, e.g. in the presence of shadows, artifacts, and blurring (Zamani et al., 2008; Zhao et al., 2003; Castillo, 2006).

In some contexts, facial photographs are the only key to person identification. For example, the sites of the Australian Federal Police (AFP, 2012), the Federal Brazilian Government (FGB, 2012), and the UK Missing People Centre (MPO, 2012) publish facial photographs of missing people. Often, such photographs are old, poorly digitized, and have artifacts such as folds, scratches, irregular luminance, molds, stamps, and written text, all of various sizes, shapes, textures, and colors. Since the effectiveness of facial recognition methods depends on the quality of input images, it is of high importance that such images present the discriminant facial features (eyes, mouth, and nose) with minimal artifacts. Moreover, it is desirable that they all have a standard look, e.g. avoid outlier elements such as glasses, hair locks, highlights, and smiles.

We present here a computational framework for segmentation and automatic restoration of poor-quality facial images, based on a statistical model built from frontal facial images and on inpainting techniques. The location of facial features is obtained from a mean image generated from a sample population of conforming facial images, which provides privileged information about the spatial location of these features. Salient outliers are found by statistically comparing the input image with this mean image. These outliers are next eliminated by a modified inpainting technique which uses information from both the input image itself and the closest high-quality image in our image database. This produces images of sufficient quality for typical face recognition tasks.


In contrast to most existing digital restoration methods, we require no user input to mark the image regions which have to be restored. Our method is simple to implement and runs in real-time on low-cost PCs, which makes it attractive for utilization by government agencies in least developed countries.

2 RELATED WORK

Many image inpainting techniques have been proposed in the last decade. Oliveira et al. (Oliveira et al., 2001) present a simple inpainting method which repeatedly convolves a 3×3 filter over the regions to inpaint. Although fast, this method yields significant blurring. Bertalmio et al. (Bertalmio et al., 2000; Bertalmio et al., 2001) estimate the local image smoothness, using the image Laplacian, and propagate this along isophote directions, estimated by the image gradient rotated 90 degrees and by the Navier-Stokes equation. The Total Variational (TV) model (Chan and Shen, 2000a) uses an Euler-Lagrange equation coupled with anisotropic diffusion to preserve the isophote directions. Telea (Telea, 2004) inpaints an image by propagating the image information from the boundary towards the interior of the damaged area, following a fast marching approach and a linear extrapolation of the image field outside the missing region. These methods give good results when the regions to restore are small.

To inpaint thicker regions, the Curvature-Driven Diffusion (CDD) method (Chan and Shen, 2000b) enhances the TV method to drive diffusion along isophote directions. Diffusion-based texture synthesis techniques have been proposed to achieve higher quality inpainting results (Bugeau and Bertalmio, 2009; Bugeau et al., 2009). Closer to our proposal, Li et al. (Li et al., 2010) present semantic inpainting, where the damaged image is restored based on texture synthesis using a most similar image from a given database. Perez et al. (Perez et al., 2003) and Jeschke et al. (Jeschke et al., 2009) present seamless image cloning, which solves a Poisson equation to compute the seamless filling, guided by a vector field that incorporates prior knowledge of the damaged domain.

Image cloning gives very good results, but is relatively expensive in CPU and memory terms. Joshi et al. use example images, taken from a person's photo collection, to improve the luminance, blur, contrast, and shadows of that person's photograph (Joshi et al., 2010). Chou et al. also use the seamless image cloning of Perez et al. for the editing of facial features in digital photographs (Chou et al., 2012). However, in contrast to our goal of automatic removal of facial outlier artifacts, their goal was to assist users in previewing the results of explicitly chosen image modifications.

To achieve our goal of facial image preprocessing for supporting automatic missing persons identification in low-cost contexts, we need a method which is able to:

- automatically detect the most salient artifacts in a low-quality facial photograph;
- remove these artifacts in a plausible way;
- work (nearly) automatically;
- require very low computational costs.

Next, we present our proposal to address these requirements.

3 PROPOSED METHOD

Figure 1 illustrates our computational pipeline for facial image restoration. We start by computing an image quality index, which classifies the input image as potentially benefiting from artifact removal or not (Sec. 3.1). If the image can be improved, we next identify, or segment, its artifacts using a statistical decision method where the image is compared to an existing database of facial images of various ethnicities (Sec. 3.2). The detected artifacts are next slightly enlarged using morphological dilation (Aptoula and Lefevre, 2008) to remove small-scale spatial noise such as artifact boundary jaggies. Finally, we eliminate the artifacts by semantic inpainting, a variation of the method presented in (Li et al., 2010), which combines information from the input image and the image database (Sec. 3.3).

All our images (input and database) are normalized and equalized by the framework proposed in (Amaral and Thomaz, 2008). This is needed since existing facial image databases around the world contain images of several sizes, resolutions, and contrasts. The steps of our pipeline are explained next.
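As a rough illustration, the sketch below approximates this normalization step with plain grayscale conversion, resizing, and histogram equalization; the actual framework of (Amaral and Thomaz, 2008) additionally aligns facial landmarks spatially, which we omit here. Function and parameter names are ours.

    import cv2

    def normalize_face(img, size=(128, 128)):
        # Stand-in for the NEQ normalization: grayscale, common resolution,
        # equalized histogram. The real framework also spatially aligns faces.
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) if img.ndim == 3 else img
        gray = cv2.resize(gray, size, interpolation=cv2.INTER_AREA)
        return cv2.equalizeHist(gray)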

3.1 Image Quality Index

Objective image quality measures have an important role in image processing applications such as compression, analysis, registration, restoration, and enhancement. One such simple measure is the image quality index (Wang et al., 2004), which compares two images x and y based on luminance l(x,y), contrast c(x,y), and structure s(x,y) comparison measures:


Figure 1: Pipeline of proposed method.

l(x,y) = 2 μ_x μ_y / (μ_x² + μ_y²),    (1)

c(x,y) = 2 σ_x σ_y / (σ_x² + σ_y²),    (2)

s(x,y) = σ_xy / (σ_x σ_y),    (3)

where x = {x_i | 1 ≤ i ≤ n} and y = {y_i | 1 ≤ i ≤ n} are the two n-pixel images to compare. Here, μ_x, μ_y, σ_x², σ_y², and σ_xy are the means, variances, and covariance of x and y, respectively (Sellahewa and Jassim, 2010), i.e.

μ_x = (1/n) Σ_{i=1}^{n} x_i,    μ_y = (1/n) Σ_{i=1}^{n} y_i

σ_x² = (1/(n−1)) Σ_{i=1}^{n} (x_i − μ_x)²,    σ_y² = (1/(n−1)) Σ_{i=1}^{n} (y_i − μ_y)²

σ_xy = (1/(n−1)) Σ_{i=1}^{n} (x_i − μ_x)(y_i − μ_y)

The quantities l(x,y) and c(x,y) range between 0 and 1, while s(x,y) ranges between -1 and 1. For normalized and equalized (NEQ) images, as in our case, l(x,y) and c(x,y) have a maximum of 1, since NEQ images have very similar standard deviation and mean values. In contrast, covariance values σ_xy remain distinct in NEQ images. Hence, s(x,y) (Eqn. 3) is a good candidate for assessing the similarity of two NEQ images. Note that s(x,y) does not give a direct descriptive representation of the image structures: it reflects the similarity between two image structures, where s(x,y) equals one if and only if the structures of the two images x and y are exactly the same.

We use the image quality index s for two purposes. First, given an input image x, we compare it with the average image x̄ of our pre-computed image database (see Sec. 3.2), using s(x, x̄). If x is too far away from x̄, then x is either not a face image, or an image we cannot improve; in this case, we stop our pipeline. This is further detailed in the quality index discussion in Sec. 4. Otherwise, we determine the so-called outlier artifacts which differentiate x from x̄, and suppress them, as shown next.

3.2 Statistical Artifact Segmentation

Once we have chosen to improve the quality of an input image, we aim to segment those image parts, or artifacts, which deviate significantly from typical average images. These will be the target of our inpainting (Sec. 3.3).

For artifact segmentation, we use statistical decision methods based on inference theory (Spiegel and Stephens, 2008; Bussab and Morrettin, 2002), where samples in a database can generate privileged information. This makes it possible to discriminate artifacts, which we next want to remove, from regular image pixels. Given a database of N NEQ-normalized frontal facial images x_i, the mean image is given by:

x̄ = (1/N) Σ_{i=1}^{N} x_i.    (4)

We compute mean images x̄_DS, x̄_LS, and x̄_JP separately for images of people of three ethnicities (dark skin (DS), light skin (LS), and japanese (JP)). This avoids mixing up too different facial traits. Figure 2 shows the mean and standard deviation images for these three groups. For dark skin people, we use the FERET database (65 images) (Phillips et al., 2000) and 14 images taken from the FEI database (Thomaz and Giraldi, 2010). For light skin people, we consider 100 images from the FEI database. For japanese people, we use the JAFFE database (100 images) (Lyons et al., 1998). These databases contain a variety of images of people of different races, ages, and appearances. Although all images are equalized, facial structure differences exist, e.g. larger nose and lips (DS images), shadows around the eyes (LS images), and significant mouth position variations (JP images).

We next determine the ethnicity of the input image x by finding its closest mean image x̄_min ∈ {x̄_DS, x̄_LS, x̄_JP}, in terms of Euclidean distance.
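The mean images and the nearest-mean selection reduce to a few lines; the sketch below assumes each group database is a list of equally sized NEQ images (names are ours):

    import numpy as np

    def mean_image(images):
        # Eqn. (4): per-pixel mean over a stack of NEQ-normalized face images.
        return np.mean(np.stack(images).astype(np.float64), axis=0)

    def closest_mean(x, means):
        # Pick the group whose mean image is nearest to x in Euclidean
        # distance; `means` maps labels 'DS', 'LS', 'JP' to mean images.
        x = x.astype(np.float64)
        return min(means, key=lambda k: np.linalg.norm(x - means[k]))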


Finally, we use statistical segmentation to find the artifact regions to be removed. To illustrate this, let us suppose that the face samples are represented by a random field x that follows a normal distribution with mean x̄_min and standard deviation s.

Thus, the distribution of the standardized variable is given by:

z = (x − x̄_min) / s.    (5)

Here, s is the per-pixel standard deviation image, computed similarly to x̄, i.e. separately for the light skin, dark skin, and japanese face databases; the standardized variable z then has mean 0 and variance 1.

Table 1 shows the values of per-pixel z scores and significance levels α.

Figure 2: Mean images x̄ and color-coded standard deviation s images for the japanese, dark skin, and light skin ethnicities: (a) x̄_JP, japanese; (b) s_JP, japanese; (c) x̄_DS, dark skin; (d) s_DS, dark skin; (e) x̄_LS, light skin; (f) s_LS, light skin.

The set of z scores outside the range [−2.58, 2.58] is called the critical region of the hypothesis, or region of significance. This is the region of rejection of the hypothesis. The set of z scores inside the range [−2.58, 2.58] is called the region of hypothesis acceptance, or the non-significance region.

Table 1: Statistical significance.

Level of significance α | Values of z
10%                     | -1.645 .. 1.645
5%                      | -1.96 .. 1.96
1%                      | -2.58 .. 2.58
0.1%                    | -3.291 .. 3.291

In practice, we observed that a value of α ≈ 10% gives a good selection of atypical face features. In other words, for each pixel z of the image z, computed by Eqn. (5), we decide whether it is an artifact pixel or not, based on the value of z and our threshold α (Tab. 1). The set Ω of artifact pixels is removed as described next.
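Putting Eqn. (5), the Table 1 threshold, and the dilation of Sec. 3 together, a minimal sketch of this decision (z_crit = 1.645 corresponds to α = 10%; names are ours):

    import cv2
    import numpy as np

    def segment_artifacts(x, mean_img, std_img, z_crit=1.645):
        # Eqn. (5): per-pixel z score against the chosen group's statistics;
        # pixels whose |z| exceeds the critical value are flagged as artifacts.
        z = (x.astype(np.float64) - mean_img) / np.maximum(std_img, 1e-6)
        mask = ((np.abs(z) > z_crit) * 255).astype(np.uint8)
        # Slightly dilate the mask to remove jagged artifact boundaries (Sec. 3).
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
        return cv2.dilate(mask, kernel, iterations=1)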

3.3 Semantic Inpainting

Now that we know which pixels of our input image are likely to be artifacts, we show a way to remove them by using inpainting. As our approach combines the fast marching based inpainting (Telea, 2004) and Poisson image editing (Perez et al., 2003) techniques, we first briefly outline relevant aspects of these techniques.

3.3.1 Inpainting using Fast Marching

The inpainting technique based on the fast marching method (FMM) considers a first order approximation I_q(p) of the image at a point p situated on the boundary ∂Ω of the region to inpaint Ω:

I_q(p) = I(q) + ∇I(q) · (p − q),    (6)

where q is a neighbor pixel of p located within a ball B_ε(p) of small radius ε centered at p, and I(q) is the image value at q. Each point p is inpainted as a function of all points q ∈ B_ε(p), by summing the estimates of all points q, weighted by a weighting function w(p,q):

I(p) = Σ_{q ∈ B_ε(p)} w(p,q) I_q(p) / Σ_{q ∈ B_ε(p)} w(p,q),    (7)

where the application-dependent weights w(p,q) are normalized, i.e. Σ_{q ∈ B_ε(p)} w(p,q) = 1.

The boundary ∂Ω is advanced towards the interior of Ω using the FMM (Sethian, 1999).


While doing this, the FMM implicitly computes the so-called distance transform DT: Ω → R⁺,

DT(p ∈ Ω) = min_{q ∈ ∂Ω} ‖p − q‖.    (8)

As the FMM algorithm visits all pixels of the damaged region, the method gradually propagates gray-value information from outside Ω inwards by computing expression (7).
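This exact algorithm is available in OpenCV, so a working call needs a single line; here `img` is an 8-bit image and `mask` an 8-bit image whose nonzero pixels mark Ω, while the radius of the ball B_ε is our choice:

    import cv2

    def fmm_inpaint(img, mask, radius=3):
        # FMM inpainting of (Telea, 2004); `radius` is the ball B_eps of Eqn. (7).
        return cv2.inpaint(img, mask, inpaintRadius=radius,
                           flags=cv2.INPAINT_TELEA)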

This inpainting method is fast and simple. However, for regions thicker than roughly 10..25 pixels, it creates an increasing amount of blurring as we go farther from ∂Ω into Ω, given the smoothing nature of Eqn. 7. Since our facial artifacts are not guaranteed to be thin, this method is not optimal.

3.3.2 Poisson Image Editing

This image restoration method achieves seamless filling of the damaged domain by using data from a source image. It renders remarkably good results, even for thick target regions. The method is based on solving the equation:

ΔI = div v on Ω,    (9)

with I known over the boundary ∂Ω, and where v is the so-called guidance field, derived from a source image. For instance, if we desire to seamlessly clone a source image I_src onto Ω, we can set v = ∇I_src, which effectively reduces Eqn. 9 to ΔI = ΔI_src on Ω (Eqn. 10 in (Perez et al., 2003)). Solving the Poisson problem (Eqn. 9), however, is expensive in terms of memory requirements, which is critical in our low-cost scenario (Farbman et al., 2009).
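For intuition only, a deliberately naive Jacobi iteration for this special case ΔI = ΔI_src on Ω; real implementations use fast Poisson or multigrid solvers, and we assume the mask does not touch the image border:

    import numpy as np

    def seamless_clone(target, source, mask, iters=2000):
        # Solve Laplacian(I) = Laplacian(I_src) on the masked region, with I
        # fixed to the target image elsewhere (Dirichlet boundary condition).
        I = target.astype(np.float64).copy()
        S = source.astype(np.float64)
        ys, xs = np.nonzero(mask)
        for _ in range(iters):
            # Jacobi update: neighbor average of I plus the source Laplacian.
            lap_src = (4 * S[ys, xs] - S[ys - 1, xs] - S[ys + 1, xs]
                       - S[ys, xs - 1] - S[ys, xs + 1])
            I[ys, xs] = (I[ys - 1, xs] + I[ys + 1, xs]
                         + I[ys, xs - 1] + I[ys, xs + 1] + lap_src) / 4.0
        return I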

3.3.3 Proposed Inpainting Method

Both the FMM inpainting method and Poisson image editing essentially use only the non-artifact areas of the input image, close to the damaged region, to restore its artifacts.

However, we have additional information coming from our high-quality image database. Hence, our proposal is to combine the fast-and-simple inpainting method of (Telea, 2004) with the Poisson image editing of (Perez et al., 2003), augmented with information extracted from our image database (Fig. 3).

We start by applying FMM inpainting (Sec. 3.3.1) on the artifact region Ω ⊂ x of our input image x. We denote the result of this step, i.e. the image x where Ω has been inpainted, by I_inp. Next, we search our face image database DB for the closest facial image I_near, i.e.

I_near = argmin_{y ∈ DB} ‖x − y‖.    (10)

As outlined in Sec. 3.3.1, when Ω is thick, FMM inpainting produces a blurred result I_inp. We correct this by mixing I_inp and I_near to yield:

I_2 = (1 − DT) · I_inp + DT · I_near,    (11)

where DT is the distance transform of ∂Ω (Eqn. 8). This progressively mixes the inpainting result I_inp with the nearest image I_near. At the border ∂Ω, we see the inpainted image, which smoothly extrapolates the non-artifact area of the input image into the damaged region. In the middle of Ω, the blurring observed in I_inp is attenuated by the high-quality image I_near.
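A sketch of Eqn. (11) using OpenCV's distance transform; we normalize DT to [0,1] so that the blend weights are well defined, an assumption the equation leaves implicit:

    import cv2
    import numpy as np

    def blend_with_nearest(I_inp, I_near, mask):
        # Distance of every artifact pixel to the mask boundary (Eqn. 8),
        # normalized: 0 on the boundary, 1 deep inside the artifact region.
        dt = cv2.distanceTransform(mask, cv2.DIST_L2, 3).astype(np.float64)
        dt /= max(dt.max(), 1e-6)
        # Eqn. (11): trust the FMM result near the boundary, the nearest
        # database image in the thick interior where FMM blurs.
        return ((1.0 - dt) * I_inp.astype(np.float64)
                + dt * I_near.astype(np.float64))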

Next, we calculate the Laplacian ΔI_near = ∂²I_near/∂x² + ∂²I_near/∂y² of the closest image I_near, using central differences, and add this component to our restoration, i.e. compute

I_3 = I_2 + ΔI_near.    (12)

This is a simplified form of the 'seamless cloning' presented in (Perez et al., 2003), which just adds high-frequency features to I_2 through the Laplacian operation ΔI_near. In contrast to (Perez et al., 2003), our approach is much cheaper, as it does not require solving a Poisson equation on Ω.

As a final step, we apply a 3×3 median filter on I_3 to yield the final reconstruction result. This further removes small-scale noise created by using the finite-difference Laplacian (Eqn. 12).
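These final two steps reduce to a Laplacian filter and a median filter; a minimal sketch (in OpenCV, ksize=1 selects the 4-neighbor central-difference Laplacian kernel):

    import cv2
    import numpy as np

    def add_detail_and_denoise(I2, I_near):
        # Eqn. (12): re-introduce high-frequency detail from the nearest image.
        lap = cv2.Laplacian(I_near.astype(np.float64), cv2.CV_64F, ksize=1)
        I3 = np.clip(I2 + lap, 0, 255).astype(np.uint8)
        # 3x3 median filter to suppress small-scale finite-difference noise.
        return cv2.medianBlur(I3, 3)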

4 RESULTS

Our method can produce good artifact removal even in large and thick regions, due to the mix of inpainting (which preserves information specific to the input image) and prior information (which introduces information from a similar high-quality image from our image database). A key element is that our image database is composed of high-quality images, which are therefore considered to be good for facial recognition purposes. Moreover, this information is suitable both for extracting the artifacts (Sec. 3.2) and for restoring the corresponding damaged areas.

We tested our method on 79 frontal facial images provided by several public organizations: the Federal Brazilian Government (FGB, 2012), the Australian Federal Police (AFP, 2012), and the UK Missing People Organization (MPO, 2012). These images contain various levels of artifacts, e.g. annotations, scratches, glasses, obscuring facial hair, folds, and exposure problems. All input images were normalized and equalized as outlined in Sec. 3.


Figure 3: Diagram of the proposed inpainting method (see Sec. 3.3).

Segmentation. Figure 4 shows the results of applying our segmentation approach to the images of column (a), using the light-skin mean image (Fig. 4 b), the dark-skin mean image (Fig. 4 c), and the japanese mean image (Fig. 4 d). We observe that for significance levels α ≥ 1.0% we can extract the artifacts (glasses) when using the light-skin and dark-skin mean images. Segmentation of artifacts works best, and has similar perceived quality, for dark-skin and light-skin images (see Fig. 4). For several images, the japanese mean image also gives very good results. Although segmentation cannot select all details where an input image differs from its ethnic group's statistical average, it does a good job in selecting details deemed to be unsuitable for official person recognition photographs.

Figure 4: Segmentation comparison, with α = 0.1% (red), α = 1% (dark blue), and α = 10% (light blue): (a) original image; (b) light-skin mean image x̄_LS; (c) dark-skin mean image x̄_DS; (d) japanese mean image x̄_JP.

Significance Level. Figure 5 shows results when changing the significance level α (Sec. 3.2). If we choose high values for α, the method extracts the artifacts as well as other regions, as we can see in Figure 5. Conversely, choosing low α values prevents this problem, but fails to segment the entire region of interest. The optimal value for α is application-dependent, and in general requires manual fine-tuning. For the tests performed, the value α = 10% has given good results in all cases. This can be verified in Figure 6, which shows the results for several values of α.

Artifacts. A smile is considered an artifact, as it does not comply with official standards for facial images concerning expression, illumination, and background (ANSI, 2004; ISO, 2004; Thakare and Thakare, 2012). Figure 7 presents an example of smile suppression. The bottom row shows a case where the smile has been well segmented and suppressed. The top row shows a less successful case: part of the large smile (see Fig. 7 a, top) has not been captured by the segmentation, and as such, it leaked into the final reconstruction (Fig. 7 d, top). Also, note that our method can produce changes in facial expressions. For example, Fig. 7 (bottom row, d) shows a face where the eyes are a mix between the input image and the nearest image, whereas the nose and cheeks follow the input image more closely.

Figure 8 shows further results, and also compares them with three known inpainting techniques. In the top row, we see an image with a hair lock artifact. Segmenting this artifact is easy, as it is much darker than the light-skin mean image x̄_LS. Inpainting this artifact is also relatively easy, as it is not thick.


Figure 5: Segmentation results: (a) original image; (b) segmentation; (c) reconstruction.

Figure 6: Results at each value of α.

The second row shows a face image which has a highlight, as well as a thin horizontal crease (most probably due to damage to the photograph) on the cheeks. This artifact is more complicated to capture, since its gray-value field is similar to the nearby mean image pixels. However, the result (column g) shows that this artifact is also largely eliminated from the input image.

The third row of Figure 8 shows an image with glasses as an outlier artifact.

Figure 7: Smile elimination: (a) input image; (b) nearest image; (c) segmentation; (d) proposed inpainting.

Existing inpainting methods (columns d-f) are less successful than our method (column g) in eliminating this artifact, because our method also reconstructs the eyes. The fifth row of Figure 8 shows a similar case. Here, the artifact region is thin, so all four tested inpainting methods eliminate the glasses equally well.

The fourth row of Figure 8 shows an image where a large shadow artifact appears under the nose, on the cheeks, and around the mouth. As expected, standard inpainting cannot easily remove this problem, since the shadow slightly extends outside of the segmented region, at the base of the nose (dark area under the nose, Fig. 8 c, fourth row). By using the nearest image, our method removes the shadow more effectively while generating fewer spurious details. A similar effect is observed in Figure 8 (bottom row), where our method both removes the cheek highlights and reconstructs the open eyes better than existing methods.

Quality Index. Figure 9 shows the average structural image quality index (Equation 3) for the three ethnicities present in our face image database, the original input images for inpainting, and the results of the studied inpainting methods. Several observations follow. First, as expected, the face database images have a relatively high quality index (average s ∈ [0.6, 0.8]). Secondly, input images which are faces have a lower quality (average s ≈ 0.49). This motivates our proposal to adjust such images to bring them in line with the quality level of the face database. In contrast, input images which are not faces have a much lower quality index (average s ≈ 0). This allows us to decide whether an input image should be improved or deemed not improvable (because it is not a face): we choose a suitable threshold t, in this case t = 0.1. Input images with s > t are likely faces, so they are further improved; images with s < t are likely non-faces, and are thus skipped from the process. Thirdly, we notice that the nearest image I_near used in our inpainting


Figure 8: Method results: (a) original image; (b) nearest image (not the same person, just a similar one); (c) segmentation, where green indicates significance level α = 10%, yellow 5%, blue 1%, and red 0.1%; (d) inpainting (Oliveira et al., 2001); (e) inpainting (Bertalmio et al., 2001); (f) inpainting (Telea, 2004); (g) our method.


Figure 9: Structural image quality index for the face database images (light skin (LS), dark skin (DS), japanese (JP)), the nearest images, the non-faces, the input images to be restored, and the inpainting results of Bertalmio et al., Oliveira et al., Telea, and our method.

(Sec. 3.3.3) has a high average quality index, which justifies its explicit usage during inpainting. Finally, we see that our method yields, on average, results with a higher structural quality than the other studied inpainting methods.

Limitations. Despite its capabilities, our method cannot handle facial images which deviate too much from the information provided by the predefined image database. Figure 10 shows such an example, which is the worst case we found in our tests: the input image (a) is too far away from both the average dark-skin image (Fig. 2 c) and the nearest image (Fig. 10 b) to successfully remove the facial hair, nose pin, and large opaque glasses. However, we note that this type of outlier is easily detectable by our approach: when the input image is too far away from the mean image, as mentioned above, our method reports that it is unlikely to improve this image, and stops.

Database. Selecting good-quality images for the face database serves two purposes. First, this allows specifying what one considers to be an acceptable facial image in a given context (e.g. open eyes; no smiles, shadows, adornments, or imprints). Statistical characteristics of this collection are used to implicitly determine what the outliers are, and thus what has to be suppressed in a given input image. Secondly, features of the nearest image in this database are used in the restoration. Hence, if one wants to allow certain specific features to persist in the restored images, this can be done by inserting images containing such features in the database.

Computational Aspects. The proposed method is simple to implement. For an image of n pixels, its memory and computational complexities are O(n) and O(n log n) (Sethian, 1999), respectively,

Figure 10: Challenging case for our method: (a) input image; (b) nearest image; (c) segmentation; (d) inpainting.

in contrast to more sophisticated inpainting algorithms (Bertalmio et al., 2001; Chan and Shen, 2000a; Chan and Shen, 2000b; Perez et al., 2003; Joshi et al., 2010). This makes our method suitable for computers with limited memory.

Validation. Although we have not validated the added value of our artifact suppression method in a full face recognition pipeline, feedback from São Paulo law enforcement confirmed that artifact removal is indeed helpful in recognition tasks. Future work should quantify the improvement in automatic face recognition tasks.

5 CONCLUSIONS

We have presented a computational framework for automatic inpainting of facial images. Given a database of facial images which are deemed to be of good quality for recognition tasks, our method automatically identifies outlier (artifact) regions, and reconstructs these by using a mix of information present in the input image and information from the provided image database. The proposed method is simple to implement and has low computational requirements, which makes it attractive for low-cost usage contexts such as government agencies in least developed countries.


We have tested our method on three real-world public image databases of missing people, and compared our restoration results with three popular image inpainting methods.

Potential improvements lie in the areas of more robust segmentation using artifact-specific quality metrics, using the k nearest images (k ≥ 1) for inpainting, and an actual evaluation of inpainting in a real-world face recognition set-up.

REFERENCES

AFP (2012). Australian Federal Police, National Missing Persons Coordination Centre. http://www.missingpersons.gov.au.

Amaral, V. and Thomaz, C. (2008). Normalização espacial de imagens frontais de face. Technical report, Dept. of Electrical Engineering, Univ. Center of FEI, Brazil. http://fei.edu.br/~cet/relatorio tecnico 012008.pdf.

ANSI (2004). US standard for digital image formats for use with the facial biometric (INCITS 385).

Aptoula, E. and Lefevre, S. (2008). A comparative study on multivariate mathematical morphology. Pattern Recogn. Lett., 29(2):109-118.

Ayinde, O. and Yang, Y. (2002). Face recognition approach based on rank correlation of Gabor-filtered images. Pattern Recogn., 35(6):1275-1289.

Bertalmio, M., Sapiro, G., and Bertozzi, A. (2001). Navier-Stokes, fluid dynamics, and image and video inpainting. In Proc. CVPR, pages 355-362.

Bertalmio, M., Sapiro, G., Caselles, V., and Ballester, C. (2000). Image inpainting. In Proc. ACM SIGGRAPH, pages 417-424.

Bugeau, A. and Bertalmio, M. (2009). Combining texture synthesis and diffusion for image inpainting. In Proc. VISAPP, pages 26-33.

Bugeau, A., Bertalmio, M., Caselles, V., and Sapiro, G. (2009). A unifying framework for image inpainting. Technical report, Institute for Mathematics and its Applications (IMA). http://www.ima.umn.edu.

Bussab, W. and Morrettin, P. (2002). Estatística Básica. Editora Saraiva, São Paulo, Brazil.

Castillo, O. (2006). Survey about facial image quality. Technical report, Fraunhofer IGD, TU Darmstadt.

Chan, T. and Shen, J. (2000a). Mathematical model for local deterministic inpaintings. Technical Report CAM 00-11, Image Processing Group, UCLA.

Chan, T. and Shen, J. (2000b). Non-texture inpainting by curvature driven diffusions. Technical Report CAM 00-35, Image Processing Group, UCLA.

Chou, J., Yang, C., and Gong, S. (2012). Face-off: Automatic alteration of facial features. Multimedia Tools and Applications, 56(3):569-596.

Farbman, Z., Hoffer, G., Lipman, Y., Cohen-Or, D., and Lischinski, D. (2009). Coordinates for instant image cloning. In Proc. ACM SIGGRAPH, pages 342-450.

FGB (2012). Federal Government of Brazil, Justice Ministry - Missing Persons. http://www.desaparecidos.mj.gov.br.

Hjelmas, E. and Low, B. (2001). Face detection: A survey. CVIU, 83(3):236-274.

ISO (2004). Final draft international standard of biometric data interchange formats (part 5, face image data, ISO-19794-5 FDIS).

Jeschke, S., Cline, D., and Wonka, P. (2009). A GPU Laplacian solver for diffusion curves and Poisson image editing. In Proc. ACM SIGGRAPH Asia, pages 1-8.

Joshi, N., Matusik, W., Adelson, E., and Kriegman, D. (2010). Personal photo enhancement using example images. In Proc. ACM SIGGRAPH, pages 1-15.

Li, H., Wang, S., Zhang, W., and Wu, M. (2010). Image inpainting based on scene transform and color transfer. Pattern Recogn. Lett., 31(7):582-592.

Lyons, M.J., Akamatsu, S., Kamachi, M., and Gyoba, J. (1998). Coding facial expressions with Gabor wavelets. In Proc. FG, pages 200-204. IEEE.

MPO (2012). Missing People from United Kingdom. http://www.missingpeople.org.uk.

Oliveira, M.M., Bowen, B., McKenna, R., and Chang, Y. (2001). Fast digital image inpainting. In Proc. VIIP, pages 261-266.

Perez, P., Gangnet, M., and Blake, A. (2003). Poisson image editing. In Proc. ACM SIGGRAPH, pages 313-318.

Phillips, P.J., Moon, H., Rizvi, S.A., and Rauss, P. (2000). The FERET evaluation methodology for face-recognition algorithms. IEEE TPAMI, 22(10):1090-1104.

Sellahewa, H. and Jassim, S. (2010). Image-quality-based adaptive face recognition. IEEE TIM, 59(4):805-813.

Sethian, J. (1999). Level Set Methods and Fast Marching Methods. Cambridge Univ. Press, 2nd edition.

Spiegel, M. and Stephens, L. (2008). Statistics. Schaum's Outlines.

Telea, A. (2004). An image inpainting technique based on the fast marching method. J. Graphics Tools, 9(1):23-34.

Thakare, N. and Thakare, V. (2012). Biometrics standards and face image format for data interchange - a review. Int. J. of Advances in Engineering and Technology, 2(1):385-392.

Thomaz, C. and Giraldi, G. (2010). A new ranking method for principal components analysis and its application to face image analysis. Image Vision Comput., 28(6):902-913.

Wang, Z., Lu, L., and Bovik, A. (2004). Video quality assessment based on structural distortion measurement. Sig. Proc.: Image Comm., 19(2):121-132.

Zamani, A., Awang, M., Omar, N., and Nazeer, S. (2008). Image quality assessments and restoration for face detection and recognition system images. In Proc. AMS, pages 505-510.

Zhao, W., Chellappa, R., Phillips, P., and Rosenfeld, A. (2003). Face recognition: A literature survey. ACM Computing Surveys, 35(4):399-458.
