Constant colour matting with foreground estimation

DESCRIPTION

Constant colour matting consists of estimating, for each pixel of an image, the proportion α of an unknown foreground colour blended with a known constant background colour. The α-matte is then used to replace this background with another image. Existing approaches approximate α directly, but post-processing is required to remove spill of the background colour in semi-transparent areas. Instead of estimating α directly, we propose three methods to estimate the unknown foreground colour, and then to deduce α. This approach leads to high-quality mattes for transparent objects and allows spill-free results (see www.cs.nuim.ie/research/vision/data/imvip2014/). We show this through an evaluation of the proposed methods based on a ground truth dataset.

TRANSCRIPT
Constant colour matting with foreground estimation

Guillaume GALES, National University of Ireland Maynooth, Department of Computer Science, [email protected]
John MCDONALD, National University of Ireland Maynooth, Department of Computer Science, [email protected]

IMVIP 2014, Maynooth University Department of Computer Science

Abstract

Constant colour matting consists of estimating, for each pixel of an image, the proportion α of an unknown foreground colour blended with a known constant background colour. The α-matte is then used to replace this background with another image. Existing approaches approximate α directly, but post-processing is required to remove spill of the background colour in semi-transparent areas. Instead of estimating α directly, we propose three methods to estimate the unknown foreground colour, and then to deduce α. This approach leads to high-quality mattes for transparent objects and allows spill-free results (see Fig. 1). We show this through an evaluation of the proposed methods based on a ground truth dataset.
Keywords: constant colour matting, foreground colour estimation, α estimation.
1 Introduction
Matting is a classic problem that consists of creating a matte to mask an unwanted area from video footage. This area is then replaced by content from separate footage to create a composite. Cinema, television and web TV use this technique extensively for visual special effects. Although the principle of this technique is simple, it is often difficult to achieve a realistic, seamless result. This is particularly true where the observed colour is a blend of foreground and background. The proportion of foreground colour F and background colour B for one pixel is called α. This typically occurs at the boundary between both areas, with semi-transparent foreground, or blur. We distinguish between two types of matting techniques: constant colour matting techniques, where the background colour is known, and natural image matting, which works with an arbitrary background. The former methods are widely used in visual productions, but post-processing is usually required to clean up the matte and to deal with spill (background colour reflecting in the foreground). The latter methods can give impressive results but require additional prior information and are more computationally expensive.
In this paper, our goal is to estimate F and α, knowing B, to obtain high quality results in difficult areas exhibiting fine and semi-transparent details. Unlike other constant colour methods, we start by estimating F, inspired by natural image matting approaches, but without
Figure 1: Examples of final results obtained with the proposed method.
Introduction

• Constant colour matting
• Consists of replacing a constant coloured background with another image
• Applications: VFX cinema, television, webcasting…
(Image source: http://www.sonypictures.com/movies/theamazingspiderman2/site/#photos)
• Estimation of the foreground opacity (alpha): the matting equation
the need to input additional information. Our contributions are three methods for the estimation of F (and consequently α) that give mattes requiring little or no cleaning, and produce spill-free results. Furthermore, our algorithm is highly parallelisable (working independently on each pixel) with a low computational complexity. We also provide a ground truth dataset which we use to demonstrate and quantitatively evaluate the performance of our technique.
After a brief description of the state of the art, we detail our approach. Then, we provide an evaluation of the three methods based on a proposed ground truth dataset.
2 Previous work
2.1 Constant colour matting
Constant colour matting aims at estimating a matte from images where the background is assumed to be a constant colour (usually blue or green).
In [Smith and Blinn, 1996], the authors formalise the problem of constant colour matting as follows. For each pixel, we express the observed colour (O, known) as the proportion α ∈ [0; 1] (unknown) of foreground (F, unknown) and background (B, known) colours. The matting equation is given by:
$$
\underbrace{\begin{bmatrix} r \\ g \\ b \end{bmatrix}}_{O}
= \alpha \underbrace{\begin{bmatrix} F_r \\ F_g \\ F_b \end{bmatrix}}_{F}
+ (1-\alpha) \underbrace{\begin{bmatrix} B_r \\ B_g \\ B_b \end{bmatrix}}_{B}
\qquad (1)
$$
There are four unknowns for three equations and therefore the problem is underdetermined, i.e. there exist no or many solutions. The authors identify three cases where a solution can be found: no blue in the observed colour, the observed colour is gray, or two different shades of background are known. However, these cases do not usually occur in practice.
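Equation (1) can be made concrete with a minimal sketch (illustrative code, not from the paper): composing an observed pixel from F, B and α shows that O alone provides three knowns against four unknowns.

```python
import numpy as np

# Illustrative sketch of the matting equation (1):
# O = alpha * F + (1 - alpha) * B, applied per colour channel.
def composite(F, B, alpha):
    F, B = np.asarray(F, dtype=float), np.asarray(B, dtype=float)
    return alpha * F + (1.0 - alpha) * B

B = np.array([0.0, 0.0, 1.0])   # pure blue backing
F = np.array([0.8, 0.2, 0.2])   # reddish foreground
O = composite(F, B, 0.5)        # semi-transparent pixel: [0.4, 0.1, 0.6]
# Given only O and B, the three channel equations still contain four
# unknowns (F_r, F_g, F_b, alpha) -> the problem is underdetermined.
```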
We can distinguish between the following types of method to estimate α with a constant background colour:
– Colour difference technique – α is based on differences between the red, green and blue components. This method (a.k.a. Ultimatte®), invented by [Vlahos, 1964], is a legacy of an optical multistep process where colour filters are placed in front of an optical printer to filter out the background colour (an interesting history of matting in filmmaking is given in [Filmmaker IQ, 2013]). This process is usually summarised by α = 1 − max(0, b − max(r, g)), where B is assumed to be blue. Spill is then removed by changing the blue component to b ← min(b, kg), where k is a user control parameter.
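The two formulas above can be sketched directly (a hedged illustration of the colour-difference idea for a blue backing, not the commercial Ultimatte algorithm):

```python
import numpy as np

# Colour-difference key for a blue backing, as summarised in the text.
def colour_difference_key(img, k=1.0):
    """img: HxWx3 float array in [0,1], channel order (r, g, b)."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    # alpha = 1 - max(0, b - max(r, g))
    alpha = 1.0 - np.maximum(0.0, b - np.maximum(r, g))
    # spill suppression: b <- min(b, k*g)
    despilled = img.copy()
    despilled[..., 2] = np.minimum(b, k * g)
    return alpha, despilled
```

A pure-blue pixel gets α = 0 (background), while a pixel whose red or green dominates blue gets α = 1.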
– Colorspace segmentation – The colorspace is partitioned into background, foreground and semi-transparent regions. In the semi-transparent region, α is based on the distance between the other two. In [Ashikhmin, 2001, Jack, 1996], the YCbCr colorspace is used to separate the luminance (Y) and the chrominance (CbCr). The segmentation is performed in the 2D chrominance space. A classic approach, called Hue Saturation Luminance keying and described in [Schultz, 2006], consists of segmenting the HSL colorspace to isolate the background and foreground regions. These segmentations are done manually by selecting the centre and the dimensions of basic shapes (simple polyhedron or sphere) that encapsulate the different regions. The Primatte® algorithm by [Mishima, 1992] is based on this principle; however, it automatically adjusts a 128-face polyhedron to obtain a fine segmentation of the colorspace.
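A toy illustration of the segmentation idea (not Primatte): key on distance to the backing chroma in the CbCr plane, with a linear rolloff between two thresholds. The thresholds `t_bg` and `t_fg` are made-up values standing in for the manually chosen region boundaries.

```python
import numpy as np

# Distance-based keying in the CbCr chrominance plane: pixels near the
# backing chroma are background (alpha=0), far ones foreground (alpha=1),
# with a linear ramp (rolloff) in between.
def chroma_key(cb, cr, backing_cb, backing_cr, t_bg=0.05, t_fg=0.25):
    d = np.hypot(cb - backing_cb, cr - backing_cr)  # distance to backing chroma
    alpha = (d - t_bg) / (t_fg - t_bg)              # linear rolloff
    return np.clip(alpha, 0.0, 1.0)
```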
2.2 Natural image matting
Natural image matting aims at estimating a matte from images with an arbitrary background. These methods use optimisation techniques to estimate the best combination of α, F and B that minimises an objective function based on spatial statistical models. To build these models, they require a pre-segmentation (called a trimap) into three classes: foreground, background and unknown. For example, in [Chuang et al., 2001], the algorithm marches inward from known to unknown regions. It uses the colour distribution in a weighted window of the known (or already computed) neighbouring regions to estimate the most likely combination of α, F and B.
Previous work

• Constant colour matting
  • colour difference technique
  • colorspace segmentation
• Natural image matting
  • based on statistical models

Our contribution

• Constant colour matting inspired by natural image matting
• Estimate F and deduce alpha
Colour difference technique

• Ultimatte — inspired by an early optical process
(Diagram: the maximum of the red and green channels is subtracted from the blue channel)
Colorspace segmentation

• HSL segmentation
(Diagram: HSL colorspace partitioned into Background, Foreground, and Semi-opaque regions with a user-defined rolloff)
• Primatte
• One of the state-of-the-art commercial software packages
(Diagram: regions fitted from background samples, with Foreground and Semi-opaque regions and a user-defined rolloff)
Problem

• Without F, we cannot estimate alpha accurately:
  • no accurate alpha, as the original foreground colour remains unknown
  • requires extensive matte cleaning
  • colour inconsistencies with semi-opaque pixels (spill suppression does not recover the original foreground colour)
(Diagram: from the background colour and the observed colour alone, alpha appears to be 0.5; with the true foreground colour, the true alpha is 0.1)
Natural image matting

• arbitrary background
• requires prior segmentation (trimap: Foreground, Background, Unknown)
• for each unknown pixel, minimise an objective function based on spatial statistical models for F and B
• estimate alpha, B and F for the unknown pixels
• computationally expensive
Our contribution

• Estimate F and deduce alpha
• Consistent foreground colour
  • more accurate alpha estimation
  • little or no matte cleaning
• No trimap required
Foreground estimation

• 1. Build a set of foreground (F) candidates
• 2. For each pixel, find the best F candidate from the previously computed set
• 3. Deduce alpha
• Three independent methods for step 2
Set of foreground candidates

• Hypothesis: the foreground colour is
  • already present in the image
  • and distinct from the background colour
• If B is different from O, then B, O and F are aligned in the colorspace (linear relationship)
• Candidates: modes of the colour distribution of the image, excluding the background colour
• Mean-shift algorithm in the L*a*b* colorspace
15
Introduction Foreground estimation Results and evaluation Conclusion
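As a sketch of this candidate-building step, here is a minimal flat-kernel mean-shift over colour samples (a toy illustration in plain numpy, not the authors' implementation; a real run would operate on L*a*b* pixel values with the background colour excluded):

```python
import numpy as np

def meanshift_modes(points, window, n_iter=20, merge_tol=1.0):
    """Tiny flat-kernel mean-shift: repeatedly move each point to the mean
    of the original samples within `window`, then merge nearby modes."""
    shifted = points.astype(float).copy()
    for _ in range(n_iter):
        for k in range(len(shifted)):
            near = points[np.linalg.norm(points - shifted[k], axis=1) < window]
            shifted[k] = near.mean(axis=0)
    modes = []
    for p in shifted:                  # merge converged points into modes
        if all(np.linalg.norm(p - m) >= merge_tol for m in modes):
            modes.append(p)
    return np.array(modes)

# Two tight colour clusters -> two candidate foreground colours
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal([60.0, 20.0, -30.0], 0.3, (40, 3)),
                 rng.normal([30.0, -40.0, 10.0], 0.3, (40, 3))])
print(len(meanshift_modes(pts, window=5.0)))   # two modes expected
```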
Foreground estimation
[Slides 16–19: illustrations of the candidate clustering (threshold th1, set of cluster candidates) and of the solution space in the colour space, showing the background colour B and the observed colour O.]
Foreground estimation
Three methods to estimate F:
• (a) F minimises the sum of squared residuals of the matting equation system, where F is replaced by a cluster candidate Ci.
• (b) F is given by the orthogonal projection of the closest cluster candidate Ci onto the line BO.
• (c) F is given by the smallest rotation of BCi onto BO.
Algorithm 1: Algorithm for one pixel.
Data: O, B, C, th1, th2, m.  Result: F, α.
if ∥BO∥ = 0 then
    F ← 0 ; α ← 0                        /* This is a background pixel. */
else if ∥BO∥ > th1 then
    F ← O ; α ← 1                        /* This is a foreground pixel. */
else
    L ← ∅                                /* Initiate a list of candidates. */
    for each Ci ∈ C do
        Fi ← estimate of F with method m using (O, B, Ci)
        ci ← cost for this Fi
        if ci < th2 then L ← L ∪ {Fi}    /* Fi is a candidate. */
    if |L| = 0 then
        F ← 0 ; α ← 0                    /* No candidate → background. */
    else
        F ← Fi ∈ L such that ∀ Fj ∈ L, ∥OFi∥ ≤ ∥OFj∥   /* Best candidate. */
        α ← ∥BO∥ / ∥BF∥
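A direct Python transliteration of Algorithm 1 (a sketch: `estimate` is a placeholder for one of methods (a)–(c); method (c) is used in the demo below):

```python
import numpy as np

def matte_pixel(O, B, C, th1, th2, estimate):
    """Algorithm 1 for one pixel: returns (F, alpha).
    `estimate(O, B, Ci)` returns (Fi, cost) for one cluster candidate."""
    BO = np.linalg.norm(O - B)
    if BO == 0:                              # background pixel
        return np.zeros(3), 0.0
    if BO > th1:                             # foreground pixel
        return O.copy(), 1.0
    L = [Fi for Ci in C
         for Fi, ci in [estimate(O, B, Ci)] if ci < th2]
    if not L:                                # no candidate -> background
        return np.zeros(3), 0.0
    F = min(L, key=lambda Fi: np.linalg.norm(Fi - O))  # best candidate
    return F, BO / np.linalg.norm(F - B)

def method_c(O, B, Ci):                      # method (c): rotate BCi onto BO
    BO, BCi = O - B, Ci - B
    Fi = B + BO * np.linalg.norm(BCi) / np.linalg.norm(BO)
    cos_t = BCi @ BO / (np.linalg.norm(BCi) * np.linalg.norm(BO))
    return Fi, float(np.arccos(np.clip(cos_t, -1.0, 1.0)))

B = np.array([0.0, 255.0, 0.0])              # green screen
F_true = np.array([200.0, 50.0, 50.0])
O = 0.5 * F_true + 0.5 * B                   # semi-transparent pixel
F, a = matte_pixel(O, B, [F_true], th1=200.0, th2=0.1, estimate=method_c)
print(F, a)                                  # recovers F_true and alpha = 0.5
```

Here `th1` is set so that the semi-transparent branch is exercised; with the true foreground colour in the candidate set, the pixel's F and α = 0.5 are recovered exactly.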
(b) In this method, Fi is given by the orthogonal projection of Ci onto the line BO. The cost is the Euclidean distance between Ci and Fi: ci = ∥CiFi∥.
(c) This method rotates Ci around B by the angle θi = cos⁻¹((BCi · BO) / (∥BCi∥ ∥BO∥)) so that B, O and Ci are collinear. The cost is the rotation angle: ci = θi.
(a) Fi = B + (∥BCi∥² / (BO · BCi)) BO, with cost ci = ρ, the least-squares residual (distance between O and its projection O′i onto the line BCi).
(b) Fi = B + ((BO · BCi) / ∥BO∥²) BO, with cost ci = ∥CiFi∥.
(c) Fi = B + (∥BCi∥ / ∥BO∥) BO, with cost ci = θi.
Table 1: Illustration of the 3 methods (a), (b) and (c) proposed to estimate Fi in Algorithm 1.
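The three closed-form estimates in Table 1 can be sketched directly in code (my reading of method (a)'s formula from the table layout is an interpretation; when Ci is already collinear with B and O, all three reduce to Fi = Ci, which the demo checks):

```python
import numpy as np

def f_a(O, B, Ci):
    # (a) Fi = B + (||BCi||^2 / (BO . BCi)) BO
    BO, BCi = O - B, Ci - B
    return B + BO * (BCi @ BCi) / (BO @ BCi)

def f_b(O, B, Ci):
    # (b) orthogonal projection of Ci onto the line BO
    BO, BCi = O - B, Ci - B
    return B + BO * (BO @ BCi) / (BO @ BO)

def f_c(O, B, Ci):
    # (c) rotation of BCi onto BO, preserving ||BCi||
    BO, BCi = O - B, Ci - B
    return B + BO * np.linalg.norm(BCi) / np.linalg.norm(BO)

B = np.zeros(3)
O = np.array([1.0, 2.0, 2.0])
Ci = 2.0 * O                       # candidate already on the line BO
for f in (f_a, f_b, f_c):
    print(f(O, B, Ci))             # each returns Ci itself
```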
4 Evaluation and results
Ground truth dataset  To provide a quantitative evaluation of our three methods, we created a ground-truth dataset. As explained in [Smith and Blinn, 1996], if at least two different shades of background are known, Eq. 1 becomes overdetermined and we can estimate α and F. We took pictures of six different objects (bath, bottle, muppet, pot1, pot2 and spider, see Table 3) in front of five different backgrounds (blue, green, black, yellow and red) for further overdetermination. Then, we calculated a least-squares solution for αF and α:

$$\begin{bmatrix} -B_{\mathrm{blue}} & I_3 \\ -B_{\mathrm{green}} & I_3 \\ \vdots & \vdots \end{bmatrix} \begin{bmatrix} \alpha \\ \alpha F \end{bmatrix} = \begin{bmatrix} O_{\mathrm{blue}} - B_{\mathrm{blue}} \\ O_{\mathrm{green}} - B_{\mathrm{green}} \\ \vdots \end{bmatrix} \qquad (2)$$

where I3 is the 3 × 3 identity matrix.
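The overdetermined system of Eq. 2 can be solved per pixel with ordinary least squares. A synthetic sketch with two backgrounds (`np.linalg.lstsq` stands in for whatever solver the authors used):

```python
import numpy as np

alpha_true, F_true = 0.4, np.array([180.0, 60.0, 40.0])   # unknowns
Bs = [np.array([0.0, 255.0, 0.0]),                        # green
      np.array([0.0, 0.0, 255.0])]                        # blue
Os = [alpha_true * F_true + (1 - alpha_true) * B for B in Bs]

# One block row [ -B_k | I3 ] per background; unknowns x = [alpha, alpha*F]
I3 = np.eye(3)
A = np.vstack([np.hstack([-B[:, None], I3]) for B in Bs])
b = np.concatenate([O - B for O, B in zip(Os, Bs)])
x = np.linalg.lstsq(A, b, rcond=None)[0]

alpha_hat = x[0]
F_hat = x[1:] / alpha_hat
print(alpha_hat, F_hat)            # recovers 0.4 and [180, 60, 40]
```

Each background contributes three equations (O_k − B_k = −αB_k + I3(αF)) for the four unknowns, so two distinct backgrounds already overdetermine the system; the five backgrounds of the dataset overdetermine it further.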
Parameters  Each method requires two parameters for the initial clustering: the size of the bins sb, and the size of the mean-shift window sms. It also requires th1, the minimum ∥BF∥
Results and evaluation
Estimating ground truth
• 5 different backgrounds → overdetermined problem
[Figure: ground-truth αF matte and ground-truth α matte.]
Results and evaluation
Measure of error
• Mean squared error
distance, and th2, the threshold on the cost, which depends on the method employed (see § 3.2). According to an initial experiment, (sb, sms) can be fixed to (2, 4) for method (a) (giving about 60 clusters) and to (2, 2) for methods (b) and (c) (giving about 400 clusters). The best results (according to the criterion described below) are obtained with th1 ≈ 40 ± 5 and (a) th2 ≈ 2; (b) th2 ≈ 10; (c) th2 ≈ 0.4 (see Tables 2 and 3). These values can then be fine-tuned interactively.
Measure of error  For each of the three methods m, we evaluate the results obtained with different sets of values for th1 and th2. We chose to evaluate the methods using the images having a green background. As we are interested in F and α, we compare the estimated values with the ground truth. The mean squared error for an image is given by:

$$\mathrm{MSE}_{th_1, th_2, m} = \frac{1}{hw} \sum_{i,j=0,0}^{h-1,w-1} \left\lVert \widehat{(\alpha F)}_{i,j} - (\alpha F)_{i,j} \right\rVert \qquad (3)$$

where (h, w) are the dimensions of the image, $\widehat{(\alpha F)}_{i,j}$ is the estimated value for the pixel at coordinates (i, j) and $(\alpha F)_{i,j}$ is the ground-truth value for the pixel at the same coordinates.
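Eq. 3 is a per-pixel average of the norm of the αF error; a minimal sketch over h × w × 3 colour arrays:

```python
import numpy as np

def matte_error(aF_est, aF_gt):
    """Eq. 3: mean over all pixels of the norm of the alpha-premultiplied
    foreground error, for (h, w, 3) colour arrays."""
    return np.linalg.norm(aF_est - aF_gt, axis=2).mean()

aF_gt = np.zeros((2, 2, 3))
aF_est = aF_gt + np.array([3.0, 0.0, 4.0])   # every pixel off by (3, 0, 4)
print(matte_error(aF_est, aF_gt))            # -> 5.0
```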
Tables 2 and 3 show the results. With the appropriate set of parameter values, the three methods can achieve results with a low error score. The best average values for each method are MSE45,3,(a) = 3.339, MSE50,5,(b) = 2.814 and MSE50,0.4,(c) = 2.811, showing that (b) and (c) perform slightly better than (a). In some scenes, (b) and (c) clearly outperform (a).
Table 2: Contour plots of the averaged MSEth1,th2,m over the 6 images for each of the three methods, as a function of th1 (35–85) and th2 (up to 17 for (a), 45 for (b), 1.8 for (c)).
5 Discussion and conclusion
We proposed three methods to estimate F and α in a constant-colour-background matting problem. According to our evaluation, these methods give very encouraging results.
A comparison with commercial methods would be interesting. However, source code is not available, executables are not free, and they may include extra post-processing that gives a visually appealing, but not mathematically accurate, result. These issues make a fair and rigorous comparison difficult. Nevertheless, to permit a qualitative evaluation of how our algorithm may compete, we visually compared our result (left) with one obtained using the Apple Motion HSL Keyer (right); the latter shows more spill. We also noticed one inconsistency in the ground-truth data for pot1, where the opaque fluorescent marker is considered to have some transparency. Although the variances in the observed values are relatively small (due to noise and indirect background illumination), the residual of Eq. 2 is quite large. We propose to re-estimate the ground truth using a robust
(a) Our method
(b) Apple Motion HSL Keyer
Figure 2: Visual comparison with a commercial solution.
distance, and th2 the threshold on the cost depending on the method employed, see § 3.2.According to an initial experiment, (sb, sms) can be fixed to (2, 4) for the method (a) (givingabout 60 clusters) and to (2, 2) for the methods (b) and (c) (giving about 400 clusters). Thebest results (according to the criterion described below) are obtained with th1 ≈ 40 ± 5 and(a) th2 ≈ 2 ; (b) th2 ≈ 10 ; (c) th2 ≈ 0.4 (see Tables 3 and 2). These values can then be finetuned interactively.
Measure of error For each of the three methods m, we evaluate the results obtained witha different set of values for th1 and th2. We chose to evaluate the methods using the imageshaving a green background. As we are interested in F and α, we compare the estimated valueswith the ground truth. The mean squared error for an image is given by:
MSEth1,th2,m =1
hw
h−1,w−1!
i,j=0,0
∥(αF )i,j − (αF )i,j∥ (3)
where (h,w) are the dimensions of the image, (αF )i,j is the estimated value for the pixel atcoordinates (i, j) and (αF )i,j is the ground truth value for the pixel of same coordinates.
Tables 2 and 3 show the results. With the appropriate set of parameter values, the threemethods can achieve results with a low error score. The best average values for each methodare MSE45,3,(a) = 3.339, MSE50,5,(b) = 2.814 and MSE50,0.4,(c) = 2.811 showing that (b)and (c) perform slightly better than (a). In some scenes, (b) and (c) clearly outperform (a).
(a) (b) (c)
th2
th1
4
6
8
10
12
14
16
1
3
5
7
9
11
13
15
17
35 40 45 50 55 60 65 80 85
th2
th1
4
6
8
10
12
14
16
5
10
15
20
25
30
35
40
45
35 40 45 50 55 60 65 80 85
th2
th1
4
6
8
10
12
14
16
.2
.4
.6
.8
1
1.2
1.4
1.6
1.8
35 40 45 50 55 60 65 80 85
Table 2: Contour plots of the averaged MSEth1,th2,m over the 6 images for each of the threemethods in function of th1 and th2.
5 Discussion and conclusion
We proposed three methods to estimate F and α in a constant colour background mattingproblem. According to our evaluation, these methods give very encouraging results.
A comparison with commercial methods would be interesting. However, source code is notavailable, executables are not free and they may include extra post processing to give a visuallyappealing, but not mathematically accurate, result. These issues make a fair and rigorous com-parison difficult. But, to permit a qualitative evaluation of how our algorithm may compete, wevisually compared our result (left) with one obtained using Apple Motion HSL Keyer (right).The latter one has more spill. We also noticed one inconsistent ground truth data in Pot1 wherethe opaque fluorescent marker is considered to have some transparency. Although the variancesin the observed values are relatively small (due to noise and background indirect illumination),the residual of Eq. 2 is quite large. We propose to re-estimate the ground truth using a robust
(a) Our method
(b) Apple Motion HSL Keyer
Figure 2: Visual comparison with a commercial solution.
Introduction Foreground estimation Results and evaluation Conclusion
Results and evaluation
• Mean squared errorMeasure of error
25
distance, and th2 the threshold on the cost depending on the method employed, see § 3.2.According to an initial experiment, (sb, sms) can be fixed to (2, 4) for the method (a) (givingabout 60 clusters) and to (2, 2) for the methods (b) and (c) (giving about 400 clusters). Thebest results (according to the criterion described below) are obtained with th1 ≈ 40 ± 5 and(a) th2 ≈ 2 ; (b) th2 ≈ 10 ; (c) th2 ≈ 0.4 (see Tables 3 and 2). These values can then be finetuned interactively.
Measure of error For each of the three methods m, we evaluate the results obtained witha different set of values for th1 and th2. We chose to evaluate the methods using the imageshaving a green background. As we are interested in F and α, we compare the estimated valueswith the ground truth. The mean squared error for an image is given by:
MSEth1,th2,m =1
hw
h−1,w−1!
i,j=0,0
∥(αF )i,j − (αF )i,j∥ (3)
where (h,w) are the dimensions of the image, (αF )i,j is the estimated value for the pixel atcoordinates (i, j) and (αF )i,j is the ground truth value for the pixel of same coordinates.
Tables 2 and 3 show the results. With the appropriate set of parameter values, the threemethods can achieve results with a low error score. The best average values for each methodare MSE45,3,(a) = 3.339, MSE50,5,(b) = 2.814 and MSE50,0.4,(c) = 2.811 showing that (b)and (c) perform slightly better than (a). In some scenes, (b) and (c) clearly outperform (a).
(a) (b) (c)th
2
th1
4
6
8
10
12
14
16
1
3
5
7
9
11
13
15
17
35 40 45 50 55 60 65 80 85
th2
th1
4
6
8
10
12
14
16
5
10
15
20
25
30
35
40
45
35 40 45 50 55 60 65 80 85
th2
th1
4
6
8
10
12
14
16
.2
.4
.6
.8
1
1.2
1.4
1.6
1.8
35 40 45 50 55 60 65 80 85
Table 2: Contour plots of the averaged MSEth1,th2,m over the 6 images for each of the threemethods in function of th1 and th2.
5 Discussion and conclusion
We proposed three methods to estimate F and α in a constant colour background mattingproblem. According to our evaluation, these methods give very encouraging results.
A comparison with commercial methods would be interesting. However, source code is notavailable, executables are not free and they may include extra post processing to give a visuallyappealing, but not mathematically accurate, result. These issues make a fair and rigorous com-parison difficult. But, to permit a qualitative evaluation of how our algorithm may compete, wevisually compared our result (left) with one obtained using Apple Motion HSL Keyer (right).The latter one has more spill. We also noticed one inconsistent ground truth data in Pot1 wherethe opaque fluorescent marker is considered to have some transparency. Although the variancesin the observed values are relatively small (due to noise and background indirect illumination),the residual of Eq. 2 is quite large. We propose to re-estimate the ground truth using a robust
(a) Our method
(b) Apple Motion HSL Keyer
Figure 2: Visual comparison with a commercial solution.
distance, and th2 the threshold on the cost depending on the method employed, see § 3.2.According to an initial experiment, (sb, sms) can be fixed to (2, 4) for the method (a) (givingabout 60 clusters) and to (2, 2) for the methods (b) and (c) (giving about 400 clusters). Thebest results (according to the criterion described below) are obtained with th1 ≈ 40 ± 5 and(a) th2 ≈ 2 ; (b) th2 ≈ 10 ; (c) th2 ≈ 0.4 (see Tables 3 and 2). These values can then be finetuned interactively.
Measure of error For each of the three methods m, we evaluate the results obtained witha different set of values for th1 and th2. We chose to evaluate the methods using the imageshaving a green background. As we are interested in F and α, we compare the estimated valueswith the ground truth. The mean squared error for an image is given by:
MSEth1,th2,m =1
hw
distance, and th2 the threshold on the cost, which depends on the method employed (see § 3.2). According to an initial experiment, (sb, sms) can be fixed to (2, 4) for method (a) (giving about 60 clusters) and to (2, 2) for methods (b) and (c) (giving about 400 clusters). The best results (according to the criterion described below) are obtained with th1 ≈ 40 ± 5 and (a) th2 ≈ 2; (b) th2 ≈ 10; (c) th2 ≈ 0.4 (see Tables 2 and 3). These values can then be fine-tuned interactively.
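The parameter sweep described above can be sketched as an exhaustive search over a (th1, th2) grid, keeping the pair that minimises the averaged error. The `error_fn` below is a hypothetical stand-in for the averaged MSE of Eq. 3, and the toy quadratic error surface is invented for illustration:

```python
import itertools

def pick_thresholds(error_fn, th1_values, th2_values):
    # Exhaustive sweep over the (th1, th2) grid, returning the pair
    # that minimises the (averaged) error criterion.
    return min(itertools.product(th1_values, th2_values),
               key=lambda pair: error_fn(*pair))

# Toy error surface with its minimum at (40, 2), mimicking the
# reported optimum for method (a).
best = pick_thresholds(lambda t1, t2: (t1 - 40) ** 2 + (t2 - 2) ** 2,
                       range(35, 90, 5), range(0, 18, 2))
```

The same sweep, run per method over the green-background images, is what produces the contour plots of Table 2.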
Measure of error For each of the three methods m, we evaluate the results obtained with different sets of values for th1 and th2. We chose to evaluate the methods using the images having a green background. As we are interested in F and α, we compare the estimated values with the ground truth. The mean squared error for an image is given by:
$$\mathrm{MSE}_{th_1, th_2, m} = \frac{1}{hw} \sum_{i,j=0,0}^{h-1,\, w-1} \left\| (\widehat{\alpha F})_{i,j} - (\alpha F)_{i,j} \right\| \tag{3}$$

where (h, w) are the dimensions of the image, $(\widehat{\alpha F})_{i,j}$ is the estimated value for the pixel at coordinates (i, j) and $(\alpha F)_{i,j}$ is the ground truth value for the pixel at the same coordinates.
Tables 2 and 3 show the results. With an appropriate set of parameter values, the three methods can achieve results with a low error score. The best average values for each method are MSE_{45,3,(a)} = 3.339, MSE_{50,5,(b)} = 2.814 and MSE_{50,0.4,(c)} = 2.811, showing that (b) and (c) perform slightly better than (a). In some scenes, (b) and (c) clearly outperform (a).
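As a sketch, the per-image error of Eq. 3 can be computed as below; the (h, w, 3) array layout for the premultiplied foreground αF is our assumption, since the paper does not fix a storage format:

```python
import numpy as np

def matte_error(est_alpha_f, gt_alpha_f):
    # Eq. 3: average over the h*w pixels of the norm of the difference
    # between estimated and ground-truth premultiplied foreground (alpha*F).
    diff = np.asarray(est_alpha_f, float) - np.asarray(gt_alpha_f, float)
    return float(np.linalg.norm(diff, axis=-1).mean())

# Toy 2x2 image: a single pixel off by (3, 4, 0), whose norm is 5,
# gives an average error of 5 / 4 = 1.25.
gt = np.zeros((2, 2, 3))
est = gt.copy()
est[0, 0] = [3.0, 4.0, 0.0]
```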
[Contour plots for methods (a), (b) and (c): th1 on the horizontal axis (35–85), th2 on the vertical axis (4–16); the contour levels span roughly 1–17 for (a), 5–45 for (b) and 0.2–1.8 for (c).]

Table 2: Contour plots of the averaged MSE_{th1,th2,m} over the six images for each of the three methods, as a function of th1 and th2.
5 Discussion and conclusion
We proposed three methods to estimate F and α in a constant colour background matting problem. According to our evaluation, these methods give very encouraging results.
A comparison with commercial methods would be interesting. However, source code is not available, executables are not free, and they may include extra post-processing that gives a visually appealing, but not mathematically accurate, result. These issues make a fair and rigorous comparison difficult. Nevertheless, to give a qualitative sense of how our algorithm may compete, we visually compared our result (left) with one obtained using the Apple Motion HSL Keyer (right); the latter shows more spill. We also noticed one inconsistency in the ground truth data for Pot1, where the opaque fluorescent marker is considered to have some transparency. Although the variances in the observed values are relatively small (due to noise and indirect background illumination), the residual of Eq. 2 is quite large. We propose to re-estimate the ground truth using a robust estimator.
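The effect of a robust estimator can be illustrated in one dimension: a least-squares (mean) fit is pulled arbitrarily far by a single gross outlier, such as the mislabelled Pot1 marker, whereas an L1 (median) fit is not. The numbers below are invented for illustration:

```python
import statistics

# Four consistent observations plus one gross outlier.
observations = [10.0, 10.1, 9.9, 10.05, 60.0]

mean_fit = statistics.fmean(observations)     # pulled towards the outlier
median_fit = statistics.median(observations)  # robust: stays near 10
```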
(a) Our method
(b) Apple Motion HSL Keyer
Figure 2: Visual comparison with a commercial solution.
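Estimating the premultiplied foreground αF directly is what makes spill suppression unnecessary: with the standard compositing equation C = αF + (1 − α)Bnew, the new background is blended in without the green background ever entering the result. A minimal sketch (array shapes and the helper name are our own, not from the paper):

```python
import numpy as np

def composite(alpha_f, alpha, new_bg):
    # C = alpha*F + (1 - alpha) * B_new, with alpha*F already estimated,
    # so no colour from the original green screen can leak through.
    return alpha_f + (1.0 - alpha)[..., None] * new_bg

alpha = np.array([[1.0, 0.5]])                          # opaque / half-transparent
alpha_f = alpha[..., None] * np.array([0.2, 0.8, 0.2])  # premultiplied foreground
new_bg = np.full((1, 2, 3), 1.0)                        # white replacement background
out = composite(alpha_f, alpha, new_bg)
# The opaque pixel keeps F; the half-transparent one is an even blend with white.
```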
Conclusion and future work

Conclusion
• Constant colour matting method based on foreground colour estimation
• Ground truth dataset
• Accurate (low MSE), visually good results
Future work
• Ground truth dataset → large residuals in some cases → robust estimation
• Comparison with other methods: difficult to compare with commercial products (not free, technical details not well documented, post-processing added to clean up the matte)