Constant colour matting with foreground estimation

DESCRIPTION

Constant colour matting consists of estimating, for each pixel of an image, the proportion α of an unknown foreground colour blended with a known constant background colour. The α-matte is then used to replace this background with another image. Existing approaches approximate α directly, but post-processing is required to remove spill of the background colour in semi-transparent areas. Instead of estimating α directly, we propose three methods to estimate the unknown foreground colour, and then to deduce α. This approach leads to high-quality mattes for transparent objects and allows spill-free results (see www.cs.nuim.ie/research/vision/data/imvip2014/). We show this through an evaluation of the proposed methods based on a ground truth dataset.

TRANSCRIPT
Constant colour matting with foreground estimation

Guillaume GALES, National University of Ireland Maynooth, Department of Computer Science, [email protected]
John MCDONALD, National University of Ireland Maynooth, Department of Computer Science, [email protected]

IMVIP 2014, Maynooth University Department of Computer Science

Abstract

Constant colour matting consists of estimating, for each pixel of an image, the proportion α of an unknown foreground colour blended with a known constant background colour. The α-matte is then used to replace this background with another image. Existing approaches approximate α directly, but post-processing is required to remove spill of the background colour in semi-transparent areas. Instead of estimating α directly, we propose three methods to estimate the unknown foreground colour, and then to deduce α. This approach leads to high-quality mattes for transparent objects and allows spill-free results (see Fig. 1). We show this through an evaluation of the proposed methods based on a ground truth dataset.
Keywords: constant colour matting, foreground colour estimation, α estimation.
1 Introduction
Matting is a classic problem that consists of creating a matte to mask an unwanted area from video footage. This area is then replaced by content from separate footage to create a composite. Cinema, television and web TV use this technique extensively for visual special effects. Although the principle of this technique is simple, it is often difficult to achieve a realistic, seamless result. This is particularly true where the observed colour is a blend of foreground and background. The proportion of foreground colour F and background colour B for one pixel is called α. This typically occurs at the boundary between both areas, with semi-transparent foreground, or blur. We distinguish between two types of matting techniques: constant colour matting techniques, where the background colour is known, and natural image matting, which works with an arbitrary background. The former methods are widely used in visual productions, but post-processing is usually required to clean up the matte and to deal with spill (background colour reflecting in the foreground). The latter methods can give impressive results but require additional prior information and are more computationally expensive.
In this paper, our goal is to estimate F and α, knowing B, to obtain high quality results in difficult areas exhibiting fine and semi-transparent details. Unlike other constant colour methods, we start by estimating F, inspired by natural image matting approaches, but without
Figure 1: Examples of final results obtained with the proposed method.
Introduction

• Constant colour matting
• Consists of replacing a constant coloured background with another image
• Applications: VFX cinema, television, webcasting…
(Image source: http://www.sonypictures.com/movies/theamazingspiderman2/site/#photos)
• Estimation of the foreground opacity (alpha): the matting equation
the need to input additional information. Our contributions are three methods for the estimation of F (and consequently α) that give mattes requiring little or no cleaning, and produce spill-free results. Furthermore, our algorithm is highly parallelisable (working independently on each pixel) with a low computational complexity. We also provide a ground truth dataset which we use to demonstrate and quantitatively evaluate the performance of our technique.
After a brief description of the state of the art, we detail our approach. Then, we provide an evaluation of the three methods based on a proposed ground truth dataset.
2 Previous work
2.1 Constant colour matting
Constant colour matting aims at estimating a matte from images where the background is assumed to be a constant colour (usually blue or green).
In [Smith and Blinn, 1996], the authors formalise the problem of constant colour matting as follows. For each pixel, we express the observed colour (O, known) as the proportion α ∈ [0; 1] (unknown) of foreground (F, unknown) and background (B, known) colours. The matting equation is given by:
$$
\underbrace{\begin{bmatrix} r \\ g \\ b \end{bmatrix}}_{O}
= \alpha \underbrace{\begin{bmatrix} F_r \\ F_g \\ F_b \end{bmatrix}}_{F}
+ (1-\alpha) \underbrace{\begin{bmatrix} B_r \\ B_g \\ B_b \end{bmatrix}}_{B}
\qquad (1)
$$
There are four unknowns for three equations and therefore the problem is underdetermined, i.e. there exist no or many solutions. The authors identify three cases where a solution can be found: no blue in the observed colour, the observed colour is gray, or two different shades of background are known. However, these cases do not usually occur in practice.
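Equation (1) can be made concrete with a minimal sketch (illustrative code, not from the paper): composing an observed pixel from F, B and α shows that O alone provides three knowns against four unknowns.

```python
import numpy as np

# Illustrative sketch of the matting equation (1):
# O = alpha * F + (1 - alpha) * B, applied per colour channel.
def composite(F, B, alpha):
    F, B = np.asarray(F, dtype=float), np.asarray(B, dtype=float)
    return alpha * F + (1.0 - alpha) * B

B = np.array([0.0, 0.0, 1.0])   # pure blue backing
F = np.array([0.8, 0.2, 0.2])   # reddish foreground
O = composite(F, B, 0.5)        # semi-transparent pixel: [0.4, 0.1, 0.6]
# Given only O and B, the three channel equations still contain four
# unknowns (F_r, F_g, F_b, alpha) -> the problem is underdetermined.
```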
We can distinguish between the following types of method to estimate α with a constant background colour:
– Colour difference technique – α is based on differences between the red, green and blue components. This method (a.k.a. Ultimatte®), invented by [Vlahos, 1964], is a legacy of an optical multistep process where colour filters are placed in front of an optical printer to filter out the background colour (an interesting history of matting in filmmaking is given in [Filmmaker IQ, 2013]). This process is usually summarised by α = 1 − max(0, b − max(r, g)), where B is assumed to be blue. Spill is then removed by changing the blue component to b ← min(b, kg), where k is a user control parameter.
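The two formulas above can be sketched directly (a hedged illustration of the colour-difference idea for a blue backing, not the commercial Ultimatte algorithm):

```python
import numpy as np

# Colour-difference key for a blue backing, as summarised in the text.
def colour_difference_key(img, k=1.0):
    """img: HxWx3 float array in [0,1], channel order (r, g, b)."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    # alpha = 1 - max(0, b - max(r, g))
    alpha = 1.0 - np.maximum(0.0, b - np.maximum(r, g))
    # spill suppression: b <- min(b, k*g)
    despilled = img.copy()
    despilled[..., 2] = np.minimum(b, k * g)
    return alpha, despilled
```

A pure-blue pixel gets α = 0 (background), while a pixel whose red or green dominates blue gets α = 1.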
– Colorspace segmentation – The colorspace is partitioned into background, foreground and semi-transparent regions. In the semi-transparent region, α is based on the distance between the other two. In [Ashikhmin, 2001, Jack, 1996], the YCbCr colorspace is used to separate the luminance (Y) and the chrominance (CbCr). The segmentation is performed in the 2D chrominance space. A classic approach, called Hue Saturation Luminance keying and described in [Schultz, 2006], consists of segmenting the HSL colorspace to isolate the background and foreground regions. These segmentations are done manually by selecting the centre and the dimensions of basic shapes (simple polyhedron or sphere) that encapsulate the different regions. The Primatte® algorithm by [Mishima, 1992] is based on this principle; however, it automatically adjusts a 128-face polyhedron to obtain a fine segmentation of the colorspace.
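A toy illustration of the segmentation idea (not Primatte): key on distance to the backing chroma in the CbCr plane, with a linear rolloff between two thresholds. The thresholds `t_bg` and `t_fg` are made-up values standing in for the manually chosen region boundaries.

```python
import numpy as np

# Distance-based keying in the CbCr chrominance plane: pixels near the
# backing chroma are background (alpha=0), far ones foreground (alpha=1),
# with a linear ramp (rolloff) in between.
def chroma_key(cb, cr, backing_cb, backing_cr, t_bg=0.05, t_fg=0.25):
    d = np.hypot(cb - backing_cb, cr - backing_cr)  # distance to backing chroma
    alpha = (d - t_bg) / (t_fg - t_bg)              # linear rolloff
    return np.clip(alpha, 0.0, 1.0)
```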
2.2 Natural image matting
Natural image matting aims at estimating a matte from images with an arbitrary background. These methods use optimisation techniques to estimate the best combination of α, F and B that minimises an objective function based on spatial statistical models. To build these models, they require a pre-segmentation (called a trimap) into three classes: foreground, background and unknown. For example, in [Chuang et al., 2001], the algorithm marches inward from known to unknown regions. It uses the colour distribution in a weighted window of the known (or already computed) neighbouring regions to estimate the most likely combination of α, F and B.
Previous work

• Constant colour matting
  • colour difference technique
  • colorspace segmentation
• Natural image matting
  • based on statistical models

Our contribution

• Constant colour matting inspired by natural image matting
• Estimate F and deduce alpha
Colour difference technique

• Ultimatte — inspired by an early optical process
(Diagram: the maximum of the red and green channels is subtracted from the blue channel)
Colorspace segmentation

• HSL segmentation
(Diagram: HSL colorspace partitioned into Background, Foreground, and Semi-opaque regions with a user-defined rolloff)
• Primatte
• One of the state-of-the-art commercial software packages
(Diagram: regions fitted from background samples, with Foreground and Semi-opaque regions and a user-defined rolloff)
Problem

• Without F, we cannot estimate alpha accurately:
  • no accurate alpha, as the original foreground colour remains unknown
  • requires extensive matte cleaning
  • colour inconsistencies with semi-opaque pixels (spill suppression does not recover the original foreground colour)
(Diagram: from the background colour and the observed colour alone, alpha appears to be 0.5; with the true foreground colour, the true alpha is 0.1)
Natural image matting

• arbitrary background
• requires prior segmentation (trimap: Foreground, Background, Unknown)
• for each unknown pixel, minimise an objective function based on spatial statistical models for F and B
• estimate alpha, B and F for the unknown pixels
• computationally expensive
Our contribution

• Estimate F and deduce alpha
• Consistent foreground colour
  • more accurate alpha estimation
  • little or no matte cleaning
• No trimap required
Foreground estimation

• 1. Build a set of foreground (F) candidates
• 2. For each pixel, find the best F candidate from the previously computed set
• 3. Deduce alpha
• Three independent methods for step 2
Set of foreground candidates

• Hypothesis: the foreground colour is
  • already present in the image
  • and distinct from the background colour
• If B is different from O, then B, O and F are aligned in the colorspace (linear relationship)
• Candidates: modes of the colour distribution of the image, excluding the background colour
• Mean-shift algorithm in the L*a*b* colorspace
15
Introduction Foreground estimation Results and evaluation Conclusion
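As a sketch of this candidate-building step, here is a minimal flat-kernel mean-shift over colour samples (a toy illustration in plain numpy, not the authors' implementation; a real run would operate on L*a*b* pixel values with the background colour excluded):

```python
import numpy as np

def meanshift_modes(points, window, n_iter=20, merge_tol=1.0):
    """Tiny flat-kernel mean-shift: repeatedly move each point to the mean
    of the original samples within `window`, then merge nearby modes."""
    shifted = points.astype(float).copy()
    for _ in range(n_iter):
        for k in range(len(shifted)):
            near = points[np.linalg.norm(points - shifted[k], axis=1) < window]
            shifted[k] = near.mean(axis=0)
    modes = []
    for p in shifted:                  # merge converged points into modes
        if all(np.linalg.norm(p - m) >= merge_tol for m in modes):
            modes.append(p)
    return np.array(modes)

# Two tight colour clusters -> two candidate foreground colours
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal([60.0, 20.0, -30.0], 0.3, (40, 3)),
                 rng.normal([30.0, -40.0, 10.0], 0.3, (40, 3))])
print(len(meanshift_modes(pts, window=5.0)))   # two modes expected
```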
Foreground estimation
[Slides 16–19: illustrations of the candidate clustering (threshold th1, set of cluster candidates) and of the solution space in the colour space, showing the background colour B and the observed colour O.]
Foreground estimation
Three methods to estimate F:
• (a) F minimises the sum of squared residuals of the matting equation system, where F is replaced by a cluster candidate Ci.
• (b) F is given by the orthogonal projection of the closest cluster candidate Ci onto the line BO.
• (c) F is given by the smallest rotation of BCi onto BO.
Algorithm 1: Algorithm for one pixel.
Data: O, B, C, th1, th2, m.  Result: F, α.
if ∥BO∥ = 0 then
    F ← 0 ; α ← 0                        /* This is a background pixel. */
else if ∥BO∥ > th1 then
    F ← O ; α ← 1                        /* This is a foreground pixel. */
else
    L ← ∅                                /* Initiate a list of candidates. */
    for each Ci ∈ C do
        Fi ← estimate of F with method m using (O, B, Ci)
        ci ← cost for this Fi
        if ci < th2 then L ← L ∪ {Fi}    /* Fi is a candidate. */
    if |L| = 0 then
        F ← 0 ; α ← 0                    /* No candidate → background. */
    else
        F ← Fi ∈ L such that ∀ Fj ∈ L, ∥OFi∥ ≤ ∥OFj∥   /* Best candidate. */
        α ← ∥BO∥ / ∥BF∥
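A direct Python transliteration of Algorithm 1 (a sketch: `estimate` is a placeholder for one of methods (a)–(c); method (c) is used in the demo below):

```python
import numpy as np

def matte_pixel(O, B, C, th1, th2, estimate):
    """Algorithm 1 for one pixel: returns (F, alpha).
    `estimate(O, B, Ci)` returns (Fi, cost) for one cluster candidate."""
    BO = np.linalg.norm(O - B)
    if BO == 0:                              # background pixel
        return np.zeros(3), 0.0
    if BO > th1:                             # foreground pixel
        return O.copy(), 1.0
    L = [Fi for Ci in C
         for Fi, ci in [estimate(O, B, Ci)] if ci < th2]
    if not L:                                # no candidate -> background
        return np.zeros(3), 0.0
    F = min(L, key=lambda Fi: np.linalg.norm(Fi - O))  # best candidate
    return F, BO / np.linalg.norm(F - B)

def method_c(O, B, Ci):                      # method (c): rotate BCi onto BO
    BO, BCi = O - B, Ci - B
    Fi = B + BO * np.linalg.norm(BCi) / np.linalg.norm(BO)
    cos_t = BCi @ BO / (np.linalg.norm(BCi) * np.linalg.norm(BO))
    return Fi, float(np.arccos(np.clip(cos_t, -1.0, 1.0)))

B = np.array([0.0, 255.0, 0.0])              # green screen
F_true = np.array([200.0, 50.0, 50.0])
O = 0.5 * F_true + 0.5 * B                   # semi-transparent pixel
F, a = matte_pixel(O, B, [F_true], th1=200.0, th2=0.1, estimate=method_c)
print(F, a)                                  # recovers F_true and alpha = 0.5
```

Here `th1` is set so that the semi-transparent branch is exercised; with the true foreground colour in the candidate set, the pixel's F and α = 0.5 are recovered exactly.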
(b) In this method, Fi is given by the orthogonal projection of Ci onto the line BO. The cost is the Euclidean distance between Ci and Fi: ci = ∥CiFi∥.
(c) This method rotates Ci around B by the angle θi = cos⁻¹((BCi · BO) / (∥BCi∥ ∥BO∥)) so that B, O and Ci are collinear. The cost is the rotation angle: ci = θi.
(a) Fi = B + (∥BCi∥² / (BO · BCi)) BO, with cost ci = ρ, the least-squares residual (distance between O and its projection O′i onto the line BCi).
(b) Fi = B + ((BO · BCi) / ∥BO∥²) BO, with cost ci = ∥CiFi∥.
(c) Fi = B + (∥BCi∥ / ∥BO∥) BO, with cost ci = θi.
Table 1: Illustration of the 3 methods (a), (b) and (c) proposed to estimate Fi in Algorithm 1.
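The three closed-form estimates in Table 1 can be sketched directly in code (my reading of method (a)'s formula from the table layout is an interpretation; when Ci is already collinear with B and O, all three reduce to Fi = Ci, which the demo checks):

```python
import numpy as np

def f_a(O, B, Ci):
    # (a) Fi = B + (||BCi||^2 / (BO . BCi)) BO
    BO, BCi = O - B, Ci - B
    return B + BO * (BCi @ BCi) / (BO @ BCi)

def f_b(O, B, Ci):
    # (b) orthogonal projection of Ci onto the line BO
    BO, BCi = O - B, Ci - B
    return B + BO * (BO @ BCi) / (BO @ BO)

def f_c(O, B, Ci):
    # (c) rotation of BCi onto BO, preserving ||BCi||
    BO, BCi = O - B, Ci - B
    return B + BO * np.linalg.norm(BCi) / np.linalg.norm(BO)

B = np.zeros(3)
O = np.array([1.0, 2.0, 2.0])
Ci = 2.0 * O                       # candidate already on the line BO
for f in (f_a, f_b, f_c):
    print(f(O, B, Ci))             # each returns Ci itself
```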
4 Evaluation and results
Ground truth dataset  To provide a quantitative evaluation of our three methods, we created a ground-truth dataset. As explained in [Smith and Blinn, 1996], if at least two different shades of background are known, Eq. 1 becomes overdetermined and we can estimate α and F. We took pictures of six different objects (bath, bottle, muppet, pot1, pot2 and spider, see Table 3) in front of five different backgrounds (blue, green, black, yellow and red) for further overdetermination. Then, we calculated a least-squares solution for αF and α:

$$\begin{bmatrix} -B_{\mathrm{blue}} & I_3 \\ -B_{\mathrm{green}} & I_3 \\ \vdots & \vdots \end{bmatrix} \begin{bmatrix} \alpha \\ \alpha F \end{bmatrix} = \begin{bmatrix} O_{\mathrm{blue}} - B_{\mathrm{blue}} \\ O_{\mathrm{green}} - B_{\mathrm{green}} \\ \vdots \end{bmatrix} \qquad (2)$$

where I3 is the 3 × 3 identity matrix.
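The overdetermined system of Eq. 2 can be solved per pixel with ordinary least squares. A synthetic sketch with two backgrounds (`np.linalg.lstsq` stands in for whatever solver the authors used):

```python
import numpy as np

alpha_true, F_true = 0.4, np.array([180.0, 60.0, 40.0])   # unknowns
Bs = [np.array([0.0, 255.0, 0.0]),                        # green
      np.array([0.0, 0.0, 255.0])]                        # blue
Os = [alpha_true * F_true + (1 - alpha_true) * B for B in Bs]

# One block row [ -B_k | I3 ] per background; unknowns x = [alpha, alpha*F]
I3 = np.eye(3)
A = np.vstack([np.hstack([-B[:, None], I3]) for B in Bs])
b = np.concatenate([O - B for O, B in zip(Os, Bs)])
x = np.linalg.lstsq(A, b, rcond=None)[0]

alpha_hat = x[0]
F_hat = x[1:] / alpha_hat
print(alpha_hat, F_hat)            # recovers 0.4 and [180, 60, 40]
```

Each background contributes three equations (O_k − B_k = −αB_k + I3(αF)) for the four unknowns, so two distinct backgrounds already overdetermine the system; the five backgrounds of the dataset overdetermine it further.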
Parameters  Each method requires two parameters for the initial clustering: the size of the bins sb, and the size of the mean-shift window sms. It also requires th1, the minimum ∥BF∥
Results and evaluation
Estimating ground truth
• 5 different backgrounds → overdetermined problem
[Figure: ground-truth αF matte and ground-truth α matte.]
Results and evaluation
Measure of error
• Mean squared error
distance, and th2, the threshold on the cost, which depends on the method employed (see § 3.2). According to an initial experiment, (sb, sms) can be fixed to (2, 4) for method (a) (giving about 60 clusters) and to (2, 2) for methods (b) and (c) (giving about 400 clusters). The best results (according to the criterion described below) are obtained with th1 ≈ 40 ± 5 and (a) th2 ≈ 2; (b) th2 ≈ 10; (c) th2 ≈ 0.4 (see Tables 2 and 3). These values can then be fine-tuned interactively.
Measure of error  For each of the three methods m, we evaluate the results obtained with different sets of values for th1 and th2. We chose to evaluate the methods using the images having a green background. As we are interested in F and α, we compare the estimated values with the ground truth. The mean squared error for an image is given by:

$$\mathrm{MSE}_{th_1, th_2, m} = \frac{1}{hw} \sum_{i,j=0,0}^{h-1,w-1} \left\lVert \widehat{(\alpha F)}_{i,j} - (\alpha F)_{i,j} \right\rVert \qquad (3)$$

where (h, w) are the dimensions of the image, $\widehat{(\alpha F)}_{i,j}$ is the estimated value for the pixel at coordinates (i, j) and $(\alpha F)_{i,j}$ is the ground-truth value for the pixel at the same coordinates.
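Eq. 3 is a per-pixel average of the norm of the αF error; a minimal sketch over h × w × 3 colour arrays:

```python
import numpy as np

def matte_error(aF_est, aF_gt):
    """Eq. 3: mean over all pixels of the norm of the alpha-premultiplied
    foreground error, for (h, w, 3) colour arrays."""
    return np.linalg.norm(aF_est - aF_gt, axis=2).mean()

aF_gt = np.zeros((2, 2, 3))
aF_est = aF_gt + np.array([3.0, 0.0, 4.0])   # every pixel off by (3, 0, 4)
print(matte_error(aF_est, aF_gt))            # -> 5.0
```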
Tables 2 and 3 show the results. With the appropriate set of parameter values, the three methods can achieve results with a low error score. The best average values for each method are MSE45,3,(a) = 3.339, MSE50,5,(b) = 2.814 and MSE50,0.4,(c) = 2.811, showing that (b) and (c) perform slightly better than (a). In some scenes, (b) and (c) clearly outperform (a).
Table 2: Contour plots of the averaged MSEth1,th2,m over the 6 images for each of the three methods, as a function of th1 (35–85) and th2 (up to 17 for (a), 45 for (b), 1.8 for (c)).
5 Discussion and conclusion
We proposed three methods to estimate F and α in a constant-colour-background matting problem. According to our evaluation, these methods give very encouraging results.
A comparison with commercial methods would be interesting. However, source code is not available, executables are not free, and they may include extra post-processing that gives a visually appealing, but not mathematically accurate, result. These issues make a fair and rigorous comparison difficult. Nevertheless, to permit a qualitative evaluation of how our algorithm may compete, we visually compared our result (left) with one obtained using the Apple Motion HSL Keyer (right); the latter shows more spill. We also noticed one inconsistency in the ground-truth data for pot1, where the opaque fluorescent marker is considered to have some transparency. Although the variances in the observed values are relatively small (due to noise and indirect background illumination), the residual of Eq. 2 is quite large. We propose to re-estimate the ground truth using a robust
(a) Our method
(b) Apple Motion HSL Keyer
Figure 2: Visual comparison with a commercial solution.
distance, and th2 the threshold on the cost depending on the method employed, see § 3.2.According to an initial experiment, (sb, sms) can be fixed to (2, 4) for the method (a) (givingabout 60 clusters) and to (2, 2) for the methods (b) and (c) (giving about 400 clusters). Thebest results (according to the criterion described below) are obtained with th1 ≈ 40 ± 5 and(a) th2 ≈ 2 ; (b) th2 ≈ 10 ; (c) th2 ≈ 0.4 (see Tables 3 and 2). These values can then be finetuned interactively.
Measure of error For each of the three methods m, we evaluate the results obtained witha different set of values for th1 and th2. We chose to evaluate the methods using the imageshaving a green background. As we are interested in F and α, we compare the estimated valueswith the ground truth. The mean squared error for an image is given by:
MSEth1,th2,m =1
hw
h−1,w−1!
i,j=0,0
∥(αF )i,j − (αF )i,j∥ (3)
where (h,w) are the dimensions of the image, (αF )i,j is the estimated value for the pixel atcoordinates (i, j) and (αF )i,j is the ground truth value for the pixel of same coordinates.
Tables 2 and 3 show the results. With the appropriate set of parameter values, the threemethods can achieve results with a low error score. The best average values for each methodare MSE45,3,(a) = 3.339, MSE50,5,(b) = 2.814 and MSE50,0.4,(c) = 2.811 showing that (b)and (c) perform slightly better than (a). In some scenes, (b) and (c) clearly outperform (a).
(a) (b) (c)
th2
th1
4
6
8
10
12
14
16
1
3
5
7
9
11
13
15
17
35 40 45 50 55 60 65 80 85
th2
th1
4
6
8
10
12
14
16
5
10
15
20
25
30
35
40
45
35 40 45 50 55 60 65 80 85
th2
th1
4
6
8
10
12
14
16
.2
.4
.6
.8
1
1.2
1.4
1.6
1.8
35 40 45 50 55 60 65 80 85
Table 2: Contour plots of the averaged MSEth1,th2,m over the 6 images for each of the threemethods in function of th1 and th2.
5 Discussion and conclusion
We proposed three methods to estimate F and α in a constant colour background mattingproblem. According to our evaluation, these methods give very encouraging results.
A comparison with commercial methods would be interesting. However, source code is notavailable, executables are not free and they may include extra post processing to give a visuallyappealing, but not mathematically accurate, result. These issues make a fair and rigorous com-parison difficult. But, to permit a qualitative evaluation of how our algorithm may compete, wevisually compared our result (left) with one obtained using Apple Motion HSL Keyer (right).The latter one has more spill. We also noticed one inconsistent ground truth data in Pot1 wherethe opaque fluorescent marker is considered to have some transparency. Although the variancesin the observed values are relatively small (due to noise and background indirect illumination),the residual of Eq. 2 is quite large. We propose to re-estimate the ground truth using a robust
(a) Our method
(b) Apple Motion HSL Keyer
Figure 2: Visual comparison with a commercial solution.
Introduction Foreground estimation Results and evaluation Conclusion
Results and evaluation
• Mean squared errorMeasure of error
25
distance, and th2 the threshold on the cost depending on the method employed, see § 3.2.According to an initial experiment, (sb, sms) can be fixed to (2, 4) for the method (a) (givingabout 60 clusters) and to (2, 2) for the methods (b) and (c) (giving about 400 clusters). Thebest results (according to the criterion described below) are obtained with th1 ≈ 40 ± 5 and(a) th2 ≈ 2 ; (b) th2 ≈ 10 ; (c) th2 ≈ 0.4 (see Tables 3 and 2). These values can then be finetuned interactively.
Measure of error For each of the three methods m, we evaluate the results obtained witha different set of values for th1 and th2. We chose to evaluate the methods using the imageshaving a green background. As we are interested in F and α, we compare the estimated valueswith the ground truth. The mean squared error for an image is given by:
MSEth1,th2,m =1
hw
h−1,w−1!
i,j=0,0
∥(αF )i,j − (αF )i,j∥ (3)
where (h,w) are the dimensions of the image, (αF )i,j is the estimated value for the pixel atcoordinates (i, j) and (αF )i,j is the ground truth value for the pixel of same coordinates.
Tables 2 and 3 show the results. With the appropriate set of parameter values, the threemethods can achieve results with a low error score. The best average values for each methodare MSE45,3,(a) = 3.339, MSE50,5,(b) = 2.814 and MSE50,0.4,(c) = 2.811 showing that (b)and (c) perform slightly better than (a). In some scenes, (b) and (c) clearly outperform (a).
(a) (b) (c)th
2
th1
4
6
8
10
12
14
16
1
3
5
7
9
11
13
15
17
35 40 45 50 55 60 65 80 85
th2
th1
4
6
8
10
12
14
16
5
10
15
20
25
30
35
40
45
35 40 45 50 55 60 65 80 85
th2
th1
4
6
8
10
12
14
16
.2
.4
.6
.8
1
1.2
1.4
1.6
1.8
35 40 45 50 55 60 65 80 85
Table 2: Contour plots of the averaged MSEth1,th2,m over the 6 images for each of the threemethods in function of th1 and th2.
5 Discussion and conclusion
We proposed three methods to estimate F and α in a constant colour background mattingproblem. According to our evaluation, these methods give very encouraging results.
A comparison with commercial methods would be interesting. However, source code is notavailable, executables are not free and they may include extra post processing to give a visuallyappealing, but not mathematically accurate, result. These issues make a fair and rigorous com-parison difficult. But, to permit a qualitative evaluation of how our algorithm may compete, wevisually compared our result (left) with one obtained using Apple Motion HSL Keyer (right).The latter one has more spill. We also noticed one inconsistent ground truth data in Pot1 wherethe opaque fluorescent marker is considered to have some transparency. Although the variancesin the observed values are relatively small (due to noise and background indirect illumination),the residual of Eq. 2 is quite large. We propose to re-estimate the ground truth using a robust
(a) Our method
(b) Apple Motion HSL Keyer
Figure 2: Visual comparison with a commercial solution.
distance, and th2 the threshold on the cost depending on the method employed, see § 3.2.According to an initial experiment, (sb, sms) can be fixed to (2, 4) for the method (a) (givingabout 60 clusters) and to (2, 2) for the methods (b) and (c) (giving about 400 clusters). Thebest results (according to the criterion described below) are obtained with th1 ≈ 40 ± 5 and(a) th2 ≈ 2 ; (b) th2 ≈ 10 ; (c) th2 ≈ 0.4 (see Tables 3 and 2). These values can then be finetuned interactively.
Measure of error For each of the three methods m, we evaluate the results obtained witha different set of values for th1 and th2. We chose to evaluate the methods using the imageshaving a green background. As we are interested in F and α, we compare the estimated valueswith the ground truth. The mean squared error for an image is given by:
MSEth1,th2,m =1
hw
distance, and th2 the threshold on the cost, which depends on the method employed (see § 3.2). According to an initial experiment, (sb, sms) can be fixed to (2, 4) for method (a) (giving about 60 clusters) and to (2, 2) for methods (b) and (c) (giving about 400 clusters). The best results (according to the criterion described below) are obtained with th1 ≈ 40 ± 5 and (a) th2 ≈ 2; (b) th2 ≈ 10; (c) th2 ≈ 0.4 (see Tables 2 and 3). These values can then be fine-tuned interactively.
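The parameter sweep described above can be sketched as an exhaustive search over a (th1, th2) grid, keeping the pair that minimises the averaged error. The `error_fn` below is a hypothetical stand-in for the averaged MSE of Eq. 3, and the toy quadratic error surface is invented for illustration:

```python
import itertools

def pick_thresholds(error_fn, th1_values, th2_values):
    # Exhaustive sweep over the (th1, th2) grid, returning the pair
    # that minimises the (averaged) error criterion.
    return min(itertools.product(th1_values, th2_values),
               key=lambda pair: error_fn(*pair))

# Toy error surface with its minimum at (40, 2), mimicking the
# reported optimum for method (a).
best = pick_thresholds(lambda t1, t2: (t1 - 40) ** 2 + (t2 - 2) ** 2,
                       range(35, 90, 5), range(0, 18, 2))
```

The same sweep, run per method over the green-background images, is what produces the contour plots of Table 2.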
Measure of error For each of the three methods m, we evaluate the results obtained with different sets of values for th1 and th2. We chose to evaluate the methods using the images having a green background. As we are interested in F and α, we compare the estimated values with the ground truth. The mean squared error for an image is given by:
$$\mathrm{MSE}_{th_1, th_2, m} = \frac{1}{hw} \sum_{i,j=0,0}^{h-1,\, w-1} \left\| (\widehat{\alpha F})_{i,j} - (\alpha F)_{i,j} \right\| \tag{3}$$

where (h, w) are the dimensions of the image, $(\widehat{\alpha F})_{i,j}$ is the estimated value for the pixel at coordinates (i, j) and $(\alpha F)_{i,j}$ is the ground truth value for the pixel at the same coordinates.
Tables 2 and 3 show the results. With an appropriate set of parameter values, the three methods can achieve results with a low error score. The best average values for each method are MSE_{45,3,(a)} = 3.339, MSE_{50,5,(b)} = 2.814 and MSE_{50,0.4,(c)} = 2.811, showing that (b) and (c) perform slightly better than (a). In some scenes, (b) and (c) clearly outperform (a).
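As a sketch, the per-image error of Eq. 3 can be computed as below; the (h, w, 3) array layout for the premultiplied foreground αF is our assumption, since the paper does not fix a storage format:

```python
import numpy as np

def matte_error(est_alpha_f, gt_alpha_f):
    # Eq. 3: average over the h*w pixels of the norm of the difference
    # between estimated and ground-truth premultiplied foreground (alpha*F).
    diff = np.asarray(est_alpha_f, float) - np.asarray(gt_alpha_f, float)
    return float(np.linalg.norm(diff, axis=-1).mean())

# Toy 2x2 image: a single pixel off by (3, 4, 0), whose norm is 5,
# gives an average error of 5 / 4 = 1.25.
gt = np.zeros((2, 2, 3))
est = gt.copy()
est[0, 0] = [3.0, 4.0, 0.0]
```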
[Contour plots for methods (a), (b) and (c): th1 on the horizontal axis (35–85), th2 on the vertical axis (4–16); the contour levels span roughly 1–17 for (a), 5–45 for (b) and 0.2–1.8 for (c).]

Table 2: Contour plots of the averaged MSE_{th1,th2,m} over the six images for each of the three methods, as a function of th1 and th2.
5 Discussion and conclusion
We proposed three methods to estimate F and α in a constant colour background matting problem. According to our evaluation, these methods give very encouraging results.
A comparison with commercial methods would be interesting. However, source code is not available, executables are not free, and they may include extra post-processing that gives a visually appealing, but not mathematically accurate, result. These issues make a fair and rigorous comparison difficult. Nevertheless, to give a qualitative sense of how our algorithm may compete, we visually compared our result (left) with one obtained using the Apple Motion HSL Keyer (right); the latter shows more spill. We also noticed one inconsistency in the ground truth data for Pot1, where the opaque fluorescent marker is considered to have some transparency. Although the variances in the observed values are relatively small (due to noise and indirect background illumination), the residual of Eq. 2 is quite large. We propose to re-estimate the ground truth using a robust estimator.
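The effect of a robust estimator can be illustrated in one dimension: a least-squares (mean) fit is pulled arbitrarily far by a single gross outlier, such as the mislabelled Pot1 marker, whereas an L1 (median) fit is not. The numbers below are invented for illustration:

```python
import statistics

# Four consistent observations plus one gross outlier.
observations = [10.0, 10.1, 9.9, 10.05, 60.0]

mean_fit = statistics.fmean(observations)     # pulled towards the outlier
median_fit = statistics.median(observations)  # robust: stays near 10
```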
(a) Our method
(b) Apple Motion HSL Keyer
Figure 2: Visual comparison with a commercial solution.
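Estimating the premultiplied foreground αF directly is what makes spill suppression unnecessary: with the standard compositing equation C = αF + (1 − α)Bnew, the new background is blended in without the green background ever entering the result. A minimal sketch (array shapes and the helper name are our own, not from the paper):

```python
import numpy as np

def composite(alpha_f, alpha, new_bg):
    # C = alpha*F + (1 - alpha) * B_new, with alpha*F already estimated,
    # so no colour from the original green screen can leak through.
    return alpha_f + (1.0 - alpha)[..., None] * new_bg

alpha = np.array([[1.0, 0.5]])                          # opaque / half-transparent
alpha_f = alpha[..., None] * np.array([0.2, 0.8, 0.2])  # premultiplied foreground
new_bg = np.full((1, 2, 3), 1.0)                        # white replacement background
out = composite(alpha_f, alpha, new_bg)
# The opaque pixel keeps F; the half-transparent one is an even blend with white.
```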
Conclusion and future work

Conclusion
• Constant colour matting method based on foreground colour estimation
• Ground truth dataset
• Accurate (low MSE), visually good results
Future work
• Ground truth dataset → large residuals in some cases → robust estimation
• Comparison with other methods: difficult to compare with commercial products (not free, technical details not well documented, post-processing added to clean up the matte)