OBJECTIVE ASSESSMENT OF VIDEO SEGMENTATION QUALITY FOR AUGMENTED REALITY
Silvio R. R. Sanches*, Valdinei F. Silva†, Ricardo Nakamura, Romero Tori
Universidade de São Paulo, São Paulo, Brasil
{silviorrs, valdinei.freire}@usp.br, [email protected], [email protected]
ABSTRACT
Assessment of video segmentation quality is a problem seldom investigated by the scientific community. Nevertheless, recent studies have presented objective metrics to evaluate segmentation algorithms. Such metrics consider the different ways in which segmentation errors occur (perceptual factors), and their parameters are adjusted according to the application for which the segmented frames are intended. We demonstrate empirically that the performance of existing metrics changes according to the segmentation algorithm; we applied such metrics to evaluate bilayer segmentation algorithms used to compose scenes in Augmented Reality environments. We also contribute a new objective metric to adjust the parameters of two bilayer segmentation algorithms found in the literature.
Index Terms - Binary Segmentation Algorithm, Objective Assessment, Objective Evaluation, Augmented Reality.
1. INTRODUCTION
Image segmentation for the extraction of a person in the foreground from their original context has become a common task in Augmented Reality (AR) systems. This operation becomes more difficult when, due to application requirements, it must be performed in natural environments - with an arbitrary background and without controlled lighting - and using uncalibrated monocular video capture [1, 2].

Recent research has produced algorithms that work in real time and perform segmentation based on monocular images [3, 4, 5, 6]. Applications such as videoconferencing systems (or videochats) [3, 4] and immersive games [1] have adopted these algorithms; both kinds of application may implement AR systems [1, 2]. Because of the difficulty of segmenting a natural image - that is, an image obtained from a natural environment - the output image, which should contain only the element of interest, may present pixel classification errors. The usage of images with those errors may considerably influence the visual quality of the scene displayed to the user, and may prevent some applications from using those algorithms.

*Thanks to CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior) for financial support, and to Instituto Nacional de Ciência e Tecnologia - Medicina Assistida por Computação Científica (INCT-MACC), Proc. 573710/2008-2, for the device used in the subjective experiments.
†Thanks to FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo), Procs. 11/19280-8 and 12/19627-0, for financial support.
According to Gelasca and Ebrahimi [7], it is possible to obtain an application-dependent metric to evaluate segmentation algorithms. Those authors presented an objective metric whose parameters were adjusted according to the target application of the segmented video frames; the study included AR systems. To obtain that metric, the authors defined a set of error types that were artificially inserted into video sequences simulating an AR application. Those videos were then submitted to a subjective quality assessment process.
We have evaluated the metric proposed in [7] using actual segmentation errors, i.e., errors produced by applying a real segmentation algorithm. The metric was applied to video sequences obtained by executing two segmentation algorithms described in the literature, exhibiting the corresponding errors. Videos simulating an AR environment were created with the imperfectly segmented video frames, and a subjective quality assessment process was applied to them. Based on the results of this subjective evaluation, we propose a new objective metric that considers perceptual factors for the assessment of segmentation quality in AR environments. Our metric can be used to select optimal parameters for two bilayer segmentation algorithms following subjective evaluation.
2. SEGMENTATION QUALITY ASSESSMENT
Assessing the quality of segmentation algorithms is a problem that has been investigated in different contexts in the literature. One case is the evaluation of images composed of objects - in particular, people - extracted from video content [7]. Although sophisticated segmentation methods exist, none are yet precise and general enough to be a definitive solution to the problem. Therefore, applications must consider that the composite image presented to the user, at a given moment, may contain segmentation errors.
Images produced by the process of segmentation and composition with a new background have been evaluated both subjectively and objectively [7]. Subjective assessment has been shown to be the most efficient means of obtaining reliable measurements [8], both in industry and in the scientific community. Some methods traditionally used in quality assessment of video codecs for TV broadcasts were adapted over time so that they could be used for images presented in multimedia applications, including the evaluation of segmentation [9]. Some of these methods are also directly applied in quality assessment of video objects [9].

The main problem of subjective assessment methods is that, in general, they require a large number of observers and significant infrastructure. This makes the process lengthy and potentially expensive. Hence, when an algorithm must be tuned for best performance, an objective metric can avoid applying subjective assessment directly [7].
Although there is previous research on measuring image segmentation quality, the main motivation for such research is directly related to the standardization efforts of ISO/MPEG-4. Because the standard requires independent encoding of video objects, research aimed at the quality assessment of those segmented images became necessary. This research produced a metric based on spatial accuracy and spatial coherency [10]. Other methods, consisting of improvements on the base model, were also proposed by the same research group [11].
Correia and Pereira [12] evaluated individual object segmentation based on spatial and temporal criteria; their work also includes the evaluation of segmentation with multiple objects present in the video content. The spatial criteria are: shape fidelity, geometric fidelity, similarity of edge contents, and statistical data similarity. The adopted temporal criteria represent temporal perceptual information and a measurement of criticality, using spatial and temporal information simultaneously.
Gelasca and Ebrahimi [7] proposed subjective tests to identify the most noticeable error types, but the error types were artificially created to simulate spatial and temporal errors. The resulting method takes into consideration the errors (or artifacts) identified through the assessment process as causing the most annoyance to users. Four artifacts were defined: added regions, added background, internal holes, and edge holes [7]; we detail these artifacts in section 4. All of these artifacts are linearly combined to produce an overall measurement of discomfort. Finally, the authors define specific weights for some applications, including the assessment of segmentation in AR environments.
Although some methods for the objective assessment of segmentation quality have been proposed, according to Gelasca and Ebrahimi [7] few of them focus on studying and defining the errors typically found in the segmentation process in order to obtain a perceptual measurement.
3. SUBJECTIVE EXPERIMENT
Our first task was to define a formal subjective experiment to gather the opinion of users regarding videos presenting an AR environment with segmentation errors in the avatar image inserted in it. These errors were obtained by executing two different segmentation algorithms that are feasible in this context [4, 13]. Therefore, users observe actual segmentation errors instead of simulated ones as in [7]. The next sections describe the steps of the subjective method.
3.1. Preparation of the Video Database
Five different video sequences were used as sources, named SEQ1, SEQ2, SEQ3, SEQ4, and SEQ5. In these sequences the element of interest in the scene - the person in the foreground - was placed so that either the upper body or the full body was visible. Figure 1 shows video frames from sequences SEQ1 and SEQ2.
Fig. 1. Frames from original video sequences SEQ1 and SEQ2.
Each source video sequence used in the experiment has a ground truth; that is, for each video frame there is a corresponding precisely segmented image. This allowed the calculation of classification errors for each sequence. The pixels in the ground truth video frames were labeled as foreground, background, or unknown region. The pixels that are part of the unknown region (the edge of the element of interest) were not included in the error count. In the generated videos, pixels in this region are composed of the element of interest and the new background using a matting technique, each with 50% transparency; this technique softens the edges of the element of interest.
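For illustration, a minimal sketch of this composition step, assuming each frame is a NumPy array and that the ground truth uses the three-label encoding described above (the function name and encoding are ours, not from the paper):

```python
import numpy as np

def compose_frame(fg, new_bg, labels):
    """Compose a segmented foreground over a new background.

    fg, new_bg: (H, W, 3) uint8 color frames.
    labels: (H, W) ground-truth labels, 0 = background,
            1 = foreground, 2 = unknown (edge) region.
    In the unknown region the element of interest and the new
    background are blended with 50% transparency each, which
    softens the edges of the element of interest.
    """
    out = new_bg.astype(np.float32).copy()
    fgf = fg.astype(np.float32)
    out[labels == 1] = fgf[labels == 1]
    out[labels == 2] = 0.5 * fgf[labels == 2] + 0.5 * out[labels == 2]
    return out.astype(np.uint8)
```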
Sequences SEQ2 and SEQ4 and their respective ground truth sequences were obtained from a database available for research¹, whereas sequences SEQ1, SEQ3, and SEQ5 were captured and labeled manually.
By executing two segmentation algorithms with different parameters, we produced short-duration (10 s) video sequences from the five source video sequences, with a resolution of 640 x 480 pixels and different percentages of pixel segmentation errors². These videos present an AR application scenario based on a 3D virtual office environment seen from a fixed point of view. The segmented element of interest is inserted into the scene as a "billboard" - that is, applied as a texture on a rectangular 3D mesh present in the virtual environment that remains perpendicular to the viewing direction. For the tests, error-segmented sequences were produced for each of the source videos, for a total of forty-eight video sequences. Twenty-four were segmented using the background subtraction method (Qian) proposed in [13] and the other twenty-four using the method based on an energy minimization framework proposed by Criminisi et al. (Crim) [4]. The amount and type of errors depend on the segmentation algorithm and on the parameters applied within it.

¹http://research.microsoft.com/en-us/projects/i2i/data.aspx
²We formally define pixel segmentation errors as the artifact $E_T$ in section 4.2.
By varying the parameter settings of both algorithms, pixel errors range from 0% (ground-truth reference video) to 31.85% (worst case) across the video sequences. The different error percentages were obtained, in the case of the Qian method, by varying the value of the threshold that controls the tolerance when comparing colors between the background model and the analyzed frame [13]. In the method by Criminisi et al., the error percentages were obtained by varying the normalization parameters of the Conditional Random Field (CRF) [4] used in the model. Since the error percentages were obtained by executing the cited segmentation algorithms, some videos presented higher error percentages due to the limitations of the methods in handling some scenarios present in the source video content, for example, similar colors in the background and in the element of interest, and lighting changes.
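For concreteness, a sketch of how the pixel error percentage of one segmented frame can be computed against its ground truth, excluding the unknown region as described in section 3.1 (function name and label encoding are our assumptions):

```python
import numpy as np

def pixel_error_rate(seg_mask, gt_labels):
    """Percentage of misclassified pixels in one frame.

    seg_mask:  (H, W) boolean segmentation output (True = foreground).
    gt_labels: (H, W) ground truth, 0 = background, 1 = foreground,
               2 = unknown region (excluded from the count).
    """
    known = gt_labels != 2
    errors = seg_mask[known] != (gt_labels[known] == 1)
    return 100.0 * errors.sum() / known.sum()
```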
Two frames of the new videos - containing segmentation errors - can be seen in figure 2.

Fig. 2. Video sequences with segmentation errors created for the subjective assessment tests.
3.2. Performing the Tests
Some subjective video quality assessment methods of acknowledged efficiency [8] are popular both in industry and in the scientific community. Among them, SAMVIQ (Subjective Assessment Methodology for Video Quality) [14] has been shown to be very precise according to certain studies [8]. We used the SAMVIQ method to determine which segmentation errors were the most noticeable to users. Evaluation of the generated videos was performed by applying the SAMVIQ method [14], implemented in the MSU tool³.

³http://compression.ru/video/quality_measure/perceptual_video_quality_tool_en.html
A total of 26 volunteers took part in the experiment. The only restriction regarding the participants' profile, as required by the SAMVIQ method, is that they must not work with image quality assessment as their primary occupation. The same environmental conditions were maintained for all tests. Details of the physical setup may be found in ITU Recommendation BT.500⁴. The interface of the assessment environment is presented in figure 3.

⁴http://www.itu.int/rec/R-REC-BT.500-12-200909-S/en
Fig. 3. Subjective experiment interface.
4. ERROR TYPE ANALYSIS
A segmentation error may affect video quality in spatial and temporal terms [11]. The work of Gelasca and Ebrahimi presents four spatial artifacts that may be combined to represent the overall annoyance caused by segmentation errors in a video sequence: Added Regions $A_r$, Added Background $A_b$, Internal Holes $H_i$, and Edge Holes $H_b$ [7]. Since Gelasca and Ebrahimi performed their subjective evaluation by showing users video sequences with synthetic segmentation errors, we intended to verify whether the same results can be obtained with errors produced by real algorithms. We do so in the context of AR applications.
Artifacts of the same type are grouped into a scalar. The relative spatial error $S_{A_r}(k)$ groups all added regions in frame $k$ by:

$$S_{A_r}(k) = \frac{\sum_{j=1}^{N_{A_r}} |A_j(k)|}{|\Omega(k)|} \qquad (1)$$

where $A_j(k)$ is the added region $j$ in frame $k$, $|\cdot|$ is the set cardinality operator, $\Omega(k)$ is the union of the reference pixels and the segmentation result, and $N_{A_r}$ is the total number of added regions. Likewise, for the $j$ internal holes $H_j(k)$, the relative spatial error $S_{H_i}(k)$ may be obtained. For the other types of error, which are connected to the edge of the element of interest, a weight $D_j$ that takes into account the distance of each pixel in the region to the edge of the element of interest is also added. The relative spatial error for edge errors $S_{A_b}(k)$ for $j$ added background regions is:

$$S_{A_b}(k) = \frac{\sum_{j=1}^{N_{A_b}} D_j\,|A_j(k)|}{|\Omega(k)|} \qquad (2)$$

In a similar manner, for the $j$ edge holes $H_j^b(k)$, the relative spatial error $S_{H_b}(k)$ may be obtained.
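A minimal sketch of equation (1), assuming boolean NumPy masks and using connected-component labeling to identify the added regions; treating every false-positive component as an added region is our simplification of the definition in [7]:

```python
import numpy as np
from scipy import ndimage

def s_ar(seg, ref):
    """Equation (1): relative spatial error of added regions.

    seg, ref: (H, W) boolean foreground masks (segmentation
    result and ground truth reference).
    """
    added = seg & ~ref                        # false-positive pixels
    labeled, n_ar = ndimage.label(added)      # A_j(k), j = 1..N_Ar
    areas = np.bincount(labeled.ravel())[1:]  # |A_j(k)| for each region
    omega = (seg | ref).sum()                 # |Omega(k)|: union of both
    return areas.sum() / omega if omega else 0.0
```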
To take into account the temporal aspect, effects such as the sudden disappearance of artifacts, the surprise effect, and the expectation effect are considered [7], resulting in four objective perceptual metrics $PST_{A_r}$, $PST_{A_b}$, $PST_{H_i}$, and $PST_{H_b}$, with weights ($a$, $b$, $c$, and $d$) adjusted by means of subjective evaluation [7]. Lastly, the proposed objective perceptual metric is a linear combination of those metrics:

$$PST = a \cdot PST_{A_r} + b \cdot PST_{A_b} + c \cdot PST_{H_i} + d \cdot PST_{H_b} \qquad (3)$$
4.1. PST Weights Analysis
The initial tests were performed to verify whether the weights related to each artifact, obtained from the PST metric for AR applications, remain valid when evaluating segmentation based on actual errors obtained from the execution of two different segmentation algorithms, here referred to as Qian [13] and Crim [4].
Initially, the artifacts defined in PST were identified in the data obtained from the segmentation performed by each algorithm. Afterwards, the optimal values for $W = [a, b, c, d]$ (equation 3) and their respective confidence intervals were obtained, for the Crim algorithm, by means of linear regression:

$$W_{Crim} = \mathrm{regress}(Subj_{Crim}, \Pi) \qquad (4)$$

where $Subj_{Crim}$ are the data from the subjective evaluation (SAMVIQ) of the Crim algorithm and $\Pi = [PST_{A_r}, PST_{A_b}, PST_{H_i}, PST_{H_b}]$ are the artifact-based metrics found in the data resulting from segmentation. The values of $W_{Qian}$ were obtained in a similar manner.
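Equation (4) amounts to an ordinary least-squares fit of the subjective scores on the four artifact metrics. A minimal sketch with illustrative (not actual) numbers; the real experiment used the 48 test sequences and also derived confidence intervals:

```python
import numpy as np

# One row per test video: [PST_Ar, PST_Ab, PST_Hi, PST_Hb]
# (illustrative values, not the study's data).
Pi = np.array([[0.12, 0.05, 0.01, 0.02],
               [0.30, 0.11, 0.04, 0.06],
               [0.07, 0.02, 0.00, 0.01],
               [0.22, 0.09, 0.02, 0.05],
               [0.15, 0.04, 0.03, 0.02]])
subj = np.array([42.0, 18.5, 61.0, 27.0, 39.5])  # SAMVIQ scores

# Least-squares solution of subj ~ Pi @ W, i.e. W = [a, b, c, d]
W, residuals, rank, sv = np.linalg.lstsq(Pi, subj, rcond=None)
print("a, b, c, d =", W)
```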
Table 1 presents the 95% confidence intervals obtained in this experiment for each algorithm.

Table 1. Confidence intervals (95%) of the regression weights.

          W_Crim                W_Qian
       left     right        left     right
  a    0.93     11.72        3.75     44.15
  b    6.48     28.79      -22.68      4.78
  c   -6.10      0.71      -20.87     -3.63
  d    6.00     12.98        3.54      8.70
Since the weights $W_{Gel}$ suggested in the PST metric for AR applications are $a = 6.71$, $b = 8.39$, $c = 12.57$, and $d = 8.74$, it is possible to observe that, for the Crim algorithm, $c$ is outside the confidence interval, whereas for the Qian algorithm $b$, $c$, and $d$ are outside their intervals. We attribute this to the fact that the Crim algorithm produces cluster-like errors, since one of its segmentation criteria is based on contrast, while the Qian algorithm produces scattered errors. Gelasca and Ebrahimi experimented only with cluster-like errors, not with scattered ones [7].
In the previous experiment we computed a weight vector for each algorithm, $W_{Qian}$ and $W_{Crim}$. Our statistical test indicates that $W_{Qian}$ and $W_{Crim}$ differ from $W_{Gel}$; the next step is to measure how each weight vector affects the evaluation prediction. We computed the errors $E_{Qian,Qian}$, $E_{Qian,Gel}$, $E_{Qian,Crim}$, $E_{Crim,Crim}$, $E_{Crim,Gel}$, and $E_{Crim,Qian}$, where $E_{i,j}$ stands for the squared error obtained when predicting the subjective evaluation $Subj_i$ using the weights $W_j$.

By applying the Student t-test between $E_{Qian,Qian}$ and $E_{Qian,Gel}$, it is possible to affirm with 95% confidence that there are significant differences between the errors for AR applications; i.e., evaluating the Qian algorithm with a metric obtained under the methodology of Gelasca and Ebrahimi is statistically worse than evaluating it with a metric obtained directly from the errors produced by the algorithm. We also confirmed that $E_{Qian,Crim}$ and $E_{Crim,Qian}$ are statistically different, meaning that it is important not only to use real errors, but also to tune the metric to the specific algorithm. On the other hand, as already observed in the parameter intervals, the errors $E_{Crim,Crim}$ and $E_{Crim,Gel}$ did not prove statistically different, meaning that the results obtained by Gelasca and Ebrahimi can be useful, but must be used with caution.
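A sketch of this comparison, assuming a paired Student t-test over per-sequence squared prediction errors (the arrays are illustrative, not the study's data):

```python
import numpy as np
from scipy import stats

# Squared prediction errors per video sequence:
# E_qian_qian[i]: predicting Subj_Qian with W_Qian,
# E_qian_gel[i]:  predicting Subj_Qian with W_Gel.
E_qian_qian = np.array([1.2, 0.8, 2.1, 1.5, 0.9])
E_qian_gel = np.array([3.4, 2.9, 5.0, 4.2, 3.1])

# Paired t-test at the 95% confidence level
t, p = stats.ttest_rel(E_qian_qian, E_qian_gel)
if p < 0.05:
    print("the two weight sets yield significantly different errors")
```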
4.2. Definition and Analysis of Artifacts
The second experiment was intended to verify whether the artifacts $\Pi$, suggested in [7], are the best set to apply to the evaluation of segmentation in the context of the AR applications considered in this research. For comparison with the previously described artifacts, another set of eighteen artifacts was defined, some of them with variations in their parameters.
False negatives $E_N$ are foreground pixel classification errors, given by

$$E_N = \frac{1}{K} \sum_{k=1}^{K} \sum_{p=1}^{P} [\,pix(p,k) \in N(k)\,] \qquad (5)$$

where $pix(p,k)$ is the pixel at position $p$ of frame $k$ and $N(k)$ is the set of false negative pixels of frame $k$. False positives $E_P$, which are background errors, are obtained in a similar way, and the total mean error $E_T$ is the sum of these artifacts: $E_T = E_N + E_P$.
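A minimal sketch of these three artifacts, assuming per-frame boolean masks (our encoding):

```python
def mean_error_artifacts(seg_frames, gt_frames):
    """E_N, E_P and E_T = E_N + E_P averaged over K frames.

    seg_frames, gt_frames: sequences of (H, W) boolean masks
    (True = foreground), ground truth without unknown pixels.
    """
    K = len(seg_frames)
    e_n = sum((~s & g).sum() for s, g in zip(seg_frames, gt_frames)) / K
    e_p = sum((s & ~g).sum() for s, g in zip(seg_frames, gt_frames)) / K
    return e_n, e_p, e_n + e_p
```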
Some artifacts were defined to measure the degree of annoyance related to the distance from false positives to the foreground. $DP_{in}(dt)$ are the false positives at most $dt$ pixels from the foreground, defined by

$$DP_{in}(dt) = \frac{1}{K} \sum_{k=1}^{K} \sum_{p=1}^{P} [\,pix(p,k) \in P(k) \;\wedge\; dist(pix(p,k)) < dt\,] \qquad (6)$$

where $dt \in \{80, 90, 100, 110, 120\}$ is the distance in pixels and $P(k)$ is the set of false positive pixels of frame $k$. $DP_{out}(dt)$ are the false positives at least $dt$ pixels distant from the foreground, obtained in a similar way.
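A sketch of $DP_{in}(dt)$, using a Euclidean distance transform to measure the distance of each false positive to the foreground; the use of a distance transform is our implementation choice, not specified in the paper:

```python
from scipy import ndimage

def dp_in(seg_frames, gt_frames, dt):
    """DP_in(dt): mean number of false positives at most dt
    pixels from the foreground (equation 6)."""
    total = 0
    for s, g in zip(seg_frames, gt_frames):
        # Distance from every pixel to the nearest foreground pixel
        dist = ndimage.distance_transform_edt(~g)
        fp = s & ~g                   # false-positive pixels P(k)
        total += (fp & (dist < dt)).sum()
    return total / len(seg_frames)
```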
Another type of artifact defined in our experiment is connected pixels (or blobs). Blob errors are given by

$$BP_{larger}(size) = \sum_{k=1}^{K} |\{\, b \in P(k) : |b| > size \,\}| \qquad (7)$$

where $size \in \{5, 10, 15, 20\}$ is the number of connected pixels and $b$ is a connected component of false positives. $BP_{smaller}(size)$ are the false positives with up to $size$ connected pixels, $BN_{larger}(size)$ are the false negatives with at least $size$ connected pixels, and $BN_{smaller}(size)$ are the false negatives with up to $size$ connected pixels. These artifacts are calculated analogously to equation (7).
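A sketch of $BP_{larger}(size)$ using connected-component labeling; reading equation (7) as a count of false-positive blobs above the size threshold is our interpretation of the definition:

```python
import numpy as np
from scipy import ndimage

def bp_larger(seg_frames, gt_frames, size):
    """BP_larger(size): false-positive blobs with more than
    `size` connected pixels, accumulated over the K frames."""
    count = 0
    for s, g in zip(seg_frames, gt_frames):
        labeled, n = ndimage.label(s & ~g)        # false-positive blobs
        areas = np.bincount(labeled.ravel())[1:]  # pixels per blob
        count += int((areas > size).sum())
    return count
```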
To take the temporal aspect into account, we defined the artifact $T_N(pc)$, which counts the false negative pixels $p$ that occur in at least $pc$ percent of the $K$ frames of a video sequence. This error type is given by

$$T_N(pc) = \frac{1}{K} \sum_{k=1}^{K} [\,pix(p,k) \in N(k) \;\wedge\; freq(p) \geq pc\,] \qquad (8)$$

where $pc \in \{40, 50, 60, 70, 80, 90\}$ and $freq(p)$ is the percentage of frames in which pixel $p$ is a false negative. $T_P(pc)$, the corresponding false positive spatio-temporal errors, are obtained in a similar way.
Another type of spatial and temporal error considered in this research is the "false blob". A false negative spatial false blob $FS_{N_s}$ is calculated by convolving a binary image (1 = false negative pixel, 0 = otherwise) with a simple kernel $M_s$ according to

$$FS_{N_s} = \sum_{k=1}^{K} M_N(k) * M_s \qquad (9)$$

where $*$ is the convolution operator, $M_N(k)$ is the binary image created from the false negative pixels, and $M_s$ is a 3 x 3 kernel. False positive false blobs $FS_{P_s}$ are calculated in a similar manner. The false blob error $FS_{N_g}$ is obtained by convolving $M_N(k)$ according to

$$FS_{N_g} = \sum_{k=1}^{K} M_N(k) * M_{g,\sigma} \qquad (10)$$

where $M_{g,\sigma}$ is a centralized Gaussian kernel with standard deviation $\sigma = 0.8$. $FS_{P_g}$ is obtained in a similar way.
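A sketch of equations (9) and (10), accumulating the convolution of the per-frame binary error images over the sequence; the box form of the "simple" 3 x 3 kernel $M_s$ is our assumption, since the paper does not specify it:

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size=3, sigma=0.8):
    """Centralized Gaussian kernel M_g with standard deviation sigma."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def spatial_false_blobs(error_frames, kernel):
    """Accumulate M_N(k) * kernel over the K frames (eqs. 9-10).

    error_frames: sequence of (H, W) binary error images M_N(k).
    """
    acc = None
    for m in error_frames:
        r = convolve2d(m.astype(float), kernel, mode="same")
        acc = r if acc is None else acc + r
    return acc

M_s = np.ones((3, 3)) / 9.0        # assumed "simple" 3 x 3 box kernel
M_g = gaussian_kernel(sigma=0.8)   # Gaussian kernel of equation (10)
```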
In order to calculate temporal false blob errors, we defined a video sequence as a three-dimensional matrix $Hei \times K \times Wid$, where $Hei$ is the height of the original frames and $Wid$ is their width. Therefore, a "temporal frame" $Qt$ can be represented by an image of size $Hei \times K$, and a temporal sequence contains $Wid$ temporal frames. False negative temporal blobs are given by

$$FT_{N_s} = \sum_{w=1}^{Wid} Qt_N(w) * M_s \qquad (11)$$

where $Qt_N$ is a binary temporal image created from the false negative pixels. $FT_{P_s}$ are the false positive temporal blobs; $FT_{N_g}$ and $FT_{P_g}$ are the Gaussian false negative and false positive temporal blobs, respectively.
Once the new artifacts were defined, the next step consisted in identifying the most annoying artifacts and their respective weights in order to define the objective metric. For this, we used a multi-step greedy selection, testing sets of artifacts through regression.

This analysis showed that the artifacts $E_N$, $DP_{out}(110)$, $T_N(60)$, and $PST_{A_b}$ are the most annoying in the analysis performed with the Crim algorithm, while the artifacts $E_N$, $T_N(70)$, $T_P(50)$, and $BN_{smaller}(5)$ are the most annoying in the analysis of the Qian algorithm.
5. OBJECTIVE METRIC DEFINITION
According to the results shown in section 4.2, an objective metric must be specific to a segmentation algorithm. Therefore, the metric $M$ can be defined by

$$M(Alg, Ap) = \sum_{i=1,\,j=1}^{I,\,J} (pes_i \times art_j)_{Alg} \qquad (12)$$

where $Alg$ indicates that the metric is algorithm-dependent and $Ap$ is the application in which the foreground layer will be used. The weights are denoted by a vector $pes = (pes_1, pes_2, \ldots, pes_i, \ldots, pes_I)$ (obtained in the linear regression step), as are the artifacts $(art_1, art_2, \ldots, art_j, \ldots, art_J)$. We used $I, J = 4$, as in PST, for a better comparison with that metric.
For AR applications, the metric $M$ for evaluating the Crim algorithm is

$$M(Crim, AR) = a \cdot E_N + b \cdot DP_{out}(110) + c \cdot T_N(60) + d \cdot PST_{A_b} \qquad (13)$$

where $a = 0.051$, $b = 0$, $c = -4.692$, and $d = 13.731$. In a similar manner, for the Qian algorithm the metric $M$ is

$$M(Qian, AR) = a \cdot E_N + b \cdot T_N(70) + c \cdot T_P(50) + d \cdot BN_{smaller}(5) \qquad (14)$$

where $a = -0.087$, $b = 1.781$, $c = 2.978$, and $d = -0.001$.
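For concreteness, both metrics can be evaluated directly from the artifact values; a minimal sketch (argument names are ours):

```python
def metric_crim_ar(e_n, dp_out_110, t_n_60, pst_ab):
    """Equation (13): metric M(Crim, AR)."""
    a, b, c, d = 0.051, 0.0, -4.692, 13.731
    return a * e_n + b * dp_out_110 + c * t_n_60 + d * pst_ab

def metric_qian_ar(e_n, t_n_70, t_p_50, bn_smaller_5):
    """Equation (14): metric M(Qian, AR)."""
    a, b, c, d = -0.087, 1.781, 2.978, -0.001
    return a * e_n + b * t_n_70 + c * t_p_50 + d * bn_smaller_5
```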
Although the metrics obtained improve over PST (Gelasca and Ebrahimi), mainly when the Qian algorithm is considered, we do not claim that they are the best metrics for the two algorithms considered here. First, we considered a set of hand-coded artifacts plus the artifacts defined in PST; although this set is large, it is far from exhausting the space of possible artifacts. Second, we selected four artifacts greedily to compare with the PST metric; four artifacts may not be enough for a proper evaluation, and multi-step greedy selection is suboptimal in general (for example, $DP_{out}(110)$ proves to be irrelevant after two more artifacts are added). Third, the proximity between $W_{Gel}$ and $W_{Crim}$ may indicate a direction for setting a metric for classes of segmentation algorithms (for example, $PST_{A_b}$ was selected as a good artifact for the Crim algorithm). Fourth, although different algorithms may require different metrics, both algorithms selected $E_N$ and $T_N(pc)$, indicating a direction for defining a common subset of artifacts for evaluating algorithms. Finally, the small number of video sequences evaluated by the volunteers does not allow a generalization of our metric.
6. CONCLUSION
This paper has addressed the problem of bilayer video segmentation quality assessment when the segmented video frames are used to compose scenes in Augmented Reality environments. We showed that a state-of-the-art metric is not directly usable to evaluate segmentation in this context. Although the artifacts proposed by its authors did not prove totally irrelevant when the segmentation quality of one of the two algorithms under investigation was evaluated, the suggested artifact weights - which are used to combine these artifacts - produced results misaligned with the subjective evaluation. Finally, we showed that, for each segmentation method used in this experiment, newly adjusted artifacts can better represent the overall annoyance produced by the segmentation errors noticed by the users. These artifacts were used to compose the new objective metric presented in this paper.
7. REFERENCES
[1] R. Nakamura, L. L. M. Lago, A. B. Carneiro, A. J. C. Cunha, F. J. M. Ortega, J. L. Bernardes-Jr, and R. Tori, "3PI experiment: immersion in third-person view," in Proceedings of the SIGGRAPH Symposium on Video Games, New York, NY, USA, 2010, pp. 43-48, ACM.
[2] S. R. R. Sanches, D. M. Tokunaga, V. F. Silva, A. C. Sementille, and R. Tori, "Mutual occlusion between real and virtual elements in augmented reality based on fiducial markers," in IEEE Workshop on Applications of Computer Vision (WACV), 2012, pp. 49-54, IEEE Computer Society.
[3] P. Yin, A. Criminisi, J. Winn, and I. Essa, "Bilayer segmentation of webcam videos using tree-based classifiers," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 1, pp. 30-42, 2011.
[4] A. Criminisi, G. Cross, A. Blake, and V. Kolmogorov, "Bilayer segmentation of live video," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Washington, DC, USA, 2006, vol. 1, pp. 53-60, IEEE Computer Society.
[5] A. Parolin, G. P. Fickel, C. R. Jung, T. Malzbender, and R. Samadani, "Bilayer video segmentation for videoconferencing applications," in IEEE International Conference on Multimedia and Expo (ICME), 2011, pp. 1-6.
[6] S. R. R. Sanches, V. da Silva, and R. Tori, "Bilayer segmentation augmented with future evidence," in Computational Science and Its Applications - ICCSA, B. Murgante, O. Gervasi, S. Misra, N. Nedjah, A. Rocha, D. Taniar, and B. Apduhan, Eds., vol. 7334 of Lecture Notes in Computer Science, pp. 699-711, Springer Berlin / Heidelberg, 2012.
[7] E. Gelasca and T. Ebrahimi, "On evaluating video object segmentation quality: A perceptually driven objective metric," IEEE Journal of Selected Topics in Signal Processing, vol. 3, no. 2, pp. 319-335, 2009.
[8] S. Pechard, R. Pepion, and P. L. Callet, "Suitable methodology in subjective video quality assessment: a resolution dependent paradigm," in International Workshop on Image Media Quality and its Applications (IMQA), 2008.
[9] S. R. R. Sanches, D. M. Tokunaga, V. F. Silva, and R. Tori, "Subjective video quality assessment in segmentation for augmented reality applications," in XIII Symposium on Virtual Reality (SVR), 2012, pp. 46-55.
[10] P. Villegas, X. Marichal, and A. Salcedo, "Objective evaluation of segmentation masks in video sequences," in Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS), 1999, pp. 85-88.
[11] X. Marichal and P. Villegas, "Objective evaluation of segmentation masks in video sequences," in European Conference on Signal Processing (EUSIPCO), 2000, vol. 4, pp. 2193-2196.
[12] P. Correia and F. Pereira, "Objective evaluation of video segmentation quality," IEEE Transactions on Image Processing, vol. 12, no. 2, pp. 186-200, 2003.
[13] R. Qian and M. Sezan, "Video background replacement without a blue screen," in Proceedings of the International Conference on Image Processing (ICIP), 1999, vol. 4, pp. 143-146.
[14] F. Kozamernik, V. Steinmann, P. Sunna, and E. Wyckens, "SAMVIQ - A new EBU methodology for video quality evaluations in multimedia," SMPTE Motion Imaging Journal, vol. 114, no. 4, pp. 152-160, April 2005.