foreground/background image segmentation. what is our goal? to label each pixel in an image as...
TRANSCRIPT
Foreground/Background
Image Segmentation
What is our goal?
To label each pixel in an image as belonging to either the foreground of the scene or the background
Solution?
This problem can be solved efficiently by a minimum cut computation.
Likelihood and separation parameters
For each pixel i we have a likelihood ai that it belongs to the foreground and a likelihood bi that it belongs to the background.
We can label a pixel i as belonging to the foreground
if ai > bi, and to the background otherwise. We must also consider a pixel’s neighbours. If many
neighbours are in the background we would be more inclined to label i as background. Thus, for each pair(i,j) of neighbouring pixels there is a separation penalty pij >= 0 if both pixels don’t belong to foreground or background.
We can define our Segmentation Problem as finding an partition of the set of pixels into sets A and B (foreground and background respectively) so as to maximize the following sum:
(1) p- b ),(
|1j}{i,A| Ej)(i,
ij
BjAi
j
iaBAq
Defining our problem mathematically
This is a maximization problem though. Minimum cut algorithm is a minimization problem
Converting our problem to a minimum cut problem
i
i i )b(a Q b BjAi
j
ia
Bjj
Aii a - b - Q
|1j}{i,A|
Ej)(i,
ijBj
jAi
i p-a - b - QB)q(A,
In equation (1) we are defining a maximization problem. We must modify (1) to make our problem a minimization problem.
Let . The sum: equals
. As a result we can rewrite (1) as:
|1j}{i,A|
Ej)(i,
ijBj
jAi
i p a bB)(A,q'
. Maximizing q(A,B) is the
same as minimizing q’(A,B):
Constructing our graph (1) Let V be the set of pixels and E to denote the set of all
pairs of neighbouring pixels. We obtain an undirected graph G=(V,E).
We create a source node s to represent the foreground and a sink node t to represent the background. We attach each of s and t to every pixel and use ai, bi for capacities between pixel i and the source and sink respectively.
For each pair (i,j) we create instead of one undirected, two directed edges (i,j) and (j,i) with capacity pij
(separation parameter)
Constructing our graph (2)
Minimum cut(A,B) An s-t cut(A,B) is a partition of our pixels into sets A
(foreground) and B (background).
Edges (s,j), jєΒ contribute aj capacity to the cut Edges (i,t), iєA contribute bi capacity to the cut Edges (i,j), iєA jєΒ contribute pij capacity to the cut
If we add these contributions we get:
B)(A,q' p a b),(
|1j}{i,A| Ej)(i,
ijBj
jAi
i
BAc