chapter 10 image segmentation 國立雲林科技大學資訊工程所張傳育 (chuan-yu chang )...

Chapter 10Image Segmentation

國立雲林科技大學資訊工程所張傳育 (Chuan-Yu Chang ) 博士Office: EB 212TEL: 05-5342601 ext. 4337E-mail: [email protected]: http://MIPL.yuntech.edu.tw

2

Introduction Image Segmentation

Subdivides an image into its constituent regions or objects. Segmentation should stop when the objects of interest in

an application have been isolated. Segmentation accuracy determines the eventual success

or failure of computerized analysis procedures. Image segmentation algorithm generally are based on one

of two basic properties of intensity values: Discontinuity

Partitioning an image based on abrupt changes in intensity. Similarity

Partitioning an image into regions that are similar according to a set of predefined criteria.

3

Detection of discontinuities There are three basic types of gray-level discontinuities

Points, lines, and edges. The most common way to look for discontinuities is to run a

mask through the image in the manner described in Section 3.5. (sum of product)

9

1

992211 ...

iii zw

zwzwzwR(10.1-1)

4

Detection of discontinuities Point detection

An isolated point will be quit different from its surroundings. Measures the weighted differences between the center point and

its neighbors. A point has been detected at the location on which the mask is

centered if

where T is a nonnegative threshold.

TR

飛機渦輪葉片的 X光影像

有小破洞

Mask 計算後的結果

取 (c) 圖最大灰階的 90%作為 threshold後的結果

5

Detection of discontinuities Line detection

Let R1, R2, R3, and R4 denote the response of the mask in following.

Suppose that the four masks are run individually through an image.

If, at a certain point in the image, |Ri|>|Rj|, for all j≠i, that point is said to be more likely associated with a line in the direction of mask i.

If we are interested in detecting all the lines in an image in the direction defined by a given mask, we simply run the mask through the image and threshold the absolute value of the result.

The coefficients in each mask sum to zero.

6

Detection of discontinuities We are interested in finding all the lines that are one pixel

thick and are oriented at -45. Use the last mask shown in Fig. 10.3.

偵測到的 45度 line

強度最強的line

7

Detection of discontinuities Edge detection

An edge is a set of connected pixels that lie on the boundary between two regions.

The “thickness” of the edge is determined by the length of the ramp. Blurred edges tend to be thick and sharp edges tend

to be thin.

8

Detection of discontinuities

The first derivative is positive at the points of transition into and out of the ramp as we move from left to right along the profile. It is constant for points in the ramp, It is zero in areas of constant gray level.

The second derivative is positive at the transition associated with the dark side of the edge, negative at the transition associated wit the light side of the edge, and zero along the ramp and in areas of constant gray level.

The magnitude of the first derivative can be used to detect the presenceof an edge.The sign of the second derivative can be used to determine whetheran edge pixel lies on the dark or light side of an edge.

9

Edge detection (cont.)

Two additional properties of the second derivative around an edge: It produces two values for every edge in an

image. An imaginary straight line joining the extreme

positive and negative values of the second derivative would cross zero near the midpoint of the edge. The zero-crossing property of the second derivative is

quit useful for locating the centers of thick edges.

10

Detection of discontinuities The entire transition from black to white is a single edge.

Image and gray-level profiles of a ramp edge

First derivative image and the gray-level profile

Second derivative image and the

gray-level profile

=0.1

=1

=10

=0

11

Edge detection (cont.) The second derivative is even more sensitive to noise. Image smoothing should be a serious consideration prior to the

use of derivatives in applications . Summaries of edge detection

To be classified as a meaningful edge point, the transition in gray level associated with that point has to be significant stronger than the background at that point. Use a threshold to determine whether a value is “significant” or not. We define a point in an image as being an edge point if its two

dimensional first-order derivative is greater than a specified threshold.

A set of such points that are connected according to a predefined criterion of connectedness is by definition an edge.

Edge segmentation is used if the edge is short in relation to the dimensions of the image.

A key problem in segmentation is to assemble edge segmentations into linger edges.

If we elect to use the second-derivative to define the edge points in an image as the zero crossing of its second derivative.

12

Detection of discontinuities Gradient operator

The gradient of an image f(x,y) at location (x,y) is defined as the vector

the gradient vector points is the direction of maximum rate of change of f at coordinates (x,y).

The magnitude of the vector denoted ∇f, where

The direction of the gradient vector denoted by

The angle is measured with respect to the x-axis.

2122)( yx GGmagf f

y

fx

f

G

G

y

xf

x

y

G

Gyx 1tan),(

13

Detection of discontinuities Roberts cross-gradient operator

Gx=(z9-z5) Gy=(z8-z6) Masks of size 2x2 are awkward to

implement because they do not have a clear center.

Prewitt operator Gx=(z7+z8+z9)-(z1+z2+z3) Gy=(z3+z6+z9)-(z1+z4+z7)

Sobel operator Uses a weight of 2 in the center

coefficient. Gx=(z7+2z8+z9)-(z1+2z2+z3) Gy=(z3+2z6+z9)-(z1+2z4+z7)

14

Detection of discontinuities Computation of the gradient requires Gx and Gy be combined in Eq.

(10.1-4), however, this implementations is not always desirable because of the computational burden required by squares and square roots.

An approach used frequently is to approximate the gradient by absolute values:

The two additional Prewitt and Sobel masks for detecting discontinuities in the diagonal directions are shown in Fig. 10.9

yx GGf

用來偵測對角邊界的 Prewitt 及 Sobel

mask 。

15

Detection of discontinuities Fig. 10.10 shows the response of the two components of

the gradient, |Gx| and |Gy|. The gradient image formed the sum of these two

components.

16

Detection of discontinuities Figure 10.11 shows the same sequence of images

as in Fig. 10.10, but with the original image being smoothed first using a 5x5 averaging filter. The response of each mask no shows almost no

contribution due to the bricks, with the result being dominated mostly by the principal edges.

17

Detection of discontinuities The horizontal and vertical Sobel masks respond about equally

well to edges oriented in the minus and plus 45° direction. If we emphasize edges along the diagonal directions, the one of

the mask pairs in Fig. 10.9 should be used. The absolute responses of the diagonal Sobel masks are shown

in Fig. 10.12. The stronger diagonal response of these masks is evident in

these figures.

18

Detection of discontinuities The Laplacian of a 2-D function f(x,y) is a second-order

derivative defined as

For a 3x3 region, one of the two forms encountered most frequently in practice is ( 水平與垂直邊 )

A digital approximation including the diagonal neighbors is given by ( 含水平、垂直、及對角邊 )

2

2

2

22

y

f

x

ff

864252 4 zzzzzf

9876432152 8 zzzzzzzzzf

(10.1-13)

(10.1-14)

(10.1-15)

19

Detection of discontinuities The Laplacian generally is not used in its original

form for edge detection for several reasons: As a second-order derivative, the Laplacian typically is

unacceptably sensitive noise. The magnitude of the Laplacian produces double edges. The Laplacian is unable to detect edge direction.

The role of the Laplacian in segmentation consists of Using its zero-crossing property for edge location. Using it for complementary purpose of establishing whether

a pixel is on the dark or light side of an image.

20

Detection of discontinuities Laplacian of a Gaussian (LoG)

The Laplacian is combined with smoothing as a precursor to finding edges via zero-crossing, consider the function

where r2=x2+y2 and is the standard deviation. Convolving this function with an image blurs the image, with

the degree of blurring being determined by the value of . The Laplacian of h (the second derivative of h with respect to

r) is

The function is commonly referred to as the “Laplacian of a Gaussian” (LoG), sometimes is called the ”Mexican hat” function.

2

2

2)( r

erh

2

2

22

222 )(

r

er

rh

(10.1-16)

(10.1-17)

21


The purpose of the Gaussian function in the LoG formulation is to smooth the image, and the purpose of the Laplacian operator is to provide an image with zero crossings used to establish the location of edges.

22

Detection of discontinuities Fig. 10.15(c) is a spatial Gaussian

function (with a standard deviation of five pixels) used to obtain a 27x27 spatial smoothing mask. The mask was obtained by sampling this Gaussian function at equal interval.

∇2h can be computed by application of (c) followed by (d).

The LoG result shown in Fig. 10.15(e) is the image from which zero crossings are computed to find edges. One straightforward approach for

approximating zero-crossings is to threshold the LoG image by setting all its positive values to white, and all negative values to black.

Zero-crossing occur between positive and negative values of the Laplacian.

Estimated zero-crossing, obtained by scanning the threshold image and noting the transitions between black and white.

23


Comparing Figs/ 10.15(b) and (g) The edges in the zero-crossing image are thinner

than the gradient edges. The edges determined by zero-crossings form

numbers closed loops. “spaghetti effect” is one of the most serious drawbacks

of this method. Another major drawback is the computation of

zero crossing.

24

Edge Linking and Boundary Detection Ideally, edge detection should yield pixels lying only on

edges. In practice, this set of pixels seldom characterizes an

edge completely because of noise, breaks in the edge from nonuniform illumination, and other effects that introduce spurious intensity discontinuities.

Thus, edge detection algorithms are followed by linking procedures to assemble edge pixels into meaningful edges.

Local Processing To analyze the characteristics of pixels in a small neighborhood

about every point (x,y) in an image that has been labeled an edge point.

All points that are similar according to a set of predefined criteria are linked.

25

Edge Linking and Boundary Detection The two principal properties used for

establishing similarity of edge pixels: The strength of the response of the gradient

operator used to produce the edge pixel. As defined in Eq.(10.1-4)

The direction of the gradient vector. As defined in Eq. (10.1-5)

26

Edge Linking and Boundary Detection An edge pixel with coordinates (x0,y0) in a predefined neighborhood of

(x,y), is similar in magnitude to the pixel at (x,y) if

where E is a nonnegative threshold An edge pixel at (x0,y0) is the predefined neighborhood of (x,y) has an

angle similar to the pixel at (x,y) if

where A is a nonnegative angle threshold A point in the predefined neighborhood of (x,y) is linked to the pixel at

(x,y) if both magnitude and direction criteria are satisfied 。 This process is repeated at every location in the image. A record must be kept of linked points as the center of the

neighborhood is moved from pixel to pixel.

Eyxfyxf ),(),( 00

Ayxyx ),(),( 00

27

Edge Linking and Boundary Detection Example 10-6: the objective is to find rectangles whose sizes

makes them suitable candidates for license plates. The formation of these rectangles can be accomplished by

detecting strong horizontal and vertical edges. Linking all points, that had a gradient value greater than 25 and

whose gradient directions did not differ by more than 15°.

使用垂直的Sobel operator

使用水平的Sobel operator

分別對圖 (b) 及(c) 進行 edge linking 的動作，將梯度大於 25 ，且角度小於 15 度的點連起來。

28

Edge Linking and Boundary Detection Global Processing via the Hough Transform

Points are linked by determining first if they lie on a curve of specified shape.

Given n points in an image, suppose that, we want to find subsets of these points that lie on straight lines.

Consider a point (xi,yi) and the general equation of a straight line in slope-intercept form, yi=axi+b.

Infinitely many lines pass through (xi,yi), but they all satisfy the equation yi=axi+b for varying values of a and b.

However, writing this equation as b=-xia+yi and considering the ab-plane yields the equation of a single line for a fixed pair (xi,yi) .

A second point (xj,yj) also has a line in parameter space associated with it, and this line intersects the lines associated with (xi,yi) at (a’, b’).

All points contained on this line have lines in parameter space that intersect at (a’, b’).

29

Edge Linking and Boundary Detection Hough Transform

Subdividing the parameter space into so-called accumulator cell Initially, these cells are set to 0. For every point (xk, yk) in the image plane, we let the parameter a equal each of the allowed

subdivisions values on the a-axis and solve for the corresponding b using the equation b=-xka+yk.

The resulting b’s are then rounded off to the nearest allowed value in the b-axis. If a choice of ap results in solution bq， we let A(p,q)=A(p,q)+1 . At the end of this procedure, a value of Q in A(i,j) corresponds to Q points in the xy-plane lying

on the line y=aix+bj. The number of subdivisions in the ab-plane determines the accuracy of the co-linearity of

these points.

30

Edge Linking and Boundary Detection A problem with using the equation y=ax+b to represent a line

is that the slope approaches infinity as the line approaches the vertical. To use normal representation of a line

Figure 10.19(a) shows the geometrical interpretation od the parameters used in Eq. (10.2-3). Q collinear points lying on a line xcosj +ysinj=i yield Q sinusoidal

curves that intersect at (i,j) in the parameter space. Incrementing and solving for the corresponding gives Q entries in

accumulator A(i,j) associated with the cell determined by (i,j) .

sincos yx (10.2-3)

31

Edge Linking and Boundary Detection

X, Y 平面上有五個點 (1, 2, 3, 4, 5)

X, Y 平面上五個點(1, 2, 3, 4, 5) ，在平面的曲線

從交點 A 知道，點 1, 3, 5) 共線。交點 B 表示點 2,3,4 共線

32

Edge Linking and Boundary Detection Edge-linking based on Hough transform

Compute the gradient of an image and threshold it to obtain a binary image.

Specify subdivisions in the -plane. Examine the counts of the accumulator cells for high pixel

concentrations. Examine the relationship between pixels in a chosen cell.

( 依其對應的找出直線 ) 。 Based on computing the distance between disconnected

pixels identified during traversal of the set of pixels corresponding to a given accumulator cell.

A gap at any point is significant if the distance between that point and its closest neighbor exceeds a certain threshold.

33

Edge Linking and Boundary Detection Fig. (a) is an aerial infrared image containing two hangars and a runway.

Fig. (b) is a thresholded gradient image obtained using the Sobel operator. Fig. (c) shows the Hough transform of the gradient image. Fig. (d) shows the set of pixels linked according to the criteria

They belonged to one of the three accumulator cells with the highest count. No gaps were longer than five pixels.

34


Global Processing via Graph-Theoretic Techniques A global approach for edge detection and linking based on representing

edge segments in the form of a graph and searching the graph for low-cost paths that correspond to significant edges.

This representation provides a rugged approach that performs well in the presence of noise.

Graph G=(N,U) N: set of node U: unordered pairs of distinct elements of N Each pair (ni,nj) of U is called arc ， ni, is said to be a parent ， nj is said to be a

successor 。 The process of identifying the successor of a node is called expansion 。 In each graph we define levels, such that level 0 consists of a single node,

called the start or root, and the nodes in the last level are called goal nodes. Cost (ni,nj) can be associated with every arc (ni,nj). A sequence of nodes n1, n2,…,nk, with each node ni being a successor of node ni-

1, is called a path from n1 to nk. The cost of the entire path is

p q

k

iii nncc

21,

35


座標灰階值

成本

)()(),( qfpfHqpc

Each edge element defined by pixels p and q,has an associated cost, defined as

where H is the highest gray-level valuein the image, and f(x) is the gray-levelvalue of x.

36

Edge Linking and Boundary Detection By convention, the point p is on the right-hand side of the direction

of travel along edge elements. To simplify, we assume that edges start in the top row and

terminate in the last row. p and q are 4-neighbors. An arc exists between two nodes if the two corresponding edge

elements taken in succession can be part of an edge. The minimum cost path is shown dashed. Let r(n) be an estimate of the cost of a minimum-cost path from s

to n plus an estimate of the cost of that path from n to a goal node;

Here, g(n) can be chosen as the lowest-cost path from s to n found so far, and h(n) is obtained by using any variable heuristic information.

nhngnr (10.2-7)

37


38

Edge Linking and Boundary Detection Graph search algorithm Step1: Mark the start node OPEN and set g(s)=0. Step 2: If no node is OPEN exit with failure; otherwise, continue. Step 3: Mark CLOSE the OPEN node n whose estimate r(n) computed from Eq.

(10.2-7) is smallest. Step 4: If n is a goal node, exit with the solution path obtained by tracing back

through the pointets; otherwise, continue. Step 5: Expand node n, generating all of its successors (If there are no

successors go to step 2) Step 6: If a successor ni is not marked, set

Step 7: if a successor ni is marked CLOSED or OPEN, update its value by letting

Mark OPEN those CLOSED successors whose g’ value were thus lowered and redirect to n the pointers from all nodes whose g’ values were lowered. Go to Step 2.

ii nncngnr ,

iii nncngngng ,,min'

39

Edge Linking and Boundary Detection Example 10-9: noisy chromosome silhouette and an edge found using a heuristic graph search.

The edge is shown in white, superimposed on the original image.

40

Thresholding Thresholding

To select a threshold T, that separates the objects from the background. Then any point (x,y) for which f(x,y)>T is called an object point;

otherwise, the point is called a background point. Multilevel thresholding

Classifies a point (x,y) as belonging to one object class if T1 <f(x,y) ≤T2, and to the other object class if f(x,y) >T2

And to the background if f(x,y) ≤T2

41

Thresholding In general, segmentation problems requiring multiple thresholds are

best solved using region growing methods. The thresholding may be viewed as an operation that involves tests

against a function T of the form

where f(x,y) is the gray-level of point (x,y) and p(x,y) denotes some local property of this point.

A threshold image g(x,y) is defined as

Thus, pixels labeled 1 correspond to objects, whereas pixels labeled 0 correspond to the background.

When T depends only on f(x,y) the threshold is called global. If T depends on both f(x,y) and p(x,y), the threshold is called local.

If T depends on the spatial coordinates x and y, the threshold is called dynamic or adaptive.

yxfyxpyxTT ,,,,,

Tyxfif

Tyxfifyxg

),( 0

),( 1,

42

The role of illumination An image f(x,y) is formed as the product of a

reflectance component r(x,y) and an illumination component i(x,y).

In ideal illumination, the reflective nature of objects and background could be easily separable.

However, the image resulting from poor illumination could be quit difficult to segment.

Taking the natural logarithm of Eq.(10.3-3)

yxryxiyxf ,,,

yxryxi

yxryxi

yxfyxz

,','

,ln,ln

,ln,

(10.3-4)

(10.3-5)

43

The role of illumination

From probability theory, If i’(x,y) and r’(x,y) are independent random

variables, the histogram of z(x,y) is given by the convolution of the histograms of i’(x,y) and r’(x,y).

44

Fig. (a)*Fig(c)

),(),(),( yxryxiyxf

影像 f(x,y) 可看作是反射量 r(x,y) 和照度 i(x,y)

的乘積。

電腦產生的反射函數

電腦產生的照度函數

物體和背景的反射特性，使她們容易被分割，但差的照明，會使產生的

影像難以分割。

The role of illumination

45

Thresholding Basic global thresholding

Select an initial estimate for T Segment the image using T

G1 : consisting of all pixels with gray level values >T G2 : consisting of all pixels with gray level values <=T

Compute the average gray level values 1 and 2 for pixels in regions G1 and G2.

Compute a new threshold value T=0.5*(1 + 2)

Repeat step 2 through 4 until the difference in T in successive iterations is smaller than a predefined parameter T0.

The parameter T0 is used to stop the algorithm after changes become small.

46

Basic Global Thresholding To partition the image histogram by using a single global

threshold T. Segmentation is then accomplished by scanning the image

pixel by pixel and labeling each pixel as object or background, depending on whether the gray level of that pixel is great or less than the value T.

47

Basic Global Thresholding Fig. (a) is the original image, (b) is the image histogram.

The clear valley of the histogram. Application of the iterative algorithm resulted in a value of 125.4

after three iterations starting with the average gray level and T0=0. The result obtained using T=125 to segment the original image is

shown in Fig. (c).

48

Thresholding Basic adaptive thresholding

Imaging factors such as uneven illumination can transform a perfectly segmentable histogram into a histogram that cannot be partitioned effectively by a single global threshold.

To divide the original image into subimages and then utilize a different threshold to segment each subimage.

The key issues are How to subdivide the image? How to estimate the threshold for each resulting subimages?

Global threshold手動將 T 設在山谷

處。將原始影像根據亮度的變化分成 16 區塊。

49

圖 10.30(c) 的 (1,2)及 (2,2) 子影像

灰階值得分佈極不均勻，全域門限法注定失敗

分成更多的子影像

Thresholding

50

Thresholding (cont.) Optimal Global and Adaptive Thresholding

Estimating threshold that produce the minimum average segmentation error.

Suppose that an image contains only two principal gray-level regions.

Let z denote gray-level values. We can view these values as random quantities, and their

histogram may be considered an estimate of their probability density function, p(z).

The overall density function is the sum or mixture of two densities in the image.

If the form of the densities is known or assumed, it is possible to determine an optimal threshold for segmenting the image into two distinct regions.

51

Thresholding (cont.) Optimal Global and Adaptive Thresholding

The mixture probability density function describing the overall gray-level variation in the image is

Here, P1 and P2 are the probabilities of occurrence of the two classes of pixels.

Assuming that any given pixel belongs either to an object or to the background,

)()()( 2211 zpPzpPzp

background

object

121 PP

(10.3-5)

(10.3-6)

52

Thresholding (cont.) An image is segmented by classifying as background all pixels with

gray levels greater than a threshold T. All other pixels are called object pixels.

The main objective is to select the value of T that minimizes the average error in making the decisions that a given pixel belongs to an object or to the background.

The probability of erroneously classifying a background point as an object point is

The probability of erroneously classifying a object point as an background point is

The overall probability of error is

TdzzpTE )()( 21

)()()( 2112 TEPTEPTE

將背景誤認為物體的機率。

將物體誤認為背景的機率。

總誤差

T

dzzpTE )()( 12

(10.3-7)

(10.3-8)

(10.3-9)

53

Thresholding (cont.) To find the threshold value for which this error is minimal requires

differentiating E(T) with respect to T and equating the result to 0. The result is

This equation is solved for T to find the optimum threshold. If P1=P2, the optimum threshold is where the curves for p1(z) and p2(z) intersect.

Obtaining an analytical expression for T requires that we know the equations for the two PDFs. Estimating these densities in practice is not always feasible. One of

the principal densities used is the Gaussian density, which is completely characterized by two parameters, the mean and the variance ( 以 Gaussian 密度函數來近似 p(z))

where 1 and are the mean and variance of the Gaussian density of one class of pixels.

22

22

21

21

2

2

22

1

1

22

zz

eP

eP

zp

)()( 2211 TpPTpP

為求得最小誤差的門限值 T ，須求 E(T) 對 T 的偏微分。(10.3-10)

(10.3-11)

54

Thresholding (cont.) Using this equation in the general solution of Eq.(10.3-10) results

in the following solution for the threshold T: (10.3-10) 式的通解 )

where

Since a quadratic equation has two possible solution, two threshold values may be required to obtain the optimal solution.

If the variances are equal, a single threshold is sufficient:

If P1=P2, the optimal threshold is the average of the means

02 CBTAT

2112

22

21

21

22

22

21

212

221

22

21

/ln2

2

PPC

B

A

1

2

21

221 ln

2 P

PT

(10.3-12)

(10.3-13)

(10.3-14)

55

Thresholding (cont.) Instead of assuming a functional form for p(z), a minimum

mean-square-error approach may be used to estimate a composite gray-level PDF of an image from the image histogram.

The mean square error between the (continuous) mixture density p(z) and the (discrete) image histogram h(zi) is

where an n-point histogram is assumed. The principal reason for estimating the complete density is to

determine the presence or absence dominant modes in the PDF. Two dominant modes typically indicate the presence of edges in

the image (or region) over which the PDF is computed.

n

iiims zhzp

ne

1

21 (10.3-15)

56

原始的心臟影像，打有顯影劑，目的在於描繪出左心室的輪廓。前處理 :

(1) 每個像素點經 log function 轉換(2) 將打藥前與打藥後的影像相減，以消除脊椎部份。(3) 將多張心臟影像平均以消除雜訊。

Thresholding (cont.) Example 10.13

The general problem is to outline automatically the boundaries of heart ventricles in cardioangiograms.

Pre -processings: Each pixel was mapped with a log function to counter exponential

effects caused by radioactive absorption. An image obtained before application of the contrast medium was

subtracted from each image captured after the medium was injected in order to remove the spinal column present in both images.

Several angiograms were summed to reduce random noise.

57Block A 和 Block B的影像 histogram

Thresholding (cont.) Each preprocessed image was subdivided into 49 regions by placing a 7x7

grid with 50% overlap over each image. Each of the 49 resulting overlapped regions contained 64x64 pixels. The histogram for region A is bimodal, indicating the presence of a boundary. The histogram for region B is unimodal, indicating the absence of two

markedly distinct regions. After all 49 histograms were computed, a test of bimodality was performed to

reject the unimodal histograms. The remaining histograms were then fitted by bimodal Gaussian density

curves [see Eq. (10.3-11)] using a conjugate gradient hill-climbing method to minimize the error function given in Eq(10.3-15).

58

Thresholding (cont.) The x’s and o’s in Fig. 10.34 (a) are two fits to the histogram shown in

black dots. The optimal thresholds were then obtained by using Eqs.(10.3-12) and .

(10.3-13) For non-bimodal regions, the thresholds were obtained by interpolating

these thresholds. Then a second interpolation was carried out point by point using

neighboring threshold values so that, at the end of the procedure, every point in the image had been assigned a threshold.

Finally, a binary decision was carried out for each pixel using the rule

otherwise0

if1, xyTf(x,y)yxf

59

Thresholding Use of boundary characteristics for histogram improvement and

local thresholding The chances of selecting a “good” threshold are enhanced

considerably if the histogram peaks are tall, narrow, symmetric, and separated by deep valleys.

One approach for improving the shape of histogram is to consider only those pixels that lie on or near the edges between objects and the background. 。

If only the pixels on or near the edge between object and the background were used, the resulting histogram would have peaks of approximately the same height.

In practice the valleys of histograms formed from the pixels selected by a gradient/Laplacian criterion can be expected to be sparsely populated.

0

0

0

,2

2

fandTf

fandTf

Tf

yxs非邊界點標為 0

邊界點的深色邊標為 +

邊界點的亮色邊標為 -

60

Thresholding (cont.) Figure 10.36 shows the labeling produced by Eq. (10.3-16) for an

image of a dark, underlined stroke written on a light background. The information obtained with this procedure can be used to

generate a segmented, binary image in which 1’s correspond to object of interest and 0’s correspond to the background.

The transition from a light background to a dark object must be characterized by the occurrence of a – followed by a + in s(x,y).

The interior of the object is composed of pixels that are labeled either 0 or +.

A horizontal or vertical scan line containing a section of an object has the following structure:(…)(-,+)(o or +)(+, -)(…)

61

Example 10.14

An ordinary scenic bank check

The histogram as a function of gradient values for pixels with gradients greater than 5.This histogram has two dominant modes that are symmetric, nearly of the same height, and are separated by a distinct valley.

The segmented result obtained by using Eq.(10.3-16) with T at or near of the midpoint of the valley.

Thresholding (cont.)

62

Thresholding (cont.) Thresholds based on several variables

Multispectral thresholding Constructing a 3-D “histogram” 。 The basic procedure is analogous to the method used for one variable. The concept of the threshold now becomes one of finding clusters of points in

3-D space. The principal difficultly is that cluster seeking becomes an increasingly

complex task as the number of variables increases.

A monochrome picture of a color

photograph.

Obtained by thresholding about one of the histogram clusters corresponding to facial tones.

Obtained by thresholding about a cluster close to the red axis.

63

Region-Based Segmentation Basic Formulation

Let R represent the entire image region. We may view segmentation as a process that partitions R into

n subregions, R1, R2,…,Rn

jiFALSERRP

niTRUERP

jijiRR

,..., n, iRb

RRa

ji

i

ji

i

n

ii

for (e)

,...,2,1for )( (d)

, and allfor (c)

21 region, connected a is )(

)(1 Every pixel must be in a region

Points in a region must be connected

in some predefined sense.

The regions must be disjoint.

The properties must be satisfied by the pixelsin a segmented region

Regions Ri and Rj are different in the sense ofpredicate P.

64

Region-Based Segmentation Region Growing

Group pixels or subregions into regions based on predefined criteria.

Start with a set of “seed” points and from these grow regions by appending to each seed those neighboring pixels that have properties similar to the seed.

Problems about region growing: Selecting a set of seed points Selecting of similarity criteria depends not only on the

problem under consideration, but also on the type of image data available.

The formulation of a stooping rule. Growing a region should stop when no more pixels satisfy the criteria

for inclusion in that region.

65

Region-Based Segmentation Example 10-16 焊接點的檢測

To use region growing to segment the regions of the weld failures.

An X-ray image of a weld, the pixels of defective welds tend to have values of 255.

We selected as starting points all pixels having values of 255.

Criteria for region growing:(1) The absolute gray-level difference between any pixel and the seed had to be less than 65.(2)To be included in one of the regions, the pixel had to be 8-connected to at least one pixel in

that region

66

Region-Based Segmentation Problems having multimodal histograms generally are best

solved using region-based approach. Histogram of Fig.10.40 .( 無法清楚分離物體與背景 )

This problem cannot be solved effectively by any reasonably general method of automatic threshold selection based on gray levels alone.

The use of connectivity was fundamental in solving the problem.

67

Region-Based Segmentation Region Splitting and Merging

To subdivide an image initially into a set of arbitrary, disjointed regions and then merge and/or split the regions in an attempt to satisfy the conditions stated in Section 10.4.1. ( 一開始將影像分割成任意子影像，然後再合併或分割成滿足條件的 segmentation 。 )

Let R represent the entire image region and select a predicate P. Subdivide R successively into smaller and smaller quadrant regions

so that, for any region Ri, P(Ri)=TRUE We start with the entire region. If P(R)=FALSE, we divide the image into

quadrants. ( 表示區域 R 內的 pixel 有不同的灰階值 ) ，則將 R 分成四的子影像。

If P is FALSE for any quadrant, we subdivide that quadrant into subquadrants, and so on. ( 如果子影像的 P 為 FALSE 則繼續分割成四個子影像。 )

This splitting technique is called “quadtree” The root of the tree corresponds to the entire image and that node

corresponds to a subdividion.

68

Region-Based Segmentation

Region Splitting and Merging

69


Region Splitting and Merging Split into four disjoint quadrants any region Ri for

which P(Ri)=FALSE.

Merge any adjacent regions Rj and Rk for which P(Rj ∪Rk)=TRUE.

Stop when no further merging or splitting is possible.

70


Original maple leaf.

We define P(Ri)=TRUE if at least 80% of the pixels in Ri have the property |zj-mi|<=2i, where zi is the gray level of the j-th pixel in Ri, mi

is the mean gray level of that region, and i is the standard deviation of the gray levels in Ri.

The shading and the stem of the leaf were erroneously eliminated by the thresholding procedure.

71

Segmentation by Morphological Watersheds Basic Concepts

The concept of watersheds is based on visualizing an image in three dimensions: two spatial coordinates versus gray levels.

In such a “topographic” interpretation, we consider three types of points: Points belonging to a regional minimum. ( 區域最小值的點。 ) Points at which a drop of water, if placed at the location of any

of those points, would fall with certainty to a single minimum. [ 一滴水的點 ( 單一最小值 ) 。形成集水盆 (catchment basin)或分水嶺 (watershed)] The set of points satisfying condition (b) is called the catchment

basin or watershed of that minimum. Points at which water would be equally likely to fall to more

than one such minimum. 水可能會掉入的多個最小值的點。形成峰線 (crest line) The points satisfying condition (c) form crest lines on the

topographic surface and are termed divide lines or watershed lines.

The principal objective of segmentation algorithms based on these concepts is to find the watershed line 。

72

Segmentation by Morphological Watersheds The basic idea is

Suppose that a hole is punched in each regional minimum. The entire topography is flooded from below by letting

water rise through the holes at a uniform rate. When the rising water in distinct catchment basins is about

to merge, a dam is built to prevent the merging. The flooding will eventually reach a stage when only the

tops of the dams are visible above the water line. These dam boundaries correspond to the divide lines of the

watersheds. They are the continuous boundaries extracted by a watershed

segmentation algorithm.

73

A simple gray-scale image

A topographic view.

A hole is punched in each regional minimum and that the entire topography is flooded from below by letting water rise through the holes at a uniform rate.

The water has risen into the first and second catchment basins

As the water continues to rise, it will eventually

overflow from one catchment basin

into another.

Segmentation by Morphological Watersheds

74

Water from the left basin actually overflowed into the basin on the right and a short dam was built to prevent water from merging at that level of flooding.

水持續溢入第二個集水區

The final dams corresponds to the watershed lines, which are the desired segmentation result.


75

Segmentation by Morphological Watersheds Watershed 法的優點：

Watershed line form a connected path, thus giving continuous boundaries between regions

Watershed segmentation is in the extraction of nearly uniform (bloblike) objects from the background.

76

Dam Construction Dam construction is based on binary images. The simple way to construct dams separating sets of binary point

is to use morphological dilation. Suppose that each of the connected components is dilated by the

structuring element, subject to two conditions: The dilation has to be constrained to q. (the center of the structuring

element can be located only at points in q during dilation) The dilation cannot be performed on points that would cause the sets

being dilated to merge (become a single connected component). Only points in q that satisfy the two conditions under

consideration describe the one-pixel-thick connected path. This path constitutes the desired separating dam at stage n of

flooding. Construction of the dam at this level of flooding is completed by

setting all the points in the path just determined to a value greater than the maximum gray-level value of the image. The height of all dams is generally set at 1 plus the maximum allowed

value in the image.

77

Dam Construction使用二元化影像的 dilation

Two partially flooded catchment basins at stage n-1 of flooding. 在第

n-1次注入水後的集水區 1 ， Cn-

1(M1)

在第 n-1次注入水後的集水區 2 ， Cn-

1(M2)

C[n-1]

Flooding at stage n, showing that

water has spilled between basins.第 n次注入水後

的集水區 q

第一次dilation

第二次dilation

將二次 dilation重疊的區域建成水壩

78

Watershed Segmentation Algorithm Let M1, M2,…,MR be sets denoting the coordinates of the points in

the regional minima of an image g(x,y). Let C(Mi) be a set denoting the coordinates of the points in the

catchment basin associated with regional minimum Mi. Let T[n] represent the set of coordinates (s,t) for which g(s,t)<n,

That is

T[n] is the set of coordinates of points in g(x,y) lying below the plane g(x,y)=n.

The topography will be flooded in integer flood increments, from n = min + 1 to n = max + 1.

At any step n of the flooding process, the algorithm needs to know the number of points below the flood depth.

Suppose that the coordinates in T[n] that are below the plane g(x,y) = n are “marked” black, and all other coordinates are marked white.

When we look “down” on the xy-plane at any increment n of flooding, we will see a binary image in which black points correspond to points in the function that are below the plane g(x,y)=n.

ntsgtsnT ,, (10.5-1)

79

Watershed Segmentation Algorithm Let Cn(Mi) denote the set of coordinates of points in the

catchment basin associated with minimum Mi that are flooded at stage n.

Cn(Mi) may be viewed as a binary image given by

Cn(mi)=1 at location (x,y) if (x,y) C(Mi) AND (x,y) T[n]; otherwise Cn(Mi)=0.

Next, we let C[n] denote the union of the flooded catchment basins portions at stage n:

Then C[max+1] is the union of all catchment basins:

nTMCMC iin

R

iin MCnC

1

R

iiMCC

1

1max

(10.5-2)

(10.5-3)

(10.5-4)

80

Watershed Segmentation Algorithm Procedure for obtaining C[n] from C[n-1]

Let Q denote the set of connected components in T[n]. For each connected component qQ[n], there are three

possibilities: q∩C[n-1] is empty. q∩C[n-1] contains one connected component of C[n-1] . q∩C[n-1] contains more than one connected component of C[n-1] . Condition (a) occurs when a new minimum is encountered, in which

case connected component q is incorporated into C[n-1] to form C[n]. Condition (b) occurs when q lies within the catchment basin of some

regional minimum, in which case q is incorporated into C[n-1] to form C[n].

Condition (c) occurs when all, or part, of a ridge separating two or more catchment basins is encountered. Further flooding would cause the water level in these catchment basins to merge. Thus a dam must be built within q to prevent overflow between the catchment basins.

81

Original blobs image Image grident

Watershed lines of the gradient

image

Watershed lines superimposed on

original image


82


直接應用 watershed segmentation 會導致 over segmentation 可使用 marker 來控制 over segmentation 。 Marker 是影像中相連接的元件。

Internal marker 結合有興趣的物件。 External marker 結合背景。

83

Segmentation by Morphological Watersheds The use of marker Direct application of the watershed segmentation algorithm

generally leads to oversegmentation due to noise and other local irregularities of the gradient. A practical solution is to limit the number of allowable regions by

incorporating a preprocessing stage designed to bring additional knowledge into the segmentation procedure.

An approach used to control oversegmentation is based on the concept of markers.

A marker is a connected component belonging to an image. Internal markers associated with objects of interest. External markers associated with the background.

A procedure for marker selection will consist of two steps： preprocessing： for minimizing the effect of small spatial detail is to

filter the image with a smoothing filter. Definition of a set of criteria that markers must satisfy: (internal marker)

A region is surrounded by points of higher “altitude” The points in the region form a connected. All the points in the connected component have the same gray-level value.

84


1.1. 先找出先找出 internal markerinternal marker2.2. 再以再以 watershedwatershed 找出找出 watershed linewatershed line

85

The Use of Motion in Segmentation Spatial Techniques

The simplest approaches for detecting changes between two image frames f(x, y, ti) and f(x,y, tj) taken at times ti and tj, respectively, is to compare the two images pixel by pixel. To form a difference image. Suppose that we have a reference image containing only stationary

components. Comparing this image against a subsequent image of the same

scene, but including a moving object, result in the difference of the two images canceling the stationary elements, leaving only nonzero entries that correspond to the nonstationary image components.

A difference image between two images taken at time ti and tj may be defined as

where T is a specified threshold. dij(x,y) has a value of 1 at spatial coordinates (x,y) only if the gray-level difference between the two images is appreciably different at those coordinates, as determined by the specified threshold T.

otherwise

Ttyxftyxfyxd ji

ij0

),,(),,(1),(

86

The Use of Motion in Segmentation (cont.) In dynamic image processing, all pixels in dij(x,y) with value

1 are considered the result of object motion. This approach is applicable only if the two images are

registered spatially and if the illumination is relatively constant within the bounds established by T.

In practice, 1-valued entries in dij(x,y) often arise as a result of noise.

These entries are isolated points in the difference image, and a simple approach to their removal is to form 4- or 8-connected regions of 1’s in dij(x,y) and then ignore any region that has less than a predetermined number of entries.

87

The Use of Motion in Segmentation (cont.) Accumulative differences

Consider a sequence of image frames f(x, y, t1), f(x, y, t2), …, f(x, y, tn).

Let f(x, y, t1) be the reference images. An accumulative difference image (ADI) is formed by comparing

this reference image with every subsequent image in the sequence.

A counter for each pixel location in the accumulative image is incremented every time a difference occurs at that pixel location between the reference and an image in the sequence.

When the kth frame is being compared with the reference, the entry in a given pixel of the accumulative image gives the number of times the gray level at that position was different from the corresponding pixel value in the reference image.

Three types of accumulative difference images (ADI) Absolute, Positive, and Negative ADI Assuming that the gray-level values of the moving objects are large

than the background, these three types of ADIs of the moving objects are defined as follows.

88

The Use of Motion in Segmentation (cont.) Let R(x,y) denote the reference image.

Let k denote tk, so that f(x,y, k)= f(x, y, tk) Assume that R(x,y)=f(x,y,1) For any k>1, and the values of the ADIs are

counts, Absolute Positive Negative

otherwiseyxN

TkyxfyxRyxNyxN

otherwiseyxP

TkyxfyxRyxPyxP

otherwiseyxA

TkyxfyxRyxAyxA

k

kk

k

kk

k

kk

),(

),,(),(1),(),(

),(

),,(),(1),(),(

),(

),,(),(1),(),(

1

1

1

1

1

1

89

The Use of Motion in Segmentation (cont.)

AbsolutAbsolute ADIe ADI

NegativNegative ADIe ADI

Positive Positive ADIADI

Example 10.19Image size: 256x256, object szie:75x50, moving in a southeasterly directionat a speed of pixels per frame. 1. Positive ADI中非零的區域代表移動物體的大小，及移動物體在參考影像中的位置2. Absolute ADI包含 positive 和 negative ADI 的區域3. 移動物體的方向和速度可由 absolute 及 negative ADI中決定。

25

90

The Use of Motion in Segmentation Establishing a reference image

The difference between two images in a dynamic imaging problem has the tendency to cancel all stationary components, leaving only image elements that correspond to noise and to the moving objects.

In practice, obtaining a reference image with only stationary elements is not always possible, and building a reference from a set of images containing one or more moving objects becomes necessary.

One procedure for generating a reference image is as follows Consider the first image in a sequence to be the reference image. When a nonstationary component has moved completely out of its position in

the reference frame, the corresponding background in the present frame can be duplicated in the location originally occupied by the object in the reference frame.

When all moving objects have moved completely out of their original positions, a reference image containing only stationary components will have been created.

Object displacement can be established by monitoring the changes in the positive ADI.

91

行駛中的汽車行駛中的汽車行駛中的汽車行駛中的汽車汽車已被移除汽車已被移除

Frame 1Frame 1 Frame 2Frame 2 ResultResult


92


93


94


95


chapter 10 image segmentation 國立雲林科技大學 資訊工程所 張傳育 (chuan-yu chang )...

Documents

chapter 10 image segmentation 國立雲林科技大學資訊工程所張傳育 (chuan-yu chang )...