
Image Processing tutorial – Jorge Márquez Flores - CCADET-UNAM 2008 1/13

Histogram of a population

Let X be a set of N data samples with quantized scalar values in {u_0, u_1, …, u_{L−1}} (these may be samples of a continuous function u(t): ℝⁿ → ℝ, with X ⊂ ℝⁿ; for example, the heights of N adults, pixels with gray levels from an image u = I(x,y), volume densities V(x,y,z), etc.). The values u_k, with k = 0, 1, …, L−1, may be considered the L possible outcomes of a discrete random variable u. Besides time and spatial frequencies, there is also an event frequency f:

$$ f_k \equiv \frac{n_k}{N} $$

where f_k is the incidence of event k (in a population of size N) among the L possible outcomes, by analogy with physical frequency (cycles per sec, lines per cm, dots per inch: Hz, waves/cm, dpi).

The histogram Hist(X) is a graphic plot of f_k vs. u_k. The histogram class intervals or bins are the discretization intervals Δu_k ≡ u_k − u_{k−1} = u_{L−1}/L, with k > 0. The k-th "event" is the occurrence of the value u_k (when Δu_k = 1) or, in general, of u ∈ [u_{k−1}, u_k). A discrete histogram of an N×M image I(x,y) can be seen as a vector of entries f_k defined by:

$$ f_k = \sum_{x=0}^{N-1} \sum_{y=0}^{M-1} \begin{cases} 1 & \text{if } I(x,y) = u_k \\ 0 & \text{otherwise} \end{cases} \qquad (1) $$

A fast algorithm, avoiding explicit comparisons, is described later. Normalized probabilities can be obtained as follows:

$$ p_k = \frac{f_k}{\sum_k f_k} \qquad (2) $$
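Equations (1) and (2) can be sketched as follows (a minimal illustration in Python rather than the tutorial's MATLAB, assuming integer gray levels u_k = k in 0…L−1 with Δu_k = 1):

```python
import numpy as np

def histogram(img, L=256):
    """f_k of eq. (1): count pixels with I(x, y) = u_k, assuming integer
    gray levels u_k = k in 0..L-1 (bin width 1)."""
    f = np.zeros(L, dtype=np.int64)
    for v in img.ravel():      # each value indexes its own bin directly
        f[v] += 1
    return f

def normalize(f):
    """p_k = f_k / sum_k f_k of eq. (2)."""
    return f / f.sum()

img = np.array([[0, 1, 1],
                [2, 1, 0]])    # toy 2x3 "image" with L = 4 levels
f = histogram(img, L=4)        # counts per level
p = normalize(f)               # probabilities summing to 1
```

The inner loop uses each pixel value directly as the bin index, which is the "avoiding explicit comparisons" idea developed later in the MATLAB section.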


• The histogram of an image, Hist(I), provides a global description of the appearance of the image. A local histogram Hist(ROI), with ROI ⊆ I, describes the appearance of a region.

• Hist(I) is the distribution of gray-level values within an image.

• A sample population of N ≈ 20% N_Total may approximate the global distribution well if the samples are uniformly distributed over the population domain. The relative frequency distribution of a population "is also" its empirical probability distribution. Thus Hist(X) = { p(u_k), k = 0, 1, …, L−1 } is a discrete function, given by (suppose, for example, that X is a gray-level image):

Usually L = 256 and N_Total = 512×512 pixels. Also Δu_k = 1, and in 8-bit gray-level images {u_0, u_1, …, u_255} = {0, 1, …, 255}; then we may write p(u_k) as p_k. Let us consider the gray-level values u in an image as the values taken by independent and identically distributed random variables u (the plural refers to each pixel). In this case the histogram is an approximation to the probability density function (PDF) of each random variable u. In other words,

$$ p(u_k) = \frac{n_k}{N} \qquad (3) $$

where n_k is the number of pixels with gray level u_k (the k-th gray level) and N is the total number of pixels in a ROI ⊆ X.

$$ p(u_k) \approx \text{Prob}[u = u_k], \quad \text{with} \quad \sum_{k=0}^{L-1} p(u_k) = 1 \qquad (4) $$

The approximation depends on the number of samples N ≤ N_Total and on their distribution. With uniform sampling (or when all the data are considered, N = N_Total), we have Σ_k n_k = N; the PDF is then the limit when N → ∞ and the size of the histogram classes or bins tends to zero: Δu_k → 0. We then have:

$$ \text{Prob}[u \le \tilde{u}] = \int_0^{\tilde{u}} p(u)\, du \qquad (5) $$

with ∫_0^{u_max} p(u) du = 1, where u_max = u_{L−1}, as in the discrete version in eq. (4). Some authors call the histogram "a discrete pdf". Figure 1 shows an example of histogram-modification-based enhancement of a gray-level image: contrast stretching.

Figure 1. (Left-top) Low-contrast kitty photograph and (left-bottom) its gray-level histogram: a narrow range of gray levels dominates. (Right-top) Contrast-enhanced kitty photograph. (Right-bottom) Its gray-level histogram spans the full dynamic range of gray levels, from black to white. A LUT (Look-Up Table, or transfer function) stretches the narrow interval.


Figure 2. Threshold binarization by histogram features. (Left-top) Kitty photograph before and after binarization with threshold 145. (Left-bottom) Its gray-level histogram and the chosen threshold, at a local minimum (gray value 145) separating important modes. The second histogram corresponds to the binary, thresholded image. In histogram-guided thresholding, features of the gray-level distribution (for example: global and local minima, the mean, the median, the average value between two modes, etc.) are used, as in this example.
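Histogram-guided thresholding as in Figure 2 can be sketched as follows (a Python illustration, not the tutorial's method; the mode-separation window and the valley heuristic are assumptions added for the sketch, while the tutorial only prescribes using histogram features such as local minima between modes):

```python
import numpy as np

def valley_threshold(img, L=256, halfwidth=16):
    """Place the threshold at the deepest histogram valley between the two
    dominant modes, then binarize to 0/255 (cf. threshold 145 in Figure 2)."""
    f = np.bincount(img.ravel(), minlength=L)
    m1 = int(np.argmax(f))                          # strongest mode
    masked = f.copy()
    masked[max(0, m1 - halfwidth):m1 + halfwidth + 1] = 0
    m2 = int(np.argmax(masked))                     # second mode, away from m1
    lo, hi = sorted((m1, m2))
    t = lo + int(np.argmin(f[lo:hi + 1]))           # valley between the modes
    return t, np.where(img > t, 255, 0)

img = np.array([[60, 61, 60, 200],
                [200, 61, 201, 200]])               # two gray-level clusters
t, binary = valley_threshold(img)
```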


Potential Notational Conflicts and Conventions

• We have data (a signal): u(x) (where domain x may be t)

• We traditionally plot u vs. x (dependent against independent variables: intensity u at point x)

• The histogram Hist(u) = p(u_k) (or, more properly written, the set { p(u_k) }) is a plot of frequency (number of outcomes) p(u_k) against u; x does not matter.

• In images, x → x = (x_1, x_2, x_3, …), or x = (x, y), a point in a multidimensional domain, and u(x) is a scalar field whose histogram p(u_k) is still one-dimensional.

• To simplify, the L values u_k ∈ {u_0, u_1, …, u_{L−1}} are at times directly taken (with Δu_k = 1 for all k) as u = k = 0, 1, …, L−1. Also, individual frequencies (probabilities) p(u_k) are denoted p_k(u), or even p_k.

Some properties of histograms

• The histogram loses all domain information. We do not write "Hist(u(x))" or "p(u_k(x))" (there is no dependence on position x). As a corollary, very different datasets, even of different dimension (signal series, images, etc.), may have exactly the same histogram. There is no "inverse transform".

• Histograms of domain information (distribution of positions x of points or features) can be obtained, in apparent contradiction with the above property, but other information is then lost (the gray level attribute itself in this example, since the new attribute of interest is the point position itself).

• As a corollary, several different images may have exactly the same histogram (for common gray-level attributes), as in figure (3).

• Any representation of frequencies of occurrence is called a histogram. A multidimensional histogram is a graphic representation of a joint probability distribution p(u_k, v_k), where f: ℝⁿ → ℝ² and f(x) = (u, v). See the section on multidimensional histograms.


• Mode(s) of a histogram:

$$ u_{mode} = \arg\max_{k=0,\dots,L-1} p(u_k) \quad \text{(maybe not unique)} $$

and p_max = p(u_mode) is the highest incidence (u_mode is the most frequent value).

• In an image region, a narrow histogram (the main peak) indicates a low contrast region.

• As probabilities, the histogram frequencies p(uk) are normalized to [0,1], by dividing the number of samples (= pixels) with attribute (intensity) uk by N (the total of samples, for all uk). Some authors also normalize the values of the set {u0 , u1 , …, uL-1} to [0,1], by dividing by uL-1 (usually uL-1=255 in 8-bit gray level images).

Block pattern Checkerboard Diagonal striped pattern A particular shape

Figure 3. Four different images with the same global distribution (histogram) of black and white: a peak at 0 (black) and a peak at 255 (white).

Bins – Histograms are defined on N_bin discrete intervals [u_n, u_{n+1}), with n = 0, …, N_bin − 1, called bins or classes, where u_0 and u_{N_bin} are the minimum and supremum intensities of I(x,y) (usually 0 and L = 256). If each bin covers exactly one value of the histogram, then Hist(u_k) is the relative frequency of occurrence of intensity u_k:

Hist(u_k) = p(u_k) = card{ pixels with value u_k } / (N_x N_y),

for an N_x × N_y rectangular region or image.


Normalized Histogram – The discrete, empirical probability density function (pdf) Hist(u_k) of the pixel (or voxel) intensities {u_k}, k = 0, …, K−1 (K discrete intensity levels). Usually K = 256 and max_k{ Hist(u_k) } = card I(x,y). Note that we could also simply define Hist(u) with u = 0, …, 255, but it may happen that we use only some values of u (for example the even values, or a logarithmic re-sampling of [0, 256)).

Sampled Histogram – That from a (properly) sampled population; related to Monte Carlo estimation of PDFs.

*Cumulative Histogram – From a discrete ROI of size N_x×N_y:

$$ cpdf(u_n) = \frac{1}{N_x N_y} \sum_{k=0}^{n} \text{card}(u_k) = \sum_{k=0}^{n} p(u_k) \qquad (6) $$

(a.k.a. the Cumulative Distribution Function). For a continuous random variable u:

$$ cpdf(u) = \int_0^{u} p(\upsilon)\, d\upsilon \qquad (7) $$

*Histogram Equalization is the process of applying the transfer function:

$$ u_{out} = cpdf(u_{in}) = \int_0^{u_{in}} p(\upsilon)\, d\upsilon \qquad (8) $$

Note that the PDF of the output levels (i.e., p(u_out)) is uniform, that is:

$$ p(u_{out}) = \begin{cases} 1 & \text{if } 0 \le u_{out} \le u_{L-1} \\ 0 & \text{otherwise} \end{cases} \qquad (9) $$

The transformation (8) generates an image whose intensity levels are equally likely (thus, intensity-level equalized) and cover the entire range [0, 255]. The dynamic range is increased, and will tend to show higher contrast.
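A minimal sketch of equations (6) and (8) in Python (the tutorial gives no equalization code; 8-bit integer images are assumed): the normalized cumulative histogram itself becomes the LUT.

```python
import numpy as np

def equalize(img, L=256):
    """Histogram equalization: map each input level through the cumulative
    distribution (eq. 8), scaled back to the range 0..L-1."""
    f = np.bincount(img.ravel(), minlength=L)        # histogram, eq. (1)
    cdf = np.cumsum(f) / f.sum()                     # cpdf(u_n), eq. (6)
    lut = np.round(cdf * (L - 1)).astype(np.uint8)   # transfer function, eq. (8)
    return lut[img]                                  # apply the LUT

img = np.array([[100, 101],
                [102, 103]], dtype=np.uint8)         # narrow dynamic range
out = equalize(img)                                  # stretched toward 0..255
```

In the discrete case the output levels are only approximately uniform; the exactly flat p(u_out) of eq. (9) holds in the continuous limit.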

Histograms with MATLAB or other computer languages

MATLAB has built-in histogram functions (hist, and imhist in the Image Processing Toolbox); still, it is useful to know how to obtain the histogram of any ROI oneself. The following code computes the histogram of a full image efficiently (any rectangular ROI can easily be specified at the main loops):


X = imread('bacteria.tif','TIF');  % matrix X contains the image
image(X);                          % display it
hold on;                           % to compare with the histogram
H = zeros(256,1);                  % initialize histogram vector to 0
[m,n] = size(X);                   % obtain dimensions to set loop limits
for i = 1:m                        % scan all rows
    for j = 1:n                    % scan all columns
        v = double(X(i,j)) + 1;    % +1: gray levels start at 0, MATLAB indices at 1
        H(v) = H(v) + 1;           % use the data value as the address of the bin
    end                            % to count frequencies of occurrence
end
plot(H);                           % now display the histogram

NOTE: In practice, MATLAB indices cannot start at 0, and the pixel values are of (unsigned) integer type while summation in MATLAB requires type float or double, so a number of conversions are needed. Note also how the central instruction uses the data (the image intensities) as the very address of the class (level value) to be incremented; for the above algorithm to work properly, the image must first be re-quantized to the desired number of histogram bins.


*Multidimensional Histograms

Co-occurrence Histogram = 2nd-order or bidimensional histogram: a 2-dimensional pdf Hist(u_a, u_b) (joint probability densities).

Two attributes (e.g., gray-level intensities) u_a, u_b are analyzed simultaneously, hence the name co-occurrence (or concurrence).

When a low number of bins L is used (e.g., 4, in the spatial-dependence analysis of textures), the histogram is called an L×L co-occurrence matrix.

Let u_a, u_b be two random variables representing attribute values of points, either at different locations or from different sets of data (examples below), which may be two signals (or different parts of a single signal), images, etc. Then, as in equations (3) and (4), a second-order joint probability is defined as

$$ p(u_i, u_j) = \text{Prob}[u_a = u_i,\ u_b = u_j] \qquad (10) $$

$$ = \frac{\text{number of pairs of points with attributes } u_a = u_i,\ u_b = u_j}{\text{total number of such pairs of points in the ROI}} $$

where for simplicity we use here Δu_i = Δu_j = 1 for all i, and u_i, u_j = 0, 1, …, L−1. With that simplification, the joint probabilities are often written as:

$$ p_{ij} = p(u_i, u_j), \qquad i, j = 0, \dots, L-1. \qquad (11) $$

Note: do not confuse i, j with spatial discrete coordinates. When the points are pixels, u_a, u_b correspond to values I(x,y) in one or more images, at the same or at different locations.

*Co-occurrence may be studied between:

• Two pixels at different locations of the same image: u_a = I(x_a, y_a) and u_b = I(x_b, y_b). See the spatial-dependence analysis in the chapter on textures. In this case the discrete 2D histogram becomes an L×L matrix, where L is the number of bins for each intensity (for one pixel and the pixel at offset (Δx, Δy)). The co-occurrence matrix entries h_ij, for each offset combination (Δx, Δy), are obtained by accumulating a count of the pixels satisfying (see also eq. 1):

$$ h_{ij} = \sum_{x=0}^{N-1} \sum_{y=0}^{M-1} \begin{cases} 1 & \text{if } I(x,y) = u_i \text{ AND } I(x+\Delta x,\, y+\Delta y) = u_j \\ 0 & \text{otherwise} \end{cases} \qquad (12) $$

where the image intensities are still in the range [0, 255], but the entries i, j correspond to bins according to the quantization (usually into four classes; thus i, j ∈ [0, 3] and, for example, u_i ∈ [i×64, 63 + i×64]). Note that the "AND" condition expresses the joint occurrence of intensities u_i, u_j.
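Equation (12) can be sketched in Python (an illustration, not from the tutorial; the quantization into L = 4 classes of width 64 follows the example above):

```python
import numpy as np

def cooccurrence(img, dx, dy, L=4, levels=256):
    """L x L co-occurrence matrix h_ij for offset (dx, dy), eq. (12):
    count pixel pairs whose quantized intensities fall in bins (i, j)."""
    q = (img.astype(int) * L) // levels        # e.g. 0..63 -> 0, ..., 192..255 -> 3
    h = np.zeros((L, L), dtype=np.int64)
    n, m = q.shape
    for x in range(n):
        for y in range(m):
            if 0 <= x + dx < n and 0 <= y + dy < m:   # keep the pair inside the ROI
                h[q[x, y], q[x + dx, y + dy]] += 1    # the "AND" of eq. (12)
    return h

img = np.array([[0, 255],
                [0, 255]])
h = cooccurrence(img, dx=0, dy=1)   # horizontal neighbor pairs
```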

• Two pixels at the same location of different images (two adjacent slices in a volume, two video frames, matching pairs, etc.): : ua =Ia (x,y), and ub =Ib(x,y). The entries are then defined by:

( ) ( )1 1

0 0

1 ( , ) AND ( , )0

N M

ijx y

a a b bif I x y u I x y uhotherwise

− −

= =

⎧ = == ⎨⎩

∑∑ (13)

Note: in practice, indirection is used, scanning both images pixel by pixel over (x, y) and incrementing the count at the matrix entry indicated by the image attributes I_a(x, y) and I_b(x, y), after quantization to L levels (probability normalization must be performed later):

$$ h^{t+1}_{I_a(x,y),\, I_b(x,y)} = h^{t}_{I_a(x,y),\, I_b(x,y)} + 1, \quad \text{with } h^{0}_{ij} = 0 \ \ \forall\, i, j = 0, \dots, L-1, \ \text{ and } t = 0, \dots, NM-1 \qquad (14) $$
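The indirection of eq. (14) can be sketched in Python for the two-image case of eq. (13), with quantization to L levels and normalization performed afterwards, as the note describes (the code itself is an illustrative sketch, not from the tutorial):

```python
import numpy as np

def cooccurrence_pair(Ia, Ib, L=4, levels=256):
    """Scan both images once; the quantized values (i, j) themselves address
    the matrix entry to increment (eq. 14), normalizing at the end."""
    qa = (Ia.astype(int) * L) // levels
    qb = (Ib.astype(int) * L) // levels
    h = np.zeros((L, L), dtype=np.float64)
    for i, j in zip(qa.ravel(), qb.ravel()):   # t = 0, ..., NM-1
        h[i, j] += 1
    return h / h.sum()                         # probability normalization, done later

Ia = np.array([[0, 255],
               [128, 64]])
p = cooccurrence_pair(Ia, Ia)                  # identical images
```

With Ia ≡ Ib, all the mass lies on the diagonal, matching the interpretation of the co-occurrence histogram given in the text.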

*Interpretation of the co-occurrence histogram:

o If I_a(x,y) ≡ I_b(x,y), the bidimensional histogram information stays along the identity line (the diagonal), forming the 1-D histogram (no histogram values > 0 outside the diagonal).


o If I_a(x,y) is very similar to I_b(x,y), most of the information clusters around the diagonal. If the diagonal is not a straight line, the images differ in attribute scales by some distortion, reflecting different calibration, for example. Instead of co-occurrence, the 1-D histogram of the difference image (I_a(x,y) − I_b(x,y)) may be useful.

o If I_a(x,y) is totally different from I_b(x,y), no information clusters along the identity line; it spreads over the whole histogram domain.

• Two pixels at the same location in the same image, but from different spectral channels (R and B, etc.).

• Two pixels at almost the same location in very similar or identical images, one under a geometric transformation (stereo pairs, rotation, distortion, or warping): u_a = I(x,y), u_b = I(s,t), with (s,t) = T(x,y). This setting is related to correspondence problems.

• Idem, from images of the same subject acquired with different imaging modalities (CT, PET, MRI, etc.): u_a = I_CT(x,y), u_b = I_MRI(x,y).

Co-occurrence, like histograms, may be used to study higher-level features of regions and shapes.


*Tri-dimensional or third-order Histogram, either:

(1) When I(x,y) or V(x,y,z) is a vector-valued image (volume), e.g., a color image u = (r, g, b), then Hist(r, g, b) is a 3D pdf.

(2) When we have triads (three points designated by some offset rule) in a signal X(t_h) or set, or three pixels (voxels) I(x_h, y_h, z_h), h = 1, 2, 3 (or from three images or volumes I_h), their joint empirical pdf is a 3D histogram Hist(u_1, u_2, u_3).

In general, predicate histograms in three or more dimensions are formed by counting combined events (logical predicates) on attributes (the joint random variables):

$$ h_{ij \dots z} = \sum_{\text{all points}} \begin{cases} 1 & \text{if } pred[(I_i, u_i), (I_j, u_j), \dots, (I_z, u_z)] \\ 0 & \text{otherwise} \end{cases} \qquad (15) $$
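For case (1), the 3-D histogram of a color image can be sketched with NumPy's histogramdd (an illustrative choice; the tutorial does not prescribe an implementation):

```python
import numpy as np

def color_histogram(rgb, bins=8):
    """Hist(r, g, b): each pixel's (r, g, b) triple addresses one cell of a
    bins x bins x bins counting array (a 3D pdf after normalization)."""
    pixels = rgb.reshape(-1, 3)
    H, _ = np.histogramdd(pixels, bins=(bins,) * 3, range=[(0, 256)] * 3)
    return H

rgb = np.zeros((2, 2, 3))      # tiny all-black image...
rgb[0, 0] = [255, 255, 255]    # ...with one white pixel
H = color_histogram(rgb)       # counts: 3 black pixels, 1 white pixel
```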

**N-Dimensional PDFs and Histograms

Many decision and prediction problems may be formulated through expectation values of functions of several random variables. In one dimension, for a scalar random variable X (note the change in notation, closer to the literature) taking values χ ∈ ℝ, we have:

$$ p_1(\chi) = P(X = \chi), \qquad (16) $$

$$ \langle \chi \rangle = \frac{\int_{-\infty}^{\infty} \chi\, p(\chi)\, d\chi}{\int_{-\infty}^{\infty} p(\chi)\, d\chi} \qquad \text{and in general:} \qquad \langle f(\chi) \rangle = \frac{\int_{-\infty}^{\infty} f(\chi)\, p(\chi)\, d\chi}{\int_{-\infty}^{\infty} p(\chi)\, d\chi} \qquad (17) $$


for any function f(χ). By convention p(⋅) is normalized, so the denominator is always ∫ p(χ) dχ = 1.

Let X = { X_1, …, X_n } be a set of n random variables; we now write the joint probability densities, or pdfs:

$$ p_n(\chi_1, \dots, \chi_n) = P(X_1 = \chi_1, \dots, X_n = \chi_n) \qquad (18) $$

$$ \langle f \rangle = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(\chi_1, \dots, \chi_n)\, p(\chi_1, \dots, \chi_n)\, d\chi_1 \cdots d\chi_n \qquad (19) $$

For example, if u = I(x,y) is the gray level of each pixel (x,y) in an image, the average value, or more precisely the gray-level value we expect to find, is ⟨u⟩, and we do not need to compute the histogram: we obtain it by averaging absolutely all pixel values. Similarly, for the function defining the squared error, f(u) = (u − ⟨u⟩)², the expected value ⟨f⟩ is the variance of u (whose square root is the standard deviation). If X_1, …, X_n take discrete values χ_1 ∈ {χ_1^(0), χ_1^(1), …, χ_1^(L−1)}, etc., equation (19) becomes equation (20), and either can be estimated by a sample average (21):

$$ \langle f \rangle = \sum_{\text{all } \chi_1} \cdots \sum_{\text{all } \chi_n} f(\chi_1, \dots, \chi_n)\, p(\chi_1, \dots, \chi_n) \qquad (20) $$

$$ \langle f \rangle \approx \frac{1}{N} \sum_{k=0}^{N-1} f(x_1^{(k)}, \dots, x_n^{(k)}) \qquad (21) $$
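The sample-average estimator of eq. (21) can be sketched in Python (a minimal illustration; the gray-level example above is used as data):

```python
import numpy as np

def expectation(f, samples):
    """Estimate <f> by the sample average of eq. (21): no histogram or
    explicit pdf is needed, only the N data samples themselves."""
    return sum(f(x) for x in samples) / len(samples)

u = np.array([0.0, 2.0, 2.0, 4.0])                  # toy gray-level samples
mean = expectation(lambda x: x, u)                  # <u>
var = expectation(lambda x: (x - mean) ** 2, u)     # <(u - <u>)^2>, the variance
```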

Note that, strictly speaking, equation (20) should be written as:

$$ \langle f \rangle = \sum_{i_1=0}^{L_1-1} \cdots \sum_{i_n=0}^{L_n-1} f(\chi_1^{(i_1)}, \dots, \chi_n^{(i_n)})\, p(\chi_1^{(i_1)}, \dots, \chi_n^{(i_n)}) \qquad (22) $$

It is often useful to use integers, as with u ∈ {0, 1, …, L−1}, and then sum over the subscripts i for χ_1…