
Introduction to Image Processing - Class Notes

Dr. Y. Narasimha Murthy, Ph.D ([email protected])

Introduction:

Digital image processing is an interesting field: it provides improved pictorial information for human interpretation, and it processes image data for storage, transmission, and representation for machine perception. The field has improved drastically in recent times and has extended to many areas of science and technology; much of this work is now also known by the name computer vision, and more and more people are taking interest in it. Image processing mainly deals with image acquisition, image enhancement, image compression, image segmentation, image restoration, and related tasks.

Basic concepts:

In its basic sense, image processing refers to the processing of a digital image, i.e., removing noise and other irregularities present in an image using a digital computer. Noise or irregularities may creep into the image during its formation, transmission, or transformation. For mathematical analysis, an image may be defined as a two-dimensional function f(x, y), where x and y are spatial (plane) coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point. When x, y, and the intensity values of f are all finite, discrete quantities, we call the image a digital image. Note that a digital image is composed of a finite number of elements, each of which has a particular location and value. These elements are called picture elements, image elements, pels, or pixels; pixel is the most widely used term.
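As a concrete (if tiny) illustration, here is a digital image represented as a 2-D array of discrete intensity values; Python with NumPy is assumed here, since the notes do not prescribe a language:

```python
import numpy as np

# A tiny 4x4 grayscale digital image: f(x, y) stored as a 2-D array
# of discrete intensity (gray-level) values in the range 0..255.
f = np.array([
    [  0,  64, 128, 255],
    [ 32,  96, 160, 224],
    [ 64, 128, 192, 255],
    [  0,  32,  64,  96],
], dtype=np.uint8)

x, y = 1, 2
print(f"intensity at ({x}, {y}):", f[x, y])  # one pixel's gray level
print("image size:", f.shape)                # a finite number of pixels
```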

One of the first applications of digital images was in the newspaper industry, when pictures were first sent by submarine cable between London and New York in the early 1920s. Because the transferred images were of poor quality, the need for image processing was realised in order to improve their tonal quality and resolution. The real growth of digital image processing, however, gained momentum with the advent of digital computers. Decreasing prices and increasing performance of computers, together with the expansion of networking and communication


bandwidth via the World Wide Web and the Internet, have created unprecedented opportunities for the continued growth of digital image processing.

Fields that Use Digital Image Processing :

Nowadays the impact of digital image processing is found in almost all branches of science and technology. Examples include (i) newspaper printing, (ii) remote sensing, (iii) transportation, (iv) gamma-ray imaging, (v) X-ray imaging, (vi) ultraviolet imaging, (vii) imaging in the visible and IR bands, (viii) imaging in the microwave region, and (ix) imaging in the radio band.

(i). Gamma-Ray Imaging:

Gamma-ray imaging uses gamma rays; its major applications include nuclear medicine and astronomical observations. In nuclear medicine, the approach is to inject a patient with a radioactive isotope that emits gamma rays as it decays. Images are produced from the emissions collected by gamma-ray detectors. For example, bone scans obtained with gamma-ray imaging are used to locate sites of bone pathology, such as infections or tumors. Another mode of nuclear imaging is positron emission tomography (PET). In this technique the patient is given a radioactive isotope that emits positrons as it decays. When a positron meets an electron, both are annihilated and two gamma rays are given off. These are detected and a tomographic image is created using the basic principles of tomography. The image shows defects such as tumors in the brain or lung as easily visible small white masses.

The same imaging technique can be used to photograph stars and other stellar objects; here the images are obtained from the natural radiation of the object being imaged. Gamma imaging can also be used to image the gamma radiation from a valve in a nuclear reactor.

(ii). X-ray Imaging:


X-ray imaging is used in medical diagnosis and also extensively in industry and other areas, such as astronomy. X-rays for medical and industrial imaging are generated with an X-ray tube, a vacuum tube with a cathode and an anode. When low-intensity X-rays are passed through the patient, the energy that falls on a film develops it, much as light develops photographic film. In digital radiography, digital images are obtained by one of two methods: (1) by digitizing X-ray films, or (2) by having the X-rays that pass through the patient fall directly onto devices (such as a phosphor screen) that convert X-rays to light; the light signal in turn is captured by a light-sensitive digitizing system.

Angiography is another major application, in an area called contrast-enhancement radiography. This procedure is used to obtain images (called angiograms) of blood vessels. A catheter (a small, flexible, hollow tube) is inserted, for example, into an artery or vein in the groin. The catheter is threaded into the blood vessel and guided to the area to be studied. When the catheter reaches the site under investigation, an X-ray contrast medium is injected through it. This enhances the contrast of the blood vessels and enables the radiologist to see any irregularities or blockages.

The best-known use of X-rays in medical imaging is computerized axial tomography (CAT). Due to their resolution and 3-D capabilities, CAT scans revolutionized medicine from the moment they first became available in the early 1970s. Each CAT image is a "slice" taken perpendicularly through the patient. Numerous slices are generated as the patient is moved in a longitudinal direction, and the ensemble of such images constitutes a 3-D rendition of the inside of the patient, with the longitudinal resolution proportional to the number of slices taken. Techniques similar to the ones just discussed, but generally involving higher-energy X-rays, are applicable in industrial processes: X-ray images are used to find flaws in industrial boards, such as missing components or broken traces. Industrial CAT scans are useful when the parts can be penetrated by X-rays, as in plastic assemblies, and even in large bodies such as solid-propellant rocket motors.


(iii). Imaging in the Ultraviolet Band: Applications of ultraviolet-band imaging include lithography, industrial inspection, microscopy, lasers, biological imaging, and astronomical observations. Ultraviolet light is used in fluorescence microscopy, one of the fastest-growing areas of microscopy. The basic task of the fluorescence microscope is to use an excitation light to irradiate a prepared specimen and then to separate the much weaker radiating fluorescent light from the brighter excitation light; thus, only the emission light reaches the eye or other detector. The resulting fluorescing areas shine against a dark background with sufficient contrast to permit detection; the darker the background of the non-fluorescing material, the more efficient the instrument. Fluorescence microscopy is an excellent method for studying materials that can be made to fluoresce, either in their natural form (primary fluorescence) or when treated with chemicals capable of fluorescing (secondary fluorescence).

(iv). Imaging in the Visible and Infrared Bands:

Applications of imaging in the visible band range from pharmaceuticals and micro-inspection to materials characterization. Even in microscopy alone, the application areas are too numerous to detail here. It is not difficult to conceptualize the types of processes one might apply to these images, ranging from enhancement to measurements. Another major area of visual processing is remote sensing, which usually includes several bands in the visual and infrared regions of the spectrum.

Fingerprint images also fall under visible imaging. They are routinely processed by computer, either to enhance them or to find features that aid the automated search of a database for potential matches. Applications of digital image processing in this area also include automated counting of currency and, in law enforcement, the reading of serial numbers for the purpose of tracking and identifying bills, as well as automated license-plate reading from vehicle images.

Infrared imaging is used in the Nighttime Lights of the World data set, which provides a global inventory of human settlements. The images were generated by an infrared imaging system mounted on a satellite. The system operates in the band 10.0 to 13.4 µm and has the unique capability to observe faint sources of visible and near-infrared emissions present on the Earth's surface, including cities, towns, villages, gas flares, and fires. Even without formal training in image processing, it is not difficult to imagine writing a computer program that would


use these images to estimate the percent of total electrical energy used by various regions of the

world.

(v). Imaging in the Microwave Band:

The important application of imaging in the microwave band is RADAR. The unique feature of

imaging radar is its ability to collect data over virtually any region at any time, regardless of

weather or ambient lighting conditions. Some radar waves can penetrate clouds, and under

certain conditions can also see through vegetation, ice, and extremely dry sand. In many cases,

radar is the only way to explore inaccessible regions of the Earth’s surface. An imaging radar

works like a flash camera in that it provides its own illumination (microwave pulses) to

illuminate an area on the ground and take a snapshot image. Instead of a camera lens, a radar

uses an antenna and digital computer processing to record its images. In a radar image, one can

see only the microwave energy that was reflected back toward the radar antenna.

(vi). Imaging in the Radio Band:

The major applications of imaging in the radio band are in medicine and astronomy. In medicine, radio waves are used in magnetic resonance imaging (MRI). This technique places a patient in a powerful magnet and passes radio waves through his or her body in short pulses. Each pulse causes a responding pulse of radio waves to be emitted by the patient's tissues. The locations from which these signals originate and their strengths are determined by a computer, which produces a two-dimensional picture of a section of the patient. MRI can produce pictures in any plane.

(vii). Other Imaging Fields:

Among other imaging modalities, imaging with sound (ultrasound imaging) and electron microscopy are very important. Imaging using "sound" finds application in geological exploration, industry, and medicine. Geological applications use sound in the low end of the sound spectrum (hundreds of Hertz), while imaging in other areas uses ultrasound (millions of Hertz). The most important commercial applications of image processing in geology are in mineral and oil exploration.

In a typical ultrasound image, millions of pulses and echoes are sent and received each second.

The probe can be moved along the surface of the body and angled to obtain various views. The sound waves travel into the body and hit a boundary between tissues (e.g., between fluid and soft


tissue, soft tissue and bone). Some of the sound waves are reflected back to the probe, while

some travel on further until they reach another boundary and get reflected. The reflected waves

are picked up by the probe and relayed to the computer.

A transmission electron microscope (TEM) works much like a slide projector, except that it shines a beam of electrons through a specimen (analogous to the slide). The fraction of the beam transmitted through the specimen is projected onto a phosphor screen. The interaction of the electrons with the phosphor produces light and, therefore, a viewable image.

A scanning electron microscope (SEM), on the other hand, actually scans the electron beam and records the interaction of beam and sample at each location, producing one dot on a phosphor screen. A complete image is formed by a raster scan of the beam through the sample, much like a TV camera. The electrons interact with a phosphor screen and produce light. SEMs are suitable for "bulky" samples, while TEMs require very thin samples. Electron microscopes are capable of very high magnification: while light microscopy is limited to magnifications on the order of 1000x, electron microscopes can achieve magnifications of 10,000x or more.

Elements of Visual Perception:

The elements of visual perception concern the mechanics and parameters of how images are formed in the eye, and the physical limitations of human vision in terms of factors that matter for digital images.

Structure of the Human Eye

It is a known fact that the eye is nearly a sphere, with an average diameter of approximately 20 mm. Three membranes enclose the eye: the cornea and sclera outer cover, the choroid, and the retina. The cornea is a tough, transparent tissue that covers the anterior surface of the eye. Continuous with the cornea, the sclera is an opaque membrane that encloses the remainder of the optic globe. The choroid lies directly below the sclera. This membrane contains a network of blood vessels that serve as the major source of nutrition to the eye. Even superficial injury to the choroid, often not deemed serious, can lead to severe eye damage as a result of inflammation that restricts blood flow. The choroid coat is heavily pigmented and hence helps to reduce the amount of extraneous light entering the eye and the backscatter within the optic globe. At its anterior extreme, the choroid is divided into the ciliary body and the iris diaphragm. The latter contracts or expands to control the amount of light that enters the eye. The central opening of the iris (the pupil) varies in diameter from approximately 2 to 8 mm. The front of the iris contains the visible pigment of the eye, whereas the back contains a black pigment. The lens is made up of concentric layers of fibrous cells and is suspended by fibers that attach to the ciliary body. It contains 60 to 70% water, about 6% fat, and more protein than any other tissue in the eye. The lens is colored by a slightly yellow pigmentation that increases with age. In extreme cases, excessive clouding of the lens, caused by the affliction commonly referred to as cataracts, can lead to poor color discrimination and loss of clear vision. The lens absorbs approximately 8% of the visible light spectrum, with relatively higher absorption at shorter wavelengths. Both infrared and ultraviolet light are absorbed appreciably by proteins within the lens structure and, in excessive amounts, can damage the eye.

The innermost membrane of the eye is the retina, which lines the inside of the wall's entire posterior portion. When the eye is properly focused, light from an object outside the eye is imaged on the retina. Pattern vision is afforded by the distribution of discrete light receptors over the surface of the retina. There are two classes of receptors: cones and rods. The cones in each eye number between 6 and 7 million. They are located primarily in the central portion of the retina, called the fovea, and are highly sensitive to color. Humans can resolve fine detail with these cones largely because each one is connected to its own nerve end. Muscles controlling the eye rotate the eyeball until the image of an object of interest falls on the fovea. Cone vision is called photopic or bright-light vision. The number of rods is much larger: some 75 to 150 million are distributed over the retinal surface. The larger area of distribution, and the fact that several rods are connected to a single nerve end, reduce the amount of detail discernible by these receptors. Rods serve to give a general, overall picture of the field of view. They are not involved in color vision and are sensitive to low levels of illumination. For example, objects that appear brightly colored in daylight appear as colorless forms when seen by moonlight, because only the rods are stimulated. This phenomenon is known as scotopic or dim-light vision.


Image Formation in the Eye:

The principal difference between the lens of the eye and an ordinary optical lens is that the lens of the eye is flexible, whereas an ordinary lens is not. The radius of curvature of the anterior surface of the lens is greater than the radius of its posterior surface. The shape of the lens is controlled by tension in the fibers of the ciliary body. To focus on distant objects, the controlling muscles cause the lens to be relatively flattened; likewise, these muscles allow the lens to become thicker in order to focus on objects near the eye. The distance between the center of the lens and the retina (the focal length) varies from approximately 17 mm to about 14 mm as the refractive power of the lens increases from its minimum to its maximum. When the eye focuses on an object farther away than about 3 m, the lens exhibits its lowest refractive power; when the eye focuses on a nearby object, the lens is most strongly refractive. This information makes it easy to calculate the size of the retinal image of any object. Suppose, for example, that an observer is looking at a tree 15 m high at a distance of 100 m. If h is the height in mm of that object in the retinal image, similar triangles give 15/100 = h/17, or h = 2.55 mm.
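The same similar-triangles calculation, as a quick sketch (the function name is illustrative, not from the notes):

```python
def retinal_image_height(object_height_m, distance_m, focal_length_mm=17.0):
    """Height of the retinal image by similar triangles:
    object_height / distance = image_height / focal_length."""
    return object_height_m / distance_m * focal_length_mm

print(retinal_image_height(15.0, 100.0))  # -> 2.55 (mm)
```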


Brightness Adaptation and Discrimination

Digital images are displayed as a discrete set of intensities, so the eye's ability to discriminate between different intensity levels is an important consideration in presenting image-processing results. The range of light intensity levels to which the human visual system can adapt is enormous (on the order of 10^10, from the scotopic threshold to the glare limit). Experimental evidence indicates that subjective brightness (intensity as perceived by the human visual system) is a logarithmic function of the light intensity incident on the eye.

Examples of human perception phenomena are optical illusions, in which the eye fills in non-existing information or wrongly perceives geometrical properties of objects. Optical illusions are a characteristic of the human visual system that is not fully understood.

Image Sensing and Acquisition:

Image sensing is done by a sensor when it is illuminated by energy reflected from, or transmitted through, the object or scene under consideration. The process of image acquisition involves three components: (i) illumination, (ii) an optical system (lens system), and (iii) a sensor system.

The illumination may come from a light source or from a source of electromagnetic energy such as radar, infrared, or X-ray energy. Depending on the nature of the source, illumination energy is reflected from, or transmitted through, objects. To focus the reflected or transmitted energy, a lens- or mirror-like system, called the optical system, is required. In some situations, the radiation is focused onto a photoconverter (e.g., a phosphor screen), which converts the energy into visible light; electron microscopy and some applications of gamma imaging use this approach.

The key part of image acquisition is the image sensor. The light reflected from the object and focused by the lens system must be recorded, and this is done by an image sensor. An image sensor consists of a 2-D array of cells; each cell is called a pixel and is capable of measuring the amount of incident light and converting it into a voltage, which in turn is converted into a digital number. The image sensor is, in practice, part of a digital camera. Before the camera captures an image, all cells are emptied, meaning that no charge is present. When the camera captures an image, light is allowed to enter and charges each cell. After a certain amount of time, known as the exposure time and controlled by the shutter, the incident light is shut out again. If the exposure time is too short, the result is an underexposed picture; if it is too long, the image is overexposed. Figures (a), (b), and (c) below show the three principal sensor arrangements used to transform illumination energy into digital images. In each case the incoming energy is transformed into a voltage by the combination of input electrical power and a sensor material that is responsive to the particular type of energy being detected. The output voltage waveform is the response of the sensor(s), and a digital quantity is obtained from each sensor by digitizing its response.

Fig. (a) Single imaging sensor; (b) line sensor; (c) sensor array.


Image Acquisition Using a Single Sensor:

A single sensor is the simplest of all image acquisition arrangements. The best example of a single sensor is the photodiode, which is constructed of silicon and whose output voltage waveform is proportional to the incident light. The use of a filter in front of a sensor improves selectivity: for example, a green (pass) filter in front of a light sensor passes only green light, so the sensor output will be stronger for green light than for the other components of the visible spectrum. In order to generate a 2-D image using a single sensor, there have to be relative displacements in both the x- and y-directions between the sensor and the area to be imaged.

One arrangement, used in high-precision scanning, mounts a film negative onto a drum whose mechanical rotation provides displacement in one dimension, while the single sensor is mounted on a lead screw that provides motion in the perpendicular direction. Since mechanical motion can be controlled with high precision, this method is an inexpensive (but slow) way to obtain high-resolution images. Other similar mechanical arrangements use a flat bed, with the sensor moving in two linear directions. These types of mechanical digitizers are sometimes referred to as microdensitometers.


Another example of imaging with a single sensor places a laser source coincident with the sensor. Moving mirrors are used to sweep the outgoing beam in a scanning pattern and to direct the reflected laser signal onto the sensor.

Image Acquisition Using Sensor Strips:

A sensor strip consists of an in-line arrangement of sensors, and it is more widely used than a single sensor. The strip provides imaging elements in one direction, and motion perpendicular to the strip provides imaging in the other direction. This type of arrangement is used in flat-bed scanners; sensing devices with 4,000 or more in-line sensors are common. In-line sensors are used routinely in airborne imaging applications, in which the imaging system is mounted on an aircraft that flies at a constant altitude and speed over the geographical area to be imaged. One-dimensional imaging sensor strips that respond to various bands of the electromagnetic spectrum are mounted perpendicular to the direction of flight. The imaging strip gives one line of an image at a time, and the motion of the strip relative to the scene completes the other dimension of a two-dimensional image. Lenses are used to project the area to be scanned onto the sensors.

A similar sensor strip arrangement mounted in a ring configuration is used in X-ray medical and industrial imaging to obtain cross-sectional images of 3-D objects.

Some Basic Relationships Between Pixels:


An image is denoted by the function f(x, y) and is composed of many pixels. The important relationships between pixels are neighborhood, adjacency, connectivity, regions and boundaries, and distance measures.

(i). Neighbors of a Pixel

A pixel p at coordinates (x, y) has four horizontal and vertical neighbors whose coordinates are given by

(x+1, y), (x-1, y), (x, y+1), (x, y-1)

This set of pixels, called the 4-neighbors of p, is denoted by N4(p). Each of these pixels is a unit distance from (x, y), and some of the neighbors of p lie outside the digital image if (x, y) is on the border of the image.

The four diagonal neighbors of p have coordinates

(x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1)

and are denoted by ND(p). These points, together with the 4-neighbors, are called the 8-neighbors of p, denoted by N8(p). Some of the points in ND(p) and N8(p) fall outside the image if (x, y) is on the border of the image.
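These neighborhoods are easy to express in code; a minimal sketch (the helper names are my own):

```python
def n4(p):
    """4-neighbors of pixel p = (x, y)."""
    x, y = p
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def nd(p):
    """Diagonal neighbors of p."""
    x, y = p
    return [(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)]

def n8(p):
    """8-neighbors: union of the 4-neighbors and the diagonal neighbors."""
    return n4(p) + nd(p)

# Neighbors falling outside the image must be discarded at the border:
def inside(p, rows, cols):
    x, y = p
    return 0 <= x < rows and 0 <= y < cols

print([q for q in n8((0, 0)) if inside(q, 5, 5)])  # corner pixel keeps 3 of 8
```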

Adjacency, Connectivity, Regions, and Boundaries

Connectivity between pixels is a fundamental concept that simplifies the definition of numerous digital image concepts, such as regions and boundaries. To establish whether two pixels are connected, it must be determined whether they are neighbors and whether their gray levels satisfy a specified criterion of similarity (say, that their gray levels are equal). For instance, in a binary image with values 0 and 1, two pixels may be 4-neighbors, but they are said to be connected only if they have the same value. Let V be the set of gray-level values used to define adjacency. In a binary image, V = {1} if we are referring to adjacency of pixels with value 1. In a grayscale image the idea is the same, but the set V typically contains more elements; for example, with the range of possible gray levels 0 to 255, V could be any subset of these 256 values. We consider three types of adjacency:

(a) 4-adjacency. Two pixels p and q with values from V are 4-adjacent if q is in the set N4(p).


(b) 8-adjacency. Two pixels p and q with values from V are 8-adjacent if q is in the set N8(p).

(c) m-adjacency (mixed adjacency). Two pixels p and q with values from V are m-adjacent if (i) q is in N4(p), or (ii) q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V.

Mixed adjacency is a modification of 8-adjacency, introduced to eliminate the ambiguities that often arise when 8-adjacency is used. For example, consider the pixel arrangement shown in Fig. (a) below for V = {1}. The three pixels at the top of Fig. (b) show multiple (ambiguous) 8-adjacency, as indicated by the dashed lines; this ambiguity is removed by using m-adjacency, as shown in Fig. (c). Two image subsets S1 and S2 are adjacent if some pixel in S1 is adjacent to some pixel in S2, where adjacent means 4-, 8-, or m-adjacent. A small code sketch of the m-adjacency test appears after the figure caption below.

Fig: (a) Arrangement of pixels (b) pixels that are 8-adjacent (shown dashed) to the center pixel (c) m-adjacency.
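A minimal sketch of the m-adjacency test under these definitions, assuming the image is stored as a dict mapping (x, y) to a gray level and reusing the n4/nd helpers sketched earlier:

```python
def adjacent_m(img, p, q, V):
    """m-adjacency: p and q (values in V) are m-adjacent if q is a 4-neighbor
    of p, or q is a diagonal neighbor and N4(p) & N4(q) holds no pixel in V."""
    if img.get(p) not in V or img.get(q) not in V:
        return False
    if q in n4(p):
        return True
    shared = set(n4(p)) & set(n4(q))
    return q in nd(p) and not any(img.get(r) in V for r in shared)

img = {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 1}
print(adjacent_m(img, (0, 0), (1, 1), V={1}))  # True: no shared 4-neighbor in V
```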

A (digital) path (or curve) from pixel p with coordinates (x, y) to pixel q with coordinates (s, t) is a sequence of distinct pixels with coordinates

(x0, y0), (x1, y1), ..., (xn, yn)

where (x0, y0) = (x, y), (xn, yn) = (s, t), and pixels (xi, yi) and (xi-1, yi-1) are adjacent for 1 ≤ i ≤ n. In this case, n is the length of the path. If (x0, y0) = (xn, yn), the path is a closed path.

Let S represent a subset of pixels in an image. Two pixels p and q are said to be connected in S if

there exists a path between them consisting entirely of pixels in S. For any pixel p in S, the set of

pixels that are connected to it in S is called a connected component of S. If it only has one

connected component, then set S is called a connected set.

Let R be a subset of pixels in an image. R is called a region of the image if R is a connected set.

The boundary (also called border or contour) of a region R is the set of pixels in the region that

have one or more neighbors that are not in R. If R happens to be an entire image (which we


recall is a rectangular set of pixels), then its boundary is defined as the set of pixels in the first

and last rows and columns of the image.

Distance Measures:

For pixels p, q, and z, with coordinates (x, y), (s, t), and (v, w), respectively, D is a distance function (or metric) if it satisfies the standard properties of non-negativity, symmetry, and the triangle inequality. The Euclidean distance between p and q is defined as

    De(p, q) = [(x - s)^2 + (y - t)^2]^(1/2)

For this distance measure, the pixels having a distance less than or equal to some value r from (x, y) are the points contained in a disk of radius r centered at (x, y).

The D4 distance (also called city-block distance) between p and q is defined as

    D4(p, q) = |x - s| + |y - t|

In this case, the pixels having a D4 distance from (x, y) less than or equal to some value r form a diamond centered at (x, y). For example, the pixels with D4 distance <= 2 from (x, y) (the center point) form the following contours of constant distance:

            2
          2 1 2
        2 1 0 1 2
          2 1 2
            2

The pixels with D4 = 1 are the 4-neighbors of (x, y).

The D8 distance (also called chessboard distance) between p and q is defined as

    D8(p, q) = max(|x - s|, |y - t|)

In this case, the pixels with D8 distance from (x, y) less than or equal to some value r form a square centered at (x, y). For example, the pixels with D8 distance <= 2 from (x, y) (the center point) form the following contours of constant distance:

        2 2 2 2 2
        2 1 1 1 2
        2 1 0 1 2
        2 1 1 1 2
        2 2 2 2 2


The pixels with D8=1 are the 8-neighbors of (x, y).

The D4 and D8 distances between p and q are independent of any paths that might exist between

the points because these distances involve only the coordinates of the points. If we elect to

consider m-adjacency, however, the Dm distance between two points is defined as the shortest

m-path between the points. In this case, the distance between two pixels will depend on the

values of the pixels along the path, as well as the values of their neighbors.
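The three distance measures, sketched directly from the formulas above (the printed grids reproduce the diamond-shaped and square-shaped contours):

```python
import numpy as np

def d_euclidean(p, q):
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

def d4(p, q):   # city-block distance
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def d8(p, q):   # chessboard distance
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

# Distances from every pixel of a 5x5 grid to its center pixel:
c = (2, 2)
grid4 = np.array([[d4((i, j), c) for j in range(5)] for i in range(5)])
grid8 = np.array([[d8((i, j), c) for j in range(5)] for i in range(5)])
print(grid4)  # diamond-shaped contours of constant D4 distance
print(grid8)  # square-shaped contours of constant D8 distance
```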

Basic Concepts in Sampling and Quantization:

The most important preliminary step in digital image processing is digitizing the analog signal: the signal is first discretized and then quantized. Digitizing the coordinate values is called sampling, and digitizing the amplitude values of the discretized signal is called quantization.

To understand the basic idea behind sampling and quantization, consider a continuous image f(x, y) that is to be converted into digital form. An image may be continuous with respect to the x- and y-coordinates and also in amplitude; to convert it to digital form, it must be sampled in both coordinates and quantized in amplitude, as illustrated in the figure below.


Fig: Continuous image; (a) sampling; (b) quantization.
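A toy sketch of both steps on a synthetic continuous image (the grid size and number of gray levels are arbitrary choices):

```python
import numpy as np

# A "continuous" image: a 2-D function f(x, y), here chosen for illustration.
f = lambda x, y: 0.5 * (np.sin(2 * np.pi * x) + np.cos(2 * np.pi * y))

# Sampling: evaluate f on a finite grid of coordinates.
n = 8                                   # samples per axis
xs = np.linspace(0.0, 1.0, n)
ys = np.linspace(0.0, 1.0, n)
samples = f(xs[:, None], ys[None, :])   # shape (n, n)

# Quantization: map each amplitude onto one of L discrete gray levels.
L = 4
lo, hi = samples.min(), samples.max()
digital = np.round((samples - lo) / (hi - lo) * (L - 1)).astype(np.uint8)
print(digital)  # a digital image: finite grid, finite set of gray levels
```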

When a sensing strip is used for image acquisition, the number of sensors in the strip establishes

the sampling limitations in one image direction. Mechanical motion in the other direction can be

controlled more accurately, but it makes little sense to try to achieve sampling density in one

direction that exceeds the sampling limits established by the number of sensors in the other.

Quantization of the sensor outputs completes the process of generating a digital image. When a sensing array is used for image acquisition, there is no motion, and the number of sensors in the array establishes the limits of sampling in both directions. Quantization of the sensor outputs is as before.

In practice, the method of sampling is determined by the sensor arrangement used to generate the

image. When an image is generated by a single sensing element combined with mechanical

motion, the output of the sensor is quantized. However, sampling is accomplished by selecting

the number of individual mechanical increments at which we activate the sensor to collect data.


Mechanical motion can be made very exact so, in principle, there is almost no limit as to how

fine we can sample an image. However, practical limits are established by imperfections in the

optics used to focus on the sensor an illumination spot that is inconsistent with the fine resolution

achievable with mechanical displacements.

Fundamental Steps in Digital Image Processing:

The important steps involved in image processing are

(i). Image acquisition
(ii). Image enhancement
(iii). Image restoration
(iv). Image compression
(v). Image segmentation
(vi). Image recognition and
(vii). Color image processing.

The various steps in digital image processing are shown below.

Image acquisition is the first step, wherein the image of an object is acquired using a suitable image sensor (camera).

Image enhancement is among the simplest and most appealing areas of digital image processing. Enhancement techniques bring out detail that is obscured, or simply highlight certain


important features of the image. A familiar example of enhancement is increasing the contrast of an image to make it look better. Image enhancement, however, is always subjective.

Image restoration is an area that also deals with improving the appearance of an image. However, unlike enhancement, which is subjective, image restoration is objective, in the sense that restoration techniques tend to be based on mathematical or probabilistic models of image degradation. Enhancement, on the other hand, is based on human subjective preferences regarding what constitutes a "good" enhancement result.

Color image processing is an area that has been gaining importance because of the significant increase in the use of digital images over the Internet. Color is also used as the basis for extracting features of interest in an image.

Wavelets are the foundation for representing images in various degrees of resolution. This method is also used for image data compression and for pyramidal representation, in which images are subdivided successively into smaller regions.

Compression is a technique for reducing the storage required to save an image, or the bandwidth required to transmit it. This is particularly important for uses of the Internet, which are characterized by significant pictorial content. Image compression is familiar to most computer users in the form of image file extensions, such as the .jpg extension of the JPEG (Joint Photographic Experts Group) image compression standard.

Morphological processing deals with tools for extracting image components that are useful in the

representation and description of shape.

Segmentation means partitioning an image into its constituent parts or objects. In general, autonomous segmentation is one of the most difficult tasks in digital image processing. A rugged segmentation procedure goes a long way toward the successful solution of imaging problems that require objects to be identified individually; weak or erratic segmentation algorithms, on the other hand, almost always lead to failure. In general, the more accurate the segmentation, the more likely recognition is to succeed.

Representation and description always follow the output of a segmentation stage, which usually

is raw pixel data, constituting either the boundary of a region (i.e., the set of pixels separating

one image region from another) or all the points in the region itself. In either case, converting the

data to a form suitable for computer processing is necessary. The first decision that must be

made is whether the data should be represented as a boundary or as a complete region. Boundary


representation is appropriate when the focus is on external shape characteristics, such as corners

and inflections. Regional representation is appropriate when the focus is on internal properties,

such as texture or skeletal shape. In some applications, these representations complement each

other. Choosing a representation is only part of the solution for transforming raw data into a form

suitable for subsequent computer processing. A method must also be specified for describing the

data so that features of interest are highlighted. Description, also called feature selection, deals

with extracting attributes that result in some quantitative information of interest or are basic for

differentiating one class of objects from another. Recognition is the process that assigns a label

to an object based on its descriptors.

Image Compression:

A digital image is a 2-D image I[x, y] represented by a discrete 2-D array of intensity samples, typically on the order of 10^5 to 10^6 of them. Because the number of samples is so large, it is a tedious job not only to process the data but also to store it, i.e., a large amount of memory space is required to store digital images. Another important point is that there is usually redundancy in the data. To overcome these limitations, the data is compressed according to some procedure.

The method of efficiently coding digital image data so as to reduce the number of bits is called image compression. Compression reduces the data redundancy in the image and also the amount of space required to store it.

Normally the redundancy is of three types: spatial redundancy, spectral redundancy, and temporal redundancy. Spatial redundancy is mainly due to correlation between neighboring pixels; spectral redundancy arises from correlation between the various color planes; and temporal redundancy is due to correlation between successive frames in an image sequence.
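As a minimal illustration of exploiting spatial redundancy, here is a simple run-length coder; it is my own sketch of one lossless scheme, not a coder the notes prescribe:

```python
def rle_encode(row):
    """Run-length encode one row of pixel values: [[value, count], ...]."""
    runs = []
    for v in row:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1        # extend the current run
        else:
            runs.append([v, 1])     # start a new run
    return runs

row = [255] * 12 + [0] * 4   # long runs = high spatial redundancy
print(rle_encode(row))       # [[255, 12], [0, 4]] -- 2 pairs instead of 16 values
```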

To compress a digital image, the first step is converting the continuous image into a digital one using an ADC (analog-to-digital converter). This digital signal is fed to a quantizer through a serial-to-parallel converter, and the quantized output is then coded with a suitable lossless coding scheme. This yields the compressed image signal. The block diagram of the image compression system is shown below.


Image Enhancement

Image enhancement is a technique that improves image features such as edges, boundaries, or contrast so that the visual appearance of the image becomes more suitable for display and analysis. The enhancement process does not increase the inherent information content of the data, but it increases the dynamic range of the chosen features so that they can be detected more easily. Image enhancement is a very important process because of its utility in virtually all image processing applications. It includes gray-level and contrast manipulation, noise reduction, edge crispening and sharpening, filtering, interpolation, magnification, and pseudocoloring.

The commonly used image enhancement techniques are given below.

Under point operations we consider (i) contrast stretching, (ii) noise clipping, (iii) window slicing, and (iv) histogram modeling.

Under spatial operations we consider (i) noise smoothing, (ii) median filtering, (iii) unsharp masking, (iv) filtering (low-pass, high-pass, band-pass), and (v) zooming.


Under transform operations we consider (i) linear filtering, (ii) root filtering, and (iii) homomorphic filtering.

Under pseudocoloring we consider (i) false coloring and (ii) pseudocoloring.

Point operations are the simplest; they are zero-memory operations in which a given gray level u ∈ [0, L] is mapped into a gray level v ∈ [0, L] according to a transformation

    v = f(u)

One of the most common defects of photographic or electronic images is poor contrast resulting

from a reduced, and perhaps nonlinear, image amplitude range. Image contrast can often be

improved by amplitude rescaling of each pixel. Low-contrast images usually result from poor or non-uniform lighting conditions, or from nonlinearity or a small dynamic range of the imaging sensor.

The contrast stretching transformation can be expressed as

        { α·u,               0 ≤ u < a
    v = { β·(u - a) + v_a,   a ≤ u < b
        { γ·(u - b) + v_b,   b ≤ u ≤ L

where the parameters a and b can be obtained from the histogram of the image, and v_a, v_b are the output values at the breakpoints.
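A sketch of this piecewise-linear stretch (the parameter values below are illustrative; v_a and v_b follow from continuity at the breakpoints):

```python
import numpy as np

def contrast_stretch(u, a, b, alpha, beta, gamma, L=255):
    """Piecewise-linear contrast stretching of gray levels u in [0, L]."""
    u = u.astype(np.float64)
    va = alpha * a                      # output at breakpoint a
    vb = va + beta * (b - a)            # output at breakpoint b
    v = np.where(u < a, alpha * u,
        np.where(u < b, beta * (u - a) + va,
                 gamma * (u - b) + vb))
    return np.clip(v, 0, L).astype(np.uint8)

# Stretch the mid-range [a, b] (beta > 1), compressing the tails:
img = np.random.randint(80, 160, size=(4, 4), dtype=np.uint8)  # low contrast
print(contrast_stretch(img, a=80, b=160, alpha=0.5, beta=2.5, gamma=0.5))
```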

Clipping and Thresholding: A special case of contrast stretching with α = γ = 0 is called clipping. This is useful for noise reduction when the input signal is known to lie in the range [a, b]. Thresholding is a special case of clipping in which the output becomes binary. For example, a seemingly binary image, such as a printed page, does not give a binary output when scanned, because of sensor noise and background illumination variations; thresholding is used to make such an image binary.

Digital Negative:

A negative image can be obtained by reverse scaling of the gray levels according to the transformation

    v = L - u

Digital negatives are useful in the display of medical images and in producing negative prints of images.
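The negative transformation in code (NumPy assumed):

```python
import numpy as np

L = 255
u = np.array([[0, 64], [128, 255]], dtype=np.uint8)
v = L - u                      # v = L - u: reverse scaling of the gray levels
print(v)                       # [[255 191] [127   0]]
```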


Histogram Modelling:

Histogram modelling is another image enhancement technique, in which the image is modified so that its histogram takes a desired shape. The histogram of an image represents the relative frequency of occurrence of the various gray levels in the image. This is useful for stretching the low-contrast levels of images with narrow histograms. Using histogram equalization, a uniform histogram for the output image is obtained.
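A minimal histogram-equalization sketch; mapping gray levels through the scaled cumulative histogram is the standard construction:

```python
import numpy as np

def histogram_equalize(img, L=256):
    """Map gray levels through the scaled cumulative histogram (CDF),
    which flattens the histogram of the output image."""
    hist = np.bincount(img.ravel(), minlength=L)
    cdf = hist.cumsum() / img.size          # cumulative distribution
    lut = np.round((L - 1) * cdf).astype(np.uint8)
    return lut[img]                         # apply the lookup table

img = np.random.randint(90, 130, size=(64, 64), dtype=np.uint8)  # narrow histogram
eq = histogram_equalize(img)
print(eq.min(), eq.max())                   # gray levels now span a wide range
```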

Spatial Operations: Spatial operations are performed on local neighborhoods of input pixels; the image is convolved with a finite impulse response filter called a spatial mask. Among the spatial operations, averaging (spatial low-pass) filters and median filters are commonly used. In median filtering, each input pixel is replaced by the median of the pixel values in a window W around it:

    v(m, n) = median{ y(m - k, n - l) : (k, l) ∈ W }

Here W denotes the window around the pixel. Generally the window size Nw is chosen to be odd; if Nw is even, the median is taken as the average of the two middle values.
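A direct (unoptimized) sketch of the median filter as defined above, with reflected borders as an assumption:

```python
import numpy as np

def median_filter(y, size=3):
    """Replace each pixel by the median of the size x size window around it.
    Edges are handled by reflecting the image at its borders."""
    r = size // 2
    padded = np.pad(y, r, mode="reflect")
    out = np.empty_like(y)
    rows, cols = y.shape
    for m in range(rows):
        for n in range(cols):
            window = padded[m:m + size, n:n + size]
            out[m, n] = np.median(window)
    return out

y = np.full((5, 5), 100, dtype=np.uint8)
y[2, 2] = 255                      # impulse ("salt") noise
print(median_filter(y)[2, 2])      # 100 -- the outlier is removed
```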

Unsharp Masking and Crispening: The unsharp masking technique is used in the printing industry to crispen edges. A signal proportional to the unsharp (low-pass filtered) version of the image is subtracted from the image, which is equivalent to adding the gradient, or a high-pass signal, to the image.
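A sketch of unsharp masking with a 3x3 box blur standing in for the low-pass filter (the kernel choice and weight are assumptions):

```python
import numpy as np

def unsharp_mask(img, weight=1.0):
    """Sharpen by subtracting a low-pass (blurred) version:
    v = u + weight * (u - lowpass(u))."""
    f = img.astype(np.float64)
    pad = np.pad(f, 1, mode="reflect")
    # 3x3 box average as a simple low-pass filter
    low = sum(pad[i:i + f.shape[0], j:j + f.shape[1]]
              for i in range(3) for j in range(3)) / 9.0
    v = f + weight * (f - low)
    return np.clip(v, 0, 255).astype(np.uint8)

row = np.array([50] * 4 + [200] * 4, dtype=np.uint8)   # a vertical edge
img = np.tile(row, (8, 1))
print(unsharp_mask(img)[0])   # overshoot on both sides crispens the edge
```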

Magnification and Zooming: Sometimes it is desirable to zoom in on a given region of an image so that it can be displayed at a larger size.

Transform Operations: In transform-operation enhancement techniques, zero-memory operations are performed on a transformed image, followed by the inverse transformation.

Image Restoration: Image restoration is the process of filtering an image so that the effect of degradation is minimized. The effectiveness of image restoration filters depends on the extent and accuracy of the knowledge of the degradation process, as well as on the filter design criteria. Any image acquired by optical, electro-optical, or electronic means is likely to be degraded by the sensing environment. The degradation may be due to (i) sensor noise, (ii) blur due to camera misfocus, (iii) relative object-camera motion, or (iv) random atmospheric turbulence. These


errors deteriorate the quality of the image. So we can say that image restoration is a process that recovers a sharp, clean, high-quality image.

The process of image formation and recording can be described by the following equation:

    g(x, y) = R[ ∫∫ h(x - x1, y - y1) f(x1, y1) dx1 dy1 ] + n(x, y)

Here g(x, y) is the recorded image, f is the ideal image, h is the blurring (point-spread) function, R is the response characteristic of the recording process, and n(x, y) is the additive noise source.

In discrete form, the degradation of a digital image can be written as

    g(p, q) = Σ_{i=0}^{N-1} Σ_{j=0}^{N-1} f(i, j) h(p - i, q - j)
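The discrete degradation model, simulated directly from the double sum above (the kernel and noise level are illustrative assumptions):

```python
import numpy as np

def degrade(f, h, noise_sigma=2.0, seed=0):
    """g(p, q) = sum_i sum_j f(i, j) h(p - i, q - j) + n(p, q).
    Zero values are assumed outside the N x N kernel support."""
    N = f.shape[0]
    g = np.zeros((N, N))
    for p in range(N):
        for q in range(N):
            for i in range(N):
                for j in range(N):
                    if 0 <= p - i < N and 0 <= q - j < N:
                        g[p, q] += f[i, j] * h[p - i, q - j]
    rng = np.random.default_rng(seed)
    return g + rng.normal(0.0, noise_sigma, size=g.shape)   # additive noise n

f = np.eye(4) * 100.0                            # tiny test image
h = np.zeros((4, 4)); h[0, 0] = h[0, 1] = 0.5    # horizontal motion-blur kernel
print(np.round(degrade(f, h)))
```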

Image restoration differs from image enhancement: in restoration, the degraded information in the image is recovered, whereas in enhancement, image features are improved or extracted. The digital image restoration process is shown in the following block diagram.

In the image restoration process, the analog image is first converted into a digital image using a suitable ADC. The digital image is then processed with a suitable digital filter, chosen according to the type of degradation present in the image. For removing noise from digital images, nonlinear filters are widely used; among them, median filters and adaptive median filters are the most common. After the noise is removed, the digital image is converted back to an analog image so that it can be displayed or recorded.

Voice Processing: Voice processing is one of the most important signal processing applications; it removes noise from speech or sound and helps transmit the signal over long distances at high speed in a real-time environment. Recent developments in DSP algorithms have become very important here because the bandwidths associated with speech or voice are well matched to the available processing speeds. Voice processing methodology can be broadly classified into three types: (i) speech analysis and synthesis, (ii) compression and coding, and (iii) voice privacy.

Speech is produced by the excitation of an acoustic tube called the vocal tract, which starts at the glottis and is terminated by the lips. The speech signal consists of periodic sounds interspersed with bursts of wide-band noise and short silences. As the vocal organs are continuously in motion, the generated signal is not stationary, so short-time Fourier transforms are used to process it.

To transmit a speech signal over long distances, it is first digitized, encoded, and processed, then transmitted over a limited-bandwidth channel, and finally converted back into speech. Voice signals generally contain a lot of redundancy; to reduce it, the signal is compressed by suitable coding. This compression improves transmission efficiency and storage capacity. Coding of the signal using channel vocoders is shown in the block diagram below.

The channel vocoder splits the speech signal into adjacent, non-overlapping sub-bands whose combined range covers the frequencies humans can hear. The incoming signal is analyzed periodically, every 20 ms. The channel vocoder uses a multiplexer to send two categories of information on the same channel. The sampling rate for all signals is selected above the Nyquist rate, and these samples are


then transmitted as a single signal. A demultiplexer at the receiving end can easily separate each signal; this multiplexing scheme is called time-division multiplexing. As shown in the diagram, band-pass filters divide the speech into smaller frequency bands, and each sub-band is rectified and filtered with a low-pass filter to determine the spectral envelope of the speech. The envelopes are then digitized and multiplexed for transmission. Normally 16 sub-bands are used to cover the entire audible range of frequencies. A pitch detector distinguishes between voiced and unvoiced segments.
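A rough sketch of one analysis channel (band-pass, rectify, low-pass to obtain the spectral envelope), using SciPy; the band edges, filter orders, and sampling rate are illustrative assumptions:

```python
import numpy as np
from scipy import signal

fs = 8000                                   # sampling rate (Hz), assumed
t = np.arange(0, 0.1, 1 / fs)
speech = np.sin(2 * np.pi * 200 * t) + 0.3 * np.random.randn(t.size)

def band_envelope(x, lo, hi, fs):
    """One vocoder channel: band-pass, rectify, low-pass -> spectral envelope."""
    bp = signal.butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    band = signal.sosfilt(bp, x)
    lp = signal.butter(2, 50, btype="lowpass", fs=fs, output="sos")
    return signal.sosfilt(lp, np.abs(band))    # rectified, then smoothed

# A 4-channel toy analysis (a real vocoder would use ~16 sub-bands):
edges = [(100, 500), (500, 1000), (1000, 2000), (2000, 3500)]
envelopes = [band_envelope(speech, lo, hi, fs) for lo, hi in edges]
print([round(float(e.mean()), 4) for e in envelopes])
```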

The speech signal is then synthesized; this is the reverse of coding and compression. The received signal is demultiplexed to separate the different information categories, and the part of the signal carrying the spectral envelope information is converted back into an analog signal. The process of synthesis is shown below.

Finally, each signal segment is band-passed back to its original frequency range. Digital signal processors provide the necessary building blocks to perform all these operations: digital filters can be implemented easily on DSP processors, and their processing times are well within the range required by speech signals. There are also a number of pitch-estimation algorithms that can be implemented using DSPs.

The last part of voice processing is voice privacy, which is very important in systems like telecommunications. In these systems the speech signals are encoded using various coding techniques; pulse code modulation (PCM) is widely used.

RADAR:

RADAR is an acronym for Radio Detection And Ranging. It is used to detect a stationary or moving target by sending radio waves towards it. The main sub-systems of a modern radar are the antenna, the tracking computer, and the signal generator. The tracking computer performs and controls all the functions: it schedules the appropriate antenna positions and transmitted signals as a function of time, keeps track of targets, and runs the display system. The radar block diagram is shown below.

The transmitter of the radar transmits the signals generated by the signal generator through the antenna. The receiver receives the signal echoed from the target, and from the time lapse between the transmitted and received signals, the distance at which the target is located is found. During detection the signal is corrupted by spurious echoes, atmospheric noise, and noise generated in the radar receiver. The noise power at the output of the radar receiver is reduced by using matched digital filters, whose frequency response maximizes the output peak-signal to mean-noise power ratio. The matched filter works like an FIR filter, with its impulse response stored in a memory chip, and it can process the signal in real time to meet a changing target environment. The block diagram of the radar receiver is shown below.


The front end of the radar receiver is mainly analog, due to the high frequencies involved, but fast ADCs digitize the multiple-channel, complex IF signals, and around the antenna digital technology is used. Fast digital interfaces are necessary to read the antenna position and to control other hardware. The main task of a radar's signal processor is to make decisions: after a signal has been transmitted, the receiver starts receiving return signals, with those originating from near objects arriving first, because time of arrival translates into target range. Over the whole receiving period, the signal processor has to decide, for each range bin, whether it contains an object or not. This decision-making is severely hampered by noise: atmospheric noise enters the system through the antenna, and all the electronics in the radar's signal path produce noise too. These noise contributions are reduced by the matched filtering described above.
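A matched-filter sketch: correlating the noisy return with the known transmitted pulse concentrates the echo energy at the target's range bin (the pulse shape and noise level are assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
pulse = np.sin(2 * np.pi * 0.1 * np.arange(32))   # transmitted pulse, assumed
received = rng.normal(0, 1.0, 256)                 # receiver + atmospheric noise
received[100:132] += pulse                         # faint echo at range bin 100

# Matched filter: correlate the return with the known pulse (equivalently,
# an FIR filter whose impulse response is the time-reversed pulse).
output = np.correlate(received, pulse, mode="valid")
print("estimated target range bin:", int(np.argmax(output)))  # ~100
```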

DSP in Telecommunications:

Rapid developments in very-large-scale integration (VLSI) have revolutionized digital integrated circuit technology in terms of small size, low power, low cost, noise immunity, and reliability. Today's efficient digital signal processors are therefore a natural fit for the ever-expanding telecommunications field. Using DSP filters, the signals multiplexed on a high-speed digital line can be filtered so that out-of-band components are reduced; otherwise, distortion occurs due to inter-channel interference. In modern telephone systems, tone signaling is used instead of conventional sequential pulsing, and to generate and detect these tones, suitable signal


processing techniques are applied. Signal processing is also applied to locating faults in the operation of a wide range of signals.

A digital transmission system can easily carry digital signals at a rate of about 1.0 Mbps. Based on this, the T-carrier system was developed, which can transmit 24 voice-band channels simultaneously over a single twisted wire pair, using pulse code modulation (PCM). Digital switching is another important area, implemented with time-division multiplex (TDM) operation; TDM allows nearly 120 signals to be carried simultaneously, via time sharing, on each of 16 input or output connections.

Digital signal processing is also used in pulse-code-modulated transmission terminals to perform basic band-limiting functions using digital filters. The analog signals are typically digitized at a high rate of around 32 kHz into linear PCM or differential PCM code. Digital filters operating at this high sampling rate remove energy above 3.2 kHz as well as the noise or hum below 200 Hz. The use of a CODEC followed by a digital multiplexer reduces the chances of intelligible inter-channel crosstalk.

One of the crucial problems in telecommunications is echo generation, in which the user hears his or her own voice in return during a conversation over a telephone line. This echo is cancelled using suitable digital transversal filters, a technique known as echo cancellation.

Echo Canceller:

Echo is the repetition of a waveform due to reflection from different points or interfaces of the medium. Echo can severely affect the quality and intelligibility of voice conversation in a telephone system; in telecommunications, echo can degrade the quality of service, and echo cancellation is an important part of communication systems. The development of echo reduction started in the 1950s. There are two types of echo in communication systems: acoustic echo and telephone-line hybrid echo. Acoustic echo results from a feedback path set up between the speaker and the microphone in a mobile phone, hands-free phone, teleconference or hearing-aid system. Acoustic echo may be reflected from a multitude of different surfaces, such as walls, ceilings and floors, and travels through different paths. Telephone-line echoes result from an impedance mismatch at telephone exchange hybrids, where the subscriber's 2-wire line is connected to a 4-wire line.


In telephone networks, the cost of running a 4-wire line from the local exchange to subscribers' premises was considered uneconomical. Hence, at the exchange the 4-wire trunk lines are converted to 2-wire subscriber local lines using a 2/4-wire hybrid bridge circuit. Due to any imbalance in this 4/2-wire bridge circuit, some of the signal energy on the 4-wire circuit is bounced back towards the transmitter, constituting an echo signal. If the echo is more than a few milliseconds long, it becomes noticeable and can be annoying and disruptive. Hybrid echo is thus the main source of echo generated in the public switched telephone network (PSTN). Echoes on a telephone line are due to the reflection of signals at points of impedance mismatch on the connecting circuits.

In digital mobile phone systems, the voice signals are processed at two points in the network: first, voice signals are digitized, compressed and coded within the mobile handset; they are then processed again at the radio-frequency interface of the network. The total delay introduced by the various stages of digital signal processing ranges from 80 ms to 100 ms, resulting in a total round-trip delay of 160–200 ms for any echo. A delay of this magnitude makes any appreciable echo disruptive to the communication process. Owing to the inherent processing delay in digital mobile communication systems, it is essential to employ echo cancellers in mobile phone switching centres.

Echo-control circuits attenuate the echo signal by detecting in which direction the conversation is active; echo cancellers are used to break the round-trip echo path.

An echo suppresser is primarily a switch that passes the speech signal during speech-active periods and attenuates the line echo during speech-inactive periods. A line echo suppresser is controlled by a speech/echo detection device. The echo detector monitors the signal levels on the incoming and outgoing lines and decides whether the signal on the line from, say, speaker B to speaker A is speech from speaker B or the echo of speaker A.


If the echo detector decides that the signal is an echo, the signal is heavily attenuated; there is a similar echo-suppression unit in the direction from speaker A to speaker B. The performance of an echo suppresser depends on the accuracy of its echo/speech classification subsystem. The echo of speech usually has a smaller amplitude than the speech signal, but otherwise it has essentially the same spectral characteristics and statistics as the speech itself. The only basis for discriminating speech from echo is therefore the signal level. As a result, the speech/echo classifier may wrongly let through high-level echoes as speech, or attenuate low-level speech as echo.
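A toy level-based classifier along these lines (my own sketch; the frame size and decision margin are hypothetical) makes that weakness explicit:

import numpy as np

def classify_frame(outgoing, incoming, echo_margin_db=6.0):
    # Crude speech/echo decision from signal levels alone. If the
    # incoming frame is much weaker than the outgoing speech, call it
    # echo. As noted above, a loud echo passes as speech and quiet
    # speech is wrongly suppressed as echo.
    out_db = 10 * np.log10(np.mean(outgoing**2) + 1e-12)
    in_db = 10 * np.log10(np.mean(incoming**2) + 1e-12)
    return "echo" if in_db < out_db - echo_margin_db else "speech"

rng = np.random.default_rng(2)
speech = rng.standard_normal(160)                 # 20 ms frame at 8 kHz
print(classify_frame(speech, 0.2 * speech))       # -> echo
print(classify_frame(speech, rng.standard_normal(160)))  # -> speech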

Adaptive Echo Cancellation: The working of an adaptive line echo canceller is shown in the diagram below. The speech signal on the line from speaker A to speaker B is input both to the 4/2-wire hybrid B and to the echo canceller. The echo canceller monitors the signal on the line from B to A and attempts to model and synthesize a replica of the echo of speaker A. This replica is used to subtract and cancel out the echo of speaker A on the line from B to A. The echo canceller is basically an adaptive linear filter whose coefficients are adapted so that the energy of the residual signal on the line is minimized. The echo canceller can be an infinite impulse response (IIR) or a finite impulse response (FIR) filter. The main advantage of an IIR filter is that a long-delay echo can be synthesized by a relatively small number of filter coefficients; in practice, however, echo cancellers are based on FIR filters, mainly because of the practical difficulties associated with the adaptation and stable operation of adaptive IIR filters.


For satisfactory performance, the echo canceller should have a fast convergence rate, so that it can adequately track changes in the telephone line and in the signal characteristics.
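A minimal sketch of such an adaptive FIR canceller using the LMS coefficient update (my own illustration; the echo path, filter length and step size are hypothetical):

import numpy as np

def lms_echo_canceller(far_end, line_in, num_taps=64, mu=0.01):
    # far_end: speaker A's signal feeding the hybrid (filter input).
    # line_in: signal on the line from B to A, containing A's echo.
    # Returns the echo-cancelled residual (error) signal.
    w = np.zeros(num_taps)            # adaptive filter coefficients
    x = np.zeros(num_taps)            # delay line of far-end samples
    e = np.zeros(len(line_in))
    for n in range(len(line_in)):
        x = np.roll(x, 1)
        x[0] = far_end[n]
        y = w @ x                     # synthesized echo replica
        e[n] = line_in[n] - y         # replica subtracted from the line
        w += mu * e[n] * x            # LMS step: minimize residual energy
    return e

# Example: the echo path is an attenuated, delayed copy of the far end.
rng = np.random.default_rng(3)
far = rng.standard_normal(4000)
echo = 0.5 * np.concatenate([np.zeros(10), far[:-10]])
residual = lms_echo_canceller(far, echo)
print("echo power before/after:",
      round(float(np.mean(echo[2000:]**2)), 4),
      round(float(np.mean(residual[2000:]**2)), 6))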

SONAR:

SONAR is an acronym for SOund Navigation And Ranging. This technique, similar to RADAR, is used to determine the range, velocity and direction of objects that are remote from the observer. SONAR employs high-frequency sound waves (ultrasound) to detect objects

under water, because radio waves cannot pass through water. SONAR systems are divided into two main categories: active and passive. A passive SONAR system monitors the undersea environment without sending any energy through the water. An active SONAR system, on the other hand, acts much like RADAR, using the responses from signals sent towards targets.

The major sources of ambient noise in underwater systems are (i) seismic disturbances, (ii) biological organisms' activities, (iii) oceanic turbulence, (iv) distant shipping, (v) wind, and (vi) thermal noise. In deep waters, distant-shipping noise is dominant at frequencies from about 10–20 Hz to about 200–300 Hz, and wind noise is dominant from about 200–300 Hz up to several tens of kHz.


The SONAR transmitter hardware block diagram is shown below. In the diagram, the waveform generator generates CW (continuous-wave) waveforms, e.g. a sinusoidal pulse with a rectangular or Gaussian envelope, or FM (frequency-modulated) waveforms.

The array-shading block performs amplitude shading for side-lobe suppression, and complex shading (amplitude shading combined with phase shifting / time delays) for main-lobe steering, shaping and broadening.
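A small numerical sketch of shading and steering for a uniform line array (my own example; the element count, spacing and steering angle are hypothetical):

import numpy as np

# Hypothetical uniform line array: 16 elements at half-wavelength spacing.
num_elems = 16
c = 1500.0                   # sound speed in water, m/s
f = 30e3                     # operating frequency, Hz
d = c / f / 2                # half-wavelength element spacing, m
n = np.arange(num_elems)

# Amplitude shading: a Hamming taper suppresses side lobes at the
# cost of a slightly broader main lobe.
shading = np.hamming(num_elems)

# Complex shading: phase shifts steer the main lobe to 20 degrees.
steer = np.deg2rad(20.0)
weights = shading * np.exp(-1j * 2*np.pi*f * n*d*np.sin(steer) / c)

# Evaluate the resulting beam pattern over look angles.
angles = np.deg2rad(np.linspace(-90, 90, 721))
steer_vecs = np.exp(1j * 2*np.pi*f * np.outer(np.sin(angles), n*d) / c)
pattern = np.abs(steer_vecs @ weights)
peak = np.degrees(angles[np.argmax(pattern)])
print("main lobe steered to:", peak, "deg")   # -> 20.0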

The power amplifier / impedance-matching stage uses switching amplifiers to achieve high source levels, or linear amplifiers when moderate source levels are sufficient but an enhanced coherence of consecutive pulses is required; impedance-matching networks that supply an optimal coupling of the amplifiers to the transducers are also used.

The SONAR receiver block diagram is shown below.

The signal-conditioning stage provides a preamplifier and band-pass filter with automatic gain control (AGC) and/or (adaptive) time-variable gain ((A)TVG). Quadrature demodulation


(analog or digital), an anti-aliasing filter, and analog-to-digital conversion with 16 up to 24 bits are also used during signal conditioning.

The signal-processing block performs matched filtering / pulse compression, as well as conventional motion-compensated near-field and far-field beamforming in the time or frequency domain.
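A time-domain delay-and-sum beamforming sketch (my own illustration, reusing the hypothetical 16-element array of the transmitter example; real systems interpolate for fractional delays and add motion compensation):

import numpy as np

def delay_and_sum(element_signals, fs, d, c, look_angle_rad):
    # Align and average the channels of a uniform line array.
    # Element m receives a plane wave m*d*sin(theta)/c seconds early,
    # so each channel is delayed by that amount before summation.
    num_elems, n = element_signals.shape
    out = np.zeros(n)
    for m in range(num_elems):
        delay = int(round(m * d * np.sin(look_angle_rad) / c * fs))
        out += np.roll(element_signals[m], delay)
    return out / num_elems

# Example: a 30 kHz plane wave arriving from 20 degrees.
fs, c, f = 192000, 1500.0, 30e3
d = c / f / 2                                  # half-wavelength spacing
t = np.arange(0, 0.01, 1/fs)
m = np.arange(16)[:, None]
arrivals = np.sin(2*np.pi*f * (t + m*d*np.sin(np.deg2rad(20)) / c))
beam = delay_and_sum(arrivals, fs, d, c, np.deg2rad(20))
print("steered-beam RMS:", round(float(np.std(beam)), 2))  # ~0.7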

The information-processing block performs image formation (range decimation, azimuth decimation and interpolation, geo-coding), image fusion (multi-ping and/or multi-aspect mode), and computer-aided detection and classification (CAD and CAC) of targets, either semi-automatic for aiding an operator or fully automatic for autonomous operations.

The display-processing block displays the image of the detected object on the screen.

References: These class notes were prepared based on the book "Digital Image Processing" by Rafael C. Gonzalez and Richard E. Woods, together with other internet material. They are purely for academic purposes, not for commercial use.
