BASICS OF
IMAGE PROCESSING
Paras Prateek Bhatnagar [ 08 ]
Preeti Kumari [ 12 ]
Priyanka Rahi [ 13 ]
Ruchita [ 24 ]
[ EN (H) - IIIrd year (B.Tech) ]
Certificate
This is to certify that the report entitled “Basics of Image Processing”, submitted by Preeti Kumari, Paras Prateek Bhatnagar, Priyanka Rahi & Ruchita, students of B.Tech, EN (H), IIIrd year, is an authentic work of their own, carried out under my supervision and guidance.
The students have put in a great deal of labour and effort to make the project useful.
Mr A. S. Yadav Mr S. Sinha
[ Project Guide ] [ HOD (Electrical and Electronics) ]
Acknowledgement
Making this report would have been an impossible task without the co-operation of the following people. Simply thanking them is not enough, but that is all they would let us do.
We sincerely thank our Project Guide, Mr A. S. Yadav, under whose able guidance and supervision this work has been carried out.
We are also grateful and wish to express our sincere thanks to our HOD (Electrical & Electronics), Mr S. Sinha, who provided us with the required resources.
Preeti Kumari [ 12 ]
Paras Prateek Bhatnagar [ 08 ]
Priyanka Rahi [ 13 ]
Ruchita [ 24 ]
[ EN (H) - 3rd year ]
Contents
1. Introduction to image processing
2. Applications of digital image processing
3. Advantages of digital image processing
4. Disadvantages of digital image processing
5. Working with the primary colours
   5.1. Additive primaries
   5.2. Subtractive primaries
   5.3. CMYK colour model
6. Human vision system
   6.1. Colour vision
   6.2. Visual perception
   6.3. Colours in human brain
   6.4. Mathematics of colour perception
7. Computer vision system
   7.1. RGB image representation
   7.2. Monochrome & Greyscale image representation
   7.3. CMYK colour model
   7.4. HSV and HSL colour model
8. Image Parameters
9. Image Enhancements
   9.1. Histogram Equalization
   9.2. Gamma adjustment
   9.3. Noise reduction
   9.4. Homomorphic filtering
i. List of acronyms
ii. Works Cited
1. Introduction to image processing
The meaning of image processing can be understood by splitting the term into two parts — image and processing.
An image, from the Latin word imago, is an artifact, such as a two-dimensional picture, that has a similar appearance to some subject—usually a physical object or a person.
Images may be two-dimensional, such as a photograph or a screen display, or three-dimensional, such as a statue. They may be captured by optical devices—such as cameras, mirrors, lenses, telescopes and microscopes—or by natural objects and phenomena, such as the human eye or a water surface.
Process or processing typically describes the act of taking something through an established and usually routine set of procedures to convert it from one form to another, as in a manufacturing or administrative procedure such as processing milk into cheese, or processing paperwork to grant a mortgage loan.
Thus, image processing is any form of signal processing for which the input is an image,
such as photographs or frames of video; the output of image processing can be either an
image or a set of characteristics or parameters related to the image. Most image-
processing techniques involve treating the image as a two-dimensional signal and applying
standard signal-processing techniques to it.
Image processing usually refers to digital image processing, but optical and analog image
processing are also possible.
The acquisition of images is referred to as imaging. The following example represents a
basic operation of image processing.
[Figure: the composite image (4) split into its red (1), green (2) and blue (3) channels]
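As a minimal sketch of this channel-splitting operation (assuming the NumPy and Pillow libraries are available; photo.jpg is a hypothetical placeholder file):

    import numpy as np
    from PIL import Image  # Pillow

    # Load a composite RGB image as an array of shape (height, width, 3).
    img = np.asarray(Image.open("photo.jpg").convert("RGB"))

    # Each channel is a 2-D array of 8-bit intensities (0-255).
    red, green, blue = img[:, :, 0], img[:, :, 1], img[:, :, 2]

    # Save each channel as a greyscale image for inspection.
    for name, channel in (("red", red), ("green", green), ("blue", blue)):
        Image.fromarray(channel).save("channel_" + name + ".png")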
2. Applications of image processing
Image processing covers a wide range of operations and their applications, but the following
few may be considered as the most important among them:
Computer vision: Computer vision is concerned with the theory behind artificial
systems that extract information from images. The image data can take many forms,
such as video sequences, views from multiple cameras, or multi-dimensional data
from a medical scanner.
Optical sorting: Optical sorting is the process of visually sorting a product through the use of a photodetector, a camera, or the human eye.
Augmented reality: Augmented reality (AR) is a term for a live direct or indirect
view of a physical real-world environment whose elements are augmented by virtual
computer-generated imagery.
Face detection: Face detection is a computer technology that determines the locations
and sizes of human faces in digital images. It detects facial features and ignores
anything else, such as buildings, trees and bodies.
Feature detection: In computer vision and image processing the concept of feature
detection refers to methods that aim at computing abstractions of image information
and making local decisions at every image point whether there is an image feature of
a given type at that point or not.
Lane departure warning system: In road-transport terminology, a lane departure
warning system is a mechanism designed to warn a driver when the vehicle begins to
move out of its lane on freeways and arterial roads.
Non-photorealistic rendering: Non-photorealistic rendering (NPR) is an area of
computer graphics that focuses on enabling a wide variety of expressive styles for
digital art. In contrast to traditional computer graphics, which has focused on
photorealism, NPR is inspired by artistic styles such as painting, drawing, technical
illustration, and animated cartoons.
Medical image processing: Medical imaging is the technique and process used to
create images of the human body for clinical purposes or medical science.
Microscope image processing: Microscope image processing is a broad term that
covers the use of digital image processing techniques to process, analyse and present
images obtained from a microscope.
Remote sensing: Remote sensing is the acquisition of information about an object or phenomenon by the use of real-time sensing devices that are not in physical or intimate contact with it.
3. Advantages of digital image processing
Among the numerous advantages of digital image processing, the important ones are:
Post - processing of image: Post-processing of the image allows the operator to
manipulate the pixel shades to correct image density and contrast, as well as perform
other processing functions that could result in improved diagnosis and fewer repeated
examinations.
Easy storage and retrieval of image: With the advent of electronic record systems,
images can be stored in the computer memory and easily retrieved on the same
computer screen and can be saved indefinitely or be printed on paper or film if
necessary.
Ease of sharing data: All digital imaging systems can be networked into practice
management software programs facilitating integration of data. With networks, the
images can be viewed in more than one room and can be used in conjunction with
pictures obtained with an optical camera to enhance the patients’ understanding of
treatment.
More use of the same data: Digital imaging allows the electronic transmission of
images to third-party providers, referring dentists, consultants, and insurance carriers
via a network.
Environmentally friendly: Digital imaging is also environmentally friendly since it
does not require chemical processing. It is well known that used film processing
chemicals contaminate the water supply system with harmful metals such as the silver
found in used fixer solution.
Reduction in radiation: Radiation dose reduction is also a benefit derived from the
use of digital systems. Some manufacturers have claimed a 90% decrease in radiation exposure, but the actual savings depend on which systems are being compared.
4. Disadvantages of digital image processing
Along with the advantages, some disadvantages have also been associated with digital image
processing. Important among them are as follows:
High initial cost: The initial cost can be high depending on the system used, the number of
detectors purchased, etc.
Need for extra knowledge: Competency using the software can take time to master depending
on the level of computer literacy of team members.
Limitation on shape and size of detectors: The detectors, as well as the phosphor plates,
cannot be sterilized or autoclaved and in some cases CCD/CMOS detectors pose positioning
limitations because of their size and rigidity. This is not the case with phosphor plates;
however, if a patient has a small mouth, the plates cannot be bent because they will become
permanently damaged.
High maintenance cost: Phosphor plates cost an average of $25 to replace, and CCD/CMOS detectors can cost more than $5,000 per unit. Thus, a digital processing system requires more maintenance than a traditional one.
Need for standardization: Since digital imaging in dentistry is not standardized,
professionals are unable to exchange information without going through an intermediary
process. Hopefully, this will change within the next few years as manufacturers of digital
equipment become DICOM compliant.
5. Working with the primary colours
Primary colours are sets of colours that can be combined to make a useful range of colours.
For human applications, three primary colours are usually used, since human colour vision is
trichromatic. For additive combination of colours, as in overlapping projected lights or in
CRT displays, the primary colours normally used are red, green, and blue. For subtractive
combination of colours, as in mixing of pigments or dyes, such as in printing, the primaries
normally used are cyan, magenta, and yellow.
5.1 Additive primaries
Media that combine emitted lights to create the sensation of a range of colours are
using the additive colour system. Typically, the primary colours used are red, green,
and blue. Television and other computer and video displays are a common example of
the use of additive primaries and the RGB colour model. The exact colours chosen for
the primaries are a technological compromise between the available phosphors and
the need for a large colour triangle to allow a large gamut of colours.
Additive mixing of red and green light produces shades of yellow, orange, or brown.
Mixing green and blue produces shades of cyan; and mixing red and blue produces
shades of purple, including magenta. Mixing nominally equal proportions of the
additive primaries results in shades of grey or white; the colour space that is generated
is called an RGB colour space.
[Figures: additive colour mixing; the sRGB colour triangle]
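As an idealized sketch of additive mixing (8-bit intensities that simply add and clip; real displays also apply gamma correction, which this ignores):

    import numpy as np

    red   = np.array([255, 0, 0], dtype=np.int32)
    green = np.array([0, 255, 0], dtype=np.int32)
    blue  = np.array([0, 0, 255], dtype=np.int32)

    def mix(*lights):
        # Emitted intensities add; clip to the 8-bit display maximum.
        return np.clip(np.sum(lights, axis=0), 0, 255).astype(np.uint8)

    print(mix(red, green))         # [255 255   0] -> yellow
    print(mix(green, blue))        # [  0 255 255] -> cyan
    print(mix(red, green, blue))   # [255 255 255] -> white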
5.2 Subtractive primaries
Media that use reflected light and colorants to produce colours are using the subtractive colour method of colour mixing. Red, yellow, and blue (RYB) make up the primary colour triad in a standard colour wheel; the secondary colours VOG (violet, orange, and green) make up another triad. Triads are formed by three equidistant colours on a particular colour wheel; neither RYB nor VOG is equidistant on a perceptually uniform colour wheel, but rather they have been defined to be equidistant in the RYB wheel. Painters have long used more than three "primary" colours in their palettes—and at one point considered red, yellow, blue, and green to be the four primaries. Red, yellow, blue, and green are still widely considered the four psychological primary colours, though red, yellow, and blue are sometimes listed as the three psychological primaries, with black and white occasionally added as a fourth and fifth.
During the 18th century, as theorists became aware of Isaac Newton’s scientific
experiments with light and prisms, red, yellow, and blue became the canonical
primary colours. This theory became dogma, despite abundant evidence that red,
yellow, and blue primaries cannot mix all other colours, and has survived in colour
theory to the present day. Using red, yellow, and blue as primaries yields a relatively
small gamut, in which, among other problems, colourful green, cyan, and magenta are
impossible to mix, because red, yellow, and blue are not well-spaced around a
perceptually uniform colour wheel. For this reason, modern three- or four-colour
printing processes, as well as colour photography, use cyan, yellow, and magenta as
primaries instead. Most painters include colours in their palettes which cannot be
mixed from yellow, red, and blue paints, and thus do not fit within the RYB colour
model. The cyan, magenta, and yellow used in printing are sometimes known as
"process blue," "process red," "process yellow".
[Figure: the standard RYB colour wheel]
5.3 CMYK Colour Model
In the printing industry, to produce varying colours, the subtractive primaries cyan, magenta, and yellow are applied together in varying amounts.
Mixing yellow and cyan produces green colours; mixing yellow with magenta
produces reds, and mixing magenta with cyan produces blues. In theory, mixing equal
amounts of all three pigments should produce grey, resulting in black when all three
are applied in sufficient density, but in practice they tend to produce muddy brown
colours. For this reason, and to save ink and decrease drying times, a fourth pigment,
black, is often used in addition to cyan, magenta, and yellow.
The resulting model is the so-called CMYK colour model. The abbreviation stands
for cyan, magenta, yellow, and key—black is referred to as the key colour. In practice,
colorant mixtures in actual materials such as paint tend to be more complex. Brighter
or more saturated colours can be created using natural pigments instead of mixing,
and natural properties of pigments can interfere with the mixing. In the subtractive
model, adding white to a colour, whether by using less colorant or by mixing in a
reflective white pigment such as zinc oxide, does not change the colour’s hue but does
reduce its saturation. Subtractive colour printing works best when the surface or
paper is white, or close to it.
A system of subtractive colour does not have a simple chromaticity gamut analogous
to the RGB colour triangle, but a gamut that must be described in three dimensions.
There are many ways to visualize such models, using various 2D chromaticity spaces
or in 3D colour spaces.
[Figures: subtractive colour mixing; an opponent-process demonstration]
6. Human Vision System
6.1 Colour vision
Fundamentally, light is a continuous spectrum of wavelengths, so the stimuli detectable by the human eye form, in principle, an infinite-dimensional stimulus space. However, the
human eye normally contains only three types of colour receptors, called cone cells.
Each colour receptor responds to different ranges of the colour spectrum. Humans and
other species with three such types of colour receptors are known as trichromats.
These species respond to the light stimulus via a three-dimensional sensation, which
generally can be modelled as a mixture of three primary colours. Species with
different numbers of receptor cell types would have colour vision requiring a different
number of primaries. Since humans can only see down to about 400 nanometres, while tetrachromats can see into the ultraviolet to about 300 nanometres, this fourth primary colour might
be located in the shorter-wavelength range. The peak response of human colour
receptors varies, even among individuals with "normal" colour vision.
The cones are conventionally labelled according to the ordering of the wavelengths of
the peaks of their spectral sensitivities: short (S), medium (M), and long (L) cone
types, also sometimes referred to as blue, green, and red cones. While the L cones are
often referred to as the red receptors, micro spectrophotometry has shown that their
peak sensitivity is in the greenish-yellow region of the spectrum. Similarly, the S- and
M-cones do not directly correspond to blue and green, although they are often
depicted as such.
[Figures: normalized colour response curves; single-colour sensitivity diagram]
The following table shows the range and peak wavelength that can be detected by
different cone cells:
Cone type    Name    Range         Peak wavelength
S            β       400–500 nm    420–440 nm
M            γ       450–630 nm    534–545 nm
L            ρ       500–700 nm    564–580 nm
A range of wavelengths of light stimulates each of these receptor types to varying
degrees. Yellowish-green light, for example, stimulates both L and M cones equally
strongly, but only stimulates S-cones weakly. Red light, on the other hand, stimulates
L cones much more than M cones, and S cones hardly at all; blue-green light
stimulates M cones more than L cones and S cones a bit more strongly, and is also the
peak stimulant for rod cells; and blue light stimulates almost exclusively S-cones.
Violet light appears to stimulate both L and S cones to some extent, but M cones very
little, producing a sensation that is somewhat similar to magenta. The brain combines
the information from each type of receptor to give rise to different perceptions of
different wavelengths of light.
6.2 Visual Perception
Visual perception is the ability to interpret information and surroundings from
visible light reaching the eye. The resulting perception is also known as eyesight,
sight or vision. The various physiological components involved in vision are referred
to collectively as the visual system, and are the focus of much research in psychology,
cognitive science, neuroscience and molecular biology.
The visual system in humans allows individuals to assimilate information from the
environment. The act of seeing starts when the lens of the eye focuses an image of its
surroundings onto a light-sensitive membrane in the back of the eye, called the retina.
The retina is actually part of the brain that is isolated to serve as a transducer for the
conversion of patterns of light into neuronal signals. The lens of the eye focuses light
on the photoreceptive cells of the retina, which detect the photons of light and respond
by producing neural impulses. These signals are processed in a hierarchical fashion by
different parts of the brain, from the retina to the lateral geniculate nucleus, to the
primary and secondary visual cortex of the brain. Signals from the retina can also
travel directly from the retina to the Superior colliculus.
6.3 Colours in human brain
Colour processing begins at a very early level in the visual system through initial
colour-opponent mechanisms. Opponent mechanisms refer to the opposing colour
effect of red-green, blue-yellow, and light-dark. Visual information is then sent back
via the optic nerve to the optic chiasma: a point where the two optic nerves meet and
information from the temporal visual field crosses to the other side of the brain. After
the optic chiasma the visual fibre tracts are referred to as the optic tracts, which enter
the thalamus to synapse at the lateral geniculate nucleus (LGN). The LGN is
segregated into six layers: two magnocellular (large cell) achromatic layers (M
cells) and four parvocellular (small cell) chromatic layers (P cells). Within the LGN
P-cell layers there are two chromatic opponent types: red vs. green and blue vs.
green/red.
After synapsing at the LGN, the visual tract continues on back toward the primary
visual cortex (V1) located at the back of the brain within the occipital lobe. Within
V1 there is a distinct band (striation). This is also referred to as "striate cortex", with
other cortical visual regions referred to collectively as "extra striate cortex". It is at
this stage that colour processing becomes much more complicated. In V1 the simple
three-colour segregation begins to break down. Many cells in V1 respond to some
parts of the spectrum better than others, but this "colour tuning" is often different
depending on the adaptation state of the visual system. A given cell that might
respond best to long wavelength light if the light is relatively bright might then
become responsive to all wavelengths if the stimulus is relatively dim. Because the
colour tuning of these cells is not stable, some believe that a different, relatively
small, population of neurons in V1 is responsible for colour vision. These specialized
"colour cells" often have receptive fields that can compute local cone ratios. Double
opponent cells are clustered within localized regions of V1 called blobs, and are
thought to come in two flavours, red-green and blue-yellow. Red-green cells
compare the relative amounts of red-green in one part of a scene with the amount of
red-green in an adjacent part of the scene, responding best to local colour contrast.
This is the first part of the brain in which colour is processed in terms of the full range
of hues found in colour space.
[Figures: human eye colour vision chart; visual pathways in the human brain]
6.4 Mathematics of colour perception
A "physical colour" is a combination of pure spectral colours.Since there are, in
principle, infinitely many distinct spectral colours, the set of all physical colours may
be thought of as an infinite-dimensional vector space, in fact a Hilbert space. We
call this space Hcolour. More technically, the space of physical colours may be
considered to be the cone over the simplex whose vertices are the spectral colours,
with white at the centroid of the simplex, black at the apex of the cone, and the
monochromatic colour associated with any given vertex somewhere along the line
from that vertex to the apex depending on its brightness. An element C of H_colour is a
function from the range of visible wavelengths—considered as an interval of real
numbers [Wmin,Wmax]—to the real numbers, assigning to each wavelength w in
[Wmin,Wmax] its intensity C(w).
A humanly perceived colour may be modelled as three numbers: the extents to which
each of the 3 types of cones is stimulated. Thus a humanly perceived colour may be
thought of as a point in three-dimensional Euclidean space. We call this space R³_colour. Since each wavelength w stimulates each of the 3 types of cone cells to a
known extent, these extents may be represented by 3 functions s(w), m(w), l(w)
corresponding to the response of the S, M, and L cone cells, respectively.
Finally, since a beam of light can be composed of many different wavelengths, to
determine the extent to which a physical colour C in H_colour stimulates each cone cell, we must calculate the integral, over the interval [Wmin, Wmax], of C(w)·s(w), of C(w)·m(w), and of C(w)·l(w). The triple of resulting numbers associates each physical colour C (an element of H_colour) with a particular perceived colour (a single point in R³_colour). This association is easily seen to be linear. It may also easily be seen that many different elements of the "physical" space H_colour can all result in the same single perceived colour in R³_colour, so a perceived colour is not unique to one physical colour.
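Written out explicitly, the three integrals just described give the tristimulus triple as:

    \[
      (S, M, L) \;=\; \left(
        \int_{W_{\min}}^{W_{\max}} C(w)\,s(w)\,dw,\;
        \int_{W_{\min}}^{W_{\max}} C(w)\,m(w)\,dw,\;
        \int_{W_{\min}}^{W_{\max}} C(w)\,l(w)\,dw
      \right)
    \]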
Technically, the image of the (mathematical) cone over the simplex whose vertices are the spectral colours, under this linear mapping, is also a (mathematical) cone in R³_colour. Moving directly away from the vertex of this cone represents maintaining the same chromaticity while increasing its intensity. Taking a cross-section of this cone yields a 2D chromaticity space. Both the 3D cone and its projection or cross-section are convex sets; that is, any mixture of spectral colours is also a colour.
[Figure: the CIE 1931 xy chromaticity diagram]
7. Computer Vision System
7.1 RGB image representation
The RGB colour model is the most common way to encode colour in computing, and
several different binary digital representations are in use. The main characteristic of
all of them is the quantization of the possible values per component by using only
integer numbers within some range, usually from 0 to some power of two minus one
(2^n − 1) to fit them into some bit groupings. As usual in computing, the values can be represented in both decimal and hexadecimal notation, as in the HTML colour text-encoding convention. RGB values encoded in 24 bits per pixel
(bpp) are specified using three 8-bit unsigned integers (0 through 255) representing
the intensities of red, green, and blue. This representation is the current mainstream
standard representation for the so-called truecolour and common colour interchange
in image file formats such as JPEG or TIFF. It allows more than 16 million different
combinations, many of which are indistinguishable to the human eye. The following
image shows the three "fully saturated" faces of a 24-bpp RGB cube, unfolded into a
plane:
(0, 0, 0) is black
(255, 255, 255) is white
(255, 0, 0) is red
(0, 255, 0) is green
(0, 0, 255) is blue
(255, 255, 0) is yellow
(0, 255, 255) is cyan
(255, 0, 255) is magenta
[Figure: the three "fully saturated" faces of the 24-bpp RGB cube, unfolded, with vertices labelled yellow (255,255,0), green (0,255,0), cyan (0,255,255), red (255,0,0), blue (0,0,255) and magenta (255,0,255)]
The above definition uses a convention known as full-range RGB. Colour values are
also often scaled from and to the range 0.0 through 1.0, especially when they are mapped from or to other colour models and encodings. The 256 levels of a primary usually do
not represent equally spaced intensities, due to gamma correction. This representation
cannot offer the exact mid-point 127.5, or other non-integer values, as bytes do not
hold fractional values, so these need to be rounded or truncated to a nearby integer
value. For example, Microsoft considers the colour "medium grey" to be the
(128,128,128) RGB triplet in its default palette. The effect of such quantization is
usually not noticeable, but may build up in repeated editing operations or colorspace
conversions. Typically, RGB for digital video is not full range. Instead, video RGB
uses a convention with scaling and offsets such that (16, 16, 16) is black, (235, 235,
235) is white, etc.
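As an illustrative sketch, the following packs and unpacks the full-range 24-bit representation described above (8 bits per component):

    def pack_rgb24(r, g, b):
        # Pack three 8-bit components into one 24-bit integer (0xRRGGBB).
        return (r << 16) | (g << 8) | b

    def unpack_rgb24(value):
        # Recover the three 8-bit components.
        return (value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF

    assert pack_rgb24(255, 255, 0) == 0xFFFF00       # yellow
    assert unpack_rgb24(0xFF00FF) == (255, 0, 255)   # magenta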
32-bit graphic mode
The so-called 32 bpp display graphic mode is identical in precision to the 24 bpp
mode; there are still only eight bits per component, and the eight extra bits are often
not used at all. The reason for the existence of the 32 bpp mode is the higher speed at
which most modern 32-bit hardware can access data that is aligned to word addresses,
compared to data not so aligned.
32-bit RGBA
With the need for compositing images came a variant of 24-bit RGB which includes
an extra 8-bit channel for transparency, thus resulting also in a 32-bit format. The
transparency channel is commonly known as the alpha channel, so the format is
named RGBA. This extra channel allows for alpha blending of the image over
another, and is a feature of the PNG format.
48-bit RGB
High precision colour management typically uses up to 16 bits per component,
resulting in 48 bpp. This makes it possible to represent 65,536 tones of each colour
component instead of 256. This is primarily used in professional image editing, such as Adobe Photoshop, to maintain greater precision when a sequence of image-filtering algorithms is applied to the image.
16-bit RGB
A 16-bit mode known as Highcolor uses either 5 bits per colour, called 555 mode (32,768 colours), or the same with an extra bit for green, called 565 mode (65,536 colours). This was the high end for some display adapters for personal computers during the 1990s, but today it is considered somewhat obsolete. It is still in use in many devices with colour screens, such as cell phones, digital cameras, personal digital assistants and videogame consoles.
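A sketch of the 565 packing just described; the bit-replication on unpacking is a common convention assumed here, not mandated by the format:

    def pack_rgb565(r, g, b):
        # Keep the top 5 bits of red and blue and the top 6 bits of green.
        return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

    def unpack_rgb565(value):
        r = (value >> 11) & 0x1F
        g = (value >> 5) & 0x3F
        b = value & 0x1F
        # Replicate the high bits into the low bits to re-span 0-255.
        return (r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2)

    assert pack_rgb565(255, 255, 255) == 0xFFFF      # white
    assert unpack_rgb565(0xFFFF) == (255, 255, 255)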
3-bit RGB
The minimum RGB binary representation is 3-bit RGB, one bit per component.
Typical for early colour terminals in the 1970s, it is still used today with the
Teletext TV retrieval service.
[Figures: sample images rendered in 16-bit, 24-bit and 3-bit RGB]
7.2 Monochrome & Greyscale image representation
Monochrome palettes have a number of shades of grey, ranging from black to white, these being the darkest and lightest possible "greys", respectively. The general rule is that those palettes have 2^n different shades of grey, where n is the number of bits needed to represent a single pixel.
1-bit Monochrome
Monochrome graphics displays typically have a black background with a white or light grey image, though green and amber monochrome monitors were also common. Such a palette requires only one bit per pixel.
In some systems, such as the Hercules and CGA graphics cards for the IBM PC, a bit value of 1 represents a white pixel and a value of 0 a black one. In others, such as the Atari ST and the Apple Macintosh with monochrome monitors, a bit value of 0 means a white pixel and a value of 1 means a black pixel, which approximates the logic of printing on white paper.
2-bit Greyscale
In a 2-bit colour palette each pixel's value is represented by 2 bits, resulting in a 4-value palette (2^2 = 4). It has black, white and two intermediate levels of grey.
A monochrome 2-bit palette is used on:
NeXT Computer, NeXTcube and NeXTstation monochrome graphic
displays.
Original Game Boy system portable videogame console.
Macintosh PowerBook 150 monochrome LCD displays.
4-bit Greyscale
In a 4-bit colour palette each pixel's value is represented by 4 bits, resulting in a 16-value palette (2^4 = 16).
A monochrome 4-bit palette is used on:
MOS Technology VDC on the Commodore 128
8-bit Greyscale
In an 8-bit colour palette each pixel's value is represented by 8 bits, resulting in a 256-value palette (2^8 = 256). This is usually the maximum number of greys in ordinary monochrome systems; each image pixel occupies a single memory byte.
Most scanners can capture images in 8-bit greyscale, and image file formats like
TIFF and JPEG natively support this monochrome palette size. Alpha channels employed for video overlay also use this palette. The grey level indicates the opacity of the blended image pixel over the background image pixel.
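A minimal sketch of reducing an 8-bit greyscale image to one of these smaller palettes (uniform quantization, with the levels rescaled back over 0-255 for display):

    import numpy as np

    def quantize_grey(image, bits):
        # Map 8-bit values onto 2**bits levels, then spread the levels
        # back across the 0-255 range so the result remains viewable.
        levels = 2 ** bits
        indices = np.asarray(image, dtype=np.uint8) // (256 // levels)
        return (indices * (255 // (levels - 1))).astype(np.uint8)

    ramp = np.arange(256, dtype=np.uint8)
    print(np.unique(quantize_grey(ramp, 2)))   # the 4 greys: [  0  85 170 255]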
[Figures: sample palettes at 1-bit monochrome and 2-bit, 4-bit and 8-bit greyscale]
7.3 CMYK colour model
The CMYK colour model is a subtractive colour model, used in colour printing,
and is also used to describe the printing process itself. CMYK refers to the four inks
used in some colour printing: cyan, magenta, yellow, and key (black). Though it varies
by print house, press operator, press manufacturer and press run, ink is typically
applied in the order of the abbreviation. The “K” in CMYK stands for key since in
four-colour printing cyan, magenta, and yellow printing plates are carefully keyed or
aligned with the key of the black key plate. Some sources suggest that the “K” in
CMYK comes from the last letter in "black" and was chosen because B already means
blue. However, this explanation, though plausible and useful as a mnemonic, is
incorrect.
The CMYK model works by partially or entirely masking colours on a lighter, usually
white, background. The ink reduces the light that would otherwise be reflected. Such
a model is called subtractive because inks “subtract” brightness from white.
In the CMYK model, white is the natural colour of the paper or other background,
while black results from a full combination of coloured inks. To save money on ink,
and to produce deeper black tones, unsaturated and dark colours are produced by
using black ink instead of the combination of cyan, magenta and yellow.
Why is black ink used?
The “black” generated by mixing cyan, magenta and yellow primaries is
unsatisfactory, and so four-colour printing uses black ink in addition to the subtractive
primaries. Common reasons for using black ink include:
Text is typically printed in black and includes fine detail, so reproducing text or other finely detailed outlines using three inks without slight blurring would require impractically accurate registration.
A combination of 100% cyan, magenta, and yellow inks soaks the paper with
ink, making it slower to dry, and sometimes impractically so.
A combination of 100% cyan, magenta, and yellow inks often results in a
muddy dark brown colour that does not quite appear black. Adding black ink
absorbs more light, and yields much darker blacks.
Using black ink is less expensive than using the corresponding amounts of
coloured inks.
Comparison with RGB displays
Comparisons between RGB displays and CMYK prints can be difficult, since the
colour reproduction technologies and properties are so different. A computer monitor
mixes shades of red, green, and blue to create colour pictures. A CMYK printer must
compete with the many shades of RGB with only one shade each of cyan, magenta
and yellow, which it will mix using dithering, halftoning or some other optical
technique.
Conversion
Since RGB and CMYK spaces are both device-dependent spaces, there is no simple
or general conversion formula that converts between them. Conversions are generally
done through colour management systems, using colour profiles that describe the
spaces being converted. Nevertheless, the conversions cannot be exact, particularly
where these spaces have different gamuts. A general method that has emerged for the
case of halftone printing is to treat each tiny overlap of colour dots as one of 8
(combinations of CMY) or of 16 (combinations of CMYK) colours, which in this
context are known as Neugebauer primaries. The resultant colour would be an area-
weighted colorimetric combination of these primary colours.
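Given that no simple exact formula exists, the following Python sketch shows only the common naive approximation (ignoring colour profiles and gamuts; channel values as floats in 0.0-1.0):

    def rgb_to_cmyk(r, g, b):
        # The key (black) component absorbs the common darkness...
        k = 1.0 - max(r, g, b)
        if k == 1.0:
            return 0.0, 0.0, 0.0, 1.0   # pure black
        # ...and the remainder becomes cyan, magenta and yellow.
        c = (1.0 - r - k) / (1.0 - k)
        m = (1.0 - g - k) / (1.0 - k)
        y = (1.0 - b - k) / (1.0 - k)
        return c, m, y, k

    def cmyk_to_rgb(c, m, y, k):
        return (1.0 - c) * (1.0 - k), (1.0 - m) * (1.0 - k), (1.0 - y) * (1.0 - k)

    print(rgb_to_cmyk(1.0, 0.0, 0.0))   # red -> (0.0, 1.0, 1.0, 0.0)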
[Figures: a colour photograph; the image separated with cyan, magenta and yellow; and the image separated with cyan, magenta, yellow and black]
7.4 HSV and HSL colour model
HSL and HSV are the two most common cylindrical-coordinate representations of points in
an RGB colour model, which rearrange the geometry of RGB in an attempt to be more
perceptually relevant than the Cartesian representation. HSL stands for hue, saturation, and
lightness, and is often also called HLS. HSV stands for hue, saturation, and value, and is
also often called HSB (B for brightness). Unfortunately, while typically consistent, these
definitions are not standardized, and any of these abbreviations might be used for any of these
three or several other related cylindrical models.
The purpose of these models is to aid selection, comparison, and modification of colours by
organizing them into a cylindrical geometry which roughly corresponds to human perception.
Both models are derived from the Cartesian RGB cube. Both place greys along a central vertical axis, with black at its bottom and white at its top, and push the most colourful colours to the edge of the cylinder. The angle around the axis corresponds to “hue”,
the distance from the axis corresponds to “saturation”, and the distance along the axis
corresponds to “lightness”, “value” or “brightness”. Because HSL and HSV are simple
transformations of device-dependent RGB models, the physical colours they define depend on
the colours of the red, green, and blue primaries of the device or of the particular RGB space,
and on the gamma correction used to represent the amounts of those primaries. Each unique
RGB device therefore has unique HSL and HSV spaces to accompany it, and numerical HSL
or HSV values describe a different colour for each basis RGB space.
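Python's standard library ships these transformations in the colorsys module; a brief sketch (channels as floats in 0.0-1.0):

    import colorsys

    r, g, b = 1.0, 0.5, 0.0   # an orange

    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    h2, l, s2 = colorsys.rgb_to_hls(r, g, b)   # note the H-L-S result order

    print("HSV: hue=%.0f deg, sat=%.2f, value=%.2f" % (h * 360, s, v))
    print("HSL: hue=%.0f deg, sat=%.2f, lightness=%.2f" % (h2 * 360, s2, l))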
[Figures: 3D and 2D representations of the HSL and HSV models; comparison of the HSL and HSV models]
8. Image Parameters
Brightness
Brightness is an attribute of visual perception in which a source appears to be
radiating or reflecting light. In other words, brightness is the perception elicited by the
luminance of a visual target. This is a subjective attribute/property of an object being
observed.
In the RGB colour space, brightness can be thought of as the arithmetic mean (µ) of
the red, green, and blue colour coordinates.
Brightness is also a colour coordinate in the HSB or HSV colour space.
Contrast
Contrast is the difference in visual properties that makes an object distinguishable
from other objects and the background. In visual perception of the real world, contrast
is determined by the difference in the colour and brightness of the object and other
objects within the same field of view.
Luminance
Luminance is the density of luminous intensity in a given direction. The SI unit for
luminance is candela per square metre.
In imaging operations, luminosity is the term often used, incorrectly, to refer to the luma component of a colour image signal; that is, a weighted sum of the nonlinear red, green, and blue signals. It is typically calculated with the Rec. 601 luma coefficients as:
Luma (Y’) = 0.299 R’ + 0.587 G’ + 0.114 B’
The "L" in HSL colour space is sometimes said incorrectly to stand for luminosity.
"L" in this case is calculated as 1/2 (MAX + MIN), where MAX and MIN refer to the
highest and lowest of the R'G'B' components to be converted into HSL colour space.
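A small sketch of the three quantities just defined, for a single R'G'B' pixel with components in 0-255:

    def brightness(r, g, b):
        # Arithmetic mean of the three colour coordinates.
        return (r + g + b) / 3

    def luma_rec601(r, g, b):
        # Weighted sum of the nonlinear (gamma-corrected) components.
        return 0.299 * r + 0.587 * g + 0.114 * b

    def hsl_lightness(r, g, b):
        # The "L" of HSL: midpoint of the largest and smallest components.
        return (max(r, g, b) + min(r, g, b)) / 2

    print(brightness(255, 128, 0))      # ~127.67
    print(luma_rec601(255, 128, 0))     # ~151.38
    print(hsl_lightness(255, 128, 0))   # 127.5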
Gamma
A gamma value is used to quantify contrast, for example of photographic film. It is the slope of the input–output curve in log–log space, that is:

gamma = d(log output) / d(log input)

Gamma values less than 1 are typical of negative film, and values greater than 1 are typical of slide (reversal) film.
9. Image Enhancements
9.1 Histogram Equalization
Histogram equalization is a method in image processing of contrast adjustment
using the image's histogram. This method usually increases the global contrast of
many images, especially when the usable data of the image is represented by close
contrast values. Through this adjustment, the intensities can be better distributed on
the histogram. This allows for areas of lower local contrast to gain a higher contrast
without affecting the global contrast. Histogram equalization accomplishes this by
effectively spreading out the most frequent intensity values.
The method is useful in images with backgrounds and foregrounds that are both bright
or both dark. In particular, the method can lead to better views of bone structure in X-ray images, and to better detail in photographs that are over- or under-exposed. A
key advantage of the method is that it is a fairly straightforward technique and an
invertible operator. So in theory, if the histogram equalization function is known,
then the original histogram can be recovered. The calculation is not computationally
intensive. A disadvantage of the method is that it is indiscriminate. It may increase
the contrast of background noise, while decreasing the usable signal.
Implementation
Consider a discrete greyscale image {x} and let n_i be the number of occurrences of grey level i. The probability of an occurrence of a pixel of level i in the image is

p_x(i) = n_i / n,    0 ≤ i < L,

L being the total number of grey levels in the image, n being the total number of pixels in the image, and p_x(i) being in fact the image's histogram for pixel value i, normalized to [0, 1]. Let us also define the cumulative distribution function corresponding to p_x as

cdf_x(i) = Σ_{j=0}^{i} p_x(j),

which is also the image's accumulated normalized histogram. We would like to create a transformation of the form y = T(x) to produce a new image {y}, such that its CDF will be linearized across the value range, i.e.

cdf_y(i) = i · K

for some constant K. The properties of the CDF allow us to perform such a transform; it is defined as

y = T(x) = cdf_x(x).

Notice that T maps the levels into the range [0, 1]. In order to map the values back into their original range, the following simple transformation needs to be applied to the result:

y′ = y · (max − min) + min.
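A compact NumPy sketch of this transform for an 8-bit greyscale image (a minimal implementation, not production code):

    import numpy as np

    def equalize_histogram(image):
        # image: 2-D uint8 array. Histogram, then cdf_x normalized to [0, 1].
        hist = np.bincount(image.ravel(), minlength=256)
        cdf = hist.cumsum() / image.size
        # y = T(x) = cdf_x(x), mapped back onto the original 0-255 range.
        lut = np.round(cdf * 255).astype(np.uint8)
        return lut[image]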
Histogram equalization of colour images
The above describes histogram equalization on a greyscale image. However it can
also be used on colour images by applying the same method separately to the Red,
Green and Blue components of the RGB colour values of the image. Still, it should be
noted that applying the same method on the Red, Green, and Blue components of an
RGB image may yield dramatic changes in the image's colour balance since the
relative distributions of the colour channels change as a result of applying the
algorithm. However, if the image is first converted to another colour space, such as the Lab colour space or the HSL/HSV colour space, then the algorithm can be
applied to the luminance or value channel without resulting in changes to the hue
and saturation of the image.
9.2 Gamma adjustment
Gamma correction, gamma nonlinearity, gamma encoding, or often simply gamma, is the name of a nonlinear operation used to code and decode luminance or tristimulus values in video or still image systems. Gamma correction is, in the simplest cases, defined by the following power-law expression:

V_out = A · V_in^γ

where the input and output values are non-negative real values, typically in a predetermined range such as 0 to 1, and A is a constant (A = 1 in the common case). A gamma value γ < 1 is sometimes called an encoding gamma, and the process of encoding with this compressive power-law nonlinearity is called gamma compression; conversely, a gamma value γ > 1 is called a decoding gamma, and the application of the expansive power-law nonlinearity is called gamma expansion.
A cathode ray tube (CRT) converts a video signal to light in a nonlinear way, because the electron gun's intensity as a function of applied video voltage is nonlinear. The light intensity I is related to the source voltage V_S according to

I ∝ V_S^γ

where γ is the Greek letter gamma. For a CRT, the gamma that relates brightness to voltage is usually in the range 2.35 to 2.55; video look-up tables in computers usually adjust the system gamma to the range 1.8 to 2.2, which is in the region that makes a uniform encoding difference give an approximately uniform perceptual brightness difference.
For simplicity, consider the example of a monochrome CRT. In this case, when a
video signal of 0.5 (representing mid-grey) is fed to the display, the intensity or
brightness is about 0.22 (resulting in a dark grey). Pure black (0.0) and pure white
(1.0) are the only shades that are unaffected by gamma.
To compensate for this effect, the inverse transfer function (gamma correction) is sometimes applied to the video signal so that the end-to-end response is linear. In other words, the transmitted signal is deliberately distorted so that, after it has been distorted again by the display device, the viewer sees the correct brightness. The inverse of the function above is:

V_C = V_S^(1/γ)

where V_C is the corrected voltage and V_S is the source voltage; for example, for a CRT with γ = 2.2, 1/γ ≈ 0.45.
A colour CRT receives three video signals (red, green and blue) and in general each
colour has its own value of gamma, denoted γR, γG or γB. However, in simple display
systems, a single value of γ is used for all three colours. The power-law function, or
its inverse, has a slope of infinity at zero. This leads to problems in converting from
and to a gamma colorspace.
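A brief sketch of gamma compression and expansion under this power law (signals normalized to 0.0-1.0, with A = 1 assumed):

    def gamma_encode(v, gamma=2.2):
        # Compress linear light for storage or transmission: V ** (1/gamma).
        return v ** (1.0 / gamma)

    def gamma_decode(v, gamma=2.2):
        # Expand an encoded value back to linear light: V ** gamma.
        return v ** gamma

    print(gamma_decode(0.5))                 # ~0.22: why a mid signal displays dark grey
    print(gamma_decode(gamma_encode(0.5)))   # ~0.5: end-to-end response is linear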
Methods to perform display gamma correction in computing
Up to four elements can be manipulated in order to achieve gamma encoding to
correct the image to be shown on a typical computer display:
The pixel's intensity values in a given image file; that is, the binary pixel values are stored in the file in such a way that they represent the light intensity via gamma-compressed values instead of a linear encoding. This is done systematically with digital video files, in order to save a gamma-decoding step while playing.
The rendering software writes gamma-encoded pixel binary values directly to the video memory or to the CLUT hardware registers of the display adapter. These drive digital-to-analog converters, which output the proportional voltages to the display. For example, when using 8 bits per channel (24-bit RGB colour), writing a value of 128 into video memory outputs a voltage of about half the maximum to the display, which appears darker than mid-grey due to the monitor's behaviour.
Modern display adapters have dedicated calibrating CLUTs, which can be loaded once with the appropriate gamma-correction look-up table in order to modify the encoded signals digitally before the DACs that output voltages to the monitor. Setting up these tables correctly is called hardware calibration.
Some modern monitors allow the user to manipulate their gamma behaviour, encoding the input signals themselves before they are displayed on screen.
[Figure: gamma correction demonstration]
9.3 Noise Reduction
Noise reduction is the process of removing noise from a signal. Noise reduction
techniques are conceptually very similar regardless of the signal being processed; however, a priori knowledge of the characteristics of an expected signal can mean the
implementations of these techniques vary greatly depending on the type of signal.
All recording devices, both analogue and digital, have traits which make them susceptible to noise. Noise can be random or white noise with no coherence, or coherent noise introduced by the device's mechanism or processing algorithms.
In the case of photographic film, noise is introduced due to the grain structure of the
medium. In photographic film, the size of the grains in the film determines the film's
sensitivity, more sensitive film having larger grains. Many further uses of these images require that the noise be (partially) removed, whether for aesthetic purposes as in artistic work or marketing, or for practical purposes such as computer vision.
Types
In salt and pepper noise, pixels in the image are very different in colour or intensity
from their surrounding pixels; the defining characteristic is that the value of a noisy
pixel bears no relation to the colour of surrounding pixels. Generally this type of noise
will only affect a small number of image pixels. When viewed, the image contains
dark and white dots, hence the term salt and pepper noise. Typical sources include
flecks of dust inside the camera, or with digital cameras, faulty CCD elements.
In Gaussian noise, each pixel in the image will be changed from its original value by
a small amount. A histogram, a plot of the amount of distortion of a pixel value
against the frequency with which it occurs, shows a normal distribution of noise.
While other distributions are possible, the Gaussian distribution is usually a good
model, due to the central limit theorem that says that the sum of different noises
tends to approach a Gaussian distribution.
In selecting a noise reduction algorithm, one must weigh several factors:
the computer power and time available
whether sacrificing some real detail is acceptable if it allows more noise to be
removed
the characteristics of the noise and the detail in the image, to better make those
decisions
Chroma and luminance noise separation
In real-world photographs, the highest spatial-frequency detail consists mostly of
variations in brightness ("luminance detail") rather than variations in hue ("chroma
detail"). Since any noise reduction algorithm should attempt to remove noise without
sacrificing real detail from the scene photographed, one risks a greater loss of detail
from luminance noise reduction than chroma noise reduction simply because most
scenes have little high frequency chroma detail to begin with. In addition, most people
find chroma noise in images more objectionable than luminance noise; the coloured
blobs are considered "digital-looking" and unnatural, compared to the grainy
appearance of luminance noise that some compare to film grain. For these two
reasons, most photographic noise reduction algorithms split the image detail into
chroma and luminance components and apply more noise reduction to the former.
Linear smoothing filters
One method to remove noise is by convolving the original image with a mask that
represents a low-pass filter or smoothing operation. For example, the Gaussian
mask comprises elements determined by a Gaussian function. This convolution brings
the value of each pixel into closer harmony with the values of its neighbours. In
general, a smoothing filter sets each pixel to the average value, or a weighted average,
of itself and its nearby neighbours; the Gaussian filter is just one possible set of
weights.
Smoothing filters tend to blur an image, because pixel intensity values that are
significantly higher or lower than the surrounding neighbourhood would "smear"
across the area. Because of this blurring, linear filters are seldom used in practice for
noise reduction; they are, however, often used as the basis for nonlinear noise
reduction filters.
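As a sketch of such a smoothing operation (a Gaussian mask applied by convolution; SciPy is assumed to be available, and the test image is synthetic):

    import numpy as np
    from scipy import ndimage

    rng = np.random.default_rng(0)
    clean = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))   # synthetic ramp image
    noisy = clean + rng.normal(0.0, 0.1, size=clean.shape)

    # Each output pixel becomes a Gaussian-weighted average of its
    # neighbourhood, suppressing high-frequency noise (and fine detail).
    smoothed = ndimage.gaussian_filter(noisy, sigma=1.5)

    print(np.abs(noisy - clean).mean())      # error before smoothing
    print(np.abs(smoothed - clean).mean())   # smaller error after smoothing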
Anisotropic diffusion
Another method for removing noise is to evolve the image under a smoothing partial
differential equation similar to the heat equation which is called anisotropic
diffusion. With a spatially constant diffusion coefficient, this is equivalent to the
linear Gaussian filtering, but with a diffusion coefficient designed to detect edges, the
noise can be removed without blurring the edges of the image.
Nonlinear filters
A median filter is an example of a non-linear filter and, if properly designed, is very
good at preserving image detail. To run a median filter:
1. consider each pixel in the image
2. sort the neighbouring pixels into order based upon their intensities
3. replace the original value of the pixel with the median value from the list
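A direct, unoptimized NumPy rendering of these three steps with a 3×3 neighbourhood (in practice one would reach for a library routine such as scipy.ndimage.median_filter):

    import numpy as np

    def median_filter3(image):
        # Reflect-pad the border so every pixel has a full 3x3 neighbourhood.
        padded = np.pad(image, 1, mode="reflect")
        out = np.empty_like(image)
        height, width = image.shape
        for y in range(height):                    # 1. consider each pixel
            for x in range(width):
                window = padded[y:y + 3, x:x + 3]  # 2. gather its neighbours
                out[y, x] = np.median(window)      # 3. take the median value
        return out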
A median filter is a rank-selection (RS) filter, a particularly harsh member of the
family of rank-conditioned rank-selection (RCRS) filters; a much milder member
of that family, for example one that selects the closest of the neighbouring values
when a pixel's value is external in its neighbourhood, and leaves it unchanged
otherwise, is sometimes preferred, especially in photographic applications.
Median and other RCRS filters are good at removing salt and pepper noise from an
image, and also cause relatively little blurring of edges, and hence are often used in
computer vision applications.
9.4 Homomorphic filtering
Homomorphic filtering is a generalized technique for signal and image processing,
involving a nonlinear mapping to a different domain in which linear filter techniques
are applied, followed by mapping back to the original domain. This concept was
developed in the 1960s by Thomas Stockham, Alan V. Oppenheim, and Ronald W.
Schafer at MIT.
A homomorphic filter simultaneously normalizes the brightness across an image and increases contrast. Homomorphic filtering is also used to remove multiplicative noise. Illumination and reflectance are not separable, but their approximate locations in the frequency domain may be identified. Since illumination and reflectance combine
multiplicatively, the components are made additive by taking the logarithm of the
image intensity, so that these multiplicative components of the image can be separated
linearly in the frequency domain. Illumination variations can be thought of as a
multiplicative noise, and can be reduced by filtering in the log domain.
To make the illumination of an image more even, the high-frequency components
are increased and low-frequency components are decreased, because the high-
frequency components are assumed to represent mostly the reflectance in the scene,
whereas the low-frequency components are assumed to represent mostly the
illumination in the scene. That is, high-pass filtering is used to suppress low
frequencies and amplify high frequencies, in the log-intensity domain.
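A hedged NumPy sketch of this pipeline (log, Fourier-domain high-frequency emphasis, exp); the Gaussian emphasis shape and the gain and cutoff constants are illustrative choices, not a canonical design:

    import numpy as np

    def homomorphic_filter(image, low_gain=0.5, high_gain=1.5, cutoff=0.1):
        # Log domain: illumination * reflectance becomes a sum.
        log_img = np.log1p(np.asarray(image, dtype=np.float64))

        # High-frequency-emphasis filter on a normalized frequency grid.
        h, w = log_img.shape
        fy = np.fft.fftfreq(h)[:, None]
        fx = np.fft.fftfreq(w)[None, :]
        r2 = fx ** 2 + fy ** 2
        emphasis = low_gain + (high_gain - low_gain) * (1.0 - np.exp(-r2 / (2.0 * cutoff ** 2)))

        # Attenuate low frequencies (illumination), boost high ones (reflectance).
        filtered = np.real(np.fft.ifft2(np.fft.fft2(log_img) * emphasis))

        # Back to the intensity domain.
        return np.expm1(filtered)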
i. List of acronyms
AR Augmented Reality
BPP Bits Per Pixel
CCD Charge Coupled Device
CGA Colour Graphics Adapter
CMOS Complementary Metal Oxide Semiconductor
CMYK Cyan Magenta Yellow Key
CRT Cathode Ray Tube
DAC Digital to Analog Converter
DICOM Digital Imaging and Communications in Medicine
HSB Hue Saturation Brightness
HSL Hue Saturation Lightness
HSV Hue Saturation Value
JPEG Joint Photographic Experts Group
LCD Liquid Crystal Display
LGN Lateral Geniculate Nucleus
LPF Low Pass Filter
MIT Massachusetts Institute of Technology
NPR Non Photorealistic Rendering
PNG Portable Network Graphics
RCRS Rank-Conditioned Rank-Selection
RGB Red Green Blue
RGBA Red Green Blue Alpha
RS Rank-Selection
RYB Red Yellow Blue
sRGB Standard Red Green Blue
TIFF Tagged Image File Format
VOG Violet Orange Green
ii. Works Cited
Farid, H. (n.d.). Fundamentals of image processing. Retrieved from
http://www.cs.dartmouth.edu/~farid
Girod, B. (n.d.). EE 368: Point Operations.
Jankowski, M. (n.d.). Mathematica. Retrieved from http://www.wolfram.com
Kids Health. (n.d.). Retrieved from http://www.cyh.com/SubDefault.aspx?p=255
Szepesvari, C. (n.d.). Image Processing : Basics.
Wikipedia. (n.d.). Retrieved from http://www.wikipedia.com