Report on Study of Inspection Systems and Implementation of ART Networks



    Acknowledgements

We are highly indebted to Dr. A. S. V. Sarma, scientist-in-charge at Central Electronics Engineering Research Institute (CEERI) Centre, Chennai. His valuable guidance and support as project guide and project manager for the on-plant training immensely helped us to learn and remain motivated during the training.

We would also like to thank Dr. P. K. Chatley, head of the Training and Placement Centre at Dr. B. R. Ambedkar National Institute of Technology, Jalandhar, for the approval and organizational arrangements for the training.

Our special thanks go to Dr. Arun Khosla, head of the Department of Electronics and Communication Engineering at the institute, for his consent and recommendations for the training.

We are grateful to Mrs. Thilaka Mohandoss, senior technician (I) at CEERI Centre, Chennai, for her kind support with the registration formalities. Lastly, we would like to thank the scientists at the various labs and the central library at CEERI Centre, Chennai, for their consultation.


    INTRODUCTION

    The on-plant industrial training was on Study of Inspection Systems and Implementation of ART

    Neural Networks as a re-trainable classifier.

Machine vision based inspection is a widely accepted technology used in the manufacturing industry for applications including quality assurance and control, and it has become a production-line staple in most industries. In 2010 the machine vision market grew by more than 50% across North America and worldwide, continuing a trend that began in the 1980s.

Advances in electronic technology have been the main driver of this expansion since the 1980s, when processing chips made smart cameras possible, faster communication bus technologies emerged and, most importantly, digital image processing with robust soft-computing algorithms evolved. The heart of such an inspection system is its processing technology, which must deliver quality assurance and control in real-world problems marked by imprecise and uncertain situations.

Presently, most industrial environments require inspection systems to be rugged, robust and flexible enough to cope with constantly changing real-time situations. Many situations require the system to learn new parameters or classes of situational variables as and when a new kind of situation becomes apparent. This can, at least at present, be met to a very large extent by a re-trainable classifier. Adaptive Resonance Theory (ART) neural networks impart to the system both the stability to remember previously learned patterns and the plasticity to learn new situations, giving the classifier its required re-trainability. There is a clear requirement for such a system in today's industrial environment. The outline of the project included the study of these neural networks, from the basics up to ART networks, drawing mainly on textbooks relevant to pattern recognition and classification and only rarely on internet articles.

Historically, many researchers have been influenced and inspired by the complex structure, working and versatility of the human brain. In particular, ART networks were inspired by the brain's ability to learn new concepts while retaining those learnt before. The theory of ART networks was formulated and proposed by Stephen Grossberg and Gail Carpenter, both pioneers of cognitive and neural science and accomplished mathematicians. Since the 1980s the two, together with their students and colleagues, have developed a large variety of ART networks.
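To make the stability-plasticity idea concrete, the following minimal Python sketch shows the ART1 learning cycle for binary input patterns. The parameter names (vigilance, beta) and the fast-learning rule follow the standard ART1 formulation and are illustrative only; they do not represent the specific network developed during the training.

    import numpy as np

    class ART1:
        """Minimal ART1 sketch for binary patterns (illustrative only)."""

        def __init__(self, n_inputs, vigilance=0.75, beta=1.0):
            self.rho = vigilance      # vigilance: how closely a pattern must match a category
            self.beta = beta          # small constant in the category-choice function
            self.top_down = []        # learned binary prototypes (stability)
            self.bottom_up = []       # normalised prototypes used for category choice

        def train(self, pattern):
            """Present one binary pattern; return the category it settles in."""
            I = np.asarray(pattern, dtype=float)
            # Try existing categories in order of decreasing bottom-up activation.
            order = sorted(range(len(self.top_down)),
                           key=lambda j: -np.dot(self.bottom_up[j], I))
            for j in order:
                match = np.minimum(I, self.top_down[j])            # I AND prototype
                if match.sum() / max(I.sum(), 1e-9) >= self.rho:   # vigilance test
                    # Resonance: refine the existing prototype (stability).
                    self.top_down[j] = match
                    self.bottom_up[j] = match / (self.beta + match.sum())
                    return j
                # Mismatch: reset this category and try the next-best one.
            # No category matched closely enough: recruit a new one (plasticity).
            self.top_down.append(I.copy())
            self.bottom_up.append(I / (self.beta + I.sum()))
            return len(self.top_down) - 1

Raising the vigilance parameter makes the network create more, finer categories; lowering it makes it generalize more broadly, which is exactly the re-trainability trade-off exploited in an inspection classifier.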

Therefore we, as electronics engineering students, underwent on-plant training research on the implementation of ART neural networks as a re-trainable classifier in inspection systems. This required a thorough background knowledge of machine vision systems and of the importance of a classifier in the system's processing cycle. A number of references from a variety of machine vision books, journals and documents were therefore consulted, along with interaction with the scientists working in the labs at CEERI Centre, Chennai, in order to become sufficiently comfortable with the practical industrial factors affecting the accuracy, stability and performance of the system being developed. These factors included illumination techniques and sources, the importance of imaging sensors, imaging and image processing techniques, hardware technology, etc. This also highlighted the importance of inspection systems in the industrial environment.


    MACHINE VISION BASED INSPECTION SYSTEM

1. Basics of Machine Vision Systems

I. What is machine vision?

A machine vision system comprises a group of devices that receive the image of a real scene, analyse it for the objects of interest and interpret it to arrive at a decision based on predefined criteria set according to the application and use. Machine vision provides cost and quality benefits by replacing human vision on tasks that are fast, repetitive and require exact measurements. Machine vision involves three general processes:

1) Location or search finds the position of the objects of interest. When machine vision is used for guiding a robot this task is called alignment, and when used to follow a moving object it is called tracking.

2) Identification is selecting a particular object from a set of possible objects. When Optical Character Recognition (OCR) or bar codes are used to identify an object, identification is called reading.

3) Inspection checks whether the object has the proper dimensions, meets quality standards, is free of some class of defects, etc.

A very important feature or requirement of a machine vision system is knowledge about the world, or pre-defined prototypes/models of real-world objects, their features, geometrical properties and the relationships among them, which adds to the robustness, flexibility and adaptive nature of machine vision systems.

    II. Difference between machine and computer vision

    Machine Vision:

Machine vision is basically the application of computer vision to factory automation. It tends to focus on applications, mainly in manufacturing, e.g., vision-based autonomous robots and systems for vision-based inspection or measurement.

It also implies that external conditions such as lighting can be, and often are, more controlled in machine vision than in general computer vision, which can enable the use of different algorithms.

    Computer Vision:

Computer vision is concerned with the theory behind artificial systems that extract from images the information necessary to solve some task.

Computer vision tends to focus on the 3D scene projected onto one or several images, e.g., how to reconstruct structure or other information about the 3D scene from one or several images.

Computer vision often relies on more or less complex assumptions about the scene depicted in an image.


    III. Differences between machine and human vision

Machine vision: provides cost and quality benefits by replacing human vision on tasks that are fast, repetitive and require exact measurements.
Human vision: suffers from fatigue and oversight on tasks that are fast, repetitive and require exact measurements, which leads to errors.

Machine vision: is the application of computer vision to factory automation.
Human vision: is biological vision based on the binocular eyes and the brain.

Machine vision: systems use digital cameras and image processing software to visually inspect parts and judge the quality of workmanship.
Human vision: human inspectors work on assembly lines to perform similar inspections.

Machine vision: systems have less intelligence and learning capability than a human.
Human vision: has not yet been replaced by machine vision.

Machine vision: can work in the UV and IR ranges.
Human vision: cannot; ultraviolet and infrared are invisible to the human eye.

    IV. Components of machine vision based inspection systems

A simple machine vision system will consist of the following:

1. Illumination system
2. Optics
3. A camera
4. Camera interface card for the computer, known as a "frame grabber"
5. Computer software to process images
6. Digital signal hardware or a network connection to report results

Figure 1.1: Components of a machine vision system


V. Importance of the components and their basic features

1. Illumination system
The purpose of the illumination system in a machine vision system is to control the lighting on the products to be inspected so as to improve how the image appears to the camera. It therefore determines the quality of the image produced for processing. Illumination is thus one of the critical aspects of machine vision systems, and optimizing it can minimize effort, time and resources. There are different sources of illumination, including LEDs, fluorescent lamps, halogen lamps, xenon lamps, mercury lamps and high-pressure sodium lamps, which are used according to the needs of the application.

2. Optics
Cameras used in industry, medicine and science are usually shipped without a lens. Analysis and adjustment of the optics according to the requirements of the machine vision application is therefore needed. On the basis of this analysis, a lens for the system is ordered and mounted. Because of their standardized mount, so-called C-mount lenses are widely used in machine vision. Selecting such a lens requires only a simple calculation involving an addition, a multiplication and a division.

3. Camera
The camera transforms optical signals (light) into electrical ones (voltage) and digitizes them into a raw digital image. It contains the image sensor that converts photons of light into electrical signals. Today, digital cameras use either a CCD (charge-coupled device) or a CMOS (complementary metal-oxide-semiconductor) imaging sensor. Each discrete value coming out of the A/D converter used to digitize the image is a pixel, the smallest distinguishable area in an image. CCD cameras are becoming smaller, lighter and less expensive; images are sharper and more accurate, and newer dual-output cameras produce images twice as fast as previous models. CMOS cameras are preferred in applications that need high-speed performance at low cost.

4. Frame grabbers
Frame grabbers are specialized A/D converters that change video or still images into digital information. Most frame grabbers are printed circuit boards compatible with the most common bus structures, including Peripheral Component Interconnect (PCI). A frame grabber can be of two types, analog or digital: an analog frame grabber is used with an analog camera and a digital frame grabber with a digital camera. Today's frame grabbers offer greater stability and accuracy than earlier models, and some can even handle image processing and enhancement using digital signal-processing techniques.

5. Computer software
The software typically takes several steps to process an image. Often the image is first manipulated to reduce noise or to convert many shades of gray to a simple combination of black and white. Following this initial simplification, the software counts, measures, and/or identifies objects in the image. As a final step, the software passes or fails the part according to programmed criteria. If a part fails, the software signals a robotic device to reject it; alternatively, the system may warn a human worker to fix the production problem that caused the failure.


6. Computer system

Machine vision in industry, medicine and science is dominated by PCs running Windows and using modern interfaces such as USB and FireWire. Efficient visualization requires graphics hardware with on-board memory. If image sequences are to be recorded, the computer configuration should be similar to that of a video editing system (fast processor, fast separate hard disk). For simple applications with one camera and a slow sequence of images a low-end computer may be sufficient; however, increasing complexity, number of cameras and frame rate may lead to a processing load that has to be distributed among several PCs.

VI. Basic steps in machine vision based inspection

A machine vision system carries out the basic steps given below:

1. Object presentation
Depending upon the camera employed in the machine vision system, it is often necessary to analyse how the object should be presented to the camera for imaging, in addition to optimizing the illumination. Many cameras can acquire good images when the objects are still, but if the objects are large enough they may have to be rotated, and in some other cases two or more cameras are employed to acquire a 3D image of the objects. This is an important step as it greatly helps in the identification and analysis of the targeted objects.

2. Imaging
In imaging, the object presented to the camera under optimized illumination is sensed by the imaging sensor (CCD or CMOS) in the camera and converted into an electrical signal (voltage). This analog signal is converted into digital form by a specialized analog-to-digital converter called a frame grabber. Frame grabbers are compatible with the most common interfaces and bus structures (USB, FireWire, Peripheral Component Interconnect (PCI)) and send the digital data to the computer system for image analysis and processing.

3. Analysis
Using the computer software, the acquired digital image is first processed to remove noise or otherwise simplify it. The image is then segmented, employing a suitable algorithm from among the many available, to separate the objects from the background and to count, measure and identify them as required. Based upon these image processing algorithms the software produces a suitable digital output.

4. Control action
Based upon the digital control output from the software, the computer signals a robotic device to reject the part if it fails; alternatively, the system may warn a human worker to fix the production problem that caused the failure. The cause of the failure is noted to guide further processing of the remaining products. A minimal sketch of the analyse-and-decide cycle of steps 3 and 4 is given below.
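The following minimal Python sketch illustrates this analyse-and-decide cycle (steps 3 and 4). The threshold value, the zero-defect acceptance criterion and the use of SciPy's ndimage routines are illustrative assumptions, not the software actually used during the training:

    import numpy as np
    from scipy import ndimage

    def inspect_frame(gray, max_defects=0, threshold=128):
        """Denoise, segment, count defect regions and return a pass/fail decision."""
        # Step 3a: simplify the image by suppressing sensor noise.
        smoothed = ndimage.gaussian_filter(gray.astype(float), sigma=1.0)
        # Step 3b: segment dark defect regions from the bright background.
        defects = smoothed < threshold
        # Step 3c: count connected defect regions.
        _, n_defects = ndimage.label(defects)
        # Step 4: control action - signal accept or reject.
        return "PASS" if n_defects <= max_defects else "REJECT"

    # Synthetic example: a bright 100x100 part with one dark blemish is rejected.
    frame = np.full((100, 100), 200, dtype=np.uint8)
    frame[40:45, 60:65] = 30
    print(inspect_frame(frame))   # REJECT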

    VII. Advantages of Machine vision based Systems

    Some of the advantages of using machine vision:


1) Increased productivity: It increases the productivity of the company's production by reducing direct and indirect labor and reducing the burden rate.

2) Increased machine utilization: Through increased machine utilization the system is able to locate the position of an object, measure dimensions to thousandths-of-an-inch accuracy, count small items and identify objects.

3) Increased flexibility of production: It increases flexibility in production, as machines are easily controllable through the increased machine utilization.

4) Reduced operating cost: It reduces overall costs, including work in progress, scrap, set-up times, lead times and material handling costs.

5) Increased quality assurance: It increases the quality of products by verifying whether an object meets quality standards.

6) Reduced errors: Errors caused by fatigue, human judgement and oversight can be eliminated using a machine vision system.

7) Increased customer satisfaction: Better quality products increase customer satisfaction, which increases the net profit of the company.

8) High speed and repeatability: Manufacturers favor machine vision systems for visual inspections that require high speed, high magnification, 24-hour operation, and repeatability of measurements.

9) Wide-range vision inspection: Machine vision systems can work in UV and IR light, which gives them an advantage over human vision.


    2. Illumination Sources

    I. Importance

The purpose of machine vision illumination is to control how the object appears to the camera. Image quality depends on lighting irrespective of the camera and frame grabber parameters, since the camera is less versatile than the human eye in uncontrolled conditions.

Lighting quality determines the robustness of a machine vision system. Designing and following a rigorous lighting analysis sequence will minimize time, effort and resources.

Optimum lighting eliminates or reduces the need for post-acquisition filters and processing; illumination is the most critical part of an imaging system.

The benefits of specialist illumination sources over a customized design are cost effectiveness, proven reliability, repeatability and variety.

    II. Types of illumination sources

1. Fluorescent sources
These are common in household use but less popular in the machine vision industry. Fluorescent lamps are more powerful than LEDs but less powerful than metal halide bulbs; they are of moderate intensity, with a service life to match. Fluorescent tubes are AC devices, and a high-frequency supply (well above the 50 Hz mains frequency) is essential to avoid flicker in the captured images. Standard fluorescent tubes do not have a very uniform colour balance, being predominantly blue with little red; however, daylight varieties have a higher colour temperature and are generally more suited to machine vision.

2. LED illumination
LEDs are most appropriate for low-speed machine vision applications, as they have adequate light intensity, low cost, long service life, a low-voltage DC supply requirement, and more flexible cabling than bulky metal halide bulbs with their inflexible fibre-optic light guides. Modern high-intensity LEDs give high illumination and can match metal halide bulb intensity when used with a strobe controller. They are commonly used in spotlights, line lights, backlights and diffuse lights, and are available in a variety of colours, hence their extensive application. They are advantageous on the grounds of application flexibility, output stability, etc.; the main disadvantage is poor cost-effectiveness for large-area lighting.

3. Metal halide (mercury)
Metal halide lamps, also known as mercury lamps, are often used in microscopy because they have many discrete wavelength peaks, which complements the use of filters for fluorescence studies. The very powerful intensity of a halogen or metal-vapour bulb, housed within a light-source box, is delivered through an optical-fibre light guide into a light 'adapter' positioned close to the object to be illuminated. These commonly take the form of line lights and ring lights of various sizes and lengths. They are not practical to use beyond about 5 m because of losses in the light guide.

    Figure 2.1: Fluorescent Sources

  • 7/30/2019 Report on Study of Inspection Systems and Implementation of ART Networks

    9/126

    9

The more sophisticated halogen light sources have RS232 control of intensity. The most powerful light sources available are metal halide and mercury vapour bulbs, which can produce approximately five times the light intensity of a halogen source. They also provide the ultimate intensity for very high speed applications such as line-scan cameras. For fast-moving applications, xenon strobe light sources provide bright white light of short duration, ideal for freezing motion.

4. Xenon
Xenon is useful in bright strobe lighting applications. A specific advantage of this source is its high light output from a nominal power of only 35 W. The xenon source is also suited to line-scan cameras and avoids the need for costly halogen-based DC sources.

    III. Different types of Lamp geometries

Lights and lamps are available in different geometries and colours, chosen by the engineer according to the needs of the application. More than one type of lighting may be required to illuminate different components of a machine vision system. A dual-circular fluorescent illuminator, for example, uses two independently controlled circular lamps to provide 360 degrees of uniform illumination.

    Figure 2.2: Fiber optics illumination

    Figure 2.3: A comparison of different sources on various parameters

  • 7/30/2019 Report on Study of Inspection Systems and Implementation of ART Networks

    10/126

    10

    Figure 2.4: Dual-Circular Lamp

General types include spot, rectangular, linear and ring formats. These can also be customized to the customer's needs and are widely used in medical, pharmaceutical and other applications.

    Figure 2.5: Various lamp lighting geometries

    IV. Choice of Illumination Source for a given Application

Figure 2.6: Before and after correct illumination

Ring lights and array lights are available for occasions when surfaces are flat and diffuse. For cases where outside dimensions must be measured or openings viewed, backlights work best. The wavelength of illumination is an important factor: viewing with light from the opposite end of the spectrum to the observed colour is useful in many situations, and UV or IR is useful where the image is difficult to capture otherwise. Strobed LED lighting is also useful for inspecting moving parts.

Figure 2.7: Example of differences produced by lighting of different wavelengths


    3. Imaging Sensors

I. Importance of imaging sensors in a machine vision system
A machine vision system requires an appropriate and accurate image sensing device for robust and correct identification, analysis and interpretation. Hence, it is essential to use the most suitable imaging sensor available.

Broadly speaking, if the camera is the eye of a machine vision system, then the image sensor is the heart of the camera. The choice of sensor is made according to the accuracy, throughput, sensitivity and cost requirements of the machine vision system. A basic understanding of sensor attributes and application needs will narrow the search for the right sensor, as shown by the figure below.

Figure 3.1: When selecting a sensor for a machine vision application it is important to have a thorough understanding of the application's needs for dynamic range, speed and responsivity.

The sensor is made up of millions of "buckets" that essentially count the number of photons that strike the sensor. This means that the brighter the image at a given point on the sensor, the larger the value read for that pixel. The number of resulting pixels in the image determines its "pixel count". For example, a 640x480 image has 307,200 pixels, or approximately 307 kilopixels; a 3872x2592 image has 10,036,224 pixels, or approximately 10 megapixels. This illustrates the importance of image sensor technology.
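The pixel-count arithmetic is simply the product of the two dimensions:

    def pixel_count(width, height):
        """Number of pixels in a width x height image."""
        return width * height

    print(pixel_count(640, 480))     # 307200   (~0.3 megapixels)
    print(pixel_count(3872, 2592))   # 10036224 (~10 megapixels)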

II. Different types of cameras and their characteristics

Depending upon the parameters to be considered and the conditions of use, there are different types of cameras:

A. Based upon the type of image sensor employed:

CCD camera:
A CCD camera uses a CCD imaging sensor, which consists of photodiodes that detect the light photons. The CCD registers themselves are optically shielded and are used only for readout. The collected charge is simultaneously transferred to the vertical CCDs at the end of the integration time (a new integration period can begin right after the transfer), and this charge transfer to the vertical CCDs simultaneously resets the photodiodes.

    Figure 3.2: CCD imaging sensor


The charges accumulated in the vertical CCDs are transferred to the horizontal CCDs, which shift the charge to the output amplifier.

CCD cameras have the advantage of producing high-quality images, with optimized photodetectors offering high quantum efficiency (QE), low dark current and very low noise (no noise is introduced during shifting). Their disadvantages are that they are largely non-programmable and require high power, since the entire array is switching all the time, and that the frame rate is limited (for large sensors) by the required increase in transfer speed while maintaining acceptable transfer efficiency.

CMOS camera:
A CMOS camera is a type of digital camera built around a CMOS image sensor, an integrated circuit that records the image. The complementary metal-oxide-semiconductor (CMOS) sensor consists of millions of pixel sensors, each of which includes a photodetector. As light enters the camera through the lens it strikes the CMOS sensor, causing each photodetector to accumulate an electric charge based on the amount of light that strikes it.

Figure 3.3: CMOS Sensor Array

The camera then converts the charge into the pixel values that make up the image. Unlike a CCD, each pixel in a CMOS imager has its own amplifier integrated inside; since each pixel has its own amplifier, it is referred to as an "active pixel".

In addition, each pixel in a CMOS imager can be read directly on an x-y coordinate system, rather than through the "bucket-brigade" process of a CCD. This means that while a CCD pixel always transfers a charge, a CMOS pixel detects a photon directly, converts it to a voltage and transfers the information directly to the output. This fundamental difference in how information is read out of the imager, coupled with the manufacturing process, gives CMOS imagers several advantages over CCDs:

High speed
On-chip system integration
Low cost of manufacturing

    B. Based on the motion of the object:

If the object is moving along a conveyor, the camera will need to be either a progressive area-scan camera or a line-scan camera.

Progressive area scan: This type of camera is used in applications where the image is to be read as a whole, rather than with an interlaced camera (which reads two distinct fields of odd and even lines separated by a 40 ms interval and


then reads out the resultant image as a complete frame). Progressive cameras read all lines within the same scan, so no image blur is visible.

Line-scan cameras: Line-scan cameras use a linear image sensor, generally a single row of up to about 8000 pixels. Line-scan cameras read data at many thousands of lines per second, so they can handle defect detection on very fast-moving objects and are therefore used in applications that demand very fast data capture. Area-scan cameras do not have the speed to capture data from such a moving object, for example paper or textiles, which may travel at many tens of metres per second.

C. Based upon colour detection capability:

Monochrome cameras: These cameras contain monochrome sensors, with a colour filter across them allowing only a single wavelength band to reach the sensor. A number of different filters are used, but generally any filter will degrade the image sensor's sensitivity by around 30 per cent.

Colour cameras: Colour cameras have a single sensor with an array of colour filters printed over the pixels. Adjacent pixels use different colours, so the resolution at each colour is less than that of a monochrome camera. Some high-performance colour cameras employ a colour-separation prism along with three chips, one per primary colour, to obtain full resolution in each colour. Colour cameras have certain disadvantages compared with monochrome cameras:

1. They are less sensitive than monochrome cameras.

2. Assuming the same number of pixels, the effective resolution of a colour camera is lower than that of a monochrome camera.

3. To obtain a red, green and blue value for every pixel in the final digital image, the colour camera has to perform colour interpolation. This interpolation requires extra processing power and bandwidth during the data transfer (a minimal sketch of such an interpolation is given after this list).

Colour cameras are therefore only used if the different colours of an image "carry" information.
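The following minimal Python sketch shows one such colour interpolation, a simple nearest-neighbour fill of an assumed RGGB Bayer mosaic; real cameras use more sophisticated interpolation, and the RGGB layout is an assumption made only for illustration:

    import numpy as np

    def demosaic_rggb_nearest(raw):
        """Nearest-neighbour demosaic of an RGGB Bayer mosaic (even-sized image).
        Each colour sample is simply replicated over its 2x2 Bayer cell;
        only one of the two green samples per cell is used."""
        h, w = raw.shape
        rgb = np.zeros((h, w, 3), dtype=raw.dtype)
        rgb[..., 0] = np.repeat(np.repeat(raw[0::2, 0::2], 2, axis=0), 2, axis=1)  # red
        rgb[..., 1] = np.repeat(np.repeat(raw[0::2, 1::2], 2, axis=0), 2, axis=1)  # green
        rgb[..., 2] = np.repeat(np.repeat(raw[1::2, 1::2], 2, axis=0), 2, axis=1)  # blue
        return rgb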

D. Based upon interfacing criteria:

Analog cameras: In this type of camera the signal from the sensor is converted to an analog voltage and then fed to a frame grabber board in the machine vision computer.

Digital cameras: The signal from each pixel is digitized and the digital data are fed directly to the computer. Most new machine vision systems employ digital cameras.


    III. Typical camera parameters and their significance

Typical camera parameters include:

    1. Shutter (Exposure Time)

The shutter determines the CCD's exposure time. It may be adjusted manually or automatically. The first three sample images show a key ring (with its LED initially off) with correct exposure time, one where it is too short and another where it is too long.

    Figure 3.5: Correct Exposure time

When the LED is switched on, the image is overexposed in such a way that it shows only a big white spot. The LED is correctly represented if we decrease the exposure time; there is, however, a vertical line that disturbs the image. This is a typical CCD problem known as "smear" (Figure 3.9). To avoid it, we close the diaphragm and increase the exposure time:

Figure 3.8: Extremely overexposed   Figure 3.9: Smear   Figure 3.10: Correct representation

    2. Gain (Contrast)

Gain determines the amplification of the CCD's output signal. This parameter may be adjusted manually or automatically. The amplification increases the contrast; a high gain, however, leads to noisy images.

    Figure 3.11: Source Figure 3.12: Contrast increase Figure 3.13: Gain too high

    3. Offset (Brightness)

The offset increases all gray levels, resulting in a brighter image. The offset is added to the camera's output signal. This parameter may be adjusted manually or automatically.

Figure 3.6: Exposure time too long   Figure 3.7: Exposure time too short


    Figure 3.14: Source image Figure 3.15: Slight brightness increase Figure 3.16: Overdone brightness increase

4. Auto Exposure and Exposure Reference

Auto Exposure determines whether the exposure time and the gain are adjusted manually or automatically. It compares the mean gray level of the current image with the Exposure Reference; if these values differ, the exposure time and the gain are varied accordingly.

    5. Sharpness

This mechanism may be used to enhance blurred images. Overdoing its application, however, leads to distortions.

    Figure 3.17: Source image Figure 3.18: Sharpness improvement Figure 3.19: Overdone sharpness

    6. Gamma

Gamma increases or decreases the middle gray levels. In other words, it is a way to compensate for the non-linear behaviour of picture tubes:

    Figure 3.20: Source image Figure 3.21: Increased middle graylevels Figure 3.22: Decreased middle graylevels
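The combined effect of the gain, offset and gamma parameters described above can be modelled in software by a simple transfer function. The sketch below is an illustrative model only; in a real camera these parameters act within the analog and digital signal chain:

    import numpy as np

    def apply_camera_transfer(gray, gain=1.0, offset=0.0, gamma=1.0):
        """Model of gain (contrast), offset (brightness) and gamma on an 8-bit image."""
        x = gain * gray.astype(float) + offset       # gain stretches contrast, offset shifts brightness
        x = np.clip(x, 0.0, 255.0)
        x = 255.0 * (x / 255.0) ** (1.0 / gamma)     # gamma > 1 lifts the middle gray levels
        return np.clip(x, 0, 255).astype(np.uint8)

As the sample figures suggest, a large gain amplifies noise along with the signal, and an overdone offset simply saturates the image towards white.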

    7. Saturation

This parameter is used to adjust the colour saturation, from monochrome up to highly saturated colour values:

    Figure 3.23: Source image Figure 3.24: Saturation = 0 Figure 3.25: Maximum Saturation


    8. Hue

This parameter is used to shift colour values; the relationship between the colours nevertheless remains (in contrast to the White Balance parameter):

    Figure 3.26: Source image Figure 3.27: Color shift

    9. White Balance

This parameter is used to vary the degree of red and blue in the image to achieve a lifelike colour representation. The values can be controlled manually or automatically. The automatic white balance feature offers two operation modes:

Auto: the balancing algorithm affects the video stream continuously.
One push: the balancing is triggered on demand.

Simple multimedia cameras provide only one white balance parameter, in which increasing the degree of one colour decreases the degree of the other and vice versa, whereas high-quality cameras offer two parameters that can be changed independently.

    Figure 3.28: Source image Figure 3.29: Degree of blue too low Figure 3.30: Degree of red too low
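An illustrative two-parameter white balance of the kind offered by higher-quality cameras, with the red and blue gains adjustable independently of the fixed green channel, might look as follows (the gain values are placeholders):

    import numpy as np

    def white_balance(rgb, red_gain=1.0, blue_gain=1.0):
        """Scale the red and blue channels independently, leaving green unchanged."""
        out = rgb.astype(float)
        out[..., 0] *= red_gain    # too little red -> raise red_gain
        out[..., 2] *= blue_gain   # too little blue -> raise blue_gain
        return np.clip(out, 0, 255).astype(np.uint8)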

Camera calibration is often used as an early stage in machine vision, and especially in the field of augmented reality; it yields the intrinsic and extrinsic parameters of the camera.

    IV. Choice of camera for a given application

Choosing the right camera from among cameras with different features is an important task, and it is essential to decide which technology to prefer. When making the technology choice it is important to be clear on exactly what the application is going to be. Depending upon the requirements, the parameters to be considered are:

Resolution requirement: Generally a camera should be chosen with the lowest resolution that will meet the requirements. This is important because the higher the resolution, the more image processing must be done by the host computer.


Colour or monochrome application: The use of colour adds a level of complexity that should be avoided unless the application truly needs colour. Colour cameras produce larger amounts of data than monochrome cameras, meaning greater image processing burdens; colour also negatively affects camera sensitivity and image resolution.

Image processing speed: For applications demanding higher processing speeds, where high-speed objects are inspected, it is better to use low-resolution, monochrome cameras, which reduce complexity and help maintain the image processing speed.

Space limitation: In applications where space is limited and the plant layout is small, small cameras must be used.

Cable limitations: Depending upon the availability of cables, a third-party software package can be used along with cameras that provide interfacing standards such as FireWire's DCAM (IIDC) and Gigabit Ethernet's GigE Vision.

    V. Frame Grabbers and their functions

The frame grabber is an important component of a machine vision system: it captures video frames in digital form, which are then displayed, stored or transmitted in raw or compressed digital form. Historically, frame grabbers were the predominant way to interface cameras to PCs. This has changed substantially in recent years as direct camera connections via USB, Ethernet and IEEE 1394 ("FireWire") interfaces have become practical and prevalent.

    Functioning:

A frame grabber captures individual digital still frames from an analog video signal or a digital video stream. The incoming signal from the vision camera is sampled at a rate specified by a fixed-frequency pulse, which can be generated in the frame grabber itself or received from the camera. If the signal is not already digital it passes through an analogue-to-digital converter, and the result is stored in a buffer until a full image has been converted.

    Figure 3.31: Schematic of Frame Grabber

Early frame grabbers had only enough memory to acquire (i.e., "grab") and store a single digitized video frame, hence the name. Modern frame grabbers are typically able to store multiple frames and compress them in real time using algorithms such as MPEG-2 and JPEG.

Frame grabbers that perform compression on the video frames are referred to as "active frame grabbers"; frame grabbers that simply capture the raw video data are referred to as "passive frame grabbers".


Applications where frame grabbers are employed include radar acquisition, manufacturing, and remote guidance, all of which require capturing images at high frame rates and resolutions.

Frame grabbers can be classified as follows.

Analog frame grabbers, which accept and process analog video signals, include these circuits:

An input signal conditioner to buffer the analog video input signal and protect downstream circuitry.
A circuit to recover the horizontal and vertical synchronization pulses from the input signal.
An analog-to-digital converter.
An NTSC/SECAM/PAL decoder, a function that can also be implemented in software.

Digital frame grabbers, which accept and process digital video streams, include these circuits:

A physical interface to the digital video source, such as Camera Link, DVI, GigE Vision, LVDS or RS-422.

Circuitry common to both analog and digital frame grabbers:

Memory for storing the acquired image (i.e., a frame buffer).
A bus interface through which a processor can control the acquisition and access the data.
General-purpose I/O for triggering image acquisition or controlling external equipment.

A huge range of frame grabbers is available for different applications; they can basically be split into three main categories: standard, advanced and intelligent.

    Standard Frame Grabbers

These are low-cost devices with a high enough level of intelligence and software support for inspection applications. They can only be used with standard analogue interlaced video sources; non-standard video sources (i.e., progressive scan, megapixel, non-standard cameras and digital sources) are not supported.

This type of grabber often does not include memory to buffer images, so the video data is sent to the CPU via the PCI bus line by line, which is processor-intensive.

Standard grabbers can be triggered to grab the next image, although their response is not instantaneous and there will be a random delay of up to one frame, which remains satisfactory for most applications. They also contain a multiplexer allowing more than one camera to be connected and used in turn.

    Advanced Frame Grabbers

Advanced frame grabbers are high-performance frame grabbers which support non-standard cameras and are therefore dominant in most machine vision applications. Apart from increased accuracy, the distinct feature of an advanced frame grabber that sets it apart from standard-level grabbers is its ability to perform asynchronous image capture. This is achieved via synchronization mechanisms between the grabber and the camera, resulting in instantaneous capture, also known as an asynchronous reset


operation. This operation interrupts the sampling clock and resets the exposure and readout cycle so that a full image can be generated at any time.

    Intelligent Frame Grabbers

Intelligent frame grabbers effectively contain an advanced grabber but also include additional on-board processing hardware that provides the grabber with a form of intelligence, escalating it from merely a messenger to a device with processing capabilities built in.

Intelligent frame grabbers can be split into three predominant types:

Intelligent capture: These remove interaction with the host during the acquisition cycle, which suits time-critical applications. Only the actual data transfer requires the host processor, and as such the grabber notifies the host when new data resides in the grabber's memory.

Pre-processing engines: These grabbers free up even more host processing time by performing some of the functions the host would normally do before the data is ready to process. These functions include flat-field correction, image arithmetic, convolution filters and data reduction.

Expandable processing engines: These can combine up to 30 processors into one computing engine, increasing the processing power of the host. They excel in applications where the sampling rate is higher or where the processing cannot be accomplished on a single- or dual-processor host.


    4. Imaging Techniques

    I. Strobed and Steady imaging

Imaging can be either strobed or steady (continuous wave), according to the needs of the application in machine vision inspection systems.

Strobe imaging: In strobe imaging, a switched current source is coupled to the LED emitter element to enable strobe operation synchronous with the periodic operation of a line imaging camera. In applications where the exposure time is short, such as imaging fast-moving objects, strobing LED lights is a popular way to take advantage of the LED's ability to deliver short cycles with high light output. The increase in light output comes from the LED's ability to be driven for short periods with currents exceeding normal steady-state values, followed by a relatively long cool-off time. The ratio of on-time to off-time is typically 1 to 100, and pulse durations are typically below 100 microseconds.

The heart of the strobe source is the trigger detector, which connects to an image acquisition device or an external sensor and reads a trigger when the strobe event should occur. Once the trigger detector detects a signal, a strobe event begins (refer to Figure 4.1).
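From the figures quoted above (pulse durations below 100 microseconds and a 1:100 on/off ratio), the maximum safe strobe repetition rate follows directly; the values in the sketch below are those illustrative figures, not measured data:

    def max_strobe_rate_hz(pulse_us=100.0, on_off_ratio=1.0 / 100.0):
        """Maximum repetition rate given the pulse width and permitted on/off ratio."""
        off_time_us = pulse_us / on_off_ratio    # mandatory cool-off time after each pulse
        period_us = pulse_us + off_time_us       # one full strobe cycle
        return 1e6 / period_us

    print(round(max_strobe_rate_hz()))   # ~99 strobes per second for 100 us pulses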

An example of strobe imaging: sample particles are dropped by a vibrating feeder between a video camera and a synchronized strobe light. When the strobe flashes, the camera takes an image of the particles, which is then digitized by a computer frame grabber. For different products, the user can select either a camera with high magnification or a camera with a wide field of view, according to the particle size in the sample.

    Figure 4.2: Schematic of strobe imaging example

Steady imaging: In steady imaging, the camera continuously obtains the image under a continuous-wave illumination source. Continuous-wave xenon lamps are used in steady-imaging machine-vision applications where continuous light and a good colour balance are required.

Figure 4.1: Trigger timing diagram in strobe imaging
Figure 4.3: Digital image prototype with uniform pixels
Figure 4.4: A xenon light source

Continuous-wave xenon systems can


produce in excess of 250,000 candelas. For many years, continuous-wave xenon illumination systems have been used in machine-vision systems to provide high intensity and a wide spectrum of radiation. Such features improve the contrast of captured images, increase the depth of field of the imaged scene, and shorten camera integration times.

In steady imaging, the machine vision system continuously images the viewing area and determines, through image processing, when a part is present. In these cases image acquisition must be commanded by software; the software triggers the event. After image acquisition, some image processing is performed to determine whether a part is present for viewing. In some implementations of this technique, the same image that is used for detection (triggering) is retained and analysed as the primary image; in other cases a new image is acquired by software command and its contents analysed.

The latency for part detection through imaging is the sum of the latencies for exposure, transfer of the image data through the system, and initial processing. If the same image can be used for analysis, there is no further latency for image acquisition; steady imaging is therefore preferred in applications that require very high speed. For this purpose of continuous inspection with low power dissipation, a CMOS camera is used.

For discrete parts, part detection using an imaging algorithm is generally more complex, less reliable, and has longer latency than using an external sensor to trigger the process.

    II. Imaging Configurations

The principle of how information is transferred from an object (the device under inspection) to a detector (e.g. a CCD camera) is based on how photons interact with the material of the object. If the device under inspection modifies the incoming light in such a way that the outgoing rays differ from the incoming rays, we say that the object has created contrast. This is the basic principle of all machine vision applications. If the object cannot modify the incoming beam in some discernible fashion, then the device cannot be visible to either a camera or the human eye. The goal of machine vision lighting is to provide incoming illumination in such a fashion that the naturally occurring features and properties of the device under test can be exploited to maximize contrast.

    Bright Field Imaging:

A bright field image is formed using light (or electrons) transmitted through the object. Regions of the object that are thicker or denser scatter photons more strongly and appear darker in the image; when no object is present, a bright background is seen. Since the background tends to be bright for the majority of materials, lighting modes consisting primarily of light that emanates from this part of the lighting hemisphere are called brightfield. Brightfield lighting modes come in a variety of styles, sizes and shapes and provide varying degrees of contrast enhancement depending on the nature of the part under inspection.

The purest and most interesting of the brightfield modes is that produced by what is commonly called coaxial illumination. Coaxial illuminators produce light which appears to emanate from the detector, bounce off the part, and then return back upon itself to the detector.

To accomplish this lighting mode, a beamsplitter is oriented at 45 degrees so as to allow half the light impinging on it to pass through and the other half to be specularly reflected (see figure). For reflective surfaces, the background signal is very high and uniform over the field of view.

Figure 4.5: Brightfield modes reflect back to the camera


Any variation on the specular surface (reflective, transmissive or absorptive) will result in a reduction in the amount of light making it back into the sensor, causing the image of this area to appear darker than the surrounding bright background. This is an excellent lighting mode for flat parts that have a mixture of highly reflective areas surrounded by non-specular absorptive areas. Flat gold pads on a fiberglass circuit board provide an excellent example of the high-contrast capability of the coaxial illumination mode.

Dark field imaging:
A dark field image is formed using light (or electrons) scattered from the object; in the absence of an object, therefore, the image appears dark. If the part under inspection is flat and has a nonzero reflectance, all light emanating from points below the brightfield angle will be reflected off the part away from the detector. Since the background tends to be dark for the majority of materials, lighting modes consisting primarily of light that emanates from this part of the lighting hemisphere are called darkfield.

Dark field illuminators provide varying degrees of contrast enhancement depending on the nature of the part under inspection. For reflective surfaces, the background signal is generally very low and uniform over the field of view. Any variation on the specular surface (predominantly reflective) will result in an increase in the amount of light making it back into the sensor, causing the image of this area to appear lighter than the surrounding dark background (see figure).

This is an excellent lighting mode for flat parts that have surface variations or imperfections that deviate from the perfectly flat background. Application examples include surface flaw detection (scratches, pits, etc.) as well as OCR on stamped characters.

Figure 4.6: Coaxial illumination is one of the most common forms of brightfield illumination. It is extremely useful for flat objects with both reflective and absorbing features.
Figure 4.7: Darkfield illumination provides low-angle incident lighting that highlights any deviations from a perfectly flat surface. Most of the light generated never makes it to the camera.


For rounded objects

For a rounded part, the normal remains perpendicular to the surface, but because the surface is no longer flat, the direction of the normal varies across the field of view and is no longer parallel at all points. For a surface with a slight convex curvature, this phenomenon effectively increases the brightfield portion of the hemisphere and reduces the darkfield portion (see figure).

Figure 4.8: Curved surfaces alter the effective brightfield and darkfield regions because the normal vectors are no longer parallel.

As the curvature continues to increase towards a spherical ball, the brightfield region continues to grow until the entire hemisphere can provide only the brightfield lighting mode.

    Spherical Brightfield Illuminators

To properly inspect features on a curved surface (spherical or cylindrical) using brightfield illumination, a special device called a Spherical Brightfield Illuminator (SBI) is used. The goal of the SBI is to provide light such that all incident rays impinge upon the surface parallel to the normal vector. For this, a collimated coaxial lighting device may be combined with a convex spherical lens to create a convex spherical illuminator. Schematically, light travels from the collimator to the beam splitter, where 50% of the energy reflects toward the part under inspection (see figure).

In a similar manner, reflective surfaces that are inwardly domed or depressed may also be imaged using the same spherical brightfield technique with a slight modification. Here the same basic setup is utilized, but the convex spherical lens is replaced with a concave spherical lens. The concave lens is aligned such that the focal point of the lens and the concave surface are congruent (see Figure 4.10).

Figure 4.9: In an SBI, collimated coaxial light passes through a lens element that projects incident light normal to the curved surface.


    Applications

A common application of this technique is the imaging of soda and beer can bottoms, where the date and lot code must be read from the bottom of the can. These codes are generally printed onto the concave bottom of the reflective aluminium can, creating a difficult illumination problem. Since the inks used tend to be dark in colour and fairly absorbing, brightfield imaging is applied. The curved surface introduces an additional challenge because normal brightfield techniques fail to produce a uniform illumination field over the entire dome. As can be seen in the figure, dark field illumination provides a uniform background but fails to create high contrast with the date and lot codes. The concave SBI technique increases the uniform brightfield zone so that the black absorbing ink can be imaged with high contrast against the reflective background.


Figure 4.10: In the concave SBI, collimated coaxial light passes through a concave lens element projecting incident light normal to the concave surface.

Figure 4.11: Concave SBI provides high-contrast brightfield illumination for objects that have concave domed or depressed regions.


    5. Image Processing Techniques

    I. Image types

A digital image is an array, or matrix, of square pixels (picture elements) arranged in columns and rows. In an 8-bit grayscale image each picture element has an assigned intensity ranging from 0 to 255. A grayscale image is what people normally call a black-and-white image, but the name emphasizes that such an image also includes many shades of grey.

A normal grayscale image has an 8-bit colour depth = 256 grayscales. A true colour image has a 24-bit colour depth = 3 x 8 bits = 256 x 256 x 256 colours = ~16 million colours.

Some grayscale images have more grayscales, for instance 16 bits = 65,536 grayscales. In principle three such 16-bit grayscale images can be combined to form an image with 281,474,976,710,656 possible colour values.
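These colour-depth figures follow directly from powers of two, as the short sketch below confirms:

    def grey_levels(bits_per_channel):
        """Number of distinct levels for a given bit depth."""
        return 2 ** bits_per_channel

    print(grey_levels(8))         # 256 grayscales in an 8-bit image
    print(grey_levels(8) ** 3)    # 16777216 colours in a 24-bit RGB image
    print(grey_levels(16) ** 3)   # 281474976710656 combinations of three 16-bit channels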

There are two general groups of images: vector graphics (or line art) and bitmaps (pixel-based images). Some of the most common file formats are:

GIF is an 8-bit (256 colour), non-destructively compressed bitmap format. It is mostly used for the web. It has several sub-standards, one of which is the animated GIF.

JPEG is a very efficient (i.e. much information per byte) destructively compressed 24-bit (16 million colours) bitmap format. It is widely used, especially for the web and Internet, where bandwidth is limited.

Figure 5.1: An image is an array or matrix of pixels arranged in columns and rows.

Figure 5.2: Each pixel has a value from 0 (black) to 255 (white). The possible range of pixel values depends on the colour depth of the image, here 8 bit = 256 tones or grayscales.

Figure 5.3: A true-colour image assembled from three grayscale images coloured red, green and blue. Such an image may contain up to 16 million different colours.


TIFF is the standard 24-bit publication bitmap format. It compresses the image non-destructively.

PS is PostScript, a standard vector format. It has numerous sub-standards and can be difficult to transport across platforms and operating systems.

PSD is a dedicated Photoshop format that keeps all the information in an image, including all the layers.

We may loosely classify images according to the way in which the interaction occurs, understanding that the division is sometimes unclear and that images may be of multiple types. Figure 5.4 depicts these various image types.

    Figure 5.4: Reflection, Emitted and Altered image formations

Reflection images sense radiation that has been reflected from the surfaces of objects. The radiation itself may be ambient or artificial, and it may be from a localized source.

Emitted radiation images sense radiation that has been emitted from the source objects. The radiation itself may be ambient or artificial, and it may be from a localized source.

Altered images sense radiation that has been altered by transparent or translucent objects.

Several standard types of images are as follows:

Grayscale Images: These are coded using one number per pixel, representing one of 256 different gray tones ranging from black to white (figure 5.5).

    Figure 5.5: Grayscale Image Figure 5.6: Palette Image Figure 5.7: RGB image


    Palette Images

These are images coded using one number per pixel, where the number specifies which color in a palette of up to 256 different colors should be displayed for that pixel (as shown in figure 5.6). The colors in the palette can be True Color RGB colors. Palette images save space at the cost of a reduced total number of colors available for use in the image. The image shown above uses only 16 colors.

RGB Images - These images use three numbers for each pixel, allowing the possible use of millions of colors within the image at the cost of requiring three times as much space as grayscale or palette images. They are often called True Color RGB images (shown in figure 5.7).

RGBa Images - These images are RGB images with a fourth number added for each pixel that specifies the transparency of that pixel in the range 0 to 255. When seen in an image window, grayscale, palette and RGB images will be shown on a background of solid color (white by default). RGBa images are shown on a background of an alternating white and light gray checkerboard pattern so that differences in transparency are more visible. RGBa images are used when combining multiple images in maps for elaborate graphics composition or for the creation of special visual effects in maps. For example, the RGBa image illustrated alongside is shown in a layer above a grid of lines that become visible to an increasing degree as transparency increases towards the bottom of the image.

Compressed Images - Compressed images use sophisticated wavelet compression technology not only to compress the amount of data an image requires but also to reconstitute the image dynamically on demand. At any given zoom level the desired view of the image is reconstituted from the compressed data store. Compressed images can be viewed, but not edited or otherwise manipulated. Compressed images are used to display very large images that would require too much time for display and possibly too much room for storage if they were not compressed.

    II. Image Enhancement

    Images may suffer from the following degradations:

Poor contrast due to poor illumination or the finite sensitivity of the imaging device.
Electronic sensor noise or atmospheric disturbances leading to broadband noise.
Aliasing effects due to inadequate sampling.

Enhancement techniques transform an image into a better image by sharpening the image features for display and analysis. Image enhancement is the process of applying these techniques to facilitate the development of a solution to a computer/machine imaging problem.

    Figure 5.8: RGBa image

    Figure 5.9: Compressed image


    Figure 5.10: Schematic of Image Enhancement

The figure above illustrates the importance of the feedback loop from the output image back to the start of the enhancement process, and models the experimental nature of the development. The range of applications includes using enhancement techniques as preprocessing steps to ease the next processing step, or as post-processing steps to improve the visual perception of a processed image; image enhancement may also be an end in itself. Enhancement methods operate in the spatial domain by manipulating the pixel data or in the frequency domain by modifying the spectral components.

Image enhancement techniques can be divided into two broad categories:
1. Spatial domain methods, which operate directly on pixels, and
2. Frequency domain methods, which operate on the Fourier transform of an image.

    1. Spatial domain methods

The value of a pixel with coordinates (x, y) in the enhanced image F is the result of performing some operation on the pixels in a rectangular neighbourhood of (x, y) in the input image f. Types include:

A) Grey scale manipulation:
An operator T acts only on a 1 x 1 pixel neighbourhood in the input image, that is, F(x, y) depends on the value of f only at (x, y). The simplest case is thresholding, where the intensity profile is replaced by a step function that is active at a chosen threshold value. In this case any pixel with a grey level below the threshold in the input image gets mapped to 0 in the output image; other pixels are mapped to 255. Other grey scale transformations are outlined in the Figure:


Figure 5.11: Depending upon different step functions, different grey scale mappings can be observed.
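A minimal thresholding sketch in Python/NumPy is given below; the threshold value T = 128 is only an illustrative choice:

    import numpy as np

    def threshold(f, T=128):
        """Map grey levels below T to 0 and all other pixels to 255."""
        return np.where(f < T, 0, 255).astype(np.uint8)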

    B) Histogram Equalization

Histogram equalization is a common technique for enhancing the appearance of images. Suppose we have an image which is predominantly dark. Then its histogram would be skewed towards the lower end of the grey scale and all the image detail would be compressed into the dark end of the histogram. If we could 'stretch out' the grey levels at the dark end to produce a more uniformly distributed histogram, then the image (see Figure) would become much clearer.

Figure 5.12: The original image and its histogram, and their equalized versions.
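The stretching described above can be sketched as a grey-level look-up table built from the cumulative histogram. The NumPy sketch below assumes an 8-bit (uint8) image with more than one grey level:

    import numpy as np

    def equalize(f):
        """Histogram equalisation for an 8-bit image with more than one grey level."""
        hist = np.bincount(f.ravel(), minlength=256)        # grey-level histogram
        cdf = hist.cumsum()                                  # cumulative distribution of grey levels
        cdf_min = cdf[cdf > 0].min()
        # Look-up table mapping old grey levels towards a roughly uniform distribution.
        lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255).astype(np.uint8)
        return lut[f]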

    C) Image Smoothing

The aim of image smoothing is to diminish the effects of camera noise, spurious pixel values, missing pixel values, etc. There are many different techniques for image smoothing:


Neighborhood averaging: Each point in the smoothed image F(x, y) is obtained from the average pixel value in a neighborhood of (x, y) in the input image. This smoothing tends to blur edges because the high frequencies in the image are attenuated.

Edge-preserving smoothing: This is also called median filtering, since we set the grey level to be the median of the pixel values in the neighborhood of that pixel. The outcome of median filtering is that pixels with outlying values are forced to become more like their neighbors, while at the same time edges are preserved.

Figure 5.13: Original image; with noise; the result of averaging; and the result of median filtering.
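Both smoothing variants can be sketched with SciPy's ndimage filters (an assumed dependency); the 3 x 3 neighbourhood size is illustrative:

    import numpy as np
    from scipy.ndimage import uniform_filter, median_filter

    def smooth_mean(f):
        """Neighborhood averaging: blurs edges because high frequencies are attenuated."""
        return uniform_filter(f.astype(float), size=3)

    def smooth_median(f):
        """Median filtering: forces outlying pixels towards their neighbors while preserving edges."""
        return median_filter(f, size=3)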

D) Image sharpening
The main aim of image sharpening is to highlight fine detail in the image, or to enhance detail that has been blurred (perhaps due to noise or other effects, such as motion). With image sharpening, we want to enhance the high-frequency components.

    E) High boost filtering

High pass filtering can be defined in terms of subtracting a low pass image from the original image, that is,

High pass = Original - Low pass

However, in many cases where a high pass image is required, we also want to retain some of the low frequency components to aid in the interpretation of the image. Thus, if we multiply the original image by an amplification factor A before subtracting the low pass image, we will get a high boost or high frequency emphasis filter. Thus,

High boost = A · Original - Low pass
           = (A - 1) · Original + (Original - Low pass)
           = (A - 1) · Original + High pass
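The relation above translates directly into a short sketch; here a Gaussian filter (an assumed choice from SciPy) stands in for the low pass image, and A and sigma are illustrative parameters:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def high_boost(f, A=1.5, sigma=2.0):
        """High boost = (A - 1) * Original + High pass, with a Gaussian low pass image."""
        f = f.astype(float)
        low = gaussian_filter(f, sigma)    # low pass version of the image
        high = f - low                     # high pass component
        return (A - 1.0) * f + high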

2. Frequency domain methods:
In this approach we compute the Fourier transform of the image to be enhanced, multiply the result by a filter, and take the inverse transform to produce the enhanced image.

Low pass filtering involves the elimination of the high frequency components in the image. It results in blurring of the image (and thus a reduction in the sharp transitions associated with noise). An ideal low pass filter (see Figure) would retain all the low frequency components and eliminate all the high frequency components. However, ideal filters suffer from two problems: blurring and ringing. These problems are caused by the shape of the associated spatial domain filter, which has a large number of undulations.


Smoother transitions in the frequency domain filter, such as the Butterworth filter, achieve much better results.

    Figure 5.14: Transfer function for ideal low pass filter.
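A hedged NumPy sketch of the procedure follows: transform, multiply by a Butterworth transfer function, inverse transform. The cutoff frequency and filter order below are illustrative values, not values used in the project:

    import numpy as np

    def butterworth_lowpass(f, cutoff=30.0, order=2):
        """FFT, multiply by a Butterworth low pass transfer function, then inverse FFT."""
        rows, cols = f.shape
        u = np.fft.fftfreq(rows).reshape(-1, 1) * rows        # frequency coordinates (unshifted)
        v = np.fft.fftfreq(cols).reshape(1, -1) * cols
        d = np.sqrt(u ** 2 + v ** 2)                           # distance from zero frequency
        H = 1.0 / (1.0 + (d / cutoff) ** (2 * order))          # Butterworth transfer function
        F = np.fft.fft2(f.astype(float))
        return np.real(np.fft.ifft2(F * H))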

    III. Image Segmentation

A segmentation of an image is a partition of the image that reveals some of its content. The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze.

The result of image segmentation is a set of segments that collectively cover the entire image, or a set of contours extracted from the image. Applications of image segmentation include:

    1) Identifying objects in a scene for object-based measurements such as size and shape.

2) Identifying objects in a moving scene for object-based video compression (MPEG-4).

3) Identifying objects which are at different distances from a sensor, using depth measurements from a laser range finder, enabling path planning for mobile robots.

    Different types of image segmentation are:

1) Segmentation based on greyscale:
The original greyscale image leads to inaccuracies in labelling and in distinguishing components; hence image segmentation is required.

    Figure 5.15: Segmentation based on greyscale.

2) Segmentation based upon texture:
This type of segmentation enables object surfaces with varying patterns of grey to be segmented.

Figure 5.16: Segmentation based upon texture.


3) Segmentation based on range:
In this type of segmentation a range image is obtained with a laser range finder. A segmentation based on range (the object distance from the sensor) is useful in guiding mobile robots.

    Figure 5.17: Segmentation based upon range

    4) Segmentation based on motion:

Here the objective of segmenting the image is to observe the motion parameters accurately. The main difficulty of motion segmentation is that an intermediate step is required to (either implicitly or explicitly) estimate an optical flow field.

    Figure 5.18: Segmentation based upon Motion

    Segmentation techniques:

    1) Edge-based technique of segmentation:

In this technique the image is analysed and segmented by boundary detection methods. After that, further classification and analysis are performed on the area of interest.

    Figure 5.19: Schematic of Edge-based technique


Figure 5.20: Edge-based technique of segmentation; here the area of interest, i.e. the text, has been extracted from the background.
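One possible realisation of this pipeline, sketched with scikit-image and SciPy (assumed libraries; the Canny parameter sigma = 2.0 is illustrative), detects boundaries, fills the enclosed areas and labels them:

    import numpy as np
    from scipy import ndimage
    from skimage.feature import canny

    def edge_segment(gray):
        """Boundary detection followed by labelling of the enclosed areas of interest."""
        edges = canny(gray, sigma=2.0)                 # boundary detection
        filled = ndimage.binary_fill_holes(edges)      # close the regions bounded by edges
        labels, n = ndimage.label(filled)              # label each enclosed region
        return labels, n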

    2) Thresholding

The simplest method of image segmentation is called the thresholding method. This method is based on a threshold value to turn a gray-scale image into a binary image. The key of this method is to select the threshold value and segment the image according to that value.

    3) Clustering method

The clustering method is based upon the K-means algorithm, in which an iterative technique is used to partition an image into K clusters. The basic algorithm is:

1. Pick K cluster centers, either randomly or based on some heuristic.
2. Assign each pixel in the image to the cluster that minimizes the distance between the pixel and the cluster centre.
3. Re-compute the cluster centers by averaging all of the pixels in the cluster.
4. Repeat steps 2 and 3 until convergence is attained (e.g. no pixels change clusters).

In this case, distance is the squared or absolute difference between a pixel and a cluster centre. The difference is typically based on pixel color, intensity, texture, and location, or a weighted combination of these factors. K can be selected manually, randomly, or by a heuristic.
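A minimal intensity-only version of this algorithm can be sketched in NumPy; K and the iteration limit below are illustrative choices:

    import numpy as np

    def kmeans_segment(f, K=3, iters=20):
        """Cluster grey levels into K groups and return a label image."""
        pixels = f.reshape(-1).astype(float)
        centres = np.random.choice(pixels, K, replace=False)          # step 1: initial centres
        for _ in range(iters):
            dist = np.abs(pixels[:, None] - centres[None, :])          # step 2: pixel-to-centre distance
            labels = dist.argmin(axis=1)                               # assign pixels to nearest centre
            new = np.array([pixels[labels == k].mean() if np.any(labels == k)
                            else centres[k] for k in range(K)])        # step 3: recompute the centres
            if np.allclose(new, centres):                              # step 4: stop at convergence
                break
            centres = new
        return labels.reshape(f.shape)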

Figure 5.21: Schematic of the K-means algorithm; a general representation of how clusters are decided.

Figure 5.22: Different types of clusters in the K-means algorithm method of segmentation.


    4) Compression-based methods

Compression-based methods postulate that the optimal segmentation is the one that minimizes, over all possible segmentations, the coding length of the data. In other words, segmentation tries to find patterns in an image, and any regularity in the image can be used to compress it. The method describes each segment by its texture and boundary shape.

    5) Histogram-based methods

In this technique, a histogram is computed from all of the pixels in the image, and the peaks and valleys in the histogram are used to locate the clusters in the image. Color or intensity can be used as the measure.

A refinement of this technique is to recursively apply the histogram-seeking method to clusters in the image in order to divide them into smaller clusters. This is repeated with smaller and smaller clusters until no more clusters are formed.

One disadvantage of the histogram-seeking method is that it may be difficult to identify significant peaks and valleys in the image. In this class of techniques, distance metrics and integrated region matching are familiar approaches.

    Figure 5.23: Image Histogram
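A simple histogram-seeking sketch is shown below: it takes the two strongest peaks, suppressing a small neighbourhood (an illustrative 25 grey levels) around the first so the second can be found, and thresholds at the valley between them. A roughly bimodal 8-bit image is assumed:

    import numpy as np

    def histogram_threshold(f):
        """Threshold an 8-bit image at the valley between its two strongest histogram peaks."""
        hist = np.bincount(f.ravel(), minlength=256)
        p1 = hist.argmax()                                 # strongest peak
        masked = hist.copy()
        lo, hi = max(p1 - 25, 0), min(p1 + 25, 255)
        masked[lo:hi + 1] = 0                              # suppress the first peak's neighbourhood
        p2 = masked.argmax()                               # second strongest peak
        a, b = sorted((p1, p2))
        valley = a + hist[a:b + 1].argmin()                # valley between the two peaks
        return f > valley                                  # binary segmentation mask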

    6) Region growing methods

The region growing method, or the seeded region growing method, takes a set of seeds as input along with the image. The seeds mark each of the objects to be segmented. The regions are iteratively grown by comparing all unallocated neighbouring pixels to the regions. The difference between a pixel's intensity value and the region's mean is used as a measure of similarity. The pixel with the smallest difference measured this way is allocated to the respective region. This process continues until all pixels are allocated to a region.


Figure 5.24: (a) Image showing defective welds. (b) Seed points. (c) Result of region growing. (d) Boundaries of segmented defective welds.

    Seeded region growing requires seeds as additional input. The segmentation results are dependent on thechoice of seeds. Noise in the image can cause the seeds to be poorly placed.
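The sketch below is a simplified, tolerance-based variant of region growing (rather than the best-first allocation described above): a single region is grown from one seed, and a neighbour is accepted while its difference from the running region mean stays within an illustrative tolerance:

    import numpy as np

    def region_grow(f, seed, tol=10.0):
        """Grow one region from a (row, col) seed while neighbours stay within tol of the region mean."""
        f = f.astype(float)
        region = np.zeros(f.shape, dtype=bool)
        region[seed] = True
        frontier = [seed]
        total, count = f[seed], 1
        while frontier:
            y, x = frontier.pop()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < f.shape[0] and 0 <= nx < f.shape[1] and not region[ny, nx]:
                    if abs(f[ny, nx] - total / count) <= tol:      # compare with the region mean
                        region[ny, nx] = True
                        frontier.append((ny, nx))
                        total += f[ny, nx]
                        count += 1
        return region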

    7) Split-and-merge methods

Split-and-merge segmentation is based on a quadtree partition of an image. It is sometimes called quadtree segmentation.

This method starts at the root of the tree, which represents the whole image. If a square is found to be non-uniform (not homogeneous), it is split into four son-squares (the splitting process), and so on; homogeneous neighbouring squares can then be merged. Each node in the tree represents a segmented region. This process continues recursively until no further splits or merges are possible.
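The splitting half of the procedure can be sketched as a recursive quadtree routine; homogeneity is judged here by the grey-level standard deviation against an illustrative limit, a square power-of-two image is assumed, and the merge step is omitted:

    import numpy as np

    def quadtree_split(f, y=0, x=0, size=None, sigma_max=10.0, leaves=None):
        """Split phase only: return (row, col, size) of homogeneous square blocks."""
        if size is None:
            size = f.shape[0]                   # assumes a square image whose side is a power of two
        if leaves is None:
            leaves = []
        block = f[y:y + size, x:x + size]
        if size <= 1 or block.std() <= sigma_max:
            leaves.append((y, x, size))         # homogeneous block becomes a leaf (segment)
        else:
            half = size // 2                    # split into four son-squares and recurse
            for dy, dx in ((0, 0), (0, half), (half, 0), (half, half)):
                quadtree_split(f, y + dy, x + dx, half, sigma_max, leaves)
        return leaves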

    8) Graph partitioning methods

Graph partitioning methods can effectively be used for image segmentation. In these methods the image is modeled as a weighted, undirected graph. Usually a pixel or a group of pixels is associated with a node, and edge weights define the similarities between neighbouring pixels. The graph (image) is then partitioned according to a criterion designed to model clusters of interest. Each partition of the nodes (pixels) output from these algorithms is considered an object segment in the image.

    9) Watershed transformation

The watershed transformation considers the gradient magnitude of an image as a topographic surface. Pixels having the highest gradient magnitude intensities (GMIs) correspond to watershed lines, which represent the region boundaries. Water placed on any pixel enclosed by a common watershed line flows downhill to a common local intensity minimum (LIM). Pixels draining to a common minimum form a catch basin, which represents a segment.


    Figure 5.25: Watershed Transformation.
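A hedged sketch using scikit-image (an assumed library) follows the description above: the Sobel gradient magnitude acts as the topographic surface, markers are placed at its local minima, and the watershed fills the catch basins. The min_distance value is illustrative:

    import numpy as np
    from skimage.filters import sobel
    from skimage.feature import peak_local_max
    from skimage.segmentation import watershed

    def watershed_segment(gray):
        """Label the catch basins of the gradient-magnitude surface."""
        gradient = sobel(gray)                               # gradient magnitude image
        minima = peak_local_max(-gradient, min_distance=10)  # one marker per local intensity minimum
        markers = np.zeros(gray.shape, dtype=int)
        markers[tuple(minima.T)] = np.arange(1, len(minima) + 1)
        return watershed(gradient, markers)                  # labelled catch basins (segments)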

    10) Model based segmentation

The central assumption of such an approach is that the structures of interest have a repetitive form of geometry. Therefore, one can seek a probabilistic model to explain the variation of the shape of the area of interest, and then, when segmenting an image, impose constraints using this model as a prior. Such a task involves:

(i) Classification of the training examples,
(ii) Probabilistic representation of the variation of the registered samples,
(iii) Statistical inference between the model and the image.

    11) Semi-automatic segmentation

In this kind of segmentation, the user outlines the region of interest with mouse clicks, and algorithms are applied so that the path that best fits the edge of the image is shown.

    12) Neural networks segmentation

Neural network segmentation relies on processing small areas of an image using an artificial neural network or a set of neural networks. After such processing, the decision-making mechanism marks the areas of the image according to the category recognized by the neural network. A type of network designed especially for this is the Kohonen map.

Compared with conventional image processing means, neural network segmentation methods have several significant merits, including robustness against noise, independence of geometric variations in input patterns, and the capability of bridging minor intensity variations in input patterns.

    IV. Feature Extraction

Feature extraction is used to simplify and reduce the dimensionality in image processing when the input data is too large and contains redundant information, for faster and better processing. This involves transforming the input data into a set of features. If the features extracted are carefully chosen,


it is expected that the feature set will extract the relevant information from the input data in order to perform the desired task using this reduced representation instead of the full-size input.

In other words, feature extraction is a method of capturing the visual content of images for indexing and retrieval. Feature extraction includes methods for training learning machines with millions of low-level features. Identifying relevant features leads to better, faster, and easier to understand learning machines. Because of perception subjectivity, there cannot be a single best representation for a feature.

    Figure 5.26: Feature extraction process and Image retrieval

Figure 5.26 above shows the architecture of a typical image retrieval system. For each image in the image database, its features are extracted and the obtained feature space (or vector) is stored in the feature database. When a query image comes in, its feature space is compared with those in the feature database one by one, and the similar images with the smallest feature distance are retrieved.

Some kind of preprocessing is applied before the features which describe the image content are extracted. The processing involves filtering, normalization, segmentation, and object identification. The output of this stage is a set of significant regions and objects.

It should be observed that feature extraction should be guided by the following concerns:

The features should carry enough information about the image and should not require any domain-specific knowledge for their extraction.
They should be easy to compute in order for the approach to be feasible for a large image collection and rapid retrieval.
They should relate well to human perceptual characteristics, since users will finally determine the suitability of the retrieved images.

    Image Properties

Properties of the image can refer to the following:

Global properties of an image: e.g. average gray level, shape of the intensity histogram, etc.

Local properties of an image: we can refer to some local features as image primitives, such as circles, lines and texels (elements composing a textured region). Other local features include the shape of contours, etc.


    Image features

The feature of an object is defined as a function which is computed such that it quantifies some significant characteristics of the object.

    We classify the various features currently employed as follows:

General or local features: In these features no specific shapes or higher spatial information are detected. Application-independent features such as color, texture, and shape are the local features. According to the abstraction level, they can be further divided into:

1. Pixel-level features: features calculated at each pixel, e.g. color and location.
2. Local features: features calculated over the results of a subdivision of the image based on image segmentation or edge detection.
3. Global features: features calculated over the entire image or just a regular sub-area of an image.

Domain-specific features: These are domain-specific and application-dependent features such as human faces, fingerprints, and conceptual features. These features are often a synthesis of low-level features for a specific domain.

On the other hand, all features can be classified into:

Low-level features: Low-level features can be extracted directly from the original images. Some low-level shape features include:
i. Edge detection
ii. Circle detection
iii. Line detection
iv. Corner detection

High-level features: High-level feature extraction must be based on low-level features.

Image features are local, meaningful, detectable parts of an image:

Meaningful: Features are associated with scene elements that are of interest to the user in the image formation process. They should be invariant to some variations in the image formation process (i.e. invariance to viewpoint and illumination for images captured with digital cameras).

Detectable: They can be located/detected from images via algorithms. They are described by a feature vector, representing the useful information out of the data.

    Different features in an image include:

    1. Colour features

The color feature is one of the most widely used visual features in image retrieval. Images characterized by color features have many advantages:

Robustness. The color histogram is invariant to rotation of the image about the view axis, and changes only in small steps when the image is rotated otherwise or scaled. It is also insensitive to changes in image and histogram resolution and to occlusion.


Effectiveness. There is a high percentage of relevance between the query image and the extracted matching images.

Implementation simplicity. The construction of the color histogram is a straightforward process, including scanning the image, assigning color values to the resolution of the histogram, and building the histogram using color components as indices.

Computational simplicity. The histogram computation has O(XY) complexity for images of size X x Y. The complexity for a single image match is linear, O(n), where n represents the number of different colors, or the resolution of the histogram.

Low storage requirements. The color histogram size is significantly smaller than the image itself, assuming color quantization.
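The construction and comparison costs quoted above can be sketched as follows; the choice of 8 bins per channel is illustrative:

    import numpy as np

    def colour_histogram(rgb, bins_per_channel=8):
        """Quantised 3-D colour histogram, normalised to unit sum. Construction cost is O(X*Y)."""
        q = (rgb.astype(int) * bins_per_channel) // 256               # quantise each channel
        idx = (q[..., 0] * bins_per_channel + q[..., 1]) * bins_per_channel + q[..., 2]
        hist = np.bincount(idx.ravel(), minlength=bins_per_channel ** 3)
        return hist / hist.sum()

    def histogram_distance(h1, h2):
        """L1 distance between two histograms: linear in the number of bins, O(n)."""
        return np.abs(h1 - h2).sum()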

    2. Texture

Texture is another important property of images. Texture is a powerful regional descriptor that helps in the retrieval process. Texture on its own does not have the capability of finding similar images, but it can be used to classify textured images from non-textured ones and then be combined with another visual attribute, like color, to make the retrieval more effective.

Texture has been one of the most important characteristics used to classify and recognize objects, and it has been used in finding similarities between images in multimedia databases. Generally we capture patterns in the image data (or the lack of them), e.g. repetitiveness and granularity.

    3. Shape

Shape-based image retrieval is the measuring of similarity between shapes represented by their features. Shape is an important visual feature and one of the primitive features for image content description. Shape content description is difficult to define because measuring the similarity between shapes is difficult. Therefore, two steps are essential in shape-based image retrieval: feature extraction and similarity measurement between the extracted features.

Shape descriptors can be divided into two main categories: region-based and contour-based methods. Region-based methods use the whole area of an object for shape description, while contour-based methods use only the information present in the contour of an object.

The shape descriptors calculated from an object's contour are as follows:

circularity
aspect ratio
discontinuity angle irregularity
length irregularity
complexity
right-angleness
sharpness

These are translation, rotation (except angle) and scale invariant shape descriptors. It is possible to extract image contours from the detected edges. From the object contour the shape information is derived; we extract and store a set of shape features from the contour image for each individual contour.
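Two of the listed contour descriptors, circularity and aspect ratio, can be sketched with scikit-image region properties (an assumed library); a single non-degenerate object in a binary mask is assumed:

    import numpy as np
    from skimage.measure import label, regionprops

    def shape_descriptors(binary_mask):
        """Circularity and aspect ratio for the first labelled object in a binary mask."""
        props = regionprops(label(binary_mask))[0]
        circularity = 4.0 * np.pi * props.area / (props.perimeter ** 2)   # 1.0 for a perfect disc
        aspect_ratio = props.major_axis_length / props.minor_axis_length
        return circularity, aspect_ratio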


    6. Soft Computing Techniques

Real world problems are mostly imprecise and uncertain. Conventional computing techniques, also called hard computing techniques, are based upon precise analytical models which arrive at an ideal output at the cost of processing time. Thus hard computing techniques are susceptible to imprecision, uncertainty, partial truth and approximations. Soft computing techniques overcome these problems because they are built to tolerate the non-ideal conditions of such environments. Soft computing techniques are not based upon precise analytical models; instead they exploit imprecision and uncertainty to achieve tractability, robustness and low cost. The principal techniques in soft computing are the artificial neural network, fuzzy logic and the genetic algorithm, which are described below.

    I. Artificial Neural Network

In its most general form, an artificial neural network is a machine that is designed to model the way in which the brain performs a particular task or function of interest; the network is usually implemented using electronic components or simulated in software on a digital computer. Our interest will be confined largely to neural networks that perform useful computations through a process of learning. To achieve good performance, neural networks employ a massive interconnection of simple computing cells referred to as neurons or processing units. We may thus offer the following definition of a neural network viewed as an adaptive machine.

A neural network is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects:

1. Knowledge is acquired by the network through a learning process.
2. Interneuron connection strengths known as synaptic weights are used to store the knowledge.

The procedure used to perform the learning process is called a learning algorithm.

Artificial neural networks are also referred to as neuro-computers, connectionist networks, parallel distributed processors, etc.

Differences between neural networks and digital computers are as follows:

Reasoning: Neural networks use inductive reasoning; given input and output data (training examples), the rules are constructed. Digital computers use deductive reasoning; known rules are applied to input data to produce output.

Computation: In neural networks computation is parallel, asynchronous, and collective. In digital computers computation is serial, synchronous, and centralized.

Memory: In neural networks memory is distributed, internalized, short term and content addressable. In digital computers memory is in packets, literally stored, and location addressable.

Fault tolerance: Neural networks are fault tolerant, with redundancy and sharing of responsibilities. Digital computers are not fault tolerant; if one transistor goes, the machine no longer works.

Connectivity: Neural networks have dynamic connectivity; digital computers have static connectivity.


    Models of a neuron

A neuron has a set of n synapses associated with its inputs; each of them is characterized by a weight.

A signal xi, i = 1, 2, ..., n at the ith input is multiplied (weighted) by the weight wi, i = 1, 2, ..., n. The weighted input signals are summed; thus, a linear combination of the input signals w1x1 + ... + wnxn is obtained. A "free weight" (or bias) w0, which does not correspond to any input, is added to this linear combination, and this forms a weighted sum.
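A minimal sketch of this neuron model, with a hard-limit step activation chosen purely for illustration, is:

    import numpy as np

    def neuron(x, w, w0):
        """Weighted sum of the inputs plus the bias w0, passed through a step activation."""
        v = np.dot(w, x) + w0            # w1*x1 + ... + wn*xn + w0
        return 1.0 if v >= 0.0 else 0.0  # hard-limit activation (illustrative choice)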