TECHNIQUES TO ANALYZE THE TERAHERTZ DATA FOR THE
DETECTION OF EXPLOSIVES
by
Kusum Yarlagadda, B.Tech
A Thesis
In
ELECTRICAL ENGINEERING
Submitted to the Graduate Faculty
of Texas Tech University in
Partial Fulfillment of
the Requirements for
the Degree of
MASTER OF SCIENCE
IN
ELECTRICAL ENGINEERING
Approved
Dr. Vittal Rao
Chairperson of the committee
Dr. Mohammad Saed
Accepted
Ralph Ferguson
Dean of the Graduate School
December, 2010
Texas Tech University, Kusum Yarlagadda, December 2010
ACKNOWLEDGEMENTS
I dedicate this Master's thesis to my parents, Radha Krishna Murthy Yarlagadda and Rama
Leela Yarlagadda, and my brother, Santosh Yarlagadda, who have supported and
encouraged me so far.
I express my sincere thanks to the chair of my committee, Dr. Vittal Rao, for giving
me the opportunity to work on this project. His support and guidance throughout this
project have been invaluable. I also express my thanks to Dr. Mohammad Saed, my
co-chair, for the support and encouragement he provided for the completion of this project.
I would also like to thank all my friends at Tech who have helped me a lot
during my stay in Lubbock.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
1. INTRODUCTION
1.1 Landmines
1.2 Need for standoff detection
1.3 THz technology
1.4 Analysis Methods
1.5 Our Approach
1.6 Thesis Organization
2. REVIEW OF LITERATURE
2.1 Review of detection methodologies
2.1.1 Prodders, seismic and acoustic sensors
2.1.2 Electromagnetic Sensors
2.1.3 Electro-Optic Sensors
2.1.4 Other Explosive Detectors
2.2 Terahertz data acquisition
2.2.1 Sources and Detectors
2.2.2 THz Spectroscopy and Imaging
2.3 Analysis Techniques
2.3.1 Background of component spatial and spectral pattern analysis
2.3.2 Background of independent component analysis
3. SIMULATED TERAHERTZ DATA FOR EXPLOSIVES
3.1 Introduction
3.2 Source Signals
3.3 Samples Generated
4. COMPONENT SPATIAL AND SPECTRAL PATTERN ANALYSIS
4.1 Introduction
4.2 Component spectral response and spatial pattern analysis algorithm
4.2.1 Determining number of significant components
4.2.2 Non-negative Constraint
4.2.3 Feasible Solution of [T] using non-negativity constraints
4.2.4 Optimal Estimation
4.3 Results of component spatial and spectral pattern analysis
4.3.1 Further Processing for confirmation
4.4 Component spectral and spatial pattern analysis with reference
4.5 Results when percentage of explosives varies in the samples
4.6 Conclusions
5. INDEPENDENT COMPONENT ANALYSIS
5.1 Introduction
5.2 FastICA
5.2.1 Criteria to choose the algorithm
5.2.2 Estimation of Independence: Measure of non-gaussianity
5.2.3 Fast ICA Algorithm
5.3 ICA with Reference
5.4 Results of Independent Component Analysis with reference
5.4.1 First Stage of Processing
5.4.3 Final Stage
5.5 Fast algorithm for one-unit ICA-R
5.6 ICA with Multi-Reference
5.7 Results of Independent Component Analysis with multi reference
5.7.1 First Stage using ICA-mR
5.7.2 Final Stage using ICA-mR
5.8 Conclusions
6. TWO STAGE PROCESS
6.1 Comparison
6.2 First stage
6.3 Second stage
6.4 Time comparison
6.5 Need for deterministic signals
6.6 Conclusions
7. CONCLUSIONS AND FUTURE WORK
REFERENCES
ABSTRACT
Improving landmine detection capability is a challenging technological issue. Most
existing technologies still rely on metal detectors and probes. Many plastic
explosives cannot be detected using these methodologies, so there is a requirement
for new detection technologies that utilize characteristics other than metal content.
Properties of the electromagnetic spectrum, acoustics of mine casings, advanced
prodders, and other chemical and biological technologies are in the research phase. There is
also a lot of ongoing research on sensor fusion techniques, which utilize the advantages of
more than one detection technique. These fusion techniques are expected to be more
promising than any individual detection technique.
This thesis attempts to utilize the unique properties of the electromagnetic spectrum for
landmine detection. The terahertz gap of the electromagnetic spectrum, which has not
been utilized properly until recently, has certain advantages that can prove valuable
when used for explosive detection. Most explosives have unique spectral peaks in the
terahertz frequency range. Two advanced algorithms, independent component analysis and
component spatial and spectral pattern analysis, are implemented and improved for
explosive detection. Further, an improved method which utilizes both algorithms is
obtained. The results obtained are promising, and the approach has the capability to detect
explosives hidden under some background.
LIST OF TABLES
1.1 Summary of trends in component technology
2.1 Operational characteristics of sensors
4.1 False alarm of component spatial and spectral pattern analysis method
6.1 Time Comparison
LIST OF FIGURES
1.1 VS1.6 plastic AT mine, PMD6 wood AP mine, VS50 plastic AP mine, and M14
plastic AP mine. In the figure M14 is roughly two inches across
1.2 Electromagnetic Spectrum
2.1 Atmospheric attenuation at different frequencies
2.2 Different source and detectors available in the electromagnetic spectrum
3.1 Diffuse reflectance and transmission spectra of different explosives
3.2 Spectra of different explosives
3.3 Field with pixels containing RDX in red, TNT in blue, DNT in magenta and
pixels without any explosive in green
4.1 Illustration of multispectral images
4.2 Image model for multi-component patterns
4.3 (a) Original source signals and (b) Results obtained
4.4 (a) The original samples generated and (b), (c), (d) The rows of [P] arranged as a
30*30 plot
4.5 (a) First source signal obtained, (b) Its correlation with RDX, (c) Its correlation
with TNT, (d) Its correlation with DNT
4.6 (a) Second source signal obtained, (b) Its correlation with RDX, (c) Its
correlation with TNT, (d) Its correlation with DNT
4.7 (a) Third source signal obtained, (b) Its correlation with RDX, (c) Its correlation
with TNT, (d) Its correlation with DNT
4.8 (a) First row of [P] as 30*30 plot (b) Finally detected pixels after thresholding
4.9 (a) Second row of [P] as 30*30 plot (b) Finally detected pixels after thresholding
4.10 (a) Third row of [P] as 30*30 plot (b) Finally detected pixels after thresholding
4.11 Result of [S] obtained using random value of [TO]
4.12 Result of [S] obtained using value of [TO] generated using [Sref]
4.13 Original samples generated
4.14 Finally detected pixels
4.15 (a) Samples generated, (b) First row of [P] as 30*30 plot, (c) Second row of [P]
as 30*30 plot, (d) Third row of [P] as 30*30 plot
5.1 (a) Sources used to generate samples, (b) Results obtained
5.2 Result of ICA-R: (a) Reference signal given by us (b) Result obtained
5.3 Grid layout of the plot
5.4 Output of ICA-R after first stage for RDX as reference
5.5 Output of ICA-R after first stage for TNT as reference
5.6 Output of ICA-R after first stage for DNT as reference
5.7 Final output after first stage
5.8 Output of second stage using RDX as reference
5.9 Output of second stage using TNT as reference
5.10 Output of second stage using DNT as reference
5.11 Finally detected samples
5.12 Result of ICA-mR
5.13 Result of first stage of ICA-mR
5.14 Result of final stage of ICA-mR
6.1 Originally generated pixels
6.2 Output of first stage component pattern analysis
6.3 Pixels being passed to the second stage, included in black boxes
6.4 Output of final stage
6.5 Pixels generated
6.6 (a) Explosives used to generate samples, (b) [S] matrix obtained
6.7 [P] corresponding to first source signal
6.8 [P] corresponding to second source signal
6.9 [P] corresponding to third source signal
6.10 Final result obtained
6.11 Output of first stage of ICA
6.12 Final output of ICA
6.13 (a) Source signal spectra used to generate samples, (b) Source signals obtained
6.14 [P] corresponding to first source signal
6.15 [P] corresponding to second source signal
6.16 [P] corresponding to third source signal
6.17 Final result obtained
6.18 Result of first stage of ICA
6.19 Result of final stage of ICA
CHAPTER 1
INTRODUCTION
Recent terrorist attacks in several places have elevated national and international security
concerns. Antipersonnel mines and antitank mines are significant threats in many nations
despite many programs by the United Nations and humanitarian organizations to clear
them. These mines are inexpensive, costing as little as $3-$25 each, whereas detection
and clearance cost $300-$1000 per mine. According to the ICBL (2001), there are
15,000-20,000 victims of these mines per year in over 90 countries. According to the
estimates of the U.S. State Department survey (2001), there are 40-50 million mines to be
cleared [1]. Worldwide, an estimated 100,000 mines are found and destroyed every year
(Horowitz et al., 1996). According to these calculations, destroying the 40-50 million
existing mines would take 450-500 years. Some estimates say that around 1.9 million new
mines are placed annually, which would need 19 more years to clear
(Horowitz et al., 1996).
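The clearance-time estimate above is a simple division, which can be checked with a short sketch. It takes the quoted figures at face value and assumes a constant clearance rate with no new mines laid; the variable and function names are illustrative only. Under these assumptions the lower bound comes out near 400 years, somewhat below the quoted 450, and 19 years of clearance work corresponds to 1.9 million mines.

```python
# Back-of-the-envelope clearance time from the figures quoted in the text.
# Assumption: constant clearance rate and no newly laid mines.
CLEARED_PER_YEAR = 100_000                                # mines destroyed per year
STOCKPILE_LOW, STOCKPILE_HIGH = 40_000_000, 50_000_000    # mines still to be cleared


def years_to_clear(mines, rate=CLEARED_PER_YEAR):
    """Years needed to clear `mines` at a constant `rate` (mines per year)."""
    return mines / rate


years_low = years_to_clear(STOCKPILE_LOW)
years_high = years_to_clear(STOCKPILE_HIGH)

# Number of newly laid mines that corresponds to 19 years of clearance work.
mines_for_19_years = 19 * CLEARED_PER_YEAR

print(years_low, years_high, mines_for_19_years)  # 400.0 500.0 1900000
```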
These statistics indicate that there is a great need for landmine detection
technologies which are fast and efficient. This chapter gives a brief introduction to
landmines, terahertz technology and the analysis techniques that are used in this
thesis.
1.1 Landmines
The widespread use of landmines started during World War I. The deployment of the US
Army in Bosnia in 1995 and Afghanistan in 2001 gave the landmine issue a sense of
urgency. Since then the US military has been investing substantial funds in landmine
detection research.
Landmines come in varying shapes and sizes. They can be square, circular, cylindrical or
bar shaped. Casings can be metallic, wooden or plastic. Based on the metal content
present in the landmines they are categorized as metal, low metal or non-metallic.
Landmines are broadly classified as antitank mines and antipersonnel mines. Antitank
mines are designed to destroy vehicles or to impede their motion. Usually these are
6-14 in (15-35 cm) in size and are buried 16 in (40 cm) deep. They contain 5-10 kg of
explosive material and can be metallic or plastic. Antipersonnel mines are designed to kill
or maim people. Usually these are 2-6 in (5-15 cm) in size. Their casing can be metallic,
plastic or wooden. In practice, antipersonnel mines are cleared by dividing the area into
1 m grids and systematically checking each square. Both types of mines contain variable
proportions of explosives.
Figure 1.1: VS1.6 plastic AT mine, PMD6 wood AP mine, VS50 plastic AP mine, and M14 plastic AP
mine. In the figure M14 is roughly two inches across [2]
Some of the most widely used explosives are nitro-based compounds such as
2,4,6-trinitrotoluene (TNT), cyclotrimethylenetrinitramine (RDX), pentaerythritol tetranitrate
(PETN) and nitroglycerin (NG). Plastic explosives consist of the pure molecular crystalline
explosive mixed with other agents. The range and concentrations in the mines may vary, but
the major ingredient is the explosive compound. A few plastic explosives in use are Metabel
(PETN based), SX2 (RDX based), C-4 (RDX based), PBX (predominantly containing
HMX) and Semtex H (containing RDX and PETN). TNT is widely used in antitank
landmines, NG is used in dynamites, and RDX and PETN are used in the manufacture of
plastic explosives [3]. Pipe bombs use black powder, and large vehicle bombs use ammonium
nitrate fuel oil (ANFO). There are also some explosives that contain no
nitrogen, such as triacetone triperoxide (TATP).
For both antipersonnel and antitank mines there are methods for remediating them
without individually detecting them. If all mines were metal cased, they could be
easily detected using metal detectors. But the widespread use of plastic landmines
necessitates additional detection technologies. Because there are no plastic detectors,
these other technologies must focus on disturbances in the background, such as thermal,
chemical, electromagnetic or dielectric disturbances. A few of the methods in use are
discussed in chapter 2.
1.2 Need for standoff detection
Some methods can be efficient in landmine detection when the sensors or the
personnel involved are in close proximity to the field. But in the real world this is
dangerous because it risks loss of life and property. Hence there is a lot of ongoing
research on standoff detection. Remote detection is the situation where the personnel
involved are away from the field but the detection equipment moves in close proximity to
the explosives. Standoff detection, which is slightly different from remote detection, is an
active or passive detection technique in which vital assets and the individuals involved in
the detection are kept outside the zone of severe damage in case any explosive
detonates.
Traditionally explosive detection can be divided into two types: bulk and trace detection.
In bulk detection macroscopic amounts are detected using imaging or by technologies
that use nuclear properties. Standoff detection is possible in either of these cases. In trace
detection microscopic amounts such as vapor or particulates are detected using chemical
sensors or animal olfaction. The best hope for standoff detection of particulates is by
using some technology that analyzes the radiation emitted from explosives. This radiation
can be active or passive. Bulk detection will be easier using such radiation. X-ray
imaging, thermal neutron activation, and techniques employing infrared, mm-wave and other
electromagnetic radiation fall under this category.
But before deciding whether a standoff detection technique is effective, several factors
must be considered, such as:
- Interference between signals from explosives and other signals from the background.
- Frequency of false alarms.
- Time required for detection, which must be very low (speed of detection is important
  when a threat is fast approaching).
Effective standoff detection must take into account the output from more than one sensor,
because using only one sensor output will increase the number of false alarms. But the use of
distributed sensors poses several challenges, such as communication between sensors,
sensor sampling, data transfer, fusion of information, sensor fault detection, time to
sample, detection decision making and deployment issues [4].
Also, for standoff detection, parameters like transmission capability, resolution and
component performance depend on the frequency and distance of operation, as
shown in the table below.
Table 1.1: Summary of trends in component technology [5]
The different standoff detection techniques for landmine detection are
discussed in chapter 2. Since terahertz technology has certain advantages over the
other methods for this particular landmine detection problem, a brief introduction to the
technology is given below.
1.3 THz technology
Generally the frequency range from 0.3-10 THz (10-300 cm⁻¹) of the electromagnetic spectrum is
considered the THz frequency range. This region has not been exploited so far due to the lack of
convenient and suitable sources and detectors. On the microwave side of the
spectrum it is difficult to produce sources and detectors because they require very short
carrier transit times in the active region and, due to the low power produced by the devices, they
must have small active regions to minimize their capacitance [3]. Similarly, it is difficult
to produce sources and detectors on the optical side. Interband lasers exist that
operate at visible and near-IR frequencies. Their working principle is that light is
generated by radiative recombination of conduction band electrons with valence band
holes across the band gap of the active material. But this cannot be extended to the mid IR or to
longer wavelengths due to the lack of suitable narrow band gap semiconductors.
Because of these difficulties and the cost involved, this band of the electromagnetic spectrum
has been very little exploited.
Figure 1.2: Electromagnetic Spectrum [6]
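The band limits quoted above can be related to wavenumber and wavelength with the standard conversions (wavenumber = f/c, wavelength = c/f). The following is a minimal sketch; the function names are illustrative. It suggests that the quoted 10-300 cm⁻¹ is a rounded range, since 10 THz corresponds to roughly 334 cm⁻¹.

```python
# Convert THz frequencies to wavenumber (cm^-1) and wavelength (micrometres).
C_M_PER_S = 2.99792458e8  # speed of light in vacuum, m/s


def thz_to_wavenumber_cm(freq_thz):
    """Wavenumber in cm^-1 for a frequency given in THz."""
    return freq_thz * 1e12 / (C_M_PER_S * 100.0)


def thz_to_wavelength_um(freq_thz):
    """Vacuum wavelength in micrometres for a frequency given in THz."""
    return C_M_PER_S / (freq_thz * 1e12) * 1e6


for f in (0.3, 1.0, 10.0):
    print(f"{f:5.1f} THz -> {thz_to_wavenumber_cm(f):7.2f} cm^-1, "
          f"{thz_to_wavelength_um(f):8.2f} um")
```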
Radiation in the THz frequency range has a unique capability for noninvasive imaging and
spectroscopy of materials. This range of electromagnetic radiation can be transmitted
through many nonmetallic and non-polar materials. It can pass through paper, plastic,
cardboard and other dry packaging material with
sufficient residual energy to excite molecular vibrations, rotations and phonon-based
resonances in solid material. This radiation has low photon energy in comparison to
X-rays, and hence is suitable for personnel scanning as well. The radiation power (<1 mW)
does not pose any health risk. For these reasons THz spectroscopy has
gained the attention of many researchers in explosive detection, drug detection, etc. The
electromagnetic spectrum from sub-millimeter wavelengths through THz can be used to
gather information on the chemical structure of an object by measuring the intensity of
reflected or emitted energy. As the frequency increases, the spectral features of a material
become more apparent, whereas the penetration capability of the radiation decreases.
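The claim about low photon energy can be made concrete with the Planck relation E = hf. The sketch below compares a 1 THz photon with an X-ray photon; the 30 keV X-ray energy is an illustrative assumption, not a value taken from this thesis.

```python
# Photon energy E = h * f, expressed in milli-electronvolts (meV).
PLANCK_J_S = 6.62607015e-34       # Planck constant, J*s
ELECTRON_VOLT_J = 1.602176634e-19  # one electronvolt in joules


def photon_energy_milli_ev(freq_thz):
    """Photon energy in meV for a frequency given in THz."""
    return PLANCK_J_S * freq_thz * 1e12 / ELECTRON_VOLT_J * 1e3


thz_energy_mev = photon_energy_milli_ev(1.0)   # about 4.14 meV
xray_energy_mev = 30e3 * 1e3                   # assumed 30 keV X-ray, in meV
ratio = xray_energy_mev / thz_energy_mev       # several million times larger

print(f"1 THz photon: {thz_energy_mev:.3f} meV; X-ray/THz ratio: {ratio:.2e}")
```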
Several materials exhibit characteristic spectral features in this region of the electromagnetic
spectrum, particularly at frequencies greater than 1 THz. So THz spectroscopy can be
used as a powerful tool to identify different chemical species. THz spectroscopy
probes molecular vibrational modes and can also excite intermolecular vibrations.
These features create the potential for THz spectroscopy to provide both structural and
chemical information. Different chemical structures of the same material also lead to
different spectral features. All the explosives used in landmines have
unique spectral signatures in this region of the electromagnetic spectrum.
This thesis concentrates on using this unique feature to identify the explosive location in
a given field by using signal and image processing techniques.
1.4 Analysis Methods
Many methods, whether signal or image processing techniques, can be used to analyze the
data collected from the sensors. In this thesis we have employed
two techniques. The first is an image processing technique, component spatial and
spectral pattern analysis, which can extract the spectral information and the
corresponding spatial locations. The second is the well-known independent
component analysis, which can obtain the spectral information of all the independent
signals present in a given signal. A combined method is then proposed that analyzes the
terahertz data to detect the presence of explosives in the plot from which the data is
collected.
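Both techniques rest on the idea that the spectrum measured at each pixel is a mixture of a few underlying source spectra. The sketch below illustrates that idea under a simple linear mixing assumption; the 30*30 grid, the Gaussian peak positions and the noise level are all made-up illustration values, not measured explosive signatures, and the matrix names only loosely follow the [P] and [S] notation used later in the thesis. Counting the dominant singular values of the data matrix gives a rough estimate of the number of significant components.

```python
import numpy as np

rng = np.random.default_rng(0)
n_side, n_freq, n_comp = 30, 100, 3
n_pix = n_side * n_side
freqs = np.linspace(0.3, 3.0, n_freq)  # illustrative THz axis


def gaussian_peak(center, width=0.1):
    """Synthetic spectrum with a single peak (hypothetical, not measured data)."""
    return np.exp(-0.5 * ((freqs - center) / width) ** 2)


# S: one synthetic source spectrum per row, shape (n_comp, n_freq).
S = np.stack([gaussian_peak(0.8), gaussian_peak(1.6), gaussian_peak(2.4)])

# P: spatial pattern of each component, shape (n_comp, n_pix).
# Label n_comp means "no explosive" at that pixel.
labels = rng.integers(0, n_comp + 1, size=n_pix)
P = np.zeros((n_comp, n_pix))
for k in range(n_comp):
    P[k, labels == k] = 1.0

# Measured data: each pixel spectrum is a mixture of sources plus noise.
X = P.T @ S + 0.01 * rng.standard_normal((n_pix, n_freq))

# Singular values well above the noise floor indicate significant components.
singular_values = np.linalg.svd(X, compute_uv=False)
n_significant = int(np.sum(singular_values > 0.1 * singular_values[0]))
print(n_significant)
```

The 10% threshold on the singular values is an arbitrary choice for this synthetic example; real data would need a threshold tied to the measured noise level.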
1.5 Our Approach
This thesis concentrates on developing techniques for standoff explosive detection. The
basic component spatial and spectral pattern analysis and independent component
analysis methods are used and improved so that they give better results for our
particular case of explosive detection.
The component spatial and spectral pattern analysis method is improved to use reference
values, which greatly reduces the computational effort. Independent component analysis is also
improved by using pre-whitening and normalization, which reduces the computational
effort required.
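Pre-whitening helps because, once the data are decorrelated and scaled to unit variance, the remaining unmixing matrix only needs to be orthogonal, which leaves fewer free parameters to estimate. The following is a minimal sketch of one common whitening variant based on the eigendecomposition of the covariance matrix; the mixing matrix and the Laplace-distributed sources are hypothetical test data, and the exact variant used in this thesis is not specified in this section.

```python
import numpy as np


def whiten(X):
    """Whiten X of shape (n_signals, n_samples).

    Returns the whitened data Z and the whitening matrix W, so that the
    covariance of Z is (numerically) the identity matrix.
    """
    Xc = X - X.mean(axis=1, keepdims=True)       # remove the mean of each signal
    cov = Xc @ Xc.T / Xc.shape[1]                # sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)       # symmetric eigendecomposition
    W = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T
    return W @ Xc, W


# Hypothetical example: three non-Gaussian sources mixed linearly.
rng = np.random.default_rng(1)
sources = rng.laplace(size=(3, 5000))
A = np.array([[1.0, 0.5, 0.2],
              [0.3, 1.0, 0.4],
              [0.2, 0.6, 1.0]])                  # assumed mixing matrix
mixed = A @ sources

Z, W = whiten(mixed)
cov_Z = Z @ Z.T / Z.shape[1]
print(np.round(cov_Z, 6))                        # close to the 3x3 identity
```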
Considering the pros and cons of each individual method, a two-stage combined
method is developed which can detect the explosives with more accuracy and in less time.
1.6 Thesis Organization
This thesis gives a detailed description of independent component analysis and component
spatial and spectral pattern analysis, which are used to analyze terahertz data for
explosive detection. Chapter 2 gives a brief review of the existing landmine detection
technologies. Chapter 3 describes the signals that are used to generate data.
Chapter 4 describes component spatial and spectral pattern analysis along with the results
obtained using the method. Chapter 5 describes the independent component analysis algorithm and
the results obtained using the method. Chapter 6 explains a two-stage combined process
that employs both methods used in chapter 4 and chapter 5. Chapter 7 gives conclusions
and future work.
CHAPTER 2
REVIEW OF LITERATURE
2.1 Review of detection methodologies
More than 83 countries are contaminated by landmines. If conventional tools like metal
detectors and prodders are used for detection, on average an area of 10 m² can be
cleared in a working day. For instance, in Cambodia only 146 km² has been cleared
over the last five years [7]. The goal required by the Mine Ban Treaty, to reach a mine-free
world by 2010, seems to be impossible, so the first priority of mine action has changed to a
mine impact-free world, although the final goal remains the same [8].
A few recommendations made during the Standing Committee on Mine Clearance, Mine Risk
Education and Mine Action Technologies are:
- Technologists should avoid building technologies based on assumed needs and
  should work interactively with end users.
- Appropriate technologies could save human lives and increase mine action
  efficiency.
- Nothing is more important than understanding the working environment.
Detection probability is important in the design of any system and should always be as close as
possible to one. The numbers of false positives and false negatives are also very important in
the design of any demining system. Decreasing the number of false alarms accelerates demining
operations and reduces their cost. A pilot project on airborne minefield
detection in Mozambique clearly showed that, even with very high resolution
airborne sensors, it is very difficult to find antipersonnel mines using either objective signal
processing tools or subjective photo-interpretation [9].
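The quantities discussed above can be made concrete with a small calculation. The following is a minimal Python sketch under stated assumptions: the definitions used (detection probability as detected mines over all mines, false alarm rate as false alarms over all mine-free locations) are the usual conventions rather than definitions taken from this thesis, and the function name and the toy ground-truth list are hypothetical.

```python
def detection_metrics(truth, alarms):
    """Count outcomes for equal-length lists of booleans.

    truth[i]  is True when location i really contains a mine.
    alarms[i] is True when the detector raised an alarm at location i.
    """
    pairs = list(zip(truth, alarms))
    tp = sum(1 for t, a in pairs if t and a)          # mines detected
    fn = sum(1 for t, a in pairs if t and not a)      # mines missed (false negatives)
    fp = sum(1 for t, a in pairs if not t and a)      # false alarms (false positives)
    tn = sum(1 for t, a in pairs if not t and not a)  # clear ground, no alarm
    return {
        "true_positives": tp,
        "false_negatives": fn,
        "false_positives": fp,
        "true_negatives": tn,
        # Assumed conventional definitions, not taken from the thesis text.
        "detection_probability": tp / (tp + fn) if (tp + fn) else float("nan"),
        "false_alarm_rate": fp / (fp + tn) if (fp + tn) else float("nan"),
    }


# Hypothetical toy survey: 3 mined locations and 5 clear locations.
truth = [True, True, True, False, False, False, False, False]
alarms = [True, True, False, True, False, False, False, False]
metrics = detection_metrics(truth, alarms)
print(metrics)
```

In this toy survey one mine is missed and one clear location raises an alarm, which illustrates why both counts need to be tracked separately when comparing demining systems.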
The most efficient way of reducing false alarms and increasing detection probability is to use
complementary sensors in parallel and to fuse the information collected from all of the different
sensors.
A brief description of the following sensors is given in this chapter:
• Prodders, seismic and acoustic sensors.
• Electromagnetic sensors (metal detector, GPR, microwave radiometer, electrical
impedance tomography, electrography, imaging with handheld sensors).
• Electro-optic sensors (visible, IR, multispectral, hyperspectral, LIDAR).
• Other kinds of explosive detectors (NQR, X-rays, neutron activation, biosensors,
trace explosive detection).
The first three categories of sensors cannot differentiate between materials that have the same
electromagnetic, thermal and/or optical properties, but they offer good localization capabilities
and 2-D and 3-D imaging capabilities. The last category offers poor localization capabilities and
lacks spatial resolution as well as 2-D and 3-D imaging capabilities.
The table below compares the different sensor technologies discussed later
in terms of their maturity, cost, clearance speed and effectiveness.
Table 2.1: Operational characteristics of sensors [10]
2.1.1 Prodders, seismic and acoustic sensors
Prodders are rigid metal sticks, about 25 cm long, that are used to probe the
soil. If an unusual object is detected, other methods are used to
confirm whether the object is an explosive. This is not a standoff detection method, however, and
there is risk involved for the person handling the prodder. Seismic devices are operated from a
safe position and base their decision on the response obtained from the ground. This response
may be due to mechanical vibrations or any other disturbance caused by the
explosive that differs from that of its surroundings. In ultrasonic sensors,
an ultrasonic wave is sent into the ground and the backscattered wave is analyzed. Since ultrasonic
waves can propagate through moisture, this is advantageous where the moisture level is high
or under water.
The capability of manual prodders is enhanced by adding an ultrasonic sensor at the
prodding extremity. Such a prodder, also called a smart prodder, exerts less pressure on the
mine and provides a better estimate of the mine position.
All three of these techniques can be used only as preliminary detection techniques
because they do not use any explosive-specific property for detection. Hence
they cannot uniquely differentiate explosives from other materials that cause similar
disturbances.
2.1.2 Electromagnetic Sensors
Metal Detectors: There are three categories of metal detectors: those based on
electromagnetic induction, magnetometers and gradiometers.
In an electromagnetic induction based detector, a primary magnetic signal is first sent into
the ground during the emitting phase. It creates eddy currents in buried metallic
objects, which in turn create a secondary magnetic field. During the listening phase,
emission is stopped and the system listens to the secondary magnetic field, which induces
currents in the coils of the detector. These currents are characteristic of the
buried metallic objects and of the soil. There are two types of electromagnetic induction devices:
the first sends a magnetic pulse and the second sends a continuous wave at different
frequencies in a stepped-frequency mode. Electromagnetic induction sensors can also
provide information about the shape of the metallic pieces included in the mine. The magnetometer
works on the principle of the fluxgate magnetometer, which measures local perturbations of
the earth's magnetic field. The gradiometer, depending on the sensor configuration, measures the
magnetic field gradient in a given direction.
These techniques can detect only explosives that contain metal and are ineffective
for plastic explosives.
Ground Penetrating Radar: A GPR has a transmitter that emits a pulsed wave or a
continuous wave at given frequencies, and a receiver that collects the waves
backscattered by discontinuities in permittivity. Discontinuities are caused
not only by buried objects such as landmines but also by natural
clutter in the soil. Because it responds to permittivity contrast rather than metal content, GPR
can also detect plastic objects buried in the ground.
There are two types of GPR. The first, ultra-wideband pulse GPR, sends a short
pulse into the ground; the second sends a continuous wave in a stepped-frequency
mode. In the second case more energy can be sent into the ground at a given frequency,
and the system directly provides the Fourier transform of the received signal. Current GPRs
work in the frequency range from 0.4 to 6.0 GHz.
The penetration depth of these GPRs is limited. In addition, the interpretation of radargrams is
difficult and needs well-trained personnel.
Microwave Radiometer: This is a passive counterpart of GPR. The natural radiation captured by its
antenna is strongly amplified by a highly sensitive reception stage. The natural radiation
comprises radiation from the sky (a few K), radiation reflected from the surface and
subsurface, and natural radiation from the soil. A 2-D image of the surface and of buried objects
can be obtained. Penetration and spatial resolution are frequency dependent. Detection by any
GPR technique is severely limited by moisture.
The performance and design of advanced microwave technologies such as GPR or passive
radiometers depend on the electromagnetic parameters of the propagation medium, such as γ, µ and
ξ. All of these depend on geophysical parameters such as soil water content, type,
texture and structure.
Electrical Impedance Tomography: Electrical impedance tomography works by
measuring the soil impedance between selected locations on the ground. Its use is limited in
dry environments. This method also carries a risk of detonating the mine and is hence
extremely dangerous for the personnel involved.
2.1.3 Electro-Optic Sensors
LIDAR and THz imaging systems have yet to demonstrate their usefulness for mine detection
because their soil penetration is limited: they use shorter wavelengths than
GPRs. Wild vegetation also limits the capabilities of electro-optic sensors.
Hyperspectral Sensors: These work by exploiting material reflectivity.
They use information from more than one region of the electromagnetic
spectrum, and the signals are further processed to detect the presence of explosives. They
cannot locate individual mines.
Thermal Infrared: This works in two different ways. The first method measures the
apparent difference in temperature of the soil, which is due to a difference in
emissivity or in thermal flux caused by the presence of buried objects. For a
time sequence of images, principal component analysis can be used to enhance the contrast with
respect to the background. The second method takes into account the polarization properties of
manufactured surfaces. Like hyperspectral sensors, this method has the disadvantage
that it cannot locate individual mines.
Scanning Laser Doppler Vibrometry: In this technique an acoustic power transmitter sends an
acoustic wave into the ground. Soil vibrations are induced by the backscattered wave created
by a buried object, and these vibrations are measured using a laser Doppler vibrometer. This
technique also does not utilize any explosive-specific property.
2.1.4 Other Explosive Detectors
There are a few nuclear and chemical methods. These technologies include Nuclear
Quadrupole Resonance (NQR), Thermal Neutron Activation (TNA), Fast Neutron
Activation (FNA), trace explosive detection using chemical processes, X-ray
backscattering and X-ray fluorescence. Because of the time and cost involved, these are better
suited to confirmation.
Nuclear Quadrupole Resonance: The alignment of nuclear spins is caused by the quadrupole
charge distribution of aspherical atoms. A radio frequency pulse generated by a transmitter coil
excites the nuclear spins to higher quantized energy levels. As the nuclear spins return to
their equilibrium position, they emit a unique, detectable radio
frequency signal at a particular precession frequency. This radio frequency
signal can be used to identify atoms and functional groups in the molecules. Nitrogen is a
quadrupolar atom, so it can be detected by the NQR technique, and it appears in every type of
explosive. Hence NQR has the potential to be an efficient detection technique.
However, this method cannot reliably detect the explosive TNT. It is also prone to radio
frequency interference, and it has problems with quartz-bearing soils and soils that are
magnetic.
Thermal Neutron Activation: Gamma rays can be detected by conventional NaI and/or
GeLi detectors. Most explosive materials are rich in nitrogen-14 ($^{14}$N), which is a stable
isotope of nitrogen. If a nitrogen nucleus captures a neutron, the following reaction takes place:
$$ n + {}^{14}\mathrm{N} \rightarrow {}^{15}\mathrm{N}^{*} \rightarrow {}^{15}\mathrm{N} + \gamma $$
The excited nitrogen nucleus de-excites within picoseconds by emitting
one or more gamma rays with a unique energy of up to 10.83 MeV, and these rays can be used
to detect explosives.
This technique has slow throughput, and real-time inspection is not possible with it.
Fast Neutron Activation: A fast neutron source can generate fast (14 MeV) neutrons and
associated alpha particles (3.5 MeV). These neutrons produce prompt gamma rays through inelastic
scattering with the nuclei of materials. An alpha particle is always associated with each neutron
generated. The direction of the neutron can be determined from the direction of the alpha particle,
which is always 180° from the neutron direction. An array of scintillating detectors can be used to
detect the alpha particle, and its direction can be obtained from the position of the detector it
hits. The stoichiometric composition of the irradiated material in terms of carbon, nitrogen and
oxygen can be obtained by analysis of the gamma rays. This can be used in explosive
detection.
This system is complex and poses a radiation hazard.
X-ray Backscattering: X-ray backscattered radiation determines whether or not an
object is made up of light chemical elements. It is used for bulk detection. Such a system
can also produce a 2-D image with a resolution of a few centimeters. Potential problems arise
from shallow penetration, system complexity, sensitivity to soil topography and sensor
height variation, and safety concerns due to the use of ionizing radiation.
X-ray Fluorescence: X-rays cannot penetrate deeply into the ground. This technique does not detect
explosives encapsulated in a mine; instead it detects molecules that have migrated from the mine to
the ground surface. When these migrated explosives are illuminated by X-rays, a series of changes
in the electron configuration, characteristic of the material, results in the emission of photons,
which can be captured and analyzed. This technique has a high false alarm
rate.
Trace/vapor explosive detection: Trace explosive detection is used to replace or
complement currently used mine detection techniques with chemical identification of
microscopic residues of the explosive component, in either vapor or particulate form.
Vapor refers to gas-phase molecules emitted from the explosive surface (solid or liquid)
because of its finite vapor pressure, and particulate refers to microscopic particles of solid
material that adhere to surfaces either directly or indirectly.
All the techniques discussed above have pros as well as cons. Terahertz techniques have
certain advantages for the particular problem of explosive detection, so this thesis
focuses on using terahertz data for explosive detection. Acquisition of the data is as
important as its analysis. Although this thesis does not focus on data acquisition, a
brief introduction to it is provided here.
2.2 Terahertz data acquisition
2.2.1 Sources and Detectors
Electromagnetic radiation is attenuated at certain frequencies, which are determined
by molecular absorption by water vapor, oxygen and other atmospheric molecules. Figure 2.1
shows the atmospheric attenuation under various environmental conditions from
10 GHz to 10,000 GHz.
Figure 2.1: Atmospheric attenuation at different frequencies [5]
The minima in this figure are the atmospheric windows used to define the normal
frequency windows of operation. These regions of interest
lie at 26 to 40 GHz, 70 to 110 GHz, 140 GHz, 220 GHz, 340 GHz, 410 GHz, 650 GHz and
850 GHz. Above 1 THz, the window of interest is centered at 1.5 THz. The sources and
detectors used should therefore operate at these frequencies.
It is difficult to produce sources and detectors in the range from 0.1 THz to 10 THz, as
can be seen from Figure 2.2 below. The general approach for sources in this
region is either to use multipliers to generate radiation from the RF side, or to use lasers or
other non-linear means to translate down from the optical region. There are a few exceptions to
this trend, such as backward wave oscillators (BWO), vacuum electronic devices and
CO2-pumped gas lasers.
Figure 2.2: Different sources and detectors available in the electromagnetic spectrum [11]
For a transmitter-receiver system, maximum sensitivity is achieved when the bandwidth
of the receiver matches that of the transmitter. For passive receivers, because the sources
are infinitely broad, it is important to maximize the bandwidth of operation in order to collect
as much energy as possible from the emitting or reflecting field. For active receivers, in order
to reduce receiver-generated noise while preserving illumination, one should try to reduce the
receiver bandwidth.
Owing to the difficulty of building sources and receivers in this range, researchers
have focused their attention on all-optical techniques for producing THz
radiation using visible/near-IR femtosecond laser pulses. This technique was
developed in the 1980s, and the development of THz time-domain spectroscopy and imaging
systems has attracted many groups to work in this field. The work has been further revolutionized
by the recent development of a compact, solid-state THz semiconductor laser, the quantum
cascade laser (QCL). These devices do not need an expensive femtosecond pulsed laser, but they
still require cryogenic cooling.
Since the introduction of femtosecond laser pulses, many other generation methods have come into
existence, such as electro-optic rectification, surface field generation and ultrafast switching of
photoconductive emitters. Of all these methods, photoconductive emitters have proven to
be efficient in converting visible/IR radiation into THz radiation and are widely used
in THz imaging and spectroscopy.
2.2.2 THz Spectroscopy and Imaging
THz time-domain spectroscopy has advantages over other spectroscopy techniques
because it is insensitive to thermal background and does not require cryogenically
cooled bolometer detectors. It also allows extraction of both the absorption coefficient and
the refractive index without requiring Kramers-Kronig analysis.
Imaging techniques can be broadly classified as passive or active.
Every object emits radiation at all wavelengths, with an intensity proportional to the
product of its physical temperature and its emissivity, according to Planck's radiation law.
Passive imaging uses the contrast between warmer and colder objects in this
naturally occurring radiation. The contrast arises from differences in the
emissivity of different materials. This technology is used, for example, to image a metal gun
concealed under clothing. In active imaging, the area to be imaged is
illuminated by radiation, and the reflected or transmitted waves are captured by the
detector. Active systems have an advantage over passive systems because the objects can be
illuminated with enough power to penetrate materials,
whereas passive systems rely on natural radiation.
2.3 Analysis Techniques
2.3.1 Background of component spatial and spectral pattern analysis
Component analysis of spatial and spectral patterns in multispectral images was developed
in the 1980s by Kawata, Sasaki and Minami. Initially it was developed to find the feasible
solution region for the spatial and spectral information. Later the algorithm was optimized using
the simplex algorithm to find unique spectral patterns and their corresponding spatial locations.
It has several applications in detecting the presence of a particular material in a field.
2.3.2 Background of independent component analysis
Although under a different name, ICA was first introduced in the early 1980s by Hérault,
Jutten and Amari [12-14]. During that decade a great deal of research in this field took place
among French scientists. The early papers on ICA by Cardoso [15] and Comon [16] were presented
at a workshop on higher-order spectral analysis in 1989.
Cardoso's algorithm used higher-order cumulant tensors, from which the JADE algorithm
came into existence [17]. Lacoume was the first to use fourth-order cumulants [18].
The most popular present-day algorithms were proposed by Cichocki and Unbehauen [19-21]. A few
other well-known papers on ICA are listed in the references [22-23]. Another technique,
'Nonlinear PCA', was introduced by Aapo Hyvärinen, Juha Karhunen and Erkki Oja [24-25].
Several such algorithms were proposed, each restricted by some problems.
After Bell and Sejnowski proposed their method based on the infomax principle in the mid
1990s [26-27], ICA attracted the attention of many more researchers. Their method was later
extended by Amari and his co-workers using the natural gradient. A few years later Aapo
Hyvärinen, Juha Karhunen and Erkki Oja presented the fixed-point or Fast-ICA algorithm [28-30].
Due to its computational efficiency, it has contributed to the application of ICA to large-scale
problems.
The statistical criteria in use for the estimation of the ICA model are mutual
information, non-Gaussianity measures, likelihood, cumulants and nonlinear
decorrelation criteria.
When a general-purpose measure of the dependence of the components is needed that does not
assume anything about the data, ICA by minimization of mutual
information should be used. ICA estimation by minimization of mutual information is equivalent to
maximizing the sum of the non-Gaussianities of the estimates of the independent components
when the estimates are constrained to be uncorrelated. Maximum likelihood estimation
tells us what kind of nonlinearity must be used. All of these can be implemented as
practical ICA algorithms using either the natural gradient method or fast fixed-point
algorithms. The Bell-Sejnowski algorithm [26-27] is a gradient algorithm that employs
maximum likelihood estimation. A few ICA methods employ higher-order
cumulant tensors, which are generalizations of the covariance matrix. JADE [31]
and FOBI [32] are two important algorithms of this class. Fourth Order Blind
Identification (FOBI) is the basic method, which involves decomposition of a weighted
correlation matrix. The problem of equal eigenvalues of a cumulant tensor
can be solved using Joint Approximate Diagonalization of Eigenmatrices (JADE).
Nonlinear decorrelation is a useful and general criterion for independence. The first
successful ICA methods, the Hérault-Jutten algorithm [12-14] and the Cichocki-Unbehauen
algorithm [19-21], are based on nonlinear decorrelation. Today the Hérault-Jutten algorithm is
mainly of historical interest because there are several more efficient algorithms for ICA. The
Cichocki-Unbehauen algorithm is based on the same principle and uses the natural gradient. It was
extended and formalized into the theory of estimating functions and the Equivariant Adaptive
Separation via Independence (EASI) algorithm. The concept that all the ICs can be estimated with
the same equivariant performance, whatever the mixing matrix may be, was first shown with this
method. The Cichocki-Unbehauen algorithm is the same as the popular natural gradient algorithm
introduced by Amari, Cichocki and Yang [33] as an extension of the original Bell-Sejnowski
algorithm [26-27].
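To make the idea of a non-Gaussianity measure concrete, the following is a minimal Python sketch. It assumes excess kurtosis (the normalized fourth-order cumulant) as the measure; the text above does not single out a particular measure, and the function name, sample sizes and seed are hypothetical choices made for illustration.

```python
import random


def excess_kurtosis(samples):
    """Normalized fourth-order cumulant: close to 0 for Gaussian data."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]
    variance = sum(c * c for c in centered) / n
    fourth_moment = sum(c ** 4 for c in centered) / n
    return fourth_moment / (variance * variance) - 3.0


rng = random.Random(0)  # fixed seed so the sketch is repeatable
gaussian = [rng.gauss(0.0, 1.0) for _ in range(20000)]
uniform = [rng.uniform(-1.0, 1.0) for _ in range(20000)]

k_gauss = excess_kurtosis(gaussian)  # expected to be near 0
k_unif = excess_kurtosis(uniform)    # expected to be near -1.2 (sub-Gaussian)
print(k_gauss, k_unif)
```

A value far from zero indicates a non-Gaussian signal, which is the property that cumulant-based and non-Gaussianity-based ICA criteria exploit when separating mixed sources.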
CHAPTER 3
SIMULATED TERAHERTZ DATA FOR EXPLOSIVES
3.1 Introduction
The main emphasis of the thesis is to process, in the frequency domain, the terahertz data
obtained from the sensors and to detect the presence of explosives. The component spatial
and spectral pattern analysis and independent component analysis algorithms, which
are explained in the later chapters, are implemented in MATLAB. These algorithms are
used to analyze manually generated samples and detect the presence of explosives. As
explained in the previous chapters, terahertz transmitters and
sensors are expensive and were not available for this work, so the data needed to generate the
samples is collected from the literature.
This chapter first describes the source signals used to generate the
samples and then gives the details of the samples generated.
3.2 Source Signals
Three explosive source signals, namely RDX, TNT and DNT, and three other random
non-explosive signals are used to generate the samples. The spectra of RDX, TNT and
DNT are taken from data published by Rensselaer Polytechnic Institute [34], which maintains a
large database of different explosives. Spectra of the most commonly used explosives are shown in
Figure 3.1 below. Because of the computations involved, we have restricted ourselves to three
explosives. The RDX, TNT and DNT signals are taken from 1 to 21 THz and
sampled at 200 points, as shown in Figure 3.2. As explained previously,
each explosive has different characteristic peaks at different
frequencies.
Figure 3.1: Diffuse reflectance and transmission spectra of different explosives [35]
Figure 3.2: Spectra of different explosives
3.3 Samples Generated
Nine hundred samples are generated, which are assumed to lie on a 30 × 30 plot as shown in
Figure 3.3. Each sample contains different proportions of the source signals along with
randomly generated noise. The full set of samples is used to compare the
results of the two methods. The data generated is therefore a 30 × 30 × 200 array, where
30 and 30 are the dimensions of the plot and 200 is the number of frequency samples for
each signal. These are arranged as a 900 × 200 matrix. Among the 900 samples,
42 contain RDX along with other source signals and noise, 9
contain DNT along with other source signals and noise, and 36 contain TNT
along with other source signals and noise. All the other samples contain the non-explosive
signals along with noise. In Figure 3.3 below, the pixels marked in red contain RDX, those marked
in blue contain TNT and those marked in magenta contain DNT.
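The data layout described above can be sketched as follows. This is a minimal Python illustration rather than the MATLAB code used in the thesis: the Gaussian-peak spectra stand in for the real RDX, TNT and DNT spectra, the pixel positions are drawn at random, and the mixing weights and noise level are hypothetical. Only the array sizes (30 × 30 × 200, arranged as 900 × 200) and the pixel counts (42 RDX, 36 TNT, 9 DNT) come from the text.

```python
import math
import random

N_ROWS, N_COLS, N_FREQ = 30, 30, 200
N_PIXELS = N_ROWS * N_COLS
COUNTS = {"RDX": 42, "TNT": 36, "DNT": 9}  # counts stated in the text


def peak(center, width=6.0):
    """Hypothetical stand-in spectrum: one Gaussian peak over 200 bins."""
    return [math.exp(-((k - center) / width) ** 2) for k in range(N_FREQ)]


rng = random.Random(1)
explosive = {"RDX": peak(40), "TNT": peak(100), "DNT": peak(160)}
# Three random non-explosive source signals (values are assumptions).
background = [[rng.uniform(0.3, 0.6) for _ in range(N_FREQ)] for _ in range(3)]

# Place the explosive pixels at random, non-overlapping positions.
labels = ["none"] * N_PIXELS
chosen = rng.sample(range(N_PIXELS), sum(COUNTS.values()))
start = 0
for name, count in COUNTS.items():
    for idx in chosen[start:start + count]:
        labels[idx] = name
    start += count


def make_sample(label):
    """Mix the background sources, add an explosive if present, add noise."""
    weights = [rng.uniform(0.2, 1.0) for _ in background]
    spectrum = [sum(w * b[k] for w, b in zip(weights, background))
                for k in range(N_FREQ)]
    if label != "none":
        amount = rng.uniform(0.5, 1.0)
        spectrum = [s + amount * e for s, e in zip(spectrum, explosive[label])]
    return [s + rng.gauss(0.0, 0.01) for s in spectrum]


data = [make_sample(label) for label in labels]            # 900 x 200 matrix
cube = [data[r * N_COLS:(r + 1) * N_COLS] for r in range(N_ROWS)]  # 30 x 30 x 200
print(len(data), len(data[0]), labels.count("none"))
```

Keeping the label list alongside the data makes it straightforward to count false positives and false negatives once a detection method has been applied.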
[Plot for Figure 3.2: six panels labelled RDX, TNT, DNT, S(1,:), S(2,:) and S(3,:); horizontal axis of each panel runs from 0 to 200]
Figure 3.3: Field with pixels containing RDX in red, TNT in blue, DNT in magenta and pixels
without any explosive in green
The next two chapters explain the two algorithms and also show the simulated results
which are obtained by utilizing this data.
[Plot for Figure 3.3: panel title "Field with all original given pixels"; x axis from 0 to 30, y axis from 0 to 30]
CHAPTER 4
COMPONENT SPATIAL AND SPECTRAL PATTERN ANALYSIS
4.1 Introduction
In digital image processing, component pattern analysis is important in various
applications such as remote sensing in the environmental sciences and medical diagnostics with
x-ray images. This theory for multispectral images was developed using principal
component analysis and nonlinear optimization with a non-negativity constraint. Using it,
we can estimate the spectral curves of the components present in the image as well as
the corresponding spatial patterns. Even though we have no prior information about the
spatial and spectral features of the components, the rules of non-negative absorptivity and
non-negative density give a feasible solution region for the spectral and
spatial features [36], and entropy minimization can then be used to optimize the solution
[37].
There are many computerized image processing methods, such as texture analysis, for
analyzing the spatial pattern of a given component in a known image. However, because of the
mutual dependency that exists in the spatial domain, the components cannot be classified in
the hyperspectral feature space. In such a case we can use the spectral information of the
components. The only input required is a set of images of the scene taken at different
frequencies. The main concepts used are multivariate analysis [38],
linear system theory [39], nonlinear programming with a non-negativity constraint [40] and
entropy minimization [41].
In the human visual system, every scene is sensed by three neuro-chemical sensors in the
retina, and the three resulting images are recognized by the brain as a color image. The three
detectors have three different spectral responses in the visible region, except in the black and
white case, where all three images are the same. In machine vision, however, up to a hundred
distinct images may be acquired for a given scene in the range from ultraviolet to infrared
frequencies. These are called multispectral images.
Figure 4.1 shows an example of a multispectral image. The spectral information varies
pixel by pixel. If we suppose that there are M components in a multispectral image, then
the spectrum of every pixel is a combination of these M components.
Figure 4.1: Illustration of multispectral images [36]
Figure 4.2: Image model for multi-component patterns [36]
Figure 4.2 shows a different interpretation of the image. The images at N
frequencies can be considered as linear combinations of the M component images,
weighted by the corresponding spectral responses. In matrix form this can be written
as
$$[I_o]_{N \times L} = [S_o]_{N \times M}\,[P_o]_{M \times L} \qquad (4.1)$$
where
[Io] is the matrix of multispectral images, with each row holding the image at an
individual frequency with its L pixels arranged lexicographically,
[So] holds the spectral responses of the M components as column vectors, and
[Po] holds the spatial patterns of the M components as row vectors, with the L pixels arranged
lexicographically.
Equation (4.1) holds directly for fluorescence and emission images. For absorption
images, the logarithm of the ratio of the observed image intensity to the illumination light
intensity must be taken first.
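The image model and the logarithmic conversion can be illustrated with a small numerical sketch. The Python code below is an illustration under stated assumptions, not the thesis implementation: the matrix sizes (N = 3, M = 2, L = 4), the matrix entries, the illumination intensity and the helper names are hypothetical, and the negative sign in the logarithm is the usual absorbance convention, assumed here so that the converted values are non-negative.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


# Hypothetical toy model: N = 3 frequencies, M = 2 components, L = 4 pixels.
S_o = [[1.0, 0.1],
       [0.5, 0.5],
       [0.0, 2.0]]                  # N x M, spectral responses as columns
P_o = [[1.0, 0.0, 0.5, 0.0],
       [0.0, 1.0, 0.5, 0.0]]        # M x L, spatial patterns as rows
I_o = matmul(S_o, P_o)              # N x L, as in equation (4.1)


def absorbance(observed, illumination):
    """Assumed convention: -log(observed / illumination) for each pixel."""
    return [[-math.log(value / illumination) for value in row]
            for row in observed]


# Simulate an absorption measurement and convert it back to the linear model.
I0 = 10.0                           # hypothetical illumination intensity
transmitted = [[I0 * math.exp(-value) for value in row] for row in I_o]
recovered = absorbance(transmitted, I0)
print(recovered[0])
```

After the conversion the absorption data follows the same linear model as emission data, so the decomposition described in the rest of the chapter can be applied to either kind of image.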
Section 4.2 explains the algorithm and its implementation in MATLAB, and Section 4.3
shows the results and the further processing required. Finally, Section 4.4 reports the
number of false positives and false negatives obtained.
4.2 Component spectral response and spatial pattern analysis algorithm
The ultimate goal of this algorithm is to estimate the spectral responses and spatial patterns
of the components of a given multispectral image. Mathematically, the problem is
to obtain [So] and [Po] in equation (4.1) given
[Io]. If one of them were available, this would reduce to an inverse problem.
Here, however, we have no prior knowledge of either [So] or [Po]. To obtain
these matrices, we must first estimate the number of components 'M' present
in the multispectral image, which gives the dimensions of these matrices.
4.2.1 Determining number of significant components
If the system is noise free, determining the number of components 'M' is easy:
the rank of the matrix [Io][Io]^t gives the number of components. In practice, however,
noise-free systems do not exist. Moreover, the task here is to find the number of significant
components 'M2' and not just the number of components, because data collected
in practice may contain hundreds of insignificant components that can be neglected.
We can factorize any rectangular matrix using singular value decomposition. For
any given matrix there exists a decomposition with non-negative singular values such that
$$[I_o]_{N \times L} = [U_o]_{N \times N}\,[\Lambda_o]_{N \times L}\,[V_o]_{L \times L} \qquad (4.2)$$
where
[Λo] is a matrix with the singular values of [Io] as its diagonal elements,
[Uo] holds the eigenvectors of [Io][Io]^t,
[Vo] is the transpose of the eigenvectors of [Io]^t[Io].
The singular values are the square roots of the eigenvalues.
Equation (4.2) can be written as
$$[I_o] = \begin{bmatrix} U & U_n \end{bmatrix} \begin{bmatrix} \Lambda & 0 \\ 0 & \Lambda_n \end{bmatrix} \begin{bmatrix} V \\ V_n \end{bmatrix} \qquad (4.3)$$
Because the off-diagonal blocks in (4.3) are zero, the cross terms vanish and
$$[I_o] = [U][\Lambda][V] + [U_n][\Lambda_n][V_n] \qquad (4.4)$$
where
[Λ] Square roots of the eigenvalues of [Io][Io]^t that are above the threshold.
[U] Truncated [Uo] matrix with the eigenvectors corresponding to significant eigenvalues.
[V] Truncated [Vo] matrix with the eigenvectors corresponding to significant eigenvalues.
[Λn] Square roots of the eigenvalues of [Io][Io]^t that are below the threshold.
[Un] Truncated [Uo] matrix with the eigenvectors corresponding to insignificant
eigenvalues.
[Vn] Truncated [Vo] matrix with the eigenvectors corresponding to insignificant eigenvalues.
This partition divides the singular value matrix into two parts such that [Λ] contains the
significant singular values and [Λn] contains the insignificant ones. The cutoff
position for significance is the position at which the ratio of one singular value to its
immediate neighbor (with the singular values arranged in descending order) is larger
than a prefixed threshold value. The number of significant components 'M2' is
equal to the number of singular values above the cutoff position or, in other words, to
the dimension of the [Λ] matrix. This algorithm is able to suppress the
influence of noise on the results. The threshold can be chosen by measuring the detector noise,
the non-linearity of the detector, the quantization error of the analog-to-digital converter and
so on. Akaike's information criterion can also be used for this purpose [42].
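The cutoff rule described above can be sketched in a few lines. The following Python function is a minimal illustration, not the MATLAB code of the thesis: it assumes the cutoff is placed at the first position where the ratio of neighboring singular values exceeds the threshold, and the function name, the example singular values and the threshold of 10 are hypothetical.

```python
def count_significant(singular_values, ratio_threshold):
    """Return M2, the number of singular values above the cutoff position.

    The values are sorted in descending order and the cutoff is placed at
    the first position where a value divided by its immediate neighbor
    exceeds ratio_threshold (assumed reading of the rule in the text).
    """
    ordered = sorted(singular_values, reverse=True)
    for k in range(len(ordered) - 1):
        neighbor = ordered[k + 1]
        if neighbor == 0 or ordered[k] / neighbor > ratio_threshold:
            return k + 1
    return len(ordered)  # no clear gap: treat every component as significant


# Hypothetical singular values: three strong components followed by noise.
example_values = [50.0, 30.0, 12.0, 0.4, 0.3, 0.2]
m2 = count_significant(example_values, ratio_threshold=10.0)
print(m2)
```

In this example the ratio 12.0 / 0.4 is the first to exceed the threshold, so three components are kept and the remaining singular values are treated as noise.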
Based on the previous explanation, we can estimate the number of significant components
'M2' and reconstruct [I], which is approximately equal to [Io], using only the significant
singular values:
$$[I_o]_{N \times L} \approx [I]_{N \times L} = [U]_{N \times M_2}\,[\Lambda]_{M_2 \times M_2}\,[V]_{M_2 \times L} \qquad (4.5)$$
For the data samples generated, [Io] is the transpose of the data matrix, so that each
row of [Io] holds the data of all 900 pixels at a given frequency.
Using the component spatial and spectral pattern analysis algorithm, we try to
decompose this matrix so as to find the source signals and their probabilities at
each pixel, as illustrated in Figure 4.2.
If we decompose the matrix [Io] formed from the 900 × 200 data, it can be decomposed into a
maximum of 200 source signals, because that is the highest possible rank for that matrix. In
practical cases, however, when samples are collected from a restricted area or from a given
region, there is a limit to the number of source signals present
in the samples. It is therefore likely that the number of source signals present in the
samples is known, and we try to converge to that number of source signals. The
number of significant eigenvalues is obtained as explained above, and
[U], [Λ] and [V] are obtained as shown in equation (4.5). [I] is now factorized, and
[P] and [S] can be obtained from it using the transformations given in (4.7) and (4.8):
$$[I_o] \approx [I]_{N \times L} = [S]_{N \times M_2}\,[P]_{M_2 \times L} = [U]_{N \times M_2}\,[\Lambda]_{M_2 \times M_2}\,[T]^{-1}_{M_2 \times M_2}\,[T]_{M_2 \times M_2}\,[V]_{M_2 \times L} \qquad (4.6)$$
$$[P] = [T][V] \qquad (4.7)$$
$$[S] = [U][\Lambda][T]^{-1} \qquad (4.8)$$
So if the value of transformation matrix [T] can be obtained accurately then the values of
[S] and [P] can obtained accurately. This can be found using non-negativity constraint.
4.2.2 Non-negative Constraint
We can use non-negativity as a constraint to obtain the value of [T] and so determine
[P] and [S]. The [S] matrix has the source signals as its column
vectors, and since a source signal cannot be negative, each element of [S] should be
non-negative, as indicated in equation (4.9). Since the [P] matrix indicates the probability of each
source signal in the spatial domain, all the elements of [P] should also
be greater than or equal to zero, as indicated in equation (4.10).
$$S_{ij} \ge 0, \quad i = 1, \dots, L, \; j = 1, \dots, M_2 \tag{4.9}$$
$$P_{ij} \ge 0, \quad i = 1, \dots, M_2, \; j = 1, \dots, N \tag{4.10}$$
Pij is the element of [P] which represents the spatial pattern of the ith component at the jth pixel.
Sij is the element of [S] which represents the spectral response of the jth component at the ith frequency.
In the case of fluorescence spectroscopy we should make sure that there is no
absorption by the sample, since its existence makes equations (4.10) and (4.1) invalid.
These constraints force the elements of matrices [S] and [P] to be non-negative. Substituting
equations (4.7) and (4.8) into (4.9) and (4.10) gives a number of inequalities:
$$\mathrm{Elements}\{[U][\Lambda][T]^{-1}\} \ge 0 \tag{4.11}$$
$$\mathrm{Elements}\{[T][V]\} \ge 0 \tag{4.12}$$
[T] is a matrix of M2*M2 elements whose values are restricted by the N*M2
and M2*L inequalities given by equations (4.11) and (4.12). The values of [T] are
not unique in general, but if [P] and [S] are known, [T] is determined uniquely. Since
equation (4.6) involves both [T] and its inverse, any scalar multiple of [T] reproduces
[I] equally well, but it changes the magnitudes of [S] and [P]. To
restrict the magnitude of the source signals obtained, we
normalize [T] using
$$\mathrm{diag}([T][T]^t) = [E] \tag{4.13}$$
where [E] is an identity matrix.
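The normalization in (4.13) amounts to scaling each row of [T] to unit Euclidean length. The following is a minimal sketch of that step; the helper names `normalize_rows` and `diag_of_t_tt` and the 2*2 example matrix are hypothetical and chosen only for illustration.

```python
import math


def normalize_rows(t):
    """Scale each row of T to unit length, so every diagonal entry of T T^t is 1."""
    normalized = []
    for row in t:
        norm = math.sqrt(sum(x * x for x in row))
        if norm == 0.0:
            raise ValueError("a zero row cannot be normalized")
        normalized.append([x / norm for x in row])
    return normalized


def diag_of_t_tt(t):
    """Diagonal of T T^t: the squared length of each row of T."""
    return [sum(x * x for x in row) for row in t]


# Hypothetical 2*2 transformation matrix.
T = [[3.0, 4.0], [1.0, -1.0]]
T_unit = normalize_rows(T)
```

Row scaling is a left-multiplication of [T] by a diagonal matrix, so the product [T]^{-1}[T] in (4.6) is unaffected and only the magnitudes of [S] and [P] change.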
4.2.3 Feasible Solution of [T] using non-negativity constraints
If we derive the solution using the non-negativity constraints shown in the previous
explanation, we arrive at a feasible region for the matrix [T]. Since the values of
[U], [Λ], [V] are known, we can substitute them in equations (4.11) and (4.12) and get
inequalities which can be solved simultaneously to find a feasible solution region of
[T]. Equation (4.12) gives us the following inequalities
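$$t_{11}\upsilon_{1i} + t_{12}\upsilon_{2i} + t_{13}\upsilon_{3i} + \cdots + t_{1M_2}\upsilon_{M_2 i} \ge 0, \quad i = 1, 2, \dots, L \tag{4.14}$$
$$\vdots$$
$$t_{M_2 1}\upsilon_{1i} + t_{M_2 2}\upsilon_{2i} + t_{M_2 3}\upsilon_{3i} + \cdots + t_{M_2 M_2}\upsilon_{M_2 i} \ge 0, \quad i = 1, 2, \dots, L \tag{4.15}$$
Since the general M2-component case is tedious, we consider the two-component case,
which can be easily expressed in the form of mathematical equations. For two components
the above inequalities become
$$t_{11}\upsilon_{1i} + t_{12}\upsilon_{2i} \ge 0, \quad i = 1, 2, \dots, L \tag{4.16}$$
$$t_{21}\upsilon_{1i} + t_{22}\upsilon_{2i} \ge 0, \quad i = 1, 2, \dots, L \tag{4.17}$$
From these we can derive the relation between t11 and t12, and between t21 and t22, as
$$-\left[\max_{1 \le i \le L} \frac{\upsilon_{2i}}{\upsilon_{1i}}\right]^{-1} \le \frac{t_{12}}{t_{11}} \le -\left[\min_{1 \le i \le L} \frac{\upsilon_{2i}}{\upsilon_{1i}}\right]^{-1} \tag{4.18}$$
$$-\left[\max_{1 \le i \le L} \frac{\upsilon_{2i}}{\upsilon_{1i}}\right]^{-1} \le \frac{t_{22}}{t_{21}} \le -\left[\min_{1 \le i \le L} \frac{\upsilon_{2i}}{\upsilon_{1i}}\right]^{-1} \tag{4.19}$$
Similarly, using equation (4.11) we get
$$\frac{u_{j1}\lambda_1 t_{22} - u_{j2}\lambda_2 t_{21}}{t_{11}t_{22} - t_{12}t_{21}} \ge 0, \quad j = 1, 2, \dots, N \tag{4.20}$$
$$\frac{-u_{j1}\lambda_1 t_{12} + u_{j2}\lambda_2 t_{11}}{t_{11}t_{22} - t_{12}t_{21}} \ge 0, \quad j = 1, 2, \dots, N \tag{4.21}$$
Now, from equations (4.20) and (4.21), by considering the conditions $t_{11} > 0$, $t_{21} > 0$ and
$t_{12}/t_{11} > t_{22}/t_{21}$, the above equations yield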
$$\frac{t_{22}}{t_{21}} \le \min_{1 \le j \le N} \frac{\lambda_2 u_{j2}}{\lambda_1 u_{j1}} \tag{4.22}$$
$$\frac{t_{12}}{t_{11}} \ge \max_{1 \le j \le N} \frac{\lambda_2 u_{j2}}{\lambda_1 u_{j1}} \tag{4.23}$$
Here the stated conditions make the denominator t11t22 − t12t21 negative, so both numerators
in (4.20) and (4.21) must be non-positive (taking λ1uj1 > 0).
For normalizing [T] we have the conditions
$$t_{11}^2 + t_{12}^2 = 1 \tag{4.24}$$
$$t_{21}^2 + t_{22}^2 = 1 \tag{4.25}$$
Solving these three sets of conditions simultaneously gives a solution region for [T].
When the number of components increases, the solution becomes complex. So far we
have discussed the feasible solution region for both the component patterns and the
spectra. The next question is how to obtain a unique solution for both
from this solution region. This solution is obtained based on an
entropy minimization criterion.
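For the two-component case, the bounds on t12/t11 that follow from t11υ1i + t12υ2i ≥ 0 can be sketched as below. The sketch assumes t11 > 0, that every υ1i is positive, and that the ratios υ2i/υ1i take both signs; the function name `ratio_bounds` and the example vectors are hypothetical.

```python
def ratio_bounds(v1, v2):
    """Bounds on x = t12/t11 from t11*v1[i] + t12*v2[i] >= 0 with t11 > 0."""
    if any(a <= 0 for a in v1):
        raise ValueError("this sketch assumes every v1[i] is positive")
    ratios = [b / a for a, b in zip(v1, v2)]
    r_max, r_min = max(ratios), min(ratios)
    if r_max <= 0 or r_min >= 0:
        raise ValueError("this sketch assumes the ratios take both signs")
    # Dividing the inequality by t11*v1[i] gives 1 + x*r_i >= 0 for every i.
    # Positive ratios bound x from below, negative ratios bound it from above.
    return -1.0 / r_max, -1.0 / r_min


# Hypothetical rows of [V] for a two-component problem.
lower, upper = ratio_bounds([1.0, 1.0, 1.0], [2.0, -1.0, 0.5])
```

Any value of t12/t11 between `lower` and `upper` keeps the first row of [T][V] non-negative; the same bounds apply to t22/t21.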
4.2.4 Optimal Estimation
Because we do not have a priori information about the components, estimating the optimal
solution from the feasible solutions requires estimation theory. When decomposing the
given multispectral data into source signals, one way to ascertain that the signals obtained are
the original source signals is to check that they are mutually independent, i.e., the solution set
we get should be independent of one another. Entropy is one way of measuring the
independence of a signal.
Entropy is defined as a measure of the uniformity of the distribution of a bounded set of values.
This means that signals have maximum entropy when they are uniform. So finding [T]
such that the entropy of the signals is minimized will also minimize the amount of shared
entropy, or mutual information.
So now we try to find the optimal solution by minimizing the entropy of the signals, given
by the function H[S] as
$$H[S] = -\sum_{i=1}^{M} \sum_{j=1}^{L} a_{ij} \ln a_{ij} \tag{4.26}$$
where
$$a_{ij} = \frac{|s''_{ij}|}{\sum_{j=1}^{L} |s''_{ij}|}$$
aij is treated as the probability density function of a stochastic process, and
sij'' is the second derivative, with respect to frequency, of the spectrum of the ith
component at the jth frequency.
Minimizing equation (4.26) therefore localizes the peaks in the spectra and
smooths the baseline. The second derivative is taken to emphasize the peak features.
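The entropy of (4.26) can be sketched numerically as follows. This is an illustration only: the second derivative is approximated by a central second difference, its absolute value is used so that the normalized values are non-negative, and the names `second_difference` and `spectral_entropy` and the example spectra are hypothetical.

```python
import math


def second_difference(signal):
    """Central second difference, a discrete stand-in for the second derivative."""
    return [
        signal[k - 1] - 2.0 * signal[k] + signal[k + 1]
        for k in range(1, len(signal) - 1)
    ]


def spectral_entropy(spectra):
    """Sum over spectra of -sum_j a_j ln a_j, with a_j the normalized |second difference|."""
    total = 0.0
    for spectrum in spectra:
        magnitudes = [abs(d) for d in second_difference(spectrum)]
        norm = sum(magnitudes)
        if norm == 0.0:
            continue  # a flat or linear spectrum has no curvature to distribute
        for m in magnitudes:
            a = m / norm
            if a > 0.0:  # the term a*ln(a) tends to 0 as a tends to 0
                total -= a * math.log(a)
    return total


# Hypothetical spectra: one sharp peak versus a rapidly oscillating signal.
peaked = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
oscillating = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
```

The peaked spectrum concentrates its curvature at a few frequencies and so has the lower entropy, which is the behavior the minimization exploits.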
We can do the entropy minimization in the spectral domain, H[S], in the spatial domain, H[P],
or in both, H([S],[P]).
If the entropy minimization is done in the spatial domain, the minimization
function H[P] is
$$H[P] = -\sum_{i=1}^{M} \sum_{j=1}^{N} b_{ij} \ln b_{ij} \tag{4.27}$$
where
$$b_{ij} = \frac{|p''_{ij}|}{\sum_{j=1}^{N} |p''_{ij}|}$$
Now we have a constrained optimization problem with the minimization of entropy as the
cost function, subject to non-negativity constraints. Several methods are available
to solve a constrained optimization problem, such as the penalty function method [43], the
Lagrangian multiplier method [43], the augmented Lagrangian multiplier method for
inequality constraints [43], quadratic programming [43], the gradient projection method for
equality constraints [43], and the gradient projection method for inequality constraints [43].
All these methods try to formulate the constrained problem as an unconstrained problem and
solve it. The penalty function method is employed in this algorithm.
General Penalty Function Method: The general penalty function can be understood from the
following mathematical explanation. Let us start with a constrained problem given by
$$\text{Minimize } f(x) \tag{4.28}$$
$$\text{subject to } g_j(x) \le 0, \quad j = 1, \dots, p$$
$$h_i(x) = 0, \quad i = 1, \dots, m$$
Then, using the penalty function described by [Snyman 2005], we can reformulate
(4.28) as the unconstrained problem
$$\text{Minimize } P(x) \tag{4.29}$$
where
$$P(x, \rho, \beta) = f(x) + \sum_{i=1}^{m} \rho\, h_i^2(x) + \sum_{j=1}^{p} \beta\, g_j^2(x)$$
where ρ and β are the penalty parameters, with ρ >> 0 and β >> 0; the β term is applied only to inequality constraints that are violated, that is, when gj(x) > 0.
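A one-dimensional toy problem shows how the penalty term pulls the unconstrained minimizer towards the feasible region. The problem below (minimize x^2 subject to x ≥ 1), the golden-section search, and all names are hypothetical choices made for illustration; they are not part of the thesis. The penalty is applied only when the inequality constraint is violated.

```python
def penalized_cost(x, beta):
    """P(x) = f(x) + beta * g(x)^2, with the penalty active only when g(x) > 0."""
    f = x * x                    # objective: f(x) = x^2
    g = 1.0 - x                  # constraint g(x) <= 0, i.e. x >= 1
    violation = max(0.0, g)
    return f + beta * violation * violation


def golden_section_minimize(func, lo, hi, tol=1e-10):
    """Minimize a unimodal function of one variable on [lo, hi]."""
    inv_phi = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if func(c) < func(d):
            b = d
        else:
            a = c
    return (a + b) / 2


# Minimizer of the penalized cost for increasing penalty parameters.
minimizers = {
    beta: golden_section_minimize(lambda x, b=beta: penalized_cost(x, b), -5.0, 5.0)
    for beta in (1.0, 10.0, 100.0, 1000.0)
}
```

For this toy problem the penalized minimizer is β/(1+β), which approaches the constrained solution x = 1 as β grows.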
Using the penalty function method, the constrained optimization problem can
be converted into an unconstrained problem. The problem now is to minimize the entropy
function subject to the non-negativity constraints. This can be represented mathematically as
$$\text{Minimize } H([S],[P]) \tag{4.30}$$
$$\text{subject to } S_{ij} \ge 0, \quad i = 1, \dots, L, \; j = 1, \dots, M_2$$
$$P_{ij} \ge 0, \quad i = 1, \dots, M_2, \; j = 1, \dots, N$$
H([S],[P]) is the entropy function.
Formulated using the penalty function method, the cost function is
$$R([S],[P]) = H([S],[P]) + \gamma\, Q([S],[P]) \tag{4.31}$$
Q is the penalty function term that handles the constraints, and
γ is the scaling factor, or penalty parameter.
As in equation (4.29), the non-negativity constraint is handled by writing the function Q as
$$Q = \sum_{m=1}^{M_2} \sum_{i=1}^{N} F(p_{mi}) + \sum_{j=1}^{L} \sum_{m=1}^{M_2} F(s_{jm}) \tag{4.32}$$
where pmi and sjm are the elements of [P] and [S], and
$$F(x) = \begin{cases} 0 & (x \ge 0) \\ x^2 & (x < 0) \end{cases} \tag{4.33}$$
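The penalty term of (4.32) and (4.33) and the cost of (4.31) can be sketched directly. The function names and the small example matrices are hypothetical; the entropy value is passed in as a plain number so that the sketch stays self-contained.

```python
def penalty_term(x):
    """F(x) of (4.33): zero for non-negative x, x squared for negative x."""
    return 0.0 if x >= 0 else x * x


def penalty_q(p, s):
    """Q of (4.32): sum of F over every element of [P] and of [S]."""
    return (
        sum(penalty_term(v) for row in p for v in row)
        + sum(penalty_term(v) for row in s for v in row)
    )


def cost_r(entropy, p, s, gamma=1.0):
    """R of (4.31): entropy plus gamma times the non-negativity penalty."""
    return entropy + gamma * penalty_q(p, s)


# Hypothetical matrices with a few negative entries.
P = [[1.0, -2.0], [0.0, 3.0]]
S = [[-1.0, 0.5]]
```

With these values only the entries -2 and -1 contribute, so Q = 4 + 1 = 5.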
The penalty function reduces the problem to an unconstrained optimization problem.
Minimization of R can be done using the simplex method [44], the Davidon-Fletcher-Powell
method [44], or both. The simplex method is employed in this thesis.
The exact constrained minimum is reached only as γ → ∞, but the presence of noise does not
allow this to be the optimal solution. We also cannot start with a large initial value of γ, because
this causes the solution to converge towards some undesired local minimum.
So the algorithm starts with a smaller value of γ. Using the solution from the first step as
input to the next iteration, γ can be optimized. This can be repeated until γ
exceeds the signal-to-noise ratio of the data (if known). In our algorithm we do not try to
optimize γ; since γ=1 gave a reasonable solution, we used that as the penalty
parameter.
4.3 Results of component spatial and spectral pattern analysis
The method explained above is used to decompose the [I] matrix into its respective source
signals [S] and their corresponding probabilities [P].
The algorithm is started with an initial guess for [T] and minimizes the cost function,
which is the entropy function. The entropy can be calculated for [P], for [S], or for
a combination of both matrices. This algorithm uses the entropy of the source signals.
The constraints require the elements of both [S] and [P] to be non-negative. A penalty function with
scaling γ=1 is used to turn the constrained problem into an unconstrained one.
Different values of γ were tried; if γ is
started with a large value in (4.31), the algorithm converges to some local minimum
because the non-negativity constraint is emphasized more than the entropy function. The value
γ=1 gave a reasonably good result, so it is used in this implementation.
The MATLAB optimization function 'fminsearch' is used to minimize the
cost function. It uses the Nelder-Mead simplex algorithm to
minimize a given unconstrained function. Its required inputs are the cost function of equation (4.31),
written in the form of a function, and the initial value [TO] of the matrix to be optimized.
Any further inputs depend on how the cost function is written.
The algorithm converges towards some optimal
value of [T]. This [T], along with [U], [Λ], [V], is used to reconstruct the final values of
[S] and [P]. The results obtained are shown below.
The source signals obtained, which are the columns of matrix [S], are shown below.
Figure 4.3: (a) Original source signals and (b) Results obtained
The plots on the left hand side of Figure (4.4) are the original signals used
to generate the samples, and the plots on the right hand side are the source signals to
which the algorithm has converged. The peaks and valleys are in nearly the same
positions, but there is some scaling involved which cannot be rectified using this algorithm.
Also, the order of convergence cannot be controlled. The order obtained here is TNT,
DNT, and RDX.
The rows of the corresponding probability matrix [P], arranged in the form of 30*30 plots,
are shown below.
Figure 4.4: (a) The original samples generated and (b), (c), (d) The rows of [P] arranged as 30*30 plots
The limitation of this algorithm is that we cannot be sure that the components obtained
are the true ones; they are only optimal under the entropy minimization criterion. So some
further processing is required.
4.3.1 Further Processing for confirmation
The algorithm will converge to the significant number of source signals. But in general
applications we use this algorithm to find whether the targeted signals are present in the
multispectral image. So some further processing is required for the spectral responses
obtained and their corresponding spatial patterns.
4.3.1.1 Cross-Correlation
From the algorithm explained, one should be able to obtain [S] and [P]. The algorithm
will converge towards some solution irrespective of whether the target signals
being searched for are present. Since the goal is to detect the presence of the target signal in
the multispectral data, the result has to be processed further to confirm whether a
source signal obtained is a target signal or not. Since the target signal information is
available, the aim is to find whether a source signal obtained is one from the database that
contains the target signals.
Cross-correlation is a measure of the similarity of two waveforms, that is, of the degree to
which two series are correlated. It is also known as the sliding dot product or sliding inner product.
For discrete functions the cross-correlation is given by
$$(f \star g)[n] \overset{\text{def}}{=} \sum_{m=-\infty}^{\infty} f^{*}[m]\, g[n+m] \tag{4.34}$$
where f* denotes the complex conjugate of f. Cross-correlation is similar to convolution.
Convolution involves reversing a signal, shifting it, and then multiplying it by another signal,
whereas correlation only shifts and multiplies, without the reversal. Auto-correlation is the
cross-correlation of a signal with itself.
Cross-correlation is used to decide whether a source signal obtained is one from the database that
contains the target signals. Each source signal obtained is cross-correlated
with each of the target signals in the database, and the maximum value of the
correlation output decides whether the output is a target signal or not. If the maximum
value of the correlation output, normalized with respect to the auto-correlation output of the
target signal, is greater than a certain prefixed threshold value, then the signals are
considered to be similar, and it is concluded that the target signal is present in the
multispectral image.
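The matching rule described above can be sketched for real-valued signals as follows. This is an illustration under stated assumptions: the function names, the threshold of 0.8, and the short example signals are hypothetical, and the normalization uses the zero-lag auto-correlation of the target, which is its maximum.

```python
def cross_correlation(f, g):
    """Real-valued version of (4.34): sum over m of f[m] * g[n + m], for every lag n."""
    lags = range(-(len(f) - 1), len(g))
    return [
        sum(f[m] * g[n + m] for m in range(len(f)) if 0 <= n + m < len(g))
        for n in lags
    ]


def matches_target(source, target, threshold):
    """Compare the peak cross-correlation with the target's zero-lag auto-correlation."""
    peak = max(cross_correlation(target, source))
    auto_peak = sum(t * t for t in target)  # assumes the target is not all zeros
    return peak / auto_peak > threshold


# Hypothetical target spectrum, a shifted copy of it, and an unrelated signal.
target = [0.0, 1.0, 2.0, 1.0, 0.0]
shifted = [0.0, 0.0, 1.0, 2.0, 1.0]
unrelated = [1.0, 0.0, 0.0, 0.0, -1.0]
```

Because the correlation is evaluated at every lag, a shifted copy of the target still reaches the full normalized peak, while the unrelated signal reaches only one third of it.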
So for each of the three signals obtained previously, we perform correlation
with all three explosive source signals used to generate the samples, and find
the maximum of the correlation output. In the figures, the left column shows the signal obtained
as the result, and the right column shows its correlation output with the
explosive database in the order RDX, TNT and DNT.
From Figure (4.6), the first source signal obtained gives a large peak value
only with TNT, so the first source obtained is identified as
TNT. Similarly, from Figure (4.7), the second source signal obtained gives
a large peak value only with DNT, so the second
source obtained is identified as DNT. From Figure (4.8), the third source signal
obtained gives a large peak value only with RDX, so
the third source obtained is identified as RDX.
Figure 4.5: (a) First source signal obtained, (b) its correlation with RDX, (c) its correlation with TNT, (d) its correlation with DNT
Figure 4.6: (a) Second source signal obtained, (b) its correlation with RDX, (c) its correlation with TNT, (d) its correlation with DNT
Figure 4.7: (a) Third source signal obtained, (b) its correlation with RDX, (c) its correlation with TNT, (d) its correlation with DNT
4.3.1.2 Thresholding to find the spatial position
The spatial significance of each of the components is decided based on the values of the
matrix [P]. After the correlation output confirms that explosive traces
are present in the multispectral image, it is equally important to identify
the exact spatial location of the explosive in the image. In each case where a
resultant signal matches an explosive from the database,
the maximum correlation output is significantly larger than the other results, and
a threshold value is set to decide whether the source signal is an explosive.
Since each row of [P] corresponds to one of the source signals, the rows of [P]
corresponding to target signals are passed through a further thresholding stage, where
each pixel probability is compared with the maximum value of that row to confirm whether
the target signal is present or not.
If any of the resultant signals are identified as explosives, we then observe their
spatial probability in the multispectral image. The [P] matrix holds
the probability values arranged in rows, and each row is rearranged in the form of a 30*30
plot, one for each source signal obtained. Here again a threshold value is used to
decide whether a pixel contains the explosive or not. The smaller the percentage of
explosive present in the sample, the smaller the threshold value must be for it
to be detected as explosive.
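The thresholding stage can be sketched as below, with the threshold expressed as a fraction of the row maximum (Max/2, Max/5, and so on). The function name `detect_pixels`, the row-major pixel layout, and the 3*3 example row are assumptions made for illustration; the thesis data uses a width of 30.

```python
def detect_pixels(p_row, divisor, width=30):
    """Return (row, column) positions whose value is at least max(p_row) / divisor."""
    threshold = max(p_row) / divisor
    # Assumes the row of [P] stores the image in row-major order.
    return [
        (k // width, k % width)
        for k, value in enumerate(p_row)
        if value >= threshold
    ]


# Hypothetical row of [P] for a 3*3 image.
p_row = [0.0, 0.1, 0.9,
         0.5, 0.05, 0.0,
         0.2, 0.0, 1.0]
strict = detect_pixels(p_row, divisor=2, width=3)
loose = detect_pixels(p_row, divisor=10, width=3)
```

A larger divisor lowers the threshold, so more pixels are reported: the strict setting finds three pixels and the loose one finds five.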
With a very low threshold value, the probability plot of each of the resultant signals
detected as explosives is as shown below.
Figure 4.8: (a) First row of [P] as 30*30 plot, (b) Finally detected pixels after thresholding
Figure 4.9: (a) Second row of [P] as 30*30 plot, (b) Finally detected pixels after thresholding
Figure 4.10: (a) Third row of [P] as 30*30 plot, (b) Finally detected pixels after thresholding
The shortcoming of this procedure, which we should overcome, is that in practical cases,
when the target signal database is very large, correlating each source obtained with
all the signals in the database involves a lot of computation and hence is not
recommended. So we tried component spectral and spatial pattern analysis with
reference.
4.4 Component spectral and spatial pattern analysis with reference
In the previous method there is no fixed order in which the source signals
converge. If the existing algorithm can be modified so that the order of convergence
is fixed, the number of computations required for the correlation is drastically
reduced.
Since prior knowledge of the target signals is available, it can be used as reference
signal information to make an initial guess for the value of [T] when solving the
unconstrained problem with the simplex algorithm. The algorithm initially assumes that the
spectral response matrix [S] is known to us as [Sref]. From equation (4.8) we can see
that
$$[S] = [U][\Lambda][T]^{-1} \tag{4.35}$$
So we can write
$$[S_{ref}] = [U][\Lambda][T_O]^{-1} \tag{4.36}$$
Since the values of [U] and [Λ] are known and the value of [Sref] is
assumed, the only unknown in the above equation is [TO], which can be obtained using
$$[T_O] = [S_{ref}]^{-1}[U][\Lambda] \tag{4.37}$$
We use [TO] as the initial guess of [T] in the unconstrained optimization
problem solved with the simplex algorithm.
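Since [Sref] has one column per reference spectrum, it is generally not square, and the inverse in (4.37) can then be read as a pseudo-inverse. The following derivation is a sketch under that assumption, with [Sref] taken to have full column rank; the symbol $[S_{ref}]^{+}$ is introduced here for illustration and is not part of the original notation.

$$
\begin{aligned}
[S_{ref}] &= [U][\Lambda][T_O]^{-1} \\
[S_{ref}][T_O] &= [U][\Lambda] \\
[S_{ref}]^{t}[S_{ref}][T_O] &= [S_{ref}]^{t}[U][\Lambda] \\
[T_O] &= \left([S_{ref}]^{t}[S_{ref}]\right)^{-1}[S_{ref}]^{t}[U][\Lambda] = [S_{ref}]^{+}[U][\Lambda]
\end{aligned}
$$

When the columns of [U][Λ] do not lie exactly in the column space of [Sref], this expression gives the least-squares estimate of [TO], which is still usable as a starting point for the simplex search.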
The figures below show the output of the two methods: the first is obtained
using a randomly generated initial value of [TO], and the second is obtained
using the reference signal to generate the initial value of [TO].
Figure 4.11: Result of [S] obtained using random value of [TO]
Figure 4.12: Result of [S] obtained using value of [TO] generated using [Sref]
Comparing the results, component spectral and spatial pattern analysis with reference
gives a good result, and in the later stage it involves only one correlation
computation for each source signal.
From the [S] matrix obtained by this method we can confirm, using correlation, whether the
source signals are explosives. We then threshold the rows of [P] corresponding to those
explosive signals to detect the explosive pixels. The final result
obtained by this method is
Figure 4.13: Original samples generated
[Plot title: "Field with all original given pixels"; axes: x axis, y axis]
Figure 4.14: Finally detected pixels
4.5 Results when percentage of explosives varies in the samples
The results in the previous cases, where the percentage of explosives present in the
samples is large, are situations where detection is easy. In this part of the chapter we
discuss the cases where the percentage of explosives in the samples varies.
In the case of independent component analysis we do not use any arbitrary threshold
values, and the results obtained have no chance of false positives (a pixel detected as
explosive that does not actually contain explosive). There is also no chance of false negatives
(a pixel that contains explosive not being detected) if the percentage of explosive is reasonably
large.
In the case of the component spatial and spectral pattern analysis method, the finally detected
samples depend on the threshold value used for detecting pixels in [P]. If the
threshold is very low there are chances of false positives, and if the threshold is large
there is a chance of false negatives. Each row of [P], which corresponds to
one of the explosives used to generate the samples, is shown as a 30*30 plot
in the figure below.
[Plot title: "Output of Component Spectral and Spatial Pattern Analysis"; axes: x axis, y axis]
Figure 4.15: (a) Samples generated, (b) First row of [P] as 30*30 plot, (c) Second row of [P] as 30*30 plot, (d) Third row of [P] as 30*30 plot
From the figures we can clearly observe the variation in the percentage of explosives present
in the samples. The ultimate goal of the thesis is to detect the presence of an explosive,
not to identify the type of explosive present at a particular position. In the table
shown below we therefore count the total false positives and false negatives for different values of
the threshold. There are 87 pixels that have explosives among the 900 pixels shown in
the figure above.
[Plot title: "Field with all original given pixels"; axes: x axis, y axis]
Table 4.1: False alarms of the component spatial and spectral pattern analysis method
S.No  Threshold  False positives  False negatives  Total number of pixels  Total number of pixels with explosives
1     Max/2      0                26               900                     87
2     Max/5      0                13               900                     87
3     Max/6      1                13               900                     87
4     Max/7      65               5                900                     87
5     Max/8      90               0                900                     87
6     Max/10     300              0                900                     87
Because false negatives are more dangerous than false positives, we should choose a smaller
value of the threshold, such that all the explosives are effectively detected.
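The false positive and false negative counts in a table of this kind can be obtained by comparing the set of detected pixels with the set of pixels that truly contain explosive. The sketch below assumes both are given as collections of (row, column) positions; the function name `count_errors` and the example positions are hypothetical and do not reproduce the thesis data.

```python
def count_errors(detected, truth):
    """Return (false_positives, false_negatives) for two collections of pixel positions."""
    detected, truth = set(detected), set(truth)
    false_positives = len(detected - truth)   # detected, but no explosive present
    false_negatives = len(truth - detected)   # explosive present, but not detected
    return false_positives, false_negatives


# Hypothetical detection result and ground truth.
detected = [(0, 0), (1, 1), (2, 2)]
truth = [(1, 1), (2, 2), (3, 3), (4, 4)]
fp, fn = count_errors(detected, truth)
```

Repeating this count for each threshold setting gives one row of the table per setting.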
4.6 Conclusions
Valid results are obtained using component spatial and spectral pattern analysis when the
trace of explosive present in the sample is significant. But this method involves a lot of
further computation to identify whether a source signal obtained is one from the database
of explosives. Since in practical conditions the number of explosives is very large, this
is not computationally affordable and consumes more time.
So the algorithm is improved to component spatial and spectral pattern analysis with
reference, which maintains a fixed order for the converged source signals. This reduces
the computational effort in the later stages.
CHAPTER 5
INDEPENDENT COMPONENT ANALYSIS
5.1 Introduction
Spectral unmixing is a technique in which the original source signals are extracted from the
mixed signals. Some of the commonly used unmixing techniques are Least Square
Methods [45], Principal Component Analysis [46], Independent Component Analysis
(ICA) [47-52] and Kalman Filtering [53-54]. Of these, ICA is a very well developed
algorithm for spectral unmixing. ICA is a method in which we try to find a linear transform
such that the transformed components are mutually independent, or as independent as
possible.
In the last decade independent component analysis has received an increasing amount of
attention from the signal processing community, because signal separation is an important
application of it. In the literature the independent component
analysis problem is addressed under labels such as blind source separation, signal copy, and
waveform-preserving estimation [55]. Data analysis and compression, Bayesian
detection, localization of sources, blind identification and de-convolution are
potential applications of independent component analysis [56].
Independent component analysis involves disciplines such as neural networks, statistics,
pattern recognition, information theory and system identification. Principal component
analysis tries to find components which are uncorrelated, whereas independent component
analysis tries to find components that are mutually statistically independent. Principal
component analysis involves only second order statistics, whereas independent
component analysis involves higher order statistics. Using principal component analysis,
mixing matrices and source signals can only be found up to an orthogonal transformation;
this is known as the rotational invariance property of principal component analysis.
Using independent component analysis, the original mixing matrix and source signals can be
retrieved.
Independent component analysis can be defined as a method for decomposing a
linear mixture which contains unknown source signals into independent components,
relying on the assumption that the source signals are statistically independent. All we know is
X, and independent component analysis must be capable of obtaining both
A and S in equation (5.1) shown below. The assumption is that the Si obtained
are statistically independent. We must also assume that the independent components have
non-gaussian distributions. The input X to the independent component
analysis problem can be expressed in mathematical form as
$$X_{L \times N} = A_{L \times M}\, S_{M \times N} \tag{5.1}$$
where
S is a matrix with each row representing the values of a source signal at N different
frequencies,
X is obtained from a linear transformation of the original source signals, with each row
representing the values of a mixed signal at N different frequencies,
A is the mixing matrix.
The goal of independent component analysis is, given X, to find the value of A
such that the independent source signals S can be obtained by equation (5.2) shown
below
$$S = A^{-1} X = W_{M \times L}\, X_{L \times N} \tag{5.2}$$
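The algebra of (5.1) and (5.2) can be checked on a small example. In the sketch below the mixing matrix A is known, which is never the case in a real ICA problem; it only illustrates that applying W = A^{-1} to the mixtures recovers the sources. The helper names and the 2*2 example values are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def inverse_2x2(m):
    """Inverse of a 2*2 matrix, used here as the unmixing matrix W."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("the mixing matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


# Two hypothetical source signals sampled at four frequencies (rows of S).
S = [[1.0, 0.0, 2.0, 0.0],
     [0.0, 3.0, 0.0, 1.0]]
A = [[0.8, 0.2],
     [0.3, 0.7]]            # hypothetical mixing matrix
X = matmul(A, S)            # observed mixtures, equation (5.1)
W = inverse_2x2(A)          # unmixing matrix
S_recovered = matmul(W, X)  # equation (5.2)
```

An ICA algorithm has to estimate W from X alone, using the independence of the rows of S in place of knowledge of A.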
This chapter discusses the methods to obtain this and the results obtained. Section 5.2
gives a brief description of the history of ICA and the criteria for the choice of algorithm, and then
explains the FastICA algorithm and shows the result obtained. Section 5.3 discusses the
ICA with reference algorithm and section 5.4 shows the results of the algorithm. Section
5.5 explains FastICA with reference. Section 5.6 discusses ICA with multi-reference and
section 5.7 explains the results of the algorithm.
5.2 FastICA
5.2.1 Criteria to choose the algorithm
The main choices are between the different statistical estimation criteria and between one-unit
and multi-unit methods. This thesis uses FastICA, which employs a non-gaussianity
criterion. The reasons for the choice are given below.
In methods that measure independence by non-gaussianity,
non-gaussianity can be measured on a single projection, i.e., it is possible to
derive only a few ICs. This is not possible for other kinds of statistical
criteria. Since our emphasis is on obtaining only a limited number of ICs, we opt for
non-gaussianity as the statistical criterion of the ICA algorithm.
The selection of an ICA method can be decomposed as
ICA Method = Objective Function + Optimization Algorithm (5.3)
In general, statistical properties influence the choice of objective function, and algorithm
properties depend on the optimization method chosen. Based on our requirements, we have
decided to use FastICA, which employs non-gaussianity as a measure of independence. So a
function which measures non-gaussianity is our objective function.
Orthogonalization
As mentioned, non-gaussianity is used as a measure of independence when we are trying
to estimate one or a few ICs. To find a few ICs, the same one-unit
ICA algorithm has to be run several times, while making sure that it does not converge to the same
maxima. This requires orthogonalization of the vectors w1, w2, w3, ..., wn. There are two
different methods for achieving this de-correlation.
Deflationary Orthogonalization
In this method the independent components are found one by one. After estimating p
independent components, or p vectors w1, w2, ..., wp, any of the one-unit ICA algorithms
is run again to obtain wp+1; after every iteration, the projections of wp+1 onto the
previously estimated p vectors are subtracted from it, and wp+1 is renormalized. Written
for the pth vector, the update is
$$w_p \leftarrow w_p - \sum_{j=1}^{p-1} (w_p^T w_j)\, w_j \tag{5.4}$$
The disadvantage of this method is that any estimation errors in the first
vectors accumulate in the other ICs through the orthogonalization. The
alternative method is symmetric orthogonalization.
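The deflation step of (5.4) is a Gram-Schmidt projection followed by renormalization. The sketch below assumes the previously estimated vectors are already orthonormal; the function names and the three-dimensional example vectors are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def deflate(w, previous):
    """Subtract from w its projections onto the previous vectors, then renormalize."""
    # As in (5.4), every projection coefficient uses the original w.
    coefficients = [dot(w, wj) for wj in previous]
    for coeff, wj in zip(coefficients, previous):
        w = [wi - coeff * wji for wi, wji in zip(w, wj)]
    norm = math.sqrt(dot(w, w))
    if norm == 0.0:
        raise ValueError("w lies in the span of the previous vectors")
    return [wi / norm for wi in w]


# Hypothetical previously estimated vector and a new candidate.
w1 = [1.0, 0.0, 0.0]
w2 = deflate([1.0, 1.0, 0.0], [w1])
```

After the step, w2 is orthogonal to w1 and has unit length, so a one-unit algorithm restarted from it cannot converge to the same component again.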
Symmetric Orthogonalization
In this method the one-unit algorithm is run for all the ICs in parallel, and the
vectors wi are orthogonalized together by a symmetric method:
$$ W \leftarrow (W W^T)^{-1/2}\, W \qquad (5.5) $$
Either of these two methods is selected based on the requirements.
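Equation (5.5) can be sketched in NumPy as follows. The inverse matrix square root is computed here through an eigendecomposition of W W^T, which is one common choice and an assumption of this example; the matrix `W` is arbitrary illustrative data.

```python
import numpy as np

def symmetric_orthogonalize(W):
    """Return (W W^T)^(-1/2) W, using the eigendecomposition of the symmetric matrix W W^T."""
    eigvals, eigvecs = np.linalg.eigh(W @ W.T)
    inv_sqrt = eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T
    return inv_sqrt @ W

# Illustrative full-rank weight matrix with one weight vector per row.
W = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])
W_orth = symmetric_orthogonalize(W)
```

Unlike deflation, no row is treated as "first": all rows are adjusted together, so estimation errors are not passed from one vector to the next.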
We have considered three major criteria for the choice of this algorithm.
The first criterion is whether the ICs are estimated all
simultaneously or one by one. This decides between symmetric and
hierarchical (deflationary) decorrelation. Since we are trying to find the ICs one after
the other, we use an algorithm that supports hierarchical decorrelation.
The second criterion is the non-linearity to be chosen. For FastICA the tanh
non-linearity is used, whereas some gradient-based algorithms need a second function.
For a general problem the function used by FastICA is appropriate.
The third criterion is whether an on-line or a batch algorithm is needed. When the
data is readily available we can use the FastICA algorithm. When the data changes
continuously it is better to use gradient methods, or a
combination of both. Since we have the data readily available, we use FastICA.
From the above explanation, the FastICA algorithm described in this chapter
uses a measure of non-gaussianity as the cost function together with the
hierarchical de-correlation method.
5.2.2 Estimation of Independence: Measure of non-gaussianity
Since the main aim of independent component analysis is to find components that are
statistically independent, a statistical criterion for measuring independence is required.
Criteria such as mutual information, likelihood and non-gaussianity can be used as the
measure of independence. The reasons for choosing non-gaussianity were given previously.
Two variables 'x1' and 'x2' are said to be independent if information on the value of 'x1'
does not give any information on the value of 'x2'. A weaker form of independence is
uncorrelatedness. Since independence implies uncorrelatedness, independent
component analysis can be constrained so that the estimated components are uncorrelated.
If 'x1' and 'x2' are independent, then they are uncorrelated, i.e.,
$$ \operatorname{cov}(x_1, x_2) = 0 \qquad (5.6) $$
where the covariance between 'x1' and 'x2' is
$$ \operatorname{cov}(x_1, x_2) = \operatorname{mean}(x_1 x_2) - \operatorname{mean}(x_1)\,\operatorname{mean}(x_2) \qquad (5.7) $$
For gaussian source signals the independent components can be estimated only up to
an orthogonal transformation, i.e., A cannot be identified exactly for gaussian source
signals. Intuitively, non-gaussianity can therefore be taken as an indication of independence.
The covariance of two statistically independent variables is always zero, but the converse
is not always true. Only in the case of jointly gaussian variables does zero covariance mean
independence. According to the central limit theorem, a sum of independent variables tends to be
more gaussian than the original variables. For example
$$ X_i = A_1 S_1 + A_2 S_2 \qquad (5.8) $$
is more gaussian than 'S1' and 'S2'. The central limit theorem therefore suggests that if we
can find signals that are as far from gaussian as possible, they will be the independent signals.
So one method to find an independent component is to maximize the non-gaussianity of
'W^T X'.
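The remark that zero covariance does not imply independence can be checked with a small example. The sketch below evaluates equation (5.7) for a variable and its square; the data are made up purely for illustration.

```python
import numpy as np

def cov(x1, x2):
    """Covariance as in eq. (5.7): mean(x1 * x2) - mean(x1) * mean(x2)."""
    return np.mean(x1 * x2) - np.mean(x1) * np.mean(x2)

x1 = np.linspace(-1.0, 1.0, 2001)   # values symmetric about zero
x2 = x1 ** 2                        # completely determined by x1, hence not independent

c = cov(x1, x2)                     # numerically zero because mean(x1**3) and mean(x1) vanish
```

Here 'x2' is a deterministic function of 'x1', yet the covariance vanishes, which is why uncorrelatedness alone is too weak a criterion for separating non-gaussian sources.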
Non-gaussianity is thus one way to measure the independence of a signal. To use
non-gaussianity as a measure of independence we need a quantitative measure of it. Kurtosis
and negentropy, explained below, give quantitative measures of non-gaussianity.
Kurtosis
This is a classical measure of non-gaussianity. It is defined as
$$ \operatorname{kurt}(y) = E\{y^4\} - 3\,(E\{y^2\})^2 \qquad (5.9) $$
where E{} is the expected value. The mathematical expectation E{y}
is the mean, or first moment, of 'y'.
If we assume that 'y' has been preprocessed to have zero mean and unit variance, then
$$ \operatorname{kurt}(y) = E\{y^4\} - 3 \qquad (5.10) $$
So kurtosis is a normalized version of the fourth moment. For a gaussian variable
$$ E\{y^4\} = 3\,(E\{y^2\})^2 \qquad (5.11) $$
so the kurtosis of a gaussian variable is zero. Non-zero kurtosis implies that the data is
non-gaussian. Random variables that have negative kurtosis are called sub-gaussian and those
with positive kurtosis are called super-gaussian. Generally non-gaussianity is measured using
the absolute value of the kurtosis. The square of the kurtosis can also be used.
The drawback of kurtosis is that its statistical significance is poor, because its value can
depend on only a few observations in the tails, which may be outliers. So this measure is not
robust enough for independent component analysis.
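The following sketch estimates the kurtosis of equation (5.9) from samples, and also illustrates the outlier sensitivity mentioned above. The three test distributions, the sample size and the single injected outlier are assumptions chosen only for this example.

```python
import numpy as np

def kurt(y):
    """Sample estimate of eq. (5.9), computed on the centered data."""
    y = y - np.mean(y)
    return np.mean(y ** 4) - 3.0 * np.mean(y ** 2) ** 2

rng = np.random.default_rng(0)
n = 200_000
gauss = rng.standard_normal(n)
uniform = rng.uniform(-np.sqrt(3), np.sqrt(3), n)   # unit variance, sub-gaussian
laplace = rng.laplace(0.0, 1 / np.sqrt(2), n)       # unit variance, super-gaussian

k_gauss, k_uniform, k_laplace = kurt(gauss), kurt(uniform), kurt(laplace)

# Outlier sensitivity: one extreme value out of 200,000 shifts the estimate strongly.
contaminated = gauss.copy()
contaminated[0] = 30.0
k_contaminated = kurt(contaminated)
```

The gaussian sample gives a value near zero, the uniform sample a negative value and the Laplace sample a positive value, while the single outlier moves the gaussian estimate far away from zero.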
Negentropy
This measure is based on the information-theoretic quantity of differential entropy. The entropy
of a random variable can be interpreted as the degree of information that observation of the
variable gives. For a discrete random variable 'y' the entropy 'H' is given by
$$ H(y) = -\sum_i P(Y = a_i)\,\log P(Y = a_i) \qquad (5.12) $$
where the a_i are the possible values of Y.
A fundamental result of information theory is that a gaussian variable has the
largest entropy among all random variables of equal variance.
Negentropy is the difference between the entropy of a gaussian signal 'ygauss', with zero mean
and the same variance as 'y', and the entropy of the signal 'y' itself. The further the
distribution of 'y' is from gaussian, the larger the negentropy. So we can measure
non-gaussianity using negentropy, which is
$$ J(y) = H(y_{gauss}) - H(y) \qquad (5.13) $$
Negentropy is therefore always non-negative, and it is zero only if 'y' is gaussian. As far as
statistical properties are concerned, negentropy is the optimal estimator of non-gaussianity.
This measure is robust. Its drawback is that it is computationally difficult to evaluate.
Approximations of Negentropy
To overcome the drawbacks of kurtosis and of the exact negentropy calculation, approximations
of negentropy are used. The classical method approximates
negentropy using higher-order moments:
$$ J(y) \approx \frac{1}{12}\,E\{y^3\}^2 + \frac{1}{48}\,\operatorname{kurt}(y)^2 \qquad (5.14) $$
This method suffers from the same non-robustness as kurtosis. To overcome
this we use
$$ J(y) \approx \rho\,[E\{G_i(y)\} - E\{G_i(y_{gauss})\}]^2 \qquad (5.15) $$
where ρ is a positive constant and G_i is one of the following functions:
$$ G_1(y) = \frac{1}{a_1}\,\log\cosh(a_1 y) \qquad (5.16) $$
$$ G_2(y) = -\frac{1}{a_2}\,\exp(-a_2 y^2/2) \qquad (5.17) $$
$$ G_3(y) = \frac{y^4}{4} \qquad (5.18) $$
where 1 ≤ a1 ≤ 2 and a2 ≈ 1. G1 is good for general purposes, G2 is good for super-gaussian
signals and G3 is good for sub-gaussian signals. These approximations give a good
compromise between the properties of kurtosis and negentropy, and the method is robust.
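As an illustration of equation (5.15) with the contrast function G1 of equation (5.16), the sketch below estimates the approximation from samples. Setting ρ = 1, standardizing the input, and estimating E{G(ygauss)} by Monte Carlo from gaussian samples are all assumptions of this example rather than details given in the text.

```python
import numpy as np

def G1(y, a1=1.0):
    """Contrast function of eq. (5.16)."""
    return np.log(np.cosh(a1 * y)) / a1

def negentropy_approx(y, n_ref=500_000, seed=0):
    """Eq. (5.15) with rho = 1 (assumed); E{G1(y_gauss)} is estimated by Monte Carlo."""
    y = (y - np.mean(y)) / np.std(y)                        # zero mean, unit variance
    y_gauss = np.random.default_rng(seed).standard_normal(n_ref)
    return (np.mean(G1(y)) - np.mean(G1(y_gauss))) ** 2

rng = np.random.default_rng(1)
J_gauss = negentropy_approx(rng.standard_normal(100_000))
J_uniform = negentropy_approx(rng.uniform(-1.0, 1.0, 100_000))   # sub-gaussian
J_laplace = negentropy_approx(rng.laplace(size=100_000))         # super-gaussian
```

The gaussian sample yields a value very close to zero, while both the sub-gaussian and the super-gaussian samples yield clearly larger values, which is the behaviour expected from a non-gaussianity measure.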
5.2.3 Fast ICA Algorithm
FastICA methods are available for all the statistical criteria explained previously.
Since this thesis uses non-gaussianity as the measure of independence, only the
corresponding algorithm is explained here. In this method negentropy is used as the cost
function, and the algorithm tries to find the component for which the cost function is maximized.
The negentropy is given by
$$ J(y) = H(y_{gauss}) - H(y) \qquad (5.19) $$
where 'ygauss' is a gaussian random variable with zero mean and the same variance as the
output 'y', and H(.) denotes the differential entropy.
A flexible and reliable approximation of negentropy is given by Hyvärinen [57], as shown
in equation (5.15):
$$ J(y) \approx \rho\,[E\{G(y)\} - E\{G(y_{gauss})\}]^2 \qquad (5.20) $$
G(.) is a non-quadratic function, given by equations (5.16), (5.17) and (5.18).
If the algorithm is used to generate only one independent component, then the
independent component obtained is the one with maximum negentropy.
The algorithm aims to find a weight vector 'W' and updates it after each iteration by a
learning rule, in a direction such that the non-gaussianity of 'W^T X' is maximized.
The basic steps in FastICA are [58]
1.) Choose an initial weight vector W
2.) $W^{+} = E\{X\,g(W^T X)\} - E\{g'(W^T X)\}\,W$
3.) $W = W^{+} / \|W^{+}\|$
4.) If not converged go to 2
Here g is the derivative of the function G, given by
$$ g_1(u) = G_1'(u) = \tanh(a_1 u); \qquad g_2(u) = G_2'(u) = u\,e^{-u^2/2} \qquad (5.21) $$
where 1 ≤ a1 ≤ 2; often a1 = 1 is used.
Convergence means that the old and new values of 'W' point in the same direction, i.e., the
absolute value of their dot product is close to one. The algorithm can be extended to several
units. In that case, to avoid convergence to the same output, the outputs
W1^T X, ..., Wn^T X must be de-correlated after
every iteration. The de-correlation is performed using the deflationary orthogonalization
explained above.
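The four basic steps and the deflationary de-correlation can be put together in a short sketch. This is an illustrative NumPy version, not the implementation used in this thesis: the whitening step, the tanh non-linearity with a1 = 1, the tolerance, the random initialization and the toy square-wave and sawtooth sources are all assumptions made for the example.

```python
import numpy as np

def whiten(X):
    """Center the rows of X and transform them to have identity covariance."""
    X = X - X.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(np.cov(X))
    return (E / np.sqrt(d)) @ E.T @ X

def fastica_deflation(X, n_components, a1=1.0, max_iter=200, tol=1e-8, seed=0):
    """One-unit FastICA (steps 1 to 4) repeated with deflationary orthogonalization."""
    rng = np.random.default_rng(seed)
    Z = whiten(X)
    W = np.zeros((n_components, Z.shape[0]))
    for p in range(n_components):
        w = rng.standard_normal(Z.shape[0])
        w /= np.linalg.norm(w)                                   # step 1
        for _ in range(max_iter):
            y = w @ Z
            g = np.tanh(a1 * y)
            g_prime = a1 * (1.0 - g ** 2)
            w_new = (Z * g).mean(axis=1) - g_prime.mean() * w    # step 2
            w_new -= W[:p].T @ (W[:p] @ w_new)                   # deflation, eq. (5.4)
            w_new /= np.linalg.norm(w_new)                       # step 3
            converged = abs(abs(w_new @ w) - 1.0) < tol          # same direction up to sign
            w = w_new
            if converged:                                        # step 4
                break
        W[p] = w
    return W @ Z

# Toy data: two non-gaussian sources mixed into two observed signals.
t = np.linspace(0.0, 1.0, 2000)
S = np.vstack([np.sign(np.sin(2 * np.pi * 7 * t)),    # square wave
               (13 * t) % 1.0 - 0.5])                 # sawtooth
A = np.array([[1.0, 0.5],
              [0.4, 1.0]])
Y = fastica_deflation(A @ S, n_components=2)
```

Each row of `Y` is an estimate of one source, recovered only up to order, sign and scale, which is the ambiguity of unconstrained ICA discussed later in this chapter.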
Using this algorithm the given data can be decomposed into the original source
signals. The accuracy of the result depends on the number of signals
supplied to the algorithm. In general, accurate results can be obtained only if
the number of signals used is greater than or equal to the number of source signals
present in the samples. The number of source signals is equal to the number
of significant eigenvalues of the samples.
To get a clear understanding of the algorithm, three samples were generated from the
explosive signals and used to verify its results. Each of the three samples
contains every source signal, in different proportions. When the FastICA algorithm is applied
to the generated data, the results obtained are as shown in Figure 5.1. The
algorithm is capable of recovering the original source signals.
Figure 5.1: (a) Sources used to generate samples, (b) results obtained
The algorithm discussed so far is unconstrained ICA. From Figure 5.1 it is
clearly visible that the results do not come in any specific order and that the magnitude of the
results is not controllable. These drawbacks can be overcome using the constrained ICA
method [59] discussed below.
The two ambiguities of the above algorithm that can be taken care of are discussed
below.
We cannot determine variances (energies) of independent components
Since both S and A are unknown, any scalar multiplier of a source Si can always be
compensated by dividing the corresponding column Ai of A by the same scalar. We can
overcome this by fixing the magnitude of the independent components, constraining them
to have unit variance, E{Si^2} = 1. This still leaves the
ambiguity of sign unresolved.
The unit-variance requirement can be formulated as
$$ \text{maximize } J(y) \qquad (5.22) $$
$$ \text{subject to } h(W) = [h_1(W), \ldots, h_M(W)]^T = 0 $$
where h_i(W_i) = W_i^T W_i − 1 for i = 1, 2, ..., M, which requires every row of the W matrix
to have unit norm.
This can be solved using the Lagrange multiplier method [60].
We cannot determine order of independent components
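Since both S and A are unknown, we can write
$$ X = A P^{-1} P S \qquad (5.23) $$
where P is a permutation matrix. PS contains the same independent components in a different
order, and AP^{-1} is simply another unknown mixing matrix. Since inserting P^{-1}P does
not make any difference to X, the order of the components is ambiguous.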
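Both ambiguities can be verified numerically. In the sketch below the sources, the mixing matrix, the permutation and the scaling factors are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.uniform(-1.0, 1.0, size=(3, 500))   # three sources
A = rng.standard_normal((4, 3))             # mixing matrix, four observed signals

P = np.eye(3)[[2, 0, 1]]                    # permutation matrix: reorders the sources
D = np.diag([2.0, -0.5, 4.0])               # rescaling, including one sign flip

X = A @ S
X_perm = (A @ np.linalg.inv(P)) @ (P @ S)       # eq. (5.23)
X_scaled = (A @ np.linalg.inv(D)) @ (D @ S)     # variance (and sign) ambiguity
```

All three products give the same observed data, so from X alone the order, the scale and the sign of the sources cannot be determined without additional constraints.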
Using the constrained ICA method, the independent components can be ordered in a descending
manner according to a certain statistical measure defined by Ɫ(u). This can be formulated as
$$ \text{maximize } J(y) \qquad (5.24) $$
$$ \text{subject to } g(W) \le 0, \quad g(W) = [g_1(W), \ldots, g_M(W)]^T $$
where g_i(W) = Ɫ(u_{i+1}) − Ɫ(u_i), and Ɫ(u) is an index given by some statistical
measure of 'u', such as variance or normalized kurtosis.
Again, this can be solved using the Lagrange multiplier method [60].
Either of the above constraints can be incorporated into the FastICA algorithm, and the
results are obtained accordingly. The algorithms discussed so far can be used when
we are trying to find all the original source signals, or a few dominant signals, present in the
data. But we are in search of an algorithm that can detect the
presence of a target signal specified by us. With the algorithms above we would have to apply
further processing, such as correlation or another signal similarity technique, to all the
signals obtained, in order to decide whether any of them is one of the targeted signals. This is
computationally inefficient in practical cases where the number of target signals is large.
ICA with reference is an algorithm that takes information about the target signal as
input and can say whether that signal is present in the data.
5.3 ICA with Reference
In many practical applications only one targeted signal needs to be found.
This problem is addressed by Wei Lu and Jagath C. Rajapakse in their ICA with
reference papers [61-62]. We use this algorithm, and it is discussed in detail in
this section.
The motivation of ICA with reference (ICA-R) is to perform the separation of independent
sources and the selection of the desired source simultaneously, in a single stage. For ICA with
a single reference we start with the available data and try to converge towards a 'y' that is
close to the reference given by us.
$$ y = W^T X \qquad (5.25) $$
Here we optimize 'W' such that the signal obtained is non-gaussian and is close
to the given reference signal. The algorithm has two main goals:
1.) The output is one of the independent components present in the input signal.
2.) The extracted independent component must be close to the reference according to some
distance criterion.
We can now define ICA-R as a constrained problem with the aim of maximizing the
negentropy of equation (5.13), approximated as
$$ J(y) \approx \rho\,[E\{G(W^T X)\} - E\{G(y_{gauss})\}]^2 \qquad (5.26) $$
subject to the constraints
$$ g(y) \le 0, \qquad h(y) = E\{y^2\} - 1 = 0 \qquad (5.27) $$
where g(y) = ε(y, r) − ξ ≤ 0 measures the closeness of the obtained signal 'y' to the
reference signal 'r', with ε(y, r) the distance measure and ξ the closeness threshold, and
h(y) is introduced to ensure that W and J(y) are bounded.
If we run the algorithm with just the cost function and the h(y) constraint, the
algorithm has M solutions, of which (M−1) are local. By introducing the constraint
g(y), the algorithm converges towards the global optimum solution.
To turn the inequality constraint into an equality constraint we introduce a slack
variable z:
$$ g(y) \le 0 \;\longrightarrow\; g(y) + z^2 = 0 \qquad (5.28) $$
We now use the Lagrange multiplier method [40] to obtain the optimal solution for the
ICA-R algorithm. The augmented Lagrangian function is given by
$$ L_1(W, \mu, \lambda, z) = J(y) - \mu\{g(y) + z^2\} - \tfrac{1}{2}\gamma\,\|g(y) + z^2\|^2 - \lambda h(y) - \tfrac{1}{2}\gamma\,\|h(y)\|^2 \qquad (5.29) $$
where µ and λ are the Lagrange multipliers for the constraints g(y) and h(y),
γ is the scalar penalty parameter,
||.|| denotes the Euclidean norm,
and the ½γ||.||² terms are included to ensure that the optimization problem satisfies
the local convexity assumption [60].
Because the optimization of the Lagrangian function with respect to z can be done for a
given W, the function is first optimized with respect to z. At an extremum
the first derivative of the function with respect to z is equal to
zero. Using this we obtain the value of z as shown below:
$$ \frac{dL_1}{dz} = -2\mu z - \gamma\,(g(y) + z^2)\,2z = 0 \qquad (5.30) $$
$$ z^2 = \max\{0,\, -(\mu/\gamma + g(y))\} = \frac{1}{2\gamma}\,\max\{0,\, -(2\mu + 2\gamma g(y))\} \qquad (5.31) $$
Substituting the value of z² and J(y) into equation (5.29) gives the Lagrangian
function
$$ L_1(W, \mu, \lambda) = \rho\,[E\{G(W^T X)\} - E\{G(y_{gauss})\}]^2 - \mu\Big\{g(y) + \frac{1}{2\gamma}\max\{0, -(2\mu + 2\gamma g(y))\}\Big\} - \tfrac{1}{2}\gamma\,\Big\|g(y) + \frac{1}{2\gamma}\max\{0, -(2\mu + 2\gamma g(y))\}\Big\|^2 - \lambda h(y) - \tfrac{1}{2}\gamma\,\|h(y)\|^2 \qquad (5.32) $$
which simplifies to
$$ L_1(w, \mu, \lambda) = \rho\,[E\{G(W^T X)\} - E\{G(y_{gauss})\}]^2 - \frac{1}{2\gamma}\big[\max{}^2\{\mu + \gamma g(w),\, 0\} - \mu^2\big] - \lambda h(w) - \tfrac{1}{2}\gamma\,\|h(w)\|^2 \qquad (5.33) $$
To find the maximum of L1 we use a Newton-like learning method:
$$ w_{k+1} = w_k - \eta\,\big(L''_{1,w_k^2}\big)^{-1} L'_{1,w_k} \qquad (5.34) $$
where k is the iteration index and
η is a positive learning rate that avoids uncertainty in convergence.
To obtain the maximum we have to calculate the first derivative of L1, which is
given by
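$$ \frac{dL_1}{dW} = \bar{\rho}\,E\{x\,G'_y(y)\} - \tfrac{1}{2}\mu\,E\{x\,g'_y(w)\} - \lambda\,E\{x y\} \qquad (5.35) $$
where \bar{\rho} = ±ρ. To simplify the calculations, the Hessian matrix is approximated
as
$$ \frac{d^2 L_1}{dw^2} = s(w)\,R_{xx} \qquad (5.36) $$
where
$$ s(w) = \bar{\rho}\,E\{G''_{y^2}(y)\} - \tfrac{1}{2}\mu\,E\{g''_{y^2}(w)\} - \lambda, \qquad R_{xx} = E\{x x^T\} $$
So equation (5.34) can now be written as
$$ w_{k+1} = w_k - \eta\,R_{xx}^{-1}\,L'_{1,w_k} / s(w_k) \qquad (5.37) $$
The optimum values of µ and λ are found iteratively using
$$ \mu_{k+1} = \max\{0,\, \mu_k + \gamma\,g(w_k)\} \qquad (5.38) $$
$$ \lambda_{k+1} = \lambda_k + \gamma\,h(w_k) \qquad (5.39) $$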
This optimization algorithm converges at an optimum point (w*, µ*, λ*) that
satisfies the first-order conditions
$$ L'_{1,w}(w^*, \mu^*, \lambda^*) = 0 \qquad (5.40) $$
$$ h(w^*) = 0 \qquad (5.41) $$
$$ g(w^*) \le 0 \qquad (5.42) $$
$$ \lambda^* > 0 \qquad (5.43) $$
$$ \mu^* \ge 0 \qquad (5.44) $$
$$ \mu^*\,g(w^*) = 0 \qquad (5.45) $$
The value of ξ is a critical parameter for the convergence of the algorithm. Generally the
algorithm starts with a low value of ξ, to avoid converging to a local optimum, and
then increases the value to reach the global optimum.
In this algorithm, the decision on whether the obtained signal is the targeted signal is
based on the number of loops. The algorithm checks two
conditions before stopping. First, if the signal obtained is close to the
reference according to the distance criterion, the algorithm stops. Second, if the
number of loops exceeds the maximum value, the algorithm stops. If the targeted
signal is present in the data, the algorithm converges within a small number of loops; if the
targeted signal is not present, the algorithm reaches the maximum number of loops
given by us.
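The closeness constraint, the multiplier update of equation (5.38) and the two stopping conditions can be sketched as below. The mean squared error is only one possible choice for ε(y, r) and is an assumption here, as are the function names and the synthetic sequence of outputs that stands in for the per-loop results of ICA-R.

```python
import numpy as np

def closeness(y, r):
    """Assumed distance measure eps(y, r): mean squared error between output and reference."""
    return np.mean((y - r) ** 2)

def g(y, r, xi):
    """Inequality constraint g(y) = eps(y, r) - xi; satisfied when g(y) <= 0."""
    return closeness(y, r) - xi

def update_mu(mu, g_value, gamma):
    """Multiplier update of eq. (5.38)."""
    return max(0.0, mu + gamma * g_value)

def target_present(outputs, r, xi, max_loops=200):
    """Apply the two stopping conditions to a sequence of per-loop outputs y."""
    loops = 0
    for y in outputs:
        if loops >= max_loops:        # second condition: loop budget exhausted
            break
        loops += 1
        if g(y, r, xi) <= 0:          # first condition: close enough to the reference
            return True, loops
    return False, loops

# Synthetic illustration: outputs that approach the reference, and outputs that never do.
r = np.sin(np.linspace(0.0, 2 * np.pi, 200))
approaching = [r + 1.0 / k for k in range(1, 300)]
never_close = [r + 1.0 for _ in range(300)]

found, loops_found = target_present(approaching, r, xi=0.02)
missing, loops_missing = target_present(never_close, r, xi=0.02)
```

In the first case the loop stops early and reports the target as present; in the second it runs to the loop limit of two hundred and reports the target as absent.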
The drawback of this algorithm, which can be overcome, is that it is computationally
expensive.
5.4 Results of Independent Component Analysis with reference
General independent component analysis is a method used to decompose the
given samples into their base source signals. As in the previous method, the data we have
here is a 900×200 matrix with each sample signal as a row. To
implement independent component analysis with reference and obtain accurate results,
the number of samples must be equal to or greater than the number of source signals
originally present in the samples.
The number of source signals can be determined from the number of significant
eigenvalues. Based on this we decide the number of samples that should be passed
to the algorithm. First the result of ICA-R is shown, and then the application of the
algorithm to explosive detection is discussed.
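Counting significant eigenvalues can be sketched as follows. The relative threshold `rel_tol`, the noiseless synthetic mixture and the function name `count_sources` are assumptions for illustration; with noisy measured data the threshold has to be chosen with more care.

```python
import numpy as np

def count_sources(X, rel_tol=1e-6):
    """Count eigenvalues of the sample covariance larger than rel_tol times the largest one."""
    eigvals = np.linalg.eigvalsh(np.cov(X))
    return int(np.sum(eigvals > rel_tol * eigvals.max()))

rng = np.random.default_rng(1)
S = rng.uniform(-1.0, 1.0, size=(3, 200))    # three hypothetical source signals, 200 points each
A = rng.uniform(0.1, 1.0, size=(6, 3))       # six samples, each a mixture of the sources
n_sources = count_sources(A @ S)
```

Although six samples are supplied, only three eigenvalues are significant, so at least three samples would have to be passed to the ICA algorithm for this data.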
To verify the results of this algorithm, we randomly generate a few samples
such that the number of samples is greater than or equal to the number of source signals.
Here I have generated six samples, with RDX in one sample and other source signals in the
remaining samples. The aim of this algorithm is to extract the signal for which a priori
information is available to us, i.e., the signal 'y' that
is close to the reference signal we have given.
We start with an initial guess for 'W' and try to maximize the
negentropy of the output, under the constraint that the signal obtained is as close as
possible to the reference given by us. The other constraint keeps the
magnitude of the output in the range of the original source signals used to generate the
samples.
The algorithm stops when the output obtained is close to the
reference signal or when the number of iterations exceeds a prefixed value, two hundred in
our case.
The output obtained for the example explained above is shown in Figure 5.2.
Figure 5.2: Result of ICA-R: (a) Reference signal given by us (b) Result obtained
5.4.1 First Stage of Processing
To use ICA-R to detect explosives, we should be able to identify the
pixels that are likely to contain explosives. From the explanation of ICA-R
it is clear that if we pass several samples into ICA-R, we cannot say which of those
samples contains the explosive. Yet we cannot pass an individual sample into the
algorithm either, because independent component analysis requires the number of samples to be
greater than or equal to the number of source signals. One possibility is to
pass each individual sample into the algorithm along with a few other known samples,
generated manually, such that the algorithm converges to the solution only if
the sample from the original data contains the explosive. This approach guarantees the
solution, but it is highly time consuming: to identify the presence of an
explosive in the 900-pixel plot, the ICA-R algorithm has to be run 900 times.
Hence we use a preprocessing stage in which the entire plot is divided into square grids,
and we perform ICA-R on each grid to first decide whether there is any chance that the grid
contains an explosive in its samples. The division into grids is shown in Figure 5.3.
Figure 5.3: Grid layout of the plot
Each of these grids is passed into ICA-R for one explosive at a time, so ICA-R is
run 100 times for the detection of each explosive. If at least one pixel in the grid contains
the explosive, the algorithm indicates the presence of the explosive and we pass all nine
pixels of that grid to the next stage.
As shown in Fig (3.2), the generated data contains three different explosives: RDX,
TNT and DNT. So this stage runs the ICA-R algorithm 300 times, 100 times
for each explosive. This stage detects all the grids that have at least one pixel with
explosive content in it.
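The grid division can be sketched as below. The sketch assumes that the 900 pixels form a 30×30 field stored row by row in the 900×200 data matrix and that each grid is a 3×3 block; the storage order and the function name `grid_blocks` are assumptions, and the zero matrix only stands in for the measured data.

```python
import numpy as np

def grid_blocks(n_rows=30, n_cols=30, block=3):
    """Yield (grid_index, pixel_indices) for every block x block grid of the field."""
    pixel_id = np.arange(n_rows * n_cols).reshape(n_rows, n_cols)
    index = 0
    for r in range(0, n_rows, block):
        for c in range(0, n_cols, block):
            yield index, pixel_id[r:r + block, c:c + block].ravel()
            index += 1

grids = list(grid_blocks())
data = np.zeros((900, 200))                 # placeholder for the 900 x 200 sample matrix
first_grid_samples = data[grids[0][1]]      # the nine spectra passed to ICA-R for grid 0
```

This gives 100 grids of nine pixels each, which matches the 100 ICA-R runs per explosive described above.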
The output after the first stage when RDX is used as reference is shown in Figure 5.4.
Figure 5.4: Output of ICA-R after first stage for RDX as reference
The output after the first stage when TNT is used as reference is shown in Figure 5.5.
Figure 5.5: Output of ICA-R after first stage for TNT as reference
[Plots for Figures 5.4 and 5.5: each titled "Field with pixels after first stage", x axis and y axis from 0 to 30]
The output after the first stage when DNT is used as reference is shown in Figure 5.6.
Figure 5.6: Output of ICA-R after first stage for DNT as reference
The combined output for all three explosives is shown in Figure 5.7.
Figure 5.7: Final output after first stage
[Plots for Figures 5.6 and 5.7: each titled "Field with pixels after first stage", x axis and y axis from 0 to 30]
All these pixels are selected because at least one pixel in their grid is likely to contain
an explosive. But not all of these pixels will actually contain explosives.
So further processing is needed to identify only those pixels that truly
contain an explosive.
5.4.3 Final Stage
This stage takes the output of the first stage as its input. We now have to confirm for each
pixel individually whether it contains explosive traces, so the information of every pixel has
to be passed into ICA-R. But we cannot send an individual pixel into
ICA-R, because the algorithm requires the number of samples to be equal to or greater than the
number of source signals in the original sample.
Because we are not sure about the number of source signals in the sample, it is better to use
a somewhat larger number of samples than to restrict it to one or two. In this case I
have taken six other randomly generated samples along with the suspected sample
and sent all seven samples into the ICA-R algorithm. Here again the
algorithm has to be run for each explosive individually. However, since we know which
reference signal caused each suspected pixel to be selected in the first stage, we can
pass only the pixels corresponding to a given explosive to the algorithm with the matching
reference in the second stage. In that way we save some of the computational effort
required.
The output using RDX as reference is shown in Figure 5.8.
Figure 5.8: Output of second stage using RDX as reference
The output using TNT as reference is shown in Figure 5.9.
Figure 5.9: Output of second stage using TNT as reference
[Plots for Figures 5.8 and 5.9: each titled "Finally detected pixels", x axis and y axis from 0 to 30]
The output using DNT as reference is shown in Figure 5.10.
Figure 5.10: Output of second stage using DNT as reference
The final output combining all the results is shown in Figure 5.11.
Figure 5.11: Finally detected samples
[Plots for Figures 5.10 and 5.11: each titled "Finally detected pixels", x axis and y axis from 0 to 30]
The algorithm is successful in determining the exact pixels where explosive traces are
present. But the drawback of this algorithm is that it involves a lot of computational effort.
In a practical case, where the explosive database is really large, using ICA-R
individually for each explosive will consume a lot of time before the final
result is revealed.
5.5 Fast algorithm for one-unit ICA-R
This algorithm is presented as an extension of the previous ICA-R by Qiu-Hua Lin, Yong-Rui
Zheng, Fu-Liang Yin, Hualou Liang and Vince D. Calhoun [42]. The authors suggest
modifications to the ICA-R algorithm that reduce its computational
complexity. The complexity is reduced by
1. Pre-whitening the observed signals and
2. Normalizing the weight vector.
There are two reasons for the reduction in complexity. First, from equation
(5.37), the learning rule for the weight vector requires computing the inverse of the
covariance matrix. Removing this step reduces the computational effort, and it
can be achieved by whitening the centered data. Second,
the method explained previously has two constraints, of which g(y) measures
the closeness of the signal to the reference and h(y) ensures that J(y) and w are
bounded. The task of h(y) can be accomplished by normalizing the weight vector instead
of having a constraint.
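A minimal sketch of both simplifications follows: after whitening, the covariance matrix of the data is the identity, so no matrix inverse is needed in the weight update, and a unit-norm weight vector automatically gives a unit-variance output. The eigendecomposition-based whitening and the synthetic Laplace-distributed data are assumptions of this example.

```python
import numpy as np

def whiten(X):
    """Center each row of X and transform the data so that Rxx = E{x x^T} = I."""
    Xc = X - X.mean(axis=1, keepdims=True)
    Rxx = Xc @ Xc.T / Xc.shape[1]
    d, E = np.linalg.eigh(Rxx)
    V = E @ np.diag(d ** -0.5) @ E.T        # whitening matrix Rxx^(-1/2)
    return V @ Xc

rng = np.random.default_rng(2)
A = np.eye(4) + 0.3                          # illustrative, well-conditioned mixing matrix
X = A @ rng.laplace(size=(4, 1000))
Z = whiten(X)
Rzz = Z @ Z.T / Z.shape[1]                   # covariance of the whitened data

w = rng.standard_normal(4)
w /= np.linalg.norm(w)                       # weight normalization
y = w @ Z                                    # output with E{y^2} = w^T Rzz w = 1
```

Because `Rzz` is the identity and `w` has unit norm, the output already satisfies E{y²} = 1, which is exactly what the constraint h(y) enforced in the original ICA-R.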
The problem is now to maximize the negentropy, which is given by
$$ J(y) \approx \rho\,[E\{G(W^T X)\} - E\{G(y_{gauss})\}]^2 \qquad (5.45) $$
subject to the constraint
$$ g(y) \le 0 \qquad (5.46) $$
where g(y) = ε(y, r) − ξ ≤ 0 measures the closeness of the obtained signal to the
reference signal.
The initial mixed data 'X' is pre-whitened. As in the previous case, we introduce a slack
variable to turn the inequality constraint into an equality constraint:
$$ g(y) \le 0 \;\longrightarrow\; g(y) + z^2 = 0 \qquad (5.47) $$
We now use the Lagrange multiplier method [40] to obtain the optimal solution for the
fast ICA-R algorithm. The Lagrangian function is given by
$$ L_1(W, \mu, z) = J(y) - \mu\{g(y) + z^2\} - \tfrac{1}{2}\gamma\,\|g(y) + z^2\|^2 \qquad (5.48) $$
Because the optimization of the Lagrangian function with respect to z can be done for a
given W, the function is first optimized with respect to z. At an extremum
the first derivative of the function with respect to z is equal to
zero. Using this we obtain the value of z as shown below:
$$ \frac{dL_1}{dz} = -2\mu z - \gamma\,(g(y) + z^2)\,2z = 0 \qquad (5.49) $$
$$ z^2 = \max\{0,\, -(\mu/\gamma + g(y))\} = \frac{1}{2\gamma}\,\max\{0,\, -(2\mu + 2\gamma g(y))\} \qquad (5.50) $$
Substituting the value of z² and J(y) into equation (5.48) gives the Lagrangian
function
$$ L_1(W, \mu) = \rho\,[E\{G(W^T X)\} - E\{G(y_{gauss})\}]^2 - \mu\Big\{g(y) + \frac{1}{2\gamma}\max\{0, -(2\mu + 2\gamma g(y))\}\Big\} - \tfrac{1}{2}\gamma\,\Big\|g(y) + \frac{1}{2\gamma}\max\{0, -(2\mu + 2\gamma g(y))\}\Big\|^2 \qquad (5.51) $$
which simplifies to
$$ L_1(w, \mu) = \rho\,[E\{G(W^T X)\} - E\{G(y_{gauss})\}]^2 - \frac{1}{2\gamma}\big[\max{}^2\{\mu + \gamma g(w),\, 0\} - \mu^2\big] \qquad (5.52) $$
To find the maximum of L1 we use a Newton-like learning method, which is
$$ w_{k+1} = w_k - \eta\,\big(L''_{1,w_k^2}\big)^{-1} L'_{1,w_k} \qquad (5.53) $$
where k is the iteration index and
η is a positive learning rate that avoids uncertainty in convergence.
To obtain the maximum we have to calculate the first derivative of L1, which is
given by
$$ \frac{dL_1}{dW} = \bar{\rho}\,E\{x\,G'_y(y)\} - \tfrac{1}{2}\mu\,E\{x\,g'_y(w)\} \qquad (5.54) $$
where \bar{\rho} = ±ρ. To simplify the calculations, the Hessian matrix is approximated
as
$$ \frac{d^2 L_1}{dw^2} = s(w)\,R_{xx} = s(w)\,I \qquad (5.55) $$
since X is whitened data, for which Rxx = I, and where
$$ s(w) = \bar{\rho}\,E\{G''_{y^2}(y)\} - \tfrac{1}{2}\mu\,E\{g''_{y^2}(w)\} $$
So equation (5.53) can now be written as
$$ w_{k+1} = w_k - \eta\,R_{xx}^{-1}\,L'_{1,w_k} / s(w_k) = w_k - \eta\,L'_{1,w_k} / s(w_k) \qquad (5.56) $$
The weight vector is normalized after each iteration:
$$ w_{k+1} = w_{k+1} / \|w_{k+1}\| \qquad (5.57) $$
The optimum value of µ is found iteratively using
$$ \mu_{k+1} = \max\{0,\, \mu_k + \gamma\,g(w_k)\} \qquad (5.58) $$
This optimization algorithm converges at an optimum point (w*, µ*) that
satisfies the first-order conditions
$$ L'_{1,w}(w^*, \mu^*) = 0 \qquad (5.59) $$
$$ g(w^*) \le 0 \qquad (5.60) $$
$$ \mu^* \ge 0 \qquad (5.61) $$
$$ \mu^*\,g(w^*) = 0 \qquad (5.62) $$
The convergence conditions are the same as in the previous case. This algorithm gives
results similar to the previous case but with less computational effort.
5.6 ICA with Multi-Reference
The ICA-R problem can easily be extended to the multi-reference case. The problem of
ICA with multiple references can be stated as
$$ \text{maximize } \sum_{i=1}^{l} J(y_i) \qquad (5.63) $$
$$ \text{subject to } g(y) \le 0, \quad h(y) = 0 $$
where l is the number of desired independent sources to be extracted,
$$ g(y) = (g_1(y_1), g_2(y_2), \ldots, g_l(y_l))^T \quad \text{with} \quad g_i(y_i) = \varepsilon_i(y_i, r_i) - \xi_i $$
$$ h(y) = (h_{11}(y_1), h_{12}(y_1, y_2), \ldots, h_{1l}(y_1, y_l), h_{21}(y_2, y_1), \ldots, h_{ll}(y_l)) $$
with
$$ h_{ij}(y_i, y_j) = (E\{y_i y_j\})^2 = 0 \quad \text{for all } i, j = 1, 2, \ldots, l,\ i \ne j $$
$$ h_{ii}(y_i) = (E\{y_i^2\} - 1)^2 = 0 \quad \text{for all } i = 1, 2, \ldots, l $$
Here h(y) includes the constraint that bounds the signals as well as the uncorrelatedness
constraint. The additional constraint of uncorrelatedness is introduced in order to get
different ICs as the outputs.
The Lagrangian function is similar to the previous case, with µ, λ and g(y) as vectors
and h(y) as a matrix:
$$ L_2(W, \mu, \lambda, z) = J(y) - \mu^T\{g(y) + z^2\} - \tfrac{1}{2}\gamma\,\|g(y) + z^2\|^2 - \lambda^T h(y) - \tfrac{1}{2}\gamma\,\|h(y)\|^2 \qquad (5.64) $$
where µ = (µ_1, µ_2, ..., µ_l)^T and λ = (λ_1, λ_2, ..., λ_l)^T.
As in the previous case, the Lagrangian function is first optimized with respect to z_i,
and the equations are as shown below:
$$ \frac{dL_2}{dz_i} = -2\mu_i z_i - \gamma\,(g_i(y_i) + z_i^2)\,2z_i = 0 \qquad (5.65) $$
$$ z_i^2 = \max\{0,\, -(\mu_i/\gamma + g_i(y_i))\} = \frac{1}{2\gamma}\,\max\{0,\, -(2\mu_i + 2\gamma g_i(y_i))\} \qquad (5.66) $$
Using these equations we can write the augmented Lagrangian function as
$$ L_2(W, \mu, \lambda) = \sum_{i=1}^{l}\Big\{ J(y_i) - \frac{\max{}^2\{\mu_i + \gamma g_i(w_i),\, 0\} - \mu_i^2}{2\gamma} \Big\} - \lambda^T h(W) - \tfrac{1}{2}\gamma\,\|h(W)\|^2 \qquad (5.67) $$
The Newton-like learning algorithm is again used, and it is
$$ w_{k+1} = w_k - \eta\,s(W)\,L'_{2,w}\,R_{xx}^{-1} \qquad (5.68) $$
The results obtained using both ICA with reference and ICA with multiple references are
accurate, and the required source signals can be extracted accurately.
5.7 Results of Independent Component Analysis with multi reference
This is similar to ICA-R, but here the reference is not a single source signal but a
group of source signals together. The reference is a
matrix whose rows are the different explosive signals for which we have information.
Again, we have the information about X, and the algorithm should
converge to a solution such that the signals obtained are as close as possible to the
reference signals considered.
We start with an initial guess for W and then maximize the
negentropy function, under the constraint that the outputs obtained are as close as possible to
the reference signals we have considered, while the other constraint
restricts the magnitude of the output.
Because the main emphasis of this thesis is to detect the presence of an explosive,
not to find which explosive it is, we can use a convergence condition such that if
the algorithm converges to at least one of the source signals, we stop any further
iterations and confirm that there is an explosive trace in the sample.
First, to verify the result of ICA-mR, I have generated six samples such that one sample
contains RDX along with noise, one contains TNT along with noise, one contains DNT
along with noise, and the others contain random signals along with noise. The generated
data thus contains all three explosives RDX, TNT and DNT. When the algorithm is
run on this data, it converges to the solution shown in Figure 5.12.
Figure: 5. 12 Result of ICA-mR
5.7.1 First Stage using ICA-mR
If we use this algorithm for the first stage, as explained in 6.3.2, it can be used to
detect whether each grid possibly contains any of the explosives. If any pixel in the
grid has at least one possible explosive in it, then that grid is considered to have
explosive traces in it.
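The grid-level screening can be sketched as follows. The sketch only shows how a field is split into grids and how the pixels of flagged grids are collected for the next stage. The detector is passed in as a callable (`grid_has_explosive`, a hypothetical name) that stands in for running ICA-mR on one grid, and the grid size is likewise an assumption.

```python
def grid_blocks(n_rows, n_cols, grid_size):
    """Yield the list of (row, col) pixel coordinates of each grid, grid by grid."""
    for r0 in range(0, n_rows, grid_size):
        for c0 in range(0, n_cols, grid_size):
            yield [(r, c)
                   for r in range(r0, min(r0 + grid_size, n_rows))
                   for c in range(c0, min(c0 + grid_size, n_cols))]


def suspect_pixels(n_rows, n_cols, grid_size, grid_has_explosive):
    """Collect the pixels of every grid that the detector flags."""
    suspects = []
    for block in grid_blocks(n_rows, n_cols, grid_size):
        if grid_has_explosive(block):  # stand-in for the ICA-mR test on one grid
            suspects.extend(block)
    return suspects


if __name__ == "__main__":
    explosive_at = {(7, 8)}  # toy ground truth used only by the stand-in detector
    detector = lambda block: any(p in explosive_at for p in block)
    pixels = suspect_pixels(30, 30, 5, detector)
    print(len(pixels), (7, 8) in pixels)
```

Only the pixels returned here need to be examined individually in the final stage.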
The output of this algorithm for the data generated previously is shown below.
Figure: 5. 13 Result of first stage of ICA-mR
5.7.2 Final Stage using ICA-mR
This stage takes the output of the first stage as its input. We now have to confirm for
each pixel individually whether it has explosive traces in it, so the information of each
pixel has to be passed into ICA-mR. But we cannot send the information of a single pixel
alone into ICA-mR, because the algorithm requires the number of samples to be equal to or
greater than the number of source signals in the original sample.
Because we are not sure about the number of source signals in the sample, it is better to
use a somewhat larger number of samples than to constrain it to one or two. In this case I
have considered six other randomly generated samples along with the suspected sample
and sent the information of all seven samples into the ICA-mR algorithm.
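Assembling the seven-sample input can be sketched as below. The thesis does not state how the six additional samples are drawn, so the uniform distribution, the seed and the function name `build_ica_input` are assumptions made only for illustration.

```python
import random


def build_ica_input(suspect_spectrum, n_extra=6, seed=0):
    """Stack the suspected pixel spectrum with n_extra random samples.

    Returns a list of rows: row 0 is the suspected pixel and the remaining
    rows are randomly generated samples of the same length.
    """
    rng = random.Random(seed)
    n_freq = len(suspect_spectrum)
    extra = [[rng.uniform(0.0, 1.0) for _ in range(n_freq)]
             for _ in range(n_extra)]
    return [list(suspect_spectrum)] + extra


if __name__ == "__main__":
    pixel = [0.1, 0.4, 0.9, 0.3, 0.2]  # toy spectrum for one suspected pixel
    X = build_ica_input(pixel)
    print(len(X), len(X[0]))  # number of samples, number of frequency points
```

With six extra rows the number of samples is seven, which satisfies the requirement as long as the pixel contains at most seven source signals.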
[Figure 5.13 plot: "Field with pixels after first stage"; x axis vs. y axis, both 0 to 30.]
The results obtained using this algorithm are shown below.
Figure: 5. 14 Result of final stage of ICA-mR
5.8 Conclusions
The independent component analysis algorithm is successfully implemented, and valid
reasons for the selection of the algorithm are provided. The Fast ICA algorithm is
implemented and its drawbacks are discussed. To overcome these, the ICA-R algorithm is
used. This algorithm involves a lot of computational effort, which is reduced by
pre-whitening the data and normalizing the weight vector; the result is FastICA-R.
Finally, ICA with multiple references is successfully implemented and used for
explosive detection.
[Figure 5.14 plot: "Finally detected pixels"; x axis vs. y axis, both 0 to 30.]
CHAPTER 6
TWO STAGE PROCESS
Independent component analysis and component spatial and spectral pattern analysis
algorithms are discussed in the previous chapters. Each of them has certain pros and
cons. This chapter first compares the two methods and then proposes a two stage process
that is less time consuming than independent component analysis and more trustworthy
than component spatial and spectral pattern analysis.
6.1 Comparison
The component spatial and spectral pattern analysis algorithm gives spatial and spectral
information along with the probability of each signal at each spatial location, whereas
ICA provides only spectral and spatial information. Component pattern analysis provides
spatial information for the entire data at once, whereas with ICA we have to run the
algorithm once for every pixel to learn the spatial location. The size of the data does
not significantly influence the computational time of component pattern analysis, whereas
for ICA the computational time depends on the number of pixels.
ICA can be implemented using a single reference, whereas component pattern analysis
cannot. In the case of ICA-R, no further processing is required to confirm that the
source signal obtained is the same as the reference. For both ICA-mR and component
pattern analysis, further processing is required to decide whether a source signal
obtained is one of the reference signals.
The ICA method is more accurate than component pattern analysis. ICA can detect even
small traces of explosives present in the sample: since the ICA algorithm is run on one
pixel at a time, pixels with very low traces can also be identified. Component pattern
analysis, by contrast, is run on the entire data, so the spatial information is found
from the magnitude of the intensity matrix [P]. Hence there are chances of false
negatives if the threshold value selected in this method is large.
Component pattern analysis does not require any pre-whitening. For ICA, pre-whitening
improves the computational speed of the algorithm and, if necessary, we can also reduce
the dimension of the data before passing it to the ICA algorithm.
Considering all these advantages and disadvantages, a two stage process is proposed
which utilizes the advantages of both methods.
6.2 First stage
In this stage we perform component spatial and spectral pattern analysis on the entire
set of samples. As explained in the previous chapter, this method is fast, and the
location of the explosive can be detected by running the algorithm just once for data of
any size.
Similar to the method explained in chapter 4, the number of significant components in
the field is estimated first, and then all the significant components and their
corresponding probability matrices are obtained using component spatial and spectral
pattern analysis. Since we are using component spatial and spectral pattern analysis
with reference, the order of the components is maintained, so we can correlate each
component obtained with the reference we have and decide from the correlation output
whether the estimated component is an explosive. If it is, we then threshold the
probability matrix to decide which pixels contain the explosive.
The data used to verify the algorithm is the same as in the previous cases and is shown
in figure 6.1 below. The algorithm estimated all three explosive components and their
corresponding rows in the [P] matrix. The results after thresholding [P] for each
explosive are similar to those shown in chapter 4. The combined result for all three
explosives is shown in figure 6.2.
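The thresholding and combination steps of the first stage can be sketched as follows. This is a minimal illustration under stated assumptions: [P] is stored as a list of rows with one row per estimated component and one entry per pixel, the correlation test has already produced a flag per component (`is_explosive`), and the threshold value is hypothetical.

```python
def detect_pixels(P, is_explosive, threshold):
    """Union of thresholded [P] rows over the components flagged as explosive.

    Returns one boolean per pixel: True if any explosive component reaches
    the threshold at that pixel.
    """
    n_pixels = len(P[0])
    detected = [False] * n_pixels
    for row, flag in zip(P, is_explosive):
        if not flag:
            continue  # component did not correlate with any reference
        for j, value in enumerate(row):
            if value >= threshold:
                detected[j] = True
    return detected


def as_field(flat, n_cols):
    """Reshape a flat per-pixel list into rows of n_cols pixels for plotting."""
    return [flat[i:i + n_cols] for i in range(0, len(flat), n_cols)]


if __name__ == "__main__":
    P = [[0.9, 0.1, 0.0, 0.2],   # component 1 (explosive)
         [0.0, 0.8, 0.1, 0.1],   # component 2 (explosive)
         [0.0, 0.0, 0.95, 0.0]]  # component 3 (not an explosive)
    mask = detect_pixels(P, [True, True, False], threshold=0.5)
    print(as_field(mask, 2))
```

A larger threshold makes the union smaller, which is why pixels with weak traces can be missed in this stage.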
Figure: 6. 1 Originally generated pixels
Figure: 6. 2 Output of first stage component pattern analysis
[Figure 6.1 plot: "Field with all original given pixels"; x axis vs. y axis, both 0 to 30.]
[Figure 6.2 plot: "Output of First Stage-Component Spectral and Spatial Pattern Analysis"; x axis vs. y axis, both 0 to 30.]
6.3 Second stage
In this stage we perform independent component analysis on all the pixels detected in
the first stage along with their neighborhood pixels. This is done because the first
stage fails to detect a pixel only if the pixel has a minute trace of explosive or if
the threshold in the first stage is large. Both happen only when the percentage of
explosive content is small, which is generally the case for the boundary pixels of the
explosive location. So by including the neighborhood pixels of the previously detected
pixels, we can ensure that all the pixels that have explosive traces in them are
finally detected.
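Adding the neighborhood pixels to the first-stage mask can be sketched as below. The thesis does not specify the neighborhood, so the 3-by-3 window (the eight surrounding pixels) and the function name `add_neighbors` are assumptions made for this illustration.

```python
def add_neighbors(mask):
    """Return a mask that also marks the 8 neighbors of every detected pixel."""
    rows, cols = len(mask), len(mask[0])
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:  # stay inside the field
                        out[rr][cc] = True
    return out


if __name__ == "__main__":
    first_stage = [[False] * 4 for _ in range(4)]
    first_stage[0][0] = True  # a single detected pixel in the corner
    expanded = add_neighbors(first_stage)
    print(sum(v for row in expanded for v in row))  # pixels passed to stage two
```

Every pixel marked in the expanded mask is then passed to the ICA check, so weak boundary pixels missed by the threshold still get examined.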
Figure: 6. 3 Pixels being passed to the second stage, enclosed in black boxes
As explained previously independent component analysis has the capability to detect
even the minute traces of the explosive. So ICA will be an appropriate method for the
final confirmation about exact location of explosive.
[Figure 6.3 plot: "Output of First Stage-Component Spectral and Spatial Pattern Analysis"; x axis vs. y axis, both 0 to 30.]
Similar to the method explained in chapter 5, along with each pixel suspected in the
first stage to have explosive, we add a few samples that do not have any explosive
traces and then use this as the data for the independent component analysis algorithm.
The number of additional samples is decided based on the number of significant
components in the original pixel. If independent component analysis now detects an
explosive in the input data, we can conclude that the original pixel contains explosive,
because the other samples given as input do not have any traces of explosive.
Figure: 6. 4 Output of final stage
On this data the method detects the presence of explosives accurately, without any false
positives.
[Figure 6.4 plot: "Finally detected pixels"; x axis vs. y axis, both 0 to 30.]
6.4 Time comparison
The table below compares the time taken by all the methods explained in chapters 4 and 5,
together with the two stage process.
Table 6: 1 Time Comparison
Method                                                            Time taken (s)
Component Spectral and Spatial pattern analysis                   19.404
Component Spectral and Spatial pattern analysis with reference    14.074
ICA-R                                                             64.337
Fast ICA-R                                                        60.349
ICA-mR                                                            164.270
Two Stage Process                                                 23.484
Component spatial pattern analysis is much faster than ICA. But when accuracy is
required we have to use ICA, which is time consuming. After an in-depth comparison of
the two methods we arrived at a two stage process which is computationally affordable
and is as accurate as Fast ICA-R.
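The timings in Table 6.1 can be turned into ratios relative to the two stage process. The short sketch below does only that arithmetic; the times are copied from the table, and the helper name `ratio_to_two_stage` is hypothetical.

```python
# Times in seconds, copied from Table 6.1.
TIMES_SEC = {
    "Component Spectral and Spatial pattern analysis": 19.404,
    "Component Spectral and Spatial pattern analysis with reference": 14.074,
    "ICA-R": 64.337,
    "Fast ICA-R": 60.349,
    "ICA-mR": 164.270,
    "Two Stage Process": 23.484,
}


def ratio_to_two_stage(method):
    """Time of the given method divided by the time of the two stage process."""
    return TIMES_SEC[method] / TIMES_SEC["Two Stage Process"]


if __name__ == "__main__":
    for name in TIMES_SEC:
        print(f"{name}: {ratio_to_two_stage(name):.2f}x")
```

On these numbers the two stage process takes roughly 0.4 times as long as Fast ICA-R and about one seventh as long as ICA-mR, while being about 1.7 times slower than component pattern analysis with reference alone.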
6.5 Need for deterministic signals
In all the cases explained so far in this chapter, the samples are generated on the
assumption that a pixel containing explosive has only one explosive in it. In practical
cases, however, more than one explosive might reside in the same area.
To take these cases into consideration, samples are generated as in the previous case
but with two explosives in the same pixel. In the figure shown below, pixels in red have
RDX and TNT, pixels in blue have TNT and DNT, and pixels in magenta have DNT and RDX.
Figure: 6. 5 Pixels generated
Component spatial and spectral pattern analysis and ICA are now used to find the exact
location of the explosives, and the results obtained are discussed below. The result of
component spatial and spectral pattern analysis is shown first. The matrix [S], which
contains the source signals to which the algorithm converged, is shown below.
[Figure 6.5 plot: "Field with all original given pixels"; x axis vs. y axis, both 0 to 30.]
Figure: 6. 6 (a) Explosives used to generate samples, (b) [S] matrix obtained
The values of columns of [P] when plotted as 30*30 field and their corresponding
threshold output values are shown below
Figure: 6. 7 [P] corresponding to first source signal
Figure: 6. 8 [P] corresponding to second source signal
Figure: 6. 9 [P] corresponding to third source signal
Because we are not concerned with the explosive type and only the exact location of the
explosive matters, the combined output is shown below.
Figure: 6. 10 Final result obtained
The result of independent component analysis is shown below. The output of the first
stage is as shown below.
Figure: 6. 11 Output of first stage of ICA
[Figure 6.10 plot: "Output of Component Spectral and Spatial Pattern Analysis"; x axis vs. y axis, both 0 to 30.]
[Figure 6.11 plot: "Field with pixels after first stage"; x axis vs. y axis, both 0 to 30.]
The final result of the independent component analysis is as shown below
Figure: 6. 12 Final output of ICA
From the results we can say that the component spatial and spectral pattern analysis
method works well when a pixel has more than one explosive, but only if we choose a
lower threshold value for the [P] matrix. The results of independent component analysis
are poor in this case. A possible reason is the overlap of peak positions in the
different explosive spectra.
So using deterministic signals, i.e., considering only a range of frequencies where the
peak positions of different explosives might not overlap, might give a better result.
For this purpose we considered the first 60 frequency samples of all the explosives and
other source signals to generate the samples, and again used both algorithms to obtain
the results.
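The band restriction can be illustrated with a small sketch. The default of 60 samples follows the text; the local-maximum peak test and the function names are assumptions introduced only to show how truncating the spectra can remove overlapping peak positions.

```python
def restrict_band(spectra, n_keep=60):
    """Keep only the first n_keep frequency samples of every spectrum."""
    return [list(s[:n_keep]) for s in spectra]


def peak_positions(spectrum):
    """Indices of simple local maxima (strictly larger than both neighbors)."""
    return {i for i in range(1, len(spectrum) - 1)
            if spectrum[i] > spectrum[i - 1] and spectrum[i] > spectrum[i + 1]}


def shared_peaks(a, b):
    """Sorted peak positions that two spectra have in common."""
    return sorted(peak_positions(a) & peak_positions(b))


if __name__ == "__main__":
    # Toy spectra: the peaks at index 4 coincide, the earlier peaks do not.
    a = [0, 1, 0, 0, 2, 0]
    b = [0, 0, 1, 0, 3, 0]
    print(shared_peaks(a, b))
    a_cut, b_cut = restrict_band([a, b], n_keep=4)
    print(shared_peaks(a_cut, b_cut))
```

In this toy case the restricted band keeps one distinct peak per spectrum and drops the shared one, which is the situation the first 60 frequency samples are intended to approximate.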
The result of component spatial and spectral pattern analysis is shown first. The matrix
[S], which contains the source signals to which the algorithm converged, is shown
below.
[Figure 6.12 plot: "Finally detected pixels"; x axis vs. y axis, both 0 to 30.]
Figure: 6. 13 (a) Source signal spectra used to generate samples, (b) Source signals obtained
The values of columns of [P] when plotted as 30*30 field and their corresponding
threshold output values are shown below
Figure: 6. 14 [P] corresponding to first source signal
Figure: 6. 15 [P] corresponding to second source signal
Figure: 6. 16 [P] corresponding to third source signal
Because we are not concerned with the explosive type and only the exact location of the
explosive matters, the combined output is shown below.
Figure: 6. 17 Final result obtained
The result of independent component analysis is shown below. The output of the first
stage is as shown below.
Figure: 6. 18 Result of first stage of ICA
[Figure 6.17 plot: "Output of Component Spectral and Spatial Pattern Analysis"; x axis vs. y axis, both 0 to 30.]
[Figure 6.18 plot: "Field with pixels after first stage"; x axis vs. y axis, both 0 to 30.]
The final result of the independent component analysis is as shown below
Figure: 6. 19 Result of final stage of ICA
The results indicate that, when there is more than one explosive at the same pixel,
using deterministic spectral information gives better results.
6.6 Conclusions
The two stage procedure gave good results. It has the accuracy of independent component
analysis and is computationally fast like the component spatial pattern analysis
technique. This chapter also presents the need for deterministic signals. As the
terahertz region is significant for its distinct absorption or reflectance peaks, proper
use of this information gives better results, as explained.
[Figure 6.19 plot: "Finally detected pixels"; x axis vs. y axis, both 0 to 30.]
CHAPTER 7
CONCLUSIONS AND FUTURE WORK
This thesis attempted to obtain a technique for explosive detection. First, two different
approaches, component analysis of spatial and spectral patterns and independent
component analysis, are successfully implemented. Later, by considering different
factors, we developed a two stage combined process which gives accurate results and
takes little time.
The component spatial and spectral pattern analysis implemented can obtain the spectral
information of the underlying signals in a given multispectral image and their
corresponding spatial information. This algorithm needs further processing to confirm
whether a spectrum obtained is one from the database that contains explosives; this is
done using correlation. A threshold value is also required for the spatial matrices to
decide which of the pixels contain explosive. Since the database is large, the
correlation stage involves a lot of computation. To overcome this, component spatial and
spectral pattern analysis with reference is implemented. This algorithm is
computationally faster in the final processing stage.
Independent component analysis is implemented as a technique that can extract, from the
source signals present in the data, the one closest to the given reference signal. This
is a two stage process. In the first stage the entire plot is divided into grids and the
information of each grid is passed into the ICA-R algorithm. All the grids detected in
the first stage are passed to the second stage. In this stage the information of each
pixel is combined with a few other known signals that do not have any explosive, and the
ICA-R algorithm is run. This finally detects the pixels that have explosive traces. But
this is time consuming, so it is computationally improved by pre-whitening the data and
normalizing the weight vector, which makes it faster than the previous method.
Independent component analysis with multiple references is also successfully
implemented in a similar manner.
Both these methods are combined to obtain a two stage process which is more accurate
and computationally better. In the first stage, component spatial and spectral pattern
analysis is implemented with a moderate threshold value. Pixels with low traces of
explosive may be missed in this stage of detection. So in the second stage the border
pixels, along with the pixels detected in the first stage, are verified by ICA-R pixel
by pixel. This method gave good results without any false alarms.
When there is more than one explosive in the same pixel or in the same location, only
deterministic signals give good results with the above methods.
In this thesis we have concentrated on using THz spectroscopy for explosive detection.
But THz signals have their own limitations, which must be overcome to design an
efficient method. In recent years a lot of research has been done on data fusion
techniques. These use the information from more than one sensor and try to exploit the
advantages of each to arrive at more efficient techniques.
The component spatial and spectral pattern analysis method can be improved further so
that the additional processing needed to confirm that a spectrum obtained is an
explosive can be eliminated.
REFERENCES
[1] Jacqueline MacDonald and J.R. Lockwood,”Alternatives for landmine detection,”
RAND Publications, Santa Monica, CA, 2003.
[2] Rob Siegel,”Landmine Detection,” IEEE Instrum. Meas. Mag., vol. 5, no. 4, Dec
2002.
[3] Committee on the Review of Existing and Potential Standoff Explosives Detection
Techinques & National Research Council,“Existing and Potential standoff explosives
detection techniques,” National Academies Press, Washington, D.C., 2004.
[4] A. Giles Davies et al.,“Thz spectroscopy of explosives and drugs,” Materials
Today, vol 11, no. 3, pp. 18-26, March 2008.
[5] Committee on Assessment of Security Technologies for Transportation, National
Research Council, “Assessment of millimeter wave and THz technology for detection
and identification of concealed explosives and weapons,” National Academies Press,
Washington, D.C., 2007.
[6] P. Kužel, Laboratory of Terahertz Spectroscopy, Prague [Online], Available:
http://department.fzu.cz/lts
[7] M.Acheroy,“Mine action: status of sensor technology for close-in and remote
detection of antipersonnel mines”, Proc. of the 3rd Int. Workshop on Advanced
Ground Penetrating Radar, (IWAGPR 2005), pp. 3 - 13, May 2005.
[8] Geneva International Center on Humanitarian Demining [Online], Available:
http://www.gichd.org
[9] P. Druyts et al., ” Usefulness of semi-automatic tools for airborne minefield
detection,” In CLAWAR'98, Brussels, Belgium, pp.241-248, November 1998.
[10] Marc Acheroy� and Idesbald van den Bosch†, ”Humanitarian demining: sensor
technology status and signal processing aspects,” Proc of GDR Ondes (Invited
paper), 2003.
[11] Allen, S.J , ”Terahertz dynamics in semiconductor quantum structures,” Infrared
and Millimeter Waves, 2002. Conf. Digest. Twenty Seventh Int. Conf., 2002, pp 11-
12.
Texas Tech University, Kusum Yarlagadda, December 2010
105
[12] Herault J and Ans B, “ Neural Network with modifiable synapses- Decoding of
Composite Sensory Messages Under Unsupervised and Permanent Learning,”
Comptes Rendus De L Academie Des Sciences Serie Iii-Sciences De La Vie-Life
Sciences, vol. 299, no. 13, pp. 525-528, 1984.
[13] ANS, B., J. H´ERAULT, and C. JUTTEN, “Adaptive neural architectures Detection
of primitives,” Proc. of COGNITIVA’85, pp. 593–597, 1985.
[14] S.-I. Amari,”Estimating functions of independent component analysis for temporally
correlated signals,” Neural Computation, vol. 12, no. 9, pp. 2083-2107, 2000.
[15] J.-F. Cardoso,” Blind identification of independent signals,” In Proc. Workshop on
Higher-Order Spectral Analysis, Vail, Colorado, 1989, pp. 157-160.
[16] P. Comon,” Seperation of stochastic processes,” In Proc. Workshop on Higher-
Order Spectral Analysis, Vail, Colorado, 1989, pp. 174-179.
[17] J.-F.Cardoso and A. Souloumiac,”Blind beamforming for non Gaussian signals”,
IEE Proceedings-F, vol. 140, no. 6, pp. 362-370, 1993.
[18] J.-L. Lacoume and P. Ruiz,”Sources identification: a solution based on cumulants,”
In Proc. IEEE ASSP Workshop, Minneapolis, Minnesota, pp. 199-203, 1988.
[19] A. Cichocki and L. Moszczynski,”A new learning algorithm for blind separation of
sources,” Electronics Letters, vol. 28, no. 21, pp. 1986-1987, 1992.
[20] A. Cichocki and R. Unbehauen,”Robust neural networks with on-line learning for
blind identification and blind separation of sources”, IEEE Trans. On Circuits and
Syst, vol. 43, no. 11, pp. 894-906, 1996.
[21] A. Cichocki et al.,”Robust learning algorithm for blind separation of signals,”
Electronics Letters, vol. 30, no. 17, pp. 1386-1387, 1994.
[22] G. Burel,”Blind separation of sources: a nonlinear neural algorithm,” Neural
networks, vol. 5, no. 6, pp. 937-947, 1992.
[23] J.-P. Nadal and N. Parga,”Non-linear neurons in the low noise limit: a factorial code
maximizes information transfer,” Network, vol. 5, no. 4, pp. 565-581, 1994.
[24] E. Oja et al., Learning in nonlinear constrained Hebbian networks,” In Proc. Int.
Conf. on Artificial Neural Networks (ICANN’91), Espoo, Finland, 1991, pp. 385-390.
[25] J. Karhunen and J. Joutsensalo,”Representation and separation of signals using
nonlinear PCA type learning,“ Neural Network, vol. 7, no. 1, pp. 113-127, 1994.
Texas Tech University, Kusum Yarlagadda, December 2010
106
[26] A. J. Bell and T.J. Sejnowski,”A non-linear information maximization algorithm that
performs blind separation,” In Advances in Neural Information Processing Systems 7,
pp. 467-474. The MIT Press, Cambridge, MA, 1995.
[27] A. J. Bell and T.J. Sejnowski,”An information-maximization approach to blind
separation and blind deconvolution,”Neural Computation, vol. 7, pp. 1129-1159,
1995.
[28] A. Hyvӓrinen and E. Oja,“A fast fixed-point algorithm for independent component
analysis,” Neural Computation, vol. 9, no. 7, pp. 1483-1492, 1997.
[29] A. Hyvӓrinen,“A family of fixed-point algorithms for independent component
analysis,” In Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing
(ICASSP’97), Munich, Germany, pp. 3917-3920, 1997.
[30] A. Hyvӓrinen, “Fast and robust fixed-point algorithms for independent component
analysis,” IEEE Trans. Neural Net., vol. 10, no. 3, pp. 626-634, 1999.
[31] J.-F. Cardoso and A. Souloumia, “Blind beamforming for non Gaussian
signals,” IEE Proceedings-F, vol. 140, no. 6, pp. 362–370, 1993.
[32] J.-F. Cardoso, “Source separation using higher order moments,” Proc. Int. Conf.
Acoust. Speech Signal Process., vol. 4, pp. 2109–2112, May 1989.
[33] S.-I. Amari et al.,“ A new learning algorithm for blind source separation,” In
Advances in Neural Information Processing Systems 8, pp 757-763. MIT Press,
Cambridge, MA, 1996.
[34] E.J. Heilweil and M. Campbell, THz Spectral Database[Online], Available:
http://webbook.nist.gov/chemistry/thz-ir/
[35] X.-C. Zhang et al.,“THz Diffuse Reflectance Spectra of Selected Explosives and
Related compounds”, Proc. SPIE, vol. 5790, no.19, 2005.
[36] S.Kawata, K.Sasaki, and S.Minami, “Component analysis of spatial and spectral
patterns in multispectral images. I. Basis,” J.Opt.Soc.Am.A, vol. 4, no. 11, pp. 2101-
2106, 1987.
[37] K.Sasaki, S.Kawata, and S.Minami, “Component analysis of spatial and spectral
patterns in multispectral images. II. Entropy minimization,” J.Opt.Soc.Am.A vol. 6,
no. 1, pp. 73-79, 1989.
[38] Naes T and Risvik E. (Eds.), “Multivariate analysis of data in sensory science”,
Elsevier, New York, 1996.
Texas Tech University, Kusum Yarlagadda, December 2010
107
[39] C. T. Chen,”Introduction to Linear System Theory,” Holt and Rinehart and Winston,
New York, 1970.
[40] Benjamin W. Wah and Yixin Chen,“Solving Large-Scale Nonlinear Programming
Problems by Constraint Partitioning,” Proc. of the Principles and Practice of
Constraint Programming, vol. 3709, pp. 697-711, 2005.
[41] Watanabe, S and Kaminuma, T, “Recent developments of the minimum entropy
algorithm,” Proc. 9th Int. Conf. Pattern Recognition (1CPR), Rome, vol. 1, pp. 536-
540, 1988.
[42] H.Akaike, “A new look at the statistical model identification,” IEEE Trans. Autom.
Contro., Vol. 19, no. 6, pp. 716-723, 1974.
[43] Shuonan Dong, “Methods of Constrained Optimization,” MIT, Cambridge, MA,
May 2006.
[44] J.Kowalik and M.R. Osborne,“Method for Unconstrained Optimization Problems,”
American Elsevier, New York, 1968.
[45] Heinz Daniel C and Chang Chein-I, “Fully Constrained Least Squares Linear
Spectral Mixture Analysis Method for Material Quantification in Hyperspectral
Imagery,” IEEE Trans. Geosci. Remote Sens., vol. 39, no. 3, March 2001.
[46] Almeida T. I. R. and De Souza C. R. Filho 2004 “Principal component analysis
applied to feature-oriented band ratios of hyperspectral data: a tool for vegetation
studies,” INT. J. Remote Sens., vol. 25, no. 22, pp. 5005–5023, 20 November, 2004.
[47] Hyvärinen Aapo and Oja Erkki. 2000, “Independent Component Analysis:
Algorithms and Applications,” Neural Networks, vol. 13, no. 4-5, pp. 411-430, 2000.
[48] Hyvarinen Aapo and Oja Erkki, 1997, “A Fast Fixed Point Algorithm for
Independent Component Analysis,” Neural Computations, vol. 9, pp. 1483-1492,
1997.
[49] Jos´e M. P. Nascimento and Jos´e M. B. Dias,“Does Independent Component
Analysis Play a Role in Unmixing Hyperspectral Data?,” IEEE Trans. Geosci.
Remote Sens., vol. 43, no. 1, pp. 175-187, 2005.
[50] Keshava Nirmal,“A Survey of Spectral Unmixing Algorithms,” Lincoln Laboratory
J., vol. 14, no. 1, 2003.
[51] Cromp Robert F. (1998) et al.,“Analyzing Hyperspectral data with Independent
Component Analysis,” Proc. SPIE, vol. 3240, pp. 133-143 , 1998.
Texas Tech University, Kusum Yarlagadda, December 2010
108
[52] Hsuan Ren and CHEIN-I CHANG,“Automatic Spectral Target Recognition in
Hyperspectral Imagery,” IEEE Trans. Aerosp. Electron. Syst, vol. 39, pp. 1232-1249,
2003.
[53] Chein-I Chang and Clark Brumbley,“Kalman Filterinlg Approach to
Multispectral/Hyperspectral Image Classification,” IEEE Trans. Geosci. Remote
Sens., vol. 35, no. 1, pp. 319-330, 1999.
[54] Chein-I Chang and Clark M. Brumbley, “A Kalman Filtering Approach to
Multispectral Image Classification and Detection of Changes in Signature
Abundance,” IEEE Trans. Geosci. Remote Sens., vol. 37, no. 1, pp. 257-268, 1999.
[55] Lieven De Lathauwer et al.,”An introduction to Independent Component Analysis,”
J. Chemometr., vol. 14, no. 3, pp. 123–149, 2000.
[56] Pierre Comon,“Independent Component Analysis, A new concept,” Signal
Processing, Elsevier., vol. 36, no. 3, pp. 287-314, April 1994.
[57] Aapo Hyvarinen and Erkki Oja,”ICA: Algorithms and Applications”, Neural
Networks, Elsevier., vol. 13, no. 4-5, pp. 411-430, 2000.
[58] A. Hyvarinen and E. Oja,“A fast fixed-point algorithm for independent component
analysis,” Neural Computation, vol. 9, no. 7, pp. 1483-1492, 1997.
[59] Wei Lu and Jagath C. Rajapakse,“Constrained Independent Component analysis”,
Advances in Neural Information Processing Systems 13, vol. 10, pp. 570-576, 2000.
[60] D.P. Bertsekas,“Constrained Optimization and Lagrangian Multiplier Methods”,
Academic Press, New York, 1982.
[61] Wei Lu and Jagath C. Rajapakse,”ICA with Reference”, Neurocomputing, vol. 69,
no. 16-18, pp 2244-2257, October 2006.
[62] Qiu-Hua Lin et al., ”A fast algorithm for one-unit ICA-R,” Information Sciences,
vol. 177, no. 5, pp 1265-1275, March 2007.