
TECHNIQUES TO ANALYZE THE TERAHERTZ DATA FOR THE

DETECTION OF EXPLOSIVES

by

Kusum Yarlagadda, B.Tech

A Thesis

In

ELECTRICAL ENGINEERING

Submitted to the Graduate Faculty

of Texas Tech University in

Partial Fulfillment of

the Requirements for

the Degree of

MASTER OF SCIENCE

IN

ELECTRICAL ENGINEERING

Approved

Dr. Vittal Rao

Chairperson of the committee

Dr. Mohammad Saed

Accepted

Ralph Ferguson

Dean of the Graduate School

December, 2010

Texas Tech University, Kusum Yarlagadda, December 2010


ACKNOWLEDGEMENTS

I dedicate this Master's thesis to my parents, Radha Krishna Murthy Yarlagadda and Rama Leela Yarlagadda, and to my brother, Santosh Yarlagadda, who have supported and encouraged me throughout.

I express my sincere thanks to the chair of my committee, Dr. Vittal Rao, for giving me the opportunity to work on this project. His support and guidance throughout the project have been invaluable. I also thank my co-chair, Dr. Mohammad Saed, for the support and encouragement he provided toward the completion of this project.

I would also like to thank all my friends at Tech who have helped me a great deal during my stay in Lubbock.



TABLE OF CONTENTS

ACKNOWLEDGEMENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES

1. INTRODUCTION
    1.1 Landmines
    1.2 Need for standoff detection
    1.3 THz technology
    1.4 Analysis Methods
    1.5 Our Approach
    1.6 Thesis Organization

2. REVIEW OF LITERATURE
    2.1 Review of detection methodologies
        2.1.1 Prodders, seismic and acoustic sensors
        2.1.2 Electromagnetic Sensors
        2.1.3 Electro-Optic Sensors
        2.1.4 Other Explosive Detectors
    2.2 Terahertz data acquisition
        2.2.1 Sources and Detectors
        2.2.2 THz Spectroscopy and Imaging
    2.3 Analysis Techniques
        2.3.1 Background of component spatial and spectral pattern analysis
        2.3.2 Background of independent component analysis

3. SIMULATED TERAHERTZ DATA FOR EXPLOSIVES
    3.1 Introduction
    3.2 Source Signals
    3.3 Samples Generated

4. COMPONENT SPATIAL AND SPECTRAL PATTERN ANALYSIS
    4.1 Introduction
    4.2 Component spectral response and spatial pattern analysis algorithm
        4.2.1 Determining number of significant components
        4.2.2 Non-negative Constraint
        4.2.3 Feasible Solution of [T] using non-negativity constraints
        4.2.4 Optimal Estimation
    4.3 Results of component spatial and spectral pattern analysis
        4.3.1 Further Processing for confirmation
    4.4 Component spectral and spatial pattern analysis with reference
    4.5 Results when percentage of explosives varies in the samples
    4.6 Conclusions

5. INDEPENDENT COMPONENT ANALYSIS
    5.1 Introduction
    5.2 FastICA
        5.2.1 Criteria to choose the algorithm
        5.2.2 Estimation of Independence: Measure of non-gaussianity
        5.2.3 Fast ICA Algorithm
    5.3 ICA with Reference
    5.4 Results of Independent Component Analysis with reference
        5.4.1 First Stage of Processing
        5.4.3 Final Stage
    5.5 Fast algorithm for one-unit ICA-R
    5.6 ICA with Multi-Reference
    5.7 Results of Independent Component Analysis with multi reference
        5.7.1 First Stage using ICA-mR
        5.7.2 Final Stage using ICA-mR
    5.8 Conclusions

6. TWO STAGE PROCESS
    6.1 Comparison
    6.2 First stage
    6.3 Second stage
    6.4 Time comparison
    6.5 Need for deterministic signals
    6.6 Conclusions

7. CONCLUSIONS AND FUTURE WORK

REFERENCES



ABSTRACT

Improving landmine detection capability is a challenging technological problem. Most existing technologies still rely on metal detectors and probes, yet many plastic explosives cannot be detected with these methods. There is therefore a requirement for new detection technologies that exploit characteristics other than metal content. Technologies based on properties of the electromagnetic spectrum, the acoustics of mine casings, advanced prodders, and other chemical and biological methods are in the research phase. There is also a lot of ongoing research on sensor fusion techniques, which combine the advantages of more than one detection technique and are more promising than any individual technique.

This thesis attempts to utilize the unique properties of the electromagnetic spectrum for landmine detection. The terahertz gap of the electromagnetic spectrum, which had not been properly utilized until recently, has certain advantages for explosive detection: most explosives have unique spectral peaks in the terahertz frequency range. Two advanced algorithms, independent component analysis and component spatial and spectral pattern analysis, are implemented and improved for use in explosive detection. Further, an improved method that utilizes both algorithms is developed. The results obtained are promising, and the approach is capable of detecting explosives hidden under some background.



LIST OF TABLES

1.1 Summary of trends in component technology
2.1 Operational characteristics of sensors
4.1 False alarm of component spatial and spectral pattern analysis method
6.1 Time Comparison



LIST OF FIGURES

1.1 VS1.6 plastic AT mine, PMD6 wood AP mine, VS50 plastic AP mine, and M14 plastic AP mine. In the figure M14 is roughly two inches across
1.2 Electromagnetic Spectrum
2.1 Atmospheric attenuation at different frequencies
2.2 Different sources and detectors available in the electromagnetic spectrum
3.1 Diffuse reflectance and transmission spectra of different explosives
3.2 Spectra of different explosives
3.3 Field with pixels containing RDX in red, TNT in blue, DNT in magenta and pixels without any explosive in green
4.1 Illustration of multispectral images
4.2 Image model for multi-component patterns
4.3 (a) Original source signals and (b) Results obtained
4.4 (a) The original samples generated and (b), (c), (d) The rows of [P] arranged as a 30*30 plot
4.5 (a) First source signal obtained, (b) Its correlation with RDX, (c) Its correlation with TNT, (d) Its correlation with DNT
4.6 (a) Second source signal obtained, (b) Its correlation with RDX, (c) Its correlation with TNT, (d) Its correlation with DNT
4.7 (a) Third source signal obtained, (b) Its correlation with RDX, (c) Its correlation with TNT, (d) Its correlation with DNT
4.8 (a) First row of [P] as 30*30 plot (b) Finally detected pixels after thresholding
4.9 (a) Second row of [P] as 30*30 plot (b) Finally detected pixels after thresholding
4.10 (a) Third row of [P] as 30*30 plot (b) Finally detected pixels after thresholding
4.11 Result of [S] obtained using random value of [TO]
4.12 Result of [S] obtained using value of [TO] generated using [Sref]
4.13 Original samples generated
4.14 Finally detected pixels
4.15 (a) Samples generated, (b) First row of [P] as 30*30 plot, (c) Second row of [P] as 30*30 plot, (d) Third row of [P] as 30*30 plot
5.1 (a) Sources used to generate samples, (b) Results obtained
5.2 Result of ICA-R: (a) Reference signal given by us (b) Result obtained
5.3 Grid layout of the plot
5.4 Output of ICA-R after first stage for RDX as reference
5.5 Output of ICA-R after first stage for TNT as reference
5.6 Output of ICA-R after first stage for DNT as reference
5.7 Final output after first stage
5.8 Output of second stage using RDX as reference
5.9 Output of second stage using TNT as reference
5.10 Output of second stage using DNT as reference
5.11 Finally detected samples
5.12 Result of ICA-mR
5.13 Result of first stage of ICA-mR
5.14 Result of final stage of ICA-mR
6.1 Originally generated pixels
6.2 Output of first stage component pattern analysis
6.3 Pixels being passed to the second stage enclosed in black boxes
6.4 Output of final stage
6.5 Pixels generated
6.6 (a) Explosives used to generate samples, (b) [S] matrix obtained
6.7 [P] corresponding to first source signal
6.8 [P] corresponding to second source signal
6.9 [P] corresponding to third source signal
6.10 Final result obtained
6.11 Output of first stage of ICA
6.12 Final output of ICA
6.13 (a) Source signal spectra used to generate samples, (b) Source signals obtained
6.14 [P] corresponding to first source signal
6.15 [P] corresponding to second source signal
6.16 [P] corresponding to third source signal
6.17 Final result obtained
6.18 Result of first stage of ICA
6.19 Result of final stage of ICA



CHAPTER 1

INTRODUCTION

Recent terrorist attacks in several places have elevated national and international security concerns. Antipersonnel and antitank mines remain significant threats in many nations despite many programs by the United Nations and humanitarian organizations to clear them. These mines are inexpensive, costing as little as $3-$25 each, whereas clearing them costs $300-$1000 per mine. According to ICBL (2001), these mines claim 15,000-20,000 victims per year in over 90 countries. According to a 2001 U.S. State Department survey, there are 40-50 million mines to be cleared [1]. Worldwide, an estimated 100,000 mines are found and destroyed every year (Horowitz et al., 1996). At that rate, destroying the 40-50 million existing mines would take roughly 400-500 years. Some estimates say that around 1.9 million new mines are placed annually, which adds about 19 more years of clearance work for every year that passes (Horowitz et al., 1996).
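
The clearance-time estimate above is a single division of a stockpile by a yearly clearance rate. The following minimal Python sketch reproduces that arithmetic using only the stockpile and rate quoted in the text; the function name is hypothetical, and the model assumes a constant clearance rate with no newly laid mines.

```python
def clearance_years(mines_in_ground: float, cleared_per_year: float) -> float:
    """Years needed to clear a fixed stockpile at a constant clearance rate."""
    if cleared_per_year <= 0:
        raise ValueError("clearance rate must be positive")
    return mines_in_ground / cleared_per_year


# Figures quoted in the text: 40-50 million mines, about 100,000 cleared per year.
low = clearance_years(40e6, 100_000)
high = clearance_years(50e6, 100_000)
print(f"{low:.0f}-{high:.0f} years")
```

The same division applies to newly laid mines: every batch of new mines adds its count divided by the yearly clearance rate to the remaining work.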

These statistics indicate a great need for landmine detection technologies that are fast and efficient. This chapter gives a brief introduction to landmines, terahertz technology and the analysis techniques used in this thesis.

1.1 Landmines

The widespread use of landmines started during World War I. The deployment of the US Army in Bosnia in 1995 and in Afghanistan in 2001 gave the landmine issue a sense of urgency. Since then the US military has been investing heavily in landmine detection research.

Landmines come in varying shapes and sizes. They can be square, circular, cylindrical or bar shaped. Casings can be metallic, wooden or plastic. Based on their metal content, landmines are categorized as metal, low-metal or non-metallic.



Landmines are broadly classified as antitank mines and antipersonnel mines. Antitank mines are designed to destroy vehicles or to impede their motion. They are usually 6-14 in (15-35 cm) in size, are buried 16 in (40 cm) deep, and contain 5-10 kg of explosive material. Their casings can be metallic or plastic. Antipersonnel mines are designed to kill and maim people. They are usually 2-6 in (5-15 cm) in size, and their casings can be metallic, plastic or wooden. In practice, antipersonnel mines are cleared by dividing the area into 1 m grid squares and systematically checking each square. Both types of mines contain variable proportions of explosives.

Figure 1.1: VS1.6 plastic AT mine, PMD6 wood AP mine, VS50 plastic AP mine, and M14 plastic AP mine. In the figure, the M14 is roughly two inches across [2]

Some of the most widely used explosives are nitro-based compounds such as 2,4,6-trinitrotoluene (TNT), cyclotrimethylenetrinitramine (RDX), pentaerythritol tetranitrate (PETN) and nitroglycerin (NG). Plastic explosives consist of the explosive compound in pure molecular crystalline form mixed with other agents. The range and concentrations in the mines may vary, but the major ingredient is the explosive compound. A few plastic explosives in use are Metabel (PETN based), SX2 (RDX based), C-4 (RDX based), PBX (predominantly containing HMX) and Semtex H (containing RDX and PETN). TNT is widely used in antitank landmines, NG is used in dynamite, and RDX and PETN are used in the manufacture of plastic explosives [3]. Pipe bombs use black powder, and large vehicle bombs use ammonium nitrate fuel oil (ANFO). There are also explosives that contain no nitrogen, such as triacetone triperoxide (TATP).



For both antipersonnel and antitank mines there are methods for remediating them without detecting them individually. If all mines were metal cased, they could easily be detected using metal detectors, but the widespread use of plastic landmines necessitates additional detection technologies. Because there are no plastic detectors, these other technologies must focus on disturbances in the background, such as thermal, chemical, electromagnetic or dielectric anomalies. A few of the methods in use are discussed in Chapter 2.

1.2 Need for standoff detection

Some methods are efficient at landmine detection when the sensors or the personnel involved can be in close proximity to the field. In the real world, however, this is dangerous because it risks loss of life and property. Hence there is a lot of ongoing research on standoff detection. Remote detection is the situation in which the personnel involved are away from the field but the detection equipment moves in close proximity to the explosives. Standoff detection, which differs slightly from remote detection, is an active or passive detection technique in which vital assets and the individuals involved in the detection are kept outside the zone of severe damage in case an explosive detonates.

Traditionally, explosive detection is divided into two types: bulk detection and trace detection. In bulk detection, macroscopic amounts are detected using imaging or technologies that exploit nuclear properties. Standoff detection is possible in either of these cases. In trace detection, microscopic amounts such as vapor or particulates are detected using chemical sensors or animal olfaction. The best hope for standoff detection of particulates is a technology that analyzes the radiation emitted from explosives. This radiation can be active or passive. Bulk detection is easier using such radiation. X-ray imaging, thermal neutron activation, and techniques employing infrared, mm-wave and other electromagnetic radiation fall under this category.



Before deciding whether a standoff detection technique is effective, several factors must be considered:

- Interference with the signals from explosives by other signals from the background.
- The frequency of false alarms.
- The time required for detection, which must be very low (speed of detection is important when a threat is fast approaching).

Effective standoff detection must take into account the output of more than one sensor, because relying on a single sensor output increases the number of false alarms. However, the use of distributed sensors poses several challenges, such as communication between sensors, sensor sampling, data transfer, fusion of information, sensor fault detection, time to sample, detection decision making and deployment issues [4].

For standoff detection, parameters such as transmission capability, resolution and component performance also depend on the frequency and distance of operation, as summarized in Table 1.1.

Table 1.1: Summary of trends in component technology [5]

The different standoff detection techniques for landmine detection are discussed in Chapter 2. Since terahertz technology has certain advantages over the other methods for this particular landmine detection problem, a brief introduction to the technology is given below.



1.3 THz technology

Generally, frequencies from 0.3 to 10 THz (roughly 10-300 cm⁻¹) in the electromagnetic spectrum are considered the THz frequency range. This region has been little exploited so far because of the lack of convenient and suitable sources and detectors. On the microwave side of the spectrum, sources and detectors are difficult to produce because they require very short carrier transit times in the active region and, because of the low power the devices produce, small active regions to minimize their capacitance [3]. It is similarly difficult to produce sources and detectors on the optical side. Interband lasers exist that operate at visible and near-IR frequencies. Their working principle is that light is generated by radiative recombination of conduction band electrons with valence band holes across the band gap of the active material. This principle cannot be extended to the mid-IR or to longer wavelengths because of the lack of suitable narrow band gap semiconductors. Because of these difficulties and the cost involved, this band of the electromagnetic spectrum has been exploited very little.
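
The THz range is quoted above both in frequency and in wavenumber. A minimal Python sketch of the conversion (wavenumber = f / c, wavelength = c / f) is given here as a check; the function names are hypothetical and the speed of light is the only physical constant used.

```python
C_CM_PER_S = 2.99792458e10  # speed of light in cm/s


def thz_to_wavenumber(freq_thz: float) -> float:
    """Convert a frequency in THz to a wavenumber in cm^-1."""
    return freq_thz * 1e12 / C_CM_PER_S


def thz_to_wavelength_um(freq_thz: float) -> float:
    """Convert a frequency in THz to a free-space wavelength in micrometres."""
    return C_CM_PER_S / (freq_thz * 1e12) * 1e4  # cm -> um


for f in (0.3, 1.0, 10.0):
    print(f"{f:5.1f} THz = {thz_to_wavenumber(f):7.2f} cm^-1 "
          f"= {thz_to_wavelength_um(f):8.2f} um")
```

With these constants 0.3 THz corresponds to about 10 cm⁻¹ and 10 THz to about 334 cm⁻¹, so the upper wavenumber bound quoted in the text is a round figure.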

Figure 1.2: Electromagnetic spectrum [6]

Radiation in the THz frequency range has a unique capability for noninvasive imaging and spectroscopy of materials. It can be transmitted through many nonmetallic and non-polar materials, and it passes through paper, plastic, cardboard and other dry packaging material with sufficient residual energy to excite molecular vibrations, rotations and phonon-based resonances in solid material. This radiation has low photon energy compared with X-rays, so it is suitable for personnel scanning as well. The radiation power (<1 mW) does not pose any health risk. For these reasons, THz spectroscopy has gained the attention of many researchers in explosive detection, drug detection and related areas. The electromagnetic spectrum from sub-millimeter wavelengths through THz can be used to gather information on the chemical structure of an object by measuring the intensity of reflected or emitted energy. As the frequency increases, the spectral features of a material become more apparent, whereas the penetration capability of the radiation decreases.
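
The comparison of THz and X-ray photon energies can be made concrete with E = hf. The following minimal Python sketch is only an illustration: the 30 keV X-ray photon energy is an assumed example value, not a figure taken from this thesis, and the function name is hypothetical.

```python
PLANCK_J_S = 6.62607015e-34   # Planck constant in J s
EV_IN_J = 1.602176634e-19     # one electronvolt in joules


def photon_energy_ev(freq_thz: float) -> float:
    """Photon energy in eV for a frequency given in THz (E = h f)."""
    return PLANCK_J_S * freq_thz * 1e12 / EV_IN_J


XRAY_ENERGY_EV = 30e3         # assumed illustrative X-ray photon energy (30 keV)

for f in (0.3, 1.0, 10.0):
    e = photon_energy_ev(f)
    print(f"{f:5.1f} THz -> {e * 1e3:7.3f} meV, "
          f"a factor of about {XRAY_ENERGY_EV / e:.1e} below the assumed X-ray photon")
```

A 1 THz photon carries about 4 meV, several orders of magnitude less than a keV-range X-ray photon.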

Several materials exhibit characteristic spectral features in this region of the electromagnetic spectrum, particularly at frequencies greater than 1 THz, so THz spectroscopy can be used as a powerful tool to identify different chemical species. THz spectroscopy probes molecular vibrational modes and can also excite intermolecular vibrations. These features give THz spectroscopy the potential to provide both structural and chemical information. Different chemical structures of the same material also lead to different spectral features. All the explosives used in landmines have unique spectral signatures in this region of the electromagnetic spectrum.

This thesis concentrates on using this unique feature to identify the explosive location in

a given field by using signal and image processing techniques.

1.4 Analysis Methods

Many methods, both signal processing and image processing techniques, can be used to analyze the data collected from the sensors. In this thesis we employ two techniques. The first is an image processing technique, component spatial and spectral pattern analysis, which can extract the spectral information and the corresponding spatial locations. The second is the well-known independent component analysis, which can obtain the spectral information of all the independent signals present in a given signal. A combined method is proposed that analyzes the terahertz data to detect the presence of explosives in the plot from which the data are collected.
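
Both techniques are commonly described in terms of a linear mixing model, in which the measured data are a product of component spectra and their spatial weights. The NumPy sketch below illustrates that model only; it is not the algorithm developed in this thesis. The matrix names S and P follow the list of figures, but their orientation, the synthetic Gaussian "spectra", the peak positions and the 30*30 grid contents are all assumptions made for illustration, and the unmixing step is plain least squares with the spectra taken as known.

```python
import numpy as np

rng = np.random.default_rng(0)
freqs = np.linspace(0.3, 3.0, 64)            # assumed THz frequency axis


def peak(center: float, width: float = 0.15) -> np.ndarray:
    """Synthetic single-peak spectrum (placeholder, NOT measured data)."""
    return np.exp(-0.5 * ((freqs - center) / width) ** 2)


# Columns of S: one placeholder spectrum per component, shape (64, 3).
S = np.column_stack([peak(0.8), peak(1.6), peak(2.4)])

# Rows of P: spatial weight of each component over a 30*30 field, shape (3, 900).
n_side = 30
P = np.zeros((3, n_side * n_side))
for k in range(3):
    idx = rng.choice(n_side * n_side, size=20, replace=False)
    P[k, idx] = rng.uniform(0.2, 1.0, size=20)

# Linear mixing: every pixel spectrum is a weighted sum of component spectra.
X = S @ P                                     # shape (64, 900)

# If the spectra are known, the spatial patterns follow from least squares.
P_hat, *_ = np.linalg.lstsq(S, X, rcond=None)
maps = P_hat.reshape(3, n_side, n_side)       # one 30*30 map per component
print(X.shape, maps.shape)
```

In the blind setting neither S nor P is known in advance, which is the harder problem that the two analysis methods address.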

1.5 Our Approach

This thesis concentrates on developing techniques for standoff explosive detection. The basic component spatial and spectral pattern analysis and independent component analysis methods are used and improved so that they give better results for our particular case of explosive detection.

The component spatial and spectral pattern analysis method is improved to use reference values, which greatly reduces the computational effort. Independent component analysis is also improved by using pre-whitening and normalization, which reduces the computational effort required.
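
Pre-whitening is a standard preprocessing step for independent component analysis: the observed signals are centred and linearly transformed so that their covariance becomes the identity matrix. The NumPy sketch below shows one common way of doing this, through an eigendecomposition of the covariance matrix; it is a generic illustration under assumed test data (random Laplace sources and a random mixing matrix), not the implementation used in this thesis.

```python
import numpy as np


def whiten(X: np.ndarray):
    """Whiten signals stored as rows of X, shape (n_signals, n_samples).

    Returns the whitened data Z and the whitening matrix V, with Z = V @ Xc,
    where Xc is X with the mean of each row removed.
    """
    Xc = X - X.mean(axis=1, keepdims=True)
    cov = Xc @ Xc.T / Xc.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)        # cov is symmetric
    V = eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T
    return V @ Xc, V


# Assumed test data: three non-Gaussian sources mixed by a random matrix.
rng = np.random.default_rng(1)
sources = rng.laplace(size=(3, 2000))
A = rng.normal(size=(3, 3))
X = A @ sources

Z, V = whiten(X)
cov_z = Z @ Z.T / Z.shape[1]
print(np.round(cov_z, 6))
```

After whitening, the remaining unmixing matrix can be restricted to a rotation, which is one reason the step reduces the computational effort of the ICA iteration.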

Considering the pros and cons of each individual method, a two-stage combined method is developed that can detect the explosives with more accuracy and in less time.

1.6 Thesis Organization

This thesis gives a detailed description of independent component analysis and component spatial and spectral pattern analysis, which are used to analyze terahertz data for explosive detection. Chapter 2 gives a brief review of the existing landmine detection technologies. Chapter 3 describes the signals used to generate the data. Chapter 4 describes component spatial and spectral pattern analysis along with the results obtained using the method. Chapter 5 describes the independent component analysis algorithm and the results obtained using the method. Chapter 6 explains a two-stage combined process that employs the methods of Chapters 4 and 5. Chapter 7 gives conclusions and future work.



CHAPTER 2

REVIEW OF LITERATURE

2.1 Review of detection methodologies

More than 83 countries are contaminated by landmines. If conventional tools such as metal detectors and prodders are used for detection, an area of 10 m² can be cleared on average in a working day. For instance, in Cambodia only 146 km² has been cleared in the last five years [7]. The Mine Ban Treaty goal of reaching a mine-free world by 2010 seems to be impossible, so the first priority of mine action has changed to a mine-impact-free world, although the final goal remains the same [8].

A few recommendations made during the Standing Committee on Mine Clearance, Mine Risk Education and Mine Action Technologies are:

- Technologists should avoid building technologies based on assumed needs and should work interactively with end users.
- Appropriate technologies could save human lives and increase mine action efficiency.
- Nothing is more important than understanding the working environment.

Detection probability is important in any system design and should always be as close as possible to one. The numbers of false positives and false negatives are very important in the design of any demining system. Decreasing the number of false alarms accelerates demining operations and also reduces the cost of operation. A pilot project on airborne minefield detection in Mozambique clearly showed that, even with very high resolution airborne sensors, it is very difficult to find antipersonnel mines using either objective signal processing tools or subjective photo-interpretation [9].
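
Detection probability and false alarm rate are usually computed from four counts: mines detected, mines missed, false alarms and correct rejections. The minimal Python sketch below shows the calculation; the function name and the example counts (a 900-cell field containing 60 mines) are hypothetical values chosen only for illustration.

```python
def detection_metrics(true_pos: int, false_neg: int,
                      false_pos: int, true_neg: int) -> tuple[float, float]:
    """Return (detection probability, false alarm rate).

    Detection probability = detected mines / all mines.
    False alarm rate      = false alarms / all mine-free cells.
    """
    p_detect = true_pos / (true_pos + false_neg)
    false_alarm_rate = false_pos / (false_pos + true_neg)
    return p_detect, false_alarm_rate


# Hypothetical example: 60 mined cells and 840 mine-free cells.
pd, far = detection_metrics(true_pos=57, false_neg=3, false_pos=42, true_neg=798)
print(f"detection probability = {pd:.2f}, false alarm rate = {far:.2f}")
```

The two quantities trade off against each other through the detection threshold, which is why both are reported together.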



The most efficient way of reducing false alarms and increasing detection probability is to use complementary sensors in parallel and to fuse the information collected from all the different sensors.

A brief description of the following sensors is given in this chapter:

- Prodders, seismic and acoustic sensors.
- Electromagnetic sensors (metal detector, GPR, microwave radiometer, electrical impedance tomography, electrography, imaging with handheld sensors).
- Electro-optic sensors (visible, IR, multispectral, hyperspectral, LIDAR).
- Other kinds of explosive detectors (NQR, X-rays, neutron activation, biosensors, trace explosive detection).

The first three sensor categories cannot differentiate between materials that have the same electromagnetic, thermal and/or optical properties, but they offer good localization and 2-D and 3-D imaging capabilities. The last category offers poor localization and lacks spatial resolution as well as 2-D and 3-D imaging capabilities.

Table 2.1 compares the sensor technologies discussed below in terms of maturity, cost, clearance speed and effectiveness.



Table 2.1 Operational characteristics of sensors [10]

2.1.1 Prodders, seismic and acoustic sensors

Prodders are rigid metal sticks, about 25 cm long, that are used to scan the soil. If an unusual object is found, other methods are used to confirm whether it is an explosive. This is not a standoff detection method, and there is risk involved for the person handling the prodder. Seismic devices are operated from a safe position and base their decision on the response obtained from the ground. This response may be due to mechanical vibrations or to any other disturbance caused by the explosive that differs from that of its surroundings. In ultrasonic sensors, an ultrasonic wave is sent into the ground and the backscattered wave is analyzed. Since ultrasonic waves propagate through moisture, this approach is advantageous in very moist soil or in water.

The capability of manual prodders is enhanced by adding an ultrasonic sensor at the prodding extremity. Such a prodder, also called a smart prodder, exerts less pressure on the mine and provides a better estimate of the mine position.

All three techniques can be used only as preliminary detection techniques because they do not rely on any explosive-specific property. Hence they cannot uniquely differentiate explosives from other materials that cause similar disturbances.

2.1.2 Electromagnetic Sensors

Metal Detectors: There are three categories of metal detectors: those based on electromagnetic induction, magnetometers and gradiometers.

In an electromagnetic induction detector, a primary magnetic signal is first sent into the ground during the emitting phase. It creates eddy currents in buried metallic objects, which in turn create a secondary magnetic field. During the listening phase, emission is stopped and the system listens to the secondary magnetic field, which induces currents in the coils of the detector. These currents are characteristic of the buried metallic objects and of the soil. There are two types of electromagnetic induction devices: the first sends a magnetic pulse, and the second sends a continuous wave at different frequencies in a stepped-frequency mode. Electromagnetic induction sensors can also provide information about the shape of the metallic pieces in the mine. The magnetometer works on the principle of the fluxgate magnetometer, which measures local perturbations of the earth's magnetic field. The gradiometer, depending on the sensor configuration, measures the magnetic field gradient in a given direction.

These techniques can detect only explosives that contain metal and are ineffective for plastic explosives.



Ground Penetrating Radar: A GPR has a transmitter that emits a pulse or a continuous wave at given frequencies, and a receiver that collects the waves backscattered by discontinuities in permittivity. Such discontinuities are produced not only by buried objects such as landmines but also by natural clutter in the soil. Because it responds to permittivity contrast, GPR can also detect plastic objects buried in the ground.

There are two types of GPR. The first, ultra-wideband pulse GPR, sends a short pulse into the ground. The second sends a continuous wave in a stepped-frequency mode. In the second case more energy can be sent into the ground at a given frequency, and the system directly provides the Fourier transform of the received signal. Current GPRs work in the frequency range from 0.4 to 6.0 GHz.

The penetration depth of these GPRs is limited. In addition, interpreting radargrams is difficult and requires well-trained personnel.

Microwave Radiometer: This is a passive GPR. The natural radiation captured by its antenna is strongly amplified by a highly sensitive reception stage. This natural radiation comprises radiation from the sky (a few K), radiation reflected from the surface and subsurface, and natural radiation from the soil. A 2-D image of the surface and of buried objects can be obtained. Penetration and spatial resolution are frequency dependent. Detection by any GPR technique is severely limited by moisture.

The performance and design of advanced microwave technologies such as GPR or passive radiometers depend on the electromagnetic parameters of the propagation medium, such as γ, µ and ξ. All of these depend on geophysical parameters such as soil water content, type, texture and structure.

Electrical Impedance Tomography: Electrical impedance tomography works by measuring the soil impedance between selected locations on the ground. Its use is limited in dry environments. The method also risks detonating the mine and is therefore extremely dangerous for the personnel involved.

2.1.3 Electro-Optic Sensors

LIDAR and THz imaging systems have yet to demonstrate their usefulness for mine detection, because they use shorter wavelengths than GPRs and therefore have limited soil penetration. Wild vegetation also limits the capabilities of electro-optic sensors.

Hyperspectral Sensors: These work by taking material reflectivity into consideration. They use information from more than one region of the electromagnetic spectrum, and the resulting signals are further processed to detect the presence of explosives. They cannot locate individual mines.

Thermal Infrared: This works in two different ways. The first method measures the apparent difference in temperature of the soil, which is due either to a difference in emissivity or to a difference in thermal flux caused by the presence of buried objects. For a time sequence, principal component analysis can be used to enhance the contrast with respect to the background. The second method takes into account the polarization properties of manufactured surfaces. Like hyperspectral sensors, this technique cannot locate individual mines.

Scanning Laser Doppler Vibrometry: In this technique an acoustic power transmitter sends an acoustic wave into the ground. Soil vibrations are induced by the backscattered wave created by a buried object, and these vibrations are measured using a laser Doppler vibrometer. This technique also does not utilize any explosive-specific property.

2.1.4 Other Explosive Detectors

There are a few nuclear and chemical methods. These technologies include Nuclear Quadrupole Resonance (NQR), Thermal Neutron Activation (TNA), Fast Neutron Activation (FNA), trace explosive detection using chemical processes, X-ray backscattering and X-ray fluorescence. Because of the time and cost involved, these are better suited for confirmation.

Nuclear Quadrupole Resonance: The alignment of nuclear spins is caused by the quadrupole charge distribution of aspherical nuclei. A radio frequency pulse generated by a transmitter coil excites the nuclear spins to higher quantized energy levels. As the nuclear spins return to equilibrium, they emit a unique, detectable radio frequency signal at a particular precession frequency. This radio frequency signal can be used to identify atoms and functional groups in the molecules. Nitrogen is a quadrupolar nucleus, so it can be detected by NQR, and it appears in every type of explosive. Hence NQR has the potential to be an efficient detection technique.

However, this method cannot detect the explosive TNT. It is also prone to radio frequency interference, and it has problems with quartz-bearing soils and magnetic soils.

Thermal Neutron Activation: Most explosive materials are rich in nitrogen-14 ($^{14}$N), a stable isotope of nitrogen. If a nitrogen nucleus captures a neutron, the following reaction takes place:

$$n + {}^{14}\mathrm{N} \rightarrow {}^{15}\mathrm{N}^{*} \rightarrow {}^{15}\mathrm{N} + \gamma$$

The excited nitrogen nucleus de-excites within picoseconds by emitting one or more gamma rays with characteristic energies of up to 10.83 MeV, and these rays can be used to detect explosives. The gamma rays can be detected by conventional NaI and/or GeLi detectors.

This technique has slow throughput, and real-time inspection is not possible with it.

Fast Neutron Activation: A fast neutron source generates fast (14 MeV) neutrons and associated alpha particles (3.5 MeV). These neutrons produce prompt gamma rays through inelastic scattering with the nuclei of the material. An alpha particle is always associated with each neutron generated, and it travels at 180° from the neutron direction, so the direction of the neutron can be determined from the direction of the alpha particle. An array of scintillating detectors can be used to detect the alpha particle, and its direction is obtained from the position of the detector that it hits. The stoichiometric composition of the irradiated material in terms of carbon, nitrogen and oxygen can be obtained by analyzing the gamma rays. This can be used in explosive detection.

This system is complex and poses a radiation hazard.

X-ray backscattering: X-ray backscattered radiation determines whether or not an object is made up of light chemical elements. It is used for bulk detection. The system can also produce a 2-D image with a resolution of a few centimeters. Potential problems come from shallow penetration, system complexity, sensitivity to soil topography and sensor height variation, and safety concerns due to the use of ionizing radiation.

X-ray Fluorescence: X-rays cannot penetrate deeply into the ground. This technique does not detect explosives encapsulated in the mine but detects the molecules that migrate from the mine to the ground surface. When these migrated explosives are illuminated by X-rays, a series of changes in the electron configuration, characteristic of the material, results in the emission of photons, which can be captured and analyzed. This technique has a high false alarm rate.

Trace/vapor explosive detection: Trace explosive detection is used to replace or complement current mine detection techniques with chemical identification of microscopic residues of the explosive component, in either vapor or particulate form. Vapor refers to gas-phase molecules emitted from the explosive surface (solid or liquid) because of its finite vapor pressure, and particulate refers to microscopic particles of solid material that adhere to surfaces either directly or indirectly.

All the techniques discussed above have advantages as well as disadvantages. Terahertz techniques have certain advantages for the particular problem of explosive detection, so this thesis focuses on using terahertz data for explosive detection. Acquiring the data is as important as analyzing it. Although this thesis does not focus on data acquisition, a brief introduction to it is provided here.

2.2 Terahertz data acquisition

2.2.1 Sources and Detectors

Electromagnetic radiation is attenuated at certain frequencies determined by molecular absorption by water vapor, oxygen and other atmospheric molecules. Figure 2.1 shows the atmospheric attenuation under various environmental conditions from 10 GHz to 10,000 GHz.

Figure 2.1 Atmospheric attenuation at different frequencies [5]

The minima in this figure are the atmospheric windows that define the normal frequency windows of operation. These regions of interest lie at 26 to 40 GHz, 70 to 110 GHz, 140 GHz, 220 GHz, 340 GHz, 410 GHz, 650 GHz and 850 GHz. Above 1 THz, the window of interest is centered at 1.5 THz. The sources and detectors used should therefore operate in these frequency windows.



It is difficult to produce sources and detectors in the range from 0.1 THz to 10 THz, as can be seen from Figure 2.2. The general approach for sources in this region is either to use multipliers to generate radiation from the RF side or to use lasers or other non-linear processes to translate down from the optical region. There are a few exceptions to this trend, such as backward wave oscillators (BWO), vacuum electronic devices and CO2 pumped gas lasers.

Figure 2.2 Different sources and detectors available in the electromagnetic spectrum [11]

For a transmitter-receiver system, maximum sensitivity is achieved if the bandwidth of the receiver matches that of the transmitter. For passive receivers, because the sources are infinitely broad, it is important to maximize the bandwidth of operation in order to collect as much energy as possible from the emitting or reflecting field. For active receivers, in contrast, the receiver bandwidth should be reduced in order to limit receiver-generated noise while preserving the illumination.

Owing to the difficulty of building sources and receivers in this range, researchers have focused their attention on optical techniques for producing THz radiation using visible or near-IR femtosecond laser pulses. This technique was developed in the 1980s, and the development of THz time-domain spectroscopy and imaging systems has attracted many groups to work in this field. The field has been further revolutionized by the recent development of a compact, solid-state THz semiconductor laser, the quantum cascade laser (QCL). These devices do not need an expensive femtosecond pulsed laser, but they still require cryogenic cooling.

Since femtosecond laser pulses came into use, many other generation mechanisms have emerged, such as electro-optic rectification, surface field generation and ultrafast switching of photoconductive emitters. Of these, photoconductive emitters have proven efficient in converting visible/IR radiation into THz radiation and are widely used in THz imaging and spectroscopy.

2.2.2 THz Spectroscopy and Imaging

THz time-domain spectroscopy has advantages over other spectroscopy techniques because it is insensitive to the thermal background and does not require cryogenically cooled bolometer detectors. It also allows both the absorption coefficient and the refractive index to be extracted without Kramers-Kronig analysis.

Imaging techniques can be broadly classified as passive or active. Every object emits radiation at all wavelengths, with an intensity proportional to the product of its physical temperature and its emissivity according to Planck's radiation law. Passive imaging uses the contrast between warmer and colder objects produced by this naturally occurring radiation. The contrast arises from differences in the emissivity of different materials. This approach is used, for example, to image a metal gun concealed under clothing. In active imaging, the area to be imaged is illuminated by radiation, and the reflected or transmitted waves are captured by the detector. Active systems have an advantage over passive systems because objects can be illuminated with enough power to penetrate materials, whereas passive systems rely on natural radiation.



2.3 Analysis Techniques

2.3.1 Background of component spatial and spectral pattern analysis

Component analysis of spatial and spectral patterns in multispectral images was developed in the 1980s by Kawata, Sasaki and Minami. It was initially developed to find the feasible solution region for the spatial and spectral information. The algorithm was later optimized using the simplex algorithm to find unique spectral patterns and their corresponding spatial locations. It has several applications in finding the presence of a particular material in a field.

2.3.2 Background of independent component analysis

Although under a different name, ICA was first introduced in the early 1980s by Hérault, Jutten and Amari [12-14]. During that decade, much of the research in this field was carried out by French scientists. Early papers on ICA by Cardoso [15] and Comon [16] were presented at a workshop on higher-order spectral analysis in 1989. Cardoso's algorithm used higher-order cumulant tensors, which later led to the JADE algorithm [17]. Lacoume was the first to use fourth-order cumulants [18]. Some of the currently most popular algorithms were proposed by Cichocki and Unbehauen [19-21]. A few other well-known papers on ICA are listed in references [22-23]. Another technique, 'Nonlinear PCA', was introduced by Aapo Hyvärinen, Juha Karhunen and Erkki Oja [24-25]. Several such algorithms were proposed, but they worked only on somewhat restricted problems.

After Bell and Sejnowski proposed their method based on the infomax principle in the mid-1990s [26-27], ICA attracted the attention of many more researchers. The method was later extended by Amari and his co-workers using the natural gradient. A few years later Aapo Hyvärinen, Juha Karhunen and Erkki Oja presented the fixed-point or Fast-ICA algorithm [28-30]. Owing to its computational efficiency, it has contributed to the application of ICA to large-scale problems.


The statistical criteria available for estimating the ICA model include mutual information, non-Gaussianity measures, likelihood, cumulants and nonlinear decorrelation criteria.

When a general-purpose measure of the dependence of the components is wanted, one that assumes nothing about the data, ICA by minimization of mutual information should be used. ICA estimation by minimization of mutual information is equivalent to maximizing the sum of the non-Gaussianities of the estimates of the independent components when the estimates are constrained to be uncorrelated. Maximum likelihood estimation tells us what kind of nonlinearity must be used. All of these can be implemented as practical ICA algorithms using either the natural gradient method or fast fixed-point algorithms. The Bell-Sejnowski algorithm [26-27] is a gradient algorithm that employs maximum likelihood estimation.

A few ICA methods employ higher-order cumulant tensors, which are generalizations of the covariance matrix. JADE [31] and FOBI [32] are two important algorithms of this class. Fourth Order Blind Identification (FOBI) is the basic method and involves the decomposition of a weighted correlation matrix. The problem of equal eigenvalues of the cumulant tensor can be solved using Joint Approximate Diagonalization of Eigenmatrices (JADE).

Nonlinear decorrelation is a useful and general criterion for independence. The first successful ICA methods, the Hérault-Jutten algorithm [12-14] and the Cichocki-Unbehauen algorithm [19-21], are based on nonlinear decorrelation. Today the Hérault-Jutten algorithm is mainly of historical interest because several more efficient ICA algorithms exist. The Cichocki-Unbehauen algorithm is based on the same principle and uses the natural gradient. It was extended and formalized into the theory of estimating functions and the Equivariant Adaptive Separation via Independence (EASI) algorithm. This method was the first to show that all the ICs can be estimated with the same equivariant performance whatever the mixing matrix. The Cichocki-Unbehauen algorithm is the same as the popular natural gradient algorithm introduced by Amari, Cichocki and Yang [33] as an extension of the original Bell-Sejnowski algorithm [26-27].



CHAPTER 3

SIMULATED TERAHERTZ DATA FOR EXPLOSIVES

3.1 Introduction

The main emphasis of this thesis is to process, in the frequency domain, the terahertz data obtained from the sensors and to detect the presence of explosives. The component spatial and spectral pattern analysis and independent component analysis algorithms, which are explained in later chapters, are implemented in MATLAB. These algorithms are used to analyze synthetically generated samples and detect the presence of explosives. Because terahertz transmitters and sensors are expensive and were not available for this work, the required source spectra were collected from the literature.

This chapter first describes the source signals used to generate the samples and then gives the details of the samples generated.

3.2 Source Signals

Three explosive source signals, namely RDX, TNT and DNT, and three random non-explosive signals are used to generate the samples. The RDX, TNT and DNT spectra are taken from data published by Rensselaer Polytechnic Institute [34], which maintains a large database of different explosives. Spectra of the most commonly used explosives are shown in Figure 3.1. Because of the computation involved, we restrict ourselves to three explosives. The RDX, TNT and DNT signals are taken over 1-21 THz and sampled at 200 points, as shown in Figure 3.2. As explained previously, each explosive has characteristic peaks at different frequencies.


Figure 3.1 Diffuse reflectance and transmission spectra of different explosives [35]

Figure 3.2 Spectra of different explosives

3.3 Samples Generated

Nine hundred samples are generated and are assumed to lie on a 30*30 grid, as shown in Figure 3.3. Each sample contains different proportions of the source signals along with randomly generated noise. All of these samples are used to compare the results of the two methods. The generated data therefore form a 30*30*200 array, where 30 and 30 are the dimensions of the grid and 200 is the number of frequency samples for each signal. These are arranged as a 900*200 matrix. Of the 900 samples, 42 contain RDX, 36 contain TNT and 9 contain DNT, each along with other source signals and noise. All the other samples contain only the non-explosive signals along with noise. In Figure 3.3, pixels containing RDX are marked in red, pixels containing TNT in blue and pixels containing DNT in magenta.
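The construction of such a data set can be sketched as follows. This is only an illustration under stated assumptions: the thesis implementation is in MATLAB and uses the real spectra from [34], whereas here random curves stand in for the spectra, explosive pixels are scattered at random with one explosive per pixel, and the mixing proportions and noise level are hypothetical. Only the grid size, the number of frequency samples and the pixel counts are taken from the text.

```python
import random

N_ROWS, N_COLS, N_FREQ = 30, 30, 200                  # grid and frequency samples (from the text)
EXPLOSIVE_COUNTS = {"RDX": 42, "TNT": 36, "DNT": 9}   # pixel counts (from the text)


def make_dataset(seed=0, noise_std=0.01):
    """Return (data, labels): a 900 x 200 list of rows and one label per pixel."""
    rng = random.Random(seed)
    names = ["RDX", "TNT", "DNT", "other1", "other2", "other3"]
    # Placeholder spectra: random curves stand in for the real spectra.
    spectra = {name: [rng.random() for _ in range(N_FREQ)] for name in names}

    # Assumption: explosive pixels are placed at random, one explosive per pixel.
    n_pixels = N_ROWS * N_COLS
    order = list(range(n_pixels))
    rng.shuffle(order)
    labels = ["none"] * n_pixels
    start = 0
    for name, count in EXPLOSIVE_COUNTS.items():
        for pixel in order[start:start + count]:
            labels[pixel] = name
        start += count

    data = []
    for label in labels:
        # Random proportions of the three non-explosive signals.
        weights = {name: rng.random() for name in names[3:]}
        if label != "none":
            weights[label] = rng.uniform(0.2, 1.0)    # hypothetical explosive proportion
        row = [sum(w * spectra[name][f] for name, w in weights.items())
               + rng.gauss(0.0, noise_std)
               for f in range(N_FREQ)]
        data.append(row)
    return data, labels


data, labels = make_dataset()
print(len(data), len(data[0]), labels.count("RDX"), labels.count("TNT"), labels.count("DNT"))
# prints: 900 200 42 36 9
```

Each row of `data` is the spectrum of one pixel, so the list corresponds to the 900*200 arrangement described above.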

[Plot residue, apparently from Figure 3.2: six panels titled RDX, TNT, DNT, S(1,:), S(2,:) and S(3,:), each plotted against sample index 0 to 200.]

Figure 3.3 Field with pixels containing RDX in red, TNT in blue, DNT in magenta and pixels without any explosive in green

The next two chapters explain the two algorithms and also show the simulated results

which are obtained by utilizing this data.

[Plot residue from Figure 3.3: panel title "Field with all original given pixels"; x axis and y axis each run from 0 to 30.]

CHAPTER 4

COMPONENT SPATIAL AND SPECTRAL PATTERN ANALYSIS

4.1 Introduction

In digital image processing, component pattern analysis is important in applications such as remote sensing in the environmental sciences and medical diagnostics with x-ray images. The theory for multispectral images used here is built on principal component analysis and nonlinear optimization with a non-negativity constraint. With it we can estimate the spectral curves of the components present in the image and the corresponding spatial patterns. Although we have no information about the spatial and spectral features of the components, the rules of non-negative absorptivity and non-negative density give a feasible solution region for the spectral and spatial features [36], and entropy minimization can then be used to optimize the solution [37].

Many computerized image processing methods, such as texture analysis, exist for analyzing the spatial pattern of a given component in a known image. However, when the components are mutually dependent in the spatial domain, they cannot be classified in the hyperspectral feature space. In such a case we can use the spectral information of the components. The only input required is a set of images of the scene taken at different frequencies. The main concepts used are multivariate analysis [38], linear system theory [39], nonlinear programming with a non-negativity constraint [40] and entropy minimization [41].

In the human visual system, every scene is sensed by three neuro-chemical sensors in the retina, and the three resulting images are perceived by the brain as a color image. The three detectors have different spectral responses in the visible region, except in the black-and-white case, where all three images are the same. In machine vision, by contrast, up to a hundred distinct images of a given scene may be acquired at frequencies ranging from the ultraviolet to the infrared. These are called multispectral images.

Figure 4.1 shows an example of a multispectral image. The spectral information varies from pixel to pixel. If we suppose that there are M components in a multispectral image, then the spectrum of every pixel is a combination of these M components.

Figure 4.1 Illustration of multispectral images [36]

Figure 4.2 Image model for multi-component patterns [36]

Figure 4.2 gives a different interpretation of the image. The images at N frequencies can be considered as linear combinations of the M component spatial patterns, weighted by the corresponding spectral responses. In matrix form this can be written as

$$[I_o]_{N \times L} = [S_o]_{N \times M}\,[P_o]_{M \times L} \tag{4.1}$$

where

[Io] is the matrix of multispectral images; each row holds the image at one frequency, with its L pixels arranged lexicographically.

[So] holds the spectral responses of the M components as column vectors.

[Po] holds the spatial patterns of the M components as row vectors, with the L pixels arranged lexicographically.

Equation (4.1) holds for fluorescence and emission images. For absorption images, we must first take the logarithm of the observed image intensity divided by the illumination light intensity.
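The model in equation (4.1) and the logarithmic transform for absorption images can be illustrated with a small numerical sketch. The matrices below are hypothetical (N = 3 frequencies, M = 2 components, L = 4 pixels), and the minus sign in the absorbance follows the usual Beer-Lambert convention, which is an assumption here since the text does not state the sign.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Hypothetical spectral responses: one column per component.
S = [[1.0, 0.2],
     [0.5, 0.9],
     [0.1, 0.4]]
# Hypothetical spatial patterns: one row per component, one column per pixel.
P = [[1.0, 0.0, 0.5, 0.0],
     [0.0, 1.0, 0.5, 0.0]]

image = matmul(S, P)          # equation (4.1): N x L matrix of multispectral images


def absorbance(observed, illumination):
    """Element-wise -ln(observed / illumination) (assumed Beer-Lambert sign)."""
    return [[-math.log(o / i0) for o, i0 in zip(row_o, row_i)]
            for row_o, row_i in zip(observed, illumination)]


# Simulate absorption images: the observed intensity decays exponentially
# with the linear mixture, then the log transform recovers the mixture.
illumination = [[2.0] * 4 for _ in range(3)]
observed = [[i0 * math.exp(-a) for i0, a in zip(row_i, row_a)]
            for row_i, row_a in zip(illumination, image)]
recovered = absorbance(observed, illumination)
print(image[0])
```

After the transform, `recovered` equals `image` up to rounding, so absorption data can be analyzed with the same linear model as emission data.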

Section 4.2 explains the algorithm and its implementation in MATLAB, and Section 4.3 shows the results and the further processing required. Finally, Section 4.4 reports the number of false positives and false negatives obtained.

4.2 Component spectral response and spatial pattern analysis algorithm

The goal of this algorithm is to estimate the spectral responses and spatial patterns of the components of a given multispectral image. Mathematically, the problem is to obtain [So] and [Po] in equation (4.1) given [Io]. If one of them were available, this would reduce to an inverse problem. Here, however, we have no prior knowledge of either [So] or [Po]. To obtain these matrices we must first estimate the number of components 'M' present in the multispectral image, which gives the dimensions of these matrices.

4.2.1 Determining number of significant components

If the system were noise free, determining the number of components 'M' would be easy: the rank of the matrix [Io][Io]^t would give the number of components. In practice, no system is noise free. Moreover, the task here is to find the number of significant components 'M2' rather than the total number of components, because data collected in practice may contain hundreds of insignificant components that can be neglected.

Any rectangular matrix can be factorized using singular value decomposition. For any given matrix there exists a decomposition with non-negative singular values such that

$$[I_o]_{N \times L} = [U_o]_{N \times N}\,[\Lambda_o]_{N \times L}\,[V_o]_{L \times L} \tag{4.2}$$

where

[Λo] is a matrix with the singular values of [Io] as its diagonal elements,

[Uo] contains the eigenvectors of [Io][Io]^t,

[Vo] is the transpose of the matrix of eigenvectors of [Io]^t[Io].

The singular values are the square roots of the eigenvalues of [Io][Io]^t.

Equation (4.2) can be partitioned as

$$[I_o] = \begin{bmatrix} U & U_n \end{bmatrix} \begin{bmatrix} \Lambda & 0 \\ 0 & \Lambda_n \end{bmatrix} \begin{bmatrix} V \\ V_n \end{bmatrix} \tag{4.3}$$

Because the off-diagonal blocks of the partitioned singular value matrix are zero, the cross terms vanish and the product expands to

$$[I_o] = [U][\Lambda][V] + [U_n][\Lambda_n][V_n] \tag{4.4}$$

where

[Λ] Square roots of the eigenvalues of [Io][Io]^t that are above the threshold.

[U] Truncated [Uo] matrix with the eigenvectors corresponding to the significant eigenvalues.

[V] Truncated [Vo] matrix with the eigenvectors corresponding to the significant eigenvalues.

[Λn] Square roots of the eigenvalues of [Io][Io]^t that are below the threshold.

[Un] Truncated [Uo] matrix with the eigenvectors corresponding to the insignificant eigenvalues.

[Vn] Truncated [Vo] matrix with the eigenvectors corresponding to the insignificant eigenvalues.

This partition divides the singular value matrix in two, such that [Λ] contains the significant singular values and [Λn] the insignificant ones. With the singular values arranged in descending order, the cutoff is the position at which the ratio of one singular value to its immediate neighbor exceeds a prefixed threshold. The number of significant components 'M2' equals the number of singular values above the cutoff position, in other words the dimension of [Λ]. This truncation suppresses the influence of noise on the results. The threshold can be set by measuring the detector noise, the non-linearity of the detector, the quantization error of the analog-to-digital converter and so on. Akaike's information criterion can also be used for this purpose [42].
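The ratio test described above can be sketched in a few lines. This is a minimal illustration, assuming the singular values have already been computed (for example with an SVD routine) and that the cutoff is taken at the first position where the ratio exceeds the threshold; the function name, the threshold and the example values are hypothetical.

```python
def count_significant(singular_values, ratio_threshold):
    """Return M2, the number of singular values above the cutoff position.

    The cutoff is the first position at which a singular value divided by
    its next smaller neighbor exceeds ratio_threshold (an assumption: the
    text does not say which position to use if several ratios qualify).
    """
    s = sorted(singular_values, reverse=True)
    for k in range(len(s) - 1):
        if s[k + 1] == 0 or s[k] / s[k + 1] > ratio_threshold:
            return k + 1
    return len(s)   # no gap found: treat every component as significant


# Hypothetical spectrum: three strong components followed by a noise floor.
values = [100.0, 40.0, 10.0, 0.1, 0.09, 0.08]
m2 = count_significant(values, ratio_threshold=10.0)
print(m2)  # prints 3
```

The result depends strongly on the threshold: with a threshold of 3 the same values would give two components, because 40/10 already exceeds it. This is why the text ties the threshold to measured noise characteristics.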

Based on the above, we can estimate the number of significant components 'M2' and reconstruct [I], which is approximately equal to [Io], from the significant singular values:

$$[I_o]_{N \times L} \approx [I]_{N \times L} = [U]_{N \times M_2}\,[\Lambda]_{M_2 \times M_2}\,[V]_{M_2 \times L} \tag{4.5}$$

For the generated data samples, [Io] is the transpose of the 900*200 data matrix, so that each row of [Io] holds the data of all 900 pixels at a given frequency. Using the component spatial and spectral pattern analysis algorithm, we decompose this matrix to find the source signals and their probabilities at each pixel, as illustrated in Figure 4.2.



If we decompose the 200*900 matrix [Io], it can yield at most 200 source signals, because that is the highest possible rank of the matrix. In practice, however, when samples are collected from a restricted area or a given region, the number of source signals present in the samples is limited. It is therefore likely that we know the number of source signals present and can try to converge to that number. The number of significant eigenvalues is obtained as explained above, and [U], [Λ] and [V] are obtained as shown in equation (4.5). With [I] factorized in this way, [P] and [S] can be obtained using the transformations in (4.7) and (4.8):

$$[I_o] \approx [I]_{N \times L} = [S]_{N \times M_2}\,[P]_{M_2 \times L} = [U]_{N \times M_2}\,[\Lambda]_{M_2 \times M_2}\,[T]^{-1}_{M_2 \times M_2}\,[T]_{M_2 \times M_2}\,[V]_{M_2 \times L} \tag{4.6}$$

$$[P] = [T][V] \tag{4.7}$$

$$[S] = [U][\Lambda][T]^{-1} \tag{4.8}$$

If the transformation matrix [T] can be obtained accurately, then [S] and [P] can be obtained accurately. [T] can be found using the non-negativity constraint.
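The role of [T] in equations (4.6) to (4.8) can be checked numerically: any invertible [T] leaves the product [S][P] unchanged, which is why additional constraints are needed to pin it down. The sketch below uses small hypothetical factors with M2 = 2 (not computed from real data) and a hand-written 2x2 inverse to stay self-contained.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def inverse_2x2(t):
    """Invert a 2x2 matrix; raises ValueError if it is singular."""
    (a, b), (c, d) = t
    det = a * d - b * c
    if det == 0:
        raise ValueError("T must be invertible")
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical truncated factors: N = 3 frequencies, M2 = 2, L = 4 pixels.
U = [[0.6, 0.0], [0.8, 0.0], [0.0, 1.0]]
LAM = [[5.0, 0.0], [0.0, 2.0]]
V = [[0.5, 0.5, 0.5, 0.5], [0.5, -0.5, 0.5, -0.5]]
T = [[2.0, 1.0], [1.0, 3.0]]            # an arbitrary invertible transformation

P = matmul(T, V)                         # equation (4.7)
S = matmul(matmul(U, LAM), inverse_2x2(T))   # equation (4.8)

I_svd = matmul(matmul(U, LAM), V)        # equation (4.5)
I_sp = matmul(S, P)                      # equation (4.6)
max_diff = max(abs(x - y) for r1, r2 in zip(I_svd, I_sp) for x, y in zip(r1, r2))
print(max_diff < 1e-12)  # prints True
```

Changing `T` changes `S` and `P` individually but not their product, so the data alone cannot distinguish between candidate transformations.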

4.2.2 Non-negative Constraint

We can use non-negativity as a constraint in order to obtain the value of [T] and thereby determine [P] and [S]. The [S] matrix is composed of source signals as its column vectors, and since a source signal cannot be negative, each element of [S] should be non-negative, as indicated in equation (4.9). Since the [P] matrix indicates the probability of each source signal in the spatial domain, all the elements of [P] should be greater than or equal to zero, as indicated in equation (4.10).

$$S_{ij} \ge 0, \quad i = 1, \dots, L, \quad j = 1, \dots, M_2 \tag{4.9}$$

$$P_{ij} \ge 0, \quad i = 1, \dots, M_2, \quad j = 1, \dots, N \tag{4.10}$$

Pij is the element of [P] which represents the spatial pattern of the ith component at the jth pixel.

Sij is the element of [S] which represents the spectral response of the jth component at the ith frequency.

In the case of fluorescence spectroscopy, we should make sure that there is no absorption by the sample, since its existence makes equations (4.10) and (4.1) invalid.

These constraints force the elements of matrices [S] and [P] to be non-negative. A set of inequalities is obtained from these constraints by substituting equations (4.7) and (4.8) into (4.9) and (4.10):

$$\mathrm{Elements}\{[U][\Lambda][T]^{-1}\} \ge 0 \tag{4.11}$$

$$\mathrm{Elements}\{[T][V]\} \ge 0 \tag{4.12}$$
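Inequalities (4.11) and (4.12) amount to a simple element-wise test on a candidate [T]. The sketch below, in plain Python for the two-component case, assumes toy matrices built from known non-negative factors; `is_feasible`, `S_true`, `P_true` and `T_true` are hypothetical names used only for illustration.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inv2(t):
    """Inverse of a 2x2 matrix."""
    det = t[0][0] * t[1][1] - t[0][1] * t[1][0]
    return [[t[1][1] / det, -t[0][1] / det],
            [-t[1][0] / det, t[0][0] / det]]


def is_feasible(UL, V, T, tol=1e-12):
    """True if every element of [U][L][T]^-1 and of [T][V] is non-negative."""
    S = matmul(UL, inv2(T))
    P = matmul(T, V)
    return all(v >= -tol for row in S + P for v in row)


# Toy data built from known non-negative factors (illustrative values only).
S_true = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
P_true = [[1.0, 0.0, 0.5, 0.2], [0.0, 1.0, 0.5, 0.8]]
T_true = [[1.0, 1.0], [1.0, -1.0]]
UL = matmul(S_true, T_true)       # stands in for [U][Lambda]
V = matmul(inv2(T_true), P_true)  # stands in for [V]

print(is_feasible(UL, V, T_true))                    # satisfies both sets
print(is_feasible(UL, V, [[1.0, 0.0], [0.0, 1.0]]))  # violates them
```

The small tolerance `tol` is an assumption added to absorb rounding error; it is not part of the constraints themselves.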

[T] is a matrix of M2*M2 elements, and its values are restricted by the N*M2 and M2*L inequalities given by equations (4.11) and (4.12). The values of [T] are not unique in general, but if [P] and [S] are known, [T] is determined uniquely. Since equation (4.6) involves both [T] and its inverse, any scalar multiple of [T] reproduces the same [I], but it changes the magnitudes of [S] and [P]. In order to restrict the magnitude of the source signals obtained, we normalize [T] using

$$\mathrm{diag}\left([T][T]^{t}\right) = [E] \tag{4.13}$$

where [E] is an identity matrix.

4.2.3 Feasible Solution of [T] using non-negativity constraints

If we derive the solution using the non-negativity constraints described above, we arrive at a feasible region for matrix [T]. Since the values of [U], [Λ], [V] are known, we can substitute them into equations (4.11) and (4.12) and obtain inequalities that can be solved simultaneously to find a feasible solution region for [T]. Equation (4.12) gives the following inequalities


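$$t_{11}\upsilon_{1i} + t_{12}\upsilon_{2i} + t_{13}\upsilon_{3i} + \cdots + t_{1M_2}\upsilon_{M_2 i} \ge 0, \quad i = 1, 2, \dots, L \tag{4.14}$$

$$\vdots$$

$$t_{M_2 1}\upsilon_{1i} + t_{M_2 2}\upsilon_{2i} + t_{M_2 3}\upsilon_{3i} + \cdots + t_{M_2 M_2}\upsilon_{M_2 i} \ge 0, \quad i = 1, 2, \dots, L \tag{4.15}$$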

Since the general M2-component case is tedious to write out, we consider the two-component case, which can be expressed easily in the form of mathematical equations. For the two-component case the above inequalities become

$$t_{11}\upsilon_{1i} + t_{12}\upsilon_{2i} \ge 0, \quad i = 1, 2, \dots, L \tag{4.16}$$

$$t_{21}\upsilon_{1i} + t_{22}\upsilon_{2i} \ge 0, \quad i = 1, 2, \dots, L \tag{4.17}$$

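From this we can derive the relation between t11 and t12, and between t21 and t22, as

$$-\left[\max_{1 \le i \le L} \frac{\upsilon_{2i}}{\upsilon_{1i}}\right]^{-1} \le \frac{t_{12}}{t_{11}} \le -\left[\min_{1 \le i \le L} \frac{\upsilon_{2i}}{\upsilon_{1i}}\right]^{-1} \tag{4.18}$$

$$-\left[\max_{1 \le i \le L} \frac{\upsilon_{2i}}{\upsilon_{1i}}\right]^{-1} \le \frac{t_{22}}{t_{21}} \le -\left[\min_{1 \le i \le L} \frac{\upsilon_{2i}}{\upsilon_{1i}}\right]^{-1} \tag{4.19}$$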

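Similarly, using equation (4.11) we get

$$\frac{u_{j1}\lambda_1 t_{22} - u_{j2}\lambda_2 t_{21}}{t_{11}t_{22} - t_{12}t_{21}} \ge 0, \quad j = 1, 2, \dots, N \tag{4.20}$$

$$\frac{-u_{j1}\lambda_1 t_{12} + u_{j2}\lambda_2 t_{11}}{t_{11}t_{22} - t_{12}t_{21}} \ge 0, \quad j = 1, 2, \dots, N \tag{4.21}$$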

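Now consider equations (4.20) and (4.21) under the conditions $t_{11} > 0$, $t_{21} > 0$ and $t_{12}/t_{11} > t_{22}/t_{21}$. Under these conditions the common denominator $t_{11}t_{22} - t_{12}t_{21}$ is negative, so both numerators must be non-positive, and (taking $\lambda_1 u_{j1} > 0$) the above equations yield

$$\frac{t_{22}}{t_{21}} \le \min_{1 \le j \le N} \frac{\lambda_2 u_{j2}}{\lambda_1 u_{j1}} \tag{4.22}$$

$$\frac{t_{12}}{t_{11}} \ge \max_{1 \le j \le N} \frac{\lambda_2 u_{j2}}{\lambda_1 u_{j1}} \tag{4.23}$$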

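For normalizing [T] we have the conditions

$$t_{11}^2 + t_{12}^2 = 1 \tag{4.24}$$

$$t_{21}^2 + t_{22}^2 = 1 \tag{4.25}$$

Solving these three sets of conditions simultaneously (the bounds from (4.12), the bounds from (4.11) and the normalization), we get a solution region for [T].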

When the number of components increases, the solution becomes more complex. So far we have discussed the feasible solution region for both the component patterns and the component spectra. The next question is how to obtain a unique solution for both from this region. This solution is obtained based on an entropy minimization criterion.

4.2.4 Optimal Estimation

Because we don't have a priori information about the components, estimating the optimal solution from the feasible solutions requires estimation theory. When decomposing the given multispectral data into source signals, one way to ascertain that the signals obtained are the original source signals is to check whether they are mutually independent, i.e., the signals in the solution set should be independent of one another. Entropy is one way of measuring the independence of signals.

Entropy is defined as a measure of the uniformity of the distribution of a bounded set of values. This indicates that signals have maximum entropy if they are uniform. So finding [T] such that the entropy of the signals is minimized will also minimize the amount of shared entropy, or mutual information.



So now we try to find the optimal solution by minimizing the entropy of the signals, given by the function H[S] as

$$H[S] = -\sum_{i=1}^{M} \sum_{j=1}^{L} a_{ij} \ln a_{ij} \tag{4.26}$$

where $a_{ij} = \dfrac{|s''_{ij}|}{\sum_{j=1}^{L} |s''_{ij}|}$,

$a_{ij}$ is the probability density function of a stochastic process,

$s''_{ij}$ is the second derivative of $s_{ij}$ with respect to the ith frequency and the jth component of [I].

Minimizing equation (4.26) therefore localizes the peaks in the spectra and smooths the baseline. The second derivative is taken to emphasize the peak features. Entropy minimization can be done in the spectral domain, H[S], in the spatial domain, H[P], or in both, H([S],[P]).

If we do this entropy minimization in the spatial domain, the function to minimize is H[P], given by

$$H[P] = -\sum_{i=1}^{M} \sum_{j=1}^{N} b_{ij} \ln b_{ij} \tag{4.27}$$

where $b_{ij} = \dfrac{|p''_{ij}|}{\sum_{j=1}^{N} |p''_{ij}|}$.
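The entropy measure of (4.26) can be illustrated on sampled spectra. The following plain-Python sketch makes two assumptions that are not stated in the text: the second derivative is approximated by a three-point finite difference, and its absolute value is used so that the normalized terms behave like probabilities. The function names and the two test spectra are illustrative only.

```python
import math


def second_difference(s):
    """Three-point approximation of the second derivative (interior points)."""
    return [s[k - 1] - 2.0 * s[k] + s[k + 1] for k in range(1, len(s) - 1)]


def spectral_entropy(s):
    """Entropy of the normalized absolute second derivative of one spectrum."""
    d2 = [abs(v) for v in second_difference(s)]
    total = sum(d2)
    if total == 0.0:
        return 0.0  # flat or linear spectrum: no curvature to distribute
    return -sum((v / total) * math.log(v / total) for v in d2 if v > 0.0)


def total_entropy(spectra):
    """Sum of the per-component entropies, as in the double sum of (4.26)."""
    return sum(spectral_entropy(s) for s in spectra)


# One sharp peak versus a ripple spread over the whole frequency range.
peaked = [0.0] * 10 + [1.0] + [0.0] * 10
rippled = [math.sin(0.9 * k) ** 2 for k in range(21)]

print(spectral_entropy(peaked))
print(spectral_entropy(rippled))
```

The single-peak spectrum concentrates its curvature in three samples and therefore has the lower entropy, which is the behaviour the minimization exploits.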

We now have a constrained optimization problem, with minimization of entropy as the cost function, subject to non-negativity constraints. Several methods are available to solve a constrained optimization problem [43], such as the penalty function method, the Lagrangian multiplier method, the augmented Lagrangian multiplier method for inequality constraints, quadratic programming, the gradient projection method for equality constraints and the gradient projection method for inequality constraints. Several of these methods reformulate the constrained problem as an unconstrained one and then solve it. The penalty function method is employed in this algorithm.



General Penalty Function Method: The general penalty function can be understood from the following formulation. Let us start with a constrained problem given by

$$\text{Minimize } f(x) \tag{4.28}$$

subject to $g_j(x) \le 0, \; j = 1, \dots, p$

$h_i(x) = 0, \; i = 1, \dots, m$

Then, using the penalty function described by [Snyman 2005], we can formulate equation (4.28) as an unconstrained problem:

$$\text{Minimize } P(x) \tag{4.29}$$

where $P(x, \rho, \beta) = f(x) + \sum_{i=1}^{m} \rho\, h_i^2(x) + \sum_{j=1}^{p} \beta\, g_j^2(x)$

where ρ and β are the penalty parameters, with ρ ≫ 0 and β ≫ 0. The $g_j$ term is counted only where the inequality constraint is violated, i.e. where $g_j(x) > 0$.
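As a small worked illustration of (4.28) and (4.29), the sketch below minimizes a one-dimensional function with a single inequality constraint. It is written in plain Python under stated assumptions: the example problem, the ternary-search minimizer and all names are illustrative, and the inequality penalty is applied only when the constraint is violated.

```python
def penalized(f, gs, hs, rho, beta):
    """Build P(x) = f(x) + rho*sum(h_i(x)^2) + beta*sum(max(0, g_j(x))^2)."""
    def P(x):
        eq = sum(h(x) ** 2 for h in hs)
        ineq = sum(max(0.0, g(x)) ** 2 for g in gs)
        return f(x) + rho * eq + beta * ineq
    return P


def minimize_1d(func, lo, hi, iters=200):
    """Ternary search; adequate here because the example function is convex."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def f(x):
    return (x - 2.0) ** 2  # unconstrained minimum at x = 2


def g(x):
    return x - 1.0  # constraint g(x) <= 0, i.e. x <= 1


for beta in (1.0, 10.0, 1000.0):
    x_star = minimize_1d(penalized(f, [g], [], rho=0.0, beta=beta), -5.0, 5.0)
    print(beta, x_star)
```

For this example the penalized minimum can be found by hand as x = (2 + β)/(1 + β), so the solution moves from 1.5 towards the constrained optimum x = 1 as β grows.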

Using the penalty function method explained above, the constrained optimization problem can be converted into an unconstrained problem. The problem here is to minimize the entropy function subject to the non-negativity constraints. This can be represented mathematically as

$$\text{Minimize } H([S],[P]) \tag{4.30}$$

subject to $S_{ij} \ge 0, \; i = 1, \dots, L, \; j = 1, \dots, M_2$

$P_{ij} \ge 0, \; i = 1, \dots, M_2, \; j = 1, \dots, N$

H([S],[P]) is the entropy function.

If this is formulated using the penalty function method, we have the cost function

$$R([S],[P]) = H([S],[P]) + \gamma\, Q([S],[P]) \tag{4.31}$$


Q is the penalty function term that handles the constraints,

γ is the scaling factor, or penalty parameter.

Similar to equation (4.29), the non-negativity constraint is handled by writing the function Q as

$$Q = \sum_{m=1}^{M_2} \sum_{i=1}^{N} F(p_{mi}) + \sum_{j=1}^{L} \sum_{m=1}^{M_2} F(s_{jm}) \tag{4.32}$$

where $p_{mi}$ and $s_{jm}$ are the elements of [P] and [S], and

$$F(x) = \begin{cases} 0, & x \ge 0 \\ x^2, & x < 0 \end{cases} \tag{4.33}$$

The penalty function reduces the problem to an unconstrained optimization problem. Minimization of R can be done using the simplex method [44], the Davidon-Fletcher-Powell method [44], or both. The simplex method is employed in this thesis.
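The penalty term of (4.32) and (4.33) and the cost of (4.31) are short enough to write out directly. The sketch below is a plain-Python illustration, assuming [P] and [S] are stored as lists of rows and that the entropy value has already been computed elsewhere; the function names and sample matrices are hypothetical.

```python
def F(x):
    """Penalty of (4.33): zero for non-negative x, x squared otherwise."""
    return 0.0 if x >= 0.0 else x * x


def Q(P, S):
    """Penalty term of (4.32): F summed over all elements of [P] and [S]."""
    return (sum(F(v) for row in P for v in row)
            + sum(F(v) for row in S for v in row))


def R(H_value, P, S, gamma=1.0):
    """Cost of (4.31), given an already computed entropy value."""
    return H_value + gamma * Q(P, S)


# Illustrative matrices with two slightly negative elements.
P = [[0.2, -0.1], [0.5, 0.0]]
S = [[1.0, -0.3]]

print(Q(P, S))
print(R(2.0, P, S, gamma=1.0))
```

Only the two negative elements contribute to Q, so non-negative solutions are not penalized at all and the cost reduces to the entropy.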

The exact constrained minimum is reached only as γ → ∞, but the presence of noise does not allow this as the optimal solution. We also cannot start with a large initial value of γ, because this causes the solution to converge towards some local minimum, which is not desired. So the algorithm starts with a smaller value of γ. Using the solution from the first step as input to the next iteration, γ can be optimized. This can be repeated until γ exceeds the signal-to-noise ratio of the data (if known). In our algorithm we do not try to optimize γ; since γ=1 gave a reasonable solution, we used that as the penalty parameter.

4.3 Results of component spatial and spectral pattern analysis

The method explained above is used to decompose the [I] matrix into its source signals [S] and their corresponding probabilities [P].

The algorithm is started with an initial guess for [T] to minimize the cost function, which is based on the entropy function. The entropy can be calculated for either [P] or [S], or it can be a combination of both matrices. This algorithm uses the entropy of the source signals. The constraint is that the elements of both [S] and [P] must be non-negative. A penalty function with scaling γ=1 is used in order to turn the constrained problem into an unconstrained one. Different values of γ were tried; if γ is started with a large value in (4.31), the algorithm converges to some local minimum because of the higher emphasis on the non-negativity constraint than on the entropy function. The value γ=1 gave a reasonably good result, so it is used in this implementation.

The 'fminsearch' function in MATLAB is used to optimize the cost function. It uses the Nelder-Mead simplex algorithm to minimize a given unconstrained function. The inputs it requires are the cost function of equation (4.31), written in the form of a function, and the initial value [TO] of the matrix to be optimized. Any further inputs to the cost function depend on how that function is written. The algorithm converges towards some optimal value of [T]. This [T], along with [U], [Λ], [V], is used to reconstruct the final values of [S] and [P]. The results obtained are shown below.



The source signals obtained, which are the columns of matrix [S], are shown in Figure 4.3.

Figure 4.3: (a) Original source signals and (b) results obtained

The plots on the left-hand side of Figure 4.3 are the original signals used to generate the samples, and the plots on the right-hand side are the source signals to which the algorithm has converged. The peaks and valleys are in nearly the same positions, but there is some scaling involved which cannot be rectified using this algorithm. Also, the order of convergence cannot be controlled. The order obtained here is TNT, DNT, and RDX.


The rows of the corresponding probability matrix [P], arranged in the form of 30*30 plots, are shown in Figure 4.4.

Figure 4.4: (a) The original samples generated and (b), (c), (d) the rows of [P] arranged as 30*30 plots

The limitation of this algorithm is that we cannot be sure that the components obtained are the true ones; they are only optimal under the entropy minimization criterion. So some further processing is required.


4.3.1 Further Processing for confirmation

The algorithm will converge to the significant number of source signals. In general applications, however, the algorithm is used to find out whether the targeted signals are present in the multispectral image. So some further processing is required for the spectral responses obtained and their corresponding spatial patterns.

4.3.1.1 Cross-Correlation

From the algorithm explained above, one should be able to obtain [S] and [P]. The algorithm will converge towards some solution irrespective of whether the target signals being searched for are present. Since the goal is to detect the presence of a target signal in the multispectral data, the result has to be processed further to confirm whether a source signal obtained is a target signal or not. Since information about the target signals is available, the aim is to find out whether each source signal obtained is one from the database that contains the target signals.

Cross-correlation is a measure of the similarity of two waveforms. It is also known as a sliding dot product or sliding inner product.

For discrete functions, the cross-correlation is given by

$$(f \star g)[n] \overset{\mathrm{def}}{=} \sum_{m=-\infty}^{\infty} f^{*}[m]\, g[n+m] \tag{4.34}$$

where $f^{*}$ denotes the complex conjugate of f.

Cross-correlation is similar to convolution. Convolution involves reversing one signal, shifting it and then multiplying it by the other signal, whereas correlation does not involve the reversal; it only shifts and multiplies. Auto-correlation is the cross-correlation of a signal with itself. Cross-correlation is a measure of the degree to which two series are correlated.

This is used to decide whether a source signal obtained is one from the database that contains the target signals. For each source signal obtained, cross-correlation is performed with each of the target signals in the database, and the maximum value of the correlation output decides whether the source is a target signal or not. If the maximum value of the correlation output, normalized with respect to the auto-correlation output of the target signal, is greater than a certain prefixed threshold value, then the signals are considered to be similar and it is concluded that the target signal is present in the multispectral image.
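The decision rule described above can be sketched in a few lines. This is a plain-Python illustration for real-valued signals, not the thesis implementation; the threshold of 0.8, the function names and the short test signals are assumptions made for the example.

```python
def cross_correlation(f, g):
    """Cross-correlation of two real sequences for every lag, as in (4.34)."""
    n_f, n_g = len(f), len(g)
    out = []
    for lag in range(-(n_f - 1), n_g):
        total = 0.0
        for m in range(n_f):
            k = m + lag
            if 0 <= k < n_g:
                total += f[m] * g[k]
        out.append(total)
    return out


def match_score(source, target):
    """Peak cross-correlation divided by the target's auto-correlation peak."""
    peak = max(cross_correlation(target, source))
    auto_peak = max(cross_correlation(target, target))
    return peak / auto_peak


def is_target(source, target, threshold=0.8):
    """Declare a match when the normalized peak reaches the threshold."""
    return match_score(source, target) >= threshold


target = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 2.0, 0.0]
other = [2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

print(match_score(target, target))
print(match_score(other, target))
```

Two sequences of 200 samples give 399 lags, which matches the horizontal range of the correlation plots. Because only the target's auto-correlation is used for normalization, a scaled copy of the target would score above 1, so in practice the source signals may need to be normalized first.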

So now, for each of the three signals obtained previously, we perform correlation with all three explosive source signals used to generate the samples and find the maximum of the correlation output. In each of the following figures, the source signal obtained is shown in the left column and its correlation output with the explosive database, in the order RDX, TNT and DNT, is shown in the right column.

From Figure 4.5, the first source signal obtained gives a large peak value only with TNT. Hence the first source obtained is identified as TNT. Similarly, from Figure 4.6, the second source signal obtained gives a large peak value only with DNT. Hence the second source obtained is identified as DNT. From Figure 4.7, the third source signal obtained gives a large peak value only with RDX. Hence the third source obtained is identified as RDX.


Figure 4.5: (a) First source signal obtained, (b) its correlation with RDX, (c) its correlation with TNT, (d) its correlation with DNT

Figure 4.6: (a) Second source signal obtained, (b) its correlation with RDX, (c) its correlation with TNT, (d) its correlation with DNT

Figure 4.7: (a) Third source signal obtained, (b) its correlation with RDX, (c) its correlation with TNT, (d) its correlation with DNT

4.3.1.2 Thresholding to find the spatial position

The spatial significance of each of the components is decided based on the values of the matrix [P]. After the correlation output confirms that explosive traces are present in the multispectral image, it is equally important to identify clearly the exact spatial location of the explosive in the image. By now we can observe that, in each case where the resultant signal matches an explosive from the database, the maximum correlation output is significantly larger than the other results. A threshold value is set to decide whether the source signal is an explosive. Since each row of [P] corresponds to one of the source signals, the rows of [P] corresponding to target signals are then passed through a thresholding stage, where each pixel probability is compared with the maximum value of that row to finally confirm whether the target signal is present at that pixel or not.

If any of the resultant signals are identified as explosives, then we have to observe their spatial probability in the multispectral image. The [P] matrix obtained has the probability values arranged in rows, and each row is rearranged in the form of a 30*30 plot, one for each source signal obtained. Here again a threshold value is used to decide whether a pixel contains the explosive or not. The smaller the percentage of explosive present in the sample, the smaller the threshold value required for it to be detected as explosive.
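The thresholding stage can be sketched as follows. This plain-Python example assumes that the threshold is expressed as the row maximum divided by a chosen number and that pixels are stored row by row; both the function name `detect_pixels` and the tiny 3*3 example are illustrative.

```python
def detect_pixels(p_row, divisor, width=30):
    """Threshold one row of [P] at max(row)/divisor and reshape to a 2-D mask."""
    threshold = max(p_row) / divisor
    flags = [value >= threshold for value in p_row]
    return [flags[r * width:(r + 1) * width]
            for r in range(len(flags) // width)]


# A 3*3 toy image stored as one row of [P] (nine pixels).
p_row = [0.0, 0.1, 0.9,
         0.05, 0.5, 0.2,
         0.0, 0.0, 0.3]

for mask_row in detect_pixels(p_row, divisor=2, width=3):
    print(mask_row)
for mask_row in detect_pixels(p_row, divisor=5, width=3):
    print(mask_row)
```

Lowering the threshold from Max/2 to Max/5 adds the weaker pixels to the detection mask, which is the trade-off discussed in the text.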

With a very low threshold value, the probability plot of each of the resultant signals detected as explosives is as shown below.


Figure 4.8: (a) First row of [P] as a 30*30 plot, (b) finally detected pixels after thresholding

Figure 4.9: (a) Second row of [P] as a 30*30 plot, (b) finally detected pixels after thresholding

Figure 4.10: (a) Third row of [P] as a 30*30 plot, (b) finally detected pixels after thresholding

The shortcoming of this procedure is that in practical cases, when the database of target signals is very large, correlating each source obtained with all the signals in the database involves a lot of computation and hence is not recommended. So we tried component spectral and spatial pattern analysis with reference.

4.4 Component spectral and spatial pattern analysis with reference

In the previous method there is no fixed order in which the source signals converge. If the existing algorithm can be modified such that the order of convergence is fixed, then the number of computations required for the correlation is reduced drastically.

Since prior knowledge of the target signals is available, it can be used as reference signal information to make an initial guess for the value of [T] when solving the unconstrained problem using the simplex algorithm. The algorithm initially starts with the assumption that the spectral response matrix [S] is known to us as [Sref]. From equation (4.8) we can see that

$$[S] = [U][\Lambda][T]^{-1} \tag{4.35}$$

So we can write

$$[S_{ref}] = [U][\Lambda][T_O]^{-1} \tag{4.36}$$

Since the values of [U] and [Λ] are known and the value of [Sref] is assumed, the only unknown in the above equation is [TO], which can be obtained using

$$[T_O] = [S_{ref}]^{-1}[U][\Lambda] \tag{4.37}$$

Since [Sref] is in general not square, its inverse here is to be understood as a pseudo-inverse. We start with [TO] as the initial guess of [T] in the unconstrained optimization problem solved using the simplex algorithm.



The figures below show the output of the two methods: the first is the output obtained using a randomly generated initial value [TO], and the second is the output obtained using the reference signal to generate the initial value [TO].

Figure 4.11: Result of [S] obtained using a random value of [TO]

Figure 4.12: Result of [S] obtained using the value of [TO] generated using [Sref]

Comparing the results, component spectral and spatial pattern analysis with reference gives a good result and, in the later stage, involves only one correlation computation for each source signal.

From the [S] matrix obtained by this method, we can confirm whether the source signals are explosives by using correlation. We then threshold the rows of [P] corresponding to those explosive signals to detect the explosive pixels. The final result obtained by this method is shown below.

Figure 4.13: Original samples generated

Figure 4.14: Finally detected pixels

4.5 Results when percentage of explosives varies in the samples

The results seen in the previous cases, where the percentage of explosives present in the samples is larger, are situations where detection is easy. In this part of the chapter we discuss the cases where the percentage of explosives in the samples varies.

In the case of independent component analysis we do not use any arbitrary threshold values, and the results obtained have no chance of false positives (a pixel being detected that does not actually contain explosive). We can also say that there is no chance of false negatives (a pixel that contains explosive not being detected) if the percentage of explosive is reasonably large.

In the case of the component spatial and spectral pattern analysis method, the final detected samples depend upon the threshold value taken for detecting pixels in [P]. If the threshold is very low there is a chance of false positives, and if the threshold is large there is a chance of false negatives. Each row of [P], corresponding to one of the explosives used to generate the samples, is plotted as a 30*30 plot in the figure below.


Figure 4.15: (a) Samples generated, (b) first row of [P] as a 30*30 plot, (c) second row of [P] as a 30*30 plot, (d) third row of [P] as a 30*30 plot

From the figures we can clearly observe the variation in the percentage of explosives present in the samples. The ultimate goal of the thesis is to detect the presence of explosive, not to identify the type of explosive present at a particular position. In the table shown below we find the total false positives and false negatives for different threshold values. There are 87 pixels that contain explosives among the 900 pixels shown in the figure above.


Table 4.1: False alarms of the component spatial and spectral pattern analysis method

| S.No | Threshold | False positives | False negatives | Total number of pixels | Total number of pixels with explosives |
|---|---|---|---|---|---|
| 1 | Max/2 | 0 | 26 | 900 | 87 |
| 2 | Max/5 | 0 | 13 | 900 | 87 |
| 3 | Max/6 | 1 | 13 | 900 | 87 |
| 4 | Max/7 | 65 | 5 | 900 | 87 |
| 5 | Max/8 | 90 | 0 | 900 | 87 |
| 6 | Max/10 | 300 | 0 | 900 | 87 |

Because false negatives are more dangerous than false positives, we should choose a smaller threshold value, such that all the explosives are detected.
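A threshold sweep like the one in the table can be reproduced with a few lines of code once the ground truth is known. The sketch below is plain Python with made-up data; the six-pixel example, the divisors and the function names are assumptions for illustration and do not reproduce the thesis numbers.

```python
def false_alarm_counts(detected, truth):
    """Return (false positives, false negatives) for two boolean lists."""
    fp = sum(1 for d, t in zip(detected, truth) if d and not t)
    fn = sum(1 for d, t in zip(detected, truth) if t and not d)
    return fp, fn


def sweep(p_row, truth, divisors):
    """Count false alarms for thresholds of the form max(p_row)/divisor."""
    rows = []
    for k in divisors:
        threshold = max(p_row) / k
        detected = [value >= threshold for value in p_row]
        rows.append((k,) + false_alarm_counts(detected, truth))
    return rows


# Toy example: three explosive pixels followed by three clean pixels.
p_row = [0.9, 0.5, 0.2, 0.12, 0.05, 0.0]
truth = [True, True, True, False, False, False]

for divisor, fp, fn in sweep(p_row, truth, divisors=(2, 5, 10)):
    print(divisor, fp, fn)
```

In this toy case the middle threshold detects every explosive pixel without false alarms, while the largest divisor removes all false negatives at the cost of one false positive.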

4.6 Conclusions

Valid results are obtained using component spatial and spectral pattern analysis when the trace of explosive present in the sample is significant. However, this method involves a lot of further computation to identify whether a source signal obtained is one from the database of explosives. Since in practical conditions the number of explosives is very large, this is not computationally affordable and consumes more time.

So the algorithm is improved to component spatial and spectral pattern analysis with reference, which maintains a fixed order for the converged source signals. This reduces the computational effort in the later stages.



CHAPTER 5

INDEPENDENT COMPONENT ANALYSIS

5.1 Introduction

Spectral unmixing is a technique in which the original source signals are extracted from the mixed signals. Some of the commonly used unmixing techniques are Least Square Methods [45], Principal Component Analysis [46], Independent Component Analysis (ICA) [47-52] and Kalman Filtering [53-54]. Of these, ICA is a very well developed algorithm for spectral unmixing. ICA is a method in which we try to find a linear transform such that the transformed components are mutually independent, or as independent as possible.

In the last decade independent component analysis has received an increasing amount of attention from the signal processing community, because signal separation is an important application of independent component analysis. In the literature, the independent component analysis problem is addressed under labels such as blind source separation, signal copy and waveform-preserving estimation [55]. Data analysis and compression, Bayesian detection, localization of sources, blind identification and de-convolution are potential applications of independent component analysis [56].

Independent component analysis involves disciplines like neural networks, statistics, pattern recognition, information theory and system identification. Principal component analysis tries to find components which are uncorrelated, whereas independent component analysis tries to find components that are mutually statistically independent. Principal component analysis involves only second order statistics, whereas independent component analysis involves higher order statistics. Using principal component analysis, mixing matrices and source signals can only be found up to an orthogonal transformation; this is known as the rotational invariance property of principal component analysis. Using independent component analysis, the original mixing matrix and source signals can be retrieved (up to scaling and ordering).

Independent component analysis can be defined as a method for decomposing a linear mixture that contains unknown source signals into independent components, relying on the assumption that the source signals are statistically independent. All we know is X, and independent component analysis must be capable of obtaining both A and S in equation (5.1) shown below. The assumption is that the Si obtained are statistically independent. We must also assume that the independent components have non-gaussian distributions. The input 'X' for the independent component analysis problem can be expressed in mathematical form as

$$X_{L \times N} = A_{L \times M}\, S_{M \times N} \tag{5.1}$$

S is a matrix with each row representing the values of a source signal at N different frequencies,

X is obtained from a linear transformation of the original source signals, with each row representing the values of a mixed signal at N different frequencies,

A is the mixing matrix.

The goal of independent component analysis is, given X, to find the value of A such that the independent source signals S can be obtained by equation (5.2) shown below

$$S = A^{-1} X = W_{M \times L}\, X_{L \times N} \tag{5.2}$$

This chapter discusses the methods to obtain this and the results obtained. Section 5.2 gives a brief description of the history of ICA and the criteria for the choice of algorithm, then explains the FastICA algorithm and shows the result obtained. Section 5.3 discusses the ICA with reference algorithm and section 5.4 shows the results of the algorithm. Section 5.5 explains FastICA with reference. Section 5.6 discusses ICA with multi-reference and section 5.7 explains the results of the algorithm.

5.2 FastICA

5.2.1 Criteria to choose the algorithm

The main choices are between the different statistical estimation criteria and between one-unit and multi-unit methods. This thesis uses FastICA, which employs a non-gaussianity criterion. The reasons for the choice are given below.

In methods that measure independence by non-gaussianity, non-gaussianity can be measured on a single projection, i.e., it is possible to derive only a small number of ICs. This is not possible for other kinds of statistical criteria. Since our emphasis is on obtaining only a limited number of ICs, we opt for non-gaussianity as the statistical criterion of the ICA algorithm.

The selection of an ICA method can be decomposed as

ICA method = objective function + optimization algorithm (5.3)

In general, the statistical properties influence the choice of objective function, and the algorithmic properties depend on the optimization method chosen. Based on our requirements we have decided to use FastICA, which employs non-gaussianity as a measure of independence. So a function which measures non-gaussianity is our objective function.

Orthogonalization

As mentioned, non-gaussianity is used as a measure of independence when we are trying to estimate one or a few ICs. To find a few ICs, the same one-unit ICA algorithm has to be run several times, making sure that it does not converge to the same maximum each time. This requires orthogonalization of the vectors w1, w2, w3, ..., wn. There are two different methods for achieving this de-correlation.



Deflationary Orthogonalization

In this method the independent components are found one by one. After estimating 'p' independent components, i.e. 'p' vectors w1, w2, ..., wp, any of the one-unit ICA algorithms is run again to obtain wp+1; after every iteration the projections of wp+1 onto the 'p' previously estimated vectors are subtracted from wp+1, and wp+1 is renormalized. Written for the pth vector, the subtraction step is

$$w_p \leftarrow w_p - \sum_{j=1}^{p-1} (w_p^{T} w_j)\, w_j \tag{5.4}$$
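The subtraction and renormalization step of (5.4) is a Gram-Schmidt style projection. The following plain-Python sketch assumes the previously found vectors are already orthonormal; the function names and the three-dimensional example are illustrative.

```python
import math


def dot(a, b):
    """Inner product of two vectors stored as lists."""
    return sum(x * y for x, y in zip(a, b))


def deflate(w, previous):
    """Remove from w its projections onto the previous vectors, then renormalize."""
    # All projections are taken from the incoming w, as written in (5.4).
    coefficients = [dot(w, wj) for wj in previous]
    for c, wj in zip(coefficients, previous):
        w = [wi - c * wji for wi, wji in zip(w, wj)]
    norm = math.sqrt(dot(w, w))
    return [wi / norm for wi in w]


w1 = [1.0, 0.0, 0.0]
w2 = deflate([1.0, 1.0, 0.0], [w1])
w3 = deflate([1.0, 1.0, 1.0], [w1, w2])

print(w2)
print(w3)
```

In a one-unit ICA loop this step would be applied after every update of the new vector, so that it cannot converge to a component that has already been found.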

The disadvantage of this method is that any estimation errors in the first vectors accumulate in the subsequent ICs through the orthogonalization. The alternative method is symmetric orthogonalization.

Symmetric Orthogonalization

In this method the one-unit algorithms are run for all the ICs in parallel, and the orthogonalization of all the wi is done by special symmetric methods.

$$W \leftarrow (W W^{T})^{-1/2}\, W \tag{5.5}$$

Either of these methods is selected based on the requirements.
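Equation (5.5) needs an inverse matrix square root, which is usually computed with an eigenvalue decomposition. As a swapped-in alternative that avoids any linear algebra library, the sketch below uses a simple fixed-point iteration, W <- 1.5 W - 0.5 W W^T W after an initial rescaling, which drives all singular values of W to one and so approaches the same result. It is a plain-Python illustration with hypothetical names and a made-up 2*2 matrix, not the method used in the thesis.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def symmetric_orthogonalize(W, iterations=50):
    """Approximate (W W^T)^(-1/2) W by iterating W <- 1.5 W - 0.5 W W^T W."""
    WWt = matmul(W, transpose(W))
    # Rescale so that no singular value of W exceeds one (row-sum norm bound).
    scale = max(sum(abs(v) for v in row) for row in WWt) ** 0.5
    W = [[v / scale for v in row] for row in W]
    for _ in range(iterations):
        WWtW = matmul(matmul(W, transpose(W)), W)
        W = [[1.5 * w - 0.5 * c for w, c in zip(row_w, row_c)]
             for row_w, row_c in zip(W, WWtW)]
    return W


W0 = [[1.0, 0.5], [0.2, 1.0]]
W = symmetric_orthogonalize(W0)
gram = matmul(W, transpose(W))  # should be close to the identity matrix
print(gram)
```

Unlike the deflationary scheme, no vector is privileged here: all rows are adjusted together, so estimation errors are not passed from earlier components to later ones.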

The three major criteria we have considered for the choice of this algorithm are:

The first criterion is the choice between estimating all the ICs simultaneously or one by one. Based on this we decide between symmetric and hierarchical decorrelation. Since we are trying to find the ICs one after the other, we go for an algorithm that supports hierarchical decorrelation.

The second criterion is the non-linearity to be chosen. For FastICA, tanh is used. For some gradient-based algorithms, other functions need to be used. For a general problem the function used by FastICA is appropriate.

The third criterion is the choice between on-line and batch algorithms. When the data is readily available we can use FastICA algorithms. In cases where the data changes continuously, it is better to use gradient methods or a combination of both. Since we have the data readily available, we go for FastICA.

From the above explanation, the FastICA that we are going to see in this chapter employs a measure of independence as the cost function and uses the hierarchical de-correlation method.

5.2.2 Estimation of Independence: Measure of non-gaussianity

Since the main aim of independent component analysis is to find the components that are

statistically independent a statistical criteria for measure of independence is required. The

statistical criteria like mutual information, likelihood and non-gaussianity can be used as

measure of independence. But reason for the choice of non-gaussianity as the measure of

independence are mentioned previously.

Two variables $x_1$ and $x_2$ are said to be independent if information on the value of $x_1$ does not give any information on the value of $x_2$. A weaker form of independence is uncorrelatedness. Since independence implies uncorrelatedness, independent component analysis can be constrained so that the estimated components are always uncorrelated.

If $x_1$ and $x_2$ are independent, then they are also uncorrelated, i.e.,

$$ \operatorname{cov}(x_1, x_2) = 0 \tag{5.6} $$

where the covariance between $x_1$ and $x_2$ is

$$ \operatorname{cov}(x_1, x_2) = \operatorname{mean}(x_1 x_2) - \operatorname{mean}(x_1)\,\operatorname{mean}(x_2) \tag{5.7} $$



For gaussian source signals the independent component analysis model can be estimated only up to an orthogonal transformation, i.e., A cannot be identified exactly. Intuitively, non-gaussianity can therefore be used as an indicator of independence.

The covariance of two statistically independent variables is always zero, but the converse is not always true. Only in the case of jointly gaussian variables does zero covariance imply independence. According to the central limit theorem, a sum of independent variables tends to be more gaussian than the individual variables. For example

$$ X_i = A_1 S_1 + A_2 S_2 \tag{5.8} $$

is more gaussian than $S_1$ and $S_2$. The central limit theorem therefore suggests that the least gaussian signals obtainable from the data are the independent source signals. So one method of finding an independent component is to maximize the non-gaussianity of $W^{T}X$.
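The statement that zero covariance does not imply independence can be checked with a small numerical example. The sketch below is an illustration added here, with the variables chosen as assumptions: $x_1$ is uniform on $[-1, 1]$ and $x_2 = x_1^2$ is completely determined by $x_1$, yet the covariance (5.7) is close to zero.

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.uniform(-1.0, 1.0, size=100_000)
x2 = x1 ** 2                       # x2 is a deterministic function of x1

# Sample covariance as in (5.7): mean(x1*x2) - mean(x1)*mean(x2)
cov = np.mean(x1 * x2) - np.mean(x1) * np.mean(x2)

# The dependence is visible in higher-order statistics, e.g. between x1**2 and x2
corr_sq = np.corrcoef(x1 ** 2, x2)[0, 1]

print(f"cov(x1, x2) = {cov:.4f}, corr(x1^2, x2) = {corr_sq:.4f}")
```

This is why ICA relies on statistics beyond second order, such as the non-gaussianity measures discussed next.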

Non-gaussianity is one way to measure the independence of a signal. To use it as a measure of independence we need a quantitative measure of non-gaussianity. Kurtosis and negentropy, explained below, give such quantitative measures.

Kurtosis

This is the classical measure of non-gaussianity. For data preprocessed to have unit variance it reduces to a shifted version of the fourth moment. It is given by

$$ \operatorname{kurt}(y) = E\{y^{4}\} - 3\,(E\{y^{2}\})^{2} \tag{5.9} $$

where $E\{\cdot\}$ denotes the expected value; the mathematical expectation $E\{y\}$ is the mean, or first moment, of $y$. Here $y$ is assumed to have zero mean.

If we assume that $y$ has unit variance, then

$$ \operatorname{kurt}(y) = E\{y^{4}\} - 3 \tag{5.10} $$



So kurtosis is a normalized version of the fourth moment. For a gaussian variable

$$ E\{y^{4}\} = 3\,(E\{y^{2}\})^{2} \tag{5.11} $$

so the kurtosis of a gaussian variable is zero. Non-zero kurtosis implies that the data is non-gaussian. Random variables with negative kurtosis are called sub-gaussian and those with positive kurtosis are called super-gaussian. Non-gaussianity is generally measured using the absolute value of kurtosis; the square of kurtosis can also be used.

The drawback of kurtosis is that it is sensitive to outliers: its value may depend on only a few observations in the tails of the distribution. So kurtosis is not robust enough for independent component analysis.
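The sign conventions above can be illustrated numerically. The following sketch is an example added here, not taken from the thesis; the helper name `kurtosis` and the three test distributions are assumptions. It evaluates (5.9) on gaussian, uniform and Laplace samples.

```python
import numpy as np

def kurtosis(y):
    """Kurtosis as in (5.9): E{y^4} - 3 (E{y^2})^2, after removing the mean."""
    y = y - np.mean(y)
    return np.mean(y ** 4) - 3.0 * np.mean(y ** 2) ** 2

rng = np.random.default_rng(0)
n = 200_000
k_gauss = kurtosis(rng.normal(size=n))            # near zero
k_uniform = kurtosis(rng.uniform(-1, 1, size=n))  # negative: sub-gaussian
k_laplace = kurtosis(rng.laplace(size=n))         # positive: super-gaussian

print(k_gauss, k_uniform, k_laplace)
```

Because the fourth power weights large values heavily, adding a few extreme samples to any of these arrays changes the estimate noticeably, which is the robustness problem described above.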

Negentropy

Negentropy is based on the information-theoretic quantity of differential entropy. The entropy of a random variable can be interpreted as the degree of information that an observation of the variable gives. For a discrete random variable $Y$ the entropy $H$ is given by

$$ H(Y) = -\sum_{i} P(Y = a_i)\,\log P(Y = a_i) \tag{5.12} $$

where the $a_i$ are the possible values of $Y$. A fundamental result of information theory is that a gaussian variable has the largest entropy among all random variables of equal variance.

Negentropy is the difference between the entropy of a gaussian signal $y_{gauss}$, with zero mean and the same variance as $y$, and the entropy of the signal $y$ itself. The more random (unstructured) a variable is, the larger its entropy and hence the smaller its negentropy. Non-gaussianity can therefore be measured using negentropy, which is

$$ J(y) = H(y_{gauss}) - H(y) \tag{5.13} $$

Negentropy is always non-negative and is zero only if $y$ is gaussian. As far as statistical properties are concerned, negentropy is the optimal estimator of non-gaussianity.

This measure is robust. Its drawback is that it is computationally difficult.

Approximations of Negentropy

To overcome the drawbacks of kurtosis and of the direct computation of negentropy, approximations of negentropy are used. The classical method approximates negentropy using higher-order moments:

$$ J(y) \approx \frac{1}{12}\,E\{y^{3}\}^{2} + \frac{1}{48}\,\operatorname{kurt}(y)^{2} \tag{5.14} $$

This approximation suffers from the same non-robustness as kurtosis. To overcome this we use

$$ J(y) \approx \rho\,\left[E\{G_i(y)\} - E\{G_i(y_{gauss})\}\right]^{2} \tag{5.15} $$

$$ G_1(y) = \frac{1}{a_1}\,\log\cosh(a_1 y) \tag{5.16} $$

$$ G_2(y) = -\frac{1}{a_2}\,\exp\!\left(-\frac{a_2 y^{2}}{2}\right) \tag{5.17} $$

$$ G_3(y) = \frac{y^{4}}{4} \tag{5.18} $$

where $1 \le a_1 \le 2$ and $a_2 \approx 1$. $G_1$ is a good general-purpose choice, $G_2$ is good for super-gaussian signals and $G_3$ is good for sub-gaussian signals. These approximations offer a good compromise between the properties of kurtosis and negentropy, and they are robust.
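As an illustration of (5.15) with the contrast function $G_1$ of (5.16), the sketch below estimates the approximation for a gaussian and a Laplace sample. It is a minimal example under stated assumptions: the function names, the choice $\rho = 1$, the standardization of the input and the Riemann-sum evaluation of $E\{G(y_{gauss})\}$ are all choices made here, not details from the thesis.

```python
import numpy as np

def G1(y, a1=1.0):
    """Contrast function (5.16): log cosh(a1*y) / a1."""
    return np.log(np.cosh(a1 * y)) / a1

def gaussian_expectation(G, half_width=10.0, num=20001):
    """E{G(v)} for v ~ N(0, 1), evaluated by a Riemann sum on a fine grid."""
    v = np.linspace(-half_width, half_width, num)
    pdf = np.exp(-0.5 * v ** 2) / np.sqrt(2.0 * np.pi)
    return np.sum(G(v) * pdf) * (v[1] - v[0])

def negentropy_approx(y, G=G1, rho=1.0):
    """Approximation (5.15) for a sample y, standardized to zero mean, unit variance."""
    y = (y - np.mean(y)) / np.std(y)
    return rho * (np.mean(G(y)) - gaussian_expectation(G)) ** 2

rng = np.random.default_rng(0)
J_gauss = negentropy_approx(rng.normal(size=100_000))
J_laplace = negentropy_approx(rng.laplace(size=100_000))
print(J_gauss, J_laplace)   # the Laplace sample scores clearly higher
```

Since $G_1$ grows only linearly for large arguments, a few outliers influence this estimate much less than they influence the fourth moment used by kurtosis.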

5.2.3 Fast ICA Algorithm

FastICA methods are available for all the statistical criteria explained previously. Since this thesis uses non-gaussianity as the measure of independence, we explain the corresponding algorithm here. In this method negentropy is used as the cost function and the algorithm tries to find the component for which the cost function is maximized. The negentropy is given by

$$ J(y) = H(y_{gauss}) - H(y) \tag{5.19} $$

where $y_{gauss}$ is a gaussian random variable with zero mean and the same variance as the output $y$, and $H(\cdot)$ denotes the differential entropy.

A flexible and reliable approximation of negentropy is given by Hyvarinen [57], as shown in equation (5.15):

$$ J(y) \approx \rho\,\left[E\{G(y)\} - E\{G(y_{gauss})\}\right]^{2} \tag{5.20} $$

where $G(\cdot)$ is a non-quadratic function, given by equations (5.16), (5.17) and (5.18).

If the algorithm is used to generate only one independent component, the component obtained is the one with maximum negentropy.

The algorithm aims to find a weight vector $W$ and updates it in each iteration, using the learning rule, in a direction such that $W^{T}X$ has maximum non-gaussianity.

The basic steps in FastICA are [58]

1.) Choose an initial weight vector $W$

2.) $W^{+} = E\{X\,g(W^{T}X)\} - E\{g'(W^{T}X)\}\,W$

3.) $W = W^{+} / \|W^{+}\|$

4.) If not converged, go to 2

Here $g$ is the derivative of the function $G$:

$$ g_1(u) = G_1'(u) = \tanh(a_1 u); \qquad g_2(u) = G_2'(u) = u\,e^{-u^{2}/2} \tag{5.21} $$

where $1 \le a_1 \le 2$; often $a_1 = 1$ is used.
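The four steps above can be sketched in NumPy for a single unit. This is a minimal illustration rather than the implementation used in the thesis: the helper names, the whitening step, the two synthetic sources and the mixing matrix are assumptions made here, and $g = \tanh$ with $a_1 = 1$ is used.

```python
import numpy as np

def whiten(X):
    """Center the rows of X and transform them to have identity covariance."""
    Xc = X - X.mean(axis=1, keepdims=True)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc))
    return (eigvecs / np.sqrt(eigvals)) @ eigvecs.T @ Xc

def fastica_one_unit(Z, a1=1.0, max_iter=200, tol=1e-8, seed=0):
    """One-unit FastICA on whitened data Z (one signal per row), with g = tanh."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=Z.shape[0])
    w /= np.linalg.norm(w)                                  # step 1
    for _ in range(max_iter):
        y = w @ Z
        g = np.tanh(a1 * y)
        g_prime = a1 * (1.0 - g ** 2)
        w_new = (Z * g).mean(axis=1) - g_prime.mean() * w   # step 2
        w_new /= np.linalg.norm(w_new)                      # step 3
        converged = abs(abs(w_new @ w) - 1.0) < tol         # step 4
        w = w_new
        if converged:
            break
    return w

rng = np.random.default_rng(1)
S = np.vstack([rng.laplace(size=5000), rng.uniform(-1, 1, size=5000)])
A = np.array([[1.0, 0.5], [0.3, 1.0]])      # hypothetical mixing matrix
Z = whiten(A @ S)
y = fastica_one_unit(Z) @ Z
corrs = [abs(np.corrcoef(y, s)[0, 1]) for s in S]
print(corrs)   # one entry should be close to 1
```

The convergence test compares the absolute value of the dot product with one because $w$ and $-w$ describe the same component.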



Convergence means that the old and new values of $W$ point in the same direction, i.e., the absolute value of their dot product is (almost) equal to one. The algorithm can be extended to several units. In this case, in order to prevent different units from converging to the same output, the outputs $W_1^{T}X, \ldots, W_n^{T}X$ must be de-correlated after every iteration. The de-correlation is performed using the deflationary orthogonalization explained above.

Using this algorithm the given data can be decomposed into its original source signals. The accuracy of the result depends on the number of observed signals used to estimate the independent signals. In general, accurate results can be obtained only if the number of signals used is greater than or equal to the number of source signals present in the samples. The number of source signals is equal to the number of significant eigenvalues of the samples.
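One way to count the significant eigenvalues is sketched below. The thesis does not state how "significant" is decided, so the relative threshold, the function name and the synthetic data (three sources observed through six mixtures) are assumptions made for this illustration.

```python
import numpy as np

def count_significant_eigenvalues(X, rel_threshold=1e-3):
    """Count covariance eigenvalues larger than rel_threshold times the largest."""
    eigvals = np.linalg.eigvalsh(np.cov(X))
    return int(np.sum(eigvals > rel_threshold * eigvals.max()))

rng = np.random.default_rng(0)
S = rng.laplace(size=(3, 2000))                  # three hypothetical sources
A = np.array([[1.0, 0.5, 0.2],
              [0.3, 1.0, 0.4],
              [0.2, 0.6, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])                  # six observed mixtures
X = A @ S + 1e-4 * rng.normal(size=(6, 2000))    # small additive noise
n_sources = count_significant_eigenvalues(X)
print(n_sources)
```

With noisy measured data the gap between signal and noise eigenvalues is usually less clear than in this synthetic case, so the threshold would have to be chosen from the eigenvalue spectrum of the actual data.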

To get a clear understanding of the algorithm, three samples are generated with the explosives and are used to verify its results. Each of the three samples contains every source signal, in different proportions. When the FastICA algorithm is applied to the generated data, the results obtained are as shown in Figure 5.1. The algorithm is capable of recovering the original source signals.



Figure: 5. 1 (a) Sources used to generate samples, (b) results obtained

The algorithm discussed so far is unconstrained ICA. From Figure 5.1 it is clearly visible that the results do not come in any specific order and that their magnitude is not controllable. These drawbacks can be overcome using the constrained ICA [59] method discussed below.

The two ambiguities of the above algorithm that can be handled in this way are discussed below.

We cannot determine variances (energies) of independent components

Since both S and A are unknown, any scalar multiplier applied to a source $S_i$ can always be compensated by dividing the corresponding column $A_i$ of A by the same scalar. We can overcome this by fixing the magnitude of the independent components, i.e., by constraining them to have unit variance, $E\{S_i^{2}\} = 1$. This still leaves the ambiguity of sign unresolved.


This can be formulated as

$$ \text{Maximize } J(y) \tag{5.22} $$

subject to $h(W) = 0$, $h(W) = [h_1(W), \ldots, h_M(W)]^{T}$,

where $h_i(W_i) = W_i^{T}W_i - 1$ for $i = 1, 2, \ldots, M$, which constrains the norm of each row of the matrix W to be one.

This can be solved using the Lagrangian multiplier method [60].

We cannot determine order of independent components

Since both S and A are unknown, we can write

$$ X = A P^{-1} P S \tag{5.23} $$

where P is a permutation matrix and PS contains the independent components in a different order. Since $P^{-1}P$ makes no difference to X, the order of the components is ambiguous.
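Both ambiguities discussed above can be verified numerically. In the sketch below, which is an illustration with arbitrarily chosen matrices (all values are assumptions), the sources are rescaled by a diagonal matrix D and reordered by a permutation matrix P, and a compensating mixing matrix reproduces exactly the same observations X.

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.laplace(size=(3, 1000))          # hypothetical sources
A = rng.normal(size=(3, 3))              # hypothetical mixing matrix

P = np.eye(3)[[2, 0, 1]]                 # permutation matrix (reorders the sources)
D = np.diag([2.0, -0.5, 3.0])            # arbitrary non-zero scalings, one negative

X = A @ S
S_alt = D @ P @ S                                  # rescaled, reordered sources
A_alt = A @ np.linalg.inv(P) @ np.linalg.inv(D)    # compensating mixing matrix
X_alt = A_alt @ S_alt

print(np.allclose(X, X_alt))
```

Since X alone cannot distinguish (A, S) from (A_alt, S_alt), extra constraints such as unit variance or an ordering index are needed to fix the scale and the order.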

Using the constrained ICA method the independent components can be arranged in descending order according to a statistical measure defined by Ɫ(u). This can be formulated as

$$ \text{Maximize } J(y) \tag{5.24} $$

subject to $g(W) \le 0$, $g(W) = [g_1(W), \ldots, g_M(W)]^{T}$,

where $g_i(W) = Ɫ(u_{i+1}) - Ɫ(u_i)$ and Ɫ(u) is an index based on some statistical measure of u, such as variance or normalized kurtosis.

Again, this can be solved using the Lagrangian multipliers method [60].

Any of the above constraints can be incorporated into the FastICA algorithm and the results obtained accordingly. The algorithms discussed so far can be used when we are trying to find all the original source signals, or a few dominant signals, present in the data. However, we need an algorithm that can detect the presence of a given target signal. With the algorithms above we would have to apply further processing, such as correlation or some other signal similarity technique, to every extracted signal in order to decide whether it is one of the targeted signals. This is computationally inefficient in practical cases where the number of target signals is large. ICA with reference is an algorithm that takes information about the target signal as input and can tell whether that signal is present in the data.

5.3 ICA with Reference

In many practical applications only one targeted signal has to be found. This problem is addressed by Wei Lu and Jagath C. Rajapakse in their ICA with reference papers [61-62]. We use this algorithm, and it is discussed in detail in this section.

The motivation of ICA with reference is to perform both the separation of the independent sources and the selection of the desired sources simultaneously, in a single stage. For ICA with a single reference we start with the available data and try to converge towards an output $y$ that is close to the given reference:

$$ y = W^{T}X \tag{5.25} $$

So we try to optimize $W$ such that the signal obtained is non-gaussian and is close to the given reference signal. The algorithm has two main goals:

1.) The output is one of the independent components present in the input signal.

2.) The extracted independent component must be close to the reference according to some distance criterion.

ICA-R can therefore be defined as a constrained problem whose aim is to maximize the negentropy, approximated as in equation (5.20),

$$ J(y) \approx \rho\,\left[E\{G(W^{T}X)\} - E\{G(y_{gauss})\}\right]^{2} \tag{5.26} $$

subject to the constraints

$$ g(y) \le 0, \qquad h(y) = E\{y^{2}\} - 1 = 0 \tag{5.27} $$

where $g(y) = \varepsilon(y, r) - \xi \le 0$ measures the closeness of the signal obtained to the reference signal $r$, with threshold $\xi$, and $h(y)$ is introduced to ensure that $W$ and $J(y)$ are bounded.

If the algorithm is run with only the cost function and the $h(y)$ constraint, it has M solutions, of which (M-1) are local. Introducing the constraint $g(y)$ makes the algorithm converge towards the global optimum.

To convert the inequality constraint into an equality constraint we introduce a slack variable $z$:

$$ g(y) \le 0 \;\longrightarrow\; g(y) + z^{2} = 0 \tag{5.28} $$

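We now use the Lagrangian multiplier method [40] to obtain the optimal solution for the ICA-R algorithm. The augmented Lagrangian function is given by

$$ L_1(W, \mu, \lambda, z) = J(y) - \mu\,\{g(y) + z^{2}\} - \frac{1}{2}\gamma\,\|g(y) + z^{2}\|^{2} - \lambda\,h(y) - \frac{1}{2}\gamma\,\|h(y)\|^{2} \tag{5.29} $$

where $\mu$ and $\lambda$ are the Lagrange multipliers for the constraints $g(y)$ and $h(y)$, $\gamma$ is the scalar penalty parameter, $\|\cdot\|$ denotes the Euclidean norm, and the $\frac{1}{2}\gamma\|\cdot\|^{2}$ terms are included to ensure that the optimization problem satisfies the local convexity assumption [60].

Because the optimization of the Lagrangian function with respect to $z$ can be carried out for a given $W$, the function is first optimized with respect to $z$. At a minimum or maximum the first derivative of the function with respect to $z$ is zero. Using this condition we obtain the value of $z$ as shown below:

$$ \frac{dL_1}{dz} = -2\mu z - \gamma\,\left(g(y) + z^{2}\right) 2z = 0 \tag{5.30} $$

$$ z^{2} = \max\left\{0,\; -\left(\frac{\mu}{\gamma} + g(y)\right)\right\} = \frac{1}{2\gamma}\,\max\{0,\; -(2\mu + 2\gamma g(y))\} \tag{5.31} $$

Substituting the values of $z^{2}$ and $J(y)$ into the original equation gives the Lagrangian function

$$ L_1(W, \mu, \lambda) = \rho\,\left[E\{G(W^{T}X)\} - E\{G(y_{gauss})\}\right]^{2} - \mu\,\{g(y) + z^{2}\} - \frac{1}{2}\gamma\,\|g(y) + z^{2}\|^{2} - \lambda\,h(y) - \frac{1}{2}\gamma\,\|h(y)\|^{2}, \qquad z^{2} = \frac{1}{2\gamma}\,\max\{0,\; -(2\mu + 2\gamma g(y))\} \tag{5.32} $$

$$ L_1(w, \mu, \lambda) = \rho\,\left[E\{G(W^{T}X)\} - E\{G(y_{gauss})\}\right]^{2} - \frac{1}{2\gamma}\left[\max{}^{2}\{\mu + \gamma g(w),\, 0\} - \mu^{2}\right] - \lambda\,h(w) - \frac{1}{2}\gamma\,\|h(w)\|^{2} \tag{5.33} $$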
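The step from (5.31) to (5.33) can be made explicit. The short derivation below is a sketch added for clarity; it assumes that the penalty terms enter with the negative signs that appear in (5.33), and $c$ is shorthand introduced here for $g(y) + z^{2}$.

```latex
\begin{aligned}
c &= g(y) + z^{2} = g(y) + \max\{0,\, -\mu/\gamma - g(y)\} = \max\{g(y),\, -\mu/\gamma\}, \\
-\mu c - \tfrac{1}{2}\gamma c^{2}
  &= -\frac{1}{2\gamma}\left[(\mu + \gamma c)^{2} - \mu^{2}\right], \\
\mu + \gamma c &= \max\{\mu + \gamma g(y),\, 0\} \quad (\gamma > 0), \\
\Rightarrow\quad -\mu c - \tfrac{1}{2}\gamma c^{2}
  &= -\frac{1}{2\gamma}\left[\max{}^{2}\{\mu + \gamma g(y),\, 0\} - \mu^{2}\right].
\end{aligned}
```

The second line is just completing the square, so the slack variable disappears and only the multiplier $\mu$ and the constraint value remain in the Lagrangian.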

To find the maximum of $L_1$ we use a Newton-like learning method:

$$ w_{k+1} = w_k - \eta\,\left(L''_{1\,w_k^{2}}\right)^{-1} L'_{1\,w_k} \tag{5.34} $$

where $k$ is the iteration index and $\eta$ is a positive learning rate, included to avoid uncertainty in convergence.

To obtain the maximum we need the first derivative of $L_1$, which is given by

$$ \frac{dL_1}{dW} = \bar{\rho}\,E\{x\,G'_{y}(y)\} - \frac{1}{2}\mu\,E\{x\,g'_{y}(w)\} - \lambda\,E\{x\,y\} \tag{5.35} $$

where $\bar{\rho} = \pm\rho$. To simplify the calculations the Hessian matrix is approximated as

$$ \frac{d^{2}L_1}{dw^{2}} = s(w)\,R_{xx} \tag{5.36} $$

where

$$ s(w) = \bar{\rho}\,E\{G''_{y^{2}}(y)\} - \frac{1}{2}\mu\,E\{g''_{y^{2}}(w)\} - \lambda, \qquad R_{xx} = E\{x x^{T}\}. $$

Equation (5.34) can now be written as

$$ w_{k+1} = w_k - \eta\,R_{xx}^{-1}\,L'_{1\,w_k} / s(w_k) \tag{5.37} $$

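The optimum values of $\mu$ and $\lambda$ are found using

$$ \mu_{k+1} = \max\{0,\; \mu_k + \gamma\,g(w_k)\} \tag{5.38} $$

$$ \lambda_{k+1} = \lambda_k + \gamma\,h(w_k) \tag{5.39} $$

The optimization algorithm converges at an optimum point $(w^{*}, \mu^{*}, \lambda^{*})$ that satisfies the first-order conditions

$$ L'_{1\,w}(w^{*}, \mu^{*}, \lambda^{*}) = 0 \tag{5.40} $$

$$ h(w^{*}) = 0 \tag{5.41} $$

$$ g(w^{*}) \le 0 \tag{5.42} $$

$$ \lambda^{*} > 0 \tag{5.43} $$

$$ \mu^{*} \ge 0 \tag{5.44} $$

$$ \mu^{*}\,g(w^{*}) = 0 \tag{5.45} $$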
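The closeness constraint and the multiplier updates (5.38) and (5.39) can be sketched as follows. The thesis does not define $\varepsilon(y, r)$, so the mean squared error used here is an assumption, as are the function names, the square-wave reference and the parameter values.

```python
import numpy as np

def closeness(y, r, xi):
    """g(y) = eps(y, r) - xi, with eps assumed to be the mean squared error."""
    return np.mean((y - r) ** 2) - xi

def update_multipliers(mu, lam, g_val, h_val, gamma):
    """Multiplier updates (5.38) and (5.39)."""
    mu_new = max(0.0, mu + gamma * g_val)
    lam_new = lam + gamma * h_val
    return mu_new, lam_new

t = np.linspace(0.0, 20.0, 500)
r = np.where(np.sin(t) >= 0.0, 1.0, -1.0)    # hypothetical square-wave reference
g_close = closeness(r.copy(), r, xi=0.5)     # output equal to the reference
g_far = closeness(-r, r, xi=0.5)             # output opposite to the reference

mu_close, _ = update_multipliers(0.0, 1.0, g_close, 0.0, gamma=1.0)
mu_far, _ = update_multipliers(0.0, 1.0, g_far, 0.0, gamma=1.0)
print(mu_close, mu_far)
```

The multiplier $\mu$ stays at zero while the constraint is satisfied and grows when the output is far from the reference, which is consistent with the complementary condition (5.45).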

The value of ξ is a critical parameter for the convergence of the algorithm. Generally the algorithm starts with a low value of ξ, to avoid converging to a local optimum, and then increases the value to reach the global optimum.

The algorithm checks two conditions before stopping. The first is that the signal obtained is close to the reference according to the distance criterion. The second is that the number of loops exceeds the maximum value. We decide whether the targeted signal has been found from the number of loops: if the targeted signal is present in the data the algorithm converges within a few loops, but if it is not present the algorithm reaches the maximum number of loops that we specified.

The drawback of this algorithm, which can be overcome, is that it is computationally inefficient.

5.4 Results of Independent Component Analysis with reference

General independent component analysis is a method used to decompose the given samples into their base source signals. As in the previous method, the data here is a 900*200 matrix with each sample signal as a row. To implement independent component analysis with reference and obtain accurate results, we require a number of samples equal to or greater than the number of source signals originally present in the samples.

The original number of source signals can be determined from the number of significant eigenvalues. Based on this we decide the number of samples that should be passed into the algorithm. First the result of ICA-R is shown and then the application of the algorithm to explosive detection is discussed.

To verify the results of the algorithm we randomly generate a few samples such that the number of samples is greater than or equal to the number of source signals. Here I have generated six samples, with RDX in one sample and other source signals in the remaining samples. The aim of the algorithm is to extract the signal for which a priori information is available. So in this case we are trying to extract the signal $y$ that is close to the given reference signal.

We start with an initial guess for $W$ and then try to maximize the negentropy of the signal under the constraint that the signal obtained is as close as possible to the given reference. The other constraint is used to keep the magnitude of the output in the range of the original source signals used to generate the samples.

The algorithm stops if the output obtained is close to the reference signal or if the number of iterations exceeds a prefixed value, two hundred in our case.



The output obtained for the example explained above is

Figure: 5. 2 Result of ICA-R: (a) Reference signal given by us (b) Result obtained

5.4.1 First Stage of Processing

To use ICA-R to detect explosives we should be able to identify the pixels that are likely to contain explosives. From the explanation of ICA-R it is clear that if we pass a few samples into ICA-R we cannot say which of those samples contains the explosive. At the same time we cannot pass an individual sample into the algorithm, because independent component analysis requires the number of samples to be greater than or equal to the number of source signals. One possibility is to pass each individual sample into the algorithm along with a few other known, manually generated samples, such that the algorithm converges to the solution only if the sample from the original data contains the explosive. This approach guarantees the solution but is highly time consuming, because identifying the presence of an explosive in a 900 pixel plot requires the ICA-R algorithm to be run 900 times.


Hence we use a preprocessing stage in which the entire plot is divided into square grids, and we perform ICA-R on each grid to first decide whether any of its samples may contain an explosive. The division into grids is shown in the figure below.

Figure: 5. 3 Grid layout of the plot

Each of these grids is passed into ICA-R for one explosive at a time, so ICA-R is run 100 times for the detection of each explosive. If at least one pixel in the grid contains the explosive then the algorithm indicates the presence of the explosive, and we pass all nine pixels of that grid to the next stage.

As shown in Fig (3.2), the generated data contains three different explosives: RDX, TNT and DNT. So this stage runs the ICA-R algorithm 300 times, 100 times for each explosive. This stage detects all the grids that have at least one pixel with explosive content in it.
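The bookkeeping of this first stage can be sketched as below. This is an illustration only: the row-major mapping from the 900 data rows to the 30 by 30 plot is an assumption, the function names are hypothetical, and `detector` is a stand-in for a full ICA-R run on the nine spectra of a grid.

```python
import numpy as np

def grid_blocks(data, side=30, block=3):
    """Yield (pixel_indices, spectra) for every block x block grid of the plot."""
    index = np.arange(side * side).reshape(side, side)   # assumed row-major layout
    for i in range(0, side, block):
        for j in range(0, side, block):
            pixels = index[i:i + block, j:j + block].ravel()
            yield pixels, data[pixels]

def first_stage(data, detector, side=30, block=3):
    """Return the pixels of every grid for which detector(spectra) is True."""
    suspected = [pixels for pixels, spectra in grid_blocks(data, side, block)
                 if detector(spectra)]
    return np.concatenate(suspected) if suspected else np.array([], dtype=int)

# Toy data: 900 pixels x 200 spectral points, with a marker in pixel 47 only
data = np.zeros((900, 200))
data[47, 50] = 1.0
detector = lambda spectra: bool(spectra.max() > 0.5)   # placeholder for ICA-R

suspected = first_stage(data, detector)
print(sorted(suspected.tolist()))   # the nine pixels of the grid containing pixel 47
```

With this layout one reference requires 100 detector calls instead of 900, and only the pixels returned here need to be examined individually in the next stage.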


Output after first stage when RDX is used as reference is

Figure: 5. 4 Output of ICA-R after first stage for RDX as reference

Output after first stage when TNT is used as reference is

Figure: 5. 5 Output of ICA-R after first stage for TNT as reference

[Plots for Figures 5.4 and 5.5, titled "Field with pixels after first stage"; horizontal axis: x axis, vertical axis: y axis.]

Output after first stage when DNT is used as reference is

Figure: 5. 6 Output of ICA-R after first stage for DNT as reference

Combined output of all the three explosives can be seen as

Figure: 5. 7 Final output after first stage

[Plots for Figures 5.6 and 5.7; horizontal axis: x axis, vertical axis: y axis.]

All these pixels are selected because at least one pixel in their grid is likely to contain an explosive. However, not all of these pixels actually contain explosives. So further processing is needed to identify only those pixels that truly contain an explosive.

5.4.3 Final Stage

This stage takes the output of the first stage as input. We now have to confirm for each pixel individually whether it contains explosive traces, so every pixel has to be passed into ICA-R. But we cannot send the information of an individual pixel alone into ICA-R, because the algorithm requires the number of samples to be equal to or greater than the number of source signals in the original sample.

Because we are not sure about the number of source signals in the sample, it is better to use a somewhat larger number of samples than to restrict it to one or two. In this case I have considered six other randomly generated samples along with the suspected sample and sent all seven samples into the ICA-R algorithm. Here again the algorithm has to be run for each explosive individually. However, since we know which reference signal caused each pixel to be suspected in the first stage, we can pass only the pixels corresponding to a given explosive to the algorithm with the corresponding reference in the second stage. In that way we save some computational effort.



Output using RDX as reference is

Figure: 5. 8 Output of second stage using RDX as reference

The output using TNT as reference

Figure: 5. 9 Output of second stage using TNT as reference

[Plots for Figures 5.8 and 5.9, titled "Finally detected pixels"; horizontal axis: x axis, vertical axis: y axis.]

The output using DNT as reference

Figure: 5. 10 Output of second stage using DNT as reference

The final output combining all the results is

Figure: 5. 11 Finally detected samples

[Plots for Figures 5.10 and 5.11; horizontal axis: x axis, vertical axis: y axis.]

The algorithm is successful in determining the exact pixels where explosive traces are present. Its drawback is that it involves a lot of computational effort. In a practical case where the explosive database is large, using ICA-R individually for each explosive would take a long time before revealing the final result.

5.5 Fast algorithm for one-unit ICA-R

This algorithm was proposed as an extension of the previous ICA-R by Qiu-Hua Lin, Yong-Rui Zheng, Fu-Liang Yin, Hualou Liang and Vince D. Calhoun [42]. The authors suggest some modifications to the ICA-R algorithm that reduce its computational complexity. The complexity is reduced by

1. Pre-whitening the observed signals and

2. Normalizing the weight vector.

There are two reasons for the reduction in complexity. First, according to equation (5.37) the learning rule for the weight vector requires the inverse of the covariance matrix to be computed. Removing this computation reduces the computational effort, and this can be achieved by whitening the centered data. Second, the method explained previously has two constraints, of which $g(y)$ measures the closeness of the signal to the reference and $h(y)$ ensures that $J(y)$ and $w$ are bounded. The task of $h(y)$ can be accomplished by normalizing the weight vector instead of imposing a constraint.
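The effect of pre-whitening can be sketched as follows. The eigen-decomposition approach, the function name and the synthetic mixtures are assumptions made for this illustration; the point is only that the whitened data has an identity covariance matrix, so that $R_{xx}^{-1}$ drops out of the learning rule.

```python
import numpy as np

def whiten(X):
    """Center each row of X and transform the data to identity covariance."""
    Xc = X - X.mean(axis=1, keepdims=True)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc))
    V = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T   # whitening matrix
    return V @ Xc

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 3)) @ rng.laplace(size=(3, 2000))   # hypothetical mixtures
Z = whiten(X)
print(np.round(np.cov(Z), 6))   # close to the identity matrix
```

For whitened data any unit-norm weight vector also gives an output of unit variance, which is why normalizing $w$ can replace the constraint $h(y)$.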

The problem now is to maximize the negentropy, which is given by

$$ J(y) \approx \rho\,\left[E\{G(W^{T}X)\} - E\{G(y_{gauss})\}\right]^{2} \tag{5.45} $$

subject to the constraint

$$ g(y) \le 0 \tag{5.46} $$

where $g(y) = \varepsilon(y, r) - \xi \le 0$ is a measure of the closeness of the signal obtained to the reference signal.

The initial mixed data $X$ is pre-whitened. To convert the inequality constraint into an equality constraint we introduce a slack variable, as in the previous case:

$$ g(y) \le 0 \;\longrightarrow\; g(y) + z^{2} = 0 \tag{5.47} $$

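We now use the Lagrangian multiplier method [40] to obtain the optimal solution for the FastICA-R algorithm. The augmented Lagrangian function is given by

$$ L_1(W, \mu, z) = J(y) - \mu\,\{g(y) + z^{2}\} - \frac{1}{2}\gamma\,\|g(y) + z^{2}\|^{2} \tag{5.48} $$

Because the optimization of the Lagrangian function with respect to $z$ can be carried out for a given $W$, the function is first optimized with respect to $z$. At a minimum or maximum the first derivative of the function with respect to $z$ is zero. Using this condition we obtain the value of $z$ as shown below:

$$ \frac{dL_1}{dz} = -2\mu z - \gamma\,\left(g(y) + z^{2}\right) 2z = 0 \tag{5.49} $$

$$ z^{2} = \max\left\{0,\; -\left(\frac{\mu}{\gamma} + g(y)\right)\right\} = \frac{1}{2\gamma}\,\max\{0,\; -(2\mu + 2\gamma g(y))\} \tag{5.50} $$

Substituting the values of $z^{2}$ and $J(y)$ into the original equation gives the Lagrangian function

$$ L_1(W, \mu) = \rho\,\left[E\{G(W^{T}X)\} - E\{G(y_{gauss})\}\right]^{2} - \mu\,\{g(y) + z^{2}\} - \frac{1}{2}\gamma\,\|g(y) + z^{2}\|^{2}, \qquad z^{2} = \frac{1}{2\gamma}\,\max\{0,\; -(2\mu + 2\gamma g(y))\} \tag{5.51} $$

$$ L_1(w, \mu) = \rho\,\left[E\{G(W^{T}X)\} - E\{G(y_{gauss})\}\right]^{2} - \frac{1}{2\gamma}\left[\max{}^{2}\{\mu + \gamma g(w),\, 0\} - \mu^{2}\right] \tag{5.52} $$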

To find the maximum of $L_1$ we use a Newton-like learning method:

$$ w_{k+1} = w_k - \eta\,\left(L''_{1\,w_k^{2}}\right)^{-1} L'_{1\,w_k} \tag{5.53} $$

where $k$ is the iteration index and $\eta$ is a positive learning rate, included to avoid uncertainty in convergence.

To obtain the maximum we need the first derivative of $L_1$, which is given by

$$ \frac{dL_1}{dW} = \bar{\rho}\,E\{x\,G'_{y}(y)\} - \frac{1}{2}\mu\,E\{x\,g'_{y}(w)\} \tag{5.54} $$

where $\bar{\rho} = \pm\rho$. To simplify the calculations the Hessian matrix is approximated as

$$ \frac{d^{2}L_1}{dw^{2}} = s(w)\,R_{xx} = s(w)\,I \tag{5.55} $$

Here, since $X$ is whitened data, $R_{xx} = I$, and

$$ s(w) = \bar{\rho}\,E\{G''_{y^{2}}(y)\} - \frac{1}{2}\mu\,E\{g''_{y^{2}}(w)\}. $$

Equation (5.53) can now be written as

$$ w_{k+1} = w_k - \eta\,R_{xx}^{-1}\,L'_{1\,w_k} / s(w_k) = w_k - \eta\,L'_{1\,w_k} / s(w_k) \tag{5.56} $$

The weight vector is normalized after each iteration:

$$ w_{k+1} = w_{k+1} / \|w_{k+1}\| \tag{5.57} $$

The optimum value of $\mu$ is found using

$$ \mu_{k+1} = \max\{0,\; \mu_k + \gamma\,g(w_k)\} \tag{5.58} $$

The optimization algorithm converges at an optimum point $(w^{*}, \mu^{*})$ that satisfies the first-order conditions

$$ L'_{1\,w}(w^{*}, \mu^{*}) = 0 \tag{5.59} $$

$$ g(w^{*}) \le 0 \tag{5.60} $$

$$ \mu^{*} \ge 0 \tag{5.61} $$

$$ \mu^{*}\,g(w^{*}) = 0 \tag{5.62} $$

The convergence conditions are the same as in the previous case. This algorithm gives results similar to those of the previous case but with less computational effort.
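One iteration of the update described in this section can be sketched as below. This is a hedged illustration rather than the implementation used in the thesis or in [42]: it assumes $G(y) = \log\cosh(y)$, a mean squared error closeness measure $\varepsilon(y, r) = E\{(y - r)^{2}\}$, the approximate constant $E\{G(y_{gauss})\} \approx 0.3746$, and hypothetical names and parameter values throughout.

```python
import numpy as np

E_G_GAUSS = 0.3746   # approximate E{log cosh(v)} for v ~ N(0, 1)

def fast_ica_r_step(w, Z, r, mu, xi, rho=1.0, gamma=1.0, eta=1.0):
    """One iteration of (5.54)-(5.58) on whitened data Z with reference r."""
    y = w @ Z
    dG = np.tanh(y)                  # G'(y) for G(y) = log cosh(y)
    d2G = 1.0 - dG ** 2              # G''(y)
    dg = 2.0 * (y - r)               # g'(y) for the assumed eps(y, r) = E{(y - r)^2}
    d2g = 2.0                        # g''(y)
    rho_bar = rho * np.sign(np.mean(np.log(np.cosh(y))) - E_G_GAUSS)
    grad = rho_bar * (Z * dG).mean(axis=1) - 0.5 * mu * (Z * dg).mean(axis=1)  # (5.54)
    s = rho_bar * d2G.mean() - 0.5 * mu * d2g
    w_new = w - eta * grad / s                               # (5.56), R_xx = I
    w_new /= np.linalg.norm(w_new)                           # (5.57)
    mu_new = max(0.0, mu + gamma * (np.mean((y - r) ** 2) - xi))   # (5.58)
    return w_new, mu_new

rng = np.random.default_rng(0)
S = rng.laplace(size=(3, 4000))                  # hypothetical sources
X = rng.normal(size=(3, 3)) @ S
Xc = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(np.cov(Xc))
Z = E @ np.diag(d ** -0.5) @ E.T @ Xc            # pre-whitened data
r = np.sign(S[0])                                # crude reference for source 0
w0 = np.ones(3) / np.sqrt(3.0)
w1, mu1 = fast_ica_r_step(w0, Z, r, mu=0.0, xi=1.0)
```

With $\mu = 0$ and $\eta = 1$ the step reduces, up to sign, to the FastICA update of Section 5.2.3, so the reference only influences the iteration through the multiplier $\mu$. Repeating the step until the closeness condition holds or a maximum number of loops is reached would give the full procedure; the convergence behaviour of this sketch has not been evaluated here.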

5.6 ICA with Multi-Reference

The ICA-R problem can easily be extended to the multi-reference case. The problem of ICA with multiple references can be stated as

$$ \text{Maximize } \sum_{i=1}^{l} J(y_i) \tag{5.63} $$

subject to $g(y) \le 0$, $h(y) = 0$,

where $l$ is the number of desired independent sources to be extracted,

$$ g(y) = (g_1(y_1), g_2(y_2), \ldots, g_l(y_l))^{T} \quad \text{with} \quad g_i(y_i) = \varepsilon_i(y_i, r_i) - \xi_i $$

$$ h(y) = (h_{11}(y_1), h_{12}(y_1, y_2), \ldots, h_{1l}(y_1, y_l), h_{21}(y_2, y_1), \ldots, h_{ll}(y_l)) $$

with

$$ h_{ij}(y_i, y_j) = \left(E\{y_i y_j\}\right)^{2} = 0 \quad \text{for all } i, j = 1, 2, \ldots, l,\; i \ne j $$

$$ h_{ii}(y_i) = \left(E\{y_i^{2}\} - 1\right)^{2} = 0 \quad \text{for all } i = 1, 2, \ldots, l. $$

Here $h(y)$ includes both the constraint that bounds the signals and the uncorrelatedness constraint. The additional uncorrelatedness constraint is introduced so that different ICs are obtained as the outputs.
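The constraint values $h_{ij}$ defined above can be computed from sample estimates as sketched below. The function name and the test signals are assumptions made for this illustration; the outputs are stored one per row.

```python
import numpy as np

def h_constraints(Y):
    """Matrix of h_ij: (E{y_i y_j})^2 off the diagonal, (E{y_i^2} - 1)^2 on it."""
    C = (Y @ Y.T) / Y.shape[1]           # sample estimates of E{y_i y_j}
    H = C ** 2
    np.fill_diagonal(H, (np.diag(C) - 1.0) ** 2)
    return H

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(1000, 3)))
Y = np.sqrt(1000) * Q.T                  # uncorrelated rows with unit power
H_ok = h_constraints(Y)                  # all entries close to zero

Y_bad = np.vstack([Y[0], Y[0], Y[2]])    # two outputs converged to the same signal
H_bad = h_constraints(Y_bad)             # entry (0, 1) is close to one
print(np.round(H_ok, 6), np.round(H_bad, 6), sep="\n")
```

The off-diagonal entries are what penalize two outputs that converge to the same component, which is the situation the uncorrelatedness constraint is meant to prevent.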

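The Lagrangian function is similar to the previous case, with $\mu$, $\lambda$ and $g(y)$ as vectors and $h(y)$ as a matrix:

$$ L_2(W, \mu, \lambda, z) = \sum_{i=1}^{l} J(y_i) - \mu^{T}\{g(y) + z^{2}\} - \frac{1}{2}\gamma\,\|g(y) + z^{2}\|^{2} - \lambda^{T}h(y) - \frac{1}{2}\gamma\,\|h(y)\|^{2} \tag{5.64} $$

where $\mu = (\mu_1, \mu_2, \ldots, \mu_l)^{T}$ and $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_l)^{T}$.

As in the previous case, the Lagrangian function is first optimized with respect to $z_i$, and the equations are as shown below:

$$ \frac{dL_2}{dz_i} = -2\mu_i z_i - \gamma\,\left(g_i(y_i) + z_i^{2}\right) 2z_i = 0 \tag{5.65} $$

$$ z_i^{2} = \max\left\{0,\; -\left(\frac{\mu_i}{\gamma} + g_i(y_i)\right)\right\} = \frac{1}{2\gamma}\,\max\{0,\; -(2\mu_i + 2\gamma g_i(y_i))\} \tag{5.66} $$

Using these equations the augmented Lagrangian function can be written as

$$ L_2(W, \mu, \lambda) = \sum_{i=1}^{l}\left( J(y_i) - \frac{\max{}^{2}\{\mu_i + \gamma g_i(w_i),\, 0\} - \mu_i^{2}}{2\gamma} \right) - \lambda^{T}h(W) - \frac{1}{2}\gamma\,\|h(W)\|^{2} \tag{5.67} $$

The Newton-like learning algorithm is again used, and it is

$$ W_{k+1} = W_k - \eta\,s^{-1}(W_k)\,L'_{2\,W_k}\,R_{xx}^{-1} \tag{5.68} $$

The results obtained using both ICA with reference and ICA with multiple references are accurate, and the required source signals can be extracted accurately.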



5.7 Results of Independent Component Analysis with multi reference

This is similar to ICA-R, but here the reference is not a single source signal but a group of source signals. The reference is a matrix whose rows contain the signals of the different explosives for which we have information. Again, we have the data X and we should converge to a solution such that the signals obtained are as close as possible to the reference signals considered.

We start with an initial guess for W and then try to maximize the negentropy function under the constraint that the outputs obtained are as close as possible to the reference signals considered; the other constraint is imposed to restrict the magnitude of the outputs.

Because the main emphasis of this thesis is to detect the presence of an explosive and not to find which explosive it is, we can use a convergence condition such that if the algorithm converges to at least one of the source signals we stop any further iterations and confirm that there is an explosive trace in the sample.

To verify the result of ICA-mR I have first generated six samples such that one sample contains RDX along with noise, one contains TNT along with noise, one contains DNT along with noise, and the others contain random signals along with noise. The generated data therefore contains all three explosives RDX, TNT and DNT. When the algorithm is run on this data it converges to the solution shown below.



Figure: 5. 12 Result of ICA-mR

5.7.1 First Stage using ICA-mR

If we use this algorithm for the first stage as explained in 6.3.2, it can be used to detect whether each grid may contain any of the explosives. If any pixel in the grid contains at least one possible explosive, then that grid is considered to have explosive traces in it.


The output of this algorithm for the data we generated previously is

Figure: 5. 13 Result of first stage of ICA-mR

5.7.2 Final Stage using ICA-mR

This stage takes the output of the first stage as input. We now have to confirm for each pixel individually whether it contains explosive traces, so each pixel has to be passed into ICA-mR. But we cannot send the information of an individual pixel alone into ICA-mR, because the algorithm requires the number of samples to be equal to or greater than the number of source signals in the original sample.

Because we are not sure about the number of source signals in the sample, it is better to use a somewhat larger number of samples than to restrict it to one or two. In this case I have considered six other randomly generated samples along with the suspected sample and sent all seven samples into the ICA-mR algorithm.

[Plot for Figure 5.13; horizontal axis: x axis, vertical axis: y axis.]

The result obtained using this algorithm is

Figure: 5. 14 Result of final stage of ICA-mR

5.8 Conclusions

The independent component analysis algorithm is successfully implemented, and the reasons for the selection of the algorithm are provided. The FastICA algorithm is implemented and its drawbacks are discussed. To overcome these drawbacks the ICA-R algorithm is used. This algorithm involves a lot of computational effort, which is reduced by pre-whitening the data and normalizing the weight vector; the resulting algorithm is FastICA-R. Finally, ICA with multiple references is successfully implemented and used for explosive detection.


CHAPTER 6

TWO STAGE PROCESS

The independent component analysis and component spatial and spectral pattern analysis algorithms are discussed in the previous chapters. Each of them has certain pros and cons. This chapter first compares the two methods and then proposes a two stage process which is less time consuming than independent component analysis and more trustworthy than component spatial and spectral pattern analysis.

6.1 Comparison

The component spatial and spectral pattern analysis algorithm gives spatial and spectral information along with the probability of each signal at each spatial location, whereas ICA provides only spectral and spatial information. Component pattern analysis provides the spatial information of the entire data in one run, whereas for ICA the algorithm has to be run once for every pixel in order to know the spatial location. The size of the data does not significantly influence the computational time of component pattern analysis, whereas for ICA the computational time depends on the number of pixels.

ICA can be implemented using a single reference, whereas component pattern analysis cannot. In the case of ICA-R no further processing is required to confirm that the source signal obtained is the same as the reference. For both ICA-mR and component pattern analysis, further processing is required to decide whether a source signal obtained is one of the reference signals.

The ICA method is more accurate than component pattern analysis and can detect even small traces of explosives present in a sample. Since the ICA algorithm is applied to one pixel at a time, pixels with a very low trace can also be identified. Component pattern analysis, in contrast, is applied to the entire data, and the spatial information is found from the magnitude of the intensity matrix [P]. Hence there are chances of false negatives if the threshold value selected in this method is large.



Component pattern analysis doesn't require any pre-whitening process. For ICA, pre-whitening improves the computational speed of the algorithm and, if necessary, also allows the dimension of the data to be reduced before the data is passed to the ICA algorithm.

By considering all these advantages and disadvantages, a two stage process is proposed which utilizes the advantages of both the methods.

6.2 First stage

In this stage we perform component spatial and spectral pattern analysis on the entire set of samples. As explained earlier, this method is fast, and the location of the explosive can be detected by using the algorithm just once for data of any size.

Similar to the method explained in chapter 4, the number of significant components in the field is estimated first, and then component spatial and spectral pattern analysis is used to obtain all the significant components and their corresponding probability matrices. Since we are using component spatial and spectral pattern analysis with reference, the order of the components is maintained. We can therefore correlate each component obtained with the reference we have and, based on the correlation output, decide whether the estimated component is an explosive. If it is, we threshold its probability matrix to decide which pixels contain the explosive.
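The correlation and thresholding steps described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `detect_with_reference`, the layout of `S` (one estimated spectrum per row) and `P` (one row of spatial intensities per component), and the threshold values `corr_min` and `p_min` are all hypothetical and are not taken from the thesis.

```python
import numpy as np

def detect_with_reference(S, P, reference, corr_min=0.9, p_min=0.5):
    """Flag pixels whose component both matches the reference and is strong.

    S: (k, n_freq) estimated component spectra.
    P: (k, n_pixels) spatial intensity of each component.
    reference: (n_freq,) known explosive spectrum.
    corr_min, p_min: illustrative thresholds (assumed values).
    """
    mask = np.zeros(P.shape[1], dtype=bool)
    for spectrum, intensity in zip(S, P):
        # Pearson correlation between estimated spectrum and reference.
        r = np.corrcoef(spectrum, reference)[0, 1]
        if r >= corr_min:
            # Component is accepted as explosive: threshold its row of P.
            mask |= intensity >= p_min
    return mask

# Synthetic placeholder spectra (not measured data).
f = np.arange(50)
reference = np.exp(-0.5 * ((f - 20) / 3.0) ** 2)
other = np.exp(-0.5 * ((f - 40) / 3.0) ** 2)
S = np.vstack([2.0 * reference, other])
P = np.array([[0.9, 0.1, 0.6, 0.0],
              [0.8, 0.8, 0.8, 0.8]])
mask = detect_with_reference(S, P, reference)
print(mask)
```

Only the first component correlates with the reference, so only its row of `P` is thresholded; the second component is ignored however strong it is.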

The data used to verify the algorithm is the same as in the previous cases and is shown in figure 6.1 below. The algorithm estimated all three explosive components and their corresponding rows in the [P] matrix. The results after thresholding [P] for each explosive are similar to those shown in chapter 4. Here the combined result for all three explosives is shown in figure 6.2.



Figure: 6. 1 Originally generated pixels

Figure: 6. 2 Output of first stage component pattern analysis


6.3 Second stage

In this stage we perform independent component analysis on all the pixels detected in the first stage along with their neighborhood pixels. This is done because the first stage fails to detect a pixel only if the pixel has a minute trace of explosive or if the threshold in the first stage is very large. Both of these matter only when the percentage of explosive content is very small, which generally happens for the boundary pixels of the explosive location. So if we include the neighborhood pixels of the previously detected pixels, we can ensure that all the pixels that have explosive traces in them are finally detected.

Figure: 6. 3 Pixels passed to the second stage, enclosed in black boxes
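One way to collect the first stage detections together with their neighborhood pixels is sketched below. The function name `add_neighbours` and the choice of an 8-connected (3 by 3) neighborhood are assumptions for illustration; the text only says that neighborhood pixels are included.

```python
import numpy as np

def add_neighbours(mask):
    """Return the mask plus the 8-connected neighbours of every True pixel."""
    mask = np.asarray(mask, dtype=bool)
    rows, cols = mask.shape
    # Pad with False so that border pixels can be shifted safely.
    padded = np.pad(mask, 1, constant_values=False)
    out = np.zeros_like(mask)
    # OR together the nine shifted copies of the mask (3 x 3 window).
    for dr in (0, 1, 2):
        for dc in (0, 1, 2):
            out |= padded[dr:dr + rows, dc:dc + cols]
    return out

first_stage = np.zeros((5, 5), dtype=bool)
first_stage[2, 2] = True          # one pixel detected in the first stage
candidates = add_neighbours(first_stage)
print(candidates.astype(int))
```

Every pixel in `candidates` would then be passed to the second stage, so a boundary pixel missed by the first stage threshold still gets checked.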

As explained previously, independent component analysis is capable of detecting even minute traces of the explosive. ICA is therefore an appropriate method for the final confirmation of the exact location of the explosive.

Similar to the method explained in chapter 5, to each pixel suspected in the first stage of containing explosive we add a few samples that don't have any explosive traces, and then use this as the data for the independent component analysis algorithm. The number of additional samples is decided based on the number of significant components in the original pixel. If independent component analysis now detects explosive in the input data, we can conclude that the original pixel contains explosive, because the other samples given as input don't have any traces of explosive.

Figure: 6. 4 Output of final stage

For this data, the method detected the presence of explosives accurately, without any false positives.


6.4 Time comparison

The table below compares the time taken by all the methods explained in chapters 4 and 5.

Table 6.1 Time Comparison

Method                                                            Time taken (sec)
Component spectral and spatial pattern analysis                   19.404
Component spectral and spatial pattern analysis with reference    14.074
ICA-R                                                             64.337
Fast ICA-R                                                        60.349
ICA-mR                                                            164.270
Two stage process                                                 23.484

Component spatial pattern analysis is much faster than ICA. But when accuracy is required we have to use ICA, which is very time consuming. After an in-depth comparison of the two methods we came up with a two stage process which is computationally affordable and is as accurate as Fast ICA-R.
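To put the reported timings in proportion, the short calculation below divides each time by the two stage time. The numbers are copied from the time comparison table; the variable names are chosen only for this example. The ratios work out to roughly 2.7 for ICA-R, 2.6 for Fast ICA-R and 7.0 for ICA-mR, while the two stage process takes about 1.7 times as long as component pattern analysis with reference.

```python
# Times in seconds, copied from the time comparison table.
times = {
    "Component spectral and spatial pattern analysis": 19.404,
    "Component spectral and spatial pattern analysis with reference": 14.074,
    "ICA-R": 64.337,
    "Fast ICA-R": 60.349,
    "ICA-mR": 164.270,
    "Two stage process": 23.484,
}

two_stage = times["Two stage process"]

# A ratio above 1 means the method is slower than the two stage process.
ratio = {name: t / two_stage for name, t in times.items()}

for name, value in ratio.items():
    print(f"{name}: {value:.2f} x two stage time")
```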

6.5 Need for deterministic signals

In all the cases explained so far in this chapter, the samples were generated based on the assumption that if a pixel contains explosive then it contains only one explosive. But in practical cases there are chances that more than one explosive might reside in the same area.

In order to take these cases into consideration, samples are generated similar to the previous case but with two explosives in the same pixel. In the figure shown below, pixels in red have RDX and TNT, pixels in blue have TNT and DNT, and pixels in magenta have DNT and RDX.
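A pixel containing two explosives can be pictured as a weighted sum of two source spectra. The sketch below assumes such a linear mixing model with optional additive Gaussian noise; the function name `mix_pixel`, the fractions and the Gaussian-shaped placeholder spectra are hypothetical and do not reproduce the sample generation used in the thesis or any measured RDX or TNT data.

```python
import numpy as np

def mix_pixel(spectra, fractions, noise_std=0.0, seed=0):
    """Linear mixture of source spectra weighted by their fractions."""
    rng = np.random.default_rng(seed)
    spectra = np.asarray(spectra, dtype=float)      # (n_sources, n_freq)
    fractions = np.asarray(fractions, dtype=float)  # (n_sources,)
    pixel = fractions @ spectra
    # Additive Gaussian noise; noise_std=0.0 leaves the mixture unchanged.
    return pixel + noise_std * rng.standard_normal(pixel.size)

f = np.arange(200)
# Synthetic placeholder spectra, NOT measured RDX or TNT data.
rdx_like = np.exp(-0.5 * ((f - 40) / 5.0) ** 2)
tnt_like = np.exp(-0.5 * ((f - 90) / 5.0) ** 2)

pixel = mix_pixel([rdx_like, tnt_like], fractions=[0.3, 0.2])
print(pixel.shape)
```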

Figure: 6. 5 Pixels generated

Component spatial and spectral pattern analysis and ICA are now used to find the exact location of the explosives, and the results obtained are discussed below. The result of component spatial and spectral pattern analysis is shown first. The values of [S], which contains the source signals to which the algorithm converged, are shown below.


Figure: 6. 6 (a) Explosives used to generate samples, (b) [S] matrix obtained

The columns of [P], plotted as a 30*30 field, and their corresponding thresholded outputs are shown below.

Figure: 6. 7 [P] corresponding to first source signal


Figure: 6. 8 [P] corresponding to second source signal

Figure: 6. 9 [P] corresponding to third source signal


Because we are not concerned with the explosive type, and only the exact location of the explosive matters, the combined output is shown below.

Figure: 6. 10 Final result obtained
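Combining the per-explosive results into one location map amounts to a pixelwise logical OR of the thresholded fields. The sketch below shows this on tiny 2 by 2 masks; the function name `combine_masks` and the example masks are made up for illustration.

```python
import numpy as np

def combine_masks(masks):
    """Pixelwise OR of several boolean detection masks of equal shape."""
    return np.logical_or.reduce([np.asarray(m, dtype=bool) for m in masks])

# Hypothetical thresholded outputs for three source signals.
first = np.array([[1, 0], [0, 0]], dtype=bool)
second = np.array([[0, 1], [0, 0]], dtype=bool)
third = np.array([[1, 0], [0, 0]], dtype=bool)

combined = combine_masks([first, second, third])
print(combined.astype(int))
```

A pixel is marked in the combined map if any of the individual maps marks it, which matches the goal of locating explosive regardless of its type.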

The result of independent component analysis is shown below, starting with the output of the first stage.

Figure: 6. 11 Output of first stage of ICA


The final result of the independent component analysis is shown below.

Figure: 6. 12 Final output of ICA

From the results we can say that the component spatial and spectral pattern analysis method works well when a pixel has more than one explosive, but only if we choose a lower threshold value for the [P] matrix. The results of independent component analysis are poor in this case. The reason might be the overlap of different peak positions of the explosive spectra.

So using deterministic signals, i.e., considering only a range of frequencies where the peak positions of different explosives might not overlap, might give a better result. For this purpose we have considered the first 60 frequency samples of all the explosives and other source signals to generate the samples, and again used both the algorithms to obtain the results.
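The effect of restricting the band can be sketched as below: each spectrum is cut to its first 60 frequency samples and the position of the strongest peak is compared across sources. The helper names `restrict_band` and `peaks_are_distinct` and the synthetic spectra (distinct low-frequency peaks plus a shared stronger peak at a higher frequency) are assumptions chosen to illustrate the idea, not the spectra used in the thesis.

```python
import numpy as np

def restrict_band(data, n_keep=60):
    """Keep only the first n_keep frequency samples (last axis)."""
    return np.asarray(data)[..., :n_keep]

def peaks_are_distinct(spectra):
    """True if the strongest peak of every spectrum is at a different index."""
    positions = np.argmax(spectra, axis=-1)
    return len(set(positions.tolist())) == len(positions)

f = np.arange(200)
shared = 2.0 * np.exp(-0.5 * ((f - 120) / 4.0) ** 2)   # overlapping peak
# Synthetic placeholder spectra with distinct peaks at 10, 30 and 50.
spectra = np.vstack([
    np.exp(-0.5 * ((f - c) / 4.0) ** 2) + shared for c in (10, 30, 50)
])

band = restrict_band(spectra)
print(band.shape)
print(peaks_are_distinct(spectra), peaks_are_distinct(band))
```

In this toy case the dominant peaks coincide over the full band but become distinct once only the first 60 samples are kept.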

The result of component spatial and spectral pattern analysis is shown first. The values of [S], which contains the source signals to which the algorithm converged, are shown below.


Figure: 6. 13 (a) Source signal spectra used to generate samples, (b) Source signals obtained

The columns of [P], plotted as a 30*30 field, and their corresponding thresholded outputs are shown below.

Figure: 6. 14 [P] corresponding to first source signal


Figure: 6. 15 [P] corresponding to second source signal

Figure: 6. 16 [P] corresponding to third source signal

Because we are not concerned with the explosive type, and only the exact location of the explosive matters, the combined output is shown below.


Figure: 6. 17 Final result obtained

The result of independent component analysis is shown below, starting with the output of the first stage.

Figure: 6. 18 Result of first stage of ICA


The final result of the independent component analysis is shown below.

Figure: 6. 19 Result of final stage of ICA

The results clearly indicate that when there is more than one explosive at the same pixel, using deterministic spectral information gives better results.

6.6 Conclusions

The two stage procedure gave good results. It has the accuracy of independent component analysis and is computationally fast like the component spatial pattern analysis technique. This chapter also presents the need for deterministic signals. As the terahertz region is significant for its distinct absorption or reflectance peaks, proper usage of this information gives better results, as explained.


CHAPTER 7

CONCLUSIONS AND FUTURE WORK

This thesis attempted to obtain a technique for explosive detection. First, two different approaches, component analysis of spatial and spectral patterns and independent component analysis, are successfully implemented. Later, by considering different factors, we have developed a two stage combined process which gives accurate results and takes little time.

The component spatial and spectral pattern analysis implemented is capable of obtaining the spectral information of the underlying signals in a given multispectral image along with their corresponding spatial information. This algorithm needs further processing to confirm whether a spectrum obtained matches one from the database of explosives. This is done using correlation. A threshold value is also required for the spatial matrices to decide which of the pixels contain explosive. Since the database is very large, the correlation stage involves a lot of computation. To overcome this, component spatial and spectral pattern analysis with reference is implemented. This algorithm is computationally faster in the final processing stage.

Independent component analysis with reference (ICA-R) is a technique that extracts, from the source signals present in a signal, the one closest to the given reference signal. It is applied as a two stage process. In the first stage the entire plot is divided into grids and the information of each individual grid is passed to the ICA-R algorithm. All the grids detected in the first stage are passed to the second stage. In this stage the information of each individual pixel is combined with a few other known signals that don't contain any explosive, and the ICA-R algorithm is applied. This finally detects the pixels that have explosive traces. This is time consuming, so it is computationally improved by pre-whitening the data and normalizing the weight vector, which is faster than the previous method. Independent component analysis with multiple references is also successfully implemented in a similar manner.



Both these methods are combined to obtain a two stage process which is more accurate and computationally better. In the first stage component spatial and spectral pattern analysis is implemented with a moderate threshold value. There are chances that pixels with low traces of explosive are missed in this stage of detection. So in the second stage the border pixels, along with the pixels detected in the first stage, are verified by ICA-R pixel by pixel. This method gave good results without any false alarms.

When there is more than one explosive in the same pixel or in the same location, only deterministic signals give good results with the above methods.

In this thesis we have concentrated on using THz spectroscopy for explosive detection. But THz signals have their own limitations, which must be overcome to design an efficient method. In recent years a lot of research has been done on data fusion techniques. These utilize the information from more than one sensor and try to combine the advantages of each to come up with more efficient techniques.

The component spatial and spectral pattern analysis method can be further improved such that the further processing needed to confirm that the spectrum obtained is explosive can be eliminated.



