gnr401 principles of satellite image processingavikb/gnr401/dip/gnr401-image... · gnr401...

137
GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay [email protected] Slot 5 Guest Lectures November 9 & November 14, 2012 9.30 AM – 10.55 AM

Upload: others

Post on 27-Jun-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

GNR401 Principles of Satellite Image

Processing

Instructor: Prof. B. Krishna MohanCSRE, IIT Bombay

[email protected] 5 Guest Lectures

November 9 & November 14, 2012 9.30 AM – 10.55 AM

Page 2: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Contents of the LecturesImage Classification• Problem of Image Classification• Supervised Classification Approaches - parametric• Nonparametric Classifiers• Feature Evaluation/Selection• Accuracy Assessment• Unsupervised Classification

IIT Bombay Slide 1

GNR401 B. Krishna Mohan

November 9&14, 2012 Image Classification

Page 3: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Concept of Image Classification

Image classification - assigning pixels in the image to categories or classes of interest. It may be considered as a mapping from numbers to symbols

Examples: built-up areas, waterbody, green vegetation, bare soil, rocky areas, cloud, shadow, …

IIT Bombay Slide 2

GNR401 B. Krishna Mohan

Page 4: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Concept of Image ClassificationImage classification is a process of mapping numbers to

symbolsf(x): x D; x ∈ Rn, D = {c1, c2, …, cL}Number of bands = n; Number of classes = Lf(.) is a function assigning a pixel vector x to a

single class in the set of classes D

IIT Bombay Slide 3

GNR401 B. Krishna Mohan

Page 5: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Concept of Image Classification• In order to classify a set of data into different classes

or categories, the relationship between the data and the classes into which they are classified must be well understood

• To achieve this by computer, the computer must be trained

• Ideas originally developed out of research in Pattern Recognition field

IIT Bombay Slide 4

GNR401 B. Krishna Mohan

Page 6: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Learning in Image ClassificationComputer classification of remotely sensed

images involves the process of the computer program learning the relationship between the data and the information classes

IIT Bombay Slide 5

GNR401 B. Krishna Mohan

Page 7: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Concept of Learning

•Extract knowledge from data by learning mechanisms

Data Knowledge

Program

Learning

IIT Bombay Slide 6

GNR401 B. Krishna Mohan

Page 8: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

LearningLearning denotes changes in a system that enables it to perform the same task more efficiently the next time

e.g. Solving problems in mathematics and physics

Driving a car

Cooking!Identifying classes of pixels given the data indifferent bands

IIT Bombay Slide 7

GNR401 B. Krishna Mohan

Page 9: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Types of LearningSupervised Learning

Learning process designed to form a mapping from one set of variables (data) to another set of variables (information classes)

A teacher is involved in the learning process

Unsupervised learning

Learning happens without a teacher

Exploration of the data space to discover the scientifc laws underlying the data distribution

IIT Bombay Slide 8

GNR401 B. Krishna Mohan

Page 10: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Generalization

Learning leads to recognition of unknown patterns not presented to the classifier during training

Program

Data Knowledge

Learning

Generalization

IIT Bombay Slide 9

GNR401 B. Krishna Mohan

Page 11: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Features• Features are attributes of the data elements

based on which the elements are assigned to various classes.

• E.g., in satellite remote sensing, the features are measurements made by sensors in different wavelengths of the electromagnetic spectrum – visible/ infrared / microwave/texture features …

IIT Bombay Slide 10

GNR401 B. Krishna Mohan

Page 12: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Features• In medical diagnosis, the features may be the

temperature, blood pressure, lipid profile, blood sugar, and a variety of other data collected through pathological investigations

• The features may be qualitative (high, moderate, low) or quantitative.

• The classification may be presence of heart disease (positive) or absence of heart disease (negative)

IIT Bombay Slide 11

GNR401 B. Krishna Mohan

Page 13: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Supervised Classification• The classifier has the advantage of an analyst

or domain knowledge using which the classifier can be guided to learn the relationship between the data and the classes.

• The number of classes, prototype pixels for each class can be identified using this prior knowledge

IIT Bombay Slide 12

GNR401 B. Krishna Mohan

Page 14: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Unsupervised Classification

• When access to domain knowledge or the experience of an analyst is unavailable or unreliable, the data can still be analyzed by numerical exploration, whereby the data are grouped into subsets or clusters based on statistical similarity

IIT Bombay Slide 13

GNR401 B. Krishna Mohan

Page 15: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Partially Supervised ClassificationWhen prior knowledge is available

– For some classes, and not for others, – For some dates and not for others in a

multitemporal dataset, Combination of supervised and unsupervised

methods can be employed for partially supervised classification of images

IIT Bombay Slide 14

GNR401 B. Krishna Mohan

Page 16: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Supervised vs. Unsupervised Classifiers

Supervised classification generally performs better than unsupervised classification IF good quality training data is available

Unsupervised classifiers are used to carry out preliminary analysis of data prior to supervised classification

IIT Bombay Slide 15

GNR401 B. Krishna Mohan

Page 17: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Role of Image ClassifierThe image classifier performs the role of a discriminant

– discriminates one class against othersDiscriminant value highest for one class, lower for

other classes (multiclass)Discriminant value positive for one class, negative for

another class (two class)

IIT Bombay Slide 16

GNR401 B. Krishna Mohan

Page 18: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Discriminant Functiong(ck, x) is discriminant function, relating feature vector

x and class ck, k=1,…,L Denote g(ck,x) as gk(x) for simplicity

Multiclass Casegk(x) > gl(x), l = 1,…,L, l ≠k x ∈ ck

Two Class Caseg(x) > 0 x ∈ c1; g(x) < 0 x ∈ c2

IIT Bombay Slide 17

GNR401 B. Krishna Mohan

Page 19: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Example of Image ClassificationMultiple Class CaseRecognition of characters or digits from bitmaps

of scanned textTwo Class CaseDistinguishing between text and graphics in

scanned document

IIT Bombay Slide 18

GNR401 B. Krishna Mohan

Page 20: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Prototype / Training Data• Using domain knowledge (maps of the study

area, experienced interpreter), small sets of sample pixels are selected for each class.

• The size and spatial distribution of the samples are important for proper representation of the total pixel population in terms of the samples

IIT Bombay Slide 19

GNR401 B. Krishna Mohan

Page 21: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Statistical Characterization of Classes

Each class has a conditional probability density function (pdf) denoted by p(x | ck)

The distribution of feature vectors in each class ck is indicated by p(x | ck)

We estimate P(ck | x), the conditional probability of class ck given that the pixel’s feature vector is x

IIT Bombay Slide 20

GNR401 B. Krishna Mohan

Page 22: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Supervised Classification Principles

• The classifier learns the characteristics of different thematic classes – forest, agricultural land, polluted water, clear water, open soils, manmade objects, desert etc.

• This happens by means of analyzing the statistics of small sets of pixels in each class that are reliably selected by a human analyst through experience or with the help of a map of the area

IIT Bombay Slide 21

GNR401 B. Krishna Mohan

Page 23: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Supervised Classification Principles• Typical characteristics of classes

– Mean vector– Covariance matrix– Minimum and maximum gray levels within each band– Conditional probability density function p(Ci|x) where Ci is

the ith class and x is the feature vector• Number of classes L into which the image is to be

classified should be specified by the user

IIT Bombay Slide 22

GNR401 B. Krishna Mohan

Page 24: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Prototype Pixels for Different Classes• The prototype pixels are samples of the

population of pixels belonging to each class• The size and distribution of samples are

formally governed by the mathematical theory of sampling

• There are several criteria for choosing the samples belonging to different classes

IIT Bombay Slide 23

GNR401 B. Krishna Mohan

Page 25: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

IIT Bombay Slide 24

GNR401 B. Krishna Mohan

Page 26: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

IIT Bombay Slide 25

GNR401 B. Krishna Mohan

Page 27: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Supervised Classification Algorithms• There are many techniques for assigning pixels

to informational classes, e.g.:– Minimum Distance from Mean (MDM)– Parallelpiped– Maximum Likelihood (ML)– Support Vector Machines (SVM)– Artificial Neural Networks (ANN)– …

IIT Bombay Slide 26

GNR401 B. Krishna Mohan

Page 28: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Minimum Distance Classifier

• Simplest kind of supervised classification• The method:

– Calculate the mean vector for each class– Calculate the statistical (Euclidean) distance from

each pixel to class mean vector– Assign each pixel to the class it is closest to

IIT Bombay Slide 27

GNR401 B. Krishna Mohan

Page 29: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

• 2-D Feature Space

o o oo o o

o o o

x x x x x x xx x xx x x

+ + + ++ + ++ + + +

Band 1

Band 2

Class 1

Class 2Class 3

●Sample to be classifiedMean

Vectors

Concept of MDM

IIT Bombay Slide 28

GNR401 B. Krishna Mohan

Page 30: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Minimum Distance Classifier

• Algorithm• Estimate class mean vector and covariance matrix from

training samples• mi = Sj∈Ci Xj ; Ci = E{(X - mi ) (X - mi )T } | X ∈ Ci}• Compute distance between X and mi

• X ∈ Ci if d(X, mi) ≤ d(X,mj) ∀j

• Compute P(Ck|X) = Leave X unclassified if• maxk P(Ck|X) < Tmin

IIT Bombay Slide 29

GNR401 B. Krishna Mohan1

1

1k

K

j j

d

d

Page 31: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Comments on MDM• Normally classifies every pixel no matter how far it is

from a class mean (still picks closest class) unless the Tmin condition is applied

• Distance between X and mi can be computed in different ways – Euclidean, Mahalanobis, city block, …

IIT Bombay Slide 30

GNR401 B. Krishna Mohan

Page 32: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Parallelepiped Classifier• Assign ranges of values for each class in each

band– Really a “feature space” classifier– Training data provide bounds for each feature for

each class– Results in bounding boxes for each class – A pixel is assigned to a class only if its feature

vector falls within the corresponding box

IIT Bombay Slide 31

GNR401 B. Krishna Mohan

Page 33: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Band 1

Band 2 Band 3

C1C2

C3

C4

Parallepiped Classifier

C5C6

IIT Bombay Slide 32

GNR401 B. Krishna Mohan

Page 34: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Band 1

Band 2 Band 3

C1C2

C3

C4

Parallepiped Classifier

C5C6

Unclassified

Assigned to class 5

IIT Bombay Slide 33

GNR401 B. Krishna Mohan

Page 35: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Advantages/Disadvantages of Parallelpiped Classifier

• Does NOT assign every pixel to a class. Only the pixels that fall within ranges.

• Fastest method computationally • Good for helping decide if you need additional

classes (if there are many unclassified pixels)• Problems when class ranges overlap—must develop

rules to deal with overlap areas.

IIT Bombay Slide 34

GNR401 B. Krishna Mohan

Page 36: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Maximum Likelihood Classifier

• Calculates the likelihood of a pixel being in different classes conditional on the available features, and assigns the pixel to the class with the highest likelihood

IIT Bombay Slide 35

GNR401 B. Krishna Mohan

Page 37: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Likelihood Calculation• The likelihood of a feature vector x to be in class Ci is

taken as the conditional probability P(Ci|x).• We need to compute P(Ci|x), that is the conditional

probability of class Ci given the pixel vector x.• It is not possible to directly estimate the conditional

probability of a class given the feature vector. Instead, it is computed indirectly in terms of the conditional probability of feature vector x given that it belongs to class Ci.

IIT Bombay Slide 36

GNR401 B. Krishna Mohan

Page 38: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Likelihood CalculationP(Ci|x) is computed using Bayes’ Theorem in terms of

P(x|Ci)P(Ci|x) = P(x|Ci) P(Ci) / P(x)x is assigned to class Cj such that P(Cj|x) = Maxi P(Ci|x), i=1…K, the number of classes.P(Ci) is the prior probability of occurrence of class i in

the imageP(x) is the multivariate probability density function of

feature x.

IIT Bombay Slide 37

GNR401 B. Krishna Mohan

Page 39: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Likelihood Calculation• P(x) can be ignored in the computation of Max{P(Ci|x)}• If P(x|Cj) is not assumed to have a known distribution, then its

estimation is said to be non-parametric estimation.• If P(x|Cj) is assumed to have a known distribution, then its

estimation is said to be parametric.• The training data x with the class already given, can be used

to estimate the conditional density function P(x|Ci)

IIT Bombay Slide 38

GNR401 B. Krishna Mohan

Page 40: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Likelihood Calculation• P(x|Ci) is assumed to be multivariate Gaussian

distributed in practical parametric classifiers.• Gaussian distribution is mathematically simple to

handle.• Each class conditional density function P(x|Ci) is

represented by its mean vector mi and covariance matrix Si

IIT Bombay Slide 39

GNR401 B. Krishna Mohan

1( ) ( )/ 2 1/ 21( | )

(2 ) | |i i i

i Li

p C e

S S

Tx μ x μx

Page 41: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Assumption of Gaussian Distribution• Each class is assumed to be multivariate normally distributed• That implies each class has a mean mi that has the highest

likelihood of occurrence• The likelihood function decreases exponentially as the feature

vector x deviates from the mean vector mi

• The rate of decrease is governed by the class variance; Smaller the variance, steep will be the decrease, and larger the variance, slower will be the decrease.

IIT Bombay Slide 40

GNR401 B. Krishna Mohan

Page 42: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Likelihood Calculation• Taking logarithm of the Gaussian distribution, we get

• We assume that the covariance matrices for each class are different.

• The term is known as the Mahalanobis distance between x and mi (after Prof. P.C. Mahalanobis, famous Indian statistician and founder of Indian Statistical Institute)

IIT Bombay Slide 41

GNR401 B. Krishna Mohan

11 1( ) ( ) ( ) ln 2 ln ln ( )2 2 2

ti i i i i i

Lg x x x P C S S

1( ) ( )ti i ix x S

Page 43: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Interpretation of Mahalanobis distance• The Mahalanobis distance between two multivariate

quantities x and y is

• If the covariance matrix is k.I, (I is the unit matrix) then the Mahalanobis distance reduces to a scaled version of the Euclidean distance.

• Mahalanobis distance reduces the Euclidean distance according to the extent of variation within the data, given by the covariance matrix S

IIT Bombay Slide 42

GNR401 B. Krishna Mohan

1( , ) ( ) ( )tMd x y x y x y S

Page 44: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Advantages/Disadvantages of Maximum Likelihood Classifier

• Normally classifies every pixel no matter how far it is from a class mean

• Slowest method – more computationally intensive• Normally distributed data assumption is not always true, in

which case the results are not likely to be very accurate• Tmin condition can be introduced into the classification rule to

separately handle ambiguous feature vectors

IIT Bombay Slide 43

GNR401 B. Krishna Mohan

Page 45: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Computational Aspects of Supervised Classifiers

• Maximum likelihood classifier – highly computation intensive

• Storing certain terms pre-computed saves time– Inverse of covariance matrix– Matrix-vector products

IIT Bombay Slide 44

GNR401 B. Krishna Mohan

Page 46: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Maximum Likelihood Classifier

What can be precomputed?• Inverse of covariance matrix• (x – mi) (a new image set can be computed with zero

mean) that can be used in MDM, MLM• ln|Si|, lnP(wi) and (L/2)ln(2p)

IIT Bombay Slide 45

GNR401 B. Krishna Mohan

11 1( ) ( ) ( ) ln 2 ln ln ( )2 2 2

ti i i i i i

Lg x x x P C S S

Page 47: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Maximum Likelihood Classifier

• Computational cost increases nonlinearly with number of bands

• In case the data contains a large number of bands (original and derived) it is prudent to evaluate the bands for ability to separate pixels into different classes

• Fewer bands, less the size of training data needed

IIT Bombay Slide 46

GNR401 B. Krishna Mohan

Page 48: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Maximum Likelihood Classifier

• For N band data, covariance matrix size is proportional to N2

• To determine the elements of covariance matrix, that many independent training samples are needed for each class

• Problem for high dimensional data like hyperspectral images

IIT Bombay Slide 47

GNR401 B. Krishna Mohan

Page 49: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Problems with highly correlated data

• If data in some of the bands is highly correlated, the covariance matrix can be ill-conditioned

• Inverting the covariance matrix can lead to numerical instabilities and problems in estimating P(Ck | X)

• Identifying and removing some of the bands can improve computations

IIT Bombay Slide 48

GNR401 B. Krishna Mohan

Page 50: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Number of training samples

• For N band data, the number of training samples needed is of the order of N.(N+1)

• In practice, to deal with slightly erroneous samples and noise, the number of samples desired is of the order of 10N to 100N

IIT Bombay Slide 49

GNR401 B. Krishna Mohan

Page 51: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Nearest-Neighbor ClassifierNon-parametric in nature• The algorithm is:

– Find the distance of given feature vector x from ALL the training samples

– x is assigned to the class of the nearest training sample (in the feature space)

– This method does not depend on the class statistics like mean and covariance.

IIT Bombay Slide 50

GNR401 B. Krishna Mohan

Page 52: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Concept

• 2-D Feature Space

o o oo o o

o o o

x x x x x x xx x xx x x

+ + + ++ + ++ + + +

Band 1

Band 2

Class 1

Class 2Class 3

●Sample to be classifiedFeature

Vectors

Nearest neighbor

IIT Bombay Slide 51

GNR401 B. Krishna Mohan

Page 53: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

About the Terminology• The nearest neighbor classifier is a general technique

developed by Pattern Recognition practitioners. Image processing community adopted it.

• The nearest neighbor referred to in this context is the training sample with the most similar feature vector, i.e., at the smallest distance from the given pixel’s feature vector. This has no relevance to where that pixel is in the image. Any pixel with the same feature vector will be identically classified, irrespective of its location.

IIT Bombay Slide 52

GNR401 B. Krishna Mohan

Page 54: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Spectral Angle Mapper• Given a large dimensional data set, computing the covariance

matrix, its inverse, and the distance for each pixel• (X – m)TS-1(X – m) is highly time consuming and if the

covariance matrix is close to singular then its inverse can be unstable, leading to erroneous results

• In such cases, alternate methods can be applied, such as Spectral Angle Mapper

IIT Bombay Slide 53

GNR401 B. Krishna Mohan

Page 55: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

S.A.M. Principle• If each class is represented by a vector vi, then the

angle between the class vector and the pixel feature vector x is given by

• cosq = [vi.x] / [|vi||x|]• For small values of q, the value of cosq is large• The likelihood of x to belong to different classes can

be ranked according to the value of cosq.

IIT Bombay Slide 54

GNR401 B. Krishna Mohan

Page 56: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

S.A.M. Advantage

• The value of the vector would not be greatly affected by minor changes in vi or x.

• The computation is simpler compared to the Mahalanobis distance computation involved in ML method

IIT Bombay Slide 55

GNR401 B. Krishna Mohan

Page 57: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Feature Data• For N-band data, each pixel has N values

associated with it• Are the N band data good enough to extract

the classes of interest?• Can the work be done with fewer than N

bands?• Which bands should be discarded?

IIT Bombay Slide 56

GNR401 B. Krishna Mohan

Page 58: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Feature Selection Techniques

Page 59: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Feature Evaluation• For a given set of classes and the training data, how

well is each class separated from every other class?• How useful are the available features? Can we

reduce the dimensionality of the dataset without compromising on accuracy of classification? In this case new data are generated from original data

• The given image is then classified after training the classifier using training data

• The classification result is verified using test data

IIT Bombay Slide 57

GNR401 B. Krishna Mohan

Page 60: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Feature Selection/Reduction• Feature selection – selecting a subset of

features based on some criteria• Feature reduction – discarding some features

after performing some transformation on the input data; e.g., PCT

• Reduced number of bands also reduces need for large training data

IIT Bombay Slide 58

GNR401 B. Krishna Mohan

Page 61: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Motivation to Consider Feature Selection

• Image classification– each element should have useful features – discriminate between elements of different classes

• Discrimination power of features

IIT Bombay Slide 59

GNR401 B. Krishna Mohan

Page 62: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Class separability in feature space• Two generic cases

IIT Bombay Slide 60

GNR401 B. Krishna Mohan

Well separated classes Poorly separated classes

Band 2 Band 2

Band 1 Band 1

Page 63: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Low separability (with one feature)IIT Bombay Slide 61

GNR401 B. Krishna Mohan

p(X|Ci)

mi mj

X

Each class is represented by normal distribution, but with close means and high variances between means

Page 64: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

An example• Separation of a bunch of fruits into different

categories• Useful features

– Color – Size – Season of availability– Texture

IIT Bombay Slide 62

GNR401 B. Krishna Mohan

Page 65: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Important Features• Shape Features• Spectral Features• Texture Features• Transform Features

IIT Bombay Slide 63

GNR401 B. Krishna Mohan

Page 66: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Exam

ple

IIT Bombay Slide 64

GNR401 B. Krishna Mohan

Page 67: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Exam

ple

IIT Bombay Slide 65

GNR401 B. Krishna Mohan

Page 68: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Exam

ple

IIT Bombay Slide 66

GNR401 B. Krishna Mohan

Page 69: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Feature selection• After all features are calculated, we need to

select the best subset that can effectively classify the data into the desired classes

• Improve efficiency, reduce dependence on large training data set

IIT Bombay Slide 67

GNR401 B. Krishna Mohan

Page 70: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Feature Reduction• Remove non-discriminative features• Remove redundant features• Example: Principal Component Analysis

IIT Bombay Slide 68

GNR401 B. Krishna Mohan

Page 71: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Principal Component Analysis• Maximizes variance among the selected bands• Cannot guarantee class separability

IIT Bombay Slide 69

GNR401 B. Krishna Mohan

Page 72: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Linear Discriminant Analysis (LDA)• PCs are not necessarily good for

discrimination in classification.• Linear Discriminant Analysis (LDA), seeks to

find a linear transformation by maximising between-class variance and minimising within-class variance.

IIT Bombay Slide 70

GNR401 B. Krishna Mohan

Page 73: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

LDA B

.

2.0

1.5

1.0

0.5

0.5 1.0 1.5 2.0.. .... . . ... .. A

w

.

B

.2.0

1.5

1.0

0.5

0.5 1.0 1.5 2.0

..

..

....

...

. .

Aw

.

A projection to a lower dimensional space while retaining class separability

PCA cannot guarantee class separability

IIT Bombay Slide 71

GNR401 B. Krishna Mohan

Page 74: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Some Distance Measures

Page 75: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

DivergenceBased on a measure of difference between pairs of

classesConsider the likelihood ratio for two classes

where p(x|wi) and p(x|wj) are the ith and jth spectral class probability distributions

Lij = 0 or infinity for highly separable classesLet Lij’ = ln (p(x|wi)) – ln (p(x|wj))

IIT Bombay Slide 72

GNR401 B. Krishna Mohan

( ) ( | ) / ( | )ij i jL x p x w p x w

Page 76: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Divergence• Divergence for a pair of classes is defined as

• where

• Therefore

IIT Bombay Slide 73

GNR401 B. Krishna Mohan

' '{ ( ) | } { ( ) | }ij jiij i jd E L x w E L x w

' '{ ( ) | } ( ) ( | )ij i ij ix

E L x w L x p x w dx

( | ){ ( | ) ( | )}ln( | )

iij i j

jx

p xd p x p x dxp x

Page 77: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Divergence• Divergence is a distance metric• dij = dji

• For i=j, dii = djj = 0• Divergence of a class with itself is zero.• It can be shown that

dij(x1, x2, …, xn) > dij(x1, x2, … , xn-1)This means that divergence increases with more

features. This is true for statistically independent features

IIT Bombay Slide 74

GNR401 B. Krishna Mohan

Page 78: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Divergence for multidimensional Gaussian feature set

• The average divergence dave is given by• dave =

IIT Bombay Slide 75

GNR401 B. Krishna Mohan

1 10.5 {( )( )ij i j j id Tr

1 10.5 {( )( )( )T

i j i jj iTr m m m m

1 1( ) ( )

L L

i j i ji j i

p w p w d

Page 79: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Divergence for Feature Selection• Given L classes, B bands, and B1 bands to be selected out of B,

then the number of pairs to be evaluated using the divergence equation is

(BCB1 . LC2)

which is very large even for modest size classification problem

For example, for a problem of ten classes, 7 bands in the input image, with 4 bands to be selected, the number of combinations is 1575.

IIT Bombay Slide 76

GNR401 B. Krishna Mohan

Page 80: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Bhattacharyya DistanceFor normally distributed classes, the

Bhattacharyya distance between two class distributions is given by

IIT Bombay Slide 77

GNR401 B. Krishna Mohan

1

1/ 2 1/2

| |1 1 2( ) ( ) ln8 2 2 | | | |

i j

i jTi j i j

i j

B m m m m

Page 81: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Jeffries Matusita (JM) Distance• The JM distance is defined as• Jij = 2(1-e-B), where B is the Bhattacharyya

distance between the two classes• The average JM distance Jave between the

classes is given by

IIT Bombay Slide 78

GNR401 B. Krishna Mohan

1 1( ) ( )

L L

ave i j iji j i

J p w p w J

Page 82: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Interpretation of JM distance• As the separation between classes increases,

the JM distance increases exponentially towards 2.

• This is a desirable behavior compared to the divergence measure

• Computationally JM distance is more expensive than Divergence

IIT Bombay Slide 79

GNR401 B. Krishna Mohan

Page 83: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Weight Factors• Weighted divergence Wij is computed as

where fi and fj are weights or prior probabilities of classes i and j.

IIT Bombay Slide 80

GNR401 B. Krishna Mohan

1

1 12

2

1 1

12

K K

i j iji j i

ijK K

i ii i

f f dW

f f

Page 84: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Other Measures• There are several measures recently proposed, such

as based on • Genetic Algorithms• Sequential Search Algorithms

• In practice, user can choose the pairs of classes of interest, and can see the values of divergence or JM distance only for those classes.

IIT Bombay Slide 81

GNR401 B. Krishna Mohan

Page 85: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Accuracy Assessment

Page 86: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Relevance of Accuracy Assessment • Output of classification of remotely sensed

images – Becomes input layer for a GIS– Accuracy part of metadata– Class-wise and overall accuracy need to be

quantified

IIT Bombay Slide 82

GNR401 B. Krishna Mohan

Page 87: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Accuracy Assessment• Estimated using ground truth i.e., set of points

whose class is known• These points are known as test points• Test points should not be used during the

classifier training• Test points should be in sufficient number for

the accuracy assessment to be significant

IIT Bombay Slide 83

GNR401 B. Krishna Mohan

Page 88: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Choice of Test Points• Test points should not be taken in the immediate

neighbourhood of training data points• High spatial dependence would bias the assessment

in such a case• Based on the accuracy of classification of the test

data points, various quantitative estimates are defined.

• The degree of agreement between the classifier output and the ground truth is considered

IIT Bombay Slide 84

GNR401 B. Krishna Mohan

Page 89: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Classification Accuracy Assessment

• Error or confusion matrix is often deployed to derive measures of accuracy

• Used to derive the overall accuracy as well as user’s accuracy and producer’s accuracy

IIT Bombay Slide 85

GNR401 B. Krishna Mohan

Page 90: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Confusion matrixIIT Bombay Slide 86

GNR401 B. Krishna Mohan

R C

Forest Indust. Urban Water Total

Forest 68 7 3 0 78

Indust. 12 112 15 10 149

Urban 3 9 89 0 101

Water 0 2 5 56 63

Total 83 130 112 66 391

Page 91: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Measures of Accuracy from Confusion Matrix

• Overall AccuracyNumber of test samples correctly classified / Total number of test samplesSum of diagonal elements / Sum of rows

• Producer’s Accuracy (Omission errors)No. of samples of a class correctly classified / No. of samples of the class

• User’s Accuracy (Commission errors)No. of samples that actually belong to a class / No. of samples assigned to that class

IIT Bombay Slide 87

GNR401 B. Krishna Mohan

Page 92: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Accuracy Computation• Overall accuracy for this example

• Correctly classified samples = • 68 + 112 + 89 + 56 = 325• Total number of samples = • 78+ 149 + 101 + 63 = 391• Overall accuracy = 325 / 391 ~ 83%

IIT Bombay Slide 88

GNR401 B. Krishna Mohan

Page 93: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Producer’s Accuracy• Forest: • Total reference samples = 83• Correctly classified samples = 68• Accuracy = 68/83 ~ 82%• Industrial areas: 112/130 ~ 86%• Urban areas: 89/112 ~ 79%• Water: 56/66 ~ 85%

IIT Bombay Slide 89

GNR401 B. Krishna Mohan

Page 94: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

User’s Accuracy• Industrial Areas• Correctly classified samples = 112• Samples assigned to Industrial Areas = 149• User’s Accuracy = 112 / 149 ~ 75%• Forest: 68/78 ~ 87%• Urban: 89 / 191 ~ 88%• Water: 56/63 ~ 88%

IIT Bombay Slide 90

GNR401 B. Krishna Mohan

Page 95: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Kappa coefficient• Kappa coefficient is defined as

IIT Bombay Slide 91

GNR401 B. Krishna Mohan

1 1

2

1

r r

ii i ii i

r

i ii

N C C C

N C C

Ci+ denotes the sum of the ith row elements; C+i denotes the sum of the ith column elements

is interpreted as the index of correct classification after adjusting for chance agreement between the true and computed classes.

Page 96: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Kappa Coefficient for this exampleIIT Bombay Slide 92

GNR401 B. Krishna Mohan

1 1

2

1

r r

ii i ii i

r

i ii

N C C C

N C C

= (85761/107575) ~ 0.80(Verify the calculations!)

Page 97: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Interpretation of Kappa Coefficient• Kappa coefficient is used to compare the

degree of consensus between raters (inspectors) in, for example, Measurement Systems Analysis.

• The degree of chance agreement between the raters and the measurement system is considered in the rating to assess the actual degree of agreement

IIT Bombay Slide 93

GNR401 B. Krishna Mohan

Page 98: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Interpretation of Kappa Coefficient• Suppose 150 samples are independently verified and

the following determinations are made:Inspector

Accept Reject TotalAccept 20 19 39

System Reject 1 110 111Total 21 129 150

IIT Bombay Slide 94

GNR401 B. Krishna Mohan

Page 99: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Chance AgreementInspector

Accept Reject TotalAccept 5.46 33.54 39

Sys. Reject 15.54 95.46 111Total 21 129 150

The chance values are produced by computing: (row total x col. total) / Total values

IIT Bombay Slide 95

GNR401 B. Krishna Mohan

Page 100: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Kappa Coefficient for this example• κ = (130 – 100.92) / (150 – 100.92) = 0.593• 130 comes from the first table• 100.92 comes from the chance table• 150 is the total number of samples• Chance agreement is deducted from actual

agreement to estimate the degree of agreement between the system (classifier) and the inspector (ground truth)

IIT Bombay Slide 96

GNR401 B. Krishna Mohan

Page 101: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Cross Validation• In the cross-validation approach to accuracy

estimation, the ability of the classifier to generalize is determined from the training data itself.

• All the training samples except one are used for training and the one remaining sample is used for testing.

• This process is repeated several times

IIT Bombay Slide 97

GNR401 B. Krishna Mohan

Page 102: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Cross Validation• This method is also known as Leave one out method

of accuracy estimation.– Over a large number of runs the average accuracy of

correct classification of the sample left out is computed.– The one sample being left out is randomly selected.– The risk is that small classes are likely to be ignored– Classes with large number of training samples likely to be

chosen more often.

IIT Bombay Slide 98

GNR401 B. Krishna Mohan

Page 103: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Cross Validation• An extension of this method is to divide the

training data into k subsets, and each time one subset is left out, and used to test the generalization capability of the classifier.

• This way, a maximum of k runs will be adequate to extensively test the classifier

IIT Bombay Slide 99

GNR401 B. Krishna Mohan

Page 104: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Cross Validation• How do we select the subsets?• The procedure is to randomly sample the training

data to make the subsets, at the same time ensure that samples from each class are chosen

• The solution for this is to adopt the stratified sampling approach

• The stratification is performed on the basis of the classes.

• The size of the sample from each class is proportional to the size of the class itself.

IIT Bombay Slide 100

GNR401 B. Krishna Mohan

Page 105: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

General Comments• Accuracy assessments in general are expensive • Attention must be paid to limitations imposed by

– Region– Sensor– Weather etc. versus the statistical requirements imposed

on the study

IIT Bombay Slide 101

GNR401 B. Krishna Mohan

Page 106: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

General Comments• Sampling schemes must include

– Appropriate sample design – Should be in relation to the question being asked and the

kind of imagery being analyzed. • What kind of sample unit will be used and how big

will it be? – Single pixels? – Polygons? – Multiple Polygons? – How many samples should be taken in order to be

statistically valid?

IIT Bombay Slide 102

GNR401 B. Krishna Mohan

Page 107: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Fractional Classification• When coarse resolution imagery are being

classified, and reference data available at higher resolution, it is desirable to report the number of high resolution pixels within each thematic class inside the coarse resolution pixel – mixture estimates

• If only the dominant class is reported, then the remaining useful information is lost

IIT Bombay Slide 103

GNR401 B. Krishna Mohan

Page 108: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Mixed Pixels• When spatial resolution is coarse, one pixel

may contain parts of many landuse classes– e.g., tree, bare soil, grass

• Classification is estimating proportions of different classes within a pixel

• The problem is called mixture modeling

IIT Bombay Slide 104

GNR401 B. Krishna Mohan

Page 109: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Mixed Pixels• Important when spectra of different classes are

compared as in hyperspectral remote sensing• Reference spectra are drawn from single classes• Most pixel spectra are mixtures of more than one

pure class• Mixture modeling estimates the relative proportion

of each class assuming a particular model for mixing

IIT Bombay Slide 105

GNR401 B. Krishna Mohan

Page 110: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Mixed Pixels• Most common mixing model is a linear mixture

modeling

IIT Bombay Slide 106

GNR401 B. Krishna Mohan

,1

L

n k k n nk

R w A

R is the reflectance recorded for a mixed pixel, wk is the proportion and Rk is the reflectance belonging to pure class k. x is additive noise present in the data

wk are estimated from known pure pixels and given reflectances of mixed pixels, minimizing the effect of noise

Page 111: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Classifiers v/s Features• The definition and selection of land cover

classes is crucial • The capability of existing features is to be

ascertained for reliable classification• Choice of classifier can be critical in some

cases, while ancillary data can help in other cases

IIT Bombay Slide 107

GNR401 B. Krishna Mohan

Page 112: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

IIT Bombay Slide 108

CSRE

GNR401 B. Krishna Mohan

0.6m x 0.6m

Type of features used:•Spectral•Spatial•Textural•Contex-tual

Page 113: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

IIT Bombay Slide 109

GNR401 B. Krishna Mohan

5.8m x 5.8m

Type of features used:•Spectral•Textural•Contex-tual

Page 114: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

IIT Bombay Slide 110

GNR401 B. Krishna Mohan

23.25m x 23.25m

Type of features used:Spectral

Page 115: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Unsupervised Classification

Page 116: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Unsupervised Classification• When access to domain knowledge or the

experience of an analyst is missing, the data can still be analyzed by numerical exploration, whereby the data are grouped into subsets or clusters based on statistical similarity

IIT Bombay Slide 111

GNR401 B. Krishna Mohan

Page 117: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Unsupervised Classification• Unsupervised classification is also known as learning without

teacher• In the absence of reliable training data it is possible to

understand the structure of the data using statistical methods such as clustering algorithms

• Popular clustering algorithms are k-means and ISODATA.

IIT Bombay Slide 112

GNR401 B. Krishna Mohan

The book Pattern Recognition Principles by J.T. Tou and R.C. Gonzalez contains discussion of several clustering algorithms with numerical examples

Page 118: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Concept of Clustering• 2-D Feature Space

IIT Bombay Slide 113

GNR401 B. Krishna Mohan

o o oo o o

o o o x x x x x x xx x xx x x

+ + + ++ + ++ + + + Class 1

Class 2

Class 3

Band 1

Band 2

Page 119: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Concept of ClusteringIIT Bombay Slide 114

GNR401 B. Krishna Mohan

o o oo o o

o o o x x x x x x xx x xx x x

+ + + ++ + ++ + + +

Band 1

Band 2

Page 120: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Overlapping ClustersIIT Bombay Slide 115

GNR401 B. Krishna Mohan

o o oo o o

o o o o o o oo o o o

x x x x x x xx x xx x x

+ + + ++ + ++ + + +

Band 2

Band 1

Page 121: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Clustering Algorithms• All feature vectors are points in an L-dimensional

space where L is the number of bands (The letter K is reserved for the number of clusters!)

• It is required to find which sets of feature vectors tend to form clusters

• Members of a cluster are more similar to each other than to members of another cluster – In other words, they possess low intra-cluster variability and high inter-cluster variability

IIT Bombay Slide 116

GNR401 B. Krishna Mohan

Page 122: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

K-Means• Iterative algorithm• Number of clusters K is known by user• Most popular clustering algorithm• Initialize randomly K cluster mean vectors• Assign each pixel to any of the K clusters based on

minimum feature distance• After all pixels are assigned to the K clusters, each

cluster mean is recomputed. • Iterate till cluster mean vectors stabilize

IIT Bombay Slide 117

GNR401 B. Krishna Mohan

Page 123: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

IIT Bombay Slide 118

GNR401 B. Krishna Mohan

Data Heap

C1 C2 … Ck

mi m2 mk

Iteration 0m0

1, ..., m0K

Page 124: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

K-Means algorithmIIT Bombay Slide 119

GNR401 B. Krishna Mohan

• Each cluster Ci is represented by its mean vector mi

• ||x-mi|| = ( ) ( )Ti iX X

Assign X to cluster Ck where ||X-Ck|| = Min (||X-Ci||), i=1,2,…,K

Page 125: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

IIT Bombay Slide 120

GNR401 B. Krishna Mohan

Data Heap

C1 C2 … Ck

Iteration n

1 2

Mean vectors, , ..., n n n

K

Page 126: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

K-Means algorithm contd.IIT Bombay Slide 121

GNR401 B. Krishna Mohan

• After all the pixels are assigned to the K clusters, update mi

Replace existing cluster mean vectors by the updated mean vectors if their difference is more than a user-specified threshold

1| |

i

newi

X Ci

XC

max( ) || ||iold new new oldi i i i iif Max d

Page 127: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

IIT Bombay Slide 122

GNR401 B. Krishna Mohan

Data Heap

C1 C2 … Ck

Iteration n+1

1 21 1 1

1 2

, , ...,

, , ...,

n n nK

n n nK

Convergence!

Page 128: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

K-Means Algorithm• Choice of K is important• For normal remotely sensed images, about 15-20

iterations are maximum required for stabilization of the cluster centres

• Convergence can be based on average difference between cluster centres or maximum difference between any pair of cluster centres

IIT Bombay Slide 123

GNR401 B. Krishna Mohan

Page 129: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

ISODATA ALGORITHM• Iterative Self-Organizing Data Analysis

Technique (the last A added to make the acronym sound better)

• Developed in biology in the 1960’s by Ball and Hall

• See Tou and Gonzalez’s classic “Pattern Recognition Principles” for an excellent exposition to clustering algorithms

IIT Bombay Slide 124

GNR401 B. Krishna Mohan

Page 130: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

User specified parameters for ISODATA

• Generalization of K-Means algorithm• Consists of many user-specified parameters

– Minimum size of cluster– Maximum size of cluster– Maximum intra-cluster variance– Minimum separation between pairs of clusters– Maximum number of clusters– Minimum number of clusters– Maximum number of iterations

IIT Bombay Slide 125

GNR401 B. Krishna Mohan

Page 131: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Cluster Evaluation• Desired properties of clusters

– High separation between different clusters– Low variability within each cluster

IIT Bombay Slide 126

GNR401 B. Krishna Mohan

Page 132: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

High Separation Between Clusters

• Mahalanobis distance• Compute mean vector and covariance matrix

for each cluster• High Mahalanobis distance between mean

vectors desirable

IIT Bombay Slide 127

GNR401 B. Krishna Mohan

Page 133: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

High Separation Between Clusters

• For small data, it is also possible to find minimum distance between samples of two clusters

• The minimum distance should be high

IIT Bombay Slide 128

GNR401 B. Krishna Mohan

Page 134: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Low Intra-Cluster Variability• Trace of covariance matrix• Within each cluster, find distance of an

element from every other element• Note maximum distance• Max distance should be small

IIT Bombay Slide 129

GNR401 B. Krishna Mohan

Page 135: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

IIT Bombay Slide 130

GNR401 B. Krishna Mohan

INPU

T IM

AG

E

Page 136: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

IIT Bombay Slide 131

GNR401 B. Krishna Mohan

K-M

EAN

S: K

=7

Page 137: GNR401 Principles of Satellite Image Processingavikb/GNR401/DIP/GNR401-Image... · GNR401 Principles of Satellite Image Processing Instructor: Prof. B. Krishna Mohan CSRE, IIT Bombay

Contd…