review of statistics and linear algebra mean: variance:


Upload: rosalind-patrick

Post on 12-Jan-2016


TRANSCRIPT

Page 1: Review of Statistics and Linear Algebra Mean: Variance:

Review of Statistics and Linear Algebra

Mean:

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$$

Variance:

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$$

Page 2

Probabilities of Normal Distribution

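[Figure: normal density f(x) plotted against x, with the x axis marked at μ-3σ, μ-2σ, μ-σ, μ, μ+σ, μ+2σ, μ+3σ]

$$P[(\mu - \sigma) \le x \le (\mu + \sigma)] = 68.3\%$$

$$P[(\mu - 2\sigma) \le x \le (\mu + 2\sigma)] = 95.5\%$$

$$P[(\mu - 3\sigma) \le x \le (\mu + 3\sigma)] = 99.7\%$$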
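As a quick check of these percentages, the probability of falling within k standard deviations of the mean can be computed from the error function. This is only a sketch; the helper name `prob_within` is made up for illustration.

```python
import math


def prob_within(k):
    """P(mu - k*sigma <= x <= mu + k*sigma) for a normal distribution."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"within {k} sigma: {100.0 * prob_within(k):.2f}%")
```

The printed values agree with the rounded percentages on this page.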

Page 3

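Covariance

$$\mathrm{cov}(x, y) = \frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})$$

Correlation coefficient

$$r = \frac{\frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sigma_x \sigma_y} = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y}$$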

If r > 0, x and y are positively correlated; if r < 0, x and y are negatively correlated. The magnitude of r reflects the strength of the correlation between x and y.

Q: Please draw a diagram to show each of these x and y relationships:
a) strongly positively correlated
b) strongly negatively correlated
c) weakly positively correlated
d) weakly negatively correlated
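The covariance and correlation formulas can be sketched in NumPy as follows. The function names and sample data are illustrative assumptions, and the 1/N (population) normalization is used to match the formulas above.

```python
import numpy as np


def covariance(x, y):
    """Population covariance: mean of the products of deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.mean((x - x.mean()) * (y - y.mean()))


def correlation(x, y):
    """Correlation coefficient r = cov(x, y) / (sigma_x * sigma_y)."""
    # np.std defaults to the 1/N normalization, consistent with covariance().
    return covariance(x, y) / (np.std(x) * np.std(y))


# Hypothetical sample data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
print(correlation(x, y))
```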

Page 4

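Variance-covariance matrix: symmetric

$$
\begin{bmatrix}
\mathrm{var}(x_1) & \mathrm{cov}(x_1, x_2) & \mathrm{cov}(x_1, x_3) & \cdots & \mathrm{cov}(x_1, x_n) \\
\mathrm{cov}(x_2, x_1) & \mathrm{var}(x_2) & \mathrm{cov}(x_2, x_3) & \cdots & \mathrm{cov}(x_2, x_n) \\
\mathrm{cov}(x_3, x_1) & \mathrm{cov}(x_3, x_2) & \mathrm{var}(x_3) & \cdots & \mathrm{cov}(x_3, x_n) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\mathrm{cov}(x_n, x_1) & \mathrm{cov}(x_n, x_2) & \mathrm{cov}(x_n, x_3) & \cdots & \mathrm{var}(x_n)
\end{bmatrix}
$$

Rows and columns are both ordered x1, x2, x3, …, xn.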

Page 5

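Correlation Coefficients Matrix: Symmetric

$$
\begin{bmatrix}
1 & r_{12} & r_{13} & \cdots & r_{1n} \\
r_{21} & 1 & r_{23} & \cdots & r_{2n} \\
r_{31} & r_{32} & 1 & \cdots & r_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
r_{n1} & r_{n2} & r_{n3} & \cdots & 1
\end{bmatrix}
$$

Rows and columns are both ordered x1, x2, x3, …, xn.

How would you get the correlation coefficients matrix from the variance-covariance matrix?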
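One possible answer, sketched in NumPy: since r = cov(x, y) / (σx σy) and the diagonal of the variance-covariance matrix holds the variances, each entry can be divided by the product of the two corresponding standard deviations. The function name and the 2 x 2 example matrix are assumptions for illustration.

```python
import numpy as np


def cov_to_corr(cov):
    """Convert a variance-covariance matrix to a correlation matrix."""
    cov = np.asarray(cov, dtype=float)
    sigma = np.sqrt(np.diag(cov))          # standard deviations
    return cov / np.outer(sigma, sigma)    # r_ij = cov_ij / (sigma_i * sigma_j)


# Hypothetical covariance matrix: var(x1) = 4, var(x2) = 9, cov(x1, x2) = 2
cov = np.array([[4.0, 2.0],
                [2.0, 9.0]])
corr = cov_to_corr(cov)
print(corr)
```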

Page 6

Eigenvalues and eigenvectors

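The eigenvalues λ of a matrix A satisfy

$$(A - \lambda I)x = 0$$

where

$$
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix},
\quad
I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},
\quad
A - \lambda I = \begin{bmatrix} a_{11} - \lambda & a_{12} & a_{13} \\ a_{21} & a_{22} - \lambda & a_{23} \\ a_{31} & a_{32} & a_{33} - \lambda \end{bmatrix}
$$

Characteristic polynomial (from det(A - λI) = 0):

$$c_1\lambda^3 + c_2\lambda^2 + c_3\lambda + c_4 = 0$$

This equation has three solutions, each of which is called an eigenvalue: λ1, λ2, λ3.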

Page 7

Eigenvectors

Once we have the eigenvalues, we can substitute each eigenvalue into the following equation to solve for an eigenvector:

$$(A - \lambda_1 I)x = 0$$

$$
\begin{bmatrix} a_{11} - \lambda_1 & a_{12} & a_{13} \\ a_{21} & a_{22} - \lambda_1 & a_{23} \\ a_{31} & a_{32} & a_{33} - \lambda_1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
$$

The solution to this linear system is the eigenvector corresponding to the eigenvalue, so there are as many eigenvectors as eigenvalues. The eigenvectors can be thought of as a basis of an n-dimensional space: each eigenvector is like the direction of an axis. What is special is that, for a symmetric matrix such as a variance-covariance matrix, these axes are perpendicular (orthogonal) to each other. All points along the vector direction in the multidimensional space are solutions to the above linear system, so usually one uses only a vector of unit length as the eigenvector.
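A minimal numerical sketch of these statements, assuming a hypothetical symmetric 3 x 3 matrix. `numpy.linalg.eigh` is intended for symmetric matrices and returns unit-length eigenvectors as the columns of its second output.

```python
import numpy as np

# Hypothetical symmetric matrix (e.g. a small variance-covariance matrix)
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

# Eigenvalues in ascending order; column k of eigenvectors pairs with eigenvalues[k]
eigenvalues, eigenvectors = np.linalg.eigh(A)

for k in range(3):
    lam = eigenvalues[k]
    v = eigenvectors[:, k]
    residual = (A - lam * np.eye(3)) @ v   # should be close to the zero vector
    print(lam, np.linalg.norm(v), np.abs(residual).max())

# Orthogonality check: V^T V should be close to the identity matrix
print(np.round(eigenvectors.T @ eigenvectors, 6))
```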

Page 8

Principal Component Analysis of Remotely Sensed Data

Step 1: calculate the variance-covariance matrix or the correlation matrix
Step 2: calculate eigenvalues and eigenvectors for the above matrix
Step 3: transform the data using the eigenvectors

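$$
\underbrace{\begin{bmatrix}
b_{11} & b_{21} & b_{31} & \cdots & b_{71} \\
b_{12} & b_{22} & b_{32} & \cdots & b_{72} \\
b_{13} & b_{23} & b_{33} & \cdots & b_{73} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
b_{1n} & b_{2n} & b_{3n} & \cdots & b_{7n}
\end{bmatrix}}_{n \times 6}
\underbrace{\begin{bmatrix}
v_{11} & v_{21} & v_{31} & \cdots & v_{61} \\
v_{12} & v_{22} & v_{32} & \cdots & v_{62} \\
v_{13} & v_{23} & v_{33} & \cdots & v_{63} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
v_{16} & v_{26} & v_{36} & \cdots & v_{66}
\end{bmatrix}}_{6 \times 6}
=
\underbrace{\begin{bmatrix}
PC_{11} & PC_{21} & PC_{31} & \cdots & PC_{61} \\
PC_{12} & PC_{22} & PC_{32} & \cdots & PC_{62} \\
PC_{13} & PC_{23} & PC_{33} & \cdots & PC_{63} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
PC_{1n} & PC_{2n} & PC_{3n} & \cdots & PC_{6n}
\end{bmatrix}}_{n \times 6}
$$

The rows of the first and last matrices are Pixel 1, Pixel 2, …, Pixel n, where n = lines × samples.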
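The three steps can be sketched in NumPy as below. This is not a production implementation: the function name, the random test image, and the choice to subtract band means before projecting are all assumptions made for illustration.

```python
import numpy as np


def pca_transform(pixels):
    """PCA of an (n, bands) pixel matrix; returns PCs, eigenvalues, eigenvectors."""
    pixels = np.asarray(pixels, dtype=float)
    centered = pixels - pixels.mean(axis=0)           # assumption: remove band means
    # Step 1: variance-covariance matrix (bias=True gives the 1/N normalization)
    cov = np.cov(centered, rowvar=False, bias=True)
    # Step 2: eigenvalues and eigenvectors, sorted by decreasing eigenvalue
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    # Step 3: (n x bands) @ (bands x bands) = (n x bands)
    components = centered @ eigenvectors
    return components, eigenvalues, eigenvectors


# Hypothetical image: 4 lines x 5 samples x 6 bands of random values
rng = np.random.default_rng(0)
image = rng.normal(size=(4, 5, 6))
pixels = image.reshape(-1, 6)                          # n = lines * samples
pcs, eigenvalues, eigenvectors = pca_transform(pixels)
print(pcs.shape)
```

Each principal component image can be recovered by reshaping a column of `pcs` back to lines × samples.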

Page 9

Interpretation of PCA

Eigenvalues are the variances of the principal components. The percent variance, or information, that principal component p represents is

$$\%_p = \frac{\lambda_p}{\sum_{i=1}^{n} \lambda_i} \times 100$$

Because satellite data across bands are often highly correlated, usually 95% of the information can be compressed into a few components.

Eigenvectors: the coefficients of each eigenvector are the weights that each band contributes to a principal component. The information content of each component can be explained from (1) the sign of each coefficient and (2) the magnitude of each coefficient.

Principal Component Transformation can (1) reduce dimensionality, (2) reduce noise, and (3) improve visual interpretability.
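The percent-variance formula is easy to evaluate once the eigenvalues are known. The eigenvalues below are invented purely for illustration, not taken from any real scene.

```python
import numpy as np

# Hypothetical eigenvalues of a 6-band covariance matrix, sorted descending
eigenvalues = np.array([62.0, 25.0, 9.0, 2.5, 1.0, 0.5])

percent = 100.0 * eigenvalues / eigenvalues.sum()   # percent variance per component
cumulative = np.cumsum(percent)                      # running total

# Smallest number of components whose cumulative share reaches 95%
n_keep = int(np.searchsorted(cumulative, 95.0) + 1)
print(percent, cumulative, n_keep)
```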

Page 10

PCA Example

[Figure: data plotted on axes x1 and x2, with the principal component axes PC1 and PC2 overlaid]

In an extreme case, when x1 and x2 fall on a straight line, we need only one dimension to represent the whole dataset.

Page 11

Kauth-Thomas (KT) Transformation (or Tasseled Cap Transformation)

Empirical observation of crop development

1. Soils form a line in spectral space.
2. Growth of crops moves the point away from the soil line. On bright soil, crop growth makes the scene less bright but greener. On dark soil, crop growth makes the scene greener, with less change in brightness.
3. As crops mature, they reach the same point in spectral space regardless of their soil background. At this point, little soil background can be seen because of canopy closure, which minimizes its impact on the overall spectral signal.
4. When crops senesce and turn yellow, their trajectories remain together and move away from the green spot. The development of vegetation takes place almost entirely in the same plane, while the yellowing development moves out of this plane.

[Figure: Red versus NIR spectral space showing the soil line running from dark soil to bright soil, the mature crop point, and senescence]

Page 12

Based on the above observations, Kauth and Thomas (1976) developed a linear transformation from the original 4 Landsat MSS bands to a new set of axes that are orthogonal to each other. The first axis passes along the soil line, and the second axis is perpendicular to the first, passing through the plane of vegetation development. The third axis indicates crop senescence and is perpendicular to both the soil and vegetation axes. A fourth axis is required to account for the remaining variation. Kauth and Thomas named the four axes:

soil-brightness
green-stuff
yellow-stuff
non-such

Often only the first two components are used. The transformation coefficients are (rows in the order of the four axes above, columns in the order of the four MSS bands):

$$
\begin{bmatrix}
0.433 & 0.632 & 0.586 & 0.264 \\
-0.290 & -0.562 & 0.600 & 0.491 \\
-0.829 & 0.522 & -0.039 & 0.194 \\
0.223 & 0.012 & -0.543 & 0.81
\end{bmatrix}
$$
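Applying the transformation is a single matrix product. The sketch below uses the coefficient magnitudes listed above with the signs as published by Kauth and Thomas (1976); the function name and the sample pixel values are hypothetical.

```python
import numpy as np

# Rows: soil-brightness, green-stuff, yellow-stuff, non-such
# Columns: the four Landsat MSS bands in order
KT_MSS = np.array([
    [ 0.433,  0.632,  0.586, 0.264],
    [-0.290, -0.562,  0.600, 0.491],
    [-0.829,  0.522, -0.039, 0.194],
    [ 0.223,  0.012, -0.543, 0.810],
])


def kt_transform(pixels):
    """Map (n, 4) MSS pixel vectors to (n, 4) Kauth-Thomas components."""
    return np.asarray(pixels, dtype=float) @ KT_MSS.T


# Hypothetical MSS digital numbers for two pixels
pixels = np.array([[30.0, 25.0, 60.0, 55.0],
                   [40.0, 45.0, 50.0, 45.0]])
print(kt_transform(pixels))

# The axes are orthogonal and of unit length, up to rounding of the coefficients
print(np.round(KT_MSS @ KT_MSS.T, 2))
```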

Page 13

KT Transformation for TM data

Transformation coefficients for TM images:

The most valuable components are the first three: brightness, greenness, and wetness. Together they usually account for more than 95% of the total information in the 6 reflective bands.

Page 14

Compare KT Transformation and PCA

Common:
1. Both are linear transformations.
2. Transformed components are orthogonal to each other.

Different:
1. PCA coefficients vary from scene to scene; KT coefficients are fixed.
2. What each PCA component represents may vary from scene to scene, but what each KT component represents is fixed.
3. Interpretation of principal components is not always straightforward and can sometimes be difficult.

Page 15

Vegetation Indices

1. Normalized Difference Vegetation Index (NDVI)

$$NDVI = \frac{\rho_{NIR} - \rho_{red}}{\rho_{NIR} + \rho_{red}}$$

NDVI ranges over [-1.0, 1.0].

Often, the more leaves the vegetation has, the bigger the contrast between reflectance in the red and near-infrared parts of the spectrum.
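A small NumPy sketch of NDVI. The function name is illustrative, and returning 0 where both reflectances are zero is an assumption made here to avoid division by zero, not part of the index definition.

```python
import numpy as np


def ndvi(nir, red):
    """Normalized Difference Vegetation Index for reflectance arrays."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    out = np.zeros_like(total)              # assumption: 0 where nir + red == 0
    np.divide(nir - red, total, out=out, where=total != 0)
    return out


# Hypothetical reflectances: dense vegetation, bare soil, empty pixel
print(ndvi([0.50, 0.30, 0.0], [0.05, 0.25, 0.0]))
```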

Page 16

2. Perpendicular Vegetation Index (PVI)

[Figure: wet soil, dry soil, partial vegetation cover, and full vegetation cover]

$$PVI = \frac{\rho_{NIR} - a \cdot \rho_{red} - b}{\sqrt{a^2 + 1}}$$

where a and b are the slope and intercept of the soil line.
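PVI is the perpendicular distance of a pixel from the soil line NIR = a·red + b, which the following sketch evaluates. The soil-line parameters and reflectances are hypothetical values chosen for illustration.

```python
import numpy as np


def pvi(nir, red, a, b):
    """Perpendicular Vegetation Index for a soil line nir = a * red + b."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - a * red - b) / np.sqrt(a ** 2 + 1.0)


# Hypothetical soil line and pixels
a, b = 1.2, 0.04
soil_red = 0.20
soil_nir = a * soil_red + b      # a pixel lying exactly on the soil line
print(pvi(soil_nir, soil_red, a, b))   # approximately 0 for bare soil
print(pvi(0.50, 0.05, a, b))           # positive for a vegetated pixel
```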

Page 17

3. Simple Ratio (SR)

$$SR = \frac{\rho_{NIR}}{\rho_{red}}$$

4. Soil Adjusted Vegetation Index (SAVI)

$$SAVI = \frac{\rho_{NIR} - \rho_{red}}{\rho_{NIR} + \rho_{red} + L}(1 + L)$$

where L is an adjustment factor for soil. Huete (1988) found the optimal value for L to be 0.5.

Huete, 1988. A soil-adjusted vegetation index (SAVI). Remote Sensing of Environment, 25:295-309.
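Both indices are one-line computations. In this sketch the function names and the sample reflectances are illustrative; the default L = 0.5 follows the value quoted above.

```python
import numpy as np


def simple_ratio(nir, red):
    """Simple Ratio: NIR reflectance divided by red reflectance."""
    return np.asarray(nir, dtype=float) / np.asarray(red, dtype=float)


def savi(nir, red, L=0.5):
    """Soil Adjusted Vegetation Index with soil adjustment factor L."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red + L) * (1.0 + L)


# Hypothetical reflectances
print(simple_ratio(0.5, 0.1))
print(savi(0.5, 0.1))
print(savi(0.5, 0.1, L=0.0))   # with L = 0, SAVI reduces to NDVI
```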

Page 18

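5. Global Environmental Monitoring Index (GEMI)

$$GEMI = \eta(1 - 0.25\eta) - \frac{\rho_{red} - 0.125}{1 - \rho_{red}}$$

where

$$\eta = \frac{2(\rho_{NIR}^2 - \rho_{red}^2) + 1.5\rho_{NIR} + 0.5\rho_{red}}{\rho_{NIR} + \rho_{red} + 0.5}$$

Pinty and Verstraete, 1992. GEMI: a non-linear index to monitor global vegetation from satellites. Vegetatio, 101:15-20.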
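Because GEMI is defined in two stages, a short sketch may help: first the intermediate term η, then the index itself. The function name and input reflectances are illustrative assumptions.

```python
import numpy as np


def gemi(nir, red):
    """Global Environmental Monitoring Index from NIR and red reflectance."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    # Intermediate term eta
    eta = (2.0 * (nir ** 2 - red ** 2) + 1.5 * nir + 0.5 * red) / (nir + red + 0.5)
    return eta * (1.0 - 0.25 * eta) - (red - 0.125) / (1.0 - red)


# Hypothetical reflectances
print(gemi(0.5, 0.1))
```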

Page 19

6. Atmospherically Resistant Vegetation Index

$$ARVI = \frac{R_{NIR} - R_{rb}}{R_{NIR} + R_{rb}}$$

where

$$R_{rb} = R_{red} - \gamma(R_{blue} - R_{red})$$

Developed by Kaufman and Tanre (1992) for use with EOS-MODIS data on a global scale. γ usually takes the value 1.0. The R_rb term corrects the red-band reflectance for atmospheric effects.

Kaufman and Tanre, 1992. Atmospherically Resistant Vegetation Index (ARVI) for EOS-MODIS. IEEE Trans. Geosci. Rem. Sen. 30(2):261-270.
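A sketch of ARVI showing how the blue band enters through the red-blue term. The function name and reflectance values are assumptions; the default γ = 1.0 follows the text.

```python
import numpy as np


def arvi(nir, red, blue, gamma=1.0):
    """Atmospherically Resistant Vegetation Index."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    blue = np.asarray(blue, dtype=float)
    rb = red - gamma * (blue - red)     # atmospherically corrected red term
    return (nir - rb) / (nir + rb)


# Hypothetical reflectances
print(arvi(0.5, 0.1, 0.06))
print(arvi(0.5, 0.1, 0.06, gamma=0.0))   # with gamma = 0, ARVI reduces to NDVI
```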

Page 20

7. Soil and Atmospheric Resistant Vegetation Index

$$SARVI = \frac{R_{NIR} - R_{rb}}{R_{NIR} + R_{rb} + L}$$

Huete and Liu, 1994. An error and sensitivity analysis of the atmospheric- and soil-correcting variants of the Normalized Difference Vegetation Index for the MODIS-EOS. IEEE Trans. Geosci. Rem. Sen. 32:897-905.

Page 21

8. Enhanced Vegetation Index

$$EVI = \frac{R_{NIR} - R_{Red}}{R_{NIR} + C_1 R_{Red} - C_2 R_{Blue} + L}(1 + L)$$

where C1 and C2 are coefficients adjusting for atmospheric effects and L is a soil adjustment factor. They are empirically determined as C1 = 6.0, C2 = 7.5, and L = 1.0. EVI has improved sensitivity in high-biomass regions.

Huete and Justice, 1999. MODIS vegetation index. http://modarch.gsfc.nasa.gov/MODIS/LAND/#vegetation-indices
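A sketch of EVI in the form given on this page, with the (1 + L) factor and the stated coefficient values as defaults. The function name and the sample reflectances are hypothetical.

```python
import numpy as np


def evi(nir, red, blue, c1=6.0, c2=7.5, L=1.0):
    """Enhanced Vegetation Index in the (1 + L) form used on this page."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    blue = np.asarray(blue, dtype=float)
    return (nir - red) / (nir + c1 * red - c2 * blue + L) * (1.0 + L)


# Hypothetical reflectances
print(evi(0.5, 0.1, 0.05))
```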