spatial modeling and analysis

Spatial Modeling and Analysis. Deana D. Pennington, PhD, University of New Mexico


Post on 16-Jan-2016



TRANSCRIPT

Page 1: Spatial Modeling and Analysis

Spatial Modeling and Analysis

Deana D. Pennington, PhD
University of New Mexico

Page 2: Spatial Modeling and Analysis

What is spatial analysis?

Analyses where the data are spatially located and explicit consideration is given to the possible importance of their spatial arrangement in the analysis

Page 3: Spatial Modeling and Analysis

Statistical Issues

Valid statistics depend on:
•Temporal stability and causal transience
•Unit homogeneity
•Independence
•Constant effects

BUT Ecology & Earth Science violate all of these! We study:
•Change with time (no temporal stability)
•Legacies, persistence, recovery (no causal transience)
•Heterogeneity through space and time (no unit homogeneity)
•Spatial structure (no independence)
•Differences in response through space/time (non-constant effects)
•Attributes rather than causal factors, which must be inferred

Page 4: Spatial Modeling and Analysis

Issues in Spatial Analysis

•Error
•Small sample sizes compared with size of environmental data sets
•Spatial dependency
•Spatial heterogeneity
•Boundary effects
•Modifiable Areal Unit Problem

Page 5: Spatial Modeling and Analysis

Spatial Dependency

Tobler’s Law: All things are related, but nearby things are more related than distant things

Non-independent observations: spatially autocorrelated observations partly duplicate one another in the sample set, so they carry less information than the same number of independent observations. This affects the mean, variance, confidence intervals, and significance tests.

***Field samples tend to be taken from nearby locations, and are almost always spatially autocorrelated***

Page 6: Spatial Modeling and Analysis

Spatial Heterogeneity

•Stratification of the landscape (regions, classes, etc.) is problematic due to its gradational nature
•Intra-strata variability, mixtures
•Differences in numbers of observations within strata

Heterogeneity in spatial data

Page 7: Spatial Modeling and Analysis

Hyperspectral Example

300 x 300 pixels (6 km2), 192 training pixels out of 90,000 total pixels, 7 mislabeled

•Low % samples
•Errors in samples

Training pixels per class:
Roads: 33
Clouds: 23
River: 23
Riparian: 28
Arid upland: 25
Barren: 22
Agriculture: 38

[Figure: true-color and false-color images of the scene, with the 7 mislabeled pixels marked]

Page 8: Spatial Modeling and Analysis

Hyperspectral Results

K-means, unsupervised, 10 classes

[Figure: classified map with areas labeled Riparian, Arid upland, Semi-arid upland, Clouds/barren, and River/agriculture]

•Confusion between river & agriculture
•Confusion between clouds and barren
•Unsampled semi-arid upland
•Mislabeled arid upland
•Unsampled variability in riparian
•Road variability

Page 9: Spatial Modeling and Analysis

[Figure: classified maps. Legend: Clouds, Agriculture, River, Riparian, Arid upland, Barren, Roads, Unclassified]

K-means Unsupervised
Maximum Likelihood: 89.44%
Naïve Bayesian: 83.33%
Parallelepiped: 82.78%
Support Vector Machine: 77.22%
Minimum Distance: 69.44%

•Confusion between river & agriculture
•Confusion between clouds and barren
•Unsampled semi-arid upland
•Mislabeled arid upland (4.4%)
•Unsampled variability in riparian
•Road variability

Page 10: Spatial Modeling and Analysis

Boundary Effects

•Loss of neighbors in analyses that depend on neighborhood values
•Solution: collect data along a border outside of the analysis area

Page 11: Spatial Modeling and Analysis

Modifiable Areal Unit Problem (MAUP)

•Results sensitive to cell size, location, orientation
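A minimal sketch of this sensitivity, assuming a made-up 1-D transect and a hypothetical helper `aggregate` (neither is from the slides). The same values are averaged into cells of different size and starting location, and the variance between cells changes with each choice.

```python
from statistics import mean, pvariance


def aggregate(values, cell_size, origin=0):
    """Average a 1-D transect into cells of `cell_size`, starting at `origin`.

    Incomplete cells at the end of the transect are dropped.
    """
    cells = []
    for start in range(origin, len(values) - cell_size + 1, cell_size):
        cells.append(mean(values[start:start + cell_size]))
    return cells


# Made-up transect: patches of low and high values, each two cells wide.
transect = [2, 2, 8, 8, 2, 2, 8, 8]

for size, origin in [(2, 0), (2, 1), (4, 0)]:
    cells = aggregate(transect, size, origin)
    print(f"cell size {size}, origin {origin}: {cells}, variance {pvariance(cells)}")
```

With cells aligned to the patches the between-cell variance is large; shifting the origin by one cell, or doubling the cell size, averages the patches away entirely.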

Page 12: Spatial Modeling and Analysis

Components of Spatial Analysis

•Exploratory Spatial Data Analysis (ESDA): finding interesting patterns
•Visualization: showing interesting patterns
•Spatial Modeling: explaining interesting patterns

Page 13: Spatial Modeling and Analysis

Spatial Analyses

Things to consider:
•Objective: describe, map, causation
•Data type: binary (Y/N), categorical, continuous
•Expected pattern: gradient, periodic, clustered
•Scale of pattern
•Univariate/multivariate

Page 14: Spatial Modeling and Analysis

Spatial Analyses

Example: a biological survey where each point denotes the observation of an endangered species. If a pattern exists, as in this diagram, we may be able to analyze behavior in terms of environmental characteristics.

1. Quantify pattern
•Attraction or repulsion
•Directionality

2. Make inferences about process based on observed pattern

Page 15: Spatial Modeling and Analysis

Choices

Point pattern analyses
•Single scale of pattern: quadrat analysis, nearest neighbor
•Multiscale pattern: refined nearest neighbor, 2nd order analysis (Ripley's K)

Make maps from points
•Distance interpolation
•Kriging
•Trend surface analysis
•Spline

Test models with space as causal factor
•Mantel test
•Mantel correlogram
•Multivariate analysis

Describe spatial structure
•Gradient, periodic. Single scale of pattern: semivariogram, correlogram. Multiscale pattern: spectral analysis
•Edge: wavelet analysis
•Context: adjacency measures
•Cross variogram, cross correlogram
•Self-similarity: fractal dimension

Network Analysis
•Path analysis
•Allocation
•Connectivity

Page 16: Spatial Modeling and Analysis

Point Pattern Analysis

Clustered (attraction)

Uniform (repulsion)

Page 17: Spatial Modeling and Analysis

Point Pattern Analysis

Statistical tests for significant patterns in data, compared with the null hypothesis of a random spatial pattern

The standard against which spatial point patterns are compared is a Completely Spatially Random (CSR) point process: a Poisson probability distribution (mean = variance) is used to generate spatially random points.

Page 18: Spatial Modeling and Analysis

Quadrat Analysis

1. Divide the area up into quadrats
2. Count the number of points in each quadrat
3. Compare counts with expected counts in random distribution

[Figure: number of cells plotted against number of points per cell, showing the expected CSR distribution (null hypothesis) together with clustered and uniform patterns]

Expected mean #/cell in CSR: $\lambda = N / (\text{number of quadrats})$

For the Poisson distribution:

$$p(x) = \frac{e^{-\lambda}\,\lambda^{x}}{x!}$$

Chi square:

$$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$

x    O_i   P(x)     E_i    pooled E_i   chi-square term
0    2     0.0156   0.39
1    2     0.0649   1.62   5.39         2.42
2    5     0.1350   3.38
3    1     0.1873   4.68
…                                       sum = χ2

Check the chi-square table. If Ho is rejected, mean ≠ variance:
•Mean > variance (uniform)
•Mean < variance (clustered)
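The quadrat steps above can be sketched as follows. This is an illustrative sketch, not the lab's method: the function names are hypothetical, the study area is assumed to be a rectangle with its origin at (0, 0), and the values λ = 4.16 and 25 quadrats are inferred from the P(x) and E_i columns of the table rather than stated on the slide.

```python
import math


def poisson_pmf(x, lam):
    """Poisson probability of x points in a quadrat when the mean is lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def quadrat_counts(points, width, height, nx, ny):
    """Count points in each cell of an nx-by-ny grid over [0, width] x [0, height]."""
    counts = [0] * (nx * ny)
    for x, y in points:
        col = min(int(x / width * nx), nx - 1)   # clamp points on the far edge
        row = min(int(y / height * ny), ny - 1)
        counts[row * nx + col] += 1
    return counts


def variance_mean_ratio(counts):
    """Sample variance / mean: about 1 for CSR, > 1 clustered, < 1 uniform."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean


# Expected frequencies for the first rows of the table (assumed lam and quadrat count).
lam, n_quadrats = 4.16, 25
for x in range(4):
    p = poisson_pmf(x, lam)
    print(x, round(p, 4), round(n_quadrats * p, 2))

# Two made-up patterns in a unit square split into 2 x 2 quadrats.
clustered = [(0.1, 0.1)] * 4
uniform = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
print(variance_mean_ratio(quadrat_counts(clustered, 1, 1, 2, 2)))
print(variance_mean_ratio(quadrat_counts(uniform, 1, 1, 2, 2)))
```

The variance-to-mean ratio is used here as a quick summary of the mean/variance comparison in the last step; a full test would still compare the chi-square value against a table.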

Page 19: Spatial Modeling and Analysis

Nearest Neighbor Distance

1. Calculate the distance to the nearest neighbor for every point
2. Calculate the mean nearest neighbor distance, $\bar{d}$
3. Calculate the expected mean for a CSR distribution: $E(d_i) = 0.5\sqrt{A/N}$
4. Compare the expected mean to the observed mean with a Z statistic:

$$Z = \frac{\bar{d} - E(d_i)}{\sqrt{0.0683\,A/N^{2}}}$$

Look up significance in a z-statistic table. If Ho is rejected:
•observed mean < expected and Z < 0 => clustered
•observed mean > expected and Z > 0 => uniform
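A minimal sketch of the four nearest neighbor steps, using the expected mean and standard error given above. The function name and the two test patterns are made up for illustration, and no edge correction is applied.

```python
import math


def nearest_neighbor_z(points, area):
    """Return (observed mean NN distance, expected mean under CSR, Z statistic)."""
    n = len(points)
    nn_distances = []
    for i, p in enumerate(points):
        # Step 1: distance from p to its nearest neighbor.
        nn_distances.append(
            min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        )
    observed = sum(nn_distances) / n                 # step 2
    expected = 0.5 * math.sqrt(area / n)             # step 3
    std_error = math.sqrt(0.0683 * area / n ** 2)
    return observed, expected, (observed - expected) / std_error  # step 4


# Made-up patterns: a regular 4 x 4 lattice and a tight clump in a large area.
lattice = [(x + 0.5, y + 0.5) for x in range(4) for y in range(4)]
clump = [(8 + 0.01 * x, 8 + 0.01 * y) for x in range(4) for y in range(4)]

print(nearest_neighbor_z(lattice, area=16))    # Z > 0: uniform
print(nearest_neighbor_z(clump, area=256))     # Z < 0: clustered
```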

Page 20: Spatial Modeling and Analysis

Ripley's K

1. Expand a circle of increasing radius around each point
2. Count the number of points within each circle
3. Calculate L(d), a measure of the expected number of points within distance d:

$$L(d) = \left[\frac{A \sum k_{ij}}{N(N-1)}\right]^{0.5}$$

where A = area and $\sum k_{ij}$ = number of points j within distance d of all i points

4. Monte Carlo simulations or t-test

[Figure: L(d) plotted against radius, showing the expected CSR mean with clustered patterns above it and uniform patterns below it]

***Note added information – mean clustering distance
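A minimal sketch of steps 1 to 3, computing L(d) exactly as written above. The function name and the sample points are hypothetical, and no edge correction is applied. Note as an aside that many references also divide by π inside the root, so that L(d) = d under CSR; that variant is not what the slide shows.

```python
import math


def ripley_l(points, area, d):
    """L(d) = sqrt(A * sum(k_ij) / (N * (N - 1))), with no edge correction.

    sum(k_ij) counts ordered pairs (i, j), i != j, lying within distance d.
    """
    n = len(points)
    pairs_within = sum(
        1
        for i, p in enumerate(points)
        for j, q in enumerate(points)
        if i != j and math.dist(p, q) <= d
    )
    return math.sqrt(area * pairs_within / (n * (n - 1)))


# Made-up pattern: two tight pairs of points far apart in a 10 x 10 area.
points = [(1.0, 1.0), (1.2, 1.0), (8.0, 8.0), (8.2, 8.0)]
for radius in (0.1, 0.5, 5.0, 12.0):
    print(radius, round(ripley_l(points, area=100, d=radius), 3))
```

Plotting L(d) against radius for the observed points and for simulated CSR patterns (step 4) gives the comparison sketched in the figure.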

Page 21: Spatial Modeling and Analysis

Lab #12A

Point pattern analysis

Page 22: Spatial Modeling and Analysis

Analysis of Continuous Data

1. Variation in mean values

2. Describe local variability & spatial dependence

Page 23: Spatial Modeling and Analysis

Mean trends

Focal

Zonal

Global

Input Output

Single value (surface analysis)

or table

Page 24: Spatial Modeling and Analysis

Grid Analysis: Focal Analysis

Spatial filters: output value for each cell is calculated from neighboring cells (moving windows)

Neighborhood shapes: [figure]

Focal functions: Majority, Maximum, Mean, Median, Minimum, Range, Standard deviation, Sum, Variety

Example: Species A habitat and Species B habitat. Range of Species A = 4 cells; Species A depends on B.

•Low pass: smoothing, removing noise
•High pass: emphasize local variation
•Edge enhancement
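A minimal sketch of a low-pass focal filter: a moving-window mean over a square neighborhood. The function name, the window radius, and the sample grid are assumptions for illustration. Cells at the edge simply use whichever neighbors exist, which is one way the boundary effect mentioned earlier shows up.

```python
def focal_mean(grid, radius=1):
    """Moving-window mean over a (2*radius + 1) square neighborhood.

    Windows are truncated at the grid edge, so edge cells average fewer values.
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out[r][c] = sum(window) / len(window)
    return out


# Made-up grid with a single noisy spike in the centre.
grid = [
    [1, 1, 1],
    [1, 10, 1],
    [1, 1, 1],
]
for row in focal_mean(grid):
    print(row)
```

Swapping `sum(window) / len(window)` for `max(window)`, `min(window)`, or a count of distinct values gives the other focal functions in the list.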

Page 25: Spatial Modeling and Analysis

Grid Analysis: Zonal Analysis

Zones:
•Vegetation class A or land use A
•Vegetation class B or land use B
•Vegetation class C or land use C

Zonal functions: Area, Centroid, Geometry, Perimeter, Majority, Maximum, Mean, Median, Minimum, Range, Statistics, Standard deviation, Sum, Thickness, Variety

Output is:
a) grid with same value in each cell for a given zone
b) table with values by zone
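A minimal sketch of a zonal mean that produces both kinds of output described above. The function name, zone labels, and values are made up for illustration.

```python
from collections import defaultdict


def zonal_mean(zones, values):
    """Return (table, grid): mean value per zone, and that mean written to every cell."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for zone_row, value_row in zip(zones, values):
        for zone, value in zip(zone_row, value_row):
            sums[zone] += value
            counts[zone] += 1
    table = {zone: sums[zone] / counts[zone] for zone in sums}   # output (b)
    grid = [[table[zone] for zone in zone_row] for zone_row in zones]  # output (a)
    return table, grid


# Made-up zone grid (e.g. vegetation classes) and value grid (e.g. elevation).
zones = [
    ["A", "A", "B"],
    ["A", "B", "B"],
]
values = [
    [1, 2, 10],
    [3, 20, 30],
]
table, grid = zonal_mean(zones, values)
print(table)
print(grid)
```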

Page 26: Spatial Modeling and Analysis

Lab 12B

Neighborhood and Zone Analysis

Page 27: Spatial Modeling and Analysis

Geostatistics Basics

Parametric stats, univariate (x): mean, variance
Parametric stats, multivariate (x, y): correlation, covariance
Spatial stats, univariate (x, h): semi-variance, lag correlation, lag covariance
Spatial stats, multivariate (x, y, h): cross-semivariance (variogram), cross correlation, cross covariance (correlogram)

h = lag (time or space)

The variogram and the correlogram are marked as inverses of each other.

Page 28: Spatial Modeling and Analysis

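Semi-variance $\gamma_h$

Variance:

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})^2$$

Semi-variance:

$$\gamma_h = \frac{1}{2N_h}\sum_{i=1}^{N_h}(x_i - x_{i+h})^2$$

Local mean w.r.t. study extent

1. Slide x through space to get $\gamma_h$
2. Vary h

[Figure: a transect of cells with $x_i$ and $x_{i+h}$ marked]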

Page 29: Spatial Modeling and Analysis

Semi-variance $\gamma_h$

Semi-variance:

$$\gamma_h = \frac{1}{2N_h}\sum_{i=1}^{N_h}(x_i - x_{i+h})^2$$

Local mean

[Figure: a transect of cells with $x_i$ and $x_{i+h}$ marked]

Number of cells N = 10
Number of windows $N_h$ = # cells – h

h = 1: $N_h$ = 9
h = 5: $N_h$ = 5

Limit h to 1/3 of study extent
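A minimal sketch of the semi-variance calculation for a 1-D transect of equally spaced cells, following the formula above with N_h = N – h windows and lags limited to one third of the transect. The function names and the sample transect are made up for illustration.

```python
def semivariance(x, h):
    """gamma_h = sum((x_i - x_{i+h})^2) / (2 * N_h), with N_h = len(x) - h."""
    n_h = len(x) - h
    if h < 1 or n_h < 1:
        raise ValueError("lag h must satisfy 1 <= h < len(x)")
    return sum((x[i] - x[i + h]) ** 2 for i in range(n_h)) / (2 * n_h)


def semivariogram(x, max_lag=None):
    """Semi-variance for lags 1..max_lag (default: one third of the transect)."""
    if max_lag is None:
        max_lag = max(1, len(x) // 3)
    return {h: semivariance(x, h) for h in range(1, max_lag + 1)}


# Made-up transect of N = 10 cells with a steady gradient.
transect = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(semivariogram(transect))   # lags 1, 2, 3 only
```

For a steady gradient like this one the semi-variance keeps rising with lag; a periodic transect would rise and fall instead.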

Page 30: Spatial Modeling and Analysis

Semi-variogram

Semi-variance:

$$\gamma_h = \frac{1}{2N_h}\sum_{i=1}^{N_h}(x_i - x_{i+h})^2$$

If $x_i$ is similar to $x_{i+h}$, $\gamma_h$ is small, and they are spatially correlated.
If $x_i$ is not similar to $x_{i+h}$, $\gamma_h$ is large, and they are not spatially correlated.

=> $\gamma_h$ measures heterogeneity

Nugget: value of $\gamma_h$ at distance 0 (not in data); a measure of unexplained variability
Range: distance h at which $\gamma_h$ levels off. Below the range, heterogeneity increases in a predictable manner; above the range, heterogeneity is constant. A measure of independence
Sill: measure of maximum heterogeneity in the data (maximum $\gamma_h$)

[Figure: $\gamma_h$ plotted against h from 0, with nugget, sill, and range marked; spatial dependence below the range, independence above it]

Page 31: Spatial Modeling and Analysis

Semi-variograms

[Figure: two plots of $\gamma_h$ against h]

•Periodic, cyclic. Examples: timber harvest, forest age (range: harvest area; sill: rotation)
•Gradient: no sill or range

Page 32: Spatial Modeling and Analysis

Lag Covariance: Geary's C

[Figure: a transect of cells with $x_{i-h}$, $x_i$, and $x_{i+h}$ marked]

Centered around mean values of x, $\bar{x}$

Lag covariance:

$$C_h = \frac{1}{N_h}\sum_{i=1}^{N_h}(x_i - x_{i-h})(x_i - x_{i+h})$$

Local mean

Correlograms have the inverse shape of semi-variograms.

If $x_i$, $x_{i+h}$, and $x_{i-h}$ are all the same, $C_h = 0$. If values are increasing or decreasing through space ($x_{i-h} < x_i < x_{i+h}$, or $x_{i-h} > x_i > x_{i+h}$), one term is negative and $C_h$ is negative: things are not similar. Otherwise $C_h$ is positive: things are similar.

Page 33: Spatial Modeling and Analysis

Lag Correlation: Moran's I

Centered around mean values of x, $\bar{x}$
Standardized against sample variation

Lag covariance:

$$C_h = \frac{1}{N_h}\sum_{i=1}^{N_h}(x_i - x_{i-h})(x_i - x_{i+h})$$

Lag correlation:

$$P_h = \frac{C_h}{S_{x-h}\,S_{x+h}}$$

Page 34: Spatial Modeling and Analysis

Comparison

Semi-variance, $\gamma_h$: $0 < \gamma_h < \infty$
Lag covariance (Geary's C), $C_h$: $-\infty < C_h < \infty$
Lag correlation (Moran's I), $P_h$: $-1 < P_h < +1$

[Figure: $\gamma_h$, $C_h$, and $P_h$ each plotted against h. Figure labels: range, similar, zero, correlated, independent]

Page 35: Spatial Modeling and Analysis

Lab 12C Correlograms

Page 36: Spatial Modeling and Analysis

Surface Analysis

•Spatial distribution of surface information in terms of a three-dimensional structure
•Surfaces do not have to be elevation, but could be population density, species richness, or any other measured attribute

Page 37: Spatial Modeling and Analysis

Surface Analysis

Given geolocated point data, calculate values at regular intervals between points

Inverse distance weighting

•Can't create extremes (ridges, valleys)
•Isotropic influence (not ridge preserving)
•Best with dense samples

Kriging

•Uses semi-variogram to determine relative importance (weighting) of data at different distances
•Uses global variation; only works well if the semi-variogram captures variation across the entire map

Trend analysis

•Calculates a best-fit polynomial equation using linear regression
•Recalculates all positions using the equation (lose original data)
•Smoothing depends on polynomial order

Spline

•Calculates a 2-D minimum curvature surface that passes through every input point
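A minimal sketch of the inverse distance weighting method listed above, assuming a distance exponent of 2 and made-up sample values; the function name and the `power` parameter are illustrative choices, not taken from the slides. Because the result is a weighted average of the samples, it cannot fall outside their minimum and maximum, which is why IDW cannot create new ridges or valleys.

```python
import math


def idw(samples, x, y, power=2):
    """Inverse-distance-weighted estimate at (x, y).

    `samples` is a list of (x, y, value) tuples. Weights are 1 / distance**power,
    applied equally in all directions (isotropic).
    """
    numerator = 0.0
    denominator = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0:
            return value          # exactly on a sample point
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Made-up samples along a line.
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
print(idw(samples, 1.0, 0.0))   # midway between the two samples
print(idw(samples, 5.0, 0.0))   # beyond both samples, still between 10 and 20
```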

Page 38: Spatial Modeling and Analysis

Surface Analysis: Streams

Page 39: Spatial Modeling and Analysis

Network Analysis

Designed specifically for line features organized in connected networks; typically applies to transportation problems and location analysis

•Streams
•Dispersal vectors
•Community interactions

Page 40: Spatial Modeling and Analysis

Network Analysis

•Pathfinding: shortest or least cost
•Allocation of network areas to a center based on supply, demand and impedance
•Connectivity
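A minimal sketch of least-cost pathfinding using Dijkstra's algorithm, one common choice that the slides do not name. The network, node labels, and impedance values are made up for illustration.

```python
import heapq


def least_cost_path(graph, start, goal):
    """Return (total cost, path) for the cheapest route from start to goal.

    `graph` maps each node to a dict of {neighbor: impedance}, with
    non-negative impedances. Returns (inf, []) if the goal is unreachable.
    """
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, impedance in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + impedance, neighbor, path + [neighbor]))
    return float("inf"), []


# Made-up directed network, e.g. stream reaches with travel costs.
network = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 3},
    "C": {"B": 2, "D": 7},
    "D": {},
}
print(least_cost_path(network, "A", "D"))
```

A result of infinite cost also answers the connectivity question: the goal cannot be reached from the start along the network.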

Page 41: Spatial Modeling and Analysis

Integrated Analysis

[Diagram labels: DEM, Hydro Model, Watershed, Land Cover, Soil, Grid Process, Statistics, Modeling (regression, et al.), Gauge Points, Samples, Field Data (Vector)]

Page 42: Spatial Modeling and Analysis

Lab 12D Correlation

Page 43: Spatial Modeling and Analysis

Sampling

Spatial dependency must be considered in sample design:
•Non-independent observations
•Fewer degrees of freedom
•Differences within groups will appear small => overestimate significance of between-group variation

Spatial structure & heterogeneity can affect experimental results: is the response due to treatments or due to inherent spatial structure?

Solutions:
•Include space as an explanatory variable (Mantel test)
•Sample at greater distance than the variogram range
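One way to act on the second solution is to thin candidate sample locations so that every retained pair is farther apart than the variogram range. The sketch below is a simple greedy pass, with a hypothetical function name, made-up coordinates, and an assumed range of 1.5 map units; the result depends on the order of the input points.

```python
import math


def thin_samples(points, variogram_range):
    """Keep points greedily so that all kept points are > variogram_range apart."""
    kept = []
    for point in points:
        if all(math.dist(point, other) > variogram_range for other in kept):
            kept.append(point)
    return kept


# Made-up candidate locations along a transect, one map unit apart.
candidates = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
print(thin_samples(candidates, variogram_range=1.5))
```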

Page 44: Spatial Modeling and Analysis

Example: Integrating Species Occurrence Points and Images

1. Semantics
2. Compatible scales
3. Reproject
4. Resample grain
5. Clip extent
6. Sample occurrence points

Image layers: Elevation (m), Vegetation cover type, Mean annual temperature (C)

Occurrence points (Access file, Excel file):
Sample 1, lat, long, species, presence
Sample 2, lat, long, species, presence
Sample 3, lat, long, species, absence

Integrated data:
P, juniper, 2200m, 16C
P, pinyon, 2320m, 14C
A, creosote, 1535m, 22C
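A minimal sketch of the last step, sampling grid layers at occurrence points, assuming the layers have already been reprojected, resampled, and clipped to a common grid. The function name, grid origin, cell size, and point coordinates are made up; only the elevation and temperature values echo the integrated records in the example.

```python
def sample_grid(grid, x_min, y_max, cell_size, x, y):
    """Return the grid value at map position (x, y), or None outside the grid.

    Row 0 is the top (northern) row, so rows count down from y_max.
    """
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None


# Two aligned one-row layers on a hypothetical grid with 1-unit cells.
elevation_m = [[2200, 2320, 1535]]
temperature_c = [[16, 14, 22]]

# Hypothetical occurrence points: (x, y, species, presence/absence).
occurrences = [
    (0.5, 0.5, "juniper", "P"),
    (1.5, 0.5, "pinyon", "P"),
    (2.5, 0.5, "creosote", "A"),
]

integrated = [
    (
        status,
        species,
        sample_grid(elevation_m, 0, 1, 1, x, y),
        sample_grid(temperature_c, 0, 1, 1, x, y),
    )
    for x, y, species, status in occurrences
]
for record in integrated:
    print(record)
```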

Page 45: Spatial Modeling and Analysis

ENM Results

Geographic patterns of species richness of 17 native rodent species.

Sanchez-Cordero and Martinez-Meyer, 2000

Model building and testing. a) training data; b) predictive model.

Peterson, Ball and Cohoon, 2002