why interpolation? · 2017-04-05 · 1 spatial interpolation and prediction cp204c ©radke 2017...



Post on 09-Jul-2020



Spatial Interpolation and Prediction

cp204c © Radke 2017

Spatial Interpolation

“Everything is related to everything else, but close things are more closely related.”

Once again … the 1st law of geography

W. Tobler, UCSB

Readings

✓ Bolstad, Paul. 2015. GIS Fundamentals: A First Text on Geographic Information Systems. Eider Press (1st edition pp. 333-350; 2nd edition pp. 395-421; 3rd edition pp. 437-470; 4th edition pp. 473-520; 5th edition pp. 519-559).

Spatial Interpolation

Definition: Estimating the value of a variable of interest at an un-sampled location based on the values measured at sampled locations.

[Figure: sample points surrounding a location whose value is to be estimated]

Spatial Interpolation

Assumes a field-based conceptual model of space – that a variable of interest varies continuously over the study area.

✓ Temperature (urban heat island)
✓ Elevation, precipitation
✓ Soil type, vegetation type, geology
✓ Fire risk, erosion potential, property values
✓ Concentrations of students attending schools
✓ Air pollution modeling

Why interpolation?

We cannot sample everywhere:

✓ Too expensive, too tedious, physically impossible (vegetation, ground water, public opinion)
✓ Some locations inaccessible, off limits or not clearly visible (property value, household amenities)
✓ Some locations inaccessible to even high-resolution remote sensing satellites (cloud cover, forest canopy, roof tops)


Typical Inputs / Outputs of Interpolation

✓ Points to points
✓ Points to lines: contours (i.e., isolines)
✓ Points to vector polygons
✓ Points to raster grids

Sample data:

✓ Location: x, y coordinates
✓ Variable of interest that varies spatially (i.e., Z-value)
✓ Time of data capture

Sampling Strategy

✓ Number of samples
✓ Type of sample:
  • Random
  • Uniform
  • Cluster
  • Adaptive sampling: fewer samples taken in homogeneous areas

Systematic – Regular Lattice

✓ Regular spatial interval
✓ Square or triangulated pattern

Bolstad, GIS Fundamentals

Random – Poisson Process

✓ Each location has an equal probability of being selected
✓ No selected location influences any other potential selection

Bolstad, GIS Fundamentals

Cluster – Neighborhoods

✓ Could be a stratified random clustering
✓ Could be a systematic or regular clustering pattern

Bolstad, GIS Fundamentals
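
The systematic, random, and cluster patterns above can be sketched in a few lines. This is an illustrative sketch only: the function names, the rectangular study area, and the square scatter used around each cluster centre are assumptions, not something the slides prescribe.

```python
import random

def systematic_sample(xmin, ymin, xmax, ymax, spacing):
    """Regular square lattice: one point every `spacing` units."""
    nx = int((xmax - xmin) / spacing) + 1
    ny = int((ymax - ymin) / spacing) + 1
    return [(xmin + i * spacing, ymin + j * spacing)
            for i in range(nx) for j in range(ny)]

def random_sample(xmin, ymin, xmax, ymax, n, rng):
    """Every location is equally likely and draws are independent."""
    return [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]

def cluster_sample(xmin, ymin, xmax, ymax, n_clusters, per_cluster, radius, rng):
    """Random cluster centres, then points scattered within +/- radius of each.
    Points near the edge may fall slightly outside the study area."""
    pts = []
    for cx, cy in random_sample(xmin, ymin, xmax, ymax, n_clusters, rng):
        for _ in range(per_cluster):
            pts.append((cx + rng.uniform(-radius, radius),
                        cy + rng.uniform(-radius, radius)))
    return pts

rng = random.Random(42)  # fixed seed so the layout is repeatable
lattice = systematic_sample(0, 0, 10, 10, 5)
scattered = random_sample(0, 0, 10, 10, 20, rng)
clustered = cluster_sample(0, 0, 10, 10, 3, 4, 1.0, rng)
```

Adaptive sampling is left out because it needs a measure of local variability to decide where to add points.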


Adaptive – Intelligent Sampling

✓ More sampling of data where patterns shift through space.
✓ Sampling pattern dictated by data variability.
✓ Example – surface or elevation points.

Bolstad, GIS Fundamentals

[Figure: adaptive sampling of elevation points. Bolstad, GIS Fundamentals]

Interpolation Methods

✓ Several methods that vary in approach and complexity
✓ All methods use the sample points to estimate values at un-sampled locations
✓ Yet they usually produce different results from the same sample data points, because of the underlying mathematical formulas / models and the different parameters used in estimation

Main Characteristics of Interpolation Methods

✓ Global vs. Local
✓ Exact vs. Inexact
✓ Deterministic vs. Stochastic

Global vs. Local Estimators

✓ Global: use all sample points to estimate values at un-sampled locations
✓ Local: estimates are based on neighboring points

“Everything is related to everything else, but close things are more closely related.”

• W. Tobler, 1st law of geography

Exact vs. Inexact Estimators:

✓ Exact: the output surface reproduces the measured values at the input sample locations
✓ Inexact: the output surface may differ from the measured values even at the original sample locations, where the values may themselves be estimates


Deterministic vs. Stochastic Methods

✓ Deterministic: based on a mathematical function of the sample data, with no model of random error
✓ Stochastic: based on a geostatistical model that incorporates random variation and accounts for spatial autocorrelation

…. more on this later

Evaluating Spatial Interpolation Results - Validation

One simple approach:

✓ Withhold a small subset of the sample points from the interpolation process
✓ Compare the estimated values at the withheld sample points with the observed values at those locations.
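
The withhold-and-check procedure can be sketched as follows. The slides do not name an error measure, so the use of root-mean-square error (RMSE) here is an assumption, as are the function names and the 20% hold-out fraction. Any interpolator can be plugged in; a trivial nearest-sample rule stands in for one.

```python
import math
import random

def nearest_value(train, x, y):
    """Stand-in interpolator: value of the closest training sample."""
    return min(train, key=lambda s: math.hypot(s[0] - x, s[1] - y))[2]

def holdout_rmse(samples, interpolate, test_fraction=0.2, seed=0):
    """Withhold a fraction of the (x, y, z) samples, interpolate from the rest,
    and return the RMSE between estimates and observed values."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]
    sq_err = [(interpolate(train, x, y) - z) ** 2 for x, y, z in test]
    return math.sqrt(sum(sq_err) / len(sq_err))

# A perfectly flat surface is reproduced without error by any sensible method.
flat = [(i, j, 5.0) for i in range(4) for j in range(4)]
flat_error = holdout_rmse(flat, nearest_value)

# A sloping surface gives a non-zero error for the nearest-sample rule.
slope = [(i, j, float(i + j)) for i in range(4) for j in range(4)]
slope_error = holdout_rmse(slope, nearest_value)
```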

Spatial Interpolation Algorithms

A very brief review of the most commonly used spatial interpolation techniques

Algorithm – same as – a recipe

The estimate is a weighted average of the sampled values:

$$\hat{z}(x_0) = \sum_{i=1}^{n} \lambda_i \, z(x_i)$$

➢ where ẑ is the estimated value of an attribute at the point of interest x0,
➢ z is the observed value at the sampled point xi, λi is the weight assigned to the sampled point, and
➢ n represents the number of sampled points used for the estimation (Webster and Oliver, 2001).

Spatial Interpolation Techniques

Deterministic Methods:
✓ Natural neighbors: Thiessen polygons
✓ IDW: inverse distance weighted
✓ Spline functions

Geostatistical Methods:
✓ Kriging

Nearest Neighbors

The nearest neighbors (NN) method predicts the value of an attribute at an un-sampled point from the value of the nearest sample. Perpendicular bisectors are drawn between the sampled points (n), forming Thiessen (or Dirichlet/Voronoi) polygons (Vi, i = 1, 2, …, n).

The estimate of the attribute at any unsampled point within polygon Vi is the measured value at the nearest single sampled data point xi, that is, ẑ(x0) = z(xi). The weights are:

$$\lambda_i = \begin{cases} 1 & \text{if } x_0 \in V_i \\ 0 & \text{otherwise} \end{cases}$$

where λi is the weight assigned to the sampled point xi.

Thiessen Polygons

✓ Aka Nearest Neighbor Interpolation
✓ One point, the nearest sample point, is used to assign a value to an unsampled location
✓ Space is partitioned using Delaunay triangulation to create Thiessen (aka Voronoi or Dirichlet) polygons.
✓ Each location within a polygon is closer to that polygon's sample point than to any other sample point
✓ Defines areas of influence
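
For a points-to-raster output, the Thiessen rule needs no explicit polygons: each cell simply takes the value of its nearest sample. The sketch below assumes samples are (x, y, z) tuples and uses hypothetical function names; on a tie in distance it keeps whichever sample comes first in the list.

```python
import math

def nearest_sample(samples, x0, y0):
    """Return the (x, y, z) sample closest to (x0, y0)."""
    return min(samples, key=lambda s: math.hypot(s[0] - x0, s[1] - y0))

def thiessen_grid(samples, xs, ys):
    """Raster of nearest-sample values: one row per y in ys, one column per x in xs."""
    return [[nearest_sample(samples, x, y)[2] for x in xs] for y in ys]

samples = [(0, 0, 10.0), (10, 0, 20.0)]
grid = thiessen_grid(samples, xs=[1, 9], ys=[0, 5])
```

Cells in the left column lie in the polygon of the first sample and cells in the right column in the polygon of the second, so the surface changes abruptly at the polygon boundary.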


Triangular Irregular Network

The triangular irregular network (TIN) was developed by Peucker (Poiker) and co-workers (Little, Fowler, Mark 1978) for digital elevation modeling; it avoids the redundancies of the altitude matrix in the grid system.

Voronoi Polygons … or … Thiessen Polygons

Delaunay Triangulation (TIN)

Thiessen Polygons

✓ Thiessen polygon boundaries are the perpendicular bisectors of the straight lines drawn between neighboring points (the Delaunay triangulation, in red)

[Figure: police stations with their Voronoi (Thiessen) polygons; unconstrained allocation solution (point-polygon class)]

Thiessen Polygon Interpolation

Bolstad, 3rd edition, GIS Fundamentals

Thiessen Polygon Interpolation

✓ Local estimator: estimates are based on the nearest sample point
✓ Exact: sample values are maintained in output
✓ Deterministic: mathematical model based on Delaunay triangulation


Thiessen Polygon - Natural Neighbors

✓ A new Voronoi polygon (beige) is created around the interpolation point (red star).
✓ For each neighbor, the area of the portion of its original polygon that became incorporated in the tile of the new point is calculated.
✓ These areas of overlap are scaled to sum to 1 and are used as the weights for the corresponding samples.

[Figure: fixed search radius around the location to be estimated, enclosing the sample points used]

Inverse Distance Weighting

The inverse distance weighting or inverse distance weighted (IDW) method estimates the values of an attribute at un-sampled points using a linear combination of the values at sampled points, weighted by an inverse function of the distance from the point of interest to each sampled point.

The assumption is that sampled points closer to the un-sampled point are more similar to it in value than those further away (Tobler's Law, once again). The weights can be expressed as:

$$\lambda_i = \frac{1 / d_i^{\,p}}{\sum_{j=1}^{n} 1 / d_j^{\,p}}$$

where di is the distance between x0 and xi, p is a power parameter, and n represents the number of sampled points used for the estimation.

Local, exact, deterministic method
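
A minimal sketch of IDW as described above, assuming samples are (x, y, z) tuples. The function name `idw` and its optional `radius` argument (a fixed search radius) are illustrative choices rather than any particular GIS package's interface.

```python
import math

def idw(samples, x0, y0, p=2.0, radius=None):
    """Inverse-distance-weighted estimate at (x0, y0).
    p is the power parameter; radius, if given, limits the neighbourhood."""
    num = den = 0.0
    for x, y, z in samples:
        d = math.hypot(x - x0, y - y0)
        if d == 0.0:
            return z  # exact method: a coincident sample is returned unchanged
        if radius is not None and d > radius:
            continue  # outside the fixed search radius
        w = 1.0 / d ** p
        num += w * z
        den += w
    if den == 0.0:
        raise ValueError("no samples within the search radius")
    return num / den

samples = [(1, 0, 10.0), (0, 2, 20.0), (-4, 0, 30.0)]  # distances 1, 2, 4 from the origin
estimate_p1 = idw(samples, 0, 0, p=1)
estimate_p4 = idw(samples, 0, 0, p=4)
```

With a higher power the estimate moves toward the value of the nearest sample (10.0 here), which is the local behaviour the method relies on.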

Inverse  Distance  Weighting

[Figure: distance di between the estimation point x0 and a sample point xi]

IDW Interpolation

Zj = estimated value at location j

i = index over the sample points considered (here, 3 points)

n = user-defined exponent that can be used to increase the weight of nearby points (here n = 1, i.e., plain inverse distance)
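
As a worked illustration of this notation, the IDW estimator can be written as below. The equation is reconstructed from the symbol definitions, since the slide's own equation did not survive extraction; d_ij, the distance from sample i to location j, is an assumed symbol, and the three sample values and distances are made up for the example.

$$Z_j = \frac{\sum_i Z_i / d_{ij}^{\,n}}{\sum_i 1 / d_{ij}^{\,n}}$$

With three samples of value 10, 20 and 30 at distances 1, 2 and 4 from location j, and n = 1:

$$Z_j = \frac{10/1 + 20/2 + 30/4}{1/1 + 1/2 + 1/4} = \frac{27.5}{1.75} \approx 15.7$$

The estimate sits closest to the value of the nearest sample, as expected.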


✓ The main factor affecting the accuracy of IDW is the value of the power parameter.
✓ Weights diminish as distance increases, especially when the value of the power parameter increases, so nearby samples have a heavier weight and more influence on the estimate, and the resulting spatial interpolation is local.

The weight factor

The choice of power parameter and neighborhood size is arbitrary: based on random choice or personal whim.

IDW Interpolation

✓ Where the Voronoi method is based on the single closest point, IDW derives an estimate from a user-defined number of sample points to consider (i.e., a search radius).
  • The larger the number of sample points, the smoother the resulting surface, up to the point where all sample points are used and one value is estimated for the entire output surface.
✓ The user can also input a power parameter.
  • The higher the power, the greater the influence of nearby points, but the resulting surface is not as smooth as with a lower power.

IDW Interpolation

[Figure: IDW surfaces for different numbers of sample points (i) and weight exponents (n)]

IDW Interpolation

[Figure: IDW surfaces from 12 sample points with power p = 1 (linear), p = 2, and p = 4]

IDW Interpolation

Local points have more influence

Spline Interpolation

✓ The technique is named after the spline, a flexible ruler that draftsmen used to draw a smooth road through a set of survey points.
✓ The spline creates the smoothest possible line along the set of points.


Splines

✓ Spline functions, which are based on a set of polynomial functions, serve the same purpose as the bendy ruler with a set of sample points.

Splines

✓ For surface creation, spline functions are like bending a rubber sheet to pass through all the sample points, while minimizing the total curvature of the surface.
✓ Usually, but not always, exact interpolators, since forcing the surface through every sample may not result in a smooth surface.
✓ As with the IDW method, you can input the number of points to consider in the estimate.
  • The more points, the more distant sample points affect the local estimate and the smoother the overall surface.
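
One common way to make "minimizing the total curvature" precise is the thin-plate spline. The slides do not say which spline variant is meant, so the following is an illustrative assumption. With f the fitted surface and (x_i, y_i, z_i) the samples, the exact form minimizes the bending energy J subject to passing through every sample:

$$J(f) = \iint \left( f_{xx}^2 + 2 f_{xy}^2 + f_{yy}^2 \right) dx \, dy \quad \text{subject to} \quad f(x_i, y_i) = z_i$$

The inexact (smoothing) form trades fidelity against curvature through a smoothing parameter τ ≥ 0 (an assumed symbol):

$$\min_f \; \sum_{i=1}^{n} \big( z_i - f(x_i, y_i) \big)^2 + \tau \, J(f)$$

With τ = 0 the surface honours the samples exactly; larger τ gives a smoother surface that no longer passes through them.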

Splines

✓ Splines are good spatial interpolators for gently varying surfaces like elevation, water tables, pollution concentrations.

Splines

Polynomial

Polynomial comes from poly- (meaning "many") and -nomial (in this case meaning "term"), so it says "many terms".

A polynomial can have:
  • constants (like 3, -20, or ½)
  • variables (like x and y)
  • exponents (like the 2 in y²)

… that can be combined using addition, subtraction, multiplication and division, but never division by a variable.

Global polynomial interpolation (GPI)

✓ Global polynomial interpolation (GPI) fits a smooth surface that is defined by a mathematical function (a polynomial) to the input sample points.
✓ The global polynomial surface changes gradually and captures coarse-scale pattern in the data.
✓ Conceptually, GPI is like taking a piece of paper and fitting it between the raised points (raised to the height of their value).

Spatial regression

Use observations of dependent variables, independent variables, and sample coordinates to develop a prediction equation.

Zi = f(xi,yi,ai,bj)


Trend Surface:

✓ A spatial regression in which a statistical model, the trend surface, is fitted through the measured points.
✓ Trend surfaces are the most accurate when you need to fit a smoothly varying surface, such as the mean daily temperature over a large area.
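
The simplest trend surface is a plane, z = b0 + b1·x + b2·y, fitted by ordinary least squares. The sketch below assumes that first-order form and solves the normal equations in pure Python; the function names are hypothetical, and a higher-order polynomial would only add columns to the design matrix.

```python
def solve_linear(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back-substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_trend_surface(samples):
    """Least-squares plane through (x, y, z) samples; returns [b0, b1, b2]."""
    rows = [(1.0, x, y) for x, y, _ in samples]
    zs = [z for _, _, z in samples]
    # Normal equations: (A^T A) b = A^T z
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atz = [sum(r[i] * z for r, z in zip(rows, zs)) for i in range(3)]
    return solve_linear(ata, atz)

def predict_trend(coeffs, x, y):
    """Evaluate the fitted plane at (x, y)."""
    b0, b1, b2 = coeffs
    return b0 + b1 * x + b2 * y

# Samples lying exactly on z = 2 + 3x - y are recovered exactly.
samples = [(0, 0, 2.0), (1, 0, 5.0), (0, 1, 1.0), (1, 1, 4.0), (2, 3, 5.0)]
coeffs = fit_trend_surface(samples)
```

Because the plane is fitted to all samples at once and need not pass through any of them, this is a global, inexact method.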

[Figure: simple spatial regression, original surface vs. trend surface]

Kriging

✓ A set of geostatistical estimators
✓ Standard (i.e., non-spatial) statistical methods are based on the assumption of independent / normally distributed data values; spatial autocorrelation violates the independence assumption.
✓ Geostatistical models account for spatial autocorrelation, a measure of the tendency of nearby points to have similar values.

Kriging

✓ Kriging is based on 3 main components of the sample data: the spatial trend, spatial autocorrelation, and random variation.
✓ These three are combined in a mathematical model to create an estimation function.
✓ The function is then applied to the data for the sample points and used to estimate values over the surface of the study area.
✓ The semivariogram plots the semivariance against lag distance
✓ Semivariance is typically small at small lag distances and increases to a plateau

Kriging - Semivariogram

✓ The semivariogram is defined as

$$\gamma(s_i, s_j) = \tfrac{1}{2} \, \mathrm{var}\big( Z(s_i) - Z(s_j) \big)$$

where var is the variance.

✓ If two locations, si and sj, are close to each other in terms of the distance measure d(si, sj), you expect them to be similar, so the difference in their values, Z(si) - Z(sj), will be small. As si and sj get farther apart, they become less similar, so the difference in their values, Z(si) - Z(sj), will become larger.
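
In practice the semivariogram is estimated from the sample pairs. The sketch below uses the usual binned estimator, averaging half the squared difference over all pairs whose separation falls in each lag bin. That estimator, the bin layout, and the function name are assumptions; the slides give only the definition.

```python
import math
from itertools import combinations

def empirical_semivariogram(samples, lag_width, n_lags):
    """Average 0.5 * (z_i - z_j)^2 over sample pairs, binned by separation.
    Bin k covers distances [k * lag_width, (k + 1) * lag_width).
    Returns (bin-centre distance, semivariance, pair count) for non-empty bins."""
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for (x1, y1, z1), (x2, y2, z2) in combinations(samples, 2):
        k = int(math.hypot(x2 - x1, y2 - y1) / lag_width)
        if k < n_lags:
            sums[k] += 0.5 * (z1 - z2) ** 2
            counts[k] += 1
    return [((k + 0.5) * lag_width, sums[k] / counts[k], counts[k])
            for k in range(n_lags) if counts[k] > 0]

# Four samples on a line with a steady gradient: differences grow with distance.
transect = [(0, 0, 0.0), (1, 0, 1.0), (2, 0, 2.0), (3, 0, 3.0)]
variogram = empirical_semivariogram(transect, lag_width=1.0, n_lags=4)
```

On this transect the semivariance rises with lag, the pattern described above. Note that a steady gradient never levels off to a sill, which is one reason a trend is usually handled separately.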


Kriging - Semivariogram

The height that the semivariogram reaches when it levels off is called the sill.

nugget effect + partial sill = sill

The distance at which the semivariogram levels off to the sill is called the range.

[Figure: semivariogram curve with the nugget, partial sill, sill, and range labelled]

Kriging - Covariance function

✓ The covariance function is defined to be

$$C(s_i, s_j) = \mathrm{cov}\big( Z(s_i), Z(s_j) \big)$$

where cov is the covariance.

Covariance is a scaled version of correlation. When two locations, si and sj, are close to each other, you expect them to be similar, and their covariance (a correlation) will be large. As si and sj get farther apart, they become less similar, and their covariance becomes zero.

[Figure: covariance function decreasing with distance, with the nugget, partial sill, sill, and range labelled]

Kriging – Semivariogram & Covariance function

✓ The relationship between the semivariogram and the covariance function:

$$\gamma(s_i, s_j) = \text{sill} - C(s_i, s_j)$$

✓ Semivariogram and covariance both measure the strength of statistical correlation as a function of distance.
✓ There are some instances when semivariograms exist, but covariance functions do not.
✓ There are no hard-and-fast rules on choosing the "best" semivariogram model.
✓ The process of modeling semivariograms and covariance functions fits a semivariogram or covariance curve to your empirical data.
✓ The goal is to achieve the best fit, and also incorporate your knowledge of the phenomenon in the model.
✓ The model will then be used in your predictions.
✓ The sill, range, and nugget are the important characteristics of the model.

Kriging - Semivariogram

✓ Nugget: the initial semivariance, where autocorrelation is highest. Theoretically, the semivariance should be zero when the lag distance is zero, so a non-zero nugget is an indicator of error in the sample measurements.
✓ Sill: the plateau of the semivariogram; it can be thought of as the natural variation when there is little autocorrelation.
✓ Range: the lag distance at which the sill is reached.
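
The nugget, sill, and range become the parameters of whatever curve is fitted to the empirical semivariogram. The slides do not name a model, so the spherical model below is an assumption chosen because it reaches its sill exactly at the range. The second function applies the relationship γ = sill − C.

```python
def spherical_semivariogram(h, nugget, partial_sill, range_):
    """Spherical model: jumps to the nugget just above zero lag,
    then rises to the sill (nugget + partial sill) at h = range_."""
    if h <= 0.0:
        return 0.0  # semivariance is zero at zero lag
    if h >= range_:
        return nugget + partial_sill  # the sill
    r = h / range_
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)

def covariance_from_semivariogram(h, nugget, partial_sill, range_):
    """C(h) = sill - gamma(h)."""
    sill = nugget + partial_sill
    return sill - spherical_semivariogram(h, nugget, partial_sill, range_)

# Example parameters (made up): nugget 1, partial sill 4, range 10.
gamma_mid = spherical_semivariogram(5.0, 1.0, 4.0, 10.0)
gamma_far = spherical_semivariogram(20.0, 1.0, 4.0, 10.0)
cov_far = covariance_from_semivariogram(20.0, 1.0, 4.0, 10.0)
```

Beyond the range the semivariance stays at the sill and the covariance is zero, so samples farther apart than the range are treated as uncorrelated.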

Kriging - Semivariogram


Kriging Recap

✓ Kriging resembles IDW in that distances and weights are used to estimate values at unsampled locations.
✓ However, IDW uses a coarse weighting scheme (inverse distance), while Kriging uses the semivariance to calculate weights that minimize error in the predicted values.
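
To show how semivariance turns into weights, here is a small sketch of ordinary kriging, one common variant; the slides do not say which variant is meant. It assumes a spherical semivariogram with made-up parameters and solves the kriging system in pure Python. All names are hypothetical, and a real analysis would fit the semivariogram to data first.

```python
import math

def spherical(h, nugget, partial_sill, range_):
    """Assumed spherical semivariogram model; zero at zero lag."""
    if h <= 0.0:
        return 0.0
    if h >= range_:
        return nugget + partial_sill
    r = h / range_
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)

def solve_linear(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ordinary_kriging(samples, x0, y0, gamma):
    """Return (estimate, kriging variance, weights) at (x0, y0).
    gamma maps a separation distance to a semivariance."""
    n = len(samples)
    a = [[0.0] * (n + 1) for _ in range(n + 1)]
    b = [0.0] * (n + 1)
    for i, (xi, yi, _) in enumerate(samples):
        for j, (xj, yj, _) in enumerate(samples):
            a[i][j] = gamma(math.hypot(xi - xj, yi - yj))  # sample-to-sample
        a[i][n] = a[n][i] = 1.0  # constraint: weights sum to 1
        b[i] = gamma(math.hypot(xi - x0, yi - y0))  # sample-to-target
    b[n] = 1.0
    sol = solve_linear(a, b)
    weights, mu = sol[:n], sol[n]  # mu is the Lagrange multiplier
    estimate = sum(w * s[2] for w, s in zip(weights, samples))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return estimate, variance, weights

samples = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (0.0, 1.0, 3.0), (1.0, 1.0, 4.0)]
gamma = lambda h: spherical(h, 0.0, 1.0, 3.0)
estimate, variance, weights = ordinary_kriging(samples, 0.5, 0.5, gamma)
```

Unlike IDW, the weights depend on how the samples are arranged relative to each other as well as on their distance to the target, and the method returns a variance alongside each estimate.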

Kriging Pros & Cons

Pros
• Because they are based on statistical models, Kriging methods can produce evaluative measures of the accuracy of the predictions.
• Effective method when samples are sparse.
• In theory, Kriging methods should produce optimal interpolation weights.

Cons
• Much more complex and nuanced process than the deterministic spatial interpolation methods.
• There are no hard-and-fast rules on choosing the "best" semivariogram model.
• Much more computationally intensive.

Kriging Surface

[Figure: original contours vs. Kriging contours]

Kriging

✓ A detailed review of Kriging is beyond the scope of this presentation and course. Dedicated self-study and/or a course on geostatistics / spatial statistics (such as ESPMc177 / LD ARCHc177) is needed to better understand and appropriately apply the method.

✓ For more info, see:
  • Burrough & McDonnell, Principles of GIS, Chap. 5 & 6
  • Bailey & Gatrell, Interactive Spatial Data Analysis