Mahalanobis distance: a theoretical and practical approach
Preview
• An introduction to Mahalanobis distance
• Our project:
  – Methodology
  – Results
• A demonstration of how to use Mahalanobis distance
Mahalanobis distance
• Introduced by P. C. Mahalanobis in 1936
• A distance measure based on the correlations between variables, by which different patterns can be identified and analyzed with respect to a base or reference point (Taguchi & Jugulum, 2002)
Mahalanobis distance
• The Mahalanobis distance (M.D.) is a useful way of determining the "similarity" of a set of values from an "unknown" sample to a set of values measured from a collection of "known" samples
• Superior to Euclidean distance because it takes the distribution of the points (their correlations) into account
• Traditionally used to classify observations into different groups
Mahalanobis distance
[Figure: ecological distance, with points labelled x, y, z, w, r and p]
Mahalanobis distance
• $D_t^2(x) = (x - m_t)\, S_t^{-1}\, (x - m_t)'$
• $D_t^2$ is the generalized squared distance of each pixel from group $t$
• $S_t$ is the within-group covariance matrix of group $t$
• $m_t$ is the vector of the means of the variables of group $t$
• $x$ is the vector containing the values of the environmental variables observed at a given location (pixel)
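The formula on this slide can be tried out on a small example. The sketch below assumes NumPy; the function name `mahalanobis_sq` and the two-variable reference group (`m_t`, `S_t`) are made up for illustration and do not come from the project. It shows why the measure differs from Euclidean distance: both test points are equally far from the mean in Euclidean terms, but only one of them follows the correlation in the reference group.

```python
import numpy as np

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance D^2 = (x - m) S^-1 (x - m)'."""
    diff = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    # Solve S z = diff rather than forming the explicit inverse of S.
    return float(diff @ np.linalg.solve(cov, diff))

# Hypothetical reference group: two positively correlated variables.
m_t = np.array([0.0, 0.0])
S_t = np.array([[1.0, 0.8],
                [0.8, 1.0]])

along = np.array([1.0, 1.0])     # follows the correlation
against = np.array([1.0, -1.0])  # runs against the correlation

d2_along = mahalanobis_sq(along, m_t, S_t)      # about 1.11
d2_against = mahalanobis_sq(against, m_t, S_t)  # 10.0

# Both points have the same squared Euclidean distance (2.0) from the mean.
print(d2_along, d2_against)
```

The point that runs against the correlation is rated nine times farther away, although a Euclidean measure could not tell the two apart.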
Mahalanobis distance
• The result of using this algorithm (with GIS) is a single raster holding the ecological distance from the species' "optimal" conditions; the higher the distance, the less suitable the pixel's ecological conditions
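As a rough illustration of how such a distance raster could be produced outside a GIS, the sketch below assumes the environmental variables are stacked in one NumPy array and that a boolean mask marks the reference pixels. The function name, the array layout and the random demo data are assumptions for illustration, not the project's actual workflow.

```python
import numpy as np

def mahalanobis_raster(stack, ref_mask):
    """Squared Mahalanobis distance of every pixel from the reference pixels.

    stack:    array of shape (n_vars, rows, cols), one layer per variable
    ref_mask: boolean array of shape (rows, cols), True inside the reference area
    """
    n_vars, rows, cols = stack.shape
    pixels = stack.reshape(n_vars, -1).T.astype(float)  # (n_pixels, n_vars)
    ref = pixels[ref_mask.ravel()]
    mean = ref.mean(axis=0)                 # m_t
    cov = np.cov(ref, rowvar=False)         # S_t (sample covariance)
    diff = pixels - mean
    solved = np.linalg.solve(cov, diff.T).T  # S_t^-1 (x - m_t) for every pixel
    d2 = np.einsum("ij,ij->i", diff, solved)
    return d2.reshape(rows, cols)

# Hypothetical demo: three random layers, reference area in the middle.
rng = np.random.default_rng(0)
stack = rng.normal(size=(3, 20, 20))
ref_mask = np.zeros((20, 20), dtype=bool)
ref_mask[5:15, 5:15] = True

distance = mahalanobis_raster(stack, ref_mask)
print(distance.shape, float(distance.min()), float(distance.max()))
```

The covariance matrix must be invertible, so the reference area needs more pixels than there are variables, and no layer may be constant inside it.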
Mahalanobis vs. other classical statistical approaches
• 1. It takes into account not only the average value but also the variance and covariance of the variables measured
• 2. It accounts for ranges of acceptability (variance) between variables
• 3. It compensates for interactions (covariance) between variables
• 4. It is dimensionless
• 5. If the variables are normally distributed, the distances can be converted to probabilities using the χ² (chi-square) distribution
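Point 5 above can be sketched in a few lines. Under the normality assumption, the squared distance follows a chi-square distribution with as many degrees of freedom as there are variables, so its upper tail gives the probability that a pixel from the reference group lies at least that far from the mean. The code below is a standard-library sketch with hypothetical function names; a library routine such as SciPy's `scipy.stats.chi2.sf` would normally be used, and the power series here is only suitable for moderate distances.

```python
import math

def regularized_lower_gamma(a, x, tol=1e-12, max_terms=10_000):
    """Regularized lower incomplete gamma P(a, x), summed as a power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < tol * total:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))

def chi2_upper_tail(d2, n_vars):
    """P(chi-square with n_vars degrees of freedom >= d2)."""
    return 1.0 - regularized_lower_gamma(n_vars / 2.0, d2 / 2.0)

# With two variables, a squared distance of 2.0 has probability exp(-1).
p_two_vars = chi2_upper_tail(2.0, 2)
print(round(p_two_vars, 4))  # 0.3679
```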
Our project
Reports for the Large Predator Policy Statement: "Potential habitat for large carnivores in Scandinavia; a GIS analysis at the ecoregion level." NINA Fagrapport 064
Methodology
• Potential suitability maps
• Species involved: bear (Ursus arctos), wolf (Canis lupus), lynx (Lynx lynx), and wolverine (Gulo gulo)
• Two datasets:
  1. A set of environmental variables that we thought would influence the distribution of the large carnivores
  2. A training set consisting of known data on the present-day occurrence of the large carnivores
Variables
• Land cover (1 km x 1 km): derived from an AVHRR image and combined with an elevation model (100 m x 100 m)
  – Reclassified into 6 classes:
    • Water
    • Forest
    • Cultivated land
    • Mountain
    • Alpine tundra (above 550 meters classified as mountain)
    • Ice/snow/bare mountain
Variables
• Human density (SSB)
  – Number of people per square kilometer
  – Finland: people linked to buildings
  – Norway: people linked to addresses (GAB)
  – Sweden: people linked to estates
• Infrastructure
  – Harmonization of public and private roads in Sweden and Norway
  – Railways
Variables
• Prey density
  – Based on maps of the average number of moose, roe deer and deer shot per municipality (kommune), and of wild reindeer shot per wild reindeer management area
  – An index was created based on each species' preference for the prey species (Solberg et al. 2003)
  – Example: lynx – 20% deer, 100% roe deer, 80% wild reindeer
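The slide does not say how the preference percentages enter the index. One plausible reading is a preference-weighted sum of the prey densities; the sketch below assumes exactly that. The function name and the density figures are hypothetical, and only the lynx weights are taken from the example above.

```python
# Preference weights for lynx, taken from the example on the slide.
LYNX_PREFERENCE = {"deer": 0.20, "roe_deer": 1.00, "wild_reindeer": 0.80}

def prey_index(densities, preference):
    """Preference-weighted sum of prey densities (assumed form of the index)."""
    return sum(preference.get(species, 0.0) * density
               for species, density in densities.items())

# Hypothetical shot-per-area figures for one municipality.
example = {"deer": 10.0, "roe_deer": 5.0, "wild_reindeer": 0.0}

lynx_index = prey_index(example, LYNX_PREFERENCE)
print(lynx_index)  # 0.2 * 10 + 1.0 * 5 + 0.8 * 0 = 7.0
```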
Training data
• Core home ranges
  – Multiannual fixes from radio-collared female bears older than 2 years, from Sarek and Dalarna
  – Multiannual fixes from radio-collared female lynx older than 2 years, from Sarek, Grimsø, Nord-Trøndelag, Hedmark and Østfold
  – Multiannual fixes from radio-collared female wolverines older than 2 years, from Sarek, Troms and the Snøhetta Plateau
  – Pack ranges of both radio-collared and snow-tracked wolves
• The point coordinates were converted to home ranges with Ranges 6, using the 75% minimum convex polygon method
• The core home ranges for each species were transformed to masks (a mask grid forms the outline of our reference area)
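To give an idea of what a 75% minimum convex polygon involves, here is a standard-library sketch. It is not the Ranges 6 algorithm: the rule used to discard fixes (keeping the 75% closest to the arithmetic mean of all fixes) is an assumption, and the function names and coordinates are hypothetical.

```python
def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def mcp(fixes, fraction=0.75):
    """Hull of the `fraction` of fixes closest to the mean (assumed rule)."""
    cx = sum(x for x, _ in fixes) / len(fixes)
    cy = sum(y for _, y in fixes) / len(fixes)
    ranked = sorted(fixes, key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)
    keep = max(3, round(fraction * len(fixes)))
    return convex_hull(ranked[:keep])

# Hypothetical fixes (km): a cluster plus one distant excursion.
fixes = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0.5), (1.5, 1), (10, 10)]
core = mcp(fixes)
print(core)
```

Dropping the outermost fixes keeps occasional long excursions from inflating the core home range.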
Example mask grids
Pre-processing (ArcInfo)
• All data were converted from vector to raster (polygrid)
• Grids with 1 km x 1 km resolution
• Either binary (0/1) or continuous
• 16 bit
Pre-processing (ArcInfo)
• One projection!
• Parameters for Lambert Azimuthal Equal Area:
  – Units of measure: meters
  – Pixel size: 1000 meters
  – Radius of sphere: 6370997 m
  – Longitude of origin: 20 00 00 E
  – Latitude of origin: 55 00 00 N
  – False easting: 0.0
  – False northing: 0.0
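For readers without ArcInfo, the projection parameters above are enough to reproduce the map coordinates. The sketch below implements the textbook forward formula for the spherical Lambert Azimuthal Equal Area projection using only the standard library. The function name is hypothetical, and the Oslo coordinates in the demo are approximate and only there as an example.

```python
import math

R = 6370997.0              # radius of sphere in meters (from the slide)
LON0 = math.radians(20.0)  # longitude of origin, 20 E
LAT0 = math.radians(55.0)  # latitude of origin, 55 N

def laea_forward(lon_deg, lat_deg):
    """Spherical Lambert Azimuthal Equal Area: degrees to projected meters."""
    lam = math.radians(lon_deg) - LON0
    phi = math.radians(lat_deg)
    cos_c = (math.sin(LAT0) * math.sin(phi)
             + math.cos(LAT0) * math.cos(phi) * math.cos(lam))
    k = math.sqrt(2.0 / (1.0 + cos_c))
    x = R * k * math.cos(phi) * math.sin(lam)
    y = R * k * (math.cos(LAT0) * math.sin(phi)
                 - math.sin(LAT0) * math.cos(phi) * math.cos(lam))
    return x, y  # false easting and false northing are both 0.0

# Approximate position of Oslo; west and north of the projection origin.
x_oslo, y_oslo = laea_forward(10.75, 59.91)
print(round(x_oslo), round(y_oslo))
```

Dividing the projected coordinates by the 1000 m pixel size gives positions in grid cells relative to the projection origin.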
Focal operation (neighborhood)
• A circular window of 5 km radius ≈ 80 km²
  – Smallest core area
  – Species' perception of space (Salvatori 2003)
• Smoothing executed with FOCALMEAN (Tomlin 1990)
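The smoothing step can be imitated with NumPy. The sketch below is not ArcInfo's FOCALMEAN itself; the function names are hypothetical, and the choice to ignore cells outside the grid at the edges is an assumption. On a 1 km grid, a circular window of radius 5 cells contains 81 cell centers, which matches the roughly 80 km² quoted above.

```python
import numpy as np

def circular_offsets(radius_cells):
    """(dy, dx) offsets of all cells whose centers lie within the radius."""
    r = int(radius_cells)
    return [(dy, dx)
            for dy in range(-r, r + 1)
            for dx in range(-r, r + 1)
            if dy * dy + dx * dx <= radius_cells ** 2]

def focal_mean(grid, radius_cells=5):
    """Mean of the cells inside a circular window centered on each cell."""
    grid = np.asarray(grid, dtype=float)
    rows, cols = grid.shape
    r = int(radius_cells)
    # Pad with NaN so that cells beyond the grid edge can be skipped.
    padded = np.pad(grid, r, mode="constant", constant_values=np.nan)
    total = np.zeros_like(grid)
    count = np.zeros_like(grid)
    for dy, dx in circular_offsets(radius_cells):
        window = padded[r + dy : r + dy + rows, r + dx : r + dx + cols]
        valid = ~np.isnan(window)
        total += np.where(valid, window, 0.0)
        count += valid
    return total / count

# Hypothetical density grid: a single populated cell in the middle.
density = np.zeros((11, 11))
density[5, 5] = 81.0
smoothed = focal_mean(density)

print(len(circular_offsets(5)))  # 81 cells of 1 km², close to 80 km²
print(smoothed[5, 5])            # 1.0: the peak is spread over the window
```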
Focalmean
Example with human density around Indre Oslofjord
Results
• One single grid with values from 0 to 900 001
• The home range mask is used to cut out the reference dataset
• The dataset is processed in S-PLUS (0 values are deleted; the .33 and .66 quantiles are computed)
• The result grid is reclassified into these classes:
• 1. 0 – 33%
• 2. 33% – 66%
• 3. 66% – max (inside the home range)
• 4. Max – ∞ (900 001)
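The reclassification described above can be sketched with NumPy in place of S-PLUS. The function name, the demo grid and the handling of values that fall exactly on a class boundary are assumptions; the quantile levels and the four classes follow the slide.

```python
import numpy as np

def reclassify(distance_grid, home_range_mask):
    """Assign suitability classes 1-4 from the distances inside the home ranges."""
    ref = distance_grid[home_range_mask]
    ref = ref[ref != 0]                          # 0 values are deleted
    q33, q66 = np.quantile(ref, [0.33, 0.66])    # class boundaries 1|2 and 2|3
    upper = ref.max()                            # largest distance inside the home ranges
    # right=True puts a value equal to a boundary into the lower class.
    return np.digitize(distance_grid, [q33, q66, upper], right=True) + 1

# Hypothetical distance grid; the first five rows stand in for the home ranges.
distance = np.arange(1, 201, dtype=float).reshape(10, 20)
mask = np.zeros((10, 20), dtype=bool)
mask[:5, :] = True

classes = reclassify(distance, mask)
print(np.bincount(classes.ravel())[1:])  # number of cells in classes 1 to 4
```

Class 4 collects every cell that is farther from the reference conditions than anything observed inside the home ranges.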
Results
Area per suitability class
Country  Species     Class 1   Class 2   Class 3   Class 4   Classes 1-3 (km²)   Classes 1-3 (%)
Norway   Bear        168,342    74,286    63,473    15,989             306,101                95
         Wolf         48,788    83,283   155,491    34,528             287,562                89
         Wolverine    84,999    39,991    54,898   142,202             179,888                56
         Lynx        177,933    80,891    55,205     8,061             314,029                97
Sweden   Bear        231,349    76,684    97,250    41,028             405,283                91
         Wolf        196,945    99,839   111,546    37,981             408,330                91
         Wolverine    39,777    23,045    86,987   296,502             149,809                34
         Lynx        260,924    59,498   124,714     1,175             445,136                99
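The last two columns of the table can be checked with a little arithmetic. The sketch below reads the areas as whole square kilometers and assumes the percentage is the area in classes 1 to 3 divided by the area in all four classes. That assumption reproduces the Norwegian figures in the table; the function name is hypothetical.

```python
# Norway rows of the table: area in km² for suitability classes 1 to 4.
norway = {
    "bear":      (168_342, 74_286, 63_473, 15_989),
    "wolf":      (48_788, 83_283, 155_491, 34_528),
    "wolverine": (84_999, 39_991, 54_898, 142_202),
    "lynx":      (177_933, 80_891, 55_205, 8_061),
}

def suitable_area(class_areas):
    """Area in classes 1-3 and its share of the total over all four classes."""
    suitable = sum(class_areas[:3])
    share = 100.0 * suitable / sum(class_areas)
    return suitable, share

for species, areas in norway.items():
    suitable, share = suitable_area(areas)
    print(f"{species:10s} {suitable:8d} km2  {share:5.1f} %")
```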
Validation of the results
• Overlay with point data on shot female bears, lynx and wolverines, as well as observed lynx family groups and registered wolverine natal dens
• No independent data were available on wolves
• A historical dataset on bounty payments (skuddpremier) showed the presence of large carnivores over the whole Scandinavian peninsula
Results
Species / dataset      Data type     Class 1   Class 2   Class 3   Class 4
Bear                   Home range       89 %       9 %       2 %       0 %
Shot bears             Test points      86 %      11 %       3 %       0 %
Wolverine              Home range       82 %      12 %       6 %       0 %
Wolverine natal dens   Test points      65 %      19 %      13 %       3 %
Shot wolverines        Test points      43 %      12 %      21 %      24 %
Lynx                   Home range       85 %      13 %       2 %       0 %
Shot lynx              Test points      74 %      20 %       6 %       0 %
Family groups          Test points      72 %      21 %       7 %       0 %
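The percentages for the test points come from overlaying independent point data on the classified grid. A minimal standard-library sketch of that overlay is given below. The function name, the small class grid and the point positions are hypothetical, and points are assumed to be already converted to (row, column) cell indices.

```python
from collections import Counter

def class_shares(class_grid, points):
    """Percentage of (row, col) points falling in each suitability class 1-4."""
    counts = Counter(class_grid[r][c] for r, c in points)
    return {k: 100.0 * counts.get(k, 0) / len(points) for k in (1, 2, 3, 4)}

# Hypothetical classified grid and test points.
grid = [[1, 1, 2],
        [1, 3, 4],
        [2, 1, 1]]
test_points = [(0, 0), (0, 1), (1, 1), (2, 2)]

shares = class_shares(grid, test_points)
print(shares)  # {1: 75.0, 2: 0.0, 3: 25.0, 4: 0.0}
```

If the model is sound, independent points should fall mostly in class 1 and rarely in class 4, in roughly the same proportions as the home range data.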
Conclusion
• The results show large, non-fragmented areas suitable for large carnivores
• Over 90% of the total area is potentially suitable for reproductive females of bear, wolf and lynx
• About 48% of the total area is potentially suitable for wolverines
Recommended references
• Clark, J.D., Dunn, J.E. & Smith, K.G. 1993. A multivariate model of female black bear habitat use for a geographic information system. Journal of Wildlife Management 57(3): 519–526.
• Corsi, F., Sinibaldi, I. & Boitani, L. 1998. Large carnivores conservation areas in Europe; discussion paper for the Large Carnivore Initiative. IEA – Istituto Ecologia Applicata, Rome.
• Corsi, F., Dupre, E. & Boitani, L. 1999. A large-scale model of wolf distribution in Italy for conservation planning. Conservation Biology 13: 150–159.
• Knick, S.T. & Dyer, D.L. 1997. Distribution of black-tailed jackrabbit habitat determined by GIS in southwestern Idaho. Journal of Wildlife Management 61(1): 75–85.
• Salvatori, in prep. 2003.