-
8/3/2019 Geographic Theory and Geospatial Knowledge Discovery - Final
1/51
1
Geographic theory and geospatial knowledge discovery
Harvey J. Miller
Department of Geography
University of Utah
International Conference on Data Mining
Pisa, Italy, 18 December 2008
-
GIS trends
- Geospatial technologies
  - High-resolution monitors
  - Location-aware technologies
  - Geosensor networks
  - etc., etc.
- A tsunami of digital geo-data
  - Increased volume: giga- to terabytes and beyond
  - Increased coverage: seamless databases
  - Increased spectrum: text, sound, imagery
-
Introduction
- Geospatial knowledge discovery
  - Human-centered process of extracting novel, interesting and useful patterns from geo-referenced data
  - A (very!) special case of KDD
- Why is spatial special? A mini course (after Openshaw (1999))
  - Location is important
  - Observations are not independent
  - Errors are often spatial
  - Relationships are often local
  - Non-linearity is typical
  - Distributions are non-normal
  - Highly multivariate but often redundant
  - Time often interacts with space
  - Many data layers are categorical
  - Data objects often cannot be reduced to points
  - Spatially aggregated data are modifiable
-
Introduction
- Geographic theory and data mining
  - There is a rich and underexploited body of geographic theory
  - This can help guide the GKD process
    - Techniques
    - Background knowledge
    - Pattern evaluation
    - etc.
Geography: it's not just trivia!
-
Great geographic theories
- Spatial dependency
  - Tobler's first law
  - Cartographic transformations
- Spatial heterogeneity
  - Spatial non-stationarity
  - Disaggregate spatial statistics
- Spatial interaction
  - Spatial interaction theory
  - Time geography
- Spatial organization
  - The concept of region
  - Spatial logic
-
What I will not talk about
- The map
  - One of the most powerful technologies in the history of civilization
  - Still evolving!
  - Useful in GKD
    - Interfaces
    - Pattern visualization
  - But, why the map?
Figures: Earliest known map of the world, sixth century B.C.E.; Google Earth
Image sources: www.antweb.org, www.gutenberg.org
-
What I will not talk about
- Domain theory
  - Theories about processes specific to particular domains
    - Ecosystems: biology, biogeography
    - Landscapes: geology, geomorphology
    - Cities: economics, political science, sociology, geography
  - These are theories with geospatial components, but are not uniquely geography
-
What I am seeking and why
- A theory of geography
  - a unique perspective
  - a coherent way of thinking
  - amenable to formal and computational representation
- Framework for organizing GKD
- Suggest new techniques and strategies
-
Spatial dependency
- Tobler's First Law of Geography
  - "Everything is related to everything else, but near things are more related than distant things"
  - "Everything is related to everything else": spatial interdependency
  - "Near things are more related than distant things": interdependency and proximity
Tobler, W. R. 1970. A computer movie simulating urban growth in the Detroit region. Economic Geography 46: 234-240.
Photo: Waldo Tobler receiving the 1999 ESRI Lifetime Achievement Award (Susanna Baumgart - UCSB)
-
Spatial dependency
- Spatial autocorrelation
  - Association based on geospatial proximity
  - Confounding
    - Something to be corrected
    - e.g., econometrics
  - Informing
    - Reveals information about spatial process
    - e.g., spatial autocorrelation statistics, spatial econometrics
esri.com
Figure: Body Mass Index in Salt Lake City, USA (Dr. Ikuho Yamada, University of Utah)
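One common spatial autocorrelation statistic is global Moran's I, which is not named on the slide and is used here only as an illustration. The sketch below assumes a hypothetical row of six locations with binary neighbor weights; the values are made up to contrast a clustered pattern with an alternating one.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weights matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(weights[i][j] for i in range(n) for j in range(n))
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)


def chain_weights(n):
    """Binary contiguity for n locations on a line (neighbors: i-1 and i+1)."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
            for i in range(n)]


w = chain_weights(6)
clustered = [1, 1, 1, 5, 5, 5]      # hypothetical: similar values are adjacent
alternating = [1, 5, 1, 5, 1, 5]    # hypothetical: neighbors always differ
print(morans_i(clustered, w))       # positive: near things are alike
print(morans_i(alternating, w))     # negative: near things are unlike
```

Positive values indicate that nearby observations resemble each other, which is the situation Tobler's First Law describes; the choice of weights matrix is itself a modeling decision about proximity.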
-
Spatial dependency
- Spatial interpolation
  - Estimate variables at unobserved locations using values at observed locations
  - Based on modeled proximity relationships
  - e.g., IDW, kriging
Figure: Spatial interpolation of influenza over time in Europe, 2004-2005 (www.eurosurveillance.org)
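Inverse distance weighting (IDW) can be sketched in a few lines. The sample points, the target location and the distance exponent `power=2.0` below are assumptions for illustration, not values from the talk.

```python
import math


def idw(known, target, power=2.0):
    """Inverse distance weighted estimate at target from (x, y, value) samples."""
    num = 0.0
    den = 0.0
    for x, y, value in known:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value  # target coincides with an observed location
        weight = 1.0 / d ** power
        num += weight * value
        den += weight
    return num / den


# Hypothetical observations: (x, y, observed value)
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (0.0, 2.0, 30.0)]
print(idw(samples, (1.0, 0.0)))
```

The estimate is a weighted average in which closer observations count more; kriging replaces the fixed distance-decay weights with weights derived from a fitted model of spatial covariance.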
-
Spatial dependency
- Geo-space
  - Proximity is the core of theories of geo-space
  - Two main components
    - Locations
    - Length metric
  - Formal theory
    - Beguin and Thisse (1979), others
    - Admits a wide range of length metrics
    - Including semi-metrics
Miller and Wentz (2003) Annals, AAG
The fundamental tenet of geography: Geo-space is explanatory
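To make the idea of a length metric concrete, the sketch below checks which of the usual metric axioms a matrix of separations satisfies. The travel-time matrix is hypothetical, and the code only reports each property; which relaxed set of axioms counts as a "semi-metric" depends on the definition adopted, so that label is deliberately not assigned here.

```python
from itertools import permutations


def metric_properties(d):
    """Report which metric axioms hold for a square matrix d of separations."""
    idx = range(len(d))
    return {
        "identity": all(d[i][i] == 0 for i in idx),
        "non_negative": all(d[i][j] >= 0 for i in idx for j in idx),
        "symmetric": all(d[i][j] == d[j][i] for i in idx for j in idx),
        "triangle": all(d[i][k] <= d[i][j] + d[j][k]
                        for i, j, k in permutations(idx, 3)),
    }


# Hypothetical travel times (minutes) among three places; a one-way street
# makes the trip from place 1 to place 0 slower than the reverse.
travel_time = [[0, 5, 9],
               [8, 0, 4],
               [9, 4, 0]]
props = metric_properties(travel_time)
print(props)
```

Travel time is a natural example of a separation measure that can fail symmetry while still supporting meaningful proximity relationships.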
-
Spatial dependency
- Geo-space does not have to be Euclidean
  - Geographic processes can follow other metrics
- Cartographic transformations
  - Project geo-space based on:
    - Alternative proximity relationships
    - Smooth spatial heterogeneity
  - Why?
    - Visualization
    - Improve explanation
Figures: CartoDraw (Keim, North & Panse); Swedish migration map (Hagerstrand, cited by Tobler 1963)
-
Spatial dependency
Figures: Iceland in air passenger space; Air passenger flows in Iceland
Cliff and Haggett (1998)
Spatial modeling of disease propagation in an alternative geo-space
-
Spatial dependency
- Time-space maps
  - Map with separation measured in travel time
  - Why?
    - Exploratory visualization
    - Synoptic summary
    - Greater explanatory power
Figure: Time-space transformation of Salt Lake City, USA, based on average daily travel times (vectors represent displacements). Nobbir Ahmed and Miller (2007)
-
Spatial dependency
Figure: Time-space transformations for 4 periods of the day (morning, midday, afternoon and evening), 1992 and 2001
-
Spatial dependency
- Solutions are sometimes > 3D
Figures: Highly stressed solution, western SLC; Less stressful surface representation
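"Stress" in a time-space map can be read as the mismatch between distances in the transformed map and the travel times they are meant to represent. The sketch below uses a raw-dissimilarity form of Kruskal's stress-1 as one plausible measure; whether the original study used exactly this formula is an assumption, and both small configurations are hypothetical.

```python
import math


def stress_1(coords, target):
    """Stress-1 between map distances and target separations (e.g. travel times)."""
    num = 0.0
    den = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])  # works in any dimension
            num += (d - target[i][j]) ** 2
            den += d ** 2
    return math.sqrt(num / den)


# Hypothetical travel times that fit a plane exactly (a 3-4-5 triangle)
fits = stress_1([(0, 0), (3, 0), (0, 4)],
                [[0, 3, 4], [3, 0, 5], [4, 5, 0]])
# Hypothetical travel times that violate the triangle inequality,
# so no flat map can reproduce them without distortion
strained = stress_1([(0, 0), (1, 0), (2, 0)],
                    [[0, 1, 5], [1, 0, 1], [5, 1, 0]])
print(fits, strained)
```

Because `math.dist` accepts points of any dimension, the same function can score configurations with more than two or three coordinates, which is where low-stress solutions sometimes live.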
-
Spatial dependency
- Comparing alternative spaces
  - Bi-dimensional regression
    - Degree of fit between two planar configurations
    - After transformations, rotations and translations
    - Fit and significance measures
    - Including spatial variation
    - Can be extended to higher dimensions
Nobbir Ahmed and Miller
Tobler (1994)
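A minimal sketch of the Euclidean (similarity) case of bidimensional regression follows. It treats each planar point as a complex number so that one complex slope captures rotation and uniform scaling and one complex intercept captures translation. The function name and the test configurations are hypothetical, and significance testing and the affine and curvilinear variants are omitted.

```python
def bidimensional_regression(source, target):
    """Least-squares similarity fit of target points on source points.

    Returns (alpha, beta, r_squared) for the map w = alpha + beta * z,
    where z and w are the source and target points as complex numbers.
    """
    z = [complex(x, y) for x, y in source]
    w = [complex(x, y) for x, y in target]
    n = len(z)
    z_mean = sum(z) / n
    w_mean = sum(w) / n
    beta = (sum((wi - w_mean) * (zi - z_mean).conjugate()
                for zi, wi in zip(z, w))
            / sum(abs(zi - z_mean) ** 2 for zi in z))
    alpha = w_mean - beta * z_mean
    residual = sum(abs(wi - (alpha + beta * zi)) ** 2 for zi, wi in zip(z, w))
    total = sum(abs(wi - w_mean) ** 2 for wi in w)
    return alpha, beta, 1.0 - residual / total


# Hypothetical check: a unit square rotated 90 degrees, doubled in size
# and shifted by (1, 1) should be recovered exactly.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
moved = [(1, 1), (1, 3), (-1, 3), (-1, 1)]
alpha, beta, r_squared = bidimensional_regression(square, moved)
print(alpha, beta, r_squared)
```

Here `abs(beta)` is the scale change and the angle of `beta` is the rotation; an `r_squared` below 1 measures how far the two configurations differ after the best similarity transformation has been removed.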
-
Spatial heterogeneity
- Geographic variation occurs naturally
  - Friction of distance
  - Relative location
- Spatial processes are non-stationary
  - Apparent variation in process with respect to location
  - If it's stationary, it's not spatial!
www.geovista.psu.edu
-
Spatial heterogeneity
- Question: What do Charles Darwin and Paul Krugman have in common?
  - Besides beards
Image sources: uk.gizmodo.com, ericblackink.minnpost.com
-
Spatial heterogeneity
- Both recognized the power of spatial heterogeneity
- Darwin
  - Observed geographic variation in species
  - Natural selection leads to differences in species diversity and composition among different geographic locations
  - Long distance dispersal results in geographic isolation and evolutionary divergence
Image source: uk.gizmodo.com
-
Spatial heterogeneity
- Both recognized the power of spatial heterogeneity
- Krugman
  - New economic geography
  - Geographic variation in productive factors
  - Increasing returns enhance variation and lead to greater heterogeneity
Image source: ericblackink.minnpost.com
-
Spatial heterogeneity
- Disaggregate spatial statistics
  - Decompose processes by location
  - Examples
    - Getis-Ord G
    - K-function analysis
    - Geographically weighted regression
  - Unimaginable prior to GIS!
    - Data intensive
    - Visually intensive
Figure: Local clustering of birth defects, Shanxi province, China. Wu et al (2004), www.biomedcentral.com
-
Spatial heterogeneity
- Geographically weighted regression
  - Assess spatial variation in model structure
    - Parameter estimates
    - Parameter errors
    - Goodness of fit
    - Influence
  - Determine whether variation is systematic
  - Validate models between data subsets
  - Ask questions about spatial structures in data
Figure: GWR with different spatial lags. Laffin, S. W. GeoComputation 99
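The core of GWR is an ordinary regression refitted at each location, with observations weighted by their proximity to that location. The sketch below does this for a single predictor with a Gaussian kernel; the kernel choice, the bandwidth and the data (a relationship whose slope is 1 in the west and 3 in the east) are all assumptions for illustration.

```python
import math


def gwr_at(point, locations, x, y, bandwidth):
    """Local (intercept, slope) of y on x at `point` via weighted least squares."""
    w = [math.exp(-0.5 * (math.dist(point, loc) / bandwidth) ** 2)
         for loc in locations]  # Gaussian distance-decay weights
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data: three western and three eastern observation sites
locations = [(0, 0), (1, 0), (2, 0), (8, 0), (9, 0), (10, 0)]
x = [1, 2, 3, 1, 2, 3]
y = [1, 2, 3, 3, 6, 9]      # y = 1 * x in the west, y = 3 * x in the east
west = gwr_at((1, 0), locations, x, y, bandwidth=1.0)
east = gwr_at((9, 0), locations, x, y, bandwidth=1.0)
print(west, east)
```

A single global regression on these data would report one compromise slope; the local fits recover the two regimes, which is the kind of spatial variation in model structure the slide refers to. Mapping such local estimates and their t-values gives displays like the Dublin example.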
-
Spatial heterogeneity
Figure: GWR: Spatial variation in the effect of social class on voter turnout, Dublin, Ireland
Panels: Parameter estimates (PARM_4, legend range -1.857220 to 3.123810); t-tests (TVAL_4, legend range -3.008610 to 5.342070, class breaks at -2.58, -1.96, 1.96 and 2.58)
Fotheringham and Demsar (2009)
-
Spatial heterogeneity
- GWR and visual insight
  - Use visual analytics to explore parameter space
  - Example
    - SOM clusters based on eight parameters
    - Cartographic visualization of clusters
Fotheringham and Demsar (2009)
-
Spatial heterogeneity
- GWR and visual insight
  - Use visual analytics to explore parameter space
  - Example
    - Cluster selection
    - Parallel coordinate plot of clusters across all eight variables
Fotheringham and Demsar (2009)
-
Spatial interaction
- Spatial interaction theory
  - Linkages and flows between locations
    - Spatial separation (-)
    - Complementarity (+)
      - Origin supply
      - Destination demand
  - Can be multidimensional
    - Map multiple variables into a single measure
  - Originally an analogy with Newton's Law of Gravitation, but ...
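The Newtonian analogy can be written as an unconstrained gravity model in which flow rises with origin supply and destination demand and falls with separation. The sketch below assumes a power distance-decay function; the constant `k`, the exponent `beta` and all input values are hypothetical.

```python
def gravity_flows(origins, destinations, distance, k=1.0, beta=2.0):
    """Unconstrained gravity model: T_ij = k * O_i * D_j / d_ij ** beta."""
    return [[k * o * d / distance[i][j] ** beta
             for j, d in enumerate(destinations)]
            for i, o in enumerate(origins)]


# Hypothetical supply at two origins, demand at two destinations,
# and the separation between each origin-destination pair
supply = [100.0, 200.0]
demand = [50.0, 150.0]
separation = [[1.0, 2.0],
              [2.0, 1.0]]
flows = gravity_flows(supply, demand, separation)
print(flows)
```

Nothing in this form guarantees that predicted flows add up to the observed totals leaving each origin or arriving at each destination, which is one motivation for the constrained model families that followed.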
-
Spatial interaction
- ... spatial interaction has a solid theoretical base
  - Entropy maximization
    - Alan Wilson, 1960s
  - Discrete choice theory
    - Stewart Fotheringham, 1980s
- ... which has resulted in a wide spectrum of models
  - Flow total constraints and quasi-constraints
  - Spatial association among origins, destinations
  - Behavioral processes
  - etc., etc.
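As one member of that family, a doubly constrained entropy-maximizing model forces predicted flows to match known origin and destination totals through balancing factors. The sketch below assumes an exponential cost function and solves the balancing factors by simple iteration; the totals, costs and `beta` are hypothetical, and origin and destination totals must sum to the same value.

```python
import math


def doubly_constrained(origins, destinations, cost, beta, iterations=100):
    """T_ij = A_i * O_i * B_j * D_j * exp(-beta * c_ij) with balancing factors."""
    n, m = len(origins), len(destinations)
    f = [[math.exp(-beta * cost[i][j]) for j in range(m)] for i in range(n)]
    a = [1.0] * n
    b = [1.0] * m
    for _ in range(iterations):
        # Each factor rescales flows so one set of totals is reproduced
        a = [1.0 / sum(b[j] * destinations[j] * f[i][j] for j in range(m))
             for i in range(n)]
        b = [1.0 / sum(a[i] * origins[i] * f[i][j] for i in range(n))
             for j in range(m)]
    return [[a[i] * origins[i] * b[j] * destinations[j] * f[i][j]
             for j in range(m)]
            for i in range(n)]


# Hypothetical totals (both sum to 300) and travel costs
out_totals = [100.0, 200.0]
in_totals = [120.0, 180.0]
travel_cost = [[1.0, 2.0],
               [2.0, 1.0]]
trips = doubly_constrained(out_totals, in_totals, travel_cost, beta=0.5)
print(trips)
```

Dropping one set of balancing factors gives the origin-constrained or destination-constrained variants, which is one way the "flow total constraints" on the slide translate into different models.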
-
Spatial interaction
- Data mining of spatial interactions
  - Existing techniques
    - Connections
    - Flows
  - Need better techniques
    - Attributed flows
    - Spatial object dyads
    - Origin-destination pairs
Figures: CubeView, detecting outliers in flows (Shashi Shekhar); Visualizing social networks (orgnet.com)
-
Spatial interaction
- The death of distance?
  - Distance is changing
    - High mobility
    - Connectivity
  - Space-adjusting technologies (Ron Abler)
    - Change the nature of space with respect to the time, cost and effort
  - Space-time convergence (Don Janelle)
    - Shrinking of distance due to transport
    - Rate per unit time
Figure: Convergence: Edinburgh and London 1658-1950 (Janelle 1969)
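Read as a "rate per unit time", space-time convergence can be computed as the average reduction in travel time between two places per year. The travel times and years in the sketch are hypothetical and are not the Edinburgh-London figures from the slide.

```python
def convergence_rate(time_then, time_now, year_then, year_now):
    """Average decline in travel time per year (positive means converging)."""
    return (time_then - time_now) / (year_now - year_then)


# Hypothetical: a trip that took 600 minutes in 1900 and 240 minutes in 1950
rate = convergence_rate(600.0, 240.0, 1900, 1950)
print(rate, "minutes per year")
```

A negative result would indicate divergence, for example when congestion lengthens a trip over time.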
-
Spatial interaction
- Telepresence (Don Janelle)
  - Participate in events without physical presence
- Space-time fragmentation (Helen Couclelis)
  - Spatial fragmentation
    - Activities not tightly coupled with place
  - Temporal fragmentation
    - Activities outside standard hours
- Fluid time
  - Short planning horizons
  - Flocking
- Need to expand theories of spatial interaction
Photo: Why let climbing a mountain interfere with business? Mt. Olympus, Utah, 18 June 2006
-
Spatial interaction
- Time geography
  - Individual in geo-space and time
  - Constraints imposed by:
    - Activity timing
    - Activity locations
    - Mobility resources
      - Ability to trade time for space
  - Space-time path
    - Realized movement
Figure: Paths in theory and practice (Meipo Kwan; Miller (2005))
-
Spatial interaction
- Time geography
  - Individual in geo-space and time
  - Constraints imposed by:
    - Activity timing
    - Activity locations
    - Mobility resources
      - Ability to trade time for space
  - Space-time prism
    - Potential movement
Figure: Prism in theory and practice; axes t and x; labeled symbols t_i, t_j, x_i, x_j, a_ij, t_ij, v_ij (Miller and Bridwell (2009))
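A space-time prism can be sketched as a reachability test: a location is inside the prism if the individual can leave the first anchor, visit it and still reach the second anchor in time. The symbol names follow the labels on the prism figure, but their interpretation here is an assumption: x_i and x_j are anchor locations, t_i and t_j the departure and required arrival times, v_ij a maximum travel speed and a_ij a stationary activity time. Straight-line travel is also assumed.

```python
import math


def in_prism(x, x_i, x_j, t_i, t_j, v_ij, a_ij=0.0):
    """True if x can be visited between leaving x_i at t_i and reaching x_j by t_j."""
    travel = (math.dist(x_i, x) + math.dist(x, x_j)) / v_ij
    return travel + a_ij <= t_j - t_i


# Hypothetical anchors 4 units apart, a 10-unit time budget,
# unit speed and 2 time units of stationary activity
home, work = (0.0, 0.0), (4.0, 0.0)
near = in_prism((2.0, 3.0), home, work, t_i=0.0, t_j=10.0, v_ij=1.0, a_ij=2.0)
far = in_prism((2.0, 4.0), home, work, t_i=0.0, t_j=10.0, v_ij=1.0, a_ij=2.0)
print(near, far)
```

In the plane, the set of locations that pass this test is an ellipse with the two anchors as foci; raising the speed or the time budget enlarges it, which is the sense in which mobility resources trade time for space.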
-
Spatial interaction
Communication modes based on spatio-temporal constraints (Janelle (1995))

Temporal \ Spatial   Presence              Telepresence
Synchronous          SP: Face-to-face      ST: Telephone, TV
Asynchronous         AP: Post-it notes     AT: Mail, Email, Webpages

Figure: Possible time geographic expressions (relations between paths; temporal events)
-
Space-time cube: Visual analytic environment for exploratory time geography (Kraak and Huisman)
Figures: Linking space-time paths with attributes; Visualizing the intersection of multiple space-time prisms
-
Spatial interaction
- Interactive, multiscale visualization of space-time paths
  - Explore paths at different levels of spatio-temporal granularity
  - Aggregation based on spatial similarity and attribute similarity (eventually)
Tetsuo Kobayashi and Miller (in progress)
-
Spatial organization
- The concept of region
  - Partitioning of geographic space based on homogeneity
- Two types of region
  - Formal
    - Explicit
    - Land cover, terrain, settlement patterns
  - Functional
    - Implicit
    - Organization, interactions, linkages
Figure: Formal regions based on biogeography (www.desertmuseum.org)
-
Spatial organization
- Regions and locational processes
  - Functional regions highlight the interplay between spatial process and spatial pattern
Figure: Von Thunen bid-rent theory (www.rri.wvu.edu; teacherweb.ftl.pinecrest.edu)
-
Spatial organization
- Central Place Theory
  - Theory of the frequency, size and spacing of cities as market centers
Figures: Nesting of market areas and cities (wolf.readinglitho.co.uk); panels Distance, Transport, Administration (Wikipedia)
-
Spatial organization
- Spatial logic
  - A route to explanation
    - Spatial logic: Pattern suggests process
    - Process logic: Process suggests patterns
  - Why?
    - Patterns are integrated manifestations of complex processes
  - Why not?
    - Difficult to distinguish individual processes
    - Equifinality
Figure: Continental drift inferred through spatial logic by Alfred Wegener (1912)
-
Spatial organization
- Geography and complexity
  - Can spatial interaction explain intricate geographic patterns?
- Complexity theory
  - Simple, local interactions can generate complex global behavior
- Importance of geographic context
  - Pattern and intensity of interactions
-
Spatial organization
- The problem of arbitrary regions
  - Arbitrary regions lead to artifacts
  - Two types of effects
    - Scale
    - Zoning
  - Solutions
    - Design optimal regions
    - No regions!
    - Assess effects
Modifiable Areal Unit Problem example (after Oliver; www.geog.ubc.ca)

Original cells:
  10   15    5
   5   10   15
   5   10    5
  n = 9; mean = 8.89

Aggregated zones:
  6.67   11.67   6.67     n = 3; mean = 8.34
  7.5    11.25   6.67     n = 3; mean = 8.47
  12.5   8       7.5      n = 3; mean = 9.33
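The zoning effect in this example can be reproduced directly. The grid below is the one on the slide, read row by row. The two zonings are assumptions: the original zone boundaries are not recoverable from the text, so each zoning is simply one grouping of cells that yields the zone values 7.5, 11.25, 6.67 and 12.5, 8, 7.5 shown on the slide.

```python
grid = [[10, 15, 5],
        [5, 10, 15],
        [5, 10, 5]]


def mean(values):
    return sum(values) / len(values)


def zone_means(grid, zones):
    """Average the cell values in each zone; a zone is a list of (row, col)."""
    return [mean([grid[r][c] for r, c in zone]) for zone in zones]


# Assumed zonings: three zones each, same nine cells, different boundaries
zoning_a = [[(0, 0), (1, 0)],
            [(0, 1), (0, 2), (1, 1), (1, 2)],
            [(2, 0), (2, 1), (2, 2)]]
zoning_b = [[(0, 0), (0, 1)],
            [(0, 2), (1, 0), (1, 1), (1, 2), (2, 0)],
            [(2, 1), (2, 2)]]

cell_mean = mean([v for row in grid for v in row])
mean_a = mean(zone_means(grid, zoning_a))
mean_b = mean(zone_means(grid, zoning_b))
print(round(cell_mean, 2), round(mean_a, 2), round(mean_b, 2))
```

The underlying data never change, yet the mean of the zone values moves from 8.89 to 8.47 or 9.33 depending only on where the boundaries are drawn.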
-
Is there a theory of geography?
- Yes!
  - A unique perspective focusing on the role of spatio-temporal proximity
- Is it formal?
  - Yes: geographers have been building the formal and analytical foundations of their field
- Is it coherent?
  - Yes, but it is not unified
  - Still need a grand unified theory derived from first principles
Image credit: Corné van Elzakker, ITC; www.itc.nl
-
Opportunities and challenges
- Spatial patterns & relations
  - Potentially large!
- Geo-theory can guide GKD
  - Background knowledge
  - Pattern evaluation
- Background knowledge: challenges
  - Geographic ontology
    - Concepts can be abstract, vague, multi-level
  - Knowledge extraction
    - Geo-theory: Implicit information (equations, algorithms, etc.)
    - KD: Explicit information (networks, hierarchies, rules)
Figure: Concept hierarchy for location, based on Han and Kamber (2003)
-
Opportunities and challenges
- Spatial pattern evaluation
  - Reality = theory: interesting but not novel
  - Reality = null: not interesting or novel
  - Between theory and null: maybe interesting and novel
- Problems
  - What is a good spatial null?
    - Not Complete Spatial Randomness (CSR)
  - What is the metric?
    - How do we measure spatial departures from theory and null?
-
Opportunities and challenges
- Geographic theory as a pattern filter
  - Spatial data mining often generates a large number of spatial and temporal patterns and relationships
  - Meta-mining (Roddick 1999)
    - Mining the results of previous mining exercises
    - Derive higher-level patterns and rules
-
Opportunities and challenges
- Algorithms and infrastructure
  - Geographic models and techniques can be computationally complex
    - Often involve pairwise distances between all geo-locations
  - Research needs
    - Heuristics
    - High-performance computing
This is a surprisingly under-researched area! 10 years old!
-
Conclusion
- Geographic theory
  - Rich, coherent, formal
  - Useful, but underexploited in data mining
  - Waiting to be discovered
Help fill the blank spots on the map!
Image source: wikimedia.org
-
Bibliography
Ahmed, N. and Miller, H. J. (2007) "Time-space transformations of geographic space for exploring, analyzing and visualizing transportation systems," Journal of Transport Geography, 15, 2-17.
Beguin, H. and Thisse, J. F. (1979) "An axiomatic approach to geographical space," Geographical Analysis, 11, 325-41.
Fotheringham, A. S. (1983) "A new set of spatial-interaction models: The theory of competing destinations," Environment and Planning A, 15, 15-36.
Fotheringham, A. S. and Demšar, U. (2009) "Looking for a relationship? Try GWR?" in H. J. Miller and J. Han (eds.) Geographic Data Mining and Knowledge Discovery, second edition, Taylor and Francis, in press.
Getis, A. and Ord, J. K. (1992) "The analysis of spatial association by use of distance statistics," Geographical Analysis, 24, 189-206.
Janelle, D. G. (1969) "Spatial organization: A model and concept," Annals of the Association of American Geographers, 59, 348-64.
Janelle, D. G. (1995) "Metropolitan expansion, telecommuting and transportation," in S. Hanson (ed.) The geography of urban transportation, New York: Guilford, 407-34.
Kraak, M. J. and Huisman, O. (2009) "Beyond exploratory visualization of space-time paths," in H. J. Miller and J. Han (eds.) Geographic Data Mining and Knowledge Discovery, second edition, Taylor and Francis, in press.
Miller, H. J. (2004) "Tobler's First Law and spatial analysis," Annals of the Association of American Geographers, 94, 284-289.
Miller, H. J. (2005) "A measurement theory for time geography," Geographical Analysis, 37, 17-45.
Miller, H. J. (2005) "Necessary space-time conditions for human interaction," Environment and Planning B: Planning and Design, 32, 381-401.
Miller, H. J. and Bridwell, S. A. (2009) "A field-based theory for time geography," Annals of the Association of American Geographers, 99 (in press).
Miller, H. J. and Wentz, E. A. (2003) "Representation and spatial analysis in geographic information systems," Annals of the Association of American Geographers, 93(3), 574-594.
Tobler, W. R. (1963) "Geographic area and map projections," The Geographical Review, 53, 59-78.
Tobler, W. R. (1970) "A computer movie simulating urban growth in the Detroit region," Economic Geography, 46, 234-240.
Tobler, W. R. (1994) "Bi-dimensional regression," Geographical Analysis, 26, 187-212.