Managing Spatial Data Chapter 8 Slides from James Pick, Geo-Business: GIS in the Digital Organization, John Wiley and Sons, 2008. Copyright © 2008 John Wiley and Sons. DO NOT CIRCULATE WITHOUT PERMISSION OF JAMES PICK Copyright (c) 2008 by John Wiley and Sons

Post on 21-Dec-2015


Page 1:

Managing Spatial Data

Chapter 8 Slides from

James Pick, Geo-Business: GIS in the Digital Organization, John Wiley and Sons, 2008.

Copyright © 2008 John Wiley and Sons.

DO NOT CIRCULATE WITHOUT

PERMISSION OF JAMES PICK

Page 2:

Topics covered

• The lecture concerns the design and uses of spatial databases and data warehouses.

• The relational, object-oriented, and object-relational database models are explained and compared on their pluses and minuses. The first two are covered in Connections-A. The relational model is the one most frequently used for GIS.

• Data warehouses have different uses than databases. They are used to archive huge amounts of data over a long period of time. Their data design is simpler than that of databases.

• Spatial data warehouse examples are given for an auto insurance firm and the City of Portland.

• The final topic is data quality. GIS professionals and users need to scrutinize data continually for errors and correct them.

Page 3:

Review - Design Elements of a GIS

(Source: Pick, 2007)

Relational databases involve the manipulation of attribute tables and of the intermediate tables that are created along the way.

Page 4:

Relationship of Spatial Data and Attribute Data (Fig. 8.2)

Page 5:

Databases: Relational Model

• The relational model is based on organizing data into a series of tables. For each table, the rows represent records, while the columns indicate attributes.

– An example is the attribute tables in ArcGIS.

– The tables are related to each other through relational operators. For instance, the Join operator joins two tables together. Many operators can work in sequence to support complex spatial data manipulation (think of some of the manipulations in the ArcGIS lab).
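As a minimal sketch of the Join operator, the following Python example uses the standard-library sqlite3 module rather than ArcGIS. The table and column names (parcels, owners, parcel_id) are hypothetical and chosen only for illustration.

```python
import sqlite3


def join_attribute_tables():
    """Join two hypothetical attribute tables on their shared key."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    # Rows are records; columns are attributes. x and y hold the location.
    cur.execute("CREATE TABLE parcels (parcel_id INTEGER PRIMARY KEY, x REAL, y REAL)")
    cur.execute("CREATE TABLE owners (parcel_id INTEGER, owner TEXT)")
    cur.executemany(
        "INSERT INTO parcels VALUES (?, ?, ?)",
        [(1, 10.0, 20.0), (2, 15.5, 22.5), (3, 30.0, 5.0)],
    )
    cur.executemany(
        "INSERT INTO owners VALUES (?, ?)",
        [(1, "Acme"), (2, "Bolt"), (3, "Acme")],
    )
    # The JOIN matches rows from both tables that share a parcel_id.
    rows = cur.execute(
        "SELECT p.parcel_id, o.owner, p.x, p.y "
        "FROM parcels AS p JOIN owners AS o ON p.parcel_id = o.parcel_id "
        "ORDER BY p.parcel_id"
    ).fetchall()
    conn.close()
    return rows


joined = join_attribute_tables()
```

Each row of the joined table carries both the owner attribute and the coordinates, so the result remains mappable.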

Page 6:

Example of Relational Tables and Sequence of Operations by Three Relational Operators

Page 7:

Example of Relational Tables and Sequence of Operations by Three Relational Operators on Supplier Data (Fig. 8.3)

Relational Model

In it, based on queries, tables are sliced, diced, and combined to yield new tables.

The starting and ending attributes are spatially referenced (X,Y coordinates).
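The figure itself is not reproduced in this transcript, so the following is only a sketch of what a sequence of three relational operators could look like. It assumes the three operators are select, project, and join, and the supplier and shipment records are invented for illustration.

```python
def select(rows, predicate):
    """Select: keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, columns):
    """Project: keep only the named columns of every row."""
    return [{c: row[c] for c in columns} for row in rows]


def join(left, right, key):
    """Join: pair rows from both tables that share the same key value."""
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Hypothetical starting tables; the supplier table is spatially referenced.
suppliers = [
    {"supplier_id": 1, "name": "Northside Parts", "x": 3.0, "y": 4.0},
    {"supplier_id": 2, "name": "Harbor Supply", "x": 8.0, "y": 1.0},
]
shipments = [
    {"supplier_id": 1, "item": "bolts", "quantity": 500},
    {"supplier_id": 2, "item": "valves", "quantity": 40},
    {"supplier_id": 1, "item": "gaskets", "quantity": 75},
]

# Three operators in sequence; each step yields a new intermediate table.
large = select(shipments, lambda r: r["quantity"] >= 100)
with_location = join(large, suppliers, "supplier_id")
result = project(with_location, ["name", "item", "x", "y"])
```

The final table still holds X,Y coordinates, so both the starting and the ending attributes are spatially referenced.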

Page 9:

Databases: Object-oriented Data Model

• The key unit is the object, which represents a real-world “thing” having attributes and behaviors. It is able to relate to and communicate with other objects by sending them messages that activate their behaviors.

• The objects can be organized into hierarchical classes, so that characteristics can be inherited by lower classes from higher ones.

• The object-oriented model is more suitable for models applied to rapidly changing environments having complex behaviors.

• Object-oriented programming languages, such as Visual Basic, Java, and C++, allow the building and manipulating of objects, including spatial objects. GIS software is programmed today in object-oriented languages, and developers can customize GIS applications by using these languages.
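The ideas of attributes, behaviors, messages, and class inheritance can be sketched in a few lines of Python. The classes below (SpatialObject, Store, Truck) are hypothetical and are not drawn from any GIS product.

```python
import math


class SpatialObject:
    """Higher-level class: every spatial object has a name and a location."""

    def __init__(self, name, x, y):
        self.name = name
        self.x = x
        self.y = y

    def distance_to(self, other):
        """Behavior shared by all subclasses: straight-line distance."""
        return math.hypot(self.x - other.x, self.y - other.y)

    def describe(self):
        return f"{self.name} at ({self.x}, {self.y})"


class Store(SpatialObject):
    """Lower-level class: inherits location and adds a delivery log."""

    def __init__(self, name, x, y):
        super().__init__(name, x, y)
        self.deliveries = []

    def receive_delivery(self, truck):
        self.deliveries.append(truck.name)
        return f"{self.name} received a delivery from {truck.name}"


class Truck(SpatialObject):
    """Lower-level class: inherits location and can move to a store."""

    def deliver_to(self, store):
        # Move to the store, then send it a message that activates
        # the store's receive_delivery behavior.
        self.x, self.y = store.x, store.y
        return store.receive_delivery(self)


store = Store("Store A", 3.0, 4.0)
truck = Truck("Truck 7", 0.0, 0.0)
distance_before = truck.distance_to(store)
message = truck.deliver_to(store)
```

Both Store and Truck inherit distance_to and describe from SpatialObject, while the call to deliver_to shows one object activating another object's behavior.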

Page 10:

Spatial Object and Example (Fig 8.4)

Page 11:

Spatial Objects Showing Multiplicity and Class Hierarchy

Page 13:

Databases: Object-Relational Data Model

• In this model, object-oriented capabilities complement a relational database.

• Relational tables remain as the place for data storage, but the relational model can interact with some object-oriented functionality on top.

• This model is appearing in the commercial marketplace in some major contemporary products including Oracle Spatial 10g and ESRI’s Geodatabase model of ArcGIS, which is mixed object-relational.
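One way to picture the object-relational idea is an object whose state is stored in a relational table while its behavior lives in a class layered on top. The sketch below uses Python's sqlite3 module; the Pipeline class and the pipelines table are hypothetical and do not reflect how Oracle Spatial or the ArcGIS Geodatabase is actually implemented.

```python
import math
import sqlite3


class Pipeline:
    """Object layer: behavior only. The data stays in the relational table."""

    def __init__(self, conn, pipe_id):
        self.conn = conn
        self.pipe_id = pipe_id

    def _endpoints(self):
        # Read the current state from the table on every call.
        return self.conn.execute(
            "SELECT x1, y1, x2, y2 FROM pipelines WHERE pipe_id = ?",
            (self.pipe_id,),
        ).fetchone()

    def length(self):
        x1, y1, x2, y2 = self._endpoints()
        return math.hypot(x2 - x1, y2 - y1)

    def extend_to(self, x, y):
        # A behavior that changes state does so by updating the table row.
        self.conn.execute(
            "UPDATE pipelines SET x2 = ?, y2 = ? WHERE pipe_id = ?",
            (x, y, self.pipe_id),
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pipelines "
    "(pipe_id INTEGER PRIMARY KEY, x1 REAL, y1 REAL, x2 REAL, y2 REAL)"
)
conn.execute("INSERT INTO pipelines VALUES (1, 0.0, 0.0, 3.0, 4.0)")

pipe = Pipeline(conn, 1)
initial_length = pipe.length()
pipe.extend_to(6.0, 8.0)
extended_length = pipe.length()
```

Ordinary relational queries still work on the pipelines table, which is the practical appeal of keeping the tables as the place of storage.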

Page 14:

(Source: Modified from Tomlinson, 2003)

Page 15:

TABLE 8.5 Appropriate Data Model for Certain Data Modeling Situations in Business (cont.)

Page 16:

Oracle Spatial 11g

• Oracle Spatial 11g is a spatial relational database that is a version of the standard Oracle 11 Database product, which is among the leading databases for medium and large businesses. Note: 11g recently superseded 10g.

• Within Oracle Spatial 11g, there are a number of spatial functions available.

• Among them is the geodetic function, which supports the use of the latitude/longitude coordinate system, while other functions handle indexing, partitioning, and aggregation for spatial data. Relational spatial operators can change and transform spatial data.

• Oracle Spatial 11g can be a great choice for a large business that has invested hugely in its Oracle mainframe databases but doesn’t need a lot of spatial functionality.

– That said, Oracle Spatial 11g’s functionality is improving year by year, and today could be classified as moderate.
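To give a feel for what working in latitude/longitude involves, here is a small Python sketch of a great-circle distance using the haversine formula. It is not Oracle Spatial's implementation or API; it assumes a spherical Earth with a mean radius of 6371 km, which is an approximation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius; treats the Earth as a sphere


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in kilometers between two lat/long points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of longitude spans less ground at 60 degrees north than at the
# equator, which is why planar X,Y arithmetic on raw lat/long values misleads.
at_equator = great_circle_km(0.0, 0.0, 0.0, 1.0)
at_60_north = great_circle_km(60.0, 0.0, 60.0, 1.0)
```

Comparing at_equator with at_60_north shows why a database needs geodetic support rather than plain Euclidean distance when coordinates are stored as latitude and longitude.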

Page 18:

New York City’s Integrated Data Architecture Using Oracle Spatial 11g

• New York City standardized on Oracle Spatial 11g. The justification was that Oracle had for some time supported the non-spatial, heavy-duty database processing for the city.

• The decision to centralize its spatial applications on Oracle Spatial 11g as the main repository leveraged the city’s existing Oracle knowledge and skill base, and offered the capacity to support a very large spatial processing demand.

Page 19:

Oracle Spatial 11g Integrated Data Architecture for New York City (Fig 8.9)

(Source: GITA)

Page 20:

Pluses and Minuses of Oracle Spatial 10g (or 11g)

• The pluses are the potential for high-volume spatial applications in the enterprise environment, and potential in large IT shops to leverage the Oracle knowledge already present.

• A minus for the GIS or IT department of a smaller enterprise is that it may not have the knowledge or skills to support Oracle Spatial.

• Another minus is that the spatial features are only moderate and the GIS interface may be less friendly than that of some other packages.

• Perhaps the greatest deterrent is the high cost of Oracle databases.

Page 21:

Enmax Case Study in Geo-Business

• Enmax is a private corporation wholly owned by the City of Calgary in Canada.

• It serves a territory around Calgary of 422 square miles and has over 360,000 customers.

• It distributes natural gas and electricity and has started an initiative in wind energy.

• Its process of adopting an enterprise approach with Oracle Spatial 10g is the focus of this case.

Page 22:

Enmax Database Configuration (Fig 8.12)

(Source: Lawrence, 2005)

Page 23:

Spatial Data Warehouses

• A data warehouse takes a subject-oriented view of data rather than a query-oriented one. It receives data from one or multiple relational databases, stores large or massive amounts of data, and emphasizes permanent storage of data received over periods of time.

Page 24:

Data Warehouse Star Schema including location

Page 25:

Spatially-enabling a data warehouse

• Data warehouses can be spatially-enabled in several ways.

– The data in the warehouse can have spatial attributes, supporting mapping. Mapping functions are built into some data warehouse packages.

– “Slicing and dicing” and what-if spreadsheet-like functions are performed on the data in the warehouse, and may include spatial characteristics.

• Technically, this follows the OLAP data management model, which was proposed originally in the 1990s by Codd.

– Furthermore, the data warehouse can be linked to GIS, data mining, and other software packages for more spatial and numerical analysis.
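A star schema with a location dimension, and a slice and a dice over it, can be sketched with Python's sqlite3 module. The table names (fact_sales, dim_location, dim_time) and the sample figures are hypothetical, and a production warehouse would use a dedicated OLAP or warehouse product rather than SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables surround a central fact table (the "star").
    CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, region TEXT, x REAL, y REAL);
    CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER);
    CREATE TABLE fact_sales (location_id INTEGER, time_id INTEGER, amount REAL);

    INSERT INTO dim_location VALUES (1, 'North', 1.0, 9.0), (2, 'South', 1.0, 2.0);
    INSERT INTO dim_time VALUES (1, 2006), (2, 2007);
    INSERT INTO fact_sales VALUES (1, 1, 100.0), (1, 2, 150.0), (2, 1, 80.0), (2, 2, 120.0);
    """
)

BASE = (
    "FROM fact_sales AS f "
    "JOIN dim_location AS l ON f.location_id = l.location_id "
    "JOIN dim_time AS t ON f.time_id = t.time_id "
)

# Slice: fix one dimension (year = 2007) and total the facts by region.
slice_2007 = conn.execute(
    "SELECT l.region, SUM(f.amount) " + BASE +
    "WHERE t.year = 2007 GROUP BY l.region ORDER BY l.region"
).fetchall()

# Dice: restrict two dimensions at once, here a spatial criterion
# (locations with y above 5.0) together with a range of years.
dice_north = conn.execute(
    "SELECT l.region, t.year, SUM(f.amount) " + BASE +
    "WHERE l.y > 5.0 AND t.year IN (2006, 2007) "
    "GROUP BY l.region, t.year ORDER BY t.year"
).fetchall()
```

Because the location dimension carries coordinates, the same query results could be handed to mapping or GIS software for display.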

Page 26:

The Data Warehouse and Its Data Flows, Spatial Functions and Components

Page 27:

Spatial data warehouse: Example in Auto Insurance

• Spatial data warehouses can be built for large-scale analysis of auto insurance.

• In this real-world example, the data warehouse resides in Oracle Spatial 11g.

• The business items in the data warehouse have location attributes that include census blocks, locations of policies, business sites, landmarks, elevation, and traffic characteristics.

• For data warehouses in auto risk insurance, maps can be produced that take spatial views from the usual ZIP-code geography down to hundreds of block groups, small areas within the ZIPs (Reid, 2006).

• This allows underwriters to set more refined policy pricing. The geoprocessing needs to be fast, with many tens of millions of location data items processed per day (Reid, 2006).

Page 28:

Example of City of Portland

• The data consist of city and regional traffic accidents from the Oregon Department of Transportation.

• The solution combined an SQL Server data warehouse with a customized program written with the ArcObjects API (application programming interface) from ESRI Inc.

• There is a pre-defined schema of non-spatial and spatial attributes for transport of data between the data warehouse and the ArcObjects program.

Page 29:

Example of City of Portland (cont.)

• The city’s spatial data warehouse for city and regional traffic accidents has over fifteen years of data and fourteen dimensions, including time, streets, age and gender of participants, cause, surface, and weather.

• The volume of data is huge, so attention was given to mitigating performance bottlenecks (SQL Server Magazine, 2002).

• A customized program allows the GIS software to utilize part or all of the data warehouse.

• The benefits of this data-warehouse/GIS approach included halving the replication time for a time slice of data, fast spatial queries, and response times shortened twenty-fold or more.

Page 30:

Spatial Data Quality

• No matter how sophisticated the storage and access of data, for its ultimate use, the data are only as good as their quality.

• An example from the field of medicine is preventable deaths from medical errors, which were estimated at 44,000 to 98,000 Americans yearly (Institute of Medicine, cited in Pierce, 2003).

• Likewise with GIS, the impacts of poor data quality can be profound.

– What if a governmental spatial system tracking shipments of nuclear materials has errors so that it recommends the wrong nuclear shipment routes, compromising security?

– In business, what if an insurance underwriter receives erroneous data from a spatial database about a large customer’s commercial property and prices the property policy too low?

– What if a private health-care firm’s ambulance routing software is inaccurate for a section of a city, costing crucial minutes in the transport of critically ill patients?

• Data quality is a crucial topic for the success of GIS.

• Management has the responsibility to exercise control and maintain data quality.

Page 31:

Spatial Data has Distinctive Considerations to Achieve Data Quality

(1) Spatial completeness. Are there sufficient types and numbers of spatial features for the problem at hand?

(2) Coverage. Does the geographic extent of the data correspond to the extent of the problem at hand? Are the geographic features consistent in the procedures used to locate them across the whole coverage?

(3) Transforming spatial data. When data are aggregated, joined, split apart, and queried in the data transformation inside databases and data warehouses, errors can occur, leading to erroneously transformed results.

(4) Accuracy. This can be divided (Tomlinson, 2003) into referential (error in referring to a spatial feature), topological (error in the presenting of the topology, such as a broken line segment), relative (two features are not located correctly one to the other), and absolute (error in the map position relative to the true earth position).
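Two of these considerations lend themselves to simple automated checks. The Python sketch below tests coverage with a bounding-box comparison and summarizes absolute accuracy as a root-mean-square positional error against surveyed reference points. The function names and sample coordinates are hypothetical, and a bounding box is only a crude stand-in for a full coverage assessment.

```python
import math


def bounding_box(points):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def covers(data_points, study_area):
    """Coverage check: does the data extent contain the study-area extent?

    study_area is given as (min_x, min_y, max_x, max_y).
    """
    d_min_x, d_min_y, d_max_x, d_max_y = bounding_box(data_points)
    s_min_x, s_min_y, s_max_x, s_max_y = study_area
    return (
        d_min_x <= s_min_x
        and d_min_y <= s_min_y
        and d_max_x >= s_max_x
        and d_max_y >= s_max_y
    )


def positional_rmse(recorded, reference):
    """Absolute accuracy: RMS distance between recorded and true positions."""
    if not recorded or len(recorded) != len(reference):
        raise ValueError("need two non-empty point lists of equal length")
    squared = [
        (rx - tx) ** 2 + (ry - ty) ** 2
        for (rx, ry), (tx, ty) in zip(recorded, reference)
    ]
    return math.sqrt(sum(squared) / len(squared))


recorded = [(0.0, 0.0), (10.0, 10.0)]
reference = [(3.0, 4.0), (10.0, 10.0)]
rmse = positional_rmse(recorded, reference)
inside = covers(recorded, (1.0, 1.0, 9.0, 9.0))
outside = covers(recorded, (1.0, 1.0, 12.0, 9.0))
```

Checks like these can run each time data are loaded or transformed, which supports the continual scrutiny for errors that the lecture calls for.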

Page 32:

Summary

• Data management is essential to GIS success.

• Each of the relational, object-oriented, and object-relational data models has pluses and minuses and is appropriate for certain problems.

• Data warehouses contrast with databases in being non-volatile and storing data historically.

• Data quality issues permeate data management, since the use of data is compromised if quality is low.

• The data management issues of GIS are similar to those of IS in most ways, but the additional need to handle spatial data makes GIS different and unique.
