GIS Data Models GEOG 370 Christine Erlien, Instructor

Upload: oswin-lambert

Post on 13-Dec-2015

Page 1: GIS Data Models GEOG 370 Christine Erlien, Instructor

GIS Data Models

GEOG 370

Christine Erlien, Instructor

Page 2: GIS Data Models GEOG 370 Christine Erlien, Instructor

GIS Data Models: Why?

Knowing how GIS data are structured helps us to use GIS programs more effectively

– Basic computer file structures

– Database structures

Page 3: GIS Data Models GEOG 370 Christine Erlien, Instructor

Basic computer file structures

What is where?
– Computer file structures allow the computer to store, order, & search data

Types:
– Simple list
– Ordered sequential
– Indexed file

Page 4: GIS Data Models GEOG 370 Christine Erlien, Instructor

Basic computer file structures: Simple list

Simple List
– Most basic
– No order, no organization
– Input is simple: just add on
– Searching difficult & inefficient
– Example: If my class roster were ordered based on when you added this class
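A minimal Python sketch of the simple list, assuming a hypothetical roster kept in enrollment order (the names and the helpers `add_student` and `find_student` are made up for illustration). Adding is a single append; finding a record means scanning from the top.

```python
# Hypothetical class roster kept in the order students enrolled.
roster = []

def add_student(name):
    """Input is simple: append to the end, no reordering needed."""
    roster.append(name)

def find_student(name):
    """Linear search: check each record from the start until a match."""
    for position, record in enumerate(roster):
        if record == name:
            return position
    return None  # not found after scanning the whole list

for student in ["Rivera", "Adams", "Nguyen", "Baker"]:
    add_student(student)

position = find_student("Nguyen")
```

In the worst case the search touches every record, which is why it becomes inefficient as the list grows.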

Page 5: GIS Data Models GEOG 370 Christine Erlien, Instructor

Basic computer file structures: Ordered sequential files

Ordered sequential files
– Records ordered by alphabetic or numerical character sequence
  • How? Algorithm: divide and conquer
    – The record sought is compared to the record at the midpoint to determine which half to search
    – Repeat until done
– Inserting a record is slow
– Searching more efficient than simple list
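The divide-and-conquer search can be sketched in Python as a binary search. This is one illustrative version, not a description of how any particular GIS package does it; the function name `binary_search` and the sample town list are assumptions.

```python
def binary_search(records, target):
    """Return the index of target in the sorted list records, or None."""
    low, high = 0, len(records) - 1
    while low <= high:
        mid = (low + high) // 2
        if records[mid] == target:
            return mid
        if records[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return None

towns = ["Cary", "Chapel Hill", "Durham", "Graham", "Greensboro", "Raleigh"]
index = binary_search(towns, "Graham")
```

Each comparison discards half of the remaining records, so the number of comparisons grows with the logarithm of the file size rather than with the file size itself.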

Page 6: GIS Data Models GEOG 370 Christine Erlien, Instructor

Basic computer file structures: Ordered sequential files

Example file:
Cary
Chapel Hill
Durham
Graham
Greensboro
Raleigh

To add: Maggie Valley

What’s the process?
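One way to walk through the process in Python, assuming the file is held as a list in strict alphabetical order (Cary before Chapel Hill). The sketch uses the standard-library `bisect` module to find the slot; the variable names are illustrative.

```python
import bisect

towns = ["Cary", "Chapel Hill", "Durham", "Graham", "Greensboro", "Raleigh"]

# Locate the slot that keeps the file in alphabetical order.
slot = bisect.bisect_left(towns, "Maggie Valley")

# Every record from the slot onward has to move down one position.
records_shifted = len(towns) - slot

towns.insert(slot, "Maggie Valley")
```

Finding the slot is quick, but the records after it must be moved to make room, which is why insertion into an ordered sequential file is slow for large files.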

Page 7: GIS Data Models GEOG 370 Christine Erlien, Instructor

Basic computer file structures

Indexed files
– Database index
  • Can be built for a field that uniquely identifies a record (primary key) or for other fields
  • Used to determine the location of rows in a file that satisfy some condition
  • Keys & indexes can be extracted & sorted, and the original file accessed through them, faster than the original file itself could be sorted
– Types
  • Direct: Each record searched for particular properties
  • Inverted: Index based on anticipated search criteria
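A rough Python sketch of the two ideas above: an index on the primary key, and an inverted index built on an attribute that is expected to be searched. The records, field names, and variable names are hypothetical.

```python
# Hypothetical records; in a real file each would sit at some storage location.
records = [
    {"id": 101, "landuse": "forest"},
    {"id": 102, "landuse": "urban"},
    {"id": 103, "landuse": "forest"},
    {"id": 104, "landuse": "water"},
]

# Index on the primary key: one key -> one record position.
key_index = {rec["id"]: pos for pos, rec in enumerate(records)}

# Inverted index on an attribute expected to be searched often:
# one attribute value -> positions of every record holding it.
inverted_index = {}
for pos, rec in enumerate(records):
    inverted_index.setdefault(rec["landuse"], []).append(pos)

record_103 = records[key_index[103]]
forest_records = [records[pos] for pos in inverted_index.get("forest", [])]
```

Both lookups go straight to the matching positions without scanning the file, but the inverted index only helps for the attribute it was built on and must be updated whenever records are added.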

Page 8: GIS Data Models GEOG 370 Christine Erlien, Instructor

Indexed files

[Figure: inverted index]

[Figure: direct index]

Page 9: GIS Data Models GEOG 370 Christine Erlien, Instructor

Basic computer file structures: Indexed files

Advantages
– Quicker (i.e., reduces computational time)

Disadvantages
– Inverted
  • Requires knowledge of likely search criteria
  • Data additions require recalculation of index

Page 10: GIS Data Models GEOG 370 Christine Erlien, Instructor

Databases & Database Structures

What is where?

– Geographic searches require data retrieval

– Data retrieval requires data organization

Page 11: GIS Data Models GEOG 370 Christine Erlien, Instructor

Databases & Database Structures

Database: Collection of multiple files
– Requires more elaborate structure for management

DBMS: Database Management System

Database structure types
– Hierarchical data structures
– Network systems
– Relational database systems

Page 12: GIS Data Models GEOG 370 Christine Erlien, Instructor

Database Structures: Hierarchical

Hierarchical data structures
– One-to-many (parent-child) relationship
– Requires relationship be defined before structure & decision rules developed
– Advantage:
  • Easy to search
– Disadvantage:
  • Knowledge of all questions that might be asked necessary
    – Unanticipated criteria make search impossible
  • Large index files: memory intensive, slow access
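A small Python sketch of a one-to-many hierarchy, assuming a hypothetical state > county > town structure stored as nested dictionaries (the helper `children_of` is made up for illustration).

```python
# Hypothetical hierarchy: each parent has many children, each child one parent.
hierarchy = {
    "North Carolina": {
        "Orange County": {"Chapel Hill": {}, "Hillsborough": {}},
        "Wake County": {"Raleigh": {}, "Cary": {}},
    }
}

def children_of(path):
    """Follow a predefined parent -> child path and list the children found."""
    node = hierarchy
    for name in path:
        node = node[name]
    return sorted(node)

wake_towns = children_of(["North Carolina", "Wake County"])
```

Questions that follow the predefined parent-child path are easy to answer. A question the structure was not designed for, such as grouping towns by some attribute that is not a level in the hierarchy, has no path to follow.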

Page 13: GIS Data Models GEOG 370 Christine Erlien, Instructor

Hierarchical Database Structures

Page 14: GIS Data Models GEOG 370 Christine Erlien, Instructor

Database Structures: Network Systems

Network Systems
– Allow users to move from data item to data item through a series of pointers
  • Pointers: Computer structures that direct a piece of data to all others to which it relates (connect one file location to another)
– Pointers indicate relationships among data items
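The pointer idea can be sketched in Python with object references standing in for pointers. The `Record` class, the `link` helper, and the road and county data are hypothetical, chosen to show a many-to-many relationship.

```python
class Record:
    """A data item that holds pointers to the items it relates to."""
    def __init__(self, name):
        self.name = name
        self.links = []  # pointers to related records

def link(a, b):
    """Store a pointer in each direction so either record leads to the other."""
    a.links.append(b)
    b.links.append(a)

# Hypothetical many-to-many case: roads and the counties they pass through.
i40, i85 = Record("I-40"), Record("I-85")
orange, durham = Record("Orange County"), Record("Durham County")

link(i40, orange)
link(i40, durham)
link(i85, orange)
link(i85, durham)

counties_on_i40 = [rec.name for rec in i40.links]
roads_in_orange = [rec.name for rec in orange.links]
```

Each record is stored once and reached from either direction by following pointers, but every relationship costs two stored pointers here, so the pointer count grows quickly as relationships multiply.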

Page 15: GIS Data Models GEOG 370 Christine Erlien, Instructor

Database Structures: Network Systems

Page 16: GIS Data Models GEOG 370 Christine Erlien, Instructor

Database Structures: Network Systems

Advantages:
– Less rigid than hierarchical structure
– Can handle many-to-many relationships
– Reduce data redundancy
– Greater search flexibility

Disadvantages:
– In very complex GIS databases, the number of pointers can get quite large, requiring more storage space

Page 17: GIS Data Models GEOG 370 Christine Erlien, Instructor

Database Structures: Relational Databases

Predominant in GIS

Tuples: Ordered records/rows of attribute values

Primary Key: Unique identifier for each record in a relational table

| Lu_code | Crop type | Status | Cost |
| 010001 | Row crops | Active | 1000/ha |
| 020001 | Orchards | Dormant | 1500/ha |
| 021001 | Rangeland | Active | 900/ha |
| 010001 | Row crops | Active | 1100/ha |
| 010404 | Garden farms | Active | 1250/ha |
| 010001 | Row crops | Dormant | 1050/ha |

Page 18: GIS Data Models GEOG 370 Christine Erlien, Instructor

Database Structures: Relational Databases

Joining tables

Relational join
– Matching data from one table to corresponding data in another table
– How? Link the primary key to the foreign key
  • Primary Key: Unique identifier in 1st table
  • Foreign key: Column in 2nd table to which primary key is linked
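A relational join can be tried out with Python's built-in `sqlite3` module. The two tables below (`landuse` and `parcel`) and their contents are hypothetical; `lu_code` is the primary key of the first table and a foreign key in the second.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE landuse (lu_code TEXT PRIMARY KEY, crop_type TEXT);
    CREATE TABLE parcel  (parcel_id INTEGER PRIMARY KEY, lu_code TEXT,
                          FOREIGN KEY (lu_code) REFERENCES landuse (lu_code));
    INSERT INTO landuse VALUES ('010001', 'Row crops'), ('020001', 'Orchards');
    INSERT INTO parcel  VALUES (1, '010001'), (2, '020001'), (3, '010001');
""")

# Match each parcel row to the land-use row with the same lu_code.
rows = conn.execute("""
    SELECT parcel.parcel_id, landuse.crop_type
    FROM parcel
    JOIN landuse ON parcel.lu_code = landuse.lu_code
    ORDER BY parcel.parcel_id
""").fetchall()
conn.close()
```

The crop type is stored once per land-use code and pulled in at query time, instead of being repeated on every parcel row.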

Page 19: GIS Data Models GEOG 370 Christine Erlien, Instructor

Database Structures: Relational Databases

Page 20: GIS Data Models GEOG 370 Christine Erlien, Instructor

Relational DB & Normal Forms

Normal forms: A set of rules established to indicate the form tables should take

Goal: Reduce database redundancy so that database performance is better

First normal form
– Table must contain columns & rows
– Columns will be used for searches, so only one value per cell
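To make the "one value per cell" rule concrete, here is a short Python sketch with a hypothetical parcel table in which one cell holds several crops. The field names are assumptions for illustration.

```python
# Hypothetical table that breaks first normal form: one cell holds several values.
unnormalized = [
    {"parcel_id": 1, "crops": "corn, soybeans"},
    {"parcel_id": 2, "crops": "apples"},
]

# One value per cell: emit a separate row for each crop.
first_normal_form = [
    {"parcel_id": row["parcel_id"], "crop": crop.strip()}
    for row in unnormalized
    for crop in row["crops"].split(",")
]

# The column can now be searched with a plain equality test.
soybean_parcels = [r["parcel_id"] for r in first_normal_form if r["crop"] == "soybeans"]
```

With one value per cell, a search on the column is a simple comparison and does not have to pick apart packed strings.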

Page 21: GIS Data Models GEOG 370 Christine Erlien, Instructor

Relational DB & Normal Forms

Second normal form
– Every column that is not the primary key should be dependent on the primary key
  • On the entire primary key if primary key is comprised of more than one column

| PART | WAREHOUSE | QUANTITY | WAREHOUSE-ADDRESS |

Key: Part & Warehouse together
Address only dependent on warehouse portion of key

| PART | WAREHOUSE | QUANTITY |
| WAREHOUSE | WAREHOUSE-ADDRESS |

Example from William Kent, "A Simple Guide to Five Normal Forms in Relational Database Theory", Communications of the ACM 26(2), Feb. 1983, 120-125.
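The split into two tables can be sketched in Python. The column layout follows the PART / WAREHOUSE example; the sample rows and variable names are invented for illustration.

```python
# Hypothetical rows for the PART / WAREHOUSE / QUANTITY / WAREHOUSE-ADDRESS table.
inventory = [
    ("bolt", "W1", 40, "12 Main St"),
    ("nut",  "W1", 75, "12 Main St"),
    ("bolt", "W2", 10, "9 Oak Ave"),
]

# Table 1: QUANTITY depends on the whole key (PART, WAREHOUSE).
stock = {(part, wh): qty for part, wh, qty, _ in inventory}

# Table 2: the address depends on WAREHOUSE alone, so store it once per warehouse.
warehouse_address = {wh: addr for _, wh, _, addr in inventory}
```

The address for W1 appears twice in the original rows and once after the split.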

Page 22: GIS Data Models GEOG 370 Christine Erlien, Instructor

Relational DB & Normal Forms

Third Normal Form
– Non-key columns must depend on the primary key
– No non-key column may depend on another non-key column

| EMPLOYEE | DEPARTMENT | LOCATION |

Key field: Employee
Location depends on Department rather than directly on the key field, so it is stored redundantly

| EMPLOYEE | DEPARTMENT |
| DEPARTMENT | LOCATION |
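A short Python sketch of the same decomposition, using invented employees, departments, and locations. The helper `location_of` is hypothetical.

```python
# Hypothetical rows for the EMPLOYEE / DEPARTMENT / LOCATION table.
staff = [
    ("Lee",   "Planning", "Building A"),
    ("Moore", "Planning", "Building A"),
    ("Shah",  "Survey",   "Building B"),
]

# Split into two tables: employee -> department, department -> location.
employee_department = {emp: dept for emp, dept, _ in staff}
department_location = {dept: loc for _, dept, loc in staff}

# A department moves: one update, instead of one per employee.
department_location["Planning"] = "Building C"

def location_of(employee):
    """Look up an employee's location through the department table."""
    return department_location[employee_department[employee]]
```

Because the location is stored once per department, the move cannot leave some employees of the same department with the old location and others with the new one.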

Page 23: GIS Data Models GEOG 370 Christine Erlien, Instructor

Normalization of Database Tables

Normalization: Process of organizing data in a database
– Creating tables & establishing relationships between them according to rules of normal form
– Goal: Make the database more flexible by eliminating redundancy and inconsistent dependency

Page 24: GIS Data Models GEOG 370 Christine Erlien, Instructor

Normalization of Database Tables

Problem with data redundancy:
– Wastes disk space
– Creates maintenance problems
  • If data existing in more than one place must be changed, it must be changed the same way in each place

Page 25: GIS Data Models GEOG 370 Christine Erlien, Instructor

Normalization & Normal Forms

Describing databases
– If the 1st rule is observed, the database is said to be in "first normal form."
– If the first 3 rules are observed, the database is considered to be in "third normal form."

Additional levels of normalization are possible, but 3rd normal form is considered the highest level necessary for most applications

Page 26: GIS Data Models GEOG 370 Christine Erlien, Instructor

Recap

File types
– Simple list
– Ordered sequential
– Indexed

Databases: Many files
– Structure necessary to make access to data in 1 or more files easier

Database types
– Hierarchical
– Network
– Relational