enggis_v&r
-
7/24/2019 ENGgis_V&R
1/110
CSRSR NCU
Geographic Information Systems(GIS)
tel: 03-4227151-57624
fax: 03-4254908
e-mail: [email protected]
Chi-Farn Chen, CSRSR NCU
-
Vector and Raster Data Model
-
Model and GIS
Model = representation of reality
GIS itself is based on a model of complexity
GIS is used to model complexity
Full representation of reality?
Data model = limited representation of reality
-
Data Model
Reality is too complex for even the most sophisticated GIS software, so in order to represent reality in a spatial database, a simplification of reality is created. This simplification is known as a data model.
In a data model, reality is represented by geometry and attributes.
-
GIS Data Formats
There are two formats used by a GIS to store and retrieve spatial data:
Vector
Raster
-
Vector Format
Data are associated with points, lines, or areas
Points are located by coordinates
Lines are described by a series of points, as in join-the-dots books (arcs: nodes, vertices)
Areas are described by a series of lines enclosing the area (polygons)
-
Vector Format
Any attributes (name or code) can be associated with a point, line or polygon.
Data are stored in two files:
a file containing the coordinates
a file containing the attributes
A third file contains the information needed to link positional data with their attributes (identifier).
-
Features have unique identifiers: point ID, line ID, polygon ID
Common identifiers provide the link to:
coordinates table (for where)
attributes table (for what)
[Figure: points 1 to 5 plotted on X and Y axes]

Coordinates Table
Point ID  x  y
1         1  3
2         2  1
3         4  1
4         1  2
5         3  2

Attributes Table
Point ID  model  year
1         a      90
2         b      90
3         b      80
4         a      70
5         c      70
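The link between the two tables can be sketched in a few lines of Python. This is only an illustration: the values are read from the coordinates and attributes tables on this slide, while the dictionary layout and the function names `describe_point` and `points_with_model` are assumptions made for the example, not part of any GIS package.

```python
# Coordinates table (for where) and attributes table (for what),
# both keyed by the shared point ID. Values are taken from the slide.
coordinates = {1: (1, 3), 2: (2, 1), 3: (4, 1), 4: (1, 2), 5: (3, 2)}
attributes = {
    1: {"model": "a", "year": 90},
    2: {"model": "b", "year": 90},
    3: {"model": "b", "year": 80},
    4: {"model": "a", "year": 70},
    5: {"model": "c", "year": 70},
}


def describe_point(point_id):
    """Join both tables through the common identifier (hypothetical helper)."""
    x, y = coordinates[point_id]
    return {"id": point_id, "x": x, "y": y, **attributes[point_id]}


def points_with_model(model):
    """Attribute query first, then the IDs can be used to fetch coordinates."""
    return sorted(pid for pid, attrs in attributes.items() if attrs["model"] == model)


print(describe_point(3))
print([coordinates[pid] for pid in points_with_model("a")])
```

The identifier is the only thing the two tables share, which is why it must be unique per feature.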
-
Raster Format
Data are divided into cells, or pixels (picture elements)
Pixels are organized in arrays
Row and column numbers (coordinates) are used to identify the location of the pixel within the array.
Each pixel has a single value (attribute)
-
Raster Format
[Figure: raster grid with row and column axes; example pixel at coordinate (2,9) with attribute 8]
-
Vector and Raster Representation of Point Map Features
[Figure: a point map feature. GIS vector format: an (X,Y) coordinate in space. GIS raster format: a pixel located in an array.]
-
Vector and Raster Representation of Line Map Features
[Figure: a line map feature shown in GIS vector format and in GIS raster format]
-
Vector and Raster Representation of Area Map Features
[Figure: an area map feature shown in GIS vector format and in GIS raster format]
-
Vector Model
The vector model uses discrete points, lines and/or areas corresponding to discrete objects, with attributes given as names or code numbers.
Raster Model
The raster model uses regularly spaced grid cells in a specific sequence. An element of the grid is called a pixel (picture element). The conventional sequence is row by row from top to bottom and, within each row, column by column from left to right. Every location is given in two-dimensional image coordinates (row number and column number), and each cell contains a single attribute value.
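The row-by-row sequence described above can be illustrated with a small Python sketch. It assumes rows and columns are counted from 0 and uses a made-up 3 x 3 grid; `cell_index` and `cell_position` are hypothetical helper names.

```python
def cell_index(row, col, n_cols):
    """Position of cell (row, col) in a row-by-row stream (0-based counting)."""
    return row * n_cols + col


def cell_position(index, n_cols):
    """Inverse mapping: stream position back to (row, col)."""
    return divmod(index, n_cols)


# Hypothetical 3 x 3 raster; each cell holds a single attribute value.
grid = [
    [1, 1, 2],
    [1, 2, 2],
    [3, 3, 2],
]
# Top row first, and left to right within each row.
stream = [value for row in grid for value in row]

print(stream)
print(stream[cell_index(2, 0, n_cols=3)])
```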
-
Comparison of Raster and Vector Formats
Raster
Raster formats are efficient when comparing information among arrays with the same cell size.
Raster files are generally very large because each cell occupies a separate line of data, only one attribute can be assigned to each cell, and cell sizes are relatively small.
Raster representations are relatively coarse and imprecise.
Vector
Vector formats are efficient when comparing information whose geographical shapes and sizes are different.
Vector files are much smaller because a relatively small number of vectors can precisely describe large areas and many attributes can be ascribed to these areas.
Vector representations of shapes can be very precise.
-
Vector Model
-
Vector Model
There are different models to store and manage vector information. Each of them has different advantages and disadvantages.
Spaghetti Model (list of coordinates)
Vertex Dictionary Model
Topological Model (Dual Independent Map Encoding: DIME)
Arc / Node Model
-
Vector representation without model
-
Spaghetti Model (List of coordinates)
simple and easy to manage
lots of duplication, hence need for large storage space
very often used in CAC (computer assisted cartography)
no topology
-
No Topology
-
Spaghetti data often contains crossing lines, loose ends, and double digitization of common boundaries (slivers).
It's a mess!
-
Sliver Polygons
Sliver polygons are small, narrow polygon features that inevitably appear along the borders of polygons following the overlay of two or more geographic data sets.
They often occur when a shared boundary arc is digitized twice.
They should be removed, but are difficult to find.
-
Sliver Polygons
-
Vertex Dictionary Model
no duplication, but still this model does not use topology
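The difference between the spaghetti model and the vertex dictionary model can be sketched as follows. The two unit squares and the function name `to_vertex_dictionary` are made up for illustration; the point is only that shared coordinates are stored once and polygons refer to them by ID, while no topology is recorded in either form.

```python
# Spaghetti model: every polygon carries its own list of coordinates, so the
# two vertices on the shared edge (1,0)-(1,1) are stored twice. Hypothetical data.
spaghetti = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],
}


def to_vertex_dictionary(polygons):
    """Store each distinct coordinate once and describe polygons by vertex IDs."""
    ids_by_coordinate = {}
    rings = {}
    for name, ring in polygons.items():
        ids = []
        for xy in ring:
            if xy not in ids_by_coordinate:
                ids_by_coordinate[xy] = len(ids_by_coordinate) + 1
            ids.append(ids_by_coordinate[xy])
        rings[name] = ids
    dictionary = {vid: xy for xy, vid in ids_by_coordinate.items()}
    return dictionary, rings


dictionary, rings = to_vertex_dictionary(spaghetti)
stored_in_spaghetti = sum(len(ring) for ring in spaghetti.values())
print(stored_in_spaghetti, len(dictionary))
print(rings)
```

Note that nothing here says that A and B are neighbours; that relationship would still have to be computed from the shared vertex IDs.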
-
Topological Model (DIME) / (TIGER)
assigns a directional code in the form of a "from node" and a "to node"
nodes (intersections of lines) are identified with codes
developed by the US Bureau of the Census
both street addresses and UTM coordinates are explicitly defined for each link
-
Topology
-
The GIS vector data model is slightly more complex, as each vertex, arc, node and polygon is uniquely identified and the relationships between them are stored in the database. The relationships between the elements of a vector data model, in terms of relative location and connections, are known as topology. Topology gives the vector data model a level of intelligence, which means that the GIS can recognize which arcs are joined to each other and identify those polygons which are adjacent to each other.
-
Topology:
A GIS topology is a set of rules and behaviors that model how points, lines and polygons share geometry. For example, adjacent features, such as two countries, share a common edge.
Simple definition:
Topology stores the relationships of one
spatial element with respect to another.
-
Topology:
Topology is a mathematical approach that allows us to structure data based on the principles of feature adjacency and feature connectivity.
It is in fact the mathematical method used to define spatial relationships. Without a topological data structure in a vector-based GIS, most data manipulation and analysis functions would not be practical or feasible.
-
Defining Topology
Topology: the spatial relationships between features in a GIS
Do polygons overlap?
Do lines intersect or connect?
Are points located near each other?
-
Topology:
Three main concepts
Connectivity: arc-node topology
Definition of areas / containment: polygon-arc topology
Adjacency / contiguity: left/right topology
-
Topology: spatial relationship
Connectivity:
The topological identification of connected arcs by recording the from- and to-node for each arc. Arcs that share a common node are connected.
Arcs connect to each other at nodes
Describes linear networks; links have direction.
Area:
Arcs that connect to surround an area define a polygon
Containment:
Accounts for polygons within polygons (islands)
Describes which landscape features are located within, or intersect, the boundary of polygons
Adjacency:
The identification of adjacent polygons by recording the left-hand and right-hand polygons of each arc.
Arcs have direction and left and right sides
Describes a landscape feature's neighbors
-
Concept of Topology
Topology distinguishes GIS data models from the non-topological data models supported by many CAD, mapping and graphics systems.
Topology refers to knowledge about the relative spatial positioning of features:
knowledge about how features are connected and which features are adjacent to each other.
It can be viewed as a mathematical procedure that determines spatial properties and relationships, including:
Connectivity, contiguity (adjacency)
Lengths of arcs and areas of polygons
-
Topology Rules for Coverages:
Each arc has a beginning node and an ending node; this determines directionality.
Directionality is determined during digitizing.
Actual direction is important only if your application requires directional modeling.
Arcs connect to other arcs at nodes.
Connected arcs form polygon boundaries; arc coordinates are stored only once because two adjacent polygons share the common arc between them.
Arcs have polygons on their left and right sides.
-
Topology Concept I
Arc-node topology is how Arc/INFO keeps track of which arcs are connected to other arcs through shared nodes (nodes are endpoints of arcs). It defines length, direction, and connectivity for arcs.
The from-node is an arc's starting point; the to-node is its ending point. They are determined as you digitize your data. You can see the from-node and to-node whenever you list attribute records for a coverage containing lines. Arcs connect if they share a node.
-
Topology Concept II
Polygon-arc topology expresses the relationship between the arc features and the polygon features for which the arcs create boundaries. It defines area and adjacency. Arcs or a set of arcs that form a closed figure define the area of a polygon. Two polygons are adjacent if they share an arc. Polygons are stored as a list of arcs to avoid redundancy.
-
Topology Concept III
Left-right topology refers to contiguity: how polygons are associated with their neighboring polygons. Each arc has a list of which polygons are on the right side and which are on the left side. Commands in Arc/INFO use this information to determine from one polygon what the adjacent polygons are.
[Figure: polygons numbered 1 to 7]
-
Vector Data Model
Points: represent discrete point features
airports are point features
each point is stored as a coordinate pair
each point location has a record in the table
-
Vector Data Model
Lines: represent linear features
roads are linear features
each road segment has a record in the table
-
Vector Data Model
Lines: fundamental spatial data model
Lines start and end at nodes
line #1 goes from node #2 to node #1
Vertices determine shape of line
Nodes and vertices are stored as coordinate pairs
[Figure: a line with a node at each end and four intermediate vertices]
-
Vector Data Model
Polygons: represent bounded areas
landforms and water are polygonal features
each bounded polygon has a record in the table
-
Vector Data Model
Polygons: fundamental spatial data model
Polygon #2 is bounded by lines 1 & 2
Line 2 has polygon 1 on left and polygon 2 on right
-
Vector Data Model
Polygons: fundamental spatial data model
complex data model, especially for larger data sets
arc-node topology, only used for ArcInfo data sets
-
Connectivity
Arc-node topology
-
Definition of areas
Polygon-arc topology
-
Adjacency/Contiguity
Left/right topology
-
Arc / Node Model
-
Arc / Node Model
File 1. Coordinates of nodes and vertices for all the arcs
ARC  F_node    Vertex                    T_node
1    3.2,5.2   1,5.2                     1,3
2    1,3       1.8,2.6  2.8,3  3.3,4     3.2,5.2
3    1,2       3.5,2  4.2,2.7            5.2,2.7
File 2. Arcs topology
ARC  F_node  T_node  R_poly    L_poly
1    1       2       External  A
2    2       1       A         External
3    3       4       External  External
File 3. Polygons topology
Polygon  Arcs
A        1, 2
File 4. Nodes topology
Node  Arcs
1     1,2
2     1,2
3     3
4     3
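The arc topology table (File 2) is enough to answer the three topological questions discussed earlier. The Python sketch below copies that table from the slide and derives connectivity, polygon boundaries and adjacency from it; the function names are assumptions made for the example and do not correspond to Arc/INFO commands.

```python
# File 2 from the slide: arc -> (from node, to node, right polygon, left polygon)
arcs = {
    1: (1, 2, "External", "A"),
    2: (2, 1, "A", "External"),
    3: (3, 4, "External", "External"),
}


def arcs_at_node(node):
    """Arc-node topology: all arcs that start or end at the node."""
    return sorted(a for a, (f, t, _, _) in arcs.items() if node in (f, t))


def connected_arcs(arc):
    """Arcs connect if they share a node."""
    f, t, _, _ = arcs[arc]
    return sorted({a for node in (f, t) for a in arcs_at_node(node)} - {arc})


def polygon_arcs(polygon):
    """Polygon-arc topology: the arcs that bound the polygon."""
    return sorted(a for a, (_, _, r, l) in arcs.items() if polygon in (r, l))


def neighbours(polygon):
    """Left-right topology: whatever lies on the other side of each bounding arc."""
    found = set()
    for _, _, right, left in arcs.values():
        if right == polygon:
            found.add(left)
        elif left == polygon:
            found.add(right)
    return sorted(found - {polygon})


print(arcs_at_node(1), connected_arcs(1), connected_arcs(3))
print(polygon_arcs("A"), neighbours("A"))
```

The results for nodes and polygons reproduce Files 3 and 4, which shows that those files can be derived from the arc table rather than digitized separately.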
-
Geometry and Topology of Vector Data
The geometry of a point is given by two-dimensional coordinates (x, y), while line, string and area are given by a series of point coordinates.
The topology, however, defines additional structure as follows:
Node: an intersection of more than two lines or strings, or the start or end point of a string, with a node number
Arc: a line or a string with a chain number, start and end node numbers, and left and right neighboring polygons
Polygon: an area with a polygon number and a series of arcs that form the area in clockwise order (a minus sign is assigned in case of anti-clockwise order).
-
Topological Relationships between Spatial Objects
Point-Point Relationships
"is within": within a certain distance
"is nearest to": nearest to a certain point
Point-Line Relationships
"on line": point on a line
"is nearest to": a point nearest to a line
Point-Area Relationships
"is contained in": a point in an area
"on border of area": a point on the border of an area
Line-Line Relationships
"intersects": two lines intersect
"crosses": two lines cross without an intersect
"flow into": a stream flows into the river
Line-Area Relationships
"intersects": a line intersects an area
"borders": a line is a part of the border of an area
Area-Area Relationships
"overlaps": two areas overlap
"is within": an island within an area
"is adjacent to": two areas share a common boundary
-
Topology Review
Topology is the spatial relationship between connecting or adjacent features in a geographic data layer.
A procedure used by the computer to explicitly define and store the spatial relationships between connecting or adjacent coverage features.
Think of topology as geometry on a rubber sheet.
This type of geometry is concerned with spatial relationships rather than rigid coordinate location.
-
Topology Review
If a map is stretched and distorted, some properties change:
Distances
Angles
Relative proximities
-
Topology Review
Other properties (topological properties) remain constant after distortion:
Adjacency
Containment
Connectivity
Areas remain areas, lines remain lines, points remain points
-
Topology Review
By tracking all the arcs that meet at any node, topology knows which arcs connect to each other. (Arc-node topology)
A list of arcs is used to construct the polygon. Storing each arc only once reduces the amount of data and ensures that the boundaries of adjacent polygons do not overlap. (Polygon-arc topology)
It is easy to find similar characteristics between adjacent polygons. (Left-right topology)
-
Topology Review
Vector topology helps deal with:
overshoots
slivers
dangles
polygons not sharing a border
-
TIGER
Bureau of the Census
Address matching to convert street addresses to geographic coordinates and census reporting zones
With geographic coordinates, data could be aggregated to user-specified custom reporting zones
DIME files were the major component of the geocoding approach
-
Address Matching
Address Matching or Geocoding: A list of addresses is converted to points on a map by referencing them to a special street network.
-
TIGER Address Range Example
-
Address Matching
Address Matching or Geocoding:
Two input files and one output file
Input:
A database (dbf) file that has the address list that needs to be geocoded.
A geographic base file or reference layer (commonly a street layer) that will spatially reference the address locations in the address database (input 1).
Output:
A point file that holds the geocoded address locations, with an attribute file that shows the full address and the matching accuracy.
-
What do you match?
Tabular data
Text
Databases
to
Geographic maps
TIGER Streets
ZIP Codes
-
Raster Model
-
Raster Model
A raster data model uses a grid.
One grid cell is one unit or holds one attribute.
Every cell has a value, even if it is missing.
A cell can hold a number or an index value standing for an attribute.
A cell has a resolution, given as the cell size in ground units.
-
Raster Data Structures
Square grid: equal length sides
conceptually simplest
cells can be recursively divided into cells of same shape
4-connected neighborhood (above, below, left, right)
all neighboring cells are equidistant
8-connected neighborhood (also include diagonals)
all neighboring cells not equidistant
center of cells on diagonal is 1.41 units away (square root of 2)
rectangular
commonly occurs for lat/long when projected
data collected at 1 degree by 1 degree will be varying sized rectangles
triangular (3-sided) and hexagonal (6-sided)
all adjacent cells and points are equidistant
triangulated irregular network (TIN):
vector model used to represent continuous surfaces (elevation)
more later under vector
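The 4-connected and 8-connected neighborhoods of a square grid, and the 1.41 distance to diagonal neighbors, can be checked with a short Python sketch. It assumes unit cell size and 0-based row and column numbers; the function name `neighbours` is hypothetical.

```python
import math


def neighbours(row, col, n_rows, n_cols, connectivity=4):
    """Return ((row, col), centre-to-centre distance) for each neighbour cell."""
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # above, below, left, right
    if connectivity == 8:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]  # the four diagonals
    result = []
    for d_row, d_col in steps:
        r, c = row + d_row, col + d_col
        if 0 <= r < n_rows and 0 <= c < n_cols:  # skip cells outside the grid
            result.append(((r, c), math.hypot(d_row, d_col)))
    return result


four = neighbours(1, 1, 3, 3, connectivity=4)
eight = neighbours(1, 1, 3, 3, connectivity=8)
print(sorted({round(distance, 2) for _, distance in four}))
print(sorted({round(distance, 2) for _, distance in eight}))
```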
-
Raster Model (Grid, Image)
[Figure: raster grid with row and column axes, showing resolution (pixel size); example pixel at coordinate (2,9) with attribute 3]
Choose a raster pixel size 1/2 the length (1/4 the area) of the smallest feature to map (the smallest feature is called the minimum mapping unit, or resel: resolution element)
-
Assignment scheme
The value of a cell may be:
an average over the cell
a total within the cell
max or min or the commonest value in the cell
the value found at the cell's central point
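The assignment schemes listed above can be compared on one block of finer measurements that all fall inside a single output cell. The sketch below is only illustrative: the 3 x 3 block is made up, the scheme names are labels chosen for this example, and "centre" assumes an odd-sized block so that a middle sample exists.

```python
from collections import Counter
from statistics import mean


def assign(block, scheme):
    """Reduce the samples inside one cell to the single value the cell stores."""
    values = [value for row in block for value in row]
    if scheme == "average":
        return mean(values)
    if scheme == "total":
        return sum(values)
    if scheme == "max":
        return max(values)
    if scheme == "min":
        return min(values)
    if scheme == "commonest":
        return Counter(values).most_common(1)[0][0]
    if scheme == "centre":
        return block[len(block) // 2][len(block[0]) // 2]
    raise ValueError(f"unknown scheme: {scheme}")


# Hypothetical samples covering one raster cell.
block = [
    [2, 2, 5],
    [2, 3, 5],
    [2, 4, 5],
]

for scheme in ("average", "total", "max", "min", "commonest", "centre"):
    print(scheme, assign(block, scheme))
```

The same block yields six different cell values, which is why the assignment scheme should be recorded along with the raster.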
-
Assignment scheme
[Figure: line assignment and polygon assignment]
-
The mixed pixel problem
[Figure: the same scene coded on three 3 x 3 grids with cell codes W, G and E, one grid per rule: water dominates, winner takes all, edges separate]
-
Raster Data
Each cell can be owned by only one feature.
Raster is easy to understand, easy to read and write, and easy to draw on the screen.
Spatial analytical operations are faster.
Grids are poor at representing points, lines and areas, but good at surfaces.
Grids are a natural representation for scanned or remotely sensed data.
Grids suffer from the mixed pixel problem.
Grid compression is easier (techniques used in GIS are run-length encoding and quad trees).
-
Raster Data Sources
Scanned maps
B & W aerial photos
Color aerial photos
Satellite images
-
Data Compression
Run-Length Encoded Compression
A A A D D D
D D D B B B
B B B B B B
C C C C D D
D D D B B B
B B A A A A
Uncompressed:
AAADDDDDDBBBBBBBBBCCCCDDDDDBBBBBAAAA
Run-Length Encoded:
3A6D9B4C5D5B4A
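The run-length example on this slide can be reproduced with a few lines of Python. This is a minimal sketch that assumes single-letter cell values and the count-then-value notation shown on the slide; `rle_encode` and `rle_decode` are names chosen for the example.

```python
from itertools import groupby


def rle_encode(cells):
    """Replace each run of identical values by its length followed by the value."""
    return "".join(f"{len(list(run))}{value}" for value, run in groupby(cells))


def rle_decode(encoded):
    """Rebuild the original stream; digits accumulate until a letter closes the run."""
    cells, count = [], ""
    for character in encoded:
        if character.isdigit():
            count += character
        else:
            cells.append(character * int(count))
            count = ""
    return "".join(cells)


# The 6 x 6 grid from the slide, read row by row.
grid = ["AAADDD", "DDDBBB", "BBBBBB", "CCCCDD", "DDDBBB", "BBAAAA"]
cells = "".join(grid)

encoded = rle_encode(cells)
print(encoded)
print(rle_decode(encoded) == cells)
```

Because runs are allowed to continue across row boundaries, the 36 cells collapse into 7 runs.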
-
Runlength Compression (for single layer)
Full matrix: 162 bytes
111111122222222223
111111122222222233
111111122222222333
111111222222223333
111113333333333333
111113333333333333
111113333333333333
111333333333333333
111333333333333333
Run length (row): 44 bytes
1,7,2,17,3,18
1,7,2,16,3,18
1,7,2,15,3,18
1,6,2,14,3,18
1,5,3,18
1,5,3,18
1,5,3,18
1,3,3,18
1,3,3,18
Value thru column coding: the 1st number is the value, the 2nd is the last column with that value.
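The value-thru-column coding can be sketched in Python as follows. The matrix is the one from the slide, built with string repetition to avoid miscounting digits; columns are counted from 1 as on the slide, and `encode_row` and `decode_row` are names chosen for the example.

```python
def encode_row(row):
    """Return [value, last column, value, last column, ...] for one row."""
    pairs = []
    for column, value in enumerate(row, start=1):
        if pairs and pairs[-2] == value:
            pairs[-1] = column        # same value continues: extend the run
        else:
            pairs += [value, column]  # new value: start a new run
    return pairs


def decode_row(pairs):
    """Expand the pairs back into the full row."""
    row, previous_last = "", 0
    for value, last in zip(pairs[0::2], pairs[1::2]):
        row += value * (last - previous_last)
        previous_last = last
    return row


matrix = (
    ["1" * 7 + "2" * 10 + "3" * 1,
     "1" * 7 + "2" * 9 + "3" * 2,
     "1" * 7 + "2" * 8 + "3" * 3,
     "1" * 6 + "2" * 8 + "3" * 4]
    + ["1" * 5 + "3" * 13] * 3
    + ["1" * 3 + "3" * 15] * 2
)

print(encode_row(matrix[0]))
print(sum(len(row) for row in matrix), sum(len(encode_row(row)) for row in matrix))
```

Counting one byte per stored number, the totals match the 162 and 44 bytes quoted on the slide.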
Now, GIS packages generally rely on commercial compression routines. PKZIP is the most common general-purpose routine. MrSID (from LizardTech) and ECW (from ER Mapper) are used for images. All these essentially use the same concept. Occasionally, data is still delivered to you in run-length compression, especially in remote sensing applications.
Run-length encoding is a lossless compression, as opposed to lossy, since the original data can be exactly reproduced.
-
Data Compression
Basic Quadtree Compression
G = Gray, W = White
Quadtree Compression: G0, G10, G11, G12, W13, G200, G201, G210, G211, W202, W203, W212, W213, W22, W23, W3
NOTE: There are many variations of the quadtree compression. The one above represents one of the most basic.
[Figure: basic quadtree structure. The grid is divided into quadrants 0, 1, 2 and 3; quadrant 1 is subdivided into 10 to 13, quadrant 2 into 20 to 23, and quadrants 20 and 21 into 200 to 203 and 210 to 213.]
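A basic quadtree coding of this kind can be sketched recursively in Python. The 4 x 4 grid below is hypothetical (it is not the grid from the slide's figure), the grid side is assumed to be a power of two, and quadrants are numbered 0 = upper left, 1 = upper right, 2 = lower left, 3 = lower right, following the figure. The function name `quadtree` is chosen for the example.

```python
def quadtree(grid, row=0, col=0, size=None, code=""):
    """Return the leaves as (value, quadrant code) pairs.

    A block holding a single value becomes one leaf; a mixed block is split
    into four quadrants, each coded by appending its quadrant number.
    """
    if size is None:
        size = len(grid)
    values = {grid[r][c] for r in range(row, row + size) for c in range(col, col + size)}
    if len(values) == 1:
        return [(values.pop(), code)]
    half = size // 2
    leaves = []
    for quadrant, (d_row, d_col) in enumerate([(0, 0), (0, half), (half, 0), (half, half)]):
        leaves += quadtree(grid, row + d_row, col + d_col, half, code + str(quadrant))
    return leaves


# Hypothetical grid: G = gray, W = white.
grid = [
    "GGGG",
    "GGGW",
    "GWWW",
    "WWWW",
]

leaves = quadtree(grid)
print(",".join(value + code for value, code in leaves))
```

Here 16 cells are described by 10 leaves; the saving grows with the size of the uniform blocks.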
-
Data Compression
Quad Tree Representation (for single layer)
sides of square grid divided evenly on a recursive basis
length decreases by half
# of areas increases fourfold
area decreases to one fourth
Resample by combining (e.g. average) the four cell values
although storage increases if all samples are saved, processing costs can be saved if some operations don't need high resolution
for nominal or binary data can save storage by using maximum block representation
all blocks with same value at any one level in tree can be stored as single value
Layer  Width  Cell Count
1      1      1
2      2      4
3      4      16
4      8      64
5      16     256
6      32     1024
[Figure: binary grid in which one quadrant is stored as a single 1 and another as a single zero; quadrant codes I 1,0,1,1; II 1; III 0,0,0,1; IV 0]
Essentially involves compression applied to both row and column.
[Figure: resampling pyramid; a 4 x 4 grid of cell values is averaged in 2 x 2 blocks to 3, 4, 2.5 and 3.5, which in turn average to 3.25]
-
Raster Array Representations for multiple layers
How to organize into a one-dimensional data stream for computer storage & processing?
Example layers (2 x 2 cells):
Veg      Soil      Elevation
B B      III IV    150 160
A B      I   II    120 140
Band Sequential (BSQ)
each characteristic in a separate file
elevation file, temperature file, etc.
good for compression
good if focus on one characteristic
bad if focus on one area
File 1: Veg A,B,B,B
File 2: Soil I,II,III,IV
File 3: El. 120,140,150,160
Band Interleaved by Pixel (BIP)
all measurements for a pixel grouped together
good if focus on multiple characteristics of geographical area
bad if want to remove or add a layer
A,I,120, B,II,140, B,III,150, B,IV,160
Band Interleaved by Line (BIL)
rows follow each other for each characteristic
A,B,I,II,120,140, B,B,III,IV,150,160
Note that we start in lower left. Upper left is alternative.
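The three orderings can be generated from the same layers with a short Python sketch. The layer values are those of the slide's example, listed row by row starting from the lower-left cell as the slide does; the function names `bsq`, `bip` and `bil` are chosen for the example and all layers are assumed to have the same shape.

```python
# Each layer is a list of rows; the first row is the bottom row of the map.
layers = {
    "Veg": [["A", "B"], ["B", "B"]],
    "Soil": [["I", "II"], ["III", "IV"]],
    "El": [[120, 140], [150, 160]],
}


def bsq(layers):
    """Band sequential: one complete layer after another."""
    return [value for grid in layers.values() for row in grid for value in row]


def bip(layers):
    """Band interleaved by pixel: all layer values of a cell, then the next cell."""
    grids = list(layers.values())
    n_rows, n_cols = len(grids[0]), len(grids[0][0])
    return [grid[r][c] for r in range(n_rows) for c in range(n_cols) for grid in grids]


def bil(layers):
    """Band interleaved by line: one row of each layer, then the next row."""
    grids = list(layers.values())
    return [value for r in range(len(grids[0])) for grid in grids for value in grid[r]]


print(bsq(layers))
print(bip(layers))
print(bil(layers))
```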
-
File Formats for Vector Spatial Data
Arc Export
Arc Export is a transfer format, either ASCII or compressed into binary, used to transfer files between different versions of ARC/INFO. It is undocumented and will work only with ESRI products.
ARC/INFO Coverages
An ARC/INFO "coverage" is a set of internal binary files used by ARC/INFO, a GIS program. This file format is proprietary and not readily usable by other programs.
-
File Formats for Vector Spatial Data
ArcView Shape
The shapefile format defines the geometry and attributes of geographically referenced features in several files with specific file extensions that must be stored in the same project workspace. They are:
.shp - the file that stores the feature geometry. (required)
.shx - the file that stores the index of the feature geometry. (required)
.dbf - the dBASE file that stores the attribute information of features. (required)
.sbn and .sbx - the files that store the spatial index of the features. (optional)
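Because a shapefile is really a set of files, a quick completeness check is often useful before moving or loading one. The sketch below only compares file names against the required and optional extensions listed above; the function name `missing_parts` and the file names are hypothetical, and no file contents are read.

```python
REQUIRED = (".shp", ".shx", ".dbf")
OPTIONAL = (".sbn", ".sbx")


def missing_parts(stem, file_names):
    """Return the required extensions for which no file named stem + extension exists.

    The comparison ignores case, since extensions are often written in upper case.
    """
    present = {name.lower() for name in file_names}
    return [extension for extension in REQUIRED if (stem + extension).lower() not in present]


# Hypothetical workspace listings; in practice they could come from os.listdir().
print(missing_parts("roads", ["roads.shp", "roads.dbf", "roads.sbn"]))
print(missing_parts("roads", ["ROADS.SHP", "roads.shx", "roads.dbf"]))
```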
-
ARC/INFO vs. ArcView
ARC/INFO is a topologically based hybrid system
ArcView is a file-based, non-topological, pseudo object-oriented graphic data structure
-
File Formats for Vector Spatial Data
ArcGIS Geodatabase
ArcGIS has a well-defined model for working with data. This generic model, called the geodatabase (short for geographic database), defines all the types of data that can be used in ArcGIS (for example, features, rasters, addresses, and survey measurements) and how they are represented, accessed, stored, managed and processed. The geodatabase is a common framework shared by all ArcGIS products and applications.
-
File Formats for Vector Spatial Data
ArcGIS Geodatabase
The geodatabase offers you the ability to:
Handle rich data types.
Apply sophisticated rules and relationships.
Access large volumes of geographic data stored in both files and databases.
-
File Formats for Vector Spatial Data
ArcGIS Geodatabase
ArcGIS supports a collection of files in a file system or a collection of tables in a relational database management system (RDBMS).
It works with several well-known file-based data set types, such as coverages, shapefiles, grids, images, and triangulated irregular networks (TINs).
It manages the same types of geographic information in an RDBMS such as DB2, Informix, Oracle, SQL Server, or Microsoft Access.
-
File Formats for Vector Spatial Data
Geodatabase Data Management
Two categories:
Personal Geodatabase
Single user editing
Stored in MS Access
Size limit of 2 GB
ArcSDE Geodatabase
Enterprise
Supports multiuser editing via versioning
Requires ArcEditor or ArcInfo Editor to edit
-
File Formats for Vector Spatial Data
Comparison of the file and geodatabase implementations
File-Based Data Sets
Coverages
Shapefiles
Grids
TINs
Images
Vector Product Format files
Computer-aided design files
Geography markup language
Tables
XML
Geodatabase
DB2 with its Spatial type
Informix with its Spatial type
SQL Server
Oracle
Oracle with Spatial or Locator
Personal geodatabases (Microsoft Access)
-
File Formats for Vector Spatial Data
AutoCAD Drawing Files (DWG)
DWG is the internal, proprietary format used in AutoCAD software, which is a computer-aided design/drafting (CAD) program. Despite its proprietary nature, AutoCAD can convert any DWG file to a DXF file (described below) without loss of graphic information. As with DXF files, there are a number of ways to store attribute information in DWG files. The emerging standard is one that uses Extended Entity Data (EED) to link attributes, but many others are possible. However, the lack of one standard for linking attributes can cause problems when data is transferred between systems.
-
File Formats for Vector Spatial Data
Autodesk's Drawing Interchange Format (DXF)
DXF is probably the most widely used vector data transfer format, and a file in DXF format offers some very strong advantages. It contains very complete display information, and almost every graphics program can read it. However, there are several different ways to store attribute information in DXF and to link DXF entities to external attributes. Because there are no attribute standards, many programs that claim to read DXF files still do not import attribute information properly.
-
File Formats for Vector Spatial Data
Digital Line Graphs (DLG)
DLG, a transfer format used by the US Geological Survey (USGS), depicts vector information portrayed on printed paper maps. It carries very accurate coordinate information and sophisticated feature-classification information but no other attribute data. DLG does not include any display information. The DLG standard is significant because the USGS and other US government agencies have used it to publish large numbers of digital maps.
-
File Formats for Vector Spatial Data
MapInfo Data Transfer Files (MIF/MID)
MIF/MID is a transfer standard used by MapInfo, a desktop mapping system. It carries all three types of GIS information: geographic, attribute, and display. Attribute links are implicit in the file format.
MapInfo Map Files
MapInfo has its own internal binary format, known as a map file. It is undocumented and proprietary, so it cannot be used outside a MapInfo system.
-
File Formats for Vector Spatial Data
MicroStation Design Files (DGN)
DGN is the internal format used by Bentley Systems Inc.'s MicroStation, a CAD program. It is well documented and standardized, so it may also be used as a transfer standard. DGN files contain detailed display information. The most common way to store attributes is to place them in an external database file and record links in the MSLINK field, a data item carried for each element in the DGN file.
-
File Formats for Raster Spatial Data
The generic raster data model is actually implemented in several different computer file formats:
GRID is ESRI's proprietary format for storing and processing raster data
Standard industry formats for image data such as JPEG, TIFF and MrSID can be used to display raster data, but not for analysis (must convert to GRID)
Georeferencing information is required to display images with mapped vector data
Requires an accompanying world file which provides locational information
Image   Image File  World File
TIFF    image.tif   image.tfw
Bitmap  image.bmp   image.bpw
BIL     image.bil   image.blw
JPEG    image.jpg   image.jgw
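How a world file supplies locational information can be sketched in Python. The sketch assumes the usual six-line world file layout (x pixel size, two rotation terms, negative y pixel size, then the map x and y of the centre of the upper-left pixel); the numeric contents below are made up, and `parse_world_file` and `pixel_to_map` are names chosen for the example.

```python
def parse_world_file(text):
    """Read the six numbers of a world file in file order: A, D, B, E, C, F."""
    a, d, b, e, c, f = (float(token) for token in text.split())
    return a, d, b, e, c, f


def pixel_to_map(col, row, parameters):
    """Map coordinates of the centre of pixel (col, row), counted from the upper left."""
    a, d, b, e, c, f = parameters
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Hypothetical contents of image.tfw: 10-unit pixels, no rotation.
world_text = "10.0\n0.0\n0.0\n-10.0\n500005.0\n4199995.0\n"
parameters = parse_world_file(world_text)

print(pixel_to_map(0, 0, parameters))
print(pixel_to_map(3, 2, parameters))
```

The negative y pixel size reflects that image rows count downward while map northings increase upward.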
-
Viewing File
Most importantly, the information in a file should be organized so that people can use it logically without having to know anything about its physical structure. The difference between logical and physical:
LOGICAL VIEW
Focus on how you need to arrange and access information to meet your particular needs.
PHYSICAL VIEW
Deal with how information is physically arranged, stored, and accessed on some type of secondary storage device.
-
Logical View and Physical View (Vector Data)
Logical View - Shapefile (*.shp)
-
Logical View and Physical View (Vector Data)
Physical View - Shapefile (*.shp)
-
Logical View and Physical View (Raster Data)
Logical View - Imagine file (*.img)
-
Logical View and Physical View (Raster Data)
Physical View - Imagine file (*.img)
-
Metadata
Metadata is data about data
-
Metadata
Allows a producer to fully describe a dataset so that users can understand the assumptions and limitations and evaluate the dataset's applicability for their intended use.
-
Metadata should answer the following questions:
Who collected the original data and who is responsible for the dataset?
What is the purpose of the dataset?
What will users find in this dataset?
What elements are mandatory and what are optional?
What terminology standards and scales have been used in this dataset?
How can users access the dataset?
What geographic area(s) does this dataset cover?
What type of transfer protocol is needed to receive the dataset?
Metadata is the resume of spatial data:
Who? What? When? Where? How? Purpose? Cost?
Need for Metadata Standards
A metadata standard will promote:
the proper use and effective retrieval of geospatial data
the organization and management of geographic data
the provision of information about an organization's database to others
Essential Elements of Metadata for Spatial Information
1. Identification [IDEN]
2. Data Quality [QUAL]
3. Spatial Data Organization [SDOR]
4. Spatial Reference [SREF]
5. Distribution [DIST]
6. Entity and Attributes Information [ENTI]
7. Metadata Reference [REFE]
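One way to picture the seven sections is as the top-level keys of a metadata record. The Python sketch below is only an illustration: treating all seven sections as required is an assumption, the function name is hypothetical, and every value in the sample record is made up.

```python
# Section codes taken from the list above.
REQUIRED_SECTIONS = ["IDEN", "QUAL", "SDOR", "SREF", "DIST", "ENTI", "REFE"]


def missing_sections(record):
    """Return the section codes that are absent or empty in a metadata record."""
    return [code for code in REQUIRED_SECTIONS if not record.get(code)]


# Hypothetical, deliberately incomplete record (all values are placeholders).
sample_record = {
    "IDEN": {"TIT": "Example land-use map", "AUT": "Example agency"},
    "QUAL": {"ACC": "positional accuracy not yet assessed"},
    "SDOR": {"VEC": True, "RAS": False},
    "SREF": {"DAT": "example datum"},
    "DIST": {"ONL": False},
}

print(missing_sections(sample_record))  # sections still to be filled in
```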
[IDEN] Identification
[TIT] Title: What is the name of the data set?
[AUT] Author: Who developed the data set?
[COV] Area Coverage: What geographic area does it cover?
[THE] Themes: What themes of information does it include?
[CUR] Currentness: How current are the data?
[RES] Restriction: Are there restrictions on accessing or using the data?
[QUAL] Data Quality
[ACC] Accuracy: What is the positional and attribute accuracy?
[COM] Completeness: Are the data complete?
[LCO] Logical Consistency: Was the consistency of the data verified?
[LIN] Lineage: What data were used to create the data set, and what
processes were applied to those sources?
[SDOR] Spatial Data Organization
[VEC] Vector: Has a vector model been used to encode the spatial data?
[RAS] Raster: Has a raster model been used to encode the spatial data?
[TNE] Type and Number of Elements: What type and how many spatial
objects are there?
[SREF] Spatial Reference
[PRO] Projection: What map projection method was used to represent the
location of spatial objects?
[LOL] Longitude/Latitude: Are coordinate locations encoded using longitude and latitude?
[GRI] Grid System: Is a grid system such as the State Plane Coordinate
System used?
[DAT] Datum: What horizontal and vertical datums are used?
[COO] Coordinate System: What parameters should be used to convert the
data to another coordinate system?
[DIST] Distribution
[DIS] Distributor: From whom can one obtain the data?
[FOR] Formats: What formats are available?
[MED] Media: What media are available?
[ONL] Online: Are the data available online?
[PRI] Price: What is the price of the data?
[ENTI] Entity and Attributes Information
[FEA] Features: What geographic features are included (roads, houses,
elevation, temperature)?
[ATT] Attributes: What characteristics of those features are included
(lengths, widths, heights)?
[AVA] Attribute Values: What parameters are used to represent the
characteristics of features?
[REFE] Metadata Reference
[CUR] Currentness of Metadata: When were the metadata compiled?
[RES] Responsible Party: By whom were the metadata compiled?
[CIT] Citation: Recommended reference to be used for the dataset.
Currently, a number of metadata standards exist:
USA: Content Standard for Digital Geospatial Metadata, Federal Geographic Data Committee
(FGDC)
International Organization for Standardization (ISO): ISO CD 15046 - Part 15: Geographic Information -
Metadata
Open GIS Consortium (OGC), a private-sector
initiative, was formed in 1994 to develop software
specifications that advance geoprocessing
interoperability across the GIS industry.
OGC has been working very closely with ISO/TC 211 to identify the
overlap and division of labor in their mutual work programs. The formation of
the ISO/TC 211 - OGC coordination group is a result of such efforts.
http://www.fgdc.gov/metadata/metadata.html
Metadata Example (Image.Lan File)
Position Field Type
Byte 0 HDWord Char[6]
Byte 6 IPACK (Bits) Short int
Byte 8 NBands Short int
Byte 10 UnUsed Char[6]
Byte 16 IColumn int
Byte 20 IRow int
Byte 24 XStart int
Byte 28 YStart int
Byte 32 UnUsed Char[56]
Byte 88 MapType Short int
Byte 90 NClass Short int
Byte 92 UnUsed Char[14]
Byte 106 IAUTYP Short int
Byte 108 ACRE float
Byte 112 XMap float
Byte 116 YMap float
Byte 120 XCell float
Byte 124 YCell float
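The header table above can be read with a short Python sketch built on the standard `struct` module. The field order and sizes come from the table; little-endian byte order, 2-byte short ints, 4-byte ints and 4-byte floats are assumptions, and the function name and sample values are hypothetical.

```python
import struct

# One format character group per row of the table; "x" marks the UnUsed bytes.
# Little-endian byte order ("<") is an assumption, not stated in the table.
LAN_HEADER = struct.Struct("<6s2h6x4i56x2h14xh5f")

FIELDS = (
    "HDWord", "IPACK", "NBands", "IColumn", "IRow", "XStart", "YStart",
    "MapType", "NClass", "IAUTYP", "ACRE", "XMap", "YMap", "XCell", "YCell",
)


def parse_lan_header(buf):
    """Unpack the first 128 bytes of a .lan file into a field dictionary."""
    values = LAN_HEADER.unpack(buf[:LAN_HEADER.size])
    return dict(zip(FIELDS, values))


# Synthetic header with made-up values, used only to exercise the parser.
sample = LAN_HEADER.pack(
    b"HEAD74", 0, 3, 512, 400, 0, 0, 1, 0, 2,
    0.0, 250000.0, 2750000.0, 30.0, 30.0,
)
header = parse_lan_header(sample)
print(LAN_HEADER.size)  # total header length implied by the table
print(header["NBands"], header["IColumn"], header["IRow"], header["XCell"])
```

The last offset in the table (byte 124, a 4-byte float) implies a 128-byte header, which is what the format string adds up to.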
Data Exchange
Spatial Data Transfer Standard (SDTS)
SDTS, a new transfer format developed by the US government,
was designed to handle all types of geographic data. SDTS
can be either binary or ASCII but is generally binary.
Virtually all geographic concepts can be encoded in SDTS,
including coordinate information, complex attribute
information, and display information. This versatility causes
a corresponding increase in complexity. To simplify things,
several standard subsets of SDTS have been adopted. The
first of these, the Topological Vector Profile (TVP), is used to
store certain types of vector maps. SDTS can also be used for
raster information. Not much data is available in SDTS
format at this time, nor do many software systems support it.
However, it will be the foundation of the US National Spatial
Data Infrastructure (NSDI). Its importance will increase as