enggis_v&r

Upload: herlambang-andy

Post on 21-Feb-2018


  • 7/24/2019 ENGgis_V&R

    1/110

    CSRSR NCU

    Geographic Information Systems (GIS)

    tel: 03-4227151-57624

    fax: 03-4254908

    e-mail: [email protected]

    Chi-Farn Chen, CSRSR NCU

  • 7/24/2019 ENGgis_V&R

    2/110

    CSRSR NCU

    Vector and Raster Data Model

  • 7/24/2019 ENGgis_V&R

    3/110

    CSRSR NCU

    Model and GIS

    A model is a representation of reality.

    GIS itself is based on a model of complexity, and GIS is used to model complexity.

    Is a full representation of reality possible?

    Data model = limited representation of reality

  • 7/24/2019 ENGgis_V&R

    4/110

    CSRSR NCU

    Data Model

    Reality is too complex for even the most sophisticated GIS software, so in order to represent reality in a spatial database, a simplification of reality is created. This simplification is known as a data model.

    In a data model, reality is represented by geometry and attributes.

  • 7/24/2019 ENGgis_V&R

    5/110

  • 7/24/2019 ENGgis_V&R

    6/110CSRSR NCU

    GIS Data Formats

    There are two formats used by GIS systems to store and retrieve spatial data:

    Vector

    Raster

  • 7/24/2019 ENGgis_V&R

    7/110CSRSR NCU

    Vector Format

    Data are associated with points, lines, or areas.

    Points are located by coordinates.

    Lines are described by a series of points, as in join-the-dots books (arcs, nodes, vertices).

    Areas are described by a series of lines enclosing the area (polygons).

  • 7/24/2019 ENGgis_V&R

    8/110CSRSR NCU

    Vector Format

    Any attributes (name or code) can be associated with a point, line or polygon.

    Data are stored in two files:
      a file containing the coordinates
      a file containing the attributes

    A third file contains the information needed to link positional data with their attributes (identifier).

  • 7/24/2019 ENGgis_V&R

    9/110CSRSR NCU

    Features have unique identifiers: point ID, line ID, polygon ID.

    Common identifiers provide the link to:
      the coordinates table (for where)
      the attributes table (for what)

    Coordinates Table

    Point ID   x   y
    1          1   3
    2          2   1
    3          4   1
    4          1   2
    5          3   2

    Attributes Table

    Point ID   model   year
    1          a       90
    2          b       90
    3          b       80
    4          a       70
    5          c       70

    [Figure: the five points plotted on X and Y axes, labeled by point ID]
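The link between the coordinates table and the attributes table can be sketched in a few lines of Python. This is only an illustration: the dictionaries and the function name `join_by_id` are assumptions, and the values are the ones read from the two tables on this slide.

```python
# Coordinates table (where) and attributes table (what), both keyed by point ID.
# Values are as read from the slide; the layout as dicts is an assumption.
coordinates = {1: (1, 3), 2: (2, 1), 3: (4, 1), 4: (1, 2), 5: (3, 2)}
attributes = {
    1: {"model": "a", "year": 90},
    2: {"model": "b", "year": 90},
    3: {"model": "b", "year": 80},
    4: {"model": "a", "year": 70},
    5: {"model": "c", "year": 70},
}


def join_by_id(coords, attrs):
    """Link each point's position to its attributes through the shared ID."""
    return {
        pid: {"x": x, "y": y, **attrs[pid]}
        for pid, (x, y) in coords.items()
        if pid in attrs
    }


features = join_by_id(coordinates, attributes)
print(features[3])
```

Keeping the two tables separate means positions and attributes can be edited independently, as long as the identifier stays unique.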

  • 7/24/2019 ENGgis_V&R

    10/110CSRSR NCU

    Raster Format

    Data are divided into cells, or pixels (picture elements).

    Pixels are organized in arrays.

    Row and column numbers (coordinates) are used to identify the location of the pixel within the array.

    Each pixel has a single value (attribute).

  • 7/24/2019 ENGgis_V&R

    11/110CSRSR NCU

    Raster Format

    [Figure: an array with rows and columns; the highlighted pixel has coordinate (2,9) and attribute 8]

  • 7/24/2019 ENGgis_V&R

    12/110CSRSR NCU

    Vector and Raster Representation of Point Map Features

    [Figure: a point map feature shown in GIS vector format, as an (X,Y) coordinate in space, and in GIS raster format, as a pixel located in an array]

  • 7/24/2019 ENGgis_V&R

    13/110CSRSR NCU

    Vector and Raster Representation of Line Map Features

    [Figure: a line map feature shown in GIS vector format and in GIS raster format]

  • 7/24/2019 ENGgis_V&R

    14/110CSRSR NCU

    Vector and Raster Representation of Area Map Features

    [Figure: an area map feature shown in GIS vector format and in GIS raster format]

  • 7/24/2019 ENGgis_V&R

    15/110

    CSRSR NCU

    Vector Model

    Vector model uses discrete points, lines and/or areas corresponding to discrete objects, with the name or code number of their attributes.

    Raster Model

    Raster model uses regularly spaced grid cells in a specific sequence. An element of the grid is called a pixel (picture element). The conventional sequence is row by row from top to bottom and, within each row, column by column from left to right. Every location is given in two-dimensional image coordinates (row number and column number) and contains a single attribute value.
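The row-by-row sequence described above can be illustrated with a small Python sketch. It assumes 0-based row and column numbers and a made-up 3 x 3 grid; the names `cell_index` and `cell_position` are hypothetical.

```python
def cell_index(row, col, n_cols):
    """Position of a cell in the row-by-row (top to bottom) sequence."""
    return row * n_cols + col


def cell_position(index, n_cols):
    """Inverse of cell_index: recover (row, col) from the sequence position."""
    return divmod(index, n_cols)


# A made-up 3 x 3 raster; each cell holds a single attribute value.
grid = [
    [0, 0, 1],
    [0, 8, 1],
    [2, 2, 1],
]
flat = [value for row in grid for value in row]  # row by row, left to right

print(flat[cell_index(1, 1, n_cols=3)])
print(cell_position(4, n_cols=3))
```

Because the sequence is fixed, only the grid dimensions and the values need to be stored; the location of each value is implied by its position.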

  • 7/24/2019 ENGgis_V&R

    16/110

    CSRSR NCU

    Comparison of Raster and Vector Formats

    Raster

    Raster formats are efficient when comparing information among arrays with the same cell size.

    Raster files are generally very large because each cell occupies a separate line of data, only one attribute can be assigned to each cell, and cell sizes are relatively small.

    Raster representations are relatively coarse and imprecise.

    Vector

    Vector formats are efficient when comparing information whose geographical shapes and sizes are different.

    Vector files are much smaller because a relatively small number of vectors can precisely describe large areas and many attributes can be ascribed to these areas.

    Vector representations of shapes can be very precise.

  • 7/24/2019 ENGgis_V&R

    17/110

    CSRSR NCU

    Vector Model

  • 7/24/2019 ENGgis_V&R

    18/110

    CSRSR NCU

    Vector Model

    There are different models to store and manage vector information. Each of them has different advantages and disadvantages.

    Spaghetti Model (list of coordinates)

    Vertex Dictionary Model

    Topological Model (Dual Independent Map Encoding: DIME)

    Arc / Node Model

  • 7/24/2019 ENGgis_V&R

    19/110

    CSRSR NCU

    Vector representation without model

  • 7/24/2019 ENGgis_V&R

    20/110

    CSRSR NCU

    Spaghetti Model (List of coordinates)

    simple and easy to manage

    lots of duplication, hence need for large storage space

    very often used in CAC (computer assisted cartography)

    no topology

  • 7/24/2019 ENGgis_V&R

    21/110

    CSRSR NCU

    No Topology

  • 7/24/2019 ENGgis_V&R

    22/110

    CSRSR NCU

    Spaghetti data often contains crossing lines, loose ends, and double digitization of common boundaries (slivers).

    It's a mess!

  • 7/24/2019 ENGgis_V&R

    23/110

    CSRSR NCU

    Sliver Polygons

    Sliver polygons are small, narrow polygon features that inevitably appear along borders of polygons following the overlay of two or more geographic data sets.

    They often occur when a shared boundary arc is digitized twice.

    They should be removed, but are difficult to find.

  • 7/24/2019 ENGgis_V&R

    24/110

    CSRSR NCU

    Sliver Polygons

  • 7/24/2019 ENGgis_V&R

    25/110

    CSRSR NCU

    Vertex Dictionary Model

    no duplication, but still this model does not use topology
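The difference between the spaghetti model and the vertex dictionary model can be sketched as follows. The two unit squares and the function name `to_vertex_dictionary` are assumptions made for illustration, not data from the slides.

```python
# Spaghetti model: every polygon carries its own full coordinate list,
# so the two vertices on the shared edge are stored twice.
spaghetti = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],
}


def to_vertex_dictionary(polygons):
    """Store each coordinate once and let polygons refer to vertex IDs."""
    vertex_ids = {}  # coordinate -> vertex ID
    rings = {}
    for name, ring in polygons.items():
        rings[name] = [
            vertex_ids.setdefault(point, len(vertex_ids) + 1) for point in ring
        ]
    vertices = {vid: point for point, vid in vertex_ids.items()}
    return vertices, rings


vertices, rings = to_vertex_dictionary(spaghetti)
print(vertices)
print(rings)
```

The vertex dictionary removes the duplicated coordinates, but nothing in it records that A and B are neighbors; finding that out still requires comparing the ID lists, which is why the model is described as having no topology.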

  • 7/24/2019 ENGgis_V&R

    26/110

    CSRSR NCU

    Topological Model (DIME) / (TIGER)

    assigns a directional code in the form of a "from node" and a "to node"

    nodes (intersections of lines) are identified with codes

    developed by the US Bureau of the Census

    both street addresses and UTM coordinates are explicitly defined for each link

  • 7/24/2019 ENGgis_V&R

    27/110

    CSRSR NCU

    Topology

  • 7/24/2019 ENGgis_V&R

    28/110

    CSRSR NCU

    The GIS vector data model is slightly more complex, as each vertex, arc, node and polygon is uniquely identified and the relationships between them are stored in the database. The relationships between the elements of a vector data model, in terms of relative location and connections, are known as topology. Topology gives the vector data model a level of intelligence, which means that the GIS can recognize which arcs are joined to each other and identify those polygons which are adjacent to each other.

  • 7/24/2019 ENGgis_V&R

    29/110

    CSRSR NCU

    Topology:

    A GIS topology is a set of rules and behaviors

    that model how points, lines, polygons share

    geometry. For example, adjacent features,

    such as two countries, share a common edge.

    Simple definition:

    Topology stores the relationships of one

    spatial element with respect to another.

  • 7/24/2019 ENGgis_V&R

    30/110

    CSRSR NCU

    Topology:

    Topology is a mathematical approach that allows us to structure data based on the principles of feature adjacency and feature connectivity.

    It is in fact the mathematical method used to define spatial relationships. Without a topologic data structure in a vector-based GIS, most data manipulation and analysis functions would not be practical or feasible.

  • 7/24/2019 ENGgis_V&R

    31/110

    CSRSR NCU

    Defining Topology

    Topology :

    the spatial relationships between features in a GIS

    Do polygons overlap?

    Do lines intersect or connect?

    Are points located near each other?

  • 7/24/2019 ENGgis_V&R

    32/110

    CSRSR NCU

    Topology:

    Three main concepts:

    Connectivity: arc-node topology

    Definition of areas / containment: polygon-arc topology

    Adjacency / contiguity: left/right topology

  • 7/24/2019 ENGgis_V&R

    33/110

    CSRSR NCU

    Topology : spatial relationship

    Connectivity:
      The topological identification of connected arcs by recording the from-node and to-node for each arc. Arcs that share a common node are connected.
      Arcs connect to each other at nodes.
      Describes linear networks; links have direction.

    Area:
      Arcs that connect to surround an area define a polygon.

    Containment:
      Accounts for polygons within polygons (islands).
      Describes which landscape features are located within, or intersect, the boundary of polygons.

    Adjacency:
      The identification of adjacent polygons by recording the left-hand and right-hand polygons of each arc.
      Arcs have direction and left and right sides.
      Describes a landscape feature's neighbors.

  • 7/24/2019 ENGgis_V&R

    34/110

    CSRSR NCU

    Concept of Topology

    Topology distinguishes GIS data models from the non-topological data models supported by many CAD, mapping and graphics systems.

    Topology refers to knowledge about the relative spatial positioning of features:
      knowledge about how features are connected and which features are adjacent to each other.

    It can be viewed as a mathematical procedure that determines spatial properties and relationships, including:
      Connectivity, contiguity (adjacency)
      Lengths of arcs and areas of polygons

  • 7/24/2019 ENGgis_V&R

    35/110

    CSRSR NCU

    Topology Rules for Coverages:

    Each arc has a beginning node and an ending node; this determines directionality.
      Directionality is determined during digitizing.
      Actual direction is important only if your application requires directional modeling.

    Arcs connect to other arcs at nodes.

    Connected arcs form polygon boundaries. Arc coordinates are stored only once because two adjacent polygons share the common arc between them.

    Arcs have polygons on their left and right sides.

  • 7/24/2019 ENGgis_V&R

    36/110

    CSRSR NCU

    Topology Concept I

    Arc-node topology is how Arc/INFO keeps track of which arcs are connected to other arcs through shared nodes (nodes are endpoints of arcs). It defines length, direction, and connectivity for arcs. The from-node is an arc's starting point; the to-node is its ending point. They are determined as you digitize your data. You can see the from-node and to-node whenever you list attribute records for a coverage containing lines. Arcs connect if they share a node.

  • 7/24/2019 ENGgis_V&R

    37/110

    CSRSR NCU

    Topology Concept II

    Polygon-arc topology expresses the relationship between the arc features and the polygon features for which the arcs create boundaries. It defines area and adjacency. An arc or a set of arcs that forms a closed figure defines the area of a polygon. Two polygons are adjacent if they share an arc. Polygons are stored as a list of arcs to avoid redundancy.

  • 7/24/2019 ENGgis_V&R

    38/110

    CSRSR NCU

    Topology Concept III

    Left-right topology refers to contiguity: how polygons are associated with their neighboring polygons. Each arc has a list of which polygons are on the right side and which are on the left side. Commands in Arc/INFO use this information to determine from one polygon what the adjacent polygons are.

    [Figure: polygons numbered 1 to 7]

  • 7/24/2019 ENGgis_V&R

    39/110

    CSRSR NCU

    Vector Data Model

    Points: represent discrete point features

    airports are point features

    each point is stored as a coordinate pair

    each point location has a record in the table

  • 7/24/2019 ENGgis_V&R

    40/110

    CSRSR NCU

    Vector Data Model

    Lines: represent linear features

    roads are linear features

    each road segment has a record in the table

  • 7/24/2019 ENGgis_V&R

    41/110

    CSRSR NCU

    Vector Data Model

    Lines: fundamental spatial data model

    Lines start and end at nodes

    line #1 goes from node #2 to node #1

    Vertices determine the shape of a line

    Nodes and vertices are stored as coordinate pairs

    [Figure: a line with a node at each end and four vertices in between]

  • 7/24/2019 ENGgis_V&R

    42/110

    CSRSR NCU

    Vector Data Model

    Polygons: represent bounded areas

    landforms and water are polygonal features

    each bounded polygon has a record in the table

  • 7/24/2019 ENGgis_V&R

    43/110

    CSRSR NCU

    Vector Data Model

    Polygons: fundamental spatial data model

    Polygon #2 is bounded by lines 1 & 2

    Line 2 has polygon 1 on the left and polygon 2 on the right

  • 7/24/2019 ENGgis_V&R

    44/110

    CSRSR NCU

    Vector Data Model

    Polygons: fundamental spatial data model

    complex data model, especially for larger data sets

    arc-node topology, only used for ArcInfo data sets

  • 7/24/2019 ENGgis_V&R

    45/110

    CSRSR NCU

    Connectivity

    Arc-node topology

  • 7/24/2019 ENGgis_V&R

    46/110

    CSRSR NCU

    Definition of areas

    Polygon-arc topology

  • 7/24/2019 ENGgis_V&R

    47/110

    CSRSR NCU

    Adjacency/Contiguity

    Left/right topology

  • 7/24/2019 ENGgis_V&R

    48/110

    CSRSR NCU

    Arc / Node Model

  • 7/24/2019 ENGgis_V&R

    49/110

    CSRSR NCU

    Arc / Node Model

    File 1. Coordinates of nodes and vertices for all the arcs

    ARC   F_node    Vertex                  T_node
    1     3.2,5.2   1,5.2                   1,3
    2     1,3       1.8,2.6  2.8,3  3.3,4   3.2,5.2
    3     1,2       3.5,2  4.2,2.7          5.2,2.7

    File 2. Arcs topology

    ARC   F_node   T_node   R_poly     L_poly
    1     1        2        External   A
    2     2        1        A          External
    3     3        4        External   External

    File 3. Polygons topology

    Polygon   Arcs
    A         1, 2

    File 4. Nodes topology

    Node   Arcs
    1      1, 2
    2      1, 2
    3      3
    4      3
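As a rough illustration of how the files relate, the polygon and node tables can be derived from the arcs topology table alone. The sketch below copies File 2 into a Python dict; the key names and the functions `node_arcs`, `polygon_arcs` and `connected` are hypothetical, not part of any GIS package.

```python
# File 2 (arcs topology) from the slide, as a dict keyed by arc ID.
arcs = {
    1: {"from": 1, "to": 2, "right": "External", "left": "A"},
    2: {"from": 2, "to": 1, "right": "A", "left": "External"},
    3: {"from": 3, "to": 4, "right": "External", "left": "External"},
}


def node_arcs(arcs):
    """Nodes topology: which arcs start or end at each node."""
    table = {}
    for arc_id, arc in arcs.items():
        for node in (arc["from"], arc["to"]):
            table.setdefault(node, []).append(arc_id)
    return table


def polygon_arcs(arcs, outside="External"):
    """Polygons topology: which arcs bound each polygon."""
    table = {}
    for arc_id, arc in arcs.items():
        for poly in {arc["left"], arc["right"]}:
            if poly != outside:
                table.setdefault(poly, []).append(arc_id)
    return table


def connected(arcs, a, b):
    """Two arcs are connected if they share a node."""
    ends_a = {arcs[a]["from"], arcs[a]["to"]}
    ends_b = {arcs[b]["from"], arcs[b]["to"]}
    return bool(ends_a & ends_b)


print(node_arcs(arcs))
print(polygon_arcs(arcs))
print(connected(arcs, 1, 2), connected(arcs, 1, 3))
```

None of these queries touches the coordinates in File 1, which is the practical point of storing topology explicitly.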

  • 7/24/2019 ENGgis_V&R

    50/110

    CSRSR NCU

    Geometry and Topology of Vector Data

    The geometry of a point is given by two-dimensional coordinates (x, y), while a line, string or area is given by a series of point coordinates.

    The topology, however, defines additional structure as follows:

    Node: an intersection of more than two lines or strings, or the start or end point of a string, with a node number

    Arc: a line or a string with a chain number, start and end node numbers, and left and right neighboring polygons

    Polygon: an area with a polygon number and the series of arcs that form the area in clockwise order (a minus sign is assigned in case of anti-clockwise order)

  • 7/24/2019 ENGgis_V&R

    51/110

    CSRSR NCU

  • 7/24/2019 ENGgis_V&R

    52/110

    CSRSR NCU

    Topological Relationships between Spatial Objects

    Point-Point Relationships
      "is within": within a certain distance
      "is nearest to": nearest to a certain point

    Point-Line Relationships
      "on line": a point on a line
      "is nearest to": a point nearest to a line

    Point-Area Relationships
      "is contained in": a point in an area
      "on border of area": a point on the border of an area

    Line-Line Relationships
      "intersects": two lines intersect
      "crosses": two lines cross without an intersect
      "flow into": a stream flows into the river

    Line-Area Relationships
      "intersects": a line intersects an area
      "borders": a line is a part of the border of an area

    Area-Area Relationships
      "overlaps": two areas overlap
      "is within": an island within an area
      "is adjacent to": two areas share a common boundary

  • 7/24/2019 ENGgis_V&R

    53/110

    CSRSR NCU

  • 7/24/2019 ENGgis_V&R

    54/110

    CSRSR NCU

    Topology Review

    Topology is the spatial relationship between connecting or adjacent features in a geographic data layer.

    A procedure used by the computer to explicitly define and store the spatial relationships between connecting or adjacent coverage features.

    Think of topology as geometry on a rubber sheet.

    This type of geometry is concerned with spatial relationships rather than rigid coordinate location.

  • 7/24/2019 ENGgis_V&R

    55/110

    CSRSR NCU

    Topology Review

    If a map is stretched and distorted, some properties change:

    Distances

    Angles

    Relative proximities

  • 7/24/2019 ENGgis_V&R

    56/110

    CSRSR NCU

    Topology Review

    Other properties (topological properties) remain constant after distortion:

    Adjacency

    Containment

    Connectivity

    Areas remain areas, lines remain lines, points remain points

  • 7/24/2019 ENGgis_V&R

    57/110

    CSRSR NCU

    Topology Review

    By tracking all the arcs that meet at any node, topology knows which arcs connect to each other. (Arc-node topology)

    A list of arcs is used to construct the polygon. Storing each arc only once reduces the amount of data and ensures that the boundaries of adjacent polygons do not overlap. (Polygon-arc topology)

    It is easy to find similar characteristics between adjacent polygons. (Left-right topology)

  • 7/24/2019 ENGgis_V&R

    58/110

    CSRSR NCU

    Vector Topology helps deal with:

    overshoots

    slivers

    dangles

    Not sharing border

    Topology Review

  • 7/24/2019 ENGgis_V&R

    59/110

    CSRSR NCU

    TIGER

    Bureau of the Census

    Address matching to convert street addresses to geographic coordinates and census reporting zones

    With geographic coordinates, data could be aggregated to user-specified custom reporting zones

    DIME files were the major component of the geocoding approach

  • 7/24/2019 ENGgis_V&R

    60/110

    CSRSR NCU

    Address Matching

    Address Matching or Geocoding: A list of addresses is converted to points on a map by referencing them to a special street network.

  • 7/24/2019 ENGgis_V&R

    61/110

    CSRSR NCU

    TIGER Address Range Example

  • 7/24/2019 ENGgis_V&R

    62/110

    CSRSR NCU

    Address Matching

    Address Matching or Geocoding: two input files and one output file

    Input:
      A database (dbf) file that has the address list that needs to be geocoded.
      A geographic base file or reference layer (commonly a street layer) that will spatially reference the address location with the address database (the first input).

    Output:
      A point file that will hold the geocoded address locations, with an attribute file that shows the full address and the matching accuracy.

  • 7/24/2019 ENGgis_V&R

    63/110

    CSRSR NCU

    What do you match?

    Tabular data (text, databases)

    to

    Geographic maps (TIGER streets, ZIP Codes)

  • 7/24/2019 ENGgis_V&R

    64/110

    CSRSR NCU

    Raster Model


  • 7/24/2019 ENGgis_V&R

    65/110

    CSRSR NCU

    Raster Model

    A raster data model uses a grid.

    One grid cell is one unit or holds one attribute.

    Every cell has a value, even if it is missing.

    A cell can hold a number or an index value standing for an attribute.

    A cell has a resolution, given as the cell size in ground units.

  • 7/24/2019 ENGgis_V&R

    66/110

    CSRSR NCU

    Raster Data Structures

    Square grid: equal-length sides
      conceptually simplest
      cells can be recursively divided into cells of the same shape
      4-connected neighborhood (above, below, left, right)
        all neighboring cells are equidistant
      8-connected neighborhood (also includes diagonals)
        neighboring cells are not all equidistant
        the center of a cell on the diagonal is 1.41 units away (square root of 2)

    Rectangular
      commonly occurs for lat/long when projected
      data collected at 1 degree by 1 degree will be varying-sized rectangles

    Triangular (3-sided) and hexagonal (6-sided)
      all adjacent cells and points are equidistant

    Triangulated irregular network (TIN):
      vector model used to represent continuous surfaces (elevation)
      more later under vector
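The 4-connected and 8-connected neighborhoods of a square grid can be sketched as below. Distances are in units of one cell side; the function name `neighbors` and the 0-based row and column numbering are assumptions.

```python
import math

# Offsets (row change, column change) to the neighboring cells.
FOUR = [(-1, 0), (1, 0), (0, -1), (0, 1)]
EIGHT = FOUR + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def neighbors(row, col, n_rows, n_cols, offsets):
    """Return (row, col, center-to-center distance) for each in-grid neighbor."""
    result = []
    for d_row, d_col in offsets:
        r, c = row + d_row, col + d_col
        if 0 <= r < n_rows and 0 <= c < n_cols:
            result.append((r, c, math.hypot(d_row, d_col)))
    return result


for r, c, dist in neighbors(1, 1, 3, 3, EIGHT):
    print((r, c), round(dist, 2))
```

For the center cell of a 3 x 3 grid, the four edge neighbors are 1 unit away and the four diagonal neighbors are about 1.41 units away, which is the asymmetry noted for the 8-connected case.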

  • 7/24/2019 ENGgis_V&R

    67/110

    CSRSR NCU

    Raster Model (Grid, Image)

    [Figure: a grid with rows and columns; the resolution is the pixel size; the highlighted pixel has coordinate (2,9) and attribute 3]

    Choose a raster pixel size 1/2 the length (1/4 the area) of the smallest feature to map (the smallest feature is called the minimum mapping unit, or resel, for resolution element).

  • 7/24/2019 ENGgis_V&R

    68/110

    CSRSR NCU

    Assignment scheme

    The value of a cell may be:
      an average over the cell
      a total within the cell
      max or min or the commonest value in the cell
      the value found at the cell's central point
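The assignment schemes listed above can be compared on a single cell with a short Python sketch. The 3 x 3 block of finer samples is made up, and `assign_cell` and its scheme names are hypothetical.

```python
from collections import Counter


def assign_cell(samples, scheme):
    """Reduce a square block of samples covering one cell to a single value."""
    flat = [value for row in samples for value in row]
    if scheme == "average":
        return sum(flat) / len(flat)
    if scheme == "total":
        return sum(flat)
    if scheme == "max":
        return max(flat)
    if scheme == "min":
        return min(flat)
    if scheme == "commonest":
        return Counter(flat).most_common(1)[0][0]
    if scheme == "center":
        mid = len(samples) // 2  # assumes an odd block size
        return samples[mid][mid]
    raise ValueError(f"unknown scheme: {scheme}")


# Made-up samples falling inside one raster cell.
samples = [
    [1, 1, 2],
    [1, 5, 2],
    [3, 2, 1],
]
for scheme in ("average", "total", "max", "min", "commonest", "center"):
    print(scheme, assign_cell(samples, scheme))
```

The same samples give quite different cell values depending on the scheme, so the scheme should be recorded along with the raster.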

  • 7/24/2019 ENGgis_V&R

    69/110

    CSRSR NCU

    Assignment scheme

    [Figure: line assignment and polygon assignment]

  • 7/24/2019 ENGgis_V&R

    70/110

    CSRSR NCU

    The mixed pixel problem

    [Figure: three grids of cells labeled W, G and E, illustrating three assignment rules: water dominates, winner takes all, edges separate]

  • 7/24/2019 ENGgis_V&R

    71/110

    CSRSR NCU

    Raster Data

    Each cell can be owned by only one feature.

    Raster is easy to understand, easy to read and write, and easy to draw on the screen.

    Spatial analytical operations are faster.

    Grids are poor at representing points, lines and areas, but good at surfaces.

    Grids are a natural representation for scanned or remotely sensed data.

    Grids suffer from the mixed pixel problem.

    Grid compression is easier (techniques used in GIS are run-length encoding and quad trees).

  • 7/24/2019 ENGgis_V&R

    72/110

    CSRSR NCU

    Raster Data Sources

    Scanned maps

    B & W aerial photos

    Color aerial photos

    Satellite images

    Run Length Encoded Compression

  • 7/24/2019 ENGgis_V&R

    73/110

    CSRSR NCU

    Run-Length Encoded Compression

    Uncompressed:

    AAADDDDDDBBBBBBBBBCCCCDDDDDBBBBBAAAA

    Run-Length Encoded:

    3A6D9B4C5D5B4A

    A A A D D D

    D D D B B B

    B B B B B B

    C C C C D D

    D D D B B B

    B B A A A A

    Data Compression
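The run-length example on this slide can be reproduced with a minimal Python sketch. It assumes single-character cell values and the count-then-value notation used above; `rle_encode` and `rle_decode` are illustrative names.

```python
import re
from itertools import groupby


def rle_encode(cells):
    """Collapse each run of identical values into '<count><value>'."""
    return "".join(f"{len(list(run))}{value}" for value, run in groupby(cells))


def rle_decode(encoded):
    """Expand '<count><value>' pairs back into the original sequence."""
    return "".join(
        value * int(count) for count, value in re.findall(r"(\d+)(\D)", encoded)
    )


# The 6 x 6 grid from the slide, read row by row.
rows = ["AAADDD", "DDDBBB", "BBBBBB", "CCCCDD", "DDDBBB", "BBAAAA"]
cells = "".join(rows)

encoded = rle_encode(cells)
print(encoded)
print(rle_decode(encoded) == cells)
```

Runs are allowed to continue across row boundaries here, which is why the six D values at the end of row 1 and the start of row 2 become a single `6D`.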

  • 7/24/2019 ENGgis_V&R

    74/110

    CSRSR NCU

    Run-length Compression (for single layer)

    Full matrix: 162 bytes

    111111122222222223
    111111122222222233
    111111122222222333
    111111222222223333
    111113333333333333
    111113333333333333
    111113333333333333
    111333333333333333
    111333333333333333

    Run length (row): 44 bytes

    1,7,2,17,3,18
    1,7,2,16,3,18
    1,7,2,15,3,18
    1,6,2,14,3,18
    1,5,3,18
    1,5,3,18
    1,5,3,18
    1,3,3,18
    1,3,3,18

    "Value thru column" coding: the 1st number is the value, the 2nd is the last column with that value.
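The "value thru column" coding can be checked against the byte counts on the slide with a small Python sketch. Columns are numbered from 1, as in the slide; the functions `encode_row` and `decode_row` are illustrative, and the matrix is rebuilt from run lengths rather than retyped.

```python
def encode_row(row):
    """Return [value, last_column, value, last_column, ...] (1-based columns)."""
    pairs = []
    for col, value in enumerate(row, start=1):
        if pairs and pairs[-2] == value:
            pairs[-1] = col  # extend the current run
        else:
            pairs += [value, col]  # start a new run
    return pairs


def decode_row(pairs):
    """Rebuild the row from value / last-column pairs."""
    row, start = [], 0
    for value, last_col in zip(pairs[0::2], pairs[1::2]):
        row += [value] * (last_col - start)
        start = last_col
    return row


# The 9 x 18 matrix from the slide, written as (value, run length) per row.
runs = [
    [(1, 7), (2, 10), (3, 1)],
    [(1, 7), (2, 9), (3, 2)],
    [(1, 7), (2, 8), (3, 3)],
    [(1, 6), (2, 8), (3, 4)],
    [(1, 5), (3, 13)],
    [(1, 5), (3, 13)],
    [(1, 5), (3, 13)],
    [(1, 3), (3, 15)],
    [(1, 3), (3, 15)],
]
matrix = [[v for v, n in row for _ in range(n)] for row in runs]
coded = [encode_row(row) for row in matrix]

print(coded[0])
print(sum(len(row) for row in matrix), sum(len(row) for row in coded))
```

Counting one byte per stored number, the full matrix needs 162 values and the coded rows need 44, matching the figures quoted above.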

    Now, GIS packages generally rely on commercial compression routines. PKZIP is the most common general-purpose routine. MrSID (from LizardTech) and ECW (from ER Mapper) are used for images. All these essentially use the same concept. Occasionally, data is still delivered to you in run-length compression, especially in remote sensing applications.

    Run-length compression is lossless, as opposed to lossy, since the original data can be exactly reproduced.

    Basic Quadtree Compression

  • 7/24/2019 ENGgis_V&R

    75/110

    CSRSR NCU

    Basic Quadtree Compression

    G = Gray, W = White

    Quadtree Compression:
    G0, G10, G11, G12, W13, G200, G201, G210, G211, W202, W203, W212, W213, W22, W23, W3

    NOTE: There are many variations of quadtree compression. The one above represents one of the most basic.

    [Figure: basic quadtree structure. The square is split into quadrants 0 and 1 (top) and 2 and 3 (bottom); quadrant 1 is split into 10 to 13; quadrant 2 is split into 20 to 23; quadrants 20 and 21 are split again into 200 to 203 and 210 to 213]
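A basic quadtree encoder of this kind can be sketched recursively in Python. The quadrant numbering (0 upper left, 1 upper right, 2 lower left, 3 lower right) follows the figure labels; the 8 x 8 grid below is an assumption, reconstructed so that it yields the codes listed on the slide, and `quadtree_encode` is a hypothetical name.

```python
def quadtree_encode(grid, row=0, col=0, size=None, path=""):
    """Return leaf codes such as 'G10' for a square grid whose side is a power of 2."""
    if size is None:
        size = len(grid)
    first = grid[row][col]
    uniform = all(
        grid[r][c] == first
        for r in range(row, row + size)
        for c in range(col, col + size)
    )
    if uniform:
        return [f"{first}{path}"]  # one code stands for the whole block
    half = size // 2
    codes = []
    # Quadrant order: 0 upper left, 1 upper right, 2 lower left, 3 lower right.
    corners = [(0, 0), (0, half), (half, 0), (half, half)]
    for digit, (d_row, d_col) in enumerate(corners):
        codes += quadtree_encode(grid, row + d_row, col + d_col, half, path + str(digit))
    return codes


# Assumed 8 x 8 grid (G = gray, W = white) consistent with the slide's codes.
grid = [
    "GGGGGGGG",
    "GGGGGGGG",
    "GGGGGGWW",
    "GGGGGGWW",
    "GGGGWWWW",
    "WWWWWWWW",
    "WWWWWWWW",
    "WWWWWWWW",
]
codes = quadtree_encode(grid)
print(codes)
```

The sketch emits codes depth first, so the order differs from the listing on the slide, but the set of 16 codes is the same. Sixteen codes replace 64 cell values.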

    Data Compression

  • 7/24/2019 ENGgis_V&R

    76/110

    CSRSR NCU

    Data Compression

    Quad Tree Representation (for single layer)

    sides of square grid divided evenly on a recursive basis
      length decreases by half
      # of areas increases fourfold
      area decreases by one fourth

    Resample by combining (e.g. average) the four cell values
      although storage increases if all samples are saved, processing costs can be saved if some operations don't need high resolution

    for nominal or binary data, storage can be saved by using maximum block representation
      all blocks with the same value at any one level in the tree can be stored as a single value

    Layer   Width   Cell Count
    1       1       1
    2       2       4
    3       4       16
    4       8       64
    5       16      256
    6       32      1024

    [Figure: maximum block representation of a binary grid. One quadrant is stored as a single 1 and another as a single zero: I 1,0,1,1; II 1; III 0,0,0,1; IV 0]

    Essentially involves compression applied to both row and column.

    [Figure: resampling example in which groups of four cell values are combined by averaging]
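The resampling step, combining four cell values into one by averaging, can be sketched as a small pyramid builder in Python. The 4 x 4 values are made up, and `downsample` and `pyramid` are hypothetical names; the grid side is assumed to be a power of 2.

```python
def downsample(grid):
    """Average each 2 x 2 block of cells into one cell of the coarser layer."""
    half = len(grid) // 2
    return [
        [
            (
                grid[2 * r][2 * c]
                + grid[2 * r][2 * c + 1]
                + grid[2 * r + 1][2 * c]
                + grid[2 * r + 1][2 * c + 1]
            )
            / 4
            for c in range(half)
        ]
        for r in range(half)
    ]


def pyramid(grid):
    """Return all layers, coarsest (1 x 1) first, finest (the input) last."""
    layers = [grid]
    while len(layers[-1]) > 1:
        layers.append(downsample(layers[-1]))
    return layers[::-1]


# Made-up 4 x 4 grid of values.
grid = [
    [2, 2, 4, 4],
    [2, 4, 4, 2],
    [1, 3, 4, 4],
    [3, 3, 2, 4],
]
for number, layer in enumerate(pyramid(grid), start=1):
    print(number, len(layer), len(layer) ** 2, layer)
```

Each printed line shows the layer number, width and cell count, following the pattern in the Layer / Width / Cell Count table: the width doubles and the cell count quadruples at every layer.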

    Raster Array Representations for multiple layers

  • 7/24/2019 ENGgis_V&R

    77/110

    CSRSR NCU

    Raster Array Representations for multiple layers

    How to organize into a one-dimensional data stream for computer storage & processing?

    Band Sequential (BSQ)
      each characteristic in a separate file (elevation file, temperature file, etc.)
      good for compression
      good if focus on one characteristic
      bad if focus on one area
      File 1: Veg A,B,B,B
      File 2: Soil I,II,III,IV
      File 3: El. 120,140,150,160

    Band Interleaved by Pixel (BIP)
      all measurements for a pixel grouped together
      good if focus on multiple characteristics of geographical area
      bad if want to remove or add a layer
      A,I,120, B,II,140, B,III,150, B,IV,160

    Band Interleaved by Line (BIL)
      rows follow each other for each characteristic
      A,B,I,II,120,140, B,B,III,IV,150,160

    [Figure: three 2 x 2 layers: Veg, Soil and Elevation]

    Note that we start in lower left. Upper left is an alternative.
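The three orderings can be reproduced from the slide's example with a short Python sketch. Each layer is written row by row starting from the lower-left cell, as the note says; the dict layout and the function names `bsq`, `bip` and `bil` are assumptions.

```python
# Three 2 x 2 layers, listed row by row starting from the lower-left cell.
layers = {
    "veg": [["A", "B"], ["B", "B"]],
    "soil": [["I", "II"], ["III", "IV"]],
    "elev": [[120, 140], [150, 160]],
}


def bsq(layers):
    """Band sequential: one complete stream per characteristic."""
    return {name: [v for row in grid for v in row] for name, grid in layers.items()}


def bip(layers):
    """Band interleaved by pixel: all measurements for a pixel together."""
    grids = list(layers.values())
    n_rows, n_cols = len(grids[0]), len(grids[0][0])
    return [g[r][c] for r in range(n_rows) for c in range(n_cols) for g in grids]


def bil(layers):
    """Band interleaved by line: row 1 of every layer, then row 2, and so on."""
    grids = list(layers.values())
    return [v for r in range(len(grids[0])) for g in grids for v in g[r]]


print(bsq(layers))
print(bip(layers))
print(bil(layers))
```

The three functions hold the same twelve values; only the order in the stream changes, which is what makes one layout better for per-band work and another better for per-pixel work.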

  • 7/24/2019 ENGgis_V&R

    78/110

    File Formats for Vector Spatial Data

  • 7/24/2019 ENGgis_V&R

    79/110

    CSRSR NCU

    Arc Export

    Arc Export is a transfer format, either ASCII or

    compressed into binary, used to transfer files between

    different versions of ARC/INFO. It is undocumented and

    will work only with ESRI products.

    ARC/INFO Coverages

    An ARC/INFO "coverage" is a set of internal binary files

    used by ARC/INFO, a GIS program. This file format is

    proprietary and not readily usable by other programs.

    File Formats for Vector Spatial Data


  • 7/24/2019 ENGgis_V&R

    80/110

    CSRSR NCU

    File Formats for Vector Spatial Data

    ArcView Shape

    The shapefile format defines the geometry and attributes of geographically referenced features in as many as five files with specific file extensions that must be stored in the same project workspace. They are:

    .shp - the file that stores the feature geometry. (required)

    .shx - the file that stores the index of the feature geometry. (required)

    .dbf - the dBASE file that stores the attribute information of features. (required)

    .sbn and .sbx - the files that store the spatial index of the features. (optional)

  • 7/24/2019 ENGgis_V&R

    81/110

    CSRSR NCU

    ARC/INFO vs. ArcView

    ARC/INFO is a topologically based hybrid system

    ArcView is a file-based, non-topological, pseudo object-oriented graphic data structure

  • 7/24/2019 ENGgis_V&R

    82/110

    CSRSR NCU

    File Formats for Vector Spatial Data

    ArcGIS Geodatabase

    ArcGIS has a well-defined model for working with data. This generic model, called the geodatabase (short for geographic database), defines all the types of data that can be used in ArcGIS (for example, features, rasters, addresses, and survey measurements) and how they are represented, accessed, stored, managed and processed. The geodatabase is a common framework shared by all ArcGIS products and applications.

  • 7/24/2019 ENGgis_V&R

    83/110

    CSRSR NCU

    The geodatabase offers you the ability to

    Handle rich data types.

    Apply sophisticated rules and relationships.

    Access large volumes of geographic data stored in both files

    and databases.

    ArcGIS Geodatabase

    File Formats for Vector Spatial Data


  • 7/24/2019 ENGgis_V&R

    84/110

    CSRSR NCU

    File Formats for Vector Spatial Data

    ArcGIS Geodatabase

    ArcGIS supports geographic information stored as a collection of files in a file system or as a collection of tables in a relational database management system (RDBMS).

    It works with several well-known file-based data set types, such as coverages, shapefiles, grids, images, and triangulated irregular networks (TINs).

    It manages the same types of geographic information in an RDBMS such as DB2, Informix, Oracle, SQL Server, or Microsoft Access.

  • 7/24/2019 ENGgis_V&R

    85/110

    CSRSR NCU

    File Formats for Vector Spatial Data

    Geodatabase Data Management

    Two categories:

    Personal Geodatabase
      Single user editing
      Stored in MS Access
      Size limit of 2 GB

    ArcSDE Geodatabase
      Enterprise
      Supports multiuser editing via versioning
      Requires ArcEditor or ArcInfo Editor to edit

  • 7/24/2019 ENGgis_V&R

    86/110

    CSRSR NCU

    File Formats for Vector Spatial Data

    Comparison of the file and geodatabase implementations

    File-Based Data Sets
      Coverages
      Shapefiles
      Grids
      TINs
      Images
      Vector Product Format files
      Computer-aided design files
      Geography markup language
      Tables
      XML

    Geodatabase
      DB2 with its Spatial type
      Informix with its Spatial type
      SQL Server
      Oracle
      Oracle with Spatial or Locator
      Personal geodatabases (Microsoft Access)

  • 7/24/2019 ENGgis_V&R

    87/110

    CSRSR NCU

    AutoCAD Drawing Files (DWG)

    DWG is the internal, proprietary format used in AutoCAD software, which is a computer-aided design/drafting (CAD) program. Despite its proprietary nature, AutoCAD can convert any DWG file to a DXF file (described below) without loss of graphic information. As with DXF files, there are a number of ways to store attribute information in DWG files. The emerging standard is one that uses Extended Entity Data (EED) to link attributes, but many others are possible. However, the lack of one standard for linking attributes can cause problems when data is transferred between systems.

  • 7/24/2019 ENGgis_V&R

    88/110

    CSRSR NCU

    Autodesk's Drawing Interchange File (DXF) Format

    DXF is probably the most widely used vector data transfer format, and a file in DXF format offers some very strong advantages. It contains very complete display information, and almost every graphics program can read it. However, there are several different ways to store attribute information in DXF and to link DXF entities to external attributes. Because there are no attribute standards, many programs that claim to read DXF files still do not import attribute information properly.

    Digital Line Graphs (DLG)

    DLG, a transfer format used by the US Geological Survey (USGS), depicts vector information portrayed on printed paper maps. It carries very accurate coordinate information and sophisticated feature-classification information but no other attribute data. DLG does not include any display information. The DLG standard is significant because the USGS and other US government agencies have used it to publish large numbers of digital maps.

    MapInfo Data Transfer Files (MIF/MID)

    MIF/MID is a transfer standard used by MapInfo, a desktop mapping system. It carries all three types of GIS information: geographic, attribute, and display. Attribute links are implicit in the file format.

    MapInfo Map Files

    MapInfo has its own internal binary format, known as a map file. It is undocumented and proprietary, so it cannot be used outside a MapInfo system.

    MicroStation Design Files (DGN)

    DGN is the internal format used by Bentley Systems Inc.'s MicroStation, a CAD program. It is well documented and standardized, so it may also be used as a transfer standard. DGN files contain detailed display information. The most common way to store attributes is to place them in an external database file and record links in the MSLINK field, a data item carried for each element in the DGN file.

    File Formats for Raster Spatial Data

    The generic raster data model is actually implemented in several different computer file formats:

    GRID is ESRI's proprietary format for storing and processing raster data

    Standard industry formats for image data, such as JPEG, TIFF, and MrSID, can be used to display raster data, but not for analysis (they must be converted to GRID)

    Georeferencing information is required to display images with mapped vector data

    This requires an accompanying world file, which provides locational information

    Image     Image File    World File
    TIFF      image.tif     image.tfw
    Bitmap    image.bmp     image.bpw
    BIL       image.bil     image.blw
    JPEG      image.jpg     image.jgw
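    As a rough illustration of the locational information a world file provides, the sketch below parses the six numbers of a world file and maps a pixel (column, row) to map coordinates. The six-line order (A, D, B, E, C, F) follows the common ESRI world file convention; the names WorldFile, parse_world_file, and pixel_to_map, and the sample values, are hypothetical and not part of the original slides.

```python
from typing import NamedTuple, Tuple


class WorldFile(NamedTuple):
    """The six world file parameters, in the order they appear in the file."""
    a: float  # pixel size in the x direction (map units per pixel)
    d: float  # rotation term (usually 0)
    b: float  # rotation term (usually 0)
    e: float  # pixel size in the y direction (normally negative)
    c: float  # map x of the center of the upper-left pixel
    f: float  # map y of the center of the upper-left pixel


def parse_world_file(text: str) -> WorldFile:
    """Parse the text content of a world file (one number per line)."""
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError(f"expected 6 values, found {len(values)}")
    return WorldFile(*values)


def pixel_to_map(wf: WorldFile, col: float, row: float) -> Tuple[float, float]:
    """Apply the affine transform from pixel (col, row) to map (x, y)."""
    x = wf.a * col + wf.b * row + wf.c
    y = wf.d * col + wf.e * row + wf.f
    return x, y


# Made-up example: 30 m pixels, no rotation, arbitrary upper-left corner.
sample_text = "30.0\n0.0\n0.0\n-30.0\n500015.0\n4200985.0\n"
world = parse_world_file(sample_text)
print(pixel_to_map(world, 0, 0))    # upper-left pixel center
print(pixel_to_map(world, 10, 20))  # 10 columns right, 20 rows down
```

    Because the y pixel size is negative, row numbers increase downward while map y decreases, which is the usual arrangement for north-up images.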

    Viewing File

    Most importantly, managing file information includes organizing it so that people can use it logically without having to know anything about its physical structure. The difference between logical and physical:

    LOGICAL VIEW

    Focuses on how you need to arrange and access information to meet your particular needs.

    PHYSICAL VIEW

    Deals with how information is physically arranged, stored, and accessed on some type of secondary storage device.

    Logical View and Physical View (Vector Data)

    Logical View - Shapefile (*.shp)

    Physical View - Shapefile (*.shp)
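    To make the physical view a little more concrete, here is a minimal sketch that decodes the fixed 100-byte header at the start of a .shp file. The byte layout follows the published ESRI shapefile specification (big-endian file code and length, little-endian version, shape type, and bounding box); the function name read_shp_header and the synthetic header built for the demonstration are assumptions for illustration only.

```python
import struct


def read_shp_header(header: bytes) -> dict:
    """Decode the main header of a .shp file from its first 100 bytes."""
    if len(header) < 100:
        raise ValueError("a shapefile header is 100 bytes long")
    # Bytes 0-3: file code, big-endian; always 9994 for a shapefile.
    (file_code,) = struct.unpack(">i", header[0:4])
    if file_code != 9994:
        raise ValueError(f"unexpected file code {file_code}")
    # Bytes 24-27: total file length in 16-bit words, big-endian.
    (length_words,) = struct.unpack(">i", header[24:28])
    # Bytes 28-35: version and shape type, little-endian.
    version, shape_type = struct.unpack("<ii", header[28:36])
    # Bytes 36-67: bounding box as four little-endian doubles.
    xmin, ymin, xmax, ymax = struct.unpack("<4d", header[36:68])
    return {
        "file_length_bytes": length_words * 2,
        "version": version,
        "shape_type": shape_type,  # e.g. 1 = point, 3 = polyline, 5 = polygon
        "bbox": (xmin, ymin, xmax, ymax),
    }


# Synthetic header for demonstration: an empty polygon shapefile.
demo_header = (
    struct.pack(">i", 9994)
    + bytes(20)                       # five unused integers
    + struct.pack(">i", 50)           # 50 words = 100 bytes
    + struct.pack("<ii", 1000, 5)     # version 1000, shape type 5 (polygon)
    + struct.pack("<8d", 0.0, 0.0, 10.0, 20.0, 0.0, 0.0, 0.0, 0.0)
)
shp_info = read_shp_header(demo_header)
print(shp_info)
```

    A GIS user working in the logical view sees only features and attributes; the byte offsets and mixed byte orders above belong to the physical view.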

    Logical View and Physical View (Raster Data)

    Logical View - Imagine file (*.img)

    Physical View - Imagine file (*.img)

    Metadata

    Metadata is data about data

    Metadata allows a producer to fully describe a dataset so that users can understand the assumptions and limitations and evaluate the dataset's applicability for their intended use.

    What is Metadata?

    Metadata should answer the following questions:

    Who collected the original data, and who is responsible for the dataset?

    What is the purpose of the dataset?

    What will users find in this dataset?

    What elements are mandatory and what are optional?

    What terminology standards and scales have been used in this dataset?

    How can users access the dataset?

    What geographic area(s) does this dataset cover?

    What type of transfer protocol is needed to receive the dataset?

    Who? What? When? Where? How? Cost? Purpose?

    Resume of spatial data

    Need for Metadata Standards

    Metadata standards will promote:

    the proper use and effective retrieval of geospatial data

    the organization and management of geographic data

    the provision of information about an organization's database to others

    Essential Elements of Metadata for Spatial Information

    1. Identification [IDEN]

    2. Data Quality [QUAL]

    3. Spatial Data Organization [SDOR]

    4. Spatial Reference [SREF]

    5. Distribution [DIST]

    6. Entity and Attributes Information [ENTI]

    7. Metadata Reference [REFE]

    [IDEN] Identification

    [TIT] Title: What is the name of the data set?

    [AUT] Author: Who developed the data set?

    [COV] Area Coverage: What geographic area does it cover?

    [THE] Themes: What themes of information does it include?

    [CUR] Currentness: How current are the data?

    [RES] Restriction: Are there restrictions on accessing or using the data?

    [QUAL] Data Quality

    [ACC] Accuracy: What is the positional and attribute accuracy?

    [COM] Completeness: Are the data complete?

    [LCO] Logical Consistency: Was the consistency of the data verified?

    [LIN] Lineage: What data were used to create the data set, and what processes were applied to those sources?

    [SDOR] Spatial Data Organization

    [VEC] Vector: Has the vector model been used to encode the spatial data?

    [RAS] Raster: Has the raster model been used to encode the spatial data?

    [TNE] Type and Number of Elements: What type and how many spatial objects are there?

    [SREF] Spatial Reference

    [PRO] Projection: What map projection method was used to represent the location of spatial objects?

    [LOL] Longitude/Latitude: Are coordinate locations encoded using longitude and latitude?

    [GRI] Grid System: Is a grid system such as the State Plane Coordinate System used?

    [DAT] Datum: What horizontal and vertical datums are used?

    [COO] Coordinate System: What parameters should be used to convert the data to another coordinate system?

    [DIST] Distribution

    [DIS] Distributor: From whom can one obtain the data?

    [FOR] Formats: What formats are available?

    [MED] Media: What media are available?

    [ONL] Online: Are the data available online?

    [PRI] Price: What is the price of the data?

    [ENTI] Entity and Attributes Information

    [FEA] Features: What geographic features are included (roads, houses, elevation, temperature)?

    [ATT] Attributes: What characteristics of those features are included (lengths, widths, heights)?

    [AVA] Attribute Values: What parameters are used to represent the characteristics of features?

    [REFE] Metadata Reference

    [CUR] Currentness of Metadata: When were the metadata compiled?

    [RES] Responsible Party: By whom were the metadata compiled?

    [CIT] Citation: Recommended reference to be used for the dataset.
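    One way to picture the seven essential elements is as a nested record keyed by the bracketed codes. The sketch below is only an illustration under stated assumptions: the dictionary layout, the function name missing_elements, and the sample record are hypothetical and not prescribed by any metadata standard, and placing the format, media, online, and price elements under Distribution is an assumption made here.

```python
# Section code -> element codes, taken from the bracketed labels in the slides.
ESSENTIAL_ELEMENTS = {
    "IDEN": ["TIT", "AUT", "COV", "THE", "CUR", "RES"],
    "QUAL": ["ACC", "COM", "LCO", "LIN"],
    "SDOR": ["VEC", "RAS", "TNE"],
    "SREF": ["PRO", "LOL", "GRI", "DAT", "COO"],
    "DIST": ["DIS", "FOR", "MED", "ONL", "PRI"],  # grouping assumed
    "ENTI": ["FEA", "ATT", "AVA"],
    "REFE": ["CUR", "RES", "CIT"],
}


def missing_elements(record: dict) -> list:
    """Return 'SECTION.ELEMENT' codes that are absent or empty in a record."""
    missing = []
    for section, elements in ESSENTIAL_ELEMENTS.items():
        section_data = record.get(section, {})
        for element in elements:
            if not section_data.get(element):
                missing.append(f"{section}.{element}")
    return missing


# Made-up, deliberately incomplete record for demonstration.
sample_record = {
    "IDEN": {
        "TIT": "Example land-use map",
        "AUT": "Example mapping agency",
        "COV": "Example county",
        "THE": "Land use",
        "CUR": "2001",
        "RES": "None",
    },
    "SDOR": {"VEC": "yes", "RAS": "no", "TNE": "1200 polygons"},
}

gaps = missing_elements(sample_record)
print(f"{len(gaps)} elements missing, for example: {gaps[:3]}")
```

    Nesting by section keeps the reused codes apart: [CUR] and [RES] appear under both Identification and Metadata Reference with different meanings.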

    Currently, a number of metadata standards exist.

    USA: Content Standard for Digital Geospatial Metadata, Federal Geographic Data Committee (FGDC)

    International Organization for Standardization (ISO): ISO CD 15046 - Part 15: Geographic Information - Metadata

    The Open GIS Consortium (OGC), a private-sector initiative, was formed in 1994 to develop software specifications that advance geoprocessing interoperability across the GIS industry.

    OGC has been working very closely with ISO/TC 211 to identify the overlap and division of labor in their mutual work programs. The formation of the ISO/TC 211 - OGC coordination group is a result of such efforts.

    http://www.fgdc.gov/metadata/metadata.html

    Metadata Example (Image.Lan File)

    Position    Field           Type
    Byte 0      HDWord          Char[6]
    Byte 6      IPACK (Bits)    Short int
    Byte 8      NBands          Short int
    Byte 10     UnUsed          Char[6]
    Byte 16     IColumn         int
    Byte 20     IRow            int
    Byte 24     XStart          int
    Byte 28     YStart          int
    Byte 32     UnUsed          Char[56]
    Byte 88     MapType         Short int
    Byte 90     NClass          Short int
    Byte 92     UnUsed          Char[14]
    Byte 106    IAUTYP          Short int
    Byte 108    ACRE            float
    Byte 112    XMap            float
    Byte 116    YMap            float
    Byte 120    XCell           float
    Byte 124    YCell           float
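    The header layout above can be read directly with a fixed-size binary structure. The following is a minimal sketch, assuming 2-byte short ints, 4-byte ints and floats, and little-endian byte order (the byte order is not stated in the table and is an assumption here); the names LAN_HEADER, LanHeader, and parse_lan_header, and the sample values, are hypothetical.

```python
import struct
from typing import NamedTuple

# 6s = HDWord, 2h = IPACK and NBands, 6x = unused, 4i = IColumn..YStart,
# 56x = unused, 2h = MapType and NClass, 14x = unused, h = IAUTYP,
# 5f = ACRE, XMap, YMap, XCell, YCell. "<" means little-endian (assumed)
# with no alignment padding, so the offsets match the table exactly.
LAN_HEADER = struct.Struct("<6s2h6x4i56x2h14xh5f")


class LanHeader(NamedTuple):
    hdword: bytes
    ipack: int
    nbands: int
    icolumn: int
    irow: int
    xstart: int
    ystart: int
    maptype: int
    nclass: int
    iautyp: int
    acre: float
    xmap: float
    ymap: float
    xcell: float
    ycell: float


def parse_lan_header(data: bytes) -> LanHeader:
    """Decode the first 128 bytes of a file laid out as in the table."""
    if len(data) < LAN_HEADER.size:
        raise ValueError(f"need at least {LAN_HEADER.size} bytes")
    return LanHeader._make(LAN_HEADER.unpack(data[:LAN_HEADER.size]))


# Synthetic header with made-up values, for demonstration only.
raw_header = LAN_HEADER.pack(
    b"HEAD74", 0, 3,          # header word, pack type, 3 bands
    512, 400, 1, 1,           # columns, rows, x start, y start
    0, 0, 0,                  # map type, class count, area unit type
    0.0, 500000.0, 4200000.0, 30.0, 30.0,
)
lan = parse_lan_header(raw_header)
print(lan.nbands, lan.icolumn, lan.irow, lan.xcell)
```

    The header is itself metadata: it records the number of bands, the image dimensions, and the map coordinates and cell size needed to interpret the pixel values that follow.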

    Data Exchange

    Spatial Data Transfer Standard (SDTS)

    SDTS, a new transfer format developed by the US government, was designed to handle all types of geographic data. SDTS can be either binary or ASCII but is generally binary. Virtually all geographic concepts can be encoded in SDTS, including coordinate information, complex attribute information, and display information. This versatility causes a corresponding increase in complexity. To simplify things, several standard subsets of SDTS have been adopted. The first of these, the Topological Vector Profile (TVP), is used to store certain types of vector maps. SDTS can also be used for raster information. Not much data is available in SDTS format at this time, nor do many software systems support it. However, it will be the foundation of the US National Spatial Data Infrastructure (NSDI). Its importance will increase as
