Data classification methods in GIS - University of Thessaly


Laboratory of Urban and Regional Planning

Department of Architecture – School of Engineering – University of Patras

University of Thessaly, Department of Planning and Regional Development

Databases and Geographic Information Systems

Bases de Données - SIG

Vassilis PAPPAS, Associate Professor


Master Franco – Hellenique

“POpulation, DEveloppement, PROspective”

Volos, 2013

Data classification methods in GIS

The most common methods

• DRAWING FEATURES

• DRAWING CATEGORIES OF FEATURES

• DRAWING QUANTITIES OF FEATURES

Categories in Urban Planning: LAND USE, BUILDING CONSTRUCTION TYPE, BUILDING QUALITY, …

Quantities in Urban Planning: POPULATION, LAND VALUES, …

Data Classification methods

A feature layer is a reference to a feature class and has an associated

drawing method.

A layer lets us assign any type of drawing method to a geographic dataset.
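The separation between a dataset and its drawing method can be sketched in a few lines of Python. This is only an illustration: the class names FeatureClass and FeatureLayer and the symbol strings are hypothetical and are not the ArcGIS API. The two sample rows reuse BLOCK_ID and land-use values from the attribute table shown in this deck.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FeatureClass:
    """A geographic dataset: attribute rows only, no drawing instructions."""
    name: str
    rows: list[dict]


@dataclass
class FeatureLayer:
    """A reference to a feature class plus the drawing method chosen for it."""
    source: FeatureClass
    drawing_method: Callable[[dict], str]  # maps one feature to a symbol name

    def symbols(self) -> list[str]:
        return [self.drawing_method(row) for row in self.source.rows]


# Hypothetical two-row dataset (values taken from the attribute table).
blocks = FeatureClass("blocks", [
    {"BLOCK_ID": 1, "LAND_USE": "gk2"},
    {"BLOCK_ID": 167, "LAND_USE": "ak"},
])

# Two layers over the same dataset, each with its own drawing method.
single_symbol = FeatureLayer(blocks, lambda row: "grey fill")
by_land_use = FeatureLayer(blocks, lambda row: f"fill for {row['LAND_USE']}")

print(single_symbol.symbols())
print(by_land_use.symbols())
```

The same dataset is drawn in two ways without being changed, which is the point of keeping the drawing method in the layer.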

Data Classification methods

Geographic datasets do not contain the instructions for drawing the data

BLOCK_ID  AREA         PERIMETER   ADEQUACY  COVER (%)  F.A.R.  P. HEIGHT  P. LAND USE
1         3954,227000  254,311000  20-600    40         0,6     7,5        gk2
57        1896,508000  174,281100  20-600    40         0,6     7,5        gk1
167       2647,750000  212,230700  12-300    40         0,6     7,5        ak
168       316,265600   71,350790   15-500    70         0,8     8,5        gk
169       3702,945000  304,357800  15-500    70         0,8     8,5        ak
170       846,890600   129,475100  15-500    70         0,8     8,5        ak
171       1096,961000  132,914700  15-500    70         0,8     8,5        ta
172       343,187500   75,198450   15-500    70         0,8     8,5        gk
242       2578,617000  209,745900  25-1000   60         0,6     10,5       pk
243       4661,258000  348,084000  10-300    60         0,6     10,5       ak
244       3522,188000  341,614400  10-300    60         0,6     10,5       gk
245       385,757800   78,804500   15-500    70         0,8     8,5        gk
246       1125,063000  135,727400  15-500    70         0,8     8,5        gk
247       3006,898000  265,786300  15-500    70         0,8     8,5        xo
252       868,085900   146,842800  10-300    60         0,6     10,5       gk
253       343,929700   76,586040   10-300    60         0,6     10,5       ak
254       890,703100   136,971300  10-300    60         0,6     10,5       ak
257       2051,953000  219,292200  10-300    60         0,6     10,5       gk
258       1912,266000  192,092900  25-1000   60         0,6     10,5       ak
259       1174,656000  137,505000  25-1000   60         0,6     10,5       ak
264       1768,484000  170,005800  15-500    70         0,8     8,5        gk
265       1376,539000  150,555600  10-300    60         0,6     10,5       gk
266       1320,211000  150,161500  25-1000   60         0,6     10,5       ak
292       1157,344000  141,667200  10-300    60         0,6     10,5       gk
293       892,242200   121,613000  25-1000   60         0,6     10,5       ak
295       755,593800   124,848200  15-500    70         0,8     8,5        gk
296       730,953100   116,225200  10-300    60         0,6     10,5       ak
297       2548,492000  211,246600  25-1000   60         0,6     10,5       gk
300       2910,672000  222,256800  10-300    60         0,6     10,5       ak

City of Patras, Greece: part of a digital map and attribute table for the official building census (2001), 41.738 buildings

Drawing features

Maps present descriptive information about geographic features

using symbols and labels.

• Points – Marker symbols (Character, Simple, Arrow, Picture, Multilayer)

• Lines – Line symbols (Cartographic, Hash, Marker, Multilayer)

• Areas – Fill symbols (Simple, Line, Marker, Gradient, Picture, Multilayer)

The simplest way to draw a feature layer is to draw all the features

with the same symbol

Drawing features

All road axes have

the same line symbol

All buildings have

the same fill symbol

All blocks have

the same fill symbol

Single symbol

Drawing features

Each building has a different fill symbol according to its coverage

Each block has a different fill symbol according to its size

In practice, this map does not give us any useful information

Unique values

Drawing features

All buildings are grouped into three classes according to their coverage

All blocks have the same fill symbol as a background

But how do we define what is “small”, “medium” or “big”?

Group of Quantities (quantile)

Data in Urban Planning

Numerical data (quantitative data)

Population density

Building heights

Building coverage

etc

Textual data (qualitative data)

Land use

Building construction type

Building condition

etc

Two main types of data

Classification method

Categories based on their

semantics

Land use

Categories in Urban Planning

Grouping Categories

CODE LAND USE

000 NOT URBAN LAND USE

010 VACANT PLOT

020 ABANDONED BUILDING

030 UNDER CONSTRUCTION

040 CULTIVATED LAND

050 TREE LANDS

060 FREE LANDS

070

080

090 OTHER NOT URBAN LAND USE

100 RESIDENCE

110 PRIMARY RESIDENCE

111 SINGLE FAMILY

112 SINGLE FAMILY (WITH GARDEN)

113 DUPLEX FAMILY

114 DUPLEX FAMILY (WITH GARDEN)

115 GROUP QUARTERS

116 MULTI-FAMILY

117

118

119 RESIDENCE, OTHER TYPE

120 SECONDARY RESIDENCE

121 COUNTRY HOUSE

122 COUNTRY MULTI-STOREY HOUSE

123

Tree-structured coding system

Patterns may be easier to see through generalization, that is, by reducing many categories to a few.

The process of grouping categories is based on their meaning (semantics) and on the coding system used.

It is easier to read a thematic map with fewer than seven (7) categories (Mitchell A., 1999)
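With a tree-structured code, grouping can be done on the code itself. The sketch below assumes that a parent category is obtained by keeping the leading digits and zeroing the rest (111 → 110 → 100), which is how the land use table above appears to be organized. The function name generalize and the sample list of codes are hypothetical.

```python
from collections import Counter


def generalize(code: str, level: int) -> str:
    """Keep the first `level` digits of a 3-digit code and zero the rest."""
    return code[:level].ljust(3, "0")


# Hypothetical land use codes observed for six features.
codes = ["111", "112", "116", "121", "010", "040"]

# Two digits kept: 111, 112 and 116 all fall under 110 PRIMARY RESIDENCE.
print(Counter(generalize(c, 2) for c in codes))

# One digit kept: only 100 RESIDENCE and 000 NOT URBAN LAND USE remain.
print(Counter(generalize(c, 1) for c in codes))
```

Moving one level up the tree is enough to bring six detailed categories down to two.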

Classification methods for numerical data

It is easier to read a thematic map with fewer than seven (7) categories (Mitchell A., 1999)

Sturges' rule of thumb gives a number of classes (k) that works well in practice:

k = 1 + 3.322 × log10(n), where n = number of cases

How many categories for these records (cases)?

(Attribute table as shown earlier: City of Patras, Greece, part of a digital map and attribute table for the official building census (2001), 41.738 buildings)
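The question can be answered by evaluating Sturges' formula directly. The sketch below tries two values of n: the 29 records of the table excerpt, and the full census, reading "41.738" as forty-one thousand seven hundred thirty-eight buildings (an assumption about the thousands separator). The function name sturges_classes is hypothetical.

```python
import math


def sturges_classes(n: int) -> float:
    """Number of classes k = 1 + 3.322 * log10(n) for n cases."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 + 3.322 * math.log10(n)


for n in (29, 41738):
    k = sturges_classes(n)
    # Show the raw value; how to round it is left to the map author.
    print(f"n = {n}: k = {k:.2f}")
```

For the 29 records the formula gives a value just under 6. For the whole census it gives a little over 16, well above the seven categories suggested for readability, so the formula is a starting point rather than a rule.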

Classification methods for numerical data

• A classification method subdivides a group of attribute data into classes according to the desired criteria.

• Classes group features with similar attribute values by assigning them the same symbol.

• Each class has a lower and an upper numeric limit (class breaks: the minimum and maximum for that class).

• By changing the classes we create very different maps, which change the way we read and interpret the specific spatial unit (reference area).

• Apart from the “technocratic” approach (the classification methods that follow), a crucial factor in defining classes is a very good knowledge of the specific spatial variables: their behaviour, distribution and substantive meaning (thematic approach).

Classification method: Manual

Normally we use this method if we want to emphasize particular patterns

by placing breaks at important threshold values,

or if we need to comply with a particular standard that demands certain

class breaks.

City of Patras, Greece, part of central area: population densities (inh./ha)
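Once class breaks are chosen by hand, assigning a feature to a class is a simple lookup. The sketch below treats each break as the upper limit of a class and uses the standard library function bisect_left. The thresholds and density values are hypothetical, not taken from the Patras map.

```python
from bisect import bisect_left


def classify(value: float, upper_limits: list[float]) -> int:
    """Return the 0-based class index of value.

    Class i covers values above upper_limits[i - 1] and up to and
    including upper_limits[i]. Values above the last limit fall into
    one extra, open-ended class.
    """
    return bisect_left(upper_limits, value)


manual_breaks = [100, 200, 400]        # hypothetical thresholds (inh./ha)
densities = [35, 100, 180, 260, 410]   # hypothetical block densities

print([classify(d, manual_breaks) for d in densities])
```

The same classify function works for the breaks produced by any of the other methods, since they all end up as a sorted list of limits.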

Classification method: Equal interval

This method divides the attribute range into equally sized classes, and is best

applied to familiar data ranges such as percentages.

Normally we use this method to emphasize the relative amount of attribute

values compared to other values.

City of Patras, Greece, part of central area: population densities (inh./ha)
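A minimal sketch of equal interval breaks: the range between the minimum and the maximum is cut into k classes of the same width. The function name and the percentage example are illustrative assumptions.

```python
def equal_interval_breaks(values, k):
    """Upper limits of k equally wide classes spanning min..max of values."""
    if k < 1:
        raise ValueError("k must be at least 1")
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(1, k + 1)]


# A percentage attribute ranging from 0 to 100, in four classes.
print(equal_interval_breaks([0, 100], 4))
```

Only the minimum and maximum matter here, so a single extreme value can leave most classes nearly empty.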

Classification method: Quantile

Each class will contain an equal number of features. This method is well

suited to linearly distributed data.

City of Patras, Greece, part of central area: population densities (inh./ha)
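Quantile breaks can be sketched by sorting the values and cutting the sorted list into k parts of (nearly) equal size. The function name and the sample densities are hypothetical, and tied values, which can make the counts unequal, are not handled.

```python
def quantile_breaks(values, k):
    """Upper limits of k classes that hold (nearly) equal numbers of values."""
    ordered = sorted(values)
    n = len(ordered)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of values")
    breaks = []
    for i in range(1, k + 1):
        rank = (i * n + k - 1) // k   # integer ceiling of i * n / k
        breaks.append(ordered[rank - 1])
    return breaks


# Hypothetical, strongly skewed densities: three values per class.
densities = [12, 15, 18, 40, 55, 60, 200, 350, 900]
print(quantile_breaks(densities, 3))
```

With skewed data like this, the top class has to stretch from 200 to 900 to collect its share of features, which is why the method suits evenly spread data better.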

Classification method: Natural Breaks

Classes are based on natural groupings of data values. In this method, data values are arranged in order. The class breaks are determined statistically by finding adjacent feature pairs between which there is a relatively large difference in data value, so that the variation within each class (its internal standard deviation) is minimized. This is the default classification method in ArcGIS 9.2.

City of Patras, Greece, part of central area: population densities (inh./ha)
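The idea of minimizing the variation inside each class can be shown with an exhaustive search: try every way of cutting the sorted values into k groups and keep the one with the smallest total of squared deviations from the class means. This is a sketch for small lists only and is not the ArcGIS implementation. The function names and sample values are hypothetical.

```python
from itertools import combinations


def sum_sq_dev(group):
    """Sum of squared deviations of a group from its own mean."""
    mean = sum(group) / len(group)
    return sum((v - mean) ** 2 for v in group)


def natural_breaks(values, k):
    """Upper limits of the k classes with the least within-class variation."""
    ordered = sorted(values)
    n = len(ordered)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of values")
    best_cost, best_bounds = float("inf"), None
    # Each candidate is a set of k - 1 cut positions in the sorted list.
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0, *cuts, n)
        cost = sum(sum_sq_dev(ordered[a:b])
                   for a, b in zip(bounds, bounds[1:]))
        if cost < best_cost:
            best_cost, best_bounds = cost, bounds
    return [ordered[b - 1] for b in best_bounds[1:]]


values = [10, 12, 11, 50, 52, 49, 120, 125]   # three visible clusters
print(natural_breaks(values, 3))
```

The breaks land in the gaps between the three clusters. The number of candidate splits grows very quickly with the number of values, so real datasets need a more efficient optimization than this brute-force search.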

Classification method: Geometrical interval

This method creates class ranges based on intervals that form a geometric sequence based on a multiplier (and its inverse). It creates these intervals by minimizing the sum of squares of the number of elements per class. This ensures that each interval has an appropriate number of values within it and that the intervals are fairly similar. This algorithm was specifically designed to accommodate continuous data. It produces a result that is visually appealing and cartographically comprehensive.

City of Patras, Greece, part of central area: population densities (inh./ha)
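The sketch below shows only the geometric part of the idea: class widths that grow by a constant multiplier. The multiplier is passed in as a parameter (ratio, a hypothetical name), whereas the method described above chooses it by optimization and may switch to its inverse. Neither of those steps is reproduced here.

```python
def geometric_breaks(lo, hi, k, ratio):
    """Upper limits of k classes whose widths form a geometric sequence.

    The widths are w, w*ratio, w*ratio**2, ... and add up to hi - lo.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if ratio <= 0 or ratio == 1:
        raise ValueError("ratio must be positive and different from 1")
    # Sum of the geometric series gives the first width.
    width = (hi - lo) * (ratio - 1) / (ratio ** k - 1)
    breaks, edge = [], lo
    for _ in range(k):
        edge += width
        breaks.append(edge)
        width *= ratio
    return breaks


# Hypothetical range 0..150 in four classes, each twice as wide as the last.
print(geometric_breaks(0, 150, 4, 2))
```

Narrow classes at the low end and wide classes at the high end give more detail where skewed data is densest.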

Classification method: Standard Deviation

Use this method to emphasize how much feature values vary from the mean.

Best used on normally distributed data.

City of Patras, Greece, part of central area: population densities (inh./ha)
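A minimal sketch of standard deviation breaks, assuming that breaks are placed at the mean and at whole multiples of the standard deviation on either side. The break placement, the use of the population standard deviation, the function name and the sample values are all assumptions made for illustration.

```python
import statistics


def std_dev_breaks(values, spans=1):
    """Breaks at mean - spans*sd, ..., mean, ..., mean + spans*sd."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)   # population standard deviation
    return [mean + j * sd for j in range(-spans, spans + 1)]


values = [2, 4, 4, 4, 5, 5, 7, 9]    # hypothetical data: mean 5, sd 2
print(std_dev_breaks(values))
```

Each class then reads as "so many standard deviations below or above the mean", which is only meaningful when the data is roughly symmetric around that mean.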

References – Bibliography for further reading

• Mitchell A., The ESRI Guide to GIS Analysis, Volume 1: Geographic Patterns & Relationships, ESRI Press, Redlands, USA, 1999

• Zeiler M., Modeling Our World: The ESRI Guide to Geodatabase Design, ESRI Press, USA, 1999

• Online Help of ArcGIS 9.2 (and 10.1)

• Online Help of ArcView 3.3