migue final presentation_v28

Bio-inspired computational techniques applied to the clustering and visualization of spatio-temporal geospatial data. Miguel BARRETO-SANZ, June 27, 2011


Post on 25-Dec-2014



TRANSCRIPT

Page 1: Migue final presentation_v28

Bio-inspired computational techniques applied to the clustering and visualization of spatio-temporal geospatial data

Miguel BARRETO-SANZ

June 27, 2011

Page 2: Migue final presentation_v28


More data has been created since 2005 than in the previous 40,000 years

Page 3: Migue final presentation_v28

Geospatial data timeline

• 1972: Landsat 1, the first civilian Earth observation satellite
• 1980: First commercial vendors of Geographical Information Systems (GIS) software
• 1992: Internet explosion
• 1993: The 24th Navstar satellite is launched, completing the Global Positioning System
• 1997: Tropical Rainfall Measuring Mission (TRMM)
• 2000: Civilian demand for GPS products
• 2005: Google Earth
• 2006: GPS receivers built into cell phones
• 2010: Social networks, geotagging

Page 4: Migue final presentation_v28

These data are critical for decision support, but their value depends on our ability to extract useful information.

Page 5: Migue final presentation_v28

Challenges

• High-dimensional data
• Large quantity of data
• Unlabeled samples (labeling is an expensive and time-consuming process)

[Figure: NASA Earth Observatory (information from several missions, e.g. Terra, TRMM, SRTM)]

[Figure: WorldClim (climate data from weather stations). Mean annual temperature (ºC), scale from -30.1 to 30.5; annual precipitation (mm), scale from 0 to 12084]

[Figure: Derived variables: elevation, slope, aspect, landscape class, moisture, solar radiation, exposure, curvature]

Page 6: Migue final presentation_v28

Spatio-temporal challenges

• Spatio-temporal representations at several levels
• Fuzzy boundaries in geographical space
• Variables and clusters evolve in a temporal context
• Visualization of clusters in geographical and feature space

Time scales: hours, days, months, years

Page 7: Migue final presentation_v28

Thesis

[Diagram: thesis overview. Areas: clustering; visualization and projection; spatio-temporal data. Methods: FGHSON, Tree-structured SOM component planes, SOM, GHSOM. Case studies: Colombia (ecoregions); South America (ecoregions); Colombia (agroecozones, ecoregions)]

Page 8: Migue final presentation_v28


Visualization and projection

Page 9: Migue final presentation_v28

Visualization by using Self-organizing Maps

[Figure: data set, SOM training, visualization]
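The "data set, SOM training, visualization" pipeline can be illustrated with a minimal sketch of online SOM training. This is not the implementation used in the thesis: the function name `train_som`, the rectangular grid, the Gaussian neighbourhood and the linear decay schedules are all assumptions chosen for brevity.

```python
import numpy as np

def train_som(data, rows=6, cols=6, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a rectangular SOM online; returns a (rows*cols, dim) codebook."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n, dim = data.shape
    # Initialise prototypes from randomly chosen samples (an assumption).
    codebook = data[rng.integers(0, n, size=rows * cols)].copy()
    # Grid coordinates of every unit, in row-major order.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    sigma0 = sigma0 if sigma0 is not None else max(rows, cols) / 2.0
    total = epochs * n
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(n)]:
            frac = step / total
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood radius
            bmu = np.argmin(((codebook - x) ** 2).sum(axis=1))  # best-matching unit
            d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)          # grid distance to BMU
            h = np.exp(-d2 / (2.0 * sigma ** 2))                # Gaussian neighbourhood
            codebook += lr * h[:, None] * (x - codebook)
            step += 1
    return codebook
```

The returned codebook holds one prototype per map unit; every visualization of the map is derived from it.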

Page 10: Migue final presentation_v28

Visualization by using Self-organizing Maps

• Correlation hunting
• Exploration

[Figure: comparison of component planes, labeled "Similar" and "Partial correlations"]

Page 11: Migue final presentation_v28

A real-world problem: classification of agro-ecological variables related to productivity in sugar cane cultivation.

Climate variables
• Average Temperature (TempAvg)
• Average Relative Humidity (RHAvg)
• Radiation (Rad)
• Precipitation (Prec)

Soil variables
• Order (Ord)
• Texture (Tex)
• Depth (Dee)

Topographic variables
• Landscape (Ls)
• Slope (Sl)

Other variables
• Water Balance (WB)
• Variety (Var)

Production

Total: 54 variables

Page 12: Migue final presentation_v28


5 Variables

Classical approach: scatter plot matrix

Page 13: Migue final presentation_v28


23 Variables

Classical approach: scatter plot matrix

Page 14: Migue final presentation_v28


54 Variables

Classical approach: scatter plot matrix
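The step from 5 to 23 to 54 variables can be made concrete with a small count: a scatter plot matrix needs one panel per pair of variables, whereas SOM component planes need one map per variable. A minimal sketch (the helper name `scatter_panels` is illustrative only):

```python
from math import comb

def scatter_panels(n_variables):
    """Number of distinct variable pairs, i.e. panels in one triangle of the matrix."""
    return comb(n_variables, 2)

for p in (5, 23, 54):
    print(p, "variables:", scatter_panels(p), "pairwise scatter plots,", p, "component planes")
# 5 variables: 10 pairwise scatter plots, 5 component planes
# 23 variables: 253 pairwise scatter plots, 23 component planes
# 54 variables: 1431 pairwise scatter plots, 54 component planes
```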

Page 15: Migue final presentation_v28


5 Variables

SOM component planes

Page 16: Migue final presentation_v28


23 Variables

SOM component planes

Page 17: Migue final presentation_v28

54 Variables

SOM component planes

Page 18: Migue final presentation_v28


54 Variables

SOM component planes
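A component plane shows, for a single variable, the value stored in each map unit's prototype. As a rough sketch, assuming the codebook is stored as a (rows*cols, dim) array in row-major unit order (an assumption about storage, not something the slides specify), the planes can be obtained by reshaping:

```python
import numpy as np

def component_planes(codebook, rows, cols):
    """Split a (rows*cols, dim) codebook into dim planes of shape (rows, cols)."""
    codebook = np.asarray(codebook, dtype=float)
    units, dim = codebook.shape
    if units != rows * cols:
        raise ValueError("codebook size does not match the map grid")
    # planes[k][r, c] is the value of variable k in the prototype of unit (r, c).
    return codebook.reshape(rows, cols, dim).transpose(2, 0, 1)
```

Each plane can then be drawn as a heat map, giving one image per variable regardless of how many variables the data set has.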

Page 19: Migue final presentation_v28


Correlation Hunting
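Correlation hunting is done visually in the slides, by spotting component planes with similar patterns. A numeric counterpart can be sketched by correlating the planes pairwise. This is only an approximation of the visual task: Pearson correlation captures global linear similarity and will miss the partial (local) correlations that the eye can pick up. The function names are hypothetical.

```python
import numpy as np

def plane_correlations(codebook):
    """Pearson correlation between every pair of component planes."""
    # Each column of the codebook is one flattened component plane.
    return np.corrcoef(np.asarray(codebook, dtype=float), rowvar=False)

def most_similar_pairs(codebook, names, top=3):
    """Variable pairs ranked by absolute correlation of their planes."""
    corr = plane_correlations(codebook)
    i, j = np.triu_indices(len(names), k=1)      # each unordered pair once
    order = np.argsort(-np.abs(corr[i, j]))[:top]
    return [(names[i[o]], names[j[o]], float(corr[i[o], j[o]])) for o in order]
```

Ranking by absolute value keeps strongly negative relations near the top as well, since inverted planes are also of interest when exploring.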

Page 20: Migue final presentation_v28


SOM of component planes

Page 21: Migue final presentation_v28


Tree-structured SOM component planes

Page 22: Migue final presentation_v28


54 Variables

Tree-structured SOM component planes

Page 23: Migue final presentation_v28


Tree-structured SOM component planes

Page 24: Migue final presentation_v28


Clustering

Page 25: Migue final presentation_v28

Hierarchical Self-organizing Structures

• They combine the advantages of hierarchical representation and soft competitive learning.
• All state-of-the-art methods are crisp approaches.
• In geospatial applications, crisp memberships are not the optimal representation of clusters.

Page 26: Migue final presentation_v28


Real world data and its fuzzy nature

Crisp

Fuzzy
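The difference between crisp and fuzzy membership can be shown on a single sample. The sketch below uses the fuzzy c-means membership formula as an assumption, since the slides do not state which membership function is used; `m` is the usual fuzzifier and both function names are illustrative.

```python
import numpy as np

def fuzzy_memberships(x, prototypes, m=2.0):
    """Fuzzy c-means style memberships of x to each prototype (they sum to 1)."""
    d = np.linalg.norm(np.asarray(prototypes, float) - np.asarray(x, float), axis=1)
    if np.any(d == 0):
        # x coincides with a prototype: full membership there.
        u = (d == 0).astype(float)
        return u / u.sum()
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum()

def crisp_membership(x, prototypes):
    """Winner-takes-all: 1 for the nearest prototype, 0 elsewhere."""
    d = np.linalg.norm(np.asarray(prototypes, float) - np.asarray(x, float), axis=1)
    u = np.zeros(len(d))
    u[np.argmin(d)] = 1.0
    return u
```

For a point halfway between two prototypes, the fuzzy version returns 0.5 and 0.5, while the crisp version must pick one of them, which is the boundary effect the slide contrasts.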

Page 27: Migue final presentation_v28

One approach to this problem is to allow a fuzzy representation in the hierarchical structures.

Page 28: Migue final presentation_v28

Fuzzy Growing Hierarchical Self-Organizing Networks (FGHSON)

[Diagram: hierarchy and fuzzy membership; breadth growth process and depth growth process, with α-cuts]
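An α-cut of a fuzzy set keeps the elements whose membership is at least α. The sketch below assumes, as an interpretation of the diagram rather than a statement from the slides, that the α-cut decides which samples each prototype hands down to the next level of the hierarchy. The function names are hypothetical.

```python
import numpy as np

def alpha_cut(memberships, alpha):
    """Boolean mask (samples x prototypes): True where membership >= alpha."""
    memberships = np.asarray(memberships, dtype=float)
    return memberships >= alpha

def samples_for_children(data, memberships, alpha):
    """For each prototype, the subset of samples inside its alpha-cut."""
    mask = alpha_cut(memberships, alpha)
    data = np.asarray(data)
    return [data[mask[:, j]] for j in range(mask.shape[1])]
```

Because memberships are fuzzy, one sample can fall inside the α-cut of several prototypes and so appear in more than one branch, which a crisp hierarchy does not allow.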

Page 29: Migue final presentation_v28

Case study: South America (Cali, Colombia)

[Figure: maps of precipitation, temperature, and similar zones]

Page 30: Migue final presentation_v28

Case study: South America (Cali, Colombia)

Page 31: Migue final presentation_v28

Case study: South America (Cali, Colombia)

Finding the right prototype
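Finding the right prototype for a reference location, and then the zones that share it, can be sketched as a nearest-prototype lookup. This is a crisp simplification made for illustration (the thesis method is fuzzy), and the function names and array layout are assumptions.

```python
import numpy as np

def best_prototype(x, prototypes):
    """Index of the prototype closest to feature vector x."""
    d = np.linalg.norm(np.asarray(prototypes, float) - np.asarray(x, float), axis=1)
    return int(np.argmin(d))

def similar_zones(cells, prototypes, reference):
    """Indices of grid cells whose best prototype matches the reference location's."""
    cells = np.asarray(cells, dtype=float)
    prototypes = np.asarray(prototypes, dtype=float)
    target = best_prototype(reference, prototypes)
    # Distance from every cell to every prototype, then the winner per cell.
    d = np.linalg.norm(cells[:, None, :] - prototypes[None, :, :], axis=2)
    return np.flatnonzero(d.argmin(axis=1) == target)
```

Here `cells` would hold one feature vector per grid cell (for example precipitation and temperature) and `reference` the vector of the reference site.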

Page 32: Migue final presentation_v28


Level 1

Page 33: Migue final presentation_v28


Level 2

Page 34: Migue final presentation_v28


Fortaleza Brazil

Cali Colombia

Level 3

Page 35: Migue final presentation_v28


Spatio-Temporal Clustering

Page 36: Migue final presentation_v28

Spatio-Temporal Clustering

Space: where

Time: when

Homologous places for Colombian coffee production: Brazil, Ecuador, East Africa, and New Guinea.

Page 37: Migue final presentation_v28

Spatio-Temporal Clustering

Space and time: where and when

Argentina

United States

Maize (Zea mays L.)

Page 38: Migue final presentation_v28

Spatio-Temporal Clustering

Objective: to find similar environmental zones through time in South America.

In this experiment, we look for regions with similar patterns in time windows of three months.
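The three-month windows can be sketched as a reshaping of monthly layers into one feature matrix per window. The array layout (cells, 12 months, variables), the wrap-around at the year end, and the `step` parameter are assumptions: the slides do not say whether the windows overlap.

```python
import numpy as np

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]

def three_month_windows(monthly, step=1):
    """monthly: array (cells, 12, variables). Returns {label: (cells, 3*variables)}."""
    monthly = np.asarray(monthly, dtype=float)
    cells, months, variables = monthly.shape
    if months != 12:
        raise ValueError("expected 12 monthly layers")
    windows = {}
    for start in range(0, 12, step):
        idx = [(start + k) % 12 for k in range(3)]   # wrap around the year end
        label = "-".join(MONTHS[i] for i in idx)
        # Concatenate the three months of every variable into one feature vector.
        windows[label] = monthly[:, idx, :].reshape(cells, 3 * variables)
    return windows
```

With `step=1` this yields twelve overlapping windows (jan-feb-mar, feb-mar-apr, and so on); with `step=3` it yields four non-overlapping quarters. Each window's matrix can then be clustered separately and the results compared across windows.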

Page 39: Migue final presentation_v28


Spatio-Temporal Clustering

Page 40: Migue final presentation_v28

Spatio-Temporal Clustering

Which zones are similar to Cali in the period Jan-Feb-Mar?

[Figure: maps of precipitation and temperature]

Page 41: Migue final presentation_v28


Spatio-Temporal Clustering

Page 42: Migue final presentation_v28

Conclusions

1. Original contributions

FGHSON
• It reflects the underlying structure of a dataset in a hierarchical, fuzzy way.
• It does not require an a priori definition of the number of clusters.
• The algorithm executes self-organizing processes in parallel.
• Only three parameters are needed to set up the algorithm.

Page 43: Migue final presentation_v28

Conclusions

Tree-structured SOM component planes
• It creates structures that allow visual exploratory analysis of large, high-dimensional datasets.
• Similarities in the behavior of variables can be easily detected (e.g. local correlations, maximal and minimal values, and outliers).

Page 44: Migue final presentation_v28

Conclusions

2. Test of methodologies for clustering and visualization of georeferenced data
• GHSOM
• SOM
• FGHSON

3. Methodological contributions
• Clustering of spatio-temporal datasets through time by using FGHSON.

Page 45: Migue final presentation_v28

Conclusions

The COCH project

4. Agroecological knowledge contributions
• In sugar cane productivity
• In sugar cane agroecoregionalization
• In Andean blackberry production

Page 46: Migue final presentation_v28


Questions