Classification of similar productivity zones in the sugar cane culture using clustering of SOM
component planes based on the SOM distance matrix
Miguel BARRETO, Andrés Pérez-Uribe
MINISTERIO DE AGRICULTURA Y
DESARROLLO RURAL
asocaña
Introduction
The agricultural productivity of a geographic area depends on many agro-ecological variables, such as soil and terrain characteristics, climatic constraints, human behavior and management.
Soil
Management
Climate
Genotype
Productivity
The problem
The world of agriculture is diverse and heterogeneous.
The traditional approach has been to develop technologies and agricultural management practices as if agriculture were homogeneous, using controlled experiments. However, this is expensive and takes a long time.
In agriculture there are very few possibilities of controlling or modifying the conditions in which crops grow.
A new approach
Sowing Growing Harvest
Soil, Management, Climate, Genotype
Experiment
1. Every crop is an experiment
A new approach
1999 2000 2001 2002
4 experiments
Same cultivated zone
For example:
A new approach
1358 experiments
Sowing Growing Harvest
Soil, Management, Climate, Genotype
2. Each agroecological event is unique in time and space, but events share characteristics. Grouping events with similar characteristics reveals similar behaviors, which helps to discover why and how the agroecological variables affect crop development and, therefore, agricultural productivity.
Challenges
This approach presents two challenges:
1. To deal with the quantity, quality and type of data. Quantity refers to the number of variables and the number of observations associated with each variable. Quality refers to data integrity. Type refers to the nature of the data: qualitative (e.g., genotype) or quantitative (e.g., temperature).
2. To optimize the visualization and analysis of the variables.
The idea
Soil type: A, B, etc.
Variety type: A, B, etc.
Management type: A, B, etc.
Weather condition: Sunny, rainy, etc.
1. To construct a plane for each zone with its characteristics.
The idea
2. To find natural groups of experiments with similar characteristics (without knowing the productivity).
[Figure: one plane per zone (Zone 1 to Zone 9). Zones showing Sunny, A, B, A and zones showing Rainy, B, B, C form two groups, labelled Conditions A and Conditions B.]
3. To add labels and look for the most homogeneous groups.
The idea (Analyze the conditions)
Soil type: B
Variety type: C
Management type: B
Weather condition: Rainy
Soil type: A
Variety type: A
Management type: B
Weather condition: Sunny
Conditions A: High productivity
Conditions B: Low productivity
4. To extract new knowledge about the relationship between the agro-ecological variables and productivity.
The variables
Climate variables. Continuous data.
Average Temperature (TempAvg) / After seed (AS) / Before Harvest (BH)
Average Relative Humidity (RHAvg) / After seed (AS) / Before Harvest (BH)
Radiation (Rad) / After seed (AS) / Before Harvest (BH)
Precipitation (Prec) / After seed (AS) / Before Harvest (BH)
Soil variables.
Order (Ord) / 3 orders (Ord1, Ord2, Ord3) / Nominal data
Texture (Tex) / Ordinal data
Depth (Dee) / Ordinal data
Topographic variables.
Landscape (Ls) / 3 landscapes (Ls1, Ls2, Ls3) / Nominal data
Slope (Sl) / Ordinal data
Other variables.
Water Balance (WB) / Ordinal data
Variety (Var) / 3 varieties (V1, V2, V3) / Nominal data
Production
Total: 54
Months After Seed (AS)
Months Before Harvest (BH)
1 2 3 4 1 2 3 4
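The variable list above mixes continuous, ordinal and nominal data, and a SOM needs a purely numeric input vector. The sketch below shows one common way to build such a vector; it is an assumption, not necessarily the encoding used in this study. Nominal variables become one binary column per category (which matches the Ord1, Ord2, Ord3 naming), ordinal variables are treated as integer ranks, and numeric columns are rescaled to [0, 1]. The helper names and all sample values are hypothetical.

```python
import numpy as np

def one_hot(values, categories):
    """Encode nominal values as one binary column per category."""
    values = np.asarray(values)
    return np.stack([(values == c).astype(float) for c in categories], axis=1)

def min_max(column):
    """Rescale a numeric column to [0, 1]; a constant column maps to zeros."""
    column = np.asarray(column, dtype=float)
    span = column.max() - column.min()
    if span == 0:
        return np.zeros_like(column)
    return (column - column.min()) / span

# Hypothetical records for three cultivated zones (values are made up).
temp_avg_1as = [23.5, 24.1, 22.8]      # continuous: average temperature, month 1 after seed
texture = [1, 3, 2]                    # ordinal: texture class coded as an integer rank
soil_order = ["Ord1", "Ord3", "Ord1"]  # nominal: soil order

features = np.column_stack([
    min_max(temp_avg_1as),                          # 1 column
    min_max(texture),                               # 1 column
    one_hot(soil_order, ["Ord1", "Ord2", "Ord3"]),  # 3 columns
])
print(features.shape)  # (3, 5)
```

Rescaling keeps variables with large numeric ranges from dominating the Euclidean distance that the SOM uses to compare zones.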
SOM visualization of the variables
[Figure: the illustrative plane (soil type, variety type, management type, weather condition) alongside the SOM component planes for Relative Humidity (RH), Radiation (Ra), Precipitation (P) and Temperature (T), each Before Harvest (BH) and After Seeding (AS), plus Soil order 2 and Sugarcane variety 1.]
Component planes
[Figure: data for Zone 1 to Zone 1358 and the resulting component planes, one per variable (Variable 1 to Variable 54).]
To improve the analysis of the relationships between variables and/or their influence on the outputs of the system, it is possible to slice the self-organizing map in order to visualize its so-called component planes.
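As a rough illustration of this slicing, the sketch below trains a very small SOM with plain NumPy and then cuts its codebook into one plane per variable. The training loop is a generic online SOM written for this example (the function name `train_som`, its parameters and the random data are assumptions, not the tool used in the study). The point is only the last step: component plane k is the k-th coordinate of every prototype, laid out on the map grid.

```python
import numpy as np

def train_som(data, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Minimal online SOM; returns a codebook of shape (rows, cols, n_features)."""
    rng = np.random.default_rng(seed)
    codebook = rng.random((rows, cols, data.shape[1]))
    # grid[r, c] holds the coordinates (r, c) of each map unit.
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)
    sigma0 = max(rows, cols) / 2.0
    n_steps, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                     # learning rate decays linearly
            sigma = sigma0 * (1.0 - frac) + 0.5 * frac  # neighborhood radius shrinks to 0.5
            dist = np.linalg.norm(codebook - x, axis=-1)
            bmu = np.array(np.unravel_index(np.argmin(dist), dist.shape))
            h = np.exp(-((grid - bmu) ** 2).sum(axis=-1) / (2.0 * sigma ** 2))
            codebook += lr * h[..., None] * (x - codebook)
            step += 1
    return codebook

# Made-up stand-in for the zones x variables matrix (200 zones, 4 variables).
rng = np.random.default_rng(1)
data = rng.random((200, 4))

codebook = train_som(data, rows=6, cols=8)
# Slice the map: one (rows, cols) plane per input variable.
component_planes = np.moveaxis(codebook, -1, 0)
print(component_planes.shape)  # (4, 6, 8)
```

Each plane can then be drawn as a heat map; two variables whose planes show the same pattern vary together across the map.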
Correlation hunting
The task of organizing similar component planes in order to find correlated components is called correlation hunting.
However, when the number of components is large it is difficult to determine which planes are similar to each other.
Correlation hunting
A new SOM can be used to reorganize the component planes in order to perform the correlation hunting. The main idea is to place correlated components close to each other.
An advantage of using a SOM for component plane projection is that the placements of the component planes can be shown on a regular grid. In addition, an ordered presentation of similar components is automatically generated. A disadvantage is that the choice of grouping variables is left to the user.
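The reorganization described above can be sketched as follows. Each component plane is flattened into a vector and standardized, so that the Euclidean distance between two planes depends only on their correlation; a second, small SOM is then trained on these vectors and each plane is placed on its best matching unit. This is a minimal sketch under assumptions: the helper names, the 3 x 3 grid and the toy planes are hypothetical, the similarity measure is one possible choice, and a constant plane (zero standard deviation) would need special handling.

```python
import numpy as np

def bmu_of(codebook, x):
    """Grid coordinates of the unit whose prototype is closest to x."""
    dist = np.linalg.norm(codebook - x, axis=-1)
    return tuple(int(i) for i in np.unravel_index(np.argmin(dist), dist.shape))

def train_som(samples, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Minimal online SOM; returns a codebook of shape (rows, cols, n_features)."""
    rng = np.random.default_rng(seed)
    codebook = rng.normal(size=(rows, cols, samples.shape[1]))
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)
    sigma0 = max(rows, cols) / 2.0
    n_steps, step = epochs * len(samples), 0
    for _ in range(epochs):
        for x in samples[rng.permutation(len(samples))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)
            sigma = sigma0 * (1.0 - frac) + 0.5 * frac
            bmu = np.array(bmu_of(codebook, x))
            h = np.exp(-((grid - bmu) ** 2).sum(axis=-1) / (2.0 * sigma ** 2))
            codebook += lr * h[..., None] * (x - codebook)
            step += 1
    return codebook

def project_planes(planes, rows=3, cols=3):
    """Place each component plane on a second SOM; returns one grid position per plane."""
    flat = planes.reshape(len(planes), -1)
    # Standardize each plane: correlated planes become nearby vectors.
    z = (flat - flat.mean(axis=1, keepdims=True)) / flat.std(axis=1, keepdims=True)
    som = train_som(z, rows, cols)
    return [bmu_of(som, v) for v in z]

# Toy planes (made up): B is a linear rescaling of A, C and D are unrelated noise.
rng = np.random.default_rng(2)
base = rng.random((6, 8))
planes = np.stack([base, 2.0 * base + 3.0, rng.random((6, 8)), rng.random((6, 8))])
positions = project_planes(planes)
for name, pos in zip(["A", "B", "C", "D"], positions):
    print(name, pos)
```

Because A and B are perfectly correlated, their standardized vectors coincide and they land on the same place in the grid, which is the behavior the projection is meant to produce for correlated components.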
Clustering of SOM component planes based on the SOM distance matrix
The U-matrix has been used as an effective cluster distance function. It visualizes the distances between each map unit and its neighbors, which makes it possible to see the cluster structure of the SOM.
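A simplified U-matrix can be computed directly from a SOM codebook, as sketched below. For every map unit it averages the distance between that unit's prototype and the prototypes of its grid neighbors. This is a reduced variant assumed for illustration (rectangular grid, 4-connected neighbors, one value per unit, map with at least two units); full implementations often also store the individual unit-to-unit distances. The function name and the toy codebook are hypothetical.

```python
import numpy as np

def u_matrix(codebook):
    """Mean distance from each unit's prototype to its 4-connected grid neighbors."""
    rows, cols, _ = codebook.shape
    um = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(np.linalg.norm(codebook[r, c] - codebook[nr, nc]))
            um[r, c] = np.mean(dists)
    return um

# Toy 4 x 4 codebook with two clusters: left half at (0, 0), right half at (1, 1).
codebook = np.zeros((4, 4, 2))
codebook[:, 2:, :] = 1.0

um = u_matrix(codebook)
print(np.round(um, 2))
```

In the toy map the values are zero inside each half and positive only along the two middle columns, so the cluster border shows up as a ridge of high values.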
Clustering of SOM component planes based on the SOM distance matrix
Clusters with similar productivity
[Figure: clusters labelled by productivity (High, Medium, Low), with a Productivity scale from -10 to 10.]
Prototypes from clusters with similar productivity
[Figure: prototype component planes for Relative Humidity (RH), Radiation (Ra), Precipitation (P) and Temperature (T), each Before Harvest (BH) and After Seeding (AS), plus Soil order 2 and Sugarcane variety 1.]
Best Matching Units from radiation before harvest (RaBH)
Ra1BH Ra2BH Ra3BH Ra4BH Ra5BH
Best Matching Units
Analyzing the plots
Radiation Relative Humidity Temperature
Analyzing the plots
The behavior of the radiation for the two component planes previously chosen as an example can be examined in a scatter plot.
It is possible to observe that the two zones present similar values of radiation in the months after seed (RaAS).
During the months before harvest (RaBH), the radiation shows the same behavior in the high-medium and the low productivity regions, but with a shift.
This pattern indicates that the high radiation in the months before the harvest might affect the accumulation of saccharose in the plant.
Conclusions
Visualization of agroecological zones is very important but difficult due to the high dimensionality of the data. The SOM algorithm is a powerful technique able to deal with this problem, but …
In this study we have used the U-matrix and the component plane representation to illustrate the usefulness of the SOM for visualizing and analyzing similar zones.
By analyzing the obtained groups of agro-ecological variables and cultivated zones, it was possible, as an example of the application of the methodology, to find a relationship between the radiation after seed, before harvest, and a high-medium productivity.
We are looking forward to developing data mining and visualization techniques, based on the aforementioned methodology, in order to improve decision support in the sugar cane culture.
The end
Thanks for new ideas and directions to explore!