Review of digital soil mapping steps
Review of DSM steps
Review of DSM concept
• Computer-assisted soil mapping
– Computer models replace tacit models
– Computer models link soil-forming factors (Soil, Climate, Organism, Relief, Parent material)
– Produces digital soil maps (with error/uncertainty maps)
• Its important features:
– High-quality, spatially explicit input data
– Computer and internet applications
– Soil survey principles
– GIS and remote sensing principles
– Computer and geostatistical modelling
• Key points to consider
– Adequate understanding of data and DSM steps
– Optimum data mining
– Documentation of steps and data
– Accuracy assessment and reporting
DSM steps
• Step 1: Decide on mapping focus
– Study area
– Geographic projection
– Mapping soil property or class (categorical) values
• Step 2: Identify data needs and sources
– Data needs (SCORPAN factors)
– Data sources (availability of SCORPAN factors)
• Step 3: Data collection and documentation
– Database creation
– Metadata development
• Step 4: Input data preparation
– GIS database development
– GIS operations (clipping, re-projection, format translation)
– Digital terrain analysis (digital terrain parameters)
– Image/raster analysis (correction, re-sampling, normalization, PCA)
• Step 5: Spatial prediction
• Step 6: Accuracy assessment
Summary steps
1. Get data (institute, internet)
2. Develop GIS database (MS Office, QGIS)
3. Terrain modelling (SAGA)
4. Image transformation (ILWIS)
5. Soil mapping (R, RStudio)
Detailed DSM steps
1. Choice of focus (soil type, soil property, scale) and study area
2. Data identification, collection and documentation
   1. SCORPAN factors
   2. Decide minimum dataset
   3. Data sources: national libraries, internet, NGOs, colleagues
3. Data organization
   1. Develop input database
   2. Choose appropriate scale/resolution
4. Choice of tools
   1. Database tools (MS Office)
   2. GIS and image pre-processing tools
   3. Spatial interpolation and regression tools
5. Methodical application of tools and documentation
6. Accuracy assessment and reporting
1. Choice of focus and study area
• Interest (national, regional, etc)
• Data availability
• Other factors
Deciding on minimum dataset
Which soil data is available? (listed from most to least adequate)
– Detailed soil map with legend and soil data
– Soil point data with site description
– Detailed soil map with legend
– No data
Which environmental covariate is available? (listed from most to least adequate)
– All covariates: Climate (C), Organism (O), Relief (R), Parent material (P)
– At least 3 covariates, including Relief (R) and Organism (O)
– At least 2 covariates, including Relief (R)
– Only one covariate
– No data
2. Data identification and collection
• Required data: soil-forming factors as input GIS layers
– S – soil (profiles, soil map)
– C – climate (mean rainfall)
– O – organism (NDVI, land cover/use)
– R – relief (elevation and terrain parameters)
– P – parent material (geology, soil map)
– N – spatial co-ordinates
• Data sources
– National institutions, NGOs, research institutes
– Regional databases
– Global databases (use of internet)
3. Data organization
Methods for data organization
– Input point-data organized in MS Office
– GIS database developed in QGIS
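Once the point data have been exported from the spreadsheet, a basic consistency screen might look like the sketch below. It is only an illustration: the column names `ID`, `X`, `Y` and `Attribute`, the sample values, and the choice of a median-absolute-deviation rule for outliers are all assumptions made for this example, not part of the original workflow.

```python
import csv
import io
import statistics

def check_points(rows, attribute="Attribute", k=3.0):
    """Return (missing, non_numeric, outliers) as lists of record IDs."""
    missing, non_numeric, values = [], [], {}
    for row in rows:
        raw = (row.get(attribute) or "").strip()
        has_xy = (row.get("X") or "").strip() and (row.get("Y") or "").strip()
        if not raw or not has_xy:
            missing.append(row["ID"])          # empty attribute or co-ordinate
            continue
        try:
            values[row["ID"]] = float(raw)
        except ValueError:
            non_numeric.append(row["ID"])      # text entry in a numeric column
    if not values:
        return missing, non_numeric, []
    median = statistics.median(values.values())
    mad = statistics.median(abs(v - median) for v in values.values())
    if mad == 0:
        return missing, non_numeric, []        # no spread, nothing to flag
    limit = k * 1.4826 * mad                   # robust stand-in for k standard deviations
    outliers = [i for i, v in values.items() if abs(v - median) > limit]
    return missing, non_numeric, outliers

# Hypothetical point data, as it might be exported from the spreadsheet
sample_csv = """ID,X,Y,Attribute
1,500100,9850200,1.2
2,500300,9850400,1.4
3,500500,9850600,
4,500700,9850800,high
5,500900,9851000,1.3
6,501100,9851200,1.5
7,501300,9851400,25.0
8,501500,9851600,1.1
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
missing, non_numeric, outliers = check_points(rows)
print("missing:", missing, "non-numeric:", non_numeric, "outliers:", outliers)
```

Flagged records are only candidates for review; whether an outlying value is an error or a real extreme is a soil survey judgement.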
4. Choice of tools
• Data organization
– MS Office for point-data
– QGIS for spatial data
• 1st stage: Data pre-processing
– Georeferencing using QGIS
– Co-ordinate transformation using QGIS
– Digital terrain analysis using SAGA
– Study area clipping and vector-raster conversion using QGIS
• 2nd stage: Data pre-processing
– Pixel re-sampling using ILWIS
– Image transformation (to normal distribution) using ILWIS
– Principal Component Analysis
• Spatial modelling
– Spatial interpolation using R
Sequence of input data organization
1. Gather and align point-data in spreadsheet
– Develop database in spreadsheet (MS Office)
– Create at least 4 columns: ID, X, Y, Attribute
– Check consistency: missing, outlying and factor (text) entries
– Create a sheet for documentation (description)
2. Gather and align other SCORPAN factors
– Download and georeference scanned maps in QGIS
– Download and stack multiband image layers in QGIS
– Download single-band images and visualize in QGIS
– Note and document map/image characteristics
3. Choosing the right scale/pixel size and samples
1. Pixel-samples formula: cartographic rule, which is mathematically given as
$$\text{pixel} = 0.0791 \times \sqrt{\dfrac{\text{Area}}{\text{Samples}}}$$
Use the model to estimate required samples for a given resolution
2. Largest pixel size of input SCORPAN factors
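The pixel-samples rule can be tried out numerically. The sketch below assumes the formula reads pixel = 0.0791 × √(Area / Samples), with area in square metres and pixel size in metres; the units and the function names are assumptions for illustration. The second function simply rearranges the same formula to estimate the samples needed for a target resolution.

```python
import math

CARTOGRAPHIC_FACTOR = 0.0791  # constant taken from the pixel-samples formula

def pixel_size(area_m2, n_samples):
    """Pixel size supported by n_samples over an area (assumed m and m^2)."""
    return CARTOGRAPHIC_FACTOR * math.sqrt(area_m2 / n_samples)

def required_samples(area_m2, pixel_m):
    """Samples needed for a target pixel size (formula rearranged for Samples)."""
    return area_m2 * (CARTOGRAPHIC_FACTOR / pixel_m) ** 2

# Hypothetical study area of 100 km^2 (1e8 m^2) with 100 sampling points
area = 1e8
print(round(pixel_size(area, 100), 1))        # pixel size in metres
print(round(required_samples(area, 30.0)))    # samples for a 30 m grid
```

In practice the result would be rounded up to a whole number of samples and compared against the largest pixel size among the input SCORPAN layers.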
Methodical application of tools
1. Display input data in QGIS and note projection
2. Decide on one projection (not decimal degrees)
3. Ensure all input data conform to the projection. If yes, they should overlay. If not, investigate:
– Layer properties (general/metadata)
– Displayed layer coordinates, values, pixel sizes
– Whether the layer falls within the study area
4. Clip all data to the study area using QGIS/ILWIS
5. Export DEM as Arc Info ASCII to SAGA
6. Fill sinks and carry out digital terrain analysis in SAGA
7. Export the terrain parameters as Arc Info ASCII for 2nd stage pre-processing in ILWIS
8. Resample layers to a common pixel size in ILWIS
9. Histogram analysis and normalization of images
10. Image transformation and SCORPAN map list creation
11. Carry out principal component analysis (PCA)
12. Choose the best (optimum) PCs from PCA layers
13. Remove undefined areas from the chosen PCs
14. Import data into R
15. Carry out spatial regression in R using the R code
16. Split the data in R for calibration and validation
17. Carry out accuracy assessment
18. Report on map output
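The calibration/validation split in the list above is done in R in the original workflow. The sketch below shows the same idea in Python purely for illustration; the 30% validation share, the fixed seed and the function name are assumptions, not values given in the source.

```python
import random

def split_samples(records, validation_fraction=0.3, seed=42):
    """Randomly split records into (calibration, validation) lists."""
    shuffled = list(records)                  # copy so the input is untouched
    random.Random(seed).shuffle(shuffled)     # seeded for a reproducible split
    n_val = round(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]

# Hypothetical sample IDs standing in for soil observation records
sample_ids = list(range(1, 11))
calibration, validation = split_samples(sample_ids)
print(len(calibration), "calibration samples;", len(validation), "validation samples")
```

The validation samples are set aside and used only for the accuracy assessment, so that they remain independent of the model.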
DSM accuracy assessment & reporting
• Digital soil maps are not 100% accurate due to:
– Gaps and errors in input data
– Modelling errors
– Sampling errors
• Uncertainty represents the departure of soil map output from truth
• Assessment and visualization of uncertainty is a standard procedure in DSM
• In a continuous-variable map, uncertainty is expressed by:
– Using geostatistical models
– Comparing map output with independent validation data
• Using a geostatistical model
– Variance map output
– The map shows where samples were taken
– Sampling points have low uncertainties
– Figure panels: map output; uncertainty map
• Holdout validation
– Use independent samples
– Compare map values with holdout samples
– True uncertainty
• Comparison statistics
– Mean error (ME)
– Root mean square error (RMSE)
– Correlation (r)
• RMSE and ME should be close to zero
$$ME = \frac{1}{n} \sum_{i=1}^{n} \left( pred_i - val_i \right)$$
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( pred_i - val_i \right)^2}$$
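The comparison statistics can be computed directly from paired predicted and holdout values. The following is a minimal sketch, assuming ME and RMSE are taken over the differences (predicted minus validation) and r is the Pearson correlation; the numbers are made up for illustration.

```python
import math

def mean_error(pred, val):
    """ME: average of (predicted - validation); measures bias."""
    return sum(p - v for p, v in zip(pred, val)) / len(pred)

def rmse(pred, val):
    """RMSE: square root of the mean squared difference."""
    return math.sqrt(sum((p - v) ** 2 for p, v in zip(pred, val)) / len(pred))

def correlation(pred, val):
    """Pearson correlation coefficient r between predictions and validation."""
    n = len(pred)
    mean_p, mean_v = sum(pred) / n, sum(val) / n
    cov = sum((p - mean_p) * (v - mean_v) for p, v in zip(pred, val))
    ss_p = sum((p - mean_p) ** 2 for p in pred)
    ss_v = sum((v - mean_v) ** 2 for v in val)
    return cov / math.sqrt(ss_p * ss_v)

# Hypothetical map values at holdout locations and the measured values there
pred = [2.0, 4.0, 6.0, 8.0]
val = [1.0, 5.0, 5.0, 9.0]
print("ME:", mean_error(pred, val))
print("RMSE:", rmse(pred, val))
print("r:", round(correlation(pred, val), 3))
```

Note that ME can be zero while RMSE is not, as in this example: positive and negative errors cancel in ME but not in RMSE, which is why both are reported.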
Accuracy assessment of categorical maps
• For categorical maps, holdout validation data are used
• Accuracy is expressed using:
– A confusion matrix
– An agreement plot
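A confusion matrix for a categorical map can be tallied from holdout samples as sketched below. The soil class names and values are hypothetical, and overall accuracy (share of samples on the diagonal) is shown as one common summary; the source does not specify which summary statistic to report.

```python
from collections import Counter

def confusion_matrix(observed, predicted):
    """Rows are observed classes, columns are predicted classes."""
    classes = sorted(set(observed) | set(predicted))
    counts = Counter(zip(observed, predicted))
    matrix = [[counts[(o, p)] for p in classes] for o in classes]
    return classes, matrix

def overall_accuracy(matrix):
    """Fraction of holdout samples whose predicted class matches the observed one."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

# Hypothetical holdout observations and the map classes at the same locations
observed = ["Acrisol", "Acrisol", "Ferralsol", "Vertisol", "Vertisol", "Ferralsol"]
predicted = ["Acrisol", "Ferralsol", "Ferralsol", "Vertisol", "Acrisol", "Ferralsol"]

classes, matrix = confusion_matrix(observed, predicted)
for name, row in zip(classes, matrix):
    print(f"{name:10s}", row)
print("overall accuracy:", round(overall_accuracy(matrix), 2))
```

Off-diagonal cells show which classes the map confuses with each other, which is often more informative than the single accuracy figure.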
Find some resources online
• Google: Youtube FAO DSM Module
Freely downloadable