NR 422: Quality Control
Jim Graham
Spring 2009
Staircase of Knowledge
[Figure: staircase diagram. Labels: Increasing Subjectivity; Human value added]
Steps: Observation and Measurement, Data, Information, Knowledge, Understanding, Wisdom
Processes between steps: Organization, Interpretation; Verification; Selection, Testing; Comprehension, Integration; Judgment
Source: Environmental Monitoring and Characterization, Artiola, Pepper, and Brusseau
Error
• Data does not match reality (ever)
• Gross errors
• Accuracy (bias): distance from truth
  – | Measurement mean – Truth |
• Precision: variance within the data
  – Standard Deviation (stddev)
• Measurement Limits
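The two definitions above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `accuracy_and_precision` and the readings are hypothetical, and it assumes the true value is known (for example, a surveyed reference distance).

```python
import statistics

def accuracy_and_precision(measurements, truth):
    """Return (bias, stddev) for repeated measurements of a known value."""
    mean = statistics.mean(measurements)
    bias = abs(mean - truth)                 # accuracy: | measurement mean - truth |
    stddev = statistics.stdev(measurements)  # precision: sample standard deviation
    return bias, stddev

# Hypothetical readings of a 10.0 m reference distance
readings = [10.2, 10.4, 10.3, 10.1, 10.5]
bias, stddev = accuracy_and_precision(readings, truth=10.0)
print(f"bias = {bias:.2f} m, stddev = {stddev:.2f} m")
```

These readings are fairly precise (small spread) but biased: they cluster about 0.3 m away from the truth.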
Accuracy and Precision
High Accuracy, Low Precision
Low Accuracy, High Precision
Source: http://en.wikipedia.org/wiki/Accuracy_and_precision
Bias (Accuracy)
• Bias = Distance from truth
[Figure labels: Truth, Mean, Bias]
Standard Deviation (Precision)
Each band represents one standard deviation
Source: Wikipedia
Other Approaches
• Confidence Intervals
• ± some range (suspect)
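As a rough illustration of the confidence-interval approach, the sketch below computes an interval for the mean of repeated measurements. It assumes a normal approximation (a z-score from `statistics.NormalDist`); for small samples a t-distribution would give a somewhat wider interval. The function name and data are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(values, confidence=0.95):
    """Approximate confidence interval for the mean (normal approximation)."""
    n = len(values)
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return mean - z * sem, mean + z * sem

# Hypothetical repeated readings
readings = [10.2, 10.4, 10.3, 10.1, 10.5]
low, high = mean_confidence_interval(readings)
print(f"95% CI for the mean: {low:.2f} to {high:.2f}")
```

Unlike a bare "± some range", the interval states both how it was computed and the confidence level attached to it.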
Sources of Error
• Measurement Error
  – Protocol
  – User
  – Instrument
• Processing Errors
  – Procedure
  – User
  – Instrument
• Data Errors
  – Age
  – Metadata/Documentation
Protocol
• Rule #1: Have one!
• Step-by-step instructions on how to collect the data
  – Calibration
  – Equipment required
  – Training required
  – Steps
  – QA/QC
• See GLOBE Protocols:
  – http://www.globe.gov/sda/tg00/aerosol.pdf
Protocol Error
• Is there a protocol?
• What is being measured?
• Is it complete: How large? How small?
• Unexpected circumstances (illness, weather, accidents, equipment failures, changing ecosystems)
User Measurement Errors
• Wrong Datum
• Data in wrong field/attribute
• Missing data
• Gross errors
• Precision and Accuracy
• Observer error: expertise and “drift”
Instrument Errors
• Calibration
• Drift
• Humans as instruments:
  – DBH
  – Weight
  – Humans are almost always involved!
  – Fortunately we can be calibrated and have our drift measured
Calibration
• Sample a portion of the study area repeatedly and/or with higher precision
  – GPS: benchmarks, higher resolution
  – Measurements: lasers, known distances
  – Identifications: experts, known samples
• Use bias and stddev throughout study
• Also provides an estimate for min/max
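The calibration idea above can be sketched as follows. This is only an illustration under stated assumptions: the benchmark value is treated as truth, the bias is assumed constant across the study, and subtracting it from field readings is one possible way to "use" it. The names `calibrate` and `apply_calibration` and all numbers are hypothetical.

```python
import statistics

def calibrate(benchmark_readings, benchmark_truth):
    """Summarize instrument error from repeated readings of a known benchmark."""
    errors = [r - benchmark_truth for r in benchmark_readings]
    return {
        "bias": statistics.mean(errors),     # signed, so it can be subtracted later
        "stddev": statistics.stdev(errors),  # precision of the instrument
        "min_error": min(errors),            # estimate of the error range
        "max_error": max(errors),
    }

def apply_calibration(reading, calibration):
    """Correct a field reading by removing the estimated bias (an assumption)."""
    return reading - calibration["bias"]

# Hypothetical: a tape checked five times against a known 50.0 m distance
cal = calibrate([50.3, 50.1, 50.2, 50.4, 50.0], benchmark_truth=50.0)
corrected = apply_calibration(75.2, cal)
print(cal, corrected)
```

Repeating the benchmark check during the study gives a way to measure drift: a bias that changes between checks.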
Flow of error
• Capture error during data collection
• Determine error of other datasets
  – If unavailable, estimate the error
• Maintain error throughout processing
  – Error will increase
• Document final error in reports and metadata
Processing Error
• Error changes with processing
• The change depends on the operation and the type of error:
  – Min/Max
  – Average Error
  – Standard Error of the Mean
  – Standard Deviation
  – Confidence Intervals
Combining Bias
• Add/Subtract:
  – Bias(Bias1 + Bias2) = T – (Mean1*Num1 + Mean2*Num2)/(Num1 + Num2)
  – Simplified: (|Bias1| + |Bias2|)/2
• Multiply/Divide:
  – Bias(Bias1 * Bias2) = T – (Mean1*Mean2)
  – Simplified: |Bias1| * |Bias2|
• T is the truth; Num1 and Num2 are the number of measurements behind each mean
Derived by Jim Graham
Combining Standard Deviation
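• Add/Subtract:
  – StdDev = sqrt(StdDev1^2 + StdDev2^2)
• Multiply/Divide:
  – StdDev/Mean = sqrt((StdDev1/Mean1)^2 + (StdDev2/Mean2)^2)
  – Mean is the resulting product or quotient; multiply by it to get StdDev
• Both rules assume the two errors are independent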
http://www.rit.edu/cos/uphysics/uncertainties/Uncertaintiespart2.html
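A minimal Python sketch of these two rules, assuming the two measurements have independent errors. It reads the multiply/divide expression as a relative standard deviation, which is then scaled by the computed result to return to the original units. The function names and the plot dimensions are hypothetical.

```python
import math

def stddev_add_sub(sd1, sd2):
    """Standard deviation of a sum or difference of two independent values."""
    return math.sqrt(sd1**2 + sd2**2)

def stddev_mul_div(result, mean1, sd1, mean2, sd2):
    """Standard deviation of a product or quotient of two independent values.

    `result` is the computed product or quotient; the relative standard
    deviations combine in quadrature and are then scaled by |result|.
    """
    relative = math.sqrt((sd1 / mean1) ** 2 + (sd2 / mean2) ** 2)
    return abs(result) * relative

# Hypothetical plot: length 20.0 m (stddev 0.3), width 10.0 m (stddev 0.4)
length, length_sd = 20.0, 0.3
width, width_sd = 10.0, 0.4

sum_sd = stddev_add_sub(length_sd, width_sd)  # for length + width
area = length * width
area_sd = stddev_mul_div(area, length, length_sd, width, width_sd)
print(f"length + width: stddev {sum_sd:.2f} m")
print(f"area: {area:.0f} m^2, stddev {area_sd:.1f} m^2")
```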
Exact numbers
• Adding/Subtracting:
  – Error does not change
• Multiplying:
  – Multiply the error by the same number
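A quick numerical check of these two rules, using the standard deviation as the error measure. The readings and the constants 5 and 3 are hypothetical.

```python
import math
import statistics

readings = [10.2, 10.4, 10.3, 10.1, 10.5]  # hypothetical measurements
sd = statistics.stdev(readings)

# Adding an exact number shifts every value equally, so the spread is unchanged.
sd_shifted = statistics.stdev([r + 5 for r in readings])

# Multiplying by an exact number stretches the spread by the same factor.
sd_scaled = statistics.stdev([r * 3 for r in readings])

print(math.isclose(sd_shifted, sd, rel_tol=1e-9))     # spread unchanged
print(math.isclose(sd_scaled, 3 * sd, rel_tol=1e-9))  # spread tripled
```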
Significant Digits (Figures)
• How many significant digits are in:
  – 12
  – 12.00
  – 12.001
  – 12000
  – 0.0001
  – 0.00012
  – 123456789
• Only applies to measured values, not exact values (e.g. 2 oranges)
Significant Digits
• Cannot create precision:
  – 1.0 * 2.0 = 2.0
  – 12 * 11 = 130 (not 132)
  – 12.0 * 11 = 130 (still not 132)
  – 12.0 * 11.0 = 132
• Can keep digits for calculations, report with appropriate significant digits
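One way to follow this advice in code is to compute at full precision and round only when reporting. The helper below is a hypothetical sketch (`round_sig` is not a standard library function). It relies on Python's built-in `round`, which works on binary floats, so exact ties may not round the way hand arithmetic would.

```python
import math

def round_sig(value, digits):
    """Round value to the given number of significant digits."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))  # position of the leading digit
    return round(value, digits - 1 - exponent)

# Keep full precision while calculating, round when reporting.
product = 12 * 11
print(round_sig(product, 2))         # 130: two significant digits
print(round_sig(12.0 * 11.0, 3))     # 132.0: three significant digits
print(round_sig(0.00012345, 2))
```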
Rounding
• If you have 2 significant digits:
  – 1.11 -> ?
  – 1.19 -> ?
  – 1.14 -> ?
  – 1.16 -> ?
  – 1.15 -> ?
  – 1.99 -> ?
  – 1.155 -> ?
Quality Control/Assurance
• Calibrate “Instruments”
• Perform random checks on data
• Watch for “drift”
• Document all errors in Metadata!
Design of Sampling
• Random
• Stratified random
• Clustered
• Systematic
• Iterative
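Three of these designs can be sketched on a simple grid of candidate sites. This is an illustration only: the grid, the "west"/"east" strata, and the function names are hypothetical, and clustered and iterative designs are not covered here.

```python
import random

def simple_random(points, n, seed=0):
    """Random design: every point has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(points, n)

def systematic(points, step, start=0):
    """Systematic design: every `step`-th point, beginning at `start`."""
    return points[start::step]

def stratified_random(points, stratum_of, n_per_stratum, seed=0):
    """Stratified random design: a random sample drawn within each stratum."""
    rng = random.Random(seed)
    strata = {}
    for p in points:
        strata.setdefault(stratum_of(p), []).append(p)
    sample = []
    for members in strata.values():
        k = min(n_per_stratum, len(members))  # a small stratum may have fewer points
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical study area: a 10 x 10 grid of candidate sites
grid = [(x, y) for x in range(10) for y in range(10)]

random_sites = simple_random(grid, 30)
systematic_sites = systematic(grid, step=10)
stratified_sites = stratified_random(
    grid, stratum_of=lambda p: "west" if p[0] < 5 else "east", n_per_stratum=5
)
print(len(random_sites), len(systematic_sites), len(stratified_sites))
```

Fixing the seed makes the site selection repeatable, which helps when the sampling design has to be documented in the metadata.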
Number of Samples
• 30?
• Figure 2.7 from Environmental Monitoring and Characterization
Statistical Studies
• Is the sampling really random or uniform?
  – Bias
  – “Most data is collected near a road, a porta-potty, and a restaurant!” – Tom Stohlgren
Spatial Autocorrelation
• Used to determine type of sampling
Rounding
• If you have 2 significant digits:
  – 1.11 -> 1.1
  – 1.19 -> 1.2
  – 1.14 -> 1.1
  – 1.16 -> 1.2
  – 1.15 -> 1.2
  – 1.99 -> 2.0
  – 1.155 -> 1.2
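Rounding of values that end in 5 depends on the tie-breaking rule in use. The sketch below uses Python's `decimal` module so that values are handled as exact decimals; binary floats cannot represent most decimal fractions exactly, so a float that looks like a tie may not behave as one. The helper `round_to_2_sig` is hypothetical and only handles values between 1 and 10.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN

def round_to_2_sig(text, rounding=ROUND_HALF_UP):
    """Round a value between 1 and 10, given as a string, to 2 significant digits."""
    return Decimal(text).quantize(Decimal("0.1"), rounding=rounding)

values = ["1.11", "1.19", "1.14", "1.16", "1.15", "1.99", "1.155"]
rounded = {v: str(round_to_2_sig(v)) for v in values}
for v, r in rounded.items():
    print(f"{v} -> {r}")

# The two rules differ only on exact ties:
half_up = str(round_to_2_sig("1.25", ROUND_HALF_UP))      # tie goes up
half_even = str(round_to_2_sig("1.25", ROUND_HALF_EVEN))  # tie goes to the even digit
print(half_up, half_even)
```

Whichever rule is chosen, stating it in the protocol keeps rounding consistent across observers.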