statistical performances measures - models comparison · 2012-01-25 · statistical performances...
Statistical performance measures - models comparison
L. Patryl^a, D. Galeriu^a ...
a Commissariat à l'Énergie Atomique, DAM, DIF, F-91297 Arpajon (France)
b "Horia Hulubei" Institute for Physics & Nuclear Engineering (Romania)
September 12th, 2011
CEA, L. Patryl, D. Galeriu ..., September 12th, 2011
OUTLINE
1 Statistical performance measure
2 Simple statistical analysis on wheat experiments
3 Conclusions
INTRODUCTION
To compare model predictions with observed measurements, several statistical performance measures can be used (U.S. Environmental Protection Agency).
Some of these performance measures are:
the fractional bias (FB);
the geometric mean bias (MG);
the normalized mean square error (NMSE);
the geometric variance (VG);
the correlation coefficient (R);
the fraction of predictions within a factor of two of observations (FAC2).
A perfect model would have:
MG, VG, R and FAC2 = 1.0;
FB and NMSE = 0.0.
Systematic errors
FB and MG are measures of mean bias and indicate only systematic errors, which lead the model to always underestimate or always overestimate the measured values.
FB is based on a linear scale, and the systematic bias refers to the arithmetic difference between Cp and Co (the predicted and observed concentrations).
MG is based on a logarithmic scale, and the systematic bias refers to the ratio of Cp to Co.
$$ FB = \frac{\sum_i \left( C_{o,i} - C_{p,i} \right)}{0.5 \sum_i \left( C_{o,i} + C_{p,i} \right)} = FB_{FN} - FB_{FP} $$
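As a rough illustration, the fractional bias can be computed directly from paired observed and predicted values. The Python sketch below is not taken from the presentation: the function name `fractional_bias` and the sample data are assumptions. The slide names the two components FB_FN and FB_FP without defining them; the sketch assumes the usual reading, in which FB_FN collects the underpredicted pairs (Co > Cp) and FB_FP the overpredicted pairs (Cp > Co).

```python
def fractional_bias(obs, pred):
    """Return (FB, FB_FN, FB_FP) for paired observed and predicted values."""
    if len(obs) != len(pred) or not obs:
        raise ValueError("obs and pred must be non-empty and of equal length")
    denom = 0.5 * sum(o + p for o, p in zip(obs, pred))
    # Assumed definition: underpredicted part, only pairs with Co > Cp contribute.
    fb_fn = sum(max(o - p, 0.0) for o, p in zip(obs, pred)) / denom
    # Assumed definition: overpredicted part, only pairs with Cp > Co contribute.
    fb_fp = sum(max(p - o, 0.0) for o, p in zip(obs, pred)) / denom
    return fb_fn - fb_fp, fb_fn, fb_fp


# Hypothetical data: every prediction is half of the observed value.
obs = [2.0, 4.0, 8.0]
pred = [1.0, 2.0, 4.0]
fb, fb_fn, fb_fp = fractional_bias(obs, pred)
print(round(fb, 2))  # 0.67
```

With this sign convention (Co minus Cp in the numerator), a positive FB means the model underpredicts on average.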
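$$ MG = \exp\left( \overline{\ln C_o} - \overline{\ln C_p} \right) $$
where the overbar denotes the average over the dataset.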
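A minimal Python sketch of the geometric mean bias, MG = exp(mean(ln Co) - mean(ln Cp)), is given here for illustration. The function name `geometric_mean_bias` and the data are assumptions, not part of the presentation.

```python
import math


def geometric_mean_bias(obs, pred):
    """MG = exp(mean(ln Co) - mean(ln Cp)); all values must be strictly positive."""
    if len(obs) != len(pred) or not obs:
        raise ValueError("obs and pred must be non-empty and of equal length")
    if min(obs) <= 0 or min(pred) <= 0:
        raise ValueError("MG is undefined for zero or negative values")
    n = len(obs)
    mean_ln_obs = sum(math.log(o) for o in obs) / n
    mean_ln_pred = sum(math.log(p) for p in pred) / n
    return math.exp(mean_ln_obs - mean_ln_pred)


# Hypothetical data: every prediction is half of the observed value.
obs = [2.0, 4.0, 8.0]
pred = [1.0, 2.0, 4.0]
mg = geometric_mean_bias(obs, pred)
print(round(mg, 6))  # 2.0
```

Because MG works on logarithms, a constant ratio between Co and Cp gives the same MG whatever the magnitude of the values.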
Random errors
Random error is due to unpredictable fluctuations: its value cannot be predicted for any single measurement.
Values are scattered about the true value, and the errors tend to have a zero arithmetic mean when the measurement is repeated.
NMSE and VG are measures of scatter and reflect both systematic and unsystematic (random) errors.
$$ NMSE = \frac{\overline{\left( C_o - C_p \right)^2}}{\overline{C_o}\,\overline{C_p}} $$
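The normalized mean square error can be sketched in a few lines of Python, as the mean squared difference divided by the product of the two means. The function name `nmse` and the sample values are assumptions made for illustration only.

```python
def nmse(obs, pred):
    """NMSE = mean((Co - Cp)^2) / (mean(Co) * mean(Cp))."""
    if len(obs) != len(pred) or not obs:
        raise ValueError("obs and pred must be non-empty and of equal length")
    n = len(obs)
    mean_sq_err = sum((o - p) ** 2 for o, p in zip(obs, pred)) / n
    mean_obs = sum(obs) / n
    mean_pred = sum(pred) / n
    return mean_sq_err / (mean_obs * mean_pred)


# Hypothetical data: every prediction is half of the observed value.
obs = [2.0, 4.0, 8.0]
pred = [1.0, 2.0, 4.0]
value = nmse(obs, pred)
print(round(value, 3))  # 0.643
```

Swapping `obs` and `pred` gives the same result, which is why NMSE alone cannot tell underprediction from overprediction.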
$$ VG = \exp\left[ \overline{\left( \ln C_o - \ln C_p \right)^2} \right] $$
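For illustration, the geometric variance, VG = exp(mean((ln Co - ln Cp)^2)), can be computed as in the Python sketch below. The function name `geometric_variance` and the data are assumptions, not part of the presentation.

```python
import math


def geometric_variance(obs, pred):
    """VG = exp(mean((ln Co - ln Cp)^2)); all values must be strictly positive."""
    if len(obs) != len(pred) or not obs:
        raise ValueError("obs and pred must be non-empty and of equal length")
    if min(obs) <= 0 or min(pred) <= 0:
        raise ValueError("VG is undefined for zero or negative values")
    sq_log_diffs = [(math.log(o) - math.log(p)) ** 2 for o, p in zip(obs, pred)]
    return math.exp(sum(sq_log_diffs) / len(sq_log_diffs))


# Hypothetical data: every prediction is half of the observed value.
obs = [2.0, 4.0, 8.0]
pred = [1.0, 2.0, 4.0]
vg = geometric_variance(obs, pred)
print(round(vg, 2))  # 1.62
```

The squared log difference removes the sign, so a factor-of-two overprediction would give the same VG.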
Correlation coefficient R
R reflects the linear relationship between two variables.
It is insensitive to either an additive or a multiplicative factor.
A perfect correlation coefficient is only a necessary, but not sufficient, condition for a perfect model.
For example, a scatter plot might show generally poor agreement; however, the presence of a good match for a few extreme pairs will greatly improve R.
For these reasons, it is better to avoid using R.
$$ R = \frac{\overline{\left( C_o - \overline{C_o} \right)\left( C_p - \overline{C_p} \right)}}{\sigma_{C_o}\,\sigma_{C_p}} $$
where σ_Co and σ_Cp are the standard deviations of the observed and predicted values.
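The insensitivity of R to additive and multiplicative factors can be checked with a small Python sketch. The function name `correlation` and the data are assumptions; the hypothetical model below is strongly biased (Cp = 3 Co + 10) yet still reaches R = 1.

```python
import math


def correlation(obs, pred):
    """Pearson correlation coefficient between observed and predicted values."""
    if len(obs) != len(pred) or len(obs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred)) / n
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred) / n)
    return cov / (sd_o * sd_p)


# Hypothetical, badly biased model: three times the observation plus ten.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [3.0 * o + 10.0 for o in obs]
r = correlation(obs, pred)
print(round(r, 6))  # 1.0
```

Every prediction here is far from its observation, which shows why R = 1 is necessary but not sufficient for a perfect model.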
FAC2
FAC2 is the most robust measure, because it is not overly influenced by high and low outliers.
$$ FAC2 = \text{fraction of data that satisfy } 0.5 \le \frac{C_p}{C_o} \le 2.0 $$
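A short Python sketch of FAC2 follows. The function name `fac2` and the data are assumptions. The test is written as 0.5 Co <= Cp <= 2 Co rather than as a ratio, a choice made here to avoid dividing by a zero observation; the presentation does not say how such pairs should be handled.

```python
def fac2(obs, pred):
    """Fraction of pairs with 0.5 <= Cp/Co <= 2.0, written without a division."""
    if len(obs) != len(pred) or not obs:
        raise ValueError("obs and pred must be non-empty and of equal length")
    within = sum(1 for o, p in zip(obs, pred) if 0.5 * o <= p <= 2.0 * o)
    return within / len(obs)


# Hypothetical data with one extreme outlier (100 against an observation of 1).
obs = [1.0, 1.0, 1.0, 1.0]
pred = [0.4, 0.5, 2.0, 100.0]
print(fac2(obs, pred))  # 0.5
```

The outlier counts as a single miss, exactly like the mild miss at 0.4, which is the sense in which FAC2 is robust.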
Properties of performance measures
Multiple performance measures have to be considered.
The advantages of each performance measure are partly determined by the distribution of the variable.
For a log-normal distribution, MG and VG provide a more balanced treatment of extremely high and low values.
MG and VG are more appropriate for datasets where both predicted and observed concentrations vary by many orders of magnitude.
However, MG and VG are strongly influenced by extremely low values and are undefined for zero values. It is therefore necessary to impose a minimum threshold on the data, which can be the limit of detection (LOD): if Cp or Co is lower than the threshold, it is set to the LOD.
FB and NMSE are strongly influenced by infrequently occurring high observed and predicted concentrations.
FAC2 is the most robust measure, because it is not overly influenced by high and low outliers.
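The thresholding step described above can be sketched in Python as follows. The helper names `clip_to_lod` and `geometric_mean_bias`, the LOD value and the data are all assumptions chosen for illustration; in practice the threshold would come from the measurement method.

```python
import math


def clip_to_lod(values, lod):
    """Replace every value below the threshold by the threshold itself."""
    return [max(v, lod) for v in values]


def geometric_mean_bias(obs, pred):
    """MG = exp(mean(ln Co) - mean(ln Cp)); requires strictly positive values."""
    n = len(obs)
    mean_ln_obs = sum(math.log(o) for o in obs) / n
    mean_ln_pred = sum(math.log(p) for p in pred) / n
    return math.exp(mean_ln_obs - mean_ln_pred)


LOD = 0.5                 # hypothetical limit of detection, same units as the data
obs = [0.0, 5.0, 10.0]    # the zero would make ln(Co) undefined
pred = [0.2, 4.0, 12.0]   # 0.2 is below the hypothetical LOD

obs_c = clip_to_lod(obs, LOD)
pred_c = clip_to_lod(pred, LOD)
mg = geometric_mean_bias(obs_c, pred_c)
print(obs_c, pred_c, mg)
```

After clipping, the first pair becomes (0.5, 0.5) and no longer dominates the logarithmic average.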
Interpretation of performance measures
FB is symmetrical and bounded. With FB defined from Co - Cp, its values range between -2.0 (extreme overprediction) and +2.0 (extreme underprediction).
FB is a dimensionless number, which is convenient for comparing results from studies involving different concentration levels.
An FB of +0.67 is equivalent to underprediction by a factor of two.
An FB of -0.67 is equivalent to overprediction by a factor of two.
Model predictions with an FB of 0 are relatively free from bias.
With MG defined from ln Co - ln Cp, an MG of 0.5 is equivalent to overprediction by a factor of two.
An MG of 2 is equivalent to underprediction by a factor of two.
An NMSE of 0.5 corresponds to an equivalent factor-of-two mean bias; it does not indicate whether that bias is underprediction or overprediction.
A VG of 1.6 corresponds to an equivalent factor-of-two mean bias; it does not indicate whether that bias is underprediction or overprediction.
The ratio of the mean predicted value to the mean observed value follows from FB:
$$ \frac{\overline{C_p}}{\overline{C_o}} = \frac{1 - 0.5\,FB}{1 + 0.5\,FB} $$
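The factor-of-two reference values can be checked numerically. The Python sketch below assumes FB is defined from Co - Cp and VG from the squared log difference, as in the formulas of this presentation; the function name `mean_ratio_from_fb` is hypothetical.

```python
import math


def mean_ratio_from_fb(fb):
    """Return mean(Cp)/mean(Co) implied by a fractional bias, for -2 < FB <= 2."""
    if not -2.0 < fb <= 2.0:
        raise ValueError("FB must lie in the interval (-2, 2]")
    return (1.0 - 0.5 * fb) / (1.0 + 0.5 * fb)


print(round(mean_ratio_from_fb(2.0 / 3.0), 3))   # 0.5  (predictions half of observations)
print(round(mean_ratio_from_fb(-2.0 / 3.0), 3))  # 2.0  (predictions twice the observations)
print(round(math.exp(math.log(2.0) ** 2), 2))    # 1.62 (VG for a constant factor of two)
```

The value 0.67 quoted for FB is 2/3 rounded, and the value 1.6 quoted for VG is exp((ln 2)^2) rounded.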
![Page 36: Statistical Performances measures - models comparison · 2012-01-25 · Statistical Performances measures - models comparison L Patryla, D. Galeriua... a Commissariat `a l’Energie](https://reader033.vdocuments.site/reader033/viewer/2022041814/5e59872e7b3d5d5d756e1f71/html5/thumbnails/36.jpg)
Interpretation of Performance measures
Interpretation of Performance measures.
FB is symmetrical and bounded; values for the fractional bias range between -2.0 (extremeunderprediction) to +2.0 (extreme overprediction)
The fractional bias is a dimensionless number, which is convenient for comparing the results fromstudies involving different concentration levels
Values of the FB that are equal to -0.67 are equivalent to underprediction by a factor of two
Values of the FB that are equal to +0.67 are equivalent to overprediction by a factor of two
Model predictions with a fractional bias of 0 (zero) are relatively free from bias
Values of the MG that are equal to +0.5 are equivalent to underprediction by a factor of two
values of the MG that are equal to +2 are equivalent to overprediction by a factor of two
Value of NMSE that are equal to 0.5 corresponds to an equivalent factor of two mean bias
It doesn’t differentiate whether the factor of two mean bias is underprediction or overprediction
Value of VG that are equal to 1.6 corresponds to an equivalent factor of two mean bias
It doesn’t differentiate whether the factor of two mean bias is underprediction or overprediction
Cp
Co= 1−0.5FB
1+0.5FB
CEA L Patryla , D. Galeriua ... September, 12th 2011 10 / 22
$$\frac{\langle C_p\rangle}{\langle C_o\rangle} = \frac{1}{MG}$$
$$\frac{C_p}{C_o} = \frac{2 + NMSE \pm \sqrt{(2 + NMSE)^2 - 4}}{2}$$
$$\frac{\langle C_p\rangle}{\langle C_o\rangle} = \exp\left[\pm\sqrt{\ln VG}\right]$$
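The four conversions on this slide, from a performance measure to the equivalent mean ratio of predictions to observations, can be sketched in Python as follows. This is only an illustration of the formulas quoted above; the function names are hypothetical and do not come from the presentation.

```python
import math


def ratio_from_fb(fb):
    """Equivalent Cp/Co for a fractional bias FB, with -2 < FB < 2."""
    return (1.0 - 0.5 * fb) / (1.0 + 0.5 * fb)


def ratio_from_mg(mg):
    """Equivalent <Cp>/<Co> for a geometric mean bias MG > 0."""
    return 1.0 / mg


def ratios_from_nmse(nmse):
    """Both equivalent Cp/Co values (under, over) for NMSE >= 0.

    NMSE cannot tell under- from overprediction, hence two roots.
    """
    root = math.sqrt((2.0 + nmse) ** 2 - 4.0)
    return ((2.0 + nmse - root) / 2.0, (2.0 + nmse + root) / 2.0)


def ratios_from_vg(vg):
    """Both equivalent <Cp>/<Co> values (under, over) for VG >= 1."""
    s = math.sqrt(math.log(vg))
    return (math.exp(-s), math.exp(s))


if __name__ == "__main__":
    print(ratio_from_fb(2.0 / 3.0))   # factor-of-two underprediction
    print(ratio_from_mg(2.0))         # factor-of-two underprediction
    print(ratios_from_nmse(0.5))      # both factor-of-two ratios
    print(ratios_from_vg(1.6))        # close to a factor of two either way
```

Returning a pair for NMSE and VG makes the ambiguity explicit: these two measures give the size of the equivalent bias but not its direction.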
Model acceptance criteria
How good is good enough?
The fraction of predictions within a factor of 2 of the observations is about 50% or greater (FAC2 > 0.5)
The mean bias is within ±30% of the mean (|FB| < 0.3 or 0.7 < MG < 1.3)
Random scatter is about a factor of two to three of the mean (NMSE < 1.5 or VG < 4)
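As a minimal sketch, the three criteria above can be written as a small check. The function name and the returned dictionary keys are assumptions made for illustration; the thresholds are the ones listed on the slide, and a measure that is not supplied simply does not count toward its criterion.

```python
def acceptance_checks(fac2, fb=None, mg=None, nmse=None, vg=None):
    """Evaluate the three acceptance criteria listed on the slide.

    Bias passes if |FB| < 0.3 or 0.7 < MG < 1.3.
    Scatter passes if NMSE < 1.5 or VG < 4.
    """
    bias_ok = (fb is not None and abs(fb) < 0.3) or (
        mg is not None and 0.7 < mg < 1.3
    )
    scatter_ok = (nmse is not None and nmse < 1.5) or (
        vg is not None and vg < 4.0
    )
    return {"fac2": fac2 > 0.5, "bias": bias_ok, "scatter": scatter_ok}


if __name__ == "__main__":
    # Illustrative values only
    print(acceptance_checks(fac2=0.6, fb=0.1, nmse=0.4))
```

Keeping the three results separate, rather than returning a single pass/fail flag, shows which aspect of performance (FAC2, bias or scatter) a model misses.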
HTO IN WHEAT LEAF (1/3)
Difficult to say which model is better
Difficult to say whether the models overpredict or underpredict
[Figure: predicted versus measured HTO concentration in wheat leaf (Bq.l-1) for the CERES, IFIN and JAEA models; both axes run from 0 to 4×10^8 Bq.l-1]
HTO IN WHEAT LEAF (2/3)
61 experiments
3 models (CEA, JAEA, IFIN)
Some values equal 0; without a detection threshold or other information, only the arithmetic-scale measures (FB and NMSE) are used
NMSE exceeds its factor-of-two equivalent (0.5) for CEA and JAEA (random and systematic errors)
Only about 30% of the values are within a factor of 2 of the observations

| Model | NMSE (factor 2: 0.5) | FB (factor 2: ±2/3) | FAC2 | R |
|---|---|---|---|---|
| CEA | 0.7 | 0.16 | 0.31 | 0.858 |
| JAEA | 1.13 | 0.26 | 0.30 | 0.818 |
| IFIN | 0.42 | 0.15 | 0.36 | 0.912 |
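The restriction to arithmetic-scale measures when some values are zero can be illustrated with a short sketch. The definitions below are assumptions: they are the usual forms consistent with the relations Cp/Co = (1 - 0.5 FB)/(1 + 0.5 FB) and ⟨Cp⟩/⟨Co⟩ = 1/MG quoted in this presentation, and the function name, the dictionary keys and the treatment of a pair of zeros in FAC2 are hypothetical choices.

```python
import math


def performance_measures(observed, predicted):
    """Return FB, NMSE, FAC2 and, when all values are > 0, MG and VG.

    Assumed definitions (Co observed, Cp predicted):
      FB   = (mean(Co) - mean(Cp)) / (0.5 * (mean(Co) + mean(Cp)))
      NMSE = mean((Co - Cp)^2) / (mean(Co) * mean(Cp))
      FAC2 = fraction of pairs with 0.5 * Co <= Cp <= 2 * Co
      MG   = exp(mean(ln Co) - mean(ln Cp))
      VG   = exp(mean((ln Co - ln Cp)^2))
    """
    n = len(observed)
    pairs = list(zip(observed, predicted))
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n

    fb = (mean_o - mean_p) / (0.5 * (mean_o + mean_p))
    nmse = sum((o - p) ** 2 for o, p in pairs) / n / (mean_o * mean_p)
    # Written without a division so that zero observations do not fail;
    # a (0, 0) pair counts as within a factor of two (an assumption).
    fac2 = sum(1 for o, p in pairs if 0.5 * o <= p <= 2.0 * o) / n

    result = {"FB": fb, "NMSE": nmse, "FAC2": fac2, "MG": None, "VG": None}
    # Logarithmic measures are undefined as soon as one value is zero.
    if all(o > 0 and p > 0 for o, p in pairs):
        log_diff = [math.log(o) - math.log(p) for o, p in pairs]
        result["MG"] = math.exp(sum(log_diff) / n)
        result["VG"] = math.exp(sum(d * d for d in log_diff) / n)
    return result


if __name__ == "__main__":
    # Made-up numbers, not the wheat data
    print(performance_measures([2.0, 4.0, 6.0], [1.0, 2.0, 3.0]))
    print(performance_measures([0.0, 4.0], [1.0, 2.0]))
```

With a data set containing zeros, MG and VG come back as `None`, which mirrors the choice made on this slide to report only FB and NMSE.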
HTO IN WHEAT LEAF (3/3)
All models tend to underestimate the activity in leaf (by less than a factor of 2)
Probably due to very low values
[Figure: NMSE (normalized mean square error, axis from 1 to 8) versus FB (fractional bias, axis from -2 to 2) with 95% confidence limits for FB, for the TFWT / Tritium model results (legend entries: CERES, CERES, JAEA); reference curve for a ± factor-of-two mean bias; underprediction and overprediction regions marked]
OBT IN GRAIN WHEAT (1/4)
IFIN and JAEA seem to underpredict OBT at the end of harvest, but by how much?
Difficult to say which model is better
[Figure: predicted versus measured OBT in wheat grain (kBq.kg-1) for the CERES, IFIN and JAEA models; both axes run from 0 to 250 kBq.kg-1]
OBT IN GRAIN WHEAT (2/4)
14 experiments at the end of harvest
3 models (CEA, JAEA, IFIN)
Both arithmetic and logarithmic scales are used (they give about the same results)
NMSE and VG exceed their factor-of-two equivalents for all models (random and systematic errors)
All models underpredict (by more than a factor of 2 for JAEA)

| Model | NMSE (factor 2: 0.5) | FB (factor 2: ±2/3) | FAC2 | R |
|---|---|---|---|---|
| CEA | 0.7 | 0.4 | 0.5 | 0.41 |
| JAEA | 1.8 | 1.0 | 0.07 | 0.86 |
| IFIN | 0.8 | 0.7 | 0.5 | 0.66 |

| Model | VG (factor 2: 1.6) | MG (factor 2: 2.0 or 0.5) | FAC2 | R |
|---|---|---|---|---|
| CEA | 2.1 | 1.8 | 0.5 | 0.76 |
| JAEA | 15.2 | 4.0 | 0.07 | 0.61 |
| IFIN | 1.8 | 1.9 | 0.5 | 0.89 |
OBT IN GRAIN WHEAT (3/4)
CEA and IFIN models tend to underestimate the activity in grain (by less than a factor of 2); JAEA underestimates by about a factor of 3
Probably due to very low values
[Figure: NMSE (normalized mean square error, axis from 1 to 8) versus FB (fractional bias, axis from -2 to 2) with 95% confidence limits for FB, for OBT in grain from the CERES, IFIN and JAEA tritium models; reference curve for a ± factor-of-two mean bias; underprediction and overprediction regions marked]
OBT IN GRAIN WHEAT (4/4)
CEA and IFIN models tend to underestimate the activity in grain (by less than a factor of 2); JAEA underestimates by about a factor of 4
Random scatter is less than a factor of 3 for CEA and IFIN, and about a factor of 5 for JAEA
[Figure: VG (logarithmic axis from 2 to 16) versus MG (logarithmic axis from 0.25 to 4) with 95% confidence limits, for OBT from the CERES, JAEA and IFINH tritium models; reference curve for a ± factor-of-two mean bias; underprediction and overprediction regions marked]
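As a rough check of the statements on this slide, the relations ⟨Cp⟩/⟨Co⟩ = 1/MG and ⟨Cp⟩/⟨Co⟩ = exp[±√(ln VG)] can be applied to the MG and VG values tabulated for OBT in grain. The numbers below are rounded and are only a worked illustration of those two formulas.

$$
\begin{aligned}
\text{CEA:}\quad & \frac{1}{MG} = \frac{1}{1.8} \approx 0.56, & \exp\sqrt{\ln 2.1} &\approx 2.4 \\
\text{IFIN:}\quad & \frac{1}{MG} = \frac{1}{1.9} \approx 0.53, & \exp\sqrt{\ln 1.8} &\approx 2.2 \\
\text{JAEA:}\quad & \frac{1}{MG} = \frac{1}{4.0} = 0.25, & \exp\sqrt{\ln 15.2} &\approx 5.2
\end{aligned}
$$

The mean ratios correspond to underprediction by a little less than a factor of 2 for CEA and IFIN and by a factor of 4 for JAEA. The scatter factors are below 3 for CEA and IFIN and close to 5 for JAEA.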
CONCLUSIONS (1/2)
Statistical analysis can greatly help model comparison
Performance measures have to be used to compare predictions with observations
In the case of wheat, all models have systematic errors
HTO modelling in wheat leaf seems good for the 3 models
Systematic errors: $C_p/C_o$ = 0.76 (JAEA), 0.86 (IFIN and CEA)
OBT modelling in wheat grain seems to underpredict for all models
Systematic errors: $C_p/C_o$ = 0.3 (JAEA), 0.48 (IFIN), 0.7 (CEA)
CONCLUSIONS (2/2)
DO THE MODELS MEET THE ACCEPTANCE CRITERIA?

HTO Leaf
Test / models                                                        CEA    IFIN   JAEA
FAC2 > 0.5                                                           no     no     no
Mean bias within ±30% of the mean (|FB| < 0.3 or 0.7 < MG < 1.3)     ok     ok     ok
Random scatter (NMSE < 1.5 or VG < 4)                                ok     ok     ok
Acceptance                                                           ok ?   ok ?   ok ?

OBT Grain
Test / models                                                        CEA    IFIN   JAEA
FAC2 > 0.5                                                           ok     ok     no
Mean bias within ±30% of the mean (|FB| < 0.3 or 0.7 < MG < 1.3)     no     no     no
Random scatter (NMSE < 1.5 or VG < 4)                                ok     ok     no
Acceptance                                                           ok ?   ok ?   no
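
The three tests in the tables above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: it assumes the usual definitions of FB, MG, NMSE, VG and FAC2 for paired observed (Co) and predicted (Cp) values, with the sign convention FB = (mean Co - mean Cp) / (0.5 (mean Co + mean Cp)). The function names `performance_measures` and `acceptance_tests` are hypothetical. The overall "Acceptance" row is deliberately not computed, because the slides do not state how the three tests are combined.

```python
import math
from statistics import fmean


def performance_measures(observed, predicted):
    """Return FB, MG, NMSE, VG and FAC2 for paired, strictly positive values."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    if min(observed) <= 0 or min(predicted) <= 0:
        raise ValueError("MG and VG need strictly positive values")

    pairs = list(zip(observed, predicted))
    mean_o, mean_p = fmean(observed), fmean(predicted)
    # ln Co - ln Cp for each pair (assumed convention: observed minus predicted)
    log_diff = [math.log(o) - math.log(p) for o, p in pairs]

    return {
        "FB": (mean_o - mean_p) / (0.5 * (mean_o + mean_p)),
        "MG": math.exp(fmean(log_diff)),
        "NMSE": fmean([(o - p) ** 2 for o, p in pairs]) / (mean_o * mean_p),
        "VG": math.exp(fmean([d * d for d in log_diff])),
        # Fraction of predictions within a factor of 2 of the observation
        "FAC2": fmean([1.0 if 0.5 <= p / o <= 2.0 else 0.0 for o, p in pairs]),
    }


def acceptance_tests(m):
    """Apply the thresholds listed on the slide to a dict of measures."""
    return {
        "FAC2": m["FAC2"] > 0.5,
        "mean bias": abs(m["FB"]) < 0.3 or 0.7 < m["MG"] < 1.3,
        "random scatter": m["NMSE"] < 1.5 or m["VG"] < 4.0,
    }


if __name__ == "__main__":
    # Made-up numbers for illustration only, not data from the wheat experiments.
    co = [1.0, 2.0, 3.0, 4.0]
    cp = [0.8, 1.5, 2.9, 3.0]
    measures = performance_measures(co, cp)
    print(measures)
    print(acceptance_tests(measures))
```

Returning one boolean per test, rather than a single verdict, mirrors the layout of the tables and leaves the final judgement to the reader.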