A STATISTICAL COMPARISON OF AMPS 10-KM AND 3.3-KM DOMAINS
Michael G. Duda, Kevin W. Manning, and Jordan G. Powers
Mesoscale and Microscale Meteorology Division, NCAR
AMPS Users’ Workshop 2004
June 8-10, 2004
Introduction
• Purpose:
– Demonstrate the usefulness of statistical significance testing in comparing biases of two domains
– Determine where biases at McMurdo Station are significantly different in the 3.3-km and 10-km AMPS domains
– Examine a 7-day period beginning 12Z Nov. 27, 2003, when McMurdo Station was affected by a snowstorm
• Methodology:
– Use hypothesis testing to identify statistically significant differences in mean bias
– Consider only differences that are statistically significant
Domain Configuration
Compaq OSF/Alpha Linux/Xeon (SPAWAR machine)
Forecast Analysis Times
Why Consider Statistical Significance?
• Mean bias curves do not indicate the variance in the biases
• Some differences between the curves are less relevant than others
Hypothesis Testing
• Consider biases to be from a hypothetical population (assumed to be normally distributed)
• Let d = x3.3 – x10
– x3.3 and x10 are biases in 3.3-km and 10-km domains at a given time
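• Perform one-sample Student’s t test
• H0: d = 0 (the mean difference is zero)
• Test statistic: $t = \dfrac{\bar{d} - 0}{s / \sqrt{n}}$, where $\bar{d}$ is the sample mean of d, $s$ is the sample standard deviation of d, and $n$ is the number of samples
• Reject H0 with 95% confidence if $|t| > t_c$, where $t_c$ is the critical value of Student’s t for $n - 1$ degrees of freedom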
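The test on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the AMPS statistics: the function name `t_statistic`, the six bias values, and the choice of a two-sided test are all assumptions made for the example. The critical value 2.571 is the standard two-sided 5 percent value of Student's t for 5 degrees of freedom.

```python
import math
import statistics


def t_statistic(bias_33, bias_10):
    """Return (t, degrees of freedom) for H0: mean(d) = 0, with d = x3.3 - x10."""
    d = [a - b for a, b in zip(bias_33, bias_10)]
    n = len(d)
    d_bar = statistics.mean(d)
    s = statistics.stdev(d)  # sample standard deviation (n - 1 in the denominator)
    return d_bar / (s / math.sqrt(n)), n - 1


# Hypothetical biases (deg C) at one level and forecast hour; invented for illustration.
bias_33 = [1.2, 0.8, 1.5, 0.9, 1.1, 1.3]
bias_10 = [-0.4, -0.1, 0.2, -0.6, -0.3, 0.1]

t, dof = t_statistic(bias_33, bias_10)

# Two-sided 5 percent critical value of Student's t for 5 degrees of freedom.
T_CRIT_DF5 = 2.571
reject_h0 = abs(t) > T_CRIT_DF5
print(f"t = {t:.2f}, dof = {dof}, reject H0: {reject_h0}")
```

For real sample sizes the critical value would need to match the actual degrees of freedom; a statistics library such as SciPy (`scipy.stats.ttest_1samp`) can supply the p-value directly.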
Hypothesis Testing Example
Circled pressure levels will be examined in the next two slides
Example: 150 hPa Temperature
differences between curves
• For this data we can reject the null hypothesis at the 5 percent level
• This means we reject the hypothesis that the means of the 3.3-km and 10-km bias populations are the same
Example: 850 hPa Temperature
differences between curves
• For this data we cannot reject the null hypothesis at the 5 percent level
• This means we cannot reject the hypothesis that the 3.3-km and 10-km bias populations have the same mean
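The contrast between the 150 hPa and 850 hPa examples comes down to the spread of the differences, not just their mean. The sketch below uses two invented sets of differences with the same mean (1.0) but very different variance; the numbers, the helper `t_stat`, and the two-sided critical value for 5 degrees of freedom are assumptions chosen for illustration and are not taken from the AMPS data.

```python
import math
import statistics


def t_stat(d):
    """One-sample t statistic for H0: mean(d) = 0."""
    return statistics.mean(d) / (statistics.stdev(d) / math.sqrt(len(d)))


# Two-sided 5 percent critical value of Student's t for 5 degrees of freedom.
T_CRIT_DF5 = 2.571

# Hypothetical differences d = x3.3 - x10; both sets have mean 1.0.
d_tight = [0.9, 1.1, 1.0, 1.2, 0.8, 1.0]    # small spread
d_wide = [4.0, -3.0, 5.0, -2.0, 3.0, -1.0]  # large spread

for name, d in (("tight", d_tight), ("wide", d_wide)):
    t = t_stat(d)
    verdict = "reject H0" if abs(t) > T_CRIT_DF5 else "cannot reject H0"
    print(f"{name}: mean = {statistics.mean(d):.2f}, t = {t:.2f}, {verdict}")
```

Both sets would plot as the same gap between the two mean bias curves, yet only the low-variance set supports rejecting the null hypothesis.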
Comparison Results: Temperature
• Statistically significant differences
– Surface: 3.3-km grid has warm bias while 10-km grid has a cool bias at hours 24, 36
– 925 hPa: 3.3-km grid has warm bias while 10-km grid has a cool bias at hours 24, 36
– 300 hPa: 3.3-km grid has larger warm bias than 10-km grid
• No statistically significant differences
– At hours 24 and 36, no significant differences in MAE at any level
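The slides do not define the two scores, so the sketch below assumes the usual definitions: mean bias is the average of forecast minus observation, and MAE (mean absolute error) is the average of the absolute value of that error. The function names and the temperature values are hypothetical and serve only to show why the two scores can disagree.

```python
def mean_bias(forecasts, observations):
    """Average signed error (forecast minus observation)."""
    errors = [f - o for f, o in zip(forecasts, observations)]
    return sum(errors) / len(errors)


def mean_absolute_error(forecasts, observations):
    """Average magnitude of the error, ignoring its sign."""
    errors = [abs(f - o) for f, o in zip(forecasts, observations)]
    return sum(errors) / len(errors)


# Hypothetical temperatures (deg C); invented for illustration.
forecasts = [-5.0, -3.0, -4.0, -6.0]
observations = [-4.0, -4.0, -5.0, -5.0]

bias = mean_bias(forecasts, observations)
mae = mean_absolute_error(forecasts, observations)
print(f"mean bias = {bias:.1f}, MAE = {mae:.1f}")
```

Warm and cool errors cancel in the mean bias but not in the MAE, which is one way two grids can differ significantly in bias while showing no significant difference in MAE.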
24hr Temperature (Mean Bias)
36hr Temperature (Mean Bias)
24hr Temperature (MAE)
Comparison Results: Wind U-Component
• Statistically significant differences
– Surface: 3.3-km grid has lower positive bias than 10-km grid at forecast hours 12, 24, 36
– 850 hPa: 3.3-km grid has larger negative bias at forecast hours 12, 24, 36
– 500 hPa: 3.3-km grid has smaller bias, but MAEs of both grids are similarly large
• Differences at other levels are not statistically significant
24hr Wind U-Component (Mean Bias)
36hr Wind U-Component (Mean Bias)
24hr Wind U-Component (MAE)
Example: Surface Temperature
35 hr forecast valid 23Z Dec 01, 2003
10-km domain 3.3-km domain
Summary
• Use a Student’s t test (at 5 percent level) to perform statistical significance testing on difference between 3.3-km and 10-km biases
• Identify statistically significant differences on model bias vs. pressure plots for McMurdo
• Consider only statistically significant differences between mean biases to improve objectivity
– Apparently large differences in mean bias may be statistically insignificant and misleading
Questions?
Hypothesis Testing Example
* Biases at these pressure levels will be examined in the following slides
Example: 400 hPa Wind V-Component
differences between curves
For this data we cannot reject the null hypothesis at the 5 percent level
Example: 925 hPa Wind V-Component
differences between curves
For this data we can reject the null hypothesis at the 5 percent level