Matthew Greenstein | METEO 485 | Apr. 26, 2004
Using Neural Networks and Lagged Climate Indices to Predict Monthly Temperature and Precipitation Anomalies
Overview
• To correlate monthly temperature and precipitation anomalies with a number of climate indices lagged several months
• To use neural networks because they simulate non-linear interactions between variables (as opposed to linear regression)
Overview
1. Introduction to neural networks
2. Data collection
  • Temperature and precipitation anomalies
  • Climate indices
3. Methods of attack (“how to”)
4. Results
5. Discussion
6. Future ideas
Neural Networks
• Create categorical and numerical forecasts
• Use categorical and numerical predictors
Neural Networks
• Layered regression equations
• Predictors are linearly regressed (weighted) to create the hidden layer of intermediate forecasts
• Hidden layer forecasts used as predictors to produce either another hidden layer (and so on) or a final forecast
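The layered-regression idea above can be sketched in a few lines of Python. This is a minimal illustration, not the network used in the study: the sigmoid activation, the linear output node, and every weight value are assumptions chosen only to show how predictors become hidden-layer "intermediate forecasts" and then a final forecast.

```python
import math

def sigmoid(z):
    """Squashing function (an assumed choice) that makes a node non-linear."""
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases, activation):
    """One regression layer: a weighted sum per node, then an activation."""
    return [
        activation(sum(w * x for w, x in zip(node_w, inputs)) + b)
        for node_w, b in zip(weights, biases)
    ]

def forward(predictors, hidden_w, hidden_b, out_w, out_b):
    """Predictors -> hidden-layer intermediate forecasts -> final forecast."""
    hidden = layer(predictors, hidden_w, hidden_b, sigmoid)
    return layer(hidden, out_w, out_b, lambda z: z)[0]  # linear output node

# Hypothetical weights: 3 predictors, 2 hidden nodes, 1 output node.
hidden_w = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
hidden_b = [0.0, 0.1]
out_w = [[1.5, -2.0]]
out_b = [0.2]

forecast = forward([0.7, -1.2, 0.3], hidden_w, hidden_b, out_w, out_b)
print(round(forecast, 3))
```

Adding another hidden layer would mean calling `layer` once more on the hidden output before the final node.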
Neural Networks
• Layered regression captures non-linear relationships, i.e. mimics whatever equation best fits the data
• You don’t need to know the form of the equation ahead of time
• Each dot (in the network diagram) is a node
• Human brain: 10 billion nodes
• Neural net: 10 – 1000 nodes
Neural Networks
Training a network
1. Training data (66% of dataset) run through neural net / forecasts generated
2. Error calculated (skill scores)
3. Neural net tuned (weights changed) to improve scores
4. Repeat a fixed number of times (epochs) or until weights stop changing
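The four training steps can be sketched as a loop. To keep it short, a single linear node stands in for the whole network, the first 66% of cases are taken as training data, and the data are synthetic; all three are simplifying assumptions, and this is not WEKA's implementation.

```python
def train(cases, learning_rate=0.1, max_epochs=500, tol=1e-9):
    """cases: list of (x, y) pairs. Fits y ~ w*x + b by gradient descent."""
    split = int(round(0.66 * len(cases)))        # 66% for training
    training, testing = cases[:split], cases[split:]
    w, b = 0.0, 0.0
    n = len(training)
    for _ in range(max_epochs):
        # Steps 1-2: run the training data through the model and get the
        # slope of the mean squared error with respect to each weight.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in training) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in training) / n
        # Step 3: tune the weights downhill.
        dw, db = -learning_rate * grad_w, -learning_rate * grad_b
        w, b = w + dw, b + db
        # Step 4: stop early once the weights stop changing.
        if abs(dw) < tol and abs(db) < tol:
            break
    return w, b, testing

# Synthetic data following y = 2x + 1 (made up for illustration).
cases = [(k / 10.0, 2 * (k / 10.0) + 1) for k in range(50)]
w, b, held_out = train(cases, learning_rate=0.05, max_epochs=5000)
print(round(w, 3), round(b, 3), len(held_out))
```

The held-out cases returned at the end play the role of the remaining 34% that the skill scores are computed on.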
Neural Networks
Training a network
$$\Delta w(t) = -\eta \, \frac{\partial E}{\partial w(t)} + \alpha \, \Delta w(t-1) - \lambda \, w(t)$$
• Learning rate (η): how much the weights are changed relative to the error slope
• Momentum (α): use a portion of the previous weight change for less “jumpiness”
• Decay (λ): eliminates useless weights / interactions
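A small sketch of how the three tuning terms combine, assuming the common textbook form in which the weight change is minus the learning rate times the error slope, plus the momentum times the previous change, minus the decay times the current weight. That form, the parameter names, and the toy error surface are all assumptions made for illustration.

```python
def weight_change(error_slope, prev_change, weight,
                  learning_rate=0.3, momentum=0.2, decay=0.0):
    """Change applied to one weight in one step (assumed textbook form)."""
    return (-learning_rate * error_slope   # step downhill on the error surface
            + momentum * prev_change       # reuse part of the previous step
            - decay * weight)              # pull unneeded weights toward zero

# Toy error surface E(w) = (w - 3)^2, so the error slope is 2 * (w - 3).
w, change = 0.0, 0.0
for _ in range(200):
    change = weight_change(2 * (w - 3), change, w,
                           learning_rate=0.1, momentum=0.5)
    w += change
print(round(w, 4))
```

With `decay` above zero, a weight whose error slope stays near zero shrinks a little every step, which is how useless interactions fade out.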
Neural Networks
WEKA
• Waikato Environment for Knowledge Analysis (University of Waikato, New Zealand)
• Weka: flightless bird with an inquisitive nature found only in New Zealand
• Set values of learning rate, momentum, number of nodes, & epochs to fit data well without overfitting
• Overfitting = a fit too perfect to the training data that performs poorly on new data
Data Collection
What data is needed?
• Monthly anomalies
  • 6 regions of the U.S. (NW, SW, NC, SC, NE, SE)
  • Temperature and precipitation
  • U.S. Climate Division data since 1895 available
• Climate indices
  • Monthly values lagged 2, 3, & 4 months
  • Since 1948 available
Anomaly Data
• Obtained average monthly anomaly data for the U.S. Climate Divisions in each of the 6 regions
• Dataset from Jeremy Ross
• Averaged using GrADS
• Monthly, 1950 – present
• °F, inches
Climate Index Data
• Obtained from CDC’s climate indices page:
• http://www.cdc.noaa.gov/ClimateIndices/
• From 1950 – present
• SOI, PNA, NAO, EPO, MEI, Nino3, Nino1+2, Nino3.4, Nino4, AO, NOI, WP, NP, QBO
Climate Index Data
Some years & months missing!
• No SOI until 1951
• No AO until 1958
• No PNA for June & July
• No EPO for Aug & Sept
• WEKA throws out cases with missing data, so no forecasts were made for Aug – Jan!! (these are the months whose 2-, 3-, or 4-month lags land on a missing PNA or EPO value)
• Need to re-run without PNA and EPO to get a neural net that can be used during any month
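The Aug – Jan gap follows from the lags: a forecast month is lost whenever one of its 2-, 3-, or 4-month lagged predictors falls in a month with no index value. A short sketch of that bookkeeping (the function name is made up for illustration):

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def affected_targets(missing_months, lags=(2, 3, 4)):
    """Forecast months whose lagged predictors hit a missing index month."""
    hit = set()
    for name in missing_months:
        m = MONTHS.index(name)
        for lag in lags:
            hit.add((m + lag) % 12)   # wrap around the end of the year
    return [MONTHS[i] for i in sorted(hit)]

pna = affected_targets(["Jun", "Jul"])   # PNA gaps
epo = affected_targets(["Aug", "Sep"])   # EPO gaps
lost = sorted(set(pna) | set(epo), key=MONTHS.index)
print(lost)
```

PNA alone removes Aug through Nov and EPO removes Oct through Jan, so together they cover Aug through Jan.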
Data Processing
Excel file
• Row for each month (Jan 1950 – Dec 2000)
• Columns of month; each anomaly; and each index lagged 2, 3, and 4 months
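The spreadsheet layout can also be built in code. The sketch below assumes one dictionary per month in chronological order; the column names (`NE_T`, `SOI`) and the values are made up, and the study itself did this step in Excel.

```python
def add_lags(rows, index_names, lags=(2, 3, 4)):
    """rows: chronological list of dicts, one per month.
    Returns new rows with an extra column per index and lag."""
    out = []
    for i, row in enumerate(rows):
        new_row = dict(row)
        for name in index_names:
            for lag in lags:
                # Index value `lag` months earlier; None before the record starts.
                value = rows[i - lag][name] if i >= lag else None
                new_row[f"{name}_lag{lag}"] = value
        out.append(new_row)
    return out

# Made-up values, purely to show the layout.
rows = [{"month": m, "NE_T": 0.1 * m, "SOI": float(m)} for m in range(1, 7)]
table = add_lags(rows, ["SOI"])
print(table[4])
```

The `None` entries at the start of the record are the blanks that later need a missing-value marker.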
Data Processing
Conversion to ARFF (Attribute-Relation File Format)
• Save as a CSV
• Fix blanks: ,, replaced by ,?,
• Change file extension: .csv → .arff
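A minimal sketch of the same conversion in Python. An ARFF file also carries a header (`@relation`, `@attribute`, `@data`) that the slide does not mention, so the header written here is an assumption about the finished file; the relation name, the column names, and the treatment of every column as numeric are likewise illustrative.

```python
import csv
import io

def csv_to_arff(csv_text, relation="anomalies"):
    """Return ARFF text for a CSV whose columns are all treated as numeric."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {name} numeric" for name in header]
    lines += ["", "@data"]
    for row in data:
        # Blank cells become '?', the ARFF missing-value marker.
        lines.append(",".join(cell.strip() or "?" for cell in row))
    return "\n".join(lines) + "\n"

# Made-up sample with one blank cell.
sample = "month,NE_T,SOI_lag2\n1,0.4,\n2,-1.1,0.7\n"
arff_text = csv_to_arff(sample)
print(arff_text)
```

Column names containing spaces would need quoting in the `@attribute` lines, which this sketch does not handle.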
Method I
• The following procedure was used for each anomaly (NE T, NE P, SE T, SE P, SW T, SW P, NW T, NW P)
• Build neural nets
• Vary learning rate (L), momentum (M), layers, epochs
• Decay
• Indices and month predict anomaly
• Takes a long time to try many possibilities
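Method I
Skill scores
• Calculated with remaining 34% of dataset
• Many scores provided
• 2 used
1. Correlation coefficient (r)
2. Root relative squared error (RRSE)
  • Relative to error if prediction = average of actual values
  • Outliers are penalized strongly
$$\mathrm{RRSE} = \sqrt{\frac{\sum_i (p_i - a_i)^2}{\sum_i (a_i - \bar{a})^2}}$$
where p_i is the predicted value, a_i is the actual value, and \bar{a} is the average of the actual values.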
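Both scores are easy to compute by hand, which helps when reading the results. The sketch below uses made-up anomaly values; it shows that a forecast equal to the average of the actual values scores an RRSE of 100%, so values near 100% mean no skill over that baseline.

```python
import math

def correlation(predicted, actual):
    """Pearson correlation coefficient r."""
    n = len(actual)
    mp, ma = sum(predicted) / n, sum(actual) / n
    cov = sum((p - mp) * (a - ma) for p, a in zip(predicted, actual))
    var_p = sum((p - mp) ** 2 for p in predicted)
    var_a = sum((a - ma) ** 2 for a in actual)
    return cov / math.sqrt(var_p * var_a)

def rrse(predicted, actual):
    """Root relative squared error, in percent."""
    mean_a = sum(actual) / len(actual)
    model = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    baseline = sum((a - mean_a) ** 2 for a in actual)
    return 100.0 * math.sqrt(model / baseline)

# Made-up anomalies for illustration.
actual = [1.0, -2.0, 0.5, 3.0, -1.5]
average_forecast = [sum(actual) / len(actual)] * len(actual)
print(rrse(average_forecast, actual))                        # the no-skill baseline
print(correlation([0.5 * a + 0.1 for a in actual], actual))  # a perfectly correlated forecast
```

Because the errors are squared before summing, one badly missed month raises RRSE more than several small misses.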
Results I
NE Temperature
• Linear regression: r = 0.1067, RRSE = 102.25%
• Neural nets:
Neural Net Layers   8,4,2     9,6
Momentum            0.3       0.3
Learning Rate       0.3       0.3
Epochs              500       500
r                   -0.0205   -0.1190
RRSE                100.25%   100.30%
Results I
SE Temperature
• Linear regression: r = 0.0352, RRSE = 104.78%
• Neural nets:
Neural Net Layers   4,2       15,7
Momentum            0.3       0.3
Learning Rate       0.2       0.2
Epochs              500       500
r                   0.0372    -0.0389
RRSE                100.13%   100.74%
Results I
SW Temperature
• Linear regression: r = 0.036, RRSE = 103.40%
• Neural nets:
Neural Net Layers   9,3       9,3
Momentum            0.2       0.4
Learning Rate       0.2       0.2
Epochs              500       500
r                   0.063     0.025
RRSE                99.99%    99.97%
Results I
NW Temperature
• Linear regression: r = 0.011, RRSE = 103.88%
• Neural nets:
Neural Net Layers   Auto      9,5
Momentum            0.05      0.8
Learning Rate       0.2       0.2
Epochs              500       500
r                   0.176     0.224
RRSE                98.66%    99.08%
Results I
NE Precipitation
• Linear regression: r = 0.073, RRSE = 101.044%
• Neural nets:
Neural Net Layers   5,3       5,3
Momentum            0.5       0.2
Learning Rate       0.2       0.1
Epochs              500       2000
r                   0.061     0.115
RRSE                99.732%   99.999%
Results I
SE Precipitation
• Linear regression: r = 0.063, RRSE = 104.14%
• Neural nets:
Neural Net Layers   8,3       Auto
Momentum            0.7       0.5
Learning Rate       0.2       0.2
Epochs              500       500
r                   0.0651    0.1179
RRSE                99.85%    101.05%
Results I
SW Precipitation
• Linear regression: r = 0.187, RRSE = 98.83%
• Neural nets:
Neural Net Layers   9,5       9,5
Momentum            0.5       0.3
Learning Rate       0.2       0.2
Epochs              500       500
r                   0.280     0.289
RRSE                92.25%    96.12%
Results I
NW Precipitation
• Linear regression: r = 0.091, RRSE = 101.49%
• Neural nets:
Neural Net Layers   9,5       Auto
Momentum            0.3       0.3
Learning Rate       0.5       0.6
Epochs              500       500
r                   0.052     0.098
RRSE                99.91%    102.71%
Results I
• Putrid results!!
• Not worth trying NC/SC… away from oceans
• RRSE ~ 100%, r ~ 0.10
• No big improvement over linear regression
• SW Precipitation predicted the best (although still bad)… El Nino-related?
Method II
• Predict positive or negative anomaly instead of actual value!
• Anomalies changed to binary (1, 0) predictands
• Vary indices used
• Does that cause significant changes?
• This became the most interesting part of the study
• Limited time available: NE T, NE P, SW P
Method II
Skill scores
• Many scores provided
• 3 used
1. Percent Correctly Classified
2. TP (True Positive) Rate
3. TN (True Negative) Rate
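The three scores can be computed directly from the binary forecasts. In this sketch 1 stands for a positive anomaly and 0 for a negative one; treating an anomaly of exactly zero as negative is an assumption, and the sample values are made up.

```python
def to_binary(anomalies):
    """1 for a positive anomaly, 0 otherwise (zero counted as negative here)."""
    return [1 if x > 0 else 0 for x in anomalies]

def classification_scores(predicted, actual):
    """predicted / actual: sequences of 1 (positive) or 0 (negative)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    tn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 0)
    positives = sum(actual)
    negatives = len(actual) - positives
    return {
        "percent_correct": 100.0 * (tp + tn) / len(actual),
        "tp_rate": tp / positives,   # share of actual positives caught
        "tn_rate": tn / negatives,   # share of actual negatives caught
    }

# Made-up anomalies and forecasts for illustration.
actual = to_binary([1.2, 0.4, 2.1, -0.3, -1.8])
predicted = [1, 0, 1, 0, 1]
scores = classification_scores(predicted, actual)
print(scores)
```

Looking at the TP and TN rates together shows whether a net has real skill or is mostly forecasting one sign.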
Results II
NE Temperature
Neural Net Setup        Momentum   Learning Rate   Epochs   Classified Correctly   TP Rate   TN Rate
Auto                    0.5        0.5             500      56.04%                 .617      .513
No Month                0.5        0.5             500      54.59%                 .628      .478
Only Nino: Nino 3.4     0.5        0.5             500      51.69%                 .606      .442
No Nino’s               0.5        0.5             500      54.59%                 .617      .487
No Nino’s, SOI, MEI     0.5        0.5             500      53.62%                 .553      .522
More Epochs             0.5        0.5             1000     56.52%                 .628      .513
Auto = WEKA automatically chooses node setup
Results II
NE Precipitation
Neural Net Setup        Momentum   Learning Rate   Epochs   Classified Correctly   TP Rate   TN Rate
Auto                    0.3        0.2             500      54.11%                 .664      .402
No Month                0.3        0.2             500      48.79%                 .573      .392
Only Nino: Nino 3.4     0.3        0.2             500      53.14%                 .691      .351
No Nino’s               0.3        0.2             500      49.28%                 .645      .320
No Nino’s, SOI, MEI     0.3        0.2             500      51.21%                 .755      .237
Auto                    0.3        0.2             1000     53.62%                 .645      .412
** Changing the epochs results in overfitting!
Results II
SW Precipitation
Neural Net Setup        Momentum   Learning Rate   Epochs   Classified Correctly   TP Rate   TN Rate
Auto                    0.1        0.2             500      60.39%                 .611      .596
No Month                0.1        0.2             500      60.39%                 .611      .596
Only Nino: Nino 3.4     0.1        0.2             500      62.32%                 .646      .596
No Nino’s               0.1        0.2             500      61.83%                 .628      .606
No Nino’s, SOI, MEI     0.1        0.2             500      59.42%                 .841      .298
** Changing the epochs did not change the ‘Only Nino: Nino 3.4’ value
Discussion
• NE Temperature: 94 +, 113 –
  • Always predicting negative is correct 54.59% of the time
  • Best neural net: 56.52% correctly classified
• NE Precipitation: 110 +, 97 –
  • Always predicting positive is correct 53.14% of the time
  • Best neural net: 54.11% correctly classified
• SW Precipitation: 113 +, 94 –
  • Always predicting positive is correct 54.59% of the time
  • Best neural net: 62.32% correctly classified
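The comparison above is a majority-class baseline: always forecast whichever sign is more common. A short sketch that recomputes the baselines from the counts on this slide and shows the margin of each best net (the function names are made up for illustration):

```python
def majority_baseline(n_pos, n_neg):
    """Percent correct from always forecasting the more common sign."""
    return 100.0 * max(n_pos, n_neg) / (n_pos + n_neg)

def percent_correct(tp_rate, tn_rate, n_pos, n_neg):
    """Overall percent correct implied by the per-class rates."""
    return 100.0 * (tp_rate * n_pos + tn_rate * n_neg) / (n_pos + n_neg)

# (positive count, negative count, best neural net % correct) from the slides.
cases = {
    "NE Temperature": (94, 113, 56.52),
    "NE Precipitation": (110, 97, 54.11),
    "SW Precipitation": (113, 94, 62.32),
}
for name, (pos, neg, best) in cases.items():
    base = majority_baseline(pos, neg)
    print(f"{name}: baseline {base:.2f}%, best net {best:.2f}%, "
          f"gain {best - base:+.2f} points")
```

`percent_correct` ties the tables together: for the NE Temperature "Auto" net, TP rate .617 on 94 positives and TN rate .513 on 113 negatives give roughly the 56.04% reported, up to rounding of the rates.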
Discussion
• These types of neural nets do not provide significant skill over ‘guessing’
• As in Method I, there is no significant difference in skill between logistic regression and neural nets
• There is some sensitivity to which variables are included in the neural net… even though the decay factor would attempt to eliminate useless interactions
• Different sensitivities in each region
• Using the ‘auto’ setting for layers produced better results
Discussion
• The study was originally supposed to predict the anomaly, but predicting the sign of the anomaly seems to show more promise
• Time constraints prevented a more in-depth look at Method II; it is a possible Meteo 485 project for future semesters
• Missing June – Sept data could have caused problems with this study
Future Work
1. Obtain missing PNA & EPO data
2. Build neural nets for other regions of the country for Method II
3. Use different lag times and combinations of lag times
4. Use different climate indices
5. Omit different indices from current set
6. Try other tools that WEKA offers
Special thanks to…
• Jeremy Ross
For gathering anomaly data
• Climate Diagnostics Center (CDC)
For climate indices
• Dr. George Young
Neural net info from Meteo 474 notes