Applied Forecasting ST3010, Michaelmas Term 2015
Prof. Rozenn Dahyot
Room 128 Lloyd Institute
School of Computer Science and Statistics
Trinity College Dublin
[email protected] or [email protected]
+353 1 896 1760
Lecture notes available online
at https://www.scss.tcd.ie/Rozenn.Dahyot/ in the 'teaching' section.
Some materials may also be posted on Blackboard.
Timetable
• Monday: 9am-10am in LB01; 4pm-5pm in LB1.07
• Friday: 10am-11am in Salmon 1 (Hamilton Building)
Organization of the course
• Lectures and tutorials only: no labs, but information on using R for forecasting will be provided.
• Exam 100%; no assignments.
Software: R (http://www.r-project.org/)
Content
• Introduction to forecasting: ARIMA models, GARCH models, Kalman filters, data transformations, seasonality, exponential smoothing and the Holt-Winters algorithms, performance measures, and the use of transformations and differencing.
Textbook
• Forecasting: Methods and Applications by Makridakis, Wheelwright and Hyndman, published by Wiley
• Many more books on forecasting and time series covering the content of this course are available in the Trinity libraries.
Who Forecasts?
Why Forecast?
How to Forecast?
In this course we will use maths/stats techniques for forecasting
Steps in a Forecasting Procedure
1. Problem definition
2. Exploratory analysis
3. Gathering information
4. Selecting and fitting models to make forecasts
5. Using and evaluating the forecasts
Examples
• https://www.google.ie/trends/
• http://static.googleusercontent.com/media/research.google.com/en//archive/papers/detecting-influenza-epidemics.pdf
Examples… warnings
Epidemiological modeling of online social network dynamics: http://arxiv.org/abs/1401.4208
http://languagelog.ldc.upenn.edu/nll/?p=9977
Quantitative Forecasting
Quantitative methods
Time series models vs. explanatory models
Time series
What is the nature of the data to analyse?
• Examples from the fma package in R: airpass, beer, internet, cowtemp, dowjones, mink
• Can you predict what these time series look like?
Visualization tools
• Numerical values
• Time plot
• Season plot
Patterns to identify
• Trends
• Seasonal patterns
• Error/noise
• Visualize and identify patterns in: airpass, beer, internet, cowtemp, dowjones, mink
Time series
• Definition
• Sampling rate & Unit of time
• Preparation of Data before analysis
Limitations in this module
• 1D time series
• No outliers
• No missing data
Notations
• Variables Vs numerical values
• Time series
Auto-Correlation Function (ACF)
Mean value of the time series
Autocorrelation at lag k
Auto-Correlation Function (ACF)
[Figure: sketch of the ACF, with autocorrelation values r1, r2, r3 plotted against lags k = 1, 2, 3.]
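The course examples use R; as an illustration of the same computation, here is a minimal Python sketch of the sample autocorrelation r_k (the function name `acf` here is ours, not an R or library API):

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation:
    r_k = sum_t (x_t - xbar)(x_{t+k} - xbar) / sum_t (x_t - xbar)^2."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()                       # centre the series at its mean
    denom = float(np.sum(d * d))
    return [float(np.sum(d[: len(d) - k] * d[k:]) / denom)
            for k in range(max_lag + 1)]   # r_0 = 1 by construction
```

For a series with period 12 (like monthly beer production), r_12 is large and positive while r_6 is negative, which is exactly the pattern seen in the ACF plots below.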
> plot(beer-mean(beer),lwd="3")
> lines(lag(beer-mean(beer),1),col="red",lwd=3)
[Figure: time plot of beer - mean(beer), 1991-1995, with the lag-1 series overlaid in red.]
In red: the beer series at lag 1. The two time series overlap well.
[Figure: ACF of ts(beer, freq = 1), lags 0 to 50.]
> plot(beer-mean(beer),lwd="3")
> lines(lag(beer-mean(beer),6),col="red",lwd=3)
[Figure: time plot of beer - mean(beer), 1991-1995, with the lag-6 series overlaid in red.]
In red: the beer series at lag 6. The two time series do not overlap well.
[Figure: ACF of ts(beer, freq = 1), lags 0 to 50.]
> plot(beer-mean(beer),lwd="3")
> lines(lag(beer-mean(beer),12),col="red",lwd=3)
[Figure: time plot of beer - mean(beer), 1991-1995, with the lag-12 series overlaid in red.]
In red: the beer series at lag 12. The two time series overlap well.
[Figure: ACF of ts(beer, freq = 1), lags 0 to 50.]
For the airpass time series:
[Figure: time plot of airpass - mean(airpass), 1950-1960.]
[Figure: ACF of ts(airpass, freq = 1), lags 0 to 50.]
[Figure: time plots of airpass - mean(airpass), 1950-1960, with the series at lag 1, lag 6, and lag 12 overlaid.]
Partial AutoCorrelation Function (PACF)
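The PACF at lag k can be computed by fitting an AR(k) regression by least squares and keeping the coefficient of the last lag. A Python sketch of this idea (the course uses R, where `pacf()` does this directly; the helper name `pacf_ls` is ours):

```python
import numpy as np

def pacf_ls(x, max_lag):
    """PACF via least squares: the partial autocorrelation at lag k is the
    coefficient of y_{t-k} in an AR(k) regression on the centred series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    out = []
    for k in range(1, max_lag + 1):
        # design matrix: columns are the series at lags 1, ..., k
        X = np.column_stack([x[k - lag : n - lag] for lag in range(1, k + 1)])
        coef, *_ = np.linalg.lstsq(X, x[k:], rcond=None)
        out.append(float(coef[-1]))
    return out
```

For an AR(1) process this gives a large value at lag 1 and values near zero afterwards, which is the cut-off pattern used below to identify AR(p) models.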
Holt-Winters Algorithms
Part I
Algo I: Simple Exponential Smoothing (SES)
• What does SES do?
• What happens when α = 1 or α = 0?
• SES is an algorithm suitable for a time series with …
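What SES does can be made concrete with a short sketch (Python for illustration; the course software is R, where `HoltWinters(y, beta=FALSE, gamma=FALSE)` performs SES). The two extreme values of α answer the question above:

```python
def ses(y, alpha):
    """Simple exponential smoothing.
    Level update: l_t = alpha * y_t + (1 - alpha) * l_{t-1}, started at l_1 = y_1.
    The one-step-ahead forecast is the final level."""
    level = y[0]
    for obs in y[1:]:
        level = alpha * obs + (1 - alpha) * level
    return level
```

With α = 1 the forecast is just the last observation (the naive forecast); with α = 0 the level never moves from its initial value; intermediate α values trade off between the two.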
Algo I: Simple Exponential Smoothing (SES)
Algo II: Double Exponential Smoothing (DES)
SES(α)
DES(α, β)
SHW+(α, β, γ)
SHWx(α, β, γ)
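DES (Holt's linear method) extends SES with a trend component. A minimal sketch under the usual level/trend recursions (Python for illustration; initialization conventions vary across textbooks):

```python
def des(y, alpha, beta):
    """Double exponential smoothing (Holt's linear method).
    l_t = alpha * y_t + (1 - alpha) * (l_{t-1} + b_{t-1})
    b_t = beta * (l_t - l_{t-1}) + (1 - beta) * b_{t-1}
    Returns (level, trend); the h-step-ahead forecast is level + h * trend."""
    level, trend = y[0], y[1] - y[0]   # simple initialization from the first two points
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level, trend
```

On a perfectly linear series the recursions reproduce the line exactly, so the h-step forecast extrapolates it without error.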
Linear Regression
Useful formulas
Auto-Regressive Models – AR(1)
y_t = φ_0 + φ_1 y_{t−1} + ε_t
Explanatory variable
Parameters to estimate
Auto-Regressive Models – AR(2)
y_t = φ_0 + φ_1 y_{t−1} + φ_2 y_{t−2} + ε_t
Explanatory variables
Parameters to estimate
Auto-Regressive Models – AR(p)
y_t = φ_0 + φ_1 y_{t−1} + φ_2 y_{t−2} + … + φ_p y_{t−p} + ε_t
Parameters to estimate
Explanatory variables
AR(1): Least Squares estimates of the parameters
θ̂ = (Xᵀ X)⁻¹ Xᵀ y
Data: 14, 13, 16, 17, …
Model: y_t = φ_0 + φ_1 y_{t−1} + ε_t
Write the least squares solution.
AR(1): Least Squares estimates of the parameters
Data: 14, 13, 16, 17, …
Model: y_t = φ_0 + φ_1 y_{t−1} + ε_t
AR(1): Least Squares estimates of the parameters
θ̂ = (Xᵀ X)⁻¹ Xᵀ y
with y = [13, 16, 17]ᵀ, X = [1 14; 1 13; 1 16], θ = [φ_0, φ_1]ᵀ
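The least-squares solution can be checked numerically. A Python sketch (the function name `fit_ar1` is ours; in R one could use `ar.ols` or `lm`):

```python
import numpy as np

def fit_ar1(y):
    """Least-squares fit of y_t = phi0 + phi1 * y_{t-1} + eps_t.
    Builds X with a column of ones and a column of lagged values,
    then solves theta = (X^T X)^{-1} X^T y (via lstsq for stability)."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])  # [1, y_{t-1}] rows
    theta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    return theta  # array([phi0_hat, phi1_hat])
```

On data simulated from a known AR(1), the estimates recover the true coefficients up to sampling error.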
Estimate of σ
Estimate the standard deviation of the noise
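The noise standard deviation is typically estimated from the residual sum of squares, divided by n minus the number of fitted parameters. A sketch (the helper name `noise_sd` is ours):

```python
import numpy as np

def noise_sd(residuals, n_params):
    """Estimate sigma as sqrt(RSS / (n - n_params)), where RSS is the
    residual sum of squares from the fitted model and n_params is the
    number of estimated coefficients (2 for AR(1): phi0 and phi1)."""
    r = np.asarray(residuals, dtype=float)
    return float(np.sqrt(np.sum(r * r) / (len(r) - n_params)))
```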
Example: dowjones
Auto-Regressive Models – AR(p)
y_t = φ_0 + φ_1 y_{t−1} + φ_2 y_{t−2} + … + φ_p y_{t−p} + ε_t
Parameters to estimate
Explanatory variables
Moving Average MA(1)
y_t = φ_0 + φ_1 ε_{t−1} + ε_t
Explanatory variable
Parameters to estimate
Can the least-squares algorithm be used to estimate the parameters?
Moving average MA(q)
y_t = φ_0 + φ_1 ε_{t−1} + φ_2 ε_{t−2} + … + φ_q ε_{t−q} + ε_t
Parameters to estimate
Explanatory variables
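One way to see why an MA(1) behaves differently from an AR model is to simulate it: its ACF cuts off after lag 1 (Python sketch; the parameter values below are illustrative):

```python
import numpy as np

def simulate_ma1(phi0, phi1, n, rng):
    """Simulate y_t = phi0 + phi1 * eps_{t-1} + eps_t with Gaussian noise.
    Note the errors eps_t are unobserved, which is why ordinary least
    squares cannot be applied directly as it was for AR models."""
    eps = rng.standard_normal(n + 1)
    return phi0 + phi1 * eps[:-1] + eps[1:]
```

For an MA(1), the lag-1 autocorrelation is φ_1 / (1 + φ_1²) and the autocorrelations at lags ≥ 2 are zero, which is the cut-off pattern used to identify MA(q) models from the ACF.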
Exercises
Remark
Expectation
Summary 17/11/2014
• Using ACF and PACF to identify AR(p) and MA(q)
• Procedure to fit an ARIMA(p,d,q)
• Definition of BIC/AIC
Fitting ARIMA(p,d,q)
To avoid overfitting, choose p ≤ 3, d ≤ 3, q ≤ 3.
PACF for AR(1)
Maths
ACF for MA(1)
Maths
MA(1) as an AR(∞)
For MA(1), the damped sine wave / exponential decay in the PACF corresponds to these AR(∞) coefficients vanishing towards 0.
AR(1) as an MA(∞)
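The AR(1)-as-MA(∞) correspondence follows by repeated substitution (dropping the constant φ_0 for simplicity):

```latex
y_t = \varphi_1 y_{t-1} + \epsilon_t
    = \varphi_1 (\varphi_1 y_{t-2} + \epsilon_{t-1}) + \epsilon_t
    = \dots
    = \sum_{j=0}^{\infty} \varphi_1^{j}\, \epsilon_{t-j},
    \qquad \text{valid for } |\varphi_1| < 1 .
```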
Criteria to select the best ARIMA model
Exercise: Show
Hirotugu Akaike (1927-2009)
In the 1970s, he proposed model selection with An Information Criterion (AIC).
Bayesian Information Criterion (BIC)
Thomas Bayes (1701-1761)
The BIC was developed by Gideon E. Schwarz, who gave a Bayesian argument for adopting it.
http://en.wikipedia.org/wiki/Bayesian_information_criterion
Seasonal ARIMA(p,d,q)(P,D,Q)s
Choose your criterion, AIC or BIC, and stick to it. Select the ARIMA model with the lowest AIC or BIC,
with m=p+q+P+Q
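The trade-off between AIC and BIC can be sketched as follows. Exact formulas vary by textbook; this uses one common Gaussian-likelihood variant with m estimated coefficients plus a variance term (the function names are ours):

```python
import math

def aic(n, sse, m):
    """AIC = n*log(SSE/n) + 2*(m+1): fit term plus a fixed penalty per parameter."""
    return n * math.log(sse / n) + 2 * (m + 1)

def bic(n, sse, m):
    """BIC = n*log(SSE/n) + (m+1)*log(n): the penalty grows with the sample size."""
    return n * math.log(sse / n) + (m + 1) * math.log(n)
```

For n > e² ≈ 7.4 the BIC penalty per parameter exceeds AIC's, so BIC tends to select smaller models than AIC on the same data.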
ARIMA(0,0,0)(P=1,0,0)_s vs. ARIMA(0,0,0)(0,D=1,0)_s
y_t = c + φ_1 y_{t−s} + ε_t
y_t = c + y_{t−s} + ε_t
Summary
[Timeline, 1950s to 2014: SES, DES, SHW+, SHWx (Holt-Winters); ARIMA; AIC; BIC]
Other time series models
ARCH (1982): autoregressive conditional heteroskedasticity
GARCH (1986): generalized autoregressive conditional heteroskedasticity
More at http://en.wikipedia.org/wiki/Autoregressive_conditional_heteroskedasticity
Concluding Remarks
• The Prediction – Update loop
• Combining experts