Applied Forecasting ST3010 Michaelmas term 2015 Prof. Rozenn Dahyot Room 128 Lloyd Institute School of Computer Science and Statistics Trinity College Dublin [email protected] or [email protected] +353 1 896 1760

Upload: jane-charlene-carpenter

Post on 02-Jan-2016


Page 1

Applied Forecasting ST3010

Michaelmas term 2015

Prof. Rozenn Dahyot

Room 128 Lloyd Institute

School of Computer Science and Statistics

Trinity College Dublin

[email protected] or [email protected]

+353 1 896 1760

Page 2

Lecture notes available online

@ https://www.scss.tcd.ie/Rozenn.Dahyot/ in the ‘teaching’ section.

Some materials may also be posted on Blackboard.

Page 3

Timetable

• Monday
  • 9am-10am in LB01
  • 4pm-5pm in LB1.07

• Friday
  • 10am-11am in Salmon 1 (Hamilton Building)

Page 4

Organization of the course

• Lectures-tutorials only:
  • No labs, but information on using R for forecasting will be provided.

• Exam 100%
  • No assignments

Page 5

Software R http://www.r-project.org/

Page 6

Content

Page 7

Content

• Introduction to forecasting; ARIMA models, GARCH models, Kalman filters, data transformations, seasonality, exponential smoothing and Holt-Winters algorithms, performance measures. Use of transformations and differences.

Page 8

Textbook

• Forecasting: Methods and Applications by Makridakis, Wheelwright and Hyndman, published by Wiley

• Many more books on forecasting and time series in the Trinity libraries cover the content of this course.

Page 9

Who Forecasts?

Page 10

Why Forecast?

Page 11

How to Forecast?

In this course we will use mathematical and statistical techniques for forecasting.

Page 12

Steps in a Forecasting Procedure?

1. Problem definition

2. Gathering information

3. Exploratory analysis

4. Selecting and fitting models to make forecasts

5. Using and evaluating the forecasts

Page 14

Examples… Warnings

Epidemiological modeling of online social network dynamics

http://arxiv.org/abs/1401.4208

http://languagelog.ldc.upenn.edu/nll/?p=9977

Page 15

Quantitative Forecasting

Page 16

Quantitative methods

Page 17

Time series models Vs Explanatory models

Time series

Page 18

What is the nature of the data to analyse?

• Examples from the fma package in R
  • airpass
  • beer
  • internet
  • cowtemp
  • dowjones
  • mink

• Can you predict what these time series look like?

Page 19

Visualization tools

• Numerical values

• Time plot

• Season plot

Page 20

Patterns to identify

• Trends
• Seasonal
• Error/noise

• Visualize and identify patterns:
  • airpass
  • beer
  • internet
  • cowtemp
  • dowjones
  • mink

Page 21

Time series

• Definition

• Sampling rate & Unit of time

• Preparation of Data before analysis

Page 22

Limitations in this module

• 1D time series

• No outliers

• No missing data

Page 23

Notations

• Variables Vs numerical values

• Time series

Page 24

Page 25

Auto-Correlation Function (ACF)

Mean value of the time series

Autocorrelation at lag k
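The two quantities named on this slide can be sketched in R as follows. This is a minimal sketch assuming the usual textbook definitions (sample mean, and lag-k covariance about the mean divided by the total sum of squares); the function name `autocorr` is hypothetical, and base R's `AirPassengers` dataset is used only so the example runs without extra packages.

```r
# Sample autocorrelation at lag k (assumed textbook definition):
#   ybar = (1/n) * sum_t y_t
#   r_k  = sum_{t=k+1..n} (y_t - ybar)(y_{t-k} - ybar) / sum_{t=1..n} (y_t - ybar)^2
autocorr <- function(y, k) {
  y <- as.numeric(y)
  n <- length(y)
  ybar <- mean(y)                                   # mean value of the time series
  num <- sum((y[(k + 1):n] - ybar) * (y[1:(n - k)] - ybar))
  den <- sum((y - ybar)^2)
  num / den
}

y <- as.numeric(AirPassengers)                      # illustrative data from base R

# Manual r_1 .. r_12, compared with the built-in acf() (lag 0 dropped)
r_manual  <- sapply(1:12, function(k) autocorr(y, k))
r_builtin <- acf(y, lag.max = 12, plot = FALSE)$acf[-1]
print(round(cbind(r_manual, r_builtin), 4))
```

Under this definition the hand-computed values should agree with `acf()`, which also centres the series and normalises by the lag-0 term.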

Page 26

Auto Correlation Function (ACF)

[Figure: sketch of the ACF showing r1, r2 and r3 against lag k = 1, 2, 3]

Page 27

[Figure: time plot of beer - mean(beer), 1991 to 1995, with the lag-1 series overlaid in red]

> library(fma)  # provides the beer series
> plot(beer-mean(beer),lwd=3)
> lines(lag(beer-mean(beer),1),col="red",lwd=3)

In red, the lagged beer series (lag 1). The two time series overlap well.

[Figure: ACF plot, Series ts(beer, freq = 1), lags 0 to 50]

Page 28

[Figure: time plot of beer - mean(beer), 1991 to 1995, with the lag-6 series overlaid in red]

In red, the lagged beer series (lag 6). The two time series do not overlap well.

> plot(beer-mean(beer),lwd=3)
> lines(lag(beer-mean(beer),6),col="red",lwd=3)

[Figure: ACF plot, Series ts(beer, freq = 1), lags 0 to 50]

Page 29

[Figure: time plot of beer - mean(beer), 1991 to 1995, with the lag-12 series overlaid in red]

In red, the lagged beer series (lag 12). The two time series overlap well.

> plot(beer-mean(beer),lwd=3)
> lines(lag(beer-mean(beer),12),col="red",lwd=3)

[Figure: ACF plot, Series ts(beer, freq = 1), lags 0 to 50]

Page 30

For the airpass time series

[Figure: three time plots of airpass - mean(airpass), 1950 to 1960, labelled Lag 1, Lag 6 and Lag 12]

[Figure: ACF plot, Series ts(airpass, freq = 1), lags 0 to 50]

Page 31

Partial AutoCorrelation Function (PACF)

Page 32

Holt-Winters Algorithms

Part I

Page 33

Algo I: Simple Exponential Smoothing (SES)

Page 34

Algo I: Simple Exponential Smoothing (SES)

• What does SES do?

• What happens when α=1 or α=0?

• SES is an algorithm suitable for a time series with …
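To experiment with the questions on this slide, here is a minimal sketch of the SES recursion, assuming the common form "new level = alpha * observation + (1 - alpha) * previous level" with the level initialised at the first observation. The function name `ses`, the initialisation and the toy series are illustrative assumptions, not taken from the slides.

```r
# Simple exponential smoothing (assumed recursion):
#   level_t = alpha * y_t + (1 - alpha) * level_{t-1}
# level_t is then used as the one-step-ahead forecast of y_{t+1}.
ses <- function(y, alpha, level0 = y[1]) {
  level <- numeric(length(y))
  prev <- level0                     # assumption: start the level at y_1
  for (t in seq_along(y)) {
    level[t] <- alpha * y[t] + (1 - alpha) * prev
    prev <- level[t]
  }
  level
}

y <- c(3, 5, 4, 6, 5)                # hypothetical toy series

print(ses(y, alpha = 1))             # level follows the data exactly
print(ses(y, alpha = 0))             # level never moves from its initial value
print(ses(y, alpha = 0.5))           # in between: 3 4 4 5 5
```

In base R, `HoltWinters(x, beta = FALSE, gamma = FALSE)` fits the same kind of model and chooses the smoothing parameter by minimising the squared one-step errors.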

Page 35

Algo II: Double Exponential Smoothing (DES)
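As a rough illustration of double exponential smoothing, the sketch below assumes Holt's linear form: one smoothed level and one smoothed trend, with forecasts "level + h * trend". The function name `des`, the initial values (level = y_1, trend = y_2 - y_1) and the toy series are assumptions made for the example; the course's own equations may be initialised differently.

```r
# Double exponential smoothing, assumed to be Holt's linear method:
#   level_t = alpha * y_t + (1 - alpha) * (level_{t-1} + trend_{t-1})
#   trend_t = beta * (level_t - level_{t-1}) + (1 - beta) * trend_{t-1}
#   forecast for t + h = level_t + h * trend_t
des <- function(y, alpha, beta, h = 1) {
  level <- y[1]                      # assumed initial level
  trend <- y[2] - y[1]               # assumed initial trend
  for (t in 2:length(y)) {
    prev_level <- level
    level <- alpha * y[t] + (1 - alpha) * (prev_level + trend)
    trend <- beta * (level - prev_level) + (1 - beta) * trend
  }
  level + (1:h) * trend
}

y <- 2 * (1:5) + 1                   # hypothetical noise-free linear series: 3 5 7 9 11
fc <- des(y, alpha = 0.3, beta = 0.1, h = 3)
print(fc)                            # a pure linear trend is continued: 13 15 17
```

For a series that is exactly linear, the forecasts do not depend on the two smoothing parameters, which makes this a convenient sanity check. Base R's `HoltWinters(x, gamma = FALSE)` fits a model of this type.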

Page 36

SES(α)

DES(α, β)

Page 37

DES(α, β)

SHW+(α, β, γ)

Page 38

SHW+(α, β, γ)

SHWx(α, β, γ)
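The two seasonal variants can be tried with base R's `HoltWinters()`. The sketch assumes that SHW+ and SHWx denote the usual additive and multiplicative seasonal Holt-Winters forms, and it uses base R's `AirPassengers` series (assumed here to be the same airline passenger data as airpass) so that no extra package is needed.

```r
# Seasonal Holt-Winters with three smoothing parameters (alpha, beta, gamma).
# seasonal = "additive" is taken to correspond to SHW+,
# seasonal = "multiplicative" to SHWx (naming assumption).
fit_add <- HoltWinters(AirPassengers, seasonal = "additive")
fit_mul <- HoltWinters(AirPassengers, seasonal = "multiplicative")

# Sum of squared one-step-ahead errors for each variant
print(c(additive = fit_add$SSE, multiplicative = fit_mul$SSE))

# Fitted smoothing parameters of the multiplicative model
print(c(alpha = unname(fit_mul$alpha),
        beta  = unname(fit_mul$beta),
        gamma = unname(fit_mul$gamma)))

# Forecast the next 12 months with the multiplicative model
fc <- predict(fit_mul, n.ahead = 12)
print(fc)
```

Comparing the two SSE values is one simple way to choose between the additive and multiplicative forms for a given series.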

Page 39

Linear Regression

Page 40

Useful formulas

Page 41

Auto-Regressive Models – AR(1)

$y_t = \varphi_0 + \varphi_1 y_{t-1} + \epsilon_t$

Explanatory variable

Parameters to estimate

Page 42

Auto-Regressive Models – AR(2)

$y_t = \varphi_0 + \varphi_1 y_{t-1} + \varphi_2 y_{t-2} + \epsilon_t$

Explanatory variables

Parameters to estimate

Page 43

Auto-Regressive Models – AR(p)

$y_t = \varphi_0 + \varphi_1 y_{t-1} + \varphi_2 y_{t-2} + \dots + \varphi_p y_{t-p} + \epsilon_t$

Parameters to estimate

Explanatory variables

Page 44

AR(1): Least Squares estimates of the parameters

$\hat{\theta} = (X^T X)^{-1} X^T \vec{y}$

, , 6, ,

Model: $y_t = \varphi_0 + \varphi_1 y_{t-1} + \epsilon_t$

Write the least squares solution.

Page 45

AR(1): Least Squares estimates of the parameters

, , 6, ,

Model: $y_t = \varphi_0 + \varphi_1 y_{t-1} + \epsilon_t$

Page 46

AR(1): Least Squares estimates of the parameters

$\hat{\theta} = (X^T X)^{-1} X^T \vec{y}$

$\vec{y}$, $X$ = [1141316 17]

$\theta = [\varphi_0, \varphi_1]^T$

Page 47

Estimate of σ

Estimate the standard deviation of the noise
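The least squares estimate for AR(1) and the noise standard deviation can be sketched together in R. The series below is hypothetical (it is not the data from the slides), and the divisor "number of rows minus number of parameters" for the variance is the usual regression convention, assumed here rather than taken from the course.

```r
# Hypothetical series, for illustration only
y <- c(2.0, 2.6, 3.1, 2.9, 3.4, 3.8, 3.5, 4.1)
n <- length(y)

# AR(1) as a regression: response y_t, explanatory variable y_{t-1}
resp <- y[2:n]
ylag <- y[1:(n - 1)]
X <- cbind(1, ylag)                           # column of ones for the intercept

# Normal equations: theta_hat = (X^T X)^{-1} X^T y
theta_hat <- solve(t(X) %*% X, t(X) %*% resp)

# Residuals and estimated standard deviation of the noise
res <- resp - X %*% theta_hat
sigma_hat <- sqrt(sum(res^2) / (nrow(X) - ncol(X)))

# Cross-check against lm(), which solves the same least squares problem
fit <- lm(resp ~ ylag)
print(cbind(normal_eq = as.numeric(theta_hat), lm = unname(coef(fit))))
print(c(sigma_hat = sigma_hat, lm_sigma = summary(fit)$sigma))
```

Note that only n - 1 rows are available, because the first observation has no predecessor to serve as explanatory variable.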

Page 48

Example: dowjones

Page 49

Auto-Regressive Models – AR(p)

$y_t = \varphi_0 + \varphi_1 y_{t-1} + \varphi_2 y_{t-2} + \dots + \varphi_p y_{t-p} + \epsilon_t$

Parameters to estimate

Explanatory variables

Page 50

Moving Average MA(1)

$y_t = \varphi_0 + \varphi_1 \epsilon_{t-1} + \epsilon_t$

Explanatory variable

Parameters to estimate

Can Least Squares Algorithm be used to estimate the parameters?
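One way to think about this question: for AR(1) the explanatory variable is an observed past value, so the design matrix can be written down directly, whereas for MA(1) the explanatory variable is a past noise term that is never observed. In practice the parameters are estimated iteratively, for example by maximum likelihood. The sketch below uses base R's `arima()` on simulated data; the true coefficient 0.6, the seed and the sample size are arbitrary choices for illustration.

```r
set.seed(42)                                   # arbitrary seed, for reproducibility

# Simulate an MA(1) series with a known coefficient (illustrative value)
true_ma <- 0.6
y <- arima.sim(model = list(ma = true_ma), n = 2000)

# The lagged noise is not observed, so no design matrix X can be built
# from the data alone; arima() estimates the model iteratively instead
# (conditional sum of squares followed by maximum likelihood by default).
fit <- arima(y, order = c(0, 0, 1))

print(coef(fit))                               # "ma1" should be near true_ma
```

With a long simulated series the estimated `ma1` coefficient should land close to the value used in the simulation.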

Page 51

Moving average MA(q)

$y_t = \varphi_0 + \varphi_1 \epsilon_{t-1} + \varphi_2 \epsilon_{t-2} + \dots + \varphi_q \epsilon_{t-q} + \epsilon_t$

Parameters to estimate

Explanatory variables

Page 52

Exercises

Page 53

Remark

Page 54

Expectation

Page 55

Page 56

Page 57

Summary 17/11/2014

• Using ACF and PACF to identify AR(p) and MA(q)

• Procedure to fit an ARIMA(p,d,q)

• Definition of BIC/AIC

Page 58

Fitting ARIMA(p,d,q)

Page 59

Fitting ARIMA(p,d,q)

To avoid overfitting, choose p ≤ 3, q ≤ 3 and d ≤ 3.

Page 60

PACF for AR(1)

Maths

Page 61

ACF for MA(1)

Maths

Page 62

MA(1) as an AR(∞)

For MA(1), the damped sine wave or exponential decay seen in the PACF corresponds to these coefficients vanishing towards 0.
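A short derivation makes the vanishing coefficients explicit. It is a sketch under two assumptions that are not stated on the slide: the constant $\varphi_0$ is set to 0 for readability, and $|\varphi_1| < 1$ so that the series converges.

```latex
\begin{align*}
y_t &= \epsilon_t + \varphi_1 \epsilon_{t-1}
  && \text{MA(1) with } \varphi_0 = 0 \\
\epsilon_{t-1} &= y_{t-1} - \varphi_1 \epsilon_{t-2}
  && \text{same model at time } t-1 \\
y_t &= \epsilon_t + \varphi_1 y_{t-1} - \varphi_1^2 \epsilon_{t-2}
  && \text{substitute once} \\
y_t &= \epsilon_t + \varphi_1 y_{t-1} - \varphi_1^2 y_{t-2} + \varphi_1^3 \epsilon_{t-3}
  && \text{substitute again} \\
y_t &= \epsilon_t - \sum_{k=1}^{\infty} (-\varphi_1)^k \, y_{t-k}
  && \text{repeat indefinitely, } |\varphi_1| < 1
\end{align*}
```

The coefficient on $y_{t-k}$ has magnitude $|\varphi_1|^k$, so it shrinks geometrically, alternating in sign when $\varphi_1 > 0$.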

Page 63

AR(1) as an MA(∞)
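The reverse representation follows by repeated substitution of the AR(1) equation into itself. The sketch assumes $|\varphi_1| < 1$, which is needed for the sums to converge.

```latex
\begin{align*}
y_t &= \varphi_0 + \varphi_1 y_{t-1} + \epsilon_t \\
    &= \varphi_0 + \varphi_1 \left( \varphi_0 + \varphi_1 y_{t-2} + \epsilon_{t-1} \right) + \epsilon_t \\
    &= \varphi_0 (1 + \varphi_1) + \varphi_1^2 y_{t-2} + \epsilon_t + \varphi_1 \epsilon_{t-1} \\
    &\;\;\vdots \\
    &= \frac{\varphi_0}{1 - \varphi_1} + \sum_{k=0}^{\infty} \varphi_1^k \, \epsilon_{t-k}
\end{align*}
```

The weight on $\epsilon_{t-k}$ is $\varphi_1^k$, so older shocks contribute less and less.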

Page 64

Criteria to select the best ARIMA model

Page 65

Exercise: Show

Page 66

Hirotugu Akaike (1927-2009)

1970s: proposed model selection with an information criterion (AIC)

Page 67

Bayesian Information Criterion (BIC)

Thomas Bayes (1701-1761)

The BIC was developed by Gideon E. Schwarz, who gave a Bayesian argument for adopting it.

http://en.wikipedia.org/wiki/Bayesian_information_criterion

Page 68

Seasonal ARIMA(p,d,q)(P,D,Q)s

Page 69

Seasonal ARIMA(p,d,q)(P,D,Q)s

Choose your criterion, AIC or BIC (and stick to it). Select the ARIMA model with the lowest AIC or BIC

with m=p+q+P+Q
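The selection rule can be sketched as a small grid search in R. This is an illustrative sketch only: the data are simulated, the search is restricted to non-seasonal models with d = 0 and p, q ≤ 3, and R's `AIC()` and `BIC()` count the estimated parameters in their own way (including the noise variance), so the absolute values may differ from a definition based on m alone.

```r
set.seed(1)                                    # arbitrary seed
y <- arima.sim(model = list(ar = 0.7), n = 300)  # simulated AR(1) data

# Candidate orders: p and q from 0 to 3, no differencing
grid <- expand.grid(p = 0:3, q = 0:3)
grid$aic <- NA_real_
grid$bic <- NA_real_

for (i in seq_len(nrow(grid))) {
  fit <- tryCatch(
    suppressWarnings(arima(y, order = c(grid$p[i], 0, grid$q[i]), method = "ML")),
    error = function(e) NULL                   # skip orders that fail to fit
  )
  if (!is.null(fit)) {
    grid$aic[i] <- AIC(fit)
    grid$bic[i] <- BIC(fit)
  }
}

# Pick one criterion and keep the model with the lowest value
best_aic <- grid[which.min(grid$aic), ]
best_bic <- grid[which.min(grid$bic), ]
print(best_aic)
print(best_bic)
```

BIC penalises extra parameters more heavily than AIC for series of this length, so the two criteria can select different orders, which is one reason to decide on the criterion beforehand.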

Page 70

ARIMA(0,0,0)(P=1,0,0)s Vs ARIMA(0,0,0)(0,D=1,0)s

$y_t = c + \varphi_1 y_{t-s} + \epsilon_t$

$y_t = c + y_{t-s} + \epsilon_t$
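Writing the two models side by side shows how they relate. The parameter counts use the slide's convention m = p + q + P + Q; the stationarity remark is a standard fact added here for context.

```latex
\begin{align*}
\text{ARIMA}(0,0,0)(1,0,0)_s:\quad
  & y_t = c + \varphi_1 y_{t-s} + \epsilon_t
  && \varphi_1 \text{ is estimated, } m = 1 \\
\text{ARIMA}(0,0,0)(0,1,0)_s:\quad
  & y_t - y_{t-s} = c + \epsilon_t
  && \text{seasonal difference, } m = 0
\end{align*}
```

The second model is the first with $\varphi_1$ fixed to 1 instead of estimated. The seasonal AR model is stationary only when $|\varphi_1| < 1$, while the seasonally differenced model describes a series whose seasonal pattern persists from one period to the next.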

Page 71

Summary

Page 72

Summary

Page 73

Summary

[Figure: timeline from the 1950s to 2014 (1950s, 1960s, 1970s, 1980s, 1990s, 2014) placing Holt-Winters (SES, DES, SHW+, SHWx), ARIMA, AIC and BIC]

Page 74

Other time series models

ARCH (1982): autoregressive conditional heteroskedasticity

GARCH (1986): generalized autoregressive conditional heteroskedasticity

…

More at http://en.wikipedia.org/wiki/Autoregressive_conditional_heteroskedasticity

Page 75

Concluding Remarks


Page 76

Concluding remarks

• The Prediction – Update loop

• Combining experts