Time Series in R: Forecasting and Visualisation

Time series in R

29 May 2017


Outline

1 ts objects

2 Time plots

3 Lab session 1

4 Seasonal plots

5 Seasonal or cyclic?

6 Lag plots and autocorrelation

7 Lab session 2


Time series

Time series consist of sequences of observations collected over time. We will assume the time periods are equally spaced.

Time series examples:
- Daily IBM stock prices
- Monthly rainfall
- Annual Google profits
- Quarterly Australian beer production


ts objects and ts function

A time series is stored in a ts object in R:
- a list of numbers
- information about times those numbers were recorded.

Example

Year  Observation
2012  123
2013   39
2014   78
2015   52
2016  110

y <- ts(c(123,39,78,52,110), start=2012)
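As a quick check of the example above, the sketch below rebuilds the series and inspects its time attributes. It uses base R only; nothing beyond `ts()`, `time()`, `start()`, `end()` and `frequency()` is assumed.

```r
# Annual series from the table above: one observation per year from 2012.
y <- ts(c(123, 39, 78, 52, 110), start = 2012)

print(y)          # values together with their time index
time(y)           # the year attached to each observation
start(y); end(y)  # first and last time points, each as c(year, period)
frequency(y)      # number of observations per year
```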


ts objects and ts function

For observations that are more frequent than once per year, add a frequency argument.

E.g., monthly data stored as a numerical vector z:

y <- ts(z, frequency=12, start=c(2003, 1))
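A minimal sketch of the monthly case. The vector `z` below is made-up illustrative data (three years of simulated values), not a series from the course.

```r
# Hypothetical monthly data: 36 simulated values standing in for z.
set.seed(1)
z <- round(100 + cumsum(rnorm(36)), 1)

# frequency = 12 marks the data as monthly; start = c(2003, 1) is January 2003.
y <- ts(z, frequency = 12, start = c(2003, 1))

print(y)   # printed as a year-by-month table
end(y)     # last observation as c(year, month)
cycle(y)   # position of each observation within the year (1 = Jan, ..., 12 = Dec)
```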


ts objects and ts function

ts(data, frequency, start)

Type of data   frequency              start example
Annual         1                      1995
Quarterly      4                      c(1995,2)
Monthly        12                     c(1995,9)
Daily          7 or 365.25            1 or c(1995,234)
Weekly         52.18                  c(1995,23)
Hourly         24 or 168 or 8,766     1
Half-hourly    48 or 336 or 17,532    1
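To illustrate how the `frequency` and `start` values in the table are interpreted, here is a small sketch with placeholder data (`seq_len()` values, purely illustrative). The second element of `start` is the period within the year, or within the week when `frequency = 7`.

```r
# Quarterly data starting in the 2nd quarter of 1995.
q <- ts(seq_len(8), frequency = 4, start = c(1995, 2))
time(q)[1]   # 1995.25, i.e. one quarter into 1995
end(q)       # eight quarters on: c(1997, 1)

# Monthly data starting in September 1995.
m <- ts(seq_len(6), frequency = 12, start = c(1995, 9))
end(m)       # c(1996, 2), i.e. February 1996

# Daily data with a weekly seasonal period: time is then counted in weeks.
d <- ts(seq_len(14), frequency = 7, start = 1)
end(d)       # c(2, 7): day 7 of week 2
```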


ts objects

Class: “ts”

Print and plotting methods available.

ausgdp

##      Qtr1 Qtr2 Qtr3 Qtr4
## 1971           4612 4651
## 1972 4645 4615 4645 4722
## 1973 4780 4830 4887 4933
## 1974 4921 4875 4867 4905
## 1975 4938 4934 4942 4979
## 1976 5028 5079 5112 5127
## 1977 5130 5101 5072 5069
## 1978 5100 5166 5244 5312
## 1979 5349 5370 5388 5396
## 1980 5388 5403 5442 5482
## 1981 5506 5531 5560 5583
## 1982 5568 5524 5452 5358
## 1983 5303 5320 5408 5531
## 1984 5624 5669 5697 5736
## 1985 5811 5894 5952 5965
## 1986 5943 5924 5935 5979
## 1987 6035 6097 6167 6227
## 1988 6256 6272 6295 6345
## 1989 6413 6468 6497 6511
## 1990 6514 6512 6490 6442
## 1991 6390 6346 6328 6340
## 1992 6362 6389 6433 6491
## 1993 6541 6566 6602 6671
## 1994 6765 6847 6890 6918
## 1995 6962 7018 7083 7134
## 1996 7173 7212 7242 7276
## 1997 7332 7400 7478 7550
## 1998 7618


ts objects

start(ausgdp)

## [1] 1971 3

end(ausgdp)

## [1] 1998 1

frequency(ausgdp)

## [1] 4


ts objects

Residential electricity sales

elecsales

## Time Series:
## Start = 1989
## End = 2008
## Frequency = 1
##  [1] 2354 2380 2319 2469 2386 2569 2576 2763 2844
## [10] 3001 3108 3358 3076 3181 3222 3176 3431 3527
## [19] 3638 3655


ts objects

start(elecsales)

## [1] 1989 1

end(elecsales)

## [1] 2008 1

frequency(elecsales)

## [1] 1


fpp2

Main package used in this course:

library(fpp2)

This loads:

- some data for use in examples and exercises
- forecast package (for forecasting functions)
- ggplot2 package (for graphics)
- fma package (for lots of time series data)
- expsmooth package (for more time series data)


ts objects

autoplot(ausgdp)

[Figure: time plot of ausgdp; x-axis: Time, y-axis: ausgdp]


Time plots

autoplot(a10) + ylab("$ million") + xlab("Year") +
  ggtitle("Antidiabetic drug sales")

[Figure: "Antidiabetic drug sales"; x-axis: Year, y-axis: $ million]


Lab Session 1


Time plot

autoplot(a10) + ylab("$ million") + xlab("Year") +
  ggtitle("Antidiabetic drug sales")

[Figure: "Antidiabetic drug sales"; x-axis: Year, y-axis: $ million]


Seasonal plot

ggseasonplot(a10, year.labels=TRUE, year.labels.left=TRUE) +
  ylab("$ million") +
  ggtitle("Seasonal plot: antidiabetic drug sales")

[Figure: "Seasonal plot: antidiabetic drug sales"; x-axis: Month (Jan to Dec), y-axis: $ million; one line per year (1991 to 2008), labelled at both ends]


Seasonal polar plots

ggseasonplot(a10, polar=TRUE) + ylab("$ million")

[Figure: "Seasonal plot: a10" in polar coordinates; angular axis: Month, radial axis: $ million; legend: year (1991 to 2008)]


Seasonal subseries plots

ggsubseriesplot(a10) + ylab("$ million") +
  ggtitle("Subseries plot: antidiabetic drug sales")

[Figure: "Subseries plot: antidiabetic drug sales"; x-axis: Month (Jan to Dec), y-axis: $ million]
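Seasonal and subseries plots both regroup a series by its position within the year. As a rough base-R sketch of that regrouping, the code below uses the built-in monthly `AirPassengers` series as a stand-in (an assumption made so the example runs without fpp2; it is not the `a10` data). `cycle()` gives each observation's month and `tapply()` averages within months.

```r
# Stand-in monthly series from base R (1949 to 1960); not the a10 data.
x <- AirPassengers

# One column per year, one row per month: the layout behind a seasonal plot.
by_year <- matrix(x, nrow = 12,
                  dimnames = list(month.abb, 1949:1960))

# Mean of each calendar month across years: a per-month summary of the
# kind a subseries plot lets you compare visually.
month_means <- tapply(x, cycle(x), mean)
names(month_means) <- month.abb

round(month_means, 1)
names(which.max(month_means))   # month with the highest average
```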


Quarterly Australian Beer Production

beer <- window(ausbeer, start=1992)
autoplot(beer)

[Figure: time plot of beer; x-axis: Time, y-axis: beer]
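`window()` is part of base R's stats package, so the subsetting step can be tried without fpp2. A small sketch on the built-in `AirPassengers` series (used here only as a stand-in for `ausbeer`):

```r
# Keep observations from January 1955 onwards.
recent <- window(AirPassengers, start = 1955)
start(recent)   # c(1955, 1)

# start and end may also be given as c(year, period).
mid <- window(AirPassengers, start = c(1955, 7), end = c(1956, 6))
length(mid)     # 12 monthly observations
```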


Quarterly Australian Beer Production

ggseasonplot(beer,year.labels=TRUE)

[Figure: "Seasonal plot: beer"; x-axis: Quarter (Q1 to Q4); one line per year (1992 to 2010)]


Quarterly Australian Beer Production

ggsubseriesplot(beer)

[Figure: subseries plot of beer; x-axis: Quarter (Q1 to Q4), y-axis: beer]


Time series patterns

Trend: pattern exists when there is a long-term increase or decrease in the data.

Seasonal: pattern exists when a series is influenced by seasonal factors (e.g., the quarter of the year, the month, or day of the week).

Cyclic: pattern exists when data exhibit rises and falls that are not of fixed period (duration usually of at least 2 years).


Time series patterns

autoplot(window(elec, start=1980)) +
  ggtitle("Australian electricity production") +
  xlab("Year") + ylab("GWh")

[Figure: "Australian electricity production"; x-axis: Year, y-axis: GWh]


Time series patterns

autoplot(bricksq) +
  ggtitle("Australian clay brick production") +
  xlab("Year") + ylab("million units")

[Figure: "Australian clay brick production"; x-axis: Year, y-axis: million units]


Time series patterns

autoplot(ustreas) +
  ggtitle("US Treasury Bill Contracts") +
  xlab("Day") + ylab("price")

[Figure: "US Treasury Bill Contracts"; x-axis: Day, y-axis: price]


Time series patterns

autoplot(lynx) +
  ggtitle("Annual Canadian Lynx Trappings") +
  xlab("Year") + ylab("Number trapped")

[Figure: "Annual Canadian Lynx Trappings"; x-axis: Year, y-axis: Number trapped]


Seasonal or cyclic?

Differences between seasonal and cyclic patterns:
- seasonal pattern constant length; cyclic pattern variable length
- average length of cycle longer than length of seasonal pattern
- magnitude of cycle more variable than magnitude of seasonal pattern

The timing of peaks and troughs is predictable with seasonal data, but unpredictable in the long term with cyclic data.
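One rough way to see this difference in peak timing, sketched with two built-in base-R series (`AirPassengers` as a seasonal example, `lynx` as a cyclic one). The peak-finding rule, a maximum within plus or minus 3 years, is an illustrative assumption and not a method from the course.

```r
# Seasonal: for monthly AirPassengers, find the month of each year's maximum.
by_year <- matrix(AirPassengers, nrow = 12)   # rows = months, columns = years
peak_month <- apply(by_year, 2, which.max)
table(month.abb[peak_month])                  # which months the yearly peaks fall in

# Cyclic: for annual lynx trappings, find local peaks and the gaps between them.
x <- as.numeric(lynx)
yrs <- as.numeric(time(lynx))
half_width <- 3   # illustrative choice: a peak is the maximum within +/- 3 years
interior <- (half_width + 1):(length(x) - half_width)
is_peak <- sapply(interior, function(i) {
  x[i] == max(x[(i - half_width):(i + half_width)])
})
peak_years <- yrs[interior][is_peak]
diff(peak_years)                              # years between successive peaks
```

For the seasonal series the yearly peak lands in the same part of the calendar each year, while the gaps between lynx peaks vary from cycle to cycle.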


Example: Beer production

beer <- window(ausbeer, start=1992)
gglagplot(beer)


Example: Beer production

[Figure: lag plots of beer, one panel per lag (lag 1 to lag 9); legend: Quarter (1 to 4)]


Lagged scatterplots

- Each graph shows y_t plotted against y_{t-k} for different values of k.
- The autocorrelations are the correlations associated with these scatterplots.
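The autocorrelation at a given lag can also be computed by hand. The sketch below uses the standard sample autocorrelation definition (the one base R's `acf()` applies by default), which the slides do not spell out, and the built-in quarterly `UKgas` series as a stand-in for `beer`. The helper name `autocorr` is hypothetical.

```r
# Sample autocorrelation at lag k:
#   r_k = sum_{t=k+1}^{n} (y_t - ybar)(y_{t-k} - ybar) / sum_{t=1}^{n} (y_t - ybar)^2
autocorr <- function(y, k) {
  y <- as.numeric(y)
  n <- length(y)
  d <- y - mean(y)
  sum(d[(k + 1):n] * d[1:(n - k)]) / sum(d^2)
}

y <- UKgas                                   # quarterly stand-in series from base R
manual <- sapply(1:9, function(k) autocorr(y, k))

# acf() returns lag 0 first, so drop it before comparing.
builtin <- as.numeric(acf(y, lag.max = 9, plot = FALSE)$acf)[-1]

round(manual, 3)
all.equal(manual, builtin)                   # the two calculations should agree
```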


Autocorrelation

Results for first 9 lags for beer data:

r1 r2 r3 r4 r5 r6 r7 r8 r9

-0.102 -0.657 -0.060 0.869 -0.089 -0.635 -0.054 0.832 -0.108

ggAcf(beer)

[Figure: correlogram, "Series: beer"; x-axis: Lag, y-axis: ACF]


Autocorrelation

- r4 is higher than for the other lags. This is due to the seasonal pattern in the data: the peaks tend to be 4 quarters apart and the troughs tend to be 4 quarters apart.
- r2 is more negative than for the other lags because troughs tend to be 2 quarters behind peaks.
- Together, the autocorrelations at lags 1, 2, ..., make up the autocorrelation function or ACF.
- The plot is known as a correlogram.


Trend and seasonality in ACF plots

- When data have a trend, the autocorrelations for small lags tend to be large and positive.
- When data are seasonal, the autocorrelations will be larger at the seasonal lags (i.e., at multiples of the seasonal frequency).
- When data are trended and seasonal, you see a combination of these effects.
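These effects can be checked numerically. A minimal sketch using the built-in monthly `AirPassengers` series, chosen here as an assumed example of data that is both trended and seasonal (it is not one of the course data sets):

```r
# AirPassengers (base R) is monthly with both an upward trend and seasonality.
# Drop the lag-0 value so that r[k] is the autocorrelation at lag k.
r <- as.numeric(acf(AirPassengers, lag.max = 24, plot = FALSE)$acf)[-1]

round(r[1:3], 2)        # small lags: large and positive, reflecting the trend
round(r[c(6, 12)], 2)   # lag 6 (half a season) against lag 12 (the seasonal lag)

# Base-graphics correlogram; ggAcf() from the forecast package is the ggplot2 counterpart.
acf(AirPassengers, lag.max = 24)
```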


Aus monthly electricity production

elec2 <- window(elec, start=1980)
autoplot(elec2)

[Figure: time plot of elec2; x-axis: Time, y-axis: elec2]


Aus monthly electricity production

ggAcf(elec2, lag.max=48)

[Figure: correlogram, "Series: elec2"; x-axis: Lag (0 to 48), y-axis: ACF]


Google stock price

autoplot(goog)

[Figure: time plot of goog; x-axis: Time, y-axis: goog]


Google stock price

ggAcf(goog, lag.max=100)

[Figure: correlogram, "Series: goog"; x-axis: Lag (0 to 100), y-axis: ACF]


Which is which?

[Figure: four time plots and four correlograms (ACF against lag), to be matched.
Time plots: 1. Daily temperature of cow (y-axis: chirps per minute); 2. Monthly accidental deaths (y-axis: thousands); 3. Monthly air passengers (y-axis: thousands); 4. Annual mink trappings (y-axis: thousands).
Correlograms: A, B, C, D.]


Lab Session 2
