TAKE HOME PROJECT 2 Group C: Robert Matarazzo, Michael Stromberg, Yuxing Zhang, Yin Chu, Leslie Wei, and Kurtis Hollar

Post on 19-Dec-2015



Introduction

We chose to forecast the imported petroleum price index.

Petroleum has many uses but is mainly used to produce fuels.

The price of petroleum heavily influences the price of gasoline.


Original Data


Decay in the autocorrelation

Large spike at lag 1 in the partial correlation

This structure indicates non-stationary data


ADF Test Statistic   -0.443169     1%   Critical Value*   -3.4575
                                   5%   Critical Value    -2.8729
                                   10%  Critical Value    -2.5728

*MacKinnon critical values for rejection of hypothesis of a unit root.

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(PETROP)
Method: Least Squares
Date: 05/28/10   Time: 11:42
Sample(adjusted): 1989:01 2010:04
Included observations: 256 after adjusting endpoints

Variable      Coefficient   Std. Error   t-Statistic   Prob.
PETROP(-1)    -0.004220     0.009521     -0.443169     0.6580
C              1.450960     1.370614      1.058621     0.2908

R-squared             0.000773    Mean dependent var       0.960156
Adjusted R-squared   -0.003161    S.D. dependent var       12.89963
S.E. of regression    12.92000    Akaike info criterion    7.963212
Sum squared resid     42399.31    Schwarz criterion        7.990909
Log likelihood       -1017.291    F-statistic              0.196399
Durbin-Watson stat    0.728269    Prob(F-statistic)        0.658021

The ADF test statistic (-0.44) is well above even the 10% critical value (-2.57), so we cannot reject a unit root: this confirms the notion of non-stationary data
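The decision rule behind this conclusion can be sketched in a few lines of Python. This is only an illustration: the numbers are copied from the ADF output on this slide, and the function name `adf_rejects_unit_root` is our own label, not an EViews feature.

```python
# Values copied from the ADF output for the level of PETROP.
adf_stat = -0.443169
critical_values = {"1%": -3.4575, "5%": -2.8729, "10%": -2.5728}


def adf_rejects_unit_root(stat, crit):
    """Return, per significance level, whether the unit-root null is rejected.

    The ADF test is left-tailed: the null is rejected only when the
    statistic is more negative than the critical value.
    """
    return {level: stat < value for level, value in crit.items()}


decisions = adf_rejects_unit_root(adf_stat, critical_values)
for level, rejected in decisions.items():
    print(level, "reject unit root" if rejected else "cannot reject unit root")
```

With these inputs the null of a unit root is not rejected at any of the three levels, which is what the slide means by non-stationary data.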


Application of Step Function and Dummy Variables


Dependent Variable: PETROP
Method: Least Squares
Date: 05/28/10   Time: 11:45
Sample: 1988:12 2010:04
Included observations: 257

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          105.3547      4.906600     21.47203      0.0000
D1         314.1453      75.53614     4.158874      0.0000
D2         266.1453      75.53614     3.523417      0.0005
D3         183.5453      75.53614     2.429901      0.0158
D4         96.24534      75.53614     1.274163      0.2038
D5         45.44534      75.53614     0.601637      0.5480
STEP       130.5766      19.47246     6.705704      0.0000

R-squared             0.241580    Mean dependent var       117.0074
Adjusted R-squared    0.223378    S.D. dependent var       85.53265
S.E. of regression    75.37662    Akaike info criterion    11.50973
Sum squared resid     1420409.    Schwarz criterion        11.60640
Log likelihood       -1472.000    F-statistic              13.27215
Durbin-Watson stat    0.111977    Prob(F-statistic)        0.000000

Regression of petroleum price against dummy variables and step function
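As a rough sketch of how regressors of this kind are usually built, the Python below constructs a one-month pulse dummy and a step function over the regression sample. The event dates are hypothetical, since the slides do not state which months D1 to D5 and STEP refer to, and the helper names are ours.

```python
def month_range(start, end):
    """List (year, month) tuples from start to end, inclusive."""
    months = []
    year, month = start
    while (year, month) <= end:
        months.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


def pulse_dummy(months, event):
    """1 in the single month `event`, 0 elsewhere."""
    return [1 if m == event else 0 for m in months]


def step_function(months, break_month):
    """0 before `break_month`, 1 from `break_month` onward."""
    return [1 if m >= break_month else 0 for m in months]


def first_difference(series):
    """x_t - x_{t-1}; the result is one observation shorter."""
    return [b - a for a, b in zip(series, series[1:])]


months = month_range((1988, 12), (2010, 4))  # sample used in the regression
# HYPOTHETICAL dates, for illustration only.
d1 = pulse_dummy(months, (2008, 7))
step = step_function(months, (2005, 1))
dstep = first_difference(step)

print(len(months))            # 257, matching "Included observations: 257"
print(sum(d1), sum(dstep))    # a pulse has one 1; a differenced step is a pulse
```

The last line shows why a differenced step behaves like a pulse, which matters once the dependent variable itself is differenced.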


Decay in the autocorrelation

Large spike at lag 1 in the partial correlation

This structure indicates non-stationary data


Breusch-Godfrey Serial Correlation LM Test:

F-statistic 5887.195 Probability 0.000000

Obs*R-squared 251.6986 Probability 0.000000

Test Equation:

Dependent Variable: RESID

Method: Least Squares

Date: 05/28/10 Time: 11:49

Variable Coefficient Std. Error t-Statistic Prob.

C 1.306229 0.715046 1.826776 0.0689

D1 -373.8602 11.44297 -32.67159 0.0000

D2 62.97759 26.14838 2.408470 0.0167

D3 -1.306229 10.89304 -0.119914 0.9046

D4 -1.306229 10.89304 -0.119914 0.9046

D5 -1.306229 10.89304 -0.119914 0.9046

STEP 1.904366 2.810381 0.677618 0.4986

RESID(-1) 1.207329 0.062577 19.29363 0.0000

RESID(-2) -0.178742 0.065813 -2.715908 0.0071

R-squared 0.979372 Mean dependent var 5.73E-14

Adjusted R-squared 0.978706 S.D. dependent var 74.48806

S.E. of regression 10.86954 Akaike info criterion 7.644198

Sum squared resid 29300.44 Schwarz criterion 7.768484

Log likelihood -973.2794 F-statistic 1471.799

Durbin-Watson stat 1.900201 Prob(F-statistic) 0.000000

The F-statistic (probability 0.000000) rejects the null of no serial correlation, so the residuals of this regression are still serially correlated

Further steps must be taken
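The Obs*R-squared line of the test can be reproduced by hand. The sketch below assumes the usual form of the LM statistic (number of observations times the R-squared of the test equation) and uses the fact that a chi-square variable with 2 degrees of freedom, one per lagged residual, has upper-tail probability exp(-x/2). The function names are ours.

```python
import math


def breusch_godfrey_lm(n_obs, r_squared):
    """LM statistic: observations times R-squared of the test equation."""
    return n_obs * r_squared


def chi2_sf_2df(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom.

    With exactly 2 degrees of freedom the survival function is exp(-x / 2).
    """
    return math.exp(-x / 2.0)


# Values copied from the output: 257 observations, R-squared 0.979372.
lm = breusch_godfrey_lm(257, 0.979372)
p_value = chi2_sf_2df(lm)

print(round(lm, 4))      # 251.6986, the reported Obs*R-squared
print(p_value < 0.05)    # True: reject "no serial correlation"
```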


Logarithm Transformation and First Differenced Data

Regression of the first difference of the logarithm of petroleum price against dummy variables and step function

Dependent Variable: DLNPETROP
Method: Least Squares
Date: 05/29/10   Time: 23:20
Sample(adjusted): 1989:01 2009:04
Included observations: 244 after adjusting endpoints

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C           0.010559     0.004479      2.357406     0.0192
DD1        -0.113533     0.069245     -1.639584     0.1024
DD2        -0.245607     0.098132     -2.502824     0.0130
DD3        -0.507634     0.120437     -4.214950     0.0000
DD4        -0.877988     0.139356     -6.300309     0.0000
DD5        -1.178878     0.156127     -7.550778     0.0000
DSTEP      -1.236968     0.171380     -7.217702     0.0000

R-squared             0.222473    Mean dependent var
Adjusted R-squared    0.202789    S.D. dependent var
S.E. of regression    0.069100    Akaike info criterion
Sum squared resid     1.131627    Schwarz criterion
Log likelihood        309.3474    F-statistic
Durbin-Watson stat    1.102828    Prob(F-statistic)
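The transformation behind DLNPETROP can be sketched as follows. The index values are made up for illustration and are not the project data; the point is only that the first difference of the logarithm is close to the month-over-month percent change.

```python
import math


def log_first_difference(prices):
    """Return ln(P_t) - ln(P_{t-1}) for each consecutive pair of prices."""
    logs = [math.log(p) for p in prices]
    return [b - a for a, b in zip(logs, logs[1:])]


# HYPOTHETICAL index values, not taken from the project data.
petrop = [100.0, 105.0, 99.75, 110.0]
dlnpetrop = log_first_difference(petrop)
pct_change = [b / a - 1.0 for a, b in zip(petrop, petrop[1:])]

for d, r in zip(dlnpetrop, pct_change):
    print(f"log difference {d:+.4f}   percent change {r:+.4f}")
```

Differencing costs one observation, which is why the adjusted sample starts one month after the original one.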


ADF Test Statistic   -7.814147     1%   Critical Value*   -3.4580
                                   5%   Critical Value    -2.8731
                                   10%  Critical Value    -2.5729

*MacKinnon critical values for rejection of hypothesis of a unit root.

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(DLNPETROP)
Method: Least Squares
Date: 05/30/10   Time: 15:57
Sample(adjusted): 1989:06 2010:04
Included observations: 251 after adjusting endpoints

Variable            Coefficient   Std. Error   t-Statistic   Prob.
DLNPETROP(-1)       -0.700127     0.089597     -7.814147     0.0000
D(DLNPETROP(-1))     0.315975     0.082244      3.841921     0.0002
D(DLNPETROP(-2))     0.098728     0.077206      1.278768     0.2022
D(DLNPETROP(-3))     0.143967     0.067344      2.137800     0.0335
D(DLNPETROP(-4))     0.069445     0.063319      1.096741     0.2738
C                    0.004222     0.004091      1.032067     0.3031

R-squared             0.286618    Mean dependent var      -6.30E-05
Adjusted R-squared    0.272059    S.D. dependent var       0.075215
S.E. of regression    0.064173    Akaike info criterion   -2.630867
Sum squared resid     1.008940    Schwarz criterion       -2.546593
Log likelihood        336.1738    F-statistic              19.68690
Durbin-Watson stat    2.007980    Prob(F-statistic)        0.000000

The ADF test statistic (-7.81) is below the 1% critical value (-3.4580), so we reject the unit root: the data is now stationary


Notice the significant spikes at lags 1, 2, and 11

The spike at lag 10 may also be significant
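A spike in a correlogram is usually judged against an approximate band of plus or minus 2 divided by the square root of the sample size. The sketch below computes sample autocorrelations and flags lags outside that band. It covers only the autocorrelation column, not the partial correlation, and the toy series is hypothetical.

```python
import math


def sample_acf(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def significant_lags(series, max_lag):
    """Lags whose autocorrelation lies outside the rough +/- 2/sqrt(n) band."""
    bound = 2.0 / math.sqrt(len(series))
    acf = sample_acf(series, max_lag)
    return [k for k, r in enumerate(acf, start=1) if abs(r) > bound]


# HYPOTHETICAL toy series (strictly alternating), not the project data.
toy = [1.0, -1.0] * 20
print(significant_lags(toy, 4))   # [1, 2, 3, 4]: every lag is far outside the band
```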


Building The Model

Dependent Variable: DLNPETROP
Method: Least Squares
Date: 05/29/10   Time: 23:26
Sample(adjusted): 1989:03 2009:04
Included observations: 242 after adjusting endpoints
Convergence achieved after 6 iterations
Backcast: 1988:04 1989:02

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C           0.010550     0.006695      1.575659     0.1165
DD1        -0.118552     0.060062     -1.973814     0.0496
DD2        -0.253143     0.109485     -2.312124     0.0216
DD3        -0.530545     0.146073     -3.632045     0.0003
DD4        -0.894882     0.172724     -5.180999     0.0000
DD5        -1.194821     0.194876     -6.131179     0.0000
DSTEP      -1.245789     0.212066     -5.874541     0.0000
AR(1)       0.519082     0.065070      7.977273     0.0000
AR(2)      -0.196711     0.064619     -3.044167     0.0026
MA(11)      0.176080     0.066417      2.651109     0.0086

R-squared             0.428059    Mean dependent var       0.004986
Adjusted R-squared    0.405871    S.D. dependent var       0.077414
S.E. of regression    0.059671    Akaike info criterion   -2.759510
Sum squared resid     0.826055    Schwarz criterion       -2.615339
Log likelihood        343.9007    F-statistic              19.29289
Durbin-Watson stat    1.980779    Prob(F-statistic)        0.000000

Inverted AR Roots     .26+.36i    .26 -.36i
Inverted MA Roots     .82+.24i    .82 -.24i    .56 -.65i    .56+.65i
                      .12 -.85i   .12+.85i    -.35+.78i    -.35 -.78i
                     -.72+.46i   -.72 -.46i   -.85

Based on the correlogram, we tried a model with AR(1), AR(2), and MA(11) terms

All of the coefficients except the constant (Prob. 0.1165) are significant at the 5% level
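The "Inverted AR Roots" line in the output can be checked from the two AR coefficients. A minimal sketch, assuming the usual AR(2) form y_t = phi1*y_{t-1} + phi2*y_{t-2} + e_t, whose inverted roots solve z^2 - phi1*z - phi2 = 0; the function name is ours.

```python
import cmath


def inverted_ar2_roots(phi1, phi2):
    """Roots of z^2 - phi1*z - phi2 = 0 (the inverted roots of an AR(2))."""
    disc = cmath.sqrt(phi1 * phi1 + 4.0 * phi2)
    return (phi1 + disc) / 2.0, (phi1 - disc) / 2.0


# AR(1) and AR(2) coefficients copied from the regression output.
roots = inverted_ar2_roots(0.519082, -0.196711)
for z in roots:
    print(f"{z.real:.2f}{z.imag:+.2f}i   modulus {abs(z):.2f}")

# Both inverted roots inside the unit circle means the AR part is stationary.
is_stationary = all(abs(z) < 1.0 for z in roots)
print(is_stationary)
```

The roots come out as .26 plus or minus .36i with modulus about .44, in line with the reported values.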


Breusch-Godfrey Serial Correlation LM Test:
F-statistic       0.776204    Probability    0.461351
Obs*R-squared     1.622180    Probability    0.444374

Test Equation:
Dependent Variable: RESID
Method: Least Squares
Date: 05/29/10   Time: 23:30

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           -8.72E-05     0.006703     -0.013005     0.9896
DD1          0.001103     0.060235      0.018311     0.9854
DD2          0.002716     0.110076      0.024676     0.9803
DD3          0.006493     0.147130      0.044130     0.9648
DD4          0.009790     0.173997      0.056265     0.9552
DD5          0.010948     0.196074      0.055836     0.9555
DSTEP        0.011014     0.213172      0.051666     0.9588
AR(1)        0.583091     0.772579      0.754733     0.4512
AR(2)        0.028583     0.158990      0.179781     0.8575
MA(11)       0.001298     0.066538      0.019508     0.9845
RESID(-1)   -0.572796     0.773672     -0.740359     0.4598
RESID(-2)   -0.362230     0.330054     -1.097488     0.2736

R-squared             0.006703    Mean dependent var      -6.23E-05
Adjusted R-squared   -0.040802    S.D. dependent var       0.058546
S.E. of regression    0.059728    Akaike info criterion   -2.749708
Sum squared resid     0.820517    Schwarz criterion       -2.576703
Log likelihood        344.7147    F-statistic              0.141104
Durbin-Watson stat    1.998401    Prob(F-statistic)        0.999515

The F-statistic (probability 0.46) fails to reject the null of no serial correlation, so the residuals no longer show serial correlation


The resulting correlogram indicates that the spike at lag 10 may still be significant

We will try incorporating an MA(10) into a new model


Adding an MA(10) term made the coefficients more significant in general

All of the coefficients except the constant (Prob. 0.1606) are significant at the 5% level

Dependent Variable: DLNPETROP

Method: Least Squares

Date: 05/29/10 Time: 23:56

Sample(adjusted): 1989:03 2009:04

Included observations: 242 after adjusting endpoints

Convergence achieved after 6 iterations

Backcast: 1988:04 1989:02

Variable Coefficient Std. Error t-Statistic Prob.

C 0.010651 0.007566 1.407688 0.1606

DD1 -0.122285 0.059446 -2.057094 0.0408

DD2 -0.274387 0.108398 -2.531304 0.0120

DD3 -0.553364 0.144543 -3.828359 0.0002

DD4 -0.923081 0.171140 -5.393715 0.0000

DD5 -1.216743 0.192998 -6.304429 0.0000

DSTEP -1.274860 0.210151 -6.066406 0.0000

AR(1) 0.515431 0.064487 7.992847 0.0000

AR(2) -0.191517 0.064415 -2.973152 0.0033

MA(10) 0.157807 0.064670 2.440197 0.0154

MA(11) 0.185034 0.064715 2.859226 0.0046

R-squared 0.441873 Mean dependent var 0.004986

Adjusted R-squared 0.417712 S.D. dependent var 0.077414

S.E. of regression 0.059073 Akaike info criterion -2.775696

Sum squared resid 0.806103 Schwarz criterion -2.617107

Log likelihood 346.8592 F-statistic 18.28842

Durbin-Watson stat 1.977425 Prob(F-statistic) 0.000000

Inverted AR Roots .26+.35i .26 -.35i

Inverted MA Roots .86+.26i .86 -.26i .56+.70i .56 -.70i

.08 -.88i .08+.88i -.41 -.75i -.41+.75i

-.71 -.39i -.71+.39i -.78
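One way to compare the models with and without the MA(10) term is through the information criteria in the two regression outputs. The sketch below assumes the per-observation definitions AIC = -2l/n + 2k/n and SIC = -2l/n + k*ln(n)/n, which reproduce the reported values; the log likelihoods, observation counts, and coefficient counts are copied from the outputs.

```python
import math


def aic(log_likelihood, n_obs, n_params):
    """Akaike criterion in per-observation form: -2l/n + 2k/n."""
    return -2.0 * log_likelihood / n_obs + 2.0 * n_params / n_obs


def schwarz(log_likelihood, n_obs, n_params):
    """Schwarz criterion in per-observation form: -2l/n + k*ln(n)/n."""
    return -2.0 * log_likelihood / n_obs + n_params * math.log(n_obs) / n_obs


# (log likelihood, observations, estimated coefficients) from the two outputs.
models = {
    "AR(1) AR(2) MA(11)": (343.9007, 242, 10),
    "AR(1) AR(2) MA(10) MA(11)": (346.8592, 242, 11),
}
for name, (loglik, n, k) in models.items():
    print(f"{name}: AIC {aic(loglik, n, k):.4f}  SIC {schwarz(loglik, n, k):.4f}")
```

Both criteria are lower (better) for the model that includes MA(10), although the Schwarz improvement is small because of its heavier penalty on the extra coefficient.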


The correlogram has no significant spikes

All remaining lags are even less significant than without the MA(10) term


Once again, there is no serial correlation in this model

Breusch-Godfrey Serial Correlation LM Test:
F-statistic       0.617507    Probability    0.540183
Obs*R-squared     1.298123    Probability    0.522536

Test Equation:
Dependent Variable: RESID
Method: Least Squares
Date: 05/29/10   Time: 23:59

Variable    Coefficient   Std. Error   t-Statistic   Prob.
C           -8.47E-05     0.007580     -0.011180     0.9911
DD1          0.000363     0.059634      0.006085     0.9952
DD2          0.000184     0.108973      0.001688     0.9987
DD3          0.001832     0.145544      0.012587     0.9900
DD4          0.004007     0.172313      0.023253     0.9815
DD5          0.004465     0.194091      0.023006     0.9817
DSTEP        0.004825     0.211181      0.022846     0.9818
AR(1)        0.343824     0.801985      0.428716     0.6685
AR(2)        0.064231     0.162570      0.395097     0.6931
MA(10)      -0.005656     0.064992     -0.087020     0.9307
MA(11)       0.001445     0.064821      0.022295     0.9822
RESID(-1)   -0.331181     0.802074     -0.412905     0.6801
RESID(-2)   -0.273438     0.337831     -0.809393     0.4191

R-squared             0.005364    Mean dependent var       2.72E-06
Adjusted R-squared   -0.046757    S.D. dependent var       0.057834
S.E. of regression    0.059171    Akaike info criterion   -2.764545
Sum squared resid     0.801779    Schwarz criterion       -2.577123
Log likelihood        347.5100    F-statistic              0.102918
Durbin-Watson stat    1.998171    Prob(F-statistic)        0.999950


ARCH Test:
F-statistic       0.213734    Probability    0.644277
Obs*R-squared     0.215330    Probability    0.642622

Test Equation:
Dependent Variable: RESID^2
Method: Least Squares
Date: 05/30/10   Time: 00:00
Sample(adjusted): 1989:04 2009:04
Included observations: 241 after adjusting endpoints

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C             0.003238      0.000444     7.299747      0.0000
RESID^2(-1)   0.029897      0.064668     0.462314      0.6443

R-squared             0.000893    Mean dependent var       0.003338
Adjusted R-squared   -0.003287    S.D. dependent var       0.006005
S.E. of regression    0.006015    Akaike info criterion   -7.380936
Sum squared resid     0.008646    Schwarz criterion       -7.352016
Log likelihood        891.4027    F-statistic              0.213734
Durbin-Watson stat    2.001542    Prob(F-statistic)        0.644277

We ran the ARCH test to check for heteroskedasticity

The F-statistic (probability 0.64) fails to reject the null of no ARCH effects, which indicates the residuals are homoskedastic
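The probability reported next to Obs*R-squared in the ARCH test can be reproduced by hand. This sketch assumes the statistic is compared with a chi-square distribution with 1 degree of freedom, one for the single lag of RESID^2, and uses the identity P(Z^2 > x) = erfc(sqrt(x/2)) for a standard normal Z. The function name is ours.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    If Z is standard normal, P(Z^2 > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


# Obs*R-squared copied from the ARCH test output (one lag of RESID^2).
arch_lm = 0.215330
p_value = chi2_sf_1df(arch_lm)

print(round(p_value, 3))
print("no ARCH effects at 5%" if p_value > 0.05 else "ARCH effects at 5%")
```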


The correlogram of the squared residuals shows slight structure in the earlier lags

Dependent Variable: DLNPETROP

Method: ML - ARCH

Date: 05/30/10 Time: 00:03

Sample(adjusted): 1989:03 2009:04

Included observations: 242 after adjusting endpoints

Convergence achieved after 97 iterations

Backcast: 1988:04 1989:02

Coefficient Std. Error z-Statistic Prob.

C 0.010161 0.007176 1.415926 0.1568

DD1 -0.121103 3.887236 -0.031154 0.9751

DD2 -0.273365 6.337982 -0.043131 0.9656

DD3 -0.550069 15.65666 -0.035133 0.9720

DD4 -0.920318 21.19041 -0.043431 0.9654

DD5 -1.212506 21.40283 -0.056652 0.9548

DSTEP -1.271241 21.42383 -0.059338 0.9527

AR(1) 0.508052 0.073322 6.929041 0.0000

AR(2) -0.196208 0.075139 -2.611254 0.0090

MA(10) 0.163558 0.066876 2.445691 0.0145

MA(11) 0.164319 0.066187 2.482657 0.0130

Variance Equation

C 0.000432 0.000555 0.777128 0.4371

ARCH(1) 0.067695 0.065380 1.035413 0.3005

GARCH(1) 0.803114 0.214145 3.750337 0.0002

R-squared 0.441553 Mean dependent var 0.004986

Adjusted R-squared 0.409711 S.D. dependent var 0.077414

S.E. of regression 0.059477 Akaike info criterion -2.771904

Sum squared resid 0.806566 Schwarz criterion -2.570065

Log likelihood 349.4004 F-statistic 13.86729

Durbin-Watson stat 1.961365 Prob(F-statistic) 0.000000

Inverted AR Roots .25+.36i .25 -.36i

Inverted MA Roots .86 -.26i .86+.26i .56 -.69i .56+.69i

.07 -.87i .07+.87i -.41 -.74i -.41+.74i

-.70 -.37i -.70+.37i -.75

To fix this slight structure, we tried adding ARCH/GARCH to the model

However, the coefficients on all of the dummy and step variables became insignificant, and the ARCH(1) term is itself insignificant

We returned to an AR(1) AR(2) MA(10) MA(11) model
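For reference, the variance equation that was tried and then set aside is a GARCH(1,1) recursion. The sketch below uses the three variance-equation coefficients from the ML - ARCH output; the residuals are hypothetical, and the function name is ours.

```python
# Variance-equation coefficients copied from the ML - ARCH output.
omega, alpha, beta = 0.000432, 0.067695, 0.803114


def garch11_variances(residuals, omega, alpha, beta, initial_variance):
    """Conditional variances h_t = omega + alpha*e_{t-1}^2 + beta*h_{t-1}."""
    variances = [initial_variance]
    for e in residuals[:-1]:
        variances.append(omega + alpha * e * e + beta * variances[-1])
    return variances


# Long-run (unconditional) variance implied by the estimates.
persistence = alpha + beta
long_run_variance = omega / (1.0 - persistence)

# HYPOTHETICAL residuals, for illustration only.
toy_residuals = [0.05, -0.10, 0.02, 0.00]
h = garch11_variances(toy_residuals, omega, alpha, beta, long_run_variance)

print(round(persistence, 6), round(long_run_variance, 6))
```

The implied long-run variance is about 0.0033, close to the mean of the squared residuals (0.003338) in the ARCH test equation, which fits the finding that the variance is roughly constant.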

Forecasting The Last 12 Months



Forecasting Through 2010
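Because the model is estimated on the first difference of the log price, a forecast of DLNPETROP has to be accumulated and exponentiated to get back to the level of the price index. A minimal sketch with hypothetical inputs, since the slides do not list the forecast values:

```python
import math


def levels_from_log_differences(last_level, dln_forecasts):
    """Rebuild price levels from log differences: P_t = P_{t-1} * exp(dln_t)."""
    levels = []
    level = last_level
    for dln in dln_forecasts:
        level *= math.exp(dln)
        levels.append(level)
    return levels


# HYPOTHETICAL inputs, for illustration only.
last_observed = 200.0
dln_forecasts = [0.01, 0.01, 0.01]
forecast_levels = levels_from_log_differences(last_observed, dln_forecasts)

print([round(p, 2) for p in forecast_levels])
```

A forecast log difference that stays positive, such as one near the estimated constant of about 0.0107, therefore translates into a steadily rising price level.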



Conclusion

There is an upward trend in the forecast, suggesting an increase in the future petroleum price

Because of this, companies that rely heavily on oil may want to hedge against rising prices. Ex: Southwest Airlines (2007)


The End
