housing starts forecast
John Montgomery
Econ 401/Dr. Townsend
December 7, 2009
Appendix 14.1 presents a highly aggregated model of real gross domestic product and its
major components. The model contains 11 behavioral equations and two identities. One
of these identities is for real disposable income, and the other is the accounting identity
for real GDP. Each equation within the model is estimated using two-stage least squares.
There are 13 endogenous variables: personal consumption expenditures, GDP, the rate of
growth of the CPI, nonresidential fixed investment, the change in business inventories,
residential fixed investment, imports of goods and services, the average yield on AAA
corporate bonds, the interest rate on 3-month Treasury bills, personal and indirect business
tax payments, the civilian unemployment rate, wage inflation, and disposable personal income.
In addition to these endogenous variables, there are 9 exogenous variables: government
purchases of goods and services, potential GDP, the money stock, household net worth, the
rate of growth of oil prices, corporate profits, the rate of growth of labor productivity,
transfer payments to persons, and exports of goods and services.
The instruments used for the individual behavioral equations in the appendix differ from
those we use in our model. Furthermore, the appendix model uses two-stage least
squares for every equation, whereas we use ordinary least squares for the recursive
equations.
Overall, the appendix model provides a good forecast, and its flow chart is a good
visual representation of the equations.
Case set four calls for us to create a simplified structural model of the U.S.
economy. The model uses the Fair method, which applies two-stage least squares and
includes the lagged dependent and independent variables as instruments. These lagged
variables are included in order to obtain consistent parameter estimates when the
disturbances are autocorrelated.
The model contains 11 behavioral equations and two identities. Most of
the equations are estimated using two-stage least squares, although three
recursive equations are estimated using ordinary least squares. Using
quarterly data from 1960 to 1993, I have created a historical simulation, which I explain
here.
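As a rough illustration of the two-stage procedure described above, the sketch below runs two-stage least squares on synthetic data with one endogenous regressor and one instrument. Everything in it is an assumption made for illustration: the helper names `ols_simple` and `two_stage_least_squares`, the data-generating coefficients, and the sample size. It does not use the case-set data, and it leaves out the AR(1) correction and the lagged instruments that the Fair method adds.

```python
import random

def ols_simple(x, y):
    """Return (intercept, slope) from regressing y on a constant and x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def two_stage_least_squares(y, x, z):
    """2SLS for y = a + b*x + u with a single instrument z for x."""
    # Stage 1: regress the endogenous regressor on the instrument.
    a1, b1 = ols_simple(z, x)
    x_hat = [a1 + b1 * zi for zi in z]
    # Stage 2: regress y on the fitted values from stage 1.
    return ols_simple(x_hat, y)

# Synthetic data (hypothetical): x is correlated with the disturbance u,
# so OLS is biased, while z is correlated with x but not with u.
rng = random.Random(0)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
x = [0.8 * zi + 0.5 * ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]

a_ols, b_ols = ols_simple(x, y)
a_2sls, b_2sls = two_stage_least_squares(y, x, z)
print(f"OLS slope:  {b_ols:.3f}")   # biased away from the true value 2.0
print(f"2SLS slope: {b_2sls:.3f}")  # should be close to the true value 2.0
```

In this toy setup the OLS slope overstates the true coefficient because the regressor moves with the disturbance, while the 2SLS slope uses only the variation in the regressor that comes from the instrument.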
Dependent Variable: TAX
Method: Two-Stage Least Squares
Date: 12/07/09   Time: 20:36
Sample: 1960Q1 1993Q4
Included observations: 136
Convergence achieved after 7 iterations
Instrument list: C GDPPOT INFL INR INV IR M M2 RL RS X YPD GDP(-1) TAX(-1)
Lagged dependent variable & regressors added to instrument list

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C               -3.967408     22.99851     -0.172507   0.8633
GDP              0.186861     0.005053      36.98054   0.0000
AR(1)            0.781575     0.054284      14.39795   0.0000

R-squared            0.995351    Mean dependent var   790.8930
Adjusted R-squared   0.995281    S.D. dependent var   235.3508
S.E. of regression   16.16703    Sum squared resid    34762.61
F-statistic          14231.47    Durbin-Watson stat   2.331248
Prob(F-statistic)    0.000000

Inverted AR Roots    .78
The first equation examined is the tax equation. It is a very simple equation that
determines total business and personal taxes as a function of GDP. Its instruments are potential
GDP, inflation, nonresidential fixed investment, the change in business inventories, residential
fixed investment, imports of goods and services, the money stock, the average yield on AAA
corporate bonds, the interest rate on three-month Treasury bills, exports, disposable personal
income, gross domestic product lagged one quarter, and taxes lagged one quarter. The high
R-squared indicates a very good fit, and the Durbin-Watson statistic is within the acceptable
range. I included an AR(1) term to correct for serial correlation, which explains the good
Durbin-Watson statistic.
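To make the Durbin-Watson reading above concrete, here is a minimal sketch of how the statistic is computed from a residual series. The function name `durbin_watson` and the toy residuals are assumptions for illustration; the residuals from the tax regression are not reproduced in this paper.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared first differences of the
    residuals divided by the sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Toy residuals (hypothetical).
alternating = [1.0, -1.0, 1.0, -1.0]   # strong negative serial correlation
persistent = [1.0, 1.0, 1.0, 1.0]      # strong positive serial correlation
print(durbin_watson(alternating))  # 3.0
print(durbin_watson(persistent))   # 0.0
```

Values near 2 indicate little first-order serial correlation, since the statistic is approximately 2(1 - r), where r is the first-order autocorrelation of the residuals.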
[Figure: TAX and TAX (Baseline), 1960-1990]
Above is the historical simulation of taxes. As the R-squared value indicated, the
simulated line fits the actual series well. The MAPE for the historical simulation is 0.05%.
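Since the MAPE (mean absolute percentage error) is the yardstick used for every simulation and forecast in this paper, a minimal sketch of the calculation may help. The function name `mape` and the sample values are hypothetical and are not taken from the TAX series. The measure is undefined when an actual value is zero.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted):
        raise ValueError("series must have the same length")
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

# Hypothetical values, not taken from the TAX series.
actual = [100.0, 200.0, 400.0]
simulated = [110.0, 190.0, 400.0]
print(round(mape(actual, simulated), 2))  # 5.0
```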
[Figure: TAX and TAX (Scenario 1), 94Q1-95Q4]
Above is the ex-post forecast for the tax equation. The forecast is fairly strong, with a
MAPE of 0.019.
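One way to see how an equation with an AR(1) error produces a forecast is to carry the last in-sample residual forward, shrinking it by the AR coefficient each quarter. The sketch below does this with the coefficients from the TAX output above. The GDP and TAX input values are hypothetical placeholders, and `forecast_tax` is an illustrative name rather than anything produced by the estimation software.

```python
# Coefficients copied from the TAX regression output above.
C = -3.967408      # constant
B_GDP = 0.186861   # coefficient on GDP
RHO = 0.781575     # AR(1) coefficient

def forecast_tax(gdp_path, last_tax, last_gdp):
    """Dynamic forecast: structural part plus the AR(1)-decayed residual."""
    # Residual in the last observed quarter.
    resid = last_tax - (C + B_GDP * last_gdp)
    forecasts = []
    for gdp in gdp_path:
        resid *= RHO  # the expected residual shrinks each quarter
        forecasts.append(C + B_GDP * gdp + resid)
    return forecasts

# Hypothetical inputs: last observed quarter and an assumed GDP path.
path = forecast_tax([6450.0, 6500.0, 6550.0], last_tax=1200.0, last_gdp=6400.0)
print([round(v, 1) for v in path])
```

Because the AR coefficient is below one in absolute value, the influence of the last residual fades and the forecast converges toward the structural part of the equation.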
Dependent Variable: CONS
Method: Two-Stage Least Squares
Date: 12/07/09   Time: 20:38
Sample: 1960Q1 1993Q4
Included observations: 136
Convergence achieved after 44 iterations
Instrument list: C G GDPPOT INFL INR INV IR M M2 RL WINF X CONS CONS(-2) NETWRTH(-1) YPD(-1)
Lagged dependent variable & regressors added to instrument list

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C               -146.5984     35.31959     -4.150627   0.0001
YPD              0.192170     0.039694      4.841345   0.0000
NETWRTH          0.040520     0.009706      4.174715   0.0001
RS              -5.241978     1.330045     -3.941204   0.0001
CONS(-1)         0.586638     0.085641      6.849938   0.0000
AR(1)            0.406659     0.116241      3.498412   0.0006

R-squared            0.999569    Mean dependent var   2834.458
Adjusted R-squared   0.999552    S.D. dependent var   873.7046
S.E. of regression   18.49246    Sum squared resid    44456.23
F-statistic          60252.24    Durbin-Watson stat   2.165461
Prob(F-statistic)    0.000000

Inverted AR Roots    .41
The table above gives the results of the two-stage least squares regression for the
consumption equation. Personal consumption represents about two-thirds of GDP, which makes
this one of the most important behavioral equations in the model. Because the lagged
dependent variable appears in the equation, and in accordance with Fair’s method, I have
included consumption lagged two quarters among the instruments. I have also included lagged
net worth and lagged personal disposable income, since Fair’s method calls for the lagged
regressors as well. Again, we notice a high R-squared value, indicating a good fit. The
Durbin-Watson statistic is also within its accepted range, again because of the AR(1) term.
The negative coefficient on the three-month Treasury bill rate makes sense: higher interest
rates raise the cost of borrowing and the return to saving, so consumption falls. The
positive coefficients on net worth and disposable personal income also make sense, since
consumption should rise as wealth and income rise.
[Figure: CONS and CONS (Baseline), 1960-1990]
Above is the graph for the historical simulation of consumption. As the R-squared
value indicates, we have a strong fit; the MAPE for the historical simulation of
consumption is 0.017%.
[Figure: CONS and CONS (Scenario 1), 94Q1-95Q4]
Above is the ex-post forecast for the consumption equation. Although the forecast
appears to dip far below the actual line, the gap is small relative to the level of the
series, as the next graph, which also includes the historical simulation, makes clear.
[Figure: CONS (Scenario 1), CONS, and CONS (Baseline), 1965-1995]
As you can see, the ex-post forecast actually fits very closely, with a MAPE of
0.01%.
Dependent Variable: M
Method: Two-Stage Least Squares
Date: 12/07/09   Time: 21:14
Sample: 1960Q1 1993Q4
Included observations: 136
Convergence achieved after 5 iterations
Instrument list: C CONS G GDP GDPPOT INFL INR INV M2 RL RS X YPD(-1)
Lagged dependent variable & regressors added to instrument list

Variable      Coefficient   Std. Error   t-Statistic   Prob.
M(-1)            0.997952     0.019728      50.58575   0.0000
C               -5.780378     8.064005     -0.716812   0.4748
YPD              0.002797     0.003503      0.798507   0.4260
AR(1)            0.120054     0.089127      1.347008   0.1803

R-squared            0.996581    Mean dependent var   345.2206
Adjusted R-squared   0.996503    S.D. dependent var   182.2808
S.E. of regression   10.77870    Sum squared resid    15335.81
F-statistic          12825.49    Durbin-Watson stat   1.984664
Prob(F-statistic)    0.000000

Inverted AR Roots    .12
The next equation is for imports of goods and services. The R-squared value is high,
and the Durbin-Watson statistic is again within the acceptable region. The positive coefficient
on personal disposable income makes sense: the more income people have, the more they spend,
and the more goods and services we import. The coefficient is not statistically significant,
however (p = 0.43), and most of the fit comes from the lagged import term.
[Figure: M and M (Baseline), 1960-1990]
The historical simulation fits reasonably well and follows the strong upward trend in
imports. The MAPE for the import equation is 0.092%.
[Figure: M (Scenario 1) and M, 94Q1-95Q4]
[Figure: M (Scenario 1), M, and M (Baseline), 1965-1995]
The first graph above shows the ex-post forecast, and the second shows the ex-post
forecast together with the actual values and the historical simulation. The MAPE for the
ex-post forecast is 0.06%, and the forecast continues the trend that the historical simulation
begins.
Dependent Variable: INR
Method: Two-Stage Least Squares
Date: 12/07/09   Time: 20:42
Sample (adjusted): 1960Q2 1993Q4
Included observations: 135 after adjustments
Convergence achieved after 26 iterations
Instrument list: C CONS G GDPPOT INFL INV IR M M2 X YPD GDP(-1) INR(-1) RL(-5)
Lagged dependent variable & regressors added to instrument list

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C                21.67766     108.1270      0.200483   0.8414
GDP              0.107208     0.015837      6.769528   0.0000
RL(-4)          -6.854006     3.604487     -1.901520   0.0594
AR(1)            0.977314     0.021144      46.22132   0.0000

R-squared            0.995463    Mean dependent var   425.6459
Adjusted R-squared   0.995359    S.D. dependent var   126.2919
S.E. of regression   8.603287    Sum squared resid    9696.167
F-statistic          9578.989    Durbin-Watson stat   1.365430
Prob(F-statistic)    0.000000

Inverted AR Roots    .98
Next we look at the equation for nonresidential investment. We notice immediately that
investment responds positively to aggregate economic activity (GDP) and negatively to the
opportunity cost of investment, measured by the AAA bond yield lagged four quarters. Again,
we see a high R-squared value, which translates to a good fit.
[Figure: INR and INR (Baseline), 1960-1990]
The historical simulation does not fit quite as well as the historical simulations of
the previous equations, and we see a MAPE of 0.144%.
[Figure: INR (Scenario 1) and INR, 94Q1-95Q4]
[Figure: INR (Scenario 1), INR, and INR (Baseline), 1960-1990]
In the graphs above we also see a larger gap between the actual values and the ex-post
forecast. The MAPE for the ex-post forecast of nonresidential investment is 0.058%.
Dependent Variable: IR
Method: Least Squares
Date: 12/07/09   Time: 20:43
Sample: 1960Q1 1993Q4
Included observations: 136
Convergence achieved after 33 iterations

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C                12.99230     60.40403      0.215090   0.8300
YPD(-1)          0.048791     0.013085      3.728651   0.0003
RS(-1)          -3.810494     0.941601     -4.046825   0.0001
AR(1)            0.949368     0.029951      31.69789   0.0000

R-squared            0.961015    Mean dependent var      190.7934
Adjusted R-squared   0.960129    S.D. dependent var      43.58628
S.E. of regression   8.703175    Akaike info criterion   7.194223
Sum squared resid    9998.374    Schwarz criterion       7.279890
Log likelihood      -485.2072    F-statistic             1084.643
Durbin-Watson stat   1.109263    Prob(F-statistic)       0.000000

Inverted AR Roots    .95
Residential investment reflects household demand for new homes. It is estimated as a
function of real disposable income and the cost of borrowing, each lagged one quarter. We use
the interest rate on three-month Treasury bills as a proxy for mortgage rates. This equation
belongs to the recursive block, so it is estimated by ordinary least squares.
[Figure: IR and IR (Baseline), 1960-1990]
In the historical simulation the actual values oscillate regularly between peaks and
troughs, while the simulated series is smoother and comes closer to following a trend. The
MAPE for the historical simulation is 0.13%.
[Figure: IR and IR (Scenario 1), 94Q1-95Q4]
The ex-post forecast falls below the actual values.
The MAPE is 0.032%.
Dependent Variable: INV
Method: Two-Stage Least Squares
Date: 12/07/09   Time: 21:23
Sample: 1960Q1 1993Q4
Included observations: 136
Convergence achieved after 10 iterations
Instrument list: C CONS G GDPPOT INFL INR IR M M2 RL RS X INV(-2) (GDP-CONS-GDP(1)+CONS(-1))
Lagged dependent variable & regressors added to instrument list

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C                2.837186     1.626528      1.744321   0.0834
D(GDP-CONS)      0.360108     0.058365      6.169931   0.0000
INV(-1)          0.709656     0.054278      13.07454   0.0000
AR(1)           -0.182547     0.106412     -1.715472   0.0886

R-squared            0.675370    Mean dependent var   21.58603
Adjusted R-squared   0.667992    S.D. dependent var   22.24099
S.E. of regression   12.81529    Sum squared resid    21678.58
F-statistic          50.06773    Durbin-Watson stat   2.073371
Prob(F-statistic)    0.000000

Inverted AR Roots    -.18
The next equation is for the change in business inventories. Research has shown that
much of the variation in real output growth over the course of a business cycle can be attributed
to variations in the rate of inventory accumulation. This equation is estimated as a function of
the change in the difference between total output and consumption, along with lagged inventory
investment.
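Written out with the coefficients from the INV output above (rounded to three decimals), the estimated inventory equation reads as follows. The notation is an assumption made for readability: $\Delta$ stands for the first difference that the output labels D(...), and $u_t$ is the AR(1) disturbance.

```latex
\begin{aligned}
INV_t &= 2.837 + 0.360\,\Delta(GDP_t - CONS_t) + 0.710\,INV_{t-1} + u_t,\\
u_t &= -0.183\,u_{t-1} + \varepsilon_t .
\end{aligned}
```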
[Figure: INV and INV (Baseline), 1960-1990]
The historical simulation of business inventories is shown above. One’s eye is
immediately drawn to the beginning of the sample, where there is an implausibly large peak in
the simulation. This peak could be controlled with a dummy variable, but it does not affect
the simulation greatly. The MAPE of the historical simulation is the largest of all the
equations at 2.77%. However, this is still below the 5% threshold that is generally
considered acceptable for a forecast.
[Figure: INV (Scenario 1) and INV, 94Q1-95Q4]
[Figure: INV (Scenario 1), INV, and INV (Baseline), 1960-1995]
The graphs above show the ex-post forecast for the business inventories equation. The
MAPE improves relative to the historical simulation, falling to 0.449%.
Dependent Variable: RS
Method: Two-Stage Least Squares
Date: 12/07/09   Time: 20:52
Sample: 1960Q1 1993Q4
Included observations: 136
Convergence achieved after 8 iterations
Instrument list: C CONS G INR INV IR M RL X INFL(-1) RS(-1) M2(-1) YPD(-1)
Lagged dependent variable & regressors added to instrument list

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C               -44.76644     12.53438     -3.571492   0.0005
YPD              0.014637     0.003054      4.792746   0.0000
M2              -0.021874     0.005354     -4.085905   0.0001
INFL             0.303852     0.129569      2.345099   0.0205
AR(1)            0.956617     0.022165      43.15966   0.0000

R-squared            0.906337    Mean dependent var   6.210196
Adjusted R-squared   0.903477    S.D. dependent var   2.809331
S.E. of regression   0.872807    Sum squared resid    99.79487
F-statistic          325.5120    Durbin-Watson stat   1.981729
Prob(F-statistic)    0.000000

Inverted AR Roots    .96
Short-term interest rates (rates on three-month Treasury bills) are modeled as a
normalization of a traditional money demand equation. The demand for money increases when
personal disposable income increases, and decreases when short-term interest rates rise,
because the opportunity cost of holding money increases. The R-squared for this equation is
lower than for most of the other equations, which makes sense: interest rates are among the
most volatile variables in the model and are therefore much more difficult to predict.
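The normalization mentioned above can be sketched with a stylized linear money demand function. The symbols $a$, $b$, and $c$ are hypothetical parameters (with $b, c > 0$) introduced only for illustration; the estimated equation also includes inflation and an AR(1) term, which this sketch leaves out.

```latex
\begin{aligned}
M2_t &= a + b\,YPD_t - c\,RS_t
&&\text{(money demand)}\\
RS_t &= \frac{a}{c} + \frac{b}{c}\,YPD_t - \frac{1}{c}\,M2_t
&&\text{(solved for the interest rate)}
\end{aligned}
```

Under these assumptions the interest rate equation should carry a positive coefficient on income and a negative coefficient on the money stock, which matches the signs on YPD and M2 in the RS output above.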
[Figure: RS and RS (Baseline), 1960-1990]
As you can see, the historical simulation does not fit quite as closely as many of the
other simulations presented here. The spike in the 1980s is consistent with Paul Volcker
raising interest rates to battle inflation. The MAPE for this historical simulation is 0.59%.
[Figure: RS and RS (Scenario 1), 94Q1-95Q4]
The MAPE for the ex-post forecast is 0.179%.
Dependent Variable: RL
Method: Least Squares
Date: 12/07/09   Time: 20:53
Sample: 1960Q1 1993Q4
Included observations: 136
Convergence achieved after 9 iterations

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C                0.301862     0.110459      2.732789   0.0071
RS               0.188788     0.020330      9.286057   0.0000
RL(-1)           0.822268     0.021359      38.49828   0.0000
AR(1)            0.213139     0.088868      2.398388   0.0179

R-squared            0.987126    Mean dependent var      8.211863
Adjusted R-squared   0.986833    S.D. dependent var      2.743314
S.E. of regression   0.314787    Akaike info criterion   0.555132
Sum squared resid    13.08002    Schwarz criterion       0.640798
Log likelihood      -33.74896    F-statistic             3373.660
Durbin-Watson stat   2.028879    Prob(F-statistic)       0.000000

Inverted AR Roots    .21
This is the regression for average yield on AAA bonds. It is a member of the recursive
block, so it was run using only ordinary least squares.
[Figure: RL and RL (Baseline), 1960-1990]
The MAPE for the historical simulation is 0.38%.
[Figure: RL (Scenario 1) and RL, 94Q1-95Q4]
The MAPE for the ex-post forecast is 0.08%.
Dependent Variable: UR
Method: Least Squares
Date: 12/07/09 Time: 22:46
Sample (adjusted): 1960Q3 1993Q4
Included observations: 134 after adjustments
Convergence achieved after 8 iterations
Variable Coefficient Std. Error t-Statistic Prob.
C 6.582626 1.181766 5.570160 0.0000
(D(LOG(GDP)))-(D(LOG(GDPPOT))) -3.592488 2.730454 -1.315711 0.1906
AR(1) 0.973305 0.019730 49.33012 0.0000
R-squared 0.949410 Mean dependent var 6.178109
Adjusted R-squared 0.948637 S.D. dependent var 1.554937
S.E. of regression 0.352400 Akaike info criterion 0.774035
Sum squared resid 16.26835 Schwarz criterion 0.838912
Log likelihood -48.86033 F-statistic 1229.218
Durbin-Watson stat 0.650476 Prob(F-statistic) 0.000000
Inverted AR Roots .97
The unemployment rate is estimated according to a traditional Okun’s law equation, which
relates the unemployment rate to the growth of GDP relative to the growth of potential GDP.
The negative coefficient makes sense: when GDP grows faster than potential, unemployment
falls. The coefficient is not statistically significant, however (p = 0.19). This equation is
also in the recursive block and is therefore estimated using ordinary least squares.
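The regressor in the UR output above, (D(LOG(GDP)))-(D(LOG(GDPPOT))), is the gap between the log growth rates of actual and potential GDP. The sketch below builds that term and plugs it into the structural part of the estimated equation. The GDP levels are hypothetical, the helper names `growth_gap` and `structural_ur` are illustrative, and the AR(1) error is deliberately left out.

```python
import math

# Coefficients from the UR output above (constant and growth-gap term).
C_UR = 6.582626
B_GAP = -3.592488

def growth_gap(gdp, gdp_pot):
    """D(LOG(GDP)) - D(LOG(GDPPOT)) for consecutive quarters."""
    return [
        (math.log(gdp[t]) - math.log(gdp[t - 1]))
        - (math.log(gdp_pot[t]) - math.log(gdp_pot[t - 1]))
        for t in range(1, len(gdp))
    ]

def structural_ur(gap):
    """Structural part of the unemployment equation, ignoring the AR(1) error."""
    return C_UR + B_GAP * gap

# Hypothetical levels: actual GDP grows faster than potential GDP.
gdp = [6000.0, 6090.0]
gdp_pot = [6100.0, 6145.0]
gap = growth_gap(gdp, gdp_pot)[0]
print(gap > 0, structural_ur(gap) < C_UR)  # True True
```

With output growing faster than potential, the gap is positive and the negative coefficient pulls the predicted unemployment rate below the constant.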
[Figure: UR and UR (Baseline), 1960-1990]
The MAPE for the historical simulation is 0.19%.
[Figure: UR (Scenario 1) and UR, 94Q1-95Q4]
The MAPE for the ex-post forecast is 0.13%.
Dependent Variable: WINF
Method: Two-Stage Least Squares
Date: 12/07/09   Time: 20:55
Sample: 1960Q1 1993Q4
Included observations: 136
Convergence achieved after 8 iterations
Instrument list: C CONS G GDP GDPPOT INFL(-1) INR INV IR M NETWRTH PRFT RL RS TR UR WINF(-1) X
Lagged dependent variable & regressors added to instrument list

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C               -14.26324     3.091834     -4.613198   0.0000
INFL             0.691501     0.014116      48.98761   0.0000
UR(-2)           0.032879     0.096548      0.340545   0.7340
PROD             0.152321     0.046329      3.287830   0.0013
AR(1)            0.934047     0.033211      28.12424   0.0000

R-squared            0.999885    Mean dependent var   47.85147
Adjusted R-squared   0.999882    S.D. dependent var   28.87696
S.E. of regression   0.314002    Sum squared resid    12.91624
F-statistic          285411.2    Durbin-Watson stat   1.438084
Prob(F-statistic)    0.000000

Inverted AR Roots    .93
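The annual rate of growth in wages is specified as a positive function of overall price
inflation, a negative function of the unemployment rate, and a positive function of
productivity growth. In the estimates, the coefficient on lagged unemployment is positive
rather than negative, but it is small and not statistically significant (p = 0.73). We have a
very high R-squared value, indicating a good fit.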
[Figure: WINF and WINF (Baseline), 1960-1990]
The MAPE for the historical simulation is 0.11%.
[Figure: WINF and WINF (Scenario 1), 94Q1-95Q4]
The MAPE for the ex-post forecast is 0.01%.
Dependent Variable: INFL
Method: Two-Stage Least Squares
Date: 12/07/09   Time: 20:57
Sample (adjusted): 1960Q2 1993Q4
Included observations: 135 after adjustments
Convergence achieved after 19 iterations
Instrument list: C CONS CONS(-2) G GDP(-1) GDPPOT INV IR M NETWRTH PRFT RL RS TR WINF(-1) X YPD
Lagged dependent variable & regressors added to instrument list

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C                2.645615     2.952583      0.896034   0.3719
WINF             0.676700     0.140916      4.802149   0.0000
CONS(-1)         0.000505     0.001582      0.318843   0.7504
POIL             0.092758     0.022974      4.037453   0.0001
INFL(-1)         0.479845     0.090355      5.310660   0.0000
AR(1)            0.926117     0.041998      22.05126   0.0000

R-squared            0.999943    Mean dependent var   72.17086
Adjusted R-squared   0.999941    S.D. dependent var   38.74432
S.E. of regression   0.296935    Sum squared resid    11.37400
F-statistic          456242.9    Durbin-Watson stat   2.116613
Prob(F-statistic)    0.000000

Inverted AR Roots    .93
The annual rate of growth in the consumer price index is estimated to be a function of
wage inflation, consumer demand, and oil prices. We have a high r-squared value, and the
Durbin-Watson statistic falls within the accepted values.
[Figure: INFL and INFL (Baseline), 1960-1990]
The MAPE for the historical simulation is 0.10%.
[Figure: INFL and INFL (Scenario 1), 94Q1-95Q4]
The MAPE for the ex-post forecast is 0.006%.
[Figure: GDP and GDP (Baseline), 1960-1990]
After estimating all of the equations, we can simulate the model as a complete system.
The simulation above is the historical simulation of GDP. The simulated line fits well, and
the MAPE is 0.05%.
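Simulating the equations as a system means solving them jointly each period, because GDP depends on consumption through the accounting identity while consumption in turn depends on income. A minimal sketch of one common approach, Gauss-Seidel iteration, is shown below on a two-equation toy model. The coefficients, the exogenous spending figure, and the function name `solve_period` are all hypothetical, and this paper does not state which solution algorithm the software used.

```python
def solve_period(exog_spending, a=100.0, b=0.6, tol=1e-9, max_iter=1000):
    """Solve  CONS = a + b * GDP  and  GDP = CONS + exog_spending
    by Gauss-Seidel iteration. Returns (cons, gdp)."""
    gdp = exog_spending  # starting guess
    for _ in range(max_iter):
        cons = a + b * gdp               # behavioral equation
        new_gdp = cons + exog_spending   # accounting identity
        if abs(new_gdp - gdp) < tol:
            return cons, new_gdp
        gdp = new_gdp
    raise RuntimeError("Gauss-Seidel iteration did not converge")

cons, gdp = solve_period(300.0)
print(round(cons, 4), round(gdp, 4))  # 700.0 1000.0
```

The iteration converges here because the marginal propensity to consume is below one; in the full model the same loop would run over all 11 behavioral equations and both identities for every quarter.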
[Figure: GDP (Scenario 1), GDP, and GDP (Baseline), 1960-1995]
Above is a graph that combines the historical simulation, the actual values, and the
ex-post forecast. At this scale the ex-post forecast tracks the actual series closely. Below
is a closer look at the ex-post forecast.
[Figure: GDP and GDP (Scenario 1), 94Q1-95Q4]
The MAPE for this forecast is 0.008%. This is a strong forecast for gross
domestic product.
[Figure: YPD and YDP_0, 1960-1990]
The results for the disposable personal income equation confirm our findings
for gross domestic product. The steady growth of disposable personal income is consistent with
the growth of gross domestic product. The MAPE of the historical simulation for disposable
personal income is 0.06%.
[Figure: YDP_1, YPD, and YDP_0, 1960-1995]
[Figure: YPD and YDP_1, 94Q1-95Q4]
The MAPE for the ex-post forecast of disposable personal income is 0.01%.