

Turn left or right: How the economy affects political preferences and media coverage?

Multivariate ARIMA models

Assignment 3

Mark Boukes ([email protected])
5616298

1st semester 2010/2011
Dynamic Data Analysis

Lecturer: Dr. R. Vliegenthart
December 1, 2010

Communication Science (Research MSc)
Faculty of Social and Behavioural Sciences
University of Amsterdam


Table of contents

Introduction
Method
Results
    Left-right political preferences
Conclusion
References

Appendix 1: Do file


Introduction

In the previous assignment, I found a significant effect of the international financial crisis caused by the second oil crisis in 1979/1980 on the political preferences of Dutch citizens on a left-right scale. Inspired by this influence of economic factors on political preferences, I study in this assignment the effect of unemployment on the political preferences of the Dutch population. Hollanders and Vliegenthart (2009) showed that negative news coverage about the economy led to decreased consumer confidence, and Soroka (2006) found that increased negative economic news coverage leads to more pessimistic expectations about the future of the economy. In this paper I examine whether there is also an influence on political preference. To study the effect of unemployment on political preferences, I posed the following research questions:

o Did the number of articles about unemployment in NRC Handelsblad affect the average political preference of Dutch citizens?

o Did developments in the unemployment rate of the Netherlands affect the average political preference of Dutch citizens?

Method

To investigate which factors affect the left-right preferences of Dutch citizens, I used a dataset that contains information about these preferences over a long period. The NIPO Weeksurveys 1962-2000¹ contain, for the period 1977-2000, 1,086,336 individual answers to the following question: "Here you see seven boxes between the words left and right. Could you indicate on this scale how left, right or in between your political opinion lies?" Because the answers were reported individually and aggregate-level data are needed to answer the research questions, the observations were transformed so that the mean answer for every week was reported. This resulted in 1226 weekly observations of the average left-right preference of Dutch citizens. As I only study the period 1990-2000, 560 observations could be used.
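The aggregation step described above can be sketched as follows. This is a minimal illustration in Python rather than the Stata used in the appendix; the function name, the (year_week, score) input layout and the 1-7 coding of the seven boxes are assumptions made for the example, not details of the NIPO data.

```python
from collections import defaultdict
from statistics import mean


def weekly_means(answers):
    """Average individual left-right scores per survey week.

    `answers` is an iterable of (year_week, score) pairs, where year_week is
    a label such as 199002 and score is a position on the (assumed) 1-7 scale.
    """
    by_week = defaultdict(list)
    for year_week, score in answers:
        by_week[year_week].append(score)
    # One aggregate value per week, ordered by week label.
    return {week: mean(scores) for week, scores in sorted(by_week.items())}


# Hypothetical respondents: three in week 2 of 1990, two in week 3.
sample = [(199002, 3), (199002, 5), (199002, 4), (199003, 6), (199003, 2)]
print(weekly_means(sample))
```

Each week contributes one value to the series regardless of how many respondents answered in that week.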

To construct a variable measuring the amount of attention paid to unemployment in newspapers at different moments in time, a computer-assisted content analysis was conducted using the digital archive of LexisNexis. Articles were selected via the Boolean search term werkloosheid OR werkeloosheid (the two Dutch spellings of "unemployment"). The period analyzed was 1 January 1990 until 31 December 2000, as the mean left-right preference was measured until 2000 and LexisNexis contains no data for the period before 1990. Only articles in NRC Handelsblad were analyzed, as this is the only newspaper for which data are available from 1990 onwards; using other newspapers would have resulted in too short a period. The search yielded 7652 articles for the whole period. The articles were aggregated by week, resulting in weekly visibility scores of unemployment in NRC Handelsblad.
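Counting articles per week could look like the sketch below. It assumes that each retrieved article comes with a publication date and that ISO calendar weeks are an acceptable week definition; both are assumptions of this example, and the function name is hypothetical.

```python
from collections import Counter
from datetime import date


def weekly_article_counts(publication_dates):
    """Count articles per ISO (year, week) pair."""
    counts = Counter()
    for day in publication_dates:
        iso_year, iso_week, _weekday = day.isocalendar()
        counts[(iso_year, iso_week)] += 1
    return dict(sorted(counts.items()))


# Hypothetical publication dates of three matching articles.
dates = [date(1990, 1, 8), date(1990, 1, 9), date(1990, 1, 17)]
print(weekly_article_counts(dates))
```

Weeks without any matching article do not appear in the result and would have to be set to zero afterwards, which is what the do-file in Appendix 1 does for missing values of the article count.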

¹ Found at https://easy.dans.knaw.nl/dms


The variable representing the unemployment rate was obtained from the website of Eurostat, also for the period 1990-2000. The unemployment rate was measured as a percentage of the total labour force. However, as these data were monthly and not weekly, the unemployment rate for the intervening weeks was calculated by taking the mean of the preceding week and the next measured week.
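The gap-filling rule can be illustrated with a short sketch. It loosely mirrors the sequential `replace` commands in Appendix 1: each missing week receives the mean of the preceding week (which may itself have just been filled) and the next measured value. The function name and the use of `None` for missing weeks are assumptions of this example.

```python
def fill_gaps(values):
    """Fill None entries with the mean of the previous week and the next measured value.

    The previous week may itself be a value filled in an earlier step.
    If no later measurement exists, the previous value is carried forward.
    """
    filled = list(values)
    for i, value in enumerate(filled):
        if value is not None:
            continue
        previous = filled[i - 1] if i > 0 else None
        following = next((v for v in values[i + 1:] if v is not None), None)
        if previous is not None and following is not None:
            filled[i] = (previous + following) / 2
        elif previous is not None:
            filled[i] = previous
        else:
            filled[i] = following
    return filled


# Two monthly measurements (6.0 and 7.0) with three unmeasured weeks in between.
print(fill_gaps([6.0, None, None, None, 7.0]))
```

Because each filled week builds on the week before it, the interpolated values approach the next measurement gradually rather than along a straight line.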

To analyse the effects of these variables, an adequate ARIMA model was first developed for the time series of the average left-right preference. The independent variables were then added to this ARIMA model, resulting in multivariate ARIMA models.
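For orientation, a multivariate ARIMA(0,1,1) model of the kind estimated in this paper can be written as below. The symbols are introduced here for illustration and are not taken from the original text: y_t is the average left-right preference in week t, x_t an independent variable entered in first differences at lag k (as in the do-file in Appendix 1), c a constant, theta the moving average parameter and epsilon_t a white-noise error.

```latex
\Delta y_t = c + \beta \, \Delta x_{t-k} + \varepsilon_t + \theta \, \varepsilon_{t-1},
\qquad \Delta y_t = y_t - y_{t-1}
```

Setting beta to zero gives the univariate ARIMA(0,1,1) model that serves as the baseline.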

Results

In this section I first specify how the ARIMA model for the average left-right preference was created. Thereafter, I describe the results of including information about news coverage and unemployment rates in this ARIMA model, with the purpose of explaining political preferences in causal terms. Three time series were used in the analyses: the average left-right political preference, the number of articles about unemployment and the unemployment rate in the Netherlands. Figure 1 displays the development of these variables in the period 1990-2000.

Figure 1. The development of average political preference, amount of ‘unemployment’ articles in NRC Handelsblad and the unemployment rate in the Netherlands, between 1990 and 2000.


Left-right political preferences

To check whether it was necessary to integrate the ARIMA model, the time series of the average left-right political preferences was analyzed with three augmented Dickey-Fuller tests. The results of these tests (see Table 1) indicate that the series had to be differenced, because the null hypothesis of a random walk without drift could not be rejected. The same three tests on the differenced series all rejected the null hypothesis, meaning that no random walk was present (also see Table 1). Therefore, the political preference time series does not need to be differenced a second time.

Table 1. The various augmented Dickey-Fuller tests for the average left-right political preferences

                                     Augmented Dickey-Fuller test
Random walk without drift                 -0.125 *
Random walk with drift                   -11.750
Random walk with drift and trend         -15.312

After integrating
Random walk without drift                -38.053
Random walk with drift                   -38.019
Random walk with drift and trend         -37.984

Note. * indicates the presence of a unit root.
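The logic of the first row of Table 1 can be sketched with the simplest Dickey-Fuller variant: the one without drift and without additional lags of the differenced series. The function below is an illustration in plain Python, not the routine that produced the table. It regresses the first difference on the lagged level and returns the t statistic of the slope. For this variant the 5% critical value is roughly -1.95 in large samples, so a statistic above that value means the unit root cannot be rejected.

```python
import math


def dickey_fuller_no_drift(series):
    """t statistic of rho in the regression  dy_t = rho * y_(t-1) + e_t."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    sum_sq = sum(y * y for y in lagged)
    rho = sum(y * d for y, d in zip(lagged, diffs)) / sum_sq
    residuals = [d - rho * y for y, d in zip(lagged, diffs)]
    n = len(diffs)
    s2 = sum(e * e for e in residuals) / (n - 1)  # one estimated parameter
    return rho / math.sqrt(s2 / sum_sq)


# Toy series that keeps reverting towards zero, so the statistic is strongly negative.
print(dickey_fuller_no_drift([1, -1, 1, -1, 2]))
```

Applied to the levels and then to the first differences of a series, the same function reproduces the two-step check described above: difference the series if the level statistic is not negative enough, then confirm that the differenced series no longer shows a unit root.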

The next step was to predict the series as well as possible by accounting for its own past, with autoregressive (AR) terms, moving average (MA) terms or both. This was done by inspecting the autocorrelation function (ACF) and partial autocorrelation function (PACF) (see both graphs in Figure 2). The ACF graph shows a clear spike at lag 1 and little to no significant correlation at other lags, while the PACF graph displays a declining pattern over the first lags. This pattern is indicative of a process with a moving average term at lag 1, so an ARIMA(0,1,1) model seems the right choice. This model was tested for autocorrelation with the Ljung-Box Q test on the residuals and for conditional heteroscedasticity with the Ljung-Box Q test on the squared residuals. The insignificant result of the Ljung-Box Q test for autocorrelation (20 lags) means that the null hypothesis of white noise cannot be rejected and that the absence of autocorrelation can be assumed (Q = 15.37, p = .755). However, the test on the squared residuals gives a significant result, indicating the presence of conditional heteroscedasticity (Q = 79.47, p < .001). I did not address this here; it could be modelled later with ARCH and GARCH models. The estimates of this ARIMA(0,1,1) model can be found in Table 2, as can those of all subsequent models.
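To make the diagnostics concrete, the sketch below computes sample autocorrelations and the Ljung-Box Q statistic, Q = n(n + 2) Σ r_k² / (n - k), in plain Python. The function names are hypothetical and the example series is a toy input, not the residual series of the paper.

```python
def autocorrelations(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    centred = [x - mean for x in series]
    denom = sum(c * c for c in centred)
    return [
        sum(centred[t] * centred[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(series, max_lag):
    """Ljung-Box Q statistic over lags 1 .. max_lag."""
    n = len(series)
    acf = autocorrelations(series, max_lag)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(acf, start=1))


# Toy series with strong negative lag-1 autocorrelation.
toy = [1, -1, 1, -1]
print(autocorrelations(toy, 1), ljung_box_q(toy, 1))
```

Running `ljung_box_q` on the residuals tests for remaining autocorrelation; running it on the squared residuals gives the check for conditional heteroscedasticity. Under the null hypothesis of white noise, Q is compared with a chi-square distribution with as many degrees of freedom as lags.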


Figure 2. ACF and PACF for the differenced mean score of the average political preference.

Table 2. ARIMA models for the differenced mean score of the average political preference.

                                  ARIMA(0,1,1)    News coverage   Unemployment rate   Both predictors
Constant                          -.000 (.000)    -.000 (.000)    -.000 (.000)        -.000 (.000)
Moving average (t - 1)            -.838 (.023)*   -.835 (.023)*   -.838 (.023)*       -.835 (.023)*
Unemployment coverage (t - 5)                     -.001 (.000)                        -.001 (.000)
Unemployment rate (t - 1)                                         -.001 (.011)        -.002 (.011)
Ljung-Box Q(20) residuals         15.37           14.81           15.56               14.77
Ljung-Box Q(20) residuals²        79.47*          83.71*          79.30*              83.67*
AIC                               -1776.91        -1758.14        -1770.89            -1756.18
BIC                               -1763.84        -1740.75        -1753.47            -1734.43

Note. Unstandardized coefficients. Standard errors in parentheses; * p<.001

Having built a model that properly accounts for its own past, I could proceed with the next step: assessing the impact of the amount of news coverage about unemployment in NRC Handelsblad on the average political preference of Dutch citizens. As the effect of news coverage is expected to set in within a time span of three months, I considered lags ranging from 1 to 13 weeks. The cross-correlation function for the amount of unemployment news coverage and the residuals of the ARIMA(0,1,1) model for average political preference indicates that the strongest association is present when news coverage is lagged 5 weeks (r = -.086). The ARIMA(0,1,1) model that included the amount of unemployment news coverage yielded similar results for the Ljung-Box Q test on the residuals (Q = 14.81, p = .787) and on the squared residuals (Q = 83.71, p < .001), indicating the absence of autocorrelation and the presence of conditional heteroscedasticity.
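The lag selection described above amounts to correlating the model residuals with the independent variable at each candidate lag and keeping the lag with the largest absolute correlation. A minimal sketch, with hypothetical function names and toy data, is given below. It uses a plain Pearson correlation on the overlapping observations, which may differ slightly from the normalisation used by a dedicated cross-correlation routine.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def lagged_correlations(residuals, predictor, max_lag):
    """Correlation of residuals[t] with predictor[t - k] for k = 1 .. max_lag."""
    return {
        k: pearson(residuals[k:], predictor[:-k])
        for k in range(1, max_lag + 1)
    }


def strongest_lag(correlations):
    """Lag with the largest absolute correlation."""
    return max(correlations, key=lambda k: abs(correlations[k]))


# Toy data in which the "residuals" simply repeat the predictor two periods later.
predictor = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3]
residuals = [0, 0] + predictor[:-2]
print(strongest_lag(lagged_correlations(residuals, predictor, 4)))
```

In the paper the candidate lags run from 1 to 13 weeks, so `max_lag` would be 13.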

Including this variable as an independent variable in the original ARIMA model for average political preference indicates that the amount of unemployment news coverage does not seem to influence the political preference of Dutch citizens; the unstandardized coefficient is -.001 (p = .113). The Akaike Information Criterion (AIC) increases by 18.77 points (from -1776.91 to -1758.14), which also indicates that the model did not improve. According to the difference in log-likelihood, the two models do differ significantly: the log-likelihood decreased by 8.39 points while one degree of freedom was lost (p < .01). However, a lower log-likelihood indicates a worse rather than a better fit. Given this, the insignificant effect of the amount of unemployment news coverage and the increase in AIC, I prefer the standard and more parsimonious ARIMA(0,1,1) model.
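The AIC and log-likelihood figures in this comparison are linked through AIC = -2 logL + 2k, where k is the number of estimated parameters. The sketch below checks the reported numbers against that identity. The parameter counts are assumptions of this example: 3 for the baseline model (constant, moving average term and error variance) and 4 for the model with news coverage.

```python
def log_likelihood_from_aic(aic, n_params):
    """Invert AIC = -2 * logL + 2 * k."""
    return (2 * n_params - aic) / 2


aic_base, aic_news = -1776.91, -1758.14  # values from Table 2
k_base, k_news = 3, 4                    # assumed parameter counts

aic_change = aic_news - aic_base
loglik_change = (
    log_likelihood_from_aic(aic_news, k_news)
    - log_likelihood_from_aic(aic_base, k_base)
)
print(aic_change, loglik_change)
```

Under these assumptions the implied change in log-likelihood is a decrease of about 8.39 points, in line with the value reported in the text. A log-likelihood can only fall after adding a regressor if the two models are not estimated on exactly the same observations, which is plausible here because lagging the regressor by five weeks shortens the estimation sample.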

To check whether the real economy had more effect on the political preferences of citizens, I repeated the process of including an independent variable, this time with the unemployment rate. Again I expected a potential effect to set in within three months (13 weeks). I analyzed the cross-correlation function over this period for the unemployment rate and the residuals of the ARIMA(0,1,1) model for average political preference. This indicated that the strongest association is present when the unemployment rate is lagged 1 week (r = .063). Including this variable as an independent variable in the original ARIMA model for the average political preference indicates that the unemployment rate does not cause differences in the average political preference either; the unstandardized coefficient is -.001 (p = .963). According to the AIC, including the unemployment rate did not improve model fit; this value increased by 6.02 points. The log-likelihood decreased by 2.02 points while one degree of freedom was lost (p = .16), which is not significant and thus gives no indication that the model fits better. Including the unemployment rate thus harms model fit, just as including the number of articles in NRC Handelsblad seems to do. The model that included the unemployment rate yielded similar results for the Ljung-Box Q test on the residuals (Q = 15.56, p = .744) and on the squared residuals (Q = 79.30, p < .001), indicating the absence of autocorrelation and the presence of conditional heteroscedasticity.

The final model that I tested was the ARIMA(0,1,1) model for political preferences that included both the amount of unemployment coverage in NRC Handelsblad at lag 5 and the unemployment rate at lag 1 as independent variables (the unemployment rate also had its strongest correlation at lag 1 with the residuals of the ARIMA(0,1,1) model that included the amount of news coverage). In this way, a potential effect of news coverage could be controlled for the real-world circumstances of the economy. This model again yielded comparable results for the Ljung-Box Q test on the residuals (Q = 14.77, p = .790) and on the squared residuals (Q = 83.67, p < .001), indicating the absence of autocorrelation and the presence of conditional heteroscedasticity. Including both independent variables led, as could be expected, to two insignificant effects: the amount of unemployment news coverage (b = -.001, p = .111) and the unemployment rate (b = -.002, p = .871). This model also made the AIC increase, from -1776.91 to -1756.18, a rise of 20.73 points. The difference in log-likelihood also did not point to a better-fitting model: a decrease of 8.37 points while losing two degrees of freedom (p = .015). Including both the number of unemployment articles and the unemployment rate in the ARIMA(0,1,1) model thus does not improve model fit.
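The p-values attached to the three log-likelihood comparisons in this section can be reproduced by referring each reported difference to a chi-square distribution with 1 or 2 degrees of freedom. The sketch below does that with closed-form tail probabilities from the Python standard library, and the function name is hypothetical. It reproduces the arithmetic as reported. A conventional likelihood-ratio test would use twice the log-likelihood difference and assumes nested models estimated on the same sample.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, for df = 1 or 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    raise ValueError("this sketch only covers df = 1 or 2")


# Reported log-likelihood differences and degrees of freedom.
comparisons = {
    "news coverage": (8.39, 1),
    "unemployment rate": (2.02, 1),
    "both predictors": (8.37, 2),
}
for label, (difference, df) in comparisons.items():
    print(label, chi2_sf(difference, df))
```

The three results correspond to the reported p < .01, p = .16 and p = .015.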

Conclusion

Because I found an effect of the financial crisis of 1979/1980 on people's political preferences in the previous assignment, I tried to find a comparable effect in this paper by investigating potential effects of news coverage about unemployment and of real-world developments in unemployment. My main aim was to study the influence of news coverage about unemployment on the average political preference of the Dutch population on a left-right scale. The results indicate that such an effect does not seem to exist: changes in political preference are not caused by changes in the amount of unemployment coverage. To see whether the average political preference was instead affected by real-world developments, I looked at the unemployment rate as another independent variable. However, this also did not seem to have an impact on the average political preference. To check whether the oil crisis of 1979 and 1980 was an exception as an economic factor that influenced political preference, future research should use other economic indicators as independent variables.

References

Hollanders, D., & Vliegenthart, R. (2009). The influence of negative newspaper coverage on consumer confidence: The Dutch case. CentER Discussion Paper Series (Vol. 2009). Tilburg: University of Tilburg.

Soroka, S. N. (2006). Good news and bad news: Asymmetric responses to economic information. Journal of Politics, 68(2), 372-385.


Appendix 1: Do file

* Left right
drop if yrwk<199002
drop if yrwk>200051

* declare data to be time series
replace nr2 = nr2 + 898
tsset nr2, weekly

codebook leftright
codebook N_BREAK

* Missing values: leftright is the average of the two points coming before and after;
* articles is 0, as a missing value means there were no articles about unemployment
replace leftright = (leftright[_n-1]+leftright[_n+1])/2 if leftright>= .
replace leftright = (leftright[_n-1]+leftright[_n+2])/2 if leftright>= .
replace N_BREAK = 0 if N_BREAK>= .
replace unumpl_rate = (unumpl_rate[_n-1]+unumpl_rate[_n+3])/2 if unumpl_rate>= .
replace unumpl_rate = (unumpl_rate[_n-1]+unumpl_rate[_n+2])/2 if unumpl_rate>= .
replace unumpl_rate = (unumpl_rate[_n-1]+unumpl_rate[_n+1])/2 if unumpl_rate>= .
replace unumpl_rate = (unumpl_rate[_n-1]+unumpl_rate[_n+4])/2 if unumpl_rate>= .
replace unumpl_rate = (unumpl_rate[_n-1]+unumpl_rate[_n+5])/2 if unumpl_rate>= .
replace unumpl_rate = unumpl_rate[_n-1] if unumpl_rate>= .
codebook unumpl_rate leftright N_BREAK

* Model building
twoway (tsline leftright, lcolor(black))
twoway (tsline N_BREAK, lcolor(black))
twoway (tsline unumpl_rate, lcolor(black))

* with drift
dfuller leftright
* random walk
dfuller leftright, noconstant
* trend
dfuller leftright, trend

* not necessary, but just to check the data
dfuller N_BREAK
* random walk
dfuller N_BREAK, noconstant
* trend
dfuller N_BREAK, trend

dfuller unumpl_rate
* random walk
dfuller unumpl_rate, noconstant
* trend
dfuller unumpl_rate, trend

* As the DF test for a random walk is not significant, I assume the data show a random walk pattern
* and therefore it is necessary to integrate (difference) the data


twoway (tsline d.leftright, lcolor(black))
twoway (tsline d.N_BREAK, lcolor(black))
twoway (tsline d.unumpl_rate, lcolor(black))

* with drift
dfuller d.leftright
* random walk
dfuller d.leftright, noconstant
* trend
dfuller d.leftright, trend

* not necessary, but just to check the data
dfuller d.N_BREAK
* random walk
dfuller d.N_BREAK, noconstant
* trend
dfuller d.N_BREAK, trend

* with drift
dfuller d.unumpl_rate
* random walk
dfuller d.unumpl_rate, noconstant
* trend
dfuller d.unumpl_rate, trend

* Building the ARIMA model for d.leftright
ac d.leftright
pac d.leftright
corrgram d.leftright
* The ACF graph shows a clear spike at lag 1 and little to no significant correlations for other lags,
* while the PACF graph displays a declining pattern for the first lags.
* An ARIMA (0,1,1) model is thus the right choice

arima d.leftright, ma(1)
estat ic
predict r, res
gen r_s = r*r
wntestq r, lags(20)
wntestq r_s, lags(20)
* keep r for the cross-correlations below; only the squared residuals are dropped here
drop r_s
* The Ljung-Box Q-test for autocorrelation (20 lags) on the residuals indicates that the null hypothesis
* of white noise cannot be rejected; this indicates the absence of autocorrelation (Q=xxxxx, p=xxxx).
* The same test on the squared residuals rejects the null hypothesis of white noise, which indicates
* the presence of conditional heteroscedasticity (Q=xxxxx, p=xxxxxx).

* Cross-correlation function: which lag works best statistically
xcorr r d.N_BREAK, lags(13)
correlate r l.d.N_BREAK l2.d.N_BREAK l3.d.N_BREAK l4.d.N_BREAK l5.d.N_BREAK l6.d.N_BREAK
correlate r l5.d.N_BREAK
* Strongest correlation at a lag of 5 weeks (about one month)
drop r

arima d.leftright l5.d.N_BREAK, ma(1)
estat ic
predict r2, res
gen r2_s = r2*r2
wntestq r2, lags(20)
wntestq r2_s, lags(20)


* Cross-correlation of the unemployment rate with the residuals of the news coverage model (r2)
xcorr r2 d.unumpl_rate, lags(13)
drop r2 r2_s
* N_BREAK's effect is not significant

arima d.leftright, ma(1)
estat ic
predict r, res
gen r_s = r*r
wntestq r, lags(20)
wntestq r_s, lags(20)
xcorr r d.unumpl_rate, lags(13)
correlate r l.d.unumpl_rate l2.d.unumpl_rate l3.d.unumpl_rate l4.d.unumpl_rate l5.d.unumpl_rate l6.d.unumpl_rate
* Strongest correlation at lag 1 and lag 2
drop r r_s

arima d.leftright l1.d.unumpl_rate, ma(1)
estat ic
predict r2, res
gen r2_s = r2*r2
wntestq r2, lags(20)
wntestq r2_s, lags(20)
drop r2 r2_s

arima d.leftright l5.d.N_BREAK l.d.unumpl_rate, ma(1)
estat ic
predict r2, res
gen r2_s = r2*r2
wntestq r2, lags(20)
wntestq r2_s, lags(20)
drop r2 r2_s
