financial econometrics; hypothesis testing
WHAT INFLUENCES THE PRICE OF A USED CAR: A critical study of what determines the price of a used car in internet auctions
Financial Econometrics Data Analysis
Raza Ghulam Mujtaba Sean O’ Moley
Zhou Yang
Introduction
This project analyses how the price of a used car is affected by its model, engine size, colour, age and
other characteristics. To reach an informed conclusion, the results are interpreted in the light of
findings by other academics. A sample of 188 cars of different models, ages, colours and so on was taken.
The essay explains the methodology used and analyses the data with the help of the models and
assumptions set out in that methodology. The main auction website used for the project is
www.carzone.ie, which lists more than fifty thousand used cars. Internet auctions are probably the best
place to buy a used car, and the data they provide is probably the most credible source of information
available. Traditionally, trading a used car involved an advert in the newspaper followed by a one-to-one
meeting between the prospective buyer and seller; the internet has eased this process a great deal. The
essay therefore begins with some background on how the internet has changed business in general and to
what extent it has affected the used car trade.
Since the introduction of the World Wide Web, the way business is done has changed for the better and
continues to do so. The share of the population using the internet rose from 40.3% in the year 2000 to
almost 75% in 2009 in the developed world, as seen in the table below.
Year   Population    Users        % Penetration   Usage Source
2000   31,496,800    12,700,000   40.3 %          ITU
2003   32,050,369    20,450,000   63.8 %          C.I.Almanac
2005   32,440,970    21,900,000   67.5 %          C.I.Almanac
2008   33,212,696    28,000,000   84.3 %          ITU
2009   33,487,208    25,086,000   74.9 %          ITU
The e-commerce industry has been around for more than a decade now and has seen growth of an
unprecedented size. Retail has been the biggest beneficiary of this growth. Online retail reached
$5 trillion in 2007 and is set to reach $12 trillion by 2012 according to a recent survey by Verdict
Research, and online sales as a percentage of total retail sales have been rising at a rapid pace,
reaching about five percent in the United States alone, as seen in the figure below, which depicts data
from the US Census Bureau. From groceries to clothing, everything can be found online. The trade in cars,
and particularly used cars, is still a novelty relative to online retail shopping, but it is becoming a
major part of it faster than anyone expected.
Subject Motivation:
The financial crisis of the late 2000s has had a profound negative effect on the developed world, and
both residents and governments have been cutting their spending. This is one of the major reasons why
there has been a surge in activity in the used car market while new car sales have been decreasing for
some time, as the figure below shows. The used car market was also a more attractive subject for the
essay than the new car market because of the diversity of the probable determinants of the price of a
used car.
According to the European Automobile Manufacturers' Association (EAMA), registrations of new cars in
Europe decreased by 9.6% in 2010 compared to 2009 on a month by month basis, as seen in the diagram
above. The EAMA has cited uncertain economic conditions as the reason for the decrease in new car sales:
consumers are unwilling to commit funds to the purchase, and at the same time it has become more
difficult to obtain car finance from financial institutions.
Source: ACEA
Maintenance:
Another reason for the increase in the sales of used cars is the overall cost of running one. According
to the RAC Cost of Motoring Index, the total annual cost of running a used car was found to be lower than
that of a new car, even though maintenance itself costs more, as seen in the table below:
As might be expected, used car owners save on depreciation, but pay more for maintenance. Used car
owners also spend more on fuel, partly because there has been a small gain in fuel efficiency in newer
cars (the “average” gain in fuel efficiency in 2010 is estimated by the RAC to be 2.7%).
Literature Review
A used car, also referred to as a pre-owned vehicle or a second-hand car, is any vehicle that has
previously had one or more retail owners. Like many industries, the motor industry has seen a dramatic
decline in recent years. This has led to plant closures and, in extreme cases, government bailouts. The
fall in new car sales has in turn led to a rise in used car demand. In 2009, the European used car market
was valued at $246.4 billion, with an estimated 24.5 million cars. Due to the global financial distress,
it is estimated to reach $260.11 billion (an increase of 5.6%) or 24.8 million cars (an increase of 1.4%)
by 2014 (Used Cars Industry Profile: Europe 2010). It is only natural that a cheaper alternative is
sought instead of a new car purchase. The market can be affected in many ways. Below are some of the
factors that are outside the customer's control, yet affect the realisable market value and can play a
dramatic role in sales numbers.
Cost category        UK new   UK used
Fuel                 1958     2102
Insurance            727      776
Maintenance          574      1232
Miscellaneous fees   383      364
Depreciation         4626     1566
Finance              571      648
Total                8839     6688
Depreciation:
Although the motor industry faces unprecedented times, the wise consumer has always opted for a
second-hand vehicle. It is widely held that a new car loses approximately 20% of its value as soon as it
is driven off the forecourt. Charlie Vogelheim, executive editor of the Kelley Blue Book, a tracker of
used-car values, once stated that the depreciation rate of a particular vehicle depends on market
conditions, supply and demand. A highly popular or desirable car with limited availability will
depreciate more slowly than a car that is in excess supply or less desirable. On average, a vehicle will
retain only about 35% of its original value after five years, while those in high demand as used cars
will retain closer to 50% (Finlay 2004). Any car that does hold its value well usually carries a premium
when it is purchased new, meaning it is difficult for a consumer to try to “outplay” the market.
VRT:
According to www.Revenue.ie, Vehicle Registration Tax (VRT) is chargeable on the registration of motor
vehicles (including motor-cycles) in the State. All motor vehicles in the State, other than those brought
in temporarily by visitors, must be registered with the Revenue Commissioners. Prior to 2008, in the case
of cars and small vans, the tax was a percentage of the expected retail price, including all taxes, in
the State. This price is known as the Open Market Selling Price (OMSP) and is based on the vehicle
details forwarded to Revenue by the NCT centre following examination of the vehicle.
Category “A” vehicles include cars (saloons, estates, hatchbacks, convertibles, coupés, MPVs, Jeeps etc.)
and minibuses with fewer than 12 permanently fitted seats including the driver's seat. Today the rate of
tax chargeable is based on the level of CO2 emissions of the vehicle at the time of manufacture. The
rates and associated minimum amounts are as follows:
CO2 Emissions (g CO2/km)   VRT Rate       Minimum VRT
0 - 120g                   14% of OMSP    €280
121 - 140g                 16% of OMSP    €320
141 - 155g                 20% of OMSP    €400
156 - 170g                 24% of OMSP    €480
171 - 190g                 28% of OMSP    €560
191 - 225g                 32% of OMSP    €640
226g and over              36% of OMSP    €720
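The band table can be read as a simple lookup. The sketch below is only an illustration of that reading: the band limits, rates and minimum amounts are copied from the table, while the function name, the variable names and the assumption that the minimum acts as a floor on the percentage charge are ours, not Revenue's.

```python
# Illustrative sketch only. Band data copied from the table above;
# names and the "minimum acts as a floor" reading are assumptions.
VRT_BANDS = [
    # (upper CO2 limit in g/km, rate as % of OMSP, minimum VRT in euro)
    (120, 14, 280),
    (140, 16, 320),
    (155, 20, 400),
    (170, 24, 480),
    (190, 28, 560),
    (225, 32, 640),
    (float("inf"), 36, 720),
]


def vrt_due(co2_g_per_km, omsp_eur):
    """Return the VRT charge: band rate applied to the OMSP, floored at the band minimum."""
    for upper_limit, rate_pct, minimum in VRT_BANDS:
        if co2_g_per_km <= upper_limit:
            return max(omsp_eur * rate_pct / 100, minimum)
    # Only reached if the CO2 value is not comparable (for example NaN).
    raise ValueError("CO2 emissions must be a number")


# A car emitting 130 g/km with an OMSP of 20,000 falls in the 16% band.
print(vrt_due(130, 20000))
# A very cheap low-emission car pays the band minimum instead of 14% of OMSP.
print(vrt_due(100, 1000))
```

Under these assumptions the first call charges 16% of the OMSP, while the second is caught by the minimum amount for its band.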
The VRT that is charged when the car is purchased new will undoubtedly affect the car's resale value, as
this tax (or at least a percentage of it) will be passed on to the consumer buying the car second hand.
Incentives:
One incentive that should have a positive effect on resale cost for the consumer is the newly
introduced VRT exemption for electric vehicles. This exemption, which was meant to run until 31st
December 2012, has now been changed to a rebate of up to €5,000 on the VRT due on new electric
vehicles sold from May onwards. Even with this change, however, the VRT due on these vehicles is less
than the rebate limit, meaning that anyone who purchases a new electric car will end up paying no VRT.
So what does this mean for the second-hand market? If VRT is not charged when the car is purchased new,
no VRT cost will be passed on to the customer who buys the car second hand at some point in the future.
While this incentive should increase purchases of new cars, it is important to note that this passing on
of savings is one reason why used car sales are expected to increase in the coming years.
According to www.revenue.ie, the government scrappage scheme provides VRT relief when a new
passenger car with CO2 emissions of not more than 140g/km (i.e. CO2 band A or B) is purchased and
registered and another passenger car, over ten years old, is scrapped. The scheme has been extended and
will run until 30 June 2011. Although the VRT relief now available has been reduced from €1,500 to
€1,250 for qualifying vehicles, it is still a good incentive. Figures released by SIMI show that
17,272 reclaims were made under the government Scrappage Scheme in 2010. Although the scheme applies
directly to new car sales, SIMI has reported that “the real benefit of the Scheme is the positive
knock-on effect it has had on the overall new and used car sectors”.
Finance:
As banks and financing institutions tighten their lending, the average used car consumer is finding it
more and more difficult to borrow. While most franchised dealers selling new cars have finance agreements
in place for customers, several used car outlets have found this more difficult. To combat this, the
Society of the Irish Motor Industry (SIMI) has joined forces with the Credit Union to create the Direct
Access Credit Union (DACU), an online application process for dealers (McAleer 2011). Through this
scheme, customers can apply online for financing, regardless of whether it is for a new or used car
purchase. This should result in further activity in the car market.
While the various factors mentioned above are outside the control of a consumer, the following are
variables that directly affect the value of a used car in the market. Studies indicate that consumers
should purchase car models that are cheaper, have lower variable and service costs, have a longer life
and offer higher performance than cars made with a different production technology. The problem that a
car buyer faces, however, is that some of these factors (such as driving features, design, safety and
reliability) are not objectively measured. A car buyer must rely either on car manufacturers who claim
that their models are superior to those of their competitors, on professional car magazines, or on the
objective and measurable characteristics (Pingjun 2009).
Data Analysis
R2 and Adjusted R2
R2 is the simplest commonly used measure of the fit between the estimated regression
equation and the sample data. According to Studenmund (2011), the closer R2 is to 1, the
more closely the estimated regression fits the sample data, whereas a value near 0 indicates a
failure of the estimated regression equation to explain the values of Yi. In this model R2 is
0.7505, which means that approximately 75% of the variation in the dependent variable (used car
price) is explained by the independent variables.
Studenmund (2011) states that a major problem with R2 is that adding another independent
variable to an equation can never decrease R2. To address this problem, the degrees of freedom
are introduced into the calculation, which gives the adjusted R2. The highest possible adjusted
R2 is 1, the same as for R2, while the lowest possible adjusted R2 can be slightly negative. The
adjusted R2 for this model is 0.7135, which is less than the original R2.
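As a rough check on the two figures, the standard degrees-of-freedom adjustment can be applied by hand. The sketch below assumes the usual textbook formula, adjusted R2 = 1 - (1 - R2)(n - 1)/(n - K - 1), with n = 188 observations and K = 24 explanatory variables as stated elsewhere in this essay; the function name is ours.

```python
def adjusted_r2(r2, n_obs, n_regressors):
    """Adjust R2 for degrees of freedom (the constant term is counted separately)."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_regressors - 1)


# Values reported in the text: R2 = 0.7505, 188 cars, 24 explanatory variables.
value = adjusted_r2(0.7505, 188, 24)
print(round(value, 4))
```

The result lands within about 0.001 of the 0.7135 reported by Stata; the small gap is what one would expect from starting with an R2 rounded to four decimal places.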
Multicollinearity
Multicollinearity is a situation in which two or more variables in a multiple regression model are
highly correlated. According to Studenmund(2011) the major consequences of multicollinearity
are:
1. Estimates will remain unbiased.
2. The variances and standard errors of the estimates will increase.
3. The computed t-scores will fall.
4. Estimates will become very sensitive to changes in specification.
5. The overall fit of the equation and the estimation of the coefficients of non-multicollinear
variables will be largely unaffected.
In order to detect whether multicollinearity exists in the model, the variance inflation factor
(VIF) test is used. The VIF is a method of detecting the severity of multicollinearity by looking
at the extent to which a given explanatory variable can be explained by all the other explanatory
variables in the equation. There is a VIF for each explanatory variable in an equation. The higher
the VIF, the more severe the effects of multicollinearity. As a rule of thumb, if VIF > 5, the
multicollinearity is severe. The average VIF that Stata produced for this model is 2.18, which is
less than 5. Therefore, multicollinearity is not a big problem in this model.
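To make the definition concrete, a VIF can be computed by regressing each explanatory variable on all the others and taking 1 / (1 - R2) of that auxiliary regression. The sketch below does this with NumPy on made-up data (an age, a mileage that grows with age, and an unrelated engine size); it is not the essay's data set, and the function and variable names are hypothetical.

```python
import numpy as np


def vif(X):
    """Return one VIF per column of X (rows are observations, no constant column)."""
    n_obs, n_vars = X.shape
    factors = []
    for j in range(n_vars):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n_obs), others])  # auxiliary regression with intercept
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        residuals = target - design @ coef
        r2 = 1 - residuals.var() / target.var()
        factors.append(1 / (1 - r2))
    return np.array(factors)


# Synthetic example: mileage is driven largely by age, engine size is independent.
rng = np.random.default_rng(0)
age = rng.uniform(1, 15, 200)
mileage = 10000 * age + rng.normal(0, 20000, 200)
engine = rng.uniform(1.0, 3.0, 200)

values = vif(np.column_stack([age, mileage, engine]))
print(values)          # age and mileage inflate each other; engine stays near 1
print(values.mean())   # the average VIF, comparable to the single figure quoted above
```

In this toy example the correlated pair shows clearly elevated VIFs while the unrelated variable stays close to 1, which is the pattern the rule of thumb is designed to pick up.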
Specification
According to Studenmund (2011), one of the most used formal specification criteria other than R2
is the Ramsey Regression Specification Error Test (RESET). It is a general test that determines
the likelihood of an omitted variable or some other specification error by measuring whether the
fit of a given equation can be significantly improved by the addition of the squared, cubed and
fourth-power fitted values of Y.
$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \dots + \beta_{23} X_{23i} + \beta_{24} X_{24i} + \epsilon_i$$
$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \dots + \beta_{23} X_{23i} + \beta_{24} X_{24i} + \beta_{25} \hat{Y}_i^2 + \beta_{26} \hat{Y}_i^3 + \beta_{27} \hat{Y}_i^4 + \epsilon_i$$
The fits of these two equations are compared using the ovtest command in Stata.
H0: β25 = β26 = β27 = 0
HA: Specification error
The F-statistic from Stata is 48.98, while the critical value obtained from the table is 2.60.
Therefore, the null hypothesis is rejected: there are likely to be omitted variables in this model.
This conclusion is not surprising because there are a number of further determinants that could be
considered when analysing the price of used cars, for example the number of previous owners and the
service history. In addition, the Ramsey RESET test can show that a specification error is likely to
exist, but it does not specify the details of that error.
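The mechanics of the test can be sketched as an F-test that compares the original regression with one augmented by powers of the fitted values. The example below uses NumPy and synthetic data in which a squared term has deliberately been left out; it illustrates the idea under those assumptions and is not a reproduction of Stata's ovtest output. All names are hypothetical.

```python
import numpy as np


def ols_fit(X, y):
    """OLS with an intercept; returns fitted values and the residual sum of squares."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    return fitted, float(np.sum((y - fitted) ** 2))


def reset_f(X, y):
    """F-statistic for adding the 2nd, 3rd and 4th powers of the fitted values."""
    n_obs, n_vars = X.shape
    fitted, rss_restricted = ols_fit(X, y)
    # Standardising the fitted values leaves the F-statistic unchanged
    # (the augmented column space is the same) but is numerically safer.
    z = (fitted - fitted.mean()) / fitted.std()
    augmented = np.column_stack([X, z**2, z**3, z**4])
    _, rss_unrestricted = ols_fit(augmented, y)
    n_added = 3
    df_resid = n_obs - (n_vars + 1) - n_added
    return ((rss_restricted - rss_unrestricted) / n_added) / (rss_unrestricted / df_resid)


# Synthetic data: the true relation is quadratic, but only x is included.
rng = np.random.default_rng(1)
x = rng.uniform(0, 5, 200)
y = 1 + 2 * x + 3 * x**2 + rng.normal(0, 1, 200)

f_stat = reset_f(x.reshape(-1, 1), y)
print(f_stat > 2.60)  # compared with the critical value quoted in the text
```

A large F-statistic relative to the critical value points to a specification error, exactly as in the Stata result above, without saying which variable or functional form is missing.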
Serial Correlation
Studenmund (2011) states that serial correlation, or autocorrelation, occurs when observations of
the error term are correlated with each other. Usually, econometricians focus on first-order serial
correlation, in which the current observation of the error term is assumed to be a function of the
previous observation of the error term and a not serially correlated error term:
$$\epsilon_t = \rho\,\epsilon_{t-1} + u_t, \qquad -1 < \rho < 1$$
Pure serial correlation does not bias the OLS coefficient estimates, but it does bias the estimated
standard errors, which leads to unreliable hypothesis testing. The most common method of detecting
serial correlation is the Durbin-Watson d test, which uses the residuals of an estimated regression
to test the possibility of serial correlation in the error term.
The null and alternative hypotheses are set up as follows:
H0: No serial correlation.
HA: Serial correlation.
The appropriate decision rule is reached by comparing the test statistic d against the critical values
(dl and du) from the Durbin-Watson tables. The decision rule is:
If d < dl: reject H0
If d > du: do not reject H0
If dl ≤ d ≤ du: inconclusive
Stata produced a figure of 1.364915 for the Durbin-Watson d test. There are 188 observations in
the model with 8 explanatory variables. However, the tables only go up to 7 explanatory variables
and 100 observations, so the closest available critical values were used: dl = 1.53, du = 1.83.
Since 1.36 < 1.53, H0 is rejected, which means that serial correlation does exist in the model.
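The statistic and the decision rule are simple enough to write out. The sketch below assumes the standard definition of the Durbin-Watson d (the sum of squared changes in successive residuals divided by the sum of squared residuals) and applies the three-way rule above to the values quoted in the text; the function names are ours.

```python
def durbin_watson(residuals):
    """d = sum of squared successive differences / sum of squared residuals."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def dw_decision(d, d_lower, d_upper):
    """Apply the decision rule for positive serial correlation given in the text."""
    if d < d_lower:
        return "reject H0"
    if d > d_upper:
        return "do not reject H0"
    return "inconclusive"


# Statistic reported by Stata and the critical values taken from the tables.
print(dw_decision(1.364915, 1.53, 1.83))

# Toy residuals: a perfectly alternating series gives a d well above 2.
print(durbin_watson([1, -1, 1, -1]))
```

Values of d near 2 suggest no first-order serial correlation, values towards 0 suggest positive correlation, which is why the rule rejects H0 only when d falls below the lower critical value.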
The Classical Assumptions of OLS
Studenmund (2011) states the assumptions below:
Assumption 1: The regression model is linear, is correctly specified, and has an additive error
term.
The regression model does not have to be linear in the variables; however, it is assumed
to be linear in the coefficients. Two additional requirements must hold. First, the
equation must be correctly specified: it must not have an incorrect functional form or an
omitted variable. Second, the stochastic error term must be additive; it cannot be
multiplied or divided by another variable in the equation.
Assumption 2: The error term has a zero population mean.
A stochastic error term is added to the regression equation to account for the variation in
the dependent variable that cannot be explained by the model. The mean of the error term over
the entire population is assumed to be zero. To allow for the possibility that the population
mean of the error is not zero, a constant term is included in the equation; it absorbs any
non-zero mean and so forces the mean of the error term to be zero.
Assumption 3: All explanatory variables are uncorrelated with the error term.
If an explanatory variable were correlated with the error term, OLS would wrongly attribute to X
some of the variation in Y that actually comes from the error term.
Assumption 4: Observations of the error term are uncorrelated with each other.
If there is systematic correlation between observations of the error term, it becomes very
difficult for OLS to estimate the standard errors of the coefficients accurately. This
assumption is most important in time series models, where it means that an increase in the
error term in one time period, such as a random shock, does not affect the error term in
another time period. The assumption is sometimes unrealistic, as the impact of a random shock
may last beyond the period in which it occurs.
Assumption 5: The error term has a constant variance.
This assumption rules out heteroskedasticity and is particularly important in cross-sectional
data sets. According to Studenmund (2011), if it is assumed that all error term observations
are drawn from a distribution with a constant variance when in reality they are drawn from
distributions with different variances, then the relative importance of changes in Y is
difficult to judge. Although the actual values of the error term are not directly observable,
the lack of a constant variance in the distribution of the error term causes OLS to generate
inaccurate estimates of the standard errors of the coefficients.
Assumption 6: No explanatory variable is a perfect linear function of any other explanatory
variables.
Perfect collinearity, or perfect multicollinearity, means that one explanatory variable is an
exact linear function of one or more of the others. It would leave the OLS estimation
procedure incapable of distinguishing one variable from another. While perfect
multicollinearity is unusual in practice, even imperfect multicollinearity can cause problems
for estimation.
Assumption 7: The error term is normally distributed.
According to Studenmund (2011), this assumption of normality is not a requirement
for OLS estimation. Its major use is in hypothesis testing, which uses the estimated
regression statistics to reject or fail to reject hypotheses about economic behaviour.
Without the normality assumption, most hypothesis testing would be invalid.
Actual vs. Estimated/Fitted Values
The adjusted R2 calculated by Stata for the model is 0.7135. R2 measures the percentage of the
variation of Y around its mean that is explained by the regression equation (Studenmund 2006). The
greater the value of R2, the more closely the estimated regression equation fits the sample data. R2
always lies between 0 and 1: a value of 0 indicates that the regression explains none of the variation
in the actual values, while a value of 1 indicates that the fitted values match the actual values
perfectly. Our adjusted R2 of 0.7135 therefore indicates that, after adjusting for degrees of freedom,
roughly 71% of the variation in the price of the used cars is explained by the model, while the
remaining 29% or so is not explained by the determinants used. The actual values were plotted against
the fitted values, and the plot shows that the model tracks the actual prices quite closely.
[Figure: Actual vs Estimated Price, plotting the Actual Price and Estimated Price series for each observation]
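The comparison of actual and fitted values can also be summarised numerically as R2 = 1 - RSS/TSS. The sketch below shows that calculation on four invented prices; the figures are purely illustrative and are not taken from the sample of 188 cars.

```python
def r_squared(actual, fitted):
    """Share of the variation of the actual values around their mean explained by the fit."""
    mean_actual = sum(actual) / len(actual)
    rss = sum((a - f) ** 2 for a, f in zip(actual, fitted))  # unexplained variation
    tss = sum((a - mean_actual) ** 2 for a in actual)        # total variation
    return 1 - rss / tss


# Hypothetical actual and estimated prices in euro.
actual_prices = [5000, 9000, 12000, 20000]
estimated_prices = [6000, 8500, 13000, 19000]

r2 = r_squared(actual_prices, estimated_prices)
print(round(r2, 4))
```

The closer the estimated series follows the actual one in the plot, the smaller the residual sum of squares and the closer this ratio moves towards 1.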
Hypothesis Testing
The regression equation:
$$\begin{aligned}
\text{Price} ={}& 166.69 + 8559.19X_{1} + 1235.065X_{2} + 1858.221X_{3} + 339.72X_{4} - 782.2815X_{5} - 0.0435X_{6} \\
& + 969.852X_{7} + 865X_{8} + 1633.95X_{9} + 916.65X_{10} + 1592.517X_{11} + 5779.01X_{12} \\
& + 2568.36X_{13} - 1202.78X_{14} + 6415X_{15} - 1842.80X_{16} - 280.78X_{17} - 302.10X_{18} \\
& + 372.98X_{19} - 935.02X_{20} - 2396.353X_{21} - 3806.80X_{22} + 1427.822X_{23} + 801.73X_{24}
\end{aligned}$$
X1 Engine Size
X2 Type of sell (Dummy Variable)
X3 Fuel Type (Dummy Variable)
X4 Dublin (Dummy Variable)
X5 Age
X6 Mileage
X7 Blue (Dummy Variable)
X8 Silver (Dummy Variable)
X9 Grey (Dummy Variable)
X10 Red (Dummy Variable)
X11 Black (Dummy Variable)
X12 Gold (Dummy Variable)
X13 3-Series (Dummy Variable)
X14 Golf (Dummy Variable)
X15 Polo (Dummy Variable)
X16 Megane (Dummy Variable)
X17 C5 (Dummy Variable)
X18 Peugeot (Dummy Variable)
X19 Micra (Dummy Variable)
X20 Corolla (Dummy Variable)
X21 Avensis (Dummy Variable)
X22 Almera (Dummy Variable)
X23 Focus (Dummy Variable)
X24 Fiesta (Dummy Variable)
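To see how the equation turns a car's characteristics into a price, the coefficients can be applied to one example car. In the sketch below the coefficients are copied from the regression equation, but the dictionary keys, the function name and the example car are hypothetical; it also assumes engine size is entered in litres and that the dummies are coded as described in this essay (1 for a garage sale, 1 for petrol, 1 for the named colour or model).

```python
INTERCEPT = 166.69

# Coefficients X1 to X24 from the regression equation; key names are illustrative.
COEFFICIENTS = {
    "engine_size": 8559.19, "garage_sale": 1235.065, "petrol": 1858.221, "dublin": 339.72,
    "age": -782.2815, "mileage": -0.0435,
    "blue": 969.852, "silver": 865, "grey": 1633.95, "red": 916.65,
    "black": 1592.517, "gold": 5779.01,
    "3_series": 2568.36, "golf": -1202.78, "polo": 6415, "megane": -1842.80,
    "c5": -280.78, "peugeot": -302.10, "micra": 372.98, "corolla": -935.02,
    "avensis": -2396.353, "almera": -3806.80, "focus": 1427.822, "fiesta": 801.73,
}


def predict_price(car):
    """Estimated price for a car given as {variable: value}; omitted variables count as 0."""
    unknown = set(car) - set(COEFFICIENTS)
    if unknown:
        raise KeyError(f"unknown variables: {sorted(unknown)}")
    return INTERCEPT + sum(COEFFICIENTS[name] * value for name, value in car.items())


# Hypothetical car: 1.6 petrol Golf, five years old, 60,000 miles, sold by a garage.
example_car = {"engine_size": 1.6, "garage_sale": 1, "petrol": 1,
               "age": 5, "mileage": 60000, "golf": 1}
price = predict_price(example_car)
print(f"{price:.2f}")  # 9230.49
```

Each dummy simply shifts the estimated price up or down by its coefficient, while engine size, age and mileage scale with the value entered.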
T-Test
Variable       Coefficient   Standard Error   T       CI
Engine Size    8559.19       694.10           12.32   (7186.77, 9931.61)
Type Of Sell   1235.10       702.50           1.76    (-152.18, 2622.31)
Fuel Type      1858.22       712.35           2.61    (451.54, 3264.90)
Age            -782.28       143.54           -5.45   (-1065.73, -498.83)
Mileage        -0.04         0.007            -6.11   (-0.06, -0.03)
Gold           5779.01       2186.73          2.64    (1460.85, 10097.18)
A total of 24 variables were used in the research in order to reach the best possible model. However,
some variables initially thought to be important were not found to be statistically significant. Hence
only the variables found to be significant are shortlisted in the table above and investigated in
detail; the remaining variables are discussed briefly.
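The T and CI columns of the table follow directly from the coefficient and its standard error. The sketch below recomputes them for three rows, assuming t = coefficient / standard error and an interval of coefficient ± t_crit × standard error. The critical value of 1.975 is our assumption (roughly the two-sided 5% value for about 163 degrees of freedom); the essay does not state which value Stata used.

```python
def t_statistic(coefficient, standard_error):
    """Ratio of the estimated coefficient to its standard error."""
    return coefficient / standard_error


def confidence_interval(coefficient, standard_error, t_crit=1.975):
    """Two-sided interval; t_crit = 1.975 is an assumed critical value, not from the essay."""
    margin = t_crit * standard_error
    return coefficient - margin, coefficient + margin


# (coefficient, standard error) pairs copied from the table.
rows = {
    "Engine Size": (8559.19, 694.10),
    "Fuel Type": (1858.22, 712.35),
    "Age": (-782.28, 143.54),
}

for name, (coef, se) in rows.items():
    low, high = confidence_interval(coef, se)
    print(f"{name}: t = {t_statistic(coef, se):.2f}, CI = ({low:.2f}, {high:.2f})")
```

The recomputed values agree with the table to within rounding, which suggests the intervals reported there are conventional 95% intervals.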
Engine Size:
As engine size increases, miles per gallon fall and the cost of insuring the vehicle rises. It might
therefore be expected that there is a negative relationship between engine size and the price of the
car. However, this research finds a direct (positive) relationship between engine size and the price of
a used car. The result indicates that for every unit increase in engine size the price of the car is
expected to increase by €8559.19; engine sizes are typically quoted in steps of 0.1, and each such step
corresponds to roughly €856.
Engine Size:
H0: β1 ≤ 0
HA: β1 > 0
T critical = 1.645
i. t = 12.32 > 1.645
ii. Estimated β1 > 0
Enough evidence to reject the null hypothesis.
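The two-step check used in each of these tests (the t-value must exceed the critical value in absolute terms, and the estimate must carry the sign stated in HA) can be written as a small helper. This is a sketch of that decision rule only; the function name and argument convention are ours.

```python
def reject_null(t_value, t_critical, expected_sign):
    """One-sided t-test decision.

    expected_sign is +1 when HA states the coefficient is positive
    and -1 when HA states it is negative.
    """
    exceeds_critical = abs(t_value) > t_critical
    sign_matches = t_value * expected_sign > 0
    return exceeds_critical and sign_matches


# t-values from the T-Test table, with the one-sided 5% critical value of 1.645.
print(reject_null(12.32, 1.645, +1))  # Engine Size
print(reject_null(1.76, 1.645, +1))   # Type of Sell
print(reject_null(-5.45, 1.645, -1))  # Age
```

A variable whose t-value falls short of 1.645 in absolute terms, or whose sign contradicts HA, would return False, meaning the null hypothesis cannot be rejected.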
Type of Sell:
This is a dummy variable whose value is one when the used car is being sold at a motor showroom
(garage). The research found that there is a positive relationship between the car being sold at a
garage and the price charged for the car. This positive relationship can be explained by the profit
margin that a motor showroom is expected to charge a consumer. The result indicates that a car sold at
a motor showroom is expected to cost €1235.07 more than one sold privately.
Type of sell: (Dummy variable with the value of 1 for a garage sale)
H0: β2 ≤ 0
HA: β2 > 0
T critical = 1.645
i. t = 1.76 > 1.645
ii. Estimated β2 > 0
Enough evidence to reject the null hypothesis.
Fuel Type:
This is a dummy variable whose value is one when the fuel type of the car is petrol. The result shows a
direct relationship between the type of fuel and the price of the vehicle: the price is expected to be
€1858.22 higher if the fuel type is petrol. This relationship seems logical considering that the
maintenance cost of a petrol car is relatively lower than that of a diesel car. At the same time, a
diesel car entitles its owner to a VAT refund under the UK and Irish tax systems, which would be
expected to offset the relative advantage of petrol cars; the results of this research do not bear
that out.
Fuel Type: (Dummy variable with the value of 1 for a petrol car)
H0: β3 ≤ 0
HA: β3 > 0
T critical = 1.645
i. t = 2.61 > 1.645
ii. Estimated β3 > 0
Enough evidence to reject the null hypothesis.
Dublin:
This is a dummy variable whose value is one when the car being sold is located and registered in
Dublin. It is often believed that cars registered in an urban area are likely to be pricier than those
registered elsewhere in the country. The results indicate that the price of a car is expected to be
€339.72 higher if the car is registered in Dublin. However, the t-statistic for this coefficient is
only 0.60, which is not significant. With the null hypothesis H0: β4 ≤ 0, stating that location does
not raise the price, the research did not find significant evidence to reject the null hypothesis.
Age:
The research found that there is a strong negative relationship between the age of a car and its price.
The results indicate that for every additional year of age, the price of the car decreases by €782.28.
This relationship is highly significant, with a t-statistic of -5.45. The decline is to be expected,
since the costs of maintenance and insurance increase as a vehicle ages.
Age:
H0: β5 ≥ 0
HA: β5 < 0
T critical = 1.645
i. |t| = 5.45 > 1.645
ii. Estimated β5 < 0
Enough evidence to reject the null hypothesis.
Mileage:
For mileage, as expected, the research found that there is a negative relationship between the price of
a car and the miles driven. The results indicate that for every mile driven the price of the car is
expected to fall by approximately €0.04 (about 4 cents).
Mileage:
H0: β6 ≥ 0
HA: β6 < 0
T critical = 1.645
i. |t| = 6.11 > 1.645
ii. Estimated β6 < 0
Enough evidence to reject the null hypothesis.
Gold: (Dummy variable with a value of 1 for a gold coloured car)
H0: β12 ≤ 0
HA: β12 > 0
T critical = 1.645
i. t = 2.64 > 1.645
ii. Estimated β12 > 0
Enough evidence to reject the null hypothesis.
Black:
This is a dummy variable whose value is one when the colour of the car is black. The research found
that the price of the car increases by €1592.52 when the colour of the car is black. This finding is
consistent with the findings of one of the biggest suppliers of automotive paints, “DuPont Automotive
Color”, whose 2008 colour popularity report found that most Europeans prefer black; the detailed
analysis is found in the table below. However, the hypothesis test did not find enough evidence to
support this claim.
New Vehicles Colour Popularity, % of regional totals
Source: DuPont Automotive Color Popularity Report, 2008
Colour              (unlabelled)  Braz  Mex  EU  Russ  China  India  Japan  Korea
Black               17            25    20   26  14    31     7      13     25
Blue/Turquoise      13            3     12   13  12    9      8      7      2
Brown/Beige         5             3     1    4   2     0      4      2      0
Green/Olive         3             2     2    2   13    2      1      3      0
Grey, medium/dark   12            16    13   18  3     15     4      7      3
Red/Pink/Purple     11            8     11   7   14    5      12     3      1
Silver              17            31    17   20  30    32     27     28     50
White               20            11    20   10  10    1      28     32     18
Yellow/Gold         2             1     3    0   2     2      7      0      1
Silver:
This is a dummy variable whose value is one when the colour of the car being sold is silver. The
research found that the price of the car increases by almost €865 when it is a silver car. According
to “DuPont Automotive Color”, silver is the second most favoured colour in Europe, as seen in the table
above. However, the hypothesis test concluded that there is no significant evidence to suggest that the
price of the car increases by an extra €865 for a silver coloured car.
Japanese car:
All the Japanese cars selected for the model, except the Nissan Micra, were found to be negatively
related to the price of used cars. The model found that on average a Japanese car is likely to be
€1691.30 less expensive than a European car with the same characteristics. The Nissan Micra was the
exception, with an expected price gain of €372.98.
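The average figure for Japanese cars appears to be the simple mean of the four Japanese model coefficients in the regression equation. The short check below makes that arithmetic explicit; treating the Micra, Corolla, Avensis and Almera dummies as the Japanese group is our reading of the variable list.

```python
# Coefficients of the Japanese model dummies, copied from the regression equation.
japanese_models = {
    "Micra": 372.98,
    "Corolla": -935.02,
    "Avensis": -2396.353,
    "Almera": -3806.80,
}

average_effect = sum(japanese_models.values()) / len(japanese_models)
print(f"{average_effect:.2f}")  # -1691.30
```

The mean reproduces the €1691.30 discount quoted above, with the positive Micra coefficient pulling the average up slightly.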