probability weighting and asset prices: evidence from...
TRANSCRIPT
-
Probability Weighting and Asset Prices: Evidence from
Mergers and Acquisitions*
BAOLIAN WANG†
ABSTRACT
For mergers and acquisitions with a small failure probability, the average decline in target
stock price if the deal fails is much larger than the average increase that accompanies deal
success. Probability weighting implies that the deal failure probability of such target stocks
will be overweighted, leading them to be undervalued. I find strong supporting evidence and
a trading strategy delivers around 1% abnormal return per month, with negative beta,
negative downside risk exposure. The abnormal returns are not subsumed by a preference for
positive skewness under traditional (expected) utility models, and are stronger when arbitrage
is more difficult.
Keywords: Mergers and acquisitions, Probability weighting, Cumulative prospect theory
JEL classification: G12, D03, G34
* I thank Federico Bandi, Nicholas Barberis, Justin Birru, Philip Bond, Charles Cao, John Chalmers, Kalok
Chan, Sudheer Chava, Te-Feng Chen (discussant), Darwin Choi, James Choi, Sudipto Dasgupta, Kathryn
Dewenter, Vidhan Goyal, Bing Han, Lawrence Jin, Andrew Karolyi (discussant), Peter Kelly, George Korniotis,
Alok Kumar, Kai Li, Mark Loewenstein, Dan Luo, Abhiroop Mukherjee, Kasper Nielsen, Gong Ning
(discussant), Jun "QJ" Qian, Mark Seasholes, Rik Sen, Christoph Schneider (discussant), Andrew Siegel, Lei
Sun, K.C. John Wei, Mark Westerfield, Liyan Yang, Haishan Yuan, Chu Zhang, Guochang Zhang, Zilong
Zhang and the seminar participants at Cornerstone Research, Fordham University, Georgia Institute of
Technology, HKUST, Johns Hopkins University, Oxford, Peking University (Guanghua), Shanghai Advanced
Institute of Finance, Shanghai University of Finance and Economics, Sun Yat-Sen University, University of
Miami, University of South Carolina, University of Toronto (Econ), University of Washington, the EFA
Doctoral Tutorial, the Financial Intermediation Research Society Conference 2014, the Third ECCCS Workshop
on Governance and Control, the Doctoral Consortium of Financial Management Association, Asian Financial
Association (2014), the PhD forum of 24th Australasian Finance and Banking Conference for helpful
discussions. This paper was awarded the Best Paper Award at the Third ECCCS Workshop. An earlier version
of the paper was circulated under the title “Is Skewness Priced and Why? Evidence from Target Pricing in
Mergers and Acquisitions”. All errors are my own. † Area of Finance and Business Economics, Fordham School of Business, Lincoln Center Campus, Room 1328,
1790 Broadway, New York, NY 10019. Tel: (+1) 212-636-6118. Fax: (+1) 646-312-8295. Email:
-
1
1. Introduction
A substantial body of experimental research -- starting with Kahneman and Tversky
(1979) -- shows that decision makers tend to overweight the probability of tail events, such as
winning a lottery.1
Researchers have proposed different mechanisms to explain this
“probability weighting” phenomenon. For example, some have suggested that tail events tend
to be discussed disproportionately and are easier to recall (Tversky and Kahneman, 1973;
Lichtenstein, Slovic, Fischhoff, Layman and Combs, 1978), and others have suggested that
such events are psychologically more salient (Bordalo, Gennaioli, and Shleifer, 2012, 2013).2
The asset pricing implication of probability weighting is the following: it leads to
overvaluation of lottery-type assets (assets with a small probability of a large upside gain as
the probability of large gain will be overweighted) and undervaluation of disaster-type assets
(assets with a small probability of a large downside loss as the probability of large loss will
be overweighted). In this paper, I present a novel way of investigating probability weighting
in the context of mergers and acquisitions (M&As), by examining the price behavior of the
target stocks in the period between deal announcement and deal resolution.
In many ways, the M&A market provides an ideal setting in which to examine probability
weighting. First, after deal announcement, the primary concern faced by the target company
shareholders is whether or not the deal will be completed.3To a first approximation, this
setting involves binary payoffs [(completed, not completed)]; this is consistent with the way
most experimental studies analyze probability weighting (for example, Kahneman and
Tversky, 1979; Tversky and Kahneman, 1992), and fits with the way the existing literature
models probability weighting (Barberis and Huang, 2008; Bordalo, Gennaioli, and Shleifer,
2013). Second, as I show in Section 3.2, one can measure deal failure probabilities
objectively -- based on deal characteristics such as the attitude of the target company and the
payment method -- with relative ease and precision. At the same time, the decision
probability that investors use to price target company stocks can also be imputed from the
1 For a review of this literature please see Fehr-Duda and Epper (2012).
2 For more discussions of the psychology of tail events, see Burns, Chiu, and Wu (2010), and Barberis (2013).
3 Other risks include the possibility of entry of competing acquirers and revisions of the offer price. Using a
sample of M&As from 1981 and 1996, Baker and Savasoglu (2002) find that these risks are second order
relative to deal completion risk.
-
2
target stock prices, or equivalently, their expected future stock returns. Third, announced
M&As commence and conclude in discrete and typically short intervals4. This clearly defines
the period in which possible mispricing associated with probability weighting may arise.
I test this prediction in two steps. First, I construct an empirical measure of deal failure
probability using a logistic specification. I integrate the previous literature (Walkling, 1985;
Samuelson and Rosenthal, 1986; Baker and Savasoglu, 2002; Bates and Lemmon, 2003;
Officer, 2003; Bhagat, Dong, Hirshleifer, and Noah, 2005; Bates, Becher, and Lemmon, 2008;
Baker, Pan and Wurgler, 2012) by incorporating a rich set of variables about acquirer
characteristics, target characteristics, and deal characteristics. The failure prediction model
works very well out of sample in predicting actual deal outcomes. Second, I use a calendar
time portfolio approach, similar to Mitchell and Pulvino (2001), Baker and Savasoglu (2002),
and Giglio and Shue (2013) to analyze the target stock returns in the period between deal
announcement and deal resolution.
Ideally, I would like to have deals when failure probability is small and deals when failure
probability is close to one. However, in practice, very few deals have large failure probability
– only 0.89% of the whole sample has failure probability higher than 80%. Therefore, I focus
on examining deals with small failure probabilities and deals with moderate failure
probabilities.
If investors tend to overweight tail events, as theory and experiments suggest, target
company stocks should be underpriced when deal failure probability is small. Therefore, they
should earn positive abnormal returns in the period from deal announcement to deal
resolution. However, when deal failure probability is moderate, there would be no probability
over-weighting, and therefore, no abnormal returns on targets in such deals.
I sort all the deals into three groups based on the deal failure probability -- deals with
failure probability below 10% are classified as targets with low failure probability (the Low
group), deals with failure probability between 10% and 20% are classified as targets with
4 In my sample, the mean (median) duration from deal announcement to resolution is 134 (100) days.
-
3
medium failure probability (the Medium group), and all the others are classified as targets
with high failure probability (the High group).5
The empirical results strongly support the prediction of probability weighting for target
stocks in M&A deals. First, I find that the target stocks in the Low group yield around 0.80%
abnormal return per month between deal announcement and resolution. In terms of
magnitude of probability weighting, the results show that, on average, 5.2% failure
probability is overweighted to 13.2%. Second, when I examine target stocks in the Medium
group and the High group, I find no evidence of abnormal returns. A trading strategy that
buys target company stocks in the Low group and sells short target company stocks in the
High group (the Low-High portfolio) yields an alpha of 0.75% to 2.23% per month,
depending on sample selection and model specifications. I also find that, compared to the
short side of the portfolio, the long side of the portfolio has a lower beta, lower downside risk,
and lower coskewness risk as measured by Harvey and Siddique (2000), suggesting that the
positive abnormal return is unlikely to be driven by systematic risk exposure. My results are
robust to subsamples, subperiods, and alternative models of deal failure probability.
Are the arbitrageurs aware of this arbitrage opportunity and if so, what prevent them from
arbitraging away all the mispricing? First, I find that the correlation between the real world
merger arbitrage index and the Low portfolio return is much higher than its correlation with
the High portfolio, suggesting that mergers arbitrageurs indeed trade the target stocks with
low failure probabilities more than the ones with high failure probability. Second, the return
difference between the stocks from the Low group and the stocks from the High group is
larger when merger arbitrage capital is more constrained: when more deals failed in the
recent past, when more deals are pending, and when arbitrage involves more trading and
costly short selling.
I examine two alternative explanations. First, for mergers and acquisitions with small
failure probabilities, the average decline in the target stock price if the deal fails is much
larger than the average increase that accompanies deal success. This implies that targets in
5 The literature does not give an unambiguous cutoff on how small probability is small enough to lead to
overweighting. Therefore, in the sorting, I have finer classification for deals with failure probability lower than
20%. The results are robust if I further classify the High groups into 3 categories: between 20% and 30%,
between 30 and 40%, and others.
-
4
such deals have negative return skewness. Many traditional utility functions, for example, the
CRRA utility function, also exhibit a preference for skewness. The rare disaster models
(Barro, 2006; and Gabaix, 2012) explicitly consider extreme negative state with small
probabilities. I thus examine whether the findings can be explained by skewness preference
implied from the traditional utility functions or by the rare disaster models.
The traditional skewness preference models (Harvey and Siddique, 2000; Karus and
Litzenberger, 1976) and the rare disaster models (Barro, 2006; Gabaix, 2012) predict that,
when investors care about diversification and do not face frictions to do so, systematic risks
(coskewness and systematic disaster risk) will affect asset prices but idiosyncratic risk
(idiosyncratic skewness or idiosyncratic downside risk) will not. However, at the portfolio
level, the Low-High portfolio has negative coskewness risk and negative systematic
downside risk. After proper adjustment, the alpha of the Low-High portfolio becomes even
larger. I also show that, even when we introduce frictions to diversification, the traditional
skewness preference models and the rare disaster models cannot explain my findings.
Second, Grinblatt and Han (2005), and Frazzini (2006) argue that the disposition effect
can lead to excess selling after price increases, which will drive the current stock price below
the fundamental value and consequently yield higher future stock returns. Typically, an M&A
announcement is “good news” for the target shareholders. The disposition effect will
therefore predict under-reaction. This effect may be stronger for deals with low failure
probability, as their initial price run-up is likely to be higher. In other words, in deals with
low failure probability, the target price will increase more on announcement, which will
mean higher capital gains for existing shareholders. The disposition effect will make these
shareholders more likely to sell the stock. This might depress the stock’s price, beyond
fundamentals, and therefore lead to higher returns in the near future. I find that both the target
return prior to deal announcement and the return around the announcement period predict the
future target stock return, which is consistent with the disposition effect-based explanation
outlined above. However, controlling for the pre-announcement and announcement returns
does not reduce the significance of my main results.
My paper fits into a growing literature that applies probability weighting to financial
market phenomena. On the theory side, Polkovnichenko (2005) uses it to explain the under-
-
5
diversification of household portfolios. Barberis and Huang (2008), and Bordalo, Gennaioli,
and Shleifer (2013) examine the implications of probability weighting for security prices.
Brunnermeier and Parker (2005) and Brunnermeier, Gollier, and Parker (2007) endogenize
probability weighting in an optimal expectation framework where the agent chooses his/her
beliefs to maximize well-being. Spalt (2013) finds that probability weighting can help explain
why riskier firms grant more stock options to their employees.
The empirical literature focuses on the prediction of probability weighting that investors
will prefer to buy positively skewed assets (Kumar, 2009; Mitton and Vorkink, 2007) and
will be willing to pay a premium for them. Using various measures of skewness, Boyer,
Mitton, and Vorkink (2010), Amaya, Christoffersen, Jacobs, and Vasquez (2014), Bali,
Cakici, and Whitelaw (2011), and Conrad, Dittmar, and Ghysels (2013) document consistent
evidence. Skewness preference also helps explain the first-day returns and long-run
underperformance of IPOs (Green and Hwang, 2012), the prices of out-of-money options
(Boyer and Vorkink, 2014), the underperformance of distressed stocks (Conrad, Kapadia, and
Xing, 2014), and the underperformance of stocks trading in the over-the-counter markets
(Eraker and Ready, 2014). Schneider and Spalt (2013) argue that managers tend to overpay
for lottery type targets and Schneider and Spalt (2013) find that conglomerate companies
overinvest in high-skewness segments. There are also a smaller number of studies which
impute probability weighting functions from the S&P 500 index options (Polkovnichenko
and Zhao, 2013, and Chabi-Yo and Song, 2013a) and currency options (Chabi-Yo and Song,
2013b).
However, most of these studies do not explicitly investigate whether the documented
skewness preference is driven by probability weighting or from other utilities, which I do in
this paper. The binary nature of the outcome variable fits with the way the experimental
studies and the theoretical studies examine probability weighting. I also investigate the
magnitude of probability weighting in this paper.
My findings also contribute to the merger arbitrage (also called risk arbitrage) literature.
After deal announcement, the target stock price is typically lower than the offer price. Merger
arbitrage refers to the strategy that attempts to profit from this spread. In doing so,
arbitrageurs buy the target stock, and if the offer involves stock, sell short the acquirer stock
-
6
to hedge the risk induced by fluctuations in the acquirer stock price. Bhagat, Brickley, and
Loewenstein (1987), Larcker and Lys (1987), Mitchell and Pulvino (2001), and Baker and
Savasoglu (2002) find that, on average, target stocks yield positive abnormal returns in the
period between deal announcement and deal resolution. Larcker and Lys (1987) attribute the
positive abnormal returns to compensation of information acquisition of the merger
arbitrageurs. Baker and Savasoglu (2002), Geczy, Musto, and Reed (2002), Mitchell,
Pedersen and Pulvino (2007), Mitchell and Pulvino (2001, 2012), and Officer (2007)
investigate how limits to arbitrage affect the merger arbitrage market. Giglio and Shue (2013)
find that the deal success hazard rate decreases as time passes after deal announcement, but
investors do not fully take this into account when they price target stocks. Cao, Goldie, Liang,
and Petrasek (2014) find that hedge fund merger arbitrageurs do better than non-hedge fund
institutional merger arbitrageurs. In this paper, I examine target stock return conditional on its
ex-ante deal failure probability. My findings suggest that target stock performance is related
to deal failure probability, and refining merger arbitrage strategies to incorporate this finding
is likely to improve profits significantly.
The rest of paper is structured as follows. In Section 2, I introduce more background of
probability weighting and why it matters for M&As. Section 3 describes the data and the
method used to model deal failure probability. Section 4 presents the empirical results.
Section 5 concludes.
2. Hypothesis development
2.1 Probability weighting: Some background
Economists have long recognized that people are more sensitive to the probability of tail
events than typical events. For example, Tversky and Kahneman (1992) find that the median
subject in their experiment is indifferent between receiving a lottery with 1% chance of
winning $200 and a certain $10, and also indifferent between receiving a lottery with 99%
chance winning $200 and a certain $188. It is hard for the expected utility model to generate
such behavior, as in that model, events are weighted linearly by their objective probabilities.
To accommodate this, researchers propose probability weighting functions which transform
objective probabilities into decision probabilities. Many models have incorporated this
-
7
mechanism, for example, rank-dependent expected utility models (Quiggin, 1982; Yaari,
1987; Prelec, 1998) and prospect theory (Kahneman and Tversky, 1979; Tversky and
Kahneman, 1992). A substantial body of experimental studies show that incorporating
probability weighting can more accurately describe people’s decision making under
uncertainty than expected utility theory (Kahneman and Tversky, 1979; Fehr-Duda and Epper,
2012).
Probability weighting can be driven by either erroneous beliefs about objective
probability (probability estimation error) or by people behaving as if they knowingly assign a
disproportionately higher weight to tail events (overweighting objective probability). For the
former, the difference in objective probability and the decision probability reflects the error
investors make in probability estimation. However, according to the latter view, the disparity
between objective probability and decision probability is just a modelling device that captures
investors’ inherent risk preferences (Kahneman and Tversky, 1979). 6
Both views have received considerable attention in the psychology literature. For the
probability estimation error view, the availability heuristic and anchoring bias are often cited
as potential reasons why erroneous beliefs lead to overestimation of the probability of tail
events. The availability heuristic predicts that, when judging the frequency or probability of
an event, people tend to rely on the ease with which an example of the event comes to mind
(Tversky and Kahneman, 1973). Tail events are more available as they are discussed
disproportionally more than typical events and are easier to recall. For example, Lichtenstein,
Slovic, Fischhoff, Layman and Combs (1978) find that people overestimate the frequency of
rare but more discussed causes of death like accidents and homicide. Specifically, subjects
thought that accidents caused about as many deaths as disease and that homicide was a more
frequent cause of death than suicide. Actually, diseases cause about 16 times as many deaths
as accidents, and suicide is twice as frequent as homicide.
Anchoring bias may also be a source of probability weighting. Event space is usually
specified into coarse categories. Fox and Clemen (2005) and Sonnemann, Camerer, Fox and
Langer (2013) find that a typical subject in the experiment tends to assign equal probability to
6 See Barberis (2013) for more details.
-
8
the specified events. For example, we may classify market conditions as up market, down
market and flat market. Experimental subjects tend to assign the same probability to each
specified event. For this example, subjects tend to assign 1/3 to up market, down market and
flat market, even when their true probabilities are different. In reality, normal events in the
middle of a distribution are typically partitioned more coarsely than events that lie at the
extremes of a distribution. As a result, the objective probability of a specified tail event is
typically lower than a specified normal event. Equal probability assignment and anchoring
therefore predicts overweighting of tail events.
Many mechanisms have also been proposed to explain the second view of probability
weighting – that people behave as if they intentionally put higher weight on tail events. For
example, Rottenstreich and Hsee (2001) propose that tail events can make people more
emotional. Recently, Bordalo, Gennaioli, and Shleifer (2012) propose a salience-based theory
of choice under uncertainty. The psychology literature documents that salience attracts
attention. For example, according to Kahneman (2011, p324), ‘‘our mind has a useful
capability to focus on whatever is odd, different or unusual.’’ Based on this, Bordalo,
Gennaioli, and Shleifer (2012) argue that decision makers’ attention is likely to be drawn to
salient outcomes, such as extreme tail outcomes, which leads to overweighting of these
outcomes.
2.2 Probability weighting in merger and acquisition deals
There are multiple reasons why probability weighting may matter in M&As. For an
M&A deal that involves a small deal failure probability, the target stock price is typically
close to the offer price. If the deal succeeds, which happens with a high probability, by
definition, the stock price will only increase for a small amount up to the offer price. But in
the unlikely event that the deal fails, the stock price will drop by a much larger amount. If
investors recall large losses suffered on previous deal failure disproportionately (availability
heuristic) more than small gains from previous deal success, investors may overestimate the
true probability of deal failure. Alternatively, it is also possible that they do not make
systematic errors in judging the probability itself, and that it is an inherent aversion to large,
small-probability losses that results a lower willingness-to-pay for deals with small failure
-
9
probabilities (Rottenstreich and Hsee, 2001; Kahneman and Tversky, 1979; Bordalo,
Gennaioli, and Shleifer, 2012).7
2.3 Asset prices with probability weighting
Several models have been proposed to analyze the asset pricing implication of probability
weighting. Barberis and Huang (2008) consider an economy in which investors have
cumulative prospect theory preference, and examine how a lottery-type security – a security
with a small probability of large gain and a large probability of small loss – is priced in a
mean-variance world. They show that, in such an economy, lottery-type securities can
become overpriced and earn negative abnormal returns.
Barberis, Mukherjee, and Wang (2014) also model probability weighting using
cumulative prospect theory. But different from Barberis and Huang (2008) in which investors
make decisions at the portfolio level, Barberis, Mukherjee, and Wang (2014) assume that
investors derive their prospect theory utility from individual stocks (i.e., engage in narrow
framing). Similarly, Bordalo, Gennaioli, and Shleifer (2013) also assume investors engage in
narrow framing, but they model probability weighting based on the salience of a payoff. In
contrast to rank-dependent utility and cumulative prospect theory where probability
weighting function is constant, in Bordalo, Gennaioli, and Shleifer (2013), probability
weighting function is context dependent: it relies on other alternatives and how a decision
problem is described. Brunnermeier and Parker (2005) and Brunnermeier, Gollier, and Parker
(2007) endogenize probability weighting in an optimal expectation framework where the
agent chooses his/her beliefs to maximize well-being. Though the mechanisms are different,
all these models predict that lottery-type securities can become overpriced and disaster-type
securities (securities with a small probability of large loss and a large probability of small
gain) can become underpriced.
To illustrate the central point of probability weighting, I use a simple model to show the
link between deal failure probability and the expected target stock return, and to develop the
hypothesis. In the model, the deal will either fail or succeed. If the deal succeeds, the target
7 Although a distinguishing between the two alternatives is beyond the scope of this paper, some indicative
evidence can be found from Part IV of the Online Appendix.
-
10
shareholder receives the offer price Poffer. If the deal fails, the target is worth its standalone
value Palone, where I assume that Poffer>Palone. Poffer and Palone are assumed to be constant and
known8. The deal failure probability is π where0< π π when
π is small and w(π)π and (weakly) negative if w(π)π}, in other words, when will π
be overweighted? Different probability weighting functions have different answers to this
8 In reality, competitors may join in the bidding; the offer price is also subject to revision; if the payment is not
pure cash, the target shareholders may not know exactly what Poffer will be; being targeted will also reveal a lot
of valuable information to the market and potentially change the strategy of the target management even if the
deal fails, which will also change the target value. All these make Poffer and Palone variable. However, as long as
the market has a reasonable estimation of Poffer and Palone, and their variances are low comparing to their
difference, the model is a good approximation of reality. 9 In the empirical analysis, this assumption will be relaxed.
-
11
question, but most predict that the fixed point (when w(π)=π) is around 0.3 (Tversky and
Kahneman, 1992; Prelec, 1998). If investors ignore diversification, we predict that targets
will be undervalued when deal failure probability is lower than 0.3. However, this may not be
the case in the financial markets. First, the average investor may be different from the
average experimental subject, given the presence of many sophisticated institutions. Second,
not all investors engage narrow framing and they may make decisions by incorporating the
effect of π in their overall portfolio. The negative skewness driven by small π may be
partially diversified in a portfolio and the pricing effect of probability weighting may become
weaker. Ultimately, the range of π for which targets will be undervalued is an empirical
question.
3. Data and modeling deal failure probability
3.1 Data
I begin with all the M&As in the Securities Data Company database of Thompson
Financial (SDC). SDC covers deals going back to 1978. However, coverage prior to 1981 is
incomplete. The sample used in this paper runs from January 1, 1981 to December 31, 2010. I
am left with a sample of 16,906 deals after I use the following filters:
1. The target is a U.S. listed firm.
2. The deals that are classified by SDC as rumors, recapitalizations, repurchases, or
spinoffs are excluded.
3. In order to calculate target characteristics and post announcement return, I also
require that the price and return data are available from Center for Research on
Security Prices (CRSP) one year prior to deal announcement and at least three trading
days after deal announcement.
4. The deal completion or date of withdrawal must be at least three days after the deal
announcement, or missing.
Table 1 shows the sample of deals. The number of deals is larger in late 1980s and 1990s
than later years, but even in the 2000s, on average, there are more than 300 deals each year.
The deal duration is measured as the number of calendar days between deal announcement
and deal completion or withdrawal. The median duration is 100 days and the mean is 134,
-
12
suggesting that the deal duration is right skewed. 1,144 deals are initially viewed as hostile or
unsolicited by the targets.10
Payment method data shows that the number of pure stock deals
is highest during the internet bubble. Around 21.2% of the deals are conducted through tender
offers. Leverage Buyouts (LBO) and mergers of equals (MOE) are generally rare. Only 6.6%
and 0.8% are LBOs and MOEs, respectively. 55.5% of the acquirers are public firms. Past
return is the cumulative return from 365 days to 22 trading days prior to deal announcement.
The variation is large, but on average, the target past return is 9.65%, close to the market
return. Target Size is the natural logarithm of target firm market capitalization 22 trading
days prior to deal announcement. I convert market size into constant 2005 dollars using the
GDP deflator from the Federal Reserve. Mean target size is relatively stable in the early years,
begins to increase only in 2004 and reaches a peak in 2007.
Prior Bid is a dummy variable which is equal to 1 if the target company has received
another takeover bid in the past 365 days. The results show that, for 24.6% of the deals,
another bid was received in the past 365 days. The data also reveals significant variation in
geographical and industrial linkage between the acquirer and the target: for 15.4% of the
deals, the acquirers are foreign firms; for 26.7% of the deals, the acquirer and the target are
not located in the same state; and for 43.7% of the deals, they are not in the same 2-digit SIC
industry.
[Insert Table 1 here]
I also look at acquirers’ pre-acquisition ownership of target companies in these deals. Pct
Held is the percentage of the target shares held by the acquirer 6 months prior to the deal
announcement. Toehold is a dummy which is equal to 1 if the holding is no less than 5%.
Cleanup is a dummy variable which is equal to 1 if the holding is greater than 50%. Around
3.4% of the deals are cleanup deals. The acquiring firm has a toehold only in 14.3% of the
deals. But on average the acquirers hold 4.41% of the target shares, close to the cutoff used to
10
As the ultimate purpose of this paper is to do asset pricing tests, it is important to make sure that all the
information used to construct portfolios is available at the moment of portfolio construction. Thus, I choose to
use the initial attitude of the target company to the merger and acquisition announcement, instead of the final
attitude.
-
13
define toehold. This implies that once an acquiring firm holds shares of the target firm, it is
very likely that it holds a significant fraction.
In the whole sample, 10,692 deals are completed, and 3,167 are withdrawn. Most of the
remaining deals are classified as pending or status unknown by SDC. Based on the deals with
known status, the average completion rate is 77.1% (10,692/(10,692+3,167)). As the pending
and status unknown cases may be different from the cases with known status, in order to
prevent any sample selection bias, I include all of them in my asset pricing tests, but in
modeling the deal success probability, I only use the deals with known status. In the next
section, I discuss in detail how I model deal failure probability.
3.2 Modeling deal failure probability
I use two models to estimate deal failure probability. The first is based on deal
characteristics; the second is based on the target stock price. I refer to the first model as the
characteristics-based model, and to the second as the target price-based model. Throughout
the analysis, I use π to denote deal failure probability.
3.2.1 The characteristics-based model
Following the literature (Walkling, 1985; Samuelson and Rosenthal, 1986; Baker and
Savasoglu, 2002; Bates and Lemmon, 2003; Officer, 2003; Bhagat, Dong, Hirshleifer, and
Noah, 2005; Bates, Becher, and Lemmon, 2008; Baker, Pan and Wurgler, 2012), I use deal
characteristics and firm characteristics in a logistic specification to predict deal outcome.
The deal and firm characteristics that I use are: a hostile dummy variable to measure the
target attitude (Baker and Savasoglu, 2002; Baker, Pan, and Wurgler, 2012), a pure cash
dummy variable and a pure stock dummy variable to measure the payment method (Bates and
Lemmon, 2003; and Bates, Becher, and Lemmon, 2008), a Tender dummy variable, an LBO
dummy variable and a merger of equals (MOE) dummy variable to measure the deal type.
Following Officer (2003), I put three variables—Pct Held, Toehold, and Cleanup—in the
model to capture the effect of the acquirer’s pre-acquisition ownership. I also consider
acquirer and target characteristics such as the public status of the acquirer, and the size and
past return of the target. Other factors include dummy variables indicating whether the target
received any takeover offer in the past 365 days (Prior Bid), whether the acquirer is a foreign
-
14
company, and whether the acquirer and the target are from the same state or in the same 2-
digit SIC industry. Prior Bid accounts the competitiveness of the takeover market and the
other variables account for the possible effect of geographical and economic linkages
between the two sides of the deal.11
[Insert Table 2 here]
Table 2 reports the results. The dependent variable is a dummy variable which is equal to
1 if the deal is withdrawn, and 0 if the deal is completed. Models (1) to (6) report the
coefficients and the second column of model (6) reports the marginal effects based on model
(6).12
In estimating the model parameters, I only use the deals with known status in the
sample.
I first examine five sets of factors separately and in model (6) I examine all the factors in
one full model. Almost all the factors considered have significant predictive power for the
deal outcome and the estimated signs are very similar in the separate models and in the full
model. The results in the model (6) show that the significantly positive predictors include
hostile, LBO, MOE, Toehold, Cleanup, Prior Bid, and Cross Border. The negatively
significant variables include Pure Cash, Pure Stock, Tender, Pct Held, Public Acquirer,
Target Past Return, Target Size, and Same Industry. The only variable that is insignificant in
model (6) is the Same State dummy. By examining the marginal effects reported in the last
column, we can compare the importance of the various predictors. Among all the variables,
five have an absolute average marginal effect larger than 0.100. They are: hostile (0.565),
LBO (0.146), Prior Bid (0.142), Tender (-0.118), and Toehold (0.106).
Next, I examine the predictive performance of my method of assessing deal failure
probabilities. In order to avoid any look-ahead bias, I use an out-of-sample estimation method
similar to Campbell, Hilscher, and Szilagyi (2008) to estimate out-of-sample deal failure
11
The literature has also uncovered other important determinants of the deal outcome. For example, Bates and
Lemmon (2003) and Officer (2003) find that target and bidder termination fees act as an efficient contract
mechanism to encourage bidder participation and to secure target company gain, and increase the deal
completion rate. Burch (2001) and Bates and Lemmon (2003) find that lockup options also increases the deal
completion rate. Officer (2004) argues that collar provisions can be used as contractual devices by lowering the
costs of target and acquirer renegotiation. I do not consider these contractual terms because they may not be
available to the public at the beginning of the deal. Considering these factors has little effect on the main results. 12
For most variables, the marginal effects in models (1) to (5) are similar to those in model (6). To save space, I
do not report the marginal effects for models (1) to (5).
-
15
probability. Specifically, for the deals in year t, I use all the data before year t to estimate the
model coefficients. Only deals that have already been completed or withdrawn by the end of
year t-1 are included in the estimation.13
The estimated coefficients are used to calculate the
deal failure probability for all the deals in year t, including the deals with status classified as
pending or unknown by SDC.
In order to have robust coefficient estimate, I require at least three years data to do the
estimation. Thus, the main test sample begins in 1984 and ends in 2010; 16,163 deals are
used in the asset pricing tests.
[Insert Figure 2 here]
Figure 2 shows the distribution of the deal failure probability when using model (6). I
classify the deals into 100 groups based on their failure probability. The vertical axis shows
the number of deals in each group. It shows that deal failure probability has large variation.
Deals with failure probability higher than 50% are generally rare, while a large number of
deals have failure probability lower than 10%. Overall, the distribution of deal failure
probabilities is fairly smooth.
[Insert Figure 3 here]
In Figure 3, I examine whether model (6) in Table 2 does a good job in predicting deal
outcomes, year by year. I sort all the deals into three groups: low failure probability group
(deals with failure probability lower than 10%), medium failure probability group (deals with
failure probability between 10% and 20%), and high failure probability group (all the
others).14
I calculate the realized deal failure rate for each group and each year. Figure 3
shows that the deals with low failure probability indeed have a lower realized failure rate than
the deals with high failure probability, but perhaps more pointedly, this is true in each and
every year in my sample. The deals with medium failure probability also lie in between the
two extreme groups in most years.
13
This means that the deals that are announced before or in year t-1 but have not been completed or withdrawn
at the end of year t-1 are not used. 14
I choose 10% and 20% as the cut-offs mainly because I will do the asset pricing analysis using the same
grouping.
-
16
In addition, the model does a better job in the second half of the sample than in the first
half. In the 1980s, though deals with different failure probabilities are clearly separated, the
realized failure rate is higher than the fitted failure probability. I conjecture that this is driven
by the hostile takeovers in late 1980s and the fact that in the first half of the sample the model
estimates are nosier than in the second sample where we have more historical data. But
overall, the results show that the model in Table 2 has very good out-of-sample predictive
power in separating deals with different failure probabilities. We also find that the results are
better in the second half of the sample than in the first half of the sample (in Panel B of Table
7).
3.2.2 The target price-based model
Another way of modeling deal failure probability is to infer it by comparing the offer
price and the post-announcement target price. Consider the example in Figure 1: if we know
the offer price (Poffer), the target standalone price (Palone) and the post-announcement target
price (P), we can infer how the market perceives the likelihood of deal failure. I use the
following formula as the second measure of deal failure probability: (Poffer -P)/(Poffer -Palone).
Strictly speaking, the target price is not only affected by its standalone price, the offer price,
and the failure probability. It is also affected by the expected length of time needed to resolve
the deal and thus the expected return required to hold the target until then. However,
regardless of the exact asset pricing model, I expect that (Poffer -P)/(Poffer -Palone) should be
positively related to deal failure probability.
To prevent any look-ahead bias, I use the initial offer price (instead of the final offer price)
to do the calculation. I use the target price 22 trading days before the deal announcement to
proxy target standalone value. The post-announcement target price is measured at the end of
the second trading day after deal announcement. In order to minimize measurement error, I
require that the post-announcement target price is between the pre-announcement target price
and the offer price. 15
15
Both the offer price and the target standalone price are measured with noise. Competitor entry or expectation
of competitor entry can drive the post-announcement target price higher than the first offer price. It is also
possible that some extremely good or bad news comes out during the month just before the deal announcement
and takes the post-announcement target trading price beyond the range bounded by the pre-announcement and
the offer price.
-
17
In model (7) of Table 2, I analyze the predictive power of the price-based measure for the
deal outcome. The results show that (Poffer -P)/(Poffer -Palone) has very strong predictive power
for the deal outcome. The coefficient of (Poffer -P)/(Poffer -Palone) is 2.195, the t-value is 15.70,
and the marginal effect is 0.319, suggesting that an increase of 0.1 in the value of (Poffer -
P)/(Poffer -Palone) increases the failure probability by 3.19%. This is consistent with the
findings of Samuelson and Rosenthal (1986) and Subramanian (2004) that the market can
extract deal failure probability very well even at the beginning of a deal.
Compared to the deal characteristics-based measure, this price-based measure is simpler.
However, due to the constraint on the initial offer price in calculating (Poffer -P)/ (Poffer -Palone),
it is only available for 4,598 deals which are less than 30% of the full sample and are very
sparse in the years before 1996. Thus, in the paper, I use the characteristics-based measure as
my main measure and perform robustness tests using the price-based measure.16
4. Empirical results
In this section, I evaluate the average return of target firms after the deal announcement
and test whether investors overweight probability of deal failure when it is small. I sort all the
deals into three groups based on the deal failure probability. The deals with failure
probability below 10% are classified as targets with low failure probability, the deals with
failure probability between 10% and 20% are classified as targets with medium failure
probability, and all the others are classified as targets with high failure probability.
I classify in this way for two reasons. First, probability weighting has its biggest impact
on small probability events. I therefore define deals with low failure probability as deals with
a very low chance of failure. Second, the number of deals with failure probability higher than
20% is more volatile year by year. If I do a finer classification, the number of deals in some
categories will be too small. For example, if I classify deals with failure probability higher
than 20% into 3 categories-- between 20% and 30%, between 30 and 40%, and others-- for
some months, the number of deals will be as low as 1. This problem will become more severe
16
Both models are likely to have some error-in-variable problems. For example, for the characteristics based
model, it is possible that investors may use other available information which may have significant effect on
deal failure. For the priced-based model, I do not consider the heterogeneity in deal duration. Given the error-in-
variable problem, I expect that my results may underestimate the true importance of probability weighting.
-
18
for the subsample analysis in Table 5. Nevertheless, my results are robust if I further sort
deals with high failure probability into 3 categories: between 20% and 30%, between 30 and
40%, and others.
4.1 Target company characteristics
In Table 3, I report the characteristics of target stocks. Panel A reports all characteristics
except for the target stocks’ return moments, and Panel B reports their return moments. The
first column of Panel A shows the average deal failure probability for each group. The
averages are 0.052, 0.154, and 0.362 for Low, Medium and High, respectively. The
difference between Low and High is 0.310, which is economically quite large. The next two
columns show the mean and median durations of deals in each group. The mean (median)
durations are 100.87 (62), 137.44 (113), and 142.94 (107) for Low, Medium, and High,
respectively. I use the standard two sample t-test to examine the statistical significance of the
mean and the Brown and Mood (1951) test to examine the statistical significance of the
median. The differences in mean and median between the Low and High groups are both
statistically significant, suggesting that deals with low failure probability on average resolve
faster than other deals. The last column shows the average premium. Premium is measured as
the natural log difference between the initial offer price and the target stock price 22 trading
days before deal announcement. The average premia are 0.353, 0.354 and 0.382 for Low,
Medium and High, respectively. The difference between Low and High is not statistically
significant.
Panel B reports the target stock return moments. I use two methods to calculate the
characteristics of target stocks: one based on the realized return (physical moments) and one
based on option prices (risk neutral moments). For risk neutral moments, I can calculate the
moments stock by stock, as long as there is a large enough number of options available. I
follow Bakshi, Kapadia, and Madan (2003) and Dennis and Mayhew (2002) to calculate the
risk neutral moments.17
However, I cannot calculate the physical moments for each individual
stock as there is only one realized return for each of them. I thus calculate the physical
moments from the cross section of target stocks. Specifically, I calculate the cross-sectional
17
The detailed methodology can be found from Part I of the Online Appendix.
-
19
moments for stocks in the three portfolios. To mitigate the effect of extreme returns, I
calculate the physical moments from log returns rather than raw returns.
[Insert Table 3 here]
As discussed above, the payoff structure of the target stock is (approximately) binary. If
this is true, the statistical moments of target stock returns should reflect the characteristics of
the Bernoulli distribution. The standard deviation and skewness for a Bernoulli distribution
are π(1-π) and (1-2π)/[π(1-π)]1/2
, respectively. Standard deviation increases with respect to
deal failure probability when failure probability is lower than 0.5 (which is true for most of
the sample stocks), and skewness is negatively related to deal failure probability (in other
words, skewness is more negative when deal failure probability becomes smaller).
The results in Table 3 support these predictions about standard deviation and skewness.
From Table 3, I find that the standard deviation of the target stock return is highest for stocks
with high failure probability. For the target stock with low failure probability, its return
standard deviation is 21.6%, but the return standard deviation of target stocks with high
failure probability is 38.9%, almost double that of the low failure probability group. From
Low to High, the cross sectional skewness increases from -3.249 to -2.043.
The risk-neutral measures, calculated from option prices, show similar results for all the
three moments, though the number of stocks for which I can calculate risk neutral moments is
significantly smaller, as target companies on average are small stocks and may not have
traded options. However, in total, I still have more than 1,000 target stocks for which I can
calculate risk neutral moments. Furthermore, the results show that the differences in standard
deviation and skewness between the high- and low-failure probability groups are both
significantly different from zero.18
Since risk neutral skewness is imputed from option prices
and are therefore forward looking, these results also suggest that the option market can
extract information on deal failure probability soon after deal announcement. Overall, both
the results of comparing standard deviation and skewness reflect return characteristics
consistent with targets having a payoff structure that is approximately binary.
18
To mitigate the effect of extreme values in the statistical test, in unreported results, I also perform a Wilcoxon
rank test. The results are robust.
-
20
4.2 Details of the calendar portfolio construction
Following Mitchell and Pulvino (2001), Baker and Savasoglu (2002), and Giglio and
Shue (2013), I use calendar time portfolios to test the asset-pricing implications of probability
weighting. Upon an announcement of an M&A, the deal is classified into one of the three
categories (deals with low/medium/high failure probability). I begin to include a target stock
into one of the three portfolios from the end of the second day after the deal announcement.
This is to exclude the large abnormal returns that are associated with deal announcement. The
holding period ends when the time to announcement exceeds 180 days or when the deal is
completed or withdrawn, whichever comes first. To mitigate the bias of using daily equally
weighted returns to calculate compounded monthly returns (Blume and Stambaugh, 1983;
Roll, 1983; Canina, Michaely, Thaler, and Womack, 1998), throughout this paper, I measure
all portfolio returns using value weights, where the weight is the market capitalization of the
target firm at the end of the previous day.
I compound daily portfolio returns to get the monthly return. The median number of
target firms for Low, Medium, and High portfolios is 28, 58 and 121, respectively. This
suggests that the number of stocks in each portfolio is generally not thin, and my results are
unlikely to be driven by some months with a sparse number of deals.19
4.3 Portfolio returns and portfolio risks
This section reports the main results of this paper: the average returns of the Low,
Medium, and High portfolios, their risks and alphas. Besides the three portfolios, I also report
the characteristics of a zero-investment long-short portfolio which is the difference between
Low and High.
[Insert Table 4 here]
19
Out of the 324 months, there are 6 months in which the High portfolio has less than 10 stocks (minimum is 6),
3 months in which Low has less than 10 stocks (minimum is 8), and in all months Medium has more than 10
stocks. My results are robust if I exclude those months or if I replace return observations for those month by the
risk free rate.
-
21
Table 4 reports the characteristics of these portfolios. The results show that the average
excess return decreases with deal failure probability, from 1.170% for the Low portfolio to
0.377% for the High portfolio.
The annualized portfolio standard deviation increases with deal failure probability, from
16.7% for the Low portfolio to 22.6% for the High portfolio. As a result, the Sharpe ratio
decreases from 1.106 to 0.393. Among all the three portfolios, only Low is positively skewed,
and the portfolio skewness decreases from Low to High. This is opposite to the relationship
between deal failure probability and the individual stock return moments shown in Table 3,
suggesting that aggregation of stocks into portfolios can change skewness significantly.20
The relationship between deal failure probability and portfolio skewness thus suggest that,
compared to the High portfolio, the Low portfolio is less likely to have an extremely low
return. The mean of the Low-High portfolio is 0.793% which is economically large, and is
significant at the 5% level.
The results in Table 4 are consistent with the Hypothesis. Next, I investigate whether the
difference in excess return between Low and High can be explained by systematic risk
exposures.
I report factor adjusted portfolio alphas in Table 5. The market model adjusted alphas are
0.808%, 0.008%, and -0.198% for Low, Medium, and High, respectively. Betas for the three
portfolios are 0.633, 0.851, and 1.007, respectively. The beta for the Low-High portfolio is -
0.374, which is significantly negative.
[Insert Table 5 here]
The alpha of the Low portfolio is significantly positive, while the other two are not
significantly different from zero. The market-adjusted alpha for the Low-High portfolio is
1.006%, which is significant at the 1% level. The market-adjusted alpha is also higher than
the difference in excess return reported in Table 3. This is because the long side has lower
beta than the short side. Adjusting for the three Fama and French (1993) factors, the Carhart
(1997) momentum factor, and the Pastor and Stambaugh (2003) liquidity factor (I will refer
20
A similar finding is that the aggregate market displays negative skewness, but on average individual stock
returns display positive skewness (see Albuquerque (2012) for an interpretation).
-
22
to this as five-factor adjustment) does not change the results significantly. The five-factor
alphas for the Low, Medium, High and the Low-High portfolio are 0.766%, -0.038%, -
0.293%, and 1.060%, respectively.
Mitchell and Pulvino (2001) find that, on average, deals are more likely to fail in down
markets. I follow Mitchell and Pulvino (2001) and use a piecewise linear regression model:
Rportfolio,t-Rf=(1-UPt)[αlow+βlow(Rm,t-Rf)]+ UPt[αhigh+βhigh(Rm,t-Rf)]+β’Xt+εt
subject to: αlow+βlow(Threshold)= αhigh+βhigh(Threshold) (4)
where Rportfolio,t is the monthly Low-High portfolio return, Rm,t is the value weighted market
return, Rf is the risk free interest rate, and UPt is a dummy variable which is equal to 1 when
the market excess return is greater than a threshold. X represents the other factors. The second
equation shown above ensures continuity.
Another way to evaluate downside risk is to examine the coskewness of a portfolio. Thus,
I also examine whether the Low-High portfolio has exposure to the coskewness risk
mimicking factor proposed by Harvey and Siddique (2000). Coskewness is measured as:
βî=
E[εi,tεm,t2 ]
√E[εi,t2 ]E[εm,t
2 ]
(5)
where εi,t=Ri,t - αi - βiRm,t. Ri,t and Rm,t are the excess return for stock i and the excess return for
the market. εm,t is equal to Rm,t - μm,t , where μm,t is the mean of Rm,t. In each month, all the
stocks in NYSE/AMEX/NASDAQ are sorted into three portfolios based on the value of
coskewness which is estimated based on monthly data of the past 5 years. The 30% of stocks
with the most negative coskewness is classified into an S- portfolio and the 30% of stocks
with the most positive coskewness in each month is classified into an S+ portfolio. Both S
--
S+ and S
-- Rf
will be used as the coskewness hedging portfolios for the portfolio abnormal
return analysis.
Table 6 reports the results. Panel A reports the downside risk analysis using the method
proposed by Mitchell and Pulvino (2001) and Panel B reports the coskewness risk analysis.
Panel A shows that both the upside beta and the downside beta are significantly negative, but
-
23
the downside beta is much larger in magnitude. The difference between the upside beta and
the downside beta is significant at 1% level. Panel B shows that Low-High portfolio has
negative exposure on the two coskewness factors. The coskewness loading on Low-High
portfolio is -0.411 and -0.294 when S-- S
+ and S
-- Rf are used to construct the mimicking
portfolio, respectively. In sum, the results show that there is less downside risk for deals that
are less likely to fail.
[Insert Table 6 here]
The results in Table 5 and Table 6 show that Low-High portfolio has negative beta and
negative downside risk. This is consistent with Bhagat, Brickley, and Loewenstein (1987),
who view acquirer offers as put options to target shareholders. Bhagat, Brickley, and
Loewenstein (1987) argue that an offer from an acquirer is similar to a put option with the
strike price as the offer price and target standalone value as the underlying. Put options help
hedge the risk resulted from the fluctuation of the target value. As the exercisability of put
options is contingent on the deal going through, the effectiveness of hedging is determined by
the probability of deal failure. The lower the deal failure probability, the better is the hedging
value of the put option. Thus, the Low-High portfolio is essentially a long position in these
contingent put options. If the average target stock has positive beta, on average, the price of
put options will increase when market is going down (negative beta), and the price of put
options increases more quickly in the down market than price decreases in the up market
(negative downside risk), due to increasing option delta.
Overall, the results in Tables 4, 5 and 6 confirm the prediction of probability weighting.
That is, the target stocks in deals with low failure probability are undervalued, but target
stocks in other deals do not show significant mispricing, suggesting that deal failure
probability is only overweighed when it is small. The long-short portfolio has negative beta
and negative downside risk, suggesting that the abnormal return of the low failure probability
portfolio is unlikely to be driven by exposure to systematic risk. Overall, these results support
my Hypothesis.
-
24
4.4 The magnitude of overweighting
In this section, I examine the economic magnitude of overweighting for an average deal
in the Low portfolio. Based on the structure assumed in Figure 1, by definition, the target
stock return is equal to the expected payoff divided by the current target trading price:
(1 + 𝑅𝑓 + 𝐸𝑅 + 𝐴𝑅)𝑇−𝑡 =
Palone*π+Palone*(1+premium)*(1- π)
P,
(6)
where Rf is the risk free rate, ER is expected return, AR is abnormal return, T-t is the length of
time from deal announcement to deal resolution, Palone is target standalone price, P is the post
announcement target trading price, and π is deal failure probability. For investors who
overweight the small deal failure probability, P satisfies the following equation:
P=[Palone*w(π)+Palone*(1+premium)*(1-w(π))]
(1+𝑅𝑓+ER)T-t
(7)
where w(π) is the decision probability. Combining equation (6) and (7), we can infer w(π)
from the data.
As shown in Table 3, the average deal failure probability (π) is 5.2%., the average
duration (T-t) is 100.87 days or 3.32 months (100.87/ (365/12)) and the average premium
(premium) is 42.3% (exp(0.353)-1). As shown in Table 4, the average abnormal return (five
factor alpha) for the Low portfolio is 0.766% per month (AR). The average excess return is
1.170% (Table 4) and the five-factor alpha is 0.766%. So the average “normal” excess return
per month is 0.404% (ER). The average risk free rate from 1984 to 2010 is 0.364% per month
(Rf). With these statistics, w(π) equals 13.2%. The 90 percent confidence interval of AR is
from 0.343% to 1.189%. Under this range, w(π) is estimated to be from 8.8% to 17.5%. This
is consistent with the experimental studies. For example, based on the probability weighting
function proposed by Tversky and Kahneman (1992), 5.2% should be overweighted as
11.42%, which is quite close to the above estimates.
4.5 Robustness
Several robustness tests are presented in Table 7. The five-factor alpha of pure cash offers
is 0.748%, which is statistically significant. I also report results for the subsample of deals
-
25
that involves change of corporate control. Change of corporate control is defined as deals in
which the acquirer owns less than 50% of the target but seeks to own more than 50%. This
subsample includes 10,353 deals. The Low-High abnormal return of this subsample is
1.058%, which is both economically and statistically significant. I also examine the first
offers separately. First offers are the targets that did not receive any acquirer offers in the past
365 days. This eliminates the possibility the portfolios contains more than one position in a
single target company. The Low-High portfolio alpha is 0.930%, which is similar to the
whole sample results.
In Panel B of Table 7, I report the five-factor alpha of the Low-High portfolio for two
subperiods. The first subperiod is from 1984 to 1996 and the second is from 1997 to 2010.
Alphas are 0.763% and 2.115%, respectively, and both are significantly positive. The result
for the first subperiod is weaker than the second subperiod. This is perhaps due to the
predicted failure probability performing slightly worse in the first half of the sample period
(as shown in Figures 3).
In the main analysis, I use a holding period that starts from the third day after deal
announcement and extends 180 days after deal announcement, or until the deal is completed
or withdrawn, whichever comes first. However, the length of the holding period is arbitrary.
In Panel C, I report the results for alternative holding periods: 30 days, 60 days, 100 days,
and 365 days. The abnormal returns from these four models are around 0.960-2.226% and are
all statistically significant. Interestingly, the abnormal return decreases when the holding
period increases. I conjecture that this may be because the risk of deal failure is mainly
concentrated in the first two months after deal announcement.
[Insert Table 7 here]
In Panel D, I report the five-factor alpha of the Low-High portfolio using the price-based
deal failure probability estimate. The alpha and beta are 1.040% and -0.598. These are similar
in magnitude to those calculated using the characteristics-based deal failure probability
measure, confirming my earlier findings.
Next, I attempt to carefully account for two issues that can potentially affect my
conclusions. First, the structure shown in Figure 1 is a simplified version of the real world. In
-
26
reality, competitors may join in bidding and the offer price may be subject to revisions. I
gauge the economic importance of revisions following Baker and Savasoglu (2002). The data
on offer price revisions is more complete since 1997. I thus focus on the later sample
period.21
First, I tabulate the frequency of revisions for the three portfolios of targets in my
sample. From 1997 to 2010, I have 7,205 deals in total, of which 717 have their offer price
revised. 205 are revised down and 512 are revised up. Low, Medium and High have 1,371,
2,734 and 3,100 deals, respectively. 4.9%, 2.6% and 2.1% are revised down, and 8.9 %, 5.7%
and 7.5% are revised up for the three portfolios of deals. Conditional on downward revision,
the average revision ratios (defined as final offer/initial offer-1) are -8.80%, -15.01% and -
18.20% for Low, Medium, and High, respectively. Conditional on upward revision, the
average revision ratios are 18.88%, 16.17%, and 19.24%, respectively. The targets in the Low
portfolio have higher downward revision probabilities, but, conditional on downward revision,
the revision ratio is lower. These targets also have higher upward revision probabilities.
Overall, the revision analysis does not reveal evidence that revisions favor my portfolios
differentially. Still, I recalculate the portfolio returns by excluding the deals with offer price
revisions. The results in Panel E of Table 7 show that the five-factor alpha is 1.800%. This is
slightly lower than the alpha (2.115%) in the same time period without excluding offers with
revisions. But the change is relatively small.
Second, to gauge the effect of uncertainty about the entry of competing offers, I build a
new deal failure probability model which explicitly takes competing offers into account.
Specifically, I redefine deal failure as targets that are not acquired by any acquirers in a
specific period after deal announcement (365 days in the empirical analysis). Different from
the previous definition of deal failure, the new definition excludes the deals in which the
target is acquired by some other acquirer. I use the same set of independent variables as in
Table 2 to model deal failure probability and calculate the Low-High portfolio returns. I
report the five-factor analysis of the Low-High portfolio in Panel F of Table 7. The results
show that the Low-High alpha is 1.319%; this is significant at the 1% level, suggesting that
considering competing offers strengthens the results.
21
The results are robust if I analyse the effect of revisions using the whole sample. For details, see Part II of the
Online Appendix.
-
27
4.6 Testing alternative hypotheses
4.6.1 Skewness preference without probability weighting
As shown in Table 3, target stocks in deals with low failure probability are more
negatively skewed than target stocks in other categories. Many traditional utility functions
feature skewness preference, for example, CRRA utility.22
The rare disaster models (such as
Barro (2006)) explicitly consider extreme negative state with small probabilities. When
investors care about diversification and do not face frictions to do so, coskewness will affect
asset prices but idiosyncratic skewness will not (Kraus and Litzenberger, 1976; Harvey and
Siddique, 2000). In the rare disaster models, only systematic rare disaster risks are priced but
not idiosyncratic extreme negative states. As shown in Table 6, the Low-High portfolio has
negative coskewness risk. Therefore, my findings cannot be explained by the traditional
models without frictions to diversify. However, in the real world, investors are not well-
diversified (Odean, 1999; Mitton and Vorkink, 2007; Goetzmann and Kumar, 2008).23
Under-diversification leads to the pricing of idiosyncratic skewness. I thus examine whether
skewness preference without probability weighting and the rare disaster models can explain
my findings in a world in which investors cannot fully diversify. For simplicity, I assume that
investors cannot diversify at all.
I model the expected stock return of a target stock as shown in Figure 1. I choose CRRA
utility as perhaps it is the most widely used utility function in the finance literature. Barro
(2006) also uses CRRA utility. The CRRA utility is UCRRA=-C1-η
, where C is the consumption
and η is the relative risk aversion coefficient. A CRRA investor who cannot diversify will
value the target stock as:
U(P)=πU(Palone)+(1-π)U(Poffer). (8)
The target stock expected return is defined as
22
An investor’s expected utility of end of period wealth, E[U(W)], can be expressed, by Taylor expansion, as
U(W0)+[U’’(W0)/2!]σw2+[[U’’’(W0)/3!]sW
3+terms of higher orders, where W0 is the mean of W, σw
2 is the
variance of W, and sW3 is the skewness of W. Typically, U’’0,
implying preference for skewness. For CRRA utility, U(W)=(1-θ)-1
W1-θ
(θ>0 and θ≠1), U’’’= θ(θ+1) W
θ+2,
which is greater than 0, implying skewness preference (Kraus and Litzenberger, 1976). 23
There are many plausible explanations for lack of diversification, such as narrow framing, fixed trading costs
(Brennan, 1975), search costs (Merton, 1987), or returns to specialization in information acquisition (Van
Nieuwerburgh and Veldkamp, 2010).
-
28
ER= [π Palone+ (1-π)Poffer]/P-1. (9)
The target standalone value Palone is standardized to 1, and Poffer is chosen to be 1.30. 30%
is to match the average merger and acquisition premium from the data. From these
assumptions, we can calculate the target stock expected return.
[Insert Figure 4 here]
Figure 4 shows the expected target stock returns with respect to deal failure probability,
for five risk aversion coefficients ranging from 1 to 5. We can see from Figure 4, though
CRRA utility functions have feature of positive skewness preference, the skewness effect is
dominated by aversion to variance, thus target expected return is highest when deal failure
probability is moderate. This contradicts my finding that targets in deals with moderate
failure probability (the Medium group and the High group) have a lower return than targets in
deals with low failure probability (the Low group).
4.6.2 The disposition effect
Grinblatt and Han (2005), and Frazzini (2006) argue that the disposition effect can lead to
excess selling after price increases, which drives the current stock price below fundamental
value and consequently yields higher future stock returns. Typically, an M&A announcement
is “good news” for the target shareholders; the disposition effect will therefore predict an
under-reaction. This effect may be stronger for deals with low failure probability as their
initial price run-up is likely to be higher. In other words, in deals with low failure probability,
the target price will increase more on announcement, which will mean higher capital gains
for existing shareholders. The disposition effect will make these shareholders more likely to
sell the stock. This may depress the stock’s price beyond fundamentals and thereby lead to
higher returns in the near future.
I use the cross-sectional regression framework of Baker and Savasoglu (2002) to test this
alternative explanation. The dependent variable is the realized target return from 3 days after
deal announcement to 25 trading days after deal announcement. Ideally, I would like to
measure our dependent variable over the same and relatively long horizon for all deals. Too
short horizon returns may be too noisy. There are considerable cross sectional variations in
deal duration. The median duration is 134 days. 11.1% of the deals are completed or
-
29
withdrawn within 30 days, while 30.2% are completed or withdrawn within 60 days. One
month seems the best choice which can give a relatively long horizon and also make most
dependent variables to be measured over the same horizon (only 11.1% are measured in
horizons less than a month).24
The key independent variable is deal failure probability π. Overweighting of small
probabilities predicts that the coefficient on π is negative. I include the target past return and
the target announcement return in the regression to capture the impact of the disposition
effect. Since the returns are overlapping in time and thus are not independent, I cluster the
standard errors by month following Rogers (1993).
Rtgt,3-25=α+β1πi+β2TargetPastReturni+β3TargetAnnouncementReturni+Controls+εi (10)
[Insert Table 8 here]
Table 8 shows the results. Model (1) reports the univariate regression results. As expected,
the coefficient of π is significantly negative, confirming the previous results based on the
calendar-time portfolio method. In model (2), I control for other deal characteristics, such as
the attitude of the target, payment method, and target size. In the stylized model shown in
Figure 1, π*(1-π) measures the volatility of the target stock return. So, following Baker and
Savasoglu (2002), I also control for π*(1-π). Target Size is the natural logarithm of target
firm market capitalization 22 trading days prior to deal announcement. After controlling for
these factors, the coefficient of π is still significantly negative.
In Models (3), (4) and (5), I test whether the disposition effect can explain my results by
including target past return and the target announcement return as regressors. The coefficients
on these regressors are both positive, consistent with the prediction of the disposition effect
hypothesis. However, controlling for them decreases the magnitude of the coefficient of π by
only around 20%, and therefore does not change my main conclusion, as the coefficient of π
is still significantly negative.
24
The results are robust if I measure the dependent variable from 3 trading days after deal announcement to 47
trading days after deal announcement. Please see Part III of the Online Appendix.
-
30
4.7 Limits to arbitrage
It is reasonable to think that arbitrageurs may not overweight small deal failure
probability. However, why they cannot arbitrage away the positive abnormal return of the
Low-High portfolio? I therefore examine whether there are any limits to arbitrage that
prevent arbitrageurs from doing so?
My study is related to a popular hedge fund trading strategy called merger arbitrage. After
deal announcement, the target stock price is typically lower than the offer price. Merger
arbitrage refers to the strategy that attempts to profit from this spread. In doing so,
arbitrageurs buy the target stock, and if the offer involves stock, sell short the acquirer stock
to hedge the risk of fluctuations in the acquirer stock price. Evidence shows that merger
arbitrageurs have a significant impact on target prices during acquisition deals. If limits to
arbitrage help to explain the undervaluation of targets in deals with low failure probability,
we would expect that the relation between deal failure probability and target returns to be
stronger when arbitrage is more difficult.
First, I examine whether merger arbitrageurs exploit this arbitrage opportunity. If so, they
should trade more of the stocks in the Low portfolio rather than the High portfolio. I infer
whether they indeed do this by examining the correlation of the performance of the real world
merger arbitrageurs and the Low and High portfolios. I use the Barclay Merger Arbitrage
Index to proxy the performance of real world merger arbitrageurs. Due to data availability,
the period is January 1997 to December 2010. We regress the excess return of the Low
portfolio and the High portfolio on the excess return of the Barclay Merger Arbitrage Index,
the Fama and French three factor, Carhart (1997) momentum factor, and the Pastor and
Stambaugh (2003) liquidity factor, the coefficients of the excess return of the Barclay Merger
Arbitrage Index are 0.657 (t=2.47) and 0.088 (t=0.16), respectively. Only the former is
statistically significantly different from zero.25
This suggests that real world merger
arbitrageurs indeed trade more of the Low stocks than High stocks.
25
The results are similar if we use the Merger Arbitrage Index from Hedge Fund Research. If so, the
coefficients of the excess return of the Merger Arbitrage Index are 0.779 (t=3.00) and 0.061 (t=0.11),
respectively.
-
31
Next, I examine what limit the arbitrage capacity of merger arbitrageurs. Starting with the
model specification in equation (10), I add interaction terms between deal failure probability
and proxies for limits to arbitrage to examine whether limits to arbitrage moderate the
relation between deal failure probability and target returns.
Rtgt,3-25 = α+ β1πi + β2TargetPastReturni+β3TargetAnnouncementReturni
+γLimit-to-Arbitragei+ηLimit-to-Arbitragei*πi+Controls+εi (11)
If limits to arbitrage constrain the ability of arbitrageurs to arbitrage away the mispricing, we
would expect the inverse relationship between π and Rtgt,3-25 to be stronger when arbitrage is
more difficult.
The limits to arbitrage literature emphasizes the role of limited capital. In the presence of
slow moving capital, merger arbitrage capital is more likely to be constrained if merger
arbitrageurs have experienced losses in the recent past (Baker and Savasoglu, 2002; Mitchell,
Pedersen and Pulvino, 2007; Mitchell and Pulvino, 2012) or when more arbitrage capital is
needed (Baker and Savasoglu, 2002). I use the total market capitalization of failed deals in
the past 365 days to proxy for arbitrageurs’ recent losses and use the total market
capitalization of pending deals to proxy for the needed capital.
Model (1) of Table 9 shows that the coefficient of the interaction between deal failure
probability and the cumulative arbitrageur losses is negative. Model (2) shows that the
coefficient of the interaction between deal failure probability and needed arbitrage capital is
also negative. Both results are consistent with the conjecture that limited capital constrains
arbitraging activities.
[Insert Table 9 here]
Another measure of limit to arbitrage is based on payment type. In cash deals,
arbitrageurs only need to have a position in the target stock; but in stock deals, they need to
have two positions: a long position in the target stock and a short position in the acquirer
stock. The arbitrageurs incur direct costs for both the long and the short positions, and often
do not receive the full interest on the short sale proceeds (Baker and Savasoglu, 2002). Using
proprietary equity loan data, Geczy, Musto, and Reed (2002) find that acquirers’ stock is
expensive to borrow and accounting for short selling costs decreases arbitrage profits
-
32
significantly.26
Such arbitrage difficulty is likely to be lower for cash deals than for stock
deals.
The results are shown in model (3) in Table 9. The prediction here is that the relation
between deal failure probability and target returns should be weaker for cash deals. In model
(3), the interaction between deal failure probability and Pure Cash is positive, and the
interaction between deal failure probability and Pure Stock is negative though not statistically
significant, consistent with the argument that cash deals are easier to arbitrage than other
deals.
Firm size is another variable widely used to measure arbitrage difficulty. However, in
merger arbitrage, the relation between firm size and arbitrage difficulty is ambiguous. On one
hand, larger firms are more liquid and may attract more arbitrageurs. On the other hand,
Baker and Savasoglu (2002) argue that larger targets need more arbitrage capital; they find
that larger target firms indeed have higher returns. In Table 8, target size is positive in all four
models with target size as a control and is significant in three of them. Nevertheless, I report
the result of using firm size as a measure of limits to arbitrage in model (4) of Table 9.
However, I do not find that target firm size significantly moderates the relation between deal
failure probability and target returns.
5. Conclusion
A large body of experimental evidence shows that decision makers tend to overweight the
probability of tail events. This paper provides direct evidence on probability weighting using
data from mergers and acquisitions. I estimate the deal failure probability based on (1) deal
characteristics, and (2) by comparing the target price and the offer price. Next, I test for
probability overweighting by examining the future target stock returns, conditional on the ex-
ante failure probability of the deal. Overweighting small failure probabilities lowers the target
stock price and yields positive abnormal future stock returns. I confirm this prediction and
find that the target stocks with low failure probability yield significant positive stock returns
in the period between deal announcement and deal resolution, even though these deals have
26
Geczy, Musto, and Reed (2002) analyse equity loans for initial public offerings, DotCom stocks, large cap
stocks, growth stocks, low-momentum stocks and acquirers’ stocks in merger arbitrage. They conclude that “the
effect of short-selling frictions appears strongest in merger arbitrage”.
-
33
lower beta and lower downside risk than the other deals. On average, 5.2% failure probability
is overweighted to 13.2%. The overweighting magnitude is also consistent with the
probability weighting function of Tversky and Kahneman (1992).
There are at least two paths for further research. First, probability weighting can be driven
by either erroneous beliefs about the objective probability or by investors simply
overweighting tail event in their preferences. I try to distinguish between these two
possibilities in Part IV of the Online Appendix and show preliminary evidence that appears
inconsistent with the former. However, this evidence is not conclusive. Further research in
this direction is needed. Second, Cornelli and Li (2002) and Hsieh and Walkling (2005) show
that merger arbitrageurs play an important role in solving the free rider problem inherent in
mergers and acquisitions. It may be interesting to examine further how investors’
overweighting of small deal failure probabilities affects trading behavior of the merger
arbitrageurs, and how this might affect the incentives of potential bidders to launch a
takeover bid.
References
Albuquerque, Rui, 2012, Skewness in Stock Returns: Reconciling the Evidence on Firm
versus Aggregate Returns, Review of Financial Studies 25 (5): 1630-1673.
Amaya, Dieg