  • Probability Weighting and Asset Prices: Evidence from

    Mergers and Acquisitions*

    BAOLIAN WANG†

    ABSTRACT

    For mergers and acquisitions with a small failure probability, the average decline in target

    stock price if the deal fails is much larger than the average increase that accompanies deal

    success. Probability weighting implies that the deal failure probability of such target stocks

    will be overweighted, leading them to be undervalued. I find strong supporting evidence:

    a trading strategy delivers an abnormal return of around 1% per month, with negative beta and

    negative downside risk exposure. The abnormal returns are not subsumed by a preference for

    positive skewness under traditional (expected) utility models, and are stronger when arbitrage

    is more difficult.

    Keywords: Mergers and acquisitions, Probability weighting, Cumulative prospect theory

    JEL classification: G12, D03, G34

    * I thank Federico Bandi, Nicholas Barberis, Justin Birru, Philip Bond, Charles Cao, John Chalmers, Kalok

    Chan, Sudheer Chava, Te-Feng Chen (discussant), Darwin Choi, James Choi, Sudipto Dasgupta, Kathryn

    Dewenter, Vidhan Goyal, Bing Han, Lawrence Jin, Andrew Karolyi (discussant), Peter Kelly, George Korniotis,

    Alok Kumar, Kai Li, Mark Loewenstein, Dan Luo, Abhiroop Mukherjee, Kasper Nielsen, Gong Ning

    (discussant), Jun "QJ" Qian, Mark Seasholes, Rik Sen, Christoph Schneider (discussant), Andrew Siegel, Lei

    Sun, K.C. John Wei, Mark Westerfield, Liyan Yang, Haishan Yuan, Chu Zhang, Guochang Zhang, Zilong

    Zhang and the seminar participants at Cornerstone Research, Fordham University, Georgia Institute of

    Technology, HKUST, Johns Hopkins University, Oxford, Peking University (Guanghua), Shanghai Advanced

    Institute of Finance, Shanghai University of Finance and Economics, Sun Yat-Sen University, University of

    Miami, University of South Carolina, University of Toronto (Econ), University of Washington, the EFA

    Doctoral Tutorial, the Financial Intermediation Research Society Conference 2014, the Third ECCCS Workshop

    on Governance and Control, the Doctoral Consortium of the Financial Management Association, the Asian Financial

    Association (2014), and the PhD forum of the 24th Australasian Finance and Banking Conference for helpful

    discussions. This paper was awarded the Best Paper Award at the Third ECCCS Workshop. An earlier version

    of the paper was circulated under the title “Is Skewness Priced and Why? Evidence from Target Pricing in

    Mergers and Acquisitions”. All errors are my own.

    † Area of Finance and Business Economics, Fordham School of Business, Lincoln Center Campus, Room 1328,

    1790 Broadway, New York, NY 10019. Tel: (+1) 212-636-6118. Fax: (+1) 646-312-8295. Email:

    [email protected].

    1. Introduction

    A substantial body of experimental research -- starting with Kahneman and Tversky

    (1979) -- shows that decision makers tend to overweight the probability of tail events, such as

    winning a lottery.1

    Researchers have proposed different mechanisms to explain this

    “probability weighting” phenomenon. For example, some have suggested that tail events tend

    to be discussed disproportionately and are easier to recall (Tversky and Kahneman, 1973;

    Lichtenstein, Slovic, Fischhoff, Layman and Combs, 1978), and others have suggested that

    such events are psychologically more salient (Bordalo, Gennaioli, and Shleifer, 2012, 2013).2

    The asset pricing implication of probability weighting is the following: it leads to

    overvaluation of lottery-type assets (assets with a small probability of a large upside gain as

    the probability of large gain will be overweighted) and undervaluation of disaster-type assets

    (assets with a small probability of a large downside loss as the probability of large loss will

    be overweighted). In this paper, I present a novel way of investigating probability weighting

    in the context of mergers and acquisitions (M&As), by examining the price behavior of the

    target stocks in the period between deal announcement and deal resolution.

    In many ways, the M&A market provides an ideal setting in which to examine probability

    weighting. First, after deal announcement, the primary concern faced by the target company

    shareholders is whether or not the deal will be completed.3 To a first approximation, this

    setting involves binary payoffs [(completed, not completed)]; this is consistent with the way

    most experimental studies analyze probability weighting (for example, Kahneman and

    Tversky, 1979; Tversky and Kahneman, 1992), and fits with the way the existing literature

    models probability weighting (Barberis and Huang, 2008; Bordalo, Gennaioli, and Shleifer,

    2013). Second, as I show in Section 3.2, one can measure deal failure probabilities

    objectively -- based on deal characteristics such as the attitude of the target company and the

    payment method -- with relative ease and precision. At the same time, the decision

    probability that investors use to price target company stocks can also be imputed from the

    1 For a review of this literature please see Fehr-Duda and Epper (2012).

    2 For more discussions of the psychology of tail events, see Burns, Chiu, and Wu (2010), and Barberis (2013).

    3 Other risks include the possibility of entry of competing acquirers and revisions of the offer price. Using a

    sample of M&As from 1981 to 1996, Baker and Savasoglu (2002) find that these risks are second order

    relative to deal completion risk.

    target stock prices, or equivalently, their expected future stock returns. Third, announced

    M&As commence and conclude in discrete and typically short intervals.4 This clearly defines

    the period in which possible mispricing associated with probability weighting may arise.

    I test this prediction in two steps. First, I construct an empirical measure of deal failure

    probability using a logistic specification. I integrate the previous literature (Walkling, 1985;

    Samuelson and Rosenthal, 1986; Baker and Savasoglu, 2002; Bates and Lemmon, 2003;

    Officer, 2003; Bhagat, Dong, Hirshleifer, and Noah, 2005; Bates, Becher, and Lemmon, 2008;

    Baker, Pan and Wurgler, 2012) by incorporating a rich set of variables about acquirer

    characteristics, target characteristics, and deal characteristics. The failure prediction model

    works very well out of sample in predicting actual deal outcomes. Second, I use a calendar

    time portfolio approach, similar to Mitchell and Pulvino (2001), Baker and Savasoglu (2002),

    and Giglio and Shue (2013) to analyze the target stock returns in the period between deal

    announcement and deal resolution.
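As a rough illustration of the first step, the sketch below fits a logistic failure model by gradient ascent. It is not the specification estimated in this paper: the two dummy regressors (a hostile-attitude indicator and a stock-payment indicator), the cell counts, and the function names are hypothetical and serve only to show the mechanics of turning deal characteristics into a predicted failure probability.

```python
import math

# Synthetic cell counts, NOT taken from the paper:
# (hostile, stock_payment, number_of_deals, number_of_failures)
CELLS = [
    (0, 0, 400, 16),
    (0, 1, 400, 32),
    (1, 0, 100, 35),
    (1, 1, 100, 45),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(cells, lr=1.0, steps=5000):
    """Maximize the mean log-likelihood of a logit model by gradient ascent."""
    n_total = sum(n for _, _, n, _ in cells)
    beta = [0.0, 0.0, 0.0]  # intercept, hostile, stock_payment
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for hostile, stock, n, fails in cells:
            x = (1.0, hostile, stock)
            p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
            for j in range(3):
                # Score contribution of the cell: (observed - expected failures) * x_j
                grad[j] += (fails - n * p) * x[j]
        beta = [b + lr * g / n_total for b, g in zip(beta, grad)]
    return beta


def failure_probability(beta, hostile, stock):
    """Predicted deal failure probability for one deal."""
    return sigmoid(beta[0] + beta[1] * hostile + beta[2] * stock)


beta = fit_logit(CELLS)
print("coefficients:", [round(b, 3) for b in beta])
print("friendly cash deal:", round(failure_probability(beta, 0, 0), 3))
print("hostile stock deal:", round(failure_probability(beta, 1, 1), 3))
```

In an actual application the regressors would be the acquirer, target, and deal characteristics described above, and the fit would be checked out of sample.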

    Ideally, I would compare deals whose failure probability is small with deals whose failure

    probability is close to one. In practice, however, very few deals have a large failure probability:

    only 0.89% of the whole sample has a failure probability higher than 80%. Therefore, I focus

    on examining deals with small failure probabilities and deals with moderate failure

    probabilities.

    If investors tend to overweight tail events, as theory and experiments suggest, target

    company stocks should be underpriced when deal failure probability is small. Therefore, they

    should earn positive abnormal returns in the period from deal announcement to deal

    resolution. However, when deal failure probability is moderate, there should be no probability

    overweighting and, therefore, no abnormal returns on targets in such deals.

    I sort all the deals into three groups based on the deal failure probability -- deals with

    failure probability below 10% are classified as targets with low failure probability (the Low

    group), deals with failure probability between 10% and 20% are classified as targets with

    4 In my sample, the mean (median) duration from deal announcement to resolution is 134 (100) days.

    medium failure probability (the Medium group), and all the others are classified as targets

    with high failure probability (the High group).5
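The sorting rule can be written as a small function. This is a minimal sketch: the function name is hypothetical, and the treatment of probabilities exactly equal to 10% or 20% is an assumption, since the text does not specify it.

```python
def failure_group(p_fail: float) -> str:
    """Map a predicted deal failure probability to the Low/Medium/High group."""
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if p_fail < 0.10:
        return "Low"
    if p_fail < 0.20:  # assumption: exactly 10% goes to Medium, exactly 20% to High
        return "Medium"
    return "High"


for p in (0.052, 0.15, 0.45):
    print(f"failure probability {p:.3f} -> {failure_group(p)}")
```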

    The empirical results strongly support the prediction of probability weighting for target

    stocks in M&A deals. First, I find that the target stocks in the Low group yield around 0.80%

    abnormal return per month between deal announcement and resolution. In terms of

    magnitude of probability weighting, the results show that, on average, 5.2% failure

    probability is overweighted to 13.2%. Second, when I examine target stocks in the Medium

    group and the High group, I find no evidence of abnormal returns. A trading strategy that

    buys target company stocks in the Low group and sells short target company stocks in the

    High group (the Low-High portfolio) yields an alpha of 0.75% to 2.23% per month,

    depending on sample selection and model specifications. I also find that, compared to the

    short side of the portfolio, the long side of the portfolio has a lower beta, lower downside risk,

    and lower coskewness risk as measured by Harvey and Siddique (2000), suggesting that the

    positive abnormal return is unlikely to be driven by systematic risk exposure. My results are

    robust to subsamples, subperiods, and alternative models of deal failure probability.

    Are the arbitrageurs aware of this arbitrage opportunity and, if so, what prevents them from

    arbitraging away all the mispricing? First, I find that the correlation between the real world

    merger arbitrage index and the Low portfolio return is much higher than its correlation with

    the High portfolio, suggesting that merger arbitrageurs indeed trade the target stocks with

    low failure probabilities more than the ones with high failure probability. Second, the return

    difference between the stocks from the Low group and the stocks from the High group is

    larger when merger arbitrage capital is more constrained: when more deals failed in the

    recent past, when more deals are pending, and when arbitrage involves more trading and

    costly short selling.

    I examine two alternative explanations. First, for mergers and acquisitions with small

    failure probabilities, the average decline in the target stock price if the deal fails is much

    larger than the average increase that accompanies deal success. This implies that targets in

    5 The literature does not give an unambiguous cutoff for how small a probability must be to lead to

    overweighting. Therefore, in the sorting, I use a finer classification for deals with failure probability lower than

    20%. The results are robust if I further split the High group into three categories: between 20% and 30%,

    between 30% and 40%, and others.

    such deals have negative return skewness. Many traditional utility functions, for example, the

    CRRA utility function, also exhibit a preference for skewness. The rare disaster models

    (Barro, 2006; Gabaix, 2012) explicitly consider extreme negative states with small

    probabilities. I thus examine whether the findings can be explained by skewness preference

    implied from the traditional utility functions or by the rare disaster models.

    The traditional skewness preference models (Harvey and Siddique, 2000; Kraus and

    Litzenberger, 1976) and the rare disaster models (Barro, 2006; Gabaix, 2012) predict that,

    when investors care about diversification and do not face frictions to do so, systematic risks

    (coskewness and systematic disaster risk) will affect asset prices but idiosyncratic risk

    (idiosyncratic skewness or idiosyncratic downside risk) will not. However, at the portfolio

    level, the Low-High portfolio has negative coskewness risk and negative systematic

    downside risk. After proper adjustment, the alpha of the Low-High portfolio becomes even

    larger. I also show that, even when we introduce frictions to diversification, the traditional

    skewness preference models and the rare disaster models cannot explain my findings.

    Second, Grinblatt and Han (2005), and Frazzini (2006) argue that the disposition effect

    can lead to excess selling after price increases, which will drive the current stock price below

    the fundamental value and consequently yield higher future stock returns. Typically, an M&A

    announcement is “good news” for the target shareholders. The disposition effect will

    therefore predict under-reaction. This effect may be stronger for deals with low failure

    probability, as their initial price run-up is likely to be higher. In other words, in deals with

    low failure probability, the target price will increase more on announcement, which will

    mean higher capital gains for existing shareholders. The disposition effect will make these

    shareholders more likely to sell the stock. This might depress the stock’s price, beyond

    fundamentals, and therefore lead to higher returns in the near future. I find that both the target

    return prior to deal announcement and the return around the announcement period predict the

    future target stock return, which is consistent with the disposition effect-based explanation

    outlined above. However, controlling for the pre-announcement and announcement returns

    does not reduce the significance of my main results.

    My paper fits into a growing literature that applies probability weighting to financial

    market phenomena. On the theory side, Polkovnichenko (2005) uses it to explain the

    under-diversification of household portfolios. Barberis and Huang (2008), and Bordalo, Gennaioli,

    and Shleifer (2013) examine the implications of probability weighting for security prices.

    Brunnermeier and Parker (2005) and Brunnermeier, Gollier, and Parker (2007) endogenize

    probability weighting in an optimal expectation framework where the agent chooses his/her

    beliefs to maximize well-being. Spalt (2013) finds that probability weighting can help explain

    why riskier firms grant more stock options to their employees.

    The empirical literature focuses on the prediction of probability weighting that investors

    will prefer to buy positively skewed assets (Kumar, 2009; Mitton and Vorkink, 2007) and

    will be willing to pay a premium for them. Using various measures of skewness, Boyer,

    Mitton, and Vorkink (2010), Amaya, Christoffersen, Jacobs, and Vasquez (2014), Bali,

    Cakici, and Whitelaw (2011), and Conrad, Dittmar, and Ghysels (2013) document consistent

    evidence. Skewness preference also helps explain the first-day returns and long-run

    underperformance of IPOs (Green and Hwang, 2012), the prices of out-of-the-money options

    (Boyer and Vorkink, 2014), the underperformance of distressed stocks (Conrad, Kapadia, and

    Xing, 2014), and the underperformance of stocks trading in the over-the-counter markets

    (Eraker and Ready, 2014). Schneider and Spalt (2013) argue that managers tend to overpay

    for lottery-type targets and Schneider and Spalt (2013) find that conglomerate companies

    overinvest in high-skewness segments. There are also a smaller number of studies which

    impute probability weighting functions from the S&P 500 index options (Polkovnichenko

    and Zhao, 2013, and Chabi-Yo and Song, 2013a) and currency options (Chabi-Yo and Song,

    2013b).

    However, most of these studies do not explicitly investigate whether the documented

    skewness preference is driven by probability weighting or by other utility specifications, which I do in

    this paper. The binary nature of the outcome variable fits with the way the experimental

    studies and the theoretical studies examine probability weighting. I also investigate the

    magnitude of probability weighting in this paper.

    My findings also contribute to the merger arbitrage (also called risk arbitrage) literature.

    After deal announcement, the target stock price is typically lower than the offer price. Merger

    arbitrage refers to the strategy that attempts to profit from this spread. In doing so,

    arbitrageurs buy the target stock, and if the offer involves stock, sell short the acquirer stock

    to hedge the risk induced by fluctuations in the acquirer stock price. Bhagat, Brickley, and

    Loewenstein (1987), Larcker and Lys (1987), Mitchell and Pulvino (2001), and Baker and

    Savasoglu (2002) find that, on average, target stocks yield positive abnormal returns in the

    period between deal announcement and deal resolution. Larcker and Lys (1987) attribute the

    positive abnormal returns to compensation for information acquisition by merger

    arbitrageurs. Baker and Savasoglu (2002), Geczy, Musto, and Reed (2002), Mitchell,

    Pedersen and Pulvino (2007), Mitchell and Pulvino (2001, 2012), and Officer (2007)

    investigate how limits to arbitrage affect the merger arbitrage market. Giglio and Shue (2013)

    find that the deal success hazard rate decreases as time passes after deal announcement, but

    investors do not fully take this into account when they price target stocks. Cao, Goldie, Liang,

    and Petrasek (2014) find that hedge fund merger arbitrageurs do better than non-hedge fund

    institutional merger arbitrageurs. In this paper, I examine target stock return conditional on its

    ex-ante deal failure probability. My findings suggest that target stock performance is related

    to deal failure probability, and refining merger arbitrage strategies to incorporate this finding

    is likely to improve profits significantly.

    The rest of the paper is structured as follows. In Section 2, I provide more background on

    probability weighting and why it matters for M&As. Section 3 describes the data and the

    method used to model deal failure probability. Section 4 presents the empirical results.

    Section 5 concludes.

    2. Hypothesis development

    2.1 Probability weighting: Some background

    Economists have long recognized that people are more sensitive to the probability of tail

    events than typical events. For example, Tversky and Kahneman (1992) find that the median

    subject in their experiment is indifferent between receiving a lottery with 1% chance of

    winning $200 and a certain $10, and also indifferent between receiving a lottery with 99%

    chance of winning $200 and a certain $188. It is hard for the expected utility model to generate

    such behavior, as in that model, events are weighted linearly by their objective probabilities.

    To accommodate this, researchers propose probability weighting functions which transform

    objective probabilities into decision probabilities. Many models have incorporated this

    mechanism, for example, rank-dependent expected utility models (Quiggin, 1982; Yaari,

    1987; Prelec, 1998) and prospect theory (Kahneman and Tversky, 1979; Tversky and

    Kahneman, 1992). A substantial body of experimental studies shows that incorporating

    probability weighting can more accurately describe people’s decision making under

    uncertainty than expected utility theory (Kahneman and Tversky, 1979; Fehr-Duda and Epper,

    2012).
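To make the idea of a weighting function concrete, the sketch below evaluates the one-parameter form w(p) = p^γ / (p^γ + (1 − p)^γ)^(1/γ) proposed by Tversky and Kahneman (1992). The curvature γ = 0.61 is the value usually cited from that study for gains and is used here only as an illustrative assumption; the function name is hypothetical. The last lines compare the result with the decision weights implied by the two certainty equivalents quoted above, under the additional simplifying assumption of linear utility.

```python
def tk_weight(p: float, gamma: float = 0.61) -> float:
    """One-parameter Tversky-Kahneman (1992) probability weighting function."""
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)


for p in (0.01, 0.10, 0.50, 0.90, 0.99):
    print(f"p = {p:.2f}   w(p) = {tk_weight(p):.3f}")

# Decision weights implied by the certainty equivalents in the text,
# assuming linear utility over these small stakes (an assumption).
implied_low = 10 / 200    # $10 for a 1% chance of $200
implied_high = 188 / 200  # $188 for a 99% chance of $200
print(f"implied weight on 1%:  {implied_low:.3f}")
print(f"implied weight on 99%: {implied_high:.3f}")
```

Small probabilities receive a weight above their objective value and large probabilities a weight below it, which is the inverse-S pattern the experimental evidence describes.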

    Probability weighting can be driven by either erroneous beliefs about objective

    probability (probability estimation error) or by people behaving as if they knowingly assign a

    disproportionately higher weight to tail events (overweighting objective probability). For the

    former, the difference in objective probability and the decision probability reflects the error

    investors make in probability estimation. However, according to the latter view, the disparity

    between objective probability and decision probability is just a modelling device that captures

    investors’ inherent risk preferences (Kahneman and Tversky, 1979).6

    Both views have received considerable attention in the psychology literature. For the

    probability estimation error view, the availability heuristic and anchoring bias are often cited

    as potential reasons why erroneous beliefs lead to overestimation of the probability of tail

    events. The availability heuristic predicts that, when judging the frequency or probability of

    an event, people tend to rely on the ease with which an example of the event comes to mind

    (Tversky and Kahneman, 1973). Tail events are more available as they are discussed

    disproportionally more than typical events and are easier to recall. For example, Lichtenstein,

    Slovic, Fischhoff, Layman and Combs (1978) find that people overestimate the frequency of

    rare but more discussed causes of death like accidents and homicide. Specifically, subjects

    thought that accidents caused about as many deaths as disease and that homicide was a more

    frequent cause of death than suicide. Actually, diseases cause about 16 times as many deaths

    as accidents, and suicide is twice as frequent as homicide.

    Anchoring bias may also be a source of probability weighting. Event space is usually

    specified into coarse categories. Fox and Clemen (2005) and Sonnemann, Camerer, Fox and

    Langer (2013) find that a typical subject in the experiment tends to assign equal probability to

    6 See Barberis (2013) for more details.

    the specified events. For example, we may classify market conditions as up market, down

    market and flat market. Experimental subjects tend to assign the same probability to each

    specified event. For this example, subjects tend to assign 1/3 to up market, down market and

    flat market, even when their true probabilities are different. In reality, normal events in the

    middle of a distribution are typically partitioned more coarsely than events that lie at the

    extremes of a distribution. As a result, the objective probability of a specified tail event is

    typically lower than that of a specified normal event. Equal probability assignment and anchoring

    therefore predict overweighting of tail events.

    Many mechanisms have also been proposed to explain the second view of probability

    weighting – that people behave as if they intentionally put higher weight on tail events. For

    example, Rottenstreich and Hsee (2001) propose that tail events can make people more

    emotional. Recently, Bordalo, Gennaioli, and Shleifer (2012) propose a salience-based theory

    of choice under uncertainty. The psychology literature documents that salience attracts

    attention. For example, according to Kahneman (2011, p324), ‘‘our mind has a useful

    capability to focus on whatever is odd, different or unusual.’’ Based on this, Bordalo,

    Gennaioli, and Shleifer (2012) argue that decision makers’ attention is likely to be drawn to

    salient outcomes, such as extreme tail outcomes, which leads to overweighting of these

    outcomes.

    2.2 Probability weighting in merger and acquisition deals

    There are multiple reasons why probability weighting may matter in M&As. For an

    M&A deal that involves a small deal failure probability, the target stock price is typically

    close to the offer price. If the deal succeeds, which happens with a high probability, by

    definition, the stock price will only increase by a small amount up to the offer price. But in

    the unlikely event that the deal fails, the stock price will drop by a much larger amount. If

    investors recall large losses suffered on previous deal failure disproportionately (availability

    heuristic) more than small gains from previous deal success, investors may overestimate the

    true probability of deal failure. Alternatively, it is also possible that they do not make

    systematic errors in judging the probability itself, and that it is an inherent aversion to large,

    small-probability losses that results in a lower willingness-to-pay for deals with small failure

    probabilities (Rottenstreich and Hsee, 2001; Kahneman and Tversky, 1979; Bordalo,

    Gennaioli, and Shleifer, 2012).7

    2.3 Asset prices with probability weighting

    Several models have been proposed to analyze the asset pricing implication of probability

    weighting. Barberis and Huang (2008) consider an economy in which investors have

    cumulative prospect theory preferences, and examine how a lottery-type security – a security

    with a small probability of large gain and a large probability of small loss – is priced in a

    mean-variance world. They show that, in such an economy, lottery-type securities can

    become overpriced and earn negative abnormal returns.

    Barberis, Mukherjee, and Wang (2014) also model probability weighting using

    cumulative prospect theory. But unlike Barberis and Huang (2008), in which investors

    make decisions at the portfolio level, Barberis, Mukherjee, and Wang (2014) assume that

    investors derive their prospect theory utility from individual stocks (i.e., engage in narrow

    framing). Similarly, Bordalo, Gennaioli, and Shleifer (2013) also assume investors engage in

    narrow framing, but they model probability weighting based on the salience of a payoff. In

    contrast to rank-dependent utility and cumulative prospect theory, where the probability

    weighting function is constant, in Bordalo, Gennaioli, and Shleifer (2013) the probability

    weighting function is context dependent: it depends on the other alternatives and on how a decision

    problem is described. Brunnermeier and Parker (2005) and Brunnermeier, Gollier, and Parker

    (2007) endogenize probability weighting in an optimal expectation framework where the

    agent chooses his/her beliefs to maximize well-being. Though the mechanisms are different,

    all these models predict that lottery-type securities can become overpriced and disaster-type

    securities (securities with a small probability of large loss and a large probability of small

    gain) can become underpriced.

    To illustrate the central point of probability weighting, I use a simple model to show the

    link between deal failure probability and the expected target stock return, and to develop the

    hypothesis. In the model, the deal will either fail or succeed. If the deal succeeds, the target

    7 Although distinguishing between the two alternatives is beyond the scope of this paper, some indicative

    evidence can be found in Part IV of the Online Appendix.

    shareholder receives the offer price Poffer. If the deal fails, the target is worth its standalone

    value Palone, where I assume that Poffer>Palone. Poffer and Palone are assumed to be constant and

    known.8 The deal failure probability is π, where 0 < π < 1. [...] w(π) > π when

    π is small and w(π) < π [...] w(π) > π and (weakly) negative if w(π) < π. [...] {π: w(π) > π}, in other words, when will π

    be overweighted? Different probability weighting functions have different answers to this

    8 In reality, competitors may join in the bidding; the offer price is also subject to revision; if the payment is not

    pure cash, the target shareholders may not know exactly what Poffer will be; being targeted will also reveal a lot

    of valuable information to the market and potentially change the strategy of the target management even if the

    deal fails, which will also change the target value. All these make Poffer and Palone variable. However, as long as

    the market has a reasonable estimation of Poffer and Palone, and their variances are low compared to their

    difference, the model is a good approximation of reality.

    9 In the empirical analysis, this assumption will be relaxed.

    question, but most predict that the fixed point (when w(π)=π) is around 0.3 (Tversky and

    Kahneman, 1992; Prelec, 1998). If investors ignore diversification, we predict that targets

    will be undervalued when deal failure probability is lower than 0.3. However, this may not be

    the case in the financial markets. First, the average investor may be different from the

    average experimental subject, given the presence of many sophisticated institutions. Second,

    not all investors engage in narrow framing, and they may make decisions by incorporating the

    effect of π in their overall portfolio. The negative skewness driven by small π may be

    partially diversified in a portfolio and the pricing effect of probability weighting may become

    weaker. Ultimately, the range of π for which targets will be undervalued is an empirical

    question.
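The link between overweighting and expected returns can be sketched numerically. The example assumes, as a simplification, that the target price equals the decision-weighted payoff (1 − w(π))·Poffer + w(π)·Palone with no discounting, and it uses the one-parameter Tversky and Kahneman (1992) weighting function with an illustrative curvature of 0.61. The prices 100 and 70 and all function names are hypothetical.

```python
def tk_weight(p, gamma=0.61):
    """One-parameter Tversky-Kahneman (1992) probability weighting function."""
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)


def target_price(pi, p_offer, p_alone, weight=tk_weight):
    """Price if investors apply decision weight w(pi) to deal failure (assumed, no discounting)."""
    w = weight(pi)
    return (1.0 - w) * p_offer + w * p_alone


def expected_return(pi, p_offer, p_alone, weight=tk_weight):
    """Expected return to resolution under the objective failure probability pi."""
    price = target_price(pi, p_offer, p_alone, weight)
    expected_payoff = (1.0 - pi) * p_offer + pi * p_alone
    return expected_payoff / price - 1.0


def fixed_point(weight=tk_weight, lo=0.05, hi=0.90, tol=1e-10):
    """Bisection for the interior probability at which w(pi) = pi."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if weight(mid) > mid:   # still overweighted: crossover lies to the right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for pi in (0.05, 0.15, 0.60):
    r = expected_return(pi, p_offer=100.0, p_alone=70.0)
    print(f"pi = {pi:.2f}  w(pi) = {tk_weight(pi):.3f}  expected return = {r:+.2%}")
print(f"fixed point of w: {fixed_point():.3f}")
```

Under these assumptions the expected return is positive exactly when w(π) > π, and the bisection locates the crossover for this particular functional form; other weighting functions would give a different fixed point.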

    3. Data and modeling deal failure probability

    3.1 Data

    I begin with all the M&As in the Securities Data Company database of Thomson

    Financial (SDC). SDC covers deals going back to 1978. However, coverage prior to 1981 is

    incomplete. The sample used in this paper runs from January 1, 1981 to December 31, 2010. I

    am left with a sample of 16,906 deals after I use the following filters:

    1. The target is a U.S. listed firm.

    2. The deals that are classified by SDC as rumors, recapitalizations, repurchases, or

    spinoffs are excluded.

    3. In order to calculate target characteristics and post announcement return, I also

    require that the price and return data are available from Center for Research on

    Security Prices (CRSP) one year prior to deal announcement and at least three trading

    days after deal announcement.

    4. The deal completion or date of withdrawal must be at least three days after the deal

    announcement, or missing.
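The four filters can be sketched as a single predicate. The record layout and field names below are hypothetical (they are not SDC or CRSP field names), and the three-day rule in filter 4 is read here as calendar days, which is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Deal:
    target_us_listed: bool
    sdc_form: str                    # hypothetical label, e.g. "merger" or "rumor"
    has_crsp_year_before: bool       # CRSP data one year before announcement
    has_crsp_3days_after: bool       # CRSP data at least three trading days after
    announced: date
    resolved: Optional[date] = None  # completion or withdrawal date; None if missing

EXCLUDED_FORMS = {"rumor", "recapitalization", "repurchase", "spinoff"}

def passes_filters(d: Deal) -> bool:
    """Apply filters 1 to 4 in order."""
    if not d.target_us_listed:                                   # filter 1
        return False
    if d.sdc_form.lower() in EXCLUDED_FORMS:                     # filter 2
        return False
    if not (d.has_crsp_year_before and d.has_crsp_3days_after):  # filter 3
        return False
    # Filter 4: resolution at least three days after announcement, or missing.
    if d.resolved is not None and (d.resolved - d.announced).days < 3:
        return False
    return True
```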

    Table 1 shows the sample of deals. The number of deals is larger in the late 1980s and 1990s

    than in later years, but even in the 2000s, on average, there are more than 300 deals each year.

    The deal duration is measured as the number of calendar days between deal announcement

    and deal completion or withdrawal. The median duration is 100 days and the mean is 134,


    suggesting that the deal duration is right skewed. 1,144 deals are initially viewed as hostile or

    unsolicited by the targets.10 Payment method data show that the number of pure stock deals

    is highest during the internet bubble. Around 21.2% of the deals are conducted through tender

    offers. Leveraged buyouts (LBOs) and mergers of equals (MOEs) are generally rare: only 6.6%

    and 0.8% of the deals are LBOs and MOEs, respectively. 55.5% of the acquirers are public firms. Past

    return is the cumulative return from 365 days to 22 trading days prior to deal announcement.

    The variation is large, but on average, the target past return is 9.65%, close to the market

    return. Target Size is the natural logarithm of target firm market capitalization 22 trading

    days prior to deal announcement. I convert market size into constant 2005 dollars using the

    GDP deflator from the Federal Reserve. Mean target size is relatively stable in the early years,

    begins to increase only in 2004 and reaches a peak in 2007.

    Prior Bid is a dummy variable which is equal to 1 if the target company has received

    another takeover bid in the past 365 days. The results show that, for 24.6% of the deals,

    another bid was received in the past 365 days. The data also reveal significant variation in

    geographical and industrial linkage between the acquirer and the target: for 15.4% of the

    deals, the acquirers are foreign firms; for 26.7% of the deals, the acquirer and the target are

    not located in the same state; and for 43.7% of the deals, they are not in the same 2-digit SIC

    industry.

    [Insert Table 1 here]

    I also look at acquirers’ pre-acquisition ownership of target companies in these deals. Pct

    Held is the percentage of the target shares held by the acquirer 6 months prior to the deal

    announcement. Toehold is a dummy which is equal to 1 if the holding is no less than 5%.

    Cleanup is a dummy variable which is equal to 1 if the holding is greater than 50%. Around

    3.4% of the deals are cleanup deals. The acquiring firm has a toehold only in 14.3% of the

    deals. But on average the acquirers hold 4.41% of the target shares, close to the cutoff used to

    10 As the ultimate purpose of this paper is to do asset pricing tests, it is important to make sure that all the

    information used to construct portfolios is available at the moment of portfolio construction. Thus, I choose to

    use the initial attitude of the target company to the merger and acquisition announcement, instead of the final

    attitude.


    define toehold. This implies that once an acquiring firm holds shares of the target firm, it is

    very likely that it holds a significant fraction.

    In the whole sample, 10,692 deals are completed, and 3,167 are withdrawn. Most of the

    remaining deals are classified as pending or status unknown by SDC. Based on the deals with

    known status, the average completion rate is 77.1% (10,692/(10,692+3,167)). As the pending

    and status unknown cases may be different from the cases with known status, in order to

    prevent any sample selection bias, I include all of them in my asset pricing tests, but in

    modeling the deal success probability, I only use the deals with known status. In the next

    section, I discuss in detail how I model deal failure probability.

    3.2 Modeling deal failure probability

    I use two models to estimate deal failure probability. The first is based on deal

    characteristics; the second is based on the target stock price. I refer to the first model as the

    characteristics-based model, and to the second as the target price-based model. Throughout

    the analysis, I use π to denote deal failure probability.

    3.2.1 The characteristics-based model

    Following the literature (Walkling, 1985; Samuelson and Rosenthal, 1986; Baker and

    Savasoglu, 2002; Bates and Lemmon, 2003; Officer, 2003; Bhagat, Dong, Hirshleifer, and

    Noah, 2005; Bates, Becher, and Lemmon, 2008; Baker, Pan and Wurgler, 2012), I use deal

    characteristics and firm characteristics in a logistic specification to predict deal outcome.

    The deal and firm characteristics that I use are: a hostile dummy variable to measure the

    target attitude (Baker and Savasoglu, 2002; Baker, Pan, and Wurgler, 2012), a pure cash

    dummy variable and a pure stock dummy variable to measure the payment method (Bates and

    Lemmon, 2003; and Bates, Becher, and Lemmon, 2008), a Tender dummy variable, an LBO

    dummy variable and a merger of equals (MOE) dummy variable to measure the deal type.

    Following Officer (2003), I put three variables—Pct Held, Toehold, and Cleanup—in the

    model to capture the effect of the acquirer’s pre-acquisition ownership. I also consider

    acquirer and target characteristics such as the public status of the acquirer, and the size and

    past return of the target. Other factors include dummy variables indicating whether the target

    received any takeover offer in the past 365 days (Prior Bid), whether the acquirer is a foreign


    company, and whether the acquirer and the target are from the same state or in the same 2-digit

    SIC industry. Prior Bid accounts for the competitiveness of the takeover market and the

    other variables account for the possible effect of geographical and economic linkages

    between the two sides of the deal.11

    [Insert Table 2 here]

    Table 2 reports the results. The dependent variable is a dummy variable which is equal to

    1 if the deal is withdrawn, and 0 if the deal is completed. Models (1) to (6) report the

    coefficients and the second column of model (6) reports the marginal effects based on model

    (6).12 In estimating the model parameters, I only use the deals with known status in the

    sample.

    I first examine five sets of factors separately and in model (6) I examine all the factors in

    one full model. Almost all the factors considered have significant predictive power for the

    deal outcome and the estimated signs are very similar in the separate models and in the full

    model. The results in model (6) show that the significantly positive predictors include

    Hostile, LBO, MOE, Toehold, Cleanup, Prior Bid, and Cross Border. The significantly

    negative predictors include Pure Cash, Pure Stock, Tender, Pct Held, Public Acquirer,

    Target Past Return, Target Size, and Same Industry. The only variable that is insignificant in

    model (6) is the Same State dummy. By examining the marginal effects reported in the last

    column, we can compare the importance of the various predictors. Among all the variables,

    five have an absolute average marginal effect larger than 0.100. They are: Hostile (0.565),

    LBO (0.146), Prior Bid (0.142), Tender (-0.118), and Toehold (0.106).

    Next, I examine the predictive performance of my method of assessing deal failure

    probabilities. In order to avoid any look-ahead bias, I use an out-of-sample estimation method

    similar to Campbell, Hilscher, and Szilagyi (2008) to estimate out-of-sample deal failure

    11 The literature has also uncovered other important determinants of the deal outcome. For example, Bates and

    Lemmon (2003) and Officer (2003) find that target and bidder termination fees act as an efficient contract

    mechanism to encourage bidder participation and to secure target company gain, and increase the deal

    completion rate. Burch (2001) and Bates and Lemmon (2003) find that lockup options also increase the deal

    completion rate. Officer (2004) argues that collar provisions can be used as contractual devices by lowering the

    costs of target and acquirer renegotiation. I do not consider these contractual terms because they may not be

    available to the public at the beginning of the deal. Considering these factors has little effect on the main results.

    12 For most variables, the marginal effects in models (1) to (5) are similar to those in model (6). To save space, I

    do not report the marginal effects for models (1) to (5).


    probability. Specifically, for the deals in year t, I use all the data before year t to estimate the

    model coefficients. Only deals that have already been completed or withdrawn by the end of

    year t-1 are included in the estimation.13

    The estimated coefficients are used to calculate the

    deal failure probability for all the deals in year t, including the deals with status classified as

    pending or unknown by SDC.

    To obtain robust coefficient estimates, I require at least three years of data for the

    estimation. Thus, the main test sample begins in 1984 and ends in 2010; 16,163 deals are

    used in the asset pricing tests.
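The expanding-window procedure described above can be sketched as follows. To keep the example self-contained, the logit is replaced by a deliberately simple stand-in (the historical failure frequency by hostile flag); the record layout and function names are hypothetical. The point of the sketch is only the timing: deals in year t are scored with a model fitted on deals already resolved by the end of year t-1, and at least three years of history are required.

```python
from collections import namedtuple

# Hypothetical record: announcement year, resolution year (None if unresolved),
# a hostile flag, and the outcome (1 = withdrawn, 0 = completed).
Deal = namedtuple("Deal", "ann_year res_year hostile failed")

def fit_stand_in(train):
    """Stand-in for the logit: failure frequency by hostile flag."""
    overall = sum(d.failed for d in train) / len(train)
    rates = {}
    for flag in (0, 1):
        group = [d.failed for d in train if d.hostile == flag]
        rates[flag] = sum(group) / len(group) if group else overall
    return rates

def out_of_sample_probs(deals, first_year, min_years=3):
    """Map deal index -> fitted failure probability using only past information."""
    fitted = {}
    for t in sorted({d.ann_year for d in deals}):
        if t - first_year < min_years:
            continue  # not enough history yet
        # Only deals completed or withdrawn by the end of year t-1 are used.
        train = [d for d in deals if d.res_year is not None and d.res_year <= t - 1]
        if not train:
            continue
        model = fit_stand_in(train)
        for i, d in enumerate(deals):
            if d.ann_year == t:  # includes pending or unknown-status deals
                fitted[i] = model[d.hostile]
    return fitted
```

With `first_year=1981` and `min_years=3`, the first scored year is 1984, matching the start of the main test sample.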

    [Insert Figure 2 here]

    Figure 2 shows the distribution of the deal failure probability when using model (6). I

    classify the deals into 100 groups based on their failure probability. The vertical axis shows

    the number of deals in each group. It shows that deal failure probability has large variation.

    Deals with failure probability higher than 50% are generally rare, while a large number of

    deals have failure probability lower than 10%. Overall, the distribution of deal failure

    probabilities is fairly smooth.

    [Insert Figure 3 here]

    In Figure 3, I examine whether model (6) in Table 2 does a good job in predicting deal

    outcomes, year by year. I sort all the deals into three groups: low failure probability group

    (deals with failure probability lower than 10%), medium failure probability group (deals with

    failure probability between 10% and 20%), and high failure probability group (all the

    others).14

    I calculate the realized deal failure rate for each group and each year. Figure 3

    shows that the deals with low failure probability indeed have a lower realized failure rate than

    the deals with high failure probability, but perhaps more pointedly, this is true in each and

    every year in my sample. The deals with medium failure probability also lie in between the

    two extreme groups in most years.

    13 This means that the deals that are announced before or in year t-1 but have not been completed or withdrawn

    at the end of year t-1 are not used.

    14 I choose 10% and 20% as the cut-offs mainly because I will do the asset pricing analysis using the same

    grouping.


    In addition, the model does a better job in the second half of the sample than in the first

    half. In the 1980s, though deals with different failure probabilities are clearly separated, the

    realized failure rate is higher than the fitted failure probability. I conjecture that this is driven

    by the hostile takeovers in the late 1980s and by the fact that the model estimates are noisier in the

    first half of the sample than in the second half, where more historical data are available. But

    overall, the results show that the model in Table 2 has very good out-of-sample predictive

    power in separating deals with different failure probabilities. I also find that the results are

    better in the second half of the sample than in the first half (see Panel B of Table 7).

    3.2.2 The target price-based model

    Another way of modeling deal failure probability is to infer it by comparing the offer

    price and the post-announcement target price. Consider the example in Figure 1: if we know

    the offer price (Poffer), the target standalone price (Palone) and the post-announcement target

    price (P), we can infer how the market perceives the likelihood of deal failure. I use the

    following formula as the second measure of deal failure probability: (Poffer -P)/(Poffer -Palone).

    Strictly speaking, the target price is not only affected by its standalone price, the offer price,

    and the failure probability. It is also affected by the expected length of time needed to resolve

    the deal and thus the expected return required to hold the target until then. However,

    regardless of the exact asset pricing model, I expect that (Poffer -P)/(Poffer -Palone) should be

    positively related to deal failure probability.

    To prevent any look-ahead bias, I use the initial offer price (instead of the final offer price)

    to do the calculation. I use the target price 22 trading days before the deal announcement to

    proxy target standalone value. The post-announcement target price is measured at the end of

    the second trading day after deal announcement. In order to minimize measurement error, I

    require that the post-announcement target price is between the pre-announcement target price

    and the offer price.15

    15 Both the offer price and the target standalone price are measured with noise. Competitor entry or expectation

    of competitor entry can drive the post-announcement target price higher than the first offer price. It is also

    possible that some extremely good or bad news comes out during the month just before the deal announcement

    and takes the post-announcement target trading price beyond the range bounded by the pre-announcement and

    the offer price.
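A minimal sketch of the price-based measure and its screen is given below. The function name is hypothetical, and treating the bounds as inclusive is an assumption; the inputs are the initial offer price, the price two trading days after announcement, and the price 22 trading days before announcement.

```python
from typing import Optional

def price_implied_failure_prob(p_offer: float, p_post: float,
                               p_alone: float) -> Optional[float]:
    """Return (Poffer - P) / (Poffer - Palone), or None if the screen fails."""
    lo, hi = min(p_alone, p_offer), max(p_alone, p_offer)
    # Screen from the text: the post-announcement price must lie between the
    # pre-announcement price and the offer price.
    if hi == lo or not (lo <= p_post <= hi):
        return None
    return (p_offer - p_post) / (p_offer - p_alone)
```

For example, an offer of 50, a pre-announcement price of 40, and a post-announcement price of 48 give a measure of 0.2, while a post-announcement price of 52 is screened out.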


    In model (7) of Table 2, I analyze the predictive power of the price-based measure for the

    deal outcome. The results show that (Poffer -P)/(Poffer -Palone) has very strong predictive power

    for the deal outcome. The coefficient of (Poffer -P)/(Poffer -Palone) is 2.195, the t-value is 15.70,

    and the marginal effect is 0.319, suggesting that an increase of 0.1 in the value of

    (Poffer -P)/(Poffer -Palone) increases the failure probability by 3.19 percentage points. This is consistent with the

    findings of Samuelson and Rosenthal (1986) and Subramanian (2004) that the market can

    extract deal failure probability very well even at the beginning of a deal.

    Compared to the deal characteristics-based measure, this price-based measure is simpler.

    However, due to the constraint on the initial offer price in calculating (Poffer -P)/(Poffer -Palone),

    it is only available for 4,598 deals, which account for less than 30% of the full sample and are very

    sparse in the years before 1996. Thus, in the paper, I use the characteristics-based measure as

    my main measure and perform robustness tests using the price-based measure.16

    4. Empirical results

    In this section, I evaluate the average return of target firms after the deal announcement

    and test whether investors overweight probability of deal failure when it is small. I sort all the

    deals into three groups based on the deal failure probability. The deals with failure

    probability below 10% are classified as targets with low failure probability, the deals with

    failure probability between 10% and 20% are classified as targets with medium failure

    probability, and all the others are classified as targets with high failure probability.
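The grouping rule can be written as a small helper. How a deal sitting exactly at 10% or 20% is assigned is not stated in the text, so the boundary handling below is an assumption, and the function name is hypothetical.

```python
def failure_group(pi: float) -> str:
    """Assign a deal to Low / Medium / High by fitted failure probability."""
    if pi < 0.10:
        return "Low"
    if pi < 0.20:      # assumption: exactly 10% goes to Medium
        return "Medium"
    return "High"      # assumption: exactly 20% goes to High
```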

    I classify in this way for two reasons. First, probability weighting has its biggest impact

    on small probability events. I therefore define deals with low failure probability as deals with

    a very low chance of failure. Second, the number of deals with failure probability higher than

    20% is more volatile year by year. If I do a finer classification, the number of deals in some

    categories will be too small. For example, if I classify deals with failure probability higher

    than 20% into 3 categories (between 20% and 30%, between 30% and 40%, and others), for

    some months, the number of deals will be as low as 1. This problem will become more severe

    16 Both models are likely to have some errors-in-variables problems. For example, for the characteristics-based

    model, it is possible that investors may use other available information which may have a significant effect on

    deal failure. For the price-based model, I do not consider the heterogeneity in deal duration. Given the

    errors-in-variables problem, I expect that my results may underestimate the true importance of probability weighting.


    for the subsample analysis in Table 5. Nevertheless, my results are robust if I further sort

    deals with high failure probability into 3 categories: between 20% and 30%, between 30% and

    40%, and others.

    4.1 Target company characteristics

    In Table 3, I report the characteristics of target stocks. Panel A reports all characteristics

    except for the target stocks’ return moments, and Panel B reports their return moments. The

    first column of Panel A shows the average deal failure probability for each group. The

    averages are 0.052, 0.154, and 0.362 for Low, Medium and High, respectively. The

    difference between Low and High is 0.310, which is economically quite large. The next two

    columns show the mean and median durations of deals in each group. The mean (median)

    durations are 100.87 (62), 137.44 (113), and 142.94 (107) for Low, Medium, and High,

    respectively. I use the standard two sample t-test to examine the statistical significance of the

    mean and the Brown and Mood (1951) test to examine the statistical significance of the

    median. The differences in mean and median between the Low and High groups are both

    statistically significant, suggesting that deals with low failure probability on average resolve

    faster than other deals. The last column shows the average premium. Premium is measured as

    the natural log difference between the initial offer price and the target stock price 22 trading

    days before deal announcement. The average premia are 0.353, 0.354 and 0.382 for Low,

    Medium and High, respectively. The difference between Low and High is not statistically

    significant.

    Panel B reports the target stock return moments. I use two methods to calculate the

    characteristics of target stocks: one based on the realized return (physical moments) and one

    based on option prices (risk neutral moments). For risk neutral moments, I can calculate the

    moments stock by stock, as long as there is a large enough number of options available. I

    follow Bakshi, Kapadia, and Madan (2003) and Dennis and Mayhew (2002) to calculate the

    risk neutral moments.17

    However, I cannot calculate the physical moments for each individual

    stock as there is only one realized return for each of them. I thus calculate the physical

    moments from the cross section of target stocks. Specifically, I calculate the cross-sectional

    17 The detailed methodology can be found in Part I of the Online Appendix.


    moments for stocks in the three portfolios. To mitigate the effect of extreme returns, I

    calculate the physical moments from log returns rather than raw returns.

    [Insert Table 3 here]

    As discussed above, the payoff structure of the target stock is (approximately) binary. If

    this is true, the statistical moments of target stock returns should reflect the characteristics of

    the Bernoulli distribution. For a binary payoff that takes its low value with probability π, the

    standard deviation (per unit of the payoff gap) and the skewness are $[\pi(1-\pi)]^{1/2}$ and

    $(2\pi-1)/[\pi(1-\pi)]^{1/2}$, respectively. Standard deviation increases with

    deal failure probability when failure probability is lower than 0.5 (which is true for most of

    the sample stocks), and skewness is negative in that range and becomes more negative as deal

    failure probability becomes smaller.
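The moments of a two-point return can be checked numerically. In the sketch below the return is `r_fail` with probability π and `r_success` otherwise; the two return levels (-30% and +5%) are illustrative assumptions, and the closed form used for comparison is the textbook skewness of a two-point variable whose low outcome has probability π.

```python
import math

def two_point_moments(pi, r_fail, r_success):
    """Mean, standard deviation and skewness of a two-point return."""
    mean = pi * r_fail + (1 - pi) * r_success
    var = pi * (r_fail - mean) ** 2 + (1 - pi) * (r_success - mean) ** 2
    third = pi * (r_fail - mean) ** 3 + (1 - pi) * (r_success - mean) ** 3
    sd = math.sqrt(var)
    return mean, sd, third / sd ** 3

for pi in (0.05, 0.15, 0.35):
    # Illustrative return levels (assumptions): -30% on failure, +5% on success.
    _, sd, skew = two_point_moments(pi, r_fail=-0.30, r_success=0.05)
    closed_form = (2 * pi - 1) / math.sqrt(pi * (1 - pi))
    print(pi, round(sd, 4), round(skew, 3), round(closed_form, 3))
```

The standard deviation rises with π over this range, while the skewness is negative and largest in magnitude at the smallest π, which is the pattern the text looks for in Table 3.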

    The results in Table 3 support these predictions about standard deviation and skewness.

    From Table 3, I find that the standard deviation of the target stock return is highest for stocks

    with high failure probability. For the target stock with low failure probability, its return

    standard deviation is 21.6%, but the return standard deviation of target stocks with high

    failure probability is 38.9%, almost double that of the low failure probability group. From

    Low to High, the cross sectional skewness increases from -3.249 to -2.043.

    The risk-neutral measures, calculated from option prices, show similar results for all the

    three moments, though the number of stocks for which I can calculate risk neutral moments is

    significantly smaller, as target companies on average are small stocks and may not have

    traded options. However, in total, I still have more than 1,000 target stocks for which I can

    calculate risk neutral moments. Furthermore, the results show that the differences in standard

    deviation and skewness between the high- and low-failure probability groups are both

    significantly different from zero.18

    Since risk neutral skewness is imputed from option prices

    and is therefore forward looking, these results also suggest that the option market can

    extract information on deal failure probability soon after deal announcement. Overall, the

    standard deviation and skewness comparisons both reflect return characteristics

    consistent with targets having a payoff structure that is approximately binary.

    18 To mitigate the effect of extreme values in the statistical test, in unreported results, I also perform a Wilcoxon

    rank test. The results are robust.


    4.2 Details of the calendar portfolio construction

    Following Mitchell and Pulvino (2001), Baker and Savasoglu (2002), and Giglio and

    Shue (2013), I use calendar time portfolios to test the asset-pricing implications of probability

    weighting. Upon an announcement of an M&A, the deal is classified into one of the three

    categories (deals with low/medium/high failure probability). I begin to include a target stock

    into one of the three portfolios from the end of the second day after the deal announcement.

    This is to exclude the large abnormal returns that are associated with deal announcement. The

    holding period ends when the time since announcement exceeds 180 days or when the deal is

    completed or withdrawn, whichever comes first. To mitigate the bias of using daily equally

    weighted returns to calculate compounded monthly returns (Blume and Stambaugh, 1983;

    Roll, 1983; Canina, Michaely, Thaler, and Womack, 1998), throughout this paper, I measure

    all portfolio returns using value weights, where the weight is the market capitalization of the

    target firm at the end of the previous day.

    I compound daily portfolio returns to get the monthly return. The median number of

    target firms for Low, Medium, and High portfolios is 28, 58 and 121, respectively. This

    suggests that the portfolios are generally not thin, and my results are

    unlikely to be driven by months with only a few deals.19
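The two return calculations described in this subsection (daily value weighting by previous-day market capitalization, then compounding to a monthly return) can be sketched as below. The function names and the dictionary layout are hypothetical.

```python
def vw_return(returns, prev_caps):
    """Value-weighted portfolio return for one day.

    returns:   {ticker: daily return} for the stocks held that day
    prev_caps: {ticker: market capitalization at the end of the previous day}
    """
    total = sum(prev_caps[s] for s in returns)
    return sum(prev_caps[s] / total * r for s, r in returns.items())

def monthly_return(daily_portfolio_returns):
    """Compound a month of daily portfolio returns."""
    growth = 1.0
    for r in daily_portfolio_returns:
        growth *= 1.0 + r
    return growth - 1.0
```

For instance, two stocks with previous-day capitalizations of 300 and 100 and daily returns of 1% and -2% give a portfolio return of 0.25% for that day.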

    4.3 Portfolio returns and portfolio risks

    This section reports the main results of this paper: the average returns of the Low,

    Medium, and High portfolios, their risks and alphas. Besides the three portfolios, I also report

    the characteristics of a zero-investment long-short portfolio which is the difference between

    Low and High.

    [Insert Table 4 here]

    19 Out of the 324 months, there are 6 months in which the High portfolio has fewer than 10 stocks (minimum is 6),

    3 months in which Low has fewer than 10 stocks (minimum is 8), and in all months Medium has more than 10

    stocks. My results are robust if I exclude those months or if I replace return observations for those months by the

    risk free rate.


    Table 4 reports the characteristics of these portfolios. The results show that the average

    excess return decreases with deal failure probability, from 1.170% for the Low portfolio to

    0.377% for the High portfolio.

    The annualized portfolio standard deviation increases with deal failure probability, from

    16.7% for the Low portfolio to 22.6% for the High portfolio. As a result, the Sharpe ratio

    decreases from 1.106 to 0.393. Among all the three portfolios, only Low is positively skewed,

    and the portfolio skewness decreases from Low to High. This is opposite to the relationship

    between deal failure probability and the individual stock return moments shown in Table 3,

    suggesting that aggregation of stocks into portfolios can change skewness significantly.20

    The relationship between deal failure probability and portfolio skewness thus suggests that,

    compared to the High portfolio, the Low portfolio is less likely to have an extremely low

    return. The mean of the Low-High portfolio is 0.793% which is economically large, and is

    significant at the 5% level.

    The results in Table 4 are consistent with the Hypothesis. Next, I investigate whether the

    difference in excess return between Low and High can be explained by systematic risk

    exposures.

    I report factor adjusted portfolio alphas in Table 5. The market model adjusted alphas are

    0.808%, 0.008%, and -0.198% for Low, Medium, and High, respectively. Betas for the three

    portfolios are 0.633, 0.851, and 1.007, respectively. The beta for the Low-High portfolio is

    -0.374, which is significantly negative.

    [Insert Table 5 here]

    The alpha of the Low portfolio is significantly positive, while the other two are not

    significantly different from zero. The market-adjusted alpha for the Low-High portfolio is

    1.006%, which is significant at the 1% level. The market-adjusted alpha is also higher than

    the difference in excess return reported in Table 4. This is because the long side has a lower

    beta than the short side. Adjusting for the three Fama and French (1993) factors, the Carhart

    (1997) momentum factor, and the Pastor and Stambaugh (2003) liquidity factor (I will refer

    20 A similar finding is that the aggregate market displays negative skewness, but on average individual stock

    returns display positive skewness (see Albuquerque (2012) for an interpretation).


    to this as the five-factor adjustment) does not change the results significantly. The five-factor

    alphas for the Low, Medium, High, and Low-High portfolios are 0.766%, -0.038%,

    -0.293%, and 1.060%, respectively.
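The link between the market-model alpha and the raw return spread can be checked with the numbers quoted above. The sketch uses the OLS identity that a portfolio's mean excess return equals its alpha plus its beta times the mean market excess return; the implied market premium it backs out is a back-of-the-envelope figure, not a number reported in the paper.

```python
# Market-model estimates and mean excess returns quoted in the text
# (percent per month).
alpha = {"Low": 0.808, "High": -0.198}
beta = {"Low": 0.633, "High": 1.007}
mean_excess = {"Low": 1.170, "High": 0.377}

alpha_ls = alpha["Low"] - alpha["High"]            # long-short alpha
beta_ls = beta["Low"] - beta["High"]               # long-short beta
spread = mean_excess["Low"] - mean_excess["High"]  # raw Low-High spread

# OLS identity: mean excess return = alpha + beta * mean market excess return.
implied_mkt_premium = (spread - alpha_ls) / beta_ls

print(round(alpha_ls, 3), round(beta_ls, 3), round(spread, 3))
```

Because the long-short beta is negative and the implied average market excess return is positive, removing market exposure raises the alpha above the raw spread.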

    Mitchell and Pulvino (2001) find that, on average, deals are more likely to fail in down

    markets. I follow Mitchell and Pulvino (2001) and use a piecewise linear regression model:

    $$R_{portfolio,t}-R_f=(1-UP_t)\,[\alpha_{low}+\beta_{low}(R_{m,t}-R_f)]+UP_t\,[\alpha_{high}+\beta_{high}(R_{m,t}-R_f)]+\beta' X_t+\varepsilon_t$$

    subject to: $\alpha_{low}+\beta_{low}(Threshold)=\alpha_{high}+\beta_{high}(Threshold)$ (4)

    where Rportfolio,t is the monthly Low-High portfolio return, Rm,t is the value weighted market

    return, Rf is the risk free interest rate, and UPt is a dummy variable which is equal to 1 when

    the market excess return is greater than a threshold. X represents the other factors. The second

    equation shown above ensures continuity.
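One way to see what the continuity restriction does is to substitute it into the regression. The rearrangement below is a sketch that follows directly from equation (4); it is not a description of how the estimation was actually coded.

```latex
% Continuity at the threshold pins down the upper intercept:
\alpha_{high} = \alpha_{low} + (\beta_{low} - \beta_{high})\,Threshold

% Substituting into the piecewise model gives a single "hinge" regression:
R_{portfolio,t} - R_f
  = \alpha_{low}
  + \beta_{low}\,(R_{m,t} - R_f)
  + (\beta_{high} - \beta_{low})\,UP_t\,(R_{m,t} - R_f - Threshold)
  + \beta' X_t + \varepsilon_t
```

In this form the downside beta is the coefficient on the market excess return, and the difference between the upside and downside betas is the coefficient on the hinge term, so the test of equal betas is a test that the hinge coefficient is zero.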

    Another way to evaluate downside risk is to examine the coskewness of a portfolio. Thus,

    I also examine whether the Low-High portfolio has exposure to the coskewness risk

    mimicking factor proposed by Harvey and Siddique (2000). Coskewness is measured as:

    $$\hat{\beta}_i=\frac{E[\varepsilon_{i,t}\,\varepsilon_{m,t}^2]}{\sqrt{E[\varepsilon_{i,t}^2]}\;E[\varepsilon_{m,t}^2]}$$ (5)

    where εi,t=Ri,t - αi - βiRm,t. Ri,t and Rm,t are the excess return for stock i and the excess return for

    the market. εm,t is equal to Rm,t - μm,t , where μm,t is the mean of Rm,t. In each month, all the

    stocks in NYSE/AMEX/NASDAQ are sorted into three portfolios based on the value of

    coskewness, which is estimated from monthly data over the past 5 years. The 30% of stocks

    with the most negative coskewness are classified into an $S^{-}$ portfolio and the 30% of stocks

    with the most positive coskewness in each month are classified into an $S^{+}$ portfolio. Both

    $S^{-}-S^{+}$ and $S^{-}-R_f$ will be used as the coskewness hedging portfolios for the portfolio abnormal

    return analysis.
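The coskewness statistic in equation (5) can be computed from two return series as sketched below. The function names are hypothetical, expectations are replaced by sample means, and the placement of the square root (over the stock residual variance only) follows the standardized coskewness of Harvey and Siddique (2000), which is an assumption about how the garbled source equation should be read.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def coskewness(r_i, r_m):
    """Standardized coskewness of stock excess returns r_i with market excess returns r_m."""
    mu_i, mu_m = mean(r_i), mean(r_m)
    eps_m = [m - mu_m for m in r_m]  # demeaned market return
    # OLS slope and intercept of r_i on r_m define the market-model residual.
    beta = sum((a - mu_i) * e for a, e in zip(r_i, eps_m)) / sum(e * e for e in eps_m)
    alpha = mu_i - beta * mu_m
    eps_i = [a - alpha - beta * m for a, m in zip(r_i, r_m)]
    num = mean([ei * em ** 2 for ei, em in zip(eps_i, eps_m)])
    den = math.sqrt(mean([ei ** 2 for ei in eps_i])) * mean([em ** 2 for em in eps_m])
    return num / den
```

A stock whose residual is high when the market moves sharply in either direction has positive coskewness under this definition; flipping the sign of the stock's returns flips the sign of the statistic.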

    Table 6 reports the results. Panel A reports the downside risk analysis using the method

    proposed by Mitchell and Pulvino (2001) and Panel B reports the coskewness risk analysis.

    Panel A shows that both the upside beta and the downside beta are significantly negative, but


    the downside beta is much larger in magnitude. The difference between the upside beta and

    the downside beta is significant at the 1% level. Panel B shows that the Low-High portfolio has

    negative exposure to the two coskewness factors. The coskewness loading of the Low-High

    portfolio is -0.411 and -0.294 when $S^{-}-S^{+}$ and $S^{-}-R_f$ are used to construct the mimicking

    portfolio, respectively. In sum, the results show that there is less downside risk for deals that

    are less likely to fail.

    [Insert Table 6 here]

    The results in Table 5 and Table 6 show that the Low-High portfolio has negative beta and

    negative downside risk. This is consistent with Bhagat, Brickley, and Loewenstein (1987),

    who view acquirer offers as put options to target shareholders. Bhagat, Brickley, and

    Loewenstein (1987) argue that an offer from an acquirer is similar to a put option with the

    strike price as the offer price and target standalone value as the underlying. Put options help

    hedge the risk resulting from fluctuations in the target value. As the exercisability of put

    options is contingent on the deal going through, the effectiveness of hedging is determined by

    the probability of deal failure. The lower the deal failure probability, the better the hedging

    value of the put option. Thus, the Low-High portfolio is essentially a long position in these

    contingent put options. If the average target stock has positive beta, on average, the price of

    put options will increase when the market is going down (negative beta), and the price of put

    options increases more quickly in a down market than it decreases in an up market

    (negative downside risk), due to increasing option delta.

    Overall, the results in Tables 4, 5 and 6 confirm the prediction of probability weighting.

    That is, the target stocks in deals with low failure probability are undervalued, but target

    stocks in other deals do not show significant mispricing, suggesting that deal failure

    probability is only overweighted when it is small. The long-short portfolio has negative beta

    and negative downside risk, suggesting that the abnormal return of the low failure probability

    portfolio is unlikely to be driven by exposure to systematic risk. Overall, these results support

    my Hypothesis.


    4.4 The magnitude of overweighting

    In this section, I examine the economic magnitude of overweighting for an average deal

    in the Low portfolio. Based on the structure assumed in Figure 1, by definition, the target

    stock return is equal to the expected payoff divided by the current target trading price:

    $$(1 + R_f + ER + AR)^{T-t} = \frac{P_{alone}\,\pi + P_{alone}\,(1+\text{premium})\,(1-\pi)}{P}, \qquad (6)$$

    where Rf is the risk free rate, ER is expected return, AR is abnormal return, T-t is the length of

    time from deal announcement to deal resolution, Palone is target standalone price, P is the post

    announcement target trading price, and π is deal failure probability. For investors who

    overweight the small deal failure probability, P satisfies the following equation:

    $$P = \frac{P_{alone}\,w(\pi) + P_{alone}\,(1+\text{premium})\,(1-w(\pi))}{(1 + R_f + ER)^{T-t}}, \qquad (7)$$

    where w(π) is the decision probability. Combining equation (6) and (7), we can infer w(π)

    from the data.
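
The algebra behind this inference can be spelled out as follows. This is a sketch derived from equations (6) and (7) as written; the closed form itself does not appear in the paper. Equation (6) pins down the price relative to standalone value, and substituting that ratio into equation (7) isolates w(π):

```latex
\begin{align*}
\frac{P}{P_{alone}} &= \frac{\pi + (1+\text{premium})(1-\pi)}{(1+R_f+ER+AR)^{T-t}}
  && \text{from (6)} \\[4pt]
w(\pi) &= \frac{(1+\text{premium}) - \dfrac{P}{P_{alone}}\,(1+R_f+ER)^{T-t}}{\text{premium}}
  && \text{from (7)}
\end{align*}
```

When AR = 0 the two discount factors cancel and the expression reduces to w(π) = π, so any positive abnormal return maps into w(π) > π.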

    As shown in Table 3, the average deal failure probability (π) is 5.2%, the average

    duration (T-t) is 100.87 days or 3.32 months (100.87/ (365/12)) and the average premium

    (premium) is 42.3% (exp(0.353)-1). As shown in Table 4, the average abnormal return (five

    factor alpha) for the Low portfolio is 0.766% per month (AR). The average excess return is

    1.170% (Table 4) and the five-factor alpha is 0.766%. So the average “normal” excess return

    per month is 0.404% (ER). The average risk free rate from 1984 to 2010 is 0.364% per month

    (Rf). With these statistics, w(π) equals 13.2%. The 90 percent confidence interval of AR is

    from 0.343% to 1.189%. Under this range, w(π) is estimated to be from 8.8% to 17.5%. This

    is consistent with the experimental studies. For example, based on the probability weighting

    function proposed by Tversky and Kahneman (1992), 5.2% should be overweighted as

    11.42%, which is quite close to the above estimates.
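
The calculation above can be retraced numerically. The sketch below plugs the reported averages into equations (6) and (7) with P_alone normalized to 1, and evaluates the Tversky and Kahneman (1992) weighting function w(p) = p^γ / (p^γ + (1-p)^γ)^(1/γ). The value γ = 0.69, their estimate for losses, is an assumption here, chosen because it reproduces the 11.42% quoted in the text. Because the inputs are rounded, the implied weights should land close to, but not necessarily exactly on, the reported 13.2%, 8.8%, and 17.5%.

```python
def implied_weight(pi, premium, months, rf, er, ar):
    """Decision weight w(pi) implied by equations (6) and (7), with P_alone = 1."""
    expected_payoff = pi + (1.0 + premium) * (1.0 - pi)
    price = expected_payoff / (1.0 + rf + er + ar) ** months  # equation (6)
    grown_price = price * (1.0 + rf + er) ** months           # equation (7), rearranged
    return ((1.0 + premium) - grown_price) / premium

def tk_weight(p, gamma):
    """Tversky-Kahneman (1992) one-parameter probability weighting function."""
    return p ** gamma / (p ** gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)

# Averages reported in the text (monthly rates as decimals).
PI, PREMIUM = 0.052, 0.423
MONTHS = 100.87 / (365 / 12)
RF, ER = 0.00364, 0.00404

w_point = implied_weight(PI, PREMIUM, MONTHS, RF, ER, ar=0.00766)
w_low = implied_weight(PI, PREMIUM, MONTHS, RF, ER, ar=0.00343)   # lower bound of AR
w_high = implied_weight(PI, PREMIUM, MONTHS, RF, ER, ar=0.01189)  # upper bound of AR
w_tk = tk_weight(PI, gamma=0.69)  # gamma is an assumption, see the note above

print(f"implied w(pi): {w_point:.3f} (range {w_low:.3f} to {w_high:.3f})")
print(f"Tversky-Kahneman w(pi): {w_tk:.4f}")
```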

    4.5 Robustness

    Several robustness tests are presented in Table 7. The five-factor alpha of pure cash offers

    is 0.748%, which is statistically significant. I also report results for the subsample of deals


    that involve a change of corporate control. Change of corporate control is defined as deals in

    which the acquirer owns less than 50% of the target but seeks to own more than 50%. This

    subsample includes 10,353 deals. The Low-High abnormal return of this subsample is

    1.058%, which is both economically and statistically significant. I also examine the first

    offers separately. First offers are the targets that did not receive any acquirer offers in the past

    365 days. This eliminates the possibility that the portfolios contain more than one position in a

    single target company. The Low-High portfolio alpha is 0.930%, which is similar to the

    whole sample results.

    In Panel B of Table 7, I report the five-factor alpha of the Low-High portfolio for two

    subperiods. The first subperiod is from 1984 to 1996 and the second is from 1997 to 2010.

    Alphas are 0.763% and 2.115%, respectively, and both are significantly positive. The result

    for the first subperiod is weaker than that for the second subperiod. This is perhaps due to the

    predicted failure probability performing slightly worse in the first half of the sample period

    (as shown in Figure 3).

    In the main analysis, I use a holding period that starts from the third day after deal

    announcement and extends 180 days after deal announcement, or until the deal is completed

    or withdrawn, whichever comes first. However, the length of the holding period is arbitrary.

    In Panel C, I report the results for alternative holding periods: 30 days, 60 days, 100 days,

    and 365 days. The abnormal returns from these four specifications range from 0.960% to 2.226% and are

    all statistically significant. Interestingly, the abnormal return decreases when the holding

    period increases. I conjecture that this may be because the risk of deal failure is mainly

    concentrated in the first two months after deal announcement.

    [Insert Table 7 here]

    In Panel D, I report the five-factor alpha of the Low-High portfolio using the price-based

    deal failure probability estimate. The alpha and beta are 1.040% and -0.598. These are similar

    in magnitude to those calculated using the characteristics-based deal failure probability

    measure, confirming my earlier findings.

    Next, I attempt to carefully account for two issues that can potentially affect my

    conclusions. First, the structure shown in Figure 1 is a simplified version of the real world. In


    reality, competitors may join in bidding and the offer price may be subject to revisions. I

    gauge the economic importance of revisions following Baker and Savasoglu (2002). The data

    on offer price revisions is more complete since 1997. I thus focus on the later sample

    period.21

    First, I tabulate the frequency of revisions for the three portfolios of targets in my

    sample. From 1997 to 2010, I have 7,205 deals in total, of which 717 have their offer price

    revised. 205 are revised down and 512 are revised up. Low, Medium and High have 1,371,

    2,734 and 3,100 deals, respectively. 4.9%, 2.6% and 2.1% are revised down, and 8.9%, 5.7%

    and 7.5% are revised up for the three portfolios of deals. Conditional on downward revision,

    the average revision ratios (defined as final offer/initial offer - 1) are -8.80%, -15.01% and

    -18.20% for Low, Medium, and High, respectively. Conditional on upward revision, the

    average revision ratios are 18.88%, 16.17%, and 19.24%, respectively. The targets in the Low

    portfolio have higher downward revision probabilities, but, conditional on downward revision,

    the revision is smaller in magnitude. These targets also have higher upward revision probabilities.

    Overall, the revision analysis does not reveal evidence that revisions favor my portfolios

    differentially. Still, I recalculate the portfolio returns by excluding the deals with offer price

    revisions. The results in Panel E of Table 7 show that the five-factor alpha is 1.800%. This is

    slightly lower than the alpha (2.115%) in the same time period without excluding offers with

    revisions. But the change is relatively small.
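
One way to summarize the revision figures above is a probability-weighted revision ratio for each portfolio. The sketch below computes it from the numbers quoted in the text, treating unrevised deals as contributing zero. That treatment and the summary measure itself are assumptions introduced for illustration, not statistics reported in the paper. It also checks that the per-portfolio percentages are consistent with the reported totals of 205 downward and 512 upward revisions.

```python
# Figures quoted in the text for 1997-2010.
deals = {"Low": 1371, "Medium": 2734, "High": 3100}
p_down = {"Low": 0.049, "Medium": 0.026, "High": 0.021}
p_up = {"Low": 0.089, "Medium": 0.057, "High": 0.075}
ratio_down = {"Low": -0.0880, "Medium": -0.1501, "High": -0.1820}
ratio_up = {"Low": 0.1888, "Medium": 0.1617, "High": 0.1924}

def expected_revision(name: str) -> float:
    """Probability-weighted revision ratio; unrevised deals count as zero."""
    return p_down[name] * ratio_down[name] + p_up[name] * ratio_up[name]

expected = {name: expected_revision(name) for name in deals}

# Deal counts implied by the rounded percentages (compare with 205 and 512).
implied_down = sum(deals[n] * p_down[n] for n in deals)
implied_up = sum(deals[n] * p_up[n] for n in deals)

for name, value in expected.items():
    print(f"{name}: expected revision per deal = {value:.2%}")
print(f"implied downward revisions: {implied_down:.0f}, upward: {implied_up:.0f}")
```

On these inputs the three portfolios differ by well under one percentage point per deal, which is in line with the statement that revisions do not favor the portfolios differentially.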

    Second, to gauge the effect of uncertainty about the entry of competing offers, I build a

    new deal failure probability model which explicitly takes competing offers into account.

    Specifically, I redefine deal failure as targets that are not acquired by any acquirers in a

    specific period after deal announcement (365 days in the empirical analysis). Different from

    the previous definition of deal failure, the new definition excludes the deals in which the

    target is acquired by some other acquirer. I use the same set of independent variables as in

    Table 2 to model deal failure probability and calculate the Low-High portfolio returns. I

    report the five-factor analysis of the Low-High portfolio in Panel F of Table 7. The results

    show that the Low-High alpha is 1.319%; this is significant at the 1% level, suggesting that

    considering competing offers strengthens the results.

    21

    The results are robust if I analyse the effect of revisions using the whole sample. For details, see Part II of the

    Online Appendix.


    4.6 Testing alternative hypotheses

    4.6.1 Skewness preference without probability weighting

    As shown in Table 3, target stocks in deals with low failure probability are more

    negatively skewed than target stocks in other categories. Many traditional utility functions

    feature skewness preference, for example, CRRA utility.22

    The rare disaster models (such as

    Barro (2006)) explicitly consider extreme negative states with small probabilities. When

    investors care about diversification and do not face frictions to do so, coskewness will affect

    asset prices but idiosyncratic skewness will not (Kraus and Litzenberger, 1976; Harvey and

    Siddique, 2000). In the rare disaster models, only systematic rare disaster risks are priced but

    not idiosyncratic extreme negative states. As shown in Table 6, the Low-High portfolio has

    negative coskewness risk. Therefore, my findings cannot be explained by the traditional

    models without frictions to diversify. However, in the real world, investors are not well-

    diversified (Odean, 1999; Mitton and Vorkink, 2007; Goetzmann and Kumar, 2008).23

    Under-diversification leads to the pricing of idiosyncratic skewness. I thus examine whether

    skewness preference without probability weighting and the rare disaster models can explain

    my findings in a world in which investors cannot fully diversify. For simplicity, I assume that

    investors cannot diversify at all.

    I model the expected stock return of a target stock as shown in Figure 1. I choose CRRA

    utility as perhaps it is the most widely used utility function in the finance literature. Barro

    (2006) also uses CRRA utility. The CRRA utility is $U_{CRRA} = -C^{1-\eta}$, where C is consumption

    and η is the relative risk aversion coefficient. A CRRA investor who cannot diversify will

    value the target stock as:

    $$U(P) = \pi\,U(P_{alone}) + (1-\pi)\,U(P_{offer}). \qquad (8)$$

    The target stock expected return is defined as

    22 An investor's expected utility of end-of-period wealth, E[U(W)], can be expressed, by Taylor expansion, as

    $U(W_0) + [U''(W_0)/2!]\,\sigma_W^2 + [U'''(W_0)/3!]\,s_W^3 +$ terms of higher order, where $W_0$ is the mean of W,

    $\sigma_W^2$ is the variance of W, and $s_W^3$ is the skewness of W. Typically, $U'' < 0$ and $U''' > 0$, the latter

    implying preference for skewness. For CRRA utility, $U(W) = (1-\theta)^{-1} W^{1-\theta}$ ($\theta > 0$ and $\theta \neq 1$),

    $U''' = \theta(\theta+1)\,W^{-(\theta+2)}$, which is greater than 0, implying skewness preference (Kraus and Litzenberger, 1976).

    23 There are many plausible explanations for lack of diversification, such as narrow framing, fixed trading costs

    (Brennan, 1975), search costs (Merton, 1987), or returns to specialization in information acquisition (Van

    Nieuwerburgh and Veldkamp, 2010).


    $$ER = \frac{\pi\,P_{alone} + (1-\pi)\,P_{offer}}{P} - 1. \qquad (9)$$

    The target standalone value Palone is standardized to 1, and Poffer is chosen to be 1.30; the 30%

    premium is chosen to match the average merger and acquisition premium from the data. From these

    assumptions, we can calculate the target stock expected return.

    [Insert Figure 4 here]

    Figure 4 shows the expected target stock returns with respect to deal failure probability,

    for five risk aversion coefficients ranging from 1 to 5. As Figure 4 shows, although

    CRRA utility functions feature a preference for positive skewness, the skewness effect is

    dominated by aversion to variance, so the target expected return is highest when deal failure

    probability is moderate. This contradicts my finding that targets in deals with moderate

    failure probability (the Medium group and the High group) have a lower return than targets in

    deals with low failure probability (the Low group).
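
The hump shape can be checked with a minimal sketch of equations (8) and (9), using the stated values Palone = 1 and Poffer = 1.30. Two choices here are assumptions rather than details taken from the paper: log utility is used for the η = 1 case, and the failure probability is evaluated on a grid with step 0.01.

```python
import math

P_ALONE, P_OFFER = 1.0, 1.30  # values stated in the text

def utility(c: float, eta: int) -> float:
    """CRRA utility -C^(1-eta); log utility stands in for eta = 1 (assumption)."""
    return math.log(c) if eta == 1 else -(c ** (1.0 - eta))

def inverse_utility(u: float, eta: int) -> float:
    """Consumption level that delivers utility u."""
    return math.exp(u) if eta == 1 else (-u) ** (1.0 / (1.0 - eta))

def expected_return(pi: float, eta: int) -> float:
    """Price from equation (8), then expected return from equation (9)."""
    u = pi * utility(P_ALONE, eta) + (1.0 - pi) * utility(P_OFFER, eta)
    price = inverse_utility(u, eta)
    return (pi * P_ALONE + (1.0 - pi) * P_OFFER) / price - 1.0

grid = [i / 100 for i in range(101)]  # failure probabilities 0.00, 0.01, ..., 1.00
peaks = {}
for eta in range(1, 6):
    returns = [expected_return(pi, eta) for pi in grid]
    peaks[eta] = grid[returns.index(max(returns))]
    print(f"eta={eta}: ER peaks at pi={peaks[eta]:.2f}, max ER={max(returns):.4f}")
```

In this sketch the expected return is zero at both endpoints, where the payoff is certain, and peaks at an intermediate failure probability for every η considered.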

    4.6.2 The disposition effect

    Grinblatt and Han (2005) and Frazzini (2006) argue that the disposition effect can lead to

    excess selling after price increases, which drives the current stock price below fundamental

    value and consequently yields higher future stock returns. Typically, an M&A announcement

    is “good news” for the target shareholders; the disposition effect will therefore predict an

    under-reaction. This effect may be stronger for deals with low failure probability as their

    initial price run-up is likely to be higher. In other words, in deals with low failure probability,

    the target price will increase more on announcement, which will mean higher capital gains

    for existing shareholders. The disposition effect will make these shareholders more likely to

    sell the stock. This may depress the stock’s price beyond fundamentals and thereby lead to

    higher returns in the near future.

    I use the cross-sectional regression framework of Baker and Savasoglu (2002) to test this

    alternative explanation. The dependent variable is the realized target return from 3 days after

    deal announcement to 25 trading days after deal announcement. Ideally, I would like to

    measure the dependent variable over the same, relatively long horizon for all deals. Returns over

    too short a horizon may be too noisy. There is considerable cross-sectional variation in

    deal duration. The median duration is 134 days. 11.1% of the deals are completed or


    withdrawn within 30 days, while 30.2% are completed or withdrawn within 60 days. One

    month seems the best choice: it gives a relatively long horizon and allows the dependent

    variable to be measured over the same horizon for most deals (only 11.1% are measured over

    horizons of less than a month).24

    The key independent variable is deal failure probability π. Overweighting of small

    probabilities predicts that the coefficient on π is negative. I include the target past return and

    the target announcement return in the regression to capture the impact of the disposition

    effect. Since the returns are overlapping in time and thus are not independent, I cluster the

    standard errors by month following Rogers (1993).

    $$R_{tgt,3\text{-}25} = \alpha + \beta_1 \pi_i + \beta_2\,\text{TargetPastReturn}_i + \beta_3\,\text{TargetAnnouncementReturn}_i + \text{Controls} + \varepsilon_i \qquad (10)$$

    [Insert Table 8 here]

    Table 8 shows the results. Model (1) reports the univariate regression results. As expected,

    the coefficient of π is significantly negative, confirming the previous results based on the

    calendar-time portfolio method. In model (2), I control for other deal characteristics, such as

    the attitude of the target, payment method, and target size. In the stylized model shown in

    Figure 1, π*(1-π) measures the volatility of the target stock return. So, following Baker and

    Savasoglu (2002), I also control for π*(1-π). Target Size is the natural logarithm of target

    firm market capitalization 22 trading days prior to deal announcement. After controlling for

    these factors, the coefficient of π is still significantly negative.

    In Models (3), (4) and (5), I test whether the disposition effect can explain my results by

    including target past return and the target announcement return as regressors. The coefficients

    on these regressors are both positive, consistent with the prediction of the disposition effect

    hypothesis. However, controlling for them decreases the magnitude of the coefficient of π by

    only around 20%, and therefore does not change my main conclusion, as the coefficient of π

    is still significantly negative.

    24

    The results are robust if I measure the dependent variable from 3 trading days after deal announcement to 47

    trading days after deal announcement. Please see Part III of the Online Appendix.


    4.7 Limits to arbitrage

    It is reasonable to think that arbitrageurs may not overweight small deal failure

    probability. However, why can they not arbitrage away the positive abnormal return of the

    Low-High portfolio? I therefore examine whether there are any limits to arbitrage that

    prevent arbitrageurs from doing so.

    My study is related to a popular hedge fund trading strategy called merger arbitrage. After

    deal announcement, the target stock price is typically lower than the offer price. Merger

    arbitrage refers to the strategy that attempts to profit from this spread. In doing so,

    arbitrageurs buy the target stock, and if the offer involves stock, sell short the acquirer stock

    to hedge the risk of fluctuations in the acquirer stock price. Evidence shows that merger

    arbitrageurs have a significant impact on target prices during acquisition deals. If limits to

    arbitrage help to explain the undervaluation of targets in deals with low failure probability,

    we would expect the relation between deal failure probability and target returns to be

    stronger when arbitrage is more difficult.

    First, I examine whether merger arbitrageurs exploit this arbitrage opportunity. If so, they

    should trade more of the stocks in the Low portfolio rather than the High portfolio. I infer

    whether they indeed do this by examining the correlation of the performance of the real world

    merger arbitrageurs and the Low and High portfolios. I use the Barclay Merger Arbitrage

    Index to proxy the performance of real world merger arbitrageurs. Due to data availability,

    the period is January 1997 to December 2010. I regress the excess returns of the Low

    portfolio and the High portfolio on the excess return of the Barclay Merger Arbitrage Index,

    the Fama and French three factors, the Carhart (1997) momentum factor, and the Pastor and

    Stambaugh (2003) liquidity factor. The coefficients on the excess return of the Barclay Merger

    Arbitrage Index are 0.657 (t=2.47) and 0.088 (t=0.16), respectively. Only the former is

    statistically significantly different from zero.25

    This suggests that real world merger

    arbitrageurs indeed trade more of the Low stocks than High stocks.

    25

    The results are similar if we use the Merger Arbitrage Index from Hedge Fund Research. If so, the

    coefficients of the excess return of the Merger Arbitrage Index are 0.779 (t=3.00) and 0.061 (t=0.11),

    respectively.


    Next, I examine what limits the arbitrage capacity of merger arbitrageurs. Starting with the

    model specification in equation (10), I add interaction terms between deal failure probability

    and proxies for limits to arbitrage to examine whether limits to arbitrage moderate the

    relation between deal failure probability and target returns.

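    $$R_{tgt,3\text{-}25} = \alpha + \beta_1 \pi_i + \beta_2\,\text{TargetPastReturn}_i + \beta_3\,\text{TargetAnnouncementReturn}_i + \gamma\,\text{Limit-to-Arbitrage}_i + \eta\,\text{Limit-to-Arbitrage}_i \times \pi_i + \text{Controls} + \varepsilon_i \qquad (11)$$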

    If limits to arbitrage constrain the ability of arbitrageurs to arbitrage away the mispricing, we

    would expect the inverse relationship between π and Rtgt,3-25 to be stronger when arbitrage is

    more difficult.
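
To make the role of the interaction term explicit, the sensitivity of the target return to deal failure probability implied by equation (11) can be written as below. This is a sketch that holds the controls fixed and sets aside any control that itself depends on π, such as π(1-π):

```latex
\[
\frac{\partial R_{tgt,3\text{-}25}}{\partial \pi_i}
  = \beta_1 + \eta \,\text{Limit-to-Arbitrage}_i
\]
```

With β1 < 0, a negative η means the inverse relation steepens as the proxy rises, which is the predicted pattern when a higher value of the proxy indicates that arbitrage is more difficult. For a proxy that indicates easier arbitrage, the predicted sign of η is positive.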

    The limits to arbitrage literature emphasizes the role of limited capital. In the presence of

    slow moving capital, merger arbitrage capital is more likely to be constrained if merger

    arbitrageurs have experienced losses in the recent past (Baker and Savasoglu, 2002; Mitchell,

    Pedersen and Pulvino, 2007; Mitchell and Pulvino, 2012) or when more arbitrage capital is

    needed (Baker and Savasoglu, 2002). I use the total market capitalization of failed deals in

    the past 365 days to proxy for arbitrageurs’ recent losses and use the total market

    capitalization of pending deals to proxy for the needed capital.

    Model (1) of Table 9 shows that the coefficient of the interaction between deal failure

    probability and the cumulative arbitrageur losses is negative. Model (2) shows that the

    coefficient of the interaction between deal failure probability and needed arbitrage capital is

    also negative. Both results are consistent with the conjecture that limited capital constrains

    arbitraging activities.

    [Insert Table 9 here]

    Another measure of limits to arbitrage is based on payment type. In cash deals,

    arbitrageurs only need to have a position in the target stock; but in stock deals, they need to

    have two positions: a long position in the target stock and a short position in the acquirer

    stock. The arbitrageurs incur direct costs for both the long and the short positions, and often

    do not receive the full interest on the short sale proceeds (Baker and Savasoglu, 2002). Using

    proprietary equity loan data, Geczy, Musto, and Reed (2002) find that acquirers’ stock is

    expensive to borrow and accounting for short selling costs decreases arbitrage profits


    significantly.26

    Such arbitrage difficulty is likely to be lower for cash deals than for stock

    deals.

    The results are shown in model (3) in Table 9. The prediction here is that the relation

    between deal failure probability and target returns should be weaker for cash deals. In model

    (3), the interaction between deal failure probability and Pure Cash is positive, and the

    interaction between deal failure probability and Pure Stock is negative though not statistically

    significant, consistent with the argument that cash deals are easier to arbitrage than other

    deals.

    Firm size is another variable widely used to measure arbitrage difficulty. However, in

    merger arbitrage, the relation between firm size and arbitrage difficulty is ambiguous. On one

    hand, larger firms are more liquid and may attract more arbitrageurs. On the other hand,

    Baker and Savasoglu (2002) argue that larger targets need more arbitrage capital; they find

    that larger target firms indeed have higher returns. In Table 8, the coefficient on target size is positive in all four

    models with target size as a control and is significant in three of them. Nevertheless, I report

    the result of using firm size as a measure of limits to arbitrage in model (4) of Table 9.

    However, I do not find that target firm size significantly moderates the relation between deal

    failure probability and target returns.

    5. Conclusion

    A large body of experimental evidence shows that decision makers tend to overweight the

    probability of tail events. This paper provides direct evidence on probability weighting using

    data from mergers and acquisitions. I estimate the deal failure probability (1) from deal

    characteristics, and (2) by comparing the target price and the offer price. Next, I test for

    probability overweighting by examining the future target stock returns, conditional on the ex-

    ante failure probability of the deal. Overweighting small failure probabilities lowers the target

    stock price and yields positive abnormal future stock returns. I confirm this prediction and

    find that the target stocks with low failure probability yield significant positive stock returns

    in the period between deal announcement and deal resolution, even though these deals have

    26

    Geczy, Musto, and Reed (2002) analyse equity loans for initial public offerings, DotCom stocks, large cap

    stocks, growth stocks, low-momentum stocks and acquirers’ stocks in merger arbitrage. They conclude that “the

    effect of short-selling frictions appears strongest in merger arbitrage”.


    lower beta and lower downside risk than the other deals. On average, 5.2% failure probability

    is overweighted to 13.2%. The overweighting magnitude is also consistent with the

    probability weighting function of Tversky and Kahneman (1992).

    There are at least two paths for further research. First, probability weighting can be driven

    by either erroneous beliefs about the objective probability or by investors simply

    overweighting tail events in their preferences. I try to distinguish between these two

    possibilities in Part IV of the Online Appendix and show preliminary evidence that appears

    inconsistent with the former. However, this evidence is not conclusive. Further research in

    this direction is needed. Second, Cornelli and Li (2002) and Hsieh and Walkling (2005) show

    that merger arbitrageurs play an important role in solving the free rider problem inherent in

    mergers and acquisitions. It may be interesting to examine further how investors’

    overweighting of small deal failure probabilities affects the trading behavior of merger

    arbitrageurs, and how this might affect the incentives of potential bidders to launch a

    takeover bid.

    References

    Albuquerque, Rui, 2012, Skewness in Stock Returns: Reconciling the Evidence on Firm

    versus Aggregate Returns, Review of Financial Studies 25 (5): 1630-1673.

    Amaya, Dieg