portfolio selection with artificial neural networks
1
Portfolio Selection with Artificial Neural Networks
Andrew J Ashwood
Supervisor: Dr Anup Basu
2
Outline
• Introduction
• Research Motivation
• Neural Network Primer
• Literature Review
• Research Questions
• Network specification
• Results
• Conclusions
• Further work and limitations
3
Research Motivation
• Trading in equities is big business in Australia
– Value of All Ordinaries c. $1.2 trillion
– Stocks turned over in 2012 c. 244 billion
– Funds invested in superannuation c. $1.4 trillion
4
Research Motivation
• Outperforming the broader market is difficult
– From 1969 to 2010, an index investor would achieve returns 80 per cent greater than the average investor in a managed fund. (Malkiel, 2011)
– ‘…if many managers have sufficient skill to cover costs, they are hidden by the mass of managers with insufficient skill. …true α in net returns to investors is negative for most if not all active funds’ (Fama & French, 2009)
5
Research Motivation
• The Australian experience is similar
– 70% of active retail funds underperformed the benchmark over both the 1 and 3 year period (Karaban & Maguire, 2012)
– Over a five year period 69% of active retail funds failed to outperform the ASX200AI (Karaban & Maguire, 2012)
6
Research Motivation
• ‘The attempt to predict accurately the future course of stock prices and thus the appropriate time to buy or sell a stock must rank as one of investors’ most persistent endeavours.’ (Malkiel, 2011)
• Advances in computing power, in combination with the widespread availability of historical datasets, have provided investors with increased opportunity to test markets for predictable returns
7
Research Motivation
• Artificial Neural Networks (‘ANNs’) have several traits that suggest their suitability for stock price prediction
– Ability to generalise and learn
– Provide an analytic alternative to other techniques that rely on assumptions of normality, linearity, and variable independence
8
Neural Network Primer
• An artificial neural network is a mathematical model that mimics the structure and function of biological brains
• A structure of highly interconnected processing neurons
9
Neural Network Primer
• Typical neural processing element
[Figure: a single neuron. Inputs X1 to Xn are multiplied by input weights W1 to Wn and summed to give the neuron activation $v = \sum_{i=0}^{n} W_i X_i$; the transformation $y = \tanh(v)$ gives the output; learning is by error back-propagation.]
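The processing element above (a weighted sum of the inputs passed through a tanh transformation) can be sketched in a few lines of Python. This is a minimal illustration rather than the presenter's implementation: the function name `neuron_output` and the example numbers are assumptions, and a bias term can be represented by fixing one input at 1.

```python
import math

def neuron_output(inputs, weights):
    """Weighted sum of the inputs followed by a tanh transformation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    v = sum(w * x for w, x in zip(weights, inputs))  # neuron activation
    return math.tanh(v)                               # transformation to output

# Illustrative values only (not taken from the study)
y = neuron_output([1.0, 0.5, -0.25], [0.2, -0.4, 0.8])
```

Because tanh is bounded, the output always lies strictly between -1 and 1, whatever the size of the weighted sum.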
10
Neural Network Primer
• Neural networks comprise an interconnected structure of neurons
[Figure: a network with several inputs feeding an input layer, a hidden layer, and an output layer that produces several outputs.]
11
Neural Network Primer
• Network typology – 2 fundamental classes of network architecture
• Feed-forward network – information travels in the forward direction only
– Single layer feed-forward network
– Multi layer feed-forward network
• Recurrent network – contains at least one feedback loop
– Many types (fully recurrent, Hopfield, Elman, Jordan, etc.)
– Several studies demonstrate that feed-forward networks outperform recurrent networks at predicting non-linear time series data
12
Neural Network Primer
• Network learning – The Back Propagation Algorithm
– Input weights are randomly applied to the network
– Network calculations occur and the network produces an output signal
– Simulated output compared to target output to provide system error
– Error is back-propagated through the network
– Process repeats until error reaches desired level or maximum number of iterations is reached
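The back-propagation steps listed above can be sketched as a small pure-Python training loop. This is a toy under stated assumptions, not the network used in the study: the single hidden layer, the learning rate, the stopping threshold, the function name `train`, and the toy data are all illustrative choices.

```python
import math
import random

def train(samples, n_hidden=3, lr=0.1, target_mse=1e-3, max_iter=5000, seed=0):
    """Train a one-hidden-layer tanh network by per-sample gradient descent."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Step 1: random initial weights (last entry of each row acts as the bias)
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    mse, iterations = float("inf"), 0
    for iterations in range(1, max_iter + 1):
        sq_err = 0.0
        for x, target in samples:
            # Step 2: forward pass produces the output signal
            xb = list(x) + [1.0]
            h = [math.tanh(sum(w * v for w, v in zip(row, xb))) for row in w_h]
            hb = h + [1.0]
            y = math.tanh(sum(w * v for w, v in zip(w_o, hb)))
            # Step 3: compare simulated output with the target
            err = y - target
            sq_err += err * err
            # Step 4: back-propagate the error (chain rule through tanh)
            d_out = err * (1.0 - y * y)
            d_hid = [d_out * w_o[j] * (1.0 - h[j] * h[j]) for j in range(n_hidden)]
            for j in range(n_hidden + 1):
                w_o[j] -= lr * d_out * hb[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    w_h[j][i] -= lr * d_hid[j] * xb[i]
        mse = sq_err / len(samples)
        # Step 5: stop at the desired error level or at the iteration cap
        if mse <= target_mse:
            break
    return mse, iterations

# Toy data (an assumption for illustration): learn y = 0.5 * x
data = [([-1.0], -0.5), ([-0.5], -0.25), ([0.5], 0.25), ([1.0], 0.5)]
final_mse, iterations = train(data)
```

The fixed seed only makes the sketch repeatable; different random starting weights will generally lead to different trained networks.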
13
Literature Review
• Neural networks are relatively new.
– No published research in the financial domain prior to 1990. By 1993, over 10 studies per year were being published. (Wong & Yakup, 1998)
14
Literature Review
• Yoon & Swales (1991) used ANNs to predict prices of 98 companies and compared the ANN predictions to multiple discriminant analysis
– The study concluded the ANN technique significantly outperformed multiple discriminant analysis
15
Literature Review
• Kryzanowski, Galler & Wright (1992) used ANNs to pick stocks
– Fourteen fundamental ratios were used as inputs – coded as trending upwards/downwards or stable
– The ANN model correctly predicted price directional movement over the next year in 66% of cases
– The ANN did not predict extreme movements well
16
Literature Review
• Hall (1994) used ANNs to create a style-based portfolio
– ANNs used to identify the stocks with the largest recent price increases and apply this style to the current market to select a portfolio
– Utilised 1,000 largest US stocks
– 37 inputs used to predict future stock prices
– The portfolio has exceeded the S&P500 since inception after transaction costs, but ‘Actual performance results and specific details of the system are proprietary’.
17
Literature Review
• Jai & Lang (1994) used ANNs to trade the TSE
– Used 16 technical indicators to predict the trend of the price movement over the subsequent 6 trading days
Year | 1990 | 1990 | 1991 | 1991
Type of ANN | Fixed structure | Dual adaptive structure | Fixed structure | Dual adaptive structure
Total return % | -30.66% | 21.61% | 25.76% | 29.20%
Annual TSEWPI return % | -39.52% | -39.52% | 23.42% | 23.42%
Note the large divergence in performance
18
Literature Review
• Freitas et al. (2001) used neural networks to provide an estimate of future returns of 66 representative stocks listed on the Brazilian BOVESPA stock exchange.
– Worst network prediction error 33%
– Best network prediction error 7%
– Average error almost 17%
– Neural network portfolios achieved better performance in 19 out of 21 weeks and outperformed the benchmark portfolio by 12.39%.
19
Literature Review
• Ellis & Wilson (2005) used ANNs to classify Australian listed REITs as value stocks.
– Value stocks were bought and portfolio returns measured
– ANN portfolio outperformed the benchmark index by 6.97% per month on average
– Sharpe and Sortino ratios significantly greater than the benchmark indices
20
Literature Review
• Ko & Lin (2008) used ANNs for portfolio selection on Taiwan Stock Exchange (‘TSE’).
– Focussed on 21 different equities
– Used price, variance, covariance as input variables
– ANN used to predict equity allocation ratio
– 5yr data set including training & validation sets
– ANN model outperformed TSE (no performance measurement framework)
21
Literature Review
• Vanstone et al. (2010) used neural networks based on a stock trading filter rule. The filter rule bought stocks when the following criteria were met:
• PE < 10
• Market Price < Book Value
• ROE < 12
• Dividend Payout Ratio < 25%
– System yielded a low number of trading opportunities
– The return achieved by the system exceeded the return available using the original filter rule
– It is worth noting, however, that the higher return achieved was at least partially due to increased risk.
22
Research Questions
• Core Research Question 1
– Can neural networks that utilise a combination of fundamental and technical inputs predict (a) future stock prices, (b) stock price directional movement?
• Secondary Research Question 1
– Which network inputs contribute most to network accuracy?
23
Research Questions
• Secondary Research Question 2
– What network parameters lead to optimal predictive capability?
• Secondary Research Question 3
– Do neural networks predict any of the known performance attribution factors?
• All research questions are to be answered with reference to stocks listed on the ASX
24
Network Specification
• Kaastra & Boyd (1996) framework for the design of neural networks for financial and econometric time series
Step 1: Variable selection
Step 2: Data collection
Step 3: Data preprocessing
Step 4: Training, testing, and validation sets
Step 5: Neural network paradigms
– number of hidden layers
– number of hidden neurons
– number of output neurons
– transfer functions
Step 6: Evaluation criteria
Step 7: Neural network training
– number of training iterations
– learning rate and momentum
Step 8: Implementation
(Iterative process)
25
Variable Selection
• Fundamental inputs: Profit Margin, DPS, Dividend Yield, CF per share, BV per share, BV per share growth, Current Ratio, Quick Ratio, Asset Turnover, Debt to Equity, Interest Cover, Price to CF, PE, Price to Sales, ROA, ROE, FCF per share, Sales growth
• Technical inputs:
– Open, High, Low, Close, Volume
– 20d MA, 50d MA, 20d EMA, 50d EMA
– MACD, RSI, CLV, ADI
– 20d Momentum, 50d Momentum, 6mth Momentum, 12mth Momentum
– Fast Stochastic Oscillator (FSO), Fast Stochastic Indicator (FSI), Slow Stochastic Oscillator (SSO), Slow Stochastic Indicator (SSI)
– Rate of Change of: Price 1wk, Price 4wk, Price 10wk, MACD, MACD Signal, RSI, ADI, 20d MA, 50d MA, 20d EMA, 50d EMA, 12mth Mom, FSO, FSI, SSO, SSI
26
Data Set
• Dataset obtained from Bloomberg
• Weekly data from Jan 1997 to Dec 2011
• Early data (1997-1999) used to calculate technical indicators
• 20 widely held ASX50 stocks selected
AMC ANZ BHP CBA CCA
CSL DJS GPT LEI NAB
ORG QAN QBE RIO SUN
TLS WBC WDC WES WOW
27
Data Preprocessing
• 3 pre-processing algorithms utilised:
– Remove duplicate data
• Duplicate data provides no useful information to the ANN
• Speeds calculation
– Scale data
• Ensures all inputs treated equally
• Avoids saturation of the tansig transformation function
– Mark missing data
• Ensures that missing data is not used in the learning process
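The three pre-processing steps can be sketched as follows. This is an illustration under assumptions, not the routines used in the study: the function names, the use of `None` for a missing observation, and the target range of [-1, 1] (chosen here because it matches the output range of a tanh-type transfer function) are all hypothetical.

```python
import math

def remove_duplicates(rows):
    """Drop rows that exactly repeat an earlier row, keeping first occurrences."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

def scale_column(values, lo=-1.0, hi=1.0):
    """Linearly map observed values onto [lo, hi]; None is marked as NaN."""
    present = [v for v in values if v is not None]
    vmin, vmax = min(present), max(present)
    span = vmax - vmin
    scaled = []
    for v in values:
        if v is None:
            scaled.append(math.nan)          # marked as missing, to be skipped in training
        elif span == 0:
            scaled.append((lo + hi) / 2.0)   # constant column: map to the midpoint
        else:
            scaled.append(lo + (hi - lo) * (v - vmin) / span)
    return scaled

rows = remove_duplicates([[1, 2], [1, 2], [3, 4]])
scaled = scale_column([2.0, None, 4.0, 6.0])
```

Scaling every input to the same range stops large-valued series (such as volume) from dominating small-valued ratios, and keeps the weighted sums away from the flat tails of the transfer function.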
28
Frequency of Data
• Little theory to assist in determining optimum frequency of data for a given prediction horizon. Deboeck (1994):
– Long term predictions are unreliable
– Short term predictions are unreliable
– There is some period into the future for which useful predictions can be made
29
Training, Testing, Validation Sets
• The data set needs to be divided into training, validation, and testing datasets (Beale et al., 2011)
– Training set – used to compute the gradient and update the weights
– Validation set – the data set used for monitoring the error during training
– Testing set – out-of-sample test data set
30
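Training, Testing, Validation Sets
• Walk forward testing
[Figure: timeline from t0 to tn showing five windows (Window 1 to Window 5) stepped forward through time; each window consists of a training window, a validation window, and a testing window, and the testing windows together make up the testing period.]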
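Walk forward testing can be sketched as a generator of index ranges. This is a minimal sketch under assumptions: the function name `walk_forward_windows`, the choice to step each window forward by exactly one testing window, and the example lengths are illustrative and not taken from the study.

```python
def walk_forward_windows(n_obs, train_len, val_len, test_len):
    """Yield (train, validation, test) index ranges that step forward by test_len."""
    start = 0
    while start + train_len + val_len + test_len <= n_obs:
        t1 = start + train_len      # end of training window
        t2 = t1 + val_len           # end of validation window
        t3 = t2 + test_len          # end of testing window
        yield range(start, t1), range(t1, t2), range(t2, t3)
        start += test_len           # next window begins one test period later

# Illustrative sizes only: 100 observations, 40 train, 10 validation, 10 test
windows = list(walk_forward_windows(100, 40, 10, 10))
```

Stepping by the testing length means the testing windows never overlap, so every out-of-sample observation is predicted exactly once by a network trained only on earlier data.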
31
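Network Implementation
[Figure: network architecture diagram. Input layer: 59 inputs (18 fundamental inputs, 41 technical inputs) fed through a tapped delay line (TDL) of 4 → 20. Hidden layer: 30-150 neurons with input weights IW1,1, bias b1, a summing junction, and a tansig transfer function. Output layer: 1 neuron with layer weights LW2,1, bias b2, a summing junction, and a tansig transfer function giving the output. A layer weight LW1,2 with its own TDL is also shown. Legend: summing junction (+), tapped delay line (TDL), bias unit (b), input weight (IW), layer weight (LW), transfer function.]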
32
Evaluation Criteria & Training
• Mean Squared Error (‘MSE’) selected as error function to be minimised by the network
• Convergence method adopted for network training. Training occurs until the earlier of:
– 100,000 training iterations occur, OR
– 50 iterations with no reduction in validation set error
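The stopping rule above can be sketched as a loop around any training routine. The two limits (100,000 iterations, 50 iterations without improvement) come from the slide; everything else is an assumption for illustration, including the function names and the callback interface (`train_step` performs one training iteration, `validation_error` returns the current validation-set MSE).

```python
def mean_squared_error(predicted, target):
    """Average of squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

def train_until_converged(train_step, validation_error,
                          max_iter=100_000, patience=50):
    """Train until max_iter is reached or validation error stalls for `patience` iterations."""
    best = float("inf")
    stale = 0
    for iteration in range(1, max_iter + 1):
        train_step()
        err = validation_error()
        if err < best:
            best, stale = err, 0    # improvement: reset the stall counter
        else:
            stale += 1
            if stale >= patience:   # no reduction for `patience` iterations
                return iteration, best
    return max_iter, best

# Hypothetical error sequence: improves three times, then stalls
errors = iter([5.0, 4.0, 3.0] + [3.0] * 100)
stopped_at, best_error = train_until_converged(lambda: None, lambda: next(errors))
```

Monitoring the validation set rather than the training set is what lets the rule stop before the network starts fitting noise in the training data.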
33
Implementation
Parameter | Options | No. of options
Stocks/portfolios | 20 stocks + equally and value weighted portfolios of these stocks | 22
Input type | Price OR Technical indicators OR Fundamental indicators OR Price + Technical OR Price + Fundamental OR Price + Technical + Fundamental | 6
Lookback window | 4, 8, 12, 16, 20 periods | 5
Hidden layer size | 30, 60, 90, 120, 150 neurons | 5
Training period length | 3mths, 6mths, 12mths, 24mths | 4
Walk forward testing | 10 x 1-year testing periods from 2001 to 2011 | 10
Total networks = 132,000
• All testing undertaken on QUT supercomputer ‘Lyra’
• At time of testing Lyra consisted of:
– 106 compute nodes
– 1572 x 64 bit Intel Xeon cores (consisting 11,736 core processors)
– 32.8 TeraFlop Theoretical (double precision), 58.6 TeraFlop (single precision)
– 10.4 TeraBytes of main memory
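The total of 132,000 networks follows from multiplying out the option counts (22 x 6 x 5 x 5 x 4 x 10). A short sketch that enumerates the grid makes the arithmetic explicit; the variable names and option labels are illustrative, not identifiers from the study.

```python
from itertools import product

# Option lists copied from the parameter table; names are illustrative.
input_types = ["price", "technical", "fundamental",
               "price+technical", "price+fundamental",
               "price+technical+fundamental"]
lookback_windows = [4, 8, 12, 16, 20]          # periods
hidden_layer_sizes = [30, 60, 90, 120, 150]    # neurons
training_months = [3, 6, 12, 24]
n_assets = 22          # 20 stocks + 2 portfolios
n_test_windows = 10    # 1-year walk forward testing periods

# Every network specification tried for one asset in one testing window
specs = list(product(input_types, lookback_windows,
                     hidden_layer_sizes, training_months))
specs_per_window = len(specs)
total_networks = specs_per_window * n_assets * n_test_windows
```

Here `specs_per_window` is 600 and `total_networks` is 132,000, matching the table.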
34
Results
• The high number of network models specified (132,000) meant that manual review and interpretation of individual networks was not feasible.
• The algorithm selected the network configuration that provides the lowest mean squared error over the one year walk forward testing window. The algorithm therefore provides 220 outputs, made up as follows:
– 20 stocks + 2 portfolios
– 10 walk forward testing windows for each stock/portfolio
– Total outputs = 220
35
Input Type
[Figure: Histogram of Input Type. x-axis: Input Type (Price, Fund, Tech, Price+Fund, Price+Tech, Price+Tech+Fund); y-axis: Frequency (0 to 50). Bar labels in extraction order: 17%, 14%, 10%, 21%, 22%, 16%.]
36
Lookback Window
[Figure: Histogram of Lookback. x-axis: Lookback Period (no. inputs): 4, 8, 12, 16, 20; y-axis: Frequency (0 to 50). Bar labels in extraction order: 19%, 21%, 19%, 21%, 20%.]
37
Hidden Layer Size
[Figure: Histogram of Hidden layer. x-axis: Hidden Layer Size (no. of nodes): 30, 60, 90, 120, 150; y-axis: Frequency (0 to 120). Bar labels in extraction order: 47%, 16%, 9%, 18%, 10%.]
• Smaller hidden layer produced optimum network in 47% of cases
38
Training Period
[Figure: Histogram of Training period. x-axis: Training Period Length (3mths, 6mths, 12mths, 24mths); y-axis: Frequency (0 to 80). Bar labels in extraction order: 35%, 20%, 15%, 30%.]
39
Predictive Capability
[Figure: RMSE as % of Share Price. x-axis: Year/Timestep (2002 to 2011, timesteps 1 to 10); y-axis: RMSE % of Share Price (0% to 30%); one series per stock: AMC, ANZ, BHP, CBA, CCL, CSL, DJS, GPT, LEI, NAB, ORG, QAN, QBE, RIO, SUN, TLS, WBC, WDC, WES, WOW.]
• Generally there is a major spike in error in 2008 & 2009
• Generally predictions are fairly consistent from 2002-2007
40
Predictive Capability
[Figure: ASX200 - 2002 to 2011, ASX200 Adjusted Close Price. x-axis: date (01/2002 to 01/2012); y-axis: 2,000 to 7,000.]
• Sustained benign trading conditions leading up to GFC
• 2008 Major reversal, Bear Market
• 2009 Major reversal, Bull Market
• 2010 Return to benign trading conditions
41
Directional Movement
[Figure: Directional Movement Prediction Accuracy. x-axis: Year/Timestep (2002 to 2011, timesteps 1 to 10); y-axis: Prediction Accuracy (0% to 80%); one series per stock and portfolio: AMC, ANZ, BHP, CBA, CCL, CSL, DJS, GPT, LEI, NAB, ORG, QAN, QBE, RIO, SUN, TLS, WBC, WDC, WES, WOW, EW, VW.]
42
Directional Movement
[Figure: Directional Movement Prediction Accuracy Over 10-year Testing Horizon. x-axis: Stocks & Portfolios (AMC, ANZ, BHP, CBA, CCL, CSL, DJS, GPT, LEI, NAB, ORG, QAN, QBE, RIO, SUN, TLS, WBC, WDC, WES, WOW, EW, VW); y-axis: Prediction Accuracy (0% to 100%). Bars significant at the 5% level are marked.]
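The slides do not state which test was used to judge significance at the 5% level. One common choice for directional accuracy, shown here purely as an assumption, is a one-sided binomial test against the 50% accuracy expected from random guessing; the function name and the example counts are hypothetical.

```python
from math import comb

def binomial_p_value(hits, n, p0=0.5):
    """One-sided p-value: P(X >= hits) for X ~ Binomial(n, p0)."""
    return sum(comb(n, k) * p0 ** k * (1 - p0) ** (n - k)
               for k in range(hits, n + 1))

# Hypothetical example: 290 correct directional calls out of 520 weekly predictions
p_value = binomial_p_value(290, 520)
is_significant = p_value < 0.05
```

With several hundred predictions per stock, even an accuracy only a few points above 50% can be statistically significant, which is why accuracy and significance are reported separately.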
43
Portfolio Selection
• The simulated price returns were then used to form a long and a long-short portfolio
• Portfolios formed at the beginning of each month during the 10-year testing period and held until the end of the month
• Stocks weighted according to the absolute value of their simulated price return
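The weighting rule can be sketched as below. Two points are assumptions, since the slides do not spell them out: that the long portfolio holds only stocks with a positive simulated return, and that in the long-short portfolio a negative weight denotes a short position. The function name and tickers are hypothetical.

```python
def portfolio_weights(predicted_returns, long_short=False):
    """Weight stocks in proportion to |predicted return|.

    Long-only keeps positive forecasts; long-short keeps all forecasts,
    with the sign of each weight giving the direction of the position.
    """
    if long_short:
        selected = dict(predicted_returns)
    else:
        selected = {s: r for s, r in predicted_returns.items() if r > 0}
    gross = sum(abs(r) for r in selected.values())
    if gross == 0:
        return {}   # nothing to hold this month
    return {s: r / gross for s, r in selected.items()}

# Hypothetical simulated monthly price returns
forecasts = {"AAA": 0.02, "BBB": -0.01, "CCC": 0.01}
long_weights = portfolio_weights(forecasts)
long_short_weights = portfolio_weights(forecasts, long_short=True)
```

In both cases the absolute weights sum to 1, so the two portfolios carry the same gross exposure and differ only in whether negative forecasts are acted on.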
44
Portfolio Price Returns
Portfolio | Price Return (2002-2011)
ASX200 | 19.6%
ANN Long Portfolio | 205.5%
ANN Long-Short Portfolio | 196.6%
• Raw Results:
– Do not include transaction costs
– No performance measurement framework
45
Performance Measurement
• The portfolio price returns were then regressed against the Carhart 4-factor model
Regression Results
 | Long Portfolio | Long-Short Portfolio
α | 0.016** | 0.013***
RMRF | 0.712*** | -0.011
SMB | -0.317 | -0.604**
HML | -0.416** | -0.306
WML | -0.011 | 0.038
R2 | 0.283*** | 0.215***
• Both portfolios have a significantly positive alpha
• Both models are significant
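For reference, the Carhart 4-factor regression is conventionally written as below. The notation is an assumption: $R_{p,t}$ is the portfolio return in period $t$ (usually measured in excess of the risk-free rate, which the slides do not state), the factor names follow the table, and $\varepsilon_t$ is the residual.

```latex
R_{p,t} = \alpha + \beta_{1}\,\mathrm{RMRF}_t + \beta_{2}\,\mathrm{SMB}_t
        + \beta_{3}\,\mathrm{HML}_t + \beta_{4}\,\mathrm{WML}_t + \varepsilon_t
```

The intercept $\alpha$ is the return left unexplained by market, size, value, and momentum exposure, which is why a significantly positive $\alpha$ is read as outperformance.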
46
Optimal Network Portfolios
• This analysis confirms that ANNs can be used for portfolio selection to achieve positive alpha
HOWEVER
• These results are based upon the most accurate network specification – which is only known post hoc
• The problem is whether ANNs can achieve positive alpha without this post hoc knowledge
47
Median Network Portfolios
• For each walk-forward testing window for each stock, 600 different network specifications were trialled
– 6 input options
– 5 lookback windows
– 5 hidden layer sizes
– 4 training period lengths
– 600 network options per stock per testing period
• We have confirmed that the most accurate network achieves positive alpha – what about the median network?
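Picking the most accurate and the median network for one stock and one testing window amounts to ranking the 600 specifications by error. The sketch below assumes the ranking is by test-window MSE and, because 600 is even, takes the lower of the two middle networks as the median; both choices and all names are illustrative.

```python
def pick_networks(mse_by_spec):
    """Return (most_accurate_spec, median_spec) ranked by MSE, lowest first."""
    ranked = sorted(mse_by_spec, key=mse_by_spec.get)
    best = ranked[0]
    median = ranked[(len(ranked) - 1) // 2]   # lower middle when the count is even
    return best, median

# Hypothetical errors for four network specifications
example = {"spec_a": 0.3, "spec_b": 0.1, "spec_c": 0.2, "spec_d": 0.4}
best_spec, median_spec = pick_networks(example)
```

Unlike the most accurate network, the median network is a fairer proxy for what an investor without post hoc knowledge might have chosen.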
48
Median Network Portfolios
Regression Results
 | Long Portfolio | Long-Short Portfolio
α | 0.012** | -0.001
RMRF | 0.137 | 0.003
SMB | -0.050 | -0.011
HML | -0.310 | -0.064
WML | -0.326 | -0.115
R2 | 0.084* | 0.012
• Only the Long portfolio achieved a significant alpha
• Only the Long portfolio model is significant and the Carhart style-factors explain a lower proportion of variation in data
49
Distribution of Alphas
• Network parameter combinations were applied uniformly to all stocks and the long portfolio based on expected returns was formed.
• Mimics a real world trading situation
• This process resulted in 600 measurements of alpha.
50
Distribution of Alphas
[Figure: Histogram for Alpha; y-axis: frequency (0 to 120).]
• Median = 0.0061
• Only 12% of the networks tested have negative alpha
• 127 of 529 positive alphas (27%) were statistically significant (p < 0.05)
51
Conclusions
• Testing was undertaken on several network parameters:
– input type
– hidden layer size
– lookback window
– training period length
• No useful guidance on heuristics for the input type, lookback window, or training period length.
• Hidden layer size showed some promise. Almost half of the most accurate network models utilised the smallest hidden layer of 30 nodes.
52
Conclusions
• Price prediction performance was mixed.
• Most networks performed poorly at the beginning of the global financial crisis (beginning of 2008) and at the end (end of 2009). The networks fail to predict major reversals.
• The networks did not predict the portfolio price returns well. The value weighted portfolio error ranged from 9% (best year) to 32% (worst year) and only achieved a RMSE below 10% in 1 of the 10 testing years.
53
Conclusions
• Stock directional movement prediction performance was assessed.
• The ANN models predicted directional movement with better than 50% accuracy for all stocks and portfolios except for BHP.
• Significant results were achieved for 13 of the 22 stocks and portfolios tested.
54
Conclusions
• Price predictions used for long and long-short portfolio selection
• Both portfolios achieved significantly positive alpha.
• The optimal network specification was selected by comparing the simulated prices to the target prices, thus requiring post hoc knowledge.
55
Conclusions
• A distribution of alpha for all 600 different network specifications was produced.
• Could not reject the hypothesis that the distribution was normal at the 5% significance level.
• 88% of the networks produced positive alpha (529 out of 600), 127 of which were significant (27%).
• 12% of networks produced negative alpha (71 out of 600), none of which were significant.
56
Limitations & Future Research
• Research tested several network parameters.
• Hidden layer size provided promising results.
• Further research is required to develop heuristics.
• Future research should look at individual stocks and industry sectors.
57
Limitations & Future Research
• The analysis has been undertaken on a price return basis, rather than an overall returns basis.
• Study utilised 20 widely held stocks. Needs testing over wider investment universe. Survivorship bias.
• Impact of trading costs not included in analysis
58
Questions and Discussion