ftse trend forecasting using neural nets

FTSE Trend Forecasting using Neural Nets Preliminary Results and Findings Leonard Aye 1994

Upload: leonard-aye

Post on 13-Nov-2014


DESCRIPTION

Found this file amongst my old PC's hard disk. I can't believe I wrote it all those years ago. What the hell was I thinking?

TRANSCRIPT

Page 1: FTSE Trend Forecasting Using Neural Nets

FTSE Trend Forecasting using Neural Nets

Preliminary Results and Findings

Leonard Aye

1994

Contents

Project Summary
  Project Aim
Introduction
  Report layout
  Software package
Neural Networks
  Brief introduction
Input Data Manipulations
  Introduction
  Input data manipulations
    General manipulations
    Indexes manipulations
    Interest rates
    Exchange rates
    Economic data
    Futures
  Complete list of input data sets
Output Data Selection
  Number of predictive items in the neural net
    Short term prediction
    Long term prediction
  Selection of predictive item
    Selection of FTSE difference
Network Tuning
    Hidden nodes
    Learning rate (0.0–1.0)
    Momentum (0.0–0.9)
    Learning Threshold (0.0–3.0)
    Number of presentation
    Presentation type
    Minimum and maximum values
Results
  Results evaluation methods
    Trading: Short term prediction
    Forecasting: Long term prediction
  Preliminary results
    Result sheets
    Test 1: FTSE +1D MOM prediction
    Test 2: FTSE +2D MOM (residual) prediction
    Test 3: FTSE +65D MOM prediction
  Reduction of input data sets
    Numerical Analysis
    Direct experimentation
Conclusions
  Highlights
  Conclusions
    Input data manipulation
    Output data selection
    Network Tuning
    Analysis methods
Appendices
    Appendix A: Back-propagation algorithm
    Appendix B: FTSE Analyses
Appendix A
  Back-propagation algorithm
Appendix B
    Momentum (Close to Close Difference)
    Returns
    Percentage Change of Momentum (PCM)
    Moving Average (MAV)
    Rate of Change (ROC)
    Moving Average Convergence-Divergence (MACD)
    Relative Strength Index (RSI)
    Zero trend close-close volatility (ZCCV)

Copyright © Len Aye 1994

PROJECT SUMMARY

Project Aim

This report is the result of 3 months' work at a stockbroking firm based in London. The aim of the project is to add value to existing business areas by using Neural Networks to make predictions about future Index levels. The FTSE index was chosen as the initial target for the estimation.

The assumption is that, except in the case of unexpected shocks (e.g. the invasion of Kuwait), the likely future levels of the market are largely contained in the data available to market participants today. The amount of that data is so vast that turning it into usable information is a difficult task. The function of the neural network is to help discriminate between significant and insignificant data, and to discover patterns in the data which enable it to make estimates about the future. The intention is not that the neural network should stand alone but that it will be used to complement the existing methods.

From the Technical Analysis perspective, the required time scale for the FTSE estimation is 3 months, with a predictive accuracy of ±1.5%. For Trading purposes, a 1 or 2 day estimate is required with an accuracy of 0.5% for large moves (greater than ±0.75%), but with an overriding requirement of getting the direction of movement correct.

The task of performing financial predictions, or any other analysis, using neural nets involves 4 major steps: input data selection, output data selection, network tuning and analysing results from the network. The purpose of the report is to describe our initial findings in these four areas, namely to establish:

- the most promising data sets that could be used as indicators for FTSE prediction;
- the appropriate output parameters which could be predicted most accurately by NeuroShell;
- the parameters in NeuroShell that are most likely to affect the overall accuracy of the results, and the methods used in tuning these parameters; and
- the appropriate methods for analysing the results.


INTRODUCTION

The task of accurately predicting the future value of the FTSE 100 Share Index, either one day or a few months ahead, is by no means easy. In the past, and even now, statistical tools have been used and have proved successful, up to a point, in predicting such financial indicators. However, a different class of computerised tools is now becoming available which can be used alongside the statistical methods in predicting data consisting of non-linear patterns. This new class of tools is called Neural Networks (or neural nets); it originated in the fields of psychology and cognitive science and later crossed over to computing. The idea of neural nets was first investigated in the 1940s, and only recently have practical, off-the-shelf tools become available. Neural nets have been applied to such diverse fields as classification (speech, image and hand-written character recognition, medical screening, geo-demographic analysis); control of complex non-linear plants such as engines and chemical processes; data fusion (medical diagnosis, sales forecasting, credit/loan risk analysis); and, of course, prediction (financial systems and exchange rate forecasting).

Report layout

This report is a summary of work carried out during the first 6 months of the project. In order to understand the results of our experiments, the reader needs some basic understanding of neural nets. Hence, the section ‘Neural Networks’ briefly explains the idea behind the principle; readers already familiar with the subject may skip it.

As stated earlier, the task of performing financial predictions, or any other analysis, using neural nets involves 4 major steps: input data selection, output data selection, network tuning and analysing the results from the network. The main body of the report is therefore broken down into 4 sections to reflect these 4 steps [1].

The section ‘Input Data Manipulations’ shows the data sets that were acquired and how they were manipulated so that they can be used as inputs to the neural network. The next step is to decide what we want the network to produce as outputs, i.e. the items to be predicted. This is not as obvious as one would expect. The section ‘Output Data Selection’ details the various parameters that were tested for their suitability as predictive items for the FTSE index. Once we had established the items to be used as input and output, we trained the network. The section ‘Network Tuning’ describes the parameters involved in training a network (within the confines of the NeuroShell package) and how they were tuned. After the tests were carried out, their results were analysed, and the ‘Results’ section highlights the observations from the tests. The last section, ‘Conclusions and future plans’, presents our findings and observations from each of the previous sections and our plans for the next 6 months of the project. Readers who are not technically inclined may skip to this last section for a condensed summary of the report.

[1] The reader should be aware that the optimal data sets or parameters for each step were not obtained in isolation from the other steps, but in parallel, by running experiments iteratively.


Software package

The package that we have used for all the experiments is called NeuroShell® [2]. In this report we use the term ‘neural net’ when the context applies to neural networks in general and ‘NeuroShell’ when the context refers to the particulars of the package.

[2] NeuroShell™ is a trademark of Ward Systems Group, Inc., 245 West Patrick Street, Frederick, Maryland 21701, USA. Tel: (+1) 301 662-7950.


NEURAL NETWORKS

Brief introduction

Neural networks are typically composed of interconnected “units”, and each connection is associated with a modifier weight [3]. Each unit converts the pattern of incoming activities that it receives into a single outgoing activity that it broadcasts to other units. It performs this conversion in two stages. First, it multiplies each incoming activity by the weight on the connection and adds together all these weighted inputs to get a quantity called the total input. Second, a unit uses an input-output function that transforms the total input into the outgoing activity (see Figure 2.1 below).

[Figure 2.1: Functions of a unit in a neural network. Diagram labels: input activity, weight, weighted input, sum, input-output function, output.]

To make a neural network that performs some specific task, the weights on the connections and the way the units are connected to each other must be set appropriately. The connections determine whether it is possible for one unit to influence another. The weights specify the strength of the influence.

The most common type of neural network consists of three layers of units: a layer of input units is connected to a layer of “hidden” units, which is in turn connected to a layer of output units. The activity of the input units represents the raw information that is fed into the network. The activity of each hidden unit is determined by the activities of the input units and the weights on the connections between the input and hidden units. Similarly, the behaviour of the output units depends on the activity of the hidden units and the weights between the hidden and output units (see Figure 2.2). The number of hidden layers in a network depends very much on the problem to be solved using the network.
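As an illustration of the two-stage unit computation and the three-layer arrangement described above, the following is a minimal Python sketch. It is not NeuroShell code: the logistic input-output function, the function names and the weights are all assumptions chosen for the example.

```python
import math

def logistic(x):
    # One common choice of input-output function (an assumption here;
    # the report does not say which function NeuroShell uses).
    return 1.0 / (1.0 + math.exp(-x))

def unit_output(activities, weights, fn=logistic):
    # Stage 1: multiply each incoming activity by its connection weight
    # and add the weighted inputs together to get the total input.
    total_input = sum(a * w for a, w in zip(activities, weights))
    # Stage 2: transform the total input into the outgoing activity.
    return fn(total_input)

def layer_output(activities, weight_rows, fn=logistic):
    # One row of weights per receiving unit in the layer.
    return [unit_output(activities, row, fn) for row in weight_rows]

def forward(inputs, hidden_weights, output_weights):
    # Input layer -> hidden layer -> output layer.
    hidden = layer_output(inputs, hidden_weights)
    return layer_output(hidden, output_weights)

# Made-up weights: 3 input units, 2 hidden units, 1 output unit.
hidden_w = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
output_w = [[1.0, -1.0]]
print(forward([1.0, 0.5, 2.0], hidden_w, output_w))
```

Changing any weight changes how strongly one unit influences another, which is what training adjusts.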

[3] Hinton, G. E. (1992), How Neural Networks Learn from Experience, Scientific American, September 1992, pp 105-109.


[Figure 2.2: A common three-layer neural network, with input units I1 to I3 (input layer), hidden units H1 to H5 (hidden layer) and output units O1 and O2 (output layer).]

To train a network, the input patterns are presented to the network and the actual activity of the output units is compared with the desired activity. The error, defined as the square of the difference between the actual and the desired activities, is then calculated. The weight of each connection is then changed in order to reduce the error. This process is repeated until the network classifies, or recognises, every input pattern correctly.
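The training loop described above can be sketched for the simplest possible case, a single unit with a logistic input-output function. This is a deliberately reduced illustration, not the full back-propagation algorithm of Appendix A; the learning rate, the number of presentations, the bias term and the toy patterns are assumptions made for the example.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict(weights, bias, inputs):
    return logistic(sum(w * x for w, x in zip(weights, inputs)) + bias)

def train_unit(patterns, n_inputs, learning_rate=0.5, epochs=5000):
    """Train one logistic unit; patterns is a list of (inputs, desired) pairs."""
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for inputs, desired in patterns:
            actual = predict(weights, bias, inputs)
            # Error = (actual - desired) ** 2. Its gradient with respect to
            # the total input is 2 * (actual - desired) * actual * (1 - actual).
            delta = 2.0 * (actual - desired) * actual * (1.0 - actual)
            # Move each weight a small step in the direction that reduces the error.
            weights = [w - learning_rate * delta * x for w, x in zip(weights, inputs)]
            bias -= learning_rate * delta
    return weights, bias

# Toy patterns (logical OR), purely illustrative.
patterns = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
weights, bias = train_unit(patterns, n_inputs=2)
total_error = sum((predict(weights, bias, x) - d) ** 2 for x, d in patterns)
print(total_error)
```

A multi-layer network repeats the same idea, with the error passed backwards from the output units to the hidden units.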


INPUT DATA MANIPULATIONS

Introduction

Of the numerous financial data at our disposal we have chosen the following as suitable indicators for the prediction of FTSE. These data sets are classified into their relative groups, as follows:

Indexes
- FTSE 100
- FTSE Eurotrack 100
- Dow Jones
- DAX
- NIKKEI
- CAC 40

Interest rates
- UK Base rate, UK Interbank 3 Months, UK 20 yr. Gilt Yield
- US Base rate, US Interbank 3 Months, US 30 yr. Bond Yield
- German Lombard rate, German Interbank 3 Months, German 10 yr. Bond Yield
- French Base rate, French Interbank 3 Months, French 10 yr. Bond Yield
- Japan Base Rate, Japan Interbank 3 Months, Japan 10 yr. Bond Yield

Exchange rates
- US $ – £ Sterling
- French Franc – £ Sterling
- Japanese Yen ¥ – £ Sterling
- German Marks DM – £ Sterling

Economic data
- UK Money supply, UK RPI (inflation), UK GDP
- US Money supply, US RPI (inflation), US GDP
- French Money supply, French RPI (inflation), French GDP
- German Money supply, German RPI (inflation), German GDP
- Japan Money supply, Japan RPI (inflation), Japan GDP

Futures trading
- UK 3 month Sterling (Short Sterling)
- FTSE
- Long Gilt
- US T-Bond
- 3 month Eurodollar
- German Government Bond (Bund)

The list above shows our initial list of financial and economic indicators that we have decided to use as predictive variables. The data sets as they stand in their raw form contain historical information that is not directly apparent in the data, and by calculating their derivatives (e.g. moving averages, etc.) this hidden information or patterns can be brought to the surface and made more explicit, and consequently be recognised by the neural network.


The sections below describe how the raw data were analysed and the types of derivatives calculated.

Input data manipulations

General manipulations

The following adjustments were applied to all the data sets.

1–Spikes in data

When calculating the derivatives of a data set (index values in particular) we need a way of handling sudden rises or falls of large magnitude, e.g. when the stock market crashed the FTSE dropped by over 250 points. Because of this crash, all the derivatives that were calculated (moving averages, differences, rate of change, etc.) have large spikes in them. Since our interest is in the direction of movements, and less so in their absolute size above a given level, we can reduce the size of these spikes without losing information.

Another reason for dealing with spikes is that the precision of the NeuroShell output is determined by the range between the minimum and maximum values set for a particular data series. The NeuroShell manual suggests that, when dealing with spikes, the minimum and maximum values should be set tightly around the majority of the data set. Hence, from graphical analysis of the data series in Excel, we decided that 4 standard deviations of the data series would be suitable for use as the minimum and maximum values.
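The spike handling described above can be sketched as follows. This is an illustration only: the report does not state whether the 4 standard deviation bounds are centred on the mean or on zero, so centring on the mean is an assumption, as are the function name and the made-up data.

```python
import statistics

def clip_spikes(series, n_std=4.0):
    """Limit a series to mean +/- n_std standard deviations.

    Centring the bounds on the mean is an assumption; the report only
    says that 4 standard deviations were used as the minimum and maximum.
    """
    mean = statistics.mean(series)
    std = statistics.pstdev(series)
    lo, hi = mean - n_std * std, mean + n_std * std
    clipped = [min(max(x, lo), hi) for x in series]
    return clipped, lo, hi

# Made-up daily changes: small moves followed by one crash-sized fall.
changes = [1.0, -1.0] * 50 + [-250.0]
clipped, lo, hi = clip_spikes(changes)
print(lo, hi, clipped[-1])
```

The returned bounds could also serve as the minimum and maximum values entered for the series, so that the output range is set tightly around the bulk of the data.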

2–History in data series

The type of training that the neural network software, NeuroShell, uses is known as the back-propagation algorithm, a particular class of supervised learning algorithm. In the back-propagation algorithm, the weights in the network are adjusted in a particular fashion so as to reduce the errors between the actual and the expected outputs (mathematically minded readers may refer to Appendix A for a detailed description of the algorithm). The back-propagation algorithm is suitable for the majority of problems, where the data to be trained on are discrete or independent of each other. However, the algorithm does not handle temporal or historical data well [4]. To overcome this limitation, we used momentums (differences) of the indexes between today and some periods in the past as representatives of the ‘historical’ information in the data. The following table shows the various index differences that we wish to calculate and use as inputs to the neural network.

Difference (momentum)   Purpose
1 day                   short-term value that we wish to predict
2 day                   check on 1 day (prediction needed)
5 day                   1 week change
20 day                  1 month change
65 day                  long-term value that we wish to predict
130 day                 6 month change
260 day                 12 month change
25 day                  used by Technical Analysis
50 day                  used by Technical Analysis
200 day                 used by Technical Analysis

Table 4.1: FTSE momentums

[4] There are other algorithms, such as recurrent algorithms, which can handle time-series data. However, the current version of NeuroShell does not provide this feature.


Multiples of 5 are chosen to avoid weekday effects, which may be particularly great in the UK because of its settlement accounting system. These differences provide some history which the neural network would not otherwise get. However, it is unlikely that all of them are significant, and part of the neural net’s job is to discriminate between them.
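The momentum inputs of Table 4.1 can be sketched in Python as below. The original calculations were done in Excel; the function names and the use of None for days without enough history are assumptions made for this illustration.

```python
def momentum(closes, period):
    """Close-to-close difference between each day and `period` trading days earlier.

    Days without enough history are returned as None.
    """
    return [None if i < period else closes[i] - closes[i - period]
            for i in range(len(closes))]

# Periods (in trading days) taken from Table 4.1.
PERIODS = [1, 2, 5, 20, 25, 50, 65, 130, 200, 260]

def momentum_inputs(closes, periods=PERIODS):
    # One derived series per period, keyed by the period length.
    return {p: momentum(closes, p) for p in periods}

# Made-up closing values, for illustration only.
closes = [2500.0, 2510.0, 2495.0, 2505.0, 2520.0, 2515.0]
print(momentum(closes, 1))
print(momentum(closes, 5))
```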

3–Levels

From our early experience of NeuroShell we have found that the network cannot predict values that are outside the range of its learning set. This problem is not limited to NeuroShell but is a limitation of neural nets in general. For this reason any data that has levels (or trends) must be transformed into data that does not. More importantly, we cannot use NeuroShell to predict the actual value of FTSE. The two methods described below can be used to eliminate the trend in the data.

Differences

This is simply a normalisation of the raw data, which can be done in many ways. The simplest method of removing the levels is to calculate the difference between the current value and the value some periods ago. In predicting future values, it is sufficient to calculate the difference from today. As daily differences are a function of market level, this series too will have widening bounds. However, this is a second-order effect and unlikely to be significant over the periods of 2 days or 3 months currently being considered for prediction.

Trend removals

This method, on the other hand, approximates the underlying trend using linear regression, removes the trend from the raw data series and uses only the residual series as an input to the network. This is difficult because of the number of data series, each with its own trend, and these trends will not be independent. At present, it may be safer not to adjust for trend.

Significant figures

All data series are calculated from their raw values and chopped at 4 std. deviations individually for each series. After this, the effect of rounding was considered, either:

- up to the next multiple of 0.2, so that values are xxx.2, xxx.4, xxx.6, ...; or
- to the nearest 0.5, so that values are xxx.0, xxx.5, xxx.0, ....
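The two rounding schemes mentioned above can be illustrated with a short Python sketch. The function names are hypothetical, and the decimal module is used here only to avoid binary floating-point surprises when dividing by 0.2; the report does not say how rounding was implemented.

```python
from decimal import Decimal, ROUND_CEILING, ROUND_HALF_UP

def round_up_to_multiple(value, step="0.2"):
    # Round up to the next multiple of `step` (values already on a
    # multiple are left unchanged).
    step = Decimal(step)
    units = (Decimal(str(value)) / step).to_integral_value(rounding=ROUND_CEILING)
    return float(units * step)

def round_to_nearest(value, step="0.5"):
    # Round to the nearest multiple of `step`, ties away from zero.
    step = Decimal(step)
    units = (Decimal(str(value)) / step).to_integral_value(rounding=ROUND_HALF_UP)
    return float(units * step)

print(round_up_to_multiple(2500.25))
print(round_to_nearest(2500.3))
```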

Indexes manipulations

The following applies to index data sets only. When dealing with index data we should be aware of the following points:

- Raw Index data will not be used as input because of level problems, but at the same time we must never lose sight of the actual Index values.
- To calculate the Index value it is sufficient to calculate the expected difference from today’s Index value.
- No inputs should be used that are expected to have a trend, because the neural network does not predict at all well outside its learning experience (although differences are OK).
- It is acceptable to underestimate very large changes, as these are generally exceptions that are not expected to be within the normal patterns previously seen; i.e. the neural network should not be expected to anticipate a large ‘shock’ to the market, but might be expected to predict the aftermath of a shock reasonably well, given that it has seen a few before.
- All derived data series, mainly differences, should be limited to 4 std. dev. of the original data set, and rounded to the same accuracy as the original data.
- All data should be rounded to an acceptable degree of accuracy. Nothing need be more accurate than 0.01%, e.g. 0.01*2500/100 = 0.25 in FTSE. For FTSE, clearly 0.02% (±0.5) is acceptable.
- History information about the data must somehow be made available to the network.
- The use of raw data will depend very much upon the required output type, i.e. short- or long-term predictions. For the short-term predictions, e.g. 1 day (and 2 day, as a check on the 1 day prediction), we could use the raw FTSE data for calculating the derivatives. In contrast, average values of the Index, which reduce the noise and daily fluctuations in the Index, could be used for the long-term prediction, e.g. 65 days or 3 months.
- All the predicted Index values, either short- or long-term, will be differences from today’s value only.

FTSE 100 Index

Since this is the item that we wish the neural network to predict, we paid special attention to this index. The following shows the various analyses and derivatives that were calculated for FTSE and used as input data to NeuroShell (see Appendix B for a complete description of the analysis methods and how they were implemented in Excel). The analyses that were performed are as follows:

- Momentum (close to close difference)
- Returns
- Percentage Change of momentum
- Moving Averages
- Rate of Change
- Moving Average Convergence-Divergence
- Relative Strength Index
- Zero close-close volatility

These particular indicators were chosen on the basis of Robin Griffiths’s experience. They are widely used in the market (which to some extent must make them self-fulfilling) and he has found them the most valuable of the huge range available (e.g. from the Reuters RT handbook).

Trend removal

While it is expected that any trend in the FTSE data is exponential rather than linear (because the rise should be related to the growth of money values, with re-investment), we should nevertheless test this assumption.

Linear: Assume there is a trend, $\text{FTSE} = m \cdot \text{time} + \text{const} + \text{error}$. Using normal linear regression, minimise the error term, i.e. pick $m$ and $\text{const}$ such that $\sum (\text{error})^2$ is a minimum.

Exponential: Assume there is a trend, $\log(\text{FTSE}) = m \cdot \text{time} + \text{const} + \text{error}$. Using normal linear regression, minimise the error term, i.e. pick $m$ and $\text{const}$ such that $\sum (\text{error})^2$ is a minimum.

Inverse: Assume there is a trend, $\frac{1}{\text{FTSE}} = m \cdot \text{time} + \text{const} + \text{error}$. Using normal linear regression, minimise the error term, i.e. pick $m$ and $\text{const}$ such that $\sum (\text{error})^2$ is a minimum.


We should pick the best solution not on the basis of the individual $\sum (\text{error})^2$ terms above, but on the equivalent calculation for the series converted back into FTSE values, i.e. it is always calculated as $\sum (\text{FTSE}_{\text{est}} - \text{FTSE})^2$.

We then take the best of these three, subtract it from the original data set and use the resulting values (FTSE residual) as the items to be estimated by the neural network [5]. The following graph shows the trends in the raw FTSE. It can be observed from the graph that the trends of FTSE are slightly offset by the large peak in 1987, i.e. the trend lines lie above what one would consider an optimum trend. It can also be observed that the linear trend fits better than the exponential, so we use the linear trend to calculate the FTSE residual values, which can then be used as the items to be estimated.

[Figure 4.1: Underlying trend in FTSE. Chart titled "Underlying trends of raw FTSE": the FTSE 100 (vertical axis 1000 to 3000) plotted from Jan-85 to Jan-93, together with the fitted linear trend (y = mx + c) and exponential trend (y = c*m^x).]
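The trend selection procedure described above (fit linear, exponential and inverse trends, compare them by squared error in FTSE units, then subtract the best one) can be sketched in Python as follows. The original work used Excel's LINEST and LOGEST; the function names, the use of the observation number as the time variable and the made-up series below are assumptions for illustration.

```python
import math

def fit_line(xs, ys):
    # Ordinary least squares for y = m * x + c.
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, mean_y - m * mean_x

# For each model: (transform applied before fitting, inverse back to FTSE units).
TRANSFORMS = {
    "linear": (lambda y: y, lambda z: z),
    "exponential": (math.log, math.exp),
    "inverse": (lambda y: 1.0 / y, lambda z: 1.0 / z),
}

def best_trend(values):
    """Return the best-fitting model name and the residual series."""
    xs = list(range(len(values)))
    results = {}
    for name, (to_model, to_ftse) in TRANSFORMS.items():
        m, c = fit_line(xs, [to_model(v) for v in values])
        estimate = [to_ftse(m * x + c) for x in xs]
        # Always compare models on squared error in FTSE units.
        sse = sum((e - v) ** 2 for e, v in zip(estimate, values))
        results[name] = (sse, estimate)
    best = min(results, key=lambda k: results[k][0])
    residual = [v - e for v, e in zip(values, results[best][1])]
    return best, residual

# Made-up series with a steady linear rise, for illustration only.
series = [1000.0 + 10.0 * i for i in range(50)]
name, residual = best_trend(series)
print(name)
```

Comparing the models after converting back to FTSE values matters because the squared errors of log(FTSE) or 1/FTSE are on different scales and cannot be compared directly.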

Seasonality

This should be tackled only after the trend has been removed. We should deal with long-term seasonality (1 year) first; only then should we see, preferably by looking at the graphs, whether there are any remaining cycles that might be removed. It is difficult to decide on the best method for calculating seasonals without knowing the nature of the trends described above and looking at the resulting graphs to see whether the seasonal variations are likely to remain constant or rise with increasing trend, and to what extent. However, it would probably be reasonable to start with the assumption that seasonals are a constant ratio to trend.

5 Microsoft Excel provides built-in functions for calculating the straight line and exponential curves that best fit the given series of values. For the linear trend, the gradient m and constant c can be obtained from the function LINEST(values) which returns an array that describes the line. Hence, the gradient is obtained by: m = INDEX(LINEST(values),1) and constant is obtained by: c = INDEX(LINEST(values),2). With these values a straight line is then constructed using arbitrary x values ranging from 0 to n number of data points in the series. For the exponential trend, the gradient m and constant c can be obtained from the function LOGEST(values) which returns an array that describes the curve, and the gradient and constant are obtained as described above.
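As an illustration of the trend fitting described in footnote 5, the following is a minimal Python sketch rather than the spreadsheet actually used. It assumes x values of 1, 2, ..., n and assumes that the exponential curve can be obtained by fitting a straight line through the logarithms of the values; the function names are hypothetical.

```python
import math


def fit_linear(values):
    """Least-squares line y = m*x + c, assuming x = 1, 2, ..., n."""
    n = len(values)
    xs = range(1, n + 1)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c


def fit_exponential(values):
    """Curve y = c * m**x, fitted as a straight line through ln(y)."""
    slope, intercept = fit_linear([math.log(v) for v in values])
    return math.exp(slope), math.exp(intercept)


def linear_residuals(values, m, c):
    """Original series minus the fitted linear trend (the 'FTSE residual')."""
    return [y - (m * x + c) for x, y in enumerate(values, start=1)]
```

The residual series returned by `linear_residuals` corresponds to the values that the text proposes to estimate with the neural network.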


An exponentially weighted moving average (EMA) of the FTSE residual could be applied here, e.g.

$EMA_t = 0.01 \cdot F^r_t + 0.99 \cdot EMA_{t-1}$

where $EMA_t$ = EMA at time t and $F^r_t$ = FTSE residual at time t.

It is well known in the market that FTSE behaves in a seasonal pattern, i.e. one that is repetitive over a certain time period. For example, the value of FTSE rises around the beginning of each year (see figure below). The question is how to incorporate this information as an input to the neural network. Our first attempt was to use another input which simply consists of a series of numbers representing the days in a year. For example, day 1 is always the first Monday in the second week of a new year. We use this input together with the FTSE derivatives to indicate the seasonal change of FTSE. For this new information to be of use to the network, the data have to be presented on a rotational instead of a random basis. So far, we have not removed any seasonal information from the data but have simply given the neural network an additional indicator that seasonal variations exist in the data.
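The exponentially weighted moving average of the FTSE residual introduced earlier in this section can be sketched in Python as follows. This is only an illustration: the choice to seed the average with the first residual is an assumption, since the report does not say how the series is started, and the function name is hypothetical.

```python
def ema(residuals, alpha=0.01):
    """EMA_t = alpha * F_t + (1 - alpha) * EMA_{t-1}.

    Assumption: the first EMA value is set to the first residual.
    """
    if not residuals:
        return []
    out = [residuals[0]]
    for f in residuals[1:]:
        out.append(alpha * f + (1 - alpha) * out[-1])
    return out
```

With alpha = 0.01 the average reacts very slowly, so it tracks the long term level of the residual rather than its daily movements.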

[Figure: FTSE values for each of the years 1985 to 1992 plotted against a common January to December axis (y-axis 1000 to 3000). Legend: 1985 FTSE to 1992 FTSE, one line per year. Chart title: Comparison of FTSE values between 1985-1992]

Figure 4.2 — Seasonal trends in FTSE

Other Indexes

The following two manipulation methods were applied to the following indexes: Dow Jones, DAX, Nikkei, and CAC 40.

Trend replacements

The following two differences are used as the indicators of the Index without the trend:
• Index - FTSE
• $\mathrm{FTSE} - \left( \dfrac{\mathrm{Index}}{\text{£/Ex. rate}} \right)$.


Historical data

This is done by calculating the difference between today’s index and the index n periods ago, and we used the 1 day, 1 month, 3 months and 12 months differences of the following derivatives:
• Index
• Index - FTSE
• $\mathrm{FTSE} - \left( \dfrac{\mathrm{Index}}{\text{£/Ex. rate}} \right)$
• RSI 14 days
• RSI 9 days.

Interest rates

The following shows the data manipulation carried out for the UK interest rates; it is equally applicable to other nations’ rates.

Trend removals using differences

The following shows the differences calculated between the UK interest rates data series (which will be used as additional inputs to the raw data series). Differences calculated are:
• Base rate - 3M Interbank
• 3M Interbank - 20 yr. Gilt yield
• 20 yr. Gilt yield - Inflation
• Base rate - Inflation

where Inflation is derived from

$\left\{ \left( \dfrac{RPI_t}{RPI_{t-12M}} \right) - 1 \right\} \times 100$.
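As a small worked illustration of the inflation formula, the sketch below computes the 12 month percentage change of a monthly RPI series in Python. It assumes one RPI observation per month; the function and parameter names are hypothetical.

```python
def inflation_12m(rpi, months_back=12):
    """12 month percentage change: (RPI_t / RPI_{t-12M} - 1) * 100.

    Assumes one RPI value per month, oldest first. The first
    `months_back` months have no value and are skipped.
    """
    return [
        (rpi[t] / rpi[t - months_back] - 1) * 100
        for t in range(months_back, len(rpi))
    ]
```

For example, an RPI that moves from 100 to 105 over twelve months gives an inflation figure of about 5%.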

Historical data

The following table shows the historical data that are expected to be important and obtained using differences.

Series | 1 day | 1 month | 12 months
Base rate | ✓ | ✓ | -
3M Interbank | ✓ | ✓ | -
20 yr. Gilt yield | ✓ | ✓ | ✓
Base rate - 3M Interbank | ✓ | ✓ | -
3M Interbank - 20 yr. Gilt | ✓ | ✓ | ✓
20 yr. Gilt - Inflation | ✓ | ✓ | -
Base rate - Inflation | ✓ | ✓ | -

3 M Interbank rates

This is an additional factor used for Interbank rates; the following table shows the differences that we wish to calculate.


Differences | UK | US | German | French | Japan
UK | - | - | - | - | -
US | ✓ | - | - | - | -
German | ✓ | ✓ | - | - | -
French | ✓ | - | ✓ | - | -
Japan | ✓ | - | ✓ | - | -

Exchange rates

For exchange rates, 1 day and 1 month (20 day) differences were calculated as derivatives of the exchange rates. The derivatives are used together with the raw values of the exchange rates because, although exchange rates vary, they are normally bounded within certain ranges.

Series | as is | 1 day | 1 month
US $/£ exchange rate | ✓ | ✓ | ✓
French Franc/£ exchange rate | ✓ | ✓ | ✓
Japanese ¥/£ exchange rate | ✓ | ✓ | ✓
German Marks DM/£ exchange rate | ✓ | ✓ | ✓

Economic data

Here, only the 12 month percentage change is calculated for use as replacement of the actual values for the following economic data:
• UK: Money supply, RPI (inflation), GDP
• US: Money supply, RPI (inflation), GDP
• French: Money supply, RPI (inflation), GDP
• German: Money supply, RPI (inflation), GDP
• Japan: Money supply, RPI (inflation), GDP

Futures

Currently, we have not yet used the Futures data extensively.


Complete list of input data sets

Table 4.3 shows the complete list of input data sets which are generated from the manipulations of the raw data sets. Table 4.2 shows the naming convention that we have adopted to label the different types of input data sets.

Acronym | Meaning
UK | UK
US | US
GE | Germany
FR | France
JP | Japan
DJ | Dow Jones
NIK | Nikkei
1D, 2D, 65D, etc. | 1 day, 2 day, 65 day, etc.
3M, 12M, etc. | 3 month, 12 month, etc.
MAV | Moving average
MOM | Momentum (or difference)
ROC | Rate of Change
RSI | Relative Strength Index
Zero cl-cl vol. | Zero close-close volatility
MACD | Moving Average Convergence-Divergence
FTSE-(DJ/EX.RATE) | FTSE-(DJ/Exchange rate)
FTSE-DJEX, etc. | FTSE-(DJ/Exchange rate)
BR | Base rate
3MIB | 3 month interbank
INF | Inflation
>7Y BY | more than 7 years Bond yield
20YR. GILT | 20 years Gilt yield
30Y BY | 30 years Bond yield

Table 4.2 — Naming convention used in the tests


1day MOM | FTSE-(DAX/EX.RATE) | CAC 1D MOM | US 30Y BY 12M MOM | FRR >7YBY-INF
2 day MOM | DAX 1D MOM | CAC 20D MOM | US BR-3MIB 1D MOM | FR 3MIB 1D MOM
1 week MOM | DAX 20D MOM | CAC 65D MOM | US BR-3MIB 20D MOM | FR 3MIB 20D MOM
25day MOM | DAX 65D MOM | CAC-FTSE 1D MOM | US 3MIB-30YBY 1D MOM | FR >7Y BY 1D MOM
50day MOM | DAX 12M MOM | CAC-FTSE 20D MOM | US 3MIB-30YBY 20D MOM | FR >7Y BY 20D MOM
65 days MOM | FTSE-DAX 1D MOM | CAC-FTSE 65D MOM | US 3MIB-30YBY 12M MOM | FR >7Y BY 12M MOM
1 year MOM | FTSE-DAX 20D MOM | FTSE-(CAC/EX.RATE) 1D MOM | US 30Y-INF 1D MOM | FR 3MIB->7YBY 1D MOM
% change of MOM over 10 days | FTSE-DAX 3M MOM | FTSE-(CAC/EX.RATE)20D MOM | US 30Y-INF 20D MOM | FR 3MIB->7YBY 20D MOM
% change of MOM over 25 days | FTSE-DAX 12M MOM | FTSE-(CAC/EX.RATE) 65D MOM | US BR-INF 1D MOM | FRR >7YBY-INF 1D MOM
% change of MOM over 50 days | Day numbers | CAC RSI 9D 1D MOM | US BR-INF 20D MOM | FRR >7YBY-INF 20D MOM
Close-2day MAV | FT-DAXEX 1D MOM | CAC RSI 9D 20D MOM | JP BR-3MIB | UK-US 3MIB
Close-5 day MAV | FT-DAXEX 20D MOM | CAC RSI 9D 65D MOM | JP 3MIB-10Y BY | UK-GE 3MIB
Close-25 day MAV | FT-DAXEX 65D MOM | CAC RSI 14D 1D MOM | JP 10Y BY-INF | UK-FR 3MIB
Close-50 day MAV | FT-DAXEX 12M MOM | CAC RSI 14D 20D MOM | JP BR-INF | UK-JP 3MIB
3 day ROC | DAX RSI 9D 1D MOM | CAC RSI 14D 65D MOM | JP BR 1D MOM | US-GE 3MIB
5 day ROC | DAX RSI 9D 20D MOM | UK BR-3M IB | JP BR 20D MOM | GE-FR 3MIB
25 day ROC | DAX RSI 9D 65D MOM | UK 3MIB-20YR.GILT | JP 3MIB 1D MOM | GE-JP 3MIB
50 day ROC | DAX RSI 9D 12M MOM | UK 20YR.GILT-INFLATION | JP 3MIB 20D MOM | US$/£ EX. RATE
MACD | DAX RSI 14D 1D MOM | UK BR-INFLATION | JP10Y BY 1D MOM | US$/£ 1D MOM
RSI 9 day | DAX RSI 14D 20D MOM | UK BR 1D MOM | JP 10Y BY 20D MOM | US$/£ 20D MOM
RSI 14 days | DAX RSI 14D 65D MOM | UK BR 20D MOM | JP 10Y BY 12M MOM | FRANCS/£ EX. RATE
Zero cl-cl- vol | DAX RSI 14D 12M MOM | UK 3MIB 1D MOM | JP BR-3MIB 1D MOM | FR/£ 1D MOM
DJ-FTSE | NIKKEI-FT | UK 3MIB 20D MOM | JP BR-3MIB 20D MOM | FR/£ 20D MOM
FTSE-(DJ/EX.RATE) | FT-(NIKKEI/EX.RATE) | UK 20YR.GILT 1D MOM | JP 3MIB-10YBY 1D MOM | MARKS/£ EX. RATE
DJ 1D MOM | NIK 1D MOM | UK 20YR.GILT 20D MOM | JP 3MIB-10YBY 20D MOM | MARKS/£ 1D MOM
DJ 20D MOM | NIK 20D MOM | UK 20YR.GILT 12M MOM | JP 3MIB-10YBY 12M MOM | MARKS/£ 20D MOM
DJ 65D MOM | NIK 65D MOM | UK BR-3M IB 1D MOM | JP 1OYBY-INF 1D MOM | YEN/£ EX. RATE
DJ 12M MOM | NIK 12M MOM | UK BR-3M IB 20D MOM | JP 1OYBY-INF 20D MOM | YEN/£ 1D MOM
DJ-FTSE 1D MOM | NIK-FT 1D MOM | UK 3MIB-20YR.GILT 1D MOM | JP BR-INF 1D MOM | YEN/£ 20D MOM
DJ-FTSE 20D MOM | NIK-FT 20D MOM | UK 3MIB-20YR.GILT 20D MOM | JP BR-INF 20D MOM | UK GDP 12M % CHANGE
DJ-FTSE 65D MOM | NIK-FT 65D MOM | UK 3MIB-20YR.GILT 12M MOM | GE BR-3M IB | UK M. SUPLY 12M % CHANGE
DJ-FTSE 12M MOM | NIK-FT 12M MOM | UK 20YR.GILT-INF 1D MOM | GE 3MIB-10YR BY | UK INF 12M % CHANGE
FT-DJEX 1D MOM | FT-NIKEX 1D MOM | UK 20YR.GILT-INF 20D MOM | GE BR 1D MOM | US GDP 12M % CHANGE
FT-DJEX 20D MOM | FT-NIKEX 20D MOM | UK BR-INF 1D MOM | GE BR 20D MOM | US M. SUPLY 12M % CHANGE
FT-DJEX 65D MOM | FT-NIKEX 65D MOM | UK BR-INF 20D MOM | GE 3M IB 1D MOM | US INF 12M % CHANGE
FT-DJEX 12M MOM | FT-NIKEX 12M MOM | US BR-3M IB | GE 3M IB 20D MOM | FR GDP 12M % CHANGE
DJ RSI9D 1D MOM | NIK RSI 9D 1D MOM | US 3MIB-30Y BY | GE10 YR BY 1D MOM | FR M. SUPLY 12M % CHANGE
DJ RSI9D 20D MOM | NIK RSI 9D 20D MOM | US 30Y BY-INF | GE10 YR BY 20D MOM | FR INF 12M % CHANGE
DJ RSI9D 65D MOM | NIK RSI 9D 65D MOM | US BR-INF | GE10 YR BY 12M MOM | GE GDP 12M % CHANGE
DJ RSI9D 12M MOM | NIK RSI 9D 12M MOM | US BR 1D MOM | GE BR-3M IB 1D MOM | GE M. SUPLY 12M % CHANGE
DJ RSI14D 1D MOM | NIK RSI14D 1D MOM | US BR 20D MOM | GE BR-3M IB 20D MOM | JP GDP 12M % CHANGE
DJ RSI14D 20D MOM | NIK RSI14D 20D MOM | US 3M IB 1D MOM | GE 3MIB-10YR BY 1D MOM | JP M. SUPLY 12M % CHANGE
DJ RSI14D 65D MOM | NIK RSI14D 65D MOM | US 3M IB 20D MOM | GE 3MIB-10YR BY 20D MOM | JP INF 12M % CHANGE
DJ RSI14D 12M MOM | CAC-FTSE | US 30Y BY 1D MOM | GE 3MIB-10YR BY 12M MOM |
FTSE-DAX | FTSE-(CAC/EX.RATE) | US 30Y BY 20D MOM | FR 3MIB->7YBY |

Table 4.3 — Complete list of input data sets

Note: The first 22 data sets (1 day MOM to Zero cl-cl vol.) are the derivatives of the FTSE Index.


OUTPUT DATA SELECTION

In this section we discuss our concepts in the selection of items to be predicted, i.e. to be produced as output, from the neural network.

Number of predictive items in the neural net

As we want to predict the FTSE Shares Index over both a short and a long period, i.e. the 2 day close and the 3 months moving average respectively, the obvious solution is to design and train a network with 2 outputs, one for the 2 day close and the other for the 3 months moving average. However, it is generally accepted that a neural network which has more than 1 output performs less well than separate networks each having a single output. This is particularly true in NeuroShell, which uses a least squares minimisation technique to decide how to apportion its weight adjustments amongst several outputs. This means that the accuracy of each output is sacrificed in order to minimise the total error of all the outputs. With this in mind we will have to build two networks, predicting the short term and long term FTSE indexes separately. The question that then arises is what type of data will be suitable for each of the two networks.

Short term prediction

Firstly, for a short term prediction, using only the short term indicators, e.g. 1 day close, 1 day moving average, 5 day rate of change, etc., will not be enough. This is because short term price changes are highly volatile, and to predict them accurately additional information must be provided, such as information on seasonal and cyclical changes.

Long term prediction

However, for long term prediction, day to day changes of price and moving averages will have less effect than long term price indicators, such as 30 day, 3 month and 1 year momentums, 50 day moving averages and rate of change, and the zero trend close-close volatility index. Hence, we believe that the types of input data for both networks will be similar in many aspects, but the weightings will be different.

Selection of predictive item

As we have described earlier, any data which contain trends or levels cannot be used as either an input or an output parameter to the neural net. So, instead we decided to use the following derivatives of FTSE as the items to be predicted: returns, logarithms and momentum (differences) (see footnote 6). Initially we expected that the returns (percentage change) of FTSE would prove useful as the predictive item. However, the tests carried out by Grashoff showed that the use of returns was not as successful as expected. This may be because price moves are always in discrete units (1p, 2p, etc.), so using absolute differences provides the neural net with more repetition. When returns are used, a change of 10p in price is an input of a different value depending on the underlying price level at the start of the period. Returns therefore provide a more continuous set of values to present to the neural net and also compensate for level. However, any benefit appears to be offset by the vastly increased number of different values. Some work was done by rounding the returns to a small number of significant figures, and while this gave some improvement the end result was less accurate prediction than that achieved by straight differences.

6 See Appendix B for explanation.


The same observation was made for the use of differences of logarithms of FTSE. Hence, all our tests now use the FTSE difference (the value of FTSE n number of days ahead with respect to today’s value) as the predictive item from the neural net.
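The three candidate predictive items discussed in this section (momentum, returns and differences of logarithms) can be compared with a short Python sketch. It is illustrative only; the function names and the n-period convention are assumptions, not the report's own code.

```python
import math


def momentum(prices, n=1):
    """Absolute difference between each value and the value n periods earlier."""
    return [prices[t] - prices[t - n] for t in range(n, len(prices))]


def returns(prices, n=1):
    """Percentage change over n periods."""
    return [(prices[t] / prices[t - n] - 1) * 100 for t in range(n, len(prices))]


def log_differences(prices, n=1):
    """Difference of natural logarithms over n periods."""
    return [math.log(prices[t]) - math.log(prices[t - n])
            for t in range(n, len(prices))]
```

A 10 point move gives the same momentum at an index level of 1000 as at 3000, whereas the corresponding returns differ (about 1% against about 0.33%), which is the repetition argument made in the text.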

Selection of FTSE difference

Having established that differences would be used as the sole item for prediction, we then have to decide precisely which type of FTSE difference would be suitable. In earlier tests, we used the 2 day moving average (MAV), defined as $(P_{t-1} + P_t)/2$, as the factor to be predicted, since the daily differences contain too much noise for the neural net. It is in the nature of the market that daily movements are generally overdone and that some correction occurs the following day. However, the 2 day moving average suffers from the problem that its average point in time is about midday rather than end of day: the prices of today and yesterday are recorded at close of business, so the average of the two days is centred around midday. A better average might then be what we have called the 3 day weighted average, that is:

$\dfrac{P_{t-1} + 2 \cdot P_t + P_{t+1}}{4}$

This is centred at the close of business as required but has the problem that it requires tomorrow’s value. Using it would therefore involve estimating at least two days forward. As the graph below shows, the 2 day moving average normally differs from the actual FTSE value by less than 0.5%.
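The two averages, and the percentage deviation plotted in the graphs of this section, can be sketched in Python as follows. The function names are hypothetical and the sketch assumes a plain list of daily closing prices.

```python
def mav2(prices):
    """2 day moving average (P_{t-1} + P_t) / 2 for t = 1 .. n-1."""
    return [(prices[t - 1] + prices[t]) / 2 for t in range(1, len(prices))]


def weighted3(prices):
    """3 day weighted average (P_{t-1} + 2*P_t + P_{t+1}) / 4.

    Needs tomorrow's price, so the last day has no value.
    """
    return [
        (prices[t - 1] + 2 * prices[t] + prices[t + 1]) / 4
        for t in range(1, len(prices) - 1)
    ]


def pct_deviation(price, average):
    """Deviation of the price from its average: price/average*100 - 100."""
    return price / average * 100 - 100
```

Note that `weighted3` returns one value fewer than `mav2`, which reflects the requirement for tomorrow's value mentioned above.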

[Figure: FTSE/MAV(FTSE, 2D)*100-100, Mar-91 to Oct-92 (y-axis -2.5 to 3)]

Figure 5.1 — Difference between 2 day MAV and actual FTSE

However, with the 3 day weighted average the difference in relation to the real FTSE value is around 0.25%, half that of the 2 day moving average (see graph below).


[Figure: FTSE/(3 day weighted average)*100-100, Mar-91 to Oct-92 (y-axis -1.5 to 1.25). Chart title: FTSE/3 Day weighted Average]

Figure 5.2 — Difference between 3 day weighted AV and actual FTSE

It is unlikely that any wider moving average would be of value because of the incidence of ‘special’ events. Such an average would have to be treated with care because it includes future information.

Perhaps $\dfrac{P_{t-1} + 2 \cdot P_t + P_{t+1}}{4}$ could be assigned to day t+1, in which case this problem disappears. If it was assigned as a difference to $P_t$ the value would be

$\left( \dfrac{P_{t-1} + 2 \cdot P_t + P_{t+1}}{4} \right) - P_t = \dfrac{P_{t-1} - 2 P_t + P_{t+1}}{4}$

which is perhaps a measure of how much yesterday’s value was an over or under estimate of some ‘true’ underlying value for the index. Initial results from tests using the 3 day MAV showed that the overall percentage error is 2.8% (or 1.78% and 4% for the first and second half of the test set) and therefore use of this measure was discontinued for the moment.


NETWORK TUNING

After six months of intensive use of NeuroShell we have made the following observations with regard to the package.

Hidden nodes

Choosing the number of hidden nodes suitable for a particular application is still an inexact science. NeuroShell provides a simple tool, itself a network, called HIDNODES which can be used to determine the number of hidden nodes required for a particular problem. HIDNODES expects three inputs (number of input nodes, number of output nodes and a figure representing the complexity of the patterns in the sample data set) and produces as output the number of hidden nodes to use. This tool, though useful, does not guarantee that the number of hidden nodes it suggests will work for the problem, since it requires the user to provide a subjective figure (from 0 to 10, where 0 is not very complex and 10 is very complex). Depending on this figure the suggested number of hidden nodes can vary by tens (as shown in the following table).

Input nodes | 50 | 50 | 50 | 50 | 100 | 200
Output nodes | 1 | 1 | 1 | 2 | 1 | 1
Complexity (0-10) | 0 | 5 | 10 | 5 | 5 | 7
Suggested no. of hidden nodes | 13 | 26 | 45 | 26 | 41 | 77

Table 6.1 — Suggested number of nodes in relation to the complexity of the problem

As an alternative, a good rule of thumb in deciding the number of hidden nodes required is that the total number of weights in a network should be much less than the total number of patterns in the sample set and the number of output nodes. This is to avoid the problem of overfitting, i.e. the network memorising instead of generalising the given input sample data, which results in the network producing very good results on the sample data set but performing very poorly on other data sets. Using this rule of thumb, we decided to use 25 hidden nodes. We arrived at this figure as follows: in a 3-layer, fully connected network, each input node is connected to all the hidden nodes and similarly each output node is connected to all the hidden nodes, as shown below.

[Figure: a fully connected 3-layer network with input nodes I1 to I3, hidden nodes H1 to H5 and output nodes O1 and O2. I = input node, H = hidden node, O = output node]

Figure 6.1 — A typical 3-layer network


Total connections, $T_c = N_H (N_I + N_O)$

where
$N_H$ = number of hidden nodes
$N_I$ = number of input nodes
$N_O$ = number of output nodes,

and in the above example network, there are a total of $5 \times (3 + 2) = 25$ connections. In the data sets that we used in the experiment, the total number of patterns, or cases in NeuroShell terminology, is around 1500 (approximately 5½ years’ worth of data), the total number of input nodes is around 40 and there is usually only one output in the networks. If we let $T_c = 1000$, i.e. about 66% of the 1500 cases, then $N_H$ is given by:

$N_H = \dfrac{T_c}{N_I + N_O} = \dfrac{1000}{41}$

which gives $N_H \approx 25$. Note that this is by no means a definite or strict rule. We used this formula to give us an initial value and found it to be of value.
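The rule of thumb can be checked with a few lines of Python. This is a sketch of the arithmetic only; the function names are hypothetical.

```python
def total_connections(n_hidden, n_input, n_output):
    """Tc = N_H * (N_I + N_O) for a fully connected 3-layer network."""
    return n_hidden * (n_input + n_output)


def hidden_nodes_for(target_connections, n_input, n_output):
    """N_H = Tc / (N_I + N_O); returns the unrounded value."""
    return target_connections / (n_input + n_output)
```

With 40 inputs, one output and a target of 1000 connections the unrounded value is about 24.4, which the report rounds to 25 hidden nodes; 25 hidden nodes in turn give 1025 connections, slightly above the target.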

Learning rate (0.0–1.0)

To minimise the learning errors, the network automatically adjusts the weights that are linked to the nodes in the net, in the direction required to produce a smaller error the next time the same input pattern is presented. The amount of weight adjustment carried out by the network at each node is proportional to the amount of error produced, i.e. $\varpi \propto \varepsilon$, where $\varpi$ = weight change and $\varepsilon$ = error. In NeuroShell, the learning rate is the factor which determines how much of the error produced is applied to the change of weight at the nodes, i.e. $\varpi \cong \lambda \varepsilon$, where $\lambda$ is the learning rate. The value of 0.4 was found to give good predictions (together with the value of Momentum, see below). In some tests it was occasionally reduced to 0.2 once the network had learned for some time and the accumulated errors had stopped reducing.

Momentum (0.0–0.9)

The term momentum in NeuroShell is different from that used in financial analysis, where it means the difference between the index value today and the value some periods ago. In NeuroShell, momentum ($\mu$) is a factor which determines the proportion of the last weight change which is added to the new weight change. The total weight change at time t is therefore given by

$\varpi_t = \lambda \varepsilon_t + \mu \varpi_{t-1}$.

In tests, it was found that the value of 0.6 (together with the value of 0.4 for the Learning rate) produced the best results.
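The weight change relation can be illustrated with a minimal Python sketch. It implements only the simplified scalar relation given in the text, with the learning rate of 0.4 and momentum of 0.6 as defaults; it is not NeuroShell's internal algorithm, and the function name is hypothetical.

```python
def weight_change(error, previous_change, learning_rate=0.4, momentum=0.6):
    """Weight change = learning_rate * error + momentum * previous change.

    Simplified scalar form of the relation in the text; a real
    back-propagation update also involves node activations and gradients.
    """
    return learning_rate * error + momentum * previous_change
```

For a constant error of 0.5 the first change is 0.2 and the second is 0.32, showing how the momentum term carries part of the previous adjustment forward.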

Learning Threshold (0.0–3.0)

The learning threshold sets the limit of accuracy to which a network is trained. The learning process stops when the errors for all cases fall below this value. The user guide suggests that for a large number of classifying characteristics (i.e. the number of data sets) a larger value (0.1 or 0.001) would be more suitable.


We have found that this value has some indirect effect on the learning accuracy of the network, and the value of 0.0001 was found to produce networks that have learned accurately.

Number of presentations

This is the number of times the data sets are presented to the network on a case by case basis, where a case consists of all the financial indicators, e.g. FTSE, Dow Jones, etc., on a particular day. Once an input case is seen by the network, it produces an output, compares it with the expected value and makes any necessary adjustments to the weights in the nodes. The performance of a network relates closely to the number of times a particular case is seen by the network: the more often a case is seen by the neural net, the better the output for that case will be. However, care should be taken not to over present (i.e. over train) the network by presenting the cases in the learning set more often than necessary. It is true that the network predictions get better as the number of presentations increases, but this is only true for the learning set. A network that has been trained too well on the learning set is normally useless in predicting any events or values using data it has not seen before. This is the classic case of the network ‘memorising’ the learning set and hence being unable to generalise to any data that lie outside the learning set. Again, deciding how many presentations will produce an adequately trained network that generalises well is still an inexact science.

Presentation type There are two ways in which the input data can be presented to the network; random and rotation.

Random In this method, the patterns, or cases, from the sample set are presented randomly to the network. The advantage of this method is that the learning time is usually quicker than the rotation method. However, there is a danger that if the number of cases is sufficiently large in the sample set the learning time will be a great deal longer in order to ensure that all the cases in the sample set are presented at least once to the network. If not, learning will take place only from those randomly chosen patterns, and all the cases may not have been seen by the network.

Rotate As the name suggests the network learns by reading the data from the sample set, one day at a time, in sequential and rotational order (from top to bottom of the files, and back to top again). This method is useful for learning and predicting events which contain historical information, and also ensures that all of the patterns in the sample set are seen by the network. As the FTSE prediction involves the use of historical data this method of data presentation was used most often and produced better results than the random presentation.
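The difference between the two presentation types can be sketched in Python as follows. The sketch assumes that random presentation draws cases with replacement, which is consistent with the remark that some cases may never be seen but is not confirmed by the text; the function names are hypothetical.

```python
import random
from itertools import cycle, islice


def rotation_order(n_cases, n_presentations):
    """Case indices in sequential order, wrapping back to the first case."""
    return list(islice(cycle(range(n_cases)), n_presentations))


def random_order(n_cases, n_presentations, seed=None):
    """Case indices drawn at random (assumed here to be with replacement)."""
    rng = random.Random(seed)
    return [rng.randrange(n_cases) for _ in range(n_presentations)]
```

Rotation guarantees that every case has been seen after one full pass, whereas a random draw of the same length gives no such guarantee.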

Minimum and maximum values

In predicting any numerical value, appropriate values for the maximum and minimum should be set according to the following conditions:
• the range between the max. and min. values should be tight enough to obtain good results;
• the range should also cover all values in the sample set as well as in the test set and, most importantly, any values likely to be encountered during actual prediction.
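The second condition can be checked before training with a short Python sketch. The linear scaling to the interval 0 to 1 is an assumption made for illustration; the report does not describe how NeuroShell scales values internally, and the function names are hypothetical.

```python
def scale(value, lo, hi):
    """Map a value from [lo, hi] to [0, 1]; reject values outside the range."""
    if not lo <= value <= hi:
        raise ValueError(f"{value} lies outside the range [{lo}, {hi}]")
    return (value - lo) / (hi - lo)


def out_of_range(values, lo, hi):
    """Values that the chosen minimum and maximum do not cover."""
    return [v for v in values if v < lo or v > hi]
```

Running `out_of_range` over the sample set, the test set and any plausible future values gives a quick check that the chosen minimum and maximum are wide enough.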


The effect of the latter condition can be seen in one of the tests (FT10ST01), where values in the test set fall outside the range of the sample set (+100, -300). The graph below shows that the neural net is hopeless at predicting the values below -300 in the latter part of the test set, i.e. values outside the known range: the percentage error was found to be close to 8% in that region, compared with 1.3% for the first half of the test set.

[Figure: FTSE +2 day (trend removed), Oct-91 to Oct-92 (y-axis -500 to 100). Legend: Actual FTSE +2 days, Predicted. NOTE: File = FT10ST01, Presentation = 3.4M, Threshold = 0.4, Momentum = 0.6]

Figure 6.2 — Limitation of neural net to predict values outside the known range


RESULTS

Results evaluation methods

Trading — Short term prediction

Analysis methods

Trading requires accurate prediction of price movements over a one or two day period. Over any longer period, although the prediction may come right, intermediate adverse values could break position limits. Even with one day predictions, intra-day values could hurt, but this is less likely.

It is required to predict the one day, two day and three day closing values. Accurate one day predictions would be the ideal. Experience suggests that there are many circumstances where a move in one direction is reversed, at least partially, the following day, and hence our expectation is that a prediction of the two day value will be more reliable. If the prediction for the day following our target day (+1 or +2) is in the same direction as the prediction for the target day, then our expectation that it will be fulfilled should be greater. Hence the need for a three day prediction, to add confidence to the two day estimate.

It is important in trading to accept that on some days we make profits and on some days losses; we do not expect to be right all the time, but we have to restrict the losses and expect the cumulative profit to grow steadily. The maximum cumulative downside must also be acceptable. In trading using the neural network, we therefore need to make decisions to be long, neutral or short, as this is a system for aiding positioning. We don’t want to make mistakes, but we don’t want to miss opportunities.

There are 3 levels of testing that we should do on the predictions: i) direction only, ii) magnitude, iii) profit & loss. All predictions are for differences from today’s level, not for absolute values, with which the neural network cannot cope.

Direction only

Numeric and graphical analysis should be used.

Straight count

This compares the actual and predicted direction of the FTSE for each case in the test set, with the outcome as follows:

success if $PC/AC > 0$, $AC \neq 0$

failure if $\begin{cases} PC/AC < 0, & AC \neq 0 \\ AC = 0 \end{cases}$

where PC = Predicted Change, AC = Actual Change.


Assign numeric values, e.g. success=1, failure=0, to each test case; the overall result is computed as the average of these values, e.g. 125 successes from 250 cases implies a 50% success rate. In addition, the result is subdivided into two halves to give a feel for the effect of distance from the learning set. (This is because the training set and the test set are derived from the same chronologically ordered source data file, so the first case in the test set follows immediately after the last case in the learning set.) The results can then be expressed as follows, e.g.

first half: 65 from 125, 52% success
second half: 60 from 125, 48% success.

The overall total should be split to indicate whether success is better on –ve or +ve predictions and on –ve or +ve moves.

Straight count (ignoring small changes)

Small predicted changes would be excluded because it wouldn’t be worth our while trading on them.
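The straight count and its split into halves can be sketched in Python as follows. Each test case is represented as a (predicted change, actual change) pair; the function names are hypothetical. A prediction of exactly zero is counted as a failure here, which is an assumption, since the text does not cover that case.

```python
def direction_success(pc, ac):
    """Success if PC/AC > 0 with AC != 0; every other case is a failure."""
    return ac != 0 and pc / ac > 0


def success_rate(pairs):
    """Return (successes, cases, percentage success) for (PC, AC) pairs."""
    hits = sum(1 for pc, ac in pairs if direction_success(pc, ac))
    pct = 100 * hits / len(pairs) if pairs else 0.0
    return hits, len(pairs), pct


def by_halves(pairs):
    """Success rates for the first and second half of the test set."""
    mid = len(pairs) // 2
    return success_rate(pairs[:mid]), success_rate(pairs[mid:])
```

Because the test set is in chronological order, comparing the two tuples returned by `by_halves` gives the feel for distance from the learning set that the text describes.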

Magnitude
Compare the magnitude of each predicted change with the last known value of the index at that date, and separate out predicted moves of less than x% in absolute terms, where x is probably 0.5% or 1%.

success if |PC| > |xI| AND (PC/AC > 0), where I = Index value

The result is expressed as, e.g., 125 from 200 (50 small), success 62.5%. This could be amended to count as failures those small predicted moves which turned out to be large:

failure if |PC| < |xI| AND |AC-PC| > 2xI

Again, this analysis should be split into first half, second half and all of the test set. The total should also be split to show whether success is better for +ve or –ve predictions and for +ve or –ve moves.
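A minimal sketch of the magnitude test, assuming x is supplied as a fraction (0.005 for 0.5%) and that small predicted moves are simply set aside rather than scored. The function name and return layout are hypothetical.

```python
def magnitude_count(predicted, actual, index, x=0.005):
    """Count successes among predicted moves larger than x times the index.

    Returns (successes, traded, small, success_rate_percent), mirroring the
    report's '125 from 200 (50 small) success 62.5%' style of result.
    """
    successes = traded = small = 0
    for pc, ac, i in zip(predicted, actual, index):
        if abs(pc) <= abs(x * i):
            small += 1          # predicted move too small to trade on
            continue
        traded += 1
        if ac != 0 and pc / ac > 0:
            successes += 1      # large predicted move in the right direction
    rate = 100.0 * successes / traded if traded else float("nan")
    return successes, traded, small, rate
```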

Quantitative analysis
Initial quantitative analysis carried out on the output results is on the basis of percentage errors. The percentage error (PE) is calculated as follows:

$$PE = \frac{PI - AI}{AI} \times 100$$

where PI = predicted index value and AI = actual index value. The statistics calculated were as follows:

i) Average PE, $\frac{1}{n}\sum_{t} PE(t)$,
ii) Std. Dev. PE,
iii) Average absolute PE,
iv) Std. Dev. absolute PE,
v) Max. PE,
vi) Min. PE.
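The six statistics can be computed as in the following sketch. It assumes the sample standard deviation (n-1 denominator); the report does not say whether the sample or population form was used, and the function names are hypothetical.

```python
import statistics


def percentage_errors(predicted_index, actual_index):
    """PE = (PI - AI) / AI * 100 for each case."""
    return [(pi - ai) / ai * 100.0 for pi, ai in zip(predicted_index, actual_index)]


def pe_summary(pe):
    """Statistics i) to vi); stdev is the sample form, which is an assumption."""
    abs_pe = [abs(e) for e in pe]
    return {
        "avg_pe": statistics.mean(pe),
        "std_pe": statistics.stdev(pe),
        "avg_abs_pe": statistics.mean(abs_pe),
        "std_abs_pe": statistics.stdev(abs_pe),
        "max_pe": max(pe),
        "min_pe": min(pe),
    }
```

The same two functions can be applied to the first half, the second half and the whole of a set by slicing the input lists.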


The above analyses were carried out for first half, second half, and whole of both the test set and the sample set.

Forecasting - Long term prediction
The objective of long term forecasting is to predict the level of the FTSE to within ±1.5%. Such a view would be important to long-term holders of significant equity positions. The size of their portfolios (and the consequent reduction in liquidity) makes trading the total position impossible, and thus intermediate values of the Index are of less importance. Accurate 3 month predictions would be the ideal. However, as these Index values are predicted as differences, which could be subject to significant variations in the specific daily values at the start and end of the period, some averaging is desirable. Estimation of the two day average as a proxy for the FTSE was considered a good compromise. Tests showed that it generally varied by less than 0.5% from the FTSE value itself, and except on exceptional days the variation was within 1% bounds. As a measure of noise, the volatility of the 2 day MAV (moving average) was 11.7% compared to 16% for the FTSE. The tests for short term values were applied to these longer term estimates as well. The results needed to be presented initially as success in predicting the 2 day MAV, but also in terms of predicting the FTSE itself.

Preliminary results
In our earlier experiments with NeuroShell, the following three tests produced the most promising results. Of these, the latter two gave the best results of the experiments, and these are shown on the following pages.

Test | Predictive item | Derivatives calculated on
1 | FTSE +1D MOM | raw FTSE
2 | FTSE +2D MOM (residual) | 4 std. dev. adjusted FTSE
3 | FTSE +65D MOM of 2D MAV | 4 std. dev. adjusted FTSE 2D MAV

Result sheets
The result sheets on the following pages are of three varieties:

Test Record Sheet: contains a summary of the conditions, the input data sets and their contributions, the accuracy of prediction and any other information that is relevant to the test.

Line Graph: shows the actual and the predicted outputs plotted over time (usually from October 1991 to October 1992). A 100% accurate prediction means that the actual and the predicted graphs will be identical.

Scatter Graph: compares the values of the actual and the predicted outputs. Again, a 100% accurate prediction means that the scattered values will lie on the y=x line.

Test 1 - FTSE +1D MOM prediction
The tests carried out in predicting the 1D MOM produced results which were less successful than the 2D MOM predictions, e.g. an overall accuracy of 52% and a slightly better accuracy of 54% in the first half of the test set. Hence, we have concentrated our efforts on predicting the 2D MOM values.


Test 2 - FTSE +2D MOM (residual) prediction
Here the predictive item is not the raw FTSE but the residual of the FTSE, i.e. the FTSE with its trend removed. The input parameters are derivatives of the FTSE and raw data of other indexes, interest rates and economic data (see result sheet FT10ST01.RES). Although the overall direction accuracy is 80%, the average percentage error is 1.38%, which is above the limit of acceptance. However, if we only look at the first half of the test set, i.e. the first 6 months after the last data used in training the network, the average error is 0.33%. This shows that the neural net produced better predictions for events in the near future than for those more than 6 months away. Figure 7.1 shows the actual and predicted values of the FTSE 2 day momentum plotted over October 1991 to October 1992, and Figure 7.2 shows the comparison of the actual and predicted values.

Test 3 - FTSE +65D MOM prediction
The predictive item is the 2 day MAV of the FTSE in 65 days’ time. The input parameters are derivatives of the FTSE and raw data of other indexes, interest rates and economic data (see result sheet FT08AA03.RES). Figure 7.3 shows the actual and predicted values of the FTSE 65 day momentum plotted over October 1991 to October 1992, and Figure 7.4 shows the comparison of the actual and predicted values.


Reduction of input data sets
In general, a neural net’s performance can be greatly increased by simplifying its inputs. From the vast number of derivative data sets that we have generated (see the ‘Input data manipulation’ section), tests were carried out to determine the most significant data sets and to remove those which made the least contribution to the overall result of the prediction.

Numerical Analysis
To minimise the number of input parameters required, correlation analysis was first carried out on the derivatives of the FTSE index. Any data set which is highly correlated with other data sets is of little value as an input to the neural net, because it contains no information that is not already found in the correlated data sets. The table below shows the results of the correlation analysis. From the result of the analysis we can remove those data sets which are highly correlated (> 85%) with more than one other data set.

First block. Columns, in the same order as the rows: FTSE 100 | 1 day Returns | 1day MOM | 2 day MOM | 1 week MOM | 25day MOM | 50day MOM | 65 day MOM | 1 year MOM | % change of MOM over 10 days | % change of MOM over 25 days | % change of MOM over 50 days

Data set | Coefficients shown in the row, ending with the diagonal value 1
FTSE 100 | 1
1 day Returns | 1
1day MOM | 0.98, 1
2 day MOM | 1
1 week MOM | 1
25day MOM | 1
50day MOM | 1
65 days MOM | 0.87, 1
1 year MOM | 1
% change of MOM over 10 days | 1
% change of MOM over 25 days | 0.98, 1
% change of MOM over 50 days | 0.98, 0.86, 1

Second block. Columns, in the same order as the rows: Close-2 day MAV | Close-5 day MAV | Close-25 day MAV | Close-50 day MAV | 3 day ROC | 5 day ROC | 25 day ROC | 50 day ROC | MACD | RSI 9 day | RSI 14 days | Zero cl-cl- vol.

Data set | Coefficients shown in the row, ending with the diagonal value 1
Close-2day MAV | 1
Close-5 day MAV | 0.85, 1
Close-25 day MAV | 1
Close-50 day MAV | 0.88, 1
3 day ROC | 0.92, 1
5 day ROC | 0.9, 1
25 day ROC | 0.88, 0.92, 1
50 day ROC | 1
MACD | 0.85, 1
RSI 9 day | 1
RSI 14 days | 0.86, 0.97, 1
Zero cl-cl- vol. | 1

Table 7.1— Correlation analysis of FTSE derivatives
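The screening rule described above (flag a data set that is highly correlated with more than one other data set) can be sketched as follows. This is an illustration only: it assumes Pearson correlation, treats strong negative correlation as redundant too (the absolute value is an assumption), and all names and sample series are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def redundant_inputs(series, threshold=0.85):
    """Names of data sets correlated above the threshold with more than one other."""
    names = list(series)
    flagged = []
    for a in names:
        partners = [b for b in names
                    if b != a and abs(pearson(series[a], series[b])) > threshold]
        if len(partners) > 1:
            flagged.append(a)
    return flagged


# Hypothetical derivative series: a, b and c move together, d does not.
example = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10.5],
    "c": [1, 2, 3, 4, 4.5],
    "d": [5, 1, 4, 2, 3],
}
candidates = redundant_inputs(example)
```

The flagged names are candidates for removal; dropping every member of a correlated group would discard the shared information as well, so at least one representative would normally be kept.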

Direct experimentation
Because of the limitations of NeuroShell and the large number of input data sets that we have generated, a total of 3 tests had to be devised to test the suitability of the inputs, namely:


• FTSE and other indexes (US, Germany, France, Japan) derivatives
• FTSE and interest rates (US, Germany, France, Japan) derivatives
• FTSE, exchange rates and economic (US, Germany, France, Japan) derivatives.

In all of the tests the +2 day momentum of the FTSE 2 day MAV was used as the predictive item.

FTSE and other Indexes derivatives (2 days prediction)
The following graphs FT14AB02.CFT and FT14AB02.CFO plot the Index derivatives against their contribution factors. The contribution factor of an input is the sum of the absolute values of the weights leading from that input, and is a rough measure of the importance of the input. From the graphs we have extracted the extreme cases, i.e. the inputs which made the most and the least significant contributions.

Rank | Most significant inputs | Least significant inputs
1 | DJ-FTSE | CAC 1D MOM
2 | 1 year MOM | DJ RSI9D 1D MOM
3 | FT-DAXEX 12M MOM | 1 day Log
4 | DJ 12M MOM | DJ-FTSE 1D MOM
5 | Day numbers | FTSE-(CAC/EX.RATE) 1D MOM
6 | Zero cl-cl- vol. | 1day MOM
7 | FT-DJEX 12M MOM | 50day MOM
8 | CAC RSI 9D 20D MOM | FTSE-DAX
9 | FT-DJEX 65D MOM | FT-DJEX 1D MOM
10 | CAC-FTSE 65D MOM | NIK-FT 20D MOM
11 | % change of MOM over 10 days | FT-DAXEX 1D MOM
12 | FT-DJEX 20D MOM | NIK-FT 1D MOM
13 | DJ RSI14D 65D MOM | NIK 20D MOM
14 | DAX RSI 14D 65D MOM | DAX RSI 9D 1D MOM
15 | FTSE-DAX 12M MOM | FT-NIKEX 65D MOM

Table 7.2 — Most and least significant contributions of index derivatives
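The contribution factor, defined above as the sum of the absolute values of the weights leading from an input, can be computed as in this sketch. The weight layout (one row per input, one column per hidden node) and the sample names and weights are assumptions for illustration; they are not NeuroShell output.

```python
def contribution_factors(weights, names):
    """weights[i][h] is the weight from input i to hidden node h."""
    return {name: sum(abs(w) for w in row) for name, row in zip(names, weights)}


def rank_inputs(factors):
    """Input names ordered from most to least significant."""
    return sorted(factors, key=factors.get, reverse=True)


# Hypothetical trained weights for three inputs and two hidden nodes.
factors = contribution_factors(
    [[0.5, -1.5], [0.1, 0.2], [-2.5, 0.0]],
    ["1 year MOM", "1day MOM", "Day numbers"],
)
ranking = rank_inputs(factors)
```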

Observations
The majority of the Nikkei index derivatives fall in the medium to low significance region, whereas the DAX, Dow Jones and CAC index derivatives made significant contributions. This is not surprising, and it confirms that the Japanese market plays a less influential role in the movements of the FTSE. It appears safe to remove most Nikkei index derivatives from future tests since they contribute little. The 1D MOMs of many items are of low significance and could be excluded, perhaps in favour of 2D MOMs. The Day Numbers, representing seasonality, are of higher significance than might have been anticipated.


FTSE and Interest rates derivatives (2 days prediction)
Again, the graphs FT14AB03.CFT and FT14AB03.CFO illustrate the contributions made by the interest rate derivatives of the different nations. From the graphs we have extracted the extreme cases, i.e. the inputs which made the most and the least significant contributions.

Rank | Most significant inputs | Least significant inputs
1 | % change of MOM over 10 days | UK 20YR.GILT-INF 1D MOM
2 | Close-25 day MAV | JP BR-INF 1D MOM
3 | RSI 9 day | UK BR-INF 1D MOM
4 | FR 3MIB->7YBY 20D MOM | JP BR 1D MOM
5 | FR 3MIB 20D MOM | UK BR 1D MOM
6 | UK 20YR.GILT 20D MOM | FRR >7YBY-INF 1D MOM
7 | RSI 14 days | US 30Y-INF 1D MOM
8 | FR 3MIB->7YBY 1D MOM | JP 10YBY-INF 1D MOM
9 | US 30Y BY 20D MOM | FR >7Y BY 1D MOM
10 | FR 3MIB->7YBY | JP 3MIB-10YBY 1D MOM
11 | US 30Y-INF 20D MOM | US BR-INF 1D MOM
12 | Day numbers | GE BR 1D MOM
13 | JP BR-3MIB 20D MOM | US BR 1D MOM
14 | 65 days MOM | UK 3MIB-20YR.GILT
15 | 5 day ROC | JP BR-3MIB 1D MOM

Table 7.3 — Most and least significant contributions of interest rates derivatives

Observations Again, 1D MOMs, the Japanese indicators and inflation seem the least significant. The 2D MOMs seem to have significant value and also the 12 month MOMs. The continued significance of both 9 and 14 day FTSE RSI suggests that we should try this derivative for other items of data.


FTSE, Exchange rate and economic data derivatives (2 days prediction)
The graphs FT14TD04.CFT and FT14TD04.CFO show the contributions made by the different nations’ exchange rates and their respective GDP derivatives. The table below shows the most and least significant contributors.

Rank | Most significant inputs | Least significant inputs
1 | YEN/£ EX. RATE | MARKS/£ 20D MOM
2 | Close-2day MAV | 3 day ROC
3 | FR INF 12M % CHANGE | GE M. SUPLY 12M % CHANGE
4 | UK INF 12M % CHANGE | JP INF 12M % CHANGE
5 | 65 days MOM | US$/£ 1D MOM
6 | 1 year MOM | 1 week MOM
7 | YEN/£ 20D MOM | 25day MOM
8 | US$/£ EX. RATE | Close-25 day MAV
9 | MARKS/£ EX. RATE | 25 day ROC
10 | UK M. SUPLY 12M % CHANGE | RSI 14 days
11 | US M. SUPLY 12M % CHANGE | FR/£ 20D MOM
12 | Day numbers | 5 day ROC
13 | Zero cl-cl- vol | FR/£ 1D MOM
14 | % change of MOM over 25 days | US GDP 12M % CHANGE
15 | MACD | RSI 9 day

Table 7.4 — Most and least significant contributions of exchange rate and GDP derivatives

Observations Surprisingly, the Yen/£ Exchange rate comes out as being significant. Perhaps, something from Japan has to be! This is a very strange set of results requiring further analysis.


FTSE, Exchange rate and economic data derivatives (65 days prediction)
When the previous test was rerun to predict the 65 day, instead of the 2 day, momentum, we obtained slightly different results, as shown in graphs FT14TD04.CFT and FT14TD04.CFO. The table below shows the most and least significant contributors.

Rank | Most significant inputs | Least significant inputs
1 | 1 year MOM | 50day MOM
2 | US$/£ EX. RATE | MACD
3 | GE GDP 12M % CHANGE | 25day MOM
4 | FRANCS/£ EX. RATE | Zero cl-cl- vol
5 | MARKS/£ EX. RATE | 2 day MOM
6 | UK INF 12M % CHANGE | 1 week MOM
7 | RSI 9 day | 5 day ROC
8 | US GDP 12M % CHANGE | MARKS/£ 20D MOM
9 | US$/£ 20D MOM | Close-25 day MAV
10 | UK GDP 12M % CHANGE | MARKS/£ 1D MOM
11 | JP GDP 12M % CHANGE | 3 day ROC
12 | JP INF 12M % CHANGE | 50 day ROC
13 | US M. SUPLY 12M % CHANGE | YEN/£ 20D MOM
14 | US$/£ 1D MOM | GE M. SUPLY 12M % CHANGE
15 | US INF 12M % CHANGE | Day numbers

Table 7.4 - Most and least significant contributions of exchange rate and GDP derivatives (65 days prediction)

Observations It can be seen that the 12 month % change of inflation and GDP indicators are the prominent factors in the longer term prediction, compared to that of the short-term (2 day) prediction. In many respects, this is a much more understandable result.


CONCLUSIONS

Highlights At this stage of the project, to some extent we are still trying to understand the major factors involved in training a network as much as concentrating our efforts in producing networks which can predict with high accuracy. However, in the process of trying to understand these factors we have also produced some networks which gave high accuracy in their predictions namely:

• Predicting the 2 day MAV of FTSE 65 days (3 months) ahead gave good results, e.g. overall direction accuracy of 79%.
• Predicting the residual of FTSE (i.e. where an approximated linear trend was removed from the raw values) 2 days ahead also gave good results, with an overall directional accuracy of 80% and an average percentage error of 1.4%. In particular, the average error for the first part of the test set was as low as 0.33%.
• The predictions on the first half of the test sets are better than those on the latter part of the test sets.

Conclusions

Input data manipulation
• Ordering of the input data sets did not make any difference to the accuracy of prediction.
• The use of the back-propagation algorithm in NeuroShell means that accurate predictions are possible if large amounts of sample data are available for learning. For the majority of the tests, we used 5 years’ worth of data (1986-1991) for learning, and data from 1992 as test data. On some occasions, however, the required amount of data was not available, mainly due to the lack of French interest rates data from DATASTREAM, where the earliest available data starts from 1988. This meant that in some tests where interest rates were used, only 3 years’ worth of sample data was available for learning.
• The tests showed that short- and long-term predictions will require different data sets. The results of the tests need to be analysed closely to distinguish those data that are suited to short-term prediction from those that are suited to long-term prediction.

Reduction of input data
• In general, 1 day Momentum inputs do not make significant contributions; hence, they can be eliminated from future tests. In contrast, medium and long term momentums, e.g. 20 day, 65 day and 12 month, made significant contributions to both short-term and long-term prediction.
• It also appeared that the Japanese market indicators did not play a major role in the tests. We will have to carry out further tests to see if all of the Japanese inputs can be removed without loss of prediction accuracy.


Output data selection
• Predicting the 3 day weighted average, 2 days ahead, gave poor results. There are anticipated problems with this proxy value, and work on this estimation has been suspended.
• Predicting the FTSE 2 day momentum, without trend, produced acceptable results, and we need to carry out further tests to see if this is also true for long-term predictions.

Network Tuning
In terms of using the NeuroShell package, we have made the following observations:

• A network with more than one output consistently failed to converge (minimise errors) on the training set, and hence produced poor predictions.
• A total of 25 hidden nodes was found to be satisfactory for most of the tests when the number of input nodes is between 14 and 45. We have not done extensive tests on networks with a larger number of input nodes.
• The values for Threshold and Momentum which consistently gave good results were found to be 0.4 and 0.6 respectively, again when the number of input nodes is between 14 and 45.
• Reducing the values of Threshold and Momentum by 50% after the network had been trained for some time (around 2M presentations) did not improve the overall predicted results.
• The back-propagation algorithm used for learning the past experience of the market cannot handle time-series data. The ability to handle time-series data is of great importance for financial prediction, since the network needs to learn the market behaviour in the past.
• The current method of preparing the input data sets in a spreadsheet, especially the calculation of derivatives, is time consuming, laborious and, most importantly, error prone. A typical test would normally take from half to a full working day for preparation and validation of the data.
• NeuroShell allows networks with only a single hidden layer. With more hidden layers a network should be able to recognise a larger number of market scenarios and so predict with greater accuracy.
• The package provides no security measures to protect the network from inexperienced users, i.e. the network parameters can easily be altered. This is dangerous because a network can only predict accurately as long as its parameters remain unchanged; tinkering by a novice user could leave a well-trained network next to useless.

Analysis methods
The results of directional analysis are somewhat misleading. The figure below shows a result from one of the tests (FT10ST01). It can be seen from the graph that the accuracy of direction is better on the first half compared with the second half of the test set.


[Figure: line graph of Actual FTSE +2 days and Predicted FTSE +2 day (trend removed), plotted from Oct-91 to Oct-92; vertical axis from -500 to 100.]

NOTE: File = FT10ST01, Presentation = 3.4M, Threshold = 0.4, Momentum = 0.6

However, the directional analysis showed that the directional accuracy is better in the second half than in the first. The reason is as follows. During the first half of the test set the fluctuations of the actual values are small and oscillate around the 0 line, so any small error in the prediction can push the predicted value to the other side of the 0 line. Since directional accuracy only counts those predicted values which are in the same region (above or below the 0 line) as the actual values, the overall directional accuracy in the first half is lower than in the second half of the set, where the differences between the actual and the predicted values are large but both happen to be in the same region. The implication is that the directional accuracy of the predicted results is dependent upon the relative position of the 0 line.
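The effect can be reproduced with a small, entirely hypothetical example: predictions that are close to the actual values but fall on the wrong side of the 0 line score worse on direction than predictions that are far off but on the same side. The numbers below are invented for illustration and are not taken from test FT10ST01.

```python
def direction_rate(predicted, actual):
    """Percentage of cases where predicted and actual lie on the same side of 0."""
    hits = sum(1 for p, a in zip(predicted, actual) if a != 0 and p / a > 0)
    return 100.0 * hits / len(actual)


def mean_abs_error(predicted, actual):
    """Average absolute difference between predicted and actual values."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical "first half": small moves oscillating around the 0 line.
calm_actual = [2, -3, 1, -2]
calm_pred = [-1, 1, -2, 2]

# Hypothetical "second half": large moves, all well below the 0 line.
trend_actual = [-200, -250, -300, -350]
trend_pred = [-100, -120, -150, -200]

calm = (direction_rate(calm_pred, calm_actual), mean_abs_error(calm_pred, calm_actual))
trend = (direction_rate(trend_pred, trend_actual), mean_abs_error(trend_pred, trend_actual))
```

Here the calm period has the smaller errors but the lower direction score, while the trending period has much larger errors and a perfect direction score.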

A better assessment of the results was needed, and for that we use the percentage error and the standard deviation of the errors between the actual and the predicted output for each case. This is done for both the learning and the test set for comparison. The reason is that a network which does very well on the learning set (over 90% accuracy) but not on the test set may simply be memorising instead of generalising. By analysing both the learning and the test set we hope to determine the right amount of learning (i.e. the number of presentations of cases) required to give good results on the test set, and on any data the network has not seen before.


APPENDICES

Appendix A— Back-propagation algorithm

Appendix B— FTSE Analyses


APPENDIX A

Back-propagation algorithm
The following is an extract from a paper by van Camp [7]. A network learns by successive repetitions of a problem, making smaller errors with each iteration. The most commonly used function for the error is the sum of the squared errors of the output units:

$$E = \frac{1}{2}\sum_i (y_i - d_i)^2$$

The value $d_i$ is the desired output of unit i, and $y_i$ is its actual output, where $y_i$ is the sigmoid function $1/(1+e^{-x})$. To minimise the error, take the derivative of the error with respect to $w_{ij}$, the weights between units i and j:

$$\frac{\partial E}{\partial w_{ij}} = y_i \, y_j (1 - y_j) \, \beta_j$$

where $\beta_j = (y_j - d_j)$ for output units and $\beta_j = \sum_k w_{jk} \, y_k (1 - y_k) \, \beta_k$ for hidden units (k runs over the units in the next layer that unit j is connected to). Note that $y_j (1 - y_j)$ is the derivative of the sigmoid function. The error can then be calculated directly from the links going into the output units. For hidden units, however, the derivative depends on values calculated at all the layers that come after it. That is, the value β must be back-propagated through the network to calculate the derivatives. Using these equations, we can state the back-propagation algorithm as follows:

Choose a step size, δ (used to update the weights).
Until the network is trained,
    For each sample pattern,
        Do a forward pass through the net, producing an output pattern,
        For all output units, calculate $\beta_j = (y_j - d_j)$.
        For all other units (from last layer to first), calculate β using the calculation from the layer after it: $\beta_j = \sum_k w_{jk} \, y_k (1 - y_k) \, \beta_k$.
        For all weights in the network, change the weight by $\Delta w_{ij} = -\delta \, y_i \, y_j (1 - y_j) \, \beta_j$.
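The algorithm above can be sketched in plain Python for a network with one hidden layer. This is a minimal illustration under stated assumptions, not the NeuroShell implementation: weights are updated after every pattern, a constant bias input of 1 is appended to each pattern, there is no momentum term, and the toy OR problem, layer sizes and parameter values are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_ih, w_ho):
    """Forward pass: inputs -> hidden layer -> output layer."""
    hidden = [sigmoid(sum(xi * w_ih[i][j] for i, xi in enumerate(x)))
              for j in range(len(w_ih[0]))]
    output = [sigmoid(sum(hj * w_ho[j][k] for j, hj in enumerate(hidden)))
              for k in range(len(w_ho[0]))]
    return hidden, output


def train_pattern(x, d, w_ih, w_ho, step):
    """One back-propagation update for a single pattern; returns its error E."""
    hidden, y = forward(x, w_ih, w_ho)
    # beta for output units: y_j - d_j
    beta_out = [yk - dk for yk, dk in zip(y, d)]
    # beta for hidden units: sum over k of w_jk * y_k * (1 - y_k) * beta_k
    beta_hid = [sum(w_ho[j][k] * y[k] * (1.0 - y[k]) * beta_out[k]
                    for k in range(len(y)))
                for j in range(len(hidden))]
    # weight change: -step * y_i * y_j * (1 - y_j) * beta_j
    for j, hj in enumerate(hidden):
        for k, yk in enumerate(y):
            w_ho[j][k] -= step * hj * yk * (1.0 - yk) * beta_out[k]
    for i, xi in enumerate(x):
        for j, hj in enumerate(hidden):
            w_ih[i][j] -= step * xi * hj * (1.0 - hj) * beta_hid[j]
    return 0.5 * sum(b * b for b in beta_out)


def train(patterns, n_hidden=3, step=0.5, epochs=2000, seed=0):
    """Train on (inputs, targets) pairs; returns the total error for each epoch."""
    rng = random.Random(seed)
    n_in, n_out = len(patterns[0][0]), len(patterns[0][1])
    w_ih = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    w_ho = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_hidden)]
    history = []
    for _ in range(epochs):
        history.append(sum(train_pattern(x, d, w_ih, w_ho, step)
                           for x, d in patterns))
    return history


# Hypothetical toy problem (logical OR); the last input is the constant bias of 1.
OR_PATTERNS = [([0, 0, 1], [0]), ([0, 1, 1], [1]), ([1, 0, 1], [1]), ([1, 1, 1], [1])]
errors = train(OR_PATTERNS)
```

Comparing `errors[0]` with `errors[-1]` shows the "smaller errors with each iteration" behaviour described in the extract. The hidden-unit β values are computed before any weight is changed, so both layers are updated from the same forward pass.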

7 Drew van Camp, Neurons for Computer, Scientific American, September 1992, pp 125-127.


APPENDIX B FTSE Index analyses

The following is the complete set of analyses that we have performed on the FTSE Index. The analyses that were performed are as follows:
• Momentum (close to close difference)
• Returns
• Percentage Change of momentum
• Moving Averages
• Rate of Change
• Moving Average Convergence-Divergence
• Relative Strength Index
• Zero close-close volatility

The formulae described below, using Microsoft Excel notation, assume that the worksheet is set up as follows:

Row | FTSE INDEX | Derivative
1 | 1234.5 |
2 | 1345.6 |
3 | 1456.7 |
4 | 1567.8 |
5 | 1678.9 |
6 | 1789.0 |
7 | 1890.1 |
8 | 1901.2 |
9 | 2012.3 |
10 | 2123.4 |
11 | 2234.5 |

The following notations are used:
v = value of the index at close today
v[n] = value of the index n days ago
Ax = value of cell at column A, row x.

Momentum (Close to Close Difference)

Description
This is a measure of the difference between today’s index and the index on a previous day, usually 1, 2, 5, 25 and 50 days earlier. For use in NeuroShell this measure is preferred to the absolute value of the index.

Formula Momx = v - v[n]

where n= 1 to 260 days.

Excel formula


Momx=Ax-A(x-n)

Returns

Description

Formula

$$\text{Returns} = \ln\left(\frac{v}{v[1]}\right) \times 100$$

Excel formula Returnsx= LN(Ax/A(x-n))*100

Percentage Change of Momentum (PCM)

Description
This is a measure of the % change of momentum between today’s index and the index on a previous day, usually 10, 25 and 50 days earlier.

Formula

$$PCM = \frac{v - v[n]}{v} \times 100$$

where n = 10, 25 and 50 days.

Excel formula
PCMx = (Ax-A(x-n))/Ax*100
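For readers without the spreadsheet, the momentum, returns and PCM formulae can be sketched in Python. The sketch assumes the closes are held in a chronological list and that x is a 0-based position in that list (Excel rows are 1-based); the function names are hypothetical and the sample values are those of the worksheet layout shown earlier in this appendix.

```python
import math


def _check(x, n):
    # Guard against Python's negative indexing silently wrapping around.
    if n < 1 or x - n < 0:
        raise ValueError("need n >= 1 and at least n earlier closes")


def momentum(closes, x, n):
    """Mom = v - v[n]"""
    _check(x, n)
    return closes[x] - closes[x - n]


def returns(closes, x, n=1):
    """Returns = ln(v / v[n]) * 100"""
    _check(x, n)
    return math.log(closes[x] / closes[x - n]) * 100.0


def pcm(closes, x, n):
    """PCM = (v - v[n]) / v * 100"""
    _check(x, n)
    return (closes[x] - closes[x - n]) / closes[x] * 100.0


# Sample values from the worksheet layout (rows 1 to 11).
closes = [1234.5, 1345.6, 1456.7, 1567.8, 1678.9, 1789.0,
          1890.1, 1901.2, 2012.3, 2123.4, 2234.5]
latest = len(closes) - 1
two_day_momentum = momentum(closes, latest, 2)
```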

Moving Average (MAV)

Description
This is a measure of the arithmetic mean of the index. Of the various moving average measures this is the simplest and is often known as the Simple Moving Average (SMA). A moving average smooths out fluctuations in values and may help to indicate trends in the market. A shorter moving average (i.e. when n is small) is more sensitive to changes and results in less smoothing than a longer moving average. Normal usage is in comparing the value of ROC with the raw index data. When there is a divergence between the ROC and the price, followed by a break in the trend, this indicates the signal to buy or sell.

Formula

$$MAV = v[0] - \frac{1}{n}\sum_{i=0}^{n-1} v[i]$$

where n = 5, 25 and 50 days.


Excel formula
MAVx = Ax - SUM(A(x-n+1):Ax)/n
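A sketch of the Close-MAV derivative in Python, assuming the averaging window is the n most recent closes including today’s. The function name is hypothetical and x is a 0-based position in a chronological list of closes.

```python
def close_minus_mav(closes, x, n):
    """Today's close minus the simple moving average of the last n closes."""
    if n < 1 or x - n + 1 < 0:
        raise ValueError("need n >= 1 and at least n closes up to position x")
    window = closes[x - n + 1 : x + 1]   # n values ending at position x
    return closes[x] - sum(window) / n
```

For a steadily rising series the result is positive, and it grows with n because a longer average lags further behind the latest close.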

Rate of Change (ROC)

Description The rate of change measures how fast the momentum of the index is changing.

Formula

$$ROC = \frac{v}{v[n]} \times 100$$

Excel formula ROCx= (Ax/A(x-n))*100

Moving Average Convergence-Divergence (MACD)

Description
This is an indicator of overbought and oversold signals in the market. The measure is obtained by working out the difference between two exponential moving averages, one of a short and one of a long period. When the difference in value is greater than the exponential moving average of the difference, it can be a signal to buy. Conversely, when the difference in value is less than the exponential moving average of the difference, it can be a signal to sell. In addition, if the MACD lines are too far above or below the zero line, they could indicate an overbought or oversold situation respectively.

Formula
w = EMA(v,sf2) - EMA(v,sf1)

MACD = EMA(w,sf3)

where EMA(v,sf) is defined by:
EMA(v,sf) = v in the first interval
EMA(v,sf) = (1-sf) * EMA[1](v,sf) + sf*v in second and later intervals
where sf, sf1, sf2, sf3 = smoothing factor (0.0-1.0), and sf2>sf1

Excel formula
Here, the EMA for long and short periods are calculated first, in two different columns, e.g.
EMA1(sf1) = v (for the first value)
EMA1(sf2) = v (for the first value)
EMAx(sf1) = 0.02*vx + 0.98*EMAx-1(sf1)
EMAx(sf2) = 0.05*vx + 0.95*EMAx-1(sf2)
MACD can now be calculated as follows:
MACD1 = EMA1(sf2) - EMA1(sf1) (for the first value)
MACDx = EMA(sf3) = 0.1*[EMAx(sf2) - EMAx(sf1)] + 0.9*MACD(x-1)
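The same recurrences can be written in Python. This sketch uses the smoothing factors from the Excel example (0.02, 0.05 and 0.1) as defaults; the function names are hypothetical.

```python
def ema(values, sf):
    """EMA = v in the first interval, then (1 - sf) * previous EMA + sf * v."""
    out = [values[0]]
    for v in values[1:]:
        out.append((1.0 - sf) * out[-1] + sf * v)
    return out


def macd(closes, sf1=0.02, sf2=0.05, sf3=0.1):
    """MACD = EMA(w, sf3), where w = EMA(v, sf2) - EMA(v, sf1) and sf2 > sf1."""
    slow = ema(closes, sf1)
    fast = ema(closes, sf2)
    w = [f - s for f, s in zip(fast, slow)]
    return ema(w, sf3)
```

Because both averages start at the first close, the first MACD value is always zero, which matches the first-value line of the Excel layout.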


Relative Strength Index (RSI)

Description This is an indicator of trend reversals in the market, and is preferred over the momentum indicators.

Formula
If MEMA(u,n) = MEMA(d,n) = 0 then

$$RSI(v,n) = 50$$

else

$$RSI(v,n) = 100 \times \frac{MEMA(u,n)}{MEMA(u,n) + MEMA(d,n)}$$

where v = close, u = max(v - v[1], 0) and d = max(v[1] - v, 0). MEMA(v,n) is given by:

MEMA(v,n) = SMA(v,n) in the nth interval,

$$MEMA(v,n) = \frac{1}{n} \times v + \left(1 - \frac{1}{n}\right) \times MEMA[1](v,n)$$

in later intervals.

Excel formula
Again, MEMA for the two cases, MEMA(u,n) and MEMA(d,n), are calculated first.

$$MEMA_x(u,n) = \frac{1}{n}\max\{(v_x - v_{x-1}), 0\} + \left(1 - \frac{1}{n}\right) MEMA_{x-1}$$

$$MEMA_x(d,n) = \frac{1}{n}\max\{(v_{x-1} - v_x), 0\} + \left(1 - \frac{1}{n}\right) MEMA_{x-1}$$

Once these are calculated we can determine the value of RSI, which is given by:

$$RSI_x = \frac{100 \times MEMA_x(u,n)}{MEMA_x(u,n) + MEMA_x(d,n)}$$
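A Python sketch of the RSI calculation. It assumes that the down move is d = max(v[1] - v, 0), the mirror image of the up move u, and that the averages are seeded with the simple average of the first n moves; the function names are hypothetical.

```python
def _rsi_value(mu, md):
    if mu == 0 and md == 0:
        return 50.0                      # no movement at all
    return 100.0 * mu / (mu + md)


def rsi(closes, n):
    """RSI values from the first point at which n up/down moves are available."""
    ups = [max(closes[i] - closes[i - 1], 0.0) for i in range(1, len(closes))]
    downs = [max(closes[i - 1] - closes[i], 0.0) for i in range(1, len(closes))]
    if n < 1 or len(ups) < n:
        raise ValueError("need at least n + 1 closes")
    mu = sum(ups[:n]) / n                # MEMA(u, n) seeded with the SMA
    md = sum(downs[:n]) / n              # MEMA(d, n) seeded with the SMA
    out = [_rsi_value(mu, md)]
    for u, d in zip(ups[n:], downs[n:]):
        mu = u / n + (1.0 - 1.0 / n) * mu
        md = d / n + (1.0 - 1.0 / n) * md
        out.append(_rsi_value(mu, md))
    return out
```

A series that only rises gives 100, one that only falls gives 0, and a flat series gives the neutral value of 50.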

Zero trend close-close volatility (ZCCV)

Description This is an estimate of volatility in the market and the major assumption here is that the underlying distribution has a zero trend.

Formula
y = ln(close/close[1])
t = time (in years) until end of period

$$ZCCV = 100 \times \sqrt{\frac{1}{n}\sum_{i=0}^{n-1}\frac{y[i]^2}{t[i+1] - t[i]}}$$

With daily closes and 256 trading days in a year, t[i+1] - t[i] = 1/256, so that

$$ZCCV = 100 \times \sqrt{\frac{1}{n}\sum_{i=0}^{n-1} y[i]^2} \times \sqrt{256}$$

Excel formula
ZCCVx = 100*STDEV(A(x-n):Ax)*SQRT(256)
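A hedged Python sketch of a zero-trend close-close volatility. It assumes the estimate is the root mean square of the last n daily log returns (no mean is subtracted, which is the zero-trend assumption), annualised with the SQRT(256) factor that appears in the Excel formula. Note that Excel’s STDEV subtracts the mean and divides by n-1, so the two will differ slightly. The function name and the `periods_per_year` parameter are hypothetical.

```python
import math


def zccv(closes, n, periods_per_year=256):
    """Annualised zero-trend volatility (in %) from the last n log returns."""
    if n < 1 or len(closes) < n + 1:
        raise ValueError("need at least n + 1 closes")
    recent = closes[-(n + 1):]
    y = [math.log(recent[i] / recent[i - 1]) for i in range(1, len(recent))]
    mean_square = sum(v * v for v in y) / n      # mean is assumed to be zero
    return 100.0 * math.sqrt(mean_square) * math.sqrt(periods_per_year)
```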
