
Post on 15-Feb-2017


BTP Report - Stock Prediction Model Analysis

Rachit Mishra

DA-IICT, Gandhinagar

[email protected]

Supervisor

Prof. P.M Jat

Abstract – This report presents a detailed analysis of stock prediction and puts forward a model for predicting stock prices. The research and its results build on a study of previous work in the same domain.

Keywords – stock prediction, neural networks, artificial neural networks, trend prediction.

I. INTRODUCTION

This research aims to forecast stock prices using neural networks. This is the core idea of the problem statement, and it guided both the background research and the implementation.

1.1 Importance of the problem

This problem is important because it concerns methods for estimating the future value of stock prices. In a global economy made up of a very large number of markets, a tool that could predict stock values would help market participants increase their profits and, in turn, benefit the economy.

1.2 General approaches

The general approaches to stock forecasting apply various machine learning algorithms to predict the price, or the price range, for an upcoming day, week, month and so on. These approaches share the important step of feeding the input data to the machine learning algorithm.

Another general approach is sentiment analysis. It analyses the emotional inclinations and sentiments of investors through their tweets and uses them to inform the prediction. In the long run, this did not prove very credible, because the opinions expressed became biased.

1.3 Solution outline

My solution follows a series of steps. First, I extract the data in the form of Excel sheets for a company, say X. For any given company, various factors contribute to the development of its prediction model:

Opening price for the period
Highest price of the period
Lowest price of the period
Closing price of the period

These are the primary attributes considered when the algorithms and the artificial neural network (ANN) tool are applied to the data flowing into the neural network. The time frame of the simulation can be varied according to the user's interest.

The data [1] is divided into two sets: training data and validation data. The simulation results are recorded, and plots are produced before and after training. Subsequently, validation is done on the remaining (100-x)% of the data, where x is the percentage of data allotted for training.
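The x% / (100-x)% division described above can be sketched as follows. This is a minimal Python illustration rather than the MATLAB code used in the project; the function name `split_train_validation` and the sample prices are hypothetical, and the split is assumed to be chronological (oldest rows first).

```python
def split_train_validation(rows, train_percent):
    """Split rows in time order: the first train_percent% for training,
    the remaining (100 - train_percent)% for validation."""
    if not 0 < train_percent < 100:
        raise ValueError("train_percent must be between 0 and 100")
    cut = round(len(rows) * train_percent / 100)
    return rows[:cut], rows[cut:]


# Hypothetical weekly closing prices, oldest first (not real market data).
prices = [100.0, 101.5, 99.8, 102.3, 103.1, 104.0, 102.7, 105.2, 106.0, 104.9]
train, validation = split_train_validation(prices, 90)
print(len(train), len(validation))
```

Keeping the rows in time order means the validation set always lies after the training set, which matches the idea of predicting later prices from earlier ones.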

Figure 1 : Step-wise fundamentals of the initial phase (data extraction in the form of .csv files, division into training and validation sets, feeding data into the neural network)

1.4 Summary of experimental results

Tuning different attributes produces different plots, which show how the outputs vary. The experiment was performed in three stages.

First, a simulation is performed before training. Two lines can be observed: one represents the data fed through the neural network and the other represents the actual data.

After this, the network is trained, with 90% of the data treated as training data and the remaining 10% as testing data. Overall, various observations can be drawn from the resulting plots. For instance, if the highest prices of a company's stock show a declining trend over a period of time, the predicted plot can still show a profit for the company, or vice versa. Such striking observations are discussed in depth later.

II. RELATED WORK

In the field of stock prediction, extensive research aimed at providing a near-accurate prediction model is underway and has set numerous benchmarks.

In one approach, the use of global stock data [2] in correlation with the data of other financial products is stressed. This approach uses the Support Vector Machine learning algorithm. It studies markets that stop trading right before the US markets open. Specifically, the major world stock indices [6] are used as input features for the predictor developed in this approach.

Figure 2 : Correlation of NASDAQ stock data with other global markets

In another approach, the importance of the backpropagation learning algorithm is stressed; it seeks a minimum of the error function by moving the network weights in the direction of the negative slope. Various attributes were available, such as the date, the time of day, the opening price, the closing price, the highest and lowest prices, and the fractional changes in prices, of which some were taken into account. For training, 60% of the data was used, whereas the remaining 40% was used for validation.
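The idea of moving against the slope can be illustrated on the simplest possible case. The sketch below is plain gradient descent on a one-weight linear model, written in Python; it is not the network or the code of the cited work, and the data, learning rate and step count are assumptions chosen only for illustration.

```python
def gradient_descent(xs, ys, learning_rate=0.01, steps=500):
    """Fit y ~ w * x by repeatedly stepping w against the gradient
    of the mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # d/dw of mean((w*x - y)^2) is mean(2 * (w*x - y) * x)
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad  # step in the direction of negative slope
    return w


# Hypothetical data generated with a true slope of 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = gradient_descent(xs, ys)
print(w)
```

Backpropagation applies the same update to every weight of a multi-layer network, using the chain rule to obtain each weight's gradient.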

Figure 3 : Looking at the error in the second approach

Thus, most of the related works discuss the application of artificial neural networks to develop a stock prediction model. A general observation is that the predictions are reasonably accurate. However, if there is a sudden fluctuation in any of the parameters, the accuracy decreases.

III. PROBLEM STATEMENT

3.1 PROBLEM OVERVIEW

By studying the methodology of neural networks [3], the forecasting of stock prices can be facilitated, as was done in this research. The motivation for the theoretical research was an important factor in developing the problem statement for this project. In short, the problem is to fetch data in the form of .csv files and then feed this data into the neural network.

3.1.1 Precise description

Consider the stock price data for a company, say Reliance. With the prediction model produced at the end of this research, the user chooses the time span over which the stock price data is used for training.

For instance, you choose to extract weekly stock price attributes between, say, 3rd January, 2005 and 28th December, 2015. You can obtain a csv file containing the necessary data from a relevant source. Of the 576 rows in the data sheet, 90% are allocated to training and 10% to testing.
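Reading such a csv file into rows of the four price attributes could look like the sketch below. It is written in Python rather than MATLAB, and it assumes a header of Date, Open, High, Low, Close; the actual export used in the project may be laid out differently, and the sample values are made up.

```python
import csv
import io

# Hypothetical sample; the column names are an assumption about the export.
SAMPLE_CSV = """Date,Open,High,Low,Close
2005-01-03,100.0,105.0,98.0,103.0
2005-01-10,103.0,108.0,101.0,107.0
2005-01-17,107.0,109.0,102.0,104.0
"""


def load_ohlc(text):
    """Parse CSV text into a list of [open, high, low, close] float rows."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        [float(r["Open"]), float(r["High"]), float(r["Low"]), float(r["Close"])]
        for r in reader
    ]


rows = load_ohlc(SAMPLE_CSV)
print(len(rows), rows[0])
```

For a file on disk, the same reader can be given an open file handle in place of `io.StringIO(text)`.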

3.1.2 Significance

The significance of this problem statement is that it bears on the functioning of stock markets and, through them, on the functioning of national economies. A prediction model that can predict the profit of your company's stock with near-accurate results is a powerful tool for the global economy.

IV. APPROACH

4.1 Architecture

Various elements are integral to data modelling [7] and form the basic underlying idea of the neural network architecture.

Figure 4 : Elements integral to NN Architecture

I have used MATLAB for the coding and the ANN tool to train and validate the data, thereby generating the appropriate plots.

Neural networks are used to approximate functions that depend on a large number of inputs, which is the underlying idea of the implementation.
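To make the idea of a function of several inputs concrete, the following Python sketch computes the output of a small feedforward network with one hidden layer. The layer sizes, the tanh activation and all weight values are assumptions for illustration; they are not taken from the network trained in this project.

```python
import math


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One hidden layer of tanh units followed by a single linear output."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(weights, inputs)) + b)
        for weights, b in zip(hidden_weights, hidden_biases)
    ]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Hypothetical scaled inputs in the order open, high, low, close.
sample = [0.50, 0.55, 0.48, 0.52]
hidden_weights = [[0.1, 0.2, -0.1, 0.3], [-0.2, 0.1, 0.4, 0.0]]
hidden_biases = [0.0, 0.1]
output_weights = [0.5, -0.5]
output_bias = 0.2
prediction = forward(sample, hidden_weights, hidden_biases, output_weights, output_bias)
print(prediction)
```

Training consists of adjusting the weights and biases so that this output moves closer to the actual price for each training row.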

The NN architecture also reflects the types of problems the application is meant to tackle. In the architecture, stocks can be classified into different groups based on their returns. For instance, they can be classified as positive, negative or neutral.
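A possible way to assign these three labels is sketched below in Python. The function name `classify_return`, the use of the relative change between consecutive closing prices, and the tolerance that defines "neutral" are all assumptions, since the report does not specify how the groups are formed.

```python
def classify_return(previous_close, current_close, tolerance=0.001):
    """Label the change between two closing prices as
    'positive', 'negative' or 'neutral'."""
    change = (current_close - previous_close) / previous_close
    if change > tolerance:
        return "positive"
    if change < -tolerance:
        return "negative"
    return "neutral"


# Hypothetical closing prices for four consecutive periods.
closes = [100.0, 102.0, 102.0, 99.0]
labels = [classify_return(a, b) for a, b in zip(closes, closes[1:])]
print(labels)
```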

4.2 Individual Component

The individual components involved are the different attributes which are considered as parameters for predicting the stocks. For any single parameter, or for all parameters at once, the user can simulate the input data being fed into the neural network and note the predicted outputs.

The algorithm implemented can be considered as three separate fragments of code.

Figure 5 : First fragment of the algorithm [4]

Above is the first fragment of the algorithm, in which the testing and the training data are separated. In the 8th line,

u_train = A(:, 2:526);

the ':' implies the inclusion of all the attributes as parameters while predicting the stock price output. In our case, I chose 90% of the data for training, which amounts to 516 rows, and the last 60 rows, about 10% of the total, for validation and testing.

Figure 6 : Second fragment

This next fragment of the algorithm performs training on the first 90% of the data, that is, the first 516 rows, and fine-tunes the input data [5] being fed into the neural network accordingly. The call

plot(y_train_sim, 'r:')

then draws the simulated training output as a red dotted line; the resulting plot is shown later.

Figure 7 : Final testing fragment of the algorithm

This is the final fragment of the algorithm, which performs testing on the remaining 10% of the data. In other words, after training [8] on the first 90% of the data, the values of the last 10% are predicted.

These predicted stock price values are then compared with the actual values of the last 10% [9]. The comparison tells a great deal about the algorithm and the nature of the stock market for the given company.
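One way to put a number on this comparison is to compute an error measure over the test rows. The Python sketch below uses mean absolute error and root mean squared error; these metrics are a suggestion rather than something reported in this project, and the price values are hypothetical.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average size of the gap between actual and predicted prices."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Like the mean absolute error, but penalises large gaps more heavily."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


# Hypothetical actual and predicted prices for four test rows.
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [101.0, 101.0, 103.0, 104.0]
mae = mean_absolute_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
print(mae, rmse)
```

A lower value on the test rows indicates predictions closer to the actual prices, which gives a single figure to set beside the visual comparison of the two plotted lines.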

V. EVALUATION

5.1 Objective of the experiment

The objective of the experiment is to feed a constant stream of data into the ANN tool prior to training. Then, a few parameters are listed and treated as the prime attributes needed for the simulation.

Thus, the overall objective is to train on some fraction of the input data and use the rest of the data to validate the results after training.

5.2 Experiment setup

Setting up the experiment required the code above to be written in MATLAB. After that, the first stage of simulation is executed, which produces the plot below.

It shows the values of the stock prices of the Bombay Stock Exchange (BSE) before training. The timeline considered is:

From : 3rd Jan, 2005
To : 28th December, 2015
No. of rows = 576

Figure 8 : Before training

In the next stage, the second fragment of the algorithm is executed, as shown in the previous section, and the network is trained. 90% of the data is used for training in this stage, which is roughly the first 516 rows of stock prices for BSE.

In the plot below, the red line, which indicates the predicted output, almost coincides with the blue line, which represents the actual price. This implies that the prediction is almost accurate on the training data.

Figure 9 : Post training analysis

With the experiment set up and the network trained on 90% of the data, the predicted [10] values of the last 10%, that is, the last 60 rows, are compared with the actual stock prices of the last 10%, which are represented by the blue line.

The plot shows that the actual prices have at times been higher and at times lower than the predicted ones.

Figure 10 : After performing the testing

5.3 Results and Analysis

In point form, the results can be drawn [11] as shown above, and a few points worth analysing are:

You can choose between various parameters, and your output will be formulated accordingly.
When you consider all the parameters, a significant difference between the actual and the predicted values can be observed at the end.
If, however, one chooses to use just a single attribute as a parameter, say closing price, the output isn't near-accurate.
Also, the difference between the actual and the predicted value decreases. Overall, the efficiency of the model decreases.

Figure 11 : When using a single attribute as the parameter, efficiency decreases.

In conclusion, the user can choose which attributes to use as inputs and which parameter to set as the output when training and validating the data [12]. The efficiency of this model varies depending upon the input chosen.

A single input, say closing price, chosen as a parameter can produce more efficient and accurate predictions than when all 4 attributes are considered as parameters, or vice versa.
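Switching between a single attribute and all four amounts to choosing which columns of each row are passed to the network. A minimal Python sketch of that choice follows; the column order and the helper name `select_attributes` are assumptions, and the rows are hypothetical.

```python
# Column positions are an assumption about how each row is laid out.
COLUMN_INDEX = {"open": 0, "high": 1, "low": 2, "close": 3}


def select_attributes(rows, names):
    """Keep only the named attributes from each [open, high, low, close] row."""
    indices = [COLUMN_INDEX[name] for name in names]
    return [[row[i] for i in indices] for row in rows]


# Hypothetical rows of open, high, low and close prices.
rows = [
    [100.0, 105.0, 98.0, 103.0],
    [103.0, 108.0, 101.0, 107.0],
]
close_only = select_attributes(rows, ["close"])
all_four = select_attributes(rows, ["open", "high", "low", "close"])
print(close_only, all_four)
```

Training once on `close_only` and once on `all_four`, with everything else held fixed, would allow the two configurations to be compared on the same test rows.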

VI. DISCUSSION AND CONCLUSION

This research-based experiment can be summed up in terms of the strengths and drawbacks of the model presented in this report.

The strengths of this model centre on the fact that it gives the user insight into how the stock of his or her firm might perform in the near future. Accordingly, the user can confer with associates, and the firm can take measures to keep prices at bay or to maximize its profits. This would also greatly assist clients who are major stockholders in one form or another. An advance idea of how the stocks of a given company might perform can greatly affect the decision of a person investing in that company's shares.

On the downside, the weaknesses of such research-based prediction models should also be taken into account in order to present an unbiased assessment of the whole experiment. The performance of stocks is erratic by nature. One cannot exactly predict the future, which at times renders the value of such experiments null and void. For instance, consider a prediction model that takes all 4 attributes as parameters alongside one that takes just a single parameter to predict the stock price. One of these models can be less accurate than the other, and a person who unknowingly relies on the less accurate model can suffer a great loss in the stock market. Thus, these experiments are trustworthy only to some extent; beyond that, the outcome is a matter of luck.

The future scope of this model is broad. In technical terms, the model can be extended into a comparator that gives a more direct idea of where to invest, as the user would get clearer insights into which company's stock might perform better in the near future.

VII. ACKNOWLEDGEMENT

During the four months of this research internship, I was able to dive deep into various domains of research pertaining to machine learning and data mining, as well as other technicalities associated with neural networks and stock prediction. For giving me the opportunity to pursue this research and for making the terms of research as lucid as possible, I would like to thank my mentor, Prof. P.M. Jat. I would also like to thank him for helping me develop the strategies and ideas needed to overcome the roadblocks I encountered during the two phases of my internship, and for providing insights into all the tools and technologies involved in my research. All in all, this research internship was an enlightening experience made possible only by great guidance.

VIII. REFERENCES

[1] <BSESN Historical Prices>, Accessed on 19th April, 2016, https://in.finance.yahoo.com/q/hp?s=%5EBSESN&a=06&b=1&c=1997&d=00&e=8&f=2016&g=w

[2] <Adani Power Ltd. stock prices>, Last Accessed on 29th April, 2016, https://www.quandl.com/data/NSE/ADANIPOWER-Adani-Power-Limited

[3] <Stock Market Prediction using Neural Networks>, Last Accessed on 29th April, 2016, http://neuroph.sourceforge.net/tutorials/StockMarketPredictionTutorial.html

[4] <Stock Market Prediction - MATLAB>, Last Accessed on 29th April, 2016, http://www.breakyourhead.com/2013/03/stock-prediction-artificial-neural.html

[5] <Half adder - Neural Networks>, Last Accessed on 29th April, 2016, http://www.breakyourhead.com/2012/11/half-adder-artificial-neural-networks.html

IX. APPENDIX

[6] Shen, Shunrong, Haomiao Jiang, and Tongda Zhang, "Stock Market Forecasting Using Machine Learning Algorithms," 2012.

[7] Marijana Zekić: Neural Network Applications in Stock Market Predictions - A Methodology Analysis, in B. Aurer, R. Logožar, Varaždin (Eds.), Proceedings of the 9th International Conference on Information and Intelligent Systems, pp. 255-263, 1998.

[8] S. Zemke, "On developing a financial prediction system: Pitfall and possibilities," Proceedings of DMLL-2002 Workshop, ICML, Sydney, Australia, 2002.

[9] Marijana Zekic, MS, "Neural Network Applications in Stock Market Predictions - A Methodology Analysis," University of Josip Juraj Strossmayer in Osijek, Croatia.

[10] Refenes, A.N., Zapranis, A., Francis, G., Stock Performance Modeling Using Neural Networks: A Comparative Study with Regression Models, Neural Networks, vol. 7, No. 2, 1994, pp. 375-388.

[11] Schoeneburg, E., Stock Price Prediction Using Neural Networks: A Project Report, Neurocomputing, vol. 2, 1990, pp. 17-27.

[12] Swales, G.S.Jr., Yoon, Y., Applying Artificial Neural Networks to Investment Analysis, Financial Analysts Journal, September-October, 1992, pp. 78-80.