© 2021, IJSRCSE All Rights Reserved 8
International Journal of Scientific Research in Computer Science and Engineering. Research Paper.
Vol.9, Issue.1, pp.08-13, February (2021) E-ISSN: 2320-7639
DOI: https://doi.org/10.26438/ijsrcse/v9i1.813
Deep Learning Approach Using Long Short Term Memory Technique
for Monthly Rainfall Prediction in Chhattisgarh, India
Nisha Thakur 1*, Sanjeev Karmakar 2
1,2 Computer Science & Engineering, Bhilai Institute of Technology, Chhattisgarh Swami Vivekananda Technical
University, Bhilai Nagar, India
*Corresponding Author: nishathakur.india@gmail.com Tel.: 7000167615
Available online at: www.isroset.org
Received: 31/Jan/2021, Accepted: 10/Feb/2021, Online: 28/Feb/2021
Abstract— Rainfall is an essential factor in Chhattisgarh state because its economy depends on agriculture. A time
series forecasting approach for monthly rainfall prediction is implemented using a Long Short Term Memory (LSTM) model
applied to 1404 months of data for Chhattisgarh state. The performance and efficiency of the proposed rainfall
prediction model are evaluated using Mean Absolute Deviation (MAD), Mean Square Error (MSE), Root Mean Square Error
(RMSE), Cosine Similarity (CS) and Correlation Coefficient (r). The LSTM approach is run with learning rates α = 0.01,
0.05, 0.001 and 0.005 for 200, 400, 600, 800 and 1000 epochs. The experimental results show that LSTM gave
significantly better results than an ANN for 200 epochs.
Keywords— Forecasting; Long short-term memory; Recurrent neural networks; Time series; Artificial Neural Network
(ANN)
I. INTRODUCTION
Rainfall plays a significant role in Asian countries. India is one of the countries
that is directly or indirectly driven by rainfall. Most people living in villages
depend on agriculture, and in most parts of India fruitful agriculture depends on
rain. Balanced rainfall is therefore needed for good agricultural results. Rainfall
prediction becomes even more important when deficient or excess rain is possible.
Excess rain may cause flooding, so prediction is important to prepare for floods,
to manage resources and, most importantly, to save human lives. Deficient rainfall
also has adverse effects on humans and animals because it may affect water quality
and the aquatic ecosystem. Rainfall prediction can help in controlling the adverse
situations created by excess or deficient rain.
Chhattisgarh State was constituted on 1st November 2000. The state depends
substantially on agriculture, which supports about 60% of its total population.
Chhattisgarh ranks 7th among the rice-producing states of India, with 6.32 million
tons of rice production per year (1). A wide range of crops is a speciality of
Chhattisgarh. The average annual rainfall received by the state is about 1400 mm,
of which about 80% is received between June and September, and deficit rainfall
leads to scarcity of water in the rest of the year (2). In general, precipitation
decreases from the eastern to the western part of the state; the districts of Durg,
Rajnandgaon and Kabirdham fall in the rain shadow and receive the minimum
precipitation. Increasing the agricultural production of the state is a great
challenge, since 75% of the rice-cultivated area is rain-fed and therefore
vulnerable to the vagaries of monsoon rainfall amount and distribution (3).
Many approaches have been reported for making accurate rainfall predictions from
time series data, including model-based methods such as artificial neural networks
(ANN) for runoff prediction (4-5), streamflow forecasting and rainfall simulation (6).
There have been various recent studies on the application of LSTM neural networks
to the stock market (7), rainfall-runoff modelling (8), wind power forecasting (9),
weather forecasting (10), predicting individual mobility traces (11), etc.
1.1 GEOGRAPHICAL DETAILS OF CHHATTISGARH
Chhattisgarh extends from 17°46' north to 24°5' north latitude and from 80°15' east
to 84°20' east longitude. The total area of Chhattisgarh State is 135194 km² and it
is located in the central part of India. This part of Chhattisgarh lies in the upper
basin of the Mahanadi River and its tributaries (12).
Figure 1: Geographical Location of Chhattisgarh (India)
The climate of the state is tropical. Its proximity to the Tropic of Cancer and its
dependence on the monsoon for rainfall make it hot and humid. The summer temperature
in the state may reach 45 °C. The monsoon season lasts from late June to October and
is welcome as it provides relief from the heat. The state receives an average of
1,292 mm (50.9 in) of rain in the monsoon. Winter starts in November and lasts until
January. Winter in Chhattisgarh is the most pleasant season, with lower temperature
and humidity (13).
II. DATA COLLECTION
A monthly rainfall time series for Chhattisgarh State from January 1901 to December
2017 was collected, giving 1404 months of data in all. The data were downloaded from
the official website of the India Meteorological Department (IMD). The literature
review indicates that long short term memory (LSTM) techniques are used extensively
because they provide better accuracy. 80% of the data, i.e. rainfall in 1124 months,
was used to train the LSTM network, and the remaining 20%, i.e. rainfall in 280
months, was used to test the results.
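
The 80/20 split described above can be sketched as follows. This is a minimal illustration, assuming the split is chronological (earliest months for training, latest for testing) and that the training size is rounded up to reach 1124; the function name `chronological_split` and the placeholder series are hypothetical and do not come from the study.

```python
import numpy as np

def chronological_split(series, train_fraction=0.8):
    """Split a 1-D series into a leading training part and a trailing test part.

    No shuffling is done, so the test months all come after the training months.
    Rounding the training size up is an assumption made to reproduce 1124/280.
    """
    series = np.asarray(series, dtype=float)
    n_train = int(np.ceil(len(series) * train_fraction))
    return series[:n_train], series[n_train:]

# Placeholder values standing in for 1404 monthly totals (not the IMD data).
rainfall = np.arange(1404, dtype=float)
train, test = chronological_split(rainfall)
print(len(train), len(test))
```

Keeping the test months strictly after the training months avoids leaking future observations into training, which is the usual reason to avoid random splits for time series.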
III. RELATED WORK
M. Swapnaa and N. Sudhakar (2018) forecast rainfall in mm using a deep learning
approach based on the Long Short Term Memory (LSTM) technique [14].
Choi and Lee (2018) proposed an LSTM ensemble forecasting algorithm for time series
forecasting that effectively combines multiple forecast results from a set of
individual LSTM networks. It gives better results in terms of both forecasting
accuracy and runtime performance [15].
Poornima and Pushpalatha (2019) presented an Intensified Long Short-Term Memory
(Intensified LSTM) based Recurrent Neural Network (RNN) to forecast rainfall. The
results were compared with techniques such as Holt–Winters, Extreme Learning Machine
(ELM), Autoregressive Integrated Moving Average (ARIMA), Recurrent Neural Network
and Long Short-Term Memory models to demonstrate the improvement in the ability to
predict rainfall [16].
Chimmula and Zhang (2020) proposed a deep learning approach to forecast COVID-19
using Long Short-Term Memory (LSTM) networks. Based on the results of the LSTM
network, they predicted that the possible ending point of the outbreak would be
around June 2020 [17].
IV. METHODOLOGY: DEEP LEARNING APPROACH (LONG SHORT TERM MEMORY TECHNIQUE)
The basic building blocks of a typical LSTM network are known as cells. Two states
are transferred to the next cell: the cell state and the hidden state. The cell
state is the main path of data flow; it allows data to flow forward largely
unaltered, although some linear transformations may occur. Data are added to or
removed from the cell state by sigmoid gates. A gate can be compared to a series of
matrix operations that contain different individual weights (18). LSTM networks are
designed to avoid the long-term dependency problem, as they use gates to control the
remembering process. The first step in an LSTM is to identify the data that are not
required and will be deleted from the cell in that step. This is decided by a
sigmoid function, which takes the output of the previous LSTM unit, $h_{t-1}$ at
time $t-1$, and the present input $x_t$ at time $t$, and determines which part of
the earlier output must be removed. This gate is called the forget gate $f_t$; it is
a vector with values ranging from 0 to 1, one for each number in the cell state
$c_{t-1}$.
For time series forecasting of the input 𝑥𝑡, the LSTM
modifies the memory cell 𝑐𝑡 and outputs a hidden state ℎ𝑡 as per the following calculations. These are performed at
each time step 𝑡.
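The full mechanism of a modern LSTM is given by the following equations (19):

$$
\begin{aligned}
i_t &= \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i) \\
f_t &= \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f) \\
o_t &= \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o) \\
c_t &= f_t \otimes c_{t-1} + i_t \otimes \phi(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \\
h_t &= o_t \otimes \phi(c_t)
\end{aligned}
\qquad (1)
$$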
Figure 2: Systematic diagram for Long Short Term Memory
(LSTM) Neural Network
In (1), $W_{mn}$ denotes the weight of the connection from $m$ to $n$, and $b_n$ is
the bias parameter to be learned, where $m \in \{x, h\}$ and $n \in \{i, f, o, c\}$.
Additionally, $\otimes$ represents element-wise multiplication (Hadamard product),
$\sigma$ represents the standard logistic sigmoid function, and $\phi$ represents
the tanh function: $\sigma(x) = 1/(1 + e^{-x})$ and
$\phi(x) = (e^{x} - e^{-x})/(e^{x} + e^{-x})$. The input, output and forget gates
are denoted by $i_t$, $o_t$ and $f_t$ respectively, whereas $c_t$ represents the
internal state of the memory cell $c$ at time $t$. The vector $h_t$ represents the
value of the hidden layer of the LSTM at time $t$, whereas $h_{t-1}$ holds the
values output by each memory cell in the hidden layer at the preceding time step
(20).
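
The update rules in (1) can be illustrated with a minimal NumPy sketch of a single LSTM time step. The function name `lstm_step`, the dictionary layout of the weights, the hidden size of 4 and the random initial weights are all assumptions made for illustration; the study itself used the LSTM implementation in MATLAB, not this code.

```python
import numpy as np

def sigmoid(z):
    """Standard logistic sigmoid, the sigma of equation (1)."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step following equation (1).

    W maps keys such as "xi" or "hf" to weight matrices (from x or h to a gate);
    b maps "i", "f", "o", "c" to bias vectors. Returns the new (h_t, c_t).
    """
    i_t = sigmoid(W["xi"] @ x_t + W["hi"] @ h_prev + b["i"])  # input gate
    f_t = sigmoid(W["xf"] @ x_t + W["hf"] @ h_prev + b["f"])  # forget gate
    o_t = sigmoid(W["xo"] @ x_t + W["ho"] @ h_prev + b["o"])  # output gate
    candidate = np.tanh(W["xc"] @ x_t + W["hc"] @ h_prev + b["c"])
    c_t = f_t * c_prev + i_t * candidate                      # element-wise products
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
n_in, n_hidden = 1, 4
W = {f"x{g}": rng.normal(scale=0.1, size=(n_hidden, n_in)) for g in "ifoc"}
W.update({f"h{g}": rng.normal(scale=0.1, size=(n_hidden, n_hidden)) for g in "ifoc"})
b = {g: np.zeros(n_hidden) for g in "ifoc"}

h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for value in [0.2, 0.5, 0.1]:          # a short made-up input sequence
    h, c = lstm_step(np.array([value]), h, c, W, b)
print(h.shape, c.shape)
```

Because the forget gate multiplies the previous cell state element by element, values of $f_t$ near 0 erase the corresponding memory entries and values near 1 keep them.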
V. RESULT & DISCUSSIONS
We started experimenting with the time series data by applying a feedforward
network trained with the Bayesian Regularization (trainbr) algorithm for 200 epochs.
MAD, MSE, RMSE, Cosine Similarity (CS) and Correlation Coefficient (r) were
calculated to assess the accuracy of the forecast. The formulas used for these
factors are given in Table 3, along with the significance of each value for the
quality of prediction. The Neural Network tool (nntool) was used to obtain these
values. MAD, MSE, RMSE, CS and r were found to be 82.589, 15877.523, 126.006, 0.724
and 0.545 respectively, as shown in Table 1. The values of CS and r indicate a poor
resemblance between observed (actual) and forecast data; therefore, LSTM techniques
were taken into consideration for further study.
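Table 1: Feedforward network Bayesian Regularization analysis of time series data

| Factor | Feedforward network, Bayesian Regularization (trainbr), 200 epochs |
|--------|--------------------------------------------------------------------|
| MAD    | 82.589                                                             |
| MSE    | 15877.523                                                          |
| RMSE   | 126.006                                                            |
| CS     | 0.724                                                              |
| r      | 0.545                                                              |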
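Table 2: LSTM based analysis of time series data

| Learning rate (α) | Factor | 200 epochs | 400 epochs | 600 epochs | 800 epochs | 1000 epochs | Optimum value |
|-------------------|--------|------------|------------|------------|------------|-------------|---------------|
| 0.01              | MAD    | 33.544     | 35.089     | 35.949     | 34.850     | 35.148      | 33.544        |
| 0.01              | MSE    | 2904.847   | 3632.101   | 3535.022   | 3128.956   | 3665.518    | 2904.847      |
| 0.01              | RMSE   | 53.897     | 60.267     | 59.456     | 55.937     | 60.544      | 53.897        |
| 0.01              | CS     | 0.956      | 0.946      | 0.948      | 0.952      | 0.945       | 0.956         |
| 0.01              | r      | 0.931      | 0.915      | 0.919      | 0.925      | 0.914       | 0.931         |
| 0.05              | MAD    | 32.842     | 37.308     | 47.481     | 45.816     | 36.744      | 32.842        |
| 0.05              | MSE    | 3043.374   | 3383.482   | 4606.685   | 5450.882   | 3301.463    | 3043.374      |
| 0.05              | RMSE   | 55.167     | 58.168     | 67.873     | 73.830     | 57.458      | 55.167        |
| 0.05              | CS     | 0.956      | 0.949      | 0.928      | 0.919      | 0.951       | 0.956         |
| 0.05              | r      | 0.932      | 0.920      | 0.886      | 0.874      | 0.923       | 0.932         |
| 0.001             | MAD    | 34.059     | 32.921     | 32.494     | 32.468     | 32.409      | 32.409        |
| 0.001             | MSE    | 2841.701   | 2744.058   | 2704.869   | 2702.570   | 2703.178    | 2702.570      |
| 0.001             | RMSE   | 53.308     | 52.384     | 52.008     | 51.986     | 51.992      | 51.986        |
| 0.001             | CS     | 0.958      | 0.959      | 0.960      | 0.960      | 0.960       | 0.960         |
| 0.001             | r      | 0.934      | 0.936      | 0.937      | 0.937      | 0.937       | 0.937         |
| 0.005             | MAD    | 36.224     | 32.115     | 31.969     | 32.107     | 31.601      | 31.601        |
| 0.005             | MSE    | 3105.309   | 2708.119   | 2673.978   | 2704.418   | 2666.373    | 2666.373      |
| 0.005             | RMSE   | 55.725     | 52.040     | 51.711     | 52.004     | 51.637      | 51.637        |
| 0.005             | CS     | 0.953      | 0.959      | 0.960      | 0.959      | 0.959       | 0.960         |
| 0.005             | r      | 0.926      | 0.936      | 0.937      | 0.936      | 0.936       | 0.937         |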
Predictions were made using the LSTM technique available in the deep learning tool
of MATLAB ver. 2016. Table 2 contains all the above parameters. We varied the
learning rate (α = 0.01, 0.05, 0.001, 0.005) and observed the quality of prediction
for 200, 400, 600, 800 and 1000 epochs. MAD, MSE and RMSE should be as small as
possible for optimum prediction quality. On the other hand, the values of Cosine
Similarity (CS) and Correlation Coefficient (r) should be close to 1; when they are,
the actual and predicted data closely resemble each other and the prediction can be
considered useful. In Table 2 for α = 0.01, the RMSE is 53.897 for 200 epochs, which
is the smallest among all epoch settings. The Cosine Similarity and Correlation
Coefficient are 0.956 and 0.931 respectively, which are again the highest for 200
epochs. For α = 0.05, we find similar results: RMSE, Cosine Similarity and
Correlation Coefficient are 55.167, 0.956 and 0.932, again for 200 epochs. This
indicates that prediction results are better at 200 epochs when the learning rate α
is 0.01 or 0.05. Moving to the lower values α = 0.001 and 0.005, we find the better
results at 600 and 800 epochs respectively, as CS and r were found to be 0.960 and
0.937 in both cases. Graphs obtained for (α = 0.01, epochs = 200), (α = 0.05,
epochs = 200), (α = 0.001, epochs = 600) and (α = 0.005, epochs = 800) are shown in
Figure 3(a), Figure 3(b), Figure 3(c) and Figure 3(d) respectively. In these
figures, a close resemblance between the actual and forecast data can be observed.
The factors used to judge the quality of prediction and their significance are
elaborated in Table 3.
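
One way to read Table 2 is to pick, for each learning rate, the epoch count with the lowest RMSE. The sketch below does this with the RMSE values copied from Table 2; the names `rmse_by_rate` and `best_epoch` are hypothetical, and ranking on RMSE alone is an assumption, since the discussion above also weighs CS and r.

```python
# RMSE values copied from Table 2, one list per learning rate, ordered by epoch count.
epochs = [200, 400, 600, 800, 1000]
rmse_by_rate = {
    0.01:  [53.897, 60.267, 59.456, 55.937, 60.544],
    0.05:  [55.167, 58.168, 67.873, 73.830, 57.458],
    0.001: [53.308, 52.384, 52.008, 51.986, 51.992],
    0.005: [55.725, 52.040, 51.711, 52.004, 51.637],
}

def best_epoch(values, epoch_grid):
    """Return (epochs, rmse) for the smallest RMSE in one row of the table."""
    idx = min(range(len(values)), key=lambda k: values[k])
    return epoch_grid[idx], values[idx]

best_by_rmse = {rate: best_epoch(v, epochs) for rate, v in rmse_by_rate.items()}
for rate, (n_epochs, rmse) in best_by_rmse.items():
    print(f"alpha={rate}: lowest RMSE {rmse} at {n_epochs} epochs")
```

Ranked this way, α = 0.01 and α = 0.05 both select 200 epochs. For α = 0.001 and α = 0.005 the RMSE values from 600 to 1000 epochs differ by less than 0.4, so the preferred setting depends on which factor is given most weight.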
Table 3: Factors to judge the quality of prediction with their significance

Meaning of symbols used: $a_i$ = actual value, $f_i$ = forecast value, $n$ = number of observations.

| S. No. | Factor | Formula used | Significance |
|--------|--------|--------------|--------------|
| 1 | Mean Absolute Deviation (MAD) | $\frac{1}{n}\sum_{i=1}^{n} \lvert a_i - f_i \rvert$ | Estimates the quality of forecasting by calculating the mean absolute deviation. The smaller the value, the better the result. |
| 2 | Mean Square Error (MSE) | $\frac{1}{n}\sum_{i=1}^{n} (a_i - f_i)^2$ | A greater value of MSE indicates that the values are scattered widely around the central moment (mean); a smaller MSE indicates otherwise. |
| 3 | Root Mean Square Error (RMSE) | $\sqrt{\frac{1}{n}\sum_{i=1}^{n} (a_i - f_i)^2}$ | RMSE is the square root of MSE. As with MSE, a greater value indicates that the values are scattered widely around the central moment (mean); a smaller value indicates otherwise. |
| 4 | Cosine Similarity (CS) | $\dfrac{\sum_{i=1}^{n} a_i f_i}{\sqrt{\sum_{i=1}^{n} a_i^2}\,\sqrt{\sum_{i=1}^{n} f_i^2}}$ | Cosine similarity measures the similarity between two vectors of an inner product space. It is the cosine of the angle between the two vectors and indicates whether they point in roughly the same direction. It is 1 for identical data sets and 0 for completely dissimilar sets. |
| 5 | Correlation Coefficient (r) | $\dfrac{n\sum_{i=1}^{n} a_i f_i - \sum_{i=1}^{n} a_i \sum_{i=1}^{n} f_i}{\sqrt{n\sum_{i=1}^{n} a_i^2 - \left(\sum_{i=1}^{n} a_i\right)^2}\,\sqrt{n\sum_{i=1}^{n} f_i^2 - \left(\sum_{i=1}^{n} f_i\right)^2}}$ | Measures the strength of the association between two data sets. It returns a value between -1 and 1, where -1 indicates a negative correlation and +1 a positive correlation. |
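
The five factors in Table 3 can be computed as in the following sketch, which transcribes the formulas directly with NumPy. The function name `forecast_metrics` and the sample values are hypothetical; the study obtained its values in MATLAB, not with this code.

```python
import numpy as np

def forecast_metrics(actual, forecast):
    """Compute MAD, MSE, RMSE, cosine similarity and correlation coefficient
    for paired actual/forecast values, following the formulas of Table 3."""
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    n = a.size
    err = a - f
    mad = np.mean(np.abs(err))
    mse = np.mean(err ** 2)
    rmse = np.sqrt(mse)
    cs = np.sum(a * f) / (np.sqrt(np.sum(a ** 2)) * np.sqrt(np.sum(f ** 2)))
    r_num = n * np.sum(a * f) - np.sum(a) * np.sum(f)
    r_den = (np.sqrt(n * np.sum(a ** 2) - np.sum(a) ** 2)
             * np.sqrt(n * np.sum(f ** 2) - np.sum(f) ** 2))
    return {"MAD": mad, "MSE": mse, "RMSE": rmse, "CS": cs, "r": r_num / r_den}

# Made-up monthly values for illustration only (not taken from the IMD data).
actual = [10.0, 0.0, 250.0, 310.0, 40.0]
forecast = [12.0, 5.0, 230.0, 300.0, 55.0]
for name, value in forecast_metrics(actual, forecast).items():
    print(f"{name}: {value:.3f}")
```

Note that CS is computed on the raw values while r is computed after removing the means, so a non-negative series such as rainfall can show a high CS even when r is only moderate, as in Table 1.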
Figure 3(a): Comparison between Actual & Forecast Rainfall (200 epochs, alpha = 0.01) using LSTM algorithm
Figure 3(b): Comparison between Actual & Forecast Rainfall (200 epochs, alpha = 0.05) using LSTM algorithm
Figure 3(c): Comparison between Actual & Forecast Rainfall (600 epochs, alpha = 0.001) using LSTM algorithm
Figure 3(d): Comparison between Actual & Forecast Rainfall (800 epochs, alpha = 0.005) using LSTM algorithm
VI. CONCLUSION
In the present paper, an Artificial Neural Network (ANN) approach and a deep
learning approach for rainfall forecasting based on time series analysis were
implemented using MATLAB v. 2016. First, the monthly rainfall was forecast with a
feedforward network using the Bayesian Regularization algorithm for 200 epochs.
MAD, MSE, RMSE, Cosine Similarity (CS) and Correlation Coefficient (r) were
calculated to judge the quality of forecasting, and were found to be 82.589,
15877.523, 126.006, 0.724 and 0.545 respectively. The values of cosine similarity
and correlation coefficient were not close to 1, indicating that the actual and
predicted data do not have a significant resemblance. Therefore, the LSTM model was
used to predict the rainfall for 200 epochs. The same parameters were found to be
33.544, 2904.847, 53.897, 0.956 and 0.931 respectively, which shows a much closer
resemblance and a reduced loss. To investigate the best combination of learning
rate (α) and epochs, the LSTM network was run for 200, 400, 600, 800 and 1000
epochs with learning rates of 0.01, 0.05, 0.001 and 0.005. The results show that
LSTM networks provide better results with fewer epochs when higher learning rates
are chosen. These experimental results show that the LSTM approach gave the best
result in predicting rainfall. To improve the results further, a hybrid model may
be studied in future.
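
As a worked check on this comparison, the relative reduction in error from the feedforward network to the LSTM (α = 0.01, 200 epochs) can be computed from the values reported in Table 1 and Table 2. The first ratio uses RMSE and the second uses MAD:

$$
\frac{126.006 - 53.897}{126.006} \approx 0.572, \qquad \frac{82.589 - 33.544}{82.589} \approx 0.594
$$

That is, at 200 epochs the RMSE falls by about 57% and the MAD by about 59%.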
REFERENCES
[1] https://en.wikipedia.org/wiki/Rice_production_in_India (accessed on 25.09.20)
[2] Chakraborty, S., Pandey, R.P., Chaube, U.C., and Mishra, S.K., (2013). Trend and variability analysis of
rainfall series at Seonath River Basin, Chhattisgarh (India). Intl. J. Appl. Sci. Engg. Res, 2(4), 425-434, 2013.
[3] Bhuarya, S.K., Chaudhary, J.L., Khalkho, M., and Khalkho, D., (2015). Comparison of Drought Indices at
Different Stations of Chhattisgarh. Journal of Agricultural Physics, 15(2), 140-149, 2015.
[4] Maier, H.R., (2006). Application of natural computing methods to water resources and environmental
modeling, Mathematical and Computer Modeling, 44(5-6), 413-414, 2006.
[5] Maier, H.R., and Dandy, G.C., (2000). Application of neural networks to forecasting of surface water
quality variables, issues, applications and challenges, Environmental Modeling and Software 15, 348, 2000.
[6] T. A. Duong, M. D. Bui, and P. Rutschmann, Long short term memory for monthly rainfall prediction in
Camau, Vietnam,
https://www.researchgate.net/publication/322896962_LONG_SHORT_TERM_MEMORY_FOR_MONTHLY_RAINFALL_PREDICTION_IN_CAMAU_VIETNAM
[7] J. Qiu, B. Wang, C. Zhou, https://doi.org/10.1371/journal.pone.0227222, January 3, 2020.
[8] F. Kratzert, D. Klotz, C. Brenner, K. Schulz, and M. Herrnegger, (2018) Hydrology Earth System Science,
22, 6005-6022, 2018.
[9] López, E., Carlos Valle, C., Allende, H., Gil, E., and Madsen,
H., (2018). Wind Power Forecasting Based on Echo State
Networks and Long Short-Term Memory, Energies, 11, 526,
2018. doi:10.3390/en11030526
[10] Salman, A. G., Heryadi, Y., Abdurahman, E., and Suparta, W.,
(2018) .Weather Forecasting, 2018.
[11] Using Merged Long Short-term Memory Model, Bulletin of
Electrical Engineering and Informatics, 7(3), 377-385
[12] Crivellari, A., and Beinat, E., (2020). Sustainability. 12, 349,
2020. doi:10.3390/su12010349
[13] https://chhattisgarh.pscnotes.com/chhatttisgarh-geography/chhattisgarh-geographic-location (accessed on
27.10.20)
[14] Swapnaa & Sudhakar, (2018). A hybrid model for rainfall prediction using both parametrized and time
series models, International Journal of Pure and Applied Mathematics, Volume 119, No. 14, pp. 1549-1556, 2018.
[15] Choi, Y. J., & Lee, B., Combining LSTM Network Ensemble via Adaptive Weighting for Improved Time Series
Forecasting, Volume 2018.
[16] Poornima, S., & Pushpalatha, M., (2019) Prediction of Rainfall
Using Intensified LSTM Based Recurrent Neural Network with
Weighted Linear Units, Atmosphere, Volume - 10, pp – 668,
2019.
[17] Chimmula & Zhang., (2020), Time series forecasting of COVID
19 transmission in Canada using LSTM networks Chaos,
Solitons & Fractals Volume - 135, 2020.
[18] Z. C. Lipton, J. Berkowitz, and C. Elkan, "A critical review of recurrent neural networks for sequence
learning," https://arxiv.org/abs/1506.00019.
[19] S. Hochreiter and J. Schmidhuber, (1997). "Long short-term memory," Neural Computation, 9(8), 1735–1780.
[20] Jae Young Choi and Bumshik Lee, Combining LSTM Network Ensemble via Adaptive Weighting for Improved Time
Series Forecasting, Volume 2018, Article ID 2470171.
AUTHORS PROFILE
Ms. Nisha Thakur holds an M.Tech from Chhattisgarh Swami Vivekananda Technical
University, Chhattisgarh. She is a faculty member in the Department of Computer
Science at ICFAI University, Kumhari, Durg. She has 8 years of overall experience.
She is currently working in the field of prediction.
Dr. Sanjeev Karmakar holds a B.Sc. (Mathematics), an M.Tech in CSE and a PhD in
Computer & Information Technology. At present he is registered for the post-PhD
research degree programme DSc/D.Lit at Sambalpur University, Odisha, India. He is
an Associate Professor in the Master of Computer Application Department at Bhilai
Institute of Technology (BIT), Durg. He received the Young Scientist Award from the
Chhattisgarh Council of Science Technology (CCOST), Raipur, India in 2007. In
addition, he has received five different awards at national level. He has completed
one research project and one research project is currently running. Recently, two
scholars were awarded PhD degrees and one has submitted under his guidance.