deep learning approach using long short term memory

International Journal of Scientific Research in Computer Science and Engineering

Research Paper

Vol.9, Issue.1, pp.08-13, February (2021) E-ISSN: 2320-7639

DOI: https://doi.org/10.26438/ijsrcse/v9i1.813

Deep Learning Approach Using Long Short Term Memory Technique for Monthly Rainfall Prediction in Chhattisgarh, India

Nisha Thakur¹*, Sanjeev Karmakar

Computer Science & Engineering, Bhilai Institute of Technology, Chhattisgarh Swami Vivekananda Technical University, Bhilai Nagar, India

*Corresponding Author: nishathakur.india@gmail.com Tel.: 7000167615

Available online at: www.isroset.org

Received: 31/Jan/2021, Accepted: 10/Feb/2021, Online: 28/Feb/2021

Abstract— Rainfall is an essential factor in Chhattisgarh state because its economy depends on agriculture. Monthly rainfall is predicted with a time series forecasting approach that applies a Long Short Term Memory (LSTM) model to 1404 months of rainfall data for the state. The performance and efficiency of the proposed model are evaluated using Mean Absolute Deviation (MAD), Mean Square Error (MSE), Root Mean Square Error (RMSE), Cosine Similarity (CS) and Correlation Coefficient (r). The LSTM model was run with learning rates α = 0.01, 0.05, 0.001 and 0.005, each for 200, 400, 600, 800 and 1000 epochs. The experimental results show that the LSTM model gave markedly better results than an Artificial Neural Network (ANN) at 200 epochs.

Keywords— Forecasting; Long short-term memory; Recurrent neural networks; Time series; Artificial Neural Network

I. INTRODUCTION

Rainfall plays a significant role in Asian countries. India is one of those countries that are directly or indirectly driven by rainfall. Most people residing in villages depend on agriculture, and in most parts of India productive agriculture depends on rain. Balanced rainfall is therefore needed for good agricultural results. Rainfall prediction becomes even more important when deficient or excess rain is possible. Excess rain can cause flooding, so rainfall prediction is important for preventing flood damage, managing resources and, most importantly, saving human lives. Deficient rainfall also has adverse effects on humans and animals because it may affect water quality and the aquatic ecosystem. Rainfall prediction can help in controlling the adverse situations created by excess or deficient rain.

Chhattisgarh State was constituted on 1 November 2000. The state depends substantially on agriculture; about 60% of its total population depends on it. Chhattisgarh ranks 7th among the rice-producing states of India, with 6.32 million tons of rice production per year (1). A wide range of crops is a speciality of Chhattisgarh. The average annual rainfall received by the state is about 1400 mm, of which about 80% falls between June and September, and deficit rainfall leads to scarcity of water in the rest of the year (2). In general, precipitation decreases from the eastern to the western part of the state; the western districts (Durg, Rajnandgaon, Kabirdham) fall in the rain shadow and receive the least precipitation. Increasing the agricultural production of the state is a great challenge, since 75% of the rice-cultivated area is rain-fed and therefore vulnerable to the vagaries of monsoon rainfall amount and distribution (3).

Many approaches have been reported for accurate rainfall prediction from time series data, including model-based methods such as artificial neural networks (ANN) for runoff prediction (4-5), streamflow forecasting and rainfall simulation (6).

There have been various recent studies applying LSTM neural networks to the stock market (7), rainfall-runoff modelling (8), wind power forecasting (9), weather forecasting (10), predicting individual mobility traces (11), etc.

1.1 GEOGRAPHICAL DETAILS OF CHHATTISGARH

Chhattisgarh lies between 17°46' N and 24°5' N latitude and between 80°15' E and 84°20' E longitude. The total area of the state is 135,194 km² and it is located in the central part of India. It lies in the upper basin section of the Mahanadi River and its tributaries (12).


Figure 1: Geographical Location of Chhattisgarh (India)

The climate of the state is tropical. Its proximity to the Tropic of Cancer and its dependence on the monsoons for rainfall make it hot and humid. The summer temperature in the state may reach 45 °C. The monsoon season lasts from late June to October and is welcome, as it provides relief from the heat. The state receives an average of 1,292 mm (50.9 in) of rain in the monsoon. Winter starts in November and lasts until January. The winter season in Chhattisgarh is pleasant, with lower temperature and humidity (13).

II. DATA COLLECTION

Time-series monthly rainfall data for Chhattisgarh State from January 1901 to December 2017 were collected, giving 1404 months in all. The data were downloaded from the official website of the India Meteorological Department (IMD). The literature review indicates that long short term memory (LSTM) techniques are used extensively because they provide better accuracy. About 80% of the data (rainfall in 1124 months) was used to train the LSTM network, and the remaining 20% (rainfall in 280 months) was used to test the results.
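The 1124/280 split described above can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB workflow used in the study; it assumes the split is chronological (the first 1124 months for training, the last 280 for testing), and the function name `chronological_split` and the synthetic series are hypothetical stand-ins for the IMD record.

```python
import numpy as np

def chronological_split(series, n_train):
    """Split a 1-D series into contiguous train/test parts without shuffling."""
    series = np.asarray(series, dtype=float)
    if not 0 < n_train < len(series):
        raise ValueError("n_train must be between 1 and len(series) - 1")
    return series[:n_train], series[n_train:]

# Placeholder data only: 117 years x 12 months = 1404 values.
# These are random numbers, NOT the IMD rainfall record.
rng = np.random.default_rng(0)
rainfall = rng.gamma(shape=2.0, scale=50.0, size=117 * 12)

train, test = chronological_split(rainfall, n_train=1124)
print(len(rainfall), len(train), len(test))
```

Keeping the split contiguous avoids leaking future months into the training set, which is the usual reason for not shuffling time series data.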

III. RELATED WORK

M. Swapnaa and N. Sudhakarb (2018) forecast rainfall in

mm using deep learning approach based on Long Short

Term Memory (LSTM) technique [14].

Choi and Lee (2018) proposed an LSTM ensemble forecasting algorithm for time series forecasting that effectively combines multiple forecast results from a set of individual LSTM networks. It gives better results in terms of both forecasting accuracy and runtime performance [15].

Poornima and Pushpalatha (2019) present an Intensified Long Short-Term Memory (Intensified LSTM) based Recurrent Neural Network (RNN) to forecast rainfall. The results are compared with those of Holt–Winters, Extreme Learning Machine (ELM), Autoregressive Integrated Moving Average (ARIMA), Recurrent Neural Network and Long Short-Term Memory models to demonstrate the improvement in the ability to predict rainfall [16].

Chimmula and Zhang (2020) proposed a deep learning approach to forecast COVID-19 using Long Short-Term Memory (LSTM) networks. Based on the results of the LSTM network, they predicted that the possible ending point of the outbreak would be around June 2020 [17].

IV. METHODOLOGY USED DEEP LEARNING

APPROACH (LONG SHORT TERM MEMORY

TECHNIQUE)

The basic building blocks of a typical LSTM network are known as cells. Two states are transferred to the next cell: the cell state and the hidden state. The cell state is the main path of data flow; it allows data to flow forward largely unaltered, although some linear transformations may occur. Sigmoid gates add data to or remove data from the cell state. A gate can be regarded as a series of matrix operations with its own individual weights (18). LSTM networks are designed to avoid the long-term dependency problem, because they use gates to control the remembering process. The first step in an LSTM cell is to identify data that is no longer required and will be deleted from the cell state. A sigmoid function makes this decision; it takes the output of the previous LSTM unit, $h_{t-1}$ at time $t-1$, and the present input $x_t$ at time $t$, and determines which part of the earlier output must be removed. This gate is called the forget gate ($f_t$); it is a vector with values ranging from 0 to 1, one for each number in the cell state $c_{t-1}$.

For time series forecasting with input $x_t$, the LSTM updates the memory cell $c_t$ and outputs a hidden state $h_t$ according to the following calculations, which are performed at each time step $t$.

The full mechanism of a modern LSTM is given by the following equations (19):

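$$
\begin{aligned}
i_t &= \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i) \\
f_t &= \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f) \\
o_t &= \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o) \\
c_t &= f_t \otimes c_{t-1} + i_t \otimes \phi(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \\
h_t &= o_t \otimes \phi(c_t)
\end{aligned}
\tag{1}
$$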

Figure 2: Schematic diagram for Long Short Term Memory (LSTM) Neural Network

In (1), $W_{mn}$ denotes the weight of the connection from $m$ to gate $n$, and $b_n$ is the learnable bias parameter, where $m \in \{x, h\}$ and $n \in \{i, f, o, c\}$. Additionally, $\otimes$ represents element-wise multiplication (Hadamard product), $\sigma$ represents the standard logistic sigmoid function, and $\phi$ represents the tanh function: $\sigma(x) = 1/(1 + e^{-x})$ and $\phi(x) = (e^{x} - e^{-x})/(e^{x} + e^{-x})$. The input, output and forget gates are denoted by $i_t$, $o_t$ and $f_t$ respectively, whereas $c_t$ represents the internal state of the memory cell $c$ at time $t$. The vector $h_t$ represents the value of the hidden layer of the LSTM at time $t$, whereas $h_{t-1}$ holds the values output by each memory cell in the hidden layer at the preceding time step (20).
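To make the gate computations in (1) concrete, the following is a minimal NumPy sketch of a single LSTM time step. It is an illustration only, not the MATLAB implementation used in this study; the function names (`lstm_step`, `init_params`), the dictionary layout of the weights, the hidden size of 4 and the toy inputs are all assumptions introduced for the example.

```python
import numpy as np

def sigmoid(z):
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step following (1).

    W maps "xi", "hi", "xf", "hf", "xo", "ho", "xc", "hc" to weight matrices;
    b maps "i", "f", "o", "c" to bias vectors (hypothetical layout).
    """
    i_t = sigmoid(W["xi"] @ x_t + W["hi"] @ h_prev + b["i"])      # input gate
    f_t = sigmoid(W["xf"] @ x_t + W["hf"] @ h_prev + b["f"])      # forget gate
    o_t = sigmoid(W["xo"] @ x_t + W["ho"] @ h_prev + b["o"])      # output gate
    c_cand = np.tanh(W["xc"] @ x_t + W["hc"] @ h_prev + b["c"])   # candidate state
    c_t = f_t * c_prev + i_t * c_cand                             # element-wise products
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

def init_params(n_in, n_hidden, seed=0):
    """Small random weights and zero biases, for illustration only."""
    rng = np.random.default_rng(seed)
    W, b = {}, {}
    for gate in "ifoc":
        W["x" + gate] = rng.normal(0.0, 0.1, (n_hidden, n_in))
        W["h" + gate] = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
        b[gate] = np.zeros(n_hidden)
    return W, b

W, b = init_params(n_in=1, n_hidden=4)
h, c = np.zeros(4), np.zeros(4)
for value in [0.2, 0.5, 0.1]:          # toy scaled inputs, not rainfall data
    h, c = lstm_step(np.array([value]), h, c, W, b)
print(h.shape, c.shape)
```

The hidden and cell states are carried from one call to the next, which is how information from earlier months can influence later predictions.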

V. RESULT & DISCUSSIONS

We started experimenting with the time series data by applying a feedforward network trained with the Bayesian Regularization (trainbr) algorithm for 200 epochs. MAD, MSE, RMSE, Cosine Similarity (CS), and Correlation Coefficient (r) were calculated to assess the accuracy of the forecast data. The formulas used for these factors, along with the significance of their values for prediction quality, are given in Table 3. The neural network tool (nntool) was used to obtain these values. MAD, MSE, RMSE, CS, and r were found to be 82.589, 15877.523, 126.006, 0.724, and 0.545 respectively, as shown in Table 1. The values of CS and r indicate poor resemblance between the observed (actual) and forecast data; therefore, LSTM techniques were considered for further study.

Table 1: Feedforward network (Bayesian Regularization) analysis of time series data

| Factor | Feedforward network, Bayesian Regularization (trainbr), 200 epochs |
|---|---|
| MAD | 82.589 |
| MSE | 15877.523 |
| RMSE | 126.006 |
| CS | 0.724 |
| r | 0.545 |
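As a quick consistency check on Table 1 (a worked calculation added for illustration, not part of the original analysis), RMSE should equal the square root of MSE:

$$
\mathrm{RMSE} = \sqrt{\mathrm{MSE}} = \sqrt{15877.523} \approx 126.006
$$

which agrees with the tabulated RMSE.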

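Table 2: LSTM based analysis of time series data (columns 200 to 1000 give the number of epochs)

| Learning rate (α) | Factor | 200 | 400 | 600 | 800 | 1000 | Optimum value |
|---|---|---|---|---|---|---|---|
| 0.01 | MAD | 33.544 | 35.089 | 35.949 | 34.850 | 35.148 | 33.544 |
| 0.01 | MSE | 2904.847 | 3632.101 | 3535.022 | 3128.956 | 3665.518 | 2904.847 |
| 0.01 | RMSE | 53.897 | 60.267 | 59.456 | 55.937 | 60.544 | 53.897 |
| 0.01 | CS | 0.956 | 0.946 | 0.948 | 0.952 | 0.945 | 0.956 |
| 0.01 | r | 0.931 | 0.915 | 0.919 | 0.925 | 0.914 | 0.931 |
| 0.05 | MAD | 32.842 | 37.308 | 47.481 | 45.816 | 36.744 | 32.842 |
| 0.05 | MSE | 3043.374 | 3383.482 | 4606.685 | 5450.882 | 3301.463 | 3043.374 |
| 0.05 | RMSE | 55.167 | 58.168 | 67.873 | 73.830 | 57.458 | 55.167 |
| 0.05 | CS | 0.956 | 0.949 | 0.928 | 0.919 | 0.951 | 0.956 |
| 0.05 | r | 0.932 | 0.920 | 0.886 | 0.874 | 0.923 | 0.932 |
| 0.001 | MAD | 34.059 | 32.921 | 32.494 | 32.468 | 32.409 | 32.409 |
| 0.001 | MSE | 2841.701 | 2744.058 | 2704.869 | 2702.570 | 2703.178 | 2702.570 |
| 0.001 | RMSE | 53.308 | 52.384 | 52.008 | 51.986 | 51.992 | 51.986 |
| 0.001 | CS | 0.958 | 0.959 | 0.960 | 0.960 | 0.960 | 0.960 |
| 0.001 | r | 0.934 | 0.936 | 0.937 | 0.937 | 0.937 | 0.937 |
| 0.005 | MAD | 36.224 | 32.115 | 31.969 | 32.107 | 31.601 | 31.601 |
| 0.005 | MSE | 3105.309 | 2708.119 | 2673.978 | 2704.418 | 2666.373 | 2666.373 |
| 0.005 | RMSE | 55.725 | 52.040 | 51.711 | 52.004 | 51.637 | 51.637 |
| 0.005 | CS | 0.953 | 0.959 | 0.960 | 0.959 | 0.959 | 0.960 |
| 0.005 | r | 0.926 | 0.936 | 0.937 | 0.936 | 0.936 | 0.937 |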

Predictions were then made using the LSTM technique available in the deep learning tool of Matlab ver. 2016. Table 2 contains all of the above parameters obtained. We varied the learning rate (α = 0.01, 0.05, 0.001, 0.005) and observed the quality of prediction for 200, 400, 600, 800 and 1000 epochs. MAD, MSE and RMSE should be as small as possible for optimum prediction quality, whereas Cosine Similarity (CS) and Correlation Coefficient (r) should be close to 1. When CS and r are close to 1, the actual and predicted data closely agree and the prediction can be considered useful. For α = 0.01, Table 2 shows that the RMSE is 53.897 for 200 epochs, which is the smallest among all epoch settings at that learning rate. The Cosine Similarity and Correlation Coefficient are 0.956 and 0.931 respectively, which are again the highest for 200 epochs. For α = 0.05, we find a similar pattern: RMSE, Cosine Similarity and Correlation Coefficient are 55.167, 0.956, and 0.932 respectively, again for 200 epochs. This indicates that prediction results are best at 200 epochs when the learning rate α is 0.01 or 0.05. Moving to the lower learning rates α = 0.001 and 0.005, we find better results at 600 and 800 epochs respectively, as the optimum values of CS and r were found to be 0.960 and 0.937 for both α = 0.001 and α = 0.005. Graphs obtained for (α = 0.01, epochs = 200), (α = 0.05, epochs = 200), (α = 0.001, epochs = 600) and (α = 0.005, epochs = 800) are shown in Figure 3(a), Figure 3(b), Figure 3(c) and Figure 3(d) respectively. In these figures, a close resemblance between the actual and forecast data can be observed. The factors used to judge the quality of prediction, with their significance, are elaborated in Table 3.
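The comparison across epochs can also be done mechanically. The sketch below picks, for each learning rate, the epoch count with the lowest RMSE. The RMSE values are copied from Table 2; the assignment of the four blocks of the table to α = 0.01, 0.05, 0.001 and 0.005 (in that order) is an assumption based on the discussion above, and the helper name `best_epochs` is hypothetical.

```python
# RMSE values from Table 2, keyed by learning rate and then by number of epochs.
# The mapping of table blocks to learning rates is assumed from the discussion.
rmse = {
    0.01:  {200: 53.897, 400: 60.267, 600: 59.456, 800: 55.937, 1000: 60.544},
    0.05:  {200: 55.167, 400: 58.168, 600: 67.873, 800: 73.830, 1000: 57.458},
    0.001: {200: 53.308, 400: 52.384, 600: 52.008, 800: 51.986, 1000: 51.992},
    0.005: {200: 55.725, 400: 52.040, 600: 51.711, 800: 52.004, 1000: 51.637},
}

def best_epochs(table):
    """Return {learning_rate: (epochs, rmse)} with the lowest RMSE per rate."""
    return {
        lr: min(by_epoch.items(), key=lambda kv: kv[1])
        for lr, by_epoch in table.items()
    }

for lr, (epochs, value) in best_epochs(rmse).items():
    print(f"alpha={lr}: lowest RMSE {value} at {epochs} epochs")
```

Under that assumption, RMSE alone selects 200 epochs for α = 0.01 and 0.05, 800 epochs for α = 0.001 and 1000 epochs for α = 0.005. For the two smaller learning rates the values differ only slightly from 600 epochs onward, so a selection based on CS and r can point to a different epoch count.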

Table 3: Factors to judge the quality of prediction with their significance

| S. No. | Factor | Formula used | Meaning of symbols used | Significance |
|---|---|---|---|---|
| 1 | Mean Absolute Deviation (MAD) | $\frac{1}{n}\sum_{i=1}^{n} \lvert a_i - f_i \rvert$ | $a_i$ = actual, $f_i$ = forecast, $n$ = number of observations | Estimates the quality of forecasting by calculating the mean absolute deviation. The smaller the value, the better the result. |
| 2 | Mean Square Error (MSE) | $\frac{1}{n}\sum_{i=1}^{n} (a_i - f_i)^2$ | As above | A larger MSE indicates that the values are widely scattered around the central moment (mean); a smaller MSE indicates the opposite. |
| 3 | Root Mean Square Error (RMSE) | $\sqrt{\frac{1}{n}\sum_{i=1}^{n} (a_i - f_i)^2}$ | As above | A larger RMSE indicates that the values are widely scattered around the central moment (mean); a smaller RMSE indicates the opposite. |
| 4 | Cosine Similarity (CS) | $\frac{\sum_{i=1}^{n} a_i f_i}{\sqrt{\sum_{i=1}^{n} a_i^2}\,\sqrt{\sum_{i=1}^{n} f_i^2}}$ | As above | Cosine similarity measures the similarity between two vectors of an inner product space. It is the cosine of the angle between the two vectors and indicates whether they point in roughly the same direction. It is 1 for identical data sets and 0 for completely dissimilar sets. |
| 5 | Correlation Coefficient (r) | $\frac{n\sum a_i f_i - \sum a_i \sum f_i}{\sqrt{\left[n\sum a_i^2 - (\sum a_i)^2\right]\left[n\sum f_i^2 - (\sum f_i)^2\right]}}$ | As above | Measures the strength of the association between two data sets. It returns a value between -1 and 1, where -1 indicates a negative correlation and +1 indicates a positive correlation. |
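The five factors in Table 3 can be computed directly from paired actual and forecast values. The following NumPy sketch is one way to do so; it is not the tool used in this study, the function name `forecast_metrics` is hypothetical, and the sample values are toy numbers rather than rainfall observations.

```python
import numpy as np

def forecast_metrics(actual, forecast):
    """Return MAD, MSE, RMSE, cosine similarity and Pearson r for two 1-D series."""
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    if a.ndim != 1 or a.shape != f.shape or a.size < 2:
        raise ValueError("inputs must be 1-D, equal length, with at least 2 values")
    err = a - f
    mse = np.mean(err ** 2)
    return {
        "MAD": np.mean(np.abs(err)),
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        # Undefined if either series is all zeros (division by zero).
        "CS": np.dot(a, f) / (np.linalg.norm(a) * np.linalg.norm(f)),
        "r": np.corrcoef(a, f)[0, 1],
    }

# Toy values for illustration only (not the paper's data).
actual = [10.0, 0.0, 250.0, 300.0, 40.0]
forecast = [12.0, 5.0, 230.0, 310.0, 35.0]
metrics = forecast_metrics(actual, forecast)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that cosine similarity is computed on the raw values while r is computed on mean-centred values, which is why CS can remain high even when r is only moderate, as with the feedforward results in Table 1.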

Figure 3(a): Comparison between Actual & Forecast Rainfall (200 epochs, alpha = 0.01) using LSTM algorithm

Figure 3(b): Comparison between Actual & Forecast Rainfall (200 epochs, alpha = 0.05) using LSTM algorithm

Figure 3(c): Comparison between Actual & Forecast Rainfall (600 epochs, alpha = 0.001) using LSTM algorithm

Figure 3(d): Comparison between Actual & Forecast Rainfall (800 epochs, alpha = 0.005) using LSTM algorithm

VI. CONCLUSION

In the present research paper, an Artificial Neural Network (ANN) approach and a deep learning approach to rainfall forecasting based on time series analysis were implemented using MATLAB v. 2016. An attempt was first made to forecast the monthly rainfall by applying a feedforward network with the Bayesian Regularization algorithm for 200 epochs. MAD, MSE, RMSE, Cosine Similarity (CS), and Correlation Coefficient (r) were calculated to judge the quality of forecasting. These parameters were found to be 82.589, 15877.523, 126.006, 0.724, and 0.545 respectively. The values of cosine similarity and correlation coefficient were not close to 1, indicating that the actual and predicted data do not have a significant resemblance. Therefore the LSTM model was used to predict the rainfall for 200 epochs. The same parameters were calculated and found to be 33.544, 2904.847, 53.897, 0.956, and 0.931 respectively, which showed a high degree of resemblance and a reduced loss. To investigate the best possible combination of learning rate (α) and epochs, the LSTM network was run for 200, 400, 600, 800 and 1000 epochs with learning rates of 0.01, 0.05, 0.001 and 0.005. The results showed that LSTM networks provide better results with fewer epochs when higher learning rates are chosen. These experimental results showed that the LSTM approach gave the best result in predicting rainfall. To improve the results obtained, a hybrid model may be studied in future.

REFERENCES

[1] https://en.wikipedia.org/wiki/Rice_production_in_India (accessed on 25.09.20)

[2] Chakraborty, S., Pandey, R.P., Chaube, U.C., and Mishra, S.K.,

(2013).Trend and variability analysis of rainfall series at

Seonath River Basin, Chhattisgarh (India). Intl. J. Appl. Sci.

Engg. Res, 2(4), 425-434, 2013.

[3] Bhuarya, S., K., Chaudhary, J.,L., Khalkho, M., and Khalkho,

D., (2015). Comparison of Drought Indices at Different Stations

of Chhattisgarh .Journal of Agricultural Physics, 15(2), 140-

149, 2015.

[4] Maier, H.R., (2006). Application of natural computing methods

to water resources and environmental modeling , Mathematical

and Computer Modeling , 44 (5-6), 413-414, 2006.

[5] Maier, H.R., and Dandy, G.C., (2000). Application of neural

networks to forecasting of surface water quality variables,

issues, applications and challenges, Environmental Modeling

and Software 15, 348, 2000.

[6] T. A. Duong, M. D. Bui, and P. Rutschmann, Long short term memory for monthly rainfall prediction in Camau, Vietnam, https://www.researchgate.net/publication/322896962_LONG_SHORT_TERM_MEMORY_FOR_MONTHLY_RAINFALL_PREDICTION_IN_CAMAU_VIETNAM

[7] J.Qiu, B.Wang, C. Zhou,

https://doi.org/10.1371/journal.pone.0227222, January 3. 2020.

[8] F. Kratzert, D. Klotz, C.Brenner, K.Schulz, and M. Herrnegger,

(2018) Hydrology Earth System Science, 22, 6005-6022, 2018.

[9] López, E., Carlos Valle, C., Allende, H., Gil, E., and Madsen,

H., (2018). Wind Power Forecasting Based on Echo State

Networks and Long Short-Term Memory, Energies, 11, 526,

2018. doi:10.3390/en11030526

[10] Salman, A. G., Heryadi, Y., Abdurahman, E., and Suparta, W.,

(2018) .Weather Forecasting, 2018.

[11] Using Merged Long Short-term Memory Model, Bulletin of

Electrical Engineering and Informatics, 7(3), 377-385

[12] Crivellari, A., and Beinat, E., (2020). Sustainability. 12, 349,

2020. doi:10.3390/su12010349

[13] https://chhattisgarh.pscnotes.com/chhatttisgarh-geography/chhattisgarh-geographic-location (accessed on 27.10.20)

[14] Swapnaa & Sudhakar, (2018). A hybrid model for rainfall prediction using both parametrized and time series models, International Journal of Pure and Applied Mathematics, Volume 119, No. 14, pp. 1549-1556, 2018.

[15] Choi, Y. J., & Lee, B., Combining LSTM Network Ensemble via Adaptive Weighting for Improved Time Series Forecasting, Volume 2018.

[16] Poornima, S., & Pushpalatha, M., (2019) Prediction of Rainfall

Using Intensified LSTM Based Recurrent Neural Network with

Weighted Linear Units, Atmosphere, Volume - 10, pp – 668,

[17] Chimmula & Zhang., (2020), Time series forecasting of COVID

19 transmission in Canada using LSTM networks Chaos,

Solitons & Fractals Volume - 135, 2020.

[18] Z. C. Lipton, J. Berkowitz, and C. Elkan, "A critical review of recurrent neural networks for sequence learning," https://arxiv.org/abs/1506.00019.

[19] S. Hochreiter and J. Schmidhuber, (1997). "Long short-term memory," Neural Computation, 9(8), 1735–1780.

[20] Jae Young Choi and Bumshik Lee, Combining LSTM Network

Ensemble via Adaptive Weighting for Improved Time Series

Forecasting, Volume 2018, Article ID 2470171.

AUTHORS PROFILE

Ms. Nisha Thakur holds an M.Tech from Chhattisgarh Swami Vivekananda Technical University, Chhattisgarh. She is a faculty member in the Department of Computer Science at ICFAI University, Kumhari, Durg. She has 8 years of overall experience. She is currently working in the field of prediction.

Dr. Sanjeev Karmakar holds a B.Sc. (Mathematics), an M.Tech in CSE and a PhD in Computer & Information Technology. At present he is registered for the post-PhD research degree program (DSc/D.Lit) at Sambalpur University, Odisha, India. He is an Associate Professor in the Master of Computer Application Department at Bhilai Institute of Technology (BIT), Durg. He received the Young Scientist Award from the Chhattisgarh Council of Science and Technology (CCOST), Raipur, India in 2007, and has received five other awards at the national level. He has completed one research project and one research project is ongoing. Recently, two scholars were awarded PhD degrees and one has submitted a thesis under his guidance.
