Forecasting Black River Flow Using Feedforward Artificial Neural Networks
GRADUATE PROJECT REPORT
Submitted to the Faculty of the Department of Computing Sciences
Texas A&M University-Corpus Christi
Corpus Christi, Texas
In Partial Fulfillment of the Requirements for the Degree of
Master of Science in Computer Science
By
Sarvani Thota
Spring 2018
Committee Members
Dr. Alaa Sheta
Committee Chairperson
Dr. Ajay K. Katangur
Committee Member
ABSTRACT
River flow estimation plays a significant role in countries where agriculture is central to the economy. In this work, Artificial Neural Network (ANN) and Auto Regressive (AR) models are used to solve the river flow forecasting problem for the Black River at Elyria, OH. For these modeling techniques, both single attribute and multiple attribute datasets are considered. The forecasting task is treated as a time series modeling problem, for which artificial neural networks are known to be well suited. A comparison is made between the multi attribute model and the single attribute delay model, where the latter is designed based on model order selection. Simulation results indicate that the neural network model outperforms the auto-regression model for both the multi attribute model and the single attribute delay model. Hence, the neural network model is recommended as a tool for river flow forecasting.
TABLE OF CONTENTS
CHAPTER Page
ABSTRACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
TABLE OF CONTENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi
1 INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Supervised learning . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Unsupervised learning . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Historical Background and Related Works . . . . . . . . . . . . . . . . 4
1.4 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.5 Research Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.6 Research Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2 MODELING TECHNIQUES . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.1 Traditional modeling . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.1.1 Linear Regression Model (LR) . . . . . . . . . . . . . . . . . . . . 11
2.1.2 Auto Regression Model (AR) . . . . . . . . . . . . . . . . . . . . . 13
2.2 Artificial Neural Networks . . . . . . . . . . . . . . . . . . . . . . 16
2.2.1 Neural Network Components . . . . . . . . . . . . . . . . . . . . . . 16
2.2.1.1 Types of Neural Networks . . . . . . . . . . . . . . . . . . . . . 18
2.2.2 Network Structure . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.2.2.1 Design of Network Structure . . . . . . . . . . . . . . . . . . . . 20
2.2.3 Backpropagation Learning algorithm . . . . . . . . . . . . . . . . . 20
3 SYSTEM DESIGN AND IMPLEMENTATION . . . . . . . . . . . . . . . . 24
3.1 Data Normalization . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.2 Model Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
3.2.1 Model Structure Selection . . . . . . . . . . . . . . . . . . . . . . 28
3.3 Evaluation Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . 30
4 IMPLEMENTATION AND RESULTS . . . . . . . . . . . . . . . . . . . . . 32
4.1 AR Implementation for Single Attribute delay Model . . . . . . . . . . 32
4.2 AR Implementation for Multiple Attribute Model . . . . . . . . . . . . 33
4.3 ANN Implementation for Single Attribute delay Model . . . . . . . . . . 34
4.4 ANN Implementation for Multiple Attribute Model . . . . . . . . . . . . 35
5 CONCLUSION AND FUTURE WORK . . . . . . . . . . . . . . . . . . . . 41
REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
LIST OF TABLES
TABLE Page
1 Related Research work on River Forecasting . . . . . . . . . . . . . . . 4
2 NN Parameters for Single Attribute delay Model . . . . . . . . . . . . . 35
3 NN Parameters for Multiple Attribute Model . . . . . . . . . . . . . . . 37
4 Results for ANN and AR models of the Black Water River for Single Attribute delay Model . . . . . . . . . . . . . . . . . . . . . . . . . . 38
5 Results for ANN and AR models of the Black Water River for Multiple Attribute Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
LIST OF FIGURES
FIGURE Page
2.1 Three-layer perceptron . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.1 Black river at Ohio state . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.2 Proposed Model development procedure . . . . . . . . . . . . . . . . . 25
3.3 Data collected between 2011 and 2013 as inputs . . . . . . . . . . . . . 25
3.4 Discharge rate collected between 2011 and 2013 . . . . . . . . . . . . . 26
3.5 System model based on Multiple Attributes . . . . . . . . . . . . . . . . 28
3.6 System model for Single Attribute delay Model . . . . . . . . . . . . . . 29
4.1 Actual and Predicted flow using AR model for Single Attribute delay Model 33
4.2 Actual and Predicted flow using AR for Multiple Attribute Model . . . . 34
4.3 NN convergence curve over 1000 epochs for Single Attribute delay Model 36
4.4 Neural Network Structure for Multiple Attribute Model . . . . . . . . . . 37
4.5 Actual and Predicted flow using ANN for Single Attribute delay Model . 38
4.6 Actual and Predicted flow using ANN for Multiple Attribute Model . . . 39
4.7 Neural Network convergence over 5 iterations in Multiple Attribute Model 40
CHAPTER 1
INTRODUCTION
Forecasting the flow rate accurately from collected data is a challenging problem for engineers, and it is important for both short-term and long-term decisions. The two broad categories of forecasting techniques [4] are quantitative methods (the objective approach) and qualitative methods (the subjective approach). Quantitative forecasting methods are based on the analysis of historical data, under the assumption that past patterns in the data can be used to forecast future data points. Qualitative forecasting techniques employ the judgment of experts in the specified field to generate forecasts. The approach taken in this work is quantitative. The forecast does not depend on the stream flow rate alone; it considers numerous parameters such as temperature, humidity, and precipitation. Accurate and timely forecasting of river flooding provides ample time for the authorities to take flood protection measures such as evacuation. Because river flow depends on attributes like temperature, rainfall, conductivity, and dissolved oxygen, the problem is considered non-linear in nature, and forecasting such non-linear data is a challenging task, as it is based on time series. Therefore, the data considered in this work is both multi-variable and univariate.
Furthermore, the study of the life cycle of water is known as hydrology. The most basic process in this cycle is evaporation, which eventually returns water to streams in the form of rainfall; excess rainfall, however, leads to floods. The flow rates collected at each station are essential to many activities, such as taking flood protection measures, assessing how much water can be extracted from a river for water supply or irrigation, protecting agricultural land, dam construction, and reservoir operations. Since precise estimation of the flow rate is very important in this cycle, models that deal with meteorological, hydrological, and geological variables should be improved. Controlling water and operating water structures would then be efficiently possible. As a part of solving
such problems, we adopt machine learning techniques [22]. Nowadays machine learning
techniques are widely used in many applications. Machine learning is a field of computer
science that gives computer systems the ability to learn from data, without being explicitly
programmed. They learn from previous computations to produce reliable, repeatable decisions and results. The two most widely adopted machine learning methods are supervised learning and unsupervised learning, though other methods of machine learning exist as well.
1.1 Supervised learning
Supervised learning algorithms are trained on examples, taking inputs and predicting the outputs. For instance, a data point could be labeled either "F" (failures) or "R" (runs). The learning algorithm receives a set of input and output values, and it learns by comparing its actual output with the correct output to discover the error. The algorithm then learns from the examples presented to it and extracts useful information from the raw data. Through strategies like regression, prediction, and classification, supervised learning uses labeled examples to predict the values of unlabeled data. Supervised learning is commonly utilized in cases where the learned model can predict future trends.
1.2 Unsupervised learning
Unsupervised learning is used on data that has no historical labels: the system is not told the "right answer." The objective is to explore the data and discover some structure within it. Unsupervised learning works well on transactional data. Commonly used machine learning algorithms include the Naive Bayes classifier, K-means clustering, linear regression, logistic regression, artificial neural networks, genetic programming, and fuzzy logic.
Therefore, both supervised and unsupervised learning algorithms can be used to pre-
dict the flow rate. Training can be given on the available extensive records of river flow
and other climatic data, which could be used to predict flow. There are many practical situ-
ations where the primary concern is with making accurate predictions at specific locations.
In such cases, it is preferred to implement a simple black box [1] model to identify a direct
mapping between the inputs and outputs. In the class of black box models, machine learning methods such as the Auto Regression (AR) model and the Artificial Neural Network (ANN) are used to predict flow from known meteorological and hydrological time series. In this work, AR and ANN models for flow prediction are implemented, and graphs of the actual and estimated flow rates are plotted for both models. In addition, metrics such as the mean absolute percentage error (MAPE), mean square error (MSE), and variance accounted for (VAF) are calculated for each technique and compared to know
which model performs better in prediction. The main aim of the project is to model time series prediction from historical records, which is necessary to solve the water management problems below.
Forecasting the water flow will solve many problems like:
• Estimate how much water can be stored for future use.

• Analyze how much water can be released to other canals for irrigation, to improve the yield in agriculture and fishing.

• Redirect the overflow of the river to protect people and property around it from destruction.
• Find appropriate places on the river for installing hydroelectric plants for generating electricity.

Table 1: Related Research Work on River Forecasting

Ref   Authors                                Title                                              River
[7]   Y. B. Dibike and D. P. Solomatln       River Flow Forecasting Using Artificial            Apure river basin
                                             Neural Networks
[20]  A. F. Sheta and M. S. El-Sherif        Optimal prediction of the Nile River flow          Nile River
                                             using neural networks
[11]  Ozgur Kisi                             Daily River Flow Forecasting Using Artificial      Black water river,
                                             Neural Networks and Auto-Regressive Models         Gila river
[19]  Linda See and Stan Openshaw            Applying soft computing approaches to river        Ouse River
                                             level forecasting                                  catchment
[1]   Imen Aichouri, Azzedine Hani, Nabil    River flow model using artificial neural           Seybouse basin
      Bougherira, Larbi Djabri, Hicham       networks
      Chaffai and Sami Lallahem
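The evaluation criteria mentioned earlier (MAPE, MSE, and VAF) can be sketched as follows. This is an illustrative Python implementation using the standard formulas for each metric; the report's own experiments are carried out in MATLAB, and the flow values below are hypothetical.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.mean(np.abs((actual - predicted) / actual))

def mse(actual, predicted):
    """Mean squared error."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean((actual - predicted) ** 2)

def vaf(actual, predicted):
    """Variance accounted for, in percent: 100 * (1 - var(error) / var(actual))."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * (1.0 - np.var(actual - predicted) / np.var(actual))

# Hypothetical flow values, for illustration only.
flow_true = [120.0, 135.0, 150.0, 110.0]
flow_pred = [118.0, 140.0, 148.0, 112.0]
print(mape(flow_true, flow_pred), mse(flow_true, flow_pred), vaf(flow_true, flow_pred))
```

A lower MAPE and MSE, and a VAF closer to 100%, indicate a better-fitting model.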
1.3 Historical Background and Related Works
In the past, many authors proposed various soft computing techniques for prediction. Some of these important works are summarized in Table 1.
In the research "River Flow Forecasting Using Artificial Neural Networks" by Y. B. Dibike and D. P. Solomatln [7], a comparison is made between two types of ANN architectures using data for the Apure river basin: a multi-layer perceptron network (MLP) and a radial basis function network (RBF). Observations reveal that the performance of these networks, when compared with a conceptual rainfall-runoff model, was slightly better for this river flow forecasting problem.
A. F. Sheta and M. S. El-Sherif [20] developed two models for forecasting the Nile River flow in Egypt: a traditional linear autoregressive (AR) model and a feedforward neural network (NN) model. The network with the minimum normalized root mean square error during training and testing was selected as the optimal network for forecasting. The performance of both the AR and NN models was tested using a set of measurements recorded at Dongola station in Egypt, and a significant reduction in error was achieved with the NN model.
Ozgur Kisi [11] makes a comparison between neural network and auto-regressive (AR) models, giving both numerical and graphical comparisons for river flow prediction. Sums of square errors (SSEs) and correlation statistics were used to evaluate the models' performance in this study. Results showed that ANNs were able to produce better results than AR models when given the same data inputs and a similar model structure [17].
Forecasting daily river flows and infilling missing data were investigated using adaptive neuro-fuzzy (ANFIS) and artificial neural network (ANN) techniques by Ozgur Kisi and Ozgur Ozturk. In this work, ANN and ANFIS models are compared for predicting river flow on a daily basis, and three different applications are employed. The first part deals with the estimation of upstream and downstream station flow data separately. The second part focuses on the evaluation of missing downstream station flow data by using the upstream station data. In the third part, the flow data of the downstream station are estimated by using data from both stations.
Linda See and Stan Openshaw [19] evaluate one of many possible improvements to conventional flood prediction that can be achieved using soft computing technologies. A methodology is sketched in which the forecast data set is divided into subsets with a series of neural networks before training. These networks are then recombined using a rule-based fuzzy logic model optimized with a genetic algorithm. The methodology is demonstrated using historical time-series data from the catchment area of the Ouse River in northern England. The model predictions are evaluated using global performance statistics; moreover, specific flood-related measures are assessed and compared with benchmarks from a statistical model and naive forecasts. The overall results show that this methodology is a well-functioning, cost-effective solution that can be easily integrated into existing operational flood forecasting and alert systems.
Imen Aichouri et al. [1] use ANN to model the precipitation-runoff relationship in a catchment area with a semi-arid Mediterranean climate in Algeria. The performance of the developed neural network-based model was compared to several linear regression models using the same observed data. Accordingly, the neural network strategy can be applied to different hydrological frameworks where other models are inappropriate. Artificial neural network models show a good ability to model hydrological processes; they are useful and powerful tools for dealing with complex problems compared to other traditional models. The results show that in semi-arid Mediterranean regions, where rain and runoff are very irregular, a neural network precipitation-runoff model achieves an overall improvement compared with many other hydrological approaches. The ANN approach could be a very useful and accurate tool for solving problems in water resource studies and management.
Based on the studies mentioned earlier, ANN and AR models are successful in predicting daily river flow [3], either through direct comparisons or by creating hybrid neural networks [5]. However, no existing model compares the two models considered here: the multiple attribute model and the single attribute delay model. In this work, we compare the results produced by the two models to determine the best way of predicting the flow. Moreover, the existing models do not focus on time-series prediction, so the proposed model also predicts the flow based on a time series, i.e., on a daily basis. Both graphical and numerical comparisons are produced using MATLAB. As future work, the model can be extended to dynamic problems such as rainfall modeling, temperature forecasting, forecasting ozone concentrations [12], the influence of other gases on surface ozone [21], and determining the content of toxic gases in the air [16].
1.4 Motivation
Water management plays a vital role in daily life. If water is not managed properly, it may lead to problems like water scarcity, flooding, insufficient water for agriculture, and improper dam construction. Floods affect the lives of people living on the banks of a river and reduce the agricultural yield, which in turn affects the country's economy. According to National Geographic USA reports, by 2025 an estimated 1.8 billion people will live in areas plagued by water scarcity, with two-thirds of the world's population living in water-stressed regions [24]. Asia is the most flood-affected region, accounting for nearly 50% of flood-related fatalities in the last quarter of the 20th century. Therefore, it is very important to address the problems of water management.
1.5 Research Challenges
The data used to estimate the output plays a vital role in each technique. Having an output variable, flow rate, that depends on many attributes like temperature, humidity, pressure, and precipitation is a great challenge. Likewise, predicting the water flow from a single attribute, discharge flow, using a time series model is also challenging. As the data considered in this model has values in different ranges, the data must be normalized into an optimal range. Furthermore, the details of the two types of datasets considered are discussed where the dataset is explained. The dataset preparation phase is also crucial, as it involves the concept of model order selection. Using discharge flow as the attribute, we build a dataset having three inputs and one output following model order selection.
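The dataset-preparation step described above can be sketched as follows: a discharge series is turned into a supervised dataset where each sample has three lagged values as inputs and the next value as output. This is an illustrative Python fragment (the actual work is done in MATLAB), and the discharge values shown are hypothetical.

```python
import numpy as np

def make_delay_dataset(series, order=3):
    """Turn a 1-D series into (X, y): each row of X holds `order`
    consecutive past values, and y holds the value that follows them."""
    series = np.asarray(series, dtype=float)
    X = np.column_stack([series[i:len(series) - order + i] for i in range(order)])
    y = series[order:]
    return X, y

# Hypothetical discharge values, for illustration only.
discharge = [5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
X, y = make_delay_dataset(discharge, order=3)
# Row t of X is [y(t-3), y(t-2), y(t-1)]; y[t] is the value to predict.
```

Choosing `order` here corresponds exactly to the model order selection problem discussed next.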
Model order selection begins with choosing a model structure, which is usually the first step towards estimation. There are various possibilities for the structure: state-space, transfer functions, and polynomial forms such as ARX, ARMAX, OE, BJ, etc. Without detailed prior knowledge of the system, such as its noise characteristics and the presence of feedback, the choice of a reasonable structure may not be obvious. Also, for a given choice of structure, the order of the model needs to be specified before the corresponding parameters are estimated. The choice of model order is additionally affected by the amount of lag, and there are many ways to determine the time delay from input to output. In the proposed work, the best model order is also determined.
1.6 Research Objectives
Recently, a lot of research has focused on the modeling and identification of linear systems. However, all systems are non-linear to some extent, and the presence of non-linear distortions can cause significant errors in the process of system modeling. The fact that the behavior of most dynamic systems is non-linear has made models such as artificial neural networks, regression models, fuzzy logic, and neuro-fuzzy systems a suitable choice for the modeling and identification of many real-life systems. This research aims to investigate the use of Artificial Neural Network (ANN) and traditional model algorithms to develop a prediction/forecasting model for river flow and natural meteorological systems. The primary aim is to construct a model that can accurately solve the river flow prediction problem without the need for physical insight or extensive prior knowledge about these systems, beyond historical measurements.
CHAPTER 2
MODELING TECHNIQUES
In this chapter, we discuss two modeling techniques, the autoregression model and artificial neural networks, for solving dynamic modeling problems like river forecasting. Under traditional modeling, we discuss the types of regression models and how the autoregression model works when there are multiple attributes; under artificial neural networks, we discuss the types of neural networks and how the backpropagation learning algorithm tunes the weights. We also explain how an ANN is able to learn from data samples and derive knowledge from the raw data.
2.1 Traditional modeling
The regression model [8] is one of the most critical and extensively utilized tools in machine learning and statistics. It enables predictions from data by learning the connection between the features of the data and some observed, continuous-valued response, and it is used in various applications.

Machine learning, and more specifically the field of predictive modeling, is primarily concerned with minimizing the error rate of a model, i.e., making the most accurate predictions possible, at the expense of explainability. Linear regression was developed in the field of statistics and is studied as a model for understanding the relationship between input and output numerical variables, but it has been borrowed by machine learning. It is both a statistical algorithm and a machine learning algorithm.
2.1.1 Linear Regression Model (LR)
Linear regression is a linear model, i.e., a model that assumes a linear relationship [25] between the input variables (x) and the single output variable (y); x is the independent variable and y is the dependent variable. More specifically, y can be estimated from a linear combination of the input variables. If there is a single input variable (x), the method is called simple linear regression. When there are multiple input variables predicting the output as a combination of all the variables, it is called multiple linear regression.
In linear regression analysis, the relations are modeled using linear predictor functions whose unknown parameters are estimated from the data; such models are therefore called linear models. Commonly, the mean of y given the value of x is assumed to be a linear function of x, although the median or some other quantile of the conditional distribution of y given x may also be expressed as a linear function of x. Linear regression thus focuses on the conditional probability distribution of y given x.

Linear regression was the first form of regression analysis to be studied and used in many practical applications. These models most often find their optimal parameters by the least squares estimation approach, although there are multiple other ways in which they can be fitted, and the least squares approach can also be used to fit models that are not linear models.
The linear equation assigns one scale factor, called a coefficient and represented by beta (β), to each input value. One additional coefficient is also added, giving the line an additional degree of freedom; it is known as the intercept or the bias coefficient. For example, in a simple regression problem where only x and y are measured, the form of the model would be:
y = β0 + β1 ∗ x (2.1)
In solving real-world problems, we have many input values x, which means many coefficients are used as the number of inputs increases. In multiple regression models, because of the potentially large number of predictors, it is more efficient to use matrices to define the regression model and the subsequent analyses.
Consider the following simple linear regression function in Equation 2.2.
yi = β0 + β1xi + εi; ∀i = 1, 2, 3, ...n (2.2)
For i = 1, . . . , n, the obtained n Equations are 2.3, 2.4 and 2.5:
y1 = β0 + β1 ∗ x1 + ε1. (2.3)
y2 = β0 + β1 ∗ x2 + ε2. (2.4)
...
yn = β0 + β1 ∗ xn + εn. (2.5)
Therefore, these equations can be written in matrix format as Equation 2.6:

\[
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
=
\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}
+
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}
\tag{2.6}
\]
In short notation, the simple linear regression function reduces to Y = Xβ + ε. In the above equations, the yi, i = 1, 2, ..., n, are called the response variables, measured variables, or dependent variables. In each dataset, the decision as to which variable is modeled as the dependent variable and which as the independent variables rests on the presumption that the other variables cause the selected one. The xi, i = 1, 2, ..., n, are called the explanatory variables, input variables, or independent variables; usually a constant is incorporated as one of the regressors. The element β0 is the intercept and β1 is the slope coefficient. Multiple linear regression is the extension of simple linear regression in which there are many predictor variables (X). Nearly all real-world problems involve multiple linear regression.
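The matrix form Y = Xβ + ε leads directly to the least-squares estimate of β. The following is a minimal Python sketch (illustrative, not the report's MATLAB code) using noise-free data so that the fit is exact:

```python
import numpy as np

# Noise-free data generated from y = 2 + 3x, so the fit recovers it exactly.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * x

# Design matrix X with a leading column of ones for the intercept beta_0.
X = np.column_stack([np.ones_like(x), x])

# Least-squares solution of Y = X beta + eps.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # approximately [2.0, 3.0]
```

Adding more columns to the design matrix gives multiple linear regression with no change to the solver call.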
2.1.2 Auto Regression Model (AR)
Autoregressive (AR) and autoregressive integrated moving average (ARIMA) models are important in the stochastic modeling of hydrological data. Such models are of value in handling what might be described as the short-run problem: modeling the seasonal variability in a stochastic flow series. In recent years their importance for practical water resource problems has been overshadowed by more sophisticated types of models that are designed to preserve long-time dependencies, perhaps of the order of decades, in hydrological series. Although the long-run problem is crucial, the short-run problem, on the order of months to a few years, is also important.
An autoregression model [11] is a time series model that uses observations from previous time steps as input to a regression equation to predict the value at the next time step. Models that use time itself as a predictor are non-adaptive, or fixed until re-estimation occurs, and tend to exhibit auto-correlated residuals, so they should be avoided as the presumed model. Autoregression, by contrast, is a straightforward idea that can result in accurate forecasts on a range of time series problems. Like linear regression, it models an output value as a linear combination of input values. If the output depends only on the previous value, i.e., yt on yt−1, then the AR model is written as Equation 2.7:
yt = β0 + β1yt−1 + εt. (2.7)
where t is the measurement at a specific time. If the idea is to predict the present value based on the past two values, i.e., on yt−1 and yt−2, then the AR model is:
yt = β0 + β1yt−1 + β2yt−2 + εt. (2.8)
If the prediction is performed using yt−1 and yt−2, the model is called AR(2), since the past two values are used in prediction. Hence, in general, the AR model can be written in the form of Equation 2.9:

\[
y_t = \beta_0 + \sum_{i=1}^{n} \beta_i y_{t-i} + \varepsilon_t \tag{2.9}
\]
where β0, β1, ..., βn are parameters and yt−1, yt−2, ..., yt−n are previous values in time.
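An AR(p) model of this form can be estimated by ordinary least squares on lagged values. Below is a hedged Python sketch (illustrative only, not the report's MATLAB implementation), fitted on a synthetic series whose true parameters are known:

```python
import numpy as np

def fit_ar(series, p):
    """Least-squares estimate of (beta_0, beta_1, ..., beta_p) for an AR(p) model."""
    series = np.asarray(series, dtype=float)
    # Row t holds [1, y_{t-1}, ..., y_{t-p}], used to predict y_t.
    X = np.column_stack(
        [np.ones(len(series) - p)]
        + [series[p - i:len(series) - i] for i in range(1, p + 1)]
    )
    y = series[p:]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Synthetic series from y_t = 1 + 0.5*y_{t-1} + noise; AR(1) should recover
# coefficients near [1.0, 0.5].
rng = np.random.default_rng(0)
ys = [0.0]
for _ in range(500):
    ys.append(1.0 + 0.5 * ys[-1] + rng.normal(0.0, 0.1))
beta = fit_ar(ys, p=1)
```

Trying several values of p and comparing the resulting errors is one simple route to the model order selection discussed in Chapter 1.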
There are two broadly used linear time series models: the Autoregressive (AR) and Moving Average (MA) [9] [13] models. Combining these two, the Autoregressive Moving Average (ARMA) and Autoregressive Integrated Moving Average (ARIMA) models have been proposed in the literature [15].
The ARMA model is a tool for understanding and, possibly, forecasting future values in such a series. The model comprises two parts, an autoregressive (AR) part and a moving average (MA) part. The AR part involves regressing the variable on its own lagged (past) values. The MA part involves modeling the error term as a linear combination of error terms occurring contemporaneously and at various times in the past. This model is usually referred to as ARMA(p, q), as in Equation 2.12, where p is the order of the autoregressive part, as in Equation 2.10, and q is the order of the moving average part, as in Equation 2.11. The notation AR(p) refers to the autoregressive model of order p. The AR(p) model is written as:
\[
y_t = c + \sum_{i=1}^{p} \beta_i y_{t-i} + \varepsilon_t. \tag{2.10}
\]
where βi, i = 1, 2, ..., p, are parameters, c is a constant, and εt is noise. Some constraints are required on the values of the parameters so that the model remains stationary. The notation MA(q) refers to the moving average model of order q:
\[
y_t = \mu + \varepsilon_t + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}. \tag{2.11}
\]
where θ1, θ2, ..., θq are parameters, µ is the expectation, and εt−i, i = 1, 2, ..., q, are error terms or noise. Therefore, the notation ARMA(p, q) refers to the model with p autoregressive terms and q moving-average terms; this model contains the AR(p) and MA(q) models:
\[
y_t = c + \varepsilon_t + \sum_{i=1}^{p} \beta_i y_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}. \tag{2.12}
\]
ARMA is suitable when a system is a function of a series of unobserved shocks (the MA, or moving average, part) as well as its own past behavior, for example in applications like stock price prediction.
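To make the ARMA(p, q) recursion concrete, the sketch below simulates an ARMA(1, 1) series directly from the special case of Equation 2.12 with p = q = 1. This is illustrative Python (not part of the report's MATLAB work), and the parameter values are arbitrary:

```python
import numpy as np

def simulate_arma11(c, beta1, theta1, n, seed=0):
    """Generate n values of y_t = c + beta1*y_{t-1} + eps_t + theta1*eps_{t-1},
    the ARMA(1, 1) special case of the general ARMA(p, q) equation."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, 1.0, size=n)  # the unobserved shock series
    y = np.zeros(n)
    y[0] = c + eps[0]
    for t in range(1, n):
        y[t] = c + beta1 * y[t - 1] + eps[t] + theta1 * eps[t - 1]
    return y

# Arbitrary illustrative parameters; |beta1| < 1 keeps the series stationary.
series = simulate_arma11(c=0.5, beta1=0.6, theta1=0.3, n=200)
```

The stationarity constraint on the AR coefficient mirrors the parameter constraints mentioned for Equation 2.10.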
2.2 Artificial Neural Networks
ANNs are computing systems inspired by biological neural networks that learn by examples. These systems have no prior notion of any objects, images, or data; they are quantitative models which learn to associate input and output patterns adaptively with the use of learning algorithms. ANNs can help in making decisions about examples in the same way that decision trees can, and just as with decision trees, the examples first need to be converted to a set of feature values.
2.2.1 Neural Network Components
A neural network structure is designed using concepts from the biological nervous system. An ANN is a collection of connected units or nodes, called artificial neurons, each of which performs some function; the network of these nodes is loosely modeled on the human brain. Like the human nervous system, the ANN is organized in layers: an input layer, a hidden layer, and an output layer, where training and testing of the data take place through the neurons in each layer. An example of a three-layer perceptron with three inputs, one output, and a hidden layer of four neurons is shown in Figure 2.1. The artificial neurons (activation units) present in each layer receive a signal, process it, and forward it to the next neuron; this process continues until the output layer is reached. This type of network, in which data flows in one direction (forward), is known as a feed-forward network. The data is divided into training and testing parts, and validation is performed on a sample of the data to examine whether the system has learned from the examples. In general, the function of each layer is:
• Input Layer - It contains those units (artificial neurons) that receive input from the
outside world through which the network will learn, recognize, or otherwise process.
• Output Layer - Contains the units that present the network's response, reflecting what it has learned about a task.
• Hidden Layer - These units are located between the input and output layers. The hidden layer's job is to transform the input into something the output units can use, by way of the activation units. Most neural networks are fully connected: each hidden neuron is connected to every neuron in the previous layer (input) and in the next layer (output).
Figure 2.1: Three-layer perceptron
In ANN implementations, the signal passed between artificial neurons is a real number, and the output of each neuron is computed by a non-linear function of the sum of its inputs. Neurons and connections typically have weights that are adjusted as learning proceeds. ANNs may also have a threshold, such that a signal is sent only if the aggregate signal crosses that threshold. The signal emanating from the output node(s) is then the network's solution to the input problem. In this way, an ANN is designed to solve problems in a manner analogous to the human brain. An ANN can be implemented as a single-layer perceptron, which has a single layer of output nodes; the inputs are fed directly to the outputs through a series of weights. A multi-layer perceptron, as shown in Figure 2.1, is trained in a feed-forward manner with a variety of learning techniques, the most popular being back propagation. Each neuron in one layer has direct connections to the neurons of the subsequent layer, as shown in the figure. In many applications, the units of these networks apply a sigmoid function as an activation function; sign and step activation functions are also commonly used. A multi-layer neural network can compute a continuous output instead of a step or sign function. A common choice is the so-called logistic function:
f(x) = 1 / (1 + e^(−x))   (2.13)
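The logistic function of Equation 2.13 is a one-liner; a minimal sketch:

```python
import math

def logistic(x):
    """Logistic (sigmoid) activation of Equation 2.13: f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

# The output is continuous and bounded in (0, 1), with f(0) = 0.5
midpoint = logistic(0.0)
```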
2.2.1.1 Types of Neural Networks
There are many types of neural networks. Some of the types commonly used in machine learning are:
• FeedForward Neural Network (FFNN): The FFNN is one of the simplest forms of ANN, in which the data moves in a single direction [23]. The data flows in through the input nodes and comes out at the output nodes. This type of neural network may or may not have hidden layers. In simple terms, the data can only move forward; there is no back propagation, and usually a classifying activation function is used.
• Radial Basis Function Neural Network (RBF): An RBF network considers the distance of a point with respect to a center [10]. RBF networks have two layers: in the inner layer, the features are combined with the radial basis function, and the outputs of these units are then combined to compute the output of the network.
• Recurrent Neural Network (RNN): The Recurrent Neural Network [10] works on the principle of saving the output of a layer and feeding it back to the input to help predict the outcome of the layer. Here, the first layer is formed like a feed forward neural network, with the product of the sum of the weights and the features. Once this is computed, the recurrent process starts: from one time step to the next, each neuron retains some of the information it had in the previous time step. This makes each neuron behave like a memory cell while performing computations. In this process, we let the neural network work on forward propagation and remember what information it needs for future use. If the prediction is wrong, the learning rate and error correction are used to make small changes, so that the network gradually works towards making the right prediction during backpropagation.
• Convolutional Neural Network (CNN): CNNs are similar to feed forward neural networks, in that the neurons have learnable weights and biases [18]. Their main applications have been in signal and image processing, and they have largely taken over from classical techniques, such as those in OpenCV, in the field of computer vision.
2.2.2 Network Structure
The importance of the network design, that is, the arrangement of neurons and synapses, is not to be underestimated. The number of neurons and the number of hidden layers are very important for extracting the hidden relations in the data.
2.2.2.1 Design of Network Structure
Neural network architecture can be defined as the design that executes the functions
of a neural network. All network structures are based on the concepts of neurons, inter-
connections and transfer functions. Network architectures are organized in different ways
according to their applications. The most common architecture is the multi-layer feed
forward network.
A neural network structure should be designed so that it can solve the problems of the application. All the ANN components should be decided on, and the resulting architecture should be able to solve the problems. The main aspects considered in designing an ANN structure are:
• Deciding how many neurons each layer in the architecture should contain.
• Deciding how many hidden layers there should be.
• Deciding which transfer (activation) function to use at each layer.
• Determining the weights for the inputs.
2.2.3 Backpropagation Learning algorithm
Backpropagation was proposed in the 1970’s as a general optimization method acting
for performing automatic differentiation of complex nested functions. It is the generaliza-
tion of the Widrow-Hoff learning rule to multiple-layer networks and nonlinear differen-
tiable transfer functions [14]. Its a procedure to repeatedly adjust the weights to minimize
the difference between actual output and desired output. Also, an algorithm used in artifi-
cial neural networks to calculate the error contribution of each neuron after a batch of data
is being processed. This is used by adapting optimization algorithm to adjust the weight
21
of each neuron, completing the learning process for that case. Hidden Layers are neu-
ron nodes stacked in between inputs and outputs, allowing neural networks to learn more
complicated features.
• Backpropagation requires a known, desired output for each input value, therefore, it
is called supervised learning method.
• The weights that are changed when we backpropagate to the input layer follow var-
ious mathematical functions.
• Choose an optimal function to calculate weights that are to be given to the input
layer in each iteration so that the error rate is minimized.
Consider a neural network made up of n inputs connected to a single output unit through n hidden neurons. The output of the network is determined by calculating a weighted sum of its inputs and comparing this value with a threshold. If the net input is greater than the threshold, the output is 1; otherwise, it is 0. First, if we have m input data (x1, x2, ..., xm), we call these m features. A feature is just one variable we consider as having an influence on a specific outcome. Secondly, when we multiply each of the m features by a weight (w1, w2, ..., wm) and sum them all together, this is a dot product. Mathematically, we can summarize the computation performed by the input unit as Equation 2.14.
WX = w1x1 + w2x2 + ... + wnxn = Σ_{i=1}^{n} w_i x_i   (2.14)

where w1, w2, ..., wn are the weights given to the input layer and x1, x2, ..., xn are the inputs at the input layer.
Next, perform the dot product of the input data set X: x1, x2, ..., xm with the weight set W^1: w^1_1, w^1_2, ..., w^1_n; the weights have superscript 1 since they connect to the first hidden neuron h^1_1 of the hidden layer h^1. The dot-product summation, with a bias added, gives the result z. So, at each hidden layer a bias is added to the dot product of the weights and inputs, as in Equation 2.15.
z = Σ_{i=1}^{n} w_i x_i + b   (2.15)
Now z is fed to the activation function. The activation function can be an identity, sign, or step function. The output from the first hidden neuron in the hidden layer is then h^1_1 = f(z). The same step is repeated for all n hidden neurons, giving the n hidden outputs. The outputs from all n hidden neurons (h^1_1, h^1_2, ..., h^1_n) are then used as inputs to calculate the final output. For the final output, we perform a dot product of the hidden layer outputs h^1: (h^1_1, h^1_2, h^1_3, ..., h^1_n) with the hidden layer weights W^h. As the final output is a single value, the weight set is W^h: (w^h_1, w^h_2, ..., w^h_n), with n weights for the n hidden layer inputs. Then the bias is added to obtain the result z:
z = Σ_{i=1}^{n} w^h_i h^1_i + bias   (2.16)
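The two steps above (Equations 2.15 and 2.16) amount to one forward pass through a network like the one in Figure 2.1. A minimal sketch with toy weights (the weight values are arbitrary, purely for illustration):

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, W1, b1, w_h, b_out):
    """Forward pass for a 3-input, 4-hidden-neuron, 1-output perceptron:
    h^1_i = f(sum_j w^1_ij * x_j + b_i)   (Equation 2.15, per hidden neuron)
    z     = sum_i w^h_i * h^1_i + bias    (Equation 2.16)
    y     = f(z)                          (output activation)"""
    h = [logistic(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    z = sum(w * hi for w, hi in zip(w_h, h)) + b_out
    return logistic(z)

x = [0.5, -1.0, 2.0]                      # three inputs
W1 = [[0.1, 0.2, -0.1],                   # one weight row per hidden neuron
      [0.3, -0.2, 0.1],
      [-0.1, 0.1, 0.2],
      [0.2, 0.1, -0.3]]
b1 = [0.0, 0.1, -0.1, 0.0]
w_h = [0.5, -0.4, 0.3, 0.2]               # hidden-to-output weights
y = forward(x, W1, b1, w_h, b_out=0.1)
```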
Now the output z is given to an activation function f(z), so that the output at the output layer is y = f(z). After the output is calculated, the error rate is written as Equation 2.17.

E_total = (1/2) [target − output]^2   (2.17)
Now the chain rule is applied to adjust the weights at each layer using derivatives. When the partial derivative of the total error is taken with respect to a term that the output does not depend on, the corresponding contribution of (1/2)[target − output]^2 becomes zero, since the derivative of a constant is zero. Using the delta rule, the weights are adjusted at each layer. The delta rule is a gradient descent learning rule for updating the weights of the inputs to artificial neurons in a single-layer neural network. So, for a neuron j with activation function g(x), the delta rule for j's ith weight w_ji is written as Equation 2.18.

∆w_ji = α (t_j − y_j) g′(h_j) x_i   (2.18)

where α is a small constant called the learning rate, g(x) is the neuron's activation function, t_j is the target output, h_j = Σ_i x_i w_ji is the weighted sum of the neuron's inputs, y_j = g(h_j) is the actual output, and x_i is the ith input.
In simplified form, the delta rule for a neuron is:

∆w_ji = α (t_j − y_j) x_i   (2.19)

If mse > θ (where θ is a chosen threshold value), the data is retrained; otherwise training stops. The weights are adjusted until the desired error rate is reached.
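The simplified delta rule of Equation 2.19 can be sketched for a single linear neuron; repeated updates drive the output toward the target (the inputs and target below are made up for illustration):

```python
def delta_update(w, x, target, alpha=0.1):
    """One delta-rule step (Equation 2.19): w_i <- w_i + alpha*(t - y)*x_i,
    where y = sum_i w_i * x_i is the neuron output (identity activation)."""
    y = sum(wi * xi for wi, xi in zip(w, x))
    return [wi + alpha * (target - y) * xi for wi, xi in zip(w, x)]

w = [0.0, 0.0]
for _ in range(50):                     # repeated updates shrink the error
    w = delta_update(w, [1.0, 2.0], target=3.0)
output = w[0] * 1.0 + w[1] * 2.0        # approaches the target 3.0
```

With this learning rate and input, the error halves at every step, so the output converges quickly to the target.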
CHAPTER 3
SYSTEM DESIGN AND IMPLEMENTATION
In this chapter, we describe two modeling techniques, ANN and AR. These two techniques are commonly used for prediction/forecasting in real-life problems, and the Black River data is used to evaluate the proposed models. The Black River, used here for analysis and forecasting, is a tributary of Lake Erie, about 19 km long, in northern Ohio in the United States, as shown in Figure 3.1; via Lake Erie, the Niagara River and Lake Ontario, it is part of the watershed of the St. Lawrence River, which flows to the Atlantic Ocean. The Black River drains an area of 470 square miles (1,217 square kilometers) [6].
Figure 3.1: Black river at Ohio state
The proposed model structure for implementing the multiple attribute model and the single attribute delay model is shown in Figure 3.2. For convenience, and to obtain accurate results, the process is sub-categorized into three phases, as in Figure 3.2: data preprocessing, model design, and validation testing. In the preprocessing stage, data normalization is used to bring the available dataset into a range that fits the problem. The processed data is fed as input in the model development stage, and finally, we test the results acquired from the previous stage.
Figure 3.2: Proposed Model development procedure
The dataset consists of 6 attributes, collected during the years 2011, 2012 and 2013 at site USGS 04200500. The attributes are temperature, dissolved oxygen, pH of the water, specific conductance, turbidity and the discharge rate [24]. After visualizing the data in the dataset, it is evident that many outliers are present, as shown in Figure 3.3 and Figure 3.4.
Figure 3.3: Data collected between 2011 and 2013 as inputs
Figure 3.4: Discharge rate collected between 2011 and 2013
3.1 Data Normalization
The dataset considered is historical data with six attributes. These attributes have a wide range of scales. Since the attributes represent different features, such as temperature, pH, dissolved oxygen, and turbidity, the dataset has a wide range of values with outliers, so data normalization is needed to obtain accurate prediction results with the proposed model. Data whose values are not on the same scale may not give the desired output. In the data normalization stage, all the variables are transformed to a specific range, or onto the same scale. We can normalize towards a more linear, more robust relationship using the mean and standard deviation of each attribute, which amounts to standardizing the numeric attributes: data standardization is the technique of rescaling one or more attributes so that they have a mean value of 0 and a standard deviation of 1.
In the proposed model, normalization is performed by a function called FeatureNormalize(X). FeatureNormalize(X) returns a normalized version of X, where the standard deviation of each feature is 1 and the mean value of each feature is 0. This is often a good preprocessing technique when working with learning algorithms. The function returns M, S, and norm as outputs. First, for each feature dimension, the mean of the feature is computed and subtracted from the dataset, and the mean value is stored in M. Each feature's standard deviation is also computed and stored in S, as in Equations 3.1, 3.2 and 3.3. Next, each feature value is normalized using the stored M and S, and the normalized values are stored in the variable norm.
M = mean(X) (3.1)
S = std(X) (3.2)
norm = (X − M) / S   (3.3)
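A sketch of FeatureNormalize(X) following Equations 3.1–3.3 (an illustrative reimplementation mirroring the description above, not the report's MATLAB code):

```python
import statistics

def feature_normalize(X):
    """Return M (per-feature means), S (per-feature standard deviations) and
    norm, the version of X where each feature has mean 0 and std 1."""
    cols = list(zip(*X))                       # one tuple per feature column
    M = [statistics.mean(c) for c in cols]     # Equation 3.1
    S = [statistics.stdev(c) for c in cols]    # Equation 3.2
    norm = [[(v - m) / s for v, m, s in zip(row, M, S)]
            for row in X]                      # Equation 3.3
    return M, S, norm

M, S, Xn = feature_normalize([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
```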
3.2 Model Design
In the model development stage, after preprocessing the dataset, the data is fed as input to the proposed models, namely ANN and AR. We have developed two models, a multiple attribute model and a single attribute delay model, for predicting the discharge flow. In the multiple attribute model, more than one attribute is fed as input to the algorithm, whereas in the single attribute delay model, the discharge flow is predicted from a single attribute based on model order selection.
• Multiple Attribute Model:
The data used for this model is the data collected in 2011, 2012 and 2013. The collected data has several attributes: temperature, specific conductance, dissolved oxygen, pH, turbidity and discharge rate. In this model, the discharge rate depends on the other 5 attributes. The model structure for the multiple attribute model is shown in Figure 3.5; the inputs to the model are the five attributes, and the output is the discharge rate.
Figure 3.5: System model based on Multiple Attributes
• Single Attribute Delay Model:
This model is based on a time series model, which uses the data collected as an average value per day. In this case, only the daily flow rate is considered. A dataset is designed using model order selection, with some delay in each column; training is then performed using both techniques, and the flow rate is tested. To find y(t) for a particular day, the model uses the input-output sequence of past measurements: y(t−1), the previous day's flow; y(t−2), the flow 2 days before; and y(t−3), the flow 3 days before, as shown in Figure 3.6. Hence, the values of both the inputs and the output are discharge rate values.
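Constructing the single attribute delay dataset from a daily flow series can be sketched as follows (the flow values are dummy numbers for illustration):

```python
def make_delay_dataset(flow, order=3):
    """Pair the lagged flows [y(t-1), y(t-2), y(t-3)] (inputs) with y(t)
    (output) for every day t that has a full history of `order` days."""
    rows = []
    for t in range(order, len(flow)):
        inputs = [flow[t - k] for k in range(1, order + 1)]
        rows.append((inputs, flow[t]))
    return rows

rows = make_delay_dataset([10.0, 12.0, 11.0, 13.0, 14.0], order=3)
```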
3.2.1 Model Structure Selection
Model order selection is the process of determining the order of the model and the input delay. There are a few techniques for selecting and configuring the optimal model structure, lags and delays. Estimating a model from measurement data requires choosing a model structure (for example, state-space or transfer function) and its order (e.g., number of zeros) ahead of time. This choice is affected by prior knowledge about the system being modeled, but it can also be motivated by an analysis of the data itself. Choosing a model structure is usually the first step towards its estimation.
There are various possibilities for the structure: state-space, transfer functions, and polynomial forms such as ARX, ARMAX, OE, BJ, etc. Likewise, for a given choice of structure, the order of the model should be specified before estimating the parameters. The choice of model order is also affected by the amount of delay. Different options are available for determining the time delay from input to output, such as using a non-parametric estimate of the impulse response, or using the state-space model estimator N4SID with many different orders and finding the delay of the 'best' one. The best model order is then chosen for predicting the output in the single attribute delay model.
3.3 Evaluation Criteria
To check the performance of each model, we adopted evaluation parameters such as the mean square error (MSE), the mean absolute percentage error (MAPE) and the variance accounted for (VAF), as in Equations 3.4, 3.5 and 3.6.
The mean squared error (MSE) in Equation 3.4 measures the average of the squares of the errors or deviations, i.e., the difference between the estimator and what is estimated; the difference occurs because of randomness. The MSE is a loss function corresponding to the expected value of the squared (quadratic) error loss. It is a measure of the quality of an estimator, is always non-negative, and values closer to zero are better.
The mean absolute percentage error (MAPE) in Equation 3.5, also known as the mean absolute percentage deviation (MAPD), is a measure of the prediction accuracy of a forecasting method, for example in time series prediction.
The variance accounted for (VAF) in Equation 3.6 measures how much of the variance of the observed output is explained by the model. It is based on the variance, the expectation of the squared deviation of a random variable from its mean, which measures how far a set of (random) numbers is spread out from its mean value.
An error e is computed as the difference between the actual and predicted values y_i and ŷ_i, as in Equations 3.4, 3.5 and 3.6. These evaluation criteria are also provided by the data mining software tool Weka (Waikato Environment for Knowledge Analysis) [2], a popular machine learning tool developed in Java. The modeling techniques are implemented in MATLAB.
MSE = (1/n) Σ_{i=1}^{n} (y_i − ŷ_i)^2   (3.4)

MAPE = (1/n) Σ_{i=1}^{n} |y_i − ŷ_i| / y_i × 100%   (3.5)

VAF = [1 − var(y − ŷ) / var(y)] × 100%   (3.6)
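Equations 3.4–3.6 translate directly into code; a sketch using the population variance for VAF (the normalization choice cancels in the ratio as long as it is consistent), with made-up flow values for illustration:

```python
import statistics

def mse(y, yhat):
    """Equation 3.4: mean of the squared prediction errors."""
    return sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y)

def mape(y, yhat):
    """Equation 3.5: mean absolute percentage error (y_i must be nonzero)."""
    return 100.0 * sum(abs(a - b) / abs(a) for a, b in zip(y, yhat)) / len(y)

def vaf(y, yhat):
    """Equation 3.6: percentage of the variance of y accounted for."""
    resid = [a - b for a, b in zip(y, yhat)]
    return 100.0 * (1.0 - statistics.pvariance(resid) / statistics.pvariance(y))

actual = [100.0, 120.0, 90.0, 110.0]
predicted = [98.0, 121.0, 95.0, 108.0]
scores = (mse(actual, predicted), mape(actual, predicted), vaf(actual, predicted))
```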
The above parameters MSE, VAF and MAPE are measured for both the single attribute delay model and the multiple attribute model using the ANN and AR models, and the results are evaluated.
CHAPTER 4
IMPLEMENTATION AND RESULTS
We have designed two models to predict the discharge flow rate of the Black River using the ANN and AR algorithms, and computed results for the single attribute delay model and the multiple attribute model. The dataset was preprocessed to handle the outliers using the normalization technique discussed in Chapter 3. The data is divided into 70% training and 30% testing for both techniques.
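The 70/30 split can be sketched as a simple chronological cut (the report does not state the exact split indices, so the rounding here is an assumption):

```python
def split_70_30(data):
    """Split a dataset chronologically into 70% training and 30% testing."""
    cut = int(len(data) * 0.7)
    return data[:cut], data[cut:]

train, test = split_70_30(list(range(919)))   # 919 daily samples, as in the text
```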
4.1 AR Implementation for Single Attribute delay Model
An AR model is implemented for predicting the flow of the Black River using a single attribute, the discharge flow. The AR model's performance is estimated using the MSE, minimizing the error between the actual flow and the predicted flow. A model order of 3 is chosen in implementing this model. The developed AR model is given by Equation 4.1. The results for the actual and predicted values are shown in Figure 4.1, with the actual flow as a solid line and the estimated flow as a dotted line, based on the AR(3) model for training and testing of the Black River flow.
y_t = 0.0012 + 0.9963 y_{t−1} − 0.3709 y_{t−2} + 0.0012 y_{t−3}   (4.1)
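The fitted AR(3) model of Equation 4.1 can be applied as a one-step-ahead predictor; this is a direct transcription of the reported coefficients, with inputs assumed to be flows on the model's normalized scale:

```python
def ar3_predict(y_tm1, y_tm2, y_tm3):
    """One-step-ahead flow prediction with the AR(3) coefficients of
    Equation 4.1: y_t = 0.0012 + 0.9963*y(t-1) - 0.3709*y(t-2) + 0.0012*y(t-3)."""
    return 0.0012 + 0.9963 * y_tm1 - 0.3709 * y_tm2 + 0.0012 * y_tm3

def ar3_forecast(flow, steps):
    """Iterate the predictor to forecast several days ahead from the last
    three observed flows."""
    history = list(flow)
    for _ in range(steps):
        history.append(ar3_predict(history[-1], history[-2], history[-3]))
    return history[len(flow):]
```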
4.2 AR Implementation for Multiple Attribute Model
For the AR algorithm, the number of training examples used is 645, for 1000 iterations. The developed AR model for the multiple attribute model is given by Equation 4.2.

y(k) = 0.0047 − 0.0227 temp + 0.5918 cdt − 0.6826 oxygen + 0.3425 pH − 0.0047 turbidity   (4.2)
The experiment is executed over a number of iterations, varying the error rate, model order, number of iterations, and the training and testing samples. The best results are recorded after tuning the parameters that affect the performance of the model. The parameters MSE, VAF and MAPE are recorded for this model. The results for the actual and predicted values for this model are shown in Figure 4.2.
Figure 4.1: Actual and Predicted flow using AR model for Single Attribute delay Model
Figure 4.2: Actual and Predicted flow using AR for Multiple Attribute Model
4.3 ANN Implementation for Single Attribute delay Model
A feed forward neural network is implemented on the daily values. The daily dataset is of size 919 × 4. It is built based on model order selection with an order of 3, i.e., with a delay of 0. One hidden layer is not sufficient to extract the knowledge in the data for this model, so the model is executed with 2 hidden layers, 3 inputs and 1 output layer. The neural network structure used to predict the daily values is shown in Table 2. Other attributes, such as the performance goal, Marquardt adjustment parameter, maximum validation failures, and minimum performance gradient, are defined in the training and testing phase of the network. For the daily dataset of this model, the results are as shown in Figure 4.3 and Figure 4.5.
Table 2: NN Parameters for Single Attribute delay Model
Inputs 3
Hidden Layers 2
Hidden neurons 20,1
Maximum epochs 1000
4.4 ANN Implementation for Multiple Attribute Model
In the multiple attribute model we try to predict the discharge rate using the other five attributes. The size of this dataset is 919 × 6. As shown in Table 3, the NN structure has 30 activation units in the first hidden layer and 20 activation units in the second hidden layer, which is considered optimal. When the network has only one hidden layer with a small number of hidden neurons, the performance of the NN is not efficient; therefore, more than one hidden layer is used to predict the output. Figure 4.4 shows an example structure with 1 input layer, 2 hidden layers and 1 output layer, with 10 neurons in the first hidden layer and 1 neuron in the second hidden layer. The numbers of hidden neurons in the hidden layers were initially chosen arbitrarily; after a number of runs, it was observed that 30 neurons in hidden layer 1 and 20 neurons in hidden layer 2 are optimal for achieving the best MSE. In the first hidden layer the transfer function used is tansig, and in the second layer the transfer function is purelin. The sum of the weighted inputs and the bias forms the input to these transfer functions.
The results for the actual and predicted values using ANN in this model are shown in Figure 4.6. When the experiment is run for 5 iterations, the convergence graph plotted between epochs and MSE for the 5 iterations is shown in Figure 4.7.

Figure 4.3: NN convergence curve over 1000 epochs for Single Attribute delay Model

Accordingly, for the single attribute delay model, the best values of the parameters MSE, VAF and MAPE obtained with the ANN and AR techniques are compared in Table 4. It is observed that ANN is the best at predicting the flow rate with the single attribute delay model. For the multiple attribute model, the best values of MSE, VAF and MAPE for the two techniques are compared in Table 5. The actual and predicted flow values for both training and testing cases are shown for the ANN and AR techniques. When MSE, VAF and MAPE are compared, ANN is observed to predict the flow values better than AR for both models.
Figure 4.4: Neural Network Structure for Multiple Attribute Model
Table 3: NN Parameters for Multiple Attribute Model
Inputs 5
Hidden Layers 2
Hidden neurons 30,20
Maximum epochs 1000
Figure 4.5: Actual and Predicted flow using ANN for Single Attribute delay Model
Table 4: Results for ANN and AR models of the Black Water River for Single Attribute
delay Model
          ANN                  AR
      Training   Testing   Training   Testing
VAF    68.759    61.259     59.262    61.513
MSE    0.2641    0.5287     0.3445    0.5235
MAPE   0.2173    0.2841     0.2952    0.2907
Figure 4.6: Actual and Predicted flow using ANN for Multiple Attribute Model
Table 5: Results for ANN and AR models of the Black Water River for Multiple Attribute
Model
          ANN                  AR
      Training   Testing   Training   Testing
VAF    97.351    91.65      62.80     33.246
MSE    0.037     0.128      0.1354    1.0897
MAPE   0.115     0.135      0.333     0.299
Figure 4.7: Neural Network convergence over 5 iterations in Multiple Attribute Model
CHAPTER 5
CONCLUSION AND FUTURE WORK
In this study, a detailed comparison is presented between the ANN and AR models in predicting the water flow of the Black River, Ohio. The comparisons reveal that ANN gives better predictions, besides having many other advantages. As the models are implemented on both the historical data and the time series based data, it is observed that ANN can be adapted for predicting the discharge rate. The performance of the two proposed models was compared for both training and testing cases. In addition, the comparison is made considering parameters such as MSE, VAF, and MAPE.
As part of future work, a comparison can be made in predicting weekly and monthly values. Other modeling techniques, such as genetic algorithms and adaptive neuro-fuzzy inference systems, can also be implemented. In addition, a hybrid neural network can be implemented in which the weights are tuned by a genetic algorithm and given to the neural network for better training and testing. Other parameters for measuring the performance, such as correlation, regression and accuracy, can also be implemented. Further, the model can be extended to solve other real-world problems like ozone layer depletion, rainwater problems, etc.
REFERENCES
[1] AICHOURI, I., HANI, A., BOUGHERIRA, N., DJABRI, L., CHAFFAI, H., AND
LALLAHEM, S. River flow model using artificial neural networks. Energy Procedia
74 (2015), 1007 – 1014. The International Conference on Technologies and Materi-
als for Renewable Energy, Environment and Sustainability TMREES15.
[2] ALJAHDALI, S., SHETA, A. F., AND DEBNATH, N. C. Estimating software ef-
fort and function point using regression, support vector machine and artificial neural
networks models. In 2015 IEEE/ACS 12th International Conference of Computer
Systems and Applications (AICCSA) (Nov 2015), pp. 1–8.
[3] ATIYA, A. F., EL-SHOURA, S. M., SHAHEEN, S. I., AND EL-SHERIF, M. S. A
comparison between neural-network forecasting techniques-case study: river flow
forecasting. IEEE Transactions on Neural Networks 10, 2 (Mar 1999), 402–409.
[4] BOX, G. E. P., AND JENKINS, G. Time Series Analysis, Forecasting and Control.
Holden-Day, Incorporated, 1990.
[5] CHEN, X., CHAU, K., AND BUSARI, A. A comparative study of population-based
optimization algorithms for downstream river flow forecasting by a hybrid neural
network model. Engineering Applications of Artificial Intelligence 46 (2015), 258 –
268.
[6] WIKIPEDIA CONTRIBUTORS. Black River (Ohio) — Wikipedia, The Free Encyclopedia, 2018. [Online; accessed 2-May-2018].
[7] DIBIKE, Y., AND SOLOMATINE, D. River flow forecasting using artificial neural
networks. Physics and Chemistry of the Earth, Part B: Hydrology, Oceans and At-
mosphere 26, 1 (2001), 1 – 7.
[8] DOAN, T., AND KALITA, J. Selecting machine learning algorithms using regres-
sion models. In 2015 IEEE International Conference on Data Mining Workshop
(ICDMW) (Nov 2015), pp. 1498–1505.
[9] BOX, G. E. P., AND JENKINS, G. Time series analysis, forecasting and control.
[10] GRAVES, A., R. MOHAMED, A., AND HINTON, G. Speech recognition with deep
recurrent neural networks. In 2013 IEEE International Conference on Acoustics,
Speech and Signal Processing (May 2013), pp. 6645–6649.
[11] KISI, O., AND ÖZTÜRK, Ö. Forecasting river flows and estimating missing data using soft computing techniques. International Congress on River Basin Management (2007).
[12] KOVAC-ANDRIC, E., SHETA, A., FARIS, H., AND GAJDOSIK, M. S. Forecasting ozone concentrations in the east of Croatia using nonparametric neural network models. Journal of Earth System Science 125, 5 (Jul 2016), 997–1006.
[13] HIPEL, K. W., AND MCLEOD, A. I. Time series modelling of water resources and environmental systems.
[14] LI, J., CHENG, J., SHI, J., AND HUANG, F. Brief introduction of back propagation (BP) neural network algorithm and its improvement. In Jin, D., and Lin, S. (eds), Advances in Computer Science and Information Engineering, Advances in Intelligent and Soft Computing, vol. 169. Springer, Berlin, Heidelberg, 2012.
[15] MCDONALD, S., COLEMAN, S., MCGINNITY, T. M., AND LI, Y. A hybrid fore-
casting approach using arima models and self-organising fuzzy neural networks for
capital markets. In The 2013 International Joint Conference on Neural Networks
(IJCNN) (Aug 2013), pp. 1–7.
[16] MONDAL, B., MEETEI, M., DAS, J., CHAUDHURI, C. R., AND SAHA, H. Quan-
titative recognition of flammable and toxic gases with artificial neural network using
metal oxide gas sensors in embedded platform. Engineering Science and Technology,
an International Journal 18, 2 (2015), 229 – 234.
[17] NIGAM, R., AND NIGAM, S. M. S. K. The river runoff forecast based on the modeling of time series. Russian Meteorology and Hydrology 39, 11 (Nov 2014), 750–761.
[18] O'SHEA, K., AND NASH, R. An introduction to convolutional neural networks. arXiv preprint arXiv:1511.08458 (2015).
[19] SEE, L., AND OPENSHAW, S. Applying soft computing approaches to river level
forecasting. Hydrological Sciences Journal 44, 5 (1999), 763–778.
[20] SHETA, A. F., AND EL-SHERIF, M. S. Optimal prediction of the nile river flow
using neural networks. In Neural Networks, 1999. IJCNN ’99. International Joint
Conference on (1999), vol. 5, pp. 3438–3441 vol.5.
[21] SHETA, A. F., AND FARIS, H. Influence of nitrogen-di-oxide, temperature and rel-
ative humidity on surface ozone modeling process using multigene symbolic regres-
sion genetic programming. International Journal of Advanced Computer Science
and Applications 6, 6 (2015).
[22] SINGH, A., THAKUR, N., AND SHARMA, A. A review of supervised machine
learning algorithms. In 2016 3rd International Conference on Computing for Sus-
tainable Global Development (INDIACom) (March 2016), pp. 1310–1315.
[23] TANG, Y., AND SALAKHUTDINOV, R. R. Learning stochastic feedforward neu-
ral networks. In Advances in Neural Information Processing Systems 26, C. J. C.
Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger, Eds. Curran
Associates, Inc., 2013, pp. 530–538.
[24] WATER-RESOURCES DATA. United States Geological Survey, USGS water data for the nation, 1994.
[25] ZOU, K. H., TUNCALI, K., AND SILVERMAN, S. G. Correlation and simple linear
regression. Radiology 227, 3 (2003), 617–628. PMID: 12773666.