1807 wind speed forecasting using fully recurrent neural network in wind power plants

Wind Speed Forecasting Using Fully Recurrent Neural Network in Wind Power Plants

Hong Liangyou, Jiang Dongxiang, Huang Qian, Ding Yongshan

Department of Thermal Engineering, Tsinghua University, Beijing 100084, China

Abstract:

Wind speed forecasting is very important to the operation of wind power plants and power systems. Because wind speed is nonlinear and non-stationary, forecasting it is a difficult task. This paper deals with short-term wind speed forecasting based on historical time-series meteorological data. A recurrent neural network called the PRNN (pipelined recurrent neural network) is adopted to make the prediction. It is a NARMAX ANN model (nonlinear autoregressive moving average artificial neural network with external inputs). First, the raw wind speed data are made stationary using the logarithm difference method. Second, the phase space reconstruction method of chaos theory is adopted to determine the embedding dimension and time delay. Third, based on the selected embedding dimension and time delay, we develop a PRNN+TDL (tapped-delay-line filter) ANN model. Several on-line learning and optimization algorithms are used to train the network. Finally, the ANN prediction is corrected based on the statistical characteristics of wind speed. The model is tested at a wind power plant over a one-year period. The raw data interval is 1 minute, and the prediction horizon ranges from about five minutes to one hour. The results show that the proposed method effectively improves forecasting accuracy: the average prediction error is within 10%. In addition, the model parameters strongly affect prediction precision, and different types of wind data need different parameters.

Keywords: Wind speed forecasting, time series, recurrent neural networks, real time learning

1. Introduction

The wind speed signal is statistically non-stationary and nonlinear, due to highly complex interactions and the contribution of various meteorological parameters. Many prediction methods have been suggested in the literature. The simplest is the persistence forecast: the prediction is set equal to the last available measurement. Other methods include the ARMAX model, Kalman filters, artificial neural networks (ANNs), fuzzy logic, etc. The architecture of recurrent neural networks (RNNs) enables information to be temporally memorized in the network. Owing to feedback, they can exhibit a wide range of dynamics, and they are also tractable nonlinear maps. RNNs are increasingly used as predictors in nonlinear dynamical systems [1].

In 1995, Haykin and Li presented a novel computationally efficient nonlinear predictor based on the pipelined recurrent neural network (PRNN) [2]. The PRNN consists of a number of small scale recurrent neural networks (RNNs), but maintains its relatively low computational complexity considering the entire number of neurons in its architecture. In addition, the PRNN architecture helps to circumvent the problem of vanishing gradient in several aspects [7].

This paper adopts the above-mentioned nonlinear predictor to make on-line wind speed forecasts. Before forecasting, the system is reconstructed using the theory of phase space reconstruction.

2. The Haykin–Li Nonlinear Predictor

The predictor is a combination of two subsections. The PRNN, consisting of many levels of recurrent signal processing, constitutes the nonlinear subsection of the predictor. The function of the PRNN is to linearize the input signal. The linear subsection is a conventional tapped-delay-line (TDL) filter. This combination of nonlinear and linear processing should be able to extract both the nonlinear and the linear relationships contained in the input signal [6].

2.1 Nonlinear subsection

The PRNN is a modular neural network consisting of M fully connected RNNs as its modules, with each module consisting of N neurons. In the PRNN configuration, the RNNs are connected as shown in Fig. 1. Module M of the PRNN is a fully connected RNN, whereas in modules M-1, ..., 1, one of the feedback signals is substituted with the output of the first neuron of the following module. Equations (1) give a full description of the PRNN.

Fig.1. Pipelined recurrent neural network [7]

(1)

The (p × 1)-dimensional external signal vector is delayed by i time steps before feeding module i. All the modules operate using the same weight matrix W. The overall output signal of the PRNN is y_{1,1}(k), i.e. the output of the first neuron of the first module. The overall cost function of the PRNN becomes

(2)

(3)

where e_i(k) is the one-step forward prediction error from module i and λ is the forgetting factor, which determines the weighting of the individual modules.
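As a rough illustration of the pipelined structure described above, the following Python sketch computes one time step of a PRNN. It is a sketch under stated assumptions, not the authors' implementation: the function name `prnn_step`, the `tanh` activation, the input ordering (external samples, bias, feedback) and the cost written as the sum over modules of λ^(i-1) e_i²(k) follow the usual formulation in [2, 7] and are not taken from the equations of this paper.

```python
import numpy as np

def prnn_step(W, s, k, y_prev, p, lam=0.9):
    """One time step of a PRNN whose M modules share the weight matrix W.

    W      : (N, p + 1 + N) weights, shared by all modules
    s      : 1-D array holding the input signal
    k      : current time index, must satisfy k >= M + p - 1
    y_prev : (M, N) neuron outputs of every module at time k - 1
    p      : length of the external input vector
    lam    : forgetting factor weighting the module errors (assumed form)
    """
    M, N = y_prev.shape
    y = np.empty((M, N))
    e = np.empty(M)
    # Work from the last module to the first, because module i needs the
    # current output of the first neuron of module i + 1.
    for i in range(M, 0, -1):
        # External input delayed by i steps: s(k-i), ..., s(k-i-p+1).
        ext = s[k - i - p + 1 : k - i + 1][::-1]
        fb = y_prev[i - 1].copy()          # feedback from time k - 1
        if i < M:
            fb[0] = y[i, 0]                # replaced by the following module
        u = np.concatenate([ext, [1.0], fb])   # 1.0 is the bias input
        y[i - 1] = np.tanh(W @ u)          # tanh is an assumed activation
        e[i - 1] = s[k - i + 1] - y[i - 1, 0]  # one-step prediction error
    cost = float(np.sum(lam ** np.arange(M) * e ** 2))
    return y, e, cost
```

In this sketch the overall PRNN output is `y[0, 0]`, the first neuron of the first module, and `y` is passed back in as `y_prev` at the next time step.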

2.2 Linear subsection

The linear subsection of the neural network-based predictor consists of a tapped-delay-line (TDL) filter, which is shown in Fig. 2.

Fig.2. Tapped-delay-line filter[2]

2.3 Prediction procedure

The procedure is composed of the following three subtasks [2]:

1). Prediction: Compute the one-step forward nonlinear prediction errors of the PRNN at time instant k, using equations (1) and (3).

2). Weight Updating: A learning algorithm uses the suitably chosen overall cost function (2) to calculate the weight matrix correction, which updates the weight matrix W.

3). Filtering: Using (1), the output of the PRNN is computed. The updated input signal to every module is formed by substituting the external signal input with the updated external signal input.

The output of the PRNN is then fed into the LMS filter to produce the predicted signal of the nonlinear predictor.

3 Training Algorithms for the Predictor

The original learning algorithm for the PRNN is real time recurrent learning (RTRL). This algorithm has been shown to suffer from serious drawbacks such as divergence. This paper adopts a normalized version of RTRL, obtained via local linearization of RTRL around the current point in the state space of the network. The resulting algorithm provides an adaptive learning rate normalized by the norm of the gradient vector at the output neuron.

3.1 RLS algorithms for linear subsection

The description of RLS algorithms is as follows [1]:

(4)
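For orientation, the sketch below implements the textbook exponentially weighted RLS recursion for a linear combiner such as the TDL filter. It is not necessarily the exact form given in [1]: the class name `RLSFilter`, the forgetting factor `lam` and the initialization constant `delta` are assumptions made for illustration.

```python
import numpy as np

class RLSFilter:
    """Exponentially weighted recursive least squares (standard textbook form)."""

    def __init__(self, n_taps, lam=0.99, delta=100.0):
        self.w = np.zeros(n_taps)          # tap weights
        self.P = delta * np.eye(n_taps)    # inverse correlation matrix estimate
        self.lam = lam                     # forgetting factor, 0 < lam <= 1

    def update(self, x, d):
        """Adapt the weights with tap vector x and desired sample d.

        Returns the a priori error d - w^T x computed before the update.
        """
        Px = self.P @ x
        gain = Px / (self.lam + x @ Px)    # gain vector
        err = d - self.w @ x               # a priori estimation error
        self.w = self.w + gain * err
        # P is symmetric, so x^T P equals (P x)^T.
        self.P = (self.P - np.outer(gain, Px)) / self.lam
        return err
```

In the predictor, the tap vector `x` would hold the most recent PRNN outputs and `d` the current signal sample; the prediction itself is `filter.w @ x`.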

3.2 Normalized RTRL algorithms for PRNN

Here, we adopt the NRTRL algorithm to train the network. The weights are adapted as follows [4, 5]:

(5)

The update in (5) uses the gradients at the output neuron with respect to the weights from the jth neuron, which are calculated by the RTRL algorithm, together with a positive constant and the learning rate of the ith module. It also contains an additional parameter that we introduce.

3.3 Initialization of the Algorithm

The synaptic weight matrix of the PRNN is initialized by the traditional epochwise training method for RNNs [3], with 100 epochs run over 10% of the total data points. All the experiments use the same strategy.

4 The theory of State Space Reconstruction

The concept of low-dimensional chaos has proven to be fruitful in nonlinear time series analysis. It is believed that a dynamic nonlinear system can be reconstructed from a single time series signal through the method of delays (MOD).

According to Packard et al. [8] and Takens [9], the method of delays can be used to embed a scalar time series into a d-dimensional space as follows:

(6)

where τ is the index lag. If the sampling time is Δt, the delay time is τΔt. For real data with noise, the choice of delay time and embedding dimension is important for the quality of the reconstruction. Many methods have been suggested for estimating these parameters, such as the autocorrelation function [10] and mutual information [11] for the delay time, and G-P [12] and FNN (false nearest neighbors) [13] for the embedding dimension. Other methods, such as the time window method [14] and the C-C method [15], determine these parameters together.
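The method of delays can be sketched in a few lines of Python. The function name `delay_embed` and the forward-in-time ordering of the coordinates are assumptions; the paper's own equation (6) may order the delayed samples differently.

```python
import numpy as np

def delay_embed(x, d, tau):
    """Embed a scalar series into d dimensions with index lag tau.

    Row i of the result is [x[i], x[i + tau], ..., x[i + (d - 1) * tau]].
    """
    x = np.asarray(x, dtype=float)
    n_vectors = len(x) - (d - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this d and tau")
    # Column j is the series shifted by j * tau samples.
    return np.stack([x[j * tau : j * tau + n_vectors] for j in range(d)], axis=1)
```

For example, a series of 10 samples embedded with d = 3 and lag 2 gives 6 reconstructed state vectors, because the last vector needs samples up to index i + 4.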

5 Prediction result of wind speed

5.1 Some basic information

The function in equation (1) is given by

(7)

As for normalized algorithm, the condition of must be satisfied [1].

In fact, the nonlinear predictors in this paper are a kind of NARMAX network. The wind signals are preprocessed before being fed into the predictor. First, the first-order difference operator is applied to the signal; this step is necessary for a better performance gain. Then, the signal is rescaled to the range between -1 and 1.

The prediction is made as follows. First, the original signal is reconstructed and a new signal is obtained. Then, the dimension of the external signal vector for the PRNN is set to the embedding dimension, namely p = d.
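The preprocessing described in this section (first-order differencing followed by rescaling to the range between -1 and 1) can be sketched as follows. The function names and the min-max form of the rescaling are assumptions, since the text does not state how the rescaling is done; the inverse function is included only to show how a prediction would be mapped back to a wind speed.

```python
import numpy as np

def difference_and_rescale(x):
    """First-order difference, then min-max rescaling to [-1, 1].

    Returns the rescaled differences and the (min, max) needed to undo it.
    Assumes the differences are not all equal (otherwise max == min).
    """
    dx = np.diff(np.asarray(x, dtype=float))
    lo, hi = dx.min(), dx.max()
    scaled = 2.0 * (dx - lo) / (hi - lo) - 1.0
    return scaled, (lo, hi)

def restore(scaled, bounds, x0):
    """Undo the rescaling and the differencing, starting from the sample x0."""
    lo, hi = bounds
    dx = (np.asarray(scaled, dtype=float) + 1.0) * (hi - lo) / 2.0 + lo
    return np.concatenate([[x0], x0 + np.cumsum(dx)])
```

In an on-line setting the minimum and maximum would have to come from past data only, for instance from the portion of the series used for initialization.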

5.2 Selection of delay time and embedding dimension

Here, the delay time is determined by the first minimum of the average mutual information (Fig. 3). The embedding dimension is chosen by Cao's false nearest neighbors method (Cao-FNN). Both E1(d) and E2(d), the quantities defined by the Cao-FNN method, are calculated to determine the minimum embedding dimension of the wind speed signal. The latter helps to distinguish deterministic data from random data: for deterministic data, E2(d) depends on d and therefore cannot be constant for all d. We have also tested the sensitivity of the Cao-FNN method to data length. The result does not strongly depend on the length of the data (Fig. 4). Finally, 1000 data points are chosen for determining the delay time and embedding dimension.
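The delay selection rule above can be illustrated with a histogram estimate of the average mutual information. This is a minimal sketch, assuming an equal-width histogram with `bins` cells per axis; the paper does not state which estimator was used, and all function names here are hypothetical.

```python
import numpy as np

def average_mutual_information(x, lag, bins=16):
    """Histogram estimate (in nats) of the mutual information of x(k), x(k+lag)."""
    x = np.asarray(x, dtype=float)
    pxy, _, _ = np.histogram2d(x[:-lag], x[lag:], bins=bins)
    pxy /= pxy.sum()                       # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)    # marginal of x(k)
    py = pxy.sum(axis=0, keepdims=True)    # marginal of x(k + lag)
    nz = pxy > 0                           # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def first_local_minimum(values):
    """Index of the first interior local minimum, or None if there is none."""
    for j in range(1, len(values) - 1):
        if values[j] < values[j - 1] and values[j] <= values[j + 1]:
            return j
    return None

def first_minimum_lag(x, max_lag=50, bins=16):
    """Lag at which the estimated mutual information first reaches a minimum."""
    ami = [average_mutual_information(x, lag, bins) for lag in range(1, max_lag + 1)]
    idx = first_local_minimum(ami)
    return None if idx is None else idx + 1    # ami[0] belongs to lag 1
```

Histogram estimates fluctuate with the number of bins and the data length, so in practice the curve is usually inspected by eye, as in Fig. 3, rather than relying on the first numerical minimum alone.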

Fig.3. average mutual information versus time lag

Fig.4. the values E1 and E2 versus dimension

(E1-1 and E1-2 represent the E1 values obtained using 1000 and 2000 data points, respectively; likewise for E2)

5.3 Typical prediction results and discussion

In this section, we illustrate the application of the predictor to coarsely sampled signals. The speed signal was obtained from a wind power plant. The time series is sampled at one point per minute over a period of one year. Each of the three original wind speed signals used here consists of 1000 points.

The measure used to assess the performance of the predictors is the forward prediction gain given by [6]

$$R_p = 10 \log_{10}\left(\hat{\sigma}_s^2 / \hat{\sigma}_e^2\right) \ \text{dB} \qquad (8)$$

where $\hat{\sigma}_s^2$ denotes the estimated variance of the speed signal, whereas $\hat{\sigma}_e^2$ denotes the estimated variance of the forward prediction error signal.

Figs. 5, 6 and 7 show typical prediction results. It is clear from the figures that the predictor tracks changes in the signal quickly. Compared with persistence forecasting, the PRNN+TDL model with the NRTRL+RLS training algorithm gives an effectively improved result. The results given by the RTRL+LMS and NRTRL+NLMS algorithms are also shown in Table 1. Table 2 shows the parameters of the PRNN. The results of state space reconstruction are as follows: delay τ = 2, d = 6 for wind1; τ = 2, d = 8 for wind2; τ = 2, d = 6 for wind3. In the figures, there are some places where the relative error becomes very large. The reason is that the original wind velocity there is near zero, so a small absolute prediction error leads to a large relative error. That is also why the prediction gain $R_p$ is chosen as the performance indicator here.
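The performance measure can be computed as in the sketch below, which also includes the persistence baseline described in the Introduction. The function names are hypothetical, and `relative_improvement` reflects one reading of the last column of Table 1 (the relative increase of the NRTRL+RLS gain in dB over the persistence gain), which reproduces the tabulated percentages.

```python
import numpy as np

def prediction_gain_db(signal, error):
    """Forward prediction gain: 10 log10 of signal variance over error variance."""
    return 10.0 * np.log10(np.var(signal) / np.var(error))

def persistence_errors(x):
    """Errors of the persistence forecast, which predicts x[k] by x[k - 1]."""
    x = np.asarray(x, dtype=float)
    return x[1:] - x[:-1]

def relative_improvement(gain_db, baseline_db):
    """Relative increase of a prediction gain over a baseline gain (both in dB)."""
    return (gain_db - baseline_db) / baseline_db
```

With the Table 1 values for Wind1, `relative_improvement(12.66, 10.64)` is about 0.19, matching the listed 19.0%.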

Fig.5. prediction results of wind1



Table 1. Performance of different predictors

Signal   Persistence forecast   RTRL+LMS   NRTRL+NLMS   NRTRL+RLS   Performance improvement
Wind1    10.64 dB               12.63 dB   12.65 dB     12.66 dB    19.0%
Wind2    11.7 dB                14.14 dB   14.21 dB     14.25 dB    21.8%
Wind3    5.85 dB                7.87 dB    7.84 dB      7.90 dB     35.0%

Table.2 Parameters of PRNN

RTRL+LMS

NRTRL+NLMS

NRTRL+RLS

6 Conclusions

In this paper, the pipelined recurrent neural network (PRNN) is adopted as a predictor for the wind speed signal. Because wind is non-stationary and nonlinear, prediction is a difficult task. First, the paper presents the architecture of the nonlinear predictor, and a normalized algorithm is introduced to train the PRNN. Second, the original wind speed signal is reconstructed using the theory of state space reconstruction. The delay time is determined by the first minimum of the average mutual information, and the embedding dimension is chosen by the Cao false nearest neighbors (Cao-FNN) method. The sensitivity of the Cao-FNN method to data length is tested, showing that the result does not strongly depend on the length of the data. Third, typical prediction results are shown and the performance of different predictors is compared. The results show that the improved PRNN+TDL predictor has a clear advantage over the persistence forecast (a 19.0% to 35.0% higher prediction gain in Table 1), and that NRTRL+RLS is the best training algorithm for this nonlinear predictor. Finally, the experiments show that MOD is an effective method for determining the parameters of the PRNN.

Acknowledgment

This work was supported by the National Basic Research (973) Program of China (No. 2007CB210304).

References

[1] Danilo P. Mandic, Jonathon A. Chambers. “Recurrent neural networks for prediction”. John Wiley & Sons, 2001

[2] S. Haykin and L. Li, “Nonlinear adaptive prediction of nonstationary signals,” IEEE Transactions on Signal Processing, vol. 43, no. 2, pp. 526-535, 1995.

[3] Ronald J. Williams and Jing Peng. “An Efficient Gradient Based Algorithm for On Line Training of Recurrent Network Trajectories”. Neural Computation, 2, pp. 490-501, 1990.

[4] Danilo P. Mandic, Jonathon A. Chambers. “A normalised real time recurrent learning algorithm”. Signal Processing 80 (2000) 1909-1916

[5] Ronald J. Williams and David Zipser. “A Learning Algorithm for Continually Running Fully Recurrent Neural Networks”. Neural Computation, 1, pp. 270-280, 1989.

[6] Jens Baltersee and Jonathon A. Chambers. “Nonlinear Adaptive Prediction of Speech with a Pipelined Recurrent Neural Network”. IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 46, NO. 8, AUGUST 1998.

[7] Danilo P. Mandic and Jonathon A. Chambers. “Toward an Optimal PRNN-Based Nonlinear Predictor”. IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 10, NO. 6, NOVEMBER 1999.

[8] N.H. Packard, J.P. Crutchfield, J.D. Farmer, R.S. Shaw, Phys. Rev. Lett. 45 (1980) 712.

[9] F. Takens, in: D.A. Rand, L.S. Young (Eds.), Dynamical Systems and Turbulence, Lecture Notes in Mathematics, vol. 898 Springer, Berlin,1981, p. 336.

[10] H.Kantz, T.Schreiber, Nonlinear Time Series Analysis[M]. Cambridge: Cambridge University Press,1997,p127.

[11] A.M.Fraser, H.L.Swinney. Independent coordinates for strange attractors from mutual information[J]. Phys. Rev.A. 1986,33:1134-1140.

[12] P.Grassberger, I.Procaccia. Measuring the strangeness of strange attractors[J]. Physica D, 1983,9:189-208

[13] M.B.Kennel, R.Brown, H.D.I.Abarbanel. Determining embedding dimension for phase-space reconstruction using a geometrical construction[J]. Phys. Rev. A 1992,45:3403.

[14] D.Kugiumtzis. State space reconstruction parameters in the analysis of chaotic time series – the role of the time window length[J]. Physica D, 1996,95:13-28.

[15] H.S.Kim, R.Eykholt, J.D.Salas. Nonlinear dynamics, delay times and embedding windows[J]. Physica D, 1999,127:48-60.
