Master's Degree Thesis, Mechanical Engineering (Thermo-technical Plants). Eng. Andrea Cretì, [email protected]

Università del Salento

Facoltà di Ingegneria

Corso di Laurea Magistrale in Ingegneria Meccanica

Thesis in: Impianti termotecnici (Thermo-technical Plants)

COMPARISONS BETWEEN DIFFERENT HYBRID STATISTICAL MODELS FOR ACCURATE FORECASTING OF PHOTOVOLTAIC SYSTEMS POWER

Supervisors:
Prof. Ing. Paolo M. Congedo
Prof.ssa Ing. M.G. De Giorgi

Co-supervisor:
Ing. Maria Malvoni

Candidate:
Andrea Cretì

Autumn Session, Academic Year 2012/2013

To my Family

To God

Contents

Abstract
1 Introduction
2 Photovoltaic system description
  2.1 Places description
  2.2 Climate data analysis
  2.3 General specification of the PV plant
  2.4 Mechanical dimensioning of the PV plant
  2.5 Electrical dimensioning of the PV plant
  2.6 Data acquisition system
  2.7 Electrical Power Production Data
3 Electrical time series forecasting
  3.1 State of the Art
  3.2 What is a Learning Machine
  3.3 Created Models
  3.4 Training and Test Datasets
  3.5 Model Performance Evaluation methods
4 Artificial Neural Networks (ANNs)
  4.1 Elman Back-propagation Neural Network
  4.2 Forecasting with Model I - Input Vector I and II
5 Support Vector Machines (SVMs)
  5.1 Introduction to Support Vector Machines
  5.2 SVM for Regression Models - SVR
  5.3 Loss Functions
  5.4 Nonlinear SVR using kernels
  5.5 Least Square Support Vector Machine for Regression
  5.6 LsSVM Matlab Toolbox
  5.7 Forecasting with Model II - Input Vector I
  5.8 Forecasting with Model II - Input Vector II
6 LSSVM with Wavelet Transform
  6.1 Fourier transform and short-term Fourier transform
  6.2 Continuous Wavelet transform
  6.3 Discrete Wavelet transform
  6.4 Daubechies type 4 Discrete Wavelet transform
  6.5 Matlab Wavelet Toolbox and Wavelet transforming algorithm
  6.6 Forecasting results with Input Vector I
  6.7 Forecasting results with Input Vector II
7 Multistep forecasting
8 Comparisons between Model I, II and III
9 Conclusions
  9.1 Conclusions
  9.2 Future work recommendations
Acknowledgements
  Acknowledgements
  Connecting the Dots...
Terms, definitions, abbreviations and symbols
Bibliography

Abstract

The very high penetration of photovoltaic energy into the free electricity market requires efficient PV power forecasting systems. This study focuses on forecasting the power productivity of a photovoltaic system located in Apulia, in the south-east of Italy, using different hybrid statistical models and comparing their performances. The statistical models created and analyzed in this thesis are based on: Artificial Neural Networks (ANNs), Least Square Support Vector Machines (LS-SVMs) and a hybrid model based on LS-SVMs with the Wavelet Decomposition of the input dataset.

In the first part of the thesis, a description of photovoltaic system technology is given, together with a description of the PV park located in Monteroni di Lecce, Puglia, Italy, including an accurate analysis of the climate data and of the productivity of the plant. In the second part, different models for electric power forecasting are proposed. Learning Machine theory is explained, and forecasting simulations using Mathworks Matlab software are carried out for different forecasting horizons (+1h, +3h, +6h, +12h, +24h). The forecasting errors obtained with the different models are investigated, and an accurate error distribution analysis is performed in order to identify the model that reaches the best performance. In the final chapter, the Multistep technique applied to the LS-SVM models is briefly discussed and the corresponding forecasting simulations are presented. It was found that hybrid methods based on LS-SVM and WD outperform the other methods in the majority of cases.

Chapter 1

Introduction

According to the statistical analysis of photovoltaic (PV) systems in Italy performed by the Italian Energy Service (Gestore Servizi Energetici, GSE S.p.A.), at the end of 2011 about 330,200 plants were operative in Italy, with a total installed power of 12,780 MW. In September 2012 the number of plants had increased to 440,387 and the total power was equal to 15,482.8 MW. By mid-September 2013 the installed PV plants had increased to 549,918, with a total power of 17,439 MW, and the Apulia Region, in the south-east of Italy, is the first region in Italy for installed PV power (2,493 MW), with about 128 kW/km2. One of the Apulian PV plants was installed in the "Ecotekne Campus" of the University of Salento in Monteroni di Lecce (LE), which promotes the use of renewable energy and participates in international research projects in this field.

This study is part of the funded research project "Building Energy Advanced Management Systems (BEAMS)". BEAMS is an EU Research and Development project funded by the EC in the context of the 7th Framework Programme. Its strategic goal is the development of an advanced, integrated management system which enables energy efficiency in buildings and special infrastructures from a holistic perspective. The project is developing an open interoperability gateway that will allow the management of diverse, heterogeneous sources and loads, some of them typically present nowadays in spaces of public use (e.g. public lighting, ventilation, air conditioning), some others emergent and expected to become widespread over the next years (e.g. electric vehicles).

BEAMS is a user-driven, demonstration-oriented project, where evidence of the energy and CO2 savings achieved by the project's technologies will be collected. By means of a decentralized architecture, BEAMS will enable new mechanisms to extend current building management systems and achieve higher degrees of efficiency.

Figure 1.1: Italian Solar Map

The solution proposed will not only support the human operator of the building or facility in achieving higher efficiency in the use of energy, but it will also open new opportunities to third parties, such as Energy Service Companies (ESCOs), utilities and grid operators, needing and willing to interact with the BEAMS management system through the interoperability gateway in order to improve the quality and efficiency of the service, both inside and outside the perimeter of the facility.

Figure 1.2: BEAMS Project logo

The purpose of this thesis is to design different hybrid statistical models for photovoltaic power forecasting, applied to the PV Park located in Monteroni di

Lecce (LE), and also to evaluate the performances of these different forecasting models. In the first part of this thesis, the author presents a detailed description of the photovoltaic power plant, including a detailed climate data analysis and the data acquisition system, based on past studies by P.M. Congedo et al. [1]. Secondly, a productivity analysis is proposed, relying on the original project of the plant made by the Italian company ESPE srl. A very accurate literature search on the state of the art of electrical time series forecasting was carried out, underlining the evolution of time series forecasting, starting from traditional statistical methods such as multi-linear regression models, Box-Jenkins methods, Kalman filtering-based methods and ARMA models. It is noted that electric load time series are usually non-linear functions of exogenous variables [13], so, to incorporate non-linearity, many researchers started using Artificial Neural Networks (ANNs), Support Vector Machines (SVMs) and hybrid models based on the Wavelet Transform of the signals. It is also underlined that no one had previously used a Least Square Support Vector Machine (LSSVM) based model, nor a hybrid model combining the Wavelet Transform with LSSVMs, in order to evaluate the performance of a photovoltaic power plant. The innovative character of this thesis is therefore the use of these hybrid statistical models in order to reach better performances in PV power forecasting.

In order to store the data acquired from the PV power plant in an efficient way, a software tool called "Solar Data Extractor" was developed, which provides real-time acquisition from the PV plant design company's web site to a local MS Access database. A Java routine for MS Access to MySQL database conversion was also developed.

In the second part of the thesis, a description of the mathematical theory of Artificial Neural Networks, Least Square Support Vector Machines and the Wavelet Transform is proposed, underlining the strengths and the weaknesses of each algorithm. After that, forecasting simulations are carried out using Mathworks Matlab R2012b software with each forecasting model, using two different input vectors. All the results are analyzed and an accurate performance evaluation based on a statistical approach is proposed.

The performances of the created models are compared in order to identify the best forecasting models for particular applications. A final discussion about the forecasting techniques used is proposed, underlining positive and negative aspects of each model, together with some future work recommendations to improve the performance of the created models.

Chapter 2

Photovoltaic system description

2.1 Places description

The PV Park under study is located in the campus of the University of Salento, in Monteroni di Lecce (LE), Puglia (40°19'32.16"N, 18°5'52.44"E); the maps in Figures 2.1 and 2.2 indicate its geographical location.

Figure 2.1: Geographical location of the Campus "Ecotekne" (source: Google Maps)

The PV panels of the PV Park are installed on shelters used as car parking, as shown in Figures 2.3 and 2.4.

Figure 2.2: Map of the Campus Ecotekne (source: Google Maps)

Figure 2.3: Example 1 of PV modules installed on shelters in the Ecotekne Campus

2.2 Climate data analysis

The site of the PV Park is characterized by a warm Mediterranean climate with a dry summer [1]. The Cartesian and polar solar maps of the site are shown in Figures 2.6 and 2.7. The climate data of the site, reported on www.meteo-am.it and www.ilmeteo.it, were analyzed in terms of temperature, humidity and wind speed during three temporal periods: from 1961 to 1990, from 1991 to 2000 and from 2001 to 2011.

Figure 2.4: Example 2 of PV modules installed on shelters in the Ecotekne Campus

Figure 2.5: Means of maximum and minimum temperatures during three periods: (a) 1961-1990, (b) 1991-2000, (c) 2001-2011

Table 2.1 shows the values of average ambient temperature and PV module temperature for each range of solar irradiance for the target period (2012). The average ambient temperature ranges from 18.9 °C at 0-100 W/m2 to 27.5 °C at 1100-1200 W/m2. Moreover, the average PV module temperature ranges from 16.0 °C at 0-100 W/m2 to 48.6 °C at 1100-1200 W/m2. In the range between 1100 and 1200 W/m2 the PV module temperature reaches its highest increment over the ambient temperature (about 21 °C).

Figure 2.6: Height of the sun over twelve months for the site latitude - Cartesian diagram (source: ENEA)

Figure 2.7: Height of the sun over twelve months for the site latitude - Polar diagram (source: ENEA)

Solar Irradiance   Ambient Temperature   PV Module Temperature
(W/m2)             (°C)                  (°C)
0-100              18.9                  16.0
100-200            22.2                  22.8
200-300            23.5                  26.4
300-400            24.3                  29.7
400-500            24.9                  32.7
500-600            25.8                  36.2
600-700            25.8                  38.6
700-800            26.8                  41.9
800-900            27.1                  43.9
900-1000           27.5                  45.3
1000-1100          25.5                  45.7
1100-1200          27.5                  48.6
1200-1300          17.9                  28.1

Table 2.1: Average ambient temperature and PV module temperature for each range of solar irradiance

            Max Solar Irradiance PV1   Max Solar Irradiance PV2
            (W/m2)                     (W/m2)
March       984.2                      1073.4
April       1253.1                     1344.4
May         1245.0                     1309.0
June        1249.9                     1265.0
July        1094.7                     1124.3
August      1101.4                     1122.1
September   1220.2                     1249.5
October     924.8                      1103.5

Table 2.2: Monthly variation ranges of solar irradiance

Table 2.2 shows the maximum values of solar irradiance recorded for each month, while the maximum and minimum values of ambient temperature and module temperature are shown in Table 2.3. The maximum difference between Tc

Month       Ambient temperature (°C)   Module temperature (°C)
            Min      Max               Min      Max
March       4.4      25.1              -1.1     50.4
April       4.6      29.4              -0.1     57.2
May         8.7      33.7              4.82     60.2
June        13.6     43.1              9.1      73.6
July        18.1     44.4              13.7     70.5
August      15.7     41.2              11.3     69.6
September   10.1     37.1              4.5      62.5
October     6.8      33.4              0.8      55.9

Table 2.3: Monthly variation ranges of PV module temperature and ambient temperature

and Ta of about 40.3 °C is also noted. The maximum PV module temperature is 73.6 °C, reached when the solar irradiance and the ambient temperature are 915.5 W/m2 and 33.4 °C respectively.

2.3 General specification of the PV plant

In the University Campus "Ecotekne", located in Monteroni di Lecce (LE), Apulia, Italy, a photovoltaic plant is installed for the production of electrical energy by direct conversion of solar radiation, i.e. by the photovoltaic effect. It is composed primarily of a set of PV modules, one or more conversion groups from direct current to alternating current, and other minor electrical components. The specifications of the single PV module are shown in Tab. 2.4. The modules installed in the PV plant are produced by the company "SUNPOWER Corporation"; they have a declared efficiency of 19.6%, a reduced voltage-temperature coefficient and an anti-reflective glass. The solar cells used for this module are produced by "Maxeon Corp." with patented "back-contact" technology.

In the site under study there are the following 4 PV sub-plants with different nominal power:

FV1: 960 kWp

FV2.1: 990.72 kWp

FV2.2: 979.20 kWp

FV3: 84.436 kWp

PV module                       Specification
Type                            Mono-crystalline silicon
Nominal power (Pn)              320 Wp
Maximum power voltage (Vpm)     54.70 V
Maximum power current (Ipm)     5.86 A
Open circuit voltage (Voc)      64.80 V
Short circuit current (Isc)     6.24 A
Weight                          18.6 kg
Net [gross] module surface      1.57 m2 [1.63 m2]

Table 2.4: Specifications of the PV module

This study focuses on the sub-plant FV1, whose specifications are shown in Tab. 2.5. The FV1 sub-plant is composed of two different module groups: the first group with a nominal power of 606.7 kWp and a module tilt angle of 15°, and a second group with a nominal power of 353.3 kWp and a module tilt angle of 3°.

2.4 Mechanical dimensioning of the PV plant

The support structures of the shelters are metallic structures that ensure the anchorage of the PV modules to the ground. They also ensure the correct design angle of the PV modules. All the mechanical structures were designed and built by the company ESPE srl, in compliance with the Italian legislation (Leggi 1086/71, 64/74, D.M. 14 January 2008). The shelters were dimensioned to resist the following loads:

Permanent loads

1. Structure weight

2. Ballast weight

3. Module weight

Overloads

1. Snow loads

2. Wind loads

3. Thermal variations

4. Seismic effects

The final checks of the structure were made under the most unfavorable load conditions, applying a safety factor equal to 1.5 for the tipping checks. For the resistance checks, allowable stresses equal to 1.125 σamm and 1.125 τamm were applied.

PV module                        Specification
Type                             Mono-crystalline silicon
Nominal power of PV system       960 kWp
Total number of modules          3000
Total number of inverters        3
Total number of strings          250
Number of modules per string     12
Net [gross] modules' surface     4710 m2 [4892 m2]

PV1 sub-system
Nominal power of PV system       353.3 kWp
Azimuth                          -10°
Tilt                             3°
Total number of modules          1104
Net [gross] modules' surface     1733.3 m2 [1799.5 m2]

PV2 sub-system
Nominal power of PV system       606.7 kWp
Azimuth                          -10°
Tilt                             15°
Total number of modules          1896
Net [gross] modules' surface     2976.7 m2 [3090.5 m2]

Table 2.5: Specifications of the PV system FV1

2.5 Electrical dimensioning of the PV plant

The FV1 PV plant was partitioned into two sub-plants with peak power equal to 606.72 kWp and 353.28 kWp. For the electrical dimensioning of the FV1 plant a productivity study was carried out, as shown in Tab. 2.6 for the PV1 modules group and in Tab. 2.7 for the PV2 modules group.

606.72 kWp PV sub-plant productivity
Fixed system: Tilt = 15°, Orientation = -10°

Month       Ed     Em     Hd     Hm
January     1.90   59.0   2.39   74.1
February    2.36   66.1   3.00   84.0
March       3.40   105    4.43   137
April       4.36   131    5.79   174
May         4.82   149    6.57   204
June        5.11   153    7.13   214
July        5.17   160    7.26   225
August      4.83   150    6.81   211
September   4.05   121    5.52   166
October     3.14   97.3   4.18   130
November    2.16   64.9   2.79   83.8
December    1.70   52.6   2.15   66.5
Year Mean   3.59   109    4.48   147
Year Total         1310          1770

Table 2.6: Productivity of the 606.72 kWp PV sub-plant

Where:

Ed: average daily electrical productivity (kWh/kWp per day);

Em: average monthly electrical productivity (kWh/kWp per month);

Hd: average daily solar radiation per square meter (kWh/m2);

Hm: average monthly solar radiation per square meter (kWh/m2).

353.28 kWp PV sub-plant productivity
Fixed system: Tilt = 3°, Orientation = -10°

Month       Ed     Em     Hd     Hm
January     1.52   47.2   1.96   60.7
February    2.04   57.1   2.60   72.9
March       3.11   96.3   4.02   125
April       4.19   126    5.52   166
May         4.81   149    6.51   202
June        5.19   156    7.20   216
July        5.20   161    7.25   225
August      4.71   146    6.58   204
September   3.74   112    5.06   152
October     2.71   84.1   3.62   112
November    1.75   52.5   2.30   68.9
December    1.33   41.2   1.73   53.6
Year Mean   3.37   102    4.54   138
Year Total         1230          1660

Table 2.7: Productivity of the 353.28 kWp PV sub-plant

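As a quick consistency check of Table 2.7, the "Year Total" row can be recomputed from the twelve monthly Em values. The sketch below uses plain Python rather than the Matlab employed in the thesis; the numbers are transcribed from the table:

```python
# Monthly electrical productivity Em (kWh/kWp per month) for the
# 353.28 kWp sub-plant, transcribed from Table 2.7 (January..December).
em_monthly = [47.2, 57.1, 96.3, 126, 149, 156, 161, 146, 112, 84.1, 52.5, 41.2]

# Summing the twelve monthly values gives the annual specific yield,
# which matches the rounded "Year Total" of about 1230 kWh/kWp.
annual_yield = sum(em_monthly)
print(round(annual_yield, 1))  # 1228.4
```

The same check applied to Table 2.6 reproduces its "Year Total" of about 1310 kWh/kWp.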

2.6 Data acquisition system

The company ESAPRO, which designed the PV plant, installed a data acquisition system in order to monitor the main parameters of the system. The solar irradiation is monitored by LP-PYRA02 sensors with a resolution of 1 W/m2. PT100-type temperature sensors are used to measure the PV module temperature and the ambient temperature. The data acquisition system consists of the three inverters, the solar irradiation sensors and the PV module/ambient temperature sensors. The data from inverters and sensors are transmitted via Modbus or Profibus protocols, clean contacts or digital inputs, and are collected by a Siemens PLC with WinCC SCADA for processing and storage. Another WinCC SCADA station is used to extract and duplicate the local data. All the acquired data can be downloaded from the web site of the company ESAPRO, "http://supervisione.espe.it/fotovoltaicoWeb/index.htm", which designed and installed the PV plant.

DATA TYPE                    Units
Total Energy Production      kWh
Active Power                 kW
Radiation                    W/m2
Daily Energy Production      kWh
Monthly Energy Production    kWh
Annual Energy Production     kWh
Ambient Temperature          °C
Module Temperature           °C
Integral Radiation           W/m2

Table 2.8: Acquired data types from the ESAPRO web site

The ESAPRO web site allows the user to download data in XLS, CSV and PDF digital formats through a web form for selecting the desired period of time. This procedure for data extraction is static: it does not allow real-time acquisition, and manual downloading is necessary. In addition, XLS and CSV files become too large for long acquisition periods and are difficult to manage. In order to obtain a real-time acquisition system and a historical local database with the data from the day the PV plant was put into service, a software tool has been created. The developed software, called "SOLAR DATA EXTRACTOR", queries the ESAPRO web site every 10 minutes and extracts the data types shown in Tab. 2.8. The extracted data are inserted into an MS ACCESS 2007 database. By querying the Access database it is possible to obtain real-time information about the main parameters of the PV plant. Fig. 2.8 shows the main screen of the software: it is possible to select the MANUAL or AUTOMATIC acquisition mode. In order to implement a Web application and to have fast data interrogation from the Matlab software, a data conversion from the ACCESS DB to a MySQL database was implemented by means of a purpose-built Java tool, as shown in Fig. 2.10. A general scheme of the data acquisition system developed is shown in Fig. 2.11.
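The acquisition chain described above (poll the plant web site every 10 minutes, append each reading to a local historical database) can be sketched as follows. This is an illustrative Python sketch, not the actual "Solar Data Extractor" code: the field names and the stubbed fetch_snapshot function are hypothetical, and an SQLite database stands in for MS Access:

```python
import sqlite3
import time

POLL_SECONDS = 600  # "Solar Data Extractor" queries the web site every 10 minutes

def fetch_snapshot():
    """Stand-in for scraping the plant web pages (hypothetical field names).

    The real software would download the current values of Tab. 2.8
    (power, radiation, temperatures, energy counters) from the ESAPRO site.
    """
    return {"active_power_kw": 512.3, "radiation_wm2": 731.0,
            "t_ambient_c": 24.1, "t_module_c": 41.8}

def store(conn, reading):
    """Append one timestamped reading to the local historical database."""
    conn.execute(
        "INSERT INTO acquisitions VALUES (?, ?, ?, ?, ?)",
        (time.time(), reading["active_power_kw"], reading["radiation_wm2"],
         reading["t_ambient_c"], reading["t_module_c"]))
    conn.commit()

conn = sqlite3.connect(":memory:")  # SQLite stands in for MS Access here
conn.execute("CREATE TABLE acquisitions (ts REAL, active_power_kw REAL,"
             " radiation_wm2 REAL, t_ambient_c REAL, t_module_c REAL)")

store(conn, fetch_snapshot())  # one polling iteration; the AUTOMATIC mode
                               # would repeat this every POLL_SECONDS
rows = conn.execute("SELECT COUNT(*) FROM acquisitions").fetchone()[0]
print(rows)  # 1
```

A local relational store of this kind is what makes the later Matlab-side interrogation fast, since queries run against indexed tables instead of large exported XLS/CSV files.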


    Figure 2.8: Software SOLAR DATA EXTRACTOR main screen

    Figure 2.9: Software SOLAR DATA EXTRACTOR export screen

2.7 Electrical Power Production Data

The relation between power production, time and meteorological conditions is vital for an accurate prediction of solar power production. These factors should be taken into account when determining the data that will be used in power production forecasting. In this study, the data sets used in the forecasting process consist of past peak electrical power production values, ambient temperature values, PV module temperature values and solar irradiation values. A plot of the data sets used in this study is shown in Figure 2.13. For each hour i, considered as the beginning time of the forecast, the input vector was given by:

- The average value of the power produced by the PV plant in the 60 minutes preceding the hour i, given by:

Pm(i) = (1/6) Σ_{t=i-50min}^{i} P(t),   i = 1, ..., 6297   (2.1)


    Figure 2.10: Software for ACCESS DB to MySQL DB conversion

- The hourly average values of the module temperature (°C), the ambient temperature (°C), the irradiance on the plane inclined at a tilt angle of 3° and the irradiance on the plane at a tilt angle of 15° (W/m2).

The target used to evaluate the model prediction is given by Pt(i, l), the sum of the average hourly powers Pm(r) over the forecast time horizon l, defined as:

Pt(i, l) = (1/6) Σ_{r=i+1}^{i+l} Pm(r),   i = 1, ..., 6297   (2.2)
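Read literally, Eqs. (2.1)-(2.2) build the hourly average from six 10-minute samples and then aggregate it over the forecast horizon. A minimal Python sketch of the two definitions (the thesis works in Matlab; the synthetic sample series p below is hypothetical, and the 1/6 factor in Eq. (2.2) is kept as it appears in the text):

```python
import math

# Hypothetical 10-minute power samples P(t) in kW: 6 samples per hour,
# two days of data (the real dataset covers i = 1, ..., 6297 hours).
p = [max(0.0, 800 * math.sin(math.pi * (t % 144) / 144)) for t in range(288)]

def pm(i):
    """Eq. (2.1): average power over the 60 minutes preceding hour i."""
    s = 6 * i                      # sample index corresponding to hour i
    return sum(p[s - 6:s]) / 6.0   # mean of samples at i-50min, ..., i

def pt(i, l):
    """Eq. (2.2): forecast target over a horizon of l hours from hour i."""
    return sum(pm(r) for r in range(i + 1, i + l + 1)) / 6.0
```

For l = 1 the target reduces to Pm(i+1)/6 under this reading, so the two definitions can be cross-checked directly against each other.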

Figure 2.14 shows the correlation between the solar radiation for PV modules at 3° and the output power, the solar radiation for PV modules at 15° and the output power, the ambient temperature and the output power, and the module temperature and the output power, on the basis of one year of collected data. The Pearson-Bravais correlation coefficient (R2) is used to evaluate the data correlation: it is evident that the parameter most correlated with the PV power is the irradiation, but in view of the R2 values obtained, all parameters have been taken into consideration to implement the forecasting models in this study.

The very high correlation between solar irradiation and PV power is also shown in Fig. 2.12: the solar irradiation curve is almost the same as the PV power curve, following the same trend at every time instant.
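The Pearson-Bravais coefficient used for this screening is straightforward to compute from the raw series. A small self-contained Python sketch (the daily irradiance and power profiles below are invented toy numbers, not plant data):

```python
import math

def pearson_r(x, y):
    """Pearson-Bravais correlation coefficient between two series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy hourly profiles: power roughly proportional to irradiance,
# mimicking the strong irradiance/power coupling seen in Fig. 2.12.
irr = [0, 120, 340, 560, 720, 810, 790, 640, 430, 210, 60, 0]
pw  = [0.0, 95, 270, 450, 585, 660, 640, 515, 345, 165, 45, 0.0]

r = pearson_r(irr, pw)
print(round(r ** 2, 3))  # squared coefficient close to 1 for such series
```

Applied to the four input variables in turn, this is the computation behind the R2 values that justify keeping temperature as well as irradiance in the input vector.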


    Figure 2.11: General PV Dataset Management System

    Figure 2.12: Correlation between Solar Radiation and Output Power


(a) Output Power; (b) Ambient Temperature; (c) Module Temperature; (d) Solar Radiation at 3°; (e) Solar Radiation at 15°

Figure 2.13: Input dataset plots

(a) Ambient Temperature - Output Power; (b) Module Temperature - Output Power; (c) Irradiation at 3° - Output Power; (d) Irradiation at 15° - Output Power

Figure 2.14: Correlation between input variables and Output Power

Chapter 3

Electrical time series forecasting

3.1 State of the Art

Figure 3.1: Italian renewable energy production

Load and productivity forecasting has always been a key instrument in power system operation. Many operational decisions in power systems, such as unit commitment, economic dispatch, automatic generation control, security assessment, maintenance scheduling and energy commercialization, depend on the future behavior of loads and productivity. In particular, with the rise of deregulation and free competition in the electric power industry all around the world, load and productivity forecasting has become more important than ever before [12]. Since renewable energy power plants such as PV systems and wind farms came into use, productivity forecasting for the national energy system has become difficult due to the

high variability of the electricity production of these new systems.

In recent years, the accuracy of electricity productivity forecasting has become very important on the regional and national scale. In terms of precision, electricity suppliers are interested in various horizons in order to estimate the fossil fuel savings and to manage and dispatch the installed power plants [14]. The uncertainty of power from the sun is a limitation of PV systems, influencing the quality of the electrical grid to which they are connected. Therefore, the possibility of predicting the PV power (up to 24 h ahead or even more) can become very important for an efficient planning of grid-connected photovoltaic systems [2]. PV power production forecasting mainly includes two kinds of methods (Fig. 3.2): physical model-based and historical data-based methods [33]. Physical models are based on numerical weather predictions (NWP) to predict solar radiation and other meteorological data; they do well in medium-term and long-term predictions. The historical data-based methods require only past power or climate data and do better in short-term prediction. All the different wind and solar power prediction studies, as well as electric load prediction studies, underline the need to implement forecasting models using physical and statistical models, or to combine short-term and medium-term models, to improve the forecasting performance. In this study only historical data-based models for short-term PV power forecasting are used.

Figure 3.2: Renewable Energy forecasting methods

In the literature, different historical data-based forecasting methods have been developed to evaluate the performance of PV systems. Statistical models include moving average and exponential smoothing methods, multi-linear regression models, stochastic processes, data mining approaches, autoregressive moving average (ARMA) models, Box-Jenkins methods, and Kalman filtering-based methods [12].

  • 3.1 State of the Art

    However, electric load time series are usually nonlinear functions of exogenous vari-

    ables. Therefore, to incorporate non-linearity, Articial Neural Networks (ANNs)

    have received much more attention in solving problems of electricity load or pro-

    ductivity forecasting [13]. In [15] the power forecasting of a PV system has been

    presented by calculating the solar radiation, collecting data from weather forecast-

    ing, and using Elman neural network to forecast by using data from PV system.

    In [16] a MLP network for 24 h forecasting ahead of solar irradiation was devel-

    oped. The proposed model used as input parameters the mean daily irradiation

    and the mean daily air temperature. De Giorgi et al [18] compared ARMA models,

    which perform a linear mapping between inputs and outputs with Articial Neu-

    ral Network (ANNs) and Adaptive Neuro-Fuzzy Inference Systems (ANFIS), which

    perform a non-linear mapping, underlining that ANNs presents an higher accuracy

    at long time horizon in wind power forecasting. The higher forecasting accuracy for

    long time horizon given by non-linear models as the ANN is also shown in [3842]

    for PV power prediction and in [4349] for wind power signals. However, one major

    risk in using ANN models is the possibility of excessive training data approxima-

    tion, i.e., over-tting, which usually increases the out-of-sample forecasting errors

    [16].

    Recently, new methods for time series forecasting based on Learning Machines were developed, using Support Vector Machines (SVMs). A Support Vector Machine for Regression problems is based on the Vapnik statistical learning theory [4, 5] and has been used for electrical productivity forecasting (PV systems, wind parks, etc.). Several studies demonstrate that SVMs are more resistant to the over-fitting problem, achieving high generalization performance in forecasting various time series. In addition, SVMs can model complex problems with datasets made up of several variables and a reduced training dataset. In [35] the lower computational time of the SVM, compared to ANN models using back-propagation algorithms, is highlighted. Ruidong Xu et al. [36] showed that PV power forecasting using SVM models is more efficient and more practicable than that of the ANN, and forecasting models based on Support Vector Machine Regression are also proposed in [50] and [51]. In [37], too, the SVM model outperforms the ANN.

    A variant of the standard SVM is the Least Square Support Vector Machine (LS-SVM) [10], which uses a simplified linear model, simpler and computationally cheaper but with the same advantages of the ANN and SVM models. LS-SVM models have already been applied to wind power forecasting, such as in [52] and [53], where it is shown that these models outperform standard SVM Regression, in particular for



    very-short forecasting horizons.

    The improvement of the prediction performance is noticeable in particular for hybrid methods based on Wavelet Decomposition (WD) [13], [19] and [33]. The non-stationary nature of PV power (like wind power) makes WD an interesting tool for studying this kind of signal: the time series can be decomposed into approximately stationary components, allowing the model to analyze those components separately. Xiyun Yang et al. [7] used the wavelet transform to process data for PV power prediction with an SVM model, and the Wavelet Transform was also used to improve the performance of Neural Network forecasting models for wind power and national electric loads [33], [34]. In [54] a hybrid approach based on WD, ANNs and an evolutionary algorithm was successfully proposed. To date, no LS-SVM model or hybrid LS-SVM model has been used for PV power forecasting, so the innovation introduced in this thesis concerns the use of an LS-SVM model for PV power forecasting, together with a hybrid model based on the LS-SVM. A final evaluation of the performance of ANNs, LS-SVMs and LS-SVMs with Wavelet Transform is proposed in order to identify the best performance for every forecasting horizon.

    3.2 What is a Learning Machine

    A learning system is a system that provides an adaptive answer to external stimuli. A learning process requires a feedback from the external environment that informs the system about the quality of the answer associated with each stimulus. Learning machines are based on three types of learning processes (Fig. 3.3):

    Figure 3.3: Learning Machines



    Supervised Learning: the feedback provided to the system is implemented by an error function that evaluates the deviation of the system response from the optimal response. The target of the learning process is to minimize the error function in order to obtain an optimal response;

    Non-Supervised Learning: an optimal response to the input stimulus does not exist. The learning machine must be able to extract similarity information from the input data (with no associated desired output) in order to perform a categorization;

    Reinforcement Learning: a programming philosophy that allows the realization of algorithms able to learn and adapt themselves to environmental changes. This programming technique is based on the possibility of receiving external stimuli depending on the algorithm's choices. A correct choice will result in a reward, while an incorrect choice will result in a penalization of the system. The target of the system is to reach the highest reward and thus the best result.

    It isn't always possible to specify a deterministic relation between an input dataset (stimuli) and an associated output dataset (responses). The solution to these problems is to learn from some examples the functional relation (target function) that maps the input space into the output space. The approximation of the target function extrapolated from the input examples is called the Learning Problem Solution. It is important to select, from a set of functions, the one with the best input-output mapping performance.

    An AGENT is a system designed by taking inspiration from a human or animal model. An Agent is made up of a Sensor System, integrated with the external environment in order to acquire data; a Decisional System, in order to take decisions based on the acquired data; and an Actuation System, for operating on the environment according to the decisions of the Decisional System.

    In a system for reinforcement learning:

    The Agent receives sensations from the environment using its sensors;

    The Agent decides the actions on the external environment;

    According to the results of the actions, the Agent can be rewarded.



    Figure 3.4: Reinforcement Learning

    In order to use an automatic learning method, general suppositions about the environment properties are made; in particular, the environment is described by a Markov Decision Process (MDP), formally defined by:

    A finite action set A;

    A finite state set S;

    A transition function T (T : S × A → Π(S)) that assigns to every state-action couple a probability distribution on S;

    A reinforcement function (reward function) R (R : S × A → ℝ) that assigns to every state-action couple an immediate reward.
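As an illustrative sketch (not from the thesis), an MDP of this kind can be stored as plain Python dictionaries; the state and action names below are invented:

```python
# Hypothetical two-state, two-action MDP: names are invented for illustration.
STATES = ["low_irradiance", "high_irradiance"]
ACTIONS = ["store", "sell"]

# T[s][a] is a probability distribution over successor states (T : S x A -> Pi(S)).
T = {
    "low_irradiance":  {"store": {"low_irradiance": 0.7, "high_irradiance": 0.3},
                        "sell":  {"low_irradiance": 0.6, "high_irradiance": 0.4}},
    "high_irradiance": {"store": {"low_irradiance": 0.2, "high_irradiance": 0.8},
                        "sell":  {"low_irradiance": 0.5, "high_irradiance": 0.5}},
}

# R[s][a] is the immediate reward for taking action a in state s (R : S x A -> R).
R = {
    "low_irradiance":  {"store": 0.0, "sell": 1.0},
    "high_irradiance": {"store": 0.5, "sell": 2.0},
}

def is_valid_mdp(states, actions, transitions):
    """Check that every (state, action) couple maps to a distribution on S."""
    return all(
        abs(sum(transitions[s][a].values()) - 1.0) < 1e-9
        for s in states for a in actions
    )
```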


    Input Vector   Description
    I              Output Power
    II             Output Power, Irradiation at 3, Irradiation at 15,
                   Module Temperature, Ambient Temperature

    Table 3.2: Input Vectors used with every Model

    Before applying one of the forecasting Models, some operations on the Datasets are necessary in order to obtain better performance and to allow the simulation software (Mathworks Matlab) to carry out the forecasting procedure successfully, as shown in Fig. 3.5. After the acquisition, the data were adjusted to suit the Matlab routine, normalized into the range [-1; 1] and uploaded into a Matlab database; after the forecasting procedure the dataset was denormalized and compared with the real power data, evaluating the performances of the model. The data normalization was made by a Min-Max Normalization procedure, using Eq. 3.1.

    v' = (v - min_A) / (max_A - min_A) · (Nmax_A - Nmin_A) + Nmin_A        (3.1)

    Where:

    v = value to normalize;

    v' = normalized value;

    min_A = minimum value of the original range;

    max_A = maximum value of the original range;

    Nmax_A = maximum value of the new range;

    Nmin_A = minimum value of the new range;
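The normalization of Eq. 3.1 and its inverse (needed after forecasting) can be sketched in Python as follows; the sample power values are hypothetical:

```python
def minmax_normalize(values, new_min=-1.0, new_max=1.0):
    """Min-Max normalization (Eq. 3.1): rescale values into [new_min, new_max]."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    return [(v - v_min) / span * (new_max - new_min) + new_min for v in values]

def minmax_denormalize(norm_values, v_min, v_max, new_min=-1.0, new_max=1.0):
    """Invert Eq. 3.1 to recover the original scale after forecasting."""
    return [(n - new_min) / (new_max - new_min) * (v_max - v_min) + v_min
            for n in norm_values]

power = [0.0, 200.0, 400.0, 800.0]      # hypothetical hourly PV power [kW]
norm = minmax_normalize(power)          # values now lie in [-1, 1]
restored = minmax_denormalize(norm, min(power), max(power))
```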

    3.4 Training and Test Datasets

    All the collected time series data (365 days / 6297 hourly records) were divided into two sets: the training and the testing data set. The training data set included 65% of the time series data, the testing data set the remaining 35% (Figure 3.6), in the same way as done by De Giorgi et al. [2]. Figures 3.7 and 3.8 show the trend of some samples from the training and test datasets.
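A chronological 65%/35% split of this kind can be sketched as follows (the data here are a stand-in for the 6297 hourly records):

```python
def chronological_split(series, train_fraction=0.65):
    """Split a time series into training and test sets, preserving the time
    order (65% / 35% as used in this thesis)."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]

hourly_power = list(range(6297))        # stand-in for the 6297 hourly records
train, test = chronological_split(hourly_power)
```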



    Figure 3.5: Acquired dataset adjustment

    Figure 3.6: Training and Test Dataset division

    3.5 Model Performances Evaluations methods

    The Mean Absolute Error (MAE), Normalized Mean Absolute Percentage Error

    (NMAPE) and the Standard Deviation (Std) of MAE [33] can be used to measure

    the prediction performance of the created models:

    MAE = (1/n) Σ_{i=1}^{n} |P_i - T_i|        (3.2)

    NMAPE = (1/n) Σ_{i=1}^{n} (|P_i - T_i| / C) · 100        (3.3)



    Figure 3.7: Example of used Training data

    Figure 3.8: Example of used Test data

    Std = sqrt( (1/(n-1)) Σ_{i=1}^{n} (|P_i - T_i| - MAE)^2 )        (3.4)

    Where:

    i: generic time instant;

    n: number of observations;

    Pi: forecasted PV power at time instant i;

    32


    Ti: real PV power at time instant i;

    C: maximum of the real PV power.
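Eqs. 3.2-3.4 can be sketched in Python as follows; the forecast and measurement values are hypothetical:

```python
def forecast_metrics(P, T, C):
    """MAE, NMAPE [%] and Std of the absolute error (Eqs. 3.2-3.4).
    P: forecasted power, T: real power, C: maximum real power."""
    n = len(P)
    abs_err = [abs(p - t) for p, t in zip(P, T)]
    mae = sum(abs_err) / n
    nmape = sum(e / C for e in abs_err) / n * 100.0
    std = (sum((e - mae) ** 2 for e in abs_err) / (n - 1)) ** 0.5
    return mae, nmape, std

# Hypothetical forecasted / measured PV power [kW]
P = [0.0, 150.0, 380.0, 220.0]
T = [0.0, 200.0, 400.0, 200.0]
mae, nmape, std = forecast_metrics(P, T, C=max(T))
```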

    De Giorgi et al. [2] chose the NMAPE as the best parameter in order to evaluate the quality of the forecasts in a correct way. The normalization of the differences between predicted power and real power prevents very different errors from having the same weight in the performance evaluation. For example, without normalization, a difference between P_i = 2 kW and T_i = 4 kW would have the same importance as a difference between P_i = 200 kW and T_i = 400 kW. The NMAPE parameter will be used for evaluating the performances of every Model used in this thesis. Analyzing the output data obtained from the created forecasting Models, it was noticed that, in many cases, the output is a negative value. The negative output data will be replaced with a zero value. Even though the NMAPE is a valid indicator of the performances of a forecasting Model, a better statistical analysis of the output data has been proposed: the probability of obtaining an NMAPE value within different ranges of values has been studied, as shown in Tab. 3.3.

    Forecasting Model   Evaluated Probability Ranges
    Model I             1%, 5%, 10%, 20%
    Model II            1%, 5%, 10%, 20%
    Model III           1%, 5%, 10%, 20%

    Table 3.3: Probability ranges for the NMAPE
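A sketch of this range analysis, computing the fraction of samples whose normalized absolute percentage error falls below each threshold (hypothetical data):

```python
def error_range_probability(P, T, C, thresholds=(1, 5, 10, 20)):
    """Fraction of samples [%] whose normalized absolute percentage error
    |P_i - T_i| / C * 100 stays within each threshold (cf. Tab. 3.3)."""
    n = len(P)
    napes = [abs(p - t) / C * 100.0 for p, t in zip(P, T)]
    return {thr: 100.0 * sum(e <= thr for e in napes) / n for thr in thresholds}

# Hypothetical forecast vs. measurement, with C = 400 kW
P = [10.0, 150.0, 390.0, 210.0]
T = [0.0, 200.0, 400.0, 200.0]
prob = error_range_probability(P, T, C=400.0)
```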


  Chapter 4

    Artificial Neural Networks (ANNs)

    Neural networks are composed of simple elements operating in parallel and inspired by the biological nervous system. The complexity of real neurons is highly abstracted when modeling artificial neurons. These basically consist of inputs (like synapses), which are multiplied by weights (the strengths of the respective signals), and then computed by a mathematical function which determines the activation of the neuron. Another function (which may be the identity) computes the output of the artificial neuron (sometimes in dependence on a certain threshold). ANNs combine artificial neurons in order to process information. By adjusting the weights of an artificial neuron we can obtain the desired output for specific inputs, and this process of adjusting the weights is called learning or training.

    Figure 4.1: Artificial Neuron

    4.1 Elman Back-propagation Neural Network

    A first prediction model based on the Elman Neural Network has already been used by P.M. Congedo et al. in order to forecast the productivity of the PV system of the Campus Ecotekne [1]. This kind of Neural Network has a feedback from the output of the first layer to the input of the same layer - as shown in Fig. 4.2



    - thus enabling the detection and generation of time-varying patterns [55]. This characteristic is of great importance as the time-length of the prediction increases. The used scheme consists of three layers of neurons. The number of neurons in each layer is defined in Tab. 4.1. In the first layer the hyperbolic tangent sigmoid transfer function (TANSIG) [56] was applied and in the second layer the linear transfer function (PURELIN) [57] was used. The "gradient descent weight and bias" was used as learning function (LEARNGD) [58] to determine how to adjust the neuron weights to maximize performance.

    Figure 4.2: Typical architecture of an Elman Back Propagation ANN

    The implementation of the Elman Neural Network was done using the Matlab Neural Network Toolbox, Version R2011b. The input data were processed in order to delete wrong acquisitions (negative values and out-of-scale values) and also normalized into the interval [-1; +1]. The parameters used for designing the Elman Neural Network are shown in Tab. 4.1. The Model based on the Elman ANN was called MODEL I, as already shown in Tab. 3.1; INPUT VECTOR I and INPUT VECTOR II are used as inputs for the created Model I.

    4.2 Forecasting with Model I - Input Vector I and II

    In this section, experiments were carried out to evaluate the performance of Model I, and the simulation results are shown in Tab. 4.2. Firstly, the forecasting performance is evaluated with the NMAPE for Model I and Input Vectors I and II.



    Parameter Value

    Training function TRAINGDX

    Adapt learning function LEARNGD

    Number of layers 3

    Neurons (layer 1) 5

    Neurons (layer 2) 5

    Neurons (layer 3) 1

    Activation function hidden layer TANSIG

    Activation function output layer PURELIN

    Epochs 500

    Table 4.1: Training parameters for the Elman ANN
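A minimal sketch (not the thesis implementation, which uses the Matlab toolbox) of a single forward step of an Elman-style network with a TANSIG hidden layer and a PURELIN (linear) output; the random weights stand in for a trained network:

```python
import math
import random

def tansig(x):
    """Hyperbolic tangent sigmoid transfer function (TANSIG)."""
    return [math.tanh(v) for v in x]

def elman_step(x, h_prev, W_in, W_rec, b1, W_out, b2):
    """One forward step of a minimal Elman network:
    h_t = tansig(W_in x + W_rec h_{t-1} + b1),  y = W_out h_t + b2 (PURELIN)."""
    n_hidden = len(b1)
    pre = [sum(W_in[i][j] * x[j] for j in range(len(x)))
           + sum(W_rec[i][k] * h_prev[k] for k in range(n_hidden)) + b1[i]
           for i in range(n_hidden)]
    h = tansig(pre)                                   # hidden state, fed back
    y = sum(W_out[k] * h[k] for k in range(n_hidden)) + b2
    return h, y

random.seed(0)
n_in, n_hidden = 5, 5        # Input Vector II has 5 features; 5 hidden neurons
W_in = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
W_rec = [[random.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_hidden)]
W_out = [random.uniform(-1, 1) for _ in range(n_hidden)]
b1, b2 = [0.0] * n_hidden, 0.0
h, y = elman_step([0.1] * n_in, [0.0] * n_hidden, W_in, W_rec, b1, W_out, b2)
```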

                        NMAPE      NMAPE      NMAPE      NMAPE      NMAPE
                        Horizon    Horizon    Horizon    Horizon    Horizon
                        +1         +3         +6         +12        +24
                        [%]        [%]        [%]        [%]        [%]
    Input Vector I      9.40       15.11      20.18      21.12      18.54
    Input Vector II     6.49       10.37      13.46      14.22      19.60

    Table 4.2: Productivity forecasting results for Model I

    The NMAPE rises as the forecasting horizon increases, except for the forecasting horizon +24 with Input Vector I. The lower NMAPE for horizon +24 with Input Vector I is due to the low correlation of the dataset at a high forecasting horizon. As expected, better forecasting performances can be obtained by using Input Vector II instead of Input Vector I. Figures 4.3 and 4.4 show histograms with the data in Table 4.2, while Figure 4.5 shows a line chart comparison between the NMAPE values obtained with Input Vectors I and II on Model I: using Input Vector II, the NMAPE decrease is larger for high forecasting horizons.

    Figures from 4.6 to 4.9 show a graphical comparison between the real PV power production and the forecasted PV power production for some time samples, and also



    Figure 4.3: NMAPE for Model I with Input Vector I

    Figure 4.4: NMAPE for Model I with Input Vector II

    Figure 4.5: Comparisons between NMAPE for Model I - Input Vectors I and II

    a plot of the corresponding error. Graphs show that using Input Vector I there is

    always a bias error between the forecasted and real PV power (visible when the real



    PV power has zero value). It is also shown that, using Input Vector I, the forecasting model is not very capable of following abrupt changes of the real PV power signal, such as in the case of an unexpected passage of clouds over the PV plant. Using Input Vector II the performance increases: the forecasted power better follows abrupt changes of the real power, and the bias error is also removed. For both Input Vectors and for every forecasting horizon, the forecasted PV power signal presents a delay on the rising edge of the real power signal and an advance on the falling edge of the real power signal. This behavior of the model is also visible in the error signal, which has a sinusoidal shape at real power peaks.

    Tables 4.3 and 4.4 show the probability distribution of the NMAPE for all the forecasting horizons. Figure 4.10 shows the error distribution for Model I with Input Vectors I and II and for all the forecasting horizons. These graphs allow to evaluate the tendency of the Model to underestimate or overestimate the real PV power: generally, using Input Vector I there is an underestimation, while using Input Vector II the error distribution has an average value closer to zero. It can also be inferred from the graphs that the error distributions for long forecasting horizons have higher standard deviation values, especially for the horizon +24 h.

    NMAPE     Probab.    Probab.    Probab.    Probab.    Probab.
    range     (+1 h)     (+3 h)     (+6 h)     (+12 h)    (+24 h)
              [%]        [%]        [%]        [%]        [%]
    1 %       2          2          1          2          3
    5 %       16         9          6          8          17
    10 %      72         56         12         17         34
    20 %      87         76         67         37         61

    Table 4.3: Probability analysis results for Model I with Input Vector I

    Fig. 4.11 shows the error distribution comparison between Input Vectors I and II for Model I: the better performance of Input Vector II, in terms of the mean of the Gaussian distribution, is clearly perceptible. A graphical comparison between the probability distributions of the NMAPE for Model I is also provided in Figures 4.12 and 4.13. As expected, the probability of having a low NMAPE is not very good using Input Vector I, while it rises using Input Vector II. Figure 4.12 shows that, using Input Vector I, a forecasting horizon increase is not always related to a Probability decrease, due to a low correlation between the input and the output data, while Figure 4.13 shows that, using Input Vector II, a forecasting



    Figure 4.6: Comparisons between actual and forecasted PV power and Error for

    Model I with Input Vector I - forecasting horizon +1 hour

    Figure 4.7: Comparisons between actual and forecasted PV power for Model I with

    Input Vector II - forecasting horizon +1 hour



    Figure 4.8: Comparisons between actual and forecasted PV power for Model I with

    Input Vector I - forecasting horizon +6 hour

    Figure 4.9: Comparisons between actual and forecasted PV power for Model I with

    Input Vector II - forecasting horizon +6 hour



    horizon increase is always related to a Probability decrease, for every NMAPE range. In conclusion, Input Vector II achieves the best global performance: a low NMAPE for every forecasting horizon and an error distribution without high overestimation or underestimation of the real PV power. Input Vector I may be used just in case a lower computational time is desired.

    NMAPE     Probab.    Probab.    Probab.    Probab.    Probab.
    range     (+1 h)     (+3 h)     (+6 h)     (+12 h)    (+24 h)
              [%]        [%]        [%]        [%]        [%]
    1 %       29         16         11         7          4
    5 %       64         41         27         21         15
    10 %      78         65         53         44         31
    20 %      91         82         78         78         57

    Table 4.4: Probability analysis results for Model I with Input Vector II



    Figure 4.10: Error distributions for Model I, Input Vectors I and II, for all forecasting horizons (+1 h, +3 h, +6 h, +12 h, +24 h)



    Figure 4.11: Error distribution comparisons between Input Vectors I and II of Model I for all forecasting horizons (+1 h, +3 h, +6 h, +12 h, +24 h)



    Figure 4.12: Absolute error distributions for Model I - Input Vector I (a) and Input Vector II (b)

    Figure 4.13: Absolute error distributions for Model I - Input Vector I (a) and Input Vector II (b)


  Chapter 5

    Support Vector Machines (SVMs)

    5.1 Introduction to Support Vector Machines

    Support Vector Machines (SVMs) - also called Support Vector Networks or Kernel Machines - were designed and developed by the Soviet statistician and mathematician Vladimir Vapnik in the 1990s at the AT&T Bell Laboratories. The algorithm on which they are based falls within the Vapnik-Chervonenkis theory, or Statistical Learning Theory, a supervised learning framework that allows to generalize and classify new elements starting from a base of elements learned in the past. The first industrial applications of SVMs were:

    Optical Character Recognition (OCR)

    Text Classifying

    Objects Recognition

    In order to illustrate the theory of SVMs, a data set of l observations was considered, where every observation consists of a couple (x_i, y_i), with x_i a real input vector and y_i its class label.




    Figure 5.2: Different separating hyperplanes: only H2 is the optimal hyperplane

    so that every point x_i ∈ A is contained in one half-space and every point x_j ∈ B is contained in the other half-space: there exists a vector w and a scalar b defining such a hyperplane. It is possible to rescale these relations, without loss of generality, obtaining:

    w^T x_i + b ≥ 1,   ∀ x_i ∈ A
    w^T x_j + b ≤ -1,  ∀ x_j ∈ B        (5.6)

    Some definitions, lemmas and propositions are now introduced to allow a better comprehension of the concept of separating hyperplane; in particular, the definition of Separation Margin will be introduced.

    Definition 5.1. Let H be a separating hyperplane. The Separation Margin of H is the minimum distance between the points in A ∪ B and the given hyperplane:

    ρ(w, b) = min_{x_i ∈ A ∪ B} { |w^T x_i + b| / ||w|| }        (5.7)



    Definition 5.2. An Optimal Hyperplane H*(w*, b*) is the separating hyperplane having the maximum Separation Margin. The optimal hyperplane is found by solving the optimization problem:

    (w*, b*) = arg max_{w, b} ρ(w, b)


    Proposition 5.1.4. It is demonstrated that if (w*, b*) is the solution of the optimization problem, then (w*, b*) is the only solution for that problem.


    4. y_i (x_i · w + b) - 1 ≥ 0,  i = 1, ..., l

    Where:

    5. L_p(w, b, α) = (1/2)||w||^2 - Σ_{i=1}^{l} α_i y_i (x_i · w + b) + Σ_{i=1}^{l} α_i  is the Lagrangian of the problem

    Through mathematical steps, not shown in this thesis to avoid burdening the discussion, it is possible to determine the Lagrangian (dual) form of the optimization problem:

    min Φ(α) = (1/2) Σ_{i=1}^{l} Σ_{j=1}^{l} y_i y_j (x_i^T x_j) α_i α_j - Σ_{i=1}^{l} α_i

    s.t.  Σ_{i=1}^{l} α_i y_i = 0
          α_i ≥ 0,  i = 1, ..., l        (5.15)

    The training vectors x_i are also called Support Vectors. Support Vectors have non-null corresponding multipliers α_i*.

    From Eq. 5.15 it is possible to obtain the following classification function, which allows to classify the x vectors after a learning phase:

    f(x) = sign((w*)^T x + b*) = sign( Σ_{i=1}^{l} α_i* y_i (x_i^T x) + b* )        (5.16)

    Where x_i are the support vectors, α_i* are the related Lagrange coefficients and b* is a constant.

    5.2 SVM for Regression Models - SVR

    Suppose that we want to approximate a linear relation between a set of input data (x vectors) and output observations (y vector), using as linear estimator a function f:

    f(x) = w^T x + b        (5.17)


    ε > 0 is the precision used to approximate the function; ε is called the tube size. The model estimation is correct if:

    |y_i - w^T x_i - b| ≤ ε        (5.18)

    The ε-insensitive Loss Function is introduced:

    |y - f(x; w, b)|_ε = max{ 0, |y - f(x; w, b)| - ε }        (5.19)

    and the Training Error is also defined:

    E = Σ_{i=1}^{l} |y_i - f(x_i; w, b)|_ε        (5.20)
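The ε-insensitive loss (Eq. 5.19) and the training error (Eq. 5.20) can be sketched directly; the residual values are illustrative:

```python
def eps_insensitive_loss(y, f_x, eps=0.1):
    """epsilon-insensitive loss (Eq. 5.19): zero inside the eps-tube,
    linear outside it."""
    return max(0.0, abs(y - f_x) - eps)

def training_error(ys, f_xs, eps=0.1):
    """Training error (Eq. 5.20): sum of the losses over all samples."""
    return sum(eps_insensitive_loss(y, f, eps) for y, f in zip(ys, f_xs))

# A residual of 0.05 inside the tube costs nothing; 0.3 costs 0.3 - 0.1 = 0.2
inside = eps_insensitive_loss(1.0, 1.05)
outside = eps_insensitive_loss(1.0, 1.3)
```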

    The training error is zero only if the following system of inequalities is satisfied:

    w^T x_i + b - y_i ≤ ε
    y_i - w^T x_i - b ≤ ε        i = 1, ..., l        (5.21)

    The artificial (slack) variables ξ_i, ξ_i* are introduced in 5.22, with i = 1, ..., l:

    w^T x_i + b - y_i ≤ ε + ξ_i
    y_i - w^T x_i - b ≤ ε + ξ_i*
    ξ_i, ξ_i* ≥ 0        (5.22)

    It is noticed that the term:

    Σ_{i=1}^{l} (ξ_i + ξ_i*)        (5.23)

    is an upper bound for the training error. As for the linear SVM for classification problems, the following problem is studied:



    min_{w, b, ξ, ξ*}  (1/2)||w||^2 + C Σ_{i=1}^{l} (ξ_i + ξ_i*)

    s.t.  w^T x_i + b - y_i ≤ ε + ξ_i
          y_i - w^T x_i - b ≤ ε + ξ_i*
          ξ_i, ξ_i* ≥ 0,  i = 1, ..., l        (5.24)

    In SVR, the parameters w and b are estimated by minimizing the regularized risk function:

    min_{w, b, ξ, ξ*}  (1/2)||w||^2 + C Σ_{i=1}^{l} (ξ_i + ξ_i*)        (5.25)

    Where the first term (1/2)||w||^2 represents the regularization term (or complexity penalizer) and C Σ_{i=1}^{l} (ξ_i + ξ_i*) is the empirical risk, with the error calculated using the ε-insensitive loss function given by 5.19. The regularization constant C determines the trade-off between the model complexity and the training error. ε, known as the tube size, controls the deviation of f(x) from y. Both C and ε are parameters specified by the user. This regularized risk function is the key in balancing the needs between learning accuracy and capacity for learning. It turns out that the optimization in Eq. 5.24 can be solved more easily in its dual formulation. This can be done with the introduction of the Lagrange multipliers α, α*, μ, μ*. It can be shown [24-26] that this function has a saddle point with respect to the primal and dual variables at the optimal solution: it is minimum at the saddle point with respect to the primal variables and maximum with respect to the dual variables. The dual of problem 5.24 is the following problem:



    max L(w, b, ξ, ξ*, α, α*, μ, μ*) = (1/2)||w||^2 + C Σ_{i=1}^{l} (ξ_i + ξ_i*)
        - Σ_{i=1}^{l} (μ_i ξ_i + μ_i* ξ_i*)
        + Σ_{i=1}^{l} α_i [ w^T x_i + b - y_i - ε - ξ_i ]
        + Σ_{i=1}^{l} α_i* [ y_i - w^T x_i - b - ε - ξ_i* ]

    s.t.  ∇_w L = 0
          ∂L/∂b = 0
          ∂L/∂ξ_i = 0,  i = 1, ..., l
          ∂L/∂ξ_i* = 0,  i = 1, ..., l
          α, α* ≥ 0,  μ, μ* ≥ 0        (5.26)

    Or rather, collecting the terms:

    max L(w, b, ξ, ξ*, α, α*, μ, μ*) = (1/2)||w||^2 + C Σ_{i=1}^{l} (ξ_i + ξ_i*)
        - Σ_{i=1}^{l} ((α_i + μ_i) ξ_i + (α_i* + μ_i*) ξ_i*)
        - ε Σ_{i=1}^{l} (α_i + α_i*)
        + Σ_{i=1}^{l} (α_i - α_i*)(w^T x_i + b - y_i)

    s.t.  ∇_w L = 0
          ∂L/∂b = 0
          ∂L/∂ξ_i = 0,  i = 1, ..., l
          ∂L/∂ξ_i* = 0,  i = 1, ..., l
          α, α* ≥ 0,  μ, μ* ≥ 0        (5.27)



    The problem above can be written in the following form:

    min W(α, α*) = (1/2) Σ_{i=1}^{l} Σ_{j=1}^{l} (α_i - α_i*)(α_j - α_j*)(x_i^T x_j)
        - Σ_{i=1}^{l} (α_i - α_i*) y_i + ε Σ_{i=1}^{l} (α_i + α_i*)

    s.t.  Σ_{i=1}^{l} (α_i - α_i*) = 0
          0 ≤ α_i ≤ C,  i = 1, ..., l
          0 ≤ α_i* ≤ C,  i = 1, ..., l        (5.28)

    The linear estimator in Eq. 5.17 can be expressed as:

    f(x) = Σ_{i=1}^{l} (α_i - α_i*)(x_i^T x) + b        (5.29)

    It is demonstrated that the Lagrange multipliers can only be non-zero when |f(x) - y| ≥ ε. This means that (α_i, α_i*) will be zero for data lying inside the ε-tube; hence the data points falling outside the tube, which have non-zero (α_i, α_i*) and give a sparse representation, are known as the Support Vectors. However, obtaining a sparser representation by increasing the tube size ε degrades the accuracy of the approximation. Therefore, ε presents a trade-off between sparseness of the representation and accuracy.

    5.3 Loss Functions

    Loss functions are used as a measure of the error between estimated and actual values. The choice of the loss function depends very much on the problem at hand. A discussion on general loss functions in SVMs can be found in [27]. Fig. 5.3 shows the ε-insensitive loss function used in regression. It is proved that the optimal choice of the loss function in regression is actually related to the noise density distribution in the data [4]. It is also proved [28] that the ε-insensitive loss function is the best choice for additive Gaussian noise whose variance and mean are random.

    5.4 Nonlinear SVR using kernel

    The capability of SVM can be further extended to enable the learning of non-

    linear functions. The input data can be mapped from the input space to a higher



    Figure 5.3: ε-insensitive loss function

    dimensional feature space using a mapping function φ(x). The linear hyperplane estimator can be written as:

    y = f(x) = w^T φ(x) + b        (5.30)

    Using this approach, computation and generalization problems are found:

    The over-fitting of the data, due to the increase of the number of features used in mapping the data into a higher dimension, can cause a degradation of the generalization performance.

    The increase of the features used in the mapping also results in an increase of the computational resources needed to evaluate the features.

    The problem of the generalization performance can be solved, from the perspective of statistical learning theory [4], by limiting the capacity for learning from a small data set in a rich feature space. This can be done with the introduction of a capacity control term ||w||^2, leading to the regularized risk functional in Eq. 5.25 for our case here. This means that the SVM can generalize well regardless of the dimension of the feature space used. The second problem (computational resources) can be solved by using kernel functions. One important property of linear learning machines is that they can be expressed in a dual representation (Eq. 5.28). This dual function is expressed by the dot product of the training data points, and the decision of this function is found by evaluating dot products between training and test data points. The Duality Theory and the use of kernel functions allow to generalize the discussion to non-linear regression models, similarly to what was done for the classification problems. In particular, the training problem for an SVM for non-linear regression is defined as follows:



    min W(α, α*) = (1/2) Σ_{i=1}^{l} Σ_{j=1}^{l} (α_i - α_i*)(α_j - α_j*) k(x_i, x_j)
        - Σ_{i=1}^{l} (α_i - α_i*) y_i + ε Σ_{i=1}^{l} (α_i + α_i*)

    s.t.  Σ_{j=1}^{l} (α_j - α_j*) = 0
          0 ≤ α_i ≤ C,  i = 1, ..., l
          0 ≤ α_i* ≤ C,  i = 1, ..., l        (5.31)

    Where k(x, z) is a kernel function. The formulated problem is a CQP and the solution (α, α*) allows to define the regression function in the following form:

    f(x) = Σ_{i=1}^{l} (α_i - α_i*) k(x, x_i) + b        (5.32)

    where b can be determined by using the complementarity conditions. Any kernel satisfying Mercer's conditions [30] can be used as a kernel function. Some of the commonly used kernels are:

    Linear kernel:

    K(x, y) = (x · y)        (5.33)

    Polynomial kernel:

    K(x, y) = (1 + x · y)^d        (5.34)

    Radial basis function (RBF):

    K(x, y) = exp(-γ ||x - y||^2)        (5.35)

    Multi-Layer Perceptron:

    K(x, y) = tanh(b (x · y) - c)        (5.36)

    Gaussian radial basis function:

    K(x, y) = exp( -||x - y||^2 / (2σ^2) )        (5.37)
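The first three kernels can be sketched in Python as follows (the default values of d and γ are arbitrary):

```python
import math

def linear_kernel(x, y):
    """K(x, y) = x . y  (Eq. 5.33)"""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, d=2):
    """K(x, y) = (1 + x . y)^d  (Eq. 5.34)"""
    return (1.0 + linear_kernel(x, y)) ** d

def rbf_kernel(x, y, gamma=1.0):
    """K(x, y) = exp(-gamma * ||x - y||^2)  (Eq. 5.35)"""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

x, y = [1.0, 0.0], [0.0, 1.0]
```

Note that the RBF kernel of any point with itself is 1 and decays towards 0 as the points move apart, which is what produces the closed decision surfaces discussed below.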



    It is evident that the implicit mapping of a kernel offers a cheap way for the SVM to construct a variety of non-linear functions. Not only does a kernel characterize the features of the input data, but having a well defined kernel can also greatly improve the performance of the SVM. Therefore, studies on kernels constitute an important area of research in SVMs, and the training of an SVR is more complex than the training of an SVM for classification. It has been found that RBF kernels are the most powerful. Unlike the polynomial kernels, RBF kernels can produce very closed decision surfaces, which is a useful property in case of classifying a dataset with one class completely encapsulated in the data of another class. The only drawback is the higher computational time. To use a Support Vector Machine it is necessary to define:

    the Kernel Type;

    the Kernel Parameters;

    the value of C.

    No theoretical criteria are available to define all these parameters. The typical procedure involves a validation on a validation dataset using Cross-Validation algorithms.

    5.5 Least Square Support Vector Machine for Regression

    In [4] and [10] a modified form of the SVM algorithm was proposed, called Least Square Support Vector Machine (LS-SVM), in order to reduce the computing time of the SVMs. The training of the LS-SVM is simpler because it requires the solution of a set of linear equations (linear KKT systems). LS-SVMs are closely related to regularization networks and Gaussian processes, but additionally emphasize and exploit primal-dual interpretations. For the productivity forecasting of this thesis, the Radial Basis Function (RBF) kernel is used. In the literature, many tests and comparisons showed great performances of LS-SVMs on several benchmark data set problems and were very encouraging for further research in this promising direction [10].

    A model in the primal weight space is considered:


    Figure 5.4: LS-SVM: an interdisciplinary topic

y(x) = wᵀ φ(x) + b    (5.38)

where x ∈ ℝⁿ is the input and φ(·) is the mapping to the high-dimensional feature space.


∂L/∂w = 0   →   w = Σ_{k=1}^{N} α_k φ(x_k)
∂L/∂b = 0   →   Σ_{k=1}^{N} α_k = 0
∂L/∂e_k = 0   →   α_k = γ e_k,   k = 1, ..., N
∂L/∂α_k = 0   →   wᵀ φ(x_k) + b + e_k − y_k = 0,   k = 1, ..., N

(5.41)

After elimination of the variables w and e and application of the kernel trick, the resulting LS-SVM model for function estimation becomes:

y(x) = Σ_{k=1}^{N} α_k K(x, x_k) + b    (5.42)

Note that in the case of RBF kernels one has only two additional tuning parameters (γ, σ²), which is less than for standard SVMs. Fig. 5.5 shows a time series prediction on the Santa Fe chaotic laser data set [31] using an LS-SVM with RBF kernel.

    Figure 5.5: Time series prediction by LS-SVM with RBF kernel [9]
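Since LS-SVM training reduces to the linear KKT system above, the dual solution and the predictor of Eq. (5.42) can be sketched compactly. The following NumPy sketch is an illustrative re-implementation, not the thesis code (which uses the LS-SVMlab Matlab toolbox); the helper names and the toy values of gam and sig2 are assumptions:

```python
import numpy as np

def lssvm_train(X, y, gam=10.0, sig2=0.5):
    """Solve the LS-SVM dual linear system for (alpha, b).
    gam (regularisation) and sig2 (RBF width) are illustrative values."""
    N = X.shape[0]
    # Kernel (Gram) matrix with an RBF kernel: K_ij = exp(-||xi - xj||^2 / (2*sig2))
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2.0 * sig2))
    # Linear KKT system (Eq. 5.41 after elimination of w and e):
    # [[0, 1^T], [1, K + I/gam]] [b; alpha] = [0; y]
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(N) / gam
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[1:], sol[0]          # alpha, b

def lssvm_predict(Xtr, alpha, b, Xte, sig2=0.5):
    # Eq. (5.42): y(x) = sum_k alpha_k K(x, x_k) + b
    d2 = ((Xte[:, None, :] - Xtr[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sig2)) @ alpha + b
```

For large gam the ridge term I/gam vanishes and the fitted values approach the training targets, which is why gam trades off smoothness against fit, exactly as C does in the standard SVM.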

    5.6 LsSVM Matlab Toolbox

The commercial software Mathworks Matlab R2011b includes many functions to implement an SVM model for classification and regression, while a Least Square SVM cannot be implemented without installing a specific external toolbox. In particular, the LS-SVMlab Toolbox Version 1.8, created by K. De Brabanter et al. [9] at the ESAT-SISTA research division of the Electrical Engineering department of the Katholieke Universiteit Leuven, was chosen. The LS-SVMlab Toolbox is


compiled and tested for different computer architectures, including Linux and Windows. Most functions can handle datasets of up to 20,000 data points or more. The LS-SVMlab interface for Matlab consists of a basic version for beginners as well as a more advanced version with programs for multiclass encoding techniques and a Bayesian framework. This section shows how to obtain an LS-SVM model for classification or regression [9] (Fig. 5.6), while the complete Matlab source code of the implemented LS-SVM model is shown in Appendix A:

Choose between the functional or object-oriented interface (initlssvm);

Search for suitable tuning parameters (tunelssvm);

Train the model given the previously determined tuning parameters (trainlssvm);

Simulate the model, e.g. on test data (simlssvm);

Visualize the results when possible (plotlssvm).

    Figure 5.6: List of commands for obtaining an LS-SVM model

    5.7 Forecasting with Model II - Input Vector I

In this section, experiments were carried out to evaluate the performance of Model II using the Matlab R2011b software, with a comparison against the results obtained by Model I on the same data sets. Firstly, the performance of the created Model is evaluated with the Normalized Mean Absolute Percentage Error (NMAPE) (see Section 3.5) for each forecasting horizon. The Model that uses the LS-SVM is called MODEL II (see Table 3.1). The LS-SVM training was carried out with several repetitions of the training procedure, in order to obtain a well-performing pair of γ and σ² parameters (i.e., to obtain lower NMAPE values); the final values of the LS-SVM parameters are listed in Table 5.1: these parameters proved to be


optimal for every forecasting horizon and for every Input Vector. The results of the NMAPE evaluation are shown in Table 5.2 and Figure 5.7.
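The NMAPE metric is defined earlier in the thesis; a commonly used formulation, assumed for this sketch, normalizes the mean absolute forecast error by a fixed reference power. The normalization constant p_rated and the sample values below are illustrative:

```python
def nmape(p_forecast, p_real, p_rated):
    """Normalized Mean Absolute Percentage Error [%].
    Assumed form: mean(|forecast - real|) / p_rated * 100,
    with a fixed rated (or maximum) PV power as normalization,
    so that night hours with zero output do not blow up the metric."""
    n = len(p_real)
    return 100.0 * sum(abs(f - r) for f, r in zip(p_forecast, p_real)) / (n * p_rated)

# Example: four hourly samples of a hypothetical plant rated at 960 kW
print(nmape([100, 250, 400, 0], [120, 240, 380, 0], p_rated=960.0))
```

Normalizing by a constant (rather than by the instantaneous real power, as in the plain MAPE) is what keeps the error bounded when the real PV power is zero.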

γ        σ²
22.8     2600

Table 5.1: Parameters of the LS-SVM based Model

                  NMAPE       NMAPE       NMAPE       NMAPE        NMAPE
                  Horizon +1  Horizon +3  Horizon +6  Horizon +12  Horizon +24
                  [%]         [%]         [%]         [%]          [%]
Input Vector I    7.53        13.62       18.22       21.11        18.52

Table 5.2: Productivity forecasting results for Model II with Input Vector I

    Figure 5.7: NMAPE values for Model II with Input Vector I

Simulation results show that, using Input Vector I, a longer forecasting horizon does not always correspond to an NMAPE increase. For example, for the forecasting horizon +24 hours, the NMAPE value is lower than the value for the


+12 hours horizon; this may depend on the low data correlation at long forecasting horizons. This model behavior is the same as that obtained with Model I and Input Vector I. Fig. 5.8 shows a comparison between the NMAPE values obtained with Models I and II using Input Vector I. Model II is better for very short forecasting horizons, while for the +12 and +24 hours horizons the performance of Model II is almost the same as that of Model I.

    Figure 5.8: Comparison of the NMAPE values between Model I and II with Input

    Vector I

Figures 5.9 and 5.10 show a graphical comparison between the real power production and the forecasted power production for some time samples, together with a plot of the corresponding error. The graphs show that using Input Vector I there is always a bias error between the forecasted and real PV power (visible when the real PV power has zero value), but the error is smaller than that evaluated using Model I, especially for short forecasting horizons, where the bias error is almost zero. It is also shown that using Input Vector I the forecasting model is not well able to follow abrupt changes of the real PV power signal, such as in the case of an unexpected passage of clouds over the PV plant. In addition, for all the forecasting horizons, the forecasted PV power signal presents a delay on the rising edge of the real power signal and an advance on the falling edge. This behavior of the model is also visible in the error signal, which has a sinusoidal shape around real power peaks, as also shown for Model I.

Table 5.3 shows the probability distribution of the NMAPE for all the forecasting horizons. Figure 5.11 shows the error distribution for Model II with Input


    Figure 5.9: Comparisons between actual and forecasted PV power and Error for

    Model II with Input Vector I - forecasting horizon +1 hour

    Figure 5.10: Comparisons between actual and forecasted PV power for Model II

    with Input Vector I - forecasting horizon +6 hour


NMAPE      Probab.   Probab.   Probab.   Probab.    Probab.
range      (+1 h)    (+3 h)    (+6 h)    (+12 h)    (+24 h)
           [%]       [%]       [%]       [%]        [%]
≤ 1 %      45        2         1         2          3
≤ 5 %      57        10        5         8          17
≤ 10 %     70        61        12        17         34
≤ 20 %     88        77        70        37         61

Table 5.3: Probability analysis results for Model II with Input Vector I
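Cumulative probability tables of this kind can be reproduced by counting, for each band, the share of samples whose percentage error lies within it. A sketch with made-up error values (the function name and the data are illustrative, not thesis results):

```python
def probability_analysis(errors_pct, thresholds=(1, 5, 10, 20)):
    """Share of samples [%] whose absolute percentage error falls inside
    each +/- threshold band (cumulative, as in Table 5.3)."""
    n = len(errors_pct)
    return {t: 100.0 * sum(1 for e in errors_pct if abs(e) <= t) / n
            for t in thresholds}

# Hypothetical hourly percentage errors for one forecasting horizon
errs = [0.5, -3.0, 7.0, -12.0, 0.8, 18.0, -0.2, 4.5, 9.0, -25.0]
print(probability_analysis(errs))
```

Because the bands are nested, the counts are cumulative by construction: the value for ±20% can never be lower than the value for ±10% at the same horizon.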

    (a) Horizon +1h (b) Horizon +3h

    (c) Horizon +6h (d) Horizon +12h

    (e) Horizon +24h

    Figure 5.11: Error distribution for Model II - Input Vector I - all Horizons

Vector I and for all the forecasting horizons. These graphs allow evaluating the tendency of the Model to underestimate or overestimate the real PV power: generally, using Input Vector I there is an underestimation, so the error distribution behaves like a Gaussian with its mean value shifted to the left of zero.

It can also be inferred from the graphs that the error distributions for long forecasting horizons have a higher standard deviation, especially for the +24 h horizon. Fig. 5.12 shows the error distribution comparison between Models I and II with Input Vector I: the better performance of Model II in terms of the mean of the Gaussian distribution is perceptible; using Model II it is possible to reduce the model underestimation, but for long forecasting horizons the results are very similar to those obtained with Model I, so there is a large shift of the distribution mean. The +24 h forecasting horizon is always the critical one because of the very high standard deviation of the error distribution.

    (a) Horizon +1h (b) Horizon +3h

    (c) Horizon +6h (d) Horizon +12h

    (e) Horizon +24h

    Figure 5.12: Error distribution for Model I and II- Input Vector I - all Horizons

    A graphical comparison between the probability distributions of the NMAPE for


Model II with Input Vector I is also provided in Figs. 5.13 and 5.14. Fig. 5.13 shows that, using Input Vector I, an increase of the forecasting horizon is not always related to a decrease in probability; however, for a fixed forecasting horizon the probability always increases with the probability range, since cumulative probability values are considered. It is clear that the results for very short forecasting horizons are very good: for the [-20%; +20%] range the probability reaches 88% for the +1h horizon, 77% for +3h and 70% for +6h, while for long forecasting horizons the results are quite poor and not acceptable for a good forecasting system.

In conclusion, Input Vector I with Model II achieves the best performance only for short forecasting horizons, even though its global performance is better than that reached with the same Input Vector using Model I.

    Figure 5.13: Comparisons between NMAPE probability for Model II - Input Vector

    I

    5.8 Forecasting with Model II - Input Vector II

This section shows the results of the simulations using forecasting Model II and Input Vector II (see Tables 3.1 and 3.2) for every forecasting horizon. Firstly, the performance of the created Model is evaluated with the Normalized Mean Absolute Percentage Error (NMAPE) (see Section 3.5) for each forecasting horizon. The training parameters γ and σ² are the same as those used for Input Vector I and are listed in Tab. 5.1. The results of the NMAPE evaluation are shown in Tab. 5.4 and Fig. 5.15.

Using Model II and Input Vector II, an increase of the forecasting horizon


    Figure 5.14: Comparisons between NMAPE probability for Model II - Input Vector

    I

                   NMAPE       NMAPE       NMAPE       NMAPE        NMAPE
                   Horizon +1  Horizon +3  Horizon +6  Horizon +12  Horizon +24
                   [%]         [%]         [%]         [%]          [%]
Input Vector I     7.53        13.62       18.22       21.11        18.52
Input Vector II    6.40        10.18       13.49       14.53        19.50

Table 5.4: Productivity forecasting results for Model II with Input Vector II

is always related to an increase of the NMAPE value, similarly to Model I with Input Vector II. Fig. 5.16 shows a comparison of the NMAPE performance using Model II with Input Vectors I and II: the performance is better for the +1, +3, +6 and +12 hours forecasting horizons using Input Vector II, while for the +24 h horizon the comparison cannot be made, due to the untrusted NMAPE value obtained with Input Vector I. Fig. 5.17 shows a comparison between the NMAPE values obtained using Models I and II with Input Vector II: the performance of the two Models is practically the same.

Figures 5.18 and 5.19 show a graphical comparison between the real PV power production and the forecasted PV power production for some time samples, together with a plot of the corresponding error. The graphs show that using Input Vector II there is no longer the bias error between the forecasted and real PV power observed with Input Vector I (when the real PV power has zero value). It is also shown that


    Figure 5.15: NMAPE values for Model II with Input Vector II

    Figure 5.16: Comparison between NMAPE values of Model II with Input Vector I

    and II

using Input Vector II the forecasting model is more capable than with Input Vector I of following abrupt changes of the real PV power signal, such as in the case of an unexpected passage of clouds over the PV plant. In addition, for all the forecasting horizons, the forecasted PV power signal presents a delay on the rising edge of the real power signal and an advance on the falling edge. This behavior of the model


    Figure 5.17: Comparison between NMAPE values of Model I and II with Input

    Vector II

is also visible in the error signal, which has a sinusoidal shape around real power peaks, as also shown for Model I (Sect. 4.2).

NMAPE      Probab.   Probab.   Probab.   Probab.    Probab.
range      (+1 h)    (+3 h)    (+6 h)    (+12 h)    (+24 h)
           [%]       [%]       [%]       [%]        [%]
≤ 1 %      38        22        17        10         4
≤ 5 %      62        48        41        25         15
≤ 10 %     77        65        58        44         31
≤ 20 %     91        82        77        75         57

Table 5.5: Probability analysis results for Model II with Input Vector II

Table 5.5 shows the probability distribution of the NMAPE for all the forecasting horizons. Figure 5.20 shows the error distribution for Model II with Input Vector II and for all the forecasting horizons. These graphs allow evaluating the tendency of the Model to underestimate or overestimate the real PV power: generally, using Input Vector II there is a slight overestimation for the +1h horizon and a slight underestimation for the other horizons. It can also be inferred from the graphs that the error distributions for long forecasting horizons have


    Figure 5.18: Comparisons between actual and forecasted PV power and Error for

    Model II with Input Vector II - forecasting horizon +1 hour

    Figure 5.19: Comparisons between actual and forecasted PV power for Model II

    with Input Vector II - forecasting horizon +6 hour


    (a) Horizon +1h (b) Horizon +3h

    (c) Horizon +6h (d) Horizon +12h

    (e) Horizon +24h

    Figure 5.20: Error distribution for Model II - Input Vector II - all Horizons

a higher standard deviation, especially for the +24 h horizon. Fig. 5.21 shows the error distribution comparison between Models I and II with Input Vector II: in this case the better performance of Model II in terms of the mean of the Gaussian distribution is perceptible for the +1h, +3h and +6h horizons but barely perceptible for the +12h and +24h horizons: using Model II it is possible to slightly reduce the model underestimation. The +24 h forecasting horizon is always the critical one because of the very high standard deviation of the error distribution, and no perceptible improvement is obtained using the LS-SVM regression model.

A comparison between the probability distributions of the NMAPE for Model II is shown in Figs. 5.22 and 5.23. The probability of obtaining a low NMAPE using Input Vector II is very good for every forecasting horizon; in particular, the probability of an NMAPE lower than 1% is almost 40% for the forecasting


    (a) Horizon +1h (b) Horizon +3h

    (c) Horizon +6h (d) Horizon +12h

    (e) Horizon +24h

Figure 5.21: Error distribution for Model I and II - Input Vector II - all Horizons

horizon +1, and in the [-20%; +20%] range it is 91%. Fig. 5.23 shows that, using Input Vector II, an increase of the forecasting horizon is always related to a decrease in probability, as also noted for Model I with Input Vector II. In conclusion, the performance is clearly better than that obtained with Input Vector I and slightly better than that reached using Model I with Input Vector II.


    Figure 5.22: Probability analysis results for Model II with Input Vector II

Figure 5.23: Probability analysis results for Model II with Input Vector II


Chapter 6

LS-SVM with Wavelet Transform

6.1 Fourier transform and short-term Fourier transform

    Figure 6.1: Fourier transform: from time domain to frequency domain

The Fourier Transform makes it possible to move a signal representation from the time domain to the frequency domain. The new frequency-domain representation is useful for signal analysis, even though time information is lost: it is no longer possible to determine "when" a particular event happened. The Direct and Inverse Fourier Transforms have the mathematical representation shown in Eq. 6.1.

F(ω) = ∫_{−∞}^{+∞} f(t) e^{−jωt} dt

f(t) = (1/2π) ∫_{−∞}^{+∞} F(ω) e^{jωt} dω    (6.1)

The Fourier Transform evaluates the weight of the different frequencies in a signal. Even though a signal is not stationary, but only stationary for short time intervals, the


spectrum of this signal can be calculated by "moving" a "stationary signal window" over consecutive signal segments, in order to realize a Short Term Fourier Transform (STFT). It is advisable to overlap the windows during their movement, in order to obtain a better interpolated representation of the signal.

    Figure 6.2: Short Term Fourier Transform (STFT)

The Short Term Fourier Transform is a compromise between time and frequency, but its precision depends on the window amplitude, and the amplitude cannot be varied: it is constant for every frequency. Therefore, a window able to adapt its scale to the requirements in the time and frequency domains is necessary.
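The sliding-window idea can be sketched as follows; the window length, the hop size (50% overlap, as suggested above) and the toy two-tone signal are illustrative choices, not taken from the thesis:

```python
import numpy as np

def stft(signal, win_len=64, hop=32):
    """Short Term Fourier Transform: FFT of overlapping windowed segments.
    win_len is constant for all frequencies - the STFT limitation noted above."""
    window = np.hanning(win_len)
    frames = []
    for start in range(0, len(signal) - win_len + 1, hop):
        segment = signal[start:start + win_len] * window
        frames.append(np.fft.rfft(segment))
    return np.array(frames)          # shape: (n_frames, win_len // 2 + 1)

# A toy non-stationary signal: 5 Hz for 1 s, then 20 Hz for 1 s (fs = 128 Hz)
fs = 128
t = np.arange(2 * fs) / fs
sig = np.where(t < 1, np.sin(2 * np.pi * 5 * t), np.sin(2 * np.pi * 20 * t))
S = stft(sig)
```

Each row of |S| shows which frequency dominates in the corresponding time window, recovering the "when" information that the plain Fourier Transform discards.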

    6.2 Continuous Wavelet transform

The Wavelet Transform uses adaptive windows in order to improve on the results obtainable with the STFT. Adaptive windows enclose long time intervals to analyze low frequencies and short time intervals to analyze high frequencies. A signal is expressed as a combination of children wavelets, obtained by shifting and scaling a mother wavelet.

C(scale, shift) = ∫_{−∞}^{+∞} s(t) ψ(scale, shift, t) dt    (6.2)

From a generic wavelet ψ(a, b, t), where a and b are the scaling and shifting factors, the Continuous Wavelet Transform (CWT) is defined as the integral of the signal s(t) multiplied by the scaled wavelet [21]:

W(a, b) = (1/√a) ∫_{−∞}^{+∞} s(t) ψ*((t − b)/a) dt    (6.3)

where ψ* corresponds to the complex conjugate of the wavelet function ψ. Similarly to the Fourier Transform, the Wavelet Transform also approximates a given signal s(t) by


using a basis function, except that the basis function here is a small wave instead of a continuous sinusoidal function.

    Figure 6.3: Wavelet Transform (source: Mathworks website)

Wavelet scaling consists of stretching and compressing the mother wavelet: the smaller the scale factor, the more compressed the wavelet. Wavelet shifting consists of delaying or anticipating the mother wavelet: if ψ(t) is the original wavelet, ψ(t − k) is its k-delayed version.

    Figure 6.4: Continuous Wavelet Transform Process

The following presents a simplified way to generate a CWT:

Take an arbitrary wavelet function and compare it with the signal s(t);

Calculate the similarity coefficient C, which evaluates the similarity between the signal window and the wavelet;

Shift the wavelet function by b, then compare and calculate the similarity coefficient again, until the end of the signal;

Proceed to the next scale by stretching the wavelet function by a;

Repeat the last four points for every scale.
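The steps above translate almost directly into code. The sketch below uses a real-valued Mexican-hat (Ricker) mother wavelet as an illustrative choice; the thesis does not prescribe one at this point:

```python
import numpy as np

def ricker(t):
    # Mexican-hat mother wavelet (illustrative choice)
    return (1 - t ** 2) * np.exp(-t ** 2 / 2)

def cwt(signal, scales):
    """Naive CWT: for every scale a, slide the scaled wavelet over the
    signal (shift b) and store the similarity coefficient W(a, b)."""
    n = len(signal)
    coeffs = np.empty((len(scales), n))
    t = np.arange(n)
    for i, a in enumerate(scales):
        for b in range(n):                          # shift
            psi = ricker((t - b) / a) / np.sqrt(a)  # Eq. (6.3) kernel, real-valued
            coeffs[i, b] = np.sum(signal * psi)     # similarity coefficient
    return coeffs
```

Rows of the result correspond to scales (the ordinate of Fig. 6.5) and columns to time shifts (the abscissa); a large |W(a, b)| marks where the signal resembles the wavelet at that scale.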

The CWT is the sum of signal windows multiplied by scaled and shifted versions of the wavelet. The wavelet coefficients are the result of a regression operation on the original signal. The graphical representation of the coefficients (Fig. 6.5) is obtained by plotting time on the abscissa axis and the scale of every coefficient on the ordinate axis.


Differently colored pixels are used to represent the position and the modulus of each coefficient. High scales correspond to more stretched wavelets, and the greater the stretching, the longer the signal window compared with the wavelet.

    Figure 6.5: Wavelet Coecients 2D graphical representation

    Figure 6.6: Continuous Wavelet Transform (source: Mathworks website)

    6.3 Discrete Wavelet transform

In the continuous wavelet transform, the wavelet function is stretched and shifted along the signal in a continuous manner. This represents an enormous amount of


work, and there is some redundancy in the data, in the sense that there is more than enough information to reconstruct the original signal. It turns out that if the scales and shifts are discretised based on powers of two (so-called dyadic scales and positions), the computation of the transform becomes more efficient without any loss in accuracy. The Discrete Wavelet Transform decomposes the original signal into DETAILS (high-frequency components) and APPROXIMATIONS (low-frequency components).

    Figure 6.7: Details and Approximations decomposition with subsampling

Even if two signals are obtained from the original signal (Details and Approximations), a sub-sampling operation is performed in order to keep only one sample out of every two processed samples (Figures 6.7 and 6.8). The sub-sampling operation is done by doubling the sampling period. If the scales and shifts are discretised based on powers of two, the following Discrete Wavelet Transform is obtained:

s(t) = Σ_{j=−∞}^{+∞} Σ_k c_{j,k} 2^{j/2} ψ(2^j t − k)    (6.4)

where the wavelet functions ψ(2^j t − k) are 2^j-scaled and k-translated versions of the original wavelet ψ(t). With dyadic scaling, every decomposition corresponds to a halving of the data (sub-sampling).
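One decomposition level with sub-sampling can be sketched with the simplest filter pair, the Haar averaging/differencing filters, used here only for illustration (the thesis proceeds with the Daubechies db4 filters in the next section):

```python
import math

def haar_level1(f):
    """One decomposition level: approximations (low-pass averages) and
    details (high-pass differences), each half the input length - the
    sub-sampling keeps one output sample per input pair."""
    assert len(f) % 2 == 0
    approx = [(f[2 * i] + f[2 * i + 1]) / math.sqrt(2) for i in range(len(f) // 2)]
    detail = [(f[2 * i] - f[2 * i + 1]) / math.sqrt(2) for i in range(len(f) // 2)]
    return approx, detail

def haar_reconstruct(approx, detail):
    # Perfect reconstruction: invert the averaging/differencing
    f = []
    for a, d in zip(approx, detail):
        f.append((a + d) / math.sqrt(2))
        f.append((a - d) / math.sqrt(2))
    return f
```

Applying haar_level1 again to the approximations gives the next level, halving the data each time, which is exactly the dyadic cascade of Fig. 6.8.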

    6.4 Daubechies type 4 Discrete Wavelet transform

The Wavelet Transform studied in this thesis is the Daubechies Wavelet Transform of type 4 (db4) [20], very similar to the Haar Wavelet Transform. For a signal f with N ≥ 2 values, the Level 1 Daubechies (D1) type 4 Wavelet Transform is defined as:


    Figure 6.8: Details and Approximations decomposition scheme

f  →(D1)→  (t¹ | d¹)    (6.5)

where:

t_i = (f, V¹_i)
d_i = (f, W¹_i)    (6.6)

and so on for subsequent levels. The differences between the Daubechies Transform and the Haar Transform lie in the way V¹_i and W¹_i are defined. We define:

α₁ = (1 + √3) / (4√2)
α₂ = (3 + √3) / (4√2)
α₃ = (3 − √3) / (4√2)
α₄ = (1 − √3) / (4√2)

(6.7)
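As a quick numerical check of Eq. (6.7): the Daub4 scaling coefficients sum to √2, have unit energy, and their shift-by-two copies are mutually orthogonal, which is what makes the V¹_i an orthonormal family. A short sketch (the helper V1 with wrap-around indexing is illustrative):

```python
import math

s3 = math.sqrt(3)
den = 4 * math.sqrt(2)
# Eq. (6.7): Daub4 scaling coefficients
alpha = [(1 + s3) / den, (3 + s3) / den, (3 - s3) / den, (1 - s3) / den]

def V1(i, N):
    """First-level scaling signal: the four alphas placed with a shift of
    two positions per index i, wrapping around at the end of the signal."""
    v = [0.0] * N
    for j, a in enumerate(alpha):
        v[(2 * i + j) % N] = a
    return v
```

The first-level value t_i = (f, V¹_i) is then just the inner product of the signal with the i-th scaling signal, exactly as in Eq. (6.6).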

Terms V¹_i (1st level Daub4 Transform) are constructed as follows:


V¹₁ = (α₁, α₂, α₃, α₄, 0, ..., 0)

V¹₂ = (0, 0, α₁, α₂, α₃, α₄, 0, ..., 0)

V¹₃ = (0, 0, 0, 0, α₁, α₂, α₃, α₄, 0, .