Page 1:

Neural Computation and Applications in Time Series and Signal Processing

Georg Dorffner
Dept. of Medical Cybernetics and Artificial Intelligence, University of Vienna
and
Austrian Research Institute for Artificial Intelligence

Page 2:

Neural Computation

• Originally biologically motivated (information processing in the brain)
• Simple mathematical model of the neuron → neural network
• Large number of simple "units"
• Massively parallel (in theory)
• Complexity through the interplay of many simple elements
• Strong relationship to methods from statistics
• Suitable for pattern recognition

Page 3:

A Unit

• Propagation rule:
  – Weighted sum: $x_j = \sum_i w_i\, y_i$
  – Euclidean distance: $x_j = \sqrt{\sum_i (w_i - y_i)^2}$
• Transfer function $f$, giving the activation/output $y_j = f(x_j)$:
  – Threshold function (McCulloch & Pitts)
  – Linear function
  – Sigmoid function
  – Gaussian function

[Figure: a unit (neuron) receiving inputs through weights $w_1, w_2, \dots, w_i$, which form the (net) input $x_j$ and the activation/output $y_j = f(x_j)$]
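As a concrete illustration of the propagation rule and the transfer functions above, here is a minimal sketch in Python/NumPy (our example, not from the slides; the function name is illustrative):

```python
import numpy as np

def unit_output(y, w, transfer="sigmoid"):
    """One unit: weighted-sum propagation rule, then a transfer function f."""
    x = np.dot(w, y)                     # net input x_j = sum_i w_i * y_i
    if transfer == "threshold":          # McCulloch & Pitts
        return float(x > 0)
    if transfer == "linear":
        return x
    if transfer == "sigmoid":
        return 1.0 / (1.0 + np.exp(-x))
    if transfer == "gaussian":           # typically paired with the distance rule
        return np.exp(-x ** 2)
    raise ValueError(transfer)

print(unit_output(np.array([0.5, -1.0]), np.array([0.8, 0.2])))  # ≈ 0.55
```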

Page 4:

Multilayer Perceptron (MLP), Radial Basis Function Network (RBFN)

• 2 (or more) layers (= connections)

[Figure: input units → hidden units (typically nonlinear) → output units (typically linear)]

MLP: $x_j^{\mathrm{out}} = f_2\!\left(\sum_{l=1}^{k} v_{lj}\, f_1\!\left(\sum_{i=1}^{n} w_{il}\, x_i^{\mathrm{in}}\right)\right)$

RBFN: $x_j^{\mathrm{out}} = \sum_{l=1}^{k} v_{lj}\, \exp\!\left(-\frac{\sum_{i=1}^{n}\big(x_i^{\mathrm{in}} - w_{il}\big)^2}{2\sigma_l^2}\right)$
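The two architectures differ only in how the hidden layer is computed. A minimal forward pass for both, following the formulas above (a sketch with randomly chosen weights):

```python
import numpy as np

def mlp_forward(x, W, V):
    """MLP: sigmoid hidden layer, linear output layer."""
    hidden = 1.0 / (1.0 + np.exp(-(W @ x)))   # f1(sum_i w_il * x_i)
    return V @ hidden                          # sum_l v_lj * hidden_l (f2 linear)

def rbfn_forward(x, centers, sigmas, V):
    """RBFN: Gaussian hidden units centered on the weight vectors."""
    d2 = ((centers - x) ** 2).sum(axis=1)      # squared distances to the centers
    hidden = np.exp(-d2 / (2.0 * sigmas ** 2))
    return V @ hidden

rng = np.random.default_rng(0)
x = rng.normal(size=3)
print(mlp_forward(x, rng.normal(size=(5, 3)), rng.normal(size=(2, 5))))
print(rbfn_forward(x, rng.normal(size=(5, 3)), np.ones(5), rng.normal(size=(2, 5))))
```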

Page 5:

MLP as Universal Function Approximator

• E.g.: 1 input, 1 output, 5 hidden units
• An MLP can approximate arbitrary functions (Hornik et al. 1990)
• through superposition of weighted sigmoids
• The same holds for the RBFN

$x_k^{\mathrm{out}} = g(\mathbf{x}^{\mathrm{in}}) = \sum_{j=1}^{m} w_{jk}^{\mathrm{out}}\, f\!\left(\sum_{i=1}^{n} w_{ij}^{\mathrm{hid}}\, x_i^{\mathrm{in}} + w_{0j}^{\mathrm{hid}}\right) + w_{0k}^{\mathrm{out}}$

The bias terms $w_{0j}^{\mathrm{hid}}$ move (shift) each sigmoid; the weights stretch and mirror it.

Page 6:

Training (Model Estimation)

• Typical error function (summed squared error over all patterns $i$ and all outputs $k$, with targets $t_{ik}$):

$E = \sum_{i=1}^{n} \sum_{k=1}^{m} \big(x_{ik}^{\mathrm{out}} - t_{ik}\big)^2$

• "Backpropagation" (application of the chain rule):

$\frac{\partial E}{\partial w_i} = \frac{\partial E}{\partial x^{\mathrm{out}}} \cdot \frac{\partial x^{\mathrm{out}}}{\partial w_i}$

(the first factor is the contribution of the error function, the second the contribution of the network)

$\delta_j^{\mathrm{out}} = f'_{\mathrm{out}}\big(x_j^{\mathrm{out}}\big)\big(y_j - t_j\big), \qquad \delta_j^{\mathrm{hid}} = f'_{\mathrm{hid}}\big(x_j^{\mathrm{hid}}\big) \sum_{k=1}^{n} \delta_k^{\mathrm{out}}\, w_{jk}$

• Iterative optimisation based on the gradient (gradient descent, conjugate gradient, quasi-Newton); a code sketch follows
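A compact gradient-descent implementation of these update rules for a one-hidden-layer MLP (sigmoid hidden units, linear outputs, summed squared error, bias terms as on the previous slide). This is our sketch, not the original course code; the constant factor 2 of the error gradient is absorbed into the learning rate:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_mlp(X, T, n_hidden=5, lr=0.5, epochs=5000, seed=0):
    """Backpropagation with plain gradient descent on the summed squared error."""
    rng = np.random.default_rng(seed)
    Xb = np.hstack([X, np.ones((len(X), 1))])              # inputs + bias
    W = rng.normal(scale=0.5, size=(n_hidden, Xb.shape[1]))
    V = rng.normal(scale=0.5, size=(T.shape[1], n_hidden + 1))
    for _ in range(epochs):
        H = sigmoid(Xb @ W.T)                              # hidden activations
        Hb = np.hstack([H, np.ones((len(H), 1))])          # + bias unit
        Y = Hb @ V.T                                       # linear outputs
        d_out = Y - T                                      # delta_out (f' = 1 for linear units)
        d_hid = (d_out @ V[:, :-1]) * H * (1 - H)          # delta_hid via the chain rule
        V -= lr / len(X) * d_out.T @ Hb                    # gradient steps
        W -= lr / len(X) * d_hid.T @ Xb
    return W, V

# Toy usage: approximate y = x^2 on [-1, 1] with 5 hidden units
X = np.linspace(-1, 1, 50)[:, None]
W, V = train_mlp(X, X ** 2)
```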

Page 7:

Recurrent Perceptrons

• Recurrent connection = feedback loop

• Feedback from the hidden layer ("Elman") or from the output layer ("Jordan")
• Learning: "backpropagation through time"

[Figure: network with an input layer and a state/context layer that receives a copy of the hidden (or output) layer activations]

Page 8:

Time series processing

• Given: time-dependent observables $\mathbf{x}_t,\ t = 0, 1, \dots$
• Scalar: univariate; vector: multivariate
• Time series (minutes to days) vs. signals (milliseconds to seconds)
• Typical tasks:
  – Forecasting
  – Noise modeling
  – Pattern recognition
  – Modeling
  – Filtering
  – Source separation

Page 9:

Examples

[Figure: two example series, the Standard & Poor's index (left) and sunspot numbers (right)]

Preprocessed: returns $r_t = x_t - x_{t-1}$ (S&P); de-seasoned $s_t = x_t - x_{t-11}$ (sunspots)

Page 10:

Autoregressive models

• Forecasting: making use of past information to predict (estimate) the future

• AR: Past information = past observations

$x_t = F(x_{t-1}, x_{t-2}, \dots, x_{t-p}) + \epsilon_t$

where $X_{t,p} = (x_{t-1}, \dots, x_{t-p})$ are the past observations, $\hat{x}_t = F(X_{t,p})$ is the expected value, and $\epsilon_t$ is the noise ("random shock")

• Best forecast: the expected value $\hat{x}_t$

Page 11:

Linear AR models

• Most common case, the linear AR(p) model:

$x_t = \sum_{i=1}^{p} a_i x_{t-i} + \epsilon_t$

• Simplest form: random walk

$x_t = x_{t-1} + \epsilon_t; \quad \epsilon_t \sim N(0, 1)$

• Nontrivial forecast impossible: the best forecast is the last observation, $\hat{x}_t = x_{t-1}$ (see the demo below)
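A quick numerical check of that claim (our illustration, not from the slides): for a simulated random walk, predicting the last observed value already attains the noise variance, and nothing systematic is left to beat it.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.cumsum(rng.normal(size=10_000))       # random walk: x_t = x_{t-1} + eps_t

naive_mse = np.mean((x[1:] - x[:-1]) ** 2)          # forecast x̂_t = x_{t-1}
mean_mse = np.mean((x[1:] - x[:-1].mean()) ** 2)    # constant (sample-mean) forecast
print(naive_mse, mean_mse)                   # naive ≈ Var(eps) = 1; the mean is far worse
```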

Page 12:

MLP as NAR

• A neural network can approximate a nonlinear AR model: $x_t = F(x_{t-1}, \dots, x_{t-p}) + \epsilon_t$
• "Time window" or "time delay" network: the past p observations form the input vector (see the sketch below)
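Constructing the time-window inputs is a simple reindexing of the series; a minimal sketch (the helper name is ours):

```python
import numpy as np

def time_window(x, p):
    """Build (input, target) pairs for an (N)AR model of order p:
    each input row is (x_{t-1}, ..., x_{t-p}), the target is x_t."""
    X = np.column_stack([x[p - i - 1 : len(x) - i - 1] for i in range(p)])
    y = x[p:]
    return X, y

X, y = time_window(np.arange(10.0), p=3)
print(X[0], y[0])   # [2. 1. 0.] 3.0  -- any regressor (e.g. the MLP above) can now be fit
```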

Page 13:

Noise modeling

• Regression is density estimation of (Bishop 1995):

$p(\mathbf{t}, \mathbf{x}) = p(\mathbf{t} \mid \mathbf{x})\, p(\mathbf{x})$

where $p(\mathbf{t} \mid \mathbf{x})$ is a distribution with expected value $F(\mathbf{x})$; in forecasting, the target is the future and the input the past

• Likelihood:

$L = \prod_{i=1}^{n} p(\mathbf{t}_i \mid \mathbf{x}_i)\, p(\mathbf{x}_i)$

Page 14:

Gaussian noise

• Likelihood:

$L = \prod_{t=1}^{n} p\big(x_t \mid X_{t,p}\big) = \prod_{t=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{\big(x_t - F(X_{t,p}; \mathbf{W})\big)^2}{2\sigma^2}\right)$

• Maximization = minimization of $-\log L$ (constant terms, incl. $p(\mathbf{x})$, can be dropped):

$E = \sum_{t=1}^{n} \big(x_t - F(X_{t,p}; \mathbf{W})\big)^2$

• Corresponds to the summed squared error (typical backpropagation)
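Spelling out the step from likelihood to error function (assuming, as above, iid Gaussian noise with constant variance):

```latex
-\log L \;=\; \frac{1}{2\sigma^2} \sum_{t=1}^{n} \big(x_t - F(X_{t,p};\mathbf{W})\big)^2
          \;+\; n \log\!\big(\sqrt{2\pi}\,\sigma\big)
```

so minimizing $-\log L$ with respect to the weights $\mathbf{W}$ is exactly minimizing $E(\mathbf{W})$.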

Page 15:

Complex noise models

• Assumption: noise follows an arbitrary distribution $\epsilon_t \sim D$
• Parameters of $D$ are time dependent (dependent on the past): $\theta_t = g(X_{t,p})$
• Likelihood:

$L = \prod_{i=1}^{N} d\big(x_i;\, g(X_{i,p})\big)$

where $d$ is the probability density function of $D$

Page 16:

Heteroskedastic time series

• Assumption: noise is Gaussian with time-dependent variance $\sigma_t^2 = \sigma^2(X_{t,p})$:

$L = \prod_{t=1}^{N} \frac{1}{\sqrt{2\pi\,\sigma^2(X_{t,p})}} \exp\!\left(-\frac{\big(x_t - F(X_{t,p})\big)^2}{2\,\sigma^2(X_{t,p})}\right)$

• ARCH model:

$\sigma_t^2 = a_0 + \sum_{i=1}^{p} a_i\, r_{t-i}^2$

• An MLP is a nonlinear ARCH model when applied to returns/residuals:

$\sigma_t^2 = F'(r_{t-1}, r_{t-2}, \dots, r_{t-p})$, e.g. $\sigma_t^2 = F(r_{t-1}^2, r_{t-2}^2, \dots, r_{t-p}^2)$
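To see what such a model generates, here is a minimal ARCH(p) simulator (our sketch; the parameter values are arbitrary):

```python
import numpy as np

def simulate_arch(a0, a, n, seed=0):
    """r_t = sigma_t * e_t with sigma_t^2 = a0 + sum_i a_i * r_{t-i}^2, e_t ~ N(0,1)."""
    p = len(a)
    rng = np.random.default_rng(seed)
    r = np.zeros(n + p)                                  # zero-padded start-up values
    for t in range(p, n + p):
        sigma2 = a0 + np.dot(a, r[t - p:t][::-1] ** 2)   # (r_{t-1}, ..., r_{t-p})
        r[t] = np.sqrt(sigma2) * rng.normal()
    return r[p:]

r = simulate_arch(a0=0.1, a=np.array([0.5]), n=1000)     # heavy-tailed, bursty returns
```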

Page 17:

Non-Gaussian noise

• Other parametric pdfs (e.g. t-distribution)

• Mixture of Gaussians (Mixture density network, Bishop 1994)

• Network with 3k outputs ($k$ means $\mu_i$, $k$ variances $\sigma_i^2$, $k$ mixing coefficients $\pi_i$), or separate networks

$d\big(x_t \mid X_{t,p}\big) = \sum_{i=1}^{k} \pi_i(X_{t,p})\, \frac{1}{\sqrt{2\pi\,\sigma_i^2(X_{t,p})}} \exp\!\left(-\frac{\big(x_t - \mu_i(X_{t,p})\big)^2}{2\,\sigma_i^2(X_{t,p})}\right)$

Page 18:

Identifiability problem

• Mixture models (like neural networks) are not identifiable (parameters cannot be interpreted)
• No distinction between model and noise, e.g. on the sunspot data
• Models therefore have to be treated with care

Page 19:

Recurrent networks: Moving Average

• Second model class: Moving Average (MA) models
• Past information: random shocks

$x_t = \sum_{i=0}^{q} b_i\, \epsilon_{t-i}, \qquad \epsilon_t = x_t - \hat{x}_t$

• Recurrent (Jordan) network: nonlinear MA
• However, convergence is not guaranteed

Page 20:

GARCH

• Extension of ARCH:

$\sigma_t^2 = a_0 + \sum_{i=1}^{p} a_i\, r_{t-i}^2 + \sum_{i=1}^{p} b_i\, \sigma_{t-i}^2$

• Explains "volatility clustering"
• A neural network can again be a nonlinear version
• Since it uses past estimates $\sigma_{t-i}^2$: recurrent network
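The recursion is easy to state in code; a GARCH(1,1) volatility filter (a sketch, using one common initialisation choice):

```python
import numpy as np

def garch11_sigma2(r, a0, a1, b1):
    """sigma_t^2 = a0 + a1 * r_{t-1}^2 + b1 * sigma_{t-1}^2.
    The feedback of past sigma^2 estimates is what makes
    the neural version a recurrent network."""
    sigma2 = np.empty(len(r))
    sigma2[0] = r.var()                  # initialise with the sample variance
    for t in range(1, len(r)):
        sigma2[t] = a0 + a1 * r[t - 1] ** 2 + b1 * sigma2[t - 1]
    return sigma2

# Usage with the return series r simulated above (a1 + b1 < 1 for stationarity)
# sigma2 = garch11_sigma2(r, a0=0.1, a1=0.1, b1=0.8)
```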

Page 21:

State space models

• Observables depend on (hidden) time-variant state

• Strong relationship to recurrent (Elman) networks

• Nonlinear version only with additional hidden layers

$\mathbf{s}_t = A\,\mathbf{s}_{t-1} + B\,\boldsymbol{\epsilon}_t$
$\mathbf{x}_t = C\,\mathbf{s}_t + \boldsymbol{\eta}_t$

Page 22:

Symbolic time series

• Examples:
  – DNA
  – Text
  – Quantised time series (e.g. "up" and "down")
• Symbols from an alphabet: $x_t = s_i,\ s_i \in S$
• Past information: the past p symbols → a probability distribution over the next symbol
• Markov chains:

$p(x_t \mid x_{t-1}, x_{t-2}, \dots, x_{t-p})$

• Problem: long substrings are rare (see the sketch below)
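Estimating a Markov chain is just counting contexts. The sketch below (our illustration) also makes the rarity problem concrete: with alphabet size $|S|$ there are $|S|^p$ possible contexts, so long contexts are rarely observed often enough for reliable estimates.

```python
from collections import Counter, defaultdict

def markov_chain(symbols, p):
    """Estimate p(x_t | past p symbols) by counting length-(p+1) substrings."""
    counts = defaultdict(Counter)
    for t in range(p, len(symbols)):
        context = tuple(symbols[t - p:t])
        counts[context][symbols[t]] += 1
    return {ctx: {s: c / sum(cnt.values()) for s, c in cnt.items()}
            for ctx, cnt in counts.items()}

print(markov_chain("ududduud", p=2))   # e.g. p('u' | 'u','d'), p('d' | 'u','d'), ...
```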

Page 23:

Fractal prediction machines

• Similar subsequences are mapped to points close in space

• Clustering = extraction of a stochastic automaton

Page 24:

Relationship to recurrent network

• Network of 2nd order

Page 25:

Other topics

• Filtering: corresponds to ARMA models; neural networks as nonlinear filters
• Source separation: independent component analysis

• Relationship to stochastic automata

Page 26:

Practical considerations

• Stationarity is an important issue

• Preprocessing (trends, seasonalities)

• N-fold cross-validation done time-wise (the validation set must come after the training set; see the sketch below)
• Mean and standard deviation across folds → model selection

[Figure: time axis split into train | validation | test segments]
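One way to realise such time-wise validation is a growing-window scheme; the slide does not fix the exact splitting, so the helper below is our assumption:

```python
def timewise_folds(n, n_folds, test_frac=0.2):
    """Split indices 0..n-1 time-wise: a final hold-out test block, and
    n_folds (train, validation) pairs where validation always follows training."""
    n_test = int(n * test_frac)
    usable = n - n_test
    fold = usable // (n_folds + 1)
    folds = []
    for i in range(1, n_folds + 1):
        train = range(0, i * fold)               # growing training window
        val = range(i * fold, (i + 1) * fold)    # validation strictly after it
        folds.append((train, val))
    return folds, range(usable, n)               # test block at the very end

folds, test = timewise_folds(1000, n_folds=4)
```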

Page 27:

Summary

• Neural networks are powerful semi-parametric models for nonlinear dependencies

• Can be considered as nonlinear extensions of classical time series and signal processing techniques

• Applying semi-parametric models to noise modeling adds another interesting facet

• Models must be treated with care, and much data is necessary