forecasting default probabilities in emerging markets and dynamical regulatory networks through...

Gerhard-Wilhelm WEBER * Ayşe ÖZMEN Zehra Çavuşoğlu Özlem Defterli Institute of Applied Mathematics, METU, Ankara, Turkey * Faculty of Economics, Management Science and Law, University of Siegen, Germany Center for Research on Optimization and Control, University of Aveiro, Portugal Universiti Teknologi Malaysia, Forecasting Default Probabilities in Emerging Markets and Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization 6th International Summer School Achievements and Applications of Contemporary Informatics, Mathematics and Physics National University of Technology of the Ukraine Kiev, Ukraine, August 8-20, 2011

DESCRIPTION

AACIMP 2011 Summer School. Operational Research stream. Lecture by Gerhard-Wilhelm Weber.

TRANSCRIPT

Page 1: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Gerhard-Wilhelm WEBER *, Ayşe ÖZMEN, Zehra Çavuşoğlu, Özlem Defterli

Institute of Applied Mathematics, METU, Ankara, Turkey

* Faculty of Economics, Management Science and Law, University of Siegen, Germany Center for Research on Optimization and Control, University of Aveiro, Portugal Universiti Teknologi Malaysia, Skudai, Malaysia

Forecasting Default Probabilities in Emerging Markets and Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

6th International Summer School Achievements and Applications of Contemporary Informatics, Mathematics and Physics

National University of Technology of the Ukraine Kiev, Ukraine, August 8-20, 2011

Page 2: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Content

• Introduction
• Regression
• MARS
• CMARS
• Robust Optimization
• Robustification of CMARS
• CMARS Model with Polyhedral Uncertainty
• Generalized Linear Models (GLMs)
• Generalized Partial Linear Models (GPLMs)
• Conic GPLMs
• Robust Conic GPLMs (RCGPLMs)
• Numerical Experience with RCGPLM
• Real-world Application for RCGPLM
• Process Version of RCGPLM
• Conclusion

Page 3: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Introduction

Learning from data has become very important in every field of science and technology, e.g., in

• the financial sector,
• quality improvement in manufacturing,
• computational biology,
• medicine and
• engineering.

Learning enables estimation and prediction.

Page 4: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Regression

• Regression is mainly based on the methods of
  – least squares estimation,
  – maximum likelihood estimation.

• There are many regression models:
  – linear regression models,
  – nonlinear regression models,
  – generalized linear models,
  – nonparametric regression models,
  – additive models,
  – generalized additive models.

Page 5: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

MARS: Multivariate Adaptive Regression Spline

• Estimates general functions of high-dimensional arguments.
• An adaptive, nonparametric regression procedure.
• Makes no specific assumption about the underlying functional relationship between the dependent and independent variables.
• Estimates the contributions of the basis functions so that both the additive and the interactive effects of the predictors are allowed to determine the response variable.
• Uses expansions in piecewise linear basis functions of the form

$$c^{+}(x, \tau) = [+(x - \tau)]_{+}, \qquad c^{-}(x, \tau) = [-(x - \tau)]_{+},$$

where $[q]_{+} := \max\{0, q\}$.
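As a small illustration of the piecewise linear basis functions on this slide, the following sketch evaluates a reflected pair at a knot. The function name `hinge_pair` and the sample knot value are hypothetical and chosen only for this example.

```python
def hinge_pair(x, tau):
    """Return the reflected pair ([x - tau]_+, [tau - x]_+) for a knot tau."""
    c_plus = max(0.0, x - tau)    # active to the right of the knot
    c_minus = max(0.0, tau - x)   # active to the left of the knot
    return c_plus, c_minus


if __name__ == "__main__":
    # Hypothetical knot at 0.5; exactly one member of the pair is nonzero
    # unless x sits on the knot itself.
    for x in (0.0, 0.5, 1.0):
        print(x, hinge_pair(x, tau=0.5))
```

At most one member of the pair is nonzero at any point, which is what lets MARS fit different slopes on either side of a knot.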

Page 6: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

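MARS

Let us consider regression with

$$Y = f(\boldsymbol{X}) + \varepsilon, \qquad \boldsymbol{X} = (X_1, X_2, \dots, X_p)^T.$$

The goal is to construct reflected pairs for each input $X_j$ $(j = 1, 2, \dots, p)$.

[Figure: the reflected pair $c^{+}(x, \tau) = [+(x - \tau)]_{+}$ and $c^{-}(x, \tau) = [-(x - \tau)]_{+}$, plotted as $y$ against $x$.]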

Page 7: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

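MARS

Set of basis functions:

$$\wp := \left\{ (X_j - \tau)_{+},\ (\tau - X_j)_{+} \;\middle|\; \tau \in \{x_{1,j}, x_{2,j}, \dots, x_{N,j}\},\ j \in \{1, 2, \dots, p\} \right\}.$$

Thus, $f(\boldsymbol{X})$ can be represented by

$$Y = \theta_0 + \sum_{m=1}^{M} \theta_m \psi_m(\boldsymbol{X}^m) + \varepsilon.$$

Here, $\psi_m$ $(m = 1, 2, \dots, M)$ are basis functions from $\wp$ or products of two or more such functions; interaction basis functions are created by multiplying an existing basis function with a truncated linear function involving a new variable.

Provided the observations represented by the data $(\bar{\boldsymbol{x}}_i, \bar{y}_i)$ $(i = 1, 2, \dots, N)$:

$$\psi_m(\boldsymbol{x}^m) := \prod_{j=1}^{K_m} \left[ s_{\kappa_j^m} \cdot \left( x_{\kappa_j^m} - \tau_{\kappa_j^m} \right) \right]_{+},$$

where $\boldsymbol{x}^m$ are subvectors of $\boldsymbol{x}$.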
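The product form of a MARS basis function can be sketched as follows, assuming each factor is a truncated linear function with a sign, a variable index and a knot. The names `basis_function` and `mars_predict`, and the triple layout `(k, s, tau)`, are hypothetical and used only for illustration.

```python
from math import prod


def basis_function(x, factors):
    """Evaluate prod_j [ s_j * (x[k_j] - tau_j) ]_+ .

    factors is a list of (k, s, tau): variable index k, sign s in {+1, -1},
    and knot tau (an assumed encoding, not taken from the slides).
    """
    return prod(max(0.0, s * (x[k] - tau)) for k, s, tau in factors)


def mars_predict(x, theta0, terms):
    """Evaluate theta_0 + sum_m theta_m * psi_m(x); terms = [(theta_m, factors_m), ...]."""
    return theta0 + sum(theta * basis_function(x, f) for theta, f in terms)


if __name__ == "__main__":
    x = [2.0, 1.0]
    main_effect = [(0, +1, 1.0)]                  # [x_0 - 1]_+
    interaction = [(0, +1, 1.0), (1, -1, 3.0)]    # [x_0 - 1]_+ * [3 - x_1]_+
    print(basis_function(x, interaction))
    print(mars_predict(x, 0.5, [(2.0, main_effect), (-1.0, interaction)]))
```

The interaction term is built exactly as the slide describes: an existing basis function multiplied by a truncated linear function in a new variable.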

Page 8: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

MARS

Two subalgorithms:

(i) Forward stepwise algorithm: Search for the basis functions by minimizing some “lack of fit” criterion. The process stops when a user-specified value $M_{\max}$ is reached. The resulting model typically overfits the data, so a backward deletion procedure is applied.

(ii) Backward stepwise algorithm: Prevents over-fitting by decreasing the complexity of the model without degrading the fit to the data.

Alternative:

Page 9: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

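PRSS for MARS

$$PRSS := \sum_{i=1}^{N} \left( \bar{y}_i - f(\bar{\boldsymbol{x}}_i) \right)^2 + \sum_{m=1}^{M_{\max}} \lambda_m \sum_{\substack{|\boldsymbol{\alpha}| = 1 \\ \boldsymbol{\alpha} = (\alpha_1, \alpha_2)^T}}^{2} \ \sum_{\substack{r < s \\ r, s \in V(m)}} \int \theta_m^2 \left[ D^{\boldsymbol{\alpha}}_{r,s} \psi_m(\boldsymbol{x}^m) \right]^2 d\boldsymbol{x}^m,$$

where

$$V(m) := \{ \kappa_j^m \mid j = 1, 2, \dots, K_m \}, \qquad \boldsymbol{x}^m := (x_{m_1}, x_{m_2}, \dots, x_{m_{K_m}})^T,$$

$$\boldsymbol{\alpha} = (\alpha_1, \alpha_2)^T, \qquad |\boldsymbol{\alpha}| := \alpha_1 + \alpha_2, \qquad \alpha_1, \alpha_2 \in \{0, 1\},$$

$$D^{\boldsymbol{\alpha}}_{r,s} \psi_m(\boldsymbol{x}^m) := \frac{\partial^{|\boldsymbol{\alpha}|} \psi_m}{\partial^{\alpha_1} x_r^m \, \partial^{\alpha_2} x_s^m}(\boldsymbol{x}^m).$$

Tradeoff between accuracy and complexity. Penalty parameters $\lambda_m$.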

Page 10: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

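CQP and Tikhonov Regularization for MARS

$$\boldsymbol{\psi}(\bar{\boldsymbol{b}}_i) := \left( 1, \psi_1(\bar{\boldsymbol{x}}_i^1), \dots, \psi_M(\bar{\boldsymbol{x}}_i^M), \psi_{M+1}(\bar{\boldsymbol{x}}_i^{M+1}), \dots, \psi_{M_{\max}}(\bar{\boldsymbol{x}}_i^{M_{\max}}) \right)^T,$$

$$\bar{\boldsymbol{b}}_i := \left( \bar{\boldsymbol{x}}_i^1, \bar{\boldsymbol{x}}_i^2, \dots, \bar{\boldsymbol{x}}_i^M, \bar{\boldsymbol{x}}_i^{M+1}, \bar{\boldsymbol{x}}_i^{M+2}, \dots, \bar{\boldsymbol{x}}_i^{M_{\max}} \right)^T, \qquad \boldsymbol{\psi}(\bar{\boldsymbol{b}}) := \left( \boldsymbol{\psi}(\bar{\boldsymbol{b}}_1), \dots, \boldsymbol{\psi}(\bar{\boldsymbol{b}}_N) \right)^T.$$

Discretization points and cell volumes:

$$\hat{\boldsymbol{x}}_i^m = \left( \tilde{x}^m_{l_{\kappa_1^m},\, \kappa_1^m}, \tilde{x}^m_{l_{\kappa_2^m},\, \kappa_2^m}, \dots, \tilde{x}^m_{l_{\kappa_{K_m}^m},\, \kappa_{K_m}^m} \right), \qquad \Delta \hat{\boldsymbol{x}}_i^m := \prod_{j=1}^{K_m} \left( \tilde{x}^m_{l_{\kappa_j^m} + 1,\, \kappa_j^m} - \tilde{x}^m_{l_{\kappa_j^m},\, \kappa_j^m} \right),$$

with index tuples $\left( l_{\kappa_j^m} \right)_{j \in \{1, 2, \dots, K_m\}} \in \{0, 1, 2, \dots, N+1\}^{K_m}$.

$$L_{im} := \left[ \sum_{\substack{|\boldsymbol{\alpha}| = 1 \\ \boldsymbol{\alpha} = (\alpha_1, \alpha_2)^T}}^{2} \ \sum_{\substack{r < s \\ r, s \in V(m)}} \left[ D^{\boldsymbol{\alpha}}_{r,s} \psi_m(\hat{\boldsymbol{x}}_i^m) \right]^2 \Delta \hat{\boldsymbol{x}}_i^m \right]^{1/2}.$$

$\boldsymbol{L}$ is an $(M_{\max} + 1) \times (M_{\max} + 1)$ matrix.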

Page 11: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

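CQP and Tikhonov Regularization for MARS

For a short representation, we can rewrite the approximate relation as

$$PRSS \approx \left\| \boldsymbol{y} - \boldsymbol{\psi}(\bar{\boldsymbol{b}})\, \boldsymbol{\theta} \right\|_2^2 + \sum_{m=1}^{M_{\max}} \lambda_m \sum_{i=1}^{(N+1)^{K_m}} L_{im}^2\, \theta_m^2.$$

In case of the same penalty parameter $\lambda$ $(\lambda := \lambda_m)$, then:

$$PRSS \approx \left\| \boldsymbol{y} - \boldsymbol{\psi}(\bar{\boldsymbol{b}})\, \boldsymbol{\theta} \right\|_2^2 + \lambda \left\| \boldsymbol{L} \boldsymbol{\theta} \right\|_2^2.$$

This is a Tikhonov regularization problem.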
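For a fixed penalty parameter, a Tikhonov problem of the form min ||y - B theta||^2 + lambda ||L theta||^2 has a closed-form solution through the normal equations. The sketch below is a minimal pure-Python illustration under that assumption; the names `solve` and `tikhonov` and the toy data are hypothetical, and this is not the CQP route that the slides go on to take.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def tikhonov(B, y, L, lam):
    """Minimize ||y - B theta||^2 + lam * ||L theta||^2 via (B^T B + lam L^T L) theta = B^T y."""
    p = len(B[0])

    def gram(A):  # A^T A
        return [[sum(row[i] * row[j] for row in A) for j in range(p)] for i in range(p)]

    BtB, LtL = gram(B), gram(L)
    lhs = [[BtB[i][j] + lam * LtL[i][j] for j in range(p)] for i in range(p)]
    rhs = [sum(B[k][i] * y[k] for k in range(len(B))) for i in range(p)]
    return solve(lhs, rhs)


if __name__ == "__main__":
    B = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]   # toy design matrix
    y = [1.0, 3.0, 5.0]
    L = [[1.0, 0.0], [0.0, 1.0]]               # identity penalty, for illustration
    print(tikhonov(B, y, L, 0.0))
    print(tikhonov(B, y, L, 1.0))
```

Increasing `lam` shrinks the coefficients, which is the accuracy versus complexity tradeoff expressed by the two terms of PRSS.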

Page 12: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

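CQP for MARS

Conic quadratic programming:

$$\min_{t,\, \boldsymbol{\theta}} \ t, \quad \text{subject to} \quad \left\| \boldsymbol{\psi}(\bar{\boldsymbol{b}})\, \boldsymbol{\theta} - \boldsymbol{y} \right\|_2 \le t, \qquad \left\| \boldsymbol{L} \boldsymbol{\theta} \right\|_2 \le \sqrt{M}.$$

In general:

$$\min_{\boldsymbol{x}} \ \boldsymbol{c}^T \boldsymbol{x}, \quad \text{subject to} \quad \left\| \boldsymbol{D}_i \boldsymbol{x} - \boldsymbol{d}_i \right\|_2 \le \boldsymbol{p}_i^T \boldsymbol{x} - q_i \quad (i = 1, 2, \dots, k).$$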

Page 13: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

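CQP for MARS

Moreover, $(t, \boldsymbol{\theta}, \boldsymbol{\chi}, \boldsymbol{\eta}, \boldsymbol{\omega}_1, \boldsymbol{\omega}_2)$ is a primal dual optimal solution if and only if

$$\boldsymbol{\chi} := \begin{pmatrix} \boldsymbol{0}_N & \boldsymbol{\psi}(\bar{\boldsymbol{b}}) \\ 1 & \boldsymbol{0}^T_{M_{\max}+1} \end{pmatrix} \begin{pmatrix} t \\ \boldsymbol{\theta} \end{pmatrix} + \begin{pmatrix} -\boldsymbol{y} \\ 0 \end{pmatrix},$$

$$\boldsymbol{\eta} := \begin{pmatrix} \boldsymbol{0}_{M_{\max}+1} & \boldsymbol{L} \\ 0 & \boldsymbol{0}^T_{M_{\max}+1} \end{pmatrix} \begin{pmatrix} t \\ \boldsymbol{\theta} \end{pmatrix} + \begin{pmatrix} \boldsymbol{0}_{M_{\max}+1} \\ \sqrt{M} \end{pmatrix},$$

$$\begin{pmatrix} \boldsymbol{0}^T_N & 1 \\ \boldsymbol{\psi}(\bar{\boldsymbol{b}})^T & \boldsymbol{0}_{M_{\max}+1} \end{pmatrix} \boldsymbol{\omega}_1 + \begin{pmatrix} \boldsymbol{0}^T_{M_{\max}+1} & 0 \\ \boldsymbol{L}^T & \boldsymbol{0}_{M_{\max}+1} \end{pmatrix} \boldsymbol{\omega}_2 = \begin{pmatrix} 1 \\ \boldsymbol{0}_{M_{\max}+1} \end{pmatrix},$$

$$\boldsymbol{\omega}_1^T \boldsymbol{\chi} = 0, \qquad \boldsymbol{\omega}_2^T \boldsymbol{\eta} = 0,$$

$$\boldsymbol{\omega}_1 \in L^{N+1}, \quad \boldsymbol{\omega}_2 \in L^{M_{\max}+2}, \quad \boldsymbol{\chi} \in L^{N+1}, \quad \boldsymbol{\eta} \in L^{M_{\max}+2}.$$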

Page 14: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

CQP for MARS: C-MARS

• CQPs belong to the well-structured convex problems.
• They can be solved by interior point methods.
• Better complexity bounds.
• Better practical performance.

Page 15: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Robust Optimization

Laurent El Ghaoui, Robust Optimization and Applications, IMA Tutorial, March 11, 2003

Page 16: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Robust Optimization

[Slide: dual of conic program; the formulas were not preserved in the transcript.]

Page 17: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Robust Optimization

[Slide: robust conic programming; the formulas were not preserved in the transcript.]

Page 18: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Robust Optimization

[Slide: polytopic uncertainty; the formulas were not preserved in the transcript.]

Page 19: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Robust Optimization

[Slide: CQP and robust CQP; the formulas were not preserved in the transcript.]

Page 20: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Robustification of CMARS: The Idea of Robust CMARS (RCMARS)

CMARS models depend on the parameters. Small perturbations in the data may give different model parameters.

This may cause unstable solutions.

In CMARS, the aim is to reduce the estimation error while keeping efficiency as high as possible.

To achieve this aim, several approaches can be used: scenario optimization, the robust counterpart, and the usage of more robust estimators.

By using robustification in CMARS, the estimation variance will decrease.

Page 21: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

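Robustification of CMARS: The Idea of Robust CMARS

General model on the relation between input and response:

$$Y = f(\breve{\boldsymbol{X}}) + \varepsilon, \qquad \breve{\boldsymbol{X}} = (\breve{X}_1, \breve{X}_2, \dots, \breve{X}_p)^T.$$

Each noisy input data value $\breve{X}_j$ consists of the mean $\bar{X}_j$ and an error term.

$\breve{X}_j$ is a random variable, and we assume that it is normally distributed.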

Page 22: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Robustification of CMARS: CMARS Model with Uncertainty

To employ robust optimization on CMARS, a “perturbation” (uncertainty) is incorporated into the input data (for each dimension) and into the output data:

$\bar{\boldsymbol{x}}_i = (\bar{x}_{i,1}, \bar{x}_{i,2}, \dots, \bar{x}_{i,p})^T$ will be represented as $\breve{\boldsymbol{x}}_i = (\breve{x}_{i,1}, \breve{x}_{i,2}, \dots, \breve{x}_{i,p})^T$ after perturbation $(i = 1, 2, \dots, N)$; the entries are indexed by $j = 1, 2, \dots, p$ and $i = 1, 2, \dots, N$.

Here, $\bar{x}_j$ $(j = 1, 2, \dots, p)$ is the mean of the vector of observations of the $j$th input; the amount of perturbation in each dimension is restricted by $\rho_{ij}$, which is the semi-length of the confidence interval.

Page 23: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

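CMARS Model with Uncertainty

Provided the observations represented by the data with uncertainty $(\breve{\boldsymbol{x}}_i, \breve{y}_i)$ $(i = 1, 2, \dots, N)$, the form of the $m$th basis function with uncertainty set $U_1$ is defined by

$$\psi_m(\breve{\boldsymbol{x}}^m) := \prod_{j=1}^{K_m} \left[ s_{\kappa_j^m} \cdot \left( \breve{x}_{\kappa_j^m} - \tau_{\kappa_j^m} \right) \right]_{+}.$$

The PRSS has the following representation:

$$PRSS := \sum_{i=1}^{N} \left( \breve{y}_i - f(\breve{\boldsymbol{x}}_i) \right)^2 + \sum_{m=1}^{M_{\max}} \lambda_m \sum_{\substack{|\boldsymbol{\alpha}| = 1 \\ \boldsymbol{\alpha} = (\alpha_1, \alpha_2)^T}}^{2} \ \sum_{\substack{r < s \\ r, s \in V(m)}} \int \theta_m^2 \left[ D^{\boldsymbol{\alpha}}_{r,s} \psi_m(\boldsymbol{x}^m) \right]^2 d\boldsymbol{x}^m,$$

$$PRSS \approx \underbrace{\left\| \breve{\boldsymbol{y}} - \boldsymbol{\psi}(\breve{\boldsymbol{b}})\, \boldsymbol{\theta} \right\|_2^2}_{\text{Accuracy}} + \lambda \underbrace{\left\| \boldsymbol{L} \boldsymbol{\theta} \right\|_2^2}_{\text{Complexity}} \qquad \text{(Tikhonov regularization)}.$$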

Page 24: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

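Polyhedral Uncertainty

Polyhedral uncertainty sets $U_1$, $U_2$: each is a Cartesian product of intervals, $[\,\cdot\,, \cdot\,] \times [\,\cdot\,, \cdot\,] \times \dots \times [\,\cdot\,, \cdot\,]$.

[Figure: a box-shaped uncertainty set with one vertex and the set of vertices marked.]

The uncertainty matrix and vector have the form

$$\boldsymbol{U} = \begin{pmatrix} u_{11} & u_{12} & \dots & u_{1 M_{\max}} \\ u_{21} & u_{22} & \dots & u_{2 M_{\max}} \\ \vdots & \vdots & \ddots & \vdots \\ u_{N1} & u_{N2} & \dots & u_{N M_{\max}} \end{pmatrix}, \qquad \boldsymbol{v} = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_N \end{pmatrix}.$$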

Page 25: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

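Polyhedral Uncertainty and Robust Counterpart for CMARS Model

Based on polyhedral uncertainty sets $U_1$, $U_2$, the robust counterpart for CMARS is

$$\min_{\boldsymbol{\alpha}} \ \max_{\substack{\boldsymbol{W} \in U_1 \\ \boldsymbol{z} \in U_2}} \ \left\| \boldsymbol{z} - \boldsymbol{W} \boldsymbol{\alpha} \right\|_2^2 + \lambda \left\| \boldsymbol{L} \boldsymbol{\alpha} \right\|_2^2.$$

$U_1$ is a polytope with $2^{N \cdot M_{\max}}$ vertices $\boldsymbol{W}^1, \boldsymbol{W}^2, \dots, \boldsymbol{W}^{2^{N \cdot M_{\max}}}$:

$$U_1 = \left\{ \sum_{j=1}^{2^{N \cdot M_{\max}}} \sigma_j \boldsymbol{W}^j \;\middle|\; \sigma_j \ge 0 \ (j = 1, \dots, 2^{N \cdot M_{\max}}), \ \sum_{j=1}^{2^{N \cdot M_{\max}}} \sigma_j = 1 \right\} = \operatorname{conv}\{ \boldsymbol{W}^1, \boldsymbol{W}^2, \dots, \boldsymbol{W}^{2^{N \cdot M_{\max}}} \},$$

where conv is the convex hull.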

Page 26: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

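Polyhedral Uncertainty and Robust Counterpart for CMARS Model

$U_2$ is the polytope with $2^N$ vertices $\boldsymbol{z}^1, \boldsymbol{z}^2, \dots, \boldsymbol{z}^{2^N}$:

$$U_2 = \left\{ \sum_{i=1}^{2^N} \varphi_i \boldsymbol{z}^i \;\middle|\; \varphi_i \ge 0 \ (i = 1, 2, \dots, 2^N), \ \sum_{i=1}^{2^N} \varphi_i = 1 \right\} = \operatorname{conv}\{ \boldsymbol{z}^1, \boldsymbol{z}^2, \dots, \boldsymbol{z}^{2^N} \},$$

where conv is the convex hull.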

Page 27: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

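Robust CQP with the Polytopic Uncertainty

Robust conic quadratic programming of our CMARS:

$$\min_{t,\, \boldsymbol{\alpha}} \ t, \quad \text{subject to} \quad \left\| \boldsymbol{z} - \boldsymbol{W} \boldsymbol{\alpha} \right\|_2 \le t \quad \forall\, \boldsymbol{W} \in U_1,\ \boldsymbol{z} \in U_2, \qquad \left\| \boldsymbol{L} \boldsymbol{\alpha} \right\|_2 \le \sqrt{M},$$

or, equivalently,

$$\min_{t,\, \boldsymbol{\alpha}} \ t, \quad \text{subject to} \quad \left\| \boldsymbol{z}^i - \boldsymbol{W}^j \boldsymbol{\alpha} \right\|_2 \le t \quad (i = 1, 2, \dots, 2^N;\ j = 1, 2, \dots, 2^{N \cdot M_{\max}}), \qquad \left\| \boldsymbol{L} \boldsymbol{\alpha} \right\|_2 \le \sqrt{M},$$

where the constraints are memberships in ice-cream (or second-order, or Lorentz) cones $L$.

This is a (standard) conic quadratic programming problem.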
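The equivalence between the robust constraint and the vertex-wise constraints rests on the fact that a convex function attains its maximum over a polytope at a vertex. The sketch below illustrates this for a tiny box-shaped uncertainty set by enumerating all vertices; the function names and the interval data are hypothetical, and enumeration is only feasible for very small N and M.

```python
from itertools import product
from math import sqrt


def box_vertices(intervals):
    """All vertices of a Cartesian product of intervals [(lo, hi), ...]."""
    return list(product(*intervals))


def worst_case_residual(alpha, W_intervals, z_intervals):
    """Return max ||z - W alpha||_2 over all vertices of the two boxes.

    W_intervals is an N x M nested list of (lo, hi); z_intervals has N entries.
    """
    n_rows, n_cols = len(W_intervals), len(W_intervals[0])
    flat = [iv for row in W_intervals for iv in row]
    worst = 0.0
    for w_flat in box_vertices(flat):
        W = [w_flat[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]
        for z in box_vertices(z_intervals):
            res = sqrt(sum(
                (z[r] - sum(W[r][c] * alpha[c] for c in range(n_cols))) ** 2
                for r in range(n_rows)))
            worst = max(worst, res)
    return worst


if __name__ == "__main__":
    W_iv = [[(0.9, 1.1)], [(1.9, 2.1)]]   # N = 2 rows, M = 1 column
    z_iv = [(0.8, 1.2), (1.8, 2.2)]
    print(len(box_vertices([iv for row in W_iv for iv in row])))  # 2^(N*M) vertices
    print(worst_case_residual([1.0], W_iv, z_iv))
```

The value returned is the smallest `t` that satisfies every vertex constraint for the given coefficient vector, which is what the CQP minimizes over the coefficients.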

Page 28: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Generalized Linear Models

The class of Generalized Linear Models (GLMs) has gained popularity as a statistical modeling tool. Its advantages include:

• flexibility in addressing a variety of statistical problems,
• availability of software (Stata, SAS, S-PLUS, R).

The class of GLMs is an extension of traditional linear models that allows

• the mean of a dependent variable to depend on a linear predictor through a nonlinear link function,
• the probability distribution of the response to be any member of an exponential family of distributions.

GLMs contain many widely used statistical models:

o linear models with normal errors,
o logistic and probit models for binary data,
o log-linear models for multinomial data.

Page 29: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Generalized Linear Models

Some other useful statistical models, such as those with

• Poisson, binomial,
• Gamma or normal distributions,

can be formulated as a GLM by the selection of an appropriate link function and response probability distribution.

A GLM has the following form:

$$H(\mu_i) = \boldsymbol{x}_i^T \boldsymbol{\beta},$$

where

• $\mu_i = E(Y_i)$: expected value of the response variable,
• $H$: smooth monotonic link function,
• $\boldsymbol{x}_i$: observed value of the explanatory variable for the $i$th case,
• $\boldsymbol{\beta}$: vector of unknown parameters.
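As a concrete instance of a link function, the sketch below uses the logit link, which is the usual choice for binary responses such as default or non-default. The function names and the sample inputs are hypothetical; the slides do not state which link the authors used.

```python
from math import exp, log


def logit(mu):
    """Link H: maps a mean in (0, 1) to the linear predictor scale."""
    return log(mu / (1.0 - mu))


def inv_logit(eta):
    """Inverse link G = H^(-1): maps a linear predictor to a probability."""
    return 1.0 / (1.0 + exp(-eta))


def glm_mean(x, beta):
    """Return mu = G(x^T beta) for one observation x."""
    eta = sum(x_j * b_j for x_j, b_j in zip(x, beta))
    return inv_logit(eta)


if __name__ == "__main__":
    print(glm_mean([1.0, 2.0], [0.0, 0.0]))    # zero predictor gives probability 0.5
    print(logit(inv_logit(0.7)))               # round trip recovers the predictor
```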

Page 30: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Generalized Partial Linear Models

A particular type of semiparametric model is the Generalized Partial Linear Model (GPLM).

GPLMs extend GLMs by adding a single nonparametric component to the usual parametric terms:

$$E(Y \mid \boldsymbol{X}, \boldsymbol{T}) = G\left( \boldsymbol{X}^T \boldsymbol{\beta} + \gamma(\boldsymbol{T}) \right),$$

where $\boldsymbol{\beta} = (\beta_1, \dots, \beta_m)^T$ is a vector of parameters, and $\gamma(\cdot)$ is a smooth function, which we try to estimate by CMARS.

Assumption: $\boldsymbol{X}$ is an $m$-dimensional random vector which represents (typically discrete) covariates, and $\boldsymbol{T}$ is a $q$-dimensional random vector of continuous covariates, which comes from a decomposition of the explanatory variables.

Page 31: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

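Estimation for GPLM

The general model can be written as follows:

$$H(\mu) = \eta(\boldsymbol{X}, \boldsymbol{T}) = \boldsymbol{X}^T \boldsymbol{\beta} + \gamma(\boldsymbol{T}) = \sum_{j=1}^{m} X_j \beta_j + \gamma(\boldsymbol{T}),$$

with observation values $(y_i, \boldsymbol{x}_i, \boldsymbol{t}_i)$ $(i = 1, 2, \dots, n)$, $\mu_i = G(\eta_i)$ and $\eta_i = H(\mu_i) = \boldsymbol{x}_i^T \boldsymbol{\beta} + \gamma(\boldsymbol{t}_i)$, where $\gamma(\cdot)$ is a smooth function.

There are different kinds of estimation methods for GPLMs. Generally, the estimation methods for the model are based on kernel methods and on test procedures for the correct specification of this model.

Now, we concentrate on a special type of GPLM estimation (Conic GPLM) based on CMARS.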

Page 32: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

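Conic GPLM (CGPLM): The Least-Squares Estimation with Tikhonov Regularization

The procedure is as follows:

The vector $\boldsymbol{\beta}^{preproc} = (\beta_0, \beta_1, \beta_2, \dots, \beta_m)^T$ is found by the application of linear least squares on the given data:

$$Y = \boldsymbol{X}^T \boldsymbol{\beta}^{preproc} + \varepsilon. \qquad (1)$$

Then, the parametric part has the form

$$Y = \beta_0 + \sum_{j=1}^{m} X_j \beta_j + \varepsilon.$$

To estimate the regression coefficients $\boldsymbol{\beta}^{preproc}$, the method of least squares is employed in

$$y = \beta_0 + \sum_{j=1}^{m} X_j \beta_j + \varepsilon$$

to minimize the residual sum of squares (RSS).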

Page 33: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Conic GPLM: The Least-Squares Estimation with Tikhonov Regularization

After obtaining the regression coefficients, we subtract the linear least-squares model (without intercept) from the corresponding responses:

$$\boldsymbol{y} - \sum_{j=1}^{m} \boldsymbol{X}_j \beta_j =: \hat{\boldsymbol{y}}.$$

The resulting values $\hat{\boldsymbol{y}}$ become the new response variables of the input data. Then, the knots for the nonparametric part of MARS can be found based on these new data.

In this model, $\mu_i = G(\eta_i)$ and $\eta_i = H(\mu_i) = \boldsymbol{x}_i^T \boldsymbol{\beta} + \gamma(\boldsymbol{t}_i)$, where $\gamma(\cdot)$ is a smooth function which will be estimated by CMARS, an alternative to multivariate adaptive regression splines (MARS).
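The preprocessing step on this slide (fit the linear part by least squares, then subtract it without the intercept to obtain new responses) can be sketched for a single linear covariate as follows. The function names and the toy data are hypothetical, and the single-covariate closed form is a simplification of the multivariate fit described above.

```python
def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    return b0, b1


def remove_linear_part(x, y):
    """Subtract the fitted linear term (without intercept) from each response."""
    _, b1 = fit_line(x, y)
    return [yi - b1 * xi for xi, yi in zip(x, y)]


if __name__ == "__main__":
    x = [0.0, 1.0, 2.0, 3.0]
    y = [1.0, 3.0, 5.0, 7.0]
    print(fit_line(x, y))
    print(remove_linear_part(x, y))   # new responses for the nonparametric part
```

The returned values play the role of the new response variable on which the knots of the nonparametric part are then selected.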

Page 34: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Conic GPLM: The Least-Squares Estimation with Tikhonov Regularization

Tikhonov regularization is employed to find the approximate solution to equation (1) by minimizing the quadratic functional

$$\min_{\boldsymbol{\beta}^{preproc}} \ \left\| \boldsymbol{y} - \boldsymbol{X} \boldsymbol{\beta}^{preproc} \right\|_2^2 + \lambda \left\| \boldsymbol{L} \boldsymbol{\beta}^{preproc} \right\|_2^2, \qquad (2)$$

where $\lambda$ is a regularization parameter that balances the first and the second part.

The term $\boldsymbol{y}$ is the response vector and $\boldsymbol{\beta}^{preproc}$ denotes the unknown coefficients.

The Tikhonov regularization problem (2) is used to find the parameters.

Page 35: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

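Robustification of Conic GPLM: The Idea of Robust Conic GPLM

General model on the relation between input and response:

$$E(Y \mid \breve{\boldsymbol{X}}, \breve{\boldsymbol{T}}) = G\left( \breve{\boldsymbol{X}}^T \boldsymbol{\beta} + \gamma(\breve{\boldsymbol{T}}) \right),$$

$$\breve{\boldsymbol{X}} = (\breve{X}_1, \breve{X}_2, \dots, \breve{X}_m)^T, \qquad \breve{\boldsymbol{T}} = (\breve{T}_1, \breve{T}_2, \dots, \breve{T}_q)^T.$$

Each noisy variable $\breve{X}_j$ $(j = 1, 2, \dots, m)$ and $\breve{T}_j$ $(j = 1, 2, \dots, q)$ consists of a mean and an error term.

$\breve{X}_j$ and $\breve{T}_j$ are random variables, and we assume that they are normally distributed.

$G = H^{(-1)}$ is a link function that connects the mean of the response variable, $\mu = E(Y \mid \breve{\boldsymbol{X}}, \breve{\boldsymbol{T}})$, to the predictor variables. Then, the additive semiparametric model is

$$H(\mu) = \eta(\breve{\boldsymbol{X}}, \breve{\boldsymbol{T}}) = \breve{\boldsymbol{X}}^T \boldsymbol{\beta} + \gamma(\breve{\boldsymbol{T}}) = \sum_{j=1}^{m} \breve{X}_j \beta_j + \gamma(\breve{\boldsymbol{T}}).$$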

Page 36: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

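Robustification of Conic GPLM: Variables of Robust Conic GPLM

The input data of the linear part, $\bar{x}_{ij}$ $(j = 1, 2, \dots, m;\ i = 1, 2, \dots, N)$, the input data of the nonlinear part, $\bar{t}_{ij}$ $(j = 1, 2, \dots, q;\ i = 1, 2, \dots, N)$, and the responses $\bar{y}_i$ are replaced by their perturbed counterparts $\breve{x}_{ij}$, $\breve{t}_{ij}$ and $\breve{y}_i$, with each perturbation bounded as in the robustification of CMARS.

Then the final model is

$$\eta_i := H(\mu_i) = \breve{\boldsymbol{x}}_i^T \boldsymbol{\beta} + \gamma(\breve{\boldsymbol{t}}_i) \qquad (i = 1, 2, \dots, N).$$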

Page 37: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

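Robustification of Conic GPLM: Linear Part of Robust Conic GPLM

A linear estimator is found as:

$$Y^{preproc} = \beta_0 + \sum_{j=1}^{m} \breve{X}_j \beta_j = \breve{\boldsymbol{X}}^T \boldsymbol{\beta}^{preproc}.$$

As a Tikhonov regularization form it can be written as:

$$\underset{\boldsymbol{\beta}^{preproc}}{\text{minimize}} \ \max_{\substack{\boldsymbol{W}_1 \in U_1^1 \\ \boldsymbol{z}_1 \in U_2^1}} \ \left\| \boldsymbol{z}_1 - \boldsymbol{W}_1 \boldsymbol{\beta}^{preproc} \right\|_2^2 + \lambda_1 \left\| \boldsymbol{K} \boldsymbol{\beta}^{preproc} \right\|_2^2,$$

$$U_1^1 = \operatorname{conv}\{ \boldsymbol{W}_1^1, \boldsymbol{W}_1^2, \dots, \boldsymbol{W}_1^{2^{N \cdot m}} \}, \qquad U_2^1 = \operatorname{conv}\{ \boldsymbol{z}_1^1, \boldsymbol{z}_1^2, \dots, \boldsymbol{z}_1^{2^N} \}.$$

Finally, it can be written as a standard CQP problem:

$$\underset{t_1,\, \boldsymbol{\beta}^{preproc}}{\text{minimize}} \ t_1, \quad \text{subject to} \quad \left\| \boldsymbol{z}_1^i - \boldsymbol{W}_1^j \boldsymbol{\beta}^{preproc} \right\|_2 \le t_1 \quad (i = 1, 2, \dots, 2^N;\ j = 1, 2, \dots, 2^{N \cdot m}), \qquad \left\| \boldsymbol{K} \boldsymbol{\beta}^{preproc} \right\|_2 \le \sqrt{M_1}.$$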

Page 38: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

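Robustification of Conic GPLM: Nonlinear Part of Robust Conic GPLM

With the help of the smooth function found by RCMARS,

$$\gamma(\breve{\boldsymbol{t}}) = \alpha_0 + \sum_{m=1}^{M} \alpha_m \psi_m(\breve{\boldsymbol{t}}^m),$$

the PRSS form is obtained:

$$PRSS \approx \left\| \hat{\boldsymbol{y}} - \boldsymbol{\psi}(\breve{\boldsymbol{t}})\, \boldsymbol{\alpha} \right\|_2^2 + \lambda_2 \left\| \boldsymbol{L} \boldsymbol{\alpha} \right\|_2^2.$$

It can be converted into:

$$\underset{\boldsymbol{\alpha}}{\text{minimize}} \ \max_{\substack{\boldsymbol{W}_2 \in U_1^2 \\ \boldsymbol{z}_2 \in U_2^2}} \ \left\| \boldsymbol{z}_2 - \boldsymbol{W}_2 \boldsymbol{\alpha} \right\|_2^2 + \lambda_2 \left\| \boldsymbol{L} \boldsymbol{\alpha} \right\|_2^2,$$

$$U_1^2 = \operatorname{conv}\{ \boldsymbol{W}_2^1, \boldsymbol{W}_2^2, \dots, \boldsymbol{W}_2^{2^{N \cdot M_{\max}}} \}, \qquad U_2^2 = \operatorname{conv}\{ \boldsymbol{z}_2^1, \boldsymbol{z}_2^2, \dots, \boldsymbol{z}_2^{2^N} \}.$$

Finally, it can be written as a standard CQP problem:

$$\underset{t_2,\, \boldsymbol{\alpha}}{\text{minimize}} \ t_2, \quad \text{subject to} \quad \left\| \boldsymbol{z}_2^i - \boldsymbol{W}_2^j \boldsymbol{\alpha} \right\|_2 \le t_2 \quad (i = 1, 2, \dots, 2^N;\ j = 1, 2, \dots, 2^{N \cdot M_{\max}}), \qquad \left\| \boldsymbol{L} \boldsymbol{\alpha} \right\|_2 \le \sqrt{M_2}.$$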

Page 39: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Robustification of Conic GPLM: Numerical Experience

To employ the robust optimization technique on the linear part of the CGPLM model, we include perturbations (uncertainty) into the real input data $\bar{\boldsymbol{x}}_i$ in each dimension, and into the output data $\bar{y}_i$ (i=1,2,…,24).

For this purpose, the uncertainty matrices and vectors are taken as elements of polyhedral uncertainty sets for the linear part.

Then, uncertainty is evaluated for all input and output values, which are represented by confidence intervals (CIs).

Afterwards, we transform the variables into the standard normal distribution; the CI is obtained to be [-3, 3].

Page 40: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

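Robustification of Conic GPLM: Numerical Experience

For the nonlinear part, we constructed model functions for these data using MARS software, where we selected the maximum number of basis elements: $M_{\max} = 8$.

$$\begin{aligned}
\psi_1(\boldsymbol{t}) &= \max\{0,\ t_2 - 0.2855\}, & \psi_2(\boldsymbol{t}) &= \max\{0,\ 0.2855 - t_2\}, & \psi_3(\boldsymbol{t}) &= \max\{0,\ t_2 - 0.7508\}, \\
\psi_4(\boldsymbol{t}) &= \max\{0,\ 0.7508 - t_2\}, & \psi_5(\boldsymbol{t}) &= \max\{0,\ t_1 - 0.5573\}, & \psi_6(\boldsymbol{t}) &= \max\{0,\ t_2 - 0.5573\}, \\
\psi_7(\boldsymbol{t}) &= \max\{0,\ 0.5573 - t_2\}, & \psi_8(\boldsymbol{t}) &= \max\{0,\ t_1 - 1.8980\} \cdot \max\{0,\ 0.2855 - t_2\}.
\end{aligned}$$

Then, the large model becomes

$$\begin{aligned}
Y = \alpha_0 + \sum_{m=1}^{M} \alpha_m \psi_m(\boldsymbol{t}) + \varepsilon = \alpha_0 &+ \alpha_1 \max\{0,\ t_2 - 0.2855\} + \alpha_2 \max\{0,\ 0.2855 - t_2\} + \alpha_3 \max\{0,\ t_2 - 0.7508\} \\
&+ \alpha_4 \max\{0,\ 0.7508 - t_2\} + \alpha_5 \max\{0,\ t_1 - 0.5573\} + \alpha_6 \max\{0,\ t_2 - 0.5573\} + \alpha_7 \max\{0,\ 0.5573 - t_2\} \\
&+ \alpha_8 \max\{0,\ t_1 - 1.8980\} \cdot \max\{0,\ 0.2855 - t_2\} + \varepsilon.
\end{aligned}$$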
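The eight basis functions of the large model can be evaluated directly, as in the sketch below. The knot values are read from the slide; the assignment of each knot to $t_1$ or $t_2$ and the sign of each hinge are reconstructed from the garbled transcript and should be treated as assumptions, as should the function names and the coefficient values used in the demonstration.

```python
def basis_values(t1, t2):
    """Evaluate the eight basis functions of the large model at (t1, t2)."""
    def h(q):
        return max(0.0, q)   # [q]_+

    return [
        h(t2 - 0.2855), h(0.2855 - t2),
        h(t2 - 0.7508), h(0.7508 - t2),
        h(t1 - 0.5573), h(t2 - 0.5573), h(0.5573 - t2),
        h(t1 - 1.8980) * h(0.2855 - t2),   # interaction term
    ]


def large_model(alpha0, alpha, t1, t2):
    """Return alpha_0 + sum_m alpha_m * psi_m(t), without the noise term."""
    return alpha0 + sum(a * p for a, p in zip(alpha, basis_values(t1, t2)))


if __name__ == "__main__":
    print(basis_values(2.0, 0.0))
    # Hypothetical coefficients, for illustration only.
    print(large_model(0.1, [1.0] * 8, 2.0, 0.0))
```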

Page 41: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Robustification of Conic GPLM: Numerical Experience

To apply the robust optimization technique on the nonlinear part of the CGPLM model, we include perturbation (uncertainty) into the real input data $\bar{\boldsymbol{t}}_i$ in each dimension, and into the output data $\bar{y}_i$ (i=1,2,…,24).

Similar to the linear part, the uncertainty matrices and vectors based on polyhedral uncertainty sets are obtained.

Uncertainty is calculated for all input and output values, which are represented by CIs, and we transform the variables into the standard normal distribution.

The CI is obtained to be [-3, 3].

The uncertainty matrix for the input data has a huge size ($2^{24 \cdot 8} = 2^{192}$ vertices).

We do not have enough computer capacity.

Tradeoff between tractability and robustification.
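To make the size argument tangible, the following sketch computes the number of vertices for N = 24 observations and a maximum of 8 basis functions, as stated on the slide. The variable names are hypothetical.

```python
N = 24        # number of observations in the numerical example
M_MAX = 8     # maximum number of basis functions

# Each of the N * M_MAX matrix entries varies in an interval with two endpoints,
# so the box-shaped uncertainty set has 2^(N * M_MAX) vertices.
n_vertices_W = 2 ** (N * M_MAX)
n_vertices_z = 2 ** N

if __name__ == "__main__":
    print(n_vertices_W)              # exact integer value of 2^192
    print(len(str(n_vertices_W)))    # number of decimal digits
    print(n_vertices_z)
```

Enumerating one CQP constraint per vertex pair is therefore out of reach, which is the tractability problem noted on the slide.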

Page 42: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Robustification of Conic GPLM: Numerical Experience

To solve our problem:

• Transform it into the MOSEK format.
• Write this formulation for each value of our sample (N=24).
• We formulate our models as a CQP problem for each sample value (observation) using the combinatorial approach, which we call weak robustification.
• As a result, we obtain 24 different weak RCGPLM (WRCGPLM) submodels for each of the linear and nonlinear parts.
• Solve them separately by using the MOSEK program.
• After the MOSEK result for each of our observations is found, we select the MOSEK model which has the maximum t value for the linear part and for the nonlinear part, and we continue with these MOSEK models to calculate the parameter values of both parts.

Page 43: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Real-world Application for RCGPLM

Data

• 1019 observations which belong to 45 emerging markets from 1980 to 2005
• 13 explanatory variables:
  – Bank liquid reserves to bank assets ratio,
  – Changes in net reserves / GDP (Gross Domestic Product),
  – Current account balance (% of GDP),
  – Exports of goods and services (% of GDP),
  – External debt total / Total reserves,
  – Long-term debt / GDP,
  – GDP growth (annual %),
  – Liquid liabilities as % of GDP,
  – Total debt service (% of exports of goods, services and income),
  – Short-term debt (% of exports of goods, services and income),
  – Trade (% of GDP),
  – Use of IMF credit / GDP,
  – Inflation, consumer prices (annual %).

Page 44: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Real-world Application for RCGPLM

• Scatterplots of the variables are plotted in Minitab 14 to check for linearity.
• 4 of the 13 variables show a linear relationship with the default values “y”:
  – Liquid liabilities as % of GDP,
  – Total debt service (% of exports of goods, services and income),
  – Use of IMF credit / GDP,
  – Inflation, consumer prices (annual %).

Page 45: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

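Real-world Application for RCGPLM

• Response variable $\boldsymbol{y}$: shows defaults and nondefaults; consists of “0” and “1” values.
• Result of the linear part, $\boldsymbol{X} \boldsymbol{\beta}^{preproc}$: “1” or “0”.
• The remainder (RCMARS part), $\boldsymbol{y} - \boldsymbol{X} \boldsymbol{\beta}^{preproc}$: consists of “-1”, “0” and “1” values.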

Page 46: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Real-world Application for RCGPLM

Subgroups of the training sample, which consists of 757 observations:

Group I
• consists of 378 observations,
• 52 of which are “-1”,
• 326 of which are “0”.

Group II
• consists of 379 observations,
• 104 of which are “1”,
• 275 of which are “0”.

Page 47: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Real-world Application for RCGPLM

Results and Comparison

          Training Sample                                    Validation Sample
Model     D-D      ND-ND    Correct Classification Rate      D-D      ND-ND    Correct Classification Rate
CGPLM     90.09%   93.24%   91.81%                           86.27%   90.05%   89.31%
RCGPLM    87.80%   96.20%   93.33%                           96.88%   89.71%   92%

Page 48: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Process Version of RCGPLM: Bio-Systems

[Figure labels: medicine, food, education, health care, development, sustainability, bio materials, bio energy, environment.]

Page 49: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Process Version of RCGPLM: Bio-Systems

Prediction of gene patterns based on DNA microarray chip experiments, with

M.U. Akhmet, H. Öktem
S.W. Pickl, E. Quek Ming Poh
T. Ergenç, B. Karasözen
J. Gebert, N. Radde
Ö. Uğur, R. Wünschiers
M. Taştan, A. Tezel, P. Taylan
F.B. Yilmaz, B. Akteke-Öztürk
S. Özöğür, Z. Alparslan-Gök
A. Soyler, B. Soyler, M. Çetin
S. Özöğür-Akyüz, Ö. Defterli
N. Gökgöz, E. Kropat
...

Application areas: Finance, Environment, Health Care, Medicine.

Page 50: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Financial Systems

Process Version of RCGPLM

Page 51: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Process Version of RCGPLM

Regulatory Networks: Examples

                        Target variables    Environmental items
Genetic Networks        Gene expression     Transcription factors, toxins, radiation
Eco-Finance Networks    CO2-emissions       Financial means, technical means

Further examples: socio-econo-networks, stock markets, portfolio optimization, immune system, epidemiological processes …

Page 52: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

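Process Version of RCGPLM: Modeling & Prediction

$$\dot{\boldsymbol{X}} = \boldsymbol{M}(\boldsymbol{X})\, \boldsymbol{X}, \qquad \boldsymbol{X}(0) = \boldsymbol{X}_0.$$

• Expression data: $\boldsymbol{X} = \boldsymbol{X}(t) = (e_1(t), e_2(t), \dots, e_n(t))^T$, with values in $\mathbb{R}^n$.
• $\boldsymbol{M}$: matrix-valued function (metabolic reaction), with values in $\mathbb{R}^{n \times n}$.
• Prediction, anticipation: least squares, max likelihood.
• Statistical learning.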

Page 53: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

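Process Version of RCGPLM: Modeling & Prediction

Time discretization (e.g., Euler, Runge-Kutta, Heun):

$$\boldsymbol{X}_{k+1} = \boldsymbol{M}_k \boldsymbol{X}_k.$$

Ex.: $\boldsymbol{X}_{k+1} = \left( \boldsymbol{I} + h_k \boldsymbol{M}(\boldsymbol{X}_k) \right) \boldsymbol{X}_k$, with $\boldsymbol{M}_k = \left( em_{ij}^{k} \right)$.

We analyze the influence of the $em$-parameters on the dynamics (expression-metabolic).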
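The Euler-type example on this slide can be sketched as a short time-stepping loop. The state-dependent matrix `toy_M` below is purely hypothetical, as are the function names, the step size and the initial state.

```python
def mat_vec(M, x):
    """Matrix-vector product for nested lists."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


def euler_step(M_of, x, h):
    """One explicit Euler step: X_{k+1} = (I + h * M(X_k)) X_k."""
    Mx = mat_vec(M_of(x), x)
    return [x_i + h * mx_i for x_i, mx_i in zip(x, Mx)]


def simulate(M_of, x0, h, steps):
    """Return the list of states X_0, X_1, ..., X_steps."""
    path = [list(x0)]
    for _ in range(steps):
        path.append(euler_step(M_of, path[-1], h))
    return path


def toy_M(x):
    """Hypothetical 2x2 network matrix whose entries depend on the state."""
    return [[-1.0, 0.0], [0.5 * x[0], -2.0]]


if __name__ == "__main__":
    for state in simulate(toy_M, [1.0, 1.0], h=0.1, steps=3):
        print(state)
```

Heun or Runge-Kutta schemes would replace `euler_step` while keeping the same recursion structure.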

Page 54: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Process Version of RCGPLM: Genetic Networks

[Figure: a genetic network with nodes gene1, gene2, gene3 and gene4 and edge labels 0.4 E1, 0.2 E2 and 1 E1.]

Page 55: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Process Version of RCGPLM

Gene-Environment Networks

Page 56: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Process Version of RCGPLM: The Model Class

$$\dot{\mathbb{E}} = \boldsymbol{F}(\mathbb{E})$$

is the firstly introduced time-autonomous form, where

• $\mathbb{E} = (X_1, X_2, \dots, X_n, T_1, T_2, \dots, T_m)^T$: $d$-vector of concentration levels of proteins and of certain levels of environmental factors, $d = m + n$,
• $\dot{\mathbb{E}}$ $\left( = \frac{d \mathbb{E}}{dt} \right)$: continuous change in the gene-expression data in time,
• $F_i : \mathbb{R}^d \to \mathbb{R}$ $(i = 1, 2, \dots, n + m)$: nonlinearities,
• $\mathbb{E}^0 = \mathbb{E}(t_0)$: initial values of the gene-expression levels,
• experimental data vectors obtained from microarray experiments and environmental measurements,
• $\mathbb{E}_i(t)$: the gene-expression level (concentration rate) of the $i$th gene at time $t$; it denotes any one of the first $n$ coordinates in the $d$-vector of genetic and environmental states,
• $G := \{1, 2, \dots, n\}$ is the set of genes.

Weber et al. (2008c), Chen et al. (1999), Gebert et al. (2004a), Gebert et al. (2006), Gebert et al. (2007), Tastan (2005), Yilmaz (2004), Yilmaz et al. (2005), Sakamoto and Iba (2001), Tastan et al. (2005)

Page 57: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

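Process Version of RCGPLM: The Model Class

(i) $\dot{\boldsymbol{X}} = \boldsymbol{M} \boldsymbol{X}$: $\boldsymbol{M}$ is a constant (n×n)-matrix and $\boldsymbol{X}$ is an (n×1)-vector of gene-expression levels.

(ii) $\dot{\boldsymbol{X}} = \boldsymbol{M}(\boldsymbol{X})\, \boldsymbol{X}$ represents the dynamical system of the n genes and their interaction alone. $\boldsymbol{M}(\boldsymbol{X})$: (n×n)-matrix with entries as functions of polynomials, exponential, trigonometric, splines or wavelets containing some parameters to be optimized.

(iii) $\dot{\boldsymbol{X}} = \boldsymbol{M}(\boldsymbol{X})\, \boldsymbol{X} + \boldsymbol{C}(\boldsymbol{X})$ includes environmental effects (n genes, m environmental effects). With

$$\mathbb{E} = \begin{pmatrix} \boldsymbol{X} \\ \boldsymbol{T} \end{pmatrix}, \qquad \boldsymbol{M}(\mathbb{E}) = \begin{pmatrix} \boldsymbol{M}(\boldsymbol{X}) & \boldsymbol{C}(\boldsymbol{X}) \\ \boldsymbol{0} & \boldsymbol{0} \end{pmatrix},$$

which are an (n+m)-vector and an (n+m)×(n+m)-matrix, respectively.

Weber et al. (2008c), Tastan (2005), Tastan et al. (2006), Ugur et al. (2009), Tastan et al. (2005), Yilmaz (2004), Yilmaz et al. (2005), Weber et al. (2008b), Weber et al. (2009b)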

Page 58: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Process Version of RCGPLM

Regulatory Networks under Uncertainty

Errors uncorrelated: interval arithmetic

Errors correlated: ellipsoidal calculus

Fuzzy values: fuzzy arithmetic

[Figure: illustration with axes θ1 and θ2]
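To make the first two cases concrete, here is a small sketch of how a box (uncorrelated errors) and an ellipsoid (correlated errors) are propagated through a linear map. The matrix, bounds and shape matrix are hypothetical numbers, and the functions are illustrations rather than the calculus used in the cited papers.

```python
import numpy as np

def interval_image(M, lower, upper):
    """Bounds on M @ e for e in the box [lower, upper] (coordinate-wise errors)."""
    center = 0.5 * (lower + upper)
    radius = 0.5 * (upper - lower)
    mid = M @ center
    rad = np.abs(M) @ radius       # each row collects the worst-case deviation
    return mid - rad, mid + rad

def ellipsoid_image(M, center, shape):
    """Image of {center + L u : ||u|| <= 1}, with shape = L L^T, under e -> M e."""
    return M @ center, M @ shape @ M.T

M = np.array([[1.0, -2.0],
              [0.5, 1.0]])
lo, hi = interval_image(M, np.array([0.9, 1.8]), np.array([1.1, 2.2]))
c, S = ellipsoid_image(M, np.array([1.0, 2.0]),
                       np.array([[0.04, 0.01],
                                 [0.01, 0.09]]))
```

The off-diagonal entries of the shape matrix encode the correlation between errors, which a box cannot represent.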

Page 59: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

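Process Version of RCGPLM

Robustification of GPLM Approach for Regulatory, Dynamical Systems

We can represent the generalized multiplicative form of such systems with our GPLM approach as follows:

$$\dot{\mathbb{E}} = \mathbb{M}(\mathbb{E})\,\mathbb{E} \quad \text{with} \quad \mathbb{E} := \begin{pmatrix} X \\ T \end{pmatrix} \quad \text{and} \quad \mathbb{M}(\mathbb{E}) := \begin{pmatrix} M_1(X) & M_1(T) \\ M_2(T) & M_2(X) \end{pmatrix}. \qquad (*)$$

$X$ represents the expression levels of targets,

$T$ consists of environmental factors which affect the targets in the network,

$\mathbb{M}(\mathbb{E})$ is called the network matrix, which can be identified by solving the following least-squares (or maximum likelihood) estimation problem:

$$\underset{\theta}{\text{minimize}} \quad \sum_{k=0}^{N-1} \left\| \mathbb{M}\big(\mathbb{E}^{(k)}\big)\,\mathbb{E}^{(k)} - \dot{\mathbb{E}}^{(k)} \right\|_2^2 \qquad (**)$$

$\theta$: some vector of unknowns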
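As a sketch of the least-squares identification step in its simplest setting, assume the network matrix is constant, so that its entries are the only unknowns. The estimation problem then reduces to an ordinary linear least-squares fit. The synthetic data, the matrix `M_true` and the forward-difference helper are hypothetical and serve only to illustrate the idea.

```python
import numpy as np

def fit_constant_network_matrix(E_samples, E_dot_samples):
    """Least-squares estimate of M in E_dot = M E from samples stored as rows (N, d)."""
    # Row-wise, (E_dot^(k))^T = (E^(k))^T M^T, so solve the stacked system for M^T.
    M_T, *_ = np.linalg.lstsq(E_samples, E_dot_samples, rcond=None)
    return M_T.T

def finite_difference_derivatives(E_samples, h):
    """Forward differences (E^(k+1) - E^(k)) / h as a stand-in for measured derivatives."""
    return (E_samples[1:] - E_samples[:-1]) / h

# Synthetic, noise-free check: recover a known matrix from 20 samples.
rng = np.random.default_rng(0)
M_true = np.array([[-1.0, 0.5, 0.2],
                   [0.1, -0.3, 0.0],
                   [0.0, 0.0, 0.0]])
E_samples = rng.normal(size=(20, 3))
E_dot_samples = E_samples @ M_true.T
M_hat = fit_constant_network_matrix(E_samples, E_dot_samples)
```

When the matrix entries depend on the state through parametrized functions such as splines, the fit is no longer a single linear solve, which is where the conic and robust formulations of the lecture come in.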

Page 60: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Process Version of RCGPLM

Robustification of GPLM Approach for Regulatory, Dynamical Systems

We represent the process version of the GPLM formulation in the following way:

$$\dot{\mathbb{E}}_i = X^T \beta_i + \gamma_i(T) \qquad (i = 1, 2, \ldots, d).$$

The unknown parameters appearing inside of $\mathbb{M}(\mathbb{E})$ can be collected separately in vectors $\beta$ (linear part), corresponding to the parameters of $X$, and $\alpha$ (nonlinear part), corresponding to the parameters of $T$.

Hence, $\theta = (\beta^T, \alpha^T)^T$.
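The split into a linear part and a nonlinear part can be illustrated with a plain partial linear fit for one row of the system. The sketch below models the nonlinear part with MARS-style hinge functions at fixed, hypothetical knots and solves an ordinary least-squares problem; it is not CMARS or CGPLM (there is no penalty term and no conic optimization), and it assumes a single environmental variable `t`.

```python
import numpy as np

def hinge_basis(t, knots):
    """Columns max(0, t - tau) and max(0, tau - t) for each knot tau."""
    t = np.asarray(t, dtype=float)[:, None]
    knots = np.asarray(knots, dtype=float)[None, :]
    return np.hstack([np.maximum(0.0, t - knots), np.maximum(0.0, knots - t)])

def fit_partial_linear_row(X, t, y, knots):
    """Fit y = X beta + gamma(t), with gamma(t) = alpha_0 + hinge_basis(t) @ alpha_rest."""
    B = hinge_basis(t, knots)
    design = np.hstack([X, np.ones((len(y), 1)), B])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    p = X.shape[1]
    return coef[:p], coef[p:]      # beta (linear part), alpha (nonlinear part)

# Synthetic data for one row: linear in X, piecewise linear in t (hypothetical values).
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 2))
t = rng.uniform(0.0, 2.0, size=50)
y = X @ np.array([1.5, -0.7]) + 0.3 + 2.0 * np.maximum(0.0, t - 1.0)
beta_hat, alpha_hat = fit_partial_linear_row(X, t, y, knots=[1.0])
```

Concatenating `beta_hat` and `alpha_hat` gives the stacked vector of unknowns for that row.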

Page 61: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Process Version of RCGPLM

Robustification of GPLM Approach for Regulatory, Dynamical Systems

When the entries of the matrix $\mathbb{M}(\mathbb{E})$ are splines, CGPLM can be used for target-environment networks to solve the problem given in (**).

Furthermore, if there is uncertainty in the expression data, the presented RCGPLM technique can be applied with RCMARS in order to study a robustification of our target-environment networks.

Then, for each row of the matrix equation in (*), we represent the process version of the RCGPLM model in the subsequent manner:

$$\dot{\mathbb{E}}_i = X^T \beta_i + \gamma_i(T) \qquad (i = 1, 2, \ldots, d).$$
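One way to see what robustification under polyhedral uncertainty involves: a convex function attains its maximum over a polytope at a vertex, so for a box-shaped uncertainty set the worst-case residual of a linear model can be found by enumerating the vertices. The sketch below does this for a single observation; the coefficient vector, box widths and function names are hypothetical, and this is an illustration of the principle rather than the RCMARS algorithm itself.

```python
import itertools
import numpy as np

def box_vertices(center, half_width):
    """All 2^p vertices of the box {center + s * half_width : s in [-1, 1]^p}."""
    signs = itertools.product([-1.0, 1.0], repeat=len(center))
    return np.array([center + np.array(s) * half_width for s in signs])

def worst_case_residual(beta, x_center, x_half_width, y_center, y_half_width):
    """Largest |y - x^T beta| when x lies in a box and y in an interval."""
    worst = 0.0
    for x in box_vertices(x_center, x_half_width):
        for y in (y_center - y_half_width, y_center + y_half_width):
            worst = max(worst, abs(y - x @ beta))
    return worst

beta = np.array([2.0, -1.0])
x_center = np.array([1.0, 1.0])
x_half_width = np.array([0.1, 0.2])
nominal = abs(1.0 - x_center @ beta)   # residual without uncertainty
robust = worst_case_residual(beta, x_center, x_half_width, 1.0, 0.05)
```

The number of vertices grows as 2^p with the number of uncertain inputs, so enumeration is only practical for small p.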

Page 62: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

References

Aster, R.C., Borchers, B., and Thurber, C., Parameter Estimation and Inverse Problems, Academic Press, 2004.
Ben-Tal, A., and Nemirovski, A., Lectures on Modern Convex Optimization: Analysis, Algorithms, and Engineering Applications, MPS-SIAM Series on Optimization, SIAM, Philadelphia, 2001.
Chen, T., He, H.L., and Church, G.M., Modeling gene expression with differential equations, Proceedings of Pacific Symposium on Biocomputing 1999, 29-40.
Defterli, O., Fügenschuh, A., and Weber, G.-W., New discretization and optimization techniques with results in the dynamics of gene-environment networks. In: Proceedings of the 3rd Global Conference on Power Control & Optimization (PCO 2010), Editors: N. Barsoum, P. Vasant, R. Habash, ISBN: 978-983-44483-1-8.
Defterli, O., Fügenschuh, A., and Weber, G.-W., Modern Tools for the Time-discrete Dynamics and Optimization of Gene-environment Networks, Communications in Nonlinear Science and Numerical Simulation, in press, 2011.
El Ghaoui, L., Robust Optimization and Applications, IMA Tutorial, 2003.
Ergenc, T., and Weber, G.-W., Modeling and prediction of gene-expression patterns reconsidered with Runge-Kutta discretization, Journal of Computational Technologies 9, 6 (2004) 40-48.
Friedman, J.H., Multivariate adaptive regression splines, The Annals of Statistics 19, 1 (1991) 1-141.
Hansen, P.C., Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion, SIAM, Philadelphia, 1998.
Hastie, T., Tibshirani, R., and Friedman, J.H., The Elements of Statistical Learning, Springer Verlag, NY, 2001.
Hoon, M.D., Imoto, S., Kobayashi, K., Ogasawara, N., and Miyano, S., Inferring gene regulatory networks from time-ordered gene expression data of Bacillus subtilis using differential equations, Proceedings of Pacific Symposium on Biocomputing (2003) 17-28.
Gebert, J., Laetsch, M., Pickl, S.W., Weber, G.-W., and Wünschiers, R., Genetic networks and anticipation of gene expression patterns, Computing Anticipatory Systems: CASYS(92)03 - Sixth International Conference, AIP Conference Proceedings 718 (2004) 474-485.
Kropat, E., and Weber, G.-W., Robust regression analysis for gene-environment and eco-finance networks under polyhedral and ellipsoidal uncertainty, preprint_2 (2010) at Institute of Applied Mathematics, METU.
Myers, R.H., and Montgomery, D.C., Response Surface Methodology: Process and Product Optimization Using Designed Experiments, New York: Wiley (2002).
Nemirovski, A., Lectures on modern convex optimization, Israel Institute of Technology (2002), http://iew3.technion.ac.il/Labs/Opt/LN/Final.pdf.
Nesterov, Y.E., and Nemirovskii, A.S., Interior Point Methods in Convex Programming, SIAM, 1993.

Page 63: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Özmen, A., Weber, G.-W., Batmaz, I., and Kropat, E., RCMARS: Robustification of CMARS with Different Scenarios under Polyhedral Uncertainty Set. To appear in Communications in Nonlinear Science and Numerical Simulation (CNSNS), Special Issue Nonlinear, Fractional and Complex Systems with Discontinuity and Chaos, D. Baleanu and J.A. Tenreiro Machado (guest editors), 2010.
Özmen, A., Weber, G.-W., and Kerimov, A., RCMARS: A New Optimization Supported Tool - Applied on Financial Market Data - under Polyhedral Uncertainty, preprint at Institute of Applied Mathematics, METU, submitted to JOGO, 2010.
Özmen, A., Weber, G.-W., Çavuşoğlu, Z., and Defterli, Ö., The New Robust Conic GPLM Method with an Application to Finance and Regulatory Systems: Prediction of Credit Default and a Process Version, preprint at Institute of Applied Mathematics, METU, submitted to JOGO, 2010.
Özmen, A., and Weber, G.-W., Robust Conic Generalized Partial Linear Models Using RCMARS Method – A Robustification of CGPLM, preprint at Institute of Applied Mathematics, METU, in Proceedings of Fifth Global Conference on Power Control and Optimization PCO, June 1 – 3, 2011, Dubai, ISBN: 983-44483-49.
Pickl, S.W., and Weber, G.-W., Optimization of a time-discrete nonlinear dynamical system from a problem of ecology - an analytical and numerical approach, Journal of Computational Technologies 6, 1 (2001) 43-52.
Sakamoto, E., and Iba, H., Inferring a system of differential equations for a gene regulatory network by using genetic programming, Proc. Congress on Evolutionary Computation 2001, 720-726.
Tastan, M., Analysis and Prediction of Gene Expression Patterns by Dynamical Systems, and by a Combinatorial Algorithm, MSc Thesis, Institute of Applied Mathematics, METU, Turkey, 2005.
Tastan, M., Pickl, S.W., and Weber, G.-W., Mathematical modeling and stability analysis of gene-expression patterns in an extended space and with Runge-Kutta discretization, Proceedings of Operations Research, Bremen, 2006, 443-450.
Weber, G.-W., Batmaz, I., Köksal, G., Taylan, P., and Yerlikaya, F., CMARS: A New Contribution to Nonparametric Regression with Multivariate Adaptive Regression Splines Supported by Continuous Optimisation, preprint at IAM, METU, submitted for publication, 2009.
Weber, G.-W., Çavuşoğlu, Z., and Özmen, A., Predicting Default Probabilities in Emerging Markets by New Conic Generalized Partial Linear Models and Their Optimization. To appear in Advances in Continuous Optimization with Applications in Finance, Special Issue Optimization, 2010.

Page 64: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Weber, G.-W., Alparslan-Gök, S.Z., and Dikmen, N., Environmental and life sciences: Gene-environment networks - optimization, games and control - a survey on recent achievements, deTombe, D. (guest ed.), special issue of Journal of Organizational Transformation and Social Change 5, 3 (2008) 197-233.
Weber, G.-W., Taylan, P., Alparslan-Gök, S.Z., Özögur, S., and Akteke-Öztürk, B., Optimization of gene-environment networks in the presence of errors and uncertainty with Chebychev approximation, TOP 16, 2 (2008) 284-318.
Weber, G.-W., Alparslan-Gök, S.Z., and Söyler, B., A new mathematical approach in environmental and life sciences: gene-environment networks and their dynamics, Environmental Modeling & Assessment 14, 2 (2009) 267-288.
Weber, G.-W., and Ugur, O., Optimizing gene-environment networks: generalized semi-infinite programming approach with intervals, Proceedings of International Symposium on Health Informatics and Bioinformatics Turkey '07, HIBIT, Antalya, Turkey, April 30 - May 2 (2007).
Yılmaz, F.B., A Mathematical Modeling and Approximation of Gene Expression Patterns by Linear and Quadratic Regulatory Relations and Analysis of Gene Networks, MSc Thesis, Institute of Applied Mathematics, METU, Turkey, 2004.
Weber, G.-W., Kropat, E., Tezel, A., and Belen, S., Optimization applied on regulatory and eco-finance networks – survey and new development, Pacific J. Optim. 6(2), 319-340 (2010).

Page 65: Forecasting Default Probabilities  in Emerging Markets and   Dynamical Regulatory Networks through New Robust Conic GPLMs and Optimization

Thank you