

Hindawi Publishing Corporation
ISRN Signal Processing
Volume 2013, Article ID 434832, 14 pages
http://dx.doi.org/10.1155/2013/434832

Research Article
Dynamically Measuring Statistical Dependencies in Multivariate Financial Time Series Using Independent Component Analysis

Nauman Shah and Stephen J. Roberts

Machine Learning Research Group, Department of Engineering Science, University of Oxford, Oxford OX1 3PJ, UK

Correspondence should be addressed to Nauman Shah naumanshahengoxacuk

Received 30 March 2013; Accepted 4 May 2013

Academic Editors: A. Krzyzak, C.-M. Kuo, S. Kwong, W. Liu, and F. Perez-Cruz

Copyright © 2013 N. Shah and S. J. Roberts. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

We present a computationally tractable approach to dynamically measure statistical dependencies in multivariate non-Gaussian signals. The approach makes use of extensions of independent component analysis to calculate information coupling, as a proxy measure for mutual information, between multiple signals and can be used to estimate uncertainty associated with the information coupling measure in a straightforward way. We empirically validate the relative accuracy of the information coupling measure using a set of synthetic data examples and showcase the practical utility of the measure when analysing multivariate financial time series.

1. Introduction

The task of accurately inferring the statistical dependency structure (association) in multivariate systems has been an area of active research for many years, with a wide range of practical applications [1]. Many of these applications require real-time sequential analysis of dependencies in multivariate data streams with dynamically changing properties. However, most existing measures of dependence have serious limitations, either in the type of data sets they are suitable for or in their computational complexity. If the data being analysed is generated by a known stable process, with known marginal and multivariate distributions, the degree of dependence can be estimated relatively easily. However, most real-world data sets have dynamically changing properties to which a single distribution cannot be assigned. Multivariate data generated in global financial markets is an example of such complex data sets. Financial data exhibits rapidly changing dynamics and is non-Gaussian in nature; this is especially true for financial data recorded at high frequencies [2]. In fact, as the scale over which financial returns are calculated decreases, their distribution becomes increasingly non-Gaussian, a feature referred to as aggregational Gaussianity. The recent explosive growth in the availability and use of financial data sampled at high frequencies therefore requires computationally efficient algorithms which are suitable for dynamically analysing dependencies in non-Gaussian data streams.

The most commonly used measure of statistical dependence is linear correlation. However, practical use of the linear correlation measure has three main limitations: it cannot accurately model dependencies between signals with non-Gaussian distributions [3], it is restricted to measuring linear statistical dependencies, and it is very sensitive to outliers [4]. Rank correlation is another frequently used measure of association. However, it is only valid for monotonic functions and is not suitable for large data sets, as assigning ranks to a large number of observations is computationally demanding. Financial returns often have a large fraction of zero values, which result in tied ranks [5]. Rank correlation measures cannot accurately deal with the presence of tied ranks, and hence the results obtained can be misleading [6]. Another widely used method for multivariate dependence analysis in the financial sector is the use of copulas [7]. However, copula-based methods also suffer from major limitations in practice; for example, computation of copula functions involves calculating multiple moments as well as integration of joint distributions, which requires the use of numerical methods and hence becomes computationally complex [8]. Copula-based methods suffer from other major limitations as well, namely, the difficulties in accurate estimation of the copula functions, the empirical choice of the type of copulas, and problems in the design and use of time-dependent copulas [9]. Mutual information, the canonical measure of statistical dependence, is also used in practice. However, accurate computation of mutual information using finite data sets can be computationally complex (as we discuss later). In this paper, we present a computationally efficient independent component analysis (ICA) based approach to dynamically measure information coupling in multivariate non-Gaussian data streams, as a proxy measure for mutual information.

The paper is organised as follows. We first discuss the need for developing an ICA-based information coupling measure and present the theoretical framework underlying the development of our approach. We then present a brief introduction to the principles of ICA and discuss our method of choice for accurately and efficiently inferring the ICA unmixing matrix in a dynamic environment. We then proceed to present the ICA-based information coupling metric and describe its properties. Finally, we present a set of synthetic and financial data examples which make use of the information coupling measure to estimate dependencies in non-Gaussian signals.

2. Measuring Dependencies Using ICA: A Conceptual Overview

Mutual information is the canonical measure of statistical dependence in multivariate systems [10]. It is a quantitative measurement of how much information the observation of one variable gives us regarding another variable. Whilst the computation of mutual information is conceptually straightforward when the full probability density functions (pdfs) of the variables under consideration are available, it is often difficult to accurately estimate mutual information directly using finite data sets. This is especially true in high-dimensional spaces, in which computation of mutual information requires the estimation of multivariate joint distributions, a process which is unreliable (being exquisitely sensitive to the joint pdf over the variables of interest) as well as computationally expensive [11]. Existing approaches to compute mutual information include methods that are based on ranking of variables, kernel density estimation, k-nearest neighbours, and the Edgeworth approximation of differential entropy [12, 13]. However, all of these techniques impose a trade-off between computational complexity and accuracy; for most finite data sets, none of them outperforms the others, and all of them can be extremely sensitive to the presence of noise [12]. The accuracy of these approaches is also highly sensitive to the choice of the model parameters, such as the number of kernels or neighbours. Therefore, in most practical cases, the direct use of mutual information is not feasible. However, as we discuss below, it is possible to make use of information encoded in the ICA unmixing matrix to calculate information coupling as a proxy measure for mutual information.

Let us first take a look at the conceptual basis on which we can use ICA as a tool for measuring statistical dependencies. According to its classical definition, ICA estimates an unmixing matrix such that the mutual information between the independent source signals is minimised [14]. Hence, we can consider the unmixing matrix to contain information about the degree of mutual information between the observed signals. Although the direct computation of mutual information can be very expensive, alternative efficient approaches to ICA, which do not involve direct computation of mutual information, exist. Hence, it is possible to indirectly obtain an estimate for mutual information by using the ICA-based information coupling measure as a proxy.

Now let us consider some properties of financial returns which make them well suited to be analysed using ICA (as this paper is focused on financial applications, we consider the case of measuring dependencies in financial data; however, similar ideas can be applied to most real-world systems which give rise to non-Gaussian data). Financial markets are influenced by many independent factors, all of which have some finite effect on any specific financial time series. These factors can include, among others, news releases, price trends, macroeconomic indicators, and order flows. We hypothesise that the observed multivariate financial data may hence be generated as a result of a linear combination of some hidden (latent) variables [15, 16]. This process can be quantitatively described by using a linear generative model, such as principal component analysis (PCA), factor analysis (FA), or ICA. As financial returns have non-Gaussian distributions with heavy tails, PCA and FA are not suitable for modelling multivariate financial data, as both these second-order approaches are based on the assumption of Gaussianity [17]. ICA, in contrast, takes into account the non-Gaussian nature of the data being analysed by making use of higher-order statistics. ICA has proven applicability for multivariate financial data analysis; some interesting applications are presented in [15, 16, 18]. These and other similar studies make use of ICA primarily to extract the underlying latent source signals. However, all relevant information about the source mixing process is contained in the ICA unmixing matrix, which hence encodes dependencies. Therefore, in our analysis, we only make use of the ICA unmixing matrix (without extracting the independent components) to measure information coupling.

The ICA-based information coupling model we present in this paper can be used to directly measure statistical dependencies in high-dimensional spaces. This makes it particularly attractive for a range of practical applications in which relying solely on pair-wise analysis of dependencies is not feasible (there is surprisingly little work done towards addressing the important issue of estimating the dependency structure in high-dimensional multivariate systems, although there has been interest in this field for a long time [19]. High-dimensional analysis of information coupling has various important applications in the financial sector, including active portfolio management, multivariate financial risk analysis, statistical arbitrage, and pricing and hedging of various instruments [9]).


3. Independent Components, Unmixing, and Non-Gaussianity

Mixing two or more unique signals into a set of mixed observations results in an increase in the dependency of the pdfs of the mixed signals. The marginal pdfs of the observed mixed signals become more Gaussian due to the central limit theorem [20]. The mixing process also results in a reduction in the independence of the mixed signal distribution and hence an increase in the mutual information associated with it. Moreover, there is a rise in the stationarity of the mixed signals, which have flatter spectra as compared to the original sources [21]. Consider a set of $N$ observed signals $\mathbf{x}(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^\top$ at the time instant $t$, which are a mixture of $M$ source signals $\mathbf{s}(t) = [s_1(t), s_2(t), \ldots, s_M(t)]^\top$, mixed linearly using a mixing matrix $\mathbf{A}$, with observation noise $\mathbf{n}(t)$, as per

$$
\mathbf{x}(t) = \mathbf{A}\mathbf{s}(t) + \mathbf{n}(t). \tag{1}
$$

Independent component analysis (ICA) attempts to find an unmixing matrix $\mathbf{W}$ such that the $M$ recovered source signals $\mathbf{b}(t) = [b_1(t), b_2(t), \ldots, b_M(t)]^\top$ are given by

$$
\mathbf{b}(t) = \mathbf{W}\left(\mathbf{x}(t) - \mathbf{n}(t)\right). \tag{2}
$$

For the case where the observation noise $\mathbf{n}(t)$ is assumed to be normally distributed with a mean of zero, the least squares expected value of the recovered source signals is given by

$$
\mathbf{b}(t) = \mathbf{W}\mathbf{x}(t), \tag{3}
$$

where $\mathbf{W}$ is the pseudo-inverse of $\mathbf{A}$, that is,

$$
\mathbf{W} = \mathbf{A}^{+} = \left(\mathbf{A}^\top\mathbf{A}\right)^{-1}\mathbf{A}^\top. \tag{4}
$$

In the case of square mixing, $\mathbf{W} = \mathbf{A}^{-1}$.

Most ICA approaches make implicit or explicit assumptions regarding the parametric model of the pdfs of the independent sources [21]; for example, Gaussian mixture distributions are used as source models in [22, 23], while [24] makes use of a flexible source density model given by the generalised exponential distribution. In our analysis, we use a reciprocal cosh source model as a canonical heavy-tailed distribution, namely [21],

$$
p(s_i) = \frac{1}{Z_0 \cosh(s_i)}, \tag{5}
$$

where $s_i$ is the $i$th source and $Z_0$ is a normalising constant. This analytical fixed source model has no adjustable parameters; therefore, it has considerable computational advantages over alternative source models. Also, as this source model is heavier in the tails, it is able to accurately model heavy-tailed unimodal non-Gaussian distributions, such as financial returns.
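As an illustration of (1), (3), (4), and (5), the following minimal sketch draws sources from the reciprocal cosh density, mixes them without noise, and recovers them with the pseudo-inverse. The dimensions, the random mixing matrix, and the helper name `sample_reciprocal_cosh` are assumptions made for this example only; the sampler relies on the fact that the density $1/(\pi\cosh(s))$ has the closed-form cumulative distribution $(2/\pi)\arctan(e^{s})$.

```python
import numpy as np

def sample_reciprocal_cosh(rng, size):
    """Draw samples from p(s) = 1 / (pi * cosh(s)) by inverting its CDF.

    The CDF is F(s) = (2 / pi) * arctan(exp(s)), so s = log(tan(pi * u / 2)).
    """
    u = rng.uniform(1e-12, 1.0 - 1e-12, size=size)   # keep u away from 0 and 1
    return np.log(np.tan(0.5 * np.pi * u))

rng = np.random.default_rng(0)
M, N, T = 2, 3, 5000                      # sources, observed signals, samples (illustrative)

S = sample_reciprocal_cosh(rng, (M, T))   # latent sources s(t)
A = rng.normal(size=(N, M))               # hypothetical mixing matrix
X = A @ S                                 # noiseless observations: eq. (1) with n(t) = 0

W = np.linalg.pinv(A)                     # eq. (4): W = (A^T A)^{-1} A^T
B = W @ X                                 # eq. (3): recovered sources

# Numerical check of the normalising constant Z0 in eq. (5) using a Riemann sum.
grid = np.linspace(-40.0, 40.0, 200001)
Z0 = np.sum(1.0 / np.cosh(grid)) * (grid[1] - grid[0])
```

In this noiseless setting with a known, full-column-rank `A`, the recovered `B` matches `S` to numerical precision, and `Z0` agrees with $\pi$. In practice `A` is unknown, which is why the unmixing matrix has to be inferred.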

3.1. Inference. For our analysis, we make use of the icadec algorithm [21, 24] to infer the unmixing matrix. This algorithm constrains the unmixing matrix to the manifold of decorrelating matrices, thus offering rapid computation. The algorithm also gives accurate results compared to other related ICA approaches [24] and allows us to obtain a confidence measure for the unmixing matrix. Here we present a brief overview of this algorithm; an in-depth description is presented in [24].

The independent source signals obtained using ICA, $\mathbf{b}(t)$, must be at least linearly decorrelated for them to be classed as independent. The icadec algorithm makes use of this property of the independent components to efficiently infer the unmixing matrix. For a set of observed signals $\mathbf{X}$, where $\mathbf{X} = [\mathbf{x}(t)]_{t=1}^{t=T}$, the set of recovered independent components $\mathbf{B} = [\mathbf{b}(t)]_{t=1}^{t=T}$ is given by $\mathbf{B} = \mathbf{W}\mathbf{X}$. The independent components are linearly decorrelated if

$$
\mathbf{B}\mathbf{B}^\top = \mathbf{W}\mathbf{X}\mathbf{X}^\top\mathbf{W}^\top = \mathbf{D}^2, \tag{6}
$$

where $\mathbf{D}$ is a diagonal matrix of scaling factors. The singular value decomposition of the set of observed signals is given by

$$
\mathbf{X} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^\top, \tag{7}
$$

where $\mathbf{U}$ and $\mathbf{V}$ are orthogonal matrices, with the columns of $\mathbf{U}$ being the principal components of $\mathbf{X}$, and $\boldsymbol{\Sigma}$ is a diagonal matrix of the singular values of $\mathbf{X}$. It can be shown that the decorrelating matrix $\mathbf{W}$ can then be written as [24]

$$
\mathbf{W} = \mathbf{D}\mathbf{Q}\boldsymbol{\Sigma}^{-1}\mathbf{U}^\top, \tag{8}
$$

where $\mathbf{Q}$ is a real orthogonal matrix. To obtain an estimate for the ICA unmixing matrix, we need to optimise a given contrast function (we use the log-likelihood of the data, as described later). There are a variety of optimisation approaches which can be used; our approach of choice is the Broyden-Fletcher-Goldfarb-Shanno (BFGS) quasi-Newton method, which gives the best estimate of the minimum of the negative log-likelihood in a computationally efficient manner and also provides us with an estimate for the Hessian matrix. However, parameterising the optimisation problem directly by the elements of $\mathbf{Q}$ makes it a constrained minimisation problem, for which BFGS is not applicable. Therefore, to convert it into an unconstrained minimisation problem, we constrain $\mathbf{Q}$ to be orthonormal by parameterising its elements as the matrix exponential of a skew-symmetric matrix $\mathbf{J}$ (nonzero elements of this matrix are known as the Cayley coordinates), whose above-diagonal elements parameterise $\mathbf{Q}$ [21] (for $M$ sources and $N$ observed signals, the ICA unmixing matrix may be optimised in the $\frac{1}{2}M(M+1)$ dimensional space of decorrelating matrices rather than in the full $MN$ dimensional space, as $\frac{1}{2}M(M-1)$ and $M$ parameters are required to specify $\mathbf{Q}$ and $\mathbf{D}$, respectively. This feature offers considerable computational benefits, especially in high-dimensional spaces, and the resulting matrix is guaranteed to be decorrelating [24]):

$$
\mathbf{Q} = \exp(\mathbf{J}). \tag{9}
$$

Using this parameterisation makes it possible to apply BFGS to any contrast function; the contrast function used as part of the icadec algorithm is an expression for the log-likelihood of the data, as described below.
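The decorrelating parameterisation in (6) to (9) can be checked numerically. The sketch below is not the icadec implementation; it only builds one matrix of the form (8) from arbitrary free parameters and confirms that it decorrelates the data. The helper names `expm_skew` and `skew_from_params`, the toy data, and the parameter values are assumptions for illustration. The matrix exponential is computed with NumPy alone, using the fact that $i\mathbf{J}$ is Hermitian when $\mathbf{J}$ is real and skew-symmetric.

```python
import numpy as np

def expm_skew(J):
    """Matrix exponential of a real skew-symmetric J, giving an orthogonal Q.

    1j * J is Hermitian, so it has a real eigendecomposition 1j * J = V L V^H,
    and therefore exp(J) = V exp(-1j * L) V^H.
    """
    lam, V = np.linalg.eigh(1j * J)
    return (V @ np.diag(np.exp(-1j * lam)) @ V.conj().T).real

def skew_from_params(theta, M):
    """Fill the above-diagonal entries of an M x M skew-symmetric matrix."""
    J = np.zeros((M, M))
    J[np.triu_indices(M, k=1)] = theta
    return J - J.T

rng = np.random.default_rng(1)
M, T = 3, 2000
X = rng.normal(size=(M, M)) @ rng.laplace(size=(M, T))   # toy mixed observations
X = X - X.mean(axis=1, keepdims=True)

U, sigma, Vt = np.linalg.svd(X, full_matrices=False)     # eq. (7)
theta = rng.normal(size=M * (M - 1) // 2)                # M(M-1)/2 free parameters of J
d = rng.uniform(0.5, 2.0, size=M)                        # M diagonal entries of D

Q = expm_skew(skew_from_params(theta, M))                # eq. (9)
W = np.diag(d) @ Q @ np.diag(1.0 / sigma) @ U.T          # eq. (8)

B = W @ X
gram = B @ B.T                                           # equals D^2 by eq. (6)
```

Whatever values `theta` and `d` take, `gram` is diagonal with entries `d**2`, so an unconstrained optimiser can move freely in these $\frac{1}{2}M(M+1)$ parameters without leaving the set of decorrelating matrices.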

Using (1), and assuming that the observation noise ($\mathbf{n}$) is normally distributed with a mean of zero and has an isotropic covariance matrix with precision $\beta$, the distribution of the observations (as a preprocessing step, we normalise each observed signal to have a mean of zero and unit variance), conditioned on $\mathbf{A}$ and $\mathbf{s}$ (where we drop the time index $t$ for ease of presentation), is given by

$$
p(\mathbf{x} \mid \mathbf{A}, \mathbf{s}) = \mathcal{N}\left(\mathbf{x}; \mathbf{A}\mathbf{s}, \beta^{-1}\mathbf{I}\right), \tag{10}
$$

where $\mathbf{A}\mathbf{s}$ is the mean of the normal distribution and $\beta^{-1}\mathbf{I}$ is its covariance. The likelihood of an observation occurring is given by

$$
p(\mathbf{x} \mid \mathbf{A}) = \int p(\mathbf{x} \mid \mathbf{A}, \mathbf{s})\, p(\mathbf{s})\, d\mathbf{s}. \tag{11}
$$

Assuming that the distribution over sources has a single dominant peak, in this case given by the maximum likelihood source estimates $\hat{\mathbf{s}} = (\mathbf{A}^\top\mathbf{A})^{-1}\mathbf{A}^\top\mathbf{x}$, the integral in (11) can be analysed by using a simplified (computationally efficient) variant of Laplace's method, as shown in [21, 25]:

$$
p(\mathbf{x} \mid \mathbf{A}) = \int p(\mathbf{x} \mid \mathbf{A}, \mathbf{s})\, p(\mathbf{s})\, d\mathbf{s}
\approx p(\mathbf{x} \mid \mathbf{A}, \hat{\mathbf{s}})\, p(\hat{\mathbf{s}})\, (2\pi)^{M/2} \det(\mathbf{G})^{-1/2}, \tag{12}
$$

where $\mathbf{G}$ is the Hessian matrix

$$
\mathbf{G} = -\left[\frac{\partial^2 \log p(\mathbf{x} \mid \mathbf{A}, \mathbf{s})}{\partial s_i\, \partial s_j}\right]_{\mathbf{s} = \hat{\mathbf{s}}}. \tag{13}
$$

Taking the log of the expanded form of (10) gives

$$
\log p(\mathbf{x} \mid \mathbf{A}, \mathbf{s}) = \frac{N}{2}\log\left(\frac{\beta}{2\pi}\right) - \frac{\beta}{2}(\mathbf{x} - \mathbf{A}\mathbf{s})^\top(\mathbf{x} - \mathbf{A}\mathbf{s}), \tag{14}
$$

which via (13) results in $\mathbf{G} = \beta\mathbf{A}^\top\mathbf{A}$. The log-likelihood $\ell \equiv \log p(\mathbf{x} \mid \mathbf{A})$ is therefore

$$
\ell = \frac{N - M}{2}\log\left(\frac{\beta}{2\pi}\right) - \frac{\beta}{2}(\mathbf{x} - \mathbf{A}\hat{\mathbf{s}})^\top(\mathbf{x} - \mathbf{A}\hat{\mathbf{s}}) + \log p(\hat{\mathbf{s}}) - \frac{1}{2}\log\det\left(\mathbf{A}^\top\mathbf{A}\right). \tag{15}
$$

By using (8), we obtain $\mathbf{A} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{Q}^\top\mathbf{D}^{-1}$. Hence, the log-likelihood becomes [21]

$$
\ell = \frac{N - M}{2}\log\left(\frac{\beta}{2\pi e}\right) + \log p(\hat{\mathbf{s}}) + \log\det\left(\boldsymbol{\Sigma}^{-1}\mathbf{D}\right), \tag{16}
$$

where $e$ is the average reconstruction error. Noting that we use a reciprocal cosh source model (as given by (5)), it can be shown that taking the derivative of this log-likelihood expression with respect to $\mathbf{D}$ and $\mathbf{J}$ (which parameterises $\mathbf{Q}$), and following the resulting likelihood gradient using a BFGS optimiser, makes it possible to efficiently compute an optimum value for the ICA unmixing matrix; details of this procedure are presented in [24].
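For square mixing ($N = M$) the first term of (16) vanishes, and $\log\det(\boldsymbol{\Sigma}^{-1}\mathbf{D})$ equals $\log|\det\mathbf{W}|$ for $\mathbf{W}$ of the form (8). The sketch below evaluates this special case with the reciprocal cosh source model, taking $Z_0 = \pi$ and averaging over samples; both of those choices, the function name `log_likelihood_square`, and the toy mixing matrix are assumptions for illustration rather than the icadec code. It compares the true unmixing matrix with a rotated version that still decorrelates the outputs but no longer separates the sources.

```python
import numpy as np

def log_likelihood_square(W, X):
    """Average per-sample log-likelihood for square, noiseless mixing.

    Uses log p(s_i) = -log(pi) - log(cosh(s_i)) from the reciprocal cosh
    source model and adds log|det W|.
    """
    S_hat = W @ X
    # log(cosh(s)) computed stably as logaddexp(s, -s) - log(2).
    log_cosh = np.logaddexp(S_hat, -S_hat) - np.log(2.0)
    log_p = -(np.log(np.pi) + log_cosh).sum(axis=0)
    _, logabsdet = np.linalg.slogdet(W)
    return log_p.mean() + logabsdet

rng = np.random.default_rng(2)
T = 20000
u = rng.uniform(1e-12, 1.0 - 1e-12, size=(2, T))
S = np.log(np.tan(0.5 * np.pi * u))          # sources drawn from 1 / (pi * cosh(s))
A = np.array([[1.0, 0.6], [0.4, 1.0]])       # hypothetical square mixing matrix
X = A @ S

W_true = np.linalg.inv(A)
c = np.cos(np.pi / 4)
R = np.array([[c, -c], [c, c]])              # 45-degree rotation
W_rotated = R @ W_true                       # same |det|, but outputs are source mixtures

ll_true = log_likelihood_square(W_true, X)
ll_rotated = log_likelihood_square(W_rotated, X)
```

Because the rotation leaves $\log|\det\mathbf{W}|$ unchanged, any difference between `ll_true` and `ll_rotated` comes from the source model term alone, which is the term that the optimisation over $\mathbf{J}$ exploits.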

3.2. Dynamic Mixing. The standard (offline) ICA model uses all available data samples at times $t = 1, 2, \ldots, T$ of the observed signals $\mathbf{x}(t)$ to estimate a single static unmixing matrix $\mathbf{W}$. The unmixing matrix obtained provides a good estimate of the mixing process for the complete time series and is well suited for offline data analysis. However, many time series, such as financial data streams, are highly dynamic in nature with rapidly changing properties and therefore require a source separation method that can be used in a sequential manner. This issue is addressed here by using a sliding-window ICA model [26]. This model makes use of a sliding-window approach to sequentially update the current unmixing matrix using information contained in the previous window and can easily handle nonstationary data. The unmixing matrix for the current window, $\mathbf{W}(t)$, is used as a prior for computing the unmixing matrix for the next window, $\mathbf{W}(t + 1)$. This results in significant computational efficiency, as fewer iterations are required to obtain an optimum value for $\mathbf{W}(t + 1)$. The algorithm also results in an improvement in the source separation results obtained when the mixing process is drifting and addresses the ICA permutation and sign ambiguity issues [21] by maintaining a fixed (but of course arbitrary) ordering of recovered sources through time.
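The warm-start idea can be sketched as follows. To keep the example short, the per-window estimator is a plain natural-gradient maximum-likelihood ICA update with a tanh score function (the score of the reciprocal cosh model); this is a generic stand-in and not the icadec algorithm or the model of [26]. The function names, window length, step size, learning rate, and the toy drifting mixture are all assumptions for illustration.

```python
import numpy as np

def fit_unmixing(X, W_init, n_iter=50, lr=0.1):
    """Natural-gradient maximum-likelihood ICA with a tanh score function.

    A generic stand-in estimator, started from W_init rather than from scratch.
    """
    W = W_init.copy()
    T = X.shape[1]
    for _ in range(n_iter):
        S = W @ X
        grad = np.eye(W.shape[0]) - (np.tanh(S) @ S.T) / T
        W = W + lr * grad @ W
    return W

def sliding_window_unmixing(X, window, step):
    """Estimate W for each window, initialising from the previous estimate."""
    N, T = X.shape
    W = np.eye(N)
    estimates = []
    for start in range(0, T - window + 1, step):
        segment = X[:, start:start + window]
        # Normalise each signal in the window to zero mean and unit variance.
        segment = segment - segment.mean(axis=1, keepdims=True)
        segment = segment / segment.std(axis=1, keepdims=True)
        W = fit_unmixing(segment, W)      # warm start from the previous window
        estimates.append(W.copy())
    return np.array(estimates)

rng = np.random.default_rng(3)
T = 2000
S = rng.laplace(size=(2, T))
# Slowly drifting mixing: the off-diagonal weight grows through time.
mix = np.linspace(0.0, 0.8, T)
X = np.vstack([S[0] + mix * S[1], S[1] + mix * S[0]])

Ws = sliding_window_unmixing(X, window=500, step=250)   # one 2 x 2 matrix per window
```

Because each window starts from the previous estimate, the rows of successive matrices in `Ws` tend to keep the same ordering and sign, which is the practical benefit described above.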

4. Information Coupling

We now proceed to derive the ICA-based information coupling metric. Later in this section, we discuss the practical advantages this metric offers when used to analyse multivariate financial time series.

4.1. Coupling Metric. Let $\mathbf{W}$ be any arbitrary square ICA unmixing matrix (for the purpose of brevity and clarity, we only consider the case of square mixing while deriving the metric here. However, the metric derived in this section is valid for nonsquare mixing as well, and the corresponding derivation can be undertaken using a similar approach as presented here, but converting the nonsquare ICA unmixing matrices in each instance into square matrices by padding them with zeros):

$$
\mathbf{W} \in \mathbb{R}^{N \times N}. \tag{17}
$$

Since multiplication of $\mathbf{W}$ by a diagonal matrix does not affect the mutual information of the recovered sources, we row-normalise the unmixing matrix in order to address the ICA scale indeterminacy problem (most ICA algorithms, including icadec, suffer from the scale indeterminacy problem, that is, the variances of the independent components cannot be determined; this is because both the unmixing matrix and the source signals are unknown, and any scalar multiplication on either will be lost in the mixing process). Row normalisation implies that the elements $w_{ij}$ of the unmixing matrix $\mathbf{W}$ are constrained such that each row of the matrix is of unit length, that is,

$$
\sum_{j=1}^{N} w_{ij}^2 = 1 \tag{18}
$$

for all rows $i$. For a set of observed signals to be completely decoupled, their latent independent components must be the same as the observed signals; therefore, the row-normalised unmixing matrix for decoupled signals ($\mathbf{W}_0$) must be a permutation of the identity matrix ($\mathbf{I}$):

$$
\mathbf{W}_0 = \mathbf{P}\mathbf{I} \in \mathbb{R}^{N \times N}, \tag{19}
$$

where $\mathbf{P}$ is a permutation matrix. For the case where the observed signals are completely coupled, all the latent independent components must be the same; therefore, the row-normalised unmixing matrix for completely coupled signals ($\mathbf{W}_1$) is given by

$$
\mathbf{W}_1 = \frac{1}{\sqrt{N}}\mathbf{K} \in \mathbb{R}^{N \times N}, \tag{20}
$$

where $\mathbf{K}$ is the unit matrix (a matrix of ones).

To calculate coupling, we need to consider the distance between any arbitrary unmixing matrix ($\mathbf{W}$) and the zero coupling matrix ($\mathbf{W}_0$). The distance measure we use is the generalised 2-norm, also called the spectral norm, of the difference between the two matrices [27], although we can use some other norms as well to get similar results. The spectral norm of a matrix corresponds to its largest singular value and is the matrix equivalent of the vector Euclidean norm. Hence, the distance $d(\mathbf{W}, \mathbf{W}_0)$ between the two matrices can be written as

$$
d(\mathbf{W}, \mathbf{W}_0) = \|\mathbf{W} - \mathbf{W}_0\|_2, \tag{21}
$$

where $\|\cdot\|_2$ is the spectral norm of the matrix. As $\mathbf{W}_0$ is a permutation of the identity matrix,

$$
d(\mathbf{W}, \mathbf{W}_0) = \|\mathbf{W} - \mathbf{P}\mathbf{I}\|_2. \tag{22}
$$

As the spectral norm of a matrix is independent of its permutations, we may define another permutation matrix ($\hat{\mathbf{P}}$) such that

$$
d(\mathbf{W}, \mathbf{W}_0) = \|\hat{\mathbf{P}}\mathbf{W} - \mathbf{I}\|_2. \tag{23}
$$

For this equation, the following equality holds:

$$
d(\mathbf{W}, \mathbf{W}_0) = \|\hat{\mathbf{P}}\mathbf{W}\|_2 - 1. \tag{24}
$$

Again, noting that $\|\hat{\mathbf{P}}\mathbf{W}\|_2 = \|\mathbf{W}\|_2$, we have

$$
d(\mathbf{W}, \mathbf{W}_0) = \|\mathbf{W}\|_2 - 1. \tag{25}
$$

We normalise this measure with respect to the range over which the distance measure can vary, that is, the distance between matrices representing completely coupled ($\mathbf{W}_1$) and decoupled ($\mathbf{W}_0$) signals. From (20), we have that $\mathbf{W}_1 = (1/\sqrt{N})\mathbf{K}$; therefore,

$$
d(\mathbf{W}_1, \mathbf{W}_0) = \|\mathbf{W}_1 - \mathbf{W}_0\|_2 = \left\|\frac{1}{\sqrt{N}}\mathbf{K} - \mathbf{P}\mathbf{I}\right\|_2. \tag{26}
$$

Using the same analysis as presented previously, this equation can be simplified to

$$
d(\mathbf{W}_1, \mathbf{W}_0) = \left\|\frac{1}{\sqrt{N}}\mathbf{K}\right\|_2 - 1. \tag{27}
$$

For an $N$-dimensional square unit matrix, the spectral norm is given by $\|\mathbf{K}\|_2 = N$. Therefore, for a row-normalised unit matrix, the spectral norm is $\|(1/\sqrt{N})\mathbf{K}\|_2 = \sqrt{N}$. Hence, if $\mathbf{W}$ is row-normalised, (27) can be written as

$$
d(\mathbf{W}_1, \mathbf{W}_0) = \sqrt{N} - 1. \tag{28}
$$

The normalised information coupling metric ($\eta$) is then defined as

$$
\eta = \frac{d(\mathbf{W}, \mathbf{W}_0)}{d(\mathbf{W}_1, \mathbf{W}_0)}. \tag{29}
$$

Substituting (25) and (28) into (29), the normalised information coupling between $N$ observed signals is given by

$$
\eta = \frac{\|\mathbf{W}\|_2 - 1}{\sqrt{N} - 1}. \tag{30}
$$
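The metric in (30) is straightforward to compute once an unmixing matrix is available. The following minimal sketch row-normalises a square matrix as in (18) and evaluates $\eta$; the function name `information_coupling` and the example matrices are assumptions for illustration.

```python
import numpy as np

def information_coupling(W):
    """Normalised information coupling, eq. (30), for a square unmixing matrix."""
    W = np.asarray(W, dtype=float)
    N = W.shape[0]
    # Row-normalise so that each row has unit length, eq. (18).
    W = W / np.linalg.norm(W, axis=1, keepdims=True)
    spectral_norm = np.linalg.norm(W, ord=2)      # largest singular value
    return (spectral_norm - 1.0) / (np.sqrt(N) - 1.0)

eta_decoupled = information_coupling(np.eye(3)[[2, 0, 1]])     # permuted identity, W_0
eta_coupled = information_coupling(np.ones((3, 3)))            # matrix of ones, W_1 after normalisation
eta_partial = information_coupling([[1.0, 0.5], [0.5, 1.0]])   # an intermediate case
```

The two limiting cases return 0 and 1, respectively, and the intermediate matrix returns a value strictly between them.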

We can consider the bounds of $\eta$ as described below. Suppose $\mathbf{M}$ is an arbitrary real matrix. We can look upon the spectral norm of this matrix ($\|\mathbf{M}\|_2 = \|\mathbf{M} - \mathbf{0}\|_2$) as a measure of departure (distance) of $\mathbf{M}$ from a null matrix ($\mathbf{0}$) [28]. The bounds on this norm are given by

$$
0 \le \|\mathbf{M}\|_2 \le \infty. \tag{31}
$$

If $\mathbf{M}$ is a row-normalised ICA unmixing matrix ($\mathbf{W}$), then (as discussed earlier) it lies between $\mathbf{W}_0$ and $\mathbf{W}_1$. Hence, the bounds on $\mathbf{W}$ are

$$
\|\mathbf{W}_0\|_2 \le \|\mathbf{W}\|_2 \le \|\mathbf{W}_1\|_2. \tag{32}
$$

Using (19) and (20), we can write

$$
\|\mathbf{P}\mathbf{I}\|_2 \le \|\mathbf{W}\|_2 \le \left\|\frac{1}{\sqrt{N}}\mathbf{K}\right\|_2, \tag{33}
$$

which can be simplified to

$$
1 \le \|\mathbf{W}\|_2 \le \sqrt{N}. \tag{34}
$$

Rearranging terms in this inequality gives

$$
0 \le \frac{\|\mathbf{W}\|_2 - 1}{\sqrt{N} - 1} \le 1, \tag{35}
$$

which gives us the same coupling metric as in (30) and shows that the metric is normalised, that is, $0 \le \eta \le 1$. For real-valued $\mathbf{W}$, $\|\mathbf{W}\|_2$ can be written as

$$
\|\mathbf{W}\|_2 = \sqrt{\lambda_{\max}\left(\mathbf{W}^\top\mathbf{W}\right)}, \tag{36}
$$

where $\lambda_{\max}(\mathbf{W}^\top\mathbf{W})$ is the maximum eigenvalue of $\mathbf{W}^\top\mathbf{W}$. The unmixing matrix obtained using most ICA algorithms, including icadec, suffers from row permutation and sign ambiguity problems, that is, the rows are arranged in a random order and the sign of the elements in each row is unknown [21, 24]. We note that, as $\|\mathbf{W}\|_2$ is independent of the sign and permutations of the rows of $\mathbf{W}$, our measure of information coupling directly addresses the problems of ICA sign and permutation ambiguities. Also, as the metric's value is independent of the row permutations of $\mathbf{W}$, it provides symmetric results. The information coupling metric is valid for all dimensions of the unmixing matrix $\mathbf{W}$. This implies that information coupling can be easily measured in high-dimensional spaces.
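The invariance to row order and row sign, and the eigenvalue form (36), can be verified numerically. The sketch below uses a random matrix as a placeholder for an estimated unmixing matrix; the dimension and the random seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 4
W = rng.normal(size=(N, N))                          # placeholder unmixing matrix
W = W / np.linalg.norm(W, axis=1, keepdims=True)     # row-normalise, eq. (18)

# Spectral norm computed two ways: largest singular value and eq. (36).
norm_svd = np.linalg.norm(W, ord=2)
norm_eig = np.sqrt(np.linalg.eigvalsh(W.T @ W).max())

# Randomly permute the rows and flip their signs.
perm = rng.permutation(N)
signs = rng.choice([-1.0, 1.0], size=N)
W_ambiguous = signs[:, None] * W[perm]

eta = (norm_svd - 1.0) / (np.sqrt(N) - 1.0)
eta_ambiguous = (np.linalg.norm(W_ambiguous, ord=2) - 1.0) / (np.sqrt(N) - 1.0)
```

Permuting rows and flipping signs amounts to left-multiplying $\mathbf{W}$ by an orthogonal matrix, which leaves the singular values, and therefore $\eta$, unchanged.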

It is possible to obtain a measure of uncertainty in our estimation of the information coupling measure. We make use of the BFGS quasi-Newton optimisation approach over the most probable skew-symmetric matrix $\hat{\mathbf{J}}$ of (9), from which estimates for the unmixing matrix $\mathbf{W}$ can be obtained and thence the coupling measure $\eta$ calculated. We also estimate the Hessian (inverse covariance) matrix $\mathbf{H}$ for $\hat{\mathbf{J}}$ as part of this process. Hence, it is possible to draw samples, $\mathbf{J}'$ say, from the distribution over $\mathbf{J}$ as a multivariate normal:

$$
\mathbf{J}' \sim \mathcal{MN}\left(\hat{\mathbf{J}}, \mathbf{H}^{-1}\right). \tag{37}
$$

These samples can be readily transformed to samples in $\eta$ using (9), (8), and (30), respectively. Confidence bounds (here we use the 95% bounds) may then be easily obtained from the set of samples for $\eta$ (in our analysis, we use 100 samples).
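The sampling procedure can be sketched as follows. The optimiser outputs used here (`theta_hat`, `H`, `d`, `sigma`, `U`) are hypothetical values chosen only to make the example run, and the normal distribution is placed over the free above-diagonal entries of $\mathbf{J}$, which is an assumption about how (37) is parameterised. The helper names are likewise illustrative.

```python
import numpy as np

def expm_skew(J):
    """exp(J) for real skew-symmetric J, via the Hermitian matrix 1j * J."""
    lam, V = np.linalg.eigh(1j * J)
    return (V @ np.diag(np.exp(-1j * lam)) @ V.conj().T).real

def coupling_from_theta(theta, d, sigma, U):
    """Map above-diagonal entries of J to eta through eqs. (9), (8), and (30)."""
    N = len(d)
    J = np.zeros((N, N))
    J[np.triu_indices(N, k=1)] = theta
    Q = expm_skew(J - J.T)                                  # eq. (9)
    W = np.diag(d) @ Q @ np.diag(1.0 / sigma) @ U.T         # eq. (8)
    W = W / np.linalg.norm(W, axis=1, keepdims=True)        # row-normalise
    return (np.linalg.norm(W, ord=2) - 1.0) / (np.sqrt(N) - 1.0)   # eq. (30)

rng = np.random.default_rng(5)
N = 3
# Hypothetical optimiser outputs, chosen only for illustration.
theta_hat = np.array([0.3, -0.2, 0.5])         # most probable free entries of J
H = 400.0 * np.eye(theta_hat.size)             # Hessian (inverse covariance)
d = np.ones(N)                                 # diagonal of D
sigma = np.array([3.0, 2.0, 1.0])              # singular values of X
U, _ = np.linalg.qr(rng.normal(size=(N, N)))   # an arbitrary orthogonal matrix

# Draw 100 samples as in eq. (37) and push each one through to eta.
samples = rng.multivariate_normal(theta_hat, np.linalg.inv(H), size=100)
etas = np.array([coupling_from_theta(th, d, sigma, U) for th in samples])
lower, upper = np.percentile(etas, [2.5, 97.5])   # 95% bounds
eta_hat = coupling_from_theta(theta_hat, d, sigma, U)
```

Taking percentiles of the transformed samples avoids any assumption that $\eta$ itself is normally distributed, which matters because $\eta$ is bounded between 0 and 1.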

4.2. Computational Complexity. The information coupling algorithm achieves computational efficiency by making use of the sliding-window based decorrelating manifold approach to ICA. Making use of the reciprocal cosh based ICA source model also results in significant computational advantages. We now take a look at the comparative computational complexity of the information coupling measure and three frequently used measures of statistical dependence, that is, linear correlation, rank correlation, and mutual information. For bivariate data ($n_s$ samples long), for which these four measures are directly comparable, linear correlation and rank correlation have time complexities of order $O(n_s)$ and $O(n_s \log n_s)$, respectively [29], while mutual information and information coupling scale as $O(n_s^2)$ and $O(n_s)$, respectively (there have been various estimation algorithms proposed for efficient computation of mutual information; however, they all result in increased estimation errors and require careful selection of various user-defined parameters [30]) [31]. Hence, even though the time complexity of the information coupling measure is of the same order as linear correlation, it can still accurately capture statistical dependencies in non-Gaussian data streams and is a computationally efficient proxy for mutual information.

For $N$-dimensional multivariate data, direct computation of mutual information has time complexity of order $O(n_s^N)$, compared to $O(n_s N^3)$ for the information coupling measure. In high dimensions, even an approximation for mutual information can be computationally very costly. For example, using a Parzen-window density estimator, the mutual information computational complexity can be reduced to $O(n_s n_b^N)$, where $n_b$ is the number of bins used for estimation [32], which will incur a very high computational cost even for relatively small values of $N$, $n_b$, and $n_s$. As a simple example, Table 1 shows a comparison of computation time (in seconds) taken by mutual information and information coupling measures for analysing bivariate data sets of varying lengths. As expected, mutual information estimation using the Parzen window based approach (which is considered to be a relatively efficient approach to compute mutual information) becomes computationally very demanding with an increase in the number of samples of the bivariate data set. In contrast, the information coupling measure is computationally efficient, even when used to analyse very large high-dimensional multivariate data sets.

Table 1: Example showing comparison of average computation time (in seconds) of mutual information and information coupling when these measures are used to analyse bivariate data sets containing different numbers of samples ($n_s$). The approach used to estimate mutual information is based on a Parzen window based algorithm, as described in [45]. The computational cost of this algorithm is dependent on the window-size ($h$). The values of $h$ used for the simulations are $h = 20$ for $n_s = 10^2$ and $h = 100$ for all other simulations. Results are obtained using a 2.66 GHz processor as an average of 100 simulations.

| Computation time (sec) | $n_s = 10^2$ | $n_s = 10^3$ | $n_s = 10^4$ | $n_s = 10^5$ |
|---|---|---|---|---|
| Mutual information | 0.0214 | 2.8289 | 17.0101 | 119.5088 |
| Information coupling | 0.0073 | 0.0213 | 0.0561 | 0.5543 |

4.3. Capturing Market Dynamics. To dynamically analyse information coupling in multivariate financial data streams, we need to make use of windowing techniques. Financial markets give rise to well-defined events, such as orders, trades, and quote revisions. These events are irregularly spaced in clock-time, that is, they are asynchronous. Statistical models in clock-time make use of data aggregated over fixed intervals of time [33]. The time at which these events are indexed is called the event-time. Hence, for dynamic modelling in event-time, the number of data points can be regarded as fixed while time varies, whereas in clock-time the time period is considered to be fixed with a variable number of data points. Although we may need adaptive windows in clock-time, we can use sliding-windows of fixed length in event-time. Using fixed length sliding-windows in event-time can be useful for obtaining consistent results when developing and testing different statistical models. Also, statistical models deployed for online analysis of financial data operate best in event-time, as they often need to make decisions as soon as some new market information (such as a quote update) becomes available. Consider an online trading model making use of an adaptive window in event-time. At specific times of the day, for example, at times of major news announcements, trading volume can significantly increase. Hence, more data will be available to the algorithm, and thus the results obtained can be misleading [34]. Using a sliding-window of fixed length in event-time can overcome this problem. The length of the sliding-window needs to be selected appropriately. The financial application for which the model is being used is one of the factors which drives the choice of window length. As a general rule for trading models, a window of approximately the same size as the average time period between placing trades (inverse of trading frequency) is often used. This makes it possible to accurately capture the rapidly evolving dynamics of the markets over the corresponding period, without the window being so long as to capture only major trends or so short as to capture only noise in the data.
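The distinction between the two kinds of window can be made concrete with a short sketch. The function names and the example timestamps below are illustrative assumptions, not part of the original analysis: `event_time_windows` always returns the same number of events per window, whereas `clock_time_window_counts` shows how the number of events inside a fixed-duration trailing window fluctuates when activity is bursty.

```python
import numpy as np


def event_time_windows(values, window_len):
    """Yield sliding windows that each hold exactly window_len events."""
    for end in range(window_len, len(values) + 1):
        yield values[end - window_len:end]


def clock_time_window_counts(timestamps, window_seconds):
    """Count events in the trailing interval (t - window_seconds, t] ending at each event.

    Assumes timestamps are sorted in ascending order.
    """
    timestamps = np.asarray(timestamps, dtype=float)
    # Index of the first event strictly after the start of each trailing window.
    starts = np.searchsorted(timestamps, timestamps - window_seconds, side="right")
    return np.arange(1, len(timestamps) + 1) - starts


# Hypothetical quote-arrival times (seconds): a burst, a quiet spell, another burst.
ticks = [0.0, 0.1, 0.2, 5.0, 9.0, 9.1, 9.2, 9.3]
counts = clock_time_window_counts(ticks, window_seconds=1.0)
windows = list(event_time_windows(list(range(len(ticks))), window_len=3))
```

In this toy stream the clock-time window holds anywhere between one and four events, while every event-time window holds exactly three, which is the consistency property the text relies on.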

4.4. Discussion. The information coupling model offers multiple advantages when used to analyse multivariate financial data. Here we summarise some of the main properties of the model, while the empirical results presented in the next section showcase some of its practical benefits.

(i) The information coupling measure, a proxy for mutual information, is able to accurately pick up statistical dependencies in data sets with non-Gaussian distributions (such as financial returns).

(ii) The information coupling algorithm is computationally efficient, which makes it particularly suitable for use in an online dynamic environment. This makes the algorithm especially attractive when dealing with data sampled at high frequencies, because, with the ever-increasing use of high-frequency data, overcoming sources of latency is of utmost importance in a variety of applications in modern financial markets.

(iii) It gives confidence levels on the information coupling measure. This allows us to estimate the uncertainty associated with the measurements.

(iv) The metric provides normalised results, that is, information coupling ranges from 0 for decoupled systems to 1 for completely coupled systems. This makes it easier to analyse results obtained using the metric and to compare its performance with other similar measures of association. The metric also gives symmetric results.

(v) The metric is valid for any number of signals in high-dimensional spaces, that is, it consistently gives accurate results irrespective of the number of time series between which information coupling is being computed. This makes it suitable for a range of financial applications.

(vi) It is not data intensive, that is, it gives relatively accurate results even when a small sample size is used. This allows the metric to model the complex and rapidly changing dynamics of financial markets.

(vii) It does not depend on user-defined parameters. Such parameters can restrict the practical utility of a measure, as evolving market conditions may require them to be constantly updated, which may not be practical.

5. Results

We now present a set of synthetic and financial data examples showing the relative accuracy and practical utility of the information coupling measure. The following notations are used for different measures of statistical dependence in this paper: ICA-based information coupling ($\eta$), linear correlation ($\rho$), rank correlation ($\rho_R$), and normalised mutual information ($I_N$). Normalisation of mutual information values ($I$) is achieved using the following transformation [35]:

$$I_N = \sqrt{1 - \exp(-2I)}. \tag{38}$$
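A brief sketch of the transformation in (38) is given below, assuming mutual information is measured in nats. The helper `gaussian_mutual_information` is an addition for illustration only: it uses the standard closed form $I = -\tfrac{1}{2}\ln(1-\rho^2)$ for a bivariate Gaussian, for which the normalised value reduces to $|\rho|$. This shows why $I_N$ lies on the same 0 to 1 scale as the other measures.

```python
import numpy as np


def normalised_mutual_information(mi_nats):
    """Equation (38): map mutual information I >= 0 (in nats) onto [0, 1)."""
    mi_nats = np.asarray(mi_nats, dtype=float)
    return np.sqrt(1.0 - np.exp(-2.0 * mi_nats))


def gaussian_mutual_information(rho):
    """Closed-form mutual information of a bivariate Gaussian with correlation rho."""
    return -0.5 * np.log(1.0 - rho ** 2)


# For Gaussian data the normalised value recovers the absolute correlation.
i_n = normalised_mutual_information(gaussian_mutual_information(0.6))
```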

The financial data examples we present in this paper primarily make use of spot foreign exchange (FX) data. Spot price, or spot rate, is the price which is actually quoted for a currency transaction to take place. Return is the profit realised when a currency is traded. It is common practice to use the normalised log-returns of financial data in statistical analysis instead of the raw data itself [36]. Using log-returns makes it possible to convert exponential problems into linear ones, thus significantly simplifying relevant analysis. A normalised log-returns data set, with a mean of zero and unit variance, can generally be regarded as a locally stationary process [37]. Therefore, many signal processing techniques meant solely for stationary processes can be successfully applied to the normalised log-returns time series in an adaptive environment. Denoting the mid-price at time $t$ of a financial time series by $P_t$, the log-return value is given by

$$r_t = \ln\left(\frac{P_t}{P_{t-1}}\right). \tag{39}$$

The normalisation of log-returns is achieved by converting the data to a form such that it has a mean of zero and unit variance. This is easily achieved by removing the mean and dividing by the standard deviation of the data.
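The preprocessing in (39), followed by the normalisation step, can be sketched in a few lines. The function name and the example prices are illustrative assumptions; the population standard deviation (`ddof=0`) is used here so that the output has exactly unit variance, although the text does not state which estimator was used.

```python
import numpy as np


def normalised_log_returns(prices):
    """Log-returns r_t = ln(P_t / P_{t-1}), shifted and scaled to zero mean, unit variance."""
    prices = np.asarray(prices, dtype=float)
    r = np.diff(np.log(prices))        # equation (39)
    return (r - r.mean()) / r.std()    # population std (ddof=0), an assumption


# Hypothetical mid-prices for a short window.
mid_prices = [1.3000, 1.3012, 1.2995, 1.3020, 1.3031, 1.3008]
z = normalised_log_returns(mid_prices)
```

In a sliding-window setting the same function would be applied to each window separately, so that the normalisation adapts to local changes in volatility.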

5.1. Synthetic Data. There is no single distribution which fits financial returns, especially those sampled at higher frequencies, which tend to be highly non-Gaussian [2], although there have been attempts to model returns using a variety of distributions [38]. In this paper, we aim to capture the heavy-tailed, skewed properties of financial returns using a Pearson type IV distribution [39, 40], which can be used to represent distributions with varying degrees of skewness and kurtosis and is thus useful for representing distributions of financial returns [41, 42]. The first four moments of the distribution can be uniquely determined by setting four parameters which characterise the distribution [43]. Until recently, due to its mathematical and computational complexity, this distribution has not been widely used in the financial literature, although this is rapidly changing with advances in computational power and the proposal of new, improved analytical methods [39, 42, 44].

Table 2: Accuracy of four measures of dependence, that is, information coupling ($\eta$), linear correlation ($\rho$), rank correlation ($\rho_R$), and normalised mutual information ($I_N$), when used to estimate the level of dependence in a coupled system with varying levels of true correlation ($\rho_{TC}$). $\rho_{TC}$ is induced in an independent bivariate randomly distributed system using the Iman-Conover method, as described in the text. The dependence estimates, together with their standard deviation values, shown in the table are obtained using 1000 independent simulations, using 1000 data points long data sets for each simulation. The last row of the table gives values for the mean absolute error (MAE).

| $\rho_{TC}$ | $\eta$ | $\rho$ | $\rho_R$ | $I_N$ | $\lvert\rho_{TC} - \eta\rvert$ | $\lvert\rho_{TC} - \rho\rvert$ | $\lvert\rho_{TC} - \rho_R\rvert$ | $\lvert\rho_{TC} - I_N\rvert$ |
|---|---|---|---|---|---|---|---|---|
| 0 | 0.0046 ± 0.0026 | 0.0038 ± 0.0022 | 0.0147 ± 0.0115 | 0.1851 ± 0.0401 | 0.0046 | 0.0038 | 0.0147 | 0.1851 |
| 0.1 | 0.1079 ± 0.0073 | 0.0917 ± 0.0058 | 0.0942 ± 0.0184 | 0.1687 ± 0.0412 | 0.0079 | 0.0083 | 0.0058 | 0.0687 |
| 0.2 | 0.2145 ± 0.0102 | 0.1862 ± 0.0093 | 0.1899 ± 0.0177 | 0.1131 ± 0.0464 | 0.0145 | 0.0138 | 0.0111 | 0.0869 |
| 0.3 | 0.3176 ± 0.0138 | 0.2812 ± 0.0130 | 0.2863 ± 0.0167 | 0.1644 ± 0.0538 | 0.0176 | 0.0188 | 0.0137 | 0.1356 |
| 0.4 | 0.4166 ± 0.0210 | 0.3764 ± 0.0165 | 0.3835 ± 0.0153 | 0.2787 ± 0.0500 | 0.0166 | 0.0236 | 0.0165 | 0.1213 |
| 0.5 | 0.5135 ± 0.0196 | 0.4720 ± 0.0197 | 0.4818 ± 0.0136 | 0.3819 ± 0.0475 | 0.0135 | 0.0280 | 0.0182 | 0.1181 |
| 0.6 | 0.6070 ± 0.0347 | 0.5688 ± 0.0226 | 0.5815 ± 0.0117 | 0.4788 ± 0.0458 | 0.0070 | 0.0312 | 0.0185 | 0.1212 |
| 0.7 | 0.7011 ± 0.0232 | 0.6670 ± 0.0242 | 0.6830 ± 0.0093 | 0.5728 ± 0.0483 | 0.0011 | 0.0330 | 0.0170 | 0.1272 |
| 0.8 | 0.7936 ± 0.0239 | 0.7676 ± 0.0249 | 0.7864 ± 0.0066 | 0.6652 ± 0.0490 | 0.0064 | 0.0324 | 0.0136 | 0.1348 |
| 0.9 | 0.8864 ± 0.0252 | 0.8713 ± 0.0249 | 0.8919 ± 0.0034 | 0.7595 ± 0.0501 | 0.0136 | 0.0287 | 0.0081 | 0.1405 |
| 1.0 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000 | 0.9601 ± 0.0185 | 0.0000 | 0.0000 | 0.0000 | 0.0399 |
| MAE | | | | | 0.0093 | 0.0201 | 0.0125 | 0.1163 |

To test the accuracy of various measures of dependence, we need to generate coupled non-Gaussian synthetic data with known, predefined correlation values. There is no straightforward way to simulate correlated random variables when their joint distribution is not known [46], as is the case with multivariate financial returns. One possible method that can be used to induce any desired predefined correlation between independent randomly distributed variables, irrespective of their distributions, is commonly known as the Iman-Conover method, as presented in [47]. This method is based on inducing a known dependency structure in samples taken from the input independent marginal distributions using reordering techniques. The multivariate coupled structure obtained as the output can thus be used as the input data in various models of dependency analysis to test their relative accuracies.
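The reordering idea behind the Iman-Conover method can be sketched as follows. This is a compact illustration of the general technique under stated assumptions, not the exact procedure used to produce the results: the function name is hypothetical, normal (van der Waerden) scores are used for the reference matrix, and Student-t and gamma marginals stand in for the Pearson type IV distribution.

```python
import numpy as np
from scipy.stats import norm, rankdata


def iman_conover(x, target_corr, seed=0):
    """Reorder columns of x so their rank dependence approximates target_corr.

    x           : (n, k) array of independent samples; each marginal is preserved
    target_corr : (k, k) positive-definite target correlation matrix
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n, k = x.shape

    # Reference scores: normal scores, independently shuffled in each column.
    scores = norm.ppf(np.arange(1, n + 1) / (n + 1))
    m = np.column_stack([rng.permutation(scores) for _ in range(k)])

    # Linear transform that gives the scores exactly the target sample correlation.
    f = np.linalg.cholesky(np.corrcoef(m, rowvar=False))
    p = np.linalg.cholesky(np.asarray(target_corr, dtype=float))
    m_star = m @ (p @ np.linalg.inv(f)).T

    # Reorder each input column so that it shares the ranks of the transformed scores.
    out = np.empty_like(x)
    for j in range(k):
        ranks = rankdata(m_star[:, j], method="ordinal").astype(int) - 1
        out[:, j] = np.sort(x[:, j])[ranks]
    return out


# Demo with stand-in heavy-tailed and skewed marginals (not Pearson type IV).
rng = np.random.default_rng(1)
x_indep = np.column_stack([rng.standard_t(df=5, size=1000),
                           rng.gamma(shape=2.0, size=1000)])
target = np.array([[1.0, 0.6], [0.6, 1.0]])
x_coupled = iman_conover(x_indep, target)
rank_corr = np.corrcoef(rankdata(x_coupled[:, 0]), rankdata(x_coupled[:, 1]))[0, 1]
```

Only the order of the samples within each column changes, so the marginal distributions (and hence their skewness and kurtosis) are left untouched while the dependence structure is imposed.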

We set parameters of the Pearson type IV distribution such that the coupled synthetic data we obtain using the Iman-Conover method has similar properties to financial returns. But first we need to consider some properties of financial returns which we want to mimic. Figures 1(a) and 1(b) show the distributions of two higher-order moments, that is, skewness ($\gamma$) and kurtosis ($\kappa$), for FX spot data sets sampled at three different frequencies. The plots are obtained using a sliding-window of length 50 data points, as an average of all G10 (Group of Ten) currency pairs, covering a period of 8 hours in the case of 0.25 second and 0.5 second sampled data and 2 years in the case of 0.5 hour sampled data. The non-Gaussian (heavy-tailed, skewed) nature of the data is clearly visible. It is interesting to note that the kurtosis value almost never goes below three for any of the data sets, signifying the temporal persistence of non-Gaussianity. We now make use of the Iman-Conover method to induce varying levels of correlation between 1000 samples taken from independent, randomly distributed Pearson type IV distributions. A 1000-sample data set makes it easier to accurately induce predefined correlations in the system and also makes it possible to generate data with relatively accurate average kurtosis and skewness values, hence allowing us to accurately capture the higher-order moments of financial returns. As an example, Figures 1(c) and 1(d) show distributions of the kurtosis and skewness of two variables for 1000 independent simulations. Note the similarity of these plots with the average of the corresponding distributions of higher-order moments for financial data, as presented in Figures 1(a) and 1(b). This shows the effectiveness of using synthetic data sampled from a Pearson type IV distribution for capturing higher-order moments of financial returns.

Four different approaches are now used to estimate the level of dependence between the output coupled data. The process is repeated 1000 times for each level of true correlation ($\rho_{TC}$). Table 2 presents the results obtained. The results show the accuracy of the information coupling measure when used to analyse non-Gaussian data. For this synthetic data example, on average (by mean absolute error), the information coupling measure was 53.7% more accurate than the linear correlation measure and 25.6% more accurate than the rank correlation measure. The normalised mutual information provided the least accurate results.
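The quoted percentages follow directly from the MAE row of Table 2, as the short check below shows. The absolute-error columns for $\eta$ and $\rho$ are copied from the table; the variable names are ours, and the relative improvement is taken to be the reduction in MAE relative to the competing measure, which is an interpretation that reproduces the stated figures.

```python
import numpy as np

# Absolute errors |rho_TC - eta| and |rho_TC - rho| as printed in Table 2.
err_eta = np.array([0.0046, 0.0079, 0.0145, 0.0176, 0.0166, 0.0135,
                    0.0070, 0.0011, 0.0064, 0.0136, 0.0000])
err_rho = np.array([0.0038, 0.0083, 0.0138, 0.0188, 0.0236, 0.0280,
                    0.0312, 0.0330, 0.0324, 0.0287, 0.0000])

mae_eta = round(float(err_eta.mean()), 4)   # MAE for information coupling
mae_rho = round(float(err_rho.mean()), 4)   # MAE for linear correlation
mae_rank = 0.0125                           # MAE for rank correlation (Table 2, last row)

# Relative reduction in MAE achieved by information coupling, in percent.
gain_vs_linear = 100.0 * (mae_rho - mae_eta) / mae_rho
gain_vs_rank = 100.0 * (mae_rank - mae_eta) / mae_rank
```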

Figure 1: (a, b) Plots showing the normalised pdfs of (a) kurtosis ($\kappa$) and (b) skewness ($\gamma$), obtained using a sliding-window of length 50 data points, as an average of all G10 currency pairs, covering a period of 8 hours in case of 0.25 second and 0.5 second sampled data and 2 years in case of 0.5 hour sampled data. The plots clearly show the heavy-tailed, skewed nature of FX returns at all three sampling frequencies. (c, d) An example of the normalised pdf plots showing the average kurtosis and skewness values, respectively, for synthetic data generated using Pearson type IV distributions. The data has properties similar to the average of the distributions presented in (a) and (b). The vertical lines indicate the kurtosis ($\kappa = 3$) and skewness ($\gamma = 0$) values for a Gaussian distribution.

We now extend this example by incorporating data dynamics. The same data generation process as described above is now used to construct a 32000 samples long bivariate data set in which the induced true correlation changes every 8000 time steps, that is, $\rho_{TC} = 0.2$ when $t = 1{:}8000$, $\rho_{TC} = 0.4$ when $t = 8001{:}16000$, $\rho_{TC} = 0.6$ when $t = 16001{:}24000$, and $\rho_{TC} = 0.8$ when $t = 24001{:}32000$. A 1000 data points wide sliding-window is used to dynamically measure dependencies in the data set. The resulting temporal information coupling plot, together with the 5th and 95th percentile confidence intervals, is presented in Figure 2(a). The four different coupling regions are clearly visible, together with the step changes in coupling after every 8000 time steps, showing the ability of the algorithm to detect abrupt changes in coupling. The normalised empirical probability distributions over $\eta$ for the four coupling regions are shown in Figure 2(b). Also plotted in the same figure are the normalised empirical pdfs for linear correlation ($\rho$) and rank correlation ($\rho_R$). The mutual information pdf is omitted for clarity, as it gives relatively less accurate results, as presented in Table 2. It is interesting to see how the peaks of the $\eta$ distribution correspond very closely to $\rho_{TC}$ values, showing the ability of the information coupling model to accurately capture statistical dependencies in a dynamic environment. The least accurate measure in this example is the linear correlation.
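The structure of this dynamic experiment (step changes in true correlation every 8000 samples, tracked with a 1000-sample sliding-window) can be sketched as below. Two simplifications are assumed and should be kept in mind: the data are Gaussian rather than Pearson type IV, and linear correlation is used as the windowed measure in place of information coupling, since computing $\eta$ requires the ICA machinery of the earlier sections. The function name is hypothetical.

```python
import numpy as np


def rolling_correlation(x, y, window):
    """Pearson correlation over every length-`window` sliding window (cumulative sums)."""
    def rolling_sum(a):
        c = np.cumsum(np.insert(a, 0, 0.0))
        return c[window:] - c[:-window]

    sx, sy = rolling_sum(x), rolling_sum(y)
    sxx, syy, sxy = rolling_sum(x * x), rolling_sum(y * y), rolling_sum(x * y)
    cov = sxy - sx * sy / window
    var_x = sxx - sx ** 2 / window
    var_y = syy - sy ** 2 / window
    return cov / np.sqrt(var_x * var_y)


# Synthetic stream whose true correlation steps up every 8000 samples.
rng = np.random.default_rng(0)
segment_len, window = 8000, 1000
true_corr = [0.2, 0.4, 0.6, 0.8]
x_parts, y_parts = [], []
for rho_tc in true_corr:
    z1 = rng.standard_normal(segment_len)
    z2 = rng.standard_normal(segment_len)
    x_parts.append(z1)
    y_parts.append(rho_tc * z1 + np.sqrt(1.0 - rho_tc ** 2) * z2)
x, y = np.concatenate(x_parts), np.concatenate(y_parts)

rho_t = rolling_correlation(x, y, window)

# Median estimate over the windows that lie entirely inside each regime.
segment_medians = [
    float(np.median(rho_t[k * segment_len:(k + 1) * segment_len - window + 1]))
    for k in range(len(true_corr))
]
```

Windows that straddle a regime boundary mix two correlation levels, which is why a step change appears as a ramp roughly one window-length wide in any sliding-window estimate.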

Figure 2: (a) Temporal variation of information coupling, $\eta(t)$, plotted as a function of time. Step changes in coupling are visible after every 8000 time steps, that is, $\rho_{TC} = 0.2$ when $t = 1{:}8000$, $\rho_{TC} = 0.4$ when $t = 8001{:}16000$, $\rho_{TC} = 0.6$ when $t = 16001{:}24000$, and $\rho_{TC} = 0.8$ when $t = 24001{:}32000$. Also plotted are the median, $[\eta(t)]_{\text{median}}$, and the 5% and 95% confidence interval contours, $[\eta(t)]^{95}_{5}$. (b) Normalised empirical pdf plots for information coupling ($\eta$), linear correlation ($\rho$), and rank correlation ($\rho_R$). The vertical lines represent the true correlation ($\rho_{TC}$) values. The relative accuracy of the information coupling measure is evident from these results.

5.2. Financial Data. We first present a simple application of the information coupling algorithm to a section of 0.5 second sampled FX spot log-returns data set (in this paper, all currencies are referred to by their standardised international three-letter codes, as described by the ISO 4217 standard. For the currencies mentioned in this paper, the three-letter codes are USD (US dollar), EUR (Euro), JPY (Japanese yen), GBP (British pound), CHF (Swiss franc), and AUD (Australian dollar)). Figure 3 shows the variation of information coupling and linear correlation with time for EURUSD and GBPUSD. The results are obtained using a 5-minute wide sliding-window. We note that the two measures of dependence frequently give different results, which reflects the inability of linear correlation to capture dependencies in non-Gaussian data streams. We also note that dependencies in FX log-returns exhibit rapidly changing dynamics, often characterised by regions of quasi-stability punctuated by abrupt changes. In [48], we show that these regions of persistence in statistical dependence may be captured using a hidden Markov ICA model, which is a hidden Markov model with an ICA observation model. The information thus obtained can be useful for a range of financial applications, such as finding regions of financial market volatility clustering and persistence [49].

Figure 3: Information coupling ($\eta_t$) and linear correlation ($\rho_t$) plotted as a function of time for a section of 0.5 second sampled EURUSD and GBPUSD log-returns data set. A 5-minute wide sliding-window is used to obtain the results.

Figure 4: Daily closing mid-prices ($P_t$) of AUDUSD and USDJPY. The two plots are adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

Figure 5: (a) Three approaches used to measure temporal dependencies between AUDUSD and USDJPY log-returns. (b) Range of confidence intervals ($\Delta[\eta]^{95}_{5} = \eta_{95} - \eta_{5}$) plotted as a function of time, showing the uncertainty associated with the information coupling measurements. The vertical lines correspond to the September 2008 financial meltdown.

Figure 6: Difference between information coupling and linear correlation, $(\eta_t - |\rho_t|)^2$, plotted as a function of time. Also plotted is a measure of non-Gaussianity of the two time series, as defined by (40). The two plots are adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

There have been numerous academic studies on the causes and effects of the 2008 financial crisis [50, 51]. However, very few of these have focused on the impact of the crisis on interdependencies in the global FX market. Here we present a set of examples which give a unique insight into the effects of the crisis on the nature of statistical dependencies in the spot FX market. Accurate estimation of dependencies at times of financial crises is of utmost importance, as these estimates are used by financial practitioners for a range of tasks, such as rebalancing portfolios, accurately pricing options, and deciding on the level of risk-taking. We first present an application of the information coupling model for detecting temporal changes in dependencies in bivariate FX data streams at times of financial crises. Figure 4 shows the adjusted daily closing mid-prices ($P_t$) for AUDUSD and USDJPY from January 2005 till April 2010. The two plots clearly show an abrupt change in the exchange rates in September-October 2008. This occurred at the height of the 2008 global financial crisis and was due to the unwinding of carry trades [52]. Figure 5(a) displays three plots showing the temporal variation of information coupling ($\eta_t$), linear correlation ($\rho_t$), and rank correlation ($\rho_{R,t}$) between AUDUSD and USDJPY log-returns. The plots are obtained using a six-month long sliding-window. We notice the rise in uncertainty of the information coupling measure (Figure 5(b)) right before the crash, with uncertainty decreasing gradually thereafter; this information may be useful to systematically predict upheavals in the market, although we do not carry out this study in detail in this paper. Information about the level of uncertainty can be used as a measure of confidence in the information coupling values and can be useful in various practical decision making scenarios, such as deciding on the capital to deploy for the purpose of trading or selecting stocks (or currencies) for inclusion in a portfolio. As daily sampled data is generally less non-Gaussian than data sampled at higher frequencies, the three plots in Figure 5(a) are somewhat similar during certain time periods. However, right after the September 2008 crash, the plots significantly deviate from each other. We believe that this is because the nature of the data, in particular its level of non-Gaussianity, has changed. As shown in Figure 6, the distance measure $(\eta_t - |\rho_t|)^2$ between information coupling and linear correlation closely matches the non-Gaussianity of the data under consideration. The two plots are adjusted for ease of comparison. The degree of non-Gaussianity is calculated using the multivariate Jarque-Bera statistic ($\mathrm{JB}_{MV}$), which we define for an $N$-dimensional multivariate data set as

$$\mathrm{JB}_{MV} = \sum_{j=1}^{N} \left[ \frac{n_s}{6} \left( \gamma_j^2 + \frac{(\kappa_j - 3)^2}{4} \right) \right]^2, \tag{40}$$

where $n_s$ is the number of data points (in this case, the size of the sliding-window), $\gamma_j$ is the skewness of the $j$th time series under analysis, and $\kappa_j$ is its kurtosis. This shows that relying solely on correlation measures to model dependencies in multivariate financial time series, even when using data sampled at relatively lower frequencies, can potentially lead to inaccurate results. In contrast, the information coupling model takes into account properties of the data being analysed, resulting in an accurate approach to measure statistical dependencies.
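A direct transcription of (40) is sketched below. The function name is ours, and the use of SciPy's default (biased) sample skewness and kurtosis is an assumption, since the text does not say which moment estimators were used. The demo compares Gaussian data against heavy-tailed Student-t data as a stand-in for financial returns.

```python
import numpy as np
from scipy.stats import kurtosis, skew


def jarque_bera_mv(data):
    """Equation (40): sum over columns of the squared univariate Jarque-Bera statistic.

    data : (n_s, N) array, one column per time series within the current window
    """
    data = np.asarray(data, dtype=float)
    n_s = data.shape[0]
    gamma = skew(data, axis=0)                     # sample skewness per column
    kappa = kurtosis(data, axis=0, fisher=False)   # Pearson kurtosis (Gaussian = 3)
    jb = n_s / 6.0 * (gamma ** 2 + (kappa - 3.0) ** 2 / 4.0)
    return float(np.sum(jb ** 2))


# Demo: heavy-tailed data scores far higher than Gaussian data of the same size.
rng = np.random.default_rng(0)
jb_gaussian = jarque_bera_mv(rng.standard_normal((5000, 2)))
jb_heavy = jarque_bera_mv(rng.standard_t(df=3, size=(5000, 2)))
```

Applied within each sliding-window, this gives a time series of non-Gaussianity that can be compared against the distance $(\eta_t - |\rho_t|)^2$, as in Figure 6.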

We now show the utility of the information coupling model for analysing multivariate statistical dependencies. Figure 7 shows the temporal variation of information coupling between four major liquid currency pairs (EURUSD, GBPUSD, USDCHF, and USDJPY). The results are obtained using daily log-returns for a seven-year period and a six-month long sliding-window. Also plotted on the same figure is the FTSE-100 (Financial Times Stock Exchange 100) index for the corresponding time period, which has been adjusted for ease of comparison. The plot clearly shows an abrupt upward shift in coupling between the four currency pairs right at the time of the September 2008 financial meltdown, with a gradual decrease in coupling over the next year. We also notice an increase in the uncertainty associated with the information coupling measure before the 2008 crash. The increase in dependence of financial instruments in times of financial crises has been observed for other asset classes as well [53]. Our unique example, showing the dynamics of multivariate dependencies within the spot FX space, provides further insight into the nature of interdependencies in times of financial crisis.

Figure 7: Information coupling between EURUSD, GBPUSD, USDCHF, and USDJPY log-returns plotted as a function of time. Also plotted is the FTSE-100 index, which is adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

6. Conclusions

We present an ICA-based approach to dynamically measure information coupling, as a proxy for mutual information. This approach makes use of ICA as a tool to capture information in the tails of the underlying distributions and is suitable for efficiently and accurately measuring statistical dependencies between multiple non-Gaussian signals. As far as we know, this is the first attempt to quantify multivariate dependencies using information encoded in the ICA unmixing matrix. Our proposed information coupling model has multiple other benefits associated with its practical use. It provides a framework for estimating confidence bounds on the information coupling metric, can be efficiently used to directly model dependencies in high-dimensional spaces, and gives normalised, symmetric results. It has the added advantage of not depending on any user-defined parameters and is not data intensive, that is, it can be used even with relatively small data sets without significantly affecting its performance, an important requirement for analysing data streams with rapidly changing dynamics, such as financial returns. The model makes use of a sliding-window based decorrelating manifold approach to ICA, with a reciprocal cosh source model, to infer the ICA unmixing matrix, which results in increased accuracy and efficiency of the algorithm. This gives the information coupling model a computational complexity similar to that of linear correlation, with the accuracy of mutual information.

Acknowledgments

The authors are grateful to the Oxford-Man Institute of Quantitative Finance for help in supporting this research. The first author would also like to thank Exeter College (University of Oxford) for funding.

References

[1] D Berg and K Aas ldquoModels for construction of multivariatedependencemdasha comparison studyrdquo The European Journal ofFinance vol 15 no 7-8 pp 639ndash659 2009

[2] M M Dacorogna An Introduction to High-Frequency FinanceAcademic Press New York NY USA 2001

[3] J Hlinka M Palus M Vejmelka D Mantini and M CorbettaldquoFunctional connectivity in resting-state fMRI is linear correla-tion sufficientrdquo NeuroImage vol 54 no 3 pp 2218ndash2225 2011

[4] S J Devlin R Gnanadesikan and J R Kettenring ldquoRobustestimation and outlier detection with correlation coefficientsrdquoBiometrika vol 62 no 3 pp 531ndash545 1975

[5] A R Cowan ldquoNonparametric event study testsrdquo Review ofQuantitative Finance and Accounting vol 2 no 4 pp 343ndash3581992

[6] G J Glasser and R F Winter ldquoCritical values of the coefficientof rank correlation for testing the hypothesis of independencerdquoBiometrika vol 48 no 3-4 pp 444ndash448 1961

[7] P Embrechts F Lindskog and A McNeil ldquoModelling depen-dence with copulas and applications to risk managementrdquo inHandbook of Heavy Tailed Distributions in Finance vol 1chapter 8 pp 329ndash384 2003

[8] E Jondeau S H Poon and M Rockinger Financial ModelingUnder Non-Gaussian Distributions Springer New York NYUSA 2007

[9] J D Fermanian and O Scaillet ldquoSome statistical pitfalls in cop-ula modeling for financial applicationsrdquo in Capital FormationGovernance and Banking p 59 2005

[10] T M Cover and J A Thomas Elements of Information TheoryWiley-Interscience New York NY USA 2006

[11] A Kraskov H Stogbauer and P Grassberger ldquoEstimatingmutual informationrdquo Physical Review E vol 69 no 6 ArticleID 066138 16 pages 2004

[12] S Khan S Bandyopadhyay and A R Ganguly ldquoComputingmutual information based nonlinear dependence among noisyand finite geophysical time seriesrdquo in Proceedings of the Ameri-can Geophysical Union Fall Meeting 2005 abstract no NG22A-03

[13] M M V Hulle ldquoEdgeworth approximation of multivariatedifferential entropyrdquo Neural Computation vol 17 no 9 pp1903ndash1910 2005

ISRN Signal Processing 13

[14] L. B. Almeida, “Linear and nonlinear ICA based on mutual information,” in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC ’00), pp. 117–122, IEEE, 2000.

[15] A. D. Back and A. S. Weigend, “A first application of independent component analysis to extracting structure from stock returns,” International Journal of Neural Systems, vol. 8, no. 4, pp. 473–484, 1997.

[16] E. Oja, K. Kiviluoto, and S. Malaroiu, “Independent component analysis for financial time series,” in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC ’00), pp. 111–116, IEEE, 2000.

[17] A. Hyvärinen, “Survey on independent component analysis,” Neural Computing Surveys, vol. 2, no. 4, pp. 94–128, 1999.

[18] C. J. Lu, T. S. Lee, and C. C. Chiu, “Financial time series forecasting using independent component analysis and support vector regression,” Decision Support Systems, vol. 47, no. 2, pp. 115–125, 2009.

[19] H. Joe, “Relative entropy measures of multivariate dependence,” Journal of the American Statistical Association, vol. 84, no. 405, pp. 157–164, 1989.

[20] M. Novey and T. Adali, “Complex ICA by negentropy maximization,” IEEE Transactions on Neural Networks, vol. 19, no. 4, pp. 596–609, 2008.

[21] S. Roberts and R. Everson, Independent Component Analysis: Principles and Practice, Cambridge University Press, New York, NY, USA, 2001.

[22] J. Palmer, K. Kreutz-Delgado, and S. Makeig, “Super-Gaussian mixture source model for ICA,” in Independent Component Analysis and Blind Signal Separation, pp. 854–861, 2006.

[23] R. A. Choudrey and S. J. Roberts, “Variational mixture of Bayesian independent component analyzers,” Neural Computation, vol. 15, no. 1, pp. 213–252, 2003.

[24] R. Everson and S. Roberts, “Independent component analysis: a flexible nonlinearity and decorrelating manifold approach,” Neural Computation, vol. 11, no. 8, pp. 1957–1983, 1999.

[25] R. E. Kass, L. Tierney, and J. B. Kadane, “Laplace’s method in Bayesian analysis,” in Statistical Multiple Integration: Proceedings of the AMS-IMS-SIAM Joint Summer Research Conference [on Statistical Multiple Integration] Held at Humboldt University, June 17–23, 1989, vol. 115, p. 89, American Mathematical Society, 1991.

[26] W. Addison and S. Roberts, “Blind source separation with non-stationary mixing using wavelets,” in Proceedings of the ICA Research Network Workshop, The University of Liverpool, 2006.

[27] N. J. Higham, “Matrix nearness problems and applications,” in Applications of Matrix Theory, 1989.

[28] K. B. Datta, Matrix and Linear Algebra, PHI Learning, 2004.

[29] K. Boudt, J. Cornelissen, and C. Croux, “The Gaussian rank correlation estimator: robustness properties,” Statistics and Computing, vol. 22, no. 2, pp. 471–483, 2012.

[30] D. Evans, “A computationally efficient estimator for mutual information,” Proceedings of the Royal Society A, vol. 464, no. 2093, pp. 1203–1215, 2008.

[31] K. Torkkola, “On feature extraction by mutual information maximization,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’02), vol. 1, pp. I821–I824, IEEE, May 2002.

[32] K. Nagarajan, B. Holland, C. Slatton, and A. D. George, “Scalable and portable architecture for probability density function estimation on FPGAs,” in Proceedings of the 16th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM ’08), pp. 302–303, IEEE Computer Society, April 2008.

[33] C. G. Bowsher, “Modelling security market events in continuous time: intensity based, multivariate point process models,” Journal of Econometrics, vol. 141, no. 2, pp. 876–912, 2007.

[34] B. Peiers, “Informed traders, intervention, and price leadership: a deeper view of the microstructure of the foreign exchange market,” Journal of Finance, vol. 52, no. 4, pp. 1589–1614, 1997.

[35] A. Dionisio, R. Menezes, and D. A. Mendes, “Mutual information: a measure of dependency for nonlinear time series,” Physica A, vol. 344, no. 1-2, pp. 326–329, 2004.

[36] J. Y. Campbell, A. W. Lo, A. C. MacKinlay, and R. F. Whitelaw, “The econometrics of financial markets,” Macroeconomic Dynamics, vol. 2, no. 4, pp. 559–562, 1998.

[37] G. Koutmos, C. Negakis, and P. Theodossiou, “Stochastic behaviour of the Athens stock exchange,” Applied Financial Economics, vol. 3, no. 2, pp. 119–126, 1993.

[38] D. M. Guillaume, M. M. Dacorogna, R. R. Dave, U. A. Müller, R. B. Olsen, and O. V. Pictet, “From the bird’s eye to the microscope: a survey of new stylized facts of the intra-daily foreign exchange markets,” Finance and Stochastics, vol. 1, no. 2, pp. 95–129, 1997.

[39] R. Cheng, “Using Pearson type IV and other Cinderella distributions in simulation,” in Proceedings of the Winter Simulation Conference (WSC ’11), pp. 457–468, IEEE, 2011.

[40] Y. Nagahara, “The PDF and CF of Pearson type IV distributions and the ML estimation of the parameters,” Statistics and Probability Letters, vol. 43, no. 3, pp. 251–264, 1999.

[41] A. Shephard, Pearson IV, Institute for Fiscal Studies, University College London, 2008.

[42] S. Stavroyiannis, I. Makris, V. Nikolaidis, and L. Zarangas, “Econometric modeling and value-at-risk using the Pearson type IV distribution,” International Review of Financial Analysis, vol. 22, pp. 10–17, 2012.

[43] M. Kendall, A. Stuart, and J. K. Ord, Kendall’s Advanced Theory of Statistics, Charles Griffin, 1987.

[44] R. Willink, “A closed-form expression for the Pearson type IV distribution function,” Australian and New Zealand Journal of Statistics, vol. 50, no. 2, pp. 199–205, 2008.

[45] A. Venelli, “Efficient entropy estimation for mutual information analysis using B-splines,” in Information Security Theory and Practices. Security and Privacy of Pervasive Systems and Smart Devices, pp. 17–30, 2010.

[46] C. G. Park and D. W. Shin, “An algorithm for generating correlated random variables in a class of infinitely divisible distributions,” Journal of Statistical Computation and Simulation, vol. 61, no. 1-2, pp. 127–139, 1998.

[47] R. L. Iman and W. J. Conover, “A distribution-free approach to inducing rank correlation among input variables,” Communications in Statistics - Simulation and Computation, vol. 11, no. 3, pp. 311–334, 1982.

[48] N. Shah and S. Roberts, “Hidden Markov independent component analysis as a measure of coupling in multivariate financial time series,” in Proceedings of the ICA Research Network International Workshop, Liverpool, UK, 2008.

[49] A. Rossi and G. M. Gallo, “Volatility estimation via hidden Markov models,” Journal of Empirical Finance, vol. 13, no. 2, pp. 203–230, 2006.

[50] J. Crotty, “Structural causes of the global financial crisis: a critical assessment of the ‘new financial architecture’,” Cambridge Journal of Economics, vol. 33, no. 4, pp. 563–580, 2009.


[51] J. A. Murphy, “An analysis of the financial crisis of 2008: causes and solutions,” Social Science Research Network, 2008.

[52] M. Pojarliev and R. M. Levich, “Detecting crowded trades in currency funds,” Financial Analysts Journal, vol. 67, no. 1, pp. 26–39, 2011.

[53] L. Sandoval and I. D. P. Franca, “Correlation of financial markets in times of crisis,” Physica A, vol. 391, no. 1, pp. 187–208, 2012.


methods and hence become computationally complex [8]. Copula-based methods suffer from other major limitations as well, namely, the difficulties in accurate estimation of the copula functions, the empirical choice of the type of copulas, and problems in the design and use of time-dependent copulas [9]. Mutual information, the canonical measure of statistical dependence, is also used in practice. However, accurate computation of mutual information using finite data sets can be computationally complex (as we discuss later). In this paper, we present a computationally efficient independent component analysis (ICA) based approach to dynamically measure information coupling in multivariate non-Gaussian data streams as a proxy measure for mutual information.

The paper is organised as follows. We first discuss the need for developing an ICA-based information coupling measure and present the theoretical framework underlying the development of our approach. We then present a brief introduction to the principles of ICA and discuss our method of choice for accurately and efficiently inferring the ICA unmixing matrix in a dynamic environment. We then proceed to present the ICA-based information coupling metric and describe its properties. Finally, we present a set of synthetic and financial data examples which make use of the information coupling measure to estimate dependencies in non-Gaussian signals.

2. Measuring Dependencies Using ICA: A Conceptual Overview

Mutual information is the canonical measure of statistical dependence in multivariate systems [10]. It is a quantitative measurement of how much information the observation of one variable gives us regarding another variable. Whilst the computation of mutual information is conceptually straightforward when the full probability density functions (pdf) of the variables under consideration are available, it is often difficult to accurately estimate mutual information directly using finite data sets. This is especially true in high-dimensional spaces, in which computation of mutual information requires the estimation of multivariate joint distributions, a process which is unreliable (being exquisitely sensitive to the joint pdf over the variables of interest) as well as computationally expensive [11]. Existing approaches to compute mutual information include methods that are based on ranking of variables, kernel density estimation, k-nearest neighbours, and the Edgeworth approximation of differential entropy [12, 13]. However, for most finite data sets, none of these techniques, all of which impose a trade-off between computational complexity and accuracy, outperforms the other methods, and all these approaches can be extremely sensitive to the presence of noise [12]. The accuracy of these approaches is also highly sensitive to the choice of the model parameters, such as the number of kernels or neighbours. Therefore, in most practical cases, the direct use of mutual information is not feasible. However, as we discuss below, it is possible to make use of information encoded in the ICA unmixing matrix to calculate information coupling as a proxy measure for mutual information.

Let us first take a look at the conceptual basis on which we can use ICA as a tool for measuring statistical dependencies. According to its classical definition, ICA estimates an unmixing matrix such that the mutual information between the independent source signals is minimised [14]. Hence, we can consider the unmixing matrix to contain information about the degree of mutual information between the observed signals. Although the direct computation of mutual information can be very expensive, alternative efficient approaches to ICA, which do not involve direct computation of mutual information, exist. Hence, it is possible to indirectly obtain an estimate for mutual information by using the ICA-based information coupling measure as a proxy.

Now let us consider some properties of financial returns which make them well suited to be analysed using ICA (as this paper is focused on financial applications, we consider the case of measuring dependencies in financial data; however, similar ideas can be applied to most real-world systems which give rise to non-Gaussian data). Financial markets are influenced by many independent factors, all of which have some finite effect on any specific financial time series. These factors can include, among others, news releases, price trends, macroeconomic indicators, and order flows. We hypothesise that the observed multivariate financial data may hence be generated as a result of a linear combination of some hidden (latent) variables [15, 16]. This process can be quantitatively described by using a linear generative model, such as principal component analysis (PCA), factor analysis (FA), or ICA. As financial returns have non-Gaussian distributions with heavy tails, PCA and FA are not suitable for modelling multivariate financial data, as both these second-order approaches are based on the assumption of Gaussianity [17]. ICA, in contrast, takes into account the non-Gaussian nature of the data being analysed by making use of higher-order statistics. ICA has proven applicability for multivariate financial data analysis; some interesting applications are presented in [15, 16, 18]. These and other similar studies make use of ICA primarily to extract the underlying latent source signals. However, all relevant information about the source mixing process is contained in the ICA unmixing matrix, which hence encodes dependencies. Therefore, in our analysis, we only make use of the ICA unmixing matrix (without extracting the independent components) to measure information coupling. The ICA-based information coupling model we present in this paper can be used to directly measure statistical dependencies in high-dimensional spaces. This makes it particularly attractive for a range of practical applications in which relying solely on pair-wise analysis of dependencies is not feasible (there is surprisingly little work done towards addressing the important issue of estimating the dependency structure in high-dimensional multivariate systems, although there has been interest in this field for a long time [19]. High-dimensional analysis of information coupling has various important applications in the financial sector, including active portfolio management, multivariate financial risk analysis, statistical arbitrage, and pricing and hedging of various instruments [9]).


3. Independent Components, Unmixing, and Non-Gaussianity

Mixing two or more unique signals into a set of mixed observations results in an increase in the dependency of the pdfs of the mixed signals. The marginal pdfs of the observed mixed signals become more Gaussian due to the central limit theorem [20]. The mixing process also results in a reduction in the independence of the mixed signal distribution and hence an increase in the mutual information associated with it. Moreover, there is a rise in the stationarity of the mixed signals, which have flatter spectra as compared to the original sources [21]. Consider a set of $N$ observed signals $\mathbf{x}(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^\top$ at the time instant $t$, which are a mixture of $M$ source signals $\mathbf{s}(t) = [s_1(t), s_2(t), \ldots, s_M(t)]^\top$, mixed linearly using a mixing matrix $\mathbf{A}$, with observation noise $\mathbf{n}(t)$, as per

$$\mathbf{x}(t) = \mathbf{A}\mathbf{s}(t) + \mathbf{n}(t). \tag{1}$$

Independent component analysis (ICA) attempts to find an unmixing matrix $\mathbf{W}$ such that the $M$ recovered source signals $\mathbf{b}(t) = [b_1(t), b_2(t), \ldots, b_M(t)]^\top$ are given by

$$\mathbf{b}(t) = \mathbf{W}\left(\mathbf{x}(t) - \mathbf{n}(t)\right). \tag{2}$$

For the case where observation noise $\mathbf{n}(t)$ is assumed to be normally distributed with a mean of zero, the least squares expected value of the recovered source signals is given by

$$\mathbf{b}(t) = \mathbf{W}\mathbf{x}(t), \tag{3}$$

where $\mathbf{W}$ is the pseudo-inverse of $\mathbf{A}$, that is,

$$\mathbf{W} = \mathbf{A}^{+} = (\mathbf{A}^\top\mathbf{A})^{-1}\mathbf{A}^\top. \tag{4}$$

In the case of square mixing, $\mathbf{W} = \mathbf{A}^{-1}$.

Most ICA approaches make implicit or explicit assumptions regarding the parametric model of the pdfs of the independent sources [21]; for example, Gaussian mixture distributions are used as source models in [22, 23], while [24] makes use of a flexible source density model given by the generalised exponential distribution. In our analysis, we use a reciprocal cosh source model as a canonical heavy-tailed distribution, namely [21],

$$p(s_i) = \frac{1}{Z_0 \cosh(s_i)}, \tag{5}$$

where $s_i$ is the $i$th source and $Z_0$ is a normalising constant. This analytical fixed source model has no adjustable parameters; therefore, it has considerable computational advantages over alternative source models. Also, as this source model is heavier in the tails than a Gaussian, it is able to accurately model heavy-tailed unimodal non-Gaussian distributions such as financial returns.
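The generative model of (1) to (5) can be illustrated with a short numerical sketch. The code below is not the authors' implementation; it assumes noise-free mixing, a randomly drawn mixing matrix, and uses the fact that $\int 1/\cosh(s)\,ds = \pi$, so that $Z_0 = \pi$. The function names and the dimensions are chosen purely for illustration.

```python
import numpy as np

def sample_reciprocal_cosh(rng, size):
    """Draw samples from p(s) = 1 / (pi * cosh(s)) by inverting its CDF."""
    u = 1.0 - rng.random(size)            # uniform on (0, 1], avoids log(0)
    # The CDF is F(s) = (2 / pi) * arctan(exp(s)), so s = log(tan(pi * u / 2)).
    return np.log(np.tan(0.5 * np.pi * u))

def log_reciprocal_cosh(s):
    """Log density of the reciprocal cosh source model, eq. (5), with Z0 = pi."""
    # logaddexp(s, -s) = log(2 cosh(s)), evaluated without overflow
    return -np.log(np.pi) - (np.logaddexp(s, -s) - np.log(2.0))

rng = np.random.default_rng(0)
N, M, T = 4, 3, 2000                      # observed signals, sources, samples (illustrative)
S = sample_reciprocal_cosh(rng, (M, T))   # latent heavy-tailed sources s(t)
A = rng.normal(size=(N, M))               # hypothetical mixing matrix
X = A @ S                                 # eq. (1) with the noise term set to zero

W = np.linalg.pinv(A)                     # eq. (4): W = (A^T A)^{-1} A^T
B = W @ X                                 # eq. (3): recovered sources
```

In this idealised setting the mixing matrix is known, so `B` reproduces `S` up to rounding; in practice $\mathbf{A}$ is unknown and $\mathbf{W}$ has to be inferred from the observations alone.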

3.1. Inference. For our analysis, we make use of the icadec algorithm [21, 24] to infer the unmixing matrix. This algorithm constrains the unmixing matrix to the manifold of decorrelating matrices, thus offering rapid computation. The algorithm also gives accurate results compared to other related ICA approaches [24] and allows us to obtain a confidence measure for the unmixing matrix. Here, we present a brief overview of this algorithm; an in-depth description is presented in [24].

The independent source signals obtained using ICA, $\mathbf{b}(t)$, must be at least linearly decorrelated for them to be classed as independent. The icadec algorithm makes use of this property of the independent components to efficiently infer the unmixing matrix. For a set of observed signals $\mathbf{X}$, where $\mathbf{X} = [\mathbf{x}(t)]_{t=1}^{t=T}$, the set of recovered independent components $\mathbf{B} = [\mathbf{b}(t)]_{t=1}^{t=T}$ is given by $\mathbf{B} = \mathbf{W}\mathbf{X}$. The independent components are linearly decorrelated if

$$\mathbf{B}\mathbf{B}^\top = \mathbf{W}\mathbf{X}\mathbf{X}^\top\mathbf{W}^\top = \mathbf{D}^2, \tag{6}$$

where $\mathbf{D}$ is a diagonal matrix of scaling factors. The singular value decomposition of the set of observed signals is given by

$$\mathbf{X} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^\top, \tag{7}$$

where $\mathbf{U}$ and $\mathbf{V}$ are orthogonal matrices, with the columns of $\mathbf{U}$ being the principal components of $\mathbf{X}$, and $\boldsymbol{\Sigma}$ is a diagonal matrix of the singular values of $\mathbf{X}$. It can be shown that the decorrelating matrix $\mathbf{W}$ can then be written as [24]

$$\mathbf{W} = \mathbf{D}\mathbf{Q}\boldsymbol{\Sigma}^{-1}\mathbf{U}^\top, \tag{8}$$

where $\mathbf{Q}$ is a real orthogonal matrix. To obtain an estimate for the ICA unmixing matrix, we need to optimise a given contrast function (we use the log-likelihood of the data, as described later). There are a variety of optimisation approaches which can be used; our approach of choice is the Broyden-Fletcher-Goldfarb-Shanno (BFGS) quasi-Newton method, which gives the best estimate of the minimum of the negative log-likelihood in a computationally efficient manner and also provides us with an estimate for the Hessian matrix. However, parameterising the optimisation problem directly by the elements of $\mathbf{Q}$ makes it a constrained minimisation problem, for which BFGS is not applicable. Therefore, to convert it into an unconstrained minimisation problem, we constrain $\mathbf{Q}$ to be orthonormal by parameterising its elements as the matrix exponential of a skew-symmetric matrix $\mathbf{J}$ (nonzero elements of this matrix are known as the Cayley coordinates), whose above-diagonal elements parameterise $\mathbf{Q}$ [21]:

$$\mathbf{Q} = \exp(\mathbf{J}). \tag{9}$$

(For $M$ sources and $N$ observed signals, the ICA unmixing matrix may be optimised in the $\frac{1}{2}M(M+1)$ dimensional space of decorrelating matrices rather than in the full $MN$ dimensional space, as $\frac{1}{2}M(M-1)$ and $M$ parameters are required to specify $\mathbf{Q}$ and $\mathbf{D}$, respectively. This feature offers considerable computational benefits, especially in high-dimensional spaces, and the resulting matrix hence obtained is guaranteed to be decorrelating [24].)

Using this parameterisation makes it possible to apply BFGS to any contrast function; the contrast function used as part of the icadec algorithm is an expression for the log-likelihood of the data, as described below.
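Before turning to the likelihood, the decorrelating parameterisation of (6) to (9) can be checked numerically. The following is a minimal sketch rather than the icadec code: the data, the free parameters `theta` of $\mathbf{J}$, and the diagonal `d` of $\mathbf{D}$ are arbitrary illustrative values, and the matrix exponential is computed through an eigendecomposition that is valid only because $\mathbf{J}$ is skew-symmetric.

```python
import numpy as np

def skew_from_params(theta, M):
    """Build a skew-symmetric M x M matrix J from its M(M-1)/2 above-diagonal entries."""
    J = np.zeros((M, M))
    J[np.triu_indices(M, k=1)] = theta
    return J - J.T

def expm_skew(J):
    """Matrix exponential of a real skew-symmetric matrix.

    1j * J is Hermitian, so J = V diag(-1j * lam) V^H with unitary V, and
    exp(J) = V diag(exp(-1j * lam)) V^H, which is real and orthogonal.
    """
    lam, V = np.linalg.eigh(1j * J)
    return ((V * np.exp(-1j * lam)) @ V.conj().T).real

rng = np.random.default_rng(1)
M, T = 3, 500
X = rng.laplace(size=(M, M)) @ rng.laplace(size=(M, T))  # toy observed signals

U, sigma, Vt = np.linalg.svd(X, full_matrices=False)     # eq. (7)
theta = rng.normal(size=M * (M - 1) // 2)                # free parameters of Q (arbitrary)
d = rng.uniform(0.5, 2.0, size=M)                        # diagonal of D (arbitrary)
Q = expm_skew(skew_from_params(theta, M))                # eq. (9)
W = np.diag(d) @ Q @ np.diag(1.0 / sigma) @ U.T          # eq. (8)

B = W @ X
C = B @ B.T                                              # diagonal and equal to D^2, eq. (6)
```

Any choice of `theta` and `d` yields a decorrelating `W`, which is why an unconstrained optimiser can move freely in these $\frac{1}{2}M(M+1)$ coordinates.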

Using (1) and assuming that the observation noise ($\mathbf{n}$) is normally distributed with a mean of zero and an isotropic covariance matrix with precision $\beta$, the distribution of the observations (as a preprocessing step, we normalise each observed signal to have a mean of zero and unit variance), conditioned on $\mathbf{A}$ and $\mathbf{s}$ (where we drop the time index $t$ for ease of presentation), is given by

$$p(\mathbf{x} \mid \mathbf{A}, \mathbf{s}) = \mathcal{N}(\mathbf{x}; \mathbf{A}\mathbf{s}, \beta^{-1}\mathbf{I}), \tag{10}$$

where $\mathbf{A}\mathbf{s}$ is the mean of the normal distribution and $\beta^{-1}\mathbf{I}$ is its covariance. The likelihood of an observation occurring is given by

$$p(\mathbf{x} \mid \mathbf{A}) = \int p(\mathbf{x} \mid \mathbf{A}, \mathbf{s})\, p(\mathbf{s})\, d\mathbf{s}. \tag{11}$$

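Assuming that the distribution over sources has a single dominant peak, in this case given by the maximum likelihood source estimates $\hat{\mathbf{s}} = (\mathbf{A}^\top\mathbf{A})^{-1}\mathbf{A}^\top\mathbf{x}$, the integral in (11) can be analysed by using a simplified (computationally efficient) variant of Laplace’s method, as shown in [21, 25]:

$$p(\mathbf{x} \mid \mathbf{A}) = \int p(\mathbf{x} \mid \mathbf{A}, \mathbf{s})\, p(\mathbf{s})\, d\mathbf{s} \approx p(\mathbf{x} \mid \mathbf{A}, \hat{\mathbf{s}})\, p(\hat{\mathbf{s}})\, (2\pi)^{M/2} \det(\mathbf{G})^{-1/2}, \tag{12}$$

where $\mathbf{G}$ is the Hessian matrix

$$\mathbf{G} = -\left[\frac{\partial^2 \log p(\mathbf{x} \mid \mathbf{A}, \mathbf{s})}{\partial s_i\, \partial s_j}\right]_{\mathbf{s} = \hat{\mathbf{s}}}. \tag{13}$$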

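Taking the log of the expanded form of (10) gives

$$\log p(\mathbf{x} \mid \mathbf{A}, \mathbf{s}) = \frac{N}{2}\log\left(\frac{\beta}{2\pi}\right) - \frac{\beta}{2}(\mathbf{x} - \mathbf{A}\mathbf{s})^\top(\mathbf{x} - \mathbf{A}\mathbf{s}), \tag{14}$$

which, via (13), results in $\mathbf{G} = \beta\mathbf{A}^\top\mathbf{A}$. The log-likelihood $\ell \equiv \log p(\mathbf{x} \mid \mathbf{A})$ is therefore

$$\ell = \frac{N - M}{2}\log\left(\frac{\beta}{2\pi}\right) - \frac{\beta}{2}(\mathbf{x} - \mathbf{A}\hat{\mathbf{s}})^\top(\mathbf{x} - \mathbf{A}\hat{\mathbf{s}}) + \log p(\hat{\mathbf{s}}) - \frac{1}{2}\log\det(\mathbf{A}^\top\mathbf{A}). \tag{15}$$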

By using (8), we obtain $\mathbf{A} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{Q}^\top\mathbf{D}^{-1}$. Hence, the log-likelihood becomes [21]

$$\ell = \frac{N - M}{2}\log\left(\frac{\beta}{2\pi e}\right) + \log p(\hat{\mathbf{s}}) + \log\det(\boldsymbol{\Sigma}^{-1}\mathbf{D}), \tag{16}$$

where $e$ is the average reconstruction error. Noting that we use a reciprocal cosh source model (as given by (5)), it can be shown that taking the derivative of this log-likelihood expression with respect to $\mathbf{D}$ and $\mathbf{J}$ (which parameterises $\mathbf{Q}$), and following the resulting likelihood gradient using a BFGS optimiser, makes it possible to efficiently compute an optimum value for the ICA unmixing matrix; details of this procedure are presented in [24].

3.2. Dynamic Mixing. The standard (offline) ICA model uses all available data samples at times $t = 1, 2, \ldots, T$ of the observed signals $\mathbf{x}(t)$ to estimate a single static unmixing matrix $\mathbf{W}$. The unmixing matrix obtained provides a good estimate of the mixing process for the complete time series and is well suited for offline data analysis. However, many time series, such as financial data streams, are highly dynamic in nature, with rapidly changing properties, and therefore require a source separation method that can be used in a sequential manner. This issue is addressed here by using a sliding-window ICA model [26]. This model makes use of a sliding-window approach to sequentially update the current unmixing matrix using information contained in the previous window and can easily handle nonstationary data. The unmixing matrix for the current window, $\mathbf{W}(t)$, is used as a prior for computing the unmixing matrix for the next window, $\mathbf{W}(t+1)$. This results in significant computational efficiency, as fewer iterations are required to obtain an optimum value for $\mathbf{W}(t+1)$. The algorithm also results in an improvement in the source separation results obtained when the mixing process is drifting and addresses the ICA permutation and sign ambiguity issues [21] by maintaining a fixed (but of course arbitrary) ordering of recovered sources through time.
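The warm-start structure of the sliding-window scheme can be sketched as a simple loop. This is only an outline of the control flow under stated assumptions: the `fit` argument is a hypothetical placeholder for any routine that refines a starting unmixing matrix on one window of data (for example, a few BFGS iterations on the likelihood), and the window length and step are illustrative parameters that do not come from the paper.

```python
import numpy as np

def sliding_window_unmixing(X, window, step, fit, W_init):
    """Yield (end_index, W) for successive windows of X (one signal per row).

    `fit(X_window, W_start)` is a placeholder for an ICA routine that
    refines the starting matrix W_start on the data in X_window.
    """
    W = W_init
    for end in range(window, X.shape[1] + 1, step):
        # The estimate from the previous window seeds the current one.
        W = fit(X[:, end - window:end], W)
        yield end, W

def no_op_fit(X_window, W_start):
    """Stand-in estimator that returns its starting point unchanged (not ICA)."""
    return W_start

rng = np.random.default_rng(3)
X = rng.laplace(size=(3, 1000))                      # toy data stream
estimates = list(sliding_window_unmixing(X, window=200, step=100,
                                         fit=no_op_fit, W_init=np.eye(3)))
```

Because each call starts from the previous solution, a real estimator typically needs fewer iterations per window, and the row ordering and signs of the unmixing matrix tend to stay fixed from one window to the next.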

4. Information Coupling

We now proceed to derive the ICA-based information coupling metric. Later in this section, we discuss the practical advantages this metric offers when used to analyse multivariate financial time series.

4.1. Coupling Metric. Let $\mathbf{W}$ be any arbitrary square ICA unmixing matrix (for the purpose of brevity and clarity, we only consider the case of square mixing while deriving the metric here. However, the metric derived in this section is valid for nonsquare mixing as well, and the corresponding derivation can be undertaken using a similar approach as presented here, but converting the nonsquare ICA unmixing matrices in each instance into square matrices by padding them with zeros):

$$\mathbf{W} \in \mathbb{R}^{N \times N}. \tag{17}$$

Since multiplication of $\mathbf{W}$ by a diagonal matrix does not affect the mutual information of the recovered sources, we row normalise the unmixing matrix in order to address the ICA scale indeterminacy problem (most ICA algorithms, including icadec, suffer from the scale indeterminacy problem, that is, the variances of the independent components cannot be determined; this is because both the unmixing matrix and the source signals are unknown, and any scalar multiplication on either will be lost in the mixing process). Row normalisation implies that the elements $w_{ij}$ of the unmixing matrix $\mathbf{W}$ are constrained such that each row of the matrix is of unit length, that is,

$$\sum_{j=1}^{N} w_{ij}^2 = 1 \tag{18}$$

for all rows $i$. For a set of observed signals to be completely decoupled, their latent independent components must be the same as the observed signals; therefore, the row-normalised unmixing matrix for decoupled signals ($\mathbf{W}_0$) must be a permutation of the identity matrix ($\mathbf{I}$):

$$\mathbf{W}_0 = \mathbf{P}\mathbf{I} \in \mathbb{R}^{N \times N}, \tag{19}$$

where $\mathbf{P}$ is a permutation matrix. For the case where the observed signals are completely coupled, all the latent independent components must be the same; therefore, the row-normalised unmixing matrix for completely coupled signals ($\mathbf{W}_1$) is given by

$$\mathbf{W}_1 = \frac{1}{\sqrt{N}}\mathbf{K} \in \mathbb{R}^{N \times N}, \tag{20}$$

where $\mathbf{K}$ is the unit matrix (a matrix of ones).

To calculate coupling, we need to consider the distance between any arbitrary unmixing matrix ($\mathbf{W}$) and the zero coupling matrix ($\mathbf{W}_0$). The distance measure we use is the generalised 2-norm, also called the spectral norm, of the difference between the two matrices [27], although we can use some other norms as well to get similar results. The spectral norm of a matrix corresponds to its largest singular value and is the matrix equivalent of the vector Euclidean norm. Hence, the distance $d(\mathbf{W}, \mathbf{W}_0)$ between the two matrices can be written as

$$d(\mathbf{W}, \mathbf{W}_0) = \left\|\mathbf{W} - \mathbf{W}_0\right\|_2, \tag{21}$$

where $\|\cdot\|_2$ is the spectral norm of the matrix. As $\mathbf{W}_0$ is a permutation of the identity matrix,

$$d(\mathbf{W}, \mathbf{W}_0) = \left\|\mathbf{W} - \mathbf{P}\mathbf{I}\right\|_2. \tag{22}$$

As the spectral norm of a matrix is independent of its permutations, we may define another permutation matrix ($\hat{\mathbf{P}}$) such that

$$d(\mathbf{W}, \mathbf{W}_0) = \left\|\hat{\mathbf{P}}\mathbf{W} - \mathbf{I}\right\|_2. \tag{23}$$

For this equation, the following equality holds:

$$d(\mathbf{W}, \mathbf{W}_0) = \left\|\hat{\mathbf{P}}\mathbf{W}\right\|_2 - 1. \tag{24}$$

Again, noting that $\|\hat{\mathbf{P}}\mathbf{W}\|_2 = \|\mathbf{W}\|_2$, we have

$$d(\mathbf{W}, \mathbf{W}_0) = \left\|\mathbf{W}\right\|_2 - 1. \tag{25}$$

We normalise this measure with respect to the range over which the distance measure can vary, that is, the distance between matrices representing completely coupled ($\mathbf{W}_1$) and decoupled ($\mathbf{W}_0$) signals. From (20), we have that $\mathbf{W}_1 = (1/\sqrt{N})\mathbf{K}$; therefore,

$$d(\mathbf{W}_1, \mathbf{W}_0) = \left\|\mathbf{W}_1 - \mathbf{W}_0\right\|_2 = \left\|\frac{1}{\sqrt{N}}\mathbf{K} - \mathbf{P}\mathbf{I}\right\|_2. \tag{26}$$

Using the same analysis as presented previously, this equation can be simplified to

$$d(\mathbf{W}_1, \mathbf{W}_0) = \left\|\frac{1}{\sqrt{N}}\mathbf{K}\right\|_2 - 1. \tag{27}$$

For an $N$-dimensional square unit matrix, the spectral norm is given by $\|\mathbf{K}\|_2 = N$. Therefore, for a row-normalised unit matrix, the spectral norm is $\|(1/\sqrt{N})\mathbf{K}\|_2 = \sqrt{N}$. Hence, if $\mathbf{W}$ is row-normalised, (27) can be written as

$$d(\mathbf{W}_1, \mathbf{W}_0) = \sqrt{N} - 1. \tag{28}$$

The normalised information coupling metric ($\eta$) is then defined as

$$\eta = \frac{d(\mathbf{W}, \mathbf{W}_0)}{d(\mathbf{W}_1, \mathbf{W}_0)}. \tag{29}$$

Substituting (25) and (28) into (29), the normalised information coupling between $N$ observed signals is given by

$$\eta = \frac{\left\|\mathbf{W}\right\|_2 - 1}{\sqrt{N} - 1}. \tag{30}$$
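Equation (30) translates directly into a few lines of code. The sketch below is an illustration rather than the authors' implementation; the function name is hypothetical, it assumes a square unmixing matrix with $N \geq 2$, and it applies the row normalisation of (18) before taking the spectral norm.

```python
import numpy as np

def information_coupling(W):
    """Information coupling eta of eq. (30) for a square unmixing matrix W (N >= 2)."""
    W = np.asarray(W, dtype=float)
    N = W.shape[0]
    # Row-normalise so that each row has unit length, eq. (18)
    W = W / np.linalg.norm(W, axis=1, keepdims=True)
    spectral_norm = np.linalg.norm(W, 2)   # largest singular value of W
    return (spectral_norm - 1.0) / (np.sqrt(N) - 1.0)

# The two limiting cases of eqs. (19) and (20) for N = 4
eta_decoupled = information_coupling(np.eye(4)[[2, 0, 3, 1]])  # permuted identity, W0
eta_coupled = information_coupling(np.ones((4, 4)))            # matrix of ones, W1
```

The permuted identity gives a coupling of zero and the matrix of ones gives a coupling of one, matching the two reference matrices used to normalise the metric.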

We can consider the bounds of $\eta$ as described below. Suppose $\mathbf{M}$ is an arbitrary real matrix. We can look upon the spectral norm of this matrix ($\|\mathbf{M}\|_2 = \|\mathbf{M} - \mathbf{0}\|_2$) as a measure of departure (distance) of $\mathbf{M}$ from a null matrix ($\mathbf{0}$) [28]. The bounds on this norm are given by

$$0 \le \left\|\mathbf{M}\right\|_2 \le \infty. \tag{31}$$

If $\mathbf{M}$ is a row-normalised ICA unmixing matrix ($\mathbf{W}$), then (as discussed earlier) it lies between $\mathbf{W}_0$ and $\mathbf{W}_1$. Hence, the bounds on $\mathbf{W}$ are

$$\left\|\mathbf{W}_0\right\|_2 \le \left\|\mathbf{W}\right\|_2 \le \left\|\mathbf{W}_1\right\|_2. \tag{32}$$

Using (19) and (20), we can write

$$\left\|\mathbf{P}\mathbf{I}\right\|_2 \le \left\|\mathbf{W}\right\|_2 \le \left\|\frac{1}{\sqrt{N}}\mathbf{K}\right\|_2, \tag{33}$$

which can be simplified to

$$1 \le \left\|\mathbf{W}\right\|_2 \le \sqrt{N}. \tag{34}$$

Rearranging terms in this inequality gives

$$0 \le \frac{\left\|\mathbf{W}\right\|_2 - 1}{\sqrt{N} - 1} \le 1, \tag{35}$$

which gives us the same coupling metric as in (30) and shows that the metric is normalised, that is, $0 \le \eta \le 1$. For real-valued $\mathbf{W}$, $\|\mathbf{W}\|_2$ can be written as

$$\left\|\mathbf{W}\right\|_2 = \sqrt{\lambda_{\max}(\mathbf{W}^\top\mathbf{W})}, \tag{36}$$

where $\lambda_{\max}(\mathbf{W}^\top\mathbf{W})$ is the maximum eigenvalue of $\mathbf{W}^\top\mathbf{W}$. The unmixing matrix obtained using most ICA algorithms, including icadec, suffers from row permutation and sign ambiguity problems, that is, the rows are arranged in a random order and the sign of elements in each row is unknown [21, 24]. We note that, as $\|\mathbf{W}\|_2$ is independent of the sign and permutations of the rows of $\mathbf{W}$, our measure of information coupling straightaway addresses the problems of ICA sign and permutation ambiguities. Also, as the metric’s value is independent of the row permutations of $\mathbf{W}$, it provides symmetric results. The information coupling metric is valid for all dimensions of the unmixing matrix $\mathbf{W}$. This implies that information coupling can be easily measured in high-dimensional spaces.
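The eigenvalue form in (36), the bounds in (34), and the invariance to row signs and row order can all be confirmed numerically on an arbitrary matrix. The check below uses a randomly generated row-normalised matrix as a stand-in for an estimated unmixing matrix; the dimension and the random seed are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 5
W = rng.normal(size=(N, N))
W /= np.linalg.norm(W, axis=1, keepdims=True)            # unit-length rows, eq. (18)

norm_svd = np.linalg.norm(W, 2)                          # largest singular value
norm_eig = np.sqrt(np.linalg.eigvalsh(W.T @ W).max())    # eq. (36)

# Apply an arbitrary sign flip to each row and shuffle the row order,
# mimicking the ICA sign and permutation ambiguities.
signs = rng.choice([-1.0, 1.0], size=N)
perm = rng.permutation(N)
W_ambiguous = (signs[:, None] * W)[perm]
norm_ambiguous = np.linalg.norm(W_ambiguous, 2)
```

Here `norm_svd`, `norm_eig`, and `norm_ambiguous` agree to machine precision, and the common value lies between 1 and $\sqrt{N}$, since the spectral norm is at least as large as any row norm and no larger than the Frobenius norm.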

It is possible to obtain a measure of uncertainty in our estimation of the information coupling measure. We make use of the BFGS quasi-Newton optimisation approach over the most probable skew-symmetric matrix $\mathbf{J}$ of (9), from which estimates for the unmixing matrix $\mathbf{W}$ can be obtained and thence the coupling measure $\eta$ calculated. We also estimate the Hessian (inverse covariance) matrix $\mathbf{H}$ for $\mathbf{J}$ as part of this process. Hence, it is possible to draw samples, $\mathbf{J}'$ say, from the distribution over $\mathbf{J}$ as a multivariate normal:

$$\mathbf{J}' \sim \mathcal{MN}(\mathbf{J}, \mathbf{H}^{-1}). \tag{37}$$

These samples can be readily transformed to samples in $\eta$ using (9), (8), and (30), respectively. Confidence bounds (and here we use the 95% bounds) may then be easily obtained from the set of samples for $\eta$ (in our analysis, we use 100 samples).
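The sampling procedure can be sketched as follows. This is an illustrative outline, not the authors' code: the point estimate `theta_hat` (the above-diagonal elements of $\mathbf{J}$), the Hessian, and the toy data are made-up values, the function names are hypothetical, and $\mathbf{D}$ is set to the identity because any row scaling cancels under the row normalisation of (18).

```python
import numpy as np

def expm_skew(J):
    """Matrix exponential of a real skew-symmetric J via the Hermitian matrix 1j * J."""
    lam, V = np.linalg.eigh(1j * J)
    return ((V * np.exp(-1j * lam)) @ V.conj().T).real

def coupling(W):
    """Eq. (30) applied to the row-normalised version of a square matrix W."""
    W = W / np.linalg.norm(W, axis=1, keepdims=True)
    return (np.linalg.norm(W, 2) - 1.0) / (np.sqrt(W.shape[0]) - 1.0)

def coupling_samples(theta_hat, hessian, d, sigma, U, n_samples=100, rng=None):
    """Sample eta by drawing the free parameters of J from N(theta_hat, H^{-1}), eq. (37)."""
    rng = np.random.default_rng() if rng is None else rng
    M = len(d)
    upper = np.triu_indices(M, k=1)
    thetas = rng.multivariate_normal(theta_hat, np.linalg.inv(hessian), size=n_samples)
    etas = np.empty(n_samples)
    for k, theta in enumerate(thetas):
        J = np.zeros((M, M))
        J[upper] = theta
        Q = expm_skew(J - J.T)                               # eq. (9)
        W = np.diag(d) @ Q @ np.diag(1.0 / sigma) @ U.T      # eq. (8)
        etas[k] = coupling(W)                                # eq. (30)
    return etas

# Toy inputs: all numerical values below are assumptions for illustration only.
rng = np.random.default_rng(4)
M = 3
X = rng.laplace(size=(M, M)) @ rng.laplace(size=(M, 400))
U, sigma, _ = np.linalg.svd(X, full_matrices=False)
theta_hat = np.array([0.3, -0.2, 0.1])       # hypothetical optimum for J
hessian = 400.0 * np.eye(3)                  # hypothetical Hessian from the optimiser
etas = coupling_samples(theta_hat, hessian, np.ones(M), sigma, U, rng=rng)
lower, upper_bound = np.percentile(etas, [2.5, 97.5])   # 95% bounds on eta
```

A sharper likelihood peak (a larger Hessian) concentrates the sampled values of $\mathbf{J}$ and therefore narrows the interval between `lower` and `upper_bound`.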

4.2. Computational Complexity. The information coupling algorithm achieves computational efficiency by making use of the sliding-window based decorrelating manifold approach to ICA. Making use of the reciprocal cosh based ICA source model also results in significant computational advantages. We now take a look at the comparative computational complexity of the information coupling measure and three frequently used measures of statistical dependence, that is, linear correlation, rank correlation, and mutual information. For bivariate data ($n_s$ samples long), for which these four measures are directly comparable, linear correlation and rank correlation have time complexities of order $O(n_s)$ and $O(n_s \log n_s)$, respectively [29], while mutual information and information coupling scale as $O(n_s^2)$ and $O(n_s)$, respectively (there have been various estimation algorithms proposed for efficient computation of mutual information; however, they all result in increased estimation errors and require careful selection of various user-defined parameters [30]) [31]. Hence, even though the time complexity of the information coupling measure is of the same order as linear correlation, it can still accurately capture statistical dependencies in non-Gaussian data streams and is a computationally efficient proxy for mutual information.

For $N$-dimensional multivariate data, direct computation of mutual information has time complexity of order $O(n_s^N)$, compared to $O(n_s N^3)$ for the information coupling measure. In high dimensions, even an approximation for mutual information can be computationally very costly. For example, using a Parzen-window density estimator, the mutual information computational complexity can be reduced to $O(n_s n_b^N)$, where $n_b$ is the number of bins used for estimation [32], which will incur a very high computational cost even for relatively small values of $N$, $n_b$, and $n_s$. As a simple example, Table 1 shows a comparison of computation time (in seconds) taken by mutual information and information coupling measures for analysing bivariate data sets of varying lengths. As expected, mutual information estimation using the Parzen window based approach (which is considered to be a relatively efficient approach to compute mutual information) becomes computationally very demanding with an increase in the number of samples of the bivariate data set. In contrast, the information coupling measure is computationally efficient even when used to analyse very large, high-dimensional multivariate data sets.

Table 1: Example showing comparison of average computation time (in seconds) of mutual information and information coupling when these measures are used to analyse bivariate data sets containing different numbers of samples ($n_s$). The approach used to estimate mutual information is based on a Parzen window based algorithm, as described in [45]. The computational cost of this algorithm is dependent on the window-size ($h$). The values of $h$ used for the simulations are $h = 20$ for $n_s = 10^2$ and $h = 100$ for all other simulations. Results are obtained using a 2.66 GHz processor as an average of 100 simulations.

| Computation time (sec) | $n_s = 10^2$ | $n_s = 10^3$ | $n_s = 10^4$ | $n_s = 10^5$ |
|---|---|---|---|---|
| Mutual information | 0.0214 | 2.8289 | 17.0101 | 119.5088 |
| Information coupling | 0.0073 | 0.0213 | 0.0561 | 0.5543 |

4.3. Capturing Market Dynamics. To dynamically analyse information coupling in multivariate financial data streams, we need to make use of windowing techniques. Financial markets give rise to well-defined events, such as orders, trades, and quote revisions. These events are irregularly spaced in clock-time, that is, they are asynchronous. Statistical models in clock-time make use of data aggregated over fixed intervals of time [33]. The time at which these events are indexed is called the event-time. Hence, for dynamic modelling in event-time, the number of data points can be regarded as fixed while time varies, whereas in clock-time the time period is considered to be fixed with a variable number of data points. Although we may need adaptive windows in clock-time, we can use sliding-windows of fixed length in event-time. Using fixed length sliding-windows in event-time can be useful for obtaining consistent results when developing and testing different statistical models. Also, statistical models deployed for online analysis of financial data operate best in event-time, as they often need to make decisions as soon as some new market information (such as a quote update) becomes available. Consider an online trading model making use of an adaptive window in event-time. At specific times of the day, for example, at times of major news announcements, trading volume can significantly increase. Hence, more data will be available to the algorithm, and thus the results obtained can be misleading [34]. Using a sliding-window of fixed length in event-time can overcome this problem. The length of the sliding-window needs to be selected appropriately. The financial application for which the model is being used is one of the factors which drives the choice of window length. As a general rule for trading models, a window of approximately the same size as the average time period between placing trades (the inverse of the trading frequency) is often used. This makes it possible to accurately capture the rapidly evolving dynamics of the markets over the corresponding period, without the window being so long that it captures only major trends or so short that it captures mostly noise in the data.
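The distinction between the two windowing schemes can be made concrete with a short sketch. The function names and the toy timestamps below are illustrative assumptions: an event-time window always holds the same number of events, whereas a clock-time window holds however many events fall inside a fixed duration.

```python
import numpy as np


def event_time_windows(values, length):
    """Yield sliding windows that each hold a fixed number of events."""
    for end in range(length, len(values) + 1):
        yield values[end - length:end]


def clock_time_windows(times, values, duration):
    """Yield, for each event, the events within the preceding `duration` seconds."""
    start = 0
    for end in range(len(values)):
        # Advance the left edge until the window spans less than `duration`.
        while times[end] - times[start] >= duration:
            start += 1
        yield values[start:end + 1]


# Irregularly spaced events: a quiet period followed by a burst of activity.
times = np.array([0.0, 0.1, 0.2, 5.0, 9.0, 9.1, 9.2, 9.3])
values = np.arange(8.0)

event_sizes = [len(w) for w in event_time_windows(values, 3)]
clock_sizes = [len(w) for w in clock_time_windows(times, values, 1.0)]
```

In this toy example every event-time window contains three points, while the clock-time window size varies with trading activity, which is the effect the paragraph above warns about.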

4.4. Discussion. The information coupling model offers multiple advantages when used to analyse multivariate financial data. Here we summarise some of the main properties of the model, while the empirical results presented in the next section showcase some of its practical benefits.

(i) The information coupling measure, a proxy for mutual information, is able to accurately pick up statistical dependencies in data sets with non-Gaussian distributions (such as financial returns).

(ii) The information coupling algorithm is computationally efficient, which makes it particularly suitable for use in an online dynamic environment. This makes the algorithm especially attractive when dealing with data sampled at high frequencies, because with the ever-increasing use of high-frequency data, overcoming sources of latency is of utmost importance in a variety of applications in modern financial markets.

(iii) It gives confidence levels on the information coupling measure. This allows us to estimate the uncertainty associated with the measurements.

(iv) The metric provides normalised results, that is, information coupling ranges from 0 for decoupled systems to 1 for completely coupled systems. This makes it easier to analyse results obtained using the metric and to compare its performance with other similar measures of association. The metric also gives symmetric results.

(v) The metric is valid for any number of signals in high-dimensional spaces, that is, it consistently gives accurate results irrespective of the number of time series between which information coupling is being computed. This makes it suitable for a range of financial applications.

(vi) It is not data intensive, that is, it gives relatively accurate results even when a small sample size is used. This allows the metric to model the complex and rapidly changing dynamics of financial markets.

(vii) It does not depend on user-defined parameters. Such parameters can restrict the practical utility of a measure, as evolving market conditions may require them to be constantly updated, which may not be practical.

5. Results

We now present a set of synthetic and financial data examples showing the relative accuracy and practical utility of the information coupling measure. The following notations are used for different measures of statistical dependence in this paper: ICA-based information coupling ($\eta$), linear correlation ($\rho$), rank correlation ($\rho_R$), and normalised mutual information ($I_N$). Normalisation of mutual information values ($I$) is achieved using the following transformation [35]:

$$I_N = \sqrt{1 - \exp(-2I)}. \tag{38}$$
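A small sketch of the transformation (38) is given below, assuming that mutual information is measured in nats; the function name `normalised_mi` is illustrative. One way to see why this normalisation is natural is that, for a bivariate Gaussian with correlation $\rho$, the mutual information is $I = -\tfrac{1}{2}\ln(1 - \rho^2)$, so that (38) returns $|\rho|$.

```python
import math


def normalised_mi(mi_nats):
    """Map mutual information (in nats, >= 0) to the range [0, 1) via (38)."""
    if mi_nats < 0:
        raise ValueError("mutual information must be non-negative")
    return math.sqrt(1.0 - math.exp(-2.0 * mi_nats))


# Sanity check on a bivariate Gaussian with correlation rho = 0.6.
rho = 0.6
gaussian_mi = -0.5 * math.log(1.0 - rho ** 2)
recovered = normalised_mi(gaussian_mi)
```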

The financial data examples we present in this paper primarily make use of spot foreign exchange (FX) data. Spot price, or spot rate, is the price which is actually quoted for a currency transaction to take place. Return is the profit realised when a currency is traded. It is common practice to use the normalised log-returns of financial data in statistical analysis instead of the raw data itself [36]. Using log-returns makes it possible to convert exponential problems into linear ones, thus significantly simplifying relevant analysis. A normalised log-returns data set, with a mean of zero and unit variance, can generally be regarded as a locally stationary process [37]. Therefore, many signal processing techniques meant solely for stationary processes can be successfully applied to the normalised log-returns time series in an adaptive environment. Denoting the mid-price at time $t$ of a financial time series by $P_t$, the log-return value is given by

$$r_t = \ln\left(\frac{P_t}{P_{t-1}}\right). \tag{39}$$

The normalisation of log-returns is achieved by converting the data to a form such that it has a mean of zero and unit variance. This is easily achieved by removing the mean and dividing by the standard deviation of the data.
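The preprocessing just described can be sketched in a few lines. The function name and the example prices are illustrative assumptions, and the population standard deviation (NumPy's default) is used for the scaling.

```python
import numpy as np


def normalised_log_returns(prices):
    """Compute log-returns r_t = ln(P_t / P_{t-1}) and standardise them."""
    prices = np.asarray(prices, dtype=float)
    returns = np.diff(np.log(prices))           # equation (39)
    return (returns - returns.mean()) / returns.std()


# Hypothetical mid-prices, used only to exercise the function.
mid_prices = [1.3050, 1.3052, 1.3049, 1.3055, 1.3051, 1.3060]
z = normalised_log_returns(mid_prices)
```

In a sliding-window setting the same standardisation would be applied within each window, so that the data in every window has zero mean and unit variance.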

5.1. Synthetic Data. There is no single distribution which fits financial returns, especially those sampled at higher frequencies, which tend to be highly non-Gaussian [2], although there have been attempts to model returns using a variety of distributions [38]. In this paper, we aim to capture the heavy-tailed, skewed properties of financial returns using a Pearson type IV distribution [39, 40], which can represent distributions with varying degrees of skewness and kurtosis and is thus useful for representing distributions of financial returns [41, 42]. The first four moments of the distribution can be uniquely determined by setting four parameters which characterise the distribution [43]. Until recently, due to its mathematical and computational complexity, this distribution has not been widely used in financial literature, although this is rapidly changing with advances in computational power and the proposal of new, improved analytical methods [39, 42, 44].

Table 2: Accuracy of four measures of dependence, that is, information coupling ($\eta$), linear correlation ($\rho$), rank correlation ($\rho_R$), and normalised mutual information ($I_N$), when used to estimate the level of dependence in a coupled system with varying levels of true correlation ($\rho_{TC}$). $\rho_{TC}$ is induced in an independent bivariate randomly distributed system using the Iman-Conover method, as described in the text. The dependence estimates, together with their standard deviation values, are obtained using 1000 independent simulations, each using a data set 1000 data points long. The last row of the table gives values for the mean absolute error (MAE).

| $\rho_{TC}$ | $\eta$ | $\rho$ | $\rho_R$ | $I_N$ | $\lvert\rho_{TC} - \eta\rvert$ | $\lvert\rho_{TC} - \rho\rvert$ | $\lvert\rho_{TC} - \rho_R\rvert$ | $\lvert\rho_{TC} - I_N\rvert$ |
|---|---|---|---|---|---|---|---|---|
| 0 | 0.0046 ± 0.0026 | 0.0038 ± 0.0022 | 0.0147 ± 0.0115 | 0.1851 ± 0.0401 | 0.0046 | 0.0038 | 0.0147 | 0.1851 |
| 0.1 | 0.1079 ± 0.0073 | 0.0917 ± 0.0058 | 0.0942 ± 0.0184 | 0.1687 ± 0.0412 | 0.0079 | 0.0083 | 0.0058 | 0.0687 |
| 0.2 | 0.2145 ± 0.0102 | 0.1862 ± 0.0093 | 0.1899 ± 0.0177 | 0.1131 ± 0.0464 | 0.0145 | 0.0138 | 0.0111 | 0.0869 |
| 0.3 | 0.3176 ± 0.0138 | 0.2812 ± 0.0130 | 0.2863 ± 0.0167 | 0.1644 ± 0.0538 | 0.0176 | 0.0188 | 0.0137 | 0.1356 |
| 0.4 | 0.4166 ± 0.0210 | 0.3764 ± 0.0165 | 0.3835 ± 0.0153 | 0.2787 ± 0.0500 | 0.0166 | 0.0236 | 0.0165 | 0.1213 |
| 0.5 | 0.5135 ± 0.0196 | 0.4720 ± 0.0197 | 0.4818 ± 0.0136 | 0.3819 ± 0.0475 | 0.0135 | 0.0280 | 0.0182 | 0.1181 |
| 0.6 | 0.6070 ± 0.0347 | 0.5688 ± 0.0226 | 0.5815 ± 0.0117 | 0.4788 ± 0.0458 | 0.0070 | 0.0312 | 0.0185 | 0.1212 |
| 0.7 | 0.7011 ± 0.0232 | 0.6670 ± 0.0242 | 0.6830 ± 0.0093 | 0.5728 ± 0.0483 | 0.0011 | 0.0330 | 0.0170 | 0.1272 |
| 0.8 | 0.7936 ± 0.0239 | 0.7676 ± 0.0249 | 0.7864 ± 0.0066 | 0.6652 ± 0.0490 | 0.0064 | 0.0324 | 0.0136 | 0.1348 |
| 0.9 | 0.8864 ± 0.0252 | 0.8713 ± 0.0249 | 0.8919 ± 0.0034 | 0.7595 ± 0.0501 | 0.0136 | 0.0287 | 0.0081 | 0.1405 |
| 1.0 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000 | 0.9601 ± 0.0185 | 0.0000 | 0.0000 | 0.0000 | 0.0399 |
| MAE | | | | | 0.0093 | 0.0201 | 0.0125 | 0.1163 |

To test the accuracy of various measures of dependence, we need to generate coupled non-Gaussian synthetic data with known, predefined correlation values. There is no straightforward way to simulate correlated random variables when their joint distribution is not known [46], as is the case with multivariate financial returns. One possible method that can be used to induce any desired predefined correlation between independent randomly distributed variables, irrespective of their distributions, is commonly known as the Iman-Conover method, as presented in [47]. This method is based on inducing a known dependency structure in samples taken from the input independent marginal distributions using reordering techniques. The multivariate coupled structure obtained as the output can thus be used as the input data in various models of dependency analysis to test their relative accuracies.
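The reordering idea behind the Iman-Conover method can be sketched as follows. This is a simplified illustration under stated assumptions rather than the exact procedure of [47] or the authors' code: normal (van der Waerden) scores are used as the reference sample, and Student-t draws stand in for the Pearson type IV marginals, which are not available in NumPy. All function and variable names are illustrative.

```python
import numpy as np
from statistics import NormalDist


def iman_conover(x, target_corr, rng):
    """Reorder each column of x so the columns acquire the target correlation.

    The values in every column are unchanged (marginals are preserved);
    only their order is rearranged.
    """
    n, k = x.shape
    # Reference scores: normal quantiles, independently shuffled per column.
    base = np.array([NormalDist().inv_cdf(i / (n + 1)) for i in range(1, n + 1)])
    scores = np.column_stack([rng.permutation(base) for _ in range(k)])
    # Remove the sample correlation of the scores, then impose the target.
    q = np.linalg.cholesky(np.corrcoef(scores, rowvar=False))
    p = np.linalg.cholesky(np.asarray(target_corr, dtype=float))
    reference = scores @ np.linalg.inv(q).T @ p.T
    # Give each column of x the same rank ordering as the reference column.
    out = np.empty_like(x, dtype=float)
    for j in range(k):
        ranks = np.argsort(np.argsort(reference[:, j]))
        out[:, j] = np.sort(x[:, j])[ranks]
    return out


def rank_corr(a, b):
    """Spearman rank correlation for continuous data (no ties assumed)."""
    ra, rb = np.argsort(np.argsort(a)), np.argsort(np.argsort(b))
    return np.corrcoef(ra, rb)[0, 1]


rng = np.random.default_rng(0)
independent = rng.standard_t(df=4, size=(1000, 2))   # heavy-tailed stand-in
coupled = iman_conover(independent, [[1.0, 0.6], [0.6, 1.0]], rng)
induced = rank_corr(coupled[:, 0], coupled[:, 1])
```

Because only the ordering of the samples changes, the heavy-tailed marginals are untouched; the induced rank correlation is close to, but not exactly equal to, the requested value.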

We set parameters of the Pearson type IV distribution such that the coupled synthetic data we obtain using the Iman-Conover method has similar properties to financial returns. But first we need to consider some properties of financial returns which we want to mimic. Figures 1(a) and 1(b) show the distributions of two higher-order moments, that is, kurtosis ($\kappa$) and skewness ($\gamma$), respectively, for FX spot data sets sampled at three different frequencies. The plots are obtained using a sliding-window of length 50 data points, as an average of all G10 (Group of Ten) currency pairs, covering a period of 8 hours in the case of 0.25 second and 0.5 second sampled data and 2 years in the case of 0.5 hour sampled data. The non-Gaussian (heavy-tailed, skewed) nature of the data is clearly visible. It is interesting to note that the kurtosis value almost never goes below three for any of the data sets, signifying the temporal persistence of non-Gaussianity. We now make use of the Iman-Conover method to induce varying levels of correlation between 1000 samples taken from independent, randomly distributed Pearson type IV distributions. A 1000 sample data set makes it easier to accurately induce predefined correlations in the system and makes it possible to generate data with relatively accurate average kurtosis and skewness values, hence allowing us to accurately capture the higher-order moments of financial returns. As an example, Figures 1(c) and 1(d) show distributions of the kurtosis and skewness of two variables for 1000 independent simulations. Note the similarity of these plots with the average of the corresponding distributions of higher-order moments for financial data, as presented in Figures 1(a) and 1(b). This shows the effectiveness of using synthetic data sampled from a Pearson type IV distribution for capturing higher-order moments of financial returns.

Four different approaches are now used to estimate the level of dependence between the output coupled data. The process is repeated 1000 times for each level of true correlation ($\rho_{TC}$). Table 2 presents the results obtained. The results show the accuracy of the information coupling measure when used to analyse non-Gaussian data. For this synthetic data example, on average, the information coupling measure was 53.7% more accurate than the linear correlation measure and 25.6% more accurate than the rank correlation measure. The normalised mutual information provided the least accurate results.
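The two percentages quoted above are consistent with the relative reduction in mean absolute error reported in the last row of Table 2. This reading is inferred from the numbers rather than stated explicitly in the text:

```latex
\frac{\mathrm{MAE}_{\rho} - \mathrm{MAE}_{\eta}}{\mathrm{MAE}_{\rho}}
  = \frac{0.0201 - 0.0093}{0.0201} \approx 0.537,
\qquad
\frac{\mathrm{MAE}_{\rho_R} - \mathrm{MAE}_{\eta}}{\mathrm{MAE}_{\rho_R}}
  = \frac{0.0125 - 0.0093}{0.0125} = 0.256 .
```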

Figure 1: (a, b) Plots showing the normalised pdfs of (a) kurtosis ($\kappa$) and (b) skewness ($\gamma$), obtained using a sliding-window of length 50 data points, as an average of all G10 currency pairs, covering a period of 8 hours in the case of 0.25 second and 0.5 second sampled data and 2 years in the case of 0.5 hour sampled data. The plots clearly show the heavy-tailed, skewed nature of FX returns at all three sampling frequencies. (c, d) An example of the normalised pdf plots showing the average kurtosis and skewness values, respectively, for synthetic data (variables $x_1$ and $x_2$) generated using Pearson type IV distributions. The data has properties similar to the average of the distributions presented in (a) and (b). The vertical lines indicate the kurtosis ($\kappa = 3$) and skewness ($\gamma = 0$) values for a Gaussian distribution.

We now extend this example by incorporating data dynamics. The same data generation process as described above is now used to construct a 32000 samples long bivariate data set in which the induced true correlation changes every 8000 time steps, that is, $\rho_{TC} = 0.2$ when $t = 1:8000$, $\rho_{TC} = 0.4$ when $t = 8001:16000$, $\rho_{TC} = 0.6$ when $t = 16001:24000$, and $\rho_{TC} = 0.8$ when $t = 24001:32000$. A 1000 data points wide sliding-window is used to dynamically measure dependencies in the data set. The resulting temporal information coupling plot, together with the 5th and 95th percentile confidence intervals, is presented in Figure 2(a). The four different coupling regions are clearly visible, together with the step changes in coupling after every 8000 time steps, showing the ability of the algorithm to detect abrupt changes in coupling. The normalised empirical probability distributions over $\eta$ for the four coupling regions are shown in Figure 2(b). Also plotted in the same figure are the normalised empirical pdfs for linear correlation ($\rho$) and rank correlation ($\rho_R$). The mutual information pdf is omitted for clarity, as it gives relatively less accurate results, as presented in Table 2. It is interesting to see how the peaks of the $\eta$ distribution correspond very closely to the $\rho_{TC}$ values, showing the ability of the information coupling model to accurately capture statistical dependencies in a dynamic environment. The least accurate measure in this example is the linear correlation.
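The structure of this dynamic experiment can be sketched with a simple stand-in. The sketch below is an illustration only: it uses Gaussian data and sliding-window linear correlation in place of the Pearson type IV data and the ICA-based coupling measure, and all names are hypothetical. It shows the piecewise-constant true correlation, the 1000-point sliding-window, and a running-sum implementation whose cost grows linearly with the number of samples.

```python
import numpy as np


def piecewise_correlated(levels, block, rng):
    """Bivariate Gaussian data whose true correlation changes every `block` steps."""
    parts = []
    for rho in levels:
        z = rng.standard_normal((block, 2))
        y = rho * z[:, 0] + np.sqrt(1.0 - rho ** 2) * z[:, 1]
        parts.append(np.column_stack([z[:, 0], y]))
    return np.vstack(parts)


def rolling_corr(x, y, w):
    """Sliding-window linear correlation using running sums (linear in len(x))."""
    def rsum(a):
        c = np.concatenate(([0.0], np.cumsum(a)))
        return c[w:] - c[:-w]

    sx, sy = rsum(x), rsum(y)
    cov = rsum(x * y) - sx * sy / w
    var_x = rsum(x * x) - sx * sx / w
    var_y = rsum(y * y) - sy * sy / w
    return cov / np.sqrt(var_x * var_y)


rng = np.random.default_rng(0)
data = piecewise_correlated([0.2, 0.4, 0.6, 0.8], 8000, rng)
rho_t = rolling_corr(data[:, 0], data[:, 1], 1000)   # one estimate per window
```

Entry `rho_t[i]` is the estimate for the window covering samples `i` to `i + 999`, so windows that straddle a change point blend the two neighbouring correlation levels.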

Figure 2: (a) Temporal variation of information coupling, $\eta(t)$, plotted as a function of time. Step changes in coupling are visible after every 8000 time steps, that is, $\rho_{TC} = 0.2$ when $t = 1:8000$, $\rho_{TC} = 0.4$ when $t = 8001:16000$, $\rho_{TC} = 0.6$ when $t = 16001:24000$, and $\rho_{TC} = 0.8$ when $t = 24001:32000$. Also plotted are the median, $[\eta(t)]_{\mathrm{median}}$, and the 5% and 95% confidence interval contours, $[\eta(t)]_{5}^{95}$. (b) Normalised empirical pdf plots for information coupling ($\eta$), linear correlation ($\rho$), and rank correlation ($\rho_R$). The vertical lines represent the true correlation ($\rho_{TC}$) values. The relative accuracy of the information coupling measure is evident from these results.

5.2. Financial Data. We first present a simple application of the information coupling algorithm to a section of a 0.5 second sampled FX spot log-returns data set. In this paper, all currencies are referred to by their standardised international three-letter codes, as described by the ISO 4217 standard. For the currencies mentioned in this paper, the three-letter codes are USD (US dollar), EUR (Euro), JPY (Japanese yen), GBP (British pound), CHF (Swiss franc), and AUD (Australian dollar). Figure 3 shows the variation of information coupling and linear correlation with time for EURUSD and GBPUSD. The results are obtained using a 5-minute wide sliding-window. We note that the two measures of dependence frequently give different results, which reflects the inability of linear correlation to capture dependencies in non-Gaussian data streams. We also note that dependencies in FX log-returns exhibit rapidly changing dynamics, often characterised by regions of quasi-stability punctuated by abrupt changes. In [48], we show that these regions of persistence in statistical dependence may be captured using a hidden Markov ICA model, which is a hidden Markov model with an ICA observation model. The information thus obtained can be useful for a range of financial applications, such as finding regions of financial market volatility clustering and persistence [49].

Figure 3: Information coupling ($\eta_t$), with its 5% and 95% confidence bounds $[\eta_t]_{5}^{95}$, and linear correlation ($\rho_t$) plotted as a function of time (in seconds) for a section of a 0.5 second sampled EURUSD and GBPUSD log-returns data set. A 5-minute wide sliding-window is used to obtain the results.

Figure 4: Daily closing mid-prices ($P_t$) of AUDUSD and USDJPY. The two plots are adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

There have been numerous academic studies on the causes and effects of the 2008 financial crisis [50, 51]. However, very few of these have focused on the impact of the crisis on interdependencies in the global FX market. Here we present a set of examples which give a unique insight into the effects of the crisis on the nature of statistical dependencies in the spot FX market. Accurate estimation of dependencies at times of financial crises is of utmost importance, as these estimates are used by financial practitioners for a range of tasks, such as rebalancing portfolios, accurately pricing options, and deciding on the level of risk-taking. We first present an application of the information coupling model for detecting temporal changes in dependencies in bivariate FX data streams at times of financial crises. Figure 4 shows the adjusted daily closing mid-prices ($P_t$) for AUDUSD and USDJPY from January 2005 till April 2010. The two plots clearly show an abrupt change in the exchange rates in September-October 2008. This occurred at the height of the 2008 global financial crisis and was caused by the unwinding of carry trades [52].

Figure 5(a) displays three plots showing the temporal variation of information coupling ($\eta_t$), linear correlation ($\rho_t$), and rank correlation ($\rho_{R,t}$) between AUDUSD and USDJPY log-returns. The plots are obtained using a six-month long sliding-window. We notice the rise in uncertainty of the information coupling measure (Figure 5(b)) right before the crash, with uncertainty decreasing gradually thereafter; this information may be useful to systematically predict upheavals in the market, although we do not carry out this study in detail in this paper. Information about the level of uncertainty can be used as a measure of confidence in the information coupling values and can be useful in various practical decision making scenarios, such as deciding on the capital to deploy for the purpose of trading or selecting stocks (or currencies) for inclusion in a portfolio.

Figure 5: (a) Three approaches used to measure temporal dependencies between AUDUSD and USDJPY log-returns: information coupling ($\eta_t$), with its confidence bounds $[\eta_t]_{5}^{95}$, linear correlation ($\rho_t$), and rank correlation ($\rho_{R,t}$). (b) Range of confidence intervals ($\Delta[\eta]_{5}^{95} = \eta_{95} - \eta_{5}$) plotted as a function of time, showing the uncertainty associated with the information coupling measurements. The vertical lines correspond to the September 2008 financial meltdown.

Figure 6: Difference between information coupling and linear correlation, $(\eta_t - |\rho_t|)^2$, plotted as a function of time. Also plotted is a measure of non-Gaussianity of the two time series, $\mathrm{JB}^2_{\mathrm{AUDUSD},t} + \mathrm{JB}^2_{\mathrm{USDJPY},t}$, as defined by (40). The two plots are adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

As daily sampled data is generally less non-Gaussian than data sampled at higher frequencies, the three plots in Figure 5(a) are somewhat similar during certain time periods. However, right after the September 2008 crash, the plots significantly deviate from each other. We believe that this is because the nature of the data, in particular its level of non-Gaussianity, has changed. As shown in Figure 6, the distance measure $(\eta_t - |\rho_t|)^2$ between information coupling and linear correlation closely matches the non-Gaussianity of the data under consideration. The two plots are adjusted for ease of comparison. The degree of non-Gaussianity is calculated using the multivariate Jarque-Bera statistic ($\mathrm{JB}_{MV}$), which we define for an $N$-dimensional multivariate data set as

$$\mathrm{JB}_{MV} = \sum_{j=1}^{N} \left[ \frac{n_s}{6} \left( \gamma_j^2 + \frac{(\kappa_j - 3)^2}{4} \right) \right]^2, \tag{40}$$

where $n_s$ is the number of data points (in this case the size of the sliding-window), $\gamma_j$ is the skewness of the $j$th time series under analysis, and $\kappa_j$ is its kurtosis. This shows that relying solely on correlation measures to model dependencies in multivariate financial time series, even when using data sampled at relatively lower frequencies, can potentially lead to inaccurate results. In contrast, the information coupling model takes into account properties of the data being analysed, resulting in an accurate approach to measure statistical dependencies.
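A direct transcription of the statistic in (40) might look as follows. This is a minimal sketch: the function name is illustrative, and it assumes population (biased) moment estimators and non-excess kurtosis (so that a Gaussian has $\kappa = 3$), choices which the text does not specify.

```python
import numpy as np


def jarque_bera_mv(data):
    """Sum of squared per-series Jarque-Bera statistics, as in (40).

    data: array of shape (n_s, N), one column per time series.
    """
    data = np.asarray(data, dtype=float)
    n_s = data.shape[0]
    centred = data - data.mean(axis=0)
    m2 = (centred ** 2).mean(axis=0)
    skew = (centred ** 3).mean(axis=0) / m2 ** 1.5      # gamma_j
    kurt = (centred ** 4).mean(axis=0) / m2 ** 2        # kappa_j
    jb = n_s / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return float(np.sum(jb ** 2))


# Heavy-tailed data should score far higher than Gaussian data.
rng = np.random.default_rng(0)
jb_gauss = jarque_bera_mv(rng.standard_normal((5000, 2)))
jb_heavy = jarque_bera_mv(rng.standard_t(df=3, size=(5000, 2)))
```

Applied within each sliding-window, the function would give a time series of non-Gaussianity values comparable to the one plotted in Figure 6.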

We now show the utility of the information coupling model for analysing multivariate statistical dependencies. Figure 7 shows the temporal variation of information coupling between four major liquid currency pairs (EURUSD, GBPUSD, USDCHF, and USDJPY). The results are obtained using daily log-returns for a seven-year period and a six-month long sliding-window. Also plotted on the same figure is the FTSE-100 (Financial Times Stock Exchange 100) index for the corresponding time period, which has been adjusted for ease of comparison. The plot clearly shows an abrupt upward shift in coupling between the four currency pairs right at the time of the September 2008 financial meltdown, with a gradual decrease in coupling over the next year. We also notice an increase in the uncertainty associated with the information coupling measure before the 2008 crash. The increase in dependence of financial instruments in times of financial crises has been observed for other asset classes as well [53]. Our example showing the dynamics of multivariate dependencies within the spot FX space provides further insight into the nature of interdependencies in times of financial crisis.

Figure 7: Information coupling ($\eta_t$), with its confidence bounds $[\eta_t]_{5}^{95}$, between EURUSD, GBPUSD, USDCHF, and USDJPY log-returns plotted as a function of time. Also plotted is the FTSE-100 index, which is adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

6. Conclusions

We present an ICA-based approach to dynamically measure information coupling, as a proxy for mutual information. This approach makes use of ICA as a tool to capture information in the tails of the underlying distributions and is suitable for efficiently and accurately measuring statistical dependencies between multiple non-Gaussian signals. As far as we know, this is the first attempt to quantify multivariate dependencies using information encoded in the ICA unmixing matrix. Our proposed information coupling model has multiple other benefits associated with its practical use. It provides a framework for estimating confidence bounds on the information coupling metric, can be efficiently used to directly model dependencies in high-dimensional spaces, and gives normalised, symmetric results. It has the added advantage of not depending on any user-defined parameters and is not data intensive, that is, it can be used even with relatively small data sets without significantly affecting its performance, an important requirement for analysing data streams with rapidly changing dynamics, such as financial returns. The model makes use of a sliding-window based decorrelating manifold approach to ICA, with a reciprocal cosh source model, to infer the ICA unmixing matrix, which results in increased accuracy and efficiency of the algorithm. This gives the information coupling model a computational complexity similar to that of linear correlation with the accuracy of mutual information.

Acknowledgments

The authors are grateful to the Oxford-Man Institute of Quantitative Finance for help in supporting this research. The first author would also like to thank Exeter College (University of Oxford) for funding.

References

[1] D Berg and K Aas ldquoModels for construction of multivariatedependencemdasha comparison studyrdquo The European Journal ofFinance vol 15 no 7-8 pp 639ndash659 2009

[2] M M Dacorogna An Introduction to High-Frequency FinanceAcademic Press New York NY USA 2001

[3] J Hlinka M Palus M Vejmelka D Mantini and M CorbettaldquoFunctional connectivity in resting-state fMRI is linear correla-tion sufficientrdquo NeuroImage vol 54 no 3 pp 2218ndash2225 2011

[4] S J Devlin R Gnanadesikan and J R Kettenring ldquoRobustestimation and outlier detection with correlation coefficientsrdquoBiometrika vol 62 no 3 pp 531ndash545 1975

[5] A R Cowan ldquoNonparametric event study testsrdquo Review ofQuantitative Finance and Accounting vol 2 no 4 pp 343ndash3581992

[6] G J Glasser and R F Winter ldquoCritical values of the coefficientof rank correlation for testing the hypothesis of independencerdquoBiometrika vol 48 no 3-4 pp 444ndash448 1961

[7] P Embrechts F Lindskog and A McNeil ldquoModelling depen-dence with copulas and applications to risk managementrdquo inHandbook of Heavy Tailed Distributions in Finance vol 1chapter 8 pp 329ndash384 2003

[8] E Jondeau S H Poon and M Rockinger Financial ModelingUnder Non-Gaussian Distributions Springer New York NYUSA 2007

[9] J D Fermanian and O Scaillet ldquoSome statistical pitfalls in cop-ula modeling for financial applicationsrdquo in Capital FormationGovernance and Banking p 59 2005

[10] T M Cover and J A Thomas Elements of Information TheoryWiley-Interscience New York NY USA 2006

[11] A Kraskov H Stogbauer and P Grassberger ldquoEstimatingmutual informationrdquo Physical Review E vol 69 no 6 ArticleID 066138 16 pages 2004

[12] S Khan S Bandyopadhyay and A R Ganguly ldquoComputingmutual information based nonlinear dependence among noisyand finite geophysical time seriesrdquo in Proceedings of the Ameri-can Geophysical Union Fall Meeting 2005 abstract no NG22A-03

[13] M M V Hulle ldquoEdgeworth approximation of multivariatedifferential entropyrdquo Neural Computation vol 17 no 9 pp1903ndash1910 2005

ISRN Signal Processing 13

[14] L. B. Almeida, “Linear and nonlinear ICA based on mutual information,” in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC ’00), pp. 117–122, IEEE, 2000.

[15] A. D. Back and A. S. Weigend, “A first application of independent component analysis to extracting structure from stock returns,” International Journal of Neural Systems, vol. 8, no. 4, pp. 473–484, 1997.

[16] E. Oja, K. Kiviluoto, and S. Malaroiu, “Independent component analysis for financial time series,” in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC ’00), pp. 111–116, IEEE, 2000.

[17] A. Hyvarinen, “Survey on independent component analysis,” Neural Computing Surveys, vol. 2, no. 4, pp. 94–128, 1999.

[18] C. J. Lu, T. S. Lee, and C. C. Chiu, “Financial time series forecasting using independent component analysis and support vector regression,” Decision Support Systems, vol. 47, no. 2, pp. 115–125, 2009.

[19] H. Joe, “Relative entropy measures of multivariate dependence,” Journal of the American Statistical Association, vol. 84, no. 405, pp. 157–164, 1989.

[20] M. Novey and T. Adali, “Complex ICA by negentropy maximization,” IEEE Transactions on Neural Networks, vol. 19, no. 4, pp. 596–609, 2008.

[21] S. Roberts and R. Everson, Independent Component Analysis: Principles and Practice, Cambridge University Press, New York, NY, USA, 2001.

[22] J. Palmer, K. Kreutz-Delgado, and S. Makeig, “Super-Gaussian mixture source model for ICA,” in Independent Component Analysis and Blind Signal Separation, pp. 854–861, 2006.

[23] R. A. Choudrey and S. J. Roberts, “Variational mixture of Bayesian independent component analyzers,” Neural Computation, vol. 15, no. 1, pp. 213–252, 2003.

[24] R. Everson and S. Roberts, “Independent component analysis: a flexible nonlinearity and decorrelating manifold approach,” Neural Computation, vol. 11, no. 8, pp. 1957–1983, 1999.

[25] R. E. Kass, L. Tierney, and J. B. Kadane, “Laplace’s method in Bayesian analysis,” in Statistical Multiple Integration: Proceedings of the AMS-IMS-SIAM Joint Summer Research Conference [on Statistical Multiple Integration] Held at Humboldt University, June 17–23, 1989, vol. 115, p. 89, American Mathematical Society, 1991.

[26] W. Addison and S. Roberts, “Blind source separation with non-stationary mixing using wavelets,” in Proceedings of the ICA Research Network Workshop, The University of Liverpool, 2006.

[27] N. J. Higham, “Matrix nearness problems and applications,” in Applications of Matrix Theory, 1989.

[28] K. B. Datta, Matrix and Linear Algebra, PHI Learning, 2004.

[29] K. Boudt, J. Cornelissen, and C. Croux, “The Gaussian rank correlation estimator: robustness properties,” Statistics and Computing, vol. 22, no. 2, pp. 471–483, 2012.

[30] D. Evans, “A computationally efficient estimator for mutual information,” Proceedings of the Royal Society A, vol. 464, no. 2093, pp. 1203–1215, 2008.

[31] K. Torkkola, “On feature extraction by mutual information maximization,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’02), vol. 1, pp. I821–I824, IEEE, May 2002.

[32] K. Nagarajan, B. Holland, C. Slatton, and A. D. George, “Scalable and portable architecture for probability density function estimation on FPGAs,” in Proceedings of the 16th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM ’08), pp. 302–303, IEEE Computer Society, April 2008.

[33] C. G. Bowsher, “Modelling security market events in continuous time: intensity based, multivariate point process models,” Journal of Econometrics, vol. 141, no. 2, pp. 876–912, 2007.

[34] B. Peiers, “Informed traders, intervention, and price leadership: a deeper view of the microstructure of the foreign exchange market,” Journal of Finance, vol. 52, no. 4, pp. 1589–1614, 1997.

[35] A. Dionisio, R. Menezes, and D. A. Mendes, “Mutual information: a measure of dependency for nonlinear time series,” Physica A, vol. 344, no. 1-2, pp. 326–329, 2004.

[36] J. Y. Campbell, A. W. Lo, A. C. MacKinlay, and R. F. Whitelaw, “The econometrics of financial markets,” Macroeconomic Dynamics, vol. 2, no. 4, pp. 559–562, 1998.

[37] G. Koutmos, C. Negakis, and P. Theodossiou, “Stochastic behaviour of the Athens stock exchange,” Applied Financial Economics, vol. 3, no. 2, pp. 119–126, 1993.

[38] D. M. Guillaume, M. M. Dacorogna, R. R. Dave, U. A. Muller, R. B. Olsen, and O. V. Pictet, “From the bird’s eye to the microscope: a survey of new stylized facts of the intra-daily foreign exchange markets,” Finance and Stochastics, vol. 1, no. 2, pp. 95–129, 1997.

[39] R. Cheng, “Using Pearson type IV and other Cinderella distributions in simulation,” in Proceedings of the Winter Simulation Conference (WSC ’11), pp. 457–468, IEEE, 2011.

[40] Y. Nagahara, “The PDF and CF of Pearson type IV distributions and the ML estimation of the parameters,” Statistics and Probability Letters, vol. 43, no. 3, pp. 251–264, 1999.

[41] A. Shephard, Pearson IV, Institute for Fiscal Studies, University College London, 2008.

[42] S. Stavroyiannis, I. Makris, V. Nikolaidis, and L. Zarangas, “Econometric modeling and value-at-risk using the Pearson type IV distribution,” International Review of Financial Analysis, vol. 22, pp. 10–17, 2012.

[43] M. Kendall, A. Stuart, and J. K. Ord, Kendall’s Advanced Theory of Statistics, Charles Griffin, 1987.

[44] R. Willink, “A closed-form expression for the Pearson type IV distribution function,” Australian and New Zealand Journal of Statistics, vol. 50, no. 2, pp. 199–205, 2008.

[45] A. Venelli, “Efficient entropy estimation for mutual information analysis using B-splines,” in Information Security Theory and Practices: Security and Privacy of Pervasive Systems and Smart Devices, pp. 17–30, 2010.

[46] C. G. Park and D. W. Shin, “An algorithm for generating correlated random variables in a class of infinitely divisible distributions,” Journal of Statistical Computation and Simulation, vol. 61, no. 1-2, pp. 127–139, 1998.

[47] R. L. Iman and W. J. Conover, “A distribution-free approach to inducing rank correlation among input variables,” Communications in Statistics-Simulation and Computation, vol. 11, no. 3, pp. 311–334, 1982.

[48] N. Shah and S. Roberts, “Hidden Markov independent component analysis as a measure of coupling in multivariate financial time series,” in Proceedings of the ICA Research Network International Workshop, Liverpool, UK, 2008.

[49] A. Rossi and G. M. Gallo, “Volatility estimation via hidden Markov models,” Journal of Empirical Finance, vol. 13, no. 2, pp. 203–230, 2006.

[50] J. Crotty, “Structural causes of the global financial crisis: a critical assessment of the ‘new financial architecture’,” Cambridge Journal of Economics, vol. 33, no. 4, pp. 563–580, 2009.


[51] J. A. Murphy, “An analysis of the financial crisis of 2008: causes and solutions,” Social Science Research Network, 2008.

[52] M. Pojarliev and R. M. Levich, “Detecting crowded trades in currency funds,” Financial Analysts Journal, vol. 67, no. 1, pp. 26–39, 2011.

[53] L. Sandoval and I. D. P. Franca, “Correlation of financial markets in times of crisis,” Physica A, vol. 391, no. 1, pp. 187–208, 2012.


3. Independent Components, Unmixing, and Non-Gaussianity

Mixing two or more unique signals into a set of mixed observations results in an increase in the dependency of the pdfs of the mixed signals. The marginal pdfs of the observed mixed signals become more Gaussian due to the central limit theorem [20]. The mixing process also results in a reduction in the independence of the mixed signal distribution and hence an increase in the mutual information associated with it. Moreover, there is a rise in the stationarity of the mixed signals, which have flatter spectra than the original sources [21]. Consider a set of $N$ observed signals $\mathbf{x}(t) = [x_1(t), x_2(t), \ldots, x_N(t)]^\top$ at the time instant $t$, which are a mixture of $M$ source signals $\mathbf{s}(t) = [s_1(t), s_2(t), \ldots, s_M(t)]^\top$, mixed linearly using a mixing matrix $\mathbf{A}$, with observation noise $\mathbf{n}(t)$, as per

$$\mathbf{x}(t) = \mathbf{A}\mathbf{s}(t) + \mathbf{n}(t). \tag{1}$$

Independent component analysis (ICA) attempts to find an unmixing matrix $\mathbf{W}$ such that the $M$ recovered source signals $\mathbf{b}(t) = [b_1(t), b_2(t), \ldots, b_M(t)]^\top$ are given by

$$\mathbf{b}(t) = \mathbf{W}\left(\mathbf{x}(t) - \mathbf{n}(t)\right). \tag{2}$$

For the case where the observation noise $\mathbf{n}(t)$ is assumed to be normally distributed with a mean of zero, the least squares expected value of the recovered source signals is given by

$$\mathbf{b}(t) = \mathbf{W}\mathbf{x}(t), \tag{3}$$

where $\mathbf{W}$ is the pseudo-inverse of $\mathbf{A}$, that is,

$$\mathbf{W} = \mathbf{A}^{+} = (\mathbf{A}^\top\mathbf{A})^{-1}\mathbf{A}^\top. \tag{4}$$
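As a minimal sketch of (1), (3), and (4) in the noise-free case, the following NumPy snippet mixes synthetic sources with a known matrix and recovers them with the pseudo-inverse. The sizes, the Laplace-distributed stand-in sources, and the randomly drawn mixing matrix are illustrative assumptions; in practice $\mathbf{A}$ is unknown and $\mathbf{W}$ has to be inferred by ICA.

```python
import numpy as np

rng = np.random.default_rng(0)

N, M, T = 4, 2, 1000           # observed signals, sources, samples (illustrative sizes)
S = rng.laplace(size=(M, T))   # heavy-tailed stand-in sources s(t), one column per time step
A = rng.normal(size=(N, M))    # hypothetical mixing matrix A (N x M, full column rank)

X = A @ S                      # noise-free observations, x(t) = A s(t)

# Unmixing matrix as the pseudo-inverse of A, W = (A^T A)^{-1} A^T, as in (4).
W = np.linalg.inv(A.T @ A) @ A.T

B = W @ X                      # recovered sources, b(t) = W x(t), as in (3)

max_recovery_error = float(np.max(np.abs(B - S)))
matches_pinv = bool(np.allclose(W, np.linalg.pinv(A)))
```

With no observation noise the recovery is exact up to rounding, and the explicit formula agrees with `np.linalg.pinv`.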

In the case of square mixing, $\mathbf{W} = \mathbf{A}^{-1}$.

Most ICA approaches make implicit or explicit assumptions regarding the parametric model of the pdfs of the independent sources [21]; for example, Gaussian mixture distributions are used as source models in [22, 23], while [24] makes use of a flexible source density model given by the generalised exponential distribution. In our analysis, we use a reciprocal cosh source model as a canonical heavy-tailed distribution, namely [21],

$$p(s_i) = \frac{1}{Z_0 \cosh(s_i)}, \tag{5}$$

where $s_i$ is the $i$th source and $Z_0$ is a normalising constant. This analytical fixed source model has no adjustable parameters; therefore it has considerable computational advantages over alternative source models. Also, as this source model is heavier in the tails, it is able to accurately model heavy-tailed unimodal non-Gaussian distributions such as financial returns.
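The source model in (5) can be checked numerically. The sketch below assumes $Z_0 = \pi$, which follows from $\int_{-\infty}^{\infty} \mathrm{sech}(s)\,ds = \pi$ (worked out here rather than quoted from the paper), and estimates the total mass, variance, and excess kurtosis on a grid. The function name and the grid limits are illustrative choices.

```python
import numpy as np

def log_source_density(s):
    """log p(s) = -log(pi) - log(cosh(s)), evaluated stably for large |s|."""
    s = np.abs(np.asarray(s, dtype=float))
    # log cosh(s) = |s| + log(1 + exp(-2|s|)) - log(2)
    log_cosh = s + np.log1p(np.exp(-2.0 * s)) - np.log(2.0)
    return -np.log(np.pi) - log_cosh

grid = np.linspace(-40.0, 40.0, 400001)
step = grid[1] - grid[0]
p = np.exp(log_source_density(grid))

# Simple Riemann sums; the tails beyond |s| = 40 are negligible.
total_mass = float(np.sum(p) * step)
second_moment = float(np.sum(grid**2 * p) * step)
fourth_moment = float(np.sum(grid**4 * p) * step)
excess_kurtosis = fourth_moment / second_moment**2 - 3.0
```

A positive excess kurtosis (a Gaussian has zero) is the sense in which this density is heavy-tailed.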

3.1. Inference. For our analysis, we make use of the icadec algorithm [21, 24] to infer the unmixing matrix. This algorithm constrains the unmixing matrix to the manifold of decorrelating matrices, thus offering rapid computation. The algorithm also gives accurate results compared to other related ICA approaches [24] and allows us to obtain a confidence measure for the unmixing matrix. Here we present a brief overview of this algorithm; an in-depth description is presented in [24].

The independent source signals obtained using ICA, $\mathbf{b}(t)$, must be at least linearly decorrelated for them to be classed as independent. The icadec algorithm makes use of this property of the independent components to efficiently infer the unmixing matrix. For a set of observed signals $\mathbf{X}$, where $\mathbf{X} = [\mathbf{x}(t)]_{t=1}^{t=T}$, the set of recovered independent components $\mathbf{B} = [\mathbf{b}(t)]_{t=1}^{t=T}$ is given by $\mathbf{B} = \mathbf{W}\mathbf{X}$. The independent components are linearly decorrelated if

$$\mathbf{B}\mathbf{B}^\top = \mathbf{W}\mathbf{X}\mathbf{X}^\top\mathbf{W}^\top = \mathbf{D}^2, \tag{6}$$

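where $\mathbf{D}$ is a diagonal matrix of scaling factors. The singular value decomposition of the set of observed signals is given by

$$\mathbf{X} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^\top, \tag{7}$$

where $\mathbf{U}$ and $\mathbf{V}$ are orthogonal matrices, with the columns of $\mathbf{U}$ being the principal components of $\mathbf{X}$, and $\boldsymbol{\Sigma}$ is a diagonal matrix of the singular values of $\mathbf{X}$. It can be shown that the decorrelating matrix $\mathbf{W}$ can then be written as [24]

$$\mathbf{W} = \mathbf{D}\mathbf{Q}\boldsymbol{\Sigma}^{-1}\mathbf{U}^\top, \tag{8}$$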

where $\mathbf{Q}$ is a real orthogonal matrix. To obtain an estimate for the ICA unmixing matrix, we need to optimise a given contrast function (we use the log-likelihood of the data, as described later). There are a variety of optimisation approaches which can be used; our approach of choice is the Broyden-Fletcher-Goldfarb-Shanno (BFGS) quasi-Newton method, which gives the best estimate of the minimum of the negative log-likelihood in a computationally efficient manner and also provides us with an estimate for the Hessian matrix. However, parameterising the optimisation problem directly by the elements of $\mathbf{Q}$ makes it a constrained minimisation problem, for which BFGS is not applicable. Therefore, to convert it into an unconstrained minimisation problem, we constrain $\mathbf{Q}$ to be orthonormal by parameterising its elements as the matrix exponential of a skew-symmetric matrix $\mathbf{J}$ (nonzero elements of this matrix are known as the Cayley coordinates), whose above-diagonal elements parameterise $\mathbf{Q}$ [21]:

$$\mathbf{Q} = \exp(\mathbf{J}). \tag{9}$$

For $M$ sources and $N$ observed signals, the ICA unmixing matrix may be optimised in the $\frac{1}{2}M(M+1)$-dimensional space of decorrelating matrices rather than in the full $MN$-dimensional space, as $\frac{1}{2}M(M-1)$ and $M$ parameters are required to specify $\mathbf{Q}$ and $\mathbf{D}$, respectively. This feature offers considerable computational benefits (especially in high-dimensional spaces), and the resulting matrix is guaranteed to be decorrelating [24].

Using this parameterisation makes it possible to apply BFGS to any contrast function; the contrast function used as part of the icadec algorithm is an expression for the log-likelihood of the data, as described below.
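The parameterisation in (8) and (9) can be illustrated with a short NumPy sketch. It is not the icadec implementation: the helper name `orthogonal_from_params`, the random parameter values, and the scaling factors `d` are assumptions made for illustration. The matrix exponential is formed from the eigendecomposition of the Hermitian matrix $i\mathbf{J}$, and the snippet then confirms that any such $\mathbf{W}$ satisfies the decorrelation condition (6).

```python
import numpy as np

def orthogonal_from_params(theta, M):
    """Map M(M-1)/2 free parameters to an orthogonal Q = exp(J), J skew-symmetric."""
    J = np.zeros((M, M))
    J[np.triu_indices(M, k=1)] = theta   # above-diagonal elements parameterise Q
    J = J - J.T                          # enforce J^T = -J
    # i*J is Hermitian, so exp(J) = V exp(-i*lam) V^H from its eigendecomposition.
    lam, V = np.linalg.eigh(1j * J)
    Q = (V * np.exp(-1j * lam)) @ V.conj().T
    return Q.real                        # imaginary part is rounding error only

rng = np.random.default_rng(1)
M, T = 3, 500
X = rng.normal(size=(M, M)) @ rng.laplace(size=(M, T))  # synthetic square mixture

theta = rng.normal(size=M * (M - 1) // 2)   # unconstrained parameters (illustrative)
Q = orthogonal_from_params(theta, M)
d = np.array([0.5, 1.0, 2.0])               # hypothetical diagonal of D

U, sigma, Vt = np.linalg.svd(X, full_matrices=False)     # X = U Sigma V^T, as in (7)
W = np.diag(d) @ Q @ np.diag(1.0 / sigma) @ U.T          # W = D Q Sigma^{-1} U^T, as in (8)

B = W @ X
gram = B @ B.T    # should equal D^2 for any choice of theta
```

Because every value of `theta` yields a valid decorrelating matrix, an unconstrained optimiser such as BFGS can search over `theta` and `d` directly.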

Using (1) and assuming that the observation noise ($\mathbf{n}$) is normally distributed with a mean of zero and an isotropic covariance matrix with precision $\beta$, the distribution of the observations (as a preprocessing step, we normalise each observed signal to have a mean of zero and unit variance), conditioned on $\mathbf{A}$ and $\mathbf{s}$ (where we drop the time index $t$ for ease of presentation), is given by

$$p(\mathbf{x} \mid \mathbf{A}, \mathbf{s}) = \mathcal{N}(\mathbf{x}; \mathbf{A}\mathbf{s}, \beta^{-1}\mathbf{I}), \tag{10}$$

where $\mathbf{A}\mathbf{s}$ is the mean of the normal distribution and $\beta^{-1}\mathbf{I}$ is its covariance. The likelihood of an observation occurring is given by

$$p(\mathbf{x} \mid \mathbf{A}) = \int p(\mathbf{x} \mid \mathbf{A}, \mathbf{s})\, p(\mathbf{s})\, d\mathbf{s}. \tag{11}$$

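Assuming that the distribution over sources has a single dominant peak, in this case given by the maximum likelihood source estimates $\hat{\mathbf{s}} = (\mathbf{A}^\top\mathbf{A})^{-1}\mathbf{A}^\top\mathbf{x}$, the integral in (11) can be analysed by using a simplified (computationally efficient) variant of Laplace's method, as shown in [21, 25]:

$$p(\mathbf{x} \mid \mathbf{A}) = \int p(\mathbf{x} \mid \mathbf{A}, \mathbf{s})\, p(\mathbf{s})\, d\mathbf{s} \approx p(\mathbf{x} \mid \mathbf{A}, \hat{\mathbf{s}})\, p(\hat{\mathbf{s}})\, (2\pi)^{M/2} \det(\mathbf{G})^{-1/2}, \tag{12}$$

where $\mathbf{G}$ is the Hessian matrix

$$\mathbf{G} = -\left[\frac{\partial^2 \log p(\mathbf{x} \mid \mathbf{A}, \mathbf{s})}{\partial s_i\, \partial s_j}\right]_{\mathbf{s}=\hat{\mathbf{s}}}. \tag{13}$$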

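Taking the log of the expanded form of (10) gives

$$\log p(\mathbf{x} \mid \mathbf{A}, \mathbf{s}) = \frac{N}{2}\log\left(\frac{\beta}{2\pi}\right) - \frac{\beta}{2}(\mathbf{x} - \mathbf{A}\mathbf{s})^\top(\mathbf{x} - \mathbf{A}\mathbf{s}), \tag{14}$$

which, via (13), results in $\mathbf{G} = \beta\mathbf{A}^\top\mathbf{A}$. The log-likelihood $\ell \equiv \log p(\mathbf{x} \mid \mathbf{A})$ is therefore

$$\ell = \frac{N - M}{2}\log\left(\frac{\beta}{2\pi}\right) - \frac{\beta}{2}(\mathbf{x} - \mathbf{A}\hat{\mathbf{s}})^\top(\mathbf{x} - \mathbf{A}\hat{\mathbf{s}}) + \log p(\hat{\mathbf{s}}) - \frac{1}{2}\log\det(\mathbf{A}^\top\mathbf{A}). \tag{15}$$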

By using (8), we obtain $\mathbf{A} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{Q}^\top\mathbf{D}^{-1}$. Hence the log-likelihood becomes [21]

$$\ell = \frac{N - M}{2}\log\left(\frac{\beta}{2\pi e}\right) + \log p(\hat{\mathbf{s}}) + \log\det(\boldsymbol{\Sigma}^{-1}\mathbf{D}), \tag{16}$$

where $e$ is the average reconstruction error. Noting that we use a reciprocal cosh source model (as given by (5)), it can be shown that taking the derivative of this log-likelihood expression with respect to $\mathbf{D}$ and $\mathbf{J}$ (which parameterises $\mathbf{Q}$), and following the resulting likelihood gradient using a BFGS optimiser, makes it possible to efficiently compute an optimum value for the ICA unmixing matrix; details of this procedure are presented in [24].

3.2. Dynamic Mixing. The standard (offline) ICA model uses all available data samples at times $t = 1, 2, \ldots, T$ of the observed signals $\mathbf{x}(t)$ to estimate a single static unmixing matrix $\mathbf{W}$. The unmixing matrix obtained provides a good estimate of the mixing process for the complete time series and is well suited for offline data analysis. However, many time series, such as financial data streams, are highly dynamic in nature with rapidly changing properties and therefore require a source separation method that can be used in a sequential manner. This issue is addressed here by using a sliding-window ICA model [26]. This model makes use of a sliding-window approach to sequentially update the current unmixing matrix using information contained in the previous window and can easily handle nonstationary data. The unmixing matrix for the current window, $\mathbf{W}(t)$, is used as a prior for computing the unmixing matrix for the next window, $\mathbf{W}(t + 1)$. This results in significant computational efficiency, as fewer iterations are required to obtain an optimum value for $\mathbf{W}(t + 1)$. The algorithm also results in an improvement in the source separation results obtained when the mixing process is drifting and addresses the ICA permutation and sign ambiguity issues [21] by maintaining a fixed (but of course arbitrary) ordering of recovered sources through time.

4. Information Coupling

We now proceed to derive the ICA-based information coupling metric. Later in this section, we discuss the practical advantages this metric offers when used to analyse multivariate financial time series.

4.1. Coupling Metric. Let $\mathbf{W}$ be any arbitrary square ICA unmixing matrix:

$$\mathbf{W} \in \mathbb{R}^{N \times N}. \tag{17}$$

(For the purpose of brevity and clarity, we only consider the case of square mixing while deriving the metric here. However, the metric derived in this section is valid for nonsquare mixing as well, and the corresponding derivation can be undertaken using a similar approach to that presented here, converting the nonsquare ICA unmixing matrices in each instance into square matrices by padding them with zeros.)

Since multiplication of $\mathbf{W}$ by a diagonal matrix does not affect the mutual information of the recovered sources, we row-normalise the unmixing matrix in order to address the ICA scale indeterminacy problem (most ICA algorithms, including icadec, suffer from the scale indeterminacy problem; that is, the variances of the independent components cannot be determined, because both the unmixing matrix and the source signals are unknown and any scalar multiplication on either will be lost in the mixing process). Row normalisation implies that the elements $w_{ij}$ of the unmixing matrix $\mathbf{W}$ are constrained such that each row of the matrix is of unit length, that is,

$$\sum_{j=1}^{N} w_{ij}^2 = 1 \tag{18}$$

for all rows $i$. For a set of observed signals to be completely decoupled, their latent independent components must be the same as the observed signals; therefore the row-normalised unmixing matrix for decoupled signals ($\mathbf{W}_0$) must be a permutation of the identity matrix ($\mathbf{I}$):

$$\mathbf{W}_0 = \mathbf{P}\mathbf{I} \in \mathbb{R}^{N \times N}, \tag{19}$$

where $\mathbf{P}$ is a permutation matrix. For the case where the observed signals are completely coupled, all the latent independent components must be the same; therefore the row-normalised unmixing matrix for completely coupled signals ($\mathbf{W}_1$) is given by

$$\mathbf{W}_1 = \frac{1}{\sqrt{N}}\mathbf{K} \in \mathbb{R}^{N \times N}, \tag{20}$$

where $\mathbf{K}$ is the unit matrix (a matrix of ones).

To calculate coupling, we need to consider the distance between any arbitrary unmixing matrix ($\mathbf{W}$) and the zero coupling matrix ($\mathbf{W}_0$). The distance measure we use is the generalised 2-norm, also called the spectral norm, of the difference between the two matrices [27], although we can use some other norms as well to get similar results. The spectral norm of a matrix corresponds to its largest singular value and is the matrix equivalent of the vector Euclidean norm. Hence the distance $d(\mathbf{W}, \mathbf{W}_0)$ between the two matrices can be written as

$$d(\mathbf{W}, \mathbf{W}_0) = \|\mathbf{W} - \mathbf{W}_0\|_2, \tag{21}$$

where $\|\cdot\|_2$ is the spectral norm of the matrix. As $\mathbf{W}_0$ is a permutation of the identity matrix,

$$d(\mathbf{W}, \mathbf{W}_0) = \|\mathbf{W} - \mathbf{P}\mathbf{I}\|_2. \tag{22}$$

As the spectral norm of a matrix is independent of its permutations, we may define another permutation matrix ($\tilde{\mathbf{P}}$) such that

$$d(\mathbf{W}, \mathbf{W}_0) = \|\tilde{\mathbf{P}}\mathbf{W} - \mathbf{I}\|_2. \tag{23}$$

For this equation, the following equality holds:

$$d(\mathbf{W}, \mathbf{W}_0) = \|\tilde{\mathbf{P}}\mathbf{W}\|_2 - 1. \tag{24}$$

Again, noting that $\|\tilde{\mathbf{P}}\mathbf{W}\|_2 = \|\mathbf{W}\|_2$, we have

$$d(\mathbf{W}, \mathbf{W}_0) = \|\mathbf{W}\|_2 - 1. \tag{25}$$

We normalise this measure with respect to the range over which the distance measure can vary, that is, the distance between matrices representing completely coupled ($\mathbf{W}_1$) and decoupled ($\mathbf{W}_0$) signals. From (20) we have that $\mathbf{W}_1 = (1/\sqrt{N})\mathbf{K}$; therefore

$$d(\mathbf{W}_1, \mathbf{W}_0) = \|\mathbf{W}_1 - \mathbf{W}_0\|_2 = \left\|\frac{1}{\sqrt{N}}\mathbf{K} - \mathbf{P}\mathbf{I}\right\|_2. \tag{26}$$

Using the same analysis as presented previously, this equation can be simplified to

$$d(\mathbf{W}_1, \mathbf{W}_0) = \left\|\frac{1}{\sqrt{N}}\mathbf{K}\right\|_2 - 1. \tag{27}$$

For an $N$-dimensional square unit matrix, the spectral norm is given by $\|\mathbf{K}\|_2 = N$. Therefore, for a row-normalised unit matrix, the spectral norm is $\|(1/\sqrt{N})\mathbf{K}\|_2 = \sqrt{N}$. Hence, if $\mathbf{W}$ is row-normalised, (27) can be written as

$$d(\mathbf{W}_1, \mathbf{W}_0) = \sqrt{N} - 1. \tag{28}$$

The normalised information coupling metric ($\eta$) is then defined as

$$\eta = \frac{d(\mathbf{W}, \mathbf{W}_0)}{d(\mathbf{W}_1, \mathbf{W}_0)}. \tag{29}$$

Substituting (25) and (28) into (29), the normalised information coupling between $N$ observed signals is given by

$$\eta = \frac{\|\mathbf{W}\|_2 - 1}{\sqrt{N} - 1}. \tag{30}$$
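Equations (18) and (30) translate into a few lines of NumPy. The following is a minimal sketch for a square unmixing matrix with $N > 1$; the function name `information_coupling` and the example matrices are illustrative and not part of the paper.

```python
import numpy as np

def information_coupling(W):
    """Coupling metric eta of (30) for a square unmixing matrix W (N > 1)."""
    W = np.asarray(W, dtype=float)
    N = W.shape[0]
    # Row-normalise so that each row has unit Euclidean length, as in (18).
    W = W / np.linalg.norm(W, axis=1, keepdims=True)
    spectral_norm = np.linalg.norm(W, 2)   # largest singular value
    return (spectral_norm - 1.0) / (np.sqrt(N) - 1.0)

eta_decoupled = information_coupling(np.eye(3))          # W_0: identity matrix
eta_coupled = information_coupling(np.ones((3, 3)))      # W_1: matrix of ones, rescaled
eta_partial = information_coupling([[1.0, 0.5],
                                    [0.5, 1.0]])         # an intermediate case
```

The two limiting cases reproduce the end points of the metric, with the identity matrix giving zero coupling and the matrix of ones giving full coupling.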

We can consider the bounds of $\eta$ as described below. Suppose $\mathbf{M}$ is an arbitrary real matrix. We can look upon the spectral norm of this matrix ($\|\mathbf{M}\|_2 = \|\mathbf{M} - \mathbf{0}\|_2$) as a measure of departure (distance) of $\mathbf{M}$ from a null matrix ($\mathbf{0}$) [28]. The bounds on this norm are given by

$$0 \le \|\mathbf{M}\|_2 \le \infty. \tag{31}$$

If $\mathbf{M}$ is a row-normalised ICA unmixing matrix ($\mathbf{W}$), then (as discussed earlier) it lies between $\mathbf{W}_0$ and $\mathbf{W}_1$. Hence the bounds on $\mathbf{W}$ are

$$\|\mathbf{W}_0\|_2 \le \|\mathbf{W}\|_2 \le \|\mathbf{W}_1\|_2. \tag{32}$$

Using (19) and (20), we can write

$$\|\mathbf{P}\mathbf{I}\|_2 \le \|\mathbf{W}\|_2 \le \left\|\frac{1}{\sqrt{N}}\mathbf{K}\right\|_2, \tag{33}$$

which can be simplified to

$$1 \le \|\mathbf{W}\|_2 \le \sqrt{N}. \tag{34}$$

Rearranging terms in this inequality gives

$$0 \le \frac{\|\mathbf{W}\|_2 - 1}{\sqrt{N} - 1} \le 1, \tag{35}$$

which gives us the same coupling metric as in (30) and shows that the metric is normalised, that is, $0 \le \eta \le 1$. For real-valued $\mathbf{W}$, $\|\mathbf{W}\|_2$ can be written as

$$\|\mathbf{W}\|_2 = \sqrt{\lambda_{\max}(\mathbf{W}^\top\mathbf{W})}, \tag{36}$$

where $\lambda_{\max}(\mathbf{W}^\top\mathbf{W})$ is the maximum eigenvalue of $\mathbf{W}^\top\mathbf{W}$. The unmixing matrix obtained using most ICA algorithms, including icadec, suffers from row permutation and sign ambiguity problems; that is, the rows are arranged in a random order and the sign of elements in each row is unknown [21, 24]. We note that, as $\|\mathbf{W}\|_2$ is independent of the sign and permutations of the rows of $\mathbf{W}$, our measure of information coupling directly addresses the problems of ICA sign and permutation ambiguities. Also, as the metric's value is independent of the row permutations of $\mathbf{W}$, it provides symmetric results. The information coupling metric is valid for all dimensions of the unmixing matrix $\mathbf{W}$. This implies that information coupling can be easily measured in high-dimensional spaces.
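The invariance to row order and row sign can be checked numerically using the eigenvalue form (36). In the sketch below, the random matrix, the permutation, and the sign pattern are arbitrary illustrative choices.

```python
import numpy as np

def spectral_norm(W):
    """||W||_2 = sqrt(lambda_max(W^T W)), as in (36)."""
    return float(np.sqrt(np.max(np.linalg.eigvalsh(W.T @ W))))

rng = np.random.default_rng(2)
W = rng.normal(size=(4, 4))
W /= np.linalg.norm(W, axis=1, keepdims=True)   # row-normalise, as in (18)

perm = rng.permutation(4)                        # arbitrary row ordering
signs = rng.choice([-1.0, 1.0], size=4)          # arbitrary row signs
W_shuffled = signs[:, None] * W[perm]            # reorder rows, then flip row signs

norm_original = spectral_norm(W)
norm_shuffled = spectral_norm(W_shuffled)
```

Reordering or negating rows leaves $\mathbf{W}^\top\mathbf{W}$ unchanged, which is why the two norms agree, and the value also stays within the bounds of (34).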

It is possible to obtain a measure of uncertainty in our estimation of the information coupling measure. We make use of the BFGS quasi-Newton optimisation approach over the most probable skew-symmetric matrix $\mathbf{J}$ of (9), from which estimates for the unmixing matrix $\mathbf{W}$ can be obtained, and thence the coupling measure $\eta$ calculated. We also estimate the Hessian (inverse covariance) matrix $\mathbf{H}$ for $\mathbf{J}$ as part of this process. Hence it is possible to draw samples $\mathbf{J}'$, say, from the distribution over $\mathbf{J}$ as a multivariate normal:

$$\mathbf{J}' \sim \mathcal{MN}(\mathbf{J}, \mathbf{H}^{-1}). \tag{37}$$

These samples can be readily transformed to samples in $\eta$ using (9), (8), and (30), respectively. Confidence bounds (here we use the 95% bounds) may then be easily obtained from the set of samples for $\eta$ (in our analysis we use 100 samples).
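The sampling procedure can be sketched as follows. Everything numerical here is a placeholder rather than output from a fitted model: `theta_hat` stands for the above-diagonal elements of the most probable $\mathbf{J}$, `H` for the Hessian over those elements, and `d`, `sigma`, and `U` for the remaining factors in (8). The helper names are illustrative.

```python
import numpy as np

def orthogonal_from_params(theta, M):
    """Q = exp(J) for the skew-symmetric J built from above-diagonal parameters."""
    J = np.zeros((M, M))
    J[np.triu_indices(M, k=1)] = theta
    J = J - J.T
    lam, V = np.linalg.eigh(1j * J)              # i*J is Hermitian
    return ((V * np.exp(-1j * lam)) @ V.conj().T).real

def information_coupling(W):
    """Coupling metric eta of (30) after row normalisation (18)."""
    W = W / np.linalg.norm(W, axis=1, keepdims=True)
    return (np.linalg.norm(W, 2) - 1.0) / (np.sqrt(W.shape[0]) - 1.0)

rng = np.random.default_rng(3)
M = 3
theta_hat = np.array([0.3, -0.2, 0.1])           # hypothetical optimum for J
H = np.diag([200.0, 150.0, 300.0])               # hypothetical Hessian at the optimum
d = np.ones(M)                                   # hypothetical diagonal of D
sigma = np.array([3.0, 2.0, 1.0])                # hypothetical singular values
U, _ = np.linalg.qr(rng.normal(size=(M, M)))     # hypothetical orthogonal U

def eta_from_theta(theta):
    Q = orthogonal_from_params(theta, M)                   # (9)
    W = np.diag(d) @ Q @ np.diag(1.0 / sigma) @ U.T        # (8)
    return information_coupling(W)                         # (30)

# Draw 100 samples J' ~ MN(J, H^{-1}) as in (37) and map each one to eta.
samples = rng.multivariate_normal(theta_hat, np.linalg.inv(H), size=100)
etas = np.array([eta_from_theta(th) for th in samples])
lower, upper = np.percentile(etas, [2.5, 97.5])            # 95% bounds
eta_hat = eta_from_theta(theta_hat)
```

Percentiles of the sampled values are one simple way to read off the 95% bounds; the paper does not specify how the bounds are extracted from the samples, so this choice is an assumption.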

4.2. Computational Complexity. The information coupling algorithm achieves computational efficiency by making use of the sliding-window based decorrelating manifold approach to ICA. Making use of the reciprocal cosh based ICA source model also results in significant computational advantages. We now take a look at the comparative computational complexity of the information coupling measure and three frequently used measures of statistical dependence, that is, linear correlation, rank correlation, and mutual information. For bivariate data ($n_s$ samples long), for which these four measures are directly comparable, linear correlation and rank correlation have time complexities of order $O(n_s)$ and $O(n_s \log n_s)$, respectively [29], while mutual information and information coupling scale as $O(n_s^2)$ and $O(n_s)$, respectively (there have been various estimation algorithms proposed for efficient computation of mutual information; however, they all result in increased estimation errors and require careful selection of various user-defined parameters [30]) [31]. Hence, even though the time complexity of the information coupling measure is of the same order as linear correlation, it can still accurately capture statistical dependencies in non-Gaussian data streams and is a computationally efficient proxy for mutual information.

For $N$-dimensional multivariate data, direct computation of mutual information has time complexity of order $O(n_s^N)$, compared to $O(n_s N^3)$ for the information coupling measure. In high dimensions, even an approximation for mutual information can be computationally very costly. For example, using a Parzen-window density estimator, the mutual information computational complexity can be reduced to $O(n_s n_b^N)$, where $n_b$ is the number of bins used for estimation [32], which will incur a very high computational cost even for relatively small values of $N$, $n_b$, and $n_s$. As a simple example, Table 1 shows a comparison of computation time (in seconds) taken by mutual information and information coupling measures for analysing bivariate data sets of varying lengths. As expected, mutual information estimation using the Parzen window based approach (which is considered to be a relatively efficient approach to compute mutual information) becomes computationally very demanding with an increase in the number of samples of the bivariate data set. In contrast, the information coupling measure is computationally efficient even when used to analyse very large high-dimensional multivariate data sets.

Table 1: Example showing comparison of average computation time (in seconds) of mutual information and information coupling when these measures are used to analyse bivariate data sets containing different numbers of samples ($n_s$). The approach used to estimate mutual information is based on a Parzen window based algorithm, as described in [45]. The computational cost of this algorithm is dependent on the window size ($h$). The values of $h$ used for the simulations are $h = 20$ for $n_s = 10^2$ and $h = 100$ for all other simulations. Results are obtained using a 2.66 GHz processor as an average of 100 simulations.

Computation time (sec)    n_s = 10^2    n_s = 10^3    n_s = 10^4    n_s = 10^5
Mutual information        0.0214        2.8289        17.0101       119.5088
Information coupling      0.0073        0.0213        0.0561        0.5543

4.3. Capturing Market Dynamics. To dynamically analyse information coupling in multivariate financial data streams, we need to make use of windowing techniques. Financial markets give rise to well-defined events, such as orders, trades, and quote revisions. These events are irregularly spaced in clock-time; that is, they are asynchronous. Statistical models in clock-time make use of data aggregated over fixed intervals of time [33]. The time at which these events are indexed is called the event-time. Hence, for dynamic modelling in event-time, the number of data points can be regarded as fixed while time varies, whereas in clock-time the time period is considered to be fixed with a variable number of data points. Although we may need adaptive windows in clock-time, we can use sliding-windows of fixed length in event-time. Using fixed-length sliding-windows in event-time can be useful for obtaining consistent results when developing and testing different statistical models. Also, statistical models deployed for online analysis of financial data operate best in event-time, as they often need to make decisions as soon as some new market information (such as a quote update) becomes available. Consider an online trading model making use of an adaptive window in event-time. At specific times of the day, for example, at times of major news announcements, trading volume can significantly increase. Hence more data will be available to the algorithm, and thus the results obtained can be misleading [34]. Using a sliding-window of fixed length in event-time can overcome this problem. The length of the sliding-window needs to be selected appropriately. The financial application for which the model is being used is one of the factors which drives the choice of window length. As a general rule, for trading models a window of approximately the same size as the average time period between placing trades (the inverse of the trading frequency) is often used. This makes it possible to accurately capture the rapidly evolving dynamics of the markets over the corresponding period, without the window being so long that it captures only major trends or so short that it captures mostly noise in the data.
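A fixed-length sliding-window in event-time can be sketched in plain Python as below. The generator name `event_time_windows`, the window length, and the quote values are made up for illustration; the point is that each window always holds the same number of events while the clock-time it spans varies.

```python
from collections import deque

def event_time_windows(events, length):
    """Yield the most recent `length` events each time a new event arrives."""
    window = deque(maxlen=length)     # oldest event drops out automatically
    for event in events:
        window.append(event)
        if len(window) == length:     # wait until the first full window
            yield list(window)

# Irregularly spaced quote updates as (timestamp_seconds, mid_price) pairs.
quotes = [(0.0, 1.3000), (0.4, 1.3001), (0.5, 1.3003),
          (3.2, 1.3002), (3.3, 1.3004), (9.0, 1.3001)]

windows = list(event_time_windows(quotes, length=4))

# Clock-time covered by each window: fixed event count, varying duration.
spans = [w[-1][0] - w[0][0] for w in windows]
```

In an online setting, each yielded window would be passed to the coupling estimator as soon as the new event arrives.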

4.4. Discussion. The information coupling model offers multiple advantages when used to analyse multivariate financial data. Here we summarise some of the main properties of the model, while the empirical results presented in the next section showcase some of its practical benefits.

(i) The information coupling measure, a proxy for mutual information, is able to accurately pick up statistical dependencies in data sets with non-Gaussian distributions (such as financial returns).

(ii) The information coupling algorithm is computationally efficient, which makes it particularly suitable for use in an online dynamic environment. This makes the algorithm especially attractive when dealing with data sampled at high frequencies, because with the ever-increasing use of high-frequency data, overcoming sources of latency is of utmost importance in a variety of applications in modern financial markets.

(iii) It gives confidence levels on the information coupling measure. This allows us to estimate the uncertainty associated with the measurements.

(iv) The metric provides normalised results; that is, information coupling ranges from 0 for decoupled systems to 1 for completely coupled systems. This makes it easier to analyse results obtained using the metric and to compare its performance with other similar measures of association. The metric also gives symmetric results.

(v) The metric is valid for any number of signals in high-dimensional spaces; that is, it consistently gives accurate results irrespective of the number of time series between which information coupling is being computed. This makes it suitable for a range of financial applications.

(vi) It is not data intensive; that is, it gives relatively accurate results even when a small sample size is used. This allows the metric to model the complex and rapidly changing dynamics of financial markets.

(vii) It does not depend on user-defined parameters. Such parameters can restrict the practical utility of a measure, as evolving market conditions may require them to be constantly updated, which may not be practical.

5 Results

We now present a set of synthetic and financial data examples showing the relative accuracy and practical utility of the information coupling measure. The following notations are used for different measures of statistical dependence in this paper: ICA-based information coupling ($\eta$), linear correlation ($\rho$), rank correlation ($\rho_R$), and normalised mutual information ($I_N$). Normalisation of mutual information values ($I$) is achieved using the following transformation [35]:

$$I_N = \sqrt{1 - \exp(-2I)}. \tag{38}$$
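As a minimal sketch of the transformation in (38), the function below maps a mutual information value to the normalised range. The function name is hypothetical, and the code assumes that $I$ is measured in nats. The check uses the standard result that a bivariate Gaussian with correlation $\rho$ has $I = -\frac{1}{2}\ln(1 - \rho^2)$, in which case (38) returns $|\rho|$.

```python
import math


def normalised_mutual_information(mi: float) -> float:
    """Map a mutual information value I >= 0 (assumed to be in nats) to [0, 1) via eq. (38)."""
    if mi < 0:
        raise ValueError("mutual information must be non-negative")
    return math.sqrt(1.0 - math.exp(-2.0 * mi))


# Sanity check on a bivariate Gaussian, where I = -0.5 * ln(1 - rho^2),
# so the normalised value should equal |rho|.
rho = 0.6
mi_gauss = -0.5 * math.log(1.0 - rho ** 2)
print(normalised_mutual_information(mi_gauss))
```

This is why $I_N$ can be compared directly with correlation-type measures on a common scale.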

The financial data examples we present in this paper primarily make use of spot foreign exchange (FX) data. Spot price, or spot rate, is the price which is actually quoted for a currency transaction to take place. Return is the profit realised when a currency is traded. It is common practice to use the normalised log-returns of financial data in statistical analysis instead of the raw data itself [36]. Using log-returns makes it possible to convert exponential problems into linear ones, thus significantly simplifying relevant analysis. A normalised log-returns data set, with a mean of zero and unit variance, can generally be regarded as a locally stationary process [37]. Therefore, many signal processing techniques meant solely for stationary processes can be successfully applied to the normalised log-returns time series in an adaptive environment. Denoting the mid-price at time $t$ of a financial time series by $P_t$, the log-return value is given by

$$r_t = \ln\left(\frac{P_t}{P_{t-1}}\right). \tag{39}$$

The normalisation of log-returns is achieved by converting the data to a form such that it has a mean of zero and unit variance. This is easily achieved by removing the mean and dividing by the standard deviation of the data.
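The preprocessing described above can be sketched as follows. The function name and the price values are made up for illustration, and the use of the population standard deviation (NumPy's default, `ddof=0`) is an assumption, since the text does not say which estimator is used.

```python
import numpy as np


def normalised_log_returns(prices):
    """Return zero-mean, unit-variance log-returns r_t = ln(P_t / P_{t-1}), as in eq. (39)."""
    prices = np.asarray(prices, dtype=float)
    if prices.ndim != 1 or prices.size < 3:
        raise ValueError("need a one-dimensional series with at least three prices")
    if np.any(prices <= 0):
        raise ValueError("prices must be strictly positive")
    returns = np.diff(np.log(prices))      # ln(P_t) - ln(P_{t-1})
    spread = returns.std()                 # population standard deviation (assumption)
    if spread == 0:
        raise ValueError("returns are constant; cannot normalise")
    return (returns - returns.mean()) / spread


# Hypothetical mid-prices, for illustration only.
mid_prices = [1.3050, 1.3052, 1.3049, 1.3055, 1.3051, 1.3058]
z = normalised_log_returns(mid_prices)
print(z.mean(), z.std())
```

In a sliding-window setting the same normalisation would be applied to each window separately, so that the zero-mean, unit-variance property holds locally.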

5.1. Synthetic Data. There is no single distribution which fits financial returns, especially those sampled at higher frequencies, which tend to be highly non-Gaussian [2], although there have been attempts to model returns using a variety of distributions [38]. In this paper, we aim to capture the heavy-tailed, skewed properties of financial returns using a Pearson type IV distribution [39, 40], which can be used to represent distributions with varying degrees of skewness and kurtosis and thus are useful for representing distributions of financial returns [41, 42]. The first four moments of the distribution can be uniquely determined by setting four parameters which characterise the distribution [43]. Until recently, due to its mathematical and computational complexity, this distribution has not been widely used in financial literature, although this is rapidly changing with advances in computational power and proposal of new improved analytical methods [39, 42, 44].

To test accuracy of various measures of dependence, we need to generate coupled non-Gaussian synthetic data with known, predefined correlation values. There is no straightforward way to simulate correlated random variables when their joint distribution is not known [46], as is the case with multivariate financial returns. One possible method that can be used to induce any desired, predefined correlation between independent, randomly distributed variables, irrespective of their distributions, is commonly known as the Iman-Conover method, as presented in [47]. This method is based on


Table 2: Table showing accuracy of four measures of dependence, that is, information coupling ($\eta$), linear correlation ($\rho$), rank correlation ($\rho_R$), and normalised mutual information ($I_N$), when used to estimate the level of dependence in a coupled system with varying levels of true correlation ($\rho_{TC}$). $\rho_{TC}$ is induced in an independent, bivariate, randomly distributed system using the Iman-Conover method, as described in the text. The dependence estimates, together with their standard deviation values, shown in the table are obtained using 1000 independent simulations, using 1000 data points long data sets for each simulation. The last row of the table gives values for the mean absolute error (MAE).

| $\rho_{TC}$ | $\eta$ | $\rho$ | $\rho_R$ | $I_N$ | $\lvert\rho_{TC} - \eta\rvert$ | $\lvert\rho_{TC} - \rho\rvert$ | $\lvert\rho_{TC} - \rho_R\rvert$ | $\lvert\rho_{TC} - I_N\rvert$ |
|---|---|---|---|---|---|---|---|---|
| 0 | 0.0046 ± 0.0026 | 0.0038 ± 0.0022 | 0.0147 ± 0.0115 | 0.1851 ± 0.0401 | 0.0046 | 0.0038 | 0.0147 | 0.1851 |
| 0.1 | 0.1079 ± 0.0073 | 0.0917 ± 0.0058 | 0.0942 ± 0.0184 | 0.1687 ± 0.0412 | 0.0079 | 0.0083 | 0.0058 | 0.0687 |
| 0.2 | 0.2145 ± 0.0102 | 0.1862 ± 0.0093 | 0.1899 ± 0.0177 | 0.1131 ± 0.0464 | 0.0145 | 0.0138 | 0.0111 | 0.0869 |
| 0.3 | 0.3176 ± 0.0138 | 0.2812 ± 0.0130 | 0.2863 ± 0.0167 | 0.1644 ± 0.0538 | 0.0176 | 0.0188 | 0.0137 | 0.1356 |
| 0.4 | 0.4166 ± 0.0210 | 0.3764 ± 0.0165 | 0.3835 ± 0.0153 | 0.2787 ± 0.0500 | 0.0166 | 0.0236 | 0.0165 | 0.1213 |
| 0.5 | 0.5135 ± 0.0196 | 0.4720 ± 0.0197 | 0.4818 ± 0.0136 | 0.3819 ± 0.0475 | 0.0135 | 0.0280 | 0.0182 | 0.1181 |
| 0.6 | 0.6070 ± 0.0347 | 0.5688 ± 0.0226 | 0.5815 ± 0.0117 | 0.4788 ± 0.0458 | 0.0070 | 0.0312 | 0.0185 | 0.1212 |
| 0.7 | 0.7011 ± 0.0232 | 0.6670 ± 0.0242 | 0.6830 ± 0.0093 | 0.5728 ± 0.0483 | 0.0011 | 0.0330 | 0.0170 | 0.1272 |
| 0.8 | 0.7936 ± 0.0239 | 0.7676 ± 0.0249 | 0.7864 ± 0.0066 | 0.6652 ± 0.0490 | 0.0064 | 0.0324 | 0.0136 | 0.1348 |
| 0.9 | 0.8864 ± 0.0252 | 0.8713 ± 0.0249 | 0.8919 ± 0.0034 | 0.7595 ± 0.0501 | 0.0136 | 0.0287 | 0.0081 | 0.1405 |
| 1.0 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000 | 0.9601 ± 0.0185 | 0.0000 | 0.0000 | 0.0000 | 0.0399 |
| MAE | | | | | 0.0093 | 0.0201 | 0.0125 | 0.1163 |

inducing a known dependency structure in samples taken from the input independent marginal distributions using reordering techniques. The multivariate coupled structure obtained as the output can thus be used as the input data in various models of dependency analysis to test their relative accuracies.
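The reordering idea can be sketched as follows. This is an illustrative implementation of the general Iman-Conover procedure under stated assumptions, not the code used to produce Table 2: the function name is hypothetical, van der Waerden (normal) scores are used as the reference scores, and Student-t samples stand in for the Pearson type IV marginals, which NumPy does not provide. Note that the procedure works on rank order, so the marginals are preserved exactly while the target correlation is matched approximately.

```python
import numpy as np
from statistics import NormalDist


def iman_conover(samples, target_corr, rng):
    """Reorder each column of `samples` so that the columns have roughly `target_corr`."""
    x = np.asarray(samples, dtype=float)
    n, k = x.shape
    # Normal scores, shuffled independently for each column.
    scores = np.array([NormalDist().inv_cdf(i / (n + 1)) for i in range(1, n + 1)])
    m = np.column_stack([rng.permutation(scores) for _ in range(k)])
    # Remove the residual sample correlation of the scores, then impose the target.
    f = np.linalg.cholesky(np.corrcoef(m, rowvar=False))
    p = np.linalg.cholesky(np.asarray(target_corr, dtype=float))
    t = m @ np.linalg.inv(f).T @ p.T
    # Give each input column the same rank order as the matching column of t.
    out = np.empty_like(x)
    for j in range(k):
        ranks = np.argsort(np.argsort(t[:, j]))
        out[:, j] = np.sort(x[:, j])[ranks]
    return out


rng = np.random.default_rng(0)
n = 1000
# Heavy-tailed stand-ins for the independent marginals (assumption).
independent = np.column_stack([rng.standard_t(5, size=n), rng.standard_t(5, size=n)])
target = np.array([[1.0, 0.6], [0.6, 1.0]])
coupled = iman_conover(independent, target, rng)
print(np.corrcoef(coupled, rowvar=False)[0, 1])
```

Because only the ordering of the samples changes, the skewness and kurtosis of each marginal are untouched, which is the property the text relies on when mimicking financial returns.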

We set parameters of the Pearson type IV distribution such that the coupled synthetic data we obtain using the Iman-Conover method has similar properties to financial returns. But first we need to consider some properties of financial returns which we want to mimic. Figures 1(a) and 1(b) show the distributions of two higher-order moments, that is, skewness ($\gamma$) and kurtosis ($\kappa$), for FX spot data sets sampled at three different frequencies. The plots are obtained using a sliding-window of length 50 data points, as an average of all G10 (Group of Ten) currency pairs, covering a period of 8 hours in the case of 0.25 second and 0.5 second sampled data and 2 years in case of 0.5 hour sampled data. The non-Gaussian (heavy-tailed, skewed) nature of the data is clearly visible. It is interesting to note that the kurtosis value almost never goes below three for any of the data sets, signifying the temporal persistence of non-Gaussianity. We now make use of the Iman-Conover method to induce varying levels of correlation between 1000 samples taken from independent, randomly distributed Pearson type IV distributions. A 1000 sample data set makes it easier to accurately induce predefined correlations in the system, as well as makes it possible to generate data with relatively accurate average kurtosis and skewness values, hence allowing us to accurately capture the higher-order moments of financial returns. As an example, Figures 1(c) and 1(d) show distributions of the kurtosis and skewness of two variables for 1000 independent simulations. Note the similarity of these plots with the average of the corresponding distributions of higher-order moments for financial data, as presented in Figures 1(a) and 1(b). This shows the effectiveness of using synthetic data sampled from a Pearson type IV distribution for capturing higher-order moments of financial returns.

Four different approaches are now used to estimate the level of dependence between the output coupled data. The process is repeated 1000 times for each level of true correlation ($\rho_{TC}$). Table 2 presents the results obtained. The results show the accuracy of the information coupling measure when used to analyse non-Gaussian data. For this synthetic data example, on average, the information coupling measure was 53.7% more accurate than the linear correlation measure and 25.6% more accurate with respect to the rank correlation measure. The normalised mutual information provided the least accurate results.
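The text does not state how the relative accuracy figures are computed. They appear to be consistent with the relative reduction in MAE with respect to each baseline, taken from the last row of Table 2. The short check below is written under that assumption, and the helper name is hypothetical.

```python
# MAE values copied from the last row of Table 2.
mae = {"eta": 0.0093, "rho": 0.0201, "rho_R": 0.0125, "I_N": 0.1163}


def relative_improvement(baseline: float, candidate: float) -> float:
    """Fractional reduction in error of `candidate` relative to `baseline`."""
    return (baseline - candidate) / baseline


for name in ("rho", "rho_R"):
    gain = 100.0 * relative_improvement(mae[name], mae["eta"])
    print(f"eta vs {name}: {gain:.1f}% lower MAE")
```

Under this reading, the percentages are relative to the baseline error and not absolute differences in correlation units.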

We now extend this example by incorporating data dynamics. The same data generation process as described above is now used to construct a 32000 samples long bivariate data set in which the induced true correlation changes every 8000 time steps, that is, $\rho_{TC} = 0.2$ when $t = 1, \ldots, 8000$, $\rho_{TC} = 0.4$ when $t = 8001, \ldots, 16000$, $\rho_{TC} = 0.6$ when $t = 16001, \ldots, 24000$, and $\rho_{TC} = 0.8$ when $t = 24001, \ldots, 32000$. A 1000 data points wide sliding-window is used to dynamically measure dependencies in the data set. The resulting temporal information coupling plot, together with the 5th and 95th percentile confidence intervals, is presented in Figure 2(a). The four different coupling regions are clearly visible, together with the step changes in coupling after every 8000 time steps, showing ability of the algorithm to detect abrupt changes in coupling. The normalised empirical probability distributions over $\eta$ for the four coupling regions are shown in Figure 2(b). Also plotted in the same figure are the normalised empirical pdfs for linear correlation ($\rho$) and rank correlation ($\rho_R$). The mutual information pdf is omitted for clarity, as it gives relatively less accurate results, as presented in Table 2. It is interesting to see how the peaks of the $\eta$ distribution correspond very closely to $\rho_{TC}$ values, showing ability of the information coupling model to accurately capture statistical


[Figure 1 panels: (a) financial data, pdf($\kappa$) against kurtosis ($\kappa$); (b) financial data, pdf($\gamma$) against skewness ($\gamma$); legend for (a, b): 0.5 hour, 0.5 s, 0.25 s; (c) synthetic data, pdf($\kappa$) against kurtosis ($\kappa$); (d) synthetic data, pdf($\gamma$) against skewness ($\gamma$); legend for (c, d): $x_1$, $x_2$.]

Figure 1: (a, b) Plots showing the normalised pdfs of (a) kurtosis ($\kappa$) and (b) skewness ($\gamma$), obtained using a sliding-window of length 50 data points, as an average of all G10 currency pairs, covering a period of 8 hours in case of 0.25 second and 0.5 second sampled data and 2 years in case of 0.5 hour sampled data. The plots clearly show the heavy-tailed, skewed nature of FX returns at all three sampling frequencies. (c, d) An example of the normalised pdf plots showing the average kurtosis and skewness values, respectively, for synthetic data generated using Pearson type IV distributions. The data has properties similar to the average of the distributions presented in (a) and (b). The vertical lines indicate the kurtosis ($\kappa = 3$) and skewness ($\gamma = 0$) values for a Gaussian distribution.

dependencies in a dynamic environment. The least accurate measure in this example is the linear correlation.
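The structure of this experiment can be sketched as follows. The information coupling algorithm itself is not reproduced here, so the sketch uses linear correlation as a stand-in dependence measure and Gaussian data in place of the Pearson type IV samples. Both are assumptions made for brevity, as are the function and variable names. The percentiles printed at the end summarise the spread of the windowed estimates within each regime and are not the confidence intervals of Figure 2(a).

```python
import numpy as np


def rolling_correlation(x, y, window):
    """Pearson correlation of x and y over every full window of the given length."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    def rolling_sum(a):
        c = np.concatenate(([0.0], np.cumsum(a)))
        return c[window:] - c[:-window]

    sx, sy = rolling_sum(x), rolling_sum(y)
    cov = rolling_sum(x * y) - sx * sy / window
    var_x = rolling_sum(x * x) - sx * sx / window
    var_y = rolling_sum(y * y) - sy * sy / window
    return cov / np.sqrt(var_x * var_y)


rng = np.random.default_rng(1)
segment, window = 8000, 1000
true_corr = [0.2, 0.4, 0.6, 0.8]

# Build a 32000-sample bivariate series whose correlation steps every 8000 samples.
x_parts, y_parts = [], []
for rho in true_corr:
    a = rng.standard_normal(segment)
    b = rng.standard_normal(segment)
    x_parts.append(a)
    y_parts.append(rho * a + np.sqrt(1.0 - rho ** 2) * b)
x, y = np.concatenate(x_parts), np.concatenate(y_parts)

rho_t = rolling_correlation(x, y, window)  # rho_t[i] uses samples i .. i + window - 1

for k, rho in enumerate(true_corr):
    # Keep only windows that lie entirely inside regime k.
    inside = rho_t[k * segment:(k + 1) * segment - window + 1]
    low, median, high = np.percentile(inside, [5, 50, 95])
    print(f"rho_TC={rho:.1f}: median={median:.3f}, 5%={low:.3f}, 95%={high:.3f}")
```

Windows that straddle a change point mix two regimes, which is why a step change in the true correlation shows up as a ramp of one window length in the estimate.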

5.2. Financial Data. We first present a simple application of the information coupling algorithm to a section of 0.5 second sampled FX spot log-returns data set (in this paper, all currencies are referred to by their standardised international three-letter codes, as described by the ISO 4217 standard. For the currencies mentioned in this paper, the three-letter codes are USD (US dollar), EUR (Euro), JPY (Japanese yen), GBP (British pound), CHF (Swiss franc), and AUD (Australian dollar)). Figure 3 shows the variation of information coupling and linear correlation with time for EURUSD and GBPUSD. The results are obtained using a 5-minute wide sliding-window. We note that the two measures of dependence frequently give different results, which reflects on the inability of linear correlation to capture dependencies in non-Gaussian data streams. We also note that dependencies in FX log-returns exhibit rapidly changing dynamics, often characterised by regions of quasi-stability punctuated by abrupt


[Figure 2 panels: (a) $\eta(t)$ against $t$ ($\times 10^4$); legend: $\eta(t)$, $[\eta(t)]_{\text{median}}$, $[\eta(t)]^{95}_{5}$; (b) pdf (magnitude of dependency measure) against magnitude of dependency measure; legend: $\eta$, $\rho$, $\rho_R$, $[\eta]^{95}_{5}$.]

Figure 2: (a) Temporal variation of information coupling, $\eta(t)$, plotted as a function of time. Step changes in coupling are visible after every 8000 time steps, that is, $\rho_{TC} = 0.2$ when $t = 1, \ldots, 8000$, $\rho_{TC} = 0.4$ when $t = 8001, \ldots, 16000$, $\rho_{TC} = 0.6$ when $t = 16001, \ldots, 24000$, and $\rho_{TC} = 0.8$ when $t = 24001, \ldots, 32000$. Also plotted are the median, $[\eta(t)]_{\text{median}}$, and the 5% and 95% confidence interval contours, $[\eta(t)]^{95}_{5}$. (b) Normalised empirical pdf plots for information coupling ($\eta$), linear correlation ($\rho$), and rank correlation ($\rho_R$). The vertical lines represent the true correlation ($\rho_{TC}$) values. The relative accuracy of the information coupling measure is evident from these results.

[Figure 3 axes: magnitude of dependency measure against time (s); legend: $[\eta_t]^{95}_{5}$, information coupling ($\eta_t$), linear correlation ($\rho_t$).]

Figure 3: Information coupling ($\eta_t$) and linear correlation ($\rho_t$) plotted as a function of time for a section of 0.5 second sampled EURUSD and GBPUSD log-returns data set. A 5-minute wide sliding-window is used to obtain the results.

changes. In [48] we show that these regions of persistence in statistical dependence may be captured using a hidden Markov ICA model, which is a hidden Markov model with an ICA observation model. The information thus obtained can be useful for a range of financial applications, such as

[Figure 4 axes: adjusted closing prices ($P_t$) against date, 2005 to 2010; legend: AUDUSD, USDJPY.]

Figure 4: Daily closing mid-prices ($P_t$) of AUDUSD and USDJPY. The two plots are adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

finding regions of financial market volatility clustering and persistence [49].

There have been numerous academic studies on the causes and effects of the 2008 financial crisis [50, 51]. However, very few of these have focused on the impact of the crisis on interdependencies in the global FX market. Here we present a set of examples which give a unique insight into the


[Figure 5 panels: (a) magnitude of dependency measure against date, 2005 to 2010; legend: $[\eta_t]^{95}_{5}$, information coupling ($\eta_t$), linear correlation ($\rho_t$), rank correlation ($\rho_{R,t}$); (b) $\Delta[\eta_t]^{95}_{5}$ against date, 2005 to 2010.]

Figure 5: (a) Three approaches used to measure temporal dependencies between AUDUSD and USDJPY log-returns. (b) Range of confidence intervals ($\Delta[\eta]^{95}_{5} = \eta_{95} - \eta_{5}$) plotted as a function of time, showing the uncertainty associated with the information coupling measurements. The vertical lines correspond to the September 2008 financial meltdown.

effects of the crisis on the nature of statistical dependencies in the spot FX market. Accurate estimation of dependencies at times of financial crises is of utmost importance, as these estimates are used by financial practitioners for a range of tasks, such as rebalancing portfolios, accurately pricing options, and deciding on the level of risk-taking. We first present an application of the information coupling model for detecting temporal changes in dependencies in bivariate FX data streams at times of financial crises. Figure 4 shows the adjusted daily closing mid-prices ($P_t$) for AUDUSD and USDJPY from January 2005 till April 2010. The two plots clearly show an abrupt change in the exchange rates in September-October 2008. This occurred at the height of the 2008 global financial crisis and was due to the unwinding of carry trades [52]. Figure 5(a) displays three plots showing the temporal variation of information coupling ($\eta_t$), linear correlation ($\rho_t$), and rank correlation ($\rho_{R,t}$) between AUDUSD and USDJPY log-returns. The plots are obtained using a six-month long sliding-window. We notice the rise in uncertainty of the information coupling measure (Figure 5(b)) right before the crash, with uncertainty decreasing gradually thereafter; this information may be useful to systematically predict upheavals in the market, although we do not carry out this study in detail in this paper. Information about the level of uncertainty can be used as a measure of confidence in the information coupling values and can be useful in various practical decision making scenarios, such as deciding on the capital to deploy for the purpose of trading or selecting stocks (or currencies) for inclusion in a portfolio. As daily sampled data is generally less non-Gaussian than data sampled at higher frequencies, the three plots in Figure 5(a) are somewhat similar during certain time periods. However, right after the September 2008 crash, the plots significantly deviate from each other. We believe that this is because the nature of the data, in particular its level of non-Gaussianity, has changed. As shown in Figure 6, the distance measure, $(\eta_t - |\rho_t|)^2$, between information coupling and linear correlation closely matches the non-Gaussianity of the data under consideration. The two plots are adjusted for ease of comparison. The degree of non-Gaussianity is calculated

[Figure 6 axes: adjusted magnitude against date, 2005 to 2010; legend: $(\eta_t - |\rho_t|)^2$, $\mathrm{JB}^2_{\mathrm{AUDUSD},t} + \mathrm{JB}^2_{\mathrm{USDJPY},t}$.]

Figure 6: Difference between information coupling and linear correlation, plotted as a function of time. Also plotted is a measure of non-Gaussianity of the two time series, as defined by (40). The two plots are adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

using the multivariate Jarque-Bera statistic ($\mathrm{JB}_{MV}$), which we define for an $N$-dimensional multivariate data set as

$$\mathrm{JB}_{MV} = \sum_{j=1}^{N} \left[ \frac{n_s}{6} \left( \gamma_j^2 + \frac{(\kappa_j - 3)^2}{4} \right) \right]^2, \tag{40}$$

where $n_s$ is the number of data points (in this case the size of the sliding-window), $\gamma$ is the skewness of the data under analysis, and $\kappa$ is its kurtosis. This shows that relying solely on correlation measures to model dependencies in multivariate financial time series, even when using data sampled at relatively lower frequencies, can potentially lead to inaccurate results. In contrast, the information coupling model takes into account properties of the data being analysed, resulting in an accurate approach to measure statistical dependencies.
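A minimal sketch of (40) is given below. The function name is hypothetical, and the skewness and kurtosis are computed from population (biased) moment estimates, which is an assumption since the text does not specify the estimator. The comparison at the end uses Gaussian and Student-t samples purely as illustrative inputs.

```python
import numpy as np


def jarque_bera_mv(window_data):
    """Sum over columns of the squared univariate Jarque-Bera statistic, as in eq. (40)."""
    x = np.asarray(window_data, dtype=float)     # shape (n_s, N)
    n_s = x.shape[0]
    z = (x - x.mean(axis=0)) / x.std(axis=0)     # standardise each column
    skewness = (z ** 3).mean(axis=0)             # gamma_j
    kurtosis = (z ** 4).mean(axis=0)             # kappa_j (equals 3 for a Gaussian)
    jb = n_s / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    return float(np.sum(jb ** 2))


rng = np.random.default_rng(2)
gaussian_window = rng.standard_normal((500, 2))
heavy_tailed_window = rng.standard_t(3, size=(500, 2))
print(jarque_bera_mv(gaussian_window), jarque_bera_mv(heavy_tailed_window))
```

Evaluating the function on each position of a sliding window gives a time series of non-Gaussianity that can be set against the distance measure plotted in Figure 6.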

We now show utility of the information coupling model for analysing multivariate statistical dependencies. Figure 7 shows the temporal variation of information coupling between four major liquid currency pairs (EURUSD,


[Figure 7 axes: $\eta_t$ against $t$ (years), 2004 to 2010; legend: FTSE-100 index (adjusted), information coupling between EURUSD, GBPUSD, USDCHF, and USDJPY ($\eta_t$), $[\eta_t]^{95}_{5}$.]

Figure 7: Information coupling between EURUSD, GBPUSD, USDCHF, and USDJPY log-returns plotted as a function of time. Also plotted is the FTSE-100 index, which is adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

GBPUSD, USDCHF, and USDJPY). The results are obtained using daily log-returns for a seven-year period and a six-month long sliding-window. Also plotted on the same figure is the FTSE-100 (Financial Times Stock Exchange 100) index for the corresponding time period, which has been adjusted for ease of comparison. The plot clearly shows an abrupt upward shift in coupling between the four currency pairs right at the time of the September 2008 financial meltdown, with gradual decrease in coupling over the next year. We also notice an increase in the uncertainty associated with the information coupling measure before the 2008 crash. The increase in dependence of financial instruments in times of financial crises has been observed for other asset classes as well [53]. Our unique example, showing the dynamics of multivariate dependencies within the spot FX space, provides further insight into the nature of interdependencies in times of financial crisis.

6. Conclusions

We present an ICA-based approach to dynamically measure information coupling, as a proxy for mutual information. This approach makes use of ICA as a tool to capture information in the tails of the underlying distributions and is suitable for efficiently and accurately measuring statistical dependencies between multiple non-Gaussian signals. As far as we know, this is the first attempt to quantify multivariate dependencies using information encoded in the ICA unmixing matrix. Our proposed information coupling model has multiple other benefits associated with its practical use. It provides a framework for estimating confidence bounds on the information coupling metric, can be efficiently used to directly model dependencies in high-dimensional spaces, and gives normalised, symmetric results. It has the added advantage of not depending on any user-defined parameters and is not data intensive; that is, it can be used even with relatively small data sets without significantly affecting its performance, an important requirement for analysing data streams with rapidly changing dynamics, such as financial returns. The model makes use of a sliding-window based decorrelating manifold approach to ICA, with a reciprocal cosh source model, to infer the ICA unmixing matrix, which results in increased accuracy and efficiency of the algorithm. This gives the information coupling model a computational complexity similar to that of linear correlation, with the accuracy of mutual information.

Acknowledgments

The authors are grateful to the Oxford-Man Institute of Quantitative Finance for help in supporting this research. The first author would also like to thank Exeter College (University of Oxford) for funding.

References

[1] D. Berg and K. Aas, "Models for construction of multivariate dependence – a comparison study," The European Journal of Finance, vol. 15, no. 7-8, pp. 639–659, 2009.

[2] M. M. Dacorogna, An Introduction to High-Frequency Finance, Academic Press, New York, NY, USA, 2001.

[3] J. Hlinka, M. Palus, M. Vejmelka, D. Mantini, and M. Corbetta, "Functional connectivity in resting-state fMRI: is linear correlation sufficient?" NeuroImage, vol. 54, no. 3, pp. 2218–2225, 2011.

[4] S. J. Devlin, R. Gnanadesikan, and J. R. Kettenring, "Robust estimation and outlier detection with correlation coefficients," Biometrika, vol. 62, no. 3, pp. 531–545, 1975.

[5] A. R. Cowan, "Nonparametric event study tests," Review of Quantitative Finance and Accounting, vol. 2, no. 4, pp. 343–358, 1992.

[6] G. J. Glasser and R. F. Winter, "Critical values of the coefficient of rank correlation for testing the hypothesis of independence," Biometrika, vol. 48, no. 3-4, pp. 444–448, 1961.

[7] P. Embrechts, F. Lindskog, and A. McNeil, "Modelling dependence with copulas and applications to risk management," in Handbook of Heavy Tailed Distributions in Finance, vol. 1, chapter 8, pp. 329–384, 2003.

[8] E. Jondeau, S. H. Poon, and M. Rockinger, Financial Modeling Under Non-Gaussian Distributions, Springer, New York, NY, USA, 2007.

[9] J. D. Fermanian and O. Scaillet, "Some statistical pitfalls in copula modeling for financial applications," in Capital Formation, Governance and Banking, p. 59, 2005.

[10] T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley-Interscience, New York, NY, USA, 2006.

[11] A. Kraskov, H. Stogbauer, and P. Grassberger, "Estimating mutual information," Physical Review E, vol. 69, no. 6, Article ID 066138, 16 pages, 2004.

[12] S. Khan, S. Bandyopadhyay, and A. R. Ganguly, "Computing mutual information based nonlinear dependence among noisy and finite geophysical time series," in Proceedings of the American Geophysical Union Fall Meeting, 2005, abstract no. NG22A-03.

[13] M. M. V. Hulle, "Edgeworth approximation of multivariate differential entropy," Neural Computation, vol. 17, no. 9, pp. 1903–1910, 2005.


[14] L. B. Almeida, "Linear and nonlinear ICA based on mutual information," in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 117–122, IEEE, 2000.

[15] A. D. Back and A. S. Weigend, "A first application of independent component analysis to extracting structure from stock returns," International Journal of Neural Systems, vol. 8, no. 4, pp. 473–484, 1997.

[16] E. Oja, K. Kiviluoto, and S. Malaroiu, "Independent component analysis for financial time series," in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 111–116, IEEE, 2000.

[17] A. Hyvarinen, "Survey on independent component analysis," Neural Computing Surveys, vol. 2, no. 4, pp. 94–128, 1999.

[18] C. J. Lu, T. S. Lee, and C. C. Chiu, "Financial time series forecasting using independent component analysis and support vector regression," Decision Support Systems, vol. 47, no. 2, pp. 115–125, 2009.

[19] H. Joe, "Relative entropy measures of multivariate dependence," Journal of the American Statistical Association, vol. 84, no. 405, pp. 157–164, 1989.

[20] M. Novey and T. Adali, "Complex ICA by negentropy maximization," IEEE Transactions on Neural Networks, vol. 19, no. 4, pp. 596–609, 2008.

[21] S. Roberts and R. Everson, Independent Component Analysis: Principles and Practice, Cambridge University Press, New York, NY, USA, 2001.

[22] J. Palmer, K. Kreutz-Delgado, and S. Makeig, "Super-Gaussian mixture source model for ICA," in Independent Component Analysis and Blind Signal Separation, pp. 854–861, 2006.

[23] R. A. Choudrey and S. J. Roberts, "Variational mixture of Bayesian independent component analyzers," Neural Computation, vol. 15, no. 1, pp. 213–252, 2003.

[24] R. Everson and S. Roberts, "Independent component analysis: a flexible nonlinearity and decorrelating manifold approach," Neural Computation, vol. 11, no. 8, pp. 1957–1983, 1999.

[25] R. E. Kass, L. Tierney, and J. B. Kadane, "Laplace's method in Bayesian analysis," in Statistical Multiple Integration: Proceedings of the AMS-IMS-SIAM Joint Summer Research Conference [on Statistical Multiple Integration] Held at Humboldt University, June 17–23, 1989, vol. 115, p. 89, American Mathematical Society, 1991.

[26] W. Addison and S. Roberts, "Blind source separation with non-stationary mixing using wavelets," in Proceedings of the ICA Research Network Workshop, The University of Liverpool, 2006.

[27] N. J. Higham, "Matrix nearness problems and applications," in Applications of Matrix Theory, 1989.

[28] K. B. Datta, Matrix and Linear Algebra, PHI Learning, 2004.

[29] K. Boudt, J. Cornelissen, and C. Croux, "The Gaussian rank correlation estimator: robustness properties," Statistics and Computing, vol. 22, no. 2, pp. 471–483, 2012.

[30] D. Evans, "A computationally efficient estimator for mutual information," Proceedings of the Royal Society A, vol. 464, no. 2093, pp. 1203–1215, 2008.

[31] K. Torkkola, "On feature extraction by mutual information maximization," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '02), vol. 1, pp. I821–I824, IEEE, May 2002.

[32] K. Nagarajan, B. Holland, C. Slatton, and A. D. George, "Scalable and portable architecture for probability density function estimation on FPGAs," in Proceedings of the 16th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '08), pp. 302–303, IEEE Computer Society, April 2008.

[33] C. G. Bowsher, "Modelling security market events in continuous time: intensity based, multivariate point process models," Journal of Econometrics, vol. 141, no. 2, pp. 876–912, 2007.

[34] B. Peiers, "Informed traders, intervention, and price leadership: a deeper view of the microstructure of the foreign exchange market," Journal of Finance, vol. 52, no. 4, pp. 1589–1614, 1997.

[35] A. Dionisio, R. Menezes, and D. A. Mendes, "Mutual information: a measure of dependency for nonlinear time series," Physica A, vol. 344, no. 1-2, pp. 326–329, 2004.

[36] J. Y. Campbell, A. W. Lo, A. C. MacKinlay, and R. F. Whitelaw, "The econometrics of financial markets," Macroeconomic Dynamics, vol. 2, no. 4, pp. 559–562, 1998.

[37] G. Koutmos, C. Negakis, and P. Theodossiou, "Stochastic behaviour of the Athens stock exchange," Applied Financial Economics, vol. 3, no. 2, pp. 119–126, 1993.

[38] D. M. Guillaume, M. M. Dacorogna, R. R. Dave, U. A. Muller, R. B. Olsen, and O. V. Pictet, "From the bird's eye to the microscope: a survey of new stylized facts of the intra-daily foreign exchange markets," Finance and Stochastics, vol. 1, no. 2, pp. 95–129, 1997.

[39] R. Cheng, "Using Pearson type IV and other Cinderella distributions in simulation," in Proceedings of the Winter Simulation Conference (WSC '11), pp. 457–468, IEEE, 2011.

[40] Y. Nagahara, "The PDF and CF of Pearson type IV distributions and the ML estimation of the parameters," Statistics and Probability Letters, vol. 43, no. 3, pp. 251–264, 1999.

[41] A. Shephard, Pearson IV, Institute for Fiscal Studies, University College London, 2008.

[42] S. Stavroyiannis, I. Makris, V. Nikolaidis, and L. Zarangas, "Econometric modeling and value-at-risk using the Pearson type IV distribution," International Review of Financial Analysis, vol. 22, pp. 10–17, 2012.

[43] M. Kendall, A. Stuart, and J. K. Ord, Kendall's Advanced Theory of Statistics, Charles Griffin, 1987.

[44] R. Willink, "A closed-form expression for the Pearson type IV distribution function," Australian and New Zealand Journal of Statistics, vol. 50, no. 2, pp. 199–205, 2008.

[45] A. Venelli, "Efficient entropy estimation for mutual information analysis using B-splines," in Information Security Theory and Practices: Security and Privacy of Pervasive Systems and Smart Devices, pp. 17–30, 2010.

[46] C. G. Park and D. W. Shin, "An algorithm for generating correlated random variables in a class of infinitely divisible distributions," Journal of Statistical Computation and Simulation, vol. 61, no. 1-2, pp. 127–139, 1998.

[47] R. L. Iman and W. J. Conover, "A distribution-free approach to inducing rank correlation among input variables," Communications in Statistics - Simulation and Computation, vol. 11, no. 3, pp. 311–334, 1982.

[48] N. Shah and S. Roberts, "Hidden Markov independent component analysis as a measure of coupling in multivariate financial time series," in Proceedings of the ICA Research Network International Workshop, Liverpool, UK, 2008.

[49] A. Rossi and G. M. Gallo, "Volatility estimation via hidden Markov models," Journal of Empirical Finance, vol. 13, no. 2, pp. 203–230, 2006.

[50] J. Crotty, "Structural causes of the global financial crisis: a critical assessment of the 'new financial architecture'," Cambridge Journal of Economics, vol. 33, no. 4, pp. 563–580, 2009.


[51] J. A. Murphy, "An analysis of the financial crisis of 2008: causes and solutions," Social Science Research Network, 2008.

[52] M. Pojarliev and R. M. Levich, "Detecting crowded trades in currency funds," Financial Analysts Journal, vol. 67, no. 1, pp. 26–39, 2011.

[53] L. Sandoval and I. D. P. Franca, "Correlation of financial markets in times of crisis," Physica A, vol. 391, no. 1, pp. 187–208, 2012.


Page 4: Research Article Dynamically Measuring Statistical ...downloads.hindawi.com/archive/2013/434832.pdf · Research Article Dynamically Measuring Statistical Dependencies in Multivariate

4 ISRN Signal Processing

the icadec algorithm is an expression for the log-likelihood of the data, as described below.

Using (1), and assuming that the observation noise ($\mathbf{n}$) is normally distributed with a mean of zero and an isotropic covariance matrix with precision $\beta$, the distribution of the observations (as a preprocessing step, we normalise each observed signal to have a mean of zero and unit variance), conditioned on $\mathbf{A}$ and $\mathbf{s}$ (where we drop the time index $t$ for ease of presentation), is given by

$$p(\mathbf{x} \mid \mathbf{A}, \mathbf{s}) = \mathcal{N}(\mathbf{x}; \mathbf{A}\mathbf{s}, \beta^{-1}\mathbf{I}), \tag{10}$$

where $\mathbf{A}\mathbf{s}$ is the mean of the normal distribution and $\beta^{-1}\mathbf{I}$ is its covariance. The likelihood of an observation occurring is given by

$$p(\mathbf{x} \mid \mathbf{A}) = \int p(\mathbf{x} \mid \mathbf{A}, \mathbf{s})\, p(\mathbf{s})\, d\mathbf{s}. \tag{11}$$

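Assuming that the distribution over sources has a single dominant peak, in this case given by the maximum likelihood source estimates $\hat{\mathbf{s}} = (\mathbf{A}^\top \mathbf{A})^{-1} \mathbf{A}^\top \mathbf{x}$, the integral in (11) can be analysed by using a simplified (computationally efficient) variant of Laplace's method, as shown in [21, 25]:

$$p(\mathbf{x} \mid \mathbf{A}) = \int p(\mathbf{x} \mid \mathbf{A}, \mathbf{s})\, p(\mathbf{s})\, d\mathbf{s} \approx p(\mathbf{x} \mid \mathbf{A}, \hat{\mathbf{s}})\, p(\hat{\mathbf{s}})\, (2\pi)^{M/2} \det(\mathbf{G})^{-1/2}, \tag{12}$$

where $\mathbf{G}$ is the Hessian matrix

$$\mathbf{G} = -\left[ \frac{\partial^2 \log p(\mathbf{x} \mid \mathbf{A}, \mathbf{s})}{\partial s_i\, \partial s_j} \right]_{\mathbf{s} = \hat{\mathbf{s}}}. \tag{13}$$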

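Taking the log of the expanded form of (10) gives

$$\log p(\mathbf{x} \mid \mathbf{A}, \mathbf{s}) = \frac{N}{2} \log\left(\frac{\beta}{2\pi}\right) - \frac{\beta}{2} (\mathbf{x} - \mathbf{A}\mathbf{s})^\top (\mathbf{x} - \mathbf{A}\mathbf{s}), \tag{14}$$

which, via (13), results in $\mathbf{G} = \beta \mathbf{A}^\top \mathbf{A}$. The log-likelihood, $\ell \equiv \log p(\mathbf{x} \mid \mathbf{A})$, is therefore

$$\ell = \frac{N - M}{2} \log\left(\frac{\beta}{2\pi}\right) - \frac{\beta}{2} (\mathbf{x} - \mathbf{A}\hat{\mathbf{s}})^\top (\mathbf{x} - \mathbf{A}\hat{\mathbf{s}}) + \log p(\hat{\mathbf{s}}) - \frac{1}{2} \log \det(\mathbf{A}^\top \mathbf{A}). \tag{15}$$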

By using (8), we obtain $\mathbf{A} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{Q}^\top \mathbf{D}^{-1}$. Hence the log-likelihood becomes [21]

$$\ell = \frac{N - M}{2} \log\left(\frac{\beta}{2\pi e}\right) + \log p(\hat{\mathbf{s}}) + \log \det(\boldsymbol{\Sigma}^{-1} \mathbf{D}), \tag{16}$$

where $e$ is the average reconstruction error. Noting that we use a reciprocal cosh source model (as given by (5)), it can be shown that taking the derivative of this log-likelihood expression with respect to $\mathbf{D}$ and $\mathbf{J}$ (which parameterises $\mathbf{Q}$), and following the resulting likelihood gradient using a BFGS optimiser, makes it possible to efficiently compute an optimum value for the ICA unmixing matrix; details of this procedure are presented in [24].
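As an illustration of the Laplace-approximated log-likelihood in (15), the following minimal sketch evaluates it for a given mixing matrix. It is not the icadec implementation: the function names are hypothetical, and the source prior is assumed to take the standard reciprocal cosh form $p(s_i) = 1/(\pi \cosh s_i)$, since (5) is not reproduced in this section.

```python
import numpy as np


def log_recip_cosh_prior(s):
    """Log-density of independent sources with p(s_i) = 1 / (pi * cosh(s_i)).

    Assumed form of the reciprocal cosh source model.
    """
    s = np.asarray(s, dtype=float)
    # log(cosh(s)) = logaddexp(s, -s) - log(2); this avoids overflow for large |s|.
    return float(np.sum(-np.log(np.pi) + np.log(2.0) - np.logaddexp(s, -s)))


def laplace_log_likelihood(x, A, beta):
    """Evaluate the approximate log-likelihood of one observation x (length N)
    for an N x M mixing matrix A and noise precision beta."""
    x = np.asarray(x, dtype=float)
    A = np.asarray(A, dtype=float)
    n, m = A.shape
    # Least-squares source estimate, s_hat = (A^T A)^{-1} A^T x.
    s_hat, *_ = np.linalg.lstsq(A, x, rcond=None)
    residual = x - A @ s_hat
    # log det(A^T A), computed stably.
    _, logdet = np.linalg.slogdet(A.T @ A)
    return (
        0.5 * (n - m) * np.log(beta / (2.0 * np.pi))
        - 0.5 * beta * float(residual @ residual)
        + log_recip_cosh_prior(s_hat)
        - 0.5 * logdet
    )
```

For square mixing ($N = M$) the first two terms vanish, so the value reduces to the source prior term minus the log-determinant term.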

3.2. Dynamic Mixing. The standard (offline) ICA model uses all available data samples at times $t = 1, 2, \ldots, T$ of the observed signals $\mathbf{x}(t)$ to estimate a single static unmixing matrix $\mathbf{W}$. The unmixing matrix obtained provides a good estimate of the mixing process for the complete time series and is well suited for offline data analysis. However, many time series, such as financial data streams, are highly dynamic in nature, with rapidly changing properties, and therefore require a source separation method that can be used in a sequential manner. This issue is addressed here by using a sliding-window ICA model [26]. This model makes use of a sliding-window approach to sequentially update the current unmixing matrix using information contained in the previous window and can easily handle nonstationary data. The unmixing matrix for the current window, $\mathbf{W}(t)$, is used as a prior for computing the unmixing matrix for the next window, $\mathbf{W}(t + 1)$. This results in significant computational efficiency, as fewer iterations are required to obtain an optimum value for $\mathbf{W}(t + 1)$. The algorithm also results in an improvement in the source separation results obtained when the mixing process is drifting and addresses the ICA permutation and sign ambiguity issues [21] by maintaining a fixed (but of course arbitrary) ordering of recovered sources through time.
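The warm-start idea described above can be sketched as a simple loop. This is only an outline under stated assumptions: `fit` is a hypothetical user-supplied routine (for example, a wrapper around an ICA optimiser) that takes a window of data and an initial unmixing matrix and returns the refined matrix; the window and step sizes are illustrative parameters, not values from the text.

```python
import numpy as np


def sliding_window_unmixing(x, window, step, fit, w_init):
    """Estimate one unmixing matrix per window, warm-starting each fit.

    x      : array of shape (T, N), one row per time step.
    window : number of samples per window.
    step   : number of samples the window advances each time.
    fit    : hypothetical callable fit(segment, w_start) -> refined unmixing matrix.
    w_init : starting matrix for the first window.
    """
    x = np.asarray(x, dtype=float)
    w_prev = np.asarray(w_init, dtype=float)
    estimates = []
    for start in range(0, x.shape[0] - window + 1, step):
        segment = x[start:start + window]
        # The previous window's estimate is the starting point for this one,
        # which also keeps the ordering and sign of the rows consistent over time.
        w_prev = fit(segment, w_prev)
        estimates.append(w_prev)
    return estimates
```

Because each fit starts from the previous solution, an iterative optimiser would typically need fewer iterations per window than it would from a random initialisation.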

4. Information Coupling

We now proceed to derive the ICA-based information coupling metric. Later in this section we discuss the practical advantages this metric offers when used to analyse multivariate financial time series.

4.1. Coupling Metric. Let $\mathbf{W}$ be any arbitrary square ICA unmixing matrix (for brevity and clarity, we only consider the case of square mixing while deriving the metric here; however, the metric derived in this section is valid for nonsquare mixing as well, and the corresponding derivation can be undertaken using a similar approach to that presented here, but converting the nonsquare ICA unmixing matrices in each instance into square matrices by padding them with zeros):

$$\mathbf{W} \in \mathbb{R}^{N \times N}. \tag{17}$$

Since multiplication of $\mathbf{W}$ by a diagonal matrix does not affect the mutual information of the recovered sources, we row-normalise the unmixing matrix in order to address the ICA scale indeterminacy problem (most ICA algorithms, including icadec, suffer from the scale indeterminacy problem; that is, the variances of the independent components cannot be determined, because both the unmixing matrix and the source signals are unknown and any scalar multiplication on either will be lost in the mixing process). Row normalisation implies that the elements $w_{ij}$ of the unmixing matrix $\mathbf{W}$ are constrained such that each row of the matrix is of unit length, that is,

$$\sum_{j=1}^{N} w_{ij}^2 = 1 \tag{18}$$

for all rows $i$. For a set of observed signals to be completely decoupled, their latent independent components must be the same as the observed signals; therefore the row-normalised unmixing matrix for decoupled signals ($\mathbf{W}_0$) must be a permutation of the identity matrix ($\mathbf{I}$):

$$\mathbf{W}_0 = \mathbf{P}\mathbf{I} \in \mathbb{R}^{N \times N}, \tag{19}$$

where $\mathbf{P}$ is a permutation matrix. For the case where the observed signals are completely coupled, all the latent independent components must be the same; therefore the row-normalised unmixing matrix for completely coupled signals ($\mathbf{W}_1$) is given by

$$\mathbf{W}_1 = \frac{1}{\sqrt{N}} \mathbf{K} \in \mathbb{R}^{N \times N}, \tag{20}$$

where $\mathbf{K}$ is the unit matrix (a matrix of ones).

To calculate coupling, we need to consider the distance between any arbitrary unmixing matrix ($\mathbf{W}$) and the zero coupling matrix ($\mathbf{W}_0$). The distance measure we use is the generalised 2-norm, also called the spectral norm, of the difference between the two matrices [27], although we can use some other norms as well to get similar results. The spectral norm of a matrix corresponds to its largest singular value and is the matrix equivalent of the vector Euclidean norm. Hence the distance, $d(\mathbf{W}, \mathbf{W}_0)$, between the two matrices can be written as

$$d(\mathbf{W}, \mathbf{W}_0) = \|\mathbf{W} - \mathbf{W}_0\|_2, \tag{21}$$

where $\|\cdot\|_2$ is the spectral norm of the matrix. As $\mathbf{W}_0$ is a permutation of the identity matrix,

$$d(\mathbf{W}, \mathbf{W}_0) = \|\mathbf{W} - \mathbf{P}\mathbf{I}\|_2. \tag{22}$$

As the spectral norm of a matrix is independent of its permutations, we may define another permutation matrix ($\tilde{\mathbf{P}}$) such that

$$d(\mathbf{W}, \mathbf{W}_0) = \|\tilde{\mathbf{P}}\mathbf{W} - \mathbf{I}\|_2. \tag{23}$$

For this equation the following equality holds:

$$d(\mathbf{W}, \mathbf{W}_0) = \|\tilde{\mathbf{P}}\mathbf{W}\|_2 - 1. \tag{24}$$

Again, noting that $\|\tilde{\mathbf{P}}\mathbf{W}\|_2 = \|\mathbf{W}\|_2$, we have

$$d(\mathbf{W}, \mathbf{W}_0) = \|\mathbf{W}\|_2 - 1. \tag{25}$$

We normalise this measure with respect to the range over which the distance measure can vary, that is, the distance between matrices representing completely coupled ($\mathbf{W}_1$) and decoupled ($\mathbf{W}_0$) signals. From (20) we have that $\mathbf{W}_1 = (1/\sqrt{N})\mathbf{K}$; therefore

$$d(\mathbf{W}_1, \mathbf{W}_0) = \|\mathbf{W}_1 - \mathbf{W}_0\|_2 = \left\| \frac{1}{\sqrt{N}} \mathbf{K} - \mathbf{P}\mathbf{I} \right\|_2. \tag{26}$$

Using the same analysis as presented previously, this equation can be simplified to

$$d(\mathbf{W}_1, \mathbf{W}_0) = \left\| \frac{1}{\sqrt{N}} \mathbf{K} \right\|_2 - 1. \tag{27}$$

For an $N$-dimensional square unit matrix, the spectral norm is given by $\|\mathbf{K}\|_2 = N$. Therefore, for a row-normalised unit matrix, the spectral norm is $\|(1/\sqrt{N})\mathbf{K}\|_2 = \sqrt{N}$. Hence, if $\mathbf{W}$ is row-normalised, (27) can be written as

$$d(\mathbf{W}_1, \mathbf{W}_0) = \sqrt{N} - 1. \tag{28}$$
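The spectral norm of the matrix of ones quoted above follows from its rank-one structure. A short check, writing $\mathbf{1}$ for the $N$-vector of ones (notation introduced here only for this step):

$$\mathbf{K} = \mathbf{1}\mathbf{1}^\top, \qquad \mathbf{K}^\top \mathbf{K} = \mathbf{1}(\mathbf{1}^\top \mathbf{1})\mathbf{1}^\top = N\mathbf{K}, \qquad \lambda_{\max}(\mathbf{K}^\top \mathbf{K}) = N \cdot N = N^2, \qquad \|\mathbf{K}\|_2 = \sqrt{N^2} = N,$$

so that $\|(1/\sqrt{N})\mathbf{K}\|_2 = N/\sqrt{N} = \sqrt{N}$. For example, with $N = 2$ the normalising distance is $\sqrt{2} - 1 \approx 0.414$.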

The normalised information coupling metric ($\eta$) is then defined as

$$\eta = \frac{d(\mathbf{W}, \mathbf{W}_0)}{d(\mathbf{W}_1, \mathbf{W}_0)}. \tag{29}$$

Substituting (25) and (28) into (29), the normalised information coupling between $N$ observed signals is given by

$$\eta = \frac{\|\mathbf{W}\|_2 - 1}{\sqrt{N} - 1}. \tag{30}$$
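Given an unmixing matrix, the metric can be computed in a few lines. The sketch below assumes a square unmixing matrix has already been estimated by some ICA routine; the function name is hypothetical, and row normalisation as in (18) is applied inside the function.

```python
import numpy as np


def information_coupling(W):
    """Normalised information coupling for a square unmixing matrix W (N >= 2)."""
    W = np.asarray(W, dtype=float)
    if W.ndim != 2 or W.shape[0] != W.shape[1] or W.shape[0] < 2:
        raise ValueError("W must be a square matrix with N >= 2")
    n = W.shape[0]
    # Row-normalise so that each row has unit Euclidean length.
    W_norm = W / np.linalg.norm(W, axis=1, keepdims=True)
    # ord=2 gives the spectral norm (largest singular value) of a 2-D array.
    spectral_norm = np.linalg.norm(W_norm, 2)
    return float((spectral_norm - 1.0) / (np.sqrt(n) - 1.0))
```

The two limiting cases can be checked directly: the identity matrix gives a coupling of 0 and a matrix of ones gives a coupling of 1.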

We can consider the bounds of $\eta$ as described below. Suppose $\mathbf{M}$ is an arbitrary real matrix. We can look upon the spectral norm of this matrix ($\|\mathbf{M}\|_2 = \|\mathbf{M} - \mathbf{0}\|_2$) as a measure of departure (distance) of $\mathbf{M}$ from a null matrix ($\mathbf{0}$) [28]. The bounds on this norm are given by

$$0 \le \|\mathbf{M}\|_2 \le \infty. \tag{31}$$

If $\mathbf{M}$ is a row-normalised ICA unmixing matrix ($\mathbf{W}$), then (as discussed earlier) it lies between $\mathbf{W}_0$ and $\mathbf{W}_1$. Hence the bounds on $\mathbf{W}$ are

$$\|\mathbf{W}_0\|_2 \le \|\mathbf{W}\|_2 \le \|\mathbf{W}_1\|_2. \tag{32}$$

Using (19) and (20), we can write

$$\|\mathbf{P}\mathbf{I}\|_2 \le \|\mathbf{W}\|_2 \le \left\| \frac{1}{\sqrt{N}} \mathbf{K} \right\|_2, \tag{33}$$

which can be simplified to

$$1 \le \|\mathbf{W}\|_2 \le \sqrt{N}. \tag{34}$$

Rearranging terms in this inequality gives

$$0 \le \frac{\|\mathbf{W}\|_2 - 1}{\sqrt{N} - 1} \le 1, \tag{35}$$

which gives us the same coupling metric as in (30) and shows that the metric is normalised, that is, $0 \le \eta \le 1$. For real-valued $\mathbf{W}$, $\|\mathbf{W}\|_2$ can be written as

$$\|\mathbf{W}\|_2 = \sqrt{\lambda_{\max}(\mathbf{W}^\top \mathbf{W})}, \tag{36}$$

where $\lambda_{\max}(\mathbf{W}^\top \mathbf{W})$ is the maximum eigenvalue of $\mathbf{W}^\top \mathbf{W}$. The unmixing matrix obtained using most ICA algorithms, including icadec, suffers from row permutation and sign ambiguity problems; that is, the rows are arranged in a random order and the sign of the elements in each row is unknown [21, 24]. We note that, as $\|\mathbf{W}\|_2$ is independent of the sign and permutations of the rows of $\mathbf{W}$, our measure of information coupling directly addresses the problems of ICA sign and permutation ambiguities. Also, as the metric's value is independent of the row permutations of $\mathbf{W}$, it provides symmetric results. The information coupling metric is valid for all dimensions of the unmixing matrix $\mathbf{W}$. This implies that information coupling can be easily measured in high-dimensional spaces.

It is possible to obtain a measure of uncertainty in our estimation of the information coupling measure. We make use of the BFGS quasi-Newton optimisation approach over the most probable skew-symmetric matrix $\mathbf{J}$ of (9), from which estimates for the unmixing matrix $\mathbf{W}$ can be obtained, and thence the coupling measure $\eta$ calculated. We also estimate the Hessian (inverse covariance) matrix $\mathbf{H}$ for $\mathbf{J}$ as part of this process. Hence it is possible to draw samples, $\mathbf{J}'$ say, from the distribution over $\mathbf{J}$ as a multivariate normal:

$$\mathbf{J}' \sim \mathcal{MN}(\mathbf{J}, \mathbf{H}^{-1}). \tag{37}$$

These samples can be readily transformed to samples in $\eta$ using (9), (8), and (30), respectively. Confidence bounds (and here we use the 95% bounds) may then be easily obtained from the set of samples for $\eta$ (in our analysis we use 100 samples).
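The sampling procedure can be sketched as follows. Several parts are assumptions made for illustration: $\mathbf{J}$ is represented by a flat vector of its free parameters, `eta_of_j` is a hypothetical callable standing in for the chain of transformations through (9), (8), and (30) (which are not reproduced in this section), and the bounds are taken as the 5th and 95th percentiles of the sampled values.

```python
import numpy as np


def coupling_confidence_bounds(j_mean, hessian, eta_of_j,
                               n_samples=100, percentiles=(5.0, 95.0), rng=None):
    """Sample J' ~ N(J, H^{-1}) and return (lower, median, upper) bounds on eta.

    j_mean   : vector of the free parameters of J (assumed flattened).
    hessian  : Hessian (inverse covariance) matrix H for those parameters.
    eta_of_j : hypothetical callable mapping one sampled parameter vector to eta.
    """
    rng = np.random.default_rng() if rng is None else rng
    covariance = np.linalg.inv(np.asarray(hessian, dtype=float))
    samples = rng.multivariate_normal(
        np.asarray(j_mean, dtype=float), covariance, size=n_samples
    )
    etas = np.array([eta_of_j(j) for j in samples])
    lower, upper = np.percentile(etas, percentiles)
    return float(lower), float(np.median(etas)), float(upper)
```

A larger Hessian (a sharper optimum) gives a tighter spread of sampled coupling values, and hence narrower bounds.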

4.2. Computational Complexity. The information coupling algorithm achieves computational efficiency by making use of the sliding-window based decorrelating manifold approach to ICA. Making use of the reciprocal cosh based ICA source model also results in significant computational advantages. We now take a look at the comparative computational complexity of the information coupling measure and three frequently used measures of statistical dependence, that is, linear correlation, rank correlation, and mutual information. For bivariate data ($n_s$ samples long), for which these four measures are directly comparable, linear correlation and rank correlation have time complexities of order $O(n_s)$ and $O(n_s \log n_s)$, respectively [29], while mutual information and information coupling scale as $O(n_s^2)$ and $O(n_s)$, respectively (there have been various estimation algorithms proposed for efficient computation of mutual information; however, they all result in increased estimation errors and require careful selection of various user-defined parameters [30]) [31]. Hence, even though the time complexity of the information coupling measure is of the same order as linear correlation, it can still accurately capture statistical dependencies in non-Gaussian data streams and is a computationally efficient proxy for mutual information.

For $N$-dimensional multivariate data, direct computation of mutual information has time complexity of order $O(n_s^N)$, compared to $O(n_s N^3)$ for the information coupling measure. In high dimensions, even an approximation for mutual information can be computationally very costly. For example, using a Parzen-window density estimator, the mutual information computational complexity can be reduced to $O(n_s n_b^N)$, where $n_b$ is the number of bins used for estimation [32], which will incur a very high computational cost even for relatively small values of $N$, $n_b$, and $n_s$. As a simple example, Table 1 shows a comparison of computation time (in seconds) taken by mutual information and information coupling measures for analysing bivariate data sets of varying lengths. As expected, mutual information estimation using the Parzen-window based approach (which is considered to be a relatively efficient approach to compute mutual information) becomes computationally very demanding with an increase in the number of samples of the bivariate data set. In contrast, the information coupling measure is computationally efficient even when used to analyse very large high-dimensional multivariate data sets.

Table 1: Example showing comparison of average computation time (in seconds) of mutual information and information coupling when these measures are used to analyse bivariate data sets containing different numbers of samples ($n_s$). The approach used to estimate mutual information is based on a Parzen-window based algorithm, as described in [45]. The computational cost of this algorithm is dependent on the window size ($h$). The values of $h$ used for the simulations are $h = 20$ for $n_s = 10^2$ and $h = 100$ for all other simulations. Results are obtained using a 2.66 GHz processor as an average of 100 simulations.

| Computation time (sec) | $n_s = 10^2$ | $n_s = 10^3$ | $n_s = 10^4$ | $n_s = 10^5$ |
|---|---|---|---|---|
| Mutual information | 0.0214 | 2.8289 | 17.0101 | 119.5088 |
| Information coupling | 0.0073 | 0.0213 | 0.0561 | 0.5543 |

4.3. Capturing Market Dynamics. To dynamically analyse information coupling in multivariate financial data streams, we need to make use of windowing techniques. Financial markets give rise to well-defined events, such as orders, trades, and quote revisions. These events are irregularly spaced in clock-time; that is, they are asynchronous. Statistical models in clock-time make use of data aggregated over fixed intervals of time [33]. The time at which these events are indexed is called the event-time. Hence, for dynamic modelling in event-time, the number of data points can be regarded as fixed while time varies, whereas in clock-time the time period is considered to be fixed with a variable number of data points. Although we may need adaptive windows in clock-time, we can use sliding-windows of fixed length in event-time. Using fixed-length sliding-windows in event-time can be useful for obtaining consistent results when developing and testing different statistical models. Also, statistical models deployed for online analysis of financial data operate best in event-time, as they often need to make decisions as soon as some new market information (such as a quote update) becomes available. Consider an online trading model making use of an adaptive window in event-time. At specific times of the day, for example, at times of major news announcements, trading volume can significantly increase. Hence more data will be available to the algorithm, and thus the results obtained can be misleading [34]. Using a sliding-window of fixed length in event-time can overcome this problem. The length of the sliding-window needs to be selected appropriately. The financial application for which the model is being used is one of the factors which drives the choice of window length. As a general rule, for trading models, a window of approximately the same size as the average time period between placing trades (inverse of trading frequency) is often used. This makes it possible to accurately capture the rapidly evolving dynamics of the markets over the corresponding period, without the window being so long that it captures only major trends or so short that it captures noise in the data.
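The difference between the two windowing schemes can be illustrated with a small sketch. The function names and the representation of events as `(timestamp, value)` pairs are assumptions made for this example only.

```python
from collections import deque


def event_time_windows(events, n_events):
    """Yield a window of exactly n_events items each time a new event arrives."""
    window = deque(maxlen=n_events)
    for event in events:
        window.append(event)  # the oldest event drops out automatically
        if len(window) == n_events:
            yield list(window)


def clock_time_windows(events, span):
    """Yield, for each new event, all events within the last `span` time units.

    events must be (timestamp, value) pairs sorted by timestamp, and span > 0.
    """
    window = deque()
    for timestamp, value in events:
        window.append((timestamp, value))
        # Discard events that are older than the fixed time span.
        while window[0][0] <= timestamp - span:
            window.popleft()
        yield list(window)
```

With a burst of activity (many events close together in time), the clock-time window grows in length, whereas the event-time window always holds the same number of data points.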

4.4. Discussion. The information coupling model offers multiple advantages when used to analyse multivariate financial data. Here we summarise some of the main properties of the model, while the empirical results presented in the next section showcase some of its practical benefits.

(i) The information coupling measure, a proxy for mutual information, is able to accurately pick up statistical dependencies in data sets with non-Gaussian distributions (such as financial returns).

(ii) The information coupling algorithm is computationally efficient, which makes it particularly suitable for use in an online dynamic environment. This makes the algorithm especially attractive when dealing with data sampled at high frequencies, because with the ever-increasing use of high-frequency data, overcoming sources of latency is of utmost importance in a variety of applications in modern financial markets.

(iii) It gives confidence levels on the information coupling measure. This allows us to estimate the uncertainty associated with the measurements.

(iv) The metric provides normalised results; that is, information coupling ranges from 0 for decoupled systems to 1 for completely coupled systems. This makes it easier to analyse results obtained using the metric and to compare its performance with other similar measures of association. The metric also gives symmetric results.

(v) The metric is valid for any number of signals in high-dimensional spaces; that is, it consistently gives accurate results irrespective of the number of time series between which information coupling is being computed. This makes it suitable for a range of financial applications.

(vi) It is not data intensive; that is, it gives relatively accurate results even when a small sample size is used. This allows the metric to model the complex and rapidly changing dynamics of financial markets.

(vii) It does not depend on user-defined parameters. Such parameters can restrict the practical utility of a measure, as evolving market conditions may require them to be constantly updated, which may not be practical.

5. Results

We now present a set of synthetic and financial data examples showing the relative accuracy and practical utility of the information coupling measure. The following notations are used for different measures of statistical dependence in this paper: ICA-based information coupling ($\eta$), linear correlation ($\rho$), rank correlation ($\rho_R$), and normalised mutual information ($I_N$). Normalisation of mutual information values ($I$) is achieved using the following transformation [35]:

$$I_N = \sqrt{1 - \exp(-2I)}. \tag{38}$$
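One way to see why the transformation in (38) is a natural normalisation: for a bivariate Gaussian with correlation $\rho$, the mutual information in nats is $I = -\tfrac{1}{2}\ln(1 - \rho^2)$, so that $I_N = |\rho|$. The short sketch below checks this numerically; the function names are hypothetical, and mutual information is assumed to be measured in nats.

```python
import math


def normalised_mutual_information(mi):
    """Map a mutual information value (in nats, >= 0) onto the range [0, 1)."""
    if mi < 0:
        raise ValueError("mutual information must be non-negative")
    return math.sqrt(1.0 - math.exp(-2.0 * mi))


def gaussian_mutual_information(rho):
    """Mutual information (nats) of a bivariate Gaussian with correlation rho."""
    return -0.5 * math.log(1.0 - rho ** 2)
```

Applying the first function to the output of the second returns the absolute value of the correlation, which makes $I_N$ directly comparable with $\rho$ in the Gaussian case.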

The financial data examples we present in this paper primarily make use of spot foreign exchange (FX) data. Spot price, or spot rate, is the price which is actually quoted for a currency transaction to take place. Return is the profit realised when a currency is traded. It is common practice to use the normalised log-returns of financial data in statistical analysis instead of the raw data itself [36]. Using log-returns makes it possible to convert exponential problems into linear ones, thus significantly simplifying relevant analysis. A normalised log-returns data set, with a mean of zero and unit variance, can generally be regarded as a locally stationary process [37]. Therefore many signal processing techniques meant solely for stationary processes can be successfully applied to the normalised log-returns time series in an adaptive environment. Denoting the mid-price at time $t$ of a financial time series by $P_t$, the log-return value is given by

$$r_t = \ln\left(\frac{P_t}{P_{t-1}}\right). \tag{39}$$

The normalisation of log-returns is achieved by converting the data to a form such that it has a mean of zero and unit variance. This is easily achieved by removing the mean and dividing by the standard deviation of the data.
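The preprocessing described above can be sketched in a few lines. The function name is hypothetical, and the population standard deviation is used here as one possible choice; the text does not specify which estimator is intended.

```python
import math
import statistics


def normalised_log_returns(prices):
    """Convert a sequence of positive mid-prices into zero-mean, unit-variance log-returns."""
    if len(prices) < 3:
        raise ValueError("need at least three prices")
    # r_t = ln(P_t / P_{t-1}) for each consecutive pair of prices.
    returns = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]
    mean = statistics.fmean(returns)
    std = statistics.pstdev(returns)  # population standard deviation (an assumption)
    if std == 0:
        raise ValueError("returns have zero variance")
    return [(r - mean) / std for r in returns]
```

In a sliding-window setting, this normalisation would be applied to the data within each window rather than to the whole series at once.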

5.1. Synthetic Data. There is no single distribution which fits financial returns, especially those sampled at higher frequencies, which tend to be highly non-Gaussian [2], although there have been attempts to model returns using a variety of distributions [38]. In this paper, we aim to capture the heavy-tailed, skewed properties of financial returns using a Pearson type IV distribution [39, 40], which can be used to represent distributions with varying degrees of skewness and kurtosis and is thus useful for representing distributions of financial returns [41, 42]. The first four moments of the distribution can be uniquely determined by setting four parameters which characterise the distribution [43]. Until recently, due to its mathematical and computational complexity, this distribution has not been widely used in financial literature, although this is rapidly changing with advances in computational power and the proposal of new, improved analytical methods [39, 42, 44].

Table 2: Accuracy of four measures of dependence, that is, information coupling ($\eta$), linear correlation ($\rho$), rank correlation ($\rho_R$), and normalised mutual information ($I_N$), when used to estimate the level of dependence in a coupled system with varying levels of true correlation ($\rho_{\mathrm{TC}}$). $\rho_{\mathrm{TC}}$ is induced in an independent bivariate randomly distributed system using the Iman-Conover method, as described in the text. The dependence estimates, together with their standard deviation values, shown in the table are obtained using 1000 independent simulations, using 1000-data-point-long data sets for each simulation. The last row of the table gives values for the mean absolute error (MAE).

| $\rho_{\mathrm{TC}}$ | $\eta$ | $\rho$ | $\rho_R$ | $I_N$ | $\lvert\rho_{\mathrm{TC}} - \eta\rvert$ | $\lvert\rho_{\mathrm{TC}} - \rho\rvert$ | $\lvert\rho_{\mathrm{TC}} - \rho_R\rvert$ | $\lvert\rho_{\mathrm{TC}} - I_N\rvert$ |
|---|---|---|---|---|---|---|---|---|
| 0 | 0.0046 ± 0.0026 | 0.0038 ± 0.0022 | 0.0147 ± 0.0115 | 0.1851 ± 0.0401 | 0.0046 | 0.0038 | 0.0147 | 0.1851 |
| 0.1 | 0.1079 ± 0.0073 | 0.0917 ± 0.0058 | 0.0942 ± 0.0184 | 0.1687 ± 0.0412 | 0.0079 | 0.0083 | 0.0058 | 0.0687 |
| 0.2 | 0.2145 ± 0.0102 | 0.1862 ± 0.0093 | 0.1899 ± 0.0177 | 0.1131 ± 0.0464 | 0.0145 | 0.0138 | 0.0111 | 0.0869 |
| 0.3 | 0.3176 ± 0.0138 | 0.2812 ± 0.0130 | 0.2863 ± 0.0167 | 0.1644 ± 0.0538 | 0.0176 | 0.0188 | 0.0137 | 0.1356 |
| 0.4 | 0.4166 ± 0.0210 | 0.3764 ± 0.0165 | 0.3835 ± 0.0153 | 0.2787 ± 0.0500 | 0.0166 | 0.0236 | 0.0165 | 0.1213 |
| 0.5 | 0.5135 ± 0.0196 | 0.4720 ± 0.0197 | 0.4818 ± 0.0136 | 0.3819 ± 0.0475 | 0.0135 | 0.0280 | 0.0182 | 0.1181 |
| 0.6 | 0.6070 ± 0.0347 | 0.5688 ± 0.0226 | 0.5815 ± 0.0117 | 0.4788 ± 0.0458 | 0.0070 | 0.0312 | 0.0185 | 0.1212 |
| 0.7 | 0.7011 ± 0.0232 | 0.6670 ± 0.0242 | 0.6830 ± 0.0093 | 0.5728 ± 0.0483 | 0.0011 | 0.0330 | 0.0170 | 0.1272 |
| 0.8 | 0.7936 ± 0.0239 | 0.7676 ± 0.0249 | 0.7864 ± 0.0066 | 0.6652 ± 0.0490 | 0.0064 | 0.0324 | 0.0136 | 0.1348 |
| 0.9 | 0.8864 ± 0.0252 | 0.8713 ± 0.0249 | 0.8919 ± 0.0034 | 0.7595 ± 0.0501 | 0.0136 | 0.0287 | 0.0081 | 0.1405 |
| 1.0 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000 | 0.9601 ± 0.0185 | 0.0000 | 0.0000 | 0.0000 | 0.0399 |
| MAE | | | | | 0.0093 | 0.0201 | 0.0125 | 0.1163 |

To test the accuracy of various measures of dependence, we need to generate coupled non-Gaussian synthetic data with known, predefined correlation values. There is no straightforward way to simulate correlated random variables when their joint distribution is not known [46], as is the case with multivariate financial returns. One possible method that can be used to induce any desired predefined correlation between independent randomly distributed variables, irrespective of their distributions, is commonly known as the Iman-Conover method, as presented in [47]. This method is based on inducing a known dependency structure in samples taken from the input independent marginal distributions using reordering techniques. The multivariate coupled structure obtained as the output can thus be used as the input data in various models of dependency analysis to test their relative accuracies.
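The reordering idea behind the Iman-Conover method can be sketched as follows. This is a simplified illustration rather than the exact procedure of [47] or the code used for the experiments: the function name is hypothetical, normal (van der Waerden) scores are assumed for the reference matrix, and any sampler can supply the independent input columns.

```python
from statistics import NormalDist

import numpy as np


def iman_conover(x, target_corr, rng):
    """Reorder each column of x (shape n x k, independent columns) so that the
    columns acquire approximately the rank dependence implied by target_corr.
    The marginal distribution of every column is left unchanged."""
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    # Reference scores: normal quantiles, independently shuffled per column.
    normal = NormalDist()
    scores = np.array([normal.inv_cdf(i / (n + 1)) for i in range(1, n + 1)])
    m = np.column_stack([rng.permutation(scores) for _ in range(k)])
    # Remove the sample correlation of the scores, then impose the target:
    # with E = Q Q^T and C = P P^T, the transform T = P Q^{-1} gives T E T^T = C.
    q = np.linalg.cholesky(np.corrcoef(m, rowvar=False))
    p = np.linalg.cholesky(np.asarray(target_corr, dtype=float))
    m_star = m @ (p @ np.linalg.inv(q)).T
    # Reorder each input column to follow the rank order of the transformed scores.
    out = np.empty_like(x)
    for j in range(k):
        ranks = np.argsort(np.argsort(m_star[:, j]))
        out[:, j] = np.sort(x[:, j])[ranks]
    return out
```

Because only the ordering of the samples changes, the skewness and kurtosis of each marginal are preserved, which is what makes the approach suitable for heavy-tailed inputs.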

We set the parameters of the Pearson type IV distribution such that the coupled synthetic data we obtain using the Iman-Conover method has similar properties to financial returns. But first we need to consider some properties of financial returns which we want to mimic. Figures 1(a) and 1(b) show the distributions of two higher-order moments, that is, skewness ($\gamma$) and kurtosis ($\kappa$), for FX spot data sets sampled at three different frequencies. The plots are obtained using a sliding-window of length 50 data points as an average of all G10 (Group of Ten) currency pairs, covering a period of 8 hours in the case of 0.25 second and 0.5 second sampled data and 2 years in the case of 0.5 hour sampled data. The non-Gaussian (heavy-tailed, skewed) nature of the data is clearly visible. It is interesting to note that the kurtosis value almost never goes below three for any of the data sets, signifying the temporal persistence of non-Gaussianity. We now make use of the Iman-Conover method to induce varying levels of correlation between 1000 samples taken from independent randomly distributed Pearson type IV distributions. A 1000-sample data set makes it easier to accurately induce predefined correlations in the system and makes it possible to generate data with relatively accurate average kurtosis and skewness values, hence allowing us to accurately capture the higher-order moments of financial returns. As an example, Figures 1(c) and 1(d) show distributions of the kurtosis and skewness of two variables for 1000 independent simulations. Note the similarity of these plots with the average of the corresponding distributions of higher-order moments for financial data, as presented in Figures 1(a) and 1(b). This shows the effectiveness of using synthetic data sampled from a Pearson type IV distribution for capturing higher-order moments of financial returns.

Four different approaches are now used to estimate the level of dependence between the output coupled data. The process is repeated 1000 times for each level of true correlation ($\rho_{\mathrm{TC}}$). Table 2 presents the results obtained. The results show the accuracy of the information coupling measure when used to analyse non-Gaussian data. For this synthetic data example, on average, the information coupling measure was 53.7% more accurate than the linear correlation measure and 25.6% more accurate than the rank correlation measure. The normalised mutual information provided the least accurate results.
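The two accuracy figures quoted above appear to correspond to the relative reduction in mean absolute error (MAE) reported in the last row of Table 2. Under that reading, a quick check gives

$$\frac{\mathrm{MAE}_{\rho} - \mathrm{MAE}_{\eta}}{\mathrm{MAE}_{\rho}} = \frac{0.0201 - 0.0093}{0.0201} \approx 0.537, \qquad \frac{\mathrm{MAE}_{\rho_R} - \mathrm{MAE}_{\eta}}{\mathrm{MAE}_{\rho_R}} = \frac{0.0125 - 0.0093}{0.0125} = 0.256.$$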

We now extend this example by incorporating data dynamics. The same data generation process as described above is now used to construct a 32000-sample-long bivariate data set in which the induced true correlation changes every 8000 time steps, that is, $\rho_{\mathrm{TC}} = 0.2$ for $1 \le t \le 8000$, $\rho_{\mathrm{TC}} = 0.4$ for $8001 \le t \le 16000$, $\rho_{\mathrm{TC}} = 0.6$ for $16001 \le t \le 24000$, and $\rho_{\mathrm{TC}} = 0.8$ for $24001 \le t \le 32000$. A 1000-data-point-wide sliding-window is used to dynamically measure dependencies in the data set. The resulting temporal information coupling plot, together with the 5th and 95th percentile confidence intervals, is presented in Figure 2(a). The four different coupling regions are clearly visible, together with the step changes in coupling after every 8000 time steps, showing the ability of the algorithm to detect abrupt changes in coupling. The normalised empirical probability distributions over $\eta$ for the four coupling regions are shown in Figure 2(b). Also plotted in the same figure are the normalised empirical pdfs for linear correlation ($\rho$) and rank correlation ($\rho_R$). The mutual information pdf is omitted for clarity, as it gives relatively less accurate results, as presented in Table 2. It is interesting to see how the peaks of the $\eta$ distribution correspond very closely to the $\rho_{\mathrm{TC}}$ values, showing the ability of the information coupling model to accurately capture statistical dependencies in a dynamic environment. The least accurate measure in this example is the linear correlation.

[Figure 1 panels: (a) financial data, pdf($\kappa$) against kurtosis ($\kappa$); (b) financial data, pdf($\gamma$) against skewness ($\gamma$); curves in (a) and (b) are for 0.5 hour, 0.5 s, and 0.25 s sampled data; (c) synthetic data, pdf($\kappa$) against kurtosis ($\kappa$); (d) synthetic data, pdf($\gamma$) against skewness ($\gamma$); curves in (c) and (d) are for $x_1$ and $x_2$.]

Figure 1: (a, b) Plots showing the normalised pdfs of (a) kurtosis ($\kappa$) and (b) skewness ($\gamma$), obtained using a sliding-window of length 50 data points as an average of all G10 currency pairs, covering a period of 8 hours in the case of 0.25 second and 0.5 second sampled data and 2 years in the case of 0.5 hour sampled data. The plots clearly show the heavy-tailed, skewed nature of FX returns at all three sampling frequencies. (c, d) An example of the normalised pdf plots showing the average kurtosis and skewness values, respectively, for synthetic data generated using Pearson type IV distributions. The data has properties similar to the average of the distributions presented in (a) and (b). The vertical lines indicate the kurtosis ($\kappa = 3$) and skewness ($\gamma = 0$) values for a Gaussian distribution.

52 Financial Data We first present a simple applicationof the information coupling algorithm to a section of 05second sampled FX spot log-returns data set (in this paperall currencies are referred by their standardised internationalthree-letter codes as described by the ISO 4217 standard Forthe currencies mentioned in this paper the three-letter codesare USD (US dollar) EUR (Euro) JPY (Japanese yen) GBP

(British pound) CHF (Swiss franc) and AUD (Australiandollar)) Figure 3 shows the variation of information couplingand linear correlation with time for EURUSD and GBPUSDThe results are obtained using a 5-minute wide sliding-window We note that the two measures of dependence fre-quently give different results which reflects on the inability oflinear correlation to capture dependencies in non-Gaussiandata streams We also note that dependencies in FX log-returns exhibit rapidly changing dynamics often charac-terised by regions of quasi-stability punctuated by abrupt

Figure 2: (a) Temporal variation of information coupling $\eta(t)$ plotted as a function of time $t$. Step changes in coupling are visible after every 8000 time steps, that is, $\rho_{TC} = 0.2$ when $t = 1{:}8000$, $\rho_{TC} = 0.4$ when $t = 8001{:}16000$, $\rho_{TC} = 0.6$ when $t = 16001{:}24000$, and $\rho_{TC} = 0.8$ when $t = 24001{:}32000$. Also plotted are the median $[\eta(t)]_{\mathrm{median}}$ and the 5% and 95% confidence interval contours $[\eta(t)]^{95}_{5}$. (b) Normalised empirical pdf plots of the magnitude of the dependency measure for information coupling ($\eta$), linear correlation ($\rho$), and rank correlation ($\rho_R$). The vertical lines represent the true correlation ($\rho_{TC}$) values. The relative accuracy of the information coupling measure is evident from these results.

Figure 3: Information coupling ($\eta_t$) and linear correlation ($\rho_t$) plotted as a function of time (in seconds) for a section of the 0.5 second sampled EURUSD and GBPUSD log-returns data set. The plot legend also lists the confidence contours $[\eta_t]^{95}_{5}$. A 5-minute wide sliding-window is used to obtain the results.

changes. In [48], we show that these regions of persistence in statistical dependence may be captured using a hidden Markov ICA model, which is a hidden Markov model with an ICA observation model. The information thus obtained can be useful for a range of financial applications, such as

Figure 4: Daily closing mid-prices ($P_t$) of AUDUSD and USDJPY plotted against date (2005 to 2010). The two plots are adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

finding regions of financial market volatility clustering and persistence [49].

There have been numerous academic studies on the causes and effects of the 2008 financial crisis [50, 51]. However, very few of these have focused on the impact of the crisis on interdependencies in the global FX market. Here, we present a set of examples which give a unique insight into the

Figure 5: (a) Three approaches used to measure temporal dependencies between AUDUSD and USDJPY log-returns: information coupling ($\eta_t$, shown with its confidence contours $[\eta_t]^{95}_{5}$), linear correlation ($\rho_t$), and rank correlation ($\rho_{R,t}$). (b) Range of confidence intervals ($\Delta[\eta]^{95}_{5} = \eta_{95} - \eta_{5}$) plotted as a function of time, showing the uncertainty associated with the information coupling measurements. The vertical lines correspond to the September 2008 financial meltdown.

effects of the crisis on the nature of statistical dependencies in the spot FX market. Accurate estimation of dependencies at times of financial crises is of utmost importance, as these estimates are used by financial practitioners for a range of tasks, such as rebalancing portfolios, accurately pricing options, and deciding on the level of risk-taking. We first present an application of the information coupling model for detecting temporal changes in dependencies in bivariate FX data streams at times of financial crises. Figure 4 shows the adjusted daily closing mid-prices ($P_t$) for AUDUSD and USDJPY from January 2005 till April 2010. The two plots clearly show an abrupt change in the exchange rates in September-October 2008. This change occurred at the height of the 2008 global financial crisis and was caused by the unwinding of carry trades [52]. Figure 5(a) displays three plots showing the temporal variation of information coupling ($\eta_t$), linear correlation ($\rho_t$), and rank correlation ($\rho_{R,t}$) between AUDUSD and USDJPY log-returns. The plots are obtained using a six-month long sliding-window. We notice the rise in uncertainty of the information coupling measure (Figure 5(b)) right before the crash, with uncertainty decreasing gradually thereafter; this information may be useful for systematically predicting upheavals in the market, although we do not carry out this study in detail in this paper. Information about the level of uncertainty can be used as a measure of confidence in the information coupling values and can be useful in various practical decision-making scenarios, such as deciding on the capital to deploy for the purpose of trading or selecting stocks (or currencies) for inclusion in a portfolio. As daily sampled data is generally less non-Gaussian than data sampled at higher frequencies, the three plots in Figure 5(a) are somewhat similar during certain time periods. However, right after the September 2008 crash, the plots deviate significantly from each other. We believe that this is because the nature of the data, in particular its level of non-Gaussianity, has changed. As shown in Figure 6, the distance measure $(\eta_t - |\rho_t|)^2$ between information coupling and linear correlation closely matches the non-Gaussianity of the data under consideration. The two plots are adjusted for ease of comparison. The degree of non-Gaussianity is calculated

Figure 6: Difference between information coupling and linear correlation, $(\eta_t - |\rho_t|)^2$, plotted as a function of time. Also plotted is a measure of non-Gaussianity of the two time series, $\mathrm{JB}^2_{\mathrm{AUDUSD},t} + \mathrm{JB}^2_{\mathrm{USDJPY},t}$, as defined by (40). The two plots are adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

using the multivariate Jarque-Bera statistic ($\mathrm{JB}_{\mathrm{MV}}$), which we define for an $N$-dimensional multivariate data set as

$$\mathrm{JB}_{\mathrm{MV}} = \sum_{j=1}^{N} \left[ \frac{n_s}{6} \left( \gamma_j^2 + \frac{(\kappa_j - 3)^2}{4} \right) \right]^2, \tag{40}$$

where $n_s$ is the number of data points (in this case the size of the sliding-window), $\gamma$ is the skewness of the data under analysis, and $\kappa$ is its kurtosis (the subscript $j$ indexes the $N$ time series). This shows that relying solely on correlation measures to model dependencies in multivariate financial time series, even when using data sampled at relatively lower frequencies, can potentially lead to inaccurate results. In contrast, the information coupling model takes into account properties of the data being analysed, resulting in an accurate approach to measure statistical dependencies.
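The statistic in (40) can be sketched in a few lines of plain Python. This is an illustrative sketch only: the function names are ours, and we assume the biased (population) moment estimators for skewness and kurtosis, since the choice of estimator is not stated here.

```python
def skewness_kurtosis(series):
    """Return (skewness, kurtosis) using biased central-moment estimators (an assumption)."""
    n = len(series)
    mean = sum(series) / n
    m2 = sum((v - mean) ** 2 for v in series) / n
    m3 = sum((v - mean) ** 3 for v in series) / n
    m4 = sum((v - mean) ** 4 for v in series) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def jarque_bera(series):
    """Univariate Jarque-Bera statistic: (n_s / 6) * (gamma^2 + (kappa - 3)^2 / 4)."""
    gamma, kappa = skewness_kurtosis(series)
    return len(series) / 6.0 * (gamma ** 2 + (kappa - 3.0) ** 2 / 4.0)


def jb_multivariate(window):
    """Sum of squared univariate statistics over the N series in one window, as in (40).

    `window` is a list of N equal-length lists, one per time series.
    """
    return sum(jarque_bera(series) ** 2 for series in window)
```

In a sliding-window setting, `jb_multivariate` would be called once per window position with the log-returns of each currency pair. A constant series has zero variance and would raise a division error, so such windows need to be handled separately.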

We now show the utility of the information coupling model for analysing multivariate statistical dependencies. Figure 7 shows the temporal variation of information coupling between four major liquid currency pairs (EURUSD,

Figure 7: Information coupling ($\eta_t$, shown with its confidence contours $[\eta_t]^{95}_{5}$) between EURUSD, GBPUSD, USDCHF, and USDJPY log-returns plotted as a function of time $t$ (in years). Also plotted is the FTSE-100 index, which is adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

GBPUSD, USDCHF, and USDJPY). The results are obtained using daily log-returns for a seven-year period and a six-month long sliding-window. Also plotted on the same figure is the FTSE-100 (Financial Times Stock Exchange 100) index for the corresponding time period, which has been adjusted for ease of comparison. The plot clearly shows an abrupt upward shift in coupling between the four currency pairs right at the time of the September 2008 financial meltdown, with a gradual decrease in coupling over the next year. We also notice an increase in the uncertainty associated with the information coupling measure before the 2008 crash. The increase in dependence of financial instruments in times of financial crises has been observed for other asset classes as well [53]. Our unique example, showing the dynamics of multivariate dependencies within the spot FX space, provides further insight into the nature of interdependencies in times of financial crisis.

6. Conclusions

We present an ICA-based approach to dynamically measure information coupling as a proxy for mutual information. This approach makes use of ICA as a tool to capture information in the tails of the underlying distributions and is suitable for efficiently and accurately measuring statistical dependencies between multiple non-Gaussian signals. As far as we know, this is the first attempt to quantify multivariate dependencies using information encoded in the ICA unmixing matrix. Our proposed information coupling model has multiple other benefits associated with its practical use. It provides a framework for estimating confidence bounds on the information coupling metric, can be efficiently used to directly model dependencies in high-dimensional spaces, and gives normalised, symmetric results. It has the added advantage of not depending on any user-defined parameters and is not data intensive; that is, it can be used even with relatively small data sets without significantly affecting its performance, an important requirement for analysing data streams with rapidly changing dynamics, such as financial returns. The model makes use of a sliding-window based decorrelating manifold approach to ICA, with a reciprocal cosh source model, to infer the ICA unmixing matrix, which results in increased accuracy and efficiency of the algorithm. This gives the information coupling model a computational complexity similar to that of linear correlation, with the accuracy of mutual information.

Acknowledgments

The authors are grateful to the Oxford-Man Institute of Quantitative Finance for help in supporting this research. The first author would also like to thank Exeter College (University of Oxford) for funding.

References

[1] D. Berg and K. Aas, “Models for construction of multivariate dependence – a comparison study,” The European Journal of Finance, vol. 15, no. 7-8, pp. 639–659, 2009.

[2] M. M. Dacorogna, An Introduction to High-Frequency Finance, Academic Press, New York, NY, USA, 2001.

[3] J. Hlinka, M. Palus, M. Vejmelka, D. Mantini, and M. Corbetta, “Functional connectivity in resting-state fMRI: is linear correlation sufficient?” NeuroImage, vol. 54, no. 3, pp. 2218–2225, 2011.

[4] S. J. Devlin, R. Gnanadesikan, and J. R. Kettenring, “Robust estimation and outlier detection with correlation coefficients,” Biometrika, vol. 62, no. 3, pp. 531–545, 1975.

[5] A. R. Cowan, “Nonparametric event study tests,” Review of Quantitative Finance and Accounting, vol. 2, no. 4, pp. 343–358, 1992.

[6] G. J. Glasser and R. F. Winter, “Critical values of the coefficient of rank correlation for testing the hypothesis of independence,” Biometrika, vol. 48, no. 3-4, pp. 444–448, 1961.

[7] P. Embrechts, F. Lindskog, and A. McNeil, “Modelling dependence with copulas and applications to risk management,” in Handbook of Heavy Tailed Distributions in Finance, vol. 1, chapter 8, pp. 329–384, 2003.

[8] E. Jondeau, S. H. Poon, and M. Rockinger, Financial Modeling Under Non-Gaussian Distributions, Springer, New York, NY, USA, 2007.

[9] J. D. Fermanian and O. Scaillet, “Some statistical pitfalls in copula modeling for financial applications,” in Capital Formation, Governance and Banking, p. 59, 2005.

[10] T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley-Interscience, New York, NY, USA, 2006.

[11] A. Kraskov, H. Stogbauer, and P. Grassberger, “Estimating mutual information,” Physical Review E, vol. 69, no. 6, Article ID 066138, 16 pages, 2004.

[12] S. Khan, S. Bandyopadhyay, and A. R. Ganguly, “Computing mutual information based nonlinear dependence among noisy and finite geophysical time series,” in Proceedings of the American Geophysical Union Fall Meeting, 2005, abstract no. NG22A-03.

[13] M. M. V. Hulle, “Edgeworth approximation of multivariate differential entropy,” Neural Computation, vol. 17, no. 9, pp. 1903–1910, 2005.


[14] L. B. Almeida, “Linear and nonlinear ICA based on mutual information,” in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC ’00), pp. 117–122, IEEE, 2000.

[15] A. D. Back and A. S. Weigend, “A first application of independent component analysis to extracting structure from stock returns,” International Journal of Neural Systems, vol. 8, no. 4, pp. 473–484, 1997.

[16] E. Oja, K. Kiviluoto, and S. Malaroiu, “Independent component analysis for financial time series,” in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC ’00), pp. 111–116, IEEE, 2000.

[17] A. Hyvarinen, “Survey on independent component analysis,” Neural Computing Surveys, vol. 2, no. 4, pp. 94–128, 1999.

[18] C. J. Lu, T. S. Lee, and C. C. Chiu, “Financial time series forecasting using independent component analysis and support vector regression,” Decision Support Systems, vol. 47, no. 2, pp. 115–125, 2009.

[19] H. Joe, “Relative entropy measures of multivariate dependence,” Journal of the American Statistical Association, vol. 84, no. 405, pp. 157–164, 1989.

[20] M. Novey and T. Adali, “Complex ICA by negentropy maximization,” IEEE Transactions on Neural Networks, vol. 19, no. 4, pp. 596–609, 2008.

[21] S. Roberts and R. Everson, Independent Component Analysis: Principles and Practice, Cambridge University Press, New York, NY, USA, 2001.

[22] J. Palmer, K. Kreutz-Delgado, and S. Makeig, “Super-Gaussian mixture source model for ICA,” in Independent Component Analysis and Blind Signal Separation, pp. 854–861, 2006.

[23] R. A. Choudrey and S. J. Roberts, “Variational mixture of Bayesian independent component analyzers,” Neural Computation, vol. 15, no. 1, pp. 213–252, 2003.

[24] R. Everson and S. Roberts, “Independent component analysis: a flexible nonlinearity and decorrelating manifold approach,” Neural Computation, vol. 11, no. 8, pp. 1957–1983, 1999.

[25] R. E. Kass, L. Tierney, and J. B. Kadane, “Laplace’s method in Bayesian analysis,” in Statistical Multiple Integration: Proceedings of the AMS-IMS-SIAM Joint Summer Research Conference [on Statistical Multiple Integration], Held at Humboldt University, June 17–23, 1989, vol. 115, p. 89, American Mathematical Society, 1991.

[26] W. Addison and S. Roberts, “Blind source separation with non-stationary mixing using wavelets,” in Proceedings of the ICA Research Network Workshop, The University of Liverpool, 2006.

[27] N. J. Higham, “Matrix nearness problems and applications,” in Applications of Matrix Theory, 1989.

[28] K. B. Datta, Matrix and Linear Algebra, PHI Learning, 2004.

[29] K. Boudt, J. Cornelissen, and C. Croux, “The Gaussian rank correlation estimator: robustness properties,” Statistics and Computing, vol. 22, no. 2, pp. 471–483, 2012.

[30] D. Evans, “A computationally efficient estimator for mutual information,” Proceedings of the Royal Society A, vol. 464, no. 2093, pp. 1203–1215, 2008.

[31] K. Torkkola, “On feature extraction by mutual information maximization,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’02), vol. 1, pp. I821–I824, IEEE, May 2002.

[32] K. Nagarajan, B. Holland, C. Slatton, and A. D. George, “Scalable and portable architecture for probability density function estimation on FPGAs,” in Proceedings of the 16th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM ’08), pp. 302–303, IEEE Computer Society, April 2008.

[33] C. G. Bowsher, “Modelling security market events in continuous time: intensity based, multivariate point process models,” Journal of Econometrics, vol. 141, no. 2, pp. 876–912, 2007.

[34] B. Peiers, “Informed traders, intervention, and price leadership: a deeper view of the microstructure of the foreign exchange market,” Journal of Finance, vol. 52, no. 4, pp. 1589–1614, 1997.

[35] A. Dionisio, R. Menezes, and D. A. Mendes, “Mutual information: a measure of dependency for nonlinear time series,” Physica A, vol. 344, no. 1-2, pp. 326–329, 2004.

[36] J. Y. Campbell, A. W. Lo, A. C. MacKinlay, and R. F. Whitelaw, “The econometrics of financial markets,” Macroeconomic Dynamics, vol. 2, no. 4, pp. 559–562, 1998.

[37] G. Koutmos, C. Negakis, and P. Theodossiou, “Stochastic behaviour of the Athens stock exchange,” Applied Financial Economics, vol. 3, no. 2, pp. 119–126, 1993.

[38] D. M. Guillaume, M. M. Dacorogna, R. R. Dave, U. A. Muller, R. B. Olsen, and O. V. Pictet, “From the bird’s eye to the microscope: a survey of new stylized facts of the intra-daily foreign exchange markets,” Finance and Stochastics, vol. 1, no. 2, pp. 95–129, 1997.

[39] R. Cheng, “Using Pearson type IV and other Cinderella distributions in simulation,” in Proceedings of the Winter Simulation Conference (WSC ’11), pp. 457–468, IEEE, 2011.

[40] Y. Nagahara, “The PDF and CF of Pearson type IV distributions and the ML estimation of the parameters,” Statistics and Probability Letters, vol. 43, no. 3, pp. 251–264, 1999.

[41] A. Shephard, Pearson IV, Institute for Fiscal Studies, University College London, 2008.

[42] S. Stavroyiannis, I. Makris, V. Nikolaidis, and L. Zarangas, “Econometric modeling and value-at-risk using the Pearson type IV distribution,” International Review of Financial Analysis, vol. 22, pp. 10–17, 2012.

[43] M. Kendall, A. Stuart, and J. K. Ord, Kendall’s Advanced Theory of Statistics, Charles Griffin, 1987.

[44] R. Willink, “A closed-form expression for the Pearson type IV distribution function,” Australian and New Zealand Journal of Statistics, vol. 50, no. 2, pp. 199–205, 2008.

[45] A. Venelli, “Efficient entropy estimation for mutual information analysis using B-splines,” in Information Security Theory and Practices: Security and Privacy of Pervasive Systems and Smart Devices, pp. 17–30, 2010.

[46] C. G. Park and D. W. Shin, “An algorithm for generating correlated random variables in a class of infinitely divisible distributions,” Journal of Statistical Computation and Simulation, vol. 61, no. 1-2, pp. 127–139, 1998.

[47] R. L. Iman and W. J. Conover, “A distribution-free approach to inducing rank correlation among input variables,” Communications in Statistics-Simulation and Computation, vol. 11, no. 3, pp. 311–334, 1982.

[48] N. Shah and S. Roberts, “Hidden Markov independent component analysis as a measure of coupling in multivariate financial time series,” in Proceedings of the ICA Research Network International Workshop, Liverpool, UK, 2008.

[49] A. Rossi and G. M. Gallo, “Volatility estimation via hidden Markov models,” Journal of Empirical Finance, vol. 13, no. 2, pp. 203–230, 2006.

[50] J. Crotty, “Structural causes of the global financial crisis: a critical assessment of the ‘new financial architecture’,” Cambridge Journal of Economics, vol. 33, no. 4, pp. 563–580, 2009.


[51] J. A. Murphy, “An analysis of the financial crisis of 2008: causes and solutions,” Social Science Research Network, 2008.

[52] M. Pojarliev and R. M. Levich, “Detecting crowded trades in currency funds,” Financial Analysts Journal, vol. 67, no. 1, pp. 26–39, 2011.

[53] L. Sandoval and I. D. P. Franca, “Correlation of financial markets in times of crisis,” Physica A, vol. 391, no. 1, pp. 187–208, 2012.


for all rows $i$. For a set of observed signals to be completely decoupled, their latent independent components must be the same as the observed signals; therefore, the row-normalised unmixing matrix for decoupled signals ($\mathbf{W}_0$) must be a permutation of the identity matrix ($\mathbf{I}$):

$$\mathbf{W}_0 = \mathbf{P}\mathbf{I} \in \mathbb{R}^{N \times N}, \tag{19}$$

where $\mathbf{P}$ is a permutation matrix. For the case where the observed signals are completely coupled, all the latent independent components must be the same; therefore, the row-normalised unmixing matrix for completely coupled signals ($\mathbf{W}_1$) is given by

$$\mathbf{W}_1 = \frac{1}{\sqrt{N}}\mathbf{K} \in \mathbb{R}^{N \times N}, \tag{20}$$

where $\mathbf{K}$ is the unit matrix (a matrix of ones).

To calculate coupling, we need to consider the distance between any arbitrary unmixing matrix ($\mathbf{W}$) and the zero coupling matrix ($\mathbf{W}_0$). The distance measure we use is the generalised 2-norm, also called the spectral norm, of the difference between the two matrices [27], although other norms can also be used to obtain similar results. The spectral norm of a matrix corresponds to its largest singular value and is the matrix equivalent of the vector Euclidean norm. Hence, the distance $d(\mathbf{W}, \mathbf{W}_0)$ between the two matrices can be written as

$$d(\mathbf{W}, \mathbf{W}_0) = \|\mathbf{W} - \mathbf{W}_0\|_2, \tag{21}$$

where $\|\cdot\|_2$ is the spectral norm of the matrix. As $\mathbf{W}_0$ is a permutation of the identity matrix,

$$d(\mathbf{W}, \mathbf{W}_0) = \|\mathbf{W} - \mathbf{P}\mathbf{I}\|_2. \tag{22}$$

As the spectral norm of a matrix is independent of its permutations, we may define another permutation matrix ($\hat{\mathbf{P}}$) such that

$$d(\mathbf{W}, \mathbf{W}_0) = \|\hat{\mathbf{P}}\mathbf{W} - \mathbf{I}\|_2. \tag{23}$$

For this equation, the following equality holds:

$$d(\mathbf{W}, \mathbf{W}_0) = \|\hat{\mathbf{P}}\mathbf{W}\|_2 - 1. \tag{24}$$

Again, noting that $\|\hat{\mathbf{P}}\mathbf{W}\|_2 = \|\mathbf{W}\|_2$, we have

$$d(\mathbf{W}, \mathbf{W}_0) = \|\mathbf{W}\|_2 - 1. \tag{25}$$

We normalise this measure with respect to the range over which the distance measure can vary, that is, the distance between matrices representing completely coupled ($\mathbf{W}_1$) and decoupled ($\mathbf{W}_0$) signals. From (20), we have that $\mathbf{W}_1 = (1/\sqrt{N})\mathbf{K}$; therefore,

$$d(\mathbf{W}_1, \mathbf{W}_0) = \|\mathbf{W}_1 - \mathbf{W}_0\|_2 = \left\| \frac{1}{\sqrt{N}}\mathbf{K} - \mathbf{P}\mathbf{I} \right\|_2. \tag{26}$$

Using the same analysis as presented previously, this equation can be simplified to

$$d(\mathbf{W}_1, \mathbf{W}_0) = \left\| \frac{1}{\sqrt{N}}\mathbf{K} \right\|_2 - 1. \tag{27}$$

For an $N$-dimensional square unit matrix, the spectral norm is given by $\|\mathbf{K}\|_2 = N$. Therefore, for a row-normalised unit matrix, the spectral norm is $\|(1/\sqrt{N})\mathbf{K}\|_2 = \sqrt{N}$. Hence, if $\mathbf{W}$ is row-normalised, (27) can be written as

$$d(\mathbf{W}_1, \mathbf{W}_0) = \sqrt{N} - 1. \tag{28}$$
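The value $\|\mathbf{K}\|_2 = N$ quoted above can be verified directly from the fact that the spectral norm is the square root of the largest eigenvalue of $\mathbf{K}^{\top}\mathbf{K}$. The short derivation below is added as a sketch for completeness; $\mathbf{1}$ denotes the $N$-dimensional column vector of ones, a symbol introduced here for convenience only.

```latex
\begin{align*}
\mathbf{K} &= \mathbf{1}\mathbf{1}^{\top}, \qquad \mathbf{1}^{\top}\mathbf{1} = N, \\
\mathbf{K}^{\top}\mathbf{K} &= \mathbf{1}\,(\mathbf{1}^{\top}\mathbf{1})\,\mathbf{1}^{\top} = N\,\mathbf{K}, \\
\lambda_{\max}(\mathbf{K}^{\top}\mathbf{K}) &= N\,\lambda_{\max}(\mathbf{K}) = N^{2}
  \quad \text{(since } \mathbf{K}\mathbf{1} = N\mathbf{1} \text{ and } \mathbf{K} \text{ has rank one)}, \\
\|\mathbf{K}\|_2 &= \sqrt{N^{2}} = N, \qquad
\left\| \tfrac{1}{\sqrt{N}}\mathbf{K} \right\|_2 = \tfrac{1}{\sqrt{N}}\,N = \sqrt{N}.
\end{align*}
```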

The normalised information coupling metric ($\eta$) is then defined as

$$\eta = \frac{d(\mathbf{W}, \mathbf{W}_0)}{d(\mathbf{W}_1, \mathbf{W}_0)}. \tag{29}$$

Substituting (25) and (28) into (29), the normalised information coupling between $N$ observed signals is given by

$$\eta = \frac{\|\mathbf{W}\|_2 - 1}{\sqrt{N} - 1}. \tag{30}$$

We can consider the bounds of $\eta$ as described below. Suppose $\mathbf{M}$ is an arbitrary real matrix. We can look upon the spectral norm of this matrix ($\|\mathbf{M}\|_2 = \|\mathbf{M} - \mathbf{0}\|_2$) as a measure of departure (distance) of $\mathbf{M}$ from a null matrix ($\mathbf{0}$) [28]. The bounds on this norm are given by

$$0 \le \|\mathbf{M}\|_2 \le \infty. \tag{31}$$

If $\mathbf{M}$ is a row-normalised ICA unmixing matrix ($\mathbf{W}$), then (as discussed earlier) it lies between $\mathbf{W}_0$ and $\mathbf{W}_1$. Hence, the bounds on $\mathbf{W}$ are

$$\|\mathbf{W}_0\|_2 \le \|\mathbf{W}\|_2 \le \|\mathbf{W}_1\|_2. \tag{32}$$

Using (19) and (20), we can write

$$\|\mathbf{P}\mathbf{I}\|_2 \le \|\mathbf{W}\|_2 \le \left\| \frac{1}{\sqrt{N}}\mathbf{K} \right\|_2, \tag{33}$$

which can be simplified to

$$1 \le \|\mathbf{W}\|_2 \le \sqrt{N}. \tag{34}$$

Rearranging terms in this inequality gives

$$0 \le \frac{\|\mathbf{W}\|_2 - 1}{\sqrt{N} - 1} \le 1, \tag{35}$$

which gives us the same coupling metric as in (30) and shows that the metric is normalised, that is, $0 \le \eta \le 1$. For real-valued $\mathbf{W}$, $\|\mathbf{W}\|_2$ can be written as

$$\|\mathbf{W}\|_2 = \sqrt{\lambda_{\max}(\mathbf{W}^{\top}\mathbf{W})}, \tag{36}$$

where $\lambda_{\max}(\mathbf{W}^{\top}\mathbf{W})$ is the maximum eigenvalue of $\mathbf{W}^{\top}\mathbf{W}$. The unmixing matrix obtained using most ICA algorithms, including icadec, suffers from row permutation and sign ambiguity problems; that is, the rows are arranged in a random order and the sign of elements in each row is unknown [21, 24]. We note that, as $\|\mathbf{W}\|_2$ is independent of the sign and permutations of the rows of $\mathbf{W}$, our measure of information coupling straightaway addresses the problems of ICA sign and permutation ambiguities. Also, as the metric's value is independent of the row permutations of $\mathbf{W}$, it provides symmetric results. The information coupling metric is valid for all dimensions of the unmixing matrix $\mathbf{W}$. This implies that information coupling can be easily measured in high-dimensional spaces.
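As an illustration of (30) and (36), the following plain-Python sketch computes $\eta$ from a given unmixing matrix. It is a sketch under stated assumptions rather than the implementation used for the results in this paper: the function names are ours, the largest eigenvalue of $\mathbf{W}^{\top}\mathbf{W}$ is found by power iteration (our choice of method), and row normalisation is assumed to mean scaling each row to unit Euclidean length, which is consistent with (20).

```python
import math


def spectral_norm(matrix, iterations=200):
    """Largest singular value, i.e. sqrt(lambda_max(M^T M)) as in (36), by power iteration."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [1.0 + 0.1 * k for k in range(n_cols)]  # arbitrary non-degenerate start vector
    scale = math.sqrt(sum(x * x for x in v))
    v = [x / scale for x in v]
    lam = 0.0
    for _ in range(iterations):
        mv = [sum(matrix[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        lam = sum(x * x for x in mv)  # Rayleigh quotient v^T (M^T M) v for unit v
        w = [sum(matrix[i][j] * mv[i] for i in range(n_rows)) for j in range(n_cols)]
        scale = math.sqrt(sum(x * x for x in w))
        if scale == 0.0:
            break
        v = [x / scale for x in w]
    return math.sqrt(lam)


def row_normalise(matrix):
    """Scale every row to unit Euclidean length (assumed normalisation)."""
    return [[x / math.sqrt(sum(y * y for y in row)) for x in row] for row in matrix]


def information_coupling(w):
    """Normalised coupling (||W||_2 - 1) / (sqrt(N) - 1) for an N x N matrix, N >= 2."""
    n = len(w)
    if n < 2:
        raise ValueError("information coupling needs at least two signals")
    return (spectral_norm(row_normalise(w)) - 1.0) / (math.sqrt(n) - 1.0)
```

Calling `information_coupling` on an identity matrix gives a value of zero (up to rounding), and on a matrix of ones gives a value of one, matching the two limiting cases $\mathbf{W}_0$ and $\mathbf{W}_1$. Flipping the sign of a row or reordering the rows leaves the result unchanged, which is the ambiguity-free behaviour described above.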

It is possible to obtain a measure of uncertainty in our estimation of the information coupling measure. We make use of the BFGS quasi-Newton optimisation approach to infer the most probable skew-symmetric matrix $\mathbf{J}$ of (9), from which estimates for the unmixing matrix $\mathbf{W}$ can be obtained and thence the coupling measure $\eta$ calculated. We also estimate the Hessian (inverse covariance) matrix $\mathbf{H}$ for $\mathbf{J}$ as part of this process. Hence, it is possible to draw samples, $\mathbf{J}'$ say, from the distribution over $\mathbf{J}$ as a multivariate normal:

$$\mathbf{J}' \sim \mathcal{MN}(\mathbf{J}, \mathbf{H}^{-1}). \tag{37}$$

These samples can be readily transformed to samples in $\eta$ using (9), (8), and (30), respectively. Confidence bounds (here we use the 95% bounds) may then be easily obtained from the set of samples for $\eta$ (in our analysis, we use 100 samples).
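The sampling step in (37) can be sketched as follows. This is an illustrative sketch only, with several assumptions: the free (above-diagonal) parameters of $\mathbf{J}$ are held in a flat list `j_hat`; the mapping from a sampled $\mathbf{J}'$ to $\eta$ through (9), (8), and (30) is supplied by the caller as a function `j_to_eta`, because those equations are not reproduced here; and percentiles are taken by a simple nearest-rank rule. All names are hypothetical.

```python
import math
import random


def cholesky(h):
    """Lower-triangular L with L L^T = h, for a symmetric positive-definite h."""
    n = len(h)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = h[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def sample_j(j_hat, low, rng):
    """Draw one sample from N(j_hat, H^{-1}) given the Cholesky factor of H."""
    n = len(j_hat)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [0.0] * n
    for i in reversed(range(n)):  # back-substitution for L^T y = z, so cov(y) = H^{-1}
        y[i] = (z[i] - sum(low[k][i] * y[k] for k in range(i + 1, n))) / low[i][i]
    return [m + d for m, d in zip(j_hat, y)]


def coupling_bounds(j_hat, hessian, j_to_eta, n_samples=100, seed=0):
    """Return (5th percentile, median, 95th percentile) of eta over sampled J'."""
    rng = random.Random(seed)
    low = cholesky(hessian)
    etas = sorted(j_to_eta(sample_j(j_hat, low, rng)) for _ in range(n_samples))

    def percentile(p):
        return etas[min(n_samples - 1, int(round(p / 100.0 * (n_samples - 1))))]

    return percentile(5), percentile(50), percentile(95)
```

Working with the Cholesky factor of $\mathbf{H}$ avoids forming $\mathbf{H}^{-1}$ explicitly. A sharply peaked posterior (large Hessian entries) yields a narrow interval, while a flat one yields a wide interval, which is the behaviour the confidence contours in the figures are meant to convey.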

4.2. Computational Complexity. The information coupling algorithm achieves computational efficiency by making use of the sliding-window based decorrelating manifold approach to ICA. Making use of the reciprocal cosh based ICA source model also results in significant computational advantages. We now take a look at the comparative computational complexity of the information coupling measure and three frequently used measures of statistical dependence, that is, linear correlation, rank correlation, and mutual information. For bivariate data ($n_s$ samples long), for which these four measures are directly comparable, linear correlation and rank correlation have time complexities of order $O(n_s)$ and $O(n_s \log n_s)$, respectively [29], while mutual information and information coupling scale as $O(n_s^2)$ and $O(n_s)$, respectively (various estimation algorithms have been proposed for efficient computation of mutual information; however, they all result in increased estimation errors and require careful selection of various user-defined parameters [30]) [31]. Hence, even though the time complexity of the information coupling measure is of the same order as that of linear correlation, it can still accurately capture statistical dependencies in non-Gaussian data streams and is a computationally efficient proxy for mutual information.

For $N$-dimensional multivariate data, direct computation of mutual information has time complexity of order $O(n_s^N)$, compared to $O(n_s N^3)$ for the information coupling measure. In high dimensions, even an approximation for mutual information can be computationally very costly. For example, using a Parzen-window density estimator, the mutual information computational complexity can be reduced to $O(n_s n_b^N)$, where $n_b$ is the number of bins used for estimation [32], which will incur a very high computational cost even for relatively small values of $N$, $n_b$, and $n_s$. As a simple example, Table 1 shows a comparison of computation time (in seconds) taken by the mutual information and information coupling measures for analysing bivariate data sets of varying lengths. As expected, mutual information estimation using the Parzen window based approach (which is considered to be a relatively efficient approach to compute mutual information) becomes computationally very demanding with an increase in the number of samples of the bivariate data set. In contrast, the information coupling measure is computationally efficient even when used to analyse very large high-dimensional multivariate data sets.

Table 1: Example showing comparison of average computation time (in seconds) of mutual information and information coupling when these measures are used to analyse bivariate data sets containing different numbers of samples ($n_s$). The approach used to estimate mutual information is a Parzen window based algorithm, as described in [45]. The computational cost of this algorithm is dependent on the window-size ($h$). The values of $h$ used for the simulations are $h = 20$ for $n_s = 10^2$ and $h = 100$ for all other simulations. Results are obtained using a 2.66 GHz processor as an average of 100 simulations.

Computation time (sec)    $n_s = 10^2$    $n_s = 10^3$    $n_s = 10^4$    $n_s = 10^5$
Mutual information        0.0214          2.8289          17.0101         119.5088
Information coupling      0.0073          0.0213          0.0561          0.5543

4.3. Capturing Market Dynamics. To dynamically analyse information coupling in multivariate financial data streams, we need to make use of windowing techniques. Financial markets give rise to well-defined events, such as orders, trades, and quote revisions. These events are irregularly spaced in clock-time; that is, they are asynchronous. Statistical models in clock-time make use of data aggregated over fixed intervals of time [33]. The time at which these events are indexed is called the event-time. Hence, for dynamic modelling in event-time, the number of data points can be regarded as fixed while time varies, whereas in clock-time the time period is considered to be fixed with a variable number of data points. Although we may need adaptive windows in clock-time, we can use sliding-windows of fixed length in event-time. Using fixed-length sliding-windows in event-time can be useful for obtaining consistent results when developing and testing different statistical models. Also, statistical models deployed for online analysis of financial data operate best in event-time, as they often need to make decisions as soon as some new market information (such as a quote update) becomes available. Consider an online trading model making use of an adaptive window in event-time. At specific times of the day, for example, at times of major news announcements, trading volume can significantly increase. Hence, more data will be available to the algorithm, and thus the results obtained can be misleading [34]. Using a sliding-window of fixed length in event-time can overcome this problem. The length of the sliding-window needs to be selected appropriately. The financial application for which the model is being used is one of the factors which drives the choice of window length. As a general rule, for trading models, a window of approximately the same size as the average time period between placing trades (inverse of trading frequency) is often used. This makes it possible to accurately capture the rapidly evolving dynamics of the markets over the corresponding period, without the window being so long that it captures only major trends or so short that it captures mainly noise in the data.
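A fixed-length sliding-window in event-time can be sketched with a bounded queue: every incoming event (for example, a quote update) pushes out the oldest one, so the window always holds the same number of data points regardless of how quickly events arrive in clock-time. The function name and interface below are ours and purely illustrative.

```python
from collections import deque


def event_time_windows(events, window_length):
    """Yield the latest `window_length` events each time a new event arrives.

    Nothing is yielded until the first full window is available.
    """
    window = deque(maxlen=window_length)  # the oldest event is dropped automatically
    for event in events:
        window.append(event)
        if len(window) == window_length:
            yield tuple(window)
```

Each yielded window could then be passed to the dependency measure of interest. Because the window length is defined in number of events, a burst of activity around a news announcement shortens the clock-time span of the window but does not change the amount of data the model sees.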

4.4. Discussion. The information coupling model offers multiple advantages when used to analyse multivariate financial data. Here, we summarise some of the main properties of the model, while the empirical results presented in the next section showcase some of its practical benefits.

(i) The information coupling measure, a proxy for mutual information, is able to accurately pick up statistical dependencies in data sets with non-Gaussian distributions (such as financial returns).

(ii) The information coupling algorithm is computationally efficient, which makes it particularly suitable for use in an online dynamic environment. This makes the algorithm especially attractive when dealing with data sampled at high frequencies, because, with the ever-increasing use of high-frequency data, overcoming sources of latency is of utmost importance in a variety of applications in modern financial markets.

(iii) It gives confidence levels on the information coupling measure. This allows us to estimate the uncertainty associated with the measurements.

(iv) The metric provides normalised results; that is, information coupling ranges from 0 for decoupled systems to 1 for completely coupled systems. This makes it easier to analyse results obtained using the metric and to compare its performance with other similar measures of association. The metric also gives symmetric results.

(v) The metric is valid for any number of signals in high-dimensional spaces; that is, it consistently gives accurate results irrespective of the number of time series between which information coupling is being computed. This makes it suitable for a range of financial applications.

(vi) It is not data intensive; that is, it gives relatively accurate results even when a small sample size is used. This allows the metric to model the complex and rapidly changing dynamics of financial markets.

(vii) It does not depend on user-defined parameters. Such parameters can restrict the practical utility of a measure, as evolving market conditions may require them to be constantly updated, which may not be practical.

5. Results

We now present a set of synthetic and financial data examples showing the relative accuracy and practical utility of the information coupling measure. The following notation is used for the different measures of statistical dependence in this paper: ICA-based information coupling ($\eta$), linear correlation ($\rho$), rank correlation ($\rho_R$), and normalised mutual information ($I_N$). Normalisation of mutual information values ($I$) is achieved using the following transformation [35]:

$$I_N = \sqrt{1 - \exp(-2I)}. \tag{38}$$
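As a quick numerical check of the transformation in (38), the sketch below applies it to the mutual information of a bivariate Gaussian, for which $I = -\tfrac{1}{2}\ln(1 - \rho^2)$, so that $I_N$ reduces to $|\rho|$. The function names are hypothetical, and the mutual information is assumed to be measured in nats.

```python
import math


def normalised_mutual_information(mi_nats):
    """Map a mutual information value I >= 0 (in nats) to [0, 1), as in (38)."""
    if mi_nats < 0:
        raise ValueError("mutual information must be non-negative")
    return math.sqrt(1.0 - math.exp(-2.0 * mi_nats))


def gaussian_mutual_information(rho):
    """Mutual information (in nats) of a bivariate Gaussian with correlation rho."""
    return -0.5 * math.log(1.0 - rho ** 2)


rho = 0.6
i_n = normalised_mutual_information(gaussian_mutual_information(rho))
print(i_n)  # agrees with |rho| up to floating-point error
```

This is why the normalised value can be compared directly with correlation-type measures on the same 0 to 1 scale.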

The financial data examples we present in this paper primarily make use of spot foreign exchange (FX) data. Spot price, or spot rate, is the price which is actually quoted for a currency transaction to take place. Return is the profit realised when a currency is traded. It is common practice to use the normalised log-returns of financial data in statistical analysis instead of the raw data itself [36]. Using log-returns makes it possible to convert exponential problems into linear ones, thus significantly simplifying relevant analysis. A normalised log-returns data set, with a mean of zero and unit variance, can generally be regarded as a locally stationary process [37]. Therefore, many signal processing techniques meant solely for stationary processes can be successfully applied to the normalised log-returns time series in an adaptive environment. Denoting the mid-price at time $t$ of a financial time series by $P_t$, the log-return value is given by

$$r_t = \ln\left(\frac{P_t}{P_{t-1}}\right). \tag{39}$$

The normalisation of log-returns is achieved by converting the data to a form such that it has a mean of zero and unit variance. This is easily achieved by removing the mean and dividing by the standard deviation of the data.
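The two preprocessing steps, the log-return in (39) followed by normalisation to zero mean and unit variance, can be sketched as follows. The mid-prices are made-up values, and the use of the population standard deviation within the window is an assumption, as the text does not specify which estimator is used.

```python
import math
import statistics


def log_returns(prices):
    """r_t = ln(P_t / P_{t-1}) for consecutive mid-prices, as in (39)."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def normalise(values):
    """Remove the mean and divide by the (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("cannot normalise a constant series")
    return [(v - mean) / std for v in values]


# Hypothetical mid-prices for one currency pair.
mid_prices = [1.3050, 1.3052, 1.3049, 1.3055, 1.3051, 1.3058]
z = normalise(log_returns(mid_prices))
print(z)
```

In a sliding-window setting, the same normalisation would be applied to the returns inside each window rather than to the whole series.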

5.1. Synthetic Data. There is no single distribution which fits financial returns, especially those sampled at higher frequencies, which tend to be highly non-Gaussian [2], although there have been attempts to model returns using a variety of distributions [38]. In this paper, we aim to capture the heavy-tailed, skewed properties of financial returns using a Pearson type IV distribution [39, 40], which can be used to represent distributions with varying degrees of skewness and kurtosis and is thus useful for representing distributions of financial returns [41, 42]. The first four moments of the distribution can be uniquely determined by setting the four parameters which characterise the distribution [43]. Until recently, due to its mathematical and computational complexity, this distribution has not been widely used in the financial literature, although this is rapidly changing with advances in computational power and the proposal of new, improved analytical methods [39, 42, 44].

To test the accuracy of various measures of dependence, we need to generate coupled non-Gaussian synthetic data with known, predefined correlation values. There is no straightforward way to simulate correlated random variables when their joint distribution is not known [46], as is the case with multivariate financial returns. One possible method that can be used to induce any desired predefined correlation between independent randomly distributed variables, irrespective of their distributions, is commonly known as the Iman-Conover method, as presented in [47]. This method is based on inducing a known dependency structure in samples taken from the input independent marginal distributions using reordering techniques. The multivariate coupled structure obtained as the output can thus be used as the input data in various models of dependency analysis to test their relative accuracies.

Table 2: Accuracy of four measures of dependence, that is, information coupling ($\eta$), linear correlation ($\rho$), rank correlation ($\rho_R$), and normalised mutual information ($I_N$), when used to estimate the level of dependence in a coupled system with varying levels of true correlation ($\rho_{TC}$). $\rho_{TC}$ is induced in an independent bivariate randomly distributed system using the Iman-Conover method, as described in the text. The dependence estimates, together with their standard deviation values, are obtained using 1000 independent simulations, each using a data set 1000 data points long. The last row of the table gives the mean absolute error (MAE).

| $\rho_{TC}$ | $\eta$ | $\rho$ | $\rho_R$ | $I_N$ | $\lvert\rho_{TC} - \eta\rvert$ | $\lvert\rho_{TC} - \rho\rvert$ | $\lvert\rho_{TC} - \rho_R\rvert$ | $\lvert\rho_{TC} - I_N\rvert$ |
|---|---|---|---|---|---|---|---|---|
| 0 | 0.0046 ± 0.0026 | 0.0038 ± 0.0022 | 0.0147 ± 0.0115 | 0.1851 ± 0.0401 | 0.0046 | 0.0038 | 0.0147 | 0.1851 |
| 0.1 | 0.1079 ± 0.0073 | 0.0917 ± 0.0058 | 0.0942 ± 0.0184 | 0.1687 ± 0.0412 | 0.0079 | 0.0083 | 0.0058 | 0.0687 |
| 0.2 | 0.2145 ± 0.0102 | 0.1862 ± 0.0093 | 0.1899 ± 0.0177 | 0.1131 ± 0.0464 | 0.0145 | 0.0138 | 0.0111 | 0.0869 |
| 0.3 | 0.3176 ± 0.0138 | 0.2812 ± 0.0130 | 0.2863 ± 0.0167 | 0.1644 ± 0.0538 | 0.0176 | 0.0188 | 0.0137 | 0.1356 |
| 0.4 | 0.4166 ± 0.0210 | 0.3764 ± 0.0165 | 0.3835 ± 0.0153 | 0.2787 ± 0.0500 | 0.0166 | 0.0236 | 0.0165 | 0.1213 |
| 0.5 | 0.5135 ± 0.0196 | 0.4720 ± 0.0197 | 0.4818 ± 0.0136 | 0.3819 ± 0.0475 | 0.0135 | 0.0280 | 0.0182 | 0.1181 |
| 0.6 | 0.6070 ± 0.0347 | 0.5688 ± 0.0226 | 0.5815 ± 0.0117 | 0.4788 ± 0.0458 | 0.0070 | 0.0312 | 0.0185 | 0.1212 |
| 0.7 | 0.7011 ± 0.0232 | 0.6670 ± 0.0242 | 0.6830 ± 0.0093 | 0.5728 ± 0.0483 | 0.0011 | 0.0330 | 0.0170 | 0.1272 |
| 0.8 | 0.7936 ± 0.0239 | 0.7676 ± 0.0249 | 0.7864 ± 0.0066 | 0.6652 ± 0.0490 | 0.0064 | 0.0324 | 0.0136 | 0.1348 |
| 0.9 | 0.8864 ± 0.0252 | 0.8713 ± 0.0249 | 0.8919 ± 0.0034 | 0.7595 ± 0.0501 | 0.0136 | 0.0287 | 0.0081 | 0.1405 |
| 1.0 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000 | 0.9601 ± 0.0185 | 0.0000 | 0.0000 | 0.0000 | 0.0399 |
| MAE | | | | | 0.0093 | 0.0201 | 0.0125 | 0.1163 |
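The reordering idea behind the Iman-Conover method can be sketched in a few lines. The code below is a minimal illustration following the general scheme of [47] (normal scores, a Cholesky-based transformation, then rank reordering), not the exact implementation used to produce the results in this paper. Two assumptions should be noted: the marginals are Student-$t$ samples, used only as a convenient heavy-tailed stand-in for the Pearson type IV distribution, and the method strictly targets rank correlation, so the linear correlation of the output only approximates the target.

```python
from statistics import NormalDist

import numpy as np


def iman_conover(samples, target_corr, rng):
    """Reorder each column of `samples` so that the columns become coupled.

    The marginal distributions are left untouched; only the pairing of
    observations across columns changes.
    """
    n, k = samples.shape
    # Normal (van der Waerden) scores, shuffled independently per column.
    scores = np.array([NormalDist().inv_cdf(i / (n + 1)) for i in range(1, n + 1)])
    m = np.column_stack([rng.permutation(scores) for _ in range(k)])
    # Remove the small sample correlation of the scores, then impose the target.
    f = np.linalg.cholesky(np.corrcoef(m, rowvar=False))
    q = np.linalg.cholesky(np.asarray(target_corr, dtype=float))
    t = m @ np.linalg.inv(f).T @ q.T
    # Give every input column the same rank ordering as the matching column of t.
    out = np.empty_like(samples, dtype=float)
    for j in range(k):
        ranks = np.argsort(np.argsort(t[:, j]))
        out[:, j] = np.sort(samples[:, j])[ranks]
    return out


rng = np.random.default_rng(0)
# Heavy-tailed stand-in for the marginals (assumption: NOT Pearson type IV).
x = rng.standard_t(df=5, size=(1000, 2))
target = np.array([[1.0, 0.6], [0.6, 1.0]])
y = iman_conover(x, target, rng)
rho = np.corrcoef(y, rowvar=False)[0, 1]
print(rho)
```

Because the output columns are permutations of the input columns, the skewness and kurtosis of each marginal are preserved exactly, which is what makes the approach attractive for mimicking financial returns.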

We set the parameters of the Pearson type IV distribution such that the coupled synthetic data we obtain using the Iman-Conover method has similar properties to financial returns. But first we need to consider some properties of financial returns which we want to mimic. Figures 1(a) and 1(b) show the distributions of two higher-order moments, that is, skewness ($\gamma$) and kurtosis ($\kappa$), for FX spot data sets sampled at three different frequencies. The plots are obtained using a sliding-window of length 50 data points, as an average of all G10 (Group of Ten) currency pairs, covering a period of 8 hours in the case of 0.25 second and 0.5 second sampled data and 2 years in the case of 0.5 hour sampled data. The non-Gaussian (heavy-tailed, skewed) nature of the data is clearly visible. It is interesting to note that the kurtosis value almost never goes below three for any of the data sets, signifying the temporal persistence of non-Gaussianity. We now make use of the Iman-Conover method to induce varying levels of correlation between 1000 samples taken from independent randomly distributed Pearson type IV distributions. A 1000-sample data set makes it easier to accurately induce predefined correlations in the system and also makes it possible to generate data with relatively accurate average kurtosis and skewness values, hence allowing us to accurately capture the higher-order moments of financial returns. As an example, Figures 1(c) and 1(d) show distributions of the kurtosis and skewness of two variables for 1000 independent simulations. Note the similarity of these plots with the average of the corresponding distributions of higher-order moments for financial data, as presented in Figures 1(a) and 1(b). This shows the effectiveness of using synthetic data sampled from a Pearson type IV distribution for capturing higher-order moments of financial returns.
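The sliding-window estimates of skewness and kurtosis referred to above can be sketched as follows. This is an illustrative sketch only: it assumes simple moment-based (biased) estimators, with kurtosis defined so that a Gaussian gives $\kappa = 3$, and it uses Student-$t$ noise as a hypothetical stand-in for returns; the paper does not state which estimators were used.

```python
import numpy as np


def sliding_skew_kurt(x, window=50):
    """Moment-based skewness and kurtosis in every window of length `window`."""
    x = np.asarray(x, dtype=float)
    windows = np.lib.stride_tricks.sliding_window_view(x, window)
    centred = windows - windows.mean(axis=1, keepdims=True)
    m2 = (centred ** 2).mean(axis=1)
    m3 = (centred ** 3).mean(axis=1)
    m4 = (centred ** 4).mean(axis=1)
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a Gaussian, not excess kurtosis
    return skewness, kurtosis


rng = np.random.default_rng(2)
returns = rng.standard_t(df=4, size=5000)  # hypothetical heavy-tailed returns
gamma, kappa = sliding_skew_kurt(returns, window=50)
print(np.median(gamma), np.median(kappa))
```

Histograms of `gamma` and `kappa` over all windows would correspond to the kind of empirical pdfs described for Figure 1.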

Four different approaches are now used to estimate the level of dependence between the output coupled data. The process is repeated 1000 times for each level of true correlation ($\rho_{TC}$). Table 2 presents the results obtained. The results show the accuracy of the information coupling measure when used to analyse non-Gaussian data. For this synthetic data example, on average, the information coupling measure was 53.7% more accurate than the linear correlation measure and 25.6% more accurate than the rank correlation measure. The normalised mutual information provided the least accurate results.
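The summary figures quoted above follow directly from the absolute errors listed in Table 2. The short sketch below recomputes the MAE for each measure and the relative improvement of information coupling over the two correlation measures; the variable and function names are hypothetical, and the relative improvement is assumed to be defined as $1 - \mathrm{MAE}_{\eta} / \mathrm{MAE}_{\text{other}}$ using the rounded MAE values from the table.

```python
# Absolute errors |rho_TC - estimate| from Table 2, for rho_TC = 0, 0.1, ..., 1.0.
abs_err = {
    "eta": [0.0046, 0.0079, 0.0145, 0.0176, 0.0166, 0.0135,
            0.0070, 0.0011, 0.0064, 0.0136, 0.0000],
    "rho": [0.0038, 0.0083, 0.0138, 0.0188, 0.0236, 0.0280,
            0.0312, 0.0330, 0.0324, 0.0287, 0.0000],
    "rho_R": [0.0147, 0.0058, 0.0111, 0.0137, 0.0165, 0.0182,
              0.0185, 0.0170, 0.0136, 0.0081, 0.0000],
    "I_N": [0.1851, 0.0687, 0.0869, 0.1356, 0.1213, 0.1181,
            0.1212, 0.1272, 0.1348, 0.1405, 0.0399],
}

# Mean absolute error per measure (last row of Table 2).
mae = {name: sum(errs) / len(errs) for name, errs in abs_err.items()}
reported_mae = {"eta": 0.0093, "rho": 0.0201, "rho_R": 0.0125, "I_N": 0.1163}


def relative_improvement(mae_new, mae_ref):
    """Percentage reduction in MAE of one measure relative to another."""
    return 100.0 * (1.0 - mae_new / mae_ref)


vs_linear = relative_improvement(reported_mae["eta"], reported_mae["rho"])
vs_rank = relative_improvement(reported_mae["eta"], reported_mae["rho_R"])
print(round(vs_linear, 1), round(vs_rank, 1))
```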

Figure 1: (a, b) Plots showing the normalised pdfs of (a) kurtosis ($\kappa$) and (b) skewness ($\gamma$), obtained using a sliding-window of length 50 data points, as an average of all G10 currency pairs, covering a period of 8 hours in the case of 0.25 second and 0.5 second sampled data and 2 years in the case of 0.5 hour sampled data. The plots clearly show the heavy-tailed, skewed nature of FX returns at all three sampling frequencies. (c, d) An example of the normalised pdf plots showing the average kurtosis and skewness values, respectively, for synthetic data generated using Pearson type IV distributions (variables $x_1$ and $x_2$). The data has properties similar to the average of the distributions presented in (a) and (b). The vertical lines indicate the kurtosis ($\kappa = 3$) and skewness ($\gamma = 0$) values for a Gaussian distribution.

We now extend this example by incorporating data dynamics. The same data generation process as described above is now used to construct a 32000-sample-long bivariate data set in which the induced true correlation changes every 8000 time steps; that is, $\rho_{TC} = 0.2$ when $t = 1, \ldots, 8000$; $\rho_{TC} = 0.4$ when $t = 8001, \ldots, 16000$; $\rho_{TC} = 0.6$ when $t = 16001, \ldots, 24000$; and $\rho_{TC} = 0.8$ when $t = 24001, \ldots, 32000$. A sliding-window 1000 data points wide is used to dynamically measure dependencies in the data set. The resulting temporal information coupling plot, together with the 5th and 95th percentile confidence intervals, is presented in Figure 2(a). The four different coupling regions are clearly visible, together with the step changes in coupling after every 8000 time steps, showing the ability of the algorithm to detect abrupt changes in coupling. The normalised empirical probability distributions over $\eta$ for the four coupling regions are shown in Figure 2(b). Also plotted in the same figure are the normalised empirical pdfs for linear correlation ($\rho$) and rank correlation ($\rho_R$). The mutual information pdf is omitted for clarity, as it gives relatively less accurate results, as presented in Table 2. It is interesting to see how the peaks of the $\eta$ distribution correspond very closely to the $\rho_{TC}$ values, showing the ability of the information coupling model to accurately capture statistical dependencies in a dynamic environment. The least accurate measure in this example is the linear correlation.
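The structure of this dynamic experiment (four regimes of 8000 samples each, analysed with a 1000-point sliding-window) can be reproduced in outline as follows. The sketch makes two simplifying assumptions that differ from the paper: the data are Gaussian and coupled by linear mixing rather than Pearson type IV samples coupled with the Iman-Conover method, and linear correlation is used as a stand-in dependence measure, since the information coupling computation itself is not reproduced here. The `step` argument is likewise a hypothetical convenience to keep the run time short.

```python
import numpy as np

rng = np.random.default_rng(1)
segment_len, window = 8000, 1000
true_corr = [0.2, 0.4, 0.6, 0.8]

segments = []
for rho_tc in true_corr:
    z = rng.standard_normal((segment_len, 2))
    # Mix two independent Gaussian columns so that corr(x1, x2) = rho_tc.
    x2 = rho_tc * z[:, 0] + np.sqrt(1.0 - rho_tc ** 2) * z[:, 1]
    segments.append(np.column_stack([z[:, 0], x2]))
data = np.vstack(segments)  # shape (32000, 2)


def sliding_correlation(data, window, step=250):
    """Linear correlation in windows ending at every `step`-th sample."""
    ends = np.arange(window, len(data) + 1, step)
    rho = np.array([np.corrcoef(data[e - window:e], rowvar=False)[0, 1]
                    for e in ends])
    return ends, rho


ends, rho_t = sliding_correlation(data, window)

# Median estimate per regime, using only windows lying fully inside it.
for i, rho_tc in enumerate(true_corr):
    inside = (ends - window >= i * segment_len) & (ends <= (i + 1) * segment_len)
    print(rho_tc, round(float(np.median(rho_t[inside])), 3))
```

Windows that straddle a regime boundary mix two correlation levels, which is why the transition between regimes appears as a ramp roughly one window length wide rather than an instantaneous jump.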

Figure 2: (a) Temporal variation of information coupling, $\eta(t)$, plotted as a function of time. Step changes in coupling are visible after every 8000 time steps; that is, $\rho_{TC} = 0.2$ when $t = 1, \ldots, 8000$; $\rho_{TC} = 0.4$ when $t = 8001, \ldots, 16000$; $\rho_{TC} = 0.6$ when $t = 16001, \ldots, 24000$; and $\rho_{TC} = 0.8$ when $t = 24001, \ldots, 32000$. Also plotted are the median, $[\eta(t)]_{\text{median}}$, and the 5% and 95% confidence interval contours, $[\eta(t)]^{95}_{5}$. (b) Normalised empirical pdf plots for information coupling ($\eta$), linear correlation ($\rho$), and rank correlation ($\rho_R$). The vertical lines represent the true correlation ($\rho_{TC}$) values. The relative accuracy of the information coupling measure is evident from these results.

5.2. Financial Data. We first present a simple application of the information coupling algorithm to a section of a 0.5 second sampled FX spot log-returns data set (in this paper, all currencies are referred to by their standardised international three-letter codes, as described by the ISO 4217 standard. For the currencies mentioned in this paper, the three-letter codes are USD (US dollar), EUR (Euro), JPY (Japanese yen), GBP (British pound), CHF (Swiss franc), and AUD (Australian dollar)). Figure 3 shows the variation of information coupling and linear correlation with time for EURUSD and GBPUSD. The results are obtained using a 5-minute wide sliding-window. We note that the two measures of dependence frequently give different results, which reflects the inability of linear correlation to capture dependencies in non-Gaussian data streams. We also note that dependencies in FX log-returns exhibit rapidly changing dynamics, often characterised by regions of quasi-stability punctuated by abrupt changes. In [48], we show that these regions of persistence in statistical dependence may be captured using a hidden Markov ICA model, which is a hidden Markov model with an ICA observation model. The information thus obtained can be useful for a range of financial applications, such as finding regions of financial market volatility clustering and persistence [49].

Figure 3: Information coupling ($\eta_t$) and linear correlation ($\rho_t$) plotted as a function of time for a section of a 0.5 second sampled EURUSD and GBPUSD log-returns data set. A 5-minute wide sliding-window is used to obtain the results.

Figure 4: Daily closing mid-prices ($P_t$) of AUDUSD and USDJPY. The two plots are adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

There have been numerous academic studies on the causes and effects of the 2008 financial crisis [50, 51]. However, very few of these have focused on the impact of the crisis on interdependencies in the global FX market. Here, we present a set of examples which give a unique insight into the effects of the crisis on the nature of statistical dependencies in the spot FX market. Accurate estimation of dependencies at times of financial crises is of utmost importance, as these estimates are used by financial practitioners for a range of tasks, such as rebalancing portfolios, accurately pricing options, and deciding on the level of risk-taking.

We first present an application of the information coupling model for detecting temporal changes in dependencies in bivariate FX data streams at times of financial crises. Figure 4 shows the adjusted daily closing mid-prices ($P_t$) for AUDUSD and USDJPY from January 2005 till April 2010. The two plots clearly show an abrupt change in the exchange rates in September-October 2008. This was caused, at the height of the 2008 global financial crisis, by the unwinding of carry trades [52]. Figure 5(a) displays three plots showing the temporal variation of information coupling ($\eta_t$), linear correlation ($\rho_t$), and rank correlation ($\rho_{R,t}$) between AUDUSD and USDJPY log-returns. The plots are obtained using a six-month long sliding-window. We notice the rise in uncertainty of the information coupling measure (Figure 5(b)) right before the crash, with uncertainty decreasing gradually thereafter; this information may be useful to systematically predict upheavals in the market, although we do not carry out this study in detail in this paper. Information about the level of uncertainty can be used as a measure of confidence in the information coupling values and can be useful in various practical decision making scenarios, such as deciding on the capital to deploy for the purpose of trading or selecting stocks (or currencies) for inclusion in a portfolio.

Figure 5: (a) Three approaches used to measure temporal dependencies between AUDUSD and USDJPY log-returns: information coupling ($\eta_t$), linear correlation ($\rho_t$), and rank correlation ($\rho_{R,t}$). (b) Range of confidence intervals ($\Delta[\eta]^{95}_{5} = \eta_{95} - \eta_{5}$) plotted as a function of time, showing the uncertainty associated with the information coupling measurements. The vertical lines correspond to the September 2008 financial meltdown.

Figure 6: Difference between information coupling and linear correlation, $(\eta_t - |\rho_t|)^2$, plotted as a function of time. Also plotted is a measure of non-Gaussianity of the two time series, $\mathrm{JB}^2_{\mathrm{AUDUSD},t} + \mathrm{JB}^2_{\mathrm{USDJPY},t}$, as defined by (40). The two plots are adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

As daily sampled data is generally less non-Gaussian than data sampled at higher frequencies, the three plots in Figure 5(a) are somewhat similar during certain time periods. However, right after the September 2008 crash, the plots significantly deviate from each other. We believe that this is because the nature of the data, in particular its level of non-Gaussianity, has changed. As shown in Figure 6, the distance measure $(\eta_t - |\rho_t|)^2$ between information coupling and linear correlation closely matches the non-Gaussianity of the data under consideration. The two plots are adjusted for ease of comparison. The degree of non-Gaussianity is calculated using the multivariate Jarque-Bera statistic ($\mathrm{JB}_{MV}$), which we define for an $N$-dimensional multivariate data set as

$$\mathrm{JB}_{MV} = \sum_{j=1}^{N} \left[ \frac{n_s}{6} \left( \gamma_j^2 + \frac{(\kappa_j - 3)^2}{4} \right) \right]^2, \tag{40}$$

where $n_s$ is the number of data points (in this case, the size of the sliding-window), $\gamma_j$ is the skewness of the $j$th time series under analysis, and $\kappa_j$ is its kurtosis. This shows that relying solely on correlation measures to model dependencies in multivariate financial time series, even when using data sampled at relatively lower frequencies, can potentially lead to inaccurate results. In contrast, the information coupling model takes into account properties of the data being analysed, resulting in an accurate approach to measure statistical dependencies.
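The non-Gaussianity measure in (40) is straightforward to compute for a single window of returns. The sketch below is one possible reading of the definition, assuming moment-based (biased) estimators of skewness and kurtosis, with kurtosis defined so that a Gaussian gives $\kappa = 3$. The function name and the test data are hypothetical.

```python
import numpy as np


def jarque_bera_mv(window):
    """Multivariate Jarque-Bera statistic of (40) for an (n_s x N) data window."""
    x = np.asarray(window, dtype=float)
    n_s = x.shape[0]
    centred = x - x.mean(axis=0)
    m2 = (centred ** 2).mean(axis=0)
    skewness = (centred ** 3).mean(axis=0) / m2 ** 1.5
    kurtosis = (centred ** 4).mean(axis=0) / m2 ** 2
    # Univariate Jarque-Bera statistic for each of the N series.
    jb = (n_s / 6.0) * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Sum of squared univariate statistics, as in (40).
    return float(np.sum(jb ** 2))


rng = np.random.default_rng(3)
gaussian = rng.standard_normal((130, 2))        # hypothetical window, Gaussian
heavy_tailed = rng.standard_t(df=3, size=(130, 2))  # hypothetical heavy tails
print(jarque_bera_mv(gaussian), jarque_bera_mv(heavy_tailed))
```

Evaluating this function on each position of a sliding-window yields a time series of non-Gaussianity that can be compared with $(\eta_t - |\rho_t|)^2$, as done in Figure 6.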

We now show the utility of the information coupling model for analysing multivariate statistical dependencies. Figure 7 shows the temporal variation of information coupling between four major liquid currency pairs (EURUSD, GBPUSD, USDCHF, and USDJPY). The results are obtained using daily log-returns for a seven-year period and a six-month long sliding-window. Also plotted on the same figure is the FTSE-100 (Financial Times Stock Exchange 100) index for the corresponding time period, which has been adjusted for ease of comparison. The plot clearly shows an abrupt upward shift in coupling between the four currency pairs right at the time of the September 2008 financial meltdown, with a gradual decrease in coupling over the next year. We also notice an increase in the uncertainty associated with the information coupling measure before the 2008 crash. The increase in dependence of financial instruments in times of financial crises has been observed for other asset classes as well [53]. Our unique example, showing the dynamics of multivariate dependencies within the spot FX space, provides further insight into the nature of interdependencies in times of financial crisis.

Figure 7: Information coupling ($\eta_t$) between EURUSD, GBPUSD, USDCHF, and USDJPY log-returns, together with its confidence interval contours $[\eta_t]^{95}_{5}$, plotted as a function of time. Also plotted is the FTSE-100 index, which is adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

6. Conclusions

We present an ICA-based approach to dynamically measure information coupling, as a proxy for mutual information. This approach makes use of ICA as a tool to capture information in the tails of the underlying distributions and is suitable for efficiently and accurately measuring statistical dependencies between multiple non-Gaussian signals. As far as we know, this is the first attempt to quantify multivariate dependencies using information encoded in the ICA unmixing matrix. Our proposed information coupling model has multiple other benefits associated with its practical use. It provides a framework for estimating confidence bounds on the information coupling metric, can be efficiently used to directly model dependencies in high-dimensional spaces, and gives normalised, symmetric results. It has the added advantage of not depending on any user-defined parameters and is not data intensive; that is, it can be used even with relatively small data sets without significantly affecting its performance, an important requirement for analysing data streams with rapidly changing dynamics, such as financial returns. The model makes use of a sliding-window based decorrelating manifold approach to ICA, with a reciprocal cosh source model to infer the ICA unmixing matrix, which results in increased accuracy and efficiency of the algorithm. This gives the information coupling model a computational complexity similar to that of linear correlation, with the accuracy of mutual information.

Acknowledgments

The authors are grateful to the Oxford-Man Institute of Quantitative Finance for help in supporting this research. The first author would also like to thank Exeter College (University of Oxford) for funding.

References

[1] D. Berg and K. Aas, "Models for construction of multivariate dependence - a comparison study," The European Journal of Finance, vol. 15, no. 7-8, pp. 639–659, 2009.

[2] M. M. Dacorogna, An Introduction to High-Frequency Finance, Academic Press, New York, NY, USA, 2001.

[3] J. Hlinka, M. Paluš, M. Vejmelka, D. Mantini, and M. Corbetta, "Functional connectivity in resting-state fMRI: is linear correlation sufficient?" NeuroImage, vol. 54, no. 3, pp. 2218–2225, 2011.

[4] S. J. Devlin, R. Gnanadesikan, and J. R. Kettenring, "Robust estimation and outlier detection with correlation coefficients," Biometrika, vol. 62, no. 3, pp. 531–545, 1975.

[5] A. R. Cowan, "Nonparametric event study tests," Review of Quantitative Finance and Accounting, vol. 2, no. 4, pp. 343–358, 1992.

[6] G. J. Glasser and R. F. Winter, "Critical values of the coefficient of rank correlation for testing the hypothesis of independence," Biometrika, vol. 48, no. 3-4, pp. 444–448, 1961.

[7] P. Embrechts, F. Lindskog, and A. McNeil, "Modelling dependence with copulas and applications to risk management," in Handbook of Heavy Tailed Distributions in Finance, vol. 1, chapter 8, pp. 329–384, 2003.

[8] E. Jondeau, S. H. Poon, and M. Rockinger, Financial Modeling Under Non-Gaussian Distributions, Springer, New York, NY, USA, 2007.

[9] J. D. Fermanian and O. Scaillet, "Some statistical pitfalls in copula modeling for financial applications," in Capital Formation, Governance and Banking, p. 59, 2005.

[10] T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley-Interscience, New York, NY, USA, 2006.

[11] A. Kraskov, H. Stögbauer, and P. Grassberger, "Estimating mutual information," Physical Review E, vol. 69, no. 6, Article ID 066138, 16 pages, 2004.

[12] S. Khan, S. Bandyopadhyay, and A. R. Ganguly, "Computing mutual information based nonlinear dependence among noisy and finite geophysical time series," in Proceedings of the American Geophysical Union Fall Meeting, 2005, abstract no. NG22A-03.

[13] M. M. V. Hulle, "Edgeworth approximation of multivariate differential entropy," Neural Computation, vol. 17, no. 9, pp. 1903–1910, 2005.

[14] L. B. Almeida, "Linear and nonlinear ICA based on mutual information," in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 117–122, IEEE, 2000.

[15] A. D. Back and A. S. Weigend, "A first application of independent component analysis to extracting structure from stock returns," International Journal of Neural Systems, vol. 8, no. 4, pp. 473–484, 1997.

[16] E. Oja, K. Kiviluoto, and S. Malaroiu, "Independent component analysis for financial time series," in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 111–116, IEEE, 2000.

[17] A. Hyvärinen, "Survey on independent component analysis," Neural Computing Surveys, vol. 2, no. 4, pp. 94–128, 1999.

[18] C. J. Lu, T. S. Lee, and C. C. Chiu, "Financial time series forecasting using independent component analysis and support vector regression," Decision Support Systems, vol. 47, no. 2, pp. 115–125, 2009.

[19] H. Joe, "Relative entropy measures of multivariate dependence," Journal of the American Statistical Association, vol. 84, no. 405, pp. 157–164, 1989.

[20] M. Novey and T. Adali, "Complex ICA by negentropy maximization," IEEE Transactions on Neural Networks, vol. 19, no. 4, pp. 596–609, 2008.

[21] S. Roberts and R. Everson, Independent Component Analysis: Principles and Practice, Cambridge University Press, New York, NY, USA, 2001.

[22] J. Palmer, K. Kreutz-Delgado, and S. Makeig, "Super-Gaussian mixture source model for ICA," in Independent Component Analysis and Blind Signal Separation, pp. 854–861, 2006.

[23] R. A. Choudrey and S. J. Roberts, "Variational mixture of Bayesian independent component analyzers," Neural Computation, vol. 15, no. 1, pp. 213–252, 2003.

[24] R. Everson and S. Roberts, "Independent component analysis: a flexible nonlinearity and decorrelating manifold approach," Neural Computation, vol. 11, no. 8, pp. 1957–1983, 1999.

[25] R. E. Kass, L. Tierney, and J. B. Kadane, "Laplace's method in Bayesian analysis," in Statistical Multiple Integration: Proceedings of the AMS-IMS-SIAM Joint Summer Research Conference [on Statistical Multiple Integration] Held at Humboldt University, June 17–23, 1989, vol. 115, p. 89, American Mathematical Society, 1991.

[26] W. Addison and S. Roberts, "Blind source separation with non-stationary mixing using wavelets," in Proceedings of the ICA Research Network Workshop, The University of Liverpool, 2006.

[27] N. J. Higham, "Matrix nearness problems and applications," in Applications of Matrix Theory, 1989.

[28] K. B. Datta, Matrix and Linear Algebra, PHI Learning, 2004.

[29] K. Boudt, J. Cornelissen, and C. Croux, "The Gaussian rank correlation estimator: robustness properties," Statistics and Computing, vol. 22, no. 2, pp. 471–483, 2012.

[30] D. Evans, "A computationally efficient estimator for mutual information," Proceedings of the Royal Society A, vol. 464, no. 2093, pp. 1203–1215, 2008.

[31] K. Torkkola, "On feature extraction by mutual information maximization," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '02), vol. 1, pp. I821–I824, IEEE, May 2002.

[32] K. Nagarajan, B. Holland, C. Slatton, and A. D. George, "Scalable and portable architecture for probability density function estimation on FPGAs," in Proceedings of the 16th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '08), pp. 302–303, IEEE Computer Society, April 2008.

[33] C. G. Bowsher, "Modelling security market events in continuous time: intensity based, multivariate point process models," Journal of Econometrics, vol. 141, no. 2, pp. 876–912, 2007.

[34] B. Peiers, "Informed traders, intervention, and price leadership: a deeper view of the microstructure of the foreign exchange market," Journal of Finance, vol. 52, no. 4, pp. 1589–1614, 1997.

[35] A. Dionisio, R. Menezes, and D. A. Mendes, "Mutual information: a measure of dependency for nonlinear time series," Physica A, vol. 344, no. 1-2, pp. 326–329, 2004.

[36] J. Y. Campbell, A. W. Lo, A. C. MacKinlay, and R. F. Whitelaw, "The econometrics of financial markets," Macroeconomic Dynamics, vol. 2, no. 4, pp. 559–562, 1998.

[37] G. Koutmos, C. Negakis, and P. Theodossiou, "Stochastic behaviour of the Athens stock exchange," Applied Financial Economics, vol. 3, no. 2, pp. 119–126, 1993.

[38] D. M. Guillaume, M. M. Dacorogna, R. R. Dave, U. A. Müller, R. B. Olsen, and O. V. Pictet, "From the bird's eye to the microscope: a survey of new stylized facts of the intra-daily foreign exchange markets," Finance and Stochastics, vol. 1, no. 2, pp. 95–129, 1997.

[39] R. Cheng, "Using Pearson type IV and other Cinderella distributions in simulation," in Proceedings of the Winter Simulation Conference (WSC '11), pp. 457–468, IEEE, 2011.

[40] Y. Nagahara, "The PDF and CF of Pearson type IV distributions and the ML estimation of the parameters," Statistics and Probability Letters, vol. 43, no. 3, pp. 251–264, 1999.

[41] A. Shephard, Pearson IV, Institute for Fiscal Studies, University College London, 2008.

[42] S. Stavroyiannis, I. Makris, V. Nikolaidis, and L. Zarangas, "Econometric modeling and value-at-risk using the Pearson type IV distribution," International Review of Financial Analysis, vol. 22, pp. 10–17, 2012.

[43] M. Kendall, A. Stuart, and J. K. Ord, Kendall's Advanced Theory of Statistics, Charles Griffin, 1987.

[44] R. Willink, "A closed-form expression for the Pearson type IV distribution function," Australian and New Zealand Journal of Statistics, vol. 50, no. 2, pp. 199–205, 2008.

[45] A. Venelli, "Efficient entropy estimation for mutual information analysis using B-splines," in Information Security Theory and Practices: Security and Privacy of Pervasive Systems and Smart Devices, pp. 17–30, 2010.

[46] C. G. Park and D. W. Shin, "An algorithm for generating correlated random variables in a class of infinitely divisible distributions," Journal of Statistical Computation and Simulation, vol. 61, no. 1-2, pp. 127–139, 1998.

[47] R. L. Iman and W. J. Conover, "A distribution-free approach to inducing rank correlation among input variables," Communications in Statistics-Simulation and Computation, vol. 11, no. 3, pp. 311–334, 1982.

[48] N. Shah and S. Roberts, "Hidden Markov independent component analysis as a measure of coupling in multivariate financial time series," in Proceedings of the ICA Research Network International Workshop, Liverpool, UK, 2008.

[49] A. Rossi and G. M. Gallo, "Volatility estimation via hidden Markov models," Journal of Empirical Finance, vol. 13, no. 2, pp. 203–230, 2006.

[50] J. Crotty, "Structural causes of the global financial crisis: a critical assessment of the 'new financial architecture'," Cambridge Journal of Economics, vol. 33, no. 4, pp. 563–580, 2009.


[51] J. A. Murphy, “An analysis of the financial crisis of 2008: causes and solutions,” Social Science Research Network, 2008.

[52] M. Pojarliev and R. M. Levich, “Detecting crowded trades in currency funds,” Financial Analysts Journal, vol. 67, no. 1, pp. 26–39, 2011.

[53] L. Sandoval and I. D. P. Franca, “Correlation of financial markets in times of crisis,” Physica A, vol. 391, no. 1, pp. 187–208, 2012.


problems of ICA sign and permutation ambiguities. Also, as the metric's value is independent of the row permutations of $\mathbf{W}$, it provides symmetric results. The information coupling metric is valid for all dimensions of the unmixing matrix $\mathbf{W}$. This implies that information coupling can be easily measured in high-dimensional spaces.

It is possible to obtain a measure of uncertainty in our estimation of the information coupling measure. We make use of the BFGS quasi-Newton optimisation approach over the most probable skew-symmetric matrix, $\mathbf{J}$ of (9), from which estimates for the unmixing matrix, $\mathbf{W}$, can be obtained and thence the coupling measure, $\eta$, calculated. We also estimate the Hessian (inverse covariance) matrix, $\mathbf{H}$, for $\mathbf{J}$ as part of this process. Hence, it is possible to draw samples, $\mathbf{J}'$ say, from the distribution over $\mathbf{J}$ as a multivariate normal:

$$\mathbf{J}' \sim \mathcal{MN}\left(\mathbf{J}, \mathbf{H}^{-1}\right). \tag{37}$$

These samples can be readily transformed to samples in $\eta$ using (9), (8), and (30), respectively. Confidence bounds (and here we use the 95% bounds) may then be easily obtained from the set of samples for $\eta$ (in our analysis we use 100 samples).
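The sampling procedure around (37) can be illustrated with a minimal sketch. It assumes that the free parameters of $\mathbf{J}$ are held in a vector `j_hat` and that the mapping from $\mathbf{J}$ to $\eta$, that is, (9), (8), and (30), is supplied by the caller as `eta_from_j`. The function names, the use of the 5th and 95th percentiles, and the toy transform in the usage lines are all assumptions made for illustration and are not the transform used in this paper.

```python
import numpy as np


def coupling_confidence_bounds(j_hat, hessian, eta_from_j, n_samples=100,
                               lower=5.0, upper=95.0, seed=0):
    """Percentile bounds on eta from samples J' ~ N(j_hat, H^-1)."""
    rng = np.random.default_rng(seed)
    cov = np.linalg.inv(hessian)      # the Hessian is the inverse covariance
    cov = 0.5 * (cov + cov.T)         # symmetrise against round-off
    samples = rng.multivariate_normal(j_hat, cov, size=n_samples)
    etas = np.array([eta_from_j(j) for j in samples])
    return np.percentile(etas, [lower, upper])


def toy_eta(j):
    """Hypothetical stand-in for the J -> W -> eta mapping (NOT the paper's)."""
    return float(np.tanh(np.abs(j).sum()))


lo, hi = coupling_confidence_bounds(np.array([0.3]), np.array([[400.0]]), toy_eta)
print(f"5th-95th percentile band on eta: [{lo:.3f}, {hi:.3f}]")
```

Passing the transform in as a callable keeps the sampling step independent of how $\eta$ is actually computed, so the same routine would work with the real mapping once it is available.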

4.2. Computational Complexity. The information coupling algorithm achieves computational efficiency by making use of the sliding-window based decorrelating manifold approach to ICA. Making use of the reciprocal cosh based ICA source model also results in significant computational advantages. We now take a look at the comparative computational complexity of the information coupling measure and three frequently used measures of statistical dependence, that is, linear correlation, rank correlation, and mutual information. For bivariate data ($n_s$ samples long), for which these four measures are directly comparable, linear correlation and rank correlation have time complexities of order $O(n_s)$ and $O(n_s \log n_s)$, respectively [29], while mutual information and information coupling scale as $O(n_s^2)$ and $O(n_s)$, respectively (there have been various estimation algorithms proposed for efficient computation of mutual information; however, they all result in increased estimation errors and require careful selection of various user-defined parameters [30]) [31]. Hence, even though the time complexity of the information coupling measure is of the same order as linear correlation, it can still accurately capture statistical dependencies in non-Gaussian data streams and is a computationally efficient proxy for mutual information.

For $N$-dimensional multivariate data, direct computation of mutual information has time complexity of order $O(n_s^N)$, compared to $O(n_s N^3)$ for the information coupling measure. In high dimensions, even an approximation for mutual information can be computationally very costly. For example, using a Parzen-window density estimator, the mutual information computational complexity can be reduced to $O(n_s n_b^N)$, where $n_b$ is the number of bins used for estimation [32], which will incur a very high computational cost even for relatively small values of $N$, $n_b$, and $n_s$. As a simple example, Table 1 shows a comparison of computation time (in seconds) taken by mutual information and information coupling measures for analysing bivariate data sets of varying lengths. As expected, mutual information estimation using the Parzen window based approach (which is considered to be a relatively efficient approach to compute mutual information) becomes computationally very demanding with an increase in the number of samples of the bivariate data set. In contrast, the information coupling measure is computationally efficient, even when used to analyse very large high-dimensional multivariate data sets.

Table 1: Example showing comparison of average computation time (in seconds) of mutual information and information coupling when these measures are used to analyse bivariate data sets containing different numbers of samples ($n_s$). The approach used to estimate mutual information is based on a Parzen window based algorithm, as described in [45]. The computational cost of this algorithm is dependent on the window size ($h$). The values of $h$ used for the simulations are $h = 20$ for $n_s = 10^2$ and $h = 100$ for all other simulations. Results are obtained using a 2.66 GHz processor as an average of 100 simulations.

| Computation time (sec) | $n_s = 10^2$ | $n_s = 10^3$ | $n_s = 10^4$ | $n_s = 10^5$ |
|---|---|---|---|---|
| Mutual information | 0.0214 | 2.8289 | 17.0101 | 119.5088 |
| Information coupling | 0.0073 | 0.0213 | 0.0561 | 0.5543 |
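To make the scaling gap concrete, the three multivariate orders of growth quoted above can be evaluated for one set of illustrative values. The values $n_s = 10^4$, $N = 5$, and $n_b = 10$ are assumptions chosen only for this example, and constant factors are ignored:

$$n_s^N = \left(10^4\right)^5 = 10^{20}, \qquad n_s n_b^N = 10^4 \times 10^5 = 10^{9}, \qquad n_s N^3 = 10^4 \times 125 = 1.25 \times 10^{6}.$$

Under these assumed values, the binned approximation is roughly three orders of magnitude more costly than information coupling, and direct computation of mutual information is out of reach.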

4.3. Capturing Market Dynamics. To dynamically analyse information coupling in multivariate financial data streams, we need to make use of windowing techniques. Financial markets give rise to well-defined events, such as orders, trades, and quote revisions. These events are irregularly spaced in clock-time, that is, they are asynchronous. Statistical models in clock-time make use of data aggregated over fixed intervals of time [33]. The time at which these events are indexed is called the event-time. Hence, for dynamic modelling in event-time, the number of data points can be regarded as fixed while time varies, whereas in clock-time the time period is considered to be fixed with a variable number of data points. Although we may need adaptive windows in clock-time, we can use sliding-windows of fixed length in event-time. Using fixed length sliding-windows in event-time can be useful for obtaining consistent results when developing and testing different statistical models. Also, statistical models deployed for online analysis of financial data operate best in event-time, as they often need to make decisions as soon as some new market information (such as a quote update) becomes available. Consider an online trading model making use of an adaptive window in event-time. At specific times of the day, for example, at times of major news announcements, trading volume can significantly increase. Hence, more data will be available to the algorithm, and thus the results obtained can be misleading [34]. Using a sliding-window of fixed length in event-time can overcome this problem. The length of the sliding-window needs to be selected appropriately. The financial application for which the model is being used is one of the factors which drives the choice of window length. As a general rule, for trading models, a window of approximately the same size as the average time period between placing trades (inverse of trading frequency) is often used. This makes it possible to accurately capture the rapidly evolving dynamics of the markets over the corresponding period, without the window being so long that it only captures major trends or so short that it captures noise in the data.
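The difference between the two windowing schemes can be sketched as follows. This is a minimal illustration, assuming that each market event arrives as a `(timestamp, value)` pair; the function names and the tick values are hypothetical.

```python
from collections import deque


def event_time_windows(events, length):
    """Yield the most recent `length` events each time a new event arrives."""
    window = deque(maxlen=length)
    for event in events:
        window.append(event)
        if len(window) == length:
            yield list(window)


def clock_time_windows(events, duration):
    """Yield every event that lies within `duration` seconds of the newest one."""
    window = deque()
    for timestamp, value in events:
        window.append((timestamp, value))
        # Drop events that have fallen out of the fixed time span.
        while window[0][0] <= timestamp - duration:
            window.popleft()
        yield list(window)


# Hypothetical quotes: a quiet spell followed by a burst of activity.
ticks = [(0.0, 1.10), (0.1, 1.11), (0.2, 1.12), (5.0, 1.13), (5.05, 1.14), (5.1, 1.15)]
event_sizes = [len(w) for w in event_time_windows(ticks, 3)]
clock_sizes = [len(w) for w in clock_time_windows(ticks, 1.0)]
print(event_sizes)  # window size is constant in event-time
print(clock_sizes)  # window size varies with activity in clock-time
```

In event-time every window holds the same number of observations, so estimates computed on successive windows are directly comparable, whereas the clock-time window grows and shrinks with market activity.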

4.4. Discussion. The information coupling model offers multiple advantages when used to analyse multivariate financial data. Here, we summarise some of the main properties of the model, while the empirical results presented in the next section showcase some of its practical benefits.

(i) The information coupling measure, a proxy for mutual information, is able to accurately pick up statistical dependencies in data sets with non-Gaussian distributions (such as financial returns).

(ii) The information coupling algorithm is computationally efficient, which makes it particularly suitable for use in an online dynamic environment. This makes the algorithm especially attractive when dealing with data sampled at high frequencies, because with the ever-increasing use of high-frequency data, overcoming sources of latency is of utmost importance in a variety of applications in modern financial markets.

(iii) It gives confidence levels on the information coupling measure. This allows us to estimate the uncertainty associated with the measurements.

(iv) The metric provides normalised results, that is, information coupling ranges from 0 for decoupled systems to 1 for completely coupled systems. This makes it easier to analyse results obtained using the metric and to compare its performance with other similar measures of association. The metric also gives symmetric results.

(v) The metric is valid for any number of signals in high-dimensional spaces, that is, it consistently gives accurate results irrespective of the number of time series between which information coupling is being computed. This makes it suitable for a range of financial applications.

(vi) It is not data intensive, that is, it gives relatively accurate results even when a small sample size is used. This allows the metric to model the complex and rapidly changing dynamics of financial markets.

(vii) It does not depend on user-defined parameters. Such parameters can restrict the practical utility of a measure, as evolving market conditions may require them to be constantly updated, which may not be practical.

5. Results

We now present a set of synthetic and financial data examples showing the relative accuracy and practical utility of the information coupling measure. The following notations are used for different measures of statistical dependence in this paper: ICA-based information coupling ($\eta$), linear correlation ($\rho$), rank correlation ($\rho_R$), and normalised mutual information ($I_N$). Normalisation of mutual information values ($I$) is achieved using the following transformation [35]:

$$I_N = \sqrt{1 - \exp(-2I)}. \tag{38}$$
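A short sketch of the transformation in (38) is given below. As a sanity check, it uses the standard closed form for the mutual information of a bivariate Gaussian with correlation $\rho$, $I = -\tfrac{1}{2}\ln(1 - \rho^2)$, for which (38) returns exactly $|\rho|$. The function names are assumptions made for illustration.

```python
import math


def normalised_mutual_information(mi):
    """Map mutual information (in nats, >= 0) onto [0, 1) as in (38)."""
    if mi < 0:
        raise ValueError("mutual information must be non-negative")
    return math.sqrt(1.0 - math.exp(-2.0 * mi))


def gaussian_mutual_information(rho):
    """Mutual information (nats) of a bivariate Gaussian with correlation rho."""
    return -0.5 * math.log(1.0 - rho ** 2)


i_n = normalised_mutual_information(gaussian_mutual_information(0.6))
print(i_n)
```

This is why $I_N$ can be compared directly with correlation-type measures on a common 0 to 1 scale.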

The financial data examples we present in this paper primarily make use of spot foreign exchange (FX) data. Spot price or spot rate is the price which is actually quoted for a currency transaction to take place. Return is the profit realised when a currency is traded. It is common practice to use the normalised log-returns of financial data in statistical analysis instead of the raw data itself [36]. Using log-returns makes it possible to convert exponential problems into linear ones, thus significantly simplifying relevant analysis. A normalised log-returns data set, with a mean of zero and unit variance, can generally be regarded as a locally stationary process [37]. Therefore, many signal processing techniques meant solely for stationary processes can be successfully applied to the normalised log-returns time series in an adaptive environment. Denoting the mid-price at time $t$ of a financial time series by $P_t$, the log-return value is given by

$$r_t = \ln\left(\frac{P_t}{P_{t-1}}\right). \tag{39}$$

The normalisation of log-returns is achieved by converting the data to a form such that it has a mean of zero and unit variance. This is easily achieved by removing the mean and dividing by the standard deviation of the data.
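The two steps, (39) followed by normalisation, can be sketched as below. The text does not say whether the population or sample standard deviation is used, so the population form is assumed here; the function name and the price values are hypothetical.

```python
import math
import statistics


def normalised_log_returns(prices):
    """Log-returns r_t = ln(P_t / P_{t-1}), rescaled to zero mean and unit variance."""
    log_returns = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]
    mean = statistics.fmean(log_returns)
    std = statistics.pstdev(log_returns)  # population standard deviation (assumed)
    return [(r - mean) / std for r in log_returns]


# Hypothetical mid-prices, for illustration only.
mid_prices = [1.3000, 1.3010, 1.2990, 1.3020, 1.3015]
z = normalised_log_returns(mid_prices)
print(z)
```

In a sliding-window setting the same normalisation would be applied within each window, so that each window is treated as locally stationary.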

5.1. Synthetic Data. There is no single distribution which fits financial returns, especially those sampled at higher frequencies, which tend to be highly non-Gaussian [2], although there have been attempts to model returns using a variety of distributions [38]. In this paper, we aim to capture the heavy-tailed, skewed properties of financial returns using a Pearson type IV distribution [39, 40], which can be used to represent distributions with varying degrees of skewness and kurtosis and is thus useful for representing distributions of financial returns [41, 42]. The first four moments of the distribution can be uniquely determined by setting four parameters which characterise the distribution [43]. Until recently, due to its mathematical and computational complexity, this distribution has not been widely used in financial literature, although this is rapidly changing with advances in computational power and the proposal of new improved analytical methods [39, 42, 44].

To test the accuracy of various measures of dependence, we need to generate coupled non-Gaussian synthetic data with known, predefined correlation values. There is no straightforward way to simulate correlated random variables when their joint distribution is not known [46], as is the case with multivariate financial returns. One possible method that can be used to induce any desired predefined correlation between independent randomly distributed variables, irrespective of their distributions, is commonly known as the Iman-Conover method, as presented in [47]. This method is based on inducing a known dependency structure in samples taken from the input independent marginal distributions using reordering techniques. The multivariate coupled structure obtained as the output can thus be used as the input data in various models of dependency analysis to test their relative accuracies.

Table 2: Accuracy of four measures of dependence, that is, information coupling ($\eta$), linear correlation ($\rho$), rank correlation ($\rho_R$), and normalised mutual information ($I_N$), when used to estimate the level of dependence in a coupled system with varying levels of true correlation ($\rho_{TC}$). $\rho_{TC}$ is induced in an independent bivariate randomly distributed system using the Iman-Conover method, as described in the text. The dependence estimates, together with their standard deviation values shown in the table, are obtained using 1000 independent simulations, with a 1000 data points long data set for each simulation. The last row of the table gives values for the mean absolute error (MAE).

| $\rho_{TC}$ | $\eta$ | $\rho$ | $\rho_R$ | $I_N$ | $\lvert\rho_{TC} - \eta\rvert$ | $\lvert\rho_{TC} - \rho\rvert$ | $\lvert\rho_{TC} - \rho_R\rvert$ | $\lvert\rho_{TC} - I_N\rvert$ |
|---|---|---|---|---|---|---|---|---|
| 0 | 0.0046 ± 0.0026 | 0.0038 ± 0.0022 | 0.0147 ± 0.0115 | 0.1851 ± 0.0401 | 0.0046 | 0.0038 | 0.0147 | 0.1851 |
| 0.1 | 0.1079 ± 0.0073 | 0.0917 ± 0.0058 | 0.0942 ± 0.0184 | 0.1687 ± 0.0412 | 0.0079 | 0.0083 | 0.0058 | 0.0687 |
| 0.2 | 0.2145 ± 0.0102 | 0.1862 ± 0.0093 | 0.1899 ± 0.0177 | 0.1131 ± 0.0464 | 0.0145 | 0.0138 | 0.0111 | 0.0869 |
| 0.3 | 0.3176 ± 0.0138 | 0.2812 ± 0.0130 | 0.2863 ± 0.0167 | 0.1644 ± 0.0538 | 0.0176 | 0.0188 | 0.0137 | 0.1356 |
| 0.4 | 0.4166 ± 0.0210 | 0.3764 ± 0.0165 | 0.3835 ± 0.0153 | 0.2787 ± 0.0500 | 0.0166 | 0.0236 | 0.0165 | 0.1213 |
| 0.5 | 0.5135 ± 0.0196 | 0.4720 ± 0.0197 | 0.4818 ± 0.0136 | 0.3819 ± 0.0475 | 0.0135 | 0.0280 | 0.0182 | 0.1181 |
| 0.6 | 0.6070 ± 0.0347 | 0.5688 ± 0.0226 | 0.5815 ± 0.0117 | 0.4788 ± 0.0458 | 0.0070 | 0.0312 | 0.0185 | 0.1212 |
| 0.7 | 0.7011 ± 0.0232 | 0.6670 ± 0.0242 | 0.6830 ± 0.0093 | 0.5728 ± 0.0483 | 0.0011 | 0.0330 | 0.0170 | 0.1272 |
| 0.8 | 0.7936 ± 0.0239 | 0.7676 ± 0.0249 | 0.7864 ± 0.0066 | 0.6652 ± 0.0490 | 0.0064 | 0.0324 | 0.0136 | 0.1348 |
| 0.9 | 0.8864 ± 0.0252 | 0.8713 ± 0.0249 | 0.8919 ± 0.0034 | 0.7595 ± 0.0501 | 0.0136 | 0.0287 | 0.0081 | 0.1405 |
| 1.0 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000 | 0.9601 ± 0.0185 | 0.0000 | 0.0000 | 0.0000 | 0.0399 |
| MAE | | | | | 0.0093 | 0.0201 | 0.0125 | 0.1163 |
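The reordering idea behind the Iman-Conover method can be sketched as follows. This is a simplified illustration under stated assumptions rather than the exact procedure used for the experiments: it uses normal (van der Waerden) scores as the reference, a Student-t sample stands in for the Pearson type IV marginals, and the function names are our own.

```python
from statistics import NormalDist

import numpy as np


def iman_conover(samples, target_corr, rng):
    """Reorder each column of `samples` so that the columns become correlated.

    The marginals are untouched: every column keeps exactly the same values,
    only their order changes.
    """
    n, k = samples.shape
    # Normal scores, shuffled independently for each column.
    scores = np.array([NormalDist().inv_cdf(i / (n + 1)) for i in range(1, n + 1)])
    m = np.column_stack([rng.permutation(scores) for _ in range(k)])
    # Cholesky factors of the scores' sample correlation and of the target.
    q = np.linalg.cholesky(np.corrcoef(m, rowvar=False))
    p = np.linalg.cholesky(np.asarray(target_corr, dtype=float))
    # Reference matrix whose sample correlation equals the target.
    reference = m @ np.linalg.inv(q).T @ p.T
    # Give each input column the rank ordering of its reference column.
    out = np.empty_like(samples, dtype=float)
    for j in range(k):
        ranks = np.argsort(np.argsort(reference[:, j]))
        out[:, j] = np.sort(samples[:, j])[ranks]
    return out


def rank_correlation(a, b):
    """Spearman rank correlation (no tie handling; fine for continuous data)."""
    ra, rb = np.argsort(np.argsort(a)), np.argsort(np.argsort(b))
    return np.corrcoef(ra, rb)[0, 1]


rng = np.random.default_rng(0)
# Heavy-tailed stand-in for the Pearson type IV marginals (an assumption).
independent = rng.standard_t(df=5, size=(1000, 2))
coupled = iman_conover(independent, [[1.0, 0.6], [0.6, 1.0]], rng)
print(rank_correlation(coupled[:, 0], coupled[:, 1]))
```

Because only the ordering of each column changes, the skewness and kurtosis of the marginals are preserved exactly, which is the property that makes the method attractive for generating coupled non-Gaussian test data.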

We set parameters of the Pearson type IV distribution such that the coupled synthetic data we obtain using the Iman-Conover method has similar properties to financial returns. But first, we need to consider some properties of financial returns which we want to mimic. Figures 1(a) and 1(b) show the distributions of two higher-order moments, that is, skewness ($\gamma$) and kurtosis ($\kappa$), for FX spot data sets sampled at three different frequencies. The plots are obtained using a sliding-window of length 50 data points as an average of all G10 (Group of Ten) currency pairs, covering a period of 8 hours in the case of 0.25 second and 0.5 second sampled data and 2 years in the case of 0.5 hour sampled data. The non-Gaussian (heavy-tailed, skewed) nature of the data is clearly visible. It is interesting to note that the kurtosis value almost never goes below three for any of the data sets, signifying the temporal persistence of non-Gaussianity. We now make use of the Iman-Conover method to induce varying levels of correlation between 1000 samples taken from independent randomly distributed Pearson type IV distributions. A 1000 sample data set makes it easier to accurately induce predefined correlations in the system and makes it possible to generate data with relatively accurate average kurtosis and skewness values, hence allowing us to accurately capture the higher-order moments of financial returns. As an example, Figures 1(c) and 1(d) show distributions of the kurtosis and skewness of two variables for 1000 independent simulations. Note the similarity of these plots with the average of the corresponding distributions of higher-order moments for financial data, as presented in Figures 1(a) and 1(b). This shows the effectiveness of using synthetic data sampled from a Pearson type IV distribution for capturing higher-order moments of financial returns.

Four different approaches are now used to estimate the level of dependence between the output coupled data. The process is repeated 1000 times for each level of true correlation ($\rho_{TC}$). Table 2 presents the results obtained. The results show the accuracy of the information coupling measure when used to analyse non-Gaussian data. For this synthetic data example, on average, the information coupling measure was 53.7% more accurate than the linear correlation measure and 25.6% more accurate than the rank correlation measure. The normalised mutual information provided the least accurate results.
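The quoted improvements can be reproduced from the mean absolute errors in the last row of Table 2, assuming (this is our reading, as the formula is not stated in the text) that relative accuracy is measured as the fractional reduction in MAE:

$$\frac{\mathrm{MAE}_{\rho} - \mathrm{MAE}_{\eta}}{\mathrm{MAE}_{\rho}} = \frac{0.0201 - 0.0093}{0.0201} \approx 0.537, \qquad \frac{\mathrm{MAE}_{\rho_R} - \mathrm{MAE}_{\eta}}{\mathrm{MAE}_{\rho_R}} = \frac{0.0125 - 0.0093}{0.0125} = 0.256.$$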

We now extend this example by incorporating data dynamics. The same data generation process as described above is now used to construct a 32000 samples long bivariate data set in which the induced true correlation changes every 8000 time steps, that is, $\rho_{TC} = 0.2$ when $t = 1{:}8000$, $\rho_{TC} = 0.4$ when $t = 8001{:}16000$, $\rho_{TC} = 0.6$ when $t = 16001{:}24000$, and $\rho_{TC} = 0.8$ when $t = 24001{:}32000$. A 1000 data points wide sliding-window is used to dynamically measure dependencies in the data set. The resulting temporal information coupling plot, together with the 5th and 95th percentile confidence intervals, is presented in Figure 2(a). The four different coupling regions are clearly visible, together with the step changes in coupling after every 8000 time steps, showing the ability of the algorithm to detect abrupt changes in coupling. The normalised empirical probability distributions over $\eta$ for the four coupling regions are shown in Figure 2(b). Also plotted in the same figure are the normalised empirical pdfs for linear correlation ($\rho$) and rank correlation ($\rho_R$). The mutual information pdf is omitted for clarity, as it gives relatively less accurate results, as presented in Table 2. It is interesting to see how the peaks of the $\eta$ distribution correspond very closely to $\rho_{TC}$ values, showing the ability of the information coupling model to accurately capture statistical dependencies in a dynamic environment. The least accurate measure in this example is the linear correlation.

Figure 1: (a, b) Plots showing the normalised pdfs of (a) kurtosis ($\kappa$) and (b) skewness ($\gamma$), obtained using a sliding-window of length 50 data points as an average of all G10 currency pairs, covering a period of 8 hours in the case of 0.25 second and 0.5 second sampled data and 2 years in the case of 0.5 hour sampled data. The plots clearly show the heavy-tailed, skewed nature of FX returns at all three sampling frequencies. (c, d) An example of the normalised pdf plots showing the average kurtosis and skewness values, respectively, for synthetic data ($x_1$, $x_2$) generated using Pearson type IV distributions. The data has properties similar to the average of the distributions presented in (a) and (b). The vertical lines indicate the kurtosis ($\kappa = 3$) and skewness ($\gamma = 0$) values for a Gaussian distribution.
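The structure of this dynamic experiment can be sketched with a generic sliding-window routine. To keep the example short and self-contained, two substitutions are made and should be read as assumptions: the data are bivariate Gaussian rather than Iman-Conover coupled Pearson type IV samples, and linear correlation stands in for the information coupling measure. The window is also moved in steps of 1000 samples purely to keep the run time small.

```python
import numpy as np


def piecewise_correlated_data(levels, segment_length, rng):
    """Stack bivariate Gaussian segments, one per true correlation level."""
    segments = []
    for rho in levels:
        cov = [[1.0, rho], [rho, 1.0]]
        segments.append(rng.multivariate_normal([0.0, 0.0], cov, size=segment_length))
    return np.vstack(segments)


def sliding_dependence(data, window, step, measure):
    """Apply `measure` to each window of `data`, moving `step` samples at a time."""
    starts = range(0, len(data) - window + 1, step)
    return np.array([measure(data[s:s + window]) for s in starts])


def linear_correlation(block):
    """Stand-in dependence measure; any function of a (window, 2) block works."""
    return np.corrcoef(block[:, 0], block[:, 1])[0, 1]


rng = np.random.default_rng(1)
levels = [0.2, 0.4, 0.6, 0.8]
data = piecewise_correlated_data(levels, 8000, rng)
estimates = sliding_dependence(data, window=1000, step=1000, measure=linear_correlation)
regime_means = estimates.reshape(len(levels), -1).mean(axis=1)
print(regime_means)
```

Any other dependence measure, including an implementation of information coupling, could be passed as `measure` without changing the windowing logic.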

Figure 2: (a) Temporal variation of information coupling, $\eta(t)$, plotted as a function of time. Step changes in coupling are visible after every 8000 time steps, that is, $\rho_{TC} = 0.2$ when $t = 1{:}8000$, $\rho_{TC} = 0.4$ when $t = 8001{:}16000$, $\rho_{TC} = 0.6$ when $t = 16001{:}24000$, and $\rho_{TC} = 0.8$ when $t = 24001{:}32000$. Also plotted are the median, $[\eta(t)]_{\mathrm{median}}$, and the 5% and 95% confidence interval contours, $[\eta(t)]^{95}_{5}$. (b) Normalised empirical pdf plots for information coupling ($\eta$), linear correlation ($\rho$), and rank correlation ($\rho_R$). The vertical lines represent the true correlation ($\rho_{TC}$) values. The relative accuracy of the information coupling measure is evident from these results.

5.2. Financial Data. We first present a simple application of the information coupling algorithm to a section of 0.5 second sampled FX spot log-returns data set (in this paper, all currencies are referred to by their standardised international three-letter codes, as described by the ISO 4217 standard. For the currencies mentioned in this paper, the three-letter codes are USD (US dollar), EUR (Euro), JPY (Japanese yen), GBP (British pound), CHF (Swiss franc), and AUD (Australian dollar)). Figure 3 shows the variation of information coupling and linear correlation with time for EURUSD and GBPUSD. The results are obtained using a 5-minute wide sliding-window. We note that the two measures of dependence frequently give different results, which reflects the inability of linear correlation to capture dependencies in non-Gaussian data streams. We also note that dependencies in FX log-returns exhibit rapidly changing dynamics, often characterised by regions of quasi-stability punctuated by abrupt changes. In [48], we show that these regions of persistence in statistical dependence may be captured using a hidden Markov ICA model, which is a hidden Markov model with an ICA observation model. The information thus obtained can be useful for a range of financial applications, such as finding regions of financial market volatility clustering and persistence [49].

Figure 3: Information coupling ($\eta_t$) and linear correlation ($\rho_t$) plotted as a function of time for a section of 0.5 second sampled EURUSD and GBPUSD log-returns data set. A 5-minute wide sliding-window is used to obtain the results.

Figure 4: Daily closing mid-prices ($P_t$) of AUDUSD and USDJPY. The two plots are adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

There have been numerous academic studies on the causes and effects of the 2008 financial crisis [50, 51]. However, very few of these have focused on the impact of the crisis on interdependencies in the global FX market. Here, we present a set of examples which give a unique insight into the effects of the crisis on the nature of statistical dependencies in the spot FX market. Accurate estimation of dependencies at times of financial crises is of utmost importance, as these estimates are used by financial practitioners for a range of tasks, such as rebalancing portfolios, accurately pricing options, and deciding on the level of risk-taking. We first present an application of the information coupling model for detecting temporal changes in dependencies in bivariate FX data streams at times of financial crises. Figure 4 shows the adjusted daily closing mid-prices ($P_t$) for AUDUSD and USDJPY from January 2005 till April 2010. The two plots clearly show an abrupt change in the exchange rates in September-October 2008. This was caused, at the height of the 2008 global financial crisis, by the unwinding of carry trades [52]. Figure 5(a) displays three plots showing the temporal variation of information coupling ($\eta_t$), linear correlation ($\rho_t$), and rank correlation ($\rho_{R,t}$) between AUDUSD and USDJPY log-returns. The plots are obtained using a six-month long sliding-window. We notice the rise in uncertainty of the information coupling measure (Figure 5(b)) right before the crash, with uncertainty decreasing gradually thereafter; this information may be useful to systematically predict upheavals in the market, although we do not carry out this study in detail in this paper. Information about the level of uncertainty can be used as a measure of confidence in the information coupling values and can be useful in various practical decision making scenarios, such as deciding on the capital to deploy for the purpose of trading or selecting stocks (or currencies) for inclusion in a portfolio. As daily sampled data is generally less non-Gaussian than data sampled at higher frequencies, the three plots in Figure 5(a) are somewhat similar during certain time periods. However, right after the September 2008 crash, the plots significantly deviate from each other. We believe that this is because the nature of the data, in particular its level of non-Gaussianity, has changed. As shown in Figure 6, the distance measure, $(\eta_t - |\rho_t|)^2$, between information coupling and linear correlation closely matches the non-Gaussianity of the data under consideration. The two plots are adjusted for ease of comparison. The degree of non-Gaussianity is calculated using the multivariate Jarque-Bera statistic ($\mathrm{JB}_{\mathrm{MV}}$), which we define for an $N$-dimensional multivariate data set as

$$\mathrm{JB}_{\mathrm{MV}} = \sum_{j=1}^{N} \left[ \frac{n_s}{6} \left( \gamma_j^2 + \frac{(\kappa_j - 3)^2}{4} \right) \right]^2, \tag{40}$$

where $n_s$ is the number of data points (in this case, the size of the sliding-window), $\gamma_j$ is the skewness of the $j$th time series under analysis, and $\kappa_j$ is its kurtosis. This shows that relying solely on correlation measures to model dependencies in multivariate financial time series, even when using data sampled at relatively lower frequencies, can potentially lead to inaccurate results. In contrast, the information coupling model takes into account properties of the data being analysed, resulting in an accurate approach to measure statistical dependencies.

Figure 5: (a) Three approaches used to measure temporal dependencies between AUDUSD and USDJPY log-returns: information coupling ($\eta_t$), linear correlation ($\rho_t$), and rank correlation ($\rho_{R,t}$). (b) Range of confidence intervals ($\Delta[\eta]^{95}_{5} = \eta_{95} - \eta_{5}$) plotted as a function of time, showing the uncertainty associated with the information coupling measurements. The vertical lines correspond to the September 2008 financial meltdown.

Figure 6: Difference between information coupling and linear correlation, $(\eta_t - |\rho_t|)^2$, plotted as a function of time. Also plotted is a measure of non-Gaussianity of the two time series, as defined by (40). The two plots are adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.
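The statistic in (40) can be computed with a few lines of code. The sketch below assumes that skewness and kurtosis are estimated from population (biased) central moments, which the text does not specify; the function names are our own.

```python
def skewness_and_kurtosis(x):
    """Skewness and (non-excess) kurtosis from population central moments."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def jarque_bera_multivariate(series):
    """Sum over series of the squared univariate Jarque-Bera statistic, as in (40)."""
    total = 0.0
    for x in series:
        n_s = len(x)
        gamma, kappa = skewness_and_kurtosis(x)
        jb = n_s / 6.0 * (gamma ** 2 + (kappa - 3.0) ** 2 / 4.0)
        total += jb ** 2
    return total


# Tiny hand-checkable case: skewness 0, kurtosis 1, so JB = (4/6) * (4/4) = 2/3.
jb_mv = jarque_bera_multivariate([[-1.0, 1.0, -1.0, 1.0]])
print(jb_mv)
```

In the sliding-window analysis each element of `series` would be the log-returns of one currency pair within the current window, so that the statistic is tracked over time alongside the coupling estimates.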

We now show the utility of the information coupling model for analysing multivariate statistical dependencies. Figure 7 shows the temporal variation of information coupling between four major liquid currency pairs (EURUSD, GBPUSD, USDCHF, and USDJPY). The results are obtained using daily log-returns for a seven-year period and a six-month long sliding-window. Also plotted on the same figure is the FTSE-100 (Financial Times Stock Exchange 100) index for the corresponding time period, which has been adjusted for ease of comparison. The plot clearly shows an abrupt upward shift in coupling between the four currency pairs right at the time of the September 2008 financial meltdown, with a gradual decrease in coupling over the next year. We also notice an increase in the uncertainty associated with the information coupling measure before the 2008 crash. The increase in dependence of financial instruments in times of financial crises has been observed for other asset classes as well [53]. Our unique example showing the dynamics of multivariate dependencies within the spot FX space provides further insight into the nature of interdependencies in times of financial crisis.

Figure 7: Information coupling ($\eta_t$) between EURUSD, GBPUSD, USDCHF, and USDJPY log-returns plotted as a function of time, together with its confidence interval, $[\eta_t]^{95}_{5}$. Also plotted is the FTSE-100 index, which is adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

6. Conclusions

We present an ICA-based approach to dynamically measure information coupling as a proxy for mutual information. This approach makes use of ICA as a tool to capture information in the tails of the underlying distributions and is suitable for efficiently and accurately measuring statistical dependencies between multiple non-Gaussian signals. As far as we know, this is the first attempt to quantify multivariate dependencies using information encoded in the ICA unmixing matrix. Our proposed information coupling model has several other practical benefits. It provides a framework for estimating confidence bounds on the information coupling metric, can be efficiently used to directly model dependencies in high-dimensional spaces, and gives normalised, symmetric results. It has the added advantage of not depending on any user-defined parameters and is not data intensive; that is, it can be used even with relatively small data sets without significantly affecting its performance, an important requirement for analysing data streams with rapidly changing dynamics, such as financial returns. The model makes use of a sliding-window based decorrelating manifold approach to ICA, with a reciprocal cosh source model, to infer the ICA unmixing matrix, which results in increased accuracy and efficiency of the algorithm. This gives the information coupling model a computational complexity similar to that of linear correlation, with the accuracy of mutual information.

Acknowledgments

The authors are grateful to the Oxford-Man Institute of Quantitative Finance for help in supporting this research. The first author would also like to thank Exeter College (University of Oxford) for funding.

References

[1] D. Berg and K. Aas, "Models for construction of multivariate dependence - a comparison study," The European Journal of Finance, vol. 15, no. 7-8, pp. 639–659, 2009.

[2] M. M. Dacorogna, An Introduction to High-Frequency Finance, Academic Press, New York, NY, USA, 2001.

[3] J. Hlinka, M. Palus, M. Vejmelka, D. Mantini, and M. Corbetta, "Functional connectivity in resting-state fMRI: is linear correlation sufficient?" NeuroImage, vol. 54, no. 3, pp. 2218–2225, 2011.

[4] S. J. Devlin, R. Gnanadesikan, and J. R. Kettenring, "Robust estimation and outlier detection with correlation coefficients," Biometrika, vol. 62, no. 3, pp. 531–545, 1975.

[5] A. R. Cowan, "Nonparametric event study tests," Review of Quantitative Finance and Accounting, vol. 2, no. 4, pp. 343–358, 1992.

[6] G. J. Glasser and R. F. Winter, "Critical values of the coefficient of rank correlation for testing the hypothesis of independence," Biometrika, vol. 48, no. 3-4, pp. 444–448, 1961.

[7] P. Embrechts, F. Lindskog, and A. McNeil, "Modelling dependence with copulas and applications to risk management," in Handbook of Heavy Tailed Distributions in Finance, vol. 1, chapter 8, pp. 329–384, 2003.

[8] E. Jondeau, S. H. Poon, and M. Rockinger, Financial Modeling Under Non-Gaussian Distributions, Springer, New York, NY, USA, 2007.

[9] J. D. Fermanian and O. Scaillet, "Some statistical pitfalls in copula modeling for financial applications," in Capital Formation, Governance and Banking, p. 59, 2005.

[10] T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley-Interscience, New York, NY, USA, 2006.

[11] A. Kraskov, H. Stogbauer, and P. Grassberger, "Estimating mutual information," Physical Review E, vol. 69, no. 6, Article ID 066138, 16 pages, 2004.

[12] S. Khan, S. Bandyopadhyay, and A. R. Ganguly, "Computing mutual information based nonlinear dependence among noisy and finite geophysical time series," in Proceedings of the American Geophysical Union, Fall Meeting, 2005, abstract no. NG22A-03.

[13] M. M. V. Hulle, "Edgeworth approximation of multivariate differential entropy," Neural Computation, vol. 17, no. 9, pp. 1903–1910, 2005.


[14] L. B. Almeida, "Linear and nonlinear ICA based on mutual information," in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 117–122, IEEE, 2000.

[15] A. D. Back and A. S. Weigend, "A first application of independent component analysis to extracting structure from stock returns," International Journal of Neural Systems, vol. 8, no. 4, pp. 473–484, 1997.

[16] E. Oja, K. Kiviluoto, and S. Malaroiu, "Independent component analysis for financial time series," in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 111–116, IEEE, 2000.

[17] A. Hyvarinen, "Survey on independent component analysis," Neural Computing Surveys, vol. 2, no. 4, pp. 94–128, 1999.

[18] C. J. Lu, T. S. Lee, and C. C. Chiu, "Financial time series forecasting using independent component analysis and support vector regression," Decision Support Systems, vol. 47, no. 2, pp. 115–125, 2009.

[19] H. Joe, "Relative entropy measures of multivariate dependence," Journal of the American Statistical Association, vol. 84, no. 405, pp. 157–164, 1989.

[20] M. Novey and T. Adali, "Complex ICA by negentropy maximization," IEEE Transactions on Neural Networks, vol. 19, no. 4, pp. 596–609, 2008.

[21] S. Roberts and R. Everson, Independent Component Analysis: Principles and Practice, Cambridge University Press, New York, NY, USA, 2001.

[22] J. Palmer, K. Kreutz-Delgado, and S. Makeig, "Super-Gaussian mixture source model for ICA," in Independent Component Analysis and Blind Signal Separation, pp. 854–861, 2006.

[23] R. A. Choudrey and S. J. Roberts, "Variational mixture of Bayesian independent component analyzers," Neural Computation, vol. 15, no. 1, pp. 213–252, 2003.

[24] R. Everson and S. Roberts, "Independent component analysis: a flexible nonlinearity and decorrelating manifold approach," Neural Computation, vol. 11, no. 8, pp. 1957–1983, 1999.

[25] R. E. Kass, L. Tierney, and J. B. Kadane, "Laplace's method in Bayesian analysis," in Statistical Multiple Integration: Proceedings of the AMS-IMS-SIAM Joint Summer Research Conference [on Statistical Multiple Integration] Held at Humboldt University, June 17–23, 1989, vol. 115, p. 89, American Mathematical Society, 1991.

[26] W. Addison and S. Roberts, "Blind source separation with non-stationary mixing using wavelets," in Proceedings of the ICA Research Network Workshop, The University of Liverpool, 2006.

[27] N. J. Higham, "Matrix nearness problems and applications," in Applications of Matrix Theory, 1989.

[28] K. B. Datta, Matrix and Linear Algebra, PHI Learning, 2004.

[29] K. Boudt, J. Cornelissen, and C. Croux, "The Gaussian rank correlation estimator: robustness properties," Statistics and Computing, vol. 22, no. 2, pp. 471–483, 2012.

[30] D. Evans, "A computationally efficient estimator for mutual information," Proceedings of the Royal Society A, vol. 464, no. 2093, pp. 1203–1215, 2008.

[31] K. Torkkola, "On feature extraction by mutual information maximization," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '02), vol. 1, pp. I821–I824, IEEE, May 2002.

[32] K. Nagarajan, B. Holland, C. Slatton, and A. D. George, "Scalable and portable architecture for probability density function estimation on FPGAs," in Proceedings of the 16th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '08), pp. 302–303, IEEE Computer Society, April 2008.

[33] C. G. Bowsher, "Modelling security market events in continuous time: intensity based, multivariate point process models," Journal of Econometrics, vol. 141, no. 2, pp. 876–912, 2007.

[34] B. Peiers, "Informed traders, intervention, and price leadership: a deeper view of the microstructure of the foreign exchange market," Journal of Finance, vol. 52, no. 4, pp. 1589–1614, 1997.

[35] A. Dionisio, R. Menezes, and D. A. Mendes, "Mutual information: a measure of dependency for nonlinear time series," Physica A, vol. 344, no. 1-2, pp. 326–329, 2004.

[36] J. Y. Campbell, A. W. Lo, A. C. MacKinlay, and R. F. Whitelaw, "The econometrics of financial markets," Macroeconomic Dynamics, vol. 2, no. 4, pp. 559–562, 1998.

[37] G. Koutmos, C. Negakis, and P. Theodossiou, "Stochastic behaviour of the Athens stock exchange," Applied Financial Economics, vol. 3, no. 2, pp. 119–126, 1993.

[38] D. M. Guillaume, M. M. Dacorogna, R. R. Dave, U. A. Muller, R. B. Olsen, and O. V. Pictet, "From the bird's eye to the microscope: a survey of new stylized facts of the intra-daily foreign exchange markets," Finance and Stochastics, vol. 1, no. 2, pp. 95–129, 1997.

[39] R. Cheng, "Using Pearson type IV and other Cinderella distributions in simulation," in Proceedings of the Winter Simulation Conference (WSC '11), pp. 457–468, IEEE, 2011.

[40] Y. Nagahara, "The PDF and CF of Pearson type IV distributions and the ML estimation of the parameters," Statistics and Probability Letters, vol. 43, no. 3, pp. 251–264, 1999.

[41] A. Shephard, Pearson IV, Institute for Fiscal Studies, University College London, 2008.

[42] S. Stavroyiannis, I. Makris, V. Nikolaidis, and L. Zarangas, "Econometric modeling and value-at-risk using the Pearson type IV distribution," International Review of Financial Analysis, vol. 22, pp. 10–17, 2012.

[43] M. Kendall, A. Stuart, and J. K. Ord, Kendall's Advanced Theory of Statistics, Charles Griffin, 1987.

[44] R. Willink, "A closed-form expression for the Pearson type IV distribution function," Australian and New Zealand Journal of Statistics, vol. 50, no. 2, pp. 199–205, 2008.

[45] A. Venelli, "Efficient entropy estimation for mutual information analysis using B-splines," in Information Security Theory and Practices: Security and Privacy of Pervasive Systems and Smart Devices, pp. 17–30, 2010.

[46] C. G. Park and D. W. Shin, "An algorithm for generating correlated random variables in a class of infinitely divisible distributions," Journal of Statistical Computation and Simulation, vol. 61, no. 1-2, pp. 127–139, 1998.

[47] R. L. Iman and W. J. Conover, "A distribution-free approach to inducing rank correlation among input variables," Communications in Statistics - Simulation and Computation, vol. 11, no. 3, pp. 311–334, 1982.

[48] N. Shah and S. Roberts, "Hidden Markov independent component analysis as a measure of coupling in multivariate financial time series," in Proceedings of the ICA Research Network International Workshop, Liverpool, UK, 2008.

[49] A. Rossi and G. M. Gallo, "Volatility estimation via hidden Markov models," Journal of Empirical Finance, vol. 13, no. 2, pp. 203–230, 2006.

[50] J. Crotty, "Structural causes of the global financial crisis: a critical assessment of the 'new financial architecture'," Cambridge Journal of Economics, vol. 33, no. 4, pp. 563–580, 2009.


[51] J. A. Murphy, "An analysis of the financial crisis of 2008: causes and solutions," Social Science Research Network, 2008.

[52] M. Pojarliev and R. M. Levich, "Detecting crowded trades in currency funds," Financial Analysts Journal, vol. 67, no. 1, pp. 26–39, 2011.

[53] L. Sandoval and I. D. P. Franca, "Correlation of financial markets in times of crisis," Physica A, vol. 391, no. 1, pp. 187–208, 2012.



it possible to accurately capture the rapidly evolving dynamics of the markets over the corresponding period, without being so long as to capture only major trends or so short as to capture noise in the data.

4.4. Discussion. The information coupling model offers us multiple advantages when used to analyse multivariate financial data. Here we summarise some of the main properties of the model, while the empirical results presented in the next section showcase some of its practical benefits.

(i) The information coupling measure, a proxy for mutual information, is able to accurately pick up statistical dependencies in data sets with non-Gaussian distributions (such as financial returns).

(ii) The information coupling algorithm is computationally efficient, which makes it particularly suitable for use in an online dynamic environment. This makes the algorithm especially attractive when dealing with data sampled at high frequencies, because, with the ever-increasing use of high-frequency data, overcoming sources of latency is of utmost importance in a variety of applications in modern financial markets.

(iii) It gives confidence levels on the information coupling measure. This allows us to estimate the uncertainty associated with the measurements.

(iv) The metric provides normalised results; that is, information coupling ranges from 0 for decoupled systems to 1 for completely coupled systems. This makes it easier to analyse results obtained using the metric and to compare its performance with other similar measures of association. The metric also gives symmetric results.

(v) The metric is valid for any number of signals in high-dimensional spaces; that is, it consistently gives accurate results irrespective of the number of time series between which information coupling is being computed. This makes it suitable for a range of financial applications.

(vi) It is not data intensive; that is, it gives relatively accurate results even when a small sample size is used. This allows the metric to model the complex and rapidly changing dynamics of financial markets.

(vii) It does not depend on user-defined parameters. Such parameters can restrict practical utility, as evolving market conditions may require them to be constantly updated, which may not be practical.

5. Results

We now present a set of synthetic and financial data examples showing the relative accuracy and practical utility of the information coupling measure. The following notation is used for the different measures of statistical dependence in this paper: ICA-based information coupling ($\eta$), linear correlation ($\rho$), rank correlation ($\rho_R$), and normalised mutual information ($I_N$). Normalisation of mutual information values ($I$) is achieved using the following transformation [35]:

$$I_N = \sqrt{1 - \exp(-2I)}. \tag{38}$$
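As a quick check of the transformation in (38), the following minimal sketch applies it to the closed-form mutual information of a bivariate Gaussian, for which the normalised value should reduce to the absolute linear correlation. It assumes that $I$ is measured in nats; the function names are illustrative and are not taken from the paper.

```python
import math

def normalised_mutual_information(mi_nats: float) -> float:
    """Map mutual information I (assumed to be in nats) to I_N = sqrt(1 - exp(-2 I))."""
    if mi_nats < 0:
        raise ValueError("mutual information must be non-negative")
    return math.sqrt(1.0 - math.exp(-2.0 * mi_nats))

def gaussian_mutual_information(rho: float) -> float:
    """Closed-form mutual information of a bivariate Gaussian with correlation rho."""
    return -0.5 * math.log(1.0 - rho ** 2)

# For a bivariate Gaussian, the transformation recovers |rho|.
for rho in (0.0, 0.3, -0.8):
    i_n = normalised_mutual_information(gaussian_mutual_information(rho))
    print(rho, round(i_n, 6))
```

The mapping sends $I = 0$ to $I_N = 0$ and approaches 1 as $I$ grows, which is what makes $I_N$ comparable with the other normalised measures.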

The financial data examples we present in this paper primarily make use of spot foreign exchange (FX) data. Spot price, or spot rate, is the price which is actually quoted for a currency transaction to take place. Return is the profit realised when a currency is traded. It is common practice to use the normalised log-returns of financial data in statistical analysis instead of the raw data itself [36]. Using log-returns makes it possible to convert exponential problems into linear ones, thus significantly simplifying relevant analysis. A normalised log-returns data set, with a mean of zero and unit variance, can generally be regarded as a locally stationary process [37]. Therefore, many signal processing techniques meant solely for stationary processes can be successfully applied to the normalised log-returns time series in an adaptive environment. Denoting the mid-price at time $t$ of a financial time series by $P_t$, the log-return value is given by

$$r_t = \ln\left(\frac{P_t}{P_{t-1}}\right). \tag{39}$$

The normalisation of log-returns is achieved by converting the data to a form such that it has a mean of zero and unit variance. This is easily achieved by removing the mean and dividing by the standard deviation of the data.
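The two steps above, (39) followed by normalisation, can be sketched as follows. This is a minimal illustration that assumes NumPy and uses the population standard deviation (`ddof=0`); the paper does not state which estimator is used, and the prices are made up for the example.

```python
import numpy as np

def normalised_log_returns(prices):
    """Return zero-mean, unit-variance log-returns of a mid-price series."""
    prices = np.asarray(prices, dtype=float)
    r = np.diff(np.log(prices))          # r_t = ln(P_t / P_{t-1})
    return (r - r.mean()) / r.std()      # assumption: population std (ddof=0)

# Hypothetical mid-prices, for illustration only.
mid_prices = [1.3050, 1.3062, 1.3041, 1.3077, 1.3069, 1.3090]
z = normalised_log_returns(mid_prices)
print(len(z), float(z.mean()), float(z.std()))
```

In a sliding-window setting, the mean and standard deviation would be recomputed for each window rather than over the whole series.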

5.1. Synthetic Data. There is no single distribution which fits financial returns, especially those sampled at higher frequencies, which tend to be highly non-Gaussian [2], although there have been attempts to model returns using a variety of distributions [38]. In this paper, we aim to capture the heavy-tailed, skewed properties of financial returns using a Pearson type IV distribution [39, 40], which can be used to represent distributions with varying degrees of skewness and kurtosis and is thus useful for representing distributions of financial returns [41, 42]. The first four moments of the distribution can be uniquely determined by setting the four parameters which characterise the distribution [43]. Until recently, due to its mathematical and computational complexity, this distribution was not widely used in the financial literature, although this is rapidly changing with advances in computational power and the proposal of new, improved analytical methods [39, 42, 44].

To test the accuracy of various measures of dependence, we need to generate coupled non-Gaussian synthetic data with known, predefined correlation values. There is no straightforward way to simulate correlated random variables when their joint distribution is not known [46], as is the case with multivariate financial returns. One possible method that can be used to induce any desired predefined correlation between independent randomly distributed variables, irrespective of their distributions, is commonly known as the Iman-Conover method, as presented in [47]. This method is based on inducing a known dependency structure in samples taken from the input independent marginal distributions using reordering techniques. The multivariate coupled structure obtained as the output can thus be used as the input data in various models of dependency analysis to test their relative accuracies.

Table 2: Accuracy of four measures of dependence, that is, information coupling ($\eta$), linear correlation ($\rho$), rank correlation ($\rho_R$), and normalised mutual information ($I_N$), when used to estimate the level of dependence in a coupled system with varying levels of true correlation ($\rho_{TC}$). $\rho_{TC}$ is induced in an independent bivariate randomly distributed system using the Iman-Conover method as described in the text. The dependence estimates, together with their standard deviation values, are obtained using 1000 independent simulations, with a 1000 data points long data set for each simulation. The last row of the table gives values for the mean absolute error (MAE).

rho_TC  eta              rho              rho_R            I_N              |rho_TC-eta|  |rho_TC-rho|  |rho_TC-rho_R|  |rho_TC-I_N|
0       0.0046 ± 0.0026  0.0038 ± 0.0022  0.0147 ± 0.0115  0.1851 ± 0.0401  0.0046        0.0038        0.0147          0.1851
0.1     0.1079 ± 0.0073  0.0917 ± 0.0058  0.0942 ± 0.0184  0.1687 ± 0.0412  0.0079        0.0083        0.0058          0.0687
0.2     0.2145 ± 0.0102  0.1862 ± 0.0093  0.1899 ± 0.0177  0.1131 ± 0.0464  0.0145        0.0138        0.0111          0.0869
0.3     0.3176 ± 0.0138  0.2812 ± 0.0130  0.2863 ± 0.0167  0.1644 ± 0.0538  0.0176        0.0188        0.0137          0.1356
0.4     0.4166 ± 0.0210  0.3764 ± 0.0165  0.3835 ± 0.0153  0.2787 ± 0.0500  0.0166        0.0236        0.0165          0.1213
0.5     0.5135 ± 0.0196  0.4720 ± 0.0197  0.4818 ± 0.0136  0.3819 ± 0.0475  0.0135        0.0280        0.0182          0.1181
0.6     0.6070 ± 0.0347  0.5688 ± 0.0226  0.5815 ± 0.0117  0.4788 ± 0.0458  0.0070        0.0312        0.0185          0.1212
0.7     0.7011 ± 0.0232  0.6670 ± 0.0242  0.6830 ± 0.0093  0.5728 ± 0.0483  0.0011        0.0330        0.0170          0.1272
0.8     0.7936 ± 0.0239  0.7676 ± 0.0249  0.7864 ± 0.0066  0.6652 ± 0.0490  0.0064        0.0324        0.0136          0.1348
0.9     0.8864 ± 0.0252  0.8713 ± 0.0249  0.8919 ± 0.0034  0.7595 ± 0.0501  0.0136        0.0287        0.0081          0.1405
1.0     1.0000 ± 0.0000  1.0000 ± 0.0000  1.0000 ± 0.0000  0.9601 ± 0.0185  0.0000        0.0000        0.0000          0.0399
MAE                                                                         0.0093        0.0201        0.0125          0.1163
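The reordering idea behind the Iman-Conover method can be sketched as below. This is a simplified illustration and not the authors' implementation: it assumes NumPy, uses random normal draws as reference scores instead of the van der Waerden scores of the original method [47], and uses Student-t samples as a heavy-tailed stand-in for the Pearson type IV marginals. The function names are hypothetical.

```python
import numpy as np

def rank_correlation(a, b):
    """Spearman rank correlation for samples without ties."""
    ra = np.argsort(np.argsort(a))
    rb = np.argsort(np.argsort(b))
    return np.corrcoef(ra, rb)[0, 1]

def iman_conover(samples, target_corr, rng):
    """Reorder each column of `samples` to induce (approximately) `target_corr`.

    The marginal values in every column are left unchanged; only their
    order is permuted.
    """
    n, k = samples.shape
    # Reference scores (assumption: random normal draws rather than the
    # van der Waerden scores used in the original method).
    scores = rng.standard_normal((n, k))
    scores -= scores.mean(axis=0)
    # Remove the sample covariance of the scores, then impose the target.
    q = np.linalg.cholesky(np.cov(scores, rowvar=False))
    p = np.linalg.cholesky(target_corr)
    reference = scores @ np.linalg.inv(q).T @ p.T
    # Give each input column the same rank order as the reference column.
    out = np.empty_like(samples)
    for j in range(k):
        ranks = np.argsort(np.argsort(reference[:, j]))
        out[:, j] = np.sort(samples[:, j])[ranks]
    return out

rng = np.random.default_rng(0)
x = rng.standard_t(df=5, size=(1000, 2))   # heavy-tailed stand-in marginals
target = np.array([[1.0, 0.6], [0.6, 1.0]])
y = iman_conover(x, target, rng)
induced = rank_correlation(y[:, 0], y[:, 1])
print(round(induced, 2))
```

Because only the ordering of each column changes, the marginal distributions (and hence their skewness and kurtosis) are preserved exactly, while the induced rank correlation lands close to the target.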

We set the parameters of the Pearson type IV distribution such that the coupled synthetic data we obtain using the Iman-Conover method has properties similar to those of financial returns. But first we need to consider some properties of financial returns which we want to mimic. Figures 1(a) and 1(b) show the distributions of two higher-order moments, that is, skewness ($\gamma$) and kurtosis ($\kappa$), for FX spot data sets sampled at three different frequencies. The plots are obtained using a sliding-window of length 50 data points, as an average of all G10 (Group of Ten) currency pairs, covering a period of 8 hours in the case of 0.25 second and 0.5 second sampled data and 2 years in the case of 0.5 hour sampled data. The non-Gaussian (heavy-tailed, skewed) nature of the data is clearly visible. It is interesting to note that the kurtosis value almost never goes below three for any of the data sets, signifying the temporal persistence of non-Gaussianity. We now make use of the Iman-Conover method to induce varying levels of correlation between 1000 samples taken from independent randomly distributed Pearson type IV distributions. A 1000-sample data set makes it easier to accurately induce predefined correlations in the system and makes it possible to generate data with relatively accurate average kurtosis and skewness values, hence allowing us to accurately capture the higher-order moments of financial returns. As an example, Figures 1(c) and 1(d) show distributions of the kurtosis and skewness of two variables for 1000 independent simulations. Note the similarity of these plots with the average of the corresponding distributions of higher-order moments for financial data, as presented in Figures 1(a) and 1(b). This shows the effectiveness of using synthetic data sampled from a Pearson type IV distribution for capturing higher-order moments of financial returns.
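The sliding-window moment estimates described above can be sketched as follows. This is an illustrative example that assumes NumPy 1.20 or later (for `sliding_window_view`) and uses the simple moment-ratio definitions of skewness and (non-excess) kurtosis; the paper does not specify the estimator, and the input data here are synthetic Student-t draws rather than FX returns.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def rolling_skew_kurt(x, window=50):
    """Skewness and (non-excess) kurtosis over every window of length `window`."""
    x = np.asarray(x, dtype=float)
    w = sliding_window_view(x, window)            # shape (n - window + 1, window)
    d = w - w.mean(axis=1, keepdims=True)
    m2 = (d ** 2).mean(axis=1)
    m3 = (d ** 3).mean(axis=1)
    m4 = (d ** 4).mean(axis=1)
    # Note: a window of constant values gives m2 == 0 and undefined moments.
    return m3 / m2 ** 1.5, m4 / m2 ** 2

# Tiny deterministic check: a symmetric window has zero skewness.
skew_small, kurt_small = rolling_skew_kurt([1.0, 2.0, 3.0, 4.0, 5.0], window=5)

# Heavy-tailed stand-in for returns (assumption: Student-t with 4 degrees of freedom).
rng = np.random.default_rng(1)
skew, kurt = rolling_skew_kurt(rng.standard_t(df=4, size=5000), window=50)
print(skew.shape, float(np.median(kurt)))
```

Histograms of `skew` and `kurt` would play the role of the normalised pdfs of the higher-order moments discussed in the text.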

Four different approaches are now used to estimate the level of dependence between the output coupled data. The process is repeated 1000 times for each level of true correlation ($\rho_{TC}$). Table 2 presents the results obtained. The results show the accuracy of the information coupling measure when used to analyse non-Gaussian data. For this synthetic data example, on average, the information coupling measure was 53.7% more accurate than the linear correlation measure and 25.6% more accurate than the rank correlation measure. The normalised mutual information provided the least accurate results.
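The quoted percentages are consistent with a relative reduction in the mean absolute error reported in the last row of Table 2. The short sketch below reproduces them under that assumption (the paper does not spell out the formula); the dictionary keys are illustrative labels.

```python
# MAE values taken from the last row of Table 2.
mae = {"eta": 0.0093, "rho": 0.0201, "rho_R": 0.0125, "I_N": 0.1163}

def relative_improvement(mae_reference, mae_candidate):
    """Percentage reduction in MAE of the candidate relative to the reference."""
    return 100.0 * (mae_reference - mae_candidate) / mae_reference

gain_vs_linear = relative_improvement(mae["rho"], mae["eta"])
gain_vs_rank = relative_improvement(mae["rho_R"], mae["eta"])
print(f"{gain_vs_linear:.1f}% vs linear correlation, {gain_vs_rank:.1f}% vs rank correlation")
```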

Figure 1: (a, b) Plots showing the normalised pdfs of (a) kurtosis ($\kappa$) and (b) skewness ($\gamma$), obtained using a sliding-window of length 50 data points as an average of all G10 currency pairs, covering a period of 8 hours in the case of 0.25 second and 0.5 second sampled data and 2 years in the case of 0.5 hour sampled data. The plots clearly show the heavy-tailed, skewed nature of FX returns at all three sampling frequencies. (c, d) An example of the normalised pdf plots showing the average kurtosis and skewness values, respectively, for synthetic data (variables $x_1$ and $x_2$) generated using Pearson type IV distributions. The data has properties similar to the average of the distributions presented in (a) and (b). The vertical lines indicate the kurtosis ($\kappa = 3$) and skewness ($\gamma = 0$) values for a Gaussian distribution.

We now extend this example by incorporating data dynamics. The same data generation process as described above is now used to construct a 32000 samples long bivariate data set in which the induced true correlation changes every 8000 time steps, that is, $\rho_{TC} = 0.2$ when $t = 1{:}8000$, $\rho_{TC} = 0.4$ when $t = 8001{:}16000$, $\rho_{TC} = 0.6$ when $t = 16001{:}24000$, and $\rho_{TC} = 0.8$ when $t = 24001{:}32000$. A 1000 data points wide sliding-window is used to dynamically measure dependencies in the data set. The resulting temporal information coupling plot, together with the 5th and 95th percentile confidence intervals, is presented in Figure 2(a). The four different coupling regions are clearly visible, together with the step changes in coupling after every 8000 time steps, showing the ability of the algorithm to detect abrupt changes in coupling. The normalised empirical probability distributions over $\eta$ for the four coupling regions are shown in Figure 2(b). Also plotted in the same figure are the normalised empirical pdfs for linear correlation ($\rho$) and rank correlation ($\rho_R$). The mutual information pdf is omitted for clarity, as it gives relatively less accurate results, as presented in Table 2. It is interesting to see how the peaks of the $\eta$ distribution correspond very closely to the $\rho_{TC}$ values, showing the ability of the information coupling model to accurately capture statistical dependencies in a dynamic environment. The least accurate measure in this example is the linear correlation.
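The sliding-window procedure used in this dynamic example can be sketched generically as below. Note the stated simplifications: the sketch assumes NumPy, generates Gaussian rather than Pearson type IV data, and plugs in linear correlation as the dependence measure because the ICA-based coupling estimator is not reproduced here. The percentiles printed are the spread of window estimates within each regime, not the confidence intervals of the information coupling model. All function names are hypothetical.

```python
import numpy as np

def piecewise_correlated_series(levels, segment_len, rng):
    """Bivariate Gaussian series whose correlation steps through `levels`."""
    blocks = []
    for rho in levels:
        z = rng.standard_normal((segment_len, 2))
        y = rho * z[:, 0] + np.sqrt(1.0 - rho ** 2) * z[:, 1]
        blocks.append(np.column_stack([z[:, 0], y]))
    return np.vstack(blocks)

def sliding_dependence(data, window, step, measure):
    """Apply `measure` to each window; return window end indices and estimates."""
    ends = np.arange(window, len(data) + 1, step)
    values = np.array([measure(data[e - window:e]) for e in ends])
    return ends, values

def linear_correlation(block):
    return np.corrcoef(block[:, 0], block[:, 1])[0, 1]

rng = np.random.default_rng(0)
levels = (0.2, 0.4, 0.6, 0.8)
segment_len, window = 8000, 1000
data = piecewise_correlated_series(levels, segment_len, rng)
ends, estimates = sliding_dependence(data, window, step=100, measure=linear_correlation)

medians = []
for k, rho in enumerate(levels):
    # Keep only windows that lie entirely inside regime k.
    inside = (ends >= k * segment_len + window) & (ends <= (k + 1) * segment_len)
    lo, med, hi = np.percentile(estimates[inside], [5, 50, 95])
    medians.append(med)
    print(f"rho_TC={rho:.1f}: median={med:.3f}, 5%={lo:.3f}, 95%={hi:.3f}")
```

Any other dependence measure with the same signature (a function of a window of data returning a scalar) could be passed as `measure`, which is how the different measures can be compared on identical windows.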

Figure 2: (a) Temporal variation of information coupling $\eta(t)$ plotted as a function of time. Step changes in coupling are visible after every 8000 time steps, that is, $\rho_{TC} = 0.2$ when $t = 1{:}8000$, $\rho_{TC} = 0.4$ when $t = 8001{:}16000$, $\rho_{TC} = 0.6$ when $t = 16001{:}24000$, and $\rho_{TC} = 0.8$ when $t = 24001{:}32000$. Also plotted are the median $[\eta(t)]_{\mathrm{median}}$ and the 5% and 95% confidence interval contours $[\eta(t)]^{95}_{5}$. (b) Normalised empirical pdf plots for information coupling ($\eta$), linear correlation ($\rho$), and rank correlation ($\rho_R$). The vertical lines represent the true correlation ($\rho_{TC}$) values. The relative accuracy of the information coupling measure is evident from these results.

5.2. Financial Data. We first present a simple application of the information coupling algorithm to a section of a 0.5 second sampled FX spot log-returns data set. (In this paper, all currencies are referred to by their standardised international three-letter codes, as described by the ISO 4217 standard. For the currencies mentioned in this paper, the three-letter codes are USD (US dollar), EUR (Euro), JPY (Japanese yen), GBP (British pound), CHF (Swiss franc), and AUD (Australian dollar).) Figure 3 shows the variation of information coupling and linear correlation with time for EURUSD and GBPUSD. The results are obtained using a 5-minute wide sliding-window. We note that the two measures of dependence frequently give different results, which reflects the inability of linear correlation to capture dependencies in non-Gaussian data streams. We also note that dependencies in FX log-returns exhibit rapidly changing dynamics, often characterised by regions of quasi-stability punctuated by abrupt changes. In [48] we show that these regions of persistence in statistical dependence may be captured using a hidden Markov ICA model, which is a hidden Markov model with an ICA observation model. The information thus obtained can be useful for a range of financial applications, such as finding regions of financial market volatility clustering and persistence [49].

Figure 3: Information coupling ($\eta_t$) and linear correlation ($\rho_t$), together with the confidence interval $[\eta_t]^{95}_{5}$, plotted as a function of time (in seconds) for a section of a 0.5 second sampled EURUSD and GBPUSD log-returns data set. A 5-minute wide sliding-window is used to obtain the results.

Figure 4: Daily closing mid-prices ($P_t$) of AUDUSD and USDJPY. The two plots are adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

There have been numerous academic studies on thecauses and effects of the 2008 financial crisis [50 51]However very few of these have focused on the impact of thecrisis on interdependencies in the global FX market Here wepresent a set of examples which give a unique insight into the

ISRN Signal Processing 11

2005 2006 2007 2008 2009 20100

01

02

03

04

05

Mag

nitu

de o

fde

pend

ency

mea

sure

[120578119905]955

Information coupling (120578119905)

Linear correlation (120588119905)Rank correlation (120588119877119905)

(a)

0005

01015

02025

03

2005 2006 2007 2008 2009 2010

Δ[120578

119905]95

5

(b)

Figure 5 (a)Three approaches used tomeasure temporal dependencies betweenAUDUSD andUSDJPY log-returns (b) Range of confidenceintervals (Δ[120578]95

5 = 12057895minus1205785) plotted as a function of time showing the uncertainty associatedwith the information couplingmeasurementsThe vertical lines correspond to the September 2008 financial meltdown

effects of the crisis on the nature of statistical dependencies in the spot FX market. Accurate estimation of dependencies at times of financial crises is of utmost importance, as these estimates are used by financial practitioners for a range of tasks, such as rebalancing portfolios, accurately pricing options, and deciding on the level of risk-taking. We first present an application of the information coupling model for detecting temporal changes in dependencies in bivariate FX data streams at times of financial crises. Figure 4 shows the adjusted daily closing mid-prices ($P_t$) for AUDUSD and USDJPY from January 2005 till April 2010. The two plots clearly show an abrupt change in the exchange rates in September-October 2008. This was caused, at the height of the 2008 global financial crisis, by the unwinding of carry trades [52]. Figure 5(a) displays three plots showing the temporal variation of information coupling ($\eta_t$), linear correlation ($\rho_t$), and rank correlation ($\rho_{R,t}$) between AUDUSD and USDJPY log-returns. The plots are obtained using a six-month long sliding-window. We notice the rise in uncertainty of the information coupling measure (Figure 5(b)) right before the crash, with uncertainty decreasing gradually thereafter; this information may be useful to systematically predict upheavals in the market, although we do not carry out this study in detail in this paper. Information about the level of uncertainty can be used as a measure of confidence in the information coupling values and can be useful in various practical decision-making scenarios, such as deciding on the capital to deploy for the purpose of trading or selecting stocks (or currencies) for inclusion in a portfolio. As daily sampled data is generally less non-Gaussian than data sampled at higher frequencies, the three plots in Figure 5(a) are somewhat similar during certain time periods. However, right after the September 2008 crash, the plots significantly deviate from each other. We believe that this is because the nature of the data, in particular its level of non-Gaussianity, has changed.

Figure 6: Difference between information coupling and linear correlation, $(\eta_t - |\rho_t|)^2$, plotted as a function of time (2005 to 2010). Also plotted is a measure of non-Gaussianity of the two time series, $\mathrm{JB}^2_{\mathrm{AUDUSD},t} + \mathrm{JB}^2_{\mathrm{USDJPY},t}$, as defined by (40). The two plots are adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

As shown in Figure 6, the distance measure $(\eta_t - |\rho_t|)^2$ between information coupling and linear correlation closely matches the non-Gaussianity of the data under consideration. The two plots are adjusted for ease of comparison. The degree of non-Gaussianity is calculated using the multivariate Jarque-Bera statistic ($\mathrm{JB}_{\mathrm{MV}}$), which we define for an $N$-dimensional multivariate data set as

$$\mathrm{JB}_{\mathrm{MV}} = \sum_{j=1}^{N} \left[ \frac{n_s}{6} \left( \gamma_j^{2} + \frac{(\kappa_j - 3)^{2}}{4} \right) \right]^{2}, \tag{40}$$

where $n_s$ is the number of data points (in this case, the size of the sliding-window), $\gamma$ is the skewness of the data under analysis, and $\kappa$ is its kurtosis. This shows that relying solely on correlation measures to model dependencies in multivariate financial time series, even when using data sampled at relatively lower frequencies, can potentially lead to inaccurate results. In contrast, the information coupling model takes into account properties of the data being analysed, resulting in an accurate approach to measure statistical dependencies.
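As a rough illustration of equation (40), the sketch below computes the statistic for an array of log-returns with one column per series. It assumes population (biased) moment estimators for the skewness and kurtosis, which the text does not specify. The function name `jarque_bera_mv` and the synthetic Gaussian and Student-t inputs are hypothetical and only stand in for a window of FX data.

```python
import numpy as np


def jarque_bera_mv(data):
    """JB_MV of equation (40) for an (n_s, N) array: sum over columns of the
    squared univariate Jarque-Bera statistic."""
    data = np.asarray(data, dtype=float)
    n_s = data.shape[0]
    centred = data - data.mean(axis=0)
    std = centred.std(axis=0)  # population standard deviation (assumption)
    skew = (centred ** 3).mean(axis=0) / std ** 3
    kurt = (centred ** 4).mean(axis=0) / std ** 4  # equals 3 for a Gaussian
    jb = n_s / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return float(np.sum(jb ** 2))


# Hypothetical inputs: Gaussian data versus heavy-tailed Student-t data.
rng = np.random.default_rng(0)
jb_gauss = jarque_bera_mv(rng.standard_normal((5000, 2)))
jb_heavy = jarque_bera_mv(rng.standard_t(df=3, size=(5000, 2)))
print(f"Gaussian: {jb_gauss:.2f}, heavy-tailed: {jb_heavy:.2f}")
```

Applied window by window, a function of this kind would give a non-Gaussianity trace that could be set against $(\eta_t - |\rho_t|)^2$ in the manner of Figure 6.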

We now show utility of the information coupling model for analysing multivariate statistical dependencies. Figure 7 shows the temporal variation of information coupling between four major liquid currency pairs (EURUSD, GBPUSD, USDCHF, and USDJPY). The results are obtained using daily log-returns for a seven-year period and a six-month long sliding-window. Also plotted on the same figure is the FTSE-100 (Financial Times Stock Exchange 100) index for the corresponding time period, which has been adjusted for ease of comparison. The plot clearly shows an abrupt upward shift in coupling between the four currency pairs right at the time of the September 2008 financial meltdown, with gradual decrease in coupling over the next year. We also notice an increase in the uncertainty associated with the information coupling measure before the 2008 crash. The increase in dependence of financial instruments in times of financial crises has been observed for other asset classes as well [53]. Our unique example, showing the dynamics of multivariate dependencies within the spot FX space, provides further insight into the nature of interdependencies in times of financial crisis.

Figure 7: Information coupling ($\eta_t$, with confidence interval $[\eta_t]^{95}_{5}$) between EURUSD, GBPUSD, USDCHF, and USDJPY log-returns, plotted as a function of time $t$ (years, 2004 to 2010). Also plotted is the FTSE-100 index, which is adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

6. Conclusions

We present an ICA-based approach to dynamically measure information coupling, as a proxy for mutual information. This approach makes use of ICA as a tool to capture information in the tails of the underlying distributions and is suitable for efficiently and accurately measuring statistical dependencies between multiple non-Gaussian signals. As far as we know, this is the first attempt to quantify multivariate dependencies using information encoded in the ICA unmixing matrix. Our proposed information coupling model has multiple other benefits associated with its practical use. It provides a framework for estimating confidence bounds on the information coupling metric, can be efficiently used to directly model dependencies in high-dimensional spaces, and gives normalised, symmetric results. It has the added advantage of not depending on any user-defined parameters and is not data intensive; that is, it can be used even with relatively small data sets without significantly affecting its performance, an important requirement for analysing data streams with rapidly changing dynamics, such as financial returns. The model makes use of a sliding-window based decorrelating manifold approach to ICA, with a reciprocal cosh source model, to infer the ICA unmixing matrix, which results in increased accuracy and efficiency of the algorithm. This gives the information coupling model a computational complexity similar to that of linear correlation, with the accuracy of mutual information.

Acknowledgments

The authors are grateful to the Oxford-Man Institute of Quantitative Finance for help in supporting this research. The first author would also like to thank Exeter College (University of Oxford) for funding.

References

[1] D. Berg and K. Aas, “Models for construction of multivariate dependence – a comparison study,” The European Journal of Finance, vol. 15, no. 7-8, pp. 639–659, 2009.

[2] M. M. Dacorogna, An Introduction to High-Frequency Finance, Academic Press, New York, NY, USA, 2001.

[3] J. Hlinka, M. Palus, M. Vejmelka, D. Mantini, and M. Corbetta, “Functional connectivity in resting-state fMRI: is linear correlation sufficient?” NeuroImage, vol. 54, no. 3, pp. 2218–2225, 2011.

[4] S. J. Devlin, R. Gnanadesikan, and J. R. Kettenring, “Robust estimation and outlier detection with correlation coefficients,” Biometrika, vol. 62, no. 3, pp. 531–545, 1975.

[5] A. R. Cowan, “Nonparametric event study tests,” Review of Quantitative Finance and Accounting, vol. 2, no. 4, pp. 343–358, 1992.

[6] G. J. Glasser and R. F. Winter, “Critical values of the coefficient of rank correlation for testing the hypothesis of independence,” Biometrika, vol. 48, no. 3-4, pp. 444–448, 1961.

[7] P. Embrechts, F. Lindskog, and A. McNeil, “Modelling dependence with copulas and applications to risk management,” in Handbook of Heavy Tailed Distributions in Finance, vol. 1, chapter 8, pp. 329–384, 2003.

[8] E. Jondeau, S. H. Poon, and M. Rockinger, Financial Modeling Under Non-Gaussian Distributions, Springer, New York, NY, USA, 2007.

[9] J. D. Fermanian and O. Scaillet, “Some statistical pitfalls in copula modeling for financial applications,” in Capital Formation, Governance and Banking, p. 59, 2005.

[10] T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley-Interscience, New York, NY, USA, 2006.

[11] A. Kraskov, H. Stogbauer, and P. Grassberger, “Estimating mutual information,” Physical Review E, vol. 69, no. 6, Article ID 066138, 16 pages, 2004.

[12] S. Khan, S. Bandyopadhyay, and A. R. Ganguly, “Computing mutual information based nonlinear dependence among noisy and finite geophysical time series,” in Proceedings of the American Geophysical Union Fall Meeting, 2005, abstract no. NG22A-03.

[13] M. M. V. Hulle, “Edgeworth approximation of multivariate differential entropy,” Neural Computation, vol. 17, no. 9, pp. 1903–1910, 2005.

[14] L. B. Almeida, “Linear and nonlinear ICA based on mutual information,” in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC ’00), pp. 117–122, IEEE, 2000.

[15] A. D. Back and A. S. Weigend, “A first application of independent component analysis to extracting structure from stock returns,” International Journal of Neural Systems, vol. 8, no. 4, pp. 473–484, 1997.

[16] E. Oja, K. Kiviluoto, and S. Malaroiu, “Independent component analysis for financial time series,” in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC ’00), pp. 111–116, IEEE, 2000.

[17] A. Hyvarinen, “Survey on independent component analysis,” Neural Computing Surveys, vol. 2, no. 4, pp. 94–128, 1999.

[18] C. J. Lu, T. S. Lee, and C. C. Chiu, “Financial time series forecasting using independent component analysis and support vector regression,” Decision Support Systems, vol. 47, no. 2, pp. 115–125, 2009.

[19] H. Joe, “Relative entropy measures of multivariate dependence,” Journal of the American Statistical Association, vol. 84, no. 405, pp. 157–164, 1989.

[20] M. Novey and T. Adali, “Complex ICA by negentropy maximization,” IEEE Transactions on Neural Networks, vol. 19, no. 4, pp. 596–609, 2008.

[21] S. Roberts and R. Everson, Independent Component Analysis: Principles and Practice, Cambridge University Press, New York, NY, USA, 2001.

[22] J. Palmer, K. Kreutz-Delgado, and S. Makeig, “Super-Gaussian mixture source model for ICA,” in Independent Component Analysis and Blind Signal Separation, pp. 854–861, 2006.

[23] R. A. Choudrey and S. J. Roberts, “Variational mixture of Bayesian independent component analyzers,” Neural Computation, vol. 15, no. 1, pp. 213–252, 2003.

[24] R. Everson and S. Roberts, “Independent component analysis: a flexible nonlinearity and decorrelating manifold approach,” Neural Computation, vol. 11, no. 8, pp. 1957–1983, 1999.

[25] R. E. Kass, L. Tierney, and J. B. Kadane, “Laplace’s method in Bayesian analysis,” in Statistical Multiple Integration: Proceedings of the AMS-IMS-SIAM Joint Summer Research Conference [on Statistical Multiple Integration] Held at Humboldt University, June 17–23, 1989, vol. 115, p. 89, American Mathematical Society, 1991.

[26] W. Addison and S. Roberts, “Blind source separation with non-stationary mixing using wavelets,” in Proceedings of the ICA Research Network Workshop, The University of Liverpool, 2006.

[27] N. J. Higham, “Matrix nearness problems and applications,” in Applications of Matrix Theory, 1989.

[28] K. B. Datta, Matrix and Linear Algebra, PHI Learning, 2004.

[29] K. Boudt, J. Cornelissen, and C. Croux, “The Gaussian rank correlation estimator: robustness properties,” Statistics and Computing, vol. 22, no. 2, pp. 471–483, 2012.

[30] D. Evans, “A computationally efficient estimator for mutual information,” Proceedings of the Royal Society A, vol. 464, no. 2093, pp. 1203–1215, 2008.

[31] K. Torkkola, “On feature extraction by mutual information maximization,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’02), vol. 1, pp. I821–I824, IEEE, May 2002.

[32] K. Nagarajan, B. Holland, C. Slatton, and A. D. George, “Scalable and portable architecture for probability density function estimation on FPGAs,” in Proceedings of the 16th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM ’08), pp. 302–303, IEEE Computer Society, April 2008.

[33] C. G. Bowsher, “Modelling security market events in continuous time: intensity based, multivariate point process models,” Journal of Econometrics, vol. 141, no. 2, pp. 876–912, 2007.

[34] B. Peiers, “Informed traders, intervention, and price leadership: a deeper view of the microstructure of the foreign exchange market,” Journal of Finance, vol. 52, no. 4, pp. 1589–1614, 1997.

[35] A. Dionisio, R. Menezes, and D. A. Mendes, “Mutual information: a measure of dependency for nonlinear time series,” Physica A, vol. 344, no. 1-2, pp. 326–329, 2004.

[36] J. Y. Campbell, A. W. Lo, A. C. MacKinlay, and R. F. Whitelaw, “The econometrics of financial markets,” Macroeconomic Dynamics, vol. 2, no. 4, pp. 559–562, 1998.

[37] G. Koutmos, C. Negakis, and P. Theodossiou, “Stochastic behaviour of the Athens stock exchange,” Applied Financial Economics, vol. 3, no. 2, pp. 119–126, 1993.

[38] D. M. Guillaume, M. M. Dacorogna, R. R. Dave, U. A. Muller, R. B. Olsen, and O. V. Pictet, “From the bird’s eye to the microscope: a survey of new stylized facts of the intra-daily foreign exchange markets,” Finance and Stochastics, vol. 1, no. 2, pp. 95–129, 1997.

[39] R. Cheng, “Using Pearson type IV and other Cinderella distributions in simulation,” in Proceedings of the Winter Simulation Conference (WSC ’11), pp. 457–468, IEEE, 2011.

[40] Y. Nagahara, “The PDF and CF of Pearson type IV distributions and the ML estimation of the parameters,” Statistics and Probability Letters, vol. 43, no. 3, pp. 251–264, 1999.

[41] A. Shephard, Pearson IV, Institute for Fiscal Studies, University College London, 2008.

[42] S. Stavroyiannis, I. Makris, V. Nikolaidis, and L. Zarangas, “Econometric modeling and value-at-risk using the Pearson type IV distribution,” International Review of Financial Analysis, vol. 22, pp. 10–17, 2012.

[43] M. Kendall, A. Stuart, and J. K. Ord, Kendall’s Advanced Theory of Statistics, Charles Griffin, 1987.

[44] R. Willink, “A closed-form expression for the Pearson type IV distribution function,” Australian and New Zealand Journal of Statistics, vol. 50, no. 2, pp. 199–205, 2008.

[45] A. Venelli, “Efficient entropy estimation for mutual information analysis using B-splines,” in Information Security Theory and Practices: Security and Privacy of Pervasive Systems and Smart Devices, pp. 17–30, 2010.

[46] C. G. Park and D. W. Shin, “An algorithm for generating correlated random variables in a class of infinitely divisible distributions,” Journal of Statistical Computation and Simulation, vol. 61, no. 1-2, pp. 127–139, 1998.

[47] R. L. Iman and W. J. Conover, “A distribution-free approach to inducing rank correlation among input variables,” Communications in Statistics-Simulation and Computation, vol. 11, no. 3, pp. 311–334, 1982.

[48] N. Shah and S. Roberts, “Hidden Markov independent component analysis as a measure of coupling in multivariate financial time series,” in Proceedings of the ICA Research Network International Workshop, Liverpool, UK, 2008.

[49] A. Rossi and G. M. Gallo, “Volatility estimation via hidden Markov models,” Journal of Empirical Finance, vol. 13, no. 2, pp. 203–230, 2006.

[50] J. Crotty, “Structural causes of the global financial crisis: a critical assessment of the ‘new financial architecture’,” Cambridge Journal of Economics, vol. 33, no. 4, pp. 563–580, 2009.

[51] J. A. Murphy, “An analysis of the financial crisis of 2008: causes and solutions,” Social Science Research Network, 2008.

[52] M. Pojarliev and R. M. Levich, “Detecting crowded trades in currency funds,” Financial Analysts Journal, vol. 67, no. 1, pp. 26–39, 2011.

[53] L. Sandoval and I. D. P. Franca, “Correlation of financial markets in times of crisis,” Physica A, vol. 391, no. 1, pp. 187–208, 2012.


Table 2: Accuracy of four measures of dependence, that is, information coupling ($\eta$), linear correlation ($\rho$), rank correlation ($\rho_R$), and normalised mutual information ($I_N$), when used to estimate the level of dependence in a coupled system with varying levels of true correlation ($\rho_{TC}$). $\rho_{TC}$ is induced in an independent bivariate randomly distributed system using the Iman-Conover method, as described in the text. The dependence estimates, together with their standard deviation values shown in the table, are obtained using 1000 independent simulations, using 1000 data points long data sets for each simulation. The last row of the table gives values for the mean absolute error (MAE).

ρ_TC   η                 ρ                 ρ_R               I_N               |ρ_TC − η|   |ρ_TC − ρ|   |ρ_TC − ρ_R|   |ρ_TC − I_N|
0      0.0046 ± 0.0026   0.0038 ± 0.0022   0.0147 ± 0.0115   0.1851 ± 0.0401   0.0046       0.0038       0.0147         0.1851
0.1    0.1079 ± 0.0073   0.0917 ± 0.0058   0.0942 ± 0.0184   0.1687 ± 0.0412   0.0079       0.0083       0.0058         0.0687
0.2    0.2145 ± 0.0102   0.1862 ± 0.0093   0.1899 ± 0.0177   0.1131 ± 0.0464   0.0145       0.0138       0.0111         0.0869
0.3    0.3176 ± 0.0138   0.2812 ± 0.0130   0.2863 ± 0.0167   0.1644 ± 0.0538   0.0176       0.0188       0.0137         0.1356
0.4    0.4166 ± 0.0210   0.3764 ± 0.0165   0.3835 ± 0.0153   0.2787 ± 0.0500   0.0166       0.0236       0.0165         0.1213
0.5    0.5135 ± 0.0196   0.4720 ± 0.0197   0.4818 ± 0.0136   0.3819 ± 0.0475   0.0135       0.0280       0.0182         0.1181
0.6    0.6070 ± 0.0347   0.5688 ± 0.0226   0.5815 ± 0.0117   0.4788 ± 0.0458   0.0070       0.0312       0.0185         0.1212
0.7    0.7011 ± 0.0232   0.6670 ± 0.0242   0.6830 ± 0.0093   0.5728 ± 0.0483   0.0011       0.0330       0.0170         0.1272
0.8    0.7936 ± 0.0239   0.7676 ± 0.0249   0.7864 ± 0.0066   0.6652 ± 0.0490   0.0064       0.0324       0.0136         0.1348
0.9    0.8864 ± 0.0252   0.8713 ± 0.0249   0.8919 ± 0.0034   0.7595 ± 0.0501   0.0136       0.0287       0.0081         0.1405
1.0    1.0000 ± 0.0000   1.0000 ± 0.0000   1.0000 ± 0.0000   0.9601 ± 0.0185   0.0000       0.0000       0.0000         0.0399
MAE                                                                            0.0093       0.0201       0.0125         0.1163

inducing a known dependency structure in samples taken from the input independent marginal distributions using reordering techniques. The multivariate coupled structure obtained as the output can thus be used as the input data in various models of dependency analysis to test their relative accuracies.

We set parameters of the Pearson type IV distribution such that the coupled synthetic data we obtain using the Iman-Conover method has similar properties to financial returns. But first, we need to consider some properties of financial returns which we want to mimic. Figures 1(a) and 1(b) show the distributions of two higher-order moments, that is, kurtosis ($\kappa$) and skewness ($\gamma$), for FX spot data sets sampled at three different frequencies. The plots are obtained using a sliding-window of length 50 data points, as an average of all G10 (Group of Ten) currency pairs, covering a period of 8 hours in the case of 0.25 second and 0.5 second sampled data and 2 years in the case of 0.5 hour sampled data. The non-Gaussian (heavy-tailed, skewed) nature of the data is clearly visible. It is interesting to note that the kurtosis value almost never goes below three for any of the data sets, signifying the temporal persistence of non-Gaussianity. We now make use of the Iman-Conover method to induce varying levels of correlation between 1000 samples taken from independent, randomly distributed Pearson type IV distributions. A 1000 sample data set makes it easier to accurately induce predefined correlations in the system and makes it possible to generate data with relatively accurate average kurtosis and skewness values, hence allowing us to accurately capture the higher-order moments of financial returns. As an example, Figures 1(c) and 1(d) show distributions of the kurtosis and skewness of two variables for 1000 independent simulations. Note the similarity of these plots with the average of the corresponding distributions of higher-order moments for financial data, as presented in Figures 1(a) and 1(b). This shows the effectiveness of using synthetic data sampled from a Pearson type IV distribution for capturing higher-order moments of financial returns.
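The reordering idea behind the Iman-Conover method [47] can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the marginals are Student-t samples used as a heavy-tailed stand-in for Pearson type IV draws (NumPy has no Pearson type IV sampler), and the names `iman_conover` and `rank_corr` are hypothetical. The method induces a target rank correlation while leaving each marginal sample untouched.

```python
from statistics import NormalDist

import numpy as np


def iman_conover(x, target_corr, rng):
    """Reorder each column of x so the rank correlation approximates target_corr."""
    n, k = x.shape
    # Van der Waerden (normal) scores, shuffled independently for each column.
    scores = np.array([NormalDist().inv_cdf(i / (n + 1)) for i in range(1, n + 1)])
    m = np.column_stack([rng.permutation(scores) for _ in range(k)])
    # Remove the sample correlation of the scores, then impose the target.
    q = np.linalg.cholesky(np.corrcoef(m, rowvar=False))
    p = np.linalg.cholesky(target_corr)
    y = m @ np.linalg.inv(q).T @ p.T
    out = np.empty_like(x)
    for j in range(k):
        ranks = np.argsort(np.argsort(y[:, j]))  # rank of each row within column j
        out[:, j] = np.sort(x[:, j])[ranks]      # give x the same rank pattern
    return out


def rank_corr(a, b):
    """Spearman rank correlation for data without ties."""
    return np.corrcoef(np.argsort(np.argsort(a)), np.argsort(np.argsort(b)))[0, 1]


rng = np.random.default_rng(0)
x = rng.standard_t(df=5, size=(1000, 2))  # stand-in for Pearson type IV samples
target = np.array([[1.0, 0.6], [0.6, 1.0]])
coupled = iman_conover(x, target, rng)
print("rank correlation before:", round(rank_corr(x[:, 0], x[:, 1]), 3))
print("rank correlation after: ", round(rank_corr(coupled[:, 0], coupled[:, 1]), 3))
```

Because only the order of the samples changes, the skewness and kurtosis of each column are exactly those of the input, which is the property that makes the method attractive for building test data with realistic higher-order moments.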

Four different approaches are now used to estimate the level of dependence between the output coupled data. The process is repeated 1000 times for each level of true correlation ($\rho_{TC}$). Table 2 presents the results obtained. The results show the accuracy of the information coupling measure when used to analyse non-Gaussian data. For this synthetic data example, on average, the information coupling measure was 53.7% more accurate than the linear correlation measure and 25.6% more accurate than the rank correlation measure. The normalised mutual information provided the least accurate results.
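The quoted percentages appear to be relative reductions in the mean absolute error (MAE) reported in the last row of Table 2. The short check below recomputes the MAE from the absolute-error columns of the table and then derives the two percentages from the rounded MAE values. Reading "more accurate" as a relative MAE reduction is an assumption, although it reproduces the figures in the text.

```python
# Absolute errors |rho_TC - estimate| copied from Table 2 (rho_TC = 0, 0.1, ..., 1.0).
abs_errors = {
    "eta": [0.0046, 0.0079, 0.0145, 0.0176, 0.0166, 0.0135,
            0.0070, 0.0011, 0.0064, 0.0136, 0.0000],
    "rho": [0.0038, 0.0083, 0.0138, 0.0188, 0.0236, 0.0280,
            0.0312, 0.0330, 0.0324, 0.0287, 0.0000],
    "rho_R": [0.0147, 0.0058, 0.0111, 0.0137, 0.0165, 0.0182,
              0.0185, 0.0170, 0.0136, 0.0081, 0.0000],
    "I_N": [0.1851, 0.0687, 0.0869, 0.1356, 0.1213, 0.1181,
            0.1212, 0.1272, 0.1348, 0.1405, 0.0399],
}

# Mean absolute error per measure, rounded to four decimals as in the table.
mae = {name: round(sum(errs) / len(errs), 4) for name, errs in abs_errors.items()}

# Relative MAE reduction of information coupling versus the two correlations.
gain_vs_linear = 100 * (mae["rho"] - mae["eta"]) / mae["rho"]
gain_vs_rank = 100 * (mae["rho_R"] - mae["eta"]) / mae["rho_R"]

print(mae)
print(f"vs linear correlation: {gain_vs_linear:.1f}%")
print(f"vs rank correlation:   {gain_vs_rank:.1f}%")
```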

Figure 1: (a, b) Plots showing the normalised pdfs of (a) kurtosis ($\kappa$) and (b) skewness ($\gamma$), obtained using a sliding-window of length 50 data points, as an average of all G10 currency pairs, covering a period of 8 hours in case of 0.25 second and 0.5 second sampled data and 2 years in case of 0.5 hour sampled data. The plots clearly show the heavy-tailed, skewed nature of FX returns at all three sampling frequencies. (c, d) An example of the normalised pdf plots showing the average kurtosis and skewness values, respectively, for synthetic data ($x_1$, $x_2$) generated using Pearson type IV distributions. The data has properties similar to the average of the distributions presented in (a) and (b). The vertical lines indicate the kurtosis ($\kappa = 3$) and skewness ($\gamma = 0$) values for a Gaussian distribution.

We now extend this example by incorporating data dynamics. The same data generation process as described above is now used to construct a 32000 samples long bivariate data set in which the induced true correlation changes every 8000 time steps, that is, $\rho_{TC} = 0.2$ when $t = 1, \ldots, 8000$, $\rho_{TC} = 0.4$ when $t = 8001, \ldots, 16000$, $\rho_{TC} = 0.6$ when $t = 16001, \ldots, 24000$, and $\rho_{TC} = 0.8$ when $t = 24001, \ldots, 32000$. A 1000 data points wide sliding-window is used to dynamically measure dependencies in the data set. The resulting temporal information coupling plot, together with the 5th and 95th percentile confidence intervals, is presented in Figure 2(a). The four different coupling regions are clearly visible, together with the step changes in coupling after every 8000 time steps, showing the ability of the algorithm to detect abrupt changes in coupling. The normalised empirical probability distributions over $\eta$ for the four coupling regions are shown in Figure 2(b). Also plotted in the same figure are the normalised empirical pdfs for linear correlation ($\rho$) and rank correlation ($\rho_R$). The mutual information pdf is omitted for clarity, as it gives relatively less accurate results, as presented in Table 2. It is interesting to see how the peaks of the $\eta$ distribution correspond very closely to $\rho_{TC}$ values, showing the ability of the information coupling model to accurately capture statistical dependencies in a dynamic environment. The least accurate measure in this example is the linear correlation.
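The layout of this experiment (four 8000-sample regimes, a 1000-sample sliding-window) can be mimicked with a simplified sketch. Two assumptions are made loudly: the data here are Gaussian with correlation induced by linear mixing rather than Pearson type IV data coupled by the Iman-Conover method, and the windowed estimator is plain linear correlation, not the ICA-based information coupling. The window step of 200 samples is also an arbitrary choice to keep the run short.

```python
import numpy as np

rng = np.random.default_rng(1)
levels = [0.2, 0.4, 0.6, 0.8]   # true correlation in each regime
block = 8000                    # regime length in time steps

# Build the 32000-sample bivariate series, one regime after another.
segments = []
for rho_tc in levels:
    z = rng.standard_normal((block, 2))
    x2 = rho_tc * z[:, 0] + np.sqrt(1.0 - rho_tc ** 2) * z[:, 1]
    segments.append(np.column_stack([z[:, 0], x2]))
data = np.vstack(segments)

# Slide a 1000-sample window over the series (step of 200 is an assumption).
window, step = 1000, 200
ends, estimates = [], []
for start in range(0, len(data) - window + 1, step):
    w = data[start:start + window]
    ends.append(start + window)
    estimates.append(np.corrcoef(w[:, 0], w[:, 1])[0, 1])
ends = np.array(ends)
estimates = np.array(estimates)

# Summarise windows lying wholly inside each regime by 5th/50th/95th percentiles.
summary = {}
for i, rho_tc in enumerate(levels):
    inside = (ends - window >= i * block) & (ends <= (i + 1) * block)
    summary[rho_tc] = tuple(np.percentile(estimates[inside], [5, 50, 95]))

for rho_tc, (p5, med, p95) in summary.items():
    print(f"rho_TC={rho_tc}: 5%={p5:.3f} median={med:.3f} 95%={p95:.3f}")
```

Windows that straddle a regime boundary mix two correlation levels, which is why they are left out of the per-regime summary; in a time plot they would appear as the short ramps between the steps.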

Figure 2: (a) Temporal variation of information coupling $\eta(t)$, plotted as a function of time. Step changes in coupling are visible after every 8000 time steps, that is, $\rho_{TC} = 0.2$ when $t = 1, \ldots, 8000$, $\rho_{TC} = 0.4$ when $t = 8001, \ldots, 16000$, $\rho_{TC} = 0.6$ when $t = 16001, \ldots, 24000$, and $\rho_{TC} = 0.8$ when $t = 24001, \ldots, 32000$. Also plotted are the median $[\eta(t)]_{\mathrm{median}}$ and the 5% and 95% confidence interval contours $[\eta(t)]^{95}_{5}$. (b) Normalised empirical pdf plots for information coupling ($\eta$), linear correlation ($\rho$), and rank correlation ($\rho_R$). The vertical lines represent the true correlation ($\rho_{TC}$) values. The relative accuracy of the information coupling measure is evident from these results.

5.2. Financial Data. We first present a simple application of the information coupling algorithm to a section of 0.5 second sampled FX spot log-returns data set (in this paper, all currencies are referred to by their standardised international three-letter codes, as described by the ISO 4217 standard. For the currencies mentioned in this paper, the three-letter codes are USD (US dollar), EUR (Euro), JPY (Japanese yen), GBP (British pound), CHF (Swiss franc), and AUD (Australian dollar)). Figure 3 shows the variation of information coupling and linear correlation with time for EURUSD and GBPUSD. The results are obtained using a 5-minute wide sliding-window. We note that the two measures of dependence frequently give different results, which reflects the inability of linear correlation to capture dependencies in non-Gaussian data streams. We also note that dependencies in FX log-returns exhibit rapidly changing dynamics, often characterised by regions of quasi-stability punctuated by abrupt changes. In [48], we show that these regions of persistence in statistical dependence may be captured using a hidden Markov ICA model, which is a hidden Markov model with an ICA observation model. The information thus obtained can be useful for a range of financial applications, such as finding regions of financial market volatility clustering and persistence [49].

Figure 3: Information coupling ($\eta_t$, with confidence interval $[\eta_t]^{95}_{5}$) and linear correlation ($\rho_t$), plotted as a function of time (s) for a section of 0.5 second sampled EURUSD and GBPUSD log-returns data set. A 5-minute wide sliding-window is used to obtain the results.

Figure 4: Daily closing mid-prices ($P_t$) of AUDUSD and USDJPY, 2005 to 2010. The two plots are adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.
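The two correlation baselines used throughout this section can be computed over a sliding-window of log-returns as sketched below. The price series are synthetic geometric random walks (hypothetical, with a built-in return correlation of 0.7), the 600-sample window corresponds to 5 minutes at 0.5 second sampling, and the helper names are illustrative. The information coupling measure itself is not reproduced here.

```python
import numpy as np


def log_returns(prices):
    """r_t = log(P_t) - log(P_{t-1})."""
    return np.diff(np.log(prices))


def ranks(values):
    """Ordinal ranks; ties get arbitrary distinct ranks (no averaging)."""
    return np.argsort(np.argsort(values)).astype(float)


def rolling_dependence(r1, r2, window, step=1):
    """Rows of (index of last sample, linear correlation, rank correlation)."""
    rows = []
    for start in range(0, len(r1) - window + 1, step):
        a = r1[start:start + window]
        b = r2[start:start + window]
        rho = np.corrcoef(a, b)[0, 1]
        rho_rank = np.corrcoef(ranks(a), ranks(b))[0, 1]
        rows.append((start + window - 1, rho, rho_rank))
    return np.array(rows)


# Hypothetical prices built from correlated Gaussian log-returns.
rng = np.random.default_rng(2)
z = rng.standard_normal((5000, 2))
true_r1 = 1e-4 * z[:, 0]
true_r2 = 1e-4 * (0.7 * z[:, 0] + np.sqrt(1.0 - 0.7 ** 2) * z[:, 1])
prices_1 = 1.30 * np.exp(np.concatenate([[0.0], np.cumsum(true_r1)]))
prices_2 = 1.60 * np.exp(np.concatenate([[0.0], np.cumsum(true_r2)]))

window = 600  # 5 minutes of 0.5 second samples
result = rolling_dependence(log_returns(prices_1), log_returns(prices_2),
                            window, step=100)
print(result[:3])
```

Real high-frequency returns contain many exact zeros, so a production version would need average ranks for ties; the ordinal ranking above is adequate only for continuous synthetic data.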

There have been numerous academic studies on the causes and effects of the 2008 financial crisis [50, 51]. However, very few of these have focused on the impact of the crisis on interdependencies in the global FX market. Here, we present a set of examples which give a unique insight into the effects of the crisis on the nature of statistical dependencies in the spot FX market.

Figure 5: (a) Three approaches used to measure temporal dependencies between AUDUSD and USDJPY log-returns, 2005 to 2010: information coupling ($\eta_t$, with confidence interval $[\eta_t]^{95}_{5}$), linear correlation ($\rho_t$), and rank correlation ($\rho_{R,t}$). (b) Range of confidence intervals ($\Delta[\eta]^{95}_{5} = \eta^{95} - \eta^{5}$), plotted as a function of time, showing the uncertainty associated with the information coupling measurements. The vertical lines correspond to the September 2008 financial meltdown.
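The quantity in Figure 5(b) is the width of a 5% to 95% interval, $\eta^{95} - \eta^{5}$. The sketch below shows how such a width can be obtained from a set of resampled estimates. It does not reproduce the paper's procedure for deriving confidence bounds on information coupling; it substitutes a percentile bootstrap on the absolute linear correlation, with synthetic Gaussian data and a hypothetical function name, purely to illustrate the percentile arithmetic and how the width shrinks as the window grows.

```python
import numpy as np


def bootstrap_width(a, b, n_boot, rng):
    """5th and 95th percentiles of bootstrapped |correlation| and their difference."""
    n = len(a)
    stats = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        stats[i] = abs(np.corrcoef(a[idx], b[idx])[0, 1])
    p5, p95 = np.percentile(stats, [5, 95])
    return p5, p95, p95 - p5


# Hypothetical data with a true correlation of 0.5.
rng = np.random.default_rng(3)
z = rng.standard_normal((600, 2))
a = z[:, 0]
b = 0.5 * z[:, 0] + np.sqrt(0.75) * z[:, 1]

short = bootstrap_width(a[:60], b[:60], 500, rng)  # 60-sample window
long_ = bootstrap_width(a, b, 500, rng)            # 600-sample window
print(f"60 samples:  width = {short[2]:.3f}")
print(f"600 samples: width = {long_[2]:.3f}")
```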

effects of the crisis on the nature of statistical dependenciesin the spot FX market Accurate estimation of dependenciesat times of financial crises is of utmost importance as theseestimates are used by financial practitioners for a rangeof tasks such as rebalancing portfolios accurately pricingoptions and deciding on the level of risk-taking We firstpresent an application of the information coupling modelfor detecting temporal changes in dependencies in bivariateFX data streams at times of financial crises Figure 4 showsthe adjusted daily closing mid-prices (119875

119905) for AUDUSD and

USDJPY from January 2005 till April 2010 The two plotsclearly show an abrupt change in the exchange rates inSeptember-October 2008 This was caused at the height ofthe 2008 global financial crisis due to the unwinding ofcarry trades [52] Figure 5(a) displays three plots showing thetemporal variation of information coupling (120578

119905) linear cor-

relation (120588119905) and rank correlation (120588

119877119905) between AUDUSD

and USDJPY log-returns The plots are obtained using a six-month long sliding-window We notice the rise in uncer-tainty of the information coupling measure (Figure 5(b))right before the crash with uncertainty decreasing graduallythereafter this information may be useful to systematicallypredict upheavals in the market although we do not carryout this study in detail in this paper Information about thelevel of uncertainty can be used as a measure of confidencein the information coupling values and can be useful invarious practical decisionmaking scenarios such as decidingon the capital to deploy for the purpose of trading orselecting stocks (or currencies) for inclusion in a portfolio Asdaily sampled data is generally less non-Gaussian than datasampled at higher frequencies therefore the three plots inFigure 5(a) are somewhat similar during certain time periodsHowever right after the September 2008 crash the plotssignificantly deviate from each other We believe that this isbecause the nature of the data in particular its level of non-Gaussianity has changed As shown in Figure 6 the distancemeasure (120578

119905minus|120588119905|)2 between information coupling and linear

correlation closely matches the non-Gaussianity of the dataunder consideration The two plots are adjusted for easeof comparison The degree of non-Gaussianity is calculated

012345678

Date2005 2006 2007 2008 2009 2010

(120578119905 minus |120588119905|)2

JB2AUDUSDt + JB2

USDJPYt

Figure 6 Difference between information coupling and linearcorrelation plotted as a function of time Also plotted is a measureof non-Gaussianity of the two time series as defined by (40) Thetwo plots are adjusted for ease of comparison The vertical linecorresponds to the September 2008 financial meltdown

using the multivariate Jarque-Bera statistic (JBMV) which wedefine for a119873-dimensional multivariate data set as

JBMV =119873

sum

119895=1

[

[

119899119904

6(1205742

119895+

(120581119895minus 3)2

4)]

]

2

(40)

where 119899119904is the number of data points (in this case the size

of the sliding-window) 120574 is the skewness of the data underanalysis and 120581 is its kurtosisThis shows that relying solely oncorrelation measures to model dependencies in multivariatefinancial time series even when using data sampled atrelatively lower frequencies can potentially lead to inaccurateresults In contrast the information coupling model takesinto account properties of the data being analysed resultingin an accurate approach to measure statistical dependencies

We now show the utility of the information coupling model for analysing multivariate statistical dependencies. Figure 7 shows the temporal variation of information coupling between four major liquid currency pairs (EURUSD, GBPUSD, USDCHF, and USDJPY). The results are obtained using daily log-returns for a seven-year period and a six-month long sliding-window. Also plotted on the same figure is the FTSE-100 (Financial Times Stock Exchange 100) index for the corresponding time period, which has been adjusted for ease of comparison. The plot clearly shows an abrupt upward shift in coupling between the four currency pairs right at the time of the September 2008 financial meltdown, with a gradual decrease in coupling over the next year. We also notice an increase in the uncertainty associated with the information coupling measure before the 2008 crash. The increase in dependence of financial instruments in times of financial crises has been observed for other asset classes as well [53]. Our unique example, showing the dynamics of multivariate dependencies within the spot FX space, provides further insight into the nature of interdependencies in times of financial crisis.

Figure 7: Information coupling between EURUSD, GBPUSD, USDCHF, and USDJPY log-returns, plotted as a function of time. Also plotted is the FTSE-100 index, which is adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown. (Legend: information coupling $\eta_t$, its confidence interval $[\eta_t]^{95}_{5}$, and the adjusted FTSE-100 index; horizontal axis: $t$ in years, 2004 to 2010.)
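The sliding-window analysis of daily log-returns described above can be sketched for the two baseline measures, linear correlation and rank correlation. This is only an illustrative sketch: the function names are hypothetical, the toy prices are made up, and the information coupling measure itself is not reproduced here. For daily data, a six-month window would correspond to roughly 126 trading days, which is an assumption rather than a value given in the text.

```python
import math


def log_returns(prices):
    """r_t = log(P_t / P_{t-1}) for a list of positive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def pearson(x, y):
    """Linear correlation; returns nan if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def average_ranks(x):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    ranks = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with x[order[i]].
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Rank correlation: linear correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rolling(measure, x, y, window):
    """Apply a bivariate measure to each full window of the two series."""
    return [measure(x[t - window:t], y[t - window:t])
            for t in range(window, len(x) + 1)]


# Toy prices (assumed data) and a short window, for illustration only.
prices_a = [1.00, 1.01, 1.03, 1.02, 1.05, 1.04, 1.06]
prices_b = [100.0, 99.5, 99.0, 99.8, 98.9, 99.1, 98.7]
r_a, r_b = log_returns(prices_a), log_returns(prices_b)
print(rolling(pearson, r_a, r_b, 4))
print(rolling(spearman, r_a, r_b, 4))
```

Zero log-returns produce tied ranks, which is why the rank helper averages ranks over ties rather than assuming distinct values.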

6. Conclusions

We present an ICA-based approach to dynamically measure information coupling as a proxy for mutual information. This approach makes use of ICA as a tool to capture information in the tails of the underlying distributions and is suitable for efficiently and accurately measuring statistical dependencies between multiple non-Gaussian signals. As far as we know, this is the first attempt to quantify multivariate dependencies using information encoded in the ICA unmixing matrix. Our proposed information coupling model has multiple other benefits associated with its practical use. It provides a framework for estimating confidence bounds on the information coupling metric, can be efficiently used to directly model dependencies in high-dimensional spaces, and gives normalised, symmetric results. It has the added advantage of not depending on any user-defined parameters and is not data intensive; that is, it can be used even with relatively small data sets without significantly affecting its performance, an important requirement for analysing data streams with rapidly changing dynamics, such as financial returns. The model makes use of a sliding-window based decorrelating manifold approach to ICA, with a reciprocal cosh source model, to infer the ICA unmixing matrix, which results in increased accuracy and efficiency of the algorithm. This gives the information coupling model a computational complexity similar to that of linear correlation, with the accuracy of mutual information.
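To make the role of the reciprocal cosh source model more concrete, the sketch below evaluates the log-likelihood of a candidate unmixing matrix under that source density. It rests on assumptions not stated in this section: the density is taken as $p(s) = 1/(\pi \cosh s)$, mixing is square and noiseless so that $\log p(\mathbf{x}) = \log|\det \mathbf{W}| + \sum_i \log p(\mathbf{w}_i^{\top}\mathbf{x})$, and all function names are hypothetical. It does not implement the decorrelating manifold optimisation or the information coupling measure itself.

```python
import math


def log_recip_cosh(s):
    """log of p(s) = 1 / (pi * cosh(s)), written to avoid overflow."""
    a = abs(s)
    # log cosh(a) = a + log(1 + exp(-2a)) - log 2
    log_cosh = a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)
    return -math.log(math.pi) - log_cosh


def log_abs_det(matrix):
    """log|det W| via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    total = 0.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if m[p][c] == 0.0:
            return float("-inf")  # singular matrix
        m[c], m[p] = m[p], m[c]
        total += math.log(abs(m[c][c]))
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return total


def ica_log_likelihood(W, X):
    """Sum over samples x of log|det W| + sum_i log p(w_i . x)."""
    ld = log_abs_det(W)
    total = 0.0
    for x in X:
        sources = [sum(w * v for w, v in zip(row, x)) for row in W]
        total += ld + sum(log_recip_cosh(s) for s in sources)
    return total


# Toy two-dimensional window (assumed data) and two candidate matrices.
X = [[0.1, -0.2], [1.5, 0.3], [-0.7, 0.9], [0.0, -2.5]]
print(ica_log_likelihood([[1.0, 0.0], [0.0, 1.0]], X))
print(ica_log_likelihood([[0.8, 0.6], [-0.6, 0.8]], X))
```

In a sliding-window setting, a likelihood of this form would be maximised over the unmixing matrix separately for each window; the heavy tails of the reciprocal cosh density are what allow the fit to reflect non-Gaussian structure in the returns.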

Acknowledgments

The authors are grateful to the Oxford-Man Institute of Quantitative Finance for help in supporting this research. The first author would also like to thank Exeter College (University of Oxford) for funding.

References

[1] D. Berg and K. Aas, "Models for construction of multivariate dependence – a comparison study," The European Journal of Finance, vol. 15, no. 7-8, pp. 639–659, 2009.

[2] M. M. Dacorogna, An Introduction to High-Frequency Finance, Academic Press, New York, NY, USA, 2001.

[3] J. Hlinka, M. Palus, M. Vejmelka, D. Mantini, and M. Corbetta, "Functional connectivity in resting-state fMRI: is linear correlation sufficient?" NeuroImage, vol. 54, no. 3, pp. 2218–2225, 2011.

[4] S. J. Devlin, R. Gnanadesikan, and J. R. Kettenring, "Robust estimation and outlier detection with correlation coefficients," Biometrika, vol. 62, no. 3, pp. 531–545, 1975.

[5] A. R. Cowan, "Nonparametric event study tests," Review of Quantitative Finance and Accounting, vol. 2, no. 4, pp. 343–358, 1992.

[6] G. J. Glasser and R. F. Winter, "Critical values of the coefficient of rank correlation for testing the hypothesis of independence," Biometrika, vol. 48, no. 3-4, pp. 444–448, 1961.

[7] P. Embrechts, F. Lindskog, and A. McNeil, "Modelling dependence with copulas and applications to risk management," in Handbook of Heavy Tailed Distributions in Finance, vol. 1, chapter 8, pp. 329–384, 2003.

[8] E. Jondeau, S. H. Poon, and M. Rockinger, Financial Modeling Under Non-Gaussian Distributions, Springer, New York, NY, USA, 2007.

[9] J. D. Fermanian and O. Scaillet, "Some statistical pitfalls in copula modeling for financial applications," in Capital Formation, Governance and Banking, p. 59, 2005.

[10] T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley-Interscience, New York, NY, USA, 2006.

[11] A. Kraskov, H. Stogbauer, and P. Grassberger, "Estimating mutual information," Physical Review E, vol. 69, no. 6, Article ID 066138, 16 pages, 2004.

[12] S. Khan, S. Bandyopadhyay, and A. R. Ganguly, "Computing mutual information based nonlinear dependence among noisy and finite geophysical time series," in Proceedings of the American Geophysical Union, Fall Meeting, 2005, abstract no. NG22A-03.

[13] M. M. V. Hulle, "Edgeworth approximation of multivariate differential entropy," Neural Computation, vol. 17, no. 9, pp. 1903–1910, 2005.

[14] L. B. Almeida, "Linear and nonlinear ICA based on mutual information," in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 117–122, IEEE, 2000.

[15] A. D. Back and A. S. Weigend, "A first application of independent component analysis to extracting structure from stock returns," International Journal of Neural Systems, vol. 8, no. 4, pp. 473–484, 1997.

[16] E. Oja, K. Kiviluoto, and S. Malaroiu, "Independent component analysis for financial time series," in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 111–116, IEEE, 2000.

[17] A. Hyvarinen, "Survey on independent component analysis," Neural Computing Surveys, vol. 2, no. 4, pp. 94–128, 1999.

[18] C. J. Lu, T. S. Lee, and C. C. Chiu, "Financial time series forecasting using independent component analysis and support vector regression," Decision Support Systems, vol. 47, no. 2, pp. 115–125, 2009.

[19] H. Joe, "Relative entropy measures of multivariate dependence," Journal of the American Statistical Association, vol. 84, no. 405, pp. 157–164, 1989.

[20] M. Novey and T. Adali, "Complex ICA by negentropy maximization," IEEE Transactions on Neural Networks, vol. 19, no. 4, pp. 596–609, 2008.

[21] S. Roberts and R. Everson, Independent Component Analysis: Principles and Practice, Cambridge University Press, New York, NY, USA, 2001.

[22] J. Palmer, K. Kreutz-Delgado, and S. Makeig, "Super-Gaussian mixture source model for ICA," in Independent Component Analysis and Blind Signal Separation, pp. 854–861, 2006.

[23] R. A. Choudrey and S. J. Roberts, "Variational mixture of Bayesian independent component analyzers," Neural Computation, vol. 15, no. 1, pp. 213–252, 2003.

[24] R. Everson and S. Roberts, "Independent component analysis: a flexible nonlinearity and decorrelating manifold approach," Neural Computation, vol. 11, no. 8, pp. 1957–1983, 1999.

[25] R. E. Kass, L. Tierney, and J. B. Kadane, "Laplace's method in Bayesian analysis," in Statistical Multiple Integration: Proceedings of the AMS-IMS-SIAM Joint Summer Research Conference [on Statistical Multiple Integration] Held at Humboldt University, June 17–23, 1989, vol. 115, p. 89, American Mathematical Society, 1991.

[26] W. Addison and S. Roberts, "Blind source separation with non-stationary mixing using wavelets," in Proceedings of the ICA Research Network Workshop, The University of Liverpool, 2006.

[27] N. J. Higham, "Matrix nearness problems and applications," in Applications of Matrix Theory, 1989.

[28] K. B. Datta, Matrix and Linear Algebra, PHI Learning, 2004.

[29] K. Boudt, J. Cornelissen, and C. Croux, "The Gaussian rank correlation estimator: robustness properties," Statistics and Computing, vol. 22, no. 2, pp. 471–483, 2012.

[30] D. Evans, "A computationally efficient estimator for mutual information," Proceedings of the Royal Society A, vol. 464, no. 2093, pp. 1203–1215, 2008.

[31] K. Torkkola, "On feature extraction by mutual information maximization," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '02), vol. 1, pp. I821–I824, IEEE, May 2002.

[32] K. Nagarajan, B. Holland, C. Slatton, and A. D. George, "Scalable and portable architecture for probability density function estimation on FPGAs," in Proceedings of the 16th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '08), pp. 302–303, IEEE Computer Society, April 2008.

[33] C. G. Bowsher, "Modelling security market events in continuous time: intensity based, multivariate point process models," Journal of Econometrics, vol. 141, no. 2, pp. 876–912, 2007.

[34] B. Peiers, "Informed traders, intervention, and price leadership: a deeper view of the microstructure of the foreign exchange market," Journal of Finance, vol. 52, no. 4, pp. 1589–1614, 1997.

[35] A. Dionisio, R. Menezes, and D. A. Mendes, "Mutual information: a measure of dependency for nonlinear time series," Physica A, vol. 344, no. 1-2, pp. 326–329, 2004.

[36] J. Y. Campbell, A. W. Lo, A. C. MacKinlay, and R. F. Whitelaw, "The econometrics of financial markets," Macroeconomic Dynamics, vol. 2, no. 4, pp. 559–562, 1998.

[37] G. Koutmos, C. Negakis, and P. Theodossiou, "Stochastic behaviour of the Athens stock exchange," Applied Financial Economics, vol. 3, no. 2, pp. 119–126, 1993.

[38] D. M. Guillaume, M. M. Dacorogna, R. R. Dave, U. A. Muller, R. B. Olsen, and O. V. Pictet, "From the bird's eye to the microscope: a survey of new stylized facts of the intra-daily foreign exchange markets," Finance and Stochastics, vol. 1, no. 2, pp. 95–129, 1997.

[39] R. Cheng, "Using Pearson type IV and other Cinderella distributions in simulation," in Proceedings of the Winter Simulation Conference (WSC '11), pp. 457–468, IEEE, 2011.

[40] Y. Nagahara, "The PDF and CF of Pearson type IV distributions and the ML estimation of the parameters," Statistics and Probability Letters, vol. 43, no. 3, pp. 251–264, 1999.

[41] A. Shephard, Pearson IV, Institute for Fiscal Studies, University College London, 2008.

[42] S. Stavroyiannis, I. Makris, V. Nikolaidis, and L. Zarangas, "Econometric modeling and value-at-risk using the Pearson type IV distribution," International Review of Financial Analysis, vol. 22, pp. 10–17, 2012.

[43] M. Kendall, A. Stuart, and J. K. Ord, Kendall's Advanced Theory of Statistics, Charles Griffin, 1987.

[44] R. Willink, "A closed-form expression for the Pearson type IV distribution function," Australian and New Zealand Journal of Statistics, vol. 50, no. 2, pp. 199–205, 2008.

[45] A. Venelli, "Efficient entropy estimation for mutual information analysis using B-splines," in Information Security Theory and Practices. Security and Privacy of Pervasive Systems and Smart Devices, pp. 17–30, 2010.

[46] C. G. Park and D. W. Shin, "An algorithm for generating correlated random variables in a class of infinitely divisible distributions," Journal of Statistical Computation and Simulation, vol. 61, no. 1-2, pp. 127–139, 1998.

[47] R. L. Iman and W. J. Conover, "A distribution-free approach to inducing rank correlation among input variables," Communications in Statistics - Simulation and Computation, vol. 11, no. 3, pp. 311–334, 1982.

[48] N. Shah and S. Roberts, "Hidden Markov independent component analysis as a measure of coupling in multivariate financial time series," in Proceedings of the ICA Research Network International Workshop, Liverpool, UK, 2008.

[49] A. Rossi and G. M. Gallo, "Volatility estimation via hidden Markov models," Journal of Empirical Finance, vol. 13, no. 2, pp. 203–230, 2006.

[50] J. Crotty, "Structural causes of the global financial crisis: a critical assessment of the 'new financial architecture'," Cambridge Journal of Economics, vol. 33, no. 4, pp. 563–580, 2009.

[51] J. A. Murphy, "An analysis of the financial crisis of 2008: causes and solutions," Social Science Research Network, 2008.

[52] M. Pojarliev and R. M. Levich, "Detecting crowded trades in currency funds," Financial Analysts Journal, vol. 67, no. 1, pp. 26–39, 2011.

[53] L. Sandoval and I. D. P. Franca, "Correlation of financial markets in times of crisis," Physica A, vol. 391, no. 1, pp. 187–208, 2012.



[43] M Kendall A Stuart and J K Ord Kendallrsquos AdvancedTheoryof Statistics Charles Griffin 1987

[44] R Willink ldquoA closed-form expression for the pearson type IVdistribution functionrdquo Australian and New Zealand Journal ofStatistics vol 50 no 2 pp 199ndash205 2008

[45] A Venelli ldquoEfficient entropy estimation formutual informationanalysis using B-splinesrdquo in Information Security Theory andPractices Security and Privacy of Pervasive Systems and SmartDevices pp 17ndash30 2010

[46] C G Park and D W Shin ldquoAn algorithm for generatingcorrelated random variables in a class of infinitely divisible dis-tributionsrdquo Journal of Statistical Computation and Simulationvol 61 no 1-2 pp 127ndash139 1998

[47] R L Iman and W J Conover ldquoA distribution-free approach toinducing rank correlation among input variablesrdquo Communica-tions in Statistics-Simulation and Computation vol 11 no 3 pp311ndash334 1982

[48] N Shah and S Roberts ldquoHidden Markov independent compo-nent analysis as a measure of coupling in multivariate financialtime seriesrdquo in Proceedings of the ICA Research Network Inter-national Workshop Liverpool UK 2008

[49] A Rossi and G M Gallo ldquoVolatility estimation via hiddenMarkov modelsrdquo Journal of Empirical Finance vol 13 no 2 pp203ndash230 2006

[50] J Crotty ldquoStructural causes of the global financial crisis a crit-ical assessment of the lsquonew financial architecturersquordquo CambridgeJournal of Economics vol 33 no 4 pp 563ndash580 2009

14 ISRN Signal Processing

[51] J A Murphy ldquoAn analysis of the financial crisis of 2008 causesand solutionsrdquo Social Science Research Network 2008

[52] M Pojarliev and R M Levich ldquoDetecting crowded trades incurrency fundsrdquo Financial Analysts Journal vol 67 no 1 pp26ndash39 2011

[53] L Sandoval and ID P Franca ldquoCorrelation of financialmarketsin times of crisisrdquo Physica A vol 391 no 1 pp 187ndash208 2012


Figure 2: (a) Temporal variation of information coupling $\eta(t)$, plotted as a function of time. Step changes in coupling are visible after every 8000 time steps, that is, $\rho_{TC} = 0.2$ when $t = 1{:}8000$, $\rho_{TC} = 0.4$ when $t = 8001{:}16000$, $\rho_{TC} = 0.6$ when $t = 16001{:}24000$, and $\rho_{TC} = 0.8$ when $t = 24001{:}32000$. Also plotted are the median $[\eta(t)]_{\text{median}}$ and the 5% and 95% confidence interval contours $[\eta(t)]^{95}_{5}$. (b) Normalised empirical pdf plots for information coupling ($\eta$), linear correlation ($\rho$), and rank correlation ($\rho_R$). The vertical lines represent the true correlation ($\rho_{TC}$) values. The relative accuracy of the information coupling measure is evident from these results. [Axes: (a) $t$ ($\times 10^4$) against $\eta(t)$; (b) magnitude of dependency measure against pdf.]
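As a rough illustration of the synthetic setup described in the Figure 2 caption, the following Python sketch generates a bivariate series whose true correlation steps through 0.2, 0.4, 0.6, and 0.8 every 8000 samples and then estimates the linear correlation within each block. It is a sketch under stated assumptions, not the procedure used to produce Figure 2: the samples are Gaussian for simplicity, the helpers `pearson` and `step_correlated_pair` are hypothetical names introduced for this example, and information coupling itself is not computed.

```python
import math
import random


def pearson(x, y):
    """Sample linear (Pearson) correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def step_correlated_pair(levels=(0.2, 0.4, 0.6, 0.8), block=8000, seed=0):
    """Bivariate series whose true correlation steps through `levels`.

    Assumption: Gaussian marginals, chosen only to keep the sketch short.
    """
    rng = random.Random(seed)
    x, y = [], []
    for rho in levels:
        for _ in range(block):
            z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
            x.append(z1)
            # Mixing two independent unit-variance draws this way
            # gives corr(x, y) = rho in the population.
            y.append(rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return x, y


x, y = step_correlated_pair()
# One linear correlation estimate per 8000-sample block.
segment_rho = [pearson(x[i:i + 8000], y[i:i + 8000])
               for i in range(0, 32000, 8000)]
print([round(r, 2) for r in segment_rho])
```

With Gaussian data the block-wise linear correlation should sit close to each true value; the comparison in Figure 2(b) concerns how the three measures behave on the data set used there.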

Figure 3: Information coupling ($\eta_t$) and linear correlation ($\rho_t$) plotted as a function of time for a section of the 0.5-second sampled EURUSD and GBPUSD log-returns data set. A 5-minute wide sliding-window is used to obtain the results. [Axes: time (s) against magnitude of dependency measure; the confidence interval $[\eta_t]^{95}_{5}$ is also shown.]
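The sliding-window linear correlation shown alongside information coupling in Figure 3 can be sketched as follows. A 5-minute window at 0.5-second sampling corresponds to 600 observations per window. The price series below are synthetic random walks with a shared component, used purely as a stand-in for the FX data; the helper names are hypothetical, and the sketch covers only the linear correlation baseline, not the information coupling measure.

```python
import math
import random


def log_returns(prices):
    """Log-returns r_t = log(P_t / P_{t-1})."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def pearson(x, y):
    """Sample linear (Pearson) correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sliding_correlation(rx, ry, window):
    """Linear correlation over each trailing window of `window` samples."""
    return [pearson(rx[t - window:t], ry[t - window:t])
            for t in range(window, len(rx) + 1)]


WINDOW = int(5 * 60 / 0.5)  # 5 minutes of 0.5-second samples = 600

# Synthetic mid-prices (assumption): two random walks sharing one shock term.
rng = random.Random(1)
p_a, p_b = [1.30], [1.60]
for _ in range(2000):
    common = rng.gauss(0.0, 1e-4)
    p_a.append(p_a[-1] * math.exp(common + rng.gauss(0.0, 1e-4)))
    p_b.append(p_b[-1] * math.exp(common + rng.gauss(0.0, 1e-4)))

rho_t = sliding_correlation(log_returns(p_a), log_returns(p_b), WINDOW)
print(len(rho_t), round(min(rho_t), 2), round(max(rho_t), 2))
```

Recomputing each window from scratch keeps the sketch easy to read; a streaming implementation would update running sums instead.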

changes. In [48], we show that these regions of persistence in statistical dependence may be captured using a hidden Markov ICA model, which is a hidden Markov model with an ICA observation model. The information thus obtained can be useful for a range of financial applications, such as finding regions of financial market volatility clustering and persistence [49].

Figure 4: Daily closing mid-prices ($P_t$) of AUDUSD and USDJPY. The two plots are adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown. [Axes: date (2005 to 2010) against adjusted closing prices ($P_t$).]

There have been numerous academic studies on the causes and effects of the 2008 financial crisis [50, 51]. However, very few of these have focused on the impact of the crisis on interdependencies in the global FX market. Here, we present a set of examples which give a unique insight into the effects of the crisis on the nature of statistical dependencies in the spot FX market. Accurate estimation of dependencies at times of financial crises is of utmost importance, as these estimates are used by financial practitioners for a range of tasks, such as rebalancing portfolios, accurately pricing options, and deciding on the level of risk-taking. We first present an application of the information coupling model for detecting temporal changes in dependencies in bivariate FX data streams at times of financial crises. Figure 4 shows the adjusted daily closing mid-prices ($P_t$) for AUDUSD and USDJPY from January 2005 till April 2010. The two plots clearly show an abrupt change in the exchange rates in September-October 2008. This change occurred at the height of the 2008 global financial crisis and was caused by the unwinding of carry trades [52].

Figure 5: (a) Three approaches used to measure temporal dependencies between AUDUSD and USDJPY log-returns: information coupling ($\eta_t$), with its confidence interval $[\eta_t]^{95}_{5}$, linear correlation ($\rho_t$), and rank correlation ($\rho_{R,t}$). (b) Range of confidence intervals ($\Delta[\eta]^{95}_{5} = \eta_{95} - \eta_{5}$) plotted as a function of time, showing the uncertainty associated with the information coupling measurements. The vertical lines correspond to the September 2008 financial meltdown.

Figure 6: Difference between information coupling and linear correlation, $(\eta_t - |\rho_t|)^2$, plotted as a function of time. Also plotted is a measure of non-Gaussianity of the two time series, $\mathrm{JB}^2_{\mathrm{AUDUSD},t} + \mathrm{JB}^2_{\mathrm{USDJPY},t}$, as defined by (40). The two plots are adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

Figure 5(a) displays three plots showing the temporal variation of information coupling ($\eta_t$), linear correlation ($\rho_t$), and rank correlation ($\rho_{R,t}$) between AUDUSD and USDJPY log-returns. The plots are obtained using a six-month long sliding-window. We notice the rise in uncertainty of the information coupling measure (Figure 5(b)) right before the crash, with uncertainty decreasing gradually thereafter; this information may be useful to systematically predict upheavals in the market, although we do not carry out this study in detail in this paper. Information about the level of uncertainty can be used as a measure of confidence in the information coupling values and can be useful in various practical decision-making scenarios, such as deciding on the capital to deploy for the purpose of trading or selecting stocks (or currencies) for inclusion in a portfolio. As daily sampled data is generally less non-Gaussian than data sampled at higher frequencies, the three plots in Figure 5(a) are somewhat similar during certain time periods. However, right after the September 2008 crash, the plots deviate significantly from each other. We believe that this is because the nature of the data, in particular its level of non-Gaussianity, has changed. As shown in Figure 6, the distance measure $(\eta_t - |\rho_t|)^2$ between information coupling and linear correlation closely matches the non-Gaussianity of the data under consideration. The two plots are adjusted for ease of comparison. The degree of non-Gaussianity is calculated using the multivariate Jarque-Bera statistic ($\mathrm{JB}_{\mathrm{MV}}$), which we define for an $N$-dimensional multivariate data set as

$$\mathrm{JB}_{\mathrm{MV}} = \sum_{j=1}^{N} \left[ \frac{n_s}{6} \left( \gamma_j^2 + \frac{(\kappa_j - 3)^2}{4} \right) \right]^2, \qquad (40)$$

where $n_s$ is the number of data points (in this case, the size of the sliding-window), $\gamma_j$ is the skewness of the $j$th dimension of the data under analysis, and $\kappa_j$ is its kurtosis. This shows that relying solely on correlation measures to model dependencies in multivariate financial time series, even when using data sampled at relatively lower frequencies, can potentially lead to inaccurate results. In contrast, the information coupling model takes into account properties of the data being analysed, resulting in an accurate approach to measure statistical dependencies.
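The statistic in (40) is a sum of squared univariate Jarque-Bera statistics, one per dimension, and can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: the function names are hypothetical, and the skewness and kurtosis are computed from biased (divide-by-$n_s$) central moments, which is an assumption since the text does not state which estimator is used.

```python
import random


def jarque_bera(x):
    """Univariate Jarque-Bera statistic n/6 * (skew^2 + (kurt - 3)^2 / 4).

    Assumption: biased central moments (divide by n) for skewness/kurtosis.
    """
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


def jb_mv(series):
    """Equation (40): sum over the N series of the squared JB statistic."""
    return sum(jarque_bera(x) ** 2 for x in series)


rng = random.Random(0)
n_s = 500  # hypothetical window length

# Two Gaussian series versus two heavy-tailed (Laplace) series; a Laplace
# draw is the difference of two independent unit exponentials.
gaussian = [[rng.gauss(0.0, 1.0) for _ in range(n_s)] for _ in range(2)]
heavy = [[rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n_s)]
         for _ in range(2)]

print(jb_mv(gaussian) < jb_mv(heavy))
```

Applied to each sliding window in turn, `jb_mv` gives a time-varying non-Gaussianity trace of the kind plotted in Figure 6.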

Figure 7: Information coupling ($\eta_t$) between EURUSD, GBPUSD, USDCHF, and USDJPY log-returns, plotted as a function of time together with its confidence interval $[\eta_t]^{95}_{5}$. Also plotted is the FTSE-100 index, which is adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

We now show the utility of the information coupling model for analysing multivariate statistical dependencies. Figure 7 shows the temporal variation of information coupling between four major liquid currency pairs (EURUSD, GBPUSD, USDCHF, and USDJPY). The results are obtained using daily log-returns for a seven-year period and a six-month long sliding-window. Also plotted on the same figure is the FTSE-100 (Financial Times Stock Exchange 100) index for the corresponding time period, which has been adjusted for ease of comparison. The plot clearly shows an abrupt upward shift in coupling between the four currency pairs right at the time of the September 2008 financial meltdown, with a gradual decrease in coupling over the next year. We also notice an increase in the uncertainty associated with the information coupling measure before the 2008 crash. The increase in dependence of financial instruments in times of financial crises has been observed for other asset classes as well [53]. Our example, showing the dynamics of multivariate dependencies within the spot FX space, provides further insight into the nature of interdependencies in times of financial crisis.
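The uncertainty referred to above is summarised in Figure 5(b) as the range $\Delta[\eta]^{95}_{5} = \eta_{95} - \eta_{5}$. Assuming a set of sampled coupling values is already available for a given window (how such samples are obtained is not shown here), the range can be computed as in the Python sketch below. The helper names and the two sample sets are hypothetical and serve only to illustrate the arithmetic.

```python
def percentile(values, q):
    """Linearly interpolated q-th percentile, with 0 <= q <= 100."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def confidence_range(samples):
    """Return (eta_5, eta_95, eta_95 - eta_5) for one window's samples."""
    eta_5 = percentile(samples, 5)
    eta_95 = percentile(samples, 95)
    return eta_5, eta_95, eta_95 - eta_5


# Hypothetical sampled coupling values for two windows: one tightly
# clustered (low uncertainty) and one widely spread (high uncertainty).
narrow = [0.30 + 0.001 * i for i in range(101)]
wide = [0.10 + 0.004 * i for i in range(101)]

print(confidence_range(narrow)[2] < confidence_range(wide)[2])
```

A widening range across successive windows corresponds to the rise in uncertainty noted before the 2008 crash.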

6. Conclusions

We present an ICA-based approach to dynamically measure information coupling, as a proxy for mutual information. This approach makes use of ICA as a tool to capture information in the tails of the underlying distributions and is suitable for efficiently and accurately measuring statistical dependencies between multiple non-Gaussian signals. As far as we know, this is the first attempt to quantify multivariate dependencies using information encoded in the ICA unmixing matrix. Our proposed information coupling model has multiple other benefits associated with its practical use. It provides a framework for estimating confidence bounds on the information coupling metric, can be efficiently used to directly model dependencies in high-dimensional spaces, and gives normalised, symmetric results. It has the added advantage of not depending on any user-defined parameters and is not data intensive; that is, it can be used even with relatively small data sets without significantly affecting its performance, an important requirement for analysing data streams with rapidly changing dynamics, such as financial returns. The model makes use of a sliding-window based decorrelating manifold approach to ICA, with a reciprocal cosh source model, to infer the ICA unmixing matrix, which results in increased accuracy and efficiency of the algorithm. This gives the information coupling model a computational complexity similar to that of linear correlation, with the accuracy of mutual information.

Acknowledgments

The authors are grateful to the Oxford-Man Institute of Quantitative Finance for help in supporting this research. The first author would also like to thank Exeter College (University of Oxford) for funding.

References

[1] D. Berg and K. Aas, "Models for construction of multivariate dependence: a comparison study," The European Journal of Finance, vol. 15, no. 7-8, pp. 639–659, 2009.

[2] M. M. Dacorogna, An Introduction to High-Frequency Finance, Academic Press, New York, NY, USA, 2001.

[3] J. Hlinka, M. Palus, M. Vejmelka, D. Mantini, and M. Corbetta, "Functional connectivity in resting-state fMRI: is linear correlation sufficient?" NeuroImage, vol. 54, no. 3, pp. 2218–2225, 2011.

[4] S. J. Devlin, R. Gnanadesikan, and J. R. Kettenring, "Robust estimation and outlier detection with correlation coefficients," Biometrika, vol. 62, no. 3, pp. 531–545, 1975.

[5] A. R. Cowan, "Nonparametric event study tests," Review of Quantitative Finance and Accounting, vol. 2, no. 4, pp. 343–358, 1992.

[6] G. J. Glasser and R. F. Winter, "Critical values of the coefficient of rank correlation for testing the hypothesis of independence," Biometrika, vol. 48, no. 3-4, pp. 444–448, 1961.

[7] P. Embrechts, F. Lindskog, and A. McNeil, "Modelling dependence with copulas and applications to risk management," in Handbook of Heavy Tailed Distributions in Finance, vol. 1, chapter 8, pp. 329–384, 2003.

[8] E. Jondeau, S. H. Poon, and M. Rockinger, Financial Modeling Under Non-Gaussian Distributions, Springer, New York, NY, USA, 2007.

[9] J. D. Fermanian and O. Scaillet, "Some statistical pitfalls in copula modeling for financial applications," in Capital Formation, Governance and Banking, p. 59, 2005.

[10] T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley-Interscience, New York, NY, USA, 2006.

[11] A. Kraskov, H. Stogbauer, and P. Grassberger, "Estimating mutual information," Physical Review E, vol. 69, no. 6, Article ID 066138, 16 pages, 2004.

[12] S. Khan, S. Bandyopadhyay, and A. R. Ganguly, "Computing mutual information based nonlinear dependence among noisy and finite geophysical time series," in Proceedings of the American Geophysical Union, Fall Meeting, 2005, abstract no. NG22A-03.

[13] M. M. V. Hulle, "Edgeworth approximation of multivariate differential entropy," Neural Computation, vol. 17, no. 9, pp. 1903–1910, 2005.


[14] L. B. Almeida, "Linear and nonlinear ICA based on mutual information," in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 117–122, IEEE, 2000.

[15] A. D. Back and A. S. Weigend, "A first application of independent component analysis to extracting structure from stock returns," International Journal of Neural Systems, vol. 8, no. 4, pp. 473–484, 1997.

[16] E. Oja, K. Kiviluoto, and S. Malaroiu, "Independent component analysis for financial time series," in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 111–116, IEEE, 2000.

[17] A. Hyvarinen, "Survey on independent component analysis," Neural Computing Surveys, vol. 2, no. 4, pp. 94–128, 1999.

[18] C. J. Lu, T. S. Lee, and C. C. Chiu, "Financial time series forecasting using independent component analysis and support vector regression," Decision Support Systems, vol. 47, no. 2, pp. 115–125, 2009.

[19] H. Joe, "Relative entropy measures of multivariate dependence," Journal of the American Statistical Association, vol. 84, no. 405, pp. 157–164, 1989.

[20] M. Novey and T. Adali, "Complex ICA by negentropy maximization," IEEE Transactions on Neural Networks, vol. 19, no. 4, pp. 596–609, 2008.

[21] S. Roberts and R. Everson, Independent Component Analysis: Principles and Practice, Cambridge University Press, New York, NY, USA, 2001.

[22] J. Palmer, K. Kreutz-Delgado, and S. Makeig, "Super-Gaussian mixture source model for ICA," in Independent Component Analysis and Blind Signal Separation, pp. 854–861, 2006.

[23] R. A. Choudrey and S. J. Roberts, "Variational mixture of Bayesian independent component analyzers," Neural Computation, vol. 15, no. 1, pp. 213–252, 2003.

[24] R. Everson and S. Roberts, "Independent component analysis: a flexible nonlinearity and decorrelating manifold approach," Neural Computation, vol. 11, no. 8, pp. 1957–1983, 1999.

[25] R. E. Kass, L. Tierney, and J. B. Kadane, "Laplace's method in Bayesian analysis," in Statistical Multiple Integration: Proceedings of the AMS-IMS-SIAM Joint Summer Research Conference [on Statistical Multiple Integration] Held at Humboldt University, June 17–23, 1989, vol. 115, p. 89, American Mathematical Society, 1991.

[26] W. Addison and S. Roberts, "Blind source separation with non-stationary mixing using wavelets," in Proceedings of the ICA Research Network Workshop, The University of Liverpool, 2006.

[27] N. J. Higham, "Matrix nearness problems and applications," in Applications of Matrix Theory, 1989.

[28] K. B. Datta, Matrix and Linear Algebra, PHI Learning, 2004.

[29] K. Boudt, J. Cornelissen, and C. Croux, "The Gaussian rank correlation estimator: robustness properties," Statistics and Computing, vol. 22, no. 2, pp. 471–483, 2012.

[30] D. Evans, "A computationally efficient estimator for mutual information," Proceedings of the Royal Society A, vol. 464, no. 2093, pp. 1203–1215, 2008.

[31] K. Torkkola, "On feature extraction by mutual information maximization," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '02), vol. 1, pp. I821–I824, IEEE, May 2002.

[32] K. Nagarajan, B. Holland, C. Slatton, and A. D. George, "Scalable and portable architecture for probability density function estimation on FPGAs," in Proceedings of the 16th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '08), pp. 302–303, IEEE Computer Society, April 2008.

[33] C. G. Bowsher, "Modelling security market events in continuous time: intensity based, multivariate point process models," Journal of Econometrics, vol. 141, no. 2, pp. 876–912, 2007.

[34] B. Peiers, "Informed traders, intervention, and price leadership: a deeper view of the microstructure of the foreign exchange market," Journal of Finance, vol. 52, no. 4, pp. 1589–1614, 1997.

[35] A. Dionisio, R. Menezes, and D. A. Mendes, "Mutual information: a measure of dependency for nonlinear time series," Physica A, vol. 344, no. 1-2, pp. 326–329, 2004.

[36] J. Y. Campbell, A. W. Lo, A. C. MacKinlay, and R. F. Whitelaw, "The econometrics of financial markets," Macroeconomic Dynamics, vol. 2, no. 4, pp. 559–562, 1998.

[37] G. Koutmos, C. Negakis, and P. Theodossiou, "Stochastic behaviour of the Athens stock exchange," Applied Financial Economics, vol. 3, no. 2, pp. 119–126, 1993.

[38] D. M. Guillaume, M. M. Dacorogna, R. R. Dave, U. A. Muller, R. B. Olsen, and O. V. Pictet, "From the bird's eye to the microscope: a survey of new stylized facts of the intra-daily foreign exchange markets," Finance and Stochastics, vol. 1, no. 2, pp. 95–129, 1997.

[39] R. Cheng, "Using Pearson type IV and other Cinderella distributions in simulation," in Proceedings of the Winter Simulation Conference (WSC '11), pp. 457–468, IEEE, 2011.

[40] Y. Nagahara, "The PDF and CF of Pearson type IV distributions and the ML estimation of the parameters," Statistics and Probability Letters, vol. 43, no. 3, pp. 251–264, 1999.

[41] A. Shephard, Pearson IV, Institute for Fiscal Studies, University College London, 2008.

[42] S. Stavroyiannis, I. Makris, V. Nikolaidis, and L. Zarangas, "Econometric modeling and value-at-risk using the Pearson type IV distribution," International Review of Financial Analysis, vol. 22, pp. 10–17, 2012.

[43] M. Kendall, A. Stuart, and J. K. Ord, Kendall's Advanced Theory of Statistics, Charles Griffin, 1987.

[44] R. Willink, "A closed-form expression for the Pearson type IV distribution function," Australian and New Zealand Journal of Statistics, vol. 50, no. 2, pp. 199–205, 2008.

[45] A. Venelli, "Efficient entropy estimation for mutual information analysis using B-splines," in Information Security Theory and Practices: Security and Privacy of Pervasive Systems and Smart Devices, pp. 17–30, 2010.

[46] C. G. Park and D. W. Shin, "An algorithm for generating correlated random variables in a class of infinitely divisible distributions," Journal of Statistical Computation and Simulation, vol. 61, no. 1-2, pp. 127–139, 1998.

[47] R. L. Iman and W. J. Conover, "A distribution-free approach to inducing rank correlation among input variables," Communications in Statistics-Simulation and Computation, vol. 11, no. 3, pp. 311–334, 1982.

[48] N. Shah and S. Roberts, "Hidden Markov independent component analysis as a measure of coupling in multivariate financial time series," in Proceedings of the ICA Research Network International Workshop, Liverpool, UK, 2008.

[49] A. Rossi and G. M. Gallo, "Volatility estimation via hidden Markov models," Journal of Empirical Finance, vol. 13, no. 2, pp. 203–230, 2006.

[50] J. Crotty, "Structural causes of the global financial crisis: a critical assessment of the 'new financial architecture'," Cambridge Journal of Economics, vol. 33, no. 4, pp. 563–580, 2009.


[51] J. A. Murphy, "An analysis of the financial crisis of 2008: causes and solutions," Social Science Research Network, 2008.

[52] M. Pojarliev and R. M. Levich, "Detecting crowded trades in currency funds," Financial Analysts Journal, vol. 67, no. 1, pp. 26–39, 2011.

[53] L. Sandoval and I. D. P. Franca, "Correlation of financial markets in times of crisis," Physica A, vol. 391, no. 1, pp. 187–208, 2012.


Page 11: Research Article Dynamically Measuring Statistical ...downloads.hindawi.com/archive/2013/434832.pdf · Research Article Dynamically Measuring Statistical Dependencies in Multivariate

ISRN Signal Processing 11

2005 2006 2007 2008 2009 20100

01

02

03

04

05

Mag

nitu

de o

fde

pend

ency

mea

sure

[120578119905]955

Information coupling (120578119905)

Linear correlation (120588119905)Rank correlation (120588119877119905)

(a)

0005

01015

02025

03

2005 2006 2007 2008 2009 2010

Δ[120578

119905]95

5

(b)

Figure 5 (a)Three approaches used tomeasure temporal dependencies betweenAUDUSD andUSDJPY log-returns (b) Range of confidenceintervals (Δ[120578]95

5 = 12057895minus1205785) plotted as a function of time showing the uncertainty associatedwith the information couplingmeasurementsThe vertical lines correspond to the September 2008 financial meltdown

effects of the crisis on the nature of statistical dependenciesin the spot FX market Accurate estimation of dependenciesat times of financial crises is of utmost importance as theseestimates are used by financial practitioners for a rangeof tasks such as rebalancing portfolios accurately pricingoptions and deciding on the level of risk-taking We firstpresent an application of the information coupling modelfor detecting temporal changes in dependencies in bivariateFX data streams at times of financial crises Figure 4 showsthe adjusted daily closing mid-prices (119875

119905) for AUDUSD and

USDJPY from January 2005 till April 2010 The two plotsclearly show an abrupt change in the exchange rates inSeptember-October 2008 This was caused at the height ofthe 2008 global financial crisis due to the unwinding ofcarry trades [52] Figure 5(a) displays three plots showing thetemporal variation of information coupling (120578

119905) linear cor-

relation (120588119905) and rank correlation (120588

119877119905) between AUDUSD

and USDJPY log-returns The plots are obtained using a six-month long sliding-window We notice the rise in uncer-tainty of the information coupling measure (Figure 5(b))right before the crash with uncertainty decreasing graduallythereafter this information may be useful to systematicallypredict upheavals in the market although we do not carryout this study in detail in this paper Information about thelevel of uncertainty can be used as a measure of confidencein the information coupling values and can be useful invarious practical decisionmaking scenarios such as decidingon the capital to deploy for the purpose of trading orselecting stocks (or currencies) for inclusion in a portfolio Asdaily sampled data is generally less non-Gaussian than datasampled at higher frequencies therefore the three plots inFigure 5(a) are somewhat similar during certain time periodsHowever right after the September 2008 crash the plotssignificantly deviate from each other We believe that this isbecause the nature of the data in particular its level of non-Gaussianity has changed As shown in Figure 6 the distancemeasure (120578

119905minus|120588119905|)2 between information coupling and linear

correlation closely matches the non-Gaussianity of the dataunder consideration The two plots are adjusted for easeof comparison The degree of non-Gaussianity is calculated

012345678

Date2005 2006 2007 2008 2009 2010

(120578119905 minus |120588119905|)2

JB2AUDUSDt + JB2

USDJPYt

Figure 6 Difference between information coupling and linearcorrelation plotted as a function of time Also plotted is a measureof non-Gaussianity of the two time series as defined by (40) Thetwo plots are adjusted for ease of comparison The vertical linecorresponds to the September 2008 financial meltdown

using the multivariate Jarque-Bera statistic (JBMV) which wedefine for a119873-dimensional multivariate data set as

JBMV =119873

sum

119895=1

[

[

119899119904

6(1205742

119895+

(120581119895minus 3)2

4)]

]

2

(40)

where 119899119904is the number of data points (in this case the size

of the sliding-window) 120574 is the skewness of the data underanalysis and 120581 is its kurtosisThis shows that relying solely oncorrelation measures to model dependencies in multivariatefinancial time series even when using data sampled atrelatively lower frequencies can potentially lead to inaccurateresults In contrast the information coupling model takesinto account properties of the data being analysed resultingin an accurate approach to measure statistical dependencies

[Figure 7 plot: x-axis $t$ (years), 2004 to 2010; y-axis $\eta_t$, 0 to 0.5; legend: information coupling between EURUSD, GBPUSD, USDCHF, and USDJPY with its confidence interval, and the FTSE-100 index (adjusted).]

Figure 7: Information coupling between EURUSD, GBPUSD, USDCHF, and USDJPY log-returns, plotted as a function of time. Also plotted is the FTSE-100 index, which is adjusted for ease of comparison. The vertical line corresponds to the September 2008 financial meltdown.

We now show utility of the information coupling model for analysing multivariate statistical dependencies. Figure 7 shows the temporal variation of information coupling between four major liquid currency pairs (EURUSD, GBPUSD, USDCHF, and USDJPY). The results are obtained using daily log-returns for a seven-year period and a six-month long sliding-window. Also plotted on the same figure is the FTSE-100 (Financial Times Stock Exchange 100) index for the corresponding time period, which has been adjusted for ease of comparison. The plot clearly shows an abrupt upward shift in coupling between the four currency pairs right at the time of the September 2008 financial meltdown, with gradual decrease in coupling over the next year. We also notice an increase in the uncertainty associated with the information coupling measure before the 2008 crash. The increase in dependence of financial instruments in times of financial crises has been observed for other asset classes as well [53]. Our unique example showing the dynamics of multivariate dependencies within the spot FX space provides further insight into the nature of interdependencies in times of financial crisis.
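The data preparation described above (daily log-returns analysed through a sliding window) can be sketched as follows. This is an illustrative sketch only: the helper names `log_returns` and `sliding_windows` are hypothetical, and the window length of 126 observations is an assumed approximation of six months of trading days, not a value taken from the paper.

```python
import numpy as np


def log_returns(prices):
    """Log-returns r_t = ln(p_t) - ln(p_{t-1}) along the time axis (axis 0)."""
    p = np.asarray(prices, dtype=float)
    return np.diff(np.log(p), axis=0)


def sliding_windows(returns, window_len):
    """Yield (index of last sample, window) for every full window."""
    r = np.asarray(returns)
    for end in range(window_len, len(r) + 1):
        yield end - 1, r[end - window_len:end]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic price paths for four instruments (stand-ins for FX rates).
    prices = np.exp(np.cumsum(0.01 * rng.standard_normal((500, 4)), axis=0))
    r = log_returns(prices)
    window_len = 126  # assumed: roughly six months of daily observations
    for t, window in sliding_windows(r, window_len):
        pass  # a dependence measure would be evaluated on `window` here
    print(r.shape, t)
```

Any windowed dependence measure, for example linear correlation or the information coupling measure of this paper, would be evaluated on each `window` in turn to produce a time-varying estimate.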

6. Conclusions

We present an ICA-based approach to dynamically measure information coupling as a proxy for mutual information. This approach makes use of ICA as a tool to capture information in the tails of the underlying distributions and is suitable for efficiently and accurately measuring statistical dependencies between multiple non-Gaussian signals. As far as we know, this is the first attempt to quantify multivariate dependencies using information encoded in the ICA unmixing matrix. Our proposed information coupling model has multiple other benefits associated with its practical use. It provides a framework for estimating confidence bounds on the information coupling metric, can be efficiently used to directly model dependencies in high-dimensional spaces, and gives normalised, symmetric results. It has the added advantage of not depending on any user-defined parameters and is not data intensive; that is, it can be used even with relatively small data sets without significantly affecting its performance, an important requirement for analysing data streams with rapidly changing dynamics, such as financial returns. The model makes use of a sliding-window based decorrelating manifold approach to ICA, with a reciprocal cosh source model, to infer the ICA unmixing matrix, which results in increased accuracy and efficiency of the algorithm. This gives the information coupling model a computational complexity similar to that of linear correlation, with the accuracy of mutual information.

Acknowledgments

The authors are grateful to the Oxford-Man Institute of Quantitative Finance for help in supporting this research. The first author would also like to thank Exeter College (University of Oxford) for funding.

References

[1] D. Berg and K. Aas, "Models for construction of multivariate dependence - a comparison study," The European Journal of Finance, vol. 15, no. 7-8, pp. 639–659, 2009.

[2] M. M. Dacorogna, An Introduction to High-Frequency Finance, Academic Press, New York, NY, USA, 2001.

[3] J. Hlinka, M. Palus, M. Vejmelka, D. Mantini, and M. Corbetta, "Functional connectivity in resting-state fMRI: is linear correlation sufficient?" NeuroImage, vol. 54, no. 3, pp. 2218–2225, 2011.

[4] S. J. Devlin, R. Gnanadesikan, and J. R. Kettenring, "Robust estimation and outlier detection with correlation coefficients," Biometrika, vol. 62, no. 3, pp. 531–545, 1975.

[5] A. R. Cowan, "Nonparametric event study tests," Review of Quantitative Finance and Accounting, vol. 2, no. 4, pp. 343–358, 1992.

[6] G. J. Glasser and R. F. Winter, "Critical values of the coefficient of rank correlation for testing the hypothesis of independence," Biometrika, vol. 48, no. 3-4, pp. 444–448, 1961.

[7] P. Embrechts, F. Lindskog, and A. McNeil, "Modelling dependence with copulas and applications to risk management," in Handbook of Heavy Tailed Distributions in Finance, vol. 1, chapter 8, pp. 329–384, 2003.

[8] E. Jondeau, S. H. Poon, and M. Rockinger, Financial Modeling Under Non-Gaussian Distributions, Springer, New York, NY, USA, 2007.

[9] J. D. Fermanian and O. Scaillet, "Some statistical pitfalls in copula modeling for financial applications," in Capital Formation, Governance and Banking, p. 59, 2005.

[10] T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley-Interscience, New York, NY, USA, 2006.

[11] A. Kraskov, H. Stogbauer, and P. Grassberger, "Estimating mutual information," Physical Review E, vol. 69, no. 6, Article ID 066138, 16 pages, 2004.

[12] S. Khan, S. Bandyopadhyay, and A. R. Ganguly, "Computing mutual information based nonlinear dependence among noisy and finite geophysical time series," in Proceedings of the American Geophysical Union Fall Meeting, 2005, abstract no. NG22A-03.

[13] M. M. V. Hulle, "Edgeworth approximation of multivariate differential entropy," Neural Computation, vol. 17, no. 9, pp. 1903–1910, 2005.

[14] L. B. Almeida, "Linear and nonlinear ICA based on mutual information," in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 117–122, IEEE, 2000.

[15] A. D. Back and A. S. Weigend, "A first application of independent component analysis to extracting structure from stock returns," International Journal of Neural Systems, vol. 8, no. 4, pp. 473–484, 1997.

[16] E. Oja, K. Kiviluoto, and S. Malaroiu, "Independent component analysis for financial time series," in Proceedings of the IEEE Adaptive Systems for Signal Processing, Communications, and Control Symposium (AS-SPCC '00), pp. 111–116, IEEE, 2000.

[17] A. Hyvarinen, "Survey on independent component analysis," Neural Computing Surveys, vol. 2, no. 4, pp. 94–128, 1999.

[18] C. J. Lu, T. S. Lee, and C. C. Chiu, "Financial time series forecasting using independent component analysis and support vector regression," Decision Support Systems, vol. 47, no. 2, pp. 115–125, 2009.

[19] H. Joe, "Relative entropy measures of multivariate dependence," Journal of the American Statistical Association, vol. 84, no. 405, pp. 157–164, 1989.

[20] M. Novey and T. Adali, "Complex ICA by negentropy maximization," IEEE Transactions on Neural Networks, vol. 19, no. 4, pp. 596–609, 2008.

[21] S. Roberts and R. Everson, Independent Component Analysis: Principles and Practice, Cambridge University Press, New York, NY, USA, 2001.

[22] J. Palmer, K. Kreutz-Delgado, and S. Makeig, "Super-Gaussian mixture source model for ICA," in Independent Component Analysis and Blind Signal Separation, pp. 854–861, 2006.

[23] R. A. Choudrey and S. J. Roberts, "Variational mixture of Bayesian independent component analyzers," Neural Computation, vol. 15, no. 1, pp. 213–252, 2003.

[24] R. Everson and S. Roberts, "Independent component analysis: a flexible nonlinearity and decorrelating manifold approach," Neural Computation, vol. 11, no. 8, pp. 1957–1983, 1999.

[25] R. E. Kass, L. Tierney, and J. B. Kadane, "Laplace's method in Bayesian analysis," in Statistical Multiple Integration: Proceedings of the AMS-IMS-SIAM Joint Summer Research Conference [on Statistical Multiple Integration] Held at Humboldt University, June 17–23, 1989, vol. 115, p. 89, American Mathematical Society, 1991.

[26] W. Addison and S. Roberts, "Blind source separation with non-stationary mixing using wavelets," in Proceedings of the ICA Research Network Workshop, The University of Liverpool, 2006.

[27] N. J. Higham, "Matrix nearness problems and applications," in Applications of Matrix Theory, 1989.

[28] K. B. Datta, Matrix and Linear Algebra, PHI Learning, 2004.

[29] K. Boudt, J. Cornelissen, and C. Croux, "The Gaussian rank correlation estimator: robustness properties," Statistics and Computing, vol. 22, no. 2, pp. 471–483, 2012.

[30] D. Evans, "A computationally efficient estimator for mutual information," Proceedings of the Royal Society A, vol. 464, no. 2093, pp. 1203–1215, 2008.

[31] K. Torkkola, "On feature extraction by mutual information maximization," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '02), vol. 1, pp. I821–I824, IEEE, May 2002.

[32] K. Nagarajan, B. Holland, C. Slatton, and A. D. George, "Scalable and portable architecture for probability density function estimation on FPGAs," in Proceedings of the 16th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM '08), pp. 302–303, IEEE Computer Society, April 2008.

[33] C. G. Bowsher, "Modelling security market events in continuous time: intensity based, multivariate point process models," Journal of Econometrics, vol. 141, no. 2, pp. 876–912, 2007.

[34] B. Peiers, "Informed traders, intervention, and price leadership: a deeper view of the microstructure of the foreign exchange market," Journal of Finance, vol. 52, no. 4, pp. 1589–1614, 1997.

[35] A. Dionisio, R. Menezes, and D. A. Mendes, "Mutual information: a measure of dependency for nonlinear time series," Physica A, vol. 344, no. 1-2, pp. 326–329, 2004.

[36] J. Y. Campbell, A. W. Lo, A. C. MacKinlay, and R. F. Whitelaw, "The econometrics of financial markets," Macroeconomic Dynamics, vol. 2, no. 4, pp. 559–562, 1998.

[37] G. Koutmos, C. Negakis, and P. Theodossiou, "Stochastic behaviour of the Athens stock exchange," Applied Financial Economics, vol. 3, no. 2, pp. 119–126, 1993.

[38] D. M. Guillaume, M. M. Dacorogna, R. R. Dave, U. A. Muller, R. B. Olsen, and O. V. Pictet, "From the bird's eye to the microscope: a survey of new stylized facts of the intra-daily foreign exchange markets," Finance and Stochastics, vol. 1, no. 2, pp. 95–129, 1997.

[39] R. Cheng, "Using Pearson type IV and other Cinderella distributions in simulation," in Proceedings of the Winter Simulation Conference (WSC '11), pp. 457–468, IEEE, 2011.

[40] Y. Nagahara, "The PDF and CF of Pearson type IV distributions and the ML estimation of the parameters," Statistics and Probability Letters, vol. 43, no. 3, pp. 251–264, 1999.

[41] A. Shephard, Pearson IV, Institute for Fiscal Studies, University College London, 2008.

[42] S. Stavroyiannis, I. Makris, V. Nikolaidis, and L. Zarangas, "Econometric modeling and value-at-risk using the Pearson type IV distribution," International Review of Financial Analysis, vol. 22, pp. 10–17, 2012.

[43] M. Kendall, A. Stuart, and J. K. Ord, Kendall's Advanced Theory of Statistics, Charles Griffin, 1987.

[44] R. Willink, "A closed-form expression for the Pearson type IV distribution function," Australian and New Zealand Journal of Statistics, vol. 50, no. 2, pp. 199–205, 2008.

[45] A. Venelli, "Efficient entropy estimation for mutual information analysis using B-splines," in Information Security Theory and Practices: Security and Privacy of Pervasive Systems and Smart Devices, pp. 17–30, 2010.

[46] C. G. Park and D. W. Shin, "An algorithm for generating correlated random variables in a class of infinitely divisible distributions," Journal of Statistical Computation and Simulation, vol. 61, no. 1-2, pp. 127–139, 1998.

[47] R. L. Iman and W. J. Conover, "A distribution-free approach to inducing rank correlation among input variables," Communications in Statistics-Simulation and Computation, vol. 11, no. 3, pp. 311–334, 1982.

[48] N. Shah and S. Roberts, "Hidden Markov independent component analysis as a measure of coupling in multivariate financial time series," in Proceedings of the ICA Research Network International Workshop, Liverpool, UK, 2008.

[49] A. Rossi and G. M. Gallo, "Volatility estimation via hidden Markov models," Journal of Empirical Finance, vol. 13, no. 2, pp. 203–230, 2006.

[50] J. Crotty, "Structural causes of the global financial crisis: a critical assessment of the 'new financial architecture'," Cambridge Journal of Economics, vol. 33, no. 4, pp. 563–580, 2009.

[51] J. A. Murphy, "An analysis of the financial crisis of 2008: causes and solutions," Social Science Research Network, 2008.

[52] M. Pojarliev and R. M. Levich, "Detecting crowded trades in currency funds," Financial Analysts Journal, vol. 67, no. 1, pp. 26–39, 2011.

[53] L. Sandoval and I. D. P. Franca, "Correlation of financial markets in times of crisis," Physica A, vol. 391, no. 1, pp. 187–208, 2012.



Page 13: Research Article Dynamically Measuring Statistical ...downloads.hindawi.com/archive/2013/434832.pdf · Research Article Dynamically Measuring Statistical Dependencies in Multivariate

ISRN Signal Processing 13

[14] L B Almeida ldquoLinear and nonlinear ICA based on mutualinformationrdquo in Proceedings of the IEEE Adaptive Systems forSignal Processing Communications and Control Symposium(AS-SPCC rsquo00) pp 117ndash122 IEEE 2000

[15] A D Back and A S Weigend ldquoA first application of inde-pendent component analysis to extracting structure from stockreturnsrdquo International Journal of Neural Systems vol 8 no 4pp 473ndash484 1997

[16] E Oja K Kiviluoto and S Malaroiu ldquoIndependent componentanalysis for financial time seriesrdquo in Proceedings of the IEEEAdaptive Systems for Signal Processing Communications andControl Symposium (AS-SPCC rsquo00) pp 111ndash116 IEEE 2000

[17] A Hyvarinen ldquoSurvey on independent component analysisrdquoNeural Computing Surveys vol 2 no 4 pp 94ndash128 1999

[18] C J Lu T S Lee and C C Chiu ldquoFinancial time seriesforecasting using independent component analysis and supportvector regressionrdquo Decision Support Systems vol 47 no 2 pp115ndash125 2009

[19] H Joe ldquoRelative entropymeasures of multivariate dependencerdquoJournal of the American Statistical Association vol 84 no 405pp 157ndash164 1989

[20] M Novey and T Adali ldquoComplex ICA by negentropy maxi-mizationrdquo IEEE Transactions on Neural Networks vol 19 no 4pp 596ndash609 2008

[21] S Roberts and R Everson Independent Component AnalysisPrinciples and Practice Cambridge University Press New YorkNY USA 2001

[22] J Palmer K Kreutz-Delgado and S Makeig ldquoSuper-Gaussianmixture source model for ICArdquo in Independent ComponentAnalysis and Blind Signal Separation pp 854ndash861 2006

[23] R A Choudrey and S J Roberts ldquoVariational mixture ofBayesian independent component analyzersrdquoNeural Computa-tion vol 15 no 1 pp 213ndash252 2003

[24] R Everson and S Roberts ldquoIndependent component analysisa flexible nonlinearity and decorrelating manifold approachrdquoNeural Computation vol 11 no 8 pp 1957ndash1983 1999

[25] R E Kass L Tierney and J B Kadane ldquoLaplacersquos method inBayesian analysisrdquo in Statistical Multiple Integration Proceed-ings of the AMS-IMS-SIAM Joint Summer Research Conference[on StatisticalMultiple Integration]Held at Humboldt UniversityJune 17ndash23 1989 vol 115 p 89 AmericanMathematical Society1991

[26] W Addison and S Roberts ldquoBlind source separation with non-stationary mixing using waveletsrdquo in Proceedings of the ICAResearch NetworkWorkshop The University of Liverpool 2006

[27] N J Higham ldquoMatrix nearness problems and applicationsrdquo inApplications of Matrix Theory 1989

[28] K B DattaMatrix and Linear Algebra PHI Learning 2004[29] K Boudt J Cornelissen and C Croux ldquoThe Gaussian rank

correlation estimator robustness propertiesrdquo Statistics andComputing vol 22 no 2 pp 471ndash483 2012

[30] D Evans ldquoA computationally efficient estimator for mutualinformationrdquo Proceedings of the Royal Society A vol 464 no2093 pp 1203ndash1215 2008

[31] K. Torkkola, “On feature extraction by mutual information maximization,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP ’02), vol. 1, pp. I821–I824, IEEE, May 2002.

[32] K. Nagarajan, B. Holland, C. Slatton, and A. D. George, “Scalable and portable architecture for probability density function estimation on FPGAs,” in Proceedings of the 16th IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM ’08), pp. 302–303, IEEE Computer Society, April 2008.

[33] C. G. Bowsher, “Modelling security market events in continuous time: intensity based, multivariate point process models,” Journal of Econometrics, vol. 141, no. 2, pp. 876–912, 2007.

[34] B. Peiers, “Informed traders, intervention, and price leadership: a deeper view of the microstructure of the foreign exchange market,” Journal of Finance, vol. 52, no. 4, pp. 1589–1614, 1997.

[35] A. Dionisio, R. Menezes, and D. A. Mendes, “Mutual information: a measure of dependency for nonlinear time series,” Physica A, vol. 344, no. 1-2, pp. 326–329, 2004.

[36] J. Y. Campbell, A. W. Lo, A. C. MacKinlay, and R. F. Whitelaw, “The econometrics of financial markets,” Macroeconomic Dynamics, vol. 2, no. 4, pp. 559–562, 1998.

[37] G. Koutmos, C. Negakis, and P. Theodossiou, “Stochastic behaviour of the Athens stock exchange,” Applied Financial Economics, vol. 3, no. 2, pp. 119–126, 1993.

[38] D. M. Guillaume, M. M. Dacorogna, R. R. Dave, U. A. Muller, R. B. Olsen, and O. V. Pictet, “From the bird’s eye to the microscope: a survey of new stylized facts of the intra-daily foreign exchange markets,” Finance and Stochastics, vol. 1, no. 2, pp. 95–129, 1997.

[39] R. Cheng, “Using Pearson type IV and other Cinderella distributions in simulation,” in Proceedings of the Winter Simulation Conference (WSC ’11), pp. 457–468, IEEE, 2011.

[40] Y. Nagahara, “The PDF and CF of Pearson type IV distributions and the ML estimation of the parameters,” Statistics and Probability Letters, vol. 43, no. 3, pp. 251–264, 1999.

[41] A. Shephard, Pearson IV, Institute for Fiscal Studies, University College London, 2008.

[42] S. Stavroyiannis, I. Makris, V. Nikolaidis, and L. Zarangas, “Econometric modeling and value-at-risk using the Pearson type IV distribution,” International Review of Financial Analysis, vol. 22, pp. 10–17, 2012.

[43] M. Kendall, A. Stuart, and J. K. Ord, Kendall’s Advanced Theory of Statistics, Charles Griffin, 1987.

[44] R. Willink, “A closed-form expression for the Pearson type IV distribution function,” Australian and New Zealand Journal of Statistics, vol. 50, no. 2, pp. 199–205, 2008.

[45] A. Venelli, “Efficient entropy estimation for mutual information analysis using B-splines,” in Information Security Theory and Practices. Security and Privacy of Pervasive Systems and Smart Devices, pp. 17–30, 2010.

[46] C. G. Park and D. W. Shin, “An algorithm for generating correlated random variables in a class of infinitely divisible distributions,” Journal of Statistical Computation and Simulation, vol. 61, no. 1-2, pp. 127–139, 1998.

[47] R. L. Iman and W. J. Conover, “A distribution-free approach to inducing rank correlation among input variables,” Communications in Statistics-Simulation and Computation, vol. 11, no. 3, pp. 311–334, 1982.

[48] N. Shah and S. Roberts, “Hidden Markov independent component analysis as a measure of coupling in multivariate financial time series,” in Proceedings of the ICA Research Network International Workshop, Liverpool, UK, 2008.

[49] A. Rossi and G. M. Gallo, “Volatility estimation via hidden Markov models,” Journal of Empirical Finance, vol. 13, no. 2, pp. 203–230, 2006.

[50] J. Crotty, “Structural causes of the global financial crisis: a critical assessment of the ‘new financial architecture’,” Cambridge Journal of Economics, vol. 33, no. 4, pp. 563–580, 2009.


[51] J. A. Murphy, “An analysis of the financial crisis of 2008: causes and solutions,” Social Science Research Network, 2008.

[52] M. Pojarliev and R. M. Levich, “Detecting crowded trades in currency funds,” Financial Analysts Journal, vol. 67, no. 1, pp. 26–39, 2011.

[53] L. Sandoval and I. D. P. Franca, “Correlation of financial markets in times of crisis,” Physica A, vol. 391, no. 1, pp. 187–208, 2012.
