Generalized diagonal band copula with two-sided power densities
Uncertainty Analysis Applications of GDB Copula with TSP Generating Densities
Leon Adams
May 10, 2011
Dissertation Defense
Dissertation Committee:
Johan R. van Dorp, Dissertation Director
Enrique Campos-Nanez, Committee Member
Jonathan Pierce Deason, Committee Member
Robert A. Roncace, Committee Member
Thomas Andrew Mazzuchi, Committee Member
Overview
Introduction
• Dissertation background
• Preview dissertation contributions
Multivariate Models
• Details of the single common risk factor multivariate model
• Details of the multiple common risk factor multivariate model
Application Examples
• Hydrological frequency analysis
• Stock returns
Concluding remarks
• Review model success
• Future work
Introduction
Definitions – Sklar’s Theorem
Theorem (Sklar). Let H be a joint CDF with continuous margins F and G. Then there exists a unique copula C such that:
H(x, y) = C{F(x), G(y)}
Corollary. Let H be a joint CDF with continuous margins F and G and copula C. Then for u, v ∈ [0, 1]:
C(u, v) = H{F^{-1}(u), G^{-1}(v)},
where F^{-1} and G^{-1} are the quantile functions of F and G.
We also have the density representation of copulas:
h(x, y) = c{F (x), G(y)} · f(x)g(y)
where h, c, f, and g are densities.
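A minimal sketch of the density representation. The helper name `joint_density`, the unit exponential margins, and the independence copula are illustrative assumptions, not choices made in the slides.

```python
import math


def joint_density(x, y, c, F, f, G, g):
    """Joint density h(x, y) = c{F(x), G(y)} * f(x) * g(y)."""
    return c(F(x), G(y)) * f(x) * g(y)


# Illustrative choices (assumptions): independence copula, unit exponential margins.
def independence_copula(u, v):
    return 1.0  # c(u, v) = 1 on the unit square


def exp_cdf(x):
    return 1.0 - math.exp(-x)


def exp_pdf(x):
    return math.exp(-x)


h = joint_density(0.5, 1.5, independence_copula, exp_cdf, exp_pdf, exp_cdf, exp_pdf)
```

With the independence copula the joint density reduces to the product of the marginal densities. Any other copula density can be passed in as `c`.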
Graphical representation of the diagonal band copula
Support region and copula PDF representation of the original diagonal band copula [4].
There are two extensions of the diagonal band copula, both using generating densities:
• Ferguson (1995) and Bojarski (2002)
• Lewandowski (2005) showed that the Ferguson and Bojarski extensions are equivalent
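\[ c(u, v) = \frac{1}{2}\bigl\{ g(|u - v|) + g(1 - |1 - u - v|) \bigr\}, \qquad 0 < u, v < 1 \quad [5] \]
\[ g(z) = \begin{cases} g_\theta(-z) + g_\theta(z) & \text{for } z \in [0, 1 - \theta], \\ 0 & \text{elsewhere} \end{cases} \quad [10] \]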
GDB TSP copula [8]
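\[
c\{x, y \mid p(\cdot \mid \Psi)\} = \frac{1}{2} \times
\begin{cases}
p(1 - x - y \mid \Psi) + p(1 + x - y \mid \Psi), & (x, y) \in A_1, \\
p(1 - x - y \mid \Psi) + p(1 - x + y \mid \Psi), & (x, y) \in A_2, \\
p(x + y - 1 \mid \Psi) + p(1 + x - y \mid \Psi), & (x, y) \in A_3, \\
p(x + y - 1 \mid \Psi) + p(1 - x + y \mid \Psi), & (x, y) \in A_4.
\end{cases}
\]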
If we consider the TSP generating density (Kotz, Van Dorp 2010)
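\[ p(z \mid \Psi) = n z^{n-1} \;\rightarrow\; P(z \mid \Psi) = z^{n} \]
\[
c\{x, y \mid p(\cdot \mid \Psi)\} = \frac{1}{2} \times
\begin{cases}
n(1 - x - y)^{n-1} + n(1 + x - y)^{n-1}, & (x, y) \in A_1, \\
n(1 - x - y)^{n-1} + n(1 - x + y)^{n-1}, & (x, y) \in A_2, \\
n(x + y - 1)^{n-1} + n(1 + x - y)^{n-1}, & (x, y) \in A_3, \\
n(x + y - 1)^{n-1} + n(1 - x + y)^{n-1}, & (x, y) \in A_4.
\end{cases}
\]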
• Nomenclature: we call this the single-parameter GDB TSP copula
• This single-parameter copula is used in the current research model
• Could be extended to other copula models
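The four-case density above can be sketched in a few lines. This assumes that A1 to A4 are the four triangles cut out of the unit square by its two diagonals (the slides do not define them here). Under that assumption the arguments of p are always |1 − x − y| and 1 − |x − y|, so the cases collapse into one expression. The function names are hypothetical.

```python
def power_density(z, n):
    """Generating density p(z) = n * z**(n - 1) on [0, 1]."""
    return n * z ** (n - 1)


def gdb_tsp_density(x, y, n):
    """Single-parameter GDB TSP copula density at (x, y) in the open unit square.

    Assumes A1..A4 are the triangles bounded by the lines y = x and y = 1 - x,
    so each case reduces to p(|1 - x - y|) + p(1 - |x - y|).
    """
    return 0.5 * (power_density(abs(1.0 - x - y), n)
                  + power_density(1.0 - abs(x - y), n))


def marginal_mass(x, n, steps=20000):
    """Midpoint-rule integral of the density over y for fixed x.

    A copula has uniform margins, so the result should be close to 1.
    """
    h = 1.0 / steps
    return sum(gdb_tsp_density(x, (j + 0.5) * h, n) for j in range(steps)) * h
```

For n = 1 the generating density is uniform and the copula density is identically 1, which is the independence case.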
Preview research contributions
Contributions:
• Specification of estimation procedure for a multivariate copula framework
  • Bivariate model with single common risk factor
  • Multivariate model with single common risk factor
  • Multivariate model with multiple common risk factors
• Derivation of novel relationships between model parameters and the Spearman’s ρ and Blomqvist’s β dependency measures
• Specification of efficient joint sampling procedures
Motivation:
• Literature review reveals a focus on bivariate copulas
• Applications tend to use a select few families
  • Elliptical: Gaussian, Student t
  • Archimedean: Gumbel, Clayton, Frank
• Increase availability of multivariate copula models
Multivariate Models
Single common risk factor multivariate GDB TSP copula - Parameter estimation
Copula model assumptions
• Single common risk factor
• Uniformly distributed observations & common risk factor
• Conditional independence of observed random variables given U
• Dependence on common risk factor defined by GDB TSP copula
• Model parameters scale with dimension of the problem
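\[
\rho_{ij} = 1 - \frac{12}{(n_i + 2)(n_i + 3)} - \frac{12}{(n_j + 2)(n_j + 3)}
+ \frac{24}{(n_i + 1)(n_j + 1)(n_i + n_j + 3)}
- \frac{24\,\Gamma(n_i + 2)\,\Gamma(n_j + 2)}{(n_i + 1)(n_j + 1)\,\Gamma(n_i + n_j + 4)}
\]
\[ \min \, (\rho_n - \hat{\rho})^2 \quad \text{s.t. } n > 0 \]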
Estimate model parameters
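A minimal sketch of the closed-form Spearman relation and a one-parameter fit. Two simplifying assumptions are made here and are not taken from the slides: the fit sets n_i = n_j = n with n ≥ 1, and it uses bisection on ρ(n, n) = ρ̂ in place of a general least-squares optimizer. The numerical check uses the conditional mean v + {(1 − v)^(n+1) − v^(n+1)}/(n + 1) that appears in the multiple-factor formulas of this deck.

```python
import math


def spearman_rho(ni, nj):
    """Closed-form Spearman's rho between X_i and X_j (single common risk factor)."""
    a = 12.0 / ((ni + 2) * (ni + 3))
    b = 12.0 / ((nj + 2) * (nj + 3))
    c = 24.0 / ((ni + 1) * (nj + 1) * (ni + nj + 3))
    d = (24.0 * math.gamma(ni + 2) * math.gamma(nj + 2)
         / ((ni + 1) * (nj + 1) * math.gamma(ni + nj + 4)))
    return 1.0 - a - b + c - d


def conditional_mean(v, n):
    """E[X | V = v] under the single-parameter GDB TSP copula."""
    return v + ((1.0 - v) ** (n + 1) - v ** (n + 1)) / (n + 1)


def numeric_rho(ni, nj, steps=20000):
    """12 * E[X_i X_j] - 3, with the expectation taken by a midpoint rule over U."""
    h = 1.0 / steps
    total = sum(conditional_mean((s + 0.5) * h, ni) * conditional_mean((s + 0.5) * h, nj)
                for s in range(steps))
    return 12.0 * total * h - 3.0


def fit_common_n(rho_hat, lo=1.0, hi=50.0, tol=1e-10):
    """Bisection for n >= 1 with rho(n, n) = rho_hat (assumes n_i = n_j = n)."""
    if not spearman_rho(lo, lo) <= rho_hat < spearman_rho(hi, hi):
        raise ValueError("rho_hat is outside the range reachable on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if spearman_rho(mid, mid) < rho_hat:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With distinct n_i and n_j, or with several pairwise correlations to match at once, a general constrained minimizer would replace the bisection step.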
Multiple common risk factor multivariate GDB TSP copula - Parameter estimation
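\[
G(Y) = \operatorname{vol}\bigl(H(\gamma) \cap C\bigr)
= \sum_{v \in C} (\operatorname{sign} v)\, \frac{(\gamma - \alpha \cdot v)_{+}^{n}}{n!\, \prod_{i=1}^{n} \alpha_i}
\]
\[
v_i = G_i\{y_i(u \mid W, n)\} = \Pr(Y_i \le y_i), \qquad
Y_i = \sum_{l=1}^{k} w_{il} U_l, \qquad
y_i = \sum_{l=1}^{k} w_{il} u_l
\]
\[
\rho^{\omega}_{n} \leftarrow \int_0^1 \cdots \int_0^1
\left\{ v_i + \frac{(1 - v_i)^{n_i + 1}}{n_i + 1} - \frac{v_i^{n_i + 1}}{n_i + 1} \right\}
\times
\left\{ v_j + \frac{(1 - v_j)^{n_j + 1}}{n_j + 1} - \frac{v_j^{n_j + 1}}{n_j + 1} \right\} du
\]
\[
\min \, (\rho^{\omega}_{n} - \hat{\rho})^2 \quad \text{s.t. } n > 0; \quad \sum_{i,j=1}^{k} \omega_{i,j} = 1; \quad \omega_{i,j} \ge 0
\]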
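The volume formula for G can be sketched as follows. The reading assumed here is that C is the unit cube, the sum runs over its vertices, sign v is (−1) raised to the number of ones in v, and the exponent n in that formula is the number of terms in the weighted sum (k below), not a copula parameter. The function name is hypothetical.

```python
import math
from itertools import product


def weighted_uniform_sum_cdf(y, weights):
    """Pr(sum_l w_l * U_l <= y) for independent U_l ~ Uniform(0, 1) and w_l > 0.

    Inclusion-exclusion over the 2**k vertices of the unit cube.
    """
    active = [w for w in weights if w > 0.0]  # zero weights do not affect the sum
    k = len(active)
    total = 0.0
    for vertex in product((0, 1), repeat=k):
        sign = -1.0 if sum(vertex) % 2 else 1.0
        slack = y - sum(w * b for w, b in zip(active, vertex))
        if slack > 0.0:  # the (.)_+ truncation
            total += sign * slack ** k
    value = total / (math.factorial(k) * math.prod(active))
    return min(1.0, max(0.0, value))  # guard against floating-point drift
```

For two equal weights of 0.5 the sum is the mean of two uniforms, whose triangular CDF gives a quick check: G(0.25) = 0.125 and G(0.75) = 0.875.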
Multiple common risk factor multivariate GDB TSP copula
Extension to multiple common risk factor multivariate model
• Main addition is the incorporation of the GDB-TSP copula in the common risk factors model
• With multiple common risk factors, model flexibility and complexity increase
  • Must now account for the Y_i ↔ G(Y_i) transformation
  • Must relate model parameters to a global dependence measure
• Increases the challenge of the optimization procedure in the parameter estimation process
• Two classes of model parameters
  • Copula parameters constrained by n_i > 0
  • Weights of the common risk factors with the following constraints:
\[ 0 \le w_{ji} \le 1, \qquad \sum_{i=1}^{m} w_{ji} = 1 \quad \forall j = 1, \dots, k \]
Multivariate sampling algorithm
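One way to draw from the single common risk factor model is sketched below. It is an illustration and not necessarily the procedure used in the dissertation. It relies on a reflection construction: draw Z from the generating distribution P(z) = z^n, step a distance 1 − Z up or down from the common factor u with equal probability, and reflect back into [0, 1]. That construction reproduces the density ½{p(|1 − x − u|) + p(1 − |x − u|)}. The function names are hypothetical.

```python
import math
import random


def sample_given_factor(u, n, rng):
    """Draw X | U = u from the single-parameter GDB TSP copula by reflection."""
    z = rng.random() ** (1.0 / n)  # inverse-CDF draw from P(z) = z**n
    w = 1.0 - z                    # distance from the common factor
    x = u + w if rng.random() < 0.5 else u - w
    if x < 0.0:                    # reflect at the lower edge
        x = -x
    elif x > 1.0:                  # reflect at the upper edge
        x = 2.0 - x
    return x


def sample_single_factor(ns, size, seed=0):
    """Rows of (X_1, ..., X_m), conditionally independent given one uniform U."""
    rng = random.Random(seed)
    rows = []
    for _ in range(size):
        u = rng.random()
        rows.append([sample_given_factor(u, n, rng) for n in ns])
    return rows


def pearson(xs, ys):
    """Sample Pearson correlation; on uniform margins it estimates Spearman's rho."""
    m = len(xs)
    mx, my = sum(xs) / m, sum(ys) / m
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)
```

With n_i = n_j = 3 the sample correlation should settle near the closed-form value of about 0.364 as the sample grows.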
Application Examples
• Hydrological frequency analysis
  • Objective: Study relationship between rainfall duration and amounts
  • Demonstrated importance of correctly modeling dependence structure
  • GDB-TSP model outperforms traditional distribution approach
  • Differences in model outputs have practical implications for flood mitigation strategies
• Flood example
  • Objective: Study relationship between locations upstream and downstream
  • Demonstrated importance of correctly modeling marginals
  • Gamma distribution selected as the univariate model for both marginals
• Sediment composition
  • Objective: Spatial study of relationship between composition of Cerium and Scandium
  • Demonstrated importance of correctly modeling marginals
  • GEV distribution selected as univariate model for Cerium marginal
  • Logistic distribution selected as univariate model for Scandium marginal
• Salmonid risk assessment
  • Objective: Monte Carlo salmonid risk assessment (d_s/D_g, h/L_1, L_2/L_1)
  • Demonstrated importance of correctly modeling dependence structure
  • In parts of the modeled space, the correlated model gave a higher estimate of the risk in achieving target survival rates
• Stock returns analysis
  • Objective: Study effectiveness of research model on 7-dim model. Data matrix ↔ Sample matrix
  • Use law of parsimony to select 3 common risk factors model over 2 common risk factors model
  • Demonstrated high degree of fidelity of Sample Correlation matrix with Data Correlation matrix
Hydrological frequency analysis
The task is to better understand the behavior of extreme rainfall events:
• Magnitude
• Duration
• Frequency
• Taking a distributional approach
Motivation:
• Potential for great loss
• Inform mitigation strategies
• Insurance underwriting
• Input into rainfall runoff models
An aerial view of the submerged runway at Rockhampton airport in Australia. Getty Images / Jonathan Wood
Source: http://blogs.sacbee.com/photos/2011/01/new-storms-soak-flood-weary-au.html
Korean rainfall example
Data:
• Seoul, Korea dataset [9]
• Bivariate dataset of rainfall maximums of amount and duration
• Study period from 1965-2005
Approach:
• Comparative investigation between the GDB-TSP model and the Gumbel mixed model
• Maintain assumption of Gumbel marginals
• Relies on the calculation of return periods
• Examine differences in model predictions of return periods
Estimated parameters for the GDB-TSP model with Gumbel marginals
Marginal   Mean (µ)   Std. dev (σ)   Scale (λ)   Location (u)   Correlation (ρ)   GDB-TSP (n)
Duration   56.25      30.55          23.82       42.5           0.55              2.689
Amount     225.23     111.9          87.2        174.9
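The slides do not state how the table entries were obtained. The sketch below rests on two assumptions that reproduce them to rounding: the Gumbel scale and location come from the method of moments, and the copula parameter comes from the bivariate relation ρ = 1 − 12/{(n + 2)(n + 3)}, which follows from integrating the conditional mean of the single-parameter GDB TSP copula. The function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def gumbel_from_moments(mean, std):
    """Method-of-moments Gumbel parameters (an assumption about the fitting method)."""
    scale = std * math.sqrt(6.0) / math.pi
    location = mean - EULER_GAMMA * scale
    return scale, location


def n_from_rho(rho):
    """Positive root n of rho = 1 - 12 / ((n + 2) * (n + 3)) (assumed bivariate relation)."""
    c = 12.0 / (1.0 - rho)  # (n + 2)(n + 3) = c, i.e. n**2 + 5n + 6 - c = 0
    return (-5.0 + math.sqrt(1.0 + 4.0 * c)) / 2.0


duration_scale, duration_location = gumbel_from_moments(56.25, 30.55)
amount_scale, amount_location = gumbel_from_moments(225.23, 111.9)
n_hat = n_from_rho(0.55)
```

The small gap between `n_hat` and the tabulated 2.689 is consistent with ρ being rounded to two decimals in the table.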
Returns definitions
• T(x, y) is the joint return period of amount and duration.
• T(x|y) is the conditional return period of amount given duration.
• T′(x, y) is the non-standard joint return period of amount and duration.
• T′(x|y) is the non-standard conditional return period of amount given duration.
Bivariate return periods
Return Period       Event
T_{X,Y}(x, y)       {(X > x or Y > y) or (X > x & Y > y)}
T′_{X,Y}(x, y)      {X > x and Y > y}
T_{X|Y}(x|y)        {X > x given Y = y}
T′_{X|Y}(x|y)       {X > x given Y ≤ y}
\[ T(x, y) = \frac{1}{P_E(x, y)}, \qquad \text{where } P_E(x, y) = 1 - F(x, y) \]
Candidate models
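Gumbel mixed model:
\[ F(x, y) = F_x(x)\, F_y(y) \times \exp\left\{ -\theta \left[ \frac{1}{\ln F_x(x)} + \frac{1}{\ln F_y(y)} \right]^{-1} \right\} \]
GDB-TSP model:
\[ F(x, y) = C\{F_x(x), F_y(y)\}, \qquad \text{where } C \text{ is the GDB-TSP copula} \]
For both models the marginals are assumed to be Gumbel:
\[ F_z(z) = \exp\left[ -\exp\left( -\frac{z - u_z}{\lambda_z} \right) \right] \]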
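A minimal sketch of the joint return period T(x, y) = 1/{1 − F(x, y)} under the Gumbel mixed model with Gumbel marginals. The location and scale values come from the parameter table in this deck. The value of θ is a hypothetical placeholder, since the slides do not report it, and the function names are assumptions.

```python
import math


def gumbel_cdf(z, location, scale):
    """Gumbel marginal F_z(z) = exp[-exp(-(z - u_z) / lambda_z)]."""
    return math.exp(-math.exp(-(z - location) / scale))


def gumbel_mixed_cdf(x, y, theta, margin_x, margin_y):
    """Gumbel mixed model joint CDF; each margin is a (location, scale) pair.

    Valid while both marginal CDF values lie strictly between 0 and 1.
    """
    fx = gumbel_cdf(x, *margin_x)
    fy = gumbel_cdf(y, *margin_y)
    dependence = -theta / (1.0 / math.log(fx) + 1.0 / math.log(fy))
    return fx * fy * math.exp(dependence)


def joint_return_period(x, y, theta, margin_x, margin_y):
    """T(x, y) = 1 / (1 - F(x, y)), the return period of {X > x or Y > y}."""
    return 1.0 / (1.0 - gumbel_mixed_cdf(x, y, theta, margin_x, margin_y))


AMOUNT = (174.9, 87.2)     # (location, scale) from the parameter table
DURATION = (42.5, 23.82)   # (location, scale) from the parameter table
THETA = 0.5                # hypothetical value, not reported in the slides

t_joint = joint_return_period(300.0, 60.0, THETA, AMOUNT, DURATION)
```

Setting θ = 0 recovers independent marginals. Swapping `gumbel_mixed_cdf` for a GDB-TSP copula CDF applied to the same marginals would give the competing model.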
Distribution and density for GDB-TSP model
Goodness of fit comparison
Goodness of fit details
• Distance from empirical CDF [6]
  • \( S_n = \sum_{i=1}^{n} \{F_n - F_{\theta_n}\}^2 \)
  • \( T_n = \sup \sqrt{n}\, |F_n - F_{\theta_n}| \)
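The two distance statistics can be sketched as below, assuming they are evaluated at the observed data points, so the supremum in T_n is approximated by a maximum over the sample. The function names and the tiny data set are illustrative only.

```python
import math


def empirical_joint_cdf(data, x, y):
    """F_n(x, y): fraction of observations with X_i <= x and Y_i <= y."""
    return sum(1 for xi, yi in data if xi <= x and yi <= y) / len(data)


def goodness_of_fit(data, model_cdf):
    """Return (S_n, T_n) with both statistics evaluated at the sample points."""
    n = len(data)
    diffs = [empirical_joint_cdf(data, x, y) - model_cdf(x, y) for x, y in data]
    s_n = sum(d * d for d in diffs)                  # sum of squared differences
    t_n = math.sqrt(n) * max(abs(d) for d in diffs)  # scaled maximum difference
    return s_n, t_n


# Illustrative data on the unit square, tested against an independence model.
sample = [(0.25, 0.25), (0.5, 0.75), (0.75, 0.5), (1.0, 1.0)]
s_n, t_n = goodness_of_fit(sample, lambda x, y: x * y)
```

Smaller values of either statistic indicate a fitted CDF closer to the empirical one, which is how the two candidate models can be ranked.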
Model predictions comparison for the T(x,y) study
[Figure: return period (yr) versus amount (mm) for duration D = 12 hrs]
Image: http://www.abbey-associates.com/splash-splash/storm_water_management.html
Storm water runoff system
Findings of the comparative study
• Study compared the GDB-TSP model to the Gumbel mixed model
• Based on goodness-of-fit results, we select the GDB-TSP model
• For T(x, y) the comparison found:
  • Good agreement in the low amount-duration regime
  • GDB-TSP model predicted shorter joint return periods elsewhere
• For T′(x, y) the comparison found:
  • Good agreement in most of the modeled space
  • GDB-TSP model predicted smaller rainfall amounts in the higher duration events
• For both T(x|y) and T′(x|y) the comparison found:
  • GDB-TSP model predicted smaller rainfall amounts in the shorter duration events
  • GDB-TSP model predicted larger rainfall amounts in the higher duration events
Higher dimensional example: stock returns
• Data
  • Weekly returns: January 1st 1990 to January 3rd 2011, \( R_t = \frac{P_t - P_{t-1}}{P_{t-1}} \)
  • Stocks: XOM APA CVX SLB SU IMO NBL
  • Indices: NDX DJA GSPC
• Goal
  • Reproduce data correlation matrix
  • Simulate from resulting distribution
  • Data correlation ↔ fit correlation ↔ sampled correlation
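The weekly return definition R_t = (P_t − P_{t−1})/P_{t−1} is simple to compute from a price series. The function name and the prices below are made up for illustration.

```python
def simple_returns(prices):
    """R_t = (P_t - P_{t-1}) / P_{t-1} for consecutive prices."""
    return [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]


# Hypothetical weekly closing prices, for illustration only.
weekly_returns = simple_returns([100.0, 110.0, 99.0])
```

Applying this per ticker gives the data matrix whose correlation matrix the model is asked to reproduce.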
Stocks estimated parameters
Objective function = 0.00036 (figure panels: Parameter Vector, Weight Vector)
Objective function = 0.035 (figure panels: Parameter Vector, Weight Vector)
Correlation matrix comparison
[Figure: three panels (Data Correlation Matrix, Fitted Correlation Matrix, Sampled Correlation Matrix), each over XOM, APA, CVX, SLB, SU, NDX, DJA]
Canonical Correlation
[Figure: canonical correlation diagram. Labels: Y, X; XOM, APA, CVX, SLB, SU, IMO, NBL; NDX, DJA, GSPC; COMP1, COMP2, COMP3; U1, U2, U3; Linear; Model 1, Model 2, Model 3]
           R²_M      R1      R2      R3
Model 1    1.000     1.000   1.000   1.000
Model 2    6.56e−6   0.550   0.097   0.048
Model 3    0.232     0.950   0.774   0.655
\[ R^2_M = \left| S_{yy}^{-1} S_{yx} S_{xx}^{-1} S_{xy} \right| = \prod_{i=1}^{s} r_i^2 \]
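The product identity can be checked against the tabulated values, assuming the columns R1 to R3 are the canonical correlations r_i (the slides do not label them explicitly). The dictionary layout and function name are hypothetical.

```python
import math

# (reported R^2_M, (R1, R2, R3)) as tabulated in the slides.
TABLE = {
    "Model 1": (1.000, (1.000, 1.000, 1.000)),
    "Model 2": (6.56e-6, (0.550, 0.097, 0.048)),
    "Model 3": (0.232, (0.950, 0.774, 0.655)),
}


def product_of_squares(correlations):
    """R^2_M as the product of squared canonical correlations."""
    return math.prod(r * r for r in correlations)


recomputed = {name: product_of_squares(rs) for name, (_, rs) in TABLE.items()}
```

The recomputed products agree with the reported R²_M column to the precision shown in the table.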
Research Summary
Research contributions
• Novel relations linking copula parameters to traditional dependence measures
• Copula models
  • A two-parameter bivariate copula model based on a common risk factor
  • A multivariate copula model based on a common risk factor
  • A multivariate copula model based on multiple common risk factors
• Estimation procedures and sampling routines
Application examples
• A flood example demonstrating improvement over traditional distribution approach
• A geochemical sediment composition example leveraging the flexibility of arbitrary marginals
• A hydrology example of return periods
• A Monte Carlo simulation for risk assessment
• A multivariate example of stock market returns
Future work
• Investigate alternatives to numerical integration
• Investigate the interpretation of the common risk factors
References
[1] D. L. Barrow and P. W. Smith. Spline notation applied to a volume problem. The American Mathematical Monthly, 86:50–51, Jan. 1979.
[2] R. W. Carter. Floods in Georgia. Geological Survey Circular, 1951. No. 100. [24.3-1].
[3] R. Dennis Cook and Mark E. Johnson. A family of distributions for modelling non-elliptically symmetric multivariate data. Journal of the Royal Statistical Society. Series B (Methodological), 43(2):210–218, 1981.
[4] Roger M. Cooke and Rudi Waij. Monte Carlo sampling for generalized knowledge dependence with application to human reliability. Risk Analysis, 6(3):335–343, 1986.
[5] T.F. Ferguson. A class of symmetric bivariate uniform distributions. Statistical Papers, 36(1):31–40, 1995.
[6] Christian Genest, Bruno Remillard, and David Beaudoin. Goodness-of-fit tests for copulas: A review and a power study. Insurance: Mathematics and Economics, 44:199–213, 2009.
[7] Christian Genest and Louis-Paul Rivest. Statistical inference procedures for bivariate Archimedean copulas. Journal of the American Statistical Association, 88(423):1034–1043, September 1993.
[8] Samuel Kotz and Johan Rene Van Dorp. Generalized diagonal band copulas with two-sided generating densities. Decision Analysis, 7(2):196–214, 2010.
[9] Chang Lee, Tae-Woong Kim, Gunhui Chung, Minha Choi, and Chulsang Yoo. Application of bivariate frequency analysis to the derivation of rainfall–frequency curves. Stochastic Environmental Research and Risk Assessment, 24:389–397, 2010. 10.1007/s00477-009-0328-9.
[10] Daniel Lewandowski. Generalized diagonal band copulas. Insurance: Mathematics and Economics, 37:49–67, 2005.
[11] Fu-Chun Wu and Yin-Phan Tsang. Second-order Monte Carlo uncertainty/variability analysis using correlated model parameters: application to salmonid embryo survival risk assessment. Ecological Modelling, 177:393–414, 2004.
GDB-TSP Canonical Correlation
[Figure: correlation heat map with panels X correlation, Y correlation, and Cross-correlation; annotation "Zero covariance"; color scale from −1.0 to 1.0]
• Reminiscent of factor rotation
  • Independent X’s
  • Significant dependence Y ↔ X
• Structural differences
  • Factor rotation is a linear model
  • GDB-TSP model provides full distribution approach
Multiple common risk factor multivariate GDB TSP copula
Model correlation matrix detail calculations
Dependence Parameters: W : k × m-matrix, n : m-vector
\[ k = 3,\; m = 7 \;\Rightarrow\; 21 \text{ dependence parameters as opposed to } \binom{7}{2} = 21 \]
\[ k = 3,\; m = 10 \;\Rightarrow\; 30 \text{ dependence parameters as opposed to } \binom{10}{2} = 45 \]
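The parameter counts above can be reproduced with a short calculation. The count k·m is one reading that matches the slide totals: each of the m variables has k weights that sum to one, leaving k − 1 free weights, plus one copula parameter. The function name is hypothetical.

```python
from math import comb


def dependence_parameter_counts(k, m):
    """Return (model parameters, pairwise correlations) for k factors, m variables.

    Assumes (k - 1) free weights plus one copula parameter per variable.
    """
    model_parameters = m * ((k - 1) + 1)
    pairwise_correlations = comb(m, 2)
    return model_parameters, pairwise_correlations


counts_m7 = dependence_parameter_counts(3, 7)    # (21, 21)
counts_m10 = dependence_parameter_counts(3, 10)  # (30, 45)
```

The model count grows linearly in m while the number of pairwise correlations grows quadratically, so the common-factor model becomes the more economical description as the dimension increases.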
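\[
\begin{aligned}
E[X_i X_j \mid W, n]
&= \int_{u_1=0}^{1} \cdots \int_{u_k=0}^{1} E[X_i X_j \mid U = u, W, n] \, du_k \cdots du_1 \\
&= \int_{u_1=0}^{1} \cdots \int_{u_k=0}^{1} E[X_i \mid U = u, W, n] \, E[X_j \mid U = u, W, n] \, du_k \cdots du_1 \\
&= \int_{u_1=0}^{1} \cdots \int_{u_k=0}^{1} E[X_i \mid Y_i = y_i(u \mid W), n] \, E[X_j \mid Y_j = y_j(u \mid W), n] \, du_k \cdots du_1 \\
&= \int_{u_1=0}^{1} \cdots \int_{u_k=0}^{1} E[X_i \mid V_i = G_i\{y_i(u \mid W), n\}] \, E[X_j \mid V_j = G_j\{y_j(u \mid W), n\}] \, du_k \cdots du_1 \\
&= \int_{u_1=0}^{1} \cdots \int_{u_k=0}^{1}
\left\{ v_i + \frac{(1 - v_i)^{n_i + 1}}{n_i + 1} - \frac{v_i^{n_i + 1}}{n_i + 1} \right\}
\left\{ v_j + \frac{(1 - v_j)^{n_j + 1}}{n_j + 1} - \frac{v_j^{n_j + 1}}{n_j + 1} \right\}
du_k \cdots du_1
\end{aligned}
\]
where \( v_i = G_i\{y_i(u \mid W, n)\} = \Pr(Y_i \le y_i) \), \( Y_i = \sum_{l=1}^{k} w_{il} U_l \), and \( y_i = \sum_{l=1}^{k} w_{il} u_l \).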
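The final integral can be approximated numerically, as sketched below. Assumptions: G_i is evaluated with the vertex-sum volume formula for a weighted sum of uniforms, the integral uses a plain midpoint grid over the k common factors (practical only for small k), and the result is converted to Spearman's ρ through ρ = 12·E[X_i X_j] − 3, which holds for uniform margins. All function names are hypothetical.

```python
import math
from itertools import product


def weighted_sum_cdf(y, weights):
    """Pr(sum_l w_l * U_l <= y) for independent uniforms, via the vertex-sum formula."""
    active = [w for w in weights if w > 0.0]
    k = len(active)
    total = 0.0
    for vertex in product((0, 1), repeat=k):
        slack = y - sum(w * b for w, b in zip(active, vertex))
        if slack > 0.0:
            total += (-1.0 if sum(vertex) % 2 else 1.0) * slack ** k
    value = total / (math.factorial(k) * math.prod(active))
    return min(1.0, max(0.0, value))  # clamp so fractional powers stay real


def conditional_mean(v, n):
    """E[X | V = v] = v + {(1 - v)**(n + 1) - v**(n + 1)} / (n + 1)."""
    return v + ((1.0 - v) ** (n + 1) - v ** (n + 1)) / (n + 1)


def model_spearman(wi, wj, ni, nj, grid=60):
    """12 * E[X_i X_j | W, n] - 3 by a midpoint rule over the k common factors."""
    k = len(wi)
    h = 1.0 / grid
    nodes = [(g + 0.5) * h for g in range(grid)]
    total = 0.0
    for u in product(nodes, repeat=k):
        vi = weighted_sum_cdf(sum(w * x for w, x in zip(wi, u)), wi)
        vj = weighted_sum_cdf(sum(w * x for w, x in zip(wj, u)), wj)
        total += conditional_mean(vi, ni) * conditional_mean(vj, nj)
    return 12.0 * total * h ** k - 3.0
```

With a single factor and unit weights, v equals u and the result reduces to the single common risk factor closed form, which gives 17/70 for n_i = 2 and n_j = 3. The grid cost grows as grid^k, so a Monte Carlo estimate is the natural substitute for larger k.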
Validation of estimation procedures