Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Actuariat de l’Assurance Non-Vie # 9
A. Charpentier (UQAM & Université de Rennes 1)
ENSAE ParisTech, Octobre 2015 - Janvier 2016.
http://freakonometrics.hypotheses.org
@freakonometrics 1
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Fourre-Tout sur la Tarification
• modèle collectif vs. modèle individuel
• cas de la grande dimension
• choix de variables
• choix de modèles
@freakonometrics 2
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Modèle individuel ou modèle collectif ? La loi Tweedie
Consider a Tweedie distribution, with variance function power p ∈ (1, 2), mean µand scale parameter φ, then it is a compound Poisson model,
• N ∼ P(λ) with λ = φµ2−p
2− p
• Yi ∼ G(α, β) with α = −p− 2p− 1 and β = φµ1−p
p− 1
Consversely, consider a compound Poisson model N ∼ P(λ) and Yi ∼ G(α, β),then
• variance function power is p = α+ 2α+ 1
• mean is µ = λα
β
• scale parameter is φ = [λα]α+2α+1−1β2−α+2
α+1
α+ 1
@freakonometrics 3
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Modèle individuel ou modèle collectif ? La régression Tweedie
In the context of regression
Ni ∼ P(λi) with λi = exp[XTi βλ]
Yj,i ∼ G(µi, φ) with µi = exp[XTi βµ]
Then Si = Y1,i + · · ·+ YN,i has a Tweedie distribution
• variance function power is p = φ+ 2φ+ 1
• mean is λiµi
• scale parameter is λ1
φ+1−1i
µφφ+1i
(φ
1 + φ
)
There are 1 + 2dim(X) degrees of freedom.
@freakonometrics 4
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Modèle individuel ou modèle collectif ? La régression Tweedie
Remark Note that the scale parameter should not depend on i.
A Tweedie regression is
• variance function power is p ∈ (1, 2)
• mean is µi = exp[XTi βTweedie]
• scale parameter is φ
There are 2 + dim(X) degrees of freedom.
@freakonometrics 5
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Double Modèle Frquence - Coût Individuel
Considérons les bases suivantes, en RC, pour la fréquence1 > freq = merge (contrat , nombre _RC)
pour les coûts individuels1 > sinistre _RC = sinistre [( sinistre $ garantie =="1RC")&( sinistre $cout >0)
,]
2 > sinistre _RC = merge ( sinistre _RC , contrat )
et pour les co ûts agrégés par police1 > agg_RC = aggregate ( sinistre _RC$cout , by=list( sinistre _RC$ nocontrat )
, FUN=’sum ’)
2 > names (agg_RC)=c(’nocontrat ’,’cout_RC ’)
3 > global _RC = merge (contrat , agg_RC , all.x=TRUE)
4 > global _RC$cout_RC[is.na( global _DO$cout_RC)]=0
@freakonometrics 6
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Double Modèle Frquence - Coût Individuel1 > library ( splines )
2 > reg_f = glm(nb_RC~zone+bs( ageconducteur )+carburant , offset =log(
exposition ),data=freq , family = poisson )
3 > reg_c = glm(cout~zone+bs( ageconducteur )+carburant , data= sinistre _RC
, family = Gamma (link="log"))
Simple Modèle Coût par Police1 > library ( tweedie )
2 > library ( statmod )
3 > reg_a = glm(cout_RC~zone+bs( ageconducteur )+carburant , offset =log(
exposition ),data= global _RC , family = tweedie (var. power =1.5 , link.
power =0))
@freakonometrics 7
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Comparaison des primes1 > freq2 = freq
2 > freq2 $ exposition = 1
3 > P_f = predict (reg_f, newdata =freq2 ,type=" response ")
4 > P_c = predict (reg_c, newdata =freq2 ,type=" response ")
5 prime1 = P_f*P_c
1 > k = 1.5
2 > reg_a = glm(cout_DO~zone+bs( ageconducteur )+carburant , offset =log(
exposition ),data= global _DO , family = tweedie (var. power =k, link. power
=0))
3 > prime2 = predict (reg_a, newdata =freq2 ,type=" response ")
1 > arrows (1:100 , prime1 [1:100] ,1:100 , prime2 [1:100] , length =.1)
@freakonometrics 8
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Impact du degré Tweedie sur les Primes Pures
@freakonometrics 9
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Impact du degré Tweedie sur les Primes Pures
Comparaison des primes pures, assurés no1, no2 et no 3 (DO)
@freakonometrics 10
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
‘Optimisation’ du Paramètre Tweedie
1 > dev = function (k){
2 + reg = glm(cout_RC~zone+bs( ageconducteur )+
carburant , data= global _RC , family =
tweedie (var. power =k, link. power =0) ,
offset =log( exposition ))
3 + reg$ deviance
4 + }
@freakonometrics 11
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Tarification et données massives (Big Data)
Problèmes classiques avec des données massives
• beaucoup de variables explicatives, k grand, XTX peut-être non inversible
• gros volumes de données, e.g. données télématiques
• données non quantitatives, e.g. texte, localisation, etc.
@freakonometrics 12
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
La fascination pour les estimateurs sans biais
En statistique mathématique, on aime les estimateurs sans biais car ils ontplusieurs propriétés intéressantes. Mais ne peut-on pas considérer des estimateursbiaisés, potentiellement meilleurs ?
Consider a sample, i.i.d., {y1, · · · , yn} with distribution N (µ, σ2). Defineθ = αY . What is the optimal α? to get the best estimator of µ ?
• bias: bias(θ)
= E(θ)− µ = (α− 1)µ
• variance: Var(θ)
= α2σ2
n
• mse: mse(θ)
= (α− 1)2µ2 + α2σ2
n
The optimal value is α? = µ2
µ2 + σ2
n
< 1.
@freakonometrics 13
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Linear Model
Consider some linear model yi = xTi β + εi for all i = 1, · · · , n.
Assume that εi are i.i.d. with E(ε) = 0 (and finite variance). Writey1...yn
︸ ︷︷ ︸y,n×1
=
1 x1,1 · · · x1,k...
.... . .
...1 xn,1 · · · xn,k
︸ ︷︷ ︸
X,n×(k+1)
β0
β1...βk
︸ ︷︷ ︸β,(k+1)×1
+
ε1...εn
︸ ︷︷ ︸ε,n×1
.
Assuming ε ∼ N (0, σ2I), the maximum likelihood estimator of β is
β = argmin{‖y −XTβ‖`2} = (XTX)−1XTy
... under the assumtption that XTX is a full-rank matrix.
What if XTiX cannot be inverted? Then β = [XTX]−1XTy does not exist, but
βλ = [XTX + λI]−1XTy always exist if λ > 0.
@freakonometrics 14
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Ridge Regression
The estimator β = [XTX + λI]−1XTy is the Ridge estimate obtained as solutionof
β = argminβ
n∑i=1
[yi − β0 − xTi β]2 + λ ‖β‖`2︸ ︷︷ ︸
1Tβ2
for some tuning parameter λ. One can also write
β = argminβ;‖β‖`2≤s
{‖Y −XTβ‖`2}
Remark Note that we solve β = argminβ{objective(β)} where
objective(β) = L(β)︸ ︷︷ ︸training loss
+ R(β)︸ ︷︷ ︸regularization
@freakonometrics 15
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Going further on sparcity issues
In severall applications, k can be (very) large, but a lot of features are just noise:βj = 0 for many j’s. Let s denote the number of relevent features, with s << k,cf Hastie, Tibshirani & Wainwright (2015),
s = card{S} where S = {j;βj 6= 0}
The model is now y = XTSβS + ε, where XT
SXS is a full rank matrix.
@freakonometrics 16
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Going further on sparcity issues
Define ‖a‖`0 =∑
1(|ai| > 0). Ici dim(β) = s.
We wish we could solve
β = argminβ;‖β‖`0≤s
{‖Y −XTβ‖`2}
Problem: it is usually not possible to describe all possible constraints, since(s
k
)coefficients should be chosen here (with k (very) large).
Idea: solve the dual problem
β = argminβ;‖Y −XTβ‖`2≤h
{‖β‖`0}
where we might convexify the `0 norm, ‖ · ‖`0 .
@freakonometrics 17
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Regularization `0, `1 et `2
@freakonometrics 18
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Going further on sparcity issues
On [−1,+1]k, the convex hull of ‖β‖`0 is ‖β‖`1
On [−a,+a]k, the convex hull of ‖β‖`0 is a−1‖β‖`1
Hence,β = argmin
β;‖β‖`1≤s{‖Y −XTβ‖`2}
is equivalent (Kuhn-Tucker theorem) to the Lagragian optimization problem
β = argmin{‖Y −XTβ‖`2+λ‖β‖`1}
@freakonometrics 19
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
LASSO Least Absolute Shrinkage and Selection Operator
β ∈ argmin{‖Y −XTβ‖`2+λ‖β‖`1}
is a convex problem (several algorithms?), but not strictly convex (no unicity ofthe minimum). Nevertheless, predictions y = xTβ are unique
? MM, minimize majorization, coordinate descent Hunter (2003).
@freakonometrics 20
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Optimal LASSO Penalty
Use cross validation, e.g. K-fold,
β(−k)(λ) = argmin
∑i6∈Ik
[yi − xTi β]2 + λ‖β‖
then compute the sum of the squared errors,
Qk(λ) =∑i∈Ik
[yi − xTi β(−k)(λ)]2
and finally solve
λ? = argmin{Q(λ) = 1
K
∑k
Qk(λ)}
Note that this might overfit, so Hastie, Tibshiriani & Friedman (2009) suggest thelargest λ such that
Q(λ) ≤ Q(λ?) + se[λ?] with se[λ]2 = 1K2
K∑k=1
[Qk(λ)−Q(λ)]2
@freakonometrics 21
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
1 > freq = merge (contrat , nombre _RC)
2 > freq = merge (freq , nombre _DO)
3 > freq [ ,10]= as. factor (freq [ ,10])
4 > mx= cbind (freq[,c(4 ,5 ,6)],freq [ ,9]=="D",
freq [ ,3]% in%c("A","B","C"))
5 > colnames (mx)=c( names (freq)[c(4 ,5 ,6)],"
diesel ","zone")
6 > for(i in 1: ncol(mx)) mx[,i]=( mx[,i]-mean(
mx[,i]))/sd(mx[,i])
7 > names (mx)
8 [1] puissance agevehicule ageconducteur
diesel zone
9 > library ( glmnet )
10 > fit = glmnet (x=as. matrix (mx), y=freq [ ,11] ,
offset =log(freq [ ,2]) , family = " poisson
")
11 > plot(fit , xvar=" lambda ", label =TRUE)
LASSO, Fréquence RC
−10 −9 −8 −7 −6 −5
−0.
20−
0.15
−0.
10−
0.05
0.00
0.05
0.10
Log Lambda
Coe
ffici
ents
4 4 4 4 3 1
1
3
4
5
@freakonometrics 22
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
LASSO, Fréquence RC
1 > plot(fit , label =TRUE)
2 > cvfit = cv. glmnet (x=as. matrix (mx), y=freq
[ ,11] , offset =log(freq [ ,2]) ,family = "
poisson ")
3 > plot( cvfit )
4 > cvfit $ lambda .min
5 [1] 0.0002845703
6 > log( cvfit $ lambda .min)
7 [1] -8.16453
• Cross validation curve + error bars
0.0 0.1 0.2 0.3 0.4
−0.
20−
0.15
−0.
10−
0.05
0.00
0.05
0.10
L1 Norm
Coe
ffici
ents
0 2 3 4 4
1
3
4
5
−10 −9 −8 −7 −6 −5
0.24
60.
248
0.25
00.
252
0.25
40.
256
0.25
8
log(Lambda)
Poi
sson
Dev
ianc
e
●●
●●
●●
●●
●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
4 4 4 4 4 4 4 4 4 4 4 4 4 3 3 3 2 1
@freakonometrics 23
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
1 > freq = merge (contrat , nombre _RC)
2 > freq = merge (freq , nombre _DO)
3 > freq [ ,10]= as. factor (freq [ ,10])
4 > mx= cbind (freq[,c(4 ,5 ,6)],freq [ ,9]=="D",
freq [ ,3]% in%c("A","B","C"))
5 > colnames (mx)=c( names (freq)[c(4 ,5 ,6)],"
diesel ","zone")
6 > for(i in 1: ncol(mx)) mx[,i]=( mx[,i]-mean(
mx[,i]))/sd(mx[,i])
7 > names (mx)
8 [1] puissance agevehicule ageconducteur
diesel zone
9 > library ( glmnet )
10 > fit = glmnet (x=as. matrix (mx), y=freq [ ,12] ,
offset =log(freq [ ,2]) , family = " poisson
")
11 > plot(fit , xvar=" lambda ", label =TRUE)
LASSO, Fréquence DO
−9 −8 −7 −6 −5 −4
−0.
8−
0.6
−0.
4−
0.2
0.0
Log Lambda
Coe
ffici
ents
4 4 3 2 1 1
1
2
4
5
@freakonometrics 24
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
LASSO, Fréquence DO
1 > plot(fit , label =TRUE)
2 > cvfit = cv. glmnet (x=as. matrix (mx), y=freq
[ ,12] , offset =log(freq [ ,2]) ,family = "
poisson ")
3 > plot( cvfit )
4 > cvfit $ lambda .min
5 [1] 0.0004744917
6 > log( cvfit $ lambda .min)
7 [1] -7.653266
• Cross validation curve + error bars
0.0 0.2 0.4 0.6 0.8 1.0
−0.
8−
0.6
−0.
4−
0.2
0.0
L1 Norm
Coe
ffici
ents
0 1 1 1 2 4
1
2
4
5
−9 −8 −7 −6 −5 −4
0.21
50.
220
0.22
50.
230
0.23
5
log(Lambda)
Poi
sson
Dev
ianc
e
●
●
●
●
●
●
●
●
●
●
●
●●
●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
4 4 4 4 3 3 3 3 3 2 2 1 1 1 1 1 1 1 1
@freakonometrics 25
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Model Selection and Gini/Lorentz (on incomes)
Consider an ordered sample {y1, · · · , yn}, then Lorenz curve is
{Fi, Li} with Fi = i
nand Li =
∑ij=1 yj∑nj=1 yj
The theoretical curve, given a distribution F , is
u 7→ L(u) =∫ F−1(u)−∞ tdF (t)∫ +∞−∞ tdF (t)
see Gastwirth (1972, econpapers.repec.org)
@freakonometrics 26
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Model Selection and Gini/Lorentz (on incomes)
1 > library (ineq)
2 > set.seed (1)
3 > (x<-sort( rlnorm (5 ,0 ,1)))
4 [1] 0.4336018 0.5344838 1.2015872 1.3902836
4.9297132
5 > Lc.sim <- Lc(x)
6 > plot(Lc.sim)
7 > points ((1:4) /5,( cumsum (x)/sum(x))[1:4] , pch
=19 , col="blue")
8 > lines (Lc.lognorm , parameter =1, lty =2)0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
Lorenz curve
p
L(p)
●
●
●
●
@freakonometrics 27
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Model Selection and Gini/Lorentz (on incomes)
Gini index is the ratio of the areas AA + B . Thus,
G = 2n(n− 1)x
n∑i=1
i · xi:n −n+ 1n− 1
= 1E(Y )
∫ ∞0
F (y)(1− F (y))dy
1 > Gini(x)
2 [1] 0.4640003●
●
0.0 0.2 0.4 0.6 0.8 1.0
0.0
0.2
0.4
0.6
0.8
1.0
p
L(p)
●
●
●
●A
B
@freakonometrics 28
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Comparing Models
Consider an ordered sample {y1, · · · , yn} of incomes,with y1 ≤ y2 ≤ · · · ≤ yn, then Lorenz curve is
{Fi, Li} with Fi = i
nand Li =
∑ij=1 yj∑nj=1 yj
We have observed losses yi and premiums π(xi). Con-sider an ordered sample by the model, see Frees, Meyers& Cummins (2014), π(x1) ≥ π(x2) ≥ · · · ≥ π(xn), thenplot
{Fi, Li} with Fi = i
nand Li =
∑ij=1 yj∑nj=1 yj
0 20 40 60 80 100
020
4060
8010
0
Proportion (%)
Inco
me
(%)
poorest ← → richest
0 20 40 60 80 100
020
4060
8010
0
Proportion (%)
Loss
es (
%)
more risky ← → less risky
@freakonometrics 29
Arthur CHARPENTIER - Actuariat de l’Assurance Non-Vie, # 9
Choix et comparaison de modèle en tarification
See Frees et al. (2010) or Tevet (2013).
@freakonometrics 30