torsas.ca/attachments/File/20190621/Suprun-LGDModel.pdf · 2019-07-01

Alexander Suprun, CIBC

August 2018


Introductory Notes

The discussion and examples in this presentation are applied to LGD models.

However, the same methodology could be applied to any account-level target in the range [0,1], and to any model that consists of several pools containing accounts with similar values of the target.

An obvious example of such a model is a pool-wise exposure at default (EAD) model, where EAD is defined as the ratio of default balance to limit.

The AUROC and RCAP code is on the final pages.


Revised AUROC

AUROC: Area Under the Receiver Operating Characteristic curve, applied to a binary version of LGD (see below).

Model AUROC calculation

For pool-wise sorted LGD(p), where p = 1 to $p_n$, calculate the cumulative percent of "bad" and "good" accounts:

$$cNB_p = \frac{\sum_{i=1}^{p} NB_i}{\sum_{i=1}^{p_n} NB_i} \quad \text{and} \quad cNG_p = \frac{\sum_{i=1}^{p} NG_i}{\sum_{i=1}^{p_n} NG_i}$$

Graph $cNB_p$ vs $cNG_p$ and calculate the area under the curve.

Perfect or account AUROC calculation

Sort the account-level LGD values and arrange the accounts in a large number of pools, pool=ROUND(aLGD/aLGDmax/0.01), which produces around 100 pools; aLGD is an account LGD value.

Use the above algorithm to calculate the Perfect AUROC.

Revised AUROC = Model AUROC / Perfect AUROC
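The model AUROC steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pool_auroc`, the trapezoid rule for the area, the "riskiest pool first" ordering and the toy counts are not from the presentation.

```python
from typing import Sequence

def pool_auroc(n_bad: Sequence[float], n_good: Sequence[float]) -> float:
    """Area under the curve of cumulative bad share (cNB) vs cumulative
    good share (cNG). Pools are assumed to be sorted already, riskiest first."""
    tot_bad = sum(n_bad)
    tot_good = sum(n_good)
    area = 0.0
    c_bad_prev = c_good_prev = 0.0
    run_bad = run_good = 0.0
    for nb, ng in zip(n_bad, n_good):
        run_bad += nb
        run_good += ng
        c_bad = run_bad / tot_bad      # cNB_p
        c_good = run_good / tot_good   # cNG_p
        # trapezoid between the previous and the current point of the curve
        area += 0.5 * (c_bad + c_bad_prev) * (c_good - c_good_prev)
        c_bad_prev, c_good_prev = c_bad, c_good
    return area

# Hypothetical three-pool example (counts are made up for illustration).
model_auroc = pool_auroc([30, 20, 10], [10, 20, 30])
print(round(model_auroc, 4))
```

The revised AUROC would then be `model_auroc` divided by the value the same function returns for the ~100 account-level pools.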


RCAP

Revised Cumulative Accuracy Profile (Revised CAP)

Applied directly to LGD.

Model CAP calculation:

For pool-wise sorted LGD(p), where p = 1 to $p_n$, calculate the cumulative percent

$$cLGD_p = \frac{\sum_{i=1}^{p} LGD_i}{\sum_{i=1}^{p_n} LGD_i}$$

and $cP_p = \sum_{i=1}^{p} i / p_n$.

Graph $cLGD_p$ vs $cP_p$ and calculate the area under the curve.

Perfect CAP calculation:

Sort the account LGD values and apply the above method, considering each account as a separate pool. This value is the highest possible CAP for the given data.

RCAP = Model CAP / Perfect CAP
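A minimal Python sketch of the model CAP, perfect CAP and RCAP is given below. It assumes the convention used by the RCAP macro at the end of the presentation, where the x-axis is the cumulative share of accounts and the area is measured between the CAP curve and the diagonal. The names `cap_area` and `rcap` and the toy data are hypothetical.

```python
def cap_area(groups):
    """groups: list of (n_accounts, sum_of_lgd), sorted from the highest to the
    lowest mean LGD. Returns the area between the CAP curve and the diagonal."""
    tot_n = sum(n for n, _ in groups)
    tot_y = sum(s for _, s in groups)
    area = 0.0
    depth_prev = prop_prev = 0.0
    cum_n = cum_y = 0.0
    for n, s in groups:
        cum_n += n
        cum_y += s
        depth = cum_n / tot_n   # cumulative share of accounts
        prop = cum_y / tot_y    # cumulative share of total LGD
        area += 0.5 * ((prop - depth) + (prop_prev - depth_prev)) * (depth - depth_prev)
        depth_prev, prop_prev = depth, prop
    return area

def rcap(pools):
    """pools: dict mapping a pool id to the list of observed account LGDs."""
    model_groups = sorted(
        ((len(v), sum(v)) for v in pools.values()),
        key=lambda g: g[1] / g[0],
        reverse=True,
    )
    # Perfect CAP: every account is its own pool, sorted by observed LGD.
    accounts = sorted((y for v in pools.values() for y in v), reverse=True)
    model = cap_area(model_groups)
    perfect = cap_area([(1, y) for y in accounts])
    return model, perfect, model / perfect

# Hypothetical two-pool example.
model_cap, perfect_cap, r_cap = rcap({"A": [1.0, 0.5], "B": [0.5, 0.0]})
print(model_cap, perfect_cap, round(r_cap, 4))
```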


AUROC for Binary Version of LGD

Binomization:

Let LGD=0.75; then create two records: BAD=1 with freq=75 and BAD=0 with freq=25.

Perfect or Account Level AUROC:

Create pools by grouping accounts that have the same value of pool=ROUND(aLGD/aLGDmax/0.01), where aLGD is the account-level LGD.

Calculate AUROC based on these ~100 pools.
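The two constructions on this slide can be illustrated with a short Python sketch. The function names `binomize` and `account_pool` and the `scale` parameter are assumptions made for the example; rounding uses floor(x + 0.5) to mimic half-up rounding for non-negative values, which is assumed here to match the ROUND function referred to in the slide.

```python
import math

def binomize(lgd, scale=100):
    """Split one account with LGD in [0, 1] into two weighted records:
    (BAD=1, freq) and (BAD=0, freq), with the frequencies summing to `scale`."""
    bad_freq = math.floor(lgd * scale + 0.5)
    return [(1, bad_freq), (0, scale - bad_freq)]

def account_pool(a_lgd, a_lgd_max, step=0.01):
    """Pool index for the account-level ("perfect") AUROC:
    ROUND(aLGD / aLGDmax / step), giving roughly 1/step pools."""
    return math.floor(a_lgd / a_lgd_max / step + 0.5)

print(binomize(0.75))
print(account_pool(0.5, 1.0))
```

Feeding the weighted BAD records of every account into a weighted AUROC calculation gives the "binary version" of the statistic.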


Model AUROC

Calculate AUROC for an LGD model with 8 pools.

Results:

DWLGD   r_AUROC  mod_AUROC  acc_AUROC
0.4843  0.6925   0.6496     0.9380

r_AUROC=mod_AUROC/acc_AUROC

DWLGD stands for dollar-weighted LGD.


Account Level & Model ROC Graph

[Figure: account-level and model ROC curves; r_AUROC=0.6925, acc_AUROC=0.9380, mod_AUROC=0.6496]


AUROC for Simple Binary LGD

Binomization:

If LGD > t_LGD then BAD=1 else BAD=0, where t_LGD is some threshold LGD.

Model LGD AUROC

AUROC for model LGD pools has been calculated for t_LGD=0.05 to 1.00 by 0.05.
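The threshold scan described above might look as follows in Python. This is a sketch only: the function name `threshold_auroc`, the representation of accounts as (pool LGD, account LGD) pairs, the trapezoid rule and the toy data are all assumptions, and pools are identified by their pool LGD value.

```python
from collections import defaultdict

def threshold_auroc(accounts, t_lgd):
    """accounts: iterable of (pool_lgd, account_lgd) pairs.
    Returns the pool-level AUROC for BAD = (account_lgd > t_lgd),
    or None when the threshold leaves one of the two classes empty."""
    counts = defaultdict(lambda: [0, 0])  # pool_lgd -> [n_bad, n_good]
    for pool_lgd, a_lgd in accounts:
        counts[pool_lgd][0 if a_lgd > t_lgd else 1] += 1
    tot_bad = sum(c[0] for c in counts.values())
    tot_good = sum(c[1] for c in counts.values())
    if tot_bad == 0 or tot_good == 0:
        return None
    area = 0.0
    c_bad_prev = c_good_prev = 0.0
    run_bad = run_good = 0
    for pool_lgd in sorted(counts, reverse=True):  # riskiest pool first
        n_bad, n_good = counts[pool_lgd]
        run_bad += n_bad
        run_good += n_good
        c_bad, c_good = run_bad / tot_bad, run_good / tot_good
        area += 0.5 * (c_bad + c_bad_prev) * (c_good - c_good_prev)
        c_bad_prev, c_good_prev = c_bad, c_good
    return area

# Hypothetical data: two pools with three accounts each.
accounts = [(0.8, 0.9), (0.8, 0.7), (0.8, 0.2), (0.3, 0.6), (0.3, 0.1), (0.3, 0.0)]
thresholds = [round(0.05 * k, 2) for k in range(1, 21)]  # 0.05 to 1.00 by 0.05
results = {t: threshold_auroc(accounts, t) for t in thresholds}
```

With real data, each entry of `results` would be divided by the account-level AUROC to obtain the revised value for that threshold.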


AUROC for Simple Binary LGD cont.

Results

All threshold values produce roughly the same AUROC.

t_LGD=0.4843 is the dollar-weighted LGD value for the model.

t_LGD   r_AUROC  mod_AUROC  acc_AUROC
0.4843  0.7172   0.6727     0.9380
0.0500  0.7188   0.6743     0.9380
0.1000  0.7202   0.6756     0.9380
0.1500  0.7206   0.6759     0.9380
0.2000  0.7208   0.6761     0.9380
0.2500  0.7199   0.6753     0.9380
0.3000  0.7192   0.6747     0.9380
0.3500  0.7189   0.6743     0.9380
0.4000  0.7180   0.6735     0.9380
0.4500  0.7175   0.6731     0.9380
0.5000  0.7168   0.6724     0.9380
0.5500  0.7160   0.6716     0.9380
0.6000  0.7152   0.6708     0.9380
0.6500  0.7140   0.6698     0.9380
0.7000  0.7131   0.6689     0.9380
0.7500  0.7117   0.6676     0.9380
0.8000  0.7100   0.6660     0.9380
0.8500  0.7076   0.6638     0.9380
0.9000  0.7038   0.6602     0.9380
0.9500  0.6984   0.6552     0.9380
1.0000  0.6691   0.6276     0.9380
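As a quick arithmetic check, the r_AUROC column can be compared with mod_AUROC/acc_AUROC for a few of the reported rows. The helper name `max_ratio_gap` is hypothetical; the figures are copied from the table, and small gaps are expected because the published values are rounded to four decimals.

```python
ACC_AUROC = 0.9380

# (t_LGD, r_AUROC, mod_AUROC) rows copied from the results table
rows = [
    (0.4843, 0.7172, 0.6727),
    (0.0500, 0.7188, 0.6743),
    (0.5000, 0.7168, 0.6724),
    (1.0000, 0.6691, 0.6276),
]

def max_ratio_gap(rows, acc_auroc):
    """Largest absolute difference between the reported r_AUROC and
    the ratio mod_AUROC / acc_AUROC."""
    return max(abs(r - m / acc_auroc) for _, r, m in rows)

gap = max_ratio_gap(rows, ACC_AUROC)
print(f"largest gap: {gap:.5f}")
```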


Revised Cumulative Accuracy Profile

Model LGD Pools: rCAP=model_CAP/perfect_CAP

rCAP=0.2637, perfect_CAP=0.4201, model_CAP=0.1108


Conclusions

The suggested revised AUROC is a better representation of the model AUROC value.

AUROC for the simple binary LGD model is practically independent of the LGD threshold value.

AUROC for the binary LGD model is somewhat lower than the one for the simple binary model.


Q & A


Area Under Receiver Operating Characteristic (ROC) Macro

/************************************************************/
/* HIGHER POOL NUMBERS SHOULD CORRESPOND TO HIGHER PDs      */
/* BAD=1 MEANS DEFAULT, BAD=0 MEANS NONDEFAULT              */
/************************************************************/
%macro AUC_ST (dat=          /* Input Dataset */
              ,res=          /* Output AUC Time Series */
              ,pool=         /* Pool Variable */
              ,bad=          /* Bad Flag */
              ,YearMonth=    /* Date or YearMonth */
              ,wt=           /* Weight Variable */
              ,AUCT_ONLY=0   /* If AUCT_ONLY=1 Then &auct ONLY is Calculated */
                             /* Otherwise AUC Time Series and &auct are Calculated */
              ,print=1       /* Print=1 means output into LOG */ );
%global auct set;

/* Clear Output Datasets */
%if %sysfunc(exist(WORK._somerst))=1 %then %do; proc SQL; drop table _somerst; quit; %end;
%if %sysfunc(exist(WORK._somers)) =1 %then %do; proc SQL; drop table _somers; quit; %end;

/* Total */
proc freq data=&dat noprint;
  output out=_somerst SMDRC SMDCR;
  table &pool*&bad/noprint measures;
  %if "&wt" NE "" %then %do; weight &wt; %end;
run;

/* AUC Total */
proc SQL noprint;
  select MAX((1+_SMDRC_)/2,1-(1+_SMDRC_)/2) as auct,
         E_SMDRC/2 as set
    into :auct trimmed, :set trimmed
    from _somerst;
quit;

%if &print = 1 %then %do; %put AUC_ST: auct=&auct set=&set; %end;

%if &AUCT_ONLY = 0 %then %do;
  /* Monthly */
  proc sort data=&dat out=_a(keep=&pool &Bad &YearMonth &wt); by &YearMonth; run;

  proc freq data=_a noprint;
    output out=_somers SMDRC SMDCR;
    table &pool*&bad/noprint measures;
    %if "&wt" NE "" %then %do; weight &wt; %end;
    by &YearMonth;
  run;

  data &res;
    retain AUC SE;
    set _Somers;
    AUC=(1+_SMDRC_)/2; SE=E_SMDRC/2;
    format AUC 6.4 SE 8.6;
    label AUC="AUC based on Somers' D R|C";
    label SE="StdErr of AUC based on Somers'D R|C StdErr";
  run;
%end;

%mend AUC_ST;
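The macro derives the AUC from Somers' D R|C as AUC=(1+D)/2 and then keeps the larger of AUC and 1-AUC. The Python sketch below illustrates that relation for a pool by bad/good table. It assumes that, for a binary column variable, D R|C reduces to (concordant - discordant) / (n_good * n_bad); the function name and the toy counts are hypothetical.

```python
def auc_from_somers_d(table):
    """table: list of (n_good, n_bad) per pool, ordered by ascending pool
    number (higher pool number = riskier). Counts may be weighted frequencies.
    Returns (somers_d, auc) with auc = max((1 + D) / 2, 1 - (1 + D) / 2)."""
    tot_good = sum(g for g, _ in table)
    tot_bad = sum(b for _, b in table)
    concordant = discordant = 0.0
    good_below = 0.0
    for n_good, n_bad in table:
        good_above = tot_good - good_below - n_good
        concordant += n_bad * good_below   # bad sits in a riskier pool than good
        discordant += n_bad * good_above   # bad sits in a safer pool than good
        good_below += n_good
    somers_d = (concordant - discordant) / (tot_good * tot_bad)
    auc = (1.0 + somers_d) / 2.0
    return somers_d, max(auc, 1.0 - auc)

# Hypothetical three-pool table of (n_good, n_bad).
d, auc = auc_from_somers_d([(30, 10), (20, 20), (10, 30)])
print(round(d, 4), round(auc, 4))
```

Bad/good pairs that fall in the same pool count as ties and contribute one half each to the AUC, which is why this value should agree with the trapezoid area under the pool-level ROC curve.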


Revised Cumulative Accuracy Profile (RCAP) Macro

%macro rcap(data=,y=,score=,tbin=0.05,gbin=0.001,graph=Y,table=Y);
%* data  =input data set name (account level data with pool assigned to each account) ***;
%* y     =variable for observed target value for an account ***;
%* score =pool variable ***;
%* tbin  =bin size for table, 0.05 means 20 bins in total (5% accounts per bin) ***;
%* gbin  =bin size for graph, 0.001 means 1000 points in total ***;
%* output=_rcap&data (model cap, perfect_cap and RCAP=model_cap/perfect_cap) ***;
%* output=_g2&data CAP curve for a graph ***;
%* output=_g2p&data CAP curve based on model pools ***;
%* NOTE: the macro variable LGDoct (libref of the input data set) must be defined outside this macro ***;
%let out2=%scan(&data, 2);
%if %length(&out2)=0 %then %let data=%scan( &data, 1 );
%else %let data=&out2;

proc sort data=&LGDoct..&data out=_temp&data; by &score; run;

data _null_;
  set _temp&data end=eof;
  if _n_=1 then do; tot_n=0; totsum_y=0; end;
  tot_n+1; totsum_y+&y;
  if eof then do;
    call symput('tot_n',put(tot_n,12.));
    call symput('totsum_y',put(totsum_y, 24.12));
  end;
run;

proc summary data=_temp&data nway missing;
  var &y; class &score;
  output out=_grp&data(drop=_type_) mean(&y)=_grp_&y;
run;

data _null_; set _grp&data end=eof; if eof then do; call symput('Parameters',put(_n_,12.)); end; run;

proc sort data=_grp&data out=ForScore; by _grp_&y; run;
data score_grp&data; set ForScore; score=_n_; run;
proc sort data=score_grp&data; by descending score; run;

data _grp2&data (keep=depth prop_y rename=(prop_y=grp_prop_y));
  set score_grp&data end=eof;
  retain cum1 0 depth prop_y;
  if _n_ = 1 then do; depth=0; prop_y=0; output _grp2&data; end;
  depth+(_freq_/&tot_n); cum1+(_grp_&y*_freq_);
  prop_y=cum1/&totsum_y; output _grp2&data;
run;

proc sort data=score_grp&data; by &score; run;
data _temp&data; merge _temp&data score_grp&data; by &score; run;
proc sort data=_temp&data; by descending score; run;

data _t&data (keep=depth score bin_score binavg_y bintot binsum_y prop_y)
     _g&data(keep=obs depth prop_y);
  set _temp&data end=eof;
  retain cum1 0 cumtot 0 max_ks 0 obs binsum_y 0 bintot tsize gsize bin_scoresum cap prev_depth prev_propy;
  if _n_=1 then do;
    prev_depth=0; prev_propy=0; depth=0; prop_y=0;
    tsize=ceil(&tbin*&tot_n); gsize=ceil(&gbin*&tot_n); cap=0; obs=0; output _g&data;
  end;
  depth=_n_/&tot_n; binsum_y+_grp_&y; bintot+1; bin_scoresum+score;
  cum1+_grp_&y; cumtot+1; prop_y=cum1/&totsum_y;
  if mod(_n_,tsize)=0 or depth=1 then do;
    binavg_y=binsum_y/bintot; bin_score=bin_scoresum/bintot; output _t&data;
    binsum_y=0; bintot=0; bin_scoresum=0;
  end;
  if mod(_n_,gsize)=0 or depth=1 then do;
    obs+1; cap+0.5*(prop_y-depth+prev_propy-prev_depth)*(depth-prev_depth);
    output _g&data; prev_depth=depth; prev_propy=prop_y;
  end;
  if eof then do; call symput('cap',put(cap,8.6)); end;
run;


Revised Cumulative Accuracy Profile (RCAP) Macro (Continued)

%if %upcase(&table)=Y %then %do;
  proc print data=_t&data;
    var depth score bin_score binavg_y bintot binsum_y prop_y;
    format depth percent6. score bin_score binavg_y percent8.2;
  run;
%end;

proc sort data=&LGDOct..&data out=_temp&data; by descending &y; run;

/* Perfect Cap */
data _p&data(keep=obs depth perfect_y);
  set _temp&data end=eof;
  retain cum1 0 cumtot 0 gsize pcap obs prev_depth prev_perfect_y;
  if _n_ = 1 then do;
    prev_depth=0; prev_perfect_y=0; depth=0; perfect_y=0;
    gsize=ceil(&gbin*&tot_n); pcap=0; cum1=0; cumtot=0; obs=0; output _p&data;
  end;
  depth=_n_/&tot_n; cum1+&y; cumtot+1; perfect_y=cum1/&totsum_y;
  if mod(_n_, gsize)=0 or depth=1 then do;
    obs+1; pcap=pcap+0.5*(perfect_y-depth+prev_perfect_y-prev_depth)*(depth-prev_depth);
    output _p&data; prev_depth=depth; prev_perfect_y=perfect_y;
  end;
  if eof then do;
    rcap=&cap/pcap;
    call symput('pcap',put(pcap,8.6)); call symput('rcap',put(rcap,8.6));
  end;
run;

data _g2&data; merge _g&data _p&data _grp2&data; by depth; run;

%if %upcase(&graph)=Y %then %do;
  title "RCAP Chart for &data";
  title2 "Model CAP=[&cap], Account-Level CAP=[&pcap], Pool-Level RCAP=[&rcap]";
  title3 "# of Unique Scores=[&parameters]";
  Legend1 value=(color=black height=1 "Segment-Level" "Account-Level" "Random" "Pools");
  proc gplot data=_g2&data;
    symbol1 v=none c=black i=join; symbol2 v=none c=red i=join line=20;
    symbol3 v=none c=black i=join line=20; symbol4 v=dot c=black i=none ;
    plot (prop_y perfect_y depth grp_prop_y)*depth / overlay legend=legend1;
  run; quit;
  title; title2;
%end;

data _rcap&data; model_cap=&cap; perfect_cap=&pcap; rcap=&rcap; num_uni_scores=&parameters; run;

%mend rcap;
