risk and risk management in the credit card industry

30
FLORENTIN BUTARU (OCC) QINGQING CHEN (OCC) BRIAN CLARK (OCC) SANMAY DAS (WASHINGTON UNIVERSITY) ANDREW LO (MIT) AKHTAR SIDDIQUE (OCC) JANUARY 29, 2015 *This views expressed in this paper and presentation are those of the authors alone and do not necessarily reflect those of the U.S. Treasury or Office of the Comptroller of the Currency. Risk and risk management in the credit card industry

Upload: others

Post on 02-Jan-2022

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Risk and risk management in the credit card industry

F L O R E N T I N B U T A R U ( O C C ) Q I N G Q I N G C H E N ( O C C )

B R I A N C L A R K ( O C C ) S A N M A Y D A S ( W A S H I N G T O N U N I V E R S I T Y )

A N D R E W L O ( M I T ) A K H T A R S I D D I Q U E ( O C C )

J A N U A R Y 2 9 , 2 0 1 5

* T h i s v i e w s e x p r e s s e d i n t h i s p a p e r a n d p r e s e n t a t i o n a r e t h o s e o f t h e a u t h o r s a l o n e a n d d o n o t n e c e s s a r i l y r e f l e c t t h o s e o f t h e U . S . T r e a s u r y o r O f f i c e o f t h e C o m p t r o l l e r o f t h e C u r r e n c y .

Risk and risk management in the credit card industry

Page 2: Risk and risk management in the credit card industry

Key Takeaways

¡  Machine learning models (with six banks’ credit card data) out-of-sample and out-of-time forecasts of credit-card-holder delinquencies and defaults outperform the traditional logistic regressions.

¡  Analysis and comparison of risk management practices across

different banks along with comparison of the drivers of delinquency at different institutions show: ÷ Substantial heterogeneity in risk factors, sensitivities, and

predictability of delinquency across banks ÷  In particular, macroeconomic variables affect different institutions

differently.

January 29, 2016

2

MFM Workshop - New York Univerisity

Page 3: Risk and risk management in the credit card industry

Key Takeaways: 2

¡  One model fit, in terms of risk factors or sensitivities to risk factors, would not work very well across all institutions.

¡  Predictability of delinquency and the quality (i.e. effectiveness)

of risk management varies across institutions

÷ Leveraging information from one bank may be useful for analyzing the risk and performance for another bank.

¡  Policy implications that bank specific supervision and capital requirements would be more appropriate.

January 29, 2016

3

MFM Workshop - New York Univerisity

Page 4: Risk and risk management in the credit card industry

Consumer Credit Modeling

�  Two broad schools of modeling: ¡  Traditional statistical models (logistic regression)

÷  Segmentation analysis, more relevant and predictive attributes ¢  Mester, 1997; Morrison, 2004;Fensterstock, 2005

÷  Tend to be easier to interpret and thus useful in practice and may be more useful over long time periods

¡  Machine learning and data mining method ÷  Decision tree models (Davis, Edelman, & Gammerman, 1992), neural network models (Desai,

Crook, & Overstreet, 1996; Malhotra & Malhotra, 2002; West, 2000), support vector machines (Huang, Chen, Wang, 2007)

÷  Machine learning is most useful for large-scale complex problems, such as modeling human behavior

�  Why use ML for modeling consumer credit card defaults? ¡  ML works well when classification is the key model output and is likely to work over short

time horizons ¡  Banks can actively manage credit cards to manage their exposure

÷  Line cuts ÷  Freeze accounts

January 29, 2016

4

MFM Workshop - New York Univerisity

Page 5: Risk and risk management in the credit card industry

Concerns with Data Mining

�  Machine learning (a.k.a. data mining) is widely used in many scientific fields, but is often viewed with skepticism in economics research because: ¡  A lack of theory ¡  Overfitting the data (in-sample) ¡  Difficult to interpret the results

�  We address the above issues by: ¡  Testing out of sample ¡  Sensitivity analysis ¡  Analyzing the decision trees

January 29, 2016

5

MFM Workshop - New York Univerisity

Page 6: Risk and risk management in the credit card industry

Data Sources

�  Credit Card Data: ¡  Detailed account-level characteristics (days past due, balance, utilization, FICO, behavioral

score, etc.) ÷  Cannot link the accounts across individuals

¡  Collected by the OCC ¡  Entire credit card portfolios of nine (six) large U.S. banks ¡  Monthly data starting January 2008

�  Attribute Data: ¡  Detailed borrower characteristics from credit bureau (linked by account) ¡  Quarterly staring in 2009 and ending 2013.

�  Macroeconomic Variables: ¡  Collected from various sources (linked by ZIP code associated with the account) ¡  Employment data, HPI, average wages, average hours, etc.

�  In total, over 20TB of raw data ¡  Takes weeks to process

January 29, 2016

6

MFM Workshop - New York Univerisity

Page 7: Risk and risk management in the credit card industry

Sample Selection

�  Start with the full dataset from all six banks �  Create a panel dataset that tracks accounts over time �  Simple random sample the data

¡  Retain every nth account where (100/n) = s% sample size ¡  Sample before merging attribute and bureau data (merge has

roughly a 70% success rate) ¡  Each month, we retain s% of the new accounts so our sample

size mimics the bank’s portfolio size �  Since we are forecasting default, we drop all accounts

currently in default (e.g., 90DPD+, chargeoffs, etc.) �  Retain everything else

January 29, 2016

7

MFM Workshop - New York Univerisity

Page 8: Risk and risk management in the credit card industry

Variable Selection

�  Roughly based on Glennon, et al. (2009) ¡  Raw data with 186 raw data items (106 account-level items and 80

credit-bureau items) ¡  Including account level variables, credit bureau variables, and

macroeconomic variables ¡  Compile into 87 attributes/features ¡  These are the results we will focus on today

�  Variable selection is carried out in two ways: 1.  Linear best first selection (forward, backward, and bi-directional)

1.  Very few attributes are retained regardless of the method 2.  Decision tree analysis from the full models and retain the top 20

variables from each bank (more on this later)

January 29, 2016

8

MFM Workshop - New York Univerisity

Page 9: Risk and risk management in the credit card industry

Account  Level  Features: Credit  Bureau  Features: Macroeconomic  Features:Cycle  end  balance Flag  if  greater  than  0  accounts  90  days  past  due Unemployment  rateRefreshed  credit  score Flag  if  greater  than  0  accounts  60  days  past  due Unemployment  rate  (3  mo.  chg.)Behavioral  score Flag  if  greater  than  0  accounts  30  days  past  due Unemployment  rate  (12  mo.  chg.)Current  credit  l imit Flag  if  greater  than  0  bank  cards  60  days  past  due Number  of  total  nonfarm  (NSA)Line  frozen  flag  (0,1) Flag  if  greater  than  0  retail  cards  60  days  past  due Number  of  total  nonfarm  (NSA)  (3  mo.  chg.)Line  decrease  in  current  mo.  flag  (0,1) Flag  if  total  l imit  on  all  bank  cards  greater  than  zero Number  of  total  nonfarm  (NSA)  (12  mo.  chg.)Line  increase  in  current  mo.  flag  (0,1) Flag  if  total  l imit  on  all  retail  cards  greater  than  zero Total  private  (NSA)  (3  mo.  chg.)Actual  payment  /  minimum  payment Flag  if  greater  than  0  accounts  opened  in  the  past  year Total  private  (NSA)  (12  mo.  chg.)Days  past  due Total  number  of  accounts Avg.  weekly  hours  worked  (private)  (3  mo.  chg.)Purchase  volume  /  credit  l imit Total  balance  on  all  accounts  /  total  l imit Avg.  weekly  hours  worked  (private)  (12  mo.  chg.)Cash  advance  volume  /  credit  l imit Total  non-­‐mortgage  balance  /  total  l imit Avg.  hourly  wage  (private)  (3  mo.  chg.)Balance  transfer  volume  /  credit  l imit Total  number  of  accounts  60+  days  past  due Avg.  hourly  wage  (private)  (12  mo.  chg.)Flag  is  the  card  is  securitized Total  number  of  bank  card  accounts Avg.  weekly  hours  worked  (trade  and  transportation)  (3  mo.  chg.)chg.  in  securitization  status  (1  mo.) Util izatiion  of  all  bank  card  accounts Avg.  weekly  hours  worked  (trade  and  transportation)  (12  mo.  chg.)Percent  chg.  in  credit  l imit  (lagged  1  mo.) Number  of  accounts  30+  days  past  due Avg.  hourly  wage  (trade  and  transportation)  (3  mo.  chg.)Percent  chg.  in  credit  l imit  current  1  mo.) Number  of  accounts  60+  days  past  due Avg.  hourly  wage  (trade  and  transportation)  (12  mo.  chg.)Total  fees Number  of  accounts  90+  days  past  due Avg.  weekly  hours  worked  (leisure)  (3  mo.  chg.)Workout  program  flag Number  of  accounts  under  wage  garnishment Avg.  weekly  hours  worked  (leisure)  (12  mo.  chg.)Line  frozen  flag  (1  mo.  lag) Number  of  accounts  in  collection Avg.  hourly  wage  (leisure)  (3  mo.  chg.)Line  frozen  flag  (current  mo.) Number  of  accounts  in  charge  off  status Avg.  hourly  wage  (leisure)  (12  mo.  chg.)Product  type Total  balance  on  all  60+  days  past  due  accounts House  price  index3  mo.  chg.  in  credit  score Total  number  of  acocunts House  price  index  (3  mo.  chg.)6  mo.  chg.  in  credit  score Total  credit  l imit  to  number  of  open  bank  cards House  price  index  (12  mo.  chg.)3  mo.  chg.  in  behavioral  score Total  credit  l imit  to  number  of  open  retail  accounts6  mo.  chg.  in  behavioral  score Total  number  of  accounts  opened  in  the  past  yearmo.ly  util ization Total  balance  of  all  revolving  accounts  /  total  balance  on  all  accounts1  mo.  chg.  in  mo.ly  util ization Flag  if  total  balance  over  l imit  on  all  open  bank  cards  =  0%3  mo.  chg.  in  mo.ly  util ization Flag  if  total  balance  over  l imit  on  all  open  bank  cards  =  100%6  mo.  chg.  in  mo.ly  util ization Flag  if  total  balance  over  l imit  on  all  open  bank  cards  >  100%Cycle  util ization1  mo.  chg.  in  cycle  util ization3  mo.  chg.  in  cycle  util izationAccount  exceeded  the  limit  in  past  3  mo.s  (0,1)Payment  equal  minimum  payment  in  past  3  mo.s  (0,1)6  mo.  chg.  in  cycle  util ization

Variables: Tradelines and attributes

January 29, 2016

9

MFM Workshop - New York Univerisity

Page 10: Risk and risk management in the credit card industry

Sample Description

January 29, 2016

10

�  Six large U.S. banks [From the original nine] ¡  2.5% sample for large banks and upto 40% of the smaller banks.

�  Yields between ~90K and ~1MM observations each period per bank

�  Portfolio size varies over time (mostly grow, some decline) and the sample size is representative of the true portfolio

�  Substantial heterogeneity in the time series and cross sectional distribution of delinquency rates

MFM Workshop - New York Univerisity

Page 11: Risk and risk management in the credit card industry

Relative Delinquency Rates

January 29, 2016

11

MFM Workshop - New York Univerisity

1 2 3 4 5 60

0.5

1

1.5

2

2.5

3

3.5

Date

Relative  De

linqu

ency  Rate

Relative  Delinquency  Rates

Page 12: Risk and risk management in the credit card industry

Loss Forecasting- Empirical Designs

�  All models trained on historical data so all forecasts are out of sample and out of time.

�  Loan –level default models: ÷  Goal is to classify individual accounts and study the risk management

practices of individual banks- for example: ¢  Credit line increases/decreases

�  Default is defined as 90 days past due (90DPD)

�  Compare performance of our models using a value added analysis (Khandani, Kim, & Lo (2011))

�  Analyze the attributes (e.g., decision tree output)

January 29, 2016

12

MFM Workshop - New York Univerisity

Page 13: Risk and risk management in the credit card industry

Timing of the Models

t = 0 (today)

t = +6 t = -6 t = -12

Training Data

Past Future

Outcome Variable

t = -9

Forecast

Lagged and differenced

variables

Test Data Test the Forecast

All models are tested out of sample! January 29, 2016

13

MFM Workshop - New York Univerisity

Page 14: Risk and risk management in the credit card industry

Modeling techniques

January 29, 2016

14

Three approaches

�  Decision Trees ¡  C4.5 algorithm [J48 classifier in Weka]

�  Random Forests ¡  Bootstrapping a decision tree with replacement

�  Regularized Logistic Regression with a quadratic penalty function

MFM Workshop - New York Univerisity

Page 15: Risk and risk management in the credit card industry

Evaluating the Models: F-Measure Kappa statistic

�  Precision = the proportion of positives identified by a technique that are truly positive.

�  Recall = the proportion of positives that is correctly identified.

�  F-Measure = the harmonic mean of precision and recall, and is meant to describe the balance between precision and recall.

�  kappa statistic = performance relative to random classification.

January 29, 2016

15

MFM Workshop - New York Univerisity

Page 16: Risk and risk management in the credit card industry

Performance Metrics: More Precisely

January 29, 2016

16

Precision = TN/(TN+FN)

Recall = TN/(TN+FP)

True Positive Rate = TP/(TP+FN)

False Positive Rate = FP/(FP+TN)

F-Measure = (2*Recall*Precision)/(Recall+Precision)

Kappa Statistic = (Pa – Pe)/(1-Pe),

where Pa = (TP+TN)/N and Pe = [(TP+FN)/N]*[(TP+FN)/N]

 

 

MFM Workshop - New York Univerisity

Good BadGood True  Positive  (TP) False  Negative  (FN)Bad False  Positive  (FP) True  Negative  (TN)

Model  Prediction

Actual  Outcome

Page 17: Risk and risk management in the credit card industry

Bank 1

January 29, 2016

17

MFM Workshop - New York Univerisity

2010Q4 2011Q2 2011Q4 2012Q2 2012Q4 2013Q210%

20%

30%

40%

50%

60%

70%

80%

90%

Date

F-­‐Measure  (%

)

Bank  1:  F-­‐Measure2  Quarter  Forecast

Bank  1  -­‐  C4.5  TreeBank  1  -­‐  LogisticBank  1  -­‐  Random  Forest

Page 18: Risk and risk management in the credit card industry

Bank 6

January 29, 2016

18

MFM Workshop - New York Univerisity 2010Q4 2011Q2 2011Q4 2012Q2 2012Q4 2013Q2

10%

20%

30%

40%

50%

60%

70%

80%

90%

Date

F-­‐Measure  (%

)

Bank  6:  F-­‐Measure2  Quarter  Forecast

Bank  6  -­‐  C4.5  TreeBank  6  -­‐  LogisticBank  6  -­‐  Random  Forest

Page 19: Risk and risk management in the credit card industry

Bank 1

January 29, 2016

19

MFM Workshop - New York Univerisity

2010Q4 2011Q2 2011Q4 2012Q2 2012Q4 2013Q210%

20%

30%

40%

50%

60%

70%

80%

90%

Date

Kapp

a  Stat

istic  (

%)

Bank  1:  Kappa  Statistic2  Quarter  Forecast

Bank  1  -­‐  C4.5  TreeBank  1  -­‐  LogisticBank  1  -­‐  Random  Forest

Page 20: Risk and risk management in the credit card industry

Bank 6

January 29, 2016

20

MFM Workshop - New York Univerisity

2010Q4 2011Q2 2011Q4 2012Q2 2012Q4 2013Q210%

20%

30%

40%

50%

60%

70%

80%

90%

Date

Kapp

a  Stat

istic  (

%)

Bank  6:  Kappa  Statistic2  Quarter  Forecast

Bank  6  -­‐  C4.5  TreeBank  6  -­‐  LogisticBank  6  -­‐  Random  Forest

Page 21: Risk and risk management in the credit card industry

Evaluating the Models: Value-Added Analysis (Loan-level)

�  Khandani, Kim, & Lo (2011) develop a method to estimate the potential cost-savings of applying a classification model ¡  Assumptions:

÷  Without a forecast, the bank will not manage the account (e.g., will not cut credit limits, freeze accounts, etc.) ¢  We relax this assumption

÷  All customers have an observable current balance BC at t=0 ¢  True in our data

÷  Customers that default will increase their balance to BD between t=0 and t=default (call this the “run-up”) ¢  We can calculate this for each account

÷  Cost savings depends on: 1.   Classification accuracy (i.e., quality of the model) 2.  “Run-up” ratio 3.  Expected future cash flows (opportunity cost of misclassification)

1.  Refer to this as the “Profitability Margin”

January 29, 2016

21

MFM Workshop - New York Univerisity

Page 22: Risk and risk management in the credit card industry

Value-Added Analysis

Model Prediction Good Bad

Actual Outcome

Good TP FN Bad FP TN

( ) ( )( ) ( ) ( )( ) ( )

( )[ ]FPTN

BBrFNTN

Nr

PBFNBBTNΔΠBTNBFPPBTPΠBTNFPPBFNTPΠ

C

DN

mCCD

CDmC

DMC

+

⎥⎦

⎤⎢⎣

⎡−+−−

=⎟⎟⎠

⎞⎜⎜⎝

−−=

−−=

+−+=

1

C

D

forecast

forecast no

111,,

BBAdded Value

[3] [2] [1]

January 29, 2016

22

MFM Workshop - New York Univerisity

Page 23: Risk and risk management in the credit card industry

Value-Added Analysis

�  We make comparison to passive risk management

�  Our results can be compared relative to what the bank is actually doing (regardless of the model)

January 29, 2016

23

MFM Workshop - New York Univerisity

Page 24: Risk and risk management in the credit card industry

Value added : Bank 1, 2 Quarter

January 29, 2016

24

MFM Workshop - New York Univerisity

2010Q4 2011Q2 2011Q4 2012Q2 2012Q4 2013Q2-­‐80%

-­‐60%

-­‐40%

-­‐20%

   0%

 20%

 40%

 60%

 80%

Date

Value

 Added  (

%)

Bank  1:  Value  Added  (%)2  Quarter  Forecast

Bank  1  -­‐  C4.5  TreeBank  1  -­‐  LogisticBank  1  -­‐  Random  Forest

Page 25: Risk and risk management in the credit card industry

Value added : Bank 6, 2 Quarter

January 29, 2016

25

MFM Workshop - New York Univerisity

2010Q4 2011Q2 2011Q4 2012Q2 2012Q4 2013Q2-­‐80%

-­‐60%

-­‐40%

-­‐20%

   0%

 20%

 40%

 60%

 80%

Date

Value

 Added  (

%)

Bank  6:  Value  Added  (%)2  Quarter  Forecast

Bank  6  -­‐  C4.5  TreeBank  6  -­‐  LogisticBank  6  -­‐  Random  Forest

Page 26: Risk and risk management in the credit card industry

Risk management via line cuts

January 29, 2016

26

Steps �  Take the accounts which were predicted to default

over a given horizon for a given bank, �  Analyze whether the bank cut its credit line or not.

Metric �  Ratio of the percent of lines cut for defaulted

accounts to the percent of lines cut on all accounts.

MFM Workshop - New York Univerisity

Page 27: Risk and risk management in the credit card industry

Comparing banks efficacy in line cuts

January 29, 2016

27

MFM Workshop - New York Univerisity

0.5Q    1Q 1.5Q    2Q 2.5Q    3Q 3.5Q    4Q 4.5Q-­‐3�

-­‐2�

-­‐1�

 0�

 1�

 2�

 3�

Quarters  Since  Forecast

Log(T

argete

d  Line

 Cuts  Ratio

)

All  Banks  C4.5  Tree  Models:  Line  Cuts3  Quarter  Forecast

Bank  1  -­‐  C4.5  TreeBank  2  -­‐  C4.5  TreeBank  3  -­‐  C4.5  TreeBank  4  -­‐  C4.5  TreeBank  5  -­‐  C4.5  TreeBank  6  -­‐  C4.5  Tree

Page 28: Risk and risk management in the credit card industry

Attribute Analysis

January 29, 2016

28

Understanding the relative importance of attributes.

�  Log of the number of instances classified �  The minimum leaf number: �  Indicator variable equal to 1 if the attribute appears

in the tree and 0 otherwise:

MFM Workshop - New York Univerisity

Page 29: Risk and risk management in the credit card industry

January 29, 2016

29

MFM Workshop - New York Univerisity

Page 30: Risk and risk management in the credit card industry

January 29, 2016

30

MFM Workshop - New York Univerisity