supervised multivariate discretization and levels merging ... · 4/34 currentpractice job home time...
TRANSCRIPT
![Page 1: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/1.jpg)
Supervised multivariate discretization and levelsmerging for logistic regression
Adrien Ehrhardt1,2
Christophe Biernacki2
Philippe Heinrich3,4
Vincent Vandewalle2,31Crédit Agricole Consumer Finance
2Inria Lille - Nord-Europe3Université de Lille
4CNRS
31/08/2018
1/34
![Page 2: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/2.jpg)
2/34
Table of Contents
Context and basic notations
Supervised multivariate discretization and factor levels grouping
Interactions in logistic regression
Conclusion and future work
![Page 3: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/3.jpg)
3/34
Context and basic notations
![Page 4: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/4.jpg)
4/34
Current practice
Job Home Time injob
Family status Wages
Score
Repayment
Craftsman Owner 20 Widower 2000
225
0
? Renter 10 Common-law 1700
190
0
Licensed profes-sional
Starter 5 Divorced 4000
218
1
Executive By work 8 Single 2700
202
1
Office employee Renter 12 Married 1400
205
0
Worker By family 2 ? 1200
192
0
Table: Dataset with outliers and missing values.
1. Feature selection2. Discretization / grouping3. Interaction screening4. Logistic regression fitting
![Page 5: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/5.jpg)
4/34
Current practice
Job Home Time injob
Family status Wages
Score
Repayment
Craftsman Owner 20 Widower 2000
225
0
? Renter 10 Common-law 1700
190
0
Licensed profes-sional
Starter 5 Divorced 4000
218
1
Executive By work 8 Single 2700
202
1
Office employee Renter 12 Married 1400
205
0
Worker By family 2 ? 1200
192
0
Table: Dataset with outliers and missing values.
1. Feature selection2. Discretization / grouping3. Interaction screening4. Logistic regression fitting
![Page 6: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/6.jpg)
4/34
Current practice
Job Family status Wages
Score
Repayment
Craftsman Widower 2000
225
0
? Common-law 1700
190
0
Licensed profes-sional
Divorced 4000
218
1
Executive Single 2700
202
1
Office employee Married 1400
205
0
Worker ? 1200
192
0
Table: Dataset with outliers and missing values.
1. Feature selection2. Discretization / grouping3. Interaction screening4. Logistic regression fitting
![Page 7: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/7.jpg)
4/34
Current practice
Job Family status Wages
Score
Repayment
Craftsman Widower ]1500;2000]
225
0
? Common-law ]1500;2000]
190
0
Licensed profes-sional
Divorced ]2000;∞[
218
1
Executive Single ]2000;∞[
202
1
Office employee Married ]-∞ ; 1500]
205
0
Worker ? ]-∞ ; 1500]
192
0
Table: Dataset with outliers and missing values.
1. Feature selection2. Discretization / grouping3. Interaction screening4. Logistic regression fitting
![Page 8: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/8.jpg)
4/34
Current practice
Job Family status Wages
Score
Repayment
?+Low-qualified ?+Alone ]1500;2000]
225
0
?+Low-qualified Union ]1500;2000]
190
0
High-qualified ?+Alone ]2000;∞[
218
1
High-qualified ?+Alone ]2000;∞[
202
1
?+Low-qualified Union ]-∞ ; 1500]
205
0
?+Low-qualified ?+Alone ]-∞ ; 1500]
192
0
Table: Dataset with outliers and missing values.
1. Feature selection2. Discretization / grouping3. Interaction screening4. Logistic regression fitting
![Page 9: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/9.jpg)
4/34
Current practice
Job Family status x Wages
Score
Repayment
?+Low-qualified ?+Alone x ]1500;2000]
225
0
?+Low-qualified Union x ]1500;2000]
190
0
High-qualified ?+Alone x ]2000;∞[
218
1
High-qualified ?+Alone x ]2000;∞[
202
1
?+Low-qualified Union x ]-∞ ; 1500]
205
0
?+Low-qualified ?+Alone x ]-∞ ; 1500]
192
0
Table: Dataset with outliers and missing values.
1. Feature selection2. Discretization / grouping3. Interaction screening4. Logistic regression fitting
![Page 10: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/10.jpg)
4/34
Current practice
Job Family status x WagesScore
Repayment
?+Low-qualified ?+Alone x ]1500;2000]225
0
?+Low-qualified Union x ]1500;2000]190
0
High-qualified ?+Alone x ]2000;∞[218
1
High-qualified ?+Alone x ]2000;∞[202
1
?+Low-qualified Union x ]-∞ ; 1500]205
0
?+Low-qualified ?+Alone x ]-∞ ; 1500]192
0
Table: Dataset with outliers and missing values.
1. Feature selection2. Discretization / grouping3. Interaction screening4. Logistic regression fitting
![Page 11: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/11.jpg)
5/34
Mathematical reinterpretation
The whole process can be decomposed into two steps:
X → E → Yx 7→ e = f (x) 7→ y
Selected features: x = ((xj)d11︸ ︷︷ ︸
∈R
, (xj)dd1+1︸ ︷︷ ︸
∈{1,...,oj}
).
f must be “simple” and “component-wise”, i.e. f = (fj)d1 .
We restrict to discretization and grouping of factor levels.
![Page 12: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/12.jpg)
5/34
Mathematical reinterpretation
The whole process can be decomposed into two steps:
X → E → Yx 7→ e = f (x) 7→ y
Selected features: x = ((xj)d11︸ ︷︷ ︸
∈R
, (xj)dd1+1︸ ︷︷ ︸
∈{1,...,oj}
).
f must be “simple” and “component-wise”, i.e. f = (fj)d1 .
We restrict to discretization and grouping of factor levels.
![Page 13: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/13.jpg)
5/34
Mathematical reinterpretation
The whole process can be decomposed into two steps:
X → E → Yx 7→ e = f (x) 7→ y
Selected features: x = ((xj)d11︸ ︷︷ ︸
∈R
, (xj)dd1+1︸ ︷︷ ︸
∈{1,...,oj}
).
f must be “simple” and “component-wise”, i.e. f = (fj)d1 .
We restrict to discretization and grouping of factor levels.
![Page 14: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/14.jpg)
6/34
Mathematical reinterpretation: Feature Engineering
xjfj(xj) = 1 fj(xj) = 2 fj(xj) = 3
Discretization (1 ≤ j ≤ d1)
Into m intervals with associated cutpoints c = (c1, . . . , cm−1).
Discretization function
fj(·; c ,m) : R→{1, . . . ,m}
x 7→1]−∞;c1](x) +m−2∑k=1
(k + 1) 1]ck ;ck+1](x)
+m 1]cm−1,∞[(x)
![Page 15: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/15.jpg)
6/34
Mathematical reinterpretation: Feature Engineering
xjfj(xj) = 1 fj(xj) = 2 fj(xj) = 3
Discretization (1 ≤ j ≤ d1)
Into m intervals with associated cutpoints c = (c1, . . . , cm−1).
Discretization function
fj(·; c ,m) : R→{1, . . . ,m}
x 7→1]−∞;c1](x) +m−2∑k=1
(k + 1) 1]ck ;ck+1](x)
+m 1]cm−1,∞[(x)
![Page 16: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/16.jpg)
7/34
Mathematical reinterpretation: Feature Engineering
1 2
1 2 3 4 5
fj(xj) =
xj =
Grouping (d1 < j ≤ d)
Grouping o values into m, m ≤ o.
Grouping function
fj : {1, . . . , o} → {1, . . . ,m}fj surjective: it defines a partition of {1, . . . , o} in m elements.
![Page 17: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/17.jpg)
7/34
Mathematical reinterpretation: Feature Engineering
1 2
1 2 3 4 5
fj(xj) =
xj =
Grouping (d1 < j ≤ d)
Grouping o values into m, m ≤ o.
Grouping function
fj : {1, . . . , o} → {1, . . . ,m}fj surjective: it defines a partition of {1, . . . , o} in m elements.
![Page 18: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/18.jpg)
8/34
Mathematical reinterpretation: Objective
Target feature y ∈ {0, 1} must be predicted given engineeredfeatures f (x) = (fj(xj))
d1 .
We restrict to binary logistic regression.
On “raw” data, logistic regression yields:
logit(pθraw(1|x)) = θ0 +
d1∑j=1
θjxj +d∑
j=d1+1
θxjj
On discretized / grouped data, logistic regression yields:
logit(pθf (1|f (x))) = θ0 +d∑
j=1
θfj (xj )j
![Page 19: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/19.jpg)
8/34
Mathematical reinterpretation: Objective
Target feature y ∈ {0, 1} must be predicted given engineeredfeatures f (x) = (fj(xj))
d1 .
We restrict to binary logistic regression.
On “raw” data, logistic regression yields:
logit(pθraw(1|x)) = θ0 +
d1∑j=1
θjxj +d∑
j=d1+1
θxjj
On discretized / grouped data, logistic regression yields:
logit(pθf (1|f (x))) = θ0 +d∑
j=1
θfj (xj )j
![Page 20: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/20.jpg)
8/34
Mathematical reinterpretation: Objective
Target feature y ∈ {0, 1} must be predicted given engineeredfeatures f (x) = (fj(xj))
d1 .
We restrict to binary logistic regression.
On “raw” data, logistic regression yields:
logit(pθraw(1|x)) = θ0 +
d1∑j=1
θjxj +d∑
j=d1+1
θxjj
On discretized / grouped data, logistic regression yields:
logit(pθf (1|f (x))) = θ0 +d∑
j=1
θfj (xj )j
![Page 21: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/21.jpg)
8/34
Mathematical reinterpretation: Objective
Target feature y ∈ {0, 1} must be predicted given engineeredfeatures f (x) = (fj(xj))
d1 .
We restrict to binary logistic regression.
On “raw” data, logistic regression yields:
logit(pθraw(1|x)) = θ0 +
d1∑j=1
θjxj +d∑
j=d1+1
θxjj
On discretized / grouped data, logistic regression yields:
logit(pθf (1|f (x))) = θ0 +d∑
j=1
θfj (xj )j
![Page 22: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/22.jpg)
9/34
Example
True data
logit(ptrue(1|x)) = ln
(ptrue(1|x)
1− ptrue(1|x)
)= sin((x1 − 0.7)× 7)
0.0 0.2 0.4 0.6 0.8 1.0
-1.0
-0.5
0.0
0.5
1.0
x
logit(p(1|x))
True distribution
Figure: True relationship between predictor and outcome
![Page 23: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/23.jpg)
9/34
Example
Logistic regression on “raw” data:
logit(pθraw(1|x)) = θ0 + θ1x1
0.0 0.2 0.4 0.6 0.8 1.0
-1.0
-0.5
0.0
0.5
1.0
x
logit(p(1|x))
True distributionLinear logistic regression
Figure: Linear logistic regression fit
![Page 24: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/24.jpg)
9/34
Example
Logistic regression on discretized data:If f is not carefully chosen . . .
logit(pθf (1|f (x))) = θ0 + θf1(x1)1︸ ︷︷ ︸
θ11 ,...,θ501
0.0 0.2 0.4 0.6 0.8 1.0
-1.0
-0.5
0.0
0.5
1.0
x
logit(p(1|x))
True distributionBad discretization
Figure: Bad (high variance) discretization
![Page 25: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/25.jpg)
9/34
Example
Logistic regression on discretized data:If f is carefully chosen . . .
logit(pθf (1|f (x))) = θ0 + θf1(x1)1︸ ︷︷ ︸
θ11 ,...,θ31
0.0 0.2 0.4 0.6 0.8 1.0
-1.0
-0.5
0.0
0.5
1.0
x
logit(p(1|x))
True distributionGood discretization
Figure: Good (bias/variance tradeoff) discretization
![Page 26: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/26.jpg)
10/34
Criterion
θ can be estimated for each discretization f and f ? can be chosenthrough our favorite model choice criterion: BIC, AIC, . . .
A model selection problem
(f ?,θ?) = argmaxf ∈F ,θ∈Θf
n∑i=1
ln pθ(yi |f (x i ))− penalty(n;θ)
How to efficiently explore F?
![Page 27: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/27.jpg)
10/34
Criterion
θ can be estimated for each discretization f and f ? can be chosenthrough our favorite model choice criterion: BIC, AIC, . . .
A model selection problem
(f ?,θ?) = argmaxf ∈F ,θ∈Θf
n∑i=1
ln pθ(yi |f (x i ))− penalty(n;θ)
How to efficiently explore F?
![Page 28: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/28.jpg)
10/34
Criterion
θ can be estimated for each discretization f and f ? can be chosenthrough our favorite model choice criterion: BIC, AIC, . . .
A model selection problem
(f ?,θ?) = argmaxf ∈F ,θ∈Θf
n∑i=1
ln pθ(yi |f (x i ))− penalty(n;θ)
How to efficiently explore F?
![Page 29: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/29.jpg)
11/34
Exploring F
Example of discretization
“Functional” space F where f lives is continuous:
xjfj(xj) = 1 fj(xj) = 2 fj(xj) = mj
However, for a fixed design x = (x i )n1 there is a countable
space F̃ in which fRg ⇔ ∀i , j , fj(xi ) = gj(xi )
(f ?,θ?) = argmaxf ∈F̃ ,θ∈Θf
n∑i=1
ln pθ(yi |f (x i ))− penalty(n;θ)
![Page 30: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/30.jpg)
11/34
Exploring F
Example of discretization
“Functional” space F where f lives is continuous:
xjfj(xj) = 1 fj(xj) = 2 fj(xj) = mj
However, for a fixed design x = (x i )n1 there is a countable
space F̃ in which fRg ⇔ ∀i , j , fj(xi ) = gj(xi )
(f ?,θ?) = argmaxf ∈F̃ ,θ∈Θf
n∑i=1
ln pθ(yi |f (x i ))− penalty(n;θ)
![Page 31: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/31.jpg)
11/34
Exploring F
Example of discretization
“Functional” space F where f lives is continuous:
xjfj(xj) = 1 fj(xj) = 2 fj(xj) = mj
However, for a fixed design x = (x i )n1 there is a countable
space F̃ in which fRg ⇔ ∀i , j , fj(xi ) = gj(xi )
(f ?,θ?) = argmaxf ∈F̃ ,θ∈Θf
n∑i=1
ln pθ(yi |f (x i ))− penalty(n;θ)
![Page 32: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/32.jpg)
11/34
Exploring F
Example of discretization
“Functional” space F where f lives is continuous:
xjfj(xj) = 1 fj(xj) = 2 fj(xj) = mj
However, for a fixed design x = (x i )n1 there is a countable
space F̃ in which fRg ⇔ ∀i , j , fj(xi ) = gj(xi )
xjfj(xj) = 1 fj(xj) = 2
(f ?,θ?) = argmaxf ∈F̃ ,θ∈Θf
n∑i=1
ln pθ(yi |f (x i ))− penalty(n;θ)
![Page 33: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/33.jpg)
11/34
Exploring F
Example of discretization
“Functional” space F where f lives is continuous:
xjfj(xj) = 1 fj(xj) = 2 fj(xj) = mj
However, for a fixed design x = (x i )n1 there is a countable
space F̃ in which fRg ⇔ ∀i , j , fj(xi ) = gj(xi )
xjgj(xj) = 1 gj(xj) = 2
(f ?,θ?) = argmaxf ∈F̃ ,θ∈Θf
n∑i=1
ln pθ(yi |f (x i ))− penalty(n;θ)
![Page 34: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/34.jpg)
11/34
Exploring F
Example of discretization
“Functional” space F where f lives is continuous:
xjfj(xj) = 1 fj(xj) = 2 fj(xj) = mj
However, for a fixed design x = (x i )n1 there is a countable
space F̃ in which fRg ⇔ ∀i , j , fj(xi ) = gj(xi )
xjhj(xj) = 1 hj(xj) = 2
(f ?,θ?) = argmaxf ∈F̃ ,θ∈Θf
n∑i=1
ln pθ(yi |f (x i ))− penalty(n;θ)
![Page 35: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/35.jpg)
11/34
Exploring F
Example of discretization
“Functional” space F where f lives is continuous:
xjfj(xj) = 1 fj(xj) = 2 fj(xj) = mj
However, for a fixed design x = (x i )n1 there is a countable
space F̃ in which fRg ⇔ ∀i , j , fj(xi ) = gj(xi )
(f ?,θ?) = argmaxf ∈F̃ ,θ∈Θf
n∑i=1
ln pθ(yi |f (x i ))− penalty(n;θ)
![Page 36: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/36.jpg)
12/34
State-of-the art
Current academic methods:
A lot of existing heuristics, see [Ramírez-Gallego et al., 2016]:
![Page 37: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/37.jpg)
13/34
State-of-the art
Most of these methods are:
I Univariate,
I Test statistics more or less justified (χ2-based).
![Page 38: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/38.jpg)
13/34
State-of-the art
Most of these methods are:
I Univariate,
I Test statistics more or less justified (χ2-based).
![Page 39: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/39.jpg)
14/34
Supervised multivariate discretization and factorlevels grouping
![Page 40: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/40.jpg)
15/34
Mathematical formalization
Discretized / grouped xj denoted by ej has been seen up to now asthe result of a function of xj :
ej = fj(xj).
Discretization / grouping ej can be seen as a latent randomvariable for which
p(ej |xj) = 1ej (fj(xj))︸ ︷︷ ︸Heaviside-like functiondifficult to optimize
.
Suppose for now that m = (mj)d1 is fixed.
e ∈ Em = {1, . . . ,m1} × . . .× . . .× {1, . . . ,md}.
![Page 41: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/41.jpg)
15/34
Mathematical formalization
Discretized / grouped xj denoted by ej has been seen up to now asthe result of a function of xj :
ej = fj(xj).
Discretization / grouping ej can be seen as a latent randomvariable for which
p(ej |xj) = 1ej (fj(xj)).
p(ej |xj) = 1ej (fj(xj))︸ ︷︷ ︸Heaviside-like functiondifficult to optimize
.
Suppose for now that m = (mj)d1 is fixed.
e ∈ Em = {1, . . . ,m1} × . . .× . . .× {1, . . . ,md}.
![Page 42: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/42.jpg)
15/34
Mathematical formalization
Discretized / grouped xj denoted by ej has been seen up to now asthe result of a function of xj :
ej = fj(xj).
Discretization / grouping ej can be seen as a latent randomvariable for which
p(ej |xj) = 1ej (fj(xj))︸ ︷︷ ︸Heaviside-like functiondifficult to optimize
.
Suppose for now that m = (mj)d1 is fixed.
e ∈ Em = {1, . . . ,m1} × . . .× . . .× {1, . . . ,md}.
![Page 43: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/43.jpg)
15/34
Mathematical formalization
Discretized / grouped xj denoted by ej has been seen up to now asthe result of a function of xj :
ej = fj(xj).
Discretization / grouping ej can be seen as a latent randomvariable for which
p(ej |xj) = 1ej (fj(xj))︸ ︷︷ ︸Heaviside-like functiondifficult to optimize
.
Suppose for now that m = (mj)d1 is fixed.
e ∈ Em = {1, . . . ,m1} × . . .× . . .× {1, . . . ,md}.
![Page 44: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/44.jpg)
15/34
Mathematical formalization
Discretized / grouped xj denoted by ej has been seen up to now asthe result of a function of xj :
ej = fj(xj).
Discretization / grouping ej can be seen as a latent randomvariable for which
p(ej |xj) = 1ej (fj(xj))︸ ︷︷ ︸Heaviside-like functiondifficult to optimize
.
Suppose for now that m = (mj)d1 is fixed.
e ∈ Em = {1, . . . ,m1} × . . .× . . .× {1, . . . ,md}.
![Page 45: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/45.jpg)
16/34
Mathematical formalization
Model selection criterionWe want the “best” model pθ?(y |e?) where θ? is the maximumlikelihood estimator and e? is determined by AIC, BIC. . .
(e?,θ?) = argmaxe∈Em ,θ∈Θm
n∑i=1
ln pθ(yi |e i )− penalty(n;θ)
Em is still too big, so there is a need for a “path” in Em.
![Page 46: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/46.jpg)
16/34
Mathematical formalization
Model selection criterionWe want the “best” model pθ?(y |e?) where θ? is the maximumlikelihood estimator and e? is determined by AIC, BIC. . .
(e?,θ?) = argmaxe∈Em ,θ∈Θm
n∑i=1
ln pθ(yi |e i )− penalty(n;θ)
Em is still too big, so there is a need for a “path” in Em.
![Page 47: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/47.jpg)
17/34
First set of hypotheses
H1: implicit hypothesis of every discretization:
Predictive information about y in x is “squeezed” in e, i.e.ptrue(y |x , e) = ptrue(y |e).
H2: conditional independence:
Conditional independence of ej |xj with other features xk , k 6= j .
x1
xj
xd
e1
ej
ed
y
f1
fj
fd
Figure: Dependance structure between xj ,ej and y
![Page 48: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/48.jpg)
17/34
First set of hypotheses
H1: implicit hypothesis of every discretization:
Predictive information about y in x is “squeezed” in e, i.e.ptrue(y |x , e) = ptrue(y |e).
H2: conditional independence:
Conditional independence of ej |xj with other features xk , k 6= j .
x1
xj
xd
e1
ej
ed
y
f1
fj
fd
Figure: Dependance structure between xj ,ej and y
![Page 49: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/49.jpg)
17/34
First set of hypotheses
H1: implicit hypothesis of every discretization:
Predictive information about y in x is “squeezed” in e, i.e.ptrue(y |x , e) = ptrue(y |e).
H2: conditional independence:
Conditional independence of ej |xj with other features xk , k 6= j .
x1
xj
xd
e1
ej
ed
y
f1
fj
fd
Figure: Dependance structure between xj ,ej and y
![Page 50: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/50.jpg)
18/34
Proposal: continuous relaxation
H3: link between xj and ej :
Continuous relaxation of a discrete problem (cf neural nets)
Continuous features: relaxation of the “hard” discretizationLink between ej and xj is supposed to be polytomous logistic:
pαj (ej |xj).
Categorical features: relaxation of the grouping problem
A simple contingency table is used:
pαj (ej = k |xj = `) = αk,`j .
![Page 51: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/51.jpg)
18/34
Proposal: continuous relaxation
H3: link between xj and ej :Continuous relaxation of a discrete problem (cf neural nets)
Continuous features: relaxation of the “hard” discretizationLink between ej and xj is supposed to be polytomous logistic:
pαj (ej |xj).
Categorical features: relaxation of the grouping problem
A simple contingency table is used:
pαj (ej = k |xj = `) = αk,`j .
![Page 52: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/52.jpg)
18/34
Proposal: continuous relaxation
H3: link between xj and ej :Continuous relaxation of a discrete problem (cf neural nets)
Continuous features: relaxation of the “hard” discretizationLink between ej and xj is supposed to be polytomous logistic:
pαj (ej |xj).
Categorical features: relaxation of the grouping problem
A simple contingency table is used:
pαj (ej = k |xj = `) = αk,`j .
![Page 53: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/53.jpg)
19/34
Intuitions about how it works: model proposal
p(y |x ,θ,α) =∑
e∈Em
p(y |x , e)p(e|x)
=∑
e∈Em
p(y |e)d∏
j=1
p(ej |xj)
=∑
e∈Em
pθe (y |e)︸ ︷︷ ︸logistic
d∏j=1
pαj (ej |xj)︸ ︷︷ ︸logistic or table
≈ pθ?(y |e?)
Subsequently, it is equivalent to “optimize” p(y |x ,θ,α).
maxθ,e
pθ(y |e) ' maxθ,α
p(y |x ,θ,α)
![Page 54: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/54.jpg)
19/34
Intuitions about how it works: model proposal
p(y |x ,θ,α) =∑
e∈Em
p(y |x , e)p(e|x)
=∑
e∈Em
p(y |e)d∏
j=1
p(ej |xj)
=∑
e∈Em
pθe (y |e)︸ ︷︷ ︸logistic
d∏j=1
pαj (ej |xj)︸ ︷︷ ︸logistic or table
≈ pθ?(y |e?)
Subsequently, it is equivalent to “optimize” p(y |x ,θ,α).
maxθ,e
pθ(y |e) ' maxθ,α
p(y |x ,θ,α)
![Page 55: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/55.jpg)
19/34
Intuitions about how it works: model proposal
p(y |x ,θ,α) =∑
e∈Em
p(y |x , e)p(e|x)
=∑
e∈Em
p(y |e)d∏
j=1
p(ej |xj)
=∑
e∈Em
pθe (y |e)︸ ︷︷ ︸logistic
d∏j=1
pαj (ej |xj)︸ ︷︷ ︸logistic or table
≈ pθ?(y |e?)
Subsequently, it is equivalent to “optimize” p(y |x ,θ,α).
maxθ,e
pθ(y |e) ' maxθ,α
p(y |x ,θ,α)
![Page 56: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/56.jpg)
20/34
Intuitions about how it works: estimation
“Classical” estimation strategy with latent variables: EM algorithm.
There would still be a sum over Em:p(y |x ,θ,α) =
∑e∈Em
pθ(y |e)∏d
j=1 pαj (ej |xj)
Use a Stochastic-EM! Draw e knowing that:
p(e|x , y) =pθ(y |e)
∏dj=1 pαj (ej |xj)∑
e∈Empθ(y |e)
∏dj=1 pαj (ej |xj)︸ ︷︷ ︸
still difficult to calculate
Gibbs-sampling step:
p(ej |x , y , e{−j}) ∝ pθ(y |e)pαj (ej |xj)
![Page 57: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/57.jpg)
20/34
Intuitions about how it works: estimation
“Classical” estimation strategy with latent variables: EM algorithm.
There would still be a sum over Em:p(y |x ,θ,α) =
∑e∈Em
pθ(y |e)∏d
j=1 pαj (ej |xj)
Use a Stochastic-EM! Draw e knowing that:
p(e|x , y) =pθ(y |e)
∏dj=1 pαj (ej |xj)∑
e∈Empθ(y |e)
∏dj=1 pαj (ej |xj)︸ ︷︷ ︸
still difficult to calculate
Gibbs-sampling step:
p(ej |x , y , e{−j}) ∝ pθ(y |e)pαj (ej |xj)
![Page 58: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/58.jpg)
20/34
Intuitions about how it works: estimation
“Classical” estimation strategy with latent variables: EM algorithm.
There would still be a sum over Em:p(y |x ,θ,α) =
∑e∈Em
pθ(y |e)∏d
j=1 pαj (ej |xj)
Use a Stochastic-EM! Draw e knowing that:
p(e|x , y) =pθ(y |e)
∏dj=1 pαj (ej |xj)∑
e∈Empθ(y |e)
∏dj=1 pαj (ej |xj)︸ ︷︷ ︸
still difficult to calculate
Gibbs-sampling step:
p(ej |x , y , e{−j}) ∝ pθ(y |e)pαj (ej |xj)
![Page 59: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/59.jpg)
20/34
Intuitions about how it works: estimation
“Classical” estimation strategy with latent variables: EM algorithm.
There would still be a sum over Em:p(y |x ,θ,α) =
∑e∈Em
pθ(y |e)∏d
j=1 pαj (ej |xj)
Use a Stochastic-EM! Draw e knowing that:
p(e|x , y) =pθ(y |e)
∏dj=1 pαj (ej |xj)∑
e∈Empθ(y |e)
∏dj=1 pαj (ej |xj)︸ ︷︷ ︸
still difficult to calculate
Gibbs-sampling step:
p(ej |x , y , e{−j}) ∝ pθ(y |e)pαj (ej |xj)
![Page 60: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/60.jpg)
20/34
Intuitions about how it works: estimation
“Classical” estimation strategy with latent variables: EM algorithm.
There would still be a sum over Em:p(y |x ,θ,α) =
∑e∈Em
pθ(y |e)∏d
j=1 pαj (ej |xj)
Use a Stochastic-EM! Draw e knowing that:
p(e|x , y) =pθ(y |e)
∏dj=1 pαj (ej |xj)∑
e∈Empθ(y |e)
∏dj=1 pαj (ej |xj)︸ ︷︷ ︸
still difficult to calculate
Gibbs-sampling step:
p(ej |x , y , e{−j}) ∝ pθ(y |e)pαj (ej |xj)
![Page 61: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/61.jpg)
21/34
Algorithm
Initializationx1,1 · · · x1,d
.
.
....
.
.
.xn,1 · · · xn,d
at random⇒
e1,1 · · · e1,d
.
.
....
.
.
.en,1 · · · en,d
Loop
y1
.
.
.yn
logisticregression⇒
e1,1 · · · e1,d
.
.
....
.
.
.en,1 · · · en,d
polytomousregression⇒
x1,1 · · · x1,d
.
.
....
.
.
.xn,1 · · · xn,d
Updating e
p(y1, e1,j = k|x i )
.
.
.p(yn, en,j = k|x i )
randomsampling⇒
e1,j
.
.
.en,j
Calculating eMAP
eMAP,1,j
.
.
.eMAP,n,j
MAP
estimate=
argmaxej
pαj (ej |x1,j )
.
.
.argmaxej
pαj (ej |xn,j )
![Page 62: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/62.jpg)
22/34
Go back to “hard” thresholding: MAP estimation
−0.7 1
eMAP,1 = 1 eMAP,1 = 2 eMAP,1 = 3
x1
p(1|x1)
−0.7 1
eMAP,1 = 1 eMAP,1 = 2 eMAP,1 = 3
x1
p(2|x1)
−0.7 1
eMAP,1 = 1 eMAP,1 = 2 eMAP,1 = 3
x1
p(3|x1)
![Page 63: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/63.jpg)
23/34
In the end: the best discretization
New model selection criterionWe have drastically restricted the search space to provably clevercandidates e(1)
MAP, . . . , e(iter)MAP resulting from the Gibbs sampling and
MAP estimation.
(e?,θ?) = argmaxe∈{e(1)
MAP,...,e(iter)MAP},θ∈Θm
n∑i=1
ln pθe (yi |e i )− penalty(n;θ)
We would still need to loop over candidates m!
In practice if ∀i , p(ei ,j = 1|xi ,j , yi )� 1, then ej = 1 disappears. . .
Start with m = (mmax)d1 and “wait” . . . eventually until m = 1.
![Page 64: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/64.jpg)
23/34
In the end: the best discretization
New model selection criterionWe have drastically restricted the search space to provably clevercandidates e(1)
MAP, . . . , e(iter)MAP resulting from the Gibbs sampling and
MAP estimation.
(e?,θ?) = argmaxe∈{e(1)
MAP,...,e(iter)MAP},θ∈Θm
n∑i=1
ln pθe (yi |e i )− penalty(n;θ)
We would still need to loop over candidates m!
In practice if ∀i , p(ei ,j = 1|xi ,j , yi )� 1, then ej = 1 disappears. . .
Start with m = (mmax)d1 and “wait” . . . eventually until m = 1.
![Page 65: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/65.jpg)
23/34
In the end: the best discretization
New model selection criterionWe have drastically restricted the search space to provably clevercandidates e(1)
MAP, . . . , e(iter)MAP resulting from the Gibbs sampling and
MAP estimation.
(e?,θ?) = argmaxe∈{e(1)
MAP,...,e(iter)MAP},θ∈Θm
n∑i=1
ln pθe (yi |e i )− penalty(n;θ)
We would still need to loop over candidates m!
In practice if ∀i , p(ei ,j = 1|xi ,j , yi )� 1, then ej = 1 disappears. . .
Start with m = (mmax)d1 and “wait” . . . eventually until m = 1.
![Page 66: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/66.jpg)
23/34
In the end: the best discretization
New model selection criterionWe have drastically restricted the search space to provably clevercandidates e(1)
MAP, . . . , e(iter)MAP resulting from the Gibbs sampling and
MAP estimation.
(e?,θ?) = argmaxe∈{e(1)
MAP,...,e(iter)MAP},θ∈Θm
n∑i=1
ln pθe (yi |e i )− penalty(n;θ)
We would still need to loop over candidates m!
In practice if ∀i , p(ei ,j = 1|xi ,j , yi )� 1, then ej = 1 disappears. . .
Start with m = (mmax)d1 and “wait” . . . eventually until m = 1.
![Page 67: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/67.jpg)
24/34
Interactions in logistic regression
![Page 68: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/68.jpg)
25/34
Notations
Upper triangular matrix with δk,` = 1 if k < ` and features p and q“interact” in the logistic regression.
logit(pθf (1|f (x))) = θ0 +d∑
j=1
θfj (xj )j +
∑1≤k<`≤d
δk,`θfk (xk )f`(x`)k,`
Imagine for now that the discretization e = f (x) is fixed. Thecriterion becomes:
(θ?, δ?) = argmax
θ,δ∈{0,1}d(d−1)
2
n∑i=1
ln pθ(yi |e i , δ)− penalty(n;θ)
Analogous to previous problem: 2d(d−1)
2 models.
![Page 69: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/69.jpg)
25/34
Notations
Upper triangular matrix with δk,` = 1 if k < ` and features p and q“interact” in the logistic regression.
logit(pθf (1|f (x))) = θ0 +d∑
j=1
θfj (xj )j +
∑1≤k<`≤d
δk,`θfk (xk )f`(x`)k,`
Imagine for now that the discretization e = f (x) is fixed. Thecriterion becomes:
(θ?, δ?) = argmax
θ,δ∈{0,1}d(d−1)
2
n∑i=1
ln pθ(yi |e i , δ)− penalty(n;θ)
Analogous to previous problem: 2d(d−1)
2 models.
![Page 70: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/70.jpg)
25/34
Notations
Upper triangular matrix with δk,` = 1 if k < ` and features p and q“interact” in the logistic regression.
logit(pθf (1|f (x))) = θ0 +d∑
j=1
θfj (xj )j +
∑1≤k<`≤d
δk,`θfk (xk )f`(x`)k,`
Imagine for now that the discretization e = f (x) is fixed. Thecriterion becomes:
(θ?, δ?) = argmax
θ,δ∈{0,1}d(d−1)
2
n∑i=1
ln pθ(yi |e i , δ)− penalty(n;θ)
Analogous to previous problem: 2d(d−1)
2 models.
![Page 71: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/71.jpg)
26/34
Model proposal
δ is latent and hard to optimize over: use a stochastic algorithm!
Strategy used here: Metropolis-Hastings algorithm.
p(y |e) =∑
δ∈{0,1}d(d−1)
2
p(y |e, δ)p(δ)
p(δ|e, y) ∝ p(y |e, δ)���p(δ)
≈ exp(−BIC[δ]/2)���p(δ) p(δp,q) =12
Which transition proposal q : ({0, 1}d(d−1)
2 , {0, 1}d(d−1)
2 ) 7→ [0; 1]?
![Page 72: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/72.jpg)
26/34
Model proposal
δ is latent and hard to optimize over: use a stochastic algorithm!
Strategy used here: Metropolis-Hastings algorithm.
p(y |e) =∑
δ∈{0,1}d(d−1)
2
p(y |e, δ)p(δ)
p(δ|e, y) ∝ p(y |e, δ)���p(δ)
≈ exp(−BIC[δ]/2)���p(δ) p(δp,q) =12
Which transition proposal q : ({0, 1}d(d−1)
2 , {0, 1}d(d−1)
2 ) 7→ [0; 1]?
![Page 73: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/73.jpg)
26/34
Model proposal
δ is latent and hard to optimize over: use a stochastic algorithm!
Strategy used here: Metropolis-Hastings algorithm.
p(y |e) =∑
δ∈{0,1}d(d−1)
2
p(y |e, δ)p(δ)
p(δ|e, y) ∝ p(y |e, δ)p(δ)≈ exp(−BIC[δ]/2)p(δ)
p(y |e) =∑
δ∈{0,1}d(d−1)
2
p(y |e, δ)p(δ)
p(δ|e, y) ∝ p(y |e, δ)���p(δ)
≈ exp(−BIC[δ]/2)���p(δ) p(δp,q) =12
Which transition proposal q : ({0, 1}d(d−1)
2 , {0, 1}d(d−1)
2 ) 7→ [0; 1]?
![Page 74: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/74.jpg)
26/34
Model proposal
δ is latent and hard to optimize over: use a stochastic algorithm!
Strategy used here: Metropolis-Hastings algorithm.
p(y |e) =∑
δ∈{0,1}d(d−1)
2
p(y |e, δ)p(δ)
p(δ|e, y) ∝ p(y |e, δ)���p(δ)
≈ exp(−BIC[δ]/2)���p(δ) p(δp,q) =12
Which transition proposal q : ({0, 1}d(d−1)
2 , {0, 1}d(d−1)
2 ) 7→ [0; 1]?
![Page 75: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/75.jpg)
26/34
Model proposal
δ is latent and hard to optimize over: use a stochastic algorithm!
Strategy used here: Metropolis-Hastings algorithm.
p(y |e) =∑
δ∈{0,1}d(d−1)
2
p(y |e, δ)p(δ)
p(δ|e, y) ∝ p(y |e, δ)���p(δ)
≈ exp(−BIC[δ]/2)���p(δ) p(δp,q) =12
Which transition proposal q : ({0, 1}d(d−1)
2 , {0, 1}d(d−1)
2 ) 7→ [0; 1]?
![Page 76: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/76.jpg)
27/34
Model proposal
2d(d−1) probabilities to calculate. . .
We restrict changes to only one entry δk,`.
Proposal: gain/loss in BIC between bivariate models with /without the interaction.
Trick: alternate one discretization / grouping step and one“interaction” step.
![Page 77: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/77.jpg)
27/34
Model proposal
2d(d−1) probabilities to calculate. . .
We restrict changes to only one entry δk,`.
Proposal: gain/loss in BIC between bivariate models with /without the interaction.
Trick: alternate one discretization / grouping step and one“interaction” step.
![Page 78: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/78.jpg)
27/34
Model proposal
2d(d−1) probabilities to calculate. . .
We restrict changes to only one entry δk,`.
Proposal: gain/loss in BIC between bivariate models with /without the interaction.
Trick: alternate one discretization / grouping step and one“interaction” step.
![Page 79: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/79.jpg)
27/34
Model proposal
2d(d−1) probabilities to calculate. . .
We restrict changes to only one entry δk,`.
Proposal: gain/loss in BIC between bivariate models with /without the interaction.
Trick: alternate one discretization / grouping step and one“interaction” step.
![Page 80: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/80.jpg)
28/34
Results: several datasets
Performance asserted on simulated data.Good performance on real data:
Gini Current performance glmdisc Basic glmAuto (n=50,000 ; d=15) 57.9 64.84 58
Revolving (n=48,000 ; d=9) 58.57 67.15 53.5Prospects (n=5,000 ; d=25) 35.6 47.18 32.7Electronics (n=140,000 ; d=8) 57.5 58 -10
Young (n=5,000 ; d=25) ≈ 15 30 12.2Basel II (n=70,000 ; d=13) 70 71.3 19
Relatively fast computing time: between 2 hours and a day on alaptop according to number of observations, features, . . .
“Inexisting” human time.
![Page 81: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/81.jpg)
29/34
Conclusion and future work
![Page 82: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/82.jpg)
30/34
Take-aways
Conclusion
I Reinterpretation as a latent variable problem,I Resolution proposal relying on MCMC and “soft” discretization,I Good empirical results and statistical guarantees (to some
extent...),I R implementation of glmdisc available on Github, to be
submitted to CRAN,I Python implementation of glmdisc available on Github and
PyPi,I Big gain for statisticians in the field of Credit Scoring.
Perspectives
I Tested for logistic regression and polytomous logistic links:can be adapted to other models pθ and pα!
I The same model can be estimated with shallow neuralnetworks.
![Page 83: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/83.jpg)
30/34
Take-aways
Conclusion
I Reinterpretation as a latent variable problem,
I Resolution proposal relying on MCMC and “soft” discretization,I Good empirical results and statistical guarantees (to some
extent...),I R implementation of glmdisc available on Github, to be
submitted to CRAN,I Python implementation of glmdisc available on Github and
PyPi,I Big gain for statisticians in the field of Credit Scoring.
Perspectives
I Tested for logistic regression and polytomous logistic links:can be adapted to other models pθ and pα!
I The same model can be estimated with shallow neuralnetworks.
![Page 84: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/84.jpg)
30/34
Take-aways
Conclusion
I Reinterpretation as a latent variable problem,I Resolution proposal relying on MCMC and “soft” discretization,
I Good empirical results and statistical guarantees (to someextent...),
I R implementation of glmdisc available on Github, to besubmitted to CRAN,
I Python implementation of glmdisc available on Github andPyPi,
I Big gain for statisticians in the field of Credit Scoring.
Perspectives
I Tested for logistic regression and polytomous logistic links:can be adapted to other models pθ and pα!
I The same model can be estimated with shallow neuralnetworks.
![Page 85: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/85.jpg)
30/34
Take-aways
Conclusion
I Reinterpretation as a latent variable problem,I Resolution proposal relying on MCMC and “soft” discretization,I Good empirical results and statistical guarantees (to some
extent...),
I R implementation of glmdisc available on Github, to besubmitted to CRAN,
I Python implementation of glmdisc available on Github andPyPi,
I Big gain for statisticians in the field of Credit Scoring.
Perspectives
I Tested for logistic regression and polytomous logistic links:can be adapted to other models pθ and pα!
I The same model can be estimated with shallow neuralnetworks.
![Page 86: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/86.jpg)
30/34
Take-aways
Conclusion
I Reinterpretation as a latent variable problem,I Resolution proposal relying on MCMC and “soft” discretization,I Good empirical results and statistical guarantees (to some
extent...),I R implementation of glmdisc available on Github, to be
submitted to CRAN,
I Python implementation of glmdisc available on Github andPyPi,
I Big gain for statisticians in the field of Credit Scoring.
Perspectives
I Tested for logistic regression and polytomous logistic links:can be adapted to other models pθ and pα!
I The same model can be estimated with shallow neuralnetworks.
![Page 87: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/87.jpg)
30/34
Take-aways
Conclusion
I Reinterpretation as a latent variable problem,I Resolution proposal relying on MCMC and “soft” discretization,I Good empirical results and statistical guarantees (to some
extent...),I R implementation of glmdisc available on Github, to be
submitted to CRAN,I Python implementation of glmdisc available on Github and
PyPi,
I Big gain for statisticians in the field of Credit Scoring.
Perspectives
I Tested for logistic regression and polytomous logistic links:can be adapted to other models pθ and pα!
I The same model can be estimated with shallow neuralnetworks.
![Page 88: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/88.jpg)
30/34
Take-aways
Conclusion
I Reinterpretation as a latent variable problem,I Resolution proposal relying on MCMC and “soft” discretization,I Good empirical results and statistical guarantees (to some
extent...),I R implementation of glmdisc available on Github, to be
submitted to CRAN,I Python implementation of glmdisc available on Github and
PyPi,I Big gain for statisticians in the field of Credit Scoring.
Perspectives
I Tested for logistic regression and polytomous logistic links:can be adapted to other models pθ and pα!
I The same model can be estimated with shallow neuralnetworks.
![Page 89: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/89.jpg)
30/34
Take-aways
Conclusion
I Reinterpretation as a latent variable problem,I Resolution proposal relying on MCMC and “soft” discretization,I Good empirical results and statistical guarantees (to some
extent...),I R implementation of glmdisc available on Github, to be
submitted to CRAN,I Python implementation of glmdisc available on Github and
PyPi,I Big gain for statisticians in the field of Credit Scoring.
Perspectives
I Tested for logistic regression and polytomous logistic links:can be adapted to other models pθ and pα!
I The same model can be estimated with shallow neuralnetworks.
![Page 90: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/90.jpg)
30/34
Take-aways
Conclusion
I Reinterpretation as a latent variable problem,I Resolution proposal relying on MCMC and “soft” discretization,I Good empirical results and statistical guarantees (to some
extent...),I R implementation of glmdisc available on Github, to be
submitted to CRAN,I Python implementation of glmdisc available on Github and
PyPi,I Big gain for statisticians in the field of Credit Scoring.
Perspectives
I Tested for logistic regression and polytomous logistic links:can be adapted to other models pθ and pα!
I The same model can be estimated with shallow neuralnetworks.
![Page 91: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/91.jpg)
30/34
Take-aways
Conclusion
I Reinterpretation as a latent variable problem,I Resolution proposal relying on MCMC and “soft” discretization,I Good empirical results and statistical guarantees (to some
extent...),I R implementation of glmdisc available on Github, to be
submitted to CRAN,I Python implementation of glmdisc available on Github and
PyPi,I Big gain for statisticians in the field of Credit Scoring.
Perspectives
I Tested for logistic regression and polytomous logistic links:can be adapted to other models pθ and pα!
I The same model can be estimated with shallow neuralnetworks.
![Page 92: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/92.jpg)
31/34
Shallow neural nets as a substitute estimation procedure
Input #1
Input #2
Input #3
σ
σ
σ
σ
σ
σ
σ Output
Hiddenlayer
Inputlayer
Outputlayer
![Page 93: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/93.jpg)
31/34
Shallow neural nets as a substitute estimation procedure
Input #1
Input #2
Input #3
σ
σ
σ
σ
σ
σ
σ Output
Hiddenlayer
Inputlayer
Outputlayer
mj − 1 neuronswith sigmoidactivations≈ pαj (ej |xj)
![Page 94: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/94.jpg)
32/34
Thanks!
![Page 95: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/95.jpg)
33/34
References I
Ramírez-Gallego, S., García, S., Mouriño-Talín, H.,Martínez-Rego, D., Bolón-Canedo, V., Alonso-Betanzos, A.,Benítez, J. M., and Herrera, F. (2016).Data discretization: taxonomy and big data challenge.Wiley Interdisciplinary Reviews: Data Mining and KnowledgeDiscovery, 6(1):5–21.
![Page 96: Supervised multivariate discretization and levels merging ... · 4/34 Currentpractice Job Home Time in job Familystatus Wages Score Repayment Craftsman Owner 20 Widower 2000 225 0?](https://reader033.vdocuments.site/reader033/viewer/2022042412/5f2c4ff0c6f38e7679743da9/html5/thumbnails/96.jpg)
34/34
Interaction discovery: proposal
p(δk,` = 1|ek , e`, y) = g(BIC[δk,` = 1]− BIC[δk,` = 0])≈ exp
( 12 (BIC[pθ(y |ek , e`, δk,` = 0)]− BIC[pθ(y |ek , e`, δk,` = 1)])
)q(δ, δ′) = |δk,` − pk,`| for the unique couple (k , `) s.t. δ(s)
k,` 6= δ′k,`
α = min(1, p(δ′|e,y)
p(δ|e,y)1−q(δ,δ′)q(δ,δ′)
)≈ min
(1, exp
( 12 (BIC[pθ(y |e, δ)]− BIC[pθ(y |e, δ′)])
) 1−q(δ,δ′)q(δ,δ′)
)