Inverse regression approach to (robust) non-linear high-to-low dimensional mapping
Emeline Perthame
Joint work with Florence Forbes
INRIA, team MISTIS, Grenoble
LMNO, Caen
October 27, 2016
Outline
1. Non-linear mapping problem
2. GLLiM/SLLiM: inverse regression approach
3. Estimation of parameters
4. Results and conclusion
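A non-linear mapping problem
• A non-linear mapping problem
$$y = \begin{bmatrix} y_1 \\ \vdots \\ y_D \end{bmatrix} \;\xrightarrow{\;g\;}\; g(y) = \begin{bmatrix} x_1 \\ \vdots \\ x_L \end{bmatrix} = x$$
• Prediction of X from Y through a non-linear regression function g
$$E(X \mid Y = y) = g(y)$$
with $Y \in \mathbb{R}^D$, $X \in \mathbb{R}^L$, $D \gg L$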
A non linear mapping problem
• Application: Ω mission on Mars → launch of a spectrometer around Mars
• Problem: Retrieving physical properties from hyperspectral images
− Y: spectrum (D=184)
− X: composition of the ground (L=3)
Mars Express - Omega (2004) [http://geops.geol.u-psud.fr/]
[Figure: reflectance against wavelength for one spectrum; quantities to predict: proportion of dust, proportion of CO2 ice, proportion of water ice]
Some approaches
• Difficulty: D large → curse of dimensionality
• Solutions: via dimensionality reduction
− Reduce the dimension of y before regression: e.g. PCA on y
→ Risk: poor prediction of x
− Take x into account: PLS, SIR, Kernel SIR, PC-based methods
→ Two-step approaches, not expressed as a single optimization problem
→ Our approach: inverse regression to reduce dimension
Proposed Method: An inverse regression strategy
• $x \in \mathbb{R}^L$: low-dimensional space,
• $y \in \mathbb{R}^D$: high-dimensional space,
• $(y, x)$ are realizations of $(Y, X) \sim p(Y, X; \theta)$, with parameters $\theta$
Inverse conditional density: $p(Y \mid X; \theta)$
• Y is a noisy function of X
• Modeled via mixtures → tractable estimation of $\theta$
Forward conditional density: $p(X \mid Y; \theta^*)$, with $\theta^* = f(\theta)$
→ High-to-low prediction, e.g. $\hat{x} = E[X \mid Y = y; \theta^*]$
Student Locally-linear Mapping (SLLiM)
A piecewise affine model:
• Introduce a missing variable Z → Z = k ⇔ Y is the image of X by an affine transformation
$$Y = \sum_{k=1}^{K} \mathbb{I}(Z = k)\,(A_k X + b_k + E_k)$$
Definition of SLLiM
$$p(Y \mid X, Z = k; \theta) = \mathcal{S}(Y; A_k X + b_k, \Sigma_k, \alpha_k^y, \gamma_k^y)$$
• Affine transformations are local: mixture of K Student distributions
$$p(X \mid Z = k; \theta) = \mathcal{S}(X; c_k, \Gamma_k, \alpha_k, 1)$$
$$p(Z = k; \theta) = \pi_k$$
• The set of all model parameters is:
$$\theta = \{\pi_k, c_k, \Gamma_k, A_k, b_k, \Sigma_k, \alpha_k\}_{k = 1 \dots K}$$
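To make the generative reading of the piecewise affine model concrete, here is a minimal simulation sketch. It assumes the joint Student form with a single Gamma weight U per observation (shape α_k, rate 1), so that X and the noise E_k are Gaussian given U. The function name `sample_sllim`, the array layout, and the toy parameter values are illustrative assumptions, not part of the talk.

```python
import numpy as np

def sample_sllim(n, pi, c, Gamma, A, b, Sigma, alpha, rng):
    """Draw n pairs (x, y) from a K-component piecewise affine Student model.

    Assumed layout (hypothetical): pi (K,), c (K, L), Gamma (K, L, L),
    A (K, D, L), b (K, D), Sigma (K, D, D), alpha (K,).
    """
    K, L = c.shape
    D = b.shape[1]
    z = rng.choice(K, size=n, p=pi)            # component labels Z
    u = rng.gamma(shape=alpha[z], scale=1.0)   # weights U ~ Gamma(alpha_k, rate 1)
    x = np.empty((n, L))
    y = np.empty((n, D))
    for i in range(n):
        k = z[i]
        # Given U = u, X is Gaussian around c_k with covariance Gamma_k / u.
        x[i] = rng.multivariate_normal(c[k], Gamma[k] / u[i])
        # Given X and U = u, Y is the affine image A_k x + b_k plus Gaussian noise.
        y[i] = rng.multivariate_normal(A[k] @ x[i] + b[k], Sigma[k] / u[i])
    return x, y, z

# Toy parameters, chosen only for illustration.
rng = np.random.default_rng(0)
K, L, D = 2, 1, 3
pi = np.array([0.4, 0.6])
c = np.array([[-2.0], [2.0]])
Gamma = np.stack([np.eye(L)] * K)
A = rng.normal(size=(K, D, L))
b = rng.normal(size=(K, D))
Sigma = np.stack([0.1 * np.eye(D)] * K)
alpha = np.array([3.0, 3.0])
x, y, z = sample_sllim(500, pi, c, Gamma, A, b, Sigma, alpha, rng)
```

Small values of U inflate both covariances at once, which is how this construction produces occasional outlying pairs.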
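Why a Student mixture?
• Dealing with outliers → Generalized Student distribution for the joint density of (X, Y)
$$\mathcal{S}_M(y; \mu, \Sigma, \alpha, \gamma) = \frac{\Gamma(\alpha + M/2)}{|\Sigma|^{1/2}\, \Gamma(\alpha)\, (2\pi\gamma)^{M/2}} \left[ 1 + \frac{\delta(y, \mu, \Sigma)}{2\gamma} \right]^{-(\alpha + M/2)}$$
• Gaussian scale mixture representation (using a weight variable U distributed according to a Gamma distribution)
$$\mathcal{S}_M(y; \mu, \Sigma, \alpha, \gamma) = \int_0^{\infty} \mathcal{N}_M(y; \mu, \Sigma/u)\, \mathcal{G}(u; \alpha, \gamma)\, du$$
• Parameter estimation is tractable by an EM algorithm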
[Figure: Gaussian density compared with a Student density (α = 0.1); axes: x, density]
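The closed-form density and its scale mixture representation can be checked against each other numerically. The sketch below assumes that δ(y, μ, Σ) is the squared Mahalanobis distance (y − μ)ᵀ Σ⁻¹ (y − μ) and that G(u; α, γ) is a Gamma density with shape α and rate γ; the slides do not spell out either convention. The function names are hypothetical.

```python
import math
import numpy as np

def student_logpdf(y, mu, Sigma, alpha, gamma):
    """Log of the generalized Student density S_M(y; mu, Sigma, alpha, gamma)."""
    y = np.atleast_1d(np.asarray(y, dtype=float))
    mu = np.atleast_1d(np.asarray(mu, dtype=float))
    Sigma = np.atleast_2d(np.asarray(Sigma, dtype=float))
    M = y.size
    diff = y - mu
    # Assumption: delta is the squared Mahalanobis distance.
    delta = diff @ np.linalg.solve(Sigma, diff)
    _, logdet = np.linalg.slogdet(Sigma)
    return (math.lgamma(alpha + M / 2) - math.lgamma(alpha)
            - 0.5 * logdet - 0.5 * M * math.log(2 * math.pi * gamma)
            - (alpha + M / 2) * math.log1p(delta / (2 * gamma)))

def scale_mixture_pdf_1d(y, mu, sigma2, alpha, gamma, n_grid=200_000, u_max=60.0):
    """Midpoint-rule approximation of the scale mixture integral in one dimension."""
    du = u_max / n_grid
    u = (np.arange(n_grid) + 0.5) * du
    # N(y; mu, sigma2 / u): Gaussian whose variance shrinks as the weight u grows.
    normal = np.sqrt(u / (2 * np.pi * sigma2)) * np.exp(-0.5 * u * (y - mu) ** 2 / sigma2)
    # G(u; alpha, gamma): Gamma density with shape alpha and rate gamma (assumed convention).
    gamma_pdf = np.exp(alpha * np.log(gamma) - math.lgamma(alpha)
                       + (alpha - 1) * np.log(u) - gamma * u)
    return float(np.sum(normal * gamma_pdf) * du)

closed_form = math.exp(student_logpdf(1.3, 0.0, 2.0, alpha=2.5, gamma=1.5))
numerical = scale_mixture_pdf_1d(1.3, 0.0, 2.0, alpha=2.5, gamma=1.5)
```

Under these conventions the two values agree up to quadrature error, which is the property the EM algorithm relies on when it treats U as an additional latent variable.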
Low-to-high (Inverse) Regression
• If X and Y are both observed
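− The parameter vector θ can be estimated in closed form using an EM inference procedure
− This yields the inverse conditional density, which is a Student mixture:
$$p(Y \mid X; \theta) = \sum_{k=1}^{K} \frac{\pi_k\, \mathcal{S}(X; c_k, \Gamma_k, \alpha_k, 1)}{\sum_{j=1}^{K} \pi_j\, \mathcal{S}(X; c_j, \Gamma_j, \alpha_j, 1)}\; \mathcal{S}(Y; A_k X + b_k, \Sigma_k, \alpha_k^y, \gamma_k^y)$$
• Both the inverse and the forward conditional densities are Student mixtures parameterized by θ, so the regression functions follow directly:
− A low-to-high inverse regression function:
$$E[Y \mid X = x; \theta] = \sum_{k=1}^{K} \frac{\pi_k\, \mathcal{S}(x; c_k, \Gamma_k, \alpha_k, 1)}{\sum_{j=1}^{K} \pi_j\, \mathcal{S}(x; c_j, \Gamma_j, \alpha_j, 1)}\; (A_k x + b_k)$$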
High-to-low (Forward) Regression
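• The forward conditional density is a Student mixture as well:
$$p(X \mid Y; \theta^*) = \sum_{k=1}^{K} \frac{\pi_k^*\, \mathcal{S}(Y; c_k^*, \Gamma_k^*, \alpha_k, 1)}{\sum_{j=1}^{K} \pi_j^*\, \mathcal{S}(Y; c_j^*, \Gamma_j^*, \alpha_j, 1)}\; \mathcal{S}(X; A_k^* Y + b_k^*, \Sigma_k^*, \alpha_k^x, \gamma_k^x)$$
• The forward parameter vector $\theta^*$ has an analytic expression as a function of θ
• Both the inverse and the forward conditional densities are Student mixtures parameterized by θ, so the regression functions follow directly:
− A high-to-low forward regression function:
$$E[X \mid Y = y; \theta^*] = \sum_{k=1}^{K} \frac{\pi_k^*\, \mathcal{S}(y; c_k^*, \Gamma_k^*, \alpha_k, 1)}{\sum_{j=1}^{K} \pi_j^*\, \mathcal{S}(y; c_j^*, \Gamma_j^*, \alpha_j, 1)}\; (A_k^* y + b_k^*)$$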
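The forward regression function is a weighted average of K affine predictions, with weights given by Student densities evaluated at y. A minimal sketch follows, assuming the forward parameters are already available and that δ is the squared Mahalanobis distance. `forward_predict`, `student_logpdf`, and the toy values are hypothetical names and data, not the authors' implementation.

```python
import math
import numpy as np

def student_logpdf(y, mu, Sigma, alpha, gamma):
    """Log generalized Student density; delta is assumed to be the squared Mahalanobis distance."""
    M = y.size
    diff = y - mu
    delta = diff @ np.linalg.solve(Sigma, diff)
    _, logdet = np.linalg.slogdet(Sigma)
    return (math.lgamma(alpha + M / 2) - math.lgamma(alpha)
            - 0.5 * logdet - 0.5 * M * math.log(2 * math.pi * gamma)
            - (alpha + M / 2) * math.log1p(delta / (2 * gamma)))

def forward_predict(y, pi, c_star, Gamma_star, A_star, b_star, alpha):
    """Approximate E[X | Y = y] as a weighted sum of the K affine experts."""
    K = len(pi)
    log_w = np.array([
        math.log(pi[k]) + student_logpdf(y, c_star[k], Gamma_star[k], alpha[k], 1.0)
        for k in range(K)
    ])
    w = np.exp(log_w - log_w.max())   # subtract the max for numerical stability
    w /= w.sum()                      # normalized mixture weights
    experts = np.stack([A_star[k] @ y + b_star[k] for k in range(K)])
    return w @ experts

# Toy forward parameters, for illustration only.
rng = np.random.default_rng(1)
K, L, D = 2, 1, 3
pi = np.array([0.3, 0.7])
c_star = rng.normal(size=(K, D))
Gamma_star = np.stack([np.eye(D), 2.0 * np.eye(D)])
A_star = rng.normal(size=(K, L, D))
b_star = rng.normal(size=(K, L))
alpha = np.array([2.0, 4.0])
y_new = rng.normal(size=D)
x_hat = forward_predict(y_new, pi, c_star, Gamma_star, A_star, b_star, alpha)
```

Working with log-weights avoids underflow when D is large, where individual density values can be extremely small.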
The forward parameter vector θ∗ from θ
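$$c_k^* = A_k c_k + b_k,$$
$$\Gamma_k^* = \Sigma_k + A_k \Gamma_k A_k^\top,$$
$$A_k^* = \Sigma_k^* A_k^\top \Sigma_k^{-1},$$
$$b_k^* = \Sigma_k^* \left( \Gamma_k^{-1} c_k - A_k^\top \Sigma_k^{-1} b_k \right),$$
$$\Sigma_k^* = \left( \Gamma_k^{-1} + A_k^\top \Sigma_k^{-1} A_k \right)^{-1}.$$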
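The mapping from θ to θ* can be sketched for a single component as follows. The function name `forward_parameters` and the random test values are assumptions made for illustration; explicit matrix inverses are used for readability rather than numerical efficiency.

```python
import numpy as np

def forward_parameters(c, Gamma, A, b, Sigma):
    """Map one component's inverse parameters (c, Gamma, A, b, Sigma) to forward ones."""
    Sigma_inv = np.linalg.inv(Sigma)
    Gamma_inv = np.linalg.inv(Gamma)
    c_star = A @ c + b
    Gamma_star = Sigma + A @ Gamma @ A.T
    Sigma_star = np.linalg.inv(Gamma_inv + A.T @ Sigma_inv @ A)
    A_star = Sigma_star @ A.T @ Sigma_inv
    b_star = Sigma_star @ (Gamma_inv @ c - A.T @ Sigma_inv @ b)
    return c_star, Gamma_star, A_star, b_star, Sigma_star

# Random but valid inverse parameters with L = 2 and D = 5 (illustrative values).
rng = np.random.default_rng(2)
L, D = 2, 5
G = rng.normal(size=(L, L))
Gamma = G @ G.T + np.eye(L)                      # symmetric positive definite
Sigma = np.diag(rng.uniform(0.5, 1.5, size=D))   # diagonal noise covariance
A = rng.normal(size=(D, L))
b = rng.normal(size=D)
c = rng.normal(size=L)
c_star, Gamma_star, A_star, b_star, Sigma_star = forward_parameters(c, Gamma, A, b, Sigma)
```

Only L × L matrices and the diagonal Σ_k need a genuine inversion here, which is part of the appeal of estimating in the inverse direction. The result can be cross-checked against the usual Gaussian conditioning formulas applied to the joint covariance, for instance A* = Γ Aᵀ (Γ*)⁻¹.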
A joint model approach to reduce the number of parameters
• Joint model
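$$p(X = x, Y = y \mid Z = k) = \mathcal{S}_{L+D}\!\left( \begin{bmatrix} x \\ y \end{bmatrix}; m_k, V_k, \alpha_k, 1 \right)$$
with
$$m_k = \begin{bmatrix} c_k \\ A_k c_k + b_k \end{bmatrix} \quad \text{and} \quad V_k = \begin{bmatrix} \Gamma_k & \Gamma_k A_k^\top \\ A_k \Gamma_k & \Sigma_k + A_k \Gamma_k A_k^\top \end{bmatrix}$$
• Reduce the number of parameters to estimate
− Forward strategy + Γk diagonal
∗ nb. par. = $\tfrac{1}{2} D(D - 1) + DL + 2L + D$
∗ D = 500, L = 2 → 126 254 parameters
− Inverse strategy + Σk diagonal
∗ nb. par. = $\tfrac{1}{2} L(L - 1) + DL + 2D + L$
∗ D = 500, L = 2 → 2 003 parameters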
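The two counts are easy to reproduce. The helper names below are hypothetical, and the formulas are copied from the slide as given.

```python
def forward_param_count(D, L):
    """Parameter count for the forward strategy with Gamma_k diagonal (formula from the slide)."""
    return D * (D - 1) // 2 + D * L + 2 * L + D

def inverse_param_count(D, L):
    """Parameter count for the inverse strategy with Sigma_k diagonal (formula from the slide)."""
    return L * (L - 1) // 2 + D * L + 2 * D + L

n_forward = forward_param_count(500, 2)   # 126254
n_inverse = inverse_param_count(500, 2)   # 2003
```

The forward count grows quadratically in D while the inverse count grows linearly, which is the whole argument for the inverse strategy when D ≫ L.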
Extension to partially observed responses
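• Incorporate a latent component into the low-dimensional variable:
$$X = \begin{bmatrix} T \\ W \end{bmatrix}$$
where $T \in \mathbb{R}^{L_t}$ is observed and $W \in \mathbb{R}^{L_w}$ is latent ($L = L_t + L_w$)
• Example on Mars data: lighting? temperature? grain size?
• Observed pairs $(y_n, T_n)$, $n = 1 \dots N$ ($T \in \mathbb{R}^{L_t}$)
• Additional latent variable W ($W \in \mathbb{R}^{L_w}$)
• Assuming the independence of T and W given Z:
$$p(X = (T, W)^\top \mid Z = k) = \mathcal{S}_L\big((T, W)^\top; c_k, \Gamma_k, \alpha_k, 1\big)$$
with
$$c_k = \begin{bmatrix} c_k^t \\ 0 \end{bmatrix}, \quad \Gamma_k = \begin{bmatrix} \Gamma_k^t & 0 \\ 0 & I_{L_w} \end{bmatrix}$$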
Extension to partially observed responses
• Extension of SLLiM to more general covariance structure
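• With $A_k = \begin{bmatrix} A_k^t & A_k^w \end{bmatrix}$,
$$Y = \sum_{k=1}^{K} \mathbb{I}(Z = k)\,(A_k^t T + A_k^w W + b_k + E_k)$$
can be rewritten as
$$Y = \sum_{k=1}^{K} \mathbb{I}(Z = k)\,(A_k^t T + b_k + E'_k)$$
with $\mathrm{Var}(E'_k) \propto \Sigma_k + A_k^w A_k^{w\top}$
− Diagonal Σk → Factor analysis with Lw factors (at most)
− A compromise between full $O(D^2)$ and diagonal $O(D)$ covariances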
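The covariance of the absorbed noise term can be derived in one line. The sketch below assumes, following the Gaussian scale mixture representation, that given Z = k and the weight U = u the latent part W and the noise E_k are independent Gaussians with covariances I_{L_w}/u and Σ_k/u; the conditioning on U is an assumption made here to keep the derivation Gaussian.

```latex
\begin{aligned}
E'_k &= A_k^w W + E_k, \\
\mathrm{Var}(E'_k \mid Z = k, U = u)
  &= A_k^w \,\mathrm{Var}(W \mid u)\, A_k^{w\top} + \mathrm{Var}(E_k \mid u) \\
  &= \tfrac{1}{u}\, A_k^w A_k^{w\top} + \tfrac{1}{u}\, \Sigma_k
   = \tfrac{1}{u}\left( \Sigma_k + A_k^w A_k^{w\top} \right).
\end{aligned}
```

With Σ_k diagonal, this structure is described by D + D·L_w numbers (the diagonal plus the entries of A_k^w) instead of the D(D + 1)/2 needed for a full covariance, and the term A_k^w A_k^{w⊤} has rank at most L_w.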
Estimation of θ = (ck ,Γk ,Ak , bk ,Σk , πk , αk )1≤k≤K by EM algorithm
• E-step
− Update posterior probabilities
($E_Z$) $p(Z = k \mid t, y, \theta^{(i)})$ → "SMM-like"
($E_W$) $p(W \mid Z = k, t, y, \theta^{(i)})$ → Probabilistic PCA or Factor Analysis like
($E_U$) $E(U \mid Z = k, t, y, \theta^{(i)})$ → Down-weighting of extreme/atypical values in the estimators → More robust
• M-step
($M_X$) $(\pi_k, c_k, \Gamma_k)$ → "SMM-like"
($M_{Y|X}$) $(A_k, b_k, \Sigma_k)$ → Hybrid between linear regression and PPCA/FA
$$A_k = Y_k X_k^\top \left( \begin{bmatrix} 0 & 0 \\ 0 & S_k^w \end{bmatrix} + X_k X_k^\top \right)^{-1}$$
($M_\alpha$) $\alpha_k$ → Not in closed form but standard (specific to Student)
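The down-weighting effect of the (E_U) step can be illustrated on a single Student component. The sketch below is a simplification, not the full SLLiM update: it assumes that, given y, the weight U follows a Gamma distribution with shape α + M/2 and rate γ + δ/2 (which follows from the scale mixture form with a shape/rate Gamma), and it ignores the conditioning on Z, t and W. The function names and the data are hypothetical.

```python
import numpy as np

def expected_weights(Y, mu, Sigma, alpha, gamma=1.0):
    """E[U | y] for each row of Y under a single generalized Student component."""
    Y = np.atleast_2d(Y)
    M = Y.shape[1]
    diff = Y - mu
    # Squared Mahalanobis distance of every observation to the center.
    delta = np.einsum("ni,ni->n", diff, np.linalg.solve(Sigma, diff.T).T)
    # Mean of the Gamma posterior of U: far-away points receive small weights.
    return (alpha + M / 2) / (gamma + delta / 2)

def weighted_mean(Y, w):
    """Weighted average of the rows of Y, as used in a robust center update."""
    return (w[:, None] * Y).sum(axis=0) / w.sum()

# Three points near the origin and one gross outlier (illustrative data).
Y = np.array([[0.0, 0.0], [0.1, -0.2], [-0.1, 0.1], [8.0, 8.0]])
w = expected_weights(Y, mu=np.zeros(2), Sigma=np.eye(2), alpha=2.0)
robust_center = weighted_mean(Y, w)
plain_center = Y.mean(axis=0)
```

The outlier receives a weight close to zero, so the weighted center stays near the bulk of the data while the plain average is pulled toward the outlier.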
Application L = D = 1
• RATP → Subway in Paris
• Measure of air quality at Châtelet station, line 4
• March 2015 → N = 341 measurements
• Prediction of NO (L=1) from NO2 (D=1)
→ Robustness of SLLiM
[Figure: scatter plot of NO against NO2]
Application L = D = 1 / SLLiM compared to GLLiM
[Figure: two panels of NO against NO2, each with the GLLiM and SLLiM fits]
→ Illustration of robustness of the proposed model
Application L = D = 1 / SLLiM compared to GLLiM
[Figure: NRMSE against K (K = 1 to 10) for GLLiM, SLLiM, GLLiM-WO and SLLiM-WO]
→ SLLiM achieves better prediction rates than GLLiM on complete data
→ SLLiM becomes equivalent to GLLiM when outliers are removed
Other applications and augmented version of SLLiM
• Application when D ≫ L
− Hyperspectral data on Mars (D=184, L=2, N=6983)
→ Comparison with other non-linear regression methods
Table: Mars data: average NRMSE, with standard deviations in parentheses, for the proportions of CO2 ice and dust over 100 runs.
Method          Prop. of CO2 ice   Prop. of dust
SLLiM (K=10)    0.168 (0.019)      0.145 (0.020)
GLLiM (K=10)    0.180 (0.023)      0.155 (0.023)
MARS            0.173 (0.016)      0.160 (0.021)
SIR             0.243 (0.025)      0.157 (0.016)
RVM             0.299 (0.021)      0.275 (0.034)
Results - Application to hyperspectral image analysis
[Figure: panels for GLLiM, SLLiM and Splines, showing the proportion of CO2 ice and the proportion of dust]
Conclusion and future work
• Mixture model used for prediction
• Addition of latent variables to handle partially observed responses
• Selection of K and Lw
− K fixed? Or selected by BIC?
− Lw selected by BIC?
Thank you for your attention! Any questions?