Introduction to several works and Some Ideas
Songcan Chen
2012.9.4
Outlines
• Introduction to Several works
• Some ideas from Sparsity Aware
Introduction to Several works
1. A Least-Squares Framework for Component Analysis (CA)[1]
2. On the convexity of log-sum-exp functions with positive definite matrices[2]
Some Ideas
• Motivated by CA framework [1]
• Motivated by Log-Sum-Exp [2]
• Motivated by Sparsity Aware [3][4]
CA framework
Proposes a unified least-squares framework, called least-squares weighted kernel reduced rank regression (LS-WKRRR), that formulates many CA methods; as a result, PCA, LDA, CCA, SC, LE, and their kernel versions become its special cases.
• LS-WKRRR's benefits:
(1) it provides a clean connection between many CA techniques;
(2) it yields efficient numerical schemes for solving CA techniques;
(3) it overcomes the small-sample-size (SSS) problem;
(4) it provides a framework for easily extending CA methods, e.g., to weighted generalizations of PCA, LDA, SC, and CCA, and to several new CA techniques.
• The LS-WKRRR problem minimizes the following expression:

E_0(A, B) = ||W_r (Γ − B A^T Υ) W_c||_F^2   (1)

where
Factors: A ∈ R^{d_Υ×k}, B ∈ R^{d_Γ×k}
Weights: W_r (rows), W_c (columns)
Data: Υ ∈ R^{d_Υ×n} (inputs), Γ ∈ R^{d_Γ×n} (outputs)
Solutions to A and B
The stationary points of (1) in A and B satisfy a generalized eigenvalue problem (GEP); see [1] for its explicit form.
Computational Aspects
• Subspace Iteration
• Alternated Least Squares (ALS)
• Gradient Descent and Second-Order Methods
Important to note:
both the ALS and the gradient-based algorithms effectively handle the small-sample-size (SSS) problem, unlike methods that directly solve the GEP. A minimal ALS sketch follows.
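Below is a minimal NumPy sketch of ALS for (1), my own illustration rather than the paper's algorithm: the function name `als_wkrrr` and the normal-equation updates are assumptions obtained by setting the gradients of (1) to zero with one factor fixed.

```python
import numpy as np

def als_wkrrr(Gamma, Upsilon, k, Wr=None, Wc=None, iters=100):
    """ALS sketch for E0 = ||Wr (Gamma - B A^T Upsilon) Wc||_F^2.

    Each half-step solves the normal equations of (1) with the other
    factor fixed; pinv guards against rank deficiency (SSS regime).
    """
    dG, n = Gamma.shape
    Mr = np.eye(dG) if Wr is None else Wr.T @ Wr   # M_r = W_r^T W_r
    Mc = np.eye(n) if Wc is None else Wc @ Wc.T    # M_c = W_c W_c^T
    A = np.random.default_rng(0).standard_normal((Upsilon.shape[0], k))
    for _ in range(iters):
        C = A.T @ Upsilon                                    # k x n codes
        # fix A, solve for B: B = Gamma Mc C^T (C Mc C^T)^+
        B = Gamma @ Mc @ C.T @ np.linalg.pinv(C @ Mc @ C.T)
        # fix B, solve for A^T: (B^T Mr B)^+ B^T Mr Gamma Mc Ups^T (Ups Mc Ups^T)^+
        At = (np.linalg.pinv(B.T @ Mr @ B) @ B.T @ Mr @ Gamma
              @ Mc @ Upsilon.T @ np.linalg.pinv(Upsilon @ Mc @ Upsilon.T))
        A = At.T
    return A, B
```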
PCA, KPCA and Weighted Extensions
• PCA:
That is, in (1), set Γ = Υ = D (the data matrix) with W_r = I, W_c = I, giving E = ||D − B A^T D||_F^2.
Or the alternative formulation with Υ = I_n: E = ||D − B A^T||_F^2, where A^T plays the role of the component coefficients.
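As a quick sanity check of the Υ = I_n reading (my own usage example, assuming the `als_wkrrr` sketch above), the recovered B should span the top-k principal subspace of the centered data:

```python
import numpy as np

D = np.random.default_rng(1).standard_normal((5, 200))
D -= D.mean(axis=1, keepdims=True)            # center: PCA assumes zero mean
A, B = als_wkrrr(D, np.eye(D.shape[1]), k=2)  # Upsilon = I_n, unit weights
U = np.linalg.svd(D, full_matrices=False)[0][:, :2]
Qb, _ = np.linalg.qr(B)
# cosines of the principal angles between span(B) and the SVD subspace: ~1
print(np.linalg.svd(U.T @ Qb, compute_uv=False))
```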
KPCA & Weighted Extensions
KPCA: replace the data with its feature-space mapping φ(D), computed implicitly through the kernel matrix.
Weighted PCA: keep the PCA setting but choose non-identity weights W_r and/or W_c in (1).
LDA, KLDA and Weighted Extensions
• LDA:
In (1), set Υ = D and Γ = G, with the weighting matrices given in [1].
G is the label matrix using one-of-c encoding for c classes!
CCA, KCCA and Weighted Extensions
• CCA
In (1), set Γ and Υ to the two data views, with the appropriate whitening weights W_r, W_c (see [1]).
The relations to LLE, LE etc.
• Please refer to [1]
On the convexity of log-sum-exp functions with positive definite (PD) matrices [2]
Log-Sum-Exp (LSE) function
• One of the fundamental functions in convex analysis is the LSE function, whose convexity is the core ingredient of geometric programming (GP), a methodology that has had considerable impact in different fields, e.g., power control in communication theory!
This paper
• extends these results and considers the convexity of the log-determinant of a sum of rank-one PD matrices with scalar exponential weights!
LSE function (convex): f(x) = log(e^{x_1} + … + e^{x_n})
Extending convexity of the vector function to a matrix variable, for PD matrices:
f(q) = log |Σ_{i=1}^n e^{q_i} M_i|, with each M_i PD.
A general convexity definition: f(q_t) ≤ (1 − t) f(q_0) + t f(q_1), where q_t = (1 − t) q_0 + t q_1 between any two points q_0 and q_1 in the domain.
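A quick numerical illustration of this convexity (my own sketch; the random M_i and the midpoint test are not from [2]):

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 4, 6
# random PD matrices M_i = G G^T + 0.1 I
Ms = [g @ g.T + 0.1 * np.eye(p)
      for g in (rng.standard_normal((p, p)) for _ in range(n))]

def f(q):  # f(q) = log|sum_i e^{q_i} M_i|
    return np.linalg.slogdet(sum(np.exp(qi) * Mi for qi, Mi in zip(q, Ms)))[1]

q0, q1 = rng.standard_normal(n), rng.standard_normal(n)
# midpoint convexity: f((q0+q1)/2) <= (f(q0)+f(q1))/2, up to rounding
print(f((q0 + q1) / 2) <= (f(q0) + f(q1)) / 2 + 1e-9)  # True
```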
Several definitions, and more general variants, are given in [2].
Applications
• Robust covariance estimation
• Kronecker structured covariance estimation
• Hybrid Robust Kronecker model
Robust covariance estimation
Assume: x_i ~ N(0, q_i Σ), i = 1, …, n, with unknown positive scalings q_i (a scaled-Gaussian, heavy-tailed model).
The ML objective: min over Σ > 0, q > 0 of Σ_i [log|q_i Σ| + x_i^T (q_i Σ)^{-1} x_i].
The objective is convex in 1/q_i, and its minimizers are q_i = x_i^T Σ^{-1} x_i / p.
Plugging this solution back into the objective results in a concentrated problem in Σ alone (eq. (37)).
A key lemma (Lemma 4 in [2]) gives the minimizer of the concentrated problem in closed form; applying this lemma to (37) and plugging the result back into the objective yields the final estimator, computable by a fixed-point iteration (sketched below).
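In code, the q_i minimizers above lead to the classical Tyler-type fixed point Σ ← (p/n) Σ_i x_i x_i^T / (x_i^T Σ^{-1} x_i). This is my own sketch of that iteration (the trace normalization is my choice; the objective is scale-invariant):

```python
import numpy as np

def robust_cov(X, iters=50, eps=1e-12):
    """Fixed-point sketch of the scaled-Gaussian ML covariance estimate.

    X is p x n (one sample per column)."""
    p, n = X.shape
    Sigma = np.eye(p)
    for _ in range(iters):
        Sinv = np.linalg.inv(Sigma)
        # w_i = 1 / (x_i^T Sigma^{-1} x_i)
        w = 1.0 / np.maximum(np.einsum('ij,jk,ik->i', X.T, Sinv, X.T), eps)
        Sigma = (p / n) * (X * w) @ X.T       # sum_i w_i x_i x_i^T
        Sigma *= p / np.trace(Sigma)          # fix the free scale
    return Sigma
```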
To avoid ill-conditioning, regularize (37) and minimize the regularized objective.
Other priors can be added if available:
1) Bounded peak values:
2) Bounded second moment:
3) Smoothness:
4) Sparsity:
Kronecker structured covariance estimation
The basic Kronecker model is Σ = A ⊗ B, a Kronecker product of two smaller PD matrices.
The ML objective (58) is the Gaussian likelihood under this structure.
Using the identities |A ⊗ B| = |A|^p |B|^q and (A ⊗ B)^{-1} = A^{-1} ⊗ B^{-1}, the problem (58) turns into alternating updates for the two factors (sketched below).
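A sketch of the standard "flip-flop" ML iteration for Σ = A ⊗ B (the exact update and normalization used in [2] may differ; this is the textbook form):

```python
import numpy as np

def kron_cov(Xs, iters=20):
    """Flip-flop iteration for vec(X_i) ~ N(0, A kron B), X_i of size p x q.

    A (q x q) and B (p x p) are identifiable only up to a reciprocal
    scale factor c: (cA) kron (B/c) gives the same Sigma."""
    p, q = Xs[0].shape
    n = len(Xs)
    A, B = np.eye(q), np.eye(p)
    for _ in range(iters):
        Binv = np.linalg.inv(B)
        A = sum(X.T @ Binv @ X for X in Xs) / (n * p)   # update q x q factor
        Ainv = np.linalg.inv(A)
        B = sum(X @ Ainv @ X.T for X in Xs) / (n * q)   # update p x p factor
    return A, B
```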
Hybrid Robust Kronecker Model
The ML objective combines the robust scalings {q_i} with the Kronecker structure (eq. (73)).
Solving for Σ > 0 again via Lemma 4 yields a concentrated problem: (73) reduces to (75).
Solve (75) using the fixed-point iteration given in [2]; an arbitrary PD initial point can be used.
Some Ideas
• Motivated by CA framework [1]
• Motivated by Log-Sum-Exp [2]
• Motivated by Sparsity Aware [3][4]
Motivated by CA framework [1]
Recall that in [1], with M_r = W_r^T W_r ⪰ 0 and M_c = W_c W_c^T ⪰ 0, the objective (1) can be rewritten, for R = Γ − B A^T Υ, as

E_0(A, B, M_r, M_c) = ||W_r R W_c||^2 = tr((W_r R W_c)(W_r R W_c)^T) = tr(R M_c R^T M_r).

Idea: learn the weighting matrices rather than fix them, with PD weights Q learned jointly with A and B:

E_1(A, B, Q_1, Q_2) = tr[Q_1 (Γ − B A^T Υ) Q_2 (Γ − B A^T Υ)^T] − λ_1 log|Q_1| − λ_2 log|Q_2|

E_2(A, B, {Q_i}) = Σ_{i=1}^n tr[(x_i − B A^T y_i)(x_i − B A^T y_i)^T Q_i]   (per-sample weights)

E_3(A, B, {Q_i}) = Σ_{i=1}^C Σ_{j=1}^{n_i} tr[(x_j^i − B A^T y_j^i)(x_j^i − B A^T y_j^i)^T Q_i]   (per-class weights)

… …
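For concreteness, the class-wise variant E_3 is cheap to evaluate; this is a sketch under the reconstruction above (function and argument names are mine):

```python
import numpy as np

def class_weighted_objective(A, B, Xs, Ys, Qs):
    """E3(A,B,{Q_i}) = sum_i sum_j tr[(x_j^i - B A^T y_j^i)(...)^T Q_i].

    Xs[i], Ys[i]: d x n_i data matrices for class i; Qs[i]: d x d PD weight."""
    total = 0.0
    for Xi, Yi, Qi in zip(Xs, Ys, Qs):
        R = Xi - B @ (A.T @ Yi)          # residuals of class i, one per column
        total += np.trace(R.T @ Qi @ R)  # = sum_j r_j^T Q_i r_j
    return total
```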
Motivated by Log-Sum-Exp [2]
1) Metric Learning (ML): ML&CL, relative-distance constraints, LMNN-like, …
2) Classification learning. Predictive function: f(X) = tr(W^T X) + b.
The objective:
d^2(X_i, X_j) = tr[(X_i − X_j)(X_i − X_j)^T Q^{-1}]
min over {W_i, b_i}: Σ_{i=1}^C Σ_{j=1}^n [tr(W_i^T X_j) + b_i − y_j^i]^2 + Pen(W_1, …, W_C)
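A minimal ridge-regression fit of f(X) = tr(W^T X) + b for a single output (my own sketch; the choice Pen = λ||W||_F^2 is just one option, and per-class W_i would repeat this per class):

```python
import numpy as np

def fit_trace_classifier(Xs, ys, lam=1e-2):
    """Least-squares fit of f(X) = tr(W^T X) + b.

    Uses tr(W^T X) = vec(W) . vec(X) to reduce to ordinary ridge regression."""
    n = len(Xs)
    F = np.stack([X.ravel() for X in Xs])            # n x (d1*d2) design matrix
    F = np.hstack([F, np.ones((n, 1))])              # absorb the bias b
    reg = lam * np.eye(F.shape[1])
    reg[-1, -1] = 0.0                                # do not penalize b
    wb = np.linalg.solve(F.T @ F + reg, F.T @ np.asarray(ys, float))
    W = wb[:-1].reshape(Xs[0].shape)
    return W, wb[-1]
```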
• ML across heterogeneous domains, 2 lines:
1) Line 1: d^2(x_i, y_j) = ||W_x^T x_i − W_y^T y_j||^2, with per-pair thresholds/slacks ξ_ij.
2) Line 2 (for ML&CL):
f(x, y) = [x^T y^T] [U_x W; W^T U_y] [x; y] = z^T U z, with z = [x; y] and U symmetric and PSD.
An indefinite measure ({U_i} is a base & {α_i} is sparsified):
f(x, y) = z^T U z, U = Σ_{i=1}^I α_i U_i U_i^T, with Σ_{i=1}^I α_i = 1 (the α_i may be negative),
implying that the 2 lines can be unified into a common indefinite ML!
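A tiny demo of the unified measure (my own illustration): with a sparse, mixed-sign α summing to 1, U = Σ_i α_i U_i U_i^T has eigenvalues of both signs, i.e., the measure is indefinite:

```python
import numpy as np

rng = np.random.default_rng(0)
d, I = 6, 5                                            # d = dim of z = [x; y]
Us = [rng.standard_normal((d, 2)) for _ in range(I)]   # base {U_i}
alpha = np.array([0.8, 0.0, 0.6, -0.4, 0.0])           # sparse, mixed signs, sums to 1
U = sum(a * Ui @ Ui.T for a, Ui in zip(alpha, Us))
print(np.linalg.eigvalsh(U))   # mixed-sign spectrum: U is indefinite
z = rng.standard_normal(d)
print(z @ U @ z)               # f = z^T U z can take either sign
```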
Motivated by Sparsity Aware [3][4]
Noise model: x_i = m_c + U_c y_ci + e_ci + o_ci
where c indexes the class (or cluster), e_ci is noise, and o_ci is an outlier term with ||o_ci|| ≠ 0 if x_i is an outlier and 0 otherwise.
Discussion:
1) U_c = 0, o_ci = 0; e_ci ~ N(0, dI) → Means; ~ Lap(0, dI) → Medians; other priors → other statistics.
2) U_c ≠ 0, o_ci = 0; e_ci ~ N(0, dI) → PCA; ~ Lap(0, dI) → L1-PCA; other priors → other PCAs.
3) U_c = 0, o_ci ≠ 0; e_ci ~ N(0, dI) → robust (k-)Means (see the sketch below); ~ Lap(0, dI) → (k-)Medians.
4) Subspace: U_c ≠ 0, o_ci ≠ 0; e_ci ~ N(0, dI) → robust k-subspaces.
5) m_c = 0 → …
6) Robust (Semi-)NMF → …
7) Robust CA → …
where the noise model in matrix form is Γ = B A^T Υ + E + O (E collects the noise terms e_ci and O the outlier terms o_ci; the columns of O are nonzero only for outliers).
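As a concrete instance of case 3) (robust means), outlier-sparsity regularization in the spirit of [3][4] can be sketched as alternating minimization with group soft-thresholding; the formulation and names below are mine, not the exact algorithms of [3][4]:

```python
import numpy as np

def robust_mean(X, lam, iters=50):
    """Alternating minimization for
        min_{m, O} sum_i ||x_i - m - o_i||^2 + lam * sum_i ||o_i||_2.

    The group-lasso penalty drives o_i to zero for inliers, so m becomes
    a mean over (approximately) outlier-free residuals.  X is p x n."""
    p, n = X.shape
    O = np.zeros((p, n))
    for _ in range(iters):
        m = (X - O).mean(axis=1, keepdims=True)   # mean of outlier-corrected data
        R = X - m                                 # residuals, one per column
        norms = np.linalg.norm(R, axis=0, keepdims=True)
        shrink = np.maximum(1 - lam / (2 * np.maximum(norms, 1e-12)), 0)
        O = R * shrink                            # group soft-threshold per column
    return m.ravel(), O
```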
References
[1] Fernando De la Torre, A Least-Squares Framework for Component Analysis, IEEE TPAMI, 34(6), 2012: 1041-1055.
[2] Ami Wiesel, On the convexity of log-sum-exp functions with positive definite matrices, available at http://www.cs.huji.ac.il/~amiw/
[3] Gonzalo Mateos & Georgios B. Giannakis, Robust PCA as Bilinear Decomposition with Outlier-Sparsity Regularization, available at homepage of Georgios B. Giannakis.
[4] Pedro A. Forero, Vassilis Kekatos & Georgios B. Giannakis, Robust Clustering Using Outlier-Sparsity Regularization, available at homepage of Georgios B. Giannakis.
Thanks!
Q&A