ppdsparse: a parallel primal and dual sparse method to extreme...
TRANSCRIPT
![Page 1: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/1.jpg)
PPDSparse: A Parallel Primal and Dual Sparse Methodto Extreme Classification
Ian E.H. Yen1, Xiangru Huang2, Wei Dai1, Pradeep Ravikumar1,Inderjit S. Dhillon2 and Eric Xing1
1Carnegie Mellon University. 2University of Texas at Austin
KDD, 2017
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 1 / 17
![Page 2: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/2.jpg)
Outline
1 Problem SettingExtreme ClassificationRelated Works
2 AlgorithmSeparable LossAlgorithm Diagram
3 TheoryAnalysis of primal and dual sparsity
4 Experimental Results
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 2 / 17
![Page 3: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/3.jpg)
Outline
1 Problem SettingExtreme ClassificationRelated Works
2 AlgorithmSeparable LossAlgorithm Diagram
3 TheoryAnalysis of primal and dual sparsity
4 Experimental Results
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 3 / 17
![Page 4: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/4.jpg)
Extreme Classification
Goal: Learn a function h(x) : RD → RK from D input features to Koutput scores that is consistent with labels y ∈ {0, 1}K .
K is large (e.g. 103˜106).
Average number of positive labels (per sample)kp = 1
N
∑Ni=1(
∑k yik).
Multiclass: kp = 1; Multilabel: kp � K .
Average number of positive samples (per class)np = 1
K
∑Kk=1 n
kp = 1
K
∑Kk=1(
∑i yik).
np = Nkp/K � N
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 4 / 17
![Page 5: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/5.jpg)
Extreme Classification
Goal: Learn a function h(x) : RD → RK from D input features to Koutput scores that is consistent with labels y ∈ {0, 1}K .
K is large (e.g. 103˜106).
Average number of positive labels (per sample)kp = 1
N
∑Ni=1(
∑k yik).
Multiclass: kp = 1; Multilabel: kp � K .
Average number of positive samples (per class)np = 1
K
∑Kk=1 n
kp = 1
K
∑Kk=1(
∑i yik).
np = Nkp/K � N
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 4 / 17
![Page 6: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/6.jpg)
Extreme Classification
Goal: Learn a function h(x) : RD → RK from D input features to Koutput scores that is consistent with labels y ∈ {0, 1}K .
K is large (e.g. 103˜106).
Average number of positive labels (per sample)kp = 1
N
∑Ni=1(
∑k yik).
Multiclass: kp = 1; Multilabel: kp � K .
Average number of positive samples (per class)np = 1
K
∑Kk=1 n
kp = 1
K
∑Kk=1(
∑i yik).
np = Nkp/K � N
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 4 / 17
![Page 7: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/7.jpg)
Extreme Classification
We consider Linear Classifier:
h(x) := W Tx where W ∈ RD×K .
Challenge: When K is large, training of simple linear model requiresO(NDK ) cost.
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 5 / 17
![Page 8: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/8.jpg)
Extreme Classification
We consider Linear Classifier:
h(x) := W Tx where W ∈ RD×K .
Challenge: When K is large, training of simple linear model requiresO(NDK ) cost.
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 5 / 17
![Page 9: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/9.jpg)
Outline
1 Problem SettingExtreme ClassificationRelated Works
2 AlgorithmSeparable LossAlgorithm Diagram
3 TheoryAnalysis of primal and dual sparsity
4 Experimental Results
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 6 / 17
![Page 10: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/10.jpg)
Approaches
Approach 1 Structural, i.e. Low-rank or Tree-hierarchy Goodaccuracy when assumption holds. Lower accuracy when assumptionsnot hold.
Approach 2 Parallelized one-vs-all Good accuracy, slow,parallelizable. Need days on largest dataset with 100 cores.
Approach 3 Primal-Dual Sparse Good accuracy, fast, notparallelizable, memory issue O(DK ). Need days on largest dataset.
This paper Parallel PD-Sparse Good accuracy, fast, parallelizable.Need only < 30 min on largest dataset with 100 cores.
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 7 / 17
![Page 11: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/11.jpg)
Approaches
Approach 1 Structural, i.e. Low-rank or Tree-hierarchy Goodaccuracy when assumption holds. Lower accuracy when assumptionsnot hold.
Approach 2 Parallelized one-vs-all Good accuracy, slow,parallelizable. Need days on largest dataset with 100 cores.
Approach 3 Primal-Dual Sparse Good accuracy, fast, notparallelizable, memory issue O(DK ). Need days on largest dataset.
This paper Parallel PD-Sparse Good accuracy, fast, parallelizable.Need only < 30 min on largest dataset with 100 cores.
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 7 / 17
![Page 12: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/12.jpg)
Approaches
Approach 1 Structural, i.e. Low-rank or Tree-hierarchy Goodaccuracy when assumption holds. Lower accuracy when assumptionsnot hold.
Approach 2 Parallelized one-vs-all Good accuracy, slow,parallelizable. Need days on largest dataset with 100 cores.
Approach 3 Primal-Dual Sparse Good accuracy, fast, notparallelizable, memory issue O(DK ). Need days on largest dataset.
This paper Parallel PD-Sparse Good accuracy, fast, parallelizable.Need only < 30 min on largest dataset with 100 cores.
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 7 / 17
![Page 13: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/13.jpg)
Approaches
Approach 1 Structural, i.e. Low-rank or Tree-hierarchy Goodaccuracy when assumption holds. Lower accuracy when assumptionsnot hold.
Approach 2 Parallelized one-vs-all Good accuracy, slow,parallelizable. Need days on largest dataset with 100 cores.
Approach 3 Primal-Dual Sparse Good accuracy, fast, notparallelizable, memory issue O(DK ). Need days on largest dataset.
This paper Parallel PD-Sparse Good accuracy, fast, parallelizable.Need only < 30 min on largest dataset with 100 cores.
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 7 / 17
![Page 14: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/14.jpg)
Outline
1 Problem SettingExtreme ClassificationRelated Works
2 AlgorithmSeparable LossAlgorithm Diagram
3 TheoryAnalysis of primal and dual sparsity
4 Experimental Results
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 8 / 17
![Page 15: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/15.jpg)
We consider the classwise-separable hinge loss
L(z , y) :=K∑
k=1
`(zk , yk) =K∑
k=1
max(1− ykzk , 0)
Minimizing a separable loss is equivalent to One-versus-all:
minW∈RD×K
N∑i=1
K∑k=1
`(wTk xi , yik) =
K∑k=1
(N∑i=1
`(wTk xi , yik)
)
To obtain sparse iterates, we add `1-penalty on W and add bias perclass w0k . The dual problem of the `1-`2-regularized problem is:
minαk∈RN
G (αk) :=1
2‖w(αk)‖2 −
N∑i=1
αik
s.t. w(αk) = proxλ(X̂Tαk),
0 ≤ αik ≤ 1.
(1)
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 9 / 17
![Page 16: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/16.jpg)
We consider the classwise-separable hinge loss
L(z , y) :=K∑
k=1
`(zk , yk) =K∑
k=1
max(1− ykzk , 0)
Minimizing a separable loss is equivalent to One-versus-all:
minW∈RD×K
N∑i=1
K∑k=1
`(wTk xi , yik) =
K∑k=1
(N∑i=1
`(wTk xi , yik)
)
To obtain sparse iterates, we add `1-penalty on W and add bias perclass w0k . The dual problem of the `1-`2-regularized problem is:
minαk∈RN
G (αk) :=1
2‖w(αk)‖2 −
N∑i=1
αik
s.t. w(αk) = proxλ(X̂Tαk),
0 ≤ αik ≤ 1.
(1)
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 9 / 17
![Page 17: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/17.jpg)
Outline
1 Problem SettingExtreme ClassificationRelated Works
2 AlgorithmSeparable LossAlgorithm Diagram
3 TheoryAnalysis of primal and dual sparsity
4 Experimental Results
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 10 / 17
![Page 18: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/18.jpg)
Primal-Dual-Sparse Active-set Method
O
(nnz(wk)nnz(x j)︸ ︷︷ ︸
Search
+ nnz(αk)nnz(x i )︸ ︷︷ ︸Update+Maintain
)per iteration.
Apply Random Sparsification on (already sparse) wk before search.
Update α by Coordinate Descent within Ak .
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 11 / 17
![Page 19: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/19.jpg)
Parallel & Primal-Dual Sparse Method
Due to the separable loss, theoptimization can beembarrassingly parallelized withone-time communication.
The input yk and output wk ofeach sub-problem are sparse.
Can be implemented in adistributed, shared-memory, ortwo-level parallelization setting.
Space: O(nnz(X ) + D).
Nearly linear speedup even withthousands of cores.
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 12 / 17
![Page 20: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/20.jpg)
Outline
1 Problem SettingExtreme ClassificationRelated Works
2 AlgorithmSeparable LossAlgorithm Diagram
3 TheoryAnalysis of primal and dual sparsity
4 Experimental Results
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 13 / 17
![Page 21: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/21.jpg)
Theory: Primal and Dual Sparsity
Key Insight: The number of positive samples for each class
np =NkpK
is small. The following results hold if class-wise bias wk0 are added.
Step-1: bound ‖w‖1, and optimal ‖α∗‖1 in terms of np:
‖wk‖1 ≤2nkpλ
, ‖α∗k‖1 ≤ 4nkp .
Step-2: bound nnz(w), and nnz(α) in terms of ‖w‖1 and ‖α∗‖1:
nnz(w̃k) ≤ ‖wk‖21
δ2, nnz(αt
k) ≤ t ≤4‖α∗k‖2
1
ε
where w̃ is Random-Sparsified version of w with δ-approximationerror in ∇G (α), and ε is the desired precision of solution.
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 14 / 17
![Page 22: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/22.jpg)
Theory: Primal and Dual Sparsity
Key Insight: The number of positive samples for each class
np =NkpK
is small. The following results hold if class-wise bias wk0 are added.
Step-1: bound ‖w‖1, and optimal ‖α∗‖1 in terms of np:
‖wk‖1 ≤2nkpλ
, ‖α∗k‖1 ≤ 4nkp .
Step-2: bound nnz(w), and nnz(α) in terms of ‖w‖1 and ‖α∗‖1:
nnz(w̃k) ≤ ‖wk‖21
δ2, nnz(αt
k) ≤ t ≤4‖α∗k‖2
1
ε
where w̃ is Random-Sparsified version of w with δ-approximationerror in ∇G (α), and ε is the desired precision of solution.
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 14 / 17
![Page 23: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/23.jpg)
Theory: Primal and Dual Sparsity
Key Insight: The number of positive samples for each class
np =NkpK
is small. The following results hold if class-wise bias wk0 are added.
Step-1: bound ‖w‖1, and optimal ‖α∗‖1 in terms of np:
‖wk‖1 ≤2nkpλ
, ‖α∗k‖1 ≤ 4nkp .
Step-2: bound nnz(w), and nnz(α) in terms of ‖w‖1 and ‖α∗‖1:
nnz(w̃k) ≤ ‖wk‖21
δ2, nnz(αt
k) ≤ t ≤4‖α∗k‖2
1
ε
where w̃ is Random-Sparsified version of w with δ-approximationerror in ∇G (α), and ε is the desired precision of solution.
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 14 / 17
![Page 24: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/24.jpg)
Multilabel Classification
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 15 / 17
![Page 25: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/25.jpg)
Multiclass Classification
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 16 / 17
![Page 26: PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme …ianyen.site/publication/PPDSparse_Slide_KDD.pdf · 2018-06-02 · Outline 1 Problem Setting Extreme Classi cation](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa8e580ee819617ff3ab32a/html5/thumbnails/26.jpg)
Thank you
Xiangru [email protected]
Ian E.H. Yen, Xiangru Huang, Wei Dai, Pradeep Ravikumar, Inderjit S. Dhillon and Eric Xing (shortinst)PPDSparse: A Parallel Primal and Dual Sparse Method to Extreme ClassificationKDD, 2017 17 / 17