Multivariate Dyadic Regression Trees for Sparse Learning Problems
Xi Chen
Machine Learning Department
Carnegie Mellon University
(joint work with Han Liu)
Content
Multivariate Regression and Dyadic Regression Tree
Multivariate Dyadic Regression Tree for Sparse Learning
Statistical Property
Tree Learning Algorithm
Experimental Results
Multivariate Regression Model
Predictors Responses
Estimate : Minimize the L2-risk
Empirical Risk Minimization
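The empirical counterpart of the L2-risk can be sketched as the average squared error over the sample. This is a minimal illustration only: the names `empirical_risk`, `X`, `Y` and the toy data are assumptions, not notation from the slides.

```python
def empirical_risk(f, X, Y):
    """Average squared Euclidean error of predictor f over the sample.

    X: list of d-dimensional predictor tuples (assumed layout).
    Y: list of p-dimensional response tuples (assumed layout).
    """
    total = 0.0
    for x, y in zip(X, Y):
        pred = f(x)
        # Sum squared errors over all p response coordinates.
        total += sum((yj - pj) ** 2 for yj, pj in zip(y, pred))
    return total / len(X)


# Toy sample: two predictors, two responses.
X = [(0.0, 0.0), (1.0, 1.0)]
Y = [(0.0, 0.0), (2.0, 0.0)]

def zero(x):
    # Trivial predictor that always returns the zero vector.
    return (0.0, 0.0)

risk_zero = empirical_risk(zero, X, Y)
```

Empirical risk minimization then searches a class of candidate predictors (here, trees) for the one with the smallest value of this quantity.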
Tree Based Method
Estimation using tree-based methods
Why trees?
Simplicity of design
Good interpretability
Easy implementation
Good practical performance
CART (Classification and Regression Tree) [Breiman 1984]
No. of terminal nodes
Hard to analyze theoretically!
Dyadic Decision/Regression Tree
Dyadic Split [Scott 2004]
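A dyadic split cuts a cell at the midpoint of one coordinate, so every cell of the partition is a hyperrectangle with dyadic side lengths. The sketch below illustrates this under the assumption that a cell is stored as a list of `(low, high)` intervals; the function name `dyadic_split` is hypothetical.

```python
def dyadic_split(cell, dim):
    """Split a hyperrectangle at the midpoint of coordinate `dim`.

    cell: list of (low, high) intervals, one per coordinate (assumed layout).
    Returns the two child cells.
    """
    low, high = cell[dim]
    mid = (low + high) / 2.0
    left = list(cell)
    left[dim] = (low, mid)
    right = list(cell)
    right[dim] = (mid, high)
    return left, right


unit_square = [(0.0, 1.0), (0.0, 1.0)]
# Split along coordinate 0, then split the left child along coordinate 1.
left, right = dyadic_split(unit_square, 0)
lower_left, upper_left = dyadic_split(left, 1)
```

Because split locations are fixed at midpoints, the only choice at each node is which coordinate to split, which is what makes the tree class easier to analyze than free-threshold splits.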
Sparse Model
Lower Minimax Rate of Convergence of the risk
Slow
Fast
Sparse Model
Regression Tree
Piecewise Constant
Piecewise Linear
Piecewise Polynomial
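The difference between a piecewise constant and a piecewise linear fit can be seen on a single cell. The following is a minimal one-dimensional sketch using closed-form least squares; the helper names and the toy data are assumptions for illustration, and a higher-degree polynomial fit would follow the same pattern with more basis terms.

```python
def fit_constant(xs, ys):
    """Degree-0 fit on one cell: predict the mean response."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def fit_linear(xs, ys):
    """Degree-1 fit on one cell via simple least squares."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        # All x identical: fall back to the constant fit.
        return lambda x: my
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def sse(f, xs, ys):
    """Sum of squared residuals of f on the cell."""
    return sum((y - f(x)) ** 2 for x, y in zip(xs, ys))


# Toy cell whose responses are exactly linear in x.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [2 * x + 1 for x in xs]
sse_const = sse(fit_constant(xs, ys), xs, ys)
sse_lin = sse(fit_linear(xs, ys), xs, ys)
```

On this cell the linear fit has zero residual while the constant fit does not, so a constant-only tree would need further splits to reach the same accuracy.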
Gamma-Ray Burst 845
Multivariate Dyadic Regression Tree (MDRT)
Active Set
Rule 1
Rule 2
Multivariate Dyadic Regression Tree (MDRT) Variable Selection
Multivariate Dyadic Regression Tree
Regularization Parameter
Fine partition
Sparse Model
Lower-degree polynomial
Statistical Property
Assumption 1:
Assumption 2:
Convergence Rate
Minimax Rate
Tree Learning Algorithm
Loss:
Minimize the cost
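The cost being minimized can be thought of as an empirical loss plus a regularization term. The sketch below is only a placeholder for that idea: the additive complexity measure (leaves plus active variables plus polynomial degree) and all numbers are hypothetical, and this is not the penalization term defined in the talk.

```python
def penalized_cost(loss, n_leaves, n_active, degree, lam):
    """Empirical loss plus lam times a complexity measure.

    The complexity used here is a hypothetical stand-in that grows with
    the number of leaves, the number of active variables, and the
    polynomial degree.
    """
    complexity = n_leaves + n_active + degree
    return loss + lam * complexity


# Two made-up candidate trees: (loss, n_leaves, n_active, degree).
candidates = {
    "coarse": (0.50, 2, 1, 0),
    "fine": (0.10, 8, 3, 1),
}

def best_tree(lam):
    """Name of the candidate with the smallest penalized cost."""
    return min(candidates, key=lambda k: penalized_cost(*candidates[k], lam))

choice_small_lam = best_tree(0.001)
choice_large_lam = best_tree(0.1)
```

With a small regularization parameter the finer, better-fitting tree wins; with a large one the coarser, sparser tree wins.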
Tree-growing stage
Pruning-back stage
Randomized
Greedy
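One way to picture the tree-growing stage with its randomized and greedy variants is sketched below. It is a simplified illustration under stated assumptions: scalar responses, piecewise constant leaves, a fixed maximum depth, greedy meaning "pick the coordinate whose midpoint split gives the smallest squared error", and randomized meaning "pick the coordinate uniformly at random". The pruning-back stage is not shown, and all names are hypothetical.

```python
import random


def sse(ys):
    """Squared error of the best constant fit to ys."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)


def split_points(points, cell, dim):
    """Dyadic split of a cell and its points along coordinate `dim`."""
    low, high = cell[dim]
    mid = (low + high) / 2.0
    left = [(x, y) for x, y in points if x[dim] < mid]
    right = [(x, y) for x, y in points if x[dim] >= mid]
    left_cell = list(cell)
    left_cell[dim] = (low, mid)
    right_cell = list(cell)
    right_cell[dim] = (mid, high)
    return (left, left_cell), (right, right_cell)


def choose_dim(points, cell, mode, rng):
    """Pick the split coordinate greedily or at random."""
    dims = list(range(len(cell)))
    if mode == "randomized":
        return rng.choice(dims)

    def cost(dim):
        (l, _), (r, _) = split_points(points, cell, dim)
        return sse([y for _, y in l]) + sse([y for _, y in r])

    return min(dims, key=cost)


def grow(points, cell, depth, mode="greedy", rng=None):
    """Grow a dyadic tree to the given depth."""
    rng = rng or random.Random(0)
    if depth == 0 or len(points) < 2:
        ys = [y for _, y in points]
        return {"cell": cell, "value": sum(ys) / len(ys) if ys else 0.0}
    dim = choose_dim(points, cell, mode, rng)
    (l, lc), (r, rc) = split_points(points, cell, dim)
    return {
        "cell": cell,
        "dim": dim,
        "left": grow(l, lc, depth - 1, mode, rng),
        "right": grow(r, rc, depth - 1, mode, rng),
    }


# Toy data on a 4x4 grid: the response depends only on coordinate 0.
grid = [0.125, 0.375, 0.625, 0.875]
points = [((a, b), 1.0 if a >= 0.5 else 0.0) for a in grid for b in grid]
unit_square = [(0.0, 1.0), (0.0, 1.0)]
greedy_tree = grow(points, unit_square, depth=1, mode="greedy")
random_tree = grow(points, unit_square, depth=1, mode="randomized")
```

On this toy data the greedy variant splits on the only relevant coordinate, while the randomized variant may pick either one; the randomized choice avoids evaluating every coordinate at each node.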
Experimental Results
Methods Compared
Method: Abbreviation
Greedy MDRT with M=1 (piecewise linear): MDRT(G, M=1)
Randomized MDRT with M=1 (piecewise linear): MDRT(R, M=1)
Greedy MDRT with M=0 (piecewise constant): MDRT(G, M=0)
Randomized MDRT with M=0 (piecewise constant): MDRT(R, M=0)
Classification and Regression Tree: CART
Generalized Nonlinear Model
Synthetic Data
Linear Model
Additive Model
Real Data (MSE)
10 artificial variables from Unif(0,1)
15 artificial variables from Unif(0,1)
Never selected in 20 runs for M=1
Conclusion
Multivariate Regression Tree Model
Dyadic split
A novel penalization term
Theoretically, achieves a nearly optimal minimax rate for (α,C)-smooth functions
Empirically, conducts variable selection for sparse models
Computationally efficient tree learning algorithm
Extensions
Classification trees
Forest extensions