
Greedy Dictionary Learning Algorithms for Sparse Surrogate Modelling

by

Valentin Stolbunov

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

Graduate Department of the Institute for Aerospace Studies
University of Toronto

© Copyright 2017 by Valentin Stolbunov

Abstract

    Greedy Dictionary Learning Algorithms for Sparse Surrogate Modelling

    Valentin Stolbunov

    Doctor of Philosophy

    Graduate Department of the Institute for Aerospace Studies

    University of Toronto

    2017

In the field of engineering design, numerical simulations are commonly used to forecast system performance before physical prototypes are built and tested. However, the fidelity of predictive models has outpaced advances in computer hardware and numerical methods, making it impractical to directly apply numerical optimization algorithms to the design of complex engineering systems modelled with high fidelity. A promising approach for dealing with this computational challenge is the use of surrogate models, which serve as approximations of the high-fidelity computational models and can be evaluated very cheaply. This makes surrogates extremely valuable in design optimization and in a wider class of problems: inverse parameter estimation, machine learning, uncertainty quantification, and visualization.

This thesis is concerned with the development of greedy dictionary learning algorithms for efficiently constructing sparse surrogate models from a set of scattered observational data. The central idea is to define a dictionary of basis functions, either a priori or a posteriori in light of the dataset, and to select a subset of the basis functions from the dictionary using a greedy search criterion. We first develop a novel algorithm for sparse learning from parameterized dictionaries in the context of greedy radial basis function (GRBF) learning. Next, we develop a novel greedy algorithm for general dictionary learning (GGDL), presented in the context of multiple kernel learning with heterogeneous dictionaries. In addition, we present a novel cross-validation-based strategy for parallelizing greedy dictionary learning, as well as a randomized sampling strategy that significantly reduces the approximation costs associated with large dictionaries. We also employ our GGDL algorithm in the context of uncertainty quantification to construct sparse polynomial chaos expansions. Finally, we demonstrate how our algorithms may be adapted to approximate gradient-enhanced datasets.
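To make the central idea concrete, the sketch below shows a generic greedy dictionary learning loop of the kind described above: each iteration scores the unselected dictionary candidates against the current residual, adds the best one, and refits the coefficients by least squares. This is a minimal illustration only; the function names, the correlation-based selection score, and the RMS stopping rule are assumptions made for the sketch, not the specific GRBF/GGDL criteria developed in the thesis.

    import numpy as np

    def greedy_fit(X, y, dictionary, max_terms=50, tol=1e-6):
        """Greedily select basis functions from `dictionary` to fit (X, y).

        `dictionary` is a list of callables phi(X) -> length-m vector of
        basis-function evaluations at the m training inputs.
        """
        m = len(y)
        residual = y.copy()
        selected, columns = [], []
        coeffs = np.zeros(0)
        for _ in range(max_terms):
            # Score each unselected candidate by its normalized correlation
            # with the current residual, in the spirit of matching pursuit.
            scores = []
            for j, phi in enumerate(dictionary):
                if j in selected:
                    scores.append(-np.inf)
                    continue
                col = phi(X)
                scores.append(abs(col @ residual) / (np.linalg.norm(col) + 1e-12))
            j_best = int(np.argmax(scores))
            selected.append(j_best)
            columns.append(dictionary[j_best](X))
            # Refit all coefficients jointly by linear least squares.
            Phi = np.column_stack(columns)
            coeffs, *_ = np.linalg.lstsq(Phi, y, rcond=None)
            residual = y - Phi @ coeffs
            if np.linalg.norm(residual) / np.sqrt(m) < tol:  # RMS stopping rule
                break
        return selected, coeffs

For example, with an a posteriori Gaussian RBF dictionary whose candidate centres are the training points themselves (the fixed shape parameter here is an arbitrary choice for illustration):

    rng = np.random.default_rng(0)
    X = rng.uniform(size=(100, 2))
    y = np.sin(3 * X[:, 0]) + np.cos(3 * X[:, 1])
    dictionary = [
        (lambda c: lambda Z: np.exp(-10.0 * np.sum((Z - c) ** 2, axis=1)))(x)
        for x in X
    ]
    selected, coeffs = greedy_fit(X, y, dictionary)
    print(f"{len(selected)} of {len(dictionary)} basis functions selected")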

Numerical studies are presented for a variety of test functions, machine learning datasets, and engineering case studies over a wide range of dataset sizes and dimensionalities. Compared to state-of-the-art approximation techniques such as classical radial basis function approximations, Gaussian process models, and support vector machines, our algorithms build surrogates which are significantly sparser, of comparable or improved accuracy, and often offer reduced computational and memory costs.


Acknowledgements

I would like to begin by thanking Professor Prasanth Nair for being the absolute best mentor one could hope to have had. I am grateful for his support, guidance, patience, insight, and enthusiasm. Professor Nair puts his students first and it is thanks to him that I had the confidence and self-discipline to accomplish everything that I have.

I would also like to thank the other members of my doctoral examination committee, Professor Clinton Groth and Professor David Zingg, for their comments on, and critiques of, the work presented here.

To my UTIAS group colleagues, past and present – Mina Mitry, Matthew Li, Tomas Mawyin, Dr. Christophe Audouze, Ricardo Santos-Baptista, and Trefor Evans – I am honoured to have had the opportunity to collaborate with you and learn from you.

I am grateful for financial support from UTIAS, the Centre for Research in Sustainable Aviation, the Ontario Graduate Scholarship, and Pratt & Whitney Canada.

On a personal note, I would like to thank Umesh Thillaivasan, Phung Tran, and Victoria Tu. Our countless camping trips and game nights kept me sane. Over the past four years I have grown more as a person than in the previous twenty-two and for this period I am glad to have had you as my closest friends.

Finally, I would like to acknowledge, and dedicate this thesis to, my parents. It would take another 150 pages for me to thank you for everything you have done for me, so just know that I love you.


Contents

1 Introduction
  1.1 Motivation
  1.2 Thesis Objectives
  1.3 Thesis Outline

2 Background and Related Work
  2.1 Approximation of Scattered Data
    2.1.1 Radial Basis Function Expansions
    2.1.2 Kernel Methods
  2.2 Sparse Dictionary Learning
    2.2.1 ℓ1 Minimization
    2.2.2 Greedy Algorithms
  2.3 Summary and Concluding Remarks

3 Greedy Algorithm for RBF Approximation
  3.1 Motivation
  3.2 Greedy Algorithm for RBF Approximation
    3.2.1 Greedy Selection Criteria
    3.2.2 Model Update Strategy and Shape Parameter Tuning
    3.2.3 Choice of Stopping Criteria
    3.2.4 Final Algorithm
  3.3 Tuning the RBF Kernel Type
  3.4 Summary and Concluding Remarks

4 Numerical Results for GRBF
  4.1 Demonstrative Examples
    4.1.1 GRBF Predictor Evolution
    4.1.2 One-Dimensional Example
    4.1.3 Residual Convergence Trends
  4.2 Test Functions
    4.2.1 Case 1: n = 1, 2, 10n ≤ m ≤ 50n
    4.2.2 Case 2: n = 10, 20, 10n ≤ m ≤ 50n
    4.2.3 Case 3: n = 1, 2, m > 50n
    4.2.4 Case 4: n = 10, 20, m > 50n
  4.3 Summary and Concluding Remarks

5 Greedy Algorithms for General Dictionary Learning
  5.1 Motivation
  5.2 Multiple Kernel Learning
    5.2.1 Regularized Risk Minimization with Multiple Kernels
    5.2.2 A Greedy Approach to Multiple Kernel Learning
  5.3 Parallel Algorithm for General Dictionaries
    5.3.1 General Dictionaries and Problem Statement
    5.3.2 Candidate Selection
    5.3.3 Stopping Criterion: ν-fold Cross Validation
    5.3.4 Parallelization
    5.3.5 Algorithm Block
  5.4 Randomized Algorithm for Larger Dictionaries
    5.4.1 Random Candidate Sampling
    5.4.2 Stopping Criteria
  5.5 Summary and Concluding Remarks

6 Numerical Results and Case Studies for GGDL
  6.1 Demonstrative Examples
    6.1.1 GGDL-CV Predictor Evolution
    6.1.2 GGDL-R Predictor Evolution
    6.1.3 Shape Parameter Tuning
    6.1.4 Structurally-Sampled RBF Centres
    6.1.5 Learning with Multiple Kernel Types
  6.2 Machine Learning Datasets
    6.2.1 Abalone
    6.2.2 Boston
    6.2.3 CPU Small
    6.2.4 Diabetes
    6.2.5 Final Results
