4OR-Q J Oper Res (2010) 8:109–112
DOI 10.1007/s10288-009-0107-y
PHD THESIS
Nonsmooth optimization: theory and algorithms
Enrico Gorgone
Received: 13 February 2009 / Revised: 21 July 2009 / Published online: 11 September 2009
© Springer-Verlag 2009
Abstract This is a summary of the author's PhD thesis supervised by Manlio Gaudioso and Maria Flavia Monaco and defended on 21 February 2008 at the Università della Calabria. The thesis is a survey on nonsmooth optimization methods for both convex and nonconvex functions. The main contribution of the dissertation is the presentation of a new bundle type method. The thesis is written in English and is available from http://www2.deis.unical.it/logilab/gorgone.
Keywords Nonsmooth optimization · Bundle methods
MSC classification (2000) 90C26 · 65K05
1 Introductory remarks
Nonsmooth optimization tackles the problem of finding the minima (or the maxima) of real-valued functions defined on R^n in the absence of differentiability hypotheses. Numerical convex optimization stemmed as a consequence of the development of convex analysis (Rockafellar 1970). The cutting plane method, the subgradient methods and the bundle methods are well-established numerical algorithms in the area.
We summarize here the cutting plane method, from which our approach derives. Let f be the objective function and let x_1, ..., x_k be any set of points previously generated. We construct a piecewise affine model of the max type and then we define the next iterate x_{k+1} as follows:

x_{k+1} = \arg\min_{x \in \mathbb{R}^n} f_k \triangleq \arg\min_{x \in \mathbb{R}^n} \max_{1 \le j \le k} \left\{ f(x_j) + g_j^T (x - x_j) \right\},
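As a concrete illustration, the max-type model above can be evaluated in a few lines. The following one-dimensional sketch is ours, not from the thesis; the test function f(x) = |x| and the two bundle points are illustrative assumptions:

```python
def cutting_plane_model(x, points, fvals, subgrads):
    """Piecewise affine max-type model (one-dimensional):
    f_k(x) = max_j { f(x_j) + g_j * (x - x_j) }."""
    return max(fj + gj * (x - xj)
               for xj, fj, gj in zip(points, fvals, subgrads))

# Illustration on f(x) = |x|, whose subgradient at x_j != 0 is sign(x_j).
points, fvals, subgrads = [-1.0, 2.0], [1.0, 2.0], [-1.0, 1.0]

# For a convex f the model is an underestimate everywhere; here it
# happens to be exact at x = 0.5.
print(cutting_plane_model(0.5, points, fvals, subgrads))  # 0.5 = f(0.5)
```

Each bundle point contributes one affine piece; the model is the pointwise maximum of those pieces, which is why enriching the bundle can only tighten it from below.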
E. Gorgone (B)
Dipartimento di Elettronica Informatica e Sistemistica, Università della Calabria, 87036 Rende, CS, Italy
e-mail: [email protected]
where g_j ∈ ∂f(x_j). Unfortunately this algorithm is unstable and its numerical performance is very often poor. The so-called bundle methods can be interpreted as stabilized versions of the cutting plane method (see, e.g., Hiriart-Urruty and Lemaréchal 1993).
Nonconvex nonsmooth optimization was born by simply adapting algorithms primarily designed for the convex case. Recently several algorithms, specifically conceived for dealing with nonconvexity, have appeared in the literature. The thesis is aimed at developing a new algorithm based on the bundle splitting strategy introduced in Fuduli et al. (2004a,b) and Gaudioso et al. (2009).
The thesis is organized as follows. We recall in Part I some elements of convex analysis (Rockafellar 1970) and then we survey the bundle methods, the subgradient methods and some smoothing techniques. We recall in Part II some elements of nonsmooth analysis (see, e.g., Demyanov and Rubinov 1995) and then we survey a gradient sampling technique and some bundle type methods. Finally we present our algorithm.
The bundle splitting strategy for nonconvex nonsmooth minimization differs from the traditional cutting plane approach in that the bundle points are classified in terms of the sign of their linearization error, giving rise to two different affine models (one convex and the other concave) of the objective function, which in turn yield different quadratic subproblems. In Fuduli et al. (2004a) the two piecewise affine approximations define a kind of trust region, whereas in Fuduli et al. (2004b) the objective function of the quadratic subproblem is a convex combination of the two models. Our approach consists in locating the new iterate at a point where the disagreement between the two piecewise affine approximations is maximal.
2 Nonsmooth optimization: a new bundle method
We consider the following unconstrained minimization problem:

(P) \quad \min f(x), \quad x \in \mathbb{R}^n,

where f : R^n → R is not necessarily differentiable nor convex.
We restrict our attention to the core of our algorithm, i.e., the quadratic subproblem. For any pair of points (x_i, x) the linearization error α_i is defined as the difference between the actual value of f at x and the value at x of the linear expansion generated at x_i:

\alpha_i \triangleq f(x) - f(x_i) - g_i^T (x - x_i),
with g_i ∈ ∂f(x_i). For a general function the linearization error may assume any sign. Thus we split the bundle index set I, i.e., the set of the indices of points previously generated, as
I^+ \triangleq \{ i : \alpha_i \ge -\sigma \} \quad \text{and} \quad I^- \triangleq \{ i : \alpha_i < -\sigma \}, \qquad (1)
for some σ > 0. In particular, I^+ contains the points that exhibit a kind of “convex” behavior and I^- the points that exhibit a kind of “concave” behavior with respect to x.
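A minimal one-dimensional sketch of the splitting rule (1); the tolerance sigma, the bundle triples, and the concave test function below are illustrative assumptions of ours, not the thesis' test setup:

```python
def split_bundle(f, x, bundle, sigma=1e-6):
    """Classify bundle indices by the sign of the linearization error
    alpha_i = f(x) - f(x_i) - g_i * (x - x_i), as in (1).
    Each bundle entry is a triple (x_i, f(x_i), g_i), g_i a subgradient."""
    fx = f(x)
    I_plus, I_minus = [], []
    for i, (xi, fi, gi) in enumerate(bundle):
        alpha = fx - fi - gi * (x - xi)
        (I_plus if alpha >= -sigma else I_minus).append(i)
    return I_plus, I_minus

# On the concave f(x) = -x**2 every linearization overestimates f,
# so both bundle points exhibit "concave" behavior at x = 0.
f = lambda x: -x * x
bundle = [(1.0, -1.0, -2.0), (-1.0, -1.0, 2.0)]
print(split_bundle(f, 0.0, bundle))  # ([], [0, 1])
```

For a convex f the errors are nonnegative and every index would land in I_plus instead, which is exactly why the split detects local “concave” behavior.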
Fig. 1 A bundle method for nonconvex functions
Let h(d) \triangleq f(x + d) - f(x) be the difference function. We construct two polyhedral models of h, using separately the two bundles. That is, we define two piecewise affine functions:

\Delta^+(d) \triangleq \max_{i \in I^+} \left\{ g_i^T d - \alpha_i \right\}, \qquad \Delta^-(d) \triangleq \min \left\{ 0, \min_{i \in I^-} \left\{ g_i^T d - \alpha_i \right\} \right\}.
Of course Δ^+(d) is convex while Δ^-(d) is concave. Our approach consists in finding a tentative step d(ρ) by solving the following convex problem:

\min_{d \in \mathbb{R}^n} \Delta(d) + \frac{1}{2\rho} \|d\|^2, \qquad (2)

where Δ \triangleq Δ^+ - Δ^- and ρ > 0 is the proximity parameter introduced for both stabilization and well-posedness purposes (see Fig. 1). The rationale of the model lies in the attempt to locate the new “sample point” x + d(ρ) so that both the model functions Δ^+ and Δ^- predict a reduction but, at the same time, their predictions differ as much as possible.
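To make the construction concrete, here is a hedged one-dimensional sketch of subproblem (2). The bundle data are illustrative values of our own, and the coarse grid search merely stands in for solving the quadratic subproblem exactly:

```python
def model_step(bundle_plus, bundle_minus, rho, grid):
    """One-dimensional sketch of subproblem (2): minimize
    Delta(d) + d**2 / (2 * rho), with Delta = Delta_plus - Delta_minus,
    over a grid of candidate steps. Each bundle entry is a pair
    (g_i, alpha_i)."""
    def delta_plus(d):
        return max(g * d - a for g, a in bundle_plus)
    def delta_minus(d):
        return min(0.0, min(g * d - a for g, a in bundle_minus))
    def objective(d):
        return delta_plus(d) - delta_minus(d) + d * d / (2.0 * rho)
    return min(grid, key=objective)

grid = [i / 100.0 for i in range(-200, 201)]
d = model_step([(1.0, 0.5)], [(2.0, -0.3)], rho=1.0, grid=grid)
print(d)  # -0.15, the kink of Delta_minus for this data
```

Since -Δ^- is convex, the overall objective in (2) is convex, and the proximal term ‖d‖²/(2ρ) keeps the minimizer finite even when the piecewise affine part is unbounded below.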
Our algorithm is of bundle type and we prove its termination at an approximate optimal solution, under the hypothesis that the function is locally Lipschitz and weakly semismooth. The code has been tested on a set of 25 problems (Lukšan and Vlček 2000) available on the web at the URL http://www.cs.cas.cz/~luksan/test.html. In Table 1 we report the computational results in terms of the number N_f of function evaluations. By f* and f we indicate, respectively, the minimum value of the objective function and the function value reached by the algorithm when the stopping criterion is met.
The numerical results have been rather satisfactory. Possible future work is the application of the algorithm to real-life problems. A significant field would be that of classification problems, where both nonconvexity and nonsmoothness of the objective functions often appear.
Table 1 Computational results
Problem          f*           N_f     f
Rosenbrock 0 54 5.137e-06
CB2 1.9522245 16 1.9522255
DEM −3 13 −3.0000000
LQ −1.4142136 15 −1.4142136
Mifflin2 −1 14 −0.9999991
Shor 22.600162 27 22.600163
Maxq 0 187 1.561e-06
Goffin 0 56 1.984e-13
Wolfe −8 43 −7.9999998
L1HILB 0 30 1.709e-05
Gill 9.7857721 308 9.7858381
TR48 −638565 1662 −638514.80
Steiner2 16.703838 96 16.703844
Crescent 0 53 5.112e-06
CB3 2 14 2.0000000
QL 7.2 15 7.2000001
Mifflin1 −1 165 −0.9999389
Rosen-Suzuki −44 33 −43.999997
Maxquad −0.8414083 90 −0.8413860
Maxl 0 23 4.493e-15
El-Attar 0.5598131 172 0.5598143
MXHILB 0 24 1.764e-05
Colville1 −32.348679 36 −32.348677
HS78 −2.9197004 237 −2.9191783
Shell dual 32.348679 642 32.348687
References
Demyanov VF, Rubinov A (1995) Constructive nonsmooth analysis. Verlag Peter Lang, Frankfurt
Fuduli A, Gaudioso M, Giallombardo G (2004a) Minimizing nonconvex nonsmooth functions via cutting planes and proximity control. SIAM J Optim 14:743–756
Fuduli A, Gaudioso M, Giallombardo G (2004b) A DC piecewise affine model and a bundling technique in nonconvex nonsmooth minimization. Optim Methods Softw 19:89–102
Gaudioso M, Gorgone E, Monaco MF (2009) Piecewise linear approximations in nonconvex nonsmooth optimization. Numerische Mathematik 113(1):73–88
Hiriart-Urruty JB, Lemaréchal C (1993) Convex analysis and minimization algorithms, vol II. Springer, Berlin
Lukšan L, Vlček J (2000) Test problems for nonsmooth unconstrained and linearly constrained optimization. Technical Report 798, Institute of Computer Science, Academy of Sciences of the Czech Republic, Prague
Rockafellar RT (1970) Convex analysis. Princeton University Press, Princeton, NJ