
TOPICS ON FUNCTIONAL ANALYSIS, CALCULUS OF VARIATIONS AND DUALITY

Fabio Botelho

Academic Publications

1991 Mathematics Subject Classification. 46N10, 49N15

Key words and phrases. Banach spaces, convex analysis, duality, calculus of variations, non-convex systems, generalized method of lines

Abstract. This work is a revised and enlarged edition of the title Variational Convex Analysis, published by Lambert Academic Publishing. First we present the basic tools of analysis necessary to develop the core theory and applications. New results concerning duality principles for systems originally modeled by non-linear differential equations are shown in chapters 10 to 18. A key aspect of this work is that although the original problems are non-linear with corresponding non-convex variational formulations, the dual formulations obtained are almost always concave and amenable to numerical computations. When the primal problem has no solution in the classical sense, the solution of the dual problem is a weak limit of minimizing sequences, and the evaluation of such average behavior is important in many practical applications. Among the results we highlight the dual formulations for micro-magnetism, phase transition models, composites in elasticity and conductivity, and others. To summarize, in the present work we introduce convex analysis as an interesting alternative approach for the understanding and computation of some important problems in the modern calculus of variations.

Finally, in chapter 13 we develop a new numerical procedure, called the generalized method of lines. Through such a method the domain is discretized in lines (or more generally in curves) and the solution on them is written as functions of boundary conditions and boundary shape. In chapter 17 we apply the method to an approximation of the incompressible Navier-Stokes system.

Author: Fabio Botelho
Address: Department of Mathematics, Federal University of Pelotas, Pelotas, RS, BRAZIL
email: [email protected]

© Academic Publications, Ltd., 2011
Electronic book: http://www.acadpubl.eu (397 pp.)
ISBN: 978-954-2940-08-1

Contents

Introduction
Summary of Main Results
Acknowledgments

Part 1. Basic Functional Analysis

Chapter 1. Topological Vector Spaces
1.1. Introduction
1.2. Vector Spaces
1.3. Some Properties of Topological Vector Spaces
1.4. Compactness in Topological Vector Spaces
1.5. Normed and Metric Spaces
1.6. Compactness in Metric Spaces
1.7. The Arzela-Ascoli Theorem
1.8. Linear Mappings
1.9. Linearity and Continuity
1.10. Continuity of Operators on Banach Spaces
1.11. Some Classical Results on Banach Spaces
1.12. Hilbert Spaces

Chapter 2. The Hahn-Banach Theorems and Weak Topologies
2.1. Introduction
2.2. The Hahn-Banach Theorem
2.3. Weak Topologies
2.4. The Weak-Star Topology
2.5. Weak-Star Compactness
2.6. Separable Sets
2.7. Uniformly Convex Spaces

Chapter 3. Topics on Linear Operators
3.1. Topologies for Bounded Operators
3.2. Adjoint Operators
3.3. Compact Operators
3.4. The Square Root of a Positive Operator
3.5. About the Spectrum of a Linear Operator
3.6. The Spectral Theorem for Bounded Self-Adjoint Operators
3.7. The Spectral Decomposition of Unitary Transformations
3.8. Unbounded Operators
3.9. Symmetric and Self-Adjoint Operators

Chapter 4. Measure and Integration
4.1. Basic Concepts
4.2. Simple Functions
4.3. Measures
4.4. Integration of Simple Functions
4.5. The Fubini Theorem
4.6. The Lebesgue Measure in $\mathbb{R}^n$
4.7. Lebesgue Measurable Functions

Chapter 5. Distributions
5.1. Basic Definitions and Results
5.2. Differentiation of Distributions

Chapter 6. The Lebesgue and Sobolev Spaces
6.1. Definition and Properties of $L^p$ Spaces
6.2. The Sobolev Spaces
6.3. The Sobolev Imbedding Theorem
6.4. The Proof of the Sobolev Imbedding Theorem
6.5. Compact Imbeddings

Part 2. Variational Convex Analysis

Chapter 7. Basic Concepts on the Calculus of Variations
7.1. Introduction to the Calculus of Variations
7.2. Evaluating the Gateaux variations
7.3. The Gateaux Variation in $W^{1,2}(\Omega)$
7.4. Elementary Convexity
7.5. The Legendre-Hadamard Condition
7.6. The Weierstrass Necessary Condition
7.7. The du Bois-Reymond Lemma
7.8. The Weierstrass-Erdmann Conditions
7.9. Natural Boundary Conditions

Chapter 8. Basic Concepts on Convex Analysis
8.1. Convex Sets and Convex Functions
8.2. Duality in Convex Optimization
8.3. Relaxation for the Scalar Case
8.4. Duality Suitable for the Vectorial Case

Chapter 9. Constrained Variational Optimization
9.1. Basic Concepts
9.2. Duality
9.3. The Lagrange Multiplier Theorem

Part 3. Applications

Chapter 10. Duality Applied to a Plate Model
10.1. Introduction
10.2. The Primal Variational Formulation
10.3. The Legendre Transform
10.4. The Classical Dual Formulation
10.5. The Second Duality Principle
10.6. The Third Duality Principle
10.7. A Convex Dual Formulation
10.8. A Final Result, other Sufficient Conditions of Optimality
10.9. Final Remarks

Chapter 11. Duality Applied to Elasticity
11.1. Introduction and Primal Formulation
11.2. The First Duality Principle
11.3. The Second Duality Principle
11.4. Conclusion

Chapter 12. Duality Applied to a Membrane Shell Model
12.1. Introduction and Primal Formulation
12.2. The Legendre Transform
12.3. The Polar Functional Related to $F : U \to \mathbb{R}$
12.4. The Final Format of First Duality Principle
12.5. The Second Duality Principle
12.6. Conclusion

Chapter 13. Duality Applied to Ginzburg-Landau Type Equations
13.1. Introduction
13.2. A Concave Dual Variational Formulation
13.3. Applications to Phase Transition in Polymers
13.4. A Numerical Example
13.5. A New Path for Relaxation
13.6. A New Numerical Procedure, the Generalized Method of Lines
13.7. A Simple Numerical Example
13.8. Conclusion

Chapter 14. Duality Applied to Conductivity in Composites
14.1. Introduction
14.2. The Primal Formulation
14.3. The Duality Principle
14.4. Conclusion

Chapter 15. Duality Applied to the Optimal Design in Elasticity
15.1. Introduction
15.2. The Main Duality Principles
15.3. The First Applied Duality Principle
15.4. A Concave Dual Formulation
15.5. Duality for a Two-Phase Problem in Elasticity
15.6. A Numerical Example
15.7. Conclusion

Chapter 16. Duality Applied to Micro-Magnetism
16.1. Introduction
16.2. The Primal formulations and the Duality Principles
16.3. A Preliminary Result
16.4. The Duality Principle for the Hard Case
16.5. An Alternative Dual Formulation for the Hard Case
16.6. The Full Semi-linear Case
16.7. The Cubic Case in Micro-magnetism
16.8. Final Results, Other Duality Principles
16.9. Conclusion

Chapter 17. Duality Applied to Fluid Mechanics
17.1. Introduction and Primal Formulation
17.2. The Legendre Transform
17.3. Linear Systems which the Solutions Solve the Navier-Stokes One
17.4. The Method of Lines for the Navier-Stokes System
17.5. Conclusion

Chapter 18. Duality Applied to a Beam Model
18.1. Introduction and Statement of the Primal Formulation
18.2. Existence and Regularity Results for Problem P
18.3. A Convex Dual Formulation for the Beam Model
18.4. A Final Result, Another Duality Principle
18.5. Conclusion

Bibliography

Introduction

The main objective of this work is to present recent results of the author about applications of duality to non-convex problems in the calculus of variations. The text is divided into chapters described below, and chapters 1 to 9 present the basic concepts of standard analysis necessary to develop the applications.

Of course, the material presented in the first 9 chapters is not new, with the exception of the section on relaxation for the scalar case, where we show different proofs of some theorems presented in Ekeland and Temam's book Convex Analysis and Variational Problems (indeed such a book is the theoretical base of the present work), and the section about relaxation for the vectorial case. The applications, presented in chapters 10 to 18, correspond to the work of the present author over recent years, and almost all results, including the applications of duality for micro-magnetism, composites in elasticity and conductivity, and phase transitions, were obtained during the Ph.D. program at Virginia Tech.

The key feature of this work is that while all problems studied here are non-linear with corresponding non-convex variational formulation, it has been almost always possible to develop convex (in fact concave) dual variational formulations, which in general are more amenable to numerical computations.

The section on relaxation for the vectorial case, as its title suggests, presents duality principles that are valid even for vectorial problems. It is worth noting that such results were used in this text to develop concave dual variational formulations in situations such as for conductivity in composites, vectorial examples in phase transitions, etc.

Summary of Main Results

The main results of this work are summarized as follows.

Duality Applied to a Plate Model. Chapter 10 develops dual variational formulations for the two-dimensional equations of the nonlinear elastic Kirchhoff-Love plate model. The first duality principle presented is the classical one and may be found in similar format in [40], [22]. It is worth noting that such results are valid only for positive definite membrane forces. However, we obtain new dual variational formulations which relax or even remove such constraints. In particular we exhibit a convex dual variational formulation which allows non positive definite membrane forces. In the last section, similar to the triality criterion introduced in [24], we obtain sufficient conditions of optimality for the present case. The results are based on fundamental tools of Convex Analysis and the Legendre Transform, which can easily be analytically expressed for the model in question.

Duality Applied to Finite Elasticity. Chapter 11 develops duality for a model in finite elasticity. The dual formulations obtained allow the matrix of stresses to be non positive definite. This is, in some sense, an extension of earlier results (which establish the complementary energy as a perfect global optimization duality principle only if the stress tensor is positive definite at the equilibrium point). Again, the results are based on standard tools of convex analysis and on the concept of Legendre Transform.

Duality Applied to a Shell Model. The main focus of Chapter 12 is the development of dual variational formulations for a non-linear elastic membrane shell model. In the present literature, the concept of complementary energy can be established only if the external load produces a critical point with positive definite membrane forces matrix. Our idea is to obtain dual variational formulations for which the mentioned constraint is relaxed or even eliminated. Again, the results are obtained through basic tools of convex analysis and the concept of Legendre Transform, which can be analytically established for the concerned shell model.

Duality Applied to Ginzburg-Landau Type Equations. Chapter 13 is concerned with the development of dual variational formulations for Ginzburg-Landau type equations. Since the primal formulations are non-convex, we use specific results for distance between two convex functions to obtain the dual approaches. Note that we obtain a convex dual formulation and a kind of primal-dual formulation. For the convex formulation, optimality conditions are established. In the last section we develop the Generalized Method of Lines, a new numerical procedure in which the solution of the partial differential equation in question is written on lines as functions of boundary conditions and boundary shape.

Duality Applied to Conductivity in Composites. The main focus of Chapter 14 is the development of a dual variational formulation for a two-phase optimization problem in conductivity. The primal formulation may not have minimizers in the classical sense. In this case, the solution through the dual formulation is a weak limit of minimizing sequences for the original problem.

Duality Applied to the Optimal Design in Elasticity. The first part of Chapter 15 develops a dual variational formulation for the optimal design of a plate of variable thickness. The design variable, namely the plate thickness, is supposed to minimize the plate deformation work due to a given external load. The second part is concerned with the optimal design for a two-phase problem in elasticity. In this case, we are looking for the mixture of two constituents that minimizes the structural internal work. In both applications the dual formulations were obtained through basic tools of convex analysis.

Duality Applied to Micro-Magnetism. The main focus of Chapter 16 is the development of dual variational formulations for functionals related to ferromagnetism models. We develop duality principles for the so-called hard and full (semi-linear) uniaxial and cubic cases. It is important to emphasize that the new dual formulations presented here are convex and are useful to compute the average behavior of minimizing sequences, especially when the primal formulation has no minimizers in the classical sense. Once more the results are obtained through standard tools of convex analysis.

Duality Applied to Fluid Mechanics. In Chapter 17 we use the concept of Legendre Transform to obtain dual variational formulations for the Navier-Stokes system. What is new and relevant is the establishment of a linear system whose solution also solves the original problem. In the last section we present the solution of an approximation concerning the Navier-Stokes system through the generalized method of lines.

Duality Applied to a Beam Model. Chapter 18 develops existence and duality for a non-linear beam model. Our final result is a convex variational formulation for the model in question.


Acknowledgments

This monograph is based on my Ph.D. thesis at Virginia Tech and I am especially grateful to Professor Robert C. Rogers for his excellent work as advisor. I would like to thank the Department of Mathematics for its constant support and for the opportunity to study mathematics at an advanced level. I am also grateful to all the Professors who have taught me over recent years, for their valuable work. Among the Professors, I particularly thank Martin Day (Calculus of Variations), James Thomson (Real Analysis) and George Hagedorn (Functional Analysis) for their excellent courses. Finally, special thanks to all my Professors at I.T.A. (Instituto Tecnologico de Aeronautica, SP-Brasil), my undergraduate and masters school, and to Virginia Tech-USA, two wonderful institutions in the American continent.

Part 1

Basic Functional Analysis

CHAPTER 1

Topological Vector Spaces

1.1. Introduction

The main objective of this chapter is to present an outline of the basic tools of analysis necessary to develop the subsequent chapters. We assume the reader has a background in linear algebra and elementary real analysis at an undergraduate level. All proofs are developed in detail.

1.2. Vector Spaces

We denote by $\mathbb{F}$ a scalar field. In practice this is either $\mathbb{R}$ or $\mathbb{C}$, the set of real or complex numbers.

Definition 1.2.1 (Vector Spaces). A vector space over $\mathbb{F}$ is a set denoted by $U$, whose elements are called vectors, on which two operations are defined, namely addition, denoted by $(+) : U \times U \to U$, and scalar multiplication, denoted by $(\cdot) : \mathbb{F} \times U \to U$, so that the following relations are valid:

(1) $u + v = v + u$, $\forall u, v \in U$,
(2) $u + (v + w) = (u + v) + w$, $\forall u, v, w \in U$,
(3) there exists a vector denoted by $\theta$ such that $u + \theta = u$, $\forall u \in U$,
(4) for each $u \in U$, there exists a unique vector denoted by $-u$ such that $u + (-u) = \theta$,
(5) $\alpha \cdot (\beta \cdot u) = (\alpha \cdot \beta) \cdot u$, $\forall \alpha, \beta \in \mathbb{F}$, $u \in U$,
(6) $\alpha \cdot (u + v) = \alpha \cdot u + \alpha \cdot v$, $\forall \alpha \in \mathbb{F}$, $u, v \in U$,
(7) $(\alpha + \beta) \cdot u = \alpha \cdot u + \beta \cdot u$, $\forall \alpha, \beta \in \mathbb{F}$, $u \in U$,
(8) $1 \cdot u = u$, $\forall u \in U$.

Remark 1.2.2. From now on we may drop the dot $(\cdot)$ in scalar multiplications and denote $\alpha \cdot u$ simply as $\alpha u$.


Definition 1.2.3 (Vector Subspace). Let $U$ be a vector space. A set $V \subset U$ is said to be a vector subspace of $U$ if $V$ is also a vector space with the same operations as those of $U$. If $V \neq U$ we say that $V$ is a proper subspace of $U$.

Definition 1.2.4 (Finite dimensional Space). A vector space $U$ is said to be of finite dimension if there exist fixed $u_1, u_2, \ldots, u_n \in U$ such that for each $u \in U$ there are corresponding $\alpha_1, \ldots, \alpha_n \in \mathbb{F}$ for which

$$u = \sum_{i=1}^{n} \alpha_i u_i. \tag{1.1}$$

Definition 1.2.5 (Topological Spaces). A set $U$ is said to be a topological space if it is possible to define a collection $\sigma$ of subsets of $U$, called a topology in $U$, for which the following properties are valid:

(1) $U \in \sigma$,
(2) $\emptyset \in \sigma$,
(3) if $A \in \sigma$ and $B \in \sigma$ then $A \cap B \in \sigma$, and
(4) arbitrary unions of elements in $\sigma$ also belong to $\sigma$.

Any A ∈ σ is said to be an open set.
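For a finite set the axioms of Definition 1.2.5 can be checked mechanically. The following Python sketch is an illustration only and is not part of the original text; the function name `is_topology` and the two example families are hypothetical. It relies on the fact that, for a finite family, closure under pairwise unions implies closure under arbitrary unions.

```python
from itertools import combinations

def is_topology(points, sigma):
    """Check the axioms of Definition 1.2.5 for a finite family sigma."""
    universe = frozenset(points)
    opens = {frozenset(s) for s in sigma}
    # every member of sigma must be a subset of the underlying set
    if any(not s <= universe for s in opens):
        return False
    # axioms (1) and (2): the whole set and the empty set are open
    if universe not in opens or frozenset() not in opens:
        return False
    # axiom (3): pairwise intersections; axiom (4): pairwise unions
    # (enough for a finite family, by induction)
    for a, b in combinations(opens, 2):
        if (a & b) not in opens or (a | b) not in opens:
            return False
    return True

U = {1, 2, 3}
sigma_good = [set(), {1}, {1, 2}, {1, 2, 3}]
sigma_bad = [set(), {1}, {2}, {1, 2, 3}]  # the union {1, 2} is missing

print(is_topology(U, sigma_good))  # True
print(is_topology(U, sigma_bad))   # False
```

The second family fails only axiom (4), which shows that the union condition is not implied by the other three.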

Remark 1.2.6. When necessary, to clarify the notation, we shall denote the vector space $U$ endowed with the topology $\sigma$ by $(U, \sigma)$.

Definition 1.2.7 (Closed Sets). Let $U$ be a topological space. A set $A \subset U$ is said to be closed if $U \setminus A$ is open. We also denote $U \setminus A = A^c$.

Proposition 1.2.8. For closed sets we have the following properties:

(1) $U$ and $\emptyset$ are closed,
(2) if $A$ and $B$ are closed sets then $A \cup B$ is closed,
(3) arbitrary intersections of closed sets are closed.

Proof.
(1) Since $\emptyset$ is open and $U = \emptyset^c$, by Definition 1.2.7 $U$ is closed. Similarly, since $U$ is open and $\emptyset = U \setminus U = U^c$, $\emptyset$ is closed.
(2) $A, B$ closed implies that $A^c$ and $B^c$ are open, and by Definition 1.2.5, $A^c \cap B^c$ is open, so that $A \cup B = (A^c \cap B^c)^c$ is closed.
(3) Consider $A = \cap_{\lambda \in L} A_\lambda$, where $L$ is a collection of indices and $A_\lambda$ is closed, $\forall \lambda \in L$. We may write $A = (\cup_{\lambda \in L} A_\lambda^c)^c$ and since $A_\lambda^c$ is open $\forall \lambda \in L$ we have, by Definition 1.2.5, that $A$ is closed.

Definition 1.2.9 (Closure). Given $A \subset U$ we define the closure of $A$, denoted by $\bar{A}$, as the intersection of all closed sets that contain $A$.

Remark 1.2.10. From Proposition 1.2.8 Item 3 we have that $\bar{A}$ is the smallest closed set that contains $A$, in the sense that, if $C$ is closed and $A \subset C$ then $\bar{A} \subset C$.

Definition 1.2.11 (Interior). Given $A \subset U$ we define its interior, denoted by $A^\circ$, as the union of all open sets contained in $A$.

Remark 1.2.12. It is not difficult to prove that if $A$ is open then $A^\circ = A$.
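The definitions of closure and interior translate directly into set operations when the space is finite. The sketch below is an illustration under that assumption and is not taken from the original text; the helper names `closure` and `interior` and the sample topology are hypothetical.

```python
def interior(a, sigma):
    """Union of all open sets contained in a (Definition 1.2.11)."""
    a = frozenset(a)
    result = frozenset()
    for o in sigma:
        o = frozenset(o)
        if o <= a:
            result |= o
    return result

def closure(a, points, sigma):
    """Intersection of all closed sets that contain a (Definition 1.2.9)."""
    a, universe = frozenset(a), frozenset(points)
    result = universe  # the whole space is closed and contains a
    for o in sigma:
        closed = universe - frozenset(o)  # complement of an open set
        if a <= closed:
            result &= closed
    return result

U = {1, 2, 3}
sigma = [set(), {1}, {1, 2}, {1, 2, 3}]

print(sorted(closure({1}, U, sigma)))   # [1, 2, 3]
print(sorted(closure({3}, U, sigma)))   # [3]
print(sorted(interior({2, 3}, sigma)))  # []
print(sorted(interior({1, 2}, sigma)))  # [1, 2]
```

In this topology the closed sets are $U$, $\{2, 3\}$, $\{3\}$ and $\emptyset$, so the only closed set containing $\{1\}$ is $U$ itself, while the open set $\{1, 2\}$ coincides with its interior, in agreement with Remark 1.2.12.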

Definition 1.2.13 (Neighborhood). Given $u_0 \in U$ we say that $\mathcal{V}$ is a neighborhood of $u_0$ if such a set is open and contains $u_0$. We denote such neighborhoods by $\mathcal{V}_{u_0}$.

Proposition 1.2.14. If $A \subset U$ is a set such that for each $u \in A$ there exists a neighborhood $\mathcal{V}_u \ni u$ such that $\mathcal{V}_u \subset A$, then $A$ is open.

Proof. This follows from the fact that $A = \cup_{u \in A} \mathcal{V}_u$ and any arbitrary union of open sets is open.

Definition 1.2.15 (Function). Let $U$ and $V$ be two topological spaces. We say that $f : U \to V$ is a function if $f$ is a collection of pairs $(u, v) \in U \times V$ such that for each $u \in U$ there exists only one $v \in V$ such that $(u, v) \in f$.


Definition 1.2.16 (Continuity at a Point). A function $f : U \to V$ is continuous at $u \in U$ if for each neighborhood $\mathcal{V}_{f(u)} \subset V$ of $f(u)$ there exists a neighborhood $\mathcal{V}_u \subset U$ of $u$ such that $f(\mathcal{V}_u) \subset \mathcal{V}_{f(u)}$.

Definition 1.2.17 (Continuous Function). A function $f : U \to V$ is continuous if it is continuous at each $u \in U$.

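Proposition 1.2.18. A function $f : U \to V$ is continuous if and only if $f^{-1}(\mathcal{V})$ is open for each open $\mathcal{V} \subset V$, where

$$f^{-1}(\mathcal{V}) = \{u \in U \mid f(u) \in \mathcal{V}\}. \tag{1.2}$$

Proof. Suppose $f^{-1}(\mathcal{V})$ is open whenever $\mathcal{V} \subset V$ is open. Pick $u \in U$ and any open $\mathcal{V}$ such that $f(u) \in \mathcal{V}$. Since $u \in f^{-1}(\mathcal{V})$ and $f(f^{-1}(\mathcal{V})) \subset \mathcal{V}$, we have that $f$ is continuous at $u \in U$. Since $u \in U$ is arbitrary we have that $f$ is continuous. Conversely, suppose $f$ is continuous and pick $\mathcal{V} \subset V$ open. If $f^{-1}(\mathcal{V}) = \emptyset$ we are done, since $\emptyset$ is open. Thus, suppose $u \in f^{-1}(\mathcal{V})$. Since $f$ is continuous, there exists a neighborhood $\mathcal{V}_u$ of $u$ such that $f(\mathcal{V}_u) \subset \mathcal{V}$. This means $\mathcal{V}_u \subset f^{-1}(\mathcal{V})$ and therefore, from Proposition 1.2.14, $f^{-1}(\mathcal{V})$ is open.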
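The preimage criterion of Proposition 1.2.18 gives a finite test for continuity between finite topological spaces. The Python sketch below is an illustration only; the names `preimage` and `is_continuous`, the two topologies, and the maps `f` and `g` are hypothetical examples, not objects from the original text.

```python
def preimage(f, v):
    """The set {u : f(u) in v} of equation (1.2); f is a dict u -> f(u)."""
    return frozenset(u for u in f if f[u] in v)

def is_continuous(f, sigma_u, sigma_v):
    """f is continuous iff the preimage of every open set is open."""
    opens_u = {frozenset(s) for s in sigma_u}
    return all(preimage(f, frozenset(v)) in opens_u for v in sigma_v)

sigma_u = [set(), {1}, {1, 2}, {1, 2, 3}]  # a topology on U = {1, 2, 3}
sigma_v = [set(), {"a"}, {"a", "b"}]       # a topology on V = {a, b}

f = {1: "a", 2: "a", 3: "b"}
g = {1: "b", 2: "a", 3: "a"}

print(is_continuous(f, sigma_u, sigma_v))  # True
print(is_continuous(g, sigma_u, sigma_v))  # False
```

The map `g` fails because the preimage of the open set $\{a\}$ is $\{2, 3\}$, which is not open in the chosen topology on $U$.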

Definition 1.2.19. We say that $(U, \sigma)$ is a Hausdorff topological space if, given $u_1, u_2 \in U$, $u_1 \neq u_2$, there exist $\mathcal{V}_1, \mathcal{V}_2 \in \sigma$ such that

$$u_1 \in \mathcal{V}_1, \quad u_2 \in \mathcal{V}_2 \quad \text{and} \quad \mathcal{V}_1 \cap \mathcal{V}_2 = \emptyset. \tag{1.3}$$

Definition 1.2.20 (Base). A collection $\sigma' \subset \sigma$ is said to be a base for $\sigma$ if every element of $\sigma$ may be represented as a union of elements of $\sigma'$.

Definition 1.2.21 (Local Base). A collection $\hat{\sigma}$ of neighborhoods of a point $u \in U$ is said to be a local base at $u$ if each neighborhood of $u$ contains a member of $\hat{\sigma}$.

Definition 1.2.22 (Topological Vector Space). A vector space endowed with a topology, denoted by $(U, \sigma)$, is said to be a topological vector space if and only if

(1) every single point of $U$ is a closed set,
(2) the vector space operations (addition and scalar multiplication) are continuous with respect to $\sigma$.

More specifically, addition is continuous if, given $u, v \in U$ and $\mathcal{V} \in \sigma$ such that $u + v \in \mathcal{V}$, there exist $\mathcal{V}_u \ni u$ and $\mathcal{V}_v \ni v$ such that $\mathcal{V}_u + \mathcal{V}_v \subset \mathcal{V}$. On the other hand, scalar multiplication is continuous if, given $\alpha \in \mathbb{F}$, $u \in U$ and $\mathcal{V} \ni \alpha \cdot u$, there exist $\delta > 0$ and $\mathcal{V}_u \ni u$ such that, $\forall \beta \in \mathbb{F}$ satisfying $|\beta - \alpha| < \delta$, we have $\beta \mathcal{V}_u \subset \mathcal{V}$.

Given $(U, \sigma)$, let us associate with each $u_0 \in U$ and $\alpha_0 \in \mathbb{F}$ ($\alpha_0 \neq 0$) the functions $T_{u_0} : U \to U$ and $M_{\alpha_0} : U \to U$ defined by

$$T_{u_0}(u) = u_0 + u \tag{1.4}$$

and

$$M_{\alpha_0}(u) = \alpha_0 \cdot u. \tag{1.5}$$

The continuity of such functions is a straightforward consequence of the continuity of the vector space operations (addition and scalar multiplication). It is clear that the respective inverse maps, namely $T_{-u_0}$ and $M_{1/\alpha_0}$, are also continuous. So if $\mathcal{V}$ is open then $u_0 + \mathcal{V}$, that is, $(T_{-u_0})^{-1}(\mathcal{V}) = T_{u_0}(\mathcal{V}) = u_0 + \mathcal{V}$, is open. By analogy $\alpha_0 \mathcal{V}$ is open. Thus $\sigma$ is completely determined by a local base, so that the term local base will be understood henceforth as a local base at $\theta$. To summarize, a local base of a topological vector space is a collection $\Omega$ of neighborhoods of $\theta$, such that each neighborhood of $\theta$ contains a member of $\Omega$.

Now we present some simple results, namely:

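Proposition 1.2.23. If $A \subset U$ is open, then $\forall u \in A$ there exists a neighborhood $\mathcal{V}$ of $\theta$ such that $u + \mathcal{V} \subset A$.

Proof. Just take $\mathcal{V} = A - u$.

Proposition 1.2.24. Given a topological vector space $(U, \sigma)$, any element of $\sigma$ may be expressed as a union of translates of members of $\Omega$, so that the local base $\Omega$ generates the topology $\sigma$.

Proof. Let $A \subset U$ be open and $u \in A$. Then $\mathcal{V} = A - u$ is a neighborhood of $\theta$ and, by the definition of local base, there exists a set $\mathcal{V}_{\Omega u} \subset \mathcal{V}$ such that $\mathcal{V}_{\Omega u} \in \Omega$. Thus, we may write

$$A = \cup_{u \in A} (u + \mathcal{V}_{\Omega u}). \tag{1.6}$$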

1.3. Some Properties of Topological Vector Spaces

In this section we study some fundamental properties of topological vector spaces. We start with the following proposition:

Proposition 1.3.1. Any topological vector space $U$ is a Hausdorff space.

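Proof. Pick $u_0, u_1 \in U$ such that $u_0 \neq u_1$. Since $\{u_1\}$ is closed, $\mathcal{V} = (U \setminus \{u_1\}) - u_0$ is an open neighborhood of zero. As $\theta + \theta = \theta$, by the continuity of addition, there exist neighborhoods $\mathcal{V}_1$ and $\mathcal{V}_2$ of $\theta$ such that

$$\mathcal{V}_1 + \mathcal{V}_2 \subset \mathcal{V}. \tag{1.7}$$

Define $\mathcal{U} = \mathcal{V}_1 \cap \mathcal{V}_2 \cap (-\mathcal{V}_1) \cap (-\mathcal{V}_2)$. Thus $\mathcal{U} = -\mathcal{U}$ (symmetric) and $\mathcal{U} + \mathcal{U} \subset \mathcal{V}$, and hence

$$u_0 + \mathcal{U} + \mathcal{U} \subset u_0 + \mathcal{V} \subset U \setminus \{u_1\}, \tag{1.8}$$

so that

$$u_0 + v_1 + v_2 \neq u_1, \quad \forall v_1, v_2 \in \mathcal{U}, \tag{1.9}$$

or

$$u_0 + v_1 \neq u_1 - v_2, \quad \forall v_1, v_2 \in \mathcal{U}, \tag{1.10}$$

and since $\mathcal{U} = -\mathcal{U}$,

$$(u_0 + \mathcal{U}) \cap (u_1 + \mathcal{U}) = \emptyset. \tag{1.11}$$

Hence $u_0 + \mathcal{U}$ and $u_1 + \mathcal{U}$ are disjoint neighborhoods of $u_0$ and $u_1$, respectively.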

Definition 1.3.2 (Bounded Sets). A set $A \subset U$ is said to be bounded if to each neighborhood of zero $\mathcal{V}$ there corresponds a number $s > 0$ such that $A \subset t\mathcal{V}$ for each $t > s$.

Definition 1.3.3 (Convex Sets). A set $A \subset U$ such that

$$\text{if } u, v \in A \text{ then } \lambda u + (1 - \lambda) v \in A, \quad \forall \lambda \in [0, 1], \tag{1.12}$$

is said to be convex.

Definition 1.3.4 (Locally Convex Spaces). A topological vector space $U$ is said to be locally convex if there is a local base $\Omega$ whose elements are convex.

Definition 1.3.5 (Balanced sets). A set $A \subset U$ is said to be balanced if $\alpha A \subset A$, $\forall \alpha \in \mathbb{F}$ such that $|\alpha| \leq 1$.

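Theorem 1.3.6. In a topological vector space $U$ we have:

(1) every neighborhood of zero contains a balanced neighborhood of zero,
(2) every convex neighborhood of zero contains a balanced convex neighborhood of zero.

Proof.
(1) Suppose $\mathcal{U}$ is a neighborhood of zero. From the continuity of scalar multiplication, there exist $\mathcal{V}$ (a neighborhood of zero) and $\delta > 0$ such that $\alpha \mathcal{V} \subset \mathcal{U}$ whenever $|\alpha| < \delta$. Define $\mathcal{W} = \cup_{|\alpha| < \delta} \alpha \mathcal{V}$; thus $\mathcal{W} \subset \mathcal{U}$ is a balanced neighborhood of zero.

(2) Suppose $\mathcal{U}$ is a convex neighborhood of zero in $U$. Define

$$A = \cap \{\alpha \mathcal{U} \mid \alpha \in \mathbb{C}, |\alpha| = 1\}. \tag{1.13}$$

As $0 \cdot \theta = \theta$ (where $\theta \in U$ denotes the zero vector), from the continuity of scalar multiplication there exist $\delta > 0$ and a neighborhood of zero $\mathcal{V}$ such that if $|\beta| < \delta$ then $\beta \mathcal{V} \subset \mathcal{U}$. Define $\mathcal{W}$ as the union of all such $\beta \mathcal{V}$. Thus $\mathcal{W}$ is balanced and $\alpha^{-1} \mathcal{W} = \mathcal{W}$ as $|\alpha| = 1$, so that $\mathcal{W} = \alpha \mathcal{W} \subset \alpha \mathcal{U}$, and hence $\mathcal{W} \subset A$, which implies that the interior $A^\circ$ is a neighborhood of zero. Also $A^\circ \subset \mathcal{U}$. Since $A$ is an intersection of convex sets, it is convex and so is $A^\circ$. Now we will show that $A^\circ$ is balanced and complete the proof. For this, it suffices to prove that $A$ is balanced. Choose $r$ and $\beta$ such that $0 \leq r \leq 1$ and $|\beta| = 1$. Then

$$r \beta A = \cap_{|\alpha| = 1} r \beta \alpha \mathcal{U} = \cap_{|\alpha| = 1} r \alpha \mathcal{U}. \tag{1.14}$$

Since $\alpha \mathcal{U}$ is a convex set that contains zero, we obtain $r \alpha \mathcal{U} \subset \alpha \mathcal{U}$, so that $r \beta A \subset A$, which completes the proof.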

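Proposition 1.3.7. Let $U$ be a topological vector space and $\mathcal{V}$ a neighborhood of zero in $U$. Given $u \in U$, there exists $r \in \mathbb{R}^+$ such that $\beta u \in \mathcal{V}$, $\forall \beta$ such that $|\beta| < r$.

Proof. Observe that $u + \mathcal{V}$ is a neighborhood of $1 \cdot u$. Then, by the continuity of scalar multiplication, there exist a neighborhood $\mathcal{W}$ of $u$ and $r > 0$ such that

$$\beta \mathcal{W} \subset u + \mathcal{V}, \quad \forall \beta \text{ such that } |\beta - 1| < r, \tag{1.15}$$

so that

$$\beta u \in u + \mathcal{V}, \tag{1.16}$$

or

$$(\beta - 1) u \in \mathcal{V}, \quad \text{where } |\beta - 1| < r. \tag{1.17}$$

Since $\beta - 1$ ranges over all scalars of modulus less than $r$, relabeling $\beta - 1$ as $\beta$ we obtain

$$\beta u \in \mathcal{V}, \quad \forall \beta \text{ such that } |\beta| < r, \tag{1.18}$$

which completes the proof.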

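Corollary 1.3.8. Let $\mathcal{V}$ be a neighborhood of zero in $U$. If $\{r_n\}$ is a sequence such that $r_n > 0$, $\forall n \in \mathbb{N}$, and $\lim_{n \to \infty} r_n = \infty$, then $U \subset \cup_{n=1}^{\infty} r_n \mathcal{V}$.

Proof. Let $u \in U$. From the last proposition, $\alpha u \in \mathcal{V}$ for any $\alpha > 0$ sufficiently small, that is, $u \in \frac{1}{\alpha} \mathcal{V}$. As $r_n \to \infty$, for $n$ sufficiently large $\frac{1}{r_n}$ is such a sufficiently small scalar, so that $\frac{1}{r_n} u \in \mathcal{V}$, that is, $u \in r_n \mathcal{V}$, which completes the proof.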

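Proposition 1.3.9. Suppose $\{\delta_n\}$ is a sequence such that $\delta_n \to 0$, $\delta_n < \delta_{n-1}$, $\forall n \in \mathbb{N}$, and $\mathcal{V}$ is a bounded neighborhood of zero in $U$. Then $\{\delta_n \mathcal{V}\}$ is a local base for $U$.

Proof. Let $\mathcal{U}$ be a neighborhood of zero. As $\mathcal{V}$ is bounded, there exists $t_0 \in \mathbb{R}^+$ such that $\mathcal{V} \subset t \mathcal{U}$ for any $t > t_0$. As $\lim_{n \to \infty} \delta_n = 0$, there exists $n_0 \in \mathbb{N}$ such that if $n \geq n_0$ then $\delta_n < \frac{1}{t_0}$, so that $\delta_n \mathcal{V} \subset \mathcal{U}$, $\forall n$ such that $n \geq n_0$.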

Definition 1.3.10 (Convergence in topological vector spaces). Let $U$ be a topological vector space. We say that $\{u_n\}$ converges to $u_0 \in U$ if for each neighborhood $\mathcal{V}$ of $u_0$ there exists $N \in \mathbb{N}$ such that

$$u_n \in \mathcal{V}, \quad \forall n \geq N.$$


1.4. Compactness in Topological Vector Spaces

We start this section with the definition of open covering.

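Definition 1.4.1 (Open Covering). Given $B \subset U$ we say that $\{\mathcal{O}_\alpha, \alpha \in A\}$ is a covering of $B$ if $B \subset \cup_{\alpha \in A} \mathcal{O}_\alpha$. If $\mathcal{O}_\alpha$ is open $\forall \alpha \in A$ then $\{\mathcal{O}_\alpha\}$ is said to be an open covering of $B$.

Definition 1.4.2 (Compact Sets). A set $B \subset U$ is said to be compact if each open covering of $B$ has a finite sub-covering. More explicitly, if $B \subset \cup_{\alpha \in A} \mathcal{O}_\alpha$, where $\mathcal{O}_\alpha$ is open $\forall \alpha \in A$, then there exist $\alpha_1, \ldots, \alpha_n \in A$ such that $B \subset \mathcal{O}_{\alpha_1} \cup \ldots \cup \mathcal{O}_{\alpha_n}$, for some $n$, a finite positive integer.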

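Proposition 1.4.3. A compact subset of a Hausdorff space is closed.

Proof. Let $U$ be a Hausdorff space and consider $A \subset U$, $A$ compact. Given $x \in A$ and $y \in A^c$, there exist open sets $\mathcal{O}_x$ and $\mathcal{O}_y^x$ such that $x \in \mathcal{O}_x$, $y \in \mathcal{O}_y^x$ and $\mathcal{O}_x \cap \mathcal{O}_y^x = \emptyset$. It is clear that $A \subset \cup_{x \in A} \mathcal{O}_x$ and since $A$ is compact, we may find $\{x_1, x_2, \ldots, x_n\}$ such that $A \subset \cup_{i=1}^{n} \mathcal{O}_{x_i}$. For the selected $y \in A^c$ we have $y \in \cap_{i=1}^{n} \mathcal{O}_y^{x_i}$ and $(\cap_{i=1}^{n} \mathcal{O}_y^{x_i}) \cap (\cup_{i=1}^{n} \mathcal{O}_{x_i}) = \emptyset$, so that $\cap_{i=1}^{n} \mathcal{O}_y^{x_i} \subset A^c$. Since $\cap_{i=1}^{n} \mathcal{O}_y^{x_i}$ is open, and $y$ is an arbitrary point of $A^c$, we have that $A^c$ is open, so that $A$ is closed, which completes the proof.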

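Proposition 1.4.4. A closed subset of a compact space $U$ is compact.

Proof. Let $A \subset U$ be closed and consider $\{\mathcal{O}_\alpha, \alpha \in L\}$ an open cover of $A$. Since $A^c$ is open, $\{A^c, \mathcal{O}_\alpha, \alpha \in L\}$ is an open cover of $U$. As $U$ is compact, there exist $\alpha_1, \alpha_2, \ldots, \alpha_n$ such that $A^c \cup (\cup_{i=1}^{n} \mathcal{O}_{\alpha_i}) \supset U$, so that $\{\mathcal{O}_{\alpha_i}, i \in \{1, \ldots, n\}\}$ covers $A$, and hence $A$ is compact. The proof is complete.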

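Definition 1.4.5 (Countably Compact Sets). A set $A$ is said to be countably compact if every infinite subset of $A$ has a limit point in $A$.

Proposition 1.4.6. Every compact subset of a topological space $U$ is countably compact.

Proof. Let $A \subset U$ be compact, let $B$ be an infinite subset of $A$ and suppose, to obtain a contradiction, that $B$ has no limit point in $A$. Choose distinct points $\{x_1, x_2, \ldots\} \subset B$ and define $F = \{x_1, x_2, x_3, \ldots\}$. It is clear that $F$ has no limit point in $A$. Thus for each $n \in \mathbb{N}$, there exists an open set $\mathcal{O}_n$ such that $\mathcal{O}_n \cap F = \{x_n\}$. Also, for each $x \in A \setminus F$, there exists an open set $\mathcal{O}_x$ such that $x \in \mathcal{O}_x$ and $\mathcal{O}_x \cap F = \emptyset$. Thus $\{\mathcal{O}_x, x \in A \setminus F; \mathcal{O}_1, \mathcal{O}_2, \ldots\}$ is an open cover of $A$ without a finite subcover, which contradicts the fact that $A$ is compact.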

1.5. Normed and Metric Spaces

The idea here is to prepare a route for the study of Banach spaces defined below. We start with the definition of norm.

Definition 1.5.1 (Norm). A vector space $U$ is said to be a normed space if it is possible to define a function $\|\cdot\|_U : U \to \mathbb{R}^+ = [0, +\infty)$, called a norm, which satisfies the following properties:

(1) $\|u\|_U > 0$ if $u \neq \theta$, and $\|u\|_U = 0 \Leftrightarrow u = \theta$,
(2) $\|u + v\|_U \leq \|u\|_U + \|v\|_U$, $\forall u, v \in U$,
(3) $\|\alpha u\|_U = |\alpha| \|u\|_U$, $\forall u \in U$, $\alpha \in \mathbb{F}$.

Now we present the definition of metric.

Definition 1.5.2 (Metric Space). A vector space $U$ is said to be a metric space if it is possible to define a function $d : U \times U \to \mathbb{R}^+$, called a metric on $U$, such that

(1) $0 \leq d(u, v) < \infty$, $\forall u, v \in U$,
(2) $d(u, v) = 0 \Leftrightarrow u = v$,
(3) $d(u, v) = d(v, u)$, $\forall u, v \in U$,
(4) $d(u, w) \leq d(u, v) + d(v, w)$, $\forall u, v, w \in U$.

A metric can be defined through a norm, that is

$$d(u, v) = \|u - v\|_U. \tag{1.19}$$

In this case we say that the metric is induced by the norm.

The set $B_r(u) = \{v \in U \mid d(u, v) < r\}$ is called the open ball with center at $u$ and radius $r$. A metric $d : U \times U \to \mathbb{R}^+$ is said to be invariant if

$$d(u + w, v + w) = d(u, v), \quad \forall u, v, w \in U. \tag{1.20}$$
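A metric induced by a norm through (1.19) is always invariant in the sense of (1.20), since $(u + w) - (v + w) = u - v$. The following Python sketch illustrates this numerically for the Euclidean norm on $\mathbb{R}^3$; it is an illustration only, the function names and the choice of norm are assumptions of the sketch, and a small tolerance absorbs floating-point rounding.

```python
import math
import random

def norm(u):
    """Euclidean norm on R^n (chosen here only as an example)."""
    return math.sqrt(sum(x * x for x in u))

def d(u, v):
    """Metric induced by the norm, as in equation (1.19)."""
    return norm([x - y for x, y in zip(u, v)])

def add(u, v):
    return [x + y for x, y in zip(u, v)]

def check_metric_axioms(trials=1000, dim=3, tol=1e-12, seed=0):
    """Sample random vectors and test symmetry, the triangle
    inequality and translation invariance (1.20) up to tol."""
    rng = random.Random(seed)
    rand_vec = lambda: [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    for _ in range(trials):
        u, v, w = rand_vec(), rand_vec(), rand_vec()
        if abs(d(u, v) - d(v, u)) > tol:
            return False
        if d(u, w) > d(u, v) + d(v, w) + tol:
            return False
        if abs(d(add(u, w), add(v, w)) - d(u, v)) > tol:
            return False
    return True

print(check_metric_axioms())
```

A random test of this kind cannot prove the axioms, but it is a quick sanity check when experimenting with other candidate norms.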

The following are some basic definitions concerning metric and normed spaces:

Definition 1.5.3 (Convergent Sequences). Given a metric space $U$, we say that $\{u_n\} \subset U$ converges to $u_0 \in U$ as $n \to \infty$ if for each $\varepsilon > 0$ there exists $n_0 \in \mathbb{N}$ such that if $n \geq n_0$ then $d(u_n, u_0) < \varepsilon$. In this case we write $u_n \to u_0$ as $n \to +\infty$.

Definition 1.5.4 (Cauchy Sequence). $\{u_n\} \subset U$ is said to be a Cauchy sequence if for each $\varepsilon > 0$ there exists $n_0 \in \mathbb{N}$ such that $d(u_n, u_m) < \varepsilon$, $\forall m, n \geq n_0$.

Definition 1.5.5 (Completeness). A metric space $U$ is said to be complete if each Cauchy sequence related to $d : U \times U \to \mathbb{R}^+$ converges to an element of $U$.

Definition 1.5.6 (Banach Spaces). A normed vector space $U$ is said to be a Banach Space if each Cauchy sequence related to the metric induced by the norm converges to an element of $U$.

Remark 1.5.7. We say that a topology $\sigma$ is compatible with a metric $d$ if any $A \in \sigma$ is represented by unions and/or finite intersections of open balls. In this case we say that $d : U \times U \to \mathbb{R}^+$ induces the topology $\sigma$.

Definition 1.5.8 (Metrizable Spaces). A topological vector space $(U, \sigma)$ is said to be metrizable if $\sigma$ is compatible with some metric $d$.

Definition 1.5.9 (Normable Spaces). A topological vector space $(U, \sigma)$ is said to be normable if there exists a norm on $U$ whose induced metric is compatible with $\sigma$.

1.6. Compactness in Metric Spaces

Definition 1.6.1 (Diameter of a Set). Let $(U, d)$ be a metric space and $A \subset U$. We define the diameter of $A$, denoted by $\mathrm{diam}(A)$, by

$$\mathrm{diam}(A) = \sup\{d(u, v) \mid u, v \in A\}.$$

Definition 1.6.2. Let $(U, d)$ be a metric space. We say that $\{F_k\} \subset U$ is a nested sequence of sets if

$$F_1 \supset F_2 \supset F_3 \supset \ldots$$


Theorem 1.6.3. If $(U, d)$ is a complete metric space then every nested sequence of non-empty closed sets $\{F_k\}$ such that

$$\lim_{k \to +\infty} \mathrm{diam}(F_k) = 0$$

has non-empty intersection, that is

$$\bigcap_{k=1}^{\infty} F_k \neq \emptyset.$$

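Proof. Suppose $\{F_k\}$ is a nested sequence of non-empty closed sets and $\lim_{k \to \infty} \mathrm{diam}(F_k) = 0$. For each $n \in \mathbb{N}$ select $u_n \in F_n$. Let $\varepsilon > 0$ be given. Since

$$\lim_{n \to \infty} \mathrm{diam}(F_n) = 0,$$

there exists $N \in \mathbb{N}$ such that if $n \ge N$ then

$$\mathrm{diam}(F_n) < \varepsilon.$$

Thus if $m, n > N$ we have $u_m, u_n \in F_N$ so that

$$d(u_n, u_m) \le \mathrm{diam}(F_N) < \varepsilon.$$

Hence $\{u_n\}$ is a Cauchy sequence. Since $U$ is complete, there exists $u \in U$ such that

$$u_n \to u \text{ as } n \to \infty.$$

Choose $m \in \mathbb{N}$. We have that $u_n \in F_m$, $\forall n > m$, so that, $F_m$ being closed,

$$u \in \overline{F_m} = F_m.$$

Since $m \in \mathbb{N}$ is arbitrary we obtain

$$u \in \bigcap_{m=1}^{\infty} F_m.$$

The proof is complete.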
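The completeness hypothesis in Theorem 1.6.3 cannot be dropped. The following standard example, which is not part of the original text, takes $U = \mathbb{Q}$ with the usual metric; the sets $F_k$ are an illustrative choice.

```latex
% Nested, non-empty, closed subsets of the (incomplete) metric space Q:
F_k = \{\, q \in \mathbb{Q} \;:\; |q - \sqrt{2}| \le 1/k \,\}, \qquad k \in \mathbb{N},
\qquad \mathrm{diam}(F_k) \le 2/k \to 0 .
% Any point of the intersection would have to equal \sqrt{2}, which is irrational:
\bigcap_{k=1}^{\infty} F_k
  = \{\, q \in \mathbb{Q} \;:\; q = \sqrt{2} \,\} = \emptyset .
```

In the proof above, the selected points $u_n \in F_n$ still form a Cauchy sequence, but it has no limit in $\mathbb{Q}$.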

Theorem 1.6.4. Let $(U, d)$ be a metric space. If $A \subset U$ is compact then it is closed and bounded.

Proof. We have already proved that $A$ is closed. Suppose, to obtain contradiction, that $A$ is not bounded. Thus for each $K \in \mathbb{N}$ there exist $u, v \in A$ such that

$$d(u, v) > K.$$

Observe that

$$A \subset \bigcup_{u \in A} B_1(u).$$

Since $A$ is compact there exist $u_1, u_2, \ldots, u_n \in A$ such that

$$A \subset \bigcup_{k=1}^{n} B_1(u_k).$$

Define

$$R = \max\{d(u_i, u_j) \mid i, j \in \{1, \ldots, n\}\}.$$

Choose $u, v \in A$ such that

$$d(u, v) > R + 2. \tag{1.21}$$

Observe that there exist $i, j \in \{1, \ldots, n\}$ such that

$$u \in B_1(u_i), \quad v \in B_1(u_j).$$

Thus

$$d(u, v) \le d(u, u_i) + d(u_i, u_j) + d(u_j, v) \le 2 + R, \tag{1.22}$$

which contradicts (1.21). This completes the proof.

Definition 1.6.5 (Relative Compactness). In a metric space $(U, d)$ a set $A \subset U$ is said to be relatively compact if its closure $\overline{A}$ is compact.

Definition 1.6.6 (ε-nets). Let $(U, d)$ be a metric space. A set $N \subset U$ is said to be an $\varepsilon$-net with respect to a set $A \subset U$ if for each $u \in A$ there exists $v \in N$ such that

$$d(u, v) < \varepsilon.$$

Definition 1.6.7. Let $(U, d)$ be a metric space. A set $A \subset U$ is said to be totally bounded if for each $\varepsilon > 0$ there exists a finite $\varepsilon$-net with respect to $A$.

Proposition 1.6.8. Let $(U, d)$ be a metric space. If $A \subset U$ is totally bounded then it is bounded.

Proof. Choose $u, v \in A$. Let $\{u_1, \ldots, u_n\}$ be a finite 1-net with respect to $A$. Define

$$R = \max\{d(u_i, u_j) \mid i, j \in \{1, \ldots, n\}\}.$$

Observe that there exist $i, j \in \{1, \ldots, n\}$ such that

$$d(u, u_i) < 1, \quad d(v, u_j) < 1.$$

Thus

$$d(u, v) \le d(u, u_i) + d(u_i, u_j) + d(u_j, v) \le R + 2. \tag{1.23}$$

Since $u, v \in A$ are arbitrary, $A$ is bounded.


Theorem 1.6.9. Let $(U, d)$ be a metric space and $A \subset U$. If from each sequence $\{u_n\} \subset A$ we can select a convergent subsequence $\{u_{n_k}\}$, then $A$ is totally bounded.

Proof. Suppose, to obtain contradiction, that $A$ is not totally bounded. Thus there exists $\varepsilon_0 > 0$ such that there exists no finite $\varepsilon_0$-net with respect to $A$. Choose $u_1 \in A$; since $\{u_1\}$ is not an $\varepsilon_0$-net, there exists $u_2 \in A$ such that

$$d(u_1, u_2) \ge \varepsilon_0.$$

Again $\{u_1, u_2\}$ is not an $\varepsilon_0$-net for $A$, so that there exists $u_3 \in A$ such that

$$d(u_1, u_3) \ge \varepsilon_0 \text{ and } d(u_2, u_3) \ge \varepsilon_0.$$

Proceeding in this fashion we can obtain a sequence $\{u_n\}$ such that

$$d(u_n, u_m) \ge \varepsilon_0, \text{ if } m \neq n. \tag{1.24}$$

Clearly we cannot extract a convergent subsequence of $\{u_n\}$, otherwise such a subsequence would be Cauchy, contradicting (1.24). The proof is complete.
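The selection used in this proof can be tried out on a finite point set. The sketch below is only an illustration, not part of the original argument: the function name `greedy_separated_points`, the grid of sample points and the value of `eps` are all assumptions. It keeps a point only if it is at distance at least `eps` from every point already kept, and then checks that the kept points form a finite `eps`-net for the sample.

```python
import math

def greedy_separated_points(points, eps):
    """Keep a point only if it is at distance >= eps from all points kept so far."""
    chosen = []
    for p in points:
        if all(math.dist(p, q) >= eps for q in chosen):
            chosen.append(p)
    return chosen

# Hypothetical sample set: an 11 x 11 grid in the unit square.
grid = [(i / 10, j / 10) for i in range(11) for j in range(11)]
eps = 0.35
net = greedy_separated_points(grid, eps)

# A rejected point is closer than eps to some kept point, so `net` is an eps-net.
is_net = all(any(math.dist(p, q) < eps for q in net) for p in grid)
print(len(net), is_net)
```

On a finite (hence totally bounded) set the selection always stops; the proof exploits the fact that, when no finite net exists, it never stops and produces a sequence without Cauchy subsequences.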

Definition 1.6.10 (Sequentially Compact Sets). Let $(U, d)$ be a metric space. A set $A \subset U$ is said to be sequentially compact if for each sequence $\{u_n\} \subset A$ there exist a subsequence $\{u_{n_k}\}$ and $u \in A$ such that

$$u_{n_k} \to u, \text{ as } k \to \infty.$$

Theorem 1.6.11. A subset $A$ of a metric space $(U, d)$ is compact if and only if it is sequentially compact.

Proof. Suppose $A$ is compact. By Proposition 1.4.6, $A$ is countably compact. Let $\{u_n\} \subset A$ be a sequence. We have two situations to consider.

(1) $\{u_n\}$ has infinitely many equal terms, that is, in this case we have

$$u_{n_1} = u_{n_2} = \cdots = u_{n_k} = \cdots = u \in A.$$

Thus the result follows trivially.

(2) $\{u_n\}$ has infinitely many distinct terms. In such a case, $A$ being countably compact, $\{u_n\}$ has a limit point in $A$, so that there exist a subsequence $\{u_{n_k}\}$ and $u \in A$ such that

$$u_{n_k} \to u, \text{ as } k \to \infty.$$

In both cases we may find a subsequence converging to some $u \in A$. Thus $A$ is sequentially compact.

Conversely suppose $A$ is sequentially compact, and suppose $\{G_\alpha, \alpha \in L\}$ is an open cover of $A$. For each $u \in A$ define

$$\delta(u) = \sup\{r \mid B_r(u) \subset G_\alpha, \text{ for some } \alpha \in L\}.$$

First we prove that $\delta(u) > 0$, $\forall u \in A$. Choose $u \in A$. Since $A \subset \bigcup_{\alpha \in L} G_\alpha$, there exists $\alpha_0 \in L$ such that $u \in G_{\alpha_0}$. $G_{\alpha_0}$ being open, there exists $r_0 > 0$ such that $B_{r_0}(u) \subset G_{\alpha_0}$.

Thus

$$\delta(u) \ge r_0 > 0.$$

Now define $\delta_0$ by

$$\delta_0 = \inf\{\delta(u) \mid u \in A\}.$$

Therefore, there exists a sequence $\{u_n\} \subset A$ such that

$$\delta(u_n) \to \delta_0 \text{ as } n \to \infty.$$

Since $A$ is sequentially compact, we may obtain a subsequence $\{u_{n_k}\}$ and $u_0 \in A$ such that

$$\delta(u_{n_k}) \to \delta_0 \text{ and } u_{n_k} \to u_0,$$

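as $k \to \infty$. Therefore, we may find $K_0 \in \mathbb{N}$ such that if $k > K_0$ then

$$d(u_{n_k}, u_0) < \frac{\delta(u_0)}{4}. \tag{1.25}$$

We claim that

$$\delta(u_{n_k}) \ge \frac{\delta(u_0)}{4}, \text{ if } k > K_0.$$

To prove the claim, fix $k > K_0$ and suppose

$$z \in B_{\delta(u_0)/4}(u_{n_k})$$

(observe that in particular, from (1.25), $u_0 \in B_{\delta(u_0)/4}(u_{n_k})$, $\forall k > K_0$).

Since

$$\frac{\delta(u_0)}{2} < \delta(u_0),$$

there exists some $\alpha_1 \in L$ such that

$$B_{\delta(u_0)/2}(u_0) \subset G_{\alpha_1}.$$

However, since

$$d(u_{n_k}, u_0) < \frac{\delta(u_0)}{4}, \text{ if } k > K_0,$$

we have $d(z, u_0) \le d(z, u_{n_k}) + d(u_{n_k}, u_0) < \delta(u_0)/2$, and we obtain

$$B_{\delta(u_0)/2}(u_0) \supset B_{\delta(u_0)/4}(u_{n_k}), \text{ if } k > K_0,$$

so that

$$\delta(u_{n_k}) \ge \frac{\delta(u_0)}{4}, \quad \forall k > K_0.$$

Therefore

$$\lim_{k \to \infty} \delta(u_{n_k}) = \delta_0 \ge \frac{\delta(u_0)}{4} > 0.$$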

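Choose $\varepsilon > 0$ such that

$$\delta_0 > \varepsilon > 0.$$

From Theorem 1.6.9, since $A$ is sequentially compact, it is totally bounded. For the $\varepsilon > 0$ chosen above, consider a finite $\varepsilon$-net contained in $A$ (the fact that the $\varepsilon$-net may be chosen contained in $A$ is also a consequence of the proof of Theorem 1.6.9) and denote it by $N$, that is,

$$N = \{v_1, \ldots, v_n\} \subset A.$$

Since $\delta_0 > \varepsilon$, there exist

$$\alpha_1, \ldots, \alpha_n \in L$$

such that

$$B_\varepsilon(v_i) \subset G_{\alpha_i}, \quad \forall i \in \{1, \ldots, n\},$$

considering that

$$\delta(v_i) \ge \delta_0 > \varepsilon > 0, \quad \forall i \in \{1, \ldots, n\}.$$

For $u \in A$, since $N$ is an $\varepsilon$-net we have

$$u \in \bigcup_{i=1}^{n} B_\varepsilon(v_i) \subset \bigcup_{i=1}^{n} G_{\alpha_i}.$$

Since $u \in A$ is arbitrary we obtain

$$A \subset \bigcup_{i=1}^{n} G_{\alpha_i}.$$

Thus

$$\{G_{\alpha_1}, \ldots, G_{\alpha_n}\}$$

is a finite subcover for $A$ of

$$\{G_\alpha, \alpha \in L\}.$$

Hence $A$ is compact. The proof is complete.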

Theorem 1.6.12. Let $(U, d)$ be a metric space. Then $A \subset U$ is relatively compact if and only if from each sequence in $A$ we may select a convergent subsequence.

Proof. Suppose $A$ is relatively compact. Thus $\overline{A}$ is compact so that, from the last theorem, $\overline{A}$ is sequentially compact.

Thus from each sequence in $\overline{A}$ we may select a subsequence which converges to some element of $\overline{A}$. In particular, for each sequence in $A \subset \overline{A}$ we may select a subsequence that converges to some element of $\overline{A}$.

Conversely, suppose that for each sequence in $A$ we may select a convergent subsequence. It suffices to prove that $\overline{A}$ is sequentially compact. Let $\{v_n\}$ be a sequence in $\overline{A}$. Since $A$ is dense in $\overline{A}$, there exists a sequence $\{u_n\} \subset A$ such that

$$d(u_n, v_n) < \frac{1}{n}.$$

From the hypothesis we may obtain a subsequence $\{u_{n_k}\}$ and $u_0 \in \overline{A}$ such that

$$u_{n_k} \to u_0, \text{ as } k \to \infty.$$

Thus,

$$v_{n_k} \to u_0 \in \overline{A}, \text{ as } k \to \infty.$$

Therefore $\overline{A}$ is sequentially compact, so that it is compact.

Theorem 1.6.13. Let $(U, d)$ be a metric space.

(1) If $A \subset U$ is relatively compact then it is totally bounded.
(2) If $(U, d)$ is a complete metric space and $A \subset U$ is totally bounded then $A$ is relatively compact.

Proof.

(1) Suppose $A \subset U$ is relatively compact. From the last theorem, from each sequence in $A$ we can extract a convergent subsequence. From Theorem 1.6.9, $A$ is totally bounded.

(2) Let $(U, d)$ be a complete metric space and let $A$ be a totally bounded subset of $U$.

Let $\{u_n\}$ be a sequence in $A$. Since $A$ is totally bounded, for each $k \in \mathbb{N}$ we find a finite $\varepsilon_k$-net, where $\varepsilon_k = 1/k$, denoted by $N_k$, where

$$N_k = \{v_1^{(k)}, v_2^{(k)}, \ldots, v_{n_k}^{(k)}\}.$$

In particular for $k = 1$, $\{u_n\}$ is contained in the union of the balls of radius 1 centered at the points of the 1-net $N_1$. Thus at least one of these balls contains $u_n$ for infinitely many indices $n$. Let us select a subsequence $\{u^{(1)}_{n_k}\}_{k \in \mathbb{N}}$ contained in this ball of radius 1. Similarly, we may select a subsequence, here just partially relabeled, $\{u^{(2)}_{n_l}\}_{l \in \mathbb{N}}$ of $\{u^{(1)}_{n_k}\}$ which is contained in one of the balls of radius $1/2$ centered at the points of the $\frac{1}{2}$-net. Proceeding in this fashion, for each $k \in \mathbb{N}$ we may find a subsequence, denoted by $\{u^{(k)}_{n_m}\}_{m \in \mathbb{N}}$, of the original sequence contained in a ball of radius $1/k$.

Now consider the diagonal sequence denoted by $\{u^{(k)}_{n_k}\}_{k \in \mathbb{N}} = \{z_k\}$. Thus

$$d(z_n, z_m) < \frac{2}{k}, \text{ if } m, n > k,$$

that is, $\{z_k\}$ is a Cauchy sequence, and since $(U, d)$ is complete, there exists $u \in U$ such that

$$z_k \to u \text{ as } k \to \infty.$$

Hence each sequence in $A$ has a convergent subsequence and, from Theorem 1.6.12, $A$ is relatively compact. The proof is complete.

1.7. The Arzela-Ascoli Theorem

In this section we present a classical result in analysis, namely the Arzela-Ascoli theorem.

Definition 1.7.1 (Equi-continuity). Let $\mathcal{F}$ be a collection of complex functions defined on a metric space $(U, d)$. We say that $\mathcal{F}$ is equicontinuous if for each $\varepsilon > 0$ there exists $\delta > 0$ such that if $u, v \in U$ and $d(u, v) < \delta$ then

$$|f(u) - f(v)| < \varepsilon, \quad \forall f \in \mathcal{F}.$$

Furthermore, we say that $\mathcal{F}$ is point-wise bounded if for each $u \in U$ there exists $M(u) \in \mathbb{R}$ such that

$$|f(u)| < M(u), \quad \forall f \in \mathcal{F}.$$

Theorem 1.7.2 (Arzela-Ascoli). Suppose $\mathcal{F}$ is a point-wise bounded equicontinuous collection of complex functions defined on a metric space $(U, d)$. Also suppose that $U$ has a countable dense subset $E$. Then each sequence $\{f_n\} \subset \mathcal{F}$ has a subsequence that converges uniformly on every compact subset of $U$.

Proof. Let $E = \{u_n\}$ be a countable dense set in $(U, d)$. By hypothesis, $\{f_n(u_1)\}$ is a bounded sequence, therefore it has a convergent subsequence, which is denoted by $\{f_{n_k}(u_1)\}$. Let us denote

$$f_{n_k}(u_1) = f_{1,k}(u_1), \quad \forall k \in \mathbb{N}.$$

Thus there exists $g_1 \in \mathbb{C}$ such that

$$f_{1,k}(u_1) \to g_1, \text{ as } k \to \infty.$$

Observe that $\{f_{1,k}(u_2)\}$ is also bounded, so that it also has a convergent subsequence, which similarly as above we will denote by $\{f_{2,k}(u_2)\}$. Again there exists $g_2 \in \mathbb{C}$ such that

$$f_{2,k}(u_1) \to g_1, \text{ as } k \to \infty,$$

$$f_{2,k}(u_2) \to g_2, \text{ as } k \to \infty.$$

Proceeding in this fashion, for each $m \in \mathbb{N}$ we may obtain $\{f_{m,k}\}$ such that

$$f_{m,k}(u_j) \to g_j, \text{ as } k \to \infty, \quad \forall j \in \{1, \ldots, m\},$$

where the set $\{g_1, g_2, \ldots, g_m\}$ is obtained as above. Consider the diagonal sequence

$$\{f_{k,k}\},$$

and observe that the sequence

$$\{f_{k,k}(u_m)\}_{k > m}$$

is such that

$$f_{k,k}(u_m) \to g_m \in \mathbb{C}, \text{ as } k \to \infty, \quad \forall m \in \mathbb{N}.$$

Therefore we may conclude that from $\{f_n\}$ we may extract a subsequence, also denoted by

$$\{f_{n_k}\} = \{f_{k,k}\},$$

which is convergent at every point of

$$E = \{u_n\}_{n \in \mathbb{N}}.$$


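Now suppose $K \subset U$ is compact and let $\varepsilon > 0$ be given. From the equi-continuity hypothesis there exists $\delta > 0$ such that if $u, v \in U$ and $d(u, v) < \delta$ we have

$$|f_{n_k}(u) - f_{n_k}(v)| < \frac{\varepsilon}{3}, \quad \forall k \in \mathbb{N}.$$

Observe that

$$K \subset \bigcup_{u \in K} B_{\delta/2}(u),$$

and, $K$ being compact, we may find $\{u_1, \ldots, u_M\} \subset K$ such that

$$K \subset \bigcup_{j=1}^{M} B_{\delta/2}(u_j).$$

Since $E$ is dense in $U$, there exists

$$v_j \in B_{\delta/2}(u_j) \cap E, \quad \forall j \in \{1, \ldots, M\}.$$

Fixing $j \in \{1, \ldots, M\}$, from $v_j \in E$ we obtain that

$$\lim_{k \to \infty} f_{n_k}(v_j)$$

exists. Hence there exists $K_{0_j} \in \mathbb{N}$ such that if $k, l > K_{0_j}$ then

$$|f_{n_k}(v_j) - f_{n_l}(v_j)| < \frac{\varepsilon}{3}.$$

Pick $u \in K$, thus

$$u \in B_{\delta/2}(u_j)$$

for some $j \in \{1, \ldots, M\}$, so that

$$d(u, v_j) \le d(u, u_j) + d(u_j, v_j) < \delta.$$

Therefore if

$$k, l > \max\{K_{0_1}, \ldots, K_{0_M}\},$$

then

$$|f_{n_k}(u) - f_{n_l}(u)| \le |f_{n_k}(u) - f_{n_k}(v_j)| + |f_{n_k}(v_j) - f_{n_l}(v_j)| + |f_{n_l}(v_j) - f_{n_l}(u)| \le \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon. \tag{1.26}$$

Since $u \in K$ is arbitrary, we conclude that $\{f_{n_k}\}$ is uniformly Cauchy on $K$, and therefore uniformly convergent on $K$. The proof is complete.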
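The role of equi-continuity can be seen in a standard example, added here only as an illustration and not taken from the original text: on $U = [0, 1]$ the family $f_n(x) = x^n$ is point-wise bounded but not equicontinuous, and no subsequence converges uniformly on the compact set $[0, 1]$.

```latex
f_n(x) = x^n, \quad x \in [0,1], \qquad |f_n(x)| \le 1 \quad \forall n \in \mathbb{N}.
% Equi-continuity fails: the points 1 and 1 - 1/n get arbitrarily close, but
f_n(1) - f_n\!\left(1 - \tfrac{1}{n}\right)
  = 1 - \left(1 - \tfrac{1}{n}\right)^{n} \;\longrightarrow\; 1 - e^{-1} > 0 .
% The point-wise limit is discontinuous, so it cannot be a uniform limit
% of (a subsequence of) the continuous functions f_n:
\lim_{n \to \infty} f_n(x) =
  \begin{cases} 0, & 0 \le x < 1, \\ 1, & x = 1. \end{cases}
```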


1.8. Linear Mappings

Given $U, V$ topological vector spaces, a function (mapping) $f : U \to V$, $A \subset U$ and $B \subset V$, we define:

$$f(A) = \{f(u) \mid u \in A\}, \tag{1.27}$$

and the inverse image of $B$, denoted $f^{-1}(B)$, as

$$f^{-1}(B) = \{u \in U \mid f(u) \in B\}. \tag{1.28}$$

Definition 1.8.1 (Linear Functions). A function $f : U \to V$ is said to be linear if

$$f(\alpha u + \beta v) = \alpha f(u) + \beta f(v), \quad \forall u, v \in U, \ \alpha, \beta \in \mathbb{F}. \tag{1.29}$$

Definition 1.8.2 (Null Space and Range). Given $f : U \to V$, we define the null space and the range of $f$, denoted by $N(f)$ and $R(f)$ respectively, as

$$N(f) = \{u \in U \mid f(u) = \theta\} \tag{1.30}$$

and

$$R(f) = \{v \in V \mid \exists u \in U \text{ such that } f(u) = v\}. \tag{1.31}$$

Note that if $f$ is linear then $N(f)$ is a subspace of $U$ and $R(f)$ is a subspace of $V$.

Proposition 1.8.3. Let $U, V$ be topological vector spaces. If $f : U \to V$ is linear and continuous at $\theta$, then it is continuous everywhere.

Proof. Since $f$ is linear we have $f(\theta) = \theta$. Since $f$ is continuous at $\theta$, given a neighborhood of zero $\mathcal{V} \subset V$, there exists a neighborhood of zero $\mathcal{U} \subset U$ such that

$$f(\mathcal{U}) \subset \mathcal{V}. \tag{1.32}$$

Thus

$$v - u \in \mathcal{U} \Rightarrow f(v - u) = f(v) - f(u) \in \mathcal{V}, \tag{1.33}$$

or

$$v \in u + \mathcal{U} \Rightarrow f(v) \in f(u) + \mathcal{V}, \tag{1.34}$$

which means that $f$ is continuous at $u$. Since $u$ is arbitrary, $f$ is continuous everywhere.


1.9. Linearity and Continuity

Definition 1.9.1 (Bounded Functions). A function f : U → Vis said to be bounded if it maps bounded sets into bounded sets.

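Proposition 1.9.2. A set $E \subset U$ is bounded if and only if the following condition is satisfied: whenever $\{u_n\} \subset E$ and $\{\alpha_n\} \subset \mathbb{F}$ are such that $\alpha_n \to 0$ as $n \to \infty$, we have $\alpha_n u_n \to \theta$ as $n \to \infty$.

Proof. Suppose $E$ is bounded. Let $\mathcal{U}$ be a balanced neighborhood of $\theta$ in $U$; then $E \subset t\mathcal{U}$ for some $t > 0$. For $\{u_n\} \subset E$, as $\alpha_n \to 0$, there exists $N$ such that if $n > N$ then $t|\alpha_n| < 1$. Since $t^{-1}E \subset \mathcal{U}$ and $\mathcal{U}$ is balanced, we have that $\alpha_n u_n = (t\alpha_n)(t^{-1}u_n) \in \mathcal{U}$, $\forall n > N$, and thus $\alpha_n u_n \to \theta$. Conversely, if $E$ is not bounded, there is a neighborhood $\mathcal{V}$ of $\theta$ and $\{r_n\}$ such that $r_n \to \infty$ and $E$ is not contained in $r_n\mathcal{V}$, that is, we can choose $u_n \in E$ such that $r_n^{-1}u_n$ is not in $\mathcal{V}$, $\forall n \in \mathbb{N}$, so that $\{r_n^{-1}u_n\}$ does not converge to $\theta$.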

Proposition 1.9.3. Let $f : U \to V$ be a linear function. Consider the following statements:

(1) $f$ is continuous,
(2) $f$ is bounded,
(3) if $u_n \to \theta$ then $\{f(u_n)\}$ is bounded,
(4) if $u_n \to \theta$ then $f(u_n) \to \theta$.

Then,

• 1 implies 2,
• 2 implies 3,
• if $U$ is metrizable then 3 implies 4, which implies 1.

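Proof.

(1) 1 implies 2: Suppose $f$ is continuous. For $\mathcal{W} \subset V$ a neighborhood of zero, there exists a neighborhood of zero in $U$, denoted by $\mathcal{V}$, such that

$$f(\mathcal{V}) \subset \mathcal{W}. \tag{1.35}$$

If $E$ is bounded, there exists $t_0 \in \mathbb{R}^+$ such that $E \subset t\mathcal{V}$, $\forall t \ge t_0$, so that

$$f(E) \subset f(t\mathcal{V}) = tf(\mathcal{V}) \subset t\mathcal{W}, \quad \forall t \ge t_0, \tag{1.36}$$

and thus $f$ is bounded.

(2) 2 implies 3: Suppose $u_n \to \theta$ and let $\mathcal{W}$ be a neighborhood of zero in $U$. Then there exists $N \in \mathbb{N}$ such that if $n \ge N$ then $u_n \in \mathcal{V} \subset \mathcal{W}$, where $\mathcal{V}$ is a balanced neighborhood of zero. On the other hand, for $n < N$, there exists $K_n$ such that $u_n \in K_n\mathcal{V}$. Define $K = \max\{1, K_1, \ldots, K_{N-1}\}$. Then $u_n \in K\mathcal{V}$, $\forall n \in \mathbb{N}$, and hence $\{u_n\}$ is bounded. Finally from 2, we have that $\{f(u_n)\}$ is bounded.

(3) 3 implies 4: Suppose $U$ is metrizable and let $u_n \to \theta$. Given $K \in \mathbb{N}$, there exists $n_K \in \mathbb{N}$ (which may be chosen strictly increasing in $K$) such that if $n \ge n_K$ then $d(u_n, \theta) < \frac{1}{K^2}$. Define $\gamma_n = 1$ if $n < n_1$ and $\gamma_n = K$ if $n_K \le n < n_{K+1}$, so that

$$d(\gamma_n u_n, \theta) = d(K u_n, \theta) \le K d(u_n, \theta) < K^{-1}. \tag{1.37}$$

Thus $\gamma_n u_n \to \theta$, so that from 3 we have that $\{f(\gamma_n u_n)\}$ is bounded. Since $\gamma_n^{-1} \to 0$, by Proposition 1.9.2, $f(u_n) = \gamma_n^{-1} f(\gamma_n u_n) \to \theta$ as $n \to \infty$.

(4) 4 implies 1: Suppose 1 fails. Thus there exists a neighborhood of zero $\mathcal{W} \subset V$ such that $f^{-1}(\mathcal{W})$ contains no neighborhood of zero in $U$. Particularly, we can select $\{u_n\}$ such that $u_n \in B_{1/n}(\theta)$ and $f(u_n)$ is not in $\mathcal{W}$, so that $\{f(u_n)\}$ does not converge to zero. Thus 4 fails.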

1.10. Continuity of Operators on Banach Spaces

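Let $U, V$ be Banach spaces. We call a function $A : U \to V$ an operator.

Proposition 1.10.1. Let $U, V$ be Banach spaces. A linear operator $A : U \to V$ is continuous if and only if there exists $K \in \mathbb{R}^+$ such that

$$\|A(u)\|_V \le K\|u\|_U, \quad \forall u \in U.$$

Proof. Suppose $A$ is linear and continuous. From Proposition 1.9.3,

$$\text{if } \{u_n\} \subset U \text{ is such that } u_n \to \theta \text{ then } A(u_n) \to \theta. \tag{1.38}$$

We claim that for each $\varepsilon > 0$ there exists $\delta > 0$ such that if $\|u\|_U < \delta$ then $\|A(u)\|_V < \varepsilon$.

Suppose, to obtain contradiction, that the claim is false. Thus there exists $\varepsilon_0 > 0$ such that for each $n \in \mathbb{N}$ there exists $u_n \in U$ such that $\|u_n\|_U \le \frac{1}{n}$ and $\|A(u_n)\|_V \ge \varepsilon_0$. Therefore $u_n \to \theta$ and $A(u_n)$ does not converge to $\theta$, which contradicts (1.38).

Thus the claim holds. In particular, for $\varepsilon = 1$ there exists $\delta > 0$ such that if $\|u\|_U < \delta$ then $\|A(u)\|_V < 1$. Thus given an arbitrary $u \in U$, $u \neq \theta$, for

$$w = \frac{\delta u}{2\|u\|_U}$$

we have

$$\|A(w)\|_V = \frac{\delta\|A(u)\|_V}{2\|u\|_U} < 1,$$

that is

$$\|A(u)\|_V < \frac{2\|u\|_U}{\delta}, \quad \forall u \in U, \ u \neq \theta,$$

and the inequality $\|A(u)\|_V \le 2\|u\|_U/\delta$ holds trivially for $u = \theta$. Defining

$$K = \frac{2}{\delta},$$

the first part of the proof is complete. Reciprocally, suppose there exists $K > 0$ such that

$$\|A(u)\|_V \le K\|u\|_U, \quad \forall u \in U.$$

Hence $u_n \to \theta$ implies $\|A(u_n)\|_V \to 0$, so that from Proposition 1.9.3, $A$ is continuous.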
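In finite dimensions the constant $K$ of Proposition 1.10.1 can be exhibited explicitly. The sketch below is an illustration under stated assumptions, not part of the original text: the matrix `A` is a hypothetical example acting on $\mathbb{R}^2$ with the Euclidean norm, and `K` is taken as its Frobenius norm, which bounds $\|Au\|$ by $K\|u\|$ through the Cauchy-Schwarz inequality applied row by row. The code samples unit vectors and compares the largest observed ratio $\|Au\|/\|u\|$ with `K`.

```python
import math

# Hypothetical linear operator on R^2, given by a matrix.
A = [[2.0, 1.0],
     [0.0, 3.0]]

def apply(matrix, u):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, u)) for row in matrix]

def norm(u):
    """Euclidean norm."""
    return math.sqrt(sum(x * x for x in u))

# Frobenius norm: an admissible (generally not optimal) constant K.
K = math.sqrt(sum(a * a for row in A for a in row))

# Largest ratio ||A u|| / ||u|| over sampled unit vectors.
ratios = []
for k in range(360):
    t = 2.0 * math.pi * k / 360
    u = [math.cos(t), math.sin(t)]
    ratios.append(norm(apply(A, u)) / norm(u))
largest = max(ratios)

print(largest <= K)
```

The smallest admissible constant is the operator norm of $A$; the Frobenius norm is used here only because it is easy to compute.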


1.11. Some Classical Results on Banach Spaces

In this section we present some important results in Banach spaces. We start with the following theorem.

Theorem 1.11.1. Let $U$ and $V$ be Banach spaces and let $A : U \to V$ be a linear operator. Then $A$ is bounded if and only if the set $C \subset U$ has at least one interior point, where

$$C = A^{-1}[\{v \in V \mid \|v\|_V \le 1\}].$$

Proof. Suppose there exists $u_0 \in U$ in the interior of $C$. Thus, there exists $r > 0$ such that

$$B_r(u_0) = \{u \in U \mid \|u - u_0\|_U < r\} \subset C.$$

Fix $u \in U$ such that $\|u\|_U < r$. Thus, we have

$$\|A(u)\|_V \le \|A(u + u_0)\|_V + \|A(u_0)\|_V.$$

Observe also that

$$\|(u + u_0) - u_0\|_U < r,$$

so that $u + u_0 \in B_r(u_0) \subset C$ and thus

$$\|A(u + u_0)\|_V \le 1$$

and hence

$$\|A(u)\|_V \le 1 + \|A(u_0)\|_V, \tag{1.39}$$

$\forall u \in U$ such that $\|u\|_U < r$. Fix an arbitrary $u \in U$ such that $u \neq \theta$. From (1.39),

$$w = \frac{u}{\|u\|_U}\,\frac{r}{2}$$

is such that

$$\|A(w)\|_V = \frac{\|A(u)\|_V}{\|u\|_U}\,\frac{r}{2} \le 1 + \|A(u_0)\|_V,$$

so that

$$\|A(u)\|_V \le (1 + \|A(u_0)\|_V)\,\|u\|_U\,\frac{2}{r}.$$

Since $u \in U$ is arbitrary, $A$ is bounded.

Reciprocally, suppose $A$ is bounded. Thus

$$\|A(u)\|_V \le K\|u\|_U, \quad \forall u \in U,$$

for some $K > 0$. In particular

$$D = \left\{u \in U \ \Big|\ \|u\|_U \le \frac{1}{K}\right\} \subset C,$$

so that $\theta$ is an interior point of $C$. The proof is complete.


Definition 1.11.2. A set $S$ in a metric space $U$ is said to be nowhere dense if its closure $\overline{S}$ has an empty interior.

Theorem 1.11.3 (Baire Category Theorem). A complete metric space is never the union of a countable number of nowhere dense sets.

Proof. Suppose, to obtain contradiction, that $U$ is a complete metric space and

$$U = \bigcup_{n=1}^{\infty} A_n,$$

where each $A_n$ is nowhere dense. Since $A_1$ is nowhere dense, there exists $u_1 \in U$ which is not in $\overline{A}_1$, otherwise we would have $U = \overline{A}_1$, which is not possible since $U$ is open. Furthermore, $\overline{A}_1^c$ is open, so that we may obtain $u_1 \in \overline{A}_1^c$ and $0 < r_1 < 1$ such that

$$B_1 = B_{r_1}(u_1)$$

satisfies

$$\overline{B}_1 \cap \overline{A}_1 = \emptyset.$$

Since $A_2$ is nowhere dense, $B_1$ is not contained in $\overline{A}_2$. Therefore we may select $u_2 \in B_1 \setminus \overline{A}_2$ and, since $B_1 \setminus \overline{A}_2$ is open, there exists $0 < r_2 < 1/2$ such that

$$\overline{B}_2 = \overline{B_{r_2}(u_2)} \subset B_1 \setminus \overline{A}_2,$$

that is

$$\overline{B}_2 \cap \overline{A}_2 = \emptyset.$$

Proceeding inductively in this fashion, for each $n \in \mathbb{N}$ we may obtain $u_n \in B_{n-1} \setminus \overline{A}_n$ and an open ball $B_n = B_{r_n}(u_n)$ such that

$$\overline{B}_n \subset B_{n-1},$$

$$\overline{B}_n \cap \overline{A}_n = \emptyset$$

and

$$0 < r_n < 2^{1-n}.$$

Observe that $\{u_n\}$ is a Cauchy sequence, considering that if $m, n > N$ then $u_n, u_m \in B_N$, so that

$$d(u_n, u_m) < 2(2^{1-N}).$$

Define

$$u = \lim_{n \to \infty} u_n.$$

Since

$$u_n \in B_N, \quad \forall n > N,$$

we get

$$u \in \overline{B}_N \subset B_{N-1}.$$

Therefore $u$ is not in $A_{N-1}$, $\forall N > 1$, which means $u$ is not in $\bigcup_{n=1}^{\infty} A_n = U$, a contradiction. The proof is complete.

Theorem 1.11.4 (The Principle of Uniform Boundedness). Let $U$ be a Banach space. Let $\mathcal{F}$ be a family of linear bounded operators from $U$ into a normed linear space $V$. Suppose for each $u \in U$ there exists a $K_u \in \mathbb{R}$ such that

$$\|T(u)\|_V < K_u, \quad \forall T \in \mathcal{F}.$$

Then, there exists $K \in \mathbb{R}$ such that

$$\|T\| < K, \quad \forall T \in \mathcal{F}.$$


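Proof. Define

$$B_n = \{u \in U \mid \|T(u)\|_V \le n, \ \forall T \in \mathcal{F}\}.$$

By the hypotheses, given $u \in U$, $u \in B_n$ for all $n$ sufficiently big. Thus,

$$U = \bigcup_{n=1}^{\infty} B_n.$$

Moreover each $B_n$ is closed. By the Baire category theorem there exists $n_0 \in \mathbb{N}$ such that $B_{n_0}$ has non-empty interior. That is, there exist $u_0 \in U$ and $r > 0$ such that

$$B_r(u_0) \subset B_{n_0}.$$

Thus, fixing an arbitrary $T \in \mathcal{F}$, we have

$$\|T(u)\|_V \le n_0, \quad \forall u \in B_r(u_0).$$

Thus if $\|u\|_U < r$ then $\|(u + u_0) - u_0\|_U < r$, so that

$$\|T(u + u_0)\|_V \le n_0,$$

that is

$$\|T(u)\|_V - \|T(u_0)\|_V \le n_0.$$

Thus, since $\|T(u_0)\|_V \le n_0$,

$$\|T(u)\|_V \le 2n_0, \text{ if } \|u\|_U < r. \tag{1.40}$$

For $u \in U$ arbitrary, $u \neq \theta$, define

$$w = \frac{ru}{2\|u\|_U};$$

from (1.40) we obtain

$$\|T(w)\|_V = \frac{r\|T(u)\|_V}{2\|u\|_U} \le 2n_0,$$

so that

$$\|T(u)\|_V \le \frac{4n_0\|u\|_U}{r}, \quad \forall u \in U.$$

Hence

$$\|T\| \le \frac{4n_0}{r}, \quad \forall T \in \mathcal{F}.$$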
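The proof uses the completeness of $U$ only through the Baire category theorem, and the conclusion may fail without it. A standard counterexample, added here as an illustration and not taken from the original text, uses the (incomplete) space $c_{00}$ of real sequences with finitely many non-zero terms, with the sup norm.

```latex
U = c_{00} = \{\, x = (x_j)_{j \in \mathbb{N}} \;:\; x_j = 0 \text{ for all but finitely many } j \,\},
\qquad \|x\|_U = \sup_{j} |x_j| ,
\qquad T_n(x) = n\, x_n .
% Point-wise bounded: for fixed x, T_n(x) = 0 for all large n, so
\sup_{n \in \mathbb{N}} |T_n(x)| < \infty \quad \forall x \in U ,
% but not uniformly bounded, since |T_n(x)| \le n \|x\|_U with equality at x = e_n:
\|T_n\| = n \;\longrightarrow\; \infty .
```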


Theorem 1.11.5 (The Open Mapping Theorem). Let $U$ and $V$ be Banach spaces and let $A : U \to V$ be a bounded onto linear operator. Then if $O \subset U$ is open, $A(O)$ is open in $V$.


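Proof. First we will prove that given $r > 0$, there exists $r' > 0$ such that

$$A(B_r(\theta)) \supset B^V_{r'}(\theta). \tag{1.41}$$

Here $B^V_{r'}(\theta)$ denotes a ball in $V$ of radius $r'$ with center in $\theta$. Since $A$ is onto,

$$V = \bigcup_{n=1}^{\infty} A(nB_1(\theta)).$$

By the Baire Category Theorem, there exists $n_0 \in \mathbb{N}$ such that the closure of $A(n_0B_1(\theta))$ has non-empty interior, so that $\overline{A(B_1(\theta))}$ has non-empty interior. We will show that there exists $r' > 0$ such that

$$B^V_{r'}(\theta) \subset \overline{A(B_1(\theta))}.$$

Observe that there exist $y_0 \in V$ and $r_1 > 0$ such that

$$B^V_{r_1}(y_0) \subset \overline{A(B_1(\theta))}. \tag{1.42}$$

Since $A(B_1(\theta))$ is dense in its closure, we may take $y_0 \in A(B_1(\theta))$. Define $u_0 \in B_1(\theta)$ which satisfies $A(u_0) = y_0$. We claim that

$$\overline{A(B_{r_2}(\theta))} \supset B^V_{r_1}(\theta),$$

where $r_2 = 1 + \|u_0\|_U$. To prove the claim, pick

$$y \in A(B_1(\theta));$$

thus there exists $u \in U$ such that $\|u\|_U < 1$ and $A(u) = y$. Therefore

$$A(u) = A(u - u_0 + u_0) = A(u - u_0) + A(u_0).$$

But observe that

$$\|u - u_0\|_U \le \|u\|_U + \|u_0\|_U < 1 + \|u_0\|_U = r_2, \tag{1.43}$$

so that

$$A(u - u_0) \in A(B_{r_2}(\theta)).$$

This means

$$y = A(u) \in A(u_0) + A(B_{r_2}(\theta)),$$

and hence

$$A(B_1(\theta)) \subset A(u_0) + A(B_{r_2}(\theta)).$$

That is, taking closures, from this and (1.42) we obtain

$$A(u_0) + \overline{A(B_{r_2}(\theta))} \supset \overline{A(B_1(\theta))} \supset B^V_{r_1}(y_0) = A(u_0) + B^V_{r_1}(\theta),$$

and therefore

$$\overline{A(B_{r_2}(\theta))} \supset B^V_{r_1}(\theta).$$

Since

$$A(B_{r_2}(\theta)) = r_2A(B_1(\theta)),$$

we have, for some not relabeled $r_1 > 0$, that

$$\overline{A(B_1(\theta))} \supset B^V_{r_1}(\theta).$$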

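Thus it suffices to show that

$$\overline{A(B_1(\theta))} \subset A(B_2(\theta)),$$

to prove (1.41). Let $y \in \overline{A(B_1(\theta))}$; by the definition of closure we may select $u_1 \in B_1(\theta)$ such that

$$y - A(u_1) \in B^V_{r_1/2}(\theta) \subset \overline{A(B_{1/2}(\theta))}.$$

Now select $u_2 \in B_{1/2}(\theta)$ so that

$$y - A(u_1) - A(u_2) \in B^V_{r_1/4}(\theta).$$

By induction, we may obtain

$$u_n \in B_{2^{1-n}}(\theta),$$

such that

$$y - \sum_{j=1}^{n} A(u_j) \in B^V_{r_1/2^n}(\theta).$$

Define

$$u = \sum_{n=1}^{\infty} u_n;$$

the series converges since $U$ is a Banach space and $\sum_{n=1}^{\infty} \|u_n\|_U < 2$. We have that $u \in B_2(\theta)$, so that, $A$ being continuous,

$$y = \sum_{n=1}^{\infty} A(u_n) = A(u) \in A(B_2(\theta)).$$

Therefore

$$\overline{A(B_1(\theta))} \subset A(B_2(\theta)).$$

The proof of (1.41) is complete.

To finish the proof of this theorem, assume $O \subset U$ is open. Let $v_0 \in A(O)$. Let $u_0 \in O$ be such that $A(u_0) = v_0$. Thus there exists $r > 0$ such that

$$B_r(u_0) \subset O.$$

From (1.41),

$$A(B_r(\theta)) \supset B^V_{r'}(\theta),$$

for some $r' > 0$. Thus

$$A(O) \supset A(u_0) + A(B_r(\theta)) \supset v_0 + B^V_{r'}(\theta).$$

This means that $v_0$ is an interior point of $A(O)$. Since $v_0 \in A(O)$ is arbitrary, we may conclude that $A(O)$ is open.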


Theorem 1.11.6 (The Inverse Mapping Theorem). A continuous linear bijection of one Banach space onto another has a continuous inverse.

Proof. Let $A : U \to V$ satisfy the theorem hypotheses. Since $A$ is open, $A^{-1}$ is continuous.

Definition 1.11.7 (Graph of a Mapping). Let $A : U \to V$ be an operator, where $U$ and $V$ are normed linear spaces. The graph of $A$, denoted by $\Gamma(A)$, is defined by

$$\Gamma(A) = \{(u, v) \in U \times V \mid v = A(u)\}.$$

Theorem 1.11.8 (The Closed Graph Theorem). Let $U$ and $V$ be Banach spaces and let $A : U \to V$ be a linear operator. Then $A$ is bounded if and only if its graph is closed.

Proof. Suppose $\Gamma(A)$ is closed. Since $A$ is linear, $\Gamma(A)$ is a subspace of $U \oplus V$. Also, $\Gamma(A)$ being closed, it is a Banach space with the norm

$$\|(u, A(u))\| = \|u\|_U + \|A(u)\|_V.$$

Consider the continuous mappings defined on $\Gamma(A)$ by

$$\Pi_1(u, A(u)) = u$$

and

$$\Pi_2(u, A(u)) = A(u).$$

Observe that $\Pi_1 : \Gamma(A) \to U$ is a bijection, so that by the inverse mapping theorem $\Pi_1^{-1}$ is continuous. As

$$A = \Pi_2 \circ \Pi_1^{-1},$$

it follows that $A$ is continuous. The converse is trivial.

1.12. Hilbert Spaces

At this point we introduce an important class of spaces, namely the Hilbert spaces.


Definition 1.12.1. Let $H$ be a vector space. We say that $H$ is a real pre-Hilbert space if there exists a function $(\cdot, \cdot)_H : H \times H \to \mathbb{R}$ such that

(1) $(u, v)_H = (v, u)_H$, $\forall u, v \in H$,
(2) $(u + v, w)_H = (u, w)_H + (v, w)_H$, $\forall u, v, w \in H$,
(3) $(\alpha u, v)_H = \alpha(u, v)_H$, $\forall u, v \in H$, $\alpha \in \mathbb{R}$,
(4) $(u, u)_H \ge 0$, $\forall u \in H$, and $(u, u)_H = 0$ if and only if $u = \theta$.

Remark 1.12.2. The function (·, ·)H : H × H → R is calledan inner-product.

Proposition 1.12.3 (Cauchy-Schwarz inequality). Let $H$ be a pre-Hilbert space. Defining

$$\|u\|_H = \sqrt{(u, u)_H}, \quad \forall u \in H,$$

we have

$$|(u, v)_H| \le \|u\|_H\|v\|_H, \quad \forall u, v \in H.$$

Equality holds if and only if $u = \alpha v$ for some $\alpha \in \mathbb{R}$ or $v = \theta$.

Proof. If $v = \theta$ the inequality is immediate. Assume $v \neq \theta$. Given $\alpha \in \mathbb{R}$ we have

$$0 \le (u - \alpha v, u - \alpha v)_H = (u, u)_H + \alpha^2(v, v)_H - 2\alpha(u, v)_H = \|u\|_H^2 + \alpha^2\|v\|_H^2 - 2\alpha(u, v)_H. \tag{1.44}$$

In particular for $\alpha = (u, v)_H/\|v\|_H^2$, we obtain

$$0 \le \|u\|_H^2 - \frac{(u, v)_H^2}{\|v\|_H^2},$$

that is

$$|(u, v)_H| \le \|u\|_H\|v\|_H.$$

The remaining conclusions are left to the reader.

Proposition 1.12.4. On a pre-Hilbert space $H$, the function

$$\|\cdot\|_H : H \to \mathbb{R}$$

is a norm, where as above

$$\|u\|_H = \sqrt{(u, u)_H}.$$

Proof. The only non-trivial property to be verified, concerning the definition of norm, is the triangle inequality.

Observe that, given $u, v \in H$, from the Cauchy-Schwarz inequality we have

$$\|u + v\|_H^2 = (u + v, u + v)_H = (u, u)_H + (v, v)_H + 2(u, v)_H \le (u, u)_H + (v, v)_H + 2|(u, v)_H| \le \|u\|_H^2 + \|v\|_H^2 + 2\|u\|_H\|v\|_H = (\|u\|_H + \|v\|_H)^2. \tag{1.45}$$

Therefore

$$\|u + v\|_H \le \|u\|_H + \|v\|_H, \quad \forall u, v \in H.$$


Definition 1.12.5. A pre-Hilbert space $H$ is said to be a Hilbert space if it is complete, that is, if any Cauchy sequence in $H$ converges to an element of $H$.

Definition 1.12.6 (Orthogonal Complement). Let $H$ be a Hilbert space. Considering $M \subset H$ we define its orthogonal complement, denoted by $M^\perp$, by

$$M^\perp = \{u \in H \mid (u, m)_H = 0, \ \forall m \in M\}.$$

Theorem 1.12.7. Let $H$ be a Hilbert space, $M$ a closed subspace of $H$ and suppose $u \in H$. Under such hypotheses there exists a unique $m_0 \in M$ such that

$$\|u - m_0\|_H = \min_{m \in M} \|u - m\|_H.$$

Moreover $n_0 = u - m_0 \in M^\perp$, so that

$$u = m_0 + n_0,$$

where $m_0 \in M$ and $n_0 \in M^\perp$. Finally, such a representation through $M \oplus M^\perp$ is unique.

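Proof. Define $d$ by

$$d = \inf_{m \in M} \|u - m\|_H.$$

Let $\{m_i\} \subset M$ be a sequence such that

$$\|u - m_i\|_H \to d, \text{ as } i \to \infty.$$

Thus, from the parallelogram law and since $(m_i + m_j)/2 \in M$, we have

$$\|m_i - m_j\|_H^2 = \|(m_i - u) - (m_j - u)\|_H^2$$
$$= 2\|m_i - u\|_H^2 + 2\|m_j - u\|_H^2 - \|-2u + m_i + m_j\|_H^2$$
$$= 2\|m_i - u\|_H^2 + 2\|m_j - u\|_H^2 - 4\|-u + (m_i + m_j)/2\|_H^2$$
$$\le 2\|m_i - u\|_H^2 + 2\|m_j - u\|_H^2 - 4d^2 \to 2d^2 + 2d^2 - 4d^2 = 0, \text{ as } i, j \to +\infty. \tag{1.46}$$

Thus $\{m_i\} \subset M$ is a Cauchy sequence. Since $H$ is complete and $M$ is closed, there exists $m_0 \in M$ such that

$$m_i \to m_0, \text{ as } i \to +\infty,$$

so that

$$\|u - m_i\|_H \to \|u - m_0\|_H = d.$$

Define

$$n_0 = u - m_0.$$

We will prove that $n_0 \in M^\perp$. Pick $m \in M$ and $t \in \mathbb{R}$; thus we have

$$d^2 \le \|u - (m_0 - tm)\|_H^2 = \|n_0 + tm\|_H^2 = \|n_0\|_H^2 + 2(n_0, m)_Ht + \|m\|_H^2t^2. \tag{1.47}$$

Since

$$\|n_0\|_H^2 = \|u - m_0\|_H^2 = d^2,$$

we obtain

$$2(n_0, m)_Ht + \|m\|_H^2t^2 \ge 0, \quad \forall t \in \mathbb{R},$$

so that

$$(n_0, m)_H = 0.$$

$m \in M$ being arbitrary, we obtain

$$n_0 \in M^\perp.$$

It remains to prove the uniqueness. Let $m \in M$; thus

$$\|u - m\|_H^2 = \|u - m_0 + m_0 - m\|_H^2 = \|u - m_0\|_H^2 + \|m - m_0\|_H^2, \tag{1.48}$$

since

$$(u - m_0, m - m_0)_H = (n_0, m - m_0)_H = 0.$$

From (1.48) we obtain

$$\|u - m\|_H^2 > \|u - m_0\|_H^2 = d^2,$$

if $m \neq m_0$. Therefore $m_0$ is unique.

Now suppose

$$u = m_1 + n_1,$$

where $m_1 \in M$ and $n_1 \in M^\perp$. As above, for $m \in M$,

$$\|u - m\|_H^2 = \|u - m_1 + m_1 - m\|_H^2 = \|u - m_1\|_H^2 + \|m - m_1\|_H^2 \ge \|u - m_1\|_H^2, \tag{1.49}$$

and thus, since $m_0$ such that

$$d = \|u - m_0\|_H$$

is unique, we get

$$m_1 = m_0$$

and therefore

$$n_1 = u - m_0 = n_0.$$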
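The decomposition $u = m_0 + n_0$ can be computed directly in a finite dimensional case. The following sketch is an illustration only, under the assumption that $H = \mathbb{R}^3$ with the usual inner product and that $M$ is spanned by two hypothetical vectors; the helper names (`orthonormalize`, `project`) are ours. It builds an orthonormal basis of $M$ by the Gram-Schmidt procedure and obtains $m_0$ as the sum of the components of $u$ along that basis.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def scale(c, u):
    return [c * a for a in u]

def orthonormalize(vectors):
    """Gram-Schmidt: orthonormal basis of the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            w = sub(w, scale(dot(w, e), e))
        length = math.sqrt(dot(w, w))
        if length > 1e-12:  # skip (numerically) dependent vectors
            basis.append(scale(1.0 / length, w))
    return basis

def project(u, vectors):
    """Orthogonal projection m0 of u onto the span of `vectors`."""
    m0 = [0.0] * len(u)
    for e in orthonormalize(vectors):
        m0 = add(m0, scale(dot(u, e), e))
    return m0

# Hypothetical data: M = span{(1,1,0), (0,1,1)} in R^3.
spanning = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
u = [1.0, 2.0, 4.0]
m0 = project(u, spanning)
n0 = sub(u, m0)

# n0 is orthogonal to M, up to rounding.
print([dot(n0, v) for v in spanning])
```

Any other element $m$ of $M$ is farther from $u$ than $m_0$, in agreement with (1.48).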


Theorem 1.12.8 (The Riesz Lemma). Let $H$ be a Hilbert space and let $f : H \to \mathbb{R}$ be a continuous linear functional. Then there exists a unique $u_0 \in H$ such that

$$f(u) = (u, u_0)_H, \quad \forall u \in H.$$

Moreover

$$\|f\|_{H^*} = \|u_0\|_H.$$

Proof. Define $N$ by

$$N = \{u \in H \mid f(u) = 0\}.$$

Thus, as $f$ is continuous and linear, $N$ is a closed subspace of $H$. If $N = H$, then $f(u) = 0 = (u, \theta)_H$, $\forall u \in H$, and the proof would be complete. Thus assume $N \neq H$. By the last theorem there exists $v \neq \theta$ such that $v \in N^\perp$. Observe that $f(v) \neq 0$, otherwise we would have $v \in N \cap N^\perp = \{\theta\}$.

Define

$$u_0 = \frac{f(v)}{\|v\|_H^2}v.$$

Thus if $u \in N$ we have

$$f(u) = 0 = (u, u_0)_H.$$

On the other hand if $u = \alpha v$ for some $\alpha \in \mathbb{R}$, we have

$$f(u) = \alpha f(v) = \frac{f(v)(\alpha v, v)_H}{\|v\|_H^2} = \left(\alpha v, \frac{f(v)v}{\|v\|_H^2}\right)_H = (\alpha v, u_0)_H. \tag{1.50}$$

Therefore $f(u)$ equals $(u, u_0)_H$ in the space spanned by $N$ and $v$. Now we show that this last space (the span of $N$ and $v$) is in fact $H$. Just observe that given $u \in H$ we may write

$$u = \left(u - \frac{f(u)v}{f(v)}\right) + \frac{f(u)v}{f(v)}. \tag{1.51}$$

Since

$$u - \frac{f(u)v}{f(v)} \in N,$$

we have finished the first part of the proof, that is, we have proven that

$$f(u) = (u, u_0)_H, \quad \forall u \in H.$$

To finish the proof, assume $u_1 \in H$ is such that

$$f(u) = (u, u_1)_H, \quad \forall u \in H.$$

Thus,

$$\|u_0 - u_1\|_H^2 = (u_0 - u_1, u_0 - u_1)_H = (u_0 - u_1, u_0)_H - (u_0 - u_1, u_1)_H = f(u_0 - u_1) - f(u_0 - u_1) = 0. \tag{1.52}$$

Hence $u_1 = u_0$.

Let us now prove that

$$\|f\|_{H^*} = \|u_0\|_H.$$

First observe that

$$\|f\|_{H^*} = \sup\{|f(u)| \mid u \in H, \|u\|_H \le 1\} = \sup\{|(u, u_0)_H| \mid u \in H, \|u\|_H \le 1\} \le \sup\{\|u\|_H\|u_0\|_H \mid u \in H, \|u\|_H \le 1\} \le \|u_0\|_H. \tag{1.53}$$

On the other hand, since $u_0 \neq \theta$ when $N \neq H$,

$$\|f\|_{H^*} = \sup\{|f(u)| \mid u \in H, \|u\|_H \le 1\} \ge f\left(\frac{u_0}{\|u_0\|_H}\right) = \frac{(u_0, u_0)_H}{\|u_0\|_H} = \|u_0\|_H. \tag{1.54}$$

From (1.53) and (1.54),

$$\|f\|_{H^*} = \|u_0\|_H.$$
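The representing element $u_0$ depends on the inner product, not only on $f$. The sketch below illustrates this under assumptions of our own choosing: $H = \mathbb{R}^3$ with a weighted inner product $(u, v)_H = \sum_i w_i u_i v_i$ (hypothetical positive weights `w`) and $f(u) = \sum_i c_i u_i$ (hypothetical coefficients `c`). In this setting the representer is $u_0 = (c_i / w_i)_i$, and $f$ attains the value $\|u_0\|_H$ at the unit vector $u_0/\|u_0\|_H$.

```python
import math

w = [1.0, 2.0, 4.0]   # hypothetical positive weights defining the inner product
c = [3.0, -2.0, 8.0]  # hypothetical coefficients of the functional f

def inner(u, v):
    """Weighted inner product (u, v)_H = sum_i w_i u_i v_i."""
    return sum(wi * a * b for wi, a, b in zip(w, u, v))

def f(u):
    """Continuous linear functional f(u) = sum_i c_i u_i."""
    return sum(ci * a for ci, a in zip(c, u))

# Riesz representer for this inner product: f(u) = (u, u0)_H for every u.
u0 = [ci / wi for ci, wi in zip(c, w)]
norm_u0 = math.sqrt(inner(u0, u0))

# Unit vector (in the H norm) at which f equals ||u0||_H.
unit = [a / norm_u0 for a in u0]

print(f([0.5, -2.0, 3.0]), inner([0.5, -2.0, 3.0], u0))
```

With the usual inner product (all weights equal to 1) the representer would simply be `c`; changing the weights changes $u_0$ while $f$ stays the same.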


Remark 1.12.9. Similarly as above we may define a Hilbert space $H$ over $\mathbb{C}$, that is, a complex one. In this case the complex inner product $(\cdot, \cdot)_H : H \times H \to \mathbb{C}$ is defined through the following properties:

(1) $(u, v)_H = \overline{(v, u)_H}$, $\forall u, v \in H$,
(2) $(u + v, w)_H = (u, w)_H + (v, w)_H$, $\forall u, v, w \in H$,
(3) $(\alpha u, v)_H = \alpha(u, v)_H$, $\forall u, v \in H$, $\alpha \in \mathbb{C}$,
(4) $(u, u)_H \ge 0$, $\forall u \in H$, and $(u, u)_H = 0$ if and only if $u = \theta$.

Observe that in this case we have

$$(u, \alpha v)_H = \overline{\alpha}(u, v)_H, \quad \forall u, v \in H, \ \alpha \in \mathbb{C},$$

where for $\alpha = a + bi \in \mathbb{C}$, we have $\overline{\alpha} = a - bi$. Finally, similar results as those proven above are valid for complex Hilbert spaces.

CHAPTER 2

The Hahn-Banach Theorems and Weak Topologies

2.1. Introduction

The notion of weak topologies and weak convergence is fundamental in modern variational analysis. Many important problems are non-convex and have no minimizers in the classical sense. However, the minimizing sequences in reflexive spaces may be weakly convergent, and it is important to evaluate the average behavior of such sequences in many practical applications.

2.2. The Hahn-Banach Theorem

In this chapter $U$ denotes a Banach space, unless otherwise indicated. We start this section by stating and proving the Hahn-Banach theorem for real vector spaces, which is sufficient for our purposes.

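Theorem 2.2.1 (The Hahn-Banach Theorem). Consider a functional $p : U \to \mathbb{R}$ satisfying

$$p(\lambda u) = \lambda p(u), \quad \forall u \in U, \ \lambda > 0, \tag{2.1}$$

$$p(u + v) \le p(u) + p(v), \quad \forall u, v \in U. \tag{2.2}$$

Let $V \subset U$ be a vector subspace and let $g : V \to \mathbb{R}$ be a linear functional such that

$$g(u) \le p(u), \quad \forall u \in V. \tag{2.3}$$

Then there exists a linear functional $f : U \to \mathbb{R}$ such that

$$g(u) = f(u), \quad \forall u \in V, \tag{2.4}$$

and

$$f(u) \le p(u), \quad \forall u \in U. \tag{2.5}$$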


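Proof. Pick $z \in U \setminus V$. Denote by $\tilde{V}$ the space spanned by $V$ and $z$, that is

$$\tilde{V} = \{v + \alpha z \mid v \in V \text{ and } \alpha \in \mathbb{R}\}. \tag{2.6}$$

We may define an extension of $g$ to $\tilde{V}$, denoted by $\tilde{g}$, as

$$\tilde{g}(\alpha z + v) = \alpha\tilde{g}(z) + g(v), \tag{2.7}$$

where $\tilde{g}(z)$ will be appropriately defined. Suppose given $v_1, v_2 \in V$, $\alpha > 0$, $\beta > 0$. Then

$$\beta g(v_1) + \alpha g(v_2) = g(\beta v_1 + \alpha v_2)$$
$$= (\alpha + \beta)\,g\!\left(\frac{\beta}{\alpha + \beta}v_1 + \frac{\alpha}{\alpha + \beta}v_2\right)$$
$$\le (\alpha + \beta)\,p\!\left(\frac{\beta}{\alpha + \beta}(v_1 - \alpha z) + \frac{\alpha}{\alpha + \beta}(v_2 + \beta z)\right)$$
$$\le \beta p(v_1 - \alpha z) + \alpha p(v_2 + \beta z) \tag{2.8}$$

and therefore

$$\frac{1}{\alpha}[-p(v_1 - \alpha z) + g(v_1)] \le \frac{1}{\beta}[p(v_2 + \beta z) - g(v_2)], \quad \forall v_1, v_2 \in V, \ \alpha, \beta > 0. \tag{2.9}$$

Thus, there exists $a \in \mathbb{R}$ such that

$$\sup_{v \in V, \alpha > 0}\left[\frac{1}{\alpha}(-p(v - \alpha z) + g(v))\right] \le a \le \inf_{v \in V, \alpha > 0}\left[\frac{1}{\alpha}(p(v + \alpha z) - g(v))\right]. \tag{2.10}$$

If we define $\tilde{g}(z) = a$ we obtain $\tilde{g}(u) \le p(u)$, $\forall u \in \tilde{V}$. Denote by $\mathcal{E}$ the set of extensions $e$ of $g$ which satisfy $e(u) \le p(u)$ on the subspace where $e$ is defined. We define a partial order in $\mathcal{E}$ by setting $e_1 \prec e_2$ if $e_2$ is defined in a larger set than $e_1$ and $e_1 = e_2$ where both are defined. Let $\{e_\alpha\}_{\alpha \in A}$ be a linearly ordered subset of $\mathcal{E}$. Let $V_\alpha$ be the subspace on which $e_\alpha$ is defined. Define $e$ on $\bigcup_{\alpha \in A} V_\alpha$ by setting $e(u) = e_\alpha(u)$ on $V_\alpha$. Clearly $e_\alpha \prec e$, so each linearly ordered subset of $\mathcal{E}$ has an upper bound. By Zorn's lemma, $\mathcal{E}$ has a maximal element $f$ defined on some subspace $\tilde{U}$ such that $f(u) \le p(u)$, $\forall u \in \tilde{U}$. We can conclude that $\tilde{U} = U$; otherwise, if there were a $z_1 \in U \setminus \tilde{U}$, as above we could have a new extension $f_1$ to the subspace spanned by $z_1$ and $\tilde{U}$, contradicting the maximality of $f$.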


Definition 2.2.2 (Topological Dual Space). For a Banach space $U$, we define its Topological Dual Space as the set of all linear continuous functionals defined on $U$. We suppose that such a dual space of $U$ may be identified with a space denoted by $U^*$ through a bilinear form $\langle\cdot, \cdot\rangle_U : U \times U^* \to \mathbb{R}$ (here we are referring to the standard representations of dual spaces concerning Lebesgue and Sobolev spaces). That is, given a linear continuous functional $f : U \to \mathbb{R}$, there exists $u^* \in U^*$ such that

$$f(u) = \langle u, u^*\rangle_U, \quad \forall u \in U. \tag{2.11}$$

The norm of $f$, denoted by $\|f\|_{U^*}$, is defined as

$$\|f\|_{U^*} = \sup_{u \in U}\{|\langle u, u^*\rangle_U| \mid \|u\|_U \le 1\}. \tag{2.12}$$

Corollary 2.2.3. Let $V \subset U$ be a vector subspace of $U$ and let $g : V \to \mathbb{R}$ be a linear continuous functional of norm

$$\|g\|_{V^*} = \sup_{u \in V}\{|g(u)| \mid \|u\|_U \le 1\}. \tag{2.13}$$

Then, there exists a $u^*$ in $U^*$ such that

$$\langle u, u^*\rangle_U = g(u), \quad \forall u \in V, \tag{2.14}$$

and

$$\|u^*\|_{U^*} = \|g\|_{V^*}. \tag{2.15}$$

Proof. Apply Theorem 2.2.1 with $p(u) = \|g\|_{V^*}\|u\|_U$.

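Corollary 2.2.4. Given $u_0 \in U$ there exists $u_0^* \in U^*$ such that

$$\|u_0^*\|_{U^*} = \|u_0\|_U \text{ and } \langle u_0, u_0^*\rangle_U = \|u_0\|_U^2. \tag{2.16}$$

Proof. Apply Corollary 2.2.3 with $V = \{\alpha u_0 \mid \alpha \in \mathbb{R}\}$ and $g(tu_0) = t\|u_0\|_U^2$, so that $\|g\|_{V^*} = \|u_0\|_U$.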

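Corollary 2.2.5. Given $u \in U$ we have

$$\|u\|_U = \sup_{u^* \in U^*}\{|\langle u, u^*\rangle_U| \mid \|u^*\|_{U^*} \le 1\}. \tag{2.17}$$

Proof. Suppose $u \neq \theta$. Since

$$|\langle u, u^*\rangle_U| \le \|u\|_U\|u^*\|_{U^*}, \quad \forall u \in U, \ u^* \in U^*,$$

we have

$$\sup_{u^* \in U^*}\{|\langle u, u^*\rangle_U| \mid \|u^*\|_{U^*} \le 1\} \le \|u\|_U. \tag{2.18}$$

However, from the last corollary we have that there exists $u_0^* \in U^*$ such that $\|u_0^*\|_{U^*} = \|u\|_U$ and $\langle u, u_0^*\rangle_U = \|u\|_U^2$. Define $u_1^* = \|u\|_U^{-1}u_0^*$. Then $\|u_1^*\|_{U^*} = 1$ and $\langle u, u_1^*\rangle_U = \|u\|_U$, so that equality holds in (2.18).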

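Definition 2.2.6 (Affine Hyper-Plane). Let $U$ be a Banach space. An affine hyper-plane $H$ is a set of the form

$$H = \{u \in U \mid \langle u, u^*\rangle_U = \alpha\} \tag{2.19}$$

for some $u^* \in U^*$ and $\alpha \in \mathbb{R}$.

Proposition 2.2.7. A hyper-plane $H$ defined as above is closed.

Proof. The result follows from the continuity of $\langle u, u^*\rangle_U$ as a functional defined in $U$.

Definition 2.2.8 (Separation). Given $A, B \subset U$ we say that a hyper-plane $H$, defined as above, separates $A$ and $B$ if

$$\langle u, u^*\rangle_U \le \alpha, \ \forall u \in A, \text{ and } \langle u, u^*\rangle_U \ge \alpha, \ \forall u \in B. \tag{2.20}$$

We say that $H$ separates $A$ and $B$ strictly if there exists $\varepsilon > 0$ such that

$$\langle u, u^*\rangle_U \le \alpha - \varepsilon, \ \forall u \in A, \text{ and } \langle u, u^*\rangle_U \ge \alpha + \varepsilon, \ \forall u \in B. \tag{2.21}$$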

Theorem 2.2.9 (Hahn-Banach theorem, geometric form). Consider A, B ⊂ U two convex disjoint non-empty sets, where A is open. Then there exists a closed hyper-plane that separates A and B.

We need the following lemma.

Lemma 2.2.10. Consider C ⊂ U a convex open set such that θ ∈ C. Given u ∈ U, define

p(u) = inf{α > 0 | α⁻¹u ∈ C}. (2.22)


Thus, p is such that there exists M ∈ R+ satisfying

0 ≤ p(u) ≤ M‖u‖U , ∀u ∈ U, (2.23)

and

C = {u ∈ U | p(u) < 1}. (2.24)

Also

p(u+ v) ≤ p(u) + p(v), ∀u, v ∈ U.

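Proof. Let r > 0 be such that B(θ, r) ⊂ C; thus

$$p(u) \le \frac{\|u\|_U}{r}, \quad \forall u \in U, \tag{2.25}$$

which proves (2.23). Now suppose u ∈ C. Since C is open, (1 + ε)u ∈ C for ε > 0 sufficiently small. Therefore $p(u) \le \frac{1}{1+\varepsilon} < 1$. Conversely, if p(u) < 1 there exists 0 < α < 1 such that α⁻¹u ∈ C and therefore, since C is convex, u = α(α⁻¹u) + (1 − α)θ ∈ C.

Also, let u, v ∈ U and ε > 0. Thus $\frac{u}{p(u)+\varepsilon} \in C$ and $\frac{v}{p(v)+\varepsilon} \in C$, so that

$$\frac{t\,u}{p(u)+\varepsilon} + \frac{(1-t)\,v}{p(v)+\varepsilon} \in C, \quad \forall t \in [0, 1].$$

In particular, for $t = \frac{p(u)+\varepsilon}{p(u)+p(v)+2\varepsilon}$ we obtain $\frac{u+v}{p(u)+p(v)+2\varepsilon} \in C$, which means p(u + v) ≤ p(u) + p(v) + 2ε, ∀ε > 0. Letting ε → 0 we obtain p(u + v) ≤ p(u) + p(v).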
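The functional p of Lemma 2.2.10 can be evaluated numerically. The sketch below is only an illustration under assumptions not present in the text: U = R², C is the open ellipse x²/4 + y² < 1, and the infimum in (2.22) is approximated by bisection. For this particular C the exact value is p(u) = (x²/4 + y²)^{1/2}, which gives a check on the computation.

```python
import math

def in_C(u):
    # hypothetical convex open set containing the origin: an open ellipse
    x, y = u
    return x * x / 4.0 + y * y < 1.0

def gauge(u, in_set=in_C, hi=1e6, iters=100):
    # p(u) = inf{alpha > 0 : u/alpha in C}; the admissible alphas form (p(u), +inf),
    # so bisection on alpha applies (hi is assumed to be admissible)
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid > 0 and in_set((u[0] / mid, u[1] / mid)):
            hi = mid
        else:
            lo = mid
    return hi

u = (1.0, 0.5)
v = (-2.0, 1.0)
w = (u[0] + v[0], u[1] + v[1])
p_u, p_v, p_w = gauge(u), gauge(v), gauge(w)
```

In this example p_u < 1 while p_v > 1, in agreement with (2.24), since u ∈ C and v ∉ C.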

Lemma 2.2.11. Consider C ⊂ U a convex open set and let u0 ∈ U be a vector not in C. Then there exists u∗ ∈ U∗ such that 〈u, u∗〉U < 〈u0, u∗〉U, ∀u ∈ C.

Proof. By a translation, we may assume θ ∈ C. Consider the functional p as in the last lemma. Define V = {αu0 | α ∈ R}. Define g on V by

g(tu0) = t, t ∈ R. (2.26)

Since u0 ∉ C we have p(u0) ≥ 1, so that g(u) ≤ p(u), ∀u ∈ V. From the Hahn-Banach theorem, there exists a linear functional f on U which extends g such that

f(u) ≤ p(u) ≤ M‖u‖U, ∀u ∈ U. (2.27)

Here we have used Lemma 2.2.10. In particular, f(u0) = 1 and (also from the last lemma) f(u) < 1, ∀u ∈ C. The existence of u∗ satisfying the lemma follows from the continuity of f indicated in (2.27).


Proof of Theorem 2.2.9. Define C = A + (−B), so that C is convex, open (as a union of translates of the open set A) and θ ∉ C (since A ∩ B = ∅). From Lemma 2.2.11, there exists u∗ ∈ U∗ such that 〈w, u∗〉U < 0, ∀w ∈ C, which means

〈u, u∗〉U < 〈v, u∗〉U , ∀u ∈ A, v ∈ B. (2.28)

Thus, there exists α ∈ R such that

$$\sup_{u \in A} \langle u, u^* \rangle_U \le \alpha \le \inf_{v \in B} \langle v, u^* \rangle_U, \tag{2.29}$$

which completes the proof.

Theorem 2.2.12 (Hahn-Banach theorem, second geometric form). Consider A, B ⊂ U two convex disjoint non-empty sets. Suppose A is closed and B is compact. Then there exists a hyper-plane which separates A and B strictly.

Proof. There exists ε > 0 sufficiently small such that Aε = A + B(0, ε) and Bε = B + B(0, ε) are convex disjoint sets. Since Aε is open, from Theorem 2.2.9 there exists u∗ ∈ U∗ such that u∗ ≠ θ and

〈u + εw1, u∗〉U ≤ 〈v + εw2, u∗〉U , ∀u ∈ A, v ∈ B, w1, w2 ∈ B(0, 1). (2.30)

Thus, taking the supremum in w1 and the infimum in w2, there exists α ∈ R such that

〈u, u∗〉U + ε‖u∗‖U∗ ≤ α ≤ 〈v, u∗〉U − ε‖u∗‖U∗ , ∀u ∈ A, v ∈ B. (2.31)

Since ε‖u∗‖U∗ > 0, the separation is strict.

Corollary 2.2.13. Suppose V ⊂ U is a vector subspace such that its closure V̄ satisfies V̄ ≠ U. Then there exists u∗ ∈ U∗ such that u∗ ≠ θ and

〈u, u∗〉U = 0, ∀u ∈ V. (2.32)

Proof. Consider u0 ∈ U such that u0 ∉ V̄. Applying Theorem 2.2.12 to A = V̄ and B = {u0} we obtain u∗ ∈ U∗ and α ∈ R such that u∗ ≠ θ and

〈u, u∗〉U < α < 〈u0, u∗〉U , ∀u ∈ V. (2.33)

Since V is a subspace, (2.33) gives λ〈u, u∗〉U < α, ∀λ ∈ R, so that we must have 〈u, u∗〉U = 0, ∀u ∈ V.


2.3. Weak Topologies

Definition 2.3.1 (Weak Neighborhoods and Weak Topologies). For the topological space U and u0 ∈ U, we define a weak neighborhood of u0, denoted by Vw, as

Vw = {u ∈ U | |〈u − u0, u∗i〉U| < ε, ∀i ∈ {1, ..., m}}, (2.34)

for some m ∈ N, ε > 0, and u∗i ∈ U∗, ∀i ∈ {1, ..., m}. Also, we define the weak topology for U, denoted by σ(U, U∗), as the set of arbitrary unions and finite intersections of weak neighborhoods in U.

Proposition 2.3.2. Consider Z a topological vector space and ψ a function of Z into U. Then ψ is continuous, with U endowed with the weak topology, if and only if u∗ ∘ ψ is continuous for all u∗ ∈ U∗.

Proof. It is clear that if ψ is continuous with U endowed with the weak topology, then u∗ ∘ ψ is continuous for all u∗ ∈ U∗. Conversely, consider 𝒰 a weakly open set in U. We have to show that ψ⁻¹(𝒰) is open in Z. But observe that 𝒰 = ∪λ∈L Vλ, where each Vλ is a weak neighborhood. Thus ψ⁻¹(𝒰) = ∪λ∈L ψ⁻¹(Vλ). The result follows considering that u∗ ∘ ψ is continuous for all u∗ ∈ U∗, so that ψ⁻¹(Vλ) is open, for all λ ∈ L.

Proposition 2.3.3. A Banach space U is Hausdorff when endowed with the weak topology σ(U, U∗).

Proof. Pick u1, u2 ∈ U such that u1 ≠ u2. From the Hahn-Banach theorem, second geometric form, there exists a hyper-plane separating {u1} and {u2} strictly. That is, there exist u∗ ∈ U∗ and α ∈ R such that

〈u1, u∗〉U < α < 〈u2, u∗〉U . (2.35)

Defining

Vw1 = {u ∈ U | |〈u − u1, u∗〉U| < α − 〈u1, u∗〉U}, (2.36)

and

Vw2 = {u ∈ U | |〈u − u2, u∗〉U| < 〈u2, u∗〉U − α}, (2.37)

we obtain u1 ∈ Vw1, u2 ∈ Vw2 and Vw1 ∩ Vw2 = ∅.


Remark 2.3.4. If {un} ⊂ U is such that un converges to u in σ(U, U∗), then we write un ⇀ u.

Proposition 2.3.5. Let U be a Banach space. Considering {un} ⊂ U we have

(1) un ⇀ u for σ(U, U∗) ⇔ 〈un, u∗〉U → 〈u, u∗〉U , ∀u∗ ∈ U∗,

(2) if un → u strongly (in norm) then un ⇀ u weakly,

(3) if un ⇀ u weakly, then {‖un‖U} is bounded and $\|u\|_U \le \liminf_{n \to \infty} \|u_n\|_U$,

(4) if un ⇀ u weakly and u∗n → u∗ strongly in U∗ then 〈un, u∗n〉U → 〈u, u∗〉U .

Proof. (1) The result follows directly from the definition of the topology σ(U, U∗).

(2) This follows from the inequality

|〈un, u∗〉U − 〈u, u∗〉U| ≤ ‖u∗‖U∗‖un − u‖U . (2.38)

(3) Since for every u∗ ∈ U∗ the sequence {〈un, u∗〉U} is bounded, from the uniform boundedness principle there exists M > 0 such that ‖un‖U ≤ M, ∀n ∈ N. Furthermore, for u∗ ∈ U∗ we have

|〈un, u∗〉U| ≤ ‖u∗‖U∗‖un‖U , (2.39)

and taking the limit, we obtain

$$|\langle u, u^* \rangle_U| \le \liminf_{n \to \infty} \|u^*\|_{U^*} \|u_n\|_U. \tag{2.40}$$

Thus

$$\|u\|_U = \sup_{\|u^*\|_{U^*} \le 1} |\langle u, u^* \rangle_U| \le \liminf_{n \to \infty} \|u_n\|_U. \tag{2.41}$$

(4) Just observe that

$$\begin{aligned}
|\langle u_n, u_n^* \rangle_U - \langle u, u^* \rangle_U| &\le |\langle u_n, u_n^* - u^* \rangle_U| + |\langle u_n - u, u^* \rangle_U| \\
&\le \|u_n^* - u^*\|_{U^*} \|u_n\|_U + |\langle u_n - u, u^* \rangle_U| \\
&\le M \|u_n^* - u^*\|_{U^*} + |\langle u_n - u, u^* \rangle_U|,
\end{aligned} \tag{2.42}$$

and both terms on the right-hand side converge to 0 as n → ∞.
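The inequality in item (3) may be strict. A standard example, not taken from the text, is the canonical basis {en} of ℓ², which converges weakly to θ while ‖en‖ = 1 for every n. The sketch below illustrates this numerically under explicit assumptions: sequences are truncated to N = 1000 coordinates and only one test functional, y with entries 1/j, is tried, so it is an illustration and not a proof of weak convergence.

```python
import math

N = 1000  # truncation length (an assumption of this sketch)

def e(n):
    # n-th canonical basis vector of l2, truncated to N coordinates
    return [1.0 if j == n else 0.0 for j in range(1, N + 1)]

def pair(u, f):
    return sum(x * y for x, y in zip(u, f))

def norm2(u):
    return math.sqrt(sum(x * x for x in u))

y = [1.0 / j for j in range(1, N + 1)]  # a fixed element of l2 (truncated)

indices = (1, 10, 100, 1000)
pairings = [pair(e(n), y) for n in indices]  # equals 1/n, tends to 0
norms = [norm2(e(n)) for n in indices]       # stays equal to 1
```

The pairings decay like 1/n while the norms do not, so here ‖θ‖ = 0 < 1 = lim inf ‖en‖.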


Theorem 2.3.6. Consider A ⊂ U a convex set. Then A is weakly closed if and only if it is strongly closed.

Proof. Suppose A is strongly closed. Consider u0 ∉ A. By the Hahn-Banach theorem there exists a closed hyper-plane which separates {u0} and A strictly. Therefore there exist α ∈ R and u∗ ∈ U∗ such that

〈u0, u∗〉U < α < 〈v, u∗〉U , ∀v ∈ A. (2.43)

Define

V = {u ∈ U | 〈u, u∗〉U < α}, (2.44)

so that u0 ∈ V and V ⊂ U − A. Since V is open for σ(U, U∗), we have that U − A is weakly open, hence A is weakly closed. The converse is obvious.

2.4. The Weak-Star Topology

Definition 2.4.1 (Reflexive Spaces). Let U be a Banach space. We say that U is reflexive if the canonical injection J : U → U∗∗ defined by

〈u, u∗〉U = 〈u∗, J(u)〉U∗ , ∀u ∈ U, u∗ ∈ U∗, (2.45)

is onto.

The weak topology for U∗ is denoted by σ(U∗, U∗∗). By analogy, we can define the topology σ(U∗, U), which is called the weak-star topology. A standard neighborhood of u∗0 ∈ U∗ for the weak-star topology, which we denote by Vw∗, is given by

Vw∗ = {u∗ ∈ U∗ | |〈ui, u∗ − u∗0〉U| < ε, ∀i ∈ {1, ..., m}} (2.46)

for some ε > 0, m ∈ N, ui ∈ U, ∀i ∈ {1, ..., m}. It is clear that the weak topology for U∗ and the weak-star topology coincide if U is reflexive.

Proposition 2.4.2. Let U be a Banach space. U∗ endowed with the weak-star topology is a Hausdorff space.

Proof. The proof is similar to that of Proposition 2.3.3.


2.5. Weak-Star Compactness

We start with an important theorem about weak-* compactness.

Theorem 2.5.1 (Banach Alaoglu Theorem). The set BU∗ = {f ∈ U∗ | ‖f‖U∗ ≤ 1} is compact for the topology σ(U∗, U) (the weak-star topology).

Proof. For each u ∈ U, we will associate a real number ωu and denote ω = (ωu)u∈U. We have that ω ∈ R^U, and let us consider the projections Pu : R^U → R, where Pu(ω) = ωu. Consider the weakest topology σ on R^U for which the functions Pu (u ∈ U) are continuous. For U∗ with the topology σ(U∗, U), define φ : U∗ → R^U by

$$\varphi(u^*) = \big( \langle u, u^* \rangle_U \big)_{u \in U}, \quad \forall u^* \in U^*. \tag{2.47}$$

Since for each fixed u the mapping u∗ ↦ 〈u, u∗〉U is weakly-star continuous, we see that φ is σ-continuous, since weak-star convergence and convergence in σ are equivalent in U∗. To prove that φ⁻¹ is continuous, reasoning as in Proposition 2.3.2, it suffices to show that the function ω ↦ 〈u, φ⁻¹(ω)〉U is continuous on φ(U∗), for each u ∈ U. This is true because 〈u, φ⁻¹(ω)〉U = ωu on φ(U∗). On the other hand, it is also clear that φ(BU∗) = K where

$$K = \{ \omega \in R^U \;|\; |\omega_u| \le \|u\|_U,\ \omega_{u+v} = \omega_u + \omega_v,\ \omega_{\lambda u} = \lambda \omega_u,\ \forall u, v \in U,\ \lambda \in R \}. \tag{2.48}$$

To finish the proof it is sufficient, from the continuity of φ⁻¹, to show that K is compact in R^U concerning the topology σ. Observe that K = K1 ∩ K2 where

$$K_1 = \{ \omega \in R^U \;|\; |\omega_u| \le \|u\|_U,\ \forall u \in U \}, \tag{2.49}$$

and

$$K_2 = \{ \omega \in R^U \;|\; \omega_{u+v} = \omega_u + \omega_v,\ \omega_{\lambda u} = \lambda \omega_u,\ \forall u, v \in U,\ \lambda \in R \}. \tag{2.50}$$

The set $K_1 = \prod_{u \in U} [-\|u\|_U, \|u\|_U]$ is compact (for the topology in question) as a Cartesian product of compact intervals, by Tychonoff's theorem. On the other hand, K2 is closed, because defining the closed sets Au,v and Bλ,u as

$$A_{u,v} = \{ \omega \in R^U \;|\; \omega_{u+v} - \omega_u - \omega_v = 0 \}, \tag{2.51}$$


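and

$$B_{\lambda,u} = \{ \omega \in R^U \;|\; \omega_{\lambda u} - \lambda \omega_u = 0 \} \tag{2.52}$$

we may write

$$K_2 = \Big( \bigcap_{u,v \in U} A_{u,v} \Big) \cap \Big( \bigcap_{(\lambda,u) \in R \times U} B_{\lambda,u} \Big). \tag{2.53}$$

We recall that K2 is closed because arbitrary intersections of closed sets are closed. Finally, K = K1 ∩ K2 is compact, being a closed subset of the compact set K1, which completes the proof.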

Theorem 2.5.2 (Kakutani). Let U be a Banach space. Then U is reflexive if and only if

BU = {u ∈ U | ‖u‖U ≤ 1} (2.54)

is compact for the weak topology σ(U, U∗).

Proof. Suppose U is reflexive; then J(BU) = BU∗∗. From the last theorem, BU∗∗ is compact for the topology σ(U∗∗, U∗). Therefore it suffices to verify that J⁻¹ : U∗∗ → U is continuous from U∗∗ with the topology σ(U∗∗, U∗) to U with the topology σ(U, U∗). From Proposition 2.3.2 it is sufficient to show that the function η ↦ 〈J⁻¹η, f〉U is continuous for the topology σ(U∗∗, U∗), for each f ∈ U∗. Since 〈J⁻¹η, f〉U = 〈f, η〉U∗, we have completed the first part of the proof. For the second we need two lemmas.

Lemma 2.5.3 (Helly). Let U be a Banach space, f1, ..., fn ∈ U∗ and α1, ..., αn ∈ R. Then 1 and 2 are equivalent, where:

(1) Given ε > 0, there exists uε ∈ U such that ‖uε‖U ≤ 1 and

|〈uε, fi〉U − αi| < ε, ∀i ∈ {1, ..., n}.

(2)

$$\left| \sum_{i=1}^n \beta_i \alpha_i \right| \le \left\| \sum_{i=1}^n \beta_i f_i \right\|_{U^*}, \quad \forall \beta_1, ..., \beta_n \in R. \tag{2.55}$$

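Proof. 1 ⇒ 2: Fix β1, ..., βn ∈ R and ε > 0, and define $S = \sum_{i=1}^n |\beta_i|$. From 1, we have

$$\left| \sum_{i=1}^n \beta_i \langle u_\varepsilon, f_i \rangle_U - \sum_{i=1}^n \beta_i \alpha_i \right| < \varepsilon S \tag{2.56}$$

and therefore

$$\left| \sum_{i=1}^n \beta_i \alpha_i \right| - \left| \sum_{i=1}^n \beta_i \langle u_\varepsilon, f_i \rangle_U \right| < \varepsilon S \tag{2.57}$$

or

$$\left| \sum_{i=1}^n \beta_i \alpha_i \right| < \left\| \sum_{i=1}^n \beta_i f_i \right\|_{U^*} \|u_\varepsilon\|_U + \varepsilon S \le \left\| \sum_{i=1}^n \beta_i f_i \right\|_{U^*} + \varepsilon S \tag{2.58}$$

so that

$$\left| \sum_{i=1}^n \beta_i \alpha_i \right| \le \left\| \sum_{i=1}^n \beta_i f_i \right\|_{U^*} \tag{2.59}$$

since ε is arbitrary.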

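Now let us show that 2 ⇒ 1. Define $\vec{\alpha} = (\alpha_1, ..., \alpha_n) \in R^n$ and consider the function ϕ : U → R^n given by ϕ(u) = (〈u, f1〉U , ..., 〈u, fn〉U). Item 1 is equivalent to $\vec{\alpha}$ belonging to the closure of ϕ(BU). Let us suppose that $\vec{\alpha}$ does not belong to the closure of ϕ(BU) and obtain a contradiction. Since this closure is convex, we can separate $\vec{\alpha}$ and the closure of ϕ(BU) strictly, that is, there exist $\vec{\beta} = (\beta_1, ..., \beta_n) \in R^n$ and γ ∈ R such that

$$\varphi(u) \cdot \vec{\beta} < \gamma < \vec{\alpha} \cdot \vec{\beta}, \quad \forall u \in B_U. \tag{2.60}$$

Taking the supremum in u we obtain $\left\| \sum_{i=1}^n \beta_i f_i \right\|_{U^*} \le \gamma < \sum_{i=1}^n \beta_i \alpha_i$, which contradicts 2.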

We also need the following lemma.

Lemma 2.5.4. Let U be a Banach space. Then J(BU) is dense in BU∗∗ for the topology σ(U∗∗, U∗).

Proof. Let u∗∗ ∈ BU∗∗ and consider Vu∗∗ a neighborhood of u∗∗ for the topology σ(U∗∗, U∗). It suffices to show that J(BU) ∩ Vu∗∗ ≠ ∅. As Vu∗∗ is a weak-star neighborhood, there exist f1, ..., fn ∈ U∗ and ε > 0 such that

Vu∗∗ = {η ∈ U∗∗ | |〈fi, η − u∗∗〉U∗| < ε, ∀i ∈ {1, ..., n}}. (2.61)

Define αi = 〈fi, u∗∗〉U∗, and thus for any given β1, ..., βn ∈ R we have

$$\left| \sum_{i=1}^n \beta_i \alpha_i \right| = \left| \left\langle \sum_{i=1}^n \beta_i f_i, u^{**} \right\rangle_{U^*} \right| \le \left\| \sum_{i=1}^n \beta_i f_i \right\|_{U^*}, \tag{2.62}$$


so that from Helly's lemma there exists uε ∈ U such that ‖uε‖U ≤ 1 and

|〈uε, fi〉U − αi| < ε, ∀i ∈ {1, ..., n} (2.63)

or,

|〈fi, J(uε) − u∗∗〉U∗| < ε, ∀i ∈ {1, ..., n} (2.64)

and hence

J(uε) ∈ Vu∗∗ . (2.65)

Now we will complete the proof of Kakutani's Theorem. Suppose BU is weakly compact (that is, compact for the topology σ(U, U∗)). Observe that J : U → U∗∗ is weakly continuous, that is, it is continuous with U endowed with the topology σ(U, U∗) and U∗∗ endowed with the topology σ(U∗∗, U∗). Thus, as BU is weakly compact, J(BU) is compact, and hence closed, for the topology σ(U∗∗, U∗). From the last lemma, J(BU) is dense in BU∗∗ for the topology σ(U∗∗, U∗). Hence J(BU) = BU∗∗, or J(U) = U∗∗, which completes the proof.

Proposition 2.5.5. Let U be a reflexive Banach space. Let K ⊂ U be a convex closed bounded set. Then K is weakly compact.

Proof. From Theorem 2.3.6, K is weakly closed (closed for the topology σ(U, U∗)). Since K is bounded, there exists α ∈ R+ such that K ⊂ αBU. Since K is weakly closed, K = K ∩ αBU and αBU is weakly compact (Theorem 2.5.2), we have that K is weakly compact.

Proposition 2.5.6. Let U be a reflexive Banach space and M ⊂ U a closed subspace. Then M with the norm induced by U is reflexive.

Proof. We can identify two weak topologies in M, namely:

σ(M, M∗) and the trace of σ(U, U∗). (2.66)

It can be easily verified that these two topologies coincide (through restrictions and extensions of linear forms). From Theorem 2.5.2, it suffices to show that BM is compact for the topology σ(M, M∗). But BU is compact for σ(U, U∗) and M ⊂ U is closed (strongly) and convex, so that it is weakly closed; thus, from the last proposition, BM = M ∩ BU is compact for the topology σ(U, U∗), and therefore it is compact for σ(M, M∗).

2.6. Separable Sets

Definition 2.6.1 (Separable Spaces). A metric space U is said to be separable if there exists a set K ⊂ U such that K is countable and dense in U.

The next proposition is proved in [9].

Proposition 2.6.2. Let U be a separable metric space. If V ⊂ U then V is separable.

Theorem 2.6.3. Let U be a Banach space such that U∗ is separable. Then U is separable.

Proof. Consider {u∗n} a countable dense set in U∗. Observe that

‖u∗n‖U∗ = sup{|〈u, u∗n〉U| | u ∈ U and ‖u‖U = 1} (2.67)

so that for each n ∈ N there exists un ∈ U such that ‖un‖U = 1 and $\langle u_n, u_n^* \rangle_U \ge \frac{1}{2} \|u_n^*\|_{U^*}$.

Define U0 as the vector space over Q spanned by {un}, and U1 as the vector space over R spanned by {un}. It is clear that U0 is countable and dense in U1, and we will show that U1 is dense in U, so that U0 is a countable dense set in U. For, suppose u∗ is such that 〈u, u∗〉U = 0, ∀u ∈ U1. Since {u∗n} is dense in U∗, given ε > 0 there exists n ∈ N such that ‖u∗n − u∗‖U∗ < ε, so that

$$\frac{1}{2} \|u_n^*\|_{U^*} \le \langle u_n, u_n^* \rangle_U = \langle u_n, u_n^* - u^* \rangle_U + \langle u_n, u^* \rangle_U \le \|u_n^* - u^*\|_{U^*} \|u_n\|_U + 0 < \varepsilon \tag{2.68}$$

and hence

‖u∗‖U∗ ≤ ‖u∗n − u∗‖U∗ + ‖u∗n‖U∗ < ε + 2ε = 3ε. (2.69)

Therefore, since ε is arbitrary, ‖u∗‖U∗ = 0, that is, u∗ = θ. By Corollary 2.2.13 this completes the proof.

Proposition 2.6.4. U is reflexive if and only if U∗ is reflexive.


Proof. Suppose U is reflexive. As BU∗ is compact for σ(U∗, U) and σ(U∗, U) = σ(U∗, U∗∗), we have that BU∗ is compact for σ(U∗, U∗∗), which means that U∗ is reflexive.

Suppose U∗ is reflexive; from the above, U∗∗ is reflexive. Since J(U) is a closed subspace of U∗∗, from Proposition 2.5.6, J(U) is reflexive. Thus U is reflexive, since J is an isometry.

Proposition 2.6.5. Let U be a Banach space. Then U is reflexive and separable if and only if U∗ is reflexive and separable.

2.7. Uniformly Convex Spaces

Definition 2.7.1 (Uniformly Convex Spaces). A Banach space U is said to be uniformly convex if for each ε > 0 there exists δ > 0 such that:

If u, v ∈ U, ‖u‖U ≤ 1, ‖v‖U ≤ 1 and ‖u − v‖U > ε, then $\frac{\|u+v\|_U}{2} < 1 - \delta$.

Theorem 2.7.2 (Milman Pettis). Every uniformly convex Banach space is reflexive.

Proof. Let η ∈ U∗∗ be such that ‖η‖U∗∗ = 1. It suffices to show that η ∈ J(BU). Since J(BU) is closed in U∗∗, we have only to show that for each ε > 0 there exists u ∈ BU such that ‖η − J(u)‖U∗∗ ≤ ε.

Thus, suppose given ε > 0. Let δ > 0 be the corresponding constant related to the uniform convexity property.

Choose f ∈ U∗ such that ‖f‖U∗ = 1 and

$$\langle f, \eta \rangle_{U^*} > 1 - \frac{\delta}{2}. \tag{2.70}$$

Define

$$V = \left\{ \zeta \in U^{**} \;|\; |\langle f, \zeta - \eta \rangle_{U^*}| < \frac{\delta}{2} \right\}.$$

Observe that V is a neighborhood of η in σ(U∗∗, U∗). Since J(BU) is dense in BU∗∗ concerning the topology σ(U∗∗, U∗), we have that V ∩ J(BU) ≠ ∅, and thus there exists u ∈ BU such that J(u) ∈ V. Suppose, to obtain a contradiction, that

‖η − J(u)‖U∗∗ > ε.

Therefore, defining

W = (J(u) + εBU∗∗)^c,

we have that η ∈ W, where W is also a neighborhood of η in σ(U∗∗, U∗), since BU∗∗ is closed in σ(U∗∗, U∗).

Hence V ∩ W ∩ J(BU) ≠ ∅, so that there exists some v ∈ BU such that J(v) ∈ V ∩ W. Thus J(u) ∈ V and J(v) ∈ V, so that

$$|\langle u, f \rangle_U - \langle f, \eta \rangle_{U^*}| < \frac{\delta}{2},$$

and

$$|\langle v, f \rangle_U - \langle f, \eta \rangle_{U^*}| < \frac{\delta}{2}.$$

Hence,

$$2 \langle f, \eta \rangle_{U^*} < \langle u + v, f \rangle_U + \delta \le \|u + v\|_U + \delta. \tag{2.71}$$

From this and (2.70) we obtain

$$\frac{\|u + v\|_U}{2} > 1 - \delta,$$

and thus, from the definition of uniform convexity, we obtain

‖u − v‖U ≤ ε. (2.72)

On the other hand, since J(v) ∈ W, we have

‖J(u) − J(v)‖U∗∗ = ‖u − v‖U > ε,

which contradicts (2.72). The proof is complete.

CHAPTER 3

Topics on Linear Operators

3.1. Topologies for Bounded Operators

First we recall that the set of all bounded linear operators from U to Y, denoted by L(U, Y), is a Banach space with the norm

‖A‖ = sup{‖Au‖Y | ‖u‖U ≤ 1}.

The topology related to the metric induced by this norm is called the uniform operator topology.

Let us now introduce the strong operator topology, which is defined as the weakest topology for which the functions

Eu : L(U, Y) → Y

are continuous, where

Eu(A) = Au, ∀A ∈ L(U, Y).

For such a topology a base at the origin is given by sets of the form

{A | A ∈ L(U, Y), ‖Aui‖Y < ε, ∀i ∈ {1, ..., n}},

where u1, ..., un ∈ U and ε > 0.

Observe that a sequence {An} ⊂ L(U, Y) converges to A concerning this last topology if

‖Anu − Au‖Y → 0, as n → ∞, ∀u ∈ U.

In the next lines we describe the weak operator topology in L(U, Y). Such a topology is the weakest one such that the functions

Eu,v : L(U, Y) → C

are continuous, where

Eu,v(A) = 〈Au, v〉Y , ∀A ∈ L(U, Y), u ∈ U, v ∈ Y∗.

For such a topology, a base at the origin is given by sets of the form

{A ∈ L(U, Y) | |〈Aui, vj〉Y| < ε, ∀i ∈ {1, ..., n}, j ∈ {1, ..., m}},

where ε > 0, u1, ..., un ∈ U, v1, ..., vm ∈ Y∗.


A sequence {An} ⊂ L(U, Y) converges to A ∈ L(U, Y) in the weak operator topology if

|〈Anu, v〉Y − 〈Au, v〉Y| → 0,

as n → ∞, ∀u ∈ U, v ∈ Y∗.

3.2. Adjoint Operators

We start this section recalling the definition of adjoint operator.

Definition 3.2.1. Let U, Y be Banach spaces. Given a bounded linear operator A : U → Y and v∗ ∈ Y∗, we have that T(u) = 〈Au, v∗〉Y is such that

|T(u)| ≤ ‖Au‖Y ‖v∗‖Y∗ ≤ ‖A‖‖v∗‖Y∗‖u‖U .

Hence T is a continuous linear functional on U and, considering our fundamental representation hypothesis, there exists u∗ ∈ U∗ such that

T(u) = 〈u, u∗〉U , ∀u ∈ U.

We define A∗ by setting u∗ = A∗v∗, so that

T(u) = 〈u, u∗〉U = 〈u, A∗v∗〉U

that is,

〈u, A∗v∗〉U = 〈Au, v∗〉Y , ∀u ∈ U, v∗ ∈ Y∗.

We call A∗ : Y∗ → U∗ the adjoint operator related to A : U → Y.

Theorem 3.2.2. Let U, Y be Banach spaces and let A : U → Y be a bounded linear operator. Then

‖A‖ = ‖A∗‖.


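Proof. Observe that

$$\begin{aligned}
\|A\| &= \sup_{u \in U} \{ \|Au\|_Y \;|\; \|u\|_U = 1 \} \\
&= \sup_{u \in U} \Big\{ \sup_{v^* \in Y^*} \{ \langle Au, v^* \rangle_Y \;|\; \|v^*\|_{Y^*} = 1 \} \;\Big|\; \|u\|_U = 1 \Big\} \\
&= \sup_{(u, v^*) \in U \times Y^*} \{ \langle Au, v^* \rangle_Y \;|\; \|v^*\|_{Y^*} = 1,\ \|u\|_U = 1 \} \\
&= \sup_{(u, v^*) \in U \times Y^*} \{ \langle u, A^* v^* \rangle_U \;|\; \|v^*\|_{Y^*} = 1,\ \|u\|_U = 1 \} \\
&= \sup_{v^* \in Y^*} \Big\{ \sup_{u \in U} \{ \langle u, A^* v^* \rangle_U \;|\; \|u\|_U = 1 \} \;\Big|\; \|v^*\|_{Y^*} = 1 \Big\} \\
&= \sup_{v^* \in Y^*} \{ \|A^* v^*\|_{U^*} \;|\; \|v^*\|_{Y^*} = 1 \} \\
&= \|A^*\|.
\end{aligned} \tag{3.1}$$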
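Theorem 3.2.2 can be checked by hand in a small non-Hilbert setting. The sketch below is an illustration under assumptions not made in the text: U = R³ and Y = R² both carry the ℓ¹ norm, so their duals carry the ℓ∞ norm and A∗ is represented by the transposed matrix. Both operator norms are computed by brute force over the extreme points of the relevant unit balls, where a convex function attains its maximum.

```python
from itertools import product

A = [[1.0, -2.0, 0.5],
     [3.0, 0.0, -1.0]]  # hypothetical 2x3 matrix: A maps (R^3, l1) into (R^2, l1)

def matvec(M, x):
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def norm1(x):
    return sum(abs(t) for t in x)

def norm_inf(x):
    return max(abs(t) for t in x)

def op_norm_l1(M):
    # sup of ||Mx||_1 over the l1 unit ball, attained at a basis vector
    n = len(M[0])
    basis = [[1.0 if i == j else 0.0 for i in range(n)] for j in range(n)]
    return max(norm1(matvec(M, e)) for e in basis)

def op_norm_linf(M):
    # sup of ||Ms||_inf over the l-infinity unit ball, attained at a sign vector
    n = len(M[0])
    return max(norm_inf(matvec(M, s)) for s in product((-1.0, 1.0), repeat=n))

norm_A = op_norm_l1(A)                     # ||A|| on (R^3, l1) -> (R^2, l1)
norm_A_star = op_norm_linf(transpose(A))   # ||A*|| on (R^2, l-inf) -> (R^3, l-inf)
```

Both computations return the largest absolute column sum of A, as expected from (3.1).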

In particular, if U = Y = H, where H is a Hilbert space, we have the following result.

Theorem 3.2.3. Given the bounded linear operators A, B : H → H we have

(1) (AB)∗ = B∗A∗,
(2) (A∗)∗ = A,
(3) if A has a bounded inverse A⁻¹ then A∗ has a bounded inverse and

(A∗)⁻¹ = (A⁻¹)∗,

(4) ‖A∗A‖ = ‖A‖².

Proof. (1) Observe that

(ABu, v)H = (Bu, A∗v)H = (u, B∗A∗v)H , ∀u, v ∈ H.

(2) Observe that

(u, Av)H = (A∗u, v)H = (u, A∗∗v)H , ∀u, v ∈ H.

(3) We have that

I = AA⁻¹ = A⁻¹A,

so that

I = I∗ = (AA⁻¹)∗ = (A⁻¹)∗A∗ = (A⁻¹A)∗ = A∗(A⁻¹)∗.

(4) Observe that

‖A∗A‖ ≤ ‖A∗‖‖A‖ = ‖A‖²,

and

$$\begin{aligned}
\|A^* A\| &\ge \sup_{u \in H} \{ (u, A^* A u)_H \;|\; \|u\|_H = 1 \} \\
&= \sup_{u \in H} \{ (Au, Au)_H \;|\; \|u\|_H = 1 \} \\
&= \sup_{u \in H} \{ \|Au\|_H^2 \;|\; \|u\|_H = 1 \} = \|A\|^2,
\end{aligned} \tag{3.2}$$

and hence ‖A∗A‖ = ‖A‖².

Definition 3.2.4. Given A ∈ L(H) we say that A is self-adjoint if

A = A∗.

Theorem 3.2.5. Let U and Y be Banach spaces and let A : U → Y be a bounded linear operator. Then

[R(A)]⊥ = N(A∗),

where

[R(A)]⊥ = {v∗ ∈ Y∗ | 〈Au, v∗〉Y = 0, ∀u ∈ U}.

Proof. Let v∗ ∈ N(A∗). Choose v ∈ R(A). Thus there exists u in U such that Au = v, so that

〈v, v∗〉Y = 〈Au, v∗〉Y = 〈u, A∗v∗〉U = 0.

Since v ∈ R(A) is arbitrary we have obtained

N(A∗) ⊂ [R(A)]⊥.

Suppose v∗ ∈ [R(A)]⊥. Choose u ∈ U. Thus,

〈Au, v∗〉Y = 0,

so that

〈u, A∗v∗〉U = 0, ∀u ∈ U.

Therefore A∗v∗ = θ, that is, v∗ ∈ N(A∗). Since v∗ ∈ [R(A)]⊥ is arbitrary, we get

[R(A)]⊥ ⊂ N(A∗).

This completes the proof.

The proof of the next result may be found in Luenberger [27], page 156.


Theorem 3.2.6. Let U and Y be Banach spaces and let A : U → Y be a bounded linear operator. If R(A) is closed then

R(A∗) = [N(A)]⊥.

3.3. Compact Operators

We start this section defining compact operators.

Definition 3.3.1. Let U and Y be Banach spaces. An operator A ∈ L(U, Y) (linear and bounded) is said to be compact if A takes bounded sets into pre-compact sets. Summarizing, A is compact if for each bounded sequence {un} ⊂ U, {Aun} has a convergent subsequence in Y.

Theorem 3.3.2. A compact operator maps weakly convergent sequences into norm convergent sequences.

Proof. Let A : U → Y be a compact operator. Suppose

un ⇀ u weakly in U.

By the uniform boundedness theorem, {‖un‖U} is bounded. Moreover, given v∗ ∈ Y∗ we have

$$\langle A u_n, v^* \rangle_Y = \langle u_n, A^* v^* \rangle_U \to \langle u, A^* v^* \rangle_U = \langle A u, v^* \rangle_Y. \tag{3.3}$$

Since v∗ ∈ Y∗ is arbitrary, we get that

Aun ⇀ Au weakly in Y. (3.4)

Suppose, to obtain a contradiction, that Aun does not converge in norm to Au. Thus there exist ε > 0 and a subsequence {Aunk} such that

‖Aunk − Au‖Y ≥ ε, ∀k ∈ N.

As {unk} is bounded and A is compact, {Aunk} has a subsequence converging in norm to some v ≠ Au. But then such a subsequence converges weakly to v ≠ Au, which contradicts (3.4). The proof is complete.

Theorem 3.3.3. Let H be a separable Hilbert space. Then each compact operator in L(H) is the limit in norm of a sequence of finite rank operators.

Proof. Let A be a compact operator in H. Let {φj} be an orthonormal basis in H. For each n ∈ N define

λn = sup{‖Aψ‖H | ψ ∈ [φ1, ..., φn]⊥ and ‖ψ‖H = 1}.

It is clear that {λn} is a non-increasing sequence that converges to a limit λ ≥ 0. We will show that λ = 0. Choose a sequence {ψn} such that

ψn ∈ [φ1, ..., φn]⊥,

‖ψn‖H = 1 and ‖Aψn‖H ≥ λ/2. Now we will show that

ψn ⇀ θ, weakly in H.

Let ψ∗ ∈ H∗ = H; thus there exists a sequence {aj} ⊂ C such that

$$\psi^* = \sum_{j=1}^\infty a_j \varphi_j.$$

Suppose given ε > 0. We may find n0 ∈ N such that

$$\sum_{j=n_0}^\infty |a_j|^2 < \varepsilon.$$

Choose n > n0. Hence there exists {bj}j>n such that

$$\psi_n = \sum_{j=n+1}^\infty b_j \varphi_j, \quad \text{and} \quad \sum_{j=n+1}^\infty |b_j|^2 = 1.$$

Therefore

$$|(\psi_n, \psi^*)_H| = \left| \sum_{j=n+1}^\infty (\varphi_j, \varphi_j)_H \, \overline{a_j} \, b_j \right| = \left| \sum_{j=n+1}^\infty \overline{a_j} \, b_j \right| \le \sqrt{\sum_{j=n+1}^\infty |a_j|^2} \, \sqrt{\sum_{j=n+1}^\infty |b_j|^2} \le \sqrt{\varepsilon}, \tag{3.5}$$


if n > n0. Since ε > 0 is arbitrary,

(ψn, ψ∗)H → 0, as n → ∞.

Since ψ∗ ∈ H is arbitrary, we get

ψn ⇀ θ, weakly in H.

Hence, as A is compact, we have

Aψn → θ in norm,

so that λ = 0. Finally, we may define {An} by

$$A_n(u) = A \left( \sum_{j=1}^n (u, \varphi_j)_H \varphi_j \right) = \sum_{j=1}^n (u, \varphi_j)_H A \varphi_j,$$

for each u ∈ H. Thus

‖A − An‖ = λn → 0, as n → ∞.
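A simple model case may help to visualize the construction of An. The sketch below assumes, purely for illustration, that A is the diagonal operator Aφj = φj/j, truncated to N = 50 coordinates so that it can be stored as a list. In this case A − An is again diagonal and its operator norm is the largest remaining diagonal entry, so λn = 1/(n + 1).

```python
N = 50  # truncation dimension (an assumption of this sketch)
diag_A = [1.0 / j for j in range(1, N + 1)]  # A phi_j = (1/j) phi_j

def truncation_error(n):
    # A - A_n is diagonal with entries 0 for j <= n and 1/j for j > n;
    # the operator norm of a diagonal operator on l2 is its largest |entry|
    tail = diag_A[n:]
    return max(tail) if tail else 0.0

# errors[n] plays the role of lambda_n = ||A - A_n||
errors = [truncation_error(n) for n in range(0, N + 1)]
```

The list errors is non-increasing and tends to 0, which is the behaviour of {λn} established in the proof.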


3.4. The Square Root of a Positive Operator

Definition 3.4.1. Let H be a Hilbert space. A mapping E : H → H is said to be a projection on the closed subspace M ⊂ H if for each z ∈ H we have

Ez = x

where z = x + y, x ∈ M and y ∈ M⊥.

Observe that

(1) E is linear,
(2) E is idempotent, that is, E² = E,
(3) R(E) = M,
(4) N(E) = M⊥.

Also observe that from

Ez = x

we have

$$\|Ez\|_H^2 = \|x\|_H^2 \le \|x\|_H^2 + \|y\|_H^2 = \|z\|_H^2,$$

so that

‖E‖ ≤ 1.


Definition 3.4.2. Let A, B ∈ L(H). We write

A ≥ θ

if

(Au, u)H ≥ 0, ∀u ∈ H,

and in this case we say that A is positive. Finally, we denote

A ≥ B

if

A − B ≥ θ.

Theorem 3.4.3. Let {An} be a sequence of self-adjoint commuting operators in L(H). Let B ∈ L(H) be a self-adjoint operator such that

AiB = BAi, ∀i ∈ N.

Suppose also that

A1 ≤ A2 ≤ A3 ≤ ... ≤ An ≤ ... ≤ B.

Under such hypotheses there exists a self-adjoint, bounded, linear operator A such that

Anu → Au in norm, ∀u ∈ H,

and

A ≤ B.

Proof. Consider the sequence {Cn} where

Cn = B − An ≥ θ, ∀n ∈ N.

Fix u ∈ H. First, we show that {Cnu} converges. Observe that

CiCj = CjCi, ∀i, j ∈ N.

Also, if n > m then

An − Am ≥ θ

so that

Cm = B − Am ≥ B − An = Cn.

Therefore from Cm ≥ θ and Cm − Cn ≥ θ we obtain

(Cm − Cn)Cm ≥ θ, if n > m

and also

Cn(Cm − Cn) ≥ θ.

Thus,

$$(C_m^2 u, u)_H \ge (C_n C_m u, u)_H \ge (C_n^2 u, u)_H,$$

and we may conclude that

$$\{ (C_n^2 u, u)_H \}$$

is a monotone non-increasing sequence of real numbers, bounded below by 0, so that there exists α ∈ R such that

$$\lim_{n \to \infty} (C_n^2 u, u)_H = \alpha.$$

Since each Cn is self-adjoint we obtain

$$\begin{aligned}
\|(C_n - C_m) u\|_H^2 &= ((C_n - C_m) u, (C_n - C_m) u)_H \\
&= ((C_n - C_m)(C_n - C_m) u, u)_H \\
&= (C_n^2 u, u)_H - 2 (C_n C_m u, u)_H + (C_m^2 u, u)_H \\
&\to \alpha - 2\alpha + \alpha = 0,
\end{aligned} \tag{3.6}$$

as m, n → ∞.

Therefore {Cnu} is a Cauchy sequence in norm, so that there exists the limit

$$\lim_{n \to \infty} C_n u = \lim_{n \to \infty} (B - A_n) u,$$

and hence there exists

$$\lim_{n \to \infty} A_n u, \quad \forall u \in H.$$

Now define A by

$$A u = \lim_{n \to \infty} A_n u.$$

Since this limit exists for all u ∈ H, we have that

$$\sup_{n \in N} \|A_n u\|_H$$

is finite for all u ∈ H. By the principle of uniform boundedness

$$\sup_{n \in N} \|A_n\| < \infty$$

so that there exists K > 0 such that

‖An‖ ≤ K, ∀n ∈ N.

Therefore

‖Anu‖H ≤ K‖u‖H ,

so that

$$\|A u\|_H = \lim_{n \to \infty} \|A_n u\|_H \le K \|u\|_H, \quad \forall u \in H,$$

which means that A is bounded. Fixing u, v ∈ H, we have

$$(A u, v)_H = \lim_{n \to \infty} (A_n u, v)_H = \lim_{n \to \infty} (u, A_n v)_H = (u, A v)_H,$$

and thus A is self-adjoint. Finally

(Anu, u)H ≤ (Bu, u)H , ∀n ∈ N,

so that

$$(A u, u)_H = \lim_{n \to \infty} (A_n u, u)_H \le (B u, u)_H, \quad \forall u \in H.$$

Hence A ≤ B. The proof is complete.

Definition 3.4.4. Let A ∈ L(H) be a positive operator. The self-adjoint operator B ∈ L(H) such that

B² = A

is called the square root of A. If B ≥ θ we denote

B = √A.

Theorem 3.4.5. Suppose A ∈ L(H) is positive. Then there exists B ≥ θ such that

B² = A.

Furthermore, B commutes with any C ∈ L(H) that commutes with A.

Proof. There is no loss of generality in considering

‖A‖ ≤ 1,

which means θ ≤ A ≤ I, because we may replace A by

$$\frac{A}{\|A\|}$$

so that if

$$C^2 = \frac{A}{\|A\|}$$

then

$$B = \|A\|^{1/2} C$$

satisfies B² = A. Let

B0 = θ,

and consider the sequence of operators given by

$$B_{n+1} = B_n + \frac{1}{2} (A - B_n^2), \quad \forall n \in N \cup \{0\}.$$

Since each Bn is a polynomial in A, we have that Bn is self-adjoint and commutes with any operator which commutes with A. In particular

BiBj = BjBi, ∀i, j ∈ N.

First we show that

Bn ≤ I, ∀n ∈ N ∪ {0}.

Since B0 = θ, the statement holds for n = 0. Suppose Bn ≤ I. Thus

$$\begin{aligned}
I - B_{n+1} &= I - B_n - \frac{1}{2} A + \frac{1}{2} B_n^2 \\
&= \frac{1}{2} (I - B_n)^2 + \frac{1}{2} (I - A) \ge \theta
\end{aligned} \tag{3.7}$$

so that

Bn+1 ≤ I.

The induction is complete, that is,

Bn ≤ I, ∀n ∈ N ∪ {0}.

Now we prove the monotonicity, also by induction. Observe that

B0 = θ ≤ ½A = B1,

and supposing

Bn−1 ≤ Bn,

we have

$$\begin{aligned}
B_{n+1} - B_n &= B_n + \frac{1}{2} (A - B_n^2) - B_{n-1} - \frac{1}{2} (A - B_{n-1}^2) \\
&= B_n - B_{n-1} - \frac{1}{2} (B_n^2 - B_{n-1}^2) \\
&= B_n - B_{n-1} - \frac{1}{2} (B_n + B_{n-1})(B_n - B_{n-1}) \\
&= \Big( I - \frac{1}{2} (B_n + B_{n-1}) \Big) (B_n - B_{n-1}) \\
&= \frac{1}{2} \big( (I - B_{n-1}) + (I - B_n) \big) (B_n - B_{n-1}) \ge \theta.
\end{aligned}$$

The induction is complete, that is

θ = B0 ≤ B1 ≤ B2 ≤ ... ≤ Bn ≤ ... ≤ I.

By the last theorem there exists a self-adjoint operator B such that

Bnu → Bu in norm, ∀u ∈ H.

Fixing u ∈ H we have

$$B_{n+1} u = B_n u + \frac{1}{2} (A - B_n^2) u,$$

so that taking the limit in norm as n → ∞, we get

θ = (A − B²)u.

Since u ∈ H is arbitrary we obtain

A = B².

It is also clear that

B ≥ θ.
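The iteration used in the proof is easy to run in finite dimensions. The following sketch is only an illustration: H = R², the matrix A is a hypothetical symmetric positive matrix with eigenvalues 0.25 and 0.81 (so that ‖A‖ ≤ 1), and 200 steps are an arbitrary choice that is more than enough here, since on each eigenspace the error contracts at a rate of about 1 − √a.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def sqrt_iteration(A, steps=200):
    # B_0 = 0, B_{n+1} = B_n + (A - B_n^2)/2, as in the proof of Theorem 3.4.5
    n = len(A)
    B = [[0.0] * n for _ in range(n)]
    for _ in range(steps):
        B2 = matmul(B, B)
        B = [[B[i][j] + 0.5 * (A[i][j] - B2[i][j]) for j in range(n)]
             for i in range(n)]
    return B

A = [[0.53, 0.28],
     [0.28, 0.53]]          # symmetric, eigenvalues 0.25 and 0.81
B = sqrt_iteration(A)       # approximates the positive square root of A
B_squared = matmul(B, B)
```

For this A the limit is the matrix with diagonal entries 0.7 and off-diagonal entries 0.2, whose square is A.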


3.5. About the Spectrum of a Linear Operator

Definition 3.5.1. Let U be a Banach space and let A ∈ L(U). A complex number λ is said to be in the resolvent set ρ(A) of A if

λI − A

is a bijection with a bounded inverse. We call

Rλ(A) = (λI − A)⁻¹

the resolvent of A at λ. If λ ∉ ρ(A), we write

λ ∈ σ(A) = C − ρ(A),

where σ(A) is said to be the spectrum of A.

Definition 3.5.2. Let A ∈ L(U).

(1) If u ≠ θ and Au = λu for some λ ∈ C then u is said to be an eigenvector of A and λ the corresponding eigenvalue. If λ is an eigenvalue, then (λI − A) is not injective and therefore λ ∈ σ(A).

The set of eigenvalues is said to be the point spectrum of A.


(2) If λ is not an eigenvalue but

R(λI − A)

is not dense in U, and therefore λI − A is not a bijection, we have that λ ∈ σ(A). In this case we say that λ is in the residual spectrum of A, or briefly λ ∈ Res[σ(A)].

Theorem 3.5.3. Let U be a Banach space and suppose that A ∈ L(U). Then ρ(A) is an open subset of C and

F(λ) = Rλ(A)

is an analytic function with values in L(U) on each connected component of ρ(A). For λ, µ ∈ ρ(A), Rλ(A) and Rµ(A) commute and

Rλ(A) − Rµ(A) = (µ − λ)Rµ(A)Rλ(A).

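Proof. Let λ0 ∈ ρ(A). We will show that λ0 is an interior point of ρ(A).

Observe that symbolically we may write

$$\begin{aligned}
\frac{1}{\lambda - A} &= \frac{1}{\lambda - \lambda_0 + (\lambda_0 - A)} \\
&= \frac{1}{\lambda_0 - A} \cdot \frac{1}{1 - \left( \frac{\lambda_0 - \lambda}{\lambda_0 - A} \right)} \\
&= \frac{1}{\lambda_0 - A} \left( 1 + \sum_{n=1}^\infty \left( \frac{\lambda_0 - \lambda}{\lambda_0 - A} \right)^n \right).
\end{aligned} \tag{3.8}$$

Define

$$\hat{R}_\lambda(A) = R_{\lambda_0}(A) \left( I + \sum_{n=1}^\infty (\lambda_0 - \lambda)^n (R_{\lambda_0}(A))^n \right). \tag{3.9}$$

Observe that

$$\|(R_{\lambda_0}(A))^n\| \le \|R_{\lambda_0}(A)\|^n.$$

Thus, the series indicated in (3.9) will converge in norm if

$$|\lambda - \lambda_0| < \|R_{\lambda_0}(A)\|^{-1}. \tag{3.10}$$

Hence, for λ satisfying (3.10), $\hat{R}_\lambda(A)$ is well defined and we can easily check that

$$(\lambda I - A) \hat{R}_\lambda(A) = I = \hat{R}_\lambda(A) (\lambda I - A).$$

Therefore

$$\hat{R}_\lambda(A) = R_\lambda(A), \quad \text{if } |\lambda - \lambda_0| < \|R_{\lambda_0}(A)\|^{-1},$$

so that λ0 is an interior point. Since λ0 ∈ ρ(A) is arbitrary, we have that ρ(A) is open. Moreover, (3.9) expresses Rλ(A) as a norm-convergent power series in (λ − λ0), so that F is analytic on ρ(A). Finally, observe that

$$\begin{aligned}
R_\lambda(A) - R_\mu(A) &= R_\lambda(A)(\mu I - A) R_\mu(A) - R_\lambda(A)(\lambda I - A) R_\mu(A) \\
&= R_\lambda(A)(\mu I) R_\mu(A) - R_\lambda(A)(\lambda I) R_\mu(A) \\
&= (\mu - \lambda) R_\lambda(A) R_\mu(A).
\end{aligned} \tag{3.11}$$

Interchanging the roles of λ and µ we may conclude that Rλ(A) and Rµ(A) commute.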
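Both the series (3.9) and the resolvent identity (3.11) can be verified numerically for a matrix. The sketch below is an illustration under assumptions of its own: U = R², A is a hypothetical upper triangular matrix with spectrum {2, 3}, the points λ0 = 5, λ = 5.5 and µ = 7 are real and lie in ρ(A), and the series is cut after 60 terms.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def scale(c, X):
    return [[c * X[i][j] for j in range(2)] for i in range(2)]

def inv2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def resolvent(A, lam):
    # R_lam(A) = (lam I - A)^{-1}
    return inv2([[lam - A[0][0], -A[0][1]], [-A[1][0], lam - A[1][1]]])

def series_resolvent(A, lam0, lam, terms=60):
    # truncated version of (3.9): R_lam0 (I + sum_{n>=1} (lam0 - lam)^n R_lam0^n)
    R0 = resolvent(A, lam0)
    identity = [[1.0, 0.0], [0.0, 1.0]]
    total, power = identity, identity
    for _ in range(terms):
        power = scale(lam0 - lam, matmul(power, R0))
        total = add(total, power)
    return matmul(R0, total)

def max_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

A = [[2.0, 1.0], [0.0, 3.0]]
lam0, lam, mu = 5.0, 5.5, 7.0
series_error = max_diff(series_resolvent(A, lam0, lam), resolvent(A, lam))
R_lam, R_mu = resolvent(A, lam), resolvent(A, mu)
identity_error = max_diff(add(R_lam, scale(-1.0, R_mu)),
                          scale(mu - lam, matmul(R_lam, R_mu)))
commutator_error = max_diff(matmul(R_lam, R_mu), matmul(R_mu, R_lam))
```

All three error quantities are at the level of rounding error for these values.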

Corollary 3.5.4. Let U be a Banach space and A ∈ L(U). Then the spectrum of A is non-empty.

Proof. Observe that if

$$\frac{\|A\|}{|\lambda|} < 1$$

we have

$$\begin{aligned}
(\lambda I - A)^{-1} &= [\lambda (I - A/\lambda)]^{-1} \\
&= \lambda^{-1} (I - A/\lambda)^{-1} \\
&= \lambda^{-1} \left( I + \sum_{n=1}^\infty \left( \frac{A}{\lambda} \right)^n \right).
\end{aligned} \tag{3.12}$$

Therefore we may obtain

$$R_\lambda(A) = \lambda^{-1} \left( I + \sum_{n=1}^\infty \left( \frac{A}{\lambda} \right)^n \right).$$

In particular

‖Rλ(A)‖ → 0, as |λ| → ∞. (3.13)

Suppose, to obtain a contradiction, that

σ(A) = ∅.

In such a case Rλ(A) would be an entire bounded analytic function. From Liouville's theorem, Rλ(A) would be constant, so that from (3.13) we would have

Rλ(A) = θ, ∀λ ∈ C,

which is a contradiction.


Proposition 3.5.5. Let H be a Hilbert space and A ∈ L(H). Denote by λ̄ the complex conjugate of λ.

(1) If λ ∈ Res[σ(A)] then λ̄ ∈ Pσ(A∗).
(2) If λ ∈ Pσ(A) then λ̄ ∈ Pσ(A∗) ∪ Res[σ(A∗)].

Proof. (1) If λ ∈ Res[σ(A)] then the closure of R(A − λI) is not H. Therefore there exists v ∈ (R(A − λI))⊥, v ≠ θ, such that

(v, (A − λI)u)H = 0, ∀u ∈ H

that is

((A∗ − λ̄I)v, u)H = 0, ∀u ∈ H

so that

(A∗ − λ̄I)v = θ,

which means that λ̄ ∈ Pσ(A∗).

(2) Suppose there exists v ≠ θ such that

(A − λI)v = θ,

and

λ̄ ∉ Pσ(A∗).

Thus

(u, (A − λI)v)H = 0, ∀u ∈ H,

so that

((A∗ − λ̄I)u, v)H = 0, ∀u ∈ H.

Since

(A∗ − λ̄I)u ≠ θ, ∀u ∈ H, u ≠ θ,

we get v ∈ (R(A∗ − λ̄I))⊥, so that the closure of R(A∗ − λ̄I) is not H. Hence λ̄ ∈ Res[σ(A∗)].

Theorem 3.5.6. Let A ∈ L(H) be a self-adjoint operator.then

(1) σ(A) ⊂ R.(2) Eigenvectors corresponding to distinct eigenvalues of A are

orthogonal.


Proof. Let µ, λ ∈ R. Thus, given u ∈ H we have

‖(A− (λ+ µi))u‖2 = ‖(A− λ)u‖2 + µ2‖u‖2,

so that

‖(A− (λ+ µi))u‖2 ≥ µ2‖u‖2.

Therefore if µ 6= 0, A− (λ+µi) has a bounded inverse on its range,which is closed. If R(A − (λ + µi)) 6= H then by the last result(λ − µi) would be in the point spectrum of A, which contradictsthe last inequality. Hence, if µ 6= 0 then λ+µi ∈ ρ(A). To completethe proof, suppose

Au1 = λ1u1,

and

Au2 = λ2u2,

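where

λ1, λ2 ∈ R, λ1 ≠ λ2 and u1, u2 ≠ θ.

Thus

\[ \begin{aligned} (\lambda_1 - \lambda_2)(u_1, u_2)_H &= \lambda_1(u_1, u_2)_H - \lambda_2(u_1, u_2)_H \\ &= (\lambda_1 u_1, u_2)_H - (u_1, \lambda_2 u_2)_H \\ &= (Au_1, u_2)_H - (u_1, Au_2)_H \\ &= (u_1, Au_2)_H - (u_1, Au_2)_H \\ &= 0. \end{aligned} \tag{3.14} \]

Since λ1 − λ2 ≠ 0 we get

(u1, u2)H = 0.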
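
As a concrete instance of item (2), the following sketch checks the orthogonality on a real symmetric 2×2 matrix. The matrix and its eigenpairs are hypothetical values chosen by hand for illustration.

```python
# Sketch: eigenvectors of a symmetric matrix for distinct eigenvalues are orthogonal.

def apply(A, u):
    """Matrix-vector product for a 2x2 matrix."""
    return [sum(A[i][j] * u[j] for j in range(2)) for i in range(2)]

def inner(u, v):
    """Real inner product on R^2."""
    return sum(a * b for a, b in zip(u, v))

A = [[2.0, 1.0], [1.0, 2.0]]   # symmetric, hence self-adjoint on R^2
u1, lam1 = [1.0, 1.0], 3.0     # A u1 = 3 u1
u2, lam2 = [1.0, -1.0], 1.0    # A u2 = 1 u2

# Symmetry gives (A u1, u2) = (u1, A u2), which forces (lam1 - lam2)(u1, u2) = 0.
print(apply(A, u1), apply(A, u2), inner(u1, u2))
```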

3.6. The Spectral Theorem for Bounded Self-Adjoint Operators

Let H be a complex Hilbert space. Consider A : H → H a linear bounded operator, that is A ∈ L(H), and suppose also that such an operator is self-adjoint. Define

\[ m = \inf_{u \in H}\{(Au, u)_H \mid \|u\|_H = 1\}, \]

and

\[ M = \sup_{u \in H}\{(Au, u)_H \mid \|u\|_H = 1\}. \]


Remark 3.6.1. It is possible to prove that for a linear self-adjoint operator A : H → H we have

\[ \|A\| = \sup\{|(Au, u)_H| \;\mid\; u \in H, \ \|u\|_H = 1\}. \]

This property, which we do not prove in the present work, is crucial for the subsequent results, since, for example, if A, B are linear and self-adjoint, ε > 0 and

−εI ≤ A − B ≤ εI,

then we also have

‖A − B‖ ≤ ε.

Denote by P the set of all real polynomials defined on R. Define

Φ1 : P → L(H),

by

Φ1(p(λ)) = p(A), ∀p ∈ P.

Thus we have

(1) Φ1(p1 + p2) = p1(A) + p2(A),
(2) Φ1(p1 · p2) = p1(A)p2(A),
(3) Φ1(αp) = αp(A), ∀α ∈ R, p ∈ P,
(4) if p(λ) ≥ 0 on [m, M], then p(A) ≥ θ.

We will prove (4). Consider p ∈ P. Denote the real roots of p(λ) less than or equal to m by α1, α2, ..., αn and denote those that are greater than or equal to M by β1, β2, ..., βl. Finally denote all the remaining roots, real or complex, by

v1 + iµ1, ..., vk + iµk.

Observe that if µi = 0 then vi ∈ (m, M). The assumption that p(λ) ≥ 0 on [m, M] implies that any real root in (m, M) must be of even multiplicity.

Since complex roots must occur in conjugate pairs, we have the following representation for p(λ):

\[ p(\lambda) = a \prod_{i=1}^{n}(\lambda - \alpha_i) \prod_{i=1}^{l}(\beta_i - \lambda) \prod_{i=1}^{k}\left((\lambda - v_i)^2 + \mu_i^2\right), \]

where a ≥ 0. Observe that

A − αiI ≥ θ,

since

(Au, u)H ≥ m(u, u)H ≥ αi(u, u)H, ∀u ∈ H,

and by analogy

βiI − A ≥ θ.

On the other hand, since A − viI is self-adjoint, its square is positive and hence, since the sum of positive operators is positive, we obtain

\[ (A - v_i I)^2 + \mu_i^2 I \geq \theta. \]

All these factors are polynomials in A, so they commute, and the product of commuting positive operators is positive. Therefore

p(A) ≥ θ.

The idea is now to extend the domain of Φ1 to the set of upper semi-continuous functions, which we will denote by Cup.

Observe that if f ∈ Cup, there exists a sequence of continuousfunctions gn such that

gn ↓ f, pointwise ,

that is

gn(λ) ↓ f(λ), ∀λ ∈ R.

Considering the Weierstrass Theorem, since gn ∈ C([m, M]) we may obtain a sequence of polynomials {pn} such that

\[ \left\| \left(g_n + \frac{1}{2^n}\right) - p_n \right\|_\infty < \frac{1}{2^n}, \]

where the norm ‖ · ‖∞ refers to [m, M]. Thus

pn(λ) ↓ f(λ), on [m, M].

Therefore

p1(A) ≥ p2(A) ≥ p3(A) ≥ ... ≥ pn(A) ≥ ...

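Since each pn(A) is a self-adjoint polynomial in A, we have

pj(A)pk(A) = pk(A)pj(A), ∀j, k ∈ N.

A monotone sequence of commuting self-adjoint operators which is bounded below converges strongly, so that the limit of pn(A) exists in the strong sense (that is, pn(A)u converges for each u ∈ H), and we denote

\[ \lim_{n \to \infty} p_n(A) = f(A). \]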

Now recall Dini's Theorem:

Theorem 3.6.2 (Dini). Let {gn} be a sequence of continuous functions defined on a compact set K ⊂ R. Suppose gn → g point-wise and monotonically on K, where g is also continuous. Under such assumptions the convergence in question is also uniform.


Now suppose that {pn} and {qn} are sequences of polynomials such that

pn ↓ f, and qn ↓ f.

We will show that

\[ \lim_{n \to \infty} p_n(A) = \lim_{n \to \infty} q_n(A). \]

First observe that, {pn} and {qn} being sequences of continuous functions, we have that

\[ h_{nk}(\lambda) = \max\{p_n(\lambda), q_k(\lambda)\}, \ \forall \lambda \in [m, M] \]

is also continuous, ∀n, k ∈ N. Now fix n ∈ N and define

\[ h_k(\lambda) = \max\{p_k(\lambda), q_n(\lambda)\}. \]

Observe that

hk(λ) ↓ qn(λ), ∀λ ∈ R,

so that by Dini's theorem

hk → qn, uniformly on [m, M].

It follows that for each n ∈ N there exists kn ∈ N such that if k > kn then

\[ h_k(\lambda) - q_n(\lambda) \leq \frac{1}{n}, \ \forall \lambda \in [m, M]. \]

Since

pk(λ) ≤ hk(λ), ∀λ ∈ [m, M],

we obtain

\[ p_k(\lambda) - q_n(\lambda) \leq \frac{1}{n}, \ \forall \lambda \in [m, M]. \]

By analogy, we may show that for each n ∈ N there exists kn ∈ N such that if k > kn then

\[ q_k(\lambda) - p_n(\lambda) \leq \frac{1}{n}. \]

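From above we obtain

\[ \lim_{k \to \infty} p_k(A) \leq q_n(A) + \frac{1}{n} I. \]

Since the self-adjoint operator qn(A) + (1/n)I commutes with

\[ \lim_{k \to \infty} p_k(A), \]

we obtain

\[ \lim_{k \to \infty} p_k(A) \leq \lim_{n \to \infty}\left(q_n(A) + \frac{1}{n} I\right) \leq \lim_{n \to \infty} q_n(A). \tag{3.15} \]

Similarly we may obtain

\[ \lim_{k \to \infty} q_k(A) \leq \lim_{n \to \infty} p_n(A), \]

so that

\[ \lim_{n \to \infty} q_n(A) = \lim_{n \to \infty} p_n(A) = f(A). \]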

Hence, we may extend Φ1 : P → L(H) to Φ2 : Cup → L(H), where Cup, as earlier indicated, denotes the set of upper semi-continuous functions, and

Φ2(f) = f(A).

Observe that Φ2 has the following properties:

(1) Φ2(f1 + f2) = Φ2(f1) + Φ2(f2),
(2) Φ2(f1 · f2) = f1(A)f2(A),
(3) Φ2(αf) = αΦ2(f), ∀α ∈ R, α ≥ 0,
(4) if f1(λ) ≥ f2(λ), ∀λ ∈ [m, M], then

f1(A) ≥ f2(A).

The next step is to extend Φ2 to Φ3 : Cup− → L(H), where

Cup− = {f − g | f, g ∈ Cup}.

For h = f − g ∈ Cup− we define

Φ3(h) = f(A) − g(A).

Now we will show that Φ3 is well defined. Suppose that h ∈ Cup− and

h = f1 − g1 and h = f2 − g2.

Thus

f1 − g1 = f2 − g2,

that is

f1 + g2 = f2 + g1,

so that from the definition of Φ2 we obtain

f1(A) + g2(A) = f2(A) + g1(A),

that is

f1(A) − g1(A) = f2(A) − g2(A).


Therefore Φ3 is well defined. Finally observe that for α < 0

α(f − g) = −αg − (−α)f,

where −αg ∈ Cup and −αf ∈ Cup. Thus

Φ3(αf) = αf(A) = αΦ3(f), ∀α ∈ R.

3.6.1. The Spectral Theorem. Consider the upper semi-continuous function

\[ h_\mu(\lambda) = \begin{cases} 1, & \text{if } \lambda \leq \mu, \\ 0, & \text{if } \lambda > \mu. \end{cases} \tag{3.16} \]

Denote

E(µ) = Φ3(hµ) = hµ(A).

Observe that

hµ(λ)hµ(λ) = hµ(λ), ∀λ ∈ R,

so that

[E(µ)]2 = E(µ), ∀µ ∈ R.

Therefore

{E(µ) | µ ∈ R}

is a family of orthogonal projections. Also observe that if ν ≥ µ we have

hν(λ)hµ(λ) = hµ(λ)hν(λ) = hµ(λ),

so that

E(ν)E(µ) = E(µ)E(ν) = E(µ), ∀ν ≥ µ.

If µ < m, then hµ(λ) = 0, on [m,M ], so that

E(µ) = 0, if µ < m.

Similarly, if µ ≥ M then hµ(λ) = 1, on [m, M], so that

E(µ) = I, if µ ≥M.

Next we show that the family {E(µ)} is strongly continuous from the right. First we will establish a sequence of polynomials {pn} such that

pn ↓ hµ,

and

\[ p_n(\lambda) \geq h_{\mu + \frac{1}{n}}(\lambda), \ \text{on } [m, M]. \]

Observe that for any fixed n there exists a sequence of polynomials $\{p^n_j\}$ such that

\[ p^n_j \downarrow h_{\mu + 1/n}, \ \text{point-wise}. \]


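Consider the monotone sequence

\[ g_n(\lambda) = \min\{p^r_s(\lambda) \mid r, s \in \{1, ..., n\}\}. \]

Thus

\[ g_n(\lambda) \geq h_{\mu + \frac{1}{n}}(\lambda), \ \forall \lambda \in R, \]

and we obtain

\[ \lim_{n \to \infty} g_n(\lambda) \geq \lim_{n \to \infty} h_{\mu + \frac{1}{n}}(\lambda) = h_\mu(\lambda). \]

On the other hand

\[ g_n(\lambda) \leq p^r_n(\lambda), \ \forall \lambda \in R, \ \forall r \in \{1, ..., n\}, \]

so that

\[ \lim_{n \to \infty} g_n(\lambda) \leq \lim_{n \to \infty} p^r_n(\lambda). \]

Therefore

\[ \lim_{n \to \infty} g_n(\lambda) \leq \lim_{r \to \infty} \lim_{n \to \infty} p^r_n(\lambda) = h_\mu(\lambda). \tag{3.17} \]

Thus

\[ \lim_{n \to \infty} g_n(\lambda) = h_\mu(\lambda). \]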

Observe that the gn are not necessarily polynomials. To set a sequence of polynomials, observe that we may obtain a sequence {pn} of polynomials such that

\[ |g_n(\lambda) + 1/n - p_n(\lambda)| < \frac{1}{2n}, \ \forall \lambda \in [m, M], \ n \in N, \]

so that

\[ p_n(\lambda) \geq g_n(\lambda) + \frac{1}{n} - \frac{1}{2n} \geq g_n(\lambda) \geq h_{\mu + 1/n}(\lambda). \]

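Thus

pn(A) → E(µ),

and

\[ p_n(A) \geq h_{\mu + \frac{1}{n}}(A) = E(\mu + 1/n) \geq E(\mu). \]

Therefore we may write

\[ E(\mu) = \lim_{n \to \infty} p_n(A) \geq \lim_{n \to \infty} E(\mu + 1/n) \geq E(\mu). \]

Thus

\[ \lim_{n \to \infty} E(\mu + 1/n) = E(\mu). \]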

From this we may easily obtain the strong continuity from the right.


For µ ≤ ν we have

\[ \mu(h_\nu(\lambda) - h_\mu(\lambda)) \leq \lambda(h_\nu(\lambda) - h_\mu(\lambda)) \leq \nu(h_\nu(\lambda) - h_\mu(\lambda)). \tag{3.18} \]

To verify this observe that if λ ≤ µ or λ > ν then all terms involved in the above inequalities are zero. On the other hand if

µ < λ ≤ ν

then

hν(λ) − hµ(λ) = 1,

so that in any case (3.18) holds. From the monotonicity property we have

\[ \mu(E(\nu) - E(\mu)) \leq A(E(\nu) - E(\mu)) \leq \nu(E(\nu) - E(\mu)). \tag{3.19} \]

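Now choose a, b ∈ R such that

a < m and b ≥ M.

Suppose given ε > 0. Choose a partition P0 of [a, b], that is

P0 = {a = λ0, λ1, ..., λn = b},

such that

\[ \max_{k \in \{1, ..., n\}} |\lambda_k - \lambda_{k-1}| < \varepsilon. \]

Hence

\[ \lambda_{k-1}(E(\lambda_k) - E(\lambda_{k-1})) \leq A(E(\lambda_k) - E(\lambda_{k-1})) \leq \lambda_k(E(\lambda_k) - E(\lambda_{k-1})). \tag{3.20} \]

Summing up on k and recalling that

\[ \sum_{k=1}^{n} \left(E(\lambda_k) - E(\lambda_{k-1})\right) = I, \]

we obtain

\[ \sum_{k=1}^{n} \lambda_{k-1}(E(\lambda_k) - E(\lambda_{k-1})) \leq A \leq \sum_{k=1}^{n} \lambda_k(E(\lambda_k) - E(\lambda_{k-1})). \tag{3.21} \]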


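Let $\lambda_k^0 \in [\lambda_{k-1}, \lambda_k]$. Since $(\lambda_k - \lambda_k^0) \leq (\lambda_k - \lambda_{k-1})$, from (3.20) we obtain

\[ A - \sum_{k=1}^{n} \lambda_k^0 (E(\lambda_k) - E(\lambda_{k-1})) \leq \varepsilon \sum_{k=1}^{n} (E(\lambda_k) - E(\lambda_{k-1})) = \varepsilon I. \tag{3.22} \]

By analogy

\[ -\varepsilon I \leq A - \sum_{k=1}^{n} \lambda_k^0 (E(\lambda_k) - E(\lambda_{k-1})). \tag{3.23} \]

Since

\[ A - \sum_{k=1}^{n} \lambda_k^0 (E(\lambda_k) - E(\lambda_{k-1})) \]

is self-adjoint we obtain

\[ \left\| A - \sum_{k=1}^{n} \lambda_k^0 (E(\lambda_k) - E(\lambda_{k-1})) \right\| < \varepsilon. \]

Since ε > 0 is arbitrary, we may write

\[ A = \int_a^b \lambda \, dE(\lambda), \]

that is

\[ A = \int_{m^-}^{M} \lambda \, dE(\lambda). \]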
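
In finite dimensions the spectral family is a step function and the integral reduces to a finite sum. The sketch below illustrates this on a symmetric 2×2 matrix; the matrix `A`, the hand-computed eigenprojections `P1`, `P3` and the partition size are hypothetical values chosen only for illustration.

```python
# Sketch: spectral family E(mu) and a Riemann-Stieltjes sum for A = [[2, 1], [1, 2]].

A = [[2.0, 1.0], [1.0, 2.0]]
P1 = [[0.5, -0.5], [-0.5, 0.5]]   # projection onto the eigenspace of eigenvalue 1
P3 = [[0.5, 0.5], [0.5, 0.5]]     # projection onto the eigenspace of eigenvalue 3
SPECTRUM = ((1.0, P1), (3.0, P3))

def E(mu):
    """E(mu): sum of the eigenprojections whose eigenvalue is <= mu."""
    result = [[0.0, 0.0], [0.0, 0.0]]
    for eigenvalue, projection in SPECTRUM:
        if eigenvalue <= mu:
            result = [[result[i][j] + projection[i][j] for j in range(2)]
                      for i in range(2)]
    return result

def stieltjes_sum(a, b, n):
    """sum_k lambda_k (E(lambda_k) - E(lambda_{k-1})) on a uniform partition of [a, b]."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(1, n + 1):
        left = a + (b - a) * (k - 1) / n
        right = a + (b - a) * k / n
        E_left, E_right = E(left), E(right)
        for i in range(2):
            for j in range(2):
                total[i][j] += right * (E_right[i][j] - E_left[i][j])
    return total

# Here m = 1 and M = 3, so a = 0 < m and b = 4 >= M; the mesh is 0.01.
approx = stieltjes_sum(0.0, 4.0, 400)
error = max(abs(approx[i][j] - A[i][j]) for i in range(2) for j in range(2))
print(error)
```

The error is bounded by the mesh of the partition, which mirrors the estimate obtained from (3.22) and (3.23).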

3.7. The Spectral Decomposition of Unitary Transformations

Definition 3.7.1. Let H be a Hilbert space. A linear transformation U : H → H is said to be unitary if U maps H onto H and

(Uu, Uv)H = (u, v)H, ∀u, v ∈ H.

Observe that in this case

U∗U = UU∗ = I,

so that

U−1 = U∗.


Theorem 3.7.2. Every unitary transformation U has a spectral decomposition

\[ U = \int_{0^-}^{2\pi} e^{i\phi} \, dE(\phi), \]

where {E(φ)} is a spectral family on [0, 2π]. Furthermore E(φ) is continuous at 0 and it is the limit of polynomials in U and U−1.

We present just a sketch of the proof. For the trigonometric polynomials

\[ p(e^{i\phi}) = \sum_{k=-n}^{n} c_k e^{ik\phi}, \]

consider the transformation

\[ p(U) = \sum_{k=-n}^{n} c_k U^k, \]

where ck ∈ C, ∀k ∈ {−n, ..., 0, ..., n}. Observe that

\[ \overline{p(e^{i\phi})} = \sum_{k=-n}^{n} \bar{c}_k e^{-ik\phi}, \]

so that the corresponding operator is

\[ p(U)^* = \sum_{k=-n}^{n} \bar{c}_k U^{-k} = \sum_{k=-n}^{n} \bar{c}_k (U^*)^k. \]

Also if

\[ p(e^{i\phi}) \geq 0 \]

there exists a polynomial q such that

\[ p(e^{i\phi}) = |q(e^{i\phi})|^2 = \overline{q(e^{i\phi})}\, q(e^{i\phi}), \]

so that

p(U) = [q(U)]∗q(U).

Therefore

(p(U)v, v)H = (q(U)∗q(U)v, v)H = (q(U)v, q(U)v)H ≥ 0, ∀v ∈ H,

which means

p(U) ≥ 0.

Define the function hµ(φ) by

\[ h_\mu(\phi) = \begin{cases} 1, & \text{if } 2k\pi < \phi \leq 2k\pi + \mu, \\ 0, & \text{if } 2k\pi + \mu < \phi \leq 2(k+1)\pi, \end{cases} \tag{3.24} \]

for each k ∈ {0, ±1, ±2, ±3, ...}. Define E(µ) = hµ(U). Observe that the elements of the family {E(µ)} are projections and in particular

E(0) = 0,

E(2π) = I

and if µ ≤ ν, since

hµ(φ) ≤ hν(φ),

we have

E(µ) ≤ E(ν).

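Suppose given ε > 0. Let P0 be a partition of [0, 2π], that is,

P0 = {0 = φ0, φ1, ..., φn = 2π},

such that

\[ \max_{j \in \{1, ..., n\}} |\phi_j - \phi_{j-1}| < \varepsilon. \]

For fixed φ ∈ [0, 2π], let j ∈ {1, ..., n} be such that

φ ∈ [φj−1, φj].

Then

\[ \left| e^{i\phi} - \sum_{k=1}^{n} e^{i\phi_k}\left(h_{\phi_k}(\phi) - h_{\phi_{k-1}}(\phi)\right) \right| = |e^{i\phi} - e^{i\phi_j}| \leq |\phi - \phi_j| < \varepsilon. \tag{3.25} \]

Thus,

\[ 0 \leq \left| e^{i\phi} - \sum_{k=1}^{n} e^{i\phi_k}\left(h_{\phi_k}(\phi) - h_{\phi_{k-1}}(\phi)\right) \right|^2 \leq \varepsilon^2, \]

so that, for the corresponding operators,

\[ 0 \leq \left[U - \sum_{k=1}^{n} e^{i\phi_k}(E(\phi_k) - E(\phi_{k-1}))\right]^* \left[U - \sum_{k=1}^{n} e^{i\phi_k}(E(\phi_k) - E(\phi_{k-1}))\right] \leq \varepsilon^2 I, \]

and hence

\[ \left\| U - \sum_{k=1}^{n} e^{i\phi_k}(E(\phi_k) - E(\phi_{k-1})) \right\| < \varepsilon. \]

Since ε > 0 is arbitrary, we may infer that

\[ U = \int_0^{2\pi} e^{i\phi} \, dE(\phi). \]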


3.8. Unbounded Operators

3.8.1. Introduction. Let H be a Hilbert space. Let A : D(A) → H be an operator, where unless otherwise indicated D(A) is a dense subset of H. We consider in this section the special case where A is unbounded.

Definition 3.8.1. Given A : D → H we define the graph of A, denoted by Γ(A), by

Γ(A) = {(u, Au) | u ∈ D}.

Definition 3.8.2. An operator A : D → H is said to be closed if Γ(A) is closed.

Definition 3.8.3. Let A1 : D1 → H and A2 : D2 → H be operators. We write A2 ⊃ A1 if D2 ⊃ D1 and

A2u = A1u, ∀u ∈ D1.

In this case we say that A2 is an extension of A1.

Definition 3.8.4. A linear operator A : D → H is said to be closable if it has a linear closed extension. The smallest closed extension of A is denoted by $\overline{A}$ and is called the closure of A.

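Proposition 3.8.5. Let A : D → H be a linear operator. If A is closable then

\[ \Gamma(\overline{A}) = \overline{\Gamma(A)}. \]

Proof. Suppose B is a closed extension of A. Then

\[ \overline{\Gamma(A)} \subset \overline{\Gamma(B)} = \Gamma(B), \]

so that if (θ, φ) ∈ $\overline{\Gamma(A)}$ then (θ, φ) ∈ Γ(B), and hence φ = θ. Define the operator C by

\[ D(C) = \{\psi \mid (\psi, \phi) \in \overline{\Gamma(A)} \text{ for some } \phi\}, \]

and C(ψ) = φ, where φ is the unique point such that (ψ, φ) ∈ $\overline{\Gamma(A)}$. Hence

\[ \Gamma(C) = \overline{\Gamma(A)} \subset \Gamma(B), \]

so that

A ⊂ C.

However C ⊂ B and since B is an arbitrary closed extension of A we have

\[ C = \overline{A}, \]

so that

\[ \Gamma(C) = \Gamma(\overline{A}) = \overline{\Gamma(A)}. \]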

Definition 3.8.6. Let A : D → H be a linear operator where D is dense in H. Define D(A∗) by

D(A∗) = {φ ∈ H | (Aψ, φ)H = (ψ, η)H, ∀ψ ∈ D, for some η ∈ H}.

In this case we denote

A∗φ = η.

A∗ defined in this way is called the adjoint operator related to A.

Observe that by the Riesz lemma, φ ∈ D(A∗) if and only if there exists K > 0 such that

|(Aψ, φ)H| ≤ K‖ψ‖H, ∀ψ ∈ D.

Also note that if

A ⊂ B then B∗ ⊂ A∗.

Finally, as D is dense in H,

η = A∗(φ)

is uniquely defined. However the domain of A∗ may not be dense, and in some situations we may have D(A∗) = {θ}.

If D(A∗) is dense we define

A∗∗ = (A∗)∗.

Theorem 3.8.7. Let A : D → H be a linear operator, D being dense in H. Then

(1) A∗ is closed,
(2) A is closable if and only if D(A∗) is dense, and in this case

\[ \overline{A} = A^{**}. \]

(3) If A is closable then $(\overline{A})^* = A^*$.


Proof. (1) We define the operator V : H × H → H × H by

V(φ, ψ) = (−ψ, φ).

Let E ⊂ H × H be a subspace. Thus if (φ1, ψ1) ∈ V(E⊥) then there exists (φ, ψ) ∈ E⊥ such that

V(φ, ψ) = (−ψ, φ) = (φ1, ψ1).

Hence

ψ = −φ1 and φ = ψ1,

so that for (ψ1, −φ1) ∈ E⊥ and (w1, w2) ∈ E we have

((ψ1, −φ1), (w1, w2))H×H = 0 = (ψ1, w1)H + (−φ1, w2)H.

Thus

(φ1, −w2)H + (ψ1, w1)H = 0,

and therefore

((φ1, ψ1), (−w2, w1))H×H = 0,

that is

((φ1, ψ1), V(w1, w2))H×H = 0, ∀(w1, w2) ∈ E.

This means that

(φ1, ψ1) ∈ (V(E))⊥,

so that

V(E⊥) ⊂ (V(E))⊥.

It is easily verified that the implications from which the last inclusion results are in fact equivalences, so that

V(E⊥) = (V(E))⊥.

Suppose (φ, η) ∈ H × H. Thus (φ, η) ∈ [V(Γ(A))]⊥ if and only if

((φ, η), (−Aψ, ψ))H×H = 0, ∀ψ ∈ D,

which holds if and only if

(φ, Aψ)H = (η, ψ)H, ∀ψ ∈ D,

that is, if and only if

(φ, η) ∈ Γ(A∗).

Thus

Γ(A∗) = [V(Γ(A))]⊥.

Since [V(Γ(A))]⊥ is closed, A∗ is closed.


(2) Observe that Γ(A) is a linear subset of H × H so that

\[ \begin{aligned} \overline{\Gamma(A)} &= [\Gamma(A)^\perp]^\perp \\ &= V^2[\Gamma(A)^\perp]^\perp \\ &= [V[V(\Gamma(A))^\perp]]^\perp \\ &= [V(\Gamma(A^*))]^\perp \end{aligned} \tag{3.26} \]

so that from the proof of item 1, if A∗ is densely defined we get

\[ \overline{\Gamma(A)} = \Gamma[(A^*)^*]. \]

Conversely, suppose D(A∗) is not dense. Thus there exists ψ ∈ [D(A∗)]⊥ such that ψ ≠ θ. Let (φ, A∗φ) ∈ Γ(A∗). Hence

((ψ, θ), (φ, A∗φ))H×H = (ψ, φ)H = 0,

so that

(ψ, θ) ∈ [Γ(A∗)]⊥.

Therefore V[Γ(A∗)]⊥ contains (θ, ψ) with ψ ≠ θ, and so it is not the graph of a linear operator. Since $\overline{\Gamma(A)} = V[\Gamma(A^*)]^\perp$, A is not closable.

(3) Observe that if A is closable then

\[ A^* = \overline{(A^*)} = A^{***} = (\overline{A})^*. \]

3.9. Symmetric and Self-Adjoint Operators

Definition 3.9.1. Let A : D → H be a linear operator, where D is dense in H. A is said to be symmetric if A ⊂ A∗, that is if D ⊂ D(A∗) and

A∗φ = Aφ, ∀φ ∈ D.

Equivalently, A is symmetric if and only if

(Aφ, ψ)H = (φ, Aψ)H, ∀φ, ψ ∈ D.

Definition 3.9.2. Let A : D → H be a linear operator. We say that A is self-adjoint if A = A∗, that is if A is symmetric and D = D(A∗).

Definition 3.9.3. Let A : D → H be a symmetric operator. We say that A is essentially self-adjoint if its closure $\overline{A}$ is self-adjoint. If A is closed, a subset E ⊂ D is said to be a core for A if $\overline{A|_E} = A$.


Theorem 3.9.4. Let A : D → H be a symmetric operator. Then the following statements are equivalent:

(1) A is self-adjoint.
(2) A is closed and N(A∗ ± iI) = {θ}.
(3) R(A ± iI) = H.

Proof. • 1 implies 2: Suppose A is self-adjoint and let φ ∈ D = D(A∗) be such that

Aφ = iφ

so that

A∗φ = iφ.

Observe that

\[ \begin{aligned} -i(\phi, \phi)_H &= (i\phi, \phi)_H \\ &= (A\phi, \phi)_H \\ &= (\phi, A\phi)_H \\ &= (\phi, i\phi)_H \\ &= i(\phi, \phi)_H, \end{aligned} \tag{3.27} \]

so that (φ, φ)H = 0, that is φ = θ. Thus

N(A − iI) = {θ}.

Similarly we prove that N(A + iI) = {θ}. Finally, since $\overline{A^*} = A^* = A$, we get that A = A∗ is closed.

• 2 implies 3: Suppose 2 holds. Thus the equation

A∗φ = −iφ

has no non-trivial solution. We will prove that R(A − iI) is dense in H. If ψ ∈ R(A − iI)⊥ then

((A − iI)φ, ψ)H = 0, ∀φ ∈ D,

so that ψ ∈ D(A∗) and

(A − iI)∗ψ = (A∗ + iI)ψ = θ,

and hence by above ψ = θ. Now we will prove that R(A − iI) is closed and conclude that

R(A − iI) = H.

Given φ ∈ D we have

\[ \|(A - iI)\phi\|_H^2 = \|A\phi\|_H^2 + \|\phi\|_H^2. \]

Thus if {φn} ⊂ D is such that

(A − iI)φn → ψ0,

we conclude that

φn → φ0,

for some φ0 and {Aφn} is also convergent. Since A is closed, φ0 ∈ D and

(A − iI)φ0 = ψ0.

Therefore R(A − iI) is closed, so that

R(A − iI) = H.

Similarly

R(A + iI) = H.

• 3 implies 1: Let φ ∈ D(A∗). Since R(A − iI) = H, there is an η ∈ D such that

(A − iI)η = (A∗ − iI)φ,

and since D ⊂ D(A∗) we obtain φ − η ∈ D(A∗), and

(A∗ − iI)(φ − η) = θ.

Since R(A + iI) = H we have N(A∗ − iI) = {θ}. Therefore φ = η, so that D(A∗) = D. The proof is complete.

3.9.1. The Spectral Theorem Using Cayley Transform.

In this section H is a complex Hilbert space. We suppose A is defined on a dense subspace of H, A being self-adjoint but possibly unbounded. We have shown that (A + i) and (A − i) are onto H and it is possible to prove that

U = (A− i)(A + i)−1,

exists on all H and it is unitary. Furthermore on the domain of A,

A = i(I + U)(I − U)−1.

The operator U is called the Cayley transform of A. We have already proven that

\[ U = \int_0^{2\pi} e^{i\phi} \, dF(\phi), \]

where {F(φ)} is a monotone family of orthogonal projections, strongly continuous from the right, and we may consider it such that

\[ F(\phi) = \begin{cases} 0, & \text{if } \phi \leq 0, \\ I, & \text{if } \phi \geq 2\pi. \end{cases} \tag{3.28} \]

Since F(φ) = 0, for all φ ≤ 0 and

F(0) = F(0+)

we obtain

F(0+) = 0 = F(0−),

that is, F(φ) is continuous at φ = 0. We claim that F is continuous at φ = 2π. Observe that F(2π) = F(2π+) so that we need only show that

F(2π−) = F(2π).

Suppose

F(2π) − F(2π−) ≠ θ.

Thus there exist u, v ∈ H such that

(F(2π) − F(2π−))u = v ≠ θ.

Therefore

F(φ)v = F(φ)[(F(2π) − F(2π−))u],

so that

\[ F(\phi)v = \begin{cases} 0, & \text{if } \phi < 2\pi, \\ v, & \text{if } \phi \geq 2\pi. \end{cases} \tag{3.29} \]

Observe that

\[ U - I = \int_0^{2\pi} (e^{i\phi} - 1) \, dF(\phi), \]

and

\[ U^* - I = \int_0^{2\pi} (e^{-i\phi} - 1) \, dF(\phi). \]

Let {φn} be a partition of [0, 2π]. From the monotonicity of F on [0, 2π] and the pairwise orthogonality of

{F(φn) − F(φn−1)}

we can show that (this is not proved in detail here)

\[ (U^* - I)(U - I) = \int_0^{2\pi} (e^{-i\phi} - 1)(e^{i\phi} - 1) \, dF(\phi), \]

so that, given z ∈ H we have

\[ ((U^* - I)(U - I)z, z)_H = \int_0^{2\pi} |e^{i\phi} - 1|^2 \, d\|F(\phi)z\|^2, \]

thus, for v defined above

\[ \begin{aligned} \|(U - I)v\|^2 &= ((U - I)v, (U - I)v)_H \\ &= ((U - I)^*(U - I)v, v)_H \\ &= \int_0^{2\pi} |e^{i\phi} - 1|^2 \, d\|F(\phi)v\|^2 \\ &= \int_0^{2\pi^-} |e^{i\phi} - 1|^2 \, d\|F(\phi)v\|^2 \\ &= 0. \end{aligned} \tag{3.30} \]

The last two equalities result from $e^{2\pi i} - 1 = 0$ and $d\|F(\phi)v\|^2 = 0$ on [0, 2π). Since v ≠ θ the last equation implies that 1 ∈ Pσ(U), which contradicts the existence of

(I − U)−1.

Thus, F is continuous at φ = 2π.

Now choose a sequence of real numbers {φn} such that φn ∈ (0, 2π), n = 0, ±1, ±2, ±3, ..., and

\[ -\cot\left(\frac{\phi_n}{2}\right) = n. \]

Now define Tn = F(φn) − F(φn−1). Since U commutes with F(φ), U commutes with Tn. Since

A = i(I + U)(I − U)−1,

this implies that the range of Tn is invariant under U and A. Observe that

\[ \begin{aligned} \sum_n T_n &= \sum_n (F(\phi_n) - F(\phi_{n-1})) \\ &= \lim_{\phi \to 2\pi} F(\phi) - \lim_{\phi \to 0} F(\phi) \\ &= I - \theta = I. \end{aligned} \tag{3.31} \]

Hence

\[ \sum_n R(T_n) = H. \]

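Also, for u ∈ H we have that

\[ F(\phi)T_n u = \begin{cases} 0, & \text{if } \phi < \phi_{n-1}, \\ (F(\phi) - F(\phi_{n-1}))u, & \text{if } \phi_{n-1} \leq \phi \leq \phi_n, \\ (F(\phi_n) - F(\phi_{n-1}))u, & \text{if } \phi > \phi_n, \end{cases} \tag{3.32} \]

so that

\[ (I - U)T_n u = \int_0^{2\pi} (1 - e^{i\phi}) \, dF(\phi) T_n u = \int_{\phi_{n-1}}^{\phi_n} (1 - e^{i\phi}) \, dF(\phi) u. \tag{3.33} \]

Therefore

\[ \begin{aligned} \int_{\phi_{n-1}}^{\phi_n} (1 - e^{i\phi})^{-1} \, dF(\phi) (I - U) T_n u &= \int_{\phi_{n-1}}^{\phi_n} (1 - e^{i\phi})^{-1} \, dF(\phi) \int_{\phi_{n-1}}^{\phi_n} (1 - e^{i\phi}) \, dF(\phi) u \\ &= \int_{\phi_{n-1}}^{\phi_n} (1 - e^{i\phi})^{-1}(1 - e^{i\phi}) \, dF(\phi) u \\ &= \int_{\phi_{n-1}}^{\phi_n} dF(\phi) u \\ &= \int_0^{2\pi} dF(\phi) T_n u = T_n u. \end{aligned} \tag{3.34} \]

Hence

\[ \left[(I - U)|_{R(T_n)}\right]^{-1} = \int_{\phi_{n-1}}^{\phi_n} (1 - e^{i\phi})^{-1} \, dF(\phi). \]

Also, from the theory developed earlier we may prove that

\[ f(U) = \int_0^{2\pi} f(e^{i\phi}) \, dF(\phi). \]

From this, from above and as

A = i(I + U)(I − U)−1

we obtain

\[ A T_n u = \int_{\phi_{n-1}}^{\phi_n} i(1 + e^{i\phi})(1 - e^{i\phi})^{-1} \, dF(\phi) u. \]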

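Therefore defining

\[ \lambda = -\cot\left(\frac{\phi}{2}\right), \]

and

\[ E(\lambda) = F(-2\cot^{-1}\lambda), \]

we get

\[ i(1 + e^{i\phi})(1 - e^{i\phi})^{-1} = -\cot\left(\frac{\phi}{2}\right) = \lambda. \]

Hence,

\[ A T_n u = \int_{n-1}^{n} \lambda \, dE(\lambda) u. \]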
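
The scalar identity behind this change of variables can be verified numerically. The sketch below uses hypothetical sample angles and helper names; it checks that i(1 + e^{iφ})(1 − e^{iφ})^{-1} agrees with −cot(φ/2) and that the scalar Cayley map λ ↦ (λ − i)/(λ + i) sends real numbers to the unit circle.

```python
import cmath
import math

def symbol_of_A(phi):
    """Scalar version of A = i(I + U)(I - U)^{-1} with U replaced by e^{i phi}."""
    z = cmath.exp(1j * phi)
    return 1j * (1 + z) / (1 - z)

def cayley(lam):
    """Scalar version of U = (A - i)(A + i)^{-1}."""
    return (lam - 1j) / (lam + 1j)

sample_angles = (0.5, 1.0, math.pi, 4.0, 6.0)   # points of (0, 2*pi)

# Largest deviation between the symbol and -cot(phi/2) over the samples.
max_gap = max(abs(symbol_of_A(phi) - (-1.0 / math.tan(phi / 2.0)))
              for phi in sample_angles)

# The Cayley map sends real numbers to complex numbers of modulus one.
max_modulus_gap = max(abs(abs(cayley(lam)) - 1.0)
                      for lam in (-10.0, -1.0, 0.0, 2.5, 100.0))
print(max_gap, max_modulus_gap)
```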

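Finally, from

\[ u = \sum_{n=-\infty}^{\infty} T_n u, \]

we can obtain

\[ \begin{aligned} Au &= A\left(\sum_{n=-\infty}^{\infty} T_n u\right) \\ &= \sum_{n=-\infty}^{\infty} A T_n u \\ &= \sum_{n=-\infty}^{\infty} \int_{n-1}^{n} \lambda \, dE(\lambda) u. \end{aligned} \tag{3.35} \]

Since the convergence in question is in norm, we may write

\[ Au = \int_{-\infty}^{\infty} \lambda \, dE(\lambda) u. \]

Since u ∈ H is arbitrary, we may denote

\[ A = \int_{-\infty}^{\infty} \lambda \, dE(\lambda). \]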

CHAPTER 4

Measure and Integration

4.1. Basic Concepts

In this chapter U denotes a topological space.

Definition 4.1.1 (σ-Algebra). A collection M of subsets of U is said to be a σ-algebra if M has the following properties:

(1) U ∈ M,
(2) if A ∈ M then U − A ∈ M,
(3) if An ∈ M, ∀n ∈ N, then $\cup_{n=0}^{\infty} A_n$ ∈ M.

Definition 4.1.2 (Measurable Spaces). If M is a σ-algebra in U we say that U is a measurable space. The elements of M are called the measurable sets of U.

Definition 4.1.3 (Measurable Function). If U is a measurable space and V is a topological space, we say that f : U → V is a measurable function if f−1(W) is measurable whenever W ⊂ V is an open set.

Remark 4.1.4. (1) Observe that ∅ = U − U so that from 1 and 2 in Definition 4.1.1, we have that ∅ ∈ M.

(2) From 1 and 3 in Definition 4.1.1, it is clear that $\cup_{i=1}^{n} A_i$ ∈ M whenever Ai ∈ M, ∀i ∈ {1, ..., n}.

(3) Since $\cap_{i=1}^{\infty} A_i = (\cup_{i=1}^{\infty} A_i^c)^c$, also from Definition 4.1.1, it is clear that M is closed under countable intersections.

(4) Since A − B = Bc ∩ A we obtain: if A, B ∈ M then A − B ∈ M.

Theorem 4.1.5. Let F be any collection of subsets of U. Then there exists a smallest σ-algebra M0 in U such that F ⊂ M0.


Proof. Let Ω be the family of all σ-algebras that contain F. Since the set of all subsets of U is a σ-algebra, Ω is non-empty.

Let $M_0 = \cap_{M_\lambda \in \Omega} M_\lambda$. It is clear that M0 ⊃ F; it remains to prove that in fact M0 is a σ-algebra. Observe that:

(1) U ∈ Mλ, ∀Mλ ∈ Ω, so that U ∈ M0,
(2) A ∈ M0 implies A ∈ Mλ, ∀Mλ ∈ Ω, so that Ac ∈ Mλ, ∀Mλ ∈ Ω, which means Ac ∈ M0,
(3) {An} ⊂ M0 implies {An} ⊂ Mλ, ∀Mλ ∈ Ω, so that $\cup_{n=1}^{\infty} A_n$ ∈ Mλ, ∀Mλ ∈ Ω, which means $\cup_{n=1}^{\infty} A_n$ ∈ M0.

From Definition 4.1.1 the proof is complete.

Definition 4.1.6 (Borel Sets). Let U be a topological space. Considering the last theorem, there exists a smallest σ-algebra in U, denoted by B, which contains the open sets of U. The elements of B are called the Borel sets.

Theorem 4.1.7. Suppose M is a σ-algebra in U and V is a topological space. For f : U → V, we have:

(1) If Ω = {E ⊂ V | f−1(E) ∈ M}, then Ω is a σ-algebra.
(2) If V = [−∞, ∞], and f−1((α, ∞]) ∈ M, for each α ∈ R, then f is measurable.

Proof. (1) (a) V ∈ Ω since f−1(V) = U and U ∈ M.
(b) E ∈ Ω ⇒ f−1(E) ∈ M ⇒ U − f−1(E) ∈ M ⇒ f−1(V − E) ∈ M ⇒ V − E ∈ Ω.
(c) {Ei} ⊂ Ω ⇒ f−1(Ei) ∈ M, ∀i ∈ N ⇒ $\cup_{i=1}^{\infty} f^{-1}(E_i)$ ∈ M ⇒ $f^{-1}(\cup_{i=1}^{\infty} E_i)$ ∈ M ⇒ $\cup_{i=1}^{\infty} E_i$ ∈ Ω.
Thus Ω is a σ-algebra.

(2) Define Ω = {E ⊂ [−∞, ∞] | f−1(E) ∈ M}; from above Ω is a σ-algebra. Given α ∈ R, let {αn} be a real sequence such that αn → α as n → ∞, αn < α, ∀n ∈ N. Since (αn, ∞] ∈ Ω for each n and

\[ [-\infty, \alpha) = \cup_{n=1}^{\infty} [-\infty, \alpha_n] = \cup_{n=1}^{\infty} (\alpha_n, \infty]^c, \tag{4.1} \]

we obtain [−∞, α) ∈ Ω. Furthermore, we have (α, β) = [−∞, β) ∩ (α, ∞] ∈ Ω. Since every open set in [−∞, ∞] may be expressed as a countable union of intervals (α, β) we have that Ω contains all the open sets. Thus, f−1(E) ∈ M whenever E is open, so that f is measurable.


Proposition 4.1.8. If fn : U → [−∞, ∞] is a sequence of measurable functions and $g = \sup_{n \geq 1} f_n$ and $h = \limsup_{n \to \infty} f_n$ then g and h are measurable.

Proof. Observe that $g^{-1}((\alpha, \infty]) = \cup_{n=1}^{\infty} f_n^{-1}((\alpha, \infty])$. From the last theorem g is measurable. By analogy $h = \inf_{k \geq 1}\{\sup_{i \geq k} f_i\}$ is measurable.

4.2. Simple Functions

Definition 4.2.1 (Simple Functions). A function f : U → C is said to be a simple function if its range (R(f)) has only finitely many points. If {α1, ..., αn} = R(f) and we set Ai = {u ∈ U | f(u) = αi}, clearly we have $f = \sum_{i=1}^{n} \alpha_i \chi_{A_i}$, where

\[ \chi_{A_i}(u) = \begin{cases} 1, & \text{if } u \in A_i, \\ 0, & \text{otherwise}. \end{cases} \tag{4.2} \]

Theorem 4.2.2. Let f : U → [0, ∞] be a measurable function. Then there exists a sequence of simple functions sn : U → [0, ∞] such that

(1) 0 ≤ s1 ≤ s2 ≤ ... ≤ f,
(2) sn(u) → f(u) as n → ∞, ∀u ∈ U.

Proof. Define δn = 2−n. To each n ∈ N and each t ∈ R+, there corresponds a unique integer K = Kn(t) such that

\[ K\delta_n \leq t < (K + 1)\delta_n. \tag{4.3} \]

Defining

\[ \varphi_n(t) = \begin{cases} K_n(t)\delta_n, & \text{if } 0 \leq t < n, \\ n, & \text{if } t \geq n, \end{cases} \tag{4.4} \]

we have that each ϕn is a Borel function on [0, ∞], such that

(1) t − δn < ϕn(t) ≤ t if 0 ≤ t ≤ n,
(2) 0 ≤ ϕ1 ≤ ϕ2 ≤ ... ≤ t,
(3) ϕn(t) → t as n → ∞, ∀t ∈ [0, ∞].

It follows that the sequence sn = ϕn ∘ f corresponds to the results indicated above.
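
The dyadic truncation used in this proof is easy to experiment with. The sketch below implements the function of (4.4) under the name `phi` and composes it with a hypothetical function `f`; both names and the sample points are illustrative assumptions.

```python
import math

def phi(n, t):
    """Dyadic truncation: the largest multiple of 2^{-n} not exceeding t, capped at n."""
    if t >= n:
        return float(n)
    delta = 2.0 ** (-n)
    return math.floor(t / delta) * delta

def f(u):
    """Hypothetical non-negative function used only for illustration."""
    return u * u

def s(n, u):
    """n-th simple function s_n = phi_n composed with f."""
    return phi(n, f(u))

# s_n(u) increases with n and approaches f(u).
u = 1.3
approximations = [s(n, u) for n in range(1, 11)]
print(approximations, f(u))
```

Each `s(n, .)` takes only finitely many values, since `phi(n, .)` takes values in the finite set of multiples of 2^{-n} lying in [0, n].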


4.3. Measures

Definition 4.3.1 (Measure). Let M be a σ-algebra on a topological space U. A function µ : M → [0, ∞] is said to be a measure if µ(∅) = 0 and µ is countably additive, that is, given a sequence {Ai} ⊂ M of pairwise disjoint sets, then

\[ \mu(\cup_{i=1}^{\infty} A_i) = \sum_{i=1}^{\infty} \mu(A_i). \tag{4.5} \]

In this case (U, M, µ) is called a measure space.

Proposition 4.3.2. Let µ : M → [0, ∞] be a measure, where M is a σ-algebra of U. Then we have the following.

(1) µ(A1 ∪ ... ∪ An) = µ(A1) + ... + µ(An) for any given {Ai} of pairwise disjoint measurable sets of M.
(2) If A, B ∈ M and A ⊂ B then µ(A) ≤ µ(B).
(3) If {An} ⊂ M, $A = \cup_{n=1}^{\infty} A_n$ and

A1 ⊂ A2 ⊂ A3 ⊂ ... (4.6)

then $\lim_{n \to \infty} \mu(A_n) = \mu(A)$.

(4) If {An} ⊂ M, $A = \cap_{n=1}^{\infty} A_n$, A1 ⊃ A2 ⊃ A3 ⊃ ... and µ(A1) is finite then

\[ \lim_{n \to \infty} \mu(A_n) = \mu(A). \tag{4.7} \]

Proof. (1) Take An+1 = An+2 = ... = ∅ in Definition 4.3.1.

(2) Observe that B = A ∪ (B − A) and A ∩ (B − A) = ∅ so that by above µ(B) = µ(A ∪ (B − A)) = µ(A) + µ(B − A) ≥ µ(A).

(3) Let B1 = A1 and let Bn = An − An−1 for n ≥ 2. Then Bn ∈ M, Bi ∩ Bj = ∅ if i ≠ j, An = B1 ∪ ... ∪ Bn and $A = \cup_{i=1}^{\infty} B_i$. Thus

\[ \mu(A) = \mu(\cup_{i=1}^{\infty} B_i) = \sum_{i=1}^{\infty} \mu(B_i) = \lim_{n \to \infty} \sum_{i=1}^{n} \mu(B_i) = \lim_{n \to \infty} \mu(A_n). \tag{4.8} \]

(4) Let Cn = A1 − An. Then C1 ⊂ C2 ⊂ ..., µ(Cn) = µ(A1) − µ(An), and $A_1 - A = \cup_{n=1}^{\infty} C_n$.


Thus by 3 we have

\[ \mu(A_1) - \mu(A) = \mu(A_1 - A) = \lim_{n \to \infty} \mu(C_n) = \mu(A_1) - \lim_{n \to \infty} \mu(A_n). \tag{4.9} \]

4.4. Integration of Simple Functions

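Definition 4.4.1 (Integral for Simple Functions). For s : U → [0, ∞], a measurable simple function, that is,

\[ s = \sum_{i=1}^{n} \alpha_i \chi_{A_i}, \tag{4.10} \]

where

\[ \chi_{A_i}(u) = \begin{cases} 1, & \text{if } u \in A_i, \\ 0, & \text{otherwise}, \end{cases} \tag{4.11} \]

we define the integral of s over E ∈ M, denoted by $\int_E s \, d\mu$, as

\[ \int_E s \, d\mu = \sum_{i=1}^{n} \alpha_i \mu(A_i \cap E). \tag{4.12} \]

The convention 0 · ∞ = 0 is used here.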
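
As an illustration of this definition, the following sketch evaluates the integral of a simple function on a finite set carrying a weighted counting measure. The set, the weights (one of them infinite) and the function names are hypothetical; the case of a zero coefficient is skipped explicitly so that the convention 0 · ∞ = 0 is respected.

```python
import math

# Hypothetical finite measure space: mu(S) is the sum of the weights of the points of S.
WEIGHTS = {0: 1.0, 1: 2.0, 2: 0.5, 3: math.inf}

def mu(S):
    """Measure of a subset S of {0, 1, 2, 3}."""
    return sum(WEIGHTS[u] for u in S)

def integrate_simple(terms, E):
    """Integral over E of s = sum_i alpha_i chi_{A_i}, with terms = [(alpha_i, A_i), ...]."""
    total = 0.0
    for alpha, A in terms:
        if alpha == 0:
            continue                 # convention 0 * infinity = 0
        total += alpha * mu(A & E)
    return total

# s = 3 on {0, 1} and s = 0 on {2, 3}; note that mu({2, 3}) is infinite.
s_terms = [(3.0, {0, 1}), (0.0, {2, 3})]

integral_over_U = integrate_simple(s_terms, {0, 1, 2, 3})
integral_over_E = integrate_simple(s_terms, {1, 2})
print(integral_over_U, integral_over_E)
```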

Definition 4.4.2 (Integral for Non-negative Measurable Functions). If f : U → [0, ∞] is measurable, for E ∈ M, we define the integral of f on E, denoted by $\int_E f \, d\mu$, as

\[ \int_E f \, d\mu = \sup_{s \in A} \left\{ \int_E s \, d\mu \right\}, \tag{4.13} \]

where

A = {s simple and measurable | 0 ≤ s ≤ f}. (4.14)

Definition 4.4.3 (Integrals for Measurable Functions). For a measurable f : U → [−∞, ∞] and E ∈ M, we define f+ = max{f, 0}, f− = max{−f, 0} and the integral of f on E, denoted by $\int_E f \, d\mu$, as

\[ \int_E f \, d\mu = \int_E f^+ \, d\mu - \int_E f^- \, d\mu. \]


Theorem 4.4.4 (Lebesgue's Monotone Convergence Theorem). Let {fn} be a sequence of real measurable functions on U and suppose that

(1) 0 ≤ f1(u) ≤ f2(u) ≤ ... ≤ ∞, ∀u ∈ U,
(2) fn(u) → f(u) as n → ∞, ∀u ∈ U.

Then,
(a) f is measurable,
(b) $\int_U f_n \, d\mu \to \int_U f \, d\mu$ as n → ∞.

Proof. Since $\int_U f_n \, d\mu \leq \int_U f_{n+1} \, d\mu$, ∀n ∈ N, there exists α ∈ [0, ∞] such that

\[ \int_U f_n \, d\mu \to \alpha, \ \text{as } n \to \infty. \tag{4.15} \]

By Proposition 4.1.8, f is measurable, and since fn ≤ f we have

\[ \int_U f_n \, d\mu \leq \int_U f \, d\mu. \tag{4.16} \]

From (4.15) and (4.16), we obtain

\[ \alpha \leq \int_U f \, d\mu. \tag{4.17} \]

Let s be any simple measurable function such that 0 ≤ s ≤ f, and let c ∈ R be such that 0 < c < 1. For each n ∈ N we define

En = {u ∈ U | fn(u) ≥ cs(u)}. (4.18)

Clearly En is measurable, E1 ⊂ E2 ⊂ ... and U = ∪n∈N En. Observe that

\[ \int_U f_n \, d\mu \geq \int_{E_n} f_n \, d\mu \geq c \int_{E_n} s \, d\mu. \tag{4.19} \]

Letting n → ∞ and applying Proposition 4.3.2, we obtain

\[ \alpha = \lim_{n \to \infty} \int_U f_n \, d\mu \geq c \int_U s \, d\mu, \tag{4.20} \]

so that, letting c → 1,

\[ \alpha \geq \int_U s \, d\mu, \ \forall s \text{ simple and measurable such that } 0 \leq s \leq f. \tag{4.21} \]


This implies

\[ \alpha \geq \int_U f \, d\mu. \tag{4.22} \]

From (4.17) and (4.22) the proof is complete.

We do not prove the next result (it is a direct consequence of the last theorem). For a proof see [36].

Corollary 4.4.5. Let {fn} be a sequence of non-negative measurable functions defined on U (fn : U → [0, ∞], ∀n ∈ N). Defining $f(u) = \sum_{n=1}^{\infty} f_n(u)$, ∀u ∈ U, we have

\[ \int_U f \, d\mu = \sum_{n=1}^{\infty} \int_U f_n \, d\mu. \]

Theorem 4.4.6 (Fatou's Lemma). If {fn : U → [0, ∞]} is a sequence of measurable functions, then

\[ \int_U \liminf_{n \to \infty} f_n \, d\mu \leq \liminf_{n \to \infty} \int_U f_n \, d\mu. \tag{4.23} \]

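Proof. For each k ∈ N define gk : U → [0, ∞] by

\[ g_k(u) = \inf_{i \geq k} \{f_i(u)\}. \tag{4.24} \]

Then

gk ≤ fk (4.25)

so that

\[ \int_U g_k \, d\mu \leq \int_U f_k \, d\mu, \ \forall k \in N. \tag{4.26} \]

Also 0 ≤ g1 ≤ g2 ≤ ..., each gk is measurable, and

\[ \lim_{k \to \infty} g_k(u) = \liminf_{n \to \infty} f_n(u), \ \forall u \in U. \tag{4.27} \]

From the Lebesgue monotone convergence theorem

\[ \liminf_{k \to \infty} \int_U g_k \, d\mu = \lim_{k \to \infty} \int_U g_k \, d\mu = \int_U \liminf_{n \to \infty} f_n \, d\mu. \tag{4.28} \]

From (4.26) we have

\[ \liminf_{k \to \infty} \int_U g_k \, d\mu \leq \liminf_{k \to \infty} \int_U f_k \, d\mu. \tag{4.29} \]

Thus, from (4.28) and (4.29) we obtain

\[ \int_U \liminf_{n \to \infty} f_n \, d\mu \leq \liminf_{n \to \infty} \int_U f_n \, d\mu. \tag{4.30} \]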
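
The inequality in Fatou's lemma may be strict. The sketch below shows this on a hypothetical two-point space with the counting measure, where fn alternates between the indicator functions of the two points. Since the sequences involved are periodic, the finite-horizon `liminf` helper (an assumption of this sketch, not a general-purpose routine) returns the exact value here.

```python
U = (0, 1)   # hypothetical two-point space with the counting measure

def f(n, u):
    """f_n is the indicator of the point n mod 2."""
    return 1.0 if u == n % 2 else 0.0

def integral(g):
    """Integral with respect to the counting measure on U."""
    return sum(g(u) for u in U)

def liminf(values):
    """sup over k of inf over the tail i >= k, on a finite window.

    Exact for the periodic sequences used below; only a surrogate in general.
    """
    half = len(values) // 2
    return max(min(values[k:]) for k in range(half))

N = 40
pointwise_liminf = {u: liminf([f(n, u) for n in range(1, N + 1)]) for u in U}

lhs = integral(lambda u: pointwise_liminf[u])                           # integral of liminf f_n
rhs = liminf([integral(lambda u, n=n: f(n, u)) for n in range(1, N + 1)])  # liminf of integrals
print(lhs, rhs)
```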

Theorem 4.4.7 (Lebesgue's Dominated Convergence Theorem). Suppose {fn} is a sequence of complex measurable functions on U such that

\[ \lim_{n \to \infty} f_n(u) = f(u), \ \forall u \in U. \tag{4.31} \]

If there exists a measurable function g : U → R+ such that $\int_U g \, d\mu < \infty$ and |fn(u)| ≤ g(u), ∀u ∈ U, n ∈ N, then

(1) $\int_U |f| \, d\mu < \infty$,
(2) $\lim_{n \to \infty} \int_U |f_n - f| \, d\mu = 0$.

Proof. (1) This inequality holds since f is measurable and |f| ≤ g.

(2) Since 2g − |fn − f| ≥ 0 we may apply Fatou's Lemma and obtain:

\[ \int_U 2g \, d\mu \leq \liminf_{n \to \infty} \int_U (2g - |f_n - f|) \, d\mu, \tag{4.32} \]

so that

\[ \limsup_{n \to \infty} \int_U |f_n - f| \, d\mu \leq 0. \tag{4.33} \]

Hence

\[ \lim_{n \to \infty} \int_U |f_n - f| \, d\mu = 0. \tag{4.34} \]


We finish this section with an important remark:

Remark 4.4.8. In a measure space U we say that a property holds almost everywhere (a.e.) if it holds on U except for a set of measure zero. Finally, since integrals are not changed by the redefinition of the functions in question on sets of zero measure, the properties of items (1) and (2) of Lebesgue's monotone convergence theorem may be considered a.e. in U, instead of in all U. Similar remarks are valid for Fatou's lemma and Lebesgue's dominated convergence theorem.

4.5. The Fubini Theorem

We start this section with the definition of a complete measure space.

Definition 4.5.1. We say that a measure space (U, M, µ) is complete if M contains all subsets of sets of zero measure. That is, if A ∈ M, µ(A) = 0 and B ⊂ A then B ∈ M.

Definition 4.5.2. We say that a collection C of subsets of U is a semi-algebra in U if the two conditions below are valid.

(1) If A, B ∈ C then A ∩ B ∈ C.
(2) For each A ∈ C, Ac is a finite disjoint union of elements in C.

4.5.1. Product Measures. Let (U, M1, µ1) and (V, M2, µ2) be two complete measure spaces. We recall that the cartesian product between U and V, denoted by U × V, is defined by

U × V = {(u, v) | u ∈ U and v ∈ V}.

If A ⊂ U and B ⊂ V we call A × B a rectangle. If A ∈ M1 and B ∈ M2 we say that A × B is a measurable rectangle. The collection R of measurable rectangles is a semi-algebra since

(A × B) ∩ (C × D) = (A ∩ C) × (B ∩ D),

and

(A × B)c = (Ac × B) ∪ (A × Bc) ∪ (Ac × Bc).

We define λ : M1 × M2 → R+ by

λ(A × B) = µ1(A)µ2(B).

Lemma 4.5.3. Let {Ai × Bi}i∈N be a countable disjoint collection of measurable rectangles whose union is the rectangle A × B. Then

\[ \lambda(A \times B) = \sum_{i=1}^{\infty} \mu_1(A_i)\mu_2(B_i). \]


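Proof. Let u ∈ A. Thus each v ∈ B is such that (u, v) is in exactly one Ai × Bi. Therefore

\[ \chi_{A \times B}(u, v) = \sum_{i=1}^{\infty} \chi_{A_i}(u)\chi_{B_i}(v). \]

Hence for the fixed u in question, from the corollary of the Lebesgue monotone convergence theorem we may write

\[ \int_V \chi_{A \times B}(u, v) \, d\mu_2(v) = \int_V \sum_{i=1}^{\infty} \chi_{A_i}(u)\chi_{B_i}(v) \, d\mu_2(v) = \sum_{i=1}^{\infty} \chi_{A_i}(u)\mu_2(B_i), \tag{4.35} \]

so that also from the mentioned corollary

\[ \int_U d\mu_1(u) \int_V \chi_{A \times B}(u, v) \, d\mu_2(v) = \sum_{i=1}^{\infty} \mu_1(A_i)\mu_2(B_i). \]

Observe that

\[ \int_U d\mu_1(u) \int_V \chi_{A \times B}(u, v) \, d\mu_2(v) = \int_U d\mu_1(u) \int_V \chi_A(u)\chi_B(v) \, d\mu_2(v) = \mu_1(A)\mu_2(B). \]

From the last two equations we may write

\[ \lambda(A \times B) = \mu_1(A)\mu_2(B) = \sum_{i=1}^{\infty} \mu_1(A_i)\mu_2(B_i). \]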
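
A finite instance of this additivity can be checked with exact rational arithmetic. In the sketch below the point weights `W1`, `W2`, the rectangle A × B and its decomposition into three disjoint rectangles are all hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical weighted counting measures on U = {0, 1, 2} and V = {0, 1}.
W1 = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(2)}
W2 = {0: Fraction(3), 1: Fraction(1, 4)}

def mu1(S):
    return sum((W1[u] for u in S), Fraction(0))

def mu2(S):
    return sum((W2[v] for v in S), Fraction(0))

A, B = {0, 1, 2}, {0, 1}
# Decomposition of A x B into pairwise disjoint measurable rectangles A_i x B_i.
pieces = [({0}, {0, 1}), ({1, 2}, {0}), ({1, 2}, {1})]

# Check that the pieces are disjoint and that their union is A x B.
covered = [point for Ai, Bi in pieces for point in product(Ai, Bi)]
is_disjoint_cover = (len(covered) == len(set(covered))
                     and set(covered) == set(product(A, B)))

lhs = mu1(A) * mu2(B)                                                # lambda(A x B)
rhs = sum((mu1(Ai) * mu2(Bi) for Ai, Bi in pieces), Fraction(0))     # sum over the pieces
print(is_disjoint_cover, lhs, rhs)
```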

Definition 4.5.4. Let E ⊂ U × V. We define Eu and Ev by

Eu = {v | (u, v) ∈ E},

and

Ev = {u | (u, v) ∈ E}.

Observe that

χEu(v) = χE(u, v),

(Ec)u = (Eu)c,

and

(∪Eα)u = ∪(Eα)u,

for any collection {Eα}.


We denote by Rσ the collection of sets which are countable unions of measurable rectangles. Also, Rσδ will denote the collection of sets which are countable intersections of elements of Rσ.

Lemma 4.5.5. Let u ∈ U and E ∈ Rσδ. Then Eu is a measurable subset of V.

Proof. If E ∈ R the result is trivial. Let E ∈ Rσ. Then E may be expressed as a disjoint union

\[ E = \cup_{i=1}^{\infty} E_i, \]

where Ei ∈ R, ∀i ∈ N. Thus,

\[ \chi_{E_u}(v) = \chi_E(u, v) = \sup_{i \in N} \chi_{E_i}(u, v) = \sup_{i \in N} \chi_{(E_i)_u}(v). \tag{4.36} \]

Since each (Ei)u is measurable we have that

\[ \chi_{(E_i)_u}(v) \]

is a measurable function of v, so that

\[ \chi_{E_u}(v) \]

is measurable, which implies that Eu is measurable. Suppose now

\[ E = \cap_{i=1}^{\infty} E_i, \]

where Ei ∈ Rσ and Ei+1 ⊂ Ei, ∀i ∈ N. Then

\[ \chi_{E_u}(v) = \chi_E(u, v) = \inf_{i \in N} \chi_{E_i}(u, v) = \inf_{i \in N} \chi_{(E_i)_u}(v). \tag{4.37} \]

Thus, as from above $\chi_{(E_i)_u}(v)$ is measurable for each i ∈ N, we have that $\chi_{E_u}$ is also measurable, so that Eu is measurable.

Lemma 4.5.6. Let E be a set in Rσδ with (µ1 × µ2)(E) < ∞. Then the function g defined by

g(u) = µ2(Eu)

is a measurable function and

\[ \int_U g \, d\mu_1(u) = (\mu_1 \times \mu_2)(E). \]

Proof. The lemma is true if $E$ is a measurable rectangle. Let $\{E_i\}$ be a disjoint sequence of measurable rectangles and $E = \cup_{i=1}^{\infty} E_i$. Set

\[
g_i(u) = \mu_2((E_i)_u).
\]

Then each $g_i$ is a non-negative measurable function and

\[
g = \sum_{i=1}^{\infty} g_i.
\]

Thus $g$ is measurable, and by the corollary of the Lebesgue monotone convergence theorem, we have

\[
\int_U g(u)\,d\mu_1(u) = \sum_{i=1}^{\infty}\int_U g_i(u)\,d\mu_1(u) = \sum_{i=1}^{\infty}(\mu_1\times\mu_2)(E_i) = (\mu_1\times\mu_2)(E). \tag{4.38}
\]

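Let $E$ be a set of finite measure in $R_{\sigma\delta}$. Then there is a sequence $\{E_i\}$ in $R_\sigma$ such that

\[
E_{i+1} \subset E_i
\]

and

\[
E = \cap_{i=1}^{\infty} E_i.
\]

Let $g_i(u) = \mu_2((E_i)_u)$. Since

\[
\int_U g_1(u)\,d\mu_1(u) = (\mu_1\times\mu_2)(E_1) < \infty,
\]

we have that $g_1(u) < \infty$ a.e. in $U$.

For $u \in U$ such that $g_1(u) < \infty$, we have that $\{(E_i)_u\}$ is a decreasing sequence of measurable sets of finite measure whose intersection is $E_u$. Thus

\[
g(u) = \mu_2(E_u) = \lim_{i\to\infty}\mu_2((E_i)_u) = \lim_{i\to\infty} g_i(u), \tag{4.39}
\]

that is, $g_i \to g$ a.e. in $U$.

We may conclude that $g$ is also measurable. Since

\[
0 \le g_i \le g_1, \quad \forall i \in \mathbb{N},
\]

and $g_1$ is integrable, the Lebesgue dominated convergence theorem implies that

\[
\int_U g(u)\,d\mu_1(u) = \lim_{i\to\infty}\int_U g_i\,d\mu_1(u) = \lim_{i\to\infty}(\mu_1\times\mu_2)(E_i) = (\mu_1\times\mu_2)(E).
\]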

Lemma 4.5.7. Let $E$ be a set such that $(\mu_1\times\mu_2)(E) = 0$. Then for almost all $u \in U$ we have

\[
\mu_2(E_u) = 0.
\]

Proof. Observe that there is a set $F \in R_{\sigma\delta}$ such that $E \subset F$ and

\[
(\mu_1\times\mu_2)(F) = 0.
\]

From the last lemma, $\mu_2(F_u) = 0$ for almost all $u$. From $E_u \subset F_u$ we obtain

\[
\mu_2(E_u) = 0
\]

for almost all $u$, since $\mu_2$ is complete.

Proposition 4.5.8. Let $E$ be a measurable subset of $U \times V$ such that $(\mu_1\times\mu_2)(E)$ is finite. Then for almost all $u$ the set $E_u$ is a measurable subset of $V$. The function $g$ defined by

\[
g(u) = \mu_2(E_u)
\]

is measurable and

\[
\int_U g\,d\mu_1(u) = (\mu_1\times\mu_2)(E).
\]

Proof. First observe that there is a set $F \in R_{\sigma\delta}$ such that $E \subset F$ and

\[
(\mu_1\times\mu_2)(F) = (\mu_1\times\mu_2)(E).
\]

Let $G = F - E$. Since $F$ and $E$ are measurable, $G$ is measurable, and

\[
(\mu_1\times\mu_2)(G) = 0.
\]

By the last lemma we obtain

\[
\mu_2(G_u) = 0
\]

for almost all $u$, so that

\[
g(u) = \mu_2(E_u) = \mu_2(F_u), \quad \text{a.e. in } U.
\]

By Lemma 4.5.6 we may conclude that $g$ is measurable and

\[
\int_U g\,d\mu_1(u) = (\mu_1\times\mu_2)(F) = (\mu_1\times\mu_2)(E).
\]

Theorem 4.5.9 (Fubini). Let $(U, \mathcal{M}_1, \mu_1)$ and $(V, \mathcal{M}_2, \mu_2)$ be two complete measure spaces and $f$ an integrable function on $U \times V$. Then

(1) $f_u(v) = f(u,v)$ is measurable and integrable for almost all $u$.

(2) $f_v(u) = f(u,v)$ is measurable and integrable for almost all $v$.

(3) $h_1(u) = \int_V f(u,v)\,d\mu_2(v)$ is integrable on $U$.

(4) $h_2(v) = \int_U f(u,v)\,d\mu_1(u)$ is integrable on $V$.

(5)
\[
\int_U\left[\int_V f\,d\mu_2(v)\right]d\mu_1(u) = \int_V\left[\int_U f\,d\mu_1(u)\right]d\mu_2(v) = \int_{U\times V} f\,d(\mu_1\times\mu_2). \tag{4.40}
\]

Proof. It suffices to consider the case where $f$ is non-negative (we can then apply the result to $f^+ = \max(f, 0)$ and $f^- = \max(-f, 0)$). The last proposition asserts that the theorem is true if $f$ is a simple function which vanishes outside a set of finite measure. Similarly as in Theorem 4.2.2, we may obtain a sequence of non-negative simple functions $\{\varphi_n\}$ such that

\[
\varphi_n \uparrow f.
\]

Observe that, given $u \in U$, $f_u$ is such that

\[
(\varphi_n)_u \uparrow f_u, \quad \text{a.e.}
\]

By the Lebesgue monotone convergence theorem we get

\[
\int_V f(u,v)\,d\mu_2(v) = \lim_{n\to\infty}\int_V \varphi_n(u,v)\,d\mu_2(v),
\]


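so that this last resulting function is integrable in $U$. Again by the Lebesgue monotone convergence theorem, we obtain

\[
\int_U\left[\int_V f\,d\mu_2(v)\right]d\mu_1(u) = \lim_{n\to\infty}\int_U\left[\int_V \varphi_n\,d\mu_2(v)\right]d\mu_1(u)
= \lim_{n\to\infty}\int_{U\times V}\varphi_n\,d(\mu_1\times\mu_2)
= \int_{U\times V} f\,d(\mu_1\times\mu_2). \tag{4.41}
\]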
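As a small sanity check of the identity (4.40), the following sketch evaluates both iterated integrals and the product integral on a finite measure space, where every integral is a weighted sum. The names `iterated_sums`, `mu1`, `mu2` and the particular weights and integrand are illustrative assumptions, not part of the text; exact rational arithmetic is used so the three values can be compared without rounding.

```python
from fractions import Fraction

def iterated_sums(f, mu1, mu2):
    """Return the two iterated integrals and the product integral of f
    for finite weighted measures mu1 on U and mu2 on V (given as dicts)."""
    U, V = list(mu1), list(mu2)
    # Integrate over V first, then over U.
    u_then_v = sum(mu1[u] * sum(f(u, v) * mu2[v] for v in V) for u in U)
    # Integrate over U first, then over V.
    v_then_u = sum(mu2[v] * sum(f(u, v) * mu1[u] for u in U) for v in V)
    # Integrate against the product measure on U x V.
    product = sum(f(u, v) * mu1[u] * mu2[v] for u in U for v in V)
    return u_then_v, v_then_u, product

# Hypothetical finite measure spaces and a sign-changing integrand.
mu1 = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
mu2 = {0: Fraction(2), 1: Fraction(5)}
f = lambda u, v: Fraction(u * u - 3 * v + 1)

a, b, c = iterated_sums(f, mu1, mu2)
print(a, b, c)
```

In the finite case the equality is just a reordering of a finite sum; the content of the theorem is that the same reordering remains valid for integrable functions on general complete measure spaces.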

4.6. The Lebesgue Measure in Rn

In this section we will define the Lebesgue measure and the concept of Lebesgue measurable set. We show that the collection of Lebesgue measurable sets is a σ-algebra, so that the earlier results, proven for more general measure spaces, remain valid in the present context (such as the Lebesgue monotone and dominated convergence theorems).

We start with the following theorems without proofs.

Theorem 4.6.1. Every open set $A \subset \mathbb{R}$ may be expressed as a countable union of disjoint open intervals.

Remark 4.6.2. In this text $Q_j$ denotes a closed cube in $\mathbb{R}^n$ and $|Q_j|$ its volume, that is, $|Q_j| = \prod_{i=1}^{n}(b_i - a_i)$, where $Q_j = \prod_{i=1}^{n}[a_i, b_i]$. Also we assume that if two cubes $Q_1$ and $Q_2$, closed or not, have the same interior, then $|Q_1| = |Q_2| = |\overline{Q}_1|$. We recall that two cubes $Q_1, Q_2 \subset \mathbb{R}^n$ are said to be quasi-disjoint if their interiors are disjoint.

Theorem 4.6.3. Every open set $A \subset \mathbb{R}^n$, where $n \ge 1$, may be expressed as a countable union of quasi-disjoint closed cubes.

Definition 4.6.4 (Outer Measure). Let $E \subset \mathbb{R}^n$. The outer measure of $E$, denoted by $m^*(E)$, is defined by

\[
m^*(E) = \inf\left\{\sum_{j=1}^{\infty}|Q_j| \;:\; E \subset \cup_{j=1}^{\infty} Q_j\right\},
\]

where $Q_j$ is a closed cube, $\forall j \in \mathbb{N}$.


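4.6.1. Properties of the Outer Measure. First observe that given $\varepsilon > 0$, there exists a sequence $\{Q_j\}$ of closed cubes such that

\[
E \subset \cup_{j=1}^{\infty} Q_j
\]

and

\[
\sum_{j=1}^{\infty}|Q_j| \le m^*(E) + \varepsilon.
\]

(1) Monotonicity: If $E_1 \subset E_2$ then $m^*(E_1) \le m^*(E_2)$. This follows from the fact that if $E_2 \subset \cup_{j=1}^{\infty} Q_j$ then $E_1 \subset \cup_{j=1}^{\infty} Q_j$.

(2) Countable sub-additivity: If $E \subset \cup_{j=1}^{\infty} E_j$, then $m^*(E) \le \sum_{j=1}^{\infty} m^*(E_j)$.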

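Proof. First assume that $m^*(E_j) < \infty$, $\forall j \in \mathbb{N}$, otherwise the result is obvious. Thus, given $\varepsilon > 0$, for each $j \in \mathbb{N}$ there exists a sequence $\{Q_{k,j}\}_{k\in\mathbb{N}}$ of closed cubes such that

\[
E_j \subset \cup_{k=1}^{\infty} Q_{k,j}
\]

and

\[
\sum_{k=1}^{\infty}|Q_{k,j}| < m^*(E_j) + \frac{\varepsilon}{2^j}.
\]

Hence

\[
E \subset \cup_{j,k=1}^{\infty} Q_{k,j}
\]

and therefore

\[
m^*(E) \le \sum_{j,k=1}^{\infty}|Q_{k,j}| = \sum_{j=1}^{\infty}\left(\sum_{k=1}^{\infty}|Q_{k,j}|\right)
\le \sum_{j=1}^{\infty}\left(m^*(E_j) + \frac{\varepsilon}{2^j}\right)
= \sum_{j=1}^{\infty} m^*(E_j) + \varepsilon. \tag{4.42}
\]

Since $\varepsilon > 0$ is arbitrary, we obtain

\[
m^*(E) \le \sum_{j=1}^{\infty} m^*(E_j).
\]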


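(3) If $E \subset \mathbb{R}^n$ and

\[
\alpha = \inf\{m^*(A) \mid A \text{ is open and } E \subset A\},
\]

then

\[
m^*(E) = \alpha.
\]

Proof. From the monotonicity, we have

\[
m^*(E) \le m^*(A), \quad \forall A \supset E,\ A \text{ open}.
\]

Thus $m^*(E) \le \alpha$.

Suppose given $\varepsilon > 0$. Choose a sequence $\{Q_j\}$ of closed cubes such that

\[
E \subset \cup_{j=1}^{\infty} Q_j
\]

and

\[
\sum_{j=1}^{\infty}|Q_j| \le m^*(E) + \varepsilon.
\]

Let $\{\tilde{Q}_j\}$ be a sequence of open cubes such that $\tilde{Q}_j \supset Q_j$ and

\[
|\tilde{Q}_j| \le |Q_j| + \frac{\varepsilon}{2^j}, \quad \forall j \in \mathbb{N}.
\]

Define $A = \cup_{j=1}^{\infty}\tilde{Q}_j$; hence $A$ is open, $A \supset E$ and

\[
m^*(A) \le \sum_{j=1}^{\infty}|\tilde{Q}_j| \le \sum_{j=1}^{\infty}\left(|Q_j| + \frac{\varepsilon}{2^j}\right) = \sum_{j=1}^{\infty}|Q_j| + \varepsilon \le m^*(E) + 2\varepsilon. \tag{4.43}
\]

Therefore $\alpha \le m^*(E) + 2\varepsilon$.

Since $\varepsilon > 0$ is arbitrary, we have

\[
\alpha \le m^*(E).
\]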



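(4) If $E = E_1 \cup E_2$ and $d(E_1, E_2) > 0$, then

\[
m^*(E) = m^*(E_1) + m^*(E_2).
\]

Proof. First observe that, since $E = E_1 \cup E_2$, sub-additivity gives

\[
m^*(E) \le m^*(E_1) + m^*(E_2).
\]

Let $\varepsilon > 0$. Choose a sequence $\{Q_j\}$ of closed cubes such that

\[
E \subset \cup_{j=1}^{\infty} Q_j
\]

and

\[
\sum_{j=1}^{\infty}|Q_j| \le m^*(E) + \varepsilon.
\]

Let $\delta > 0$ be such that

\[
d(E_1, E_2) > \delta > 0.
\]

Dividing the cubes $Q_j$ if necessary, we may assume that the diameter of each cube $Q_j$ is smaller than $\delta$. Thus each $Q_j$ intersects at most one of the sets $E_1$ and $E_2$. Denote by $J_1$ and $J_2$ the sets of indices $j$ such that $Q_j$ intersects $E_1$ and $E_2$, respectively. Thus,

\[
E_1 \subset \cup_{j\in J_1} Q_j \quad \text{and} \quad E_2 \subset \cup_{j\in J_2} Q_j.
\]

Hence,

\[
m^*(E_1) + m^*(E_2) \le \sum_{j\in J_1}|Q_j| + \sum_{j\in J_2}|Q_j| \le \sum_{j=1}^{\infty}|Q_j| \le m^*(E) + \varepsilon. \tag{4.44}
\]

Since $\varepsilon > 0$ is arbitrary,

\[
m^*(E_1) + m^*(E_2) \le m^*(E).
\]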


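(5) If a set $E$ is a countable union of quasi-disjoint cubes, that is,

\[
E = \cup_{j=1}^{\infty} Q_j,
\]

then

\[
m^*(E) = \sum_{j=1}^{\infty}|Q_j|.
\]

Proof. The inequality $m^*(E) \le \sum_{j=1}^{\infty}|Q_j|$ follows from the definition of the outer measure. For the reverse inequality, let $\varepsilon > 0$. Let $\{\tilde{Q}_j\}$ be cubes such that $\tilde{Q}_j \subset\subset Q_j$ (that is, the closure of $\tilde{Q}_j$ is contained in the interior of $Q_j$) and

\[
|Q_j| \le |\tilde{Q}_j| + \frac{\varepsilon}{2^j}.
\]

Thus, for each $N \in \mathbb{N}$ the cubes $\tilde{Q}_1, \ldots, \tilde{Q}_N$ are disjoint and each pair is at a positive distance. Hence, from item 4,

\[
m^*(\cup_{j=1}^{N}\tilde{Q}_j) = \sum_{j=1}^{N}|\tilde{Q}_j| \ge \sum_{j=1}^{N}\left(|Q_j| - \frac{\varepsilon}{2^j}\right).
\]

Since

\[
\cup_{j=1}^{N}\tilde{Q}_j \subset E,
\]

we obtain

\[
m^*(E) \ge \sum_{j=1}^{N}|\tilde{Q}_j| \ge \sum_{j=1}^{N}|Q_j| - \varepsilon.
\]

Therefore, letting $N \to \infty$,

\[
\sum_{j=1}^{\infty}|Q_j| \le m^*(E) + \varepsilon.
\]

Since $\varepsilon > 0$ is arbitrary, we may conclude that

\[
\sum_{j=1}^{\infty}|Q_j| \le m^*(E).
\]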
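The $\varepsilon/2^j$ device used in the proofs of items 2, 3 and 5 can be illustrated numerically. The sketch below, for $n = 1$ (where cubes are intervals), enumerates finitely many rationals in $[0,1]$ and covers the $j$-th one by an interval of length $\varepsilon/2^j$; the total length stays below $\varepsilon$ regardless of how many points are covered, which is why a countable set has outer measure zero. The helper names and the bound `max_den` on the denominators are assumptions made only for this illustration.

```python
from fractions import Fraction

def rationals_in_unit_interval(max_den):
    """List the distinct rationals p/q in [0, 1] with 1 <= q <= max_den."""
    seen, ordered = set(), []
    for q in range(1, max_den + 1):
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                ordered.append(r)
    return ordered

def cover_length(points, eps):
    """Total length of a cover whose j-th interval has length eps / 2**j."""
    return sum(eps / 2**j for j in range(1, len(points) + 1))

eps = Fraction(1, 100)
points = rationals_in_unit_interval(12)
total = cover_length(points, eps)
print(len(points), float(total))
```

Increasing `max_den` adds more points but never pushes `total` past `eps`, mirroring the estimate (4.42) with $m^*(E_j) = 0$ for single points.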


4.6.2. The Lebesgue Measure.

Definition 4.6.5. A set $E \subset \mathbb{R}^n$ is said to be Lebesgue measurable if for each $\varepsilon > 0$ there exists $A \subset \mathbb{R}^n$ open such that

\[
E \subset A
\]

and

\[
m^*(A - E) \le \varepsilon.
\]

If $E$ is measurable, we define its Lebesgue measure, denoted by $m(E)$, as

\[
m(E) = m^*(E).
\]


4.6.3. Properties of Measurable Sets.

(1) Each open set is measurable.

(2) If $m^*(E) = 0$ then $E$ is measurable. In particular, if $E \subset A$ and $m^*(A) = 0$ then $E$ is measurable.

Proof. Let $E \subset \mathbb{R}^n$ be such that $m^*(E) = 0$. Suppose given $\varepsilon > 0$; thus there exists $A \subset \mathbb{R}^n$ open such that $E \subset A$ and $m^*(A) < \varepsilon$. Therefore

\[
m^*(A - E) < \varepsilon.
\]

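(3) A countable union of measurable sets is measurable.

Proof. Suppose

\[
E = \cup_{j=1}^{\infty} E_j,
\]

where each $E_j$ is measurable. Suppose given $\varepsilon > 0$. For each $j \in \mathbb{N}$, there exists $A_j \subset \mathbb{R}^n$ open such that

\[
E_j \subset A_j
\]

and

\[
m^*(A_j - E_j) \le \frac{\varepsilon}{2^j}.
\]

Define $A = \cup_{j=1}^{\infty} A_j$. Thus $E \subset A$ and

\[
(A - E) \subset \cup_{j=1}^{\infty}(A_j - E_j).
\]

From the monotonicity and countable sub-additivity of the outer measure we have

\[
m^*(A - E) \le \sum_{j=1}^{\infty} m^*(A_j - E_j) \le \varepsilon.
\]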

(4) Closed sets are measurable.

Proof. Let $F$ be a closed set. Observe that

\[
F = \cup_{k=1}^{\infty}(F \cap B_k),
\]

where $B_k$ denotes the closed ball of radius $k$ with center at the origin. Thus $F$ may be expressed as a countable union of compact sets. Hence, we have only to show that if $F$ is compact then it is measurable. Let $F$ be a compact set. Observe that

\[
m^*(F) < \infty.
\]

Let $\varepsilon > 0$; thus there exists an open $A \subset \mathbb{R}^n$ such that $F \subset A$ and

\[
m^*(A) \le m^*(F) + \varepsilon.
\]

Since $F$ is closed, $A - F$ is open, and therefore $A - F$ may be expressed as a countable union of quasi-disjoint closed cubes. Hence

\[
A - F = \cup_{j=1}^{\infty} Q_j.
\]

For each $N \in \mathbb{N}$,

\[
K = \cup_{j=1}^{N} Q_j
\]

is compact and disjoint from $F$, therefore

\[
d(K, F) > 0.
\]

Since $K \cup F \subset A$, we have

\[
m^*(A) \ge m^*(F \cup K) = m^*(F) + m^*(K) = m^*(F) + \sum_{j=1}^{N}|Q_j|.
\]

Therefore

\[
\sum_{j=1}^{N}|Q_j| \le m^*(A) - m^*(F) \le \varepsilon.
\]

Finally, letting $N \to \infty$,

\[
m^*(A - F) \le \sum_{j=1}^{\infty}|Q_j| \le \varepsilon.
\]


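(5) If $E \subset \mathbb{R}^n$ is measurable, then $E^c$ is measurable.

Proof. A point $x \in \mathbb{R}^n$ is denoted by $x = (x_1, x_2, \ldots, x_n)$, where $x_i \in \mathbb{R}$ for each $i \in \{1, \ldots, n\}$. Let $E$ be a measurable set. For each $k \in \mathbb{N}$ there exists an open $A_k \supset E$ such that

\[
m^*(A_k - E) < \frac{1}{k}.
\]

Observe that $A_k^c$ is closed and therefore measurable, $\forall k \in \mathbb{N}$. Thus

\[
S = \cup_{k=1}^{\infty} A_k^c
\]

is also measurable. On the other hand,

\[
S \subset E^c,
\]

and if $x \in (E^c - S)$ then $x \in E^c$ and $x \notin S$, so that $x \notin E$ and $x \notin A_k^c$, $\forall k \in \mathbb{N}$. Hence $x \notin E$ and $x \in A_k$, $\forall k \in \mathbb{N}$, and finally $x \in (A_k - E)$, $\forall k \in \mathbb{N}$, that is,

\[
E^c - S \subset A_k - E, \quad \forall k \in \mathbb{N}.
\]

Therefore

\[
m^*(E^c - S) \le \frac{1}{k}, \quad \forall k \in \mathbb{N}.
\]

Thus

\[
m^*(E^c - S) = 0.
\]

This means that $E^c - S$ is measurable, so that

\[
E^c = S \cup (E^c - S)
\]

is measurable. The proof is complete.

(6) A countable intersection of measurable sets is measurable.

Proof. This follows from items 3 and 5, just observing that

\[
\cap_{j=1}^{\infty} E_j = \left(\cup_{j=1}^{\infty} E_j^c\right)^c.
\]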

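Theorem 4.6.6. If $\{E_j\}$ is a sequence of measurable pairwise disjoint sets and $E = \cup_{j=1}^{\infty} E_j$, then

\[
m(E) = \sum_{j=1}^{\infty} m(E_j).
\]

Proof. First assume that each $E_j$ is bounded. Since $E_j^c$ is measurable, given $\varepsilon > 0$ there exists an open $H_j \supset E_j^c$ such that

\[
m^*(H_j - E_j^c) < \frac{\varepsilon}{2^j}, \quad \forall j \in \mathbb{N}.
\]

Denoting $F_j = H_j^c$, we have that $F_j \subset E_j$ is closed and

\[
m^*(E_j - F_j) < \frac{\varepsilon}{2^j}, \quad \forall j \in \mathbb{N}.
\]

For each $N \in \mathbb{N}$ the sets $F_1, \ldots, F_N$ are compact and disjoint, so that

\[
m(\cup_{j=1}^{N} F_j) = \sum_{j=1}^{N} m(F_j).
\]

As

\[
\cup_{j=1}^{N} F_j \subset E,
\]

we have

\[
m(E) \ge \sum_{j=1}^{N} m(F_j) \ge \sum_{j=1}^{N} m(E_j) - \varepsilon.
\]

Hence

\[
m(E) \ge \sum_{j=1}^{\infty} m(E_j) - \varepsilon.
\]

Since $\varepsilon > 0$ is arbitrary, we obtain

\[
m(E) \ge \sum_{j=1}^{\infty} m(E_j).
\]

As the reverse inequality is always valid, we have

\[
m(E) = \sum_{j=1}^{\infty} m(E_j).
\]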

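For the general case, select a sequence of cubes $\{Q_k\}$ such that

\[
\mathbb{R}^n = \cup_{k=1}^{\infty} Q_k
\]

and $Q_k \subset Q_{k+1}$, $\forall k \in \mathbb{N}$. Define $S_1 = Q_1$ and $S_k = Q_k - Q_{k-1}$, $\forall k \ge 2$. Also define

\[
E_{j,k} = E_j \cap S_k, \quad \forall j, k \in \mathbb{N}.
\]

Thus

\[
E = \cup_{j=1}^{\infty}\left(\cup_{k=1}^{\infty} E_{j,k}\right) = \cup_{j,k=1}^{\infty} E_{j,k},
\]

where such a union is disjoint and each $E_{j,k}$ is bounded. Through the last result, we get

\[
m(E) = \sum_{j,k=1}^{\infty} m(E_{j,k}) = \sum_{j=1}^{\infty}\sum_{k=1}^{\infty} m(E_{j,k}) = \sum_{j=1}^{\infty} m(E_j). \tag{4.45}
\]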


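Theorem 4.6.7. Suppose $E \subset \mathbb{R}^n$ is a measurable set. Then for each $\varepsilon > 0$:

(1) There exists an open set $A \subset \mathbb{R}^n$ such that $E \subset A$ and

\[
m(A - E) < \varepsilon.
\]

(2) There exists a closed set $F \subset \mathbb{R}^n$ such that $F \subset E$ and

\[
m(E - F) < \varepsilon.
\]

(3) If $m(E)$ is finite, there exists a compact set $K \subset E$ such that

\[
m(E - K) < \varepsilon.
\]

(4) If $m(E)$ is finite, there exists a finite union of closed cubes

\[
F = \cup_{j=1}^{N} Q_j
\]

such that

\[
m(E \,\triangle\, F) \le \varepsilon,
\]

where

\[
E \,\triangle\, F = (E - F) \cup (F - E).
\]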

Proof. (1) This item follows from the definition of measurable set.

(2) Since $E^c$ is measurable, there exists an open $B \subset \mathbb{R}^n$ such that $E^c \subset B$ and

\[
m^*(B - E^c) < \varepsilon.
\]

Defining $F = B^c$, we have that $F$ is closed, $F \subset E$ and $E - F = B - E^c$. Therefore

\[
m(E - F) < \varepsilon.
\]

(3) Choose a closed set $F$ such that $F \subset E$ and

\[
m(E - F) < \frac{\varepsilon}{2}.
\]

Let $B_n$ be the closed ball with center at the origin and radius $n$. Define $K_n = F \cap B_n$ and observe that $K_n$ is compact, $\forall n \in \mathbb{N}$. Thus

\[
E - K_n \searrow E - F.
\]

Since $m(E) < \infty$, we have

\[
m(E - K_n) < \varepsilon
\]

for all $n$ sufficiently large.


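(4) Choose a sequence of closed cubes $\{Q_j\}$ such that

\[
E \subset \cup_{j=1}^{\infty} Q_j
\]

and

\[
\sum_{j=1}^{\infty}|Q_j| \le m(E) + \frac{\varepsilon}{2}.
\]

Since $m(E) < \infty$, the series converges and there exists $N_0 \in \mathbb{N}$ such that

\[
\sum_{j=N_0+1}^{\infty}|Q_j| < \frac{\varepsilon}{2}.
\]

Defining $F = \cup_{j=1}^{N_0} Q_j$, we have

\[
m(E \,\triangle\, F) = m(E - F) + m(F - E)
\le m\left(\cup_{j=N_0+1}^{\infty} Q_j\right) + m\left(\cup_{j=1}^{\infty} Q_j - E\right)
\le \sum_{j=N_0+1}^{\infty}|Q_j| + \sum_{j=1}^{\infty}|Q_j| - m(E)
\le \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \tag{4.46}
\]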

4.7. Lebesgue Measurable Functions

Definition 4.7.1. Let $E \subset \mathbb{R}^n$ be a measurable set. A function $f : E \to [-\infty, +\infty]$ is said to be Lebesgue measurable if for each $a \in \mathbb{R}$, the set

\[
f^{-1}([-\infty, a)) = \{x \in E \mid f(x) < a\}
\]

is measurable.

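Observe that:

(1) If $f^{-1}([-\infty, a))$ is measurable for each $a \in \mathbb{R}$, then

\[
f^{-1}([-\infty, a]) = \cap_{k=1}^{\infty} f^{-1}([-\infty, a + 1/k))
\]

is measurable for each $a \in \mathbb{R}$.

(2) If $f^{-1}([-\infty, a])$ is measurable for each $a \in \mathbb{R}$, then

\[
f^{-1}([-\infty, a)) = \cup_{k=1}^{\infty} f^{-1}([-\infty, a - 1/k])
\]

is also measurable for each $a \in \mathbb{R}$.

(3) Given $a \in \mathbb{R}$, observe that

\[
\begin{aligned}
f^{-1}([-\infty, a)) \text{ is measurable} &\Leftrightarrow E - f^{-1}([-\infty, a)) \text{ is measurable} \\
&\Leftrightarrow f^{-1}([-\infty, +\infty]) - f^{-1}([-\infty, a)) \text{ is measurable} \\
&\Leftrightarrow f^{-1}([-\infty, +\infty] - [-\infty, a)) \text{ is measurable} \\
&\Leftrightarrow f^{-1}([a, +\infty]) \text{ is measurable}.
\end{aligned} \tag{4.47}
\]

(4) From above, we can prove that $f^{-1}([-\infty, a))$ is measurable $\forall a \in \mathbb{R}$ if, and only if, $f^{-1}((a, b))$ is measurable for each $a, b \in \mathbb{R}$ such that $a < b$. Therefore $f$ is measurable if and only if $f^{-1}(O)$ is measurable whenever $O \subset \mathbb{R}$ is open.

(5) Thus $f$ is measurable if $f^{-1}(F)$ is measurable whenever $F \subset \mathbb{R}$ is closed.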

Proposition 4.7.2. If $f$ is continuous in $\mathbb{R}^n$, then $f$ is measurable. If $f$ is measurable and real and $\varphi$ is continuous, then $\varphi \circ f$ is measurable.

Proof. The first implication is obvious. For the second, since $\varphi$ is continuous,

\[
\varphi^{-1}([-\infty, a))
\]

is open, and therefore

\[
(\varphi \circ f)^{-1}([-\infty, a)) = f^{-1}(\varphi^{-1}([-\infty, a)))
\]

is measurable, $\forall a \in \mathbb{R}$.

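Proposition 4.7.3. Suppose $\{f_k\}$ is a sequence of measurable functions. Then

\[
\sup_{k\in\mathbb{N}} f_k(x), \quad \inf_{k\in\mathbb{N}} f_k(x),
\]

and

\[
\limsup_{k\to\infty} f_k(x), \quad \liminf_{k\to\infty} f_k(x)
\]

are measurable.

Proof. We will prove only that $\sup_{n\in\mathbb{N}} f_n(x)$ is measurable. The remaining proofs are analogous. Let

\[
f(x) = \sup_{n\in\mathbb{N}} f_n(x).
\]

Thus

\[
f^{-1}((a, +\infty]) = \cup_{n=1}^{\infty} f_n^{-1}((a, +\infty]).
\]

Since each $f_n$ is measurable, such a set is measurable, $\forall a \in \mathbb{R}$. By analogy,

\[
\inf_{k\in\mathbb{N}} f_k(x)
\]

is measurable, and hence

\[
\limsup_{k\to\infty} f_k(x) = \inf_{k\ge 1}\sup_{j\ge k} f_j(x)
\]

and

\[
\liminf_{k\to\infty} f_k(x) = \sup_{k\ge 1}\inf_{j\ge k} f_j(x)
\]

are measurable.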

Proposition 4.7.4. Let $\{f_k\}$ be a sequence of measurable functions such that

\[
\lim_{k\to\infty} f_k(x) = f(x).
\]

Then $f$ is measurable.

Proof. Just observe that

\[
f(x) = \lim_{k\to\infty} f_k(x) = \limsup_{k\to\infty} f_k(x).
\]

We do not prove the next result. For a proof see [38].

Proposition 4.7.5. If $f$ and $g$ are measurable functions, then

(1) $f^2$ is measurable.

(2) $f + g$ and $f \cdot g$ are measurable if both assume finite values.


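Proposition 4.7.6. Let $E \subset \mathbb{R}^n$ be a measurable set. Suppose $f : E \to \mathbb{R}$ is measurable. Thus if $g : E \to \mathbb{R}$ is such that

\[
g(x) = f(x), \quad \text{a.e. in } E,
\]

then $g$ is measurable.

Proof. Define

\[
A = \{x \in E \mid f(x) \ne g(x)\}
\]

and

\[
B = \{x \in E \mid f(x) = g(x)\}.
\]

$A$ is measurable since $m^*(A) = m(A) = 0$, and therefore $B = E - A$ is also measurable. Let $a \in \mathbb{R}$. Hence

\[
g^{-1}((a, +\infty]) = \left(g^{-1}((a, +\infty]) \cap A\right) \cup \left(g^{-1}((a, +\infty]) \cap B\right).
\]

Observe that

\[
\begin{aligned}
x \in g^{-1}((a, +\infty]) \cap B &\Leftrightarrow x \in B \text{ and } g(x) \in (a, +\infty] \\
&\Leftrightarrow x \in B \text{ and } f(x) \in (a, +\infty] \\
&\Leftrightarrow x \in B \cap f^{-1}((a, +\infty]).
\end{aligned} \tag{4.48}
\]

Thus $g^{-1}((a, +\infty]) \cap B$ is measurable. As $g^{-1}((a, +\infty]) \cap A \subset A$, we have $m^*(g^{-1}((a, +\infty]) \cap A) = 0$, that is, such a set is measurable. Hence $g^{-1}((a, +\infty])$, being the union of two measurable sets, is also measurable. Since $a \in \mathbb{R}$ is arbitrary, $g$ is measurable.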

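Theorem 4.7.7. Suppose $f$ is a non-negative measurable function on $\mathbb{R}^n$. Then there exists an increasing sequence of non-negative simple functions $\{\varphi_k\}$ such that

\[
\lim_{k\to\infty}\varphi_k(x) = f(x), \quad \forall x \in \mathbb{R}^n.
\]

Proof. Let $N \in \mathbb{N}$. Let $Q_N$ be the cube with center at the origin and side of length $N$. Define

\[
F_N(x) =
\begin{cases}
f(x), & \text{if } x \in Q_N \text{ and } f(x) \le N, \\
N, & \text{if } x \in Q_N \text{ and } f(x) > N, \\
0, & \text{otherwise}.
\end{cases}
\]

Thus $F_N(x) \to f(x)$ as $N \to \infty$, $\forall x \in \mathbb{R}^n$. Fixing $M, N \in \mathbb{N}$, define

\[
E_{l,M} = \left\{x \in Q_N : \frac{l}{M} \le F_N(x) < \frac{l+1}{M}\right\},
\]

for $0 \le l \le N \cdot M$. Defining

\[
F_{N,M} = \sum_{l=0}^{NM}\frac{l}{M}\,\chi_{E_{l,M}},
\]

we have that $F_{N,M}$ is a simple function and

\[
0 \le F_N(x) - F_{N,M}(x) \le \frac{1}{M}.
\]

If $\varphi_K(x) = F_{K,K}(x)$ we obtain

\[
0 \le |F_K(x) - \varphi_K(x)| \le \frac{1}{K}.
\]

Hence

\[
|f(x) - \varphi_K(x)| \le |f(x) - F_K(x)| + |F_K(x) - \varphi_K(x)|.
\]

Therefore

\[
\lim_{K\to\infty}|f(x) - \varphi_K(x)| = 0, \quad \forall x \in \mathbb{R}^n.
\]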
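The construction in the proof can be sketched in a few lines for $n = 1$. The code below is only an illustration under stated assumptions: the function $f(x) = x^2$, the sample grid and the helper names `truncate` and `simple_approx` are hypothetical, and the level sets are taken half-open so that $\varphi_K$ is obtained by rounding $F_K$ down to a multiple of $1/K$. Powers of two are used for $K$ so that the floating-point rounding step is exact.

```python
import math

def truncate(f, N):
    """F_N: min(f, N) on the cube Q_N = [-N/2, N/2] (n = 1), zero elsewhere."""
    def F(x):
        return min(f(x), N) if abs(x) <= N / 2 else 0.0
    return F

def simple_approx(f, K):
    """phi_K = F_{K,K}: F_K rounded down to the nearest multiple of 1/K."""
    F = truncate(f, K)
    def phi(x):
        return math.floor(K * F(x)) / K
    return phi

f = lambda x: x * x                      # hypothetical non-negative function
xs = [i / 10 for i in range(-30, 31)]    # sample points in [-3, 3]

# Largest observed error |f - phi_K| on the sample grid, for several K.
errors = {K: max(abs(f(x) - simple_approx(f, K)(x)) for x in xs)
          for K in (8, 16, 64)}
print(errors)
```

For $K = 8$ the truncation at height $8$ is still active near $x = \pm 3$, so the error there is governed by $|f - F_K|$; once $K$ exceeds the values of $f$ on the grid, only the rounding error $\le 1/K$ remains.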


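Theorem 4.7.8. Suppose that $f$ is a measurable function defined on $\mathbb{R}^n$. Then there exists a sequence of simple functions $\{\varphi_k\}$ such that

\[
|\varphi_k(x)| \le |\varphi_{k+1}(x)|, \quad \forall x \in \mathbb{R}^n,\ k \in \mathbb{N},
\]

and

\[
\lim_{k\to\infty}\varphi_k(x) = f(x), \quad \forall x \in \mathbb{R}^n.
\]

Proof. Write

\[
f(x) = f^+(x) - f^-(x),
\]

where

\[
f^+(x) = \max\{f(x), 0\}
\]

and

\[
f^-(x) = \max\{-f(x), 0\}.
\]

Thus $f^+$ and $f^-$ are non-negative measurable functions, so that from the last theorem there exist increasing sequences of non-negative simple functions such that

\[
\varphi_k^{(1)}(x) \to f^+(x), \quad \forall x \in \mathbb{R}^n,
\]

and

\[
\varphi_k^{(2)}(x) \to f^-(x), \quad \forall x \in \mathbb{R}^n,
\]

as $k \to \infty$. Defining

\[
\varphi_k(x) = \varphi_k^{(1)}(x) - \varphi_k^{(2)}(x),
\]

we obtain

\[
\varphi_k(x) \to f(x), \quad \forall x \in \mathbb{R}^n,
\]

as $k \to \infty$, and

\[
|\varphi_k(x)| = \varphi_k^{(1)}(x) + \varphi_k^{(2)}(x) \nearrow |f(x)|, \quad \forall x \in \mathbb{R}^n,
\]

as $k \to \infty$.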

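Theorem 4.7.9. Suppose $f$ is a measurable function in $\mathbb{R}^n$. Then there exists a sequence of step functions $\{\varphi_k\}$ which converges to $f$ a.e. in $\mathbb{R}^n$.

Proof. From the last theorem, it suffices to prove that if $E$ is measurable and $m(E) < \infty$ then $\chi_E$ may be approximated almost everywhere in $E$ by step functions. Suppose given $\varepsilon > 0$. Observe that from Theorem 4.6.7, there exist cubes $Q_1, \ldots, Q_N$ such that

\[
m(E \,\triangle\, \cup_{j=1}^{N} Q_j) < \varepsilon.
\]

We may obtain almost disjoint rectangles $\tilde{R}_j$ such that $\cup_{j=1}^{M}\tilde{R}_j = \cup_{j=1}^{N} Q_j$ and disjoint rectangles $R_j \subset \tilde{R}_j$ such that

\[
m(E \,\triangle\, \cup_{j=1}^{M} R_j) < 2\varepsilon.
\]

Thus, for $f = \chi_E$,

\[
f(x) = \sum_{j=1}^{M}\chi_{R_j}(x),
\]

possibly except on a set of measure $< 2\varepsilon$. Hence, for each $k > 0$, there exists a step function $\varphi_k$ such that $m(E_k) < 2^{-k}$, where

\[
E_k = \{x \in \mathbb{R}^n \mid f(x) \ne \varphi_k(x)\}.
\]

Defining

\[
F_k = \cup_{j=k+1}^{\infty} E_j
\]

and

\[
F = \cap_{k=1}^{\infty} F_k,
\]

we have $m(F) = 0$, considering that

\[
m(F) \le m(F_k) \le 2^{-k}, \quad \forall k \in \mathbb{N}.
\]

Finally, observe that

\[
\varphi_k(x) \to f(x), \quad \forall x \in F^c.
\]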



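Theorem 4.7.10 (Egorov). Suppose that $\{f_k\}$ is a sequence of measurable functions defined in a measurable set $E$ such that $m(E) < \infty$. Assume that $f_k \to f$, a.e. in $E$. Thus, given $\varepsilon > 0$, we may find a closed set $A_\varepsilon \subset E$ such that $f_k \to f$ uniformly in $A_\varepsilon$ and $m(E - A_\varepsilon) < \varepsilon$.

Proof. Without loss of generality we may assume that

\[
f_k(x) \to f(x), \quad \forall x \in E.
\]

For each $N, k \in \mathbb{N}$ define

\[
E_k^N = \{x \in E \mid |f_j(x) - f(x)| < 1/N, \ \forall j \ge k\}.
\]

Fixing $N \in \mathbb{N}$, we may observe that

\[
E_k^N \subset E_{k+1}^N,
\]

and that $\cup_{k=1}^{\infty} E_k^N = E$. Thus, since $m(E) < \infty$, we may obtain $k_N$ such that

\[
m(E - E_{k_N}^N) < \frac{1}{2^N}.
\]

Observe that

\[
|f_j(x) - f(x)| < \frac{1}{N}, \quad \forall j \ge k_N,\ x \in E_{k_N}^N.
\]

Choose $M \in \mathbb{N}$ such that

\[
\sum_{k=M}^{\infty} 2^{-k} \le \frac{\varepsilon}{2}.
\]

Define

\[
\tilde{A}_\varepsilon = \cap_{N\ge M} E_{k_N}^N.
\]

Thus

\[
m(E - \tilde{A}_\varepsilon) \le \sum_{N=M}^{\infty} m(E - E_{k_N}^N) < \frac{\varepsilon}{2}.
\]

Suppose given $\delta > 0$. Let $N \in \mathbb{N}$ be such that $N > M$ and $1/N < \delta$. Thus if $x \in \tilde{A}_\varepsilon$ then $x \in E_{k_N}^N$, so that

\[
|f_j(x) - f(x)| < \delta, \quad \forall j > k_N.
\]

Hence $f_k \to f$ uniformly in $\tilde{A}_\varepsilon$. Observe that $\tilde{A}_\varepsilon$ is measurable and thus there exists a closed set $A_\varepsilon \subset \tilde{A}_\varepsilon$ such that

\[
m(\tilde{A}_\varepsilon - A_\varepsilon) < \frac{\varepsilon}{2}.
\]

That is,

\[
m(E - A_\varepsilon) \le m(E - \tilde{A}_\varepsilon) + m(\tilde{A}_\varepsilon - A_\varepsilon) < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon,
\]

and

\[
f_k \to f
\]

uniformly in $A_\varepsilon$. The proof is complete.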
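A standard example may help fix ideas: on $E = [0,1]$ the functions $f_k(x) = x^k$ converge to $0$ for every $x < 1$, hence a.e., but not uniformly on $[0,1)$. On the closed set $A_\varepsilon = [0, 1-\varepsilon]$, which satisfies $m(E - A_\varepsilon) = \varepsilon$, the convergence is uniform. The sketch below checks both facts numerically; the example, the grid of points near $1$ and the function name are assumptions chosen for illustration and are not taken from the text.

```python
def sup_on_closed_subset(k, eps):
    """sup of x**k over A_eps = [0, 1 - eps]; attained at the right endpoint."""
    return (1.0 - eps) ** k

eps = 0.05
ks = (1, 10, 100, 1000)

# Uniform convergence on A_eps: the sup norm decays geometrically in k.
sups = [sup_on_closed_subset(k, eps) for k in ks]

# No uniform convergence on [0, 1): on points approaching 1 the sup stays near 1.
grid = [1.0 - 10.0 ** (-m) for m in range(1, 13)]
near_one = [max(x ** k for x in grid) for k in ks]

print(sups)
print(near_one)
```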

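Definition 4.7.11. We say that a set $A \subset L^1(\mathbb{R}^n)$ is dense in $L^1(\mathbb{R}^n)$ if for each $f \in L^1(\mathbb{R}^n)$ and each $\varepsilon > 0$ there exists $g \in A$ such that

\[
\|f - g\|_{L^1(\mathbb{R}^n)} = \int_{\mathbb{R}^n}|f - g|\,dx < \varepsilon.
\]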

Theorem 4.7.12. About dense sets in $L^1(\mathbb{R}^n)$ we have:

(1) The set of simple functions is dense in $L^1(\mathbb{R}^n)$.

(2) The set of step functions is dense in $L^1(\mathbb{R}^n)$.

(3) The set of continuous functions with compact support is dense in $L^1(\mathbb{R}^n)$.

Proof. (1) From the last theorems, given $f \in L^1(\mathbb{R}^n)$ there exists a sequence of simple functions $\{\varphi_k\}$ such that

\[
\varphi_k(x) \to f(x) \quad \text{a.e. in } \mathbb{R}^n.
\]

Since $\{\varphi_k\}$ may also be chosen such that

\[
|\varphi_k| \le |f|, \quad \forall k \in \mathbb{N},
\]

from the Lebesgue dominated convergence theorem we have

\[
\|\varphi_k - f\|_{L^1(\mathbb{R}^n)} \to 0,
\]

as $k \to \infty$.

(2) From the last item, it suffices to show that simple functions may be approximated by step functions. As a simple function is a linear combination of characteristic functions of sets of finite measure, it suffices to prove that given $\varepsilon > 0$ and a set $E$ of finite measure, there exists a step function $\varphi$ such that

\[
\|\chi_E - \varphi\|_{L^1(\mathbb{R}^n)} < \varepsilon.
\]

This may be done similarly as in Theorem 4.7.9.

(3) From the last item, it suffices to establish the result when $f$ is the characteristic function of a rectangle in $\mathbb{R}^n$. First consider the case of an interval $[a, b]$. We may approximate $f = \chi_{[a,b]}$ by $g(x)$, where $g$ is continuous, linear on $(a - \varepsilon, a)$ and $(b, b + \varepsilon)$, and

\[
g(x) =
\begin{cases}
1, & \text{if } a \le x \le b, \\
0, & \text{if } x \le a - \varepsilon \text{ or } x \ge b + \varepsilon.
\end{cases}
\]

Thus

\[
\|f - g\|_{L^1(\mathbb{R})} < 2\varepsilon.
\]

For the general case of a rectangle in $\mathbb{R}^n$, we just recall that in this case $f$ is the product of the characteristic functions of $n$ intervals. Therefore we may approximate $f$ by the product of $n$ functions similar to $g$ defined above.
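The one-dimensional step in item 3 can be checked directly. In the sketch below, `g` is the piecewise-linear function described above and `l1_distance` approximates $\int |\chi_{[a,b]} - g|\,dx$ by a midpoint rule; the two linear ramps are triangles of area $\varepsilon/2$ each, so the value should be close to $\varepsilon$, comfortably below the bound $2\varepsilon$. The endpoints $a = 0$, $b = 1$, the value of $\varepsilon$ and the number of quadrature cells are assumptions made for the illustration.

```python
def g(x, a, b, eps):
    """Continuous approximation of chi_[a,b]: 1 on [a, b], 0 outside
    (a - eps, b + eps), and linear on the two ramps in between."""
    if a <= x <= b:
        return 1.0
    if x <= a - eps or x >= b + eps:
        return 0.0
    if x < a:
        return (x - (a - eps)) / eps
    return ((b + eps) - x) / eps

def l1_distance(a, b, eps, n=20000):
    """Midpoint-rule approximation of the L^1 distance between chi_[a,b] and g."""
    lo, hi = a - 2 * eps, b + 2 * eps   # both functions vanish outside this range
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        chi = 1.0 if a <= x <= b else 0.0
        total += abs(chi - g(x, a, b, eps)) * h
    return total

a, b, eps = 0.0, 1.0, 0.1
d = l1_distance(a, b, eps)
print(d)
```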

CHAPTER 5

Distributions

5.1. Basic Definitions and Results

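Definition 5.1.1 (Test Functions, the Space $\mathcal{D}(\Omega)$). Let $\Omega \subset \mathbb{R}^n$ be a nonempty open set. For each $K \subset \Omega$ compact, consider the space $\mathcal{D}_K$, the set of all $C^\infty(\Omega)$ functions with support in $K$. We define the space of test functions, denoted by $\mathcal{D}(\Omega)$, as

\[
\mathcal{D}(\Omega) = \cup_{K\subset\Omega}\,\mathcal{D}_K, \quad K \text{ compact}. \tag{5.1}
\]

Thus $\varphi \in \mathcal{D}(\Omega)$ if and only if $\varphi \in C^\infty(\Omega)$ and the support of $\varphi$ is a compact subset of $\Omega$.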

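Definition 5.1.2 (Topology for $\mathcal{D}(\Omega)$). Let $\Omega \subset \mathbb{R}^n$ be an open set.

(1) For every $K \subset \Omega$ compact, $\sigma_K$ denotes the topology for which a local base is defined by $\{V_{N,k}\}$, where $N, k \in \mathbb{N}$,

\[
V_{N,k} = \{\varphi \in \mathcal{D}_K \mid \|\varphi\|_N < 1/k\} \tag{5.2}
\]

and

\[
\|\varphi\|_N = \max\{|D^\alpha\varphi(x)| \mid x \in \Omega,\ |\alpha| \le N\}. \tag{5.3}
\]

(2) σ denotes the collection of all convex balanced sets $W \subset \mathcal{D}(\Omega)$ such that $W \cap \mathcal{D}_K \in \sigma_K$ for every compact $K \subset \Omega$.

(3) We define σ in $\mathcal{D}(\Omega)$ as the collection of all unions of sets of the form $\varphi + W$, for $\varphi \in \mathcal{D}(\Omega)$ and $W \in \sigma$.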

Theorem 5.1.3. Concerning the last definition we have the following.

(1) σ is a topology in $\mathcal{D}(\Omega)$.

(2) Through σ, $\mathcal{D}(\Omega)$ is made into a locally convex topological vector space.

Proof. (1) From item 3 of Definition 5.1.2, it is clear that arbitrary unions of elements of σ are elements of σ. Let


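us now show that finite intersections of elements of σ also belong to σ. Suppose $V_1 \in \sigma$ and $V_2 \in \sigma$; if $V_1 \cap V_2 = \emptyset$ we are done. Thus, suppose $\varphi \in V_1 \cap V_2$. By the definition of σ there exist two sets of indices $L_1$ and $L_2$ such that

\[
V_i = \cup_{\lambda\in L_i}(\varphi_{i\lambda} + W_{i\lambda}), \quad \text{for } i = 1, 2, \tag{5.4}
\]

and as $\varphi \in V_1 \cap V_2$ there exist $\varphi_i \in \mathcal{D}(\Omega)$ and $W_i \in \sigma$ such that

\[
\varphi \in \varphi_i + W_i, \quad \text{for } i = 1, 2. \tag{5.5}
\]

Thus there exists a compact $K \subset \Omega$ such that $\varphi_i \in \mathcal{D}_K$ for $i \in \{1, 2\}$. Since $\mathcal{D}_K \cap W_i \in \sigma_K$, $\mathcal{D}_K \cap W_i$ is open in $\mathcal{D}_K$, so that from (5.5) there exists $0 < \delta_i < 1$ such that

\[
\varphi - \varphi_i \in (1 - \delta_i)W_i, \quad \text{for } i \in \{1, 2\}. \tag{5.6}
\]

From (5.6) and from the convexity of $W_i$ we have

\[
\varphi - \varphi_i + \delta_i W_i \subset (1 - \delta_i)W_i + \delta_i W_i = W_i, \tag{5.7}
\]

so that

\[
\varphi + \delta_i W_i \subset \varphi_i + W_i \subset V_i, \quad \text{for } i \in \{1, 2\}. \tag{5.8}
\]

Define $W_\varphi = (\delta_1 W_1) \cap (\delta_2 W_2)$, so that

\[
\varphi + W_\varphi \subset V_i, \tag{5.9}
\]

and therefore we may write

\[
V_1 \cap V_2 = \cup_{\varphi\in V_1\cap V_2}(\varphi + W_\varphi) \in \sigma. \tag{5.10}
\]

This completes the proof.

(2) It suffices to show that single points are closed sets in $\mathcal{D}(\Omega)$ and the vector space operations are continuous.

(a) Pick $\varphi_1, \varphi_2 \in \mathcal{D}(\Omega)$ such that $\varphi_1 \ne \varphi_2$ and define

\[
V = \{\varphi \in \mathcal{D}(\Omega) \mid \|\varphi\|_0 < \|\varphi_1 - \varphi_2\|_0\}. \tag{5.11}
\]

Thus $V \in \sigma$ and $\varphi_1 \notin \varphi_2 + V$. As $\varphi_2 + V$ is open, is contained in $\mathcal{D}(\Omega) - \{\varphi_1\}$, and $\varphi_2 \ne \varphi_1$ is arbitrary, it follows that $\mathcal{D}(\Omega) - \{\varphi_1\}$ is open, so that $\{\varphi_1\}$ is closed.

(b) The proof that addition is σ-continuous follows from the convexity of any element of σ. Thus given $\varphi_1, \varphi_2 \in \mathcal{D}(\Omega)$ and $V \in \sigma$ we have

\[
\left(\varphi_1 + \frac{1}{2}V\right) + \left(\varphi_2 + \frac{1}{2}V\right) = \varphi_1 + \varphi_2 + V. \tag{5.12}
\]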


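(c) To prove the continuity of scalar multiplication, first consider $\varphi_0 \in \mathcal{D}(\Omega)$ and $\alpha_0 \in \mathbb{R}$. Then:

\[
\alpha\varphi - \alpha_0\varphi_0 = \alpha(\varphi - \varphi_0) + (\alpha - \alpha_0)\varphi_0. \tag{5.13}
\]

For $V \in \sigma$ there exists $\delta > 0$ such that $\delta\varphi_0 \in \frac{1}{2}V$. Let us define $c = \frac{1}{2(|\alpha_0| + \delta)}$. Thus if $|\alpha - \alpha_0| < \delta$ then $(\alpha - \alpha_0)\varphi_0 \in \frac{1}{2}V$, since $V$ is balanced. Let $\varphi \in \mathcal{D}(\Omega)$ be such that

\[
\varphi - \varphi_0 \in cV = \frac{1}{2(|\alpha_0| + \delta)}V, \tag{5.14}
\]

so that

\[
(|\alpha_0| + \delta)(\varphi - \varphi_0) \in \frac{1}{2}V. \tag{5.15}
\]

Since $|\alpha| < |\alpha_0| + \delta$ and $V$ is balanced, this means

\[
\alpha(\varphi - \varphi_0) + (\alpha - \alpha_0)\varphi_0 \in \frac{1}{2}V + \frac{1}{2}V = V. \tag{5.16}
\]

Therefore $\alpha\varphi - \alpha_0\varphi_0 \in V$ whenever $|\alpha - \alpha_0| < \delta$ and $\varphi - \varphi_0 \in cV$.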

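For the next result the proof may be found in Rudin [35].

Proposition 5.1.4. A convex balanced set $V \subset \mathcal{D}(\Omega)$ is open if and only if $V \in \sigma$.

Proposition 5.1.5. The topology $\sigma_K$ of $\mathcal{D}_K \subset \mathcal{D}(\Omega)$ coincides with the topology that $\mathcal{D}_K$ inherits from $\mathcal{D}(\Omega)$.

Proof. From Proposition 5.1.4 we have

\[
V \in \sigma \text{ implies } \mathcal{D}_K \cap V \in \sigma_K. \tag{5.17}
\]

Now suppose $V \in \sigma_K$; we must show that there exists $A \in \sigma$ such that $V = A \cap \mathcal{D}_K$. The definition of $\sigma_K$ implies that for every $\varphi \in V$ there exist $N \in \mathbb{N}$ and $\delta_\varphi > 0$ such that

\[
\{\phi \in \mathcal{D}_K \mid \|\phi - \varphi\|_N < \delta_\varphi\} \subset V. \tag{5.18}
\]

Define

\[
U_\varphi = \{\phi \in \mathcal{D}(\Omega) \mid \|\phi\|_N < \delta_\varphi\}. \tag{5.19}
\]

Then $U_\varphi \in \sigma$ and

\[
\mathcal{D}_K \cap (\varphi + U_\varphi) = \varphi + (\mathcal{D}_K \cap U_\varphi) \subset V. \tag{5.20}
\]

Defining $A = \cup_{\varphi\in V}(\varphi + U_\varphi)$, we have completed the proof.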

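The proof for the next two results may also be found in Rudin [35].

Proposition 5.1.6. If $A$ is a bounded set of $\mathcal{D}(\Omega)$ then $A \subset \mathcal{D}_K$ for some compact $K \subset \Omega$, and there are $M_N < \infty$ such that $\|\varphi\|_N \le M_N$, $\forall \varphi \in A$, $N \in \mathbb{N}$.

Proposition 5.1.7. If $\{\varphi_n\}$ is a Cauchy sequence in $\mathcal{D}(\Omega)$, then $\{\varphi_n\} \subset \mathcal{D}_K$ for some $K \subset \Omega$ compact, and

\[
\lim_{i,j\to\infty}\|\varphi_i - \varphi_j\|_N = 0, \quad \forall N \in \mathbb{N}. \tag{5.21}
\]

Proposition 5.1.8. If $\varphi_n \to 0$ in $\mathcal{D}(\Omega)$, then there exists a compact $K \subset \Omega$ which contains the support of $\varphi_n$, $\forall n \in \mathbb{N}$, and $D^\alpha\varphi_n \to 0$ uniformly, for each multi-index $\alpha$.

The proof follows directly from the last proposition.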

Theorem 5.1.9. Suppose $T : \mathcal{D}(\Omega) \to V$ is linear, where $V$ is a locally convex space. Then the following statements are equivalent.

(1) $T$ is continuous.

(2) $T$ is bounded.

(3) If $\varphi_n \to \theta$ in $\mathcal{D}(\Omega)$ then $T(\varphi_n) \to \theta$ as $n \to \infty$.

(4) The restrictions of $T$ to each $\mathcal{D}_K$ are continuous.

Proof.

• 1 ⇒ 2. This follows from Proposition 1.9.3.

• 2 ⇒ 3. Suppose $T$ is bounded and $\varphi_n \to 0$ in $\mathcal{D}(\Omega)$. By the last proposition, $\varphi_n \to 0$ in some $\mathcal{D}_K$, so that $\{\varphi_n\}$ is bounded and $\{T(\varphi_n)\}$ is also bounded. Hence by Proposition 1.9.3, $T(\varphi_n) \to 0$ in $V$.

• 3 ⇒ 4. Assume 3 holds and consider $\{\varphi_n\} \subset \mathcal{D}_K$. If $\varphi_n \to \theta$ then, by Proposition 5.1.5, $\varphi_n \to \theta$ in $\mathcal{D}(\Omega)$, so that, by the above, $T(\varphi_n) \to \theta$ in $V$. Since $\mathcal{D}_K$ is metrizable, also by Proposition 1.9.3 we have that 4 follows.

• 4 ⇒ 1. Assume 4 holds and let $\mathcal{V}$ be a convex balanced neighborhood of zero in $V$. Define $\mathcal{U} = T^{-1}(\mathcal{V})$. Thus $\mathcal{U}$ is balanced and convex. By Proposition 5.1.5, $\mathcal{U}$ is open in $\mathcal{D}(\Omega)$ if and only if $\mathcal{D}_K \cap \mathcal{U}$ is open in $\mathcal{D}_K$ for each


compact $K \subset \Omega$; thus if the restrictions of $T$ to each $\mathcal{D}_K$ are continuous at $\theta$, then $T$ is continuous at $\theta$, hence 4 implies 1.

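Definition 5.1.10 (Distribution). A linear functional on $\mathcal{D}(\Omega)$ which is continuous with respect to σ is said to be a distribution.

Proposition 5.1.11. Every differential operator is a continuous mapping from $\mathcal{D}(\Omega)$ into $\mathcal{D}(\Omega)$.

Proof. Since $\|D^\alpha\varphi\|_N \le \|\varphi\|_{|\alpha|+N}$, $\forall N \in \mathbb{N}$, $D^\alpha$ is continuous on each $\mathcal{D}_K$, so that by Theorem 5.1.9, $D^\alpha$ is continuous on $\mathcal{D}(\Omega)$.

Theorem 5.1.12. Denoting by $\mathcal{D}'(\Omega)$ the dual space of $\mathcal{D}(\Omega)$, we have that a linear functional $T : \mathcal{D}(\Omega) \to \mathbb{R}$ belongs to $\mathcal{D}'(\Omega)$ if and only if for each compact set $K \subset \Omega$ there exist $N \in \mathbb{N}$ and $c \in \mathbb{R}^+$ such that

\[
|T(\varphi)| \le c\|\varphi\|_N, \quad \forall \varphi \in \mathcal{D}_K. \tag{5.22}
\]

Proof. The proof follows from the equivalence of 1 and 4 in Theorem 5.1.9.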

5.2. Differentiation of Distributions

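Definition 5.2.1 (Derivatives for Distributions). Given $T \in \mathcal{D}'(\Omega)$ and a multi-index $\alpha$, we define the $D^\alpha$ derivative of $T$ as

\[
D^\alpha T(\varphi) = (-1)^{|\alpha|}T(D^\alpha\varphi), \quad \forall \varphi \in \mathcal{D}(\Omega). \tag{5.23}
\]

Remark 5.2.2. Observe that if $|T(\varphi)| \le c\|\varphi\|_N$, $\forall \varphi \in \mathcal{D}(\Omega)$, for some $c \in \mathbb{R}^+$, then

\[
|D^\alpha T(\varphi)| \le c\|D^\alpha\varphi\|_N \le c\|\varphi\|_{N+|\alpha|}, \quad \forall \varphi \in \mathcal{D}(\Omega), \tag{5.24}
\]

thus $D^\alpha T \in \mathcal{D}'(\Omega)$. Therefore, derivatives of distributions are also distributions.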
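Definition (5.23) can be tried out on a classical example that is not discussed in the text: the distribution $T(\varphi) = \int_0^\infty \varphi(x)\,dx$ on $\Omega = \mathbb{R}$, induced by the Heaviside step function. With $|\alpha| = 1$, formula (5.23) gives $DT(\varphi) = -\int_0^\infty \varphi'(x)\,dx = \varphi(0)$, that is, the Dirac delta. The sketch below verifies this numerically for one test function; the bump function, the midpoint quadrature and all names are illustrative assumptions.

```python
import math

def bump(x):
    """A C-infinity test function supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def bump_prime(x):
    """Derivative of bump, computed analytically."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def heaviside_functional(phi, n=20000):
    """T(phi) = integral of phi over (0, 1), by the midpoint rule.
    The upper limit 1 suffices because the test function vanishes beyond it."""
    h = 1.0 / n
    return sum(phi((i + 0.5) * h) for i in range(n)) * h

# (5.23) with |alpha| = 1: DT(phi) = -T(phi').
derivative_value = -heaviside_functional(bump_prime)
print(derivative_value, bump(0.0))
```

The two printed numbers should agree up to quadrature error, reflecting that the distributional derivative of the step is evaluation at the origin.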

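Theorem 5.2.3. Suppose $\{T_n\} \subset \mathcal{D}'(\Omega)$. Let $T : \mathcal{D}(\Omega) \to \mathbb{R}$ be defined by

\[
T(\varphi) = \lim_{n\to\infty} T_n(\varphi), \quad \forall \varphi \in \mathcal{D}(\Omega). \tag{5.25}
\]

Then $T \in \mathcal{D}'(\Omega)$, and

\[
D^\alpha T_n \to D^\alpha T \quad \text{in } \mathcal{D}'(\Omega). \tag{5.26}
\]

Proof. Let $K$ be an arbitrary compact subset of $\Omega$. Since (5.25) holds for every $\varphi \in \mathcal{D}_K$, the principle of uniform boundedness implies that the restriction of $T$ to $\mathcal{D}_K$ is continuous. It follows from Theorem 5.1.9 that $T$ is continuous in $\mathcal{D}(\Omega)$, that is, $T \in \mathcal{D}'(\Omega)$. On the other hand,

\[
(D^\alpha T)(\varphi) = (-1)^{|\alpha|}T(D^\alpha\varphi) = (-1)^{|\alpha|}\lim_{n\to\infty} T_n(D^\alpha\varphi) = \lim_{n\to\infty}(D^\alpha T_n)(\varphi), \quad \forall \varphi \in \mathcal{D}(\Omega). \tag{5.27}
\]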

CHAPTER 6

The Lebesgue and Sobolev Spaces

We start with the definition of Lebesgue spaces, denoted by $L^p(\Omega)$, where $1 \le p \le \infty$ and $\Omega \subset \mathbb{R}^n$ is an open set.

6.1. Definition and Properties of Lp Spaces

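Definition 6.1.1 ($L^p$ Spaces). For $1 \le p < \infty$, we say that $u \in L^p(\Omega)$ if $u : \Omega \to \mathbb{R}$ is measurable and

\[
\int_\Omega |u|^p\,dx < \infty. \tag{6.1}
\]

We also denote $\|u\|_p = \left[\int_\Omega |u|^p\,dx\right]^{1/p}$ and will show that $\|\cdot\|_p$ is a norm.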

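Definition 6.1.2 ($L^\infty$ Spaces). We say that $u \in L^\infty(\Omega)$ if $u$ is measurable and there exists $M \in \mathbb{R}^+$ such that $|u(x)| \le M$, a.e. in $\Omega$. We define

\[
\|u\|_\infty = \inf\{M > 0 \mid |u(x)| \le M, \text{ a.e. in } \Omega\}. \tag{6.2}
\]

We will show that $\|\cdot\|_\infty$ is a norm. For $1 \le p \le \infty$, we define $q$ by the relations

\[
q =
\begin{cases}
+\infty, & \text{if } p = 1, \\
\dfrac{p}{p-1}, & \text{if } 1 < p < +\infty, \\
1, & \text{if } p = +\infty,
\end{cases}
\]

so that symbolically we have

\[
\frac{1}{p} + \frac{1}{q} = 1.
\]

The next result is fundamental in the proof of the Sobolev Imbedding Theorem.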


Theorem 6.1.3 (Hölder Inequality). Consider $u \in L^p(\Omega)$ and $v \in L^q(\Omega)$, with $1 \le p \le \infty$. Then $uv \in L^1(\Omega)$ and

\[
\int_\Omega |uv|\,dx \le \|u\|_p\|v\|_q. \tag{6.3}
\]

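Proof. The result is clear if $p = 1$ or $p = \infty$. We may assume $\|u\|_p, \|v\|_q > 0$, otherwise the result is also obvious. Thus suppose $1 < p < \infty$. From the concavity of the log function on $(0, \infty)$ we obtain, for $a, b > 0$,

\[
\log\left(\frac{1}{p}a^p + \frac{1}{q}b^q\right) \ge \frac{1}{p}\log a^p + \frac{1}{q}\log b^q = \log(ab). \tag{6.4}
\]

Thus,

\[
ab \le \frac{1}{p}(a^p) + \frac{1}{q}(b^q), \quad \forall a \ge 0,\ b \ge 0. \tag{6.5}
\]

Therefore

\[
|u(x)||v(x)| \le \frac{1}{p}|u(x)|^p + \frac{1}{q}|v(x)|^q, \quad \text{a.e. in } \Omega. \tag{6.6}
\]

Hence $|uv| \in L^1(\Omega)$ and

\[
\int_\Omega |uv|\,dx \le \frac{1}{p}\|u\|_p^p + \frac{1}{q}\|v\|_q^q. \tag{6.7}
\]

Replacing $u$ by $\lambda u$ in (6.7), $\lambda > 0$, and dividing by $\lambda$, we obtain

\[
\int_\Omega |uv|\,dx \le \frac{\lambda^{p-1}}{p}\|u\|_p^p + \frac{1}{\lambda q}\|v\|_q^q. \tag{6.8}
\]

For $\lambda = \|u\|_p^{-1}\|v\|_q^{q/p}$ we obtain the Hölder inequality.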
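For a quick numerical sanity check of (6.3), one may take $\Omega$ to be a finite set with counting measure, so that the integrals become finite sums. The sketch below, with hypothetical vectors `u`, `v` and helper names chosen for the illustration, evaluates the gap $\|u\|_p\|v\|_q - \sum|u_iv_i|$ for several exponents, and also checks the equality case $|v| = |u|^{p-1}$, for which $|v|^q$ is proportional to $|u|^p$.

```python
def p_norm(x, p):
    """Discrete p-norm (counting measure), for 1 <= p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def holder_gap(u, v, p):
    """Right-hand side minus left-hand side of Holder's inequality, 1 < p < infinity."""
    q = p / (p - 1.0)                      # conjugate exponent: 1/p + 1/q = 1
    lhs = sum(abs(a * b) for a, b in zip(u, v))
    return p_norm(u, p) * p_norm(v, q) - lhs

u = [1.0, -2.0, 0.5, 3.0]
v = [0.3, 1.5, -4.0, 2.0]
gaps = [holder_gap(u, v, p) for p in (1.5, 2.0, 3.0, 7.0)]

# Equality case: v = |u|^(p-1).
p = 3.0
v_eq = [abs(t) ** (p - 1.0) for t in u]
gap_eq = holder_gap(u, v_eq, p)

print(gaps, gap_eq)
```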

The next step is to prove that ‖ · ‖p is a norm.

Theorem 6.1.4. $L^p(\Omega)$ is a vector space and $\|\cdot\|_p$ is a norm, $\forall p$ such that $1 \le p \le \infty$.

Proof. The only non-trivial property to be proved concerning the norm definition is the triangle inequality. If $p = 1$ or $p = \infty$ the result is clear. Thus, suppose $1 < p < \infty$. For $u, v \in L^p(\Omega)$ we have

\[
|u(x) + v(x)|^p \le (|u(x)| + |v(x)|)^p \le 2^p(|u(x)|^p + |v(x)|^p), \tag{6.9}
\]


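so that $u + v \in L^p(\Omega)$. On the other hand,

\[
\|u + v\|_p^p = \int_\Omega |u + v|^{p-1}|u + v|\,dx \le \int_\Omega |u + v|^{p-1}|u|\,dx + \int_\Omega |u + v|^{p-1}|v|\,dx, \tag{6.10}
\]

and hence, from Hölder's inequality,

\[
\|u + v\|_p^p \le \|u + v\|_p^{p-1}\|u\|_p + \|u + v\|_p^{p-1}\|v\|_p, \tag{6.11}
\]

that is,

\[
\|u + v\|_p \le \|u\|_p + \|v\|_p, \quad \forall u, v \in L^p(\Omega). \tag{6.12}
\]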

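Theorem 6.1.5. $L^p(\Omega)$ is a Banach space for any $p$ such that $1 \le p \le \infty$.

Proof. Suppose $p = \infty$. Suppose $\{u_n\}$ is a Cauchy sequence in $L^\infty(\Omega)$. Thus, given $k \in \mathbb{N}$ there exists $N_k \in \mathbb{N}$ such that, if $m, n \ge N_k$, then

\[
\|u_m - u_n\|_\infty < \frac{1}{k}. \tag{6.13}
\]

Therefore, for each $k$, there exists a set $E_k$ such that $m(E_k) = 0$ and

\[
|u_m(x) - u_n(x)| < \frac{1}{k}, \quad \forall x \in \Omega - E_k,\ \forall m, n \ge N_k. \tag{6.14}
\]

Observe that $E = \cup_{k=1}^{\infty} E_k$ is such that $m(E) = 0$. Thus $\{u_n(x)\}$ is a real Cauchy sequence at each $x \in \Omega - E$. Define $u(x) = \lim_{n\to\infty} u_n(x)$ on $\Omega - E$. Letting $m \to \infty$ in (6.14) we obtain

\[
|u(x) - u_n(x)| \le \frac{1}{k}, \quad \forall x \in \Omega - E,\ \forall n \ge N_k. \tag{6.15}
\]

Thus $u \in L^\infty(\Omega)$ and $\|u_n - u\|_\infty \to 0$ as $n \to \infty$.

Now suppose $1 \le p < \infty$. Let $\{u_n\}$ be a Cauchy sequence in $L^p(\Omega)$. We can extract a subsequence $\{u_{n_k}\}$ such that

\[
\|u_{n_{k+1}} - u_{n_k}\|_p \le \frac{1}{2^k}, \quad \forall k \in \mathbb{N}. \tag{6.16}
\]

To simplify the notation we write $u_k$ in place of $u_{n_k}$, so that

\[
\|u_{k+1} - u_k\|_p \le \frac{1}{2^k}, \quad \forall k \in \mathbb{N}. \tag{6.17}
\]

Defining

\[
g_n(x) = \sum_{k=1}^{n}|u_{k+1}(x) - u_k(x)|, \tag{6.18}
\]

we obtain

\[
\|g_n\|_p \le 1, \quad \forall n \in \mathbb{N}. \tag{6.19}
\]

From the monotone convergence theorem and (6.19), $g_n(x)$ converges a.e. to a limit $g(x)$ with $g \in L^p(\Omega)$. On the other hand, for $m \ge n \ge 2$ we have

\[
|u_m(x) - u_n(x)| \le |u_m(x) - u_{m-1}(x)| + \cdots + |u_{n+1}(x) - u_n(x)| \le g(x) - g_{n-1}(x), \quad \text{a.e. in } \Omega. \tag{6.20}
\]

Hence $\{u_n(x)\}$ is Cauchy a.e. in $\Omega$ and converges to a limit $u(x)$, so that

\[
|u(x) - u_n(x)| \le g(x), \quad \text{a.e. in } \Omega, \text{ for } n \ge 2, \tag{6.21}
\]

which means $u \in L^p(\Omega)$. Finally, from $|u_n(x) - u(x)| \to 0$, a.e. in $\Omega$, $|u_n(x) - u(x)|^p \le |g(x)|^p$ and the Lebesgue dominated convergence theorem we get

\[
\|u_n - u\|_p \to 0 \text{ as } n \to \infty. \tag{6.22}
\]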

Theorem 6.1.6. Let $\{u_n\} \subset L^p(\Omega)$ and $u \in L^p(\Omega)$ be such that $\|u_n - u\|_p \to 0$. Then there exists a subsequence $\{u_{n_k}\}$ such that

(1) $u_{n_k}(x) \to u(x)$, a.e. in $\Omega$,

(2) $|u_{n_k}(x)| \le h(x)$, a.e. in $\Omega$, $\forall k \in \mathbb{N}$, for some $h \in L^p(\Omega)$.

Proof. The result is clear for $p = \infty$. Suppose $1 \le p < \infty$. From the last theorem we can easily obtain that $|u_{n_k}(x) - u(x)| \to 0$ as $k \to \infty$, a.e. in $\Omega$. To complete the proof, just take $h = |u| + g$, where $g$ is defined in the proof of the last theorem.

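Theorem 6.1.7. $L^p(\Omega)$ is reflexive for all $p$ such that $1 < p < \infty$.

Proof. We divide the proof into 3 parts.

(1) For $2 \le p < \infty$ we have that

\[
\left\|\frac{u + v}{2}\right\|_{L^p(\Omega)}^p + \left\|\frac{u - v}{2}\right\|_{L^p(\Omega)}^p \le \frac{1}{2}\left(\|u\|_{L^p(\Omega)}^p + \|v\|_{L^p(\Omega)}^p\right), \quad \forall u, v \in L^p(\Omega). \tag{6.23}
\]

Proof. Observe that

\[
\alpha^p + \beta^p \le (\alpha^2 + \beta^2)^{p/2}, \quad \forall \alpha, \beta \ge 0. \tag{6.24}
\]

Now taking $\alpha = \left|\frac{a+b}{2}\right|$ and $\beta = \left|\frac{a-b}{2}\right|$ in (6.24), we obtain (using the convexity of $t^{p/2}$),

\[
\left|\frac{a+b}{2}\right|^p + \left|\frac{a-b}{2}\right|^p \le \left(\left|\frac{a+b}{2}\right|^2 + \left|\frac{a-b}{2}\right|^2\right)^{p/2} = \left(\frac{a^2}{2} + \frac{b^2}{2}\right)^{p/2} \le \frac{1}{2}|a|^p + \frac{1}{2}|b|^p. \tag{6.25}
\]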

The inequality (6.23) follows immediately.(2) Lp(Ω) is uniformly convex, and therefore reflexive for 2 ≤

p <∞.Proof. Suppose given ε > 0 and suppose that

‖u‖p ≤ 1, ‖v‖p ≤ 1 and ‖u− v‖p > ε. (6.26)

From part 1, we obtain

\[ \left\| \frac{u+v}{2} \right\|_p^p < 1 - \left( \frac{\varepsilon}{2} \right)^p, \tag{6.27} \]

and therefore

\[ \left\| \frac{u+v}{2} \right\|_p < 1 - \delta, \tag{6.28} \]

for $\delta = 1 - (1 - (\varepsilon/2)^p)^{1/p} > 0$. Thus $L^p(\Omega)$ is uniformly convex and from Theorem 2.7.2 it is reflexive.
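The passage from (6.23) to (6.27) can be spelled out as follows. This is only a sketch, using the assumptions (6.26) and nothing else:

```latex
\[
\left\| \frac{u+v}{2} \right\|_p^p
  \le \frac{1}{2}\left( \|u\|_p^p + \|v\|_p^p \right)
      - \left\| \frac{u-v}{2} \right\|_p^p
  \le 1 - \frac{\|u-v\|_p^p}{2^p}
  < 1 - \left( \frac{\varepsilon}{2} \right)^p .
\]
% Since \|u - v\|_p \le \|u\|_p + \|v\|_p \le 2, the hypothesis forces
% \varepsilon < 2, so 0 \le 1 - (\varepsilon/2)^p < 1 and hence
% \delta = 1 - (1 - (\varepsilon/2)^p)^{1/p} lies in (0, 1].
```

Note that $\delta$ depends only on $\varepsilon$ and $p$, not on $u$ or $v$, which is exactly what uniform convexity requires.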

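(3) $L^p(\Omega)$ is reflexive for $1 < p \le 2$. Let $1 < p \le 2$ and let $q$ be the conjugate exponent, $1/p + 1/q = 1$, so that $q \ge 2$. From part 2 we conclude that $L^q(\Omega)$ is reflexive. We define $T : L^p(\Omega) \to (L^q(\Omega))^*$ by

\[ \langle Tu, f \rangle_{L^q(\Omega)} = \int_\Omega u f \, dx, \quad \forall u \in L^p(\Omega),\ f \in L^q(\Omega). \tag{6.29} \]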

From the Holder inequality, we obtain

|〈Tu, f〉Lq(Ω)| ≤ ‖u‖p‖f‖q, (6.30)


so that

‖Tu‖(Lq(Ω))∗ ≤ ‖u‖p. (6.31)

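Pick $u \in L^p(\Omega)$, $u \ne 0$, and define $f_0(x) = |u(x)|^{p-2} u(x)$ ($f_0(x) = 0$ if $u(x) = 0$). Thus, we have that $f_0 \in L^q(\Omega)$, $\|f_0\|_q = \|u\|_p^{p-1}$ and $\langle Tu, f_0 \rangle_{L^q(\Omega)} = \|u\|_p^p$. Therefore,

\[ \|Tu\|_{(L^q(\Omega))^*} \ge \frac{\langle Tu, f_0 \rangle_{L^q(\Omega)}}{\|f_0\|_q} = \|u\|_p. \tag{6.32} \]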

Hence from (6.31) and (6.32) we have

‖Tu‖(Lq(Ω))∗ = ‖u‖p, ∀u ∈ Lp(Ω). (6.33)

Thus $T$ is an isometry from $L^p(\Omega)$ onto a closed subspace of $(L^q(\Omega))^*$. Since from the second part $L^q(\Omega)$ is reflexive, we have that $(L^q(\Omega))^*$ is reflexive. Hence $T(L^p(\Omega))$ and $L^p(\Omega)$ are reflexive.
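The two identities for $f_0$ used in (6.32) follow from the relation $(p-1)q = p$ between conjugate exponents. A brief verification, written out as a sketch:

```latex
\[
|f_0|^q = |u|^{(p-1)q} = |u|^p
\quad\Longrightarrow\quad
\|f_0\|_q = \left( \int_\Omega |u|^p \, dx \right)^{1/q}
          = \|u\|_p^{p/q} = \|u\|_p^{p-1},
\]
\[
\langle Tu, f_0 \rangle_{L^q(\Omega)}
  = \int_\Omega u \, |u|^{p-2} u \, dx
  = \int_\Omega |u|^p \, dx
  = \|u\|_p^p .
\]
```

Dividing the second identity by the first gives $\|u\|_p^{p-(p-1)} = \|u\|_p$, which is the right-hand side of (6.32).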

Theorem 6.1.8 (Riesz Representation Theorem). Let $1 < p < \infty$ and let $f$ be a continuous linear functional on $L^p(\Omega)$. Then there exists a unique $u_0 \in L^q(\Omega)$ such that

\[ f(v) = \int_\Omega v u_0 \, dx, \quad \forall v \in L^p(\Omega). \tag{6.34} \]

Furthermore

‖f‖(Lp)∗ = ‖u0‖q. (6.35)

Proof. First we define the operator $T : L^q(\Omega) \to (L^p(\Omega))^*$ by

\[ \langle Tu, v \rangle_{L^p(\Omega)} = \int_\Omega u v \, dx, \quad \forall v \in L^p(\Omega). \tag{6.36} \]

Similarly to last theorem, we obtain

‖Tu‖(Lp(Ω))∗ = ‖u‖q. (6.37)

We have to show that $T$ is onto. Define $E = T(L^q(\Omega))$. As $E$ is a closed subspace, it suffices to show that $E$ is dense in $(L^p(\Omega))^*$. Suppose $h \in (L^p(\Omega))^{**} = L^p(\Omega)$ is such that

〈Tu, h〉Lp(Ω) = 0, ∀u ∈ Lq(Ω). (6.38)

Choosing $u = |h|^{p-2} h$ we may conclude that $h = 0$, which, by Corollary 2.2.13, completes the first part of the proof. The proof of uniqueness is left to the reader.


Definition 6.1.9. Let $1 \le p \le \infty$. We say that $u \in L^p_{loc}(\Omega)$ if $u\chi_K \in L^p(\Omega)$ for all compact $K \subset \Omega$.

6.1.1. Spaces of Continuous Functions. We introduce some definitions and properties concerning spaces of continuous functions. First, we recall that by a domain we mean an open set in $\mathbb{R}^n$. Thus for a domain $\Omega \subset \mathbb{R}^n$ and for any nonnegative integer $m$ we denote by $C^m(\Omega)$ the set of all functions $u$ whose partial derivatives $D^\alpha u$ are continuous on $\Omega$ for any $\alpha$ such that $|\alpha| \le m$, where if $D^\alpha = D_1^{\alpha_1} D_2^{\alpha_2} \cdots D_n^{\alpha_n}$ we have $|\alpha| = \alpha_1 + \dots + \alpha_n$. We define $C^\infty(\Omega) = \cap_{m=0}^\infty C^m(\Omega)$ and denote $C^0(\Omega) = C(\Omega)$. Given a function $\phi : \Omega \to \mathbb{R}$, its support, denoted by $\mathrm{spt}(\phi)$, is given by

\[ \mathrm{spt}(\phi) = \overline{\{x \in \Omega \mid \phi(x) \ne 0\}}. \]

$C_c^\infty(\Omega)$ denotes the set of functions in $C^\infty(\Omega)$ with compact support contained in $\Omega$.

The sets $C_0(\Omega)$ and $C_0^\infty(\Omega)$ consist of the closures of $C_c(\Omega)$ (which is the set of functions in $C(\Omega)$ with compact support in $\Omega$) and $C_c^\infty(\Omega)$, respectively, with respect to the uniform convergence norm. On the other hand, $C_B^m(\Omega)$ denotes the set of functions $u \in C^m(\Omega)$ for which $D^\alpha u$ is bounded on $\Omega$ for $0 \le |\alpha| \le m$. Observe that $C_B^m(\Omega)$ is a Banach space with the norm denoted by $\|\cdot\|_{B,m}$ given by

\[ \|u\|_{B,m} = \max_{0 \le |\alpha| \le m} \sup_{x \in \Omega} |D^\alpha u(x)|. \]

Also, we define $C^m(\overline{\Omega})$ as the set of functions $u \in C^m(\Omega)$ for which $D^\alpha u$ is bounded and uniformly continuous on $\Omega$ for $0 \le |\alpha| \le m$. Observe that $C^m(\overline{\Omega})$ is a closed subspace of $C_B^m(\Omega)$ and is also a Banach space with the norm inherited from $C_B^m(\Omega)$. Finally we define the spaces of Hölder continuous functions.

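Definition 6.1.10 (Spaces of Hölder Continuous Functions). If $0 < \lambda < 1$, for a nonnegative integer $m$ we define the space of Hölder continuous functions, denoted by $C^{m,\lambda}(\overline{\Omega})$, as the subspace of $C^m(\overline{\Omega})$ consisting of those functions $u$ for which, for $0 \le |\alpha| \le m$, there exists a constant $K$ such that

\[ |D^\alpha u(x) - D^\alpha u(y)| \le K |x - y|^\lambda, \quad \forall x, y \in \Omega. \]

$C^{m,\lambda}(\overline{\Omega})$ is a Banach space with the norm denoted by $\|\cdot\|_{m,\lambda}$ given by

\[ \|u\|_{m,\lambda} = \|u\|_{B,m} + \max_{0 \le |\alpha| \le m} \sup_{x, y \in \Omega,\ x \ne y} \frac{|D^\alpha u(x) - D^\alpha u(y)|}{|x - y|^\lambda}. \]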

Theorem 6.1.11. The space $C_0(\Omega)$ is dense in $L^p(\Omega)$, for $1 \le p < \infty$.

Proof. For the proof we need the following lemma:

Lemma 6.1.12. Let $f \in L^1_{loc}(\Omega)$ be such that

\[ \int_\Omega f u \, dx = 0, \quad \forall u \in C_0(\Omega). \tag{6.39} \]

Then f = 0 a.e. in Ω.

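Suppose $f \in L^1(\Omega)$ and $m(\Omega) < \infty$. Given $\varepsilon > 0$, since $C_0(\Omega)$ is dense in $L^1(\Omega)$, there exists $f_1 \in C_0(\Omega)$ such that $\|f - f_1\|_1 < \varepsilon$ and thus, from (6.39), we obtain

\[ \left| \int_\Omega f_1 u \, dx \right| \le \varepsilon \|u\|_\infty, \quad \forall u \in C_0(\Omega). \tag{6.40} \]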

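Define

\[ K_1 = \{x \in \Omega \mid f_1(x) \ge \varepsilon\}, \tag{6.41} \]

and

\[ K_2 = \{x \in \Omega \mid f_1(x) \le -\varepsilon\}. \tag{6.42} \]

As $K_1$ and $K_2$ are disjoint compact sets, by the Urysohn Theorem there exists $u_0 \in C_0(\Omega)$ such that

\[ u_0(x) = \begin{cases} +1, & \text{if } x \in K_1, \\ -1, & \text{if } x \in K_2, \end{cases} \tag{6.43} \]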

and

|u0(x)| ≤ 1, ∀x ∈ Ω. (6.44)

Also defining $K = K_1 \cup K_2$, we may write

\[ \int_\Omega f_1 u_0 \, dx = \int_{\Omega - K} f_1 u_0 \, dx + \int_K f_1 u_0 \, dx. \tag{6.45} \]


Observe that, from (6.40),

\[ \int_K |f_1| \, dx \le \int_\Omega |f_1 u_0| \, dx \le \varepsilon \tag{6.46} \]

so that

\[ \int_\Omega |f_1| \, dx = \int_K |f_1| \, dx + \int_{\Omega - K} |f_1| \, dx \le \varepsilon + \varepsilon\, m(\Omega). \tag{6.47} \]

Hence

‖f‖1 ≤ ‖f − f1‖1 + ‖f1‖1 ≤ 2ε+ εm(Ω). (6.48)

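Since $\varepsilon > 0$ is arbitrary, we have that $f = 0$ a.e. in $\Omega$. Finally, if $m(\Omega) = \infty$, define

\[ \Omega_n = \{x \in \Omega \mid \mathrm{dist}(x, \Omega^c) > 1/n \text{ and } |x| < n\}. \tag{6.49} \]

It is clear that $\Omega = \cup_{n=1}^\infty \Omega_n$ and from above $f = 0$ a.e. on $\Omega_n$, $\forall n \in \mathbb{N}$, so that $f = 0$ a.e. in $\Omega$.

Finally, to finish the proof of Theorem 6.1.11, suppose $h \in L^q(\Omega)$ is such that

\[ \int_\Omega h u \, dx = 0, \quad \forall u \in C_0(\Omega). \tag{6.50} \]

Observe that $h \in L^1_{loc}(\Omega)$, since for each compact $K \subset \Omega$ we have $\int_K |h| \, dx \le \|h\|_q \, m(K)^{1/p} < \infty$. From the last lemma $h = 0$ a.e. in $\Omega$, which by Corollary 2.2.13 completes the proof.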

Theorem 6.1.13. Lp(Ω) is separable for any 1 ≤ p <∞.

Proof. The result follows from the last theorem and from the fact that $C(K)$ is separable for each compact $K \subset \Omega$ (from the Weierstrass theorem, polynomials with rational coefficients are dense in $C(K)$). Observe that $\Omega = \cup_{n=1}^\infty \Omega_n$, with $\Omega_n$ defined as in (6.49), where $\overline{\Omega}_n$ is compact, $\forall n \in \mathbb{N}$.

6.2. The Sobolev Spaces

Now we define the Sobolev spaces, denoted by $W^{m,p}(\Omega)$.

Definition 6.2.1 (Sobolev Spaces). We say that $u \in W^{m,p}(\Omega)$ if $u \in L^p(\Omega)$ and $D^\alpha u \in L^p(\Omega)$ for all $\alpha$ such that $0 \le |\alpha| \le m$, where the derivatives are understood in the distributional sense.


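Definition 6.2.2. We define the norm $\|\cdot\|_{m,p}$ for $W^{m,p}(\Omega)$, where $m \in \mathbb{N}$ and $1 \le p \le \infty$, as

\[ \|u\|_{m,p} = \left( \sum_{0 \le |\alpha| \le m} \|D^\alpha u\|_p^p \right)^{1/p}, \quad \text{if } 1 \le p < \infty, \tag{6.51} \]

and

\[ \|u\|_{m,\infty} = \max_{0 \le |\alpha| \le m} \|D^\alpha u\|_\infty. \tag{6.52} \]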
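As an illustration of (6.51), the case $m = 1$ can be written out explicitly. The notation $\partial u / \partial x_i$ and $\nabla u$ for the first-order distributional derivatives is an assumption made here for readability:

```latex
\[
\|u\|_{1,p}
  = \left( \|u\|_p^p
      + \sum_{i=1}^{n} \left\| \frac{\partial u}{\partial x_i} \right\|_p^p
    \right)^{1/p},
  \qquad 1 \le p < \infty ,
\]
% and, for p = 2,
\[
\|u\|_{1,2}
  = \left( \int_\Omega \left( |u|^2 + |\nabla u|^2 \right) dx \right)^{1/2} .
\]
```

The multi-indices with $|\alpha| \le 1$ are $\alpha = 0$ and the $n$ unit multi-indices, which is why exactly $n + 1$ terms appear in the sum.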

Theorem 6.2.3. Wm,p(Ω) is a Banach space.

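Proof. Consider $\{u_n\}$ a Cauchy sequence in $W^{m,p}(\Omega)$. Then $\{D^\alpha u_n\}$ is a Cauchy sequence in $L^p(\Omega)$ for each $0 \le |\alpha| \le m$. Since $L^p(\Omega)$ is complete there exist functions $u$ and $u_\alpha$, for $0 \le |\alpha| \le m$, in $L^p(\Omega)$ such that $u_n \to u$ and $D^\alpha u_n \to u_\alpha$ in $L^p(\Omega)$ as $n \to \infty$. From above $L^p(\Omega) \subset L^1_{loc}(\Omega)$ and so $u_n$ determines a distribution $T_{u_n} \in \mathcal{D}'(\Omega)$. For any $\phi \in \mathcal{D}(\Omega)$ we have, by Hölder's inequality,

\[ |T_{u_n}(\phi) - T_u(\phi)| \le \int_\Omega |u_n(x) - u(x)| |\phi(x)| \, dx \le \|\phi\|_q \|u_n - u\|_p. \tag{6.53} \]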

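Hence $T_{u_n}(\phi) \to T_u(\phi)$ for every $\phi \in \mathcal{D}(\Omega)$ as $n \to \infty$. Similarly $T_{D^\alpha u_n}(\phi) \to T_{u_\alpha}(\phi)$ for every $\phi \in \mathcal{D}(\Omega)$. We have that

\[ T_{u_\alpha}(\phi) = \lim_{n\to\infty} T_{D^\alpha u_n}(\phi) = \lim_{n\to\infty} (-1)^{|\alpha|} T_{u_n}(D^\alpha \phi) = (-1)^{|\alpha|} T_u(D^\alpha \phi) = T_{D^\alpha u}(\phi), \tag{6.54} \]

for every $\phi \in \mathcal{D}(\Omega)$. Thus $u_\alpha = D^\alpha u$ in the sense of distributions, for $0 \le |\alpha| \le m$, and $u \in W^{m,p}(\Omega)$. As $\lim_{n\to\infty} \|u - u_n\|_{m,p} = 0$, $W^{m,p}(\Omega)$ is complete.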

Remark 6.2.4. Observe that distributional and classical derivatives coincide when the latter exist and are continuous. We define $S \subset W^{m,p}(\Omega)$ by

\[ S = \{\phi \in C^m(\Omega) \mid \|\phi\|_{m,p} < \infty\}. \tag{6.55} \]

The completion of $S$ with respect to the norm $\|\cdot\|_{m,p}$ is denoted by $H^{m,p}(\Omega)$.

Corollary 6.2.5. Hm,p(Ω) ⊂Wm,p(Ω)


Proof. Since Wm,p(Ω) is complete we have that Hm,p(Ω) ⊂Wm,p(Ω).

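Theorem 6.2.6. $W^{m,p}(\Omega)$ is separable if $1 \le p < \infty$, and is reflexive and uniformly convex if $1 < p < \infty$. In particular, $W^{m,2}(\Omega)$ is a separable Hilbert space with the inner product

\[ (u, v)_m = \sum_{0 \le |\alpha| \le m} \langle D^\alpha u, D^\alpha v \rangle_{L^2(\Omega)}. \tag{6.56} \]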

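Proof. We can see $W^{m,p}(\Omega)$ as a subspace of $L^p(\Omega; \mathbb{R}^N)$, where $N = \sum_{0 \le |\alpha| \le m} 1$. From the relevant properties of $L^p(\Omega)$, we have that $L^p(\Omega; \mathbb{R}^N)$ is reflexive and uniformly convex for $1 < p < \infty$ and separable for $1 \le p < \infty$. Given $u \in W^{m,p}(\Omega)$, we may associate the vector $Pu \in L^p(\Omega; \mathbb{R}^N)$ defined by

\[ Pu = \{D^\alpha u\}_{0 \le |\alpha| \le m}. \tag{6.57} \]

Since $\|Pu\|_{pN} = \|u\|_{m,p}$ and $W^{m,p}(\Omega)$ is complete, $P$ identifies $W^{m,p}(\Omega)$ with a closed subspace of $L^p(\Omega; \mathbb{R}^N)$. Thus from Theorem 1.21 in Adams [1], we have that $W^{m,p}(\Omega)$ is separable if $1 \le p < \infty$, and reflexive and uniformly convex if $1 < p < \infty$.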

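Lemma 6.2.7. Let $1 \le p < \infty$ and define $U = L^p(\Omega; \mathbb{R}^N)$. For every continuous linear functional $f$ on $U$, there exists a unique $v \in L^q(\Omega; \mathbb{R}^N) = U^*$ such that

\[ f(u) = \sum_{i=1}^{N} \langle u_i, v_i \rangle, \quad \forall u \in U. \tag{6.58} \]

Moreover,

\[ \|f\|_{U^*} = \|v\|_{qN}, \tag{6.59} \]

where $\|\cdot\|_{qN} = \|\cdot\|_{L^q(\Omega; \mathbb{R}^N)}$.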

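Proof. For $u = (u_1, \dots, u_N) \in L^p(\Omega; \mathbb{R}^N)$ we may write

\[ f(u) = f((u_1, 0, \dots, 0)) + \dots + f((0, \dots, 0, u_j, 0, \dots, 0)) + \dots + f((0, \dots, 0, u_N)), \tag{6.60} \]

and since $u_j \mapsto f((0, \dots, 0, u_j, 0, \dots, 0))$ is a continuous linear functional on $L^p(\Omega)$, there exists a unique $v_j \in L^q(\Omega)$ such that $f((0, \dots, 0, u_j, 0, \dots, 0)) = \langle u_j, v_j \rangle_{L^2(\Omega)}$, $\forall u_j \in L^p(\Omega)$, $\forall\, 1 \le j \le N$, so that

\[ f(u) = \sum_{i=1}^{N} \langle u_i, v_i \rangle, \quad \forall u \in U. \tag{6.61} \]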

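From Hölder's inequality we obtain

\[ |f(u)| \le \sum_{j=1}^{N} \|u_j\|_p \|v_j\|_q \le \|u\|_{pN} \|v\|_{qN}, \tag{6.62} \]

and hence $\|f\|_{U^*} \le \|v\|_{qN}$. The equality in (6.62) is achieved for $u \in L^p(\Omega; \mathbb{R}^N)$, $1 < p < \infty$, such that

\[ u_j(x) = \begin{cases} |v_j(x)|^{q-2} v_j(x), & \text{if } v_j(x) \ne 0, \\ 0, & \text{if } v_j(x) = 0. \end{cases} \tag{6.63} \]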

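If $p = 1$ choose $k$ such that $\|v_k\|_\infty = \max_{1 \le j \le N} \|v_j\|_\infty$. Given $\varepsilon > 0$, there is a measurable set $A$ such that $m(A) > 0$ and $|v_k(x)| \ge \|v_k\|_\infty - \varepsilon$, $\forall x \in A$. Defining $u(x)$ as

\[ u_i(x) = \begin{cases} v_k(x)/|v_k(x)|, & \text{if } i = k,\ x \in A \text{ and } v_k(x) \ne 0, \\ 0, & \text{otherwise}, \end{cases} \tag{6.64} \]

we have

\[ f(u) = \langle u_k, v_k \rangle_{L^2(\Omega)} = \int_A |v_k| \, dx \ge (\|v_k\|_\infty - \varepsilon) \|u_k\|_1 = (\|v\|_{\infty N} - \varepsilon) \|u\|_{1N}. \tag{6.65} \]

Since $\varepsilon$ is arbitrary, the proof is complete.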

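Theorem 6.2.8. Let $1 \le p < \infty$. Given a continuous linear functional $f$ on $W^{m,p}(\Omega)$, there exists $v \in L^q(\Omega; \mathbb{R}^N)$ such that

\[ f(u) = \sum_{0 \le |\alpha| \le m} \langle D^\alpha u, v_\alpha \rangle_{L^2(\Omega)}. \tag{6.66} \]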

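Proof. Consider $f$ a continuous linear functional on $U = W^{m,p}(\Omega)$. By the Hahn-Banach Theorem, we can extend $f$ to $\tilde f$, defined on $L^p(\Omega; \mathbb{R}^N)$, so that $\|\tilde f\|_{qN} = \|f\|_{U^*}$, and by Lemma 6.2.7 there exists $v = \{v_\alpha\} \in L^q(\Omega; \mathbb{R}^N)$ such that

\[ \tilde f(\tilde u) = \sum_{0 \le |\alpha| \le m} \langle \tilde u_\alpha, v_\alpha \rangle_{L^2(\Omega)}, \quad \forall \tilde u \in L^p(\Omega; \mathbb{R}^N). \tag{6.67} \]

In particular, for $u \in W^{m,p}(\Omega)$, defining $\tilde u = \{D^\alpha u\} \in L^p(\Omega; \mathbb{R}^N)$ we obtain

\[ f(u) = \tilde f(\tilde u) = \sum_{0 \le |\alpha| \le m} \langle D^\alpha u, v_\alpha \rangle_{L^2(\Omega)}. \tag{6.68} \]

Finally, observe that, also from the Hahn-Banach theorem, $\|f\|_{U^*} = \|\tilde f\|_{qN} = \|v\|_{qN}$.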

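Definition 6.2.9. Let $\Omega \subset \mathbb{R}^n$ be a domain. For $m$ a positive integer and $1 \le p < \infty$ we define $W_0^{m,p}(\Omega)$ as the closure in $\|\cdot\|_{m,p}$ of $C_c^\infty(\Omega)$, where we recall that $C_c^\infty(\Omega)$ denotes the set of $C^\infty(\Omega)$ functions with compact support contained in $\Omega$. Finally, we also recall that the support of $\phi : \Omega \to \mathbb{R}$, denoted by $\mathrm{spt}(\phi)$, is given by

\[ \mathrm{spt}(\phi) = \overline{\{x \in \Omega \mid \phi(x) \ne 0\}}. \]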

Now we state two theorems concerning the concept of trace. We do not prove these results. For proofs see Evans, [18].

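Theorem 6.2.10 (The trace theorem). Let $\Omega \subset \mathbb{R}^n$ be an open bounded set such that $\partial\Omega$ is $C^1$. Then there exists a bounded linear operator

\[ T : W^{1,p}(\Omega) \to L^p(\partial\Omega), \]

such that

(1) $Tu = u|_{\partial\Omega}$ if $u \in W^{1,p}(\Omega) \cap C(\overline{\Omega})$, and

(2) $\|Tu\|_{p,\partial\Omega} \le C\|u\|_{1,p,\Omega}$, $\forall u \in W^{1,p}(\Omega)$, where the constant $C$ depends only on $p$ and $\Omega$.

$Tu$ is called the trace of $u$ on $\partial\Omega$.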

Theorem 6.2.11. Let $\Omega \subset \mathbb{R}^n$ be an open bounded set such that $\partial\Omega$ is $C^1$. Then $u \in W_0^{1,p}(\Omega)$ if and only if $u \in W^{1,p}(\Omega)$ and $Tu = 0$ on $\partial\Omega$.

Remark 6.2.12. Similar results are valid for $W_0^{m,p}$; however, in this case the traces relative to derivatives of order up to $m - 1$ are involved.


6.3. The Sobolev Imbedding Theorem

6.3.1. The Statement of Sobolev Imbedding Theorem.

Now we present the Sobolev Imbedding Theorem. We recall that for normed spaces $X$, $Y$ the notation

\[ X \hookrightarrow Y \]

means that $X \subset Y$ and there exists a constant $K > 0$ such that

\[ \|u\|_Y \le K\|u\|_X, \quad \forall u \in X. \]

If in addition the imbedding is compact, then for any bounded sequence $\{u_n\} \subset X$ there exists a subsequence $\{u_{n_k}\}$ which converges to some $u$ in the norm $\|\cdot\|_Y$.

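Theorem 6.3.1 (The Sobolev Imbedding Theorem). Let $\Omega$ be an open bounded set in $\mathbb{R}^n$ such that $\partial\Omega$ is $C^1$. Let $j \ge 0$ and $m \ge 1$ be integers and let $1 \le p < \infty$.

(1) Part I

(a) Case A. If either $mp > n$ or $m = n$ and $p = 1$, then

\[ W^{j+m,p}(\Omega) \hookrightarrow C_B^j(\Omega). \tag{6.69} \]

Moreover,

\[ W^{j+m,p}(\Omega) \hookrightarrow W^{j,q}(\Omega), \quad \text{for } p \le q \le \infty, \tag{6.70} \]

and, in particular,

\[ W^{m,p}(\Omega) \hookrightarrow L^q(\Omega), \quad \text{for } p \le q \le \infty. \tag{6.71} \]

(b) Case B. If $mp = n$, then

\[ W^{j+m,p}(\Omega) \hookrightarrow W^{j,q}(\Omega), \quad \text{for } p \le q < \infty, \tag{6.72} \]

and, in particular,

\[ W^{m,p}(\Omega) \hookrightarrow L^q(\Omega), \quad \text{for } p \le q < \infty. \tag{6.73} \]

(c) Case C. If $mp < n$, then

\[ W^{j+m,p}(\Omega) \hookrightarrow W^{j,q}(\Omega), \quad \text{for } p \le q \le p^* = \frac{np}{n - mp}, \tag{6.74} \]

and, in particular,

\[ W^{m,p}(\Omega) \hookrightarrow L^q(\Omega), \quad \text{for } p \le q \le p^* = \frac{np}{n - mp}. \tag{6.75} \]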


(2) Part II

If $mp > n > (m-1)p$, then

\[ W^{j+m,p}(\Omega) \hookrightarrow C^{j,\lambda}(\overline{\Omega}), \quad \text{for } 0 < \lambda \le m - (n/p), \tag{6.76} \]

and if $n = (m-1)p$, then

\[ W^{j+m,p}(\Omega) \hookrightarrow C^{j,\lambda}(\overline{\Omega}), \quad \text{for } 0 < \lambda < 1. \tag{6.77} \]

Also, if $n = m - 1$ and $p = 1$, then (6.77) holds for $\lambda = 1$ as well.

(3) Part III. All imbeddings in Parts I and II are valid for arbitrary domains $\Omega$ if the $W$-space undergoing the imbedding is replaced with the corresponding $W_0$-space.
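To see how the cases of the theorem sort out in practice, here is a small worked illustration for $W^{1,2}(\Omega)$ (so $m = 1$, $p = 2$, $j = 0$) in dimensions $n = 1, 2, 3$. The specific choices of $n$ are assumptions made only for the example; $\Omega$ is taken bounded with $C^1$ boundary as in the statement:

```latex
% n = 3:  mp = 2 < 3, so the critical exponent is
\[
p^* = \frac{np}{n - mp} = \frac{3 \cdot 2}{3 - 2} = 6,
\qquad
W^{1,2}(\Omega) \hookrightarrow L^q(\Omega), \quad 2 \le q \le 6 .
\]
% n = 2:  mp = 2 = n (Case B), so
\[
W^{1,2}(\Omega) \hookrightarrow L^q(\Omega), \quad 2 \le q < \infty .
\]
% n = 1:  mp = 2 > 1 > (m-1)p = 0 (Part II), so
\[
W^{1,2}(\Omega) \hookrightarrow C^{0,\lambda}(\overline{\Omega}),
\quad 0 < \lambda \le m - \frac{n}{p} = \frac{1}{2} .
\]
```

The regularity gained improves as the dimension decreases: from integrability up to $L^6$, to every finite $L^q$, to Hölder continuity with exponent up to $1/2$.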

6.4. The Proof of the Sobolev Imbedding Theorem

Now we present a collection of results which imply the proof ofthe Sobolev imbedding theorem. We start with the approximationby smooth functions.

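Definition 6.4.1. Let $\Omega \subset \mathbb{R}^n$ be an open bounded set. For each $\varepsilon > 0$ define

\[ \Omega_\varepsilon = \{x \in \Omega \mid \mathrm{dist}(x, \partial\Omega) > \varepsilon\}. \]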

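Definition 6.4.2. Define $\eta \in C_c^\infty(\mathbb{R}^n)$ by

\[ \eta(x) = \begin{cases} C \exp\left( \dfrac{1}{|x|_2^2 - 1} \right), & \text{if } |x|_2 < 1, \\ 0, & \text{if } |x|_2 \ge 1, \end{cases} \]

where $|\cdot|_2$ refers to the Euclidean norm in $\mathbb{R}^n$, that is, for $x = (x_1, \dots, x_n) \in \mathbb{R}^n$ we have

\[ |x|_2 = \sqrt{x_1^2 + \dots + x_n^2}. \]

Moreover, $C > 0$ is chosen so that

\[ \int_{\mathbb{R}^n} \eta \, dx = 1. \]

For each $\varepsilon > 0$, set

\[ \eta_\varepsilon(x) = \frac{1}{\varepsilon^n} \eta\left( \frac{x}{\varepsilon} \right). \]

The function $\eta$ is said to be the fundamental mollifier. The functions $\eta_\varepsilon$ belong to $C_c^\infty(\mathbb{R}^n)$ and satisfy

\[ \int_{\mathbb{R}^n} \eta_\varepsilon \, dx = 1, \]

and $\mathrm{spt}(\eta_\varepsilon) \subset B(0, \varepsilon)$.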
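The normalization of $\eta_\varepsilon$ follows from a change of variables. A short sketch, with the substitution variable $y = x/\varepsilon$ introduced here for illustration:

```latex
\[
\int_{\mathbb{R}^n} \eta_\varepsilon(x) \, dx
  = \frac{1}{\varepsilon^n} \int_{\mathbb{R}^n}
      \eta\left( \frac{x}{\varepsilon} \right) dx
  = \frac{1}{\varepsilon^n} \int_{\mathbb{R}^n} \eta(y) \, \varepsilon^n \, dy
  = \int_{\mathbb{R}^n} \eta(y) \, dy
  = 1 .
\]
% Support: \eta(x/\varepsilon) \neq 0 only if |x/\varepsilon|_2 < 1,
% that is, only if |x|_2 < \varepsilon.
```

The factor $\varepsilon^{-n}$ in the definition is precisely what compensates the Jacobian $\varepsilon^n$ of the rescaling.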

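Definition 6.4.3. If $f : \Omega \to \mathbb{R}$ is locally integrable, we define its mollification, denoted by $f_\varepsilon : \Omega_\varepsilon \to \mathbb{R}$, as

\[ f_\varepsilon = \eta_\varepsilon * f, \]

that is,

\[ f_\varepsilon(x) = \int_\Omega \eta_\varepsilon(x - y) f(y) \, dy = \int_{B(0,\varepsilon)} \eta_\varepsilon(y) f(x - y) \, dy. \tag{6.78} \]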

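Theorem 6.4.4 (Properties of mollifiers). The mollifiers have the following properties:

(1) $f_\varepsilon \in C^\infty(\Omega_\varepsilon)$,

(2) $f_\varepsilon \to f$ a.e. as $\varepsilon \to 0$,

(3) If $f \in C(\Omega)$ then $f_\varepsilon \to f$ uniformly on compact subsets of $\Omega$.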

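Proof. (1) Fix $x \in \Omega_\varepsilon$, $i \in \{1, \dots, n\}$ and $h$ small enough such that

\[ x + h e_i \in \Omega_\varepsilon. \]

Thus

\[
\begin{aligned}
\frac{f_\varepsilon(x + h e_i) - f_\varepsilon(x)}{h}
&= \frac{1}{\varepsilon^n} \int_\Omega \frac{1}{h} \left[ \eta\left( \frac{x + h e_i - y}{\varepsilon} \right) - \eta\left( \frac{x - y}{\varepsilon} \right) \right] f(y) \, dy \\
&= \frac{1}{\varepsilon^n} \int_V \frac{1}{h} \left[ \eta\left( \frac{x + h e_i - y}{\varepsilon} \right) - \eta\left( \frac{x - y}{\varepsilon} \right) \right] f(y) \, dy,
\end{aligned} \tag{6.79}
\]

for an appropriate compact $V \subset\subset \Omega$. As

\[ \frac{1}{h} \left[ \eta\left( \frac{x + h e_i - y}{\varepsilon} \right) - \eta\left( \frac{x - y}{\varepsilon} \right) \right] \to \frac{1}{\varepsilon} \frac{\partial \eta}{\partial x_i}\left( \frac{x - y}{\varepsilon} \right), \]

as $h \to 0$, uniformly on $V$, we obtain

\[ \frac{\partial f_\varepsilon(x)}{\partial x_i} = \int_\Omega \frac{\partial \eta_\varepsilon(x - y)}{\partial x_i} f(y) \, dy. \]

By analogy, we may show that

\[ D^\alpha f_\varepsilon(x) = \int_\Omega D^\alpha \eta_\varepsilon(x - y) f(y) \, dy, \quad \forall x \in \Omega_\varepsilon. \]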

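(2) From the Lebesgue differentiation theorem we have

\[ \lim_{r \to 0} \frac{1}{|B(x, r)|} \int_{B(x,r)} |f(y) - f(x)| \, dy = 0, \tag{6.80} \]

for almost all $x \in \Omega$. Fix $x \in \Omega$ such that (6.80) holds. Hence,

\[
\begin{aligned}
|f_\varepsilon(x) - f(x)|
&= \left| \int_{B(x,\varepsilon)} \eta_\varepsilon(x - y) [f(y) - f(x)] \, dy \right| \\
&\le \frac{1}{\varepsilon^n} \int_{B(x,\varepsilon)} \eta\left( \frac{x - y}{\varepsilon} \right) |f(y) - f(x)| \, dy \\
&\le \frac{C}{|B(x, \varepsilon)|} \int_{B(x,\varepsilon)} |f(y) - f(x)| \, dy
\end{aligned} \tag{6.81}
\]

for an appropriate constant $C > 0$. From (6.80), we obtain $f_\varepsilon(x) \to f(x)$ as $\varepsilon \to 0$.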

(3) Assume $f \in C(\Omega)$. Given $V \subset\subset \Omega$ choose $W$ such that

\[ V \subset\subset W \subset\subset \Omega, \]

and note that $f$ is uniformly continuous on $W$. Thus the limit indicated in (6.80) holds uniformly on $V$, and therefore $f_\varepsilon \to f$ uniformly on $V$.

Theorem 6.4.5. Let u ∈ Lp(Ω), where 1 ≤ p <∞. Then

ηε ∗ u ∈ Lp(Ω),

‖ηε ∗ u‖p ≤ ‖u‖p, ∀ε > 0

and

\[ \lim_{\varepsilon \to 0^+} \|\eta_\varepsilon * u - u\|_p = 0. \]


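Proof. Suppose $u \in L^p(\Omega)$ and $1 < p < \infty$. Defining $q = p/(p-1)$, from Hölder's inequality we have (with $u$ extended by zero outside $\Omega$)

\[
\begin{aligned}
|\eta_\varepsilon * u(x)|
&= \left| \int_{\mathbb{R}^n} \eta_\varepsilon(x - y) u(y) \, dy \right| \\
&= \left| \int_{\mathbb{R}^n} [\eta_\varepsilon(x - y)]^{1 - 1/p} [\eta_\varepsilon(x - y)]^{1/p} u(y) \, dy \right| \\
&\le \left[ \int_{\mathbb{R}^n} \eta_\varepsilon(x - y) \, dy \right]^{1/q} \left[ \int_{\mathbb{R}^n} \eta_\varepsilon(x - y) |u(y)|^p \, dy \right]^{1/p} \\
&= \left[ \int_{\mathbb{R}^n} \eta_\varepsilon(x - y) |u(y)|^p \, dy \right]^{1/p}.
\end{aligned} \tag{6.82}
\]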

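From this and the Fubini theorem, we obtain

\[
\begin{aligned}
\int_\Omega |\eta_\varepsilon * u(x)|^p \, dx
&\le \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} \eta_\varepsilon(x - y) |u(y)|^p \, dy \, dx \\
&= \int_{\mathbb{R}^n} |u(y)|^p \left( \int_{\mathbb{R}^n} \eta_\varepsilon(x - y) \, dx \right) dy \\
&= \|u\|_p^p.
\end{aligned} \tag{6.83}
\]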

Suppose given $\rho > 0$. As $C_0(\Omega)$ is dense in $L^p(\Omega)$, there exists $\phi \in C_0(\Omega)$ such that

‖u− φ‖p < ρ/3.

From the fact that

ηε ∗ φ→ φ

as ε→ 0, uniformly in Ω we have that there exists δ > 0 such that

‖ηε ∗ φ− φ‖p < ρ/3

if ε < δ. Thus for any ε < δ(ρ), we get

‖ηε ∗ u− u‖p = ‖ηε ∗ u− ηε ∗ φ+ ηε ∗ φ− φ+ φ− u‖p

≤ ‖ηε ∗ u− ηε ∗ φ‖p + ‖ηε ∗ φ− φ‖p + ‖φ− u‖p

≤ ρ/3 + ρ/3 + ρ/3 = ρ. (6.84)

Since ρ > 0 is arbitrary, the proof is complete.
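The first term in (6.84) is bounded by $\rho/3$ because convolution with $\eta_\varepsilon$ is linear and, by (6.83), does not increase the $L^p$ norm. Written out as a sketch:

```latex
\[
\|\eta_\varepsilon * u - \eta_\varepsilon * \phi\|_p
  = \|\eta_\varepsilon * (u - \phi)\|_p
  \le \|u - \phi\|_p
  < \frac{\rho}{3} .
\]
```

This is the only place where the uniform bound $\|\eta_\varepsilon * u\|_p \le \|u\|_p$ enters the convergence argument; the middle term is handled by uniform convergence and the last by the choice of $\phi$.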

For the next theorem we denote

\[ \tilde u(x) = \begin{cases} u(x), & \text{if } x \in \Omega, \\ 0, & \text{if } x \in \mathbb{R}^n - \Omega. \end{cases} \]


6.4.1. Relatively Compact Sets in Lp(Ω).

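Theorem 6.4.6. Consider $1 \le p < \infty$. A bounded set $K \subset L^p(\Omega)$ is relatively compact if and only if for each $\varepsilon > 0$ there exist $\delta > 0$ and $G \subset\subset \Omega$ (here $G \subset\subset \Omega$ denotes that $\overline{G} \subset \Omega$ with $\overline{G}$ compact) such that for each $u \in K$ and $h \in \mathbb{R}^n$ with $|h| < \delta$ we have

(1)
\[ \int_\Omega |\tilde u(x + h) - \tilde u(x)|^p \, dx < \varepsilon^p, \tag{6.85} \]

(2)
\[ \int_{\Omega - G} |u(x)|^p \, dx < \varepsilon^p. \tag{6.86} \]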

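Proof. Suppose $K$ is relatively compact in $L^p(\Omega)$. Suppose given $\varepsilon > 0$. As $\overline{K}$ is compact we may find an $\varepsilon/6$-net for $K$. Denote such an $\varepsilon/6$-net by $N$, where

\[ N = \{v_1, \dots, v_m\} \subset L^p(\Omega). \]

Since $C_c(\Omega)$ is dense in $L^p(\Omega)$, for each $k \in \{1, \dots, m\}$ there exists $\phi_k \in C_c(\Omega)$ such that

\[ \|\phi_k - v_k\|_p < \frac{\varepsilon}{6}. \]

Thus defining

\[ S = \{\phi_1, \dots, \phi_m\}, \]

given $u \in K$, we may select $v_k \in N$ such that

\[ \|u - v_k\|_p < \frac{\varepsilon}{6}, \]

so that

\[ \|\phi_k - u\|_p \le \|\phi_k - v_k\|_p + \|v_k - u\|_p \le \frac{\varepsilon}{6} + \frac{\varepsilon}{6} = \frac{\varepsilon}{3}. \tag{6.87} \]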

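Define

\[ G = \cup_{k=1}^{m} \mathrm{spt}(\phi_k), \]

where

\[ \mathrm{spt}(\phi_k) = \overline{\{x \in \mathbb{R}^n \mid \phi_k(x) \ne 0\}}. \]

We have that

\[ G \subset\subset \Omega, \]

where, as mentioned above, this means $\overline{G} \subset \Omega$. Observe that

\[ \varepsilon^p > \|u - \phi_k\|_p^p \ge \int_{\Omega - G} |u(x)|^p \, dx. \]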

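Since $u \in K$ is arbitrary, (6.86) is proven. Since $\phi_k$ is continuous and $\mathrm{spt}(\phi_k)$ is compact we have that $\phi_k$ is uniformly continuous, that is, for the $\varepsilon$ given above, there exists $\tilde\delta > 0$ such that if $|h| < \min\{\tilde\delta, 1\}$ then

\[ |\phi_k(x + h) - \phi_k(x)| < \frac{\varepsilon}{3(|G| + 1)}, \quad \forall x \in G. \]

Thus,

\[ \int_\Omega |\phi_k(x + h) - \phi_k(x)|^p \, dx < \left( \frac{\varepsilon}{3} \right)^p. \]

Also observe that since

\[ \|u - \phi_k\|_p < \frac{\varepsilon}{3}, \]

we have that

\[ \|T_h u - T_h \phi_k\|_p < \frac{\varepsilon}{3}, \]

where $T_h u = u(x + h)$. Thus if $|h| < \delta = \min\{\tilde\delta, 1\}$, we obtain

\[ \|T_h u - u\|_p \le \|T_h u - T_h \phi_k\|_p + \|T_h \phi_k - \phi_k\|_p + \|\phi_k - u\|_p < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon. \tag{6.88} \]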

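For the converse, it suffices to consider the special case $\Omega = \mathbb{R}^n$, because for a general $\Omega$ we can define $\tilde K = \{\tilde u \mid u \in K\}$. Suppose given $\varepsilon > 0$ and choose $G \subset\subset \mathbb{R}^n$ such that for all $u \in K$ we have

\[ \int_{\mathbb{R}^n - G} |u(x)|^p \, dx < \frac{\varepsilon}{3}. \]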

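For each $\rho > 0$ the function $\eta_\rho * u \in C^\infty(\mathbb{R}^n)$, and in particular $\eta_\rho * u \in C(\overline{G})$. Suppose $\phi \in C_0(\mathbb{R}^n)$. Fix $\rho > 0$. By Hölder's inequality we have

\[
\begin{aligned}
|\eta_\rho * \phi(x) - \phi(x)|^p
&= \left| \int_{\mathbb{R}^n} \eta_\rho(y) (\phi(x - y) - \phi(x)) \, dy \right|^p \\
&= \left| \int_{\mathbb{R}^n} (\eta_\rho(y))^{1 - 1/p} (\eta_\rho(y))^{1/p} (T_{-y}\phi(x) - \phi(x)) \, dy \right|^p \\
&\le \int_{B_\rho(\theta)} \eta_\rho(y) |T_{-y}\phi(x) - \phi(x)|^p \, dy.
\end{aligned} \tag{6.89}
\]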


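Hence, from the Fubini theorem we may write

\[ \int_{\mathbb{R}^n} |\eta_\rho * \phi(x) - \phi(x)|^p \, dx \le \int_{B_\rho(\theta)} \eta_\rho(y) \int_{\mathbb{R}^n} |T_{-y}\phi(x) - \phi(x)|^p \, dx \, dy, \tag{6.90} \]

so that we may write

\[ \|\eta_\rho * \phi - \phi\|_p \le \sup_{h \in B_\rho(\theta)} \|T_h \phi - \phi\|_p. \tag{6.91} \]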

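Fix $u \in L^p(\mathbb{R}^n)$. We may obtain a sequence $\{\phi_k\} \subset C_c(\mathbb{R}^n)$ such that

\[ \phi_k \to u, \text{ in } L^p(\mathbb{R}^n). \]

Observe that

\[ \eta_\rho * \phi_k \to \eta_\rho * u, \text{ in } L^p(\mathbb{R}^n), \]

as $k \to \infty$. Also

\[ T_h \phi_k \to T_h u, \text{ in } L^p(\mathbb{R}^n), \]

as $k \to \infty$. Thus

\[ \|T_h \phi_k - \phi_k\|_p \to \|T_h u - u\|_p, \]

in particular

\[ \limsup_{k \to \infty} \sup_{h \in B_\rho(\theta)} \|T_h \phi_k - \phi_k\|_p \le \sup_{h \in B_\rho(\theta)} \|T_h u - u\|_p. \]

Therefore, as

\[ \|\eta_\rho * \phi_k - \phi_k\|_p \to \|\eta_\rho * u - u\|_p, \]

as $k \to \infty$, from (6.91) we get

\[ \|\eta_\rho * u - u\|_p \le \sup_{h \in B_\rho(\theta)} \|T_h u - u\|_p. \]

From this and (6.85) we obtain

\[ \|\eta_\rho * u - u\|_p \to 0, \text{ uniformly in } K \text{ as } \rho \to 0. \]

Fix $\rho_0 > 0$ such that

\[ \int_G |\eta_{\rho_0} * u - u|^p \, dx < \frac{\varepsilon}{3 \cdot 2^{p-1}}, \quad \forall u \in K. \]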


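Observe that

\[
\begin{aligned}
|\eta_{\rho_0} * u(x)|
&= \left| \int_{\mathbb{R}^n} \eta_{\rho_0}(x - y) u(y) \, dy \right| \\
&= \left| \int_{\mathbb{R}^n} [\eta_{\rho_0}(x - y)]^{1 - 1/p} [\eta_{\rho_0}(x - y)]^{1/p} u(y) \, dy \right| \\
&\le \left[ \int_{\mathbb{R}^n} \eta_{\rho_0}(x - y) \, dy \right]^{1/q} \left[ \int_{\mathbb{R}^n} \eta_{\rho_0}(x - y) |u(y)|^p \, dy \right]^{1/p} \\
&= \left[ \int_{\mathbb{R}^n} \eta_{\rho_0}(x - y) |u(y)|^p \, dy \right]^{1/p}.
\end{aligned} \tag{6.92}
\]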

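From this, we may write

\[ |\eta_{\rho_0} * u(x)| \le \left( \sup_{y \in \mathbb{R}^n} \eta_{\rho_0}(y) \right)^{1/p} \|u\|_p \le K_1, \quad \forall x \in \mathbb{R}^n,\ u \in K, \]

where $K_1 = K_2 K_3$,

\[ K_2 = \left( \sup_{y \in \mathbb{R}^n} \eta_{\rho_0}(y) \right)^{1/p}, \]

and $K_3$ is any constant such that

\[ \|u\|_p < K_3, \quad \forall u \in K. \]

Similarly

\[ |\eta_{\rho_0} * u(x + h) - \eta_{\rho_0} * u(x)| \le \left( \sup_{y \in \mathbb{R}^n} \eta_{\rho_0}(y) \right)^{1/p} \|T_h u - u\|_p, \]

and thus from (6.85) we obtain

\[ \eta_{\rho_0} * u(x + h) \to \eta_{\rho_0} * u(x), \text{ as } h \to 0, \]

uniformly in $\mathbb{R}^n$ and for $u \in K$.

By the Arzelà-Ascoli Theorem

\[ \{\eta_{\rho_0} * u \mid u \in K\} \]

is relatively compact in $C(\overline{G})$, and it is totally bounded, so that there exists an $\varepsilon_0^{1/p}$-net $N = \{v_1, \dots, v_m\}$ (each $v_k$ being extended by zero outside $G$), where

\[ \varepsilon_0 = \frac{\varepsilon}{3 \cdot 2^{p-1} |G|}. \]

Thus for some $k \in \{1, \dots, m\}$ we have

\[ \|v_k - \eta_{\rho_0} * u\|_\infty^p < \varepsilon_0. \]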


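Hence,

\[
\begin{aligned}
\int_{\mathbb{R}^n} |u(x) - v_k(x)|^p \, dx
&= \int_{\mathbb{R}^n - G} |u(x)|^p \, dx + \int_G |u(x) - v_k(x)|^p \, dx \\
&\le \frac{\varepsilon}{3} + 2^{p-1} \int_G \left( |u(x) - (\eta_{\rho_0} * u)(x)|^p + |\eta_{\rho_0} * u(x) - v_k(x)|^p \right) dx \\
&\le \frac{\varepsilon}{3} + 2^{p-1} \left( \frac{\varepsilon}{3 \cdot 2^{p-1}} + \frac{\varepsilon |G|}{3 \cdot 2^{p-1} |G|} \right) = \varepsilon.
\end{aligned} \tag{6.93}
\]

Thus $K$ is totally bounded and therefore it is relatively compact. The proof is complete.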

6.4.2. Some Approximation Results.

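Theorem 6.4.7. Let $\Omega \subset \mathbb{R}^n$ be an open set. Assume $u \in W^{m,p}(\Omega)$ for some $1 \le p < \infty$, and set

\[ u_\varepsilon = \eta_\varepsilon * u \text{ in } \Omega_\varepsilon. \]

Then,

(1) $u_\varepsilon \in C^\infty(\Omega_\varepsilon)$, $\forall \varepsilon > 0$,

(2) $u_\varepsilon \to u$ in $W^{m,p}_{loc}(\Omega)$, as $\varepsilon \to 0$.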

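Proof. Assertion 1 has already been proved. Let us prove 2. We will show that if $|\alpha| \le m$, then

\[ D^\alpha u_\varepsilon = \eta_\varepsilon * D^\alpha u, \text{ in } \Omega_\varepsilon. \]

For, let $x \in \Omega_\varepsilon$. Thus,

\[
\begin{aligned}
D^\alpha u_\varepsilon(x)
&= D^\alpha \left( \int_\Omega \eta_\varepsilon(x - y) u(y) \, dy \right) \\
&= \int_\Omega D_x^\alpha \eta_\varepsilon(x - y) u(y) \, dy \\
&= (-1)^{|\alpha|} \int_\Omega D_y^\alpha (\eta_\varepsilon(x - y)) u(y) \, dy.
\end{aligned} \tag{6.94}
\]

Observe that for fixed $x \in \Omega_\varepsilon$ the function

\[ \phi(y) = \eta_\varepsilon(x - y) \in C_c^\infty(\Omega). \]

Therefore,

\[ \int_\Omega D_y^\alpha (\eta_\varepsilon(x - y)) u(y) \, dy = (-1)^{|\alpha|} \int_\Omega \eta_\varepsilon(x - y) D_y^\alpha u(y) \, dy, \]

and hence,

\[ D^\alpha u_\varepsilon(x) = (-1)^{|\alpha| + |\alpha|} \int_\Omega \eta_\varepsilon(x - y) D^\alpha u(y) \, dy = (\eta_\varepsilon * D^\alpha u)(x). \tag{6.95} \]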

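Now choose any open bounded set $V$ such that $V \subset\subset \Omega$. We have that

\[ D^\alpha u_\varepsilon \to D^\alpha u, \text{ in } L^p(V) \text{ as } \varepsilon \to 0, \]

for each $|\alpha| \le m$. Thus,

\[ \|u_\varepsilon - u\|_{m,p,V}^p = \sum_{|\alpha| \le m} \|D^\alpha u_\varepsilon - D^\alpha u\|_{p,V}^p \to 0, \]

as $\varepsilon \to 0$.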

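Theorem 6.4.8. Let $\Omega \subset \mathbb{R}^n$ be a bounded open set and suppose $u \in W^{m,p}(\Omega)$ for some $1 \le p < \infty$. Then there exists a sequence $\{u_k\} \subset C^\infty(\Omega)$ such that

\[ u_k \to u \text{ in } W^{m,p}(\Omega). \]

Proof. Observe that

\[ \Omega = \cup_{i=1}^\infty \Omega_i, \]

where

\[ \Omega_i = \{x \in \Omega \mid \mathrm{dist}(x, \partial\Omega) > 1/i\}. \]

Define

\[ V_i = \Omega_{i+3} - \overline{\Omega}_{i+1}, \]

and choose any open set $V_0$ such that $V_0 \subset\subset \Omega$, so that

\[ \Omega = \cup_{i=0}^\infty V_i. \]

Let $\{\zeta_i\}_{i=0}^\infty$ be a smooth partition of unity subordinate to the open sets $\{V_i\}_{i=0}^\infty$. That is,

\[ 0 \le \zeta_i \le 1, \quad \zeta_i \in C_c^\infty(V_i), \quad \sum_{i=0}^\infty \zeta_i = 1 \text{ on } \Omega. \]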


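Now suppose $u \in W^{m,p}(\Omega)$. Thus $\zeta_i u \in W^{m,p}(\Omega)$ and $\mathrm{spt}(\zeta_i u) \subset V_i \subset \Omega$. Choose $\delta > 0$. For each $i \in \mathbb{N}$ choose $\varepsilon_i > 0$ small enough so that

\[ u_i = \eta_{\varepsilon_i} * (\zeta_i u) \]

satisfies

\[ \|u_i - \zeta_i u\|_{m,p,\Omega} \le \frac{\delta}{2^{i+1}}, \]

and $\mathrm{spt}(u_i) \subset W_i$, where $W_i = \Omega_{i+4} - \overline{\Omega}_i \supset V_i$. Define

\[ v = \sum_{i=0}^\infty u_i. \]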

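Such a function belongs to $C^\infty(\Omega)$, since for each open $V \subset\subset \Omega$ there are at most finitely many non-zero terms in the sum. Since

\[ u = \sum_{i=0}^\infty \zeta_i u, \]

we have that for a fixed $V \subset\subset \Omega$,

\[ \|v - u\|_{m,p,V} \le \sum_{i=0}^\infty \|u_i - \zeta_i u\|_{m,p,V} \le \delta \sum_{i=0}^\infty \frac{1}{2^{i+1}} = \delta. \tag{6.96} \]

Taking the supremum over sets $V \subset\subset \Omega$ we obtain

\[ \|v - u\|_{m,p,\Omega} \le \delta. \]

Since $\delta > 0$ is arbitrary, the proof is complete.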

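The next result is also relevant. For a proof see Evans, [18] page 232.

Theorem 6.4.9. Let $\Omega \subset \mathbb{R}^n$ be a bounded set such that $\partial\Omega$ is $C^1$. Suppose $u \in W^{m,p}(\Omega)$ where $1 \le p < \infty$. Then there exists a sequence $\{u_n\} \subset C^\infty(\overline{\Omega})$ such that

\[ u_n \to u \text{ in } W^{m,p}(\Omega), \text{ as } n \to \infty. \]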


6.4.3. Extensions. In this section we study extensions of Sobolev spaces from a domain $\Omega \subset \mathbb{R}^n$ to $\mathbb{R}^n$.

Theorem 6.4.10. Assume $\Omega \subset \mathbb{R}^n$ is an open bounded set, and that $\partial\Omega$ is $C^1$. Let $V$ be a bounded open set such that $\Omega \subset\subset V$. Then there exists a bounded linear operator

\[ E : W^{1,p}(\Omega) \to W^{1,p}(\mathbb{R}^n), \]

such that for each $u \in W^{1,p}(\Omega)$ we have:

(1) $Eu = u$, a.e. in $\Omega$,

(2) $Eu$ has support in $V$, and

(3) $\|Eu\|_{1,p,\mathbb{R}^n} \le C\|u\|_{1,p,\Omega}$, where the constant depends only on $p$, $\Omega$, and $V$.

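Proof. Fix $x_0 \in \partial\Omega$ and suppose first that $\partial\Omega$ is flat near $x_0$, lying in the plane $\{x_n = 0\}$. Thus there exists an open ball $B$ with center $x_0$ and radius $r > 0$ such that

\[ B^+ = B \cap \{x : x_n \ge 0\} \subset \overline{\Omega}, \qquad B^- = B \cap \{x : x_n \le 0\} \subset \mathbb{R}^n - \Omega. \]

First assume $u \in C^\infty(\overline{\Omega})$ and define $\bar u$ by

\[ \bar u(x) = \begin{cases} u(x), & \text{if } x \in B^+, \\ -3u(x_1, \dots, x_{n-1}, -x_n) + 4u(x_1, \dots, x_{n-1}, -x_n/2), & \text{if } x \in B^-. \end{cases} \]

We will show that $\bar u \in C^1(B)$. Indeed, writing $\bar u^- = \bar u|_{B^-}$ and $\bar u^+ = \bar u|_{B^+}$,

\[ \frac{\partial \bar u^-(x)}{\partial x_n} = 3 \frac{\partial u(x_1, \dots, x_{n-1}, -x_n)}{\partial x_n} - 2 \frac{\partial u(x_1, \dots, x_{n-1}, -x_n/2)}{\partial x_n}, \]

so that

\[ \bar u^-_{x_n}|_{x_n = 0} = \bar u^+_{x_n}|_{x_n = 0}. \]

We can also verify that, on $\{x_n = 0\}$,

\[ \bar u^+_{x_i} = \bar u^-_{x_i}, \quad \forall i \in \{1, \dots, n-1\}. \]

Observing the expression of the partial derivatives of $\bar u$ in $B^+$ and $B^-$ we have

\[ \|\bar u\|_{1,p,B} \le C\|u\|_{1,p,B^+}, \]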

where $C$ does not depend on $u$.

Let us now consider the situation in which $\partial\Omega$ is not necessarily flat near $x_0$. Let $\Phi$ be a $C^1$ mapping which straightens out $\partial\Omega$ near $x_0$. The inverse of $\Phi$ is denoted by $\Psi$. Write $y = \Phi(x)$ so that $x = \Psi(y)$, and define $u_1$ by

\[ u_1(y) = u(\Psi(y)). \]

Therefore, there exists $r > 0$ such that for $B(x_0, r) = B_1$ we have a similar result as above, that is, the extension $\bar u_1$ is $C^1$ and

\[ \|\bar u_1\|_{1,p,B_1} \le C_1 \|u_1\|_{1,p,B_1^+}. \]

Define $W = \Psi(B_1)$; then we may write

\[ \|\bar u\|_{1,p,W} \le C_2 \|u\|_{1,p,W^+}. \]

Hence $\bar u$ is an extension of $u$ from $W^+ = \Psi(B_1^+)$ to $W = \Psi(B_1)$. Since $\partial\Omega$ is compact, there exist finitely many points $x_{0i} \in \partial\Omega$ such that the union of the corresponding $W_i$ covers $\partial\Omega$. Thus $\partial\Omega \subset \cup_{i=1}^N W_i$ for some $N \in \mathbb{N}$. Take $W_0 \subset\subset \Omega$ such that $\Omega \subset \cup_{i=0}^N W_i$, and let $\{\zeta_i\}_{i=0}^N$ be an associated partition of unity. Denoting by $\bar u_i$ the extension of $u$ from $W_i^+$ to $W_i$ and $\bar u_0 = u$, and defining

\[ \bar u = \sum_{i=0}^N \zeta_i \bar u_i, \]

we obtain $\bar u = u$ in $\Omega$ and

\[ \|\bar u\|_{1,p,\mathbb{R}^n} \le C_3 \|u\|_{1,p,\Omega}. \]

Observe that the partition of unity may be chosen such that the support of $\bar u$ is in $V \supset \Omega$. Henceforth we denote

\[ Eu = \bar u. \]

Recall that we have considered $u \in C^\infty(\overline{\Omega})$. Suppose now $u \in W^{1,p}(\Omega)$ and choose $\{u_k\} \subset C^\infty(\overline{\Omega})$ converging to $u$ in $W^{1,p}(\Omega)$. Thus, from above

\[ \|Eu_k - Eu_l\|_{1,p,\mathbb{R}^n} \le C_3 \|u_k - u_l\|_{1,p,\Omega}, \]

and thus $\{Eu_k\}$ is a Cauchy sequence converging to some $\bar u$. Defining $Eu = \bar u$, we obtain $Eu = u$, a.e. in $\Omega$ and

\[ \|\bar u\|_{1,p,\mathbb{R}^n} \le C_3 \|u\|_{1,p,\Omega}, \]

where the constant does not depend on $u$. The proof is complete.
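The coefficients $-3$ and $4$ in the higher-order reflection used in the flat case are exactly those that make both the function and its normal derivative match across $\{x_n = 0\}$. A brief check, writing $x' = (x_1, \dots, x_{n-1})$ (a shorthand introduced here) and $\bar u^-$ for the reflected branch defined on $B^-$:

```latex
% Value on the interface x_n = 0:
\[
\bar u^-(x', 0) = -3u(x', 0) + 4u(x', 0) = u(x', 0) .
\]
% Normal derivative, by the chain rule:
\[
\frac{\partial \bar u^-}{\partial x_n}(x', x_n)
  = 3\,\frac{\partial u}{\partial x_n}(x', -x_n)
    - 2\,\frac{\partial u}{\partial x_n}(x', -x_n/2),
\qquad
\frac{\partial \bar u^-}{\partial x_n}(x', 0)
  = (3 - 2)\,\frac{\partial u}{\partial x_n}(x', 0) .
\]
```

A plain even reflection $u(x', -x_n)$ would match the values but flip the sign of the normal derivative, which is why two reflected copies with weights summing to $1$ (and derivative weights $3 - 2 = 1$) are combined.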


6.4.4. The Main Results.

Definition 6.4.11. For $1 \le p < n$ we define $r = \dfrac{np}{n - p}$.

We do not prove the next theorem. For a proof see Evans, [18] page 263.

Theorem 6.4.12 (Gagliardo-Nirenberg-Sobolev inequality). Let $1 \le p < n$. Then there exists a constant $K > 0$ depending only on $p$ and $n$ such that

\[ \|u\|_{r,\mathbb{R}^n} \le K\|Du\|_{p,\mathbb{R}^n}, \quad \forall u \in C_c^1(\mathbb{R}^n). \]

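Theorem 6.4.13. Let $\Omega \subset \mathbb{R}^n$ be a bounded open set. Suppose $\partial\Omega$ is $C^1$, $1 \le p < n$ and $u \in W^{1,p}(\Omega)$. Then $u \in L^r(\Omega)$ and

\[ \|u\|_{r,\Omega} \le K\|u\|_{1,p,\Omega}, \]

where the constant depends only on $p$, $n$ and $\Omega$.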

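Proof. Since $\partial\Omega$ is $C^1$, from Theorem 6.4.10 there exists an extension $Eu = \bar u \in W^{1,p}(\mathbb{R}^n)$ such that $\bar u = u$ in $\Omega$, the support of $\bar u$ is compact and

\[ \|\bar u\|_{1,p,\mathbb{R}^n} \le C\|u\|_{1,p,\Omega}, \]

where $C$ does not depend on $u$. As $\bar u$ has compact support, from Theorem 6.4.8 there exists a sequence $\{u_k\} \subset C_c^\infty(\mathbb{R}^n)$ such that

\[ u_k \to \bar u \text{ in } W^{1,p}(\mathbb{R}^n). \]

From the last theorem,

\[ \|u_k - u_l\|_{r,\mathbb{R}^n} \le K\|Du_k - Du_l\|_{p,\mathbb{R}^n}. \]

Hence,

\[ u_k \to \bar u \text{ in } L^r(\mathbb{R}^n). \]

Also from the last theorem,

\[ \|u_k\|_{r,\mathbb{R}^n} \le K\|Du_k\|_{p,\mathbb{R}^n}, \quad \forall k \in \mathbb{N}, \]

so that

\[ \|\bar u\|_{r,\mathbb{R}^n} \le K\|D\bar u\|_{p,\mathbb{R}^n}. \]

Therefore, we may get

\[ \|u\|_{r,\Omega} \le \|\bar u\|_{r,\mathbb{R}^n} \le K\|D\bar u\|_{p,\mathbb{R}^n} \le K_1\|\bar u\|_{1,p,\mathbb{R}^n} \le K_2\|u\|_{1,p,\Omega}. \tag{6.97} \]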


Theorem 6.4.14. Let $\Omega \subset \mathbb{R}^n$ be a bounded open set such that $\partial\Omega \in C^1$. If $mp < n$, then $W^{m,p}(\Omega) \hookrightarrow L^q(\Omega)$ for $p \le q \le (np)/(n - mp)$.

Proof. Define $q_0 = np/(n - mp)$. We first prove by induction on $m$ that

\[ W^{m,p}(\Omega) \hookrightarrow L^{q_0}(\Omega). \]

The last result is exactly the case $m = 1$. Assume

\[ W^{m-1,p}(\Omega) \hookrightarrow L^{r_1}(\Omega), \tag{6.98} \]

where

\[ r_1 = np/(n - (m-1)p) = np/(n - mp + p), \]

whenever $n > (m-1)p$. If $u \in W^{m,p}(\Omega)$ where $n > mp$, then $u$ and $D_j u$ are in $W^{m-1,p}(\Omega)$, so that from (6.98) we have $u \in W^{1,r_1}(\Omega)$ and

\[ \|u\|_{1,r_1,\Omega} \le K\|u\|_{m,p,\Omega}. \tag{6.99} \]

Since $n > mp$ we have that $r_1 = np/((n - mp) + p) < n$, and since $q_0 = nr_1/(n - r_1) = np/(n - mp)$, by the last theorem we have

\[ \|u\|_{q_0,\Omega} \le K_2\|u\|_{1,r_1,\Omega}, \]

where the constant $K_2$ does not depend on $u$, and therefore from this and (6.99) we obtain

\[ \|u\|_{q_0,\Omega} \le K_2\|u\|_{1,r_1,\Omega} \le K_3\|u\|_{m,p,\Omega}. \tag{6.100} \]

The induction is complete. Now suppose $p \le q \le q_0$. Define

\[ s = (q_0 - q)p/(q_0 - p) \quad \text{and} \quad t = p/s = (q_0 - p)/(q_0 - q). \]


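Through Hölder's inequality, with $t'$ the conjugate exponent of $t$, we get

\[
\begin{aligned}
\|u\|_{q,\Omega}^q
&= \int_\Omega |u(x)|^s |u(x)|^{q-s} \, dx \\
&\le \left( \int_\Omega |u(x)|^{st} \, dx \right)^{1/t} \left( \int_\Omega |u(x)|^{(q-s)t'} \, dx \right)^{1/t'} \\
&= \|u\|_{p,\Omega}^{p/t} \|u\|_{q_0,\Omega}^{q_0/t'} \\
&\le \|u\|_{p,\Omega}^{p/t} (K_3)^{q_0/t'} \|u\|_{m,p,\Omega}^{q_0/t'} \\
&\le (K_3)^{q_0/t'} \|u\|_{m,p,\Omega}^{p/t} \|u\|_{m,p,\Omega}^{q_0/t'} \\
&= (K_3)^{q_0/t'} \|u\|_{m,p,\Omega}^{q},
\end{aligned} \tag{6.101}
\]

since

\[ p/t + q_0/t' = q. \]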
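The exponent identities behind (6.101) are not spelled out in the text. A sketch of the arithmetic, assuming $p < q < q_0$ so that $t$ and its conjugate exponent $t'$ are finite:

```latex
\[
t = \frac{q_0 - p}{q_0 - q},
\qquad
t' = \frac{t}{t - 1} = \frac{q_0 - p}{q - p},
\qquad
st = p ,
\]
\[
q - s = \frac{q(q_0 - p) - p(q_0 - q)}{q_0 - p}
      = \frac{q_0 (q - p)}{q_0 - p}
\quad\Longrightarrow\quad
(q - s)\,t' = q_0 ,
\]
\[
\frac{p}{t} + \frac{q_0}{t'}
  = \frac{p(q_0 - q) + q_0(q - p)}{q_0 - p}
  = \frac{q(q_0 - p)}{q_0 - p}
  = q .
\]
```

The endpoint cases $q = p$ and $q = q_0$ need no interpolation: they are the trivial imbedding into $L^p(\Omega)$ and the estimate (6.100), respectively.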


Corollary 6.4.15. If $mp = n$, then $W^{m,p}(\Omega) \hookrightarrow L^q(\Omega)$ for $p \le q < \infty$.

Proof. If $q \ge p' = p/(p-1)$ then $q = ns/(n - ms)$, where $s = pq/(p + q)$ is such that $1 \le s \le p$. Observe that

\[ W^{m,p}(\Omega) \hookrightarrow W^{m,s}(\Omega) \]

with the imbedding constant depending only on $|\Omega|$. Since $ms < n$, by the last theorem we obtain

\[ W^{m,p}(\Omega) \hookrightarrow W^{m,s}(\Omega) \hookrightarrow L^q(\Omega). \]

Now if $p \le q \le p'$, from above we have $W^{m,p}(\Omega) \hookrightarrow L^{p'}(\Omega)$ and the obvious imbedding $W^{m,p}(\Omega) \hookrightarrow L^p(\Omega)$. Define $s = (p' - q)p/(p' - p)$, and the result follows from a reasoning analogous to the final chain of inequalities of the last theorem, indicated in (6.101).
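The choice $s = pq/(p+q)$ in the first part of the proof can be checked directly. A short sketch, using only the hypothesis $mp = n$:

```latex
\[
n - ms = mp - \frac{mpq}{p + q} = \frac{mp^2}{p + q},
\qquad
ns = \frac{mp \cdot pq}{p + q} = \frac{mp^2 q}{p + q}
\quad\Longrightarrow\quad
\frac{ns}{n - ms} = q .
\]
% Range of s:  s \le p always holds since q/(p+q) < 1, and
\[
s \ge 1
\iff pq \ge p + q
\iff q \ge \frac{p}{p - 1} = p' .
\]
```

So the condition $q \ge p'$ is exactly what keeps $s \ge 1$, which is needed for $W^{m,s}(\Omega)$ to fall within the scope of the previous theorem.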

Concerning the next theorem, note that its hypotheses are satisfied if $\partial\Omega$ is $C^1$ (here we do not give the details).

Theorem 6.4.16. Let $\Omega \subset \mathbb{R}^n$ be an open bounded set, such that for each $x \in \Omega$ there exists a convex set $C_x \subset \Omega$ whose shape depends on $x$, but such that $|C_x| > \alpha$, for some $\alpha > 0$ that does not depend on $x$. Then if $mp > n$,

\[ W^{m,p}(\Omega) \hookrightarrow C_B^0(\Omega). \]


Proof. Suppose first $m = 1$, so that $p > n$. Fix $x \in \Omega$ and pick $y \in C_x$. For $\phi \in C^\infty(\Omega)$, from the fundamental theorem of calculus, we have

\[ \phi(y) - \phi(x) = \int_0^1 \frac{d(\phi(x + t(y - x)))}{dt} \, dt. \]

Thus,

\[ |\phi(x)| \le |\phi(y)| + \int_0^1 \left| \frac{d(\phi(x + t(y - x)))}{dt} \right| dt, \]

and hence

\[ \int_{C_x} |\phi(x)| \, dy \le \int_{C_x} |\phi(y)| \, dy + \int_{C_x} \int_0^1 \left| \frac{d(\phi(x + t(y - x)))}{dt} \right| dt \, dy, \]

so that, from Hölder's inequality and the Fubini theorem, we get

\[ |\phi(x)| \alpha \le |\phi(x)| \cdot |C_x| \le \|\phi\|_{p,\Omega} |C_x|^{1/p'} + \int_0^1 \int_{C_x} \left| \frac{d(\phi(x + t(y - x)))}{dt} \right| dy \, dt. \]

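Therefore

\[ |\phi(x)| \alpha \le \|\phi\|_{p,\Omega} |\Omega|^{1/p'} + \int_0^1 \int_V |\nabla\phi(z)| \, \delta \, t^{-n} \, dz \, dt, \]

where $V$ denotes the image of $C_x$ under the change of variables $z = x + t(y - x)$, so that $|V| = t^n |C_x|$, and $\delta$ denotes the diameter of $\Omega$. From Hölder's inequality again, we obtain

\[ |\phi(x)| \alpha \le \|\phi\|_{p,\Omega} |\Omega|^{1/p'} + \delta \int_0^1 \left( \int_V |\nabla\phi(z)|^p \, dz \right)^{1/p} t^{-n} (t^n |C_x|)^{1/p'} \, dt, \]

and thus

\[ |\phi(x)| \alpha \le \|\phi\|_{p,\Omega} |\Omega|^{1/p'} + \delta |C_x|^{1/p'} \|\nabla\phi\|_{p,\Omega} \int_0^1 t^{-n(1 - 1/p')} \, dt. \]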

Since $p > n$ we obtain

\[ \int_0^1 t^{-n(1 - 1/p')} \, dt = \int_0^1 t^{-n/p} \, dt = \frac{1}{1 - n/p}. \]

From this, the last inequality and from the fact that $|C_x| \le |\Omega|$, we have that there exists $K > 0$ such that

\[ |\phi(x)| \le K\|\phi\|_{1,p,\Omega}, \quad \forall x \in \Omega,\ \phi \in C^\infty(\Omega). \tag{6.102} \]

Here the constant $K$ depends only on $p$, $n$ and $\Omega$. Consider now $u \in W^{1,p}(\Omega)$.


Thus there exists a sequence $\{\phi_k\} \subset C^\infty(\Omega)$ such that

\[ \phi_k \to u, \text{ in } W^{1,p}(\Omega). \]

Up to a subsequence (not relabeled), we have

\[ \phi_k \to u, \text{ a.e. in } \Omega. \tag{6.103} \]

Fix $x \in \Omega$ such that the limit indicated in (6.103) holds. Suppose given $\varepsilon > 0$. Therefore, there exists $k_0 \in \mathbb{N}$ such that

|φk0(x) − u(x)| ≤ ε/2

and

‖φk0 − u‖1,p,Ω < ε/(2K).

Thus,

|u(x)| ≤ |φk0(x)| + ε/2

≤ K‖φk0‖1,p,Ω + ε/2

≤ K‖u‖1,p,Ω + ε. (6.104)

Since $\varepsilon > 0$ is arbitrary, the proof for $m = 1$ is complete, because for $\{\phi_k\} \subset C^\infty(\Omega)$ such that $\phi_k \to u$ in $W^{1,p}(\Omega)$, from (6.102) we have that $\{\phi_k\}$ is a uniformly Cauchy sequence, so that it converges to a continuous $u^*$, where $u^* = u$, a.e. in $\Omega$.

For $m > 1$ but $p > n$ we still have

\[ |u(x)| \le K\|u\|_{1,p,\Omega} \le K_1\|u\|_{m,p,\Omega}, \quad \text{a.e. in } \Omega,\ \forall u \in W^{m,p}(\Omega). \]

If $p \le n < mp$, there exists $j$ satisfying $1 \le j \le m - 1$ such that $jp \le n < (j+1)p$. If $jp < n$ set

\[ r = np/(n - jp), \]

so that by the above and the last theorem:

\[ \|u\|_\infty \le K_1\|u\|_{1,r,\Omega} \le K_1\|u\|_{m-j,r,\Omega} \le K_1\|u\|_{m,r,\Omega} \le K_2\|u\|_{m,p,\Omega}. \]

If $jp = n$, choosing $r = \max(n, p)$, also by the last theorem we obtain the same last chain of inequalities. This completes the proof.
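The integral in the case $m = 1$ of the proof above is where the hypothesis $p > n$ is actually used. A short computation, with the numerical values $n = 3$, $p = 4$ chosen here purely for illustration:

```latex
% Exponent: 1 - 1/p' = 1/p, so n(1 - 1/p') = n/p.
\[
\int_0^1 t^{-n/p} \, dt
  = \left[ \frac{t^{\,1 - n/p}}{1 - n/p} \right]_0^1
  = \frac{1}{1 - n/p}
  = \frac{p}{p - n},
\qquad \text{finite if and only if } \frac{n}{p} < 1 .
\]
% Example: n = 3, p = 4 gives
\[
\int_0^1 t^{-3/4} \, dt = \frac{4}{4 - 3} = 4 .
\]
```

For $p \le n$ the integrand $t^{-n/p}$ is not integrable at $t = 0$, so the pointwise bound (6.102) cannot be obtained this way; this matches the fact that functions in $W^{1,p}(\Omega)$ with $p \le n$, $n \ge 2$, need not be bounded.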

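Theorem 6.4.17. If $mp > n$, then $W^{m,p}(\Omega) \hookrightarrow L^q(\Omega)$ for $p \le q \le \infty$.

Proof. From the proof of the last theorem, we may obtain

\[ \|u\|_{\infty,\Omega} \le K\|u\|_{m,p,\Omega}, \quad \forall u \in W^{m,p}(\Omega). \]

If $p \le q < \infty$, we have

\[
\begin{aligned}
\|u\|_{q,\Omega}^q
&= \int_\Omega |u(x)|^p |u(x)|^{q-p} \, dx \\
&\le \int_\Omega |u(x)|^p (K\|u\|_{m,p,\Omega})^{q-p} \, dx \\
&\le K^{q-p} \|u\|_{p,\Omega}^p \|u\|_{m,p,\Omega}^{q-p} \\
&\le K^{q-p} \|u\|_{m,p,\Omega}^p \|u\|_{m,p,\Omega}^{q-p} \\
&= K^{q-p} \|u\|_{m,p,\Omega}^q.
\end{aligned} \tag{6.105}
\]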


Theorem 6.4.18. Let $S \subset \mathbb{R}^n$ be an $n$-dimensional sphere of radius bigger than 3. If $n < p$, then there exists a constant $C$, depending only on $p$ and $n$, such that

\[ \|u\|_{C^{0,\lambda}(S)} \le C\|u\|_{1,p,S}, \quad \forall u \in C^1(S), \]

where $0 < \lambda \le 1 - n/p$.

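Proof. First consider $\lambda = 1 - n/p$ and $u \in C^1(S)$. Let $x, y \in S$ be such that $|x - y| < 1$ and define $\sigma = |x - y|$. Consider a fixed cube denoted by $R_\sigma \subset S$ such that $|R_\sigma| = \sigma^n$ and $x, y \in R_\sigma$. For $z \in R_\sigma$, we may write:

\[ u(x) - u(z) = -\int_0^1 \frac{du(x + t(z - x))}{dt} \, dt, \]

that is,

\[ u(x)\sigma^n = \int_{R_\sigma} u(z) \, dz - \int_{R_\sigma} \int_0^1 \nabla u(x + t(z - x)) \cdot (z - x) \, dt \, dz. \]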


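Thus, denoting in the next lines by $V$ an appropriate set such that $|V| = t^n |R_\sigma|$, we obtain

\[
\begin{aligned}
\left| u(x) - \int_{R_\sigma} u(z) \, dz / \sigma^n \right|
&\le \sqrt{n}\, \sigma^{1-n} \int_{R_\sigma} \int_0^1 |\nabla u(x + t(z - x))| \, dt \, dz \\
&\le \sqrt{n}\, \sigma^{1-n} \int_0^1 t^{-n} \int_V |\nabla u(z)| \, dz \, dt \\
&\le \sqrt{n}\, \sigma^{1-n} \int_0^1 t^{-n} \|\nabla u\|_{p,S} |V|^{1/p'} \, dt \\
&\le \sqrt{n}\, \sigma^{1-n} \sigma^{n/p'} \|\nabla u\|_{p,S} \int_0^1 t^{-n} t^{n/p'} \, dt \\
&\le \sqrt{n}\, \sigma^{1 - n/p} \|\nabla u\|_{p,S} \int_0^1 t^{-n/p} \, dt \\
&\le \sigma^{1 - n/p} \|u\|_{1,p,S} K,
\end{aligned} \tag{6.106}
\]

where

\[ K = \sqrt{n} \int_0^1 t^{-n/p} \, dt = \sqrt{n}/(1 - n/p). \]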

A similar inequality holds with y in place of x, so that

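\[
|u(x) - u(y)| \le 2K|x-y|^{1-n/p}\|u\|_{1,p,S}, \quad \forall x, y \in R_\sigma.
\]

Now consider 0 < λ < 1 − n/p. Observe that, as |x − y|^λ ≥ |x − y|^{1−n/p} if |x − y| < 1, we have,

\[
\begin{aligned}
\sup_{x,y\in S}\left\{\frac{|u(x)-u(y)|}{|x-y|^{\lambda}} \;:\; x\neq y,\ |x-y|<1\right\}
&\le \sup_{x,y\in S}\left\{\frac{|u(x)-u(y)|}{|x-y|^{1-n/p}} \;:\; x\neq y,\ |x-y|<1\right\} \\
&\le K\|u\|_{1,p,S}.
\end{aligned} \tag{6.107}
\]

Also,

\[
\sup_{x,y\in S}\left\{\frac{|u(x)-u(y)|}{|x-y|^{\lambda}} \;:\; |x-y|\ge 1\right\} \le 2\|u\|_{\infty,S} \le 2K_1\|u\|_{1,p,S},
\]

so that

\[
\sup_{x,y\in S}\left\{\frac{|u(x)-u(y)|}{|x-y|^{\lambda}} \;:\; x\neq y\right\} \le (K+2K_1)\|u\|_{1,p,S}, \quad \forall u\in C^1(S).
\]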
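The estimate above can be illustrated numerically in the simplest case. The following is only a sketch in dimension one (an interval rather than the n-dimensional sphere of the theorem), where Hölder's inequality gives |u(x) − u(y)| ≤ |x − y|^{1−1/p} ‖u′‖_p directly. The test function sin(3x), the interval [0, 4], the exponent p = 3 and the helper names are illustrative assumptions, not part of the text.

```python
import math

def holder_ratio_1d(u, a, b, p, n_grid=200):
    """Largest |u(x)-u(y)| / |x-y|**(1-1/p) over pairs of grid points in [a, b]."""
    lam = 1.0 - 1.0 / p
    xs = [a + (b - a) * i / n_grid for i in range(n_grid + 1)]
    us = [u(x) for x in xs]
    best = 0.0
    for i in range(n_grid + 1):
        for j in range(i + 1, n_grid + 1):
            best = max(best, abs(us[i] - us[j]) / (xs[j] - xs[i]) ** lam)
    return best

def lp_norm(g, a, b, p, n_quad=20000):
    """Midpoint-rule approximation of the L^p norm of g on (a, b)."""
    h = (b - a) / n_quad
    total = sum(abs(g(a + (k + 0.5) * h)) ** p for k in range(n_quad))
    return (total * h) ** (1.0 / p)

p = 3.0
u = lambda x: math.sin(3.0 * x)          # illustrative C^1 function
du = lambda x: 3.0 * math.cos(3.0 * x)   # its derivative

seminorm = holder_ratio_1d(u, 0.0, 4.0, p)   # discrete Hoelder quotient, lambda = 1 - 1/p
grad_norm = lp_norm(du, 0.0, 4.0, p)         # approximate ||u'||_p
print(seminorm, grad_norm)
```

In this one-dimensional setting the discrete Hölder quotient should stay below the L^p norm of the derivative, in agreement with the form of the inequality proved above.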



Theorem 6.4.19. Let Ω ⊂ Rn be an open bounded set such that ∂Ω is C^1. Assume n < p ≤ ∞ and u ∈ W^{1,p}(Ω). Then

W^{1,p}(Ω) → C^{0,λ}(Ω),

for all 0 < λ ≤ 1 − n/p.

Proof. Fix 0 < λ ≤ 1 − n/p and let u ∈ W^{1,p}(Ω). Since ∂Ω is C^1, from Theorem 6.4.10, there exists an extension Eu = ū such that ū = u, a.e. in Ω, and

‖ū‖_{1,p,Rn} ≤ K‖u‖_{1,p,Ω},

where the constant K does not depend on u. From the proof of this same theorem, we may assume that spt(ū) is contained in an n-dimensional sphere S ⊃ Ω with sufficiently big radius, and such a sphere does not depend on u. Thus, in fact, we have

‖ū‖_{1,p,S} ≤ K‖u‖_{1,p,Ω}.

Since C^∞(S) is dense in W^{1,p}(S), there exists a sequence {uk} ⊂ C^∞(S) such that

uk → ū, in W^{1,p}(S). (6.108)

Up to a not relabeled subsequence, we have

uk → ū, a.e. in Ω.

From the last theorem we have

‖uk − ul‖_{C^{0,λ}(S)} ≤ C‖uk − ul‖_{1,p,S},

so that {uk} is a Cauchy sequence in C^{0,λ}(S), and thus uk → u∗ for some u∗ ∈ C^{0,λ}(S). Hence, from this and (6.108), we have

u∗ = ū, a.e. in S.

Finally, from above and the last theorem we may write:

‖u∗‖_{C^{0,λ}(Ω)} ≤ ‖u∗‖_{C^{0,λ}(S)} ≤ K1‖ū‖_{1,p,S} ≤ K2‖u‖_{1,p,Ω}.



6.5. Compact Imbeddings

Theorem 6.5.1. Let m be a non-negative integer and let 0 < ν < λ ≤ 1. Then the following imbeddings exist:

Cm+1(Ω) → Cm(Ω), (6.109)

Cm,λ(Ω) → Cm(Ω), (6.110)

Cm,λ(Ω) → Cm,ν(Ω). (6.111)

If Ω is bounded then imbeddings (6.110) and (6.111) are compact.

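Proof. The imbeddings (6.109) and (6.110) follow from the inequalities

‖φ‖_{C^m(Ω)} ≤ ‖φ‖_{C^{m+1}(Ω)},

‖φ‖_{C^m(Ω)} ≤ ‖φ‖_{C^{m,λ}(Ω)}.

To establish (6.111) note that for |α| ≤ m

\[
\sup_{x,y\in\Omega}\left\{\frac{|D^\alpha\phi(x)-D^\alpha\phi(y)|}{|x-y|^{\nu}} \;:\; x\neq y,\ |x-y|<1\right\}
\le \sup_{x,y\in\Omega}\left\{\frac{|D^\alpha\phi(x)-D^\alpha\phi(y)|}{|x-y|^{\lambda}} \;:\; x\neq y,\ |x-y|<1\right\}, \tag{6.112}
\]

and also,

\[
\sup_{x,y\in\Omega}\left\{\frac{|D^\alpha\phi(x)-D^\alpha\phi(y)|}{|x-y|^{\nu}} \;:\; |x-y|\ge 1\right\} \le 2\sup_{x\in\Omega}|D^\alpha\phi(x)|. \tag{6.113}
\]

Therefore, we may conclude that

‖φ‖_{C^{m,ν}(Ω)} ≤ 3‖φ‖_{C^{m,λ}(Ω)}, ∀φ ∈ C^{m,λ}(Ω).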

Now suppose Ω is bounded. If A is a bounded set in C^{0,λ}(Ω) then there exists M > 0 such that

‖φ‖_{C^{0,λ}(Ω)} ≤ M, ∀φ ∈ A.

But then

|φ(x) − φ(y)| ≤ M|x − y|^λ, ∀x, y ∈ Ω, φ ∈ A,

so that by the Ascoli-Arzela theorem, A is pre-compact in C(Ω). This proves the compactness of (6.110) for m = 0.


If m ≥ 1 and A is bounded in C^{m,λ}(Ω), then A is bounded in C^{0,λ}(Ω). Thus, by above there is a sequence {φk} ⊂ A and φ ∈ C^{0,λ}(Ω) such that

φk → φ in C(Ω).

However, {Diφk} is also bounded in C^{0,λ}(Ω), so that there exist a not relabeled subsequence, also denoted by {φk}, and ψi such that

Diφk → ψi, in C(Ω).

The convergence in C(Ω) being the uniform one, we have ψi = Diφ. We can proceed extracting (not relabeled) subsequences until obtaining

D^αφk → D^αφ, in C(Ω), ∀ 0 ≤ |α| ≤ m.

This completes the proof of compactness of (6.110). For (6.111), let S be a bounded set in C^{m,λ}(Ω). Observe that

\[
\begin{aligned}
\frac{|D^\alpha\phi(x)-D^\alpha\phi(y)|}{|x-y|^{\nu}}
&= \left(\frac{|D^\alpha\phi(x)-D^\alpha\phi(y)|}{|x-y|^{\lambda}}\right)^{\nu/\lambda}\cdot|D^\alpha\phi(x)-D^\alpha\phi(y)|^{1-\nu/\lambda} \\
&\le K|D^\alpha\phi(x)-D^\alpha\phi(y)|^{1-\nu/\lambda},
\end{aligned} \tag{6.114}
\]

for all φ ∈ S. From (6.110), S has a converging subsequence in C^m(Ω). From (6.114) such a subsequence is also converging in C^{m,ν}(Ω). The proof is complete.

Theorem 6.5.2 (Rellich-Kondrachov). Let Ω ⊂ Rn be an open bounded set such that ∂Ω is C^1. Let j, m be integers, j ≥ 0, m ≥ 1, and let 1 ≤ p < ∞.

(1) Part I - If mp ≤ n, then the following imbeddings are compact:

W^{j+m,p}(Ω) → W^{j,q}(Ω), if 0 < n − mp < n and 1 ≤ q < np/(n − mp), (6.115)

W^{j+m,p}(Ω) → W^{j,q}(Ω), if n = mp, 1 ≤ q < ∞. (6.116)

(2) Part II - If mp > n, then the following imbeddings are compact:

W^{j+m,p}(Ω) → C^j_B(Ω), (6.117)

W^{j+m,p}(Ω) → W^{j,q}(Ω), if 1 ≤ q ≤ ∞. (6.118)

(3) Part III - The following imbeddings are compact:

W^{j+m,p}(Ω) → C^j(Ω), if mp > n, (6.119)

W^{j+m,p}(Ω) → C^{j,λ}(Ω), if mp > n ≥ (m − 1)p and 0 < λ < m − n/p. (6.120)

(4) Part IV - All the above imbeddings are compact if we replace W^{j+m,p}(Ω) by W^{j+m,p}_0(Ω).

Remark 6.5.3. Given spaces X, Y, Z for which we have the imbeddings X → Y and Y → Z, if one of these imbeddings is compact then the composite imbedding X → Z is compact. Since the extension operator u → ũ, where ũ(x) = u(x) if x ∈ Ω and ũ(x) = 0 if x ∈ Rn − Ω, defines an imbedding W^{j+m,p}_0(Ω) → W^{j+m,p}(Rn), we have that Part IV of the above theorem follows from the application of Parts I-III to Rn (despite the fact we are assuming Ω bounded, the general results may be found in Adams [1]).

Remark 6.5.4. To prove the compactness of any of the above imbeddings it is sufficient to consider the case j = 0. Suppose, for example, that the first imbedding has been proved for j = 0. For j ≥ 1 and {ui} a bounded sequence in W^{j+m,p}(Ω) we have that {D^αui} is bounded in W^{m,p}(Ω) for each α such that |α| ≤ j. From the case j = 0 it is possible to extract a subsequence (similarly to a diagonal process) {uik} for which {D^αuik} converges in L^q(Ω) for each α such that |α| ≤ j, so that {uik} converges in W^{j,q}(Ω).

Remark 6.5.5. Since Ω is bounded, C^0_B(Ω) → L^q(Ω) for 1 ≤ q ≤ ∞. In fact

‖u‖_{0,q,Ω} ≤ ‖u‖_{C^0_B} [vol(Ω)]^{1/q}. (6.121)

Thus the compactness of (6.118) (for j = 0) follows from that of (6.117).

Proof of Parts II and III. If mp > n > (m − 1)p and 0 < λ < m − n/p, then there exists µ such that λ < µ < m − (n/p). Since Ω is bounded, the imbedding C^{0,µ}(Ω) → C^{0,λ}(Ω) is compact by Theorem 6.5.1. Since by the Sobolev Imbedding Theorem we have W^{m,p}(Ω) → C^{0,µ}(Ω), we have that imbedding (6.120) is compact.


If mp > n, let j∗ be the non-negative integer satisfying (m − j∗)p > n ≥ (m − j∗ − 1)p. Thus we have the chain of imbeddings

W^{m,p}(Ω) → W^{m−j∗,p}(Ω) → C^{0,µ}(Ω) → C(Ω), (6.122)

where 0 < µ < m − j∗ − (n/p). The last imbedding in (6.122) is compact by Theorem 6.5.1, so that (6.119) is compact for j = 0. By analogy (6.117) is compact for j = 0. Therefore from the above remarks, (6.118) is also compact. For the proof of Part I, we need the following lemma:

Lemma 6.5.6. Let Ω be a bounded domain in Rn. Let 1 ≤ q1 ≤ q0 and suppose

W^{m,p}(Ω) → L^{q0}(Ω), (6.123)

W^{m,p}(Ω) → L^{q1}(Ω). (6.124)

Suppose also that (6.124) is compact. If q1 ≤ q < q0, then the imbedding

W^{m,p}(Ω) → L^q(Ω) (6.125)

is compact.

Proof. Define λ = q1(q0 − q)/(q(q0 − q1)) and µ = q0(q − q1)/(q(q0 − q1)). We have that λ > 0 and µ ≥ 0. From Holder's inequality and (6.123) there exists K ∈ R+ such that,

\[
\|u\|_{0,q,\Omega} \le \|u\|_{0,q_1,\Omega}^{\lambda}\,\|u\|_{0,q_0,\Omega}^{\mu} \le K\|u\|_{0,q_1,\Omega}^{\lambda}\,\|u\|_{m,p,\Omega}^{\mu}, \quad \forall u\in W^{m,p}(\Omega). \tag{6.126}
\]

Thus, considering a sequence {ui} bounded in W^{m,p}(Ω), since (6.124) is compact there exists a subsequence {uik} that converges, and is therefore a Cauchy sequence, in L^{q1}(Ω). From (6.126), {uik} is also a Cauchy sequence in L^q(Ω), so that (6.125) is compact.
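The exponents in the proof above are exactly those of the usual interpolation inequality: they satisfy λ + µ = 1 and λ/q1 + µ/q0 = 1/q. The short sketch below checks this with exact rational arithmetic and tests the first inequality in (6.126) on a small discrete measure space. The sample values q1 = 1, q = 2, q0 = 6, the function name and the sample data are illustrative assumptions.

```python
from fractions import Fraction

def interpolation_exponents(q1, q, q0):
    """Exponents lambda, mu from the proof of Lemma 6.5.6, as exact fractions."""
    q1, q, q0 = Fraction(q1), Fraction(q), Fraction(q0)
    lam = q1 * (q0 - q) / (q * (q0 - q1))
    mu = q0 * (q - q1) / (q * (q0 - q1))
    return lam, mu

def weighted_norm(values, weights, q):
    """L^q norm of a step function taking values[i] on a set of measure weights[i]."""
    return sum(w * abs(v) ** q for v, w in zip(values, weights)) ** (1.0 / q)

q1, q, q0 = 1, 2, 6
lam, mu = interpolation_exponents(q1, q, q0)

# Hypothetical step function on a set of total measure one.
values = [0.5, 2.0, 1.0, 3.0]
weights = [0.25, 0.25, 0.25, 0.25]
lhs = weighted_norm(values, weights, q)
rhs = (weighted_norm(values, weights, q1) ** float(lam)
       * weighted_norm(values, weights, q0) ** float(mu))
print(lam, mu, lhs, rhs)
```

For these sample values the exponents are λ = 2/5 and µ = 3/5.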

Proof of Part I. Consider j = 0. Define q0 = np/(n − mp). To prove that the imbedding

W^{m,p}(Ω) → L^q(Ω), 1 ≤ q < q0, (6.127)

is compact, by the last lemma it suffices to do so only for q = 1. For k ∈ N, define

Ωk = {x ∈ Ω | dist(x, ∂Ω) > 2/k}. (6.128)


Suppose A is a bounded set of functions in W^{m,p}(Ω), that is, suppose there exists K1 > 0 such that

‖u‖_{W^{m,p}(Ω)} < K1, ∀u ∈ A.

Also, suppose given ε > 0, and define, for u ∈ W^{m,p}(Ω), ũ(x) = u(x) if x ∈ Ω, ũ(x) = 0 if x ∈ Rn − Ω. Fix u ∈ A. From Holder's inequality and considering that W^{m,p}(Ω) → L^{q0}(Ω), we have

\[
\int_{\Omega-\Omega_k}|u(x)|\,dx \le \left(\int_{\Omega-\Omega_k}|u(x)|^{q_0}dx\right)^{1/q_0}\left(\int_{\Omega-\Omega_k}1\,dx\right)^{1-1/q_0}
\le K_1\|u\|_{m,p,\Omega}\,[\mathrm{vol}(\Omega-\Omega_k)]^{1-1/q_0}. \tag{6.129}
\]

Thus, since A is bounded in W^{m,p}(Ω), there exists K0 ∈ N such that if k ≥ K0 then

\[
\int_{\Omega-\Omega_k}|u(x)|\,dx < \varepsilon, \quad \forall u\in A, \tag{6.130}
\]

and, now fixing a not relabeled k > K0, we get

\[
\int_{\Omega-\Omega_k}|\tilde u(x+h)-\tilde u(x)|\,dx < 2\varepsilon, \quad \forall u\in A,\ \forall h\in\mathbb{R}^n. \tag{6.131}
\]

Observe that if |h| < 1/k, then x + th ∈ Ω2k provided x ∈ Ωk and 0 ≤ t ≤ 1. If u ∈ C^∞(Ω) we have that

\[
\begin{aligned}
\int_{\Omega_k}|u(x+h)-u(x)|\,dx &\le \int_{\Omega_k}dx\int_0^1\left|\frac{d\,u(x+th)}{dt}\right|dt \\
&\le |h|\int_0^1 dt\int_{\Omega_{2k}}|\nabla u(y)|\,dy \le |h|\,\|u\|_{1,1,\Omega} \\
&\le K_2|h|\,\|u\|_{m,p,\Omega}.
\end{aligned} \tag{6.132}
\]

Since C^∞(Ω) is dense in W^{m,p}(Ω), from above for |h| sufficiently small

\[
\int_{\Omega}|\tilde u(x+h)-\tilde u(x)|\,dx < 3\varepsilon, \quad \forall u\in A. \tag{6.133}
\]

From Theorem 6.4.6, A is relatively compact in L^1(Ω) and therefore the imbedding indicated in (6.127) is compact for q = 1. This completes the proof.

Part 2

Variational Convex Analysis

CHAPTER 7

Basic Concepts on the Calculus of Variations

7.1. Introduction to the Calculus of Variations

We recall that a functional is a function whose co-domain is the real set. We denote such functionals by F : U → R, where U is a Banach space. In our work format, we consider the special cases

(1) F(u) = ∫_Ω f(x, u, ∇u) dx, where Ω ⊂ Rn is an open, bounded, connected set.

(2) F(u) = ∫_Ω f(x, u, ∇u, D²u) dx, here

\[
Du = \nabla u = \left\{\frac{\partial u_i}{\partial x_j}\right\}
\quad\text{and}\quad
D^2u = \{D^2u_i\} = \left\{\frac{\partial^2 u_i}{\partial x_k\partial x_l}\right\},
\]

for i ∈ {1, ..., N} and j, k, l ∈ {1, ..., n}.

Also, f : Ω × RN × RN×n → R is denoted by f(x, s, ξ) and we assume that

(1) ∂f(x, s, ξ)/∂s and

(2) ∂f(x, s, ξ)/∂ξ

are continuous ∀(x, s, ξ) ∈ Ω × RN × RN×n.

Remark 7.1.1. We also recall that the notation ∇u = Du may be used.

Now we define our general problem, namely problem P where

Problem P : minimize F (u) on U,


that is, to find u0 ∈ U such that

\[
F(u_0) = \min_{u\in U} F(u).
\]

At this point, we introduce some essential definitions.

Definition 7.1.2 (Space of admissible variations). Given F : U → R we define the space of admissible variations for F, denoted by V, as

V = {ϕ | u + ϕ ∈ U, ∀u ∈ U}.

For example, for F : U → R given by

\[
F(u) = \frac{1}{2}\int_\Omega \nabla u\cdot\nabla u\,dx - \langle u, f\rangle_U,
\]

where Ω ⊂ R3 and

U = {u ∈ W^{1,2}(Ω) | u = û on ∂Ω},

we have

V = W^{1,2}_0(Ω).

Observe that in this example U is a subset of a Banach space.

Definition 7.1.3 (Local minimum). Given F : U → R, wesay that u0 ∈ U is a local minimum for F , if there exists δ > 0 suchthat

F (u) ≥ F (u0), ∀u ∈ U, such that ‖u− u0‖U < δ,

or equivalently

F (u0 + ϕ) ≥ F (u0), ∀ϕ ∈ V, such that ‖ϕ‖U < δ.

Definition 7.1.4 (Gateaux variation). Given F : U → R we define the Gateaux variation of F in the direction ϕ ∈ V, denoted by δF(u, ϕ), as

\[
\delta F(u,\varphi) = \lim_{\varepsilon\to 0}\frac{F(u+\varepsilon\varphi)-F(u)}{\varepsilon},
\]

if such a limit is well defined. Furthermore, if there exists u∗ ∈ U∗ such that

δF(u, ϕ) = 〈ϕ, u∗〉U, ∀ϕ ∈ U,


we say that F is Gateaux differentiable at u ∈ U, and u∗ ∈ U∗ is said to be the Gateaux derivative of F at u. Finally we denote

\[
u^* = \delta F(u) \quad\text{or}\quad u^* = \frac{\partial F(u)}{\partial u}.
\]

7.2. Evaluating the Gateaux variations

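Consider F : U → R such that

\[
F(u) = \int_\Omega f(x,u,\nabla u)\,dx,
\]

where the hypotheses indicated in the last section are assumed. Consider u ∈ C^1(Ω; RN) and ϕ ∈ C^1_c(Ω; RN) and let us evaluate δF(u, ϕ):

From Definition 7.1.4,

\[
\delta F(u,\varphi) = \lim_{\varepsilon\to 0}\frac{F(u+\varepsilon\varphi)-F(u)}{\varepsilon}.
\]

Observe that

\[
\lim_{\varepsilon\to 0}\frac{f(x,u+\varepsilon\varphi,\nabla u+\varepsilon\nabla\varphi)-f(x,u,\nabla u)}{\varepsilon}
= \frac{\partial f(x,u,\nabla u)}{\partial s}\cdot\varphi+\frac{\partial f(x,u,\nabla u)}{\partial\xi}\cdot\nabla\varphi.
\]

Define

\[
G(x,u,\varphi,\varepsilon) = \frac{f(x,u+\varepsilon\varphi,\nabla u+\varepsilon\nabla\varphi)-f(x,u,\nabla u)}{\varepsilon},
\]

and

\[
G(x,u,\varphi) = \frac{\partial f(x,u,\nabla u)}{\partial s}\cdot\varphi+\frac{\partial f(x,u,\nabla u)}{\partial\xi}\cdot\nabla\varphi.
\]

Thus we have

\[
\lim_{\varepsilon\to 0}G(x,u,\varphi,\varepsilon) = G(x,u,\varphi).
\]

Now we will show that

\[
\lim_{\varepsilon\to 0}\int_\Omega G(x,u,\varphi,\varepsilon)\,dx = \int_\Omega G(x,u,\varphi)\,dx.
\]

It suffices to show that

\[
\lim_{n\to\infty}\int_\Omega G(x,u,\varphi,1/n)\,dx = \int_\Omega G(x,u,\varphi)\,dx.
\]

To this end, for each n ∈ N, define xn ∈ Ω such that

\[
|G(x_n,u(x_n),\varphi(x_n),1/n)-G(x_n,u(x_n),\varphi(x_n))| = c_n,
\]

where

\[
c_n = \max_{x\in\Omega}|G(x,u(x),\varphi(x),1/n)-G(x,u(x),\varphi(x))|.
\]

Since the function in question is continuous on the compact set Ω, xn is well defined. Also from the fact that Ω is compact, there exists a subsequence {xnj} and x0 ∈ Ω such that

\[
\lim_{j\to+\infty}x_{n_j} = x_0.
\]

Thus

\[
\lim_{j\to+\infty}c_{n_j} = c_0 = \lim_{j\to+\infty}|G(x_0,u(x_0),\varphi(x_0),1/n_j)-G(x_0,u(x_0),\varphi(x_0))| = 0.
\]

Therefore, given ε > 0, there exists j0 ∈ N such that if j > j0 then

\[
c_{n_j} < \varepsilon/|\Omega|.
\]

Thus, if j > j0, we have:

\[
\left|\int_\Omega G(x,u,\varphi,1/n_j)\,dx-\int_\Omega G(x,u,\varphi)\,dx\right|
\le \int_\Omega|G(x,u,\varphi,1/n_j)-G(x,u,\varphi)|\,dx \le c_{n_j}|\Omega| < \varepsilon. \tag{7.1}
\]

Hence, we may write

\[
\lim_{j\to+\infty}\int_\Omega G(x,u,\varphi,1/n_j)\,dx = \int_\Omega G(x,u,\varphi)\,dx,
\]

that is,

\[
\delta F(u,\varphi) = \int_\Omega\left(\frac{\partial f(x,u,\nabla u)}{\partial s}\cdot\varphi+\frac{\partial f(x,u,\nabla u)}{\partial\xi}\cdot\nabla\varphi\right)dx.
\]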
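The formula just obtained can be compared with a difference quotient in a simple one-dimensional case. The following is a minimal sketch, assuming Ω = (0, 1), N = n = 1, the integrand f(x, s, ξ) = ξ²/2 + s³/3, and hand-picked smooth u and ϕ; all of these choices, and the midpoint quadrature, are illustrative assumptions and not taken from the text.

```python
import math

N_CELLS = 2000
H = 1.0 / N_CELLS
MID = [(k + 0.5) * H for k in range(N_CELLS)]  # midpoints of a uniform grid on (0, 1)

# Illustrative integrand f(x, s, xi) and its partial derivatives in s and xi.
def f(x, s, xi):
    return 0.5 * xi * xi + s ** 3 / 3.0

def f_s(x, s, xi):
    return s * s

def f_xi(x, s, xi):
    return xi

u = lambda x: math.sin(math.pi * x)
du = lambda x: math.pi * math.cos(math.pi * x)
phi = lambda x: x * (1.0 - x)        # vanishes at both end points
dphi = lambda x: 1.0 - 2.0 * x

def F(eps):
    """Midpoint-rule value of F(u + eps * phi)."""
    return H * sum(f(x, u(x) + eps * phi(x), du(x) + eps * dphi(x)) for x in MID)

def gateaux_formula():
    """Midpoint-rule value of the integral of f_s * phi + f_xi * phi'."""
    return H * sum(f_s(x, u(x), du(x)) * phi(x) + f_xi(x, u(x), du(x)) * dphi(x)
                   for x in MID)

eps = 1e-6
difference_quotient = (F(eps) - F(0.0)) / eps
formula_value = gateaux_formula()
print(difference_quotient, formula_value)
```

Both numbers use the same quadrature, so their difference is governed by the size of ε rather than by the grid.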

Theorem 7.2.1 (Fundamental lemma of calculus of variations). Consider an open set Ω ⊂ Rn and u ∈ L^1_loc(Ω) such that

\[
\int_\Omega u\varphi\,dx = 0, \quad \forall\varphi\in C_c^\infty(\Omega).
\]

Then u = 0, a.e. in Ω.

Remark 7.2.2. Of course a similar result is valid for the vectorial case. A proof of such a result was given in Chapter 6.


Theorem 7.2.3 (Necessary conditions for a local minimum). Suppose u ∈ U is a local minimum for a Gateaux differentiable F : U → R. Then

δF(u, ϕ) = 0, ∀ϕ ∈ V.

Proof. Fix ϕ ∈ V. Define φ(ε) = F(u + εϕ). Since by hypothesis φ is differentiable and attains a minimum at ε = 0, from the standard necessary condition φ′(0) = 0, we obtain φ′(0) = δF(u, ϕ) = 0.

Theorem 7.2.4. Consider the hypotheses stated in section 7.1 on F : U → R. Suppose F attains a local minimum at u ∈ C^2(Ω; RN) and additionally assume that f ∈ C^2(Ω, RN, RN×n). Then the necessary conditions for a local minimum for F are given by the Euler-Lagrange equations:

\[
\frac{\partial f(x,u,\nabla u)}{\partial s}-\operatorname{div}\left(\frac{\partial f(x,u,\nabla u)}{\partial\xi}\right) = \theta, \quad\text{in }\Omega.
\]

Proof. From Theorem 7.2.3, the necessary condition stands for δF(u, ϕ) = 0, ∀ϕ ∈ V. From above this implies, after integration by parts,

\[
\int_\Omega\left(\frac{\partial f(x,u,\nabla u)}{\partial s}-\operatorname{div}\left(\frac{\partial f(x,u,\nabla u)}{\partial\xi}\right)\right)\cdot\varphi\,dx = 0, \quad \forall\varphi\in C_c^\infty(\Omega,\mathbb{R}^N).
\]

The result then follows from the fundamental lemma of calculus of variations.
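As an illustration, consider the hypothetical one-dimensional integrand f(x, s, ξ) = ξ²/2 − s on Ω = (0, 1), for which the Euler-Lagrange equation reads −1 − u″ = 0. The sketch below checks numerically that the first variation vanishes for u(x) = x(1 − x)/2, which satisfies this equation with u(0) = u(1) = 0, and does not vanish for u(x) = x(1 − x). The test direction ϕ(x) = sin(πx) and the quadrature are assumptions made only for this example.

```python
import math

N_CELLS = 2000
H = 1.0 / N_CELLS
MID = [(k + 0.5) * H for k in range(N_CELLS)]  # midpoints of a uniform grid on (0, 1)

def first_variation(du, phi, dphi):
    """Midpoint-rule value of int_0^1 (u' phi' - phi) dx, the Gateaux variation
    of F(u) = int_0^1 (u'^2 / 2 - u) dx in the direction phi."""
    return H * sum(du(x) * dphi(x) - phi(x) for x in MID)

phi = lambda x: math.sin(math.pi * x)             # vanishes at x = 0 and x = 1
dphi = lambda x: math.pi * math.cos(math.pi * x)

du_extremal = lambda x: 0.5 - x        # derivative of u(x) = x(1 - x)/2, with -u'' = 1
du_other = lambda x: 1.0 - 2.0 * x     # derivative of u(x) = x(1 - x), with -u'' = 2

residual_extremal = first_variation(du_extremal, phi, dphi)
residual_other = first_variation(du_other, phi, dphi)
print(residual_extremal, residual_other)
```

For the second function the exact value of the variation is 2/π, obtained by integrating by parts, so the check distinguishes the extremal from a nearby non-extremal function.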

7.3. The Gateaux Variation in W 1,2(Ω)

Theorem 7.3.1. Consider the functional F : U → R, where

U = {u ∈ W^{1,2}(Ω, RN) | u = u0 on ∂Ω}.

Suppose

\[
F(u) = \int_\Omega f(x,u,\nabla u)\,dx,
\]

where f : Ω × RN × RN×n → R is such that

|f(x, s1, ξ1) − f(x, s2, ξ2)| < K(|s1 − s2| + |ξ1 − ξ2|), ∀s1, s2 ∈ RN, ξ1, ξ2 ∈ RN×n.

Also assume the hypotheses of section 7.1. Under such assumptions, for each u ∈ U and ϕ ∈ V = W^{1,2}_0(Ω; RN), we have

\[
\delta F(u,\varphi) = \int_\Omega\left(\frac{\partial f(x,u,\nabla u)}{\partial s}\cdot\varphi+\frac{\partial f(x,u,\nabla u)}{\partial\xi}\cdot\nabla\varphi\right)dx.
\]
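Proof. From Definition 7.1.4,

\[
\delta F(u,\varphi) = \lim_{\varepsilon\to 0}\frac{F(u+\varepsilon\varphi)-F(u)}{\varepsilon}.
\]

Observe that

\[
\lim_{\varepsilon\to 0}\frac{f(x,u+\varepsilon\varphi,\nabla u+\varepsilon\nabla\varphi)-f(x,u,\nabla u)}{\varepsilon}
= \frac{\partial f(x,u,\nabla u)}{\partial s}\cdot\varphi+\frac{\partial f(x,u,\nabla u)}{\partial\xi}\cdot\nabla\varphi, \quad\text{a.e. in }\Omega.
\]

Define

\[
G(x,u,\varphi,\varepsilon) = \frac{f(x,u+\varepsilon\varphi,\nabla u+\varepsilon\nabla\varphi)-f(x,u,\nabla u)}{\varepsilon},
\]

and

\[
G(x,u,\varphi) = \frac{\partial f(x,u,\nabla u)}{\partial s}\cdot\varphi+\frac{\partial f(x,u,\nabla u)}{\partial\xi}\cdot\nabla\varphi.
\]

Thus we have

\[
\lim_{\varepsilon\to 0}G(x,u,\varphi,\varepsilon) = G(x,u,\varphi), \quad\text{a.e. in }\Omega.
\]

Now we will show that

\[
\lim_{\varepsilon\to 0}\int_\Omega G(x,u,\varphi,\varepsilon)\,dx = \int_\Omega G(x,u,\varphi)\,dx.
\]

It suffices to show that

\[
\lim_{n\to\infty}\int_\Omega G(x,u,\varphi,1/n)\,dx = \int_\Omega G(x,u,\varphi)\,dx.
\]

Observe that, from the Lipschitz hypothesis on f, |G(x, u, ϕ, 1/n)| ≤ K(|ϕ| + |∇ϕ|) a.e. in Ω, so that

\[
\int_\Omega|G(x,u,\varphi,1/n)|\,dx \le \int_\Omega K(|\varphi|+|\nabla\varphi|)\,dx
\le K\left(\|\varphi\|_{L^2(\Omega;\mathbb{R}^N)}+\|\nabla\varphi\|_{L^2(\Omega;\mathbb{R}^{N\times n})}\right)|\Omega|^{1/2}.
\]

By the Lebesgue dominated convergence theorem, we obtain

\[
\lim_{n\to+\infty}\int_\Omega G(x,u,\varphi,1/n)\,dx = \int_\Omega G(x,u,\varphi)\,dx,
\]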


that is,

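\[
\delta F(u,\varphi) = \int_\Omega\left(\frac{\partial f(x,u,\nabla u)}{\partial s}\cdot\varphi+\frac{\partial f(x,u,\nabla u)}{\partial\xi}\cdot\nabla\varphi\right)dx.
\]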

7.4. Elementary Convexity

Definition 7.4.1. A function f : Rn → R is said to be convex if

f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y), ∀x, y ∈ Rn, λ ∈ [0, 1].

Proposition 7.4.2. If f : Rn → R is convex and differentiable, then

f(y) − f(x) ≥ 〈f′(x), y − x〉Rn, ∀x, y ∈ Rn.

Proof. Pick x, y ∈ Rn. By hypothesis

f((1 − λ)x + λy) ≤ (1 − λ)f(x) + λf(y), ∀λ ∈ [0, 1].

Thus

\[
\frac{f(x+\lambda(y-x))-f(x)}{\lambda} \le f(y)-f(x), \quad \forall\lambda\in(0,1].
\]

Letting λ→ 0+ we obtain

f(y) − f(x) ≥ 〈f ′(x), y − x〉Rn .

Since x, y ∈ Rn are arbitrary, the proof is complete.
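The inequality of Proposition 7.4.2 is easy to test on random points. The sketch below does so for the convex function f(x) = x1⁴ + x2⁴ + x3⁴ on R3; the choice of function, the sampling box and the helper names are illustrative assumptions.

```python
import random

def f(x):
    """Illustrative convex, differentiable function on R^3."""
    return sum(t ** 4 for t in x)

def grad_f(x):
    """Gradient of f."""
    return [4.0 * t ** 3 for t in x]

def gradient_gap(x, y):
    """f(y) - f(x) - <f'(x), y - x>, which Proposition 7.4.2 says is non-negative."""
    g = grad_f(x)
    return f(y) - f(x) - sum(gi * (yi - xi) for gi, xi, yi in zip(g, x, y))

random.seed(0)
gaps = []
for _ in range(1000):
    x = [random.uniform(-3.0, 3.0) for _ in range(3)]
    y = [random.uniform(-3.0, 3.0) for _ in range(3)]
    gaps.append(gradient_gap(x, y))

min_gap = min(gaps)
print(min_gap)
```

Up to rounding, the smallest gap found should be non-negative, since each coordinate contributes (y − x)²(y² + 2xy + 3x²) ≥ 0.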

Proposition 7.4.3. Let f : Rn → R be a differentiable function. If

f(y) − f(x) ≥ 〈f ′(x), y − x〉Rn , ∀x, y ∈ Rn,

then f is convex.

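Proof. Define f∗(x∗) by

\[
f^*(x^*) = \sup_{x\in\mathbb{R}^n}\{\langle x,x^*\rangle_{\mathbb{R}^n}-f(x)\}.
\]

Such a function f∗ is called the Fenchel conjugate of f. Observe that, by hypothesis,

\[
f^*(f'(x)) = \sup_{y\in\mathbb{R}^n}\{\langle y,f'(x)\rangle_{\mathbb{R}^n}-f(y)\} = \langle x,f'(x)\rangle_{\mathbb{R}^n}-f(x). \tag{7.2}
\]

On the other hand

f∗(x∗) ≥ 〈x, x∗〉Rn − f(x), ∀x, x∗ ∈ Rn,

that is

f(x) ≥ 〈x, x∗〉Rn − f∗(x∗), ∀x, x∗ ∈ Rn.

Observe that, from (7.2)

f(x) = 〈x, f′(x)〉Rn − f∗(f′(x))

and thus

\[
f(x) = \sup_{x^*\in\mathbb{R}^n}\{\langle x,x^*\rangle_{\mathbb{R}^n}-f^*(x^*)\}, \quad \forall x\in\mathbb{R}^n.
\]

Pick x, y ∈ Rn and λ ∈ [0, 1]. Thus, we may write

\[
\begin{aligned}
f(\lambda x+(1-\lambda)y) &= \sup_{x^*\in\mathbb{R}^n}\{\langle\lambda x+(1-\lambda)y,x^*\rangle_{\mathbb{R}^n}-f^*(x^*)\} \\
&= \sup_{x^*\in\mathbb{R}^n}\{\lambda\langle x,x^*\rangle_{\mathbb{R}^n}+(1-\lambda)\langle y,x^*\rangle_{\mathbb{R}^n}-\lambda f^*(x^*)-(1-\lambda)f^*(x^*)\} \\
&\le \lambda\sup_{x^*\in\mathbb{R}^n}\{\langle x,x^*\rangle_{\mathbb{R}^n}-f^*(x^*)\}
+(1-\lambda)\sup_{x^*\in\mathbb{R}^n}\{\langle y,x^*\rangle_{\mathbb{R}^n}-f^*(x^*)\} \\
&= \lambda f(x)+(1-\lambda)f(y).
\end{aligned} \tag{7.3}
\]

Since x, y ∈ Rn and λ ∈ [0, 1] are arbitrary, we have that f is convex.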

Since x, y ∈ Rn and λ ∈ [0, 1] are arbitrary, we have that f isconvex.

Corollary 7.4.4. Let f : Rn → R be twice differentiable and let the matrix

\[
\left\{\frac{\partial^2 f(x)}{\partial x_i\partial x_j}\right\}
\]

be positive definite, for all x ∈ Rn. Then f is convex.

Proof. Pick x, y ∈ Rn. Using Taylor's expansion we obtain

\[
f(y) = f(x)+\langle f'(x),y-x\rangle_{\mathbb{R}^n}+\frac{1}{2}\sum_{i=1}^n\sum_{j=1}^n\frac{\partial^2 f(\bar x)}{\partial x_i\partial x_j}(y_i-x_i)(y_j-x_j),
\]

for x̄ = λx + (1 − λ)y (for some λ ∈ [0, 1]). From the hypothesis we obtain

f(y) − f(x) − 〈f′(x), y − x〉Rn ≥ 0.

Since x, y ∈ Rn are arbitrary, the proof is complete.

Similarly we may obtain the following result.


Corollary 7.4.5. Let U be a Banach space. Consider F : U → R Gateaux differentiable. Then F is convex if and only if

F(v) − F(u) ≥ 〈F′(u), v − u〉U, ∀u, v ∈ U.

Corollary 7.4.6. Let U be a Banach space. Suppose F : U → R is twice Gateaux differentiable and that

δ²F(u, ϕ) ≥ 0, ∀u ∈ U, ϕ ∈ V.

Then, F is convex.

Proof. Pick u, v ∈ U. Define φ(ε) = F(u + ε(v − u)). By hypothesis, φ is twice differentiable, so that

φ(1) = φ(0) + φ′(0) + φ′′(ε̃)/2,

where 0 ≤ ε̃ ≤ 1. Thus

F(v) = F(u) + δF(u, v − u) + δ²F(u + ε̃(v − u), v − u)/2.

Therefore, by hypothesis

F(v) ≥ F(u) + δF(u, v − u).

Since F is Gateaux differentiable, we obtain

F(v) ≥ F(u) + 〈F′(u), v − u〉U.

Being u, v ∈ U arbitrary, the proof is complete.

Corollary 7.4.7. Let U be a Banach space. Let F : U → R be a convex Gateaux differentiable functional. If F′(u) = θ, then

F (v) ≥ F (u), ∀v ∈ U,

that is, u ∈ U is a global minimizer for F .

Proof. Just observe that

F (v) ≥ F (u) + 〈F ′(u), v − u〉U , ∀u, v ∈ U.

Therefore, from F ′(u) = θ, we obtain

F (v) ≥ F (u), ∀v ∈ U.


7.5. The Legendre-Hadamard Condition

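Theorem 7.5.1. If u ∈ C^1(Ω; RN) is such that

δ²F(u, ϕ) ≥ 0, ∀ϕ ∈ C^∞_c(Ω, RN),

then

\[
f_{\xi^i_\alpha\xi^k_\beta}(x,u(x),\nabla u(x))\,\rho^i\rho^k\eta_\alpha\eta_\beta \ge 0, \quad \forall x\in\Omega,\ \rho\in\mathbb{R}^N,\ \eta\in\mathbb{R}^n.
\]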

Such a condition is known as the Legendre-Hadamard condition.

Proof. Suppose

δ2F (u, ϕ) ≥ 0, ∀ϕ ∈ C∞c (Ω; RN).

We denote δ²F(u, ϕ) by

\[
\delta^2F(u,\varphi) = \int_\Omega a(x)D\varphi(x)\cdot D\varphi(x)\,dx+\int_\Omega b(x)\varphi(x)\cdot D\varphi(x)\,dx+\int_\Omega c(x)\varphi(x)\cdot\varphi(x)\,dx, \tag{7.4}
\]

where

a(x) = fξξ(x, u(x), Du(x)),

b(x) = fsξ(x, u(x), Du(x)),

and

c(x) = fss(x, u(x), Du(x)).

Now consider v ∈ C^∞_c(B1(0), RN). Thus, given x0 ∈ Ω, for λ sufficiently small we have that ϕ(x) = λv((x − x0)/λ) is an admissible direction. Now we introduce the new coordinates y = (y1, ..., yn) by setting y = λ^{−1}(x − x0) and multiply (7.4) by λ^{−n} to obtain

\[
\int_{B_1(0)}\left\{a(x_0+\lambda y)Dv(y)\cdot Dv(y)+2\lambda b(x_0+\lambda y)v(y)\cdot Dv(y)+\lambda^2c(x_0+\lambda y)v(y)\cdot v(y)\right\}dy \ge 0,
\]

where a = {a^{αβ}_{ij}}, b = {b^β_{jk}} and c = {c_{jk}}. Since a, b and c are continuous, we have

a(x0 + λy)Dv(y) · Dv(y) → a(x0)Dv(y) · Dv(y),

λb(x0 + λy)v(y) · Dv(y) → 0,

and

λ²c(x0 + λy)v(y) · v(y) → 0,


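uniformly on Ω as λ → 0. Thus this limit gives us

\[
\int_{B_1(0)}f^{\alpha\beta}_{jk}\,D_\alpha v^j D_\beta v^k\,dy \ge 0, \quad \forall v\in C_c^\infty(B_1(0);\mathbb{R}^N), \tag{7.5}
\]

where

\[
f^{\alpha\beta}_{jk} = a^{\alpha\beta}_{jk}(x_0) = f_{\xi^j_\alpha\xi^k_\beta}(x_0,u(x_0),\nabla u(x_0)).
\]

Now define v = (v^1, ..., v^N) where

v^j = ρ^j cos((η · y)t)ζ(y),

ρ = (ρ^1, ..., ρ^N) ∈ RN,

η = (η1, ..., ηn) ∈ Rn

and ζ ∈ C^∞_c(B1(0)). From (7.5) we obtain

\[
0 \le f^{\alpha\beta}_{jk}\rho^j\rho^k\int_{B_1(0)}\big(-\eta_\alpha t\sin((\eta\cdot y)t)\zeta+\cos((\eta\cdot y)t)D_\alpha\zeta\big)\big(-\eta_\beta t\sin((\eta\cdot y)t)\zeta+\cos((\eta\cdot y)t)D_\beta\zeta\big)\,dy. \tag{7.6}
\]

By analogy, for

v^j = ρ^j sin((η · y)t)ζ(y)

we obtain

\[
0 \le f^{\alpha\beta}_{jk}\rho^j\rho^k\int_{B_1(0)}\big(\eta_\alpha t\cos((\eta\cdot y)t)\zeta+\sin((\eta\cdot y)t)D_\alpha\zeta\big)\big(\eta_\beta t\cos((\eta\cdot y)t)\zeta+\sin((\eta\cdot y)t)D_\beta\zeta\big)\,dy. \tag{7.7}
\]

Summing up these last two inequalities, dividing the result by t² and letting t → +∞ we obtain

\[
0 \le f^{\alpha\beta}_{jk}\rho^j\rho^k\eta_\alpha\eta_\beta\int_{B_1(0)}\zeta^2\,dy,
\]

for all ζ ∈ C^∞_c(B1(0)), which implies

\[
0 \le f^{\alpha\beta}_{jk}\rho^j\rho^k\eta_\alpha\eta_\beta.
\]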



7.6. The Weierstrass Necessary Condition

In this section we state, without proof, the result concerning the Weierstrass condition for the case n, N ≥ 1.

Theorem 7.6.1. Suppose u ∈ C^1(Ω; RN) has the strong minimum property. Then for each x ∈ Ω and for each rank-one matrix π^i_α = ρ^iηα the Weierstrass condition is satisfied, that is,

EF(x, u(x), Du(x) + π) ≥ 0,

where

\[
E_F(x,u(x),Du(x)+\pi) = f(x,u(x),Du(x)+\rho\otimes\eta)-f(x,u(x),Du(x))-\rho^i\eta_\alpha f_{\xi^i_\alpha}(x,u(x),Du(x)).
\]

The function EF is known as the Weierstrass excess function.

7.6.1. The Weierstrass Condition for n = 1. Here we present the Weierstrass condition for the special case N ≥ 1 and n = 1. We start with a definition.

Definition 7.6.2. We say that u ∈ C([a, b]; RN) if u : [a, b] →RN is continuous in [a, b], and Du is continuous except on a finiteset of points in [a, b].

Theorem 7.6.3 (Weierstrass). Let Ω = (a, b) and f : Ω × RN × RN → R be such that fs(x, s, ξ) and fξ(x, s, ξ) are continuous on Ω × RN × RN.

Define F : U → R by

\[
F(u) = \int_a^b f(x,u(x),u'(x))\,dx,
\]

where

U = {u ∈ C^1([a, b]; RN) | u(a) = α, u(b) = β}.

Suppose u ∈ U minimizes F locally on U, that is, suppose that there exists ε0 > 0 such that

F(u) ≤ F(v), ∀v ∈ U, such that ‖u − v‖∞ < ε0.

Under such hypotheses, we have

E(x, u(x), u′(x+), w) ≥ 0, ∀x ∈ [a, b], w ∈ RN,


and

E(x, u(x), u′(x−), w) ≥ 0, ∀x ∈ [a, b], w ∈ RN ,

where

\[
u'(x+) = \lim_{h\to 0^+}u'(x+h), \qquad u'(x-) = \lim_{h\to 0^-}u'(x+h),
\]

and,

E(x, s, ξ, w) = f(x, s, w) − f(x, s, ξ) − fξ(x, s, ξ)(w − ξ).

Remark 7.6.4. The function E is known as the Weierstrass Excess Function.

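Proof. Fix x0 ∈ (a, b) and w ∈ RN. Choose 0 < ε < 1 and h > 0 such that u + v ∈ U and

‖v‖∞ < ε0,

where v(x) is given by

\[
v(x) = \begin{cases}
(x-x_0)w, & \text{if } 0\le x-x_0\le\varepsilon h,\\
\tilde\varepsilon(h-x+x_0)w, & \text{if } \varepsilon h\le x-x_0\le h,\\
0, & \text{otherwise},
\end{cases}
\]

where

\[
\tilde\varepsilon = \frac{\varepsilon}{1-\varepsilon}.
\]

From

F(u + v) − F(u) ≥ 0

we obtain

\[
\int_{x_0}^{x_0+h}f(x,u(x)+v(x),u'(x)+v'(x))\,dx-\int_{x_0}^{x_0+h}f(x,u(x),u'(x))\,dx \ge 0. \tag{7.8}
\]

Define

\[
\bar x = \frac{x-x_0}{h},
\]

so that

\[
d\bar x = \frac{dx}{h}.
\]

From (7.8) we obtain

\[
h\int_0^1 f(x_0+\bar xh,\,u(x_0+\bar xh)+v(x_0+\bar xh),\,u'(x_0+\bar xh)+v'(x_0+\bar xh))\,d\bar x
-h\int_0^1 f(x_0+\bar xh,\,u(x_0+\bar xh),\,u'(x_0+\bar xh))\,d\bar x \ge 0, \tag{7.9}
\]

where the derivatives are taken with respect to x. Therefore

\[
\begin{aligned}
&\int_0^\varepsilon f(x_0+\bar xh,\,u(x_0+\bar xh)+v(x_0+\bar xh),\,u'(x_0+\bar xh)+w)\,d\bar x
-\int_0^\varepsilon f(x_0+\bar xh,\,u(x_0+\bar xh),\,u'(x_0+\bar xh))\,d\bar x \\
&\quad+\int_\varepsilon^1 f(x_0+\bar xh,\,u(x_0+\bar xh)+v(x_0+\bar xh),\,u'(x_0+\bar xh)-\tilde\varepsilon w)\,d\bar x
-\int_\varepsilon^1 f(x_0+\bar xh,\,u(x_0+\bar xh),\,u'(x_0+\bar xh))\,d\bar x \ge 0.
\end{aligned} \tag{7.10}
\]

Letting h → 0 we obtain

\[
\varepsilon\big(f(x_0,u(x_0),u'(x_0+)+w)-f(x_0,u(x_0),u'(x_0+))\big)
+(1-\varepsilon)\big(f(x_0,u(x_0),u'(x_0+)-\tilde\varepsilon w)-f(x_0,u(x_0),u'(x_0+))\big) \ge 0.
\]

Hence, by the mean value theorem we get

\[
\varepsilon\big(f(x_0,u(x_0),u'(x_0+)+w)-f(x_0,u(x_0),u'(x_0+))\big)
-(1-\varepsilon)\tilde\varepsilon\,f_\xi(x_0,u(x_0),u'(x_0+)+\rho(\tilde\varepsilon)w)\cdot w \ge 0. \tag{7.11}
\]

Dividing by ε (observe that (1 − ε)ε̃ = ε) and letting ε → 0, so that ε̃ → 0 and ρ(ε̃) → 0, we finally obtain

\[
f(x_0,u(x_0),u'(x_0+)+w)-f(x_0,u(x_0),u'(x_0+))-f_\xi(x_0,u(x_0),u'(x_0+))\cdot w \ge 0.
\]

Similarly we may get

\[
f(x_0,u(x_0),u'(x_0-)+w)-f(x_0,u(x_0),u'(x_0-))-f_\xi(x_0,u(x_0),u'(x_0-))\cdot w \ge 0.
\]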

Since x0 ∈ [a, b] and w ∈ RN are arbitrary, the proof is complete.


7.7. The du Bois-Reymond Lemma

We now present a simpler version of the fundamental lemma of calculus of variations. The result is specific for n = 1 and is known as the du Bois-Reymond Lemma.

Lemma 7.7.1 (du Bois-Reymond). If u ∈ C([a, b]) and

\[
\int_a^b u\varphi'\,dx = 0, \quad \forall\varphi\in\mathcal{V},
\]

where

V = {ϕ ∈ C^1[a, b] | ϕ(a) = ϕ(b) = 0},

then there exists c ∈ R such that

u(x) = c, ∀x ∈ [a, b].

Proof. Define

\[
c = \frac{1}{b-a}\int_a^b u(t)\,dt,
\]

and

\[
\varphi(x) = \int_a^x(u(t)-c)\,dt.
\]

Thus we have ϕ(a) = 0, and

\[
\varphi(b) = \int_a^b u(t)\,dt-c(b-a) = 0.
\]

Moreover ϕ ∈ C^1([a, b]), so that ϕ ∈ V. Therefore

\[
\begin{aligned}
0 \le \int_a^b(u(x)-c)^2\,dx &= \int_a^b(u(x)-c)\varphi'(x)\,dx \\
&= \int_a^b u(x)\varphi'(x)\,dx-c[\varphi(x)]_a^b = 0.
\end{aligned} \tag{7.12}
\]

Thus

\[
\int_a^b(u(x)-c)^2\,dx = 0,
\]

and being u(x) − c continuous, we finally obtain

u(x) − c = 0, ∀x ∈ [a, b].


Proposition 7.7.2. If u, v ∈ C([a, b]) and

\[
\int_a^b(u(x)\varphi(x)+v(x)\varphi'(x))\,dx = 0, \quad \forall\varphi\in\mathcal{V},
\]

where

V = {ϕ ∈ C^1[a, b] | ϕ(a) = ϕ(b) = 0},

then

v ∈ C^1([a, b])

and

v′(x) = u(x), ∀x ∈ [a, b].

Proof. Define

\[
u_1(x) = \int_a^x u(t)\,dt, \quad \forall x\in[a,b].
\]

Thus u1 ∈ C^1([a, b]) and

u1′(x) = u(x), ∀x ∈ [a, b].

Hence, for ϕ ∈ V, we have

\[
\begin{aligned}
0 &= \int_a^b(u(x)\varphi(x)+v(x)\varphi'(x))\,dx \\
&= \int_a^b(-u_1(x)\varphi'(x)+v(x)\varphi'(x))\,dx+[u_1(x)\varphi(x)]_a^b \\
&= \int_a^b(v(x)-u_1(x))\varphi'(x)\,dx.
\end{aligned} \tag{7.13}
\]

That is,

\[
\int_a^b(v(x)-u_1(x))\varphi'(x)\,dx = 0, \quad \forall\varphi\in\mathcal{V}.
\]

By the du Bois-Reymond lemma, there exists c ∈ R such that

v(x) − u1(x) = c, ∀x ∈ [a, b].

Hence

v = u1 + c ∈ C^1([a, b]),


so that

v′(x) = u′1(x) = u(x), ∀x ∈ [a, b].


7.8. The Weierstrass-Erdmann Conditions

We start with a definition.

Definition 7.8.1. Define I = [a, b]. A function u ∈ C([a, b]; RN) is said to be a weak Lipschitz extremal of

\[
F(u) = \int_a^b f(x,u(x),u'(x))\,dx,
\]

if

\[
\int_a^b\big(f_s(x,u(x),u'(x))\cdot\varphi(x)+f_\xi(x,u(x),u'(x))\cdot\varphi'(x)\big)\,dx = 0, \quad \forall\varphi\in C_c^\infty([a,b];\mathbb{R}^N).
\]

Proposition 7.8.2. For any Lipschitz extremal of

\[
F(u) = \int_a^b f(x,u(x),u'(x))\,dx
\]

there exists a constant c ∈ RN such that

\[
f_\xi(x,u(x),u'(x)) = c+\int_a^x f_s(t,u(t),u'(t))\,dt, \quad \forall x\in[a,b]. \tag{7.14}
\]

Proof. Fix ϕ ∈ C^∞_c([a, b]; RN). Integration by parts of the extremal condition

δF(u, ϕ) = 0,

implies that

\[
\int_a^b f_\xi(x,u(x),u'(x))\cdot\varphi'(x)\,dx-\int_a^b\left(\int_a^x f_s(t,u(t),u'(t))\,dt\right)\cdot\varphi'(x)\,dx = 0.
\]

Since ϕ is arbitrary, from the du Bois-Reymond lemma, there exists c ∈ RN such that

\[
f_\xi(x,u(x),u'(x))-\int_a^x f_s(t,u(t),u'(t))\,dt = c, \quad \forall x\in[a,b].
\]



Theorem 7.8.3 (Weierstrass-Erdmann Corner Conditions). Let I = [a, b]. Suppose u ∈ C^1([a, b]; RN) is such that

F(u) ≤ F(v), ∀v ∈ Cr,

for some r > 0, where

Cr = {v ∈ C^1([a, b]; RN) | v(a) = u(a), v(b) = u(b), and ‖u − v‖∞ < r}.

Let x0 ∈ (a, b) be a corner point of u. Denoting u0 = u(x0), ξ0^+ = u′(x0 + 0) and ξ0^− = u′(x0 − 0), the following relations are valid:

(1) fξ(x0, u0, ξ0^−) = fξ(x0, u0, ξ0^+),

(2) f(x0, u0, ξ0^−) − ξ0^− fξ(x0, u0, ξ0^−) = f(x0, u0, ξ0^+) − ξ0^+ fξ(x0, u0, ξ0^+).

Remark 7.8.4. The conditions above are known as the Weierstrass-Erdmann corner conditions.

Proof. Condition (1) is just a consequence of equation (7.14). For (2), define

τε(x) = x + ελ(x),

where λ ∈ C^∞_c(I). Observe that τε(a) = a and τε(b) = b, ∀ε > 0. Also τ0(x) = x. Choose ε0 > 0 sufficiently small such that for each ε satisfying |ε| < ε0, we have τε′(x) > 0 and

uε(x) = (u ∘ τε^{−1})(x) ∈ Cr.

Define

φ(ε) = F(uε).

Thus φ has a local minimum at 0, so that φ′(0) = 0, that is

\[
\left.\frac{dF(u_\varepsilon)}{d\varepsilon}\right|_{\varepsilon=0} = 0.
\]

Observe that

\[
\frac{du_\varepsilon}{dx} = u'(\tau_\varepsilon^{-1}(x))\,\frac{d\tau_\varepsilon^{-1}(x)}{dx},
\]


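and

\[
\frac{d\tau_\varepsilon^{-1}(x)}{dx} = \frac{1}{1+\varepsilon\lambda'(\tau_\varepsilon^{-1}(x))}.
\]

Thus,

\[
F(u_\varepsilon) = \int_a^b f\left(x,\,u(\tau_\varepsilon^{-1}(x)),\,u'(\tau_\varepsilon^{-1}(x))\,\frac{1}{1+\varepsilon\lambda'(\tau_\varepsilon^{-1}(x))}\right)dx.
\]

Defining

x̄ = τε^{−1}(x),

we obtain

\[
d\bar x = \frac{1}{1+\varepsilon\lambda'(\bar x)}\,dx,
\]

that is

dx = (1 + ελ′(x̄)) dx̄.

Dropping the bar for the new variable, we may write

\[
F(u_\varepsilon) = \int_a^b f\left(x+\varepsilon\lambda(x),\,u(x),\,\frac{u'(x)}{1+\varepsilon\lambda'(x)}\right)(1+\varepsilon\lambda'(x))\,dx.
\]

From

\[
\left.\frac{dF(u_\varepsilon)}{d\varepsilon}\right|_{\varepsilon=0} = 0,
\]

we obtain

\[
\int_a^b\Big(\lambda(x)f_x(x,u(x),u'(x))+\lambda'(x)\big(f(x,u(x),u'(x))-u'(x)f_\xi(x,u(x),u'(x))\big)\Big)\,dx = 0. \tag{7.15}
\]

Since λ is arbitrary, from Proposition 7.7.2, we obtain

\[
f(x,u(x),u'(x))-u'(x)f_\xi(x,u(x),u'(x))-\int_a^x f_x(t,u(t),u'(t))\,dt = c_1
\]

for some c1 ∈ R. Being ∫_a^x fx(t, u(t), u′(t)) dt + c1 a continuous function (in fact absolutely continuous), the proof is complete.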
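To see what the two corner conditions allow in a concrete case, consider the hypothetical scalar integrand f(ξ) = (ξ² − 1)², which depends only on the slope. The sketch below tests conditions (1) and (2) of Theorem 7.8.3 for a few pairs of one-sided slopes; the integrand, the tolerance and the function names are assumptions made only for this example.

```python
def f(xi):
    """Illustrative double-well integrand depending only on the slope xi."""
    return (xi * xi - 1.0) ** 2

def f_xi(xi):
    """Derivative of f with respect to xi."""
    return 4.0 * xi * (xi * xi - 1.0)

def corner_conditions_hold(xi_minus, xi_plus, tol=1e-12):
    """Check conditions (1) and (2) for the one-sided slopes at a corner."""
    cond1 = abs(f_xi(xi_minus) - f_xi(xi_plus)) <= tol
    h_minus = f(xi_minus) - xi_minus * f_xi(xi_minus)
    h_plus = f(xi_plus) - xi_plus * f_xi(xi_plus)
    cond2 = abs(h_minus - h_plus) <= tol
    return cond1 and cond2

print(corner_conditions_hold(-1.0, 1.0))   # slopes at the two wells
print(corner_conditions_hold(-1.0, 0.5))   # f_xi differs across the corner
print(corner_conditions_hold(0.0, 1.0))    # f_xi agrees, f - xi * f_xi differs
```

For this integrand a corner joining the slopes −1 and +1 passes both conditions, while the other two pairs each fail one of them.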

7.9. Natural Boundary Conditions

Consider the functional F : U → R, where

\[
F(u) = \int_\Omega f(x,u(x),\nabla u(x))\,dx,
\]

f(x, s, ξ) ∈ C^1(Ω, RN, RN×n),

and Ω ⊂ Rn is an open bounded connected set.


Proposition 7.9.1. Assume

U = {u ∈ W^{1,2}(Ω; RN) | u = u0 on Γ0},

where Γ0 ⊂ ∂Ω is closed and ∂Ω = Γ = Γ0 ∪ Γ1, Γ1 being open in Γ and Γ0 ∩ Γ1 = ∅. Thus if ∂Ω ∈ C^1, f ∈ C^2(Ω, RN, RN×n) and u ∈ C^2(Ω; RN), and also

δF(u, ϕ) = 0, ∀ϕ ∈ C^1(Ω; RN), such that ϕ = 0 on Γ0,

then u is an extremal of F which satisfies the following natural boundary conditions,

\[
n_\alpha f_{\xi^i_\alpha}(x,u(x),\nabla u(x)) = 0, \quad\text{a.e. on }\Gamma_1,\ \forall i\in\{1,\dots,N\}.
\]

Proof. Observe that δF(u, ϕ) = 0, ∀ϕ ∈ C^∞_c(Ω; RN), thus u is an extremal of F and through integration by parts and the fundamental lemma of calculus of variations, we obtain

Lf(u) = 0, in Ω,

where

Lf(u) = fs(x, u(x), ∇u(x)) − div(fξ(x, u(x), ∇u(x))).

Defining

V = {ϕ ∈ C^1(Ω; RN) | ϕ = 0 on Γ0},

for an arbitrary ϕ ∈ V, we obtain

\[
\begin{aligned}
\delta F(u,\varphi) &= \int_\Omega L_f(u)\cdot\varphi\,dx+\int_{\Gamma_1}n_\alpha f_{\xi^i_\alpha}(x,u(x),\nabla u(x))\varphi^i(x)\,d\Gamma \\
&= \int_{\Gamma_1}n_\alpha f_{\xi^i_\alpha}(x,u(x),\nabla u(x))\varphi^i(x)\,d\Gamma \\
&= 0, \quad \forall\varphi\in\mathcal{V}.
\end{aligned} \tag{7.16}
\]

Suppose, to obtain contradiction, that

\[
n_\alpha f_{\xi^i_\alpha}(x_0,u(x_0),\nabla u(x_0)) = \beta > 0,
\]

for some x0 ∈ Γ1 and some i ∈ {1, ..., N}. Defining

\[
G(x) = n_\alpha f_{\xi^i_\alpha}(x,u(x),\nabla u(x)),
\]

by the continuity of G, there exists r > 0 such that

G(x) > β/2, in Br(x0),


and in particular

G(x) > β/2, in Br(x0) ∩ Γ1.

Choose 0 < r1 < r such that Br1(x0) ∩ Γ0 = ∅. This is possible since Γ0 is closed and x0 ∈ Γ1.

Choose ϕ^i ∈ C^∞_c(Br1(x0)) such that ϕ^i ≥ 0 in Br1(x0) and ϕ^i > 0 in Br1/2(x0). Therefore

\[
\int_{\Gamma_1}G(x)\varphi^i(x)\,d\Gamma > \frac{\beta}{2}\int_{\Gamma_1}\varphi^i\,d\Gamma > 0,
\]

and this contradicts (7.16). Thus

G(x) ≤ 0, ∀x ∈ Γ1,

and by analogy

G(x) ≥ 0, ∀x ∈ Γ1,

so that

G(x) = 0, ∀x ∈ Γ1.


CHAPTER 8

Basic Concepts on Convex Analysis

For this chapter the most relevant reference is Ekeland and Temam, [17].

8.1. Convex Sets and Convex Functions

Let S be a subset of a vector space U. We recall that S is convex if given u, v ∈ S then

λu + (1 − λ)v ∈ S, ∀λ ∈ [0, 1]. (8.1)

Definition 8.1.1 (Convex hull). Let S be a subset of a vector space U. We define the convex hull of S, denoted by Co(S), as

\[
Co(S) = \left\{\sum_{i=1}^n\lambda_iu_i \;\Big|\; n\in\mathbb{N},\ \sum_{i=1}^n\lambda_i = 1,\ \lambda_i\ge 0,\ u_i\in S,\ \forall i\in\{1,\dots,n\}\right\}. \tag{8.2}
\]

Definition 8.1.2 (Convex Functional). Let S be a convex subset of the vector space U. A functional F : S → R̄ = R ∪ {+∞, −∞} is said to be convex if

F(λu + (1 − λ)v) ≤ λF(u) + (1 − λ)F(v), ∀u, v ∈ S, λ ∈ [0, 1]. (8.3)

Definition 8.1.3 (Lower Semi-continuity). Let U be a Banach space. We say that F : U → R̄ is lower semi-continuous (l.s.c.) at u ∈ U, if

\[
\liminf_{n\to+\infty}F(u_n) \ge F(u), \tag{8.4}
\]

whenever

un → u strongly (in norm). (8.5)


Definition 8.1.4 (Weak Lower Semi-Continuity). Let U be a Banach space. We say that F : U → R̄ is weakly lower semi-continuous (w.l.s.c.) at u ∈ U, if

\[
\liminf_{n\to+\infty}F(u_n) \ge F(u), \tag{8.6}
\]

whenever

un ⇀ u, weakly. (8.7)

Remark 8.1.5. We say that F is a (weak) lower semi-continuous function, if F : U → R̄ is (weak) lower semi-continuous ∀u ∈ U.

Definition 8.1.6 (Epigraph). Given F : U → R̄ we define its Epigraph, denoted by Epi(F), as

Epi(F) = {(u, a) ∈ U × R | a ≥ F(u)}.

Now we present a very important result, which we do not prove. For a proof see Ekeland and Temam, [17].

Proposition 8.1.7. A function F : U → R̄ is l.s.c. (lower semi-continuous) if and only if its epigraph is closed.

Corollary 8.1.8. Every convex l.s.c. function F : U → R̄ is also w.l.s.c. (weakly lower semi-continuous).

Proof. The result follows from the fact that the epigraph of F is convex, and closed convex sets are weakly closed.

Definition 8.1.9 (Affine Continuous Function). Let U be a Banach space. A functional F : U → R is said to be affine continuous if there exist u∗ ∈ U∗ and α ∈ R such that

F(u) = 〈u, u∗〉U + α, ∀u ∈ U. (8.8)

Definition 8.1.10 (Γ(U)). Let U be a Banach space. We say that F : U → R̄ belongs to Γ(U) and write F ∈ Γ(U) if F can be represented as the point-wise supremum of a family of affine continuous functions. If F ∈ Γ(U), and F(u) ∈ R for some u ∈ U, then we write F ∈ Γ0(U).


Proposition 8.1.11. Let U be a Banach space. Then F ∈ Γ(U) if and only if F is convex and l.s.c., and if F takes the value −∞ then F ≡ −∞.

Definition 8.1.12 (Convex Envelope). Let U be a Banach space. Given F : U → R̄, we define its convex envelope, denoted by CF : U → R̄, by

\[
CF(u) = \sup_{(u^*,\alpha)\in A^*}\{\langle u,u^*\rangle_U+\alpha\}, \tag{8.9}
\]

where

A∗ = {(u∗, α) ∈ U∗ × R | 〈v, u∗〉U + α ≤ F(v), ∀v ∈ U}. (8.10)

Definition 8.1.13 (Polar Functionals). Given F : U → R̄, we define the related polar functional, denoted by F∗ : U∗ → R̄, by

\[
F^*(u^*) = \sup_{u\in U}\{\langle u,u^*\rangle_U-F(u)\}, \quad \forall u^*\in U^*. \tag{8.11}
\]

Definition 8.1.14 (Bipolar Functional). Given F : U → R̄, we define the related bipolar functional, denoted by F∗∗ : U → R̄, as

\[
F^{**}(u) = \sup_{u^*\in U^*}\{\langle u,u^*\rangle_U-F^*(u^*)\}, \quad \forall u\in U. \tag{8.12}
\]

Proposition 8.1.15. Given F : U → R̄, then F∗∗(u) = CF(u) and in particular if F ∈ Γ(U) then F∗∗(u) = F(u).

Proof. By definition, the convex envelope of F is the supremum of all affine continuous minorants of F. We can consider only the maximal minorants, that is, functions of the form

u ↦ 〈u, u∗〉U − F∗(u∗). (8.13)

Thus,

\[
CF(u) = \sup_{u^*\in U^*}\{\langle u,u^*\rangle_U-F^*(u^*)\} = F^{**}(u). \tag{8.14}
\]
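Definitions 8.1.13 and 8.1.14 can be imitated on a grid in the scalar case U = R. The sketch below computes discrete versions of F∗ and F∗∗ for the non-convex function F(u) = (u² − 1)² restricted to [−2, 2], with suprema replaced by maxima over finite grids. The function, the grids and the helper names are illustrative assumptions, and the result is only a discrete approximation of the bipolar functional.

```python
def F(u):
    """Illustrative non-convex (double-well) function."""
    return (u * u - 1.0) ** 2

# Grids for the primal variable u and for the dual variable (slope) u*.
U_GRID = [-2.0 + 4.0 * i / 400 for i in range(401)]
S_GRID = [-40.0 + 80.0 * j / 800 for j in range(801)]

def polar(values, points, slopes):
    """Discrete polar: for each slope s, the maximum over points p of s * p - value(p)."""
    return [max(s * p - v for p, v in zip(points, values)) for s in slopes]

F_vals = [F(u) for u in U_GRID]
F_star = polar(F_vals, U_GRID, S_GRID)      # approximates F*
F_bipolar = polar(F_star, S_GRID, U_GRID)   # approximates F**

i_zero, i_two = 200, 400                    # indices of u = 0 and u = 2 in U_GRID
print(F_vals[i_zero], F_bipolar[i_zero])    # F(0) = 1, while F**(0) is approximately 0
print(F_vals[i_two], F_bipolar[i_two])      # at u = 2 the two values agree
```

Between the two wells the discrete bipolar stays at the level of the minimum, which is the behaviour expected of the convex envelope, while away from the wells it coincides with F.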

Corollary 8.1.16. Given F : U → R̄, we have F∗ = F∗∗∗.

Proof. Since F∗∗ ≤ F we obtain

F∗ ≤ F∗∗∗. (8.15)

On the other hand, we have

F∗∗(u) ≥ 〈u, u∗〉U − F∗(u∗), (8.16)

so that

\[
F^{***}(u^*) = \sup_{u\in U}\{\langle u,u^*\rangle_U-F^{**}(u)\} \le F^*(u^*). \tag{8.17}
\]

From (8.15) and (8.17) we obtain F∗(u∗) = F∗∗∗(u∗).

Definition 8.1.17 (Gateaux Differentiability). A functional $F : U \to \mathbb{R}$ is said to be Gateaux differentiable at $u \in U$ if there exists $u^* \in U^*$ such that:

$$\lim_{\lambda \to 0} \frac{F(u + \lambda h) - F(u)}{\lambda} = \langle h, u^* \rangle_U, \quad \forall h \in U. \tag{8.18}$$

The vector $u^*$ is said to be the Gateaux derivative of $F : U \to \mathbb{R}$ at $u$ and may be denoted as follows:

$$u^* = \frac{\partial F(u)}{\partial u} \quad \text{or} \quad u^* = \delta F(u). \tag{8.19}$$

Definition 8.1.18 (Sub-gradients). Given $F : U \to \mathbb{R}$, we define the set of sub-gradients of $F$ at $u$, denoted by $\partial F(u)$, as:

$$\partial F(u) = \{u^* \in U^* \text{ such that } \langle v - u, u^* \rangle_U + F(u) \leq F(v), \ \forall v \in U\}. \tag{8.20}$$
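A one-dimensional example may help fix the definition. The choice $U = \mathbb{R}$, $F(u) = |u|$ is an assumption made here for illustration and is not taken from the text. At the kink $u = 0$ the function is not Gateaux differentiable, yet the set of sub-gradients is a whole interval:

```latex
% Worked example (assumed): U = \mathbb{R}, F(u) = |u|.
\[
\partial F(0) = \{u^* \in \mathbb{R} \mid v\,u^* \le |v|,\ \forall v \in \mathbb{R}\} = [-1, 1],
\qquad
\partial F(u) = \{\operatorname{sign}(u)\} \quad \text{for } u \neq 0 .
\]
```

Away from the origin the sub-gradient set reduces to the single element given by the ordinary derivative.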

Definition 8.1.19 (Adjoint Operator). Let $U$ and $Y$ be Banach spaces and $\Lambda : U \to Y$ a continuous linear operator. The Adjoint Operator related to $\Lambda$, denoted by $\Lambda^* : Y^* \to U^*$, is defined through the equation:

$$\langle u, \Lambda^* v^* \rangle_U = \langle \Lambda u, v^* \rangle_Y, \quad \forall u \in U, \ v^* \in Y^*. \tag{8.21}$$

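Lemma 8.1.20 (Continuity of Convex Functions). If in a neighborhood of a point $u \in U$ a convex function $F$ is bounded above by a finite constant, then $F$ is continuous at $u$.

Proof. By translation, we may reduce the problem to the case where $u = \theta$ and $F(\theta) = 0$. Let $\mathcal{V}$ be a neighborhood of the origin such that $F(v) \leq a < +\infty$, $\forall v \in \mathcal{V}$. Define $\mathcal{W} = \mathcal{V} \cap (-\mathcal{V})$ (which is a symmetric neighborhood of the origin). Pick $\varepsilon \in (0, 1)$. If $v \in \varepsilon \mathcal{W}$, since $F$ is convex and

$$\frac{v}{\varepsilon} \in \mathcal{V}, \tag{8.22}$$

we may infer that

$$F(v) \leq (1 - \varepsilon) F(\theta) + \varepsilon F(v/\varepsilon) \leq \varepsilon a. \tag{8.23}$$

Also

$$\frac{-v}{\varepsilon} \in \mathcal{V}. \tag{8.24}$$

Thus, writing $\theta = \frac{1}{1+\varepsilon} v + \frac{\varepsilon}{1+\varepsilon} (-v/\varepsilon)$ and using the convexity of $F$ once more,

$$F(v) \geq (1 + \varepsilon) F(\theta) - \varepsilon F(-v/\varepsilon) \geq -\varepsilon a. \tag{8.25}$$

Therefore

$$|F(v)| \leq \varepsilon a, \quad \forall v \in \varepsilon \mathcal{W}, \tag{8.26}$$

that is, $F$ is continuous at $u = \theta$.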

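Proposition 8.1.21. Let $F : U \to \mathbb{R}$ be a convex function, finite and continuous at $u \in U$. Then $\partial F(u) \neq \emptyset$.

Proof. Since $F$ is convex, $Epi(F)$ is convex, and as $F$ is continuous at $u$, the interior of $Epi(F)$ is non-empty. Observe that $(u, F(u))$ belongs to the boundary of $Epi(F)$, so that, denoting $A = Epi(F)$, we may separate $(u, F(u))$ from $A$ by a closed hyperplane $H$, which may be written as

$$H = \{(v, a) \in U \times \mathbb{R} \mid \langle v, u^* \rangle_U + \alpha a = \beta\}, \tag{8.27}$$

for some fixed $\alpha, \beta \in \mathbb{R}$ and $u^* \in U^*$, so that

$$\langle v, u^* \rangle_U + \alpha a \geq \beta, \quad \forall (v, a) \in Epi(F), \tag{8.28}$$

and

$$\langle u, u^* \rangle_U + \alpha F(u) = \beta, \tag{8.29}$$

where $(\alpha, \beta, u^*) \neq (0, 0, \theta)$. Suppose $\alpha = 0$; then we have

$$\langle v - u, u^* \rangle_U \geq 0, \quad \forall v \in U, \tag{8.30}$$

and thus we obtain $u^* = \theta$ and $\beta = 0$, which contradicts $(\alpha, \beta, u^*) \neq (0, 0, \theta)$. Therefore we may assume $\alpha > 0$ (considering (8.28)), so that $\forall v \in U$ we have

$$\frac{\beta}{\alpha} - \langle v, u^*/\alpha \rangle_U \leq F(v), \tag{8.31}$$

and

$$\frac{\beta}{\alpha} - \langle u, u^*/\alpha \rangle_U = F(u), \tag{8.32}$$

or

$$\langle v - u, -u^*/\alpha \rangle_U + F(u) \leq F(v), \quad \forall v \in U, \tag{8.33}$$

so that

$$-u^*/\alpha \in \partial F(u). \tag{8.34}$$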

Definition 8.1.22 (Caratheodory Mapping). Let $S \subset \mathbb{R}^n$ be an open set. We say that $g : S \times \mathbb{R}^l \to \mathbb{R}$ is a Caratheodory mapping if:

$\forall \xi \in \mathbb{R}^l$, $x \mapsto g(x, \xi)$ is a measurable function,

and

for almost all $x \in S$, $\xi \mapsto g(x, \xi)$ is a continuous function.

The proofs of the next results may be found in Ekeland and Temam [17].

Proposition 8.1.23. Let $E$ and $F$ be two Banach spaces, $S$ a Borel subset of $\mathbb{R}^n$, and $g : S \times E \to F$ a Caratheodory mapping. For each measurable function $u : S \to E$, let $G_1(u)$ be the measurable function $x \mapsto g(x, u(x)) \in F$.

If $G_1$ maps $L^p(S, E)$ into $L^r(S, F)$ for $1 \leq p, r < \infty$, then $G_1$ is continuous in the norm topology.

For the functional $G : U \to \mathbb{R}$ defined by $G(u) = \int_S g(x, u(x)) \, dS$, where $U = U^* = [L^2(S)]^l$ (this is a special case of the more general hypothesis presented in [17]), we have the following result.

Proposition 8.1.24. Considering the last proposition, we can express $G^* : U^* \to \mathbb{R}$ as:

$$G^*(u^*) = \int_S g^*(x, u^*(x)) \, dS, \tag{8.35}$$

where $g^*(x, y) = \sup_{\eta \in \mathbb{R}^l} (y \cdot \eta - g(x, \eta))$, almost everywhere in $S$.

For non-convex functionals it may sometimes be difficult to express analytically the conditions for a global extremum. This fact motivates the definition of the Legendre Transform, which is established through a local extremum.

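Definition 8.1.25 (Legendre's Transform and Associated Functional). Consider a differentiable function $g : \mathbb{R}^n \to \mathbb{R}$. Its Legendre Transform, denoted by $g^*_L : \mathbb{R}^n_L \to \mathbb{R}$, is expressed as (summation over the repeated index $i$ is understood):

$$g^*_L(y^*) = x_{0i} \cdot y^*_i - g(x_0), \tag{8.36}$$

where $x_0$ is the solution of the system:

$$y^*_i = \frac{\partial g(x_0)}{\partial x_i}, \tag{8.37}$$

and

$$\mathbb{R}^n_L = \{y^* \in \mathbb{R}^n \text{ such that equation (8.37) has a unique solution}\}.$$

Furthermore, considering the functional $G : Y \to \mathbb{R}$ defined as $G(v) = \int_S g(v) \, dS$, we define the Associated Legendre Transform Functional, denoted by $G^*_L : Y^*_L \to \mathbb{R}$, as:

$$G^*_L(v^*) = \int_S g^*_L(v^*) \, dS, \tag{8.38}$$

where $Y^*_L = \{v^* \in Y^* \mid v^*(x) \in \mathbb{R}^n_L, \text{ a.e. in } S\}$.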

About the Legendre transform we still have the following results:

Proposition 8.1.26. Considering the last definitions, suppose that for each $y^* \in \mathbb{R}^n_L$, at least in a neighborhood (of $y^*$), it is possible to define a differentiable function by the expression

$$x_0(y^*) = \left[\frac{\partial g}{\partial x}\right]^{-1}(y^*). \tag{8.39}$$

Then, $\forall i \in \{1, ..., n\}$, we may write:

$$y^*_i = \frac{\partial g(x_0)}{\partial x_i} \Leftrightarrow x_{0i} = \frac{\partial g^*_L(y^*)}{\partial y^*_i}. \tag{8.40}$$

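Proof. Suppose first that:

$$y^*_i = \frac{\partial g(x_0)}{\partial x_i}, \quad \forall i \in \{1, ..., n\}, \tag{8.41}$$

thus:

$$g^*_L(y^*) = y^*_i x_{0i} - g(x_0), \tag{8.42}$$

and taking derivatives of this expression we have:

$$\frac{\partial g^*_L(y^*)}{\partial y^*_i} = y^*_j \frac{\partial x_{0j}}{\partial y^*_i} + x_{0i} - \frac{\partial g(x_0)}{\partial x_j} \frac{\partial x_{0j}}{\partial y^*_i}, \tag{8.43}$$

or

$$\frac{\partial g^*_L(y^*)}{\partial y^*_i} = \left(y^*_j - \frac{\partial g(x_0)}{\partial x_j}\right) \frac{\partial x_{0j}}{\partial y^*_i} + x_{0i}, \tag{8.44}$$

which from (8.41) implies that:

$$\frac{\partial g^*_L(y^*)}{\partial y^*_i} = x_{0i}, \quad \forall i \in \{1, ..., n\}. \tag{8.45}$$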

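This completes the first half of the proof. Conversely, suppose now that:

$$x_{0i} = \frac{\partial g^*_L(y^*)}{\partial y^*_i}, \quad \forall i \in \{1, ..., n\}. \tag{8.46}$$

As $y^* \in \mathbb{R}^n_L$ there exists $\bar{x}_0 \in \mathbb{R}^n$ such that:

$$y^*_i = \frac{\partial g(\bar{x}_0)}{\partial x_i}, \quad \forall i \in \{1, ..., n\}, \tag{8.47}$$

and

$$g^*_L(y^*) = y^*_i \bar{x}_{0i} - g(\bar{x}_0), \tag{8.48}$$

and therefore, taking derivatives of this expression, we obtain:

$$\frac{\partial g^*_L(y^*)}{\partial y^*_i} = y^*_j \frac{\partial \bar{x}_{0j}}{\partial y^*_i} + \bar{x}_{0i} - \frac{\partial g(\bar{x}_0)}{\partial x_j} \frac{\partial \bar{x}_{0j}}{\partial y^*_i}, \tag{8.49}$$

$\forall i \in \{1, ..., n\}$, so that:

$$\frac{\partial g^*_L(y^*)}{\partial y^*_i} = \left(y^*_j - \frac{\partial g(\bar{x}_0)}{\partial x_j}\right) \frac{\partial \bar{x}_{0j}}{\partial y^*_i} + \bar{x}_{0i}, \tag{8.50}$$

$\forall i \in \{1, ..., n\}$, which from (8.46) and (8.47) implies that:

$$\bar{x}_{0i} = \frac{\partial g^*_L(y^*)}{\partial y^*_i} = x_{0i}, \quad \forall i \in \{1, ..., n\}. \tag{8.51}$$

From this and (8.47) we have:

$$y^*_i = \frac{\partial g(\bar{x}_0)}{\partial x_i} = \frac{\partial g(x_0)}{\partial x_i}, \quad \forall i \in \{1, ..., n\}. \tag{8.52}$$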
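The equivalence (8.40) can be checked numerically in a simple case. The sketch below assumes $n = 1$ and $g(x) = e^x$, so that $y^* = g'(x_0)$ has the unique solution $x_0 = \ln y^*$ for $y^* > 0$ and $g^*_L(y^*) = y^* \ln y^* - y^*$; this example, the helper names, and the finite-difference step are illustrative assumptions, not part of the text.

```python
import math

def g(x):
    """Assumed example: g(x) = exp(x)."""
    return math.exp(x)

def dg(x):
    """Derivative g'(x) = exp(x)."""
    return math.exp(x)

def x0_of(y):
    """Solve y = g'(x0) for x0; for g = exp this is x0 = log(y), y > 0."""
    return math.log(y)

def g_star_L(y):
    """Legendre transform g*_L(y) = x0 * y - g(x0), with x0 = x0_of(y)."""
    x0 = x0_of(y)
    return x0 * y - g(x0)

def central_diff(f, y, h=1e-6):
    """Symmetric finite-difference approximation of f'(y)."""
    return (f(y + h) - f(y - h)) / (2.0 * h)

# Each tuple holds (y, x0, numerical derivative of g*_L at y, g'(x0)).
checks = []
for y in (0.5, 1.0, 2.0, 5.0):
    x0 = x0_of(y)
    checks.append((y, x0, central_diff(g_star_L, y), dg(x0)))
```

In each tuple the third entry should be close to the second (the derivative of $g^*_L$ recovers $x_0$) and the fourth close to the first ($g'(x_0)$ recovers $y^*$), which are the two sides of (8.40).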


Theorem 8.1.27. Consider the functional $J : U \to \mathbb{R}$ defined as $J(u) = (G \circ \Lambda)(u) - \langle u, f \rangle_U$, where $\Lambda (= \{\Lambda_i\}) : U \to Y$ ($i \in \{1, ..., n\}$) is a continuous linear operator and $G : Y \to \mathbb{R}$ is a functional that can be expressed as $G(v) = \int_S g(v) \, dS$, $\forall v \in Y$ (here $g : \mathbb{R}^n \to \mathbb{R}$ is a differentiable function that admits a Legendre Transform denoted by $g^*_L : \mathbb{R}^n_L \to \mathbb{R}$; that is, the hypotheses mentioned in Proposition 8.1.26 are satisfied).

Under these assumptions we have:

$$\delta J(u_0) = \theta \Leftrightarrow \delta(-G^*_L(v^*_0) + \langle u_0, \Lambda^* v^*_0 - f \rangle_U) = \theta, \tag{8.53}$$

where $v^*_0 = \frac{\partial G(\Lambda(u_0))}{\partial v}$ is supposed to be such that $v^*_0(x) \in \mathbb{R}^n_L$, a.e. in $S$, and in this case:

$$J(u_0) = -G^*_L(v^*_0). \tag{8.54}$$

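Proof. Suppose first that $\delta J(u_0) = \theta$, that is:

$$\Lambda^* \frac{\partial G(\Lambda u_0)}{\partial v} - f = \theta, \tag{8.55}$$

which, as $v^*_0 = \frac{\partial G(\Lambda u_0)}{\partial v}$, implies that:

$$\Lambda^* v^*_0 - f = \theta, \tag{8.56}$$

and

$$v^*_{0i} = \frac{\partial g(\Lambda u_0)}{\partial x_i}. \tag{8.57}$$

Thus from the last proposition we can write:

$$\Lambda_i(u_0) = \frac{\partial g^*_L(v^*_0)}{\partial y^*_i}, \quad \text{for } i \in \{1, ..., n\}, \tag{8.58}$$

which means:

$$\Lambda u_0 = \frac{\partial G^*_L(v^*_0)}{\partial v^*}. \tag{8.59}$$

Therefore from (8.56) and (8.59) we have:

$$\delta(-G^*_L(v^*_0) + \langle u_0, \Lambda^* v^*_0 - f \rangle_U) = \theta. \tag{8.60}$$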

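This completes the first part of the proof. Conversely, suppose now that:

$$\delta(-G^*_L(v^*_0) + \langle u_0, \Lambda^* v^*_0 - f \rangle_U) = \theta, \tag{8.61}$$

that is:

$$\Lambda^* v^*_0 - f = \theta \tag{8.62}$$

and

$$\Lambda u_0 = \frac{\partial G^*_L(v^*_0)}{\partial v^*}. \tag{8.63}$$

Clearly, from (8.63), the last proposition and (8.62) we can write:

$$v^*_0 = \frac{\partial G(\Lambda(u_0))}{\partial v} \tag{8.64}$$

and

$$\Lambda^* \frac{\partial G(\Lambda u_0)}{\partial v} - f = \theta, \tag{8.65}$$

which implies:

$$\delta J(u_0) = \theta. \tag{8.66}$$

Finally, we have:

$$J(u_0) = G(\Lambda u_0) - \langle u_0, f \rangle_U. \tag{8.67}$$

From this, (8.62) and (8.64) we have

$$J(u_0) = G(\Lambda u_0) - \langle u_0, \Lambda^* v^*_0 \rangle_U = G(\Lambda u_0) - \langle \Lambda u_0, v^*_0 \rangle_Y = -G^*_L(v^*_0). \tag{8.68}$$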
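A scalar sanity check of (8.54) may be useful. The setting below is an assumption made for illustration: $U = Y = \mathbb{R}$, $\Lambda$ the identity, $G$ identified with $g$ (no integration), $g(v) = v^4/4$, and $f > 0$.

```latex
% Worked example (assumed): J(u) = u^4/4 - f u with f > 0.
\begin{align*}
\delta J(u_0) = 0 &\iff u_0^3 = f \iff u_0 = f^{1/3}, \qquad v_0^* = g'(u_0) = f, \\
g^*_L(y^*) &= x_0 y^* - \tfrac{1}{4} x_0^4 \ \text{ with } x_0 = (y^*)^{1/3}
            \ \Longrightarrow\ g^*_L(y^*) = \tfrac{3}{4} (y^*)^{4/3}, \quad y^* > 0, \\
J(u_0) &= \tfrac{1}{4} f^{4/3} - f^{4/3} = -\tfrac{3}{4} f^{4/3} = -g^*_L(v_0^*).
\end{align*}
```

The stationarity condition on the dual side reads $\Lambda^* v^*_0 - f = 0$, that is $v^*_0 = f$, in agreement with (8.56).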

8.2. Duality in Convex Optimization

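Let $U$ be a Banach space. Given $F : U \to \mathbb{R}$ ($F \in \Gamma_0(U)$), we define the problem $\mathcal{P}$ as

$$\mathcal{P} : \text{minimize } F(u) \text{ on } U. \tag{8.69}$$

We say that $u_0 \in U$ is a solution of problem $\mathcal{P}$ if $F(u_0) = \inf_{u \in U} F(u)$. Consider a function $\phi(u, p)$ ($\phi : U \times Y \to \mathbb{R}$) such that

$$\phi(u, 0) = F(u); \tag{8.70}$$

we define the problem $\mathcal{P}^*$ as

$$\mathcal{P}^* : \text{maximize } -\phi^*(0, p^*) \text{ on } Y^*. \tag{8.71}$$

Observe that, for all $u \in U$ and $p^* \in Y^*$,

$$\phi^*(0, p^*) = \sup_{(u, p) \in U \times Y} \{\langle 0, u \rangle_U + \langle p, p^* \rangle_Y - \phi(u, p)\} \geq -\phi(u, 0), \tag{8.72}$$

or

$$\inf_{u \in U} \{\phi(u, 0)\} \geq \sup_{p^* \in Y^*} \{-\phi^*(0, p^*)\}. \tag{8.73}$$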


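Proposition 8.2.1. Consider $\phi \in \Gamma_0(U \times Y)$. If we define

$$h(p) = \inf_{u \in U} \{\phi(u, p)\}, \tag{8.74}$$

then $h$ is convex.

Proof. We have to show that, given $p, q \in Y$ and $\lambda \in (0, 1)$, we have

$$h(\lambda p + (1 - \lambda) q) \leq \lambda h(p) + (1 - \lambda) h(q). \tag{8.75}$$

If $h(p) = +\infty$ or $h(q) = +\infty$ we are done. Thus let us assume $h(p) < +\infty$ and $h(q) < +\infty$. For each $a > h(p)$ there exists $u \in U$ such that

$$h(p) \leq \phi(u, p) \leq a, \tag{8.76}$$

and, if $b > h(q)$, there exists $v \in U$ such that

$$h(q) \leq \phi(v, q) \leq b. \tag{8.77}$$

Thus

$$\begin{aligned} h(\lambda p + (1 - \lambda) q) &\leq \inf_{w \in U} \{\phi(w, \lambda p + (1 - \lambda) q)\} \\ &\leq \phi(\lambda u + (1 - \lambda) v, \lambda p + (1 - \lambda) q) \leq \lambda \phi(u, p) + (1 - \lambda) \phi(v, q) \\ &\leq \lambda a + (1 - \lambda) b. \end{aligned} \tag{8.78}$$

Letting $a \to h(p)$ and $b \to h(q)$ we obtain

$$h(\lambda p + (1 - \lambda) q) \leq \lambda h(p) + (1 - \lambda) h(q). \tag{8.79}$$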

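Proposition 8.2.2. For $h$ as above, we have $h^*(p^*) = \phi^*(0, p^*)$, $\forall p^* \in Y^*$, so that

$$h^{**}(0) = \sup_{p^* \in Y^*} \{-\phi^*(0, p^*)\}. \tag{8.80}$$

Proof. Observe that

$$h^*(p^*) = \sup_{p \in Y} \{\langle p, p^* \rangle_Y - h(p)\} = \sup_{p \in Y} \{\langle p, p^* \rangle_Y - \inf_{u \in U} \{\phi(u, p)\}\}, \tag{8.81}$$

so that

$$h^*(p^*) = \sup_{(u, p) \in U \times Y} \{\langle p, p^* \rangle_Y - \phi(u, p)\} = \phi^*(0, p^*). \tag{8.82}$$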


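Proposition 8.2.3. The set of solutions of the problem $\mathcal{P}^*$ (the dual problem) is identical to $\partial h^{**}(0)$.

Proof. Consider $p^*_0 \in Y^*$ a solution of Problem $\mathcal{P}^*$, that is,

$$-\phi^*(0, p^*_0) \geq -\phi^*(0, p^*), \quad \forall p^* \in Y^*, \tag{8.83}$$

which is equivalent to

$$-h^*(p^*_0) \geq -h^*(p^*), \quad \forall p^* \in Y^*, \tag{8.84}$$

that is,

$$-h^*(p^*_0) = \sup_{p^* \in Y^*} \{\langle 0, p^* \rangle_Y - h^*(p^*)\} \Leftrightarrow -h^*(p^*_0) = h^{**}(0) \Leftrightarrow p^*_0 \in \partial h^{**}(0). \tag{8.85}$$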

Theorem 8.2.4. Consider $\phi : U \times Y \to \mathbb{R}$ convex. Assume $\inf_{u \in U} \{\phi(u, 0)\} \in \mathbb{R}$ and that there exists $u_0 \in U$ such that $p \mapsto \phi(u_0, p)$ is finite and continuous at $0 \in Y$. Then

$$\inf_{u \in U} \{\phi(u, 0)\} = \sup_{p^* \in Y^*} \{-\phi^*(0, p^*)\}, \tag{8.86}$$

and the dual problem has at least one solution.

Proof. By hypothesis $h(0) \in \mathbb{R}$ and, as was shown above, $h$ is convex. As the function $p \mapsto \phi(u_0, p)$ is convex and continuous at $0 \in Y$, there exists a neighborhood $\mathcal{V}$ of zero in $Y$ such that

$$\phi(u_0, p) \leq M < +\infty, \quad \forall p \in \mathcal{V}, \tag{8.87}$$

for some $M \in \mathbb{R}$. Thus, we may write

$$h(p) = \inf_{u \in U} \{\phi(u, p)\} \leq \phi(u_0, p) \leq M, \quad \forall p \in \mathcal{V}. \tag{8.88}$$

Hence, from Lemma 8.1.20, $h$ is continuous at $0$. Thus by Proposition 8.1.21, $h$ is sub-differentiable at $0$, which means $h(0) = h^{**}(0)$. Therefore by Proposition 8.2.3, the dual problem has solutions and

$$h(0) = \inf_{u \in U} \{\phi(u, 0)\} = \sup_{p^* \in Y^*} \{-\phi^*(0, p^*)\} = h^{**}(0). \tag{8.89}$$

Now we apply the last results to $\phi(u, p) = G(\Lambda u + p) + F(u)$, where $\Lambda : U \to Y$ is a continuous linear operator whose adjoint operator is denoted by $\Lambda^* : Y^* \to U^*$. We may state the following theorem.


Theorem 8.2.5. Suppose $U$ is a reflexive Banach space and define $J : U \to \mathbb{R}$ by

$$J(u) = G(\Lambda u) + F(u) = \phi(u, 0), \tag{8.90}$$

where $\lim J(u) = +\infty$ as $\|u\|_U \to \infty$ and $F \in \Gamma_0(U)$, $G \in \Gamma_0(Y)$. Also suppose there exists $u \in U$ such that $J(u) < +\infty$, with the function $p \mapsto G(p)$ continuous at $\Lambda u$. Under such hypotheses, there exist $u_0 \in U$ and $p^*_0 \in Y^*$ such that

$$J(u_0) = \min_{u \in U} \{J(u)\} = \max_{p^* \in Y^*} \{-G^*(p^*) - F^*(-\Lambda^* p^*)\} = -G^*(p^*_0) - F^*(-\Lambda^* p^*_0). \tag{8.91}$$

Proof. The existence of solutions for the primal problem follows from the direct method of the calculus of variations. That is, considering a minimizing sequence, from the coercivity hypothesis above, such a sequence is bounded and has a subsequence weakly convergent to some $u_0 \in U$. Finally, from the lower semi-continuity of the primal formulation, we may conclude that $u_0$ is a minimizer. The other conclusions follow from Theorem 8.2.4, just observing that

$$\begin{aligned} \phi^*(0, p^*) &= \sup_{u \in U, p \in Y} \{\langle p, p^* \rangle_Y - G(\Lambda u + p) - F(u)\} \\ &= \sup_{u \in U, q \in Y} \{\langle q, p^* \rangle_Y - G(q) - \langle \Lambda u, p^* \rangle_Y - F(u)\}, \end{aligned} \tag{8.92}$$

so that

$$\phi^*(0, p^*) = G^*(p^*) + \sup_{u \in U} \{-\langle u, \Lambda^* p^* \rangle_U - F(u)\} = G^*(p^*) + F^*(-\Lambda^* p^*). \tag{8.93}$$

Thus,

$$\inf_{u \in U} \{\phi(u, 0)\} = \sup_{p^* \in Y^*} \{-\phi^*(0, p^*)\} \tag{8.94}$$

and solutions $u_0$ and $p^*_0$ for the primal and dual problems, respectively, imply that

$$J(u_0) = \min_{u \in U} \{J(u)\} = \max_{p^* \in Y^*} \{-G^*(p^*) - F^*(-\Lambda^* p^*)\} = -G^*(p^*_0) - F^*(-\Lambda^* p^*_0). \tag{8.95}$$
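The equality (8.91) can be verified by hand in a one-dimensional case. The data below are assumptions made for illustration: $U = Y = \mathbb{R}$, $\Lambda$ the identity, $G(v) = v^2/2$ and $F(u) = -fu$ for a fixed $f \in \mathbb{R}$, so that $J(u) = u^2/2 - fu$ is coercive.

```latex
% Worked example (assumed): J(u) = u^2/2 - f u on U = Y = \mathbb{R}.
\begin{align*}
G^*(p^*) &= \sup_{v \in \mathbb{R}} \{ v p^* - \tfrac{1}{2} v^2 \} = \tfrac{1}{2} (p^*)^2, \\
F^*(u^*) &= \sup_{u \in \mathbb{R}} \{ (u^* + f) u \}
          = \begin{cases} 0, & u^* = -f, \\ +\infty, & \text{otherwise}, \end{cases} \\
\max_{p^* \in \mathbb{R}} \{ -G^*(p^*) - F^*(-p^*) \} &= -\tfrac{1}{2} f^2 \quad \text{attained at } p_0^* = f, \\
\min_{u \in \mathbb{R}} J(u) &= J(f) = \tfrac{1}{2} f^2 - f^2 = -\tfrac{1}{2} f^2 .
\end{align*}
```

Here $u_0 = f$ and $p^*_0 = f = G'(\Lambda u_0)$, so the primal and dual optimal values coincide, as the theorem asserts.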


8.3. Relaxation for the Scalar Case

In this section, $\Omega \subset \mathbb{R}^N$ denotes a bounded open set with a locally Lipschitz boundary. That is, for each point $x \in \partial\Omega$ there exists a neighborhood $\mathcal{U}_x$ whose intersection with $\partial\Omega$ is the graph of a Lipschitz continuous function.

We start with the following definition.

Definition 8.3.1. A function $u : \Omega \to \mathbb{R}$ is said to be affine if $\nabla u$ is constant on $\Omega$. Furthermore, we say that $u : \Omega \to \mathbb{R}$ is piecewise affine if it is continuous and there exists a partition of $\Omega$ into a set of zero measure and a finite number of open sets on which $u$ is affine.

The proof of the next result is found in [17].

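Theorem 8.3.2. Let $r \in \mathbb{N}$ and let $u_k$, $1 \leq k \leq r$, be piecewise affine functions from $\Omega$ into $\mathbb{R}$, and $\{\alpha_k\}$ such that $\alpha_k > 0$, $\forall k \in \{1, ..., r\}$, and $\sum_{k=1}^r \alpha_k = 1$. Given $\varepsilon > 0$, there exist a locally Lipschitz function $u : \Omega \to \mathbb{R}$ and $r$ disjoint open sets $\Omega_k$, $1 \leq k \leq r$, such that

$$|m(\Omega_k) - \alpha_k m(\Omega)| < \alpha_k \varepsilon, \quad \forall k \in \{1, ..., r\}, \tag{8.96}$$

$$\nabla u(x) = \nabla u_k(x), \quad \text{a.e. on } \Omega_k, \tag{8.97}$$

$$|\nabla u(x)| \leq \max_{1 \leq k \leq r} \{|\nabla u_k(x)|\}, \quad \text{a.e. on } \Omega, \tag{8.98}$$

$$\left| u(x) - \sum_{k=1}^r \alpha_k u_k(x) \right| < \varepsilon, \quad \forall x \in \Omega, \tag{8.99}$$

$$u(x) = \sum_{k=1}^r \alpha_k u_k(x), \quad \forall x \in \partial\Omega. \tag{8.100}$$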

The next result is also found in [17].

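Proposition 8.3.3. Let $r \in \mathbb{N}$ and let $u_k$, $1 \leq k \leq r$, be piecewise affine functions from $\Omega$ into $\mathbb{R}$, with weights $\alpha_k > 0$, $\sum_{k=1}^r \alpha_k = 1$, as in Theorem 8.3.2. Consider a Caratheodory function $f : \Omega \times \mathbb{R}^N \to \mathbb{R}$ and a positive function $c \in L^1(\Omega)$ which satisfy

$$c(x) \geq \sup\{|f(x, \xi)| \ \mid \ |\xi| \leq \max_{1 \leq k \leq r} \|\nabla u_k\|_\infty\}. \tag{8.101}$$

Given $\varepsilon > 0$, there exists a locally Lipschitz function $u : \Omega \to \mathbb{R}$ such that

$$\left| \int_\Omega f(x, \nabla u) \, dx - \sum_{k=1}^r \alpha_k \int_\Omega f(x, \nabla u_k) \, dx \right| < \varepsilon, \tag{8.102}$$

$$|\nabla u(x)| \leq \max_{1 \leq k \leq r} |\nabla u_k(x)|, \quad \text{a.e. in } \Omega, \tag{8.103}$$

$$\left| u(x) - \sum_{k=1}^r \alpha_k u_k(x) \right| < \varepsilon, \quad \forall x \in \Omega, \tag{8.104}$$

$$u(x) = \sum_{k=1}^r \alpha_k u_k(x), \quad \forall x \in \partial\Omega. \tag{8.105}$$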

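Proof. It is sufficient to establish the result for functions $u_k$ affine over $\Omega$, since $\Omega$ can be divided into pieces on which the $u_k$ are affine, and such pieces can be put together through (8.105). Let $\varepsilon > 0$ be given. We know that simple functions are dense in $L^1(\Omega)$ with respect to the $L^1$ norm. Thus there exists a partition of $\Omega$ into a finite number of open sets $O_i$, $1 \leq i \leq N_1$, and a negligible set, and there exist functions $f_k$, constant over each $O_i$, such that

$$\int_\Omega |f(x, \nabla u_k(x)) - f_k(x)| \, dx < \varepsilon, \quad 1 \leq k \leq r. \tag{8.106}$$

Now choose $\delta > 0$ such that

$$\delta \leq \frac{\varepsilon}{N_1 (1 + \max_{1 \leq k \leq r} \{\|f_k\|_\infty\})} \tag{8.107}$$

and, if $B$ is a measurable set,

$$m(B) < \delta \Rightarrow \int_B c(x) \, dx \leq \varepsilon / N_1. \tag{8.108}$$

Now we apply Theorem 8.3.2 to each of the open sets $O_i$; therefore there exist a locally Lipschitz function $u : O_i \to \mathbb{R}$ and $r$ disjoint open sets $\Omega^i_k$, $1 \leq k \leq r$, such that

$$|m(\Omega^i_k) - \alpha_k m(O_i)| \leq \alpha_k \delta, \quad \text{for } 1 \leq k \leq r, \tag{8.109}$$

$$\nabla u = \nabla u_k, \quad \text{a.e. in } \Omega^i_k, \tag{8.110}$$

$$|\nabla u(x)| \leq \max_{1 \leq k \leq r} \{|\nabla u_k(x)|\}, \quad \text{a.e. in } O_i, \tag{8.111}$$

$$\left| u(x) - \sum_{k=1}^r \alpha_k u_k(x) \right| \leq \delta, \quad \forall x \in O_i, \tag{8.112}$$

$$u(x) = \sum_{k=1}^r \alpha_k u_k(x), \quad \forall x \in \partial O_i. \tag{8.113}$$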

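We can define $u = \sum_{k=1}^r \alpha_k u_k$ on $\Omega - \cup_{i=1}^{N_1} O_i$. Therefore $u$ is continuous and locally Lipschitz. Now observe that

$$\int_{O_i} f(x, \nabla u(x)) \, dx - \sum_{k=1}^r \int_{\Omega^i_k} f(x, \nabla u_k(x)) \, dx = \int_{O_i - \cup_{k=1}^r \Omega^i_k} f(x, \nabla u(x)) \, dx. \tag{8.114}$$

From $|f(x, \nabla u(x))| \leq c(x)$, $m(O_i - \cup_{k=1}^r \Omega^i_k) \leq \delta$ and (8.108) we obtain

$$\left| \int_{O_i} f(x, \nabla u(x)) \, dx - \sum_{k=1}^r \int_{\Omega^i_k} f(x, \nabla u_k(x)) \, dx \right| = \left| \int_{O_i - \cup_{k=1}^r \Omega^i_k} f(x, \nabla u(x)) \, dx \right| \leq \varepsilon / N_1. \tag{8.115}$$

Considering that $f_k$ is constant in $O_i$, from (8.107), (8.108) and (8.109) we obtain

$$\sum_{k=1}^r \left| \int_{\Omega^i_k} f_k(x) \, dx - \alpha_k \int_{O_i} f_k(x) \, dx \right| < \varepsilon / N_1. \tag{8.116}$$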


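We recall that $\Omega_k = \cup_{i=1}^{N_1} \Omega^i_k$, so that

$$\begin{aligned} \left| \int_\Omega f(x, \nabla u(x)) \, dx - \sum_{k=1}^r \alpha_k \int_\Omega f(x, \nabla u_k(x)) \, dx \right| &\leq \left| \int_\Omega f(x, \nabla u(x)) \, dx - \sum_{k=1}^r \int_{\Omega_k} f(x, \nabla u_k(x)) \, dx \right| \\ &\quad + \sum_{k=1}^r \int_{\Omega_k} |f(x, \nabla u_k(x)) - f_k(x)| \, dx \\ &\quad + \sum_{k=1}^r \left| \int_{\Omega_k} f_k(x) \, dx - \alpha_k \int_\Omega f_k(x) \, dx \right| \\ &\quad + \sum_{k=1}^r \alpha_k \int_\Omega |f_k(x) - f(x, \nabla u_k(x))| \, dx. \end{aligned} \tag{8.117}$$

From (8.115), (8.106), (8.116) and (8.106) again, we obtain

$$\left| \int_\Omega f(x, \nabla u(x)) \, dx - \sum_{k=1}^r \alpha_k \int_\Omega f(x, \nabla u_k) \, dx \right| < 4\varepsilon. \tag{8.118}$$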

We do not prove the next result; it is a well-known result from finite element theory.

Proposition 8.3.4. If $u \in W^{1,p}_0(\Omega)$, there exists a sequence $\{u_n\}$ of piecewise affine functions over $\Omega$, null on $\partial\Omega$, such that

$$u_n \to u, \quad \text{in } L^p(\Omega) \tag{8.119}$$

and

$$\nabla u_n \to \nabla u, \quad \text{in } L^p(\Omega; \mathbb{R}^N). \tag{8.120}$$

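Proposition 8.3.5. For $p$ such that $1 < p < \infty$, suppose that $f : \Omega \times \mathbb{R}^N \to \mathbb{R}$ is a Caratheodory function for which there exist $a_1, a_2 \in L^1(\Omega)$ and constants $c_1 \geq c_2 > 0$ such that

$$a_2(x) + c_2 |\xi|^p \leq f(x, \xi) \leq a_1(x) + c_1 |\xi|^p, \quad \forall x \in \Omega, \ \xi \in \mathbb{R}^N. \tag{8.121}$$

Then, given $u \in W^{1,p}(\Omega)$ piecewise affine, $\varepsilon > 0$ and a neighborhood $\mathcal{V}$ of zero in the topology $\sigma(L^p(\Omega, \mathbb{R}^N), L^q(\Omega, \mathbb{R}^N))$, there exists a function $v \in W^{1,p}(\Omega)$ such that

$$\nabla v - \nabla u \in \mathcal{V}, \tag{8.122}$$

$$u = v \text{ on } \partial\Omega,$$

$$\|v - u\|_\infty < \varepsilon, \tag{8.123}$$

and

$$\left| \int_\Omega f(x, \nabla v(x)) \, dx - \int_\Omega f^{**}(x, \nabla u(x)) \, dx \right| < \varepsilon. \tag{8.124}$$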

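Proof. Suppose given $\varepsilon > 0$, $u \in W^{1,p}(\Omega)$ piecewise affine continuous, and a neighborhood $\mathcal{V}$ of zero, which may be expressed as

$$\mathcal{V} = \left\{ w \in L^p(\Omega, \mathbb{R}^N) \ \middle| \ \left| \int_\Omega h_m \cdot w \, dx \right| < \eta, \ \forall m \in \{1, ..., M\} \right\}, \tag{8.125}$$

where $M \in \mathbb{N}$, $h_m \in L^q(\Omega, \mathbb{R}^N)$, $\eta \in \mathbb{R}^+$. By hypothesis, there exists a partition of $\Omega$ into a negligible set $\Omega_0$ and open subsets $\Delta_i$, $1 \leq i \leq r$, over which $\nabla u(x)$ is constant. From standard results of convex analysis in $\mathbb{R}^N$, for each $i \in \{1, ..., r\}$ we can obtain $\{\alpha_k \geq 0\}_{1 \leq k \leq N+1}$ and $\xi_k$ such that $\sum_{k=1}^{N+1} \alpha_k = 1$ and

$$\sum_{k=1}^{N+1} \alpha_k \xi_k = \nabla u, \quad \forall x \in \Delta_i, \tag{8.126}$$

and

$$\sum_{k=1}^{N+1} \alpha_k f(x, \xi_k) = f^{**}(x, \nabla u(x)). \tag{8.127}$$

Define $\beta_i = \max_{k \in \{1, ..., N+1\}} \{|\xi_k| \text{ on } \Delta_i\}$, $\rho_1 = \max_{i \in \{1, ..., r\}} \{\beta_i\}$, and $\rho = \max\{\rho_1, \|\nabla u\|_\infty\}$. Now, observe that we can obtain functions $\tilde{h}_m \in C^\infty_0(\Omega; \mathbb{R}^N)$ such that

$$\max_{m \in \{1, ..., M\}} \|\tilde{h}_m - h_m\|_{L^q(\Omega, \mathbb{R}^N)} < \frac{\eta}{4 \rho \, m(\Omega)}. \tag{8.128}$$

Define $C = \max_{m \in \{1, ..., M\}} \|\mathrm{div}(\tilde{h}_m)\|_{L^q(\Omega)}$, and we can also define

$$\varepsilon_1 = \min\{\varepsilon/4, \ 1/(m(\Omega)^{1/p}), \ \eta/(2 C m(\Omega)^{1/p}), \ 1/m(\Omega)\}. \tag{8.129}$$

We recall that $\rho$ does not depend on $\varepsilon$. Furthermore, for each $i \in \{1, ..., r\}$ there exists a compact subset $K_i \subset \Delta_i$ such that

$$\int_{\Delta_i - K_i} \left[ a_1(x) + c_1 \max_{|\xi| \leq \rho} \{|\xi|^p\} \right] dx < \frac{\varepsilon_1}{r}. \tag{8.130}$$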

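Also, observe that the restrictions of $f$ and $f^{**}$ to $K_i \times \rho B$ are continuous, so that from this and from the compactness of $\rho B$, for all $x \in K_i$, we can find an open ball $\omega_x$ with center at $x$ and contained in $\Omega$, such that

$$|f^{**}(y, \nabla u(x)) - f^{**}(x, \nabla u(x))| < \frac{\varepsilon_1}{m(\Omega)}, \quad \forall y \in \omega_x \cap K_i, \tag{8.131}$$

and

$$|f(y, \xi) - f(x, \xi)| < \frac{\varepsilon_1}{m(\Omega)}, \quad \forall y \in \omega_x \cap K_i, \ \forall \xi \in \rho B. \tag{8.132}$$

Therefore, from this and (8.127) we may write

$$\left| f^{**}(y, \nabla u(x)) - \sum_{k=1}^{N+1} \alpha_k f(y, \xi_k) \right| < \frac{2 \varepsilon_1}{m(\Omega)}, \quad \forall y \in \omega_x \cap K_i. \tag{8.133}$$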

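We can cover the compact set $K_i$ with a finite number of those open balls $\omega_x$, denoted by $\omega_j$, $1 \leq j \leq l$. Consider the open sets $\omega'_j = \omega_j - \cup_{i=1}^{j-1} \omega_i$; we have that $\cup_{j=1}^l \omega'_j = \cup_{j=1}^l \omega_j$. Defining functions $u_k$, for $1 \leq k \leq N+1$, such that $\nabla u_k = \xi_k$ and $u = \sum_{k=1}^{N+1} \alpha_k u_k$, we may apply Proposition 8.3.3 to each of the open sets $\omega'_j$, so that we obtain functions $v_i \in W^{1,p}(\Omega)$ such that

$$\left| \int_{\omega'_j} f(x, \nabla v_i(x)) \, dx - \sum_{k=1}^{N+1} \alpha_k \int_{\omega'_j} f(x, \xi_k) \, dx \right| < \frac{\varepsilon_1}{r l}, \tag{8.134}$$

$$|\nabla v_i| < \rho, \quad \forall x \in \omega'_j, \tag{8.135}$$

$$|v_i(x) - u(x)| < \varepsilon_1, \quad \forall x \in \omega'_j, \tag{8.136}$$

and

$$v_i(x) = u(x), \quad \forall x \in \partial\omega'_j. \tag{8.137}$$

Finally we set

$$v_i = u \text{ on } \Delta_i - \cup_{j=1}^l \omega_j. \tag{8.138}$$

We may define a continuous mapping $v : \Omega \to \mathbb{R}$ by

$$v(x) = v_i(x), \quad \text{if } x \in \Delta_i, \tag{8.139}$$

$$v(x) = u(x), \quad \text{if } x \in \Omega_0. \tag{8.140}$$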

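We have that $v(x) = u(x)$, $\forall x \in \partial\Omega$, and $\|\nabla v\|_\infty < \rho$. Also, from (8.130),

$$\int_{\Delta_i - K_i} |f^{**}(x, \nabla u(x))| \, dx < \frac{\varepsilon_1}{r} \tag{8.141}$$

and

$$\int_{\Delta_i - K_i} |f(x, \nabla v(x))| \, dx < \frac{\varepsilon_1}{r}. \tag{8.142}$$

On the other hand, from (8.133) and (8.134),

$$\left| \int_{K_i \cap \omega'_j} f(x, \nabla v(x)) \, dx - \int_{K_i \cap \omega'_j} f^{**}(x, \nabla u(x)) \, dx \right| \leq \frac{\varepsilon_1}{r l} + \frac{\varepsilon_1 m(\omega'_j \cap K_i)}{m(\Omega)}, \tag{8.143}$$

so that

$$\left| \int_{K_i} f(x, \nabla v(x)) \, dx - \int_{K_i} f^{**}(x, \nabla u(x)) \, dx \right| \leq \frac{\varepsilon_1}{r} + \frac{\varepsilon_1 m(K_i)}{m(\Omega)}. \tag{8.144}$$

Now summing up in $i$ and considering (8.141) and (8.142) we obtain (8.124), that is,

$$\left| \int_\Omega f(x, \nabla v(x)) \, dx - \int_\Omega f^{**}(x, \nabla u(x)) \, dx \right| < 4 \varepsilon_1 \leq \varepsilon. \tag{8.145}$$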

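Also, observe that from the above we have

$$\|v - u\|_\infty < \varepsilon_1, \tag{8.146}$$

and thus

$$\begin{aligned} \left| \int_\Omega \tilde{h}_m \cdot (\nabla v(x) - \nabla u(x)) \, dx \right| &= \left| -\int_\Omega \mathrm{div}(\tilde{h}_m)(v(x) - u(x)) \, dx \right| \\ &\leq \|\mathrm{div}(\tilde{h}_m)\|_{L^q(\Omega)} \|v - u\|_{L^p(\Omega)} \\ &\leq C \varepsilon_1 m(\Omega)^{1/p} < \frac{\eta}{2}. \end{aligned} \tag{8.147}$$

Also we have that

$$\left| \int_\Omega (h_m - \tilde{h}_m) \cdot (\nabla v - \nabla u) \, dx \right| \leq \|h_m - \tilde{h}_m\|_{L^q(\Omega, \mathbb{R}^N)} \|\nabla v - \nabla u\|_{L^p(\Omega, \mathbb{R}^N)} \leq \frac{\eta}{2}. \tag{8.148}$$

Thus

$$\left| \int_\Omega h_m \cdot (\nabla v - \nabla u) \, dx \right| < \eta, \quad \forall m \in \{1, ..., M\}. \tag{8.149}$$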

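Theorem 8.3.6. Assuming the hypotheses of the last proposition, given a function $u \in W^{1,p}_0(\Omega)$, $\varepsilon > 0$ and a neighborhood of zero $\mathcal{V}$ in $\sigma(L^p(\Omega, \mathbb{R}^N), L^q(\Omega, \mathbb{R}^N))$, there exists a function $v \in W^{1,p}_0(\Omega)$ such that

$$\nabla v - \nabla u \in \mathcal{V}, \tag{8.150}$$

and

$$\left| \int_\Omega f(x, \nabla v(x)) \, dx - \int_\Omega f^{**}(x, \nabla u(x)) \, dx \right| < \varepsilon. \tag{8.151}$$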

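Proof. We can approximate $u$ by a function $w$ which is piecewise affine and null on the boundary. Thus, there exists $\delta > 0$ such that we can obtain $w \in W^{1,p}_0(\Omega)$ piecewise affine such that

$$\|u - w\|_{1,p} < \delta \tag{8.152}$$

so that

$$\nabla w - \nabla u \in \frac{1}{2} \mathcal{V}, \tag{8.153}$$

and

$$\left| \int_\Omega f^{**}(x, \nabla w(x)) \, dx - \int_\Omega f^{**}(x, \nabla u(x)) \, dx \right| < \frac{\varepsilon}{2}. \tag{8.154}$$

From Proposition 8.3.5 we may obtain $v \in W^{1,p}_0(\Omega)$ such that

$$\nabla v - \nabla w \in \frac{1}{2} \mathcal{V}, \tag{8.155}$$

and

$$\left| \int_\Omega f^{**}(x, \nabla w(x)) \, dx - \int_\Omega f(x, \nabla v(x)) \, dx \right| < \frac{\varepsilon}{2}. \tag{8.156}$$

From (8.154) and (8.156),

$$\left| \int_\Omega f^{**}(x, \nabla u(x)) \, dx - \int_\Omega f(x, \nabla v(x)) \, dx \right| < \varepsilon. \tag{8.157}$$

Finally, from (8.153), (8.155) and from the fact that weak neighborhoods are convex, we have

$$\nabla v - \nabla u \in \mathcal{V}. \tag{8.158}$$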

To finish this chapter, we present two theorems which summarize the last results.

Theorem 8.3.7. Let $f$ be a Caratheodory function from $\Omega \times \mathbb{R}^N$ into $\mathbb{R}$ which satisfies

$$a_2(x) + c_2 |\xi|^p \leq f(x, \xi) \leq a_1(x) + c_1 |\xi|^p, \tag{8.159}$$

where $a_1, a_2 \in L^1(\Omega)$, $1 < p < +\infty$, $b \geq 0$ and $c_1 \geq c_2 > 0$. Under such assumptions, defining $U = W^{1,p}_0(\Omega)$, we have

$$\inf_{u \in U} \left\{ \int_\Omega f(x, \nabla u) \, dx \right\} = \min_{u \in U} \left\{ \int_\Omega f^{**}(x, \nabla u) \, dx \right\}. \tag{8.160}$$

The solutions of the relaxed problem are weak cluster points in $W^{1,p}_0(\Omega)$ of the minimizing sequences of the primal problem.

Proof. The existence of solutions for the convex relaxed formulation is a consequence of the reflexivity of $U$ and the coercivity hypothesis, which allows an application of the direct method of the calculus of variations. That is, considering a minimizing sequence, from the coercivity hypothesis above, such a sequence is bounded and has a subsequence weakly convergent to some $u \in W^{1,p}(\Omega)$. Finally, from the lower semi-continuity of the relaxed formulation, we may conclude that $u$ is a minimizer. The relation (8.160) follows from the last theorem.
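A standard one-dimensional example illustrates the statement. The data are assumptions made here for illustration: $\Omega = (0, 1)$, $N = 1$, $p = 4$ and $f(\xi) = (\xi^2 - 1)^2$, which satisfies the growth condition with $c_1 = 1$, $a_1 = 1$, $c_2 = 1/2$, $a_2 = -1$.

```latex
% Worked example (assumed): double-well integrand on (0,1), zero boundary data.
\begin{align*}
f^{**}(\xi) &= \begin{cases} 0, & |\xi| \le 1, \\ (\xi^2 - 1)^2, & |\xi| > 1, \end{cases} \\
u_n(x) &= \tfrac{1}{n}\, \operatorname{dist}(n x, \mathbb{Z}), \qquad u_n' = \pm 1 \ \text{a.e.}, \qquad
\|u_n\|_\infty \le \tfrac{1}{2n}, \qquad \int_0^1 f(u_n') \, dx = 0, \\
u_n &\rightharpoonup 0 \ \text{weakly in } W^{1,4}_0(0,1), \qquad
\int_0^1 f^{**}(0) \, dx = 0 = \inf_{u \in U} \int_0^1 f(u') \, dx, \qquad
\int_0^1 f(0) \, dx = 1 .
\end{align*}
```

The weak limit $u = 0$ of the minimizing sequence $\{u_n\}$ minimizes the relaxed functional, although it is not a minimizer of the original one.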

Theorem 8.3.8. Let $f$ be a Caratheodory function from $\Omega \times \mathbb{R}^N$ into $\mathbb{R}$ which satisfies

$$a_2(x) + c_2 |\xi|^p \leq f(x, \xi) \leq a_1(x) + c_1 |\xi|^p, \tag{8.161}$$

where $a_1, a_2 \in L^1(\Omega)$, $1 < p < +\infty$, $b \geq 0$ and $c_1 \geq c_2 > 0$. Let $u_0 \in W^{1,p}(\Omega)$. Under such assumptions, defining $U = \{u \mid u - u_0 \in W^{1,p}_0(\Omega)\}$, we have

$$\inf_{u \in U} \left\{ \int_\Omega f(x, \nabla u) \, dx \right\} = \min_{u \in U} \left\{ \int_\Omega f^{**}(x, \nabla u) \, dx \right\}. \tag{8.162}$$

The solutions of the relaxed problem are weak cluster points in $W^{1,p}(\Omega)$ of the minimizing sequences of the primal problem.

Proof. Just apply the last theorem to the integrand $g(x, \xi) = f(x, \xi + \nabla u_0)$. For details see [17].

8.4. Duality Suitable for the Vectorial Case

Definition 8.4.1 (A Cone and its Partial Order Relation). Let $U$ be a Banach space and $m > 0$. We define $C(m)$ as

$$C(m) = \{(u, a) \in U \times \mathbb{R} \mid a + m \|u\|_U \leq 0\}. \tag{8.163}$$

Also, we define an order relation for the cone $C(m)$, namely

$$(u, a) \leq (v, b) \Leftrightarrow (v - u, b - a) \in C(m). \tag{8.164}$$

Proposition 8.4.2. Let $S \subset U \times \mathbb{R}$ be a closed set such that

$$\inf\{a \mid (u, a) \in S\} > -\infty. \tag{8.165}$$

Then $S$ has a maximal element under the order relation of the last definition.

For a proof see [17], page 28.

The next result is particularly relevant for non-convex functionals.

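Theorem 8.4.3. Let $F : U \to \mathbb{R}$ be a lower semi-continuous functional such that $-\infty < \inf_{u \in U} \{F(u)\} < +\infty$. Given $\varepsilon > 0$, suppose $u \in U$ is such that

$$F(u) \leq \inf_{u \in U} \{F(u)\} + \varepsilon; \tag{8.166}$$

then, for each $\lambda > 0$, there exists $u_\lambda \in U$ such that

$$\|u - u_\lambda\|_U \leq \lambda \quad \text{and} \quad F(u_\lambda) \leq F(u). \tag{8.167}$$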

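Proof. We will apply the last proposition to $S = Epi(F)$, which is a closed set. For the order relation associated with $C(\varepsilon/\lambda)$, there exists a maximal element, which we denote by $(u_\lambda, a_\lambda)$. Thus $(u_\lambda, a_\lambda) \geq (u, F(u))$. Since $(u_\lambda, a_\lambda)$ is maximal, we have $a_\lambda = F(u_\lambda)$. Also observe that

$$(u, F(u)) \leq (u_\lambda, F(u_\lambda)), \tag{8.168}$$

so that

$$\frac{\varepsilon}{\lambda} \|u - u_\lambda\|_U \leq F(u) - F(u_\lambda). \tag{8.169}$$

Therefore

$$\inf_{u \in U} \{F(u)\} \leq F(u_\lambda) \leq F(u) - \frac{\varepsilon}{\lambda} \|u - u_\lambda\|_U \leq F(u) \leq \inf_{u \in U} \{F(u)\} + \varepsilon,$$

so that

$$0 \leq F(u) - F(u_\lambda) \leq \varepsilon, \tag{8.170}$$

and therefore, from (8.169),

$$\|u - u_\lambda\|_U \leq \lambda. \tag{8.171}$$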

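Remark 8.4.4. Observe that

$$F(u_\lambda) - \frac{\varepsilon}{\lambda} t \|v\|_U \leq F(u_\lambda + t v), \quad \forall t \in [0, 1], \ v \in U, \tag{8.172}$$

so that, if $F$ is Gateaux differentiable, we obtain

$$-\frac{\varepsilon}{\lambda} \|v\|_U \leq \langle \delta F(u_\lambda), v \rangle_U. \tag{8.173}$$

Thus

$$\|\delta F(u_\lambda)\|_{U^*} \leq \varepsilon / \lambda. \tag{8.174}$$

Now, for $\lambda = \sqrt{\varepsilon}$ we obtain the following result.

Theorem 8.4.5. Let $F : U \to \mathbb{R}$ be a Gateaux differentiable functional. Given $\varepsilon > 0$, suppose that $u \in U$ is such that

$$F(u) \leq \inf_{u \in U} \{F(u)\} + \varepsilon. \tag{8.175}$$

Then there exists $v \in U$ such that

$$F(v) \leq F(u), \tag{8.176}$$

$$\|u - v\|_U \leq \sqrt{\varepsilon}, \tag{8.177}$$

and

$$\|\delta F(v)\|_{U^*} \leq \sqrt{\varepsilon}. \tag{8.178}$$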
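The theorem is of interest precisely when no minimizer exists. A minimal example, assumed here for illustration and not taken from the text, is $U = \mathbb{R}$ and $F(u) = e^{u}$, whose infimum $0$ is not attained and which has no critical point.

```latex
% Worked example (assumed): F(u) = e^u on U = \mathbb{R}, with 0 < \varepsilon \le 1.
\begin{align*}
u = \ln \varepsilon &\ \Longrightarrow\ F(u) = \varepsilon \le \inf_{w \in \mathbb{R}} F(w) + \varepsilon, \\
v = u &\ \Longrightarrow\ F(v) \le F(u), \qquad |u - v| = 0 \le \sqrt{\varepsilon}, \qquad
|\delta F(v)| = e^{\ln \varepsilon} = \varepsilon \le \sqrt{\varepsilon} .
\end{align*}
```

So almost-minimizers with almost-vanishing derivative exist for every $\varepsilon \in (0, 1]$, even though $\delta F$ never vanishes.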

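The next theorem follows easily from the above results.

Theorem 8.4.6. Let $J : U \to \mathbb{R}$ be defined by

$$J(u) = G(\nabla u) - \langle f, u \rangle_{L^2(S; \mathbb{R}^N)}, \tag{8.179}$$

where

$$U = W^{1,2}_0(S; \mathbb{R}^N). \tag{8.180}$$

We suppose $G$ is Gateaux differentiable and $J$ is bounded from below. Then, given $\varepsilon > 0$, there exists $u_\varepsilon \in U$ such that

$$J(u_\varepsilon) - \inf_{u \in U} \{J(u)\} < \varepsilon, \tag{8.181}$$

and

$$\|\delta J(u_\varepsilon)\|_{U^*} < \sqrt{\varepsilon}. \tag{8.182}$$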

We finish this chapter with the most important result we have obtained for vectorial problems in the calculus of variations, namely:

Theorem 8.4.7. Let $U$ be a reflexive Banach space. Consider $(G \circ \Lambda) : U \to \mathbb{R}$ and $(F \circ \Lambda_1) : U \to \mathbb{R}$ l.s.c. functionals such that $J : U \to \mathbb{R}$ defined as

$$J(u) = (G \circ \Lambda)(u) - (F \circ \Lambda_1)(u) - \langle u, f \rangle_U$$

is bounded below. (Here $\Lambda : U \to Y$ and $\Lambda_1 : U \to Y_1$ are continuous linear operators whose adjoint operators are denoted by $\Lambda^* : Y^* \to U^*$ and $\Lambda^*_1 : Y^*_1 \to U^*$, respectively.) Also we suppose the existence of a continuous linear operator $L : Y_1 \to Y$ such that $L^*$ is onto and

$$\Lambda(u) = L(\Lambda_1(u)), \quad \forall u \in U.$$

Under such assumptions, we have

$$\inf_{u \in U} \{J(u)\} \geq \sup_{v^* \in A^*} \left\{ \inf_{z^* \in Y^*_1} \{F^*(L^* z^*) - G^*(v^* + z^*)\} \right\},$$

where

$$A^* = \{v^* \in Y^* \mid \Lambda^* v^* = f\}.$$

In addition we assume $(F \circ \Lambda_1) : U \to \mathbb{R}$ is convex and Gateaux differentiable, and suppose there exists a solution $(v^*_0, z^*_0)$ of the dual formulation, so that

$$L\left( \frac{\partial F^*(L^* z^*_0)}{\partial v^*} \right) \in \partial G^*(v^*_0 + z^*_0),$$

$$\Lambda^* v^*_0 - f = 0.$$

Suppose $u_0 \in U$ is such that

$$\frac{\partial F^*(L^* z^*_0)}{\partial v^*} = \Lambda_1 u_0,$$

so that

$$\Lambda u_0 \in \partial G^*(v^*_0 + z^*_0).$$

Also we assume that there exists a sequence $\{u_n\} \subset U$ such that $u_n \rightharpoonup u_0$ weakly in $U$ and

$$G(\Lambda u_n) \to G^{**}(\Lambda u_0) \text{ as } n \to \infty.$$

Under these additional assumptions we have

$$\inf_{u \in U} \{J(u)\} = \max_{v^* \in A^*} \left\{ \inf_{z^* \in Y^*_1} \{F^*(L^* z^*) - G^*(v^* + z^*)\} \right\} = F^*(L^* z^*_0) - G^*(v^*_0 + z^*_0).$$

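Proof. Observe that

$$G^*(v^* + z^*) \geq \langle \Lambda u, v^* \rangle_Y + \langle \Lambda u, z^* \rangle_Y - G(\Lambda u), \quad \forall u \in U,$$

that is,

$$-F^*(L^* z^*) + G^*(v^* + z^*) \geq \langle u, f \rangle_U - F^*(L^* z^*) + \langle \Lambda_1 u, L^* z^* \rangle_{Y_1} - G(\Lambda u), \quad \forall u \in U, \ v^* \in A^*,$$

so that

$$\sup_{z^* \in Y^*_1} \{-F^*(L^* z^*) + G^*(v^* + z^*)\} \geq \sup_{z^* \in Y^*_1} \{\langle u, f \rangle_U - F^*(L^* z^*) + \langle \Lambda_1 u, L^* z^* \rangle_{Y_1} - G(\Lambda u)\},$$

$\forall v^* \in A^*$, $u \in U$, and therefore

$$G(\Lambda u) - F(\Lambda_1 u) - \langle u, f \rangle_U \geq \inf_{z^* \in Y^*_1} \{F^*(L^* z^*) - G^*(v^* + z^*)\}, \quad \forall v^* \in A^*, \ u \in U,$$

which means

$$\inf_{u \in U} \{J(u)\} \geq \sup_{v^* \in A^*} \left\{ \inf_{z^* \in Y^*_1} \{F^*(L^* z^*) - G^*(v^* + z^*)\} \right\},$$

where

$$A^* = \{v^* \in Y^* \mid \Lambda^* v^* = f\}.$$

Now suppose

$$L\left( \frac{\partial F^*(L^* z^*_0)}{\partial v^*} \right) \in \partial G^*(v^*_0 + z^*_0),$$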


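and $u_0 \in U$ is such that

$$\frac{\partial F^*(L^* z^*_0)}{\partial v^*} = \Lambda_1 u_0.$$

Observe that

$$\Lambda u_0 = L(\Lambda_1 u_0) \in \partial G^*(v^*_0 + z^*_0)$$

implies that

$$G^*(v^*_0 + z^*_0) = \langle \Lambda u_0, v^*_0 \rangle_Y + \langle \Lambda u_0, z^*_0 \rangle_Y - G^{**}(\Lambda u_0).$$

From the hypothesis,

$$u_n \rightharpoonup u_0 \text{ weakly in } U$$

and

$$G(\Lambda u_n) \to G^{**}(\Lambda u_0) \text{ as } n \to \infty.$$

Thus, given $\varepsilon > 0$, there exists $n_0 \in \mathbb{N}$ such that if $n \geq n_0$ then

$$G^*(v^*_0 + z^*_0) - \langle \Lambda u_n, v^*_0 \rangle_Y - \langle \Lambda u_n, z^*_0 \rangle_Y + G(\Lambda u_n) < \varepsilon / 2.$$

On the other hand, since $F(\Lambda_1 u)$ is convex and l.s.c., we have

$$\limsup_{n \to \infty} \{-F(\Lambda_1 u_n)\} \leq -F(\Lambda_1 u_0).$$

Hence, there exists $n_1 \in \mathbb{N}$ such that if $n \geq n_1$ then

$$\langle \Lambda u_n, z^*_0 \rangle_Y - F(\Lambda_1 u_n) \leq \langle \Lambda u_0, z^*_0 \rangle_Y - F(\Lambda_1 u_0) + \frac{\varepsilon}{2} = F^*(L^* z^*_0) + \frac{\varepsilon}{2},$$

so that for all $n \geq \max\{n_0, n_1\}$ we obtain

$$G^*(v^*_0 + z^*_0) - F^*(L^* z^*_0) - \langle u_n, f \rangle_U - F(\Lambda_1 u_n) + G(\Lambda u_n) < \varepsilon.$$

Since $\varepsilon$ is arbitrary, the proof is complete.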

CHAPTER 9

Constrained Variational Optimization

9.1. Basic Concepts

For this chapter the most relevant reference is the excellent book of Luenberger [27], where more details may be found. We start with the definition of a cone:

Definition 9.1.1 (Cone). Given $U$ a Banach space, we say that $C \subset U$ is a cone with vertex at the origin if, given $u \in C$, we have that $\lambda u \in C$, $\forall \lambda \geq 0$. By analogy we define a cone with vertex at $p \in U$ as $P = p + C$, where $C$ is any cone with vertex at the origin.

Definition 9.1.2. Let $P$ be a convex cone in $U$. For $u, v \in U$ we write $u \geq v$ (with respect to $P$) if $u - v \in P$. In particular $u \geq \theta$ if and only if $u \in P$. Also

$$P^+ = \{u^* \in U^* \mid \langle u, u^* \rangle_U \geq 0, \ \forall u \in P\}. \tag{9.1}$$

If $u^* \in P^+$ we write $u^* \geq \theta^*$.

Proposition 9.1.3. Let $U$ be a Banach space and $P$ be a closed cone in $U$. If $u \in U$ satisfies $\langle u, u^* \rangle_U \geq 0$, $\forall u^* \geq \theta^*$, then $u \geq \theta$.

Proof. We prove the contrapositive. Assume $u \notin P$. Then by the separating hyperplane theorem there is a $u^* \in U^*$ such that $\langle u, u^* \rangle_U < \langle p, u^* \rangle_U$, $\forall p \in P$. Since $P$ is a cone we must have $\langle p, u^* \rangle_U \geq 0$; otherwise we would have $\langle u, u^* \rangle_U > \langle \alpha p, u^* \rangle_U$ for some $\alpha > 0$. Thus $u^* \in P^+$. Finally, since $\inf_{p \in P} \{\langle p, u^* \rangle_U\} = 0$, we obtain $\langle u, u^* \rangle_U < 0$, which completes the proof.

Definition 9.1.4 (Convex Mapping). Let $U, Z$ be vector spaces. Let $P \subset Z$ be a cone. A mapping $G : U \to Z$ is said to be convex if the domain of $G$ is convex and

$$G(\alpha u_1 + (1 - \alpha) u_2) \leq \alpha G(u_1) + (1 - \alpha) G(u_2), \quad \forall u_1, u_2 \in U, \ \alpha \in [0, 1]. \tag{9.2}$$

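Consider the problem $\mathcal{P}$, defined as

Problem $\mathcal{P}$ : Minimize $F : U \to \mathbb{R}$ subject to $u \in \Omega$ and $G(u) \leq \theta$.

Define

$$\omega(z) = \inf\{F(u) \mid u \in \Omega \text{ and } G(u) \leq z\}. \tag{9.3}$$

For such a functional we have the following result.

Proposition 9.1.5. If $F$ is a real convex functional and $G$ is convex, then $\omega$ is convex.

Proof. Observe that

$$\omega(\alpha z_1 + (1 - \alpha) z_2) = \inf\{F(u) \mid u \in \Omega \text{ and } G(u) \leq \alpha z_1 + (1 - \alpha) z_2\} \tag{9.4}$$

$$\leq \inf\{F(u) \mid u = \alpha u_1 + (1 - \alpha) u_2, \ u_1, u_2 \in \Omega, \text{ and } G(u_1) \leq z_1, \ G(u_2) \leq z_2\} \tag{9.5}$$

$$\leq \alpha \inf\{F(u_1) \mid u_1 \in \Omega, \ G(u_1) \leq z_1\} + (1 - \alpha) \inf\{F(u_2) \mid u_2 \in \Omega, \ G(u_2) \leq z_2\} \tag{9.6}$$

$$\leq \alpha \omega(z_1) + (1 - \alpha) \omega(z_2). \tag{9.7}$$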

Now we establish the Lagrange multiplier theorem for convex global optimization.

Theorem 9.1.6. Let $U$ be a vector space, $Z$ a Banach space, $\Omega$ a convex subset of $U$, and $P$ a positive cone of $Z$. Assume that $P$ contains an interior point. Let $F$ be a real convex functional on $\Omega$ and $G$ a convex mapping from $\Omega$ into $Z$. Assume the existence of $u_1 \in \Omega$ such that $G(u_1) < \theta$. Defining

$$\mu_0 = \inf_{u \in \Omega} \{F(u) \mid G(u) \leq \theta\}, \tag{9.8}$$

then there exists $z^*_0 \geq \theta$, $z^*_0 \in Z^*$, such that

$$\mu_0 = \inf_{u \in \Omega} \{F(u) + \langle G(u), z^*_0 \rangle_Z\}. \tag{9.9}$$

Furthermore, if the infimum in (9.8) is attained by $u_0 \in U$ such that $G(u_0) \leq \theta$, it is also attained in (9.9) by the same $u_0$, and also $\langle G(u_0), z^*_0 \rangle_Z = 0$. We refer to $z^*_0$ as the Lagrangian Multiplier.

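Proof. Consider the space $W = \mathbb{R} \times Z$ and the sets $A$, $B$ where

$$A = \{(r, z) \in \mathbb{R} \times Z \mid r \geq F(u), \ z \geq G(u) \text{ for some } u \in \Omega\}, \tag{9.10}$$

and

$$B = \{(r, z) \in \mathbb{R} \times Z \mid r \leq \mu_0, \ z \leq \theta\}, \tag{9.11}$$

where $\mu_0 = \inf_{u \in \Omega} \{F(u) \mid G(u) \leq \theta\}$. Since $F$ and $G$ are convex, $A$ and $B$ are convex sets. It is clear that $A$ contains no interior point of $B$, and since $N = -P$ contains an interior point, the set $B$ contains an interior point. Thus, from the separating hyperplane theorem, there is a non-zero element $w^*_0 = (r_0, z^*_0) \in W^*$ such that

$$r_0 r_1 + \langle z_1, z^*_0 \rangle_Z \geq r_0 r_2 + \langle z_2, z^*_0 \rangle_Z, \quad \forall (r_1, z_1) \in A, \ (r_2, z_2) \in B. \tag{9.12}$$

From the nature of $B$ it is clear that $w^*_0 \geq \theta$. That is, $r_0 \geq 0$ and $z^*_0 \geq \theta$. We will show that $r_0 > 0$. The point $(\mu_0, \theta) \in B$, hence

$$r_0 r + \langle z, z^*_0 \rangle_Z \geq r_0 \mu_0, \quad \forall (r, z) \in A. \tag{9.13}$$

If $r_0 = 0$ then $\langle G(u_1), z^*_0 \rangle_Z \geq 0$ and $z^*_0 \neq \theta$. Since $G(u_1) < \theta$ and $z^*_0 \geq \theta$ we have a contradiction. Therefore $r_0 > 0$ and, without loss of generality, we may assume $r_0 = 1$. Since the point $(\mu_0, \theta)$ is arbitrarily close to $A$ and $B$, we have

$$\mu_0 = \inf_{(r, z) \in A} \{r + \langle z, z^*_0 \rangle_Z\} \leq \inf_{u \in \Omega} \{F(u) + \langle G(u), z^*_0 \rangle_Z\} \leq \inf\{F(u) \mid u \in \Omega, \ G(u) \leq \theta\} = \mu_0. \tag{9.14}$$

Also, if there exists $u_0$ such that $G(u_0) \leq \theta$ and $\mu_0 = F(u_0)$, then

$$\mu_0 \leq F(u_0) + \langle G(u_0), z^*_0 \rangle_Z \leq F(u_0) = \mu_0. \tag{9.15}$$

Hence

$$\langle G(u_0), z^*_0 \rangle_Z = 0. \tag{9.16}$$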

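Corollary 9.1.7. Let the hypotheses of the last theorem hold. Suppose

$$F(u_0) = \inf_{u \in \Omega} \{F(u) \mid G(u) \leq \theta\}. \tag{9.17}$$

Then there exists $z^*_0 \geq \theta$ such that the Lagrangian $L : U \times Z^* \to \mathbb{R}$ defined by

$$L(u, z^*) = F(u) + \langle G(u), z^* \rangle_Z \tag{9.18}$$

has a saddle point at $(u_0, z^*_0)$. That is,

$$L(u_0, z^*) \leq L(u_0, z^*_0) \leq L(u, z^*_0), \quad \forall u \in \Omega, \ z^* \geq \theta. \tag{9.19}$$

Proof. For $z^*_0$ obtained in the last theorem, we have

$$L(u_0, z^*_0) \leq L(u, z^*_0), \quad \forall u \in \Omega. \tag{9.20}$$

As $\langle G(u_0), z^*_0 \rangle_Z = 0$, we have

$$L(u_0, z^*) - L(u_0, z^*_0) = \langle G(u_0), z^* \rangle_Z - \langle G(u_0), z^*_0 \rangle_Z = \langle G(u_0), z^* \rangle_Z \leq 0. \tag{9.21}$$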
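The multiplier and the saddle-point property can be checked numerically on a toy problem. The sketch below assumes $U = Z = \mathbb{R}$, $P = [0, \infty)$, $F(u) = u^2$ and $G(u) = 1 - u$ (so the constraint is $u \geq 1$), replaces $\Omega = \mathbb{R}$ by a finite grid, and uses the multiplier $z^*_0 = 2$ found by hand; all of these choices and the function names are illustrative assumptions, not taken from the text.

```python
def F(u):
    """Objective F(u) = u^2 (assumed toy example)."""
    return u * u

def G(u):
    """Constraint mapping; G(u) <= 0 is equivalent to u >= 1."""
    return 1.0 - u

def lagrangian(u, z):
    """L(u, z*) = F(u) + <G(u), z*>, cf. (9.18), in the scalar case."""
    return F(u) + z * G(u)

# Grid standing in for Omega = R; the interval [-3, 3] is an assumption.
grid = [-3.0 + 0.001 * i for i in range(6001)]

# Constrained value mu0 = inf{F(u) | G(u) <= 0}, cf. (9.8).
mu0 = min(F(u) for u in grid if G(u) <= 0.0)

# Multiplier found by hand for this example.
z0 = 2.0
u0 = min(grid, key=lambda u: lagrangian(u, z0))
unconstrained_value = lagrangian(u0, z0)   # approximates the infimum in (9.9)
complementarity = z0 * G(u0)               # approximates <G(u0), z0*>

# Saddle-point check, cf. (9.19), up to a small floating-point tolerance.
left_ok = all(lagrangian(u0, z) <= lagrangian(u0, z0) + 1e-9
              for z in (0.0, 1.0, 5.0))
right_ok = all(lagrangian(u0, z0) <= lagrangian(u, z0) + 1e-9 for u in grid)
```

Up to grid resolution, `mu0` and `unconstrained_value` agree, `complementarity` vanishes, and both saddle-point inequalities hold, which matches (9.9), (9.16) and (9.19) for this example.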

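We now prove two theorems relevant to the development of the subsequent section.

Theorem 9.1.8. Let $F : \Omega \subset U \to \mathbb{R}$ and $G : \Omega \to Z$. Let $P \subset Z$ be a cone. Suppose there exists $(u_0, z_0^*) \in U \times Z^*$, where $z_0^* \geq \theta$ and $u_0 \in \Omega$, such that

\[ F(u_0) + \langle G(u_0), z_0^* \rangle_Z \leq F(u) + \langle G(u), z_0^* \rangle_Z, \quad \forall u \in \Omega. \tag{9.22} \]

Then

\[ F(u_0) + \langle G(u_0), z_0^* \rangle_Z = \inf \{ F(u) \mid u \in \Omega \text{ and } G(u) \leq G(u_0) \}. \tag{9.23} \]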

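Proof. Suppose there is a $u_1 \in \Omega$ such that $F(u_1) < F(u_0)$ and $G(u_1) \leq G(u_0)$. Thus

\[ \langle G(u_1), z_0^* \rangle_Z \leq \langle G(u_0), z_0^* \rangle_Z \tag{9.24} \]

so that

\[ F(u_1) + \langle G(u_1), z_0^* \rangle_Z < F(u_0) + \langle G(u_0), z_0^* \rangle_Z, \tag{9.25} \]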


which contradicts the hypothesis of the theorem.

Theorem 9.1.9. Let $F$ be a convex real functional and $G : \Omega \to Z$ convex, and let $u_0$ and $u_1$ be solutions to the problems $P_0$ and $P_1$ respectively, where

\[ P_0 : \text{minimize } F(u) \text{ subject to } u \in \Omega \text{ and } G(u) \leq z_0, \tag{9.26} \]

and

\[ P_1 : \text{minimize } F(u) \text{ subject to } u \in \Omega \text{ and } G(u) \leq z_1. \tag{9.27} \]

Suppose $z_0^*$ and $z_1^*$ are the Lagrange multipliers related to these problems. Then

\[ \langle z_1 - z_0, z_1^* \rangle_Z \leq F(u_0) - F(u_1) \leq \langle z_1 - z_0, z_0^* \rangle_Z. \tag{9.28} \]

Proof. For $u_0$, $z_0^*$ we have

\[ F(u_0) + \langle G(u_0) - z_0, z_0^* \rangle_Z \leq F(u) + \langle G(u) - z_0, z_0^* \rangle_Z, \quad \forall u \in \Omega, \tag{9.29} \]

and, particularly for $u = u_1$ and considering that $\langle G(u_0) - z_0, z_0^* \rangle_Z = 0$, we obtain

\[ F(u_0) - F(u_1) \leq \langle G(u_1) - z_0, z_0^* \rangle_Z \leq \langle z_1 - z_0, z_0^* \rangle_Z. \tag{9.30} \]

A similar argument applied to $u_1$, $z_1^*$ provides the other inequality.
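The sensitivity bound (9.28) can be checked on a one-dimensional toy problem. The sketch below is only an illustration under assumptions that are not in the text: U = Z = R, Ω = R, F(u) = u² and G(u) = 1 − u. For z < 1 the constrained minimizer is then u = 1 − z, and stationarity of the Lagrangian gives the multiplier z* = 2(1 − z).

```python
def solve(z):
    """Minimize F(u) = u**2 subject to G(u) = 1 - u <= z, assuming z < 1.

    Returns the minimizer u, the optimal value F(u) and the multiplier
    z_star obtained from the stationarity condition 2*u - z_star = 0.
    """
    u = 1.0 - z
    return u, u * u, 2.0 * u


z0, z1 = 0.0, 0.5                 # two right-hand sides of the constraint
u0, F0, zs0 = solve(z0)
u1, F1, zs1 = solve(z1)

lower = (z1 - z0) * zs1           # <z1 - z0, z1*>
gap = F0 - F1                     # F(u0) - F(u1)
upper = (z1 - z0) * zs0           # <z1 - z0, z0*>
print(lower, gap, upper)
```

In this toy case the two multipliers bracket the change of the optimal value when the constraint level moves from z0 to z1, as (9.28) states.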

9.2. Duality

Consider the basic convex programming problem:

Minimize F (u) subject to G(u) ≤ θ, u ∈ Ω, (9.31)

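where $F : U \to \mathbb{R}$ is a convex functional, $G : U \to Z$ is a convex mapping, and $\Omega$ is a convex set. We define $\varphi : Z^* \to \mathbb{R}$ by

\[ \varphi(z^*) = \inf_{u \in \Omega} \{ F(u) + \langle G(u), z^* \rangle_Z \}. \tag{9.32} \]

Proposition 9.2.1. $\varphi$ is concave and

\[ \varphi(z^*) = \inf_{z \in \Gamma} \{ \omega(z) + \langle z, z^* \rangle_Z \}, \tag{9.33} \]

where

\[ \omega(z) = \inf_{u \in \Omega} \{ F(u) \mid G(u) \leq z \}, \tag{9.34} \]

and

\[ \Gamma = \{ z \in Z \mid G(u) \leq z \text{ for some } u \in \Omega \}. \]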

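Proof. Observe that

\[ \varphi(z^*) = \inf_{u \in \Omega} \{ F(u) + \langle G(u), z^* \rangle_Z \} \leq \inf_{u \in \Omega} \{ F(u) + \langle z, z^* \rangle_Z \mid G(u) \leq z \} = \omega(z) + \langle z, z^* \rangle_Z, \quad \forall z^* \geq \theta,\ z \in \Gamma. \tag{9.35} \]

On the other hand, for any $u_1 \in \Omega$, defining $z_1 = G(u_1)$, we obtain

\[ F(u_1) + \langle G(u_1), z^* \rangle_Z \geq \inf_{u \in \Omega} \{ F(u) + \langle z_1, z^* \rangle_Z \mid G(u) \leq z_1 \} = \omega(z_1) + \langle z_1, z^* \rangle_Z, \tag{9.36} \]

so that

\[ \varphi(z^*) \geq \inf_{z \in \Gamma} \{ \omega(z) + \langle z, z^* \rangle_Z \}. \tag{9.37} \]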

Theorem 9.2.2 (Lagrange Duality). Consider $F : \Omega \subset U \to \mathbb{R}$ a convex functional, $\Omega$ a convex set, and $G : U \to Z$ a convex mapping. Suppose there exists a $u_1$ such that $G(u_1) < \theta$ and that $\inf_{u \in \Omega} \{ F(u) \mid G(u) \leq \theta \} < \infty$. Under such assumptions, we have

\[ \inf_{u \in \Omega} \{ F(u) \mid G(u) \leq \theta \} = \max_{z^* \geq \theta} \{ \varphi(z^*) \}. \tag{9.38} \]

If the infimum on the left side of (9.38) is achieved at some $u_0 \in U$ and the max on the right side at $z_0^* \in Z^*$, then

\[ \langle G(u_0), z_0^* \rangle_Z = 0 \tag{9.39} \]

and $u_0$ minimizes $F(u) + \langle G(u), z_0^* \rangle_Z$ on $\Omega$.

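Proof. For $z^* \geq \theta$ we have

\[ \inf_{u \in \Omega} \{ F(u) + \langle G(u), z^* \rangle_Z \} \leq \inf_{u \in \Omega,\, G(u) \leq \theta} \{ F(u) + \langle G(u), z^* \rangle_Z \} \leq \inf_{u \in \Omega,\, G(u) \leq \theta} F(u) \leq \mu_0, \tag{9.40} \]

or

\[ \varphi(z^*) \leq \mu_0. \tag{9.41} \]

The result follows from Theorem 9.1.6.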
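A small numerical illustration of (9.38) and (9.39) may help fix ideas. It is a sketch under assumptions chosen here for simplicity and not taken from the text: U = Z = R, Ω = R, F(u) = u² and G(u) = 1 − u, so that u1 = 2 satisfies G(u1) < 0. The inner infimum defining φ is attained at u = z/2, and the dual maximum is located by a coarse grid search.

```python
def F(u):
    return u * u


def G(u):
    return 1.0 - u


def phi(z):
    """Dual function phi(z) = inf_u { F(u) + z*G(u) } on Omega = R.

    The infimum is attained at u = z/2, which gives phi(z) = z - z**2/4.
    """
    u = z / 2.0
    return F(u) + z * G(u)


# Grid search for the dual maximum over z >= 0.
grid = [k / 100.0 for k in range(0, 501)]
z_best = max(grid, key=phi)

u0 = 1.0                  # primal minimizer: smallest |u| with 1 - u <= 0
mu0 = F(u0)               # primal optimal value
slack = G(u0) * z_best    # complementarity term <G(u0), z0*>
print(z_best, phi(z_best), mu0, slack)
```

Here the dual maximum coincides with the primal infimum and the complementarity condition (9.39) holds at the pair (u0, z_best).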


9.3. The Lagrange Multiplier Theorem

Definition 9.3.1. Let $T : D \to V$ be a continuously Fréchet differentiable operator, where $D \subset U$ is open and $U$, $V$ are Banach spaces. We say that $u_0 \in D$ is a regular point of $T$ if $T'(u_0)$ maps $U$ onto $V$.

Theorem 9.3.2 (Generalized Inverse Function Theorem). Let $u_0$ be a regular point of $T : U \to V$, where $U$ and $V$ are Banach spaces. Defining $v_0 = T(u_0)$, there exist a neighborhood $B_r^V(v_0)$ and a constant $K > 0$ such that the equation

\[ v = T(u) \]

has a solution for each $v \in B_r^V(v_0)$, and such a solution satisfies

\[ \|u - u_0\|_U \leq K \|v - v_0\|_V. \]

Proof. Define $L_0 = N(T'(u_0))$. Observe that $U/L_0$ is a Banach space, for which we may define

\[ A : U/L_0 \to V \]

by $A(u) = T'(u_0)u$.

Since $T'(u_0)$ is onto, so is $A$, so that by the inverse mapping theorem, $A$ has a continuous inverse $A^{-1}$. Choose $\varepsilon > 0$ such that

\[ \varepsilon < \frac{1}{4\|A^{-1}\|}. \]

Choose $r > 0$ such that if $\|u - u_0\|_U < r$ then $\|T'(u) - T'(u_0)\| < \varepsilon$. Let $v \in B_{r'}(v_0)$ where $r' = \frac{r}{4\|A^{-1}\|}$. Define $g_0 = \theta \in L_0$ and also define

\[ L_1 = A^{-1}[v - T(u_0 - g_0)]. \]

Choose $g_1 \in L_1$ such that

\[ \|g_1 - g_0\|_U \leq 2\|L_1 - L_0\|. \]

This is possible since

\[ \|L_1 - L_0\| = \inf_{g \in L_1} \{ \|g - g_0\|_U \}. \]

Therefore $L_1 = A^{-1}[v - v_0]$, so that $\|L_1\| \leq \|A^{-1}\| \|v - v_0\|_V$, and hence

\[ \|g_1\|_U \leq 2\|L_1\| \leq 2\|A^{-1}\| \|v - v_0\|_V \leq 2\|A^{-1}\| r' \leq \frac{r}{2}. \tag{9.42} \]

Now reasoning by induction, for n ≥ 2, assume that

‖gn−1‖U < r and ‖gn−2‖U < r,

define Ln by

Ln − Ln−1 = A−1(v − T (u0 + gn−1)), (9.43)

and choose $g_n \in L_n$ such that

\[ \|g_n - g_{n-1}\|_U \leq 2\|L_n - L_{n-1}\|. \]

This is possible since

\[ \|L_n - L_{n-1}\| = \inf_{g \in L_n} \{ \|g - g_{n-1}\|_U \}. \]

Observe that we may write

Ln−1 = gn−1 = A−1[A[gn−1]] = A−1[T ′(u0)gn−1],

and thus

Ln = A−1[v − T (u0 + gn−1) + T ′(u0)gn−1],

and

Ln−1 = A−1[v − T (u0 + gn−2) + T ′(u0)gn−2],

so that

Ln−Ln−1 = −A−1[T (u0+gn−1)−T (u0+gn−2)−T ′(u0)(gn−1−gn−2)].

Define

gt = tgn−1 + (1 − t)gn−2.

By the generalized mean value inequality, we obtain

\[ \|L_n - L_{n-1}\| \leq \|A^{-1}\| \|g_{n-1} - g_{n-2}\|_U \sup_{t \in (0,1)} \|T'(u_0 + g_t) - T'(u_0)\|, \]

where $\|g_t\|_U < r$. Hence

\[ \|L_n - L_{n-1}\| \leq \varepsilon \|A^{-1}\| \|g_{n-1} - g_{n-2}\|_U, \]


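so that

\[ \|g_n - g_{n-1}\|_U \leq 2\|L_n - L_{n-1}\| \leq 2\varepsilon \|A^{-1}\| \|g_{n-1} - g_{n-2}\|_U \leq \frac{1}{2} \|g_{n-1} - g_{n-2}\|_U. \tag{9.44} \]

Thus

\[ \|g_n\|_U = \|g_n - g_{n-1} + g_{n-1} - g_{n-2} + g_{n-2} - g_{n-3} + \dots - g_0\|_U \leq \|g_1\|_U \left( 1 + \frac{1}{2} + \dots + \frac{1}{2^n} \right) < 2\|g_1\|_U \leq r. \tag{9.45} \]

Therefore

\[ \|g_n\|_U < r, \quad \forall n \in \mathbb{N}, \]

and

\[ \|g_n - g_{n-1}\|_U \leq \frac{1}{2} \|g_{n-1} - g_{n-2}\|_U, \quad \forall n \in \mathbb{N}. \]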

Finally, since $\{g_n\}$ is a Cauchy sequence and $U$ is a Banach space, there exists $g \in U$ such that $g_n \to g$ as $n \to \infty$, and thus

\[ L_n \to L = g, \quad \text{as } n \to \infty. \]

From this and (9.43) we get

\[ \theta = L - L = A^{-1}[v - T(u_0 + g)], \]

and since $A^{-1}$ is a bijection, we obtain

\[ v = T(u_0 + g). \]

Furthermore

\[ \|g\|_U \leq 2\|g_1\|_U \leq 4\|A^{-1}\| \|v - v_0\|_V, \]

so that we may choose

\[ K = 4\|A^{-1}\|. \]

This completes the proof.
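The iteration (9.43) freezes the derivative at u0 and corrects the current iterate by the residual. A scalar sketch makes this concrete; it assumes the hypothetical map T(u) = u + u³ with u0 = 0, which is not from the text. In this case T'(u0) = 1 has a trivial null space, so the cosets L_n reduce to the points g_n.

```python
def T(u):
    # Hypothetical smooth map with T'(0) = 1, so u0 = 0 is a regular point.
    return u + u ** 3


u0 = 0.0
v0 = T(u0)      # v0 = 0
A_inv = 1.0     # inverse of T'(u0) = 1 + 3*u0**2 = 1
v = 0.1         # target value close to v0

g = 0.0         # g_0 = theta
for _ in range(50):
    # g_n = g_{n-1} + A^{-1} (v - T(u0 + g_{n-1})), the scalar form of (9.43)
    g = g + A_inv * (v - T(u0 + g))

u = u0 + g
K = 4.0 * abs(A_inv)    # the constant obtained at the end of the proof
print(u, abs(T(u) - v), abs(u - u0) <= K * abs(v - v0))
```

The update contracts because the derivative of T stays close to T'(u0) near u0, which is the role played by ε in the proof.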

Before the final result, we need the lemma:

Lemma 9.3.3. Suppose the functional $F : U \to \mathbb{R}$ achieves a local extremum under the constraint $H(u) = \theta$ at the point $u_0$. Also assume that $F$ and $H$ are continuously Fréchet differentiable in an open set containing $u_0$ and that $u_0$ is a regular point of $H$. Then $\langle h, F'(u_0) \rangle_U = 0$ for all $h$ satisfying $H'(u_0)h = \theta$.

Proof. Without loss of generality, suppose the local extremum is a minimum. Consider the transformation $T : U \to \mathbb{R} \times Z$ defined by $T(u) = (F(u), H(u))$. Suppose there exists an $h$ such that $H'(u_0)h = \theta$ and $F'(u_0)h \neq 0$. Then $T'(u_0) = (F'(u_0), H'(u_0)) : U \to \mathbb{R} \times Z$ is onto, since $H'(u_0)$ is onto $Z$ (and, since $F'(u_0)h \neq 0$, $F'(u_0)$ is onto $\mathbb{R}$). By the inverse function theorem, given $\varepsilon > 0$ there exist $u \in U$ and $\delta > 0$ with $\|u - u_0\|_U < \varepsilon$ such that $T(u) = (F(u_0) - \delta, \theta)$, which contradicts the assumption that $u_0$ is a local minimum.

Theorem 9.3.4 (Lagrange Multiplier). Suppose $F$ is a continuously Fréchet differentiable functional which has a local extremum under the constraint $H(u) = \theta$ at the regular point $u_0$. Then there exists a Lagrange multiplier $z_0^* \in Z^*$ such that the Lagrangian functional

\[ L(u) = F(u) + \langle H(u), z_0^* \rangle_Z \tag{9.46} \]

is stationary at $u_0$, that is, $F'(u_0) + H'(u_0)^* z_0^* = \theta$.

Proof. From the last lemma, $F'(u_0)$ is orthogonal to the null space of $H'(u_0)$. Since the range of $H'(u_0)$ is closed, it follows that

\[ F'(u_0) \in R[(H'(u_0))^*], \tag{9.47} \]

therefore there exists $z_0^* \in Z^*$ such that

\[ F'(u_0) = -H'(u_0)^* z_0^*. \tag{9.48} \]
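The stationarity condition F'(u0) + H'(u0)* z0* = θ can be verified by hand in finite dimensions. The following sketch assumes U = R², Z = R, F(u) = (u1² + u2²)/2 and H(u) = u1 + u2 − 1; these choices are illustrative and do not come from the text. The constrained minimizer is the point of the line closest to the origin.

```python
def grad_F(u):
    # Gradient of F(u) = (u1**2 + u2**2) / 2.
    return (u[0], u[1])


def H(u):
    # Single equality constraint H(u) = u1 + u2 - 1.
    return u[0] + u[1] - 1.0


# H'(u) is the constant row (1, 1); it maps R^2 onto R, so every point is regular.
grad_H = (1.0, 1.0)

u0 = (0.5, 0.5)   # point of the line u1 + u2 = 1 closest to the origin
z0 = -0.5         # multiplier solving F'(u0) + H'(u0)^* z0 = 0

# Residual of the stationarity condition, component by component.
residual = tuple(gF + gH * z0 for gF, gH in zip(grad_F(u0), grad_H))
print(H(u0), residual)
```

With these values the gradient of F at u0 is a multiple of the constraint gradient, so it is orthogonal to the null space of H'(u0), as the proof requires.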

Part 3

Applications

CHAPTER 10

Duality Applied to a Plate Model

10.1. Introduction

The main objective of the present chapter is to develop systematic approaches for obtaining dual variational formulations for systems originally modeled by non-linear differential equations.

Duality for linear systems is well established and is the main subject of classical convex analysis, since in the case of linearity both primal and dual formulations are generally convex. In the case of non-linear differential equations, some complications occur and the standard models of duality for convex analysis must be modified and extended.

In particular, in the case of the Kirchhoff-Love plate model, there is a non-linearity concerning the strain tensor (that is, a geometric non-linearity). Applying the classical results of convex analysis to obtain the complementary formulation is possible only for a special class of external loads. This leads to non-compressed plates; see Telega [40], Gao [22] and other references therein.

We now describe the primal formulation and related duality principles. Consider a plate whose middle surface is represented by an open bounded set $S \subset \mathbb{R}^2$, whose boundary is denoted by $\Gamma$, subjected to a load to be specified. We denote by $u_\alpha : S \to \mathbb{R}$ ($\alpha = 1, 2$) the horizontal displacements and by $w : S \to \mathbb{R}$ the vertical displacement field. The boundary value form of the Kirchhoff-Love model can be expressed by the equations:

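\[ \begin{cases} N_{\alpha\beta,\beta} = 0, \\ Q_{\alpha,\alpha} + M_{\alpha\beta,\alpha\beta} + P = 0, \end{cases} \quad \text{a.e. in } S \tag{10.1} \]

and

\[ \begin{cases} N_{\alpha\beta} n_\beta - P_\alpha = 0, \\ (Q_\alpha + M_{\alpha\beta,\beta}) n_\alpha + \dfrac{\partial (M_{\alpha\beta} t_\alpha n_\beta)}{\partial s} - P = 0, \\ M_{\alpha\beta} n_\alpha n_\beta - M_n = 0, \end{cases} \quad \text{on } \Gamma_t, \tag{10.2} \]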


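where

\[ N_{\alpha\beta} = H_{\alpha\beta\lambda\mu} \gamma_{\lambda\mu}, \qquad M_{\alpha\beta} = h_{\alpha\beta\lambda\mu} \kappa_{\lambda\mu}, \]

and

\[ \gamma_{\alpha\beta}(u) = \frac{1}{2}(u_{\alpha,\beta} + u_{\beta,\alpha} + w_{,\alpha} w_{,\beta}), \qquad \kappa_{\alpha\beta}(u) = -w_{,\alpha\beta}, \]

with the boundary conditions

\[ u_\alpha = w = \frac{\partial w}{\partial n} = 0, \quad \text{on } \Gamma_u. \]

Here, $\{N_{\alpha\beta}\}$ denote the membrane forces, $\{M_{\alpha\beta}\}$ denote the moments and $\{Q_\alpha\} = \{N_{\alpha\beta} w_{,\beta}\}$ stand for functions related to the rotation work of membrane forces; $P \in L^2(S)$ is a field of vertical distributed forces applied on $S$, and $(P_\alpha, P) \in (L^2(\Gamma_t))^3$ denote forces applied to $\Gamma_t$ concerning the horizontal directions defined by $\alpha = 1, 2$ and the vertical direction respectively. $M_n$ are distributed moments also applied to $\Gamma_t$, where $\Gamma$ is such that $\Gamma_u \subset \Gamma$, $\Gamma = \Gamma_u \cup \Gamma_t$ and $\Gamma_u \cap \Gamma_t = \emptyset$. Finally, the matrices $\{H_{\alpha\beta\lambda\mu}\}$ and $\{h_{\alpha\beta\lambda\mu}\}$ are related to the coefficients of Hooke's Law.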

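The primal variational formulation corresponding to this boundary value model is represented by the functional $J : U \to \mathbb{R}$, where

\[ J(u) = \frac{1}{2} \int_S H_{\alpha\beta\lambda\mu} \gamma_{\alpha\beta} \gamma_{\lambda\mu} \, dS + \frac{1}{2} \int_S h_{\alpha\beta\lambda\mu} \kappa_{\alpha\beta} \kappa_{\lambda\mu} \, dS - \int_S P w \, dS - \int_{\Gamma_t} \left( P w + P_\alpha u_\alpha - M_n \frac{\partial w}{\partial n} \right) d\Gamma \]

and

\[ U = \left\{ (u_\alpha, w) \in W^{1,2}(S) \times W^{1,2}(S) \times W^{2,2}(S),\ u_\alpha = w = \frac{\partial w}{\partial n} = 0 \text{ on } \Gamma_u \right\}. \]

The first duality principle presented is the classical one (again we mention the earlier similar results in Telega [40], Gao [22]), and is obtained by a slight modification of Rockafellar's approach to convex analysis. We have developed a proof different from the one found in [40], using the definition of the Legendre Transform and related properties. Such a result may be summarized as

\[ \inf_{u \in U} \{ J(u) \} = \sup_{v^* \in A^* \cap C^*} \{ -G_L^*(v^*) \}. \tag{10.3} \]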


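The dual functional, denoted by $-G_L^* : A^* \cap C^* \to \mathbb{R}$, is expressed as

\[ G_L^*(v^*) = \frac{1}{2} \int_S \bar{H}_{\alpha\beta\lambda\mu} N_{\alpha\beta} N_{\lambda\mu} \, dS + \frac{1}{2} \int_S \bar{h}_{\alpha\beta\lambda\mu} M_{\alpha\beta} M_{\lambda\mu} \, dS + \frac{1}{2} \int_S \bar{N}_{\alpha\beta} Q_\alpha Q_\beta \, dS, \]

where the bars denote inverse matrices, as defined in Section 10.3, $C^*$ is defined by equations (10.1) and (10.2), and

\[ A^* = \{ v^* \in Y^* \mid N_{11} > 0,\ N_{22} > 0, \text{ and } N_{11} N_{22} - N_{12}^2 > 0, \text{ a.e. in } S \}. \tag{10.4} \]

Here $v^* = \{N_{\alpha\beta}, M_{\alpha\beta}, Q_\alpha\} \in Y^* = L^2(S; \mathbb{R}^{10}) \equiv L^2(S)$.

Therefore, since the functional $G_L^*(v^*)$ is convex in $A^*$, the duality is perfect if the optimal solution of the primal formulation satisfies the constraints indicated in (10.4); however, it is important to emphasize that such constraints imply no compression along the plate.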

For the second and third principles, we emphasize that our dual formulations remove or relax the constraints on the external load, and are valid even for compressed plates.

Still for these two principles, we use a theorem (Toland, [42]) which does not require convexity of primal functionals. Such a result can be summarized as:

\[ \inf_{u \in U} \{ G(u) - F(u) \} = \inf_{u^* \in U^*} \{ F^*(u^*) - G^*(u^*) \}. \]

Here $G : U \to \mathbb{R}$ and $F : U \to \mathbb{R}$, and $F^* : U^* \to \mathbb{R}$ and $G^* : U^* \to \mathbb{R}$ denote the primal and dual functionals respectively.

In particular, for the second principle, we modify the above result by applying it to a relation between primal and dual variables that is not one-to-one, obtaining the final duality principle expressed as follows:

\[ \inf_{(u,p) \in U \times Y} \{ J_K(u, p) \} \leq \inf_{(u,v^*) \in U \times Y^*} \{ J_K^*(u, v^*) \} \]

where

\[ J_K(u, p) = G(\Lambda u + p) - F(u) + \frac{K}{2} \langle p, p \rangle_{L^2(S)} \]

and

\[ J_K^*(u, v^*) = F^*(\Lambda^* v^*) - G_L^*(v^*) + K \left\| \Lambda u - \frac{\partial g_L^*(v^*)}{\partial y^*} \right\|_{L^2(S)}^2 + \frac{1}{2K} \langle v^*, v^* \rangle_{L^2(S)}. \]

Here $K \in \mathbb{R}$ is a positive constant and we are particularly concerned with the fact that

\[ J_K(u_K, p_K) \to J(u_0), \quad \text{as } K \to +\infty \]

and

\[ J_K^*(u_K, v_K^*) \to J(u_0), \quad \text{as } K \to +\infty, \]

where

\[ J_K(u_K, p_K) = \inf_{(u,p) \in U \times Y} \{ J_K(u, p) \}, \qquad J_K^*(u_K, v_K^*) = \inf_{(u,v^*) \in U \times Y^*} \{ J_K^*(u, v^*) \} \]

and

\[ J(u_0) = \inf_{u \in U} \{ J(u) = G(\Lambda u) - F(u) \}. \]

We do not prove this convergence in the present work; a more rigorous analysis of the behavior of $u_K$ as $K \to +\infty$ is postponed to future work.

For the third duality principle, the dual variables must satisfy the following constraints:

\[ N_{11} + K > 0,\ N_{22} + K > 0 \text{ and } (N_{11} + K)(N_{22} + K) - N_{12}^2 > 0, \text{ a.e. in } S. \tag{10.5} \]

Such a principle may be summarized by the following result,

\[ \inf_{u \in U} \{ G(\Lambda u) - F(\Lambda_1 u) - \langle u, p \rangle_U \} \leq \inf_{z^* \in Y^*} \left\{ \sup_{v^* \in B^*(z^*)} \{ F^*(z^*) - G_L^*(v^*) \} \right\}, \]

where

\[ B^*(z^*) = \{ v^* \in Y^* \mid \Lambda^* v^* - \Lambda_1^* z^* - p = 0 \}. \]

Therefore the constant $K > 0$ must be chosen so that the optimal point of the primal formulation satisfies the constraints indicated in (10.5). This is because these relations also define an enlarged region in which the analytical expression of the functional $G_L^* : Y^* \to \mathbb{R}$ is convex, so that, in this case, negative membrane forces are allowed.


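In Section 10.7, we present a convex dual variational formulation which may be expressed through the following duality principle:

\[ \inf_{u \in U} \{ J(u) \} = \sup_{(v^*, z^*) \in E^* \cap B^*} \left\{ -G^*(v^*) + \frac{\langle z_\alpha^*, z_\alpha^* \rangle_{L^2(S)}}{2K} \right\} \]

where

\[ G^*(v^*) = G_L^*(v^*) = \frac{1}{2} \int_S \bar{H}_{\alpha\beta\lambda\mu} N_{\alpha\beta} N_{\lambda\mu} \, dS + \frac{1}{2} \int_S \bar{h}_{\alpha\beta\lambda\mu} M_{\alpha\beta} M_{\lambda\mu} \, dS + \frac{1}{2} \int_S N^K_{\alpha\beta} Q_{,\alpha} Q_{,\beta} \, dS \]

if $v^* \in E^*$, where

\[ v^* = \{ N_{\alpha\beta}, M_{\alpha\beta}, Q_\alpha \} \in E^* \Leftrightarrow v^* \in L^2(S, \mathbb{R}^{10}) \]

and

\[ N_{11} + K > 0, \quad N_{22} + K > 0 \quad \text{and} \quad (N_{11} + K)(N_{22} + K) - N_{12}^2 > 0, \text{ a.e. in } S, \]

where

\[ \{ N^K_{\alpha\beta} \} = \begin{pmatrix} N_{11} + K & N_{12} \\ N_{12} & N_{22} + K \end{pmatrix}^{-1} \tag{10.6} \]

and

\[ (v^*, z^*) \in B^* \Leftrightarrow \begin{cases} N_{\alpha\beta,\beta} + P_\alpha = 0, \\ Q_{\alpha,\alpha} + M_{\alpha\beta,\alpha\beta} - z^*_{\alpha,\alpha} + P = 0, \\ h_{1212} M_{12} + z^*_{1,2}/K = 0, \\ z^*_{1,2} = z^*_{2,1}, \text{ a.e. in } S, \text{ and } z^* \cdot n = 0 \text{ on } \Gamma. \end{cases} \]

We are assuming the existence of $u_0 \in U$ such that $\delta J(u_0) = \theta$, and that there exists $K > 0$ for which $N_{11}(u_0) + K > 0$, $N_{22}(u_0) + K > 0$, $(N_{11}(u_0) + K)(N_{22}(u_0) + K) - N_{12}(u_0)^2 > 0$ (a.e. in $S$) and $h_{1212}/(2K_0) > K$, where $K_0$ is the constant related to a Poincaré type inequality and

\[ N_{\alpha\beta}(u_0) = H_{\alpha\beta\lambda\mu} \gamma_{\lambda\mu}(u_0). \]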

Finally, in the last section, we prove a result similar to those obtained through the triality criterion introduced in Gao [24] and establish sufficient conditions for the existence of a minimizer for the primal formulation. Such conditions may be summarized by $\delta J(u_0) = \theta$ and

\[ \frac{1}{2} \int_S N_{\alpha\beta}(u_0) w_{,\alpha} w_{,\beta} \, dS + \frac{1}{2} \int_S h_{\alpha\beta\lambda\mu} w_{,\alpha\beta} w_{,\lambda\mu} \, dS \geq 0, \quad \forall w \in W_0^{2,2}(S). \]

For this last result, our proof is new. The statements of the results themselves follow those of Gao [24].

We are now ready to state the result of Toland [42], through which three duality principles will be constructed.

Theorem 10.1.1. Let $J : U \to \mathbb{R}$ be a functional defined as $J(u) = G(u) - F(u)$, $\forall u \in U$, where there exists $u_0 \in U$ such that $J(u_0) = \inf_{u \in U} \{ J(u) \}$ and $\partial F(u_0) \neq \emptyset$. Then

\[ \inf_{u \in U} \{ G(u) - F(u) \} = \inf_{u^* \in U^*} \{ F^*(u^*) - G^*(u^*) \} \]

and for $u_0^* \in \partial F(u_0)$ we have

\[ F^*(u_0^*) - G^*(u_0^*) = \inf_{u^* \in U^*} \{ F^*(u^*) - G^*(u^*) \}. \]

Furthermore $u_0^* \in \partial G(u_0)$.
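Toland's identity can be observed numerically on a non-convex scalar example. The sketch below assumes U = R, G(u) = u⁴/4 and F(u) = u²/2, which are illustrative choices and not taken from the text; the conjugates are then F*(s) = s²/2 and G*(s) = (3/4)|s|^(4/3). The primal functional is a double well, and both infima are approximated by a grid search.

```python
def J(u):
    # Primal functional G(u) - F(u) with G(u) = u**4/4 and F(u) = u**2/2.
    return u ** 4 / 4.0 - u ** 2 / 2.0


def J_dual(s):
    # Dual functional F*(s) - G*(s) with F*(s) = s**2/2, G*(s) = (3/4)|s|**(4/3).
    return s ** 2 / 2.0 - 0.75 * abs(s) ** (4.0 / 3.0)


# Both functionals grow without bound outside [-3, 3], so the grid captures the infima.
grid = [k / 1000.0 for k in range(-3000, 3001)]
primal = min(J(u) for u in grid)
dual = min(J_dual(s) for s in grid)

u0 = 1.0              # a primal minimizer
u0_star = u0          # element of the subdifferential of F at u0, F'(u) = u
print(primal, dual, J_dual(u0_star), u0 ** 3)
```

In this example the dual infimum is attained at u0* = F'(u0), and the same point equals G'(u0) = u0³, in line with the last two statements of the theorem.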

10.2. The Primal Variational Formulation

Let $S \subset \mathbb{R}^2$ be an open bounded set (with a boundary denoted by $\Gamma$) which represents the middle surface of a plate of thickness $h$. The vectorial basis related to the Cartesian system $\{x_1, x_2, x_3\}$ is denoted by $(a_\alpha, a_3)$, where $\alpha = 1, 2$ (in general Greek indices stand for 1 or 2), $a_3$ denotes the vector normal to $S$, $t$ is the vector tangent to $\Gamma$ and $n$ is the outer normal to $S$. The displacements will be denoted by:

\[ u = \{ u_\alpha, u_3 \} = u_\alpha a_\alpha + u_3 a_3. \]

The Kirchhoff-Love relations are

\[ u_\alpha(x_1, x_2, x_3) = u_\alpha(x_1, x_2) - x_3 w(x_1, x_2)_{,\alpha} \]

and

\[ u_3(x_1, x_2, x_3) = w(x_1, x_2), \]


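where $-h/2 \leq x_3 \leq h/2$, so that we have $u = (u_\alpha, w) \in U$ where

\[ U = \left\{ (u_\alpha, w) \in W^{1,2}(S) \times W^{1,2}(S) \times W^{2,2}(S),\ u_\alpha = w = \frac{\partial w}{\partial n} = 0 \text{ on } \Gamma_u \right\}. \]

We divide the boundary into two parts, so that $\Gamma_u \subset \Gamma$, $\Gamma = \Gamma_u \cup \Gamma_t$ and $\Gamma_u \cap \Gamma_t = \emptyset$. The strain tensors are denoted by

\[ \gamma_{\alpha\beta}(u) = \frac{1}{2} [ \Lambda_{1\alpha\beta}(u) + \Lambda_{2\alpha}(u) \Lambda_{2\beta}(u) ] \tag{10.7} \]

and

\[ \kappa_{\alpha\beta}(u) = \Lambda_{3\alpha\beta}(u), \tag{10.8} \]

where $\Lambda = \{ \Lambda_{1\alpha\beta}, \Lambda_{2\alpha}, \Lambda_{3\alpha\beta} \} : U \to Y = Y^* = L^2(S; \mathbb{R}^{10}) \equiv L^2(S)$ is defined by:

\[ \Lambda_{1\alpha\beta}(u) = u_{\alpha,\beta} + u_{\beta,\alpha}, \tag{10.9} \]

\[ \Lambda_{2\alpha}(u) = w_{,\alpha} \tag{10.10} \]

and

\[ \Lambda_{3\alpha\beta}(u) = -w_{,\alpha\beta}. \tag{10.11} \]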

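The constitutive relations are expressed as

\[ N_{\alpha\beta} = H_{\alpha\beta\lambda\mu} \gamma_{\lambda\mu}, \tag{10.12} \]

\[ M_{\alpha\beta} = h_{\alpha\beta\lambda\mu} \kappa_{\lambda\mu}, \tag{10.13} \]

where $\{H_{\alpha\beta\lambda\mu}\}$ and $\{h_{\alpha\beta\lambda\mu} = \frac{h^2}{12} H_{\alpha\beta\lambda\mu}\}$ are positive definite matrices such that $H_{\alpha\beta\lambda\mu} = H_{\alpha\beta\mu\lambda} = H_{\beta\alpha\lambda\mu} = H_{\beta\alpha\mu\lambda}$. Furthermore, $\{N_{\alpha\beta}\}$ denote the membrane forces and $\{M_{\alpha\beta}\}$ the moments. The plate stored energy, denoted by $(G \circ \Lambda) : U \to \mathbb{R}$, is expressed as

\[ (G \circ \Lambda)(u) = \frac{1}{2} \int_S N_{\alpha\beta} \gamma_{\alpha\beta} \, dS + \frac{1}{2} \int_S M_{\alpha\beta} \kappa_{\alpha\beta} \, dS \tag{10.14} \]

and the external work, denoted by $F : U \to \mathbb{R}$, is given by

\[ F(u) = \int_S P w \, dS + \int_{\Gamma_t} \left( P w + P_\alpha u_\alpha - M_n \frac{\partial w}{\partial n} \right) d\Gamma, \tag{10.15} \]

where $P$ denotes a vertical distributed load applied in $S$ and $P$, $P_\alpha$ are forces applied on $\Gamma_t \subset \Gamma$ related to the directions defined by $a_3$ and $a_\alpha$ respectively, and $M_n$ denote moments also applied on $\Gamma_t$. The potential energy, denoted by $J : U \to \mathbb{R}$, is expressed as:

\[ J(u) = (G \circ \Lambda)(u) - F(u). \]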

It is important to emphasize that conditions for the existence of a minimizer (here denoted by $u_0$) related to $G(\Lambda u) - F(u)$ were presented in Ciarlet [14]. Such $u_0 \in U$ satisfies the equation:

\[ \delta ( G(\Lambda u_0) - F(u_0) ) = \theta \]

and we should expect at least one minimizer if $\|P_\alpha\|_{L^2(\Gamma_t)}$ is small enough and $m(\Gamma_u) > 0$ (where $m$ stands for the Lebesgue measure), with no restrictions concerning the magnitude of $\|P_\alpha\|_{L^2(S)}$ if $m(\Gamma) = m(\Gamma_u)$, so that in the latter case we consider a field of distributed forces $P_\alpha$ applied on $S$.

Some inequalities of Sobolev type are necessary to prove the above result, and in this work we assume some regularity hypotheses concerning $S$ and its boundary. Namely, in addition to $S$ being open and bounded, we also assume it is connected with a Lipschitz continuous boundary $\Gamma$, so that $S$ is locally on one side of $\Gamma$.

The formal proof of existence of a minimizer for $J(u) = G(\Lambda u) - F(u)$ is obtained through the Direct Method of the Calculus of Variations. We do not repeat this procedure here and refer to Ciarlet [14] for details.

10.3. The Legendre Transform

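In this section we determine the Legendre Transform related to the function $g : \mathbb{R}^{10} \to \mathbb{R}$ where:

\[ g(y) = \frac{1}{2} H_{\alpha\beta\lambda\mu} [ (y_{1\alpha\beta} + y_{1\beta\alpha} + y_{2\alpha} y_{2\beta})/2 ] [ (y_{1\lambda\mu} + y_{1\mu\lambda} + y_{2\lambda} y_{2\mu})/2 ] + \frac{1}{2} h_{\alpha\beta\lambda\mu} y_{3\alpha\beta} y_{3\lambda\mu} \tag{10.16} \]

and we recall that

\[ G(\Lambda u) = \int_S g(\Lambda u) \, dS. \]

From Definition 8.1.25 we may write

\[ g_L^*(y^*) = \langle y_0, y^* \rangle_{\mathbb{R}^{10}} - g(y_0), \]

where $y_0$ is the unique solution of the system

\[ y_i^* = \frac{\partial g(y_0)}{\partial y_i}, \]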


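which for the above function $g$ implies:

\[ y^*_{1\alpha\beta} = H_{\alpha\beta\lambda\mu} ( y_{1\lambda\mu} + y_{2\lambda} y_{2\mu}/2 ), \]

\[ y^*_{2\alpha} = H_{\alpha\beta\lambda\mu} ( y_{1\lambda\mu} + y_{2\lambda} y_{2\mu}/2 ) y_{2\beta} = y^*_{1\alpha\beta} y_{2\beta}, \]

and

\[ y^*_{3\alpha\beta} = h_{\alpha\beta\lambda\mu} y_{3\lambda\mu}. \]

Inverting this system we obtain

\[ y_{0_{21}} = ( y^*_{1_{22}} y^*_{2_1} - y^*_{1_{12}} y^*_{2_2} ) / \Delta, \]

\[ y_{0_{22}} = ( -y^*_{1_{12}} y^*_{2_1} + y^*_{1_{11}} y^*_{2_2} ) / \Delta, \]

and

\[ y_{0_{1\alpha\beta}} = \bar{H}_{\alpha\beta\lambda\mu} y^*_{1\lambda\mu} - y_{0_{2\alpha}} y_{0_{2\beta}} / 2, \]

where

\[ \{ \bar{H}_{\alpha\beta\lambda\mu} \} = \{ H_{\alpha\beta\lambda\mu} \}^{-1}, \]

$\Delta = y^*_{1_{11}} y^*_{1_{22}} - (y^*_{1_{12}})^2$ (we recall that $y^*_{1_{12}} = y^*_{1_{21}}$, as a result of the symmetries of $\{H_{\alpha\beta\lambda\mu}\}$).

By analogy,

\[ y_{0_{3\alpha\beta}} = \bar{h}_{\alpha\beta\lambda\mu} y^*_{3\lambda\mu}, \]

where

\[ \{ \bar{h}_{\alpha\beta\lambda\mu} \} = \{ h_{\alpha\beta\lambda\mu} \}^{-1}. \]

Thus we can define the set $\mathbb{R}^n_L$ of Definition 8.1.25 as

\[ \mathbb{R}^n_L = \{ y^* \in \mathbb{R}^{10} \mid \Delta \neq 0 \}. \tag{10.17} \]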

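After some simple algebraic manipulations we obtain the expression for $g_L^* : \mathbb{R}^n_L \to \mathbb{R}$, that is,

\[ g_L^*(y^*) = \frac{1}{2} \bar{H}_{\alpha\beta\lambda\mu} y^*_{1\alpha\beta} y^*_{1\lambda\mu} + \frac{1}{2} \bar{h}_{\alpha\beta\lambda\mu} y^*_{3\alpha\beta} y^*_{3\lambda\mu} + \frac{1}{2} y^*_{1\alpha\beta} y_{0_{2\alpha}} y_{0_{2\beta}}. \tag{10.18} \]

Also from Definition 8.1.25, we have

\[ Y_L^* = \{ v^* \in Y^* = L^2(S; \mathbb{R}^{10}) \equiv L^2(S) \mid v^*(x) \in \mathbb{R}^n_L \text{ a.e. in } S \} \]

so that $G_L^* : Y_L^* \to \mathbb{R}$ may be expressed as

\[ G_L^*(v^*) = \int_S g_L^*(v^*) \, dS. \]

Or, from (10.18),

\[ G_L^*(v^*) = \frac{1}{2} \int_S \bar{H}_{\alpha\beta\lambda\mu} v^*_{1\alpha\beta} v^*_{1\lambda\mu} \, dS + \frac{1}{2} \int_S \bar{h}_{\alpha\beta\lambda\mu} v^*_{3\alpha\beta} v^*_{3\lambda\mu} \, dS + \frac{1}{2} \int_S v^*_{1\alpha\beta} v_{0_{2\alpha}} v_{0_{2\beta}} \, dS. \]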

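Changing the notation, as indicated below,

\[ v^*_{1\alpha\beta} = N_{\alpha\beta}, \quad v^*_{2\alpha} = Q_\alpha = v^*_{1\alpha\beta} v_{0_{2\beta}} = N_{\alpha\beta} v_{0_{2\beta}}, \quad v^*_{3\alpha\beta} = M_{\alpha\beta}, \]

we could express $G_L^* : Y_L^* \to \mathbb{R}$ as

\[ G_L^*(v^*) = \frac{1}{2} \int_S \bar{H}_{\alpha\beta\lambda\mu} N_{\alpha\beta} N_{\lambda\mu} \, dS + \frac{1}{2} \int_S \bar{h}_{\alpha\beta\lambda\mu} M_{\alpha\beta} M_{\lambda\mu} \, dS + \frac{1}{2} \int_S \bar{N}_{\alpha\beta} Q_\alpha Q_\beta \, dS, \]

where

\[ \{ \bar{N}_{\alpha\beta} \} = \{ N_{\alpha\beta} \}^{-1}. \]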

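Remark 10.3.1. We can also use the transformation

\[ Q_\alpha = N_{\alpha\beta} w_{,\beta} \]

and obtain

\[ G_L^*(v^*) = \frac{1}{2} \int_S \bar{H}_{\alpha\beta\lambda\mu} N_{\alpha\beta} N_{\lambda\mu} \, dS + \frac{1}{2} \int_S \bar{h}_{\alpha\beta\lambda\mu} M_{\alpha\beta} M_{\lambda\mu} \, dS + \frac{1}{2} \int_S N_{\alpha\beta} w_{,\alpha} w_{,\beta} \, dS. \]

The term denoted by $G_p : Y^* \times U \to \mathbb{R}$ and expressed as

\[ G_p(v^*, w) = \frac{1}{2} \int_S N_{\alpha\beta} w_{,\alpha} w_{,\beta} \, dS \]

is known as the gap function.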
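The structure of the Legendre transform above can be reproduced in a scalar analogue with one membrane variable and one slope variable. The sketch below is an assumption made for illustration only: it uses g(y1, y2) = (H/2)(y1 + y2²/2)² with a hypothetical constant H = 2 and omits the bending term. It inverts the system y1* = H(y1 + y2²/2), y2* = y1* y2 and compares the result with the closed form y1*²/(2H) + y2*²/(2 y1*), the scalar counterpart of (10.18).

```python
H = 2.0  # hypothetical positive stiffness constant


def g(y1, y2):
    # Scalar analogue of (10.16) without the bending term.
    return 0.5 * H * (y1 + 0.5 * y2 ** 2) ** 2


def grad_g(y1, y2):
    n = H * (y1 + 0.5 * y2 ** 2)    # plays the role of the membrane force
    return n, n * y2                # (y1*, y2*), with y2* = y1* y2


def legendre(s1, s2):
    """Legendre transform of g at (s1, s2), defined when s1 != 0."""
    y2 = s2 / s1                    # inversion of s2 = s1 * y2
    y1 = s1 / H - 0.5 * y2 ** 2     # inversion of s1 = H (y1 + y2**2/2)
    return s1 * y1 + s2 * y2 - g(y1, y2), (y1, y2)


s1, s2 = 1.5, -0.6
value, (y1, y2) = legendre(s1, s2)
closed_form = s1 ** 2 / (2 * H) + s2 ** 2 / (2 * s1)

# For s1 < 0 the supremum defining the polar is +infinity: along
# y1 = -y2**2/2 the energy g vanishes while the pairing grows like -s1*y2**2/2.
unbounded = [(-1.0) * (-0.5 * t ** 2) - g(-0.5 * t ** 2, t) for t in (1.0, 10.0, 100.0)]
print(value, closed_form, grad_g(y1, y2), unbounded)
```

The last lines indicate why a sign condition on the membrane variable is needed before the Legendre transform can be identified with the polar functional, which is the content of the constraint set A* in the next section.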

10.4. The Classical Dual Formulation

In this section we establish the dual variational formulation in the classical sense.

We recall that $J : U \to \mathbb{R}$ is expressed by

\[ J(u) = (G \circ \Lambda)(u) - F(u), \]

where $(G \circ \Lambda) : U \to \mathbb{R}$ and $F : U \to \mathbb{R}$ were defined by equations (10.14) and (10.15) respectively. It is known and easy to see that


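\[ \inf_{u \in U} \{ G(\Lambda u) + F(u) \} \geq \sup_{v^* \in Y^*} \{ -G^*(v^*) - F^*(-\Lambda^* v^*) \}. \tag{10.19} \]

Now we prove a result concerning the representation of the polar functional, namely:

Proposition 10.4.1. Considering the earlier definitions and assumptions on $G : Y \to \mathbb{R}$ (see Section 10.2), expressed by $G(v) = \int_S g(v) \, dS$, where $g : \mathbb{R}^{10} \to \mathbb{R}$ is indicated in (10.16), we have

\[ v^* \in A^* \Rightarrow G^*(v^*) = G_L^*(v^*), \]

where

\[ G_L^*(v^*) = \frac{1}{2} \int_S \bar{H}_{\alpha\beta\lambda\mu} N_{\alpha\beta} N_{\lambda\mu} \, dS + \frac{1}{2} \int_S \bar{h}_{\alpha\beta\lambda\mu} M_{\alpha\beta} M_{\lambda\mu} \, dS + \frac{1}{2} \int_S \bar{N}_{\alpha\beta} Q_\alpha Q_\beta \, dS \]

and

\[ A^* = \{ v^* = \{ N_{\alpha\beta}, M_{\alpha\beta}, Q_\alpha \} \in Y^* \mid N_{11} > 0,\ N_{22} > 0, \text{ and } N_{11} N_{22} - N_{12}^2 > 0, \text{ a.e. in } S \}. \tag{10.20} \]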

Proof. First, consider the quadratic inequality in $x$ indicated below,

\[ a x^2 + b x + c \leq 0, \ \forall x \in \mathbb{R} \ \Leftrightarrow \ (a < 0 \text{ and } b^2 - 4ac \leq 0) \text{ or } (a = 0,\ b = 0 \text{ and } c \leq 0). \tag{10.21} \]

Consider now the inequality

\[ a_1 x^2 + b_1 x y + c_1 y^2 + d_1 x + e_1 y + f_1 \leq 0, \quad \forall (x, y) \in \mathbb{R}^2, \tag{10.22} \]

and the quadratic equation related to the variable $x$, for

\[ a = a_1, \quad b = b_1 y + d_1 \quad \text{and} \quad c = c_1 y^2 + e_1 y + f_1. \]

For $a_1 < 0$, from (10.21) the inequality (10.22) is equivalent to

\[ (b_1^2 - 4 a_1 c_1) y^2 + (2 b_1 d_1 - 4 a_1 e_1) y + d_1^2 - 4 a_1 f_1 \leq 0, \quad \forall y \in \mathbb{R}. \]

Finally, for

\[ a = b_1^2 - 4 a_1 c_1 < 0, \quad b = 2 b_1 d_1 - 4 a_1 e_1 \quad \text{and} \quad c = d_1^2 - 4 a_1 f_1, \]

also from (10.21), the last inequality is equivalent to

\[ -c_1 d_1^2 - a_1 e_1^2 + b_1 d_1 e_1 - (b_1^2 - 4 a_1 c_1) f_1 \leq 0. \tag{10.23} \]
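Condition (10.23) can be cross-checked against a direct computation of the maximum of the quadratic form. The sketch below uses hypothetical coefficient values chosen only for illustration. It relies on the elementary fact that, when a1 < 0 and b1² − 4 a1 c1 < 0, the quadratic in (10.22) is strictly concave, so its maximum is attained at the unique stationary point; the left side of (10.23) then equals (4 a1 c1 − b1²) times that maximum.

```python
def criterion(a1, b1, c1, d1, e1, f1):
    # Left-hand side of (10.23).
    return -c1 * d1 ** 2 - a1 * e1 ** 2 + b1 * d1 * e1 - (b1 ** 2 - 4 * a1 * c1) * f1


def maximum(a1, b1, c1, d1, e1, f1):
    """Maximum of q(x, y) = a1 x^2 + b1 xy + c1 y^2 + d1 x + e1 y + f1.

    Assumes a1 < 0 and b1^2 - 4 a1 c1 < 0, so that q is strictly concave
    and the maximum is attained at the unique stationary point.
    """
    det = 4 * a1 * c1 - b1 ** 2
    x = (b1 * e1 - 2 * c1 * d1) / det   # solves 2 a1 x + b1 y + d1 = 0
    y = (b1 * d1 - 2 * a1 * e1) / det   # and    b1 x + 2 c1 y + e1 = 0
    return a1 * x * x + b1 * x * y + c1 * y * y + d1 * x + e1 * y + f1


# Two hypothetical coefficient sets differing only in the constant term f1.
cases = [(-1.0, 0.5, -2.0, 1.0, -1.0, -1.0), (-1.0, 0.5, -2.0, 1.0, -1.0, 1.0)]
results = []
for coeffs in cases:
    a1, b1, c1 = coeffs[:3]
    det = 4 * a1 * c1 - b1 ** 2
    results.append((criterion(*coeffs), det * maximum(*coeffs)))
print(results)
```

In the first case the quadratic is non-positive everywhere and the criterion is negative; in the second the maximum is positive and so is the criterion.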

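In order to represent the polar functional related to the plate stored energy, we first consider the polar functional related to $g_1(y)$, where

\[ g_1(y) = \frac{1}{2} H_{\alpha\beta\lambda\mu} \left( y_{1\alpha\beta} + \frac{1}{2} y_{2\alpha} y_{2\beta} \right) \left( y_{1\lambda\mu} + \frac{1}{2} y_{2\lambda} y_{2\mu} \right), \]

\[ g(y) = g_1(y) + g_2(y) \]

and

\[ g_2(y) = \frac{1}{2} h_{\alpha\beta\lambda\mu} y_{3\alpha\beta} y_{3\lambda\mu}. \]

In fact we determine a set in which the polar functional is represented by the Legendre Transform $g_{1L}^*(y^*)$, where, from (10.18),

\[ g_{1L}^*(y^*) = \frac{1}{2} \bar{H}_{\alpha\beta\lambda\mu} y^*_{1\alpha\beta} y^*_{1\lambda\mu} + \frac{ y^*_{1_{11}} (y^*_{2_2})^2 - 2 y^*_{1_{12}} y^*_{2_1} y^*_{2_2} + y^*_{1_{22}} (y^*_{2_1})^2 }{ 2 [ y^*_{1_{11}} y^*_{1_{22}} - (y^*_{1_{12}})^2 ] }. \tag{10.24} \]

Thus, since

\[ g_1^*(y^*) = \sup_{y \in \mathbb{R}^6} \{ y^*_{1\alpha\beta} y_{1\alpha\beta} + y^*_{2\alpha} y_{2\alpha} - g_1(y) \}, \]

we can write

\[ g_{1L}^*(y^*) = g_1^*(y^*) \Leftrightarrow g_{1L}^*(y^*) \geq y^*_{1\alpha\beta} y_{1\alpha\beta} + y^*_{2\alpha} y_{2\alpha} - g_1(y), \quad \forall y \in \mathbb{R}^6, \]

or

\[ y^*_{1\alpha\beta} y_{1\alpha\beta} + y^*_{2\alpha} y_{2\alpha} - \frac{1}{2} H_{\alpha\beta\lambda\mu} \left( y_{1\alpha\beta} + \frac{1}{2} y_{2\alpha} y_{2\beta} \right) \left( y_{1\lambda\mu} + \frac{1}{2} y_{2\lambda} y_{2\mu} \right) - g_{1L}^*(y^*) \leq 0, \quad \forall y \in \mathbb{R}^6. \tag{10.25} \]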

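However, considering the transformation

\[ \bar{y}_{1\alpha\beta} = y_{1\alpha\beta} + \frac{1}{2} y_{2\alpha} y_{2\beta}, \qquad y_{1\alpha\beta} = \bar{y}_{1\alpha\beta} - \frac{1}{2} y_{2\alpha} y_{2\beta}, \tag{10.26} \]

and substituting such relations into (10.25), we obtain

\[ g_{1L}^*(y^*) = g_1^*(y^*) \Leftrightarrow y^*_{1\alpha\beta} \left( \bar{y}_{1\alpha\beta} - \frac{1}{2} y_{2\alpha} y_{2\beta} \right) + y^*_{2\alpha} y_{2\alpha} - \frac{1}{2} H_{\alpha\beta\lambda\mu} \bar{y}_{1\alpha\beta} \bar{y}_{1\lambda\mu} - g_{1L}^*(y^*) \leq 0, \quad \forall \{ \bar{y}_{1\alpha\beta}, y_{2\alpha} \} \in \mathbb{R}^6. \tag{10.27} \]

On the other hand, since $\{H_{\alpha\beta\lambda\mu}\}$ is a positive definite matrix we have

\[ \sup_{\{\bar{y}_{1\alpha\beta}\} \in \mathbb{R}^4} \left\{ y^*_{1\alpha\beta} \bar{y}_{1\alpha\beta} - \frac{1}{2} H_{\alpha\beta\lambda\mu} \bar{y}_{1\alpha\beta} \bar{y}_{1\lambda\mu} \right\} = \frac{1}{2} \bar{H}_{\alpha\beta\lambda\mu} y^*_{1\alpha\beta} y^*_{1\lambda\mu}. \tag{10.28} \]

Thus, considering (10.28) and the expression of $g_{1L}^*(y^*)$ indicated in (10.24), inequality (10.27) is satisfied if

\[ -\frac{1}{2} y^*_{1\alpha\beta} y_{2\alpha} y_{2\beta} + y^*_{2\alpha} y_{2\alpha} - \frac{ y^*_{1_{11}} (y^*_{2_2})^2 - 2 y^*_{1_{12}} y^*_{2_1} y^*_{2_2} + y^*_{1_{22}} (y^*_{2_1})^2 }{ 2 [ y^*_{1_{11}} y^*_{1_{22}} - (y^*_{1_{12}})^2 ] } \leq 0, \quad \forall \{ y_{2\alpha} \} \in \mathbb{R}^2. \tag{10.29} \]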

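So, for

\[ a_1 = -\frac{1}{2} y^*_{1_{11}}, \quad b_1 = -y^*_{1_{12}}, \quad c_1 = -\frac{1}{2} y^*_{1_{22}}, \quad d_1 = y^*_{2_1}, \quad e_1 = y^*_{2_2} \]

and

\[ f_1 = -\frac{ y^*_{1_{11}} (y^*_{2_2})^2 - 2 y^*_{1_{12}} y^*_{2_1} y^*_{2_2} + y^*_{1_{22}} (y^*_{2_1})^2 }{ 2 [ y^*_{1_{11}} y^*_{1_{22}} - (y^*_{1_{12}})^2 ] } \]

we obtain

\[ -c_1 d_1^2 - a_1 e_1^2 + b_1 d_1 e_1 - (b_1^2 - 4 a_1 c_1) f_1 = 0. \]

Therefore from (10.23), the inequality (10.25) is satisfied if $a_1 < 0$ ($y^*_{1_{11}} > 0$) and $b_1^2 - 4 a_1 c_1 < 0$ ($y^*_{1_{11}} y^*_{1_{22}} - (y^*_{1_{12}})^2 > 0$, which implies $y^*_{1_{22}} > 0$).

Thus we have shown that

\[ y^* \in A^* \Rightarrow g_1^*(y^*) = g_{1L}^*(y^*), \tag{10.30} \]

where

\[ A^* = \{ y^* \in \mathbb{R}^6 \mid y^*_{1_{11}} > 0,\ y^*_{1_{22}} > 0,\ y^*_{1_{11}} y^*_{1_{22}} - (y^*_{1_{12}})^2 > 0 \}. \]

On the other hand, by analogy to the above results, it can easily be proved that

\[ g_2^*(y^*) = g_{2L}^*(y^*), \quad \forall \{ y^*_{3\alpha\beta} \} \in \mathbb{R}^3, \tag{10.31} \]

where

\[ g_{2L}^*(y^*) = \frac{1}{2} \bar{h}_{\alpha\beta\lambda\mu} y^*_{3\alpha\beta} y^*_{3\lambda\mu} \tag{10.32} \]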

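and

\[ g_2^*(y^*) = \sup_{y \in \mathbb{R}^3} \left\{ y^*_{3\alpha\beta} y_{3\alpha\beta} - \frac{1}{2} h_{\alpha\beta\lambda\mu} y_{3\alpha\beta} y_{3\lambda\mu} \right\}. \]

From (10.30) and (10.31), we can write

\[ \text{if } y^* \in A^* \text{ then } g_1^*(y^*) + g_2^*(y^*) = g_{1L}^*(y^*) + g_{2L}^*(y^*) \leq (g_1 + g_2)^*(y^*). \]

As $(g_1 + g_2)^*(y^*) \leq g_1^*(y^*) + g_2^*(y^*)$ we have

\[ \text{if } y^* \in A^* \text{ then } g_L^*(y^*) = g_{1L}^*(y^*) + g_{2L}^*(y^*) = (g_1 + g_2)^*(y^*) = g^*(y^*). \tag{10.33} \]

However, from Proposition 8.1.24,

\[ G^*(v^*) = \int_S g^*(v^*) \, dS \tag{10.34} \]

so that from (10.33) and (10.34) we obtain

\[ v^* \in A^* \Rightarrow G^*(v^*) = \int_S g_L^*(v^*) \, dS = G_L^*(v^*), \]

where

\[ A^* = \{ v^* \in Y^* \mid v^*(x) \in A^*, \text{ a.e. in } S \}. \]

Alternatively,

\[ A^* = \{ v^* \in Y^* \mid v^*_{1_{11}} > 0,\ v^*_{1_{22}} > 0, \text{ and } v^*_{1_{11}} v^*_{1_{22}} - (v^*_{1_{12}})^2 > 0, \text{ a.e. in } S \}. \tag{10.35} \]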

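Thus, through the notation

\[ v^*_{1\alpha\beta} = N_{\alpha\beta}, \quad v^*_{2\alpha} = Q_\alpha = v^*_{1\alpha\beta} v_{0_{2\beta}} = N_{\alpha\beta} v_{0_{2\beta}}, \quad v^*_{3\alpha\beta} = M_{\alpha\beta}, \]

we have

\[ A^* = \{ v^* = \{ N_{\alpha\beta}, M_{\alpha\beta}, Q_\alpha \} \in Y^* \mid N_{11} > 0,\ N_{22} > 0, \text{ and } N_{11} N_{22} - N_{12}^2 > 0, \text{ a.e. in } S \}. \tag{10.36} \]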

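10.4.1. The Polar Functional Related to F : U → R. We are concerned with the evaluation of the extremum

\[ F^*(-\Lambda^* v^*) = \sup_{u \in U} \{ \langle u, -\Lambda^* v^* \rangle_U - F(u) \}, \]

or

\[ F^*(-\Lambda^* v^*) = \sup_{u \in U} \{ \langle \Lambda u, -v^* \rangle_Y - F(u) \}. \]

Considering

\[ F(u) = -\left( \int_S P w \, dS + \int_{\Gamma_t} \left( P w + P_\alpha u_\alpha - M_n \frac{\partial w}{\partial n} \right) d\Gamma \right) = \langle u, f \rangle_U \tag{10.37} \]

we have

\[ F^*(-\Lambda^* v^*) = \begin{cases} 0, & \text{if } v^* \in C^*, \\ +\infty, & \text{otherwise}, \end{cases} \tag{10.38} \]

where $v^* \in C^* \Leftrightarrow v^* \in Y^*$ and

\[ \begin{cases} v^*_{1\alpha\beta,\beta} = 0, \\ v^*_{2\alpha,\alpha} + v^*_{3\alpha\beta,\alpha\beta} + P = 0, \end{cases} \quad \text{a.e. in } S, \tag{10.39} \]

and

\[ \begin{cases} v^*_{1\alpha\beta} n_\beta - P_\alpha = 0, \\ (v^*_{2\alpha} + v^*_{3\alpha\beta,\beta}) n_\alpha + \dfrac{\partial (v^*_{3\alpha\beta} t_\alpha n_\beta)}{\partial s} - P = 0, \\ v^*_{3\alpha\beta} n_\alpha n_\beta - M_n = 0, \end{cases} \quad \text{on } \Gamma_t. \tag{10.40} \]

Remark 10.4.2. We can also denote

\[ C^* = \{ v^* \in Y^* \mid \Lambda^* v^* = f \}, \tag{10.41} \]

where the relation $\Lambda^* v^* = f$ is defined by (10.39) and (10.40).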

10.4.2. The First Duality Principle. Considering inequality (10.19), the expression of $G^*(v^*)$, and the set $C^*$ defined above, we can write

\[ \inf_{u \in U} \{ (G \circ \Lambda)(u) - F(u) \} \geq \sup_{v^* \in A^* \cap C^*} \{ -G_L^*(v^*) \} \tag{10.42} \]

so that the final form of the concerned duality principle results from the following theorem.

Theorem 10.4.3. Let $(G \circ \Lambda) : U \to \mathbb{R}$ and $F : U \to \mathbb{R}$ be defined by (10.14) and (10.15) respectively (and here we express $F$ as $F(u) = \langle u, f \rangle_U$). If $-G_L^* : Y_L^* \to \mathbb{R}$ attains a local extremum at $v_0^* \in A^*$ under the constraint $\Lambda^* v^* - f = 0$, then

\[ \inf_{u \in U} \{ (G \circ \Lambda)(u) + F(u) \} = \sup_{v^* \in A^* \cap C^*} \{ -G_L^*(v^*) \} \]

and $u_0 \in U$ and $v_0^* \in Y^*$ such that

\[ \delta \{ -G_L^*(v_0^*) + \langle u_0, \Lambda^* v_0^* - f \rangle_U \} = \theta \]

are also such that

\[ J(u_0) = -G_L^*(v_0^*) \quad \text{and} \quad \delta J(u_0) = \theta. \]

The proof of the above theorem is a consequence of the standard necessary conditions for a local extremum of $-G_L^* : Y_L^* \to \mathbb{R}$ under the constraint $\Lambda^* v^* - f = \theta$, the inequality (10.42), and an application of Theorem 8.1.27.

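Therefore, in a more explicit format we would have

\[ \inf_{u \in U} \left\{ \frac{1}{2} \int_S H_{\alpha\beta\lambda\mu} \gamma_{\alpha\beta} \gamma_{\lambda\mu} \, dS + \frac{1}{2} \int_S h_{\alpha\beta\lambda\mu} \kappa_{\alpha\beta} \kappa_{\lambda\mu} \, dS - \left( \int_S P w \, dS + \int_{\Gamma_t} P w \, d\Gamma + \int_{\Gamma_t} P_\alpha u_\alpha \, d\Gamma - \int_{\Gamma_t} M_n \frac{\partial w}{\partial n} \, d\Gamma \right) \right\} \]

\[ = \sup_{v^* \in A^* \cap C^*} \left\{ -\frac{1}{2} \int_S \bar{H}_{\alpha\beta\lambda\mu} N_{\alpha\beta} N_{\lambda\mu} \, dS - \frac{1}{2} \int_S \bar{h}_{\alpha\beta\lambda\mu} M_{\alpha\beta} M_{\lambda\mu} \, dS - \frac{1}{2} \int_S \bar{N}_{\alpha\beta} Q_\alpha Q_\beta \, dS \right\} \tag{10.43} \]

where $v^* \in C^* \Leftrightarrow v^* \in Y^*$ and

\[ \begin{cases} N_{\alpha\beta,\beta} = 0, \\ Q_{\alpha,\alpha} + M_{\alpha\beta,\alpha\beta} + P = 0, \end{cases} \quad \text{a.e. in } S \]

and

\[ \begin{cases} N_{\alpha\beta} n_\beta - P_\alpha = 0, \\ (Q_\alpha + M_{\alpha\beta,\beta}) n_\alpha + \dfrac{\partial (M_{\alpha\beta} t_\alpha n_\beta)}{\partial s} - P = 0, \\ M_{\alpha\beta} n_\alpha n_\beta - M_n = 0, \end{cases} \quad \text{on } \Gamma_t, \]

with the set $A^*$ defined by (10.20) and

\[ \{ \bar{N}_{\alpha\beta} \} = \{ N_{\alpha\beta} \}^{-1}. \]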

10.5. The Second Duality Principle

The next result is an extension of Theorem 10.1.1. Instead of calculating the polar functional related to the main part of the primal formulation, we determine its Legendre Transform and the associated functional.


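Theorem 10.5.1. Consider Gâteaux differentiable functionals $G \circ \Lambda : U \to \mathbb{R}$ and $F \circ \Lambda_1 : U \to \mathbb{R}$, where only the second one is necessarily convex, through which is defined the functional $J_K : U \times Y \to \mathbb{R}$ expressed as

\[ J_K(u, p) = G(\Lambda u + p) + K \langle p, p \rangle_{L^2(S)} - F(\Lambda_1 u) - \frac{K \langle p, p \rangle_{L^2(S)}}{2} - \langle u, u_0^* \rangle_U. \]

Suppose there exists $(u_0, p_0) \in U \times Y$ such that

\[ J_K(u_0, p_0) = \inf_{(u,p) \in U \times Y} \{ J_K(u, p) \} \]

and $\delta J_K(u_0, p_0) = \theta$. Here $\Lambda = \{\Lambda_i\} : U \to Y$ and $\Lambda_1 : U \to Y$ are continuous linear operators whose adjoint operators are denoted by $\Lambda^* : Y^* \to U^*$ and $\Lambda_1^* : Y^* \to U^*$ respectively.

Furthermore, assume there exists a differentiable function $g : \mathbb{R}^n \to \mathbb{R}$ so that $G : Y \to \mathbb{R}$ may be expressed as $G(v) = \int_S g(v) \, dS$, $\forall v \in Y$, where $g$ admits a differentiable Legendre transform denoted by $g_L^* : \mathbb{R}^n_L \to \mathbb{R}$.

Under these assumptions we have

\[ \inf_{(u,p) \in U \times Y} \{ J_K(u, p) \} \leq \inf_{(z^*, v^*, u) \in E^*} \{ J_K^*(z^*, v^*, u) \}, \]

where

\[ J_K^*(z^*, v^*, u) = F^*(z^*) + \frac{1}{2K} \langle v^*, v^* \rangle_{L^2(S)} - G_L^*(v^*) + K \sum_{i=1}^n \left\| \Lambda_i u - \frac{\partial g_L^*(v^*)}{\partial y_i^*} \right\|_{L^2(S)}^2, \]

and

\[ E^* = \{ (z^*, v^*, u) \in Y^* \times Y_L^* \times U \mid -\Lambda_1^* z^* + \Lambda^* v^* - u_0^* = \theta \}. \]

Also, the functions $z_0^*$ and $v_0^*$, defined by

\[ z_0^* = \frac{\partial F(\Lambda_1 u_0)}{\partial v}, \qquad v_0^* = \frac{\partial G(\Lambda u_0 + p_0)}{\partial v}, \]

together with $u_0$, are such that

\[ -\Lambda_1^* z_0^* + \Lambda^* v_0^* - u_0^* = \theta, \]

and thus

\[ J_K(u_0, p_0) \leq \inf_{(z^*, v^*, u) \in E^*} \{ J_K^*(z^*, v^*, u) \} \leq J_K(u_0, p_0) + 2K \langle p_0, p_0 \rangle_{L^2(S)}, \tag{10.44} \]

where we are assuming that $v_0^* \in Y_L^*$.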

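Proof. Defining $\alpha = \inf_{(u,p) \in U \times Y} \{ J_K(u, p) \}$, $G_1(u, p) = G(\Lambda u + p) + K \langle p, p \rangle_{L^2(S)}$ and $G_2(u, p) = F(\Lambda_1 u) + (K/2) \langle p, p \rangle_{L^2(S)} + \langle u, u_0^* \rangle_U$ we have:

\[ G_1(u, p) \geq G_2(u, p) + \alpha, \quad \forall (u, p) \in U \times Y. \]

Thus, $\forall v^* \in Y_L^*$, we have

\[ \sup_{(u,p) \in U \times Y} \{ \langle v^*, \Lambda u + p \rangle_{L^2(S)} - G_2(u, p) \} \geq \langle v^*, \Lambda u + p \rangle_{L^2(S)} - G_1(u, p) + \alpha, \quad \forall (u, p) \in U \times Y. \tag{10.45} \]

From Theorem 8.2.5:

\[ \sup_{(u,p) \in U \times Y} \{ \langle v^*, \Lambda u + p \rangle_{L^2(S)} - G_2(u, p) \} = \inf_{z^* \in C^*(v^*)} \left\{ F^*(z^*) + \frac{1}{2K} \langle v^*, v^* \rangle_{L^2(S)} \right\} \tag{10.46} \]

where

\[ C^*(v^*) = \{ z^* \in Y^* \mid -\Lambda_1^* z^* + \Lambda^* v^* - u_0^* = \theta \}. \]

Furthermore

\[ \langle v^*, \Lambda u + p \rangle_{L^2(S)} - G_1(u, p) = \langle v^*, \Lambda u + p \rangle_{L^2(S)} - G(\Lambda u + p) - K \langle p, p \rangle_{L^2(S)}. \]

For the given $u$, choosing $p$ satisfying the equations

\[ v_i^* = \frac{\partial G(\Lambda u + p)}{\partial v_i}, \]

from a well known Legendre Transform property, we obtain

\[ p_i = \frac{\partial G_L^*(v^*)}{\partial v_i^*} - \Lambda_i u \]

so that

\[ \langle v^*, \Lambda u + p \rangle_{L^2(S)} - G_1(u, p) = G_L^*(v^*) - K \sum_{i=1}^n \left\| \Lambda_i u - \frac{\partial g_L^*(v^*)}{\partial y_i^*} \right\|_{L^2(S)}^2. \]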

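From the last results and inequality (10.45) we have

\[ \inf_{z^* \in C^*(v^*)} \left\{ F^*(z^*) + \frac{1}{2K} \langle v^*, v^* \rangle_{L^2(S)} \right\} - G_L^*(v^*) + K \sum_{i=1}^n \left\| \Lambda_i u - \frac{\partial g_L^*(v^*)}{\partial y_i^*} \right\|_{L^2(S)}^2 \geq \alpha = \inf_{(u,p) \in U \times Y} \{ J_K(u, p) \}, \tag{10.47} \]

that is,

\[ F^*(z^*) + \frac{1}{2K} \langle v^*, v^* \rangle_{L^2(S)} - G_L^*(v^*) + K \sum_{i=1}^n \left\| \Lambda_i u - \frac{\partial g_L^*(v^*)}{\partial y_i^*} \right\|_{L^2(S)}^2 \geq \alpha = \inf_{(u,p) \in U \times Y} \{ J_K(u, p) \}, \quad \text{if } z^* \in C^*(v^*). \tag{10.48} \]

Hence

\[ \inf_{(z^*, v^*, u) \in E^*} \left\{ F^*(z^*) + \frac{1}{2K} \langle v^*, v^* \rangle_{L^2(S)} - G_L^*(v^*) + K \sum_{i=1}^n \left\| \Lambda_i u - \frac{\partial g_L^*(v^*)}{\partial y_i^*} \right\|_{L^2(S)}^2 \right\} \geq \alpha = \inf_{(u,p) \in U \times Y} \{ J_K(u, p) \} \tag{10.49} \]

so that:

\[ \inf_{(z^*, v^*, u) \in E^*} \{ J_K^*(z^*, v^*, u) \} \geq \inf_{(u,p) \in U \times Y} \{ J_K(u, p) \}, \]

where $E^* = C^*(v^*) \times Y_L^* \times U$, and the remaining conclusions follow from the expressions of $J_K(u_0, p_0)$ and $J_K^*(z_0^*, v_0^*, u_0)$.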

Remark 10.5.2. We conjecture that the duality gap between the primal and dual formulations, namely $2K \langle p_0, p_0 \rangle_{L^2(S)}$, goes to zero as $K \to +\infty$, since $p_0 \in Y$ satisfies the extremal condition

\[ \frac{1}{K} \frac{\partial G(\Lambda u_0 + p_0)}{\partial v} + p_0 = 0, \]

and $J_K(u, p)$ is bounded from below. We do not prove it in the present work.


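In the application of the last theorem to the Kirchhoff-Love plate model, we would have $F(\Lambda_1 u) = \theta$, and therefore the variable $z^*$ is not present in the dual formulation. Also,

\[ \langle u, u_0^* \rangle_U = \int_S P w \, dS + \int_{\Gamma_t} \left( P_\alpha u_\alpha + P w - M_n \frac{\partial w}{\partial n} \right) d\Gamma \tag{10.50} \]

and thus the relevant duality principle could be expressed as

\[ \inf_{u \in U} \left\{ G(\Lambda u + p) + K \langle p, p \rangle_{L^2(S)} - \langle u, u_0^* \rangle_U - \frac{K}{2} \langle p, p \rangle_{L^2(S)} \right\} \leq \]

\[ \inf_{(v^*, u) \in E^*} \Bigg\{ -\frac{1}{2} \int_S \bar{H}_{\alpha\beta\lambda\mu} N_{\alpha\beta} N_{\lambda\mu} \, dS - \frac{1}{2} \int_S \bar{h}_{\alpha\beta\lambda\mu} M_{\alpha\beta} M_{\lambda\mu} \, dS - \frac{1}{2} \int_S \bar{N}_{\alpha\beta} Q_\alpha Q_\beta \, dS + \frac{1}{2K} \int_S N_{\alpha\beta} N_{\alpha\beta} \, dS + \frac{1}{2K} \int_S M_{\alpha\beta} M_{\alpha\beta} \, dS \]

\[ + K \sum_{\alpha,\beta=1}^2 \left\| \frac{1}{2}(u_{\alpha,\beta} + u_{\beta,\alpha}) - \bar{H}_{\alpha\beta\lambda\mu} N_{\lambda\mu} + \frac{1}{2} v_{0_{2\alpha}} v_{0_{2\beta}} \right\|_{L^2(S)}^2 + K \sum_{\alpha=1}^2 \| w_{,\alpha} - v_{0_{2\alpha}} \|_{L^2(S)}^2 + K \sum_{\alpha,\beta=1}^2 \| -w_{,\alpha\beta} - \bar{h}_{\alpha\beta\lambda\mu} M_{\lambda\mu} \|_{L^2(S)}^2 \Bigg\} \tag{10.51} \]

where $(v^*, u) \in E^* = C^* \times U \Leftrightarrow (v^*, u) \in Y_L^* \times U$ and

\[ \begin{cases} N_{\alpha\beta,\beta} = 0, \\ Q_{\alpha,\alpha} + M_{\alpha\beta,\alpha\beta} + P = 0, \end{cases} \quad \text{a.e. in } S \]

and

\[ \begin{cases} N_{\alpha\beta} n_\beta - P_\alpha = 0, \\ (Q_\alpha + M_{\alpha\beta,\beta}) n_\alpha + \dfrac{\partial (M_{\alpha\beta} t_\alpha n_\beta)}{\partial s} - P = 0, \\ M_{\alpha\beta} n_\alpha n_\beta - M_n = 0, \end{cases} \quad \text{on } \Gamma_t, \]

where $\{ v_{0_{2\alpha}} \}$ is defined through the equations

\[ \{ Q_\alpha \} = \{ N_{\alpha\beta} v_{0_{2\beta}} \} \]

and

\[ \{ \bar{N}_{\alpha\beta} \} = \{ N_{\alpha\beta} \}^{-1}. \]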


Finally, we recall that

\[ Y_L^* = \{ v^* \in Y^* \mid \Delta = N_{11} N_{22} - (N_{12})^2 \neq 0, \text{ a.e. in } S \}. \]

10.6. The Third Duality Principle

Now we establish the third result, which may be summarizedby the following theorem:

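Theorem 10.6.1. Let $U$ be a reflexive Banach space, $(G \circ \Lambda) : U \to \mathbb{R}$ a convex Gâteaux differentiable functional and $(F \circ \Lambda_1) : U \to \mathbb{R}$ convex, coercive and lower semi-continuous (l.s.c.) such that the functional

\[ J(u) = (G \circ \Lambda)(u) - F(\Lambda_1 u) - \langle u, p \rangle_U \]

is bounded from below, where $\Lambda : U \to Y$ and $\Lambda_1 : U \to Y$ are continuous linear operators.

Then we may write:

\[ \inf_{z^* \in Y^*} \left\{ \sup_{v^* \in B^*(z^*)} \{ F^*(z^*) - G^*(v^*) \} \right\} \geq \inf_{u \in U} \{ J(u) \}, \]

where $B^*(z^*) = \{ v^* \in Y^* \text{ such that } \Lambda^* v^* - \Lambda_1^* z^* - p = 0 \}$.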

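Proof. By hypothesis there exists $\alpha \in \mathbb{R}$ ($\alpha = \inf_{u \in U} \{ J(u) \}$) so that $J(u) \geq \alpha$, $\forall u \in U$.

That is,

\[ (G \circ \Lambda)(u) \geq F(\Lambda_1 u) + \langle u, p \rangle_U + \alpha, \quad \forall u \in U. \]

The above inequality clearly implies that

\[ \sup_{u \in U} \{ \langle u, u^* \rangle_U - F(\Lambda_1 u) - \langle u, p \rangle_U \} \geq \sup_{u \in U} \{ \langle u, u^* \rangle_U - (G \circ \Lambda)(u) \} + \alpha, \]

$\forall u^* \in U^*$. Since $F$ is convex, coercive and l.s.c., by Theorem 8.2.5 we may write

\[ \sup_{u \in U} \{ \langle u, u^* \rangle_U - F(\Lambda_1 u) - \langle u, p \rangle_U \} = \inf_{z^* \in A^*(u^*)} \{ F^*(z^*) \}, \]

where

\[ A^*(u^*) = \{ z^* \in Y^* \mid \Lambda_1^* z^* + p = u^* \}. \]

Since $G$ also satisfies the hypothesis of Theorem 8.2.5, we have

\[ \sup_{u \in U} \{ \langle u, u^* \rangle_U - (G \circ \Lambda)(u) \} = \inf_{v^* \in D^*(u^*)} \{ G^*(v^*) \}, \]

where

\[ D^*(u^*) = \{ v^* \in Y^* \mid \Lambda^* v^* = u^* \}. \]

Therefore we may summarize the last results as

\[ F^*(z^*) + \sup_{v^* \in D^*(u^*)} \{ -G^*(v^*) \} \geq \alpha, \quad \forall z^* \in A^*(u^*). \]

This inequality implies

\[ F^*(z^*) + \sup_{v^* \in B^*(z^*)} \{ -G^*(v^*) \} \geq \alpha, \]

so that we can write

\[ \inf_{z^* \in Y^*} \left\{ \sup_{v^* \in B^*(z^*)} \{ F^*(z^*) - G^*(v^*) \} \right\} \geq \inf_{u \in U} \{ J(u) \}, \]

where $B^*(z^*) = \{ v^* \in Y^* \mid \Lambda^* v^* - \Lambda_1^* z^* - p = 0 \}$.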

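We will apply the last theorem to a changed functional concerning the primal formulation related to the Kirchhoff-Love plate model. We redefine $(G\circ\Lambda) : U \to \mathbb{R}$ and $(F\circ\Lambda_1) : U \to \mathbb{R}$ as

$$(G\circ\Lambda)(u) = \frac{1}{2}\int_S H_{\alpha\beta\lambda\mu}\gamma_{\alpha\beta}(u)\gamma_{\lambda\mu}(u)\,dS + \frac{1}{2}\int_S h_{\alpha\beta\lambda\mu}\kappa_{\alpha\beta}(u)\kappa_{\lambda\mu}(u)\,dS + \frac{K}{2}\int_S w_{,\alpha}w_{,\alpha}\,dS$$

if $N_{11}(u) + K > 0$, $N_{22}(u) + K > 0$ and $(N_{11}(u) + K)(N_{22}(u) + K) - N_{12}(u)^2 > 0$, and $+\infty$ otherwise.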

Remark 10.6.2. Notice that $(G \circ \Lambda) : U \to \mathbb{R}$ is convex and Gateaux differentiable on its effective domain, which is sufficient for our purposes, since the concerned Fenchel conjugate may be easily expressed through the region of interest.

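Also, we define

$$F(\Lambda_1 u) = \frac{K}{2}\int_S w_{,\alpha}w_{,\alpha}\,dS,$$

$$\langle u, p\rangle_U = \int_S P w\,dS + \int_S P_\alpha u_\alpha\,dS,$$

where

$$u = (u_\alpha, w) \in U = W_0^{1,2}(S) \times W_0^{1,2}(S) \times W_0^{2,2}(S).$$

These boundary conditions refer to a clamped plate. Furthermore,

$$\Lambda_1(u) = \{w_{,1}, w_{,2}\}$$

and

$$\Lambda = \{\Lambda_{1\alpha\beta}, \Lambda_{2\alpha}, \Lambda_{3\alpha\beta}\}$$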


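as indicated in (10.9), (10.10) and (10.11). Calculating $G^* : Y^* \to \mathbb{R}$ and $F^* : Y^* \to \mathbb{R}$ we would obtain

$$\begin{aligned}
G^*(v^*) ={}& \frac{1}{2}\int_S \bar H_{\alpha\beta\lambda\mu}N_{\alpha\beta}N_{\lambda\mu}\,dS + \frac{1}{2}\int_S \bar h_{\alpha\beta\lambda\mu}M_{\alpha\beta}M_{\lambda\mu}\,dS \\
&+ \frac{1}{2}\int_S N_{\alpha\beta}w_{,\alpha}w_{,\beta}\,dS + \frac{K}{2}\int_S w_{,\alpha}w_{,\alpha}\,dS
\end{aligned} \tag{10.52}$$

if $v^* \in E^*$. Here

$$v^* = \{N_{\alpha\beta}, M_{\alpha\beta}, w_{,\alpha}\},$$

$$E^* = \{v^* \in Y^* \mid N_{11} + K > 0,\ N_{22} + K > 0 \text{ and } (N_{11} + K)(N_{22} + K) - N_{12}^2 > 0, \text{ a.e. in } S\}$$

and

$$F^*(z^*) = \frac{1}{2K}\int_S (z_1^*)^2\,dS + \frac{1}{2K}\int_S (z_2^*)^2\,dS.$$

Furthermore, $v^* \in B^*(z^*) \Leftrightarrow v^* \in Y^*$ and

$$\begin{cases}
N_{\alpha\beta,\beta} + P_\alpha = 0, \\
-(z^*_\alpha)_{,\alpha} + (N_{\alpha\beta}w_{,\beta})_{,\alpha} + M_{\alpha\beta,\alpha\beta} + K w_{,\alpha\alpha} + P = 0,
\end{cases} \quad \text{a.e. in } S.$$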

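Finally, we can express the application of the last theorem as:

$$\begin{aligned}
\inf_{z^* \in Y^*}\ \sup_{v^* \in B^*(z^*)\cap E^*} \Big\{ & \frac{1}{2K}\int_S (z_1^*)^2\,dS + \frac{1}{2K}\int_S (z_2^*)^2\,dS \\
& - \frac{1}{2}\int_S \bar H_{\alpha\beta\lambda\mu}N_{\alpha\beta}N_{\lambda\mu}\,dS - \frac{1}{2}\int_S \bar h_{\alpha\beta\lambda\mu}M_{\alpha\beta}M_{\lambda\mu}\,dS \\
& - \frac{1}{2}\int_S N_{\alpha\beta}w_{,\alpha}w_{,\beta}\,dS - \frac{K}{2}\int_S w_{,\alpha}w_{,\alpha}\,dS \Big\} \geq \inf_{u \in U}\{J(u)\}.
\end{aligned} \tag{10.53}$$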

The above inequality can in fact represent an equality if the positive real constant $K$ is chosen so that the point of local extremum $v_0^* = \frac{\partial G(\Lambda u_0)}{\partial v} \in E^*$ (which means $N_{11}(u_0) + K > 0$, $N_{22}(u_0) + K > 0$, and $(N_{11}(u_0) + K)(N_{22}(u_0) + K) - N_{12}(u_0)^2 > 0$). The mentioned equality results from a slight modification of Theorem 8.1.27.


Remark 10.6.3. For the determination of $G^*(v^*)$ in (10.52) we have used the transformation

$$Q_\alpha = N_{\alpha\beta}w_{,\beta} + K w_{,\alpha},$$

similarly as indicated in Remark 10.3.1.
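The three inequalities defining $E^*$ are equivalent, by Sylvester's criterion, to positive definiteness of the $2\times 2$ symmetric matrix $\{N_{\alpha\beta} + K\delta_{\alpha\beta}\}$, that is, to $K > -\lambda_{\min}(\{N_{\alpha\beta}\})$. The sketch below checks this pointwise for a single sample of membrane forces; the function names and the sample values are hypothetical, chosen only to illustrate how a compressed state (negative eigenvalues) can still satisfy the restriction for $K$ large enough.

```python
import math


def in_E_star(n11, n22, n12, K):
    """Pointwise test of the three inequalities defining E*."""
    return (n11 + K > 0
            and n22 + K > 0
            and (n11 + K) * (n22 + K) - n12 ** 2 > 0)


def min_eigenvalue(n11, n22, n12):
    """Smallest eigenvalue of the symmetric matrix [[n11, n12], [n12, n22]]."""
    mean = 0.5 * (n11 + n22)
    radius = math.hypot(0.5 * (n11 - n22), n12)
    return mean - radius


if __name__ == "__main__":
    # Hypothetical compressive membrane forces at one point of S.
    n11, n22, n12 = -2.0, -3.0, 1.0
    threshold = -min_eigenvalue(n11, n22, n12)
    print("K must exceed", threshold)
    print("K = 3.5 admissible:", in_E_star(n11, n22, n12, 3.5))
    print("K = 3.7 admissible:", in_E_star(n11, n22, n12, 3.7))
```

In practice the condition has to hold a.e. in $S$, so the threshold would be taken as the essential supremum of $-\lambda_{\min}$ over the domain.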

10.7. A Convex Dual Formulation

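Remark 10.7.1. In this section we assume

$$H_{\alpha\beta\lambda\mu} = h\left\{\frac{4\lambda_0\mu_0}{\lambda_0 + 2\mu_0}\delta_{\alpha\beta}\delta_{\lambda\mu} + 2\mu_0(\delta_{\alpha\lambda}\delta_{\beta\mu} + \delta_{\alpha\mu}\delta_{\beta\lambda})\right\},$$

and

$$h_{\alpha\beta\lambda\mu} = \frac{h^2 H_{\alpha\beta\lambda\mu}}{12},$$

where $\delta_{\alpha\beta}$ denotes the Kronecker delta and $\lambda_0$, $\mu_0$ are appropriate constants.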

The next result may be summarized by the following Theorem:

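Theorem 10.7.2. Consider the functionals $(G\circ\Lambda) : U \to \mathbb{R}$, $(F\circ\Lambda_1) : U \to \mathbb{R}$ and $\langle u, p\rangle_U$ defined as

$$(G\circ\Lambda)(u) = \frac{1}{2}\int_S H_{\alpha\beta\lambda\mu}\gamma_{\alpha\beta}(u)\gamma_{\lambda\mu}(u)\,dS + \frac{1}{2}\int_S h_{\alpha\beta\lambda\mu}\kappa_{\alpha\beta}(u)\kappa_{\lambda\mu}(u)\,dS + \frac{K}{2}\int_S w_{,\alpha}w_{,\alpha}\,dS,$$

$$F(\Lambda_1 u) = \frac{K}{2}\int_S w_{,\alpha}w_{,\alpha}\,dS,$$

and

$$\langle u, p\rangle_U = \int_S P w\,dS + \int_S P_\alpha u_\alpha\,dS,$$

where

$$u = (u_\alpha, w) \in U = W_0^{1,2}(S) \times W_0^{1,2}(S) \times W_0^{2,2}(S).$$

The operators $\gamma_{\alpha\beta}$ and $\kappa_{\alpha\beta}$ are defined in (10.7) and (10.8), respectively. Furthermore, we define $J(u) = (G\circ\Lambda)(u) - F(\Lambda_1 u) - \langle u, p\rangle_U$, and

$$\Lambda_1(u) = \{w_{,1}, w_{,2}\}.$$

Suppose there exists $u_0 \in U$ such that $\delta J(u_0) = 0$, and that there exists $K > 0$ for which $N_{11}(u_0) + K > 0$, $N_{22}(u_0) + K > 0$, $(N_{11}(u_0) + K)(N_{22}(u_0) + K) - N_{12}(u_0)^2 > 0$ (a.e. in $S$) and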


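$h_{1212}/(2K_0) > K$, where $K_0$ is the constant related to the Poincaré inequality and

$$N_{\alpha\beta}(u_0) = H_{\alpha\beta\lambda\mu}\gamma_{\lambda\mu}(u_0).$$

Then,

$$\begin{aligned}
J(u_0) = \min_{u \in U}\{J(u)\} &= \max_{(v^*, z^*) \in E^* \cap B^*}\{-G^*(v^*) + \langle z^*_\alpha, z^*_\alpha\rangle_{L^2(S)}/(2K)\} \\
&= -G^*(v_0^*) + \langle z^*_{0\alpha}, z^*_{0\alpha}\rangle_{L^2(S)}/(2K)
\end{aligned} \tag{10.54}$$

where

$$v_0^* = \frac{\partial G(\Lambda u_0)}{\partial v} \quad \text{and} \quad z^*_{0\alpha} = K w_{0,\alpha},$$

$$G^*(v^*) = G^*_L(v^*) = \frac{1}{2}\int_S \bar H_{\alpha\beta\lambda\mu}N_{\alpha\beta}N_{\lambda\mu}\,dS + \frac{1}{2}\int_S \bar h_{\alpha\beta\lambda\mu}M_{\alpha\beta}M_{\lambda\mu}\,dS + \frac{1}{2}\int_S \bar N^K_{\alpha\beta}Q_\alpha Q_\beta\,dS$$

if $v^* \in E^*$, where $v^* = \{N_{\alpha\beta}, M_{\alpha\beta}, Q_\alpha\} \in E^* \Leftrightarrow v^* \in L^2(S, \mathbb{R}^{10})$ and

$$N_{11} + K > 0,\quad N_{22} + K > 0 \quad\text{and}\quad (N_{11} + K)(N_{22} + K) - N_{12}^2 > 0, \text{ a.e. in } S,$$

and

$$(v^*, z^*) \in B^* \Leftrightarrow \begin{cases}
N_{\alpha\beta,\beta} + P_\alpha = 0, \\
Q_{\alpha,\alpha} + M_{\alpha\beta,\alpha\beta} - z^*_{\alpha,\alpha} + P = 0, \\
\bar h_{1212}M_{12} + z^*_{1,2}/K = 0, \\
z^*_{1,2} = z^*_{2,1}, \text{ a.e. in } S, \text{ and } z^* \cdot n = 0 \text{ on } \Gamma,
\end{cases}$$

with $\{\bar N^K_{\alpha\beta}\}$ as indicated in (10.6).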

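Proof. Similarly to Proposition 10.4.1, we may obtain the following result. If $v^* \in E^*$ then

$$G^*_L(v^*) = G^*(v^*) \geq \langle v^*, \Lambda u\rangle_Y - G(\Lambda u), \quad \forall u \in U,$$

so that

$$G^*_L(v^*) - \frac{1}{2K}\langle z^*_\alpha, z^*_\alpha\rangle_{L^2(S)} \geq \langle v^*, \Lambda u\rangle_Y - \frac{1}{2K}\langle z^*_\alpha, z^*_\alpha\rangle_{L^2(S)} - G(\Lambda u), \quad \forall u \in U,$$

and thus, as $\Lambda^* v^* - \Lambda_1^* z^* - p = 0$ (see the definition of $B^*$) we obtain

$$Q_{\alpha,\alpha} + M_{\alpha\beta,\alpha\beta} - z^*_{\alpha,\alpha} + P = 0 \quad \text{a.e. in } S.$$

Through this equation we may symbolically write

$$M_{12} = \Lambda^{-1}_{312}(-Q_{\alpha,\alpha} + z^*_{\alpha,\alpha} - M_{\alpha\beta,\alpha\beta} - P)/2, \tag{10.55}$$

where, in this expression, $M_{\alpha\beta,\alpha\beta}$ denotes $M_{11,11} + M_{22,22}$ in $S$, so that substituting such a relation in the last inequality we have

$$\begin{aligned}
&\frac{1}{2}\int_S \bar H_{\alpha\beta\lambda\mu}N_{\alpha\beta}N_{\lambda\mu}\,dS + \frac{1}{2}\int_S \bar h_{1111}M_{11}^2\,dS + \int_S \bar h_{1122}M_{11}M_{22}\,dS + \frac{1}{2}\int_S \bar h_{2222}M_{22}^2\,dS \\
&\quad + 2\int_S \bar h_{1212}(\Lambda^{-1}_{312}(v^*, z^*))^2\,dS + \frac{1}{2}\int_S \bar N^K_{\alpha\beta}Q_\alpha Q_\beta\,dS - \frac{1}{2K}\langle z^*_\alpha, z^*_\alpha\rangle_{L^2(S)} \\
&\geq \langle \Lambda_1 u, z^*\rangle_{L^2(S;\mathbb{R}^2)} - \frac{1}{2K}\langle z^*_\alpha, z^*_\alpha\rangle_{L^2(S)} - G(\Lambda u) + \langle u, p\rangle_U, \quad \forall u \in U,
\end{aligned} \tag{10.56}$$

where $M_{12}$ is made explicit through equation (10.55). This equation makes $z^*$ an independent variable, so that evaluating the supremum concerning $z^*$, particularly for the left side of the above inequality, the global extremum is achieved through the equation:

$$-\left([\Lambda^{-1}_{312}]^*[\bar h_{1212}\Lambda^{-1}_{312}(v^*, z^*)]\right)_{,\alpha} - z^*_\alpha/K = 0, \quad \text{a.e. in } S.$$

This means

$$-\bar h_{1212}\Lambda^{-1}_{312}(v^*, z^*) - z^*_{\alpha,\beta}/K = 0, \text{ a.e. in } S \text{ and } z^* \cdot n = 0 \text{ on } \Gamma$$

or

$$\bar h_{1212}M_{12} + z^*_{\alpha,\beta}/K = 0, \text{ a.e. in } S \text{ and } z^* \cdot n = 0 \text{ on } \Gamma$$

for $(\alpha, \beta) = (1, 2)$ and $(2, 1)$. Therefore, after evaluating the suprema on both sides of (10.56), we may write

$$G^*_L(v^*) - \frac{1}{2K}\langle z^*_\alpha, z^*_\alpha\rangle_{L^2(S)} \geq F(\Lambda_1 u) - G(\Lambda u) + \langle u, p\rangle_U,$$

$\forall u \in U$, and $(v^*, z^*) \in B^* \cap E^*$,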


and it seems to be clear that the condition $h_{1212}/(2K_0) > K$ guarantees coercivity for the expression on the left side of the last inequality (see the next remark), so that the unique local extremum concerning $z^*$ is also a global extremum. The equality and the remaining conclusions result from the Gateaux differentiability of the primal and dual formulations and an application (with little changes) of Theorem 8.1.27.

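Remark 10.7.3. Observe that the dual functional could be expressed as

$$\begin{aligned}
G^*_L(v^*) - \frac{1}{2K}\langle z^*_\alpha, z^*_\alpha\rangle_{L^2(S)} ={}& \frac{1}{2}\int_S \bar H_{\alpha\beta\lambda\mu}N_{\alpha\beta}N_{\lambda\mu}\,dS + \frac{1}{2}\int_S \bar h_{1111}M_{11}^2\,dS + \int_S \bar h_{1122}M_{11}M_{22}\,dS \\
&+ \frac{1}{2}\int_S \bar h_{2222}M_{22}^2\,dS + \frac{1}{2}\int_S \bar N^K_{\alpha\beta}Q_\alpha Q_\beta\,dS + \int_S h_{1212}(z^*_{1,2})^2/K^2\,dS \\
&+ \int_S h_{1212}(z^*_{2,1})^2/K^2\,dS - \frac{1}{2K}\langle z^*_\alpha, z^*_\alpha\rangle_{L^2(S)}.
\end{aligned} \tag{10.57}$$

Thus, through the relation $h_{1212}/(2K_0) > K$ (where $K_0$ is the constant related to a Poincaré type inequality, now specified through the last format of the dual formulation), it is now clear that the dual formulation is convex on $E^* \cap B^*$.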

10.8. A Final Result, other Sufficient Conditions of Optimality

This final result is developed similarly to the triality criterion introduced in Gao [24], which describes, in some situations, sufficient conditions for optimality.

We prove the following result.

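Theorem 10.8.1. Consider $J : U \to \mathbb{R}$ where $J(u) = G(\Lambda u) + F(u)$,

$$G(\Lambda u) = \frac{1}{2}\int_S H_{\alpha\beta\lambda\mu}\gamma_{\alpha\beta}(u)\gamma_{\lambda\mu}(u)\,dS + \frac{1}{2}\int_S h_{\alpha\beta\lambda\mu}w_{,\alpha\beta}w_{,\lambda\mu}\,dS.$$

Here the operators $\{\gamma_{\alpha\beta}\}$ are defined as in (10.7),

$$F(u) = -\int_S P w\,dS \equiv -\langle u, f\rangle_U,$$

and

$$U = W_0^{1,2}(S) \times W_0^{1,2}(S) \times W_0^{2,2}(S).$$

Then, if $u_0 \in U$ is such that $\delta J(u_0) = \theta$ and

$$\frac{1}{2}\int_S N_{\alpha\beta}(u_0)w_{,\alpha}w_{,\beta}\,dS + \frac{1}{2}\int_S h_{\alpha\beta\lambda\mu}w_{,\alpha\beta}w_{,\lambda\mu}\,dS \geq 0, \quad \forall w \in W_0^{2,2}(S), \tag{10.58}$$

we have

$$J(u_0) = \min_{u \in U}\{J(u)\}.$$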

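Proof. It is clear that

$$G(\Lambda u) + F(u) \geq -(G\circ\Lambda)^*(u^*) - F^*(-u^*), \quad \forall u \in U,\ u^* \in U^*,$$

so that

$$G(\Lambda u) + F(u) \geq -(G\circ\Lambda)^*(\Lambda^* v^*) - F^*(-\Lambda^* v^*), \quad \forall u \in U,\ v^* \in Y^*. \tag{10.59}$$

Consider $u_0$ for which $\delta J(u_0) = \theta$ and such that (10.58) is satisfied. Defining

$$v_0^* = \frac{\partial G(\Lambda u_0)}{\partial v},$$

from Theorem 8.1.27 we have that

$$\delta\left(-G^*_L(v_0^*) + \langle u_0, \Lambda^* v_0^* - f\rangle_U\right) = \theta,$$

$$J(u_0) = -G^*_L(v_0^*),$$

and

$$\Lambda^* v_0^* = f.$$

This means

$$F^*(-\Lambda^* v_0^*) = 0.$$

On the other hand

$$(G\circ\Lambda)^*(\Lambda^* v_0^*) = \sup_{u \in U}\{\langle \Lambda u, v_0^*\rangle_Y - G(\Lambda u)\},$$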


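or

$$\begin{aligned}
(G\circ\Lambda)^*(\Lambda^* v_0^*) = \sup_{u \in U}\Big\{ & \Big\langle \frac{u_{\alpha,\beta} + u_{\beta,\alpha}}{2}, N_{\alpha\beta}(u_0)\Big\rangle_{L^2(S)} + \langle -w_{,\alpha\beta}, M_{\alpha\beta}(u_0)\rangle_{L^2(S)} + \langle w_{,\alpha}, Q_\alpha(u_0)\rangle_{L^2(S)} \\
& - \frac{1}{2}\int_S H_{\alpha\beta\lambda\mu}\gamma_{\alpha\beta}(u)\gamma_{\lambda\mu}(u)\,dS - \frac{1}{2}\int_S h_{\alpha\beta\lambda\mu}w_{,\alpha\beta}w_{,\lambda\mu}\,dS \Big\}.
\end{aligned} \tag{10.60}$$

Since

$$\gamma_{\alpha\beta}(u) = \frac{u_{\alpha,\beta} + u_{\beta,\alpha}}{2} + \frac{1}{2}w_{,\alpha}w_{,\beta},$$

from the last equality, we may write

$$\begin{aligned}
(G\circ\Lambda)^*(\Lambda^* v_0^*) = \sup_{u \in U}\Big\{ & \langle \gamma_{\alpha\beta}(u), N_{\alpha\beta}(u_0)\rangle_{L^2(S)} - \Big\langle \frac{w_{,\alpha}w_{,\beta}}{2}, N_{\alpha\beta}(u_0)\Big\rangle_{L^2(S)} \\
& + \langle -w_{,\alpha\beta}, M_{\alpha\beta}(u_0)\rangle_{L^2(S)} + \langle w_{,\alpha}, Q_\alpha(u_0)\rangle_{L^2(S)} \\
& - \frac{1}{2}\int_S H_{\alpha\beta\lambda\mu}\gamma_{\alpha\beta}(u)\gamma_{\lambda\mu}(u)\,dS - \frac{1}{2}\int_S h_{\alpha\beta\lambda\mu}w_{,\alpha\beta}w_{,\lambda\mu}\,dS \Big\}.
\end{aligned} \tag{10.61}$$

As $(Q_\alpha(u_0))_{,\alpha} + (M_{\alpha\beta}(u_0))_{,\alpha\beta} + P = 0$, we obtain

$$\begin{aligned}
(G\circ\Lambda)^*(\Lambda^* v_0^*) \leq{}& \sup_{u \in U}\Big\{ -\Big\langle \frac{w_{,\alpha}w_{,\beta}}{2}, N_{\alpha\beta}(u_0)\Big\rangle_{L^2(S)} - \frac{1}{2}\int_S h_{\alpha\beta\lambda\mu}w_{,\alpha\beta}w_{,\lambda\mu}\,dS + \int_S P w\,dS \Big\} \\
& + \frac{1}{2}\int_S \bar H_{\alpha\beta\lambda\mu}N_{\alpha\beta}(u_0)N_{\lambda\mu}(u_0)\,dS.
\end{aligned} \tag{10.62}$$

Therefore, from hypothesis (10.58) the extremum indicated in (10.62) is attained for functions satisfying

$$(N_{\alpha\beta}(u_0)w_{,\beta})_{,\alpha} - (h_{\alpha\beta\lambda\mu}w_{,\lambda\mu})_{,\alpha\beta} + P = 0. \tag{10.63}$$

From $\delta J(u_0) = \theta$ and boundary conditions we obtain

$$w = w_0, \quad \text{a.e. in } S,$$

so that

$$(G\circ\Lambda)^*(\Lambda^* v_0^*) \leq \Big\langle \frac{w_{0,\alpha}w_{0,\beta}}{2}, N_{\alpha\beta}(u_0)\Big\rangle_{L^2(S)} + \frac{1}{2}\int_S h_{\alpha\beta\lambda\mu}w_{0,\alpha\beta}w_{0,\lambda\mu}\,dS + \frac{1}{2}\int_S \bar H_{\alpha\beta\lambda\mu}N_{\alpha\beta}(u_0)N_{\lambda\mu}(u_0)\,dS. \tag{10.64}$$

However, since

$$Q_\alpha(u_0) = N_{\alpha\beta}(u_0)w_{0,\beta}$$

and

$$M_{\alpha\beta}(u_0) = -h_{\alpha\beta\lambda\mu}w_{0,\lambda\mu},$$

from (10.64) we obtain

$$(G\circ\Lambda)^*(\Lambda^* v_0^*) \leq \frac{1}{2}\int_S \bar N_{\alpha\beta}(u_0)Q_\alpha(u_0)Q_\beta(u_0)\,dS + \frac{1}{2}\int_S \bar h_{\alpha\beta\lambda\mu}M_{\alpha\beta}(u_0)M_{\lambda\mu}(u_0)\,dS + \frac{1}{2}\int_S \bar H_{\alpha\beta\lambda\mu}N_{\alpha\beta}(u_0)N_{\lambda\mu}(u_0)\,dS.$$

Hence

$$(G\circ\Lambda)^*(\Lambda^* v_0^*) \leq G^*_L(v_0^*) = -J(u_0),$$

and thus, as $F^*(-\Lambda^* v_0^*) = 0$, we have that

$$J(u_0) \leq -(G\circ\Lambda)^*(\Lambda^* v_0^*) - F^*(-\Lambda^* v_0^*),$$

which, from (10.59), completes the proof.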

10.9. Final Remarks

In this chapter we presented four different dual variational formulations for the Kirchhoff-Love plate model. Earlier results (see references [40], [22]) present a constraint concerning the gap functional to establish the complementary energy (dual formulation). In the present work the dual formulations are established on the hypothesis of existence of a global extremum for the primal functional and the results are applicable even for compressed plates. In particular the second duality principle is obtained through an extension of a theorem met in [42], and in this case we are concerned with the solution behavior as $K \to +\infty$, even though a rigorous and complete analysis of such behavior has been postponed for a future work. However, what seems to be interesting is that the dual formulation as indicated in (10.51) is represented by a natural extension of the results found in [42] (particularly Theorem 10.1.1), plus a kind of penalization concerning the inversion of constitutive equations.

It is worth noting that the third dual formulation was based on the same theorem: although that result was not used directly, we followed a similar idea to prove the duality principle. For this last result, the membrane forces are allowed to be negative


since only the restriction $N_{11} + K > 0$, $N_{22} + K > 0$ and $(N_{11} + K)(N_{22} + K) - N_{12}^2 > 0$, a.e. in $S$, is required, where $K \in \mathbb{R}$ is a suitable positive constant.

In section 10.7, we obtained a convex dual variational formulation for the plate model, which allows non positive definite membrane force matrices. In this formulation, the Poincaré inequality plays a fundamental role.

Finally, in the last section, we developed a result similar to Gao's triality criterion presented in [24]. In the plate application this gives sufficient conditions for optimality. We present a new proof of sufficient conditions of existence of a global extremum for the primal problem. As earlier mentioned, such conditions may be summarized by $\delta J(u_0) = \theta$ and

$$\frac{1}{2}\int_S N_{\alpha\beta}(u_0)w_{,\alpha}w_{,\beta}\,dS + \frac{1}{2}\int_S h_{\alpha\beta\lambda\mu}w_{,\alpha\beta}w_{,\lambda\mu}\,dS \geq 0, \quad \forall w \in W_0^{2,2}(S).$$

CHAPTER 11

Duality Applied to Elasticity

11.1. Introduction and Primal Formulation

Our first objective in the present chapter is to establish a dual variational formulation for a finite elasticity model. Even though existence of solutions for this model has been proven in Ciarlet [13], the concept of complementary energy, as a global optimization approach, can be defined only if the stress tensor is positive definite at a critical point. Thus we have the goal of relaxing such constraints and start by describing the primal formulation.

Consider $S \subset \mathbb{R}^3$ an open, bounded, connected set, which represents the reference volume occupied by an elastic solid under the load $f \in L^2(S; \mathbb{R}^3)$. We denote by $\Gamma$ the boundary of $S$. The field of displacements under the action of $f$ is denoted by $u \equiv (u_1, u_2, u_3) \in U$, where $u_1$, $u_2$, and $u_3$ denote the displacements related to the directions $x$, $y$, and $z$ respectively, on the Cartesian basis $(x, y, z)$. Here $U$ is defined as

$$U = \{u = (u_1, u_2, u_3) \in W^{1,4}(S; \mathbb{R}^3) \mid u = (0, 0, 0) \equiv \theta \text{ on } \Gamma_0\} \tag{11.1}$$

and $\Gamma = \Gamma_0 \cup \Gamma_1$, $\Gamma_0 \cap \Gamma_1 = \emptyset$. Denoting the stress tensor by $\{\sigma_{ij}\}$, where

$$\sigma_{ij} = H_{ijkl}\left(\frac{1}{2}(u_{k,l} + u_{l,k} + u_{m,k}u_{m,l})\right) \tag{11.2}$$

and $\{H_{ijkl}\}$ is a positive definite matrix related to the coefficients of Hooke's Law, the boundary value form of the finite elasticity model is given by

$$\begin{cases}
\sigma_{ij,j} + (\sigma_{im}u_{m,j})_{,j} + f_i = 0, & \text{in } S, \\
u = \theta, & \text{on } \Gamma_0.
\end{cases} \tag{11.3}$$


The corresponding primal variational formulation is represented by $J : U \to \mathbb{R}$, where

$$J(u) = \frac{1}{2}\int_S H_{ijkl}\left(\frac{1}{2}(u_{i,j} + u_{j,i} + u_{m,i}u_{m,j})\right)\left(\frac{1}{2}(u_{k,l} + u_{l,k} + u_{m,k}u_{m,l})\right)dx - \langle u, f\rangle_{L^2(S;\mathbb{R}^3)}. \tag{11.4}$$

Remark 11.1.1. Derivatives must always be understood in the distributional sense, whereas boundary conditions are in the sense of traces. Finally, we denote by $\|\cdot\|_2$ the standard norm for $L^2(S)$.
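To make the integrand of (11.4) concrete, the sketch below evaluates the nonlinear strain appearing in (11.2), the corresponding stress and the stored energy density at a single point, given the displacement gradient. The isotropic form of $H_{ijkl}$ with Lamé-type parameters `lam` and `mu` is an assumption made only for this illustration (the text requires only that $\{H_{ijkl}\}$ be positive definite), and all function names are hypothetical.

```python
def green_lagrange_strain(grad_u):
    """E_kl = (u_{k,l} + u_{l,k} + u_{m,k} u_{m,l}) / 2, with grad_u[k][l] = u_{k,l}."""
    return [[0.5 * (grad_u[k][l] + grad_u[l][k]
                    + sum(grad_u[m][k] * grad_u[m][l] for m in range(3)))
             for l in range(3)] for k in range(3)]


def isotropic_stress(strain, lam, mu):
    """sigma = lam * tr(E) * I + 2 * mu * E (assumed isotropic H_ijkl)."""
    trace = sum(strain[i][i] for i in range(3))
    return [[lam * trace * (1.0 if i == j else 0.0) + 2.0 * mu * strain[i][j]
             for j in range(3)] for i in range(3)]


def energy_density(strain, stress):
    """Stored energy density (1/2) * sigma_ij * E_ij, the integrand of (11.4)."""
    return 0.5 * sum(stress[i][j] * strain[i][j]
                     for i in range(3) for j in range(3))


if __name__ == "__main__":
    grad_u = [[0.1, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    E = green_lagrange_strain(grad_u)
    sigma = isotropic_stress(E, lam=1.0, mu=1.0)
    print("E_11 =", E[0][0])
    print("sigma_11 =", sigma[0][0])
    print("energy density =", energy_density(E, sigma))
```

The quadratic term $u_{m,k}u_{m,l}$ is what makes the primal functional non-convex in $u$, which is the difficulty the dual formulations of this chapter are designed to address.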

The next result has been proven in Chapter 10.

Theorem 11.1.2. Consider Gateaux differentiable functionals $G \circ \Lambda : U \to \mathbb{R}$ and $F \circ \Lambda_1 : U \to \mathbb{R}$, where only the second one is necessarily convex, through which is defined the functional $J_K : U \times Y \to \mathbb{R}$ expressed as:

$$J_K(u, p) = G(\Lambda u + p) + K\langle p, p\rangle_{L^2(S,\mathbb{R}^n)} - F(\Lambda_1 u) - \frac{K}{2}\langle p, p\rangle_{L^2(S,\mathbb{R}^n)} - \langle u, u_0^*\rangle_U \tag{11.5}$$

so that it is supposed the existence of $(u_0, p_0) \in U \times Y$ such that

$$J_K(u_0, p_0) = \inf_{(u,p) \in U \times Y}\{J_K(u, p)\} \tag{11.6}$$

and $\delta J_K(u_0, p_0) = \theta$ (here $\Lambda = \{\Lambda_i\} : U \to Y$ and $\Lambda_1 : U \to Y$ are continuous linear operators whose adjoint operators are denoted by $\Lambda^* : Y^* \to U^*$ and $\Lambda_1^* : Y^* \to U^*$ respectively).

Furthermore it is assumed the existence of a differentiable function denoted by $g : \mathbb{R}^n \to \mathbb{R}$ so that $G : Y \to \mathbb{R}$ may be expressed as $G(v) = \int_\Omega g(v)\,dS$, $\forall v \in Y$, where $g$ admits a differentiable Legendre Transform denoted by $g^*_L : \mathbb{R}^n_L \to \mathbb{R}$.

Under these assumptions we have:

$$\inf_{(u,p) \in U \times Y}\{J_K(u, p)\} \leq \inf_{(z^*, v^*, u) \in E^*}\{J^*_K(z^*, v^*, u)\} \tag{11.7}$$


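where

$$J^*_K(z^*, v^*, u) = F^*(z^*) + \frac{1}{2K}\langle v^*, v^*\rangle_{L^2(S,\mathbb{R}^n)} - G^*_L(v^*) + K\sum_{i=1}^{n}\left\|\Lambda_i u - \frac{\partial g^*_L(v^*)}{\partial y^*_i}\right\|_2^2$$

and

$$E^* = \{(z^*, v^*, u) \in Y^* \times Y^*_L \times U \mid -\Lambda_1^* z^* + \Lambda^* v^* - u_0^* = \theta\}. \tag{11.8}$$

Also, the functions $z_0^*$, $v_0^*$, and $u_0$, defined by

$$z_0^* = \frac{\partial F(\Lambda_1 u_0)}{\partial v}, \tag{11.9}$$

$$v_0^* = \frac{\partial G(\Lambda u_0 + p_0)}{\partial v} \tag{11.10}$$

and

$$u_0 = u_0 \tag{11.11}$$

are such that

$$-\Lambda_1^* z_0^* + \Lambda^* v_0^* - u_0^* = \theta \tag{11.12}$$

and thus

$$J_K(u_0, p_0) \leq \inf_{(z^*, v^*, u) \in E^*}\{J^*_K(z^*, v^*, u)\} \leq J_K(u_0, p_0) + 2K\langle p_0, p_0\rangle_{L^2(S,\mathbb{R}^n)} \tag{11.13}$$

where we are assuming that $v_0^* \in Y^*_L$.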

11.2. The First Duality Principle

In this section we establish the first duality principle. We start with the following theorem.

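Theorem 11.2.1. Define $J : U \to \mathbb{R}$ as

$$J(u) = G^{**}(\Lambda u) - F_1(u) \tag{11.14}$$

where

$$G(\Lambda u) = \frac{1}{2}\int_S H_{ijkl}\left(\frac{1}{2}(u_{i,j} + u_{j,i} + u_{m,i}u_{m,j})\right)\left(\frac{1}{2}(u_{k,l} + u_{l,k} + u_{m,k}u_{m,l})\right)dx + \frac{K}{2}\langle u_{m,i}, u_{m,i}\rangle_{L^2(S)}, \tag{11.15}$$

$$F_1(u) = F(u) - \langle u, f\rangle_{L^2(S;\mathbb{R}^3)}, \tag{11.16}$$

and

$$F(u) = \frac{K}{2}\langle u_{m,i}, u_{m,i}\rangle_{L^2(S)}. \tag{11.17}$$

Here $\Lambda : U \to Y = Y_1 \times Y_2 \equiv L^2(S; \mathbb{R}^9) \times L^2(S; \mathbb{R}^9)$ is given by

$$\Lambda u = \{\Lambda_1 u, \Lambda_2 u\} \equiv \left\{\frac{1}{2}(u_{i,j} + u_{j,i}),\ u_{m,i}\right\}. \tag{11.18}$$

Thus we can write

$$\inf_{u \in U}\{J(u)\} \leq \inf_{z^* \in Y_2^*}\ \sup_{v^* \in E^*}\{F^*(z^*) - G^*(v^*)\}, \tag{11.19}$$

where

$$v^* \equiv \{\sigma, Q\}, \tag{11.20}$$

$E^* = E_1 \cap E_2$,

$$E_1 = \{(v^*, z^*) \in Y^* \mid \sigma_{ij,j} + Q_{ij,j} - z^*_{ij,j} + f_i = 0, \text{ a.e. in } S\}, \tag{11.21}$$

$$E_2 = \{(v^*, z^*) \in Y^* \mid \sigma^*_{ij}n_j + Q_{ij}n_j = z^*_{ij}n_j, \text{ on } \Gamma_1\}, \tag{11.22}$$

and $Y^* = Y^* \times Y_1^*$. Also,

$$F^*(z^*) = \frac{1}{2K}\langle z^*_{im}, z^*_{im}\rangle_{L^2(S)} \tag{11.23}$$

and

$$G^*(v^*) = G^*_L(v^*) = \frac{1}{2}\int_S \bar H_{ijkl}\sigma_{ij}\sigma_{kl}\,dx + \frac{1}{2}\int_S \bar\sigma^K_{ij}Q_{mi}Q_{mj}\,dx \tag{11.24}$$

if $\sigma_K$ is positive definite in $S$, where

$$\sigma_K = \begin{pmatrix}
\sigma_{11} + K & \sigma_{12} & \sigma_{13} \\
\sigma_{21} & \sigma_{22} + K & \sigma_{23} \\
\sigma_{31} & \sigma_{32} & \sigma_{33} + K
\end{pmatrix} \tag{11.25}$$

and also $\{\bar\sigma^K_{ij}\} = \sigma_K^{-1}$. Finally, $\{\bar H_{ijkl}\} = \{H_{ijkl}\}^{-1}$.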

Remark 11.2.2. The proof of a similar result was given in F. Botelho [6]. However, in the present case some details are slightly different, so that we provide a complete proof.

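Proof. Defining $\alpha \equiv \inf_{u \in U}\{J(u)\}$ we have

$$G^{**}(\Lambda u) - F_1(u) \geq \alpha, \quad \forall u \in U, \tag{11.26}$$

or

$$-F_1(u) \geq -G^{**}(\Lambda u) + \alpha, \quad \forall u \in U, \tag{11.27}$$

so that

$$\sup_{u \in U}\{\langle \Lambda_1 u, w^*\rangle_{Y_1} - F_1(u)\} \geq \sup_{u \in U}\{\langle \Lambda_1 u, w^*\rangle_{Y_1} - G^{**}(\Lambda u)\} + \alpha. \tag{11.28}$$

However, from Theorem 8.2.5

$$F_1^*(u^*) = \sup_{u \in U}\{\langle \Lambda_1 u, w^*\rangle_{Y_1} - F_1(u)\} = \inf_{z^* \in C_1^* \cap C_2^*}\{F^*(z^*)\} \tag{11.29}$$

where

$$C_1^* = \{z^* \in Y_2^* \mid z^*_{ij,j} - w^*_{ij,j} - f_i = 0, \text{ a.e. in } S\}, \tag{11.30}$$

and

$$C_2^* = \{z^* \in Y_2^* \mid z^*_{ij}n_j = w^*_{ij}n_j, \text{ on } \Gamma_1\}. \tag{11.31}$$

On the other hand, also from Theorem 8.2.5

$$(G^{**} \circ \Lambda)^*(u^*) = \sup_{u \in U}\{\langle \Lambda_1 u, w^*\rangle_{Y_1} - G^{**}(\Lambda u)\} = \inf_{v^* \in D_1^* \cap D_2^*}\{G^*(v^*)\} \tag{11.32}$$

where

$$D_1^* = \{v^* \in Y^* \mid \sigma_{ij,j} + Q_{ij,j} - w^*_{ij,j} = 0, \text{ a.e. in } S\}, \tag{11.33}$$

and

$$D_2^* = \{v^* \in Y^* \mid \sigma^*_{ij}n_j + Q_{ij}n_j = w^*_{ij}n_j, \text{ on } \Gamma_1\}. \tag{11.34}$$

We can summarize the last results by

$$\inf_{u \in U}\{J(u)\} = \alpha \leq F_1^*(u^*) - (G^{**} \circ \Lambda)^*(u^*) \leq F^*(z^*) + \sup_{v^* \in D_1^* \cap D_2^*}\{-G^*(v^*)\}, \tag{11.35}$$

$\forall z^* \in C_1^* \cap C_2^*$. Hence we can write

$$\inf_{u \in U}\{J(u)\} \leq \inf_{z^* \in Y_2^*}\ \sup_{v^* \in E^*}\{F^*(z^*) - G^*(v^*)\} \tag{11.36}$$

where $E^* = E_1 \cap E_2$,

$$E_1 = \{(v^*, z^*) \in Y^* \mid \sigma_{ij,j} + Q_{ij,j} - z^*_{ij,j} + f_i = 0, \text{ a.e. in } S\}, \tag{11.37}$$

$$E_2 = \{(v^*, z^*) \in Y^* \mid \sigma^*_{ij}n_j + Q_{ij}n_j = z^*_{ij}n_j, \text{ on } \Gamma_1\}, \tag{11.38}$$

and $Y^* = Y^* \times Y_1^*$.

Finally, we have to prove that $G^*_L(v^*) = G^*(v^*)$ if $\sigma_K$ is positive definite in $S$. We start by formally calculating $g^*_L(y^*)$, the Legendre transform of $g(y)$, where

$$g(y) = \frac{1}{2}H_{ijkl}\left(y_{1ij} + \frac{1}{2}y_{2mi}y_{2mj}\right)\left(y_{1kl} + \frac{1}{2}y_{2mk}y_{2ml}\right) + \frac{K}{2}y_{2mi}y_{2mi}. \tag{11.39}$$

We recall that

$$g^*_L(y^*) = \langle y, y^*\rangle_{\mathbb{R}^{18}} - g(y) \tag{11.40}$$

where $y \in \mathbb{R}^{18}$ is the solution of the equation

$$y^* = \frac{\partial g(y)}{\partial y}. \tag{11.41}$$

Thus

$$y^*_{1ij} = \sigma_{ij} = H_{ijkl}\left(y_{1kl} + \frac{1}{2}y_{2mk}y_{2ml}\right) \tag{11.42}$$

and

$$y^*_{2mi} = Q_{mi} = H_{ijkl}\left(y_{1kl} + \frac{1}{2}y_{2ok}y_{2ol}\right)y_{2mj} + K y_{2mi} \tag{11.43}$$

so that

$$Q_{mi} = \sigma_{ij}y_{2mj} + K y_{2mi}. \tag{11.44}$$

Inverting these last equations, we have

$$y_{2mi} = \bar\sigma^K_{ij}Q_{mj} \tag{11.45}$$

where

$$\{\bar\sigma^K_{ij}\} = \sigma_K^{-1} = \begin{pmatrix}
\sigma_{11} + K & \sigma_{12} & \sigma_{13} \\
\sigma_{21} & \sigma_{22} + K & \sigma_{23} \\
\sigma_{31} & \sigma_{32} & \sigma_{33} + K
\end{pmatrix}^{-1} \tag{11.46}$$

and also

$$y_{1ij} = \bar H_{ijkl}\sigma_{kl} - \frac{1}{2}y_{2mi}y_{2mj}. \tag{11.47}$$

Finally

$$g^*_L(\sigma, Q) = \frac{1}{2}\bar H_{ijkl}\sigma_{ij}\sigma_{kl} + \frac{1}{2}\bar\sigma^K_{ij}Q_{mi}Q_{mj}. \tag{11.48}$$

Now we will prove that $g^*_L(v^*) = g^*(v^*)$ if $\sigma_K$ is positive definite. First observe that

$$\begin{aligned}
g^*(v^*) &= \sup_{y \in \mathbb{R}^{18}}\Big\{\langle y_1, \sigma\rangle_{\mathbb{R}^9} + \langle y_2, Q\rangle_{\mathbb{R}^9} - \frac{1}{2}H_{ijkl}\Big(y_{1ij} + \frac{1}{2}y_{2mi}y_{2mj}\Big)\Big(y_{1kl} + \frac{1}{2}y_{2mk}y_{2ml}\Big) - \frac{K}{2}y_{2mi}y_{2mi}\Big\} \\
&= \sup_{(y_1, y_2) \in \mathbb{R}^9 \times \mathbb{R}^9}\Big\{\Big\langle y_{1ij} - \frac{1}{2}y_{2mi}y_{2mj}, \sigma_{ij}\Big\rangle_{\mathbb{R}} + \langle y_2, Q\rangle_{\mathbb{R}^9} - \frac{1}{2}H_{ijkl}[y_{1ij}][y_{1kl}] - \frac{K}{2}y_{2mi}y_{2mi}\Big\}.
\end{aligned}$$

The result follows just observing that

$$\sup_{y_1 \in \mathbb{R}^9}\Big\{\langle y_{1ij}, \sigma_{ij}\rangle_{\mathbb{R}} - \frac{1}{2}H_{ijkl}[y_{1ij}][y_{1kl}]\Big\} = \frac{1}{2}\bar H_{ijkl}\sigma_{ij}\sigma_{kl} \tag{11.49}$$

and

$$\sup_{y_2 \in \mathbb{R}^9}\Big\{\Big\langle -\frac{1}{2}y_{2mi}y_{2mj}, \sigma_{ij}\Big\rangle_{\mathbb{R}} + \langle y_2, Q\rangle_{\mathbb{R}^9} - \frac{K}{2}y_{2mi}y_{2mi}\Big\} = \frac{1}{2}\bar\sigma^K_{ij}Q_{mi}Q_{mj} \tag{11.50}$$

if $\sigma_K$ is positive definite.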
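The algebra leading to (11.48) can be checked numerically in a scalar analogue. The sketch below assumes, purely for illustration, that all tensors collapse to real numbers: $g(y_1, y_2) = \frac{H}{2}(y_1 + \frac{1}{2}y_2^2)^2 + \frac{K}{2}y_2^2$, so that $\sigma = H(y_1 + \frac{1}{2}y_2^2)$, $Q = (\sigma + K)y_2$ and the closed form reads $g^*_L(\sigma, Q) = \sigma^2/(2H) + Q^2/(2(\sigma + K))$. The function names are hypothetical.

```python
def g(y1, y2, H, K):
    """Scalar analogue of (11.39), including the leading factor 1/2."""
    s = y1 + 0.5 * y2 * y2
    return 0.5 * H * s * s + 0.5 * K * y2 * y2


def dual_variables(y1, y2, H, K):
    """(sigma, Q) = gradient of g, the scalar analogue of (11.42)-(11.44)."""
    sigma = H * (y1 + 0.5 * y2 * y2)
    Q = sigma * y2 + K * y2
    return sigma, Q


def legendre_by_definition(y1, y2, H, K):
    """<y, y*> - g(y) evaluated at y* = grad g(y), as in (11.40)."""
    sigma, Q = dual_variables(y1, y2, H, K)
    return y1 * sigma + y2 * Q - g(y1, y2, H, K)


def legendre_closed_form(sigma, Q, H, K):
    """Scalar analogue of (11.48); requires sigma + K != 0."""
    return sigma * sigma / (2.0 * H) + Q * Q / (2.0 * (sigma + K))


if __name__ == "__main__":
    H, K = 2.0, 5.0
    y1, y2 = -0.3, 0.7          # sigma is negative here, but sigma + K > 0
    sigma, Q = dual_variables(y1, y2, H, K)
    print("sigma =", sigma, " Q =", Q)
    print("by definition:", legendre_by_definition(y1, y2, H, K))
    print("closed form  :", legendre_closed_form(sigma, Q, H, K))
```

The sample point has $\sigma < 0$ with $\sigma + K > 0$, which mirrors the role of the shift $K$ in the theorem: the transform remains well defined for compressive stresses as long as $\sigma_K$ is positive definite.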



The second duality principle is summarized by the following theorem, which is similar to Theorem 11.1.2. However, some details are different, so that again we provide a complete proof. Note that this result is particularly applicable when the optimal matrix $\{\sigma_{ij}\}$ is negative definite.

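Theorem 11.3.1. Define $J_K : U \times Y \to \mathbb{R}$ as

$$J_K(u, p) = G(\Lambda u + p) + K\langle p, p\rangle_Y - F(\Lambda_1 u) - \frac{K}{2}\langle p, p\rangle_Y - \langle u, f\rangle_{L^2(S;\mathbb{R}^3)}. \tag{11.51}$$

Suppose $K$ is big enough so that $J_K : U \times Y \to \mathbb{R}$ is bounded below, where

$$G(\Lambda u) = \frac{1 + \varepsilon}{2}\int_S H_{ijkl}\left(\frac{1}{2}(u_{i,j} + u_{j,i} + u_{m,i}u_{m,j})\right)\left(\frac{1}{2}(u_{k,l} + u_{l,k} + u_{m,k}u_{m,l})\right)dx \tag{11.52}$$

and

$$F(\Lambda_1 u) = \frac{\varepsilon}{2}\int_S H_{ijkl}\left(\frac{1}{2}(u_{i,j} + u_{j,i} + u_{m,i}u_{m,j})\right)\left(\frac{1}{2}(u_{k,l} + u_{l,k} + u_{m,k}u_{m,l})\right)dx. \tag{11.53}$$

Here $\Lambda : U \to Y$ is expressed as:

$$\Lambda u = \left\{\frac{1}{2}(u_{i,j} + u_{j,i}) + \frac{u_{m,i}u_{m,j}}{2}\right\}$$

and $\Lambda_1 : U \to Y = Y_1 \times Y_2 \equiv L^2(S; \mathbb{R}^9) \times L^2(S; \mathbb{R}^9)$ is given by

$$\Lambda_1 u = \{\Lambda_{11}u, \Lambda_{12}u\} \equiv \left\{\frac{1}{2}(u_{i,j} + u_{j,i}),\ u_{m,i}\right\}. \tag{11.54}$$

Thus we can write

$$\begin{aligned}
\inf_{u \in U}\{J_K(u)\} \leq \inf_{(z^*, (\sigma, Q), u) \in Y^* \times A^* \times U}\Big\{ & F^*(z^*, Q, \sigma) - G^*_L(z^*) + \frac{1}{2K}\langle z^*, z^*\rangle_Y \\
& + \sum_{i,j=1}^{3} K\Big\|\frac{u_{i,j} + u_{j,i}}{2} + \frac{u_{m,i}u_{m,j}}{2} - \frac{\bar H_{ijkl}}{1 + \varepsilon}z^*_{kl}\Big\|_2^2 \Big\},
\end{aligned} \tag{11.55}$$

where

$$A^* = \{(\sigma, Q) \in Y^* \mid \sigma_{ij,j} - Q_{ij,j} + f_i = 0 \text{ in } S,\ (\sigma_{ij} - Q_{ij})n_j = 0 \text{ on } \Gamma_1\}, \tag{11.56}$$

$$F^*(z^*, \sigma, Q) = \frac{1}{2\varepsilon}\int_S \bar H_{ijkl}(z^*_{ij} - \sigma_{ij})(z^*_{kl} - \sigma_{kl})\,dx - \frac{1}{2}\int_S \bar\sigma_{ij}Q_{mi}Q_{mj}\,dx \tag{11.57}$$

if $\{\sigma_{ij}\}$ is negative definite in $S$, and also $\{\bar\sigma_{ij}\} = \{\sigma_{ij}\}^{-1}$. Finally,

$$G^*_L(z^*) = \frac{1}{2(1 + \varepsilon)}\int_S \bar H_{ijkl}z^*_{ij}z^*_{kl}\,dx. \tag{11.58}$$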

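Proof. Defining $\alpha = \inf_{(u,p) \in U \times Y}\{J_K(u, p)\}$,

$$G_2(u, p) = G(\Lambda u + p) + K\langle p, p\rangle_Y$$

and

$$G_1(u, p) = F(\Lambda_1 u) + \frac{K}{2}\langle p, p\rangle_Y + \langle u, f\rangle_{L^2(S;\mathbb{R}^3)}$$

we have

$$J_K(u, p) = G_2(u, p) - G_1(u, p) \geq \alpha, \quad \forall u \in U,\ p \in Y.$$

That is

$$-G_1(u, p) \geq -G_2(u, p) + \alpha, \quad \forall u \in U,\ p \in Y,$$

so that

$$\sup_{(u,p) \in U \times Y}\{\langle \Lambda u + p, z^*\rangle_Y - G_1(u, p)\} \geq \langle \Lambda u + p, z^*\rangle_Y - G_2(u, p) + \alpha. \tag{11.59}$$

However, from the standard results of variational convex analysis

$$\sup_{(u,p) \in U \times Y}\{\langle \Lambda u + p, z^*\rangle_Y - G_1(u, p)\} = \inf_{(\sigma, Q) \in A^*}\Big\{F^*(z^*, \sigma, Q) + \frac{1}{2K}\langle z^*, z^*\rangle_Y\Big\}.$$

On the other hand, particularly for $p$ satisfying the relation

$$z^* = \frac{\partial G(\Lambda u + p)}{\partial z},$$

from the well known properties of the Legendre transform, we have:

$$p = \frac{\partial G^*_L(z^*)}{\partial z^*} - \Lambda u.$$

Therefore

$$\langle \Lambda u + p, z^*\rangle_Y - G_2(u, p) = G^*_L(z^*) - \sum_{i,j=1}^{3} K\Big\|\frac{u_{i,j} + u_{j,i}}{2} + \frac{u_{m,i}u_{m,j}}{2} - \frac{\bar H_{ijkl}}{1 + \varepsilon}z^*_{kl}\Big\|_2^2.$$

Replacing the last results into (11.59) we obtain

$$\begin{aligned}
\inf_{u \in U}\{J_K(u)\} \leq \inf_{(z^*, (\sigma, Q)) \in Y^* \times A^*}\Big\{ & F^*(z^*, Q, \sigma) - G^*_L(z^*) + \frac{1}{2K}\langle z^*, z^*\rangle_Y \\
& + \sum_{i,j=1}^{3} K\Big\|\frac{u_{i,j} + u_{j,i}}{2} + \frac{u_{m,i}u_{m,j}}{2} - \frac{\bar H_{ijkl}}{1 + \varepsilon}z^*_{kl}\Big\|_2^2 \Big\}.
\end{aligned} \tag{11.60}$$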


Remark 11.3.2. From the expression of the dual formulation we conjecture that the duality gap between the primal and dual problems goes to zero as $K \to \infty$, if we have a critical point $u_0 \in U$ for which the corresponding matrix $\{\sigma_{ij}(u_0)\}$ is negative definite a.e. in $S$ and

$$-\frac{1}{2}\int_S \sigma_{ij}(u_0)v_{mi}v_{mj}\,dx - \frac{1}{2}\int_S H_{ijkl}v_{ij}v_{kl}\,dx \geq 0, \quad \forall v \in L^2(S; \mathbb{R}^9).$$

A proof and/or full justification of such a conjecture is planned for a future work.

11.4. Conclusion

In this chapter we have presented three different dual variational formulations for a non-linear model in Elasticity. The second principle is particularly useful when there is a critical point for which the matrix of stresses is negative definite.

CHAPTER 12

Duality Applied to a Membrane Shell Model


In this chapter we establish dual variational formulations for the elastic membrane shell model presented in [15] (P. Ciarlet). Ciarlet proves existence of solutions; however, the complementary energy, as developed in [40] (J.J. Telega), is possible only for a special class of external loads that generate a critical point with positive definite membrane tensor. In this chapter we have the goal of relaxing such constraints, and start by describing the primal formulation.

Consider a domain $S \subset \mathbb{R}^2$ and an injective mapping $\vec\theta : S \times [-\varepsilon, \varepsilon] \to \mathbb{R}^3$ such that $S_0 = \vec\theta(S)$ denotes the middle surface of a shell of thickness $2\varepsilon$.

The mapping $\vec\theta$ may be expressed as:

$$\vec\theta(x) = \vec\theta_1(y) + x_3 a_3(y)$$

where $a_3 = (a_1 \times a_2)/\|a_1 \times a_2\|$ and $a_\alpha = \partial_\alpha\vec\theta$. Here $y = (y_1, y_2)$ denotes the curvilinear coordinates, $x = (y, x_3)$ and $-\varepsilon \leq x_3 \leq \varepsilon$.

The contravariant basis of the tangent plane to $S$ in $y$, denoted by $\{a^\alpha\}$, is defined through the relations

$$a_\alpha \cdot a^\beta = \delta^\beta_\alpha, \quad \text{and} \quad a^3 = a_3.$$

The covariant components of the metric tensor, denoted by $a_{\alpha\beta}$, are defined as

$$a_{\alpha\beta} = a_\alpha \cdot a_\beta$$

and the corresponding contravariant components are denoted by $a^{\alpha\beta}$ and expressed as

$$a^{\alpha\beta} = a^\alpha \cdot a^\beta.$$

It is not difficult to show that $a^\alpha = a^{\alpha\beta}a_\beta$, $\{a^{\alpha\beta}\} = \{a_{\alpha\beta}\}^{-1}$ and $a_\alpha = a_{\alpha\beta}a^\beta$. We also define $\sqrt{a(y)} = \|a_1(y) \times a_2(y)\|$. The curvature tensor, denoted by $\{b_{\alpha\beta}\}$, is expressed as

$$b_{\alpha\beta} = a^3 \cdot \partial_\alpha a_\beta.$$
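The relations between the covariant basis, the metric components and the contravariant basis can be verified pointwise. The sketch below takes two tangent vectors $a_1$, $a_2$ at a single point as given (in practice they would come from differentiating the mapping $\vec\theta$), and computes $a_3$, $a_{\alpha\beta}$, $a^{\alpha\beta}$, $a^\alpha$ and $\sqrt{a}$. The function name and the sample vectors are hypothetical.

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def surface_geometry(a1, a2):
    """Pointwise metric data from linearly independent tangent vectors a1, a2."""
    normal = cross(a1, a2)
    sqrt_a = math.sqrt(dot(normal, normal))       # sqrt(a) = ||a1 x a2||
    a3 = tuple(c / sqrt_a for c in normal)        # unit normal a_3 = a^3
    cov = [[dot(a1, a1), dot(a1, a2)],
           [dot(a2, a1), dot(a2, a2)]]            # a_{alpha beta}
    det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
    con = [[cov[1][1] / det, -cov[0][1] / det],
           [-cov[1][0] / det, cov[0][0] / det]]   # a^{alpha beta} = inverse of cov
    dual = [tuple(con[al][0] * a1[i] + con[al][1] * a2[i] for i in range(3))
            for al in range(2)]                   # a^alpha = a^{alpha beta} a_beta
    return {"sqrt_a": sqrt_a, "a3": a3, "cov": cov, "con": con, "dual": dual}


if __name__ == "__main__":
    a1, a2 = (1.0, 0.0, 0.0), (1.0, 2.0, 0.0)
    geo = surface_geometry(a1, a2)
    print("sqrt(a) =", geo["sqrt_a"])
    print("a^1 =", geo["dual"][0], " a^2 =", geo["dual"][1])
```

Note that $\sqrt{a}$ computed as $\|a_1 \times a_2\|$ agrees with the square root of $\det\{a_{\alpha\beta}\}$, which is why it appears as the area element in the integrals of this chapter.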


Concerning the displacements due to the action of external loads, we denote them by $\eta = \eta_i a^i$, and the admissible displacements field is denoted by $U$, where

$$U = \{\eta \in W^{1,4}(S; \mathbb{R}^3) \mid \eta = (0, 0, 0) \text{ on } \Gamma_0\}$$

and here $\Gamma = \Gamma_0 \cup \Gamma_1$ ($\Gamma_0 \cap \Gamma_1 = \emptyset$) denotes the boundary of $S$.

We now state Theorem 9-1-1 of reference [15] (Mathematical Elasticity, Vol. III - Theory of Shells), by P. Ciarlet.

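Theorem 12.1.1. Let $S \subset \mathbb{R}^2$ be a domain and $\vec\theta \in C^2(S; \mathbb{R}^3)$ be an injective mapping such that the two vectors $a_\alpha = \partial_\alpha\vec\theta$ are linearly independent in all points of $S$, let $a_3 = (a_1 \times a_2)/\|a_1 \times a_2\|$, and let the vectors $a^i$ be defined by $a^i \cdot a_j = \delta^i_j$. Given a displacement field $\eta_i a^i$ of the surface $S_0 = \vec\theta(S)$, let the covariant components of the change of metric tensor associated with this displacement field be defined by

$$G_{\alpha\beta}(\eta) = \frac{1}{2}(a_{\alpha\beta}(\eta) - a_{\alpha\beta}),$$

where $a_{\alpha\beta}$ and $a_{\alpha\beta}(\eta)$ denote the covariant components of the metric tensors of the surfaces $\vec\theta(S)$ and $(\vec\theta + \eta_i a^i)(S)$ respectively. Then

$$G_{\alpha\beta}(\eta) = (\eta_{\alpha\|\beta} + \eta_{\beta\|\alpha} + a^{mn}\eta_{m\|\alpha}\eta_{n\|\beta})/2$$

where $\eta_{\alpha\|\beta} = \partial_\beta\eta_\alpha - \Gamma^\sigma_{\alpha\beta}\eta_\sigma - b_{\alpha\beta}\eta_3$ and $\eta_{3\|\beta} = \partial_\beta\eta_3 + b^\sigma_\beta\eta_\sigma$.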

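We define the constitutive relations as

$$N^{\alpha\beta} = a_\varepsilon^{\alpha\beta\sigma\tau}G_{\sigma\tau}(\eta) \tag{12.1}$$

where $\{N^{\alpha\beta}\}$ denotes the membrane forces, and

$$a_\varepsilon^{\alpha\beta\sigma\tau} = \frac{\varepsilon}{4}\left(\frac{4\lambda\mu}{\lambda + 2\mu}a^{\alpha\beta}a^{\sigma\tau} + 2\mu(a^{\alpha\sigma}a^{\beta\tau} + a^{\alpha\tau}a^{\beta\sigma})\right). \tag{12.2}$$

The potential energy (stored energy plus external work) is expressed by the functional $J : U \to \mathbb{R}$ where

$$J(\eta) = \frac{1}{2}\int_S a_\varepsilon^{\alpha\beta\sigma\tau}G_{\sigma\tau}(\eta)G_{\alpha\beta}(\eta)\sqrt{a}\,dy - \int_S p^i\eta_i\sqrt{a}\,dy,$$

where $p^i = \int_{-\varepsilon}^{\varepsilon} f^i\,dx_3$; here $f^i \in L^2(S \times [-\varepsilon, \varepsilon]; \mathbb{R}^3)$ denotes the external load density.

We will define $(G \circ \Lambda) : U \to \mathbb{R}$ and $F : U \to \mathbb{R}$ as

$$(G \circ \Lambda)(\eta) = \frac{1}{2}\int_S a_\varepsilon^{\alpha\beta\sigma\tau}G_{\sigma\tau}(\eta)G_{\alpha\beta}(\eta)\sqrt{a}\,dy \tag{12.3}$$

and

$$F(\eta) = \int_S p^i\eta_i\sqrt{a}\,dy, \tag{12.4}$$

where $\Lambda = \{\Lambda_{1\alpha\beta}, \Lambda_{2m\alpha}\}$, $\Lambda_{1\alpha\beta}(\eta) = (\eta_{\alpha\|\beta} + \eta_{\beta\|\alpha})/2$ and $\Lambda_{2m\alpha}(\eta) = \eta_{m\|\alpha}$. Thus, the primal variational formulation is given by $J : U \to \mathbb{R}$, where

$$J(\eta) = G(\Lambda\eta) - F(\eta). \tag{12.5}$$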


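We will be concerned with the Legendre Transform related to the function $g : \mathbb{R}^{10} \to \mathbb{R}$ expressed as

$$g(y) = \frac{1}{2}a_\varepsilon^{\alpha\beta\sigma\tau}\left(y_{1\alpha\beta} + \frac{1}{2}a^{mn}y_{2m\alpha}y_{2n\beta}\right)\left(y_{1\sigma\tau} + \frac{1}{2}a^{kl}y_{2k\sigma}y_{2l\tau}\right). \tag{12.6}$$

Its Legendre Transform, denoted by $g^*_L : \mathbb{R}^{10}_L \to \mathbb{R}$, is given by

$$g^*_L(y^*) = \langle y, y^*\rangle_{\mathbb{R}^{10}} - g(y) \tag{12.7}$$

where $y \in \mathbb{R}^{10}$ is the solution of the system

$$y^* = \frac{\partial g(y)}{\partial y}. \tag{12.8}$$

That is,

$$y^*_{1\alpha\beta} = a_\varepsilon^{\alpha\beta\sigma\tau}\left(y_{1\sigma\tau} + \frac{1}{2}a^{kl}y_{2k\sigma}y_{2l\tau}\right) \tag{12.9}$$

and

$$y^*_{2m\alpha} = a_\varepsilon^{\alpha\beta\sigma\tau}\left(y_{1\sigma\tau} + \frac{1}{2}a^{kl}y_{2k\sigma}y_{2l\tau}\right)a^{mn}y_{2n\beta} \tag{12.10}$$

or

$$y^*_{2m\alpha} = y^*_{1\alpha\beta}a^{mn}y_{2n\beta}. \tag{12.11}$$

Therefore, after simple algebraic manipulations we would obtain

$$g^*_L(y^*) = \frac{1}{2}\bar a_{\alpha\beta\sigma\tau}y^*_{1\sigma\tau}y^*_{1\alpha\beta} + \frac{1}{2}\bar R_{\alpha\beta mn}y^*_{2m\alpha}y^*_{2n\beta}, \tag{12.12}$$

where

$$\{\bar R_{\alpha\beta mn}\} = \{y^*_{1\alpha\beta}a^{mn}\}^{-1}. \tag{12.13}$$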

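Thus, denoting $v^* = \{N^{\alpha\beta}, Q^{m\alpha}\}$ we have

$$G^*_L(v^*) = \int_S g^*_L(v^*)\sqrt{a}\,dy = \frac{1}{2}\int_S \bar a_{\alpha\beta\sigma\tau}N^{\sigma\tau}N^{\alpha\beta}\sqrt{a}\,dy + \frac{1}{2}\int_S \bar R_{\alpha\beta mn}Q^{m\alpha}Q^{n\beta}\sqrt{a}\,dy. \tag{12.14}$$

We now obtain the polar functional related to the external load.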

12.3. The Polar Functional Related to F : U → R

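The polar functional related to $F : U \to \mathbb{R}$, for $v^* = \{N^{\alpha\beta}, Q^{m\alpha}\}$, is expressed by

$$F^*(\Lambda^* v^*) = \sup_{\eta \in U}\{\langle \eta, \Lambda^* v^*\rangle_U - F(\eta)\}. \tag{12.15}$$

That is,

$$F^*(\Lambda^* v^*) = \sup_{\eta \in U}\Big\{\langle \Lambda_{1\alpha\beta}\eta, N^{\alpha\beta}\sqrt{a}\rangle_{L^2(S)} + \langle \Lambda_{2m\alpha}\eta, Q^{m\alpha}\sqrt{a}\rangle_{L^2(S)} - \int_S p^i\eta_i\sqrt{a}\,dy\Big\}. \tag{12.16}$$

Thus

$$F^*(\Lambda^* v^*) = \begin{cases}
0, & \text{if } v^* \in C^*, \\
+\infty, & \text{otherwise},
\end{cases} \tag{12.17}$$

where

$$v^* \in C^* \Leftrightarrow \begin{cases}
-(N^{\alpha\beta} + Q^{\alpha\beta})|_\beta + b^\alpha_\beta Q^{3\beta} = p^\alpha & \text{in } S, \\
-b_{\alpha\beta}(N^{\alpha\beta} + Q^{\alpha\beta}) - Q^{3\beta}|_\beta = p^3 & \text{in } S, \\
(N^{\alpha\beta} + Q^{\alpha\beta})\nu_\beta = 0 & \text{on } \Gamma_1, \\
Q^{3\beta}|_\beta\nu_\beta = 0 & \text{on } \Gamma_1.
\end{cases} \tag{12.18}$$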

12.4. The Final Format of First Duality Principle

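The duality principle presented in Theorem 10.5.1 is applied to the present case. Thus we obtain

$$\inf_{\eta \in U}\{J(\eta)\} \leq \inf_{v^* \in C^*}\Big\{\frac{1}{2K}\langle v^*, v^*\rangle_{L^2(S,\mathbb{R}^{10})} - G^*_L(v^*) + K\Big\|\Lambda\eta - \frac{\partial g^*_L(v^*)}{\partial y^*}\Big\|^2_{L^2(S)}\Big\}.$$

More explicitly

$$\begin{aligned}
&\inf_{\eta \in U}\Big\{\frac{1}{2}\int_S a_\varepsilon^{\alpha\beta\sigma\tau}G_{\sigma\tau}(\eta)G_{\alpha\beta}(\eta)\sqrt{a}\,dy - \int_S p^i\eta_i\sqrt{a}\,dy\Big\} \\
&\leq \inf_{v^* \in C^*}\Big\{\frac{1}{2K}\Big(\int_S N^{\alpha\beta}N^{\alpha\beta}\sqrt{a}\,dy + \int_S Q^{m\alpha}Q^{m\alpha}\sqrt{a}\,dy\Big) \\
&\qquad - \frac{1}{2}\int_S \bar a_{\alpha\beta\sigma\tau}N^{\sigma\tau}N^{\alpha\beta}\sqrt{a}\,dy - \frac{1}{2}\int_S \bar R_{\alpha\beta mn}Q^{m\alpha}Q^{n\beta}\sqrt{a}\,dy \\
&\qquad + \sum_{\alpha,\beta=1}^{2} K\big\|\eta_{\alpha\|\beta} - \bar a_{\alpha\beta\sigma\tau}N^{\sigma\tau} - a^{kl}v_{02k\alpha}v_{02l\beta}\big\|^2_{L^2(S)} + \sum_{m=1}^{3}\sum_{\alpha=1}^{2} K\big\|\eta_{m\|\alpha} - v_{02m\alpha}\big\|^2_{L^2(S)}\Big\}.
\end{aligned}$$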


Our objective now is to establish a different dual formulation. First we will write the primal formulation as the difference of two convex functionals and then will obtain the second duality principle, as indicated in the next theorem.

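Theorem 12.5.1. Define $J : U \to \mathbb{R}$ as

$$J(\eta) = \tilde G(\Lambda\eta) - F(\Lambda_2\eta) \tag{12.19}$$

where

$$\tilde G(\Lambda\eta) = G^{**}(\Lambda\eta) - \langle \eta, p\rangle, \tag{12.20}$$

$$G(\Lambda\eta) = \frac{1}{2}\int_S a_\varepsilon^{\alpha\beta\sigma\tau}G_{\sigma\tau}(\eta)G_{\alpha\beta}(\eta)\sqrt{a}\,dy + \frac{K}{2}\int_S \eta_{m\|\alpha}\eta_{m\|\alpha}\sqrt{a}\,dy, \tag{12.21}$$

$$F(\Lambda_2\eta) = \frac{K}{2}\int_S \eta_{m\|\alpha}\eta_{m\|\alpha}\sqrt{a}\,dy, \tag{12.22}$$

$$\langle \eta, p\rangle = \int_S p^i\eta_i\sqrt{a}\,dy, \tag{12.23}$$

and $\Lambda : U \to Y_1^* \times Y_2^* \equiv L^2(S; \mathbb{R}^4) \times L^2(S; \mathbb{R}^6)$ is defined as

$$\Lambda\eta = \{\Lambda_1\eta, \Lambda_2\eta\} \equiv \left\{\frac{1}{2}(\eta_{\alpha\|\beta} + \eta_{\beta\|\alpha}),\ \eta_{m\|\alpha}\right\}. \tag{12.24}$$

Thus, we may write

$$\inf_{\eta \in U}\{J(\eta)\} \leq \inf_{u^* \in U^*}\ \sup_{v^* \in C^*(u^*)}\{F^*(u^*) - G^*(v^*)\} \tag{12.25}$$

where

$$v^* \equiv \{N, Q\}, \tag{12.26}$$

$$F^*(u^*) = \frac{1}{2K}\int_S u^*_{mi}u^*_{mi}\sqrt{a}\,dy, \tag{12.27}$$

$$G^*(v^*) = G^*_L(v^*) = \int_S g^*_L(v^*)\sqrt{a}\,dy = \frac{1}{2}\int_S \bar a_{\alpha\beta\sigma\tau}N^{\sigma\tau}N^{\alpha\beta}\sqrt{a}\,dy + \frac{1}{2}\int_S \bar R_{\alpha\beta mn}Q^{m\alpha}Q^{n\beta}\sqrt{a}\,dy \tag{12.28}$$

if $N_K = \{N^{\alpha\beta}a^{mn} + K\delta^{\alpha\beta}\delta^{mn}\}$ is positive definite in $S$. Furthermore

$$\{\bar R_{\alpha\beta mn}\} = N_K^{-1} \tag{12.29}$$

and

$$\{\bar a_{\alpha\beta\sigma\tau}\} = \{a_\varepsilon^{\alpha\beta\sigma\tau}\}^{-1}. \tag{12.30}$$

Finally,

$$v^* \equiv \{N, Q\} \in C^*(u^*) \Leftrightarrow \begin{cases}
-(N^{\alpha\beta} + Q^{\alpha\beta} - u^{*\alpha\beta})|_\beta + b^\alpha_\beta(Q^{3\beta} - u^{*3\beta}) = p^\alpha & \text{in } S, \\
-b_{\alpha\beta}(N^{\alpha\beta} + Q^{\alpha\beta} - u^{*\alpha\beta}) - (Q^{3\beta} - u^{*3\beta})|_\beta = p^3 & \text{in } S, \\
(N^{\alpha\beta} + (Q^{\alpha\beta} - u^{*\alpha\beta}))\nu_\beta = 0 & \text{on } \Gamma_1, \\
(Q^{3\beta} - u^{*3\beta})|_\beta\nu_\beta = 0 & \text{on } \Gamma_1.
\end{cases} \tag{12.31}$$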

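Proof. Defining $\alpha \equiv \inf_{\eta \in U}\{J(\eta)\}$ we have

$$\tilde G(\Lambda\eta) - F(\Lambda_2\eta) \geq \alpha, \quad \forall \eta \in U, \tag{12.32}$$

or

$$-F(\Lambda_2\eta) \geq -\tilde G(\Lambda\eta) + \alpha, \quad \forall \eta \in U, \tag{12.33}$$

so that

$$\sup_{\eta \in U}\{\langle \Lambda_2\eta, u^*\rangle - F(\Lambda_2\eta)\} \geq \sup_{\eta \in U}\{\langle \Lambda_2\eta, u^*\rangle - \tilde G(\Lambda\eta)\} + \alpha. \tag{12.34}$$

However,

$$F^*(u^*) \geq \sup_{\eta \in U}\{\langle \Lambda_2\eta, u^*\rangle - F(\Lambda_2\eta)\}. \tag{12.35}$$

On the other hand, from Theorem 8.2.5

$$(\tilde G \circ \Lambda)^*(\Lambda_2^* u^*) = \sup_{\eta \in U}\{\langle \Lambda_2\eta, u^*\rangle - \tilde G(\Lambda\eta)\} = \inf_{v^* \in C^*(u^*)}\{G^*(v^*)\} \tag{12.36}$$

where

$$v^* \equiv \{N, Q\} \in C^*(u^*) \Leftrightarrow \begin{cases}
-(N^{\alpha\beta} + Q^{\alpha\beta} - u^{*\alpha\beta})|_\beta + b^\alpha_\beta(Q^{3\beta} - u^{*3\beta}) = p^\alpha & \text{in } S, \\
-b_{\alpha\beta}(N^{\alpha\beta} + Q^{\alpha\beta} - u^{*\alpha\beta}) - (Q^{3\beta} - u^{*3\beta})|_\beta = p^3 & \text{in } S, \\
(N^{\alpha\beta} + (Q^{\alpha\beta} - u^{*\alpha\beta}))\nu_\beta = 0 & \text{on } \Gamma_1, \\
(Q^{3\beta} - u^{*3\beta})|_\beta\nu_\beta = 0 & \text{on } \Gamma_1.
\end{cases} \tag{12.37}$$

We can summarize the last results by

$$\inf_{\eta \in U}\{J(\eta)\} = \alpha \leq F^*(u^*) - (\tilde G \circ \Lambda)^*(\Lambda_2^* u^*) = F^*(u^*) + \sup_{v^* \in C^*(u^*)}\{-G^*(v^*)\}, \tag{12.38}$$

$\forall u^* \in U^*$.

Finally, we have to prove that $G^*_L(v^*) = G^*(v^*)$ if $N_K$ is positive definite in $S$. We start by formally calculating $g^*_L(y^*)$, the Legendre transform of $g(y)$, where $g : \mathbb{R}^{10} \to \mathbb{R}$ is expressed as:

$$g(y) = \frac{1}{2}a_\varepsilon^{\alpha\beta\sigma\tau}\left(y_{1\alpha\beta} + \frac{1}{2}a^{mn}y_{2m\alpha}y_{2n\beta}\right)\left(y_{1\sigma\tau} + \frac{1}{2}a^{kl}y_{2k\sigma}y_{2l\tau}\right) + \frac{K}{2}y_{2m\alpha}y_{2m\alpha}. \tag{12.39}$$

Observe that $g^*_L : \mathbb{R}^{10}_L \to \mathbb{R}$ is given by

$$g^*_L(y^*) = \langle y, y^*\rangle_{\mathbb{R}^{10}} - g(y) \tag{12.40}$$

where $y \in \mathbb{R}^{10}$ is the solution of the system

$$y^* = \frac{\partial g(y)}{\partial y}. \tag{12.41}$$

That is,

$$y^*_{1\alpha\beta} = a_\varepsilon^{\alpha\beta\sigma\tau}\left(y_{1\sigma\tau} + \frac{1}{2}a^{kl}y_{2k\sigma}y_{2l\tau}\right) \tag{12.42}$$

and

$$y^*_{2m\alpha} = a_\varepsilon^{\alpha\beta\sigma\tau}\left(y_{1\sigma\tau} + \frac{1}{2}a^{kl}y_{2k\sigma}y_{2l\tau}\right)a^{mn}y_{2n\beta} + K y_{2m\alpha} \tag{12.43}$$

or

$$y^*_{2m\alpha} = y^*_{1\alpha\beta}a^{mn}y_{2n\beta} + K y_{2m\alpha}. \tag{12.44}$$

Therefore, after simple algebraic manipulations we would obtain

$$g^*_L(y^*) = \frac{1}{2}\bar a_{\alpha\beta\sigma\tau}y^*_{1\sigma\tau}y^*_{1\alpha\beta} + \frac{1}{2}\bar R_{\alpha\beta mn}y^*_{2m\alpha}y^*_{2n\beta} \tag{12.45}$$

where

$$\{\bar R_{\alpha\beta mn}\} = \{y^*_{1\alpha\beta}a^{mn} + K\delta^{\alpha\beta}\delta^{mn}\}^{-1}. \tag{12.46}$$

Thus, through the notation $v^* = \{N^{\alpha\beta}, Q^{m\alpha}\}$ we would obtain

$$G^*_L(v^*) = \int_S g^*_L(v^*)\sqrt{a}\,dy = \frac{1}{2}\int_S \bar a_{\alpha\beta\sigma\tau}N^{\sigma\tau}N^{\alpha\beta}\sqrt{a}\,dy + \frac{1}{2}\int_S \bar R_{\alpha\beta mn}Q^{m\alpha}Q^{n\beta}\sqrt{a}\,dy \tag{12.47}$$

where

$$\{\bar R_{\alpha\beta mn}\} = \{N^{\alpha\beta}a^{mn} + K\delta^{\alpha\beta}\delta^{mn}\}^{-1}. \tag{12.48}$$

Now we will prove that $g^*_L(v^*) = g^*(v^*)$ if $N_K$ is positive definite. First observe that

$$\begin{aligned}
g^*(v^*) &= \sup_{y \in \mathbb{R}^{10}}\Big\{\langle y_1, N\rangle_{\mathbb{R}^4} + \langle y_2, Q\rangle_{\mathbb{R}^6} - \frac{1}{2}a_\varepsilon^{\alpha\beta\sigma\tau}\Big[y_{1\alpha\beta} + \frac{1}{2}a^{mn}y_{2m\alpha}y_{2n\beta}\Big]\Big[y_{1\sigma\tau} + \frac{1}{2}a^{kl}y_{2k\sigma}y_{2l\tau}\Big] - \frac{K}{2}y_{2m\alpha}y_{2m\alpha}\Big\} \\
&= \sup_{(y_1, y_2) \in \mathbb{R}^4 \times \mathbb{R}^6}\Big\{\Big\langle y_{1\alpha\beta} - \frac{1}{2}a^{mn}y_{2m\alpha}y_{2n\beta}, N^{\alpha\beta}\Big\rangle_{\mathbb{R}} + \langle y_2, Q\rangle_{\mathbb{R}^6} - \frac{1}{2}a_\varepsilon^{\alpha\beta\sigma\tau}[y_{1\alpha\beta}][y_{1\sigma\tau}] - \frac{K}{2}y_{2m\alpha}y_{2m\alpha}\Big\}.
\end{aligned} \tag{12.49}$$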


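The result follows just observing that

$$\sup_{y_1 \in \mathbb{R}^4}\Big\{\langle y_{1\alpha\beta}, N^{\alpha\beta}\rangle_{\mathbb{R}} - \frac{1}{2}a_\varepsilon^{\alpha\beta\sigma\tau}[y_{1\alpha\beta}][y_{1\sigma\tau}]\Big\} = \frac{1}{2}\bar a_{\alpha\beta\sigma\tau}N^{\sigma\tau}N^{\alpha\beta} \tag{12.50}$$

and

$$\sup_{y_2 \in \mathbb{R}^6}\Big\{\Big\langle -\frac{1}{2}a^{mn}y_{2m\alpha}y_{2n\beta}, N^{\alpha\beta}\Big\rangle_{\mathbb{R}} + \langle y_2, Q\rangle_{\mathbb{R}^6} - \frac{K}{2}y_{2m\alpha}y_{2m\alpha}\Big\} = \frac{1}{2}\bar R_{\alpha\beta mn}Q^{m\alpha}Q^{n\beta} \tag{12.51}$$

if $N_K$ is positive definite.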

12.6. Conclusion

We obtained three different dual variational formulations for the non-linear elastic shell model studied in P. Ciarlet [15] (a membrane shell model). The first duality principle presented is an extension of a theorem found in Toland [42]. The solution behavior as $K \to +\infty$ is of particular interest for a future work.

The second duality principle relaxes the condition of positive definite membrane tensor, and thus the constant $K$ must be chosen so that the matrix $N_K$ is positive definite at the equilibrium point, where $N_K = \{N^{\alpha\beta}a^{mn} + K\delta^{\alpha\beta}\delta^{mn}\}$.

CHAPTER 13

Duality Applied to Ginzburg-Landau Type Equations

13.1. Introduction

In this article, our first objectives are to show existence and develop dual formulations concerning the real semi-linear Ginzburg-Landau equation. We start by describing the primal formulation.

By $S \subset \mathbb{R}^3$ we denote an open connected bounded set with a sufficiently regular boundary $\Gamma = \partial S$ (regular enough so that the Sobolev Imbedding Theorem holds). The Ginzburg-Landau equation is given by:

$$\begin{cases}
-\nabla^2 u + \alpha\left(\dfrac{u^2}{2} - \beta\right)u - f = 0 & \text{in } S, \\
u = 0 & \text{on } \Gamma,
\end{cases} \tag{13.1}$$

where u : S → R denotes the primal field and f ∈ L2(S). Moreover,α, β are real positive constants.

Remark 13.1.1. The complex Ginzburg-Landau equation plays a fundamental role in the theory of superconductivity (see [3] for details). In the present work we deal with the simpler real form; however, the results obtained may be easily extended to the complex case.

The corresponding variational formulation is given by the functional $J : U \to \mathbb{R}$ where:

$$J(u) = \frac{1}{2}\int_S |\nabla u|^2\,dx + \frac{\alpha}{2}\int_S \left(\frac{u^2}{2} - \beta\right)^2 dx - \int_S f u\,dx \tag{13.2}$$

where $U = \{u \in W^{1,2}(S) \mid u = 0 \text{ on } \Gamma\} = W_0^{1,2}(S)$.

We are particularly concerned with the fact that the equations indicated in (13.1) are necessary conditions for the solution of Problem P, where:

Problem P: to find $u_0 \in U$ such that $J(u_0) = \min_{u \in U}\{J(u)\}$.


13.1.1. Existence of Solution for the Ginzburg-Landau Equation. We start with a remark.

Remark 13.1.2. From the Sobolev Imbedding Theorem for

$$mp < n, \quad n - mp < n, \quad p \leq q \leq p^* = np/(n - mp),$$

we have:

$$W^{j+m,p}(\Omega) \hookrightarrow W^{j,q}(\Omega).$$

Therefore, considering $n = 3$, $m = 1$, $j = 0$, $p = 2$, and $q = 4$, we obtain:

$$W^{1,2}(\Omega) \subset L^4(\Omega) \subset L^2(\Omega)$$

and thus

$$\|u\|_{L^4(\Omega)} \to +\infty \Rightarrow \|u\|_{W^{1,2}(\Omega)} \to +\infty.$$

Furthermore, from the above and the Poincaré inequality it is clear that for $J$ given by (13.2), we have:

$$J(u) \to +\infty \text{ as } \|u\|_{W^{1,2}(S)} \to +\infty,$$

that is, $J$ is coercive.

Now we establish the existence of a minimizer for $J : U \to \mathbb{R}$. It is a well known procedure (the direct method of the Calculus of Variations). We present it here for the sake of completeness.
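The exponent bookkeeping in Remark 13.1.2 is easy to reproduce. The short sketch below computes the critical exponent $p^* = np/(n - mp)$ exactly and tests whether a given $q$ lies in the admissible range $p \leq q \leq p^*$; the function names are hypothetical, and the check only covers the case $mp < n$ stated in the remark.

```python
from fractions import Fraction


def sobolev_critical_exponent(n, m, p):
    """Return p* = n*p / (n - m*p); only valid in the case m*p < n."""
    if m * p >= n:
        raise ValueError("the formula p* = np/(n - mp) requires mp < n")
    return Fraction(n * p, n - m * p)


def embedding_range_contains(n, m, p, q):
    """True when p <= q <= p*, the range stated in Remark 13.1.2."""
    return p <= q <= sobolev_critical_exponent(n, m, p)


if __name__ == "__main__":
    n, m, p = 3, 1, 2
    print("p* =", sobolev_critical_exponent(n, m, p))
    print("q = 4 admissible:", embedding_range_contains(n, m, p, 4))
```

For $n = 3$, $m = 1$, $p = 2$ the admissible range contains $q = 4$, which is exactly what is needed to control the quartic term of $J$ by the $W^{1,2}$ norm.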

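Theorem 13.1.3. For $\alpha, \beta \in \mathbb{R}^+$ and $f \in L^2(S)$ there exists at least one $u_0 \in U$ such that

$$
J(u_0) = \min_{u \in U} J(u),
$$

where

$$
J(u) = \frac{1}{2}\int_S |\nabla u|^2\,dx + \frac{\alpha}{2}\int_S \left(\frac{u^2}{2} - \beta\right)^2 dx - \int_S f u\,dx
$$

and $U = \{u \in W^{1,2}(S) \mid u = 0 \text{ on } \Gamma\} = W_0^{1,2}(S)$.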

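Proof. From Remark 13.1.2 we have

$$
J(u) \to +\infty \text{ as } \|u\|_U \to +\infty.
$$

Also, from the Poincaré inequality, there exists $\alpha_1 \in \mathbb{R}$ such that $\alpha_1 = \inf_{u \in U} J(u)$, so that, for a minimizing sequence $\{u_n\}$, in the sense that

$$
J(u_n) \to \alpha_1 \text{ as } n \to +\infty, \tag{13.3}
$$

we have that $\{\|u_n\|_U\}$ is bounded, and thus, as $W_0^{1,2}(S)$ is reflexive, there exist $u_0 \in W_0^{1,2}(S)$ and a subsequence $\{u_{n_j}\} \subset \{u_n\}$ such that

$$
u_{n_j} \rightharpoonup u_0, \text{ weakly in } W_0^{1,2}(S). \tag{13.4}
$$

From (13.4), by the Rellich-Kondrachov theorem, up to a subsequence, which is also denoted by $\{u_{n_j}\}$, we have

$$
u_{n_j} \to u_0, \text{ strongly in } L^2(S). \tag{13.5}
$$

Furthermore, defining $J_1 : U \to \mathbb{R}$ as

$$
J_1(u) = \frac{1}{2}\int_S |\nabla u|^2\,dx + \frac{\alpha}{8}\int_S u^4\,dx - \int_S f u\,dx,
$$

we have that $J_1 : U \to \mathbb{R}$ is convex and strongly continuous, therefore weakly lower semi-continuous, so that

$$
\liminf_{j \to +\infty} J_1(u_{n_j}) \ge J_1(u_0). \tag{13.6}
$$

On the other hand, from (13.5),

$$
\int_S (u_{n_j})^2\,dx \to \int_S u_0^2\,dx, \text{ as } j \to +\infty. \tag{13.7}
$$

Expanding the square in (13.2) gives $J(u) = J_1(u) - \frac{\alpha\beta}{2}\int_S u^2\,dx + \frac{\alpha\beta^2}{2}|S|$, and thus, from (13.6) and (13.7), we may write

$$
\alpha_1 = \inf_{u \in U} J(u) = \liminf_{j \to +\infty} J(u_{n_j}) \ge J(u_0).
$$

Since $u_0 \in U$, this gives $J(u_0) = \min_{u \in U} J(u)$.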
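The direct method above is non-constructive. As a minimal numerical sketch (an illustration added here, not a procedure from the text), one may discretize a one-dimensional analogue of (13.2) on $S = (0,1)$ by finite differences and run plain gradient descent. The grid size, the step size, the load $f \equiv 1$, the choice $\alpha = \beta = 1$ and the names `J_h`, `grad_J_h` and `descend` are all assumptions made for illustration.

```python
import numpy as np

def J_h(u, f, alpha, beta, h):
    """Discrete 1D analogue of (13.2) on (0,1) with u = 0 at both ends.

    u holds the interior nodal values; h is the (uniform) mesh width.
    """
    ue = np.concatenate(([0.0], u, [0.0]))      # enforce u = 0 on the boundary
    du = np.diff(ue) / h                        # forward differences for u'
    grad_term = 0.5 * h * np.sum(du**2)
    well_term = 0.5 * alpha * h * np.sum((0.5 * u**2 - beta) ** 2)
    load_term = h * np.sum(f * u)
    return grad_term + well_term - load_term

def grad_J_h(u, f, alpha, beta, h):
    """Exact gradient of J_h; setting it to zero is a discrete form of (13.1)."""
    ue = np.concatenate(([0.0], u, [0.0]))
    lap = (ue[2:] - 2.0 * ue[1:-1] + ue[:-2]) / h**2
    return h * (-lap + alpha * (0.5 * u**2 - beta) * u - f)

def descend(n=19, alpha=1.0, beta=1.0, iters=5000):
    """Plain gradient descent from u = 0 (all parameters are illustrative)."""
    h = 1.0 / (n + 1)
    f = np.ones(n)            # assumed load, f = 1 on S
    u = np.zeros(n)
    step = 0.25 * h           # below the rough stability bound h/2
    J_start = J_h(u, f, alpha, beta, h)
    for _ in range(iters):
        u = u - step * grad_J_h(u, f, alpha, beta, h)
    return u, J_start, J_h(u, f, alpha, beta, h)

u_min, J_start, J_end = descend()
print(J_start, J_end)
```

For these parameter values the discrete functional is convex, so descent reaches its unique critical point; for larger $\alpha\beta$ the functional is non-convex and descent may stop at a local minimizer only.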

13.2. A Concave Dual Variational Formulation

We start this section by stating the following theorem, which has been proven in Chapter 10.


$J(u) = (G \circ \Lambda)(u) - F(\Lambda_1 u) - \langle u, u_0^* \rangle_U$ is bounded from below, where $\Lambda : U \to Y$ and $\Lambda_1 : U \to Y_1$ are continuous linear operators.

Then we may write:

infz∗∈Y ∗

1


F ∗(z∗) −G∗(v∗) ≥ infu∈U

J(u)



∗ − u∗0 = 0.

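Our next result refers to a convex dual variational formulation, through which we obtain sufficient conditions for optimality.

Theorem 13.2.2. Consider $J : U \to \mathbb{R}$, where

$$
J(u) = \int_S \frac{1}{2}|\nabla u|^2\,dx + \int_S \frac{\alpha}{2}\left(\frac{u^2}{2} - \beta\right)^2 dx - \int_S f u\,dx,
$$

and $U = W_0^{1,2}(S)$. For $K = 1/K_0$, where $K_0$ stands for the constant related to the Poincaré inequality, we have the following duality principle:

$$
\inf_{u \in U} J(u) \ge \sup_{(z^*, v_1^*, v_0^*) \in B^*} \{-G_L^*(v^*, z^*)\},
$$

where

$$
G_L^*(v^*, z^*) = \frac{1}{2K^2}\int_S |\nabla z^*|^2\,dx - \frac{1}{2K}\int_S (z^*)^2\,dx + \frac{1}{2}\int_S \frac{(v_1^*)^2}{v_0^* + K}\,dx + \frac{1}{2\alpha}\int_S (v_0^*)^2\,dx + \beta\int_S v_0^*\,dx, \tag{13.8}
$$

and

$$
B^* = \left\{(z^*, v_1^*, v_0^*) \in L^2(S; \mathbb{R}^3) \;\middle|\; -\frac{1}{K}\nabla^2 z^* + v_1^* - z^* = f,\ v_0^* + K > 0 \text{ a.e. in } S,\ z^* = 0 \text{ on } \Gamma\right\}.
$$

If in addition there exists $u_0 \in U$ such that $\delta J(u_0) = \theta$ and $\bar{v}_0^* + K = (\alpha/2)u_0^2 - \beta + K > 0$ a.e. in $S$, then

$$
J(u_0) = \min_{u \in U} J(u) = \max_{(z^*, v_1^*, v_0^*) \in B^*} \{-G_L^*(v^*, z^*)\} = -G_L^*(\bar{v}^*, \bar{z}^*),
$$

where

$$
\bar{v}_0^* = \frac{\alpha}{2}u_0^2 - \beta, \qquad \bar{v}_1^* = (\bar{v}_0^* + K)u_0, \qquad \bar{z}^* = K u_0.
$$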

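Proof. Observe that we may write

$$
J(u) = G(\Lambda u) - F(\Lambda_1 u) - \int_S f u\,dx,
$$

where

$$
G(\Lambda u) = \int_S \frac{1}{2}|\nabla u|^2\,dx + \int_S \frac{\alpha}{2}\left(\frac{u^2}{2} - \beta + 0\right)^2 dx + \frac{K}{2}\int_S u^2\,dx,
$$

$$
F(\Lambda_1 u) = \frac{K}{2}\int_S u^2\,dx,
$$

with $\Lambda u = \{\Lambda_0 u, \Lambda_1 u, \Lambda_2 u\}$ and

$$
\Lambda_0 u = 0, \quad \Lambda_1 u = u, \quad \Lambda_2 u = \nabla u.
$$

From Theorem 13.2.1 (here this is an auxiliary theorem through which we obtain $A^*$, indicated below), we have

$$
\inf_{u \in U} J(u) = \inf_{z^* \in Y_1^*} \sup_{v^* \in A^*} \{F^*(z^*) - G^*(v^*)\},
$$

where

$$
F^*(z^*) = \frac{1}{2K}\int_S (z^*)^2\,dx,
$$

and

$$
G^*(v^*) = \frac{1}{2}\int_S |v_2^*|^2\,dx + \frac{1}{2}\int_S \frac{(v_1^*)^2}{v_0^* + K}\,dx + \frac{1}{2\alpha}\int_S (v_0^*)^2\,dx + \beta\int_S v_0^*\,dx,
$$

if $v_0^* + K > 0$ a.e. in $S$, and

$$
A^* = \{v^* \in Y^* \mid \Lambda^* v^* - \Lambda_1^* z^* - f = 0\},
$$

or

$$
A^* = \{(z^*, v^*) \in L^2(S) \times L^2(S; \mathbb{R}^5) \mid -\operatorname{div}(v_2^*) + v_1^* - z^* - f = 0 \text{ a.e. in } S\}.
$$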
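The expression for $G^*$ can be checked by a short pointwise computation. The following is a sketch under the assumption (suggested by the term "$-\beta + 0$" and by $\Lambda_0 u = 0$) that $G$ is read as a functional of $v = (v_0, v_1, v_2) \in Y$, with integrand $\frac{1}{2}|v_2|^2 + \frac{\alpha}{2}\left(\frac{v_1^2}{2} - \beta + v_0\right)^2 + \frac{K}{2}v_1^2$; the auxiliary variable $w$ is introduced here only for illustration. Setting $w = \frac{v_1^2}{2} - \beta + v_0$, the integrand of $\langle v, v^* \rangle_Y - G(v)$ splits as

$$
\begin{aligned}
& v_0 v_0^* + v_1 v_1^* + v_2 \cdot v_2^* - \frac{1}{2}|v_2|^2 - \frac{\alpha}{2}w^2 - \frac{K}{2}v_1^2 \\
&\quad = \Big[w v_0^* - \frac{\alpha}{2}w^2\Big] + \Big[v_1 v_1^* - \frac{1}{2}(v_0^* + K)v_1^2\Big] + \Big[v_2 \cdot v_2^* - \frac{1}{2}|v_2|^2\Big] + \beta v_0^*.
\end{aligned}
$$

Each bracket is a concave quadratic in its own variable, provided $v_0^* + K > 0$, so the suprema are

$$
\sup_w \Big[w v_0^* - \frac{\alpha}{2}w^2\Big] = \frac{(v_0^*)^2}{2\alpha}, \qquad
\sup_{v_1} \Big[v_1 v_1^* - \frac{1}{2}(v_0^* + K)v_1^2\Big] = \frac{(v_1^*)^2}{2(v_0^* + K)}, \qquad
\sup_{v_2} \Big[v_2 \cdot v_2^* - \frac{1}{2}|v_2|^2\Big] = \frac{1}{2}|v_2^*|^2,
$$

and integrating the sum over $S$ recovers the four terms of $G^*(v^*)$. The second supremum is attained at $v_1 = v_1^*/(v_0^* + K)$, which is the relation $\bar{v}_1^* = (\bar{v}_0^* + K)u_0$ of Theorem 13.2.2 read backwards.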

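Observe that

$$
G^*(v^*) \ge \langle \Lambda u, v^* \rangle_Y - G(\Lambda u), \quad \forall u \in U,\ v^* \in A^*,
$$

and thus

$$
-F^*(z^*) + G^*(v^*) \ge -F^*(z^*) + \langle \Lambda_1 u, z^* \rangle_{L^2(S)} + \langle u, f \rangle_U - G(\Lambda u), \tag{13.9}
$$

and hence, making $z^*$ an independent variable through $A^*$, from (13.9) we may write

$$
\sup_{z^* \in L^2(S)} \{-F^*(z^*) + G^*(v_2^*(v_1^*, z^*), v_1^*, v_0^*)\} \ge \sup_{z^* \in L^2(S)} \left\{-F^*(z^*) + \langle \Lambda_1 u, z^* \rangle_{L^2(S)} + \int_S f u\,dx - G(\Lambda u)\right\}, \tag{13.10}
$$

so that

$$
\begin{aligned}
\sup_{z^* \in L^2(S)} \Bigg\{ & -\frac{1}{2K}\int_S (z^*)^2\,dx + \frac{1}{2}\int_S (v_2^*(z^*, v_1^*))^2\,dx + \frac{1}{2}\int_S \frac{(v_1^*)^2}{v_0^* + K}\,dx \\
& + \frac{1}{2\alpha}\int_S (v_0^*)^2\,dx + \beta\int_S v_0^*\,dx \Bigg\} \ge F(\Lambda_1 u) + \int_S f u\,dx - G(\Lambda u).
\end{aligned}
\tag{13.11}
$$

Therefore, if $K \le 1/K_0$ (here $K_0$ denotes the constant concerning the Poincaré inequality), the supremum on the left side of (13.11) is attained through the relations

$$
v_2^* = \frac{\nabla z^*}{K} \quad \text{and} \quad z^* = 0 \text{ on } \Gamma,
$$

so that the final format of our duality principle is given by

$$
\begin{aligned}
\inf_{u \in U} J(u) \ge \sup_{(z^*, v_1^*, v_0^*) \in B^*} \Bigg\{ & -\frac{1}{2K^2}\int_S |\nabla z^*|^2\,dx + \frac{1}{2K}\int_S (z^*)^2\,dx \\
& - \frac{1}{2}\int_S \frac{(v_1^*)^2}{v_0^* + K}\,dx - \frac{1}{2\alpha}\int_S (v_0^*)^2\,dx - \beta\int_S v_0^*\,dx \Bigg\},
\end{aligned}
\tag{13.12}
$$

where

$$
B^* = \left\{(z^*, v_1^*, v_0^*) \in L^2(S; \mathbb{R}^3) \;\middle|\; -\frac{1}{K}\nabla^2 z^* + v_1^* - z^* = f,\ v_0^* + K > 0 \text{ a.e. in } S,\ z^* = 0 \text{ on } \Gamma\right\}.
$$

The remaining conclusions follow from an application (with minor changes) of Theorem 8.1.27.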

Remark 13.2.3. The relations

$$
v_2^* = \frac{\nabla z^*}{K} \quad \text{and} \quad z^* = 0 \text{ on } \Gamma
$$

are sufficient for attainability of the supremum indicated in (13.11) but only partially necessary. However, we assume them because the expression of the dual problem is simplified without violating inequality (13.12) (in fact, the difference between the primal and dual functionals even increases under such relations).


13.3. Applications to Phase Transition in Polymers

We consider now a variational problem closely related to the Ginzburg-Landau formulation (see [12] and other references therein for more information on how this applies to phase transition in polymers). For an open bounded $S \subset \mathbb{R}^3$ with a sufficiently regular boundary denoted by $\Gamma$, let us define $J : U \times V \to \mathbb{R}$ as

$$
J(u, v) = \frac{\varepsilon}{2}\int_S |\nabla u|^2\,dx + \frac{1}{4\varepsilon}\int_S (u^2 - 1)^2\,dx + \frac{\gamma}{2}\int_S |\nabla v|^2\,dx,
$$

under the constraints

$$
\frac{1}{|S|}\int_S u\,dx = m, \tag{13.13}
$$

$$
-\nabla^2 v = u - m, \tag{13.14}
$$

and

$$
\int_S v\,dx = 0. \tag{13.15}
$$

Here $U = V = W^{1,2}(S)$, $\varepsilon$ is a small constant and $-1 < m < 1$. We may rewrite the primal functional, now denoting it by $J : U \times V \to \bar{\mathbb{R}} = \mathbb{R} \cup \{+\infty\}$, as

$$
J(u, v) = G_1(\Lambda(u, v)) - F(u).
$$

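Here $\Lambda : U \times V \to Y \equiv L^2(S) \times L^4(S) \times L^2(S; \mathbb{R}^3) \times L^2(S; \mathbb{R}^3)$ is defined as

$$
\Lambda(u, v) = \{\Lambda_0 u = 0,\ \Lambda_1 u = u,\ \Lambda_2 u = \nabla u,\ \Lambda_3 v = \nabla v\},
$$

and

$$
G_1(\Lambda(u, v)) = G(\Lambda(u, v)) + \mathrm{Ind}_1(u, v) + \mathrm{Ind}_2(u, v) + \mathrm{Ind}_3(u, v),
$$

where

$$
G(\Lambda(u, v)) = \frac{\varepsilon}{2}\int_S |\nabla u|^2\,dx + \frac{1}{4\varepsilon}\int_S (u^2 - 1 + 0)^2\,dx + \frac{K}{2}\int_S u^2\,dx + \frac{\gamma}{2}\int_S |\nabla v|^2\,dx,
$$

$$
\mathrm{Ind}_1(u, v) =
\begin{cases}
0, & \text{if } \dfrac{1}{|S|}\displaystyle\int_S u\,dx = m,\\[6pt]
+\infty, & \text{otherwise},
\end{cases}
$$

$$
\mathrm{Ind}_2(u, v) =
\begin{cases}
0, & \text{if } \nabla^2 v + u - m = 0 \text{ a.e. in } S,\\
+\infty, & \text{otherwise},
\end{cases}
$$

$$
\mathrm{Ind}_3(u, v) =
\begin{cases}
0, & \text{if } \displaystyle\int_S v\,dx = 0,\\[6pt]
+\infty, & \text{otherwise},
\end{cases}
$$

and

$$
F(u) = \frac{K}{2}\int_S u^2\,dx.
$$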

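Similarly to Theorem 13.2.1, through appropriate Lagrange multipliers for the constraints, we may write

$$
\inf_{(u,v) \in U \times V} J(u, v) \le \inf_{z^* \in Y_1^*} \sup_{(u^*, v^*, \lambda) \in A^*} \left\{F^*(z^*) - G^*(u^*, v^*) + \lambda_1 m + \int_S \lambda_2 m\,dx\right\},
$$

where

$$
G^*(u^*, v^*) = \frac{1}{2\varepsilon}\int_S |u_1^*|^2\,dx + \frac{1}{2}\int_S \frac{(u_2^*)^2}{2u_0^* + K}\,dx + \varepsilon\int_S (u_0^*)^2\,dx + \int_S u_0^*\,dx + \frac{1}{2\gamma}\int_S |v^*|^2\,dx,
$$

if $2u_0^* + K > 0$ a.e. in $S$, and

$$
F^*(z^*) = \frac{1}{2K}\int_S (z^*)^2\,dx.
$$

Also, defining $\tilde{Y}^* = L^2(S) \times Y^* \times \mathbb{R} \times L^2(S) \times \mathbb{R}$, we have $A^* = A_1^* \cap A_2^* \cap A_3^*$, where

$$
A_1^* = \left\{(z^*, u^*, \lambda) \in \tilde{Y}^* \;\middle|\; \operatorname{div}(u_1^*) - u_2^* + z^* - \frac{\lambda_1}{|S|} - \lambda_2 = 0 \text{ a.e. in } S \text{ and } u_1^* \cdot n = 0 \text{ on } \Gamma\right\},
$$

$$
A_2^* = \left\{(z^*, u^*, \lambda) \in \tilde{Y}^* \;\middle|\; \nabla^2\lambda_2 + \lambda_3 - \operatorname{div}(v^*) = 0 \text{ a.e. in } S,\ v^* \cdot n + \frac{\partial\lambda_2}{\partial n} = \lambda_2 = 0 \text{ on } \Gamma\right\},
$$

and

$$
A_3^* = \{(z^*, u^*, \lambda) \in \tilde{Y}^* \mid 2u_0^* + K > 0 \text{ a.e. in } S\}.
$$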


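Similarly to the case of the Ginzburg-Landau formulation, we can obtain

$$
\begin{aligned}
\inf_{(u,v) \in U \times V} J(u, v) \ge \sup_{(z^*, u^*, v^*, \lambda) \in C^*} \Bigg\{ & -\frac{\varepsilon}{2K^2}\int_S |\nabla z^*|^2\,dx + \frac{1}{2K}\int_S (z^*)^2\,dx - \frac{1}{2}\int_S \frac{(u_2^*)^2}{2u_0^* + K}\,dx \\
& - \varepsilon\int_S (u_0^*)^2\,dx + \int_S u_0^*\,dx - \frac{1}{2\gamma}\int_S |v^*|^2\,dx - \lambda_1 m - \int_S \lambda_2 m\,dx \Bigg\},
\end{aligned}
$$

where $C^* = C_1^* \cap A_2^* \cap A_3^*$ and

$$
C_1^* = \left\{(z^*, u^*, \lambda) \in \tilde{Y}^* \;\middle|\; \frac{\varepsilon\nabla^2 z^*}{K} - u_2^* + z^* - \frac{\lambda_1}{|S|} - \lambda_2 = 0 \text{ a.e. in } S \text{ and } \frac{\partial z^*}{\partial n} = 0 \text{ on } \Gamma\right\}.
$$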

Remark 13.3.1. It is important to emphasize that, by analogy with Section 13.2, we may obtain sufficient conditions for optimality. That is, if there exists a critical point for the dual formulation for which $2u_0^* + K > 0$ a.e. in $S$ and $K = \varepsilon/K_0$ (where $K_0$ stands for the constant related to the Poincaré inequality), then the corresponding primal point obtained through the Legendre transform relations is also a global minimizer, and the last inequality is in fact an equality, as has been pointed out in Theorem 13.2.2.

13.3.1. Another Two Phase Model in Polymers. The following problem has applications in two phase models in polymers: to minimize the functional $J : U \times V \to \mathbb{R}$ (here consider $S \subset \mathbb{R}^3$ as above), where

$$
J(u, v) = |Du|(S) + \frac{\gamma}{2}\int_S |\nabla v|^2\,dx,
$$

under the constraints

$$
\int_S u\,dx = m,
$$

$$
-\nabla^2 v = u - m, \tag{13.16}
$$

where

$$
U = BV(S, \{-1, 1\}), \tag{13.17}
$$

and $V = W^{1,2}(S)$ (here $BV$ denotes the space of functions with bounded variation in $S$ and $|Du|(S)$ denotes the total variation of $u$ in $S$).

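Redefining $U$ as $U = W^{1,1}(S) \cap L^2(S)$, we rewrite the primal formulation, through suitable Lagrange multipliers, now denoting it by $J(u, v, \lambda_1, \lambda_2, \lambda_3)$ ($J : U \times V \times L^2(S) \times \mathbb{R} \times L^2(S) \to \mathbb{R}$), as

$$
J(u, v, \lambda) = \int_S |\nabla u|\,dx + \frac{\gamma}{2}\int_S |\nabla v|^2\,dx + \int_S \frac{\lambda_1}{2}(u^2 - 1)\,dx + \lambda_2\left(\int_S u\,dx - m\right) + \int_S \lambda_3(\nabla^2 v + u - m)\,dx, \tag{13.18}
$$

where $\lambda = (\lambda_1, \lambda_2, \lambda_3)$. We may write

$$
J(u, v, \lambda) = G(\Lambda(u, v)) + F(u, v, \lambda),
$$

where

$$
G(\Lambda(u, v)) = \int_S |\nabla u|\,dx + \frac{\gamma}{2}\int_S |\nabla v|^2\,dx.
$$

Here $\Lambda : U \times V \to Y = L^2(S; \mathbb{R}^3) \times L^2(S; \mathbb{R}^3)$ is defined as

$$
\Lambda(u, v) = \{\Lambda_1 u = \nabla u,\ \Lambda_2 v = \nabla v\},
$$

and

$$
F(u, v, \lambda) = \lambda_2\left(\int_S u\,dx - m\right) + \int_S \lambda_3(\nabla^2 v + u - m)\,dx + \int_S \frac{\lambda_1}{2}(u^2 - 1)\,dx.
$$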

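Defining

$$
\tilde{U} = \left\{(u, v) \in U \times V \;\middle|\; \int_S u\,dx = m,\ \nabla^2 v + u - m = 0,\ u^2 = 1 \text{ a.e. in } S\right\},
$$

from the Lagrange multiplier version of Theorem 8.2.5 we have that

$$
\inf_{(u,v) \in \tilde{U}} \{G(\Lambda(u, v)) + F(u, v, \theta)\} = \sup_{(v^*, \lambda) \in Y^* \times B^*} \{-G^*(v^*) - F^*(-\Lambda^* v^*, \lambda)\},
$$

where

$$
G^*(v^*) = \sup_{v \in Y} \{\langle v, v^* \rangle_Y - G(v)\} = \frac{1}{2\gamma}\int_S |v_2^*|^2\,dx + \mathrm{Ind}_0(v_1^*),
$$

and

$$
\mathrm{Ind}_0(v_1^*) =
\begin{cases}
0, & \text{if } |v_1^*|^2 \le 1 \text{ a.e. in } S,\\
+\infty, & \text{otherwise}.
\end{cases}
$$

We may define

$$
\mathrm{Ind}_1(v_1^*) =
\begin{cases}
0, & \text{if } v_1^* \cdot n = 0 \text{ on } \Gamma,\\
+\infty, & \text{otherwise},
\end{cases}
$$

and

$$
\mathrm{Ind}_2(v_2^*) =
\begin{cases}
0, & \text{if } \operatorname{div}(v_2^*) - \nabla^2\lambda_3 = 0 \text{ a.e. in } S \text{ and } v_2^* \cdot n + \dfrac{\partial\lambda_3}{\partial n} = \lambda_3 = 0 \text{ on } \Gamma,\\[6pt]
+\infty, & \text{otherwise},
\end{cases}
$$

so that

$$
F^*(-\Lambda^* v^*, \lambda) = \int_S |\operatorname{div}(v_1^*) - \lambda_2 - \lambda_3|\,dx + \mathrm{Ind}_2(v_2^*) + \mathrm{Ind}_1(v_1^*) + \lambda_2 m + \int_S \lambda_3 m\,dx.
$$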

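Therefore, we can summarize the last results by the following duality principle:

$$
\inf_{(u,v) \in \tilde{U}} \{G(\Lambda(u, v)) + F(u, v, \theta)\} = \sup_{(v^*, \lambda) \in A^* \cap B^*} \left\{-\frac{1}{2\gamma}\int_S |v_2^*|^2\,dx - \int_S |\operatorname{div}(v_1^*) - \lambda_2 - \lambda_3|\,dx - \lambda_2 m - \int_S \lambda_3 m\,dx\right\},
$$

where

$$
A^* = \{v^* \in Y^* = L^2(S; \mathbb{R}^3) \times L^2(S; \mathbb{R}^3) \mid |v_1^*(x)|^2 \le 1 \text{ a.e. in } S \text{ and } v_1^* \cdot n = 0 \text{ on } \Gamma\}
$$

and

$$
B^* = \left\{(v_2^*, \lambda_3) \in L^2(S; \mathbb{R}^3) \times L^2(S) \;\middle|\; \operatorname{div}(v_2^*) - \nabla^2\lambda_3 = 0 \text{ a.e. in } S,\ v_2^* \cdot n + \frac{\partial\lambda_3}{\partial n} = \lambda_3 = 0 \text{ on } \Gamma\right\}. \tag{13.19}
$$

Remark 13.3.2. It is worth noting that the last dual formulation represents a standard convex non-smooth optimization problem. The non-smoothness is responsible for a possible microstructure formation. Furthermore, such a formulation seems to be amenable to numerical computation (in a simpler way as compared to the primal approach).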

13.4. A Numerical Example

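In this section we present numerical results for a one-dimensional example originally due to Bolza (see [29] for details about the primal formulation).

Consider $J : U \to \mathbb{R}$ expressed as

$$
J(u) = \frac{1}{2}\int_0^1 ((u_{,x})^2 - 1)^2\,dx + \frac{1}{2}\int_0^1 (u - f)^2\,dx
$$

or, defining $S = [0, 1]$,

$$
G(\Lambda u) = \frac{1}{2}\int_0^1 ((u_{,x})^2 - 1)^2\,dx
$$

and

$$
F(u) = \frac{1}{2}\int_0^1 (u - f)^2\,dx,
$$

we may write

$$
J(u) = G(\Lambda u) + F(u),
$$

where, for convenience, we define $\Lambda : U \to Y \equiv L^4(S) \times L^2(S)$ as

$$
\Lambda u = \{u_{,x}, 0\}.
$$

Furthermore, we have

$$
U = \{u \in W^{1,4}(S) \mid u(0) = 0 \text{ and } u(1) = 0.5\}.
$$

For $Y = Y^* = L^4(S) \times L^2(S)$, defining

$$
G(\Lambda u + p) = \frac{1}{2}\int_S ((u_{,x} + p_1)^2 - 1.0 + p_0)^2\,dx,
$$

for $v_0^* > 0$ we obtain

$$
G(\Lambda u) + F(u) \ge \inf_{p \in Y} \{-\langle p_0, v_0^* \rangle_{L^2(S)} - \langle p_1, v_1^* \rangle_{L^2(S)} + G(\Lambda u + p) + F(u)\}
$$

or

$$
G(\Lambda u) + F(u) \ge \inf_{p \in Y} \{-\langle q_0, v_0^* \rangle_{L^2(S)} - \langle q_1, v_1^* \rangle_{L^2(S)} + G(q) + \langle 0, v_0^* \rangle_{L^2(S)} + \langle u_{,x}, v_1^* \rangle_{L^2(S)} + F(u)\}.
$$

Here $q = \Lambda u + p$, so that

$$
G(\Lambda u) + F(u) \ge -G^*(v^*) + \langle 0, v_0^* \rangle_{L^2(S)} + \langle u_{,x}, v_1^* \rangle_{L^2(S)} + F(u).
$$

That is,

$$
G(\Lambda u) + F(u) \ge -G^*(v^*) + \inf_{u \in U} \{\langle 0, v_0^* \rangle_{L^2(S)} + \langle u_{,x}, v_1^* \rangle_{L^2(S)} + F(u)\},
$$

or

$$
\inf_{u \in U} \{G(\Lambda u) + F(u)\} \ge \sup_{v^* \in A^*} \{-G^*(v^*) - F^*(-\Lambda^* v^*)\},
$$

where

$$
G^*(v^*) = \frac{1}{2}\int_S \frac{(v_1^*)^2}{v_0^*}\,dx + \frac{1}{2}\int_S (v_0^*)^2\,dx,
$$

if $v_0^* > 0$ a.e. in $S$. Also

$$
F^*(-\Lambda^* v^*) = \frac{1}{2}\int_S [(v_1^*)_{,x}]^2\,dx + \langle f, (v_1^*)_{,x} \rangle_{L^2(S)} - v_1^*(1)u(1)
$$

and

$$
A^* = \{v^* \in Y^* \mid v_0^* > 0 \text{ a.e. in } S\}.
$$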

Remark 13.4.1. Through the extremal condition $v_0^* = ((u_{,x})^2 - 1)$ and the Weierstrass condition $(u_{,x})^2 - 1.0 \ge 0$ we can see that the dual formulation is convex for $v_0^* > 0$. However, it is possible that the primal formulation has no minimizers, and we could expect a microstructure formation through $v_0^* = 0$ (that is, $u_{,x} = \pm 1$, depending on $f(x)$). To allow $v_0^* = 0$ we will redefine the primal functional as indicated below.

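Define $G_1 : U \to \mathbb{R}$ and $F_1 : U \to \mathbb{R}$ by

$$
G_1(u) = G(\Lambda u) + F(u) + \frac{K}{2}\int_S (u_{,x})^2\,dx
$$

and

$$
F_1(u) = \frac{K}{2}\int_S (u_{,x})^2\,dx.
$$

Also defining $\hat{G}(\Lambda u) = G(\Lambda u) + \frac{K}{2}\int_S (u_{,x})^2\,dx$, from Theorem 13.2.1 we can write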

infu∈U



F ∗1 (z∗) − G∗(v∗0, v

∗2) − F ∗(v∗1)

(13.20)

where

$$
F_1^*(z^*) = \frac{1}{2K}\int_S (z^*)^2\,dx,
$$

$$
\hat{G}^*(v_0^*, v_2^*) = \frac{1}{2}\int_S \frac{(v_2^*)^2}{v_0^* + K}\,dx + \frac{1}{2}\int_S (v_0^*)^2\,dx,
$$

$$
F^*(v_1^*) = \frac{1}{2}\int_S (v_1^*)^2\,dx + \langle f, v_1^* \rangle_{L^2(S)} - v_2^*(1)u(1)
$$

and

$$
B^*(z^*) = \{v^* \in Y^* \mid -(v_2^*)_{,x} + v_1^* - z^* = 0 \text{ and } v_0^* \ge 0 \text{ a.e. in } S\}.
$$

Figure 1. Vertical axis: $u_0(x)$, the weak limit of minimizing sequences for $f(x) = 0$.

We developed an algorithm based on the dual formulation indicated in (13.20). It is relevant to emphasize that such a dual formulation is convex if the supremum indicated is evaluated under the constraint $v_0^* \ge 0$ a.e. in $S$ (this result follows from the traditional Weierstrass condition, so that there is no duality gap between the primal and dual formulations and the inequality indicated in (13.20) is in fact an equality).

We present numerical results for $f(x) = 0$ (see Figure 1), $f(x) = 0.3\sin(\pi x)$ (Figure 2) and $f(x) = 0.3\cos(\pi x)$ (Figure 3). The solutions indicated as optima through the dual formulations (denoted by $u_0$) are in fact weak cluster points of minimizing sequences for the primal formulations.
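To make concrete why oscillating slopes matter in the case $f = 0$, the following sketch (an illustration added here, not the dual algorithm mentioned above) evaluates $J$ exactly on piecewise-linear sawtooth functions with slopes $\pm 1$ on $[0, 0.5]$ followed by slope $1$ on $[0.5, 1]$. The particular family and the names `sawtooth_nodes` and `bolza_J` are assumptions; the family is not claimed to be a minimizing sequence. Its values of $J$ decrease to $1/48$, while the uniform limit $\max(0, x - 0.5)$ has the strictly larger value $1/4 + 1/48$, which shows the lack of weak lower semi-continuity behind microstructure formation.

```python
import numpy as np

def sawtooth_nodes(n):
    """Nodes of a sawtooth with n teeth of slopes +1/-1 on [0, 0.5], then slope 1."""
    xs = np.linspace(0.0, 0.5, 2 * n + 1)
    # odd-indexed nodes are the tooth tips, of height 0.25 / n
    us = np.where(np.arange(2 * n + 1) % 2 == 1, 0.25 / n, 0.0)
    # final segment climbs from u(0.5) = 0 to the boundary value u(1) = 0.5
    return np.append(xs, 1.0), np.append(us, 0.5)

def bolza_J(xs, us):
    """Exact J for a piecewise-linear u with f = 0 (Bolza example of this section)."""
    L = np.diff(xs)                      # segment lengths
    a, b = us[:-1], us[1:]               # endpoint values on each segment
    s = (b - a) / L                      # constant slope on each segment
    G = 0.5 * np.sum(L * (s**2 - 1.0) ** 2)
    F = 0.5 * np.sum(L * (a * a + a * b + b * b) / 3.0)   # exact integral of u^2
    return G + F

values = [bolza_J(*sawtooth_nodes(n)) for n in (1, 2, 4, 8, 16)]
J_limit = bolza_J(np.array([0.0, 0.5, 1.0]), np.array([0.0, 0.0, 0.5]))
print(values, J_limit)
```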

Figure 2. Vertical axis: $u_0(x)$, the weak limit of minimizing sequences for $f(x) = 0.3\sin(\pi x)$.

Figure 3. Vertical axis: $u_0(x)$, the weak limit of minimizing sequences for $f(x) = 0.3\cos(\pi x)$.

13.5. A New Path for Relaxation

Similarly to Theorem 13.2.1, we may prove:

$$
\inf_{u \in U} \{G_1(\nabla u) + G_2(u) - F(u) - \langle u, f \rangle_U\} \le \inf_{z^* \in Y^*} \sup_{v^* \in Y^*} \{F^*(z^*) - G_2^*(z^* + \operatorname{div}(v^*) + f) - G_1^*(v^*)\},
$$


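where

$$
G_1(\nabla u) = \frac{\gamma}{2}\int_S |\nabla u|^2\,dx, \qquad
G_2(u) = \frac{\alpha}{4}\int_S |u|^4\,dx, \qquad
F(u) = \frac{\alpha\beta}{2}\int_S |u|^2\,dx.
$$

Also

$$
F^*(z^*) = \frac{1}{2\alpha\beta}\int_S |z^*|^2\,dx,
$$

$$
G_2^*(z^* + \operatorname{div}(v^*) + f) = \frac{3}{4\alpha^{1/3}}\int_S (z^* + \operatorname{div}(v^*) + f)^{4/3}\,dx,
$$

and

$$
G_1^*(v^*) = \frac{1}{2\gamma}\int_S |v^*|^2\,dx.
$$

Let us denote

$$
J^*(v^*, z^*) = F^*(z^*) - G_2^*(z^* + \operatorname{div}(v^*) + f) - G_1^*(v^*),
$$

and

$$
J_1^*(v^*, z^*) = F^*(z^*) - G_2^*(z^* + \operatorname{div}(v^*) + f) - \langle \nabla u, v^* \rangle_Y + \frac{\gamma}{2}\int_S |\nabla u|^2\,dx.
$$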

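The Euler-Lagrange equations for this last functional are given by

$$
\frac{\partial F^*(z^*)}{\partial z^*} - \frac{\partial G_2^*(z^* + \operatorname{div}(v^*) + f)}{\partial z^*} = 0, \text{ in } S, \tag{13.21}
$$

and

$$
\nabla u - \nabla\left(\frac{\partial G_2^*(z^* + \operatorname{div}(v^*) + f)}{\partial z^*}\right) = 0, \text{ in } S, \tag{13.22}
$$

that is,

$$
\frac{z^*}{\alpha\beta} = \left(\frac{z^* + \operatorname{div}(v^*) + f}{\alpha}\right)^{1/3}, \text{ in } S, \tag{13.23}
$$

and

$$
\nabla u - \nabla\left(\frac{z^* + \operatorname{div}(v^*) + f}{\alpha}\right)^{1/3} = 0, \text{ in } S. \tag{13.24}
$$

From (13.23) we have

$$
\alpha(z^*)^3 = (\alpha\beta)^3(z^* + \operatorname{div}(v^*) + f), \text{ in } S. \tag{13.25}
$$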


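Let us denote by $z_k^*$, where $k \in \{1, 2, 3\}$, the three roots of (13.25), which at first are assumed to be all real, and define

$$
J_k^*(v^*, z_k^*) = F^*(z_k^*) - G_2^*(z_k^* + \operatorname{div}(v^*) + f) - \frac{\langle \nabla z_k^*, v^* \rangle_Y}{\alpha\beta} + \frac{\gamma}{2(\alpha\beta)^2}\int_S |\nabla z_k^*|^2\,dx. \tag{13.26}
$$

Denoting

$$
u_k = \frac{\partial F^*(z_k^*)}{\partial z^*} = \frac{z_k^*}{\alpha\beta},
$$

we obtain

$$
F^*(z_k^*) = \langle u_k, z_k^* \rangle_Y - F(u_k),
$$

and from (13.23),

$$
u_k = \frac{\partial G_2^*(z_k^* + \operatorname{div}(v^*) + f)}{\partial z^*},
$$

that is,

$$
G_2^*(z_k^* + \operatorname{div}(v^*) + f) = \langle u_k, z_k^* + \operatorname{div}(v^*) + f \rangle_U - G_2(u_k).
$$

Therefore

$$
J_1^*(v^*, z_k^*) = \frac{\alpha}{4}\int_S |u_k|^4\,dx - \frac{\alpha\beta}{2}\int_S |u_k|^2\,dx + \frac{\gamma}{2}\int_S |\nabla u_k|^2\,dx \ge \inf_{u \in U} J(u). \tag{13.27}
$$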

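Henceforth we denote

$$
F(u) = \int_S f(u)\,dx, \qquad G_2(u) = \int_S g_2(u)\,dx,
$$

where

$$
f(u) = \frac{\alpha\beta}{2}u^2, \qquad g_2(u) = \frac{\alpha}{4}|u|^4.
$$

Therefore

$$
F^*(z^*) = \int_S f^*(z^*)\,dx, \qquad G_2^*(v^*) = \int_S g_2^*(v^*)\,dx,
$$

with

$$
f^*(z^*) = \frac{1}{2\alpha\beta}(z^*)^2
$$

and

$$
g_2^*(v^*) = \frac{3}{4\alpha^{1/3}}|v^*|^{4/3}.
$$

Let $\lambda_1, \lambda_2, \lambda_3$ be given such that $\lambda_k(x) \in \{0, 1\}$ a.e. in $S$, and $\sum_{k=1}^3 \lambda_k(x) = 1$ a.e. in $S$. Again denote by $z_k^*$, for $k \in \{1, 2, 3\}$, the three roots of equation (13.25) (at first assumed to be real).

Define

$$
u = \sum_{k=1}^3 \frac{\lambda_k z_k^*}{\alpha\beta}
$$

and

$$
z^* = \sum_{k=1}^3 \lambda_k z_k^*.
$$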

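Observe that from the values assumed by $\lambda_1, \lambda_2, \lambda_3$ we may write:

$$
\begin{aligned}
F^*(z^*) &= \int_S \sum_{k=1}^3 \lambda_k f^*(z_k^*)\,dx \\
&= \left\langle \sum_{k=1}^3 \lambda_k u_k, \sum_{k=1}^3 \lambda_k z_k^* \right\rangle_Y - \int_S \sum_{k=1}^3 \lambda_k f(u_k)\,dx \\
&= \left\langle \sum_{k=1}^3 \lambda_k u_k, \sum_{k=1}^3 \lambda_k z_k^* \right\rangle_Y - F\left(\sum_{k=1}^3 \lambda_k u_k\right),
\end{aligned}
\tag{13.28}
$$

and by analogy

$$
\begin{aligned}
G_2^*(z^* + \operatorname{div}(v^*) + f) &= \int_S \sum_{k=1}^3 \lambda_k g_2^*(z_k^* + \operatorname{div}(v^*) + f)\,dx \\
&= \left\langle \sum_{k=1}^3 \lambda_k u_k, \sum_{k=1}^3 \lambda_k z_k^* + \operatorname{div}(v^*) + f \right\rangle_Y - G_2\left(\sum_{k=1}^3 \lambda_k u_k\right).
\end{aligned}
\tag{13.29}
$$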

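Thus,

$$
\begin{aligned}
J_1^*(v^*, z^*) &= F^*(z^*) - G_2^*(z^* + \operatorname{div}(v^*) + f) - \langle \nabla u, v^* \rangle_Y + \frac{\gamma}{2}\int_S |\nabla u|^2\,dx \\
&= \int_S \sum_{k=1}^3 \lambda_k f^*(z_k^*)\,dx - \int_S \sum_{k=1}^3 \lambda_k g_2^*(z_k^* + \operatorname{div}(v^*) + f)\,dx \\
&\quad - \frac{1}{\alpha\beta}\left\langle \nabla\left(\sum_{k=1}^3 \lambda_k z_k^*\right), v^* \right\rangle_Y + \frac{\gamma}{2(\alpha\beta)^2}\int_S \left|\nabla\left(\sum_{k=1}^3 \lambda_k z_k^*\right)\right|^2 dx \\
&= \langle u, z^* \rangle_Y - F(u) - \langle u, z^* + \operatorname{div}(v^*) + f \rangle_Y + G_2(u) - \langle \nabla u, v^* \rangle_Y + G_1(\nabla u) \\
&= J(u) \ge \inf_{u \in U} J(u).
\end{aligned}
\tag{13.30}
$$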

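We may summarize the last results by

$$
\begin{aligned}
\inf_{(v^*, \lambda) \in Y^* \times B} \Bigg\{ & \int_S \sum_{k=1}^3 \lambda_k f^*(z_k^*)\,dx - \int_S \sum_{k=1}^3 \lambda_k g_2^*(z_k^* + \operatorname{div}(v^*) + f)\,dx \\
& - \frac{1}{\alpha\beta}\left\langle \nabla\left(\sum_{k=1}^3 \lambda_k z_k^*\right), v^* \right\rangle_Y + \frac{\gamma}{2(\alpha\beta)^2}\int_S \left|\nabla\left(\sum_{k=1}^3 \lambda_k z_k^*\right)\right|^2 dx \Bigg\} \ge \inf_{u \in U} J(u),
\end{aligned}
\tag{13.31}
$$

where

$$
B = \left\{\lambda = (\lambda_1, \lambda_2, \lambda_3) \text{ measurable} \;\middle|\; \lambda_k \in \{0, 1\},\ \sum_{k=1}^3 \lambda_k = 1 \text{ a.e. in } S\right\}.
$$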

Remark 13.5.1. This primal-dual formulation is particularly useful to compute the global optimal point when $\gamma$ is small. In such a case we expect the solution to move between the wells along the domain.

Remark 13.5.2. In case just one root $z_k^*$ is real, a small adaptation is necessary. For numerical results we may, for example, replace the complex roots by a big fixed integer $N$, in order not to change the global optimal point.
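At a fixed point $x \in S$, equation (13.25) is a scalar cubic in $z^*$, so its roots can be computed directly. The sketch below is one possible realization, assuming NumPy; the symbol `c` stands for the pointwise value of $\operatorname{div}(v^*) + f$, and the function name, the tolerance used to decide whether a root is real, and the value `big` (playing the role of the large number $N$ of Remark 13.5.2) are illustrative assumptions.

```python
import numpy as np

def real_roots_13_25(alpha, beta, c, big=1.0e6):
    """Roots z of alpha*z**3 = (alpha*beta)**3 * (z + c), sorted increasingly.

    c is the pointwise value of div(v*) + f. Complex roots are replaced by
    the large constant `big`, in the spirit of Remark 13.5.2.
    """
    k = (alpha * beta) ** 3
    # cubic in standard form: alpha*z^3 + 0*z^2 - k*z - k*c = 0
    roots = np.roots([alpha, 0.0, -k, -k * c])
    is_real = np.abs(roots.imag) < 1e-9 * (1.0 + np.abs(roots))
    return np.sort(np.where(is_real, roots.real, big))

print(real_roots_13_25(1.0, 1.0, 0.0))   # three real roots
print(real_roots_13_25(1.0, 1.0, 2.0))   # one real root, two placeholders
```

In a discretized computation this function would be applied at every grid point, which is why the placeholder must be chosen so that it can never be selected as the optimal branch.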


13.6. A New Numerical Procedure, the Generalized Method of Lines

Consider first the equation

$$
\nabla^2 u = 0, \text{ in } S \subset \mathbb{R}^2, \tag{13.32}
$$

with

$$
u = 0 \text{ on } \Gamma_0 \quad \text{and} \quad u = u_f \text{ on } \Gamma_1.
$$

Here $\Gamma_0$ denotes the internal boundary of $S$ and $\Gamma_1$ the external one. Assume $\Gamma_0$ is smooth and consider the simpler case where

$$
\Gamma_1 = 2\Gamma_0.
$$

Finally, also suppose there exists $r(\theta)$, a smooth function such that

$$
\Gamma_0 = \{(\theta, r(\theta)) \mid 0 \le \theta \le 2\pi\},
$$

with $r(0) = r(2\pi)$.

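In polar coordinates the above equation may be written as

$$
\frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\frac{\partial u}{\partial r} + \frac{1}{r^2}\frac{\partial^2 u}{\partial\theta^2} = 0, \text{ in } S, \tag{13.33}
$$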

and


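Define the variable $t$ by

$$
t = \frac{r}{r(\theta)}.
$$

Also defining $\bar{u}$ by

$$
u(r, \theta) = \bar{u}(t, \theta),
$$

and dropping the bar in $\bar{u}$, equation (13.32) is equivalent to

$$
\frac{\partial^2 u}{\partial t^2} + \frac{1}{t}f_2(\theta)\frac{\partial u}{\partial t} + \frac{1}{t}f_3(\theta)\frac{\partial^2 u}{\partial\theta\,\partial t} + \frac{f_4(\theta)}{t^2}\frac{\partial^2 u}{\partial\theta^2} = 0, \tag{13.34}
$$

in $S$. Here $f_2(\theta)$, $f_3(\theta)$ and $f_4(\theta)$ are known functions. More specifically, denoting

$$
f_1(\theta) = \frac{-r'(\theta)}{r(\theta)},
$$

we have:

$$
f_2(\theta) = 1 + \frac{f_1'(\theta)}{1 + f_1(\theta)^2}, \qquad
f_3(\theta) = \frac{2f_1(\theta)}{1 + f_1(\theta)^2}, \qquad
f_4(\theta) = \frac{1}{1 + f_1(\theta)^2}.
$$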

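Observe that $t \in [1, 2]$ in $S$. Discretizing in $t$ ($N$ equal pieces, which will generate $N$ lines) we obtain the equation

$$
\frac{u_{n+1} - 2u_n + u_{n-1}}{d^2} + \frac{u_n - u_{n-1}}{d}\,\frac{1}{t}f_2(\theta) + \frac{\partial(u_n - u_{n-1})}{\partial\theta}\,\frac{1}{td}f_3(\theta) + \frac{\partial^2 u_n}{\partial\theta^2}\,\frac{f_4(\theta)}{t^2} = 0, \tag{13.35}
$$

$\forall n \in \{1, ..., N\}$. Here, $u_n(\theta)$ corresponds to the solution on the line $n$. Multiplying (13.35) by $d^2/2$ and isolating $u_n$, we may write

$$
u_n = T(u_{n-1}, u_n, u_{n+1}),
$$

where

$$
T(u_{n-1}, u_n, u_{n+1}) = \frac{u_{n+1} + u_{n-1}}{2} + \frac{d^2}{2}\left(\frac{u_n - u_{n-1}}{d}\,\frac{1}{t}f_2(\theta) + \frac{\partial(u_n - u_{n-1})}{\partial\theta}\,\frac{1}{td}f_3(\theta) + \frac{\partial^2 u_n}{\partial\theta^2}\,\frac{f_4(\theta)}{t^2}\right). \tag{13.36}
$$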

Now we recall a classical definition.

Definition 13.6.1. Let $C$ be a subset of a Banach space $U$ and let $T : C \to C$ be an operator. Then $T$ is said to be a contraction mapping if there exists $0 \le \alpha < 1$ such that

$$
\|T(x_1) - T(x_2)\|_U \le \alpha\|x_1 - x_2\|_U, \quad \forall x_1, x_2 \in C.
$$

Remark 13.6.2. Observe that if $\|T'(x)\|_U \le \alpha < 1$ on a convex set $C$, then $T$ is a contraction mapping, since by the mean value inequality,

$$
\|T(x_1) - T(x_2)\|_U \le \sup_{x \in C}\{\|T'(x)\|\}\,\|x_1 - x_2\|_U, \quad \forall x_1, x_2 \in C.
$$

The next result is the basis of our generalized method of lines. For a proof see [27].

Theorem 13.6.3 (Contraction Mapping Theorem). Let $C$ be a closed subset of a Banach space $U$. Assume $T$ is a contraction mapping on $C$. Then there exists a unique $\tilde{x} \in C$ such that $\tilde{x} = T(\tilde{x})$. Moreover, for an arbitrary $x_0 \in C$, defining the sequence

$$
x_1 = T(x_0) \quad \text{and} \quad x_{k+1} = T(x_k), \quad \forall k \in \mathbb{N},
$$

we have

$$
x_k \to \tilde{x}, \text{ in norm, as } k \to +\infty.
$$
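The iteration in the theorem can be sketched in a few lines. The example below is a scalar illustration only, with $T = \cos$ on $C = [0, 1]$ chosen as an assumption because it maps $C$ into itself and satisfies $|T'(x)| = |\sin x| \le \sin 1 < 1$ there; the function name `fixed_point`, the tolerance and the iteration cap are likewise illustrative.

```python
import math

def fixed_point(T, x0, tol=1e-12, max_iter=10_000):
    """Iterate x_{k+1} = T(x_k) until two successive iterates differ by at most tol."""
    x = x0
    for k in range(max_iter):
        x_new = T(x)
        if abs(x_new - x) <= tol:
            return x_new, k + 1
        x = x_new
    raise RuntimeError("no convergence within max_iter")

# T = cos is a contraction on [0, 1]: |T'(x)| = |sin x| <= sin(1) < 1
x_star, n_iter = fixed_point(math.cos, 1.0)
print(x_star, n_iter)
```

In the generalized method of lines the unknown is a function of $\theta$ rather than a number, but the structure of the iteration is the same.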

From equation (13.36), if $d = 1/N$ is small enough, since $u_{n-1} \simeq u_n$ it is clear that for fixed $u_{n+1}$, $G(u_n) = T(u_{n-1}, u_n, u_{n+1})$ is a contraction mapping, considering that $\|G'(u_n)\| \le \alpha < 1$ for some $0 < \alpha < 1$ in a set that contains the solution of the equation in question.

In particular, for $n = 1$ we have

$$
u_1 = T(0, u_1, u_2).
$$

We may use the Contraction Mapping Theorem to calculate $u_1$ as a function of $u_2$. The procedure would be:

(1) Set $x_0 = u_2$.
(2) Obtain $x_1 = T(0, x_0, u_2)$.
(3) Obtain recursively $x_{k+1} = T(0, x_k, u_2)$.
(4) Finally get $u_1 = \lim_{k \to \infty} x_k = g_1(u_2)$.

We have thus obtained

$$
u_1 = g_1(u_2).
$$

We can repeat the process for $n = 2$, that is, we can solve the equation

$$
u_2 = T(u_1, u_2, u_3),
$$

which from above stands for

$$
u_2 = T(g_1(u_2), u_2, u_3).
$$

The procedure would be:

(1) Set $x_0 = u_3$.
(2) Calculate $x_{k+1} = T(g_1(x_k), x_k, u_3)$.
(3) Obtain $u_2 = \lim_{k \to \infty} x_k = g_2(u_3)$.

We proceed in this fashion until obtaining

$$
u_{N-1} = g_{N-1}(u_N) = g_{N-1}(u_f).
$$

Since $u_f$ is known, we have obtained $u_{N-1}$. We may then calculate

$$
u_{N-2} = g_{N-2}(u_{N-1}),
$$

$$
u_{N-3} = g_{N-3}(u_{N-2}),
$$

and so on, up to finding

$$
u_1 = g_1(u_2).
$$

The problem is then solved.
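As a sanity check of the discretization (13.35), and not of the recursive procedure itself, consider the simplest case where $\Gamma_0$ is a circle and $u_f$ is constant. Then $f_1 = 0$, $f_2 = 1$, $f_3 = 0$, $f_4 = 1$, all $\theta$-derivatives vanish, and the exact solution is $u = u_f \ln t / \ln 2$. The sketch below solves the resulting line equations by a direct linear solve rather than by the contraction iterations described above. It assumes NumPy, takes $t = t_n = 1 + nd$ in (13.35), and the function name `lines_radial` is illustrative.

```python
import numpy as np

def lines_radial(N=10, uf=1.0):
    """Solve (13.35) for theta-independent data on a circular annulus.

    With f2 = 1 and no theta dependence, (13.35) times d^2 reads
        u_{n+1} - 2 u_n + u_{n-1} + (d / t_n) (u_n - u_{n-1}) = 0,
    for n = 1..N-1, with u_0 = 0 and u_N = uf.
    """
    d = 1.0 / N
    t = 1.0 + d * np.arange(1, N)            # t_n for the interior lines
    A = np.zeros((N - 1, N - 1))
    b = np.zeros(N - 1)
    for i in range(N - 1):
        A[i, i] = -2.0 + d / t[i]
        if i > 0:
            A[i, i - 1] = 1.0 - d / t[i]     # u_0 = 0 drops out when i == 0
        if i < N - 2:
            A[i, i + 1] = 1.0
        else:
            b[i] = -uf                       # known boundary value u_N = uf
    return t, np.linalg.solve(A, b)

t, u = lines_radial()
exact = np.log(t) / np.log(2.0)
print(np.max(np.abs(u - exact)))
```

The backward difference used for $\partial u / \partial t$ in (13.35) is first-order accurate, so the discrepancy with the exact solution is expected to shrink roughly in proportion to $d$ as $N$ grows.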

Remark 13.6.4. Here we consider some points concerning the convergence of the method.

In the next lines the norm indicated as in $\|x_k\|$ refers to $W^{2,2}([0, 2\pi])$. In particular, for $n = 1$ from above we have:

$$
u_1 = T(0, u_1, u_2).
$$

We will construct the sequence $\{x_k\}$ (in a slightly different way from above) by

$$
x_1 = u_2/2,
$$

and

$$
x_{k+1} = T(0, x_k, u_2) = u_2/2 + d\,\tilde{T}(x_k),
$$

where the operator $\tilde{T}$ is properly defined from the expression of $T$. Observe that

$$
\|x_{k+2} - x_{k+1}\| \le d\|\tilde{T}\|\,\|x_{k+1} - x_k\|,
$$

and if

$$
0 \le \alpha = d\|\tilde{T}\| < 1,
$$

we have that $\{x_k\}$ is (Cauchy) convergent. Through a standard procedure for this kind of sequence, we may obtain

$$
\|x_{k+1} - x_1\| \le \frac{1}{1 - \alpha}\|x_2 - x_1\|,
$$

so that, denoting $u_1 = \lim_{k \to \infty} x_k$, we get

$$
\|u_1 - u_2/2\| \le \frac{1}{1 - \alpha}\,d\|\tilde{T}\|\,\|u_2/2\|.
$$

Having such an estimate, we may similarly obtain

$$
u_2 \approx u_3 + O(d),
$$

and generically

$$
u_n \approx u_{n+1} + O(d), \quad \forall n \in \{1, ..., N - 1\}.
$$

This last calculation is just to clarify that the procedure of obtaining the relation between consecutive lines through the contraction mapping theorem is well defined. In any case, we postpone a more complete analysis of the approximation error to a future work.


Denoting $x = \theta$, particularly for $N = 10$, truncating the series up to the term in $d^2$, we obtain the following expressions for the lines:

Line 1: $u_1(x) = 0.1\,u_f(x) + 0.034 f_2(x)u_f(x) + 0.034 f_3(x)u_f'(x) + 0.008 f_4(x)u_f''(x)$

Line 2: $u_2(x) = 0.2\,u_f(x) + 0.058 f_2(x)u_f(x) + 0.058 f_3(x)u_f'(x) + 0.015 f_4(x)u_f''(x)$

Line 3: $u_3(x) = 0.3\,u_f(x) + 0.075 f_2(x)u_f(x) + 0.075 f_3(x)u_f'(x) + 0.020 f_4(x)u_f''(x)$

Line 4: $u_4(x) = 0.4\,u_f(x) + 0.083 f_2(x)u_f(x) + 0.083 f_3(x)u_f'(x) + 0.024 f_4(x)u_f''(x)$

Line 5: $u_5(x) = 0.5\,u_f(x) + 0.084 f_2(x)u_f(x) + 0.084 f_3(x)u_f'(x) + 0.026 f_4(x)u_f''(x)$

Line 6: $u_6(x) = 0.6\,u_f(x) + 0.079 f_2(x)u_f(x) + 0.079 f_3(x)u_f'(x) + 0.026 f_4(x)u_f''(x)$

Line 7: $u_7(x) = 0.7\,u_f(x) + 0.068 f_2(x)u_f(x) + 0.068 f_3(x)u_f'(x) + 0.022 f_4(x)u_f''(x)$

Line 8: $u_8(x) = 0.8\,u_f(x) + 0.051 f_2(x)u_f(x) + 0.051 f_3(x)u_f'(x) + 0.018 f_4(x)u_f''(x)$

Line 9: $u_9(x) = 0.9\,u_f(x) + 0.028 f_2(x)u_f(x) + 0.028 f_3(x)u_f'(x) + 0.010 f_4(x)u_f''(x)$


Remark 13.6.5. As mentioned above, we have truncated the series up to the terms in $d^2$. Of course higher order approximations are possible, but it seems that, to obtain better results, the best procedure is just to discretize more (that is, to increase $N$). We can use higher values of $N$, such as 100, 1,000, or even 100,000, still keeping the method viable.

Remark 13.6.6. Finally, it is worth noting that the Generalized Method of Lines may be used to solve a large class of non-linear problems. As an example we solve the Ginzburg-Landau type equation:

$$
-\nabla^2 u + Au^3 - ABu = 0, \text{ in } S \subset \mathbb{R}^2, \tag{13.37}
$$

with the boundary conditions:


Here $A, B > 0$, $\Gamma_0$ denotes the internal boundary of $S$ and $\Gamma_1$ the external one. Again we assume $\Gamma_0$ is smooth and consider the simpler case where

$$
\Gamma_1 = 2\Gamma_0.
$$

Finally, also suppose there exists $r(\theta)$, a smooth function such that

$$
\Gamma_0 = \{(\theta, r(\theta)) \mid 0 \le \theta \le 2\pi\},
$$

with $r(0) = r(2\pi)$. Particularly, for $N = 10$, truncating the series up to the term in $d^2$, we obtain the following expressions for the lines (here $x$ stands for $\theta$, with $f_1(x)$, $f_2(x)$, $f_3(x)$ and $f_4(x)$ as above and $f_5(x) = r^2(x)f_4(x)$):

Line 1: $u_1(x) = 0.1\,u_f(x) + 0.034 f_2(x)u_f(x) + 0.034 f_3(x)u_f'(x) + 0.008 f_4(x)u_f''(x) + 0.017 AB f_5(x)u_f(x) - 0.005 A f_5(x)u_f^3(x)$

Line 2: $u_2(x) = 0.2\,u_f(x) + 0.058 f_2(x)f_5(x)u_f(x) + 0.058 f_3(x)u_f'(x) + 0.015 f_4(x)u_f''(x) + 0.032 AB f_5(x)u_f(x) - 0.009 A f_5(x)u_f^3(x)$

Line 3: $u_3(x) = 0.3\,u_f(x) + 0.075 f_2(x)u_f(x) + 0.075 f_3(x)u_f'(x) + 0.020 f_4(x)u_f''(x) + 0.045 AB f_5(x)u_f(x) - 0.015 A f_5(x)u_f^3(x)$

Line 4: $u_4(x) = 0.4\,u_f(x) + 0.083 f_2(x)u_f(x) + 0.083 f_3(x)u_f'(x) + 0.024 f_4(x)u_f''(x) + 0.056 AB f_5(x)u_f(x) - 0.019 A f_5(x)u_f^3(x)$

Line 5: $u_5(x) = 0.5\,u_f(x) + 0.084 f_2(x)u_f(x) + 0.084 f_3(x)u_f'(x) + 0.026 f_4(x)u_f''(x) + 0.062 AB f_5(x)u_f(x) - 0.023 A f_5(x)u_f^3(x)$

Line 6: $u_6(x) = 0.6\,u_f(x) + 0.079 f_2(x)u_f(x) + 0.079 f_3(x)u_f'(x) + 0.026 f_4(x)u_f''(x) + 0.063 AB f_5(x)u_f(x) - 0.026 A f_5(x)u_f^3(x)$

Line 7: $u_7(x) = 0.7\,u_f(x) + 0.068 f_2(x)u_f(x) + 0.068 f_3(x)u_f'(x) + 0.022 f_4(x)u_f''(x) + 0.059 AB f_5(x)u_f(x) - 0.026 A f_5(x)u_f^3(x)$

Line 8: $u_8(x) = 0.8\,u_f(x) + 0.051 f_2(x)u_f(x) + 0.051 f_3(x)u_f'(x) + 0.018 f_4(x)u_f''(x) + 0.048 AB f_5(x)u_f(x) - 0.023 A f_5(x)u_f^3(x)$

Line 9: $u_9(x) = 0.9\,u_f(x) + 0.045 f_2(x)u_f(x) + 0.045 f_3(x)u_f'(x) + 0.010 f_4(x)u_f''(x) + 0.028 AB f_5(x)u_f(x) - 0.015 A f_5(x)u_f^3(x)$

Remark 13.6.7. Observe that $\Gamma_1$ could be any multiple of $\Gamma_0$, in fact any real number greater than 1 multiplied by $\Gamma_0$. Of particular interest is the case where $\Gamma_1 = k\Gamma_0$ and $k \to +\infty$. In this case we simulate $+\infty$ through a big real number. Finally, the Generalized Method of Lines may be adapted very easily for Cartesian coordinates.

13.7. A Simple Numerical Example

Just to illustrate the possibilities of the generalized method of lines, consider the equation

$$
\nabla^2 u = \nabla^2 \bar{u}, \text{ in } S,
$$

where

$$
S = \{(r, \theta) \mid 1 \le r \le 2,\ 0 \le \theta \le 2\pi\},
$$

$$
u = \bar{u} \text{ on } \Gamma_0 \text{ and } \Gamma_1,
$$

where $\Gamma_0$ and $\Gamma_1$ are the boundaries of the circles centered at the origin with radii 1 and 2, respectively. Finally, in polar coordinates (here $x$ stands for $\theta$),

$$
\bar{u} = r^2\cos(x).
$$

See below the values of the 10 lines obtained by the generalized method of lines ($u[n]$) and the exact values ($\bar{u}[n]$) for the same lines:

Line 1: $u[1] = 1.21683\cos(x)$, $\bar{u}[1] = 1.21\cos(x)$

Line 2: $u[2] = 1.44713\cos(x)$, $\bar{u}[2] = 1.44\cos(x)$

Line 3: $u[3] = 1.69354\cos(x)$, $\bar{u}[3] = 1.69\cos(x)$

Line 4: $u[4] = 1.95811\cos(x)$, $\bar{u}[4] = 1.96\cos(x)$

Line 5: $u[5] = 2.24248\cos(x)$, $\bar{u}[5] = 2.25\cos(x)$

Line 6: $u[6] = 2.54796\cos(x)$, $\bar{u}[6] = 2.56\cos(x)$

Line 7: $u[7] = 2.87563\cos(x)$, $\bar{u}[7] = 2.89\cos(x)$

Line 8: $u[8] = 3.22638\cos(x)$, $\bar{u}[8] = 3.24\cos(x)$

Line 9: $u[9] = 3.60096\cos(x)$, $\bar{u}[9] = 3.61\cos(x)$

Remark 13.7.1. We expect the error to decrease as we increase the number of lines. Specifically for this example, high order approximations are not necessary.
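The size of the error in the values listed for this example can be quantified directly. The short sketch below only post-processes the nine coefficients reported above; it assumes the lines sit at $r = 1.1, 1.2, \ldots, 1.9$, so that the exact coefficient of $\cos(x)$ is $r^2$, which agrees with the exact values listed. The variable names are illustrative.

```python
# Coefficients of cos(x) reported for lines 1..9 of the example above
approx = [1.21683, 1.44713, 1.69354, 1.95811, 2.24248,
          2.54796, 2.87563, 3.22638, 3.60096]

# Exact coefficients r^2 at r = 1.1, ..., 1.9 (assumed line positions)
exact = [(1.0 + 0.1 * n) ** 2 for n in range(1, 10)]

# Relative error on each line
rel_err = [abs(a - e) / e for a, e in zip(approx, exact)]

for n, err in enumerate(rel_err, start=1):
    print(f"line {n}: relative error {err:.4%}")
print("max relative error:", max(rel_err))
```

On these data the relative error stays below one percent on every line.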

13.8. Conclusion

In this article we have developed a dual variational formulation for a Ginzburg-Landau type equation. The dual approach obtained is represented by a concave functional. Also, sufficient conditions of optimality are provided. Finally, in the last sections we introduced the new Generalized Method of Lines. It is important to emphasize that the results developed may be applied to a great variety of situations. An application to the compressible Navier-Stokes system is planned for a future work.

CHAPTER 14

Duality Applied to Conductivity in Composites

14.1. Introduction

For the primal formulation we repeat the statements found in reference [19] (U. Fidalgo, P. Pedregal). Consider a material confined into a bounded domain $\Omega \subset \mathbb{R}^N$, $N > 1$. The medium is obtained by mixing two constituents with different electric permittivity and conductivity. Let $Q_0$ and $Q_1$ denote the two $N \times N$ symmetric matrices of electric permittivity corresponding to each phase. For each phase, we also denote by $L_j$, $j = 0, 1$, the anisotropic $N \times N$ symmetric matrix of conductivity. Let $0 \le t_1 \le 1$ be the proportion of constituent 1 in the mixture. Constituent 1 occupies a space in the physical domain $\Omega$ which we denote by $E \subset \Omega$. Regarding the set $E$ as our design variable, we introduce the characteristic function $\chi : \Omega \to \{0, 1\}$:

$$
\chi(x) =
\begin{cases}
1, & \text{if } x \in E,\\
0, & \text{otherwise}.
\end{cases}
\tag{14.1}
$$

Thus,

$$
\int_E dx = \int_\Omega \chi(x)\,dx = t_1\int_\Omega dx = t_1|\Omega|. \tag{14.2}
$$

The matrix of conductivity corresponding to the material as a whole is $L = \chi L_1 + (1 - \chi)L_0$.

Finally, the electrostatic potential, denoted by $u : \Omega \to \mathbb{R}$, is supposed to satisfy the equation

$$
\operatorname{div}[\chi L_1\nabla u + (1 - \chi)L_0\nabla u] = P(x), \text{ in } \Omega, \tag{14.3}
$$

$$
u = u_0, \text{ on } \partial\Omega, \tag{14.4}
$$

where $P : \Omega \to \mathbb{R}$ is a given source or sink of current (we assume $P \in L^2(\Omega)$).


14.2. The Primal Formulation

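From now on we assume $N = 3$. Consider the problem of minimizing the cost functional

$$
I(\chi, u) = \int_\Omega \left(\frac{\chi}{2}(\nabla u)^T Q_1\nabla u + \frac{(1 - \chi)}{2}(\nabla u)^T Q_0\nabla u\right) dx \tag{14.5}
$$

subject to

$$
\operatorname{div}[\chi L_1\nabla u + (1 - \chi)L_0\nabla u] = P(x), \tag{14.6}
$$

where $u \in U$; here $U = \{u \in W^{1,2}(\Omega) \mid u = u_0 \text{ on } \partial\Omega\}$.

We will rewrite this problem as the minimization of $J : U \times Y \to \mathbb{R}$, where $Y = L^2(\Omega; \mathbb{R}^3)$,

$$
J(u, f) = \inf_{t \in B}\left\{\int_\Omega \left[\frac{t}{2}((\nabla u)^T Q_1\nabla u) + \frac{(1 - t)}{2}((\nabla u)^T Q_0\nabla u) + \mathrm{Ind}_1(u, f)\right] dx\right\} + \mathrm{Ind}_2(u, f),
$$

$$
\mathrm{Ind}_1(\nabla u, f) =
\begin{cases}
0, & \text{if } (tL_1 + (1 - t)L_0)\nabla u - f = 0,\\
+\infty, & \text{otherwise},
\end{cases}
$$

and

$$
\mathrm{Ind}_2(u, f) =
\begin{cases}
0, & \text{if } \operatorname{div}(f) = P \text{ a.e. in } \Omega,\\
+\infty, & \text{otherwise}.
\end{cases}
$$

Here

$$
B = \left\{t \text{ measurable} \;\middle|\; t(x) \in \{0, 1\} \text{ a.e. in } \Omega,\ \int_\Omega t(x)\,dx = t_1|\Omega|\right\}.
$$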

14.3. The Duality Principle

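Our main duality principle is summarized by the following theorem.

Theorem 14.3.1. Let $J : U \times Y \to \mathbb{R}$ be given by:

$$
J(u, f) = G(\nabla u, f) + G_1(f) - F(\nabla u),
$$

where

$$
G(\nabla u, f) = \inf_{t \in B}\left\{\int_\Omega \left[\frac{t}{2}((\nabla u)^T Q_1\nabla u) + \frac{(1 - t)}{2}((\nabla u)^T Q_0\nabla u) + \mathrm{Ind}_1(u, f)\right] dx\right\} + \frac{K}{2}\int_\Omega |\nabla u|^2\,dx, \tag{14.7}
$$

$$
F(\nabla u) = \frac{K}{2}\int_\Omega |\nabla u|^2\,dx
$$

and

$$
G_1(f) = \mathrm{Ind}_2(f).
$$

Thus, we may write

$$
\inf_{(u,f) \in U \times Y} J(u, f) \ge \sup_{(v^*, \lambda) \in C^* \times U}\left\{\inf_{z^* \in Y^*}\{F^*(z^*) - G^*(v^* + z^*, \nabla\lambda)\} + \langle \lambda, P \rangle_{L^2(\Omega)} - \langle u_0, v^* \cdot n \rangle_{L^2(\partial\Omega)}\right\}, \tag{14.8}
$$

where

$$
G^*(v^* + z^*, f^*) = \sup_{(u,f) \in U \times Y}\{\langle v, v^* + z^* \rangle_Y + \langle f, f^* \rangle_Y - G(v, f)\},
$$

that is,

$$
G^*(v^* + z^*, \nabla\lambda) = \sup_{t \in B}\left\{\frac{1}{2}\int_\Omega \eta(v^* + z^*, \nabla\lambda)^T\,Q(t)\,\eta(v^* + z^*, \nabla\lambda)\,dx\right\},
$$

$$
\eta(v^* + z^*, \nabla\lambda) = v^* + z^* + (tL_1 + (1 - t)L_0)^T\nabla\lambda,
$$

and

$$
F^*(z^*) = \frac{1}{2K}\int_\Omega |z^*|^2\,dx.
$$

Finally,

$$
Q(t) = (tQ_1 + (1 - t)Q_0 + KI)^{-1},
$$

where $I$ denotes the identity matrix,

$$
C^* = \{(v^*, \lambda) \in Y^* \times U \mid \operatorname{div}(v^*) = 0 \text{ a.e. in } \Omega,\ \lambda = 0 \text{ on } \partial\Omega\},
$$

and

$$
B = \left\{t \text{ measurable} \;\middle|\; t(x) \in \{0, 1\} \text{ a.e. in } \Omega,\ \int_\Omega t(x)\,dx = t_1|\Omega|\right\}.
$$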

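Proof. Observe that

$$
G^*(v^* + z^*, \nabla\lambda) - \langle \lambda, P \rangle_{L^2(\Omega)} + \langle u_0, v^* \cdot n \rangle_{L^2(\partial\Omega)}
\ge \langle f, \nabla\lambda \rangle_Y + \langle \nabla u, v^* + z^* \rangle_Y - \langle \lambda, P \rangle_{L^2(\Omega)} + \langle u_0, v^* \cdot n \rangle_{L^2(\partial\Omega)} - G(\nabla u, f),
$$

$\forall u \in U$, $v^*, z^* \in Y^*$, $f \in Y$, so that

$$
G^*(v^* + z^*, \nabla\lambda) - \langle \lambda, P \rangle_{L^2(\Omega)} + \langle u_0, v^* \cdot n \rangle_{L^2(\partial\Omega)}
\ge \langle \nabla u, z^* \rangle_Y - \mathrm{Ind}_2(f) - G(\nabla u, f), \tag{14.9}
$$

$\forall u \in U$, $v^* \in C^*$, $z^* \in Y^*$, $f \in Y$, and thus

$$
-F^*(z^*) + G^*(v^* + z^*, \nabla\lambda) - \langle \lambda, P \rangle_{L^2(\Omega)} + \langle u_0, v^* \cdot n \rangle_{L^2(\partial\Omega)}
\ge -F^*(z^*) + \langle \nabla u, z^* \rangle_Y - \mathrm{Ind}_2(f) - G(\nabla u, f), \tag{14.10}
$$

$\forall u \in U$, $v^* \in C^*$, $z^* \in Y^*$, $f \in Y$. Taking the supremum in $z^*$ on both sides of the last inequality, we obtain

$$
\sup_{z^* \in Y^*}\{-F^*(z^*) + G^*(v^* + z^*, \nabla\lambda)\} - \langle \lambda, P \rangle_{L^2(\Omega)} + \langle u_0, v^* \cdot n \rangle_{L^2(\partial\Omega)}
\ge F(\nabla u) - \mathrm{Ind}_2(f) - G(\nabla u, f), \tag{14.11}
$$

$\forall u \in U$, $v^* \in C^*$, $f \in Y$, and hence

$$
\inf_{(u,f) \in U \times Y} J(u, f) \ge \sup_{(v^*, \lambda) \in C^*}\left\{\inf_{z^* \in Y^*}\{F^*(z^*) - G^*(v^* + z^*, \nabla\lambda)\} + \langle \lambda, P \rangle_{L^2(\Omega)} - \langle u_0, v^* \cdot n \rangle_{L^2(\partial\Omega)}\right\}. \tag{14.12}
$$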


Remark 14.3.2. We conjecture that if $K > 0$ is big enough then the duality gap is zero. In fact we are supposing that

$$
G(\nabla u_n, f_n) = G^{**}(\nabla u_n, f_n)
$$

for any $n \in \mathbb{N}$ sufficiently big, where $\{(u_n, f_n)\}$ is a minimizing sequence for the primal problem.

14.4. Conclusion

In this chapter we developed duality for a two-phase non-convex variational problem in conductivity. As we may not have minimizers for this kind of problem, the solution of the dual variational formulation reflects the average behavior of minimizing sequences, as a weak cluster point of such (minimizing) sequences. Finally, it seems that the solution of the dual problem is not difficult to compute.

CHAPTER 15

Duality Applied to the Optimal Design in Elasticity

15.1. Introduction

The first objective of the present article is the establishment of a dual variational formulation for the optimal design, concerning the minimization of internal work, of a plate of variable thickness (in [29] we have reference to similar problems). Such a thickness is denoted by $h(x)$ and allowed to assume values between a minimum $h_0$ and a maximum $h_1$. The total plate volume, assumed to be fixed, is a design constraint denoted by $V$. Now we pass to describe the primal formulation.

Consider a plate whose middle surface is denoted by $S \subset \mathbb{R}^2$, where $S$ is an open bounded connected set with a sufficiently regular boundary denoted by $\Gamma$. As mentioned above, the design variable $h(x)$ is such that $h_0 \leq h(x) \leq h_1$, where $x = (x_1, x_2) \in S \subset \mathbb{R}^2$. The field of normal displacements to $S$, due to an external load $P \in L^2(S)$, is denoted by $w : S \to \mathbb{R}$.

The optimization problem is the minimization of $J : U \to \mathbb{R}$, where

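$$J(w) = \inf_{t \in C} \left\{ \int_S \frac{H_{\alpha\beta\lambda\mu}(t)}{2}\, w_{,\alpha\beta}\, w_{,\lambda\mu}\, dx \right\}, \tag{15.1}$$

subject to

$$(H_{\alpha\beta\lambda\mu}(t)\, w_{,\lambda\mu})_{,\alpha\beta} = P, \text{ in } S \tag{15.2}$$

and

$$\int_S (t h_1 + (1 - t) h_0)\, dx = t_1 h_1 |S| = V, \tag{15.3}$$

where

$$C = \{ t \text{ measurable} \mid t(x) \in [0, 1], \text{ a.e. in } S \},$$

$0 < t_1 < 1$ and $|S|$ denotes the Lebesgue measure of $S$. Moreover,

$$U = W_0^{2,2}(S) = \left\{ w \in W^{2,2}(S) \;\middle|\; w = \frac{\partial w}{\partial n} = 0 \text{ on } \Gamma \right\}. \tag{15.4}$$

Finally,

$$H_{\alpha\beta\lambda\mu}(t) = (t h_1 + (1 - t) h_0)^3 A_{\alpha\beta\lambda\mu} \tag{15.5}$$

where $h(x) = t(x) h_1 + (1 - t(x)) h_0$ and $A_{\alpha\beta\lambda\mu}$ is a positive definite matrix related to Hooke's Law. Observe that $0 \leq t(x) \leq 1$, a.e. in $S$.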

15.2. The Main Duality Principles

The next theorem is also relevant for the subsequent results. A similar result may be found in [6]. We develop a proof again for the sake of completeness.

Theorem 15.2.1. Let $U$ be a Banach space. Consider the l.s.c. functionals $(G \circ \Lambda) : U \to \mathbb{R}$ and $(F \circ \Lambda_1) : U \to \mathbb{R}$, such that $J : U \to \mathbb{R}$ is bounded below, where

$$J(u) = G^{**}(\Lambda u) - F(\Lambda_1 u) - \langle u, f\rangle_U.$$

Here $\Lambda : U \to Y$ and $\Lambda_1 : U \to Y_1$ are continuous linear operators such that there exists a continuous linear operator $L : Y_1 \to Y$ satisfying

$$\Lambda u = L(\Lambda_1 u), \quad \forall u \in U.$$

Under these hypotheses, we have

$$\inf_{u \in U} \{J(u)\} \leq \inf_{z^* \in Y^*} \left\{ \sup_{v^* \in A^*} \{F^*(L^* z^*) - G^*(v^* + z^*)\} \right\}, \tag{15.6}$$

where

$$A^* = \{ v^* \in Y^* \mid \Lambda^* v^* = f \}.$$

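Proof. Defining $\beta = \inf_{u \in U} \{J(u)\}$, we have

$$-F(\Lambda_1 u) \geq -G^{**}(\Lambda u) + \langle u, f\rangle_U + \beta, \quad \forall u \in U.$$

Hence

$$\langle \Lambda_1 u, L^* z^*\rangle_{Y_1} - F(\Lambda_1 u) \geq \langle \Lambda u, z^*\rangle_Y - G^{**}(\Lambda u) + \langle u, f\rangle_U + \beta,$$

$\forall u \in U$, $z^* \in Y^*$. Therefore

$$\sup_{v_1 \in Y_1} \{\langle v_1, L^* z^*\rangle_{Y_1} - F(v_1)\} \geq \sup_{u \in U} \{\langle \Lambda u, z^*\rangle_Y - G^{**}(\Lambda u) + \langle u, f\rangle_U\} + \beta,$$

so that, from this and Theorem 8.2.5, we may write

$$F^*(L^* z^*) \geq \inf_{v^* \in A^*} \{G^*(v^* + z^*)\} + \beta.$$

That is,

$$\inf_{z^* \in Y^*} \left\{ \sup_{v^* \in A^*} \{F^*(L^* z^*) - G^*(v^* + z^*)\} \right\} \geq \beta.$$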

Our main theoretical result is given by the following theorem.

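Theorem 15.2.2. Let $U$ be a reflexive Banach space. Consider $(G \circ \Lambda) : U \to \mathbb{R}$ and $(F \circ \Lambda_1) : U \to \mathbb{R}$ l.s.c. functionals such that $J : U \to \mathbb{R}$ defined as

$$J(u) = (G \circ \Lambda)(u) - (F \circ \Lambda_1)(u) - \langle u, f\rangle_U$$

is bounded below. Here $\Lambda : U \to Y$ and $\Lambda_1 : U \to Y_1$ are continuous linear operators whose adjoint operators are denoted by $\Lambda^* : Y^* \to U^*$ and $\Lambda_1^* : Y_1^* \to U^*$, respectively. Also we suppose the existence of a continuous linear injective operator $L : Y_1 \to Y$ such that $L^* : Y^* \to Y_1^*$ is onto and

$$\Lambda(u) = L(\Lambda_1(u)), \quad \forall u \in U.$$

Under such assumptions, we have

$$\inf_{u \in U} \{J(u)\} \geq \sup_{v^* \in A^*} \left\{ \inf_{z^* \in Y^*} \{F^*(L^* z^*) - G^*(v^* + z^*)\} \right\},$$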

where

$$A^* = \{ v^* \in Y^* \mid \Lambda^* v^* = f \}.$$

Now in addition assume that $(F \circ \Lambda_1) : U \to \mathbb{R}$ is convex and Gateaux differentiable, $G^* : Y^* \to \mathbb{R}$ is Gateaux differentiable and strongly continuous, and $(G \circ \Lambda) : U \to \mathbb{R}$ and $(F^* \circ L^*) : Y^* \to \mathbb{R}$ are Gateaux differentiable functionals. Also assume that for each $v^* \in Y^*$, the functional $J^* : Y^* \times Y^* \to \mathbb{R}$, where

$$J^*(v^*, z^*) = F^*(L^* z^*) - G^*(v^* + z^*),$$

is such that

$$J^*(v^*, z^*) \to +\infty, \text{ as } \|z^*\|_{Y^*} \to \infty \text{ or } \|L^* z^*\|_{Y_1^*} \to \infty.$$

Furthermore, suppose that if $\{z_n^*\} \subset Y^*$ is such that $\|L^* z_n^*\|_{Y_1^*} < K$, $\forall n \in \mathbb{N}$, for some $K > 0$, then there exists $z^* \in Y^*$ such that, for a not relabeled subsequence, we have

$$L^* z_n^* \rightharpoonup L^* z^*, \text{ weakly in } Y_1^*,$$

and

$$z_n^* \to z^*, \text{ strongly in } Y^*.$$

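Under such assumptions, for each $v^* \in Y^*$ we may find $z^* \in Y^*$, here denoted by $z^*(v^*)$, such that

$$\alpha(v^*) \equiv \inf_{z^* \in Y^*} \{J^*(v^*, z^*)\} = J^*(v^*, z^*(v^*)).$$

Observe that $z^*(v^*)$ is obtained through the extremal equation

$$\frac{\partial F^*(L^* z^*)}{\partial z^*} - \frac{\partial G^*(v^* + z^*)}{\partial z^*} = \theta.$$

Moreover, assume the second variation related to $z^* \in Y^*$ of $J^*(v^*, z^*)$ is positive definite at $z^*(v^*)$, in the sense that

$$\delta^2_{z^*}(J^*(v^*, z^*(v^*)), \varphi) > 0, \quad \forall \varphi \in C_0^\infty, \ \varphi \neq \theta.$$

Finally, suppose that there exists $(u_0, v_0^*) \in U \times Y^*$ such that

$$\delta \left\{ \langle u_0, \Lambda^* v_0^* - f\rangle_U + F^*(L^* z^*(v_0^*)) - G^*(v_0^* + z^*(v_0^*)) \right\} = \theta. \tag{15.7}$$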

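Under such additional assumptions, we have

$$
\begin{aligned}
\inf_{u \in U} \{G^{**}(\Lambda u) - F(\Lambda_1 u) - \langle u, f\rangle_U\}
&= G^{**}(\Lambda u_0) - F(\Lambda_1 u_0) - \langle u_0, f\rangle_U \\
&= F^*(L^* z_0^*) - G^*(v_0^* + z_0^*) \\
&= \sup_{v^* \in A^*} \left\{ \inf_{z^* \in Y^*} \{F^*(L^* z^*) - G^*(v^* + z^*)\} \right\}.
\end{aligned}
\tag{15.8}
$$

Furthermore, the following chain of equalities is satisfied:

$$
\begin{aligned}
\inf_{z^* \in Y^*} \left\{ \sup_{v^* \in A^*} \{F^*(L^* z^*) - G^*(v^* + z^*)\} \right\}
&= \inf_{u \in U} \{G^{**}(\Lambda u) - F(\Lambda_1 u) - \langle u, f\rangle_U\} \\
&= \sup_{v^* \in A^*} \left\{ \inf_{z^* \in Y^*} \{F^*(L^* z^*) - G^*(v^* + z^*)\} \right\}.
\end{aligned}
$$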

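Proof. Observe that

$$G^*(v^* + z^*) \geq \langle \Lambda u, v^*\rangle_Y + \langle \Lambda u, z^*\rangle_Y - G(\Lambda u),$$

$\forall u \in U$, $z^* \in Y^*$, $v^* \in Y^*$, that is,

$$-F^*(L^* z^*) + G^*(v^* + z^*) \geq \langle u, f\rangle_U - F^*(L^* z^*) + \langle \Lambda_1 u, L^* z^*\rangle_{Y_1} - G(\Lambda u),$$

$\forall u \in U$, $z^* \in Y^*$, $v^* \in A^*$, so that

$$
\begin{aligned}
& \sup_{z^* \in Y^*} \{-F^*(L^* z^*) + G^*(v^* + z^*)\} \\
&\quad \geq \sup_{z^* \in Y^*} \{\langle u, f\rangle_U - F^*(L^* z^*) + \langle \Lambda_1 u, L^* z^*\rangle_{Y_1} - G(\Lambda u)\},
\end{aligned}
\tag{15.9}
$$

$\forall v^* \in A^*$, $u \in U$. Moreover, since $L^* : Y^* \to Y_1^*$ is onto,

$$\sup_{z^* \in Y^*} \{-F^*(L^* z^*) + \langle \Lambda_1 u, L^* z^*\rangle_{Y_1}\} = F(\Lambda_1 u).$$

From this and (15.9), we get

$$G(\Lambda u) - F(\Lambda_1 u) - \langle u, f\rangle_U \geq \inf_{z^* \in Y^*} \{F^*(L^* z^*) - G^*(v^* + z^*)\}, \quad \forall v^* \in A^*, \ u \in U,$$

which means

$$\inf_{u \in U} \{J(u)\} \geq \sup_{v^* \in A^*} \left\{ \inf_{z^* \in Y^*} \{F^*(L^* z^*) - G^*(v^* + z^*)\} \right\},$$

where

$$A^* = \{ v^* \in Y^* \mid \Lambda^* v^* = f \}.$$

Similarly we may obtain

$$
\inf_{u \in U} \{G^{**}(\Lambda u) - F(\Lambda_1 u) - \langle u, f\rangle_U\}
\geq \sup_{v^* \in A^*} \left\{ \inf_{z^* \in Y^*} \{F^*(L^* z^*) - G^*(v^* + z^*)\} \right\}.
\tag{15.10}
$$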

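Fix $v^* \in Y^*$. Recall that

$$\alpha(v^*) \equiv \inf_{z^* \in Y^*} \{J^*(v^*, z^*)\}.$$

From the coercivity hypothesis, a minimizing sequence $\{z_n^*\} \subset Y^*$ is such that

$$\|L^* z_n^*\|_{Y_1^*} < K_1, \quad \forall n \in \mathbb{N},$$

for some $K_1 > 0$. Also from the hypotheses, there exists $z^* \in Y^*$ such that, up to a not relabeled subsequence,

$$L^* z_n^* \rightharpoonup L^* z^*, \text{ weakly in } Y_1^*,$$

and

$$z_n^* \to z^*, \text{ strongly in } Y^*.$$

Thus

$$\liminf_{n \to \infty} \{F^*(L^* z_n^*)\} \geq F^*(L^* z^*),$$

and, since $G^* : Y^* \to \mathbb{R}$ is strongly continuous,

$$G^*(v^* + z_n^*) \to G^*(v^* + z^*), \text{ as } n \to \infty.$$

Therefore,

$$\alpha(v^*) = \liminf_{n \to \infty} \{F^*(L^* z_n^*) - G^*(v^* + z_n^*)\} \geq F^*(L^* z^*) - G^*(v^* + z^*).$$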

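In particular, denoting $z_0^* = z^*(v_0^*)$, we recall that $z_0^*$ is obtained through the extremal equation

$$\frac{\partial F^*(L^* z_0^*)}{\partial z^*} - \frac{\partial G^*(v_0^* + z_0^*)}{\partial z^*} = \theta. \tag{15.11}$$

Taking the variation of (15.11) in $v^*$, we obtain

$$\frac{\partial^2 F^*(L^* z_0^*)}{\partial (z^*)^2} \frac{\partial z_0^*}{\partial v^*} - \frac{\partial^2 G^*(v_0^* + z_0^*)}{\partial (z^*)^2} \frac{\partial z_0^*}{\partial v^*} - \frac{\partial^2 G^*(v_0^* + z_0^*)}{\partial z^* \partial v^*} = \theta,$$

and considering the hypothesis about the second variation in $z^*$ of $J^*(v_0^*, z^*)$, we may infer that

$$\frac{\partial z_0^*}{\partial v^*}$$

is well defined. From the extremal equation (15.7)

$$\Lambda u_0 + \left( \frac{\partial F^*(L^* z_0^*)}{\partial z^*} - \frac{\partial G^*(v_0^* + z_0^*)}{\partial z^*} \right) \frac{\partial z_0^*}{\partial v^*} - \frac{\partial G^*(v_0^* + z_0^*)}{\partial v^*} = \theta,$$

and (15.11), we obtain

$$\Lambda u_0 - \frac{\partial G^*(v_0^* + z_0^*)}{\partial v^*} = \theta, \tag{15.12}$$

so that

$$G^*(v_0^* + z_0^*) = \langle \Lambda u_0, v_0^* + z_0^*\rangle_Y - G^{**}(\Lambda u_0). \tag{15.13}$$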

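From (15.11) and (15.12), we get

$$L\left( \frac{\partial F^*(L^* z_0^*)}{\partial w^*} \right) - \Lambda u_0 = \theta,$$

and thus, since $L : Y_1 \to Y$ is injective,

$$\frac{\partial F^*(L^* z_0^*)}{\partial w^*} - \Lambda_1 u_0 = \theta,$$

where $w^* = L^* z^*$. Hence

$$F^*(L^* z_0^*) = \langle \Lambda_1 u_0, L^* z_0^*\rangle_{Y_1} - F(\Lambda_1 u_0). \tag{15.14}$$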


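From the Gateaux variation related to $u \in U$ in (15.7), we obtain

$$\Lambda^* v_0^* = f. \tag{15.15}$$

From (15.13), (15.14) and (15.15), we get

$$F^*(L^* z_0^*) - G^*(v_0^* + z_0^*) = G^{**}(\Lambda u_0) - F(\Lambda_1 u_0) - \langle u_0, f\rangle_U.$$

From this and (15.10) we have that (15.8) holds, that is,

$$
\begin{aligned}
\inf_{u \in U} \{G^{**}(\Lambda u) - F(\Lambda_1 u) - \langle u, f\rangle_U\}
&= G^{**}(\Lambda u_0) - F(\Lambda_1 u_0) - \langle u_0, f\rangle_U \\
&= F^*(L^* z_0^*) - G^*(v_0^* + z_0^*) \\
&= \sup_{v^* \in A^*} \left\{ \inf_{z^* \in Y^*} \{F^*(L^* z^*) - G^*(v^* + z^*)\} \right\}.
\end{aligned}
$$

Finally, observe that (15.12) and (15.15) mean

$$-G^*(v_0^* + z_0^*) = \sup_{v^* \in A^*} \{-G^*(v^* + z_0^*)\},$$

and thus

$$F^*(L^* z_0^*) + \sup_{v^* \in A^*} \{-G^*(v^* + z_0^*)\} = F^*(L^* z_0^*) - G^*(v_0^* + z_0^*).$$

From this, (15.6) and (15.10), we get

$$
\begin{aligned}
\inf_{z^* \in Y^*} \left\{ \sup_{v^* \in A^*} \{F^*(L^* z^*) - G^*(v^* + z^*)\} \right\}
&= \inf_{u \in U} \{G^{**}(\Lambda u) - F(\Lambda_1 u) - \langle u, f\rangle_U\} \\
&= \sup_{v^* \in A^*} \left\{ \inf_{z^* \in Y^*} \{F^*(L^* z^*) - G^*(v^* + z^*)\} \right\}.
\end{aligned}
$$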


15.3. The First Applied Duality Principle

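Now we rewrite the primal formulation, so that we express $J : U \times Y \to \mathbb{R}$ as

$$J(w, f) = \inf_{t \in B} \left\{ \int_S \left( \frac{H_{\alpha\beta\lambda\mu}(t)}{2}\, w_{,\alpha\beta}\, w_{,\lambda\mu} + Ind_1(\Lambda w, f) \right) dx \right\} + Ind_2(w, f), \tag{15.16}$$

where

$$Ind_1(\Lambda w, f) = \begin{cases} 0, & \text{if } f_{\alpha\beta} = H_{\alpha\beta\lambda\mu}(t)\, w_{,\lambda\mu}, \\ +\infty, & \text{otherwise,} \end{cases} \tag{15.17}$$

$$Ind_2(w, f) = \begin{cases} 0, & \text{if } f_{\alpha\beta,\alpha\beta} = P, \text{ a.e. in } S, \\ +\infty, & \text{otherwise,} \end{cases} \tag{15.18}$$

$\Lambda : U \to Y$ is given by

$$\Lambda w = \{w_{,\alpha\beta}\},$$

and

$$B = \left\{ t \text{ measurable} \;\middle|\; t(x) \in [0, 1] \text{ a.e. in } S, \ \int_S (t h_1 + (1 - t) h_0)\, dx = t_1 h_1 |S| = V \right\}, \tag{15.19}$$

and also $Y = L^2(S; \mathbb{R}^4)$. Observe that

$$\inf_{(w,f) \in U \times Y} \{J(w, f)\} = \inf_{t \in B} \left\{ \inf_{(w,f) \in U \times Y} \{G(\Lambda w, f, t) + F(w, f)\} \right\}, \tag{15.20}$$

where

$$G(\Lambda w, f, t) = \int_S \left( \frac{H_{\alpha\beta\lambda\mu}(t)}{2}\, w_{,\alpha\beta}\, w_{,\lambda\mu} + Ind_1(\Lambda w, f) \right) dx,$$

and

$$F(w, f) = Ind_2(w, f).$$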

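From Theorem 8.2.5, we may write

$$
\inf_{(w,f) \in U \times Y} \{G(\Lambda w, f, t) + F(w, f)\}
= \sup_{(v^*, f^*) \in Y^* \times Y^*} \{-G^*(v^*, f^*, t) - F^*(-\Lambda^* v^*, -f^*)\},
\tag{15.21}
$$

where

$$G^*(v^*, f^*, t) = \sup_{(v,f) \in Y \times Y} \{\langle v_{\alpha\beta}, v^*_{\alpha\beta}\rangle_{L^2(S)} + \langle f_{\alpha\beta}, f^*_{\alpha\beta}\rangle_{L^2(S)} - G(v, f, t)\},$$

or

$$
G^*(v^*, f^*, t) = \sup_{(v,f) \in Y \times Y} \left\{ \langle v_{\alpha\beta}, v^*_{\alpha\beta}\rangle_{L^2(S)} + \langle f_{\alpha\beta}, f^*_{\alpha\beta}\rangle_{L^2(S)}
- \int_S \left( \frac{H_{\alpha\beta\lambda\mu}(t)}{2}\, v_{\alpha\beta} v_{\lambda\mu} + Ind_1(v, f) \right) dx \right\},
$$

so that

$$
G^*(v^*, f^*, t) = \sup_{(v,f) \in Y \times Y} \left\{ \langle v_{\alpha\beta}, v^*_{\alpha\beta}\rangle_{L^2(S)} + \langle H_{\alpha\beta\lambda\mu}(t) v_{\lambda\mu}, f^*_{\alpha\beta}\rangle_{L^2(S)}
- \int_S \frac{H_{\alpha\beta\lambda\mu}(t)}{2}\, v_{\alpha\beta} v_{\lambda\mu}\, dx \right\}.
\tag{15.22}
$$

Thus we may write

$$
G^*(v^*, f^*, t) = \frac{1}{2} \int_S H_{\alpha\beta\lambda\mu}(t)\, f^*_{\alpha\beta} f^*_{\lambda\mu}\, dx + \langle v^*_{\alpha\beta}, f^*_{\alpha\beta}\rangle_{L^2(S)}
+ \frac{1}{2} \int_S \bar{H}_{\alpha\beta\lambda\mu}(t)\, v^*_{\alpha\beta} v^*_{\lambda\mu}\, dx,
$$

where

$$\{\bar{H}_{\alpha\beta\lambda\mu}(t)\} = \{H_{\alpha\beta\lambda\mu}(t)\}^{-1}.$$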

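On the other hand

$$
F^*(-\Lambda^* v^*, -f^*) = \sup_{(w,f) \in U \times Y} \{-\langle w_{,\alpha\beta}, v^*_{\alpha\beta}\rangle_{L^2(S)} - \langle f_{\alpha\beta}, f^*_{\alpha\beta}\rangle_{L^2(S)} - F(w, f)\},
$$

or

$$
F^*(-\Lambda^* v^*, -f^*) = \sup_{(w,f) \in U \times Y} \{-\langle w_{,\alpha\beta}, v^*_{\alpha\beta}\rangle_{L^2(S)} - \langle f_{\alpha\beta}, f^*_{\alpha\beta}\rangle_{L^2(S)} - Ind_2(w, f)\},
$$

that is,

$$
F^*(-\Lambda^* v^*, -f^*) = \sup_{(w,f) \in U \times Y} \{-\langle w_{,\alpha\beta}, v^*_{\alpha\beta}\rangle_{L^2(S)} - \langle f_{\alpha\beta}, f^*_{\alpha\beta}\rangle_{L^2(S)} + \langle w, f_{\alpha\beta,\alpha\beta} - P\rangle_{L^2(S)}\},
$$

where $w$ is an appropriate Lagrange multiplier, so that

$$F^*(-\Lambda^* v^*, -f^*) = \begin{cases} -\langle w, P\rangle_{L^2(S)}, & \text{if } (v^*, f^*) \in B^*, \\ +\infty, & \text{otherwise,} \end{cases} \tag{15.23}$$

where

$$B^* = \{ (v^*, f^*) \in Y^* \times Y^* \mid f^*_{\alpha\beta} = w_{,\alpha\beta}, \ v^*_{\alpha\beta,\alpha\beta} = 0, \text{ in } S \}.$$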


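Hence, the duality principle indicated in (15.21) may be expressed as

$$
\begin{aligned}
& \inf_{(w,f) \in U \times Y} \{G(\Lambda w, f, t) + F(w, f)\} \\
&\quad = \sup_{(v^*, w) \in B^*} \left\{ -\frac{1}{2} \int_S H_{\alpha\beta\lambda\mu}(t)\, w_{,\alpha\beta} w_{,\lambda\mu}\, dx - \langle v^*_{\alpha\beta}, w_{,\alpha\beta}\rangle_{L^2(S)}
- \frac{1}{2} \int_S \bar{H}_{\alpha\beta\lambda\mu}(t)\, v^*_{\alpha\beta} v^*_{\lambda\mu}\, dx + \langle w, P\rangle_{L^2(S)} \right\}.
\end{aligned}
$$

We may evaluate the last supremum and obtain $v^* = \theta$, and therefore,

$$
\inf_{(w,f) \in U \times Y} \{G(\Lambda w, f, t) + F(w, f)\}
= \sup_{w \in U} \left\{ -\frac{1}{2} \int_S H_{\alpha\beta\lambda\mu}(t)\, w_{,\alpha\beta} w_{,\lambda\mu}\, dx + \langle w, P\rangle_{L^2(S)} \right\}.
$$

However, also from Theorem 8.2.5, we may conclude that

$$
\sup_{w \in U} \left\{ -\frac{1}{2} \int_S H_{\alpha\beta\lambda\mu}(t)\, w_{,\alpha\beta} w_{,\lambda\mu}\, dx + \langle w, P\rangle_{L^2(S)} \right\}
= \inf_{\{M_{\alpha\beta}\} \in D^*} \left\{ \frac{1}{2} \int_S \bar{H}_{\alpha\beta\lambda\mu}(t)\, M_{\alpha\beta} M_{\lambda\mu}\, dS \right\},
$$

where

$$D^* = \{ \{M_{\alpha\beta}\} \in Y^* \mid M_{\alpha\beta,\alpha\beta} + P = 0, \text{ in } S \}.$$

And thus, the final format of the concerned duality principle would be

$$
\inf_{(w,f) \in U \times Y} \{J(w, f)\} = \inf_{(t, \{M_{\alpha\beta}\}) \in B \times D^*} \left\{ \frac{1}{2} \int_S \bar{H}_{\alpha\beta\lambda\mu}(t)\, M_{\alpha\beta} M_{\lambda\mu}\, dS \right\}.
$$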

15.4. A Concave Dual Formulation

Our next result is a concave dual variational formulation.

Theorem 15.4.1. Define J : U × Y → R, as

J(w, f) = G(Λw) +G1(w, f) − F (Λw),


where

G(Λw, f) = inft∈B

∫

S

Hαβλµ(t)

2w,αβw,λµ + Ind1(Λw, f)

dx

+K

2

∫

S

w,αβw,αβ dx, (15.24)

G1(w, f) = Ind2(w, f),

Ind1(Λw, f) =

0, if fαβ = Hαβλµ(t)w,λµ,+∞, otherwise,

(15.25)

Ind2(w, f) =

0, if fαβ,αβ = P, a.e in S,+∞, otherwise,

(15.26)

Λ : U → Y is given by

Λw = w,αβ,

F (Λw) =K

2

∫

S

w,αβw,αβ dx

and

B = t measurable | t(x) ∈ [0, 1] a.e. in S,∫

S

(th1 + (1 − t)h0) dx = t1h1|S|

.

Under such assumptions we have,

inf(w,f)∈U×Y

J(w, f)

≥ sup(w×v∗)∈U×B∗

infz∗∈Y ∗

F ∗(z∗) −G∗(z∗ + v∗,Λw) + 〈w, P 〉L2(S)

,

where

F ∗(z∗) =1

2K

∫

S

z∗αβz∗αβ dx,

and

G∗(z∗, f ∗) = sup(v,f)∈Y ×Y

〈v, z∗〉Y + 〈f, f ∗〉Y −G(v, f),


or more explicitly,

G∗(z∗, f ∗) = supt∈B

1

2

∫

S

HKαβλµ(t)z∗αβz

∗λµ dx

+1

2

∫

S

HKαβλµ(t)Hλµργ(t)f

∗ργz

∗αβ dx (15.27)

+1

2

∫

S

Hαβλµ(t)HKλµργ(t)Hργθη(t)f

∗θηf

∗αβ dx

.

Here

Hαβλµ(t) = Hαβλµ(t)−1 and HKαβλµ(t) = Hαβλµ(t) +KI−1,

and, I denotes the identity matrix. Finally,

B∗ = v∗ ∈ Y ∗ | v∗αβ,αβ = 0 in S.

Proof. Observe that

G∗1(−Λ∗v∗,−f ∗) = sup

(w,f)∈U×Y

〈Λw,−v∗〉Y +〈f,−f ∗〉Y −Ind2(w, f),

or

G∗1(−Λ∗v∗,−f ∗)

= sup(w,f)∈U×Y

〈Λw,−v∗〉Y + 〈f,−f ∗〉Y + 〈w, fαβ,αβ − P 〉Y .

Thus

G∗1(−Λ∗v∗,−f ∗) =

−〈w, P 〉L2(S), if (v∗, f ∗) ∈ B∗1 ,

+∞, otherwise,

where

B∗1 = (v∗, f ∗) ∈ Y ∗ × Y ∗ | f ∗

αβ = w,αβ, v∗αβ,αβ = 0, in S.

On the other hand

G∗(z∗ + v∗,Λw) − 〈w,P 〉L2(S) = 〈Λw, z∗〉Y + 〈Λw, v∗〉Y+ 〈f,Λw〉Y −G(Λw, f) − 〈w, P 〉L2(S)

≥〈Λw, z∗〉Y −G(Λw, f)−G1(w, f),

∀(z∗, v∗, w) ∈ Y ∗ ×B∗ × U, w ∈ U, f ∈ Y. From last inequality weobtain

− F ∗(z∗) +G∗(z∗ + v∗,Λw) − 〈w, P 〉L2(S)

≥ −F ∗(z∗) + 〈Λw, z∗〉Y −G(Λw, f) −G1(w, f),


∀(z∗, v∗, w) ∈ Y ∗ × B∗ × U, w ∈ U, f ∈ Y. Taking the supremumin z∗ in both sides of last inequality we obtain:

supz∗∈Y ∗

−F ∗(z∗) +G∗(z∗ + v∗,Λw) − 〈w, P 〉L2(S)

≥ F (Λw) −G(Λw, f) −G1(w, f),

∀w ∈ U, w ∈ U, f ∈ Y, v∗ ∈ B∗, and finally, we may write

inf(w,f)∈U×Y

J(w, f) ≥ sup(w,v∗)∈U×B∗

infz∗∈Y ∗

F ∗(z∗)

−G∗(z∗ + v∗,Λw) + 〈w, P 〉L2(S)

. (15.28)

Remark 15.4.2. The last results are similar to those of Theorem 15.2.2; however, the latter problem has a format slightly different from the one of such a theorem, so that we preferred to develop the dual formulation in detail. We conjecture that if $K > 0$ is big enough, so that a minimizing sequence for the primal problem is inside the region of convexity of $G(\Lambda w, f) + G_1(w, f)$, then the inequality indicated in (15.28) is in fact an equality. Finally, it is worth noting that the dual problem seems to be simple to compute.
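As a small numerical illustration (not part of the original development), the closed form $F^*(z^*) = \frac{1}{2K}\int_S z^*_{\alpha\beta} z^*_{\alpha\beta}\, dx$ stated in Theorem 15.4.1 reduces, pointwise and componentwise, to the scalar conjugate of $v \mapsto \frac{K}{2} v^2$, namely $z^2/(2K)$. The Python sketch below assumes NumPy; the helper name `conjugate_on_grid`, the value of `K` and the grid are hypothetical choices made only for this check. It approximates the supremum in the definition of the Legendre-Fenchel conjugate by a maximum over a fine grid.

```python
import numpy as np

def conjugate_on_grid(F, z, v_grid):
    """Approximate F*(z) = sup_v { v*z - F(v) } by a maximum over v_grid."""
    return float(np.max(v_grid * z - F(v_grid)))

K = 5.0                                  # illustrative value of the constant K > 0
F = lambda v: 0.5 * K * v ** 2           # scalar model of F(v) = (K/2) v^2
v_grid = np.linspace(-10.0, 10.0, 200001)

for z in (-3.0, 0.5, 2.0):
    approx = conjugate_on_grid(F, z, v_grid)
    exact = z ** 2 / (2.0 * K)           # closed form used in the theorem
    print(z, approx, exact)
```

The grid maximum agrees with $z^2/(2K)$ up to the grid resolution, provided the maximizer $v = z/K$ lies inside the grid.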

15.5. Duality for a Two-Phase Problem in Elasticity

In this section we develop duality for a two-phase problem in elasticity. Consider $V \subset \mathbb{R}^3$, an open connected bounded set with a sufficiently regular boundary denoted by $\partial V = S_0 \cup S_1$, where $S_0 \cap S_1 = \emptyset$. Here $V$ stands for the volume of an elastic solid under the action of a load $P \in L^2(V; \mathbb{R}^3)$. The field of displacements is denoted by $u = (u_1, u_2, u_3) \in U$, where

$$U = \{ u \in W^{1,2}(V; \mathbb{R}^3) \mid u = (0, 0, 0) \text{ on } S_0 \}. \tag{15.29}$$

The strain tensor, denoted by $e = \{e_{ij}\}$, is defined as

$$e_{ij}(u) = \frac{1}{2}(u_{i,j} + u_{j,i}). \tag{15.30}$$

The solid $V$ is supposed to be composed by mixing two constituents, namely 1 and 0, with elasticity matrices related to Hooke's Law denoted by $H^1_{ijkl}$ and $H^0_{ijkl}$, respectively. The part occupied by constituent 1 is denoted by $E$ and represented by the characteristic function $\chi : V \to \{0, 1\}$, where

$$\chi(x) = \begin{cases} 1, & \text{if } x \in E, \\ 0, & \text{otherwise.} \end{cases} \tag{15.31}$$
Now we define the optimization problem of minimizing $J(u, \chi)$, where

$$J(u, \chi) = \frac{1}{2} \int_V \left( \chi H^1_{ijkl}\, e_{ij}(u) e_{kl}(u) + (1 - \chi) H^0_{ijkl}\, e_{ij}(u) e_{kl}(u) \right) dx, \tag{15.32}$$

subject to

$$\left( \chi H^1_{ijkl}\, e_{kl}(u) + (1 - \chi) H^0_{ijkl}\, e_{kl}(u) \right)_{,j} + P_i = 0, \text{ in } V, \tag{15.33}$$

$u \in U$ and

$$\int_V \chi\, dx \leq t_1 |V|, \tag{15.34}$$

where $0 < t_1 < 1$ and $|V|$ denotes the Lebesgue measure of $V$.

We rewrite the primal formulation, now denoting it by $J : U \times Y \to \mathbb{R}$, as

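$$J(u, f) = \inf_{t \in B} \left\{ \int_V \left( \frac{H_{ijkl}(t)}{2}\, e_{ij}(u) e_{kl}(u) + Ind_1(e_{ij}(u), f) \right) dx \right\} + Ind_2(u, f), \tag{15.35}$$

where

$$H_{ijkl}(t) = t H^1_{ijkl} + (1 - t) H^0_{ijkl},$$

$$Ind_1(e_{ij}(u), f) = \begin{cases} 0, & \text{if } f_{ij} = t H^1_{ijkl}\, e_{kl}(u) + (1 - t) H^0_{ijkl}\, e_{kl}(u), \\ +\infty, & \text{otherwise,} \end{cases}$$

$$Ind_2(u, f) = \begin{cases} 0, & \text{if } f_{ij,j} + P_i = 0, \text{ a.e. in } V, \\ +\infty, & \text{otherwise,} \end{cases}$$

$$B = \left\{ t \text{ measurable} \;\middle|\; t(x) \in \{0, 1\}, \text{ a.e. in } V, \ \int_V t(x)\, dx \leq t_1 |V| \right\},$$

and also $Y = L^2(V; \mathbb{R}^9)$. By analogy to the last section, we may obtain

$$\inf_{(u,f) \in U \times Y} \{J(u, f)\} = \inf_{(t, \sigma) \in B \times B^*} \left\{ \frac{1}{2} \int_V \bar{H}_{ijkl}(t)\, \sigma_{ij} \sigma_{kl}\, dx \right\},$$

where

$$\{\bar{H}_{ijkl}(t)\} = \{H_{ijkl}(t)\}^{-1}$$

and

$$B^* = \{ \sigma \in Y^* \mid \sigma_{ij,j} + P_i = 0, \text{ in } V, \ \sigma_{ij} n_j = 0 \text{ on } S_1 \}.$$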


Remark 15.5.1. Similarly to earlier sections, we may alsoobtain the duality principle:

inf(u,f)∈U×Y

J(u, f) ≥ sup(u,v∗)∈U×C∗

infz∗∈Y ∗

F ∗(z∗) −G∗(z∗ + v∗, e(u))

+〈u, P 〉L2(S;R3)

,

where

F ∗(z∗) =1

2K

∫

S

z∗ijz∗ij dx,

and

G∗(z∗, f ∗) = supt∈B

1

2

∫

S

HKijkl(t)z

∗ijz

∗kl dx

+1

2

∫

S

HKijkl(t)Hklop(t)f

∗opz

∗ij dx (15.36)

+1

2

∫

S

Hijkl(t)HKklop(t)Hopmn(t)f ∗

mnf∗ij dx

.

HereHijkl(t) = tH1

ijkl−1 + (1 − t)H0ijkl−1

and

HKijkl(t) = tH1

ijkl +KI−1 + (1 − t)H0ijkl +KI−1.

Finally

C∗ = v∗ ∈ Y ∗ | v∗ij,j = 0 in V, v∗ijnj = 0, on ∂S1.Also observe that the last dual functional, now denoted by J∗(v∗, u),may be written as

J∗(v∗, u) = inft∈B

infz∗∈Y ∗

1

2K

∫

S

z∗ijz∗ij dx

−1

2

∫

S

HKijkl(t)(z

∗ij + v∗ij)(z

∗kl + v∗kl) dx

−1

2

∫

S

HKijkl(t)Hklop(t)eop(u)(z

∗ij + v∗ij) dx

−1

2

∫

S

Hijkl(t)HKklop(t)Hopmn(t)emn(u)eij(u) dx

+〈u, P 〉L2(S;R3). (15.37)

After the evaluation of the infimum in $z^*$, we obtain a concave functional in $(v^*, u)$, as the infimum (in $t$) of a collection of concave functionals in $(v^*, u)$, even though we have not proved it in the present work.

15.6. A Numerical Example

Consider again the optimization problem which is given by the minimization of $J : U \to \mathbb{R}$, where

$$J(w) = \inf_{t \in C} \left\{ \int_S \frac{H_{\alpha\beta\lambda\mu}(t)}{2}\, w_{,\alpha\beta}\, w_{,\lambda\mu}\, dx \right\}, \tag{15.38}$$

subject to

$$(H_{\alpha\beta\lambda\mu}(t)\, w_{,\lambda\mu})_{,\alpha\beta} = P, \text{ in } S \tag{15.39}$$

and

$$\int_S (t h_1 + (1 - t) h_0)\, dx = t_1 h_1 |S| = V, \tag{15.40}$$

where

$$C = \{ t \text{ measurable} \mid t(x) \in [0, 1], \text{ a.e. in } S \},$$

$t_1 = 0.46$ (in this example) and $|S|$ denotes the Lebesgue measure of $S$. Moreover,

$$U = W_0^{2,2}(S) = \{ w \in W^{2,2}(S) \mid w = 0 \text{ on } \Gamma \}. \tag{15.41}$$

We develop numerical results for the particular case where

$$H_{\alpha\beta\lambda\mu}(t) = H(t) = h(t)^3 E, \tag{15.42}$$

where $h(t) = t h_1 + (1 - t) h_0$, $h_1 = 0.1$, $h_0 = 10^{-4}$ and $E = 10^6$, with the units related to the International System (here we have denoted $x = (x_1, x_2)$). Observe that $0 \leq t(x) \leq 1$, a.e. in $S$.
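To give a feeling for the data of this example, the following Python sketch (an illustration only, assuming NumPy; the function names `thickness` and `stiffness` and the uniform grid are hypothetical) evaluates $h(t)$ and $H(t) = h(t)^3 E$ at the extreme values of $t$ and computes the constant design $t$ that satisfies the volume constraint (15.40) on $S = [0,1] \times [0,1]$. It does not solve the dual problem; it only checks the constraint and shows how strongly the stiffness depends on $t$.

```python
import numpy as np

# Data quoted in this section (SI units).
h1, h0, E, t1 = 0.1, 1.0e-4, 1.0e6, 0.46

def thickness(t):
    """Plate thickness h(t) = t*h1 + (1 - t)*h0."""
    return t * h1 + (1.0 - t) * h0

def stiffness(t):
    """Scalar stiffness H(t) = h(t)^3 * E from (15.42)."""
    return thickness(t) ** 3 * E

# Constant design satisfying thickness(t) = t1*h1, hence the volume constraint.
t_const = (t1 * h1 - h0) / (h1 - h0)

# Check the constraint on a uniform n-by-n grid of the unit square (|S| = 1).
n = 50
t_field = np.full((n, n), t_const)
volume = float(thickness(t_field).mean())

print(t_const, volume, stiffness(0.0), stiffness(1.0))
```

With these data the stiffness ranges over roughly nine orders of magnitude between $t = 0$ and $t = 1$, which is why the optimal $t(x)$ is far from the constant design in general.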

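The concerned duality principle would be

$$\inf_{(w,f) \in U \times Y} \{J(w, f)\} = \inf_{(t, \{M_{\alpha\beta}\}) \in B \times D^*} \left\{ \frac{1}{2} \int_S \bar{H}(t)\, M_{\alpha\beta} M_{\lambda\mu}\, dS \right\},$$

where, as in Section 15.3, $\bar{H}(t) = H(t)^{-1}$ and

$$D^* = \{ \{M_{\alpha\beta}\} \in Y^* \mid M_{\alpha\beta,\alpha\beta} + P = 0, \text{ in } S, \ M_{\alpha\beta} n_\alpha n_\beta = 0 \text{ on } \Gamma \}.$$

We have computed the dual problem for $S = [0, 1] \times [0, 1]$ and a vertical load acting on the plate given by $P(x) = 10000$, obtaining the results indicated in the respective figures, for $t(x)$ and $w_0(x)$.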


Figure 1. Vertical axis: $t(x)$, where $t(x) h_1$ is the plate thickness. (Surface plot over $S = [0, 1] \times [0, 1]$; vertical axis from 0 to 1.)

Figure 2. Vertical axis: $w_0(x)$, the field of displacements for $P(x) = 10000$. (Surface plot over $S = [0, 1] \times [0, 1]$; vertical axis from −0.012 to 0.)

15.7. Conclusion

In this article we have developed dual variational formulations, first for the optimal design of the variable thickness of a plate and, in a second step, for a two-phase problem in elasticity. The infima in $t$ indicated in the dual formulations represent the structure search for stiffness in the optimization process, which implies the minimization of the internal work. In some cases the primal problem may not have solutions, so that the solution of the dual problem is a weak cluster point of minimizing sequences for the primal formulation. We expect the results obtained can be used as engineering project tools.

CHAPTER 16

Duality Applied to Micro-Magnetism

16.1. Introduction

In this article we develop dual variational formulations for models in micro-magnetism. For the primal formulation we refer to references [26, 30] for details.

Let $\Omega \subset \mathbb{R}^3$ be an open bounded set with a finite Lebesgue measure and a regular boundary denoted by $\partial\Omega$, with the corresponding outer normal denoted by $n$. Consider the model of micro-magnetism in which the magnetization $m : \Omega \to \mathbb{R}^3$ is given by the minimization of the functional

$$
J(m, f) = \frac{\alpha}{2} \int_\Omega |\nabla m|_2^2\, dx + \int_\Omega \varphi(m(x))\, dx - \int_\Omega H(x) \cdot m\, dx + \frac{1}{2} \int_{\mathbb{R}^3} |f(z)|_2^2\, dz,
\tag{16.1}
$$

where

$$m = (m_1, m_2, m_3) \in W^{1,2}(\Omega; \mathbb{R}^3) \equiv Y_1, \quad |m(x)|_2 = 1, \text{ a.e. in } \Omega \tag{16.2}$$

and $f \in L^2(\mathbb{R}^3; \mathbb{R}^3) \equiv Y_2$ is the unique field determined by the simplified Maxwell's equations

$$Curl(f) = 0, \quad div(-f + m\chi_\Omega) = 0, \text{ a.e. in } \mathbb{R}^3. \tag{16.3}$$

Here $H \in L^2(\Omega; \mathbb{R}^3)$ is a known external field and $\chi_\Omega$ is a function defined by

$$\chi_\Omega(x) = \begin{cases} 1, & \text{if } x \in \Omega, \\ 0, & \text{otherwise.} \end{cases} \tag{16.4}$$

The term

$$\frac{\alpha}{2} \int_\Omega |\nabla m|_2^2\, dx$$

is called the exchange energy, where

$$|m|_2 = \sqrt{\sum_{k=1}^3 m_k^2}$$

and

$$|\nabla m|_2^2 = \sum_{k=1}^3 |\nabla m_k|_2^2.$$

Finally, $\varphi(m)$ represents the anisotropic contribution and is given by a multi-well functional whose minima establish the preferred directions of magnetization.

16.2. The Primal formulations and the Duality Principles

16.2.1. Summary of Results for the Hard Uniaxial Case.

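We examine first the case of uniaxial material with no exchange energy. That is, $\alpha = 0$ and $\varphi(m) = \beta(1 - |m \cdot e|)$.

Observe that

$$\varphi(m) = \min\{\beta(1 + m \cdot e), \beta(1 - m \cdot e)\},$$

where $\beta > 0$ and $e \in \mathbb{R}^3$ is a unit vector. Thus we can express the functional $J : Y_1 \times Y_2 \to \bar{\mathbb{R}} = \mathbb{R} \cup \{+\infty\}$ as

$$J(m, f) = G(m, f) + F(m),$$

where

$$
G(m, f) = \int_\Omega \min\{g_1(m), g_2(m)\}\, dx + \frac{1}{2} \int_{\mathbb{R}^3} |f(z)|_2^2\, dz + Ind_0(m) + Ind_1(f) + Ind_2(m, f),
$$

and

$$F(m) = -\int_\Omega H(x) \cdot m\, dx.$$

Here,

$$g_1(m) = \beta(1 + m \cdot e),$$

$$g_2(m) = \beta(1 - m \cdot e),$$

$$Ind_0(m) = \begin{cases} 0, & \text{if } |m(x)|_2 = 1 \text{ a.e. in } \Omega, \\ +\infty, & \text{otherwise,} \end{cases}$$

$$Ind_1(m, f) = \begin{cases} 0, & \text{if } div(-f + m\chi_\Omega) = 0 \text{ a.e. in } \mathbb{R}^3, \\ +\infty, & \text{otherwise,} \end{cases}$$

and

$$Ind_2(f) = \begin{cases} 0, & \text{if } Curl(f) = \theta, \text{ a.e. in } \mathbb{R}^3, \\ +\infty, & \text{otherwise.} \end{cases}$$

The dual functional for such a variational formulation can be expressed by the following duality principle: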

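$$
\begin{aligned}
\inf_{(m,f) \in Y_1 \times Y_2} \{J(m, f)\} \geq \sup_{(\lambda_1, \lambda_2) \in Y^*} \Bigg\{ \inf_{t \in B} \Bigg\{
& -\int_\Omega \left( \sum_{i=1}^3 \left( \frac{\partial \lambda_2}{\partial x_i} + H_i + \beta(1 - 2t) e_i \right)^2 \right)^{1/2} dx \\
& - \frac{1}{2} \int_{\mathbb{R}^3} |Curl^* \lambda_1 - \nabla\lambda_2|_2^2\, dx + \int_\Omega \beta\, dx \Bigg\} \Bigg\},
\end{aligned}
\tag{16.5}
$$

where

$$B = \{ t \text{ measurable} \mid t(x) \in [0, 1], \text{ a.e. in } \Omega \}$$

and

$$Y^* = \{ (\lambda_1, \lambda_2) \in W^{1,2}(\mathbb{R}^3; \mathbb{R}^3) \times W^{1,2}(\mathbb{R}^3) \mid \lambda_2 = 0 \text{ on } \partial\Omega \}.$$

16.2.2. The Results for the Full Semi-linear Case.

Now we present the duality principle for the full semi-linear case, that is, for $\alpha > 0$. First we define $G : Y_1 \times Y_2 \to \bar{\mathbb{R}}$ and $F : Y_1 \times Y_2 \to \mathbb{R}$ as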

G(m, f) =α

2

∫

Ω

|∇m|22 dx+

∫

Ω

ming1(m), g2(m) dx

+1

2

∫

R3

|f(z)|22 dz + Ind0(m) + Ind1(m) + Ind2(m), (16.6)

and

F (m, f) = −∫

Ω

H ·m dx.

Also,g1(m) = β(1 +m · e),g2(m) = β(1 −m · e),

Ind0(m) =


Ind1(m, f) =


and

Ind2(f) =



For J(m, f) = G(m, f) +F (m, f), the dual variational formulationis given by the following duality principle

inf(m,f)∈Y1×Y2

J(m, f) ≥ sup(λ1,λ2,y∗)∈Y ∗×Y ∗

0

inft∈B

− 1

2α

∫

Ω

|y∗|22 dx

−∫

Ω

(

3∑

i=1

(div(y∗i ) +Hi + (1 − 2t)βei +∂λ2

∂xi)2)1/2 dx

−1

2

∫

R3

|∇λ2 − Curl∗λ1|22dz

+

∫

Ω

βdx, (16.7)

where

B = t measurable | t(x) ∈ [0, 1], a.e. in Ω,

Y ∗0 = y∗ ∈W 1,2(Ω; R3×3) | y∗i · n = 0 on ∂Ω, ∀i ∈ 1, 2, 3,

and

Y ∗ = (λ1, λ2) ∈W 1,2(R3; R3) ×W 1,2(R3) | λ2 = 0 on ∂Ω.

Remark 16.2.1. It is important to emphasize that in both cases the dual formulations are concave. Thus the dual problems always have solutions, even when, for the hard uniaxial case, the minimizer in the primal problem is not attained.

16.3. A Preliminary Result

Now we present a simple but very useful result, through which we establish our first duality principles.

Theorem 16.3.1. Consider $(G \circ \Lambda) : V \to \mathbb{R}$ (not necessarily convex) such that $J : V \to \mathbb{R}$ defined by

$$J(m) = G(\Lambda m) - \langle m, f\rangle_V, \quad \forall m \in V,$$

is bounded from below (here as usual $\Lambda : U \to Y$ is a continuous linear operator). Under such assumptions, we have

$$\inf_{m \in V} \{J(m)\} = \sup_{y^* \in A^*} \{-(G \circ \Lambda)^*(\Lambda^* y^*)\},$$

where

$$A^* = \{ y^* \in Y^* \mid \Lambda^* y^* - f = 0 \}.$$

Proof. The proof is simple, just observe that

$$-(G \circ \Lambda)^*(\Lambda^* y^*) = -(G \circ \Lambda)^*(f) = -\sup_{m \in V} \{\langle m, f\rangle_V - G(\Lambda m)\},$$

$\forall y^* \in A^*$.

Remark 16.3.2. What seems to be relevant is that, when computing $(G \circ \Lambda)^*(\Lambda^* y^*)$, we obtain a duality which is perfect concerning the convex envelope of the primal formulation.

16.4. The Duality Principle for the Hard Case

We recall the primal formulation for the hard uniaxial case,expressed by J(m, f) where

J(m, f) = G(m, f) + F (m),

G(m, f) =

∫

Ω

ming1(m), g2(m)dx+1

2

∫

R3

|f(z)|22 dz

+ Ind0(m) + Ind1(f) + Ind2(m, f),

and

F (m, f) = −∫

Ω

H(x) ·m dx.

Also,g1(m) = β(1 +m · e),g2(m) = β(1 −m · e),

Ind0(m) =


Ind1(m, f) =


and

Ind2(f) =


From Theorem 16.3.1, we may write

inf(m,f)∈Y1×Y2

J(m, f)

= sup(m∗,f∗)∈Y ∗

1 ×Y ∗

2

−G∗(m∗, f ∗) − F ∗(−m∗,−f ∗)). (16.8)

We now calculate the dual functionals. First we have that

G∗(m∗, f ∗) = sup(m,f)∈Y1×Y2

〈m,m∗〉L2(Ω;R3)+〈f, f ∗〉L2(R3;R3)−G(m, f),


or

G∗(m∗, f ∗) = sup(m,f)∈Y1×Y2

〈m,m∗〉L2(Ω;R3) + 〈f, f ∗〉L2(R3;R3)

−∫

Ω

ming1(m), g2(m) dx−∫

R3

|f(z)|22 dz

−Ind0(m) − Ind1(f) − Ind2(m, f) . (16.9)

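That is,

$$
\begin{aligned}
G^*(m^*, f^*) = \sup_{(m,f) \in Y_1 \times Y_2} \Bigg\{
& \langle m, m^*\rangle_{L^2(\Omega; \mathbb{R}^3)} + \langle f, f^*\rangle_{L^2(\mathbb{R}^3; \mathbb{R}^3)} \\
& - \inf_{t \in B} \left\{ \int_\Omega (t g_1(m) + (1 - t) g_2(m))\, dx \right\} - \int_{\mathbb{R}^3} |f(z)|_2^2\, dz \\
& - Ind_0(m) - Ind_1(f) - Ind_2(m, f) \Bigg\},
\end{aligned}
\tag{16.10}
$$

where

$$B = \{ t \text{ measurable} \mid t(x) \in [0, 1], \text{ a.e. in } \Omega \}.$$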
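The step from (16.9) to (16.10) rests on the pointwise identity $\min\{g_1, g_2\} = \inf_{t \in [0,1]} \{t g_1 + (1 - t) g_2\}$, together with $\varphi(m) = \beta(1 - |m \cdot e|) = \min\{g_1(m), g_2(m)\}$. The Python sketch below is only a sanity check of these identities at randomly sampled unit vectors; it assumes NumPy, and the names `phi` and `relaxed_min`, the value of `beta` and the choice of `e` are hypothetical.

```python
import numpy as np

def phi(m, e, beta):
    """Anisotropy density beta * (1 - |m . e|)."""
    return beta * (1.0 - abs(float(np.dot(m, e))))

def relaxed_min(m, e, beta, n_t=101):
    """inf over t in [0, 1] of t*g1(m) + (1 - t)*g2(m), sampled on a grid in t."""
    g1 = beta * (1.0 + float(np.dot(m, e)))
    g2 = beta * (1.0 - float(np.dot(m, e)))
    t = np.linspace(0.0, 1.0, n_t)       # includes the endpoints t = 0 and t = 1
    return float(np.min(t * g1 + (1.0 - t) * g2))

rng = np.random.default_rng(0)
e = np.array([0.0, 0.0, 1.0])            # illustrative unit easy axis
beta = 2.0                               # illustrative anisotropy constant
max_gap = 0.0
for _ in range(1000):
    m = rng.normal(size=3)
    m /= np.linalg.norm(m)               # enforce |m|_2 = 1
    max_gap = max(max_gap, abs(phi(m, e, beta) - relaxed_min(m, e, beta)))

print(max_gap)
```

Since the expression is affine in $t$, the infimum is always attained at $t = 0$ or $t = 1$, which is why a grid containing both endpoints reproduces the minimum up to rounding.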

Thus,

G∗(m∗, f ∗) = sup(m,f,t)∈Y1×Y2×B

〈m,m∗〉L2(Ω;R3) + 〈f, f ∗〉L2(R3;R3)

−∫

Ω

(tg1(m) + (1 − t)g2(m) )dx−∫

R3

|f(z)|22 dz

−Ind0(m) − Ind1(f) − Ind2(m, f), (16.11)

or,

G∗(m∗, f ∗) = supt∈B

sup(m,f)∈Y1×Y2

〈m,m∗〉L2(Ω;R3) + 〈f, f ∗〉L2(R3;R3)

−∫

Ω

(tg1(m) + (1 − t)g2(m)) dx−∫

R3

|f(z)|22 dz

−Ind0(m) − Ind1(f) − Ind2(m, f). (16.12)


Hence


sup(m,f)∈Y1×Y2

〈m,m∗〉L2(Ω;R3) + 〈f, f ∗〉L2(R3;R3)

−∫

Ω

(tg1(m) + (1 − t)g2(m))dx−∫

R3

|f(z)|22 dz

−∫

Ω

λ

2(

3∑

i=1

m2i − 1)dx− 〈Curl(f), λ1〉L2(R3,R3)

−〈div(−f +mχΩ), λ2〉L2(R3),where λ, λ1 and λ2 are appropriate Lagrange Multipliers concerningthe respective constraints.

Therefore,


sup(m,f)∈Y1×Y2

〈m,m∗〉L2(Ω;R3) + 〈f, f ∗〉L2(R3;R3)

−∫

Ω

β(t(1 +m.e) + (1 − t)(1 −m.e))dx

−∫

R3

|f(z)|22 dz

−∫

Ω

λ

2(

3∑

i=1

m2i − 1)dx− 〈Curl(f), λ1〉L2(R3;R3)

−〈div(−f +mχΩ), λ2〉L2(R3).The last indicated supremum is attained for functions satisfying

the equations

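$$m_i^* + \beta(1 - 2t) e_i - \lambda m_i + \frac{\partial \lambda_2}{\partial x_i} = 0,$$

or

$$m_i = \frac{m_i^* + \beta(1 - 2t) e_i + \dfrac{\partial \lambda_2}{\partial x_i}}{\lambda},$$

and thus from the constraint

$$\sum_{i=1}^3 m_i^2 - 1 = 0$$

we obtain

$$\lambda = \left( \sum_{i=1}^3 \left( m_i^* + \beta(1 - 2t) e_i + \frac{\partial \lambda_2}{\partial x_i} \right)^2 \right)^{1/2}.$$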
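Pointwise, the computation above is the maximization of a linear form $a \cdot m$ over the unit sphere, with $a_i = m_i^* + \beta(1 - 2t) e_i + \partial\lambda_2/\partial x_i$: the stationarity condition gives $m = a/\lambda$, the constraint gives $\lambda = |a|_2$, and the optimal value is $|a|_2$, which is the origin of the square-root integrand in the dual functional. The following Python sketch checks this at a single point for an arbitrary vector `a`; it assumes NumPy, and the helper name `max_over_unit_sphere` is hypothetical.

```python
import numpy as np

def max_over_unit_sphere(a):
    """Closed-form solution of sup { a . m : |m|_2 = 1 } for a != 0.

    Returns the maximizer m = a / lambda and the multiplier lambda = |a|_2.
    """
    lam = float(np.linalg.norm(a))
    return a / lam, lam

rng = np.random.default_rng(0)
a = rng.normal(size=3)                   # stands for the vector a at one point x
m_opt, lam = max_over_unit_sphere(a)

# Compare with a brute-force search over random unit vectors.
samples = rng.normal(size=(10000, 3))
samples /= np.linalg.norm(samples, axis=1, keepdims=True)
best_sampled = float((samples @ a).max())

print(float(np.dot(a, m_opt)), lam, best_sampled)
```

No sampled unit vector exceeds $|a|_2$, and the value is attained at $m = a/|a|_2$, in agreement with the formula for $\lambda$.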


Also, the supremum in f is achieved for functions satisfying

f ∗ − f + Curl∗λ1 −∇λ2 = 0.

Observe that we need the condition λ2 = 0 on ∂Ω to have a finitesupremum, so that


inf(λ1,λ2)∈Y

∫

Ω

(

3∑

i=1

(m∗i + β(1 − 2t)ei +

∂λ2

∂xi)2)1/2dx

+1

2

∫

R3

|f ∗ + Curl∗λ1 −∇λ2|22dx

−∫

Ω

βdx.

Furthermore

F ∗(−m∗,−f ∗)

= sup(m,f)∈Y1×Y2

〈m,−m∗〉L2(Ω;R3) + 〈f,−f ∗〉L2(R3;R3) − F (m, f),

F ∗(−m∗,−f ∗)

= sup(m,f)∈Y1×Y2

〈m,−m∗〉L2(Ω;R3)+〈f,−f ∗〉L2(R3;R3)−∫

Ω

H(x)·m dx,

so that

F ∗(−m∗,−f ∗) =

0, if (m∗, f ∗) ∈ B∗,+∞, otherwise,

(16.13)

where

B∗(m∗, f ∗) ∈ Y ∗1 × Y ∗

2 | m∗ = H a.e. in Ω, f ∗ = θ, a.e. in R3.Therefore we may summarize the duality principle indicated in(16.8) as

inf(m,f)∈Y1×Y2

J(m, f) =

inft∈B


−∫

Ω

(

3∑

i=1

(Hi + β(1 − 2t)ei +∂λ2

∂xi)2)1/2dx

−1

2

∫

R3

|Curl∗λ1 −∇λ2|22 dx+

∫

Ω

βdx

, (16.14)

whereB = t measurable | t(x) ∈ [0, 1], a.e. in Ω,

and

Y ∗ = (λ1, λ2) ∈W 1,2(R3; R3) ×W 1,2(R3) | λ2 = 0 on ∂Ω.


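Thus, interchanging the infimum and supremum in the dual formulation we finally obtain

$$
\begin{aligned}
\inf_{(m,f) \in Y_1 \times Y_2} \{G(m, f) + F(m, f)\} \geq \sup_{(\lambda_1, \lambda_2) \in Y^*} \Bigg\{ \inf_{t \in B} \Bigg\{
& -\int_\Omega \left( \sum_{i=1}^3 \left( \frac{\partial \lambda_2}{\partial x_i} + H_i + \beta(1 - 2t) e_i \right)^2 \right)^{1/2} dx \\
& - \frac{1}{2} \int_{\mathbb{R}^3} |Curl^* \lambda_1 - \nabla\lambda_2|_2^2\, dx + \int_\Omega \beta\, dx \Bigg\} \Bigg\},
\end{aligned}
\tag{16.15}
$$

where

$$B = \{ t \text{ measurable} \mid t(x) \in [0, 1] \text{ a.e. in } \Omega \}$$

and

$$Y^* = \{ (\lambda_1, \lambda_2) \in W^{1,2}(\mathbb{R}^3; \mathbb{R}^3) \times W^{1,2}(\mathbb{R}^3) \mid \lambda_2 = 0 \text{ on } \partial\Omega \}.$$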
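The inequality in (16.15) is an instance of the general fact that $\sup \inf \leq \inf \sup$, applied when the order of the infimum in $t$ and the supremum in $(\lambda_1, \lambda_2)$ in (16.14) is interchanged. The Python sketch below illustrates this elementary inequality on a finite table of values; it assumes NumPy, and the random matrix `A` is only a stand-in, where `A[i, j]` plays the role of the dual functional evaluated at a pair $(\lambda^{(i)}, t^{(j)})$.

```python
import numpy as np

rng = np.random.default_rng(0)

# A[i, j]: stand-in for the value of the functional at (lambda_i, t_j).
A = rng.normal(size=(6, 8))

sup_inf = float(A.min(axis=1).max())     # sup over lambda of inf over t
inf_sup = float(A.max(axis=0).min())     # inf over t of sup over lambda

print(sup_inf, inf_sup)
```

For any table of values the first number never exceeds the second; equality requires additional structure, such as the saddle-point conditions discussed elsewhere in this chapter.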

16.5. An Alternative Dual Formulation for the Hard Case

16.5.1. The Primal Formulation. We examine again thecase where α = 0 and ϕ(x) = β(1 − |m.e|), the hard uniaxialcase.

Observe that

ϕ(m) = minβ(1 +m.e), β(1 −m.e)where β > 0 and e ∈ R3 is a unitary vector. Thus we can expressthe functional J : V → R = R ∪ +∞, (here V ≡ Y1 × Y2), as

J(m, f) = G(m, f) + F (m)

where

G(m, f) =

∫

Ω

ming1(m), g2(m)dx+1

2

∫

R3

|f(z)|2dz

+K

2〈m,m〉Y1 −

K

2|Ω|,

and

F (m, f) = Ind0(m) + Ind1(f) + Ind2(m, f) −∫

Ω

H(x) ·mdx.

Here we have defined,

g1(m) = β(1 +m.e),

g2(m) = β(1 −m.e),

Ind0(m) =



Ind1(m, f) =


and

Ind2(f) =


16.5.2. The Duality Principle. Similarly as in Theorem 8.2.5,we may write

inf(m,f)∈Y1×Y2

G∗∗(m, f) + F (m, f))

≥ sup(m∗,f∗)∈Y ∗

1 ×Y ∗

2

−G∗(m∗, f ∗) − F ∗(−m∗,−f ∗)). (16.16)

Having in mind the application of such a duality principle, wepass to the calculation of dual functionals. First we have that

G∗(m∗, f ∗) = sup(m,f)∈Y1×Y2

〈m,m∗〉L2(Ω;R3)+〈f, f ∗〉L2(R3;R3)−G(m, f),

or

G∗(m∗, f ∗) = sup(m,f)∈Y1×Y2

〈m,m∗〉L2(Ω;R3) + 〈f, f ∗〉L2(R3;R3)

−∫

Ω

ming1(m), g2(m)dx

−∫

R3

|f(z)|2dz − K

2〈m,m〉Y1 +

K

2|Ω|, (16.17)

that is,

G∗(m∗, f ∗)

=1

2K

∫

Ω

|m∗|2 dx+β

K

∫

Ω

|m∗ · e| dx+

β2

2|e|2 − β +

K

2

|Ω|

+1

2

∫

R3

|f ∗|2 dx.

On the other hand

F ∗(−m∗,−f ∗)

= sup(m,f)∈Y1×Y2

〈m,−m∗〉L2(Ω;R3) + 〈f,−f ∗〉L2(R3;R3) − F (m, f),


that is,

F ∗(m∗, f ∗) = sup(m,f)∈Y1×Y2

〈m,−m∗〉L2(Ω;R3) + 〈f,−f ∗〉Y2

− Ind0(m) − Ind1(f) − Ind2(m, f) +

∫

Ω

H(x) ·mdx.. (16.18)

Hence, as the evaluation of this last supremum is a quadratic opti-mization problem, so that we may write:

F ∗(−m∗,−f ∗) = sup(m,f)∈Y1×Y2

〈m,−m∗〉L2(Ω;R3) + 〈f,−f ∗〉L2(R3;R3)

−∫

Ω

λ

2(

3∑

i=1

m2i − 1)dx− 〈Curl(f), λ1〉L2(R3,R3)

−〈div(−f +mχΩ), λ2〉L2(R3) +

∫

Ω

H(x) ·m dx, (16.19)

where λ, λ1 and λ2 are appropriate Lagrange Multipliers concerningthe respective constraints.

The last supremum in m is attained for functions satisfying theequations

−m∗i +Hi − λmi +

∂λ2

∂xi= 0

or

mi =−m∗

i +Hi + ∂λ2

∂xi

λ= 0

and thus from the constraint3∑

i=1

m2i − 1 = 0

we obtain

λ =

(

3∑

i=1

(−m∗i +Hi +

∂λ2

∂xi

)2

)1/2

.

The supremum in f equals +∞, unless the following relation issatisfied:

f ∗ − Curl∗λ1 −∇λ2 = θ, in R3.

Observe that we need the condition λ2 = 0 on ∂Ω to have a finitesupremum, so that

F ∗(−m∗,−f ∗) = inf(λ1,λ2)∈Y

∫

Ω

(

3∑

i=1

(−m∗i +Hi +

∂λ2

∂xi

)2

)1/2

dx.


Therefore we may summarize the duality principle indicated in(16.16) as:

inf(m,f)∈Y1×Y2

G∗∗(m, f) + F (m, f)

≥ supm∈Y

− 1

2K

∫

Ω

|m∗|22 dx−β

K

∫

Ω

|m∗ · e| dx

−

β2

2|e|22 − β +

K

2

|Ω|

−∫

Ω

(

3∑

i=1

(−m∗i +Hi +

∂λ2

∂xi

)2

)1/2

dx

−1

2

∫

R3

|Curl∗λ1 −∇λ2|22 dx, (16.20)

wherem = (m∗, λ1, λ2),

Y = Y ∗1 × Y ∗,

and

Y ∗ = (λ1, λ2) ∈W 1,2(R3; R3) ×W 1,2(R3) | λ2 = 0 on ∂Ω.

Remark 16.5.1. The last inequality is in fact an equality, con-sidering that the dual formulation is concave and coercive, so thatthrough an application of the direct method of calculus of varia-tions, we may infer that supremum is attained with a correspondingpoint that solves the primal formulation. Thus the equality followsfrom the standard relations between primal and dual functionals.Denoting

G1(m, f) =

∫

Ω

ming1(m), g2(m)dx+K

2〈m,m〉Y1 −

K

2|Ω|,

G∗1(m

∗) =1

2K

∫

Ω

|m∗|22 dx+β

K

∫

Ω

|m∗ · e| dx

+

β2

2|e|22 − β +

K

2

|Ω|,

and

G∗2(m

∗, λ2) =

∫

Ω

(

3∑

i=1

(−m∗i +Hi +

∂λ2

∂xi

)2

)1/2

dx,


the optimality conditions for the dual problem are given by:

θ ∈ ∂G∗1(m

∗) + ∂G∗2(−m∗, λ2),

−div(Curl∗λ1 −∇λ2) ∈ div(

∂G∗2(−m∗, λ2)

)

,

and

Curl(Curl∗λ1 −∇λ2) = θ, in R3.

Hence, denoting f0 = Curl∗λ1 − ∇λ2, there exists m0 ∈ Y1 suchthat

m0 ∈ ∂G∗1(m

∗),

and therefore,

G∗1(m

∗) = 〈m0i , m∗i 〉L2(Ω) − G∗∗

1 (m0). (16.21)

Also

−m0 ∈ ∂G∗2(−m∗, λ2),

that is:

G∗2(−m∗, λ2) = 〈m0,−m∗ + ∇λ2〉L2(Ω,R3) − G∗∗

2 (m0)

= 〈m0,−m∗ +H + ∇λ2〉L2(Ω,R3) − Ind0(m0)

−Ind1(m0, f0) − Ind2(f0), (16.22)

and,

div(m0χΩ − f0) = 0, in R3. (16.23)

Finally,

1

2

∫

R3

|Curl∗λ1 −∇λ2|22 dx = 〈f0, Curl∗λ−∇λ2〉L2(R3,;R3)

− 1

2

∫

R3

|f0|22 dx. (16.24)

Thus, from (16.21), (16.22), (16.23) and (16.24), we have

G∗∗1 (m0) +

1

2

∫

R3

|f0|22 dx− 〈m0, H〉L2(Ω,R3)

Ind0(m0) + Ind1(m0, f0) + Ind2(f0)

= −G∗1(m

∗) − G∗2(−m∗, λ2) −

1

2

∫

R3

|Curl∗λ1 −∇λ2|22 dx.


Remark 16.5.2. We conjecture that if $K > 0$ is big enough, so that for a minimizing sequence of the original problem we have $G_1^{**}(m_n) = G_1(m_n)$ for all $n \in \mathbb{N}$ sufficiently big, then we can replace $G_1^{**}(m)$ by $G_1(m)$ in the last duality principle, and hence the duality gap between the primal and dual problems is zero.

16.6. The Full Semi-linear Case

Now we present a study concerning duality for the full semi-linear case, that is, for α > 0. First we recall the definition ofG : Y1 × Y2 → R and F : Y1 × Y2 → R, that is,

G(m, f) =α

2

∫

Ω

|∇m|22 dx+

∫

Ω

ming1(m), g2(m)dx

+1

2

∫

R3

|f(z)|22 dz + Ind0(m) + Ind1(m) + Ind2(m),

and

F (m, f) = −∫

Ω

H ·m dx.

Also,

g1(m) = β(1 +m · e),g2(m) = β(1 −m · e),

Ind0(m) =


Ind1(m, f) =


and

Ind2(f) =

0, if Curl(f) = 0, a.e. in R3,+∞, otherwise.

From Theorem 16.3.1, we have

inf(m,f)∈Y1×Y2

G(m, f) + F (m, f)

= sup(m∗,f∗)∈Y ∗

1 ×Y ∗

2

−G∗(m∗, f ∗) − F ∗(−m∗,−f ∗), (16.25)

where

G∗(m∗, f ∗) =

sup(m,f)∈Y1×Y2

〈m,m∗〉L2(Ω;R3) + 〈f, f ∗〉L2(R3;R3) −G(m, f),


and

F ∗(−m∗,−f ∗)

= sup(m,f)∈Y1×Y2

〈m,−m∗〉L2(Ω;R3) + 〈f,−f ∗〉L2(R3;R3) − F (m, f).

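Thus, from the above definitions we may write

$$
\begin{aligned}
G^*(m^*,f^*) = \sup_{(m,f)\in Y_1\times Y_2}\Big\{ & \langle m,m^*\rangle_{L^2(\Omega;\mathbb{R}^3)} + \langle f,f^*\rangle_{L^2(\mathbb{R}^3;\mathbb{R}^3)} - \frac{\alpha}{2}\int_\Omega |\nabla m|_2^2\,dx \\
& - \int_\Omega \min\{g_1(m), g_2(m)\}\,dx - \frac{1}{2}\int_{\mathbb{R}^3} |f(z)|_2^2\,dz \\
& - Ind_0(m) - Ind_1(m,f) - Ind_2(f)\Big\},
\end{aligned}
$$

or

$$
\begin{aligned}
G^*(m^*,f^*) = \sup_{(m,f)\in Y_1\times Y_2}\Big\{ & \langle m,m^*\rangle_{L^2(\Omega;\mathbb{R}^3)} + \langle f,f^*\rangle_{L^2(\mathbb{R}^3;\mathbb{R}^3)} - \frac{\alpha}{2}\int_\Omega |\nabla m|_2^2\,dx \\
& - \inf_{t\in B}\int_\Omega \big(t g_1(m) + (1-t) g_2(m)\big)\,dx - \frac{1}{2}\int_{\mathbb{R}^3} |f(z)|_2^2\,dz \\
& - Ind_0(m) - Ind_1(m,f) - Ind_2(f)\Big\},
\end{aligned} \tag{16.26}
$$

where

$$
B = \{t \text{ measurable} \mid t(x) \in [0,1], \text{ a.e. in } \Omega\}.
$$

Hence

$$
\begin{aligned}
G^*(m^*,f^*) = \sup_{(m,f,t)\in Y_1\times Y_2\times B}\Big\{ & \langle m,m^*\rangle_{L^2(\Omega;\mathbb{R}^3)} + \langle f,f^*\rangle_{L^2(\mathbb{R}^3;\mathbb{R}^3)} - \frac{\alpha}{2}\int_\Omega |\nabla m|_2^2\,dx \\
& - \int_\Omega \big(t g_1(m) + (1-t) g_2(m)\big)\,dx - \frac{1}{2}\int_{\mathbb{R}^3} |f(z)|_2^2\,dz \\
& - Ind_0(m) - Ind_1(m,f) - Ind_2(f)\Big\},
\end{aligned} \tag{16.27}
$$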


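or

$$
\begin{aligned}
G^*(m^*,f^*) = \sup_{t\in B}\Big\{\sup_{(m,f)\in Y_1\times Y_2}\Big\{ & \langle m,m^*\rangle_{L^2(\Omega;\mathbb{R}^3)} + \langle f,f^*\rangle_{L^2(\mathbb{R}^3;\mathbb{R}^3)} - \frac{\alpha}{2}\int_\Omega |\nabla m|_2^2\,dx \\
& - \int_\Omega \big(t g_1(m) + (1-t) g_2(m)\big)\,dx - \frac{1}{2}\int_{\mathbb{R}^3} |f(z)|_2^2\,dz \\
& - Ind_0(m) - Ind_1(m,f) - Ind_2(f)\Big\}\Big\}.
\end{aligned} \tag{16.28}
$$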

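Thus, as the second supremum is a quadratic optimization problem, there exist Lagrange multipliers $(\lambda, \lambda_1, \lambda_2) \in L^2(S) \times L^2(S;\mathbb{R}^3) \times L^2(S)$ such that it may be expressed as

$$
\begin{aligned}
\sup_{(m,f)\in Y_1\times Y_2}\Big\{ & \langle m,m^*\rangle_{L^2(\Omega;\mathbb{R}^3)} + \langle f,f^*\rangle_{L^2(\mathbb{R}^3;\mathbb{R}^3)} - \frac{\alpha}{2}\int_\Omega |\nabla m|_2^2\,dx \\
& - \int_\Omega \big(t g_1(m) + (1-t) g_2(m)\big)\,dx - \frac{1}{2}\int_{\mathbb{R}^3} |f(z)|_2^2\,dz \\
& - \int_\Omega \frac{\lambda}{2}\Big(\sum_{i=1}^{3} m_i^2 - 1\Big)\,dx - \langle Curl(f), \lambda_1\rangle_{L^2(\mathbb{R}^3;\mathbb{R}^3)} - \langle \operatorname{div}(-f + m\chi_\Omega), \lambda_2\rangle_{L^2(\mathbb{R}^3)}\Big\}.
\end{aligned}
$$

Hence we may write this supremum as

$$
\sup_{m\in Y_1}\{-G(\Lambda m) - F(m,t)\} + F_1^*(f^*),
$$

where

$$
F_1^*(f^*) = \frac{1}{2}\int_{\mathbb{R}^3} |f^* + Curl^*\lambda_1 - \nabla\lambda_2|_2^2\,dx,
$$

$$
G(\Lambda m) = \frac{\alpha}{2}\int_\Omega |\nabla m|_2^2\,dx, \qquad \Lambda m = \nabla m,
$$

and

$$
F(m,t) = -\langle m,m^*\rangle_{L^2(\Omega;\mathbb{R}^3)} + \int_\Omega \big(t g_1(m) + (1-t) g_2(m)\big)\,dx + \int_\Omega \frac{\lambda}{2}\Big(\sum_{i=1}^{3} m_i^2 - 1\Big)\,dx + \langle \operatorname{div}(m\chi_\Omega), \lambda_2\rangle_{L^2(\mathbb{R}^3)}. \tag{16.29}
$$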


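Therefore, similarly as in Theorem 8.2.5,

$$
\sup_{(m,f)\in Y_1\times Y_2}\{-G(\Lambda m) - F(m,t)\} \le \inf_{y^*\in Y_0^*}\{G^*(y^*) + F^*(-\Lambda^* y^*, t)\},
$$

where

$$
G^*(y^*) = \sup_{y\in L^2(\Omega;\mathbb{R}^{3\times 3})}\Big\{\langle y,y^*\rangle_{L^2(\Omega;\mathbb{R}^{3\times 3})} - \frac{\alpha}{2}\int_\Omega |y|_2^2\,dx\Big\} = \frac{1}{2\alpha}\int_\Omega |y^*|_2^2\,dx
$$

and

$$
F^*(-\Lambda^* y^*, t) = \sup_{m\in Y_1}\{-\langle \nabla m_i, y_i^*\rangle_{L^2(\Omega;\mathbb{R}^3)} - F(m,t)\}.
$$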
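The closed form for $G^*$ is the standard Fenchel conjugate of a quadratic. As a minimal numerical sketch (a scalar analogue, not the author's computation; the function names and the sample values of α and y* are illustrative assumptions), the supremum of $y\,y^* - (\alpha/2)y^2$ can be compared with $(y^*)^2/(2\alpha)$:

```python
def objective(y, y_star, alpha):
    """Scalar analogue of the integrand: y*y_star - (alpha/2)*y**2."""
    return y * y_star - 0.5 * alpha * y * y


def conjugate_closed_form(y_star, alpha):
    """Closed form (y_star**2)/(2*alpha), valid for alpha > 0."""
    return y_star * y_star / (2.0 * alpha)


def conjugate_by_grid(y_star, alpha, radius=10.0, n=20001):
    """Brute-force supremum over a uniform grid on [-radius, radius]."""
    step = 2.0 * radius / (n - 1)
    return max(objective(-radius + k * step, y_star, alpha) for k in range(n))
```

The grid search only approximates the supremum, so agreement is expected up to the grid resolution; the maximizer itself is $y = y^*/\alpha$.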

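Thus we may write

$$
\begin{aligned}
F^*(-\Lambda^* y^*, t) = \sup_{m\in Y_1}\Big\{ & -\langle \nabla m_i, y_i^*\rangle_{L^2(\Omega;\mathbb{R}^3)} + \langle m,m^*\rangle_{L^2(\Omega;\mathbb{R}^3)} \\
& - \int_\Omega \big(t(1 + m\cdot e) + (1-t)(1 - m\cdot e)\big)\beta\,dx \\
& - \int_\Omega \frac{\lambda}{2}\Big(\sum_{i=1}^{3} m_i^2 - 1\Big)\,dx - \langle \operatorname{div}(m\chi_\Omega), \lambda_2\rangle_{L^2(\mathbb{R}^3)}\Big\}.
\end{aligned} \tag{16.30}
$$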

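The last supremum is attained for functions satisfying $y_i^*\cdot n + \lambda_2 n_i = 0$ on ∂Ω, for all $i \in \{1,2,3\}$, where n denotes the outer normal to ∂Ω (such a condition is necessary to guarantee a finite supremum). Furthermore,

$$
\operatorname{div}(y_i^*) + m_i^* + (1-2t)\beta e_i - \lambda m_i + \frac{\partial\lambda_2}{\partial x_i} = 0,
$$

or

$$
m_i = \frac{\operatorname{div}(y_i^*) + m_i^* + (1-2t)\beta e_i + \dfrac{\partial\lambda_2}{\partial x_i}}{\lambda}.
$$

From the constraint $\sum_{i=1}^{3} m_i^2 - 1 = 0$, we obtain

$$
\lambda = \left(\sum_{i=1}^{3}\Big(\operatorname{div}(y_i^*) + m_i^* + (1-2t)\beta e_i + \frac{\partial\lambda_2}{\partial x_i}\Big)^2\right)^{1/2},
$$

so that

$$
F^*(-\Lambda^* y^*, t) = \int_\Omega \left(\sum_{i=1}^{3}\Big(\operatorname{div}(y_i^*) + m_i^* + (1-2t)\beta e_i + \frac{\partial\lambda_2}{\partial x_i}\Big)^2\right)^{1/2} dx - \int_\Omega \beta\,dx.
$$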
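The step from the unit-norm constraint to the square-root integrand rests on a pointwise fact: over unit vectors m, the linear form m · A is maximized by m = A/|A|, with maximal value |A|. A small sketch checking this numerically (the vector A and the helper names are illustrative assumptions, not part of the text):

```python
import math
import random


def dot(a, b):
    """Euclidean inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean norm |a|_2."""
    return math.sqrt(dot(a, a))


def maximizer(a):
    """Unit vector a/|a|, the maximizer of m . a over |m|_2 = 1."""
    n = norm(a)
    return [x / n for x in a]


def random_unit_vector(rng):
    """Random point on the unit sphere in R^3 (normalized Gaussian)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(3)]
    return maximizer(v)
```

Sampling many unit vectors and comparing m · A with |A| illustrates, but of course does not prove, the maximization property.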

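Therefore, summarizing the last results, we may write

$$
\begin{aligned}
G^*(m^*,f^*) \le \sup_{t\in B}\,\inf_{y^*\in Y_0^*}\Big\{ & \frac{1}{2\alpha}\int_\Omega |y^*|_2^2\,dx + \int_\Omega \Big(\sum_{i=1}^{3}\Big(\operatorname{div}(y_i^*) + m_i^* + (1-2t)\beta e_i + \frac{\partial\lambda_2}{\partial x_i}\Big)^2\Big)^{1/2} dx \\
& + \frac{1}{2}\int_{\mathbb{R}^3} |-f^* + Curl^*\lambda_1 - \nabla\lambda_2|_2^2\,dz - \int_\Omega \beta\,dx\Big\},
\end{aligned}
$$

where $Y_0^* = \{y^* \in W^{1,2}(\Omega;\mathbb{R}^{3\times 3}) \mid y_i^*\cdot n + \lambda_2 n_i = 0 \text{ on } \partial\Omega,\ \forall i \in \{1,2,3\}\}$. On the other hand,

$$
F^*(-m^*,-f^*) = \sup_{(m,f)\in Y_1\times Y_2}\Big\{\langle m,-m^*\rangle_{L^2(\Omega;\mathbb{R}^3)} + \langle f,-f^*\rangle_{L^2(\mathbb{R}^3;\mathbb{R}^3)} + \int_\Omega H\cdot m\,dx\Big\},
$$

so that

$$
F^*(-m^*,-f^*) = \begin{cases} 0, & \text{if } (m^*,f^*) \in B^*, \\ +\infty, & \text{otherwise,} \end{cases}
$$

where

$$
B^* = \{(m^*,f^*) \in Y_1^*\times Y_2^* \mid m^* = H, \text{ a.e. in } \Omega,\ f^* = \theta, \text{ a.e. in } \mathbb{R}^3\}.
$$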


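Therefore, we could summarize the duality principle indicated in (16.25) as

$$
\begin{aligned}
\inf_{(m,f)\in Y_1\times Y_2} J(m,f) \ge \inf_{t\in B}\Big\{\sup_{y^*\in Y_0^*}\Big\{ & -\frac{1}{2\alpha}\int_\Omega |y^*|_2^2\,dx \\
& - \int_\Omega \Big(\sum_{i=1}^{3}\Big(\operatorname{div}(y_i^*) + H_i + (1-2t)\beta e_i + \frac{\partial\lambda_2}{\partial x_i}\Big)^2\Big)^{1/2} dx \\
& - \frac{1}{2}\int_{\mathbb{R}^3} |\nabla\lambda_2 - Curl^*\lambda_1|_2^2\,dz + \int_\Omega \beta\,dx\Big\}\Big\},
\end{aligned} \tag{16.31}
$$

where

$$
B = \{t \text{ measurable} \mid t(x) \in [0,1], \text{ a.e. in } \Omega\},
$$

$$
Y_0^* = \{y^* \in W^{1,2}(\Omega;\mathbb{R}^{3\times 3}) \mid y_i^*\cdot n + \lambda_2 n_i = 0 \text{ on } \partial\Omega,\ \forall i \in \{1,2,3\}\},
$$

and

$$
Y^* = W^{1,2}(\mathbb{R}^3;\mathbb{R}^3) \times W^{1,2}(\mathbb{R}^3).
$$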

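Hence, we may obtain the following duality principle, for which the dual formulation is concave:

$$
\begin{aligned}
\inf_{(m,f)\in Y_1\times Y_2}\{G(m,f) + F(m,f)\} \ge \sup_{(\lambda_1,\lambda_2,y^*)\in Y^*\times Y_0^*}\Big\{\inf_{t\in B}\Big\{ & -\frac{1}{2\alpha}\int_\Omega |y^*|_2^2\,dx \\
& - \int_\Omega \Big(\sum_{i=1}^{3}\Big(\operatorname{div}(y_i^*) + H_i + (1-2t)\beta e_i + \frac{\partial\lambda_2}{\partial x_i}\Big)^2\Big)^{1/2} dx \\
& - \frac{1}{2}\int_{\mathbb{R}^3} |\nabla\lambda_2 - Curl^*\lambda_1|_2^2\,dz + \int_\Omega \beta\,dx\Big\}\Big\},
\end{aligned} \tag{16.32}
$$

where

$$
Y_0^* = \{y^* \in W^{1,2}(\Omega;\mathbb{R}^{3\times 3}) \mid y_i^*\cdot n = 0 \text{ on } \partial\Omega,\ \forall i \in \{1,2,3\}\},
$$

and

$$
Y^* = \{(\lambda_1,\lambda_2) \in W^{1,2}(\mathbb{R}^3;\mathbb{R}^3) \times W^{1,2}(\mathbb{R}^3) \mid \lambda_2 = 0 \text{ on } \partial\Omega\}.
$$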


16.7. The Cubic Case in Micro-magnetism

In this section we present results for the cubic case. For the primal formulation we refer to reference [26] for details.

Let Ω ⊂ R3 be an open bounded set with a finite Lebesgue measure and a regular boundary denoted by ∂Ω, and consider the model of micro-magnetism in which the magnetization m : Ω → R3 is given by the minimization of the functional J(m, f), where

$$
J(m,f) = \frac{\alpha}{2}\int_\Omega |\nabla m|_2^2\,dx + \int_\Omega \varphi(m(x))\,dx - \int_\Omega H\cdot m\,dx + \frac{1}{2}\int_{\mathbb{R}^3} |f(z)|_2^2\,dz,
$$

$$
m \in W^{1,2}(\Omega;\mathbb{R}^3) \equiv Y_1, \qquad |m(x)|_2 = 1, \text{ a.e. in } \Omega,
$$

and $f \in L^2(\mathbb{R}^3;\mathbb{R}^3) \equiv Y_2$ is the unique field determined by the simplified Maxwell's equations

$$
Curl(f) = 0, \qquad \operatorname{div}(-f + m\chi_\Omega) = 0, \quad \text{in } \mathbb{R}^3.
$$

Here the function ϕ(m) for the cubic anisotropy is given by

$$
\varphi(m) = K_0 + K_1\sum_{i\neq j} m_i^2 m_j^2 + K_2 m_1^2 m_2^2 m_3^2,
$$

where $K_1, K_2 > 0$. Also, $H \in L^2(\Omega;\mathbb{R}^3)$ is a known external field and $\chi_\Omega$ is the function defined as

$$
\chi_\Omega(x) = \begin{cases} 1, & \text{if } x \in \Omega, \\ 0, & \text{otherwise.} \end{cases}
$$

The term

$$
\frac{\alpha}{2}\int_\Omega |\nabla m|_2^2\,dx
$$

stands for the exchange energy. Finally, ϕ(m) represents the anisotropic contribution and is given by a multi-well functional whose minima establish the preferred directions of magnetization.

Remark 16.7.1. It is worth noting that ϕ(m) has six points of minimum, namely r1 = (1, 0, 0), r2 = (0, 1, 0), r3 = (0, 0, 1), r4 = (−1, 0, 0), r5 = (0, −1, 0) and r6 = (0, 0, −1), which define the preferred directions of magnetization.
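The six-well structure in the remark can be illustrated numerically. The sketch below evaluates the cubic anisotropy on the unit sphere, under the assumption that the sum over i ≠ j is taken over unordered pairs (counting ordered pairs only doubles K1 and does not move the minimizers); the constants K0 = 1, K1 = 2, K2 = 3 used in the check are arbitrary illustrative values.

```python
import math
import random

# The six directions r1, ..., r6 listed in the remark.
EASY_AXES = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (-1, 0, 0), (0, -1, 0), (0, 0, -1)]


def phi(m, k0, k1, k2):
    """Cubic anisotropy K0 + K1*sum_{i<j} mi^2 mj^2 + K2*m1^2 m2^2 m3^2."""
    s1, s2, s3 = (x * x for x in m)
    pairs = s1 * s2 + s1 * s3 + s2 * s3
    return k0 + k1 * pairs + k2 * s1 * s2 * s3


def random_unit_vector(rng):
    """Random point on the unit sphere in R^3 (normalized Gaussian)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(3)]
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]
```

For K1, K2 > 0 both correction terms are non-negative and vanish exactly when at most one component of m is non-zero, which on the unit sphere singles out the six axes.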


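16.7.1. The Primal Formulation. For α = 0, we define the primal formulation as $J : Y_1 \times Y_2 \to \bar{\mathbb{R}} = \mathbb{R} \cup \{+\infty\}$, where

$$
J(m,f) = G(m,f) + F(m,f).
$$

Here $G : Y_1 \times Y_2 \to \bar{\mathbb{R}} = \mathbb{R} \cup \{+\infty\}$ is defined as

$$
G(m,f) = \int_\Omega \varphi(m)\,dx + \frac{K}{2}\int_\Omega \Big(\sum_{k=1}^{3} m_k^2 - 1\Big)\,dx \tag{16.33}
$$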

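and we shall rewrite ϕ as the approximation (and it must be emphasized that this is an approximation of the original problem)

$$
\varphi(m) = \min\{g_1(m), g_2(m), g_3(m), g_4(m), g_5(m), g_6(m)\},
$$

where

$$
g_k(m) = \varphi(r_k) + \sum_{i=1}^{3}\frac{\partial\varphi(r_k)}{\partial m_i}(m_i - r_{ki}) + \frac{1}{2}\sum_{i=1}^{3}\sum_{j=1}^{3}\frac{\partial^2\varphi(r_k)}{\partial m_i\,\partial m_j}(m_i - r_{ki})(m_j - r_{kj})
$$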

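and, as mentioned above, r1 = (1, 0, 0), r2 = (0, 1, 0), r3 = (0, 0, 1), r4 = (−1, 0, 0), r5 = (0, −1, 0) and r6 = (0, 0, −1) are the points through which the possible microstructure is formed for the cubic case, which justifies such expansions. Also, we define F : V → R as

$$
F(m,f) = \frac{1}{2}\int_{\mathbb{R}^3} |f(z)|_2^2\,dz - \int_\Omega H\cdot m\,dx + Ind_1(m) + Ind_2(m,f) + Ind_3(f), \tag{16.34}
$$

where

$$
Ind_1(m) = \begin{cases} 0, & \text{if } |m(x)|_2 = 1, \text{ a.e. in } \Omega, \\ +\infty, & \text{otherwise,} \end{cases}
$$

$$
Ind_2(m,f) = \begin{cases} 0, & \text{if } \operatorname{div}(-f + m\chi_\Omega) = 0, \text{ a.e. in } \mathbb{R}^3, \\ +\infty, & \text{otherwise,} \end{cases}
$$

and

$$
Ind_3(f) = \begin{cases} 0, & \text{if } Curl(f) = 0, \text{ a.e. in } \mathbb{R}^3, \\ +\infty, & \text{otherwise.} \end{cases}
$$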

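16.7.2. The Duality Principles. Similarly as in Theorem 8.2.5, we have

$$
\inf_{(m,f)\in Y_1\times Y_2}\{G^{**}(m,f) + F(m,f)\} \ge \sup_{(m^*,f^*)\in Y_1^*\times Y_2^*}\{-G^*(m^*,f^*) - F^*(-m^*,-f^*)\}, \tag{16.35}
$$

where

$$
G^*(m^*,f^*) = \sup_{(m,f)\in Y_1\times Y_2}\{\langle m,m^*\rangle_{L^2(\Omega;\mathbb{R}^3)} + \langle f,f^*\rangle_{L^2(\mathbb{R}^3;\mathbb{R}^3)} - G(m,f)\},
$$

so that, defining $\tilde{g}_k(m) = g_k(m) + \frac{K}{2}\sum_{i=1}^{3} m_i^2$, we have

$$
G^*(m^*,f^*) = \begin{cases} \displaystyle\int_\Omega \max_{k\in\{1,\dots,6\}} \tilde{g}_k^*(m^*)\,dx + \frac{K}{2}|\Omega|, & \text{if } (m^*,f^*) \in B^*, \\ +\infty, & \text{otherwise,} \end{cases}
$$

where

$$
B^* = \{(m^*,f^*) \in Y_1^*\times Y_2^* \mid f^* = (0,0,0) \equiv \theta, \text{ a.e. in } \mathbb{R}^3\}.
$$

Also,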

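$$
F^*(-m^*,-f^*) = \sup_{(m,f)\in Y_1\times Y_2}\{-\langle m,m^*\rangle_{L^2(\Omega;\mathbb{R}^3)} - \langle f,f^*\rangle_{L^2(\mathbb{R}^3;\mathbb{R}^3)} - F(m,f)\},
$$

or

$$
\begin{aligned}
F^*(-m^*,-f^*) = \sup_{(m,f)\in Y_1\times Y_2}\Big\{ & -\langle m,m^*\rangle_{L^2(\Omega;\mathbb{R}^3)} - \langle f,f^*\rangle_{L^2(\mathbb{R}^3;\mathbb{R}^3)} - \frac{1}{2}\int_{\mathbb{R}^3} |f(z)|_2^2\,dz \\
& + \int_\Omega H\cdot m\,dx - Ind_1(m) - Ind_2(m,f) - Ind_3(f)\Big\},
\end{aligned}
$$

that is,

$$
\begin{aligned}
F^*(-m^*,-f^*) = \sup_{(m,f)\in Y_1\times Y_2}\Big\{ & -\langle m,m^*\rangle_{L^2(\Omega;\mathbb{R}^3)} - \langle f,f^*\rangle_{L^2(\mathbb{R}^3;\mathbb{R}^3)} - \frac{1}{2}\int_{\mathbb{R}^3} |f(z)|_2^2\,dz + \int_\Omega H\cdot m\,dx \\
& - \int_\Omega \frac{\lambda}{2}\Big(\sum_{k=1}^{3} m_k^2 - 1\Big)\,dx - \langle Curl(f), \lambda_1\rangle_{L^2(\mathbb{R}^3;\mathbb{R}^3)} - \langle \operatorname{div}(-f + m\chi_\Omega), \lambda_2\rangle_{L^2(\mathbb{R}^3)}\Big\},
\end{aligned}
$$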


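so that

$$
F^*(-m^*,-f^*) = \int_\Omega \Big(\sum_{i=1}^{3}\Big(\frac{\partial\lambda_2}{\partial x_i} - m_i^* + H_i\Big)^2\Big)^{1/2} dx + \frac{1}{2}\int_{\mathbb{R}^3} |f^* + Curl^*\lambda_1 - \nabla\lambda_2|_2^2\,dx.
$$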

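Hence the duality principle indicated in (16.35) may be expressed as

$$
\begin{aligned}
\inf_{(m,f)\in Y_1\times Y_2}\{G^{**}(m,f) + F(m,f)\} \ge \sup_{(m^*,\lambda_1,\lambda_2)\in Y^*}\Big\{ & -\int_\Omega \max_{k\in\{1,\dots,6\}} g_k^*(m^*)\,dx \\
& - \int_\Omega \Big(\sum_{i=1}^{3}\Big(\frac{\partial\lambda_2}{\partial x_i} - m_i^* + H_i\Big)^2\Big)^{1/2} dx \\
& - \frac{1}{2}\int_{\mathbb{R}^3} |Curl^*\lambda_1 - \nabla\lambda_2|_2^2\,dz - \frac{K}{2}|\Omega|\Big\},
\end{aligned} \tag{16.36}
$$

where

$$
Y^* = \{(m^*,\lambda_1,\lambda_2) \in Y_1^*\times W^{1,2}(\mathbb{R}^3;\mathbb{R}^3)\times W^{1,2}(\mathbb{R}^3) \mid \lambda_2 = 0 \text{ on } \partial\Omega\}.
$$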

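By analogy, we may obtain the results for the semi-linear cubic case (for α > 0), namely

$$
\begin{aligned}
\inf_{(m,f)\in Y_1\times Y_2}\{\tilde{G}(m,f) + F(m,f)\} \ge \sup_{(m^*,\lambda_1,\lambda_2,y^*)\in Y^*\times Y_0^*}\Big\{ & -\int_\Omega \max_{k\in\{1,\dots,6\}} g_k^*(m^*)\,dx - \frac{1}{2\alpha}\int_\Omega |y^*|_2^2\,dx \\
& - \int_\Omega \Big(\sum_{i=1}^{3}\Big(\operatorname{div}(y_i^*) + \frac{\partial\lambda_2}{\partial x_i} - m_i^* + H_i\Big)^2\Big)^{1/2} dx \\
& - \frac{1}{2}\int_{\mathbb{R}^3} |Curl^*\lambda_1 - \nabla\lambda_2|_2^2\,dz - \frac{K}{2}|\Omega|\Big\},
\end{aligned}
$$

where

$$
\tilde{G}(m,f) = \frac{\alpha}{2}\int_\Omega |\nabla m|_2^2\,dx + G(m,f),
$$

$$
Y_0^* = \{y^* \in W^{1,2}(\Omega;\mathbb{R}^{3\times 3}) \mid y_i^*\cdot n = 0 \text{ on } \partial\Omega\},
$$

and

$$
Y^* = \{(m^*,\lambda_1,\lambda_2) \in Y_1^*\times W^{1,2}(\mathbb{R}^3;\mathbb{R}^3)\times W^{1,2}(\mathbb{R}^3) \mid \lambda_2 = 0 \text{ on } \partial\Omega\}.
$$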

16.8. Final Results, Other Duality Principles

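In this section we establish new duality principles for the problems addressed above and finish with a conjecture concerning optimality conditions. We start with the following result.

Theorem 16.8.1. Let J : V × Y2 → R be defined by

$$
J(m,f) = \frac{\alpha}{2}\langle \nabla m_i, \nabla m_i\rangle_Y + \beta\int_\Omega (1 - |m\cdot e|)\,dx + \frac{1}{2}\int_{\mathbb{R}^3} |f(x)|_2^2\,dx - \langle m_i, H_i\rangle_Y + Ind_0(m) + Ind_1(m,f) + Ind_2(f),
$$

where

$$
Ind_0(m) = \begin{cases} 0, & \text{if } |m(x)|_2 = 1, \text{ a.e. in } \Omega, \\ +\infty, & \text{otherwise,} \end{cases}
$$

$$
Ind_1(m,f) = \begin{cases} 0, & \text{if } \operatorname{div}(-f + m\chi_\Omega) = 0, \text{ a.e. in } \mathbb{R}^3, \\ +\infty, & \text{otherwise,} \end{cases}
$$

and

$$
Ind_2(f) = \begin{cases} 0, & \text{if } Curl(f) = \theta, \text{ a.e. in } \mathbb{R}^3, \\ +\infty, & \text{otherwise.} \end{cases}
$$

Also V = W 1,2(Ω; R3), Y = L2(Ω), Y = L2(Ω; R3), and Y2 = L2(R3; R3). Observe that we may write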

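$$
J(m,f) = G_1(m) + G_2(m) + G_3(m) + G_4(f) + Ind_1(m,f) + Ind_2(f) - \langle m_i, H_i\rangle_Y - F(m), \tag{16.37}
$$

where

$$
G_1(m) = \frac{\alpha}{2}\langle \nabla m_i, \nabla m_i\rangle_Y,
$$

$$
G_2(m) = \beta\int_\Omega (1 - |m\cdot e|)\,dx + \frac{K}{2}\langle m_i, m_i\rangle_Y,
$$

$$
G_3(m) = \frac{K_1}{2}\langle m_i, m_i\rangle_Y + Ind_0(m),
$$

$$
G_4(f) = \frac{1}{2}\int_{\mathbb{R}^3} |f(x)|_2^2\,dx,
$$

and

$$
F(m) = \frac{K + K_1}{2}\langle m_i, m_i\rangle_Y.
$$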

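Through such a notation, the following duality principle holds:

$$
\inf_{(m,f)\in V\times Y_2} J(m,f) \ge \sup_{(v^*,\lambda)\in A^*\times Y^*}\Big\{\inf_{z^*\in Y^*}\big\{F^*(z^*) - G_1^*(v_1^*) - G_2^*(v_2^*) - G_3^*(v_3^* + z^* - \nabla\lambda_2) - G_4^*(\nabla\lambda_2 - Curl^*\lambda_1)\big\}\Big\}, \tag{16.38}
$$

where

$$
F^*(z^*) = \frac{1}{2(K + K_1)}\langle z_i^*, z_i^*\rangle_Y,
$$

$$
G_1^*(v_1^*) = \frac{1}{2\alpha}\langle v_{1i}^*, v_{1i}^*\rangle_Y,
$$

$$
G_2^*(v_2^*) = \frac{1}{2K}\int_\Omega |v_2^*|_2^2\,dx + \frac{\beta}{K}\int_\Omega |v_2^*\cdot e|\,dx + \Big(\frac{\beta^2}{2}|e|_2^2 - \beta\Big)|\Omega|,
$$

$$
G_3^*(v_3^* + z^* - \nabla\lambda_2) = \begin{cases} \displaystyle\int_\Omega |v_3^* + z^* - \nabla\lambda_2|_2\,dx, & \text{if } \lambda_2 = 0 \text{ on } \partial\Omega, \\ +\infty, & \text{otherwise,} \end{cases}
$$

and

$$
G_4^*(\nabla\lambda_2 - Curl^*\lambda_1) = \frac{1}{2}\|\nabla\lambda_2 - Curl^*\lambda_1\|_{2,\mathbb{R}^3}^2.
$$

Finally, $A^* = A_1 \cap A_2$, where

$$
A_1 = \{v^* \in W^{1,2}(\Omega;\mathbb{R}^{3\times 3})\times Y\times Y \mid \operatorname{div}(v_{1i}^*) - v_{2i}^* - v_{3i}^* + H_i = 0, \text{ in } \Omega\},
$$

$$
A_2 = \{v^* \in W^{1,2}(\Omega;\mathbb{R}^{3\times 3})\times Y\times Y \mid v_{1i}^*\cdot n = 0 \text{ on } \partial\Omega\},
$$

and

$$
Y^* = \{(\lambda_1,\lambda_2) \in W^{1,2}(\mathbb{R}^3;\mathbb{R}^3)\times W^{1,2}(\mathbb{R}^3) \mid \lambda_2 = 0 \text{ on } \partial\Omega\}.
$$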

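Proof. Concerning the dual functionals, we will make an explicit calculation only of $G_3^*$ and $G_4^*$ (the remaining calculations are elementary). We have that

$$
G_3^*(v_3^* + z^* - \nabla\lambda_2) = \sup_{m\in V}\{\langle m, v_3^* + z^*\rangle_Y + \langle \operatorname{div}(m), \lambda_2\rangle_Y - G_3(m) - Ind_1(m)\}, \tag{16.39}
$$

so that through a suitable Lagrange multiplier $\lambda_0$, we may write

$$
\begin{aligned}
G_3^*(v_3^* + z^* - \nabla\lambda_2) = \sup_{m\in V}\Big\{ & \langle m, v_3^* + z^*\rangle_Y - \langle m, \nabla\lambda_2\rangle_Y + \langle \lambda_2, m\cdot n\rangle_{L^2(\partial\Omega)} \\
& - \frac{K_1}{2}\langle m_i, m_i\rangle_Y - \frac{1}{2}\int_\Omega \lambda_0\Big[\Big(\sum_{k=1}^{3} m_k^2\Big) - 1\Big]dx\Big\}.
\end{aligned} \tag{16.40}
$$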

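The supremum is attained for functions satisfying the following equation

$$
v_3^* + z^* - \nabla\lambda_2 - K_1 m - \lambda_0 m = \theta, \tag{16.41}
$$

that is,

$$
m = \frac{v_3^* + z^* - \nabla\lambda_2}{K_1 + \lambda_0}.
$$

From the condition $\sum_{k=1}^{3} m_k^2 = 1$ in Ω, $\lambda_0$ must be such that

$$
m = \frac{v_3^* + z^* - \nabla\lambda_2}{|v_3^* + z^* - \nabla\lambda_2|_2},
$$

where $|\cdot|_2$ denotes the Euclidean norm in $\mathbb{R}^3$. Thus

$$
G_3^*(v_3^* + z^* - \nabla\lambda_2) = \begin{cases} \displaystyle\int_\Omega |v_3^* + z^* - \nabla\lambda_2|_2\,dx, & \text{if } \lambda_2 = 0 \text{ on } \partial\Omega, \\ +\infty, & \text{otherwise.} \end{cases}
$$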

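For the expression of $G_4^*(\nabla\lambda_2 - Curl^*\lambda_1)$, we have that

$$
G_4^*(\nabla\lambda_2 - Curl^*\lambda_1) = \sup_{f\in Y}\Big\{\langle \lambda_2, -\operatorname{div}(f)\rangle_Y + \langle \lambda_1, Curl(f)\rangle_Y - \frac{1}{2}\int_{\mathbb{R}^3} |f(x)|_2^2\,dx\Big\}.
$$

The supremum is attained for functions satisfying

$$
f = \nabla\lambda_2 - Curl^*\lambda_1, \tag{16.42}
$$

so that

$$
G_4^*(\nabla\lambda_2 - Curl^*\lambda_1) = \frac{1}{2}\|\nabla\lambda_2 - Curl^*\lambda_1\|_{2,\mathbb{R}^3}^2. \tag{16.43}
$$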

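Now, observe that

$$
\begin{aligned}
& G_1^*(v_1^*) + G_2^*(v_2^*) + G_3^*(v_3^* + z^* - \nabla\lambda_2) + G_4^*(\nabla\lambda_2 - Curl^*\lambda_1) \\
&\quad \ge \langle v_{1i}^*, \nabla m_i\rangle_Y + \langle v_{2i}^*, m_i\rangle_Y + \langle m, v_3^* + z^* - \nabla\lambda_2\rangle_Y + \langle f, \nabla\lambda_2 - Curl^*\lambda_1\rangle_Y \\
&\qquad - G_1(m) - G_2(m) - G_3(m) - G_4(f),
\end{aligned} \tag{16.44}
$$

∀(v∗, λ) ∈ A∗, m ∈ V, f ∈ Y. Thus,

$$
\begin{aligned}
& -F^*(z^*) + G_1^*(v_1^*) + G_2^*(v_2^*) + G_3^*(v_3^* + z^* - \nabla\lambda_2) + G_4^*(\nabla\lambda_2 - Curl^*\lambda_1) \\
&\quad \ge -F^*(z^*) + \langle m, z^*\rangle_Y + \langle m_i, H_i\rangle_Y - Ind_1(m,f) - Ind_2(f) \\
&\qquad - G_1(m) - G_2(m) - G_3(m) - G_4(f),
\end{aligned} \tag{16.45}
$$

∀(v∗, λ) ∈ A∗, m ∈ V, f ∈ Y, z∗ ∈ Y∗. Taking the supremum in z∗ on both sides of the last inequality, we obtain

$$
\begin{aligned}
& \sup_{z^*\in Y^*}\big\{-F^*(z^*) + G_1^*(v_1^*) + G_2^*(v_2^*) + G_3^*(v_3^* + z^* - \nabla\lambda_2) + G_4^*(\nabla\lambda_2 - Curl^*\lambda_1)\big\} \\
&\quad \ge F(m) + \langle m_i, H_i\rangle_Y - Ind_1(m,f) - Ind_2(f) - G_1(m) - G_2(m) - G_3(m) - G_4(f),
\end{aligned} \tag{16.46}
$$

∀(v∗, λ) ∈ A∗, m ∈ V, f ∈ Y, and therefore

$$
\inf_{(m,f)\in V\times Y_2} J(m,f) \ge \sup_{(v^*,\lambda)\in A^*\times Y^*}\Big\{\inf_{z^*\in Y^*}\big\{F^*(z^*) - G_1^*(v_1^*) - G_2^*(v_2^*) - G_3^*(v_3^* + z^* - \nabla\lambda_2) - G_4^*(\nabla\lambda_2 - Curl^*\lambda_1)\big\}\Big\}. \tag{16.47}
$$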

In a similar fashion, we may prove the following result.

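Theorem 16.8.2. Let J : V × Y2 → R be defined by

$$
J(m,f) = \frac{\alpha}{2}\langle \nabla m_i, \nabla m_i\rangle_Y + \beta\int_\Omega (1 - |m\cdot e|)\,dx + \frac{1}{2}\int_{\mathbb{R}^3} |f(x)|_2^2\,dx - \langle m_i, H_i\rangle_Y + Ind_0(m) + Ind_1(m,f) + Ind_2(f). \tag{16.48}
$$

Observe that we may write

$$
J(m,f) = G_1(m) + G_2(m) + G_3(m) + G_4(f) + Ind_1(m,f) + Ind_2(f) - \langle m_i, H_i\rangle_Y - F(m), \tag{16.49}
$$

where

$$
G_1(m) = \frac{\alpha}{2}\langle \nabla m_i, \nabla m_i\rangle_Y,
$$

$$
G_2(m) = \beta\int_\Omega (1 - |m\cdot e|)\,dx + \frac{K}{2}\langle m_i, m_i\rangle_Y,
$$

$$
G_3(m) = \frac{K_1}{2}\langle m_i, m_i\rangle_Y + Ind_0(m),
$$

$$
G_4(f) = \frac{1}{2}\int_{\mathbb{R}^3} |f(x)|_2^2\,dx,
$$

and

$$
F(m) = \frac{K + K_1}{2}\langle m_i, m_i\rangle_Y.
$$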

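Through such a notation, the following duality principle holds:

$$
\inf_{(m,f)\in V\times Y} J(m,f) \ge \sup_{(v^*,\lambda)\in A^*\times Y^*}\Big\{\inf_{z^*\in Y^*}\big\{F^*(\operatorname{div}(z^*)) - G_1^*(v_1^* + z^*) - G_2^*(v_2^*) - G_3^*(v_3^* - \nabla\lambda_2) - G_4^*(\nabla\lambda_2 - Curl^*\lambda_1)\big\}\Big\}, \tag{16.50}
$$

where

$$
F^*(\operatorname{div}(z^*)) = \frac{1}{2(K + K_1)}\langle \operatorname{div}(z_i^*), \operatorname{div}(z_i^*)\rangle_Y,
$$

$$
G_1^*(v_1^* + z^*) = \frac{1}{2\alpha}\langle v_{1i}^* + z_i^*, v_{1i}^* + z_i^*\rangle_Y,
$$

$$
G_2^*(v_2^*) = \frac{1}{2K}\int_\Omega |v_2^*|_2^2\,dx + \frac{\beta}{K}\int_\Omega |v_2^*\cdot e|\,dx + \Big(\frac{\beta^2}{2}|e|_2^2 - \beta\Big)|\Omega|,
$$

$$
G_3^*(v_3^* - \nabla\lambda_2) = \begin{cases} \displaystyle\int_\Omega |v_3^* - \nabla\lambda_2|_2\,dx, & \text{if } \lambda_2 = 0 \text{ on } \partial\Omega, \\ +\infty, & \text{otherwise,} \end{cases}
$$

and

$$
G_4^*(\nabla\lambda_2 - Curl^*\lambda_1) = \frac{1}{2}\|\nabla\lambda_2 - Curl^*\lambda_1\|_{2,\mathbb{R}^3}^2.
$$

Here, $A^* = A_1 \cap A_2$, where

$$
A_1 = \{v^* \in W^{1,2}(\Omega;\mathbb{R}^{3\times 3})\times Y\times Y \mid \operatorname{div}(v_{1i}^*) - v_{2i}^* - v_{3i}^* + H_i = 0, \text{ in } \Omega\},
$$

and defining $Y_1 = W^{1,2}(\Omega;\mathbb{R}^{3\times 3})\times Y\times Y$ and $Y_2 = Y\times Y\times Y$ we denote

$$
A_2 = \{(v^*,z^*) \in Y_1\times Y_2 \mid v_{1i}^*\cdot n = 0 \text{ and } z_i^*\cdot n = 0 \text{ on } \partial\Omega\},
$$

and

$$
Y^* = \{(\lambda_1,\lambda_2) \in W^{1,2}(\mathbb{R}^3;\mathbb{R}^3)\times W^{1,2}(\mathbb{R}^3) \mid \lambda_2 = 0 \text{ on } \partial\Omega\}.
$$

Finally, if K > 0 and K1 > 0 are such that

$$
J_1(z^*,v^*) = F^*(\operatorname{div}(z^*)) - G_1^*(v_1^* + z^*)
$$

is coercive in z∗, for each $v_1^* \in W^{1,2}(\Omega;\mathbb{R}^3)$, and there exists $(m_0, v^*, z^*) \in V\times Y_1\times Y_2$ such that

$$
\delta\big\{\langle m_0, -\operatorname{div}(v_1^*) + v_2^* + v_3^* - H\rangle_Y + F^*(\operatorname{div}(z^*)) - G_1^*(v_1^* + z^*) - G_2^*(v_2^*) - G_3^*(v_3^* - \nabla\lambda_2) - G_4^*(\nabla\lambda_2 - Curl^*\lambda_1)\big\} = \theta, \tag{16.51}
$$

then, for

$$
f_0 = \nabla\lambda_2 - Curl^*\lambda_1,
$$

we have

$$
\begin{aligned}
& \inf_{(m,f)\in V\times Y_2}\big\{G_1(\nabla m) + G_2^{**}(m) + G_3(m) + G_4(f) + Ind_1(m,f) + Ind_2(f) - \langle m_i, H_i\rangle_Y - F(m)\big\} \\
&\quad = G_1(\nabla m_0) + G_2^{**}(m_0) + G_3(m_0) + G_4(f_0) + Ind_1(m_0,f_0) + Ind_2(f_0) - \langle m_{0i}, H_i\rangle_Y - F(m_0) \\
&\quad = F^*(\operatorname{div}(z^*)) - G_1^*(v_1^* + z^*) - G_2^*(v_2^*) - G_3^*(v_3^* - \nabla\lambda_2) - G_4^*(\nabla\lambda_2 - Curl^*\lambda_1) \\
&\quad = \sup_{(v^*,\lambda)\in A^*\times Y^*}\Big\{\inf_{z^*\in Y^*}\big\{F^*(\operatorname{div}(z^*)) - G_1^*(v_1^* + z^*) - G_2^*(v_2^*) - G_3^*(v_3^* - \nabla\lambda_2) - G_4^*(\nabla\lambda_2 - Curl^*\lambda_1)\big\}\Big\}.
\end{aligned}
$$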

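Proof. Similarly to the last theorem, we may prove

$$
\inf_{(m,f)\in V\times Y} J(m,f) \ge \sup_{(v^*,\lambda)\in A^*\times Y^*}\Big\{\inf_{z^*\in Y^*}\big\{F^*(\operatorname{div}(z^*)) - G_1^*(v_1^* + z^*) - G_2^*(v_2^*) - G_3^*(v_3^* - \nabla\lambda_2) - G_4^*(\nabla\lambda_2 - Curl^*\lambda_1)\big\}\Big\}, \tag{16.52}
$$

and

$$
\begin{aligned}
& \inf_{(m,f)\in V\times Y_2}\big\{G_1(\nabla m) + G_2^{**}(m) + G_3(m) + G_4(f) + Ind_1(m,f) + Ind_2(f) - \langle m_i, H_i\rangle_Y - F(m)\big\} \\
&\quad \ge \sup_{(v^*,\lambda)\in A^*\times Y^*}\Big\{\inf_{z^*\in Y^*}\big\{F^*(\operatorname{div}(z^*)) - G_1^*(v_1^* + z^*) - G_2^*(v_2^*) - G_3^*(v_3^* - \nabla\lambda_2) - G_4^*(\nabla\lambda_2 - Curl^*\lambda_1)\big\}\Big\}.
\end{aligned} \tag{16.53}
$$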


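By hypothesis, $(m_0, v^*, z^*) \in V\times Y_1\times Y_2$ is a solution of the equation

$$
\delta\big\{\langle m_0, -\operatorname{div}(v_1^*) + v_2^* + v_3^* - H\rangle_Y + F^*(\operatorname{div}(z^*)) - G_1^*(v_1^* + z^*) - G_2^*(v_2^*) - G_3^*(v_3^* - \nabla\lambda_2) - G_4^*(\nabla\lambda_2 - Curl^*\lambda_1)\big\} = \theta. \tag{16.54}
$$

Thus, from the variation in z∗ we obtain

$$
-\nabla(\operatorname{div}(z_i^*))/(K + K_1) = (v_{1i}^* + z_i^*)/\alpha, \quad \text{in } \Omega, \tag{16.55}
$$

and from the variation in $v_1^*$,

$$
v_1^* + z^* = \alpha\nabla(m_0). \tag{16.56}
$$

From these two equations together with the boundary conditions for $v_1^*$ and z∗ we obtain

$$
-\operatorname{div}(z^*) = (K + K_1)m_0, \tag{16.57}
$$

so that

$$
F^*(\operatorname{div}(z^*)) = \langle m_0, -\operatorname{div}(z^*)\rangle_Y - F(m_0), \tag{16.58}
$$

and

$$
G_1^*(v_1^* + z^*) = \langle \nabla m_0, v_1^* + z^*\rangle_Y - G_1(\nabla m_0). \tag{16.59}
$$

The variation in $v_2^*$ gives us

$$
v_2^* = \frac{\partial G_2^{**}(m_0)}{\partial m}, \tag{16.60}
$$

that is,

$$
G_2^*(v_2^*) = \langle m_0, v_2^*\rangle_Y - G_2^{**}(m_0). \tag{16.61}
$$

From the variation in $v_3^*$ we have

$$
\frac{v_3^* - \nabla\lambda_2}{|v_3^* - \nabla\lambda_2|_2} = m_0, \quad \text{in } \Omega, \tag{16.62}
$$

that is,

$$
v_3^* = |v_3^* - \nabla\lambda_2|_2\, m_0 + \nabla\lambda_2, \quad \text{in } \Omega. \tag{16.63}
$$

Now denoting

$$
\lambda_0 = |v_3^* - \nabla\lambda_2|_2 - K_1, \quad \text{in } \Omega, \tag{16.64}
$$

we obtain

$$
v_3^* = \nabla\lambda_2 + (\lambda_0 + K_1)m_0, \quad \text{in } \Omega, \tag{16.65}
$$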

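so that

$$
G_3^*(v_3^* - \nabla\lambda_2) = \langle m_0, v_3^* - \nabla\lambda_2\rangle_Y - \frac{K_1}{2}\langle m_0, m_0\rangle_Y - \frac{1}{2}\int_\Omega \lambda_0\Big[\Big(\sum_{k=1}^{3} m_{0k}^2\Big) - 1\Big]dx, \tag{16.66}
$$

and thus

$$
G_3^*(v_3^* - \nabla\lambda_2) = \langle m_0, v_3^* - \nabla\lambda_2\rangle_Y - G_3(m_0). \tag{16.67}
$$

The variations in $\lambda_2$ and $\lambda_1$ give us

$$
-\operatorname{div}\Big(\frac{v_3^* - \nabla\lambda_2}{|v_3^* - \nabla\lambda_2|_2}\chi_\Omega\Big) + \operatorname{div}(\nabla\lambda_2 - Curl^*\lambda_1) = 0, \quad \text{in } \mathbb{R}^3, \tag{16.68}
$$

and

$$
Curl(\nabla\lambda_2 - Curl^*\lambda_1) = \theta, \quad \text{in } \mathbb{R}^3. \tag{16.69}
$$

Hence from (16.68) and (16.62), defining $f_0 = \nabla\lambda_2 - Curl^*\lambda_1$, we have

$$
\operatorname{div}(m_0\chi_\Omega - f_0) = 0, \quad \text{in } \mathbb{R}^3, \tag{16.70}
$$

and

$$
Curl(f_0) = \theta, \quad \text{in } \mathbb{R}^3. \tag{16.71}
$$

Also,

$$
G_4^*(\nabla\lambda_2 - Curl^*\lambda_1) = \langle f_0, \nabla\lambda_2 - Curl^*\lambda_1\rangle_Y - G_4(f_0). \tag{16.72}
$$

Therefore, from (16.58), (16.59), (16.61), (16.67), (16.70) and (16.71) we obtain

$$
\begin{aligned}
& F^*(\operatorname{div}(z^*)) - G_1^*(v_1^* + z^*) - G_2^*(v_2^*) - G_3^*(v_3^* - \nabla\lambda_2) - G_4^*(\nabla\lambda_2 - Curl^*\lambda_1) \\
&\quad = G_1(\nabla m_0) + G_2^{**}(m_0) + G_3(m_0) + G_4(f_0) + Ind_1(m_0,f_0) + Ind_2(f_0) - \langle m_{0i}, H_i\rangle_Y - F(m_0).
\end{aligned} \tag{16.73}
$$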

Remark 16.8.3. The variation in m in (16.54) gives us

$$
-\operatorname{div}(v_1^*) + v_2^* + v_3^* - H = 0, \quad \text{in } \Omega, \tag{16.74}
$$

which, from (16.56), (16.57), (16.60) and (16.65), implies that

$$
-\alpha\operatorname{div}(\nabla m_0) - (K + K_1)m_0 + \frac{\partial G_2^{**}(m_0)}{\partial m} + (\lambda_0 + K_1)m_0 - H + \nabla\lambda_2 = \theta, \quad \text{in } \Omega. \tag{16.75}
$$

Equations (16.75), (16.70) and (16.71) are the Euler-Lagrange equations for the relaxed problem. However, the derivatives

$$
m_0 = \frac{\partial G_2^*(v_2^*)}{\partial v_2^*} \quad \text{and} \quad v_2^* = \frac{\partial G_2^{**}(m_0)}{\partial m}
$$

must be understood after a regularization, considering that formally $G_2^*$ is not Gateaux differentiable. Finally, we conjecture that if K > 0 and K1 > 0 are large enough (but still satisfying the hypotheses of the last theorem), so that for a minimizing sequence $(m_n, f_n)$ we have $\lambda_0 + K_1 > 0$ and $G_2^{**}(m_n) = G_2(m_n)$ for all $n \in \mathbb{N}$ sufficiently large, then the duality gap between the original primal problem and the dual one is zero. A formal proof of such a conjecture is planned for future work.

16.9. Conclusion

In this chapter we develop duality principles for models in ferromagnetism found, for example, in reference [26]. Almost all dual variational formulations presented here are convex (in fact concave), for both the uniaxial and the cubic cases. The results are obtained through standard tools of convex analysis. It is important to emphasize that in some situations (especially the hard cases), the minima may not be attained through the primal approaches, so that the minimizers of the dual formulations reflect the average behavior of minimizing sequences for the primal problems, as weak cluster points of such sequences.

CHAPTER 17

Duality Applied to Fluid Mechanics


In this article our first result is a dual variational formulation for the incompressible two-dimensional Navier-Stokes system. We establish as a primal formulation the sum of the squared L2 norms of each of the equations, and then obtain the dual formulation through the Legendre Transform concept. We now present the primal formulation.

Consider S ⊂ R2, an open, bounded and connected set whose internal boundary is denoted by Γ0 and whose external boundary is denoted by Γ1. Denote by u : S → R the velocity field in the direction x of the Cartesian system (x, y), by v : S → R the velocity field in the direction y, and by p : S → R the pressure field, so that P = p/ρ, where ρ is the constant fluid density. Also, ν is the viscosity coefficient and g is the gravity constant. The Navier-Stokes PDE system is expressed by

$$
\begin{cases}
\nu\nabla^2 u - u\,\partial_x u - v\,\partial_y u - \partial_x P + g_x = 0, & \text{in } S, \\
\nu\nabla^2 v - u\,\partial_x v - v\,\partial_y v - \partial_y P + g_y = 0, & \text{in } S, \\
\partial_x u + \partial_y v = 0, & \text{in } S,
\end{cases} \tag{17.1}
$$

$$
\begin{cases}
u = v = 0, & \text{on } \Gamma_0, \\
u = u_\infty,\ v = 0,\ P = P_\infty, & \text{on } \Gamma_1.
\end{cases} \tag{17.2}
$$

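The primal variational formulation, denoted by J : U → R, is expressed as

$$
J(\mathbf{u}) = \frac{1}{2}\Big(\|L_1(\mathbf{u})\|_{L^2(S)}^2 + \|L_2(\mathbf{u})\|_{L^2(S)}^2 + \|L_3(\mathbf{u})\|_{L^2(S)}^2\Big), \tag{17.3}
$$

where $\mathbf{u} = (u, v, P) \in U$, and

$$
U = \{\mathbf{u} \in H^2(S)\times H^2(S)\times H^1(S) \mid u = v = 0 \text{ on } \Gamma_0, \text{ and } u = u_\infty,\ v = 0,\ P = P_\infty \text{ on } \Gamma_1\}. \tag{17.4}
$$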


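Also

$$
L_1(\mathbf{u}) = \nu\nabla^2 u - u\,\partial_x u - v\,\partial_y u - \partial_x P + g_x, \tag{17.5}
$$

$$
L_2(\mathbf{u}) = \nu\nabla^2 v - u\,\partial_x v - v\,\partial_y v - \partial_y P + g_y, \tag{17.6}
$$

and

$$
L_3(\mathbf{u}) = \partial_x u + \partial_y v. \tag{17.7}
$$

Clearly we can write

$$
J(\mathbf{u}) = \int_S g(\Lambda\mathbf{u})\,dS, \tag{17.8}
$$

where $\Lambda\mathbf{u} = \{\Lambda_i\mathbf{u}\}$, for $i \in \{1,\dots,14\}$, or more explicitly

$$
\begin{aligned}
& \Lambda_1\mathbf{u} = \nu\nabla^2 u, \quad \Lambda_2\mathbf{u} = u, \quad \Lambda_3\mathbf{u} = -\partial_x u, \\
& \Lambda_4\mathbf{u} = v, \quad \Lambda_5\mathbf{u} = -\partial_y u, \quad \Lambda_6\mathbf{u} = -\partial_x P, \\
& \Lambda_7\mathbf{u} = \nu\nabla^2 v, \quad \Lambda_8\mathbf{u} = u, \quad \Lambda_9\mathbf{u} = -\partial_x v, \\
& \Lambda_{10}\mathbf{u} = v, \quad \Lambda_{11}\mathbf{u} = -\partial_y v, \quad \Lambda_{12}\mathbf{u} = -\partial_y P, \\
& \Lambda_{13}\mathbf{u} = \partial_x u, \quad \Lambda_{14}\mathbf{u} = \partial_y v.
\end{aligned} \tag{17.9}
$$

Here

$$
g(y) = g_1(y) + g_2(y) + g_3(y), \tag{17.10}
$$

where

$$
g_1(y) = \frac{1}{2}(y_1 + y_2 y_3 + y_4 y_5 + y_6 + g_x)^2, \tag{17.11}
$$

$$
g_2(y) = \frac{1}{2}(y_7 + y_8 y_9 + y_{10} y_{11} + y_{12} + g_y)^2, \tag{17.12}
$$

$$
g_3(y) = \frac{1}{2}(y_{13} + y_{14})^2. \tag{17.13}
$$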

Remark 17.1.1. In the space U, derivatives must always be understood in the distributional sense, whereas boundary conditions are understood in the sense of traces. Finally, it is worth emphasizing that the results of this article are valid for the special cases in which the external forces are represented by a gradient, with the exception of the generalized method of lines, which may be adapted for more general situations.



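We will apply Definition 8.1.25 to g(y) = g1(y) + g2(y) + g3(y). First, for g1(y) we have

$$
g_{1L}^*(y^*) = \sum_{i=1}^{6} y_i y_i^* - g_1(y), \tag{17.14}
$$

where $y_i$ is such that

$$
y_i^* = \frac{\partial g_1(y)}{\partial y_i}, \quad \forall i \in \{1,\dots,6\},
$$

that is,

$$
\begin{aligned}
y_1^* &= y_6^* = (y_1 + y_2 y_3 + y_4 y_5 + y_6 + g_x), \\
y_2^* &= (y_1 + y_2 y_3 + y_4 y_5 + y_6 + g_x)\,y_3 = y_1^* y_3, \\
y_3^* &= (y_1 + y_2 y_3 + y_4 y_5 + y_6 + g_x)\,y_2 = y_1^* y_2, \\
y_4^* &= (y_1 + y_2 y_3 + y_4 y_5 + y_6 + g_x)\,y_5 = y_1^* y_5, \\
y_5^* &= (y_1 + y_2 y_3 + y_4 y_5 + y_6 + g_x)\,y_4 = y_1^* y_4.
\end{aligned}
$$

Inverting such relations we obtain

$$
\begin{aligned}
y_2 &= y_3^*/y_1^*, \qquad y_3 = y_2^*/y_1^*, \\
y_4 &= y_5^*/y_1^*, \qquad y_5 = y_4^*/y_1^*, \\
y_1 + y_6 &= y_1^* - y_2^* y_3^*/(y_1^*)^2 - y_4^* y_5^*/(y_1^*)^2 - g_x.
\end{aligned}
$$

Replacing such expressions in (17.14) we obtain

$$
g_{1L}^*(y^*) = \frac{y_2^* y_3^*}{y_1^*} + \frac{y_4^* y_5^*}{y_1^*} + \frac{(y_1^*)^2}{2} - g_x y_1^*. \tag{17.15}
$$

Similarly, we may obtain

$$
g_{2L}^*(y^*) = \frac{y_8^* y_9^*}{y_7^*} + \frac{y_{10}^* y_{11}^*}{y_7^*} + \frac{(y_7^*)^2}{2} - g_y y_7^*, \tag{17.16}
$$

and

$$
g_{3L}^*(y^*) = \frac{1}{2}(y_{13}^*)^2, \tag{17.17}
$$

so that $g_L^*(y^*) = g_{1L}^*(y^*) + g_{2L}^*(y^*) + g_{3L}^*(y^*)$.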
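The closed form (17.15) can be checked against the definition (17.14) at a sample point. The following is a small sketch of that check (the function names and the sample values of y and g_x are illustrative assumptions; it requires the sum y1 + y2 y3 + y4 y5 + y6 + g_x to be non-zero, since (17.15) divides by y1*):

```python
def g1(y, gx):
    """g1(y) = (1/2) * (y1 + y2*y3 + y4*y5 + y6 + gx)**2."""
    y1, y2, y3, y4, y5, y6 = y
    s = y1 + y2 * y3 + y4 * y5 + y6 + gx
    return 0.5 * s * s


def dual_variables(y, gx):
    """Gradient of g1: the dual variables y_i* = d g1 / d y_i."""
    y1, y2, y3, y4, y5, y6 = y
    s = y1 + y2 * y3 + y4 * y5 + y6 + gx
    return (s, s * y3, s * y2, s * y5, s * y4, s)


def legendre_from_definition(y, gx):
    """Right-hand side of (17.14): sum_i y_i * y_i* - g1(y)."""
    ys = dual_variables(y, gx)
    return sum(a * b for a, b in zip(y, ys)) - g1(y, gx)


def legendre_closed_form(ys, gx):
    """Right-hand side of (17.15), written in the dual variables only."""
    y1s, y2s, y3s, y4s, y5s, _ = ys
    return y2s * y3s / y1s + y4s * y5s / y1s + 0.5 * y1s * y1s - gx * y1s
```

For instance, with y = (1, 2, 3, 4, 5, 6) and g_x = 0.5 both expressions evaluate to 1415.375.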


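Remark 17.2.1. Observe that solving the system

$$
\delta\big(-G_L^*(v^*) + \langle \mathbf{u}, \Lambda^* v^*\rangle_U\big) = \theta \tag{17.18}
$$

implies solving the Navier-Stokes system. Here, Λ∗v∗ = θ denotes:

$$
\nu\nabla^2 v_1^* + v_2^* + v_8^* + \partial_x v_3^* + \partial_y v_5^* - \partial_x v_{13}^* = 0, \quad \text{in } S, \tag{17.19}
$$

$$
\nu\nabla^2 v_7^* + v_4^* + v_{10}^* + \partial_x v_9^* + \partial_y v_{11}^* - \partial_y v_{13}^* = 0, \quad \text{in } S, \tag{17.20}
$$

and

$$
\partial_x v_1^* + \partial_y v_7^* = 0, \quad \text{in } S. \tag{17.21}
$$

Also,

$$
G_L^*(v^*) = \int_S g_{1L}^*(v^*)\,dS + \int_S g_{2L}^*(v^*)\,dS + \int_S g_{3L}^*(v^*)\,dS, \tag{17.22}
$$

or, more explicitly:

$$
\begin{aligned}
G_L^*(v^*) = {} & \int_S \frac{v_2^* v_3^*}{v_1^*}\,dS + \int_S \frac{v_4^* v_5^*}{v_1^*}\,dS + \frac{1}{2}\int_S (v_1^*)^2\,dS - \int_S g_x v_1^*\,dS \\
& + \int_S \frac{v_8^* v_9^*}{v_7^*}\,dS + \int_S \frac{v_{10}^* v_{11}^*}{v_7^*}\,dS + \frac{1}{2}\int_S (v_7^*)^2\,dS - \int_S g_y v_7^*\,dS.
\end{aligned} \tag{17.23}
$$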

17.3. Linear Systems Whose Solutions Solve the Navier-Stokes One

Through the next result we obtain a linear system whose solution also solves the Navier-Stokes one for a special class of boundary conditions. Specifically, for such a result we consider a domain S with a boundary denoted by Γ.

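Theorem 17.3.1. A solution of the Navier-Stokes system indicated earlier, that is,

$$
\begin{cases}
\nu\nabla^2 u - u\,\partial_x u - v\,\partial_y u - \partial_x P + g_x = 0, & \text{in } S, \\
\nu\nabla^2 v - u\,\partial_x v - v\,\partial_y v - \partial_y P + g_y = 0, & \text{in } S, \\
\partial_x u + \partial_y v = 0, & \text{in } S,
\end{cases} \tag{17.24}
$$

$$
\vec{u}\cdot n = 0, \quad \text{on } \Gamma, \tag{17.25}
$$

where $\vec{u} = (u, v)$, is given by

$$
\begin{cases}
u = \partial_x w_0, \\
v = \partial_y w_0,
\end{cases} \tag{17.26}
$$

where $w_0$ is a solution of the equation

$$
\begin{cases}
\nabla^2 w_0 = 0, & \text{in } S, \\
\nabla w_0\cdot n = 0, & \text{on } \Gamma.
\end{cases} \tag{17.27}
$$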

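Proof. We will suppose that the solution may be expressed by

$$
u = \frac{v_2^*}{v_1^*}, \qquad v = \frac{v_3^*}{v_1^*}.
$$

It is easy to verify that $v_2^* = \partial_x v_1^*$ and $v_3^* = \partial_y v_1^*$ are such that

$$
\partial_y L_1(u) = \partial_x L_2(u),
$$

where

$$
L_1(u) = \nu\nabla^2 u - u\,\partial_x u - v\,\partial_y u
$$

and

$$
L_2(u) = \nu\nabla^2 v - u\,\partial_x v - v\,\partial_y v.
$$

Therefore, the variable $v_1^*$ may be used to solve the continuity equation, so that

$$
\partial_x u + \partial_y v = 0
$$

is equivalent to

$$
\partial_x\Big(\frac{\partial_x v_1^*}{v_1^*}\Big) + \partial_y\Big(\frac{\partial_y v_1^*}{v_1^*}\Big) = 0,
$$

that is, $\nabla^2(\ln(v_1^*)) = 0$. Defining $w_0 = \ln(v_1^*)$, the equation of continuity stands for

$$
\nabla^2 w_0 = 0, \quad \text{in } S,
$$

which, with the concerned boundary conditions

$$
\nabla w_0\cdot n = 0, \quad \text{on } \Gamma, \tag{17.28}
$$

gives a solution of the system in question.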


Remark 17.3.2. To prove such a result we could simply have replaced (17.26) into (17.24) to obtain

$$
\partial_x F - \partial_x P + g_x = 0, \quad \text{in } S, \tag{17.29}
$$

and

$$
\partial_y F - \partial_y P + g_y = 0, \quad \text{in } S, \tag{17.30}
$$

where

$$
F = \nu\nabla^2 w_0 - (\partial_x w_0)^2/2 - (\partial_y w_0)^2/2.
$$

Since $w_0$ is known as a solution of

$$
\nabla^2 w_0 = 0, \quad \text{in } S,
$$

with corresponding boundary conditions, it is clear that equations (17.29) and (17.30) have a solution in P with the proper boundary conditions described above.
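The reduction in this remark can be illustrated on a concrete harmonic potential. The sketch below assumes the particular choice w0 = x^3 - 3 x y^2 (an illustrative example, not taken from the text, and with no boundary conditions imposed) and checks pointwise that the viscous and convective terms of the momentum equations coincide with the gradient of F, and that the continuity equation holds. The derivatives of u and v are entered analytically, while the gradient of F is approximated by central differences.

```python
NU = 0.08  # viscosity coefficient; the value quoted later in the chapter


def velocity(x, y):
    """u = dw0/dx and v = dw0/dy for w0 = x**3 - 3*x*y**2."""
    return 3.0 * x * x - 3.0 * y * y, -6.0 * x * y


def divergence(x, y):
    """Continuity residual du/dx + dv/dy = 6x - 6x."""
    return 6.0 * x + (-6.0 * x)


def momentum_operators(x, y):
    """nu*lap(u) - u*u_x - v*u_y and nu*lap(v) - u*v_x - v*v_y."""
    u, v = velocity(x, y)
    ux, uy = 6.0 * x, -6.0 * y
    vx, vy = -6.0 * y, -6.0 * x
    lap_u = 6.0 - 6.0  # u_xx + u_yy
    lap_v = 0.0        # v_xx + v_yy
    return NU * lap_u - u * ux - v * uy, NU * lap_v - u * vx - v * vy


def F(x, y):
    """F = nu*lap(w0) - (dw0/dx)**2/2 - (dw0/dy)**2/2, with lap(w0) = 0."""
    u, v = velocity(x, y)
    return -0.5 * u * u - 0.5 * v * v


def grad_F(x, y, h=1e-5):
    """Central-difference approximation of the gradient of F."""
    return ((F(x + h, y) - F(x - h, y)) / (2.0 * h),
            (F(x, y + h) - F(x, y - h)) / (2.0 * h))
```

Once these identities hold, (17.29) and (17.30) say that P differs from F by the potential of the external force, up to an additive constant.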

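In the final result of this section we work with the full system of boundary conditions.

Theorem 17.3.3. A solution of the Navier-Stokes system

$$
\begin{cases}
\nu\nabla^2 u - u\,\partial_x u - v\,\partial_y u - \partial_x P + g_x = 0, & \text{in } S, \\
\nu\nabla^2 v - u\,\partial_x v - v\,\partial_y v - \partial_y P + g_y = 0, & \text{in } S, \\
\partial_x u + \partial_y v = 0, & \text{in } S,
\end{cases} \tag{17.31}
$$

$$
\begin{cases}
u = v = 0, & \text{on } \Gamma_0, \\
u = u_\infty,\ v = 0,\ P = P_\infty, & \text{on } \Gamma_1,
\end{cases} \tag{17.32}
$$

is given by

$$
\begin{cases}
u = \partial_x w_0 + \partial_x w_1, \\
v = \partial_y w_0 - \partial_y w_1,
\end{cases} \tag{17.33}
$$

if $(w_0, w_1)$ is a solution of the system

$$
\begin{cases}
\nabla^2 w_0 + \partial_{xx} w_1 - \partial_{yy} w_1 = 0, & \text{in } S, \\
\partial_{xy} w_1 = 0, & \text{in } S,
\end{cases} \tag{17.34}
$$

$$
\begin{cases}
u = \partial_x w_0 + \partial_x w_1 = 0, & \text{on } \Gamma_0, \\
v = \partial_y w_0 - \partial_y w_1 = 0, & \text{on } \Gamma_0, \\
u = \partial_x w_0 + \partial_x w_1 = u_\infty, & \text{on } \Gamma_1, \\
v = \partial_y w_0 - \partial_y w_1 = 0, & \text{on } \Gamma_1.
\end{cases} \tag{17.35}
$$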

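Proof. It is easy to verify that $u = \partial_x w_0 + \partial_x w_1$ and $v = \partial_y w_0 - \partial_y w_1$ such that $\partial_{xy} w_1 = 0$ are also such that

$$
\partial_y L_1(u) = \partial_x L_2(u),
$$

where

$$
L_1(u) = \nu\nabla^2 u - u\,\partial_x u - v\,\partial_y u
$$

and

$$
L_2(u) = \nu\nabla^2 v - u\,\partial_x v - v\,\partial_y v.
$$

On the other hand, for u and v defined above, the equation of continuity is given by

$$
\nabla^2 w_0 + \partial_{xx} w_1 - \partial_{yy} w_1 = 0, \quad \text{in } S.
$$

Therefore the solution of this last equation simultaneously with $\partial_{xy} w_1 = 0$ in S, and the boundary conditions

$$
u = \partial_x w_0 + \partial_x w_1 = 0, \quad v = \partial_y w_0 - \partial_y w_1 = 0, \quad \text{on } \Gamma_0,
$$

and

$$
u = \partial_x w_0 + \partial_x w_1 = u_\infty, \quad v = \partial_y w_0 - \partial_y w_1 = 0, \quad \text{on } \Gamma_1,
$$

gives us a solution of the Navier-Stokes system.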

17.4. The Method of Lines for the Navier-Stokes System

As in chapter 13 (where details may be found), we develop the solution for the Navier-Stokes system through the generalized method of lines. We write the solution for an approximation of the system in question.

In fact we add the term

$$
\varepsilon\,\partial_{tt} P
$$

to the equation of continuity after the transformation of coordinates. The results are related to the special case Γ1 = 2Γ0; however, it is possible to obtain them for more general situations, especially for Γ1 = KΓ0, where K > 1 is a large number, so that Γ1 simulates conditions at infinity.

The boundary conditions are

$$
u = v = 0 \ \text{ and } \ P = P_0(x), \quad \text{on } \Gamma_0,
$$

and

$$
u = u_f(x), \ v = v_f(x) \ \text{ and } \ P = P_f(x), \quad \text{on } \Gamma_1.
$$

The optimal $P_0(x)$ is the one which minimizes

$$
\int_S (\partial_x u + \partial_y v)^2\,dS,
$$

in order to provide the best approximation for the continuity equation. Finally, we consider g = 0, ν = 0.08, ε = 0.082 and ρ = 1. Units refer to the International System. Here the functions $f_1(x)$ up to $f_{17}(x)$ are related to the boundary shape and are determined through the transformations of coordinates, similarly as obtained in chapter 13.

We start by presenting the lines (for N = 10) for the velocity field u (here x stands for θ):

Line 1

$$
\begin{aligned}
u[1] = {} & 0.1\,u_f(x) + 0.562\,f_{16}(x)P_0(x) - 0.563\,f_{16}(x)P_f(x) + 0.033\,f_{11}(x)u_f(x) \\
& - 0.206\,f_{14}(x)u_f(x)^2 - 0.206\,f_{16}(x)u_f(x)v_f(x) - 0.281\,f_{17}(x)P_0(x) \\
& - 0.140\,f_{17}(x)P_f'(x) + 0.045\,f_{12}(x)u_f'(x) - 0.0656\,f_{15}(x)u_f(x)u_f'(x) \\
& - 0.103\,f_{17}(x)v_f(x)u_f'(x) + 0.008\,f_{13}(x)u_f''(x),
\end{aligned}
$$

Line 2



− 0.270f17(x)P ′f (x) + 0.080f12(x)u′

f (x) − 0.130f15(x)uf (x)u′f (x)

− 0.205f17(x)vf (x)u′f (x) + 0.015f13(x)u′′

f (x)

Line 3



− 0.378f17(x)P ′f (x) + 0.105f12(x)u′

f (x) − 0.190f15(x)uf (x)u′f (x)

− 0.301f17(x)vf (x)u′f (x) + 0.020f13(x)u′′

f (x)


Line 4



− 457f17(x)P ′f (x) + 0.120f12(x)u′

f (x) − 0.242f15(x)uf (x)u′f (x)

− 0.387f17(x)vf(x)u′f (x) + 0.024f13(x)u′′

f (x)

Line 5



− 0.501f17(x)P ′f (x) + 0.125f12(x)u′

f (x) − 0.279f15(x)uf (x)u′f (x)

− 0.452f17(x)vf (x)u′f (x) + 0.026f13(x)u′′

f (x)

Line 6

u[6] =0.6uf (x) + 1.499f16(x)P0(x) − 1.501f16(x)Pf [x] + 0.080f11(x)uf (x)


− 0.503f17(x)P ′f (x) + 0.120f12(x)u′

f (x) − 0.296f15(x)uf (x)u′f (x)

− 0.487f17(x)vf (x)u′f (x) + 0.026f13(x)u′′

f (x)

Line 7



− 0.458f17(x)P ′f (x) + 0.105f12(x)u′

f (x) − 0.284f15(x)uf (x)u′f (x)

− 0.476f17(x)vf (x)u′f (x) + 0.023f13(x)u′′

f (x)

Line 8



− 0.362f17(x)P ′f (x) + 0.080f12(x)u′

f (x) − 0.237f15(x)uf (x)u′f (x)

− 0.404f17(x)vf (x)u′f (x) + 0.018f13(x)u′′

f (x)


Line 9



− 0.211f17(x)P ′f (x) + 0.045f12(x)u′

f (x) − 0.145f15(x)uf (x)u′f (x)

− 0.252f17(x)vf (x)u′f (x) + 0.010f13(x)u′′

f (x)

The lines for the velocity field v are presented below:

Line 1

$$
\begin{aligned}
v[1] = {} & 0.1\,v_f(x) + 0.562\,f_{16}(x)P_0(x) - 0.563\,f_{16}(x)P_f(x) + 0.033\,f_{11}(x)v_f(x) \\
& - 0.206\,f_{14}(x)v_f(x)^2 - 0.206\,f_{16}(x)u_f(x)v_f(x) - 0.281\,f_{17}(x)P_0(x) \\
& - 0.140\,f_{17}(x)P_f'(x) + 0.045\,f_{12}(x)u_f'(x) - 0.0656\,f_{15}(x)u_f(x)v_f'(x) \\
& - 0.103\,f_{17}(x)v_f(x)v_f'(x) + 0.008\,f_{13}(x)v_f''(x)
\end{aligned}
$$

Line 2



− 0.270f17(x)P ′f (x) + 0.080f12(x)v′f (x) − 0.130f15(x)uf (x)v′f (x)

− 0.205f17(x)vf (x)v′f (x) + 0.015f13(x)v′′f (x)

Line 3




− 0.301f17(x)vf (x)v′f (x) + 0.020f13(x)v′′f (x)

Line 4




− 0.387f17(x)vf (x)v′f (x) + 0.024f13(x)v′′f (x)


Line 5



− 0.501f17(x)P ′f (x) + 0.125f12(x)u′

f (x) − 0.279f15(x)uf (x)v′f (x)

− 0.452f17(x)vf (x)v′f (x) + 0.026f13(x)v′′f (x)

Line 6

v[6] =0.6vf (x) + 1.499f16(x)P0(x) − 1.501f16(x)Pf [x] + 0.080f11(x)vf (x)



− 0.487f17(x)vf (x)v′f (x) + 0.026f13(x)v′′f (x)

Line 7




− 0.476f17(x)vf (x)v′f (x) + 0.023f13(x)v′′f (x)

Line 8




− 0.404f17(x)vf (x)v′f (x) + 0.018f13(x)v′′f (x)

Line 9




− 0.252f17(x)vf (x)v′f (x) + 0.010f13(x)v′′f (x)

Finally, the lines for the pressure field P are presented below:

Line 1

$$
P[1] = 0.1\,P_f(x) + 0.9\,P_0(x) + 7.03\,f_6(x)u_f(x) + 7.03\,f_4(x)v_f(x) + 1.75\,f_7(x)u_f'(x) + 1.75\,f_5(x)v_f'(x)
$$


Line 2


+3.37f7(x)u′f(x) + 3.37f5(x)v

′f (x)

Line 3


+4.72f7(x)u′f(x) + 4.72f5(x)v

′f (x)

Line 4


+5.72f7(x)u′f(x) + 5.72f5(x)v

′f (x)

Line 5


+6.26f7(x)u′f(x) + 6.26f5(x)v

′f (x)

Line 6


+6.28f7(x)u′f(x) + 6.28f5(x)v

′f (x)

Line 7


+5.72f7(x)u′f(x) + 5.72f5(x)v

′f (x)

Line 8


+5.72f7(x)u′f [x] + 5.72f5(x)v

′f (x)

Line 9


+2.63f7(x)u′f(x) + 2.63f5(x)v

′f (x).


17.5. Conclusion

In this chapter we develop a study of the Legendre Transform applied to the two-dimensional incompressible Navier-Stokes system. The final section presents the solution by the generalized method of lines. The extension of the results to R3 and to the compressible and time-dependent cases is planned for future work.

CHAPTER 18

Duality Applied to a Beam Model

18.1. Introduction and Statement of the Primal Formulation

In this article we present an existence result and duality theory concerning the non-linear beam model proposed in Gao, [23].

The boundary value form of Gao's beam model is represented by the equation

$$
EI\,w_{,xxxx} - a(w_{,x})^2 w_{,xx} + \lambda w_{,xx} = f, \quad \text{in } [0, l], \tag{18.1}
$$

subject to the conditions

$$
w(0) = w(l) = w_{,x}(0) = w_{,x}(l) = 0, \tag{18.2}
$$

where w : [0, l] → R denotes the field of vertical displacements. The corresponding primal variational formulation for such a model is expressed by the functional J : U → R, where

$$
J(w) = \int_0^l \frac{1}{2}\Big(EI(w_{,xx})^2 + \frac{a}{6}(w_{,x})^4 - \lambda(w_{,x})^2\Big)dx - \int_0^l f w\,dx. \tag{18.3}
$$

Here E denotes the Young modulus related to a specific material, $I = \frac{bh^3}{12}$ for a beam with rectangular cross section (rectangle basis b and height h), and a is a constant related to the cross section area. Furthermore, l denotes the beam length (in fact the beam is represented by the set [0, l] = {x ∈ R | 0 ≤ x ≤ l}), λ denotes an axial compressive load applied to x = l and, finally, f(x) denotes the distributed vertical load.

Also, we define

$$
U = \{w \in W^{2,2}([0,l]) \cap W^{1,4}([0,l]) \mid w(0) = w(l) = 0 = w_{,x}(0) = w_{,x}(l)\}. \tag{18.4}
$$

Remark 18.1.1. The boundary conditions refer to a clamped beam at x = 0 and x = l.


We consider the following problem.

Problem P: To find $w_0 \in U$ such that

$$
J(w_0) = \inf_{w\in U} J(w). \tag{18.5}
$$

Equation (18.1) stands for the necessary conditions for Problem P.
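The link between (18.3) and (18.1) can be checked numerically: for a test direction φ satisfying the clamped conditions, the directional derivative of J at w should equal the integral over [0, l] of (EI w'''' − a (w')² w'' + λ w'' − f) φ. The sketch below does this with polynomial data; the choices EI = a = λ = l = 1, f = 1 (a constant load), w = x²(1 − x)², φ = x³(1 − x)², and all helper names are illustrative assumptions, and the derivative of J is approximated by a central difference.

```python
def poly_eval(c, x):
    """Evaluate the polynomial sum_k c[k]*x**k by Horner's rule."""
    r = 0.0
    for a in reversed(c):
        r = r * x + a
    return r


def poly_deriv(c):
    """Coefficients of the derivative of the polynomial c."""
    return [k * c[k] for k in range(1, len(c))] or [0.0]


def poly_axpy(c, d, h):
    """Coefficients of c + h*d (lists are zero-padded to equal length)."""
    n = max(len(c), len(d))
    c = c + [0.0] * (n - len(c))
    d = d + [0.0] * (n - len(d))
    return [ci + h * di for ci, di in zip(c, d)]


def simpson(g, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = g(a) + g(b)
    for i in range(1, n):
        s += (4.0 if i % 2 else 2.0) * g(a + i * h)
    return s * h / 3.0


def energy(w, EI, a, lam, f, l):
    """J(w) from (18.3) for a polynomial w and a constant load f."""
    w1 = poly_deriv(w)
    w2 = poly_deriv(w1)

    def integrand(x):
        wx, wxx = poly_eval(w1, x), poly_eval(w2, x)
        return 0.5 * (EI * wxx ** 2 + a / 6.0 * wx ** 4 - lam * wx ** 2) - f * poly_eval(w, x)

    return simpson(integrand, 0.0, l)


def first_variation(w, phi, EI, a, lam, f, l, h=1e-4):
    """Central-difference approximation of dJ(w + s*phi)/ds at s = 0."""
    jp = energy(poly_axpy(w, phi, h), EI, a, lam, f, l)
    jm = energy(poly_axpy(w, phi, -h), EI, a, lam, f, l)
    return (jp - jm) / (2.0 * h)


def residual_pairing(w, phi, EI, a, lam, f, l):
    """Integral of (EI*w'''' - a*(w')**2*w'' + lam*w'' - f)*phi over [0, l]."""
    w1 = poly_deriv(w)
    w2 = poly_deriv(w1)
    w4 = poly_deriv(poly_deriv(w2))

    def integrand(x):
        wx, wxx = poly_eval(w1, x), poly_eval(w2, x)
        res = EI * poly_eval(w4, x) - a * wx ** 2 * wxx + lam * wxx - f
        return res * poly_eval(phi, x)

    return simpson(integrand, 0.0, l)
```

The two quantities agree because the boundary terms produced by integration by parts vanish when φ and its derivative vanish at both ends of the beam.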

Remark 18.1.2. From the Sobolev Imbedding Theorem we have the following result (case A for mp > n):

$$
W^{j+m,p}(\Omega) \to W^{j,q}(\Omega), \tag{18.6}
$$

for p ≤ q < ∞. For the present case we have m = n = 1, p = 2, j = 1 and q = 4, which means

$$
W^{2,2}([0,l]) \subset W^{1,4}([0,l]), \tag{18.7}
$$

so that

$$
W^{2,2}([0,l]) \cap W^{1,4}([0,l]) = W^{2,2}([0,l]). \tag{18.8}
$$

18.2. Existence and Regularity Results for Problem P

In this section we show the existence of a minimizer for the unconstrained problem P. More specifically, we establish the following result:

Theorem 18.2.1. Given b, h, a, l, E, λ ∈ R+ and f ∈ L2([0, l]), there exists at least one $w_0 \in U$ such that

$$
J(w_0) = \inf_{w\in U} J(w), \tag{18.9}
$$

where

$$
U = \{w \in W^{2,2}([0,l]) \mid w(0) = w(l) = 0 = w_{,x}(0) = w_{,x}(l)\} \tag{18.10}
$$

and

$$
J(w) = \int_0^l \frac{1}{2}\Big(EI(w_{,xx})^2 + \frac{a}{6}(w_{,x})^4 - \lambda(w_{,x})^2\Big)dx - \int_0^l f w\,dx, \quad \forall w \in U. \tag{18.11}
$$

Proof. From Poincare Inequality it is clear that J is coercive,that is:

18.2. EXISTENCE AND REGULARITY RESULTS FOR Problem P 389

lim‖w‖U→+∞

J(w) = +∞ (18.12)

where

‖w‖U = ‖w‖W 2,2([0,l]), ∀w ∈ U. (18.13)

Therefore since J is strongly continuous, there exists α ∈ R

such that

α = infw∈U

J(w). (18.14)

Thus, if wnn∈N is a minimizing sequence (in the sense that

limn→+∞

J(wn) = α),

then ‖wn‖W 2,20 ([0,l]) and ‖wn‖W 1,4([0,l]) are bounded sequences in

reflexive Banach spaces (see Remark 18.1.2).Hence, there exists w0 ∈W 2,2

0 ([0, L]) and a subsequence wnj ⊂wn such that

wnj → w0 as j → +∞, weakly in W 2,20 ([0, l]). (18.15)

From the Rellich Kondrachov theorem, up to a subsequence, whichwe also denote by wnj we have

wnj,x → w0,x as j → +∞, strongly in L2([0, l]), (18.16)

Furthermore we have

J(w) = J1(w) − λ

2

∫ l

0

(w,x)2 dx (18.17)

where

J1(w) =

∫ l

0

1

2(EI(w,xx)

2 +a

6(w,x)

4) dx−∫ l

0

fw dx. (18.18)

As $J_1$ is convex and continuous, it is also weakly lower semicontinuous, so that

$$\liminf_{j \to +\infty} J_1(w_{n_j}) \geq J_1(w_0). \tag{18.19}$$

From this and (18.16), as $\{w_{n_j}\}$ is also a minimizing sequence, we can conclude that:

$$\alpha = \inf_{w \in U} J(w) = \liminf_{j \to +\infty} J(w_{n_j}) \geq J(w_0), \tag{18.20}$$


which, since $w_0 \in U$ and therefore $J(w_0) \geq \alpha$, implies

$$J(w_0) = \alpha = \inf_{w \in U} J(w). \tag{18.21}$$
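The passage from (18.16) and (18.19) to (18.20) can be spelled out as follows. This is a sketch of that single step, using only the decomposition (18.17), the strong convergence (18.16) and the lower semicontinuity (18.19).

```latex
% Strong L^2 convergence of the derivatives (18.16) gives convergence of the quadratic term:
\[
\lim_{j \to +\infty} \int_0^l (w_{n_j,x})^2\,dx = \int_0^l (w_{0,x})^2\,dx .
\]
% Since the second term converges, the liminf of the sum splits, and (18.19) applies to J_1:
\begin{align*}
\liminf_{j \to +\infty} J(w_{n_j})
&= \liminf_{j \to +\infty} J_1(w_{n_j}) - \frac{\lambda}{2} \lim_{j \to +\infty} \int_0^l (w_{n_j,x})^2\,dx \\
&\ge J_1(w_0) - \frac{\lambda}{2} \int_0^l (w_{0,x})^2\,dx = J(w_0).
\end{align*}
```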

Remark 18.2.2. We recall that from the Rellich-Kondrachov Theorem, Part III, for $mp > n$ we have the following compact imbedding

$$W^{j+m,p}(\Omega) \hookrightarrow C^j(\overline{\Omega_0}). \tag{18.22}$$

In our case consider $n = m = 1$, $p = 2$ and $j = 1$, that is, as $w_0 \in W^{2,2}((0,l))$ (here $\Omega = \Omega_0 = (0,l)$) we can conclude that $w_0 \in C^1([0,l])$, which means that $w_0$ has a continuous derivative in $[0,l]$ (no corners). In fact, this regularity result refers to the space $W^{2,2}([0,l])$ as a whole, not only to this solution $w_0$. To obtain deeper results concerning regularity, we would need to evaluate the effect of the necessary conditions on the solution $w_0$.

Now we present a duality principle proven in [6].


$J(u) = (G \circ \Lambda)(u) - F(\Lambda_1 u) - \langle u, p \rangle_U$ is bounded from below, where $\Lambda : U \to Y$ and $\Lambda_1 : U \to Y$ are continuous linear operators.

Then we may write:

$$\inf_{z^* \in Y^*} \{ F^*(z^*) - G^*(v^*) \} \geq \inf_{u \in U} J(u)$$


∗ − p = 0

18.3. A Convex Dual Formulation for the Beam Model

Now, we rewrite the primal variational formulation in a standard double-well format, as $J : U \to \mathbb{R}$, where

$$J(w) = \int_0^l \frac{EI}{2}(w_{,xx})^2\,dx + \int_0^l \frac{\alpha}{2}\left(\frac{w_{,x}^2}{2} - \beta\right)^2 dx - \int_0^l f w\,dx, \tag{18.23}$$

where $U = W_0^{2,2}([0,l])$ and $\alpha$, $\beta$ are appropriate positive real constants. We may also write

$$J(w) = G(\Lambda w) - F(\Lambda_1 w) - \langle w, f \rangle_{L^2([0,l])}, \tag{18.24}$$

where

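$$G(\Lambda w) = \int_0^l \frac{EI}{2}(w_{,xx})^2\,dx + \int_0^l \frac{\alpha}{2}\left(\frac{w_{,x}^2}{2} - \beta\right)^2 dx + \frac{K}{2}\int_0^l w_{,x}^2\,dx, \tag{18.25}$$

$$F(\Lambda_1 w) = \frac{K}{2}\int_0^l w_{,x}^2\,dx, \tag{18.26}$$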

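where $\Lambda : U \to Y = L^2([0,l]; \mathbb{R}^3)$, $\Lambda_1 : U \to Y_1 = Y_1^* = L^2([0,l])$ and $\Lambda_2 : U \to Y_2 = L^2([0,l])$ are given by

$$\Lambda w = \{\Lambda_1 w, \Lambda_2 w, w\}, \tag{18.27}$$

and

$$\Lambda_1 w = w_{,x}, \quad \Lambda_2 w = w_{,xx}. \tag{18.28}$$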


$$\inf_{w \in U} J(w) \geq \sup_{v^* \in A^*} \inf_{z^* \in Y_1^*} \{ F^*(L^* z^*) - G^*(v^*, z^*) \},$$

where

$$F^*(L^* z^*) = \frac{1}{2K}\int_0^l (z^*_{,x})^2\,dx, \tag{18.29}$$

and

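$$G^*(v^*, z^*) = G_L^*(v^*, z^*) = \frac{1}{2EI}\int_0^l (v_2^* + z^*)^2\,dx + \frac{1}{2}\int_0^l \frac{(v_1^*)^2}{v_0^* + K}\,dx + \frac{1}{2\alpha}\int_0^l (v_0^*)^2\,dx + \beta\int_0^l v_0^*\,dx, \tag{18.30}$$

if $v_0^* + K > 0$, a.e. in $[0,l]$, and

$$A^* = \{ v^* \in Y^* \mid \Lambda^* v^* - f = 0 \}, \tag{18.31}$$

that is,

$$A^* = \{ v^* \in Y^* \mid v^*_{2,xx} - v^*_{1,x} = f, \text{ a.e. in } [0,l] \}. \tag{18.32}$$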
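The terms of (18.30) that involve $v_0^*$ and $v_1^*$ can be traced back to the double-well term and the $K$ term in (18.25). The following is a formal pointwise sketch and not necessarily the derivation used in [6]: here $y$ stands for $w_{,x}$, and the exchange of supremum and infimum over $y$ and $v_0^*$ is taken for granted rather than justified.

```latex
% Step 1: write the quadratic in s = y^2/2 - \beta as a supremum over an auxiliary variable v_0^*
% (the supremum is attained at v_0^* = \alpha s).
\[
\frac{\alpha}{2}\Big(\frac{y^2}{2}-\beta\Big)^2
= \sup_{v_0^* \in \mathbb{R}} \Big\{ v_0^*\Big(\frac{y^2}{2}-\beta\Big) - \frac{(v_0^*)^2}{2\alpha} \Big\}.
\]
% Step 2: for fixed v_0^* with v_0^* + K > 0, maximize in y the expression
% v_1^* y - (K/2) y^2 - v_0^* (y^2/2 - \beta) + (v_0^*)^2/(2\alpha);
% the maximizer is y = v_1^*/(v_0^* + K).
\[
\sup_{y \in \mathbb{R}} \Big\{ v_1^*\,y - \frac{v_0^*+K}{2}\,y^2 \Big\} + \beta v_0^* + \frac{(v_0^*)^2}{2\alpha}
= \frac{(v_1^*)^2}{2\,(v_0^*+K)} + \frac{(v_0^*)^2}{2\alpha} + \beta v_0^* .
\]
```

The right-hand side reproduces, integrand by integrand, the last three terms of (18.30), and it makes visible why the condition $v_0^* + K > 0$ is needed for the supremum in $y$ to be finite.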


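The final format of the duality principle in question is given by

$$\inf_{w \in U} J(w) \geq \sup_{v^* \in A^* \cap B^*} \inf_{z^* \in Y_0^*} \left\{ \frac{1}{2K}\int_0^l (z^*_{,x})^2\,dx - \frac{1}{2EI}\int_0^l (v_2^* + z^*)^2\,dx - \frac{1}{2}\int_0^l \frac{(v_1^*)^2}{v_0^* + K}\,dx - \frac{1}{2\alpha}\int_0^l (v_0^*)^2\,dx - \beta\int_0^l v_0^*\,dx \right\} \tag{18.33}$$

where

$$B^* = \{ v^* \in Y^* \mid v_0^* + K > 0 \text{ a.e. in } [0,l] \}$$

and

$$Y_0^* = \{ z^* \in Y_1^* \mid z^*(0) = z^*(l) = 0 \}.$$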

Remark 18.3.1. It is important to emphasize that the inequality indicated in (18.33) is an equality if there exists a critical point for the dual formulation such that $v_0^* + K > 0$ a.e. in $[0,l]$ and $K < EI/K_0$, where, as mentioned above, $K_0$ is the constant concerning the Poincaré inequality. In such a case, the dual formulation is convex.

18.4. A Final Result, Another Duality Principle


$$\inf_{w \in U} J(w) = \inf_{z^* \in Y^*} \sup_{v^* \in A^*} \{ F^*(z^*) - G^*(v^*) \}, \tag{18.34}$$

where

$$F^*(z^*) = \frac{1}{2K}\int_0^l (z^*)^2\,dx, \tag{18.35}$$

and

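$$G^*(v^*) = \frac{1}{2EI}\int_0^l (v_2^*)^2\,dx + \frac{1}{2}\int_0^l \frac{(v_1^*)^2}{v_0^* + K}\,dx + \frac{1}{2\alpha}\int_0^l (v_0^*)^2\,dx + \beta\int_0^l v_0^*\,dx, \tag{18.36}$$

if $v_0^* + K > 0$, a.e. in $[0,l]$, and

$$A^* = \{ v^* \in Y^* \mid \Lambda^* v^* - \Lambda_1^* z^* - f = 0 \}, \tag{18.37}$$

or

$$A^* = \{ (v^*, z^*) \in L^2([0,l], \mathbb{R}^4) \mid v^*_{2,xx} - v^*_{1,x} + z^*_{,x} = f, \text{ in } [0,l] \}. \tag{18.38}$$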

Observe that

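$$G^*(v^*) \geq \langle \Lambda w, v^* \rangle_Y - G(\Lambda w), \quad \forall w \in U, \ v^* \in Y^*. \tag{18.39}$$

Thus

$$-F^*(z^*) + G^*(v^*) \geq -F^*(z^*) + \langle \Lambda_1 w, z^* \rangle + \langle w, f \rangle_U - G(\Lambda w). \tag{18.40}$$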
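The step from (18.39) to (18.40) uses the constraint that defines $A^*$ in (18.37). Spelled out, under the assumption that $v^* \in A^*$ and that the pairings are the $L^2$ ones:

```latex
% For v^* in A^* we have \Lambda^* v^* = \Lambda_1^* z^* + f, hence
\[
\langle \Lambda w, v^* \rangle_Y
= \langle w, \Lambda^* v^* \rangle_U
= \langle w, \Lambda_1^* z^* + f \rangle_U
= \langle \Lambda_1 w, z^* \rangle + \langle w, f \rangle_U .
\]
% Substituting this into (18.39) and subtracting F^*(z^*) from both sides gives (18.40).
```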

We can make $z^*$ an independent variable through $A^*$, that is, with $v_2^*(z^*, v_1^*)$ given by

$$v_2^*(z^*, v_1^*) = (v_2^*)'(0)\,x + v_2^*(0) + \int_0^x v_1^*(t)\,dt - \int_0^x z^*(t)\,dt + \int_0^x \int_0^{t_1} f(t)\,dt\,dt_1, \tag{18.41}$$

from (18.40) we may write

$$\sup_{z^* \in L^2([0,l])} \{ -F^*(z^*) + G^*(v_2^*(z^*, v_1^*), v_1^*, v_0^*) \} \geq \sup_{z^* \in L^2([0,l])} \{ -F^*(z^*) + \langle \Lambda_1 w, z^* \rangle + \langle w, f \rangle_U - G(\Lambda w) \}, \tag{18.42}$$

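so that we may infer that

$$\sup_{(v_1^*, v_0^*) \in L^2([0,l]) \times C^*} \inf_{z^* \in L^2([0,l])} \left\{ \frac{1}{2K}\int_0^l (z^*)^2\,dx - \frac{1}{2EI}\int_0^l (v_2^*(z^*, v_1^*))^2\,dx - \frac{1}{2}\int_0^l \frac{(v_1^*)^2}{v_0^* + K}\,dx - \frac{1}{2\alpha}\int_0^l (v_0^*)^2\,dx - \beta\int_0^l v_0^*\,dx \right\} \leq \inf_{w \in U} J(w), \tag{18.43}$$

where

$$C^* = \{ v_0^* \in L^2([0,l]) \mid v_0^* + K > 0 \text{ a.e. in } [0,l] \}.$$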

Observe that the infimum for the dual formulation indicated in (18.43) is attained, for $K < EI/K_0$ (here $K_0$ denotes the constant concerning the Poincaré inequality), through the relation

$$v_2^* = \frac{EI\, z^*_{,x}}{K}, \quad z^*(0) = z^*(l) = 0, \tag{18.44}$$


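so that the final format of our duality principle is given by

$$\inf_{w \in U} J(w) \geq \sup_{(z^*, v_1^*, v_0^*) \in B^* \cap C^*} \left\{ -\frac{EI}{2K^2}\int_0^l (z^*_{,x})^2\,dx + \frac{1}{2K}\int_0^l (z^*)^2\,dx - \frac{1}{2}\int_0^l \frac{(v_1^*)^2}{v_0^* + K}\,dx - \frac{1}{2\alpha}\int_0^l (v_0^*)^2\,dx - \beta\int_0^l v_0^*\,dx \right\}. \tag{18.45}$$

Defining $Y_0^* = W_0^{1,2}([0,l]) \times L^2([0,l], \mathbb{R}^2)$ we have

$$B^* = \left\{ (z^*, v_1^*, v_0^*) \in Y_0^* \;\Big|\; \frac{EI}{K} z^*_{,xxx} - v^*_{1,x} + z^*_{,x} = f, \text{ a.e. in } [0,l] \right\}. \tag{18.46}$$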
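The passage from (18.43) to (18.45) and (18.46) amounts to substituting the relation (18.44). As a check of the coefficients (a sketch, assuming $z^*$ is regular enough for the derivatives below to make sense):

```latex
% Quadratic term in v_2^* with v_2^* = (EI/K) z^*_{,x}:
\[
\frac{1}{2EI}\int_0^l (v_2^*)^2\,dx
= \frac{1}{2EI}\cdot\frac{(EI)^2}{K^2}\int_0^l (z^*_{,x})^2\,dx
= \frac{EI}{2K^2}\int_0^l (z^*_{,x})^2\,dx .
\]
% Constraint (18.38) with the same substitution, since v^*_{2,xx} = (EI/K) z^*_{,xxx}:
\[
\frac{EI}{K}\,z^*_{,xxx} - v^*_{1,x} + z^*_{,x} = f \quad \text{a.e. in } [0,l].
\]
```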

Remark 18.4.1. It is important to emphasize that equality holds in (18.45) only if there exists a critical point for the dual formulation such that $v_0^* + K > 0$ a.e. in $[0,l]$ and $K < EI/K_0$, where, as mentioned above, $K_0$ is the constant concerning the Poincaré inequality. In such a case, the dual formulation is convex.

18.5. Conclusion

In this chapter we present an existence result and dual variational formulations for the non-convex (in variational format) Gao’s beam model. In both duality principles, the Poincaré inequality plays a fundamental role in the establishment of optimality conditions.

Bibliography

[1] R.A. Adams, Sobolev Spaces, Academic Press, New York (1975).

[2] R.A. Adams and J.F. Fournier, Sobolev Spaces, second edition, Elsevier (2003).

[3] J.F. Annett, Superconductivity, Superfluids and Condensates, Oxford Master Series in Condensed Matter Physics, Oxford University Press, Reprint (2010).

[4] G. Bachman and L. Narici, Functional Analysis, Dover Publications, Reprint (2000).

[5] J.M. Ball and R.D. James, Fine mixtures as minimizers of energy, Archive for Rational Mechanics and Analysis, 100, 15-52 (1987).

[6] F. Botelho, Dual Variational Formulations for a Non-linear Model of Plates, Journal of Convex Analysis, 17, No. 1, 131-158 (2010).

[7] F. Botelho, Duality for Ginzburg-Landau Type Equations and the Generalized Method of Lines, submitted.

[8] F. Botelho, Dual Variational Formulations for Models in Micro-magnetism - The Uniaxial and Cubic Cases, submitted.

[9] H. Brezis, Analyse Fonctionnelle, Masson (1987).

[10] I.V. Chenchiah and K. Bhattacharya, The Relaxation of Two-well Energies with Possibly Unequal Moduli, Arch. Rational Mech. Anal. 187 (2008), 409-479.

[11] M. Chipot, Approximation and oscillations, Microstructure and Phase Transition, the IMA volumes in mathematics and applications, 54, 27-38 (1993).

[12] R. Choksi, M.A. Peletier, and J.F. Williams, On the Phase Diagram for Microphase Separation of Diblock Copolymers: an Approach via a Nonlocal Cahn-Hilliard Functional, to appear in SIAM J. Appl. Math. (2009).

[13] P. Ciarlet, Mathematical Elasticity, Vol. I – Three Dimensional Elasticity, North Holland Elsevier (1988).

[14] P. Ciarlet, Mathematical Elasticity, Vol. II – Theory of Plates, North Holland Elsevier (1997).

[15] P. Ciarlet, Mathematical Elasticity, Vol. III – Theory of Shells, North Holland Elsevier (2000).

[16] B. Dacorogna, Direct methods in the Calculus of Variations, Springer-Verlag (1989).

[17] I. Ekeland and R. Temam, Convex Analysis and Variational Problems, North Holland (1976).

[18] L.C. Evans, Partial Differential Equations, Graduate Studies in Mathematics, 19, AMS (1998).

[19] U. Fidalgo and P. Pedregal, A General Lower Bound for the Relaxation of an Optimal Design Problem in Conductivity with a Quadratic Cost Functional, Pre-print (2007).

[20] N.B. Firoozye and R.V. Kohn, Geometric Parameters and the Relaxation for Multiwell Energies, Microstructure and Phase Transition, the IMA volumes in mathematics and applications, 54, 85-110 (1993).

[21] D.Y. Gao and G. Strang, Geometric Nonlinearity: Potential Energy, Complementary Energy and the Gap Function, Quarterly Journal of Applied Mathematics, 47, 487-504 (1989a).

[22] D.Y. Gao, On the extreme variational principles for non-linear elastic plates, Quarterly of Applied Mathematics, XLVIII, No. 2 (June 1990), 361-370.

[23] D.Y. Gao, Finite Deformation Beam Models and Triality Theory in Dynamical Post-Buckling Analysis, International Journal of Non-linear Mechanics 35 (2000), 103-131.

[24] D.Y. Gao, Duality Principles in Nonconvex Systems, Theory, Methods and Applications, Kluwer, Dordrecht (2000).

[25] R.D. James and D. Kinderlehrer, Frustration in ferromagnetic materials, Continuum Mech. Thermodyn. 2 (1990), 215-239.

[26] H. Kronmüller and M. Fähnle, Micromagnetism and the Microstructure of Ferromagnetic Solids, Cambridge University Press (2003).

[27] D.G. Luenberger, Optimization by Vector Space Methods, John Wiley and Sons, Inc. (1969).

[28] G.W. Milton, Theory of composites, Cambridge Monographs on Applied and Computational Mathematics, Cambridge University Press, Cambridge (2002).

[29] P. Pedregal, Parametrized measures and variational principles, Progress in Nonlinear Differential Equations and Their Applications, 30, Birkhäuser (1997).

[30] P. Pedregal and B. Yan, On Two Dimensional Ferromagnetism, pre-print (2007).

[31] M. Reed and B. Simon, Methods of Modern Mathematical Physics, Volume I, Functional Analysis, Reprint Elsevier (Singapore, 2003).

[32] R.T. Rockafellar, Convex Analysis, Princeton Univ. Press (1970).

[33] R.C. Rogers, A nonlocal model of the exchange energy in ferromagnet materials, Journal of Integral Equations and Applications, 3, No. 1 (Winter 1991).

[34] R.C. Rogers, Nonlocal variational problems in nonlinear electromagneto-elastostatics, SIAM J. Math. Anal. 19, No. 6 (November 1988).

[35] W. Rudin, Functional Analysis, second edition, McGraw-Hill (1991).

[36] W. Rudin, Real and Complex Analysis, third edition, McGraw-Hill (1987).

[37] D.R.S. Talbot and J.R. Willis, Bounds for the effective constitutive relation of a nonlinear composite, Proc. R. Soc. Lond. (2004), 460, 2705-2723.

[38] E.M. Stein and R. Shakarchi, Real Analysis, Princeton Lectures in Analysis III, Princeton University Press (2005).

[39] R. Temam, Navier-Stokes Equations, AMS Chelsea, reprint (2001).

[40] J.J. Telega, On the complementary energy principle in non-linear elasticity. Part I: Von Karman plates and three dimensional solids, C.R. Acad. Sci. Paris, Serie II, 308, 1193-1198; Part II: Linear elastic solid and non-convex boundary condition. Minimax approach, ibid, pp. 1313-1317 (1989).

[41] A. Galka and J.J. Telega, Duality and the complementary energy principle for a class of geometrically non-linear structures. Part I. Five parameter shell model; Part II. Anomalous dual variational principles for compressed elastic beams, Arch. Mech. 47 (1995), 677-698, 699-724.

[42] J.F. Toland, A duality principle for non-convex optimisation and the calculus of variations, Arch. Rat. Mech. Anal., 71, No. 1 (1979), 41-61.

[43] J.L. Troutman, Variational Calculus and Optimal Control, second edition, Springer (1996).
