henk broer and floris takens january 28, 2008broer/pdf/bt.pdf · nomical evolutions to population...

Dynamical Systems and Chaos

Henk Broer and Floris Takens

January 28, 2008

i

Preface

The discipline of Dynamical Systems provides the mathematical language describ-ing the time dependence of deterministic systems. Since thepast four decades thereis a ongoing theoretical development. This book starts fromthe phenomologicalpoint of view, reviewing examples, some of which have been guiding in the de-velopment of the theory. So we discuss oscillatiors, like the pendulum in all itsvariations concerning damping and periodic forcing and theVan der Pol system,the Henon and Logistic families, the Newton algorithm seenas a dynamical systemand the Lorenz and the Rossler systems. The Doubling map on the circle and theThom map (a.k.a. the Arnold cat map) on the 2-dimensional torus are useful toymodels to illustrate theoretical ideas like symbolic dynamics. In an appendix the1963 Lorenz model is derived from appropriate partial differential equations.

The phenomena concern equilibrium, periodic, quasi-periodic and chaotic dynam-ics as these occur in all kinds of modelling and are met in computer simulationsand in experiments. The application area varies from celestial mechanics and eco-nomical evolutions to population dynamics and climate variability. One general,motivating question is how one should give intelligent interpretations to data thatcome from computer simulations or experiments. For this thorough theoretical in-vestigations are needed.

One key idea is that the dynamical systems used for modellingshould be ‘robust’,which means that relevant qualitative properties should bepersistent under smallperturbations. Here we can distinguish between variationsof the initial state orof system parameters. In the latter case one may think of the coefficients in theequations of motion. One of the strongest forms of persistence, called structuralstability will be discussed, but many weaker forms often aremore relevant. Insteadof individual evolutions one also has to consider invariantmanifolds, like stable andunstable manifolds of equilibria and periodic orbits and invariant tori as these oftenmark the geometrical organisation of the state space. Here anotion like (normal)hyperbolicity come into play. In fact, looking more globally at the structure ofdynamical systems, we consider attractors and basin boundaries, mainly based onthe examples of the Solenoid and the Horseshoe map.

The central concept of the theory is chaos, to be defined in terms of unpredictabil-ity. The prediction principle we use is coinedl’histoire se repete, i.e., by matchingwith past observations. The unpredictability then will be expressed in a dispersionexponent. Structural stability turns out to be useful in proving that the chaotic-ity of the Doubling and Thom maps is persistent under small perturbations. Theideas on predictability will also be used in the developmentof Reconstruction The-ory, where important dynamical invariants like the box-counting and correlationdimensions, which often are fractal, of an attractor and related forms of entropy

ii

are re-constructed from time series based on a segment of thepast evolution. Alsonumerical estimation methods of the invariants will be discussed; these are impor-tant for applications, e.g., in controlling chemical processes and in early warningsystems regarding epilepsia.

Another central subject is formed by multi- and quasi-periodic dynamics. Here thedispersion exponent vanishes and quasi-periodicity to some extent forms the ‘orderin between the chaos’. We deal with circle dynamics near rigid rotations and similarsettings for area preserving and holomorphic maps. Persistence of quasi-periodicityis part of Kolmogorov-Arnold-Moser (orKAM ) Theory. In the main text we focuson the so-called 1 bite small divisor problem that can directly be solved by Fourierseries. In an appendix we show how this linear problem is usedto prove aKAM

Theorem by Newton iterations.

In another appendix we discuss transitions from ordery to more complicated dynam-ics upon variation of system parameters, in particular Hopf, Hopf-Neımark-Sackerand quasi-periodic bifurcations, indicating how this yields a unified theory for theonset of turbulence in fluid dynamics according to Hopf-Landau-Lifschitz-Ruelle-Takens. Also we indicate a few open problems regarding both conservative anddissipative chaos.

This book is written as a textbook for either undergraduate or graduate students,depending on their level. In any case a good knowledge of ordinary differentialequations is required. Also some knowledge of metric spaces, topology and mea-sure theory will help and an appendix on these subjects has been included. Thebook also is directed towards the applied researcher who likes to get a better un-derstanding of the data that computer simulation or experiment may provide himwith.

We maintain a wide scope, but given the wealth of material on this subject, wecannot possibly aim for completeness. However, the book does aim to be a guideto the dynamical systems literature. Below we present a few general references attextbook, handbook or encyclopædia level.

A brief historical note. Occasionally in the text some historical note is presentedand here we give a birdseye perspective. One historical linestarts around 1900 withPoincare, originating from celestial mechanics, in particular from his studies of theunwieldy 3-body probem [39], which later turned out to have chaotic evolutions.Poincare, among other things, introduced geometry in the theory of ordinary dif-ferential equations; instead of studying only one evolution, trying to compute it inone way or the other, he was the one that proposed to consider the structure andorganisation of all possible evolutions of a dynamical system. This qualitative linewas picked up later by the Fields medallists Thom and Smale and by a lot of othersin the 1960’s and 70’s and this development leaves many traces in the present book.

iii

Around this same time a really significant input to ‘chaos’ theory came from outsidemathematics. We mention Lorenz’s meteorological contribution with the celebratedLorenz attractor [129, 130], the work of May on population dynamics [133] and theinput of Henon-Heiles from astronomy, to which in the 1976 the famous Henon at-tractor was added [108]. It should be said that during this time the computer becamean important tool for lengthy computations, simulations and for graphical represen-tations, which had a tremendous effect on the field. Many of these developmentsalso will be dealt with below.

From the early 1970’s on these two lines merged, leading to the discipline ofnon-linear dynamical systemsas known now, which is an exciting area of research andwhich has many application both in the classical sciences, in the life sciences, inmeteorology, etc.

Guide to the literature. As said before, we could not aim for completeness, butwe do aim, among other things, to be an entrance to the literature. We have orderedthe bibliography at the end, subdividing it as follows.

- The references [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11] form general contributions tothe theory of nonlinear dynamical systems at a textbook level;

- More in the direction of bifurcation theory we like to pointat the textbooks[12, 13, 14] and to [15, 16];

- For textbooks on the ergodic theory of nonlinear dynamics see [17, 18];

- A few handbooks on dynamical systems are [19, 20, 21, 22], where in thelatter reference we like to point at the chapters [23, 24, 25,26, 27];

- We also mention the Russian Encyclopædia of Mathematics aspublished bySpringer, of which we refer explicitly to [28].

Acknowledgements. . . .

Henk Broer and Floris TakensGroningen, Spring 2008

Contents

1 Examples and definitions of dynamical phenomena 11.1 The pendulum as a dynamical system . . . . . . . . . . . . . . . . 2

1.1.1 The free pendulum . . . . . . . . . . . . . . . . . . . . . . 21.1.1.1 The free, undamped pendulum . . . . . . . . . . 21.1.1.2 The free, damped pendulum . . . . . . . . . . . . 7

1.1.2 The forced pendulum . . . . . . . . . . . . . . . . . . . . . 91.2 General definition of dynamical systems . . . . . . . . . . . . . .. 14

1.2.1 Differential equations . . . . . . . . . . . . . . . . . . . . . 161.2.2 Constructions of dynamical systems . . . . . . . . . . . . . 19

1.2.2.1 Restriction . . . . . . . . . . . . . . . . . . . . . 191.2.2.2 Discretisation . . . . . . . . . . . . . . . . . . . 201.2.2.3 Suspension . . . . . . . . . . . . . . . . . . . . . 211.2.2.4 Poincare map . . . . . . . . . . . . . . . . . . . 23

1.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251.3.1 A Hopf bifurcation in the Van der Pol equation . . . . . . . 25

1.3.1.1 The Van der Pol equation . . . . . . . . . . . . . 251.3.1.2 Hopf bifurcation . . . . . . . . . . . . . . . . . . 26

1.3.2 The Henon map; saddle points and separatrices . . . . . .. 281.3.2.1 Saddle points and separatrices . . . . . . . . . . . 311.3.2.2 Numerical computation of separatrices . . . . . . 31

1.3.3 The Logistic system; bifurcation diagrams . . . . . . . . .. 321.3.4 The Newton algorithm . . . . . . . . . . . . . . . . . . . . 38

1.3.4.1 R ∪ ∞ as a circle: stereographic projection . . 401.3.4.2 Applicability of the Newton algorithm . . . . . . 411.3.4.3 Non-converging Newton algorithm . . . . . . . . 411.3.4.4 Newton algorithme in higher dimensions . . . . . 43

1.3.5 Dynamical systems defined by partial differential equations 451.3.5.1 The 1-dimensional wave equation . . . . . . . . . 451.3.5.2 Solution of the 1-dimensional wave equation . . . 461.3.5.3 The 1-dimensional heat equation . . . . . . . . . 46

1.3.6 The Lorenz attractor . . . . . . . . . . . . . . . . . . . . . 49

vi CONTENTS

1.3.6.1 Discussion . . . . . . . . . . . . . . . . . . . . . 491.3.6.2 Numerical simulations . . . . . . . . . . . . . . . 50

1.3.7 The Rossler attractor; Poincare map . . . . . . . . . . . . .511.3.7.1 The Rossler system . . . . . . . . . . . . . . . . 521.3.7.2 The attractor of the Poincare map . . . . . . . . . 53

1.3.8 The Doubling map and symbolic dynamics . . . . . . . . . 541.3.8.1 The Doubling map on the interval . . . . . . . . . 541.3.8.2 The Doubling map on the circle . . . . . . . . . . 551.3.8.3 The Doubling map in symbolic dynamics . . . . . 551.3.8.4 Analysis of the Doubling map in symbolic form . 57

1.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

2 Qualitative properties and predictability of evolutions 672.1 Stationary and periodic evolutions . . . . . . . . . . . . . . . . .. 68

2.1.1 Predictability of periodic and stationary motions . .. . . . 682.1.2 Asymptotically and eventually periodic evolutions .. . . . 70

2.2 Multi- and quasi-periodic evolutions . . . . . . . . . . . . . . .. . 712.2.1 Then-dimensional torus . . . . . . . . . . . . . . . . . . . 732.2.2 Translations on a torus . . . . . . . . . . . . . . . . . . . . 74

2.2.2.1 Translation systems on the 1-dimensional torus . . 752.2.2.2 Translation systems on the 2-dimensional torus

with time setR . . . . . . . . . . . . . . . . . . . 772.2.2.3 Translation systems on then-dimensional torus

with time setR . . . . . . . . . . . . . . . . . . . 782.2.2.4 Translation systems on then-dimensional torus

with time setZ or N . . . . . . . . . . . . . . . . 802.2.3 General definition of multi- and quasi-periodic evolutions . 81

2.2.3.1 Multi- and quasi-periodic subsystems . . . . . . . 812.2.3.2 Example: the driven Van der Pol equation . . . . 83

2.2.4 Predictability of quasi-periodic evolutions . . . . . .. . . . 852.2.5 Historical remarks . . . . . . . . . . . . . . . . . . . . . . 88

2.3 Chaotic evolutions . . . . . . . . . . . . . . . . . . . . . . . . . . 892.3.1 Badly predictable (chaotic) evolutions of the Doubling map 902.3.2 Definition of dispersion exponent and chaos . . . . . . . . .932.3.3 Properties of the dispersion exponent . . . . . . . . . . . . 97

2.3.3.1 ‘Transition’ from quasi-periodic to stochastic . .. 972.3.3.2 ‘Transition’ from periodic to chaotic . . . . . . . 972.3.3.3 ‘Transition’ from chaotic to stochastic . . . . . . 98

2.3.4 Chaotic evolutions in the examples of Chapter 1 . . . . . .. 992.3.5 Chaotic evolutions of the Thom map . . . . . . . . . . . . . 101

2.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

CONTENTS vii

3 Persistence of dynamical properties 1093.1 Variation of initial state . . . . . . . . . . . . . . . . . . . . . . . . 1093.2 Variation of parameters . . . . . . . . . . . . . . . . . . . . . . . . 1123.3 Persistence of stationary and periodic evolutions . . . .. . . . . . . 115

3.3.1 Persistence of stationary evolutions . . . . . . . . . . . . .1153.3.2 Persistence of periodic evolutions . . . . . . . . . . . . . . 119

3.4 Persistence for the Doubling map . . . . . . . . . . . . . . . . . . . 1193.4.1 Perturbations of the Doubling map: persistent chaoticity . . 1203.4.2 Structural stability . . . . . . . . . . . . . . . . . . . . . . 1233.4.3 The Doubling map modelling a (fair) coin . . . . . . . . . . 127

3.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

4 Global structure of dynamical systems 1334.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1344.2 Examples of attractors . . . . . . . . . . . . . . . . . . . . . . . . 136

4.2.1 The Doubling map and hyperbolic attractors . . . . . . . . .1374.2.1.1 The Doubling map on the plane . . . . . . . . . . 1374.2.1.2 The Doubling map in 3-space: the Solenoid . . . 1394.2.1.3 The Solenoid as a hyperbolic attractor . . . . . . 1434.2.1.4 Properties of hyperbolic attractors . . . . . . . . . 146

4.2.2 Non-hyperbolic attractors . . . . . . . . . . . . . . . . . . 1494.2.2.1 Henon-like attractors . . . . . . . . . . . . . . . 1504.2.2.2 The Lorenz attractor . . . . . . . . . . . . . . . . 151

4.3 Basin boundaries and the Horseshoe map . . . . . . . . . . . . . . 1564.3.1 Gradient systems . . . . . . . . . . . . . . . . . . . . . . . 1564.3.2 The Horseshoe map . . . . . . . . . . . . . . . . . . . . . . 158

4.3.2.1 Symbolic dynamics . . . . . . . . . . . . . . . . 1614.3.2.2 Structural stability . . . . . . . . . . . . . . . . . 163

4.3.3 Horseshoe-like sets in basin boundaries . . . . . . . . . . .1654.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167

5 On KAM Theory 1735.1 Introduction, setting of the problem . . . . . . . . . . . . . . . .. 1735.2 KAM Theory of circle maps . . . . . . . . . . . . . . . . . . . . . 175

5.2.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . 1765.2.2 Formal considerations and small divisors . . . . . . . . . .178

5.3 KAM Theory of area preserving maps . . . . . . . . . . . . . . . . 1825.4 KAM Theory of holomorphic maps . . . . . . . . . . . . . . . . . 184

5.4.1 Complex linearization . . . . . . . . . . . . . . . . . . . . 1845.4.2 Cremer’s example in Herman’s version . . . . . . . . . . . 185

5.5 The 1-bite small divisor problem . . . . . . . . . . . . . . . . . . . 186

viii CONTENTS

5.5.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . 1865.5.2 Setting of the problem and formal solution . . . . . . . . . 1875.5.3 Convergence . . . . . . . . . . . . . . . . . . . . . . . . . 1895.5.4 Finite differentiability . . . . . . . . . . . . . . . . . . . . 193

5.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196

6 Reconstruction and time series analysis 2036.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2036.2 An experimental example: the dripping faucet . . . . . . . . .. . . 2046.3 The Reconstruction Theorem . . . . . . . . . . . . . . . . . . . . . 205

6.3.1 Generalizations . . . . . . . . . . . . . . . . . . . . . . . . 2076.3.1.1 Continuous time . . . . . . . . . . . . . . . . . . 2076.3.1.2 Multidimensional measurements . . . . . . . . . 2086.3.1.3 Endomorphisms . . . . . . . . . . . . . . . . . . 2086.3.1.4 Compactness . . . . . . . . . . . . . . . . . . . . 208

6.3.2 Historical note . . . . . . . . . . . . . . . . . . . . . . . . 2096.4 Reconstruction and detecting determinism . . . . . . . . . . .. . . 210

6.4.1 Box counting dimension and its numerical estimation .. . . 2126.4.2 Numerical estimation of the box counting dimension . .. . 2136.4.3 Box counting dimension as an indication for ‘thin’ subsets . 2146.4.4 Estimation of topological entropy . . . . . . . . . . . . . . 215

6.5 Stationarity and reconstruction measures . . . . . . . . . . .. . . . 2156.5.1 Probability measures defined by relative frequencies. . . . 2166.5.2 Definition of stationarity and reconstruction measures . . . 2176.5.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 218

6.6 Correlation dimensions and entropies . . . . . . . . . . . . . . .. 2196.6.1 Miscellaneous remarks . . . . . . . . . . . . . . . . . . . . 221

6.6.1.1 Compatibility of the definitions of dimension andentropy with reconstruction . . . . . . . . . . . . 221

6.6.1.2 Generalized correlation integrals, dimensions andentropies . . . . . . . . . . . . . . . . . . . . . . 222

6.7 Numerical estimation of correlation integrals, dimensions and en-tropies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223

6.8 Classical time series analysis, correlation integrals, and predictability 2276.8.1 Classical time series analysis . . . . . . . . . . . . . . . . . 227

6.8.1.1 Optimal linear predictors . . . . . . . . . . . . . 2286.8.1.2 Gaussian timeseries . . . . . . . . . . . . . . . . 228

6.8.2 Determinism and auto-covariances . . . . . . . . . . . . . . 2296.8.3 Predictability and correlation integrals . . . . . . . . .. . . 231

6.8.3.1 l’Histoire se repete . . . . . . . . . . . . . . . . . 2326.8.3.2 Local linear predictors . . . . . . . . . . . . . . . 233

CONTENTS ix

6.9 Miscellaneous subjects . . . . . . . . . . . . . . . . . . . . . . . . 2346.9.1 Lyapunov exponents . . . . . . . . . . . . . . . . . . . . . 2356.9.2 Estimation of Lyapunov exponents from a time series . .. . 2366.9.3 The Kantz-Diks test — discriminating between time series

and testing for reversibility . . . . . . . . . . . . . . . . . . 2376.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238

A Diffential Topology and Measure Theory 243A.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243A.2 Topology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243A.3 Differentiable manifolds . . . . . . . . . . . . . . . . . . . . . . . 247A.4 Measure theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252

B MiscellaneousKAM Theory 257B.1 The integer affine structure of quasi-periodic tori . . . .. . . . . . 257B.2 Classical (conservative) KAM Theory . . . . . . . . . . . . . . . .258B.3 Dissipative KAM theory . . . . . . . . . . . . . . . . . . . . . . . 260B.4 On the KAM proof in the dissipative case . . . . . . . . . . . . . . 261

B.4.1 Reformulation and some notation . . . . . . . . . . . . . . 262B.4.2 On the Newtonian iteration . . . . . . . . . . . . . . . . . . 262

B.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265

C Miscellaneous bifurcations 267C.1 Local bifurcations of low codimension . . . . . . . . . . . . . . .. 268

C.1.1 Saddle-node bifurcation . . . . . . . . . . . . . . . . . . . 269C.1.2 Period doubling bifurcation . . . . . . . . . . . . . . . . . 269C.1.3 Hopf bifurcation . . . . . . . . . . . . . . . . . . . . . . . 270C.1.4 Hopf- Neımark-Sacker bifurcation . . . . . . . . . . . . . . 270C.1.5 The saddle-center bifurcation . . . . . . . . . . . . . . . . 273

C.2 Quasi-periodic bifurcations . . . . . . . . . . . . . . . . . . . . . .274C.2.1 The quasi-periodic saddle-center bifurcation . . . . .. . . . 274C.2.2 The quasi-periodic Hopf bifurcation . . . . . . . . . . . . . 275

C.3 Transition to chaos . . . . . . . . . . . . . . . . . . . . . . . . . . 277C.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282

D Derivation of the Lorenz equations 283D.1 Geometry and flow of an incompressible fluid . . . . . . . . . . . .283D.2 Heat transport and the influence of temperature . . . . . . . .. . . 285D.3 Rayleigh stability analysis . . . . . . . . . . . . . . . . . . . . . . 287D.4 Restriction to a 3-dimensional state space . . . . . . . . . . .. . . 289D.5 Numerical simulations . . . . . . . . . . . . . . . . . . . . . . . . 292

x CONTENTS

Chapter 1

Examples and definitions ofdynamical phenomena

A dynamical system can be any mechanism that evolves deterministically in time.Simple examples can be found in mechanics, one may think of the pendulum orthe Solar System. Similar systems occur in chemistry and meteorology. We shouldnote that in biological and economic systems it is less clearwhen we are dealingwith determinism.1 We are interested inevolutions, i.e., functions that describe thestateof a system in dependence oftimeand that satisfy theequations of motionofthe system. Mathematical definitions of these concepts willfollow later.

The simplest type of evolutions isstationary, where the state is constant in time.Next we also knowperiodic evolutions. Here, after a fixed period, the system alwaysreturns to the same state and we get a precise repetition of what happened in theprevious period. Stationary and periodic evolutions are very regular and predictable.Apart from these types of evolutions, already in quite simple systems one meetsevolutions that are not so regular and predictable. Sometimes we shall speak ofchaoticbehaviour.

In the first part of this chapter, using systems as the pendulum with or withoutdamping or external forcing, we give an impression of the types of evolution thatmay occur. We shall see that usually one system has various types of evolution.Which types are prevalent strongly depends on the kind of system at hand; forexample, in mechanical systems it is important whether we consider friction of not.During this exposition it will become clearer what is the mathematical frame workin which to describe the various phenomena. This will determine the language andthe way of thinking of the discipline ofDynamical Systems.

1The problem whether a system is deterministic or not will be addressed later, as well as thequestion in howfar our considerations (partially) keep their value in situations that are not (totally)determiniscic. Here one may think of sensitive dependence on initial conditions in combination withfluctuations due to thermal molecular dynamics or of quantumfluctuations, compare with [170].

2 Examples and definitions of dynamical phenomena

pendulum In the second part of this chapter we are more explicit, giving a formal definition of‘dynamical system’. Then also the concepts likestate, time, etc. will be discussed.After this, returning to the previous examples, we shall illustrate these concepts.In the final part of this chapter we treat a number of miscellaneous examples, thatwill regularly return in later parts of the book. Here we particularly pay attention toexamples that, historically speaking, have given direction to the development of thediscipline.

1.1 The pendulum as a dynamical system

As a preparation to the formal definitions we go into later this chapter, we firsttreat the pendulum in a number of variations: with or withoutdamping and exter-nal forcing. We shall present graphs of numerically computed evolutions. Fromthese graphs it will be clear that qualitative differences exist between the variousevolutions and that the type of evolution that is prevalent strongly depends on thecontext at hand. This already holds for the restricted classof mechanical systemsconsidered here.

1.1.1 The free pendulum

The planar, mathematical pendulum consists of a rod, suspended at a fixed point ina vertical plane in which the pendulum can move. All mass is thought of as to beconcentrated in a point mass at the end of the rod, while the rod itself is massless.Also the rod is assumed stiff. The pendulum has massm and lengthℓ.We moreoverassume the suspension to be such that the pendulum not only can oscillate, but alsocan rotate around its point of suspension. In the case without external forcing, wespeak of thefreependulum.

1.1.1.1 The free, undamped pendulum

In the case without both damping (or friction) and without external forcing, thependulum is only subject to gravity, with accelerationg. The gravitational force ispointing vertically downward with strengthmg and has a component−mg sinϕalong the circle described by the point mass, compare with Figure 1.1. Letϕ bethe angle between the rod and the downward vertical, expressed in radians. Thecorresponding path described by the point mass then has length ℓϕ. The relationshipbetween force and motion is determined by Newton’s2 famous lawF = ma, whereF denotes the force,m the mass anda the acceleration.

2Isaac Newton 1642-1727

1.1 The pendulum as a dynamical system 3

ℓϕ

mg

−mg sinϕ

Figure 1.1: Sketch of the planar mathematical pendulum.

By the stiffness of the rod, no motion takes place in the radial direction and wetherefore just apply Newton’s law in theϕ-direction. The component of the force intheϕ-direction is given by−mg sinϕ, where the minus sign accounts for the factthat the force is driving the pendulum back toϕ = 0. For the accelerationa we have

a =d2(ℓϕ)

dt2= ℓ

d2ϕ

dt2,

where(d2ϕ/dt2)(t) = ϕ′′(t) is the second derivative of the functiont 7→ ϕ(t).Substituted into Newton’s law this gives

mℓϕ′′ = −mg sinϕ

or, equivalently,

ϕ′′ = −gℓ

sinϕ. (1.1)

So we derived theequation of motionof the pendulum, which in this case is anor-dinary differential equation. This means that the evolutions are given by the func-tionst 7→ ϕ(t) that satisfy the equation of motion (1.1). In the sequel we abbreviateω =

√

g/ℓ.

Observe that in the equation of motion (1.1) the massm no longer occurs. Thismeans that the mass has no influence on the possible evolutions of this system. Asanexperimentalfact this was already known to Galileo,3 a predecessor of Newtonconcerning the development of Classical Mechanics.

3Galileo Galilei 1564-1642


Remark (Digression on Ordinary Differential Equations I). In the above exam-ple, the equation of motion is an ordinary differential equation. Concerning thesolutions of such equations we make the following digression.

1. From the Theory of Ordinary Differential Equations, e.g., see [114, 92, 43], itis known that such an ordinary differential equation, giveninitial conditionsor initial state, has a uniquely determined solution.4 Since the differentialequation has order two, the initial state att = 0 is determined bybothϕ(0)andϕ′(0), so both position and velocity at the timet = 0. The theory saysthat, given such an initial state, there exists exactly one solution t 7→ ϕ(t),mathematically speaking a function, also called ‘motion’.This means thatposition and velocity at the instantt = 0 determine the motion forall futuretime.5 In the discussion, later this chapter, on the state space of adynamicalsystem, we shall see that in the present pendulum case, the states are givenby pairs(ϕ, ϕ′). This implies that the evolution, strictly speaking, is a mapt 7→ (ϕ(t), ϕ′(t)), wheret 7→ ϕ(t) satisfies the equation of motion (1.1). Theplane with coordinates(ϕ, ϕ′) often is calledphase plane. The fact that theinitial state of the pendulum is determined for all future time, means that thependulum is adeterministicsystem.6

2. Between existence and explicit construction of solutions of a differential equa-tion there is a wide gap. Indeed, only for quite simple systems, likes theharmonic oscillator with equation of motion

ϕ′′ = −ω2ϕ,

we can explicitly compute the solutions. In this example these are all of theformϕ(t) = A cos(ωt+B), whereA andB can be expressed in terms of theinitial state(ϕ(0), ϕ′(0)). So the solution is periodic with period2π/ω andwith amplitudeA, while the numberB, which is only determined up to aninteger multiple of2π; it is the phase at the timet = 0. Since nearϕ = 0 wehave the linear approximation

sinϕ ≈ ϕ,

in the sense thatsinϕ = ϕ+ o(|ϕ|), we may consider the harmonic oscillatoras a linear approximation of the pendulum. Indeed, the pendulum itself alsohas oscillating motions of the formϕ(t) = F (t), whereF (t) = F (t+ P ) for

4For a more elaborate discussion we refer to§1.2.5Here certain restrictions have to be made, to be discussed ina digression later this chapter.6In Chapter 6 we will see that the concept of determinism for more general systems is not so easy

to define.


t

ϕ

π

−π

0

(a)

t

ϕ

2π

0

(b)

ϕ

ϕ′

(c)

Figure 1.2: Evolutions of the free, undamped pendulum: (a) the pendulum oscil-lates, (b) the pendulum rotates around the point of suspension, (c) a few integralcurves in the phase plane.

a real numberP, the period of the motion. The periodP varies for differentsolutions, increasing with the amplitude. It should be noted that the functionF occurring here, is not explicitly expressable in elementary functions. Formore information, see [123].

3. General methods exist for obtaining numerical approximations of solutionsof ordinary equations given initial conditions. Such solutions can only becomputed for a restricted time interval. Moreover, due to accumulation of er-rors, such numerical approximations generally will not meet with reasobablecriteria of reliability when the time interval is growing (too) long.

We now return to the context of the pendulum without forcing or damping, that is, tothe free, undamped pendulum. In Figure 1.2 we present a number of characteristicmotions, or evolutions, as these occur for the pendulum.

In the diagrams (a) and (b) we plottedϕ as a function oft. Diagram (a) shows anoscillatory motion, i.e., where there is no rotation aroundthe point of suspension.Diagram (b) does show a motion where the pendulum rotates. Inthe latter casethe graph does not directly look periodic, but when realising thatϕ is an anglevariable, which means that only its values modulo integer multiples of 2π are of


interest for the position, we conclude that also this motionis periodic. In diagram(c) we represent motions in a totally different way. Indeed,for a number of possiblemotions we depict the integral curve in the(ϕ, ϕ′)-plane, i.e., the phase plane. Thecurves are of the formt 7→ (ϕ(t), ϕ′(t)), where t varies overR. We show twoperiodic motions as in diagram (a), noting that in this representation each motioncorresponds to a closed curve. Also two periodic motions areshown as in diagram(b), one for the case whereϕ continuously increases and the other for the case whereϕ decreases. Here the periodic character of the motion does not show so clearly.

Remark. The state space of the pendulum is a cylinder. In order to see that the lattercurves are periodic, we have to make the identificationϕ ∼ ϕ+2π,which turns the(ϕ, ϕ′)-plane into a cylinder. This also has its consequences for the representation ofthe periodic motions according to Figure 1.2 (a); indeed in diagram correspondingto (c) we then would only witness two of such motions.

Apart from these periodic motions, also a few other motions have been represented.The points(2πk, 0), with integerk, correspond to the stationary motion (whichamounts to rest and not to motion) where the pendulum hangs inits downwardequilibrium. The point(2π(k + 1

2), 0) k ∈ Z correspond to the stationary motion

where the pendulum stands upside down. We see that a stationary motion is depictedas a point, such a point should be interpretated as a ‘constant’ curve. Finally motionsare represented that connect successive points of the form(2π(k + 1

2), 0) k ∈ Z. In

these motions fort→ ∞ the pendulum tends to its upside down position, while alsofor t → −∞ the pendulum came from this upside down position. The lattertwomotions at first sight may seem somewhat exceptional, which they are in the sensethat the set of initial states leading to such evolutions have area zero in the plane.However, when considering diagram (c) we see that this kind of motion shouldoccur at the boundary of the two kinds of more regular periodic motion. Thereforethe corresponding curve is calledseparatrix. This is something that will happenmore often: the exceptional motions are important for understanding the ‘picture ofall possible motions’.

For the mathematical understanding of these motions, in particular of the repre-sentations of Figure 1.2 diagram (c), it is important to realise that our system isundamped or frictionless. Mechanics then teaches us thatconservation of energyholds. The energy in this case is given by

H(ϕ, ϕ′) = mℓ2(12(ϕ

′)2 − ω2 cosϕ). (1.2)

Conservation of energy means that for any solutionϕ(t) of the equation of motion(1.1), the functionH(ϕ(t), ϕ′(t)) is constant int. This fact also follows from moredirect considerations, compare with Exercise 1.3. This means that the solutions asindicated in diagram (c) are completely determined by levelcurves of the function


H. Indeed, the level curves withH-values between−mℓ2ω2 and+mℓ2ω2 are closedcurves corresponding to oscillations of the pendulum. And each level curve withH > mℓ2ω2 consists of two components where the pendulum rotates around thepoint of suspension: in one component the rotation is to the left and in the other oneto the right. The levelH = mℓ2ω2 is a curve with double points, corresponding tothe exceptional motions just described: the upside down position and the motionsthat fort→ ±∞ tend to this position.

We briefly return to a remark made in the above digression, saying that the explicitexpression of solutions of the equation of motion is not possible in terms of ele-mentary functions. Indeed, considering the level curveH(ϕ, ϕ′) = E we solveto

ϕ′ = ±√

2

(

E

mℓ2+ ω2 cosϕ

)

.

The time it takes for the solution to travel betweenϕ = ϕ1 andϕ = ϕ2 is given bythe integral

T (E,ϕ1, ϕ2) =

∣

∣

∣

∣

∣

∣

∫ ϕ2

ϕ1

dϕ√

2(

Emℓ2

+ ω2 cosϕ)

∣

∣

∣

∣

∣

∣

,

which cannot be ‘solved’ in terms of known elementary functions. Indeed, it is a so-calledelliptic integral, an important subject of study for Complex Analysis duringthe 19th century. For more information see [123]. For an oscillatory motion ofthe pendulum we find values ofϕ, whereϕ′ = 0. Here the oscillation reaches itsmaximal values±ϕ(E), where the positive value usually is called the amplitude ofoscillation. We get

±ϕ(E) = arccos

(

− E

mℓ2ω2

)

.

The periodP (E) of this oscillation then is given byP (E) = 2T (E,−ϕ(E), ϕ(E)).

1.1.1.2 The free, damped pendulum

In the case that the pendulum has damping or friction, loss ofenergy takes place anda possible motion is bound to damp to rest: we speak of the damped pendulum. Forsimplicity it is here assumed that the friction force is proportional to the velocityand of opposite sign.7

7To some extent this is a simplifying assumption, but we are almost forced to this. In situationswith friction there are no simple first principles giving unique equations of motion. For an elemen-tary treatment on the complications that can occur when attempting an exact modelling of frictionphenomena, compare with [97]; Vol. 1, Chapter 12.2 en 12.3. Therefore we are fortunate that almostall qualitative statements are independent of the precise formula used for the friction force. Mainpoint is that, as long as the pendulum moves, the energy decreases.


t

ϕ

(a)

t

ϕ

(b)

ϕ

ϕ′

(c)

Figure 1.3: Evolutions of the free, damped pendulum: (a) themotion of the pendu-lum damps to the lower equilibrium; (b) As under (a), but faster; (c) a few evolutionsin the phase plane.

Therefore we now consider a pendulum, of which the equation of motion is givenby

ϕ′′ = −ω2 sinϕ− cϕ′, (1.3)

wherec > 0 is determined by the strength of the friction force. For non-stationarysolutions of this equation of motion the energyH, given by (1.2), decreases. Indeed,if ϕ(t) is a solution of (1.3), then

dH(ϕ(t), ϕ′(t))

dt= −cmℓ2(ϕ′(t))2. (1.4)

Equation (1.4) indeed shows that friction makes the energy decrease as long asthe pendulum moves, i.e., as long asϕ′ 6= 0. This implies that we can’t expectperiodic motions to occur (other than stationary). We expect that any motion willtend to a state where the pendulum is at rest. This indeed turns out to be the case.Moreover, we expect that the motion generally will tend to the stationary state wherethe pendulum hangs downward. This is exactly what is shown inthe numericalapproximations of Figure 1.3, represented in the same way asin Figure 1.2.


Also in this case there are a few exceptional motions, that are however not depictedin diagram (c). To begin with we have the stationary solution, where the pendulumstands upside down. Next there are motionsϕ(t), in which the pendulum tends tothe upside position ast→ ∞.

We noted before that things do not change too much when the friction law ischanged. In any case this holds for the remarks made up to now.Yet there arepossible differences in the way the pendulum tends to its downward equilibrium.For instance, compare the cases (a) and (b) in Figure 1.3. In case (a) we have thatϕ(t) passes through zero infinitely often when tending to its restposition, whilein diagram (b) this is not the case: nowϕ(t) creeps to equilibrium. In the presentequation of motion (1.3) this difference between (a) and (b)occurs when the frictionstrengthc increases and passes a certain threshold value. In the linearised equation

ϕ′′ = −ω2ϕ− cϕ′,

that can be solved explicitly, this effect can be directly witnessed. Compare withExercise 1.5.

We summarise what we have seen sofar regarding the dynamics of the free pen-dulum. The undamped case displays an abundance of periodic motions, this iscompletely destroyed by the tiniest amount of damping. Although systems withoutdamping in practice do not occur, still we often witness periodic behaviour. Amongother things this is related to the occurrence ofnegativefriction, that is directed inthe same direction as the motion. Here the ‘friction force’ has to deliver work andtherefore external energy is needed. Examples of this can befound among electricaloscillators, as described by Van der Pol [158, 159].8 A more modern reference is[114] Chapter 10, also see below. Another example of a kind ofnegative frictionoccurs in a mechanical clockwork, that very slowly consumesthe potential energyof a spring or a weight. In the next section we shall deal with periodic motions thatoccur as a consequence ofexternal forces.

1.1.2 The forced pendulum

Our equations of motion are based on Newton’s lawF = ma, where the accellera-tion a, up to a multiplicative constant, is given by the second derivativeϕ′′. There-fore we have to add the external force to the expression forϕ′′. We assume that thisforce is periodic in the timet, even that it has a simple cosinus shape. In this waywe arrive at the following equation of motion

ϕ′′ = −ω2 sinϕ− cϕ′ + A cos Ωt. (1.5)

8Balthasar van der Pol 1889-1959


To get an impression on the types of motion that can occur now,in figure 1.4 weshow a number of motions that have been computed numerically. Forω = 2.5, c =0.5 andA = 3.8, we letΩ vary between 1.4 and 2.1 with steps of magnitude 0.1. Inall these diagrams time runs from 0 to 100, while we takeϕ(0) = 0 andϕ′(0) = 0as initial conditions.

We observe that in the diagrams withΩ = 1.4, 1.5, 1.7, 1.9, 2.0 and 2.1 the mo-tion is quite orderly: disregardingtransientphenomena the pendulum describes aperiodic, oscillatory motion. For the values ofΩ in between, the oscillatory motionof the pendulum is alternated by motions where it rotates several times around itspoint of suspension. It is not even clear from the observation with t ∈ [0, 100],whether the pendulum will ever tend to a periodic motion. It indeed can be shownthat the system, with equation of motion (1.5) with well-chosen values of the pa-rametersω, c, A andΩ, has motions that never tend to a periodic motion, but thatkeeps going on in a weird fashion. In that case we speak ofchaoticmotions. Below,in Chapter 2, we shall discuss chaoticity in connection to unpredictability.

In Figure 1.5 we show an example of a motion of the damped, forced pendulum(1.5), that only after a long time tends to a periodic motion.So, here we are dealingwith a transient phenomenon that takes a long time. We depictthe solution as at-parametrised curve in the(ϕ, ϕ′)-plane. The data used areω = 1.43, Ω = 1, c =0.1 andA = 0.2. In the left diagram we took initial valuesϕ(0) = 0.3, ϕ′(0) = 0,while t traversed the interval[0, 100]. In the right diagram we took as initial statethe end state of the left diagram.

We finish with a few remarks on the forced pendulum without friction, i.e., withequation of motion (1.5) wherec = 0 :

ϕ′′ = −ω2 sinϕ + A cosΩt.

We show two of these motions in the diagrams of Figure 1.6. In both cases we tookω = 1.43 andΩ = 1. For the left diagram of Figure 1.6, which probably showsa chaotic evolution, we tookA = 0.01 and initial conditionsϕ(0) = 0.01 andϕ′(0) = 0. In the right diagram we witness a new phenomenon: the motion doesnot become periodic, but it looks rather regular. Later we shall return to this subject,but now we globally indicate what is the case here. The pendulum oscillates in awell-defined frequency and also the forcing has a well-defined frequency. In themotion depicted in the right diagram,both frequencies remain visible. This kind ofmotion is quite predictable and therefore is not called chaotic. Rather one speaks ofa quasi-periodicmotion. However, it should be mentioned that in this undamped,forced case, the initial states leading to quasi-periodic motions and those leading tochaotic motions are intertwined in a complicated manner. Hence it is hard to predictin advance what one will get.


t

ϕ

Ω = 1.4

t

ϕ

Ω = 1.5

t

ϕ

Ω = 1.6

t

ϕ

Ω = 1.7

t

ϕ

Ω = 1.8

t

ϕ

Ω = 1.9

t

ϕ

Ω = 2.0

t

ϕ

Ω = 2.1

Figure 1.4: Motions of the forced, damped pendulum for various values ofΩ for trunning from0 to 100 and withϕ(0) = 0 andϕ′(0) = 0 as initial conditions.


ϕ′

ϕ

ϕ′

ϕ

Figure 1.5: Evolution of the damped and forced pendulum (1.5) as at-paramatrisedcurve in the(ϕ, ϕ′)-plane. The evolution tends to a periodic motion. The rightdiagram shows a continuation of the segment in the left diagram. We tookω =1.43,Ω = 1, A = 0.2 andc = 0.1. Initial conditions in the left diagram areϕ(0) =0.3, ϕ′(0) = 0.

ϕ

ϕ′

ϕ

ϕ′

Figure 1.6: The evolution in the left diagram is probably chaotic and the one in theright diagram quasi-periodic. In both casesω = 1.43 andΩ = 1. In the left diagramA = 0.01, ϕ(0) = 0.01 andϕ′(0) = 0.


−4

−2

0

2

4

ϕϕ

−2 0 2

ϕϕ

−4

−2

0

2

4

ϕϕ

−2 0 2

ϕϕ

Figure 1.7: Phase portraits of Poincare maps for the undamped (left) and damped(right) pendulum with periodic forcing.

There is yet another way to describe te evolutions of a pendulum with periodicforcing. In this description we represent the state of the pendulum by(ϕ, ϕ′), justat the instantstn = 2πn/Ω. At these instants the driving force resumes a newperiod. The underlying idea is that there exists amapP that assigns to each state(ϕ, ϕ′) the next stateP (ϕ, ϕ′). In other words, wheneverϕ(t) is a solution of thet1 6= 2πk/Ω?

(FT)equation of motion (1.5), one hasP (ϕ(tn), ϕ′(tn)) = (ϕ(tn+1), ϕ′(tn+1)). This map

often is called thePoincare map, or alternativelyperiod mapor stroboscopic map.In Chapter 2, we shall return to this subject. By depicting the successive points(ϕ(tn), ϕ′(tn)) we obtain a simplification, particularly in the case of complicatedevolutions. First we observe that if (1.5) has a periodic solution, then its periodnecessarily is an integer multiple of2π/Ω. Second, it is easy to see that a periodicsolution of (1.5) for the Poincare map will lead to a periodic point, which shows asa finite set: indeed, if the period is2πk/Ω, we observe a set ofk points. In Figure1.7 we show several iterations of the Poincare map both in the undamped and in thedamped case. For more pictures in a related model of the forced pendulum, calledswing, see the Figures C.5, C.6 in Appendix C,§C.3.

We summarize as follows. For various models of the pendulum we encounteredstationary, periodic, quasi-periodic and chaotic motions. Also we met with motionsthat are stationary or periodic after a transient phenomenon. It turned out that certainkinds of dynamics are more or less characteristic or typicalfor the model underconsideration. We recall that the free, damped pendulum, apart from a transient,can only have stationary behaviour. Stationary behaviour however is atypical for theundamped pendulum with or without forcing. Indeed, in the case without forcingthe typical behaviour is periodic, while in the case with forcing we see both (quasi-)periodic and chaotic motions. Furthermore, for both the damped and the undampedfree pendulum, the upside down equilibrium is atypical. We here use the term


‘atypical’ to indicate that the initial states invoking this motion forms a very smallset (i.e., of area9 zero). It has been shown that the quasi-periodic behaviour of theundamped, forced pendulum also is typical, below we shall come back to this. Thechaotic behaviour probably also is typical, but this fact has not been establishedmathematically. On the other hand, for the damped, forced pendulum it has beenproven that certain forms of chaotic motion are typical, while quasi-periodic motiondoes not occur at all. For further discussion, see below.

In the following chapters we shall make similar qualitativestatements, only formu-lated more accurately, and valid for more general classes ofdynamical systems. Tothis purpose we first need to describe precisely what is meantby the concepts: dy-namical system. (quasi-) periodic, chaotic, up to transient phenomena, typical andatypical behaviour in dependence of initial conditions, etc. While doing this, wekeep demonstrating their meaning on concrete examples likethe pendulum in allits variations, and many other examples. Apart from the question how the motionmay depend on the initial state, we also investigate how the total dynamics may de-pend on ‘parameters’ likeω, c, A andΩ, in the equation of motion (1.5), or on moregeneral perturbations of this. Again we will speak in terms of more or less typicalbehaviour when the set of parameter values where this occursis larger or smaller.In this respect the concept ofpersistenceof dynamical properties is of importance.We use this term for a property that remains valid after smallperturbations, whereone may think both of perturbing the initial state and the equation of motion.

In these considerations the occurrence of chaotic dynamicsin deterministic systemswill receive a lot of attention. Here we shall pay attention to the possible scenariosaccording to which orderly dynamics can turn into chaos, seeAppendix C. Thisdiscussion includes the significance that chaos has for physical systems, like themeteorological system and its bad predictability. Also we touch the difference be-tween chaotic, deterministic systems and stochastic systems, i.e., systems that areinfluenced by randomness.

1.2 General definition of dynamical systems

Essential for the definition of a dynamical system isdeterminism, a property wealready met before. In case of the pendulum, the present state, i.e., both the positionand the velocity, determines all future states. Moreover, the whole past can bereconstructed from the present state. The central concept here is that ofstate, inthe above examples coined as positionand velocity. All possible states togetherform thestate space, in the free pendulum case the(ϕ, ϕ′)-plane or the phase plane.For a functionϕ = ϕ(t) that satisfies the equation of motion, we call the points

9That is, 2-dimensional Lebesgue measure.

1.2 General definition of dynamical systems 15

evolutionevolution!operatorstate space

(ϕ(t), ϕ′(t)) of the state space, seen as a function of the timet, an evolution. Alsothe map (curve)t 7→ (ϕ(t), ϕ′(t)) is called evolution. We now aim to express thedeterminism, that is, the unique determination of each future state by the presentone, in terms of amapas follows. If(ϕ, ϕ′) is the present state (att = 0) andt > 0is an instant in the future, then we denote the state at timet by Φ((ϕ, ϕ′), t). Thisassumes that the present state determines all future states, i.e., that the system isdeterministic. So, ifϕ(t) satisfies the equation of motion, then we have

Φ((ϕ(0), ϕ′(0)), t) = (ϕ(t), ϕ′(t)).

The mapΦ : R2 × R → R2

constructed in this way, is called theevolution operatorof the system. Note thatsuch an evolution operator also can ‘reconstruct the past’.

We now generally define a dynamical system as a structure, consisting of astatespace, also calledphase space, to be indicated byX, and anevolution operator

Φ : X × R → X. (1.6)

Lateron we also have to deal with situations where the evolution operator has asmaller domain of definition, but first, using the pendulum case as a leading exam-ple, we shall develop a few general properties for evolutionoperators of dynamicalsystems.

The first property to be discussed is that for anyx ∈ X necessarilyΦ(x, 0) = x.This just means that when a state evolves during a time interval of length0, thestate remains unchanged. This can be interpreted as a kind ofcontinuity property:in an infinitesimally small time interval the state can’t undergo a finite change. Thesecond property we mention is thatΦ(Φ(x, t1), t2) = Φ(x, t1 + t2). This has to dowith the following. If we consider the free pendulum, then, if t 7→ ϕ(t) satisfies theequation of motion, this also holds fort 7→ ϕ(t+ t1), for any constantt1.

These two properties together often are called thegroup property.To explain thisterminology we rewriteΦt(x) = Φ(x, t). Then the group property is equivalent tosaying that the mapt 7→ Φt is a group morphism ofR, as an additive group, to thegroup of bijections ofX, where the group operation is composition of maps. Thelatter also can be expressed in a more common form:

1. Φ0 = IdX (here IdX is the identity map ofX;

2. Φt1 Φt2 = Φt1+t2 (where indicates composition of maps).


In the above terms we define an evolution of a general dynamical system as a map(or curve) of the formt 7→ Φ(x, t) = Φt(x), for a fixedx ∈ X, called the initialstate. The image of this map also is called evolution. We can think of an evolu-tion operator as a map that defines a flow in the state spaceX, where the pointx ∈ X flows along the orbitt 7→ Φt(x). This is why the mapΦt sometimes also iscalledphase flowover timet. Notice that in the case where the evolution operatorΦ is differentiable, the mapsΦt arediffeomorphisms. This means that eachΦt isdifferentiable with a differentiable inverse, in this casegiven byΦ−t. For more in-formation on concepts like diffeomorphisms, the reader is referred to the AppendixA or to Hirsch [112].

1.2.1 Differential equations

Dynamical systems as just defined, are almost identical withsystems of first orderordinary differential equations. In fact, if for the state spaceX we take a vectorspace andΦ at least of classC2, i.e., twice continuously differentiable, then wedefine theC1-mapf : X → X by

f(x) =∂Φ

∂t(x, 0).

Then it is not hard to check that a curvet 7→ x(t) is an evolution of the dynamicalsystem defined byΦ, if and only if it is a solution of the differential equation

x′(t) = f(x(t)).

For the Theory of (Ordinary) Differential Equations we refer to [114, 92, 43] andto [10]. A few remarks are in order on the relationship between dynamical systemsand differential equations.

Remarks.

- In the case of the free pendulum we were given a second order differentialequationϕ′′ = −ω2 sinϕ− cϕ′ and not a first order equation. However, thereis a standard way to turn this equation into a system of first order equations.Writing x = ϕ andy = ϕ′, we obtain

x′ = y

y′ = −ω2 sin x− cy.

- In the forced pendulum case we not only had to deal with a second orderequation, but moreover, the timet also occurs explicitly in the ‘right handside’. We notice that here it is not the case that for a solution t 7→ ϕ(t) of the


t

(cos Ωt, sin Ωt)

Figure 1.8: A circle that originates by identifying the points of R, the distance ofwhich is an integer multiple of2π

Ω.

equation of motion, alsot 7→ ϕ(t+ t1) is a solution.10 This can be dealt withVOETNOOT?(FT) as follows. We just add a state variable that ‘copies’ the role of time. This is

a variablez, for whichz′ = 1. Thus, for the forced pendulum we get:

x′ = y

y′ = −ω2 sin x− cy + A cosΩz (1.7)

z′ = 1.

It may be clear that(x(t), y(t), z(t)) is a solution of the system (1.7) of firstorder differential equations, if and only ifϕ(t) = x(t − z(0)) satisfies theequation of motion of the forced pendulum. So we have eliminated the timedependence by raising the dimension of the state space by one.

Remark (Digression on Ordinary Differential Equations II) . In continuation ofRemark 1.1.1.1 we now review a few more issues on ordinary differential equations,for more information again referring to [114, 92, 10, 43].

1. In the pendulum with periodic forcing, the timet, in the original equationof motion, occurs in a periodic way. This means that a shift int over aninteger multiple of the period2π

Ωdoes not change anything. This means that

also in the system (1.7) for(x, y, z) a shift in z over an integer multiple of2πΩ

leaves everything unchanged. Mathematically this means that we mayconsiderz as a variable inR/

(

2πΩ

Z)

, i.e., in the set of real numbers in whichtwo elements are identified whenever their difference is an integer multiple of2πΩ. As indicated in Figure 1.8 the real numbers by such an identification turn

into a circle.10Unlesst1 = 2kπ/Ω, for somek ∈ Z, i.e., for an integer multiple of the period of forcing.


In this interpretation the state space is no longerR3, but rather

R2 × R/(

2πΩ Z)

.

Incidentally note that in this example the variablex also is an angle and canbe considered as a variable inR/(2πZ). In that case the state space would be

X = (R/(2πZ)) × R ×(

R/(

2πΩ Z))

.

This gives examples of dynamical systems, the state space ofwhich is aman-ifold, instead of just an open subset of a vector space. For generalreferencesee [112, 181, 48].

2. Furthermore it is not entirely true that any first order ordinary differentialequationx′ = f(x) always defines a dynamical system. In fact, examplesexist, even with continuousf, where an initial conditionx(0) does not deter-mine the solution uniquely. However, such anomalies do not occur wheneverf is at least of classC1.

3. Another problem is that differential equations exist with solutionsx(t) thatare not defined for allt ∈ R, but that for a certaint0 ‘tend to infinity’, i.e.,

limt→t0

||x(t)|| = ∞.

In our definition of dynamical systems, such evolutions are not allowed. Wemay include such systems aslocal dynamical systems, for more examples andfurther details see below.

Before arriving at a general definition of dynamical systems, we mention that wealso allow evolution operators, where in (1.6) the set of real numbersR is replacedby a subsetT ⊂ R. However, we do require the group property. This means thatwe have to require that0 ∈ T, while for t1, t2 ∈ T we also want thatt1 + t2 ∈ T.This can be summarised by saying thatT ⊆ R should be anadditive semigroup.We call T the time setof the dynamical system. The most important examplesthat we will meet are, next toT = R andT = Z, the casesT = R+ andT =Z+, being the sets of nonnegative reals and integers, respectively. In the caseswhereT is a semigroup and not a group, it is possible that the mapsΦt are non-invertible: the mapt 7→ Φt then is a morphism of semigroups, running fromTto the endomorphisms semigroup ofX. In these cases, it is not always possibleto reconstruct past states from the present one. For examples we refer to the nextsection.

As dynamical systems with time setR are usually given by differential equations,so dynamical systems with time setZ are given by an invertible map (or automor-phism) and systems with time setZ+ by an endomorphism. In the latter cases thismap is given byΦ1.


dynamical systemrestriction

Definitie 1.1 (Dynamical System).A dynamical systemconsists of a state spaceX, a time setT ⊆ R, being an additive subgroup, and an evolution operatorΦ :X × T → X satisfying the group property, i.e.,Φ(x, 0) = x and

Φ(Φ(x, t1), t2) = Φ(x, t1 + t2)

for all x ∈ X andt1, t2 ∈ T.

Often the dynamical system is denoted by the triple(X, T,Φ). If not explicitly saidotherwise, the spaceX is at least a topological space and the operatorΦ continuous.

Apart from systems strictly satisfying Definition 1.1, we also knowlocal dynamicalsystems.Here the evolution operatorΦ is not defined on the entire productX × T.We keep the property thatX × 0 belongs to the domain of definition ofΦ, thatΦ(x, 0) = x for all x ∈ X and thatΦ(Φ(x, t1), t2) = Φ(x, t1 + t2), as far as both(x, t1) and(Φ(x, t1), t2) are in the domain of definition ofΦ.

1.2.2 Constructions of dynamical systems

In this part we deal with a few elementary, but important, constructions that turn agiven dynamical system into another one. In all these constructions we start with adynamical system as in Definition 1.1, so with state spaceX, time setT ⊂ R andevolution operatorΦ.

1.2.2.1 Restriction

A subsetA ⊂ X is calledinvariant under the given dynamics ifΦ(A × T ) = A.In that case we define the restriction ofΦ to A as a dynamical system with statespaceA, time setT and evolution operatorΦA = Φ|A×T . Important cases wherethis construction is applied is the restriction, whereA is anattractoror aninvariantsubmanifold. For details see below.

A variation of this construction occurs where one only gets alocal dynamical sys-tem. This happens when we restrict to a subsetA that isnot invariant. In generalwe then get an evolution operator that is not defined on the entire productA × T.Restriction toA only makes sense when the domain of definition is ‘not too small’.As an example think of the following. Suppose thata ∈ X is a stationary point,which means thatΦ(a, t) = a for all t ∈ T. Assuming thatΦ is continuous, then thedomain of definition of its restriction to a neighbourhoodA of a consists of pairs(x, t) such that bothx ∈ A andΦ(x, t) ∈ A, thus forming a neighbourhood ofa × T in A× T.

A restriction as constructed in the previous example will beused to study the dy-namics in a small neighbourhood of a stationary point, in this case the pointa.Often


restriction!in thenarrower sense

restriction!in thewider sense

discretisation

we shall restrict even more, since we are not interested in evolutions that start inAand, after a long transient thoughX, return toA. In that case we restrict the domainof definition ofΦ even further, namely to the set of pairs(x, t) wherex ∈ A, t ∈ Tand where for allt′ ∈ T with 0 ≤ t′ ≤ t (or if t < 0, with 0 > t′ ≥ t), one hasΦ(x, t′) ∈ A.

Both forms of restriction are used and often it is left implicit which is the case. Inthe latter case we sometimes speak ofrestriction in the narrower senseand in theformer ofrestriction in the wider sense.

1.2.2.2 Discretisation

In the case whereT = R (or T = R+) we can introduce a new system with time setT = Z (or T = Z+) by defining a new evolution operatorΦ : X × T → X by

Φ(x, n) = Φ(x, hn),

whereh ∈ R is a positive constant. This construction has some interestin thetheory of numerical simulations, that often are based on (anapproximation of) sucha discretisation. Indeed, one computes subsequent states for instants (times) thatdiffer by a fixed valueh. Such an algorithm therefore is based on (an approximationof) Φh and its iterations. In the case whereΦ is determined by a differential equation

x′ = f(x), x ∈ Rn, (1.8)

the simplest approximation ofΦh is given by

Φh(x) ≈ x+ hf(x).

The corresponding method to approximate solutions of the differential equation(1.8) is called the Euler method, whereh is the step size.

Remark. Not all dynamical systems withT = Z can be obtained in this way,namely by discretizing an appropriate system with time setR and continuous evo-lution operator. A simple example of this occurs when the mapϕ : R → R, usedto generate the system by iteration, has a negative derivativeϕ′ everywhere. (Forexample just takeϕ(x) = −x.) Indeed, for any evolution operatorΦ of a contin-uous dynamical system with state spaceX = R and time setT = R, we havethat for x1 < x2 ∈ R necessarilyΦt(x1) < Φt(x2) for all t. This is true, sinceΦt is invertible, which implies thatΦt(x1) 6= Φt(x2). Moreover, the differenceΦt(x2) − Φt(x1) depends continuously ont, and has a positive sign fort = 0.Therefore, this sign necessarily is positive for allt ∈ R. Now observe that this iscontrarious to what happens when applying the mapϕ. Indeed, sinceϕ′(x) < 0everywhere, forx1 < x2 it follows thatϕ(x1) > ϕ(x2). Therefore, for any constanth it follows thatϕ 6= Φh and the dynamics generated byϕ cannnot be obtained bydiscretisation.


suspension1.2.2.3 Suspension

We start with a dynamical system(X, T,Φ) with time setT = Z, the suspension ofwhich will be a dynamical system with time setR. The state space of the suspendedsystem is defined as

X = X × R/ ∼,where∼ is the equivalence relation given by

(x1, s1) ∼ (x2, s2) ⇔ s2 − s1 ∈ Z andΦs2−s1(x2) = x1.

Another way to constructX is to takeX × [0, 1], and to identify the ‘boundaries’X×0 andX×1 such that the points(x, 1) and(Φ1(x), 0) are ‘glued together’.The fact that the two constructions give the same result can be seen as follows.The equivalence class inX × R containing the point(x, s) contains exactly oneelement(x′, s′) with s′ ∈ [0, 1). The elements inX×1 andX×0 are pairwiseequivalent, such that(x, 1) ∼ (Φ1(x), 0), which explains the ‘glueing prescription’.The evolution operatorΦ : X × R → X now is defined by

Φ([x, s], t) = [x, s + t],

where[−] indicates the∼-equivalence class.

Remark (Topological complexity of suspension).LetX be connected, then aftersuspension, the state space isnotsimply connected, which means that there exists acontinuous curveα : S1 → X that, withinX, can’t be deformed continuously to apoint. For details see the exercises; for general referencesee [112].

Example 1.1 (Suspension to cylinder or Mobius strip). As an example of thisconstruction we consider the suspension of two dynamical systems both withX =(−1,+1). We recall thatT = Z, so in both cases the system is generated by aninvertible map onX. In the first case we takeΦ1

1 = Id, the identity map, and inthe second caseΦ1

2 = −Id. To construct the suspension we considerX × [0, 1] =(−1, 1)× [0, 1]. In the former case we identify the boundary points(x, 0) and(x, 1):in this way X is the cylinder. In the latter case we identify the boundary points(x, 0) and(−x, 1), which leads to the Mobius strip. Compare with Figure 1.9. Inboth cases the suspended evolutions are periodic. For the suspension in the casewhereΦ1

1 = Id, we follow the evolution that starts at[x, 0]. After a timet this pointhas evolved till[x, t]. Sinced in this caseΦn(x) = x, the pairs(x, t) and (x′, t′)belong to the same equivalence class if and only ifx = x′ andt − t′ is an integer.This implies that the evolution is periodic with period 1. Inthe second case whereΦ1

1 = −Id, we have thatΦn(x) = (−1)nx. This means that the evolution starting at[0, 0], is periodic with period 1, since also here the pairs(0, t) and(0, t′) belong tothe same equivalence class if and only ift− t′ is an integer. However, any evolution


1

1

1

1

0

0

0

0

−1

−1

Figure 1.9: By ‘glueing’ the opposite edges of rectangle we can either obtain acylinder or a Mobius strip. On both cylinder and Mobius strip we indicated how thenew evolutions run, respectively in the cases whereΦ1

1 = Id andΦ12 = −Id, in both

cases withX = (−1,+1).

starting at[x, 0], with x 6= 0, is periodic with period 2. Indeed, the pairs(x, 0) and(x, 2) are equivalent, but for no0 < t < 2 the pairs(x, 0) are(x, t) equivalent. Thefact that evolutions that do not start ‘right in the middle’ have a double period, canalso be seen in Figure 1.9.

Observe that this construction, foranydynamical system with time setZ, providesa dynamical system with time setR. However, we like to point out that not everydynamical system with time setR can be obtained in this way. In fact, a systemobtained by suspension cannot have stationary evolutions.This means that dynam-ical systems with time setR having stationary evolutions, like the examples of thefree pendulum with or without friction, cannot be obtained as the outcome of asuspension construction.

We finally note the following. If we first suspend a dynamical system with timesetZ to a system with time setR, and we discretise subsequently (with constanta = 1 in the discretisation construction), after which we restrict to[x, s] | s = 0,we have returned to our original system, where we don’t distinguish betweenx and[x, 0].


Poincar“’e map1.2.2.4 Poincare map

We now turn to the construction of a Poincare map: a kind of inverse of the suspen-sion construction described above. We start with a dynamical system with time setR, given by a differential equationx′ = f(x), assuming that in the state spaceXthere is a submanifoldS ⊂ X of codimension 1 (meaning thatdimX = dimS+1),with the following properties:

1. For allx ∈ S the vectorf(x) is transversal toS. For a discussion on transver-sality see Appendix A or [112].

2. For any pointx ∈ S there exist real numberst−(x) < 0 < t+(x), such thatΦ(x, t−) ∈ S andΦ(x, t+) ∈ S.

We call t−(x), t+(x) the return times and in the sequel we assume that bothhave been chosen with minimal absolute value. This means that both aretimes of first return inS, either for negative or for positive time.

In the above setting we define a diffeomorphism

ϕ : S → S by ϕ(x) = Φ(x, t+).

Note that this indeed defines a diffeomorphism, since an inverse of this map simplyis given byx 7→ Φ(x, t−). We call ϕ the Poincare map ofS. The correspondingdynamical system onS has an evolution operator given byΦ(x, n) = ϕn(x).

One may well ask whether the requirement of the existence of asubmanifoldS ⊂ Xis not too restrictive. That the requirementis restrictive can be seen as follows.For X = Rn 11 and Φ continuous, a connected submanifoldS ⊂ Rn with theabove requirements does not exist; this can be seen using some Algebraic Topology,e.g., see [132], but we shall not discuss this subject here. Also see Excercise 1.22.We now discuss for which differential equations a Poincaremap construction isapplicable. For this we need to generalise the concept of differential equation to thecase where the state space is a generalmanifold, again compare with [112, 181, 48].In the differential equationξ′ = f(ξ) we then interprete the mapf as a vector fieldon the state space, compare with Appendix A or with the above references. Wewill not enter into all complications here, but refer to the example of the forcedpendulum, whereξ = (x, y, t) the state space has the formR2 ×

(

R/(2πΩ

Z))

. Thisis an example of a 3-dimensional manifold, since it locally can’t be distinguishedfrom R3. In this example it is easy to construct a submanifoldS with the desiredproperties: we just can takeS = (x, y, [0]). In this case the Poincare map alsois called period map or stroboscopic map, compare with§1.1.2, where we already

11Vector spaces are simply connected.


S1 = R/(2πΩ

Z)

S f(x)

x

Figure 1.10: The state spaceR2 × S1, with S1 = R/(2πΩ

Z), where forR2 we tooka discD. Also we indicate the submanifoldS as well as an evolution of the 3-dimensional system. Here two subsequent passages throughS are visible, as wellas the ‘definition’ of the image of one of the points ofS under the Poincare map.

met this particular situation. Compare with Figure 1.10. Toget a better view werestricted the(x, y)-plane to a finite discD, so obtaining a so-calledsolid torus. Fora given point(x, y, [0]) ∈ S we depict the evolution of the 3-dimensional systemstarting at(x, y, [0]) and also the image of(x, y, [0]) under the Poincare map. Fromnow on we again use the general notationx instead of(x, y, [0]) for a point inS.

To clarify the connection between suspension and Poincaremap, we take a dynam-ical system with time setZ of which the state spaceX is a manifold (e.g., a vectorspace) and whereΦ1 is a diffeomorphism. It can be shown that the state spaceXof the suspension also is a manifold (although definitely nota vector space) andthat the evolution operatorΦ is determined by a differential equation onX. Weretrieve the original system by constructing a Poincare map for the submanifoldS = [x, 0] ⊂ X. Compare with Example 1.1, for the suspensions of±Id asillustrated by Figure 1.9. Notice that here the return timest−(x) andt+(x), as in-troduced above, for allx ∈ S are equal to±1. In general, however, the return timest±(x) will vary with x ∈ S.

Often the term Poincare map is used in a wider sense. To explain this take a sub-manifoldS of codimension 1, for which we still require that for eachx ∈ S thevectorf(x) is not tangent toS. However, we do not require the return times to bedefined for all points ofS. The Poincare map then has the same domain of definition

1.3 Examples 25

Van der Pol systemast+. In this way we obtain alocal dynamical system. The evolution operator thencan be defined byΦn = ϕn, whereϕ−1 has the same domain of definition ast−.

One of the most important cases where such a locally defined Poincare map occurs,is at periodic evolutions of (autonomous) differential equationsx′ = f(x). In thestate space the periodic evolution determines a closed curve γ. ForS we now takea codimension 1 submanifold that intersectsγ transversally. This means thatγ andS have one pointp in common, whereγ is transversal toS. In that casef(p) is nottangent toS. By restrictingS to a (possibly) smaller neighbourhood ofp we canachieve that for any pointq ∈ S the vectorf(q) is not tangent toS. In this case thePoincare map is defined on a neighbourhood ofp in S. Lateron we shall return tothis subject.

1.3 Examples

In this section we discuss a few examples of dynamical systems from a widelydiverging background. These examples frequently occur in the literature and theconcepts of the previous section can be well illustrated on them. Also we shall meetnew phenomena.

1.3.1 A Hopf bifurcation in the Van der Pol equation

The equation of Van der Pol was designed as a model for an electronic oscillator,compare with [158, 159]. We shall not go into the derivation of this model. In anycase it is a mathematical description of obsolete technology based on radio tubes,the predecessors of our present transistors. We first describe the equation, whichturns out to have a periodic attractor and next introduce a parameter dependence inthe system.

1.3.1.1 The Van der Pol equation

The present interests in the Van der Pol equation is that it serves as an exampleof a (nonlinear) oscillator with a partially ‘negative friction’. The equation can bewritten in the form

x′′ = −x− x′(x2 − 1), (1.9)

as usual leading to the system

x′ = y

y′ = −x− y(x2 − 1).


We recognise the equation of a damped harmonic oscillator, where the dampingcoefficientc has been replaced by(x2−1). This means that for|x| < 1 the dampingis negative, increasing the energy instead of decreasing. For |x| > 1 we have theusual positive damping. To make these remarks more precise we define the energyof a solutionx(t) by

E(t) = 12

(

x(t)2 + x′(t)2)

.

By differentiation we then find that

E ′ = xx′ + x′x′′ = xx′ + x′(−x− x′(x2 − 1)) = −(x′)2(x2 − 1),

from which our assertions easily follow.

Remark. Let us briefly discuss the present terminology. In the Van derPol equation(1.9) we consider a position dependent dampingc(x) = x2 − 1. Notice that withoutany damping we would be just dealing with the harmonic oscillatorx′′ = −x, whichconserves the energyE(x, x′) = 1

2 (x2 + (x′)2) . In this way we observe the effectof an unconventional damping on conventional energy.

From the properties of this ‘friction’ it follows on the one hand that very smalloscillations gain in amplitude, since they occur in an area with negative friction. Onthe other hand very large oscillations will decrease in amplitude, since these mainlyoccur in an area with positive friction. One can show, compare with [114], that eachsolution of this equation, with exception of the zero solution, converges to one andthe same periodic solution. The Van der Pol equation (1.9) therefore describes asystem that, independent of the initial state different from x = 0 = x′, eventuallywill end up in the same periodic behaviour. It can even be shown that this propertyis robustor persistent, in the sense that small changes of the ‘right hand side’ ofthe differential equation will not change this property. Ofcourse, amplitude andfrequency of the oscillationcan shift by such a small change. Extensive researchhas been done on the behaviour of the solutions of (1.9) when aperiodic forcing isapplied, for more information compare with [6].

1.3.1.2 Hopf bifurcation

We now introduce aparameterin the Van der Pol equation (1.9), assuming thatthis parameter can be tuned in an arbitrary way. The questionis how the dynam-ics depends on such a parameter. More in generalbifurcation theorystudies thedynamics as function of one or more parameters, for a discussion and many ref-erences see Appendix C. In this example we take the friction variable, so that wecan make a continuous transition from the Van der Pol equation to an equation withonly positive, i.e., the more usual, friction. Indeed we consider

x′′ = −x− x′(x2 − µ). (1.10)

1.3 Examples 27

Hopf!bifurcation

x

y

Figure 1.11: Phase portrait of the Van der Pol oscillator (1.11).

The value of the parameterµ determines the area where the friction is negative: forgivenµ > 0 this is the area given by|x| < √

µ. We now expect:

1. Forµ ≤ 0 the equation (1.10) has no periodic solutions, but all solutionsconverge to the zero-solution;

2. Forµ > 0 all solutions of (1.10) (with the exception of the zero-solution)converge to a fixed periodic solution, the amplitude of whichdecreases asµtends to0. Compare with the Figures 1.11 and 1.12.

These expectations indeed are confirmed by the numerical phase portraits of Figure1.12, and it is even possible to prove these facts [116, 174].The transition atµ =0, where the behaviour of the system changes from non-oscillating to oscillatingbehaviour is an example of abifurcation, in particular aHopf bifurcation.12 Ingeneral a bifurcation occurs at parameter values where the behaviour of the systemschangesqualitatively. This as opposed toquantitativechanges, where for instance,the amplitude or the frequency of the oscillation shift a little.

12Eberhard Hopf 1902-1983


H“’enon!system

x

x′

(a)

x

x′

(b)

x

x′

(c)

Figure 1.12: A few evolutions of the parametrised Van der Polequation (1.10). a:µ < 0, b: µ = 0 and c:µ > 0. Compare with Figure 1.11.

1.3.2 The Henon map; saddle points and separatrices

The Henon13 map defines a dynamical systems with state spaceR2 and with timesetZ. The evolution operatorΦ is given by

Φ1 = Ha,b : (x, y) 7→ (1 − ax2 + y, bx),

wherea andb 6= 0 are constants. For the moment we fixa = 1.4, b = 0.3. For thesestandard values the map is simply calledH ; this is the original Henon map [108].That we may indeed take the time setZ, follows from the fact thatHa,b is invertiblefor b 6= 0, compare with Excercise 1.9.

The Henon map is one of the oldest and simplest examples of anexplicitly givensystem, the numerical simulation of which raises the conjecture that a general evo-lution will never tend to a stationary or (quasi-) periodic evolution. The examplewas published in 1976, while only in 1991, i.e., 15 years later, it was mathematicallyproven that indeed many evolutions of this system never tendto a (quasi-) periodicevolution [42]. The precise meaning of ‘many’ for this case will be discussed in alater chapter.

To get an impression of the dynamics of this system and of whatHenon in [108]described as a ‘strange attractor’, we should undertake thefollowing computer ex-periment. Take a more or less arbitrary initial point(x0, y0) ∈ R2 and plot sub-sequently the pointsHj(x0, y0) = (xj , yj) for j = 1, . . . , N, whereN is a largenumber. Now there are two possibilities:14 either the subsequent points get furtherand further apart (in which case one would expect an ‘error’ message since the ma-chine does not accept too large numbers), or, disregarding the first few (transient)

13Michel Henon 1931-14Theoretically there are more possibilities, but these are probably just as exceptional as the mo-

tions related to the upside down free pendulum.

1.3 Examples 29

H“’enon!attractor

x

y

Figure 1.13: The attractor of the Henon map and three magnifications, each time bya factor of 10.

iterates, a configuration shows up as depicted in Figure 1.13. Whenever we take theinitial point not too far from the origin, always the second alterative will occur. Weconclude that apparently any evolution that starts not too far from the origin, in a rel-atively brief time, ends up in the configuration of Figure 1.13. Moreover, within theaccuracy of the graphical representation, the evolution also visits each point of thisconfiguration. Since the configuration attracts all these evolutions, it will be calledandattractor. In Chapter 4 we will give a mathematical definition of attractors. Inthe magnifications we see that however far we zoom in, always acertain fine struc-ture seems present, which always seems to have the same complexity. This propertyis characteristic for what is nowadays called afractal, for general background see[131]. The existence of such fractals has led to newdimension concepts, that alsocan attain non-integer values. For general reference see Falconer [95, 96].

We surely can’t give a complete exposition of the propertiesof the Henon mapthat can be mathematically proven, but a single aspect we have to mention here.To start with, we look for stationary evolutions, i.e., points (x, y) ∈ R2 for whichH(x, y) = (x, y). The solutions are

x =−.7 ±

√.49 + 5.6

2.8, y = .3x,

or, in approximation,(.63, .19) and(−1.13,−.34). The former of these two points,


ξ

η

Figure 1.14: Evolutions of the Henon map nearp in the (ξ, η)-coordinates, where‘higher order terms’ have been neglected.

to be calledp, turns out to be situated inside the attractor. We investigate the nearbydynamics aroundp by expansion in a Taylor series

H(p+ (x, y)) = p+ dHp

(

xy

)

+ o(|x|, |y|).

To first approximation we have to deal with the linear map dHp, the derivative ofH at p. A direct computation yields that this matrix has real eigenvalues, one ofwhich is smaller than−1 and the other inside the interval(0, 1). First we considerthe dynamics when neglecting the higher order termso(|x|, |y|).We can choose newcoordinatesξ andη, in such a way thatp in these coordinates becomes the origin,while the coordinate axes are pointing in the direction of the eigenvectors of dHp,see Figure 1.14.

In the(ξ, η)-coordinates the (approximate) Henon map has the form

Happr(ξ, η) = (λ1ξ, λ2η),

with λ1 < −1 andλ2 ∈ (0, 1). This means first of all that both theξ- and theη-axis are invariant by iteration ofHappr. Moreover, points on theξ-axis, by repeatediteration ofHappr get further and further elongated (by repeated applicationof theinverseH−1

appr points on theξ-axis converge top). For points on theη-axis theopposite holds: by iteration ofH these converge top (and under iteration byH−1

appr

get elongated). Points that belong to neither axes move awayboth by iteration ofHappr and ofH−1

appr. Compare this with the behaviour of linear ordinary differentialequations, compare with [114, 92, 43]. All of this holds whenneglecting the higherorder terms. We now indicate, without proofs, what remains of this when the higherorder terms are no longer neglected. Since these statementshold for general 2-dimensional diffeomorphisms, and not only for the Henon map, we shall give ageneral formulation.

1.3 Examples 31

separatrix1.3.2.1 Saddle points and separatrices

Let ϕ : R2 → R2 be a diffeomorphism. We callp ∈ R2 a fixed pointwhenverϕ(p) = p, in other words if the constant mapn ∈ Z 7→ p ∈ R2 is a stationary evo-lution of the dynamical system generated byϕ. Now suppose that the eigenvaluesλ1, λ2 of the derivative map dϕp are real, where moreover|λ1| > 1 > |λ2|; in thiscasep is called a saddle point ofϕ. Compare with the situation for the Henon mapas sketched above.

In the sequelv1, v2 ∈ Tp(R2) are eigenvectors of d(ϕp), belonging to the eigenval-

uesλ1 andλ2, respectively. For a proof of the following theorem we refer to, e.g.,[8].

Theorem 1.1 (Local Separatrices).For a diffeomorphismϕ, with saddle pointp,eigenvaluesλ1, λ2 and eigenvectorsv1, v2 as above, there exists a neighbourhoodU of p such that

W uU(p) = q ∈ U | ϕj(q) ∈ U for all j ≤ 0 and

W sU(p) = q ∈ U | ϕj(q) ∈ U for all j ≥ 0

are smooth curves that contain the pointp and that inp are tangent tov1 andv2,respectively. Moreover, for allq ∈ W u

U(p) or q ∈ W sU(p), respectively,ϕj(q) tends

to p asj → −∞ andj → ∞, respectively.

The curvesW uU(p) andW s

U(p) in the conclusion of Theorem 1.1 are called thelocal unstable separatrixand local stable separatrix, respectively. The(global)separatricesnow are defined by:

W u(p) =⋃

j>0

ϕj(W uU(p)) andW s(p) =

⋃

j<0

ϕj(W sU(p)). (1.11)

Remarks.

- The term ‘separatrix’ indicates that positive orbits are being separated whenthe initial state crosses the separatrix.

- A more general terminology speaks of ‘locally stable’ and ‘locally unstable’manifolds.

1.3.2.2 Numerical computation of separatrices

To get a better heuristic idea, we now describe how such separatrices, or a finitesegment of these, can be approximated numerically. Again weconsider the saddle


Logistic system

Figure 1.15: Left: Henon attractor. Right: Pieces of the stable and unstable separa-tricesW s(p) andW u(p) of the saddle pointp = (.63, .19) of the Henon map. Notethe resemblance between the attractor andW u(p); later we come back to this.

pointp of the diffeomorphismϕ, with eigenvalues|λ1| > 1 > |λ2| and correspond-ing eigenvectorsv1 and v2. For the approximation of the unstable separatrix weinitially take a brief segment[p − εv1, p + εv1]. The smaller we choseε, the bet-ter this segment approximatesW u(p) (the error made here is orderO(ε2).) In thissegment we choseN equidistant points. In order to find a longer segment of theunstable separatrixW u(p), we apply them-fold iterateϕm to all subsequent pointsand connect the resultingN points by segments. The result evidently depends onthe choice of the parametersε, N andm. A segment ofW u(p) of any prescribedlength in this way can be obtained to arbitrary precision, bychosing these parame-ters appropriately. A proof of this statement is not too complicated, but is does useideas in the proof of Theorem 1.1. Therefore, we shall not go into details here.

By numerical approximation of the unstable separatrixW u(p) of the saddle pointpof the Henon map, we obtain the same picture as the Henon attractor, compare withFigure 1.15.

1.3.3 The Logistic system; bifurcation diagrams

The example to be treated now, originally comes from mathematical biology, inparticular from population dynamics. The intention was to give a model for thesuccesive generations of a certain animal species. The simplest models is the so-called ‘model with exponential growth’, where it is assumedthat each individual onthe average gives rise toµ individuals in the next generation. If we denote the size

1.3 Examples 33

of then-th generation byxn, we would expect that

xn+1 = µxn, n = 0, 1, 2, . . . . (1.12)

This indeed is a very simple equation of motion, which is provenly non-realistic.Before making the model somewhat more realistic, we first give a few remarks.

Remarks.

- It should be said that such an equation of motion never can bemore thanan approximationof reality. This is already included in the concept ofµ,the averagenumber of descendants per individuum. Another point is thatevidently the fate of most animal species is also determinedby factors thatwe here completely neglected, such as the climate, the availabity of food, thepresence of natural enemies, etc. What we shall try to give here is a model inwhich only the size of the previous generation on the next is being expressed.

- As already said the model given by (1.12) is quite unrealistic. To see this wecompute its evolutions. For a given initial populationx0 it directly follow thatthenth generation should have size

xn = µnx0.

Here one speak ofexponential growth, since the timen is in the exponent.We now distinguish three case, namelyµ < 1, µ = 1 andµ > 1. By the way,by its definition, necessarilyµ ≥ 0, where the caseµ = 0 is too uninterestingfor further consideration. In the former caseµ < 1, the population decreasesstepwise and will get extinct. Indeed, the ‘real’ size of thepopulation is aninteger number and therefore a decreasing population aftersome time disap-pears completely. In the second caseµ = 1, the population always keeps thesame size. In the latter caseµ > 1, the population increases ad infinitum. Soit would seem that only the second caseµ = 1 is somewhat realistic. How-ever, this situation turns out to be very unstable: already very small changesin the reproduction factorµ leads us into one of the other two cases.

One main reason why the model (1.12) is unrealistic, is that the effects of overpop-ulation are not taken into account. For this we can include a correction in the modelby making the reproduction factor dependent on the size of ofthe population, e.g.,in replacing

µ by µ(1 − xn

K),

whereK is the size of the population that implies such a serious overpopulationthat it immediately leads to extinction. Probably it will beclear that overpopulation


populationdynamics

should have a negative effect, but it may be less clear why thecorrection proposedhere is so relevant. This claim of relevance also will not be made, but the most im-portant reason for this choice is that it is just about the most simpleone. Moreover,it will turn out that the results to be discussed below, do notstrongly depend on theprecise form of the correction for overpopulation. Therefore, from now on we shalldeal with the dynamics generated by the map

xn+1 = µxn(1 − xn

K). (1.13)

We here deal with a system in which a number of parameters still has to be specified,namelyµ enK. Both have a clear biological interpretation:µ is the reproductionfactor for very small populations, whileK indicates for which population size over-population becomes disastrous. In principle we would like to know the dynamicsfor all different values of these parameters. It now turns out that we can simplify thisinvestigation. First of all the parameterK can be ‘transformed away’ as follows. Ifwe defineyn = 1

Kxn, the equation (1.13) turns into

yn+1 =xn+1

K=µxn

K(1 − xn

K) = µyn(1 − yn),

which has the same form as (1.13), but now withK = 1. So we see that we canget rid of K by not dealing with the population size, but by expressing this asthe fraction of the critical sizeK. It will turn out that the parameterµ can not beremoved. So we shall now deal with the dynamics given by

yn+1 = Φ1(yn) = µyn(1 − yn), (1.14)

called theLogistic system. The state space isX = [0, 1], indeed, any valuey > 1under iteration would immediately be succeeded by a negative value. Moreover wesee that it is not enough to requireµ > 0, since forµ > 4 we would have thatΦ1(1

2) /∈ X. So we take0 < µ ≤ 4. Finally it may be clear that the time set shouldbeZ+, since the mapΦ1 is not invertible.

Remarks.

- Note that this description makes sense only when the generations are clearlydistinguishable. For the human population, for example, itmakes no senseto express that two individuals that do not belong to the samefamily, have ageneration difference of, say, 5. For many species of insects, however, eachyear or day a new generation occurs and so the generations canbe numbered.In experimental biology, in this respect thedrosophila melanogaster, a specialkind of fruit fly, is quite popular, since here the dynamics evolves more rapid.

1.3 Examples 35

bifurcation diagram- A related quadratic demographic model with continuous time was introducedby Verhulst15 in 1838. The above counterpart with discrete time was exten-sively studied by May [133]16. For further historical remarks compare withMandelbrot [131] and with Peitgen c.s. [153].

An obvious question now is, how we may investigate dynamics of the Logisticsystem (1.14). In particular it is not so clear what kind of questions should beposed here. The burst of interest for this model was partly due to the biologicalimplications, but also to the numerical experiments that generally preceeded themathematical statements. Indeed, the numerical experiments gave rise to enigmaticfigures, that badly needed an explanation. We shall indicatesome results of thesenumerical experiments. A mathematical analysis of this system would exceed thepresent scope by far. For an introductory study we refer to [5], while an in-depthanalysis can be found in [134].

One possible experiment is to compute evolutions for a number of µ-values. InFigure 1.16 we depicted segments of such evolutions, whereyn is given as a functionof n, for n = 0, 1, . . . , 15, and where the initial value always wasy0 = 0.7. Forlower values ofµ nothing unexpected happens: forµ ≤ 1 the population gets extinctand for1 < µ < 3 the population tends to a stable value that is monotonicallyincreasing withµ. Forµ > 3, however, various things can happen. So it is possiblethat an evolution, after an irregular transient, gets periodic. However, it is alsopossible that the motion never tends to anything periodic, althougth such a statementevidently cannot be made on the basis of a finite numerical experiment. Moreover,it can be shown that the observed dynamical behaviour almostsolely depends on thevalue ofµ and is only influenced little by the choice of the initial statey0. The word‘little’ implies that there are exceptions, like the initial valuesy0 = 0 or y0 = 1.However, when chosing the initial valuey0 ∈ [0, 1] completely randomly,17 theprobability to get into the set of such exceptional values, is zero.

This way of analysing the Logistic system (1.14) is not yet very satisfactory. In-deed, it becomes clear that the dynamics strongly depends onthe value ofµ, atleast when3 ≤ µ ≤ 4, and drawing graphs for each value ofµ is not an option.Therefore we shall discuss abifurcation diagram, although this term has not yetbeen given a formal mathematical significance. It is a diagram in which graphicalinformation is given on the way in which the system depends onparameters. Assaid earlier, we speak of ‘bifurcation’ whenever the dynamics, as a function of oneor more parameters, changes qualitatively. In Figure 1.17 we show the result of the

15Pierre Francois Verhulst 1804-184916Robert May 1936-17One may think of the uniform distribution associated to the Lebesgue measure on[0, 1].


0

0.5

0.5

1

1

µ = 0.5

0

0.5

0.5

1

1

µ = 1.5

0

0.5

0.5

1

1

µ = 2.5

0

0.5

0.5

1

1

µ = 3.5

0

0.5

0.5

1

1

µ = 3.6

0

0.5

0.5

1

1

µ = 3.7

Figure 1.16: Six evolutions of the Logistic system (1.14), where each time the resultof 15 successive iterations is shown, to begin withy0 = 0.7. The subsequentµ-values areµ = 0.5, 1.5, 2.5, 3.5, 3.6 and3.7.

1.3 Examples 37

perioddoubling!bifurcation

perioddoubling!sequence

x

µ3 4

Figure 1.17: Bifurcation diagram of the Logistic system (1.14) for 2.5 < µ < 4.For each value ofµ, on the corresponding vertical line, we indicated they-values in[0, 1] that belong to the associated ‘attractor’.

following numerical experiment. For a large number of valuesµ ∈ [3, 4] we deter-mined which values ofy occur, after repeated iteration ofΦ1

µ(y) = µy(1−y),whenneglecting a transient initial segment. To this end we have taken the pointsΦj

µ(y0)for j = 300, 301, . . . , 1000 andy0 = 0.5. The motivation is given by the hope thatafter 300 iterates the consequences of the more or less random initial choice wouldbe no longer visible and we also hoped that in the ensuing 700 iterates all possi-bly occurring values ofy would have been found, at least within the precision ofthe graphical presentation used. In this presentation, thehorizontal coordinate isµ ∈ [2.5, 4]. The stepsize in theµ-direction was determined by the pixel size of thecomputer screen. They-values found in this way, determined the vertical coordi-nates of the points represented. The sets detected in this way, for the various valuesof µ, can be viewed as numerical approximations of the attractorsas these occur independence ofµ. Compare with the way we found a numerical approximation forthe Henon attractor.

In the ‘middle’ part of Figure 1.17, for each value ofµ we witness two values ofy. We conclude that the corresponding dynamics is periodic with period 2. For ahigher value ofµ we observe a splitting, after which for each value ofµ four val-ues ofy occur: here the dynamics has period 4. This continues over all powers of2, which can not be totally derived from Figure 1.17, but at present this is knownfrom theoretical investigations. The period doubling bifurcation will be discussedin Appendix C. After the infinite period doubling sequence has been passed, ap-


Newton algorithm proximately atµ = 3.57, the diagram gets even more involved. For many values ofµ it seems as if we have one or more intervals in they-direction, while for other val-ues ofµ only finitely manyy-values are obtained. The latterµ-values occur whereone observes ‘gaps’ in the diagram. This is most clearly witnessed in theµ-intervalwith period 3, but also the period 5 area is clearly visible. In a later chapter we willcontinue this discussion.

1.3.4 The Newton algorithm

We next turn to the Newton algorithm as this is used for determining a zero of afunction. We show how this algorithm gives rise to a dynamical system, investi-gate how this algorithm can fail (even persistently) and show one of the miraculouspictures provoked by such a dynamical system.

We start giving a short description of the Newton algorithm,which aims at findingthe zero of a real functionf. Although we might consider more general cases, weassume thatf : R → R is a polynomial map. The finding of the zero then runsas follows. First we look for a rough approximationx0 of the zero, for instanceby a graphical analysis. Next we apply a method to be explained below and thathopefully provides us with a better approximation of the zero. This improvementcan be repeated ad libitum, by which an arbitrarily sharp precision can be obtained.

The method to obtain a better approximation of the zero consists of replacing theintial guessx0 by

x1 = Nf(x0) = x0 −f(x0)

f ′(x0). (1.15)

We shall callNf the Newton operator associated tof. First we explain why weexpect that this replacement gives a better approximation.Indeed, we constructan approximation off by the linear polynomialLx0,f : R → R, determined byrequiring that both the value of the function and its first derivative off and ofLx0,f

coincide inx0. This means that

Lx0,f(x) = f(x0) + (x− x0)f′(x0)

In a small neighbourhood ofx0 this is a good approximation, since by Taylor’sformula

f(x) − Lx0,f(x) = O(|x− x0|2)asx → x0. Although in general the detection of a zero off may be an impossibleassignment, finding a zero ofLx0,f is simple: it is just given by0 = f(x0) +(x − x0)f

′(x0) or x = x0 − f(x0)/f′(x0), which exacly is the valuex1 = Nf(x0)

proposed in (1.15) as the better approximation of the zero off, also compare withFigure 1.18. So, instead of a zero off we determine a zero of an approximation

1.3 Examples 39

x

f(x)

x0x1

Figure 1.18: Newton operator for a polynomialf applied to a pointx0.

of f, which gets ever better when we repeat the approximation. It can indeed beproven, that when choosingx0 sufficiently close to the zero at hand, the pointsxj ,defined by

xj = Nf(xj−1) = (Nf)j(x0),

converge to that zero. See Exercise 1.16.

It now seems feasible to define an evolution operator by

Φ(x, n) = (Nf)n(x).

However, there is a problem: in a pointx ∈ R wheref ′(x) = 0, the expression(Nf)(x) is not defined. We can solve this problem by chosing as a state spaceX = R ∪ ∞ and extending the definition ofNf as follows:

1. Nf(∞) = ∞;

2. If 0 = f ′(x) = f(x), thenNf(x) = x;

3. If 0 = f ′(x) 6= f(x), thenNf(x) = ∞.

In this wayNf is defined on the whole state spaceX and we have thatx ∈ X is afixed point ofNf , i.e., thatNf(x) = x, if and only if eitherf(x) = 0 or x = ∞. Inthe case wheref is a polynomial of degree at least 2, the mapNf is evencontinuous.To make this statement precise, we have to endowX with a topology. To this endwe show thatX in a one-to-one way can be mapped onto the circle and how thiscircle can be endowed with a metric, and hence a topology.


NP

P

P ′

Figure 1.19: Stereografic projectionS1 \ P → R, that mapsP toP ′.

1.3.4.1 R ∪ ∞ as a circle: stereographic projection

We view the line of real numbersR as thex-axis of the planeR2 = x, y. In thesame plane we also have the unit circleS1 ⊂ R2 given byx2 + y2 = 1. NamingNP = (0, 1) ∈ S1, the North Pole ofS1, we now projectS1 \ NP → R asfollows. For any pointP = (x, y) ∈ S1 \ NP consider the straight line thatconnectsP with the North PoleNP. The intersectionP ′ of this line with thex-axisis the image of(x, y) in R, compare with Figure 1.19.

In this way we get a one-to-one correspondence between the points of R andS1 \ NP. We call this map thestereographic projection, that exists in higherdimensions as well. Indeed, completely analoguous we can consider the unit sphereSn−1 ⊂ Rn, define the North PoleNP = (0, 0, . . . , 0, 1) and define a mapSn−1 \NP → Rn−1 in a completely similar way. Returning to the casen = 2, we nowget the one-to-one correspondence betweenX andS1 by associating∞ toNP.

OnS1 we can define the distance between two points in different ways. One way isthe Euclidean distance inherited from the plane: for points(x1, y1) and(x2, y2) ∈ S1

we define

dE((x1, y1), (x2, y2)) =√

(x1 − x2)2 + (y1 − y2)2.

Another way is to take for two points(cosϕ1, sinϕ1) and(cosϕ2, sinϕ2) ∈ S1 thedistancealongthe unit circle by

dS1((cosϕ1, sinϕ1), (cosϕ2, sinϕ2)) = minn∈N|ϕ1 − ϕ2 + 2nπ|.

It is not hard so show that these two metrics are equivalent inthe sense that

1 ≤ dS1

dE≤ π.

We discussed these metrics to some detail, since this will also be useful later.

1.3 Examples 41

1.3.4.2 Applicability of the Newton algorithm

In case of an algorithm like the Newton algorithmNf , compare with (1.15), it isimportant to know whether for an arbitrary choice of the initial point, there is areasonable probability that repeated iteration leads to convergence to a zero off.We already stated that, when the initial point is close enough to a given zero off,we are sure to converge to this zero. It is easily seen that fora polynomialf ofdegree at least 1, the fixed point∞ of Nf is repelling. In fact one can show thatif |x| is sufficiently large, then|Nf(x)| < |x|. This means that the only way for asequencexj = (Nf)

j(x0), j ∈ Z+ to escape to∞, is by landing on a pointx wheref ′(x) = 0. (For a polynomial of positive degree, this only holds for a finite numberof points.) Another question now is, whether it is possible for such a sequencexj , j ∈ Z+ not to escape to∞, but neither ever to converge to a zero off.Moreoverone may wonder how exceptional such behaviour would be. Alsocompare withExcercise 1.14.

1.3.4.3 Non-converging Newton algorithm

We here construct examples where the Newton algorithm (1.15) does not converge.The simplest case would be to take a polynomials without any (real) zeroes. Weshall not discuss this case further. Another obvious way to invoke failure of conver-gence, is to ensure that there are two pointsp1 andp2, such that

Nf(p1) = p2 andNf(p2) = p1,

wherep1, p2 is an attracting orbit of period 2. To achieve this, we first observethat in general

N ′f(x) =

f(x)f ′′(x)

(f ′(x))2.

Next we consider an arbitrary 3rd degree polynomial

f(x) = ax3 + bx2 + cx+ d,

where the aim is to apply the above forp1 = −1 andp2 = +1. We have

f ′(x) = 3ax2 + 2bx+ c andf ′′(x) = 6ax+ 2b.

Direct computation now yields that

Nf(1) = −1 ⇔ 5a+ 3b+ c− d = 0 and

Nf(−1) = 1 ⇔ 5a− 3b+ c + d = 0,


where attractivity of this period 2 orbit−1, 1 can be simply achieved by requiringthatN ′

f (1) = 0, which amounts to

f ′′(1) = 0 ⇔ 6a+ 2b = 0.

All these conditions are fulfilled for

a = 1, b = −3, c = −5 andd = −9,

thereby yielding the polynomial

f(x) = x3 − 3x2 − 5x− 9. (1.16)

A subsequent question is whether this phenomenon of persistent non-convergenceof the Newton algorithm to a zero, is exceptional in the set ofall polynomials. Thisis not the case. Indeed, it turns out that in the space of all polynomials of degree3 there exists anopenset of polynomials for which the above phenomenon occurs.We now show that for any 3rd degree polynomial, the coefficients of which aresufficiently close to those of (1.16), there is an open set of initial points such thatthe corresponding Newton algorithm does not converge to a zero of f. Note thatthus we have established another form ofpersistence, in the sense that the propertyat hand remains valid under sufficiently small perturbations off within the space of3rd degree polynomials.

To construct our open set of 3rd degree polynomials, take an interval[a, b] such that:

i. a < 1 < b;

ii. (Nf)2([a, b]) ⊂ (a, b);

iii. | ((Nf)2)

′(x)| < 1 for all x ∈ [a, b];

iv. f has no zeroes in[a, b].

This implies that(Nf)2 maps the interval[a, b] within itself and is a contraction

here. Again, by the Contraction Principle, it follows that the points of[a, b], asan initial point for the Newton algorithm, do not give convergence to a zero off.Instead, as before, the algorithm always converges to a fixedpoint inside[a, b].

We now define the neighbourhoodU of f in the space of 3rd degree polynomialsby requiring thatf ∈ U if and only if the following properties hold:

1. (Nf)2[a, b] ⊂ (a, b);

2. For allx ∈ [a, b] one has|(

(Nf )2)′

(x) < 1| ;

1.3 Examples 43

3. f has no zeroes in[a, b].

It is not hard to show thatU , as defined thus, is open in the space of polynomials ofdegree 3 (and higher) and thatf ∈ U . The Contraction Principle, see Appendix A,then implies that for anyf ∈ U and for allx ∈ [a, b] the limit

limj→∞

(Nf)2j(x)

exists. Finally, sincef has no zeroes inside[a, b], it follows that the Newton algo-rithm does not converge to a zero off .

Remark. An example like (1.16) does not exist in the space of 2nd degree polyno-mials. This is a direct consequence of Exercise 1.11.

1.3.4.4 Newton algorithme in higher dimensions

We conclude this part with a few remarks on the Newton algorithm in higher di-mensions. First we think of the search of zeroes of mapsf : Rn → Rn. On thesame grounds as in the 1-dimensional case we define the Newtonoperator as

x1 = Nf (x0) = x0 − (Dx0f)−1(f(x0)).

The most important difference with (1.15) is that here(Dx0f)−1 is a matrix inverse.Such a simple extension for maps between vector spaces of different dimensionsdoes not exist, since then the derivative would never be invertible. Special interestshould be given tot the case of holomorphic mapsf : C → C. Now the Newtonoperator is exactly the same as (1.15), but here we have to deal with complex quan-tities. In the complex case also it holds true that, iff is a complex polynomial ofdegree at least 2, the Newton operator can be extended continuously toC ∪ ∞.The latter set, by stereographic projection, can be identified with the unit sphereS2 ⊂ R3 = x, y, z, given by the equationx2 +y2+z2 = 1. This complex Newtonalgorithm gives rise to very beautiful pictures, compare with [153]. The most well-known of these is related to the Newton operator for the polynomialf(z) = z3 − 1.Of course it is easy to identify the zeroes off as

p1 = 1 andp2,3 = exp(±23π)i).

The problem however is how to divide the complex plane into four subsetsB1, B2, B3

andR, whereBj is the (open) subset of points that under the Newton algorithm con-verge topj , and whereR is the remaining set. In Figure 1.20 the three regions areindicated in black, grey and white. Note that the area’s can be obtained from oneanother by rotating over2

3π radians to the left or to the right.


Figure 1.20: Sets of points inC that under the Newton algorithm off(z) = z3 − 1

converge to the zeroes1, (white) e23πi (grey) ande−

23πi (black).

From the figure it already shows that the boundary of the whiteregionB1 is utterlycomplicated. It can be shown that each point of this boundaryis boundary point ofall three regionsB1, B2 andB3. So we have a subdivision of the plane in three ter-ritories, the boundaries of which only contains triple points. For further reference,in particular regarding the construction of pictures like Figure 1.20, see [152, 153].

Remark. This problem of dividing the complex plane in domains of attraction ofthe three zeroes ofz3 − 1 under the Newton operator is remarkably old: in [73]Cayley18 proves that the corresponding problem forz2 − 1 has a simple solution:the zeroes are±1 and the corresponding domains of attraction are the left andtheright half plane (the remaining set being the imaginary axis). He already notes thatthe ‘next’, cubic casez3 − 1 turns out to be more difficult. In 1910 Brouwer19 in[70], without having any dynamical interpretation in mind,shows that it is possibleto divide the plane into three domains, such that the boundaries consist solely oftriple points. For a long period of time, this result was viewed as a pathology, bornin the curious mind of a pure mathematician. Here we see that such examples havea natural existence. The question may well be asked whether we would have gottenthe idea that this phenomenon occurs here, without computergraphics.

18Arthur Cayley 1821-189519L.E.J. Brouwer 1881-1966

1.3 Examples 45

1.3.5 Dynamical systems defined by partial differential equa-tions

We announced that mostly we shall deal with dynamical systems, the state space ofwhich is finite dimensional. Still, here we treat two examples of dynamical systemsdescribed by partial differential equations, the state space of which therefore is in-finite dimensional, since it is a function space. We’ll show that the1-dimensionalwave equationdetermines a dynamical system, the evolution operator of which canbe well-defined in all respects, admitting the time setR. Next we’ll see that for the1-dimensional heat equationthe situation is quite different, in particular when wewish to ‘reverse’ time. In fact, here we have to acceptR+ as its time set. The latterphenomenon occurs generally in the case of diffusion equations.

The above is not exhaustive regarding the peculiarities of partial differential equa-tions. In particular there exist important equations (likethe Navier-Stokes equation,that describes the flow of incompressible, viscous fluids), for which it is not yetclear how to specify the state space to get a well-defined evolution operator.

1.3.5.1 The 1-dimensional wave equation

The 1-dimensional wave equation describes the wave propagation in a 1-dimensionalmedium, such as a string, or an organ pipe. For such a medium theexcitationu(x, t),as a function of positionx and timet, satisfies the equation

∂2u

∂t2= V 2∂

2u

∂x2, (1.17)

whereV is the propagation velocity of the medium at hand. The choiceof unitscan me made such thatV ≡ 1. Also we use an abbreviated notation for partialderivatives and so rewrite equation (1.17) to

∂ttu = ∂xxu. (1.18)

When considering such equations for a string, then it is feasible to imposeboundaryconditions. For example, if we deal with a string of lengthL, then foru we takedomain of definition[0, L] × R = x, t and we require that for allt ∈ R one hasu(0, t) = u(L, t) = 0. First, however, we’ll discuss the case where the ‘string’ isunbounded and whereu therefore is defined on all ofR2. To determine the stateat the instantt = 0, it is not sufficient to only specify the functionx 7→ u(x, 0).Indeed, just as for mechanical systems like the pendulum, where we also have anequation of motion for the second order derivative(s) with respect to time, here toowe have to specify the initial velocityx 7→ ∂tu(x, 0) = v(x, 0). If the initial stateis thus specified, we can determine the corresponding solution of the wave equation(1.18).


wave equation 1.3.5.2 Solution of the 1-dimensional wave equation

First note that wheneverf andg are real functions of one variable, it follows that

uf+(x, t) = f(x+ t) andug−(x, t) = g(x− t)

are solutions of the wave equation: these represent a traveling wave running to theleft and to the right, respectively. Since the wave equationis linear, also the sum ofsuch solutions is a solution. On the one hand, the initial state of u = uf+ + ug−now is given byu(x, 0) = f(x) + g(x) and∂tu(x, 0) = f ′(x)− g′(x). On the otherhand, if the initial state is given by the functionsu(x, 0) andv(x, 0), then we canfind corresponding functionf andg, and hence the associated solution, by taking

f ′(x) = 12(∂xu(x, 0) + v(x, 0)) andg′(x) = 1

2(∂xu(x, 0) − v(x, 0)),

after whichf andg are obtained by taking primitives. We have to include appropri-ate additive constants to make the solution fit withu(x, 0) andv(x, 0).

The method described above needs some completion. Since thewave equation is ofsecond order, it is feasible to require that the solutions should be at least of classC2.This can be obtained by only considering initial states for whichu(x, 0) andv(x, 0)areC2 andC1, respectively. The solution as constructed then is automaticallyC2.Moreover, it can be showns that the given solution is unique.20

Finally, to find solutions that are only defined on the strip[0, L] × R, satisfy-ing the boundary conditionsu(0, t) = u(L, t) = 0 for all t ∈ R, we have totake initial statesu(x, 0) andv(x, 0) that satisfyu(0, 0) = u(L, 0) = v(0, 0) =v(L, 0) = 0. However, this is not sufficient. To ensure that also∂ttu(0, 0) = 0 and∂ttu(0, L) = 0, the initial state also has to satisfy∂xxu(0, 0) = ∂xxu(L, 0) = 0.Now to construct the solution, we first extendu(x, 0) andv(x, 0) to all values ofx ∈ R, to be explained now. Indeed, we require that both functions are periodic,with period2L. Next we take forx ∈ [0, L]: u(L + x, 0) = −u(L − x, 0) andv(L + x, 0) = −v(L − x, 0). It is easily checked that the thus extended functionsu(x, 0) andv(x, 0) are of classC2 andC1 respectively. The reader is invited to fillin further details, also see [78].

1.3.5.3 The 1-dimensional heat equation

We now consider a somewhat different equation, namely

∂tu(x, t) = ∂xxu(x, t). (1.19)

20The proof is not very complicated, but uses a transformationξ = x + t, η = x − t, which turnsthe wave equation into∂ξηu = 0.

1.3 Examples 47

heat equationThis equation describes the evolution of the temperature ina 1-dimensional medium(think of a rod), as a consequence of heat transport. In general the right hand sidehas to be multiplied by a constant, that depends on the heat conductivity in themedium and the so-called heat capacity of te medium. We however assume that theunits have been chosen in such a way that all constants equal 1. Also, from the startwe now assume that the medium has a finite length and that the temperature at theeindpoints is kept constantly at0. Mathematically it turns out to be convenient totake as thex-interval[0, π], although this choice is not essential. With an argument,completely analoguous to the previous case of the wave equation, we now see thatan initial state is given by aC2 temperature distributionu(x, 0) at the instantt = 0,that should satisfy

u(0, 0) = u(π, 0) = ∂xxu(0, 0) = ∂xxu(π, 0) = 0.

That the initial state does not have to contain information on ∂tu(x, 0) = v(x, 0)comes from the fact that the equation of motion just containsthe first derivativewith respect to the timet. As in the wave equation, the initial distributionu(x, 0)can be extended all overx ∈ R, such that the result is periodic with period2π andsuch thatu(π + x, 0) = −u(π − x, 0); and as in the case of the wave equation, thisextension is of classC2.

For this equation it turns out to be handy to use Fourier series. The simplest prop-erties of such series can be found in any good text book on Calculus, e.g., [78]. Wenow give a brief summary as far as needed. AnyC2-functionf that is2π-periodicand that satisfiesf(π + x) = −f(π − x), and hence alsof(0) = f(π) = 0, can beexpressed uniquely by a series

f(x) =

∞∑

k=1

ck sin(kx). (1.20)

The advantage of such a representation becomes clear when werealise that

∂xxf(x) = −∞∑

k=1

ckk2 sin(kx). (1.21)

For the moment we do not worry about convergence of infinite sums. Continuingin the same spirit, we now write down the solution of the heat equation. To this endwe first expand as a Fourier series the initial stateu(x, 0), properly extended overRas a periodic function:

u(x, 0) =

∞∑

k=1

Ck sin(kx).


We express the solution, for each value oft, again as a Fourier series:

u(x, t) =∞∑

k=1

Ck(t) sin(kx).

We next use formula (1.21) to conclude that necessarily

Ck(t) = Ck(0) exp(−k2t)

and as the general (formal) solution we so have found

u(x, t) =∞∑

k=1

Ck(0) exp(−k2t) sin(kx). (1.22)

We now just have to investigate the convergence of these series. In this way it willturn out, that the solutions in general are not defined for negativet.

For the investigation of the convergence of functions defined by infinite sums likein (1.20), we have to go a little deeper into the theory of Fourier series, comparewith [167]. From this theory we need the following.

Theorem 1.2 (Fourier series).For anyC2-functionf : R → R the Fourier seriesin (1.20)is absolutely convergent and even such, that for a positive constantC andfor all k one has

ck ≤ Ck−2.

On the other hand, if the series in(1.20)is absolutely convergent, the limit functionis well-defined and continuous and for the coefficientsck one has

ck =2

π

∫ π

0

sin(kx)f(x)dx.

By ℓ-fold differentiation of the equation (1.20) with respect to x, we concludethat if for a constantC > 0 and for allk one has

ck ≤ Ck−(ℓ+2),

then the corresponding functionf has to be of classCℓ. From this, and from theformula (1.19), it follows that if the initial functionu(x, 0) is of classC2, then thecorresponding solutionu(x, t), for eacht > 0, is aC∞-function ofx. It also shouldbe said here that there exist initial functionsu(x, 0) of classC2, and even of classC∞, for which the corresponding expression (D.3) does not converge for anyt < 0.An example of this is

u(x, 0) =

∞∑

k=1

exp(−k) sin(kx).

Indeed, for such an initial condition of the heat equation, no solution is defined fort < 0, at least not if we require that such a solution should be of classC2.

1.3 Examples 49

Lorenz!system1.3.6 The Lorenz attractor

In 1963 Lorenz21 [129] published a strongly simplified set of equations for the cir-culation in a horizontal fluid layer that is being heated frombelow. (Although wekeep talking of a fluid, one may also think of a layer of gas, forexample of theatmosphere.) If the lower side of such a layer is heated sufficiently, an instabilityarises and a convection comes into existence, the so-calledRayleigh-Benard con-vection. The equations for this phenomenon are far to complicated to be solvedanalytically, and, surely during the time that Lorenz wrotehis paper, they were alsotoo complicated for numerical solution. By strong simplification, Lorenz was ableto get some insight in possible properties of the dynamics that may be expected influid flows at the onset of turbulence.

This simplification, now known as the Lorenz system, is givenby the followingdifferential equations onR3 :

x′ = σy − σx

y′ = rx− y − xz (1.23)

z′ = −bz + xy.

The standard choice for the coefficients isσ = 10, b = 83 andr = 28. In Appendix

D we discuss how the simplification works and how the choice ofconstants wasmade.

1.3.6.1 Discussion

Here two phenomena showed up. First, as far as can be checked,the solutions ingeneral donotconverge to anything periodic. Compare with Figure 1.21, that is nu-merically obtained and shows a typical evolution of the Lorenz system (1.23). Sec-ond, however near to one another we take two initial states, after a certain amount oftime there is no longer any connection between the behaviourof the ensuing evolu-tions. This phenomenon is illustrated in Figure 1.22. The latter is an indication, thatthe evolutions of this system will be badly predictable. In the atmospheric dynam-ics, convection plays an important role. So the ‘unpredictability’ of the evolution ofthe Lorenz system may be related to the fact that prediction of the future state of theatmospheric system (such as weather or climate forecasts) are so unreliable. Thetheoretical foundation of the bad predictability of the weather has been deepened alot over the last years [35].

We have witnessed comparable behaviour already for the Logistic system, but thesystematic study of that system really started in the 1970s.In Lorenz’s earlier work,for the first time it was shown by numerical simulations, thatdeterministic systems

21Edward N. Lorenz 1917-


Lorenz!attractor

x

z

Figure 1.21: Typical evolution of the Lorenz system (1.23),projected on the(x, z)-plane.

by this sensitive dependence on initial statecan have a large amount of unpre-dictability. We now show how the fluid dynamics system mentioned earlier, can besimplified to a system of three ordinary differential equations. Next we show someresults of numerical simulations.

1.3.6.2 Numerical simulations

In Figure 1.21 we project a rather arbitrarily chosen evolution of the Lorenz system(1.23) on the(x, z)-plane. Here the usual parameter valuesσ = 10, r = 28 andb =83 have been taken. The figure shows why one is referring to the Lorenz butterfly.The background of this name, however, is somewhat more complicated. As saidearlier, Lorenz himself saw this system mostly as supportive for the thesis that therather inaccurate weather predictions are related to sensitive dependence on initialstates of the atmospheric system. In a lecture [129, 130] he illustrated this by wayof an example, saying that the wing beat of a butterfly in the Amazon jungle, aftera month, can have grown to such a size, that it might ‘cause’ a tornado in Texas. Itis this catastrophical butterfly that the name relates to. Asin the case of the Henonattractor, see Figure 1.13, this configuration is highly independent of the initial state,at least if we disregard a transient.

To demonstrate the sensitive dependence on intiatial state, in Figure 1.22 we taketwo initial statesx = y = z = 0.2 andx = y = 0.2, z = 0.20001, plotting thex-coordinate of the ensuing evolutions as a function of the time t ∈ [0, 50].We clearly

1.3 Examples 51

witness that the two graphs first are indistinguishable, butafter some time showdifferences, after which they rapidly get completely independent of each other.

x

t

20

−20

0

0

50

Figure 1.22: Thex-coordinate of two evolutions of the Lorenz system (1.23) withnearby initial states, as a function of time: an illustration of sensitive dependence ofinitial state.

1.3.7 The Rossler attractor; Poincare map

As in the case of the Lorenz attractor (D.1), also for the Rossler attractor we shalldeal with differential equations inR3. Rossler’s work dates later than Lorenz’s andwe borrowed the equations from [168]. It should be noted thatat that time theoccurrence of chaotic attractors no longer was a big surprise. The most importantcontribution of Rossler has been, that he developed asystematicmethod to find sucha system. We here include the example merely as an illustration of the constructionof a Poincare map, that differs a little from our original definition.


R“”ossler!systemR“”ossler!attractor Section

x

y

Figure 1.23: Projection of an evolution of the Rossler system (1.24) on the(x, y)-plane. The domain of bothx andy is the interval[−16, 16].

1.3.7.1 The Rossler system

To be more precise, we now present the equations of Rossler as

x′ = −y − z

y′ = x+ ay (1.24)

z′ = bx− cz + xz,

where we shall use the standard valuesa = 0.36, b = 0.4 andc = 4.5.

In Figure 1.23 we see the projection of an evolution curve on the (x, y)-plane.Again it is true, that such an evolution curve of (1.24), apart from an initial tran-sient that is strongly determined by the chosen initial state, always yields the sameconfiguration. As reported earlier, such a subset of the state space that attracts solu-tions, is called an attractor. In this case we call the attractorA ⊂ R3. From Figure(1.23) we derive that this setA forms a kind of band, wrapped around a ‘hole’ andthe evolution curves inA wind around this ‘hole’. We consider a half plane of theform L = x < 0, y = 0, such that the boundary∂L = x = 0, y = 0 sticksthrough the hole in the attractor. It is clearly suggested bythe numerics that in eachpoint of the intersectionA∩L, the evolution curve of (1.24) is not tangent toL andalso that the evolution curve, both in forward and backward time, returns toA ∩ L.This means that we now have a Poincare map

P : A ∩ L→ A ∩ L,

1.3 Examples 53

that assigns to each pointp ∈ A ∩ L the point where the evolution curve of (1.24)starting atp hitsA ∩ L at the first future occassion. So this is a Poincare map of(1.24),restrictedtoA.

1.3.7.2 The attractor of the Poincare map

We now can get an impression ofA ∩ L, which is the attractor of the Poincare mapcorresponding toA, by plotting the successive passings of the evolution of Figure1.23 through the planeL. The result is depicted in Figure 1.24. We ‘see’ now that

x

z

Figure 1.24: Attractor of the Poincare map; horizontally we have thex-axis withdomain[−6, 0] and vertically thez-axis with domain[−1, 1].

the attractor of the Poincare map is part of the curve that isthe graph of a functionz = z(x). Unfortunately we can ‘prove’ that the structure of this attractor cannot bethat simple. Als see Excercise 1.26.

At this moment we shall disregard the theoretical complications concerning thestructure of the Poincare map attractor, and pretend that it is indeed part of thegraph of a smooth functionz = z(x). We can then get an impression of the dy-namics inside this attractor, by computing, from the evolution in Figure 1.23, theapproximate graph of the Poincare map. This can be done as follows. As we al-ready saw, thex-coordinate yields a good parametrisation of the attractor. If nowx1, x2, . . . are the values of thex-coordinate at the successive passings of the evo-lution throughL = x < 0, y = 0, then the points(xi, xi+1) ∈ R2 should lie onthe graph of the Poincare map. In this way we get Figure 1.25.This graph remindsof the graph of the endomorphism that determines the Logistic system, see§1.3.3:it has one maximum in the interior of its domain of definition and also two (local)


Doubling map

xn

xn+1

Figure 1.25: Graph of the Poincare map as a 1-dimensional endomorphism.

minima at the boundaries. The mathematical theory developed for the Logistic mapapplies largely also to such a (so-called unimodal) map. We thus have a strong ‘ex-perimental’ indication that the behaviour of the Poincaremap of the Rossler system(1.24) may strongly resemble that of the Logistic system.

1.3.8 The Doubling map and symbolic dynamics

We now treat an example that is not directly connected to any kind of applicationor to an interpretation outside mathematics. Its interest,however, is that it can behandled mathematically rigorously. The example occurs in three variations, thatmutually differ only little. We now describe all three of them.

1.3.8.1 The Doubling map on the interval

We first deal with the Doubling map on the interval. In this case for the state spacewe haveX = [0, 1), while the dynamics is defined by the map

ϕ(x) = 2x modZ.

It may be clear that this map is not invertible, whence for thetime set we have tochooseT = Z+. The mapϕ generating the dynamics, clearly is not continuous:there is a dicontinuity at the pointx = 1

2. We can ‘remove’ this discontinuity by

replacing the state space by the circle. In this way we get thesecond version of oursystem.

1.3 Examples 55

symbolic dynamics1.3.8.2 The Doubling map on the circle

We met the circle as a state space before, when considering the Newton algorithm,see§1.3.4. There we thought of the unit circle in the plane. By parametrisingthis circle as(cosσ, sin σ) | σ ∈ R, it may be clear that the circle can also beviewed asR mod2πZ. The value of2π is quite arbitrary and just determined byour convention to measure angles in radians. Therefore we shall now replace2π by1 and consider the circle as the quotientR/Z of additive groups. In summary, wechoseX ′ = R/Z. In this version, the transformation that generates the dynamics isalmost the same, namely

ϕ′([x]) = [2x],

where, as before , we use the notation[−] to indicate the equivalence class. It is notdifficult to show that this transformation is continuous. The map

x 7→ [x]

determines a one-to-one correspondence between the elements ofX andX ′.

1.3.8.3 The Doubling map in symbolic dynamics

The third version of the doubling map has a somewhat different appearance. As thestate space we consider all infinite sequences

s = (s0, s1, s2, . . .),

where eachsj either takes on the value0 or 1. The set of all such sequences isdenoted byΣ+

2 . Here the subscript2 refers to the choice of eachsj from the set0, 1, while the superscript+ indicates that the set for the indicesj is Z+. Thedynamics now is defined by theshift map

σ : Σ+2 → Σ+

2 , s = (s0, s1, s2, . . .) 7→ σ(s) = (s1, s2, s3, . . .).

The relationship between this system and the previous ones can be seen by express-ing the elements ofX = [0, 1) in the binary number system. For eachx ∈ X thisyields an expression

x = 0.x0x1x2 . . .⇔ x =

∞∑

j=1

xj−12−j,

wherexj ∈ 0, 1, for all j ∈ Z+. In this way for each pointx ∈ X we havea corresponding point inΣ.

2 Moreover, forx = 0.x0x1x2 . . . we see thatϕ(x) =0.x1x2x3 . . . , which indicates the relationship between the dynamical systems

(X, T, ϕ) and(Σ+2 , T, σ).

A few remarks are in order now.


Remarks.

- It will be clear now that the choice of the binary number system is determinedby the fact that we deal with aDoubling map. For theTenfoldmapx 7→10 x modZ we would have to use the decimal number system.

- The relation between the first and the last version also can be expressed with-out using the binary number system. To this end we takeX = [0, 1), dividingit in two partsX = X0 ∪X1, where

X0 = [0, 12) andX1 = [12 , 1).

Now if a pointx ∈ X happens to be inX0, we say that its address is0, andsimilarly the address is1 for X1. So for x ∈ X we can define a sequences(x) ∈ Σ+

2 , by takingsj(x) as the address ofϕj(x), j ∈ Z+. In other wordsby defining

sj(x) =

0 if ϕj(x) ∈ X0,1 if ϕj(x) ∈ X1,

for j ∈ Z+. In this way we get, apart from a small problem that we’ll dis-cuss in a moment, exactly the same sequences as with the binary expansions.These addresses also are calledsymbolsand this is how we arrive at the termsymbolic dynamics.

- The small problem just mentioned is that there is an ambiguity in assigningelements ofΣ+

2 to elements ofX. This problem is illustrated by the fact thatin the decimal system one has the ambiguity

0.5 = 12 = 0.4999 . . . ,

related to an ‘infinite tail of nines’. This problem in general can be overcomeby excluding such ‘tails of nines’ in all decimal expansions.

In the binary system the analoguous problem is constituted by ‘tails of ones’.This makes the assignment

x 7→ s(x) = (s0(x), s1(x), s2(x), . . .)

slightly problematic. However, if we exclude ‘tails of ones’, then the map

s : X → Σ+2 (1.25)

is well-defined and injective, but not surjective (since we omitted all tails ofones.) Similarly the map

s = (s0, s1, s2, . . .) 7→∑

j

sj2−(j+1)

is well-defined but not injective, for the same kind of reason. For furtherdiscussion also see Excercise 1.17.

1.3 Examples 57

It is customary to define onΣ+2 a metric as follows ifs = (s0, s1, s2, . . .) and

t = (t0, t1, t2, . . .) ∈ Σ+2 are two sequences, then

d(s, t) = 2−j(s,t),

wherej(s, t) is the smallest index for whichsj(s,t) 6= tj(s,t). It is easy to show thatthis indeed defines a metric.

We conclude that the three versions of the Doubling map, as constructed here, arebasically identical, but that there are still small troubles (i.e., tails of ones) whichmeans that we have to be careful when translating results from one version to an-other.

1.3.8.4 Analysis of the Doubling map in symbolic form

Now we’ll see that the symbolic version of the Doubling map

σ : Σ+2 → Σ+

2

is very appropriate for proving a number of results on the dynamics generated byit. Here we recall that the time set isZ+. In each case we indicate what are thecorresponding results for the other versions.

Periodic points. We recall that a points ∈ Σ+2 is periodic with periodk if σk(s) =

s. In the next chapter we treat periodic evolutions in greater generality, but we liketo indicate already how the general situation is in the present case of time setZ+.

Observe that if a periodic point has periodk, then automatically it also has periodsnk, for all n ∈ Z+. For any periodic point there exists a smallest periodk0 > 0,called theprime period. A point with prime periodk0 then has as all its periodsnk0n∈Z+ . and no others. Moreover, a point of period 1 is a fixed point.

For our shift mapσ : Σ+2 → Σ+

2 it will be clear that the sequences is a periodicpoint with periodk if and only if the sequences is periodic of periodk. This meansthat

sj = sj+k

for all j ∈ Z+. Such a periodic point therefore is determined completely bytheinitial part (s0, . . . , sk−1) of the sequence. From this it follows that

The number of points inΣ+2 that is periodic of periodk under the shift

mapσ is equal to2k.

When transferring this result to the other versions of the Doubling map, we have tomake a correction: for any period we here include the sequence s = (1, 1, 1, . . .),


that completely consists of digits 1. This point should not be counted in the otherversions, where it is identifie with(0, 0, 0, . . .). Note that this is the only case witha ‘tail of ones’ for a periodic point. Thus we find

The number of periodic point of periodk for the Doubling map of[0, 1)or on the circleR/Z, equals2k − 1.

Next we’ll show that the set of all periodic points, so of any period, is dense inthe setΣ+

2 . In general we say that a subsetA ⊂ X is dense if its closure is equalto the total set, i.e., whenA = X. This means that for any points ∈ X and forany neighbourhoodU of s in X, the neighbourhoodU contains points ofA. Forexplanation of topological concepts, see Appendix A.

So consider an arbitrary pointss ∈ Σ+2 . For an arbitrary neighbourhoodU of s we

can say that for a certain indexj ∈ Z+, the subsetU contains all sequencest that areat most at a distance2−j from s, which means all sequencest the firstj elementsof which coincide with the firstj elements ofs. We now show thatU contains aperiodic point ofσ. Such a periodic pointt can be obtained as follows: we take

t0 = s0, t1 = s1, . . . , tj−1 = sj−1 and furthertk+j = tk for all k ∈ Z+.

We leave it to the reader to transfer this argument to the other two versions of thedoubling map. We thus showed

For the Doubling map, the set of all periodic points is dense in the statespace.

Construction of dense evolutions. If s ∈ Σ+2 is a periodic point, then its evo-

lutions as a subsetσj(s)|j ∈ Z+ ⊂ Σ+2 is finite, which therefore will not be

dense inΣ+2 . We’ll now show that also a sequences exist, the evolution of which,

as a subset ofΣ+2 is dense. We give two constructions, the first of which is very

explicit. In the second construction we show that not only such sequences do exist,but moreover, to some extent almost every sequence has this property. In the lattercase, however, we will not construct any of these sequences.

For the first construction, we first give a list of all finite sequences consisting of thesymbols 0 and 1. We start with sequences of length 1, next of length 2, etc. We alsocan determine the order of these sequences, by using a kind ofalphabetic ordering.Thus we get

0 1

00 01 10 11

000 001 010 011 100 101 110 111

1.3 Examples 59

et cetera. The announced sequences we get by joining all these blocks, one behindthe other, into one sequence. To prove that the evolution ofs, as a subsetσj(s)|j ∈Z+ ⊂ Σ+

2 is dense, we show the following. For any sequencet ∈ Σ+2 and any

j ∈ Z+, we aim to find an element of the formσN (s) with

d(t, σN(s)) < 2−j.

As we saw before, this would mean that both sequences are identical as far as thefirst j elements are concerned. We now look up in the sequences the block of lengthj that contains the first elements oft. This block exists by the construction ofs. Tobe precise, we look at the first occurrence of such a block and call the index of thefirst element of thisN. It may now be clear that the sequenceσN(s), obtained froms by deleting the firstN elements starts as

(sN , sN+1, . . . , sN+j−1, . . .)

and coincides witht at the firstj digits. This means that its distance tot is less than2−j, as required. This proves that the evolution ofs underσ is dense inΣ2

+.

The second ‘construction’ resorts to probability theory. We determine a sequences by chance: each of thesj for instance is determined by tossing a fair coin suchthat 0 and 1 occur with equal probability1

2. Also we assume mutual independence

of these events for allj ∈ Z+. Our aim is to show that withprobability 1 we willfind a sequence with a dense evolution. For explanation see Appendix A.

To do all this, for an arbitrary integerk consider an abitrary finite sequenceH =(h0, . . . , hk−1) of lengthk, where allhj are either 0 or 1. The question is what isthe probability that such a sequence will occurs somewhere as a finite subsequencein the infinite sequences. In fact, for anyj0 the probability that

(sj0 , . . . , sj0+k−1) 6= H

is equal to1 − 2−k. Moreover, forj0 = 0, k, 2k, 3k, . . . these events are mutuallyindependent. This means that the probability that such a finite sequence, does notoccur starting at the indicesj0 = 0, k, 2k, . . . ,Mk, is equal to(1 − 2−k)M . since

limM→∞

(1 − 2−k)M = 0,

we conclude that with probability 1 the finite sequenceH occurs somewhwere as afinite subsequence ins.

Since the total number of finite sequences of the formH is countablewe so haveshown that with probability 1 any finite sequence occurs somewhere as a finite sub-sequence ins. Returning to the original discussion, the occurrence of anyfinitesequence as a subsequence ofs implies that the evolution

σj(s) | j ∈ Z+ ⊂ Σ+2


is dense.

What do the above considerations mean for the other versionsof the Doubling map?It is not hard to see that the probability distribution we introduced onΣ+

2 by theassignment map (1.25) is pulled back to the uniform distribution (or Lebesgue mea-sure) on[0, 1). This means that a point of[0, 1), which is chosen at random ac-cording to the uniform distribution, will have a dense evolution with probability 1.Compare with Appendix A. Summarising we state:

For the Doubling map there are initial states, the evolutionof which,as a subset, is dense in the state space. In an appropriate sense, denseevolutions even occur with probability1.

1.4 Exercises

Exercise 1.1 (Integration constants and initial state).For the harmonic oscillator,with equation of motion

x′′ = −ω2x,

we saw that the general solution has the formx(t) = A cos(ωt+B). Determine thevalues ofA andB in terms of the initial conditionsx(0) andx′(0).

Exercise 1.2 (Pendulum time to infinity).Consider the free, undamped pendulumin the area of the phase plane, close to the region where the pendulum starts torotate around its point of suspension. We already mentionedthat the period of theoscillation grows. Now show that, as the amplitude tends toπ, the period tendsto ∞. (Hint: use the fact that solutions of ordinary differentialequations dependcontinuously on their intial conditions.)

Exercise 1.3 (Conservative systems).For a differentiable functionV : Rn → R

consider the second order differential equation

x′′(t) = −gradV (x(t)).

Show that for solutions of such an equation the law of conservation of energy holds.Here the energy is defined as

H(x, x′) = 12 ||x′||2 + V (x),

where|| − || denotes the Euclidean norm.

Exercise 1.4 (. . . with damping). In the setting of Exercise 1.3 we introduce adamping term in the differential equation as follows

x′′ = −gradV (x(t)) − cx′(t)

1.4 Exercises 61

with c > 0. Show that the energyH decreases in the sense that along any solutionx(t) of the latter equation

dH(x(t), x′(t))

dt= −c||x′(t)||2,

whereH is as above.

Exercise 1.5 (Linear, damped oscillator).Consider the equation of motion of thedamped harmonic oscillator

x′′ = −ω2x− cx′, x ∈ R,

wherec > 0 is a constant. We already noted that there can be several waysin whichsolutionsx(t) tends to zero: the solution either can or can not pass infinitely oftenthrough the value zero, see the diagrams (a) and (b) of Figure(1.3). Show that thetype of damping that occurs here depends on the eigenvalues of the matrix

(

0 1−ω2 −c

)

of the corresponding linear system

x′ = y

y′ = −ω2x− cy.

Indicate for which eigenvalues these kinds of damping occur.

Remark. Analogous conclusions can be drawn when nonlinear terms areincluded.

Exercise 1.6 (Shift property). Consider a mapΦ : X × R → X with Φ(x, 0) = xfor all x ∈ X. We call a map of the formt 7→ Φ(x, t), wherex ∈ X is fixed, anevolution (corresponding toΦ). Show that the following two properties are equiva-lent:

i. For any evolutionst 7→ ϕ(t) andt ∈ R alsot 7→ ϕ(t+ t) is an evolution;

ii. For anyx ∈ X andt1, t2 ∈ R we haveΦ(Φ(x, t1), t2) = Φ(x, t1 + t2).

Exercise 1.7 (Shift property for time-dependent systems).As in Exercise 1.6 weconsider a mapΦ : X × R → X, where nowX is a vector space. We moreoverassume thatΦ is aC2-map, where for anyt the mapΦt, defined byΦt(x) = Φ(x, t),is a diffeomorphism.22 We also assume thatΦ0 = IdX . Again we call maps of the

22Which means that a smooth inverse map exists.


form t 7→ Φ(x, t), for fixed x ∈ X, evolutions (corresponding toΦ). Show thatthere exists a differential equation of the form

x′(t) = f(x(t), t),

with aC1-mapf : X × R → X, such that every solution of this differential equa-tion is an evolution (corresponding toΦ). Give the connection betweenΦ andf.Furthermore show that in this setting the following two properties are equivalent:

i. The functionf is independent oft, in other words that for anyx ∈ X andt1, t2 ∈ R one hasf(x, t1) = f(x, t2);

ii. For anyx ∈ X ent1, t2 ∈ R one hasΦ(Φ(x, t1), t2) = Φ(x, t1 + t2).

Exercise 1.8 (To∞ in finite time). Consider the differential equation

x′ = 1 + x2, x ∈ R.

Show that the corresponding dynamical system is only locally defined. Specify itsevolution operator and its domain of definition.

Exercise 1.9 (Inverse of the Henon map). Show that the Henon map, given by

Ha,b : (x, y) 7→ (1 − ax2 + y, bx),

for b 6= 0 is an invertible map. Compute the inverse.

Exercise 1.10 (Inverse of a diffeomorphism near a saddle point). Let p be thesaddle point of a diffeomorphism

ϕ : R2 → R2.

Show thatp is also a saddle point of the inverse mapϕ−1 and that the unstableseparatrix atp for ϕ is the stable separatrix atp for ϕ−1 et vice versa. Express theeigenvalues of dϕ−1 atp in the corresponding eigenvalues for dϕ atp.

Exercise 1.11 (A normal form for the Logistic system).When dealing with theLogistic system in§1.3.3 we saw how the ‘maximal population’ parameterK couldbe transformed away. Now show more generally that for any quadratic map

x 7→ ax2 + bx+ c,

so witha 6= 0, always an affine substitution exists of the formy = αx + β, withα 6= 0, such that the given transformation gets the formy 7→ y2 + γ.

1.4 Exercises 63

Exercise 1.12 (The Henon family for b = 0). When dealing with the Henon family

Ha,b : (x, y) 7→ (1 − ax2 + y, bx),

we assumed thatb 6= 0, to ensure invertibility. However, if we takeb = 0, weobtain a dynamical system with time setZ+. We aim to show here that this systemessentially equals the Logistic system

Φµ(x) = µx(1 − x),

for an appropriately chosen value ofµ.

To start with, all ofR2 by Ha,0 is mapped onto thex-axis and we don’t loose anyinformation by restricting to thatx-axis. Fora 6= 0, the mapHa,0 defines a quadraticmap. Now determine for whichµ-value, as a function ofa, the maps

Ha,0|x−axisandΦ1µ

are taken into each other by affine substitutions. Give thesesubstitutions.

Exercise 1.13 (Fattening of non-invertible maps by diffeomorphisms). In theprevious Exercise 1.12 we saw how the mapsHa,b, for b → 0 give approximationsby invertible maps on the plane, of thenon-invertible mapHa,0 on thex-axis. Nowgive a general method to approximate non-invertible maps ofRn by invertible mapsof R2n.

Exercise 1.14 (Continuity of Newton operator).Show that, wheneverf is a realpolynomial of degree at least 2, the Newton operatorNf onR∪∞ is a continuousmap. Moreover show that for sufficiently large|x| one has|Nf(x)| < |x|. Whatexceptions show up for polynomials of degree 0 and 1?

Exercise 1.15 (The Newton operator of quadratic polynomials). Consider thepolynomial map

f(x) = x2 − 1.

Specify the corresponding Newton operatorNf and consider its dynamics on theinterval(0,∞).

1. Show that for0 < x < 1 one hasNf (x) > x;

2. Show that for1 < x one has1 < Nf (x) < x.

What is your conclusion for Newton iterations with initial points in(0,∞)?


Exercise 1.16 (Convergence of Newton iteration to a zero).Let f : R → R be apolynomial map and consider the Newton iterationx0, x2, . . . where

xj = Nf(xj−1),

for j ≥ 1. Let p be such thatf(p) = 0 andf ′(p) 6= 0.

1. Show that forx0 sufficiently close top, the iteration converges top. (Hint:Invoke the Contraction Principle, see Appendix A.)

2. Also show thatdNf

dx(p) = 0.

Exercise 1.17 ((Dis-) continuity of the Doubling map).When treating the Dou-bling map, we saw the maps

Σ2+ → X∗ = [0, 1] andX = [0, 1) → Σ2

+,

see formula (1.25). Show that the former is continuous whilethe latter is not.

Exercise 1.18 (Periodic points of the Doubling map).Determine how many pointsexist on the circleR/Z that have prime period 6 under the Doubling map.

Exercise 1.19 (The Doubling map inC). Consider the circleS1 as the unit circlein the complex planeC. Also consider the polynomial mapf : C → C defined byf(z) = z2. Show thatS1 is invariant underf and that the restrictionf |S1 coincideswith the Doubling map in the appropriate version. IsS1 an attractor off?

Exercise 1.20 (Linear parts).Given the time-dependent differential equation

x′ = f(x, t), x ∈ Rn,

with f(0, t) ≡ 0 andf(x, t) = Ax+O(|x|2), for ann× n-matrixA and where theO(|x|2)-terms contain allt-dependence, which we assume to be periodic of periodT. Consider the Poincare mapP : Rn → Rn, corresponding to the sectiont = 0modTZ. Show thatP (0) = 0 and that

dP0 =

∫ T

0

etA dt.

Exercise 1.21 (Measure zero in symbol space).Show thatΣ2+ is compact. Also

show that the set ofs ∈ Σ2+, that ends in a ‘tail of digits1, has measure zero.

Compare with Appendix A.

1.4 Exercises 65

Exercise 1.22 ((∗) Suspension lives on non-contractible state space).Considerthe state spaceX of a dynamical system that has been generated by suspension of adynamical system the state spaceX of which is a vector space. Aim of this exerciseis to show thatX can’t be a vector space. To this end we make use of some newconcepts.

WheneverY is a topological space, we call a continuous mapf : S1 = R/Z → Ya closed curve inY. Such a closed curve is contractible if there exists a continuousmap

F : S1 × [0, 1] → Y,

such thatF (s, 0) = f(s), for eachs ∈ S1 and such thatF (S1 × 1) consists ofone point.

1. Show that each closed curve in a vector space is contractible.

2. Show that forY = S1 andf : S1 → S1 the identity map,f, as a closed curvein S1, is not contractible.

3. Next construct a closed curve inX that is not contractible. Here we usethe mapΦ1 : X → X that generates the dynamics onX. Also we use therepresentation of the elements ofX as equivalence classes inX × R. Thecurve is defined by

γ : s ∈ [0, 1] 7→ [(1 − s) · Φ1(x) + s · x, s],

for any fixed elementx ∈ X.Using the suspension equivalence relation, showthat this curve is closed indeed.

4. Finally show that the closed curveγ is not contractible, and hence thatXcan’t be a vector space.

Exercise 1.23 ((∗) Attractor ⊆ W u(p), [16]). Let ϕ : R2 → R2 be a diffeomor-phism with a saddle pointp, such that

i. | det(dϕ)| ≤ c < 1 everywhere.

ii. W u(p) ∩W s(p) contains a (homoclinic) pointq 6= p.

Let U be an open bounded domain with boundary∂U ⊆ W u(p) ∪W s(p). Thenshow

1. Area(ϕn(U)) ≤ cn AreaU.

2. limn→∞ length(W s(p) ∩ ∂ (ϕn(U))) = 0.

3. For allr ∈ U for the omega limit setω(r) ⊆W u(p).


For a definition of omega limit set see Ch. 4.

Exercise 1.24 ((∗) Numerical approximation of W u(p)). In §1.3.2 on the Henonmap an algorithm was treated to find numerical approximations of a segment of theunstable separatrix. Here certain parametersε, N andm had to be chosen. Howwould you change these parameters to increase the precision? You may ignorerounding off errors that occur in numerical flops.

Exercise 1.25 ((∗) Attractor = W u(p)?). In Figure 1.15 we witness numericaloutput regarding the Henon attractor and a piece of the unstable separatrixW u(p).Check that these compare well. Carry out a similar comparison program for theattractor of the Poincare map of the forced, damped swing, see Figure C.6 in Ap-pendix C.

Exercise 1.26 ((∗) The Poincare map of the Rossler system).When consideringthe Rossler attractor in§1.3.7 we saw that the attractor of the Poincare map as con-structed there, appears to lie inside a (smooth) curve. Now numerical simulationsalways are unreliable, in the sense that it may be hard so get an idea what the pos-sible errors can be. It therefore may be difficult to found rigorous mathematicalproofs on numerical simulations. Here we ask to givearguments, based on the rep-resented simulation of an evolution of the Rossler system,why the structure of theattractor of the Poincare map in this case, can’t be as simple as the curve depictedthere.

Chapter 2

Qualitative properties andpredictability of evolutions

In the previous chapter we already met various types of evolutions, like stationaryand periodic ones. In the present chapter we shall investigate these evolutions morein detail and also we shall consider more general types. Herequalitative propertiesare our main interest. The precise meaning of ‘qualitative’in general is hard to give;to some extent it is the contrary or opposite of ‘quantitative’. Roughly one could saythat qualitative properties can be described inwords, while quantitative propertiesusually are described in terms of(real) numbers.

It will turn out that for the various types of evolutions, it is not always equally easy topredict the future course of the evolution. At first sight this may look strange: oncewe know the initial state and the evolution operator, we can just predict the futureof the evolution in a unique way. However, in general we only know the initial stateapproximately. Moreover, often, the evolution operator isnot known at all or onlyapproximately. We shall ask the question of predictabilitymainly in the followingform. Suppose the evolution operator is unknown, but that wedo know a (long)piece of the evolution, is it then possible to predict the future course of the evolutionat least partially? It will turn out that for the various kinds of evolution, the answerto this question strongly differs. For periodic, in particular stationary evolutions,the predictability is not a problem. Indeed, based on the observed regularity wepredict that this will also continue in the future. As we shall see, the determinationof the exact period still can be a problem. This situation occurs, to an even strongerdegree, with quasi-periodic solutions. However, it will turn out that for the so-calledchaotic evolutions an essentially worse predictability holds. In Chapter 6 we shallreturn to predictability probems in a somewhat different context.

68 Qualitative properties and predictability of evolutions

stationary evolutionperiodic evolutionpredictability!of

periodicevolutions

2.1 Stationary and periodic evolutions

Stationary and periodic evolutions have been already widely discussed in the previ-ous chapter. If(M,T,Φ) is a dynamical system with state spaceM, time setT andevolution operatorΦ : M × T → M, thenx ∈ M is periodic or stationary if thereexists a positivet ∈ T such thatΦ(x, t) = x; andx is stationary whenever for allt ∈ T one hasΦ(x, t) = x. If Φ(x, t) = x, we callt a period ofx. The periodic orstationary evolution as a set is given by

x(s) = Φ(x, s) | s ∈ T ⊂M.

We already noted in the example of the Doubling map, see§1.3.8, that each peri-odic point has many periods: ift is a period, then so are all integer multiples oft. As before we introduce the notion ofprime periodfor periodic, non-stationary,evolutions. This is the smallest positive period. This definitions appears withoutproblems, but this is not entirely true. In the case whereT = R, such a primeperiod does not necessarily exist. First of all this is the case for stationary evolu-tions, that can be excluded by just considering proper, i.e., non-stationary, periodicevolutions. However, even then there are pathological examples, see Exercise 2.3.In the sequel we assume that prime periods of non-stationaryperiodic evolutionsalways will be well-defined. It can be shown that this assumption is valid in allcases whereM is a metric (or metrisable) space and when the evolution operator,as a mapΦ : M ×T →M, is continuous. If such a prime period equalsτ, it is easyto show that the set of periods is given byτZ ∩ T.1

2.1.1 Predictability of periodic and stationary motions

The prediction of future behaviour in the presence of a stationary evolution is com-pletely trivial: the state does not change and the only reasonable prediction is thatthis will remain so. In the case of periodic evolution, the prediction problem doesnot appear much more exciting. In the case whereT = Z or T = N, this is indeedthe case. Once the period has been determined and the motion has been preciselyobserved over one period, we are done. We’ll see, however, that in the case whereT = R or T = R+, when the period is areal number, the situation is yet lessclear. We’ll illustrate this when dealing with the prediction of the yearly motion ofthe seasons, or equivalently, the yearly motion of the Sun with respect to the ver-nal equinox (the beginning of Spring). The precise prediction of this cyclus overa longer period is done with a ‘calendar’. In order to make such a ‘calendar’, itis necessary to know the length of the year very accurately and it is exactly herewhere the problem occurs. Indeed, this period, measured in days, or months, is not

1Often one speaks of ‘period’ when ‘prime period’ is meant.

2.1 Stationary and periodic evolutions 69

calendarsan integer number and we cannot expect to do better than find anapproximationfor this. By the ‘calendar’ here we should understand the setof rules according towhich this calendar is made, where one should think of the leap year arrangements.In stead of ‘calender’ one also speaks oftime tabulation.We give some history. In the year 45 BC the Roman emperor Julius Cæsar installedthe calendar named after him, the Julian Calendar. The year was thought to consistof 365.25 days. On the calendar this was implemented by adding a leap day everyfour years. However, the real year is a bit shorter and slowlythe seasons went outof phse with the calendar. Finally Pope Gregorius XIII decided to intervene. First,to ‘reset’ the calendar with nature, he decided that in the year 1582 (just at this oneoccassion) immediately after October 4, October 15 would follow. This means thatthe predictionsbased on the Julian system in this period of more than 1600 yearsalready show an error of 10 days. Apart from this unique intervention and in orderto prevent such shifts in the future, the leap year convention was changed as follows.In the future year numbers, divisible by 100, but not by 400, no longer will be leapyears. This led to the present, Gregorian, calendar.Of course we understand that also this intervention still isbased on an approxima-tion: real infallability can’t be expected here! In fact, a correction of 3 days per 400years amounts to 12 days per 1600 years, where only 10 days were necessary.

Our conclusion from this example is that when predicting thecourse of a periodicmotion, we expect an error that in the beginning is proportional to theinterval ofprediction, i.e., the time difference between the present moment of prediction andthe future instant of state we predict. When the prediction errors increase, a certainsaturationoccurs: when predicting seasons the error can’t exceed one half year.Moreover, when the period is known to agreater precision(by longer and / or moreprecise observations), the growth of the errors can be made smaller.

As said earlier, the problem of determination of the exact period, only occurs inthe case of time setsR or R+. In the cases where the time sets areZ or N theperiods are integers, which by sufficiently accurate observation can be determinedwith absolute precision. These periods that can be described by a countingword,i.e., they can be viewed as qualitative properties. In caseswhere the time set isR orR+, the period is rather a quantitative property of the evolution.In fact however, the difference is not that sharp. If the period is an integer, butvery large, then it can be practically impossible to determine this by observation.As we shall see when discussing so-calledmulti-periodicevolutions, it may evenbe impossible to determine by observation whether we are dealing with a periodicevolution or not.

Also regarding the more complicated evolutions to be dealt with below, we will askourselves how predictable these are and which errors we can expect as a function


asymptoticallystationaryevolution

asymptoticallyperiodic evolution

of the prediction interval. We then will merely consider predictions, based on alengthy observation of the evolution.

2.1.2 Asymptotically and eventually periodic evolutions

Returning to the dynamical systems(M,T,Φ) with an evolutionx(t) = Φ(x, t), asbefore, we give some definitions. We call the evolutionx(t) asymptotically peri-odic (or stationary)if for t → ∞ the pointx(t) tends to a periodic (or stationary)evolutionx(t), in the sense that the distance betweenx(t) and the setx(s)|s ∈ Ttends to zero ast → ∞. Here we use a metric on the state spaceM, compare withthe introduction to the present section.

Remarks.

- This definition can be extended to the case whereM only has a topology.In that case one may proceed as follows. We say that the evolution x(t) isasymptotically periodic if there exists a periodic orbitγ = x(t) | t ∈ Tsuch that for any neighbourhoodU of γ there existsTU ∈ R, such that fort > TU one has thatx(t) ∈ U.

We speak of asymptotic periodicity in thestrict senseif there exists a periodicorbit γ = x(t) | t ∈ T of periodτ and a constantc, such that

limN∋n→∞

x(t+ nτ) = x(t+ c).

In that case either the numberc/τ moduloZ or the pointx(c) can be inter-preted as the phase associate withx(t).

NB: In the case without topology, the concept of ‘asymptotic’ does no longermake sense, since the concept of limit fails to exist. The eventually periodicor stationary evolutions, to be discussed later this Chapter, are an exceptionto this.

- Assuming continuity ofΦ, it follows that the setx(s)|s ∈ T correspondingto the periodic evolutionx(s) is compact. Indeed, in the case whereT = Z

or Z+ the set is even finite, while in the cases whereT = R or R+ the setx(s)|s ∈ T is the continuous image of the circle. The reader can check thiseasily.

We have seen examples of such asymptotically periodic or asymptotically stationaryevolutions several times in the previous chapter: for the free pendulum with damp-ing, see§1.1.2, every non-stationary evolution is asymptotically stationary. For thedamped pendulum with forcing (see§1.1.2), and in the Van der Pol system (see

2.2 Multi- and quasi-periodic evolutions 71

eventually periodicevolution

eventuallystationaryevolution

§1.3.1) we saw asymptotically periodic evolutions. When discussing the Henonfamily (in §1.3.2) we saw another type of stationary evolution, given bya saddlepoint. For such a saddle pointp we defined stable and unstable separatrices, see(1.11). From this it may be clear that the stable separatrixW s(p) exactly consists ofpoints that are asymptotically stationary and of which the limit point is the saddlepointp.

The definition of asymptotically periodic or asymptotically stationary applies forany of the time sets we are using. In the case of time setN another special typeoccurs, the so-calledeventually periodic(or eventually stationary) evolution. Thisis an evolutionx(n), such that for certainN andk > 0,

x(n) = x(n + k) for all n > N.

So this is an evolution that after some time is exactly periodic (and not only in asmall neighbourhood of a periodic evolution). In this case,however, the distancefunction does not play any role.

This type of evolution is abundantly present in the system generated by the Doublingmapx 7→ 2x mod Z, defined on[0, 1), see§1.3.8. Here0 is the only stationarypoint. However, any rational pointp/q with q a power of2, is eventually stationary.Compare with Exercise 2.2. Similarly, any rational point iseventually periodic.Also for the Logistic system, see§1.3.3, such eventually periodic evolutions occuroften.

As said before, such evolutions do not occur in case of the time setsR andZ. Indeed,in these cases, the past can be uniquely reconstructed from the present state, whichclearly contradicts the existence of an evolution that ‘only later’ becomes periodic.(Once the evolution has become periodic, we can not detect how long ago this hashappened.)

Regarding the predictability of these evolutions the same remarks hold as in the caseof periodic and stationary evolutions. As said earlier, when discussing predictabil-ity, we always assume that the evolution to be predicted has been observed for quitesome time. This means that it is no further restriction to assume that we alreadyare in the stage that no observable differences exist between the asymptotically pe-riodic (or stationary) evolutions and the corresponding periodic (or stationary) limitevolution.

2.2 Multi- and quasi-periodic evolutions

In the previous chapter we already met with evolutions wheremore than one fre-quency (or period) plays a role. In particular this was the case for the free pendulum


with forcing, see§1.1.2. Also in dayly life we observe such motions. For instance,the motion of the Sun as seen from the Earth, knows two periodicities: the dailyperiod, of Sunrise and Sunset, and the yearly one that determines the change of theseasons.

We shall describe these types of evolution as a generalised type of periodic motions,starting out from our dynamical system(M,T,Φ). We assume thatM is a differ-entiable manifold and thatΦ : M × T → M is a differentiable map. For periodicevolutions, the whole evolutionx(t) | t ∈ T as a subset of the state spaceMconsists either of a finite set of points (in cases whereT = N or Z) or of a closedcurve (in the case whereT = R or R+). Such a finite set of points we can viewasZ/sZ, wheres is the (prime-) period. Such a closed curve can be viewed asthe differentiable image of the circleS1 = R/Z. Such parametrisations, so withZ/sZ or R/Z, can be chosen such that the dynamics when restricted to the periodicevolution, consists oftranslations.

- In case where the time set isZ or N :

Φ([x], k) = [x+ k],

where[x] ∈ M and where[−] denotes the equivalence class modulosZ;

- In case where the time set isR or R+ :

Φ([x], t) = [x+ tω],

where[x] ∈ M and where again[−] denotes the equivalence class moduloZ

and whereω−1 is the (prime) period of the periodic motion.

In both cases we parametrised the periodic evolution by an Abelian group, namelyZ/sZ or R/Z, respectively. Translations then consist of ‘adding everywhere thesame constant’, just as for a translation on a vector space.

In the case of several frequencies, the motion will take place on the (differentiable)image of the product of a number of circles (and, possibly, ofa finite set). Theproduct of a number of circles is calledtorus. A torus also can be seen as an Abeliangroup: the quasi-periodic dynamics then consists of translations on such a torus. Inthe sections to follow we first shall study these tori and their translations in detail.

We already like to point at a complication, that is simplest explained on a motionwith two periods. As an example we take the motion of the Sun, as discussed be-fore, with its daily and its yearly period. If the ratio of these two periods would berational, then the whole motion again would be periodic. Forinstance, if the yearlyperiod would be exactly 365.25 dayly periods, then the wholemotion described


multi-periodic!evolution

$n$-dimensionaltorus

Figure 2.1: The 2-torus as a square where the boundaries haveto be identified twoby two (a), and a realisation of this in the 3-dimensional space (b).

here would be periodic with a period of 4 years. As we saw in§2.1.1, this ratio isnot completely fitting, which is the reason for introducing the Gregorian calendar. Ifthe Gregorian calendar would be correct, the whole system would be periodic witha period of 400 years. And a further refinement would correspond to a still longerperiod. This phenomenon is characteristic for motions withmore than one period:they can be well-approximated by periodic motions, but as the approximation getsbetter, the period becomes longer. This might be an argumentfor saying: let usrestrict ourselves to periodic motions. However, for periodic motions of very longperiod (say longer than a human life as would be the 400 years and more we justdiscussed), it can be less meaningful to speak of periodic motion: they simply arenot experienced as such. This is the reason why we shall introduce the concept ofmulti-periodic evolutions. These are motions involving several periods (or frequen-cies), but where it is not excluded that, due to incidental rationality of frequencyratios the system can also be described with fewer periods.

2.2.1 Then-dimensional torus

Then-dimensional torus is a product ofn circles. This means that we can definethen-dimensional torus asRn/Zn, namely as the set ofn dimensional vectors inRn, where we identify vectors whenever their difference only has integer valuedcomponents. Then-dimensional torus is denoted byTn, so

Tn = Rn/Zn.

Forn = 1 we getT1 = S1 = R/Z, where we recall from§1.3.4, that the circleS1

can also be represented in other ways.2

The case wheren = 2 can be described a bit more geometrical. Here we deal withT2 = R2/Z2, so we start out with 2-dimensional vectors. Since in the equivalence

2Note that here we work moduloZ, while in §1.3.4 this was2πZ. This is just a matter of ‘length’or ‘angle’ scales.


relation we identify two vectors whenever their differenceis in Z2, we can representeach point ofT2 as a point in the square[0, 1] × [0, 1]. Also in this square there arestill points that have to be identified. The left boundary0 × [0, 1] is identifiedwith the right boundary1 × [0, 1] (by identifying points with the same verticalcoordinate). Similarly for the upper and the lower boundary. This means that thefour vertix points are all four identified. Compare with Figure 2.1, where similararrows have to be identified. We can realise the result of all this in the 3-dimensionalspace as the surface parametrised by ‘angles’ϕ enψ, both inR/Z, as follows:

(ϕ, ψ) 7→ ((1 − a cos(2πψ)) sin(2πφ), (1 − a cos(2πψ)) cos(2πφ), a sin(2πψ)),

where0 < a < 1 is a constant.

Example 2.1 (Rosette orbit of Mercury). The motion of the planet Mercury isan example where2 frequencies play a role. According to Newton’s theory thisplanet should move in a Keplerian ellipse. In that case the motion would be purelyperiodic. However, it wasobservedthat Mercury’s orbit is not purely elliptical, butthat at a longer time scale the ‘almost’ ellipse also rotates. In this way one obtainsthe so-called rosette orbit as indicated in Figure 2.2. Herethe two periods can beclearly distinguished. The first is the period of the ‘almost’ ellipse, approximatelyone Mercurian year, given by succesive instants when the distance between Mercuryand the Sun is maximal.3 As a second period we have the period of the major axis,i.e., for completing a full rotation which takes approximately 225 780 Yrs. CompareMisner, Thorne and Wheeler [138], p. 1113. Looking at Figure2.2 it is not hard toimagine that we are dealing with a curve on a 2-dimensional torus, that is projectedonto a plane.

Already before we saw that we can define metrics on the circle in different ways.The same holds for then-dimensional torus. A feasible choice is to define thedistance between[x] = [x1, . . . , xn] and[y] = [y1, . . . , yn] by

d(x, y) = maxj=1,...,n

(minm∈Z

(|xj − yj +m|)). (2.1)

As usual, the square brackets indicate that we are dealing with equivalence classesmodulo integer numbers.

2.2.2 Translations on a torus

For a vectorΩ = (ω1, . . . , ωn), metωi ∈ R, we define the corresponding translation(or rotation)

RΩ : Tn → Tn byRΩ[x1, . . . , xn] = [x1 + ω1, . . . , xn + ωn], (2.2)

3Called the apohelium.


translation system

Sun

Figure 2.2: Rosette orbit like the orbit of Mercury. In reality the major axis of theMercurian ‘almost’ ellipse rotates much slower than indicated here, it is only1.4′′

per revolution (i.e., per Mercurian year).

where square brackets again take the equivalence class.

With help of the translations (2.2) we now can define dynamical systems with statespaceM = Tn by

Φt = RtΩ. (2.3)

Here we can take as a time set bothT = Z andT = R. For a single translationRΩ,only the values modZ of theωj are of interest. However, when considering timesetR, wheret takes arbitrary values, the componentssωj of Ω are of interest as realnumbers. We call the dynamical system (2.3) thetranslation systemdefined byΩ.Before going to arbitrary dimension, we first consider a few simpler cases.

2.2.2.1 Translation systems on the 1-dimensional torus

In the case wheren = 1 and with time setT = R, the dynamics of the translationsystem (2.3) defined byΩ is just periodic. This statement holds in the sense that anypoint lies on a periodic evolution, except in the case whereΩ = 0, in which case allpoints are stationary. So nothing new is to be found here.

Takingn = 1 with T = Z or N, Definition (2.3) gives something completely dif-ferent. Now the dynamics strongly depends onΩ ∈ R, in the sense that it becomes


quasi-periodic!evolution

important whetherΩ is rational or not.4 Indeed, ifΩ = p/q, for certain integerspandq, with q > 0, then it follows that

RqΩ = RqΩ = IdT1 ,

is the identity map ofT!. This implies that each point ofT1 is periodic with periodq. Conversely, if there exists a point[x] that is periodic under (2.3) of periodq, thenit follows that

RqΩ([x]) = [x] ⇔ x+ qΩ = x modZ.

In that case necessarilyΩ is rational and can be written as a fraction with denomi-natorq and all[x] ∈ T1 have periodq. It follows that if Ω is irrational, no periodicpoints of (2.3) exist. This case of irrationalΩ, is the first case where we speakof quasi-periodic evolutions: the translation system (2.3) now also is called quasi-periodic. An important property of such a quasi-periodic evolution is, that it is densein T1. These are the contents of the following lemma.

Lemma 2.1 (Dense orbits).If Ω ∈ R \ Q, then for the translation system(2.3)onT1 defined byΩ, with time setN, any evolution[mΩ+x] | m ∈ N is dense inT1.This same conclusion also holds for time setZ.

Proof. We choose an arbitraryε > 0 and show that, for any[x] ∈ T1 the evolutionstarting at[x] of the translation system defined byΩ, approximates every point ofT1 to a distanceε.

Since periodic points do not occur, the (positive) evolution

[x(m)] = RmΩ ([x]) | m ∈ N

contains infinitely many different points. By compactness of T1 this set has anaccumulation point, which implies that there exist points[x(m1)] and[x(m2)] with

d([x(m1)], [x(m2)]) < ε.

This implies that the mapR|m1−m2|Ω is a translation over a distanceε < ε (to theright or to the left). This means that the successive points

Xj = Rj|m1−m2|Ω([x])

are within distanceε and thus that the set

Xj | j = 0, 1, 2, . . . , int

(

1

ε

)

,

4In this 1-dimensional case we do not make a difference between the vectorΩ = (ω1) and thevalue ofω1 as a real number.


x1x1x1x1x1x1x1x1

x2x2x2x2x2x2x2x2

x1x1x1x1x1x1x1x1x1x1x1x1x1x1x1x1x1x1x1x1x1x1x1


Figure 2.3: Evolution curve of the translation system generated by a constant vectorfield on the 2-torusT2.

where int(1/ε) takes the integer part of int(1/ε), approximates each point of thecircle to a distance less thanε. Finally we note that all pointsXj belong to the(positive) evolution of[x]. This finishes the proof. 2

We observe that to some extent all evolutions of such a translation system are ‘equal’in the sense that the evolutions that start in[x] and[y] are related by the translationRy−x. This is the reason why often we only mention properties for the evolutionthat starts in[0] and not always explicitly say that the same holds for arbitrary otherevolutions as well.

Related to the treatment of the translation dynamics in arbitrary dimension we men-tion that in the 1-dimensional caseΩ is irrational, if and only if the numbers 1 andΩ are linearly independent inR, considered as a vector space over the fieldQ.

2.2.2.2 Translation systems on the 2-dimensional torus with time setR

A next special case to consider is the 2-dimensional torusT2 with a translation sys-tem with time setR. So we haveΩ = (ω1, ω2), assuming that not both componentsare equal to zero. Without restriction of generality we may then assume thatω1 6= 0.We can now take a Poincare map. As a submanifold we take

S = [x1, x2] ∈ T2 | x1 = 0.

Note thatS can be considered as a 1-dimensional torus.5 Then for the map

Rω−11 (ω1,ω2)

= R(1,ω−11 ω2)

,

5Here, and elsewhere in this section, the term ‘can be considered as’ means that the involved setsare diffeomorphic, and that the diffeomorphism even can be chosen as a linear map in terms of thecoordinates used onR/Z.


Van der Pol systemresonanceresonant!evolution

which is the timeω−11 map, the setS is mapped onto itself. OnS, considered as a

1-dimensional torus, this Poincare map is a translation overω2/ω1. Combining thiswith the translation dynamics on a 1-dimensional torus withtime setZ, we obtain:

1. If ω2/ω1 is rational, then all evolutions are periodic;

2. If ω2/ω1 is irrational, then each evolution is dense in the 2-dimensional torus.See Exercise 5.1.

In view of the translation dynamics in arbitrary dimension,we note thatω2/ω1 ∈R \ Q, if and only if ω1 enω2 are linearly independent as elements ofR as a vectorspace overQ.

2.2.2.3 Translation systems on then-dimensional torus with time setR

We now consider the translation dynamics on ann-dimensional torus, given by thetranslation vectorΩ = (ω1, ω2, . . . , ωn) and with time setR. An important fact isthat wheneverω1, ω2, . . . , ωn ∈ R, as a vector space over the rationals, are linearlyindependent, each evolution of the corresponding translation dynamics is dense inthen-dimensional torusTn. We don’t provide a proof, since this uses theory notincluded here (Fourier-theory, Measure Theory and ErgodicTheory). The curiousreader is referred to [34], Appendices 9, 11, where the analogue for time setZ isproven. Also see [33]. In cases whereω1, ω2, . . . , ωn are linearly dependent overQ

one speaks ofresonance. We shall return to resonance in§ 2.2.3 when discussingthe driven Van der Pol equation. Still we have a few general remarks.

Lemma 2.2 (Resonant evolutions are not dense).Suppose thatω1, ω2, . . . , ωn arelinearly dependent overQ. Then the corresponding translation systemRΩ : Tn →Tn with Ω = (ω1, ω2, . . . , ωn) and time setR has no dense evolutions.

Proof. Suppose thatq1, . . . , qn ∈ Q exist, not all equal to0, such that∑

j qjωj = 0.By multiplication of this equality by the product of all denominators we can achievethat all rationalsqj , 1 ≤ j ≤ n, become integers. Next, by dividing out the commonfactors, we also achieve that the integersqj, 1 ≤ j ≤ n have no common factors.From Algebra it is known that integersm1, . . . , mn exist such that

∑

j qjmj = 1,e.g., see Exercise 2.4 or compare with Van der Waerden [202].

We now consider the function given byf(x1, . . . , xn) =∑

j qjxj . To ensure thatfis defined onTn, we take its values inR/Z, noting that the coefficientsqj, 1 ≤ j ≤n, are integers. Since

∑

j qjωj = 0, the functionf is constant on each evolution. Forinstance on the evolution starting at[0] = [0, 0, . . . , 0], the functionf is everywhere0. Sincef is continuous and not everywhere0 on all of Tn, it follows that the


evolution starting at[0] can’t be dense inTn. As remarked earlier, this immediatelyimplies that no evolution is dense inTn. 2

We stay for a while in the resonant context of Lemma 2.2, so withn∑

j=1

qjωj = 0,

with relatively prime integersqj , 1 ≤ j ≤ n. We study the function

fΩ : Tn → R/Z, [x1, x2, . . . , xn] 7→n∑

j=1

qjxj modZ,

as introduced in the above proof, as well as its zero set

NfΩ= [x1, x2, . . . , xn] ∈ Tn | fΩ[x1, x2, . . . , xn] = 0 modZ, (2.4)

a bit further. The proof asserts thatNfΩ⊂ Tn contains the entire evolution

RtΩ[0] | t ∈ R ⊂ Tn

with initial state[0]. It can be shown thatNfΩis diffeomorphic to a torus of dimen-

sionn − 1. Again, we shall not prove this, since the simplest proof we know usestheory of Lie groups [33]. We however do prove thatNf is connected.6

Lemma 2.3. The resonant zero setNfΩ(2.4) is a connected subset ofTn.

Proof. Suppose that[x1, . . . , xn] ∈ Nf , or equivalently thatf(x1, . . . , xn) ∈ Z.Since the integer coefficientsqj , 1 ≤ j ≤ n have no common factor, there is anelement(x1, . . . , xn) ∈ Rn, with

(x1, . . . , xn) ∼ (x1, . . . , xn),

therefore defining the same element ofTn, such thatf(x1, . . . , xn) = 0. Now wecan define an arc

s ∈ [0, 1] 7→ [sx1, sx2, . . . , sxn] ∈ Nf ,

connecting[0] with [x1, x2, . . . , xn] ∈ Nf , so proving thatNf is (arcwise) con-nected. 2

Although we did not show all details, we hope the idea may be clear now. Whenω1, . . . , ωn are linearly dependent overQ, for each initial state there exists a lowerdimensional torus, diffeomorphic toNfΩ

, containing the entire involution. If inthis lower dimensional torus still linear dependencies exist, the process of takinginvariant subtori can be repeated, etc. Eventually we reachthe situation in whichthenewω’s are rationally independent. If then we have a torus of dimension largerthan1, we speak of quasi-periodic dynamics; if the latter torus hasdimension1, thedynamics is periodic.

6As we shall see lateron, a similar connectedness does not necessarily hold for the time setsN orZ.


2.2.2.4 Translation systems on then-dimensional torus with time setZ or N

We next consider the translation dynamics onTn with translation vectorΩ =(ω1, ω2, . . . , ωn), but now with time setZ (or N). In this case each evolution isdense if and only ifω1, ω2, . . . , ωn and1, as elements ofR, are linearly indepen-dent overQ. For a proof we again refer to [34]. Also here we speak ofresonancewhenever the above independence fails.

The connection with the translation dynamics having time set R rests on the suspen-sion construction, compare with§1.2.2.3. When constructing the suspension of atranslation map ofTn, that is given by a translation vectorΩ = (ω1, ω2, . . . , ωn), asis the case here, we obtain a system the state space of which can be considered as an(n+1)-dimensional torusTn+1, where the dynamics is given byΦt = Rt(ω1,...,ωn,1).A proof of this assertion is not really hard, also consider the exercises. Althoughthis, in principle, reduces the case with time setZ to that of time setR by the suspen-sion construction, we like to point at a few problems. Therefore in the forthcomingconsiderations we shall not use the foregoing analysis withtime setR.

To be precise, we investigate the resonant case whenω1, . . . , ωn en1 are rationallydependent. Analogous to the setting of§2.2.2.3 with time setR, also here there existintegersq1, q2, . . . , qn, q without common divisor and for which

∑

j qjωj = q. Forq = 0, as in the previous case, one can argue that there is an(n − 1)-dimensionaltorus that contains the entire evolution. Again it follows that the evolution is notdense inTn, compare with Lemma 2.2.

Now we consider the case whereq 6= 0. Let r be the greatest common divisor ofq1, . . . , qn. In caser = 1, we learn from Algebra that integersm1, m2, . . . , mn existwith

∑

j qjmj = 1. Here recall from the introduction of§2.2.2 that, since the timeset isZ, we may add to eachωj, 1 ≤ j ≤ n, an integer. It follows that forr = 1, bychoosing suitable ‘realisations’ ofω1, ω2, . . . , ωn, we can achieve thatq = 0, andwe are in the above case with an invariant(n−1)-torus. Next we assume thatr 6= 1and thatq 6= 0. Sinceq1, q2, . . . , qn, q have no common divisors, it follows thatqandr are coprime.

In this setting we define the functionfΩ : Tn → R/Z by

fΩ(x1, x2, . . . , xn) =∑

j

qjrxj modZ,

compare with the proof of Lemma 2.2. Since the integersqj/r, 1 ≤ j ≤ n, haveno common divisors, as in the previous case, the resonant zero setNfΩ

⊂ Tn of fis connected. And again it can be shown thatNfΩ

can be considered as a torus ofdimensionn− 1.


conjugation!smoothFrom the above it follows that

fΩ(RkΩ([x])) = fΩ([x]) +

qk

r.

Sinceq andr are coprime, the numbersqk/r as elements ofR/Z, for k ∈ Z, attainall values0, 1/r, 2/r, . . . , (r − 1)/r. We now define

NfΩ=

[x] = [x1, . . . , xn] ∈ Tn | fΩ([x]) ∈ 0, 1r,2

r, . . . ,

r − 1

r

,

where0, 1/r, 2/r, . . . , (r − 1)/r ∈ R/Z. From this it may be clear thatNfΩis the

disjoint union ofr tori of dimensionn − 1 and that the evolution starting at[0]remains withinNfΩ

. A fortiori the evolution is not dense inTn, again compare withLemma 2.2. Furthermore the translationRr

Ω maps the torusNfΩ⊂ NfΩ

onto itselfand the restrictionRr

Ω|NfΩis a translation. Now if any evolution of the restriction

RrΩ|NfΩ

is dense inNfΩ, we call the original dynamics, restricted toNfΩ

quasi-periodic. If the latter is not the case forRr

Ω|NfΩstill dependencies occur between

the newω’s and the reduction can be continued.

Remark. We conclude that, in the case of time setZ (or N), it cannot be avoided toconsider quasi-periodic evolutions on the disjoint union of a finite number of tori.This difference with the case of time setR occurs, notwithstanding the fact that bythe suspension construction the theory with time setZ (of N) can be transferred intothe theory of translation systems with time setR and vice versa.

2.2.3 General definition of multi- and quasi-periodic evolutions

Before discussing multi- and quasi-periodic evolutions ingeneral, we first definewhat are multi- and quasi-periodic subsystems; multi- and quasi-periodic evolutionsthan will be evolutions within such a subsystem. A formal definition will followsoon, but, since this is quite involved we first sketch the global idea. Such a quasi-periodic subsystem is an invariant subset such that the restriction to this subset,compare with§1.2.2, isconjugatedto a translation sytem. Here a conjugation is tobe understood as a bijection, in this case a diffeomorphism,that maps the evolutionsof one system to evolutions of the other system in a time-preserving way.

2.2.3.1 Multi- and quasi-periodic subsystems

Now we come to the formal definition. Consider a dynamical system (M,T,Φ).HereM is a differentiable manifold (where it is not an essential restriction to thinkof M as a vector space) andΦ a smooth evolution operator, taking for the time setT eitherZ, N or R.


multi-periodic!subsystem

quasi-periodic!subsystem

Definition 2.4 (Multi- and quasi-periodic subsystem).We say that this system hasa multi-periodic subsystemif there exists a translation system(Tn, T,ΦΩ), where

ΦΩ([x], t) = RtΩ([x]) = [x+ tΩ],

and a differentiable mapf : Tn → M, such that for all[x] ∈ Tn and t ∈ T theconjugation equation

f(ΦΩ([x], t)) = Φ(f([x]), t) (2.5)

holds. The multi-periodic subsystem is calledquasi-periodicif the translation vec-tor Ω contains no resonances. The mapf is required to be an embedding, meaningthat f(Tn) ⊂ M is a submanifold and that the derivative off at each point hasrankn. We callf(Tn), with the dynamics defined by the restriction ofΦ, the multi-periodic subsystem.7

Validity of the conjugation equation (2.5) implies not onlythat f is a diffeomor-phism fromTn to the subsystem of(M,T,Φ), but also thatf maps evolutions ofΦΩ to evolutions ofΦ, in such a way that the time parametrisation is preserved. Thismeans thatf is aconjugationbetween the translation system onTn with the restric-tion of the original system tof(Tn). We call the translation systemconjugatedwiththe restriction of the original system tof(Tn). Conjugations will occur more often;within the class of dynamical systems they play the role of ‘isomorphisms’.8

In the case of time setT = Z (or N) we also speak of a multi-periodic subsys-tem when the conjugation equaton (2.5) only holds for valuesof t ∈ rT for afixed integerr. Here it is required that the setsΦ(f(Tn), j) are disjoint forj =0, . . . , r − 1. In that case, the state space of the quasi-periodic subsystem is theunion

⋃

j Φ(f(Tn), j) of thosen-tori. This extension follows earlier remarks con-cluding§2.2.2.

From Definition 2.4 it follows that a multi-periodic subsystem is quasi-periodic ifand only if the corresponding translation system is dense inthe torus. Let us discussthe case where the subsystem is multi- but not quasi-periodic. We have seen that,if a translation on ann-torus defines translation dynamics, the latter is only multi-(and not quasi-) periodic, if each evolution is contained ina lower dimensional torusor in a finite union of such subtori. This means that if we have amulti- (and notquasi-) periodic subsystem, by restriction to such a lower dimensional torus, weget a lower dimensional subsystem. As in the case of translation systems, we cancontinue to restrict to lower dimensions until no further reduction is possible. Inthis way we end up in a periodic evolution or a quasi-periodicevolution. The latterwe have defined as an evolution in a quasi-periodic subsystem.

7In Arnold [33] instead of ‘multi-periodic’ the term ‘conditionally periodic’ is being used.8Lateron we shall meet conjugations that are only homeomorphic instead of diffeomorphic.


When comparing with asymptotically periodic evolutions, see§2.1.2, it may nowbe clear what is understood byasymptotically multi-or quasi-periodicevolutions:these are evolutions that tend to multi- or quasi-periodic evolutions for increasingtime. As in the case of periodic evolutions, this can be takeneither in a strict sense,or only as convergence in distance. Furthermore eventuallymulti- or quasi-periodicevolutions could be defined in a completely analogous way, but we are not aware ofany relevant examples where such evolutions play a role.

2.2.3.2 Example: the driven Van der Pol equation

As a concrete example of the above, we consider the driven Vander Pol equation,mainly restricting to the case where the driving force is ‘switched off’. Here weillustrate how a multi-periodic subsystem can be constructed. Of course it would benice to do the same when the driving force is ‘switched on’. Weshall go into theproblems that are involved here and mention some of the results that research hasrevealed in this case, The driven Van der Pol equation given by

x′′ = −x − x′(x2 − 1) + ε cos Ωt, (2.6)

compare with the equation of motion of the driven pendulum, see§§1.1.2 and 1.3.1.Our present interest is with multi and quasi-periodic dynamics of (2.6) as this occursfor small values of|ε|, although our mathematical exploration for now mainly willbe confined to the caseε = 0. Compare with Figure 1.11 showing the phase portraitof the free Van der Pol oscillator (1.9). However, as long as possible we workwith the full equation (2.6). The present purpose is to construct a multi-periodicsubsystem. As in the case of the driven pendulum, we can put this equation in thefomat of an autonomous system with three state variables, namelyx, y = x′ andz.Herez takes on values inR/Z, playing the role of the timet, up to a scaling1

2πΩ.

The new equations of motion thus are

x′ = y

y′ = −x− y(x2 − 1) + ε cos(2πz) (2.7)

z′ = 12πΩ.

In §1.3.1 we already noted that forε = 0, this system has a unique periodic evolu-tion. All other evolutions, except for the stationary evolution given byx = y = 0,are asymptotically periodic and tend to this periodic evolution, again see Figure1.11. This periodic evolution can be parametrised by the map

f1 : R/Z → R2 = x, y,such that the periodic evolution is given by

(x(t), y(t)) = f1

(

1αt)

,


whereα is the prime period. This gives a mapping

f : T2 → R3 = (x, y, z), (w, z) 7→ (f1(w), z),

where bothw andz are variables inR/Z. On the 2-torusT2 = w, z we next con-sider the translation dynamics corresponding to the systemof differential equations

w′ = 1α

z′ = 12πΩ.

In other words, we consider the translation dynamics corresponding to the transla-tion vector( 1

α, 1

2πΩ). It is not hard to show thatf satisfies the conjugation equation

(2.5) and that in this way we have found a multi-periodic subsystem of the drivenVan der Pol equation (2.6) for the caseε = 0.

As said before, we would have preferred to deal with the caseε 6= 0, since forε = 0 thez-coordinate is superfluous and we seem to deal here with a kindof ‘fakemulti-periodicity’. The analysis for the caseε 6= 0, even for the case where|ε| ≪ 1,belongs to Perturbation Theory, in particular Kolmogorov-Arnold-Moser (orKAM )Theory, and will be introduced in Chapter 5, also see the historical remarks andreferences in§ 2.2.5 below. For more details see Appendix B.

First, with help of the Theory of Normally Hyperbolic Invariant Manifolds [113, 4]it can be shown that close to the multi-periodic subsystem for ε = 0, alreadymentioned before, there exists an invariant manifold that is diffeomorphic to the2-dimensional torus. For the existence of such a torus it is of interest that the orig-inal invariant torus ‘attracts’ nearby evolutions in a sufficiently strong way. Thedynamics in the ‘perturbed’ torus can be considered as a small perturbation of thedynamics that we obtained in the multi-periodic subsystem for ε = 0. It turns outthat in the ‘perturbed’ dynamics, it is no longer true that either all evolutions areperiodic or quasi-periodic. Compare with Appendix C.

Next, to describe this situation we consider the system (2.7) for a fixed, small valueof ε 6= 0 and for variable values ofΩ. The analysis of such systems can be foundin [110, 62, 61, 23, 15]. For a summary also see Chapter 5 and Appendix B. Infact, it turns out that for many values ofΩ, to be precise for a set ofΩ-values ofpositive Lebesgue measure, the evolution in the perturbed torus are quasi-periodic.Apart from this there are also many values ofΩ where the perturbed torus containesperiodic evolutions, generally only finitely many. In Figure 2.4 it is indicated howon a 2-torus multi-periodic dynamics, that happens to be periodic, can be perturbedto dynamics with finitely many periodic evolutions. For the values ofΩ where suchperiodic dynamics occurs we have resonance between the period of the driving andthe period of the ‘free’ Van der Pol system, compare with§2.2. Without drivingthese resonances correspond to values ofΩ for which the frequency-ratio1

2παΩ is

rational.


predictability!ofmulti-periodicevolutions

predictability!ofquasi-periodicevolutions

prediction

1 : 1

1 : 2

Figure 2.4: Periodic dynamics on a 2-torus. Left: multi-periodic evolution in aresonance. Right: a (not too small) perturbation of the former.

2.2.4 Predictability of quasi-periodic evolutions

Just like periodic evolutions, quasi-periodic evolutionsin principle are quite wellpredictable. An important historical example of this is themotion of the planets.As these motions appear to us, at least approximately, threefrequencies play a role.The first of these concerns the daily motion of the Sun, Secondly, there is the motionwith respect to the Sun, in which each planet has its own period. Thirdly, we havethe yearly motion of the Sun, in which the planets participate ‘indirectly’. Longbefore the ‘causes’ of the planetary motions were well-known, people were wellable to predict them. In this the famous epicycles played an important role. It shouldbe noted that this method is independent of the special application on the planetarysystem: the present methods where (quasi-) periodic motions are approximated byFourier series is closely related to the method of epicycles.

We now investigate the predictability of quasi-periodic motion more precisely. As inthe case of predicting periodic motions, we shall be concerned with predictions notbased on equations of motion or evolution operators, but on along term observationof an evolution and the principlel’histoire se repete.9 Initially we shall not take into

9History repeats itself.


“textitl’histoire ser“’ep“‘ete

prediction!intervalevolution!past

account the fact that in general states can’t be registered with absolute precision.

We start considering translations on a torus with quasi-periodic evolutions. Forany evolutionx(t) | t ∈ T of such a system, since it is dense in the torus,the following holds true. For anyε > 0 there existsT (ε) ∈ R, such that foreacht ∈ T there existst ∈ [0, T (ε)] ∩ T whered(x(t), x(t)) < ε. Hered is atranslation invariant metric on the torus, compare with§2.2.1, in particular, equation(2.1). In other words this means that anyx(t) can beε-approximated byx(t), where0 ≤ t ≤ T (ε). Since translations do not change distances, it follows thatfor anypositives alsod(x(t+s), x(t+s)) < ε.On this fact we base our prediction method.First, we fixε according to the desired precision of the prediction. Next,we observean initial segmentx(t) | t ∈ T with 0 ≤ t ≤ T (ε) of the evolution. Here thedetermination ofT (ε) in dependence ofε, forms a practical problem that we leaveaside for the moment. Once this observation has been made, for any future instantt > T (ε) we can find a suitablet ∈ [0, T (ε)] with d(x(t), x(t)) < ε. We then usex(t+ s) as a prediction forx(t+ s). This it the principlel’histoire se repete. Notethe quality of the prediction stays the same, however large we choose the predictionintervals.

Remark. One small problem still remains. Using this method we can’t predictfurther into the future than the differencet − t. Indeed, if we want to go beyondthis, we are usingx(t + s) with s > t − t, and this is a state of the evolution that,at the timet, still is hidden in the future and therefore can’t be used for making aprediction at the timet. It thus follows that in this way we can continue our methoduntil prediction ofx(t+ (t− t)) and that our prediction of this state equalsx(t). Ifwe wish to go beyond this limit with our predictions, it is feasable to go further inthe past of the evolution and look for an instant where the evolution was closex(t),then assuming that the continuation beyond instantt + (t − t) is the same as thatfollowing the past instant.

When earlier searching the past for an instant where the state is close tox(t), wefound t. In this way, we predict the future as a periodic repetition of

x(s) ∈M | s ∈ T andt ≤ s ≤ t.

It may however be expected that the maximal prediction errorin each period of thissequence increases with the same amount, namelyε, until saturation occurs. Thismeans that the errors can’t exceed the diameter of the torus in which the evolutiontakes place.

We see that also here, the maximal error increases approximately linear with theprediction intervals, till saturation takes place. And again, the ratio between theprediction errors and the prediction interval become more favourable when knowinga longer initial piece of the evolution, and therefore can choose the valueε smaller.


prediction!errorComparing the above conclusions with those in the periodic case, we see a greatsimilarity. One difference, however, is that in the above itis not yet taken intoaccount that the states can only be observed with a finite precision. (In the periodiccase this was essentially the only source of prediction errors.) This is related to thefollowing. In the periodic case the period can be determinedwith absolute precisionafter observing at least over one full period (leaving asidethe observation errors).In the quasi-periodic case none of the periods can be determined by just observing afinite piece of the evolution, because it does not close up andthe continuation is notfully determined. In other words, continuation can’t be concluded from the givenobservation. When taking into account this inaccuracy, theabove conclusion aboutthe ratio between prediction error and prediction intervalremains the same, whereonly this ratio is less favourable.

For an arbitrary multi-periodic subsystem of(M,T,Φ), see Definition 2.4, theabove discussion has to be slightly adapted, since the mapf : Tn → M from torusto state space in general does not preserve distances. However, sinceTn is compact,the mapf that defines a quasi-periodic subsystemf(Tn) is uniformlycontinuous.From this it is rather simple to deduce the following. For anyε > 0 there existsδ > 0, such that wheneverp, q ∈ Tn are within distanceδ, then for alls > 0 onehas

dM(Φ(f(p), s),Φ(f(q), s)) < ε.

HeredM is the metric onM. This implies that the prediction method also works inthis general setting, in the sense that also here the prediction errors increase linearlywith the prediction intervals.

In case we are dealing with an asymptotically quasi-periodic evolution we have toadapt our arguments a bit further, but essentially this prediction method, based oncomparison with the past, works as described above.

Linear growth of prediction errors, where the speed of the growth can be madearbitrarily small by carrying out sufficiently long observations, means that we aredealing with predictability of an exceptionally high quality. In many systems, withthe weather as a notorious example, predictability is far more problematic. Thecommon idea about this is that always ‘random’ perturbations occur, which spoilthe predictability. The only exception is formed by the planetary motion, that seemto be of an ‘unearthly’ predictability. As we shall see hereafter, there are also evo-lutions of simpledeterministicsystems, the so-called chaotic evolutions, that areessentially less predictable than the (asympotically) quasi-periodic evolutions metso far. This being said, we have no wish to deny that in practical circumstances also‘randomness’ can play a role,


2.2.5 Historical remarks

For a long time one has wondered whether quasi-periodic evolutions, as definedabove, do occur in ‘natural’ dynamical systems other than the special examples in-vented by mathematicians. Even in the case of the driven Van der Pol equation(compare with§2.2.3.1), the proof that quasi-periodic evolutions occur is far fromsimple: it was only found in the 1950’s and 60’, compare with [62, 61, 15, 23].The underlying problem has puzzled mathematicians for a long time and a com-plete treatment of the solution of this problem would carry too far and we refer toChapter 5 and to Appendix B. However, this is a suitable placeto present moreinformation on the problem setting, as this is connected to the stability of the SolarSystem. It was not for nothing that our first example of motions that, at least to avery good approximation, are multi-periodic, was given by the planets. We shallnow consider the entire Solar System. In this system the motion to a good approxi-mation is multi-periodic. The simplest approximation is obtained by neglecting themutual attraction of the planets. In this approximation each planet, only attractedby the Sun, carries out a perfect periodic motion along an ellipse, as this was dis-covered by Kepler and explained by Newton. Now, if the ratio’s of the periods ofrevolution of all these planets have no rational dependence(which indeed would bequite exceptional), then the motion of the entire system is quasi-periodic.

The question then is, what are the consequences of the mutualattraction of theplanets. From our experience we know that the short term effects, say over severalthousands of years, are quite restricted. The stability problem of the Solar Systemhowever is more involved: the question is whether these effects on an infinitelylong time scale are also restricted. Here several scenario’s come to mind. To startwith there is the most optimistic scenario, where the mutualinteraction does notaffect the quasi-periodic character of the global motion, but only position and formof the corresponding invariant torus in the state space are abit distorted. In termsof the above Definition 2.4 of a quasi-periodic subsystem, this would just mean asmall distortion of the mapf, that conjugates the translation system to the quasi-periodic subsystem. This is the most orderly way in which thesmall perturbationskeep each other in balance. Another scenario occurs when thesmall perturbationsdo not balance on the average, but keep accumulating. In principle this may leadto various interesting calamities: in the not too near future planets could collide oreven escape from the Solar System. Also the Earth might be captured as a satelliteby Jupiter, etc., etc. Evidently, these questions are only of interest when lookingfurther ahead than only a few generations. It was king Oskar II of Sweden and Nor-way,10 who called a large scale contest, where mathematicians11 were challenged to

101829-190711Already then, Celestial Mechanics was mainly the domain of mathematicians.

2.3 Chaotic evolutions 89

chaotic evolutionshed light on this matter. The contest was won by Poincare,12 not because he clar-ified the situation, but because he showed that the complications by far exceededeveryone’s expectations. Poincare’s investigations [157] were of tremendous in-fluence on the development of the mathematical theory of Dynamical Systems.13

Nevertheless, the original problem concerning the Solar System remained. Later,around 1960, some further progress was made in this direction. By the joint effortsof Kolmogorov,14 Arnold15 and Moser16, one speaks ofKAM Theory, eventuallyit was ‘approximately’ proven that, if the mutual interaction between the planetsis sufficiently small, in the state space there is a set of initial conditions of posi-tive volume, that give rise to quasi-periodic dynamics. Compare with [32].17 Theterm ‘approximately’ is used here, because this result has not been elaborated forthe full generality of the three-dimensional Solar System.Anyway, it was shownthat quasi-periodic evolutions fill out a substantial part (i.e., of positive volume) inthe state space in systems, the practical relevance of whichcan’t be denied.KAM

Theory probably is the most important contribution of the 20th century to Classi-cal Mechanics. Further elaboration of the consequences of this discovery still is animportant area of research, not only for Celestial Mechanics, but also for StatisticalPhysics (which aims to explain properties of matter by the molecular structure) andfor the analysis of resonance phenomena.

For more complete information on the historical details we refer to [175, 87, 39].For a recent overview from a more general perspective with many references to theliterature, also see [61, 23]. In Chapter 5 we present an introduction toKAM Theory.

2.3 Chaotic evolutions

We have seen that the qualitatively good predictability of a(quasi-) periodic evolu-tion x(t) of a dynamical system(M,Φ, T ) rests on the fact that

A. For anyε > 0 there existsT (ε) ∈ R such that, for all0 ≤ t ∈ T , there is at ∈ [0, T (ε)] ∩ T with d(x(t), x(t)) < ε;

B. For anyε > 0 there is aδ > 0 such that, ifd(x(t), x(t)) < δ, with 0 ≤ t, t ∈ T,then for all0 < s ∈ T alsod(x(t+ s), x(t+ s)) < ε.

12Henri Poincare 1854-191213The precise history is quite involved, compare with Barrow-Green [39].14Andrey Nikolaevich Kolmogorov 1903-198715Vladimir Igorevich Arnold 1937-16Jurgen K. Moser 1928-199917Certain approximate ‘strong’ resonances in the planetary motion, e.g., concerning Jupiter and

Saturn, were taken into account by the model of Arnold [30].


predictability! ofchaotic evolutions

Doubling map

Indeed, these two statements imply that given prediction error ε, there exists asuffiently long past interval[0, T (ε)] such that at a ‘present’ instantt > T (ε) wecan predict the ‘future’ statex(t + s), by x(t + s), for some instantt ∈ [0, T (ε)],keeping things within the prescribed errorε. We here need thatt+ s ∈ [0, t]. More-over,s is called the prediction interval and the whole principle wenamedl’histoirese repete.

For the chaotic evolutions, to be discussed now, property A still holds, but propertyB is no longer valid. Based on A, it will be possible, given an initial piece of theevolution[0, T (ε)] of sufficient length, to find statesx(t) that are very close to the‘present’ statex(t), but predictions based on the principlel’histoire se repetewill beless succesful. Before giving a formal definition of what is going to be understoodas a chaotic evolution, we first investigate the Doubling map, see§1.3.8, for the oc-currence of badly predictable, and therefore possibly chaotic, evolutions. Althoughsimilar evolutions occur in many of the other examples, onlythe Doubling map issufficiently simple to be mathematically analysed by ratherelementary means.

2.3.1 Badly predictable (chaotic) evolutions of the Doubling map

We already saw in§1.3.8, that in the dynamics defined by iteration of the Doublingmap, where we have the situation in mind with the state spaceS1 = R/Z, there existmany periodic and eventually periodic evolutions. Evidently these evolutions arenot chaotic. We now consider an evolutionx(n) that is neither (eventually) periodicnor stationary. Such an evolution necessarily has infinitely many different states(i.e., points on the circleS1). Indeed, if for certainn1 < n2 one has thatx(n1) =x(n2), then the evolutions surely is from timen1 on periodic and contains no morethann2 different points. So assuming that we are dealing with a not (eventually)periodic evolution, for anyn1 6= n2 we havex(n1) 6= x(n2).

Lemma 2.5. For any evolution of the iterated Doubling map that is not (eventually)periodic, propertyA is valid.

Proof. Let ε > 0 be chosen. Consider a sequence of non-negative integers0 =n0 < n1 < n2 < . . . such that for each0 ≤ j < i one hasd(x(ni), x(nj)) ≥ ε. Tomake the definition of the sequencenj unique, we require that for anyn > 0 thatdoesn’t occur in the sequence, there existsnj < n, such thatd(x(n), x(nj)) ≤ ε.From the definition of the sequencenj it now follows that for any twoi 6= j wehave thatd(x(ni), x(nj)) ≥ ε. Since the circleS1 has finite length, the sequencenj has to befinite. LetN = nk be the final element of the sequence. (Note thatbothk andN, as well as the elementsn1, n2, . . . , nk of the sequence, depend onε.)Now we have that for anyn > 0 there exists an elementnj of the sequence such


prediction!errorthatd(x(n), x(nj)) < ε. From this it clearly follows that property A holds for ourevolution: for the given value ofε we can takeTε = N = nk. 2

To show that the second property B does not hold, we fixε < 14.We use the distance

function onS1 = T1 that also is used on generaln-tori:

d([x], [y]) = minx∈[x],y∈[y]|x− y|,

wherex, y ∈ R and where[−] denotes the equivalence class modZ. For two points[x], [y] ∈ S1 with d([x], [y]) < 1

4it follows that

d(ϕ([x]), ϕ([y]) = 2d([x], [y]),

where byϕ we denote the Doubling map.18 This means that, however small thedistanceδ = d(x(n), x(m)), with n 6= m, always a positives exists such that

d(x(n + s), x(m+ s)) ≥ ε. (2.8)

Indeed,d(x(n+ s), x(m+ s)) = δ2s, (2.9)

as long asδ2s < 12. Now if δ < ε, then for

s = 1 + int

(

ln ε− ln δ

ln 2

)

,

where the function ‘int’ takes the integral part, the inequality (2.8) occurs (for thefirst time).

With this same argument we can also indicate how prediction errors increase. Letus assume the same set-up as before where we compare with an initial segment ofthe evolutionx(t) | t ∈ T. Given a ‘present’ statex(t), if we are able to find astatex(t) in the initial, ‘past’ segment, such thatd(x(t), x(t) < δ, where we usex(t+ s) as a prediction for the ‘future’ statex(t+ s), then we know that, accordingto (2.9), the prediction error does not exceedδ2s (for the moment leaving aside thesaturation that takes place when errors increase to1

2). We see that here we have

exponential growthof the prediction error as a function of the prediction interval s.Also here we should take into account that fort+ s > t extra errors will occur, butpresently this is of lesser interest, since by the rapid increase of prediction errors themaking of predictions far in the future has become nonsensical.

Remark. We note that, even if we have exact knowledge of the evolutionoperator,a small inaccuracy in our knowledge of the intial state, already leads to the same

18Foregoing the subtleties of notation of§1.3.8.


exponential growth of prediction errors. This is contrarious to what we see for(quasi-) periodic evolutions.19 If, in the dynamics of the Doubling map, we knowthe initial condition up to an error of at mostδ, then prediction with help of theevolution operator and prediction intervals, involves an error of at mostδ2s, againleaving alone saturation. The effect of exponential growthof errors is illustrated inFigure 2.5.

x

t0

1

50

Figure 2.5: Effect of a small inaccuracy in the initial statein evolutions of theLogistic map: the difference in the initial states is approximately2.10−16. (We usedthe Logistic map rather than the Doubling map, since in the latter case numericalartefacts occur as a consequence of the binary number system.)

In general we express the upper bound for the prediction error in a chaotic evolution,to be defined shortly, as a function of the prediction interval s, for larges, by anexpression of the form

δeλs.

Hereδ can be made small by the amount of information we have over a long intervalin the past of the evolution (if our predictions are based onl’histoire se repete) oron the initial condition (if we use the exact evolution operator). The constantλ iscompletely determined by the evolution operator of the system. In the case of theDoubling mapλ = ln(2).

To clarify the difference in predictability of such chaoticevolutions in comparisonwith the (quasi-) periodic evolutions mentioned earlier, we investigate how muchtrouble it takes to enlarge the prediction interval, where we then have to produceuseful predictions. Here the notion ofpredictability horizoncomes up. This is the

19This remark will be elaborated later: here we are always dealing with one evolution whileinaccuracy in the initial condition involves many evolutions.


maximal time intervals that as a prediction interval still gives reasonable predic-tions. Apart from the saturation mentioned before, in the chaotic case the predictionerror has an upper bound ofδeλs, where in the (quasi-) periodic case we found anupper boundδs. In both casess is the prediction interval, whileδ depends on thelength of the initial interval of the evolution, we used for making our predictions.Decreasingδ mostly will be difficult and expensive. The predictability horizon nowis obtained by introducing a maximal prediction errorc and solving the equations

δeλs = c andδs = c,

thus obtaining the prediction horizon

s =

1λ(ln(c) − ln(δ)) chaotic case

cδ

(quasi-) periodic case.

If we are able to decreaseδ to δ′ = δ/σ, for σ > 1, this will lead to an increaseof the prediction horizon. In the (quasi-) periodic case thehorizon ismultipliedbyσ, while in the chaotic case the horizon is increase byadditionof ln(σ)/λ. For theDoubling map whereλ = ln(2), we just addln(σ)/ ln(2). The better predictabilityof (quasi-) periodic evolutions therefore is connected with the fact that larger valuesof the predictability horizon are ‘more easily reached by multiplication than byaddition’.

The exponential growth of the effects of small deviations inthe initial state, as justmentioned, was first ‘proven numerically’ by Lorenz for the system of differentialequations we treated in§1.3.7. A complete proof that such exponential growth re-ally occurs was found only in 1998, see [197]. Already numerical results in thisdirection can be seen as an indication that in such cases practically no reliable pre-dictions can be made on events further in the future. Lorenz’s system was a strongsimplification of the meteorological system that describesthe dynamics underlyingthe weather. In [35] one goes further: only using heuristic,but theoretically basedarguments, it is argued that in weather predictions the errors, as a function of theprediction intervals, grow by a dayly factor of 1.25. This factor may seem low,given our experience with the Royal Netherlands Meteorological Institute (KNMI),but this can be well explained by the incompleteness of the arguments used. In anycase, a better understanding of the predictability of phenomena like the weather, isa fascinating area of research, about which the last word hasnot yet been spoken.

2.3.2 Definition of dispersion exponent and chaos

Here we shall limit and simplify the concepts, used above in an informal way, soto arrive at formal definitions. In Chapter 6 we return to the problem of quantify-ing the unpredictability. There we shall be mainly interested in algorithms for theinvestigation of experimental data, for which a different approach is required.


dispersion exponent“textitl’histoire se

r“’ep“‘ete

We start with property A, see§ 2.3.1, needed to be able to speak on predictionsbased on the past, i.e., on the principlel’histoire se repete. We assume a state spaceM with a metricd : M × M → R+, assuming thatM, under the metricd, isa complete space. This means that every Cauchy sequence has alimit. Also, forsimplicity, we assume the evolution operatorΦ : M × T → M to be continuous,whereT ⊂ R is the time set as before. Throughout we shall use the followingnotation:

1. Instead ofx(0) we also just writex;

2. The positive evolution ofx, as a subset ofM, is denoted by

O+(x) = x(t) | t ∈ T ∩ R+;

3. The initial segment of the evolution ofx, during timeT , is denoted by

O+(x, T ) = x(t) | t ∈ T ∩ [0, T ].

In this notation property A is equivalent to: For anyε > 0 there existsT (ε), suchthatO+(x) ⊂ Uε(O(x, T (ε))), whereUε stands for ‘ε-neighbourhood of’.

Lemma 2.6. Let the dynamical system(M,T,Φ) be as above, whereM be a com-plete metric (metrizable) space andΦ continuous. Then, for the positive evolutionof x = x(t) propertyA holds if and only if the closureO+(x) is compact.

Proof. A complete metric space is compact if and only if, for anyε > 0, it allowsa cover with finitely manyε-neighbourhoods. If we speak of anε-neighbourhood,it is understood that we mean a neighbourhood of one point. Itis not hard to showthat this assertion is equivalent to the usual characterisation, which says that everyinfinite sequence has an accumulation point.

We first show that property A implies compactness ofO+(x). So we assume that Aholds true. Then, for anyε > 0 a numberT (1

2ε) > 0 exists, such thatO+(x) is con-

tained in the12ε-neighbourhood ofO+(x, T (1

2ε)). Now O+(x, T (1

3ε)) is compact,

as the image of the compact setT ∩ [0, T (12ε)] under the continuous mapx( ). This

implies thatO+(x, T (12ε) can be covered by a finite number of1

2ε-neighbourhoods.

Then also the correspondingε-neighbourhoods, i.e., around the same points, cov-ersO+(x, T (1

2ε). Moreover, the correspondingε-neighbourhoods then cover a1

2ε-

neighbourhood ofO+(x, T (12ε) and henceO+(x). SinceO+(x) clearly is metri-

cally complete, by the above characterisation, this gives compactness ofO+(x),which concludes the first part of the proof.

We now assume that property A does not hold and show thatO+(x) is not com-pact. Since A does not hold, there existsε > 0 such thatO+(x), for any T ,


positively compactis not contained inUε(O+(x, T )). In turn this implies that there exists an infi-nite sequence0 < T1 < T2 < . . . , such that for eachj > 0 we have thatx(Tj+1) 6∈ Uε(O+(x, Tj)). This means thatd(x(Ti), x(Tj)) ≥ ε, for all positivei 6= j. Now, if we have a cover ofO+(x) with 1

2ε-neighbourhoods, then two differ-

ent points of the formx(Tj) can’t lie in one same12ε-neighbourhood. This means

that there is no finite cover ofO+(x) with 12ε-neighbourhoods. Again using the

above characterisation, it follows thatO+(x) is not compact. 2

It is well-known [122] that a subset of a finite dimensional vector space has a com-pact closure if and only if the subset is bounded. This implies that for systems with afinite dimensional state spaceM property A holds for any positive evolution that isbounded. In infinite dimensional vector spaces, in particular, in function spaces, thisconclusion is not valid. Under certain circumstances, e.g., for positive evolutionsof parabolic partial differential equations, it still is possible to show that positiveevolutions do have a compact closure.

Definition 2.7 (Positively compact).Let(M,T,Φ) be a dynamical system as above,withM a topological space andΦ continuous. A positive evolutionx = x(t) withpropertyA, i.e., which as a subset ofO+(x) ⊆ M has a compact closure, is calledpositively compact.

When considering the not eventually periodic or stationaryevolutions of the Dou-bling map, we mentioned that for an evolution that we like to call ‘chaotic’, propertyB is notvalid. Now we shall introduce a quantity for the ‘measure in which B doesnot hold’. This quantity measures the exponential growth ofsmall errors as dis-cussed before. Our set-up remains the same, where the state spaceM a completemetric space with metricd. Moreover we consider an evolutionx(t), t ∈ T, that ispositively compact and not (eventually) periodic. The latter is needed so that for allε > 0 there existt′ 6= t ∈ T, with 0 < d(x(t), x(t′)) < ε. First we define

E(s, ε) = sup0<t6=t′∈T ;d(x(t),x(t′)<ε

d(x(t+ s), x(t′ + s))

d(x(t), x(t′)), (2.10)

where0 < s ∈ T andε > 0. The functionE(s, ε) indicates the maximal factor,the distance between two points of the evolution increases as time increases bysunits, assuming that the initial distance between the two points does not exceedε.This is a measure for the prediction errors in dependence of the prediction intervals. In case we predict with help of the exact evolution operator,but on the basis ofrestricted precision of the initial state,ε indicates this lack of precision. In case wepredict by comparison with the past,ε is the distance within which an arbitrary statecan be approximated by a ‘past state’.


prediction!intervalprediction!error

The functionE(s, ε) in (2.10) does not have to be finite. However, it can be proventhat for a positively compact evolution of a system with aC1-evolution operator,E(s, ε) is finite for all positive values ofs andε. See Exercise 2.8.

We already pointed at the phenomenon of saturation: prediction errors generallygrow (linearly or exponentially) until saturation takes place. When the initial errordecreases, saturation will occur later. Therefore, we can eliminate the effect ofsaturation by considering a fixed prediction intervals, restricting to smaller initialerrors. This motivates taking the following limit:

E(s) = limε→0

E(s, ε).

The existence of this limit (that may take the value∞) follows from the fact thatE(s, ε), for fixed values ofs, is a non-decreasing function ofε.

It is not hard to show that for a (quasi-) periodic evolution the functionE(s) isbounded and has a positive infimum. Compare the Excercises. In the above ex-ample of a non (eventually) periodic or stationary evolution of the Doubling mapwe haveE(s) = 2s, expressing exponential growth. This exponential growth ischaracteristic for a chaotic evolution. WheneverE(s) increases approximately ex-ponentially withs, we expect that the expressions−1 ln(E(s)) does not tend to0 ass→ ∞. In fact it can be proven that the limit

E = lims→∞

1

sln(E(s)) (2.11)

always exists. In the case of a positively compact evolution, where the evolutionoperator is of classC1, the limit E takes on finite values. Also see the exercises.We callE the dispersion exponentof the evolutionx(t). It indicates how rapidlyevolutions that start close together in the state space, aredriven apart, or dispersed,by time evolution.

Definition 2.8 (Chaos).Given a dynamical(M,T,Φ) system as above, so withMa topological space andΦ continuous. A positively compact evolutionx = x(t) iscalledchaoticif the dispersion exponentE, see(2.11), is positive.

Remarks.

- We have to say that the definitions given here are not standard. Usually onedefines chaotic dynamics in a way that uses information onall, or at leasta large number of, evolutions. Presently we consider one single evolution,where the notion of predictability, and hence of chaos, should make sense.This is achieved by our definitions.

- Alternative definition of chaos are named after ‘Devaney’ and ‘Milnor’, butwill not be discussed here. For details see [5, 136].


dispersion exponent- The dispersion exponent is strongly related with the largest Lyapunov expo-nent, see Chapter 6. We note however, that Lyapunov exponents cannot bedefined in terms of a single orbit.

2.3.3 Properties of the dispersion exponent

We discuss a few properties of the dispesrion exponent in relation to predictability.By adapting the time scale, the dispersion exponent can be ‘normalised’. Indeed, ifwe scale the timet to t′ = α−1t, the dispersion exponent changes toE ′ = αE.Notethat in case the time set isT = Z, we get a new time setT ′ = α−1Z. In this waywe can always achieve thatE = 0, 1 or∞. For chaotic evolutions we therefore finda ‘natural’ unit of time corresponding to the rate of increase of unpredictability asa function of the prediction intervals. We already noted that the for a deterministicsystem, under a few simple conditions (involving differentiability of the evolutionoperator), alwaysE <∞. However, the dispersion exponentE also can be definedfor non-deterministic motions, like Brownian motions. In such cases, usuallyE =∞. Granted some simplification we therefore can say that there are three kinds ofmotion. The first kind consists of very predictable evolutions, like the stationary,periodic and quasi-periodic ones, characterised byE ≤ 0. The second kind is badlypredictable, chaotic, but still deterministic and characterised by0 < E < ∞. Thethird kind finally is formed by non-deterministic, random motions, characterised byE = ∞. In the latter case one also speaks of ‘stochastic’ motions.

The three classes of motion as just described, are not strictly separated, but allow‘continuous transitions’ from one class to the other. Without attempting formalmathematical precision, we now discuss a few examples of such ‘continuous tran-sitions’.

2.3.3.1 ‘Transition’ from quasi-periodic to stochastic

By simple numerical experiments one can see that quasi-periodic motions with alarge number of different frequencies, do not look predictable at all. Such motionstake place on a torus of large dimension. Compare with Figure2.6, where we dis-played a 1-dimensional projection of the evolution. This phenomenon is related tothe fact that any reasonable function can be approximated byquasi-periodic fun-tions, compare with, e.g., [167]. english

reference

2.3.3.2 ‘Transition’ from periodic to chaotic

There is yet another way to approximate an ‘arbitrary’ motion by periodic motions,namely by letting the period tend to∞. When observing a motion for only a finitetime interval, we never can conclude that the motion isnot periodic. Evidently, it


t

x

t

x

t

x

Figure 2.6: Quasi-periodic evolutions with 3, 8 and 13 frequencies: in all case a1-dimensional projection of the evolution is displayed as afunction of time.

does not make sense to describe a motion, that has not been observed to be periodicand for the periodicity of which there are no theoretical arguments, as periodic.This kind of uncertainty on whether the motion is chaotic or periodic occurs inmany systems with chaotic evolutions. In the case of the Doubling map, the pointswith a periodic evolution are dense in the state space, so by asmall change of theinitial state a periodic evolution can be obtained. When restricting to a finite timeinterval, the approximation by a periodic initial state canbe chosen in such a waythat the difference between the original evolution and the periodic approximation isarbitrarily small. Yet, when the initial state is chosen randomly, there is no reasonto assume that we will have a periodic evolution: although the periodic points forma dense subset, it has Lebesgue measure0.

2.3.3.3 ‘Transition’ from chaotic to stochastic

Let us now briefly discuss how a computer simulates systems governed by chance.In its simplest form, a sequence of numbers in[0, 1] is produced that closely resem-bles the realisation of independent sampling from the uniform distribution. How-


2ϕ

ϕ

−1 10

s = cosϕ

s2s2 − 1

Figure 2.7: The Logistic family (forµ = 4) as a ‘projection’ of the Doubling mapon the circle.

ever, a computer is a deterministic system and therefore we have to cheat a bit. Inprinciple this goes as follows. The first number is provided by the programmer, butthe subsequent numbers are produced by application of a kindof Doubling map,where the multiplication is not by a factor2, but by a very large numberN. So oneuses iteration of the mapx 7→ Nx modZ. For this deterministic dynamical system,most evolutions have a dispersion exponentE = ln(N). By takingN larger, weget a better approximation of the non-deterministic case whereE = ∞. For suffi-ciently largeN, in a graphical representation, the (obvious) lack of independencein thesepseudo random numbers, no longer is visible. Indeed, since each pixel ofthe screen corresponds to an interval of numbers, for sufficiently largeN no deter-ministic relationship exists between the successive pixels representing the pseudorandom numbers.20

2.3.4 Chaotic evolutions in the examples of Chapter 1

Unfortunately, it often is difficult to show chaoticity of a given evolution. In the caseof the Doubling map this was not the case: each point inR/Z that is not rational, isthe initial state of a chaotic evolution.

Also in the case of the Logistic familyΦµ : x 7→ µx(1 − x) for µ = 4, it is rathereasy to prove that many states exist giving rise to chaotic evolutions. We shall show

20In fact the computer only computes with integers and not withreal numbers. This gives rise tofurther complications, that we shall not address here.


Logistic systemDoubling map

this now, illustrated by Figure 2.7. Consider the Doubling map on the circle, nowof the form21

ψ 7→ 2ψ mod2πZ. (2.12)

The circle is considered as a subset of the plane, according to

ψ ↔ (cosψ, sinψ).

As indicated in Figure 2.7, we can project the Doubling map dynamics on the inter-val [−1,+1] of the horizontal axis. Projection amounts to the map

ψ 7→ cosψ. (2.13)

The Doubling map on the circle mapsψ to 2ψ, which in the horizontal intervalcorresponds tocos 2ψ = −1 + 2 cos2 ψ. Expressed in the variables = cosψ thisgives the map

s 7→ 2s2 − 1. (2.14)

The affine coordinate transformation

x = 12(1 − s) or s = 1 − 2x (2.15)

takes (2.14) into the Logistic mapΦ4 : x 7→ 4(1−x), as can be checked by a directcomputation.

What to conclude from these considerations? First of all thecoordinate transfor-mation (2.15) conjugates the map (2.14) into the Logistic map Φµ for µ = 4. Sucha conjugation evidently preserves all dynamical properties. Second the map (2.13)is a 2-to-1 map, forming asemi-conjugationbetween the Doubling map (2.12) and(2.14): the Logistic map in disguise. It is not hard to show that the semi-conjugationpreserves chaoticity, as well as the value of the dispersionexponentE = ln(2). SeeExercise 2.10

Regarding the other examples, we observe the following. Forthe free pendulumas well as for the free Van der Pol oscillation, no chaotic evolutions occur and theonly possible evolutions are (asympotically) stationary and periodic. For the drivenpendulum and driven Van der Pol system, chaotic evolutions can occur. Around1950, this phenomenon has cost mathematicians like Cartwright, Littlewood andLevinson a lot of headache. Eventually, during the decades between 1960 and 1990many methods have been developed to prove that such systems do have chaoticevolutions. This break-through owes a lot to the geometrical methods introducedby S. Smale, in an area that for a long time was dominated by analysis. The paper[179] where these ideas have been elaborated still is highlyworth reading. Much of

21We use2π instead of1 in order to get simpler formulæ.


Thom mapthe later work has been summarised in [6, 16, 61, 23], also compare with the varioushandbooks [19, 20, 21, 22].

Also regarding the Logistic family, the Henon family, the Lorenz attractor and theRossler attractor, it was clear that for many parameter values the numerical sim-ulations looked chaotic. Still, it has taken a long time, sometimes more than 20years, before these phenomena could be understood, at leastpartly, in terms of fullyproven theorems. We here just refer to [41, 42, 88, 197] and shall continue thisdiscussion in Chapter 4, when more mathematical tools have become available.

2.3.5 Chaotic evolutions of the Thom map

We now turn to a new example, the so-called Thom map.22 Also in this exampleit is not too hard to prove that many evolutions are chaotic. We include it here,because it is prototypical for a general class of dynamical system, the so-calledAnosov systems. The state space is the 2-dimensional torusT2 = R2/Z2 and thetime set isZ. So the dynamics is generated by an invertible mapT2 → T2. Thismap is constructed as follows. We take a2 × 2-matrixA with integer coefficients,with det(A) = ±1 and where the absolute values of the eigenvalues are not equalto 1. As a leading example we take

A =

(

2 11 1

)

. (2.16)

The matrixA as in (2.16) defines an invertible transformation of the plane. Sincethe coefficients are integers,A induces a mapϕ : T2 → T2. Sincedet(A) = ±1,also the matrixA−1 has integer coefficients. The torus map induced byA−1 then isthe inverse ofϕ. We now discuss a number of properties of the mapϕ, from whichit eventually follows that the dynamics contains chaos.

First of all we note that the point[0, 0]23 is a fixed point ofϕ. For the derivative wehave

dϕ[(0,0)] = A.

The eigenvalues ofA are real, since otherwise the eigenvalues are complex conju-gatesλ andλ. Then sinceλλ = det(A) = ±1, it follows that |λ| = 1. This contra-dicts our assumptions, which proves thatA has indeed real eigenvalues. Moreoverit will be clear that the eigenvalues satisfy

0 < |λ1| < 1 < |λ2|.This implies that there are stable and unstable separatrices, compare with§1.3.2.Our main interest is with the unstable separatrix, to be denoted byW u.

22Also called Arnold’s Cat map, compare with [1], page 120, Figure 81.23Again [−] takes the equivalence class modZ2.


The fixed point[0, 0] ∈ T2 corresponds to the fixed point(0, 0) ∈ R2 of the linearmap of the plane given by the matrixA. The unstable separatrix ofthis fixed pointis a straight line, so of the form

(α1s, α2s) | s ∈ R,

where(α1, α2) is an eigenvector ofA for the eigenvalueλ2. The unstable separatrixW u then is the projection of this line to the torus. The successive intersections ofW u with the circle

[(x1, x2)] | x2 = 0 ⊂ T2,

are given by

. . . , [−2α1

α2, 0], [−α1

α2, 0], [0, 0], [

α1

α2, 0], [2

α1

α2, 0], . . . . (2.17)

Such sequences of points we have seen earlier when studying translation systemson the 1-dimension torus or circleT1 = S1, compare§2.2.2.1. The main point iswhether the ratioα1/α2 is rational or irrational. We have

Proposition 2.9 (Dense separatrix Thom map).α is irrational andW u = T2.

Proof. The case thatα1/α2 is rational leads to a contradiction as follows. Indeed,the assumption thatα1/α2 is rational implies thatW u is a closed curve, and hence acircle. Now by the definition of unstable separatrix we haveϕ(W u) = W u, wherethe restrictionϕ|W u magnifies the distance between (nearby) points with a factor|λ2| > 1. This contradicts the fact thatϕ is a diffeomorphism. We conclude that theratioα1/α2 is irrational, whence by Lemma 2.1 it follows that the intersections

W u ∩ [(x1, x2)] | x2 = 0

i.e., the set of points (2.17), is dense in the circle[(x1, x2)] | x2 = 0. This directlyimplies thatW u is dense inT2. 2

Definition 2.10 (Topologically mixing). We callf : M →M topologically mixingif for any two open (nonempty) subsetsU, V ⊂ M there existsn0 ∈ N, such that forall n > n0

fn(U) ∩ V 6= ∅.

Proposition 2.11 (Thom map topologically mixing).The Thom mapϕ : T2 → T2

is topologically mixing.

Proof. So letU, V ⊂ T2 be given. InU we take a segmentL parallel toW u

and chooseε > 0, such thatV contains aε-neighbourhood of one of its points.Application ofϕn to L yields a segment, still parallel toW u, but |λ2|n times as


long asL. This segmentϕn(L), up to a translation inT2 is a segment of the samelength insideW u. We know thatW u is dense inT2. Therefore, a sufficiently longsegment is withinε-distance of any point inT2. (The required length of the segmentevidently depends onε.) This means thatϕn(L), for n sufficiently large, also iswithin ε-distance of any point ofT2. Let n0 ∈ N be such that this property holdsfor all n > n0. By the choice ofε it now follows that

(ϕn(U) ∩ V ) ⊃ (ϕn(L) ∩ V ) 6= ∅,

which proves that the dynamics ofϕ is topologically mixing. 2

Proposition 2.12 (Chaotic evolutions in Thom map).Theϕ-dynamics has manyevolutions that are dense inT2. The dynamics of these evolutions is chaotic, withdispersion exponentE = ln(|λ2|).

Proof. We first generalise the argument in the proof of Proposition 2.10 as follows.If V1, V2, . . . is a countable sequence of open, non-empty subsets ofT2, then thereexists a sequence of open subsetsU1 ⊃ U2 ⊃ . . . of T2 and a sequence of naturalnumbersN1 < N2 < . . . , such that

ϕNj (Uj) ∩ Vj 6= ∅ andUj+1 ⊂ ϕ−Nj(Vj) ∩ Uj ,

for all j ∈ N. This follows from the above by induction onj ∈ N. We note that inthese circumstances

⋂

j∈N

U j 6= ∅

and, writingZ = ∩jU j , we observe that each evolution starting inZ, contains atleast one point ofVj, for eachj ∈ N.

Now to show that theϕ-dynamics has a dense evolution, we take a countable se-quenceV1, V2, . . . of open sets as follows. Using an enumeration ofQ2, we canarrange that theVj , j ∈ N, run through allε-neighbourhoods of points ofQ, withε ∈ Q. Following the above construction it easily follows that each element of thesetZ is the initial state of an evolution that is dense inT2.

We next show that any evolution that is dense inT2 is chaotic with dispersion ex-ponentE = ln(|λ2|). Here it is handy not to work with the usual metric onT2, butto take a metric that fits better to the Euclidean metric of theplane:

d([x1, y1], [x2, y2]) = minn,m∈Z

√

(x1 − x2 + n)2 + (y1 − y2 +m)2.

Now, first observe that in the plane, by application of the linear mapA, the distancebetween two points is multiplied by a factor that is at most|λ2|.24 The maximal

24Surely if, as in the leading example, the eigenvectors are perpendicular.


multiplication factor only occurs if the line connecting the two points lies in the di-rection of the eigenspace associated toλ2. This multiplication factor dependscon-tinuouslyon the direction of this connecting line. All these facts follow from LinearAlgebra. What happens in the plane, we also witness onT2, as long as no satura-tion takes place. To have maximal dispersion we surely have to take segments of theevolution, for which not only the initial points are nearby,but also that at the end,the corresponding points are so close together, that no saturation is in order. More-over, the intial points of the segments have to be chosen such, that the connectingline is parallel, or approximately parallel, toW u. Since our evolution is dense inT2, such a choice is possible. In this way we get two segmentsx(n), . . . , x(n + s)andx(m), . . . , x(m+ s) of the evolution, where the approximation

d(x(n+ s), x(m+ s))

d(x(n), x(m))≈ |λ2|s

is arbitrarily close. From this it easily follows thatE = ln(|λ2|) and hence that theevolution is chaotic. 2

Remark. Mutatis mutandis the above considerations also go for any otherA ∈GL(n,Z), the group of invertiblen× n-matrices that respectsZn.

2.4 Exercises

We recall that for all dynamical systems(M,T,Φ) in this chapter, we assume thatthe state spaceM is a topological space and that the evolution operatorΦ : M ×T → M is continuous.

Exercise 2.1 (Negative dispersion).Consider the dynamical system with statespaceM = R, given by the equation of motion

x′ = −x, x ∈ R.

Prove that in this case we haveE = −1. Generalize this to the case of arbitrarylinear systems.25

Exercise 2.2 (A dense set of periodic point).We consider the dynamical systemgenerated by the Doubling mapx 7→ 2x mo Z, with state space[0, 1). Show thatthe evolution with initial statex is eventually periodic (or stationary) if and only ifx is rational. (Hint: a real number is rational if and only if its decimal (and also itsbinary) expansion in the long run is periodic.

25A similar result holds in general for hyperbolic point attractors (to be treated in a later chapter).These are cases where the prediction errors do not increase when the prediction intervals increases.

2.4 Exercises 105

resonant!subtoriExercise 2.3 (Prime period not defined).We consider a dynamical system, thestate space of which is equal toR/Q, the time setR and with an evolution operatorΦ([t], s) = [t + s], where[ ] indicates that we take the equivalence class modQ.Show that in this system each point is periodic, but that for no point the prime periodis well-defined.

Exercise 2.4 (Some Algebra).In the text we frequently use the following fact fromAlgebra. Ifk1, k2, . . . , kn ∈ Z are relatively prime, then there existm1, m2, . . . , mn ∈Z, such that

n∑

j=1

mjkj = 1.

Prove this. (Hint: Consider the subgroupG ⊂ Z generated byk1, k2, . . . , kn. Sinceany such subgroup is of the formGK = KZ, for someK ∈ N, the aim now isshow thatG = G1. Now use that ifG = GK , thenK is a common factor ofk1, k2, . . . , kn.)

Exercise 2.5 (Resonant subtori).In § 2.2.2 the resonance case was described fortranslation systems on a torus, meaning rational dependence of the components ofthe translation vector (possibly including1). It was indicated in principle that theevolution starting at[0] lies in a lower dimensional torus, or in a (finite) union ofsuch lower dimensional tori. Moreover the restriction to this torus (or to the toruscontaining[0] in the case of several tori) again is a translation system. Below wegive three examples of a translation system on the 2-torus. In all cases there isresonance. The problem is to determine the1-torus (or the union of1-tori) in whichthe evolution lies that starts at[0], to prove that the evolution is dense in the1-torus(or tori) and, finally, to determine the translation system restricted to the1-torus thatcontains[0].

i. The time set isR, the translation vectorΩ = (12, 1

3);

ii. The time set isZ, the translation vectorΩ = (√

23,√

24

);

iii. The time set isZ, the translation vectorΩ = (18

+√

25,√

24

).

Exercise 2.6 (A conjugation).We mentioned that the suspension of a translationsystem with time setZ and translation vectorΩ = (ω1, . . . , ωn) ‘is the same as’ thetranslation system with time setR and translation vectorΩ = (ω1, . . . , ωn, 1). This‘being the same’ means that the systems areconjugated, see§ 2.2.3. In terms ofthe triple ‘state space, time set, evolution operator’ we write these two systems as(SΩ(Tn),R, SΦΩ) and(Tn+1,R,ΦΩ). The problem now is to construct a conjuga-tion between these two systems, in other words, a homeomorphismh : SΩ(Tn) →


Tn+1 satisfying the conjugation equation

h(SΦΩ(p, t)) = ΦΩ(h(p), t)

for eachp ∈ SΩ(Tn) andt ∈ R.

Exercise 2.7 (Quasi-periodic is not chaotic).Show that a quasi-periodic evolutionhas dispersion exponentE = 0.

Exercise 2.8 (Finite dispersion exponent).Consider a positively compact evolu-tion x(t) of a dynamical system(Rn, T,Φ), where for alls ∈ T the times map gΦs, defined byΦs(x) = Φ(x, s), is at least of classC1. Show that in this case thevalue ofE(s, ε), as definied in§ 2.3.2, is finite. (Hint: Usemaxv (||dΦs(v)||/||v||) .)

Exercise 2.9 (Dispersion in equivalent metrics).In the definition of the dispersionexponentE we make use of a metric. Show that if another, but equivalent,26 metricis used, the value ofE does not change.

Exercise 2.10 (The Logistic and the Doubling map).Consider the Logistic mapΦ4 : x ∈ [0, 1] 7→ 4x(1 − x) ∈ [0, 1].

1. Show thatΦ4 is conjugated tos ∈ [−1, 1] 7→ 2s2 − 1 ∈ [−1, 1];

2. From the fact that the latter map is semi-conjugated to theDoubling map onthe circle, prove that the Logistic mapΦ4 is chaotic, with dispersion exponentE = ln(2).

Exercise 2.11 (Existence of a limit).Show that for the functionE(s), as definedin § 2.3.2, we have

E(s1 + s2) ≤ E(s1) + E(s2).

Using this, show that the limit

E = lim infs→∞

1

sln(E(s))

exists.

Exercise 2.12 (Conjugation of translation systems).On the 2-torusT2 considerthe translation vectorsΩ1 andΩ2. Moreoverϕ : T2 → T2 is a ‘linear’ diffeomor-phism induced by a matrixA ∈ GL(2,Z) as in§ 2.3.5. Show that

ϕRΩ1 = RΩ2ϕ

26For his use of terminology see Appendix A.

2.4 Exercises 107

is equivalent toΩ2 = AΩ1,

whereRΩjis the translation onT2 determined byΩj . Also show that if these equal-

ities hold, then it follows thatϕ is a conjugation between the translation systemsdefined byΩ1 andΩ2.

NB: The components of a translation vector can be viewed as the frequencies of‘composed periodic motions’. These frequencies arenot necessarily preserved un-der conjugation.

Exercise 2.13 ((∗) Periodic points of Thom map). Show that the set of points inT2, that are periodic under the Thom map, exactly is the set of points with rationalcoordinates. For this it is not necessary that the matrixA has the same form as inthe example of§ 2.3.5. The only point of interest is thatA has integer coefficientsand thatdet(A) = ±1, where none of the eigenvalues is a root of unity. Also try toformulate and prove this result in higher dimension.

Exercise 2.14 ((∗) Many chaotic initial states of Thom map). Show that for theThm mapA as in (2.16) the set of initial states having a chaotic evolution, is resid-ual.27

Exercise 2.15 ((∗) Orderly dynamical behaviour for negative dispersion expo-nent). Letx(t) be a positively compact evolution of a dynamical system(M,T,Φ),whereM is a complete metric space andΦ continuous. Moreover we assume thatx(t) is not (eventually) periodic. Show that if the dispersion exponentE of this evo-lution is negative, i.e.,E < 0, then the evolution is either asymptotically periodicor stationary. In the case of time setR the evolution necessarily is asymptoticallystationary.

Exercise 2.16 ((∗) An example . . . ). Construct a dynamical system with time setR+ with an eventually periodic evolution.

Exercise 2.17 ((∗) Approximating segment of evolution). In § 2.2.4 we men-tioned that in a quasi-periodic translation system, for each positiveε there existsT (ε) ∈ R, such that each point of the torus is withinε-distance of the segmentx(t)t∈T∩[0,T (ε)] of an evolution. To make the definition unique, we takeT (ε)minimal. Now consider the case of an irrational rotation[x] 7→ [x+ω] on the circleR/Z. The functionT (ε) also depends onω (and is only defined for irrationalω).

1. Show that for eachω we haveT (ε) > 12ε−1;

27For this use of terminology see Appendix A.


2. Show that there is a ratioω such that

T (ε)

ε

is uniformly bounded for allε > 0;

3. Show that for certainω

lim supε→0

T (ε)

ε= ∞.

(Hint: Use continued fractions.)

typicalpersistence

Chapter 3

Persistence of dynamical properties

When interpreting practical data of a dynamical system, it is of the utmost impor-tance, that both the initial state and the evolution law are only known to an accept-able approximation. For this reason it is also good to know how the type of theevolution changes under variation of both. When this type does not change undersmall variations of the initial state, we speak ofpersistenceunder such variation andcall the corresponding evolutiontypical for the system at hand.

Next to this persistence of properties of individual evolutions, also persistence ofproperties of the entire dynamics of the system will be discussed. Here we speak ofpersistence under perturbations of the evolution law. Suchproperties also often arecalled typical or also robust.

It has been R. Thom1 and S. Smale,2 who in the 1960s and 70s have pointed atthe great interest of persistence properties of dynamical system that serve as robustmodels for concrete systems [194, 179].

3.1 Variation of initial state

To fix thoughts we first consider a periodic evolution of the free, undamped pen-dulum, see§ 1.1.1, where the pendulum oscillates, i.e., is not turning over. Thishappens for values of the energy

H(x, y) = mℓ2(12y

2 − ω2 cos x).

in the open interval(−mℓω2,+mℓ2ω2). It may be clear that then also evolutions,with an initial state that is sufficiently near this oscillating evolution, are periodicin this same way, i.e., oscillating and not turning over. Compare with Chapter 1,

1Rene Thom 1923-20022Stephen Smale 1930-

110 Persistence of dynamical properties

persistence!undervariation of initialstate

pendulumDoubling map

Figure 1.2. The dynamical property of being periodic without turning over, for thisreason ispersistentunder variations of the initial state. We also say that such evolu-tions aretypical for this dynamical system. We observe however, that as the energyof the oscillation gets closer to the limit values±mℓ2ω2, the allowed perturbationsof the initial state such that these do not affect the dynamical property at hand, aresmaller. (In fact, in§ 1.1.1 we saw that for the energy−mℓ2ω2 the pendulum hangsvertically down and that for the energy+mℓ2ω2 the pendulum either points up ver-tically, or carries out a motion in which fort → ±∞ it approaches the verticallyupward state.)

Analogous remarks can be made with respect to persistence ofthe periodic evo-lutions in which the pendulum rotates around the syspensionpoint, either to theright, or to the left. On the other hand, for the remaining evolutions the character-ising properties, like being stationary or of approaching astationary state, are notpersistent under variation of the initial state.

Although a general definition of an attractor only will be treated in the next chapter,we here wish to indicate a connection between persistence and attractors. This willbe done for the example of the damped, free pendulum

x′′ = −ω2 sin x− cx′,

discussed in§ 1.1.1. Here the stationary evolution(x, x′) = (0, 0), where the pen-dulum is hanging down, is a point attractor, since all evolutions starting sufficientlynearby, approach this state. Generally any attractorA ⊂ M, to be defined below, isrequired to have a neighbourhoodU ⊂M, such that all evolutions with initial statein U, converge toA. Therefore the convergence toA is a persistent property. Formore details see the next chapter.

Definition 3.1 (Persistence).Let (M,T,Φ) be a dynamical system, withM a topo-logical space andΦ continuous. We say that a propertyP of an evolutionx(t) withinitial statex(0) is persistent under variation of initial stateif there is a neighbour-hoodU of x(0), such that each evolution with initial state inU also has propertyP.

Apart from the concept of persistence just defined, also weaker forms of this areknown, to be described now with the help of examples.

We begin considering the dynamics generated by the Doublingmap, see§ 1.3.8.We have seen that this system has both many chaotic and many (eventually) pe-riodic evolutions. To some extent the chaotic evolutions form the majority: fromthe second proof of the existence of such evolutions, see§ 1.3.8, it namely followsthat when picking the initial state at random (with respect to Lebesgue measure) onM = [0, 1), the corresponding evolution with probability1 is chaotic. On the other

3.1 Variation of initial state 111

persistence!in thesense ofprobability

pendulum

hand, we know that the initial states of the (eventually) periodic evolutions form adense subset ofM = [0, 1). Therefore we can’t say that it is a persistent propertyof evolutions of the Doubling map dynamics to be chaotic: with an arbitrarily smallperturbation of the initial value, we can achieve (eventual) periodicity. Life changeswhen we decide to neglect events of probability0. Thus we say that for evolutionsof the Doubling map system chaoticity ispersistent under variation of initial states,in the sense of probability. Note that for this concept of persistence, we make useof a probability measure. In many cases it is not obvious which probability measureshould be used. However, this is not as bad as it sounds, sincewe only use the con-cept of zero measure sets, ofnull sets. In general all obvious probability measureshave the same null sets. Compare with Appendix A.

Definition 3.2 (Persistence in the sense of probability).Let (M,T,Φ) be a dy-namical sysem withM a topological space with a Borel measure andΦ both con-tinuous and measurable. We say that a propertyP of an evolutionx(t) with initialstatex(0) is persistent under variation of initial statein the sense of probability, ifthere is a neighbourhoodU of x(0) with a null setZ ⊂ U, such that each evolutionwith initial state inU \ Z also has propertyP.

The above example, where chaoticity was only persistent in the sense of probability,is not exceptional at all. In fact, the authors are not aware of the occurrence ofchaotic evolutions, the initial state of which can’t be approximated by the initialstate of a periodic or asymptotically periodic evolution.

An even weaker form of persistence is known for quasi-periodic evolutions. Againwe first consider an example, namely the driven, undamped pendulum, see§ 1.1.2.We are dealing with a system with equation of motion

x′′ = −ω2 sin x+ ε cosΩt.

As in the case of the driven Van der Pol system, dealt with in§ 1.3.1 to illustratequasi-periodicity, also for this system quasi-periodic evolutions occur for small val-ues of|A|. To explain this further, also here we turn to an autonomous system offirst order

x′ = y

y′ = −ω2 sin x+ ε cos z

z′ = Ω,

where bothx andz are variables inR/(2πZ). If in this system we setε = 0, therebyessentially just considering the free, undamped pendulum,we find a 1-parameterfamily of multi-periodic evolutions. Indeed, one for each oscillation of the free,


persistence!weak undamped pendulum, so corresponding to values of the energyin between−mℓω2

and+mℓω2, compare Figure 1.2 and the discussion at the beginning of this section.Each of these multi-periodic subsystems has two frequencies: the frequency in thez-direction equals1

2πΩ and the other frequency decreases with the amplitude. The

latter is the frequency of the corresponding free oscillation in the(x, y)-plane.

From KAM Theory as briefly discussed in§ 2.2.5, also compare with Chapter 5,Appendix B and the reviews [61, 23], it follows that if|A| is sufficiently small,there are still many quasi-periodic subsystems present. Infact, they are so manythat their union has positive Lebesgue measure. This means that we, when randomlychoosing the initial condition, have positive probabilityto land on a quasi-periodicevolution. In this case we speak ofweak persistence.

Definition 3.3 (Weak persistence).Let (M,T,Φ) be a dynamical sysem withM atopological space with a Borel measure andΦ both continuous and measurable. Wesay that a propertyP of an evolutionx(t) with inital statex(0) is weakly persistentunder variation of initial state, if for each neighbourhoodU of x(0) there exists asubsetZ ⊂ U of positive measure, such that each evolution with initial state inZalso has propertyP.

To explain that for the driven, undamped pendulum many quasi-periodic evolutionsoccur in a weakly persistent way, we have to resort to the Lebesgue Density Theo-rem [103], that we state here for the sake of completeness. This theorem deals withdensity pointsof a (measurable) subsetZ ⊂ Rn of positive Lebesgue measure. Wesay thatx ∈ Z is a density point ofZ if

limε→0

λ(Z ∩Dε(x))

λ(Dε(x))= 1,

whereDε(x) is a ε-neighbourhood ofx and whereλ(V ) denotes the Lebesguemeasure ofV.

Theorem 3.4 (Lebesgue Density Theorem [103]).If Z ⊂ Rn is a Lebesgue mea-surable set. Then almost every point ofZ, in the sense of Lebesgue measure, is adensity point ofZ.

The conclusion of Theorem 3.4 implies that the property ‘belonging toZ ’ for almostevery point ofZ is weakly persistent. Also see Appendix A.

3.2 Variation of parameters

In many of the examples of dynamical systems we met in Chapter1 several con-stants occur. In several variations of the pendulum we sawA, c, ω andΩ, for the

3.2 Variation of parameters 113

persistence!undervariation ofparameters

pendulumVan der Pol system

Logistic system this wasµ and for the Henon systema andb. The questions wepresently ask are of the form: which properties of the dynamics are persistent un-der (small) variation of these parameters. Again persistence may hold in variousstronger or weaker forms and again we demonstrate this with the help of examples.

As a first example we consider the damped, free pendulum, see§ 1.1.1. Recall thatthis system is governed by the following equation of motion:

x′′ = −ω2 sin x− cx′,

wherec > 0 andω 6= 0. For each choice of these parametersc andω, withinthe given restrictions, the typical evolution is asymptotically stationary to the lowerequilibrium (which is unique once we take the variablex in R/(2πZ)). Within thissetting it therefore is a persistent property of the dynamics to have exactly one pointattractor.

A warning is in order here. It is important to know exactly what is the setting inwhich persistence holds or does not hold. Indeed, for the undamped, free pendulum,with equation of motion

x′′ = −ω2 sin x,

it is a persistent property that the typical evolution is periodic: this holds for allvalues ofω 6= 0. However, if we would consider the latter system as a special caseof the damped, free pendulum occurring forc = 0, then the property of typicalevolutions being periodic would not be persistent, since the least damping wouldput an end to this.

An analogous remark goes with respect to quasi-periodicity, discussed in the pre-vious subsection as an example of a weakly persistent property under variation ofinitial state. Indeed, quasi-periodicity only occurs in the driven pendulumwithoutdamping. For a proof of this, based on a general relationshipbetween divergenceand contraction of volume, see Exercise 3.2.

An example of weak persistence under variation of parameters we find with respectto quasi-periodic dynamics as this occurs in the driven Van der Pol system

x′′ = −x − x′(x2 − 1) + ε cos Ωt,

see§2.2.3. As mentioned there, for all values ofε, with |ε| sufficiently small, thereis a set ofΩ-values of positive Lebesgue measure, where this system hasa quasi-periodic attractor. With help of the Fubini Theorem [103], it then follows that inthe (Ω, ε)-plane, the parameter values for which the driven Van der Polsystemhas a quasi-periodic attractor, has positive Lebesgue measure. This leads to weakpersistence, under variation of the parameter(s), of the property that the system hasa quasi-periodic attractor. Also see Chapter 5 and AppendixB.


bifurcation diagram

x

µ3 4

Figure 3.1: Bifurcation diagram of the Logistic system, see§1.3.3. For the various(horizontal) values ofµ ∈ [2.5, 4], asymptoticx-values are plotted (vertically) asthese occur after many iteration iterations (and after deletion of transient values).For theµ-values with only a finite number ofx-values, we have asymptotic period-icity. For theµ-values for which in thex-direction whole intervals seem to occcur,we probably are dealing with chaotic dynamics.

We next discuss persistence (i.e., persistent occurrence)of chaotic evolutions in anumber of the examples discussed in Chapter 1. In these cases, in general per-sistence is hard to prove and we will confine ourselves to giving references to theliterature. First consider the Logistic system given by

Φµ(x) = µx(1 − x),

see§1.3.3, with state variablex ∈ [0, 1] and whereµ ∈ [2.5, 4] is a parameter.It has been proven [118], that for a random choice of bothx andµ, the evolutionis chaotic with positive probability. However, also the evolution is asymptoticallyperiodic or stationary with positive probability, in whichcase it is not chaotic. Asa consequence of the Lebesgue Density Theorem 3.4, we here have an example ofweak persistence of chaoticity. Persistence of asymptotically stationary evolutionsis easy to prove forµ ∈ [0, 1], since here for each initial statex we have that

limn→∞

Φnµ(x) = 0.

The bifurcation diagram of the Logistic system, as sketchedin Figure 3.1, illustratesthe extremely complex dynamics well. Also for the Henon system, see§1.3.2, givenby

Φa,b(x, y) = (1 − ax2 + y, bx),

3.3 Persistence of stationary and periodic evolutions 115

persistence!ofstationaryevolutions

bifurcation diagramsaddle-node

bifurcation

one has been able to prove that the occurrence of chaotic evolutions is weakly per-sistent under variation of the parametersa andb. Again it is true that for a ‘randomchoice’ ofa, b, x andy, the evolution generated byΦa,b with initial state(x, y) ischaotic with positive probability; the same statement alsoholds when ‘chaotic’ isreplaced by ‘asymptotically periodic’. For a proof, which for this case is much morecomplicated than for the Logistic system, we refer to [42, 88]. There is, however,an important perturbative connection between the Henon and the Logistic system,where the latter occurs as a special case forb = 0. As a consequence, the set of(a, b)-values for which the Henon System is known to be chaotic, iscontained in asmall strip around the lineb = 0.

Remark. The Henon attractor as introduced in§1.3.2, occurs fora = 1.4 andb = .3. This value ofb probably is far too large to be covered by the perturba-tive approach of [42, 88]. Therefore it is not clear whether the attractor shown inFigure 1.13 is really chaotic or just a periodic attractor ofextremely large period.

Finally, we briefly mention the Rossler attractor, see§1.3.7. We here expect thesame situation as for the Logistic and the Henon system. Although there are goodgrounds for such a conjecture, for this no proof is yet known.

3.3 Persistence of stationary and periodic evolutions

We here show that stationary and periodic evolutions, that satisfy a simple extracondition, are persistent in the sense that such evolutionspersist under small pertur-bations of the evolution law. We note however, that the position in the state space,as well as the period, may vary.

3.3.1 Persistence of stationary evolutions

We start with stationary evolutions of dynamical systems with discrete time. Suchevolutions correspond exactly to the fixed points of the map that determines thetime evolution. We first given an example from which we see that not all fixedpoints persist. This problem is directly related to the Implicit Function Theorem.

Example 3.1 (A ‘saddle-node bifurcation’). An example of a non-persistent fixedpoint we find in the map

ϕµ : R → R, with x 7→ x+ x2 + µ,

whereµ ∈ R is a parameter. Forµ < 0, the mapϕµ has two fixed points, forµ = 0one fixed point and forµ > 0 no fixed point, compare with Figurer 3.2.


x

µ

Figure 3.2: Bifurcation diagram of the ’saddle node bifurcation’ x 7→ x + x2 + µ.Theµ-axis is horizontal and thex-axis vertical. The stationary points are indicatedby a solid line in the(µ, x)-plane. The arrows indicate the direction in which thenon-stationary move. Compare with Appendix C.

We conclude that the fixed point ofϕ0 is not persistent. It turns out that in this andin similar examples, in such a non-persistent fixed point thederivative of the mapat the fixed point has an eigenvalue equal to1. Indeed, in the example we see thatdϕ0(0) = 1.

Remark. In situations like Example 3.1, where for a special parameter value thenumber of stationary points changes, we speak of abifurcation, in this case it is asaddle-node bifurcation.

Theorem 3.5 (Persistence of fixed points I).Let ϕµ : Rn → Rn be a family ofmaps, depending on a parameterµ ∈ Rp, such thatϕµ(x) is differentiable in(µ, x).Let x0 be a fixed point ofϕµ0 , i.e., such thatϕµ0(x0) = x0. Then, if the derivativedϕµ0(x0) has no eigenvalue1, there exist neighbourhoodsU of µ0 ∈ Rp andV ofx0 ∈ Rn, as well as a differentiable mapg : U → V with g(µ0) = x0, such thatx = g(µ), for eachµ ∈ U, is a fixed point ofϕµ, i.e., such thatϕµ(g(µ)) = gµ.Moreover, for eachµ ∈ U, the mapϕµ in V has no other fixed points.

Remark. Theorem 3.5 precisely states in which sense fixed points are persistent inparametrised systems. Indeed, under ‘small’ variation of the parameterµ (withinU)the position of the fixed pointx = g(µ) changes only ‘slightly’ (withinV ), wherethe mapg is smooth and therefore surely continuous.

Proof. Define the mapF : Rp+n → Rn byF (µ, x) = ϕµ(x)− x. Then fixed pointsof ϕµ correspond to zeroes ofF (µ, x). Moreover, the derivative∂xF (µ0, x0), as a

3.3 Persistence of stationary and periodic evolutions 117

persistence!of fixedpoints

function ofx, i.e., for the fixed valueµ = µ0, does not have an eigenvalue0 andtherefore is an invertiblen × n-matrix. Then Theorem 3.5 immediately followsfrom the Implicit Function Theorem, compare with, e.g., [78]. 2

A slightly different formulation of the persistence of fixedpoints is possible (againwithout the derivative having an eigenvalue1). Here we wish to speak of arbitraryperturbations of a mapϕ. For a precise formulation of such a persistence result, wehave to know what perturbations ofϕ should be considered ‘small’. We shall not gointo the question which kinds of topologies can be put on the space of differentiablemaps, again see Appendix A. For our present purposes it suffices to define the‘C1-norm’ || − ||1 as follows.

Definition 3.6. For differentiable mapsf, g : Rn → Rm we define theC1-distanceby

||f − g||1 = max1≤i≤n,1≤j≤m

supx∈Rn

|f j(x) − gj(x)|, |∂i(fj − gj)(x)|.

We note that such aC1-distance may take on infinite values.

Theorem 3.7 (Persistence of fixed points II).Letϕ : Rn → Rn be a differentiablemap with a fixed pointx0 ∈ Rn, such that the derivativedϕx0 has no eigenvalue1.Then there is a neighbourhoodV of x0 in Rn and a numberε > 0, such that anydifferentiable mapϕ : Rn → Rn with ||ϕ − ϕ||1 < ε has exactly one fixed point inV.

Proof. We abbreviateA = dϕx0 and defineB = (A− Id)−1, whereId is then× nunit matrix. Next we define

Nϕ(x) = x−B(ϕ(x) − x).

It is not hard to verify the following properties:

- Nϕ(x0) = x0;

- d(Nϕ)x0 = 0;

- x is a fixed point ofϕ if and only if x is a fixed point ofNϕ.

We shall specify the neighourhoodV ⊂ Rn of x0 as aδ-neighbourhood, definedcoordinate wise, i.e.,

x ∈ V ⇔ |xj − x0,j | < δ, j = 1, 2, . . . , n.

We takeδ sufficiently small to ensure that∣

∣

∣

∣

∂Nϕ

∂xj

(x)

∣

∣

∣

∣

<1

4n, for j = 1, 2, . . . , n


for all x ∈ V. As a consequence the restricted mapNϕ|V is a contraction, in thesense that it contracts the distances between the points ofV with at least a factor4.Here we use the distance defined by taking the absolute valus of the differences ofcorresponding coordinates, called theℓ∞-distance. This can be seen as follows.

For allx, y ∈ V we have

Nϕ(x) −Nϕ(y) = (

∫ 1

0

d(Nϕ)(1−t)x+tydt)(x− y).

The elements of the matrixM = (∫ 1

0d(Nϕ)(1−t)x+tydt) all are less than1/(4n).

This implies that for anyx ∈ Rn we have

|M(x, y)| ≤ 14 |x− y|.

This implies thatNϕ(Uδ(x0)) ⊂ U1

4 δ(x0),

where we note thatV = Uδ(x0) ⊃ U 14δ(x0). By the Contraction Principle (see

Appendix B), this implies thatx0 is the unique fixed point ofNϕ in V. Comparewith, e.g., [4],

For arbitrary smooth mapsϕ : Rn → Rn we definieNϕ(x) = x − B(ϕ(x) − x).Again it follows thatx ∈ Rn is a fixed point ofϕ if and only if x is a fixed point ofNϕ.

We now chooseε > 0 small enough to ensure that, if||ϕ− ϕ||1 < ε, we have for allx ∈ V :

- ||Nϕ(x) −Nϕ(x)|| < frac13δ;

-∣

∣

∣

∂Nϕ

∂xj(x)∣

∣

∣< 1

2n, for j = 1, 2, . . . , n.

Given this,Nϕ still maps the closure of theδ-neighbourhoodV of x0 within itselfand also the restrictionNϕ|V is a contraction. This implies that alsoNϕ, within Vhas a unique fixed point. 2

Remark. We like to point at the connection with the Newton method to find zeroes,since fixed points of a mapϕ exactly are the zeroes ofϕ− Id. According to§ 1.3.4,the corresponding Newton operator is

N(x) = x− (dϕx − Id)−1(ϕ(x) − x),

which is almost identical to the operatorNϕ we used in the above proof. The onlydifference is that we replaced dϕx by dϕx0. The reason for this is that thus smallvariations ofϕ with respect to theC1-distance, also lead to small variations ofNϕ

with respect to theC1-distance.

3.4 Persistence for the Doubling map 119

persistence!ofperiodicevolutions

3.3.2 Persistence of periodic evolutions

The persistence theory of periodic evolutions of dynamicalsystems with discretetime directly follows from the previous subsection. If sucha system is generated bythe smooth mapϕ, and if the pointx lies on a periodic evolution with (prime) periodk, thenx is a fixed point ofϕk. Now, if the derivative d(ϕk)x has no eigenvalue1,thenx is persitent as a fixed point ofϕk, and hence also as a periodic point ofϕ.

For dynamical systems with continuous time, given by (a system of) ordinary dif-ferential equations, the situation is somewhat different.A stationary evolution cor-responds to a zero of the associated vector field, sayX. For simplicity we assumethatX : Rn → Rn is a smooth map, say of classC1, such that the derivative dXat a zero has no eigenvalue0. The proof of persistence of such a zero is completelyanalogous to that of the persistence of fixed points of maps, cf. the proofs of theTheorems 3.5 and 3.7.

Regarding periodic evolutions of dynamical systems with continuous time, in§ 1.2.2we have seen how a (local) Poincare map can be constructed. Such a periodic evo-lution then corresponds to a fixed point of the Poincare map.Moreover, it is knownfrom the theory of ordinary differential equations, e.g., see [4], that forC1-smallperturbations of the associated vector fieldX the corresponding evolution operatorΦX , when restricted to a compact domain, also undergoes only aC1-small pertur-bation, which thus lead toC1-small perturbations of the corresponding Poincaremap. From this it follows that for a dynamical system defined by a vector fieldX,a periodic solution, such that the derivative of the corresponding Poincare map atits fixed point has no eigenvalue1, is persistent underC1-small perturbations ofX.Also compare with Exercise 3.4.

Remarks.

- It should be said that the above discussion is not complete.For instance,we have only defined theC1-distance for mapsRn → Rm and not for mapsbetween arbitrary smooth manifolds. Although here there are no essential dif-ficulties, many details are needed to complete all the arguments. In AppendixA we present some of these; more information can be found in [112].

- In general is it not so easy to get analytical information onPoincare maps.For an example of this see Exercise 1.20.

3.4 Persistence for the Doubling map

We already observed that persistent occurrence of chaotic evolutions often is hardto prove. In case of the Doubling map, however, the analysis is rather simple. We


Doubling map show here that in this case not only chaoticity is persistent, but that even the whole‘topology’ of the system persists in the sense ofstructural stability, to be definedbelow.

3.4.1 Perturbations of the Doubling map: persistent chaoticity

We return to the Doubling map from the circleR/Z onto itself, see§1.3.8. TheDoubling mapϕ is given byϕ([x]) = [2x], where[x] denotes the equivalence classof x ∈ R/Z. In general such a mapϕ : R/Z → R/Z can be lifted to a mapϕ : R → R,3 such that for allx ∈ R one has

ϕ(x+ 1) − ϕ(x) = k,

for an integer numberk, called thedegreeof ϕ, compare with Exercise 3.6. In caseof the Doubling mapϕ we havek = 2. We shall consider perturbationsϕ of theDoubling mapϕ with the same degree2. Moreover we only consider perturbationϕ of ϕ, the derivative of which is larger than1. By compactness of[0, 1] (and ofR/Z) it then follows that a constanta > 1 exists, such that

ϕ′(x) > a. (3.1)

It may be clear that all smooth mapsϕ that areC1-nearϕ and defined onR/Z,meetwith the above requirements.

Proposition 3.8 (Perturbations of the Doubling map I). For a smooth perturba-tion ϕ of the Doubling mapϕ, that is sufficientlyC1-nearϕ, we have.

1. As a map ofR/Z onto itself,ϕ has exactly one fixed point;

2. Modulo a conjugating translation, the above fixed point ofϕ equals[0];

3. As a map ofR/Z onto itself,ϕ has exactly2n − 1 periodic points of periodn;

4. For the mapϕ, as a map ofR/Z onto itself, the inverse image of each pointin R/Z consists of two points.

Proof. We prove the successive items of the proposition.

1. The mapx ∈ R 7→ ϕ(x) − x ∈ R is monotonically increasing and maps theinterval [0, 1) onto the interval[ϕ(0), ϕ(0) + 1). Therefore, there is exactlyone point inx ∈ [0, 1) that is mapped on an integer number. The equivalenceclass ofx modZ is the unique fixed point of the mapϕ in R/Z.

3For simplicity the lift ofφ is given the same name.


Figure 3.3: Possible graph ofϕ.

2. Let x be the fixed point, then by a ‘translation of the origin’ we canturn to aconjugated system, given by the mapϕ given by

ϕ(x) = ϕ(x+ x) − x,

compare with§2.2.3. Seen as a map withinR/Z, the fixed point ofϕ equals[0].

As we saw earlier, conjugated systems have ‘the same’ dynamics. Thereforeit is no restriction of generality to assume that[0] is the unique fixed point inR/Z.

3. This property can be checked easiest by consideringϕ as a map[0, 1) → [0, 1)with fixed point0. In this formatϕn exactly has2n−1 points of discontinuitythat divide the interval[0, 1) into2n intervals of continuity. This can be provenby induction. Each of the latter intervals is mapped monotonically over thefull interval [0, 1). Therefore in each such interval of continuity we find onefixed point ofϕn, except in the last one, since1 does not belong to[0, 1). Thisproves thatϕn indeed has2n − 1 fixed points.

4. The last property can be immediately concluded from the graph ofϕ, regardedas a map[0, 1) → [0, 1), compare with Figure 3.3.

2

Remark. From Proposition 3.8 is follows that the total number of periodic andeventually periodic points is countable. In fact, as we shall see later in this chapter,


when dealing with structural stability, the whole structure of the (eventually) peri-odic points, up to conjugation, is equal to the structure of the original mapϕ. Forour present purposes, namely showing that the occurrence ofchaotic evolutions forthe Doubling map is persistent, it suffices to know that the set of (eventually) peri-odic points is only countable. The importance is that this implies that many pointsexist that are not (eventually) periodic.

Proposition 3.9 (Perturbations of the Doubling map II). Under the conditions ofProposition3.8, if [x] ∈ R/Z is neither a periodic nor an eventually periodic pointof ϕ, then theϕ-evolution of[x] is chaotic.

Proof. The proof of this Proposition is almost the same as the corresponding proofof chaoticity of an evoluton of the Doubling map itself, thatis not (eventually)periodic, compare with§2.3.1. We briefly indicate which adaptations are needed.To start with, as for the case of the Doubling map, each positive evolution that is not(eventually) periodic has infinitely many different points. SinceR/Z is compact,such evolution also contains points that are arbitrarily close to each other (withoutcoinciding).

The most important difference betweenϕ andϕ is that in the latter map the deriva-tive is not constant. However, there exist constants1 < a < b, such that for allx ∈ [0, 1) one has

ϕ′(x) ∈ [a, b].

From this it directly follows that inR/Z an interval of lengthℓ < b−1 by ϕ ismapped to an interval at least of lengthaℓ. Furthermore, two points at a distanced < 1

2b−1 by ϕ are mapped to points at least of distancead.

By n-fold iteration ofϕ the distance between two nearby points therefore grows atleast by a factoran (implying exponential growth till saturation takes place). Fromthis it follows that the dispersion exponent, see§ 2.3.2, is at leastln(a) > 0. Thisimplies that Definition 2.8 applies to such evolutions, implying that they are chaotic.

2

In summary we now state:

Theorem 3.10 (Persistence of chaoticity in the Doubling map). If ϕ is a C1-perturbation of the Doubling mapϕ([x]) = [2x], such that

1. ϕ as a map onR has the propertyϕ(x+ 1) − ϕ(x) = 2 and

2. for eachx ∈ [0, 1) one hasϕ′(x) > 1,

thenϕ, as a map of the circleR/Z, has chaotic evolutions.


structural stabilityconjugation!topologicaequivalence!topologic

3.4.2 Structural stability

Below we revisit and generalize the notion of two dynamical systems being con-jugated. Roughly speaking this means that the systems coincide up to a changeof coordinates, where in many important cases the corresponding conjugations arejust homeomorphisms. It may be clear that conjugated systems share all dynamicaltheir properties in a qualitative way. When speaking of dynamical properties thatare persistent, our present interest is formed by cases where a given system is con-jugated to all nearby systems. Such a system will be called ‘structurally stable’. Itfollows that all dynamical properties of a structurally stable system are persistent ina qualitative way.

The notion of structural stability in the 1960’s and 70’s wasstrongly propagatedby Rene Thom [194]. In fact, the ensuing persistence properties make structurallystable dynamical systems very suitable for modelling purposes.

Definition 3.11 (Topological conjugation).Let two dynamical systems(Xj , T,Φj),j = 1, 2 be given withX1 andX2 toplogical spaces and with continuous evolutionoperatorsΦ1 andΦ2. We say that(Xj , T,Φj), j = 1, 2, are toplogically conjugatedwhenever there is a homeomorphismh : X1 → X2 such that for eacht ∈ T andx ∈ X1 one has

h(Φ1(x, t)) = Φ2(h(x), t). (3.2)

The maph is called atopological conjugation.

Compare with§ 2.2.3, in particular with the conjugation equation (2.5).

Remarks.

- If T = Z orN and the evolution operator therefore is given by a mapϕj = Φ1j ,

j = 1, 2, then the equation (3.2) is equivalent to

h ϕ1 = ϕ2 h.

- The conjugationh takes the dynamics inX1 to that inX2, in the sensethat each evolutiont 7→ Φ1(x0, t) by h is mapped to the evolutiont 7→Φ2(h(x0), t) = h(Φ1(x0, t)). Thus it also turns out that two dynamical sys-tems cannot be conjugated when their time sets do not coincide.

- When defining quasi-periodic evolutions, see Definition 2.4, we used differ-entiable conjugations. Here we won’t do this here for reasons that will beexplained in Chapter 5 and Appendix B. Also compare with Excercise 3.9.


- In cases whereT = R, also the notion oftopological equivalenceexists. Ahomeomorphismh : X1 → X2 between the state spaces of two dynamicalsystems is an equivalence if for eachx ∈ X1 there exists a monotonicallyincreasing functionτx : R → R, such that

h(Φ1(x, t)) = Φ2(h(x), τ(t)).

This condition means thath does map evolutions of the former system toevolutions of the latter, but that the time parametrisationdoes not have tobe preserved, although it is required that the time direction is preserved. Thisconcept is of importance when considering periodic evolutions. Indeed, undera conjugation the periods of the corresponding periodic evolutions do have tocoincide, which is not the case under an equivalence.

- The Russian terminology is slightly different, compare, e.g., [1]. In fact, atopological conjugation is called ‘orbital equivalence’,whereas structurallystable is called ‘rough’, etc.

Definition 3.12 (Structural Stability). A dynamical system(X, T,Φ) whereT = Z

or N is called (C1-) structurally stableif any dynamical system(X, T,Φ′) withΦ′ sufficientlyC1-nearΦ, is conjugated to(X, T,Φ). For dynamical systems withT = R the same definition holds, where we replace ‘conjugation’ by‘equivalence’.

Remarks.

- We have not yet given a general definition of ‘C1-near’. For such definitionson manifolds we refer to [112]. For a brief discussion also see Appendix A.Note that here we only use the concept of structural stability for the Doublingmap, in which case, by Definition 3.6, we already know what is meant byC1-small. In fact, aC1-small perturbation of the Doubling mapϕ is a mapϕ, that as a map withinR is C1-nearϕ, while always observing the relationϕ(x+ 1) − ϕ(x) = 2.

- More generally we speak ofCk-structural stability when such a conjugationor equivalence exist wheneverΦ andΦ′ areCk-nearby, again see [112] andAppendix A. Below by ‘structural stability’ we always meanC1-structuralstability. Theoretically one also may defineC0-structural stability, but such aconcept would be not very relevant. Compare with Exercise 3.7.

In this section we aim to show that the following.

Theorem 3.13 (Structural stability of the Doubling map). The Doubling map is(C1)-structurally stable.


Doubling mapsymbolic dynamics

Proof. In fact, we shall show that any perturbation, as already met in the previoussection on persistence of chaoticity, is conjugated with the original Doubling map.To do this, we shall investigate the set of eventually stationary evolutions of theDoubling mapϕ and its perturbationϕ (so withϕ(x+ 1) − ϕ(x) = 2 andϕ′ > 1),to greater detail.

We viewϕ andϕ as maps on the interval[0, 1) with a stationary point at0.We labelthe inverse images and the iterated inverse images of this point 0, as follows. Thestationary point0 first gets the label ‘0’. The inverse image of0 consists of twopoints: one in the ‘left half’ of the interval and the other inthe ‘right half’. Theformer of these, that coincides with the original stationary point nowget the label‘00’ and the latter ‘10’. The point with the label10 from now on will be calledthe first point in the ‘right half’ of the interval. This process we repeat inductively.Each label consists of a finite sequence of digits0 and1 and at each repetition thesequences get longer by one digit. The point in the left half get a label consisting ofa0 followed by the sequence of the point on which it is mapped andin the right halfwe get a label starting with1 followed by the sequence of the point in which it ismapped. In the diagram below we indicate the result. On thenth row we wrote thelabels of the elements that are mapped to the stationary point in n iterations, startingwith n = 0. Labels in one columb belong to the same point. The order of thepointsin the interval is equal to the horizontal order of the labelsin the diagram.

n = 0 0n = 1 00 10n = 2 000 010 100 110n = 3 0000 0010 0100 0110 1000 1010 1100 1110

Both for the Doubling mapϕ and its perturbationϕ the coding is equal to the se-quences as in binary expansions.

We define the setsAn = ϕ−n(0). The labels of the points inAn are the sequencesin the diagram of lengthn + 1. For ϕ we denote the corresponding set withAn. Itnow follows from the definition of this coding that the maphn : An → An with thefollowing properties preserves the coding:

- hn is an order preserving bijection, i.e., ifp < q ∈ An, then alsohn(p) <hn(q) ∈ An;

- Forp ∈ An the conjugation equationhn(ϕ(p)) = ϕ(hn(p)) holds;

- hn−1 is the restrictionhn|An−1 .

This statement can be proven by induction.


In the above we take the limit asn → ∞. To this end we define the setsA∞ =⋃

nAn andA∞ =⋃

n An. It will be clear thatA∞ is dense in[0, 1) : any point of[0, 1) is within distance2−n of An. Also A∞ is dense in[0, 1). To see this we recall(3.1), saying that for allx ∈ [0, 1) one hasϕ > a > 1. This implies that any pointof [0, 1) is within a−n-distance ofAn.

From the above properties ofhn it first of all follows that there exists exactly onemaph : A∞ → A∞, such that for eachn we havehn = h|An

. Secondly it followsthath is an order preserving bijection fromA∞ ontoA∞, where both of these setsare dense in[0, 1). This means thath has a unique extension to a homeomorphismh : [0, 1) → [0, 1). The proof of this is an exercise in topology, see Excercise3.7. Moreover for allp ∈ A∞ the conjugation equationh(ϕ(p)) = ϕ(h(p)) issatisfied. Now, sinceA∞ is dense in[0, 1) and sinceh is continuous, the conjugationequation also holds on all of[0, 1). This means thath is a conjugation from thesystem generated byϕ to the system generated byϕ. This proves structural stabilityof the Doubling map. 2

Remarks.

- The maph, as constructed in the above proof, in general will not be differen-tiable, not even of classC1. To see this we consider a periodic pointp of ϕ of(prime) periodk. We define theexponentof ϕ at such a periodic pointp as

λp =1

kln(dϕk

p).

From the definition of the Doubling map it follows that for theexponent ateach periodic point one hasλp = ln(2).

For ϕ the pointh(p) also is a periodic point of (prime) periodk and

λh(p) =1

kln(dϕk

h(p)).

The conjugation equationh ϕ = ϕ h implies that

h ϕk = ϕk h.If we would assume differentiability ofh, then by the Chain Rule it wouldfollow that

λp = λh(p),

compare with Exercise 3.9. This would lead to a contradiction, since it is easyto produce perturbationϕ for which not all exponents equalln(2). Thus forsuch perturbations the conjugationh cannot be differentiable.

- From the above it may be clear, that if in Definition 3.12 of structural sta-bility, we would require differentiability of the conjugation, in general thereare practically no structurally stable dynamical systems,in which case theconcept would be meaningless, see the Exercises 3.9 and 3.10.


3.4.3 The Doubling map modelling a (fair) coin

From the above one may have obtained the impression that whentwo dynamicalsystems are conjugated, all theire ‘most important properties’ coincide. This is,however, not alltogether true. We shall show here that conjugated systems can haveentirely different ‘statistical properties’. In Exercise3.12 we will see that also dis-persion exponents do not have to be preserved by conjugations.

We may view the Doubling map as a mathematical model for throwing a (fair) coin.Again, we consider the map as being defined on[0, 1). The left half[0, 1

2) we now

consider as ‘heads’ and the right half[12, 1) as ‘tails’. For a ‘random’ choice of

initial stateq ∈ [0, 1), the corresponding sequence of events regarding the coin ares0, s1, s2, . . . , where

sj =

‘heads’ whenϕj(q) ∈ [0, 12),

‘tails’ whenϕj(q) ∈ [12, 1).

A choice of the initial stateq thus is equivalent with the choice of an infinite heads-tails sequence. If the choice is random in the sense of Lebesgue measure, thenindeed for eachj the probabilities to get heads and tails are equal, and the eventsare independent for different values ofj. In this sense we can speak of the behaviourof a fair coin.

These assertions can be proven in various ways. Below we givea heuristic argumentthat can be made rigorous with the help of Probability Theory. It also can be adaptedto certain perturbationsϕp of ϕ to be dealt with below. We first deal with theindependence of different coin throwings. Under a random choice ofq ∈ [0, 1) withrespect to Lebesgue measure,q lands in an intervalI of lengthℓ with probabilityℓ. A similar remark goes for the probability thatq lands in a disjoint union of anumber of intervals. With help of this is it can be simply shown that, under randomchoice ofq, the pointϕ(q) lies in an intervalI, also is equal to the length ofI.Moreover, this event is stochastically independent of the fact whetherq ∈ [0, 1

2) or

(12, 1). This means the following. Apart from the fact that the probability Ps0 =

0 = Ps0 = 112

alsoPs1 = 0 = Ps1 = 112, while the events regardings0

ands1 are stochastically independent. By induction we get a similar statement forsi andsj , i 6= j. This accounts for the independence stated above. To be explicit, itfollows that for any sequence(s0, s1, . . . , sn) of zeroes and ones, the set of pointsin [0, 1] which are the intial states leading to a coding that starts inthis way, is aninterval of length, hence of ‘probability’,2−(n+1).

We next consider a modified Doubling mapϕp : [0, 1) → [0, 1) defined by

ϕp(x) =

1px as0 ≤ x < p,1

1−p(x− p) asp ≤ x < 1,

(3.3)


for a parameterp ∈ (0, 1). Before investigating the statistics ofϕp, we first addressa number of dynamical properties.

Remarks.

- The mapsϕp, for p 6= 12, are not smooth as maps withinR/Z : at the points0

andp the derivative jumps from1/(1 − p) to 1/p, and vice versa.

- Yet all mapsϕp, 0 < p < 1, are conjugated toϕ = ϕ 12. The proof as given

in the previous section, also works here. The conjugationhp that takes thesystem defined byϕ to that defined byϕp, maps0 to 0 and 1

2to p; this is an

immediate consequence of the conjugation equation.

We now turn to the statistics ofϕp. In this case we let the interval[0, p) coincide with‘heads’ and the interval[p, 1) with ‘tails’. It will be clear that, when choosing aninitial stateq ∈ [0, 1) at random with respect to Lebesgue measure, the probabilityto land in[0, p) equalsp. So if we view iteration ofϕp as a mathematical model fortossing a coin, the probability of ‘heads’ equalsp, which implies that forp 6= 1

2,

the coin is no longer fair. As for the casep = 12, however, the events for different

tossings of the coin are independent.

We conclude that the dynamics describing a fair coin is conjugated to that describingan unfair one. This has strange consequences. First of all the conjugating homeo-morphismshp are highly non-differentiable, especially on the dense setof periodicpoints. See the above remarks. Secondly, such homeomorphisms send subsets ofLebesgue measure0 to subsets of Lebesgues measure1 and vice versa. The lattercan be seen as follows. For eachp ∈ (0, 1) the set

Bp =

x ∈ [0, 1) | limn→∞

1n#j ∈ 0, 1, 2, . . . , n | ϕj

p(x) ∈ [0, p) = p

,

where# denotes the number of elements, has Lebesgue measure1. However, forp 6= 1

2, the maphp maps the setB 1

2in the complement ofBp, which is a set of

Lebesgue measure0. Conversely, the maph−1p also maps the setBp of Lebesgue

measure1 to a set of Lebesgue measure0 in the complement ofB 12.

Remarks.

- A final comment concerns the special choice we made of perturbationsϕ =ϕp of the Doubling mapϕ, which is inspired by the fact that their statisticalproperties are easy to deal with. A more general perturbation ϕ, as consideredearlier, does not only provide another model for an unfair coin, but in generalthe successive tossings, will be no longer stochastically independent.

- This probabilistic analysis of (deterministic) systems has grown to an exten-sive part of Ergodic Theory. An introductory treatment of this can be foundin [171]. In [17, 7] is dealt far more generally with this subject.

3.5 Exercises 129

3.5 Exercises

Exercise 3.1 (Typical evolution Van der Pol).What types of evolution occur per-sistently in the dynamical system given by the Van der Pol equation?

Exercise 3.2 (A set of positive measure).For the driven, undamped pendulum(see§ 1.1.2) withA = 0, it is known from KAM Theory that in particular thequasi-periodic subsystems, the frequency ratios of which are badly approximated byrationals, persist for (small) values of|A|. Such frequency ratios typically contain aset like

Fτ,γ = x ∈ R |∣

∣

∣

∣

x− p

q

∣

∣

∣

∣

>γ

qτ, for all p, q ∈ Z, q 6= 0

Show that, forτ > 2 andγ > 0 sufficiently small, the setFτ,γ has positive Lebesguemeasure.

Exercise 3.3 (Divergence and contraction of volume).For a vector fieldX onRn, with componentsX1, X2, . . . , Xn, thedivergenceis defined as

div(X) =∑

j

∂jXj.

By ΦX : Rn × R → Rn we denote the evolution operator associated toX, thatpossibly only is locally defined. For a subsetA ⊂ Rn we let volA(t) be the Lebesguemeasure ofΦt

X(A), whereΦtX(x) = ΦX(x, t).

1. Show thatddt

volA(0) =∫

Adiv(X).

It is the aim to show, with help of this result, that the driven, damped pen-dulum can’t have quasi-periodic evolutions. To this end we first determinethe relevant divergence. The 3-dimensional vector fieldX associated to thedriven pendulum has components(y,−ω2 sin x − cy + A cos(Ωz), 1), withc > 0.

2. Show that div(X) = −c.We now consider an arbitrary dynamical system with state spaceR3 (wherepossibly one or more coordinates have to be taken moduloZ or 2πZ) and con-tinous time, the dynamics of which is generated by a vector field X. We as-sume that there is a quasi-periodic subsystem the evolutions of which denselyfill the differentiable imageT of the 2-torusT2, whereT is the boundary of a3-dimensional domainW.

3. Show that in the above situation∫

Wdiv(X) = 0.


4. Show that the driven pendulum, with positive damping, can’t have any quasi-periodic evolutions.

Exercise 3.4 (Eigenvalues derivative Poincare map). Given a vector fieldX on amanifoldM with a closed orbitγ. For any sectionΣ that is transversal toγ, we candefine a local Poincare mapP : Σ → Σ with a fixed pointp ∈ Σ∩ γ. Show that theeigenvalues of the derivativedPp do not depend on the choice of the sectionΣ.

Exercise 3.5 (Smooth conjugations for flows).Let two vector fieldsx = F (x)and y = G(y), x, y ∈ Rm, be given. Consider a diffeomorphismy = Φ(x) ofRm. Show thatΦ takes evolutions of the former to the latter vector field in a timepreserving way if and only if

DxΦ · F (x) ≡ G(Φ(x)). (3.4)

(Hint: Use the Chain Rule and the Existence and Uniqueness Theorem [114, 115]for solutions of ordinary differential equations.)

Remark. In tensorial shorthand we often rewrite (3.4) asG = Φ∗F, compare with[48, 181].

Exercise 3.6 (Lifts and degree of circle maps).From the definition ofR/Z itfollows that a mapF : R → R induces a map withinR/Z if and only if for eachx ∈ R one has thatF (x + 1) − F (x) ∈ Z. WhenF is continuous, it follows thatF (x + 1) − F (x) is a constant (i.e., independent ofx), it is an integer called thedegreeof F.

1. Show that for each continuous mapf : R/Z → R/Z, there exists a continu-ous mapF : R → R, such thatf π = π F, whereπ : R → R/Z is thecanonical projection.

2. Show that for a givenf as the previous item, the continuous mapF is uniqueup to translation by an integer number (i.e., whenF andF are two such maps,thenF (x) − F (x) is an integer number independent ofx.

Remark. This integer again is thedegreeof the circle mapf, as introducedby Brouwer, see Appendix A.

Exercise 3.7 (NoC0-structural stability). We define theC0-distance between twomapsf, g : Rn → Rm by

||f − g||0 = supx∈Rn

|f(x) − g(x)|.

3.5 Exercises 131

Using this, it also will be clear what is understood byC0-small perturbations of amap within the circleR/Z.

Show that any map of the circle that has at least one fixed point, can’t beC0-structurally stable. (NB: It is even possible to prove this statement without assum-ing the existence of fixed points: for maps within the circle,C0-structural stabilitydoes not occur.)

Exercise 3.8 (Extension of a homeomorphism from a dense set). Suppose thatZ1, Z2 ⊂ R are dense subsets and thath : Z1 → Z2 is a bijection, such thatp < q ∈ Z1 implies thath(p) < h(q).

1. Show thath can be extended in a unique way to a bijectionh : R → R, suchthatp < q ∈ R implies thath(p) < h(q).

2. Show that this extensionh is a homeomorphism. (Here it is important thatthe topology ofR can be defined in terms of the order(<) : the setsUa,b =x ∈ R | a < x < b form a basis of this topology.

3. Prove the analogue of the two above statements for the casewhereR is re-placed by[0, 1).

Exercise 3.9 (Invariance of eigenvalues).Given two smooth mapsf, g : Rn → Rn

with f(0) = 0 andg(0) = 0. Let A = d0f andB = d0g. Assume that there is asmooth maph : Rn → Rn with h(0) = 0 (possibly only defined on a neighbourhoodof 0), such that the conjugation equation

h f = g h

is satisfied. Show that the matricesA andB are similar (i.e., conjugated by aninvertible linear map). What can you conclude about the eigenvalues ofA andB?

Exercise 3.10 (Structural stability under smooth conjugation). Show that any(nontrivial) translation onR is structurally stable under smooth conjugation.

Exercise 3.11 ((∗) A structurally stable map of the circle). Show that the mapwithin the circleR/Z, induced by the mapF (x) = x+ 1

10sin(2πx), is (C1-) struc-

turally stable.

Exercise 3.12 ((∗) Dispersion exponent of the modified Doubling map).Con-sider the modified Doubling mapϕp, for p ∈ (0, 1), as defined by (3.3). Computeits maximal dispersion exponentEp, showing thatEp > ln 2 asp 6= 1

2. (Hint: The

possible value ofEp depends on the initial statex ∈ R/mZ of the orbit. Here takethe maximal value over all possible values ofx.)

Chapter 4

Global structure of dynamicalsystems

Till now we have dealt mainly with the properties of individual evolutions of dy-namical systems. In this chapter we shall consider aspects that in principle concernall evolutions of a given dynamical system. We already saw some of this whendealing with structural stability, see§ 3.4.2, which implies that underC1-small per-turbations the ‘topology’ of the collection of all evolutions does not change.

We first deal with attractors from a general perspective. Recall that we have alreadymet several examples of these like the Henon attractor. These ‘configurations’ inthe state space were obtained by taking a rather arbitrarilyinitial state, iteratinga large number of times, and then plotting the correspondingstates in the planewith deletion of a rather short initial (transient) part. Compare with Figures C.6and 1.13. The fact that these ‘configurations’ are independent of the choice ofinitial state, indicates that we are dealing here with attractors. Wherever one starts,always the evolution tends to the attractor and in the long run such an evolution alsoapproximates each point of it. Our first problem now is to formalise this intuitiveidea of attractor.

A dynamical system can have more than one attractor (this situation is often calledmultistability). Then it is of interest to know what domainsof the state space con-tain the initial states of the evolutions that converge to the various attractors. Sucha domain is called the basin of attraction of the corresponding attractor. Evolutionsstarting at the boundary of two basins of attraction, will never converge to any at-tractor. In simple cases these basin boundaries are formed by the stable separatricesof saddle points, but we shall see that such boundaries in general can be far morecomplicated. We now turn to formally defining attractors andtheir basins.

134 Global structure of dynamical systems

positively compact£“omega£-limit set

4.1 Definitions

Let (M,T,Φ) be a dynamical system, whereM is a topological space andΦ con-tinuous. Assume that for eachx ∈ M the positive evolutionO+(x) = Φ(x, t) |T ∋ t ≥ 0 is relatively compact, which means that it has a compact closure.1

According to Definition 2.7, see§2.3.2, this evolution is calledpostively compact.

Definition 4.1. For x ∈M we define theω-limit set by

ω(x) = y ∈ M | ∃tjj∈N ⊂ T, such thattj → ∞ andΦ(x, tj) → y

For a few simple consequences of Definition 4.1 we refer to Exercise 4.1. Theω-limit setω(x), and the dynamics therein, determines the dynamics startingat x, insofar as this occurs in the far future. To be more precise:

Lemma 4.2. For any neighbourhoodU ofω(x) inM there existstU ∈ R, such thatfor eachtU < t ∈ T one has thatΦ(x, t) ∈ U.

Proof. Suppose that for a givenU, such atU does not exist. Then, there is asequencetjj∈N ⊂ T, such thattj → ∞ but whereΦ(x, tj) 6∈ U. Since weassumed all positive evolutions to be relatively compact, the sequencetjj∈N musthave an accumulation point, sayp. Such an accumulation pointp on the one handcan’t be in the interior ofU, but on the other hand, by Definition 4.1, necessarilyp ∈ ω(x). This provides a contradiction. 2

From Lemma 4.2 it follows that the evolution starting inx, in the long run tendsto ω(x). The fact that the evolution operatorΦ is continuous then implies that, asthe evolution starting atx approachesω(x) better, it can be approximated moreprecisely by evolutions withinω(x), at least when we restrict ourselves to finitesegments of evolutions.

Remark. In an implicit way, theω-limit set already was discussed when speakingof asympoticallystationary, periodic or multi-periodic evolutions. Indeed, for suchevolutions, theω-limit set consists of a stationary, a periodic or a multi-periodicevolution, viewed as a subset of state space. For chaotic evolutions we expect morecomplicatedω-limit sets.

Already theω-limit set looks like what one might like to call ‘attractor’. Indeed,one may say thatω(x) is the ‘attractor’ of the evolution starting atx. Yet, for realattractors we need an extra property, namely that the set of initial states for whichtheω-limit set is the same, is ‘large’ to some extent. The choice of a good definitionof the concept of attractor is not so simple. In contrast withω-limit sets, on the

1In Rn this just means that the set is bounded.

4.1 Definitions 135

saddle-nodebifurcation

attractor

definition of which everyone seems to agree, there are a number of definitions ofattractor in use. For a careful discussion we refer to [136].Here we shall give adefinition that connects well to the concepts introduced earlier, but we do not claimthat our choice is the only ‘natural’ one.

We might put forward that anω-limit set ω(x) is an attractor whenever there is aneighbourhoodU of x in M, such that for ally ∈ U one hasω(y) = ω(x). Thisattempt of a definition looks satisfactory for the cases in which theω-limit set isstationary, periodic or quasi-periodic. However, we note two problems with this.The first of these we illustrate with an example.

Example 4.1 (Saddle-node bifurcation on the circle).On the circleR/2πZ con-sider the dynamical system generated by the diffeomorphism

f(x) = x+ 12(1 − cos x) mod2πZ.

It may be clear that under iteration off each point of the circle eventually convergesto the point[0]. However, for an initial state corresponding to a very small positivevalue ofx, the evolution first will move away from[0], only to return to[0] from theother side.

Cases like Example 4.1 shall be excluded by requiring that anattractor has arbitrar-ily small neighbourhoods, such that positive evolutions starting there, also remaininside. The second problem then has to do with what we alreadynoted in§ 3.1,namely that the initial states of chaotic evolutions very often can be approximatedby initial states of (asymptotically) periodic evolutions. Based on this, we expectthat theω-limit set Λ of a chaotic evolution also will contain periodic evolutions.The points on these periodic evolutions then are contained in any neighbourhood ofΛ, but they have a much smallerω-limit set themselves. This discussion leads tothe following.

Definition 4.3 (Attractor). We say that theω-limit setω(x) is anattractor, when-ever there exist arbitrarily small neighbourhoodsU of ω(x) such that

Φ(U × t) ⊂ U

for all 0 < t ∈ T and such that for ally ∈ U one has

ω(y) ⊂ ω(x).

Remarks.

- Referring to Exercise 4.1 we mention that, without changing our concept ofattractor, in the above definition we may replace ‘such that for all y ∈ U onehasω(y) ⊂ ω(x)’ by ‘such that

⋂

0≤t∈T Φt(U) = ω(x)’.


basin!of attraction - In other definitions of attactor [136] often use is made of probability measuresor classes of such measures.

- A definition that is more common than Definition 4.3 and that also does notuse measure theoretical notions, is the following alternative.

A closed subsetA ⊂ M is an attractor wheneverA contains an evolutionthat is dense inA, and ifA contains arbitrarily small neighbourhoodU inM, such thatΦ(U × t) ⊂ U for all 0 < t ∈ T and such that for ally ∈ Uone hasω(y) ⊂ A.

Here the requirement that the attractor is anω-limit set has been replaced bythe stronger requirement that it is theω-limit set of a pointinsideA. It is pos-sible to give examples of an attractor according to Definiton4.3 that does notsatisfy the alternative definition we just gave. Such examples are somewhatexceptional, compare with Exercise 4.2. Regarding the considerations thatfollow, it makes no difference which of the two definions one prefers to use.

- In Definition 4.3 (nor in its alternative) it is not excludedthatA = M. Thisis the case for the Doubling map, for the Thom map (compare with § 2.3.5)as well as for any other system where the whole state space occurs as theω-limit set of one state. (NB: Since we require throughout thatevolutions arepositively compact, it follows that nowM also has to be compact.)

- There are many dynamical systems without attractors. A simple example isobtained by takingΦ(x, t) = x, for all x ∈M andt ∈ T.

Definition 4.4 (Basin of Attraction). WhenA ⊂ M is an attractor then thebasinof attractionofA is the setB(A) = y ∈M | ω(y) ⊂ A.

Remark. It may be clear that a basin of attraction always is an open set. Often theunion of the basins of all attractors is an open and dense subset ofM. However,examples exist, where this is not the case. A simple example again is given bythe last item of the remark following Definition 4.3, where the time evolution is theidentity. Also less exceptional examples exist. We here mention the Logistic systemfor the parameter value that is the limit of the parameter values of the successiveperiod doublings, the so-called Feigenbaum attractor. This attractor does not attractan open set, but it does attract a set of positive measure, formore information see[77].

4.2 Examples of attractors

We start reviewing the simplest types of attractor already met before:

4.2 Examples of attractors 137

Doubling map onthe plane

- The stationary attractor or point attractor, e.g., in the dynamics of the free,damped pendulum, see§1.1.1;

- The periodic attractor, occurring for the free Van der Pol equation, see§2.2.3;

- The multi- or quasi-periodic attractor, occurring for thedriven Van der Polequation, see§2.2.3, 2.2.5.

We also met a number of ‘attractors’ with chaotic dynamics. Except for the Dou-bling map (§§1.3.8, 2.3.1, 2.3.4, also see below) and for the Thom map (§2.3.5), wediscussed the Henon attractor (§§1.3.2, 2.3.4), the Rossler attractor (§§1.3.7, 2.3.4),the Lorenz attractor (§§1.3.6, 2.3.4) and the attractors that occur in the Logisticsystem (§§1.3.3, 2.3.4). We already said that not in all these example it has beenproven that we are dealing with an attractor as just defined. In some of the cases amathematical treatment was possible after adaptation of the definitions (like in thecase of the ‘Feigenbaum attractor’, mentioned just after Definition 4.4). There is aclass of attractors, the so-calledhyperbolicattractors, that contain chaotic dynam-ics and, notwithstanding this, allows a detailed mathematical description. Theseattractors have a certain similarity with the Doubling map.The below descriptionof these attractors therefore will be based on the case of theDoubling map as far aspossible.

4.2.1 The Doubling map and hyperbolic attractors

The attractor of the Doubling map, which is the whole state space, will not beconsidered hyperbolic. We give two modifications, one on theplane and the otherin 3-space. Only the latter of these, the Solenoid, will be a hyperbolic attractoraccording to the definition given below. Also in higher dimensions such hyperbolicexamples exist.

4.2.1.1 The Doubling map on the plane

We first present an example of a dynamical system, the state space of which isR2,that contains an attractor the dynamics on which coincides with that of the Doublingmap. For this purpose on the plane we take polar coordinates(r, ϕ), as usual definedby

x = r cosϕ, y = r sinϕ.

We now define a mapf(r, ϕ) = (R(r), 2ϕ),whereR : R → R is a diffeomorphism,such that the following holds:

- R(−r) = −R(r);


- For r ∈ (0, 1) we haveR(r) > r while for r ∈ (1,∞) we haveR(r) < r;

- R′(1) ∈ (0, 1).

It may be clear that for each pointp ∈ R2 \0 it follows that the iteratesf j(p) tendto the unit circleS1 = r = 1 asj → ∞. On this circle, that is invariant under themapf, we exactly have the Doubling map.

This definition made the unit circle with Doubling map dynamics, to a ‘real’ attrac-tor, in the sense that it is smaller than the state spaceR2. We shall now argue thatthis attractor is not at all persistent under small perturbations of the mapf. To beginwith, consider the saddle pointp of f, in polar coordinates given by(r, ϕ) = (1, 0).The eigenvalues of the derivative off at p are as follows. The eigenvalue in ther-direction belongs to(0, 1) and the other, in theϕ-direction, equals2. By the InverseFunction Theorem, the restriction off to a small neighbourhood ofp, is invertible.This means that there exist local stable and unstable separatrices ofp as discussedin §1.3.2. Then the global stable and unstable separatrix atp are obtained by itera-tion of f−1, respectivelyf. Sincef is not invertible, we here restrict to the globalunstable separatrixW u

f (p). It is easy to see that

W uf (p) = r = 1 = S1. (4.1)

What happens to this global picture when perturbingf? Let us perturbf to f wheref still coincides withf in the sector−α < ϕ < α, then the unstable separatrixW u

f(p) coincides withW u

f (p) in the sector−2α < ϕ < 2α. We apply this idea for

α ∈ (13π, 1

2π). So in that case the segment

Sα = r = 1, ϕ ∈ (−2α, 2α) ⊂ S1

of the unit circle is a subset ofW uf(p). Now we really make a change fromf to f

in the sector|ϕ| ∈ (α, 2α). This can be done in such a way that, for instance,

f(Sα) ∩ Sα

contains a non-tangent intersection. This creates a situation that completely differsfrom the unperturbed case as described by (4.1).

Remarks.

- We emphasise thatf is not invertible, noting that for a diffeomorphism globalseparatrices can’t have self intersections.

- From § 3.3 recall that a saddle pointp of a mapf is persistent in the sensethat for aC1-small perturbationf of f we get a saddle pointp that may beslightly shifted.


solenoid- It also can be shown, although this is harder [139, 16], thataC1-small per-turbation f of f still has an attractor and that both the saddle pointp andits unstable separatrixW u

f(p) are part of this. This implies that the original

attractor, i.e., the unit circleS1, can undergo transitions to attractors with alot more complexity. The present example of the Doubling mapin the planetherefore has little ‘persistence’ in the sense of Chapter 3. However, in dimen-sion 3 it turns out to be possible to create a persistent attractor based on theDoubling map, but this attractor will have a lot more topological complexitythan a circle.

- Although the attractor of the planar Doubling map is not persistent in theusual sense, we shall see later in this chapter, that it has persistent featuresregarding its ‘stable foliation’.

- We like to mention here that the Doubling map also occurs when consideringthe polynomial mapf : C → C, z 7→ z2, namely as the restriction to thecomplex circleS1 ⊂ C, which is a repellor, the so-called Julia set. Comparewith Exercise 1.19. Also within the holomorphic setting, this situation is notvery persistent. Compare with, e.g., [5] and several contributions to [22].

4.2.1.2 The Doubling map in 3-space: the Solenoid

The idea is now in 3-space to map a ‘fattened’ circle without self intersections ontoitself in such a way that the image runs around twice as compared to the original.We shall make this more precise with pictures and formulæ.

Next to the Cartesian coordinatesx, y andz, on R3 we also use cylindrical coor-dinatesr, ϕ andz. As before we then havex = r cosϕ andy = r sinϕ. We aredealing with the circle

S1 = r = 1, z = 0.We fatten the circle to a domainD, given by

D = (r − 1)2 + z2 ≤ σ2,

where0 < σ < 1. As indicated before, the mapf is going to mapD within itself.First we construct the restrictionf |S1 by

f(1, ϕ, 0) = (1 + µ cosϕ, 2ϕ, µ sinϕ),

where0 < µ < σ; the first inequality avoids self intersections, while the secondensures thatf(S1) ⊂ D. Note that in the limit caseµ = 0, the circleS1 is mapped


D2ϕ0

D2ϕ0

Dϕ0

f(Dϕ0) f(Dϕ0+π)

S1

Figure 4.1: Left: the circleS1, the fatteningD andf(S1) ⊂ D.Right: the discD2ϕ0

with insidef(Dϕ0) andf(Dϕ0+π).

onto itself according to the Doubling map. Next we definef on all of the fattenedcircleD by

f(1 + r, ϕ, z) = (1 + µ cosϕ+ λr, 2ϕ, µ sinϕ+ λz). (4.2)

Hereλ > 0 has to be chosen small enough to haveµ + λσ < σ, which ensuresthat f(D) ⊂ D, and to haveλσ < µ, which ensures thatf |D also has no selfintersections. In theϕ-direction a Doubling map takes place: each disc

Dϕ0 = (r, ϕ, z) | (z − 1)2 + z2 ≤ σ2, ϕ = ϕ0

is mapped within the discD2ϕ0 , where it is contracted by a factorλ, compare withFigure 4.1.

The mapf : D → D generates a dynamical system, to be called the Solenoidsystem. This dynamical system will have an attractorA ⊂ D. To describeA, wedefine the setsKj = f j(D). Sincef(D) ⊂ D, alsoKi ⊂ Kj wheneveri > j. Wethen define

A =⋂

j∈N

Kj .

Before showing thatA indeed is an attractor off, we first investigate its geometry.


Cantor setTheorem 4.5 (Cantor set transverse to Solenoid).A ∩Dϕ0 is a Cantor set.

Proof. A Cantor set is characterized as a (non-empty) compact metric space, that istotally disconnected and perfect, for details see AppendixA. The fact thatA∩Dϕ0

is totally disconnected means that each pointa ∈ A ∩ Dϕ0 has arbitrarily smallneighbourhoods with empty boundary.

We just definedDϕ0 = D ∩ ϕ = ϕ0. From the definitions it now follows thatKj ∩ Dϕ0 consists of2j discs, each of diameterσλj, and such that each disc ofKj−1 ∩ Dϕ0 contains two discs ofKj ∩Dϕ0 in its interior. The discs ofKj ∩Dϕ0

can be coded by the integers between0 and2j − 1 as follows. The disc with indexℓ then is given by

f j(

D2−j(ϕ+2πℓ)

)

Two discs with indicesℓ < ℓ′ that differ by2j−1 lie in the same disc ofKj−1 ∩Dϕ0 ;that particular disc inKj−1 ∩Dϕ0 then has indexℓ.

We now can simplify by expressing these integers in the binary system. The discsof Kj ∩Dϕ0 correspond to integers that in the binary system have lengthj (wheresuch a binary number may start with a few digits0). Discs contained in one samedisc ofKi ∩Dϕ0 , with i < j, correspond to numbers the finali digits of which areidentical. In this way, the elements ofA ∩Dϕ0 correspond to binary expressions ofthe form

(. . . , α2, α1, α0),

so withαj = 0 or 1. From Algebra this set is known as the 2-adic closure of theintegers: integers then are closer together whenever theirdifference contains a largerpower of2. Compare with Exercise 4.6.

From the above construction it may be clear thatA∩Dϕ0 has the following proper-ties:

1. It is the limit of a decreasing sequence of non-empty compact setKi ∩Dϕ0 ,and hence itself also compact and non-empty;

2. A ∩ Dϕ0 is totally disconnected, meaning that each pointa ∈ A ∩ Dϕ0 hasarbitrarily small neighbourhoods inA∩Dϕ0 with an empty boundary. Indeed,if

a↔ (. . . , α2, α1, α0),

then, as neighbourhoods one can take setsUj of points that correspond to suchexpressions, the finalj elements of which are given by(αj−1, . . . , α1, α0), inother words all points in one same disc ofKj ∩Dϕ0 .

3. Any pointa ∈ A∩Dϕ0 is accumulation point ofA∩Dϕ0 , indeed, the setsUj

we just defined, apart froma also contain other points.


From these three properties is follows thatA ∩ Dϕ0 is a Cantor set, again see Ap-pendix A. 2

Remark. We note however, thatA is not homeomorphic to the product of circleand Cantor set, which is exactly why it is called Solenoid attractor. This fact isrelated to the way in which we assigned integer numbers to thediscs inKj ∩ Dϕ0

is different forϕ0 = 0 and forϕ0 = 2π. Since the ‘discontinuity’ in the angularcoordinate (instead ofϕ ∈ [0, 2π), we can just as well takeϕ ∈ [ϕ0, ϕ0 + 2π)), thismeans that if for the coordinateϕ we exclude one valueϕ0, the remaining set

a ∈ A | ϕ(a) 6= ϕ0is homeomorphic to the product of interval and Cantor set. In the Exercises 4.4and 4.6 we enter deeper into the algebraic description of theSolenoid and the cor-responding dynamics.

Theorem 4.6 (Solenoid as attractor).A is an attractor of the dynamics generatedby the Solenoid mapf, see(4.2).

Proof. We show thatA is anω-limit set and thus an attractor. The problem is tofind a suitable pointp ∈ R3, such thatω(p) = A. For p we take an element ofD,such that the projection of its orbit onS1 is a dense orbit of the Doubling map.

Now let q ∈ A be an arbitrary point with angular coordinateϕ(q) and let the binaryexpression ofq as an element ofDϕ(q) be given by

(. . . , α2(q), α1(q), α0(q)).

We now have to show thatq ∈ ω(p) meaning that the positive evolutionf j(p)j≥0

has elements that are arbitrarily nearq. So letV be an (arbitrarily) small neighbour-hood ofq. ThenV contains a neighbourhood that can be obtained by restricting theangular coordinate to anε-neighbourhood ofϕ(q) and by restricting to the elementsof the corresponding binary expressions, that for a suitablej end like

αj−1(q), . . . , α0(q).

(Here we assume that the angular coordinate has no ‘discontinuity’ in theε-neighbourhoodof ϕ(q).) Such a neighbourhood also can be written as

f j

(

ϕ0 − ε

2−j< ϕ <

ϕ0 + ε

2−j

)

,

for someϕ0 and someε > 0. Since theϕ coordinates of the positive evolution ofpare dense inS1, and therefore also contains points of the interval

(

ϕ0 − ε

2−j,ϕ0 + ε

2−j

)

,

the positive evolution ofp also contains points ofV. This concludes the proof. 2


hyperbolicity4.2.1.3 The Solenoid as a hyperbolic attractor

As indicated earlier, the Solenoid is an example of an important class of attractors.This class ofhyperbolic attractorswill be defined here, while we also show thatthe Solenoid is one of these. Also we will discuss a few properties that characterisehyperbolicity.

Definition 4.7 (Hyperbolicity, discrete time). We call an attractorA ⊂ M forthe dynamics generated by a mapf : M → M hyperbolic if f, restricted to aneighbourhood ofA, is a diffeomorphism and if for each pointp ∈ A there existssplittingTp(M) = Eu(p) ⊕ Es(p) with the following properties:

1. The linear subspacesEu(p) andEs(p) depend continuously onp, their di-mensions being independent ofp;

2. The splitting is invariant under the derivative off :

dfp(Eu(p)) = Eu(f(p)) anddfp(E

s(p)) = Es(f(p));

3. The non-zero vectorsv ∈ Eu(p) (respectivelyEs(p)) increase (respectivelydecrease) exponentially under repeated application of df in the followingsense: There exist constantsC ≥ 1 and λ > 1 such that for anyn ∈ N,p ∈ A, v ∈ Eu(p) andw ∈ Es(p) we have

| dfnp (v) |≥ C−1λn|v| and | dfn

p (w) |≤ Cλ−n|w|,where| − | is the norm of tangent vectors with respect to some Riemannianmetric.

We call the linear subspaceEu(p), respectivelyEs(p), of Tp(M) the space ofun-stable(respectivelystable) tangent vectors atp.

Remarks.

- It is not hard to show that, if such a hyperbolic splitting exists, then it isnecessarily unique. Indeed, a vector inTp(M) only belongs toEs(p) when itshrinks exponentially under iteration of df. Similarly a vector inTp(M) onlybelongs toEu(p) when it shrinks exponentially under iteration of df−1.

- For the above definition it is not of importance to know what Riemannianmetric onM has been chosen for the norms. Indeed, using the compactnessof A, one can prove that whenever such estimates hold for one Riemannianmetric these also hold for any other choice. Here it may be necessary toadapt the constantC. Moreover, it can be shown that always a Riemannianmetric exists whereC = 1. (Here it is needed thatA is compact, which in ourcase follows from our general assumption that all evolutions are positivelycompact.)


- Hyperbolic attractors also can be defined for dynamical systems with con-tinuous time, given by a vector field, sayX. In that case the definition runsdifferently. Indeed, ifΦt denotes the flow ofX over timet, then for eachp ∈ M and for eacht ∈ R we have

d(Φt)X(p) = X(Φt(p)).

Therefore, if the positive evolution ofp is relatively compact, d(Φt)X(p)can’t grow exponentially. (We assume thatX is at least continous.) More-over, if the positive evolution ofp does not converge to an equilibrium pointof X, then d(Φt)X(p) also can’t shrink exponentially. This means that vec-tors in theX-direction can’t participate in the ‘hyperbolic behaviour’, whichleads to the following definition.

Definition 4.8 (Hyperbolicity, continuous time). An attractorA ⊂ M of a dynam-ical system, defined by a smooth vector fieldX with evolution operatorΦ, is calledhyperbolicif for eachp ∈ A there exists a splittingTp(M) = Eu(p)⊕Ec(p)⊕Es(p)with the following properties:

1. The linear subspacesEu(p), Ec(p) and Es(p) depend continuously onp.while their dimensions are independent ofp :

2. For anyp ∈ A, the linear subspaceEc(p) is 1-dimensional and generated byX(p);

3. The splitting is invariant under the derivative dΦt in the sense that for eachp ∈ A andt ∈ R :

dΦt(Eu(k)) = Eu(Φt(k)) anddΦt(Ec(k)) = Ec(Φt(k))

anddΦt(Es(k)) = Es(Φt(k));

4. The vectorsv ∈ Eu(p) (respectivelyEs(p)), increase (respectively decrease)exponentially under application of dΦt as a function oft, in the followingsense; constantsC ≥ 1 andλ > 1 exist, such that for all0 < t ∈ R, p ∈ A,v ∈ Eu(p) andw ∈ Es(p) we have:

| dΦt(v) |≥ C−1λt|v| and | dΦt(w) |≤ Cλ−t|w|,

where|−| is the norm of tangent vectors with respect to a Riemannian metric.

Remark. With respect to the Lorenz attractor mentioned before (§§1.3.6, 2.3.4)and that will again be discussed later in this chapter, we note that a hyperbolicattractor of a system with continuous time (and defined by a vector field) cannothave equilibrium points, i.e., zeroes of the vector field. This implies in particularthat the Lorenz attractor cannot be hyperbolic.


solenoidWe now turn to the Solenoid system given by the mapf in (4.2). In Theorem 4.6we already showed that this system has an attractorA. The solenoid geometry ofAwas identified in Theorem 4.5 as a (nontrivial) bundle of Cantor sets over the circleS1.

Theorem 4.9 (Hyperbolicity of the Solenoid).The Solenoid attractorA, as de-scribed in Theorems4.6and4.5, is hyperbolic.

Proof. We need to identify the splitting of the tangent bundleTp(M) = Eu(p) ⊕Es(p) according to Definition 4.7.

The choice of the stable linear subspacesEs(p) is straightforward: these are just thetangent spaces to the discsDϕ. The corresponding estimates of Definition 4.7 thencan be directly checked.

Regarding the unstable subspacesEu(p), one would expect that these lie approxi-mately in theϕ-direction, but the existence of this component with the prescribedproperties, is not completely obvious. There exists however a method to show thatthe unstable subspacesEu(p) are well-defined. Here we start off with approxi-mate1-dimensional choices of the unstable subspacesE(p) that lie exactly in theϕ-direction. In other words,E(p) is tangent to the circle throughp, defined by tak-ing bothr andz constant. Now it is possible to improve this approximation in asuccessive way, that converges and where the limit exactly is the desired subspaceEu(k). To describe this process of improvement, we need some terminology.

A mapp ∈ A 7→ E(p) ⊂ Tp(M) that assigns to each point a linear subspace of thetangent space ofM, is called adistribution, defined onA. Compare with [181]. Weassume that such distributions have the property thatdim E(p) is constant and thatE is continuous.

On the set of all such distributions we define a transformation Tf by

Tf (E)(k) = dff−1(k)E(f−1(k)).

It can be generally shown that for a distributionE , defined onA and sufficiently nearthe distribution of unstable vectors, under repeated application ofTf will convergeto the distributionEu of unstable vectors

Eu = limn→∞

(Tf )n(E). (4.3)

In Exercise 4.7 the convergence of this limit process for theSolenoid attractor willbe dealt with. 2

Remark. For more details regarding this and other limit procedures associated tohyperbolicity, we refer to [113] and to the first appendix of [16]. We do mentionhere that, if in a general setting we wish to get a similar convergence to the stabledistributionEs, we need to use the transformationTf−1 .


foliationstable equivalence

4.2.1.4 Properties of hyperbolic attractors

We mention a number of properties that generally hold for hyperbolic attractorsand that we already partly saw for the Solenoid. For proofs ofthe properties tobe discussed here, we refer to the literature, in particularto [166, 113] and to thefirst appendix of [16]. The main aim of this section is to indicate why hyperbolicattractors are persistent underC1-small perturbations, while the dynamics inside isstructurally stable.

In the following we assume to deal with a dynamical system as before, with a statespaceM, time t mapΦt and a hyperbolic attractorA. Thus, there exists an openneighbourhoodU ⊃ A, such thatΦt(U) ⊂ U for all t > 0 and

⋂

T∋t>0(U) = A. Inthe case of time setN, we also assume that the restrictionΦ1|U is a diffeomorphismonto its image.

One of the tools used for proving persistence of hyperbolic attractors is the ‘stablefoliation’. First we give a general definition of a foliationof a manifoldV, which isa decomposition ofV in lower dimensional manifolds, to be called the leaves of thefoliation.

Definition 4.10 (Foliation). LetV be a manifold. Afoliation of V is a map

F : v ∈ V 7→ Fv,

whereFv is an injectively immersed submanifold ofV, in such a way that for certaininteger numbersk, ℓ andh one has

1. If w ∈ Fv, thenFv = Fw;

2. For eachv ∈ V the manifoldFv is of classCk;

3. For any v ∈ V there existCℓ-coordinatesx = (x1, . . . , xm), defined ona neighbourhoodVv ⊂ V of v, such that for anyw ∈ Vv, the connectedcomponent ofFw ∩ Vv has the form

xh+1 = xh+1(w), . . . , xm = xm(w).

It may be clear thath is the dimension of the leaves of the foliation and that allleaves areCk-manifolds, where alwaysk ≥ ℓ.. We callℓ the differentiability of thefoliation. Here it is not excluded thatℓ = 0, but in the foliations to follow, alwaysk ≥ 1.

Definition 4.11 (Stable equivalence).For a hyperbolc attractorA of a dynamicalsystem with timet mapsΦt and a neighbourhoodU ⊃ A as above, i.e., such that


stable foliationΦt(U) ⊂ U for 0 < t ∈ T and∩0<t∈T Φt(U) = A, two pointsv, w ∈ U are calledstably equivalentif

limt→∞

ρ(Φi(v),Φi(w)) = 0,

whereρ is the distance function with respect to a Riemannian metriconM.

For a hyperbolic attractor it can be shown that the connectedcomponents of theequivalence classes of this stable equivalence, are manifolds that define a foliationin U. This is called the ‘stable foliation’, to be denoted byF s.

Theorem 4.12 (Stable Foliation).Let U ⊃ A be as in Definition4.11. Then forthe stable foliationF s one has

1. The leaves ofF s are manifolds of dimension equal todimEs(p), p ∈ A, andof differentiability equal to that ofΦt;

2. At each pointp ∈ A one has that

TpF s(p) = Es(p);

3. F s is of classC0;

4. For anyp ∈ U enT ∋ t > 0 one has that

Φt(F s(p)) ⊂ F s(Φt(p)),

i.e., the foliation isinvariantunder the forward dynamics.

In case the leaves have codimension1 in M, and in a few exceptional cases, it canbe shown thatF s is of classC1. For proofs of Theorem 4.12, see the referencesgiven before. We now explain the meaning of Theorem 4.12 in a couple of remarks.

Remarks.

- In the case of the Solenoid it may be clear what the stable foliation amountsto: its leaves are just the 2-dimensional manifoldsDϕ on which the angleϕis constant.

- Although this is perhaps not the most interesting case, it may be well to ob-serve that Theorem 4.12 alo applies in the case of a hyperbolic point attractor.Indeed, ifA = p is such a point attractor, thenEs(p) = Tp(M) and thefoliationF s only has one leaf, which is an open neighbourhood ofp in M.


- On a neighbourhoodU ⊃ A as in Theorem 4.12 we can define an induceddynamics on the set of leaves ofF s. In the case of the Solenoid, this set ofleaves in a natural way is a1-dimensional manifold, namely the circleS1, onwhich the induced dynamics is conjugated to the Doubling map.

In general it may be expected that such a construction leads to a manifold ofdimensiondimEu(p), with on this manifold an induced dynamics. In reality,the situation is more complicated than this, but in the case wheredimEu(p) =1, this idea has been fruitful [209].

- Even in cases where no hyperbolic attractor exists, it may occur that a stablefoliation exists with certain persistence properties. In the example of the Dou-bling map on the plane, see§4.2.1.1, we can define a stable foliationF s

f ona neighbourhoodU = 1

2< r < 2 of the circular attratorS1. The leaves of

this foliation then are the line segments on whichϕ is constant and on whichr ∈ (1

2, 2).

This foliation is even persistent in the sense that for any map f , that is suffi-cientlyC1-nearf, there is a stable foliationF s

fin the sense of Theorem 4.12.

In this case it turns out that the set of leaves naturally getsthe structure of acircle, on which the induced dynamics is conjugated to the Doubling map.

In this respect we can speak of semi-conjugations as follows. If (M1,Φ1, T )and(M2,Φ2, T ) are dynamical systems, then we call a continuous and sur-jective maph : M1 → M2 a semi-conjugationwhenever for allt ∈ T onehas

h Φt1 = Φt

2 h.

In a similar way also semi-equivalences can be defined between systems withcontinuous time. We now see that the property that the Doubling map on theplane is semi-conjugated to the Doubling map on the circle, is persistent. ByTheorem 3.13 we also know that the presence of chaotic evolutions is persis-tent for the Doubling map on the circle. From all this, a similar persistenceresult follows for the occurrence of chaos in the Doubling map on the plane.

The proof of the existence of stable foliations for this kindof attractors inendomorphisms, is a standard application of the methods of [113]. Yet, it isnot known to the authors that these and the corresponding persistent existenceof a semi-conjugation, was mentioned earlier in the literature.

- Finally we mention that recently it has been shown that for the Lorenz attrac-tor, another non-hyperbolic attractor, a well-defined stable foliation exists.We shall come back to this later in this chapter.


attractor!hyperbolicPersistence

Theorem ofHyperbolicAttractors

attractor!non-hyperbolic

We now come to the main result of this section, dealing with persistence and struc-tural stability of hyperbolic attractors underC1-small perturbations.

Theorem 4.13 (Persistence of hyperbolic attractors).Let Φ define a dynamicalsystem with timetmapsΦt, and letA be a hyperbolic attractor of this system. Thenthere exist neighbourhoodsU of Φ in theC1-topology andU ⊂ M ofA, such thatA = ∩T∋t>0Φ

t(U) andΦt(U) ⊂ U , for 0 < t ∈ T, and such that for anyΦ ∈ U :

1. A =⋂

T∋t>0 Φt(U) is a hyperbolic attractor ofΦ;

2. There exists a homeomorphismh : A→ A, that defines a conjugation (or anequivalence in the caseT = R), between the restrictionsΦ|A enΦ|A.

Remarks.

- Theorem 4.13 for the Solenoid implies, among other things,that the compli-cated Cantor like structure is preserved when we perturb themap that gener-ates the dynamics, at least when this perturbation is sufficiently small in theC1-topology.

- Notice that since the Thom map is hyperbolic (see Exercise 4.5), it is alsostructurally stable.

4.2.2 Non-hyperbolic attractors

In §3.2 we already briefly discussed the attractors met in Chapter 1, including thechaotic ones. Most of the chaotic attractors discussed there and met in concreteexamples, are non-hyperbolic. Below we continue this discussion, where we distin-guish between Henon-like attractors and the case of the Lorenz attractor. One aspectof interest in all cases is the persistence. First of all, no cases are known to us, whereone has plain persistence of chaoticity under variation of the initial state. Indeed, asfar as mathematically known, in all cases the set of (unstable) periodic evolutionsis dense in the attractor. In quite a number of cases it turns out that persistence ofchaoticity holds in the sense of probability, see Definition3.2.

Secondly, regarding persistence under variation of parameters, we already men-tioned in§3.2 that the Logistic family and the Henon family only show weak per-sistence of chaoticity, see Definiton 3.3. Compare with [118, 5, 134, 41, 42, 88].Indeed, in both of these examples, there exists subsets of the parameter space withnon-empty interior, where the corresponding dynamics is asymptotically periodic.

Remark. Plain persistence of chaoticity under variation of parameters, is a propertyof hyperbolic attractors, like the Solenoid. However, as weshall see below, alsothe Lorenz attractor, although non-hyperbolic, isC1-persistent. In both cases, thepersistence proof uses stable foliations.


H“’enon!-likeattractor

4.2.2.1 Henon-like attractors

A number of dynamical systems we met in Chapter 1 have non-hyperbolic attrac-tors. To start with, think of the Logistic system with parameterµ = 4. It has evenbeen generally shown that in the case of a1-dimensional state space no ‘interest-ing’ hyperbolic attractors can exist: in the case of hyperbolicity for the Logisticsystem we have exactly one periodic attractor, compare withExercise 4.14. Non-hyperbolic attractors in a1-dimensional state space often are not persistent in thesense that small variations of parameters may lead to a much smaller (periodic)attractor. In this sense the Doubling map is an exception. Onthe other hand, aswe mentioned already, for the Logistic family the set of parameter values with anon-hyperbolic attractor has positive measure [118].

Meanwhile the research on more general dynamical systems with a1-dimensionalstate space has grown to an important area within the theory of dynamical systems.A good introduction to this area is [5], for more informationsee [134].

In Chapter 1, in particular in Exercise 1.12, we saw how the H´enon family, for smallvalues of|b| can be approximated by the Logistic family. In general we canfatten1-dimensional endomorphism to 2-dimensional diffeomorphisms as follows. Givenan endomorphismf : R → R, the diffeomorphism can be defined as

F : R2 → R2, F (x, y) = (f(x) + y, bx),

whereb is a real parameter. Forb = 0 thex-axis is invariant, where the correspond-ing restriction ofF reads

F (x, 0) = (f(x), 0),

and we retrieve the original endomorphismf. Following the principles of perturba-tion theory, it is reasonable to try to extend the theory as available for the Logisticsystem,2 to these2-dimensional diffeomorphisms.

Remark. Also compare with the construction of the Solenoid map from the Dou-bling map in§4.2.

A first step in this direction was the theory of Benedicks and Carleson [42] forthe Henon system. As a result it became known that the Henonsystem, for smallvalues of|b| and for many values of the parametera, has complicated and non-hyperbolic attractors. As already mentioned in§3.2, it seems doubtful whetherthese considerations also include the ‘standard’ Henon attractor as this occurs fora = 1.4 andb = .3.

2As well as for more general1-dimensional systems.


Lorenz!attractorThis theory, although originally only directed at the Henon system, was extendedconsiderably to more general systems by Viana and others, for reviews of results inthis direction see [199, 10, 88].

The corresponding theory is rapidly developing, but we herebriefly sketch a numberof properties as these seem to emerge.

In the perturbative settings of [42, 16, 199, 10, 200, 88] it is has been proven thatthe chaotic attractorA is the closure of the an unstable manifold

A = W u(p), (4.4)

wherep is a periodic (or fixed) point of saddle type. Also in a number of cases aninvariant measure onA exists, such that the dynamics, restricted toA, has certainmixing properties, compare with Appendix A. For continuation of the periodicattractors as these occur in the 1-dimensional case, compare with [192]. For partialresults in higher dimension, see [139, 204]. Also see [187].

Remarks.

- The general impression is that the Henon-like attractorsin 2-dimensional dif-feomorphisms, according to (4.4), occur on a nowhere dense subset of theparameter space, of positive measure. For a numerical illustration of thisregarding the Henon attractor, see Figure 1.15. Also see Appendix C, in par-ticular Figure C.6 concerning the damped swing. Also see Exercise 1.25. In[64], that also contains more references in this direction,this principle for2-dimensional diffeomorphisms was coined as ‘Henon everywhere’, and fur-ther illustrated in laser dynamics. For similar examples ina meteorologicalcontext we refer to [66].

- The numerical simulations on the Rossler system, as discussed in§1.3.7, sug-gest the presence of a Poincare map that can be viewed as a2-dimensionaldiffeomorphism, which is approximated by a1-dimensional endomorphism.

4.2.2.2 The Lorenz attractor

As a final example of a non-hyperbolic attractor we treat the Lorenz attractor as in-troduced in§1.3.6. We will be able to discuss this example to greater detail, since ageometrical description of the dynamics in this system is based on concepts treatedbefore, like the stable foliation, see§4.2.1.4, a Poincare map and modified Doublingmaps. This geometric presentation originally was proposedby Williams and Guck-enheimer [210, 102], also see Sparrow [180], while Tucker [197] in 1999 provedthat this description is valid for the Lorenz attractor withthe common parametervalues. Although we have tried to make the below descriptionas convincing as


Lorenz!system possible, the mathematical details have not been worked outby far. Often we willuse things suggested by numerical simulations. As noted before, everything can befounded mathematically, but this is out of the present scope.

We recall the equations of motion for the system with the Lorenz attractor, i.e., thesystem (D.11), with the appropriate choice of parameters:

x′ = 10(y − x)

y′ = 28x− y − xz (4.5)

z′ = −83z + xy.

We start the analysis of (4.5) by studying the equilibrium point at the origin. Lin-earisation gives the matrix

−10 10 028 −1 00 0 −8

3

.

The eigenvalues of this matrix are

−83 (z-direction) and− 11

2 ± 12

√1201; (4.6)

the latter two being approximately equal to12 and−23.

Following [182], by a nonlinear change of coordinates in a neighbourhood of theorigin, we linearise the system (4.5). This means that coordinatesξ, η, ζ exist in aneighbourhood of the origin, such that (4.5), in that neighbourhood gets the form

ξ′ = λξ

η′ = µη

ζ ′ = σζ,

whereλ, µ, σ are the eigenvalues (4.6) respectively11,−83

and−22. Around theorigin, within the domain of definition of(ξ, η, ζ) we now choose a domainD asindicated in Figure 4.2. Moroever we define

D+ = D ∩ η ≥ 0, U± = ∂D+ ∩ ξ = ±ξ0 andV = ∂D ∩ η = η0.

The neighbourhoodD moreover can be chosen in such a way that the evolutioncurves leavingD throughU± return again toD by V.

To get a good impression of the way in which the evolution curves return, it is usefulto study the numerical simultaion of Figure D.1. In Figure 4.4 this same simulationis represented, where also the setD+ is shown.


U+

U−D

V

ζ

η

ξ

Figure 4.2: NeighbourhoodD of the origin: via the boundary planesη = ±η0 andζ = ±ζ0 the evolution curves enterD and byξ = ±ξ0 they exit again. The curvedparts of the boundary∂D are formed by segments of evolution curves.

We note that our system has a symmetry in the sense that the coordinatesξ, η, ζ canbe chosen in such a way that under the reflection

(x, y, z) 7→ (−x,−y, z),

evolution curves turn into evolution curves. In the(ξ, η, ζ)-coordinates this reflec-tion is given by

(ξ, η, ζ) 7→ (−ξ, η,−ζ).By this symmetry, the evolution curves that leaveD via U+ turn into curves thatleaveD viaU−. The regions where these evolution curves re-enter intoD by V, aresymmetric with respect to the planeζ = 0. We define the setsW± as the unionsof segments of evolution curves starting atU± that end atV, again compare withFigures 4.3 and 4.4.

It may be clear that the setD+ ∪W− ∪W+ is ‘positive invariant’ in the sense thatthe evolution curves enter it, but never leave it again. Another term for such a set is‘trapping region’. The Lorenz attractor lies inside this trapping region, where it isgiven by

⋂

t>0

Φt(D+ ∪W− ∪W+) ⊂ D+ ∪W− ∪W+.

HereΦt is the timet map of (4.5). This means that the origin0 belongs to theattractor, which is an equilibrium point of (4.5). Therefore the Lorenz attractorcan’t be hyperbolic.


00

z

x

Figure 4.3: Typical evolution of the Lorenz system (4.5), compare with Figure D.1.Here the setD is indicated only schematically. Please note that theζ-axis does notcoincide with they-axis, but to first approximation is a ‘linear combination’ of thex- and they-axis.

V

W+

D+

Figure 4.4: Sketch of the setsD,D+ enW+; the position ofW− follows by sym-metry considerations.


o+

o+

o−

o−

Figure 4.5: Graph of the Poincare return mapP. The domain of definiton is theinterval [o−, o+], the end points of which are equal tolimP(x), wherex tends tothe discontinuity ofP either from the left or from the right.

The fact that the evolutions that leaveD+ by U± return throughV is suggested bynumerical simulations. In principle this can also be confirmed by explicit estimates.The final element we need for our description is the existenceof a stable foliation,as for hyperbolic attractors. The proof for the existence ofthese foliations for theLorenz system (4.5) is extremely involved and in the end onlywas possible in acomputer assisted way: i.e., numerical computations with arigorous bookkeepingof the maximally possible errors. The result [197] is that for this attractor, thoughnon-hyperbolic, there yet exists a stable foliationF s as defined in§4.2.1.4. Theleaves of this foliation are1-dimensional. Without restriction of the generality wemay assume thatV consists of a union of leaves ofF s. Now the global dynamicsgenerated by (4.5) can be described in terms of a Poincare return map, defined onthe leaves ofF s in V. We next describe this Poincare map.

We can parametrise the leaves ofF s in V by the1-dimensional setV ∩ ζ = 0.When following these leaves by the timet evolution, we see that one of the leaves(in the middle) converges to the origin. The points on the left enterW− by U−, thepoints on the right similarly enterW+ by U+; compare with Figures 4.2 and 4.5.After this the leaves again pass throughV.Since, by invariance of the stable foliation(compare Theorem 4.12) each leaf enters within one leaf, which implies that thePoincare return mapP on the set of leaves is well-defined. From the Figures 4.2and 4.5 it now may be clear that the graph ofP looks like Figure 4.5. Here we alsouse that eigenvalues exist that cause expansion.

The graph ofP looks like that of the Doubling map, as defined on the interval[0, 1);


symbolic dynamicsbasin!boundaries

in the middle there is a point of discontinuity and on both theleft and on the rightinterval the function is monotonically increasing. An important difference with theDoubling map is, that the endpoints of the interval[o−, o+] of definition, are nofixed points ofP. As for the Doubling map, however, we also here can describethe evolutions ofP by symbolic dynamics. In fact, for each pointx ∈ [o−, o+] wedefine a sequence of symbols

s1(x), s2(x), s3(x), . . .

where

sj+1(x) =

0 if Pj is on the left of the discontinuity,1 if Pj is on the right of the discontinuity,

(For the case wherePj(x) for somej ∈ N exactly is the point of discontinuity, werefer to Exercise 4.14.) If, except in the discontinuity,P has everywhere a derivativelarger than1, the pointx ∈ [o−, o+] is uniquely determined by its symbol sequences(x). However, as a consequence of the fact thatP(o−) > o− andP(o+) < o+, inthis case it is not true that any symbol sequence of0’s and1’s has a correspondinginitial point x. For a deeper investigation of this kind of Poincare return maps werefer to Exercise 4.14, where we classify such maps up to conjugation. For moreinformation on the symbolic dynamics of this map, we refer to[156].

The topology of the Lorenz attractor, up to equivalence, is completely determinedby the Poincare return mapP. This means the following. If we consider a perturba-tion of the Lorenz system, for instance, by a slight change ofparameters, then theperturbed system, by persistence, again will have a stable foliation, and hence therewill be a perturbed Poincare’ map. ThenP and the perturbed Poincare map will beconjugated if and only if the dynamics on the attractor of (4.5) and its perturbationare equivalent.

4.3 Basin boundaries and the Horseshoe map

We conclude this chapter on the global structure of dynamical systems with a dis-cussion on the boundaries of the basins of attraction. We first treat a class of sys-tems, namely gradient systems, where these boundaries havea simple structure.After this more complicated situations will be considered.

4.3.1 Gradient systems

A gradient system is a dynamical system generated by a differential equation of theform

x′ = −gradV (x). (4.7)

4.3 Basin boundaries and the Horseshoe map 157

gradient systemThe state space is a manifoldM, whereV : M → R is a smooth function, andgradV the gradient vector field ofV with respect to a Riemannian metric onM.In the simplest caseM = Rm, while the Riemannian metric is geven the standardEuclidean one, in which case the differential equation (4.7) gets the form

x′1 = −∂1V (x1, x2, . . . , xm)

x′2 = −∂2V (x1, x2, . . . , xm)...

x′m = −∂mV (x1, x2, . . . , xm).

In the case wherem = 2, one can think ofV as a level function and of the evolutionof this system as the orbits of particles that go down as steeply as possible and wherethe speed increases when the function gets steeper. We now, in the case wherem = 2 investigate what the basins of attraction and their boundaries look like. Toexclude exceptional cases we still have to make an assumption on the functionsVto be considered here. In fact, we require that at each critical point ofV, i.e., wheredV (x) = 0, for the Hessian matrix of second derivatives one has

det

(

∂11V (x) ∂12V (x)∂21V (x) ∂22V (x)

)

6= 0.

Such critical points are callednon-degenerate. The corresponding equilibrium ofthe gradient vector field gradV (x) is hyperbolic. In the present case2-dimensionalcase there are three types of hyperbolic equilibria:

- A point attractor;

- A saddle point;

- A repelling point (when inverting time, this exactly is theabove case of apoint attractor).

Moreover, for simplicity we shall assume that the evolutions of our systems arepositively compact (i.e., bounded). This can for instance be achieved by requiringthat

lim|x|→∞

V (x) = ∞.

In this situation, compare with Exercise 4.13, each point has anω-limit that consistsexacly of one critical point ofV. For a point attractorp, the set of points

y ∈M | ω(y) = p


Horseshoe map is its basin of attraction, which is open. The otherω-limit sets are given by thesaddle-points and the point repellors. This implies that the boundaries of the basinshave to consist of the stable separatrices of the saddle-points (and the point repel-lors). In particular these basin boundaries are piecewise smooth curves and points.

Remarks.

- We return to the geographical level sets we spoke of before.The repellorsthen correspond to mountain tops and the saddle points to mountain passes.The point attractors are just minima in the bottoms of the seas and lakes. Abasin boundary then corresponds to what is called a water shed. This is theboundary between area’s where the rain water flows to different rivers (this isnot entirely correct, since two of such rivers may unite lateron). These watersheds generally are the mountain combs that connect the topswith the passes.

- Note that the repellors topologically are boundary points, but they are not onthe mutual boundary of two basins of attraction.??

- It is possible to create more complicated basin boundariesin dynamical sys-tems with a2-dimensional state space and continuous time. For this it ishelpful thatM is a torus, or another closed surface. If we increase the dimen-sion by1 or turn to discrete time. it is simple to create basin boundaries witha more complicated structure.

4.3.2 The Horseshoe map

We already saw that general hyperbolic attractors to some extent are a generalisationof hyperbolic point attrators. A similar generalisation ofsaddle points also exists.The latter generalisation yields far more complicated boundaries between basins ofattraction than in the above example of gradient systems. Historically speaking,the first example of such a ‘generalised saddle point’ was given by Smale [178], asthe Horseshoe map to be discussed now. In this example we can’t yet speak of theboundarybetweenbasins of attraction. In fact, as we shall see, there is only oneattractor, which does have a complicated basin boundary. However, the Horseshoemap provides the simplest example of such ‘generalised saddle points’ and plays animportant role in the so-called homoclinic dynamics.

A Horseshoe map generally is a difeomorphismf : R2 → R2 that maps a rectangleD over itself in the form of a horseshoe, as indicated in Figure4.6. We furtherassume that in the components ofD ∩ f−1(D), the mapf is affine, where the hori-zontal and vertical directions are preserved. On these componentsf is an expansionin the vertical direction and a contraction in the horizontal direction. In Figure 4.7we sketched the positioning ofD, f(D), f 2(D) andf−1(D), where it was used that


symbolic dynamics

1 2

34

1′ 2′ 3′ 4′

Df(D)

Figure 4.6: By the mapf the vertices1, 2, 3, 4 of the rectangleD are turned into thevertices1′, 2′, 3′, 4′ of f(D).

the pairs(f−1(D), D), (D, f(D)) and(f(D), f 2(D) are diffeomorphic in the sensethat each pair byf is carried to the next.

From the Figures 4.6 and 4.7 it may be clear thatD ∩ f(D) consists of two compo-nents, each of which contains two components ofD ∩ f(D)∩ f 2(D). By inductionit can be easily shown that

Dn0 =

n⋂

j=0

f j(D)

consists of2n vertical strips. Here each component ofDn−10 contains two strips of

Dn0 . We can give any of these strips an address in the binary system, compare with

the construction regarding the Solenoid in§4.2.1.2. The vertical stripK in Dn0 has

an addressα−(n−1), . . . , α−2, α−1, α0

of lengthn, whereα−j = 0 or 1, for all j, to be assigned as follows:

α−j =

0 if f−j(K) lies in the left half ofD10

1 if f−j(K) lies in the right half ofD10,

whereD10 = D ∩ f(D). It now may be clear that the set

D∞0 =

⋂

j≥0

f j(D)

consists of an infinite number of vertical lines. Any such lineL so gets an infiniteaddress

. . . , α−2, α−1, α0


1 2

34

1′ 2′ 3′ 4′1′′ 2′′3′′ 4′′

D f−1(D)

f(D)

f 2(D)

Figure 4.7: The positioning off−1(D),D, f(D) andf 2(D).

according to the same assignment principle as above. Observe that two of these linesL andL′ are nearby whenever for the corresponding addresses. . . , α−2, α−1, α0 and. . . , α′

−2, α′−1, α

′0 for a certainj0 one has

α′−j = α−j , for all j ≤ j0.

As before we may conclude thatD∞0 is the product of a Cantor set and an interval.

From the Figures 4.6 and 4.7 is may be clear that the inverse ofa Horseshoe mapfagain is a Horseshoe map, we only have to tilt the picture over90o. In this way wesee that

D0−n =

n⋂

j=0

f−j(D)

consists of2n horizontal strips, while

D0−∞ =

⋂

j≥0

f−j(D)

consists of infinitely many horizontal lines. The address ofsuch a lineK now is aninfinite sequence

β0, β1, β2, . . . ,


where

βj =

0 if f j(K) lies in the lower half ofD0−1

1 if f j(K) lies in the upper half ofD0−1.

Again we now conclude thatD0−∞ is the product of an interval and a Cantor set.

We now are able to investigate the set

D∞−∞ =

⋂

j∈Z

f j(D),

which by definition is invariant under forward and backward iteration off. It fol-lows that

D∞−∞ = D∞

0 ∩D0−∞

thus it is the product of two Cantor sets: each point is the intersection of a verticalline inD∞

0 and a horizontal line inD0−∞. Therefore we can give such a point both

the addresses of these lines. In fact, we shall proceed a little differently, using thefact that the lower component off−1(D)∩D by f is mapped to the left componentof D ∩ f(D) (in a similar way the upper component in the former is mapped to theright component in the latter). This assertion can be easilyconclude by inspectionof the Figures 4.6 and 4.7. Now a pointp ∈ D∞

−∞ gets the address

. . . , α−1, α0, α1, α2, . . . ,

according to

αj =

0 if f j(p) is in the left half ofD10

1 if f j(p) is in the right half ofD10,

for all j ∈ Z.By application of the mapf a point with address. . . , α−1, α0, α1, α2, . . .turns into a point with address. . . , α′

−1, α′0, α

′1, α

′2, . . . where

α′j = αj+1.

As in §1.3.8.4, also here we speak of ashift mapon the space of symbol sequences.

4.3.2.1 Symbolic dynamics

The above assignment of addresses defines a homeomorphism

α : D∞−∞ → ZZ

2 ,

where the restricted mapf |D∞−∞

corresponds to the shift map

σ : ZZ

2 → ZZ

2 .


For topological background see Appendix A. As in the case of the Doubling map,see§1.3.8.4, we speak ofsymbolic dynamics, which can help a lot in describing thedynamics generated byf.

First of all it follows that there are exactly two fixed pointsof f, correspondingto symbol sequences that only consist of0’s to be denoted by0, or of 1’s to bedenoted1. Both are saddle points. The addresses consisting of the sequences

. . . , α−1, α0, α1, . . . ,

such thatαj = 0, for all j ≥ j0, for somej0 ∈ Z, i.e., where all addresses suffi-ciently far in the future are0, belong to the stable separatrixW s0. The unstableseparatrixW u0 contains similar sequences where nowαj = 0, for all j ≤ j0, forsomej0 ∈ Z, i.e., where all addresses sufficiently far in the past are0. Analoguousremarks hold for the saddle point1.Again as in the case of the Doubling map, see§1.3.8.4, we also can use symbolicdynamics to get information on the periodic points. For instance, the number ofperiodic points with a given prime period is determined in Exercise 4.10. Also it iseasy to see that the set of periodic points is dense inD∞

−∞.

Another subject concerns thehomoclinic points. We say that a pointq is homoclinicwith respect to the saddle pointp if

q ∈W u(p) ∩W s(p), q 6= p.

It now follows that the set of pointsq that are homoclinic to the saddle point0also is dense inD∞

−∞. To see this, observe that any sequence. . . , α−1, α0, α1, . . .can be approximated (arbitrarily) by a sequence. . . , α′

−1, α′0, α

′1, . . . , where for a

(large) valuej0 ∈ Z that

α′j = αj forj < j0 andα′

j = 0 for |j| ≥ j0.

A similar remark goes for the set of points homoclinic to1.Apart from homoclinic points, we also distinguishheteroclinic pointswith respectto two saddle pointsp1 andp2. These are the points inside

W u(p1) ∩W s(p2) orW s(p1) ∩W u(p2).

We leave it to the reader to show that the points that are heteroclinic with respect to0 and1 forms a dense subset ofD∞

−∞.

We can continue for a while in this spirit. Any of the periodicpoints of periodkis a saddle fixed point offk, and hence has a stable and unstable separatrix andhomoclinic points. Heteroclinic points between any two of such periodic points(also of different periods) can also be obtained. Followingthe above ideas, also


structural stabilityHorseshoe map

compare with§1.3.8.4, it can be shown that, for any periodic pointp, the set ofhomoclinic points with respect top is dense inD∞

−∞. Similarly, for any two periodicpointsp1 andp2, the set of corresponding heteroclinic points is dense inD∞

−∞.

Finally, again as in§1.3.8.4, it follows that if we choose a sequence of0’s and1’sby independent tosses of a fair coin, with probability1 we get a symbol sequencethat corresponds with a point with a chaotic evolution, thatlies dense inD∞

−∞.

All the facts mentioned up to now rest on methods, very similar to those used forthe Doubling map in§1.3.8. There are, however, a number of differences as well.For the Horseshoe map we have bi-infinite sequences, corresponding to invertibilityof this map. Also, the assignment mapα is a homeomorphism, which it was not soin the case of the Doubling map. This has to do with the fact that the domain ofdefinition of the Doubling map is connected, while the symbolspaceΣ+

2 is totallydisconnected. Presently this difference does not occur: the setsD∞

−∞ andZZ

2 bothare totally disconnected, in fact they are both Cantor sets,see Appendix A.

4.3.2.2 Structural stability

The setD∞−∞ is a hyperbolic set for the Horseshoe mapf. Although here we are not

dealing with an attractor, hyoerbolicity can be defined exactly in the same way asfor an attractor, compare with Definition 4.7. We shall not gointo details here, butjust mention that this hyperbolicity also here is the reasonthat the dynamics of theHorseshoe map is structurally stable, in the following sense.

Theorem 4.14 (Structural stability of the Horseshoe map).There exists a neigh-bourhoodU of the Horseshoe mapf in theC1-topology, such that for each diffeo-morphismf ∈ U there exists a homeomorphism

h :⋂

j∈Z

f j(D) →⋂

j∈Z

f j(D)

such that for the corresponding restrictions one has

(

f |Tj fj(D)

)

h = h (

f |Tj fj(D)

)

.

The description of the homeomorphism in Theorem 4.14 is quite simple. Forf andσ we just saw that the following diagram commutes:

ZZ

2σ−→ ZZ

2

↑ α ↑ αD∞

−∞f−→ D∞

−∞,


p

q

D

ϕn(D)

Figure 4.8: Two-dimensional diffeomorphismϕ with a saddle pointp and atransversal homoclinic intersectionq. The setD, homeomorphic to the unit square[0, 1] × [0, 1], by a well-chosen iterateϕn is ‘mapped overD’ in a Horseshoe-likeway.

where, as before,D∞−∞ =

⋂

j fj(D). Now it turns out that a counterpart diagram

for f , when sufficientlyC1-close tof, commutes just as well. The fact that thesymbolic descriptions match forf andf then provides the homeomorphismh. Formore details on the proof of Theorem 4.14 and on similar results regarding hyper-bolic sets, see [177, 166].

We just saw that the Horseshoe map contains many homoclinic points. We also liketo point at a kind of converse of this. Indeed, letϕ be a diffeomorphism with ahyperbolic saddle pointp and letq 6= p be a transversal, i.e., non-tangent, point ofintersection

q ∈W u(p)⋂

W s(p),

implying thatq is homoclinic top. Then there exists an integern > 0 and aϕn-invariant setZ, i.e., withϕn(Z) = Z, such that the dynamics ofϕ|Z is conjugatedto that of the Horseshoe mapf restricted toD∞

−∞ =⋂

j fj(D). For this statement

it is not necessary that the manifold on whichϕ is defined, has dimension2. In Fig-ure 4.8 we sketched how a transversal intersection of stableand unstable separatrixof a saddle point, can give rise to a map that, up to conjugation, can be shown to bea Horseshoe map.

The fact that transversal homoclinic points lead to Horseshoe dynamics, was al-ready observed by Smale [178]. This formed an important basis for his programon the geometric study of dynamical systems, see [179]. For proofs and furtherdevelopments in this direction, we also refer to [16].


q1

q2

p1

p2

D1

D2

f(D1)

f(D2)

Figure 4.9: Action of the Horseshoe mapf onD,D1, enD2 with hyperbolic pointattractorq1 andq2 and the saddle fixed pointsp1 andp2.

4.3.3 Horseshoe-like sets in basin boundaries

We now consider a modified form of the Horseshoe map. Again a squareD =[0, 1] × [0, 1] by a diffeomorphismf is ‘mapped over itself’ where nowD ∩ f(D)consists of three vertical strips, compare with Figure 4.9.Moreover we assumethat the halve circular discsD1 andD2 are mapped within themselves. This can bearranged such that withinD = D ∪D1 ∪D2 there are exactly two hyperbolic pointattractors, situated inD1 andD2.

The set⋂

j fj(D) and the dynamics inside, as for the Horseshoe map, can be de-

scribed in terms of infinite sequences

. . . , β−1, β0, β1, . . . ,

where now the elementsβj take on the values0, 1 or 2. Almost all remarks madeon the Horseshoe map also apply here. The only exception is that here not alltransversal homoclinic intersections give rise to this modified Horseshoe dynamics.Here another phenomenon occurs, that has to do with the basins of attraction of thetwo point attractors inD. The boundary between these basins to a large extent isdetermined by the complicated horseshoe like dynamics. Indeed, this boundary, asfar as contained inD, exactly is formed by

⋂

ℓ≥0 f−ℓ(D).

To prove this, we introduce some notation. The point attractor contained inDj iscalledqj, j = 1, 2. The saddle fixed points in the left and right vertical strips of


Cantor set D ∩ f(D) we call p1, respectivelyp2. These saddle points correspond to symbolsequences that exclusively consists of0’s, respectively2’s. It now may be clear,that one branch of the unstable separatrixW u(p1) is entirely contained in the basinof q1, and similarly forp2 andq2. This implies that the stable separatrixW s(p1)must lie in the boundary of the basin ofq2, since the points inW s(p1), althoughthey do not tend toq1, but they are arbitrarily close to points that under iteration byf do tend toq1. An analoguous statement holds forp2 andq2. As for the standardHorseshoe map,

W s(pj) ∩D is dense in⋂

ℓ≥0

f−ℓ(D),

for j = 1, 2. This means that⋂

ℓ≥0 f−ℓ(D) must be part of the basin boundary ofq1

and ofq2.

Moreover if r ∈ D \ ∩ℓ≥0f−ℓ(D) then for somn we havefn(r) ∈ D, so either

fn(r) ∈ D1 or fn(r) ∈ D2. Hencer is in the interior of the basin of eitherq1 or q2.This finishes the proof.

In the above example we have basins ofq1 andq2 that have a joint boundary con-taining a set that is ‘diffeomorphic’ to the product of a Cantor set and an interval.That such a set really is ‘fatter’ than a (smooth) curve can beformalised as follows,using concepts (regarding fractal dimensions) that will betreated more thoroughlyin Chapter 6. On the one hand we consider a bounded pieceL of a smooth curve.On the other hand we consider the productC of an interval and a Cantor set as thisoccurs in the above example. Both sets we regard as subsets ofthe planeR2. Anessential difference between these two sets can be seen whenconsidering the area’sof their ε-neighbourhoods. LetLε be theε-neighbourhood ofL and, similarly,Cε

theε-neighbourhood ofC. Then, for small values ofε > 0, the area ofCε is muchlarger than that ofLε. In fact, it can be shown that

limε→0

ln( area ofLε)

ln(ε)= 1

limε→0

ln( area ofCε)

ln(ε)< 1,

compare with [153]. This means that, if we can determine the initial states of evo-lutions only modulo precisionε, the area where it is unknown whether the corre-sponding evolution converges tpq1 or q2, is extremely large for smallε is extremelylarge in comparison withε. This phenomenon, where it is very hard to determinean initial state that leads to a desired final state, is familiar from playing dice ortossing a coin. It therefore seems true that also for the dynamics of dice or coin, thebasins belonging to the various final states (like the numbers1, 2, . . . , 6 or the states‘heads’ and ‘tails’), have more or less similar basin boundaries.

4.4 Exercises 167

4.4 Exercises

Generally we are dealing with a dynamical system(M,T,Φ) such thatM is a topo-logical space, with compactΦ, while the evolutions are positively compact.

Exercise 4.1 (Properties ofω-limit set). Show that for anyx ∈ M the setω(x) isnon-empty, compact and invariant under the evolution operator. The latter meansthat for anyy ∈ ω(x) alsoΦ(y × T ) ⊂ ω(x).

Exercise 4.2 (Property of attractor). LetA ⊂M be an invariant set for a dynam-ical system with evolution operatorΦ. Assume thatA has arbitrarily small neigh-bourhoodsU ⊂ M such thatΦt(U) ⊂ U for all 0 < t ∈ T and such that for ally ∈ U alsoω(y) ⊂ A. Then show that

⋂

0≤t∈T Φt(U) = A.

Exercise 4.3 (On attractors. . .). Give an example of an attractor according toDefinition 4.3, that contains no dense evolution, and hence does not satisfy thealternative definition.

Exercise 4.4 (Conservative systems have no attractors).Consider a dynamicalsystem(M,T,Φ), where onM a measureµ is defined, such that any open subsetU ⊂ M has positiveµ-measure. We call such a system measure preserving (orconservative) if for anyt ∈ T the mapΦt is aµ-preserving transformation. Showthat for such systems no attractors are possible, unless these are equal to the wholestate spaceM.

Exercise 4.5 (Hyperbolicity of the Thom map).Let ϕ : T2 → T2 be the Thommap, see§2.3.5. Show thatT2 is a hyperbolic attractor forf.

Exercise 4.6 (The Solenoid revisited).In this exercise we use the coding of thediscsKj ∩Dϕ by binary numbers as introduced in§ 4.2.1.

1. Show that the discs inKj ∩ Dϕ with binary expressionαi−1, . . . , α0 by funder the Solenoid map (4.2) are put on the discKj+1 ∩ D2ϕ, with binaryexpressionαi−1, . . . , α0, 0.

2. Show that the disc inKj ∩Dϕ with binary expressionαi−1, . . . , α0, as a discin Kj ∩Dϕ+2π, has the binary expressionα′

i−1, . . . , α′0 whenever

j−1∑

ℓ=0

α′ℓ2

ℓ =

(

j−1∑

ℓ=0

αℓ2ℓ

)

− 1 mod2j.

check


Exercise 4.7 (The 2-adic metric).We define the2-adic distanceρ2(i, j) betweenelementsi, j ∈ Z+ as

ρ2(i, j) = 2−n,

whenn is the number of factors2 in the absolute difference|i−j|. The correspond-ing norm onZ+ is denoted by|z|2 = ρ2(z, 0).

1. Show that the triangle inequality

|x+ y|2 ≤ |x|2 + |y|2

holds true. (In fact, it is even true that|x+ y|2 = max|x|2, |y|2.)

2. Show that|x|2|y|2 = |xy|2.

Let N2 be the metric completion ofZ+ with respect to the metricρ2. Then showthat:

1. The elements ofN2 can be represented in a unique way by infinite binaryexpressions

(. . . , α2, α1, α0),

corresponding to the limit

limn→∞

n∑

j=0

αj2j;

2. N2 is an additive group (where−1 = (. . . , 1, 1, 1));

3. The Solenoid is homeomorphic to the quotient

K = N2 × R/〈(−1, 2π)〉,

where〈(−1, 2π)〉 is the subgroup generated by(−1, 2π). The dynamics ofthe Solenoid as an attractor, by this homeomorphism is conjugated to multi-plication by2 onK.

(PM: This algebraic version of the Solenoid is the format in which it was first intro-duced by D. van Dantzig in his study on topological groups [84].)

Exercise 4.8 (Attractors of gradient systems).Consider the gradient systemx′ =−gradV (x) with x ∈ R2, where the critical points of the smooth functionV arenon-degenerate.3

3Note thatV has to satisfy a suitable condition in order to have only positively compact solutions.

4.4 Exercises 169

1. Show that this system can’t have any (non-constant) periodic evolutions;

2. Show that for anyx ∈ R2, for which the positive evolution starting atx isbounded, the setω(x) consists of one point, which is a critical point ofV ;

3. In how far are the above items still true when we drop the requirement thatthe critical points ofV are non-degenerate?

Exercise 4.9 (Periodic points of Horseshoe map).Compute the number of pointsin D that have prime period 12 for the Horseshoe map.

Exercise 4.10 (The unstable distribution of the Solenoid).We revisit the unstabledistribution for the Solenoid, using the notation of§4.2. In3-space we use cylindercoordinates(r, ϕ, z). For points in3-space, for whichr 6= 0, a basis of tangentvectors is given by∂r,∂ϕ and∂z. We consider the1-dimensional distributionE onthe SolenoidA such that, at each pointp ∈ A, the projection ofE(p) on theϕ-axisis surjective. Such a distribution can be uniquely represented as a vector field of theform

R∂r + ∂ϕ + Z∂z

whereR andZ are continuous functions onA. For two such distributionsE andE ′

we define the distance by

ρ(E , E ′) = maxp∈A

√

(R(p) − R′(p))2 + (Z(p) − Z ′(p))2.

Now show that:

1. By the transformationTf the 1-dimensional distributions of the above type,are taken over into1-dimensional distributions of this same type;

2. ρ(TfE , TfE ′) = 12λρ(E , E ′);

3. The limitEu = limj→∞(Tf )jE exists, is invariant under the derivative df and

has the property that vectorsv ∈ Eu(p) increase exponentially under repeatedapplication of df.

Exercise 4.11 (Symbolic dynamics more general).Consider the spaceC of mapsf : [0, 1] \ 1

2 → [0, 1], such that

- limx↑ 12f(x) = 1 andlimx↓ 1

2f(x) = 0 ;

- f is of classC1 on the intervals[0, 12) and(1

2, 1], and for all suchx one has

f ′(x) > 1;

- f(0) > 0 andf(1) < 1.


To eachx ∈ [0, 1] we assign a sequence of symbols0 and1 (in a not completelyunique way) where the sequence

s1(x), s2(x), . . .

corresponds to

sj(x) =

0 wheneverf j(x) ∈ [0, 12)

1 wheneverf j(x) ∈ (12, 1]

This definition fails when for a certainj0 one hasf j0(x) = 12, in which case we

continue the sequence as if the next iteration byf would yield the value0 or 1(which introduces a choice). Next to each symbol sequences1(x), s2(x), . . . weassign the number

S(x) =∑

j

sj(x)2j ,

where in cases where the symbol sequence is non-unique we defineS−(x), respec-tively S+(x), as minimum or maximum that can occur for the different choices. ThenumbersS(x), S−(x) andS+(x), thus defined, depend onx, but also on the mapf.If we wish to express the latter dependence as well, we writeSf (x), etc.

Show that forf1, f2 ∈ C the following holds true: there exists a homeomorphismh : [0, 1] → [0, 1] with h f1 = f2 h, if and only if Sf1

+ (0) = Sf2+ (0) and

Sf1− (1) = Sf2

− (1).

Exercise 4.12 ((∗) Horseshoe and basin boundaries).Consider the Horseshoemap, with its saddle fixed pointsp1 andp2, see§4.3.2.1. Show that there exists avector field, defined on a suitable manifold, with an attractor A such that such thatthe union of stable manifoldsW s(p1) ∪W s(p2) corresponds to the basin boundaryof A.check!

Exercise 4.13 ((∗) Like the Lorenz return maps). (Compare with Sparrow [180].)We consider mapsfα : [−1,+1] \ 0 → [−1,+1], whereα ∈ [1, 2], given by

fα(x) = αx− sign(x),

wheresign(x) =

x

|x| .

Whenever desired we definefα(0) = ±1.

1. Show that, for√

2 < α < 2, the mapfα has a dense evolution;

2. Show that for1 < α <√

2 the intervalIα = [1 − α, α − 1] and the mapfα

have the following properties:

4.4 Exercises 171

(a) fα(Iα) consists of two intervals that both are disjoint withIα;

(b) f 2α(Iα) ⊂ Iα;

(c) The restrictionf 2α|Iα

, up to a linear rescaling, is equal tofα2 ;

(d) Iα is an attractor offα2 ;

3. Investigate by induction what ‘bifurcations’ occur whenα = 212 , 2

14 , 2

18 , . . . ,

etc.

(NB: By a bifurcation occurring atα = α0 we understand that the dynamicschanges discontinuously whenα passes throughα0.)

Exercise 4.14.(∗) Show that any hyperbolic attractor of a dynamical system witha1-dimensional state space, either exists of one periodic of stationary evolution, oris the entire state space.

Kolmogorov-Arnold-MoserTheory

Van der Pol system

Chapter 5

On KAM Theory

As a motivation for this chapter we refer to the historical remarks of§2.2.5 and thereferences given there. The main question is in how far a quasi-periodic subsystem,as defined in§2.2.3, is persistent under small perturbation of the whole system, orsay, under variation of parameters, see§3.2.

This question belongs toKAM Theory, see§2.2.5, that classically was mainly de-veloped for conservative systems. Presently we restrict the attention to a case con-cerning certain circle maps, thereby motivating the so-called 1 bite small-divisorproblem, that will be discussed to some extent. In Appendix B we deal with the‘classical’KAM Theorem regarding Lagrangian tori in Hamiltonian systems,as wellas its ‘dissipative’ counterpart regarding a family of quasi-periodic attractors. Alsowe give elements of ‘the’KAM proof, using the 1-bite theory introduced in§5.5below.

5.1 Introduction, setting of the problem

Starting point of the present considerations is the periodically driven Van der Poloscillator (2.6), see§2.2.3, in a slightly more generalC∞-format

y′′ + cy′ + ay + f(y, y′) = εg(y, y′, t; ε), (5.1)

where the functiong has periodΩ in the timet. It is important thatf is such that thecorresponding free oscillator has a hyperbolic attractor,for instance takef(y, y′) =by2y′, for an appropriate value ofb, for the phase portrait in the(y, y′)-plane seeFigure 5.1. The state space of (5.1) is 3-dimensional with coordinates(y, y′, t),let us denote the corresponding vector field byXε. Compare with Chapters 1 and2 for this construction. As explained in§2.3, the unperturbed systemX0 has anattracting multi-periodic subsystem given by the restriction of X0 to a 2-torusT0,

174 On KAM Theory

persistence!ofquasi-periodicevolutions

y

y′

Figure 5.1: Phase portrait of the free Van der Pol oscillator.

where the restrictionX0|T0 is smoothly conjugated to the translation system (2.10)here presented as

x′1 = ω1

x′2 = ω2 (5.2)

on the standard 2-torusT2.

A general question is what happens to this 2-torus attractorfor ε 6= 0. By hyperbol-icity of the periodic orbit, the unperturbed torusT0 is normally hyperbolic. Thus,according to the Normally Hyperbolic Invariant Manifold Theorem ([113], Theo-rem 4.1), the 2-torusT0 is persistent as an invariant manifold. This means that for|ε| ≪ 1, the vector fieldXε has a smooth invariant 2-torusTε, where also the de-pendence onε is smooth. Here ‘smooth’ means finitely differentiable, where thedegree of differentiability tends to∞ asε→ 0.

The remaining question then concerns the dynamics ofXε|Tε. In howfar is also the

multi-periodicity ofX0|T0 persistent forε 6= 0?

Example 5.1 (Coupled oscillators I).As a related example consider two nonlinearoscillators with a weak coupling

y′′1 + c1y′1 + a1y1 + f1(y1, y

′1) = εg1(y1, y2, y

′1, y

′2)

5.2KAM Theory of circle maps 175

x1x1x1x1x1x1x1x1x1x1

x2

x2

x2

x2

x2

x2

x2

x2

x2

x2

x2

x2

x2

x2

x2

x2

x2

x2

x2

x2

P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)


P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)P (x2)

Figure 5.2: Poincare map associated to the sectionx1 = 0 of T2.

y′′2 + c2y′2 + a2y2 + f2(y2, y2) = εg2(y1, y2, y1, y2),

y1, y2 ∈ R. This yields the following vector fieldXε on the 4-dimensional phasespaceR2 × R2 = (y1, z1), (y2, z2):

y′j = zj

z′j = −ajyj − cjzj − fj(yj, zj) + εgj(y1, y2, z1, z2),

j = 1, 2. Note that forε = 0 the system decouples to a system of two independentoscillators and has an attractor in the form of a two dimensional torusT0. This torusarises as the product of two (topological) circles, along which each of the oscillatorshas its periodic solution. The circles lie in the two-dimensional planes given byz2 = y2 = 0 andz1 = y1 = 0 respectively. The vector fieldX0 has a multi-periodicsubsystem as before, in the sense that the restrictionX0|T0 is smoothly conjugatedto the translation system (5.2) onT2. The considerations and problems regarding theperturbationXε, an invariant 2-torusTε and the multi-periodic dynamics ofXε|Tε

,runs as before.

5.2 KAM Theory of circle maps

Continuing the considerations of§5.1, we now discuss the persistence of the dy-namics in theXε-invariant 2-toriTε as this comes up in the above examples. Aftera first reduction to the standard 2-torusT2, we reduce to maps of the circleT1 bytaking a Poincare section, compare with§1.2.2.

176 On KAM Theory

circle mapsconjugation!smooth

5.2.1 Preliminaries

As said before, the toriTε are highly differentiable manifolds due to normal hyper-bolicity [113]. Indeed, for|ε| ≪ 1, all Tε are diffeomorphic to the standard 2-torusT2, where the degree of differentiability increases to∞ asε tends to0. To study thedynamics ofXε inside the torusTε, we use this diffeomorphism to get a reducedperturbation problem onT2. For simplicity we also here use the nameXε for thesystem.

The KAM Theorem we are preparing allows for formulations in the world of Ck-systems fork sufficiently large, includingk = ∞, compare [160, 62]. These ver-sions are variations on the simpler setting where the systems are of classC∞. Forsimplicity we therefore restrict to the case where the entire perturbation problem isC∞.

TheKAM set-up at hand in principle looks for smooth conjugations betweenX0 andXε, the existence of which would imply that alsoXε has a multi-periodic subsystem.Considering a simple example like

x′1 = ω1 + ε1

x′2 = ω2 + ε2,

we observe that both quasi-periodicity and resonance have dense occurrence, com-pare Section 2.2.2. These two cases cannot be conjugated, since in the latter caseall evolutions are compact, which they are not in the former.To be more precise,when classifying multi-periodic 2-tori under (smooth) conjugation the frequenciesω1 andω2 are invariants.

A standard way to deal with such invariants systematically,is to introduce parame-ters [1, 8]. Then, if trying to conjugate two different families of dynamical systems,parameter shifts may be used to avoid the difficulties just described. In the example(2.6) of §2.2.3 we may considerµ = (a, c) as a multi-parameter and in coupledoscillator case ending§5.1 evenµ = (a1, a2, c1, c2). In principle, we could eventakeµ = (ω1, ω2) in the translation system format (5.2) as a multi-parameter. Inthe latter case one usually requires that(ω1, ω2) depends in a submersive way onparameters as in the former two examples.

To simplify things a bit, we relax theKAM set-up in the sense that we also allowfor a smooth equivalence (see§3.4.2) betweenX0 andXε, in which case only thefrequency ratioω1 : ω2 is an invariant; see the Exercises 5.1 and 5.2. For simplicitywe considerβ = ω1/ω2 itself as a parameter, so we study the family of vector fieldsXβ,ε onT2 = x1, x2, both variables being counted modulo2πZ.

To exploit this further we now turn to the corresponding family of Poincare returnmapsPβ,ε of the circlex1 = 0, considered as a circle diffeomorphismT1 → T1,


equivalence!smoothagain compare with§2.2.3. The mapPβ,ε has the form

Pβ,ε : x2 7→ x2 + 2πβ + εa(x2, β, ε). (5.3)

Note that forε = 0 the unperturbed mapPβ,0 is just the discrete translation systemgiven by the rigid rotationR2πβ : x2 7→ x2 + 2πβ of the circleT1. Compare with§2.2.2.

For a proper formulation of our problem it is convenient to regard the family ofcircle maps as a ‘vertical’1 map of the cylinder by consideringPε : T1 × [0, 1] →T1 × [0, 1] defined as

Pε : (x2, β) 7→ (x2 + 2πβ + εa(x2, β, ε), β). (5.4)

Adopting the same ‘verticality’ convention of writingXε at the level of vector fields,we observe that a conjugation betweenP0 andPε gives rise to an equivalence be-tweenX0 andXε.

To solve the conjugation problem for Poincare maps, we now look for a ‘skew’2

diffeomorphismΦε : T1 × [0, 1] → T1 × [0, 1] conjugating the unperturbed familyP0 to the perturbed familyPε, i.e., such that

Pε Φε = Φε P0. (5.5)

The conjugation equation (5.5) also is expressed by commutativity of the followingdiagram.

T1 × [0, 1]Pε−→ T1 × [0, 1]

↑ Φε ↑ Φε

T1 × [0, 1]P0−→ T1 × [0, 1]

Remarks.

- At the level of circle maps (5.3), the conjugationΦε keeps track of multi-periodic subsystems.

Conjugations between return maps directly translate to equivalences betweenthe corresponding vector fields, see [1, 8]. In the case of thefirst example(5.1) these equivalences can even be made conjugations. SeeExercise 5.3.

- For orientation preserving circle homeomorphisms, the rotation number isan invariant under (topological) conjugation, e.g., compare with [1, 5]. Therotation number of such homeomorphism is the average amountof rotationeffected by the homeomorphism, which for the unperturbed mapPβ,0 = R2πβ

exactly coincides with the frequency ratioβ.

1RegardingT1 × [0, 1] as aT1-bundle over[0, 1], the mapPε is fibre preserving.2The skew format ofΦε is needed to respect the fibre preservingness of thePε.

178 On KAM Theory

conjugation!Whitneysmooth

small divisorshomological

equation

- The Denjoy Theorem [1, 5] asserts that for irrationalβ, whenever the rotationnumber ofPβ′,ε equalsβ, a topological conjugation exists betweenPβ,0 andPβ′,ε.

5.2.2 Formal considerations and small divisors

We now study equation (5.5) for the conjugationΦε to some detail. To simplify thenotation first setx = x2. Assuming thatΦε : T1 × [0, 1] → T1 × [0, 1] has thegeneral skew form

Φε(x, β) = (x+ εU(x, β, ε), β + εσ(β, ε)), (5.6)

we get the following nonlinear equation for the functionU and the parameter shiftσ :

U(x+ 2πβ, β, ε)− U(x, β, ε) = (5.7)

2πσ(β, ε) + a (x+ εU(x, β, ε), β + εσ(β, ε), ε) .

As is common in classical perturbation theory, we expanda, U andσ as formalpower series inε and solve (5.7) by comparing coefficients. We only consider thecoefficients of power zero inε, not only because asymptotically these coefficientshave the strongest effect, but also since the coefficients ofhigherε-powers satisfysimilar equations. So, writing

a(x, β, ε) = a0(x, β) +O(ε), U(x, β, ε) = U0(x, β) +O(ε),

σ(β, ε) = σ0(β) +O(ε),

substitution in equation (5.7) leads to the following linear, so-called homological,equation

U0(x+ 2πβ, β) − U0(x, β) = 2πσ0(β) + a0(x, β), (5.8)

which has to be solved forU0, andσ0. Equation (5.8) is linear and therefore can bedirectly solved by Fourier series, also compare§1.3.5.3. In [61] such a problem iscoined as a 1bite small divisor problem. Indeed, introducing

a0(x, β) =∑

k∈Z

a0k(β)eikx andU0(x, β) =∑

k∈Z

U0k(β)eikx

and comparing coefficients in (5.8), directly yields that

σ0 = − 1

2πa00, U0k(β) =

a0k(β)

e2πikβ − 1, k ∈ Z\0, (5.9)

whileU00, which corresponds to the position of the origin0 onT, remains arbitrary.In the next section we shall deal with 1 bite small divisor problems in more detail.


Diophantineconditions

Cantor set

We conclude that in general a formal solution exists if and only if β is irrational,which means that the translation systemPβ,0 = R2πβ has quasi-periodic dynamics.Compare with§2.2.

Even then one meets the problem of small divisors, caused by the accumulation ofthe denominators in (5.9) on0, which makes the convergence of the Fourier seriesof U0 problematic. This problem can be solved by a further restriction of β byso-called Diophantine conditions.

Definition 5.1 (Diophantine numbers). Let τ > 2 and γ > 0 be given. We saythatβ ∈ [0, 1] is (τ, γ)-Diophantine if for allp, q ∈ Z with q > 0 we have that

∣

∣

∣

∣

β − p

q

∣

∣

∣

∣

≥ γ

qτ. (5.10)

The set ofβ satisfying (5.10) is denoted by[0, 1]τ,γ ⊆ [0, 1]. First note that[0, 1]τ,γ ⊂R \Q, which implies that the translation system given by the rotationPβ,0 is quasi-periodic.

Let us further examine the set[0, 1]τ,γ, for background in topology and measuretheory referring to Appendix A. It is easily seen that[0, 1]τ,γ is a closed set. Fromthis, by the Cantor-Bendixson Theorem [107] it follows that[0, 1]τ,γ is the union ofa perfect set and a countable set. The perfect set, for sufficiently smallγ > 0, has tobe a Cantor set, since it is compact and totally disconnected(or zero-dimensional).The latter means that every point of[0, 1]τ,γ has arbitrarily small neighbourhoodswith empty boundary, which directly follows from the fact that the dense set ofrationals is in its complement. Note that, since[0, 1]τ,γ ⊆ [0, 1], so as a closedsubset of the real line, the property of being totally disconnected is equivalent tobeing nowhere dense. Anyhow, the set[0, 1]τ,γ is small in the topological sense. Inthis case, however, the Lebesgue measure of[0, 1]τ,γ is not small, since

measure([0, 1] \ [0, 1]τ,γ) ≤ 2γ∑

q≥2

q−(τ−1) = O(γ), (5.11)

asγ ↓ 0, by our assumption thatτ > 2, e.g., compare [1, 62, 61], also for furtherreference. Note that the estimate (5.11) implies that the union

⋃

γ>0

[0, 1]τ,γ,

for any fixedτ > 2, is of full measure. For a thorough discussion of the discrepancybetween the measure theoretical and the topological notionof the size of numbersets, see Oxtoby [150]. Also see Exercise 5.14.

Returning to (5.9), we now give a brief discussion on the convergence of Fourierseries, for details referring to the next section, in particular to the Paley-Wiener

180 On KAM Theory

Circle MapTheorem

Φε

P0

Pε

x

x

β

β

[0, 1]τ,γ

Figure 5.3: Conjugation between the Poincare mapsP0 andPε onT1 × [0, 1]τ,γ.

estimates on the decay of Fourier coefficients. For theC∞-functiona0 the Fouriercoefficientsa0k decay rapidly as|k| → ∞. Also see Exercise 5.13. Second, a briefcalculation reveals that forβ ∈ [0, 1]τ,γ it follows that for allk ∈ Z\0 we have

|e2πikβ − 1| ≥ 4γ|k|−τ .

It follows that the coefficientsU0,k still have sufficient decay as|k| → ∞, to ensurethat the sumU0 again is aC∞-function. Of course this does not yet prove that thefunctionU = U(x, β, ε) exists in one way or another for|ε| ≪ 1. Yet we do havethe following.

Theorem 5.2 (Circle Map Theorem). For γ sufficiently small and for the pertur-bationε a sufficiently small in theC∞-topology, there exists aC∞ transformationΦε : T1 × [0, 1] → T1 × [0, 1], conjugating the restrictionP0|[0,1]τ,γ

to a (Diophan-tine) quasi-periodic subsystem ofPε.

Theorem 5.2 in the present structural stability formulation (compare with Figure


persistence!weakArnold family of

circle maps

3

2.5

2

1.5

1

0.5

1/100 1/9 1/8 1/7 1/6 1/5 2/9 1/4 2/7 3/10 1/3 3/8 2/5 3/7 4/9 1 2/

0

ε

β

Figure 5.4: Resonance tongues [65]; note that for|ε| ≥ 1 the Arnold family (5.12)is endomorphic.

5.3), is a special case of the results in [62, 61]. We here speak of quasi-periodicstability. For earlier versions see [31, 34].

Remarks.

- Rotation numbers are preserved by the mapΦε and irrational rotation num-bers correspond to quasi-periodicity. Theorem 5.2 thus ensures thattypicallyquasi-periodicity occurs withpositivemeasure in the parameter space. Interms of§§3.1 and 3.2 this means that quasi-periodicity is persistentundervariation of parameters and weakly persistent under variation of initial states(see Definition 3.3. Note that since Cantor sets are perfect,quasi-periodicitytypically has a non-isolated occurrence.

- The mapΦε has no dynamical meaning inside the gaps. The gap dynamics inthe case of circle maps can be illustrated by the Arnold family of circle maps[31, 1, 5], given by

Pβ,ε(x) = x+ 2πβ + ε sinx (5.12)

which exhibits a countable union of open resonance tongues where the dy-namics is periodic, see Figure 5.4. Note that this map only isa diffeomor-phism for|ε| < 1.

182 On KAM Theory

area preservingmaps

Moser’s Twist MapTheorem

In the contexts of the Van der Pol system, either driven of coupled (see§5.1),quasi-periodic circles of (5.12) correspond to quasi-periodic invariant 2-tori,while in the resonant cases one speaks ofphase locking.

- We like to mention that non-perturbative versions of Theorem 5.2 have beenproven by Herman3 and Yoccoz4 [110, 111, 211].

- For simplicity we formulated Theorem 5.2 underC∞-regularity, noting thatthere exist many ways to generalize this. On the one hand there existCk-versions for finitek and on the other hand there exist fine tunings in termsof real-analytic and Gevrey regularity. For details wer refer to [61, 23] andreferences therein. This same remarks applies to other results in this sectionand in Appendix B onKAM Theory.

5.3 KAM Theory of area preserving maps

We continue the previous section on theKAM Theory of circle and cylinder maps,now turning to the setting of area preserving maps of the annulus. In particular wediscuss Moser’s Twist Map Theorem [140], also see [62, 61, 23], which is relatedto the conservative dynamics of frictionless mechanics.

Let ∆ ⊆ R2 \ (0, 0) be an annulus, with (polar) coordinates(ϕ, I) ∈ T1 × K,whereK is an interval. Moreover, letσ = dϕ ∧ dI be the area form on∆.

We consider aσ-preserving smooth mapPε : ∆ → ∆ of the form

Pε(ϕ, I) = (ϕ+ 2πβ(I), I) +O(ε),

where we assume that the mapI 7→ β(I) is a (local) diffeomorphism. This assump-tion is known as thetwist conditionandPε is called atwist map. For the unperturbedcaseε = 0 we are dealing with a pure twist map and its dynamics is comparable tothe unperturbed family of cylinder maps as met in§5.2. Indeed it is again a familyof rigid rotations, parametrized byI and whereP0(., I) has rotation numberβ(I).In this case the question is what will be the fate of this family of invariant circles,as well as with the corresponding rigid dynamics.

Regarding the rotation number we again introduce Diophantine conditions. Indeed,for τ > 2 andγ > 0 the subset[0, 1]τ,γ is defined as in (5.10), i.e., it contains allβ ∈ [0, 1], such that for all rationalsp/q

∣

∣

∣

∣

β − p

q

∣

∣

∣

∣

≥ γq−τ .

3Michael Robert Herman 1942-20004Jean-Christophe Yoccoz 1957-

5.3KAM Theory of area preserving maps 183


persistence!ofquasi-periodicevolutions

persistence!weakpersistence!under

variation ofparameters

typicalpendulum

Pulling back[0, 1]τ,γ along the mapβ we obtain a subset∆τ,γ ⊆ ∆.

Theorem 5.3 (Twist Map Theorem [140]).For γ sufficiently small and for the per-turbationO(ε) sufficiently small inC∞-topology, there exists aC∞ transformationΦε : ∆ → ∆, conjugating the restrictionP0|∆τ,γ

to a subsystem ofPε.

As in the case of Theorem 5.2 again we chose the formulation of[62, 61]. Largelythe remarks following Theorem 5.2 also apply here.

Remarks.

- Compare the format of the Theorems 5.2 and 5.3 and observe that in the lat-ter case the role of the parameterα has been taken by the ‘action’ variableI. Theorem 5.3 implies thattypically quasi-periodicity occurs with positivemeasure in phase space. Or, to rephrase this assertion in terms of§§3.1, 3.2,the property that quasi-periodicity is weakly persistent under variation of ini-tial state, is persistent under variation of parameters (i.e., typical).

- In the gaps typically we have coexistence of periodicity, quasi-periodicity andchaos [34, 33, 143, 144, 57, 166, 11]. The latter follows fromtransversalityof homo- and heteroclinic connections that give rise to positive topologicalentropy. Open problems are whether the corresponding Lyapunov exponentsalso are positive, compare with [34].

Example 5.2 (Coupled oscillators II).Similar to the applications of Theorem 5.2to driven and coupled nonlinear oscillators in§§5.1, 5.2, here direct applications arepossible in the conservative setting. Indeed, consider a system of weakly coupledpendula

y1 + α21 sin y1 = ε

∂U

∂y1(y1, y2)

y2 + α22 sin y2 = ε

∂U

∂y2(y1, y2).

Writing yj = zj , j = 1, 2 as before, we again get a vector field in the 4-dimensionalphase spaceR2 × R2 = (y1, y2), (z1, z2). In this case the energy

Hε(y1, y2, z1, z2) = 12z

21 + 1

2z22 − α2

1 cos y1 − α22 cos y2 + εU(y1, y2)

is a constant of motion. Restricting to a 3-dimensional energy surfaceH−1ε =

const., the iso-energetic Poincare mapPε is an area preserving Twist Map and ap-plication of Theorem 5.3 yields the conclusion of quasi-periodicity (on invariant2-tori) occurring with positive measure in the energy surfaces ofHε.

184 On KAM Theory

holomorphic mapsconjugation!biholomorphiccomplex

linearizationDiophantine

conditions

5.4 KAM Theory of holomorphic maps

The sections§§5.2 and 5.3 both deal with smooth circle maps that are conjugatedto rigid rotations. Presently the concern is with planar holomorphic (or complexanalytic) maps that are conjugated to a rigid rotation on an open subset of the plane.Historically this is the first time that a small divisor problem was solved [1, 137,213, 214].

5.4.1 Complex linearization

Given is a holomorphic germF : (C, 0) → (C, 0) of the formF (z) = λz + f(z),with f(0) = f ′(0) = 0. The problem is to find a biholomorphic germΦ : (C, 0) →(C, 0) such that

Φ F = λ · Φ.Such a diffeomorphismΦ is called alinearizationof F near0.

We begin with the formal approach. Given the seriesf(z) =∑

j≥2 fjzj , we look

for Φ(z) = z+∑

j≥2 φjzj . It turns out that a solution always exists wheneverλ 6= 0

no root of unity. Indeed, direct computation reveals the following set of equationsthat can be solved recursively:

For j = 2 : get the equationλ(1 − λ)φ2 = f2

For j = 3 : get the equationλ(1 − λ2)φ3 = f3 + 2λf2φ2

For j = n : get the equationλ(1− λn−1)φn = fn + known.

The question now reduces to whether this formal solution haspositive radius ofconvergence.

The hyperbolic case0 < |λ| 6= 1 was already solved by Poincare, for a descriptionsee [1]. The elliptic case|λ| = 1 again has small divisors and was solved by Siegel5

when for someγ > 0 andτ > 2 we have the Diophantine nonresonance condition

|λ− e2πi pq | ≥ γ|q|−τ .

The corresponding set ofλ constitutes a set of full measure inT1 = λ.Yoccoz [213] completely solved the elliptic case using the Bruno-condition. If

λ = e2πiα and whenpn

qn

5Carl Ludwig Siegel 1896-1981

5.4KAM Theory of holomorphic maps 185

is thenth convergent in the continued fraction expansion ofα then the Bruno-condition reads

∑

n

log(qn+1)

qn<∞.

This condition turns out to be necessary and sufficient forΦ having positive radiusof convergence, see Yoccoz [213, 214].

5.4.2 Cremer’s example in Herman’s version

As an example consider the map

F (z) = λz + z2,

whereλ ∈ T1 is not a root of unity.

Observe that a pointz ∈ C is a periodic point ofF with periodq if and only ifF q(z) = z, where obviously

F q(z) = λqz + · · · + z2q

.

WritingF q(z) − z = z(λq − 1 + · · · + z2q−1),

the periodq periodic points exactly are the roots of the right hand side polynomial.AbbreviatingN = 2q −1, it directly follows that, ifz1, z2, . . . , zN are the nontrivialroots, then for their product we have

z1 · z2 · · · · · zN = λq − 1.

It follows that there exists a nontrivial root within radius

|λq − 1|1/N

of z = 0.

Now consider the set ofΛ ⊂ T1 defined as follows:λ ∈ Λ whenever

liminfq→∞|λq − 1|1/N = 0.

It can be directly shown thatΛ is residual, compare with Appendix A. It alsofollows that forλ ∈ Λ linearization is impossible. Indeed, since the rotation isirrational, the existence of periodic points in any neighbourhood ofz = 0 implieszero radius of convergence.

Remarks.

186 On KAM Theory

residual setpersistence!of

quasi-periodicevolutions

1-bite small divisorproblem

- Notice that the residual setΛ is in the complement of the full measure setof all Diophantine numbers, see Appendix A and [150]. Also compare withExercise 5.14.

- Consideringλ ∈ T1 as a parameter, we see a certain analogy with the The-orems 5.2 and 5.3. Indeed, in this case for a full measure set of λ’s on aneighbourhood ofz = 0 the mapF = Fλ is conjugated to a rigid irrationalrotation. Such a domain in thez-plane often is referred to as a Siegel disc.For a more general discussion of these and of Herman rings, see [137].

5.5 The 1-bite small divisor problem

The 1-bite small divisor problem turns out to be the key problem useful for thegeneralKAM Theory, like the Theorems 5.2 and 5.3 as discussed both in this chap-ter and several others (in continuous time) as indicated in the Appendices B andC. The kind of proof we have in mind is a Newtonian iteration process for solv-ing conjugation equations, the linearizations of which lead to 1-bite small divi-sor problems. Compare with the Newtonian iteration as described in Chapter 1for finding roots of polynomials; the difference is that the present version takesplace in an infinite dimensional function space, compare with [144, 175] and with[160, 62, 61, 161, 128, 212].

To be a bit more concrete we start briefly summarizing§5.2.2, also referring toAppendix B, after which we shall deal with 1-bite small divisor problem for its ownsake. Here we largely follow parts of [61, 15], also compare with [69].

5.5.1 Motivation

We briefly recall the conjugation problem of§5.2. Given the vertical cylinder map(5.4)

Pε : (x, β) 7→ (x+ 2πβ + εa(x, β, ε), β).

the search is for a skew cylinder diffeomorphism (5.6)

Φε(x, β) = (x+ εU(x, β, ε), β + εσ(β, ε)),

satisfying the conjugation equation (5.5)Pε Φε = Φε P0, giving the nonlinearequation (5.7)

U(x+ 2πβ, β, ε)− U(x, β, ε) =

2πσ(β, ε) + a (x+ εU(x, β, ε), β + εσ(β, ε), ε)

5.5 The 1-bite small divisor problem 187

to be solved for the unknown functionsU andσ. After turning to power series

a(x, β, ε) = a0(x, β) +O(ε), U(x, β, ε) = U0(x, β) +O(ε),

σ(β, ε) = σ0(β) +O(ε),

we ended up with the linear equation (5.8)

U0(x+ 2πβ, β) − U0(x, β) = 2πσ0(β) + a0(x, β),

which has to be solved forU0, andσ0. The latter equation was coined as a 1 bitesmall divisor problem, since by writing

a0(x, β) =∑

k∈Z

a0k(β)eikx andU0(x, β) =∑

k∈Z

U0k(β)eikx

and comparing coefficients in (5.8) and assuming thatβ 6= Q, directly yields theformal solution (5.9)

U(x, β)∑

k∈Zn\0

a0k(β)

e2πikβ − 1eikx

andσ0 = − 12πa00, with U00 free to chose. It turns out that for all orders inε similar

1-bite problems arise and one might imagine a proof of Theorem 5.2 by provingconvergence of the corresponding power series inε, also compare with [142]. Theproof we have in mind uses a Newtonian iteration process to solve the nonlinearequation (5.7). At each step of this process we solve a linearequation of the form(5.8), which is a 1-bite problem.

5.5.2 Setting of the problem and formal solution

To be precise we introduce two 1 bite small divisor problems as follows, therebyslightly rephrasing (5.8),

u(x+ 2πβ, β) − u(x, β) = f(x, β) and⟨

ω,∂u(x, ω)

∂x

⟩

= f(x, ω). (5.13)

In both casesu has to be solved in terms of the given functionf, where in theformer casex ∈ T1 andu, f : T1 → R and in the latterx ∈ Tn andu, f : Tn →Rn. Moreover,β ∈ [0, 1] andω ∈ Rn will serve as parameters of the problems.By taking averages of both sides it follows that necessary for solution of both theequations in (5.13) is that for both the averages one has

f0(., β) ≡ 0 andf0(., ω) ≡ 0,

188 On KAM Theory

which we assume from now on.

Of the former, discrete, case of (5.13) we just summarized the formal solution,referring to§§2.2 and 5.2.2. See the Exercises 5.6, 5.7 and 5.8 for other examples.The second equation of (5.13) can be dealt with as follows. Expressingf in Fourierseries

f(x, ω) =∑

k 6=0

fk(ω) ei〈ω,k〉, (5.14)

one directly gets as a formal solution

u(x, ω) = u0(ω) +∑

k 6=0

fk(ω)

i〈ω, k〉 ei〈x,k〉, (5.15)

provided thatω1, ω2, . . . , ωn are independent overQ, i.e., that no resonances occur.Observe that the constant termu0(x, ω) remains arbitrary. As mentioned in§5.2.2,the convergence is problematic because of the corresponding small divisors.

In the following example we meet a slightly different 1 bite small divisor problem[61, 15].

Example 5.3 (Quasi-periodic responses in the linear case I). Consider a linearquasi-periodically forced oscillator

y + cy + ay = g(t, a, c) (5.16)

with y, t ∈ R, wherea andc with a > 0 andc2 < 4a serve as parameters. Thedriving termg is assumed to be quasi-periodic int, which means thatg(t, a, c) =G(tω1, tω2, . . . , tωn, a, c) for some functionG : Tn × R2 → R, n ≥ 2, and a fixednonresonant frequency vectorω ∈ Rn, i.e., withω1, ω2, . . . , ωn independent overQ. The problem is to find a response solutiony = y(t, a, c) of (5.16), that is quasi-periodic int with the same frequenciesω1, ω2, . . . , ωn. This leads to a 1-bite smalldivisor problem as follows.

The corresponding vector field onTn × R2 in this case determines the system ofdifferential equations

x = ω

y = z

z = −ay − cz +G(x, a, c),

wherex = (x1, x2, . . . , xn) modulo2π. We are looking for an invariant torus whichis a graph(y, z) = (y(x, a, c), z(x, a, c)) overTn with dynamicsx = ω.

For technical convenience we complexify the problem as follows:

λ = −c2

+ i

√

a− c2

4and ζ = z − λy.


Notice thatλ2 + cλ+ a = 0, ℑλ > 0, and that

y + cy + ay =

[

d

dt− λ

] [

d

dt− λ

]

y.

Now our vector field gets the form

x = ω

ζ = λζ +G(x, λ),

where we use complex multiplication and identifyR2 ∼= C. It is easily see that thedesired torus-graphζ = ζ(x, λ) with dynamicsx = ω now has to be obtained fromthe linear equation

⟨

ω,∂ζ

∂x

⟩

= λζ +G, (5.17)

which has a format that only slightly differs from the secondequation of (5.13).Nevertheless it also provides a 1 bite small divisor problemas we show now.

Again we expandG andζ in the Fourier series

G(x, λ) =∑

k∈Zn

Gk(λ)ei〈x,k〉, ζ(x, λ) =∑

k∈Zn

ζk(λ)ei〈x,k〉

and compare coefficients. It so turns out that the following formal solution is ob-tained,

ζ(λ) =∑

k∈Zn

Gk(λ)

i〈ω, k〉 − λ. (5.18)

granted thatλ 6= i〈ω, k〉, for all k ∈ Zn. Indeed, the denominators vanish at theresonance pointsλ = i〈ω, k〉, k ∈ Zn, so in a dense subset of the imaginaryλ-axis.In general a formal solution exists if and only ifλ is not in this set, and as in§5.2.2the convergence is problematic, again because of the corresponding small divisors.

5.5.3 Convergence

We next deal with convergence of the formal solutions (5.9) and (5.15). First weconsider the regularity of the problem. Recall that the Theorems 5.2 and 5.3 bothare formulated in the world ofC∞-systems, for a discussion see a remark followingTheorem 5.2. The proofs at hand, based on Newtonian iterations, all use the ap-proximation properties ofCk-functions by real analytic ones [160, 206, 207, 215,216, 62, 61]. It turns out that the linearizing 1 bite problems mostly are solved in theworld of real analytic systems. For that reason, and for simplicity, we here restrictto the case where in (5.13) the righthand side functionf is real-analytic in all its

190 On KAM Theory


ω1

ω2

Figure 5.5: Sketch of the setR2τ,γ (and a horizontal line. . .).

arguments. For variations on this to the cases ofCk-functions,k ∈ N ∪ ∞, seeExercise 5.13.

Recall that in§5.2.2 we required the rotation numberβ ∈ [0, 1] to be (τ, γ)-Diophantine, see (5.10), forτ > 2 and γ > 0, denoting the corresponding setby [0, 1]τ,γ; we recall that this forms a Cantor set of positive measure. Wegive asimilar definition for the frequency vectorω ∈ Rn.

Definition 5.4 (Diophantine vectors). Let τ > n − 1 and γ > 0. We say thatω ∈ Rn is (τ, γ)-Diophantine if for allk ∈ Zn we have that

|〈ω, k〉| ≥ γ|k|−τ , (5.19)

where|k| =∑n

j=1 |kj|. The subset of all(τ, γ)-Diophantine frequency vectors inRn is denoted byRn

τ,γ.

Lemma 5.5 (Properties of the Diophantine vectors).The closed subsetRnτ,γ ⊂

Rn has the following properties:

1. Wheneverω ∈ Rnτ,γ ands ≥ 1, then alsosω ∈ Rn

τ,γ (we say thatRnτ,γ has the

closed halfline property);


2. LetSn−1 ⊂ Rn be the unit sphere, thenSn−1 ∪ Rnτ,γ is the union of a Cantor

set and a countable set;

3. The complementSn−1 \ Rnτ,γ has Lebesgue measure of orderγ asγ ↓ 0.

For a proof see [61]. This proof is rather direct when using the Cantor-BendixsonTheorem [107], see the details on[0, 1]τ,γ in §5.2.2. For a sketch ofR2

τ,γ ⊂ R2 seeFigure 5.5. It turns out that the sets[0, 1]τ,γ andR2

τ,γ are closely related (for differentvalues ofτ ), see Exercise 5.4.

Another ingredient for proving the convergence of the formal solutions (5.9) and(5.15) is the decay of the Fourier coefficients as|k| → ∞. For brevity we nowrestrict to the second case of (5.13), leaving the first one tothe reader.

Let Γ ⊂ Rn be a compact domain and putΓγ = Γ ∩ Rnγ . By analyticityf has a

holomorphic extension to a neighborhood ofTn × Γ in (C/2πZ)n × Cn. For someconstantsκ, ρ ∈ (0, 1), this neighborhood contains the set cl((Tn + κ) × (Γ + ρ))where ‘cl’ stands for the closure and

Γ + ρ :=⋃

ω∈Γ

ω′ ∈ Cn : |ω′ − ω| < ρ, (5.20)

Tn + κ :=⋃

x∈Tn

x′ ∈ (C/2πZ)n : |x′ − x| < κ.

LetM be the supremum off over the set cl((Tn + κ) × (Γ + ρ)), then one has

Lemma 5.6 (Paley-Wiener I, e.g., [1]).Letf = f(x, ω) be real analytic as above,with Fourier series

f(x, ω) =∑

k∈Zn

fk(ω)ei〈x,k〉

[cf. (5.14)]. Then, for allk ∈ Zn and allω ∈ cl (Γ + ρ),

|fk(ω)| ≤Me−κ|k|.

Proof. By definition

fk(ω) = (2π)−n

∮

Tn

f(x, ω)e−i〈x,k〉dx.

Firstly, assume that fork = (k1, . . . , kn) all the entrieskj 6= 0. Then takez =(z1, . . . , zn) with

zj = xj − iκ sign(kj), 1 ≤ j ≤ n,

and use the Cauchy Theorem to obtain

fk(ω) = (2π)−n

∮

f(z, ω)e−i〈z,k〉dz = (2π)−n

∮

Tn

f(z, ω)e−i〈x,k〉−κ|k| dx,

192 On KAM Theory

whence|fk(ω)| ≤Me−κ|k|, as desired.

Secondly, if for somej (1 ≤ j ≤ n) we havekj = 0, we just do not shift theintegration torus in that direction. 2

Remark. The converse of Lemma 5.6 also olds true, e.g., compare with [1]: afunction onTn whose Fourier coefficients decay exponentially is real analytic andcan be extended holomorphically to the complex domain(C/2πZ)n by a distancedetermined by the decay rate (i.e.,κ in our notations).

As a direct consequence of the Paley-Wiener Lemma we have

Theorem 5.7 (The 1 bite small divisor problem in the real-analytic case). Con-sider the1 bite small divisor problems(5.13)with a real analytic right hand sidefwith average0. Then, for anyβ ∈ [0, 1]τ,γ, c.q., forω ∈ Rn

τ,γ, the formal solutions(5.9)and(5.15)converge to a solution that is real-analytic inx.

Indeed, again restricting to the case of (5.15), by Lemma 5.6the coefficientsfk

decay exponentially with|k| and by (5.19) the divisorsi〈ω, k〉 in size only growpolynomially. Hence, the quotient in (5.15) still has exponentially decaying Fouriercoefficients (with a somewhat smallerκ) and therefore by the remark followingLemma 5.6 this series converges to a real analytic solution.

Remark. What can be said about the regularity of the solutionsu = u(x, β) andu = u(x, ω) in theβ- andω-directions? This question is not trivial since the sets[0, 1]τ,γ and Rn

τ,γ are nowhere dense and hence have no interior points. It turnsout that in both casesu can be viewed as the restriction of aC∞-function definedon an open neighbourhood. In other words,u can be extended as aC∞-functionon this neighbourhood, even maintaining real analyticity in x. This means that inboth casesu is of class Whitney-C∞, for a proof see [61], for background see[160, 206, 207, 215, 216, 62] and Appendix B. It even turns outthatu even hasGevrey regularity, compare a remark following Theorem 5.2,for references see[23].

Example 5.4 (Quasi-periodic responses in the linear case II). We revisit Example5.3 regarding quasi-periodic responses in he linear case, we slightly adapt the aboveapproach. Indeed, for a fixed nonresonantω we study the formal solution (5.18)

ζ(x, λ) =∑

k∈Zn

Gk(λ)

i〈ω, k〉 − λei〈x,k〉.

The Diophantine condition here is that for givenτ > n − 1 andγ > 0 and for allk ∈ Zn \ 0

|λ− i〈ω, k〉| ≥ γ|k|−τ .


This condition excludes a countable number of open discs from the upper-halfλ-plane, indexed byk. This ‘bead string’ has a measure of the order ofγ2 asγ ↓ 0.The complement of the ‘bead string’ in the imaginaryλ-axis is a Cantor set. Herewe can apply the above kind of argument and obtain the Whitney-smoothness of thesolution inλ.

It may be of interest to study the latter problem for its own sake. A question iswhat happens asγ ↓ 0? The intersection of the ‘bead strings’ over allγ > 0 in theimaginaryλ-axis leaves out a residual set of measure zero containing the resonancepointsi〈ω, k〉, k ∈ Zn \ 0. Compare with Appendix A and with [150]. In theinterior of the complement of this intersection our solution is analytic, both inx andin λ. Forℜλ 6= 0 this also follows by writingζ = T (G), where

(T (G))(x, λ) =

∫ ∞

0

eλsF (x− s ω, λ) ds

(for simplicity we restricted to the caseℜλ < 0). Notice that in the operator (supre-mum) norm we have||T || = |ℜλ|−1, which explodes on the imaginary axis. Formore details see [44]. Remaining problems, among others, are what happens to oursolution in the residual set mentioned above and what can be said of the losses ofdifferentiability in the finitely differentiable case.

5.5.4 Finite differentiability

For completeness we now consider the 1 bite small divisor problem (5.13, 2nd eqn.)⟨

ω,∂u(x, ω)

∂x

⟩

= f(x, ω),

with the averagef0(., ω) ≡ 0 in the case wheref is of classCm, m ∈ N.

For any formal Fourier series

h(x) =∑

k∈Zn

hk ei〈x,k〉

we introduce the following familiar norms in terms of the Fourier coefficients:

|||h|||∞ = maxk

|hk|;

|||h|||2 =

√

∑

k

|hk|2;

|||h|||1 =∑

k

|hk|.

194 On KAM Theory

For any (continuous) functionh : Tn → R we recall that

|||h|||∞ ≤ |||h|||2 ≤ ||h||0 ≤ |||h|||1. (5.21)

Since we did not require the Fourier series to converge, someof these norms maybe infinite. In the case whereh is of classCm we let||h||m denote itsCm-norm, thatis the supremum overTn of h and its derivatives up to orderm. Compare AppendixA.

Lemma 5.8 (Paley-Wiener II, e.g., [1, 69]).Leth : Tn → R be of classCm, withvariable parth, i.e., such thath = h0 + h.

1. Then there exists a positive constantC(1)m,n such that for allk ∈ Zn \ 0

|k|m|hk| ≤ C(1)m,n||h||m.

2. Moreover, form ≥ n+ 1 there exists a positive constantC(2)m,n such that

||h||0 ≤ C(2)m,n max |k|m|hk|.

Proof. We recall thatk = (k1, k2, . . . , kn), where|k| = |k1| + · · · + |kn|. Also

hk =1

(2π)n

∫

Tn

e−i〈k,x〉h(x) d x.

By partial integration it follows that

hk =1

(2π)n

1

(ik1)p1 · · · (ikn)pn

∫

Tn

e−i〈k,x〉 ∂ph

∂xp11 · · ·∂xpn

n(x) d x, (5.22)

wherep1 + · · · + pn ≤ m. First for p = (p1, . . . , pn), k = (k1, . . . , kn) ∈ Zn weintroduce the multi-index notationkp = kp1

1 · · · kpnn .Also, we writePm = p ∈ Zn |

∑nj=1 pj = m. A consideration on positive definite, homogeneous polynomials

next reveals that∑

p∈Pm

kp ∼ |k|m. (5.23)

From (5.22) it now follows that

kp|hk| ≤ C||h||m and hence∑

p∈Pm

kp|hk| ≤ C ′||h||m,

which by (5.23) directly gives the first estimate.


For the second item we introduceDm(h) = max |k|m|hk|. It follows that |hk| ≤Dm(h)/|k|m for all k 6= 0. Then, for the variable parth of h we have

||h||0 ≤ |||h|||1 =∑

k 6=0

|hk| ≤ Dm(h)∑

k 6=0

1

|k|m ,

where we used (5.21) and where the latter sum is bounded form ≥ n + 1. Hence,for C(2)

m,n =∑

k 6=0 1/|k|m we have

||h||0 ≤ C(2)m.nDm(h),

as was to be shown. 2

Theorem 5.9 (The 1 bite small divisor problem in theCm case).Consider the1 bite small divisor problem(5.13, 2nd eqn.)with a right hand sidef of classCm and withf0(., ω) ≡ 0. Then for eachω ∈ Rn

τ,γ consider the formal solution(eq:formsol). We have

|uk| ≤|k|τγ

|fk|

for all k ∈ Zn \0.Moreover, form ≥ n+1+ τ one has thatu ∈ C0(Tn,R) with

||u||0 ≤ C(3)m,n,τ

1

γ||f ||m,

for a positive constantC(3)m,n,τ .

Proof. Comparing terms in the Fourier series, we find that

i〈ω, k〉uk = fk,

for k 6= 0, which by the Diophantine condition (5.19) gives the desiredestimate onthe Fourier coefficientsuk. Next if m ≥ n + 1 + τ, we get that

||u||0 ≤ |||u|||1 ≤∑

k 6=0

|k|τγ

|fk|

≤∑

k 6=0

|k|τγ

|k|−mnm||f ||m ≤ C

γ

∑

ℓ 6=0

ℓτ−m||f ||mℓn−1

≤ C||f ||mγ

∑

ℓ 6=0

ℓn+τ−m−1 ≤ C ′ ||f ||mγ

,

where we use thatn + τ −m − 1 ≤ −2, which gives convergence of the sum andyields the final estimate. Compare with Lemma 5.8. 2

196 On KAM Theory

Remark. Note that, ifm ≥ n + p+ τ with p ≥ 1, it follows thatu ∈ Cp−1, wheresimilarly

||u||p−1 ≤ C ′′ ||f ||mγ

,

where we now use thatn + τ −m− 1 ≤ −1 − p.

5.6 Exercises

Exercise 5.1 (Parallel vector fields onT2). OnT2, with coordinates(x1, x2), bothcounted mod2π, consider the constant vector fieldX, given by

x1 = ω1

x2 = ω2. (5.24)

1. Suppose thatω1 andω2 are rationally independent, then show that any integralcurve ofX is dense inT2.

2. Suppose thatω1/ω2 = p/q with gcd (p, q) = 1. Show thatT2 is foliated byperiodic solutions, all of periodq.

Hint: Use the Poincare (return) map of the circlex1 = 0 and use Lemma 2.2.

Exercise 5.2 (Intrinsicness is the frequency vector).If we defineTn = Rn/Zn

and count the angles modulo1, considerTn-automorphisms of the affine formx 7→a+Sx,wherea ∈ Rn andS ∈ GL(n,Z), the invertiblen×n-matrices that preservethe latticeZn. See Appendix B.

1. Show that the frequency vector of a multi-periodic torus is well-defined up tothe latticeZn.

2. How does this translate to the present situation whereTn = Rn/(2πZn)?

3. What can you say of the frequency vectors ofX andΦ∗X whenΦ is suffi-ciently close to the identity map?

4. Show that an individual vector field of the form (5.24) can never be struc-turally stable.

Exercise 5.3 (An equivalence turned into a conjugation).Show that in the caseof the forced nonlinear oscillator (2.6) the conjugationΦε between the return mapsP0 andPε can be extended to a (smooth) conjugation between the correspondingvector fields.

5.6 Exercises 197

Exercise 5.4 (On Diophantine conditions).Consider the(τ, γ)-Diophantine con-ditions (5.10) and (5.19), given by the conditions

∣

∣

∣

∣

β − p

q

∣

∣

∣

∣

≥ γ|q|−τ ,

for all rationalsp/q with gcd(p, q) = 1, and

|〈ω, k〉| ≥ γ|k|−τ ,

for all k ∈ Zn \ 0. Forn = 2 see Figure 5.5.

1. Show that forn = 2 the following holds. Wheneverω = (ω1, ω2), withω2 6= 0, is (τ, γ)-Diophantine, the ratioβ = ω1/ω2 is (τ + 1, γ)-Diophantine.

2. For(τ, γ)-Diophantineβ ∈ R show that∣

∣e2πkβ − 1∣

∣ ≥ 4γ|k|−τ .

Hint: Draw the complex unit circle and compare an appropriate arc with thecorresponding chord.

Exercise 5.5 (What’s in a name).OnR2 = y1, y2 consider the system

y1 = y21 − µ

y2 = −y2.

Draw the phase portraits of this planar vector field forµ negative, zero and positive.Can you guess and explain the name of this bifurcation?

Exercise 5.6 (A problem of Sternberg).On T2, with coordinates(x1, x2) (mod2π) a vector fieldX is given, with the following property. IfC1 denotes the circleC1 = x1 = 0, then the Poincare return mapP : C1 → C1 with respect toX is arigid rotationx2 7→ P (x2) = x2 + 2πβ, everything counted mod2π. From now onwe abbreviatex = x2. Let f(x) be the return time of the integral curve connectingthe pointsx andP (x) in C1. A priori, f does not have to be constant. The problemnow is to construct a(nother) circleC2, that does have a constant return time. LetΦt denote the flow ofX and expressP in terms ofΦt andf. Let us look for a circleC2 of the form

C2 = Φu(x)(0, x) | x ∈ C1.So the search is for a (periodic) functionu and a constantc, such that

Φc(C2) = C2.

198 On KAM Theory

Rewrite this equation in terms ofu andc. Solve this equation formally in terms ofFourier series. What condition onβ in general will be needed? Give conditions onβ, such that for a real analytic functionf a real analytic solutionu exists.

Regarding the present Exercise one has thatP (x) = Φf(x)(0, x). Next, using thegroup property for flows of vector fields (with addition in time), then keeping trackof the time one finds thatf(x) − u(x) + u(P (x)) = c. This is another 1-bite smalldivisor problem whereu can be solved givenf, provided thatc = f0, theT1-averageof f. It helped us to sketch the situation as in Figure 5.2 and to draw the integralcurve ofX starting at(0, x) that passes through(0, P (x)).

Exercise 5.7 (A normal form for families of circle maps). Given a1-parameterfamily of circle mapsPλ : T1 → T1 of the form

Pλ : x 7→ x+ 2πβ + f(x, λ),

wherex is counted mod2π and wheref(x, 0) ≡ 0. One has to show that by suc-cessive transformations of the form

Hλ : x 7→ x+ h(x, λ)

thex-dependence ofP can be pushed away to higher and higher order inλ. Forthis appropriate conditions onβ will be needed. Carry out the corresponding induc-tive process. What do you think the first step is? Then, concerning theN th step,consider

PN,λ(x) = x+ 2πβ + g(λ) + fN (x, λ) +O(|λ|N+1),

with fN(x, λ) = f(x)λN and look for a transformationH = Id+h, with h(x, λ) =h(x, λ)λN , such that inH−1 PN H theN th order part inλ is x-independent.Formulate sufficient conditions onβ, ensuring that the corresponding equation canbe formally solved, in terms of Fourier series. Finally giveconditions onβ, such thatin the real analytic case we obtain real analytic solutionsh. Explain your arguments.

Hint: Recall that by the Mean Value Theoremh(x + ϕ, λ) = h(x, λ) + h′(ξ, λ)ϕ,especially in cases whereϕ = O(|λ|N+1). By similar arguments it also follows thatH−1

λ (x) = x− h(x, λ) +O(|λ|N+1). Using all this one arrives at the expression

H−1λ PN,λ Hλ (x) =

x+ 2πβ + g(λ) + h(x, λ) − h(x+ 2πβ, λ) + fN(x, λ) +O(|λ|N+1),

where we require that theN th order parth(x, λ)−h(x+2πβ, λ)+fN(x, λ) = c(λ),for an appropriate constantc(λ). Show that this is a 1-bite small divisor problemwhereh can be solved givenfN , provided thatc = fN,0, theT1-average offN .

5.6 Exercises 199

Exercise 5.8 (Floquet problem on a codimension 1 torus).Consider a smoothsystem

x = f(x, y)

y = g(x, y),

with (x, y) ∈ Tn × Rm. Assume thatf(x, y) = ω + O(|y|), which implies thaty = 0 is a invariantn-torus, with on it a constant vector field with frequency-vectorω. Hence the torusy = 0 is multi-periodic. Putg(x, y) = Ω(x)y + O(|y|2),for a mapΩ : Tn → gl(m,R). The present problem is to find a transformationTn × Rm → Tn × Rm, of the form(x, y) 7→ (x, z) = (x,A(x)y), for some mapA : Tn → GL(m,R), with the following property: The transformed system

x = ω +O(|z|)y = Λz +O(|z|2),

is on Floquet form, meaning that the matrixΛ is x-independent. From now on, werestrict to the casem = 1. By a computation show that

Λ = Ω +n∑

j=1

ωj∂ logA

∂xj

.

From this derive an equation, expressing thatΛ is constant inx. Formally solvethis equation inA, givenΩ. Give conditions onω ensuring a formal solution. Alsoexplain how to obtain a real analytic solutionA, assuming real analyticity ofΩ.

Remark. Form ≥ 2 the expression forΛ becomes more complicated, since thenthe matrices do not commute. Apart from this, as said earlier, in this case there canbe topological obstructions against the existence of a Floquet form.

Exercise 5.9 (Main tongue of the Arnold family). For the Arnold family (5.12)

Pβ,ε(x) = x+ 2πβ + ε sinx

of circle maps, consider the region in the(β, ε)-plane where the family has a fixedpoint. Compute its boundaries and describe (also sketch) the dynamics on bothsides of and on a boundary curve. What kind of bifurcation occurs here?

Exercise 5.10 (Poincare map of the forced pendulum).Consider the equation ofmotion of a periodically driven pendulum

y + α2 sin y = ε cos(ωt), (5.25)

200 On KAM Theory

which can be written as a vector field

x = ω

y = z (5.26)

z = −α2 sin y + ε cosx

on the phase spaceT1 × R2 = x, (y, z). Show that the Poincare mapP = Pε ofthe vector field (5.26) is area preserving.

Exercise 5.11 (Action-angle variables for the autonomous pendulum). In Exer-cise 5.10 in the equation of motion (5.25) takeε = 0 and consider the transversalsectionΣ = (x, (y, z)) ∈ T1 × R2 | x = 0 mod2πZ of the vector field (5.26).Let H = H(y, z) be the energy of the autonomous pendulum. For(y, z) ∈ Σ weletH(y, z) = E. In the energy levelH−1(E) the autonomous pendulum carries outa periodic motion of periodT = T (E). Define

I(E) =1

2π

∮

H−1(E)

z dy

and show that

T (E) = 2πdI

dE(E).

Let t be the time the motion inH−1(E) takes to get from the linez = 0 to the point(y, z). Defining

ϕ(y, z) =2π

T (E)t,

show that(I, ϕ) is a pair of action-angle variables.

Consider the Poincare mapPε from Exercise 5.10. Show thatP0 has the form

P0(ϕ, I) = (ϕ+ 2πβ(I), I)

and derive an integral expression forβ(I). Show thatPε is a Twistmap and give aprecise investigation for quasi-periodic invariant circles.

Hint: Use the Gauß Divergence Theorem on an appropriate flow box and the factthat the time-dependent Hamiltonian vector field has divergence zero.

Exercise 5.12 (1 bite small divisor problem (5.13, 1st eqn.), real analyticity).Consider the first equation of the 1 bite problem (5.13), which is the linear differ-ence equation

u(x+ 2πβ, β)− u(x, β) = f(x, β)

5.6 Exercises 201

wherex ∈ T1 = R/(2πZ), β ∈ R, and whereu, f : T1 → R. Let β satisfy theDiophantine conditions (5.10)

∣

∣

∣

∣

β − p

q

∣

∣

∣

∣

≥ γ|q|−τ ,

for all p ∈ Z, q ∈ Z\0, with τ > 2, γ > 0. By expandingf andu in Fourierseries

f(x) =∑

k∈Z

fkeikx, u(x) =

∑

k∈Z

ukeikx,

the linear equation transforms to an infinite system of equations in the coefficientsuk

andfk as described in§§5.2 and 5.5 leading to a formal solution.

First assume thatf is real analytic and bounded byM on the complex strip ofwidth r > 0 aroundT1. That is, for

x ∈ T1 + σ = x ∈ C/(2πZ) | |Im x| < σ,

assume thatsup

x∈T1+σ

|f(x)| ≤M.

The Paley-Wiener Lemma 5.6 gives estimates of thekth Fourier coefficientfk interms ofσ,M andk.

6. Use these and the Diophantine conditions to obtain estimates ofuk. (Hint:Also use Exercise 5.4.)

7. Show that for0 < < σ, the formal Fourier series converges onT1 + anddefines an analytic functionu(x).

8. Derive a bound foru onT1 + that depends explicitly onσ and.

Exercise 5.13 ((∗) 1 bite small divisor problem (5.13, 1st eqn.), finite differen-tiability). Consider the same set-up as Exercise 5.12, but now assume that f is onlyof classCm.

2. What does this imply for the rate of decay of the coefficientsfk as|k| → ∞?

3. Using this result and the Diophantine conditions (5.10),estimate the rate ofdecay of the coefficientsuk as|k| → ∞.

4. What is the least valuem0 of m, such that the formal Fourier series∑

ukeikx

converges absolutely and so defines a continuous function?

5. Given thatf isCm withm > m0, what is the amount of differentiability ofu?

202 On KAM Theory

residual set Exercise 5.14 ((∗) Residual sets of zero measure).For definitions see AppendixA, also compare [150].

1. Use the sets[0, 1]τ,γ ⊂ [0, 1] of Diophantine numbers (5.10) to construct asubset of[0, 1] that is both residual and of zero measure.

2. Show that the subsetΛ ⊂ T1 as defined in Cremer’s example of§2.4.2 is bothresidual and of zero measure.

Chapter 6

Reconstruction and time seriesanalysis

6.1 Introduction

Since it was realized that even simple and deterministic dynamical systems can pro-duce trajectories which look like the result of a random process, there were someobvious questions: How can one distinguish ‘deterministicchaos’ from ‘random-ness’ and, in the case of determinism, how can we extract froma time series relevantinformation about the dynamical system generating it? It turned out that these ques-tions could not be answered in terms of the conventional linear time series analysis,i.e., in terms of autocovariances and power spectra. In thischapter we review thenew methods of time series analysis which were introduced inorder to answer thesequestions. We shall mainly restrict to the methods which arebased on embeddingtheory and correlation integrals: these methods, and the relations of these methodswith the linear methods, are now well understood and supported by mathematicaltheory. Though they are applicable to a large class of time series, it is often neces-sary to supplement them for special applications by ad hoc refinements. We shallnot deal in this chapter with such ad hoc refinements, but concentrate on the mainconcepts. In this way we hope to make the general theory as clear as possible. Forthe special methods which are required for the various applications, we refer to themonographs [120, 89] and [98] as well as to the collections ofresearch papers [205]and [149].

We start in the next section with an experimental example, a dripping faucet, takenfrom [79] which gives a good heuristic idea of the method of reconstruction. Then,in Section 6.3, we describe the Reconstruction Theorem which gives, amongst oth-ers, a justification for this analysis of the dripping faucet. It also motivates a (stillrather primitive) method for deriving numerical invariants which can be used to

204 Reconstruction and time series analysis

dripping faucet characterize aspects of the dynamics of the system generating a time series; thisis discussed in Section 6.4. After that we introduce in Section 6.5 some notionsrelated with stationarity, in particular the so-called reconstruction measures, whichdescribe the statistics of the behaviour of a time series as far as subsegments (ofsmall lengths) are concerned. After that we discuss the correlation integrals and thedimensions and entropies which are based on them in Section 6.6 and their estima-tion in Section 6.7. In Section 6.8 we relate the methods in terms of correlationintegrals with linear methods. Finally, in Section 6.9, we give a short discussion ofrelated topics which do not fall within the general framework of the analysis of timeseries in terms of correlation integrals.

6.2 An experimental example: the dripping faucet

Here we give an account of a heuristic way in which the idea of reconstruction wasapplied to a simple physical system which was first publishedin Scientific American[79] and which was based on earlier work in [151]. We also use this example tointroduce the notion of a (deterministic) dynamical systemas a generator of a timeseries.

We consider a sequence of successive measurements from a simple physical exper-iment: One measures the lengths of the time intervals between successive dropsfalling from a dripping faucet. They form a finite, but long, sequence of real num-bers, atime seriesy0, y1, . . .. The question is whether this sequence of real num-bers contains an indication about whether it is generated bya deterministic systemor not (the faucet was set in such a way that the dripping was not ‘regular’, i.e.,not periodic; for many faucets such a setting is hard to obtain). In order to answerthis question, the authors of [79] considered the set of 3-vectors(yi−1, yi, yi+1)i,each obtained by taking three successive values of the time series. These vectorsform a ‘cloud of points’ inR3. It turned out that, in the experiment reported in [79],this cloud of points formed a (slightly thickened) curveC in 3-space. This gives anindication that the process is deterministic (with small noise corresponding to thethickening of the curve). The argument is based on the fact that the valuesyn−1

andyn of the time series, together with the (thickened) curveC, enable in general agood prediction for the next valueyn+1 of the time series. This prediction ofyn+1 isobtained in the following way: we consider the lineℓn in 3-space which is in the di-rection of the third coordinate axis and whose first two coordinates areyn−1 andyn.This line ℓn has to intersect the cloud of 3-vectors, since(yn−1, yn, yn+1) belongsto this cloud (even thought the cloud is composed of vectors which were collectedin the past, we assume the past to be so extensive that new vectors do not substan-tially extendthe cloud). If we ‘idealize’ this cloud to a curveC, thengenericallyℓn andC only should have this one point in common. Later we will make all this

6.3 The Reconstruction Theorem 205

read out functionmathematically rigorous, but the idea is that two (smooth) curves in 3-space havein general, or generically, no intersection (in dimension 3 there is enough space fortwo curves to become disjoint by a small perturbation); by construction,C andℓnmust have one point in common, namely(yn−1, yn, yn+1), but there is no reason forother intersections. For the definition of ‘generic’ see Appendix A. This means thatwe expect that we can predict the valueyn+1 from the previous valuesyn−1 andyn

as the third coordinate of the intersectionC ∩ ℓn. Even if we do not idealize thecloud of points inR3 to the curve, we conclude that the next valueyn+1 should bein the range given by the intersection of this cloud withℓn; usually this range issmall and hence the expected error of the prediction will be small. The fact thatsuch predictions are possible means that we deal with a deterministic system or analmostdeterministic system if we take into account the fact that the cloud of vectorsis really athickenedcurve.

So, in this example, we can consider the cloud of vectors in 3-space, or its idealiza-tion to the curveC, as thestatistical informationconcerning the time series whichis extracted from along sequence of observations. It describes the dynamical lawgenerating the time series in the sense that it enables us to predict, from ashortsequence of recent observations, the next observation.

A mathematical justification of the above method to detect a deterministic struc-ture in an apparently random time series, is given by the Reconstruction Theorem;see Section 6.3. This was a main starting point of what is now called nonlineartime series analysis. The fact that many systems are not strictly deterministic butcan be interpreted as deterministic with a small contamination of noise gives extracomplications to which we will return later.

Note that we only considered here the prediction of one step in the future. So evenif this type of prediction works, the underlying dynamical system still may have apositive dispersion exponent, see Section 2.3.2.

6.3 The Reconstruction Theorem

In this section we formulate the Reconstruction Theorem anddiscuss some of itsgeneralizations. In the next section we give a first example of how it is relevant forthe questions raised in the introduction of this chapter.

We consider a dynamical system, generated by a diffeomorphismϕ : M →M on acompact manifoldM , together with a smooth ‘read out’ functionf : M → R. Thisset up is proposed as the general form of mathematical modelsfor deterministic dy-namical systems generating time series in the following sense. Possibleevolutions,or orbits, of the dynamical system are sequences of the formxn = ϕn(x0), whereusually the indexn runs through the non-negative integers. We assume that what


time seriesReconstruction

Theorem

is observed, or measured, at each timen is not the whole statexn but just one realnumberyn = f(xn), wheref is the read out function which assigns to each statethe value which is measured (or observed) when the system is in that state. So cor-responding to each evolutionxn = ϕn(x0) there is a time seriesY of successivemeasurementsyn = f(ϕn(x0)) = f(xn).

In the above setting of a deterministic system with measurements, we made somerestrictions which are not completely natural: the time could have been continuous,i.e., the dynamics could have been given by a differential equation (or vector field)instead of a diffeomorphism (in other words, the time set could have been the realsRinstead of the natural numbersN); instead of measuring only one value, one couldmeasure several values (or a finite dimensional vector); also the dynamics couldhave been given by an endomorphism instead of a diffeomorphism; finally we couldallowM to be non-compact. We shall return to such generalizations after treatingthe situation as proposed, namely with the dynamics given bya diffeomorphismon a compact manifold and a 1-dimensional read out function.In this situation wehave:

Theorem 6.1 (Reconstruction Theorem).For a compactm-dimensional manifoldM , ℓ ≥ 1, andk > 2m there is an open and dense subsetU ⊂ Diffℓ(M)×Cℓ(M),the product of the space ofCℓ-diffeomorphisms onM and the space ofCℓ-functionsonM , such that for(ϕ, f) ∈ U the following map is an embedding ofM into Rk:

M ∋ x 7→ (f(x), f(ϕ(x)), . . . , f(ϕk−1(x))) ∈ Rk.

Observe that the conclusion of Theorem 6.1 holds forgenericpairs(ϕ, f). A proof,which is an adaptation of the Whitney Embedding Theorem [208], first appeared in[185], also see [29]. For a later and more didactical version, with some extensions,see [176] and [26]. For details we refer to these publications.

We introduce the following notation: the map fromM to Rk, given by

x 7→ (f(x), f(ϕ(x)), . . . , f(ϕk−1(x))),

is denoted byReck and vectors of the form(f(x), f(ϕ(x)), . . . , f(ϕk−1(x))) arecalledk-dimensionalreconstruction vectorsof the system defined by(ϕ, f). Theimage ofM underReck is denoted byXk.

The meaning of the Reconstruction Theorem is the following.For a time seriesY = yn we consider the sequence of itsk-dimensionalreconstruction vectors(yn, yn+1, . . . , yn+k−1)n ⊂ Rk; this sequence is ‘diffeomorphic’ to the evolutionof the deterministic system producing the time seriesY provided that the followingconditions are satisfied:

- The numberk is sufficiently large, e.g., larger than twice the dimensionof thestate spaceM ;

The Reconstruction Theorem 207

- The pair(ϕ, f), consisting of the deterministic system and the read out func-tion, is generic in the sense that it belongs to the open and dense subset as inTheorem 6.1.

The sense in which the sequence of reconstruction vectors ofthe time series isdiffeomorphic to the underlying evolution is as follows. The relation between thetime seriesY = yn and the underlying evolutionxn = ϕn(x0) is thatyn =f(xn). This means that the reconstruction vectors of the time seriesY are just theimages under the mapReck of the successive points of the evolutionxn. SoReck,which is, by the theorem, an embedding ofM onto its imageXk ⊂ Rk, sends theorbit of xn to the sequence of reconstruction vectors(yn, yn+1, . . . , yn+k−1)n

of the time seriesY . We note that a consequence of this fact, and the compactnessof M , is that for any metricd onM (derived from some Riemannian metric), theevolutionxn and the sequence of reconstruction vectors ofY are metrically equalup tobounded distortion. This means that the quotients

d(xn, xm)

||(yn, . . . , yn+k−1) − (ym, . . . , ym+k−1)||

are uniformly bounded and bounded away from zero. So, all therecurrence proper-ties of the evolutionxn are still present in the corresponding sequence of recon-struction vectors ofY .

6.3.1 Generalizations

6.3.1.1 Continuous time

Suppose now that we have a dynamical system with continuous time, i.e., a dy-namical system given by a vector field (or a differential equation)X on a compactmanifoldM . For eachx ∈ M we denote the corresponding orbit byt 7→ ϕt(x) fort ∈ R. In that case there are two alternatives for the definition ofthe reconstructionmapReck. One is based on a discretization: we take a (usually small) time intervalh > 0 and define the reconstruction mapReck in terms of the diffeomorphismϕh,the timeh map of the vector fieldX. Also then one can show that fork > 2mand generic triples(X, f, h), the reconstruction mapReck is an embedding. An-other possibility is to define the reconstruction map in terms of the derivatives ofthe functiont 7→ f(ϕt(x)) at t = 0, i.e. by taking

Reck(x) = (f(ϕt(x)),∂

∂tf(ϕt(x)), . . . ,

∂k−1

∂tk−1f(ϕt(x))) |t=0 .

Also for this definition of the reconstruction map the conclusion is the same: fork > 2m and generic pairs(X, f) the reconstruction map is an embedding ofM into


Rk. The proof for both these versions is in [185]; in fact in [29]it was just the caseof continuous time that was treated through the discretization mentioned above.

We note that the last form of reconstruction (in terms of derivatives) is not veryuseful for experimental time series since the numerical evaluation of the derivativesintroduces in general a considerable amount of noise.

6.3.1.2 Multidimensional measurements

In the case that at each time one measures a multidimensionalvector, so that we havea read out function with values inRs, the reconstruction mapReck has values inRsk.In this case the conclusion of the Reconstruction Theorem still holds for genericpairs(ϕ, f) wheneverks > 2m. As far as known to us there is no reference for thisresult, but the proof is the same as in the case with 1-dimensional measurements.

6.3.1.3 Endomorphisms

If we allow the dynamics to be given by an endomorphism instead of a diffeo-morphism, then the obvious generalization of the Reconstruction Theorem isnotcorrect. A proper generalization in this case, which can serve as a justification forthe type of analysis which we described in section 2 and also of the numerical esti-mation of dimensions and entropies in section 7, was given in[188]. Fork > 2mone proves that there is, under the usual generic assumptions, a mapπk : Xk → Msuch thatπk Reck = ϕk−1. This means that a sequence ofk successive measure-ment determines the state of the system at theendof the sequence of measurements.However, there may be different reconstruction vectors corresponding to the samefinal state; in this caseXk ⊂ Rk is in general not a sub-manifold, but still the mapπk is differentiable, in the sense that it can be extended to a differentiable map froma neighbourhood ofXk in Rk toM .

6.3.1.4 Compactness

If M is not compact the Reconstruction Theorem remains true, butnot the remarkabout ‘bounded distortion’. The reason for this restriction was however the follow-ing. When talking about predictability, as in Section 6.2, based on the knowledgeof a (long) segment of a time series, an implicit assumption is always: what willhappen next already happened (exactly or approximately) inthe past. So this ideaof predictability is only applicable to evolutionsxnn≥0 which have the propertythat for eachε > 0 there is anN(ε) such that for eachn > N(ε), there is some0 < n′(n) < N(ε) with d(xn, xn′(n)) < ε. This is a mathematical formulationof the condition that after a sufficiently long segment of theevolution (here lengthN(ε)) every new state, likexn, is approximately equal (here equal up to a distance

The Reconstruction Theorem 209

ε) to one of the past states, herexn′(n). It is not hard to see that this condition onxn, assuming that the state space is a complete metrizable space, is equivalentwith the condition that the positive evolutionxnn≥0 has a compact closure, seeSection 2.3.2. Since such an assumption is basic for the mainapplications of theReconstruction Theorem, it is no great restriction of the generality to assume, as wedid, that the state spaceM itself is a compact manifold since we only want to dealwith a compact part of the state space anyway. In this way we avoid also the com-plications of defining the topology on spaces of functions (and diffeomorphisms)on non-compact manifolds.

There is another generalization of the Reconstruction Theorem which is related tothe above remark. For any evolutionxnn≥0 of a (differentiable) dynamical systemwith compact state space, one defines itsω-limit ω(x0) as:

ω(x0) = x ∈M | ∃ni → ∞, such thatxni→ x,

compare with Section 4.1. (Recall that often such anω-limit is an attractor, but thatis of no immediate concern to us here.) Using the compactnessof M , or of theclosure of the evolutionxnn≥0, one can prove that for eachε > 0 there is someN ′(ε) such that for eachn > N ′(ε), d(xn, ω(x0)) < ε. So theω-limit describes thedynamics of the evolution starting inx0 without the peculiarities (transients) whichare only due to the specific choice of the initial state. For a number of applications,one does not need the reconstruction mapReck to be an embedding of the whole ofM , but only of theω-limit of the evolution under consideration. Often the dimen-sion of such anω-limit is much lower than the dimension of the state space. Forthis reason it was important that the Reconstruction Theorem was extended in [176]to this case, where the condition onk could be weakened:k only has to be largerthan twice the dimension of theω-limit. It should however be observed that theseω-limit sets are in general not very nice spaces (like manifolds) and that for thatreason the notion of dimension is not so obvious (one has to take in this case thebox counting dimension, see Section 6.4.1); in the conclusion of the theorem, onegets an injective map of theω-limit into Rk, but the property of bounded distortionhas not been proven for this case (and is probably false).

6.3.2 Historical note

This Reconstruction Theorem was obtained independently byAeyels and the sec-ond autho. In fact it was D.L. Elliot who pointed out, at the Warwick conference in1980 where the Reconstruction Theorem was presented, that Aeyels had obtainedthe same type of results in his thesis (Washington University, 1978) which was pub-lished as [29]. His motivation came from systems theory, andin particular from theobservability problem for generic nonlinear systems.


−1.5

−1

−0.5

0

0.5

1

1.5

x

0 20 40 60 80 100

n

−1.5

−1

−0.5

0

0.5

1

1.5

x

0 20 40 60 80 100

n

Figure 6.1: Time series: (left) from the Henon system withn running from 100 to200 and (right) a randomized version of this time series.

6.4 Reconstruction and detecting determinism

In this section we show how, from the distribution of reconstruction vectors of agiven time series, one can obtain an indication whether the time series was generatedby a deterministic or a random process. In some cases this is very simple, seeSection 6.2. As an example we consider two time series: one isobtained from theHenon system and the second is a randomized version of the first one. We recall,see also Section 1.2.3, that the Henon system has a 2-dimensional state spaceR2

and is given by the map

(x1, x2) 7→ (1 − ax21 + bx2, x1);

we take fora and b the usual valuesa = 1.4 and b = .3. As a read out func-tion we takef(x1, x2) = x1. For an evolution(x1(n), x2(n)), with (x1(0), x2(0))close to(0, 0), we consider the time seriesyn = f(x1(n), x2(n)) = x1(n) with0 ≤ n ≤ N for some largeN . In order to get a time series of an evolutioninsidetheHenon attractor (within the limits of the resolution of thegraphical representation)we omitted the first 100 values. This is our first time series. Our second time seriesis obtained by applying a random permutation to the first timeseries. So the secondtime series is essentially a random iid (identically and independently distributed)time series with the same histogram as the first time series. From the time seriesitself it is not obvious which of the two is deterministic, see Figure 6.1. However,if we plot the ‘cloud of 2-dimensional reconstruction vectors’, the situation is com-pletely different, compare with Figure 6.2. In the case of the reconstruction vectorsfrom the Henon system we clearly distinguish the well knownpicture of the Henonattractor, see Figure 1.13 in§1.3. In the case of the reconstruction vectors of therandomized time series we just see a structureless cloud.

Reconstruction and detecting determinism 211

−1.5

−1

−0.5

0

0.5

1

1.5

xn+1

−1.5 −1 −0.5 0 0.5 1 1.5

xn

−1.5

−1

−0.5

0

0.5

1

1.5

xn+1

−1.5 −1 −0.5 0 0.5 1 1.5

xn

Figure 6.2: The clouds of 2-dimensional reconstruction vectors for the two timeseries in figure 6.1 .

Of course the above example is rather special, but, in terms of the ReconstructionTheorem, it may be clear that there is a more general method behind this obser-vation. In the case that we use a time series that is generatedby a deterministicprocess, the reconstruction vectors fill a ‘dense’ subset ofthe differentiable image(under the reconstruction map) of the closure of the corresponding evolution of thedeterministic system (or of itsω-limit if we omit a sufficiently long segment fromthe beginning of the time series); of course, for these vectors to fill the closure of theevolutionreally densely, the time series should be infinite. If the reconstruction vec-tors are of sufficiently high dimension, one generically gets even a diffeomorphicimage. In the case of the Henon system the reconstruction vectors form a diffeo-morphic image of the Henon attractor which has, according to numerical evidence,a dimension strictly between 1 and 2 (for the notion of dimension see below). Thismeans that also the reconstruction vectors densely fill a subset of dimension strictlysmaller than 2; this explains that the 2-dimensional reconstruction vectors are con-centrated on a ‘thin’ subset of the plane (also this will be explained below). Such aconcentration on a ‘thin’ subset is clearly different from the diffuse distribution ofreconstruction vectors which one sees in the case of the randomized time series. Adiffuse distribution, filling out some open region is typical for reconstruction vectorsgenerated by a random process, e.g. this is what one gets for atime series generatedby any (non-deterministic) Gaussian process.

In the above discussion the, still not defined, notion of dimension played a key role.The notion of dimension which we have to use here is not the usual one from topol-ogy or linear algebra which only can have integer values: we will use here so calledfractal dimensions. In this section we will limit ourselvesto the box counting di-mension. We first discuss its definition, or rather a number ofequivalent definitions,and some of its basic properties. It then turns out that the numerical estimation of


box countingdimension

these dimensions provides a general method for discrimination between time se-ries as discussed here, namely those generated by a low dimensional attractor of adeterministic system, and those generated by a random process.

6.4.1 Box counting dimension and its numerical estimation

For introductory information on the various dimensions used in dynamical systemswe refer the reader to [153] and to references therein. For a more advanced treat-ment see [95, 96] and [154].

The box counting dimension, which we discuss here, is certainly not the most con-venient for numerical estimation, but it is conceptually the simpler one. The com-putationally more relevant dimensions will be discussed inSection 6.6.

Let K be a compact metric space. We say that a subsetA ⊂ K is ε-spanning ifeach point inK is contained in theε-neighbourhood of one of the points ofA. Thesmallest cardinality of anε-spanning subset ofK is denoted byaε(K); note thatthis number is finite due to the compactness ofK.

For the same compact metric spaceK a subsetB ⊂ K is calledε-separated if,for any two pointsb1 6= b2 ∈ B, the distance betweenb1 andb2 is at leastε. Thegreatest cardinality of anε-separated subset ofK is denoted bybε(K).

It is not hard to prove that

a ε2(K) ≥ bε(K) ≥ aε(K),

see the exercises. Also it may be clear that, asε ↓ 0, bothaε andbε tend to infinity,except ifK consists of a finite number of points only. The speed of this growthturns out to provide a definition of dimension.

Definition 6.2 (Box Counting Dimension).Thebox counting dimensionof a com-pact metric spaceK is defined as

d(K) = lim supε↓0

ln(aε(K))

− ln(ε)= lim sup

ε↓0

ln(bε(K))

− ln(ε).

Note that the equality of the two expressions in the theorem follows from the aboveinequalities.

It is not hard to show that ifK is the unit cube, or any compact subset of theEuclideann-dimensional space with non-empty interior, then its box counting di-mension isd(K) = n; also, for any compact subsetK ⊂ Rn, we haved(K) ≤ n,compare with the exercises. This is what we should expect from any quantity whichis some sort of a dimension. There are however also examples where this dimensionis not an integer. Main examples of this are the mid-α Cantor sets. These sets can

Reconstruction and detecting determinism 213

be constructed as follows. We start with the intervalK0 = [0, 1] ⊂ R; eachKi+1 isobtained fromKi by deleting from each of the intervals ofKi an open interval in itsmiddle of lengthα times the length the interval ofKi. SoKi will have2i intervalsof length((1 − α)/2)i. See also the exercises.

Next we consider a compact subsetK ⊂ Rk. With the Euclidean distance functionthis is also a metric space. For eachε > 0 we consider the partition ofRk intoε-boxes of the form

n1ε ≤ x1 < (n1 + 1)ε, . . . , nkε ≤ xk < (nk + 1)ε,

with (n1, . . . , nk) ∈ Zk. The number of boxes of this partition containing at leastone point ofK, is denoted bycε(K). We then have the following inequalities:

aε√

k(K) ≤ cε(K) ≤ 3kaε(K).

This means that in the definition of box counting dimension, we may replace thequantitiesan or bn by cn wheneverK is a subset of a Euclidean space. This explainsthe name ‘box counting dimension’; also other names are usedfor this dimension;for a discussion, with some history, compare with [96].

It follows easily from the definitions that ifK ′ is the image ofK under a mapfwith bounded expansion, or a Lipschitz map, i.e., a map such that for some constantC we have for all pairsk1, k2 ∈ K that

ρ′(f(k1), f(k2)) ≤ Cρ(k1, k2),

whereρ, ρ′ denote the metrics inK andK ′ respectively, thend(K ′) ≤ d(K). Adifferentiable map, restricted to a compact set, has bounded expansion. Thus wehave proven:

Lemma 6.3 (Dimension of Differentiable Image).Under a differentiable map thebox counting dimension of a compact set does not increase, and under a diffeomor-phism it remains the same.

Remark. One should not expect that this lemma remains true without the assump-tion of differentiability. This can be seen from the well known Peano curve whichis a continuous map from the (1-dimensional) unit interval onto the (2-dimensional)square. Compare with [153].

6.4.2 Numerical estimation of the box counting dimension

SupposeK ⊂ Rk is a compact subset andk1, k2, . . . ∈ K is a dense sequencein K. If such a dense sequence is numerically given, in the sense that we have


(good approximations of)k1, . . . , kN for some large valueN , this can be used toestimate the box counting dimension ofK in the following way: first we estimatethe quantitiescε(K) for various values ofε by counting the number ofε-boxescontaining at least one of the pointsk1, . . . , kN . Then we use these quantities toestimate the limit (or limsup) ofln(cε(K))/ − ln(ε) (which is the box countingdimension).

We have to use the notion of ‘estimating’ here in a loose senseand since there iscertainly no error bound. However, we can say somewhat more:first, we shouldnot try to makeε smaller than the precision with which we know the successivepointski; then there is another lower bound forε which is much harder to assess,and which depends onN : if we takeε very small, thencε(K) will usually becomelarger thanN which implies that we will get too low an estimate forcε (since ourestimate will at most beN). These limitations onε imply that the estimation of thelimit (or limsup) of ln(cε(K))/ − ln(ε) cannot be more than rather ‘speculative’.The ‘speculation’ which one usually assumes is that the graph of ln(cε(K)), as afunction of− ln(ε), approaches a straight line for decreasingε, i.e. for increasing− ln(ε). Whenever this speculation is supported by numerical evidence, one takesthe slope of the estimated straight line as estimate for the box counting dimension.It turns out that this procedure often leads to reproducibleresults in situations whichone encounters in the analysis of time series of (chaotic) dynamical systems (at leastif the dimension of theω-limit is sufficiently low).

Later, we shall see other definitions of dimension which leadto estimation proce-dures which are better than the present estimate of the box counting dimension inthe sense that we have a somewhat better idea of the variance of the estimates. Still,dimension estimations have been carried out along the line as described above, see[101]. In that paper, dimension estimates were obtained fora 1-parameter familyof dimensions, the so calledDq-dimensions introduced by Renyi, see [165]. Thebox counting dimension as defined here coincides with theD0-dimension. The esti-mated value of this dimension for the Henon attractor isD0 = 1.272± .006, wherethe error bound of course is not rigorous: there is not even a rigorous proof for thefact thatD0 is greater than 0 in this case!

6.4.3 Box counting dimension as an indication for ‘thin’ subsets

As remarked above, a compact subset of the plane which has a box counting di-mension smaller than 2 cannot have interior points. So the ‘thin appearance’ of thereconstructed Henon attractor is in agreement with the fact that the box countingdimension is estimated to be smaller than 2.

From the above arguments it may be clear that if we apply the dimension estima-tion to a sequence of reconstruction vectors from a time series which is generated

6.5 Stationarity and reconstruction measures 215

by a deterministic system with a low dimensional attractor,then we should find adimension which does not exceed the dimension of this attractor. So an estimateddimension, based on a sequence of reconstruction vectors(yi, . . . , yi+k−1) ∈ Rk,which is significantly lower than the dimensionk in which the reconstruction takesplace, is an indication in favour of a deterministic system generating the time series.

Here we have to come back to the remark at the end of Section 6.2, also in relationwith the above remark warning against too small values ofε. Suppose we considera sequence of reconstruction vectors generated by a system which we expect to beessentially deterministic, but contaminated with some small noise (like in the caseof the dripping faucet of Section 6.2). If, in such cases, we consider values ofεwhich are smaller than the amplitude of the noise we should find that the ‘cloud’ ofk-dimensional reconstruction vectors has dimensionk. So such values ofε shouldbe avoided.

Remark. So far we have given even a partial answer how to distinguish betweendeterministic and random time series. In Section 6.8, and inparticular§§6.8.2 and6.8.3, we will be able to discuss this question in greater depth.

6.4.4 Estimation of topological entropy

The topological entropy of a dynamical systems is a measure for the sensitivenessof evolutions on initial states, or of its unpredictability(we note that this notionis related to, but different from, the notion of dispersion exponent introduced inSection 2.3.2). This quantity can be estimated from time series with a procedurethat is very much related to the above procedure for estimating the box countingdimension. It can also be used as an indication whether we deal with a time seriesgenerated by a deterministic or by a random system, see [185]. We shall howevernot discuss this matter here. The reason is that, as we observed before, the actualestimation of these quantities is difficult and often unreliable. We shall introducein Section 6.6 quantities which are easier to estimate and carry the same type ofinformation. In the context of these quantities we shall also discuss several formsof entropy.

6.5 Stationarity and reconstruction measures

In this section we discuss some general notions concerning time series. These timeseriesyii≥0 are supposed to be of infinite length; the elementsyi can be inR, inRk, or in some manifold. This means that also a (positive) evolution of a dynamicalsystem can be considered as a time series from this point of view. On the other handwe do not assume that our time series are generated by deterministic processes.

216 Reconstruction theory and nonlinear time series analysis

reconstructionmeasure

Also, we do not consider our time series as realizations of some stochastic processas is commonly done in texts in probability theory, i.e., we do not consideryi as afunction ofω ∈ Ω, whereΩ is some probability space; in§6.5.2 we however shallindicate how our approach can be interpreted in terms of stochastic processes.

The main notion which we discuss here is the notion ofstationarity. This notion isrelated with the notion of predictability as we discussed itbefore: it means that the‘statistical behaviour’ of the time series stays constant in time, so that knowledgeof the past can be used to predict the future, or to conclude that the future cannot be(accurately) predicted. Equivalently, stationarity means that all kinds of averages,quantifying the ‘statistical behaviour’ of the time series, are well defined.

Before giving a formal definition, we give some (counter-) examples. Standardexamples of non-stationary time series are economical timeseries like prices as afunction of time (say measured in years): due to inflation such a time series will,in average, increase, often at an exponential rate. This means that, e.g., the averageof such a time series (as an average over the infinite future) is not defined, at leastnot as a finite number. Quotients of prices in successive years have a much betterchance of forming a stationary time series. These examples have to be taken witha grain of salt: such time series are not (yet) known for the whole infinite future.However, more sophisticated mathematical examples can be given. For example weconsider the time seriesyi with:

- yi = 0 if the integer part ofln(i) is even;

- yi = 1 if the integer part ofln(i) is odd.

It is not hard to verify thatlimn→∞1n

∑ni=0 yi does not exist, see the exercises. This

non-existence of the average for the infinite future means that this time series is notstationary. Though this example may look pathological, we shall see that examplesof this type occur as evolutions of dynamical systems.

6.5.1 Probability measures defined by relative frequencies

We consider a topological spaceM with a countable sequence of pointspii∈Z+ ⊂M . We want to use this sequence to define a Borel probability measureµ onMwhich assigns to each (reasonable) subsetA ⊂ M the average fraction of points ofthe sequencepi which are inA, so

µ(A) = limn→∞

1

n#i | 0 ≤ i < n andpi ∈ A,

where as before,# denotes the cardinality. If this measure is well-defined we call itthemeasure of relative frequenciesof the sequencepi. One problem is that these

Stationarity and reconstruction measures 217

relative frequencieslimits may not be defined; another that the specification of the ‘reasonable’ sets isnot so obvious. This is the reason that, from a mathematical point of view, anotherdefinition of this measure of relative frequencies is preferred. For this approachwe need our spaceM to be metrizable and to have a countable basis of open sets(these properties hold for all the spaces one encounters as state spaces of dynamicalsystems). Next we assume that for each continuous functionψ : M → R withcompact support, the limit

µ(ψ) = limn→∞

1

n

n−1∑

i=0

ψ(pi)

exists; if this limit does not exist for some continuous function with compact sup-port, then the measure of relative frequencies is not defined. If all these limits exist,then µ is a continuous positive linear functional on the space of continuous func-tions with compact support. According to the Riesz Representation Theorem, e.g.see [169], there is then a unique (non-negative) Borel measureµ onM such that foreach continuous function with compact supportψ we have

µ(ψ) =

∫

ψdµ.

It is not hard to see that for this measure we always haveµ(M) ≤ 1. If µ(M) = 1,we callµ the probability measure of relative frequencies defined bypi. Note thatit may happen thatµ(M) is strictly smaller than 1, e.g., if we haveM = R andpi → ∞, asi → ∞, think of prices as a function of time in the above example. Inthat caseµ(ψ) = 0 for each functionψ with compact support, so thatµ(M) = 0. Ifµ(M) < 1, the probability measure of relative frequencies is not defined.

6.5.2 Definition of stationarity and reconstruction measures

We first consider a time seriesyi with values inR. For eachk we have a corre-sponding sequence ofk-dimensional reconstruction vectors

Y ki = (yi, . . . , yi+k−1) ∈ Rki≥0.

We say that the time seriesyi is stationary if for eachk the probability measureof relative frequencies ofY k

i i is well defined. So here we need the time series tobe of infinite length. In the case that the time series has values in a manifoldM (orin a vector spaceE), the definition is completely analogous. The only adaptationone has to make is that the elementsY k

i = (yi, . . . , yi+k−1) are notk-vectors, butelements ofMk (or ofEk).


Assuming that our time series is stationary, the probability measures of relative fre-quencies ofk-dimensional reconstruction vectors are calledk-dimensional recon-struction measuresand are denoted byµk. These measures satisfy some obviouscompatibility relations. For time series with values inR these read as follows. Forany subsetA ⊂ Rk:

µk+1(R ×A) = µk(A) = µk+1(A× R);

for any sequence of measuresµk onRk, satisfying these compatibility relations (andan ergodicity property, which is usually satisfied for reconstruction measures), thereis a well defined time series in the sense of stochastic processes. This is explainedin [45], Chapter 2.11.

6.5.3 Examples

We have seen that the existence of probability measures of relative frequencies,and hence the validity of stationarity, can be violated in two ways: one is that theelements of the sequence, or of the time series, under consideration move off toinfinity; the other is that limits of the form

limn→∞

1

n

n−1∑

i=0

ψ(pi)

do not exist. If we restrict to time series generated by a dynamical system withcompact state space, then nothing can move off to infinity, sothen there is only theproblem of existence of the limits of these averages. There is a general belief that forevolutions of generic dynamical systems with generic initial state, the probabilitymeasures of relative frequency are well defined. A mathematical justification of thisbelief has not yet been given. This is one of the challenging problems in the ergodictheory of smooth dynamical systems; see also [173].

There are however non-generic counterexamples. One is the mapϕ(x) = 3x mod-ulo 1, defined on the circleR moduloZ. If we take an appropriate (non-generic)initial point, then we get an evolution which is much like the0-, 1-sequence whichwe gave as an example of a non-stationary time series. The construction is the fol-lowing. As initial state we take pointx0 which has in the ternary system the form0.α1α2 . . . with

- αi = 0 if the integer part ofln(i) is even;

- αi = 1 if the integer part ofln(i) is odd.

6.6 Correlation dimensions and entropies 219

correlation!dimensionscorrelation!entropy

If we consider the partition ofR moduloZ, given by

Ii = [i

3,i+ 1

3), i = 0, 1, 2,

thenϕi(x0) ∈ Iαi; compare this with the Doubling map and symbolic dynamics as

described in Section 1.3.8. BothI0 andI1 have a unique expanding fixed point ofϕ:0 and .5 respectively; long sequences of 0’s or 1’s correspond to long sojourns verynear 0 respectively .5. This implies that for any continuousfunctionψ on the circle,which has different values in the two fixed points 0 and .5, thepartial averages

1

n

n−1∑

i=0

ψ(ϕi(x0))

do not converge forn→ ∞.

We note that for anyϕ, C1-close toϕ, there is a similar initial point leading to anon-stationary orbit. For other examples see [173] and references therein.

6.6 Correlation dimensions and entropies

As observed before, the box counting dimension of an attractor (orω-limit) can beestimated from a time series, but the estimation procedure is not very reliable. Inthis section we shall describe the correlation dimension and entropy, which admitbetter estimation procedures. Correlation dimension and entropy are special mem-bers of 1-parameter families of dimensions and entropies, which we will define in§6.6.1.2. The present notion of dimension (and also of entropy) is only defined fora metric spacewith a Borel probability measure. Roughly speaking, the definitionis based on the probabilities for two randomly and independently chosen points,for a given probability distribution, to be at a distance of at mostε for ε → 0. Inann-dimensional space with probability measure having a positive and continuousdensity, one expects this probability to be of the orderεn: once the first of the twopoints is chosen, the probability that the second point is within distanceε is equalto the measure of theε-neighbourhood of the first point, and this is of the orderεn.

This is formalized in the following definitions, in whichK is a (compact) metricspace, with metricρ and Borel probability measureµ.

Definition 6.4 (Correlation Integral). Theε-correlation integralC(ε) of (K, ρ, µ)is theµ× µ measure of

∆ε = (p, q) ∈ K ×K | ρ(p, q) < ε ⊂ K ×K.


We note that the above definition is equivalent to

C(ε) =

∫

µ(D(x, ε))dµ(x),

whereD(x, ε) is theε-neighbourhood ofx, i.e. it is theµ-average of theµ-measureµ(D(x, ε)) of ε-neighbourhoods.

Definition 6.5 (Correlation Dimension). Thecorrelation dimension, orD2-dimension,of (K, ρ, µ) is

D2(K) = lim supε→0

ln(Cε)

ln ε.

As in the case of the box counting dimension, one can prove that also here the unitcube inRn, with the Lebesgue measure, hasD2-dimensionn, see the exercises.Also, any Borel probability measure onRn has itsD2-dimension at most equal ton. An indication of how to prove this is given at the end of this section where wediscuss the the 1-parameter family of Renyi dimensions of which ourD2-dimensionis a member. For more information on this and other dimensions, in particular inrelation with dynamical systems, we refer to [154].

Next we come to the notion of entropy. This notion quantifies the sensitive depen-dence of evolutions on initial states. So it is related with the notion of ‘dispersionexponent’ as discussed in§§2.3.2 and 2.3.3. For the definition we need a com-pact metric space(K, ρ) with Borel probability measureµ and a (continuous) mapϕ : K → K defining the dynamics. Though not necessary for the definitions whichwe consider here, we note that usually one assumes that the probability measureµis invariant under the mapϕ, i.e. one assumes that for each open setU ⊂ K wehaveµ(U) = µ(ϕ−1(U)). We use the mapϕ to define a family of metricsρn onKin the following way:

ρn(p, q) = maxi=0,...,n−1

ρ(ϕi(p), ϕi(q)).

Soρ1 = ρ and, ifϕ is continuous, then each one of the metricsρn defines the sametopology onK. If moreoverϕ is a map with bounded expansion, in the sense thatthere is a constantC > 1 such thatρ(ϕ(p), ϕ(q)) ≤ Cρ(p, q) for all p, q ∈ K, thenρ(p, q) ≤ ρn(p, q) ≤ Cn−1ρ(p, q). Note that anyC1-map, defined on a compact set,always has bounded expansion.

In this situation we consider theε, n-correlation integralC(n)(ε), which is just theordinary correlation integral, with the metricρ replaced byρn. Then we define theH2-entropyby

H2(K,ϕ) = lim supε→0

lim supn→∞

− ln(C(n)(ε))

n.

Correlation dimensions and entropies 221

Brin-KatokTheorem

Here recall thatC(n)(ε) is theµ-average ofµ(D(n)(x, ε)) and thatD(n)(x, ε) is theε-neighbourhood ofx with respect to the metricρn.

As we shall explain, also this entropy is a member of a 1-parameter family of en-tropies. For this and related definitions, see [190]. In order to see the relationbetween the entropy as defined here and the entropy as originally defined by Kol-mogorov, we recall the Brin-Katok Theorem.

Theorem 6.6 (Brin-Katok [46]). In the above situation withµ a non-atomic Borelprobability measure which is invariant and ergodic with respect toϕ, for µ almostevery pointx the following limit exists and equals the Kolmogorov entropy:

limε→0

limn→∞

− ln(µ(D(n)(x, ε)))

n,

Theorem 6.6 means that the Kolmogorov entropy is roughly equal to the exponentialrate at which the measure of anε-neighbourhood with respect toρn decreases as afunction ofn. This is indeed a measure for the sensitive dependence of evolutionson initial states: it measures the decay, with increasingn, of the fraction of the initialstates whose evolutions stay within distanceε aftern iterations. In our definitionit is however the exponential decay of theaveragemeasure ofε-neighbourhoodswith respect to theρn metric which is taken as entropy. In general the Kolmogoroventropy is greater than or equal to theH2 entropy as we defined it.

6.6.1 Miscellaneous remarks

6.6.1.1 Compatibility of the definitions of dimension and entropy with recon-struction

In the next section we shall consider the problem of estimating dimension and en-tropy, as defined here, based on the information of one time series. One then dealsnot with the original state space, but with the set of reconstruction vectors. Onehas to assume then that the (generic) assumptions in the Reconstruction Theoremare satisfied so that the reconstruction mapReck, for k sufficiently large, is an em-bedding, and hence a map of bounded distortion. Then the dimension and entropyof theω-limit of an orbit (with probability measure defined by relative frequencies)are equal to the dimension and entropy of that same limit setafter reconstruction.This is based on the fact, which is easy to verify, that ifh is a homeomorphismh : K → K ′ between metric spaces(K, ρ) and(K ′, ρ′) such that the quotients

ρ(p, q)

ρ′(h(p), h(q)),


correlation!integral with p 6= q, are uniformly bounded and bounded away from zero, and ifµ, µ′ areBorel probability measures onK, K ′ respectively, such thatµ(U) = µ′(h(U)) foreach openU ⊂ K, then theD2-dimensions ofK andK ′ are equal. If moreoverϕ : K → K defines a dynamical system onK and ifϕ′ = hϕh−1 : K ′ → K ′, thenalso theH2-entropies ofϕ andϕ′ are the same.

In the case where the dynamics is given by an endomorphism instead of a diffeo-morphism, similar results hold, see [188]. If the assumptions in the ReconstructionTheorem are not satisfied, this may lead to underestimation of both dimension andentropy.

6.6.1.2 Generalized correlation integrals, dimensions and entropies

As mentioned before, both dimension and entropy, as we defined it, are membersof a 1-parameter family of dimensions respectively entropies. These notions aredefined in a way which is very similar to the definition of the R´enyi informationof orderq; hence they are referred to as Renyi dimensions and entropies . These1-parameter families can be defined in terms of generalized correlation integrals.The (generalized) correlation integral of orderq of (K, ρ, µ), for q > 1, are definedas:

Cq(ε) = (

∫

(µ(D(x, ε)))q−1dµ(x))1

q−1 ,

whereD(x, ε) is the ε-neighbourhood ofx. We see thatC2(ε) = C(ε). Thisdefinition can even be used forq < 1 but that is not very important for our purposes.For q = 1 a different definition is needed, which can be justified by a continuityargument:

C1(ε) = exp(

∫

ln(µ(D(x, ε)))dµ(x)).

Note that forq < q′ we haveCq ≤ Cq′.

The definitions of the generalized dimensions and entropiesof orderq, denoted byDq andHq, are obtained by replacing in the definitions of dimension and entropythe ordinary correlation integrals by the orderq correlation integrals. This impliesthat forq < q′ we haveDq ≥ Dq′ . We note that it can be shown that the box count-ing dimension equals theD0-dimension; this implies that for any Borel probabilitymeasure onRn (with compact support) theD2 dimension is at mostn.

For a discussion of the literature on these generalized quantities and their estimation,see at the end of Section 6.7.

6.7 Numerical estimation of correlation integrals, dimensions and entropies223

6.7 Numerical estimation of correlation integrals, di-mensions and entropies

We are now in the position to describe how the above theory canbe applied toanalyze a time series which is generated by a deterministic system. As describedbefore, in Section 6.3, the general form of a model generating such time series isa dynamical system, given by a compact manifoldM with a diffeomorphism (orendomorphism)ϕ : M → M , and a read out functionf : M → R. We assumethe generic conditions in the Reconstruction Theorem to be satisfied. A time seriesgenerated by such a system is then determined by the initial statex0 ∈ M . Thecorresponding evolution isxn = ϕn(x0) with time seriesyn = f(xn). It isclear that from such a time series we only can get informationabout this particularorbit, and, in particular, about itsω-limit setω(x0) and the dynamics on that limitset given byϕ | ω(x0). As discussed in the previous section, for the definitionof theD2-dimension and theH2-entropy we need some (ϕ-invariant) probabilitymeasure concentrated on the set to be investigated. Preferably this measure shouldbe related to the dynamics. The only natural choice seems to be the measure ofrelative frequencies, as discussed in§6.5.2, of the evolutionxn, if that measureexists. We now assume that this measure of relative frequencies does exist for theevolution under consideration. We note that this measure ofrelative frequencies isϕ-invariant and that it is concentrated on theω-limit setω(x0).

Given the time seriesyn, we consider the corresponding time series of recon-struction vectorsY k0

n = (yn, . . . , yn+k0−1) for k0 sufficiently large, i.e. so that thesequence of reconstruction vectors and the evolutionxn are metrically equal upto bounded distortion, see§6.3. Then we also have the measure of relative frequen-cies for this sequence of reconstruction vectors. This measure is denoted byµ; itssupport is the limit set

Ω = Y | for someni → ∞, limi→∞

Y k0ni

= Y .

Clearly,Ω = Reck0(ω(x0)) and the mapΦ onΩ, defined byReck0ϕ(Reck0 |ω(x0))−1

is the unique continuous map which sends eachY k0n to Y k0

n+1.

Both dimension and entropy have been defined in terms of correlation integrals. Asobserved before, we may substitute the correlation integrals ofω(x0) by those ofΩ,as long as the transformationReck0, restricted toω(x0), has bounded distortion. Anestimate of a correlation integral ofΩ, based on the firstN reconstruction vectorsis given by:

C(ε,N) =1

12N(N − 1)

#(i, j) | i < j and||Y k0i − Y k0

j || < ε.


It was proved in [86] that this estimate converges to the truevalue asN → ∞, in thestatistical sense. In that paper also information about thevariance of this estimatorwas obtained. In the definition of this estimate, one may takefor ‖ · ‖ the Euclideannorm, but in view of the usage of these estimates for the estimation of the entropy,it is better to take themaximum norm: ‖(z1, . . . , zk0)‖max = maxi=1...k0 |zi|. Thereason is the following. If the distance function onΩ is denoted byd, then, as in theprevious section, when discussing the definition of entropy, we use the metrics

dn(Y, Y ′) = maxi=0,...,n−1

d(Φi(Y ),Φi(Y ′)).

If we now take for the distance functiond onΩ the maximum norm (fork0-dimensionalreconstruction vectors), thendn is the corresponding maximum norm for(k0 +n−1)-dimensional reconstruction vectors. Note also that the transition from the Eu-clidean norm to the maximum norm is a ‘transformation with bounded distortion’,so that it should not influence the dimension estimates. Whennecessary, we ex-press the dimension of the reconstruction vectors used in the estimationC(ε,N)by writing C(k)(ε,N); as in§6.6, the correlation integrals based onk-dimensionalreconstruction vectors are denoted byC(k)(ε).

In order to obtain estimates for theD2-dimension and theH2-entropy of the dy-namics on theω-limit set, one needs to have estimates ofC(k)(ε) for many valuesof bothk andε. A common form to display such information graphically is a dia-gram showing the estimates ofln(C(k)(ε)), or rather the logarithms of the estimatesof C(k)(ε), for the various values ofk, as a function ofln(ε). In figure 6.3 weshow such estimates based on a time series of length 4000 which was generatedwith the Henon system (with the usual parameter valuesa = 1.4 andb = .3). Wediscuss various aspects of this figure and then give references to the literature forthe algorithmic aspects for extracting numerical information from the estimates asdisplayed.

First we observe that we normalized the time series in such a way that the differ-ence between the maximal and minimal value equals 1. Forε we took the values1, .9, .92, . . . , .942 ∼ .011. It is clear that forε = 1 the correlation integrals have tobe 1, independently of the reconstruction dimensionk. SinceC(k)(ε) is decreasingas a function ofk, the various graphs of (the estimated values of)ln(C(k)(ε)) asfunction of ln(ε), are ordered monotonically ink: the higher graphs belong to thelower values ofk. For k = 1 we see thatln(C(1)(ε)), as a function ofln(ε), isvery close to a straight line, at least for values ofε smaller than say .2; the slope ofthis straight line is approximately equal to 1 (N.B.: to see this, one has to rescalethe figure in such a way that a factor of 10 along the horizontaland vertical axiscorrespond to the same length!). This is an indication that the set of 1-dimensionalreconstruction vectors has aD2-dimension equal to 1. This is to be expected: sincethis set is contained in a 1-dimensional space, the dimension is at most 1, and the

Numerical estimation of correlation integrals, dimensions, and entropies 225

10−3

10−2

10−1

100

C(k)(ε)

10−2 10−1 100

ε

Figure 6.3: Estimates for the correlation integrals of the Henon attractor. Fork =1, . . . , 5 the estimates ofC(k)(ε) are given as function ofε, with logarithmic scalealong both axis. (This time series was normalized so that thedifference betweenthe maximal and the minimal values is 1.) Since it follows from the definitions thatC(k)(ε) decreases for increasing values ofk the highest curve corresponds tok = 1and the lower curves tok = 2, . . . , 5 respectively.

dimension of the Henon attractor is, according to numerical evidence, somewhatlarger than 1. Now we consider the curves fork = 2, . . . , 5. Also these curvesapproach straight lines for small values ofε (with fluctuations due, at least in part,to estimation errors). The slope of these lines is somewhat larger than in the caseof k = 1 indicating that the reconstruction vectors, in the dimensionsk > 1, forma set withD2-dimension also somewhat larger than 1. The slope indicatesa dimen-sion around 1.2. Also we see that the slopes of these curves donot seem to dependsignificantly onk for k ≥ 2; this is an indication that this slope corresponds to thetrue dimension of the Henon attractor itself. This agrees with the fact that the recon-struction mapsReck, k > 1, for the Henon system are embeddings (if we use oneof the coordinates as read out function). We point out again that the estimation ofdimensions, as carried out on the basis of estimated correlation integrals, can neverbe rigorously justified: The definition of the dimension is interms of the behaviourof ln(C(k)(ε)) for ε→ 0 and if we have a time series of a given (finite) length, theseestimates become more and more unreliable asε tends to zero because the numberof pairs of reconstruction vectors which are very close willbecome extremely small(except in the trivial case where the time series is periodic). Due to these limitations,one tries to find an interval, a so-calledscaling interval, of ε values in which theestimated values ofln(C(k)(ε)) are linear, up to small fluctuations, inln(ε). Thenthe dimension estimate is based on the slope of the estimatedln(C(k)(ε)) vs ln(ε)


curves in that interval. The justifications for such restrictions are that for

- large values ofε no linear dependence is to be expected because correlationintegrals are by definition at most equal to 1;

- small values ofε, the estimation errors become too large;

- many systems that we know there is in good approximation such a lineardependence ofln(C(k)(ε)) on ln(ε) for intermediate values ofε.

So the procedure is just based on an extrapolation to smallervalues ofε. In thecase of time series generated by some (physical) experimentone also has to takeinto account that always small fluctuations are present due to thermal fluctuationsor other noise; this provides another argument for disregarding values ofε whichare too small.

An estimate for theH2-entropy also can be obtained from the estimated correlationintegrals. The information in Figure 6.3 suggests also herehow to proceed. Weobserve that the parallel lines formed by the estimatedln(C(k)(ε)) vs ln(ε) curves,for k ≥ 2, are not only parallel, but also approximately at equal distances. This isan indication that the differences

ln(C(k)(ε)) − ln(C(k+1)(ε))

are approximately independent ofε andk, at least for small values ofε, here smallerthan say .2, and large values ofk, here larger than 1. Assuming that this constancyholds in good approximation, this difference should be equal to theH2-entropy asdefined before.

We note that this behaviour of the estimatedln(C(k)(ε)) vs ln(ε) curves, namelythat they approach equidistant parallel lines for small values ofε and large valuesof k, turns out to occur quite generally for time series generated by deterministicsystems. It may however happen that, in order to reach the ‘good’ ε andk valueswith reliable estimates of the correlation integrals, one has to take extremely longtime series (with extremely small noise). For a quantification of these limitations,see [172].

This way of estimating the dimension and entropy was proposed in [186] and ap-plied by [99, 100]. There is an extensive literature on the question how to improvethese estimations. We refer to the (survey) publications and proceedings [94, 193,82, 196, 83], and to the textbook [120]. As we mentioned above, theD2-dimensionandH2-entropy are both members of families of dimensions and entropies respec-tively. The theory of these generalised dimensions and entropies and their estima-tion was considered in a number of publications; see [109, 36, 37, 104, 193, 191]and [198] as well as the references in these works.

6.8 Classical time series analysis, correlation integrals, and predictability 227

6.8 Classical time series analysis, correlation integrals,and predictability

What we here consider as classical time series analysis is the analysis of time seriesin terms of autocovariances and power spectra. This is a rather restricted view,but still it makes sense to compare that type of analysis withthe analysis in termsof correlation integrals which we discussed sofar. In§6.8.1 we summarize thisclassical time series analysis in so far as we will need it; for more informationsee, e.g., [162]. Then, in§6.8.2 we show that the correlation integrals provideinformation which cannot be extracted from the classical analysis, in particular, theautocovariances cannot be used to discriminate between time series generated bydeterministic systems or and by stochastic systems. Finally in §6.8.3 we discusspredictability in terms of correlation integrals.

6.8.1 Classical time series analysis

We consider a stationary time seriesyi. Without loss of generality we may assumethat the average of this time series is 0. Thekth auto covariance is defined as theaverageRk = averagei(yi · yi+k). Note thatR0 is called the variance of the timeseries. From the definition it follows that for eachk we haveRk = R−k.

We shall not make use of the power spectrum but, since it is used very much asa means to give graphical information about a signal (or timeseries) we just givea brief description of its meaning and its relation with the autocovariances. Thepower spectrum gives information about how strong each (angular) frequencyω isrepresented in the signal:P (ω) is the energy, or squared amplitude, of the (angular)frequencyω. Note that for a discrete time series, which we are discussing here,these frequencies have only values in[−π, π] and thatP (ω) = P (−ω). The formaldefinition is somewhat complicated due to possible difficulties with convergence: ingeneral the power spectrum only exists as a generalized function, in fact as a mea-sure. It can be shown that the collection of autocovariancesis the Fourier transformof the power spectrum:

Rk =

∫ π

−π

eiωkP (ω)dω.

This means that not only the autocovariances are determinedby the power spectrum,but that also, by the inverse Fourier transform, the power spectrum is determined bythe autocovariances. So the information, which this classical theory can provide, isexactly the information which is contained in the autocovariances.

We note that under very general assumptions the autocovariancesRk tend to zero ask tends to infinity. (The most important exceptions are (quasi)-periodic time series.)This is what we assume from now on. If the autocovariances converge sufficiently


fast to zero, then the power spectrum will be continuous, smooth or even analytic.For time series, generated by a deterministic system, the autocovariances convergeto zero if the system is mixing, see [34].

6.8.1.1 Optimal linear predictors

For a time series as discussed above, one can construct the optimal linear predictorof orderk. This is given by coefficientsα(k)

1 , . . . , α(k)k ; it makes a prediction of the

nth element as alinear combination of thek preceding elements:

y(k)n = α

(k)1 yn−1 + · · · + α

(k)k yn−k.

The valuesα(k)1 , . . . , α

(k)k are chosen in such a way that the average valueσ2

k ofthe squares of the prediction errors(y

(k)n − yn)2 is minimal. This last condition

explains why we call this anoptimal linear predictor. It can be shown that theautocovariancesR0, . . . , Rk determine, and are determined by,α(k)

1 , . . . , α(k)k and

σ2k. Also if we generate a time serieszi by using the autoregressive model (or

system):zn = α

(k)1 zn−1 + · · · + α

(k)k + εn,

where the values ofεn are independently chosen from a probability distribution,usually a normal distribution, with zero mean and varianceσ2

k, then the autocovari-ancesRi of this new time series satisfy, with probability 1,Ri = Ri for i = 0, . . . , k.This new time series, withεn taken from a normal distribution, can be consideredas the ‘most unpredictable’ time series with the given firstk + 1 autocovariances.By performing the above construction with increasing values ofk one obtains timeseries which are, as far as power spectrum or autocovariances are concerned, con-verging to the original time seriesyi.

6.8.1.2 Gaussian timeseries

We note that the analysis in terms of power spectra and autocovariances is in partic-ular useful forGaussian time series. These time series are characterized by the factthat their reconstruction measures are normal; one can showthat then:

µk = |Bk|−1/2(2π)−k/2e−〈x,B−1k

x〉/2dx,

whereBk is a symmetric and invertible matrix which can be expressed in terms ofR0, . . . , Rk−1:

Bk =

R0 R1 · · Rk−1

R1 R0 · · Rk−2

· · · · ·· · · · ·Rk−1 Rk−2 · · R0

.

Classical time series analysis, correlation integrals andpredictability 229

(For an exceptional situation where these formulae are not valid see the footnote in§6.8.2.) So these times series are, as far as reconstruction measures are concerned,completely determined by their autocovariances. Such timeseries are generatede.g. by the above type of autoregressive systems whenever the εn’s are normallydistributed.

6.8.2 Determinism and auto-covariances

We have discussed the analysis of time series in terms of correlation integrals in thecontext of time series which are generated by deterministicsystems, i.e., systemswith the property that the state at a given time determines all future states and forwhich the space of all possible states is compact and has a low, or at least finite,dimension. We call such time series ‘deterministic’. The Gaussian time seriesbelong to a completely different class. These time series are prototypes of what wecall ‘random’ time series. One can use the analysis in terms of correlation integralsto distinguish between deterministic and Gaussian time series in the following way.

If one tries to estimate numerically the dimension of an underlying attractor (orω-limit) as in §6.7, one makes estimates of the dimension of the set ofk-dimensionalreconstruction vectors for several values ofk. If one applies this algorithm to aGaussian time series, then one should find that the set ofk-dimensional reconstruc-tion vectors has dimensionk for all k. This contrasts with the case of deterministictime series where these dimensions are bounded by a finite number, namely thedimension of the relevant attractor (orω-limit). Also if one tries to estimate theentropy of a Gaussian time series by estimating the limits

H2(ε) = limk→∞

(ln(C(k)(ε)) − ln(C(k+1)(ε))),

one finds that this quantity tends to infinite asε tends to zero, even for a fixed valueof k. This contrasts with the deterministic case, where this quantity has a finitelimit, at least if the map defining the time evolution has bounded expansion.

It should be clear that the above arguments are valid for a much larger class ofnon-deterministic time series than just the Gaussian ones.

Next we want to argue that the autocovariances cannot be usedto make the dis-tinction between random and deterministic systems. Suppose that we are given atime series (of which we still assume that the average is 0) with autocovariancesR0, . . . , Rk. Then such a time series, i.e., a time series with these same first k + 1autocovariances, can be generated by an autoregressive model, which is of courserandom1. On the other hand, one can also generate a time series with the same au-tocovariances by a deterministic system. This can be done bytaking in the autore-gressive model for the valuesεn not random (and independent) values from some

1There is a degenerate exception where the variance of the optimal linear predictor of orderk is


probability distribution, but values generated by a suitable deterministic systems.Though deterministic,εn should still have (i) average equal to zero, (ii) varianceequal to the variance of the prediction errors of the optimallinear predictor of orderk, and (iii) all the other autocovariances equal to zero, i.e., Ri = 0 for i > 0. Theseconditions namely imply that the linear theory cannot ‘see’the difference betweensuch a deterministic time series and a truly random time series of independent andidentically distributed noise. The first two of the above conditions are easy to sat-isfy: one just has to apply an appropriate affine transformation to all elements of thetime series.

In order to make an explicit example where all these conditions are satisfied, westart with thetent map, e.g., see [5]. This is a dynamical system given by the mapϕ on [−1

2 ,12 ]:

ϕ(x) = +2x+ 12 for x ∈ [−1

2 , 0)

ϕ(x) = −2x+ 12 for x ∈ [0, 1

2).

It is well known that for almost any initial pointx ∈ [−12 , halfje], in the sense of

Lebesgue, the measure defined by the relative frequencies ofxn = ϕn(x), is justthe Lebesgue measure on[−1

2 ,+12 ]. We show that such a time series satisfies (up to

scalar multiplication) the above conditions. For this we have to evaluate

Ri =

∫ +12

−12

xϕi(x) dx.

Since fori > 0 we have thatϕi(x) is an even function ofx, this integral is 0. Thevariance of this time series is112 . So in order to make the variance equal toσ2 onehas to transform the state space, i.e. the domain ofϕ, to [−

√3σ2,+

√3σ2]. We

denote this rescaled evolution map byϕσ2 . Now we can write down explicitly the(deterministic) system which generates time series with the same autocovariancesas the time series generated by the autoregressive model:

yn = α(k)1 yn−1 + · · ·+ α

(k)k yn−k + εn,

where theεn are independent samples of a distribution with mean 0 and varianceσ2. The dynamics is given by:

u1 7→k∑

i=1

α(k)i ui + x

zero. In that case the time series is multi-periodic and has the form

yn =∑

i≤s

(ai sin(ωin) + bi cos(ωin)),

with 2s ≤ k + 1. Then also the matricesBk, see section 8.1, are not invertible.


predictabilityu2 7→ u1

u3 7→ u2

.. .. ..

uk 7→ uk−1

x 7→ ϕσ2(x).

The time series consists of the successive values ofu1, i.e., the read out function isgiven byf(u1, . . . , uk, x) = u1.

We note that, instead of the tent map, we could have used the Logistic mapx 7→1 − 2x2 for x ∈ [−1, 1].

We produced this argument in detail because it proves that the correlation inte-grals, which can be used to detect the difference between time series generated bydeterministic systems and Gaussian time series, contain information which is notcontained in the autocovariances, and hence also not in the power spectrum. Weshould expect that deterministic time series are much more predictable than random(or Gaussian) time series with the same autocovariances. Inthe next subsection wegive an indication how the correlation integrals give an indication about the pre-dictability of a time series.

6.8.3 Predictability and correlation integrals

The prediction problem for chaotic time series has attracted a lot of attention. Thefollowing proceedings volumes were devoted, at least in part, to this: [196, 72, 205]and and [195].

The prediction methods proposed in these papers are all strongly using the station-arity of the time series: the generating model is supposed tobe unknown, so theonly way to make predictions about the future is to use the behaviour in the past,assuming (by stationarity) that the future behaviour will be like the behaviour in thepast. Roughly speaking there are two ways to proceed.

1. One can try to estimate a (nonlinear) model for the system generating the timeseries and then use this model to generate predictions. The disadvantage ofthis method is that estimating such a model is problematic. This is, amongstother things, due to the fact that the orbit, corresponding to the time serieswhich we know, explores only a (small) part of the state spaceof the systemso that only a (small) part of the evolution equation can be known.

2. For each prediciton, one can compare the most recent values of the time serieswith values in the past: if one finds in the past a segment whichis very closeto the segment of the most recent values, one predicts that the continuation in


“textitl’histoire ser“’ep“‘ete

the future will be like the continuation in the past (we give amore detaileddiscussion below). In Section 2.3, this principle was coined asl’histoire serepete. The disadvantage is that foreachprediction one has to inspect thewholepast, which can be very time consuming.

6.8.3.1 l’Histoire se repete

Though also the second method has its disadvantages it is less ad hoc than the firstone and hence more suitable for a theoretical investigation. So we shall analyzethe second method and show how the correlation integrals, and in particular theestimates of dimension and entropy, give indications aboutthe predictability.

Note that in this latter procedure there are two ways in whichwe use the past: wehave the ‘compete known past’ to obtain information about the statistical properties,i.e., the reconstruction measures, of the time series; using these statistical properties,we base our prediction of the next value on a short sequence ofthe most recentvalues. Recall also our discussion of the dripping faucet in§6.2.

In order to discuss the predictability in terms of this second procedure in more de-tail, we assume that we know a finite, but long, segmenty0, . . . , yN of a stationarytime series. The question is how to predict the next valueyN+1 of the time series.For this procedure, in its simplest form, we have to choose anintegerk and a pos-itive real numberε. We come back to the problem how to choose these values.(Roughly speaking, thek most recent valuesyN−k+1, . . . , yN should contain therelevant information as far as predicting the future is concerned, given the statisticalinformation about the time series which can be extracted from the whole past;εshould be of the order of the prediction error which we are willing to accept.) Wenow inspect the past, searching for a segmentym, . . . , ym+k−1 which is ε-close tothe lastk values, in the sense that|ym+i − yN−k+1+i| < ε for i = 0, . . . , k − 1. Ifthere is no such segment, we should try again with a smaller value of k or a largervalue ofε. If there is such a segment, then we predict the next valueyN+1 to beyN+1 = ym+k.

This is about the most primitive way to use the past to predictthe future and laterwe will discuss an improvement; see the discussion below of local linear predictors.For the moment we concentrate on this prediction procedure and try to find outhow to assess the reliability of the prediction. We shall assume, for simplicity,that a prediction which is correct within an error of at mostε is considered to becorrect; in the case of a larger error, the prediction is considered to be incorrect.Now the question is: what is the probability of the prediction to be correct? Thisprobability can be given in terms of correlation integrals.Namely the probabilitythat two segments of lengthk, respectivelyk+1, are equal up to an error of at mostε isC(k)(ε), respectivelyC(k+1)(ε). So the probability that two segments of length


k + 1, given that the firstk elements are equal up to an error of at mostε, have alsotheir last elements equal up to an error of at mostε, isC(k+1)(ε)/C(k)(ε). Note thatthis quotient decreases in general with increasing values of k. So in order to getgood predictions we should takek so large that this quotient has essentially its limitvalue. This is the main restriction on how to takek, which we want otherwise to beas small as possible.We have already met this quotient, or rather its logarithm, in the context of theestimation of entropy. In§6.6 we defined

H2(ε) = limk→∞

(ln(C(k)(ε)) − ln(C(k+1)(ε))) = limk→∞

− ln

(

C(k+1)(ε)

C(k)(ε)

)

.

So, assuming thatk is chosen as above, the probability of giving a correct predictionis e−H2(ε). This means thatH2(ε) and hence alsoH2 = limε→0H2(ε) is a measureof theunpredictabilityof the time series.

Also the estimate for the dimension gives an indication about the (un)predictabilityin the following way. Suppose we have fixed the value ofk in the prediction pro-cedure so thatC

(k+1)(ε)

C(k)(ε)is essentially at its limiting value (assuming for simplicity

that this value ofk is independent ofε). Then, in order to have a reasonable prob-ability of finding in the past a sequence of lengthk which agrees sufficiently withthe sequence of the lastk values, we should have that1/N , the inverse length ofthe time series, is small compared withC(k)(ε). This is the main limitation whenchoosingε, which we otherwise want to be as small as possible. The dependenceof C(k)(ε) on ε is, in the limiting situation, i.e. for smallε and sufficiently largek,proportional toεD2, so in the case of a high dimension, we cannot takeε very smalland hence we cannot expect predictions to be very accurate. To be more precise, ifwe want to improve the accuracy of our predictions by a factor2, then we expect toneed a segment of past values which is2D2 times longer.

6.8.3.2 Local linear predictors

As we mentioned before, the above principlel’histoire se repeteto predict the fu-ture is rather primitive. We here briefly discuss a refinementwhich is based on acombination of optimal linear prediction and the above principle. This type of pre-diction was the subject of [71] which appeared in [196]. In this case we start inthe same way as above, but now we collectall the segments from the past whichareε-close to the last segment ofk elements. Letm1, . . . , ms be the first indices ofthese segments. We have thens different values, namelyym1+k, . . . , yms+k, whichwe can use as prediction foryN+1. This collection of possible predictions alreadygives a better idea of the variance of the possible prediction errors.We can however go further. Assuming that there should be somefunctional depen-denceyn = F (yn−1, . . . , yn−k) with differentiableF (and such an assumption is


justified by the Reconstruction Theorem if we have a time series which is gener-ated by a smooth and deterministic dynamical system and ifk is sufficiently large),thenF admits locally a good linear, or rather affine, approximation (given by itsderivative). In the case thats, the number of nearby segments of lengthk, is suffi-ciently large, one can estimate such a linear approximation. This means estimatingthe constantsα0, . . . , αk such that the variance of(ymi+k − ymi+k) is minimal,whereymi+k = α0 + α1ymi+k−1 + · · ·+ αkymi

. The determination of the constantsα0, . . . , αk is done by linear regression. The estimation of the next value is thenyN+1 = α0 + α1yN + · · · + αkyN−k+1. This means that we use essentially a lin-ear predictor, which is however only based on ‘nearby segments’ (the fact that wehave here a termα0, which was absent in the discussion of optimal linear predic-tors, comes from the fact that here there is no analogue of theassumption that theaverage is zero).

Finally we note that if nothing is known about how a (stationary) time series isgenerated, it is not a priori clear that this method of local linear estimation will givebetter results. Also a proper choice ofk andε is less obvious in that case; see [71].

6.9 Miscellaneous subjects

In this concluding section we consider two additional methods for the analysis oftime series which are of a somewhat different nature. Both methods can be moti-vated by the fact that the methods which we discussed up to nowcannot be used todetect the difference between a time series and its time reversal. Thetime reversalof a time seriesyii∈Z is the time seriesyi = y−ii∈Z (for finite segments of atime series the definition is similar). It is indeed obvious from the definitions thatautocorrelations and correlation integrals do not change if we reverse time. Stillwe can show with a simple example that a time series may becomemuch moreunpredictable by reversing the time — this also indicates that the predictability, asmeasured by the entropy, is a somewhat defective notion.

Our example is a time series generated by the tent map (using ageneric initial point);for the definition see Section 6.8.2. If we know the initial state of this system, wecan predict the future, but not the past. This is a first indication that time reversalhas drastic consequences in this case. We make this more explicit. If we know theinitial state of the tent map up to an errorε, then we know the next state up to anerror2ε, so that a prediction of the next state has a probability of 50% to be correctif we allow for an error of at mostε. (This holds only for smallε and even then tofirst approximation due to the effect of boundary points and the ‘turning point’ atx = 0.) For predicting the previous state, which corresponds to ordinary predictionfor the time series with the time reversed, we also expect a probability of 50 %of a correct prediction, i.e., with an error not more thanε, but in this latter case

Miscellaneous subjects 235

Lyapunovexponents

the prediction error is in general much larger: the inverse image of anε-intervalconsists in general of two intervals of length1

2ε at a distance which is in averageapproximately1

2 .

In the next subsection we discuss theLyapunov exponentsand their estimation fromtime series. They give another indiction for the (un)predictability of a time series;these Lyapunov exponents are not invariant under time reversal.

In the second subsection we consider the(smoothed) mixed correlation integralswhich were originally introduced in order to detect the possible differences betweena time series and its time reversal. It can however be used much more generally asa means to test whether two time series have the same statistical behaviour or not.

6.9.1 Lyapunov exponents

A good survey on Lypunov exponents is the third chapter of [94] to which we referfor details. We give here only a heuristic definition of theseexponents and a sketchtheir numerical estimation.

Consider a dynamical system given by a differentiable mapf : M → M , whereM is a compactm-dimensional manifold. For an initial statex ∈ M the Lyapunovexponents describe how evolutions, starting infinitesimally close tox diverge fromthe evolution ofx. For these Lyapunov exponents to be defined, we have to assumethat there are subspaces of the tangent space atx: Tx = E1(x) ⊃ E2(x) ⊃ . . . ⊃Es(x) and real numbersλ1(x) > λ2(x) > . . . > λs(x) such that for a vectorv ∈ Tx

the following are equivalent:

limn→∞

1

nln(||dfn(v)||) = λi ⇐⇒ v ∈ (Ei(x) −Ei+1(x)).

The norm‖·‖ is taken with respect to some Riemannian metric onM . The existenceof these subspaces (and hence the convergence of these limits) for generic initialstates is the subject Oseledets’ Multiplicative Ergodic Theorem, see [148] and [94].Here generic doesnot mean belonging to an open and dense subset, but belongingto a subset which has full measure, with respect to somef -invariant measure. Thenumbersλ1, . . . , λs are called the Lyapunov exponents atx. They are to a largeextend independent ofx: in general they only depend on the attractor to which theorbit throughx converges. In the notation we shall drop the dependence of theLyapunov exponents on the pointx. We say that the Lyapunov exponentλi hasmultiplicity k if

dim(Ei(x)) − dim(Ei+1(x)) = k.

Often each Lyapunov exponent with multiplicityk > 1 is repeatedk times, so thatwe have exactlym Lyapunov exponents withλ1 ≥ λ2 ≥ . . . ≥ λm.


When we know the mapf and its derivative explicitly, we can estimate the Lya-punov exponents using the fact that for a generic basisv1, . . . , vm of Tx and foreachs = 1, . . . , m

s∑

i=1

λi = limn→∞

1

nln(||dfn(v1, . . . , vs)||s),

where‖dfn(v1, . . . , vs)‖s denotes thes-dimensional volume of the parallelepipedspanned by thes vectors dfn(v1), . . . , dfn(vs).

It should be clear that in the case of sensitive dependence oninitial states one ex-pects the first (and largest) Lyapunov exponentλ1 to be positive. There is an inter-esting conjecture relating Lyapunov exponents to the dimension of an attractor, theKaplan-Yorke conjecture [121], which can be formulated as follows. Consider thefunctionL, defined of the interval[0, m] such thatL(s) =

∑si=0 λi for integerss

(andL(0) = 0) and which interpolates linearly on the intervals in between. SinceLyapunov exponents are by convention non-increasing as function of their index,this function is concave. We define the Kaplan-Yorke dimensionDKY as the largestvalue oft ∈ [0, m] for whichL(t) ≥ 0. The conjecture claims thatDKY = D1,whereD1 refers to theD1-dimension, as defined in Section 6.6, of the attractor (orω-limit) to which the evolution in question converges. For some 2-dimensional sys-tems this conjecture was proven, see [127]; in higher dimensions there are counterexamples, but so far nopersistentcounter examples have been found, so that it mightstill be true generically. Another important relation in terms of Lyapunov exponentsis that, under ‘weak’ hypotheses (for details see [94] or [127]), the Kolmogorov (orH1) entropy of anattractor is the sum of its positive Lyapunov exponents.

6.9.2 Estimation of Lyapunov exponents from a time series

The main method of estimating Lyapunov exponents, from timeseries of a deter-ministic system, is due to Eckmann et al [93]; for referencesto later improvements,see [149].

This method makes use of the reconstructed orbit. The main problem here is toobtain estimates of the derivatives of the mapping (in the space of reconstructionvectors). This derivative is estimated in a way which is verysimilar to the construc-tion of the local linear predictors, see Section 6.8.3. Working with ak-dimensionalreconstruction, and assuming thatReck is an embedding from the state spaceMinto Rk, we try to estimate the derivative at each reconstruction vector Y k

i by col-lecting a sufficiently large set of reconstruction vectorsY k

j(i,s)1≤s≤s(i) which areclose toY k

i . Then we determine, using a least square method, a linear mapAi whichsends eachY k

j(i,s) − Y ki or Y k

j(i,s)+1 − Y ki+1 plus a small errorεi,s (of course so that

Miscellaneous subjects 237

Kantz-Diks testthe sum the squares of theseεi,s is minimized). These mapsAi are used as esti-mates of the derivative of the dynamics atY k

i . The Lyapunov exponents are thenestimated, based on an analysis of the growth of the various vectors under the mapsANAN−1 . . . A1. A main problem with this method, and also with its later improve-ments, is that the linear maps which we are estimating are toohigh dimensional: weshould use the derivative of the map onXk defined byF = Reck f Rec−1

k , withXk = Reck(M), while we are estimating the derivative of a (non defined) maponthe higher dimensionalRk. In fact the situation is even worse: the reconstructionvectors only give information on the mapF restricted to the reconstruction of oneorbit (and its limit set) which usually is of even lower dimension. Still, the methodoften gives reproducible results, at least as far as the largest Lyapunov exponent(s)is (are) concerned.

6.9.3 The Kantz-Diks test — discriminating between time seriesand testing for reversibility

The main idea of the Kantz-Diks test can be described in a context which is moregeneral than that of time series. We assume to have two samples of vectors inRk, namelyX1, . . . , Xs and Y1, . . . , Yt. We assume that theX-vectors arechosen independently from a distributionPX and theY -vectors independently froma distributionPY . The question is whether these samples indicate in a statisticallysignificant way that the distributionsPX andPY are different. The original idea ofKantz was to use the correlation integrals in combination with a ‘mixed correlationintegral’ [119]. We denote the correlation integrals of thedistributionsPX andPY

respectively byCX(ε) andCY (ε) — they are as defined in Section 6.6. The mixedcorrelation integralCXY (ε) is defined as the probability that the distance betweenaPX-random vector and aPY -random vector is component wise smaller thanε. Ifthe distributionsPX andPY are the same, then these three correlation integrals areequal. If not, one expects the mixed correlation integral tobe smaller (that need notbe the case, but this problem is removed by using the smoothedcorrelation integralswhich we define below). So if an estimate ofCX(ε) + CY (ε) − 2CXY (ε) differs ina significant way from zero, this is an indication thatPX andPY are different.

A refinement of this method was given in [90]. It is based on thenotion ofsmoothedcorrelation integral. In order to explain this notion, we first observe that the cor-relation integralCX(ε) can be defined as the expectation value, with respect to themeasurePX × PX , of the functionHε(v, w) onRk × Rk which is 1 if the vectorsvandw differ, component wise, less thanε and 0 otherwise. In the definition of thesmoothed integralSε(X), the only difference is that the functionHε is replaced by

Gε(v, w) = e−‖v−w‖2

ε2


where‖ · ‖ denotes the Euclidean distance. The smoothed correlation integralsSY (ε) andSXY (ε) are similarly defined. This modification has the following con-sequences:

- The quantitySX(ε) + SY (ε) − 2SXY (ε) is always non negative;

- SX(ε) +SY (ε)− 2SXY (ε) = 0 if and only if the distributionsPX andPY areequal.

The estimation ofSX(ε) + SY (ε) − 2SXY (ε) as a test for the distributionPX andPY to be equal or not can be used for various purposes. In the present contextthis is used to test whether two (finite segments of) time series show a significantlydifferent behaviour by applying it to the distributions of their reconstruction vectors.One has to be aware however of the fact that these vectors cannot be considered asstatistically independent; this is discussed in [89].

Originally this method was proposed as a test for ‘reversibility’, i.e. as a test whethera time series and its time reversal are significantly different. Another application isa test for stationarity. This is used when monitoring a process in order to obtain awarning whenever the dynamics changes. Here one uses a segment during whichthe dynamics is as desired. Then one collects, say every 10 minutes, a segmentand applies the test to see whether there is a significant difference between the lastsegment and the standard segment. See [147] for an application of this method inchemical engineering.

Finally we note that for Gaussian time series thek-dimensional reconstruction mea-sure, and hence the correlation integrals up to orderk, are completely determinedby the autocovariancesρ0, · · · , ρk−1. There is however no explicit formula for cor-relation integrals in terms of these autocovariances. For the smoothed correlationintegrals such formulas are given in [189] where also the dimensions and entropiesin terms of smoothed correlation integrals are discussed.

6.10 Exercises

Exercise 6.1 (WhenReck is a diffeomorphism). We consider the Henon system,see§1.3.2.; as a read out function we take one of the coordinate functions. Showthat in this caseReck is a diffeomorphism fork ≥ 2.

Exercise 6.2 (Reconstructing a linear system).We consider a dynamical systemwith state spaceRk given by a linear mapL : Rk → Rk; we take a linear read outmapχ : Rk → R. What are the necessary and sufficient conditions onL andχ forthe conclusion of the Reconstruction Theorem to hold? Verify that these conditionsdetermine an open and dense subset in the product space of thespace of linear mapsin Rk and the space of linear functions onRk.

6.10 Exercises 239

Exercise 6.3 (Reconstruction for circle maps I).We consider a dynamical systemwith state spaceS1 = R/2πZ given by the mapϕ(x) = x+ a mod2πZ.

1. Show that fora = 0 or π mod2π and for any functionf onS1 the conclusionof the the Reconstruction Theorem does not hold for anyk.

2. Show that fora 6= 0 or π mod2π andf(x) = sin(x), the mapReck is anembedding fork ≥ 2.

Exercise 6.4 (Reconstruction for circle maps II).We consider a dynamical sys-tem, again with state spaceS1 but now with mapϕ(x) = 2x mod2π. Show that forany (continuous)f : S1 → R the mapRecSk is not an embedding for anyk.

Exercise 6.5 (Reconstruction in practice).Here the purpose is to see how recon-struction works in practice. The idea is to use a program likemathematica,maple, ormatlab to experiment with a time series from the Lorenz attractor.

1. Generate a (long) solution of the Lorenz equations. It should reasonably ‘fill’the butterfly shape of the Lorenz attractor.

We obtain a time series by taking only one coordinate, as a function of time,of this solution.

2. Make reconstructions, in dimension two (so that it can easily be representedgraphically), using different delay times.

One should notice that very short and very lang delay times give bad figures (why?).For intermediate delay times one can recognize the butterflyshape.

Exercise 6.6 (Equivalent box counting dimensions).Prove the inequalities (con-cerningaε, bε, andcε) needed to show that the various definitions of the box count-ing dimension are equivalent.

Exercise 6.7 (Box counting dimension of compact subsetRn). Prove that for anycompact subsetK ⊂ Rn the box counting dimension satisfiesd(K) ≤ n.

Exercise 6.8 (Box counting dimension of Cantor set).Prove that the box countingdimension of the mid-α Cantor set is

ln 2

ln 2 − ln(1 − α).

Exercise 6.9 (Non-existence of a limit).Prove that the limit of∑n

i=0 yi, forn→ ∞does not exist foryi = 0 respectively1 if the integer part ofln(i) is even, respec-tively odd.


Exercise 6.10 (On averages).In S1 = R/Z we consider the sequencepi = iα withα irrational. Show that for each continuous functionf : S1 → R we have

limN→∞

1

N

N−1∑

i=0

f(pi) =

∫

S1

f.

Hint: see [34].

Exercise 6.11 (Reconstruction measure for Doubling map).We consider theDoubling mapϕ on S1 = R/Z defined byϕ(x) = 2x modZ. Show that for apositive orbit of this system we can have, depending on the initial point:

1. The orbit has a reconstruction measure which is concentrated onk pointswherek is any positive integer;

2. The reconstruction measure of the orbit is the Lebesgue measure onS1;

3. The reconstruction measure is not defined.

Are there other possibilities? Which of the three possibilities occurs with the ‘great-est probability’?

Exercise 6.12 (Correlation dimension fitting).LetK ⊂ Rn be the unit cube withLebesgue measure and Euclidian metric. Show that the correlation dimension ofthis cube isn.

Exercise 6.13 (OnD2-dimension I). LetK = [0, 1] ⊂ R with the usual metric andwith the probability measureρ defined by:

- ρ(1) = 12;

- ρ([a, b)) = 12(b− a) .

Show thatD2(K) = 0.

Exercise 6.14 (OnD2-dimension II). Let (Ki, ρi, µi), for i = 1, 2, be compactspaces with metricρi, probability measureµi, andD2-dimensiondi.

On the disjoint union ofK1 andK2 we take a metricρ with ρ(x, y) equal to:

ρ(x, y) =

ρi(x, y) if both x, y ∈ Ki;1 if x, y belong to differentK1 andK2.

The measureµ onK1 ∪K2 is such that for any subsetA ⊂ Ki, µ(A) = 12µi(A).

Show that theD2-dimension of this disjoint union satisfiesd = mind1, d2.

6.10 Exercises 241

On the productK1 ×K2 we take the product metric, defined by

ρp((x1, x2), (y1, y2)) = maxρ1(x1, x2), ρ2(y1, y2),

and the product measureµp defined by

µp(A× B) = µ1(A)µ2(B)

for A ⊂ K1 andB ⊂ K2. Show that theD2-dimension ofK1 ×K2 is at most equalto d1 + d2.

Exercise 6.15.We consider the ‘mid-third’ Cantor setK with the usual metric(induced from the metric in[0, 1] ⊂ R). In order to define the measureµα,β, withα, β > 0 andα + β = 1 on this set we recall the construction of this Cantor set asintersectionK = ∩Ki with K0 = [0, 1], so thatKi contains2i intervals of length3−i. for each interval ofKi, the two intervals ofKi+1 contained in it have measureα times, respectivelyβ times, the measure of the interval ofKi. SoKi has i!

k!(i−k)!

intervals of measureαkβi−k.

1. Prove thatCq(3−n) = (αq + βq)

nq−1 .

2. Compute theDq-dimensions of these Cantor sets.

3. The mapϕ : K → K is defined byϕ(x) = 3x modZ. Compute theHq

entropies of this map.

Exercise 6.16 (D2-dimension and entropy in practice I). Write programs, inmatlab, mathematica, or maple to estimate, from a given time series theD2-dimension and -entropy; see also the next exercise.

Exercise 6.17 (D2-dimension and entropy in practice II). The idea of this exer-cise is to speed up the computation of (estimates of) correlation integrals.

According to the definition for the calculation ofC(ε,N) one has to check for all12N(N − 1) pairs1 ≤ i < j ≤ N whether the distance between the correspondingreconstruction vectors is smaller thanε. Since the number of such pairs is usuallyvery large, it might be attractive to only use a smaller number of such pairs; thesecan be selected using the random generator.

Generate a time series of length 4000 from the Henon attractor. EstimateC(2)(0.2)usingm random pairs1 ≤ i < j ≤ N for various values ofm and try to determinefor which value ofm we get an estimate which is as reliable as using all pairs.

The fact that in general one can takem as above much smaller than12N(N − 1) is

to be expected on the basis of the theory ofU-statistics, see [85, 86].


Exercise 6.18 (Reconstruction measure of Gaussian process). Prove the state-ment in§6.8.2 that the set ofk-dimensional reconstruction vectors of a time seriesgenerated by a (non deterministic) Gaussian process define areconstruction mea-sure which hasD2-dimensionk.

Exercise 6.19 (ε Entropies of Guassian process).Prove that for a time series gen-erated by a (non deterministic) Gaussian process theε entropiesH2(ε) diverge to∞ for ε→ 0.

Appendix A

Diffential Topology and MeasureTheory

A.1 Introduction

In this appendix we discuss notions from general topology, differential topologyand measure theory which were used in the previous chapters.In general proofs areomitted. We refer the reader for proofs and a systematic treatment of these subjectsto:

- [122] J.L. Kelley, General topology, Van Nostrand, 1955; newer title?

- [112] M.W. Hirsch, Differential topology, Springer-Verlag, 1976;

- [150] J.C. Oxtoby, Measure and Category, Springer-Verlag, 1971.

A.2 Topology

Topological space. A topological space consists of a setX together with a col-lectionO of subsets ofX such that

- ∅ andX are inO;

- O is closed under the formation of arbitrary unions and finite intersections.

The elements inO are calledopen sets; the complement of an open set is calledclosed. A basefor such a topology is a subsetB ⊂ O such that all open sets canbe obtained by taking arbitrary unions of elements ofB. A subbaseis a setS ⊂ Osuch that all open sets can be obtained by first taking finite intersections of elementsof S and then taking arbitrary unions of these.

244 Appendix

We call a subsetA of a topological spaceX denseif for each open setU ⊂ X, theintersectionA ∩ U is nonempty.

Continuity. A map f : X → Y from a topological spaceX to a topologicalspaceY is calledcontinuousif for each openU ⊂ Y , f−1(U) is open inX. Ifmoreover such a map is a bijection and if alsof−1 is continuous, thenf is called ahomeomorphismand the spacesX andY are calledhomeomorphic.

Metric space. A metric space is a setX together with a functionρ fromX ×Xto the non-negative real numbers such that:

- ρ(p, q) = ρ(q, p) for all p, q ∈ X;

- ρ(p, q) = 0 if and only if p = q;

- ρ(p, q) + ρ(q, r) ≥ ρ(p, r) for all p, q, r ∈ X.

The functionρ as above is called ametricor adistance function.

Each metric space(X, ρ) has an associated topology which is generated by thebase consisting of theε-neighbourhoodsUp,ε = x ∈ X|ρ(p, x) < ε for allp ∈ X and ε > 0. As an example we considerRn with the Euclidean metricρ((x1, . . . , xn), (y1, . . . , yn)) =

√∑

i(xi − yi)2. The corresponding topology isthe ’usual topology’.

Two metricsρ1 andρ2 on the same setX are calledequivalentif there is a constantC ≤ 1 such that for each pair of pointsp, q ∈ X we have

Cρ1(p, q) ≤ ρ2(p, q) ≤ C−1ρ1(p, q).

Clearly, if two metrics are equivalent, they define the same topology. Howeverinequivalent metrics may define the same topology, e.g. ifρ is any metric thenρ(p, q) = minρ(p, q), 1 is in general not equivalent withρ, but defines the sametopology. So when using a metric for defining a topology it is no restriction toassume that the diameter, i.e. the largest distance, is at most 1; this is useful whendefining product metrics. A topology which can be obtained from a metric as calledmetrizable

A metric space(X, ρ) is calledcompleteif each sequencexii∈N in X which isCauchy (i.e., for eachε > 0 there is anN such that for ali, j > N we haveρ(xi, xj) < ε) has alimit (i.e. a pointx such thatρ(xi, x) → 0 for i → ∞). Atopological space which can be obtained from a complete metric is calledcompletemetrizable. Complete metric spaces have important properties:

Topology 245

open and densesubset

Baire!propertygeneric propertyresidual setBaire!first categoryBaire!second

category

- Contraction principle. If f : X → X is a contracting map of a com-plete metric space to itself, i.e., a map such that for somec < 1 we haveρ(f(p), f(q)) ≤ cρ(p, q) for all p, q ∈ X, then there is a unique pointx ∈ Xsuch thatf(x) = x; such a pointx is called afixed point. This property isused in the proof of some existence and uniqueness theorems where the exis-tence of the object in question is shown to be equivalent withthe existence ofa fixed point of some contracting map, e.g. the usual proof of existence anduniqueness of solutions of ordinary differential equations.

- Baire property.In a complete metric space (and hence in a complete metriz-able space) the countable intersection of open and dense subsets is dense.This needs some explanation. In a topological space the notion ’almost allpoints’ can be formalized as to ’the points, belonging to a set containing anopen and dense subset’: Indeed letU ⊂ G ⊂ X be an open and dense subsetof X containingG. Then any point ofX can be approximated (in any of itsneighbourhoods) by a point ofG and even a point ofU . Such a point ofUeven has a neighbourhood which is completely contained inU and hence inG. We use here ’a set containing an open and dense subset’ instead of just’an open and dense subset’ because we want to formalize the notion ’almostall’, so we cannot exclude the set to be larger than open and dense. (We notethat also in measure theory there is a notion of ’almost all’ which however, incases where there is both a natural topology and a natural measure, does notin general agree with the present notion; we come back to thisin the end ofthis appendix). The Baire property means that if we have a countable numberof propertiesPi which may or may not hold for elements of a complete met-ric spaceX, defining subsetsGi ⊂ X, i.e.Gi = x ∈ X|x has properyPi,which each contain an open and dense subset ofX, then the set of all points ofX for which all the propertiesPi hold is still dense. For this reason it is com-mon to extend the notion of ’almost all points’ (in the topological sense) inthe case of complete metric (or complete metrizable) spacesto ‘the points ofa set containing a countable intersection of open and dense subsets’. In casewe deal with complete metrizable spaces we call a subsetresidual if it con-tains a countable intersection of open and dense subsets; a property is calledgenericif the set of all points having that property is residual. In the literatureone also uses the following terminology: a subset is of the first categoryif itis the complement of a residual set; otherwise it is of the secondcategory.

Subspace. Let A ⊂ X be a subset of a topological spaceX. ThenA inherits atopology fromX: open sets inA are of the formU ∩ A whereU ⊂ X is open. Inthe same way, ifA is a subset of a metric spaceX, with metricρ, thenA inherits ametric by simply puttingρA(p, q) = ρ(p, q) for all p, q ∈ A. It is common to also

246 Appendix

denote the metric onA by ρ.

Compactness. LetX be a topological space. A collection of (open) subsetsUi ⊂Xi∈I is called an(open) coverof X if ∪i∈IUi = X. A topological space is calledcompactif for each open coverUii∈I there is a finite subsetI ′ ⊂ I such thatUii∈I′ is already an open cover ofX; the latter cover is called a finitesubcoverofthe former.

If X is metrizable then it is compact if and only if each infinite sequencexi ofpoints inX has anaccumulation point, i.e., a pointx such that each neighbourhoodof x contains points of the sequence.

In a finite dimensional vector space (with the usual topologyand equivalence classof metrics) a subset is compact if and only if it is closed and bounded. In infinitedimensional vector spaces there is no such thing as ’the usual topology’ and defi-nitely not a ’usual equivalence class of metrics’ and for most ’common’ topologiesand metrics it isnot true that all closed and ’bounded’ subsets are compact.

Product spaces. Let Xii∈I be a collection of topological spaces. On the Carte-sian productX = ΠiXi, with projectionsπi : X → Xi, the product topology isgenerated by the subbase containing all sets of the formπ−1

i (Ui) withUi ⊂ Xi open.If eachXi is compact then also the product is compact. If the topologies ofXi aremetrizable and ifI is countable, then alsoX is metrizable — uncountable productsare in general not metrizable. A metric onX, defining the product topology, can beobtained as follows. We assume that on eachXi the topology is given by the metricρi. Without loss of generality we may assume thatρi distances are at most one andthat the spaces are indexed byi ∈ N. Forp, q ∈ X with πi(p) = pi andπi(q) = qi,i ∈ N we define the distance as

∑

i 2−iρi(pi, qi).

Function spaces. On the (vector) spaceCk(Rn) of all Ck-functions onRn, 0 ≤k <∞, theweak topologyis defined by the subbase consisting of the ’g-neighbourhoods’UK,ε,g,k which are defined as the set of functionsf ∈ Ck(Rn) such that in each pointof K the (partial) derivatives off − g up to and including orderk are, in absolutevalue, smaller thanε, whereK ⊂ Rn is compact,ε > 0 and g ∈ Ck(Rn). Itcan be shown that with this topologyCk(Rn) is complete metrizable, and hencehas the Baire property. An important property of this topology is that for any dif-feomorphismϕ : Rn → Rn the associated mapϕ∗ : Ck(Rn) → Ck(Rn), givenby ϕ∗(f) = fϕ is a homeomorphism. All these notions extend to the space ofCk-functions on some open subset ofRn.

ForC∞-functions onRn we obtain theweakC∞-topologyfrom the subbaseUK,ε,g,k

as defined above (but now interpreted as subsets ofC∞(Rn)) for any compactK ⊂

A.3 Differentiable manifolds 247

Rn, ε > 0, g ∈ C∞(Rn) andk ≥ 0. Also the space ofC∞-functions is completemetrizable and the topology does not change under diffeomorphisms inRn.

A.3 Differentiable manifolds

Topological manifold. A topological manifoldis a topological spaceM which ismetrizable and which is locally homeomorphic withRn, for somen, in the sensethat each pointp ∈ M has a neighbourhood which is homeomorphic with an opensubset ofRn. Usually one assumes that a topological manifold isconnected, i.e.that it is not the disjoint union of two open subsets. The integer n is called thedimensionof the manifold.

Differentiable manifold. A differentiable manifoldconsists of a topological man-ifold M , say of dimensionn, together with adifferentiable structure. Such a dif-ferentiable structure can be given in various ways. One is byspecifying which ofthe continuous functions are differentiable. We denote thecontinuous functions onM byC0(M). Given a positive integerk, aCk-structureonM is given by a subsetCk(M) ⊂ C0(M) satisfying:

- For each pointp ∈ M there are a neighbourhoodU of p in M andn ele-mentsx1, . . . , xn ∈ Ck(M) such thatx = (x1|U, . . . , xn|U) : U → Rn is ahomeomorphism ofU to an open subset ofRn and such that the restrictionsof elements ofCk(M) to U are exactly the functions of the form(f |x(U))xwith f ∈ Ck(Rn). Such(U, x) is called alocal chart; x1|U, . . . , xn|U arecalledlocal coordinates.

- If g ∈ C0(M) satisfies: for eachp ∈ M there are a neighbourhoodU of p inM and somegU ∈ Ck(M) such thatg|U = gU |U , theng ∈ Ck(M).

Note that, as a consequence of this definition, whenever we have aCk-structure ona manifoldM andl < k there is a uniqueC l-structure onM such thatCk(M) ⊂C l(M). It can also been shown that given aCk-structure onM , k ≥ 1 one canalways find aC∞-structureC∞(M) ⊂ Ck(M), which is however not unique.

Some simple examples of differentiable manifolds are:

- Rn with the usualCk-functions;

- An open subsetU ⊂ Rn; note that in this case we cannot just take asCk-functions the restrictions ofCk-functions onRn: if U 6= Rn there areCk-functions onU which cannot be extended to aCk-function onRn, see alsobelow;

248 Appendix

- The unit sphere(x1, . . . , xn) ∈ Rn|∑x2i = 1 in Rn; asCk-functions one

can take here the restrictions of theCk-functions onRn; this manifold, or anymanifold diffeomorphic to it, is usually denoted bySn−1.

- Then-dimensional torusTn is defined as the quotient ofRn divided out byZn. If π denotes the projectionRn → Tn then theCk-functions onTn areexactly those functionsf for which fπ ∈ Ck(Rn). Note thatT1 andS1 arediffeomorphic

The second of the above examples motivates the following definition of theCk-functions on an arbitrary subsetA ⊂ M of a differentiable manifoldM . We saythat a functionf : A→ R isCk if, for each pointp ∈ A there are a neighbourhoodU of p inM and a functionfU ∈ Ck(M) such thatf |U ∩A = fU |U ∩A. The set oftheseCk-functions onA is denoted byCk(A). If A is a topological manifold and ifCk(A) ⊂ C0(A) defines aCk-structure onA, thenA is called asubmanifoldof M .

A submanifold ofM is calledclosedis it is closed as a subset ofM ; note howeverthat a manifold is often called closed if it is compact (and has no boundary — anotion which we do not discuss here).

Differentiability. LetM andN beCk-manifolds, then a mapf : M → N isCk-differentiableif and only if for eachh ∈ Ck(N), we have thathf ∈ Ck(M). Theset ofCk-maps fromM toN is denoted byCk(M,N). If moreover such a mapf isa bijection and such that alsof−1 is Ck thenf is called aCk-diffeomorphism. Al-though we have here a simple definition of differentiability, the definition of deriva-tive is more complicated and requires the notion of tangent bundle which we willdiscuss now.

Tangent bundle. Let M be a differentiable manifold andp ∈ M . A smoothcurve throughp is a mapl : (a, b) → M , with 0 ∈ (a, b), which is at leastC1

and satisfiesl(0) = p. We call two such curvesl1 and l2 equivalent if for eachdifferentiablef : M → R we have(fl1)′(0) = (fl2)

′(0). An equivalence class forthis equivalence relation is called atangent vector ofM at p. These tangent vectorscan be interpreted as directional derivatives acting on smooth functions onM . Assuch they can be added (provided we consider tangent vectorsat the same pointof M) and they admit scalar multiplication. This makesTp(M), theset of tangentvectors ofM at p a vector space.

Note that wheneverM is an open subset ofRn andp ∈ M , thenTp(M) is canoni-cally isomorphic withRn: for x ∈ Rn the vector corresponding tox is representedby the curvel(t) = p + tx. This observation, together with the use of local chartsmakes it possible to show thatT (M) = ∪p∈MTp(M), thetangent bundleof M , is

Differentiable manifolds 249

Maximal RankTheorem

Sard Theorem

a topological manifold, and even a differentiable manifold: if M is aCk-manifold,thenT (M) is aCk−1-manifold.

Derivative. Let f : M → N be a map which is at leastC1, M andN beingmanifolds which are at leastC1, thenf induces a map df : T (M) → T (N), thederivativeof f which is defined as follows: if a tangent vectorv ofM at a pointp isrepresented by a curvel : (a, b) → M then df(v) is represented byfl. Clearly dfmapsTp(M) to Tf(p)(N). The restriction of df to Tp(M) is denoted by dfp; eachdfp is linear. IfM andN are open subsets ofRm, respectivelyRn, andp ∈ Rm

then dfp is given by the matrix of first order partial derivatives off at p. From thislast observation, and using local charts one sees that iff isCk then df isCk−1.

Maps of maximal rank. Let f : M → N be a map which is at leastC1. Therankof f at a pointp ∈ M is the dimension of the image of dfp; this rank is maximalif it equalsmindim(M), dim(N). Related to this notion of ’maximal rank’ thereare two important theorems:

Theorem A.1 (Maximal Rank ). Let f : M → N be a map which is at leastC1

having maximal rank inp ∈M , then there are local charts(U, x) and(V, y), whereU, V is a neighbourhood ofp, respectivelyf(p), such thatyfx−1, each of thesemaps suitably restricted, is a linear map. Iff is Ck then the local coordinatesxandy can be taken to beCk.

This means that if a map between manifolds of the same dimension has maximalrank at a pointp then it is locally invertible.

If f maps a ’low’ dimensional manifold to a higher dimensional manifold and if ithas maximal rank inp then the image of a neighbourhood ofp is a submanifold ofN ; If it has maximal rank in all points ofM , thenf is called animmersion. If such aimmersionf is injective and iff induces a homeomorphism fromM to f(M), withthe topology inherited fromN , thenf(M) is a submanifold ofN andf is called anembedding.

The second theorem we quote here uses a notion ’set of measurezero’ from measuretheory which will be discussed in the next section.

Theorem A.2 (Sard). Let f : M → N be aCk-map,M andN manifolds ofdimensionm respectivelyn. LetK ⊂M be the set of points wheref does not havemaximal rank. Ifk − 1 ≥ max(m − n), 0 thenf(K) is a set of measure zerowith respect to the (uniquely defined) Lebesgue measure class onN (intuitively thismeans thatf(K) is very small).

The theorem remains true if one takesK as the (often bigger) set of points wherethe rank off is smaller thann. This theorem if of fundamental importance in theproof of the transversality theorem.

250 Appendix

transversalityTransversality

Theorem

Transversality. LetN be a submanifold ofM , both at leastC1-manifolds. A mapf : V → M , V a differentiable manifold andf at leastC1, is calledtransversalwith respect toN if for each pointp ∈ V we have either:

- f(p) 6∈ N , or

- dfp(Tp(V )) + Tf(p)(N) = Tf(p)(M).

A special case of transversality occurs wheneverf in the above definition is asub-mersion, i.e. a map from a ‘high’ dimensional manifold to a lower dimensionalmanifold which has in each point maximal rank: such a map is transversal withrespect to any submanifold ofM .

One can prove that wheneverf is transversal with respect toN , thenf−1(N) is asubmanifold ofV ; if f isCk, this submanifold isCk, and its dimension isdim(V )−dim(M) + dim(N); if this comes out negative, thenf−1(N) = ∅.

Theorem A.3 (Tranversality). For M , N , andV as above, each having aCk-structure,k ≥ 1, it is a generic property forf ∈ Ck(V,M) to be transversal withrespect toN ; Ck(V,M) denotes the set ofCk maps fromV toM — the topologyon this set is defined in the same way as we defined the topology onCk(Rn).

We give here a brief outline of the proof of this theorem for the special case thatM = Rn,N is a closedC∞-submanifold ofM andV is a compactC∞-manifold. Inthis case it is easy to see that the set of elements ofCk(V,M) which are transversalwith respect toN is open. So we only have to prove that theN-transversal mappingsare dense. Since eachCk-map can be approximated (in anyCk-neighbourhood)by aC∞-map, it is enough to show that theN-transversal elements are dense inC∞(V,M). Let f : V → M be aC∞-map; we show how to approximate it byanN-transversal one. For this we considerF : V × Rn → M = Rn, definedby F (p, x) = f(p) + x; note that the productV × Rn has in a natural way thestructure of aC∞-manifold. ClearlyF is a submersion and hence transversal withrespect toN . DenoteF−1(N), which is aC∞-submanifold ofV × Rn, by W .Finally g : W → Rn is the restriction toW of the canonical projection ofV × Rn

to Rn. By the theorem of Sard, the image underg of the set of critical points ofg, here to be interpreted as the set of points whereg has rank smaller thann, hasLebesgue measure zero inRn. This means that arbitrarily close to0 ∈ Rn there arepoints which are not in this critical image. Ifx is such a point, thenfx, defined byfx(p) = f(p) + x is transversal with respect toN .

Remark. TheoremA.3 has many generalizations which can be proven in more orless the same way. For example, for differentiable mapsf : M →M it is a genericproperty that the associated mapf : M → M ×M , defined byf(p) = (p, f(p))is transversal with respect to∆ = (p, p) ∈ M × M. This means that it is a

Differentiable manifolds 251

orientationdegree

generic property for maps ofM to itself that the fixed points are isolated and thatthe derivative at each fixed point has no proper value one.

Vector fields and differential equations. For ordinary differential equations ofthe formx′ = f(x), with x ∈ Rn, f can be interpreted as avector fieldon Rn:indeed a solution of such an equation is a curvel : (a, b) → Rn such that for eacht ∈ (a, b) we have thatl′(t) = f(l(t)) and this derivative ofl can be interpretedas a vector (see our definition of a tangent vector). These notions generalize tomanifolds. Avector fieldon a differentiable manifoldM is a cross sectionX :M → T (M); with ’cross section’ we mean that for eachp ∈ M the vectorX(p) isin Tp(M). A solution of thedifferential equationdefined by the vector fieldX is asmooth curvel : (a, b) → M such that for eacht ∈ (a, b) we have that the tangentvector defined byl at t equalsX(l(t)); with this tangent vector defined byl at t wemean the tangent vector atl(t)) represented by the curvelt(s) = l(t+ s).

Existence and uniqueness theorems for ordinary differential equations easily gener-alize to such equations on manifolds.

Orientation and degree. Let M be a smooth manifold, which we assume to beconnected. For eachp ∈ M the tangent space can be given an orientation, for ex-ample by choosing a base and declaring this base to be positively oriented. Anotherbasis is positively, respectively negatively, oriented ifthe matrix transforming oneto the other has a positive, respectively negative, determinant. Clearly there are twoorientations possible in such a tangent space. Next one can try to orient all tangentspacesTq(M) for q ∈ M in such a way that the orientations depend continuouslyon theq. With this we mean that if we have a positively oriented base in Tp(M)and if we move it continuously to other points (during this motion it should remaina base) then it remains positively oriented. It is not possible to define such a globalorientation on any manifold, e.g. consider the manifold obtained fromS2 by iden-tifying antipodal points, the so called projective plane — such manifolds are callednonorientable. If M is orientable, it admits two orientations. If one is chosen thenwe say that the manifold isoriented.

If M1 andM2 are two compact oriented manifolds of the same dimension andif f :M1 → M2 is a smooth map then the degree off is defined as follows: take a pointp ∈ M2 such thatf has maximal rank in all points off−1(p) — by Sard’s theoremsuch points exist. Now the degree off is the number of points off−1(p) properlycounted. Due to the maximal rank condition and the compactness the number ofthese points is finite. We count a pointq ∈ f−1(p) as plus one, respectively, minusone if dfq is an orientation preserving map, respectively an orientation reversingmap fromTq(M1) toTp(M2). This degree does not depend on the choice ofp ∈M2.

252 Appendix

If we reverse the orientation onM1 or onM2 then the degree off in the abovesituation is multiplied by−1. This mean that ifM1 = M2 = M then we don’t haveto specify the orientation as long as we use the same orientation onM as domainand as image — still we needM to be orientable for the degree to be defined.

An important special case is the degree of maps of the circleS1. For this we interpretS1 asR modZ which we defined asT1; the canonical projection fromR to S1 isdenoted byπ. If f : S1 → S1 is a continuous map andπ(x) = f(0) (we denotethe elements ofS1 just as reals instead of equivalence classes of reals) then there isa unique continuous liftf : [0, 1) → R of f , i.e. πf(x) = f(x) for all x ∈ [0, 1).In generallims→1 f(s) = x exists but is not equal tox; the differencex − x is aninteger. In fact it is not hard to show that this integer is thedegree off .

Riemannian metric. A Riemannian metricg on a differentiable manifoldM isgiven by a positive definite inner productgp on each tangent spaceTp(M). Usuallyone assumes such a metric to be differentiable: it isCk if for anyCk-vector fieldsX1

andX2, the functionp 7→ gp(X1(p), X2(p)) is Ck. Note that a Riemannian metricis not a metric is the sense we defined before, however it defines sucha metric, atleast if the manifoldM is connected. In order to see this we define the length of acurvel : [a, b] → M which is at leastC1 asL(l) =

∫ b

a

√

gl(s)(l′(s), l′(s))ds; notethe analogy with the length of curves inRn. For p, q ∈ M we define the distancefrom p to q as the infimum of the lengths of all smooth curves ofp to q. It is notvery hard to prove that in this way we have indeed a metric as defined before.

It can be shown that each manifold admits a smooth Riemannianmetric (even manysuch metrics). Such a Riemannian metric can even be constructed in such a waythat the corresponding metric is complete.

A.4 Measure theory

Basic definitions. A measureon a setX is given by a nonempty collectionB ofsubsets ofX and a mapm from B to the non-negative real numbers, including∞,such that:

- B is a sigma algebra, i.e.,B is closed under formation of countable unionsand complements (algebra: addition corresponds to taking the union, multi-plication corresponds to taking the intersection);

- m is sigma additive, i.e., for each countable collectionBi ∈ B of pairwisedisjoint subsets,m(∪iBi) =

∑

im(Bi).

The elements of the sigma algebraB are calledmeasurable sets; for B ∈ B, m(B)is themeasureof B. A triple (X,B, m) as above is called ameasure space.

Measure theory 253

Borel measureGiven such a sigma algebraB with measurem, thecompletionB′ of B is obtainedby including all setsB′ ⊂ X which are contained in a setB ∈ B with m(B) = 0;the extension ofm to B′ is unique (and obvious). Often taking the completion of asigma algebraB, with respect to a given measure goes without saying.

A measure on a topological space is called aBorel measureif all open sets aremeasurable. The smallest sigma algebra containing all opensets of a topologicalspace is called itsBorel sigma algebra; its elements are calledBorel subsets.

In order to avoid pathologies we will assume all Borel measures to satisfy the fol-lowing additional assumptions:

- the topological space on which the measure is defined is locally compact,complete metrizable and separable (i.e. it contains a countable dense subset);

- the measure is locally finite, i.e. each point has a neighbourhood of finitemeasure.

These conditions are always satisfied for the Borel measureswe consider in the the-ory of dynamical systems (we note that Borel measures satisfying these extra con-ditions are also called Radon measures, but several different definitions of Radonmeasures are used).

A measure is called nonatomic if each point has measure zero:an atom is a pointwith positive measure.

A measurem onX is called aprobability measureif m(X) = 1; for A ⊂ X,m(A)is then interpreted as theprobability to find a point inA.

If (Xi,Bi, mi), i = 1, 2, are two measure spaces, then a surjective mapϕ : X1 →X2 is called amorphism, or a measure preserving map, if for eachA ∈ B2 wehaveϕ−1(A) ∈ B1 andm1(ϕ

−1(A)) = m2(A). If moreoverX1 = X2 we callϕ anendomorphism. If the morphismϕ is invertible and induces an isomorphismbetweenB1 andB2, thenϕ is anisomorphism.

These three notion also appear in a weaker form:morphisms mod 0, endomorphismsmod 0, andisomorphisms mod 0. A morphism mod 0 is given by a mapϕ : X1 →X2, which satisfies:

- ϕ is defined on all ofX1 except maybe on a setA of m1-measure zero;

- the complement ofϕ(X1 − A) in X2 hasm2-measure zero;

- for eachC ∈ B2, ϕ−1(C) ∈ B1 andm1(ϕ−1(C)) = m2(C).

The other two ’mod 0’ notions are defined similarly. These last definitions are mo-tivated by the fact that sets of measure zero are so small thatthey can be neglectedin many situations.

254 Appendix

Lebesgue measure Note that the sets of measure zero form an ideal in the sigma algebra of a measurespace. An isomorphism induces an isomorphism between the sigma algebrasB2

andB1 after they have been divided out by their ideal of sets of measure zero.

Lebesgue measure onR. As an example we discuss the Lebesgue measure onR.This measure assigns to each open interval(a, b) the measureb−a. It turns out thatthere is a unique sigma additive measure on the Borel sigma algebra ofR whichassigns this measure to intervals; this measure is called the Lebesgue measure. Itis not possible to extend this measure to all subsets ofR, so the notion of beingmeasurable is a nontrivial one. Clearly the Lebesgue measure is a Borel measure.Its restriction to the interval[0, 1] is a probability measure.

Product measure. OnRn the Lebesgue measure can be obtained in the same way.This is a special case of product measures which are defined for Borel measurespaces. If(Xi,Bi, mi)i∈I is a finite collection of Borel measure spaces, then theproduct is defined as follows:

- as a set, the product isX = ΠiXi — the canonical projectionsX → Xi aredenoted byπi;

- the sigma algebraB of measurable subsets ofX is the smallest sigma alge-bra containing all sets of the formπ−1

i (Bi) with i ∈ I andBi ∈ Bi (or itscompletion with respect to the measure to be defined below);

- for Bi ∈ Bi we havem(∩iπ−1i (Bi)) = Πimi(Bi), if in such a product one

of the terms is zero then the result is zero, even if a term∞ occurs — theextension of this functionm to all ofB as a sigma additive function turns outto be unique.

The reason why we had to restrict tofinite products is that infinite products of theform Πimi(Bi), for an infinite index setI need not be defined, even if all terms arefinite and∞ as a result is allowed. There is however a very important casewherecountable infinite products are defined, namely forBorel probability measuresoncompact spaces. In this case also infinite products are well defined because all termsare≤ 1. We require the spaces to be compact, because otherwise the product mightnot be locally compact; we require the product to be only countably infinite so thatthe product space is still separable. Indeed countable infinite products of Borelprobability measure on compact spaces can be defined as above.

Measure class. Let (X,B, mi), i = 1, 2, denote two measures. We say that thesemeasures belong to the samemeasure classif they define the same subsets of mea-sure zero. An important measure class is that of the Legesguemeasure and an

Measure theory 255

measure classimportant property of this measure class is that it is invariant under diffeomorphismsin the following sense: IfK ⊂ Rn is a set of measure zero and ifϕ : U → Rn isa diffeomorphism of a neighbourhood ofK to an open subset ofRn, thenϕ(K) isalso a set of measure zero (this holds for diffeomorphisms which are at leastC1 butnot for homeomorphisms). This last fact can be easily derived from the followingcharacterization of sets of Lebesgue measure zero. A subsetK ⊂ Rn has measurezero if an only if, for anyε > 0, it is contained in a countable union of cubes withsides of lengthεi with

∑

i εni < ε.

The fact that the Lebesgue measure class is preserved under diffeomorphisms meansthat also on smooth manifolds on has the notion of sets of measure zero.

Example A.1 (ZN

2 versus[0, 1)). On Z2 = 0, 1 we consider the measure whichassigns to0 and 1 each the measure1

2. On ZN

2 , which is the set of sequences(s1, s2, . . .) with si ∈ Z2, we have the product measure. In the discussion of thedoubling map we used that this measure space is isomorphic (mod 0) with the Lebe-gues measure on[0, 1). The isomorphism mod 0 is constructed by considering sucha sequence(s1, s2, . . .) as the diadic expression for

∑

i si2−i. It is easy to verify that

this is transforming the Lebesgue measure to the product measure onZN

2 . It is onlyan isomorphism mod 0 since we have to exclude sequences(s1, s2, . . .) for whichthere is someN with si = 1 for all i > N .

Example A.2 (Measure and category).We conclude this discussion with an ex-ample showing thatR is the disjoint union of a set of the first category, i.e. aset which is negligible in the topological sense, and a set ofLebesgue measurezero, i.e. a set which is negligible in the measure theoretical sense. For this con-struction we use the fact the the rational numbers are countable and hence can begiven asqii∈N. We consider the following neighbourhoods of the rational numbersUj = ∪iD2−(i+j)(qi), whereD2−(i+j)(qi) is the2−(i+j) neighbourhood ofqi. Thenwe defineU = ∩jUj . It is clear that the measure ofUj is smaller than21−j. HenceU has Lebesgue measure zero. On the other hand eachUj is open and dense, soUis residual and hence the complement ofU is of the first category. This means thatprobably a better way to formalize the notion ’almost all’, in the case ofR or Rn,would be in terms a of sets which are both residual and have a complement of mea-sure zero. This is what one usually gets when applying the transversality theorem;this comes from the role of Sard’s Theorem A.2 in the proof of the TransversalityTheorem A.3. On the other hand, in the study of resonances andKAM tori, oneoften has to deal with sets which are residual but have a complement of positiveLebesgue measure. Compare Exercise 5.14.

256 Appendix

KAM Theoryinteger affine

structuremulti-periodicquasi-periodic

Appendix B

MiscellaneousKAM Theory

We further review certain results inKAM Theory, in continuation of§§2.2, 2.5 andof Chapter 5. The style will be a little less formal and for most details we referto [61, 23, 15] and to the many references given there. In§§5.2, 5.3 and 5.4 wedescribed the persistent occurrence of quasi-periodicityin the setting of diffeomor-phisms of the circle or the plane. A similar theory exists forsystems with continuoustime, which we now describe in broad terms, as this fits in our considerations. Inparticular we shall deal with the ‘classical’KAM Theorem regarding Lagrangian in-variant tori in nearly integrable Hamiltonian systems, fordetails largely referring to[32, 160, 161, 215, 216]. Also we shall discuss the ‘dissipative’ counterpart of the‘classical’ KAM Theorem, regarding a family of quasi-periodic attractors [61]. Atthe level ofKAM Theory these two results are almost identical and we give certainrepresentative elements of the corresponding proof. Also see [50, 61, 23, 15] andmany references therein.

B.1 The integer affine structure of quasi-periodic tori

As we saw in§2.2 and Chapter 5, quasi-periodicity is defined by a smooth conju-gation. We briefly summarize this set-up for quasi-periodicn-tori in systems withcontinuous time. First on then-torusTn = Rn/(2πZ)n consider the vector field

Xω =n∑

j=1

ωj∂

∂xj,

whereω1, ω2, . . . , ωn are called frequencies [142, 53]. Now, given a smooth1 vectorfield X on a manifoldM, with T ⊆ M an invariantn-torus, we say that the re-strictionX|T is multi-periodicif there existsω ∈ Rn and a smooth diffeomorphism

1Say, of classC∞.

258 MiscellaneousKAM Theory

conjugation!smoothKAM

Theory!conservative

Φ : T → Tn, such thatΦ∗(X|T ) = Xω. We say thatX|T is quasi-periodicif thefrequenciesω1, ω2, . . . , ωn are independent overQ.

A quasi-periodic vector fieldX|T leads to an integer affine structure on the torusT.In fact, by the fact that each orbit is dense, it follows that the self conjugations ofXω

exactly are the translations ofTn, which completely determine the affine structureof Tn. Then, givenΦ : T → Tn with Φ∗(X|T ) = Xω, it follows that the selfconjugations ofX|T determines a natural affine structure on the torusT. Note thatthe conjugationΦ is unique modulo translations inT andTn.

Note that the composition ofΦ by a translation ofTn does not change the frequencyvectorω. However, the compostion by a linear invertible mapS ∈ GL(n,Z) yieldsS∗Xω = XSω. We here speak of aninteger affinestructure [53].

Remarks.

- The transition maps of an integer affine structure are translations and elementsof GL(n,Z).

- The current construction is compatible with the integrable affine structure onthe the Liouville tori of an integrable Hamiltonian systems[33]. Note that inthat case the structure extends to all multi-periodic tori.

B.2 Classical (conservative)KAM Theory

ClassicalKAM Theory deals with smooth, nearly integrable Hamiltonian systemsof the form

x = ω(y) + f(y, x)

y = g(y, x), (B.1)

wherey varies over an open subset ofRn andx over the standard torusTn. Thephase space thus is an open subset ofRn × Tn, endowed with the symplectic formdy ∧ dx =

∑nj=1 dyj ∧ dxj. The functionsf andg are assumed small in theC∞-

topology, or in another appropriate topology, compare with§§5.2 and 5.5, withAppendix A or [112], also see below. Adopting vector field notation, for the unper-turbed and perturbed systems we write

X(x, y) = ω(y)∂x andX = (ω(y) + f(x, y))∂x + g(x, y)ω(y)∂y, (B.2)

where∂x = ∂/∂x, etc. Note that(y, x) is a set of action-angle variables forthe unperturbed systemX. Thus the phase space is foliated byX-invariant tori,parametrized byy. Each of the tori is parametrized byx and the correspondingmotion is multi- (or conditionally) periodic with frequency vectorω(y).

B.2 Classical (conservative)KAM Theory 259

Kolmogorovnon-degeneracy


KAMTheory!global


KAM Theory addresses persistence of theX-invariantn-tori and the multi-periodicityof their motion for the perturbationX. The answer needs two essential ingredients.The first of these is that ofKolmogorov non-degeneracywhich states that the mapy ∈ Rn 7→ ω(y) ∈ Rn is a (local) diffeomorphism. Compare with the twist condi-tion of §5.3. The second ingredient is formed by the Diophantine conditions (5.19)on the frequency vectorω ∈ Rn, for γ > 0 andτ > n − 1 leading to the ‘Cantorset’Rn

τ,γ ⊂ Rn compare with§5.5 and see Figure 5.5.

Completely in the spirit of Theorem 5.3, the classicalKAM Theorem roughly statesthat a Kolmogorov non-degenerate nearly integrable systemX, for sufficiently smallf andg is smoothly conjugated to the unperturbed, integrable approximationX,see (B.2), provided that the frequency mapω is co-restricted to the DiophantinesetRn

τ,γ. In this formulation smoothness has to be taken in the sense ofWhitney[160, 215, 216, 62], also see [161, 61, 50, 23, 15].

As a consequence we may say that in Hamiltonian systems ofn degrees of freedomtypically quasi-periodic invariant (Lagrangian)n-tori occur with positive measurein phase space, compare with Chapter 3. It should be said thatalso an iso-energeticversion of this classical result exists, implying a similarconclusion restricted toenergy hypersurfaces [34, 33, 59, 23]. The Twist Map Theorem5.3 is closely re-lated to the iso-energeticKAM Theorem, in particular cf. Example 5.2 on coupledpendula.

Remarks.

- We chose the quasi-periodic stability format as in§5.3. For regularity issuescompare with a remark following Theorem 5.2 in§5.2 and the one followingTheorem 5.7 in§5.5.

- One extension of the above deals with globalKAM Theory [53]. In a Liouville-Arnold integrable system [33] the set of Lagrangian invariant tori often formsa non-trivial bundle [91, 80] and the question is what remains of this bun-dle structure in the nearly-integrable case. The classicalKAM Theorem onlycovers local trivializations of this bundle, at the level ofDiophantine quasi-periodic tori giving rise to local conjugations of Whitney classC∞. In [53]these local conjugations are glued together so to obtain a global Whitneysmooth conjugation that preserves the non-triviality of the integrable bundle.This glueing process uses the integer affine structure on thequasi-periodictori, which enables taking convex combinations of the localconjugations,subjected to a suitable partition of unity [112].

The non-triviality of these bundles is of importance for certain semi-classicalversions of the classical systems at hand, in particular forcertain spectrumdefects [81]. One example is the spherical pendulum [91, 80,53], for otherexamples see [81] and references therein.


KAMTheory!dissipative

multi-periodic!dynamics

- As indicated in§2.5, the litterature on the classicalKAM Theorem is immense.As an entrance to this see [61, 23]. Here we just mention a difference betweenthe casen = 2 andn ≥ 3. In the former of these cases the energy surfaceis 3-dimenional in which the 2-dimensionalKAM tori have codimension 1.This means that the perturbed evolutions are trapped forever, which can leadto a form of perpetual stability. In the latter case this no longer holds true andthe evolutions can escape. For further details about the stickyness of quasi-periodic tori and on Nekhorochev stability, again see the above references andtheir bibliographies.

B.3 Dissipative KAM theory

As already noted by Moser [141, 142],KAM Theory extends outside the worldof Hamiltonian systems, like to volume preserving systems,or to equivariant orreversible systems. This also holds for the class of generalsmooth systems, oftencalled ‘dissipative’. In fact, theKAM Theorem allows for a Lie algebra proof, thatcan be used to cover all these special cases [62, 61, 23, 58]. It turns out that inmany cases parameters are needed for persistent occurrenceof (Diophantine) quasi-periodic subsystems, for instance see§5.2 and§5.5.

As another example we now consider the dissipative setting,where we discussa parametrized system with normally hyperbolic invariantn-tori carrying quasi-periodic motion. From [113] it follows that this is a persistent situation and that,up to a smooth2 diffeomorphism, we can retrict to the case whereTn is the phasespace. To fix thoughts we consider the smooth system

x = ω(α) + f(x, α)

α = 0, (B.3)

whereα ∈ Rn is a multi-parameter. The results of the classicalKAM Theoremregarding (B.1) largely carry over to (B.3).

As before we write

X(x, α) = ω(α)∂x andX(x, α) = (ω(α) + f(x, α))∂x (B.4)

for the unperturbed and and perturbed system. For the unperturbed systemX theproduct of phase space and parameter space as an open subset of Tn × Rn is com-pletely foliated by invariantn-tori and by [113] for smallf this foliation is per-sistent. The interest is with the multi-periodic dynamics,compare with the settingof Theorem 5.2. As just stated,KAM Theory here gives a solution similar to the

2In this case of classCk for largek.

B.4 On theKAM proof in the dissipative case 261


Van der Pol systemquasi-

periodic!(familyof) attractors

Newtonian iteration

Hamiltonian case. The analogue of the Kolmogorov non-degeneracy condition hereis that the frequency mapα 7→ ω(α) is a (local) submersion [112]. Then, in thespirit of Theorem 5.2, the perturbed systemX is smoothly conjugated to the unper-turbed caseX, see (B.4) as before, provided that the mapω is co-restricted to theDiophantine setRn

τ,γ. Again the smoothness has to be taken in the sense of Whitney[160, 215, 216, 62], also see [61, 50, 23, 15].

It follows that the occurrence of normally hyperbolic invariant tori carrying (Dio-phantine) quasi-periodic flow is typical for families of systems with sufficientlymany parameters, where this occurrence has positive measure in parameter space.In fact, if the number of parameters equals the dimension of the tori, the geometryas sketched in Figure 5.5 carries over in a diffeomorphic way. The same holds forthe remarks regarding the format of quasi-periodic stability and regularity followingsection B.2 and Theorem 5.2.

Remarks.

- In Example 5.1 we considered two coupled Van der Pol type oscillators andwere able to study invariant 2-tori and their multiperiodicity by the Twist MapTheorem 5.3. In this respect, the presentKAM Theory covers the case ofncoupled oscillators, forn ≥ 2.

- DissipativeKAM Theory provides families of quasi-periodic attractors in arobust way. As in the conservative case, also here there is animportant dif-ference between the casesn = 2 andn ≥ 3, namely that a flow onT3 canhave chaotic attractors, which it cannot onT2. This is of importance for fluidmechanics [14, 174], for a discussion see§C.3 below.

B.4 On theKAM proof in the dissipative case

This section contains a sketch of the proof of [61], Theorem 6.1 (p. 142 ff.), whichis just about the simplest case of the general Lie algebra proof in [62, 58], followingthe ideas of [142] and techniques of [160]. As already indicated in Chapter 5, theKAM Theory proofs involved here are based on a Newtonian process, where at eachstep a 1 bite small divisor problem (see§5.5) has to be solved. We like to illustratethis a bit further regarding the proof in the dissipative setting, compare with the set-up of Theorem 5.2. Here it should be said that theKAM Theoretical aspects of theconservative (classical) and dissipative results are really the same. The former onlyis more complicated by the fact that one has to keep track of the symplectic form.


1-bite small divisorproblem

B.4.1 Reformulation and some notation

As said before we are looking for a conjugation between the unperturbed and per-turbed systems (B.4)

X(x, α) = ω(α)∂x respectivelyX(x, α) = (ω(α) + f(x, α)∂x.

As a first simplification, using the Inverse Function Theorem[112], we reparametrizeα↔ (ω, ν), whereν is an external multi-parameter that we can suppress during theproof. As in§5.2, the naive search is for a diffeomorphismΦ conjugatingX to X,i.e., such that

Φ∗X = X. (B.5)

Again we choseΦ of a skew form

Φ(x, ω) =(

x+ U(x, ω), ω + Λ(ω))

, (B.6)

the conjugation equation (B.5) rewrites to

∂U (x, ω)

∂xω = Λ(ω) + f

(

x+ U(x, ω), ω + Λ(ω))

.

This is a nonlinear equation to be solved inU and Λ, comparable with (5.7) in§5.2. As mentioned in§5.3 we make use of a Newtonian iteration process. Thecorresponding linearizations look like

∂U (x, ω)

∂xω = Λ(ω) + f(x, ω), (B.7)

which we recognize as a vectorial version of the 1-bite smalldivisor problem (5.13,2nd eqn.) we met in§5.3. In this linearized form intuitively we think off as anerror that has to be reduced in size during the iteration, thereby in the limit yieldingthe expression (B.5). Recall from§5.3 that necessary for solving the equation (B.7)is that the right-hand side has average zero, which amounts to

Λ(ω) = −f0(., ω),

leading to a parameter shift. Again compare with§5.2.

B.4.2 On the Newtonian iteration

As said in§5.5 the present proof is based on approximation by real analytic func-tions and the simplest version of the presentKAM Theory is in this analytic setting

B.4 On theKAM proof in the dissipative case 263

where we use the compact open topology on holomorphic extensions. Therefore wereturn to a complex domain (5.20)

cl ((Tn + κ) × (Γ + ρ)) ⊂ (C/2πZ)n × Cn,

where the size of perturbations likef is measured in the supremum norm; therebygenerating the topology at hand, see Appendix A. For technical reasons settingΓ′ = ω ∈ Γ | dist (ω, ∂Γ) ≥ γ, we abbreviate

Γ′τ,γ = Γ′ ∩ Rn

τ,γ,

The conjugationΦ is obtained as the Whitney-C∞ limit of a sequenceΦJj≥0 ofreal analytic near-identity diffeomorphisms, withΦ0 = Id, all of the skew format(B.6). We here just sketch the important elements of this, for details referring to[160, 215, 216] and to [62, 61], also see [161, 15].

Remarks.

- The fact thatΦ : Tn × Γ′τ,γ → Tn × Γ is of class Whitney-C∞, means that it

can be seen as the restriction of aC∞-map defined on all ofTn×Γ. Comparewith the formulations of Theorems 5.2, 5.3.

- The fact thatΦ can’t be real analytic can be guessed from the fact that on theone hand, for dynamical reasons, generically the conjugation relation (B.5)cannot possibly hold true in the gaps of the ‘Cantor set’Γ′

τ,γ. On the otherhand, the setΓ′

τ,γ is perfect, which in the analytic case might give contradic-tions with unicity.

This being said, it may be mentioned here that the conjugation Φ can beproven to be Gevrey regular [203], also compare with [23] andits references.

Each of theΦj is defined on a complex neighbourhoodDj ,

Tn × Γ′τ,γ ⊂ Dj ⊂ cl ((Tn + κ) × (Γ + ρ)),

whereDj+1 ⊂ Dj for all j ≥ 0. In fact, theDj shrink towards

(Tn + 12κ) × Γ′ asj → ∞.

Remark. Note that in theTn-direction the complex domain does not shrink com-pletely, which implies that the Whitney-C∞ limit Φ even is analytic in thex-variables. A similar regularity can be obtained in the ray direction of the setRn

τ,γ,see Lemma 5.5 and compare with Figure 5.5.


homologicalequation

For the Newtonian iteration steps maps

Ψj : Dj+1 → Dj , whereΦj+1 = Ψ0 · · · Ψj,

are constructed as follows. In appropriate coordinates(ξ, σ) onDj+1 we require thefamiliar skew format

Ψj : (ξ, σ) 7→ (ξ + Uj(ξ, σ), σ + Λj(σ))

whereUj andΛj are determined from a homological equation

∂Uj(ξ, σ)

∂ξ= Λj(σ) + fd

j (ξ, σ), (B.8)

compare with (B.5). Herefj is the ‘error’ term obtained afterj iterations and

fdj (ξ, σ) =

∑

|k|≤d

fj(σ) ei〈σ,k〉

is the truncation at an appropriate orderd = dj tending to∞ as j → ∞. Asbefore, the termΛj(σ) just serves to get theTn-average of the right hand side equalto zero. The fact thatfd

j (ξ, σ) is a trigonometric polynomial implies thatUj canbe solved from (B.8) as another trigonometric polynomial, for which only finitelymany Diophantine conditions are needed. This accounts for the fact thatΨj isdefined on the full neighbourhoodDj+1 of Tn × Γ′.

The remainder of the proof is a matter of refined bookkeeping,keeping in mind theapproximation properties of Whitney smooth maps by analyicones [215, 216]. Tothis end, the size of the shrinking domains

Tn × Γ′τ,γ ⊂ · · · ⊂ D2 ⊂ D1 ⊂ D0 ⊂ cl ((Tn + κ) × (Γ + ρ))

is chosen by geometric sequences while the truncationdj grows exponential withj. Moreover, the size of the ‘error’|fj|Dj+1

can be controlled as an exponentiallydecaying sequence. For details see [61] (pp. 146-154), alsosee [161] and the otherreferences given before.

Remarks.

- In the solution (B.8) we need only finitely many Diophantineconditions ofthe form|〈σ, k〉| ≥ c|k|−τ , namely only for0 < |k| ≤ d. A crucial Lemma[61] (p. 147) ensures that this holds for allσ ∈ (Γγ + rj) , whererj is anappropriate geometric sequence, compare the Exercises B.1and B.2.

B.5 Exercises 265

- The analytic book keeping mentioned above includes many applications ofthe Mean Value Theorem and the Cauchy Integral Formula. The latter servesto express the derivatives of a (real) analytic function in terms of this func-tions, leading to useful estimates of theC1-norm in terms of theC0-norm.

- As said earlier, the above proof is about the simplest version of the Lie algebraproof of [62] and thereby its set-up is characteristic for many other contexts,like for KAM Theorems in the Hamiltonian, the reversible context, etc. Com-pare [160, 60] and many references in [61, 23]. Compare with [161] for asimple version of the proof in the Hamiltonian case.

B.5 Exercises

Exercise B.1 (Homothetic role ofγ). By a scaling of the timet and ofω showthat the above proof only has to be given for the caseγ = 1. What happens to thebounded domainΓ asγ gets small?

Hints: Sett = γt andω = γ−1ω.

Exercise B.2 (Order of truncation). Following Exercise B.1 we restrict to the casewhereγ = 1, taking the setΓ sufficiently large to contain a nontrivial ‘Cantor set’of parameter values corresponding to(τ, 1)-Diophantine frequencies. Forj ≥ 0onsider the complexified domain

Dj = (Tn + 12κ+ sj) × (Γ′

τ,1 + rj),

see, assuming thatrj = 12s

2τ+2. For the order of truncationd take

dj = Entier(

s−2j

)

.

1. Show that for all integer vectorsk ∈ Zn with 0 < |k| < dj one has|k|τ+1 ≤(2rj)

−1;

2. Next show that for allσ ∈ Γ′τ,1 + rj and allk with 0 < |k| < dj one has

|〈σ, k〉| ≥ 12 |k|−τ ;

3. As an example takesj =(

14

)j, j ≥ 0, and express the order of truncationdj

as a function ofj.

Hints: For a while dropping the indexj, regarding 1 note that by definitiond ≤s−2 = (2r)−1/(τ+1). Next, to show 2, first observe that forσ ∈ Γ′

τ,1 + r there existsσ∗ ∈ Γ′

τ,1 such that|σ − σ∗| ≤ r. It then follows for allk ∈ Zn with 0 ≤ |k| ≤ dthat

|〈σ, k〉| ≥ |〈σ∗, k〉| − |σ − σ∗||k| ≥ |k|−τ − r|k| ≥ 12 |k|−τ .

Appendix C

Miscellaneous bifurcations

Consider a smooth dynamical system depending smoothly on parameters, wherewe study the changes in the dynamics upon variation of parameters. When thedynamics changes in a qualitative way, we say that a bifurcation occurs. In themain text we met several examples of this. For instance in Chapter 1 when studyingthe Van der Pol system we met the Hopf bifurcation, where a point attractor loosesits stability and a periodic orbit comes into existence. Moreover in§3.2 we metsequences of period doubling bifurcations as these occur inthe Logistic family,while in Example 3.1 in§3.3 we encountered a saddle-node bifurcation.

These and other examples will be discussed below in more detail. Speaking in termsof Chapter 3, one important issue is whether the bifucation itself is persistent undervariation of (other) parameters, i.e., whether it occurs ina robust way, or that agiven bifurcation family is typical. Another point of interest is the possibility thatthe dynamics changes from regular to complicated, especially when, upon changesof parameters, a number of bifurcations takes place. In particular we focus on sometransitions from orderly to chaotic dynamics.

Local bifurcation theory deals with equilibria of vector fields or fixed points ofmaps, where bifurcation usually means that hyperbolicity is lost. A similar theoryexists for periodic orbits: in the case of flows one reduces toa previous case byturning to a Poincare map and in the case of maps similarly bychoosing an appro-priate iterate. Of this, local, theory a lot is known, e.g., compare with [183, 184, 1,145, 4, 6, 2, 12, 13]. Also in the Hamiltonian case much has been said about localbifurcations, for an overview see [106].

Somewhat less generally known are quasi-periodic bifurcations, sometimes alsocalled torus bifurcations, where aspects of local bifurcation theory are intermixedwith KAM Theory. In fact, versions of the local bifurcation scenario’s also show uphere, but in a way that is ‘Cantorised’ by densely many resonances. The Whitneysmoothness format of theKAM Theory as developed in Chapter 5 and Appendix B

268 Miscellaneous bifurcations

conjugation!topologicalequivalence!topological

allows for a concise geometrical description, see [62, 61, 58, 55, 56, 57, 15, 74, 106,23].

Below we shall illustrate the above theories, focussing on the Hopf bifurcation, theHopf-Neımark-Sacker bifurcation and the quasi-periodicHopf bifurcation. In thisway we show that for higher dimensional modelling quasi-periodic bifurcations areof great importance. As an example we sketch the Hopf-Landau-Lifschitz-Ruelle-Takens scenario for the onset of turbulence [117, 125, 126, 174, 146, 67]. In thesebifurcation scenario’s, which contain transitions to chaos, also homo- and hetero-clinic bifurcations play an important role. For details in this respect we refer to [16],where also the Newhouse phenomenon is dealt with, and to [25]. For a computerassisted case study see [65].

C.1 Local bifurcations of low codimension

The theory of bifurcations of equilibrium points of vector fields is best understood,where a similar theory goes for fixed points of maps; a systematic treatment isbased on the concept ofcodimension, compare with the textbooks [5, 6, 13, 8],for background dynamical systems theory also see [1, 28, 2, 7, 9, 11]. Roughlyspeaking the codimension of a bifurcation is the minimum numberp ∈ Z+ such thatthe bifurcation occurs in genericp-parameter families of systems. Here one oftenspeaks ofgermsof such families, which expresses the fact that the neighbourhoodof the bifurcation point is not to be specified.

Remarks.

- The structure of bifurcation theory can change drastically when the dynamicshas to preserve a certain structure, e.g., whether a symmetry is present or not.Below we shall see certain differences between bifurcations in the general‘dissipative’ setting and in the Hamiltonian one.

- The above concept of codimension is rather naive. A more sophisticated no-tion is based on (trans-) versality, see Appendix A and the references [194,1, 8] also see [2, 62, 58]. In any case is important to specify the equivalencerelation to be used on the spaces of parametrised systems. Wealready sawthat the relation based on smooth conjugations can be too fineto be useful,compare Exercise 3.9. Instead, often topological conjugation is more appro-priate, see Definition 3.11 in§3.4.2. For vector fields we often even resort totopological equivalence, see a remark following Definition3.11. It should besaid that the theory can quite involved, where, for instance, a difference mustbe made between whether the topological equivalences depend continuouslyon the parameters or not. For an overview see [2].

C.1 Local bifurcations of low codimension 269

Hartman-GrobmanTheorem

saddle-nodebifurcation

perioddoubling!bifurcation

In this appendix we do not attempt a systematic approach, butwe present a fewmiscellaneous examples, mainly in the general ‘dissipative’ setting where the localbifurcation theory deals with loss of hyperbolicity. Indeed, by the Hartman Grob-man Theorem hyperbolic equilibria and fixed points are persistent under variationof parameters and hence have codimension0. In both these cases one even hasstructural stability under topological conjugation [8, 2].

Bifurcation occurs when eigenvalues of the linear part are on the imaginary axis (forequilibria) or on the unit circle (for fixed points). The simplest case occurs whenone simple eigenvalue crosses the imaginary axis at0 or the unit circle at1 and theothers remain hyperbolic. Since the eigenvalues depend continously on parameters,this phenomenon has codimension1. Under open conditions on higher order termsin both setting we get a saddle-node bifurcation. see below for more details. Forfixed points another codimension 1 bifurcation is period doubling, occurring whenone simple eigenvalue crosses the unit circle at−1. See below for some details, alsosee the above references, in particular [5]. The cases when acomplex conjugatedpair of eigenvalues crosses the imaginary axis or the unit circle leads to the Hopfand the Hopf-Neımark-Sacker bifurcation that we will discuss later.

C.1.1 Saddle-node bifurcation

The saddle-node bifurcation for equilibria and fixed points, as already met in Exam-ple 3.1 in§3.3, in its simples form already takes place in the phase space R = y,in which case topological normal forms are given by

y = y2 − µ and (C.1)

y 7→ y + y2 − µ,

y ∈ R, µ ∈ R, both varying near0. In general each of these bifurcation takes placein a 2-dimensional center manifold inside the product of phase space and parameterspace [113, 145], compare with Figure 3.2.

C.1.2 Period doubling bifurcation

Another famous codimension 1 bifurcation is period doubling, which does not occurfor equilibria of vector fields. However, it does occur for fixed points of maps, wherea topological model is given by

Pµ(y) = −(1 + µ)y ± y3, (C.2)

wherey ∈ R, µ ∈ R, [145]. It also occurs for periodic orbits of vector fields, inwhich case the map (C.2) occurs as a return map inside a centermanifold.

Sequences of period doublings have been detected in the Logistic family, for a dis-cussion we refer to§3.2, in particular see Figure 3.1.


Hopf!bifurcationHopf!-Ne“u“imark-

Sackerbifurcation

C.1.3 Hopf bifurcation

The other codimension 1 bifurcation for equilibria of vector fields is the Hopf bifur-cation, occurring when a simple complex conjugate pair crosses the imaginary axis.For an example see§1.3.1. We classify up to topological equivalence and then inacenter manifoldR2 × R = (y1.y2), µ obtain a normal form

ϕ = 1, (C.3)

r = µr − r3,

in corresponding polar coordinatesy1 = r cosϕ, y2 = r sinϕ, [145]. For phaseportraits compare with Figure 1.12 in§1.3.1. Figure C.1 shows an amplitude re-sponse diagram (often called bifurcation diagram). Observe the occurrence of theattracting periodic solution forµ > 0 of amplitude

√µ. Also observe that in the

model (C.3) the eigenvalues cross the imaginary axis at±i.

Remarks.

- Note that in our formulation we used equivalence instead ofconjugation. Thisis related to the fact that the periodic orbit has a period that is an invariantunder conjugation, which would not lead to a finite classification. For furtherdiscussion in this direction see [2].

- Related to this is the following. In the classification of matrices under simi-larity, i.e., under the adjoint action of the general lineargroup [1], we get anormal form

(

α −ββ α

)

, (C.4)

corresponding to the eigenvalueα + ıβ. Hereα = µ takes care of the bi-furcation, whileβ keeps track of the normal frequency at bifurcation; undertopological equivalence we setβ = 1. For (α, β) ≈ (0, 1) the family (C.4) isauniversal unfoldingin the sense of [1]. Therefore we say that the linear parthassimilarity codimension2.

C.1.4 Hopf- Neımark-Sacker bifurcation

We now turn to the Hopf bifurcation for fixed points of diffeomorphisms. wherethe eigenvalues in a fixed point cross the complex unit circleat e±2πiβ , such thatβis not rational with denominator less than 5, see [184, 1]. Again this phenomenonhas codimension 1, since it occurs persistently in 1-parameter families of diffeo-morphisms.


resonance!tonguerrr

µµµ

Figure C.1: Bifurcation diagram of the Hopf bifurcation.

In this case the analogue of (C.4) is

exp

(

α −ββ α

)

,

implying that the linear part again has similarity codimension 2. So let us consider

P (y) = e2π(α+iβ)y +O(|y|2), (C.5)

y ∈ C ∼= R2, near0, whereO(|y|2) should contain generic third order terms. Asbefore, we letα = µ serve as a bifurcation parameter, varying near0.On one side ofthe bifurcation valueµ = 0 this system by normal hyperbolicity and the NormallyHyperbolic Invariant Manifold Theorem [113] has an invariant circle. When tryingto classify modulo topological conjugation, observe due tothe invariance of therotation numbers of the invariant circles, no topological stability can be obtained[145].

Still this bifurcation can be characterized by many persistent properties. Indeed,in a generic 2-parameter family (C.5), say with bothα andβ as parameters, theperiodicity in the parameter plane is organized in resonance tongues [1, 63, 13].12

If the diffeomorphism is the return map of a periodic orbit for flows, this bifurcationproduces an invariant 2-torus. Usually this counterpart for flows is called Neımark-Sacker bifurcation. The periodicity as it occurs in the resonance tongues, for thevector field is related to phase lock. The tongues are contained in gaps of a ‘Cantorset’ of quasi-periodic tori with Diophantine frequencies.Compare the discussion in§§5.1 and 5.2 and and again compare with [150].

12The tongue structure is hardly visible when only one parameter, likeα, is being used.


Arnold family ofcircle maps

structural stabilitypersistence

ω2/ω1

q

Figure C.2: Sketch of the Cantorised Fold, as the bifurcation set of the quasi-periodic center-saddle bifurcation forn = 2 [105, 106], where the horizontal axisindicates the frequency ratioω2 : ω1, cf. (C.7). The lower part of the figure corre-sponds to hyperbolic tori and the upper part to elliptic ones. See the text for furtherinterpretations.

Remarks.

- A very related object is the Arnold family (5.12)

Pβ,ε : x 7→ x+ 2πβ + ε sin x

of circle maps [1, 5, 65] as mentioned at the end of§5.2. As discussed there,quasi-periodic and periodic dynamics coexist. Periodicity in the(β, ε)-planeis organized in resonance tongues, see Figure 5.4. Also herein the comple-ment of the union of tongues, which is open and dense, there isa ‘Cantor set’of hairs having positive measure.

- To describe more complicated dynamical phenonema the notion of structuralstability, in whatever version [2], turns out to be too severe and we have torely on the persistence notions introduced in Chapter 2. Then it remains tobe seen what is the appropriate number of parameters needed for a cleardescription. In quite a few cases the similarity codimension turns out to givean appropriate number of parameters [62, 58], also see below.


saddle-centerbifurcation

structural stabilityCatastrophe Theoryequivalence!smooth

C.1.5 The saddle-center bifurcation

As we saw already in the Chapters 1 and 2, persistence under variation of parametersdepends a lot on the class of systems one is dealing with. The same holds for thepresent notions of bifurcation, codimension, etc. Here we describe the example ofthe (Hamiltonian) saddle-center bifurcation, which has codimension 1. In fact itis the Hamiltonian counterpart of the saddle-node bifurcation (C.1) as described in§C.1. Compare with [33, 106, 135].

We consider Hamiltonian systems in 1 degree of freedom, werethe local bifurca-tion theory to a large extent reduces to Catastrophe Theory [194]. To understandthis considerR2 = p, q with the standard symplectic forσ = d p ∧ d q. ForH,G : R2 → R of Hamiltonian functions suppose thatG = H Φ for a (local)diffeomorhismΦ. Then the corresponding Hamiltonian vector fieldXH andXG

are (smoothly) equivalent byΦ, see Exercise C.1. Therefore, instead of Hamilto-nian vector fields on de plane, we can study Hamiltonian functions under (smooth)coordinate transformations, in Catastrophe Theory calledright-equivalences.

To fix thoughts we now describe a local bifurcation, of the form

Hµ(p, q) = 12p

2 + Vµ(q), (C.6)

wherep, q ∈ R and whereµ is a multi-parameter and wherep, q, µ all vary near0.1 The functionV0 hasq = 0 as a critical point, corresponding to an equilibriumpoint ofH0. In the case where this ciritical point has a non-degenerate Hessian, bythe Morse Lemma, the functionV0 is structurally stable under right-equivalencesΦ,so this case has codimension0. The next case is thatV0(x) = 1

3q3, which has the

following universal unfolding

Vµ(q) = 13q

3 − µq, with µ ∈ R,

so of codimension 1. This is thefold catastrophe, the first of the celebrated Thomlist [194]. In the productR × R = q, µ, the set of critical points is the parabolaµ = q2, for eachµ ≥ 0 leading toq = ±√

µ : a maximum forq = −√µ and a

minimum forq = +√µ. For the corresponding Hamiltonian systems given by (C.6)

these correspond to a saddle-point and a center respectively, which is why we callthis the saddle-center bifurcation. Compare Exercise C.2.

Remarks.

- In a similar way we can get a hold of all cuspoids and umbilics, etc. [194], toobtain bifurcation models in Hamiltonian systems using smooth equivalence,e.g., see [106].

1Officially we are talking ‘germs’.


‘Cantorisation’Diophantine

conditions

- For more degrees of freedom there generally is no hope of obtaining struc-tural stability in any sense whatsoever. A quite representative picture is givenby Figure C.5 in§C.3. It should be said however, that often a perturbationanalysis can be made, where an approximation can be found that fits, e.g., inthe setting of the previous remark. In general the persistence properties intro-duced in Chapter 2 serve well to describe the non-pathological dynamics. Inthe next section we give a few examples of this.

C.2 Quasi-periodic bifurcations

For the generic bifurcations of equilibria and periodic orbits, the bifurcation setsand diagrams are generally determined by a classical geometry in the product ofphase space and parameter space as already established by, e.g., [28, 194], oftenusing Singularity Theory. Quasi-periodic bifurcation theory concerns the extensionof these bifurcations to invariant tori in nearly-integrable systems, e.g., when thetori lose their normal hyperbolicity or when certain (strong) resonances occur. Inthat case the dense set of resonances, also responsable for the small divisors, leadsto a ‘Cantorisation’ of the classical geometries obtained from Singularity Theory,see, e.g., [62, 61, 58, 15, 74, 106, 23]. Broadly speaking onecould say that in thesecases the Preparation Theorem [194] is partly replaced byKAM Theory. Since theKAM Theory has been developed in several settings with or without preservation ofstructure, see Chapter 5 and Appendix B, for the ensuing quasi-periodic bifurcationtheory the same holds.

C.2.1 The quasi-periodic saddle-center bifurcation

We start with an example in the Hamiltonian setting, where a robust model for thequasi-periodic saddle center bifurcation is given by

Hµ,ε(I, ϕ, p, q) = ω1I1 + ω2I2 + 12p

2 + Vµ(q) + εf(I, ϕ, p, q) (C.7)

with Vµ(q) = 13q

3 − µq, compare with [105, 106]. The unperturbed (or integrable)caseε = 0, by factoring out theT2-symmetry, boils down to a standard saddlecenter bifurcation, involving the Fold catastrophe [194] in the potential functionV = Vµ(q), see§C.1.5. This results in the existence of two invariant 2-tori, oneelliptic and the other hyperbolic. For0 6= |ε| ≪ 1 the dense set of resonancescomplicates this scenario, as sketched in Figure C.2, determined by the Diophantineconditions

|〈k, ω〉| ≥ γ|k|−τ , for q < 0, (C.8)

|〈k, ω〉+ ℓ β(q)| ≥ γ|k|−τ , for q > 0

C.2 Quasi-periodic bifurcations 275

quasi-periodic!saddle-centerbifurcation

Cantor setquasi-

periodic!Hopfbifurcation

for all k ∈ Zn \ 0 and for allℓ ∈ Z with |ℓ| ≤ 2. Hereβ(q) =√

2q is normalfrequency of the elliptic torus given byq =

√µ for µ > 0. As before, (cf. Chapter

5 and Appendix B), this gives a ‘Cantor set’ of positive measure [142, 141, 62, 61,58, 23, 105, 106].

For 0 < |ε| ≪ 1 Figure C.2 will be distorted by a near-identity diffeomorphism,compare with the formulations of the Theorems 5.2 and 5.3. Onthe Diophantine‘Cantor set’ the dynamics is quasi-periodic, while in the gaps generically there iscoexistence of periodicity and chaos, roughly comparable with Figure C.5.

Similar programs exist for all cuspoid and umbilic catastrophes [55, 56] as well asfor the Hamiltonian Hopf bifurcation [54]. For applications of this approach see[57]. For a reversible analogue see [52]. As so often within the gaps genericallythere is an infinite regress of smaller gaps [38, 57]. For theoretical background werefer to [142, 58], for more references also see [23].

Remark. One extension of the above theory, both in the Hamiltonian and in thedissipative context, as well as in many other structure preserving settings, deals with‘lack of parameters’, where also ‘distinguished’ parameters like action variablesare counted in. Here an appropriate path formalism considers generic subfamiliessatisfying Russmann-Bahktin conditions. This method hasbeen applied succesfullyby Herman in the 1990’s, for a description see [61, 23] and references therein.

C.2.2 The quasi-periodic Hopf bifurcation

In the general ‘dissipative’ case we basically follow the same strategy. Given thestandard bifurcations of equilibria and periodic orbits, we get complexer situationswhen invariant tori are involved as well. The simplest examples are the quasi-periodic saddle node and quasi-periodic period doubling [62] also see [61, 23].

Quasi-periodic versions exist of the saddle-node, the period doubling and the Hopfbifurcation. Returning to the setting withTn × Rm as the phase space, we remarkthat the quasi-periodic saddle-node and period doubling already occur form = 1, orin an analogous center manifold. The quasi-periodic Hopf bifurcation needsm = 2.We shall illustrate our results on the latter of these cases,compare with [61, 49] andreferences therein. Our phase space isTn × Rm = x (mod2π), y, where we aredealing with the multi-periodic invariant torusTn × 0. In the integrable case, byTn-symmetry we can reduce toRm = y and consider the bifurcations of relativeequilibria. The present interest is with small non-integrable perturbations of suchintegrable models.

We now discuss thequasi-periodic Hopf bifurcation[44, 62], largely following [15].The unperturbed, integrable familyX = Xµ(x, y) onTn × R2 has the form

Xµ(x, y) = [ω(µ) + f(y, µ)]∂x + [Ω(µ)y + g(y, µ)]∂y, (C.9)


BHTnon-degeneracy

αααααααααααααααααααααααααααααααααααα

ββββββββββββββββββββββββββββββββββββ

Figure C.3: Planar section of the ‘Cantor set’Γ(2)τ,γ.

weref = O(|y|) andg = O(|y|2) as before. Moreoverµ ∈ P is a multi-parameterandω : P → Rn andΩ : P → gl(2,R) are smooth maps. Here we take

Ω(µ) =

(

α −ββ α

)

,

which makes the∂y component of (C.9) compatible with the planar Hopf family(C.3). The present form of Kolmogorov non-degeneracy is Broer-Huitema-Takensstability [58], requiring that there is a subsetΓ ⊆ P on which the map

µ ∈ P 7→ (ω(µ),Ω(µ)) ∈ Rn × gl(2,R)

is a submersion. For simplicity we even assume thatµ is replaced by

(ω, (α, β)) ∈ Rn × R2.

Observe that if the nonlinearityg satisfies the well-known Hopf nondegeneracyconditions, e.g., compare [6, 13], then the relative equilibrium y = 0 undergoes astandard planar Hopf bifurcation as described before. Hereα again plays the roleof bifurcation parameter and a closed orbit branches off atα = 0. Without loss ofgenerality we assume thaty = 0 is attracting forα < 0, and that the closed orbitoccurs forα > 0, and is attracting as well. For the integrable familyX, qualitatively

C.3 Transition to chaos 277

Cantor sethyperbolicityresonance!gapresonance¡bubble’

we have to multiply this planar scenario withTn, by which all equilibria turn intoinvariant attracting or repellingn-tori and the periodic attractor into an attractinginvariant(n + 1)-torus. Presently the question is what happens to both then- andthe(n+ 1)-tori, when we apply a small near-integrable perturbation.

The story runs much like before. Apart from the non-degeneracy condition werequire Diophantine conditions like (C.8), defining the ‘Cantor set’

Γ(2)τ,γ = (ω, (α, β)) ∈ Γ | |〈k, ω〉+ ℓβ| ≥ γ|k|−τ , (C.10)

∀k ∈ Zn \ 0, ∀ℓ ∈ Z with |ℓ| ≤ 2,

In Figure C.3 we sketch the intersection ofΓ(2)τ,γ ⊂ Rn × R2 with a planeω × R2

for a Diophantine (internal) frequency vectorω, cf. (5.19).

From [44, 62] it now follows that for any familyX on Tn × R2 × P, sufficientlynearX in theC∞-topology a near-identityC∞-diffeomorphismΦ : Tn×R2×Γ →Tn×R2×Γ exists, defined nearTn×0×Γ, that conjugatesX to X when furtherrestricting toTn × 0 × Γ

(2)τ,γ. So this means that the Diophantine quasi-periodic

invariantn-tori are persistent of a diffeomorphic image of the ‘Cantorset’ Γ(2)τ,γ,

compare with the formulations of the Theorems 5.2 and 5.3.Similarly we can find invariant(n + 1)-tori. We first have to develop anTn+1

symmetric normal form approximation [44, 62, 51]. For this purpose we extend theDiophantine conditions (C.10) by requiring that the inequality holds for all |ℓ| ≤ Nfor N = 7. We thus find another large ‘Cantor’ set, again see Figure C.3,whereDiophantine quasi-periodic invariant(n + 1)-tori are persistent. Here we have torestrict to, e.g.,α > 0 depending on the sign of a normal form coefficient, comparewith Figure C.1.

In both the cases ofn-tori and of (n + 1)-tori, the nowhere dense subset of theparameter space containing the tori can be fattened by normal hyperbolicity to opensubsets. Indeed, the quasi-periodicn- and(n+1)-tori are∞-ly normally hyperbolic[113]. Exploiting the normal form theory [44, 62, 51] to the utmost and using a moreor less standard contraction argument [4, 44], a fattening of the parameter domainwith invariant tori can be obtained that leaves out only small ‘bubbles’ around theresonances, as sketched and explained in Figure C.4 for then-tori. For earlier resultsin the same spirit in a case study of the quasi-periodic saddle node bifurcation see[74, 75, 76], also compare with [38].

C.3 Transition to chaos

One of the main interests over the second half of the twentieth century has been thetransition between orderly and complicated forms of dynamics upon variation of


pendulum

α

β

σ1

σ2

Aσ1

Aσ2

Rσ1

Rσ2

H

Hc

Figure C.4: Fattening by normal hyperbolicity of a nowhere dense parameter setwith invariantn-tori in the perturbed system. The curveH is the Whitney smoothimage of theβ-axis in Figure C.3.H interpolates the Cantor setHc that containsthe non-hyperbolic Diophantine quasi-periodic invariantn-tori, corresponding toΓ

(2)τ,γ, see (C.10). To the pointsσ1,2 ∈ Hc discsAσ1,2 are attached where we find

attracting normally hyperbolicn-tori and similarly in the discsRσ1,2 repelling ones.The contact between the disc boundaries andH is infinitely flat [44, 62].

either intial states or of system parameters. By ‘orderly’ we here mean equilibriumand periodic dynamics and by ‘complicated’ quasi-periodicand chaotic dynam-ics, although we note that only chaotic dynamics is associated to unpredictability,e.g. §2.2.4. As already discussed in§5.1 systems like a forced nonlinear oscillatoror the planar 3-body problem exhibits coexistence of periodic, quasi-periodic andchaotic dynamics, also compare with Figure C.5 in Example C.1 on the parametri-cally forced pendulum.

Similar remarks go for the onset of turbulence in fluid dynamics. Around 1950this led to the scenario of Hopf-Landau-Lifschitz [117, 125, 126], which roughlyamounts to the following. Stationary fluid motion corresponds to an equilibriumpoint in an∞-dimensional state space of velocity fields. The first transition is aHopf bifurcation [116, 6, 13], where a periodic solution branches off. In a sec-ond transition of similar nature a quasi-periodic 2-torus branches off, then a quasi-


perioddoubling!sequence

pendulum

Figure C.5: Poincare maps of the undamped swing (C.11) forΩ ≈ 2ω. We tookc = 0 andA quite large, and even larger in the right picture. Several evolutions canbe distinguished.

periodic 3-torus, etc. The idea is that the motion picks up more and more frequen-cies and thus obtains an increasingly complicated power spectrum. In the early1970’s this idea was modified in the Ruelle-Takens route to turbulence, based onthe observation that, for flows, a 3-torus can carry chaotic (or ‘strange’) attractors[174, 146], giving rise to a broad band power spectrum. By thequasi-periodic bi-furcation theory [62, 61, 23] as sketched below, these two approaches are unified ina generic way, keeping tack of measure theoretic aspects.

Another transition to chaos was detected in the Logistic family of interval maps

Φµ(x) = µx(1 − x),

as met in Chapter 1, also see [5, 134, 137], also for a holomorphic version. Thistransition consists of an infinite sequence of period doubling bifurcations endingup in chaos; it has several universal aspects and occurs generically in families ofdynamical systems. In many of these cases also homoclinic bifurcations show up,where sometimes the transition to chaos is immediate when parameters cross a cer-tain boundary, for general theory see [41, 42, 2, 16]. There exist quite a number ofcase studies where all three of the above scenario’s plays a role, e.g., see [65, 66, 67]and many of their references.

Example C.1 (Chaos in the swing).Consider the parametrically forced pendulum,also calledswing, given by the equation of motion

ϕ′′ = −cϕ′ − (ω2 + A cos Ωt) sinϕ, (C.11)


Figure C.6: Poincare map of the damped swing, again forΩ ≈ 2ω. The cloudof points depicted, belongs to one evolution. The geometricshape that becomesclearer when more iterates are added, and it attracts all nearby evolutions. We herespeak of a ‘strange attractor’.

which is a slight variation on (1.5) of§1.1. In the Figures C.5 and C.6 we depict afew significant orbit segments

(

ϕ( 2Ωπn), ϕ′( 2

Ωπn))

| 0 ≤ n ≤ N

of the Poincare map. Figure C.5 corresponds to the undampedcase of (C.11), i.e.,with c = 0. Here we see several ‘islands’ in a ‘chaotic sea’. Also we see anumberof dotted closed curves. A probable interpretation runs as follows. The islands aredomains where Moser Twist Map Theorem 5.3 applies, giving a (topological) discthat contains a union of quasi-periodic invariant circles constituting a set of positivemeasure. In the center of the discs there is a periodic orbit (not shown here), cor-responding to a subharmonic periodic evolution of (C.11).2 We witness four largerislands, that contain two periodic sets of period 2 (each with an approximate mirrorsymmetry for reflections in the origin), corresponding to subharmonic periodic evo-lutions of period4

Ωπ. These solutions are entrained by the fact thatΩ ≈ 2ω : thisphenomenon also is calledresonance.

Remarks.

- The subharmonics of period 2 close to the horizontal axis, related to a1 : 2resonance, can be used to explain the swinging motion of an incense containerin the cathedral of Santiago de Compostela (Spain), also see[47].

2The central periodic point is a Lebesgue density point of quasi-periodicity [140, 160].


dispersion exponentH“’enon!-like

attractorHopf-Landau-

Lifschitz-Ruelle-Takens route toturbulence

- The chaotic orbits are not yet well-understood. For the conservative cases ofFigure C.5 it was conjectured in 1967 [34] that each orbit densely fills a setof positive area, thereby forming anergodic componentof the dynamics.

This conjecture holds generally for similar area preserving maps with homo-or heteroclinic tangle, for instance also for certain (iso-energetic) Poincaremaps in the 3-body problem. Up to now such conjectures only have beenproven in certain toy models.

Note that homoclinic tangle implies the existence of Horseshoes, for examplesee Figure 4.8, which means that there is positive topological entropy [34, 17].An open question is whether also the dispersion exponent is positive.

- In Figure C.6 only one orbit of the Poincare map is shown in the damped, i.e.,dissipative case, again close to1 : 2 resonance, so whereΩ ≈ 2ω. It turns outthat almost any initial state after a transient is attractedtowards this ‘figure’,which probably is a chaotic, Henon like attractor, comparewith §4.2.2. Theconjecture is that this attractor coincides with the closure

W u((0, 0))

of the unstable manifold to the saddlepoint(x, y) = (0, 0).Also questions likechaoticity and ergodicity play here. Again, to settle this and similar problemsmathematically momentarily seems to be out of reach. For a discussion withquite a few references to the literature, we refer back to§4.2.2.

- It is interesting to scan a model like (C.11) for bifurcations numerically. Alsoit seems feasible to prove certain of the above points, like positivity of disper-sion or Lyapunov exponents, in a computer assisted way. For similar studiessee [65, 66, 67, 68].

Example C.2 (Scenario for the onset of turbulence).Generally speaking, in maysettings quasi-periodicity constitutes the order in between chaos [61]. In the Hopf-Landau-Lifschitz-Ruelle-Takens scenario [117, 125, 126,174] we may consider asequence of typical transitions as given by quasi-periodicHopf bifurcations, startingwith the standard Hopf or Hopf-Neımark-Sacker bifurcation as described before. Inthe gaps of the Diophantine ‘Cantor’ sets generically therewill be coexistence ofperiodicity, quasi-periodicity and chaos in infinite regress. As said earlier, perioddoubling sequences and homoclinic bifurcations may accompany this.

As an example consider a family of maps that undergoes a generic quasi-periodicHopf bifurcation from circle to 2-torus. It turns out that here the Cantorised fold ofFigure C.2 is relevant, where now the vertical coordinate isa bifurcation parameter.Moreover compare with Figure 5.4, where also variation ofε is taken into account.


The ‘Cantor set’ contains the quasi-periodic dynamics, while in the gaps we canhave chaos, e.g., in the form of Henon like strange attractors [146, 67]. A fatteningprocess as explained above, also can be carried out here.

C.4 Exercises

Exercise C.1 (Smooth equivalence of 1 degree of freedom Hamiltonian sys-tems). Given R2 = p, q with the symplectic formσ = dp ∧ dq. Consider aHamilton functionH : R2 → R and a smooth diffeomorphismΦ : R2 → R2.DefineG = H Φ−1 and consider the Hamiltonian vector fieldsXH andXG. ShowthatΦ is a smooth equivalence fromXH toXG.(Hint: Show thatΦ∗(XH)(p, q) = (det d(p,q)Φ) ·XG(Φ(p, q)).)

Exercise C.2 (Phase portraits of the Hamiltonian saddle-center bifurcation).Draw graphs of the potential functionsVµ(q) = 1

3q3 − µq for µ < 0, µ = 0 and

µ > 0. Sketch the corresponding phase portraits of the vector fieldXHµ, where

Hµ(p, q) = 12p

2 + Vµ(q).

Exercise C.3 ((∗) (Similarity) codimension). For the saddle-center bifurcationmodelH(p, q) = 1

2p2 + 1

3q3−µq, consider the parabola of equilibria, in the product

of phase space and parameter space given by the equationp = 0 andµ = q2. Alsoconsider the corresponding linear part of the Hamiltonian vector field, which formsa smooth curvec in the spacesl(2,R) of trace zero matrices.

1. Show thatc is smoothly parametrized byq and compute thec(q) for q < 0,q = 0 andq > 0;

2. LetN = A ∈ sl(2,R) | detA = 0. Show thatN ⊂ sl(2,R) is a codimen-sion 1 surface with a singularity at0 ∈ sl(2,R);

3. Show that the curvec intersectsN transversally;

4. (∗) Considering the similarity action ofSL(2,R) on sl(2,R), show thatc is auniversal unfolding of

c(0) =

(

0 01 0

)

.

(Hint: Have a good look in [1, 58].)

Appendix D

Derivation of the Lorenz equations

The derivation of the Lorenz equations is quite involved andat certain momentsfacts known from fluid dynamics will have to be used. We proceed as follows. Firsta precise description of the geometry of the problem is given. Next an analysisis given of the equations of motion of incompressible fluids of constant densitywithin this geometry. Finally we introduce the correction in these equations ofmotion as a consequence of temperature differences. In thisway a system of partialdifferential equations is obtained. These have to be reduced to a system of threeordinary differential equations. To this end we chose an appropriate 3-dimensionalsubspace of our function space, such that the projection on this subspace contains asmuch as possible of the original dynamics. To arrive at a goodchoice, we need thehelp of a stability analysis that Rayleigh [164] carried outfor this system. After thisa projection on a 3-dimensional subspace turns out to be feasible. For this derivationwe gratefully made use of [201].

D.1 Geometry and flow of an incompressible fluid

Our starting position is a 2-dimensional layer, that in the horizontal direction isinfinitely extended, but which is bounded in the vertical direction. For the horizontalcoordinate we takex and for the vertical oney. Our layer can be described asy1 ≤ y ≤ y2. Without loss of generality we may assume thaty1 = 0. Forthe height we can also take whatever we want, which can be arranged by chosingappropriate units. It turns out thaty2 = π is convenient and we so arrive at

0 ≤ y ≤ π.In reality 3-dimensional layers are even more relevant, however, also in a 3-dimensionallayer 2-dimensional flows can occur. For such flows, there is no component in thesecond horizontal direction and also all velocities are independent of the secondhorizontal coordinate.

284 Derivation of the Lorenz equations

Navier Stokesequation

A 2-dimensional fluid flow is described by a velocity fieldv = (vx, vy), where bothcomponents are functions ofx, y and the timet. The equation of motion is deducedfrom Newton’s lawF = ma. The acceleration of a ‘fluid particle’ is given by

(∂t + ∂v)v,

where we abbreviate∂t = ∂/∂t and∂v = vx∂x + vy∂y. We assume that the densityof our fluid equals 1. The force effected per unit of volume, and hence per unitmass, is constituted by three contributions. The first is thenegative gradient of thepressure

−gradp = (−∂xp,−∂yp).

Secondly there is a force due to friction in the fluid and that will have the form

ν∆v,

where∆ = ∂xx + ∂yy is the Laplace operator and whereν is a constant determinedby the viscosity of the fluid (i.e., the degree in which the fluid is experiencing inter-nal friction). Thirdly, we keep the possibility open of an ‘external force’

Fext.

In this way we find the Navier-Stokes equation

∂tv = −∂vv − gradp+ ν∆v + Fext. (D.1)

A more extensive explanation of this equation can be found inany good book onfluid dynamics, e.g., see [40].

Remarks.

- We imposed the requirement that the fluid is incompressibleand has constantdensity. This means that the velocity fields isdivergence free, meaning that

divv = ∂xvx + ∂yvy = 0;

for this and various other facts from vector analysis that weuse here, we referto [48]. This implies that the right hand side of the Navier Stokes equation(D.1) also has to be divergence free. It can be shown that, forgiven v andFext, this requirement, up to an additive constant, uniquely determines thepressurep.

- For any divergence free vector fieldv in dimension 2 there exists astreamfunction, i.e., a functionΨ, such that

vx = −∂yΨ, vy = ∂xΨ.

D.2 Heat transport and the influence of temperature 285

Since in our case the velocity of the fluid along the boundaries of the stripmust be horizontal, any associated stream function has to beconstant on theseboundary components. If there is a difference between theseconstants forthe upper and lower boundary, there will be a netto flow in the horizontaldirection. We exclude this and therefore assume that our stream functions arezero on the boundaries of the strip. Moreover we note that, contrarious towhat happens often, we do not require that the velocity alongthe boundariesvanishes, but just that this velocity is parallel to these.

- For 2-dimensional vector fieldsv = (vx, vy) we define thecurl by

curlv = ∂xvy − ∂yvx.

Since in the Navier-Stokes equation (D.1) both the left and the right hand sideare divergence free and the vector fields have to be tangent tothe boundaries,we don’t loose information if we apply the curl operator on alterms in theequation. We here note that ifΨ is the stream function ofv then

curlv = ∆Ψ.

Thus we transform the Navier-Stokes equation (D.1) into itsvorticity form

∂t∆Ψ = −∂v∆Ψ + ν∆2Ψ + curlFext. (D.2)

D.2 Heat transport and the influence of temperature

As noted before, we shall assume that the lower boundary of the strip is warmer thanthe upper boundary. We set the temperature of the lower boundary at the valueTand keep the temperature of the upper boundary at0. This gives rise to a stationarytemperature distribution

τstat(x, y, t) = T − yT/π,

recalling that the height of the strip isπ. This is indeed the temperature distributionwhen the fluid is at rest and if the only heat transfer is due to heat conduction. Wenow describe a temperature distributionτ(x, y, t) in terms of the deviation from thestationary distribution:

Θ(x, y, t) = τ(x, y, t) − τstat(x, y, t).

From this definition it follows thatΘ, as well as our stream function, has to vanishat the strip boundaries. If in the fluid there is no heat conduction, and hence thetemperature moves with the fluid, then necessarily

∂tτ = −∂vτ.


In the case that heat conduction does take place in the fluid, the right hand sideneeds an extra termκ∆τ, whereκ depends on the way in which the fluid conductsthe heat. For the functionΘ this means

∂tΘ = −∂vΘ − ∂v(−yT/π) + κ∆Θ. (D.3)

Now we still have to describe the influence of the temperaturedistribution on themotion of the fluid. This influence is a consequence of the factthat warm fluid isless dense than cold fluid. This leads to a problem since we have assumed that thefluid is uncompressible and has everywhere the same density.In the textbook [40],as already mentioned, arguments are given that under reasonably general circum-stances, the conclusions of the following ‘reasoning’ holdto good approximation.The density, for small differences in temperature, can be assumed equal to

ρ = ρ0 − cτ,

for positive constantsc andρ0. This means that we have a gravitational force perunit of volume, as a function of time and position, that is equal to a constant pluscτ .1 In summary, in the Navier Stokes equation (D.1), (D.2), the external forceFext, up to an additive constant, is set equal to(0, c1τ), which casts the vorticityform (D.2) Navier-Stokes equations into the form

∂t∆Ψ = −∂v∆Ψ + ν∆2Ψ + c∂xΘ. (D.4)

Now, finally, to get the whole system of equations (D.3) and (D.4) in a simpler form,we note that, ifΨ is the stream function of the velocity fieldv, then for an arbitraryfunctionf the directional derivative can be expressed as follows:

∂vf = ∂xΨ ∂yf − ∂yΨ ∂xf =∂(Ψ, f)

∂(x, y),

where in the latter expression we use notation familiar fromtransforming integrals.A direct computation confirms this. We now express the total set of equations as

∂t∆Ψ = −∂(Ψ,∆Ψ)

∂(x, y)+ ν∆2Ψ + c∂xΘ (D.5)

∂tΘ = −∂(Ψ,Θ)

∂(x, y)+ 1

πT∂xΨ + κ∆Θ.

1NB: Here two minus signs have cancelled as follows. Gravitation points downward, while thepositivey-directions points upward: this accounts for one minus sign. The formula for the densitycontains the second minus sign.

D.3 Rayleigh stability analysis 287

D.3 Rayleigh stability analysis

The equations (D.5) have a trivial solution, namely where both Ψ andΘ are identi-cally zero. This solution corresponds to the case where the fluid is at rest and heatconduction occurs exclusively by diffusion. Now Rayleigh attempted to answer thefollowing question: for which differences of temperatureT is this zero solution sta-ble? A feasible method for this is tolineariseat this zero solution. (Recall how weobtained a good idea of the dynamics near the saddle point of the Henon map bylinearisation.) Presently we linearise by omitting the terms

−∂(Ψ,∆Ψ)

∂(x, y)and − ∂(Ψ,Θ)

∂(x, y)

from (D.5), evidently leading to

∂t∆Ψ = ν∆2Ψ + c∂xΘ (D.6)

∂tΘ = 1πT∂xΨ + κ∆Θ.

We first determine a great number of solutions of the linearised system (D.6), lateronexplaining the meaning of these for the stability problem. We try solutions of theform

Ψ = X(t)Ψn,a, Θ = Y (t)Θn,a,

whereΨn,a = sin(ax) sin(ny), Θn,a = cos(ax) sin(ny), (D.7)

with a ∈ R andn ∈ Z both positive. Substituting these in the linearised system(D.6) gives the following equations for the scalar functionsX(t) andY (t) :

∂tX = −ν(a2 + n2)X +ca

a2 + n2Y

∂tY = 1πTaX − κ(a2 + n2)Y.

This is a linear system with matrix

A =

(

−ν(a2 + n2) ca (a2 + n2)−1

1πTa −κ(a2 + n2)

)

.

Since all our constants are positive, the trace ofA is negative. Moreover, for suf-ficiently small values ofT both eigenvalues are negative. For increasingT thisremains true, at least als long asdetA > 0. The turning point occurs for the tem-peratureTn,a, where we getdetA > 0, i.e., where

νκ(a2 + n2)2 =Tn,aa

2c

π(a2 + n2)


or

Tn,a =νκπ(a2 + n2)3

a2c.

If we view the solutions of (D.6) as very small perturbationsof the zero solution of(D.5), then we see that forT > Tn,a these small solutions can grow exponentially.For such values ofT we don’t expect stability. Our interest is now with valuesof n anda for which Tn,a is minimal. Clearly, to minimiseTn,a, we have to taken = 1. Next, to determine the value ofa whereTn,a is minimal, we have to findthe minimum of(a2 + 1)3/a2. This occurs fora2 = 1

2. The corresponding ‘critical’

temperature then is

Tc = T1,√

1/2=

27νκπ

4c. (D.8)

We now return to the stability problem. According to Fouriertheory, any initialstate(Ψ,Θ) of the equation of motion (D.5) can be expressed as an infinitesum(or integral) of contributions in terms ofΨn,a andΘn,a. We used this before whendiscussing the heat equation in§1.3.5. Again we refer to [167] for backgroundinformation.

Remarks.

- Apart from these, also contributions occur as indicated atthe end of the previ-ous paragraph, namely when in the definitions ofΨn,a andΘn,a of (D.7), wejust replacesin(ax) by cos(ax) or cos(ax) by − sin(ax), respectively. Thisdoes not essentially change the argument.

- The functionsΨ andΘ both vanish at the boundariesy = 0, π of the do-main of definition. This is the reason why in they-directions only contribu-tions show up of the formsin(ny).

- In thex-direction the domain is unbounded, whence the parametera can at-tain all (positive) values. This means that the Fourier expansion in thex-direction gives rise to an integral. This might be differentif we would boundthe domain in thex-direction as well, or by requiring so-called ‘periodicboundary conditions’. We won’t go into this matter here.

- We won’t consider any aspects regarding the convergence ofFourier expan-sions.

When taking an arbitrary initial state(Ψ,Θ) of (D.5), expressing it as a sum (or anintegral) of contributions as (D.7) etc., then the corresponding solution again canbe written as such a sum (or integral) of the solutions corresponding to the various(n, a). For T < Tc (see (D.8)) we expect the solution to tend to the zero-solution

D.4 Restriction to a 3-dimensional state space 289

(since all contributions tend to 0). In this case the zero-solution will be stable.We already said that forT > Tc the zero-solution is not stable, in the sense thatsolutions of the linearised system can grow exponentially.

Remark. We must point out that our argumentation, in order to obtain mathematicalrigour, needs completion. In this discussion we systematically have omitted allconsiderations dealing with function spaces, i.e.infinite dimensional vector spaces.Nevertheless we shall considerTc as aformalstability boundary.

D.4 Restriction to a 3-dimensional state space

After the above stability analysis it may be feasible to lookfor approximating so-lutions of the equations of motion (D.5), starting from solutionsΨn,a andΘn,a, see(D.7), of the linearised system (D.6) found so far. It also seems reasonable to startwith the solution that is ‘the first’ to be unstable, so fora2 = 1/2 andn = 1. Thishas to to with the theory of Center Manifolds [113], discussed elsewhere in the text.Presently, however, the parametera is still kept free, although we shall fixn = 1.We first check in how far the solution of the linearised system(D.6) satisfies thecomplete nonlinear equations (D.5). To this end we compute

∂(Ψ1,a,∆Ψ1,a)

∂(x, y)= 0

∂(Ψ1,a,Θ1,a)

∂(x, y)= 1

2a sin(2y).

We see a ‘new’ functionsin(2y) show up. Note however that neither the Laplaceoperator∆ nor the partial derivative∂x, applied to this function, yields any newfunctions. So if we want to restrict to a 3-dimensional statespace, it seems feasibleto consider approximating solutions of the form

Ψ = X(t)Ψ1,a;

Θ = Y (t)Θ1,a − Z(t) sin(2y).

When substituting in the equation of motion (D.5) the only essentially new term is

X(t)Z(t)∂(Ψ1,a, sin(2y))

∂(x, y)= X(t)Z(t)2a cos(ax) sin y cos(2y)

= X(t)Z(t)(−a cos(ax)(sin y − sin(3y))

= X(t)Z(t)(−aΘ1,a + a cos(ax) sin(3y).

So, the only term outside the basis specified so far, is

aX(t)Z(t) cos(ax) sin(3y) (D.9)


Lorenz!system and it is this term that we shalldeletein order to get a projection on a 3-dimensionalstate space. We have to admit that here there is some arbitrariness: we also couldhave deleted the whole new term. What we did here is carry out an orthogonalprojection with respect to the an inner product in which the basis functions for theFourier decomposition are orthogonal. And this choice is inspired by the fact thatthese basis functions of the Fourier decomposition also arethe eigenfunctions ofthe linearised system (D.6). After carrying out the substitution in (D.5) (includingthis deletion) we get:

∂t(−(a2 + 1)X(t)Ψ1,a) = ν(a2 + 1)2X(t)Ψ1,a − acY (t)Ψ1,a

∂t(Y (t)Θ1,a − Z(t) sin(2y)) = −12aX(t)Y (t) sin(2y) − aX(t)Z(t)Θ1,a

+ 1πaTX(t)Θ1,a − κ(a2 + 1)Y (t)Θ1,a

+4κZ(t) sin(2y),

which, by taking together the terms withΨ1,a, Θ1,a, and− sin(2y), yields the fol-lowing equations forX, Y enZ :

∂tX = −ν(a2 + 1)X + ac (a2 + 1)−1Y

∂tY = −aXZ + 1πaT X − κ(a2 + 1)Y

∂tZ = 12aXY − 4κZ.

This is the 3-dimensional system of differential equationsas announced. They don’tyet have the format in which the Lorenz equations are usuallyshown. We now carryout a few changes to obtain the standard Lorenz format, displayed below as (D.11).

As a first step we divide the right hand sides of all three equations byκ(a2 + 1),which ensures that the coefficient ofY in the second equation equals−1. Such amultiplication only makes the system faster or slower, but all other properties of thesolutions, like being (quasi-) periodic or chaotic, etc., will not be changed by this.This gives:

X ′ = −νκX +

ac

κ(a2 + 1)2Y

Y ′ = − a

κ(a2 + 1)XZ +

aT

κπ(a2 + 1)X − Y (D.10)

Z ′ =a

2κ(a2 + 1)XY − 4

a2 + 1Z.

Next we carry out scaling substitutions of the formX = Ax,Y = By andZ = Czto get the system on the standard form

x′ = σy − σx

y′ = rx− y − xz (D.11)

z′ = −bz + xy,

D.4 Restriction to a 3-dimensional state space 291

see (1.23) in§1.3. It will turn out that for all positive values of the parameterss ν, κ, a, c, and T, there exists a unique choice for corresponding values ofA, B, C, σ, r and b. In this way also the physical meaning of the coefficientsin the Lorenz equations (D.11) can be determined.

To show howA, B, C, σ, r andb should be chosen, we first carry out the scalingsubstitutionsX = Ax, etc., in (D.10). This gives

x′ = −νκx+

acB

κ(a2 + 1)Ay

y′ = − aAC

κ(a2 + 1)Bxz +

aTA

κπ(a2 + 1)Bx− y

z′ =aAB

2κ(a2 + 1)Cxy − 4

a2 + 1z.

Thus we find as necessary conditions to get the systems (D.10)and (D.11) the samefor givenν, κ, a, c andT :

σ =ν

κ;

ν

κ=

ac

κ(a2 + 1)

B

A⇒ B

A=ν(a2 + 1)2

ac;

r =aT

κπ(a2 + 1)

A

B=

Ta2c

κνπ(a2 + 1)3=

T

T1,a,

whereT1,a is the critical temperature (D.8). Moreover,

1 =a

κ(a2 + 1)

AC

B⇒ C =

κν(a2 + 1)3

a2c;

1 =a

2κ(a2 + 1)

AB

C⇒ AB =

2κ2ν(a2 + 1)4

a3c.

Since bothB/A andAB are determined (and positive), this means that the (positive)values ofA andB are uniquely determined. Next

b =4

a2 + 1.

Concerning the latter equation we recall that the most feasible value isa2 = 12 ,

which leads tob = 83 , which is the value mostly given to this coefficient.

Although for the value ofb in the standard Lorenz equations (D.11) there is a naturalchoice, this is not the case forr. For r, at least in the case that leads to the Lorenzattractor, one choosesr = 28. We note however, that for this value ofr, even withperiodic boundary conditions, several basis solutions of the linearised system (D.6),corresponding to higher values ofn, already have become unstable. This castscertain doubts on the choicer = 28 for the 3-dimensional approximation.


Lorenz!attractor

x

z

Figure D.1: Typical evolution of the Lorenz system (D.11), projected on the(x, z)-plane.

D.5 Numerical simulations

In Figure D.1 we project a rather arbitrarily chosen evolution of the Lorenz system(D.11) on the(x, z)-plane. Here the usual parameter valuesσ = 10, r = 28 andb = 8/3 have been taken. Compare with Figure 1.21 in§1.3. The figure shows whyone is referring to the Lorenz butterfly. The background of this name, however,is somewhat more complicated. As said earlier, Lorenz himself saw this systemmostly as supportive for the thesis that the rather inaccurate weather predictions arerelated to sensitive dependence on initial states of the atmospheric system. In a lec-ture [129, 130] he illustrated this by way of an example, saying that the perturbationof one wing beat of a butterfly in the Amazon jungle, after a month, can have grownto such a size, that it might ‘cause’ a tornado in Texas. It is this catastrophical but-terfly that the name relates to. As in the case of the Henon attractor, see Figure 1.13,this configuration is highly independent of the initial state, at least if we disregard atransient, while evolutions that start too far from the attractor tend to∞.

To demonstrate the sensitive dependence on intiatial state, in Figure D.2, comparewith Figure 1.22, we take two initial statesx = y = z = 0.2 andx = y = 0.2,z = 0.20001, plotting thex-coordinate of the ensuing evolutions as a function ofthe timet ∈ [0, 50]. We clearly witness that the two graphs first are indistinguish-able, but after some time show differences, after which theyrapidly get completelyindependent of each other.

D.5 Numerical simulations 293

x

t

20

−20

0

0

50

Figure D.2: Thex-coordinate of two evolutions of the Lorenz system (D.11) withnearby initial states, as a function of time: an illustration of sensitive dependence ofinitial state.

Bibliography

[1] V.I. Arnold, Geometrical Methods in the Theory of Ordinary DifferentialEquations.Springer-Verlag 1983.

[2] H.W. Broer, F. Dumortier, S.J. van Strien and F. Takens, Structures in dynam-ics, finite dimensional deterministic studies, In E.M. de Jager and E.W.C. vanGroesen (eds.)Studies in Mathematical Physics IINorth-Holland 1991.

[3] Y. Choquet-Bruhat, C. de Witt-Morette, and M. Dillard-Bleick. Analysis,Manifolds and Physics, North–Holland, 1977.

[4] S.-N. Chow, J.K. Hale,Methods of Bifurcation Theory, Springer-Verlag, 1982.

[5] R.L. Devaney,An Introduction to Chaotic Dynamical Systems, Benjamin,1986; reprint Addison-Wesley, 1989.

[6] J. Guckenheimer, P. Holmes,Nonlinear Oscillations, Dynamical Systems, andBifurcations of Vector Fields, Springer-Verlag, 1983.

[7] A. Katok, B. Hasselblatt,Introduction to the Modern Theory of DynamicalSystems, Cam. Univ. Press, 1995.

[8] J. Palis, M. de Melo,Geometric Theory of Dynamical Systems, Springer-Verlag, 1982.

[9] S.H. Strogatz,Nonlinear Dynamics and Chaos, Addison Wesley publishingcompany, 1994.

[10] F. Verhulst,Nonlinear Differential Equations and Dynamical Systems, SecondEdition, Springer Verlag, 1996.

[11] S. Wiggins, Introduction to Applied Nonlinear Dynamical Systems andChaos.Springer-Verlag 1990.

296 BIBLIOGRAPHY

On Bifurcation Theory

[12] S-N. Chow, C. Li and D. Wang,Normal Forms and Bifurcation of PlanarVector Fields. Cambridge University Press 1994.

[13] Yu.A. Kuznetsov,Elements of Applied Bifurcation Theory, Third Edition,Appl. Math. Sciences, Vol. 112, Springer, New York (2004).

[14] D. Ruelle,Elements of Differentiable Dynamics and Bifurcation Theory. Aca-demic Press 1989.

[15] M.C. Ciocci, A. Litvak-Hinenzon, H.W. Broer, Survey ondissipativeKAM

theory including quasi-periodic bifurcation theory basedon lectures by HenkBroer. In: J. Montaldi and T. Ratiu (eds.):Geometric Mechanics and Symme-try: the Peyresq Lectures,LMS Lecture Notes Series,306. Cambridge Uni-versity Press, 2005, 303-355.

[16] J. Palis, F. Takens,Hyperbolicity & Sensitive Chaotic Dynamics at HomoclinicBifurcations, Cambridge Univ. Press, 1993.

On Ergodic Theory

[17] R. Mane,Ergodic Theory of Differentiable Dynamics, Springer-Verlag, 1983.

[18] P. Walters,An Introduction to Ergodic Theory, Springer-Verlag 1981.

Handbooks and Encyclopædiæ

[19] B. Hasselblatt, A. Katok (eds.),Handbook of Dynamical SystemsVol. 1A,North-Holland, 2002.

[20] B. Hasselblatt, A.I. Katok (eds.),Handbook of Dynamical SystemsVol. 1B,North-Holland, 2006.

[21] B. Fiedler (ed.),Handbook of Dynamical SystemsVol. 2, North-Holland,2002.

[22] H.W. Broer, B. Hasselblatt, F. Takens (eds.),Handbook of Dynamical SystemsVol. 3, North-Holland 2007 (to appear).

[23] H.W. Broer, M.B. Sevryuk,KAM Theory: quasi-periodicity in dynamical sys-tems. In: H.W. Broer, B. Hasselblat, F. Takens (eds.),Handbook of DynamicalSystemsVol. 3, North-Holland 2008 (to appear).

BIBLIOGRAPHY 297

[24] H.W. Broer, F. Takens, Preliminaries of dynamical systems. In: H.W. Broer, B.Hasselblat, F. Takens (eds.),Handbook of Dynamical SystemsVol. 3, North-Holland 2008 (to appear).

[25] A.J. Homburg, B. Sandstede, In: H.W. Broer, B. Hasselblat, F. Takens (eds.), add titleHandbook of Dynamical SystemsVol. 3, North-Holland 2008 (to appear).

[26] F. Takens, Reconstruction theory and nonlinear time series analysis. In: H.W.Broer, B. Hasselblat, F. Takens (eds.),Handbook of Dynamical SystemsVol.3, North-Holland 2008 (to appear).

[27] F. Takens, A. Vanderbauwhede, Local invariant manifolds and normal forms.In: H.W. Broer, B. Hasselblatt, F. Takens (eds.),Handbook of Dynamical Sys-temsVol. 3, North-Holland 2008 (to appear).

[28] V.I. Arnold (ed.), Dynamical Systems. V. Bifurcation Theory and Catastro-phe Theory, Encyclopædia of Mathematical Sciences, Vol. 5, Springer-Verlag,1994.

Other references

[29] D. Aeyels, Generic observability of differentiable systems,SIAM J. Controland Optimization19 (1981), 595-603.

[30] V.I. Arnold, On the classical perturbation theory and the stability problem ofthe planetary system,Dokl. Akad. Nauk SSSR145(1962), 487-490.

[31] V.I. Arnold, Small divisors I: On mappings of the circleonto itself,IzvestiyaAkad. Nauk SSSR, Ser. Mat.25 (1961), 21–86 (in Russian); English transla-tion: Amer. Math. Soc. Transl., Ser.2 46 (1965), 213–284; Erratum:IzvestiyaAkad. Nauk SSSR, Ser. Mat.28 (1964), 479–480 (in Russian).

[32] V.I. Arnold, Proof of a theorem by A.N. Kolmogorov on thepersistence ofconditionally periodic motions under a small change of the Hamilton function,Russian Math. Surveys18 (5) (1963), 9-36 (English; Russian original).

[33] V.I. Arnold, Mathematical Methods of Classical Mechanics, GTM 60Springer-Verlag, New York, 1978.

[34] V.I. Arnold, A. Avez, Probemes Ergodiques de la Mecanique classique,Gauthier-Villars, 1967;Ergodic problems of classical mechanics, Benjamin,1968.

298 BIBLIOGRAPHY

[35] V.I. Arnold, B.A. Khesin,Topological Methods in Hydrodynamics, Springer-Verlag, 1998.

[36] R. Badii, A. Politi, Hausdorff dimension and uniformity factor of strange at-tractors,Phys. Rev. Letters52 (1984), 1661-1664.

[37] R. Badii, A. Politi, Statistical description of chaotic attractors: the dimensionfunction,J. Stat. Phys. 40 (1985), 725-750.

[38] C. Baesens, J. Guckenheimer, S. Kim and R.S. MacKay, Three coupled oscil-lators: Mode–locking, global bifurcation and toroidal chaos,Physica D49(3)(1991), 387-475.

[39] J. Barrow-Green,Poincare and the Three Body Problem, History of Mathe-matics, Vol. 11, Amer. Math. Soc., Providence, RI; London Math. Soc., Lon-don (1997).

[40] G.K. Batchelor,An introduction to Fluid Dynamics, Cambridge Univ. Press,1967.

[41] M. Benedicks, L. Carleson, On iterations of1 − ax2 on (−1, 1), Annals ofMath. 122(1985), 1-24.

[42] M. Benedicks, L. Carleson, The dynamics of the Henon map, Ann. Math. 133(1991), 73-169.

[43] P. Blanchard, R.L. Devaney, G.R. Hall,Differential Equations(Third Ed.),Thomson Brooks / Cole, 2006.

[44] B.L.J. Braaksma and H.W. Broer, On a quasi-periodic Hopf bifurcation,Ann.Institut Henri Poincare, Analyse non lineaire4(2) (1987), 115-168.

[45] D.R. Brillinger,Time series, McGraw-Hill, 1981.

[46] M. Brin, A. Katok, On local entropy. In: J. Palis (ed.),Geometric Dynamics,LNM 1007(1983).

[47] H.W. Broer, De chaotische schommel,Pythagoras35(5) (1997), 11-15.

[48] H.W. Broer,Meetkunde en Fysica, Epsilon Uitgaven44, 1999.

[49] H.W. Broer, Coupled Hopf-bifurcations: Persistent examples of n-quasiperiodicity determined by families of 3-jets,Asterisque286(2003), 223-229.

BIBLIOGRAPHY 299

[50] H.W. Broer, KAM theory: the legacy of Kolmogorov’s 1954paper,Bull. AMS(New Series)41(4) (2004), 507-521.

[51] H.W. Broer, Normal forms in perturbation theory. In: R.Meyers (ed.),En-cyclopaedia of Complexity & System Science. To be published by Springer-Verlag 2008.

[52] H.W. Broer, M.-C. Ciocci and H. Hanßmann, The quasi-periodic reversibleHopf bifurcation,Intern. J. Bifurcation Chaos, 17(8) (2007), 2605-2623.

[53] H.W. Broer, R.H. Cushman, F. Fasso and F. Takens, Geometry of KAM torifor nearly integrable Hamiltonian systems.Ergod. Th & Dynam. Sys.(2007)to appear.

[54] H.W. Broer, H. Hanßmann and J. Hoo, The quasi-periodic Hamiltonian Hopfbifurcation,Nonlinearity20 (2007), 417–460.

[55] H.W. Broer, H. Hanßmann and J. You, Bifurcations of normally parabolic toriin Hamiltonian systems,Nonlinearity18 (2005), 1735–1769.

[56] H.W. Broer, H. Hanßmann and J. You, Umbilical torus bifurcations in Hamil-tonian systems,J. Differential Equations222(2006), 233–262.

[57] H.W. Broer, H. Hanßmann,A. Jorba, J. Villanueva and F.O.O. Wagener,Normal-internal resonances in quasi-periodically forcedoscillators: a conser-vative approach,Nonlinearity16 (2003), 1751–1791.

[58] H.W. Broer, J. Hoo and V. Naudot, Normal linear stability of quasi-periodictori, J. Differential Equations232(2007), 355-418.

[59] H.W. Broer and G.B. Huitema, A proof of the isoenergeticKAM -theorem fromthe “ordinary” one,J. Differential Equations90 (1991), 52–60.

[60] H.W. Broer and G.B. Huitema, Unfoldings of quasi-periodic tori in reversiblesystems,J. Dynamics Differential Equations7 (1995), 191-212.

[61] H.W. Broer, G.B. Huitema, M.B. Sevryuk, Quasi-periodic motions in Familiesof Dynamical Systems,Lecture Notes in Mathematics1645, Springer-Verlag1996.

[62] H.W. Broer, G.B. Huitema, F. Takens, B.L.J. Braaksma, Unfoldings and bi-furcations of quasi-periodic tori,Memoir AMS, 421, Amer. Math. Soc., Provi-dence, R.I. 1990.

[63] H.W. Broer, M. Golubitsky and G. Vegter, The geometry ofresonance tongues:a singularity theory approach,Nonlinearity16 (2003), 1511-1538.

300 BIBLIOGRAPHY

[64] H.W. Broer, B. Krauskopf, Chaos in periodically drivensystems. InB. Krauskopf and D. Lenstra (eds.),Fundamental Issues of Nonlinear LaserDynamics,American Institute of Physics Conference Proceedings548(2000),31-53.

[65] H.W. Broer, C. Simo and J.C. Tatjer, Towards global models near homoclinictangencies of dissipative diffeomorphisms,Nonlinearity11(3) (1998), 667–770.

[66] H.W. Broer, C. Simo, R. Vitolo, Bifurcations and strange attractors in theLorenz-84 climate model with seasonal forcing,Nonlinearity15(4) (2002),1205-1267.

[67] H.W. Broer, C. Simo, R. Vitolo, The Hopf-Saddle-Node bifurcation for fixedpoints of 3D diffeomorphisms, the Arnold resonance web,Bull. Belgian Math.Soc. Simon Stevin(2008) to appear.

[68] H.W. Broer, C. Simo, R. Vitolo, The Hopf-Saddle-Node bifurcation for fixedpoints of 3D diffeomorphisms, analysis of a resonance ‘bubble’, Physica D(Nonlinear Phenomena) (2008) to appear.

[69] H.W. Broer, F. Takens, Unicity ofKAM tori, Ergod. Th. & Dynam. Sys.27(2007), 713-724.

[70] L.E.J. Brouwer, Zur Analysis Situs,Math. Ann. 68 (1910), 422 - 434.

[71] M. Casdagli, Chaos and deterministicversusstochastic non-linear modelling,J. R. Statist. Soc. 54 B (1992), 303-328.

[72] M. Casdagli, S. Eubank (eds.),Nonlinear modelling and forecasting, Addison-Wesley, 1992.

[73] A. Cayley,complete

[74] A. Chenciner, Bifurcations de points fixes elliptiques. I. Courbes invariantes,Publ. Math. IHES61 (1985), 67–127.

[75] A. Chenciner, Bifurcations de points fixes elliptiques. II. Orbites periodiqueset ensembles de Cantor invariants,Invent. Math.80 (1985), 81–106.

[76] A. Chenciner, Bifurcations de points fixes elliptiques. III. Orbites periodiquesde “petites” periodes et elimination resonnante des couples de courbes invari-antes,Publ. Math. IHES66 (1988), 5–91.

[77] P. Collet, J.-P. Eckmann,Iterated Maps on the Interval as Dynamical Systems,Birkhauser, 1980.

BIBLIOGRAPHY 301

[78] R. Courant,Differential and Integral Calculus, 1937, heruitgave: Wiley, 1988.

[79] J.P. Crutchfield, J.D. Farmer, N.H. Packerd, R.S. Shaw,Chaos,ScientificAmericanDec. 1986, 38-49.

[80] R.H. Cushman and L.M. Bates,Global Aspects of Classical Integrable Sys-tems, Birkhauser, Basel (1997).

[81] R.H. Cushman, H.R. Dullin, A. Giacobbe, D.D. Holm, M. Joyeux, P. Lynch,D.A. Sadovskiı and B.I. Zhilinskiı, CO2 molecule as a quantum realizationof the1 : 1 : 2 resonant swing-spring with monodromy,Phys. Rev. Lett.93(2004), 024302.

[82] C.D. Cutler, Some results on the behaviour and estimation of the fractal di-mension of distributions on attractors,J. Stat. Phys. 62 (1991), 651-708.

[83] C.D. Cutler, D.T. Kaplan (eds.), Nonlinear dynamics and time series,FieldsInst. Communications11, AMS, 1997.

[84] D. van Dantzig,Studien over topologische algebra, dissertatie Rijksuniver-siteit Groningen, 1931.

[85] M. Denker, G. Keller, OnU-statistics and v. Mises’ statistics for weakly de-pendent processes,Z. Wahrscheinlichkeitstheorie verw. Gebiete64 (1983),505-522.

[86] M. Denker, G. Keller, Rigorous statistical proceduresfor data from dynamicalsystems,J. Stat. Phys. 44 (1986), 67-93.

[87] F. Diacu and P. Holmes,Celestial Encounters. The Origins of Chaos and Sta-bility, Princeton Univ. Press, Princeton, NJ (1996).

[88] L.J. Dıaz, I.L. Rios, M. Viana, The intermittency route to chaotic dynamics.In: H.W. Broer, B. Krauskopf, G. Vegter (eds.),Global Analysis of DynamicalSystems, IoP, 2001, 309-328.

[89] C. Diks,Nonlinear time series analysis, World Scientific, 1999.

[90] C. Diks, W.R. van Zwet, F. Takens, J. DeGoede, Detectingdifferences betweendelay vector distributions,Phys. Rev. 53 E (1996), 2169-2176.

[91] J.J. Duistermaat, On global action-angle coordinates, Comm. Pure Appl. Math.33 (1980), 687–706.

[92] J.J. Duistermaat, W. Eckhaus,Analyse van Gewone Differentiaalvergelijkin-gen, Epsilon Uitgaven33, 1995.

302 BIBLIOGRAPHY

[93] J.P. Eckmann, S. Oliffson Kamphorst, D. Ruelle, S. Ciliberto, Liapunov expo-nents from time series.Phys. Rev. 34 A (1986), 4971-4979.

[94] J.P. Eckmann, D. Ruelle, Ergodic theory of chaos and strange attractors,Rev.Mod. Phys. 57 (1985), 617-656.

[95] K.J. Falconer,The Geometry of Fractal Sets, Cam. Univ. Press, 1985.

[96] K.J. Falconer,Fractal Geometry, Wiley, 1990.

[97] R.P. Feynman, R.B. Leighton, M. Sands,The Feynman Lectures on Physics,Addison-Wesley, 1963.

[98] A. Galka,Topics in nonlinear time series analysis, with implications for EEGanalysis, World Scientific, 2000.

[99] P. Grassberger, I. Procaccia, Characterisation of strange attractors,Phys. Rev.Lett. 50 (1983), 346-349.

[100] P. Grassberger, I. Procaccia, Estimation of Kolmogorov entropy from achaotic signal,Phys. Rev. 28 A (1983), 2591-2593.

[101] P. Grassberger, I. Procaccia, Dimensions and entropies of strange attractorsfrom a fluctuating dynamics approach,Physica13 D (1984), 34-54.

[102] J. Guckenheimer, R.F. Williams, Structural stability of Lorenz attractors,Publ. I.H.E.S., 50 (1979), 59-72.

[103] P.R. Halmos,Measure Theory, Van Nostrand, 1950.

[104] T.C. Halsey, M.H. Jensen, L.P. Kadanoff, I. Procaccia, B.I. Shraiman, Fractalmeasures and their singularities: the characterisation ofstrange sets,Phys.Rev. 33 A (1986), 1141-1151.

[105] H. Hanßmann, The quasi-periodic centre-saddle bifurcation,J. DifferentialEquations142(1998), 305–370.

[106] H. Hanßmann,Local and Semi-Local Bifurcations in Hamiltonian Dynam-ical Systems. Results and Examples, Lecture Notes in Math., Vol. 1893,Springer-Verlag, 2007.

[107] F. Hausdorff,Set Theory(2nd edition), Chelsea 1962. [German original: Veit(Leipzig) 1914.]

[108] M. Henon, A two-dimensional map with a strange attractor, Comm. Math.Phys. 50 (1976), 69-77.

BIBLIOGRAPHY 303

[109] H.G.E. Hentschel, I. Procaccia, The infinite number ofgeneralised dimen-sions of fractals and strange attractors,Physica8 D (1983), 435-444.

[110] M. Herman, Mesure de Lebesgue et nombre de rotation. In: J. Palis, M doCarmo (eds.),Geometry and Topology, Lecture Notes in Mathematics597,Springer-Verlag, 1977, 271-293.

[111] M.R. Herman,Sur la conjugaison differentiable des diffeomorphismes ducerclea des rotations, Publ. Math. IHES49 (1979), 5-233.

[112] M.W. Hirsch,Differential Topology, Springer-Verlag, 1976.

[113] M.W. Hirsch, C.C. Pugh, M. Shub,Invariant Manifolds, Lecture Notes inMathematics583, Springer-Verlag, 1977.

[114] M.W. Hirsch, S. Smale,Differential Equations, Dynamical Systems, and Lin-ear Algebra, Acad. Press, 1974.

[115] M.W. Hirsch, S. Smale, R.L. Devaney,Differential Equations, DynamicalSystems & An Introduction to Chaos(Second Edition), Academic Press, 2004.

[116] E. Hopf, Abzweigung einer periodischen Losung von einer stationarenLosung eines Differentialsystems,Ber. Math.-Phys.Kl Sachs. Akad. Wiss.Leipzig94 (1942), 1-22.

[117] E. Hopf, A mathematical example displaying features of turbulence.Com-mun. Appl. Math. 1 (1948), 303-322.

[118] M.V. Jakobson, Absolutely continuous invariant measures for one-parameterfamilies of one-dimensional maps,Commun. Math. Phys.81 (1981), 39-88.

[119] H. Kantz, Quantifying the closeness of fractal measures, Phys. Rev.49 E(1994), 5091 - 5097.

[120] H.Kantz, T. Schreiber,Nonlinear time series analysis, Camb. Univ. Press,1997.

[121] J.L. Kaplan, J.A. Yorke, Chaotic behaviour of multidimensional differenceequations, in Functional differential equations and approximations of fixedpoints. In: H.-O. Peitgen, H.O. Walter (eds.), LNM730, Springer, 1979. title?

[122] J.L. Kelley,General topology, Van Nostrand, 1955. morerecent?

[123] F. Klein,Vorlesungenuber die Entwicklung der Mathematik im 19. Jahrhun-dert, Springer-Verlag, 1926-7; Reprinted: 1979.

304 BIBLIOGRAPHY

[124] S.E. Kuksin, Nearly integrable infinite-dimensionalHamiltonian dynamics,LNM 1556Springer–Verlag 1993.

[125] L.D. Landau, On the problem of turbulence,Akad. Nauk.44 (1944), 339.

[126] L.D. Landau and E.M. Lifschitz,Fluid Mechanics.Pergamon, Oxford 1959.

[127] F. Ledrappier, L.S. Young, The metric entropy of diffeomorphisms, Part Iand II,Ann. of Math. 122(1985), 509-539 and 540-574.

[128] R. de la Llave, A tutorial onKAM Theory.Proceedings of Symposia in PureMathematics69 (2001), 175–292.

[129] E.N. Lorenz, Deterministic nonperiodic flow,J. Atmosph. Sci.20 (1963),130-141.

[130] E.N. Lorenz,The Essence of Chaos, Univ. of Washington Press, 1993.

[131] B.B. Mandelbrot,The Fractal Geometry of Nature, Freeman, 1977.

[132] W.S. Massey,Algebraic Topology: and introduction, GTM 56, Springer-Verlag, 1967.

[133] R.M. May, Simple mathematical models with very complicated dynamics,Nature261459-466.year?

[134] W. de Melo, S.J. van Strien,One-Dimensional Dynamics, Springer-Verlag,1991.

[135] K.R. Meyer, G.R. Hall,Introduction to Hamiltonian Dynamical Systems andtheN-Body Problem, Springer-Verlag, 1992.

[136] J.W. Milnor, On the concept of attractor,Comm. Math. Phys.99 (1985), 177-195.

[137] J.W. Milnor,Dynamics in One Complex Variable, Third Edition, Ann. Math.Studies, Vol. 160, Princeton Univ. Press, Princeton, NJ (2006).

[138] C.W. Misner, K.S. Thorne, J.A. Wheeler,Gravitation, Freeman, 1970.

[139] L. Mora, M. Viana, Abundance of strange attractors,Acta Math171 (1993),1-71.

[140] J.K. Moser, On invariant curves of area-preserving mappings of an annulus,Nachr. Akad. Wiss. Gottingen, Math.-Phys. Kl. II.1 (1962), 1–20.

BIBLIOGRAPHY 305

[141] J.K. Moser, On the theory of quasiperiodic motions,SIAM Review8(2)(1966), 145-172.

[142] J.K. Moser, Convergent series expansions for quasi-periodic motions,Math.Ann.169(1967), 136-176.

[143] J.K. Moser, Lectures on Hamiltonian systems,Mem. AMS81 (1968), 1–60.

[144] J.K. Moser, Stable and random motions in dynamical systems, with specialemphasis to celestial mechanics,Ann. Math. Studies77 Princeton UniversityPress 1973.

[145] S.E. Newhouse, J. Palis and F. Takens, Bifurcations and stability of familiesof diffeomorphisms.Publ. Math. IHES57 (1983), 5-71.

[146] S.E. Newhouse, D. Ruelle and F. Takens, Occurrence of strange Axiom Aattractors near quasi-periodic flows onTm, m ≤ 3, Commun. Math. Phys.64(1978), 35-40.

[147] J.R. van Ommen, M.-O. Coppens, J.C. Schouten, C.M. vanden Bleek, Earlywarnings of agglomeration in fluidised beds by attractor comparison,A.I.Ch.E.Journal46 (2000), 2183-2197.

[148] V.I. Oseledets, A multiplicative ergodic theorem,Trans. of the Moscow Math.Soc. 19 (1968), 197-221.

[149] E. Ott, T. Sauer, J.A. Yorke (eds.),Coping with Chaos, Wiley, 1994.

[150] J. Oxtoby,Measure and Category.Springer-Verlag 1971.

[151] N.H. Packard, J.P. Crutchfield, J.D. Farmer, R.S. Shaw, Geometry from timeseries,Phys. Rev. Letters45 (1980), 712 - 716.

[152] H.-O. Peitgen (ed.),Newton’s Method and Dynamical Systems, KluwerAcad. Publ., 1989.

[153] H.-O. Peitgen, H. Jurgens, D. Saupe,Chaos and Fractals, Springer-Verlag,1992.

[154] Ya.B. Pesin,Dimension theory in dynamical systems, Univ. of Chicago Press,1997

[155] Ya.B. Pesin, H. Weiss, The multi-fractal analysis of Gibbs measures: moti-vation, mathematical foundation, and examples,Chaos7 (1997), 89-106.

306 BIBLIOGRAPHY

[156] M. St. Pierre, Topological and measurable dynamics ofLorenz maps,Dis-sertationes Math.382(1999).

[157] H. Poincare, Sur le probleme des trois corps et les equations de la dynamique,Acta Math.13 (1980), 1-270.

[158] B. van der Pol, De amplitude van vrije en gedwongen triode-trillingen,Tijd-schr. Ned. Radiogenoot.1 (1920), 3-31.

[159] B. van der Pol, The nonlinear theory of electric oscillations,Proc. of the Inst.of Radio Eng. 22 (1934), 1051-1086; Reprinted in:Selected Scientific Papers,North Holland, 1960.

[160] J. Poschel, Integrability of Hamiltonian systems onCantor sets.Comm. PureAppl. Math.35(5) (1982), 653-696.

[161] J. Poschel. A lecture on the classical KAM Theorem,Proceedings of Sym-posia in Pure Mathematics69 (2001), 707-732.

[162] M.B. Priestley,Spectral analysis and time series, Acad. Press, 1981.

[163] A. Quarteroni, R. Sacco and F. Saleri,Numerical Mathematics(2nd ed.)(2007), Springer-Verlag.

[164] Lord J.W.S. Rayleigh, On convective currents in a horizontal layer of fluidwhen the higher temperature is on the under side,Phil. Mag.32 (1916), 529-546.

[165] A. Renyi,Probability Theory, North-Holland, 1970.

[166] C. Robinson,Dynamical Systems, CRC Press, 1995

[167] A. van Rooij,Fouriertheorie, Epsilon Uitgaven, 1988.

[168] O.E. Rossler, Continuous chaos — four prototype equations. In:Bifurcationtheory and applications in scientific disciplines, The New York Academy ofSciences, 1979.ed.

[169] W. Rudin,Real and Complex Analysis, McGraw Hill, 1970.

[170] D. Ruelle, Microscopic fluctuations and turbulence,Phys. Lett.72A (1979),81-82.

[171] D. Ruelle,Chaotic Evolution amd Strange Attractors, Cambridge UniversityPress, 1989.

BIBLIOGRAPHY 307

[172] D. Ruelle, Deterministic chaos: the science and the fiction, Proc. R. Soc.LondonA 427 (1990), 241 - 248.

[173] D. Ruelle, Historical behaviour in smooth dynamical systems. In: H.W.Broer, B. Krauskopf, G. Vegter (eds.),Global Analysis of Dynamical Systems,IoP Publishing (2001), 63-66.

[174] D. Ruelle and F. Takens, On the nature of turbulence,Comm. Math. Phys.20(1971), 167-192;23 (1971), 343-344.

[175] H. Russmann, Konvergente Reihenentwicklungen in der Storungstheorie derHimmelsmechanik. In: K. Jacobs (ed.),Selecta MathematicaV, Springer-Verlag, 1979, 93-260.

[176] T. Sauer, J.A. Yorke, M. Casdagli, Embedology,J. Stat. Phys. 65 (1991),579-616.

[177] M. Shub, Stabilite globale des systemes dynamiques, Asterisque56 (1978)1-211.

[178] S. Smale, Diffeomorphisms with many periodic points.In: S. Cairns (ed.),Differential and Combinatorial Topology,Princeton Univ. Press 1965.

[179] S. Smale, Differentiable dynamical systems,Bull. Amer. Math. Soc.73(1967), 747-817.

[180] C. Sparrow,The Lorenz Equations: Bifurcations, Chaos and Strange Attrac-tors, Applied Mathematical Sciences,41Springer-Verlag, 1982.

[181] M. Spivak,Differential GeometryVol. I, Publish or Perish, 1970.

[182] S. Sternberg, On the structure of local homeomorphisms of Euclidean n-space, II,Amer. J. Math.80 (1958), 623-631.

[183] F. Takens, Singularities of Vector Fields,Publ. Math. IHES43 (1974), 47-100.

[184] F. Takens, Forced oscillations and bifurcations. In:Applications of GlobalAnalysis I, Comm. of the Math. Inst. Rijksuniversiteit Utrecht(1974). In:H.W. Broer, B. Krauskopf and G. Vegter (Eds.),Global Analysis of DynamicalSystems, Festschrift dedicated to Floris Takens for his 60th birthday(2001),1-62 IOP.

[185] F. Takens, Detecting strange attractors in turbulence, in: Dynamical systemsand turbulence, In: D. Rand, L.-S. Young (eds.),Dynamical Systems and Tur-bulence, Warwick1980, LNM898(1981), Springer-Verlag.

308 BIBLIOGRAPHY

[186] F. Takens, Invariants related to dimension and entropy, in: Atas do13o

coloquio Bras. de Mat. 1981, IMPA, Rio de Janeiro, 1983.

[187] F. Takens, Heteroclinic attractors: time averages and moduli of topologicalconjugacy,Bol. Soc. Bras. Mat. 25 (1994), 107-120.

[188] F. Takens, The reconstruction theorem for endomorphisms,Bol. Bras. Math.Soc. 33 (2002), 231-262.

[189] F. Takens, Linear versus nonlinear time series analysis — smoothed correla-tion integrals,Chemical Engineering Journal96 (2003), 99-104.

[190] F. Takens, E. Verbitski, Generalised entropies: Renyi and correlation integralapproach,Nonlinearity11 (1998), 771-782.

[191] F. Takens, E. Verbitski, General multi-fractal analysis of local entropies,Fund. Math. 165(2000), 203 - 237.

[192] J.-C. Tatjer, On the strongly dissipative Henon map,ECIT 87, SingaporeWorld Scientific, 1989, 331-337.

[193] J. Theiler, Estimating fractal dimensions,J. Opt. Soc. Am. 7 (1990), 1055-1073.

[194] R. Thom,Stabilite Structurelle et Morphogenese, Benjamin, 1972.

[195] H. Tong (ed.),Chaos and Forecasting, World Scientific, 1995.

[196] H. Tong, R.L. Smith (eds.) Royal statistical society meeting on chaos,J. ofthe R. Stat. Soc. B 54 (1992), 301-474.

[197] W. Tucker, The Lorenz attractor exists,C.R. Acad. Sci. ParisSer.I Math.328(1999) 1197-1202.

[198] E.A. Verbitski, Generalised entropies and dynamicalsystems,PhD Thesis,Rijksuniversiteit Groningen, 2000.

[199] M. Viana, Strange attractors in higher dimensions,Bull. Braz. Math. Soc.24(1993), 13–62.

[200] M. Viana, Global attractors and bifurcations. In: H.W. Broer, S.A. vanGils, I. Hoveijn, F. Takens (eds.),Nonlinear dynamical systems and chaos,Birkhauser, 1996, 299-324.

[201] M. Viana, What’s new on Lorenz strange attractors?The Mathematical In-telligencer22 (3) (2000) 6-19.

BIBLIOGRAPHY 309

[202] B.L. van der Waerden,Moderne AlgebraI and II, Springer-Verlag, 1931.

[203] F.O.O. Wagener, A note on Gevrey regular KAM theory andthe inverse ap-proximation lemma.Dynamical Systems18 (2003), 159-163.

[204] Q. Wang, L.-S. Young, Strange attractors with one direction of instability,Comm. Math. Phys.218(2001), 1-97.

[205] A.S. Weigend, N.A. Gershenfeld (eds.), Time series prediction, Addison-Wesley, 1994.

[206] H. Whitney, Analytic extensions of differentiable functions defined in closedsets.Trans. Amer. Math. Soc36(1) (1934), 63-89.

[207] H. Whitney, Differentiable functions defined in closed sets.Trans. Amer.Math. Soc.36(2) (1934), 369-387.

[208] H. Whitney, Differentiable manifolds,Ann. of Math. 37 (1936), 645-680.

[209] R.F. Williams, Expanding attractors,Publ. Math. I.H.E.S.50(1974) 169-203.

[210] R.F. Williams, The structure of the Lorenz attractors, Publ. I.H.E.S.50(1979), 73-99.

[211] J.-C. Yoccoz,C1-conjugaisons des diffeomorphismes du cercle. In: J. Palis(ed), Geometric Dynamics,Proceedings, Rio de Janeiro 1981, LNM1007(1983), 814–827.

[212] J.-C. Yoccoz, Travaux de Herman sur les tores invariants. In: SeminaireBourbaki Vol 754, 1991-1992,Asterisque206(1992), 311–344

[213] J.-C. Yoccoz, Theoreme de Siegel, nombres de Bruno et polynomes quadra-tiques,Asterisque231(1995), 3–88.

[214] J.-C. Yoccoz, Analytic linearization of circle diffeomorphisms. In: S. Marmi,J.-C. Yoccoz (Eds.)Dynamical Systems and Small Divisors,LNM 1784(2002), 125–174, Springer–Verlag.

[215] E. Zehnder. An implicit function theorem for small divisor problems,Bull.AMS80(1) (1974), 174-179.

[216] E. Zehnder. Generalized implicit function theorems with applications tosome small divisor problems, I and II,Comm. Pure Appl. Math., 28(1) (1975),91-140;29(1) (1976), 49-111.

Index

ω-limit set, 134n-dimensional torus, 73l’histoire se repete, 86, 94, 232‘Cantorisation’, 2741-bite small divisor problem, 186, 262

area preserving maps, 182Arnold family of circle maps, 181, 272asymptotically periodic evolution, 70asymptotically stationary evolution, 70attractor, 135

hyperbolic, 149non-hyperbolic, 149

Bairefirst category, 245property, 245second category, 245

basinboundaries, 156of attraction, 136

BHT non-degeneracy, 276bifurcation diagram, 35, 114, 115Borel measure, 253box counting dimension, 212Brin-Katok Theorem, 221

calendars, 69Cantor set, 141, 166, 179, 275, 277Catastrophe Theory, 273chaotic evolution, 89Circle Map Theorem, 180circle maps, 176complex linearization, 184conjugation

biholomorphic, 184smooth, 81, 176, 258topological, 123, 268Whitney smooth, 178, 259, 261

correlationdimensions, 219entropy, 219integral, 222

degree, 251Diophantine conditions, 179, 183, 184,

190, 259, 274, 277discretisation, 20dispersion exponent, 94, 97, 281Doubling map, 54, 90, 100, 110, 120,

125Doubling map on the plane, 137dripping faucet, 204dynamical system, 19

equivalencesmooth, 177, 273topological, 123, 268

eventually periodic evolution, 71eventually stationary evolution, 71evolution, 15

operator, 15past, 86

foliation, 146

generic property, 245gradient system, 157

Henon

INDEX 311

-like attractor, 150, 281attractor, 29system, 28

Hartman-Grobman Theorem, 269heat equation, 47holomorphic maps, 184homological equation, 178, 264Hopf

-Neımark-Sacker bifurcation, 270bifurcation, 27, 270

Hopf-Landau-Lifschitz-Ruelle-Takens routeto turbulence, 281

Horseshoe map, 158, 163hyperbolicity, 143, 277

integer affine structure, 257

KAM Theory, 257conservative, 258dissipative, 260global, 259

Kantz-Diks test, 237Kolmogorov non-degeneracy, 259Kolmogorov-Arnold-Moser Theory, 173

Lebesgue measure, 254Logistic system, 32, 100Lorenz

attractor, 50, 151, 292system, 49, 152, 290

Lyapunov exponents, 235

Maximal Rank Theorem, 249measure class, 255Moser’s Twist Map Theorem, 182multi-periodic, 257

dynamics, 260evolution, 73subsystem, 82

Navier Stokes equation, 284Newton algorithm, 38Newtonian iteration, 261

open and dense subset, 245orientation, 251

pendulum, 2, 110, 111, 113, 183, 278,279

period doublingbifurcation, 37, 269sequence, 37, 279

periodic evolution, 68persistence, 109, 272

in the sense of probability, 111of fixed points, 117of periodic evolutions, 119of quasi-periodic evolutions, 174, 183,

186of stationary evolutions, 115under variation of initial state, 110under variation of parameters, 113,

183weak, 112, 181, 183

Persistence Theorem of Hyperbolic At-tractors, 149

Poincare map, 23population dynamics, 34positively compact, 95, 134predictability, 231

of chaotic evolutions, 90of multi-periodic evolutions, 85of periodic evolutions, 68of quasi-periodic evolutions, 85

prediction, 85error, 87, 91, 96interval, 86, 96

quasi-periodic, 257(family of) attractors, 261evolution, 76Hopf bifurcation, 275saddle-center bifurcation, 275subsystem, 82

Rossler

312 INDEX

attractor, 52system, 52

read out function, 205reconstruction measure, 216Reconstruction Theorem, 206relative frequencies, 217residual set, 186, 202, 245resonance, 78

‘bubble’, 277gap, 277tongue, 271

resonantevolution, 78subtori, 105

restriction, 19in the narrower sense, 20in the wider sense, 20

saddle-center bifurcation, 273saddle-node bifurcation, 115, 135, 269Sard Theorem, 249separatrix, 31small divisors, 178solenoid, 139, 145stable equivalence, 146stable foliation, 147state space, 15stationary evolution, 68structural stability, 123, 163, 272, 273suspension, 21symbolic dynamics, 55, 125, 156, 159

Thom map, 101time series, 206translation system, 75transversality, 250Transversality Theorem, 250typical, 109, 183

Van der Pol system, 25, 78, 113, 173,261

wave equation, 46

henk broer and floris takens january 28, 2008broer/pdf/bt.pdf · nomical evolutions to population...

Documents