
The opacity of Nature
Hard-to-explain phenomena within well-known theories and the limits of science

Cyrille Imbert


Contents

General introduction

I  Explanations: the more detailed, the more explanatory?

1  Introduction: which deductions are explanatory?

2  The problem of relevance in the discussions about explanation: from Hempel's model shortcomings to the causal model
   2.1  Hempel's DN model and its shortcomings
   2.2  Causal theories of explanation
   2.3  From explanatory sketches to Railton's ideal explanatory text
   2.4  Summary of the problems that have been met

3  Non-relevant causal details in asymptotic explanations: Batterman against the Devil
   3.1  Sierpinski triangle and the emergence of fractal patterns
   3.2  Scope of Batterman's claims
   3.3  Are there other cases where causal details should be put aside?

4  Explanation, or the identification and selection of relevant facts
   4.1  The law of areas and its explanations
   4.2  Two existing attempts to solve the problem of explanatory relevance
        4.2.1  The SR model and its (informative) failure
        4.2.2  Searching for maximal explanatory classes: from statistics to arguments
        4.2.3  The requirement of exhaustiveness about nomological statements (Reichenbach)
   4.3  Sketch of a model of explanation based on relevance
        4.3.1  Explaining: a logically second activity
        4.3.2  If theories were axiomatized... an example taken from mathematics
        4.3.3  Formulation of a syntactic criterion of relevance
        4.3.4  Formulation of a semantic criterion of relevance
        4.3.5  When theories are not axiomatized: explanation and relevance in usual theoretical settings
               4.3.5.1  Why a criterion is required for cases in which theories are not axiomatized
               4.3.5.2  Criterion of relevance for non-axiomatized theories
               4.3.5.3  Final remarks about the criterion of relevance
   4.4  Concluding remarks
        4.4.1  The two faces of the hexed salt: identifying theories and identifying relevant explanatory facts
        4.4.2  Fundamental nomological regularities and exhaustive derived regularities
        4.4.3  On Hempel's shoulders we stand

5  Conclusion: good explanations deduce nothing too much

II  The intrinsic opacity of phenomena

6  Where are the boundaries of science?
   6.1  The boundaries of what can be known in practice
   6.2  Explanation and understanding: two different notions
        6.2.1  Explanation and understanding: different nature, authors, and possessors
        6.2.2  Different computational costs
        6.2.3  Opacity, or the cost of "primary understanding"
   6.3  How to measure the cost of primary understanding?
        6.3.1  Identifying potential explanations
        6.3.2  Predicting versus explaining
        6.3.3  Producing a proof versus verifying it
   6.4  Opacity, a notion first relative to individuals and what they know
        6.4.1  What does primary opacity depend on?
        6.4.2  The different boundaries of science
   6.5  Does opacity have an intrinsic component?
   6.6  Description of the following chapters

7  From physical models to computational complexity theory
   7.1  Complexity theory and the cost of computations
        7.1.1  Turing machines and the study of computational resources (time and space)
        7.1.2  Basic notions from computational complexity theory
   7.2  The complexity of physical problems
        7.2.1  The well-defined complexity of some physical models
               7.2.1.1  The computational complexity of "direct simulations" in fluid dynamics
               7.2.1.2  The computational complexity of some discrete models in statistical physics
        7.2.2  Relativity to models: is it troublesome?
        7.2.3  Modeling can't do miracles
   7.3  Appendix
        7.3.1  Why study time complexity?
        7.3.2  Invariance and robustness results for time complexity

8  Problem complexity, sub-problem complexity and instance hardness
   8.1  From problem complexity to instance hardness: a path paved with pitfalls
        8.1.1  Problem "holistic" complexity: a needed detour
        8.1.2  Troubles with "holistic" measures and complexity paralogisms
        8.1.3  The logic of problem complexity measures
        8.1.4  Easy sub-problems within NP-complete problems
        8.1.5  Which complexity measure: average, generic or worst-case complexity?
   8.2  NP-complete but almost always easy: where are the really hard problems?
        8.2.1  Hard instances in the phase transition region
        8.2.2  First-order phase transitions and hard problems: results about (2+p)-SAT
        8.2.3  Overview of the statistical study of (2+p)-SAT
   8.3  Conclusion: problem complexity underdetermines instance hardness
   8.4  Appendix: first-order transition and replica method in (2+p)-SAT

9  Resorting to partitions: successes and failures
   9.1  Striving after efficiency and generality
        9.1.1  What can we infer from NP-completeness?
        9.1.2  Particular physical situations, individual instances and the search for generality in science
   9.2  An analogous problem: determining the probability of singular events
   9.3  Measuring the hardness of instances by partitioning problems
        9.3.1  Computational relevance: the example of sparse matrices
        9.3.2  From statistical relevance to computational relevance
               9.3.2.1  The requirement of K-homogeneity
               9.3.2.2  Objective, epistemic and practical K-homogeneity
   9.4  Complexity cores and the quest for objectively homogeneous problems
        9.4.1  Definition of complexity cores
        9.4.2  The hardness of problems is not merely a "collective" effect: existence of gradual complexity cores
        9.4.3  Complexity cores and proof verification
        9.4.4  Hard-to-delineate complexity cores and the blurred boundaries of science
   9.5  The hardness of instances: or, the absent Holy Grail

10  Laying foundations for the notion of hard instance: instance complexity, or the two dimensions of complexity
    10.1  The main idea: taking into account the size of algorithms and of proof systems
    10.2  Instance complexity: semi-formal presentation
          10.2.1  Bounded and unbounded instance complexity and Kolmogorov complexity
          10.2.2  How to measure the hardness of instances?
    10.3  Instance complexity: formal definition and significant results
          10.3.1  Definition and invariance results
          10.3.2  Instance complexity and complexity cores
          10.3.3  How to characterize hard instances?
          10.3.4  The growth of instance complexity for problems not belonging to P
          10.3.5  Which problems do have hard instances?
                  10.3.5.1  The logic of the notion of hard instance
                  10.3.5.2  p-hard instances within NP-complete problems
          10.3.6  Instance complexity, Kolmogorov complexity and the hardness of problems
    10.4  Instance complexity and opacity
          10.4.1  Instance complexity and proof verification
          10.4.2  The opacity of instance complexity
    10.5  Conclusion about instance complexity

11  Epilogue: to what extent is Nature opaque?

12  General conclusion
    12.1  The end of human-scale epistemology
    12.2  Relativity and non-relativity to models and theories
    12.3  Models, models, models (and why this is so)

Bibliography


General introduction. Hard-to-explain phenomena and the in-practice limits of our science

<For this chapter, the summary does not follow the details of the original introduction, which was written for philosophers who have no background in philosophy of science.>

Natural phenomena seem to be more or less hard to explain. Part of the difficulty is no doubt due to the fact that, in many cases, it is difficult to make the right hypotheses about the laws or mechanisms that are at play, and this is what inference to the best explanation is about: finding how to make the right explanatory hypothesis out of the various clues that can be collected. At the same time, even when the right hypotheses have been made, we are sometimes in a situation in which we do not possess an explanation of the phenomena under study. We know the mechanisms or laws that rule the target system; we know which particular circumstances obtained; we know what the fact to be explained is; and still, we do not manage to produce an explanation and to understand how the explanandum fact was made to occur. This state of affairs can be illustrated by the following list of increasingly hard-to-answer why-questions:

i) Why do most stars trace arcs of circles in the sky every night?

Answering this question is elementary as soon as one knows Copernican astronomy, and the explanation can easily be built by a human mind.

ii) Why do finite simple groups always belong to the same types1?

Answering this question required several groups totalling hundreds of scientists, as well as computers, and the proof was 15,000 pages long. The person who had the best insight into and overview of the proof, Daniel Gorenstein, died in 1992. So we have an example where it is possible to produce an explanation, but a single human mind can no longer produce it alone or understand it completely.

iii) Why is the weather so changeable in Paris today?

Answering this question takes big computers simulating the evolution of the weather. This time, it is totally impossible for a human mind to follow the details of the deduction, even if we know the inner workings of the fluids that are simulated by the computer.

1This example is taken from mathematics because I found no such example in physics that is as easy to describe.


iv) Why did the golf ball go right into the hole on this windy day?

Here, we can of course answer that Tiger Woods hit the ball, which was deflected by the wind into the branches of a tree, which altered its trajectory towards the hole. So we clearly have a rough understanding of the mechanisms at work, and we can even give a sketch of the causal processes that led to this event; still, even with computers, we are not able to produce a complete and sound explanation of the ball's behaviour and to check that we can really explain what happened.

v) Why did lottery ball number 17 come out of the machine first?

The situation here is the same as with the golf ball, except that we can no longer give a sketch of the causal processes that led to the event.

vi) Why did a tornado suddenly form on August 14th, 1773 near the Merrimack River in the south of Salisbury (Massachusetts)?

In this case, the situation is even worse. We know for sure what laws fluids obey, and there already exist scenarios for the formation of tornadoes of certain types; still, there are many different types of tornadoes and, in many cases, we do not really know for sure how tornadoes arise.

All these situations have in common that, in each case, we know the laws or principles that must be used in order to explain the target behaviour. In spite of this, some cases are easy to explain whereas others are extremely hard. To put it briefly, having in-principle knowledge of how a system evolves is not sufficient to correctly explain its behaviour. At the same time, the fact that we have trouble explaining these behaviours does not mean that they are intrinsically hard to explain. After all, in some cases, we spend much work and sweat to explain some phenomenon until the happy day when a brilliant scientist finds a much quicker explanation of it, for example by identifying a new analytical solution to a problem that was previously considered extremely hard.

In this work, my aim is to show that the difficulties that are met in explaining natural phenomena sometimes originate in the complexity of the phenomena and not only in our epistemic shortcomings. By doing this, I aim at describing how one can hope to extend the boundaries of science by developing the abilities of our computers. The fact that our science is finite and limited is an old philosophical idea. The fact that it can be significantly extended by using computers is a scientifically recent situation; but how much can it be extended exactly? My aim is precisely to show that there are intrinsic constraints on how far we can extend the boundaries of what we know.

In order to carry out this objective, I have divided the labour into two parts:

1. First, I ask what the complete and optimal explanation of a fact is. The worry that I have in mind is that it could perhaps be possible to increase indefinitely the quality of explanations of particular facts by adding more details into the picture. In this case, the complexity of the task would not be bounded and it would be difficult to compare how hard it is to explain different facts. The conclusion of this part is that, to produce a complete and optimal explanation of a fact, no such inflation of the explanatory task is to be dreaded: being able to deduce the statement to be explained from well-chosen first principles provides satisfactory explanations, and the optimal explanation is a deduction of this type.


2. In Part II, I ask whether it is possible to measure the complexity of the production of an explanation. I try to show that particular explanations do have an intrinsic and irreducible cost, which cannot be arbitrarily decreased (even if this cost can sometimes be almost null).

<The end of the introduction of the original document argues that, if one manages to show that some explanations are intrinsically hard to produce, then we have a good argument to consider that we must build an epistemology in which the human mind is no longer the touchstone of every bit of knowledge (which is already a de facto situation).>


Part I

Explanations: the more detailed, the more explanatory?


Chapter 1

Introduction: which deductions are explanatory?

In this introductory chapter, I briefly review arguments indicating that not all deductions of statements describing the explanandum facts can be considered explanatory. (These traditional arguments are presented at length in the next chapter.) This can be illustrated by the following two cases:

• Typically, one is reluctant to consider that a follower of Ptolemaic astronomy knows how to explain the trajectory of stars in the sky.

• Also, a sailor who is able to predict the hour of the tide can hardly be considered as possessing a good explanation of the tides.

A solution to these puzzles has been given by defenders of the causal model of explanation. They claim that, in order to give a good explanation of a fact, one must describe the causal processes that are at work in the target system.

A worry is that such a position implies a significant (and potentially unbounded1) inflation of the explanatory task, even in some cases in which we think we already possess satisfactory explanations. For example, one can be reluctant to accept that a good explanation of the pressure in an ideal gas requires describing the trajectories of the molecules (which is the scale at which causal processes can be described).

In conclusion, one needs to answer the following questions: "What is an optimal explanation of a fact?" "Are the explanations that describe the causal details as precisely as possible the best ones?"

1Because there are potentially no limits to how deep we can go into causal details.


Chapter 2

The problem of relevance in the discussions about explanation: from Hempel's model shortcomings to the causal model

This chapter is devoted to a review of the problem of explanatory relevance. Since the material is well known to philosophers of science, I shall be very brief.

2.1 Hempel’s DN model and its shortcomings

To the giants
On whose shoulders we stand.1

In this section, I first present Hempel's DN model2 and some of the famous counterexamples to the model, such as the dissolution of the hexed salt (the explanation says this is because "all hexed samples of salt dissolve when plunged in water") and John Jones's failure to get pregnant (the explanation says that this is because he takes his wife's pill).

2.2 Causal theories of explanation

In this section, I first present the causal model of explanation, which answers the previous counterexamples by saying that explanatorily relevant relations are causal relations, and causal theorists have worked hard to give sound definitions of causal processes.

In a second step, I discuss criticisms that have been made against the way the causal model solves the problem of relevance in explanations.

1"Dicebat Bernardus Carnotensis nos esse quasi nanos, gigantium humeris insidentes, ut possimus plura eis et remotiora videre, non utique proprii visus acumine, aut eminentia corporis, sed quia in altum subvehimur et extollimur magnitudine gigantea." (John of Salisbury, Metalogicon, 1159)
2I shall not discuss the problem of statistical explanation in the next chapters.

Hitchcock [32] has in particular made one of the deepest criticisms of the causal model. His criticism can be illustrated by the following example. When two billiard balls collide, different causal processes can be described in this spatio-temporal region, such as the causal processes involving the transfer of momentum and those involving the conservation of the quantity of blue chalk. All these processes are intertwined. But the conservation of the chalk quantity is not relevant to the trajectory of the balls, so citing local causal processes is not sufficient.

To quote Woodward, giving a more general version of Hitchcock's criticism:

"A more general way of putting the problem revealed by these examples is that those features of a process P in virtue of which it qualifies as a causal process (ability to transmit mark M) may not be the features of P that are causally or explanatorily relevant to the outcome E that we want to explain (M may be irrelevant to E with some other property R of P being the property which is causally relevant to E). So while mark transmission may well be a criterion that correctly distinguishes between causal processes and pseudo-processes, it does not, as it stands, provide the resources for distinguishing those features or properties of a causal process that are causally or explanatorily relevant to an outcome and those features that are irrelevant."

I completely agree with this criticism. However, little seems to be at stake here: after all, Hitchcock seems to be still granting that citing causal processes is necessary in explanations and that only an additional condition is missing so as to select the right causal processes.

In the next chapters, I shall argue that the problem is actually much deeper. I shall first rely on Batterman's analyses and then on an example of my own about the explanation of the law of areas.

2.3 From explanatory sketches to Railton's ideal explanatory text

The position of causal theorists3 goes with an explanatory ideal, which has been described by Railton and which Salmon has enthusiastically approved. Explaining a fact is showing how it is inserted in the web of causal processes and, in this perspective, causal information about this web of causal processes, even the tiniest bit, is welcome. In this perspective, explaining that the pressure is P = p in an ideal gas can be done by relying on the ideal gas law, but this explanation must be considered a very incomplete one.

More precisely, the notion of ideal explanatory text is defined by Railton like this:

"An ideal text for the explanation of the outcome of a causal process would look something like this: an inter-connected series of law-based accounts of all the nodes and links in the causal network culminating in the explanandum, complete with a fully detailed description of the causal mechanisms involved and theoretical derivations of all the covering laws involved. This full-blown causal account would extend, via various relations of reduction and supervenience, to all levels of analysis, i.e. the ideal text would be closed under relations of causal dependence, reduction and supervenience. It would be the whole story concerning why the explanandum occurred, relative to a correct theory of the lawful dependencies of the world." [57, p.241]

3In the original document, I also argue that the causal theorist can fruitfully use the notion of explanatory sketch, originally developed by Hempel, to argue that what we often consider to be good explanations are in fact explanatory sketches that select some partial bits of causal information.

Of course, such an ideal text can never be produced by scientists: the explanations that we usually give are bits of this explanatory text.

"Needless to say, even if we did possess the ability to fill out arbitrarily extensive bits of ideal explanatory texts, and in this sense thoroughly understood the phenomena in question, we would not always find it appropriate to provide even a moderate portion of the relevant ideal texts in response to particular why-questions. On the contrary, we would tailor the explanatory information provided in a given context to the needs of that context; if we had the capacity to supply arbitrarily large amounts of explanatory information, there would be no need to flaunt it." [57, p.244]

So in the end, the content of a particular explanation is determined pragmatically, for example by considering the epistemic deficits of the explanation-seeker.

Strangely enough, a causal theorist like Railton and a pragmatist like van Fraassen agree here, since they both consider that what is relevant to answer a particular why-question depends on the context and is to be determined by pragmatic motives. That is why, for van Fraassen, an Omniscient Being would never ask "why": because of its lack of particular interests and desires, it would never have reasons to select particular details of the complete explanation.

"It is sometimes said that an Omniscient Being would have a complete explanation, whereas these contextual factors only bespeak our limitations due to which we can only grasp one part or aspect of the complete explanation at any given time [...]. But this is a mistake. If the Omniscient Being has no specific interests (legal, medical, economic; or just an interest in optics or thermodynamics rather than chemistry) and does not abstract (so that he never thinks of Caesar's death qua multiple stabbing, or qua assassination), then no why-questions ever arise for him in any way at all [...] If he does have interests, and does abstract from individual peculiarities in his thinking about the world, then his why-questions are as essentially context-dependent as ours." [69, p.130]

This is a point that I disagree with. There is no doubt that, when answering a particular question, the answer should be adapted to the explanation-seeker: pragmatism can describe explanation just like other activities. Still, I argue that there is much more to relevance than this: in the next chapters, I claim that which statements are relevant to which other explanandum statements can be determined on logical grounds. An Omniscient Being that was able to produce Railton's ideal explanatory text would still be only midway through its explanatory task, because it would still need to determine which precise facts are relevant to the fact to be explained.


2.4 Summary of the problems that have been met

In this last section, I summarize all the problems that have been met in the previous review. Since I have gone quickly through this review in the present document, I shall not present this summary here.

To characterize negatively what I shall argue for in the next chapters, one last quote is helpful.

"For instrumentalists there are no scientific explanations. Science is acknowledged to have a number of virtues, but none of them is associated with the production of a better understanding of what goes on in the world. [...] The notion of explanation we want to explicate is one whose correct applicability to a scientific argument is denied by instrumentalists. An alleged theory of explanation (i.e., an explication of explanation) that elucidates the concept of explanation in such a way that the instrumentalist may consistently agree that science contains explanations in the explicated sense, will not be an explication of any of the explicanda described here." [17, p.61–62]

Once again, I disagree with the author. For instrumentalists, there are good scientific explanations indeed4. Finding the true theories is a must if one is to produce true explanations; for lack of true theories, one needs to select the best theories that one can invent, that is to say theories that are at least empirically adequate; but this is only one half of the work. Once a theory has been chosen, be it true or false, and provided this theory has the resources to account for the phenomenon to be explained, an explanatory task remains to be done, namely to select within the theory the statements that are relevant to the statement to be explained.

4But not in the sense that Duhem gave to this notion.


Chapter 3

Non-relevant causal details in asymptotic explanations: Batterman against the Devil

In this short chapter, I review Batterman's position about what he calls "asymptotic explanation" in his book The Devil in the Details [5]. Batterman claims that in such explanations, in which the asymptotic behaviour of a system is investigated, the causal details are irrelevant for explanatory purposes. I first present in section 3.1 examples taken from Batterman's book. In section 3.2, I emphasize the scope of his claims. Finally, I argue in section 3.3 that Batterman's arguments leave the concept of relevance under-analysed.

3.1 Sierpinski triangle and the emergence of fractal patterns

Figure 3.1.1 shows what is called the "Sierpinski triangle". This pattern obtains when one plays the "chaos game": one marks off three vertices of a triangle, labelled A, B, C, chooses a starting point in the triangle, and then, rolling a die, makes a mark halfway between the current point and vertex A (resp. B, C) when one gets 1 or 2 (resp. 3 or 4, or 5 or 6). By repeatedly doing this, one finally gets a fractal pattern with dimension roughly 1.58.
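To make the procedure concrete, here is a minimal sketch of the chaos game in Python (not from the original document; the vertex coordinates, starting point and number of iterations are arbitrary choices):

```python
import random

# Vertices of the triangle (arbitrary coordinates).
VERTICES = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.866)]

def chaos_game(n_points=50000, seed=0):
    """Repeatedly move halfway towards a randomly chosen vertex
    (die roll 1-2 -> A, 3-4 -> B, 5-6 -> C) and record each point."""
    random.seed(seed)
    x, y = 0.3, 0.3  # arbitrary starting point inside the triangle
    points = []
    for _ in range(n_points):
        roll = random.randint(1, 6)
        vx, vy = VERTICES[(roll - 1) // 2]
        x, y = (x + vx) / 2, (y + vy) / 2
        points.append((x, y))
    return points

if __name__ == "__main__":
    pts = chaos_game()
    # Plotted, the cloud of points approximates the Sierpinski triangle,
    # whose fractal (box-counting) dimension is log 3 / log 2, about 1.58.
    print(len(pts), "points generated; first three:", pts[:3])
```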

Batterman emphasizes that two types of question must be distinguished: type (i) questions ask for the explanation of why a given instance of a pattern obtains; type (ii) questions ask why, in general, patterns of a given type can be expected to obtain.

Batterman argues that, to answer a type (ii) question, the causal details are irrelevant. Three points need to be stressed:

1. Such explanations seem to crucially involve asymptotic analysis and therefore to be about some asymptotic behaviour.

2. Structural stability plays a crucial role in the analysis.

3. The explanatory deduction shows that there is a probability equal to 1 that a pattern of the given type obtains in each case (even if it is not impossible to obtain a pattern of a different type).


Figure 3.1.1: Sierpinski triangle

Batterman emphasizes that the particular causal details do not play any part in providing answers to type (ii) questions, contrary to what must be done for type (i) questions, for which one needs to explain why this or that instance of the pattern obtains and for which causal details are crucial. Batterman calls "asymptotic explanations" the answers to questions of type (ii), which are characterized by the three features above.

3.2 Scope of Batterman's claims

In this section, I give more examples of Batterman's claims and emphasize that asymptotic explanations can be found in different fields. Further, it is important to note that, because they correspond to structurally stable behaviour, they provide explanations for a whole type of phenomena and therefore have a wide scope.

3.3 Are there other cases where causal details should be put aside?

I raise in this section different questions that are left unanswered by Batterman's analysis.

The devil in the details: only for asymptotic behaviours?

As we have just seen, identifying the details of the causal processes cannot be an explanatory ideal in cases where "principled reasons" indicate why these processes are not relevant. All the situations identified by Batterman are about the explanation of asymptotic behaviour where structural stability is at play. As a consequence, the causal theorist of explanation can argue back that Batterman has put his finger on a very special type of explanation, which is only about asymptotic behaviours. So one may ask whether there are other situations, not involving asymptotics, where the causal details are not explanatorily relevant. I answer "yes" to this question in the next chapter and present an example to this effect in section 4.1 (the explanation of the law of areas).


The devil in the details: only for statistical regularities?

Batterman's examples are also about situations in which a statistical regularity obtains with probability 1. This raises two more questions. Can the causal details be explanatorily irrelevant in cases where one is interested in singular events? Can the causal details be explanatorily irrelevant in cases where one is interested in deterministic events? I also answer "yes" to these two questions in the next chapter and illustrate this again with the explanation of the law of areas.

The relevance requirement: only for "asymptotic explanations"?

Batterman argues that the traditional theories of explanation cannot account for the examples that he gives, in which the causal details are not explanatorily relevant. That is why he coins a new concept of explanation, which he calls "asymptotic explanation". Still, his description of what asymptotic explanations are remains underspecified. In particular, he does not analyse the concept of relevance in detail; in consequence, what counts as explanatorily relevant is not clear. Further, since he does not give a precise logical description of "asymptotic explanation" and does not seem to explicitly subscribe to any model of explanation in particular, one does not know:

• what asymptotic explanations share with other explanations;

• what should count as "relevant" and should belong to an asymptotic explanation in each case: Batterman only presents in his examples the "principled physical reasons" (structural stability, asymptotic analysis, renormalization arguments) that make the causal details irrelevant. So each example is convincing in itself, but a general account showing what precisely is common to all these cases is missing, as well as a formulation of a relevance criterion which would be shown to be fulfilled in each case on the basis of the "principled reasons";

• whether the (implicit) relevance requirement that is used in asymptotic explanations is a specification of a more general criterion of relevance or whether this relevance requirement is specific to asymptotic explanations.

I do not know what Batterman would answer to these (gentle and mild) criticisms. In the next chapters, I shall argue that the relevance requirement must be fulfilled by all explanations and I shall try to define precisely the notion of relevance. In any case, Batterman's examples and analyses constitute a significant step forward, because they emphasize the significance of the problem of relevance and rejuvenate the debate; further, his examples badly catch out the causal model of explanation and are serious arguments against it, in particular because they show that the "philosophical ideal" of causal theorists is not in agreement with important examples of scientific explanations. In these cases, following the causalist ideal even goes against what it is to explain well.


Chapter 4

Explanation, or the identification and selection of relevant facts

In the present chapter, I try to show that the explanatory ideal of causal theorists like Salmon or Railton is not appropriate in general (and not just in the particular cases described by Batterman) and I try to present a different account that includes the requirement of explanatory relevance.

• In section 4.1, I analyse an example in which the causal details are again irrelevant. The phenomenon to be explained is not statistical and is described at the scale at which causal processes are usually described (so there is no change of scale or domain glitch, nor any emergence quandary or skeleton in the cupboard). The example involves no asymptotic analysis. The example is taken from physics, which is a battleground on which the causal model should be in a favourable position, so that a failure is more troublesome for it.

• In section 4.2, I analyse two attempts to solve the problem of relevance, one by Salmon (the SR model) and one by Salmon and Reichenbach (the requirement of exhaustiveness), and I show why these attempts, though clearly fruitful, fail to satisfy the requirement of relevance in the simple case of the explanation of the law of areas.

• Finally, in section 4.3, I propose a criterion of relevance. I give three different versions of the criterion, two for axiomatic theories (one for the syntactic notion of derivation, one for the semantic notion of logical consequence) and one for the more frequent cases in which theories are not axiomatized. I have deliberately chosen to give the first formulations of the criterion for axiomatic theories because I argue that relevance is a logical notion and that the relevance requirement can best be fulfilled in cases where the logical analysis of the content of theories can be done as precisely and finely as possible. As a consequence, for the purpose of defining relevance rigorously and illustrating it with clear-cut cases, supposing in a first step that one works within an axiomatic framework makes things easier; but I do not subscribe to an axiomatic or linguistic view of theories.


4.1 The law of areas and its explanations

I have deliberately chosen an example taken from physics (a field where the causal model should succeed), where no asymptotics are involved, where the system is deterministic and where the behaviour to be explained is not a statistical regularity. Finally, I have chosen an example as simple as I could find, in order to show that the requirement of relevance must be fulfilled even in elementary cases.

The law of areas, also called "Kepler's second law", states that, for the planets of our solar system, "a line joining a planet and the Sun sweeps out equal areas during equal intervals of time." So originally, what is now called a law is just a statement about particular objects like the Earth or Mars. It is only with Newton's explanation that the fully general statement about systems of two bodies can be made safely.

Figure 4.1.1: Kepler's second law. A, B, C, etc. correspond to different positions of the planet at regular time intervals.

Since my goal is not to compare different explanations relying on different theories, I shall only consider explanations that are built from classical mechanics. Bodies obey Newton's second law ("FPD" for short), $\sum \vec{F} = m\vec{a}$, and the forces are gravitational, with intensity equal to $F = G \frac{M_1 M_2}{R^2}$.

Explanations must then be elaborated on the basis of what the theory indicates as fundamental statements, that is to say, descriptions of forces, positions, masses, laws, etc. The whole problem is to select, for each explanation, the right material among these fundamental statements.

Suppose that one wants to explain that the trajectory of the Earth is such that the line joining the Earth and the Sun sweeps out equal areas during equal intervals of time. Different explanations of this fact (or of the corresponding statement $K_T$) can be built in this Newtonian setting.


Figure 4.1.2: Geometric demonstration of the law of areas by Newton (Principia [49, §I, 2, 1]).

• A laborious explanation

A laborious deduction of $K_T$ can be made by integrating the FPD, calculating the trajectory of the Earth and, in a second step, the swept area. The result is $dA_T/dt = C/2$, with $C = r_0^2 \dot{\theta}_0$ ($r_0$ being the initial distance between the Earth and the Sun and $\dot{\theta}_0$ the initial angular speed).

Note that this is a sound deduction, in perfect agreement with the DN model of explanation. Causalists can be happy too, since the FPD, which describes how an invariant quantity, the momentum, is transferred, is heavily relied upon. Since the calculation is carried out symbolically, it also shows that the law of areas obtains for a wide variety of initial conditions.
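As a rough illustration of this laborious route (a sketch added to this summary, not part of the original document), one can integrate the FPD numerically for an inverse-square central force and check that the areal velocity stays constant; the value of GM, the time step and the initial conditions below are arbitrary:

```python
import math

GM = 1.0     # arbitrary gravitational parameter (G times the central mass)
dt = 1e-4    # integration time step

def accel(x, y):
    """Inverse-square central acceleration directed towards the origin."""
    r = math.hypot(x, y)
    return -GM * x / r**3, -GM * y / r**3

# Arbitrary initial position and velocity (an eccentric orbit).
x, y, vx, vy = 1.0, 0.0, 0.0, 0.8

areal_velocities = []
for step in range(200000):
    # Kick-drift-kick (velocity Verlet) update.
    ax, ay = accel(x, y)
    vx, vy = vx + 0.5 * dt * ax, vy + 0.5 * dt * ay
    x, y = x + dt * vx, y + dt * vy
    ax, ay = accel(x, y)
    vx, vy = vx + 0.5 * dt * ax, vy + 0.5 * dt * ay
    if step % 20000 == 0:
        # dA/dt = |x*vy - y*vx| / 2 should remain (nearly) constant.
        areal_velocities.append(abs(x * vy - y * vx) / 2)

print(areal_velocities)  # approximately equal values: the law of areas
```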

� Newton’s better explanation

Newton’s explanation relies on the following premisses : i) one single central force isexerted regularly, whatever its intensity ; ii) the action of a force is along the directionof the force — whatever the exact form of the FPD ; motions compound.

The demonstration goes like this (see figure 4.1.2). Because of the inertia principle, the Earth would go to c if no force were exerted, and AB = Bc. Therefore, the areas of SAB and SBc are the same: $A_{SAB} = A_{SBc}$.

If the Earth were initially motionless, it would reach V because of the central force. By transporting $\vec{BV}$ to c, one obtains that the areas of SBc and SBC are the same. In conclusion, $A_{SAB} = A_{SBc}$ and $A_{SBc} = A_{SBC}$, so $A_{SAB} = A_{SBC}$.

By repeating this reasoning, it is clear that the swept area is constant in equal time intervals.


Figure 4.1.3: Schema illustrating Newton's demonstration.

Why Newton’s explanation is better

Newton’s explanation is better because it relies on less premisses. In particular, itshows that the intensity of the gravitational force and the exact form of the FPD are ir-relevant. (This is something that the causal theorist should be extremely unhappy with,because the propagation of momentum depends on the details of the FPD’s formulation.So the explanation is compatible with a wide variety of causal processes.)

This shows that the law of areas obtains for different forces, provided they are central. The intensity of the force could change; its value could even be non-computable (which would make the trajectory non-computable); we could live in a different world where the FPD would be partly different, for example if it were $\alpha \, \|\sum \vec{F}\|^2 \, \vec{u}_F = m\vec{a}$,1 and the law of areas would still obtain.

A way to put this is to say that Newton's explanation is more general. I think that this is a misguided way to interpret the difference between the two explanations. Generality is for sure a welcome feature. Still, when one explains a particular statement, one hardly sees why providing an explanation that can be used for different situations should improve the quality of the explanation of this particular event.
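The claim that the law of areas obtains for any central force, whatever its intensity, can also be made explicit in the modern vector formulation (a standard derivation added to this summary, not part of the original document; it still assumes the usual form of the FPD, so it only makes precise the irrelevance of the force's intensity):

```latex
% Areal velocity of a body at position \vec{r}(t) with velocity \vec{v}(t),
% writing \vec{L} = m \, \vec{r} \times \vec{v} for its angular momentum:
\[
  \frac{dA}{dt} \;=\; \frac{1}{2}\,\|\vec{r} \times \vec{v}\|
               \;=\; \frac{\|\vec{L}\|}{2m}.
\]
% For any central force \vec{F} = f(r)\,\hat{r}, the torque vanishes:
\[
  \frac{d\vec{L}}{dt} \;=\; \vec{r} \times \vec{F}
                     \;=\; f(r)\,(\vec{r} \times \hat{r}) \;=\; \vec{0}.
\]
% Hence \vec{L}, and with it dA/dt, is constant: equal areas are swept in
% equal times, whatever the intensity profile f(r).
```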

Newton’s explanation is better because it shows that some potentially explanatoryfacts (e.g. the intensity of the force) are independent of the explanandum fact (here thelaw of areas) ; as a consequence, these facts cannot be used to explain it. By crossingout those facts, Newton and its demonstration show that the law does obtain in moresituations, where these facts are not the case. So generality is a consequence of the factthat one has provided a better explanation of the particular statement KT . Pace Kitcher,generality comes second, as an additional benefit2.

To put it differently, if one uses the first laborious explanation, even if the requirements for the DN model are clearly fulfilled, the explanation seems to indicate that the intensity of the force and the details of the FPD (which are premisses of the explanatory argument) do matter in the explanation of the law of areas, which is false. Because of these additional premisses, there is something like, to put it in Gricean vocabulary, a violation of a conversational maxim of quantity [28]3. Here the fact that the maxim of quantity is violated is not a matter of pragmatics but can be proved logically, by showing that the statements describing the intensity of the force and the law of areas are logically independent.

1The constant α is added to keep the formula dimensionally homogeneous.
2It is possible that, when one creates new theories, generality comes first; but the choice of theories is a different philosophical question.
3In short, the maxim says: "i) Make your contribution to the conversation as informative as is required. ii) Do not make your contribution any more informative than necessary."

Conclusion

This example answers some questions that had been left open by Batterman's analysis. The causal details can also be irrelevant when explaining behaviours that are non-statistical, non-asymptotic and can be observed at the physical level of the causal processes.

Further, as the example shows, identifying the relevant facts is a scientific problem, not a philosophical game, and it arises even for simple cases. What remains to be done is to formulate clearly a relevance condition that can account for this example and solve the problem of explanatory relevance.

4.2 Two existing attempts to solve the problem of explanatory relevance

In this section, I discuss two propositions, by Salmon and by Reichenbach, which have been used to try to solve the problem of explanatory relevance. Both go in the right direction but ultimately fail: one because it offers a statistical treatment, which is not powerful enough, the other because it is too specific and focuses on the conditions that laws must fulfill in general.

4.2.1 The SR model and its (informative) failure

I shall be very quick here because the SR model is well known to philosophers.

Salmon argues that, when giving explanations, one should mention only statistically relevant properties. Attribute C is statistically relevant to attribute B within class A iff $P(B \mid A.C) \neq P(B \mid A)$. In order to find a good explanation, one needs to partition the initial class until no relevant partition can be made and the partition is homogeneous.

The problem is that the class of hexed samples of salt is statistically homogeneous relative to salt dissolution. As a consequence, Salmon requires that the explanatory class be as large as possible4: if one makes a partition using "hexing", the two classes share the same probability relative to dissolution (the class of hexed samples of salt and the class of non-hexed samples).
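To make the SR criterion concrete, here is a minimal sketch (with made-up data, not taken from the original document) that checks whether "being hexed" is statistically relevant to dissolution within the class of salt samples placed in water:

```python
from itertools import product

# Made-up data: every sample of salt placed in water dissolves,
# whether or not it has been hexed (50 samples of each kind).
samples = [{"hexed": h, "dissolved": True}
           for h, _ in product([True, False], range(50))]

def prob(event, population):
    """Relative frequency of `event` (a predicate) within `population`."""
    population = list(population)
    return sum(1 for s in population if event(s)) / len(population)

p_b_given_a = prob(lambda s: s["dissolved"], samples)
p_b_given_ac = prob(lambda s: s["dissolved"],
                    (s for s in samples if s["hexed"]))

# C ("hexed") is statistically relevant to B ("dissolved") within A ("salt
# in water") iff P(B | A.C) != P(B | A); here the two frequencies coincide.
print("P(B | A)   =", p_b_given_a)
print("P(B | A.C) =", p_b_given_ac)
print("hexing statistically relevant:", p_b_given_ac != p_b_given_a)
```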

But this does not solve the problem completely. Causes are under-determined by probabilities and cannot be distinguished by this means. For example, two drugs that cure a disease with probability 1 cannot be distinguished by means of statistics, even if the two drugs work differently and should be considered as explanatorily different. This is a traditional criticism that can be found in the literature. But the situation is even worse.

4I have simplified the argument in this summary.


Suppose for the sake of the argument that it is possible to distinguish between actual causes5 by means of probabilities. Even in this case, the SR model is still insufficient. In the explanation of the law of areas, in order to show that the FPD is partly irrelevant and that, were the FPD and the world different, the law of areas would still obtain, one would need empirical statistics about worlds that do not exist. So as soon as irrelevancies concern universal laws, there is no way of showing that these laws are partly irrelevant by means of statistics coming from empirical data, because in our world some facts are always "true together", even if the corresponding statements are logically independent.

4.2.2 Searching for maximal explanatory classes: from statistics to arguments

I have great admiration for the SR model6. It clearly shows that, in order to provide good explanations, two steps are required:

1. The first requirement is to find some sufficient explanatory material. In the DN model of explanation, this requirement is fulfilled once one possesses a deductive argument relying on premises that are fundamental statements within the accepted theory, or, in the SR model, when one has identified an objectively homogeneous class.

2. Once some sufficient explanatory material is gathered up, the second requirement is to cross out the remaining irrelevant facts within this material and, by doing this, to find a maximal explanatory class of similar situations, in which similar relevant explanatory facts and a similar explanandum fact obtain. This can hardly be done within a causal model of explanation, since causal processes, to qualify as causal, must include features that can be explanatorily irrelevant and therefore unduly restrict the class of explanatorily equivalent situations (see Hitchcock's objection to the causal model in section 2.2).

In the context of the SR model, to fulfill the second requirement, one must try to find a maximal homogeneous reference class, for example by crossing out properties such as hexing. A serious problem is that, to make statistics, one needs to select an initial class of events and to check whether it can be partitioned by statistically relevant properties. But the initial class of events can also be unduly restrictive. In the case of the law of areas, since a similar explanation can be built when the force is gravitational or Coulombian, the initial reference class should include both the phenomena that are studied in mechanics and those that are studied in electromagnetism. This would require taking into account extremely varied phenomena, and reliable statistics would become extremely hard to build, if not impossible.

In the context of deductive explanatory arguments, Salmon puts his finger on the second requirement when, asking about the difference between explanations and arguments, he writes:

5I mean causal processes that can be found in this world.
6And incidentally, I am completely baffled by the fact that the very same person could be an ardent defender of the causal model of explanation afterwards, and thereby lose the great benefits of the SR model.


Question: Why are irrelevancies harmless to arguments but fatal to explanations?

Salmon's question can be illustrated by the following example.

All human beings are mortal.
Socrates is a human being.
Xanthippe is a human being.
Socrates is mortal.

This argument, though strange and inelegant, is completely valid. But, as Salmon writes, the rooster who explains the rising of the sun on the basis of his regular crowing is guilty of more than a minor logical inelegancy [62]. So what condition about relevance must be added to deductive accounts of explanation to make them satisfactory?

4.2.3 The requirement of exhaustiveness about nomological statements (Reichenbach)

As Salmon noted in 1977, advocates of the DN model of explanation have not tried hard to solve the relevance problem. This statement is probably still true in 2008. I agree with Batterman when he claims that Kitcher's model does not solve the question of relevance [5, p.30–34] for the cases that he presents. I think it also fails for the example of the explanation of the law of areas. Both the laborious explanation and Newton's explanation of the law of areas are needed in our unified science. The laborious derivation proves in addition that the swept area satisfies $dA_T/dt = C/2$, with $C = r_0^2 \dot{\theta}_0$, and it is needed for this purpose (this is something that Newton's explanation does not prove). Newton's explanation is needed because it explains a wider regularity (the fully general law of areas). So if one is interested in explaining an instance of this latter regularity, or any regularity that is also encompassed by the laborious derivation, two argument schemas are available in Kitcher's unified science, and Kitcher's account cannot select the better explanation between the two arguments.

One final attempt has been made, and it was developed by ... Salmon again, who elaborates on a discussion about laws which is originally to be found in [58]. Reichenbach tries to give an additional condition that must be fulfilled by true universal nomological statements in order to count as laws. Laws must be, according to Reichenbach, completely exhaustive. To determine whether a statement is exhaustive, it is first required to put it in prenex normal form7. Then, one drops the quantifiers and expands the quantifier-free part into disjunctive normal form. For example, a → (b → c) becomes:

(a ∧ b ∧ c) ∨ (a ∧ ¬b ∧ c) ∨ (a ∧ ¬b ∧ ¬c) ∨ (¬a ∧ b ∧ c) ∨ (¬a ∧ b ∧ ¬c) ∨ (¬a ∧ ¬b ∧ c) ∨ (¬a ∧ ¬b ∧ ¬c)

A residual statement is obtained when some of the disjuncts are dropped and the quantifiers are added again. A statement is exhaustive if none of its residuals is true. Roughly said, a statement is exhaustive if all the cases about which it gives constraints are indispensable to its truth.

7That is to say, to write it as a string of quantifiers followed by a quantifier-free part.

For example, "all animal with have a hear also have bladder" is not exhaustive. Whenexpanded, the statement look like this

∀x[(Hx ∧ Bx) ∨ (¬Hx ∧ Bx) ∨ (¬Hx ∧ ¬Bx)].

Since animals with no heart do not have a bladder either, the residual

∀x[(Hx ∧ Bx) ∨ (¬Hx ∧ ¬Bx)]

is also true. So the original statement was not exhaustive, because the disjunct (¬Hx ∧ Bx) ("no heart and a bladder") never happens and is "useless" in the original expanded statement.

Examples like the hexed salt can be analysed along the same lines. The nomological statement "all hexed samples of salt dissolve in water" is not exhaustive because one of the disjuncts ("the salt is not hexed and does not dissolve") can be dropped without the nomological statement becoming false, since it corresponds to a situation that never happens.

To sum up, laws that are not exhaustive are unduly specific because they tolerate too many cases (and can therefore be falsified in fewer cases). If one calls the "domain" of a law the set of cases in which its premisses are true8, one can say that the domain of a non-exhaustive law is unnecessarily restricted. To be exhaustive, a law must have a domain that is as wide as possible.
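To make the exhaustiveness test concrete, here is a minimal sketch of my own (not part of the original argument): statements are represented as Boolean-valued functions over truth-value assignments, the actual "cases" are given as a finite list of such assignments, and the function names (dnf_disjuncts, is_exhaustive) are assumptions of the sketch.

from itertools import product

def dnf_disjuncts(statement, atoms):
    """Disjuncts of the full disjunctive normal form: one complete truth-value
    assignment for each way of making the statement true."""
    return [dict(zip(atoms, values))
            for values in product([True, False], repeat=len(atoms))
            if statement(dict(zip(atoms, values)))]

def is_exhaustive(statement, atoms, cases):
    """Reichenbach-style exhaustiveness: the statement is true of all actual cases,
    and dropping any disjunct yields a false residual, i.e. every disjunct is
    realized by at least one actual case."""
    if not all(statement(case) for case in cases):
        return False
    return all(any(case == d for case in cases) for d in dnf_disjuncts(statement, atoms))

# "All animals with a heart (H) also have a bladder (B)":
law = lambda v: (not v["H"]) or v["B"]
actual_cases = [{"H": True, "B": True}, {"H": False, "B": False}]
print(is_exhaustive(law, ["H", "B"], actual_cases))  # False: (¬H ∧ B) is never realized

The check relies on the equivalence noted in the text: a residual becomes true exactly when the dropped disjuncts were never instantiated, so it suffices to test that every disjunct is realized by some actual case.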

Benefits and failures of the Salmon-Reichenbach proposal

The Salmon-Reichenbach proposal cannot work in all cases. It does work for the hexed salt case (or for John Jones's failure to get pregnant) because, in these cases, the regularities that are used are not exhaustive9.

Still, this proposal will not do for the explanation of the law of areas, because the laborious explanation is built upon the FPD, $\sum\vec{F} = m\vec{a}$, which is one of the best possible candidates, if any, for being a completely exhaustive statement qualifying as a law.

In conclusion, the Salmon-Reichenbach solution solves the problem of relevance only when the failure comes from a defective law (as in the hexed salt example), which is not the case in the laborious explanation of the law of areas, where genuine laws are used. So the solution to the relevance problem cannot be that deductive explanations should only use exhaustive nomological statements. But once again, the exhaustiveness requirement, like the notion of maximal homogeneous class, seems to be pointing in the right direction.

Since all the existing positions in the literature have now been exhausted and have failed, it is necessary to try to build a new one, which is what I shall do now.

8 A universal law can be about all physical objects and yet have a limited domain. The hexed salt regularity is about all objects but has an insufficiently large domain, because it says that if these objects are hexed samples of salt, etc.

9"All men who take their wife’s pill fail to get pregnant" is not exhautive either because it never happensthat a man does not take his wife’s pill and get pregnant.


4.3 Sketch of a model of explanation based on relevance

Discussions about explanation usually try to solve two questions at the same time:

• what theoretical description must be given of a particular system in order to know its inner workings and to be able to account for all features of its behaviour ?

• what is a good explanation of some particular features of this system ?

There is no doubt that the two questions are connected. In order to give a good explanation of a phenomenon, it is advisable to mention true or approximately true statements about the target system. At the same time, if one possesses a true theoretical and fundamental description of a system, one does not thereby possess a good explanation of some features characterizing its behaviour since, as we saw above with Batterman's examples or with the explanation of the law of areas, it is also required to select, within the pool of true fundamental statements composing the true theoretical and fundamental description of the system, which ones are relevant to the fact to be explained. In consequence, I argue in section 4.3.1 that what is specific to the activity of explanation is this selection of the relevant facts : since, in order to make this selection, one must first delineate the pool of statements composing the theoretical description of the system under study, I claim that explanation is a logically second activity, which is done in the setting given by the accepted theories.

For whoever wants to give a definition of relevance, this claim raises further difficulties.

i. First, what theories are is still a debated question, even if there is presently an agreement about what is called the "semantic view".

ii. Further, even if one subscribes to the semantic view and considers that a theory is a set of models, the way theories are linguistically presented is likely to matter a lot, since it is only through these formulations that one can access these theories, explore their content by logical deductions, and produce good and sound explanations of the phenomena that they cover.

iii. Also, it is clear that an appropriately formulated theory can help in scientific activity. In the discussion about relevance, I have so far talked about the selection of relevant facts within a pool of statements composing a theoretical description, and it is tempting to say that a statement of this pool is relevant to an explanandum statement E if it is absolutely required as a premisse of the explanatory deduction of E. But the number and identity of the statements of this pool, and how each one carries some information, may vary depending on the formulation of the theories, so the above suggestion does not work. At the same time, as far as relevance is concerned, a well-formulated theory helps one to identify and distinguish the different bits of logical content of the theoretical description of the system.

iv. Finally, the degree to which theories are formalized probably has a bearing upon how we manage to analyse them and to identify in each case the relevant explanatory statements for an explanandum statement E. But theories have more or less formal formulations, from rough descriptions of mechanisms in biology to axiomatized


theories. In consequence, it is difficult to give a rigorous definition of relevance that can be applied to all kinds of explanatory searches within all these kinds of theories.

To overcome these difficulties, I shall proceed in several steps and start with a philosophical idealization. First, I take a simple example, which illustrates how the requirement of relevance can be fulfilled when theories are axiomatized (section 4.3.2). In a second step (sections 4.3.3 and 4.3.4), I propose two definitions of the condition of relevance for the case where theories are axiomatized. The two definitions correspond to the two approaches to the notion of valid inference, syntactic (by considering formal derivations) and semantic (by considering semantically valid logical consequences). This treatment of the notion of relevance in an axiomatized framework aims at showing how the condition of relevance can be best fulfilled in cases where the analysis of the logical content of a theory can be done extremely precisely, so the case of axiomatized theories can provide a model and a guide for further analyses. Finally, I try to give a formulation of the criterion that is better adapted to the case where theories do not have an axiomatic formulation (section 4.3.5).

4.3.1 Explaining : a logically second activity

In this section, I argue that what is specific to the activity of explanation is to select the relevant explanatory facts. More precisely, in order to produce good explanations, one needs to complete the following stages.

1. One needs to select the best (and if possible true) theories for the domain of phenomena that is investigated and, by this means, determine what the fundamental description of the system under study is.

2. Then, it is required to show how the statements describing the fact to be explained are true within the model of the target system or, to put it differently, how these statements can be derived from the statements that present the theory and the particular circumstances in which the fact to be explained occurred. This stage also needs to be completed when one makes predictions or checks that the theory is not falsified by some phenomena. By doing this, one establishes that the theory is logically rich enough to account for the behaviour investigated.

3. Finally, one needs to select within the description of the system the statements which are explanatorily relevant to the fact to be explained. It should be noted that, in order to make this selection, it is required to have first determined a pool of potentially explanatory facts. Without this restriction, if any statement that is true within the model could be used for explanatory purposes, the explanandum statement could be a candidate for being part of the explanation of itself, which is of course not satisfactory. So the theory must also indicate which statements are to be taken as fundamental and do describe the basic properties of the system (e.g. forces, mass, velocity, etc.). Once again, the relevance analysis comes second, even if it is a specific feature of the activity of explanation.

When assessing the importance and role of these different stages, two different perspectives must be distinguished :


Figure 4.3.1: Saccheri quadrilateral. AC = BD ; ∠CAB = ∠DBA = 90˚.

• If one focuses on the importance of each stage in scientific activity and in the activity of explanation, it is clear that the last one is the least important : in order to obtain good explanations, it is first and foremost required to select good theories, the quality of which determines the quality of the explanations that can be built ; in addition, these theories should at least be rich enough to account for the phenomena investigated. So stages 1 and 2 are indispensable in scientific activity and explanation. It is only when these first two stages are correctly completed by a theory T that it is legitimate to focus on the quality of explanations, within the setting provided by T, and to focus on relevance. Having better explanations is somewhat of a scientific luxury.

• If one focuses on what is specific to the activity of explanation, the stress must be put on stage 3. Stages 1 and 2 are common to other scientific activities. In particular, when making predictions, completing stages 1 and 2 is sufficient ; but a good prediction does not necessarily make a good explanation, because a prediction need not fulfill the relevance requirement.

4.3.2 If theories were axiomatized ... an example taken from mathematics

Before proposing definitions of the condition of relevance, I start with an example taken from geometry in order to show how precisely the relevance analysis can be done when theories are axiomatized10.

The example is about the Saccheri-Legendre theorem, which says:

If the sum of the degree measures of the three angles in a triangle is less than (resp. equal to) 180˚, then it is less than (resp. equal to) 180˚ in all triangles.

Saccheri actually proved this theorem because he wanted to prove Euclid's fifth postulate by deriving contradictions from the hypothesis that, in figure 4.3.1, the sum of the angles ACD and

10 I have chosen an example from mathematics in order to make my point clearer, but I do not claim to be discussing the question of explanation in mathematics, which is a close but distinct issue raising different questions [67, 59, 31, 37, 41]. So I assume for the sake of the argument that one tries to explain the fact that, in our world, the sum of the degree measures of the angles of a triangle is equal to or less than 180˚.


CDB is more (or less) than 180˚. What he actually proved is the theorem above, which, to say it in Bolyai's words, is a theorem of absolute geometry, that is to say geometry without Euclid's fifth postulate [9, 56, 53].

As the example shows, if one wants to explain why, in a Euclidean world, the sum of the degree measures of the angles is 180˚ in all triangles, it is clearly better not to include Euclid's fifth postulate in the explanation, even if this result can be proved by a (probably cognitively easier) deduction using all the axioms of geometry, including Euclid's fifth postulate.

Since the theory that is relied upon is axiomatized, the pool of potentially explanatory statements (the axioms) is clearly delineated, so one only needs to select which axioms of this pool are relevant in order to prove the explanandum statement. Note that things are usually different in scientific practice. For example, the fundamental principle of dynamics is usually given as a single statement, even if it contains the description of quite a few distinct fundamental facts (such as the direction of the action of forces, the relation between the intensity of forces and the acceleration, etc.). Therefore, in order to find out what is relevant to the law of areas within this principle, one must analyze it and select the part of its logical content that is relevant to the explanandum fact. The advantage of using an axiomatized theory is that one part of the logical analysis has already been done and all the atomic statements, describing atomic facts, are already identified.

The example also shows that demonstrations play a crucial role in finding out the facts that are not relevant to the explanandum fact. Given a set of independent statements, the irrelevant statements are the ones that can be dispensed with among the premises that are used in order to deduce the explanandum statement.

4.3.3 Formulation of a syntactic criterion of relevance

Drawing from the example, it is possible to offer a first definition of the criterion of relevance. I suppose that the theory that is used is axiomatized and that its axioms are independent.

Let us start with preliminary definitions.

Definition 4.3.1. Syntactic sufficiency
A set A = {A1 . . . An} of axioms of an axiomatized theory T is syntactically sufficient for an explanandum fact E if it is possible to derive from the axioms of A a formula ϕE whose interpretation "corresponds"11 to E, that is to say if A1 . . . An ⊢ ϕE.

Definition 4.3.2. Syntactic minimality
A set A = {A1 . . . An} of axioms of an axiomatized theory T is syntactically minimal for a fact E if A is syntactically sufficient for E and no proper12 subset of A is syntactically sufficient.

11 I am a little embarrassed here. To determine how to interpret the statements, one would need to make a philosophical commitment to what type of item is explained (facts, statements, phenomena) and to how theories and models represent phenomena. One usually considers that the interpretation of ϕE is a mathematical structure and that this mathematical structure represents some model of data, or directly represents the objects in the world as well as their properties. Since my point does not depend on such commitments, I have used a rather vague expression here. To make things simpler, I shall proceed as if the interpretation of ϕE is the explanandum fact E.

12 A proper subset of a set A necessarily excludes at least one member of A.


A syntactic definition of relevance can now be proposed.

Definition 4.3.3. Let T be an axiomatized theory, E the explanandum fact, and ϕE a formula the interpretation of which is E. Let D be a derivation of ϕE from the axioms of B = {b1 . . . bj}, which belong to the axioms of T.

Syntactic criterion of relevance. Derivation D and the set B = {b1 . . . bj} of axioms fulfill the requirement of syntactic relevance in the setting provided by T if B is syntactically minimal for E.
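As an illustration of how these definitions operate, here is a sketch of my own, under the idealization that a decidable derivability test derives(axioms, formula) is available (which is of course not guaranteed in general); all function names are assumptions of the sketch.

from itertools import combinations

def is_sufficient(axioms, phi_E, derives):
    """Syntactic sufficiency (Definition 4.3.1): the axioms allow a derivation of phi_E."""
    return derives(list(axioms), phi_E)

def is_minimal(axioms, phi_E, derives):
    """Syntactic minimality (Definition 4.3.2): sufficient, and no proper subset is."""
    if not is_sufficient(axioms, phi_E, derives):
        return False
    return not any(is_sufficient(subset, phi_E, derives)
                   for k in range(len(axioms))
                   for subset in combinations(axioms, k))

def relevant_axiom_sets(axioms, phi_E, derives):
    """All subsets of the axiom pool that fulfil the syntactic criterion of relevance
    (Definition 4.3.3), found by brute-force enumeration (exponential; sketch only)."""
    return [set(s) for k in range(len(axioms) + 1)
            for s in combinations(axioms, k)
            if is_minimal(s, phi_E, derives)]

If the axioms are not independent, relevant_axiom_sets may return several distinct minimal sets for the same explanandum statement, which is precisely the point made in the following paragraphs.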

Independence and atomicity of the axioms : two welcome features

In this paragraph, I emphasize that the independence and atomicity of axioms are desirable features of axiomatized theories if they are to provide good explanations.

If independence is missing, then different minimal sets of axioms may be found for a single explanandum statement ϕE. If the axioms are not atomic, in the sense that Newton's principle stands for a cluster of several fundamental facts such as the intensity of the increase of momentum, its direction, etc., then the syntactic relevance analysis cannot "reach into" the axioms in order to select, within the logical content of these axioms (which stand for several distinct facts), the part of this content that is relevant to the explanandum fact.

This illustrates again that the quality of the explanation that can be built in the setting of a theory crucially depends on the quality of the linguistic formulation of this theory, even if one claims that a theory is a set of models. But again, finding a good theory and a good formulation of it are scientific activities that are logically anterior to the search for explanations, even if this search partly depends on the output of these activities and even if all these activities are often carried out at the same time in scientific practice.

4.3.4 Formulation of a semantic criterion of relevance

Even if one remains in an axiomatized setting, it is desirable to give a semantic formulation of the criterion of relevance.

• A syntactic criterion of relevance makes this notion totally relative to a formulation of the theory and to the inference rules that are used in this or that formal system. It is therefore better to try to find a semantic notion, which will make the notion relative to the theory itself, and not to its linguistic formulations. In this perspective, finding a notion based as much as possible on semantic concepts is a first step.

• In order to emphasize that good explanations, which select only the relevant facts, are more general13, it is appropriate to find a criterion relying on semantic notions such as that of a semantically valid argument, which are defined in terms of sets of models.

• Finally, if the theories that one needs to rely on for physics are incomplete, the notion of syntactic derivation might be insufficiently powerful to capture all the cases where one needs to establish logical connections between statements. Since this is a delicate and speculative additional reason, I shall not say any more on this topic.

13 See the discussion of the explanation of the law of areas.


The criterion of semantic relevance

To get a semantic criterion of relevance, we can recycle the previous definitions, with this difference that one should use the notion of semantically valid consequence (noted "|="), instead of the notion of derivation.

Definition 4.3.4. Semantic sufficiency
A set A = {A1 . . . An} of axioms of an axiomatized theory T is semantically sufficient for an explanandum fact E if the argument schema having these axioms as premises and a formula ϕE, the interpretation of which is E, as conclusion is semantically valid, that is to say if A1 . . . An |= ϕE14.

Definition 4.3.5. Semantic minimality
A set A = {A1 . . . An} of axioms of an axiomatized theory T is semantically minimal for a fact E if A is semantically sufficient for E and no proper15 subset of A is.

A semantic definition of relevance can now be proposed.

Definition 4.3.6. Let T be an axiomatized theory, E the explanandum fact, and ϕE a formula the interpretation of which is E. Let I be a valid argument having ϕE as conclusion and axioms A1 . . . Aj as premises.

Semantic criterion of relevance. Argument I and the set of axioms A = {A1 . . . Aj} fulfil the condition of semantic relevance for E in the setting provided by T if A is a semantically minimal set for E.

It should be noted that what is defined here is partly relative to the formulation of the theory that is used. Further, in order to get "the" true explanation of E, T must be true. Also, the remarks about the benefits drawn from having atomic and independent axioms still apply.
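The same recipe can be sketched semantically, with the derivability test replaced by model checking (again my own illustration; axioms and ϕE are represented as Boolean-valued functions on models, and the finite stock of candidate models is an idealization, since the relevant classes of models are in general not finite).

from itertools import combinations

def semantically_sufficient(axioms, phi_E, models):
    """A |= phi_E, checked over a finite stock of candidate models: every model
    that satisfies all the axioms also satisfies phi_E."""
    return all(phi_E(m) for m in models if all(ax(m) for ax in axioms))

def semantically_minimal(axioms, phi_E, models):
    """Semantic minimality (Definition 4.3.5): sufficient, and no proper subset is."""
    if not semantically_sufficient(axioms, phi_E, models):
        return False
    return not any(semantically_sufficient(subset, phi_E, models)
                   for k in range(len(axioms))
                   for subset in combinations(axioms, k))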

Should we take incompleteness into account ?

In this section, I review some arguments about which theories are indispensable for physics in general and for finding explanations in physics in particular, and whether these theories are incomplete.

This is in any case a point on which I need not commit, even if some important questions about explanation might depend on these issues, e.g. "Can or should the proof that some explanatory premises are sufficient for the explanandum statement be included in the explanation?" If, in some cases, such proofs do not exist, they obviously cannot be included in the explanation.

14 Isabelle Drouet suggests saying more simply that a set A = {A1...Ap} is sufficient if all models of A = {A1...Ap} are also models of ϕE. By doing this, one gets rid of the notion of deduction and one no longer focuses on the "path" from premises to conclusion. It is perhaps true that focusing on the path from premises to conclusion comes from a misplaced emphasis, within the definition of explanations, on the epistemology of explanations and on how we identify that some statements can explain another statement.

15 A proper subset of a set A necessarily excludes at least one member of A.


Explanation : an absolute notion

In this section, I defend the claim that explanation is an absolute notion in the following sense : once a theory has been selected, whether a set A of statements explains another statement E should not depend on whether we have the knowledge that A explains E or can easily access this knowledge. From this point of view too, giving a definition of explanation which is independent of any particular formulation of a theory is desirable, since the linguistic formulations of a theory are the particular media that we use in order to explore the content of the theory and of its models.

At the same time, the claim that the notion of explanation is absolute is completely compatible with a position that puts the emphasis on the linguistic formulations of theories when studying the epistemology of explanations. When it comes to understanding and analysing how we manage to produce explanations, to select the relevant facts, etc., a close scrutiny of the linguistic formulations that we use for these activities is required. It is clear that how we manage to be successful in these different activities depends (among other things) on the quality of the linguistic formulations that we can use and on how well these formulations are suited for these activities. For example, possessing axiomatized theories with independent axioms can help one to select the relevant facts, even if an axiomatic setting may be unnecessarily cumbersome for completing other scientific activities.

From quality to generality

With this semantic version of the criterion of relevance, it becomes clearer that explanations that satisfy the relevance requirement more completely are more general. Since these explanations select fewer statements as explanatory from the complete fundamental description of the target system, the class of models in which these statements are true, and in which potentially similar explanations of similar facts obtain, is wider.

This is in agreement with what we saw i) about the requirement of maximality of the homogeneous classes in the SR model ; ii) about the maximality of the set of situations that an exhaustive nomological statement is about ; iii) about the greater generality of Newton's explanation of the law of areas.

4.3.5 When theories are not axiomatized : explanation and relevance in usual theoretical settings

I finally address the problem of giving a criterion of relevance for the much more frequent case where theories are not axiomatized.

4.3.5.1 Why a criterion is required for cases in which theories are not axiomatized

In this section, I give arguments so as to justify that a criterion of relevance is needed for the cases where theories are not axiomatized.

I shall not expatiate on this issue, since the arguments run along the same lines as the arguments against the received view of theories. Since considering that theories are particular linguistic entities is not a satisfactory position, it is a flaw of the previous formulations of the criterion of relevance that they make what is explanatorily relevant to a fact relative to particular formulations of theories, and not to the theories themselves.


4.3.5.2 Criterion of relevance for non axiomatized theories

• Step 1. Identification of fundamental statements.

If, for defining a criterion of relevance, we do not favour a particular formulation of a theory, a problem that immediately arises is that we can no longer consider that the set of fundamental statements that can be used as premisses of explanatory arguments is clearly identified (since we can no longer consider that this set is given by a recursively enumerable set of axioms). This is clearly a worry, because we do not want to permit that any sentence that is true within the model representing the target system can be used as explanatory material. The reason is that, if we permitted this, the best explanation of a derived explanandum statement could be the statement itself. For example, when we explain the fact that the law of areas obtains for the Earth or Mars, it is legitimate to consider that the explanation should be built upon information about what the theory indicates as fundamental quantities or properties of the system, such as the masses or velocities of bodies, the type of forces, the dynamics of the system, etc.

As a consequence, if one wants to be able to build explanations, one must suppose that a theory is not just a set of models but also indicates which statements, among those true in the mathematical model M representing the target system, are fundamental and can legitimately be used in order to build explanations of other statements also true in this model.

• Step 2. Selection of sufficient descriptions for a fact E.

Once step 1 has been completed, the definition of relevance can proceed as before, with this difference that any fundamental statement that is true within the mathematical model M can be selected to build explanations, and not just the statements that can be made within a particular formulation of the theory.

The two following definitions of sufficiency can then be proposed.

Definition 4.3.7. Deductively sufficient description for an explanandum statement ϕE .

A description D, composed of a set of statements that are fundamental16 in the setting provided by theory T, is deductively sufficient for an explanandum fact E if a statement ϕE, the interpretation of which is E17, can be deduced from D.

This definition is not completely satisfactory because the notion of deduction requires that a particular formalism with sound inference rules has been used, which makes the definition relative to a particular formal system. As a consequence, it is again more appropriate to use the notion of semantically valid argument.

Definition 4.3.8. Semantically sufficient description for an explanandum statement ϕE .

A description D, composed of a set of statements that are fundamental in the setting provided by a theory T, is semantically sufficient for an explanandum fact E if the argument having these fundamental statements as its premisses and ϕE as its conclusion is valid.

16 That is to say, statements the interpretation of which "corresponds" (see note 11) to fundamental facts.
17 See note 11.


Again, one should not be content for explanatory purposes with a semantically sufficient description for an explanandum statement ϕE, since this description can include irrelevant and "useless" statements. In consequence, one needs again to define the notion of minimality.

• Step 3. Selecting just the relevant facts.

Minimality is here more difficult to define. As we said above, we can no longer consider that we have a pool of fundamental statements provided by a formulation of the theory, and that we need to select within this pool the statements indispensable for the explanation of E. Further, one should also bear in mind that the statements that are used to present the theories can be unnecessarily informative for our specific explanatory purposes, as the FPD is unnecessarily informative for the explanation of the law of areas. As a consequence, I offer an "extensional" definition that is based on the notion of maximality of the corresponding class of models. Roughly speaking, if a description is unnecessarily informative for some explanatory purposes, then one can find a less informative description that is sufficient and true in a wider class of models.

Definition 4.3.9. Minimal explanatory description for a statement ϕE in the setting of a theory T.

Let M_D be the set of models in which the fundamental description D (in the setting of a theory T) is true.

Description D is minimal for statement ϕE in the setting provided by theory T if there is no other fundamental description D′ in the setting provided by theory T that is sufficient for ϕE and such that M_D ⊂ M_{D′}.

This definition does not provide a quantitative measure of the quality of explanations. Further, it is clear that the sets M_{Di} corresponding to different sufficient descriptions Di for an explanandum fact E cannot all be compared, because the descriptions Di can include different irrelevant statements, which are true in potentially different models. At the same time, this definition shows how the quality of explanations relying on a description Di varies with the cardinality of the class M_{Di}, that is, in inverse proportion to the informational content of Di, which is in agreement with Port-Royal's law, according to which the comprehension and the extension of a concept vary in inverse proportion. The more irrelevant information you add, the less general the explanations you get.
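As a rough operational reading of Definition 4.3.9 (my own sketch: descriptions are idealized as sets of Boolean-valued statements evaluated over a finite stock of candidate models, and the pool of candidate fundamental descriptions is given explicitly, both of which are simplifying assumptions):

def model_class(description, models):
    """M_D: the indices of the candidate models in which every statement of D is true."""
    return {i for i, m in enumerate(models) if all(stmt(m) for stmt in description)}

def sufficient(description, phi_E, models):
    """D is sufficient for phi_E if phi_E holds in every model of M_D."""
    return all(phi_E(models[i]) for i in model_class(description, models))

def minimal_description(description, phi_E, candidates, models):
    """Definition 4.3.9 (sketch): D is minimal for phi_E if D is sufficient and no
    other sufficient candidate description has a strictly wider model class."""
    M_D = model_class(description, models)
    if not sufficient(description, phi_E, models):
        return False
    return not any(sufficient(D2, phi_E, models) and M_D < model_class(D2, models)
                   for D2 in candidates)

The strict-subset test M_D < model_class(D2, models) mirrors the condition M_D ⊂ M_{D′} in the definition: a description loses out only to a sufficient rival that is true in a strictly wider class of models.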

A general criterion of relevance can now be formulated.

Definition 4.3.10. General (and non-formal) criterion of relevance
A description D that is fundamental in the setting provided by theory T fulfills the criterion of relevance for a fact E if D is minimal for E.

This definition obviously mentions linguistic entities ("descriptions"), but no particular formulation of the theories is favoured. In this perspective, providing a suitably axiomatized formulation of a theory can be seen as a means of finding minimal descriptions for a fact E. Finally, one should remember that whether a description that is minimal in the setting provided by a theory T is the explanation of a fact E depends on whether T is true. If T is false, this description is the good explanation of E in the setting provided by the false theory T or, for short, D is a good but false explanation of E.


4.3.5.3 Final remarks about the criterion of relevance

In these paragraphs, I make a few final comments about the criterion of relevance18.

• A criterion, not a research method
The criterion gives a requirement that must be fulfilled by an explanation, but it does not provide a method for finding minimal descriptions, which remains a scientific task.

• Explanations, good explanations, the best explanations
The criterion gives a requirement that must be fulfilled by an explanation composed only of relevant facts. But how well can we fulfill this condition ? In most cases, we know we possess sufficient descriptions, from which we do offer explanations ; it sometimes happens that we manage to cross out some irrelevant statements from these descriptions (as in the explanation of the Saccheri-Legendre theorem, in which Euclid's fifth postulate was crossed out by Saccheri). But finding a minimal description and proving that a description is minimal is no doubt even more difficult.

All this considered, it is perhaps more appropriate to call sufficient descriptions "explanations" and to consider that the more one satisfies the relevance requirement by building proofs with less informative premisses, the better the corresponding explanations.

• About the need to rule out disjunctive statements
In order to guard against artificial enlargement of the classes M_{Di} corresponding to sufficient descriptions Di, it is necessary to rule out the use of disjunctive descriptions.

• A slight concession to pragmatics : the choice of initial conditions
I have argued in the previous sections that, pace Railton or van Fraassen, the notion of relevance is not merely a matter of pragmatics. There is however a final point about which pragmatics does matter. When it comes to explaining that a physical system is in a state E that depends on previous states of the system, there is no doubt a pragmatic choice to be made in order to determine which set of initial conditions (namely at t−1, t−2 or t−n, etc.) is chosen to build the explanation. As a consequence, a fact E can have different minimal explanatory descriptions, one for each set of initial conditions that can be chosen.

4.4 Concluding remarks

4.4.1 The two faces of the hexed salt: identifying theories and identifying relevant explanatory facts

We can now see that the problem of explanatory relevance, as it is traditionally analysed, actually corresponds to two different requirements.

i. First, one needs to identify the right theories.

18 I have selected here some of these comments from the original document.


ii. Second, one needs to select the right explanatory facts within the pool of what the theory indicates to be fundamental and potentially explanatory facts.

With these two requirements in mind, the hexed salt puzzle can now be analysed in two different ways.

i. If the theory T that one relies upon includes the law L "all hexed samples of salt dissolve in water", the hexed salt explanation is perfectly fine relative to this theory, even if further empirical research is likely to provide evidence that L is not exhaustive (in Reichenbach's sense) and that theory T can be replaced by a theory T′, which includes, instead of L, the law L′ "all samples of salt dissolve in water".

ii. Once one possesses this theory T′, it is clear that the hexed salt explanation is not satisfactory because L is just a derived (versus fundamental) nomological statement, which further unnecessarily restricts the domain of the fundamental law L′. So the hexed salt explanation is, in the setting provided by T′, a low-quality derived explanation.

4.4.2 Fundamental nomological regularities and exhaustive derived regularities

The idea of exhaustiveness can be fruitfully recycled here. Finding an explanation that includes only relevant facts is tantamount to finding an exhaustive nomological regularity which summarizes the explanation.

Suppose Pert_i (resp. NoPert) stands for a description of the relevant (resp. some irrelevant) facts for an explanandum fact E. Then the explanatory argument is :

Pert1
Pert2
-----
E

Establishing that this argument is valid can take a long time. But once this has been done, one can write down the following argument, which summarizes the explanation:

(Pert1 ∧ Pert2) → E
Pert1 ∧ Pert2
-----
E

This argument is much easier to check, but it has an additional premisse, (Pert1 ∧ Pert2) → E, which is a derived statement. If irrelevant facts had been included, the additional premisse would have been (Pert1 ∧ Pert2 ∧ NoPert) → E.

The nomological statement (Pert1 ∧ Pert2) → E is exhaustive in the (new) sense that no less informative antecedent, having a wider extension, can be used if the conditional is to remain true, whereas the conditional (Pert1 ∧ Pert2 ∧ NoPert) → E is not exhaustive in this sense.

In conclusion, finding an explanation amounts to finding an exhaustive statement in this new sense; but one should be careful that this statement is a derived one and cannot be used for explanatory purposes, even if it provides an elegant summary of the explanation and condenses its useful content.


4.4.3 On Hempel's shoulders we stand

I do not subscribe to all the claims made by Hempel. Still, what I have said above can be used to provide a modified version of Hempel's DN model. The requirements for Hempel's DN model are :

i. The explanandum must be a logical consequence of the explanans.

ii. The statements in the explanans must be true.

iii. There must be at least one law in the explanans.

My revised version of the DN model is :

i. The statements in the explanans must be fundamental statements in the setting provided by theory T.

ii. The explanandum must be a logical consequence of the explanans.

iii. There must be at least one law in the explanans. <Something I have in fact not discussed.>

iv. The explanans must satisfy the requirement of relevance if the explanation is to be as good as possible. Relevance is a specific, and not a pragmatic, virtue of explanations, and the relevance requirement need not be fulfilled when making predictions or checking that a theory is empirically adequate.

v. In addition, for the explanation to be true, theory T as well as the explanans must be true. If theory T is empirically adequate, approximately true, idealized, etc., the explanation is empirically adequate, approximately true, idealized, etc. too.


Chapter 5

Conclusion : good explanations deduce nothing too much

I can now briefly answer the introductory worry stated at the beginning of part 1. Because of the explanatory (but spurious!) ideal that is embodied by the causal model of explanation, I had described the worry that there could be a (potentially unbounded) inflation of the explanatory task with the (potentially unbounded) increase of the quality of explanations. This would have been the case if the more detailed explanations were, the better they also were.

I have argued that this worry is not legitimate.

To provide a good explanation, it is sufficient to be able to deduce the explanandum statement from what our theories designate as fundamental statements. If the set of premisses used in the explanatory argument can be decreased, this is all the better. On the contrary, deducing more information about how the target system behaves and what the detailed causal processes are requires using premisses that are as informative as possible. In other words, this goes against the requirement of relevance and ruins the quality of explanations.

The second part of the inquiry can now begin. Since producing the optimal explanation requires showing that some statements are a logical consequence of some more fundamental statements, the complexity of the task is determined by the complexity of this deduction. So the question is now : is it possible to provide a robust measure of how difficult it is to produce explanatory deductions ?


Part II

The intrinsic opacity of phenomena


Chapter 6

Where are the boundaries of science ?

6.1 The boundaries of what can be known in practice

There is a tradition in which explanation is analysed as an activity of a subject. Pragmatists, like Bromberger, describe explanation as a linguistic performance. And if explanation is an activity, explanations do not exist independently of their being performed by a subject. An extreme position of this type has been defended about proofs in philosophy of mathematics by Martin-Löf and Brouwer :

"Thus a proof is, not an object, but an act. This is what Brouwer wantedto stress by saying that a proof is a mental construction [...]. And the act isprimarily the act as it is being performed. Only secondarily, and irrevocably,does it become the act that has been performed." [42]."

Against positions of this type, I have claimed in part 1 that the predicate "to be an explanation of" is a logical relation1 (in the setting provided by a theory). As a consequence, whether X is an explanation of Y is something that holds independently of our discovering or proving that X is an explanation of Y, even if it is only when we manage to prove that Y is a logical consequence of X that we know that X can explain Y2. It is in particular possible that X is a good explanation of Y even if, because of our limited faculties and resources, we are in practice unable to prove that X is an explanation of Y or to identify X as an explanation of Y.

When it comes to analysing what really belongs to our science and what we can know, predict or explain, the perspective must be different, and close attention should be paid to what we can do. As noted by Humphreys [33, p.153], impossibility results, like Gödel's incompleteness theorem, the impossibility for particles to exceed the speed of light, or Heisenberg's principle, have played a central role in twentieth-century philosophy of science. Such attention to these negative results is legitimate because such results indicate what cannot be done or known in principle, and what cannot be done in principle cannot be done in practice either. Yet, when it comes to establishing what can be known in practice, such results are of little help, because what can be done in principle is often impossible to carry out in practice. To quote Humphreys again [33, p.153], "to say that for

1Notions like "sufficiency" and "minimality" have been spelled out in logical terms.2Other conditions need to be fulfilled, see part 1.


scientific purposes a function is computable when it is computable only in principle and not in practice is rather like presenting a friend with a million dollars inside a safe sealed with an uncrackable lock and telling him, "Now you are a millionaire"".

In this part, my purpose is to try to determine where the boundaries of science lie and to what extent one can hope to push them back. To do that, I discuss which operations need to be carried out in order to come into possession of explanations or predictions and how one can measure how difficult or complex these operations are. Finally, by relying on computational complexity theory, I claim that this complexity or difficulty cannot be arbitrarily decreased. In consequence, I argue that, even if, by extending our computational resources, we can push back the boundaries of science, the progress that can be made is constrained by the intrinsic difficulty of producing explanations or predictions.

6.2 Explanation and understanding : two different notions

In the next sections, I shall argue that what is central for discussing where the boundaries of science lie is the cost of finding explanations. Some people may object that considering the cost of getting (some) understanding may be more appropriate (all the more since understanding can sometimes be gained at a lower cost). Understanding is not my favorite notion. When discussing this notion, people often want to keep the honest benefits of a solid discussion about explanation, which is a partly normative notion3, and to make this coherent with epistemological, descriptive and phenomenological considerations about what we feel understanding is (and different people feel differently). It is not sure that such a synthesis is possible. As for me, I do not know what exactly understanding is and I do not want to discuss this question here. Yet, I shall answer a couple of objections which could be raised against me and which rely on this notion (or on conceptions about what this notion is). My goal is to argue that it is not appropriate to focus on the cost of getting understanding when discussing the domain of our science.

6.2.1 Explanation and understanding : different nature, authors, and possessors

Traditional theories of explanation tell us that explanations bring understanding, which is usually considered as a state of mind related to the knowledge of why something is the case. There is no doubt that explanations that are produced by humans can bring cognitive understanding to some humans ; but, in computer-aided science, it is less obvious that they always can.

Explanations, qua linguistic entities, can be possessed and produced collectively, especially when it comes to what is called "big science", in which hundreds of scientists, aided by number crunchers, make predictions or produce explanations of phenomena (as at CERN). A Hempelian theorist can consider that a simulation provides a huge valid explanatory argument fulfilling the requirements of good explanations and that we do possess

3 Whereas, when one discusses explanation, a pinch of normativity is welcome, in order to define what a good explanation is on the basis of scientific successes.


these explanations in our registers. It is less obvious that understanding can be possessed collectively. There is perhaps something like a collective understanding of phenomena by communities of scientists. Still, if there is such a thing, the notion of collective understanding is only analogous to individual understanding (and remains to be clarified).

So if one sticks to a classical notion of understanding, focused on individual minds, it is clear that what can be completely understood cannot nowadays encompass all scientific activity and its products. So focusing on what can be cognitively understood is a way to put aside parts of what belongs to our science, which is not the best choice for studying the domain of science.

6.2.2 Different computational costs

Another objection is that understanding is sometimes easier to get than fully fleshed-out explanations. So by focusing on the cost of explanations, one would unduly overestimate the cost of what it takes to get some scientific grasp on phenomena. I shall answer this objection now.

� "This second hand car is cheaper. – I asked you the price of a new one. – But youcannot afford it. — This is a different problem and none of your business."

In some cases, understanding of phenomena enables one to anticipate easily what the behaviour of a system must be. Feynman seems to have something like this in mind when, talking about equations, he says :

"I understand what an equation means if I have a way of figuring out thecharacteristics of its solution without actually solving it."[22, vol.2, 2.1]."

So understanding is associated with a certain ease. And it is true that a deep scientific understanding of a system can provide such a quick way to predict the characteristics of the behaviour of a system. For example, after calculating the mean internal energy of a gas (by relying on statistical physics), one finally gets the knowledge that this quantity depends on the number of degrees of freedom of the molecules. Similarly, physicists studying chaotic systems consider that they get some real understanding of the behaviour of these systems if they can anticipate, from the system's description, the topology of its trajectory in phase space.

Some care is needed here. If understanding is just a cognitive phenomenon, then it is almost tautological to say that its cost is no more than the cost of the cognitive operations that we are able to perform. And in some cases, it is true that we can summarize our knowledge about a system so as to make it cognitively graspable (as in the mean energy example above) ; in such happy cases, this summarized knowledge is both genuine and cognitively accessible. And there is no doubt that understanding as a pure cognitive phenomenon, correlated with feelings such as ease, is also an interesting field of study (let us call it cognitive understanding).

However, one hardly sees why cognitive understanding should be the beginning and the end of scientific understanding, nor why scientific understanding should always be


quick and easy4. A student who knows that the internal energy of a gas can be estimated quickly by counting the number of degrees of freedom has perhaps some cognitive understanding, but he does not necessarily have a deep scientific understanding. Further, scientific understanding need not always result in full cognitive understanding: sometimes, one can have quick and easy cognitive understanding only of a part of what one scientifically understands. In consequence, situations like those depicted by Feynman just show that, in some cases, it turns out that the passage from scientific understanding to cognitive understanding is not an all-or-nothing matter (which is good news). But in some other cases, we may scientifically understand something and still need a long time to connect it with the basic statements (laws, axioms, principles) of our science. So the quote should be "if I scientifically understand an equation, then I sometimes have a way of figuring out the characteristics of its solution without actually solving it." Finally, one may not scientifically understand a phenomenon completely even if one can get some partial scientific and cognitive understanding of it. In the case of chaotic systems, if one is interested in the exact trajectory of a system, then it is not true that one understands it completely when one can explain the topology of this trajectory, because this is only one aspect of its behaviour.

Overall, it is true that one can sometimes get some (cognitive or scientific) understanding of a phenomenon at a low cost, but it is inappropriate to take this as a standard when a full scientific understanding is what we are after, even if this full understanding takes much more sweat (and even if we do not have all the required sweat).

• Why understanding can be cheaper ... in a second step.

Understanding is often easy to get because it is the output of a preliminary and difficult piece of research that made it possible. Take the example of the internal energy of a gas. What the above student learns ("add 0.5kT for each degree of freedom") is just a summary of the full explanation. Suppose L stands for the appropriate laws of physics, C for the description of the appropriate features of the gas, and E for the statement saying that the internal energy is gkT/2 per molecule in this gas, g being the number of degrees of freedom of its molecules. The explanation can be presented in two very different ways. It may look like this:

L
C
-----
E

In the former case, it takes a long time to show that the argument is valid, because one needs to show from the laws of physics L why, for a gas like the one described by C, E is true. But once this long explanation has been made, it can be summarized like this :

(L ∧ C) → E
(L ∧ C)
-----
E

4 But it is clear that the more quickly one can get scientific understanding, the more chances one has to enjoy cognitive understanding.


From this argument, it is very quick to determine what the internal energy is for gases, and it takes only a syllogism. In fact, such an argument summarizes the useful part of the previous complete explanation, and it is perhaps what gives the most acute feeling of understanding. But, to use Hempel's terminology, this is in fact just an explanatory sketch (even if the argument is valid) or a summary of the full explanation. And the real "cost" of the understanding that goes with this quick and short explanatory sketch is best described by the cost of showing that the first "complete and painful" argument is valid. Had one not been able to establish the validity of the first argument, we would not have been able to possess and benefit from the second, quick and short argument.
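For concreteness, the piece of knowledge being summarized here is the equipartition result, recalled as standard background rather than as part of the original text: each quadratic degree of freedom contributes 0.5kT on average, so that

$$\langle u \rangle \;=\; \frac{g}{2}\,kT \ \text{per molecule},
\qquad U \;=\; \frac{g}{2}\,N\,kT \ \text{for a gas of } N \text{ molecules}.$$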

6.2.3 Opacity, or the cost of "primary understanding"

As we saw in the previous paragraphs, the part of science that can be produced, surveyed and completely mastered by private individual scientists is only a part of the science that is produced. The easy and quick understanding that private individuals can get is only one part of what we scientifically "understand" or, more precisely, can explain. And the understanding one can get easily is often made possible by more laborious tasks such as producing complete explanations. As is noted by Humphreys [33, p.269], one should be careful to distinguish between secondary understanding (the one we can get once the explanations are already in the books) and primary understanding (the scientific understanding one gets when one manages at last to produce, for the first time, a correct and complete explanation of the fact to be explained). The debates about explanation often confuse the two, and these two types of understanding have significantly different costs.

All things considered, with a view to determining what exactly is within our scientific reach, it is more appropriate to focus on the cost of primary understanding, that is to say, of finding the explanation for the first time.

It should be noted immediately that this choice may in some cases have unexpected consequences. Some facts or explanations which already belong to our science can be quite simple to understand and to describe, whereas the cost of producing their explanation was extremely high. For example, understanding the literal meaning of Fermat's theorem and using this theorem is trivial, whereas proving it was extremely difficult. Describing the fact that this leaf fell onto the gentleman's hat during the storm is easy (this is no complex behaviour), and understanding how it can have been possible is easy, whereas explaining in detail how this happened can be almost impossible. In brief, the notion of opacity does not mirror how easily we can understand or make use of the knowledge of some facts once they belong to our science. It is designed to mirror how difficult it is to access this knowledge for the first time and to include it in our science. And facts which belong to our science and with which we feel at ease can have a high opacity because explaining them for the first time was difficult. To sum up, one must distinguish between primary opacity, which corresponds to the difficulty in (and cost of) gaining primary understanding, and secondary opacity, which corresponds to the difficulty in (and cost of) gaining secondary understanding. And to determine where the boundaries of what we can know lie, one must focus on primary opacity, since secondary opacity


only measures how difficult it is to master some knowledge one already possesses in our books.

6.3 How to measure the cost of primary understanding ?

Saying that primary opacity is what one should focus on to describe the domain of science is of little help if one does not find a way to measure this opacity. I discuss in this section different activities which are related to the finding of explanations and show how they differ in computational cost :

• identifying a set of statements as potentially explanatory for a statement E ;

• establishing that a potentially explanatory set of statements is really explanatory for a statement E ;

• checking a proof that a potentially explanatory set of statements for a statement E is really explanatory ;

• etc.

6.3.1 Identifying potential explanations

Even when one does already "in principle" possess an explanation of a fact, one may still have to strive for the precise explanation of it. This is what the activity of scientists studying fluid dynamics is mainly about. When studying incompressible flows moving at moderate speeds, one "already knows" that what explains the behaviour of the fluid is its dynamics, described by the Navier-Stokes equations and its initial and boundary conditions. Still, one does not possess good and precise explanations of phenomena in fluids and one still needs to make explanatory hypotheses about what exactly a given phenomenon should be put down to.

This can be illustrated by the explanation of the great speed of sharks. Sharks and vessels move all the more quickly as they manage to drag little water with them and make the boundary layer as thin as possible (i. water is heavy to drag and ii. as more and more water is dragged along, it starts tumbling in eddies and vortices and drag greatly increases). Different factors can decrease drag. Injecting bubbles of air (which is lighter) into the layer is an option, as is secreting oily substances or producing appropriate vibrations. It is important to note that, for sharks, finding the right explanation was not only a question of descriptive accuracy. If one possesses an exact description of the skin of sharks, of its vibration, of the water around it, etc., one only knows that all these details, taken together, are sufficient to reduce the drag, but one does not know which ones actually play a role in drag decrease. To know this, one must determine which details of the description do matter. Some may actually play no role, some may decrease drag but not significantly, some may be more efficient when combined together, etc. After exploring different hypotheses, scientists have found that the riblets of shark skin have just the appropriate size to keep the eddies far from the skin and prevent turbulent water from coming too close. It should finally be noted that finding appropriate potential explanations (here mechanisms that may decrease the drag) and checking whether these mechanisms are the right ones and are


sufficient to produce the observed effect may be an endless (and computationally costly) quest since there is no mechanical procedure to generate such explanatory hypotheses. It is therefore difficult to measure the cost of this part of the building of explanations.

6.3.2 Predicting versus explaining

It is a famous claim by Hempel that a good explanation could have been used as a prediction and that the difference between the two is a matter of pragmatics. It is worth noting that this "pragmatic" difference can imply a significant difference in computational cost. When explaining a fact, you already possess a description of what you are explaining. When predicting something, you do not possess such a description.

This difference is clear in equation solving. Solving an equation whose solution you do not know can be extremely hard. Checking that a candidate solution is indeed a solution can be extremely quick.
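To make the asymmetry concrete, here is a minimal sketch (in Python, with NumPy; the random system and the variable names are illustrative assumptions of mine, not an example from the text): verifying a candidate solution of a linear system only costs a matrix-vector product, roughly O(n^2) operations, whereas solving the system from scratch costs roughly O(n^3).

# Minimal sketch: checking a candidate solution of a linear system A x = b
# is much cheaper than computing the solution from scratch.
import numpy as np

n = 500
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

x = np.linalg.solve(A, b)             # solving: roughly O(n^3) operations
residual = np.linalg.norm(A @ x - b)  # verifying: roughly O(n^2) operations
print("candidate accepted:", residual < 1e-8)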

From this point of view, when describing the domain of science, one should distinguish between predictive opacity and explanatory opacity. Pace Hempel, many phenomena we can explain can hardly be predicted.

6.3.3 Producing a proof versus verifying it

To make sure that the set of statements A1...An explains statement E, one must (among other things) prove that E is a logical consequence of A1...An. Producing a proof and checking such a proof are computationally distinct operations. The difference between the two is the difference (if any!) between classes P and NP and can be illustrated by the SAT problem : given a formula like (a∨b∨¬c)∧(¬a∨¬b∨d)∧(¬d∨a∨¬e), is it possible to find an assignment of truth values to the variables that makes the formula true ? Proving what the right answer is (the formula is or is not satisfiable) is difficult. But if one already knows a satisfying assignment (which functions as a proof), answering the question is extremely easy.
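As a minimal sketch of this contrast (Python; the encoding of clauses as lists of signed integers is an assumption of mine, purely for illustration), verifying a given assignment takes one pass over the formula, whereas the naive search for a satisfying assignment may run through up to 2^n assignments:

# The formula (a or b or not c) and (not a or not b or d) and (not d or a or not e),
# with literals encoded as signed integers: 1 stands for a, -3 for "not c", etc.
from itertools import product

formula = [[1, 2, -3], [-1, -2, 4], [-4, 1, -5]]
n_vars = 5

def satisfies(assignment, clauses):
    # Verification: one pass over the clauses, linear in the size of the formula.
    return all(any(assignment[abs(l)] == (l > 0) for l in clause) for clause in clauses)

def brute_force_sat(clauses, n):
    # Solving by exhaustive search: up to 2^n candidate assignments.
    for values in product([False, True], repeat=n):
        assignment = {i + 1: v for i, v in enumerate(values)}
        if satisfies(assignment, clauses):
            return assignment
    return None

print(brute_force_sat(formula, n_vars))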

Roughly, the difference between producing and verifying a proof is that, when searching for a proof, many different paths are open, many of which are dead ends (a non-deterministic Turing machine can explore all these paths simultaneously). Possessing and checking a proof amounts to making the right choices each time two different paths are open. A proof system is a binary relation P between strings, some of which are the target theorems and others are proofs of these theorems, such that P(x, y) can be decided in polynomial time (whether something is a proof can be checked quickly). A good proof system is one in which theorems have short proofs.

Primary or secondary opacity ?

When we already possess a proof of E, verifying this proof enables one to check that some statements can explain another statement. So if one is interested in finding out what optimal scientific situation can be reached, one should study what the minimal size of the proof is in each case. The optimal scientific situation is the one in which we have in our records this minimal proof. The size of this proof indicates what effort is required in the best case to prove that the statement to be explained E is a consequence of the


Figure 6.3.1: Calculation trees explored by a non-deterministic Turing machine (only the beginning of the tree is shown). Some branches are dead ends ; here it is supposed that only one branch provides a proof.

explanans statements A, B, C. Further, we often possess proofs that were produced by beautiful (or "lucky"5) minds. Since the proof exists, it can be checked and secondary opacity is well-defined ; but in many cases, we have no idea of any algorithm which would have been able to build this proof automatically (so primary opacity is not well-defined in these cases and can be considered as infinite). So the minimal size of proofs is something like an absolute distance between what we know and what we can know and indicates the length of the shortest deductive link between the explanatory facts and the facts to be explained.

In spite of all this, it is better to choose primary opacity as a measure of what is within our scientific reach because what we try to determine is how difficult it is to access some knowledge without any clue (versus what the absolute "logical" distance of this knowledge to what we already know is). To take a metaphorical example, the top of the Mont Blanc is only a couple of kilometers from Chamonix, but it takes days to be able to reach it. Perhaps some shortcuts can be found and, once they are found, it is easier to reach the top. But if no clue is given or can be found by a systematic method (versus by chance), getting to the top for the first time will be difficult. In an analogous way, even if it is sometimes possible to be lucky or particularly smart and to find a shortcut proof, in the vast majority of cases in physics, we have to build the proof from scratch by algorithmic procedures, that is to say to find a systematic method in order, for example, to find the solution of a set of equations (e.g. by doing operations on matrices), or to determine whether a set of constraints can be fulfilled (when studying optimization in spin systems).

Further, it is only when we possess an algorithmic procedure to build a proof that it can be safely considered that the statement to be proved is potentially within our reach, because we know for certain how to produce this proof and how much this will cost. Computationally based science, which has so much extended the domain of science, is precisely

5 When we produce a proof and two paths are open, if we are lucky, we choose the one that is not a dead end and is the shortest.


about phenomena for which predictions or explanations can be built by algorithmic procedures. When no algorithm is available, perhaps the corresponding phenomena or theorems will be explained quickly and elegantly one day ; but perhaps they will not, and there is no way to be sure. To put it in a nutshell, primary opacity gives indications about what can be brought within our science if we decide to explain or prove it. Other phenomena (or theorems), with low secondary opacity but high or infinite primary opacity, might be part of our science, but they do not belong to our science in any sense till the happy day when they happen to be explained or proved.

Summary

I have distinguished between predictive opacity, secondary explanatory opacity and primary explanatory opacity, and I have argued that focusing on the latter is most appropriate to discuss where the boundaries of our science lie.

6.4 Opacity, a notion first relative to individuals and what they know

In the next chapters, I shall argue that opacity has an intrinsic and irreducible component. Before this, it should be recalled that, at first sight, the computational cost of making proofs, finding solutions to problems, etc. depends on a wide variety of factors, including which knowledge can be relied on. Accordingly, the boundaries of what can be explained, predicted, proved, etc. are different for different individuals, be they single scientists or communities of scientists, aided or not by computational power. (Basically, US army scientists can predict, explain or prove things that other scientists in the world cannot.)

6.4.1 What does primary opacity depend on ?

In this section, I list different factors on which opacity depends.

▪ It depends on the algorithmic knowledge that one possesses. For example, the domain of science was significantly extended when the Fast Fourier Transform was invented [18, 19].

▪ It depends on previously known results, such as lemmas, demonstrated theorems, or even the knowledge of the values of functions like the sine, the cosine or the logarithm6.

▪ The length of proofs also depends on the proof system that is used and, if we turn to interpreted mathematics, on the formal systems (like differential calculus, matrix algebra, etc.) that are known. For example, propositional logic tautologies can be proved with or without cuts (that is to say additional and somewhat redundant inferential rules, such as the modus ponens). As Boolos showed [10], cut-free proofs can be significantly longer than proofs using cuts.

6 Before computers were invented, possessing tables describing the values of these functions was extremely helpful to shorten calculations. Baron Gaspard de Prony had these values systematically calculated [11] in order to make the full cadastral map of France.


Figure 6.4.1: Efficiency hierarchy of proof systems. The higher the system in the hierarchy, the more efficient it is. It is not known whether this hierarchy ends. The problem with high-level proof systems is that they do not enable one to find algorithms for finding proofs systematically, whereas low-level and less efficient proof systems, such as truth-table checking, make the systematic search for proofs possible and cannot for this reason be simply put aside.

▪ The computational cost also depends on the type of computer that one uses. RAM machines and Turing machines do not have exactly the same performance. Also, different Turing machines can do more or less well on identical problems. The linear speed-up theorem even says that if a language belongs to the class DTIME(f(n)), then for any constant k > 0 it also belongs to the class DTIME(f′(n)), with f′(n) = k·f(n) + n + 2, which justifies that constants are not taken into account in complexity theory.

Figure 6.4.2: Proof of the linear speed-up theorem. The transformation Σ × Σ −→ Σ′, and more generally Σ^k −→ Σ′, is made. The size of the alphabet therefore grows exponentially with k and several "old" computational steps can be performed in one "new" step.

Overall, this review shows that opacity can vary with the knowledge that one has and the different tools that one can use.

6.4.2 The different boundaries of science

Even if one puts aside the fact that opacity may not be the same for different individuals or groups of individuals, it seems more appropriate to speak of several boundaries of science, as can be seen from figure 6.4.3.


Figure 6.4.3: Different domains of phenomena for an individual i having resources K. Only the phenomena that can be accounted for by the theories accepted by i are shown.

There is first the set of phenomena for which we already possess explanations. This set is a subset of the phenomena that can be explained with resources K (in between, but not represented, stands the set of phenomena that can be predicted with resources K). The set of phenomena the explanation of which can be verified is even wider, but many of these phenomena have not been explained yet because building their explanation is difficult. The last set represented is the set of phenomena that one can explain or predict with infinite resources. If our theories are incomplete, there are perhaps some phenomena beyond, falling within the domain of our theories but not derivable within these theories. Incidentally, one should note that if a theory is a set of models, only the models that can be identified with resources K can be used to describe this theory — but one can still use formulations of the theory to present a complete and "packed" description of the theory, so to speak.

Summary

Humphreys, in Extending Ourselves, describing the development of computational science, writes [33]: "If we are concerned with, among other things, how science progresses, then the issue of how science pushes back the boundaries of what can be known in practice should be a primary concern. That is because scientific progress involves a temporally ordered sequence of stages, and one of the things which influences that progress is that what is possible in practice at one stage was not possible in practice at an earlier stage." [33, p.124].

In the previous paragraphs, I have developed different concepts with the aim of making this notion of "boundaries of what can be known" more precise.


6.5 Does opacity have an intrinsic component ?

As we have seen, the position of the boundaries of science depends on the amount of computational resources that one can use. There are two main ways of pushing back the boundaries of science. The first one is to develop our technology and to build more powerful number crunchers. There is potentially no limit to this quantitative development of our technology. The second one is to develop our knowledge as to how one can solve problems or build proofs more efficiently. Since the development of knowledge can lower opacity, this notion can be labelled as "epistemic". My purpose is to show that opacity possesses an intrinsic and irreducible component and that it cannot be arbitrarily decreased by the development of knowledge. If this is so, how we can push back the boundaries of science is constrained by this intrinsic component of opacity and extending our computational resources can only bring about limited progress. Three questions must therefore be answered :

• Is it possible to show that opacity has an intrinsic irreducible component ?

• Can opacity characterize singular deductions or instances of problems (e.g. "2+2" and not addition in general) ?

• Is opacity "transparent" in the sense that it is possible to determine the value ofits intrinsic irreducible component and therefore to determine the limits of the pro-gresses that can be made in efficiency ?

6.6 Description of the following chapters

This part is organized in the following way. Chapter 7 presents how the theory of computational complexity can be used to study physical problems.

Chapter 8 shows why the usual framework of complexity theory, that is to say the study of the complexity of problems (i.e. infinite sets of instances or questions), is not an appropriate setting to measure the opacity of particular deductions or instances of problems.

In chapter 9, I try to show how, by partitioning problems, one can hope to refine our knowledge of the map of opacity, and I introduce the notion of complexity cores, which are infinite sets of instances which do not have easy subsets.

I finally show in chapter 10 why the notion of complexity core fails to provide robust foundations for the notion of hard instance. To remedy its defects, I present a different notion, "instance complexity", which takes into account both the length of the computation and the size of the algorithms that one uses, and I argue that this notion i) does provide a good measure of opacity and ii) shows that opacity has an irreducible component — the trouble being that the value of instance complexity is often difficult to measure in practice.


Chapter 7

From physical models to computational complexity theory

The following chapters are devoted to showing, by relying on computational complexity theory, that producing the explanation or the prediction of an event can require an irreducible amount of computational resources.

Since I shall heavily rely on computational complexity theory in my argument, I start with an introductory chapter in which I introduce the basic notions of this theory (section 7.1) and show how it can be applied to study the complexity of the prediction and explanation of physical systems (section 7.2).

7.1 Complexity theory and the cost of computations

In this section, I present different notions from computational complexity theory, most of which are well-known and can be found in any handbook about this subject1. For this reason I shall be very brief.

7.1.1 Turing machines and the study of computational resources (time and space)

The usual model of computation for studying computational complexity is the Turing machine. Roughly speaking, a Turing machine ("TM" for short) has several work tapes. The input is stored on the first one. A TM also has different inner states, owing to which it can perform different operations in similar contexts, depending on its inner state. For example, when doing an addition, different operations need to be done when adding the digits 6 and 3. If the carry is "1", one should write "0" and carry "1". If the carry is "0", one should write "9" and carry "0".

A more formal definition of a Turing machine

Definition. A TM (Q, Σ, Γ, δ, q0, q_accept, q_reject) is a 7-tuple, where Q, Σ, Γ are finite sets and

1. Q is the set of states of the TM

1 A good and pedagogical recent reference is [66]. I also heavily rely on [54].

Page 53: Cyrille Imbert - poincare.univ-lorraine.fr · Cyrille Imbert. Contents General introduction 5 I Explanations : the more detailed, the more explanatory? 8 1 Introduction : which deductions

7.1 Complexity theory and the cost of computations 52

Figure 7.1.1: A Turing machine with a semi-infinite tape

2. Σ is the input alphabet, which does not contain the blank symbol ∗ (∗ standing for a blank cell)

3. Γ is the tape alphabet with ∗ (and Σ ⊆ Γ).

4. δ : Q× Γ −→ Q× Γ× {L,R} is the transition function.

5. q0 ∈ Q is the "start" state.

6. q_accept ∈ Q is the "accept" state.

7. q_reject ∈ Q is the "reject" state and q_reject ≠ q_accept.

A configuration Ci is composed of the state of the machine, the position of the machine's head on the tape and the content of the tape. A computation is a succession C0, C1, ..., Ck of such configurations which is in accordance with the transition function. A TM accepts input w if there exists such a sequence in which C0 is the start configuration and Ck is an accepting configuration.

Finally, for a computation γ = C0, C1, ..., Cm, the consumed computational resources can be defined as follows :

• the time used by computation γ is time(γ) = m ;

• the space used by computation γ is space(γ) = max_{0 ≤ i ≤ m} |Ci|, where |Ci| is the size of configuration Ci on the tape, without the blank cells (∗).

The space and time complexity of a TM on an input is the time and space complexity of the longest computation starting with this input2. The time complexity is finite if the computation stops in an "accept" or "reject" state. Finally, the worst-case time or space complexity of a TM for a size n is the maximal time or space complexity of this TM for any input of length n.
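As a small illustration of these definitions, here is a minimal sketch (Python) of a deterministic single-tape TM simulator that counts the time and space consumed by a computation; the toy machine below, which accepts inputs containing at least one "1", is an invented example of mine, not one discussed in the text.

# Minimal single-tape deterministic TM simulator, counting time and space.
# delta maps (state, symbol) to (new state, written symbol, head move L/R).
def run_tm(delta, q0, q_accept, q_reject, tape_input, blank="*"):
    tape = dict(enumerate(tape_input))
    state, head = q0, 0
    time_used, space_used = 0, len(tape_input)
    while state not in (q_accept, q_reject):
        symbol = tape.get(head, blank)
        state, written, move = delta[(state, symbol)]
        tape[head] = written
        head += 1 if move == "R" else -1
        time_used += 1
        # Space: number of non-blank cells of the current configuration.
        space_used = max(space_used, sum(1 for s in tape.values() if s != blank))
    return state == q_accept, time_used, space_used

# Toy machine: accept iff the input contains at least one "1".
delta = {("q0", "0"): ("q0", "0", "R"),
         ("q0", "1"): ("qa", "1", "R"),
         ("q0", "*"): ("qr", "*", "R")}
print(run_tm(delta, "q0", "qa", "qr", "0001"))  # (True, 4, 4)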

In the following chapters, I focus on time complexity. From a computer science point of view, this choice is somewhat arbitrary. In any case, since my goal is to show that predictions or explanations require irreducible computational resources, showing this for one single type of resource does not weaken the argument. Further, time complexity is more intuitive, more studied and easier to connect to philosophical debates in which the variable time is of special interest (e.g. for questions like emergence, determinism or unpredictability).

2 Remember we are using deterministic TMs, so there is just one possible succession of configurations.


7.1.2 Basic notions from computational complexity theory

Here I present several notions that I shall regularly use in the next chapters.

In traditional complexity theory, the unit of analysis is not particular problems (for example the complexity of "2+2") but infinite sets of particular problems (for example all additions between two numbers), and this is what computer scientists call "a problem".

As we shall see later, this is really troublesome since, when one produces a prediction or an explanation, one performs a single computation, and it is the complexity of this single computation one is interested in.

Reductions

Proving that a problem can be solved or is hard is usually difficult. As a consequence, instead of finding a completely original proof for each problem, computer scientists use what they call "reductions". A reduction is the transformation of a problem A into another problem B such that the solutions to the instances of problem B can be used to find easily the solutions to the instances of A. As a consequence, if one proves that problem A is hard and B can be used to solve A, then B must also be hard.

The reductions one usually meets are polynomial-time reductions, that is to say reductions which can be computed in polynomial time.
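As a small illustration, here is a sketch (Python) of a classic polynomial-time reduction, from Independent Set to Vertex Cover; the example is a textbook one of my choosing, not one mentioned in the text. It rests on the fact that a graph with n vertices has an independent set of size at least k if and only if it has a vertex cover of size at most n − k, so transforming an instance of the first problem into an instance of the second is immediate and obviously polynomial.

# Polynomial-time reduction: Independent Set -> Vertex Cover.
# An Independent Set instance is (vertices, edges, k): "is there an independent
# set of size >= k?". The reduced instance asks for a vertex cover of size <= n - k.
def reduce_is_to_vc(vertices, edges, k):
    # The graph itself is unchanged; only the threshold is transformed.
    return vertices, edges, len(vertices) - k

# Example: a triangle has an independent set of size 1
# iff it has a vertex cover of size 2.
vertices = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c"), ("a", "c")]
print(reduce_is_to_vc(vertices, edges, 1))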

Complexity classes and complete problems

Let t : N −→ N be a function. Then DTIME(t(n)) is the class of problems whose worst-case complexity is less than or equal to t(n).

The complexity classes we shall mostly meet are P (problems that can be solved in polynomial time) and NP (problems whose solutions can be verified in polynomial time).

Complete problems for a complexity class C are problems belonging to C such that all problems within C reduce to these problems (by using reductions that are compatible with the definition of the complexity class). A famous class of complete problems is the class of NP-complete problems.

7.2 The complexity of physical problems

It is my purpose to show that the computational complexity of producing predictions or explanations can be irreducible. At this step, two different achievements must be distinguished.

• One can try to show that such or such particular phenomena are irreducibly difficult to predict or explain. To do so, one needs assumptions about which theories are the right ones and which models do represent correctly (and as simply as possible) the studied systems. There is no doubt that attributing a complexity value to a particular physical system is a delicate undertaking.


• One can try to show that, in general, phenomena can be irreducibly hard to predict or explain. To do that, one only needs to show, by relying on computational complexity theory, that making predictions or explanations based on some physical models can be irreducibly hard. And it is enough for this to give a couple of basic examples of common physical models for which predictions or explanations can be hard to produce. As a consequence, whatever the phenomena that are represented by these models, they will be irreducibly hard to predict or explain. And if these models represent nothing, other models, which can also be studied from a computational complexity point of view, will.

It is this second track that I shall follow now.

Two remarks need first to be made. First, it should be noted that resorting to complexity theory to study the complexity of explanations and predictions in physics is conceptually legitimate. As soon as it is granted that physicists use mathematical structures, objects or statements for the purpose of representing, predicting or explaining, it is conceptually legitimate to use computational complexity theory to measure how difficult it is to answer questions about these mathematical items, since this is precisely what this theory is about. I shall in section 7.2.1 give two examples in which the computational complexity of exploring the behaviour described by the physical model is well-established.

Second, since predictions and explanations rely on particular models and theories, the complexity measures that can be obtained obviously cannot but be relative to the particular representations of physical systems (models and backing theories) that are used in each case, and therefore to the hypotheses that were made to build these particular representations. Note that this is completely in agreement with my claim in part I that explanations are looked for in the setting provided by a theory and are therefore relative to these theories. In spite of this, I shall try in sections 7.2.2 and 7.2.3 to show how, with weak hypotheses, one can provide arguments to the effect that complexity measures obtained in the setting of these models do reflect objectively and absolutely the complexity of predicting or explaining the corresponding phenomena. In particular, I shall argue that no miraculous new modelization can be hoped for that would significantly simplify the production of predictions or explanations of these phenomena.

7.2.1 The well-defined complexity of some physical models

My goal is to establish that phenomena can be said to have an intrinsic opacity. For this, it is necessary to show that in some cases, phenomena can be intrinsically difficult to predict or explain. Phenomena that can be predicted or explained easily can be said to have a null opacity and to be intrinsically easy: there is no trouble with this, quite the contrary. So all I need to do is to provide clear cases where the hardness of the scientific task is obvious.

7.2.1.1 The computational complexity of "direct simulations" in fluid dynamics

I start with an example which, though not as clear as one may wish from a philosophical point of view, is of real importance to establish that finding the complexity of a model is not a mathematical fad, which is extraneous to what physicists themselves do. The trouble, when exploring the content of physical models, is that what needs to be done


seldom falls from the start into the realm of what can be solved with algorithms. As a consequence, adjustments may be needed to change the original physical problem into a problem that can be solved by algorithmic procedures. My aim with the following example is to show that establishing the computational complexity of a model is something that physicists themselves do care about (because it determines the feasibility of what they try to do) and that they rely on physical arguments in order to establish which computations provide bona fide physical information and should for this reason be studied from an algorithmic point of view.

The example that I give is the exploration of the behavior of fluids obeying the Navier-Stokes equations. Since this exploration is usually very costly, physicists, depending on what phenomena they study, usually try their best to build models that are as simple as possible to study. However, when they want to study turbulent flows at all scales, no such strategy is possible and they need to do "direct simulations" of the fluid, that is to say simulations that include all the scales that can influence turbulent activity. So the question "Which scales must be taken into account?" is a crucial one here. This question, whose answer can be directly cashed into a computational complexity value for simulations, is answered by using physical arguments. To sum things up (see [55, 9.1.2] for further details) :

• The smallest scale (the Kolmogorov microscale) is chosen on the basis of the dissipation scale, at which, because of the prevalence of viscosity, fluid motions are homogeneous ; the quantities involved are the viscosity and the energy dissipation rate, and the Kolmogorov scale can be determined by using dimensional analysis.

• The largest scale (the "integral scale") is chosen so that all eddies contributing to the energy exchanges are taken into account (which can be done by considering the energy spectrum of the fluid).

• The smallest and largest temporal scales are chosen on the basis of the periods of the smallest and biggest eddies.

With these physical arguments, it is possible to choose the grid for the simulation of the fluid and to determine for how long the simulation must be run if one is to get reliable information about all the typical events that do happen in the fluid. All included, the computational cost of simulations grows like Re^3 ; in 2000, it was for this reason difficult to simulate fluids above Re = 1500.
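The Re^3 growth can be recovered by a rough back-of-the-envelope computation; the sketch below (Python) assumes the standard Kolmogorov scaling, namely that the smallest scale to be resolved behaves like Re^(-3/4) times the integral scale and that the time step is limited by the smallest resolved scale, so the figures it prints are orders of magnitude of mine, not values taken from [55].

# Rough order-of-magnitude sketch of the cost of a direct simulation,
# assuming Kolmogorov scaling: ~ Re^(3/4) grid points per direction,
# hence ~ Re^(9/4) points in 3D, and ~ Re^(3/4) time steps, i.e. a cost ~ Re^3.
def dns_cost(reynolds):
    points_per_direction = reynolds ** 0.75
    grid_points = points_per_direction ** 3   # ~ Re^(9/4)
    time_steps = points_per_direction         # ~ Re^(3/4)
    return grid_points * time_steps           # ~ Re^3

for re in (1500, 15000, 150000):
    print(re, f"{dns_cost(re):.2e}")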

Everything is not philosophically flawless in this complexity estimation. The original Navier-Stokes equations are differential equations and a discretization scheme must be used. Justifying why the original equations can legitimately be used for studying a set of turbulent phenomena and how they should be transformed for a study amenable to a computational treatment is a careful process. Nonetheless, as the example shows, physical arguments can be used in order to get meaningful complexity measures. There is no doubt that such studies require close scrutiny and subtle analyses. Still, there is no reason to believe that such a physico-computational analysis is in principle doomed to failure and cannot, at least in certain cases, provide robust results about the hardness of predicting and explaining some phenomena. And this is precisely what scientists try to do in fluid dynamics, where measuring what amount of computational resources is needed in each case is crucial.


Figure 7.2.1: 3D grid for a fluid dynamics simulation

7.2.1.2 The computational complexity of some discrete models in statistical physics

In the perspective of not misrepresenting scientific activity, it was important to show how making a physical model amenable to computational treatment and correctly measuring the complexity of this treatment on the basis of physical arguments is a primary concern for physicists, especially in a field (fluid dynamics) where simulations are so much developed and computational methods so widespread.

I now turn to more consensual cases, where exactly computable answers can from the start be brought to the questions about the behaviour described by a physical model, and I present briefly some complexity results about such models.

Figure 7.2.2: Spin glass and interactions between neighbouring spins.

In this respect, an appropriate and deeply studied example is the Ising model, which is a


central model in statistical physics and for which precise complexity results were found [4, 35].

• Computing the partition function and the ground state of a spin glass with no magnetic field can be done in polynomial time in the two-dimensional case. (Analytic solutions were found by Onsager for the infinite two-dimensional lattice.)

• The same problem is NP-hard3 in the 3D case or if a magnetic field is added.

Results have been found about the complexity of other problems in statistical physics4

(e.g. in the case of counting self-avoiding walks, percolation theory, protein folding, growth problems or discrete fluids [40]). In conclusion, since these problems belong to the ones that are used again and again when studying physical phenomena, it is legitimate to conclude that, at least in some cases, making predictions or explanations about the corresponding phenomena will really require solving one of these problems, whose complexity is well-defined within computational complexity theory.
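To make the computational task concrete, here is a minimal brute-force sketch (Python; the tiny fully connected system with random ±1 couplings is an illustrative assumption of mine, not the lattice models of [4, 35]) that computes the partition function and a ground state by enumerating all 2^n spin configurations — exactly the kind of exhaustive cost that, according to the hardness results above, cannot in general be avoided in the 3D case.

# Brute-force computation of the partition function Z and of a ground state
# of a small Ising-type system with random couplings and no magnetic field.
# The enumeration visits all 2^n spin configurations, so the cost grows like 2^n.
from itertools import product
from math import exp
import random

n = 10
random.seed(0)
J = {(i, j): random.choice([-1.0, 1.0]) for i in range(n) for j in range(i + 1, n)}

def energy(spins):
    return -sum(J[i, j] * spins[i] * spins[j] for (i, j) in J)

beta = 1.0
Z = 0.0
ground_state, ground_energy = None, float("inf")
for spins in product([-1, 1], repeat=n):
    e = energy(spins)
    Z += exp(-beta * e)
    if e < ground_energy:
        ground_state, ground_energy = spins, e

print("Z =", Z, "ground-state energy =", ground_energy)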

7.2.2 Relativity to models : is it troublesome ?

It is important to note that, at this step, nothing has been shown yet about the difficulty of producing predictions or explanations of particular phenomena. However, as illustrated above, applying computational complexity theory to physical problems can be done. What remains to be determined is therefore what can be inferred from computational complexity theory about the production of explanations and predictions for particular phenomena.

Still, it is already possible to answer objections about the scope of the results that can be hoped for.

It can be objected that the computational complexity measures that one can get for the study of some phenomena are necessarily relative to a theory and to a model, which can include approximations, idealizations, and additional hypotheses. This is actually a state of affairs that I am completely happy with. Computational complexity measures a distance between the statement to be explained or predicted and some statements that are taken as basic, so it is true that the complexity is relative to whatever is taken as the starting point, be it a theory or a model. Further, the possibility of taking as starting points models that somewhat depart from what our fundamental theories exactly say should clearly be taken as an advantage of our science. In the received view about science, statements are supposed to be derived from the axioms of the theory or the statements that are taken as basic. But if science really proceeded like this, few statements could in practice be derived from the most fundamental laws describing systems and almost everything should be considered as complex.

It is therefore legitimate for the good of science to devise partly autonomous models and thereby to build a "map" which indicates computational distances between points (= models and the statements to be explained or predicted) that are not too far apart. If, in addition, it is possible to indicate computational distances to points that are much farther away, this is all the better. To take a comparison, it would be silly to always indicate the distance

3 An NP-hard problem is at least as difficult as NP-complete problems.
4 See [71, 43] for a review.


to Paris, and only this distance, wherever you are in the country. It would be totally useless for people with bikes, interested in short distances to the cities where their usual business is. Indicating distances to Paris can of course be helpful but, to determine what is possible in practice, finding the computational distances to models is even more useful. The need to use models, with additional assumptions built into them, as starting points, instead of deriving everything exactly from basic statements, should be seen as a sign of how complex the world is and is at the same time an indication of how we manage to develop science in spite of this complexity.

7.2.3 Modeling can't do miracles

The relativity of complexity to the models or theories that are taken as starting points should not be overestimated either. In one sense, this relativity is trivial and cannot be avoided. Any computational distance is relative to the statements that are taken as basic, because if anything could be taken as a starting point, the simplest option would be to choose the statement to be derived itself as the starting point.

At the same time, this relativity should not be interpreted as implying that, by appropriately modeling a physical system, predictions and explanations that were difficult to produce can always turn out to be easy.

▪ On the one hand, it is true that sometimes, by making appropriate approximations or idealizations, predictions or explanations can become tractable. Nevertheless, the statements that will be derived by this means will in general be statements describing the target system approximately or in an idealized way, so these statements will possibly be different from the original statements to be derived in the non-approximate or non-idealized case. This is absolutely not contradictory with what is said by complexity theory, which deals only with the cost of exact solutions. Also, computing the exact behaviour of a system can be impossible whereas computing its average behaviour can be easy — this is what statistical physics is about. All these different procedures are good strategies, which enable scientists to sidestep intractability and to learn information about the systems they study with their finite means. Finally, significant increases in efficiency can also be gained by making approximations in the solving of the problem (versus in the initial representation of the system). Some branches of computational complexity theory are precisely about the cost of finding approximate and tractable solutions, which are known to be possibly slightly distinct from the exact solutions. This means that some problems can be hard to solve exactly but easy to solve approximately ; but it can also be the case that some problems are hard to solve even in the approximate case. When studying a chaotic trajectory with sensitivity to initial conditions, tolerating approximate results in the predictions does not really help because trajectories diverge exponentially.

▪ On the other hand, the results given by complexity theory imply that, from a computational point of view, modeling cannot be an omnipotent and all-solving activity. Idealizations, approximations and the like cannot change a hard problem (in the computer scientist's sense) into an easy one.

Suppose for the sake of the argument that one is interested in a set of physical situations which are appropriately modeled by Ising spin systems of different sizes and that studying


the behaviour of these physical situations requires computing the partition function or the ground state of these Ising spin systems. In other words, for the scientific goal considered here, it is required to solve what is, as we saw above, an NP-hard problem. It may be possible that, by using approximation or idealization procedures and "remodeling" the target systems, such a goal can be partially fulfilled. Nevertheless, this goal cannot be completely fulfilled by such a remodeling procedure. The reason why this is so is quite simple. If one could find a new family of tractable models from which predictions and explanations equivalent to the original ones could be made easily, then by translating the original Ising spin systems into the new models (that is to say, by doing what computer scientists call a "reduction"), one would be able to solve quickly what was originally an NP-hard problem, which, unless P = NP, is impossible.

Actually, the same type of argument can be used to claim that a change of theory cannot make tractable the study of systems that was previously intractable — if some conditions about the possibility of translating the statements of the original theory into the new theory are fulfilled.

The sketchy "no-miracle" argument I have just given is in a sense incomplete becausewe have not studied yet what can be inferred from complexity results, like the NP-completeness of a problem. Nevertheless it indicates how, by relying on complexitytheory, it is possible to build arguments to the effect that modeling and theory changecannot make sets of predictions or explanations easy if they have been proven to beuntractable. It is in this sense that the complexity of the production of explanations andpredictions can be hoped to characterize physical systems and phenomena themselvesand not to be just relative to some theories or models. Since establisging under whichassumptions it would be so would be long and difficult, I shall not try to do it here —a preliminary step is anyway to determine what can be inferred from complexity theoryresults about the hardness of particular explanations or predictions, which is what thenext chapters are about.

7.3 Appendix

7.3.1 Why study time complexity ?

Computational resources other than time can be studied. In this section, I describe why, if one is interested in polynomial bounds and what is possible in practice, studying time can be seen as more appropriate than studying space (since time complexity classes are clearly included within space complexity classes and enable one to make finer distinctions).

It must be noted however that, when discussing space, computer scientists are usually thriftier than when discussing time. Some computations can be impossible in practice because they require too much space. The "stack overflow" error is, on some computers, a typical indication that too much space has been used. The stack is a specific amount of memory assigned to a program. A stack overflow means that this stack is full and you cannot insert any more data into it. In short, when doing the computation, you have run out of space.


7.3.2 Invariance and robustness results for time complexity

In this section, I present some invariance and robustness results that show that using Turing machines to study complexity is not a problem since the complexity of problems is the same with other types of computer architectures, such as RAM machines (which are more similar to our computers) — provided one sticks to coarse labels like "polynomial" or "non-polynomial", which is all I actually need in order to show that some explanations can be intrinsically hard to produce.


Chapter 8

Problem complexity, sub-problem complexity and instance hardness

The theory of computational complexity is specifically devoted to studying the amount of computational resources that is required for performing computational tasks. I try to determine in the next chapters whether, by resorting to this theory, there is a way to measure the quantity of resources that is intrinsically needed for solving a particular problem — namely solving such or such equation, inverting such or such matrix, proving such or such formula to be a tautology, etc. — and therefore for producing explanations or making predictions.

The problem is that, in computational complexity theory, complexity measures are holistic in the sense that they do not characterize particular problems (e.g. multiplying 127 and 649) but infinite sets of particular problems (e.g. multiplying numbers x and y in general) — these infinite sets of particular problems being what computer scientists call "problems"1. In the perspective of studying how difficult it is to produce explanations and predictions of phenomena, this is clearly a drawback. Actually, one never solves all the instances of a problem ; thus, measuring the complexity of a particular instance with a complexity measure characterizing all the instances of a problem may seem misguided. Further, I took great pains in part I to establish what precise task needs to be done, and what tasks, usually much more costly, need not, in order to provide good explanations of phenomena. So the legitimate worry is that, by measuring the complexity of a particular task a by the complexity of a general problem A, one may overestimate the hardness of a.

As a consequence, the following questions need now to be answered :

• Is it mandatory to use a theory dealing with the complexity of problems to discuss the complexity of particular instances of these problems ?

• If one uses a theory dealing with the complexity of problems, to what extent is it legitimate to assume that the complexity of problems mirrors the complexity of their instances ?

Section 8.1 is devoted to studying what I call "the logic of complexity measures", namely to studying from a conceptual point of view what exactly is said when a problem is said to have complexity K. The conclusion of this section is that, strictly speaking, very

1From now on, I shall use the term "problem" with this meaning.


little can be inferred from the complexity of a problem about the hardness of particular instances of this problem, which is, on the face of it, somewhat paradoxical : where does the complexity of a problem originate, if not in the hardness of its instances ? In section 8.2, I present results about the complexity of instances of notoriously hard problems, which do confirm the diagnosis made in the first section.

8.1 From problem complexity to instance hardness : a path paved with pitfalls

8.1.1 Problem "holistic" complexity : a needed detourWhy studying complexity by focusing on the complexity of problems (id est, types of

questions) and not on the complexity of their instances (id est, particular questions) ?A first simple reason is that, for a same type of mathematical objects, complexity does

depend on the type of question that is asked, and not simply on the complexity of the object (whatever this may be). For example, finding whether there is an Eulerian circuit in a graph, that is to say a circuit using every edge exactly once, is easy (there is one exactly when the graph is connected and every vertex has an even degree), whereas finding whether there is a Hamiltonian circuit (that is to say a circuit that visits every vertex exactly once) is NP-complete.
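A minimal sketch (Python; the small cycle graph below is an invented example) makes the contrast visible: the Eulerian condition is checked by a simple degree count, whereas the naive test for a Hamiltonian circuit examines up to n! orderings of the vertices.

# Checking the Eulerian-circuit condition is a simple degree count, while the
# naive Hamiltonian-circuit test tries up to n! orderings of the vertices.
from itertools import permutations

def has_eulerian_circuit(vertices, edges):
    # (Connectedness is assumed here for brevity.)
    degree = {v: 0 for v in vertices}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return all(d % 2 == 0 for d in degree.values())

def has_hamiltonian_circuit(vertices, edges):
    edge_set = {frozenset(e) for e in edges}
    for order in permutations(vertices):
        closed = list(order) + [order[0]]
        if all(frozenset((closed[i], closed[i + 1])) in edge_set
               for i in range(len(order))):
            return True
    return False

vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
print(has_eulerian_circuit(vertices, edges), has_hamiltonian_circuit(vertices, edges))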

At the same time, the precise number of steps required when finding an answer to a particular question is not significant since this very number can change depending on factors like the machine that is used, the coding, etc. Further, this number of steps can be decreased by an arbitrary constant factor because of the linear speed-up theorem (see p.48). Something that is much more robust is the growth rate of the computational cost with the size of instances, because it characterizes algorithms and is not altered by what the linear speed-up theorem says. What is more, this growth rate is polynomially equivalent for the usual types of computers that can be used to implement an algorithm (this means that using different computers requires in the worst case a polynomial increase in computational cost). So choosing to characterize complexity by means of growth rates seems to pave the way for robust characterizations of complexity, which is not possible otherwise2.

The consequence of this choice is that complexity is first and foremost a holistic notion, which describes an infinite set of questions. Typically, the use of predicates like "requiring a polynomial cost" or "requiring an exponential cost", applied to instances, results from a projection of a property of problems onto their instances : literally, saying that the computational cost required by the solution of an instance is polynomial is meaningless, because any finite computational cost can be seen as the value of a logarithmic, polynomial or exponential function.

Since complexity turns out to be a holistic notion, one finally needs to answer the following question : is it possible to extrapolate from the complexity of problems to the actual complexity of the instances of these problems, and how much information can be gained from such extrapolations ?

2 Actually, instance complexity makes this possible, but it is a far less intuitive notion and it is natural to resort to it once it has been shown that the standard definitions of complexity fail to characterize properly the complexity of instances.


8.1.2 Troubles with "holistic" measures and complexity paralogismsA complexity measure of problem A brings information about the performances of

algorithms to solve all instances of A. Worst-case complexity is about the highest numberof steps that is required to solve the instances for each size n. If a problem is NP-complete, then (if P 6= NP), it is not possible to find a polynomial worst-case algorithmto solve its instances. It is difficult to see how this could not mean that an infinite numberof instances all require individually a super-polynomial computational steps to be solved.As surprising as this may be, strictly speaking, this inference is a paralogism. The nextsubsections are devoted to show why it is not completely legitimate to draw from thecomplexity of problems inferences about the complexity of instances.

First, I present in this section problems associated with holistic measures.

▪ What type of holistic measures are complexity measures ?

Properties characterizing a class or an infinite set can be of different types.

• Generic properties. Some properties can be shared by all the individuals of the class. For example, humankind is characterized by mortality and all humans are mortal.

• Properties impossible to measure. A property p can be shared by some individuals but be impossible to measure directly on these individuals, as in the case of propensities. In this case, one can create an ersatz measure f on a class of similar individuals in order to get information about p, as when we measure frequencies to determine the quantitative value of a propensity.

• Collective properties. Some properties can be apparently individual but in fact intrinsically collective. Take for example the notion of hardness of a rally. One could think that a rally is hard if its different parts are hard, but this is not necessarily the case. If a driver can use only one car and if the parts of the rally are extremely different (driving on snow, in the mud, on asphalt, etc.), then the rally can be hard because no type of car can perform well on all these different types of road. In this case, hardness is a purely collective effect : because of the choice of the car that is originally made, performances on some parts of the rally will be bad because the car is ill-adapted to the type of road on these particular parts, even if there do exist types of cars which could run fast on these roads.

• Etc.

For all these types, the inferences that can be drawn about the properties of individuals from the holistic property characterizing all these instances are different, so it is crucial to determine whether, for example, complexity measures are in fact collective properties.

▪ Reference class problem

It is well-known that, for probabilities, changing the reference class may imply changes in the value of the probability measure. The probability of getting pregnant is not the same within the class of human beings and within the class of men. Consequently, the choice of the reference class must be made very carefully if one is to determine whether an individual may get pregnant.


So the question is : to what extent does the same problem arise for complexity measures ?

▪ Choice of the complexity measure.

There are different types of holistic complexity measures : worst-case, average, generic, mean, etc. What measure is the most appropriate to get some information about the hardness of particular instances ?

8.1.3 The logic of problem complexity measures

What can be inferred exactly from worst-case complexity measures ? Saying that an algorithm for a problem A has O(f(n)) complexity gives a piece of information that can be projected onto all instances, namely that it is possible to solve each instance in time at worst f(n). But it may be the case that some or all instances can be solved much quicker, in particular if one finds a much more efficient algorithm. So algorithmics tells us what we can do for certain.

Complexity theory goes further. It tells us that there is a hierarchy of complexity classes that cannot completely collapse. It has in particular been shown that L ⊆ P ⊆ NP ⊆ PSPACE = NPSPACE ⊆ EXPTIME3,4, that L ≠ PSPACE and that P ≠ EXPTIME5. So if we take a complete problem for a class that cannot collapse onto lower classes, we know that the complexity measure is objective and is not overestimated, which is the best certificate we can get for the existence of hard instances in this framework. But what guarantee does this bring exactly ? For such a complete problem, we can be certain that, for any Turing machine solving the problem, if one uses this machine (and only this machine) :

• if one solves all instances of size n, we can be sure that, asymptotically, the worst-case complexity will be reached;

• if, for all sizes, one repeatedly solves instances, then we are bound to meet instances that will be hard to solve, and the worst-case complexity is bound to be asymptotically reached.

It should in particular be noted that:

• the use of asymptotics is crucial, because if one focuses on instances of size smaller than n, then it is possible to find slowly growing functions with big constants which will be an upper bound for the complexity of all these instances;

• all instances must be taken into account (except possibly a finite number of them), because if one removes an infinite number of them, one may remove all the infinitely many instances that are hard for an algorithm;

• the same algorithm must be used for all instances; but perhaps, if one uses different algorithms, the hard instances will be different.

3. L is the set of languages that can be decided in logarithmic space.
4. See [54, §7.3, pp. 148, 150].
5. See [54, §7.3, p. 145].



In conclusion, we have no guarantee that there are instances that are hard for all algorithms, and it remains possible that problem complexity is a collective effect. However, if one accepts that all the instances have a "collective destiny" (that is to say, must be treated with one single algorithm), then, whatever algorithm is used, there are chances that the worst-case complexity may be reached. But why should we accept that instances have a collective destiny, and why should we use one single algorithm for all of them? Is it not legitimate to use more specific tools for more specific tasks if this improves efficiency?

It should be noted that the situation is worse than in the case of probabilities because, in the latter case, we may have theoretical evidence (e.g. from quantum mechanics) that, for a physical system, the property "having an objective probability of being in such or such a state" is really possessed by some physical systems, and the whole problem is then to measure this probability. Here, we do not know how to measure the "objective hardness" of instances, but we are not even sure that such a property really exists.

In conclusion, complexity theory does bring more information than algorithmics, but this is not the information we were looking for. Not only may there be easy instances within hard problems, but we are still not sure that hard problems are hard because they have intrinsically hard instances.

Since this result is somewhat surprising, it is worth immediately giving an argument to convince us that we are not daydreaming and that any single instance of any hard problem can be solved quickly. Take any problem and pick any instance i in it. Choose the best algorithm A for this problem (which may actually do very badly). Now build an algorithm A′ which does the following operations: first, A′ checks whether the instance to be solved is i. If it is, A′ writes out the solution of i. If it is not, A′ does exactly the same things as A. This algorithm is just as efficient as A and, in addition, it solves i extremely quickly. An algorithm like A′ is called a "patched algorithm" and already contains in store the solution for the instance i. For the time being, there is no reason to worry about such odd algorithms: if the instances of hard problems were easy only in this sense, one would probably find a way of excluding such algorithms. The existence of such "patched algorithms" just confirms that we did not make any mistake above in saying that the complexity of a problem does not imply that its instances are hard and always require a long time to be solved, which is a good incentive to check whether the complexity of problems is a good hint about the hardness of instances in more ordinary cases.
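The construction can be made concrete in a few lines. The following Python sketch is only an illustration of the patched-algorithm idea described above; the function names are mine, not taken from the text.

```python
# A minimal sketch of the "patched algorithm" construction described above.
# `best_algorithm` stands for whatever algorithm A one already has for the problem;
# all names are illustrative assumptions.

def make_patched_algorithm(best_algorithm, special_instance, stored_solution):
    """Return an algorithm A' that answers `special_instance` immediately
    and behaves exactly like A on every other instance."""
    def patched(instance):
        if instance == special_instance:
            return stored_solution        # the solution is stored in advance
        return best_algorithm(instance)   # otherwise, defer to A
    return patched
```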

8.1.4 Easy sub-problems within NP-complete problems

The previous analyses show that even when a problem is complete for a complexity class, there is no certainty that its instances are hard. In this paragraph, I give specific examples illustrating that this is indeed what happens and that hard problems may have infinitely many easy instances.

There is of course a sense in which it is easy to show that NP-complete problems have easy subproblems, because all problems belonging to the class NP, and in particular problems belonging to the class P, can be "translated" into instances of an NP-complete problem. Also, as noted by Papadimitriou [54, p.183], any problem, if generalized enough, will be a particular sub-problem of an NP-complete problem or worse. From this simple evidence, it is obvious that any hard problem does have some "solvability pockets" here and there. One may nevertheless expect the frontiers of these pockets to be clearly defined and easily identifiable.

In this section, I report some recent results which suggest that the solvability map is probably much more intricate than that. These results are about k-SAT problems. An instance of a k-SAT problem is a boolean expression in conjunctive normal form with k literals in each clause, like

(x_{1,1} ∨ ¬x_{1,2} ∨ ... ∨ x_{1,k}) ∧ (¬x_{2,1} ∨ x_{2,2} ∨ ... ∨ x_{2,k}) ∧ ... ∧ (x_{n,1} ∨ x_{n,2} ∨ ... ∨ ¬x_{n,k}).

k-SAT problems play a special role in complexity theory and are good representatives of other problems. SAT was the first problem shown to be NP-complete, by Cook, and these problems are widely used for benchmarking. Finally, SAT has proved to be central in a wide variety of domains, among which is statistical physics. It is easy to see why: the different clauses can be interpreted as constraints, and statistical physics does deal with systems for which one tries to optimize a set of constraints (e.g. in spin glasses) in order to get, for example, an energy as low as possible.
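To fix ideas, here is a minimal Python sketch, not taken from the text, of how a small k-SAT instance can be represented and checked by brute force. The signed-integer encoding of literals is an assumption of mine.

```python
# A k-SAT instance as a list of clauses; each clause is a tuple of signed integers
# (+i stands for x_i, -i for ¬x_i). Brute force is exponential in the number of variables.

from itertools import product

# (x1 ∨ ¬x2 ∨ x3) ∧ (¬x1 ∨ x2 ∨ x3) ∧ (¬x1 ∨ ¬x2 ∨ ¬x3)
instance = [(1, -2, 3), (-1, 2, 3), (-1, -2, -3)]

def satisfiable(clauses, n_vars):
    """Try all 2^n assignments; an assignment maps variable i to True/False."""
    for values in product([False, True], repeat=n_vars):
        assignment = {i + 1: values[i] for i in range(n_vars)}
        if all(any(assignment[abs(l)] == (l > 0) for l in clause) for clause in clauses):
            return True
    return False

print(satisfiable(instance, 3))  # True for this small example
```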

Starting from the fact that 2-SAT is in P and 3-SAT is NP-complete, computer scientists have tried to explore the frontier between the solvability and unsolvability regions. New local pockets have been found by putting more subtle restrictions on instances of 3-SAT (for example by requiring that only some variables be used four times). For example, it was shown by Berman et alii [8] that instances of (3, 4k, n)-SAT (that is to say, instances of 3-SAT having k literals appearing four times and all other literals appearing three times) can be solved in 2^{k/3} n^{k/3} poly(n) time, which, for fixed k, is a polynomial bound. As illustrated by this case study, the frontier between solvability and unsolvability is not easily drawn, and the solvability pockets may be scattered, numerous and difficult to identify. This shows that what was suggested by conceptual analysis in the previous section is confirmed in practice: being an instance of a hard problem is in no way a sufficient condition for being hard to solve. However, there is still the hope that these pockets of polynomial solvability are happy exceptions in an ocean of hard instances. If this is so, NP-completeness is still a quite reliable certificate for the hardness of instances. This is what we shall examine in section 8.2.

8.1.5 Which complexity measure: average, generic or worst-case complexity?

In this section, I give arguments in favour of a focus on worst-case complexity, at least if one is interested in determining the complexity of the prediction or explanation of particular phenomena.

Complexity measures other than worst-case complexity have been studied by computer scientists:

• Average-case complexity was for example developed by Levin [38]; see also [29, 30, 70]. Such a measure helps one to characterize more accurately problems such as 3-COL (see footnote 6), in which worst-case complexity is high whereas instances are easy on average.

6. In this problem, one tries to color a graph using three colors with the constraint that no two adjacent vertices should have the same color.



• Generic complexity was also developed recently [27]. The main idea underlying this measure is that one should ignore sets of instances that are not met in practice because they have an insignificant measure; one should instead focus on the complexity of generic subsets (see footnote 7). Sets which are difficult on average can turn out to be easy if one uses this measure [48].

These measures are undoubtedly helpful for characterizing science globally. Scientists who need to produce many explanations or predictions of the same type of phenomena (e.g. repeatedly predicting the weather) may consider that a set of predictions or explanations is within their reach when, on average, these predictions or explanations require a reasonable amount of computational resources, even if this amount can sometimes be really high.

Yet, if one is interested in getting exact information about the complexity of a particular instance, these measures are not appropriate.

i. Worst-case complexity is a totally reliable measure as long as one is content with upper bounds. This is not true for generic and average complexity: whatever the average or generic complexity measure, a particular instance can be difficult to solve. In short, such measures do not give any reliable information about the complexity of instances taken individually. And, as noted by Papadimitriou [54, p.7], if one wishes to solve just one particular instance, knowing that we have stumbled upon a statistical exception is of little consolation.

ii. Average complexity requires a particular distribution (normalized to one) to be chosen. This distribution describes with which probability the instances are expected to occur. The problem is that the value of the average complexity happens to be sensitive to the distribution that is chosen: depending on the distribution, a problem can for example turn out to be polynomial on average or not, as the small sketch following this list illustrates. It should also be added that the input distribution is rarely known.
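The following Python sketch is my own toy illustration of this sensitivity: the same hypothetical cost function yields wildly different average costs under two assumed input distributions. Neither the cost function nor the distributions come from the text.

```python
# Average cost of the same hypothetical solver under two different distributions
# over instance sizes. Cost function and distributions are toy assumptions.

def average_cost(cost, distribution):
    """Expected cost; the distribution is a dict {instance_size: probability}."""
    return sum(prob * cost(size) for size, prob in distribution.items())

cost = lambda n: 2 ** n if n > 20 else n ** 2   # rare large sizes are exponential

uniform = {n: 1 / 30 for n in range(1, 31)}
skewed  = {n: (0.5 ** n) / (1 - 0.5 ** 30) for n in range(1, 31)}  # small sizes dominate

print(average_cost(cost, uniform))   # huge: dominated by the few exponential sizes
print(average_cost(cost, skewed))    # small: hard sizes are almost never drawn
```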

In conclusion, such measures strengthen the holistic character of the complexity measure, which is precisely what I am trying to avoid in these chapters.

8.2 NP-complete but almost always easy: where are the really hard problems?

As we saw in the last paragraphs, the existence of pockets of solvability in NP-complete problems shows that belonging to an NP-complete problem is not a sufficient condition for being a hard instance. I shall now present much more surprising results, which show that the situation is even worse, since hard instances happen to be by far in the minority in NP-complete problems like SAT.

7. A subset B of a set A is generic when the fraction of elements belonging to B is asymptotically equal to 1.



8.2.1 Hard instances in the phase transition region

In the 1980s, scientists had already noticed that graph coloring, though NP-complete, has many easy instances. In the 1990s, methods from statistical physics were successfully applied to the study of NP-complete problems. This made it possible to show that hard instances are not pathological cases randomly distributed, and to understand better why they are sometimes so hard to solve, and therefore to ground the belief that problem complexity stems from the hardness of some instances, which itself originates in some properties of these instances; in short, that hardness is not a purely collective property. The following paragraphs are devoted to presenting these results briefly.

Cheeseman et alii [16] first noticed that NP-complete problems can be described by order parameters and exhibit phase transitions (for example, phase transitions of the probability p of having solutions). A frequently used order parameter is the average connectivity of graphs. In the Hamiltonian problem, p is the probability of finding a Hamiltonian circuit in a random graph. The higher the connectivity, the bigger the size of connected subgraphs. When the critical connectivity is reached, this size quickly increases, and Hamiltonian circuits can be found (see footnote 8).

By describing how backtracking algorithms (see footnote 9), which are the best algorithms for these problems, work, one can understand why the hard instances lie at the phase transition. A backtracking algorithm explores the tree of possible solutions and, when it finds a dead end, it backs up to the last choice that has been made and takes a different path. At the transition, there exist quasi-solutions in the graph. As a consequence, the backtracking algorithm must make a very deep search to check whether a possible solution works, and it therefore needs to explore a significant fraction of the tree. Since this tree has exponential size, the algorithm has an exponential complexity. Far from the transition, either a solution is quickly found because the problem is underconstrained, or all the branches are quickly pruned by the backtracking algorithm because the problem is overconstrained and constraints impossible to satisfy are quickly met.
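As an illustration of the kind of search just described, here is a minimal backtracking sketch in Python. It is a simplified, DPLL-style procedure of my own (without the unit propagation of Davis and Putnam's original algorithm), using the signed-integer clause encoding of the earlier SAT sketch.

```python
# Depth-first search over partial assignments; backs up on dead ends.

def backtrack(clauses, assignment, variables):
    # Dead end: some clause has all its literals falsified.
    for clause in clauses:
        if all(abs(l) in assignment and assignment[abs(l)] != (l > 0) for l in clause):
            return None
    if len(assignment) == len(variables):
        return assignment                       # every clause is satisfied
    var = next(v for v in variables if v not in assignment)
    for value in (True, False):                 # try one branch, backtrack if it fails
        result = backtrack(clauses, {**assignment, var: value}, variables)
        if result is not None:
            return result
    return None                                 # both branches failed: back up

def solve(clauses):
    variables = sorted({abs(l) for clause in clauses for l in clause})
    return backtrack(clauses, {}, variables)
```

Near the transition, where many quasi-solutions exist, the dead ends appear only deep in the tree, which is why such a procedure ends up exploring an exponential number of branches.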

From these results, Cheeseman et alii conjectured that such phase transitions are a specific property of NP-complete problems. The conjecture turned out to be wrong, because phase transitions were also found in problems belonging to P. Still, as we shall see now, the path that was opened turned out to be a fruitful one.

8.2.2 First-order phase transitions and hard problems: results about (2 + p)-SAT

A more thorough and systematic analysis of the relationships between problem complexity, phase transitions and instance hardness was made by Monasson et alii [46] for random 3-SAT instances.

8. The link with statistical physics and percolation theory is particularly obvious here.
9. Invented by Martin Davis and Hilary Putnam [21, 20]!

Monasson et alii interpolate between 2-SAT (which is in P) and 3-SAT (which is NP-complete) by studying instances with a proportion p of clauses with three literals. The instances are built by randomly picking literals from a pool of n literals, with the only constraint that the ratio m/n between the number of clauses and the number of literals is kept fixed. By choosing random instances, one guards against studying over-specific instances having additional properties in virtue of which the instances may be less difficult. Also, since one of the goals is to determine where most of the hard instances lie, they focus on the median complexity of instances and not on worst-case complexity.
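The generation procedure can be sketched in a few lines of Python. This is my reconstruction of the sampling scheme just described, not the authors' code, and the parameter names are assumptions.

```python
# Random (2+p)-SAT instances: a fraction p of clauses have three literals, the rest two,
# with the ratio alpha = m/n of clauses to variables kept fixed.

import random

def random_clause(n_vars, k):
    """k distinct variables, each negated with probability 1/2."""
    variables = random.sample(range(1, n_vars + 1), k)
    return tuple(v if random.random() < 0.5 else -v for v in variables)

def random_2p_sat(n_vars, alpha, p):
    m = int(alpha * n_vars)                     # number of clauses
    return [random_clause(n_vars, 3 if random.random() < p else 2)
            for _ in range(m)]

# example call: random_2p_sat(100, alpha=2.0, p=0.6)
```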

The results they get are plotted in figure 8.2.1. As expected, a phase transition can be seen at a critical value of α, for which the problem is critically constrained.

Figure 8.2.1: Fraction of satisfiable (2 + p)-SAT formulas against the ratio α = m/n, where m is the number of clauses and n the number of literals. (Taken from [46].)

In a second step, Monasson et alii study the hardness of instances at the phase transition. For that, they choose instances at the value α_0.5, for which half of the instances have solutions, which corresponds to the point where uncertainty prevails as to whether instances can be satisfied and the backtracking algorithm has to go deep into the possible solution tree. The results are plotted in figure 8.2.2. The cost is clearly exponential for p = 0.6.

Finally, Monasson et alii use replica methods to study the solution (and quasi-solution) landscape for some particular instances. In particular, they study the proportion of "frozen" variables in the set of solutions or quasi-solutions: these variables belong to the "backbone" of the solutions and are those whose value is the same in all solutions. As a consequence, if the backtracking algorithm misassigns one of these variables, a useless exploration of the corresponding part of the possible solution tree is made. Further, they show that for p > 0.41, the backbone appears suddenly with a finite value at α_0.5 (the transition is discontinuous). There is therefore a situation where uncertainty prevails (see the definition of α_0.5 above) and some variables have a unique satisfactory assignment, which explains why a whole part of the exponential tree is explored.



Figure 8.2.2: Median cost for solving (2 + p)-SAT instances at α_0.5. The scale is semi-logarithmic. (Taken from [46].)

8.2.3 Overview of the statistical study of (2 + p)-SAT

From their results, Monasson et alii make the new conjecture that the characteristic feature of NP-complete problems is that they have a discontinuous, first-order phase transition. Much care is needed here because these are still open questions. Further, it is clear that not all physical systems exhibiting first-order transitions are complex to describe and study. For example, frozen liquids have a highly regular crystal structure which can easily be explained. Yet the properties of SAT instances (e.g. the proportions of the types of constraints associated with 2- or 3-clauses) can be translated into properties of the physical systems that they represent (remember that the clauses can be used to describe constraints such as energetically optimal alignments between spins), which is a way to provide insights into why physical systems can sometimes be hard to predict or explain. Finally, and most importantly for my argument, the above results help one to understand better why, even in an NP-complete problem, the great majority of instances can be easy to solve.

8.3 Conclusion: problem complexity underdetermines instance hardness

In this chapter, I first gave reasons why the complexity of instances should be studied by first analyzing the complexity of the corresponding problem and then "projecting" from the features of the problem to the features of the instances.

I also studied the precise meaning of these complexity measures and argued that, on the face of their literal meaning, it cannot be inferred from these features of problems that the corresponding instances are indeed difficult, even if the problems are, for example, NP-complete.

The existence of "pockets of solvability" was a confirmation of these analyses. Amore detailed study of 3-SAT showed that the situation is even "worse", because evenin NP-complete problems hard instances can be rare. As a consequence, the inferencefrom features of problems to features of instances is not a safe one indeed. Worst-casecomplexity does provide an upper-bound to the complexity of instances, but, in manycases, this upper-bound is a coarse over-estimation of this complexity. In the same time,the study of 2+p instances with methods from statistical physics suggests that the hardnessof instances is rooted in some of their properties and is not a mere collective effect.

In conclusion, since even the completeness of problems is not a certificate (a sufficient condition) for the hardness of instances, it is legitimate to ask what could be one. As a consequence, one can raise the following questions:

• Is it possible to find problems with no easy sub-problems, and if such problems do exist, what conditions do they satisfy?

• Is it possible to clearly separate the hard and the easy instances of a difficult problem into different subproblems?

• If this is possible, is it sufficient to safely consider that the instances of these homogeneously difficult subproblems are intrinsically difficult and cannot be solved quickly?

As we shall see, these questions are quite similar to the questions that must be solved when one tries to determine the probability of events, since in both cases the question is to find the right reference class for an instance (in the case of complexity measures) or for an event (in the case of probability), and therefore to find the conditions that must be fulfilled by a reference class for it to be the "right" one, that is to say, the one from which one can legitimately use the complexity measure characterizing the infinite class to extrapolate about the hardness of particular instances.

8.4 Appendix: first-order transition and replica method in (2 + p)-SAT

In this appendix, I give more details about how Monasson et alii use the replica method, originally devised by Mézard, Parisi and Virasoro [44] to study spin glasses, in order to explain in which cases (2 + p)-SAT instances are difficult.


Chapter 9

Resorting to partitions: successes and failures

This chapter is devoted to finding the additional conditions that must be met by problems if their complexity is to mirror the hardness of their instances. I start in section 9.1 by emphasizing that the theory of computational complexity is after algorithms which are both general and efficient, which is at odds with the need, in discussions about predictability, emergence or explanation, to characterize features of particular systems and events. In section 9.2, I emphasize that the question is formally similar to the problem that Salmon tries to solve when discussing statistical explanation in the SR model. I first present the notions of statistically relevant partition and of (statistically) homogeneous class; in a second step, I draw a parallel with the study of the computational cost of instances. Then, in section 9.3, I introduce the notions of computationally relevant partition of a problem and of computationally homogeneous problem. In section 9.4, I present the notion of complexity core of a problem and show how the existence of complexity cores proves that there are computationally homogeneous problems. I finally conclude by showing how the drawbacks of complexity cores (equivalence by finite variation and the possibility of using patched algorithms) have the consequence that the notion of the hardness of instances remains ill-grounded.

9.1 Striving after efficiency and generality

9.1.1 What can we infer from NP-completeness?

As we saw in the previous chapter, the worst-case complexity of problems supplies only upper bounds on the complexity of all their instances, and this is so even when the problem is NP-complete. In other words, if you need to solve a particular instance of the problem (versus all its instances), the situation is perhaps not that bad. Suppose that, to put it in Garey and Johnson's words, "a good method is needed for determining whether or not any [my emphasis] given set of specifications for a new bandersnatch component can be met and, if so, for constructing a design that meets them" [26, §1]. Knowing the theory of NP-completeness, you can come back to your boss and, instead of saying "I can't find an efficient algorithm, I guess I'm just too dumb", say "I can't find an efficient algorithm, but neither can all these famous people". As a consequence, you can concentrate on less ambitious tasks than finding an efficient, exact and all-purpose algorithm: "for example, you might look for efficient algorithms that solve various special cases of the general problem". In short, what you have learned is that it is not possible to find an algorithm that is both efficient and general. In this sense, the theory of computational complexity is about the limits of algorithmics and of what can be reached with very few algorithms, but not necessarily about the limits of what can be reached by science (see footnote 1).

So in one sense, since the theory of computational complexity deals with the computational resources that are needed for computations, it is completely relevant to discussions of predictability in practice, of computational emergence, and of which particular phenomena (resp. theorems) one can explain (resp. prove) within science. At the same time, since this theory is only about the impossibility of finding efficient and general algorithms, it fails to prove that such or such particular phenomenon cannot be predicted or explained, or that such or such phenomenon is emergent. But when one discusses unpredictability or emergence (see footnote 2), one has in mind properties of such or such particular phenomena, not properties of classes of phenomena. To put it more clearly: a phenomenon A is, for example, unpredictable because it cannot be predicted, not because there is no single method to predict A and B and C, etc., even if A, B and C can be seen as being of the same type. And a type of phenomena should be described as unpredictable if all the tokens of this type are unpredictable.

In conclusion, it is crucial to find additional conditions which, when they are met, allow one to infer that the complexity of a problem is good information about the hardness of its instances.

9.1.2 Particular physical situations, individual instances and the search for generality in science

It is a traditional claim that the object of science is what is general, not what is particular. By insisting that we should try to measure the complexity of particular instances of problems, am I not going against this claim? Two different answers can be given to this criticism.

First, I agree that the search for general and efficient algorithms is a completely legitimate goal of science. My point is only that, when it comes to measuring the complexity of phenomena, we need a complexity measure for the hardness of particular deductions or of solutions to particular problems.

Second, it is true that I have emphasized in part 1 that, to produce good explanations, one needs to select relevant explanatory facts and that, by doing this, one also produces general explanations. At the same time, remember that, when one produces a general explanation, one does so by means of one single deduction. For example, Newton's single deduction of the law of areas provides an explanation of the instantiation of the law of areas in all cases where a single central force is at work in a two-body system. As a consequence, by measuring the complexity of producing such a deduction, one fully satisfies the requirement of explanatory generality. It is true that in some cases a single deduction (or a single instance of a problem) represents a single particular situation, as when one computes the evolution of the atmosphere to predict tomorrow's weather. But a single deduction (or a single instance of a problem) can also stand for an explanation that is general and applies to a wide variety of cases. So the objection is misguided.

1. This is a crucial difference with notions like Kolmogorov complexity or Bennett's logical depth, which are about particular strings.
2. See [50], [6, 7].

9.2 An analogous problem: determining the probability of singular events

When one uses statistics to determine the probability of a single event, a similar problem arises. It is not possible to "measure" this probability by focusing on the single event e; consequently, one needs to build up the probability measure out of a class of events of the same type, of which e is an element. For example, to determine what the chances were for Albert to steal the car (event E), one can gather statistics about the number of boys in the United States who steal cars, or about the number of boys living in San Francisco with divorced parents who steal cars. So, to determine the probability of E, the whole problem is to find the right reference class A. Once this is done, one can count the frequency p of boys stealing cars within class A and infer by projection that the probability of E was p.

Wesley Salmon, discussing the notion of explanation in a probabilistic context, proposed a solution to the reference class problem with his SR (Statistical Relevance) model of explanation (see footnote 3). The two main concepts defined by Salmon are the following (see footnote 4):

• Given some class or population A, an attribute C is statistically relevant to another attribute B if and only if P(B | A.C) ≠ P(B | A), that is to say, if and only if the probability of B conditional on A and C is different from the probability of B conditional on A alone.

• A class F is objectively homogeneous with respect to attribute G if no relevant partition can be made in F by means of an attribute which is statistically relevant to G. (A small computational sketch of the relevance test is given just below.)
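The following Python sketch is my own illustration of Salmon's relevance test applied to raw frequency data; it is not part of the SR model itself, and all function names are hypothetical.

```python
# Testing statistical relevance of an attribute C to an attribute B within a reference
# class A, from relative frequencies in a finite population of items.

def prob(population, condition):
    """Relative frequency of items satisfying `condition` in `population`."""
    population = list(population)
    return sum(1 for x in population if condition(x)) / len(population)

def statistically_relevant(population, in_A, has_B, has_C, tol=1e-9):
    """C is statistically relevant to B within A iff P(B | A and C) != P(B | A)."""
    A = [x for x in population if in_A(x)]
    A_and_C = [x for x in A if has_C(x)]
    return abs(prob(A_and_C, has_B) - prob(A, has_B)) > tol
```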

It remains to be checked whether analogous concepts can be used to tackle the problem of the reference class for measuring the hardness of instances.

9.3 Measuring the hardness of instances by partitioning problems

Mutatis mutandis, a solution similar to the one offered by Salmon seems to be possible. What one tries to determine is the hardness of an instance i (resp. the probability of an event E), which is done indirectly by means of a "holistic" complexity measure (resp. a "holistic" frequency measure) characterizing an infinite class of instances (resp. a class of events of the same type) to which the instance i (resp. the event E) belongs. In both cases, the aim is to determine a class which can be said to be homogeneous with respect to the holistic measure that is used. To make things clearer, I start with an example about matrices, which, for the analysis of scientific activity, is probably more telling than results about SAT problems. Another reason to give this additional example is to illustrate that making computationally relevant partitions is also of great scientific interest even when the different classes belong to the complexity class P.

3. See [63, p.63] for more details.
4. Since my aim is not to discuss the SR model, I have somewhat simplified the definitions.

9.3.1 Computational relevance: the example of sparse matrices

Matrices are frequently used in physics to study physical systems, e.g. in optics, quantum mechanics or condensed matter physics. It is therefore crucial to be able to perform the usual operations (multiplication, inversion, diagonalization) on particular matrices at a cost which is as low as possible.

Matrix multiplication is performed as indicated by figure 9.3.1. In the example, (AB)_{1,2} = Σ_{r=1}^{2} a_{1,r} b_{r,2} = a_{1,1}b_{1,2} + a_{1,2}b_{2,2} and (AB)_{3,3} = Σ_{r=1}^{2} a_{3,r} b_{r,3} = a_{3,1}b_{1,3} + a_{3,2}b_{2,3}.

Figure 9.3.1: Matrix multiplication

More generally, the product AB = C of an m × n matrix A and an n × p matrix B is given by

∀i, j : c_{ij} = Σ_{k=1}^{n} a_{ik} b_{kj} = a_{i,1}b_{1,j} + a_{i,2}b_{2,j} + · · · + a_{i,n}b_{n,j}.

.From this, it is clear that multiplying two matrices requires in general n3 operations.

It was a great feast by Strassen to show that the complexity can be as low as O(nlog2 7) =O(n2.81). Further progresses showed that the exponent is inferior to 2,376. Whatever theprogresses that are made, the best one can hope for the general case is a complexity equalto O(n2) — since there are n2 cells to be filled in.

However, in many cases the complexity can be lower, especially when the matrices that are multiplied are what are called "sparse matrices", with many zero entries. Figure 9.3.2 shows different types of usual sparse matrices. Algorithmic studies show that the precise complexity of matrix multiplication does depend on the type of sparseness characterizing the matrices. For example, in the case of tridiagonal matrices, the time complexity of multiplication is O(n) (versus O(n²) or more).
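To see where the O(n) bound comes from, here is a minimal Python sketch of my own (not an algorithm from the text): when both factors are tridiagonal, each non-zero entry of the product involves at most three terms, and only O(n) entries need to be computed. The diagonal-based storage scheme is an assumption for the example.

```python
# Multiplying two tridiagonal n x n matrices in O(n).
# Each matrix is a triple (lower, main, upper) of diagonal lists.

def get(diag_lower, diag_main, diag_upper, i, j):
    """Entry (i, j) of a tridiagonal matrix stored by diagonals (0-indexed)."""
    if i == j:
        return diag_main[i]
    if i == j + 1:
        return diag_lower[j]
    if j == i + 1:
        return diag_upper[i]
    return 0

def tridiag_product(A, B, n):
    """Product of two tridiagonal matrices; the result is pentadiagonal.
    Returns a dict {(i, j): value}; only O(n) entries are computed,
    each from at most three products."""
    C = {}
    for i in range(n):
        for j in range(max(0, i - 2), min(n, i + 3)):        # |i - j| <= 2
            s = 0
            for k in range(max(0, i - 1, j - 1), min(n, i + 2, j + 2)):
                s += get(*A, i, k) * get(*B, k, j)
            if s != 0:
                C[(i, j)] = s
    return C
```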

Figure 9.3.2: Examples of sparse matrices. (Original figure in [68].)

In short, partitioning the original type "matrix" into subtypes (e.g. tridiagonal matrices, triangular matrices, etc., and other matrices) yields a finer-grained typology of matrices with different computational costs for the different types (see footnote 5). It must be noted that, if one is interested in tridiagonal matrices, the fact that the class of all but tridiagonal matrices is not computationally homogeneous (since it can be partitioned again into triangular matrices and other matrices) is of little importance (see footnote 6).

9.3.2 From statistical relevance to computational relevance

With the matrix example in mind and a peeping eye on the SR model, one can try to bring an answer to the problem of the reference class for complexity measures.

5. Note however that computational cost underdetermines types of matrices just as probability underdetermines causes. Different types of sparse matrices can require different algorithms which nevertheless have the same computational complexity.
6. Salmon requires all the cells of the partition to be homogeneous, but one can deny that this should be required even for statistical explanation.

Definition 9.3.1 (Computational relevance). Let i be an instance of a problem A (I shall also use "A" to denote the set of all instances of this problem). Let K be a well-defined complexity measure for problems and K(A) its value for A. Let P be a property of instances that can be used to define a sub-problem of A (in particular, an infinite subset of A should have property P). C_P(A) is the subset of instances of A having property P (I shall also use "C_P(A)" to denote the corresponding subproblem). C_P(A) and C_¬P(A) therefore define a partition of A.

Property P is K-relevant if and only if K(C_P(A)) ≠ K(A) or K(C_¬P(A)) ≠ K(A).

An obvious but important remark is that the inequality K(C_P(A)) ≠ K(A) can be interpreted more or less liberally. It can correspond to a difference of complexity class, of complexity exponent within a complexity class, or even of multiplicative coefficient. With the first interpretation, one obtains classes which belong to different robust, invariant and well-defined complexity classes. In the remainder of this chapter, I shall use this interpretation of the inequality because I am after a robust foundation for the notion of hard instance. At the same time, it is clear that this interpretation of the inequality sign is too coarse to describe scientific progress, since a decrease by a constant factor is in practice usually considered significant progress.

Also, from a practical point of view, further conditions can be added for a partition to be useful. For example, a partition is acceptable if making the partition decreases the global cost of the solution of the corresponding instances, that is to say, the cost of the partition plus the cost of the solutions of the new subproblems (the cost of the partition corresponds to the computational resources that are required for identifying an instance as belonging to one of the subproblems defined by the partition; the sketch below makes this division of costs explicit). Also, it is not satisfactory to state that SAT instances can be partitioned into two subproblems composed of the instances that can be satisfied and the instances that cannot be satisfied. For these two subproblems, satisfaction becomes a trivial question, which can be answered immediately once one knows to which subproblem a given instance belongs. In short, all the computational cost of finding the solution has been transferred into the cost of the partition. There is (probably difficult) work to be done here in order to give a satisfactory definition of what an "acceptable" partition is, but I shall not try to solve these puzzles in this chapter.
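The following Python sketch is my own way of making the two costs visible; the dispatch scheme and names are assumptions, not a method proposed in the text.

```python
# Solving via a partition: first pay the cost of classifying the instance into a cell,
# then pay the cost of the specialised solver for that cell.

def solve_with_partition(instance, cells):
    """`cells` is a list of (test, solver) pairs; `test` recognises the cell,
    `solver` is the algorithm tailored to it. The last cell should be a
    catch-all (its test always succeeds), e.g. the general-purpose algorithm."""
    for belongs_to_cell, solver in cells:
        if belongs_to_cell(instance):      # cost of the partition
            return solver(instance)        # cost of the specialised solution
    raise ValueError("the cells do not cover this instance")

# The partition is worth making only if, overall,
#   cost(tests) + cost(specialised solvers) < cost(general algorithm alone).
```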

Finally, it is important to note that, as long as the complexity measure that one uses is worst-case complexity, the complexity measure never increases in any subproblem and does decrease in at least one sub-problem when one makes a K-relevant partition. In other words, making partitions is a way to produce finer and lower upper bounds describing more precisely the complexity of instances.

9.3.2.1 The requirement of K-homogeneity

One can now bring an answer to the problem of the reference class for complexity measures. For a complexity measure of a problem to bring information about the hardness of its instances, no computationally relevant partition of this problem should be possible. Hence the following definitions.

Definition 9.3.2 (Objective K-homogeneity). A problem is objectively K-homogeneous if it is not possible to make a K-relevant partition of it by means of an acceptable property.

Definition of objective K-homogeneity with respect to an instance. A problem A is objectively K-homogeneous with respect to an instance i if it is not possible to make a K-relevant partition of A into subproblems A1 and A2 by means of an acceptable property such that the sub-problem Aj to which instance i belongs has a complexity measure different from that of A.

The second definition is required because the computational non-homogeneity of a problem A is a worry only in cases where the instance that one is interested in is less difficult to solve than is indicated by the complexity of A. It should also be noted that, so far, we still have no clue as to whether K-homogeneous problems exist. (This question will be discussed in the next section by presenting the notion, and proof of existence, of complexity cores.)

9.3.2.2 Objective, epistemic and practical K-homogeneity

Just as in the case of statistical homogeneity [61, p.44], three notions of homogeneity can be distinguished:

1. A problem (resp. class) is objectively homogeneous if it is absolutely impossible to make a K-relevant (resp. statistically relevant) partition of it.

In physics, objectively homogeneous classes can be found in quantum mechanics. This is clearly the case we are interested in if we are to show that instances can be intrinsically hard and that the complexity of problems can mirror the hardness of instances.

2. A problem (resp. class) is epistemically homogeneous if one does not know how to make a K-relevant (resp. statistically relevant) partition of it whereas it is possible to make one (see footnote 7).

3. A problem (resp. class) is practically homogeneous if it is not possible in practice to make a K-relevant (resp. statistically relevant) partition of it. For example, when you play roulette, it is not possible in practice to determine which sets of initial conditions result in the ball stopping on number 13.

Coming back to problem solving, if a problem A has many easy subproblems which all require different algorithms, it may be globally unproductive to spend resources on determining whether an instance belongs to one of these subproblems, and more convenient and efficient to use one single algorithm in all cases.

NP-completeness and epistemic K-homogeneity

As we saw in the previous chapter, NP-complete problems can have a wide majority of easy instances and can be fruitfully partitioned into subproblems, some of which belong to the class P. In other words, problems like SAT are at best epistemically homogeneous. Further, once such a partition has been made, the two subproblems can still be epistemically homogeneous, because further partitions can still be made. Even if NP-complete problems can be epistemically homogeneous, there are however limits to what can be gained by K-relevant partitions. It cannot happen that, by making a finite number of partitions, one ends up with a finite list of subproblems all belonging to P, because, by trying out simultaneously the algorithms solving these subproblems, one would solve all instances of the initial problem in polynomial time, and the problem would belong to P.

7. Sometimes, a problem is objectively homogeneous but one does not know that this is so; as a consequence, one keeps searching for partitions and wastes resources to no avail. Thus, the notion of epistemic homogeneity could fruitfully be enlarged in order to include this latter case, or a second, closely related definition could be built to account for it.

As a consequence, by partitioning an NP-complete problem and singling out polynomial pockets in it, one must always end up with a subproblem which does not belong to the class P. Further, there is a sort of asymmetry between the subproblems that result from the partition. After the partition, we can be certain that the instances of the polynomial pocket are easy, whereas no such certainty is gained for the instances of the difficult subproblem, since it is perhaps still only epistemically homogeneous.

To gain full certainty about the hardness of some instances, one must be able to single out a subproblem which is objectively homogeneous, and this is precisely what the next section is about.

Conclusion: exploring the map of complexity versus bringing foundations to the notion of hard instance

In conclusion, two different goals must be clearly distinguished.

1. If our goal is to describe how scientists in practice refine the map of complexity and discover how they can solve new instances and extend the boundaries of science, the previous definitions in terms of partitions are satisfactory, and the existence or non-existence of objectively homogeneous problems is not a key issue.

2. If our goal is to show that instances can be intrinsically hard to solve, which is a foundational perspective, then it is required to show that objectively homogeneous problems do exist. In this perspective, it is also better to interpret the inequality sign in the expression K(C_P(A)) ≠ K(A) as standing for a difference in complexity class.

As a result, the issue that needs to be addressed now is whether there exist homogeneous difficult problems that do not have polynomial subproblems.

9.4 Complexity cores and the quest for objectively homogeneous problems

Since some problems are intrinsically complex, it is legitimate to suspect that some instances of these problems, infinite in number, are intrinsically hard to solve, and that this is the origin of the complexity of problems. If this is so, one can also expect that hard and easy instances can be sorted. This is what we shall examine now by focusing on the notion of complexity cores, that is to say, problems whose instances are almost all difficult. Two preliminary remarks are needed.

■ A first worry is that, in the study of complexity cores, the emphasis is laid on decision problems. Is this appropriate for studying the opacity of natural phenomena (since, in the physical sciences, the problems to be solved are seldom decision problems)? This worry is actually not troublesome. Usual problems can easily be transformed into decision problems. For example, for a minimization problem with a target function to be minimized, one can enumerate candidate values and ask whether each value can be reached.
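As a minimal illustration of this reduction, here is a Python sketch of my own: it refines the enumeration idea into a binary search over target values, treating the decision problem as a black box. The oracle `decide` and the integer-valued objective are assumptions for the example.

```python
# Turning a minimisation problem into a sequence of decision questions:
# "is there a solution of value at most t?".

def minimise_with_decision_oracle(decide, lower, upper):
    """Binary search over integer target values using the decision problem as a
    black box; uses O(log(upper - lower)) calls to `decide`."""
    while lower < upper:
        middle = (lower + upper) // 2
        if decide(middle):        # a solution of value <= middle exists
            upper = middle
        else:
            lower = middle + 1
    return lower                  # the optimal value
```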

■ Since this investigation is foundational, I shall focus in this section on problems that do not have polynomially solvable subproblems, and therefore on problems that do not belong to the class P (e.g. NP-complete and NP-hard problems). At this step, one may wonder whether it is appropriate to focus on these problems in order to study the opacity of natural phenomena. Are these problems not too high in the complexity hierarchy to be relevant for the study of physical systems? Do the problems that are solved by scientists not belong to the class P?

Actually, the fact that scientists solve problems belonging to P is not so surprising: physics is the art of developing our knowledge about Nature in spite of the disproportion between Nature's complexity and our finite means. Since problems that are NP-complete or harder are not tractable, physicists are unlikely to spend their resources trying to solve them. Instead, they legitimately try to use approximations and idealisations, to study systems' average behaviour, or to develop any other method that can make the study of a system tractable and still informative.

At the same time, it is clear that NP-complete or NP-hard problems are indeed met by physicists. As indicated above, the Ising model, a central model in the domain of statistical physics, has been shown to be NP-hard. Protein folding is another example of a simple physical problem that seems to be NP-complete [24, 25]. So, even if it is still debated whether Nature really "solves" instances of these problems quickly [1], it is at least consensual that such problems are encountered by physicists and that the art of physics is to keep our investigations clear of them. To quote a physicist working on such questions:

"The most precise numerical results available are for combinations of models and questions that can be addressed with known polynomial-time algorithms. It is often not obvious which questions can be studied using polynomial-time algorithms and which cannot. Just finding the ground state exactly can be NP-hard and thus impracticable to study. Even when the ground state can be found in polynomial time, computing quantities that characterize the energy landscape in the same model, such as the highest energy state, the partition function or the height of barriers between low-lying states, may be NP-hard. It is unclear whether the distinction between P and NP-hard optimization problems, so important in complexity theory, leads to distinct behavior of the physical models. Regardless of the physical importance of this distinction, it is clear that the discrimination between P and NP-hard optimization problems has been extremely useful in organizing simulations." [45, §5, p.71 sq.]

As indicated in this quotation, a debated question is whether hard instances of hard problems are connected with distinct behaviours. But the fact that such problems are met in practice seems to be agreed upon.



Let us turn to complexity cores. Paragraph 9.4.1 is devoted to a definition of this notion. Then results are presented which prove that complexity cores can be found in hard problems, which shows (at last) that computational complexity is not merely a collective effect. More results are presented in 9.4.4, which indicate that complexity cores are difficult to identify: as a consequence, we can be sure that there are difficult problems in science, but it is difficult to recognize them for sure. Finally, I argue in paragraph 9.5 that, for whoever wants to lay safe foundations for the notion of hard instance, the notion of complexity core is flawed with defects (shared with complexity measures attached to problems) and cannot ultimately be used for this purpose.

9.4.1 Definition of complexity cores

Suppose that one tries to solve SAT instances. An instance of SAT is a logical formula in conjunctive normal form, which can be coded as a sequence of 0s and 1s. Therefore, all the instances of SAT can be described as a set A of strings (what computer scientists call "a language"), which is a subset of Σ∗. The instances of SAT that can be satisfied constitute a subset B of A. Since deciding whether a string is an instance of a problem is usually easy, solving SAT boils down to being able to decide whether a string of Σ∗ belongs to B. More generally, decision problems are associated with sets, and solving an instance of one of these problems is equivalent to deciding whether the corresponding string belongs to the corresponding set. Finally, a difficult set of instances for a problem P, corresponding to set B, is a set A for which it is difficult to decide whether its elements are also elements of B.

Computer scientists, beginning with Lynch [39, 52, 65, 3], have tried to make this idea more precise by defining the notion of (polynomial; see footnote 8) complexity cores. Let us start with definitions. The model of computation that is used is the Turing machine. L(M) is the set of strings, or language, accepted by Turing machine M. T_M(x) denotes the number of steps used by Turing machine M to halt on input x. Finally, the set of t-difficult instances for a Turing machine M (t being a function) is:

Hard(M, t) = {x ∈ Σ∗ | T_M(x) > t(|x|)}
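Read operationally, this set can be sketched in Python for a finite sample of inputs; the step-counting solver plays the role of the Turing machine M. This is only my illustration, and the names in the example call are hypothetical.

```python
# The t-hard instances of a sample, for a solver whose step count we can observe.

def hard_instances(step_count, t, inputs):
    """Return the inputs x in the sample for which the solver uses more than t(|x|) steps.
    `step_count(x)` is the number of steps taken on x; `t` is a function of input size."""
    return {x for x in inputs if step_count(x) > t(len(x))}

# e.g. hard_instances(steps_of_my_solver, lambda n: n ** 2, sample_of_formulas)
```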

How should we define the hard instances of a problem A? If M is a Turing machine that decides A and t is a polynomial function, let Hard(M, t) be the set of t-difficult instances for M. The problem with this definition is that it is relative to a particular machine M. However, if a set does not belong to P, it can be proved that, if M and M′ both decide A, then, whatever the polynomials p and p′ (see footnote 9),

|Hard(M, p) ∩ Hard(M′, p′)| = ∞.

From this, the notion of complexity core can be defined as follows:

Definition 9.4.1. A polynomial complexity core for a set A is an infinite set X of instances such that, for every Turing machine M for which L(M) = A and for any polynomial p, X ⊆ Hard(M, p) almost everywhere. ("X ⊆ Y almost everywhere" means that |X − Y| is finite.)

Figure 9.4.1: Schematic representation of a complexity core for SAT. Horn formulas, which are tractable in polynomial time, do not belong to this core.

8. As we shall see, one can also define cores for higher classes in the complexity hierarchy.
9. But, as we shall see later, ⋂{Hard(M, p) | L(M) = A and p is a polynomial} = ∅, which is one of the major flaws of complexity cores.

The restriction "almost everywhere" in the definition should not be ignored because, what-ever the instances that one needs to solve, one can find a polynomial p big enough so thatone has x 6∈ Hard(M, p). In spite of this, the previous definition is satisfactory becauseit gives a more precise formulation to the idea that, when a problem is difficult, it doespossess infinite sets for which no algorithm can do well. For a complexity core, any algo-rithm fares badly except for a finite number of instances, which means that no infinite partof a complexity core can be easy (no general method can do well on an infinite numberof instances of the core.) In other words, this infinite set of instances is difficult for allalgorithms. Also, if a problem A has a complexity core X , it is not possible to divide thispart X of A into many subproblems, which all belong to P . In this sense, the complexityof A and of X is not a "collective" effect due to the fact that one tries to solve too manydifferent instances with too few algorithms.

9.4.2 The hardness of problems is not merely a "collective" effect: existence of gradual complexity cores

This paragraph is devoted to presenting results that confirm the existence of complexity cores for problems within the usual complexity classes.

Theorem 9.4.1. Any recursive set A (see footnote 10) not in P does have a complexity core.

This theorem implies that, under the assumption P ≠ NP, SAT and all other NP-complete problems have polynomial complexity cores. Since NP-complete problems are met in physics, unless one can devise an argument showing that the instances that are met in physics must be easy, one must conclude that the physical situations corresponding to the instances of the complexity cores of these NP-complete problems are difficult to predict or explain.

10. For a recursive set A, it is possible to decide with a Turing machine whether any instance belongs to A. "To decide" means that the Turing machine answers "yes" or "no" and always halts: it never loops.

9.4.3 Complexity cores and proof verification

My concern is mainly about the complexity of producing proofs, because I am interested in what I have called primary opacity (that is to say, the amount of resources that one needs to produce an explanation or a prediction). Let us now turn briefly to secondary opacity (that is to say, the amount of resources one needs to verify an already existing explanation or prediction). The model of computation used is now the non-deterministic Turing machine: since the proof is already built, we do not have to explore different paths successively, as we do when we try to produce a proof, but can directly review the right path, which, from a complexity point of view, is equivalent to exploring all these paths together with a non-deterministic Turing machine.

So the question is now whether there exist polynomial complexity cores when one uses non-deterministic Turing machines or, equivalently, when one measures the length of proofs in proof systems. It was shown by Schöning [64] that, here again, the answer is positive:

Theorem 9.4.2 ([64], assuming NP ≠ co-NP). There is a constant ε > 0 and a collection F of tautologies of density at least 2^{εn} for infinitely many n, such that for every sound proof system S for the tautologies and for every polynomial q, the shortest proof of f in S has length more than q(|f|) for almost every f ∈ F.

In other words, even if we were lucky (if we always immediately took the right paths when constructing proofs), or if these proofs were given to us by a benevolent demon, it would be out of our reach to verify that these proofs are correct (and therefore to benefit from the understanding that verifying and understanding a proof gives).

9.4.4 Hard to delineate complexity cores and the blurred boundaries of science

Two more questions can be asked about complexity cores:

• Is it always possible to recognize the instances belonging to a complexity core?

• Is it always possible to cluster together all the hard instances within a single complexity core?

If a positive answer were given to these questions, this would be good news for science. To be sure, there would be, for many different problems, sets of instances which could not be solved quickly, but at least it would be possible to know which instances are difficult to solve. We would know what we can hope to do and what we have no hope of doing, and philosophers would be delighted.


Can complexity cores always be easily identified?

An existing criticism about complexity cores is that they can hardly be identified in practice [27], even though they can be identified in principle (that is to say, with unlimited computational resources). Does this not go against Humphreys' and Wimsatt's motto that philosophy of science should focus on what is possible in practice, not in principle ? Two answers can be brought to this objection.

First, it is true that the identification of complexity cores often relies on enumerations of Turing machines, which does not make them easy to build or delineate. Nonetheless, results have been found showing that, for any set A that does not belong to P and for each superpolynomial function f, there is a complexity core that belongs to DTIME(f(n))¹¹. Also, it has been shown that for almost all sets belonging to EXP, Σ* is a complexity core, which means that all instances are difficult to solve.

Second, the fact that complexity cores are difficult to identify is a problem for scientists, not for philosophers. The notion can be used by philosophers to establish that science is not and cannot be omnipotent, just like Kolmogorov complexity can help one to show that not all strings can be described shortly. But the fact that Kolmogorov complexity is uncomputable or that complexity cores are hard to identify is a problem for scientists. As philosophers, we should just conclude that scientists cannot easily identify all that they can do because they are not able to easily identify what they cannot do easily.
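The comparison with Kolmogorov complexity rests on a standard counting argument, which can be restated in two lines:

% Standard counting argument: most strings have no short description.
There are $2^{n}$ binary strings of length $n$, but only
$\sum_{k=0}^{n-2} 2^{k} = 2^{n-1}-1$ programs of length at most $n-2$.
Since each program describes at most one string, more than half of the
strings of length $n$ cannot be compressed by even two bits; yet exhibiting
a specific incompressible string is itself hard, since Kolmogorov
complexity is not computable.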

The missing maximal complexity cores

A second surprising result is that sometimes, easy (resp. hard) instances cannot be once and for all clustered together because maximal cores do not exist. A complexity core X for a problem A is maximal if and only if, for any other complexity core X′ for A, X′ ⊆ X almost everywhere. It can first be shown that a problem A has a maximal complexity core if there exists a maximal restriction of Σ* on which A can be solved in polynomial time (this means that all easy instances can be clustered together). Unfortunately, it was shown that SAT and other NP-complete problems of the same kind do not have any maximal complexity core.

This means that any set on which SAT can be non-trivially solved efficiently can be enlarged, and there is no way to identify all easy instances once and for all : we are condemned to keep searching forever for new sets of easy instances.

From what has been said in this section, two conclusions can be drawn about science :

• The boundaries of science are blurred, even if they are well-defined, because computing exactly where the boundaries are (by identifying the limits of complexity cores) cannot be done once and for all and seems to require too much computational power for many complexity cores.

• Science must have a protestant ethics praising hard labour despite uncertainty about success. We are condemned to uncertainty as to whether some instances can be solved, because easy instances and hard instances cannot be definitively separated and classified into two separate sets. No "lazy man's reasoning"¹² is possible.

¹¹ Watch out that the complexity given here is not the time required for solving the instance i but for identifying it as belonging to the complexity core.

9.5 The hardness of instances: or, the absent Holy Grail

The notion of complexity core is a decisive step in the study of the hardness of problems since complexity cores of difficult problems are sets that do not have easy subproblems and that are, for each algorithm, easy only in a finite set of cases. In spite of this success, the notion cannot be used to define correctly the notion of hard instance. Since all but finitely many instances of a complexity core are difficult for all polynomials, one may consider that belonging to a core is a sufficient condition for an instance to be difficult, except in a finite list of cases. This conclusion is unfortunately illegitimate. Three main reasons can be given.

First, belonging to a complexity core is not a sufficient condition for being difficult because any instance can belong to a complexity core. The reason is that any finite variation of a complexity core (a complexity core plus (resp. minus) a finite number of instances) is a complexity core.

Second, it is not possible either to consider that an instance is really difficult if it belongs to all complexity cores of a problem. First, as we saw above, some problems have no maximal complexity core. Further, even if the intersection of two sets of difficult instances for a problem is infinite :

|Hard(M, p) ∩ Hard(M′, p′)| = ∞,

the intersection of all such sets is empty :

⋂ {Hard(M, p) | L(M) = A and p is a polynomial} = ∅.

The reason is that, for each instance i, one can find a polynomial p which is big enough so that i ∉ Hard(M, p).

Finally, one should note that, for every instance i and every Turing machine M which decides a problem A and has complexity K, there exists a Turing machine M′ which has the same complexity and decides i quickly. In short, any instance can be solved quickly by an algorithm which performs just as well as our best algorithm. To build M′, take your best algorithm and add to it two preliminary instructions : first check whether the instance to be solved is i ; if it is, write the answer for i ; otherwise, proceed as usual. This "patched" algorithm gives the right answer for all instances and its asymptotic complexity is the same as that of the original algorithm M.
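The construction of M′ can be written out in a few lines. In the sketch below, a naive primality test merely stands in for "our best algorithm" for some problem; the function names and the chosen instance are illustrative only.

def trial_division_is_prime(n):
    """Stand-in for 'our best algorithm' for some decision problem."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def patch(solver, instance, answer):
    """Build the 'patched' algorithm M': same asymptotic behaviour as `solver`,
    but it answers `instance` immediately by table look-up."""
    def patched(n):
        if n == instance:      # preliminary check: is this the stored instance?
            return answer      # if so, write the pre-computed answer
        return solver(n)       # otherwise, proceed as usual
    return patched

# The patched algorithm is 'fast' on 2**127 - 1 only because the answer is stored.
fast_on_mersenne = patch(trial_division_is_prime, 2**127 - 1, True)
print(fast_on_mersenne(2**127 - 1))   # instant: table look-up
print(fast_on_mersenne(97))           # delegates to the original solver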

The conclusion to be drawn is that the notion of complexity core, attractive though it may seem, is not sufficient to provide a good definition of the notion of hard instance. At the same time, it is clear that "patched" algorithms are unsatisfactory because they already contain a description of the target instance and its solution. In order to define hard instances, an attractive direction is to try to forbid the use of "patched" algorithms by restricting the permitted size of algorithms. This is what is done in the next chapter by relying on the notion of instance complexity.

¹² The lazy man's syllogism is the one that is made by a person who believes that everything is predetermined in the world and that it is useless to make any effort.


Chapter 10

Laying foundations for the notion of hard instance: instance complexity or the two dimensions of complexity

In the previous chapter, we have been on the verge of finding a satisfactory definition for the notion of hard instance in the framework of problem complexity, with the notion of complexity core. We have nevertheless been faced with a last-minute failure due to the existence of "patched algorithms" — algorithms which, for a finite set of instances, can rely on table look-up. As a consequence, for any instance, there are algorithms that solve it quickly and perform just as well as our best algorithms, which do not use table look-up. This failure is not to be blamed on the notion of complexity core because the same kind of problem arises for other complexity measures. "Patched algorithms" were discussed in the context of complexity cores because they happened to be the last obstacle to a robust definition of the notion of hard instance.

As a consequence, to lay foundations for this notion, it is finally required to leave the framework of the theory of problem complexity and to focus on a new notion, which is more adapted to dealing with the hardness of particular instances. This is what instance complexity is about. The general idea backing this notion is that one should take into account the size of the algorithms that one uses. The longer the algorithm, the more it can incorporate specific knowledge finely tailored for specific sets of instances. "Patched algorithms" are the ultimate caricature of this idea since they are specifically designed to solve as quickly as possible a finite number of instances and must contain for this reason a description of these instances, which increases their size. Since the notion of instance complexity is delicate to manipulate, I first give a non-formal presentation of it in the first part of this chapter (sections 10.1 and 10.2). In sections 10.3 and 10.4, more formal definitions are given and results are presented in order to check, as we did with the notion of complexity core, that this notion gives us information about how hard the instances of problems that are met in physics are. Finally, in section 10.5, I put into perspective the results from the last three chapters about the notion of intrinsic opacity.


10.1 The main idea : taking into account the size of algorithms and of proof systems

Patched algorithms, which work well because of the table look-up trick, are ad hoc algorithms designed to solve quickly only finitely many instances, and it is intuitively clear that one should not say that an instance is easy because it can be solved quickly by algorithms of this type.

Still, even if table look-up is a trick, it is one that can be helpful and that scientists use in practice. If one often needs to use π and many of its decimal places, it is a good policy to store a description of π instead of repeatedly computing it. As we saw above (p.47), the same was true of the values of the sine, cosine or logarithmic functions before powerful computers were built. So the solution cannot be to say that such algorithms are useless and must be banned. At the same time, an instance should not be described as easy if only patched algorithms can solve it quickly.
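A minimal sketch of this legitimate use of table look-up (the step size and interval are arbitrary choices): the values of a costly function are computed once, stored, and afterwards merely read off, exactly as printed tables of sines and logarithms used to be.

import math

# Precompute a table of sine values once (the old "table of sines"),
# then answer subsequent queries by look-up plus linear interpolation.
STEP = 0.001
TABLE = [math.sin(k * STEP) for k in range(int(math.pi / 2 / STEP) + 2)]

def sin_lookup(x):
    """Approximate sin(x) for 0 <= x <= pi/2 from the precomputed table."""
    k = int(x / STEP)
    frac = x / STEP - k
    return TABLE[k] * (1 - frac) + TABLE[k + 1] * frac

print(sin_lookup(0.7), math.sin(0.7))  # close agreement, with no recomputation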

Two issues must also be distinguished. The first one is about finding good arguments in order to legitimate the claim that patched algorithms should not be taken into account when measuring how hard instances are. The second objective is to find a way to prohibit the use of these algorithms when measuring the hardness of instances : when doing this, it is clear that one must pick out the most efficient algorithms for this instance, since it would be stupid to measure hardness with inappropriate methods. At the same time, an additional clause must be added so that patched algorithms cannot be picked, and the question is what this clause must be in order to ban all and only patched algorithms.

In the next sections, I shall only try to answer the second question. The strategy is to take into account the size of algorithms. Roughly said, algorithms that are big enough to contain a description of the target instance and of its solution must be blacklisted. The advantage of this criterion is that it is purely syntactic, which, when dealing with algorithms, is what should be looked for. It should however be noted from the start that this solution is satisfactory only when one deals with long instances. The reason is that the description of efficient "honest" algorithms also takes up space. As a consequence, for all instances that are short to describe, the size of honest algorithms and of patched algorithms is approximately the same.

10.2 Instance complexity : semi formal presentation

10.2.1 Bounded and unbounded instance complexity and Kolmogorov complexity

The idea behind instance complexity is to take into consideration the size of the solving program. With this perspective in mind, one can start with the definition of unbounded instance complexity :

Definition 10.2.1. The unbounded instance complexity of an instance i of a problem A isthe size of the shortest program solving i and making no mistake on other instances of A.

This notion is not very informative because the corresponding program may take a very long time. For example, a program based on truth-table enumeration is short but inefficient for checking whether a formula is a tautology. In consequence, it is legitimate to let the size of the allowed program grow in order to decrease the time complexity. But what size can be safely allotted ? As indicated above, the program should not contain a description of instance i. In consequence, a first coarse upper bound for the size of honest programs is the Kolmogorov complexity K(i) of the instance i, that is to say the size of the shortest program that can produce i. Since a program using this shortest description to solve i by table look-up can be used, one can write :

ic(x : A) ≤ K(x) + O(1).

Since the shortest program p that computes i may take a long time, a patched algorithm relying on this compressed description may fail to be efficient, because it would first need to produce i out of p. In consequence, in order to be quick, a patched algorithm must use a description of i that can be "unfolded" within a given time bound. Accordingly, it is more appropriate to define and use time-bounded instance complexity :

Definition 10.2.2. The time-bounded instance complexity ic^f(i : A) of an instance i relative to a problem A and a time function f is the size of the shortest program that can solve i within time f(|i|) without making any mistake on other instances of A.

Time-bounded instance complexity must be compared not to Kolmogorov complexity but to time-bounded Kolmogorov complexity, which is the size of the shortest program that produces i within a time bound :

K^t(x) = min { |M| : M(λ) = x and t_M ≤ t(|x|) }.

With these definitions, the new inequality that one gets is¹ :

ic^t(x : A) ≤ K^t(x) (10.1)

The inequality says that the time-bounded instance complexity cannot be more than the time-bounded Kolmogorov complexity, since a patched algorithm for i, relying on a compressed description that can be unfolded in time t, has roughly size K^t(i).

10.2.2 How to measure the hardness of instances?

In this new perspective, the complexity of an instance is not measured with one parameter but with two : the time of the computation and the size of the algorithm. To determine whether an instance is hard to solve, one first chooses a reasonable time bound t and one then asks about the size of the shortest algorithm that can solve it. If this size is just as big as the t-bounded Kolmogorov complexity, this means that the instance cannot be solved in time t unless one uses table look-up, and the instance can be considered as t-difficult. The purpose is not necessarily to be able to measure the instance complexity of each instance, but to build a complexity measure so that theorems can be given about the existence or non-existence of hard instances. Also, if the measure is a good explication of the notion of hardness of instances, it should account for what we know about the hardness of instances.

¹ I have simplified the equation.


It should for example account for results like the linear speedup theorem or the decrease in computational cost brought about by the use of cuts (when proving tautologies). This is what I check in this section.

• The linear speedup theorem says that it is possible to speed up any computation by an arbitrary constant k. Yet, as we already saw, the counterpart of this speedup is an exponential increase of the alphabet used by the TM and of its number of states describing the transitions. This is in accordance with instance complexity : the decrease in computational time goes with an increase in the size of the algorithm (corresponding here to the description of the TM).

• The case of the use of cuts for proving tautologies can be analysed in the same way. Including cuts does not change the deductive closure of the initial proof system ; however, by using cuts, proofs can be made significantly shorter, but this goes with an increase of the set of rules and therefore of the size of the description of the proof system.

These two examples confirm that the notion of instance complexity (the explicatum) does account for cases where a decrease of computational cost is possible. More generally, it is also consonant with the fact that the development of our algorithmic knowledge and of more refined algorithms, adapted to more specific instances, brings a decrease of the computational cost of solutions together with a greater size of the knowledge that must be stored in our books or memory.
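The trade-off mentioned in the second example can be made explicit with a schematic back-of-the-envelope calculation (the numbers are arbitrary and chosen only for illustration):

% Schematic illustration of why cuts shorten proofs (arbitrary numbers).
Suppose a lemma $L$ requires a derivation of $100$ steps and is needed at $k$
different places in a proof. Without cut, each use must re-derive $L$, so
roughly $100\,k$ steps are devoted to $L$ alone; with cut, $L$ is derived once
and then cited, so roughly $100 + k$ steps suffice. The saving grows with $k$,
but the proof system, and hence the size of its description, has been
enlarged by the additional rule.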

What has been said so far can be summarized with a graphic describing the co-evolution of time-bounded instance complexity and Kolmogorov complexity (see figure 10.2.1). As the graphic shows, instance complexity never exceeds Kolmogorov complexity. Still, when one decreases the time bound for computation, there is finally a point after which the only way to solve the instance within the allotted time is to use table look-up : in consequence, instance complexity is then equal to Kolmogorov complexity.

Summary and perspectives

Now that the main idea has been sketched, it is required to see whether the explicandum² notion (the hardness of instances) can be given a more precise definition and whether fruitful formal results can be brought to describe which problems have hard instances. It can in particular be expected from a good explicatum notion that :

• it is robust and independent of the computers that are used ;

• it can be used to characterize the complexity of problems, which would then be seen as deriving from the hardness of instances ;

• it can be used to redefine the notion of complexity cores, which are supposed to be sets of hard instances ;

• it can be determined whether hard problems do indeed have hard instances.

² To use Carnap's terminology [12, §§1–3].


Figure 10.2.1: Sketch of the time-bounded Kolmogorov complexity (K^t(x) = f(t)) and the time-bounded instance complexity (ic^t(x : A) = g(t)) as functions of time.
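Since the figure itself is not reproduced here, its qualitative shape can be sketched with made-up monotone curves (the numerical values below have no significance; only the ordering ic^t(x : A) ≤ K^t(x) and the existence of a critical time bound matter). The sketch assumes NumPy and Matplotlib are available.

import numpy as np
import matplotlib.pyplot as plt

# Schematic curves only: both quantities decrease as the time bound t grows,
# ic^t(x:A) never exceeds K^t(x), and below a critical time bound the two
# coincide (only table look-up can meet the deadline).
t = np.linspace(1, 10, 400)
kolmogorov = 60 / t + 8                                    # made-up K^t(x)
critical = 3.0                                             # made-up critical bound
instance = np.where(t < critical, kolmogorov, 75 / t + 3)  # made-up ic^t(x:A)

plt.plot(t, kolmogorov, label="time-bounded Kolmogorov complexity")
plt.plot(t, instance, "--", label="time-bounded instance complexity")
plt.axvline(critical, color="grey", linewidth=0.5)
plt.xlabel("time bound t")
plt.ylabel("size of shortest program")
plt.legend()
plt.show()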

10.3 Instance complexity : formal definition and significant results

This section is devoted to showing that the expectations that have just been listed can be satisfactorily fulfilled. In the present summary, I shall be brief and not give the technical details, since most of the work is done by the definitions and results taken from computer science and the philosophical work consists in discussing to what extent these results do the job that we want them to.

10.3.1 Definition and invariance results

The definitions given above use concepts like "shortest program", "algorithm" or "smallest Turing machine". But these notions are not precisely defined since, for example, the size of a Turing machine is not well-defined. To get rigorous definitions and invariance results, it is necessary, both for Kolmogorov complexity and instance complexity, to use the notion of a universal Turing machine (UTM) taking two arguments, one for the description of the instance to be solved and one for the program that must be run. With this definition, it is possible to define the size of the argument describing the program. Further, because a two-tape UTM can simulate multi-tape UTMs in time t′(n) = c·t(n) log t(n) + c, one gets the desired invariance result. The instance complexity is the size of the shortest program on a UTM that can solve i, and the time bound is well-defined up to a log t(n) factor.

Theorem 10.3.1 (Invariance theorem). There is an interpreter U such that, for every other interpreter M, there is a constant c such that, for every set A, every time bound t and every string x,

ic^{t′}_U(x : A) ≤ ic^t_M(x : A) + c

K^{t′}_U(x) ≤ K^t_M(x) + c

with, in both cases, t′(n) = c·t(n) log t(n) + c.

As a consequence, one gets the following inequality between instance complexity and Kolmogorov complexity :

ic^{t′}(x : A) ≤ K^t(x) + c (10.2)

10.3.2 Instance complexity and complexity cores

With this definition of instance complexity, the notion of complexity core can be redefined like this :

Theorem 10.3.2 ([51]). A set C is a polynomial complexity core for a set A if and only if, for every polynomial p and constant c, ic^p(x : A) > c for almost all x in C.

It should finally be noted that, with this notion of instance complexity, finer-grained results than with the notion of complexity core can be obtained about the density of hard instances in hard problems.

10.3.3 How to characterize hard instances?

How to characterize hard instances ? As indicated above, the strategy looks like this :

• take a finite time bound b (which is a constant) ;

• compare ic^b(x : A) and K^b(x) ;

• if ic^b(x : A) ≈ K^{b·log b}(x), then the instance is b-hard and its instance complexity is maximal.

From this definition of hard instance, the following general conjecture has been made :

Conjecture 10.3.1 ([51]). If a set A does not belong to the complexity class DTIME(t(n)), then, for infinitely many strings x, the t-bounded instance complexity of x relative to A is equal to the t′-bounded Kolmogorov complexity, with t′ = O(t log t).

In 2000, this general conjecture had been neither proved nor refuted. However, as we shall see, other versions of the conjecture have been proved, as well as fruitful theorems.


10.3.4 The growth of instance complexity for problems not belonging to P

An interesting class is IC[log, poly]. It is composed of sets A for which there is a constant c and a polynomial p such that ic^p(x : A) ≤ c·log |x| for every instance x. In other words, even if the corresponding set is not in P, the instance complexity of its instances grows slowly.

The two following results further show that a slow growth of instance complexity is not to be expected for the instances of hard problems.

Theorem 10.3.3 ([51, §4]). Let A be a self-reducible set³. Then A is in IC[log, poly] if and only if A is in P.

Theorem 10.3.4 ([51, §4]). Under the hypothesis P ≠ NP, SAT is not in IC[log, poly].

To interpret these results, one should keep in mind that instance complexity is an "optimistic" measure because the best, possibly one-target, algorithm is used for each instance. The results show that, even in this case, the instance complexity grows more quickly than logarithmically.

10.3.5 Which problems do have hard instances?

10.3.5.1 The logic of the notion of hard instance

The notion of hardness for an instance has been defined above by describing how instance complexity can be maximal for a time bound b. Note that this bi-dimensional way to measure hardness is totally intrinsic and focused on instances, so the notion of hardness relative to a bound is completely well-defined. A minor worry is that, for any instance, there is a constant b for which ic^b(x : A) ≈ K^{b·log b}(x). So, if we want to determine which instances can be considered as really hard, one must find a way to say which temporal bound b is high and which is low⁴. Polynomial or exponential bounds, taken alone, are of no help for this since any number b can be seen as the value of a polynomial or exponential function. As a consequence, to choose polynomial bounds, and to describe an instance as having "polynomially hard instance complexity", one must re-introduce problems into the play.

Definition 10.3.1. A problem A has p-hard instances if, for every polynomial p, there exist a polynomial p′ and a constant c such that, for infinitely many instances x, ic^p(x : A) > K^{p′}(x) − c.

Note however that there is a crucial difference with the definitional use of the notion of problem that was made in the definitions of complexity cores or problem complexity : here the notion of problem is only used to create a label, and the complexity measure has already been well defined independently.

From this definition, a weaker conjecture than conjecture 10.3.1 can be made :

Conjecture 10.3.2 ([51]). Every problem A not in P has p-hard instances.

³ Roughly said, a set is self-reducible if instances of size n can be decided by using the answers for some instances of size smaller than n.

⁴ Similarly, once a notion of temperature has been well-defined, one may wish to find a way to say which temperature is high and which is low.


10.3.5.2 p-hard instances within NP-complete problems

As in the case of complexity cores, the emphasis is laid upon NP-complete problems because they are hard enough to be intractable and there seem to be physical problems that require solving such problems. Several theorems have been proved that show that NP-complete problems do have p-hard instances.

Theorem ([51]). If E ≠ NE, SAT has p-hard instances⁵.

A similar result exists for E-complete problems.

Theorem 10.3.5 ([23]). Every recursive tally set⁶ not in P has p-hard instances.

Since it has been shown that there is such a tally set in E − P, one finally gets :

Theorem 10.3.6. All E-complete problems have p-hard instances.

Fortnow and Kummer have also proved a significant theorem, which does not depend on the hypothesis that P ≠ NP.

Theorem 10.3.7 ([23]). Every recursive problem A ∉ P⁷ that is NP-hard relative to honest reductions⁸ has p-hard instances.

It is difficult to find a similar result for NP-complete problems because this seems to require a proof that P ≠ NP.

One can conclude that the notion of instance complexity does lay foundations for the notion of the hardness of an instance and shows that there are indeed hard instances within the hard problems that can be met in physics, and this is all I needed for my general argument.

10.3.6 Instance complexity, Kolmogorov complexity and the hardness of problems

In this section, I report results showing that recursive problems can be defined from instance complexity and that the difference between recursive and recursively enumerable problems can be characterized with this notion. I also emphasize that, with the notion of instance complexity, links between computational complexity and Kolmogorov complexity can be seen, which is real progress towards understanding what complexity is. This shows that this notion has some unificatory power and is correctly integrated within the web of definitions and results existing in complexity theory.

10.4 Instance complexity and opacity

In this section, I finally examine two pending questions about instance complexity, which are of interest in order to make clearer the characteristics of opacity.

⁵ E = ⋃_{c>0} DTIME(2^{cn+c}). NE is the corresponding non-deterministic class. For the classes EXP and NEXP, the exponent is polynomial rather than linear.

⁶ Strings of a tally set are included in {a}*.

⁷ NP-hard problems are at least as difficult as NP-complete problems. The theorem says that, even if P = NP, NP-hard problems not in P have p-hard instances.

⁸ A reduction is "honest" if it does not "shrink" instances excessively [2, p.61].


10.4.1 Instance complexity and proof verification

Just as we did with complexity cores, instance complexity can be used to check whether similar results hold for proof verification (and therefore for what I have called "secondary opacity"). I present results to the effect that the answer is positive, even if this side has not so far been extensively studied by computer scientists.

10.4.2 The opacity of instance complexity

I finally report results about how hard it is to compute instance complexity. Fortnow and Kummer show in particular that instance complexity is not a recursive function, even if it can be approximated. This confirms what we said about opacity after the analysis of complexity cores : it can be proved that some instances are difficult to solve, but it is extremely difficult to know for sure which instances are hard. As a consequence, the boundaries of science are well-defined but blurred. One cannot know easily what one cannot know easily.

10.5 Conclusion about instance complexity

In this conclusion, I summarize the claims that I have made in the previous chapters.

1. The notions of complexity cores and instance complexity can be used to lay foundations for the notion of intrinsic hardness of instances.

2. The notion of instance complexity shows that the hardness of instances should be seen as measured and described by the relationship between the size of the algorithms that one uses and the length of the computations made by these algorithms. In other words, instance complexity is a bidimensional measure, and the corresponding graph describes the map of possible progress — even if we are unable to draw this graph precisely. One can finally conclude that opacity does have an intrinsic and irreducible component, even if opacity is in practice often higher because of lack of knowledge.

3. The notion of instance complexity should also be seen as an "optimistic measure", since it is assumed that the best algorithms are used for each instance. One should bear in mind that algorithms that are both general and efficient are much more useful, and for this issue the usual theory of complexity, focused on problem complexity, is completely appropriate.

4. Opacity, as measured by instance complexity, is usually an opaque quantity in the sense that it is difficult to know its value. As a consequence, one cannot know easily what there is no hope of knowing easily. Scientists are therefore condemned to look for new efficient methods and to try to extend the boundaries of science, without knowing in which cases this is not possible.

5. The opacity of opacity is a problem for scientists, who try to extend the boundaries of science in particular cases. Philosophers, taking a more general (and comfortable) perspective, can conclude that scientists cannot always succeed in pushing back the boundaries of science if they do not also extend their computational resources.


Chapter 11

Epilogue : to what extent is Nature opaque?

Let us now take a more global perspective on the claims that have been made, in order to make clearer what their scope exactly is. I have argued in the previous chapters that some explanations and predictions are irreducibly hard to make. Therefore, even if we extend our computational resources, not every progress and extension of the domain of our science is possible. As a consequence, it is legitimate to consider that phenomena are more or less intrinsically opaque and that, depending on the resources that one possesses, some of them cannot be exactly predicted or explained. This state of affairs leaves a couple of questions open about how globally opaque Nature is and how much it is bound to resist our science. In order to make clearer what has been done and what remains to be done, I describe here these open questions.

1. What can be known, predicted or explained if one loosens the requirement about exactness in the solution?

2. What is the global opacity of the mathematical world ?

3. What is the global opacity of the physical world ?

4. What can Nature do exactly ?

5. To what extent is it possible to use Nature’s capacities to extend our science ?

What can be known, predicted or explained if one loosens the requirement about exactness?

In the previous chapters, I have focused on the cost of making exact explanations or predictions of phenomena (that is to say, sound deductions or exact solutions of problems, possibly relying on approximate or idealized models). This does not mean that I have considered only the explanation and prediction of the detailed "exact" behaviour of physical systems, since one can make exact predictions or explanations about, for example, average behaviours. For example, one can compute exactly the partition function of a gas and infer from this what the macroscopic average features of this gas must be.

Something different, which can be done, is to loosen the exactness requirement on the solution to be found or the deduction to be made — when this still leads to meaningful results. This can best be illustrated in the case of optimization problems. As we saw p.57, finding the lowest energy state of a spin glass is an NP-hard problem. Still, in the last two decades, physicists have shown how quasi-optimal solutions can be found at a relatively moderate cost by using techniques from statistical physics such as simulated annealing [36] or the replica method [44]. Computer scientists have also developed branches of complexity theory that deal with the cost of approximate solutions to problems [60].
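A minimal sketch of the simulated annealing idea on a toy spin-glass energy function (random couplings, Metropolis acceptance rule, geometric cooling; all parameters are arbitrary). Such a heuristic returns a low-energy configuration with no guarantee of optimality:

import random
import math

random.seed(0)
N = 30                                   # number of spins
# Random symmetric couplings J[i][j] of a toy spin-glass model.
J = [[0.0] * N for _ in range(N)]
for i in range(N):
    for j in range(i + 1, N):
        J[i][j] = J[j][i] = random.gauss(0, 1)

def energy(spins):
    return -sum(J[i][j] * spins[i] * spins[j]
                for i in range(N) for j in range(i + 1, N))

spins = [random.choice([-1, 1]) for _ in range(N)]
T = 5.0                                  # initial temperature
for step in range(20000):
    i = random.randrange(N)
    # Energy change if spin i is flipped (only terms involving i change).
    delta = 2 * spins[i] * sum(J[i][j] * spins[j] for j in range(N) if j != i)
    # Metropolis rule: always accept improvements, sometimes accept uphill moves.
    if delta <= 0 or random.random() < math.exp(-delta / T):
        spins[i] = -spins[i]
    T *= 0.9995                          # geometric cooling schedule

print("quasi-optimal energy found:", energy(spins))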

In consequence, even if some phenomena cannot be exactly predicted or explained, this by no means implies that they are absolutely opaque to us. The art of physicists is precisely to find techniques to sidestep such difficulties, work their way around them and manage to extract, one way or another, some information about the physical systems that they study.

What is the global opacity of the mathematical world ?

Open questions remain about the "topology" of complexity for problems. A famous conjecture is that P ≠ NP. Another conjecture is that Avg-P ≠ Dist-NP (Avg-P is the class corresponding to P when one uses average complexity and Dist-NP is the class corresponding to NP when one uses average complexity and only distributions that are computable in polynomial time are allowed). To put it bluntly, if Avg-P = Dist-NP, then hard instances are in fact very rare and problems in NP are on average easy to solve. A last conjecture is about the existence of one-way functions, that is to say functions that are easy to compute but hard to invert [66, §10.6.3]. From these different conjectures, different scenarios or worlds were described by Impagliazzo [34] :

• Algorithmica

In this world, P = NP, so "everything" can be solved easily. Every theorem has a short proof that can be found and checked quickly. So there is no significant difference between what I called primary and secondary opacity.

• Heuristica

In this world, P ≠ NP, so there are hard instances, but these instances are rare, so, in practice, Algorithmica and Heuristica are very similar.

• Pessiland

In this world, there are problems which are difficult on average, but secure cryptography is not possible because there are no one-way functions.

• Minicrypt and Cryptomania

In these worlds, there are one-way functions and secure cryptography is possible but, depending on the case, public-key cryptography is or is not available, and life is more or less beautiful for peeping hackers.

Computer scientists usually think the right world is Cryptomania, but this is still to be proved. On the answer to this question depends how difficult the problems that can be met when doing physics are.
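To make the notion of one-way function more concrete, here is the textbook candidate: multiplying two primes is one operation, while recovering them from their product by naive trial division takes a number of steps exponential in the length of the product. The routine and the chosen primes below are only meant to exhibit the asymmetry, not the state of the art in factoring:

# Candidate one-way function: multiplying two primes is easy,
# recovering them from the product seems hard.
p, q = 104729, 1299709          # two primes (the 10,000th and 100,000th)
n = p * q                       # forward direction: a single multiplication

def factor(n):
    """Naive trial division: exponential in the bit-length of n."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 1
    return n, 1

print(n)            # computed instantly
print(factor(n))    # takes ~100,000 divisions here, hopeless for 2048-bit moduli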


What is the global opacity of the physical world ?

Whatever the topology of complexity in the mathematical world, the opacity of phenomena in Nature also depends on how frequent hard instances are in Nature. There is no special reason to suppose either that hard instances are rare in problems but the Devil created the world so that we meet only hard instances, or that hard instances are frequent in problems but God made them rare in the actual world. Nevertheless, there may be empirical reasons why we meet or do not meet hard instances in empirical studies. Take for example protein folding. As we saw above, the problem of finding the most stable configuration of proteins (the minimum energy state) is NP-complete [24, 25]. Yet, there is still a chance that the most stable configurations of the proteins that can be found in actual living beings are easy to find. The reason is that, if a protein has many local optima, the system may take much longer to "find" the minimum energy state (the relaxation time is long) and may stay trapped in one of these local pitfalls. Organisms with such proteins would then have an adaptive disadvantage because the properties of the ill-folded proteins would be different. The hypothesis can be made that only proteins without local optima have been selected — and if a protein has no local optima, then our algorithms will not be trapped in them either when searching for the minimum energy state.
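The role of local optima can be illustrated with a toy one-dimensional "energy landscape"; the two functions below are arbitrary stand-ins for a funnel-shaped and a rugged landscape, and the greedy descent is a crude proxy both for physical relaxation and for our search algorithms:

import math

def greedy_descent(energy, x=0.0, step=0.01, iters=5000):
    """Move downhill as long as a neighbouring point has lower energy."""
    for _ in range(iters):
        left, right = x - step, x + step
        best = min((left, right, x), key=energy)
        if energy(best) >= energy(x):
            break                       # a local minimum has been reached
        x = best
    return x

funnel = lambda x: (x - 3.0) ** 2                          # single basin: easy
rugged = lambda x: (x - 3.0) ** 2 + 2.0 * math.sin(8 * x)  # many local minima

print("funnel landscape, minimum found near:", round(greedy_descent(funnel), 2))
print("rugged landscape, descent stops at:", round(greedy_descent(rugged), 2))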

What can Nature do exactly ?

As we saw above, when studying physical systems, we can happen to come across NP-complete problems. But does Nature really "solve" these problems ? In the case of spin glasses, how long does it take Nature to "find" the minimum energy state ? If this relaxation time is extremely long, then it is not legitimate to say that we need to solve NP-complete problems to predict or explain the actual phenomena that are observed in Nature, since the phenomena corresponding to this minimum energy state will not be observed. Whether Nature "solves" hard problems is a question that should be answered on the basis of empirical evidence¹.

To what extent is it possible to "tame" Nature to extend our science ?

Finally, even if Nature does "solve" extremely hard problems, this does not necessarily imply that the corresponding phenomena will remain forever outside the domain of science. The reason is that, just as we built computers to develop our computational power, there is the hope that we can "tame" the physical systems that "solve" hard problems in order to make them solve, for our own benefit, hard instances of hard problems.

Classical computers are equivalent, for complexity purposes, to Turing machines, but other types of computers may be developed in the future, such as quantum computers which, with n qubits, could perform 2^n computations simultaneously. In the same way, if proteins do solve NP-complete problems quickly, why could we not build protein computers by coding problem instances on linear proteins so that the 3D optimal configurations of the folded proteins give the solutions of these hard instances ?

In short, the more powerful Nature is, the more we can hope to develop our computational power by developing our knowledge of Nature and "taming" part of it for our own good by building powerful computers.

¹ See [1] for a review.


How far Nature can be tamed in this way is both a scientific and a technological question that philosophers can hardly answer. What philosophy can do, by relying on computer science, is to determine what can be within the domain of science if one possesses such or such computational resources, and to describe and analyse how science works and partly changes with these new computational resources.


Chapter 12

General conclusion

12.1 The end of human-scale epistemology

I started this thesis by noting that there is no doing without computers in modern science. Were computers dispensable in each case (by finding a very specific, possibly one-target, algorithm for each case), we would not always have the time to launch and complete the research required to find this algorithm in each case. When predicting the weather, one wants to use the same algorithm every day. For the purposes of an efficient science, we need general and efficient algorithms, which is the topic about which the part of complexity theory devoted to worst-case complexity gives results. As a consequence, we need a new epistemology in which the human mind is not directly the touchstone of every bit¹ of knowledge, even if it remains indirectly the touchstone of knowledge, since we need to calibrate our tools and in particular to prove that algorithms work correctly.

What has been shown is that this situation, which is a matter of fact, cannot be different. I have argued in part 1 that the most informative explanations are not the best ones — quite the contrary. The best explanations just derive the explanandum statement and no more. Good explanations indicate which statements (and facts) are independent of each other, which cannot be done when one brings as much information as possible about what was the case in the system under study. As a consequence, an unbounded inflation of the explanatory cost (with the increase of the explanatory quality and explanatory task) is not to be dreaded.

In part 2, I have claimed that making deductions or finding solutions to particular problems has an intrinsic and irreducible computational cost, which cannot be arbitrarily decreased by the development of our knowledge, even if this computational cost cannot always be easily measured. This has been done in two steps. First, I have shown that hard problems, which are met in physics, do have subsets of instances (the so-called complexity cores) that are homogeneously difficult. The problem is that these instances can still be solved quickly by ad hoc patched algorithms. To remedy this defect, the notion of instance complexity was brought into play. By relying on this bi-parameter complexity measure, it has finally been shown that individual instances can be intrinsically hard and that problems with high complexity do have such intrinsically hard instances, which can be quickly solved only by table look-up (that is to say, by using a pre-computed solution).

¹ Literally !


In short, there is no quick and honest method to solve these instances. These results are about primary opacity (the difficulty of finding and producing explanations), but similar results can be given about secondary opacity and the difficulty of verifying proofs or checking solutions. In consequence, even if we possess long explanations or predictions, surveying them and drawing the benefits from these scientific results can be hard. Further, when secondary opacity is high and shortest proofs² are extremely long, this means that, even if we are lucky when we build proofs (by choosing the right direction any time two paths are open), the proofs can be too long to be made.

In conclusion, the fact that opacity can be irreducibly high implies that there are intrinsic constraints on how easily we can push back the boundaries of science by extending our computational resources.

12.2 Relativity and non-relativity to models and theories

There is still a stone in my shoe. I have argued in part 1 that explanations are built out of theories and models ; as a consequence, opacity, which measures how difficult it is to produce these explanations, is also relative to the theories or models that are taken as starting points. Actually, this is something that I am completely happy with. If any statement could be chosen as the premisses of an explanation, then it would be simpler to choose as basic the statement to be explained itself. So explanations must be built on the statements that are designated as basic by theories or models. And opacity measures the distance between these statements and the statements to be explained.

At the same time, when intractability results are obtained about a physical model, such as the Ising spin model, and these results indicate that it is intrinsically difficult to explain or predict the behaviour described by this model, there is very little hope that, by modeling the phenomena to be explained or predicted differently, the computational cost can be significantly decreased in any way. Average, idealized or approximate easier explanations or predictions can perhaps be devised. But it is not possible to find a new model from which explanations or predictions equivalent to the original ones (as far as the derived statements are concerned) could easily be made. The reason is that, if such a model could be found, by translating the original model into the new one, it would be possible to solve quickly an intrinsically intractable problem, which is not possible.
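The structure of this argument is that of a reduction; it can be spelled out schematically as follows, where the assumption that both translations are computationally cheap is exactly what would have to be checked case by case:

% Schematic form of the argument; the polynomial bounds are assumptions.
Suppose the derivations $\mathrm{derive}_{M}(x)$ licensed by the original
model $M$ are intractable, and suppose a new model $M'$ allowed equivalent
statements to be derived in polynomial time. If the translation $\tau$ of
instances of $M$ into instances of $M'$, and the back-translation of the
derived statements, were themselves polynomial-time operations, then
\[
  \mathrm{derive}_{M}(x) \;=\; \tau^{-1}\bigl(\mathrm{derive}_{M'}(\tau(x))\bigr)
\]
would be a polynomial-time procedure for the original derivations,
contradicting their assumed intractability. Hence any such $M'$ must either
derive different statements or hide a comparable computational cost in the
translation itself.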

Of course, this argument is a very general one. In order to be sound, it needs to be adapted to particular situations involving particular complexity classes. Further, this is a global argument in the sense that it states that the opacity of an infinite set of phenomena cannot be decreased, even if the opacity of some particular phenomena can be decreased, since some statements that were originally considered as derived statements can be considered as basic in the new model or theory. As a consequence, the opacity of particular phenomena is relative to the theories or models that are used, but a theory or model change cannot create a collapse in the opacity of an infinite set of intrinsically complex phenomena. Theories do change, opacity mostly remains.

² Remember that primary opacity does not measure the size of the shortest proofs but the minimal cost of the construction of proofs or solutions of problems, made without any clue.


12.3 Models, models, models (and why this is so)

I have done my best in the previous chapters to keep away from discussions about models and how they mediate between theories and phenomena. The previous section offers an argument indicating why opacity can partly be emancipated from this relativity to models and theories. A second reason for keeping away from discussions about idealizations, approximations, additional hypotheses and the like was that the need to use such methods can now be seen as a consequence of the opacity of phenomena. To make this point clearer, I first need to present two visions of what the world is like (I shall rely on Nancy Cartwright's presentation [15] for this).

First, there is the lofty vision of science that is usually attributed to the Vienna Circle, in which phenomena are covered by particular sciences, the concepts and laws of which can be reduced to the laws and concepts of more fundamental domains till physics is reached. In this foundationalist perspective, deductivism prevails and science looks like something like figure 12.3.1.

Figure 12.3.1: The unity of science and the reign of deductivism (image taken from [15]), drawn by Rachel Hacking.

This vision of science is not in agreement with Cartwright's views. "How do we use theory to [...] model particular physical or socio-economic systems? How can we use the knowledge we have encoded in our theories to build a laser or to plan an economy ? The core idea of all standard answers is the deductive-nomological account. This is an account that serves the belief in the one great scientific system, a system of a small set of well co-ordinated first principles, admitting a simple and elegant formulation, from which everything that occurs [...] can be derived. But treatments of real systems are not deductive; nor are they approximately deductive; nor deductive with correction, nor plausibly approaching closer and closer to deductivity as our theories progress [...]." [15, p.9].

As a consequence, Cartwright favours positions in which models (and fields) are partly autonomous. Accordingly, she shares with Neurath a conception of science that rather looks like what is shown in figure 12.3.2. The balloons stand for different fields with their own central equations. The balloons can cooperate, their boundaries are flexible, but they have boundaries and there is no universal covering law from which everything can in principle be derived.


Figure 12.3.2: The image of science defended by Neurath (and Cartwright) (image taken from [15]), drawn by Rachel Hacking.

These claims (taken from The Dappled World) are in accordance with what is argued for in How the Laws of Physics Lie [13]. In this latter book, Cartwright claims that physical theories cannot represent reality. Only models can. These models are built by squeezing representations of physical systems into straitjackets ; more precisely, these models are built by combining, in quantum mechanics, typical Hamiltonians, like the harmonic oscillator, the square well or the hydrogen atom [13, §7.3, p.139].

When it comes to explaining why the situation is like this, Cartwright mentions the complexity of phenomena and the constraints of efficient collective research. "The phenomena to be described are endlessly complex. In order to pursue any collective research, a group must be able to delimit the kinds of models that are even contenders. If there were endlessly many possible ways for a particular research community to hook up phenomena with intellectual constructions, model building would be entirely chaotic, and there would be no consensus of shared problems on which to work." [13]. But things are also like this because we live in a world with a great variety of causes and natures — contrary to what the deductive-nomological fable says [14].

Overall, this vision of science gives a central part to models. This new perspective has been fruitfully adopted in the last decades by many authors [47] and it is now agreed upon that models are a key unit in scientific activity and do mediate between theories and phenomena.

Cartwright also argues from this state of affairs in favour of an anti-foundationalist position and describes the world as "dappled". This is however not the only way to interpret this situation.


For example, Paul Humphreys, who is more inclined towards realism about science³, has a much more moderate interpretation when he says : "It is the invention and deployment of tractable mathematics that drives much progress in the physical sciences. [...]" "There is a converse to this principle: Most scientific models are specifically tailored to fit, and hence are constrained by, the available mathematics." [33, p.55].

Figure 12.3.3: Sketch of the image of science, given the irreducible opacity of phenomena. Continuous lines stand for exact deductions ; lines with crosses stand for deductions made with approximations, idealizations, etc. Interrupted lines stand for "computational gaps".

I do not know which position is the right one. And, as a matter of fact, I do not want to know⁴. Results about the opacity of phenomena are compatible with both positions because they show that, even if the deductive-nomological fable is true and Nature is ruled by a few universal covering laws, science must look like what Cartwright says. More precisely, science does look like something like figure 12.3.3⁵.

The main difference between figure 12.3.3 and figure 12.3.2 is that I have explicitly added interrupted lines, which stand for what I call irreducible "computational gaps". Further, this description of science relies not only on a de facto description of science, but also on the intractability results that are given by computer science. Moreover, the existence of these computational gaps implies that, in order to sidestep intractability, more modeling is needed, with more additional hypotheses (possibly drawn from phenomenological regularities, which are less "computationally distant" from the phenomena), approximations, idealizations, and anything that can be helpful in order to sidestep intractability and develop, one way or another, our knowledge of physical systems. In conclusion, computer science can be helpful in order to determine which picture of science is the right one, but it also shows that one should be cautious about the philosophical consequences that must be drawn from this image of science, since even a world with universal covering laws would be pictured like this.

³ Humphreys defends "selective realism" in [33, p.82–85].

⁴ At least here !

⁵ Unfortunately for me, I had no Rachel Hacking to help me with this drawing.


Bibliography

[1] Scott Aaronson. NP-complete problems and physical reality. SIGACT News, Complexity Theory Column, March 2005.

[2] José Luis Balcázar, Josep Díaz, and Joaquim Gabarró. Structural Complexity. Springer Verlag, 1988.

[3] José Luis Balcázar, Josep Díaz, and Joaquim Gabarró. Structural Complexity II. Springer Verlag, 1990.

[4] Francisco Barahona. On the computational complexity of Ising spin glass. Journal of Physics A : Mathematical and General, 15:3241–3253, 1982.

[5] Robert Batterman. The Devil in the Details, Asymptotic Reasoning in Explanation, Reduction, and Emergence. Oxford University Press, 2002.

[6] Mark Bedau. Weak emergence. Philosophical Perspectives : Mind, Causation and World, 11:375–399, 1997.

[7] Mark Bedau. Downward causation and the autonomy of weak emergence. Principia, 6:5–50, 2003.

[8] Piotr Berman, Marek Karpinski, and Alexander D. Scott. Computational complexity of some restricted instances of 3-SAT. Discrete Applied Mathematics, 155:649–653, 2007.

[9] J. Bolyai. Appendix, scientiam spatii absolute veram exhibens. In Tentamen juventutem studiosam in elementa matheseos purae elementaris ac sublimioris, methodo intuitiva, evidentiaque huic propria introducendi (Wolfgang Bolyai). Maros Vasarhelyini : J. et S. Kali, 1832.

[10] George Boolos. Don't eliminate cut. Journal of Philosophical Logic, 13(4):373–378, 1984.

[11] Martin Campbell-Kelly and William Aspray. Computer : a history of the information machine. Oxford : Westview Press, second edition, 2004.

[12] Rudolf Carnap. Logical Foundations of Probability. University of Chicago Press, 1951.

[13] Nancy Cartwright. How the laws of physics lie. Clarendon Press, Oxford, 1983.

[14] Nancy Cartwright. Nature's Capacities and Their Measurement. Oxford University Press, 1989.

[15] Nancy Cartwright. The Dappled World. Cambridge University Press, 1999.

[16] Peter Cheeseman, Bob Kanefsky, and William M. Taylor. Where the really hard problems are. In Proceedings of the Twelfth International Joint Conference on Artificial Intelligence, 1991.

[17] J. Alberto Coffa. The Foundations of Inductive Explanations. University of Pittsburgh, 1973.

[18] James W. Cooley and John W. Tukey. An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation, 19(90):297–301, 1965.

[19] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms. MIT Press, second edition, 2001.

[20] Martin Davis, George Logemann, and Donald Loveland. A machine program for theorem proving. Communications of the ACM, 5(7):394–397, 1962.

[21] Martin Davis and Hilary Putnam. A computing procedure for quantification theory. Journal of the Association for Computing Machinery, 7:201–215, 1960.

[22] Richard Feynman. The Feynman Lectures on Physics. Addison-Wesley, sixth edition, 1977.

[23] Lance Fortnow and Martin Kummer. On resource-bounded instance complexity. Theoretical Computer Science, 161:123–140, 1996.

[24] Aviezri Fraenkel. Complexity of protein folding. Bulletin of Mathematical Biology, 55(6):1199–1210, 1993.

[25] Aviezri Fraenkel. Protein folding, spin glass and computational complexity. In Proceedings of the 3rd DIMACS Workshop on DNA Based Computers, held at the University of Pennsylvania, June 23–25, 1997.

[26] Michael R. Garey and David S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. A Series of Books in the Mathematical Sciences, Victor Klee (ed.). New York: W. H. Freeman, 1979.

[27] Robert Gilman, Alexei G. Miasnikov, Alexey D. Myasnikov, and Alexander Ushakov. Report on generic complexity. arXiv e-prints, arXiv:0707.1364v1, July 2007.

[28] Paul Grice. Logic and conversation. In Syntax and Semantics 3: Speech Acts, P. Cole and J. Morgan (eds.). New York: Academic Press, 1975.

[29] Yuri Gurevich. Average case completeness. Journal of Computer and System Sciences, 42:346–398, 1991.

[30] Yuri Gurevich. Average case complexity. In Proceedings of the 18th International Colloquium on Automata, Languages and Programming, volume 510 of Lecture Notes in Computer Science, pages 615–628, 1991.

[31] Johannes Hafner and Paolo Mancosu. The varieties of mathematical explanation. In P. Mancosu, K. Jørgensen, and S. Pedersen, editors, Visualization, Explanation and Reasoning Styles in Mathematics. Springer, 2005.

[32] Christopher Hitchcock. Discussion: Salmon on explanatory relevance. Philosophy of Science, 62, 1995.

[33] Paul Humphreys. Extending Ourselves: Computational Science, Empiricism, and Scientific Method. Oxford University Press, 2004.

[34] R. Impagliazzo. A personal view of average-case complexity. In Proceedings of the 10th IEEE Annual Conference on Structure in Complexity Theory, pages 134–147, 1995.

[35] Sorin Istrail. Statistical mechanics, three-dimensionality and NP-completeness. In Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing, pages 87–96. ACM, New York, USA, 2000.

[36] Scott Kirkpatrick, C. D. Gelatt, and M. P. Vecchi. Optimization by simulated annealing. Science, 1983.

[37] Philip Kitcher. Explanatory unification and the causal structure of the world. In Philip Kitcher and Wesley Salmon, editors, Scientific Explanation. University of Minnesota Press, 1989.

[38] Leonid A. Levin. Average case complete problems. SIAM Journal on Computing, 15:285–286, 1986.

[39] Nancy Lynch. On reducibility to complex or sparse sets. Journal of the Association for Computing Machinery, 22(3):341–345, 1975.

[40] J. Machta. Complexity, parallel computation and statistical physics. Complexity, 11(5):46–64, 2006.

[41] Paolo Mancosu. Explanation in mathematics. In The Stanford Encyclopedia of Philosophy, Summer 2008 edition, Edward N. Zalta (ed.), 2008.

[42] Per Martin-Löf. On the meaning of the logical constants and the justifications of the logical laws. Nordic Journal of Philosophical Logic, 1(1):11–60, 1996.

[43] Stephan Mertens. Computational complexity for physicists. Computing in Science and Engineering, 4(3):31–47, May 2002.

[44] Marc Mézard, Giorgio Parisi, and Miguel Angel Virasoro. Spin Glass Theory and Beyond. Singapore: World Scientific, 1987.

[45] A. Alan Middleton. Counting states and counting operations. In Alexander K. Hartmann and Heiko Rieger, editors, New Optimization Algorithms in Physics. Weinheim: Wiley-VCH, 2004.

[46] Rémi Monasson, Riccardo Zecchina, Scott Kirkpatrick, and Lidror Troyansky. Determining computational complexity from characteristic 'phase transitions'. Nature, 400, July 1999.

[47] Mary Morgan and Margaret Morrison. Models as Mediators. Cambridge University Press, 1999.

[48] Alexei G. Myasnikov. Generic complexity of undecidable problems. Lecture Notes in Computer Science, 4649:407–417, 2007.

[49] Isaac Newton. Isaac Newton's Philosophiae Naturalis Principia Mathematica (eds. Alexandre Koyré and Bernard I. Cohen). Cambridge University Press, 1972. Reprint of the 3rd edition of 1726, Londini, Apud Guil. et Joh. Innys.

[50] Timothy O'Connor and Hong Yu Wong. Emergent properties. In The Stanford Encyclopedia of Philosophy, Fall 2008 edition, Edward N. Zalta (ed.), http://plato.stanford.edu/archives/fall2008/entries/properties-emergent/.

[51] Pekka Orponen, Ker-I Ko, Uwe Schöning, and Osamu Watanabe. Instance complexity. Journal of the Association for Computing Machinery, 41(1):96–121, 1994.

[52] Pekka Orponen and Uwe Schöning. The structure of polynomial complexity cores (extended abstract). In Proceedings of Mathematical Foundations of Computer Science 1984, pages 452–458. Springer Verlag, 1984.

[53] Victor Pambuccian. Axiomatizations of hyperbolic and absolute geometries. In Non-Euclidean Geometries: János Bolyai Memorial Volume, pages 119–153, A. Prékopa and E. Molnár (eds.). Springer Berlin, 2006.

[54] Christos H. Papadimitriou. Computational Complexity. Addison-Wesley, 1994.

[55] Stephen B. Pope. Turbulent Flows. Cambridge University Press, 2000.

[56] A. Prékopa and E. Molnár, editors. Non-Euclidean Geometries: János Bolyai Memorial Volume. Springer Berlin, 2006.

[57] Peter Railton. Probability, explanation, information. Synthese, 48, 1981.

[58] Hans Reichenbach. Nomological Statements and Admissible Operations. Studies in Logic and the Foundations of Mathematics. Amsterdam: North Holland, 1954. Reprinted in 1976 as Laws, Modalities, and Counterfactuals, University of California Press.

[59] Michael Resnik and David Kushner. Explanation, independence, and realism in mathematics. British Journal for the Philosophy of Science, 38:141–158, 1987.

[60] Jean-François Rey. Calculabilité, complexité et approximation. Vuibert, 2004.

[61] Wesley Salmon. Statistical explanation. In W. Salmon, editor, Statistical Explanation and Statistical Relevance. Pittsburgh: University of Pittsburgh Press, 1971.

[62] Wesley Salmon. A third dogma of empiricism. In Robert Butts and Jaakko Hintikka, editors, Basic Problems in Methodology and Linguistics, pages 149–166. Dordrecht: D. Reidel Publishing Co., 1977.

[63] Wesley Salmon. Four decades of scientific explanation. In Scientific Explanation, W. Salmon and P. Kitcher (eds.), 1989. Reissued in 2006, Paul Humphreys (ed.), University of Pittsburgh Press.

[64] Uwe Schöning. Complexity cores and hard-to-prove formulas. Lecture Notes in Computer Science, 329:273–280, 1987.

[65] Uwe Schöning. Complexity cores and hard problem instances. In Proceedings of the International Symposium on Algorithms, volume 450 of Lecture Notes in Computer Science, pages 232–240. Springer Berlin, 1990.

[66] Michael Sipser. Introduction to the Theory of Computation. Thomson Course Technology, Boston, 2006.

[67] Mark Steiner. Mathematical explanation. Philosophical Studies, 34, 1978.

[68] Reginald P. Tewarson. Sparse Matrices. New York: Academic Press, 1973.

[69] Bas van Fraassen. The Scientific Image. Clarendon Press, Oxford, 1981.

[70] Jie Wang. Average-case computational complexity theory. In Complexity Theory Retrospective, pages 295–328. Springer, 1997.

[71] D. J. A. Welsh. The computational complexity of some classical problems from statistical physics. In G. Grimmett and D. J. A. Welsh, editors, Disorder in Physical Systems, pages 307–321. Clarendon Press, Oxford, 1990.