http://www.elsevier.com/locate/jcss
Journal of Computer and System Sciences 69 (2004) 68–94
Hardness amplification within NP
Ryan O'Donnell¹
Institute for Advanced Study, Princeton, NJ 08540, USA
Received 21 June 2002; revised 16 June 2003
Abstract
In this paper we investigate the following question: if NP is slightly hard on average, is it very hard on average? We give a positive answer: if there is a function in NP which is infinitely often balanced and $(1 - 1/\mathrm{poly}(n))$-hard for circuits of polynomial size, then there is a function in NP which is infinitely often $(\frac{1}{2} + n^{-1/2+\varepsilon})$-hard for circuits of polynomial size. Our proof technique is to generalize the Yao XOR Lemma, allowing us to characterize nearly tightly the hardness of a composite function $g(f(x_1), \ldots, f(x_k))$ in terms of: (i) the original hardness of $f$; and (ii) the expected bias of the function $g$ when subjected to random restrictions. The computational result we prove essentially matches an information-theoretic bound.
© 2004 Elsevier Inc. All rights reserved.
MSC: 68Q06
Keywords: Hardness amplification; NP; Monotone; XOR Lemma; Expected bias; Noise sensitivity
1. Introduction
Supposing that NP is hard for polynomial circuits, is it hard on average? That is, does it contain functions for which every polynomial-sized circuit outputs the wrong answer on close to half of all inputs? Let us make the following definition:
Definition 1. A boolean function $f : \{0,1\}^n \to \{0,1\}$ is $(1-\delta)$-hard for circuits of size $s$ if no circuit of size $s$ can correctly compute $f$ on a $1-\delta$ fraction of the inputs $\{0,1\}^n$.
E-mail address: [email protected].
¹ Work performed when the author was a student at the MIT Mathematics Department, partially supported by NSERC fellowship PGSA-221401-99.
0022-0000/$ - see front matter © 2004 Elsevier Inc. All rights reserved.
doi:10.1016/j.jcss.2004.01.001
Of course, we do not know whether there is a function in NP which is even $(1 - 2^{-n})$-hard for polynomial circuits, since this is precisely the NP vs. P/poly problem. However, under the reasonable assumption that NP is at least slightly hard in the above sense, it is of interest to know just how hard it really is. In particular, a fundamental question is whether there is a language in NP which is very hard on average—having hardness $1/2 + o(1)$, say.

There are many natural complexity classes for which it makes sense to investigate hardness on average; for example, P (with respect to circuits of a fixed polynomial size), NP, PSPACE, and EXP. This last case of EXP has received by far the most attention, motivated by the hardness-versus-randomness paradigm for derandomization of Nisan and Wigderson [17]. Nisan and Wigderson showed that the existence of languages in EXP that are very hard for circuits of subexponential size implies the existence of subexponential-time deterministic simulations of BPP. Hence there has been much interest in showing that there are very hard functions in EXP under the assumption that EXP is slightly hard; many extremely strong results along these lines are now known—see [2,10,11,21]. For example, it is known that if there is a language in EXP which is even $(1 - 2^{-n})$-hard for polynomial circuits, then there is a language which is $(1/2 + 1/\mathrm{poly}(n))$-hard for polynomial circuits. Producing a result of this nature for the class NP is the subject of this paper. It is unclear if there are direct derandomization applications of such a result; however, we believe the problem of the hardness of NP is of intrinsic interest.

A crucial ingredient in some of the hardness-on-average amplifications for EXP is the Yao XOR
Lemma [22]. This is a kind of "direct product theorem", which describes what happens to the hardness of a function $f$ when one takes many independent copies of it and applies another function, $g$, to the results. In Yao's case, $g = \mathrm{PARITY}$, and roughly speaking one gets that the resulting function $f \oplus f \oplus \cdots \oplus f$ is exponentially harder than $f$. Yao's XOR Lemma has proved to be a very important result in cryptography.

While the XOR Lemma is very useful for amplifying hardness within EXP, it does not shed any light on the hardness of NP. The reason is that $f \oplus f \oplus \cdots \oplus f$ is not necessarily in NP, even when $f$ is. In order to ensure the composite function is in NP, we must use a $g$ which is monotone (and also in NP). We are thus led to the question of whether there is a monotone function which amplifies hardness.

In this paper we investigate the direct product question in a general form:
Definition 2. Let $f : \{0,1\}^n \to \{0,1\}$ and $g : \{0,1\}^k \to \{0,1\}$ be boolean functions. Then we define $g \otimes f$ to be the boolean function $(\{0,1\}^n)^k \to \{0,1\}$ given by $(x_1, \ldots, x_k) \mapsto g(f(x_1), \ldots, f(x_k))$.
Question 1. Suppose $f : \{0,1\}^n \to \{0,1\}$ is a balanced boolean function which is $(1-\delta)$-hard for circuits of size $s$. Let $g : \{0,1\}^k \to \{0,1\}$ be any function. What is the hardness of the composite function $g \otimes f$?
By a balanced function we simply mean one which is 1 on exactly half of all inputs. We consider only balanced functions $f$ for technical reasons that will become clear later.

The tight answer we give to the above question is in terms of the expected bias of $g$, which we now define. Let $h : \{0,1\}^n \to \{0,1\}$ be a boolean function.
Definition 3. The bias of $h$ is $\mathrm{bias}(h) = \max\{\Pr[h = 0], \Pr[h = 1]\}$.
(Here and throughout, we view functions' domains as probability spaces with the uniform measure.)
Definition 4. A restriction $\rho$ is a mapping $[n] \to \{0, 1, *\}$; given a restriction $\rho$ and a boolean function $h$ on $n$ bits, we write $h_\rho$ for the subfunction of $h$ given by substituting in $\rho$'s values on each coordinate, where a $*$ indicates the coordinate is left "free". Given $\delta \in [0,1]$, we denote by $\mathcal{P}^n_\delta$ the probability space over all restrictions on $n$ coordinates in which a restriction is chosen by mapping each coordinate independently to $*$ with probability $\delta$, to 0 with probability $(1-\delta)/2$, and to 1 with probability $(1-\delta)/2$.
Our central definition:
Definition 5. The expected bias of $h$ at $\delta$ is
$$\mathrm{ExpBias}_\delta(h) = \mathop{\mathbf{E}}_{\rho \in \mathcal{P}^n_\delta}[\mathrm{bias}(h_\rho)].$$
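For small functions, the quantities in Definitions 3–5 can be computed exactly by enumerating all restrictions and all completions of the free coordinates. The sketch below is our own illustration, not part of the paper; the function and variable names are ours.

```python
from itertools import product

def exp_bias(h, k, delta):
    """ExpBias_delta(h): expected bias of h under a random restriction in
    which each coordinate is independently left free ('*') with probability
    delta, or fixed to 0 or 1 with probability (1-delta)/2 each."""
    total = 0.0
    for rho in product([0, 1, '*'], repeat=k):
        # probability of this restriction under P^k_delta
        p = 1.0
        for v in rho:
            p *= delta if v == '*' else (1.0 - delta) / 2.0
        # enumerate the subfunction h_rho over its free coordinates
        free = [i for i in range(k) if rho[i] == '*']
        outs = []
        for bits in product([0, 1], repeat=len(free)):
            x = list(rho)
            for i, b in zip(free, bits):
                x[i] = b
            outs.append(h(tuple(x)))
        p1 = sum(outs) / len(outs)
        total += p * max(p1, 1.0 - p1)   # bias(h_rho)
    return total

parity3 = lambda x: x[0] ^ x[1] ^ x[2]
print(exp_bias(parity3, 3, 0.4))   # 1/2 + (1/2)(1 - 0.4)^3, about 0.608
```

At $\delta = 0$ every restriction fixes $h$ completely and the expected bias is 1; at $\delta = 1$ it degenerates to $\mathrm{bias}(h)$ itself.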
With this definition in place, we can now answer Question 1:
Answer 1. Roughly speaking, the hardness of $g \otimes f$ is (at most) $\mathrm{ExpBias}_{2\delta}(g)$. To state our hardness theorem exactly:
Theorem 1. For every $\rho > 0$, the function $g \otimes f$ is $(\mathrm{ExpBias}_{(2-\rho)\delta}(g) + \varepsilon)$-hard for circuits of size $s' = O\!\left(\frac{\varepsilon^2}{\log(1/\delta)\,k}\right) s$.
In Section 2 we explain from an intuitive, information-theoretic perspective why Answer 1 should be the correct answer; then in Section 3 we transfer these ideas to the circuit setting and give the computational proof of Theorem 1.

Note that a form of Yao's XOR Lemma follows as a corollary; since $\mathrm{ExpBias}_\delta(\mathrm{PARITY}_k)$ is easily calculated to be $\frac{1}{2} + \frac{1}{2}(1-\delta)^k$, we get (taking $\rho = 0.01$):
Corollary 1 (Yao [22]). If $f$ is a balanced boolean function which is $(1-\delta)$-hard for circuits of size $s$, then $f \oplus f \oplus \cdots \oplus f$ ($k$ times) is $(\frac{1}{2} + \frac{1}{2}(1 - 1.99\delta)^k + \varepsilon)$-hard for circuits of size $O\!\left(\frac{\varepsilon^2}{\log(1/\delta)\,k}\right) s$.
Quantitatively, this is close to the best versions of Yao's XOR Lemma known (see [8]); the main weakening is the assumption that $f$ is balanced. Levin's proof also improves the hardness all the way to $\frac{1}{2} + \frac{1}{2}(1 - 2\delta)^k + \varepsilon$.
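The closed form for $\mathrm{ExpBias}_\delta(\mathrm{PARITY}_k)$ used in Corollary 1 has a one-line derivation: a random restriction determines the parity if and only if it stars no coordinate, and otherwise the surviving subfunction is perfectly balanced. A quick sketch of this calculation (ours, with hypothetical helper names):

```python
def expbias_parity(k, delta):
    """ExpBias_delta(PARITY_k), via the observation that a restriction fixes
    PARITY_k iff it stars no coordinate (probability (1-delta)^k, bias 1);
    any surviving free coordinate leaves a balanced subfunction (bias 1/2)."""
    p_const = (1.0 - delta) ** k
    return p_const * 1.0 + (1.0 - p_const) * 0.5  # = 1/2 + (1/2)(1-delta)^k
```

Evaluating this at noise rate $1.99\delta$ gives exactly the hardness bound quoted in Corollary 1.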
If we want to do hardness amplification within NP, we are led to look for monotone functions $g$ such that $\mathrm{ExpBias}_{2\delta}(g)$ is much smaller than $1-\delta$. Qualitatively, we want a monotone function which: (i) is nearly balanced, and (ii) remains close to balanced even when subjected to random restrictions. For technical reasons, it turns out that functions for which each input variable has
small "influence" do well. Thus, we are led to consider a classic pair of monotone functions introduced by Ben-Or and Linial [5] which have this property.

The first of these is the "recursive majority of 3" function. This function does especially well when $\delta$ is very small, and can take $(1 - 1/\mathrm{poly}(n))$-hardness to $(1/2 + n^{-\alpha})$-hardness, for a certain small $\alpha > 0$. The second function is the "tribes" function, and it gets very close to $1/2$ when $\delta$ starts out fairly large. By first getting hardness down to $1/2 + o(1)$ via recursive majority, and then applying the tribes function, we get the main theorem:
Theorem 2. If there is a function in NP which is infinitely often balanced and $(1 - 1/\mathrm{poly}(n))$-hard for circuits of polynomial size, then there is a function in NP which is infinitely often $(\frac{1}{2} + n^{-1/2+\varepsilon})$-hard for circuits of polynomial size.²
The tribes function is nearly optimal given our techniques; as we describe at the end of Section 7, insisting that the function used in Theorem 1 be monotone prevents the proof of any hardness bound better than $1/2 + O(\log^2 n / n)$. Note that this is in contrast to the case when one uses the PARITY function as an amplifier—i.e., in the case of Yao's XOR Lemma. There the issue preventing final hardness $1/2 + n^{-\omega(1)}$ is the effect of the quantity $\varepsilon$ on the size of the circuits for which the amplified function is hard.

Finally, let us remark that the assumption in Theorem 2 that the initial hard function is balanced may be removed at the expense of a small loss in final hardness:
Theorem 3. If there is a function in NP which is infinitely often $(1 - 1/\mathrm{poly}(n))$-hard for circuits of polynomial size, then there is a function in NP which is infinitely often $(\frac{1}{2} + n^{-1/3+\varepsilon})$-hard for circuits of polynomial size.³
1.1. Organization of the paper
In Section 2 we give an intuitive discussion explaining why Answer 1 is the correct hardness bound; then in Section 3 we give the computational proof of Theorem 1. Section 4 discusses another property of boolean functions called noise stability, which we relate to expected bias. In Section 5 we study the expected bias of the majority function and conclude that it is not a particularly good hardness amplifier. Section 6 contains an estimation of the noise stability of the recursive majority of 3 function, which allows us to prove a theorem reducing $1 - 1/\mathrm{poly}(n)$ hardness to $1/2 + o(1)$ hardness under the assumption that the initial hard function is balanced. Section 7 investigates the tribes function and completes the proof of Theorem 2, reducing hardness to $1/2 + n^{-1/2+\varepsilon}$. At the end of that section, we explain why we have essentially exhausted the limits of Theorem 1's applicability to monotone functions. Finally, Section 8 is devoted to proving Theorem 3.
² Specifically, each input length on which the second function is defined corresponds to an input length on which the first function is defined. Whenever the first function is both balanced and hard on an input length, the second function has the claimed hardness on its corresponding input length.
³ In contrast to Theorem 2, the second function in this theorem may not be hard on all input lengths for which it is defined, even if the first function is hard for all input lengths.
Appendix A contains a technical proof omitted from the proof of Theorem 1 in Section 3. Appendix B gives constructions showing that Theorem 1 is essentially tight.
1.2. Related work
An important earlier work on which we rely is the "hard-core set" paper of Impagliazzo [10]. Impagliazzo's paper shows that for any function $f$ which is $(1-\delta)$-hard, there is a $(2 - o(1))\delta$-fraction of inputs on which $f$ is extremely hard—essentially unpredictable; whereas, off this hard-core set, $f$ may be very easy. With this result in hand, it is possible to formalize the information-theoretic reasoning behind hardness amplification. For example, Impagliazzo used his result to give a very simple and intuitive proof of the Yao XOR Lemma. Similarly, we prove Theorem 1 by applying Impagliazzo's construction, and then formalizing the intuition outlined in Section 2.

As we will see, the expected bias of a function is closely related to another one of its intrinsic properties, namely its sensitivity to noise in the input. Quantities related to these have previously been studied in [4,3]. The former is an extensive work by Benjamini, Kalai, and Schramm on the noise sensitivity and stability of boolean functions. It relates noise sensitivity to Fourier coefficients, proving a formula similar to our Proposition 9. It also gives some mention to the recursive majority of 3 function and the tribes function, which we treat in more detail in Section 6.

The second work mentioned is a learning theory paper by Bshouty, Jackson, and Tamon, which has a similar spirit to the present paper. They define a quantity called "noisy distance", which is much like expected bias or noise stability, and they give lower bounds for learning under attribute noise in terms of this quantity. They also prove a result much like our Proposition 7, and their main theorem is proved via a hybrid argument with some of the same flavor as Theorem 1's proof.

Subsequent to the present work, Elchanan Mossel and the author gave a fairly complete study [16] of the noise sensitivity of monotone functions.

Fourier analysis often greatly aids the study of structural properties of boolean functions, and indeed spectral considerations lie hidden below much of the work in this paper, arising explicitly only in the analysis of the noise stability of the tribes function. Relevant work on the Fourier analysis of boolean functions—especially monotone ones—appears in [6,12,15]. The two monotone functions we use to amplify hardness were both introduced in [5], as examples of functions in which individual variables have small influence.

The Yao XOR Lemma, and direct product lemmas in general, are studied in [8] and [20], respectively. The constructions of [20] are used heavily in Appendix B. Other papers which describe what boolean functions can or cannot do based on spectral properties include [9,14].
2. Intuition for Theorem 1
Roughly speaking, Yao's XOR Lemma says that if $f$ is $(1-\delta)$-hard, then $f \oplus f \oplus \cdots \oplus f$ ($k$ times) is $(\frac{1}{2} + \frac{1}{2}(1-\delta)^k)$-hard. The intuition is that in order to calculate $f(x_1) \oplus \cdots \oplus f(x_k)$, one must correctly calculate each $f(x_i)$—if any are missed, one might as well guess at the final answer. Since $f$ is $(1-\delta)$-hard and $x_1, \ldots, x_k$ are independent, we should only get advantage $(1-\delta)^k$, hence hardness $\frac{1}{2} + \frac{1}{2}(1-\delta)^k$.
To see the intuition behind Answer 1 we have to think more carefully about the issue of "knowing" whether or not $f(x_i)$ was calculated correctly.

Suppose that $x_1, \ldots, x_k$ are picked at random, and our task is to compute $(g \otimes f)(x_1, \ldots, x_k) = g(f(x_1), \ldots, f(x_k))$, where $f$ is a balanced function which is $(1-\delta)$-hard. We would like to abstract away the hard function $f$ and consider instead the problem of calculating $g$ with imperfect information. Let us call $y_i = f(x_i)$, $i = 1 \ldots k$, the "true" inputs to $g$. Since $f$ is balanced and the $x_i$'s are independent, it is as if the true inputs were also chosen independently and uniformly from $\{0,1\}^k$. Let us say that the hardness of $f$ obscures our view of what the true inputs are, and we see only "corrupted" inputs $z_1, \ldots, z_k$, which match the true inputs only with probability $1-\delta$. Our goal is to guess $g(y_1, \ldots, y_k)$ as best we can given that we see only $z_1, \ldots, z_k$.

One way to model the hardness of $f$ would simply be to say that $z_i = y_i$ independently with probability $1-\delta$. In this case, our best strategy would be a maximum likelihood one, involving taking a weighted majority of the values of $g$, with the most weight being given to points near $(z_1, \ldots, z_k)$ in Hamming distance. However, it is possible we might actually have more information than this model allows—one could imagine that for certain recognizable inputs $x_i$ it is very easy to calculate $f(x_i)$, and we can be sure our $z_i$ is correct. Indeed, it is possible that for a large fraction of the inputs, we not only know the answer, but we know we know. Taking this to its extreme, we come to the most helpful way in which the true inputs could be obscured:
For each $i$, with probability $1-2\delta$, $z_i$ is the correct value of $y_i$ and furthermore we know that this corrupted input is correct; with probability $2\delta$, $z_i$ is set to a random bit and we know that this corrupted input is completely uncorrelated with $y_i$.
Definition 6. We call the above situation the Hard-Core scenario (cf. [10]).
Notice that in the Hard-Core scenario, each bit $z_i$ is still correct with probability $1-\delta$. To see that this arrangement is more useful to us than the first one, simply note that we can simulate the information we had in the first scenario by ignoring the extra knowledge about whether each $z_i$ is correct or random.

The Hard-Core scenario in some sense gives us the most information—i.e., it models $f$ as being $(1-\delta)$-hard in the easiest possible way. The optimal strategy for guessing $g$ in this scenario is clear. We look at the restricted version of the function $g$ obtained when the $z_i$'s which are known to be correct are plugged in. Since the remaining inputs are completely unknown, but also completely random, we should guess according to the bias of the restricted function; i.e., if the function is biased towards 0, guess 0; if it is biased towards 1, guess 1.

The probability that this strategy is correct is exactly the expected bias of $g_\rho$ over random restrictions $\rho$ with $*$-probability $2\delta$—i.e., $\mathrm{ExpBias}_{2\delta}(g)$. This is essentially the hardness we prove for $g \otimes f$ in Theorem 1.
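The identity between the bias-guessing strategy's success probability and $\mathrm{ExpBias}_{2\delta}(g)$ can be checked exhaustively for small $g$. The sketch below is our own illustration (the names are ours): it averages directly over which coordinates are revealed, their revealed values, and the hidden, uniformly random true bits.

```python
from itertools import product
from math import prod

def scenario_success(g, k, delta):
    """Exact success probability of the bias-guessing strategy in the
    Hard-Core scenario: each coordinate is revealed with probability
    1 - 2*delta; the rest are uniformly random and known to be so."""
    total = 0.0
    for known in product([True, False], repeat=k):
        p_mask = prod((1 - 2 * delta) if kn else 2 * delta for kn in known)
        fixed = [i for i in range(k) if known[i]]
        free = [i for i in range(k) if not known[i]]
        for vals in product([0, 1], repeat=len(fixed)):
            # enumerate g over the hidden coordinates; the revealed values
            # define the restricted subfunction g_rho
            outs = []
            for bits in product([0, 1], repeat=len(free)):
                y = [0] * k
                for i, v in zip(fixed, vals):
                    y[i] = v
                for i, b in zip(free, bits):
                    y[i] = b
                outs.append(g(tuple(y)))
            p1 = sum(outs) / len(outs)
            # guessing the majority value of g_rho succeeds w.p. bias(g_rho)
            total += p_mask * 2.0 ** -len(fixed) * max(p1, 1.0 - p1)
    return total

parity3 = lambda x: x[0] ^ x[1] ^ x[2]
# matches ExpBias_{2*delta}(PARITY_3) = 1/2 + (1/2)(1 - 2*delta)^3
```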
3. The hardness theorem
In this section we give the computational proof of Theorem 1, modeled after the intuitive discussion of the previous section. Our main tool is the hard-core set theorem of Impagliazzo [10].
Roughly speaking, this theorem says that in the computational context of circuits, every $(1-\delta)$-hard function conforms to the Hard-Core scenario described in Section 2. That is, on a $2\delta$-fraction of its inputs the function is nearly impossible for small circuits, whereas on the remainder of its inputs it may be very easy.

Let us state Impagliazzo's theorem from [10] in a slightly improved form due to Klivans and Servedio [13,19]:
Theorem 4. Let $f : \{0,1\}^n \to \{0,1\}$ be $(1-\delta)$-hard for size $s$, and let $\rho > 0$ be any constant. Then $f$ has a "hard-core" $S \subseteq \{0,1\}^n$ of size between $(2-\rho)\delta 2^n$ and $(2-\rho/2)\delta 2^n$ on which $f$ is $(1/2+\varepsilon)$-hard for size $O\!\left(\frac{\varepsilon^2}{\log(1/\delta)}\right) s$.
We will need an easy corollary which lets us assume $f$ is balanced on its hard-core:
Corollary 2. Let $f$ and $\rho$ be as in Theorem 4. Then $f$ has a hard-core $S \subseteq \{0,1\}^n$ of size between $(2-\rho)\delta 2^n$ and $2\delta 2^n$ on which $f$ is $(1/2+\varepsilon)$-hard for size $O\!\left(\frac{\varepsilon^2}{\log(1/\delta)}\right) s$ and on which $f$ is balanced.
Proof. Given $f$, $\rho$, and $\varepsilon$, apply Theorem 4 with $\varepsilon/2$, producing a hard-core $S'$. We would like to add a small number of strings to $S'$ to make it balanced. Suppose without loss of generality that $f$ is biased towards 1 on $S'$. Since $f$ is $(1/2+\varepsilon/2)$-hard on $S'$, it is 1 on at most $(1/2+\varepsilon/2)|S'|$ strings in $S'$. Since $|S'| \le (2-\rho/2)\delta 2^n$ and we may assume without loss of generality that $\varepsilon < \rho/4$, the total number of 1's $f$ has in $S'$ is at most $(1/2+\varepsilon/2)(2-2\varepsilon)\delta 2^n \le \delta 2^n$.

On the other hand, since $f$ is $(1-\delta)$-hard on $\{0,1\}^n$, $f$ must take on the value 0 for at least $\delta 2^n$ strings in $\{0,1\}^n$. It follows that there is a set $S$ with $S' \subseteq S \subseteq \{0,1\}^n$ on which $f$ is balanced. This set has size at most $2\delta 2^n$.

It remains to show that $f$ is $(1/2+\varepsilon)$-hard on $S$. We know that no circuit (of size $O(\frac{\varepsilon^2}{\log(1/\delta)}) s$) gets $f$ right on $S'$ with probability more than $1/2+\varepsilon/2$, and no circuit gets $f$ right on $S \setminus S'$ with probability more than 1. We know that $|S \setminus S'| = \mathrm{bias}(f|_{S'})|S'| - (1-\mathrm{bias}(f|_{S'}))|S'| = (2\,\mathrm{bias}(f|_{S'}) - 1)|S'| \le \varepsilon|S'|$. It follows that no circuit gets $f$ right on $S$ with probability more than:
$$\begin{aligned}
(1/2+\varepsilon/2)\frac{|S'|}{|S|} + \frac{|S \setminus S'|}{|S|} &= (1/2+\varepsilon/2)\frac{|S| - |S \setminus S'|}{|S|} + \frac{|S \setminus S'|}{|S|}\\
&= 1/2+\varepsilon/2 + (1/2-\varepsilon/2)\frac{|S \setminus S'|}{|S|}\\
&\le 1/2+\varepsilon/2 + (1/2-\varepsilon/2)\varepsilon \le 1/2+\varepsilon,
\end{aligned}$$
as needed. □
With Impagliazzo's theorem formalizing the reason for which $f$ is hard, there is only one more ingredient we need to complete the proof along the lines outlined in Section 2. We need to show that if all of a function's inputs look completely random, then it is impossible to guess its value
with probability better than its bias. In particular, suppose we have a boolean function $h : \{0,1\}^k \to \{0,1\}$ and a randomized procedure which on each input in $\{0,1\}^k$ tries to guess the value of $h$; say that on input $y \in \{0,1\}^k$ it guesses 0 with probability $p(y)$. If the success probability of the random procedure exceeds $\mathrm{bias}(h)$, then a hybrid argument of sorts shows that the inputs in $\{0,1\}^k$ are not completely indistinguishable to the random procedure; i.e., there must be two adjacent inputs on which the procedure acts in noticeably different fashions.
Lemma 5. Let $B_k$ denote the $k$-dimensional Hamming cube, and suppose $h : B_k \to \{0,1\}$ and $p : B_k \to [0,1]$. Further suppose that
$$2^{-k}\left[\sum_{y \in h^{-1}(0)} p(y) + \sum_{y \in h^{-1}(1)} (1 - p(y))\right] \qquad (*)$$
is at least $\mathrm{bias}(h) + \varepsilon$. Then there exists an edge $(z, z')$ in $B_k$ such that $|p(z) - p(z')| \ge \Omega(\varepsilon/\sqrt{k})$.
Proof. Note that $(*)$ represents the probability with which $p$ correctly guesses $h$ under the uniform distribution.

Getting the lower bound $\Omega(\varepsilon/\sqrt{k})$ is a little tricky, so we defer the proof to Appendix A. For now we just prove a lower bound of $\varepsilon/k$. Note that using this weaker lower bound in the proof of Theorem 1 makes no qualitative difference, and we could still prove Theorem 2 just as well.

Without loss of generality, suppose $h$ is biased towards 0, and let $b = \mathrm{bias}(h) = \Pr[h = 0] \ge 1/2$. By way of contradiction, assume $|p(z) - p(z')| < \varepsilon/k$ for all edges $(z, z')$ in $B_k$. Let $M$ and $m$ be the maximum and minimum values of $p$ on $B_k$. Since any pair of points in the cube is at Hamming distance at most $k$, it follows that $M - m < \varepsilon$. Now $(*)$ would be maximized if $p(y) = M$ for all $y \in h^{-1}(0)$ and $p(y) = m$ for all $y \in h^{-1}(1)$. Hence:
$$\begin{aligned}
(*) &\le bM + (1-b)(1-m)\\
&= (M + m - 1)b + 1 - m\\
&\le (2M - 1)b + 1 - m + M - M\\
&< (2M - 1)b + 1 - M + \varepsilon\\
&= (1 - M) - (2 - 2M)b + b + \varepsilon\\
&\le b + \varepsilon,
\end{aligned}$$
since $b \ge 1/2$. This contradiction completes the proof. □
Now we prove the hardness theorem. We will actually prove a slightly stronger statement than the one given in Section 1—for technical reasons we want the theorem to hold under the assumption that the hard function $f$ is only nearly balanced, not necessarily exactly balanced.
Theorem 1. Let $f : \{0,1\}^n \to \{0,1\}$ be a function which is $(1-\delta)$-hard for size $s$. Let $g : \{0,1\}^k \to \{0,1\}$ be any function. Let $\varepsilon > 0$ be any parameter, and assume that $\mathrm{bias}(f) \le 1/2 + (1-2\delta)\varepsilon/4k$. Then for every $\rho > 0$, $g \otimes f$ is $(\mathrm{ExpBias}_{(2-\rho)\delta}(g) + \varepsilon)$-hard for circuits of size $s' = O\!\left(\frac{\varepsilon^2}{\log(1/\delta)\,k}\right) s$.
Proof. Let $S$ be the hard-core given by Corollary 2, using parameters $\delta$, $\rho$, and $\varepsilon' = (c/8)\varepsilon/\sqrt{k}$, where $c$ is the constant hidden in the $\Omega(\cdot)$ of Lemma 5. Then $f$ is $(1/2+\varepsilon')$-hard on $S$ for circuits of size $s'$. Furthermore, we have $\mathrm{bias}(f|_S) = 1/2$, and because $|S| \le 2\delta 2^n$ and $\mathrm{bias}(f) \le 1/2 + (1-2\delta)\varepsilon/4k$, we get $\mathrm{bias}(f|_{S^c}) \le 1/2 + \varepsilon/4k$.

Let $\eta = |S|/2^n \ge (2-\rho)\delta$, and let $E = \mathrm{ExpBias}_\eta(g)$. It is easy to see that $\mathrm{ExpBias}_\delta(h)$ is a non-increasing function of $\delta$ for any $h$—simply note that in the guessing game of Section 2, the more information the optimal strategy has, the better. Hence $E \le \mathrm{ExpBias}_{(2-\rho)\delta}(g)$.

Suppose by way of contradiction that $C$ is a circuit of size $s'$ computing $g \otimes f$ on an $(\mathrm{ExpBias}_{(2-\rho)\delta}(g) + \varepsilon) \ge E + \varepsilon$ fraction of the inputs in $(\{0,1\}^n)^k$. For a given restriction $\rho \in \mathcal{P}^k_\eta$, let us say that an input $(x_1, \ldots, x_k)$ matches $\rho$ when for all $i$ we have $x_i \in S \Rightarrow \rho(i) = *$ and $x_i \in S^c \Rightarrow \rho(i) = f(x_i)$. Note that the probability an input matches $\rho$ is very nearly $\rho$'s natural probability under $\mathcal{P}^k_\eta$. In particular, the probability that $x_i \in S$ is exactly $\eta$, which is the correct $*$ probability; the probability that $f(x_i) = 1$ given that $x_i \in S^c$ is at most $1/2 + \varepsilon/4k$ because $\mathrm{bias}(f|_{S^c}) \le 1/2 + \varepsilon/4k$, and a similar statement holds for $f(x_i) = 0$. Hence $\Pr[(x_1, \ldots, x_k) \text{ matches } \rho] \le (1/2+\varepsilon/4k)^k/(1/2)^k \cdot \Pr[\rho] = (1+\varepsilon/2k)^k \Pr[\rho] \le (1+(2/3)\varepsilon)\Pr[\rho]$ for every $\rho$.

Let $c_\rho$ be the probability that $C$ correctly computes $(g \otimes f)(x_1, \ldots, x_k)$ conditioned on the event that $(x_1, \ldots, x_k)$ matches $\rho$. We have:
$$\begin{aligned}
&\Pr[C \text{ correct}] \ge E + \varepsilon\\
\Rightarrow\ & \textstyle\sum_\rho \Pr[(x_1, \ldots, x_k) \text{ matches } \rho]\, c_\rho \ge \sum_\rho \Pr[\rho]\, \mathrm{bias}(g_\rho) + \varepsilon\\
\Rightarrow\ & \textstyle\sum_\rho (1 + (2/3)\varepsilon) \Pr[\rho]\, c_\rho \ge \sum_\rho \Pr[\rho]\, \mathrm{bias}(g_\rho) + \varepsilon\\
\Rightarrow\ & \textstyle\sum_\rho \Pr[\rho]\, c_\rho + (2/3)\varepsilon \ge \sum_\rho \Pr[\rho]\, \mathrm{bias}(g_\rho) + \varepsilon,
\end{aligned}$$
since $c_\rho \le 1$ for all $\rho$, and $\sum_\rho \Pr[\rho] = 1$. It follows that there must exist a restriction $\rho$ such that
$c_\rho \ge \mathrm{bias}(g_\rho) + \varepsilon/4$.

By reordering the $k$ length-$n$ inputs to $C$ we may assume $\rho$ is of the form $(\underbrace{*, \ldots, *}_{k' \text{ times}}, b_{k'+1}, \ldots, b_k)$ for some $k' < k$, $b_i \in \{0,1\}$. An averaging argument tells us there exist particular $x^*_{k'+1}, \ldots, x^*_k \in \{0,1\}^n$ with $f(x^*_i) = b_i$ such that $C$ correctly computes $g_\rho \otimes f$ with probability at least $c_\rho$ when the first $k'$ inputs are drawn independently and uniformly from $S$, and the last $k - k'$ inputs are hardwired to the $x^*$'s.

Let $C'$ be the size-$s'$ circuit given by hardwiring in the $x^*$'s. So $C'$ is a circuit taking $k'$ strings in $\{0,1\}^n$, and it correctly computes $g_\rho \otimes f$ with probability at least $c_\rho$ when its inputs $x_1, \ldots, x_{k'}$ are drawn independently and uniformly from $S$. From now on, whenever we speak of probabilities with respect to $C'$ we mean over uniformly random inputs drawn from $S$ (not all of $\{0,1\}^n$).

For each $y = (y_1, \ldots, y_{k'}) \in \{0,1\}^{k'}$, let $p(y)$ be the probability that $C'(x_1, \ldots, x_{k'}) = 0$ conditioned on the event that $y_i = f(x_i)$ for all $i = 1 \ldots k'$. Since $f$ is balanced on $S$, each of these events is equally likely. It follows that the correctness probability of $C'$ is exactly:
$$2^{-k'}\left[\sum_{y \in g_\rho^{-1}(0)} p(y) + \sum_{y \in g_\rho^{-1}(1)} (1 - p(y))\right].$$
Since this quantity is at least $c_\rho \ge \mathrm{bias}(g_\rho) + \varepsilon/4$, Lemma 5 tells us that there is an edge $(z, z')$ in $B_{k'}$ such that $|p(z) - p(z')| \ge c(\varepsilon/4)/\sqrt{k'} \ge (c/4)\varepsilon/\sqrt{k} = 2\varepsilon'$.
Reorder inputs again so that we may assume $z = 0u$, $z' = 1u$ for some string $u \in \{0,1\}^{k'-1}$. Again, an averaging argument tells us that there exist $x^*_2, \ldots, x^*_{k'}$ with $(f(x^*_2), \ldots, f(x^*_{k'})) = u$ such that
$$\Pr_{x_1 \in (f|_S)^{-1}(0)}[C'(x_1, x^*_2, \ldots, x^*_{k'}) = 0] - \Pr_{x_1 \in (f|_S)^{-1}(1)}[C'(x_1, x^*_2, \ldots, x^*_{k'}) = 0] \ge 2\varepsilon'.$$
Now let $C''$ be the size-$s'$ circuit given by hardwiring in the new $x^*$'s. Then we have:
$$\Pr_{x \in (f|_S)^{-1}(0)}[C''(x) = 0] - \Pr_{x \in (f|_S)^{-1}(1)}[C''(x) = 0] \ge 2\varepsilon'.$$
But since $f$ is balanced on $S$, we immediately conclude that the probability that $C''$ and $f$ agree on a randomly chosen input from $S$ is either at least $\frac{1}{2} + \varepsilon'$ or at most $\frac{1}{2} - \varepsilon'$. Thus either $C''$ or the negation of $C''$ agrees with $f$ on at least a $(\frac{1}{2} + \varepsilon')$ fraction of the inputs in $S$. Since these circuits have size $s'$, we have contradicted the fact that $f$ is $(\frac{1}{2} + \varepsilon')$-hard for size $s'$ on the hard-core $S$. This completes the proof. □
In Appendix B we give unconditional constructions showing that this theorem is nearly tight.
4. Noise stability
We now know that the expected bias of $g$ essentially characterizes the hardness of a composite function $g \otimes f$. Unfortunately, $\mathrm{ExpBias}_\delta(g)$ is often difficult to calculate, even for fairly simple functions $g$. In this section we introduce another intrinsic property of boolean functions called noise stability, which turns out to be a very good estimator for expected bias and is in general easier to calculate. Noise stability has been studied in a number of other works; e.g., [3,4,16].

Let us begin with some notation. Because any binary function can be guessed with probability at least $1/2$, we often deal with probabilities in the range $[1/2, 1]$. Sometimes it is more natural to work with the amount these quantities are in excess of $1/2$, scaled to the range $[0, 1]$. To that end, we introduce the following notational quirk:
Definition 7. If $Q$ is any quantity in the range $[1/2, 1]$, $Q^*$ will denote the quantity $2(Q - 1/2)$, which is in the range $[0, 1]$.
We give $\mathrm{bias}^*$ a special name:
Definition 8. The advantage of a boolean function $h$ is $\mathrm{adv}(h) = \mathrm{bias}(h)^*$.
Let $h$ be any boolean function on $n$ bits. The advantage of $h$ is related to the probability that $h$ takes the same value on two random inputs. The following fact is easy to prove:
Proposition 6. If $x$ and $y$ are selected independently and uniformly from $\{0,1\}^n$, the probability that $h(x) = h(y)$ is $\frac{1}{2} + \frac{1}{2}\mathrm{adv}(h)^2$.
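Proposition 6 is easy to confirm exhaustively for small $n$; the sketch below is our own illustration (the helper names are hypothetical, not from the paper).

```python
from itertools import product

def adv(h, n):
    """adv(h) = 2*(bias(h) - 1/2), under the uniform measure."""
    p1 = sum(h(x) for x in product([0, 1], repeat=n)) / 2.0 ** n
    return 2.0 * (max(p1, 1.0 - p1) - 0.5)

def agree_prob(h, n):
    """Pr[h(x) = h(y)] for x, y independent and uniform on {0,1}^n."""
    cube = list(product([0, 1], repeat=n))
    return sum(h(x) == h(y) for x in cube for y in cube) / len(cube) ** 2

and2 = lambda x: x[0] & x[1]   # bias 3/4, so advantage 1/2
# Proposition 6: agreement probability = 1/2 + (1/2) * adv(h)^2
```

For a balanced function the advantage is 0 and the agreement probability is exactly $1/2$, as the proposition predicts.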
Instead of picking $x$ and $y$ at random, we might first pick $x$ at random and then make $y$ a random perturbation of $x$ with noise $\delta$:
Definition 9. If $x \in \{0,1\}^n$, define $N_\delta(x)$ to be a random variable in $\{0,1\}^n$ given by flipping each bit of $x$ independently with probability $\delta$.
Now we can define the noise stability of a boolean function $h$:
Definition 10. The noise stability of $h$ at $\delta$ is
$$\mathrm{NoiseStab}_\delta(h) = \Pr[h(x) = h(N_\delta(x))],$$
where the probability is taken over the choice of $x$ and the noise.
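For small $n$, noise stability can be evaluated exactly by summing over all flip patterns. The sketch below is ours (not from the paper); PARITY serves as a convenient check, since it flips exactly when an odd number of bits flip.

```python
from itertools import product
from math import prod

def noise_stab(h, n, delta):
    """NoiseStab_delta(h) = Pr[h(x) = h(N_delta(x))], computed exactly by
    enumerating x together with every pattern of flipped coordinates."""
    total = 0.0
    for x in product([0, 1], repeat=n):
        for flips in product([0, 1], repeat=n):
            p = prod(delta if f else (1.0 - delta) for f in flips)
            y = tuple(xi ^ fi for xi, fi in zip(x, flips))
            total += p * (h(x) == h(y))
    return total / 2.0 ** n

parity2 = lambda x: x[0] ^ x[1]
# NoiseStab_delta(PARITY_2) = Pr[even number of flips] = 1/2 + (1/2)(1-2*delta)^2
```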
We can see the connection between noise stability and expected bias by recalling the intuitive discussion from Section 2. Suppose we are in the Hard-Core scenario, in which with probability $1-2\delta$ we know that $z_i$ is the correct answer $f(x_i)$, and with probability $2\delta$ we know that $z_i$ is a completely random bit. Then $\mathrm{NoiseStab}_\delta(g)$ is exactly the success probability of the suboptimal, naive strategy of just guessing $g(z_1, \ldots, z_k)$ for the value of $g(f(x_1), \ldots, f(x_k))$. Since the optimal strategy had a success probability of $\mathrm{ExpBias}_{2\delta}(g)$, we immediately conclude that $\mathrm{NoiseStab}_\delta(h) \le \mathrm{ExpBias}_{2\delta}(h)$ for all functions $h$.

Perhaps surprisingly, the naive guessing strategy is not so bad. The following result shows that if we look at the advantage over $1/2$ of the naive strategy and the optimal strategy, the naive strategy is worse by at most a square root.
Proposition 7. $\mathrm{NoiseStab}_\delta(h)^* \le \mathrm{ExpBias}_{2\delta}(h)^* \le \sqrt{\mathrm{NoiseStab}_\delta(h)^*}$.
Proof. Given a $\rho \in \mathcal{P}^n_{2\delta}$, write $\mathrm{stars}(\rho)$ for the set of coordinates to which $\rho$ assigns $*$. By linearity of expectation, $\mathrm{ExpBias}_{2\delta}(h)^* = \mathbf{E}_{\rho \in \mathcal{P}^n_{2\delta}}[\mathrm{adv}(h_\rho)]$. On the other hand,
$$\begin{aligned}
\mathrm{NoiseStab}_\delta(h) &= \Pr_{x,\,\mathrm{noise}}[h(x) = h(N_\delta(x))]\\
&= \Pr_{\rho \in \mathcal{P}^n_{2\delta},\; y, z \in \{0,1\}^{\mathrm{stars}(\rho)}}[h_\rho(y) = h_\rho(z)]\\
&= \mathbf{E}_\rho\left[\Pr_{y,z}[h_\rho(y) = h_\rho(z)]\right]\\
&= \mathbf{E}_\rho\left[\tfrac{1}{2} + \tfrac{1}{2}\mathrm{adv}(h_\rho)^2\right] \quad \text{(by Proposition 6)},
\end{aligned}$$
and so $\mathrm{NoiseStab}_\delta(h)^* = \mathbf{E}_\rho[\mathrm{adv}(h_\rho)^2]$. Since $\mathrm{adv} \in [0,1]$, we get the left inequality since $\mathrm{adv}^2 \le \mathrm{adv}$, and we get the right inequality by Cauchy–Schwarz. □
Proposition 7 is useful because it tells us that $\mathrm{NoiseStab}_\delta(h)$ is a good estimator for $\mathrm{ExpBias}_{2\delta}(h)$; the two quantities are qualitatively similar in that if one is $1 - o(1)$ then so is the other, and similarly for $1 - \Omega(1)$ and $1/2 + o(1)$. It is also satisfying to confirm that functions which are highly noise-unstable make good hardness amplifiers—this is roughly the property of XOR to which one appeals in intuition for the XOR Lemma.

We end this section with two tools which help in calculating noise stability. The first is a simple but useful observation which follows immediately from the definitions:
Proposition 8. If $h$ is a balanced boolean function, and $g$ is any boolean function, then $\mathrm{NoiseStab}_\delta(g \otimes h) = \mathrm{NoiseStab}_{1 - \mathrm{NoiseStab}_\delta(h)}(g)$.
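Proposition 8 can be sanity-checked with $g$ and $h$ both parities, using the closed form $\mathrm{NoiseStab}_\delta(\mathrm{PARITY}_k) = \frac{1}{2} + \frac{1}{2}(1-2\delta)^k$ (this formula also follows from Proposition 9 below); in that case $g \otimes h$ is itself a parity. The check below is our own illustration:

```python
def stab_parity(k, delta):
    """NoiseStab_delta(PARITY_k) = 1/2 + (1/2)(1 - 2*delta)^k, since PARITY
    flips exactly when an odd number of its input bits flip."""
    return 0.5 + 0.5 * (1.0 - 2.0 * delta) ** k

# Proposition 8 with g = PARITY_m and h = PARITY_k (h is balanced);
# here g (x) h is simply PARITY_{m*k}:
m, k, d = 3, 4, 0.1
lhs = stab_parity(m * k, d)                      # NoiseStab_d(g (x) h)
rhs = stab_parity(m, 1.0 - stab_parity(k, d))    # NoiseStab_{1 - NoiseStab_d(h)}(g)
```

The two sides agree because $1 - 2(1 - \mathrm{NoiseStab}_\delta(\mathrm{PARITY}_k)) = (1-2\delta)^k$, so the effective noise rate composes multiplicatively.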
The second tool is a formula for noise stability in terms of Fourier coefficients; similar formulas appear in [3,4]. Let us set up the usual Fourier machinery. We begin with an important notational convention: whenever we discuss Fourier coefficients, instead of true being represented by 1 and false by 0, we will use $-1$ and $+1$, respectively.

Let $h : \{+1,-1\}^n \to \{+1,-1\}$ be any boolean function. Then $h$ has a unique representation as a multilinear polynomial in $x_1, \ldots, x_n$ of total degree at most $n$. The coefficients of this polynomial are the Fourier coefficients of $h$. Let $x_S$ denote the monomial $\prod_{i \in S} x_i$, where $S \subseteq [n]$. Then the Fourier coefficient corresponding to this monomial is denoted $\hat{h}(S)$. Note that $\hat{h}(\emptyset) = \mathbf{E}[h]$.

The formula for noise stability is as follows:
Proposition 9. Let $h : \{+1,-1\}^n \to \{+1,-1\}$. Then:
$$\mathrm{NoiseStab}_\delta(h)^* = \sum_{S \subseteq [n]} (1-2\delta)^{|S|}\, \hat{h}(S)^2.$$
Proof. Let $x \in \{+1,-1\}^n$ be chosen uniformly at random, and let $y = N_\delta(x)$. Then:
$$\mathrm{NoiseStab}_\delta(h) = \Pr[h(x) = h(y)] = \mathbf{E}[1_{h(x)=h(y)}] = \mathbf{E}\left[1 - (h(x)-h(y))^2/4\right] \quad (\text{since } h \in \{+1,-1\})$$
$$= 1 - \tfrac{1}{4}\mathbf{E}[h(x)^2] - \tfrac{1}{4}\mathbf{E}[h(y)^2] + \tfrac{1}{2}\mathbf{E}[h(x)h(y)] = \tfrac{1}{2} + \tfrac{1}{2}\mathbf{E}[h(x)h(y)] \quad (\text{since } h^2 \equiv 1),$$
hence $\mathrm{NoiseStab}_\delta(h)^* = \mathbf{E}[h(x)h(y)]$. But
$$\mathbf{E}[h(x)h(y)] = \mathbf{E}\left[\left(\sum_S \hat{h}(S)x^S\right)\left(\sum_T \hat{h}(T)y^T\right)\right] = \sum_{S,T} \hat{h}(S)\hat{h}(T)\,\mathbf{E}[x^S y^T].$$
If $S \neq T$, it is fairly easy to see that $\mathbf{E}[x^S y^T] = 0$. Let $i \in S \triangle T$, say $i \in T \setminus S$. Now $y_i$ is equally likely to be $+1$ or $-1$, and hence
$$\mathbf{E}[x^S y^T] = \tfrac{1}{2}\mathbf{E}[x^S y^T \mid y_i = +1] + \tfrac{1}{2}\mathbf{E}[x^S y^T \mid y_i = -1] = \tfrac{1}{2}\mathbf{E}[x^S y^{T\setminus\{i\}}] - \tfrac{1}{2}\mathbf{E}[x^S y^{T\setminus\{i\}}] = 0.$$
It remains to prove that $\mathbf{E}[x^S y^S] = (1-2\delta)^{|S|}$. Since $x^S$ and $y^S$ are both either $+1$ or $-1$,
$$\mathbf{E}[x^S y^S] = \Pr[x^S = y^S] - \Pr[x^S \neq y^S] = -1 + 2\Pr[x^S = y^S].$$
This last probability is exactly the probability that, if we flip $|S|$ coins with probability $\delta$ for heads, then we get an even number of heads. A trivial induction shows that this is indeed $\tfrac{1}{2} + \tfrac{1}{2}(1-2\delta)^{|S|}$; another way to see it is to write $k = |S|$, and note that the probability is
$$(1-\delta)^k + \binom{k}{2}(1-\delta)^{k-2}\delta^2 + \binom{k}{4}(1-\delta)^{k-4}\delta^4 + \cdots = \left(\left[(1-\delta)+\delta\right]^k + \left[(1-\delta)-\delta\right]^k\right)/2 = \tfrac{1}{2} + \tfrac{1}{2}(1-2\delta)^k,$$
as claimed. □
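The formula of Proposition 9 is easy to verify by brute force on a small function. The sketch below (illustrative Python, not from the paper; names are ours) computes both sides exactly for $\mathrm{MAJ}_3$ at $\delta = 0.2$:

```python
from itertools import product
from math import prod

def maj(x):
    # Majority on {+1,-1}^3, as a +/-1-valued function.
    return 1 if sum(x) > 0 else -1

n, delta = 3, 0.2
points = list(product((1, -1), repeat=n))

def coeff(S):
    # Fourier coefficient: \hat{h}(S) = E[h(x) x^S].
    return sum(maj(x) * prod(x[i] for i in S) for x in points) / len(points)

subsets = [tuple(i for i in range(n) if mask >> i & 1) for mask in range(2 ** n)]

# Right-hand side of Proposition 9.
rhs = sum((1 - 2 * delta) ** len(S) * coeff(S) ** 2 for S in subsets)

# Left-hand side: NoiseStab*_delta(h) = 2 Pr[h(x) = h(y)] - 1, computed
# exactly by summing over all flip patterns of the noise operator N_delta.
stab = 0.0
for x in points:
    for flips in product((0, 1), repeat=n):
        w = prod(delta if b else 1 - delta for b in flips)
        y = tuple(-xi if b else xi for xi, b in zip(x, flips))
        if maj(x) == maj(y):
            stab += w
stab /= len(points)
lhs = 2 * stab - 1

assert abs(lhs - rhs) < 1e-9
```

For $\mathrm{MAJ}_3$ the only nonzero coefficients sit on the three singletons and the full set, so the right-hand side is $3 \cdot (0.6) \cdot \tfrac{1}{4} + (0.6)^3 \cdot \tfrac{1}{4} = 0.504$.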
5. Majority
This section is devoted to one of the simplest monotone functions, the majority function. It turns out that the majority function is not a very good hardness amplifier, but it is nevertheless interesting to study its expected bias. Let $\mathrm{MAJ}_n$ denote the majority function on $n$ bits; for simplicity we assume $n$ to be odd.
Proposition 10. For any $\delta \in [0,1]$, $\mathrm{ExpBias}_\delta(\mathrm{MAJ}_n) \to \tfrac{3}{4} + \tfrac{1}{2\pi}\arcsin(1-2\delta)$ as $n \to \infty$. In particular,
$$\left|\mathrm{ExpBias}_\delta(\mathrm{MAJ}_n) - \left(\tfrac{3}{4} + \tfrac{1}{2\pi}\arcsin(1-2\delta)\right)\right| \le O\!\left(1/\sqrt{\delta(1-\delta)n}\right).$$
Proof. Suppose $r \in P^n_\delta$ is chosen at random. Then $(\mathrm{MAJ}_n)_r$ is biased towards the majority value of the non-star bits of $r$. Indeed, $\mathrm{bias}((\mathrm{MAJ}_n)_r)$ is exactly equal to the probability that, if one proceeds to set the starred bits of $r$ at random, the majority is preserved. (This is ambiguous in the case that the non-star bits of $r$ split evenly; we shall see that we needn't worry about this case.)

Let us model this random experiment as follows: suppose we consider each bit position independently and in sequence, $1, 2, \ldots, n$. For each position $i$, we first choose whether $i$ will
be a "star position" (probability $\delta$) or a "non-star position" (probability $1-\delta$). Either way, we also select a random bit associated to position $i$ (equally likely 0 or 1). We wish to calculate the probability that the majority of all the bits agrees with the majority of the non-star bits.
Our experiment corresponds to a random walk on the lattice points of $\mathbb{R}^2$ in a natural way: let star positions correspond to vertical steps and non-star positions correspond to horizontal steps; further, let right and up steps correspond to choosing the bit 0, and left and down steps correspond to choosing 1. Now we have an $n$-step random walk starting at the origin, where each step is: $(+1,0)$ with probability $(1-\delta)/2$; $(-1,0)$ with probability $(1-\delta)/2$; $(0,+1)$ with probability $\delta/2$; or $(0,-1)$ with probability $\delta/2$. Let $(X,Y)$ denote the final position of the random walk. Then the majority of all the bits agrees with the majority of the non-star bits if and only if: $X + Y \ge 0$ and $X \ge 0$; or, $X + Y \le 0$ and $X \le 0$. (Again, the boundary of this region falls into an ambiguous case, but this will be unimportant.) Call this region $R$; $R$ is shown shaded on the left of Fig. 1.

Let us write $X_1, \ldots, X_n \in \{-1,0,+1\}$ for the $x$-components of the random walk's $n$ steps, and similarly $Y_1, \ldots, Y_n$ for the $y$-components. We have $X = X_1 + \cdots + X_n$, $Y = Y_1 + \cdots + Y_n$, and we wish to study the distribution of $(X,Y)$. Note that the mean of one step $(X_i, Y_i)$ of the random walk is $(0,0)$ and the covariance matrix for one step is
$$\Sigma = \mathbf{E}\begin{pmatrix} X_i^2 & X_i Y_i \\ Y_i X_i & Y_i^2 \end{pmatrix} = \begin{pmatrix} 1-\delta & 0 \\ 0 & \delta \end{pmatrix}.$$
By the multidimensional Central Limit Theorem, the distribution function of $(X/\sqrt{n}, Y/\sqrt{n})$ converges to that of the two-dimensional Gaussian with covariance $\Sigma$, namely, the distribution with density function:
$$\phi_\Sigma(x,y) = \frac{1}{2\pi\sqrt{\delta(1-\delta)}} \exp\left(-\frac{1}{2}\left(\frac{x^2}{1-\delta} + \frac{y^2}{\delta}\right)\right).$$
Sazonov gave Berry–Esseen-type bounds for the rate of convergence in the multidimensional Central Limit Theorem; in particular [18, p. 10, item 6] shows that for any convex set $S$ in $\mathbb{R}^2$,
$$\left|\Pr[(X/\sqrt{n}, Y/\sqrt{n}) \in S] - \Phi_\Sigma(S)\right| \le O(1)\left(1/\sqrt{\delta} + 1/\sqrt{1-\delta}\right)n^{-1/2} = O\!\left(1/\sqrt{\delta(1-\delta)n}\right),$$
where $\Phi_\Sigma$ denotes the distribution function of the Gaussian distribution mentioned above.
Fig. 1. The regions $R$ and $R'$.
Since $R$ is the union of two convex sets, and $(X/\sqrt{n}, Y/\sqrt{n}) \in R$ iff $(X,Y) \in R$ by definition of $R$, the proof will be complete once we show that $\Phi_\Sigma(R) = \tfrac{3}{4} + \tfrac{1}{2\pi}\arcsin(1-2\delta)$. (It is now apparent that the definition of $R$'s boundary is irrelevant, since it gets 0 weight under the continuous distribution $\phi_\Sigma$.)
This calculation is easy. Make the change of variables $u = x/\sqrt{1-\delta}$, $v = y/\sqrt{\delta}$. Then $\phi_\Sigma$ becomes the standard two-dimensional normal distribution, $\phi(u,v) = \frac{1}{2\pi}\exp(-(u^2+v^2)/2)$, and $R$ is transformed into
$$R' = \left\{u \ge 0,\ \sqrt{1-\delta}\,u + \sqrt{\delta}\,v \ge 0\right\} \cup \left\{u \le 0,\ \sqrt{1-\delta}\,u + \sqrt{\delta}\,v \le 0\right\},$$
pictured on the right in Fig. 1. Now $\phi$ is evidently radially symmetric, and $R'$ consists of exactly two radial slices, each of angle $\pi/2 + \tan^{-1}\sqrt{(1-\delta)/\delta}$. Therefore the weight of $R'$ under $\Phi$ is precisely
$$\frac{1}{2\pi} \cdot 2\left(\frac{\pi}{2} + \tan^{-1}\sqrt{(1-\delta)/\delta}\right) = \frac{1}{2} + \frac{1}{\pi}\tan^{-1}\sqrt{(1-\delta)/\delta} = \frac{3}{4} + \frac{1}{2\pi}\arcsin(1-2\delta),$$
by a trigonometric identity. □
It is not surprising that arcsin should arise in this context; cf. the arc sine laws of Feller [7]. Roughly speaking, Theorem 1 now tells us that if $f$ is balanced and $(1-\delta)$-hard, then $\mathrm{MAJ} \otimes f$ has hardness about $\tfrac{3}{4} + \tfrac{1}{2\pi}\arcsin(1-2(2\delta))$. The graph of $\tfrac{3}{4} + \tfrac{1}{2\pi}\arcsin(1-2(2\delta))$ and $1-\delta$ versus $\delta$ is shown in Fig. 2.

We can see then that when $\delta$ is small (i.e., $f$ is "easy"), majority is a fair hardness amplifier; an approximation shows that it amplifies $1-\delta$ hardness to $1 - \Theta(\sqrt{\delta})$ hardness. So long as $\delta < 1/4$, majority increases hardness; but when $\delta > 1/4$, it actually decreases hardness! (It is easy to show that $\mathrm{ExpBias}_{1/2}(\mathrm{MAJ}_n) = 3/4$ for all odd $n$.) It is clear then that the majority function will not give us the good hardness amplification properties we desire.
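The arcsine limit of Proposition 10 can also be observed empirically. The following Monte Carlo sketch (illustrative Python, not from the paper; names are ours) estimates $\mathrm{ExpBias}_\delta(\mathrm{MAJ}_n)$ by sampling the experiment from the proof: draw a random $\delta$-restriction together with a random fill of its stars, and count how often the overall majority agrees with the majority of the fixed bits.

```python
import random
from math import asin, pi

def expbias_maj_estimate(n, delta, trials=10_000, seed=0):
    # Monte Carlo estimate of ExpBias_delta(MAJ_n): the probability that the
    # majority of all n random bits agrees with the majority of the bits a
    # random delta-restriction leaves fixed.
    rng = random.Random(seed)
    agree = 0.0
    for _ in range(trials):
        fixed_sum = 0   # +1 per fixed 1-bit, -1 per fixed 0-bit
        total_sum = 0
        for _ in range(n):
            bit = 1 if rng.random() < 0.5 else -1
            total_sum += bit
            if rng.random() >= delta:   # non-star position
                fixed_sum += bit
        if fixed_sum == 0:
            agree += 0.5   # ambiguous boundary case; count as a coin flip
        elif (fixed_sum > 0) == (total_sum > 0):
            agree += 1
    return agree / trials

delta = 0.25
est = expbias_maj_estimate(301, delta)
limit = 0.75 + asin(1 - 2 * delta) / (2 * pi)
print(est, limit)
```

At $\delta = 1/4$ the limiting value is $3/4 + \tfrac{1}{2\pi}\arcsin(1/2) = 3/4 + 1/12 \approx 0.833$; the estimate should land nearby, up to the $O(1/\sqrt{\delta(1-\delta)n})$ convergence error and sampling noise.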
Fig. 2. $\tfrac{3}{4} + \tfrac{1}{2\pi}\arcsin(1-2(2\delta))$ and $1-\delta$ as functions of $\delta$.
6. Recursive majorities
It seems that recursive majorities might not be such a good idea: if one did the analysis simply by applying Theorem 1 with Proposition 10 multiple times, the hardness would converge and get "stuck" at $3/4$. This would be giving too much away, however. Recall the intuitive discussion from Section 2; it has to assume that the function $f$ is $(1-\delta)$-hard in the easiest possible way. But if $f$ is already the majority of hard functions, it is in fact hard in a harder way. The key is to apply all of the majorities at the same time, taking $g$ to be recursive-majority right from the start and applying Theorem 1 only once.
Definition 11. For each $c \ge 1$, REC-MAJ-3$_c : \{0,1\}^{3^c} \to \{0,1\}$ is the function given by a depth-$c$ ternary tree of majority-of-3 gates, with the input bits at the leaves.
We do not know how to calculate the expected bias of REC-MAJ-3$_c$, but estimating its noise stability is straightforward. We give a crude estimate which suffices for our purposes:
Proposition 11. For $c \ge \log_{1.1}(1/\delta)$, $\mathrm{NoiseStab}_\delta(\text{REC-MAJ-3}_c)^* \le \delta^{-1.1}(3^c)^{-0.15}$.
Proof. If we write $\mathrm{NoiseInst}_\delta = 1 - \mathrm{NoiseStab}_\delta$, then Proposition 8 tells us that $\mathrm{NoiseInst}_\delta(g \otimes h) = \mathrm{NoiseInst}_{\mathrm{NoiseInst}_\delta(h)}(g)$ when $h$ is balanced. $\mathrm{MAJ}_3$ is balanced, and an explicit calculation gives $\mathrm{NoiseInst}_t(\mathrm{MAJ}_3) = (3/2)t - (3/2)t^2 + t^3$. Defining $p(t) = (3/2)t - (3/2)t^2 + t^3$, we conclude that $\mathrm{NoiseStab}_\delta(\text{REC-MAJ-3}_c) = 1 - p^{(c)}(\delta)$, where $p^{(c)}$ denotes the $c$-fold iterate of $p$; so $\mathrm{NoiseStab}_\delta(\text{REC-MAJ-3}_c)^* = 1 - 2p^{(c)}(\delta)$.

It remains to study the iteration of $p$. When $t$ is sufficiently small, $p(t) \approx (3/2)t$. Very crudely, when $t < 1/4$, $p(t) \ge 1.18t$. It follows that $p^{(m)}(t) \ge (1.18)^m t$, so long as this quantity is at most $1/4$. Taking $m = \log_{1.18}(1/(4\delta))$, we get $p^{(m)}(\delta) \ge (1/4)/1.18 > 0.21$.

Now write $t = 1/2 - \eta$. Then $p(t) = 1/2 - (3/4)\eta - \eta^3$. When $\eta$ is very small, this is approximately $1/2 - (3/4)\eta$. Crudely again, when $\eta \le 1/2 - 0.21 = 0.29$, $p(t) \ge 1/2 - 0.84\eta$. It follows that $p^{(m')}(0.21) \ge 1/2 - (0.84)^{m'}(0.29)$ for any positive $m'$; in particular for $m' = c - m$, which is positive because $c \ge \log_{1.1}(1/\delta) > m$.

Hence $p^{(c)}(\delta) \ge 1/2 - 0.29 \cdot (0.84)^{m'}$, and so it remains to check that $0.58 \cdot (0.84)^{m'} \le \delta^{-1.1}(3^c)^{-0.15}$:
$$0.58 \cdot (0.84)^{m'} = 0.58 \cdot (0.84)^{c - \log_{1.18}(1/(4\delta))} < (0.84)^{-\log_{1.18}(1/(4\delta))}(0.84)^c = (4\delta)^{\log_{1.18}(0.84)}(3^c)^{\log_3(0.84)} < \delta^{-1.1}(3^c)^{-0.15},$$
as needed. □
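Both ingredients of the proof are easy to check numerically: the explicit cubic for $\mathrm{NoiseInst}_t(\mathrm{MAJ}_3)$, and the fact that iterating $p$ drives any small noise rate up toward $1/2$. A small sketch (illustrative Python, not from the paper; names are ours):

```python
from itertools import product

def p(t):
    # NoiseInst_t(MAJ_3), the explicit cubic from the proof above.
    return 1.5 * t - 1.5 * t * t + t ** 3

def noise_inst_maj3(t):
    # Exact check of the cubic by enumerating inputs and flip patterns.
    total = 0.0
    for x in product((0, 1), repeat=3):
        for flips in product((0, 1), repeat=3):
            w = 1.0
            for b in flips:
                w *= t if b else 1 - t
            y = [xi ^ b for xi, b in zip(x, flips)]
            if (sum(x) >= 2) != (sum(y) >= 2):
                total += w
    return total / 8

for t in (0.05, 0.2, 0.45):
    assert abs(noise_inst_maj3(t) - p(t)) < 1e-12

# Iterating p pushes any small noise rate toward 1/2, i.e. the noise
# stability of REC-MAJ-3_c collapses to 1/2 as the depth c grows.
t = 0.01
for _ in range(30):
    t = p(t)
print(t)  # close to 1/2 after 30 levels
```

The two phases of the proof are visible in the iteration: geometric growth by a factor of roughly $3/2$ while $t$ is small, then geometric decay of $\eta = 1/2 - t$ by a factor of roughly $3/4$ per level.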
This result gives us a monotone balanced function which is very sensitive to a small amount of noise. Applying Proposition 7, we immediately get:
Proposition 12. For $c \ge \log_{1.1}(1/\delta)$, $\mathrm{ExpBias}_{2\delta}(\text{REC-MAJ-3}_c)^* \le \delta^{-0.55}(3^c)^{-0.075}$.
We can now prove a $1 - 1/\mathrm{poly}(n) \to 1/2 + o(1)$ hardness amplification for NP under the assumption that the initial hard function in NP is balanced:
Theorem 13. If there is a family of functions $(f_n)$ in NP which is infinitely often balanced and $(1 - 1/\mathrm{poly}(n))$-hard for polynomial circuits, then there is a family of functions $(h_m)$ in NP which is infinitely often balanced and $(1/2 + m^{-0.07})$-hard for polynomial circuits.
Proof. Suppose $(f_n)$ is infinitely often balanced and $(1 - 1/n^c)$-hard for polynomial circuits. For a given $n$, let $k = n^C$, where $C$ is a large constant dependent on $c$ to be chosen later. Put $c' = \lfloor \log_3 k \rfloor$ and view REC-MAJ-3$_{c'}$ as a function on $k$ bits by ignoring some inputs if necessary. Now define $h_m = \text{REC-MAJ-3}_{c'} \otimes f_n$, having input length $m = kn = n^{C+1}$. Note that the family $(h_m)$ is in NP since $(f_n)$ is, and since REC-MAJ-3$_{c'}$ is monotone and in P. Also $h_m$ is balanced whenever $f_n$ is. It remains to check that $h_m$ is indeed $(1/2 + m^{-0.07})$-hard for polynomial circuits whenever $n$ is such that $f_n$ is hard.

Apply Theorem 1 with $r = 1$, $\epsilon = 1/n^C$, and $\delta = 1/n^c$. Then we conclude that $h_m$ is $(\mathrm{ExpBias}_\delta(\text{REC-MAJ-3}_{c'}) + \epsilon)$-hard for polynomial circuits. By Proposition 12, $\mathrm{ExpBias}_\delta(\text{REC-MAJ-3}_{c'}) \le 1/2 + (1/2)(\delta/2)^{-0.55}(3^{c'})^{-0.075}$, assuming $c' \ge \log_{1.1}(2/\delta) = \log_{1.1}(2n^c)$; by taking $C$ sufficiently large (compared to $c$) we can ensure this is true. But $(1/2)(\delta/2)^{-0.55}(3^{c'})^{-0.075} + \epsilon \le (2n^c)^{0.55}(n^C/3)^{-0.075} + n^{-C} \le n^{-0.074C}$ if we take $C$ large enough. And $n^{-0.074C} \le m^{-0.07}$ if we take $C$ large enough, since $m = n^{C+1}$. Thus $h_m$ is $(1/2 + m^{-0.07})$-hard for polynomial circuits, and the proof is complete. □
A more careful estimation in Proposition 11 would allow us to get a function which was $(1/2 + n^{-\alpha})$-hard for $\alpha$ approaching $\tfrac{1}{2}\log_3(4/3) \approx 0.13$. See [16] for a detailed analysis of the noise stability of recursive majorities.
7. The tribes function
The previous theorem got us from $1 - 1/\mathrm{poly}(n)$ hardness down to $1/2 + o(1)$ hardness. In this section we improve on the $o(1)$ term by moving to a different function, namely the "tribes" function of Ben-Or and Linial. This is a simple monotone read-once DNF, whose definition is as follows: for input length $k$, the parameter $b$ is set to a certain quantity a little less than $\log_2 k$. Then the $k$ inputs are divided into $k/b$ blocks of size $b$, and the function $T_k$ is defined to be 1 if and only if at least one of the blocks consists entirely of 1's, i.e.,
$$T_k(x_1, \ldots, x_k) = (x_1 \wedge \cdots \wedge x_b) \vee (x_{b+1} \wedge \cdots \wedge x_{2b}) \vee \cdots \vee (x_{k-b+1} \wedge \cdots \wedge x_k).$$
As in the previous section, our technique is to estimate the noise stability of $T_k$. The calculation is more technical than before; we use the formula for noise stability in terms of Fourier coefficients given by Proposition 9, along with Mansour's calculations of the Fourier coefficients of the tribes function [15]. Note that the subsequent work [16] gives an elementary exact calculation of the noise stability of the tribes function, demonstrating that the calculation to follow is nearly tight.

Let us first rigorously define the tribes function. Note that the probability the tribes function is true is exactly $1 - (1 - 2^{-b})^{k/b}$. Since we want the tribes function to be as close to balanced as possible, we need to define $k$ and $b$ in such a way that $(1 - 2^{-b})^{k/b} \approx 1/2$. In order to get this, we will actually define $k$ in terms of $b$, so that $T_k$ will only be defined on certain input lengths.

For every integer $b$, let $k = k_b$ be the smallest integral multiple of $b$ such that $(1 - 2^{-b})^{k/b} \le 1/2$. For analysis purposes, let $k_0$ be the real number such that $(1 - 2^{-b})^{k_0/b} = 1/2$. Then $b = \log_2 k_0 - \log_2 \ln k_0 + o(1)$; hence $k_0 \le k \le k_0 + \log_2 k_0$, and so $b = \log_2 k - \log_2 \ln k + o(1)$ as well. The probability that $T_k$ is true on a random input is
$$(1 - 2^{-b})^{k/b} = (1/2)(1 - 2^{-b})^{(k-k_0)/b} = (1/2)(1 - 2^{-b})^{1+o(1)} = (1/2)(1 - O(\log k / k)).$$
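The parameter $k_b$ is easy to compute directly from this definition. The sketch below (illustrative Python, not from the paper; names are ours) finds $k_b$ for a few small $b$, evaluates the read-once DNF itself, and reports the truth probability:

```python
def tribes_params(b):
    # Smallest multiple k of b with (1 - 2^-b)^(k/b) <= 1/2; this is k_b above.
    q = 1 - 2.0 ** -b
    k = b
    while q ** (k // b) > 0.5:
        k += b
    return k

def tribes(bits, b):
    # Read-once DNF: OR over the k/b blocks of the AND of each size-b block.
    return int(any(all(bits[i:i + b]) for i in range(0, len(bits), b)))

for b in (3, 4, 5, 6):
    k = tribes_params(b)
    p_true = 1 - (1 - 2.0 ** -b) ** (k / b)
    # p_true sits just above 1/2, approaching it within O(log k / k).
    print(b, k, round(p_true, 4))
```

Note that $k_b$ grows quickly with $b$ (for instance $k_3 = 18$ and $k_4 = 44$), reflecting the asymptotic relation $b = \log_2 k - \log_2 \ln k + o(1)$.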
Proposition 14. Let $b$ and $k$ be chosen as described, and let $T_k$ be the associated tribes function. Then:
$$\mathrm{NoiseStab}_\delta(T_k)^* \le \left[\exp\!\left((1-\delta)^b\right) - 1\right] + O(\log^2 k / k^2).$$
Proof. We want to upper-bound $\mathrm{NoiseStab}_\delta(T_k)^*$ using Proposition 9, so we need to know the Fourier coefficients of $T_k$. We just showed that the probability $T_k$ is true on a random input is $(1/2)(1 - O(\log k/k))$; hence $|\hat{T_k}(\emptyset)| = O(\log k/k)$. Mansour studied the other Fourier coefficients of $T_k$ in [15]; he showed that:
$$|\hat{T_k}(S_1, \ldots, S_{k/b})| = 2(1 - 2^{-b})^{k/b}(2^b - 1)^{-j} \le (2^b - 1)^{-j},$$
where $S_i \subseteq \{x_{b(i-1)+1}, \ldots, x_{bi}\}$ and $j$ is the number of $i$'s such that $S_i \neq \emptyset$.

How many sets $S = (S_1, \ldots, S_{k/b})$ of cardinality $d$ have exactly $j$ nonempty parts $S_i$? There are $\binom{k/b}{j}$ choices for the nonempty parts, and at most $\binom{jb}{d}$ ways to distribute the elements of $S$ among these parts. Hence there are at most $\binom{k/b}{j}\binom{jb}{d}$ such sets. Each set contributes $(1-2\delta)^d (2^b-1)^{-2j}$ to the sum $\mathrm{NoiseStab}_\delta(T_k)^* = \sum_{S \subseteq [k]} (1-2\delta)^{|S|}\,\hat{T_k}(S)^2$. The case $j = 0$ we did separately; it contributes $O(\log^2 k/k^2)$ to the sum. Hence:
$$\mathrm{NoiseStab}_\delta(T_k)^* \le O(\log^2 k/k^2) + \sum_{j=1}^{k/b} \sum_{d=j}^{jb} \binom{k/b}{j}\binom{jb}{d}(1-2\delta)^d (2^b-1)^{-2j}$$
$$\le O(\log^2 k/k^2) + \sum_{j=1}^{k/b} \binom{k/b}{j}(2^b-1)^{-2j}(2-2\delta)^{jb}$$
$$\le O(\log^2 k/k^2) + \sum_{j=1}^{k/b} \frac{(k/b)^j}{j!}\cdot\frac{1}{\left[2^b(2^b-2)\right]^j}\,(2-2\delta)^{jb}$$
$$= O(\log^2 k/k^2) + \sum_{j=1}^{k/b} \left(\frac{k}{b(2^b-2)}\right)^j (1-\delta)^{bj}/j!\,.$$
Since $(1-2^{-b})^{k/b} = 1/2 - o(1)$, we have $\frac{k}{b}\log_2(1-2^{-b}) = -1 - o(1)$. The Taylor approximation for $\log_2(1-x)$ lets us conclude that $\log_2(1-2^{-b}) \le -2^{-b}/\ln 2$, so $\frac{k}{b}2^{-b} \le (\ln 2)(1+o(1)) < 1$. We conclude that $\frac{k}{b(2^b-2)} < 1$ asymptotically. Continuing our estimate, we get:
$$\mathrm{NoiseStab}_\delta(T_k)^* \le O(\log^2 k/k^2) + \sum_{j=1}^{k/b} (1-\delta)^{bj}/j! \le O(\log^2 k/k^2) + \left[\exp\!\left((1-\delta)^b\right) - 1\right],$$
as claimed. □
When $\delta$ is nearly $1/2$, the noise stability of $T_k$ is almost as small as $1/2 + 1/k$:
Proposition 15. For any $\rho > 0$, $\mathrm{NoiseStab}_{1/2-\rho}(T_k)^* \le k^{-1+3\rho}$ for $k$ sufficiently large.
Proof. By Proposition 14,
$$\mathrm{NoiseStab}_{1/2-\rho}(T_k)^* \le \left[\exp\!\left((1/2+\rho)^b\right) - 1\right] + O(\log^2 k/k^2) = \left[\exp\!\left((1/2)^b(1+2\rho)^b\right) - 1\right] + O(\log^2 k/k^2)$$
$$\le \left[\exp\!\left((1+o(1))\frac{\ln k}{k}(1+2\rho)^{\log_2 k}\right) - 1\right] + O(\log^2 k/k^2)$$
$$= \left[\exp\!\left((1+o(1))\frac{\ln k}{k}\,k^{\log_2(1+2\rho)}\right) - 1\right] + O(\log^2 k/k^2)$$
$$= \left[\exp\!\left(k^{-1+2\rho/\ln 2+o(1)}\right) - 1\right] + O(\log^2 k/k^2) \le 2k^{-1+2\rho/\ln 2+o(1)} + O(\log^2 k/k^2) \le k^{-1+3\rho},$$
for sufficiently large $k$, since $3 \ge 2/\ln 2$. □
As an immediate corollary of Propositions 15 and 7, we get:
Proposition 16. For every constant $\eta > 0$, there is a constant $\rho > 0$ such that $\mathrm{ExpBias}_{1-\rho}(T_k) \le 1/2 + k^{-1/2+\eta}$ for $k$ sufficiently large.
We are now ready to prove our main theorem, a $1 - 1/\mathrm{poly}(n) \to 1/2 + n^{-1/2+\epsilon}$ hardness amplification for NP under the assumption that the initial hard function family in NP is balanced.
In Section 8 we show how to get a result which is almost as strong without making the balanceassumption.
Theorem 2. If there is a family of functions $(f_n)$ in NP which is infinitely often balanced and $(1 - 1/\mathrm{poly}(n))$-hard for circuits of polynomial size, then there is a family of functions $(h_m)$ in NP which is infinitely often $(\tfrac{1}{2} + m^{-1/2+\eta})$-hard for circuits of polynomial size, for any small $\eta > 0$.
Proof. By Theorem 13, the hypothesis of the present theorem implies the existence of a family of functions $(f_n)$ in NP which is infinitely often balanced and $(1/2 + o(1))$-hard for polynomial circuits. We now proceed in a manner similar to that in the proof of Theorem 13.

Pick $b$ such that the associated tribes input length $k = k_b$ is roughly $n^C$, where $C$ is a large constant to be chosen later. Let $h_m = T_k \otimes f_n$, a function on $m = kn = n^{C+1}$ bits. Since $T_k$ is monotone and in P, the family $(h_m)$ is in NP. It remains to show that $h_m$ is $(1/2 + m^{-1/2+\eta})$-hard for polynomial circuits whenever $f_n$ is balanced and $(1/2 + o(1))$-hard.

Apply Theorem 1 with $\epsilon = 1/k$, $\delta$ set to the given $1/2 - o(1)$, and $r$ a small positive constant to be chosen later. Then we conclude that $h_m$ is $(\mathrm{ExpBias}_{1-r}(T_k) + \epsilon)$-hard for polynomial circuits. By Proposition 16, $\mathrm{ExpBias}_{1-r}(T_k)$ can be made smaller than $1/2 + k^{-1/2+\eta/4}$ by taking $r$ sufficiently small. Since $\epsilon = 1/k$, the hardness of $h_m$ is at most $1/2 + k^{-1/2+\eta/2}$ (for sufficiently large $k$). But $m = n^{C+1} = k^{1+1/C}$. Hence if we take $C$ to be any constant greater than $1/\eta$, then
$$m^{-1/2+\eta} = (k^{1+1/C})^{-1/2+\eta} \ge k^{(1+\eta)(-1/2+\eta)} = k^{-1/2+\eta/2+\eta^2} \ge k^{-1/2+\eta/2},$$
and the proof is complete. □
This result more or less exhausts the possibilities for monotone hardness amplification via Theorem 1. The reason is that $\mathrm{NoiseStab}(g)^*$ will never be smaller than $\mathrm{polylog}(n)/n$ when $g$ is monotone.

To be exact, a celebrated theorem of Kahn et al. [12] shows that for any monotone function $g$ on $k$ inputs, $\sum_{|S|=1} \hat{g}(S)^2 \ge \Omega(\log^2 k / k)$. From Proposition 9, it immediately follows that $\mathrm{NoiseStab}_\delta(g)^* \ge (1-2\delta)\,\Omega(\log^2 k/k)$. If we start with a balanced function $f$ on $n$ inputs which has hardness at least $1/2 + \Omega(\log^2 n/n)$ and then apply Theorem 1 with $g$ as an amplifier, we end up with a function with input length $nk$ and a hardness bound of $1/2 + \Omega(\log^2 n/n)\,\Omega(\log^2 k/k) \ge 1/2 + \Omega(\log^2(nk)/nk)$.

Consequently, we cannot use these techniques to amplify hardness within NP below $1/2 + \Omega(\log^2 n/n)$. Further, from a constructive point of view, if we always use $\mathrm{NoiseStab}_\delta$ as an estimator for $\mathrm{ExpBias}_{2\delta}$, we pay the additional penalty of Proposition 7, and we will never do better than $1/2 + \tilde{O}(n^{-1/2})$.
8. Eliminating the balance hypothesis
Theorem 2 requires that the initial hard function in NP is balanced on infinitely many of theinput lengths for which it is hard. This assumption is not unrestrictive; the author is unaware of
any NP-complete language which is balanced on all input lengths. We can remove the assumptionat the expense of a slight loss in hardness in the final function:
Theorem 3. If there is a family of functions $(f_n)$ in NP which is infinitely often $(1 - 1/\mathrm{poly}(n))$-hard for circuits of polynomial size, then there is a family of functions $(h_m)$ in NP which is infinitely often $(\tfrac{1}{2} + m^{-1/3+\eta})$-hard for circuits of polynomial size, for any small $\eta > 0$.
We devote the remainder of this section to proving Theorem 3. The proof only requires playing some tricks with input lengths. First, a simple trick using padding:

Proposition 17. If there is a family of functions $(f_n)$ in NP which is infinitely often $(1 - 1/\mathrm{poly}(n))$-hard for circuits of polynomial size, then for every $\eta > 0$ there is a family of functions $(g_m)$ in NP which is infinitely often $(1 - 1/m^\eta)$-hard for circuits of polynomial size.
Proof. Suppose $(f_n)$ is infinitely often $(1 - 1/n^c)$-hard for polynomial circuits. Pick $K > c/\eta$, and define $m = m(n) = n^K$. On an input $x$ of length $m$, define $g_m(x)$ to be $f$ applied to the first $n$ bits of $x$. Then $(g_m)$ is surely in NP, and we claim that whenever $n$ is an input length for which $f_n$ is $(1 - 1/n^c)$-hard for polynomial circuits, $g_m$ is $(1 - 1/m^\eta)$-hard for polynomial circuits. The proof is simple: suppose $C$ were a circuit of size $m^d$ which correctly computed $g_m$ on a $1 - 1/m^\eta$ fraction of the inputs in $\{0,1\}^m$. By an averaging argument, there are settings for the last $m - n$ input bits to $C$ such that $C$ correctly computes $g_m$ on a $1 - 1/m^\eta$ fraction of the remaining inputs, $\{0,1\}^n$. But now we have a circuit of size $n^{Kd} = \mathrm{poly}(n)$ which computes $f_n$ with probability at least $1 - 1/n^{\eta K} > 1 - 1/n^c$, a contradiction. □
Next, let us combine the tribes function and the recursive majority of 3 function to get a singlefunction which is highly noise unstable under a small amount of noise:
Proposition 18. For every $\rho > 0$ there is a monotone function family $(F_k)$ in P, defined for every input length $k$, such that $\mathrm{NoiseStab}_{1/k^\rho}(F_k)^* \le k^{-1+10\rho}$ for all sufficiently large $k$.
Proof. Let $\rho' = 9\rho$. On input length $k$, pick $c$ as large as possible so that $3^c \le k^{\rho'}$, and pick $k'$ to be the largest possible tribes function input length which is at most $k^{1-\rho'}$. Define $F_k = T_{k'} \otimes \text{REC-MAJ-3}_c$ to be on $k$ bits by ignoring some input bits if necessary. Note that $3^c = \Theta(k^{\rho'})$ and $k' = \Theta(k^{1-\rho'})$, so we have not lost too much in the input length.

By Proposition 11, $\mathrm{NoiseStab}_{1/k^\rho}(\text{REC-MAJ-3}_c)^* \le (k^\rho)^{1.1}(3^c)^{-0.15} = O(k^{1.1\rho - 0.15\rho'}) = O(k^{-0.25\rho}) \le \rho/3$ for $k$ sufficiently large. Now invoking Propositions 8 and 15, we conclude that $\mathrm{NoiseStab}_{1/k^\rho}(F_k)^* \le (k^{1-\rho'})^{-1+3(\rho/3)} \le k^{-1+10\rho}$, as claimed. □
We now outline the proof of Theorem 3. We begin with a $(1 - 1/n^\eta)$-hard function $f$ on $n$ inputs by using Proposition 17, and we wish to apply Theorem 1 using the function on $k$ inputs defined in Proposition 18. The trouble is that the initial function has an unknown bias. To circumvent this, we map input instances of length $n$ to various instance lengths around $L = L(n)$, and use the new input lengths as guesses for the bias of the initial hard function. This allows us to "know" the bias of $f$ to within about $\pm 1/L$. At this point we can easily cook up a slightly altered version of $f$ which is still hard and is within about $\pm 1/L$ of being balanced. This lets us apply Theorem 1, as long as $\epsilon/k \approx 1/L$. Since the expected bias we get from Proposition 18 is about $1/\sqrt{k}$, the largest $\epsilon$ we would like to pick is $1/\sqrt{k}$, leading to $1/k^{3/2} \approx 1/L \Rightarrow 1/\sqrt{k} \approx L^{-1/3}$. This is why the exponent of $1/3$ arises in Theorem 3. We now give the rigorous proof:
Proof (of Theorem 3). Let $\eta > 0$ be given, and assume there is a family of functions $(f_n)$ in NP that is infinitely often $(1 - 1/\mathrm{poly}(n))$-hard for circuits of polynomial size. By Proposition 17, there is a family of functions in NP, which we will also denote by $(f_n)$, that is infinitely often $(1 - 1/n^\eta)$-hard for polynomial circuits.

We now begin to define $h_m$. Let $C = C(\eta)$ be a large constant divisible by 3, to be chosen later. On input $x$ of length $m$, express $m = (n+1)^{C+1} + i$ for $n$ as large as possible with $i \ge 0$. Assuming $0 \le i < 8n^C$, we "guess" that the fraction of inputs on which $f_n$ is 1 is about $i/8n^C$. (Note that $i$ may be as large as $(n+2)^{C+1} - (n+1)^{C+1} > 8n^C$; when $i \ge 8n^C$, define $h_m$ arbitrarily, say $h_m \equiv 0$.) Specifically, define $f'_n : \{0,1\}^{n+1} \to \{0,1\}$ as follows: on input $yb$, where $|y| = n$, $|b| = 1$, put
$$f'_n(yb) = \begin{cases} f(y) & \text{if } b = 0, \\ 0 & \text{if } b = 1 \text{ and } y \text{ is one of the lexicographically first } (i/8n^C)2^n \text{ strings in } \{0,1\}^n, \\ 1 & \text{otherwise.} \end{cases}$$

The point of this construction is that for every $n$, there is a particular $i$ and hence a particular $m$ for which $f'_n$ becomes very close to balanced; specifically, $\mathrm{bias}(f'_n) \le 1/2 + 1/8n^C$. Further, note that $f'_n$ is easily seen to be $(1 - 2/n^\eta)$-hard for polynomial circuits.

Now we continue the definition of $h_m$. Pick $k = n^{(2/3)C}$ and let $F_k$ be the function from Proposition 18, with parameter $\rho > 3\eta/2C$ (with this choice, $2/n^\eta > 1/k^\rho$ for sufficiently large $n$). On input $x$ of length $m$ (where $m = (n+1)^{C+1} + i$ with $0 \le i < 8n^C$), write $x = y_1 y_2 \cdots y_k z$, where $|y_i| = n+1$ and $|z| = m - k(n+1) > 0$. Now define $h_m = F_k \otimes f'_n$, where the input bits $z$ are ignored.

One easily checks that the family $(h_m)$ is indeed in NP. We now show that $(h_m)$ is infinitely often $(1/2 + m^{-1/3+O(1)\eta})$-hard for polynomial circuits, which is sufficient to complete the proof.

Suppose $n$ is an input length for which $f_n$ is hard. Then as noted there is an $m$ for which $f'_n$ becomes close to balanced, $\mathrm{bias}(f'_n) \le 1/2 + 1/8n^C$. For this $m$, we claim that $h_m$ has the desired hardness. This follows in a straightforward manner from Theorem 1, using Propositions 18 and 7. To be exact, let $\delta = 2/n^\eta$, $\epsilon = 1/n^{(1/3)C}$, and $r = 1$. Note that $\mathrm{bias}(f'_n) \le 1/2 + 1/8n^C$ and $1/8n^C \le (1-2\delta)\epsilon/4k$. Theorem 1 tells us that $h_m$ is $(\mathrm{ExpBias}_\delta(F_k) + \epsilon)$-hard for polynomial circuits. By Propositions 7 and 18, $\mathrm{ExpBias}_\delta(F_k) + \epsilon \le 1/2 + k^{-1/2+5(3\eta/2C)} + 1/n^{(1/3)C} = 1/2 + n^{-(1/3)C+5\eta} + n^{-(1/3)C}$. As a function of the input length $m \le (n+2)^{C+1}$, the quantity $n^{-(1/3)C+5\eta} + n^{-(1/3)C}$ can be made smaller than $m^{-1/3+O(1)\eta}$ by taking $C = O(1/\eta)$. □
Acknowledgments
Thanks to Madhu Sudan for suggesting both the problem and the approach, as well as for many helpful discussions. Russell Impagliazzo independently proved a $1 - 1/\mathrm{poly} \to 1 - \Omega(1)$ hardness reduction for NP; many thanks to him for sharing his proof with me. Thanks to Adam Klivans for much explanation and encouragement; in particular, for emphasizing [10] to me. Thanks also to Rocco Servedio, Elchanan Mossel, and Michael Rosenblum for helpful discussions and comments.
Appendix A. Proof of Lemma 5
Here we prove the strong version of Lemma 5. We have changed notation slightly to make the statements more natural. Note too that we have slightly generalized by allowing $p$'s range to be $[-1,1]$ rather than $[0,1]$.
Lemma A.1. Let $B_n$ denote the $n$-dimensional Hamming cube, and suppose $h : B_n \to \{+1,-1\}$ and $p : B_n \to [-1,1]$. Further suppose that $|\mathbf{E}[hp]| \ge |\mathbf{E}[h]| + \epsilon$. Then there exists an edge $(x,y)$ in $B_n$ such that $|p(x) - p(y)| \ge \Omega(\epsilon/\sqrt{n})$.
Proof. Let $d$ be the maximum of $|p(x) - p(y)|$ over all edges $(x,y)$ in $B_n$. For a subset $C \subseteq B_n$, let $\mu_p(C)$ denote the average value of $p$ on $C$. Let $m$ be the median value of $p$ on $B_n$, and partition $B_n$ into two sets $A^+$ and $A^-$ of equal size, such that $p(x) \ge m \ge p(y)$ for all $x \in A^+$ and $y \in A^-$.

We are interested in the number of points in $B_n$ at small distance from $A^+$. When $|A^+| = \frac{1}{2}2^n$, an isoperimetric inequality on the cube (see, e.g., [1]) tells us that for every distance this number is minimized when $A^+$ is a ball of radius $n/2$. (Strictly speaking, $n$ should be odd and the radius should be $\lfloor n/2 \rfloor$. Since we only care about asymptotics in $n$, we gloss over such niceties.) It follows that the average distance from $A^+$ (over all points) is maximized when $A^+$ is such a ball.
The average distance in $B_n$ to a ball of radius $n/2$ is $O(\sqrt{n})$; this fact is fairly standard. To see it, suppose without loss of generality the ball is $A = \{x \in B_n : \mathrm{MAJ}(x) = 0\}$. Then the distance from a point $x$ to $A$ is equal to 0 if $\mathrm{MAJ}(x) = 0$, and is equal to $\mathrm{weight}(x) - n/2$ otherwise. The result now follows from the fact that $\mathrm{weight}(x)$ is distributed approximately as a normal variable $N(n/2, n/4)$, and that $\mathbf{E}\left[|N(n/2,n/4) - n/2|\right] = O(\sqrt{n})$.

Hence the average distance to $A^+$ over all points in $B_n$ is $O(\sqrt{n})$. Indeed, since half of all points have distance 0 from $A^+$, we can conclude that the average distance to $A^+$, over any set of size $\frac{1}{2}2^n$, is $O(\sqrt{n})$ (gaining a factor of 2 in the $O(\cdot)$).
But if the Hamming distance between two points $x$ and $y$ is $\Delta$, then $|p(x) - p(y)| \le \Delta d$. Since the smallest possible value of $p$ on $A^+$ is $m$, it follows that $\mu_p(C) \ge m - O(\sqrt{n}\,d)$ for any set $|C| = \frac{1}{2}2^n$.

Running the same argument with $A^-$ in place of $A^+$ yields that $\mu_p(C) \le m + O(\sqrt{n}\,d)$ for any set $|C| = \frac{1}{2}2^n$. Writing $\theta = O(\sqrt{n}\,d)$, we conclude that $\mu_p(C) \in [m - \theta, m + \theta]$ for all sets $|C| = \frac{1}{2}2^n$.
Now let us turn our attention to $h$. Write $H^+ = h^{-1}(+1)$, $H^- = h^{-1}(-1)$, and assume without loss of generality that $\beta = 2^{-n}|H^+| \ge 1/2$. We first upper-bound $\mu_p(H^+)$. Let $M$ be the set of $\frac{1}{2}2^n$ points in $H^+$ with largest $p$-value. Then $\mu_p(M) \le m + \theta$. The remaining points in $H^+$ have $p$-value at most $m$. Hence:
$$\mu_p(H^+) \le \frac{1/2}{\beta}(m + \theta) + \frac{\beta - 1/2}{\beta}\,m \ \Rightarrow\ \beta\mu_p(H^+) \le \beta m + \theta/2.$$
Now we lower-bound $\mu_p(H^-)$. Let $N$ be the set of $(\beta - \tfrac{1}{2})2^n$ points outside of $H^-$ with smallest $p$-value. Then $|N \cup H^-| = \frac{1}{2}2^n$, so $\mu_p(N \cup H^-) \ge m - \theta$. On the other hand, the points in $N$ have $p$-value at most $m$. Hence:
$$\frac{\beta - 1/2}{1/2}\,m + \frac{1-\beta}{1/2}\,\mu_p(H^-) \ge m - \theta \ \Rightarrow\ (1-\beta)\mu_p(H^-) \ge (1-\beta)m - \theta/2.$$
Subtracting the two inequalities, we get:
$$\beta\mu_p(H^+) - (1-\beta)\mu_p(H^-) \le (2\beta - 1)m + \theta \ \Rightarrow\ \mathbf{E}[hp] \le \mathbf{E}[h]\,m + \theta \ \Rightarrow\ \mathbf{E}[hp] \le |\mathbf{E}[h]| + \theta,$$
since $m \le 1$, and we assumed $\mathbf{E}[h] \ge 0$. Since we could replace $p$ by $-p$ throughout, we can in fact conclude $|\mathbf{E}[hp]| \le |\mathbf{E}[h]| + \theta$. But $|\mathbf{E}[hp]| \ge |\mathbf{E}[h]| + \epsilon$ by assumption. Hence $\theta \ge \epsilon$, which implies $d \ge \Omega(\epsilon/\sqrt{n})$. □
Appendix B. A corresponding easiness theorem
In this section we demonstrate that Theorem 1 is close to being tight, by showing how to construct, for any $\delta$ and any $g$, a balanced function $f$ which is $(1-\delta)$-hard, yet such that $g \otimes f$ is roughly $\mathrm{ExpBias}_{2\delta}(g)$-easy (i.e., there is a circuit which computes the function with this probability). The constructions of the functions used in this section are essentially those of Shaltiel [20].
Proposition B.1. There is a universal constant $\gamma \ge 1/8$ such that there exists a balanced function $h : \{0,1\}^n \to \{0,1\}$ which is $(1/2 + 2^{-\gamma n})$-hard for size $2^{\gamma n}$.
Proof. Folklore proof, by picking $h$ to be a random function. □
Definition B.1. For a given rational number $\delta$, define a canonical constant $z_\delta$ and a canonical constant-sized circuit $Z_\delta$ on $z_\delta$ inputs such that $\Pr[Z_\delta = 0] = \delta$.
Definition B.2. Given a function $f : \{0,1\}^n \to \{0,1\}$ and a rational constant $\delta$, define $f_\delta : \{0,1\}^{n+z_\delta+1} \to \{0,1\}$ by
$$f_\delta(x, a, b) = \begin{cases} f(x) & \text{if } Z_\delta(a) = 0, \\ b & \text{otherwise.} \end{cases}$$
Proposition B.2.
* If $f$ is balanced, so is $f_\delta$.
* If $f$ is $\eta$-hard for size $s$, then $f_\delta$ is $(1 - \delta + \delta\eta)$-hard for size $s$.
* $f_\delta$ is $(1-\delta)$-easy for size $O(1)$.
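The first and third bullets can be checked exhaustively on a toy instance of Definition B.2. In the sketch below (illustrative Python; the particular choice of $Z_\delta$ and of $f$ are ours) we take $\delta = 1/4$, $z_\delta = 2$, and parity as a convenient balanced $f$; the hardness bullet is of course not checkable this way.

```python
from itertools import product

# A toy instance of Definition B.2 with delta = 1/4: take z_delta = 2 bits a,
# and Z_delta(a) = 0 exactly when a == (0, 0), so Pr[Z_delta = 0] = 1/4.
DELTA = 0.25

def Z(a):
    return 0 if a == (0, 0) else 1

def f(x):
    # Any balanced f will do for these checks; parity on 4 bits is convenient.
    return sum(x) % 2

def f_delta(x, a, b):
    return f(x) if Z(a) == 0 else b

inputs = [(x, a, b)
          for x in product((0, 1), repeat=4)
          for a in product((0, 1), repeat=2)
          for b in (0, 1)]

# Balance is preserved (first bullet).
ones = sum(f_delta(x, a, b) for x, a, b in inputs)
assert ones * 2 == len(inputs)

# The trivial circuit "output b" is (1 - delta)-easy (third bullet);
# it is even a bit better, being right on half of the Z = 0 inputs too.
correct = sum(f_delta(x, a, b) == b for x, a, b in inputs)
assert correct / len(inputs) >= 1 - DELTA
```

This mirrors the role $f_\delta$ plays in Theorem B.1: on a $1-\delta$ fraction of inputs the answer is read off directly from $b$, and only on the remaining $\delta$ fraction does the hard function matter.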
Definition B.3. If $g : \{0,1\}^n \to \{0,1\}$, define $\mathrm{SIZE}(g)$ to be the size of the smallest circuit computing $g$ exactly, and define $\mathrm{R\text{-}B\text{-}SIZE}(g)$ (standing for "restriction-bias-size") to be the size of the smallest circuit which, on input a length-$n$ restriction $r$, outputs the bit value towards which $g_r$ is biased. (If $g_r$ is balanced, the circuit is allowed to output either 0 or 1.)
Of course, we cannot expect $g \otimes f$ to be easy to compute if $g$ itself is hard to compute. So intuitively, we think of $\mathrm{SIZE}(g)$ as being quite small. $\mathrm{R\text{-}B\text{-}SIZE}(g)$ may or may not be small; for many simple functions though, such as parity, threshold functions, or tribes functions, it is small. The easiness theorem we now prove, as a converse to Theorem 1, allows for a tradeoff depending on the comparative values of $\mathrm{SIZE}(g)$ and $\mathrm{R\text{-}B\text{-}SIZE}(g)$.
Theorem B.1. Let $\delta$ be a rational constant. Let $\delta' = \delta/(1 - 2^{-\gamma n}) = \delta + O(2^{-\gamma n})$, where $\gamma$ is the constant from Proposition B.1. Then there is a balanced function $f : \{0,1\}^{n+O(1)} \to \{0,1\}$ which is $(1-\delta)$-hard for circuits of size $2^{\gamma n}$, such that:

(1) $g \otimes f$ is $\mathrm{ExpBias}_{2\delta'}(g)$-easy for size $\mathrm{R\text{-}B\text{-}SIZE}(g) + O(k)$;
(2) $g \otimes f$ is $\mathrm{NoiseStab}_{\delta'}(g)$-easy for size $\mathrm{SIZE}(g) + O(k)$;
(3) $g \otimes f$ is $\left(\mathrm{ExpBias}_{2\delta'}(g) - O(1/\sqrt{m})\right)$-easy for size $m\,\mathrm{SIZE}(g) + O(mk)$.
Proof. Take h to be the hard function from Proposition B.1, and let f = h_{2δ'}. Then by Proposition B.2, f is balanced and (1 − δ)-hard for size 2^{γn}. But it is also (1 − 2δ')-easy for size O(1). We are now essentially in the Hard-Core scenario described in Section 2: with probability 1 − 2δ' we know the correct answer, and with probability 2δ' we have no good idea.

Let us call our inputs x_1, …, x_k, where x_i = (x'_i, a_i, b_i). Let A be the constant-sized circuit which on input (x'_i, a_i, b_i) ∈ {0,1}^{n+z_{δ'}+1} outputs b_i if Z_{δ'}(a_i) ≠ 0, or ∗ if Z_{δ'}(a_i) = 0. Let B be the circuit of size O(k) which applies a copy of A to each input x_i. The output of B is a length-k restriction, and we have the property that the output distribution of B is exactly that of P^k_{2δ'}.
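The distributional claim about B can be checked one coordinate at a time: each coordinate of the restriction is ∗ exactly when Z vanishes on a_i, and otherwise equals the uniform bit b_i, independently across coordinates. The following sketch verifies this for a hypothetical threshold instantiation Z(a) = 0 iff a = (0, 0) over 2-bit strings, giving ∗-probability 1/4:

```python
from itertools import product
from fractions import Fraction

def Z(a):
    # Hypothetical Z with vanishing probability 1/4: zero iff a = (0, 0).
    return 0 if a == (0, 0) else 1

def circuit_A(a, b):
    # One coordinate of B's output: b if Z(a) != 0, else the unset symbol '*'.
    return b if Z(a) != 0 else '*'

star_prob = Fraction(1, 4)
cases = list(product(product([0, 1], repeat=2), [0, 1]))
counts = {'*': 0, 0: 0, 1: 0}
for a, b in cases:
    counts[circuit_A(a, b)] += 1

n = len(cases)
# Coordinate is * with the vanishing probability of Z, and each
# bit value with half the remaining probability.
assert Fraction(counts['*'], n) == star_prob
assert Fraction(counts[0], n) == Fraction(counts[1], n) == (1 - star_prob) / 2
```

Since B applies an independent copy of A per coordinate, the joint output distribution is the k-fold product of this per-coordinate distribution.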
Now we apply the different guessing strategies suggested in Section 2, given the restriction ρ output by B.

To get result 1, we use a circuit of size R-B-SIZE(g) to output the more likely value of g_ρ over a uniform choice of inputs.
To get result 2, our circuit picks a random bit for each coordinate on which ρ is ∗; then a SIZE(g) circuit outputs the appropriate value of g. An averaging argument lets us trade off the randomness for nonuniformity.

To get result 3, we try to guess the bit towards which g_ρ is biased by trying many random values. Specifically, we take m copies of the randomized circuit for result 2 (size m·SIZE(g)), and take their majority (size O(mk)). Again, the O(mk) random bits can be traded for nonuniformity. Let a = adv(g_ρ), and assume without loss of generality that g_ρ is biased towards 0. The probability that the majority of m random outputs of g is 1 is at most Z = exp(−(a²/(1+a))·(m/4)), using the Chernoff bound. Consequently, the probability the circuit is correct is at least (1 − Z)(1/2 + a/2) + Z(1/2 − a/2) = 1/2 + a/2 − aZ = bias(g_ρ) − aZ. It can be shown that a·exp(−(a²/(1+a))·(m/4)) ≤ O(1/√m) (simple calculus shows that the quantity on the left is maximized for a = Θ(1/√m)). Averaging over ρ, we get the claimed result. □
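The calculus claim at the end of the proof can be spot-checked numerically: maximizing a·exp(−(a²/(1+a))·(m/4)) over a grid of advantages a shows the maximum scales like Θ(1/√m), with the maximizer near a = Θ(1/√m). A sketch (grid search rather than symbolic calculus; the constant 1.1 below is an empirical bound, not one claimed in the proof):

```python
import math

def majority_error_term(a, m):
    # a * exp(-(a^2/(1+a)) * m/4): the advantage times the Chernoff tail
    # from the proof of result 3.
    return a * math.exp(-(a * a / (1 + a)) * m / 4)

grid = [i / 1000 for i in range(1, 1001)]  # a in (0, 1]
for m in [16, 64, 256, 1024, 4096]:
    best_a = max(grid, key=lambda a: majority_error_term(a, m))
    best = majority_error_term(best_a, m)
    # The maximum value is O(1/sqrt(m))...
    assert best * math.sqrt(m) <= 1.1
    # ...and it is attained at a = Theta(1/sqrt(m)).
    assert 0.5 / math.sqrt(m) <= best_a <= 3 / math.sqrt(m)
```

(For small a the expression behaves like a·e^{−a²m/4}, whose exact maximizer a = √(2/m) gives the value √(2/m)·e^{−1/2} ≈ 0.86/√m, consistent with the bound checked above.)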