MATH 247A - HARMONIC ANALYSIS

Joseph Breen

Notes based on lectures from Math 247A at UCLA, Winter 2017, taught by Monica Visan.

Last updated: May 31, 2017

Contents

1 Preliminaries

2 1/9 – 1/11: The Fourier Transform

3 1/13 – 1/18: Lorentz Spaces

4 1/20: The Marcinkiewicz Interpolation Theorem

5 1/25 – 1/27: Maximal Functions, $A_p$ weights

6 1/30 – 2/1: Weighted Maximal Inequality, Vector-Valued Maximal Functions, and the Calderón-Zygmund Decomposition

7 2/3: Hardy-Littlewood-Sobolev inequalities

8 2/6: Sobolev Embedding Theorem

9 2/8 – 2/13: Calderón-Zygmund Convolution Kernels

10 2/13 – 2/22: Mikhlin Multiplier Theorem, Littlewood-Paley Projections

11 2/22: Littlewood-Paley square function

12 2/24 – 3/3: Applications of the square function to fractional derivatives

13 3/6 – 3/8: Oscillatory Integrals

14 3/13: The Schrödinger equation

1 Preliminaries

For convenience, we recall some basic facts and techniques that will be used frequently throughout these notes.

The first concerns integrating functions of the form $|x|^{-\alpha}$ in $\mathbb{R}^d$. In words, integrating $|x|^{-\alpha}$ over a region of $\mathbb{R}^d$ adds $d$ to the exponent. Explicitly, we have:

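Proposition 1.1. Fix $R > 0$. Then

1. The integral $\int_{|x|>R} |x|^{-\alpha}\,dx$ is finite precisely when $\alpha > d$, in which case $\int_{|x|>R} |x|^{-\alpha}\,dx \sim R^{-\alpha+d}$.

2. The integral $\int_{|x|<R} |x|^{-\alpha}\,dx$ is finite precisely when $\alpha < d$, in which case $\int_{|x|<R} |x|^{-\alpha}\,dx \sim R^{-\alpha+d}$.

Consequently, $|x|^{-\alpha} \notin L^p(\mathbb{R}^d)$ for $1 \le p < \infty$, for any choice of $\alpha$.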

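Proof. Fix $R > 0$ and set $r = |x|$. In polar coordinates, $dx = r^{d-1}\,dr\,d\sigma$, where $d\sigma$ is the surface measure on the unit sphere; the angular integral contributes only a constant depending on $d$. Thus, for $\alpha \neq d$,

$$\int_{|x|>R} \frac{1}{|x|^{\alpha}}\,dx \sim \int_{R}^{\infty} \frac{1}{r^{\alpha}}\,r^{d-1}\,dr = \int_{R}^{\infty} r^{-\alpha+d-1}\,dr \sim r^{-\alpha+d}\,\Big|_{r=R}^{\infty}.$$

This limit exists precisely when $-\alpha+d < 0$, i.e., when $\alpha > d$ (for $\alpha = d$ the integrand is $r^{-1}$, which diverges logarithmically), in which case

$$\int_{|x|>R} \frac{1}{|x|^{\alpha}}\,dx \sim R^{-\alpha+d}.$$

Similarly,

$$\int_{|x|<R} \frac{1}{|x|^{\alpha}}\,dx \sim r^{-\alpha+d}\,\Big|_{r=0}^{R}.$$

This limit exists precisely when $-\alpha+d > 0$, i.e., $\alpha < d$ (again, $\alpha = d$ gives a logarithmic divergence, now at $r = 0$), in which case

$$\int_{|x|<R} \frac{1}{|x|^{\alpha}}\,dx \sim R^{-\alpha+d}.$$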

Next, we will consider dyadic sums, that is, geometric sums with $r = 1/2$. Many computations in these notes involve summing over dyadic numbers, and we will frequently employ the following fact without explicit comment.

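Proposition 1.2. Let $2^{\mathbb{Z}} = \{ 2^n : n \in \mathbb{Z} \}$ be the set of dyadic numbers. Then for any $N_0 \in 2^{\mathbb{Z}}$,

$$\sum_{N \in 2^{\mathbb{Z}};\, N \ge N_0} \frac{1}{N} = 2 \cdot \frac{1}{N_0} \qquad\text{and}\qquad \sum_{N \in 2^{\mathbb{Z}};\, N \le N_0} N = 2 \cdot N_0.$$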

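Proof. Fix a dyadic number $N_0 = 2^{n_0} \in 2^{\mathbb{Z}}$. Then

$$\sum_{N \in 2^{\mathbb{Z}};\, N \ge N_0} \frac{1}{N} = \sum_{n \in \mathbb{Z};\, n \ge n_0} 2^{-n} = \sum_{n=n_0}^{\infty} \left(\frac{1}{2}\right)^{n} = \frac{\left(\frac{1}{2}\right)^{n_0}}{1 - \frac{1}{2}} = 2 \cdot 2^{-n_0} = 2 \cdot \frac{1}{N_0}.$$

Essentially the same computation gives $\sum_{N \in 2^{\mathbb{Z}};\, N \le N_0} N = 2 \cdot N_0$.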
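As a quick numerical sanity check of Proposition 1.2, the following sketch compares truncated dyadic sums with the closed forms $2/N_0$ and $2N_0$. The function names and the truncation length `terms` are illustrative choices, not part of the notes; the truncation error is far below floating-point precision.

```python
def dyadic_tail_sum(n0: int, terms: int = 200) -> float:
    """Truncated sum of 1/N over dyadic N = 2**n with n >= n0."""
    return sum(2.0 ** (-n) for n in range(n0, n0 + terms))


def dyadic_head_sum(n0: int, terms: int = 200) -> float:
    """Truncated sum of N over dyadic N = 2**n with n <= n0."""
    return sum(2.0 ** n for n in range(n0 - terms + 1, n0 + 1))


for n0 in (-3, 0, 5):
    N0 = 2.0 ** n0
    # Each pair should agree up to rounding: (tail sum, 2/N0), (head sum, 2*N0).
    print(n0, dyadic_tail_sum(n0), 2.0 / N0, dyadic_head_sum(n0), 2.0 * N0)
```

The point of the proposition is that a dyadic sum is comparable to its largest term, which is why such sums are routinely replaced by a single term in later estimates.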

Next, we present and prove the Riesz-Thorin interpolation theorem. The Marcinkiewicz interpolation theorem, which will be stated and proved later, is a major tool in these notes, though we will periodically use Riesz-Thorin. Here, we follow the presentation in Stein & Shakarchi, volume IV.

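Theorem 1.1 (Riesz-Thorin interpolation). Suppose that $T$ is a bounded linear map from $L^{p_0} + L^{p_1} \to L^{q_0} + L^{q_1}$ and that

$$\|Tf\|_{L^{q_0}} \le M_0 \|f\|_{L^{p_0}} \quad\text{and}\quad \|Tf\|_{L^{q_1}} \le M_1 \|f\|_{L^{p_1}}$$

for some $1 \le p_0, p_1, q_0, q_1 \le \infty$. Then

$$\|Tf\|_{L^{q_t}} \le M_0^{1-t} M_1^{t} \|f\|_{L^{p_t}}$$

where $\frac{1}{p_t} = \frac{1-t}{p_0} + \frac{t}{p_1}$ and $\frac{1}{q_t} = \frac{1-t}{q_0} + \frac{t}{q_1}$, for any $0 \le t \le 1$.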

The proof of this theorem relies on complex analysis. Specifically, we need the following lemma.

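Lemma 1.2 (Three Lines lemma). Suppose that $\Phi(z)$ is holomorphic in $S = \{0 < \operatorname{Re} z < 1\}$ and continuous and bounded on $\overline{S}$. Let

$$M_0 := \sup_{y \in \mathbb{R}} |\Phi(iy)| \quad\text{and}\quad M_1 := \sup_{y \in \mathbb{R}} |\Phi(1 + iy)|.$$

Then

$$\sup_{y \in \mathbb{R}} |\Phi(t + iy)| \le M_0^{1-t} M_1^{t}$$

for any $0 \le t \le 1$.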

Proof. We can assume without loss of generality that $\Phi$ is nonconstant, since otherwise the conclusion is trivial. We first prove a special case of the lemma. Suppose that $M_0 = M_1 = 1$ and that $\sup_{0 \le x \le 1} |\Phi(x + iy)| \to 0$ as $|y| \to \infty$. In this case, $|\Phi|$ has a global supremum $M > 0$ on $\overline{S}$. Let $\{z_n\}$ be a sequence such that $|\Phi(z_n)| \to M$. Because $\sup_{0 \le x \le 1} |\Phi(x + iy)| \to 0$ as $|y| \to \infty$, the sequence $\{z_n\}$ is bounded, hence there is a point $z_\infty \in \overline{S}$ such that a subsequence of $\{z_n\}$ converges to $z_\infty$. By continuity, $|\Phi(z_\infty)| = M$. Since $\Phi$ is nonconstant, the maximum modulus principle rules out $z_\infty \in S$, so $z_\infty$ is on the boundary of $S$. Since $M_0 = M_1 = 1$, we have $|\Phi| \le 1$ on the boundary, hence $M \le 1$; that is, $|\Phi(z)|$ is globally bounded by 1. This proves the special case.

Next, we remove the decay assumption and only suppose that $M_0 = M_1 = 1$. For $\varepsilon > 0$, define

$$\Phi_\varepsilon(z) := \Phi(z)\, e^{\varepsilon(z^2 - 1)}.$$

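First observe that $|\Phi_\varepsilon(z)| \le 1$ if $\operatorname{Re} z = 0$ or $\operatorname{Re} z = 1$. Indeed, note that for $y \in \mathbb{R}$ we have

$$|\Phi_\varepsilon(iy)| = |\Phi(iy)| \left| e^{\varepsilon((iy)^2 - 1)} \right| \le \left| e^{-\varepsilon(y^2 + 1)} \right| \le 1$$

and

$$|\Phi_\varepsilon(1 + iy)| = |\Phi(1 + iy)| \left| e^{\varepsilon((1+iy)^2 - 1)} \right| \le \left| e^{\varepsilon(1 + 2iy - y^2 - 1)} \right| = \left| e^{2iy\varepsilon} \right| \left| e^{-\varepsilon y^2} \right| \le 1.$$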

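Next, we claim that $\Phi_\varepsilon$ satisfies the decay condition from the previous case. Indeed, note that

$$|\Phi_\varepsilon(x + iy)| = |\Phi(x + iy)| \left| e^{\varepsilon((x+iy)^2 - 1)} \right| = |\Phi(x + iy)| \left| e^{\varepsilon(x^2 - y^2 - 1 + 2xyi)} \right| \le |\Phi(x + iy)|\, e^{\varepsilon(x^2 - y^2 - 1)}.$$

Because $0 \le x \le 1$ and because $|\Phi|$ is bounded, as $|y| \to \infty$, $|\Phi_\varepsilon(x + iy)| \to 0$ uniformly in $x$. By the previous case, $|\Phi_\varepsilon(z)| \le 1$ uniformly in $S$. Since this holds for all $\varepsilon > 0$, letting $\varepsilon \to 0$ gives the desired result for $\Phi$.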

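Finally, we assume that $M_0$ and $M_1$ are arbitrary positive numbers. Define $\tilde{\Phi}(z) := M_0^{z-1} M_1^{-z}\, \Phi(z)$. Note that if $\operatorname{Re} z = 0$, say $z = iy$, then

$$|\tilde{\Phi}(iy)| \le |M_0^{iy-1} M_1^{-iy}|\, M_0 = |M_0^{iy}| \cdot |M_1^{-iy}| = 1$$

and if $\operatorname{Re} z = 1$, say $z = 1 + iy$, then

$$|\tilde{\Phi}(1 + iy)| \le |M_0^{iy} M_1^{-1-iy}|\, M_1 = |M_0^{iy}| \cdot |M_1^{-iy}| = 1.$$

The claim then follows by applying the previous case to $\tilde{\Phi}$, since $|\tilde{\Phi}(t + iy)| = M_0^{t-1} M_1^{-t}\, |\Phi(t + iy)|$.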

With this lemma, we prove the Riesz-Thorin interpolation theorem.

Proof. By scaling appropriately, we may assume without loss of generality that $\|f\|_{L^{p_t}} = 1$. First, consider the case when $f$ is a simple function.

2 1/9 – 1/11: The Fourier Transform

Definition 2.1. For $f \in L^1(\mathbb{R}^d)$, the Fourier transform of $f$ is given by

$$\mathcal{F}(f)(\xi) = \hat{f}(\xi) = (f)^{\wedge}(\xi) := \int_{\mathbb{R}^d} e^{-2\pi i x \cdot \xi} f(x)\,dx.$$

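We are interested in how various operations affect the Fourier transform. We first consider sufficiently nice functions:

Definition 2.2. A $C^\infty$ function $f : \mathbb{R}^d \to \mathbb{C}$ is Schwartz if $x^\alpha D^\beta f \in L^\infty_x(\mathbb{R}^d)$ for all multiindices $\alpha, \beta \in \mathbb{N}^d$. The collection of Schwartz functions, Schwartz space, is denoted $\mathcal{S}(\mathbb{R}^d)$.

For completeness, we recall the multiindex notation: for $\alpha = (\alpha_1, \dots, \alpha_d) \in \mathbb{N}^d$,

$$|\alpha| := \alpha_1 + \cdots + \alpha_d; \qquad x^\alpha := x_1^{\alpha_1} \cdots x_d^{\alpha_d}; \qquad D^\alpha := \frac{\partial^{|\alpha|}}{\partial x_1^{\alpha_1} \cdots \partial x_d^{\alpha_d}}.$$

The set of Schwartz functions forms a Fréchet space, i.e., a locally convex space (a vector space endowed with a family of seminorms $\{\rho_\alpha\}$ that separates points: if $\rho_\alpha(f) = 0$ for all $\alpha$, then $f = 0$) which is metrizable and complete. In the case of $\mathcal{S}(\mathbb{R}^d)$, the collection of seminorms is given by $\{\rho_{\alpha,\beta}\}_{\alpha,\beta \in \mathbb{N}^d}$, where

$$\rho_{\alpha,\beta}(f) := \left\| x^\alpha D^\beta f \right\|_{L^\infty_x(\mathbb{R}^d)}.$$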

Next, we prove a number of properties of the Fourier transform of Schwartz functions.

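Proposition 2.1. Let $f \in \mathcal{S}(\mathbb{R}^d)$. Then:

1. If $g(x) := f(x - y)$, then $\hat{g}(\xi) = e^{-2\pi i y \cdot \xi} \hat{f}(\xi)$;

2. If $g(x) := e^{2\pi i x \cdot \eta} f(x)$, then $\hat{g}(\xi) = \hat{f}(\xi - \eta)$;

3. If $g(x) := f(Tx)$ for $T \in GL(\mathbb{R}^d)$, then $\hat{g}(\xi) = |\det T|^{-1} \hat{f}(T^{-t}\xi)$. In particular, if $T$ is a rotation or reflection and $f(Tx) = f(x)$ (i.e. $f$ is rotation or reflection invariant), then $\hat{g}(\xi) = \hat{f}(\xi)$;

4. If $g(x) := \overline{f(x)}$, then $\hat{g}(\xi) = \overline{\hat{f}(-\xi)}$;

5. If $g(x) := D^\alpha f(x)$, then $\hat{g}(\xi) = (2\pi i \xi)^\alpha \hat{f}(\xi)$;

6. If $g(x) := x^\alpha f(x)$, then $\hat{g}(\xi) = \left(\frac{i}{2\pi}\right)^{|\alpha|} D^\alpha_\xi \hat{f}(\xi)$;

7. If $g(x) := (k * f)(x)$ for $k \in L^1(\mathbb{R}^d)$, then $\hat{g}(\xi) = \hat{k}(\xi) \cdot \hat{f}(\xi)$.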

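Proof.

1. Making the change of variables $z = x - y$,

$$\hat{g}(\xi) = \int e^{-2\pi i x \cdot \xi} f(x - y)\,dx = \int e^{-2\pi i (z+y) \cdot \xi} f(z)\,dz = e^{-2\pi i y \cdot \xi} \int e^{-2\pi i z \cdot \xi} f(z)\,dz = e^{-2\pi i y \cdot \xi} \hat{f}(\xi).$$

2. $$\hat{g}(\xi) = \int e^{-2\pi i x \cdot \xi} e^{2\pi i x \cdot \eta} f(x)\,dx = \int e^{-2\pi i x \cdot (\xi - \eta)} f(x)\,dx = \hat{f}(\xi - \eta).$$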

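3. For $T \in GL(\mathbb{R}^d)$, make the change of variables $y = Tx$:

$$\hat{g}(\xi) = \int e^{-2\pi i x \cdot \xi} f(Tx)\,dx = \int e^{-2\pi i (T^{-1}y) \cdot \xi} f(y)\, |\det T|^{-1}\,dy = |\det T|^{-1} \int e^{-2\pi i y \cdot (T^{-t}\xi)} f(y)\,dy = |\det T|^{-1} \hat{f}(T^{-t}\xi).$$

If $T$ is a rotation or reflection (or more generally, $T$ is orthogonal), then $T^t T = T T^t = I$, so $T^{-t} = T$ and $|\det T| = 1$. Thus, in this case, $\hat{g}(\xi) = \hat{f}(T\xi)$. If moreover $f(Tx) = f(x)$, then $g = f$, so $\hat{g}(\xi) = \hat{f}(\xi)$; in particular, $\hat{f}$ has the same invariance as $f$.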

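4. $$\hat{g}(\xi) = \int e^{-2\pi i x \cdot \xi}\, \overline{f(x)}\,dx = \overline{\int e^{2\pi i x \cdot \xi} f(x)\,dx} = \overline{\int e^{-2\pi i x \cdot (-\xi)} f(x)\,dx} = \overline{\hat{f}(-\xi)}.$$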

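5. Integrating by parts, we have:

$$\hat{g}(\xi) = \int e^{-2\pi i x \cdot \xi} D^\alpha f(x)\,dx = (-1)^{|\alpha|} \int D^\alpha_x \left( e^{-2\pi i x \cdot \xi} \right) f(x)\,dx = (-1)^{|\alpha|} \int (-2\pi i \xi)^\alpha e^{-2\pi i x \cdot \xi} f(x)\,dx = (2\pi i \xi)^\alpha \int e^{-2\pi i x \cdot \xi} f(x)\,dx = (2\pi i \xi)^\alpha \hat{f}(\xi).$$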

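6. Note that $D^\alpha_\xi e^{-2\pi i x \cdot \xi} = (-2\pi i x)^\alpha e^{-2\pi i x \cdot \xi}$, so that $x^\alpha e^{-2\pi i x \cdot \xi} = (-2\pi i)^{-|\alpha|} D^\alpha_\xi e^{-2\pi i x \cdot \xi} = \left(\frac{i}{2\pi}\right)^{|\alpha|} D^\alpha_\xi e^{-2\pi i x \cdot \xi}$. Thus, differentiating under the integral sign gives us:

$$\hat{g}(\xi) = \int e^{-2\pi i x \cdot \xi} x^\alpha f(x)\,dx = \left(\frac{i}{2\pi}\right)^{|\alpha|} D^\alpha_\xi \int e^{-2\pi i x \cdot \xi} f(x)\,dx = \left(\frac{i}{2\pi}\right)^{|\alpha|} D^\alpha_\xi \hat{f}(\xi).$$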

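7. Computing, we have

$$\hat{g}(\xi) = \int e^{-2\pi i x \cdot \xi} (k * f)(x)\,dx = \int e^{-2\pi i x \cdot \xi} \int k(x - y) f(y)\,dy\,dx = \int\!\!\int e^{-2\pi i x \cdot \xi} k(x - y) f(y)\,dx\,dy.$$

We make a change of variables $z = x - y$ in the inner integral to get

$$\hat{g}(\xi) = \int\!\!\int e^{-2\pi i (z+y) \cdot \xi} k(z) f(y)\,dz\,dy = \int e^{-2\pi i z \cdot \xi} k(z)\,dz \cdot \int e^{-2\pi i y \cdot \xi} f(y)\,dy = \hat{k}(\xi) \cdot \hat{f}(\xi).$$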

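Note that properties 1, 2, 3, 4, and 7 extend to functions in $L^1(\mathbb{R}^d)$. Also note that

$$\left| \int e^{-2\pi i x \cdot \xi} f(x)\,dx \right| \le \int |e^{-2\pi i x \cdot \xi}|\,|f(x)|\,dx = \int |f(x)|\,dx$$

so that $\|\hat{f}\|_{L^\infty_\xi} \le \|f\|_{L^1_x}$.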

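Proposition 2.2. If $f \in \mathcal{S}(\mathbb{R}^d)$, then $\hat{f} \in \mathcal{S}(\mathbb{R}^d)$. Moreover, if $f_n \to f$ in $\mathcal{S}(\mathbb{R}^d)$, then $\hat{f}_n \to \hat{f}$ in $\mathcal{S}(\mathbb{R}^d)$.

Proof. Suppose that $f \in \mathcal{S}(\mathbb{R}^d)$. We wish to show that $\hat{f} \in \mathcal{S}(\mathbb{R}^d)$. So consider

$$|\xi^\alpha D^\beta \hat{f}(\xi)| \sim |\xi^\alpha\, \widehat{x^\beta f}(\xi)| \sim |\widehat{D^\alpha (x^\beta f)}(\xi)| = \left| \int e^{-2\pi i x \cdot \xi} D^\alpha (x^\beta f(x))\,dx \right|.$$

Because $f$ is Schwartz, $D^\alpha (x^\beta f(x)) \in L^1(\mathbb{R}^d)$. Therefore, $\|\xi^\alpha D^\beta \hat{f}(\xi)\|_{L^\infty_\xi} \lesssim \|D^\alpha (x^\beta f(x))\|_{L^1} < \infty$, so that $\hat{f} \in \mathcal{S}(\mathbb{R}^d)$. It is clear that $f_n \to f$ in $\mathcal{S}(\mathbb{R}^d)$ if and only if $\hat{f}_n \to \hat{f}$ in $\mathcal{S}(\mathbb{R}^d)$ by properties 5 and 6 of the above Proposition.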

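Corollary 2.0.1 (Riemann-Lebesgue Lemma). If $f \in L^1(\mathbb{R}^d)$, then $\hat{f}$ is uniformly continuous and vanishes at $\infty$.

Proof. We write $C_0(\mathbb{R}^d)$ for the space of continuous functions which vanish at $\infty$; every such function is uniformly continuous. Let $\{f_n\}$ be a sequence of Schwartz functions so that $f_n \to f$ in $L^1(\mathbb{R}^d)$. Since

$$\|\hat{f}_n - \hat{f}\|_{L^\infty} \le \|f_n - f\|_{L^1},$$

$\hat{f}_n \to \hat{f}$ in $L^\infty(\mathbb{R}^d)$. Since $f_n \in \mathcal{S}(\mathbb{R}^d)$, we have $\hat{f}_n \in \mathcal{S}(\mathbb{R}^d) \subseteq C_0(\mathbb{R}^d)$. Since $C_0(\mathbb{R}^d)$ is closed under uniform convergence, we are done.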

Next, we perform an important computation.

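Lemma 2.1. Let $A$ be a real, symmetric, positive definite $d \times d$ matrix. Then

$$\int e^{-x \cdot Ax} e^{-2\pi i x \cdot \xi}\,dx = \pi^{\frac{d}{2}} (\det A)^{-\frac{1}{2}} e^{-\pi^2 \xi \cdot A^{-1} \xi}.$$

Proof. Since $A$ is real, symmetric, and positive definite, it is diagonalizable. Explicitly, there is an orthogonal matrix $O$ and a diagonal matrix $D = \operatorname{diag}(\lambda_1, \dots, \lambda_d)$, $\lambda_j > 0$, so that $A = O^T D O$. Let $y = Ox$ and $\eta = O\xi$. Then

$$x \cdot Ax = x \cdot O^T D O x = Ox \cdot D Ox = y \cdot Dy = \sum_{j=1}^{d} \lambda_j y_j^2$$

and

$$x \cdot \xi = O^T y \cdot O^T \eta = y \cdot \eta.$$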

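Thus, since $|\det O| = 1$,

$$\int e^{-x \cdot Ax} e^{-2\pi i x \cdot \xi}\,dx = \int e^{-\sum_{j=1}^{d} \lambda_j y_j^2} e^{-2\pi i y \cdot \eta}\,dy = \prod_{j=1}^{d} \int_{\mathbb{R}} e^{-\lambda_j y_j^2 - 2\pi i y_j \eta_j}\,dy_j.$$

Computing each of these one-variable integrals by completing the square and shifting the contour of integration, we have:

$$\int_{\mathbb{R}} e^{-\lambda y^2 - 2\pi i y \eta}\,dy = \int_{\mathbb{R}} e^{-\lambda \left(y + \frac{\pi i \eta}{\lambda}\right)^2 - \frac{\pi^2 \eta^2}{\lambda}}\,dy = e^{-\frac{\pi^2 \eta^2}{\lambda}} \int_{\mathbb{R}} e^{-\lambda y^2}\,dy = e^{-\frac{\pi^2 \eta^2}{\lambda}} \lambda^{-\frac{1}{2}} \pi^{\frac{1}{2}}.$$

So

$$\int e^{-x \cdot Ax} e^{-2\pi i x \cdot \xi}\,dx = \prod_{j=1}^{d} e^{-\frac{\pi^2 \eta_j^2}{\lambda_j}} \lambda_j^{-\frac{1}{2}} \pi^{\frac{1}{2}} = \pi^{\frac{d}{2}} (\lambda_1 \cdots \lambda_d)^{-\frac{1}{2}} e^{-\sum_{j=1}^{d} \frac{\pi^2}{\lambda_j} \eta_j^2} = \pi^{\frac{d}{2}} (\det A)^{-\frac{1}{2}} e^{-\pi^2 \eta \cdot D^{-1} \eta}.$$

The result follows from the fact that $\eta \cdot D^{-1} \eta = O\xi \cdot D^{-1} O\xi = \xi \cdot O^T D^{-1} O \xi = \xi \cdot A^{-1} \xi$.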

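Corollary 2.1.1. The function $e^{-\pi |x|^2}$ is an eigenfunction of the Fourier transform with eigenvalue 1. Explicitly,

$$\widehat{e^{-\pi |x|^2}}(\xi) = e^{-\pi |\xi|^2}.$$

Proof. This follows from the previous lemma with $A = \pi I_d$. Indeed, $\det A = \pi^d$ and $A^{-1} = \frac{1}{\pi} I_d$, so

$$\int e^{-\pi |x|^2} e^{-2\pi i x \cdot \xi}\,dx = \pi^{\frac{d}{2}} (\pi^d)^{-\frac{1}{2}} e^{-\pi^2 \xi \cdot \frac{1}{\pi} \xi} = e^{-\pi |\xi|^2}.$$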
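The corollary can be checked numerically in dimension $d = 1$. The sketch below approximates the Fourier integral by a Riemann sum on a truncated interval; the helper names, the truncation `half_width`, and the step size are illustrative assumptions, chosen so that the truncation and discretization errors are negligible for a Gaussian.

```python
import cmath
import math


def fourier_transform_1d(f, xi, half_width=10.0, step=1e-2):
    """Riemann-sum approximation of the integral of exp(-2*pi*i*x*xi) * f(x)
    over [-half_width, half_width]."""
    n = int(round(2 * half_width / step))
    total = 0j
    for k in range(n + 1):
        x = -half_width + k * step
        total += cmath.exp(-2j * math.pi * x * xi) * f(x)
    return total * step


def gaussian(x):
    """The function exp(-pi * x^2), expected to be its own Fourier transform."""
    return math.exp(-math.pi * x * x)


for xi in (0.0, 0.7, 1.5):
    approx = fourier_transform_1d(gaussian, xi)
    # The real part should match gaussian(xi); the imaginary part should be ~0.
    print(xi, approx.real, approx.imag, gaussian(xi))
```

This only tests the normalization $e^{-2\pi i x \cdot \xi}$ used in these notes; with a different convention for the Fourier transform the fixed Gaussian changes.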

Theorem 2.2 (Fourier Inversion). If $f \in \mathcal{S}(\mathbb{R}^d)$, then $\hat{\hat{f}}(x) = f(-x)$. Equivalently,

$$f(x) = \hat{\hat{f}}(-x) =: (\hat{f})^{\vee}(x) =: \mathcal{F}^{-1}(\hat{f})(x).$$

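Proof. For $\varepsilon > 0$, let

$$I_\varepsilon(x) := \int e^{-\pi \varepsilon^2 |\xi|^2} e^{2\pi i x \cdot \xi} \hat{f}(\xi)\,d\xi.$$

By the dominated convergence theorem, as $\varepsilon \to 0$, $I_\varepsilon(x) \to \int e^{2\pi i x \cdot \xi} \hat{f}(\xi)\,d\xi = \hat{\hat{f}}(-x)$. Thus, to prove the theorem, it suffices to show that $I_\varepsilon(x) \to f(x)$ as $\varepsilon \to 0$. We have

$$I_\varepsilon(x) = \int e^{-\pi \varepsilon^2 |\xi|^2} e^{2\pi i x \cdot \xi} \hat{f}(\xi)\,d\xi = \int e^{-\pi \varepsilon^2 |\xi|^2} e^{2\pi i x \cdot \xi} \int e^{-2\pi i y \cdot \xi} f(y)\,dy\,d\xi = \int f(y) \int e^{-\pi \varepsilon^2 |\xi|^2} e^{-2\pi i (y - x) \cdot \xi}\,d\xi\,dy.$$

Note that

$$\int e^{-\pi \varepsilon^2 |\xi|^2} e^{-2\pi i (y-x) \cdot \xi}\,d\xi = \widehat{e^{-\pi \varepsilon^2 |\xi|^2}}(y - x) = \pi^{\frac{d}{2}} \left( (\pi \varepsilon^2)^d \right)^{-\frac{1}{2}} e^{-\pi^2 (y-x) \cdot \frac{1}{\pi \varepsilon^2} (y-x)} = \varepsilon^{-d} e^{-\pi \frac{|y-x|^2}{\varepsilon^2}}.$$

Consequently,

$$I_\varepsilon(x) = \int f(y)\, \varepsilon^{-d} e^{-\pi \frac{|y-x|^2}{\varepsilon^2}}\,dy = (f * \varphi_\varepsilon)(x)$$

where $\varphi_\varepsilon(x) = \varepsilon^{-d} \varphi\left(\frac{x}{\varepsilon}\right)$ with $\varphi(x) = e^{-\pi |x|^2}$. It is a standard result that $\{\varphi_\varepsilon\}$ is an approximation to the identity. Thus, as $\varepsilon \to 0$,

$$I_\varepsilon(x) = (f * \varphi_\varepsilon)(x) \to f(x).$$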

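Lemma 2.3. If $f, g \in \mathcal{S}(\mathbb{R}^d)$, then $\int \hat{f}(\xi) g(\xi)\,d\xi = \int f(x) \hat{g}(x)\,dx$. In particular,

$$\int \hat{f}(\xi)\, \overline{\hat{g}(\xi)}\,d\xi = \int f(x)\, \overline{g(x)}\,dx,$$

so that $\|\hat{f}\|_{L^2} = \|f\|_{L^2}$. Hence, the Fourier transform is an isometry on $\mathcal{S}(\mathbb{R}^d)$ with respect to the $L^2$ norm.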

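Proof. Computing, we have

$$\int \hat{f}(\xi) g(\xi)\,d\xi = \int\!\!\int e^{-2\pi i x \cdot \xi} f(x)\,dx\, g(\xi)\,d\xi = \int f(x) \int e^{-2\pi i x \cdot \xi} g(\xi)\,d\xi\,dx = \int f(x) \hat{g}(x)\,dx.$$

For the “in particular” statement, let $h = \overline{\hat{g}}$. Then, by property 4 of Proposition 2.1 and Fourier inversion,

$$\hat{h}(x) = \overline{\hat{\hat{g}}(-x)} = \overline{g(x)}.$$

Applying the identity just proved with $h$ in place of $g$ gives the claim.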

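Theorem 2.4 (Plancherel). The Fourier transform extends from an operator on $\mathcal{S}(\mathbb{R}^d)$ to a unitary operator on $L^2(\mathbb{R}^d)$.

Proof. Fix $f \in L^2(\mathbb{R}^d)$. Let $\{f_n\} \subseteq \mathcal{S}(\mathbb{R}^d)$ be such that $f_n \xrightarrow{L^2} f$. As the Fourier transform is an isometry on $\mathcal{S}(\mathbb{R}^d)$, we have $\|\hat{f}_n - \hat{f}_m\|_{L^2} = \|f_n - f_m\|_{L^2}$. This shows that $\{\hat{f}_n\}$ is Cauchy in $L^2$. Let $\hat{f} := \lim_{n \to \infty} \hat{f}_n$, the limit being taken in the $L^2$-sense.

We claim that $\hat{f}$ is well-defined. Let $\{f_n\}$ and $\{g_n\}$ be two sequences of Schwartz functions such that $f_n, g_n \xrightarrow{L^2} f$. Define

$$h_n := \begin{cases} f_k & \text{if } n = 2k - 1, \\ g_k & \text{if } n = 2k. \end{cases}$$

Then $h_n \xrightarrow{L^2} f$. By the argument above, $\{\hat{h}_n\}$ converges in $L^2$, which implies by the uniqueness of limits that $\lim_{n \to \infty} \hat{f}_n = \lim_{n \to \infty} \hat{g}_n$.

Next we claim that if $f \in L^2(\mathbb{R}^d)$, then $\|\hat{f}\|_{L^2} = \|f\|_{L^2}$, so the Fourier transform is an isometry on $L^2(\mathbb{R}^d)$. Because the norm function is continuous in the $L^2$ topology, we have

$$\|\hat{f}\|_{L^2} = \lim_{n \to \infty} \|\hat{f}_n\|_{L^2} = \lim_{n \to \infty} \|f_n\|_{L^2} = \left\| \lim_{n \to \infty} f_n \right\|_{L^2} = \|f\|_{L^2}.$$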

Before completing the proof, we remark that in infinite dimensions, an isometry is not necessarily a unitary operator. For example, let $T : \ell^2(\mathbb{N}) \to \ell^2(\mathbb{N})$ be the right-shift operator given by

$$T(a_0, a_1, a_2, \dots) := (0, a_0, a_1, \dots).$$

Then

$$T^*(a_0, a_1, a_2, \dots) = (a_1, a_2, a_3, \dots).$$

Clearly $T$ is an isometry, but $TT^* \neq I$.

However, to prove that the isometry $\mathcal{F}$ is unitary, it suffices to prove that $\mathcal{F}$ is surjective. We claim that $\operatorname{im} \mathcal{F}$ is closed in $L^2$. From this claim, since $\mathcal{S}(\mathbb{R}^d) \subseteq \operatorname{im} \mathcal{F}$ and $\mathcal{S}(\mathbb{R}^d)$ is dense in $L^2(\mathbb{R}^d)$, the proof is complete.

To demonstrate this claim, let $g \in \overline{\operatorname{im} \mathcal{F}}$. Then there is a sequence $\{f_n\}$ of $L^2$ functions so that $\hat{f}_n \xrightarrow{L^2} g$. As the Fourier transform is an isometry on $L^2$, this implies that $\{f_n\}$ is Cauchy, hence converges in $L^2$. Let $f := \lim_{n \to \infty} f_n$. Then $\|\hat{f}_n - \hat{f}\|_{L^2} = \|f_n - f\|_{L^2} \to 0$, so that $g = \hat{f}$.

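Next, we show that the Fourier transform extends as a bounded map from $L^p(\mathbb{R}^d)$ to $L^{p'}(\mathbb{R}^d)$ for $1 \le p \le 2$ using the Riesz-Thorin interpolation theorem. For completeness, we recall the theorem here.

Theorem 2.5 (Riesz-Thorin Interpolation). Suppose that $T$ is a bounded linear map from $L^{p_0} \to L^{q_0}$ and $L^{p_1} \to L^{q_1}$. Then $T$ is a bounded linear map from $L^{p_\theta} \to L^{q_\theta}$ for $0 \le \theta \le 1$, where

$$\frac{1}{p_\theta} = \frac{1 - \theta}{p_0} + \frac{\theta}{p_1} \quad\text{and}\quad \frac{1}{q_\theta} = \frac{1 - \theta}{q_0} + \frac{\theta}{q_1}.$$

Moreover,

$$\|T\|_{L^{p_\theta} \to L^{q_\theta}} \le \|T\|^{1-\theta}_{L^{p_0} \to L^{q_0}}\, \|T\|^{\theta}_{L^{p_1} \to L^{q_1}}.$$

Proof. See Stein and Shakarchi volume IV, for example.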

From this, we have the following result.

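Theorem 2.6 (Hausdorff-Young). If $f \in \mathcal{S}(\mathbb{R}^d)$, then

$$\|\hat{f}\|_{L^{p'}} \lesssim \|f\|_{L^p}$$

for $1 \le p \le 2$, where $\frac{1}{p} + \frac{1}{p'} = 1$.

Proof. We know that $\|\hat{f}\|_{L^\infty} \le \|f\|_{L^1}$ and $\|\hat{f}\|_{L^2} = \|f\|_{L^2}$. Applying the Riesz-Thorin interpolation theorem with $p_0 = 1$, $p_1 = 2$, $q_0 = \infty$, and $q_1 = 2$, we get that the Fourier transform is bounded from $L^{p_\theta} \to L^{q_\theta}$ where

$$\frac{1}{p_\theta} = \frac{1 - \theta}{1} + \frac{\theta}{2} \quad\text{and}\quad \frac{1}{q_\theta} = \frac{\theta}{2},$$

so that $\frac{1}{p_\theta} = 1 - \frac{\theta}{2}$, hence $\frac{1}{p_\theta} + \frac{1}{q_\theta} = 1$. Since $p_\theta = \frac{2}{2 - \theta}$, as $\theta$ ranges over $[0, 1]$ we have $p_\theta \in [1, 2]$ as desired.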
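The exponent bookkeeping in this proof can be verified with exact rational arithmetic. The following is a minimal sketch; the function name is hypothetical, and the endpoints $(p_0, q_0) = (1, \infty)$ and $(p_1, q_1) = (2, 2)$ are hard-coded from the proof above, with $1/\infty$ represented as $0$.

```python
from fractions import Fraction


def interpolated_exponents(theta):
    """Return (1/p_theta, 1/q_theta) for the endpoints (1, inf) and (2, 2)."""
    inv_p = (1 - theta) * Fraction(1, 1) + theta * Fraction(1, 2)
    inv_q = (1 - theta) * Fraction(0) + theta * Fraction(1, 2)
    return inv_p, inv_q


for k in range(5):
    theta = Fraction(k, 4)
    inv_p, inv_q = interpolated_exponents(theta)
    # q_theta is the Hölder conjugate of p_theta, and p_theta lies in [1, 2].
    assert inv_p + inv_q == 1
    assert Fraction(1, 2) <= inv_p <= 1
    print(theta, 1 / inv_p, "inf" if inv_q == 0 else 1 / inv_q)
```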

Conversely, we have the following. This is one of our first examples of the power of scaling arguments.

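Theorem 2.7. If $\|\hat{f}\|_{L^q} \lesssim \|f\|_{L^p}$ for all $f \in \mathcal{S}(\mathbb{R}^d)$ for some $1 \le p, q \le \infty$, then $1 \le p \le 2$ and $q = p'$.

Proof. Fix $f \neq 0$ in $\mathcal{S}(\mathbb{R}^d)$. For $\lambda > 0$, let $f_\lambda(x) := f(x/\lambda)$. Then

$$\hat{f}_\lambda(\xi) = \int e^{-2\pi i x \cdot \xi} f\left(\frac{x}{\lambda}\right) dx = \lambda^d \hat{f}(\lambda \xi).$$

We have $\|\hat{f}_\lambda\|_{L^q} \lesssim \|f_\lambda\|_{L^p}$, which implies

$$\lambda^d \cdot \lambda^{-\frac{d}{q}} \|\hat{f}\|_{L^q} \lesssim \lambda^{\frac{d}{p}} \|f\|_{L^p}.$$

Thus, $\lambda^d \cdot \lambda^{-\frac{d}{q}} \lesssim \lambda^{\frac{d}{p}}$, so that $\lambda^{\frac{d}{q'}} \lesssim \lambda^{\frac{d}{p}}$ and hence $\lambda^{\frac{1}{q'}} \lesssim \lambda^{\frac{1}{p}}$ for all $0 < \lambda < \infty$. Letting $\lambda \to \infty$ gives $\frac{1}{q'} \le \frac{1}{p}$, and letting $\lambda \to 0$ gives $\frac{1}{q'} \ge \frac{1}{p}$. Thus, $q = p'$.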

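Next, we show $1 \le p \le 2$. It suffices to show $p \le p'$. Let $\varphi \in C_c^\infty(\mathbb{R}^d)$ be nonzero with $\operatorname{supp} \varphi \subseteq B(0, 1/2)$. For $\lambda > 0$, let $\varphi_k(x) = e^{2\pi i x \cdot \lambda k e_1} \varphi(x - k e_1)$. Using properties 1 and 2 of the Fourier transform, it follows that, up to a unimodular constant factor (which does not affect any of the norms below),

$$\hat{\varphi}_k(\xi) = e^{-2\pi i \xi \cdot k e_1} \hat{\varphi}(\xi - \lambda k e_1).$$

Let $f = \sum_{k=1}^{N} \varphi_k$. Since the supports of the $\varphi_k$ are disjoint, it follows that $\|f\|_{L^p} \sim N^{\frac{1}{p}}$. Next, we compute:

$$\|\hat{f}\|_{L^{p'}} = \left\| \sum_{k=1}^{N} e^{-2\pi i \xi \cdot k e_1} \hat{\varphi}(\xi - \lambda k e_1) \chi_{B(\lambda k e_1, \lambda/2)}(\xi) + \sum_{k=1}^{N} e^{-2\pi i \xi \cdot k e_1} \hat{\varphi}(\xi - \lambda k e_1) \chi_{B(\lambda k e_1, \lambda/2)^c}(\xi) \right\|_{L^{p'}}$$

$$\ge \left\| \sum_{k=1}^{N} e^{-2\pi i \xi \cdot k e_1} \hat{\varphi}(\xi - \lambda k e_1) \chi_{B(\lambda k e_1, \lambda/2)}(\xi) \right\|_{L^{p'}} - \sum_{k=1}^{N} \left\| e^{-2\pi i \xi \cdot k e_1} \hat{\varphi}(\xi - \lambda k e_1) \chi_{B(\lambda k e_1, \lambda/2)^c}(\xi) \right\|_{L^{p'}}$$

$$\gtrsim N^{\frac{1}{p'}} - N \left\| \hat{\varphi}(\xi) \chi_{B(0, \lambda/2)^c}(\xi) \right\|_{L^{p'}}.$$

Here the first term is estimated using the fact that the balls $B(\lambda k e_1, \lambda/2)$ are disjoint. Because $\hat{\varphi}$ is Schwartz,

$$\left\| \hat{\varphi}(\xi) \chi_{B(0, \lambda/2)^c}(\xi) \right\|_{L^{p'}} \lesssim \left( \int_{|\xi| > \lambda/2} \frac{1}{|\xi|^{2d}}\,d\xi \right)^{\frac{1}{p'}} \lesssim \lambda^{-\frac{d}{p'}}$$

which $\to 0$ as $\lambda \to \infty$. Thus, $\|\hat{f}\|_{L^{p'}} \lesssim \|f\|_{L^p}$ can hold only if $N^{\frac{1}{p'}} \lesssim N^{\frac{1}{p}}$ for all $N$, which is true if and only if $p \le p'$.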

3 1/13 – 1/18: Lorentz Spaces

Definition 3.1. For $1 \le p \le \infty$ and $f : \mathbb{R}^d \to \mathbb{C}$ measurable, we define

$$\|f\|^*_{L^p_{\mathrm{weak}}} := \sup_{\lambda > 0} \lambda \left| \{ x \in \mathbb{R}^d : |f(x)| > \lambda \} \right|^{\frac{1}{p}}. \tag{1}$$

The weak-$L^p$ space, written $L^p_{\mathrm{weak}}(\mathbb{R}^d)$, is the family of measurable functions $f$ for which $\|f\|^*_{L^p_{\mathrm{weak}}}$ is finite.

The quantity defined in (1) is not a norm, but it is a quasinorm. Recall that a quasinorm satisfies the same properties as a norm, except that the triangle inequality takes the form: $\|f + g\| \le C (\|f\| + \|g\|)$ for some constant $C > 0$.

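To clarify the definition of the weak-$L^p$ norm, consider a function $f \in L^p(\mathbb{R}^d)$. Then

$$\|f\|^p_{L^p(\mathbb{R}^d)} = \int_{\mathbb{R}^d} |f(x)|^p\,dx = \int_{\mathbb{R}^d} \int_0^{|f(x)|} p \lambda^{p-1}\,d\lambda\,dx.$$

By Tonelli's theorem,

$$\|f\|^p_{L^p(\mathbb{R}^d)} = \int_0^\infty p \lambda^{p-1} \int_{\{ x \in \mathbb{R}^d : |f(x)| > \lambda \}} dx\,d\lambda = \int_0^\infty p \lambda^{p-1} \left| \{ x \in \mathbb{R}^d : |f(x)| > \lambda \} \right| d\lambda = p \int_0^\infty \left( \lambda \left| \{ x \in \mathbb{R}^d : |f(x)| > \lambda \} \right|^{\frac{1}{p}} \right)^p \frac{d\lambda}{\lambda}.$$

So $\|f\|_{L^p(\mathbb{R}^d)} = p^{\frac{1}{p}} \left\| \lambda \left| \{ x \in \mathbb{R}^d : |f(x)| > \lambda \} \right|^{\frac{1}{p}} \right\|_{L^p((0,\infty), \frac{d\lambda}{\lambda})}$. Using the convention that $p^{\frac{1}{\infty}} = 1$, we can then write

$$\|f\|^*_{L^p_{\mathrm{weak}}(\mathbb{R}^d)} = p^{\frac{1}{\infty}} \left\| \lambda \left| \{ x \in \mathbb{R}^d : |f(x)| > \lambda \} \right|^{\frac{1}{p}} \right\|_{L^\infty((0,\infty), \frac{d\lambda}{\lambda})}.$$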

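Example 3.1. Let $f(x) = |x|^{-\frac{d}{p}}$. Then $f \in L^p_{\mathrm{weak}}(\mathbb{R}^d)$, but $f \notin L^p(\mathbb{R}^d)$.

To see this, first note that

$$\int_{\mathbb{R}^d} |f(x)|^p\,dx = \int_{\mathbb{R}^d} \frac{1}{|x|^d}\,dx \sim \int_0^\infty \frac{1}{r^d}\, r^{d-1}\,dr = \int_0^\infty \frac{1}{r}\,dr.$$

This latter integral does not converge, hence the $L^p$-norm of $f$ is not finite, so $f \notin L^p(\mathbb{R}^d)$.

On the other hand,

$$\left| \{ x \in \mathbb{R}^d : |f(x)| > \lambda \} \right|^{\frac{1}{p}} = \left| \left\{ x \in \mathbb{R}^d : \frac{1}{|x|^{\frac{d}{p}}} > \lambda \right\} \right|^{\frac{1}{p}} = \left| \left\{ x \in \mathbb{R}^d : |x| < \left( \frac{1}{\lambda} \right)^{\frac{p}{d}} \right\} \right|^{\frac{1}{p}}.$$

The set being measured is a ball of radius $(1/\lambda)^{p/d}$. The volume of such a ball scales according to $\left( (1/\lambda)^{p/d} \right)^d = (1/\lambda)^p$. Taking $p$th roots as above then gives $\left| \{ x \in \mathbb{R}^d : |f(x)| > \lambda \} \right|^{\frac{1}{p}} \lesssim \frac{1}{\lambda}$. Hence,

$$\|f\|^*_{L^p_{\mathrm{weak}}(\mathbb{R}^d)} \lesssim \sup_{\lambda > 0} \lambda \cdot \frac{1}{\lambda} = 1 < \infty$$

so that $f \in L^p_{\mathrm{weak}}(\mathbb{R}^d)$.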
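In dimension $d = 1$ the computation in this example is fully explicit, and a short sketch makes the scale invariance visible: the superlevel set of $f(x) = |x|^{-1/p}$ at height $\lambda$ is the interval $(-\lambda^{-p}, \lambda^{-p})$, so $\lambda\,|\{f > \lambda\}|^{1/p} = 2^{1/p}$ for every $\lambda$. The function names below are illustrative only.

```python
def superlevel_measure(lam: float, p: float) -> float:
    """Lebesgue measure of {x in R : |x|**(-1/p) > lam}, i.e. of the
    interval (-lam**(-p), lam**(-p))."""
    return 2.0 * lam ** (-p)


def weak_quantity(lam: float, p: float) -> float:
    """The quantity lam * |{f > lam}|**(1/p) whose supremum over lam
    is the weak-L^p quasinorm."""
    return lam * superlevel_measure(lam, p) ** (1.0 / p)


p = 3.0
for lam in (0.01, 0.1, 1.0, 10.0, 100.0):
    # Every value should equal 2**(1/p), independent of lam.
    print(lam, weak_quantity(lam, p), 2.0 ** (1.0 / p))
```

The fact that the quantity is constant in $\lambda$ reflects that $|x|^{-d/p}$ sits exactly at the borderline: it fails to be in $L^p$ only logarithmically, at both $0$ and $\infty$.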

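Definition 3.2. For $1 \le p \le \infty$ and $1 \le q \le \infty$, define the Lorentz space $L^{p,q}(\mathbb{R}^d)$ to be the space of measurable functions $f : \mathbb{R}^d \to \mathbb{C}$ for which the quantity

$$\|f\|^*_{L^{p,q}(\mathbb{R}^d)} := p^{\frac{1}{q}} \left\| \lambda \left| \{ x \in \mathbb{R}^d : |f(x)| > \lambda \} \right|^{\frac{1}{p}} \right\|_{L^q((0,\infty), \frac{d\lambda}{\lambda})} \tag{2}$$

is finite.

Note that, by our previous computation, $L^p(\mathbb{R}^d) = L^{p,p}(\mathbb{R}^d)$, and $L^p_{\mathrm{weak}}(\mathbb{R}^d) = L^{p,\infty}(\mathbb{R}^d)$. As with the weak-$L^p$ norm, the $L^{p,q}$-norm defined in (2) is not actually a norm. Nevertheless, we have:

Lemma 3.1. The quantity $\|\cdot\|^*_{L^{p,q}(\mathbb{R}^d)}$ is a quasinorm.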

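Proof. First, note that if $\|f\|^*_{L^{p,q}(\mathbb{R}^d)} = 0$, then

$$\left\| \lambda \left| \{ x \in \mathbb{R}^d : |f(x)| > \lambda \} \right|^{\frac{1}{p}} \right\|_{L^q((0,\infty), \frac{d\lambda}{\lambda})} = 0$$

which implies that, for almost all $\lambda > 0$, $\left| \{ x \in \mathbb{R}^d : |f(x)| > \lambda \} \right| = 0$, from which it follows that $f(x) = 0$ almost everywhere. Thus, $f = 0$.

Next, let $a \in \mathbb{C}$ with $a \neq 0$ (the case $a = 0$ is trivial). Then

$$\|af\|^*_{L^{p,q}(\mathbb{R}^d)} = p^{\frac{1}{q}} \left\| \lambda \left| \{ x \in \mathbb{R}^d : |af(x)| > \lambda \} \right|^{\frac{1}{p}} \right\|_{L^q((0,\infty), \frac{d\lambda}{\lambda})} = |a|\, p^{\frac{1}{q}} \left\| \frac{\lambda}{|a|} \left| \left\{ x \in \mathbb{R}^d : |f(x)| > \frac{\lambda}{|a|} \right\} \right|^{\frac{1}{p}} \right\|_{L^q((0,\infty), \frac{d\lambda}{\lambda})}.$$

By making the change of variables $\eta = \lambda/|a|$, which preserves the measure $\frac{d\lambda}{\lambda}$, it follows that $\|af\|^*_{L^{p,q}(\mathbb{R}^d)} = |a|\, \|f\|^*_{L^{p,q}(\mathbb{R}^d)}$.

It remains to prove the quasi-triangle inequality. To do this, we invoke the fact that for $0 < \alpha \le 1$, the map $x \mapsto x^\alpha$, $x \ge 0$, is concave and vanishes at $0$. Concave functions on $[0, \infty)$ that vanish at $0$ are subadditive, so that $(x + y)^\alpha \le x^\alpha + y^\alpha$. Thus, we compute:

$$\|f + g\|^*_{L^{p,q}(\mathbb{R}^d)} = p^{\frac{1}{q}} \left\| \lambda \left| \{ x \in \mathbb{R}^d : |f(x) + g(x)| > \lambda \} \right|^{\frac{1}{p}} \right\|_{L^q((0,\infty), \frac{d\lambda}{\lambda})}.$$

Observe that

$$\{ x \in \mathbb{R}^d : |f(x) + g(x)| > \lambda \} \subseteq \left\{ x \in \mathbb{R}^d : |f(x)| > \frac{\lambda}{2} \right\} \cup \left\{ x \in \mathbb{R}^d : |g(x)| > \frac{\lambda}{2} \right\}.$$

Thus,

$$\|f + g\|^*_{L^{p,q}(\mathbb{R}^d)} \le p^{\frac{1}{q}} \left\| \lambda \left( \left| \left\{ x : |f(x)| > \frac{\lambda}{2} \right\} \right| + \left| \left\{ x : |g(x)| > \frac{\lambda}{2} \right\} \right| \right)^{\frac{1}{p}} \right\|_{L^q((0,\infty), \frac{d\lambda}{\lambda})}.$$

By subadditivity of the concave map $x \mapsto x^{\frac{1}{p}}$, we have

$$\|f + g\|^*_{L^{p,q}(\mathbb{R}^d)} \le p^{\frac{1}{q}} \left\| \lambda \left( \left| \left\{ x : |f(x)| > \frac{\lambda}{2} \right\} \right|^{\frac{1}{p}} + \left| \left\{ x : |g(x)| > \frac{\lambda}{2} \right\} \right|^{\frac{1}{p}} \right) \right\|_{L^q((0,\infty), \frac{d\lambda}{\lambda})}.$$

Distributing the $\lambda$, factoring out a 2, and applying the usual $L^q$-Minkowski inequality then yields:

$$\|f + g\|^*_{L^{p,q}(\mathbb{R}^d)} \le 2 p^{\frac{1}{q}} \left\| \frac{\lambda}{2} \left| \left\{ x : |f(x)| > \frac{\lambda}{2} \right\} \right|^{\frac{1}{p}} + \frac{\lambda}{2} \left| \left\{ x : |g(x)| > \frac{\lambda}{2} \right\} \right|^{\frac{1}{p}} \right\|_{L^q((0,\infty), \frac{d\lambda}{\lambda})}$$

$$\le 2 p^{\frac{1}{q}} \left( \left\| \frac{\lambda}{2} \left| \left\{ x : |f(x)| > \frac{\lambda}{2} \right\} \right|^{\frac{1}{p}} \right\|_{L^q((0,\infty), \frac{d\lambda}{\lambda})} + \left\| \frac{\lambda}{2} \left| \left\{ x : |g(x)| > \frac{\lambda}{2} \right\} \right|^{\frac{1}{p}} \right\|_{L^q((0,\infty), \frac{d\lambda}{\lambda})} \right)$$

$$= 2 \left( \|f\|^*_{L^{p,q}(\mathbb{R}^d)} + \|g\|^*_{L^{p,q}(\mathbb{R}^d)} \right).$$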

For $1 < p < \infty$ and $1 \le q \le \infty$, we will show that the quasinorm $\|\cdot\|^*_{L^{p,q}(\mathbb{R}^d)}$ is equivalent to a norm. For $p = 1$, $q \neq 1$, the quasinorm is not equivalent to a norm. However, in this case, there exists a metric that generates the same topology. Thus, in either of these cases, $L^{p,q}(\mathbb{R}^d)$ is a complete metric space.

As another remark, we note that if $|g| \le |f|$, then for any $\lambda > 0$,

$$\{ x \in \mathbb{R}^d : |g(x)| > \lambda \} \subseteq \{ x \in \mathbb{R}^d : |f(x)| > \lambda \}.$$

This implies that $\|g\|^*_{L^{p,q}(\mathbb{R}^d)} \le \|f\|^*_{L^{p,q}(\mathbb{R}^d)}$.

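Proposition 3.1. Let $1 \le p < \infty$, $1 \le q \le \infty$, and $f \in L^{p,q}(\mathbb{R}^d)$. Write $f = \sum_{m \in \mathbb{Z}} f_m$, where

$$f_m(x) := f(x)\, \chi_{\{ x \in \mathbb{R}^d : 2^m \le |f(x)| < 2^{m+1} \}}(x).$$

Then

$$\|f\|^*_{L^{p,q}(\mathbb{R}^d)} \sim_{p,q} \left\| \|f_m\|_{L^p(\mathbb{R}^d)} \right\|_{\ell^q_m(\mathbb{Z})}.$$

In particular, $L^{p,q_1}(\mathbb{R}^d) \subseteq L^{p,q_2}(\mathbb{R}^d)$ whenever $q_1 \le q_2$.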

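Proof. By the preceding remark, note that

$$\left\| \sum_{m \in \mathbb{Z}} 2^m \chi_{\{ x : 2^m \le |f| < 2^{m+1} \}} \right\|^*_{L^{p,q}(\mathbb{R}^d)} \le \left\| \sum_{m \in \mathbb{Z}} f_m \right\|^*_{L^{p,q}(\mathbb{R}^d)} \le \left\| \sum_{m \in \mathbb{Z}} 2^{m+1} \chi_{\{ x : 2^m \le |f| < 2^{m+1} \}} \right\|^*_{L^{p,q}(\mathbb{R}^d)}.$$

Therefore, it suffices to prove the proposition for a function of the form $f(x) = \sum_{m \in \mathbb{Z}} 2^m \chi_{E_m}$ where $\{E_m\}$ is a pairwise disjoint collection of sets.

In this case, we need to show that $\|f\|^*_{L^{p,q}(\mathbb{R}^d)} \sim_{p,q} \left\| 2^m |E_m|^{\frac{1}{p}} \right\|_{\ell^q_m(\mathbb{Z})}$. We compute, for $q < \infty$ (the case $q = \infty$ is similar):

$$\left( \|f\|^*_{L^{p,q}(\mathbb{R}^d)} \right)^q = p \int_0^\infty \lambda^q \left| \{ x : |f(x)| > \lambda \} \right|^{\frac{q}{p}} \frac{d\lambda}{\lambda} = p \sum_{m \in \mathbb{Z}} \int_{2^{m-1}}^{2^m} \lambda^q \left| \{ x : |f(x)| > \lambda \} \right|^{\frac{q}{p}} \frac{d\lambda}{\lambda}.$$

Next, observe that for $\lambda \in [2^{m-1}, 2^m)$, $\{ x : |f(x)| > \lambda \} = \bigcup_{n \ge m} E_n$. Because the $E_n$'s are disjoint, we then have

$$\left( \|f\|^*_{L^{p,q}(\mathbb{R}^d)} \right)^q = p \sum_{m \in \mathbb{Z}} \int_{2^{m-1}}^{2^m} \lambda^q \left( \sum_{n \ge m} |E_n| \right)^{\frac{q}{p}} \frac{d\lambda}{\lambda}.$$

This removes the dependence on $\lambda$ in the set inside the integral. Because

$$p \int_{2^{m-1}}^{2^m} \lambda^q\, \frac{d\lambda}{\lambda} = \frac{p}{q} \left( 2^{mq} - 2^{(m-1)q} \right) = \frac{p}{q} \left( 1 - 2^{-q} \right) 2^{mq}$$

it follows that

$$\left( \|f\|^*_{L^{p,q}(\mathbb{R}^d)} \right)^q \sim_{p,q} \sum_{m \in \mathbb{Z}} 2^{mq} \left( \sum_{n \ge m} |E_n| \right)^{\frac{q}{p}} = \left\| 2^m \left( \sum_{n \ge m} |E_n| \right)^{\frac{1}{p}} \right\|^q_{\ell^q_m(\mathbb{Z})}.$$

Thus,

$$\|f\|^*_{L^{p,q}(\mathbb{R}^d)} \sim_{p,q} \left\| 2^m \left( \sum_{n \ge m} |E_n| \right)^{\frac{1}{p}} \right\|_{\ell^q_m(\mathbb{Z})} \gtrsim_{p,q} \left\| 2^m |E_m|^{\frac{1}{p}} \right\|_{\ell^q_m(\mathbb{Z})}.$$

This gives half of the $\sim_{p,q}$ relation that we need.

To get the other inequality, we compute as follows. First, invoking concavity of fractional powers as before,

$$\|f\|^*_{L^{p,q}(\mathbb{R}^d)} \sim_{p,q} \left\| 2^m \left( \sum_{n \ge m} |E_n| \right)^{\frac{1}{p}} \right\|_{\ell^q_m(\mathbb{Z})} \le \left\| 2^m \sum_{n \ge m} |E_n|^{\frac{1}{p}} \right\|_{\ell^q_m(\mathbb{Z})}.$$

Next, making the change of variables $n = m + k$,

$$\left\| 2^m \sum_{n \ge m} |E_n|^{\frac{1}{p}} \right\|_{\ell^q_m(\mathbb{Z})} = \left\| 2^m \sum_{k \ge 0} |E_{m+k}|^{\frac{1}{p}} \right\|_{\ell^q_m(\mathbb{Z})} = \left\| \sum_{k \ge 0} 2^{-k}\, 2^{m+k} |E_{m+k}|^{\frac{1}{p}} \right\|_{\ell^q_m(\mathbb{Z})}$$

$$\le \sum_{k \ge 0} 2^{-k} \left\| 2^{m+k} |E_{m+k}|^{\frac{1}{p}} \right\|_{\ell^q_m(\mathbb{Z})} = \sum_{k \ge 0} 2^{-k} \left\| 2^m |E_m|^{\frac{1}{p}} \right\|_{\ell^q_m(\mathbb{Z})} = 2 \left\| 2^m |E_m|^{\frac{1}{p}} \right\|_{\ell^q_m(\mathbb{Z})}$$

so that $\|f\|^*_{L^{p,q}(\mathbb{R}^d)} \lesssim_{p,q} \left\| 2^m |E_m|^{\frac{1}{p}} \right\|_{\ell^q_m(\mathbb{Z})}$. Therefore,

$$\|f\|^*_{L^{p,q}(\mathbb{R}^d)} \sim_{p,q} \left\| 2^m |E_m|^{\frac{1}{p}} \right\|_{\ell^q_m(\mathbb{Z})}$$

as desired.

The “in particular” statement follows from the fact that if $q_1 \le q_2$, then $\ell^{q_1}(\mathbb{Z}) \subseteq \ell^{q_2}(\mathbb{Z})$.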

As noted in the proof of the above proposition, the monotonicity of the quasinorm allows for the following reduction in many computations involving Lorentz spaces. If $f \in L^{p,q}(\mathbb{R}^d)$ is real-valued and nonnegative, then we can assume without loss of generality that $f = \sum_{m \in \mathbb{Z}} 2^m \chi_{E_m}$, where the sets $E_m$ are pairwise disjoint. In this case,

$$\|f\|^*_{L^{p,q}(\mathbb{R}^d)} \sim_{p,q} \left\| 2^m |E_m|^{\frac{1}{p}} \right\|_{\ell^q_m(\mathbb{Z})}.$$

We can further decompose a general function into its real and imaginary parts, and then into their corresponding positive and negative parts to apply this reduction.

Proposition 3.2. Let $1 < p < \infty$ and $1 \le q \le \infty$, and let $p'$ and $q'$ denote the respective Hölder conjugates. Then

$$\|f\|^*_{L^{p,q}(\mathbb{R}^d)} \sim_{p,q} \sup_{\|g\|^*_{L^{p',q'}(\mathbb{R}^d)} \le 1} \left| \int_{\mathbb{R}^d} f(x) g(x)\,dx \right|. \tag{3}$$

Proof. As the quasinorm is positively homogeneous, we can scale the function $f$ appropriately and assume without loss of generality that $\|f\|^*_{L^{p,q}(\mathbb{R}^d)} = 1$. Also, as remarked previously, we can assume without loss of generality that $f$ and $g$ are real-valued and nonnegative, hence we can take $f = \sum_{n \in \mathbb{Z}} 2^n \chi_{F_n}$ for disjoint sets $F_n$ and $g = \sum_{m \in \mathbb{Z}} 2^m \chi_{E_m}$ for disjoint sets $E_m$.

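In this case, up to constants depending only on $p$ and $q$, we have

$$1 = \left( \|f\|^*_{L^{p,q}(\mathbb{R}^d)} \right)^q = \left\| 2^n |F_n|^{\frac{1}{p}} \right\|^q_{\ell^q_n(\mathbb{Z})} = \sum_{n \in \mathbb{Z}} 2^{nq} |F_n|^{\frac{q}{p}}.$$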

We decompose the above sum as follows. Let 2Z denote the set of dyadic numbers. FINISH

The right-hand side of (3) defines a norm which is equivalent to the quasinorm $\|\cdot\|^*_{L^{p,q}(\mathbb{R}^d)}$. With respect to this norm, $L^{p,q}(\mathbb{R}^d)$ for $1 < p < \infty$, $1 \le q \le \infty$, is a Banach space. The proof of completeness is the same as the $L^p$-completeness argument. The dual space in this case is $L^{p',q'}(\mathbb{R}^d)$.

As remarked above, if $p = 1$ and $q \neq 1$, there is no norm which is equivalent to the quasinorm. The following example demonstrates this explicitly.

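Example 3.2. Consider the case $p = 1$, $q = \infty$, $d = 1$. Let $f(x) = \sum_{n=1}^{N} \frac{1}{|x - n|}$. We saw in a previous example that $\left\| \frac{1}{|x - n|} \right\|^*_{L^{1,\infty}(\mathbb{R})} = \left\| \frac{1}{|x - n|} \right\|^*_{L^1_{\mathrm{weak}}(\mathbb{R})} \lesssim 1$. This implies that $\sum_{n=1}^{N} \left\| \frac{1}{|x - n|} \right\|^*_{L^{1,\infty}(\mathbb{R})} \lesssim N$.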

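We claim that $\left\| \sum_{n=1}^{N} \frac{1}{|x - n|} \right\|^*_{L^{1,\infty}(\mathbb{R})} \gtrsim N \log N$. To see this, we compute:

$$\left\| \sum_{n=1}^{N} \frac{1}{|x - n|} \right\|^*_{L^{1,\infty}(\mathbb{R})} = \sup_{\lambda > 0} \lambda \left| \left\{ x \in \mathbb{R} : \sum_{n=1}^{N} \frac{1}{|x - n|} > \lambda \right\} \right|.$$

Fix $x \in [0, N]$. Note that, for $n \ge x$, $|x - n| \le n$ so that $\frac{1}{|x - n|} \ge \frac{1}{n}$. For $n < x$, the distances $x - n$ are bounded above by the integers $\lceil x \rceil - n$, which run over $1, 2, \dots, \lceil x \rceil - 1$, so the sum of $\frac{1}{|x - n|}$ over $n < x$ is at least $\sum_{n < x} \frac{1}{n}$. Thus, for $x \in [0, N]$ we have $f(x) \gtrsim \sum_{n=1}^{N} \frac{1}{n} \sim \log N$. So if we choose $\lambda = C \log N$ for some appropriate constant $C$, we then have

$$\left\{ x \in \mathbb{R} : \sum_{n=1}^{N} \frac{1}{|x - n|} > C \log N \right\} \supseteq [0, N].$$

Therefore,

$$\|f\|^*_{L^{1,\infty}(\mathbb{R})} \ge C \log N \left| \left\{ x \in \mathbb{R} : \sum_{n=1}^{N} \frac{1}{|x - n|} > C \log N \right\} \right| \ge C N \log N
$$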

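as claimed.

Now, suppose that the quasinorm was equivalent to a norm. Then by the triangle inequality,

$$N \log N \lesssim \left\| \sum_{n=1}^{N} \frac{1}{|x - n|} \right\|^*_{L^{1,\infty}(\mathbb{R})} \lesssim \sum_{n=1}^{N} \left\| \frac{1}{|x - n|} \right\|^*_{L^{1,\infty}(\mathbb{R})} \lesssim N.$$

Contradiction.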
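The pointwise lower bound $f(x) \ge \sum_{n=1}^{N} \frac{1}{n}$ on $[0, N]$ that drives this example can be checked numerically. The sketch below samples $f$ at half-integers (an arbitrary choice that avoids the poles at the integers) and compares the resulting lower bound $N \cdot H_N$ for the quasinorm of the sum with the sum of the individual quasinorms, each of which equals $2$ in $d = 1$. All function names are illustrative.

```python
def f(x: float, N: int) -> float:
    """The function sum_{n=1}^N 1/|x - n| from Example 3.2 (x not an integer)."""
    return sum(1.0 / abs(x - n) for n in range(1, N + 1))


def harmonic(N: int) -> float:
    """The N-th harmonic number, comparable to log N."""
    return sum(1.0 / n for n in range(1, N + 1))


def min_on_half_integers(N: int) -> float:
    """Minimum of f over the sample points k + 1/2 in [0, N]."""
    return min(f(k + 0.5, N) for k in range(N))


for N in (10, 100, 1000):
    H = harmonic(N)
    lower_bound = N * H        # lambda * |{f > lambda}| with lambda ~ H_N
    sum_of_norms = 2.0 * N     # N copies of the weak-L^1 quasinorm of 1/|x|
    print(N, min_on_half_integers(N) >= H, lower_bound / sum_of_norms)
```

The printed ratio grows like $\tfrac{1}{2} \log N$, which is exactly the failure of the triangle inequality (with any fixed constant) that the example exploits.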

4 1/20: The Marcinkiewicz Interpolation Theorem

Definition 4.1. A mapping $T$ on a class of measurable functions is sublinear if:

1. $|T(cf)| \le |c|\,|T(f)|$;

2. $|T(f + g)| \le |T(f)| + |T(g)|$

for all $c \in \mathbb{C}$ and $f, g$ in the domain of $T$.

Example 4.1. 1. Any linear map is sublinear.

2. Given a family of linear mappings $\{T_t\}_{t \in I}$, the mapping defined by

$$T(f)(x) := \|T_t(f)(x)\|_{L^q_t}$$

is sublinear.

When $q = \infty$, this is a kind of maximal function. When $q = 2$, this is a kind of square function.

Definition 4.2. Let $1 \le p, q \le \infty$. A mapping of functions $T$ is of (strong) type $(p, q)$ if $\|Tf\|_{L^q(\mathbb{R}^d)} \lesssim_T \|f\|_{L^p(\mathbb{R}^d)}$. That is, $T$ is of type $(p, q)$ if it is bounded as a mapping from $L^p(\mathbb{R}^d) \to L^q(\mathbb{R}^d)$.

Similarly, for $q < \infty$, $T$ is of weak type $(p, q)$ if $\|Tf\|^*_{L^{q,\infty}(\mathbb{R}^d)} \lesssim_T \|f\|_{L^p(\mathbb{R}^d)}$.

For $q < \infty$, $T$ is of restricted weak type $(p, q)$ if $\|T\chi_F\|^*_{L^{q,\infty}(\mathbb{R}^d)} \lesssim_T |F|^{\frac{1}{p}}$ for all finite measure sets $F$.

Note that type $(p, q)$ implies weak type $(p, q)$, which implies restricted weak type $(p, q)$.

Proposition 4.1. A mapping $T$ is of restricted weak type $(p, q)$ if and only if

$$\int |T\chi_F|\,|\chi_E| \lesssim |F|^{\frac{1}{p}}\, |E|^{\frac{1}{q'}}$$

for all finite measure sets $E, F$.

Proof. ADD

The main theorem that we will prove is the following, due to Hunt.

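Theorem 4.1 (Marcinkiewicz Interpolation). Fix $1 \le p_1, p_2, q_1, q_2 \le \infty$ such that $p_1 \neq p_2$ and $q_1 \neq q_2$. Let $T$ be a sublinear operator of restricted weak type $(p_1, q_1)$ and of restricted weak type $(p_2, q_2)$. Then for any $1 \le r \le \infty$ and $0 < \theta < 1$,

$$\|Tf\|^*_{L^{q_\theta, r}} \lesssim \|f\|^*_{L^{p_\theta, r}}$$

where $\frac{1}{p_\theta} = \frac{\theta}{p_1} + \frac{1 - \theta}{p_2}$ and $\frac{1}{q_\theta} = \frac{\theta}{q_1} + \frac{1 - \theta}{q_2}$.

Note that if $p_\theta \le q_\theta$, taking $r = q_\theta$ gives

$$\|Tf\|_{L^{q_\theta}} \lesssim \|f\|^*_{L^{p_\theta, q_\theta}} \lesssim \|f\|_{L^{p_\theta}}$$

so that $T$ is of strong type $(p_\theta, q_\theta)$. The requirement that $p_\theta \le q_\theta$ is essential for the strong type conclusion, as evidenced by the following example.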

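Example 4.2. Define $T(f)(x) := \frac{f(x)}{|x|^{1/2}}$. Then $T$ is bounded as an operator from $L^p(\mathbb{R}) \to L^{\frac{2p}{p+2}, \infty}(\mathbb{R})$ for any $2 \le p \le \infty$, but is not bounded as an operator from $L^p(\mathbb{R}) \to L^{\frac{2p}{p+2}}(\mathbb{R})$.

To see this, we first recall Hölder's inequality in Lorentz spaces (see homework): for $1 \le p_1, p_2, p < \infty$ and $1 \le q_1, q_2, q \le \infty$ satisfying $\frac{1}{p} = \frac{1}{p_1} + \frac{1}{p_2}$ and $\frac{1}{q} = \frac{1}{q_1} + \frac{1}{q_2}$, we have $\|fg\|^*_{L^{p,q}} \lesssim \|f\|^*_{L^{p_1,q_1}} \|g\|^*_{L^{p_2,q_2}}$. Then by this inequality, we have:

$$\|Tf\|^*_{L^{\frac{2p}{p+2}, \infty}(\mathbb{R})} \lesssim \|f\|^*_{L^{p,\infty}(\mathbb{R})} \left\| |x|^{-\frac{1}{2}} \right\|^*_{L^{2,\infty}(\mathbb{R})} \lesssim \|f\|_{L^p(\mathbb{R})}.$$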

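On the other hand, consider the function $f(x) = |x|^{-\frac{1}{p}} \left| \log\left( |x| + \frac{1}{|x|} \right) \right|^{-\frac{p+2}{2p}}$. We first claim that $f \in L^p(\mathbb{R})$. We have:

$$\|f\|^p_{L^p(\mathbb{R})} = \int \frac{1}{|x|} \left| \log\left( |x| + \frac{1}{|x|} \right) \right|^{-\frac{p+2}{2}} dx = 2 \left( \int_0^2 \frac{1}{|x|} \left| \log\left( |x| + \frac{1}{|x|} \right) \right|^{-\frac{p+2}{2}} dx + \int_2^\infty \frac{1}{|x|} \left| \log\left( |x| + \frac{1}{|x|} \right) \right|^{-\frac{p+2}{2}} dx \right).$$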

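Consider the second integral. Since $p \ge 2$, we have

$$\int_2^\infty \frac{1}{|x|} \left| \log\left( |x| + \frac{1}{|x|} \right) \right|^{-\frac{p+2}{2}} dx \lesssim \int_2^\infty \frac{1}{|x|} \left| \log\left( |x| + \frac{1}{|x|} \right) \right|^{-2} dx \le \int_2^\infty \frac{1}{x (\log x)^2}\,dx = \int_{\log 2}^\infty \frac{1}{u^2}\,du < \infty.$$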

Now consider the first integral. We have:

$$\int_0^2 \frac{1}{|x|} \left| \log\left( |x| + \frac{1}{|x|} \right) \right|^{-\frac{p+2}{2}} dx = \ ???.$$

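Now, we compute $\|Tf\|_{L^{\frac{2p}{p+2}}}$. Since $\left( \frac{1}{2} + \frac{1}{p} \right) \cdot \frac{2p}{p+2} = 1$, we have

$$\|Tf\|^{\frac{2p}{p+2}}_{L^{\frac{2p}{p+2}}} = \int |x|^{-1} \left| \log\left( |x| + \frac{1}{|x|} \right) \right|^{-1} dx \ge \int_1^\infty |x|^{-1} \left| \log\left( |x| + \frac{1}{|x|} \right) \right|^{-1} dx = \infty,$$

since the integrand is comparable to $\frac{1}{x \log x}$ for large $x$.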

5 1/25 – 1/27: Maximal Functions, $A_p$ weights

Recall that the Hardy-Littlewood maximal function is defined by:

$$M(f)(x) := \sup_{r > 0} \frac{1}{|B_r(x)|} \int_{B_r(x)} |f(y)|\,dy.$$

The following theorem is standard.

Theorem 5.1. 1. For $f \in L^p(\mathbb{R}^d)$, $1 \le p \le \infty$, $M(f)$ is finite almost everywhere.

2. The operator $M$ is of weak type $(1, 1)$, and of strong type $(p, p)$ for $1 < p \le \infty$.

The fact that $M$ is of weak type $(1, 1)$ is often referred to as the Hardy-Littlewood maximal inequality, since unraveling the definition of $M$ being of weak type $(1, 1)$ gives:

$$\|Mf\|^*_{L^{1,\infty}(\mathbb{R}^d)} \lesssim \|f\|_{L^1(\mathbb{R}^d)} \ \Rightarrow\ \left| \{ x : (Mf)(x) > \lambda \} \right| \le \frac{C}{\lambda} \|f\|_{L^1(\mathbb{R}^d)}.$$

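Note that $M$ is not of strong type $(1, 1)$. To see this, let $\varphi \in C_c^\infty(\mathbb{R}^d)$ be nonnegative and not identically zero, with $\operatorname{supp} \varphi \subseteq B_{1/2}(0)$. Fix $|x| > 1$. Note that if $r < |x| - 1/2$, then $\int_{B_r(x)} \varphi(y)\,dy = 0$. Furthermore, if $r > |x| + 1/2$, then

$$\frac{1}{|B_r(x)|} \int_{B_r(x)} \varphi(y)\,dy = \frac{1}{|B_r(x)|} \int_{B_{|x|+1/2}(x)} \varphi(y)\,dy \le \frac{1}{|B_{|x|+1/2}(x)|} \int_{B_{|x|+1/2}(x)} \varphi(y)\,dy.$$

Thus,

$$(M\varphi)(x) = \sup_{|x| - 1/2 \le r \le |x| + 1/2} \frac{1}{|B_r(x)|} \int_{B_r(x)} \varphi(y)\,dy \gtrsim \frac{1}{|x|^d}.$$

But $\frac{1}{|x|^d}$ is not integrable on $\{|x| > 1\}$, so $M\varphi \notin L^1(\mathbb{R}^d)$.

We will prove a more general theorem than the one from above: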

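Theorem 5.2. Let $\omega : \mathbb{R}^d \to [0, \infty)$ be locally integrable. Associate to $\omega$ the measure defined by $\omega(E) := \int_E \omega(y)\,dy$. Then:

$$M : L^1(M(\omega)\,dx) \to L^{1,\infty}(\omega\,dx)$$

$$M : L^p(M(\omega)\,dx) \to L^p(\omega\,dx) \quad\text{for } 1 < p \le \infty.$$

Explicitly,

$$\omega\left( \{ x : (Mf)(x) > \lambda \} \right) \lesssim \frac{1}{\lambda} \int_{\mathbb{R}^d} |f(x)|\, (M\omega)(x)\,dx$$

and, for $1 < p < \infty$,

$$\int_{\mathbb{R}^d} |(Mf)(x)|^p\, \omega(x)\,dx \lesssim \int_{\mathbb{R}^d} |f(x)|^p\, (M\omega)(x)\,dx.$$

Note that if $\omega \equiv 1$, then $M(\omega) \equiv 1$, and we recover the previous theorem.

Definition 5.1. A function $\omega$ satisfies the $A_1$ condition, written $\omega \in A_1$, if $M(\omega) \lesssim \omega$ almost everywhere.

If $\omega \in A_1$, then $M : L^1(\omega\,dx) \to L^{1,\infty}(\omega\,dx)$ and $M : L^p(\omega\,dx) \to L^p(\omega\,dx)$ for $1 < p \le \infty$.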

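Lemma 5.3. The following are equivalent:

1. $\omega \in A_1$;

2. $\frac{\omega(B)}{|B|} \lesssim \omega(x)$ for almost all $x \in B$ and all balls $B$;

3. $\frac{1}{|B|} \int_B f(y)\,dy \lesssim \frac{1}{\omega(B)} \int_B f(y)\, \omega(y)\,dy$ for all $f \ge 0$ and all balls $B$.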

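Proof. We first show (1) $\Rightarrow$ (2). Fix a ball $B_0$ of radius $r_0$ and let $x \in B_0$. Then

$$\omega(x) \gtrsim (M\omega)(x) = \sup_{r > 0} \frac{1}{|B_r(x)|} \int_{B_r(x)} \omega(y)\,dy \ge \frac{1}{|B_{2r_0}(x)|} \int_{B_{2r_0}(x)} \omega(y)\,dy.$$

Because $B_{2r_0}(x) \supseteq B_0$ (drawing a picture makes this clearer) and because $|B_{2r_0}(x)| \lesssim_d |B_0|$,

$$\omega(x) \gtrsim \frac{1}{|B_0|} \int_{B_0} \omega(y)\,dy = \frac{\omega(B_0)}{|B_0|}$$

as desired.

The implication (2) $\Rightarrow$ (1) follows immediately from the definition of the maximal function.

Next, we show (2) $\Rightarrow$ (3). Fix a ball $B$ and $f \ge 0$. Then

$$\frac{1}{\omega(B)} \int_B f(y)\, \omega(y)\,dy \gtrsim \frac{1}{\omega(B)} \int_B f(y)\, \frac{\omega(B)}{|B|}\,dy = \frac{1}{|B|} \int_B f(y)\,dy.$$

Finally, (3) $\Rightarrow$ (2). Fix a ball $B$ and let $x \in B$ be a Lebesgue point of $\omega$ (i.e., a point for which the conclusion of the Lebesgue differentiation theorem holds). Choose $r \ll 1$ so that $B_r(x) \subseteq B$. Let $f = \chi_{B_r(x)}$. Then by (3),

$$\frac{1}{|B|} \int_B \chi_{B_r(x)}(y)\,dy \lesssim \frac{1}{\omega(B)} \int_B \chi_{B_r(x)}(y)\, \omega(y)\,dy \ \Rightarrow\ \frac{|B_r(x)|}{|B|} \lesssim \frac{1}{\omega(B)} \int_{B_r(x)} \omega(y)\,dy.$$

Since $x$ is a Lebesgue point,

$$\frac{\omega(B)}{|B|} \lesssim \frac{1}{|B_r(x)|} \int_{B_r(x)} \omega(y)\,dy \to \omega(x)$$

as $r \to 0$.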

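Definition 5.2. A function $\omega : \mathbb{R}^d \to [0, \infty)$ satisfies the $A_p$ condition, written $\omega \in A_p$, if

$$\sup_B \frac{\omega(B)}{|B|} \left( \frac{1}{|B|} \int_B \omega(y)^{-\frac{p'}{p}}\,dy \right)^{\frac{p}{p'}} \lesssim 1,$$

where the supremum is taken over all open balls in $\mathbb{R}^d$.

We note for convenience that the above condition is equivalent to

$$\sup_B \frac{\omega(B)}{|B|^p} \left( \int_B \omega(y)^{-\frac{p'}{p}}\,dy \right)^{\frac{p}{p'}} \lesssim 1.$$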

Theorem 5.4. Fix $1 < p < \infty$. Then $M : L^p(\omega\,dx) \to L^p(\omega\,dx)$ if and only if $\omega \in A_p$.

Proof. We only prove one direction ($\Rightarrow$) of this statement; the other direction, which requires a reverse Hölder's inequality type argument, is more difficult, and may be revisited later.

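For now, suppose that $M : L^p(\omega\,dx) \to L^p(\omega\,dx)$. Fix a ball $B$ and let $f = (\omega + \varepsilon)^{-\frac{p'}{p}} \chi_B$. For $x \in B$,

$$(Mf)(x) = \sup_{r > 0} \frac{1}{|B(x, r)|} \int_{B(x,r)} (\omega(y) + \varepsilon)^{-\frac{p'}{p}} \chi_B(y)\,dy \ge \frac{1}{|2B|} \int_B (\omega(y) + \varepsilon)^{-\frac{p'}{p}}\,dy \gtrsim \frac{1}{|B|} \int_B (\omega(y) + \varepsilon)^{-\frac{p'}{p}}\,dy =: \lambda.$$

Here our definition of $\lambda$ includes any implicit constants from the above inequality. Then

$$\omega(B) \le \omega\left( \left\{ x : (Mf)(x) > \frac{\lambda}{2} \right\} \right) \lesssim \frac{1}{\lambda^p} \int |f(y)|^p\, \omega(y)\,dy$$

where the final inequality follows from the fact that $M : L^p(\omega\,dx) \to L^p(\omega\,dx)$, and hence

$$M : L^p(\omega\,dx) \to L^{p,\infty}(\omega\,dx).$$

Plugging in our definition of $\lambda$ gives

$$\omega(B) \lesssim |B|^p \left( \int_B (\omega(y) + \varepsilon)^{-\frac{p'}{p}}\,dy \right)^{-p} \int_B (\omega(y) + \varepsilon)^{-p'}\, \omega(y)\,dy$$

$$\le |B|^p \left( \int_B (\omega(y) + \varepsilon)^{-\frac{p'}{p}}\,dy \right)^{-p} \int_B (\omega(y) + \varepsilon)^{-p'+1}\,dy$$

$$= |B|^p \left( \int_B (\omega(y) + \varepsilon)^{-\frac{p'}{p}}\,dy \right)^{-p} \int_B (\omega(y) + \varepsilon)^{-\frac{p'}{p}}\,dy$$

$$= |B|^p \left( \int_B (\omega(y) + \varepsilon)^{-\frac{p'}{p}}\,dy \right)^{-(p-1)}.$$

Here we used $-p' + 1 = -\frac{p'}{p}$. Thus, since $p - 1 = \frac{p}{p'}$,

$$\frac{\omega(B)}{|B|^p} \left( \int_B (\omega(y) + \varepsilon)^{-\frac{p'}{p}}\,dy \right)^{\frac{p}{p'}} \lesssim 1.$$

Letting $\varepsilon \to 0$ gives $\omega \in A_p$.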

We remark that if 1 < p < q <∞, then Ap ⊆ Aq . This follows from Holder’s inequality.

Theorem 5.5. Fix 1 ≤ p ≤ ∞, and let $d\mu$ be a nonnegative Borel measure. If $M : L^p(d\mu) \to L^{p,\infty}(d\mu)$, then $d\mu = \omega\,dx$ for some $\omega \in A_p$.

Proof. If we can prove that $d\mu$ is absolutely continuous, then in light of the previous proof, we are done. Decompose $d\mu = \omega(x)\,dx + d\nu$, where $d\nu$ is the singular part of $d\mu$, and suppose for the sake of contradiction that $\nu \neq 0$. Then there exists a compact set $K$ with $|K| = 0$ but $\nu(K) > 0$. Define $U_n = \{ x : d(x,K) < 1/n \}$ and $f_n = \chi_{U_n \setminus K}$. As $U_n \setminus K \supseteq U_{n+1} \setminus K$ and $\bigcap_n (U_n \setminus K) = \emptyset$, it follows that $f_n \to 0$ pointwise.

We claim that $d\mu$ is finite on compact sets. To see this, pick a measurable set $E$ with $0 < \mu(E) < \infty$; this is possible since $\mu$ is nontrivial. We may assume that $E$ is compact by inner regularity. Then
\[
(M\chi_E)(x) = \sup_{r>0} \frac{1}{|B(x,r)|}\int_{B(x,r)} \chi_E(y)\,dy.
\]
Let $r = d(x,E) + \operatorname{diam} E$. Then
\[
(M\chi_E)(x) \gtrsim \frac{|E|}{r^d}.
\]
Since $E$ is compact, if we restrict $x$ to a compact set then $d(x,E)$ is bounded below and above. Thus, $M\chi_E \gtrsim_{E,F} 1$ uniformly for $x \in F$ with $F$ compact. So suppose for the sake of contradiction that there is a compact set $F$ with $\mu(F) = \infty$. Then
\[
\infty = \mu(F) \le \mu\left(\{ x : M\chi_E(x) \gtrsim_{E,F} 1 \}\right) \lesssim_{E,F} \int_E d\mu = \mu(E) < \infty,
\]
which is a contradiction.

With this claim proven, note next that $\int |f_n|^p\,d\mu \to 0$ by the dominated convergence theorem. Fix $x \in K$. Then since $B(x,1/n) \subseteq U_n$ and since $|K| = 0$,
\begin{align*}
(Mf_n)(x) &= \sup_{r>0} \frac{1}{|B(x,r)|}\int_{B(x,r)} \chi_{U_n \setminus K}(y)\,dy
\ge \frac{1}{|B(x,1/n)|}\int_{B(x,1/n)} \chi_{U_n \setminus K}(y)\,dy \\
&= \frac{1}{|B(x,1/n)|}\int_{K^C} \chi_{B(x,1/n)}(y)\,dy
= \frac{1}{|B(x,1/n)|}\int_{\mathbb{R}^d} \chi_{B(x,1/n)}(y)\,dy = 1.
\end{align*}
So $x \in \{ x : (Mf_n)(x) > 1/2 \}$. Thus, $K \subseteq \{ x : (Mf_n)(x) > 1/2 \}$. So
\[
0 < \nu(K) \le \mu(K) \le \mu\left(\{ x : (Mf_n)(x) > 1/2 \}\right) \lesssim \int |f_n|^p\,d\mu \to 0,
\]
which is a contradiction. Thus, $d\mu = \omega(x)\,dx$, and by the previous theorem, $\omega \in A_p$.

Lemma 5.6. We have $\omega \in A_p$ if and only if
\[
\left(\frac{1}{|B|}\int_B f(y)\,dy\right)^p \lesssim \frac{1}{\omega(B)}\int_B f(y)^p\,\omega(y)\,dy
\]
uniformly for all $f \ge 0$ and all balls $B$.

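Proof. First, suppose that $\omega \in A_p$. Then
\[
\sup_B \frac{\omega(B)}{|B|^p}\left(\int_B \omega(y)^{-p'/p}\,dy\right)^{p/p'} \lesssim 1.
\]
Then, by Hölder,
\begin{align*}
\frac{1}{|B|^p}\left(\int_B f(y)\,dy\right)^p
&= \frac{1}{|B|^p}\left(\int_B f(y)\,\omega(y)^{1/p}\,\omega(y)^{-1/p}\,dy\right)^p \\
&\lesssim \frac{1}{|B|^p}\int_B |f(y)|^p\,\omega(y)\,dy\,\left(\int_B \omega(y)^{-p'/p}\,dy\right)^{p/p'} \\
&\lesssim \frac{1}{|B|^p}\int_B |f(y)|^p\,\omega(y)\,dy \cdot \frac{|B|^p}{\omega(B)}
= \frac{1}{\omega(B)}\int_B f(y)^p\,\omega(y)\,dy.
\end{align*}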

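To see the converse direction, suppose that
\[
\left(\frac{1}{|B|}\int_B f(y)\,dy\right)^p \lesssim \frac{1}{\omega(B)}\int_B f(y)^p\,\omega(y)\,dy
\]
uniformly for all $f \ge 0$ and all balls $B$. Let $f = (\omega + \varepsilon)^{-p'/p}$. Then, using $-p'+1 = -p'/p$,
\begin{align*}
\frac{1}{|B|^p}\left(\int_B (\omega(y)+\varepsilon)^{-p'/p}\,dy\right)^p
&\lesssim \frac{1}{\omega(B)}\int_B (\omega(y)+\varepsilon)^{-p'}\omega(y)\,dy \\
&\le \frac{1}{\omega(B)}\int_B (\omega(y)+\varepsilon)^{-p'+1}\,dy \\
&= \frac{1}{\omega(B)}\int_B (\omega(y)+\varepsilon)^{-p'/p}\,dy.
\end{align*}
So
\[
\frac{\omega(B)}{|B|^p}\left(\int_B (\omega(y)+\varepsilon)^{-p'/p}\,dy\right)^{p-1} \lesssim 1.
\]
Since $p - 1 = p/p'$, letting $\varepsilon \to 0$ gives the result.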

6 1/30 — 2/1: Weighted Maximal Inequality, Vector-Valued Maximal Functions, and the Calderon-Zygmund Decomposition

We now prove the following theorem, stated earlier, which we refer to as the weighted maximal inequality.

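Theorem 6.1. Let $\omega : \mathbb{R}^d \to [0,\infty)$ be locally integrable. Associate to $\omega$ the measure defined by $\omega(E) := \int_E \omega(y)\,dy$. Then:
\[
M : L^1(M(\omega)\,dx) \to L^{1,\infty}(\omega\,dx),
\]
\[
M : L^p(M(\omega)\,dx) \to L^p(\omega\,dx), \qquad 1 < p \le \infty.
\]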

We begin by recalling a form of the Vitali covering lemma.

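Lemma 6.2 (Vitali). Let $\mathcal{F}$ be a finite collection of open balls in $\mathbb{R}^d$. Then there exists a subcollection $\mathcal{S}$ of $\mathcal{F}$ such that

1. Distinct balls in $\mathcal{S}$ are disjoint;

2. $\bigcup_{B \in \mathcal{F}} B \subseteq \bigcup_{B \in \mathcal{S}} 3B$.

Proof. Run the following algorithm.

1. Set $\mathcal{S} := \emptyset$.

2. Choose a ball in $\mathcal{F}$ of largest radius, and add it to $\mathcal{S}$.

3. Discard all of the balls in $\mathcal{F}$ which intersect balls in $\mathcal{S}$ (including the ball just chosen).

4. If all balls in $\mathcal{F}$ are removed, stop. Otherwise, return to step 2.

This algorithm terminates because at least one ball in $\mathcal{F}$ is discarded at every step. By construction, distinct balls in $\mathcal{S}$ are disjoint. Finally, if $B$ is a ball in $\mathcal{F}$ which is not in $\mathcal{S}$, it necessarily intersects some ball $B' \in \mathcal{S}$ whose radius is at least as large. By the triangle inequality (draw a picture), $3B'$ contains $B$. Thus,
\[
\bigcup_{B \in \mathcal{F}} B \subseteq \bigcup_{B \in \mathcal{S}} 3B.
\]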
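The greedy selection in this proof can be sketched in a few lines of Python. This is only an illustration under stated assumptions: balls are represented as hypothetical `(center, radius)` pairs with `center` a tuple of floats, and the function name `vitali_subcollection` is not from the notes.

```python
import math

def vitali_subcollection(balls):
    """Greedy selection from the proof of the Vitali covering lemma.

    `balls` is a finite list of (center, radius) pairs describing open balls.
    Returns a pairwise disjoint sublist S such that every input ball is
    contained in the 3-fold dilate of some ball in S.
    """
    # Process balls from largest radius to smallest.
    remaining = sorted(balls, key=lambda b: b[1], reverse=True)
    selected = []
    while remaining:
        center, radius = remaining[0]      # a ball of largest radius
        selected.append((center, radius))
        # Discard every ball meeting the chosen one (this includes the chosen
        # ball itself). Open balls intersect exactly when the distance between
        # their centers is less than the sum of their radii.
        remaining = [
            (c, r) for (c, r) in remaining
            if math.dist(c, center) >= r + radius
        ]
    return selected

# Example in dimension one: four intervals, two survive.
balls = [((0.0,), 1.0), ((1.5,), 1.0), ((5.0,), 0.5), ((0.2,), 0.1)]
S = vitali_subcollection(balls)
```

Only balls meeting the most recently chosen ball need to be discarded in each pass, because balls meeting earlier choices were already removed.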

Now we prove the main theorem.

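Proof. First, we claim that $M : L^\infty(M(\omega)\,dx) \to L^\infty(\omega\,dx)$. To see this, let $f \in L^\infty(M(\omega)\,dx)$. Then
\[
\|Mf\|_{L^\infty(\omega\,dx)} = \inf_{\omega(E)=0}\,\sup_{x \in E^C} |(Mf)(x)| \le \inf_{|E|=0}\,\sup_{x \in E^C} |(Mf)(x)|
\]
because $|E| = 0$ implies $\omega(E) = 0$ (recall that $\omega(E) = \int_E \omega(y)\,dy$). Next, recall the trivial estimate $\|Mf\|_{L^\infty(dx)} \le \|f\|_{L^\infty(dx)}$, which follows from moving absolute value signs inside the integral definition of $Mf$. Then
\[
\|Mf\|_{L^\infty(\omega\,dx)} \le \inf_{|E|=0}\,\sup_{x \in E^C} |f(x)|.
\]
Note that, unless $\omega \equiv 0$, $M\omega(x) > 0$ for all $x$, so a set is null for $M\omega\,dx$ exactly when it is Lebesgue null. Thus,
\[
\|Mf\|_{L^\infty(\omega\,dx)} \le \inf_{|E|=0}\,\sup_{x \in E^C} |f(x)| = \inf_{(M\omega)(E)=0}\,\sup_{x \in E^C} |f(x)| = \|f\|_{L^\infty(M\omega\,dx)}
\]
as desired.

Since $M$ is bounded from $L^\infty(M(\omega)\,dx) \to L^\infty(\omega\,dx)$, to prove the entire theorem, it suffices by the Marcinkiewicz interpolation theorem to prove that $M$ is bounded from $L^1(M(\omega)\,dx) \to L^{1,\infty}(\omega\,dx)$. Explicitly, suppose that we have demonstrated the latter fact. Then we apply the theorem with $p_1 = 1$, $q_1 = 1$ (since we are supposing that $M$ is bounded from $L^1(M(\omega)\,dx) \to L^{1,\infty}(\omega\,dx)$) and $p_2 = \infty$, $q_2 = \infty$. Then for $\theta \in (0,1)$, define
\[
\frac{1}{p_\theta} = \frac{\theta}{p_1} + \frac{1-\theta}{p_2} = \theta \ \Rightarrow\ p_\theta = \frac{1}{\theta}
\qquad\text{and}\qquad
\frac{1}{q_\theta} = \frac{\theta}{q_1} + \frac{1-\theta}{q_2} = \theta \ \Rightarrow\ q_\theta = \frac{1}{\theta}.
\]
For $r = q_\theta = p_\theta = \frac{1}{\theta}$, the interpolation theorem gives that $M$ is bounded from $L^{1/\theta}(M(\omega)\,dx) \to L^{1/\theta}(\omega\,dx)$. As $0 < \theta < 1$, we have $1 < \frac{1}{\theta} < \infty$, and thus the theorem is proved.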

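So it remains to prove $M : L^1(M(\omega)\,dx) \to L^{1,\infty}(\omega\,dx)$. For $f \in L^1(M(\omega)\,dx)$, we need to show that $\|Mf\|^*_{L^{1,\infty}(\omega\,dx)} \lesssim \|f\|_{L^1(M(\omega)\,dx)}$, i.e.,
\[
\sup_{\lambda>0} \lambda\,\omega\left(\{ x : |Mf(x)| > \lambda \}\right) \lesssim \int |f(x)|\,M(\omega)(x)\,dx.
\]
Fix $\lambda > 0$ and consider $\{ x : |Mf(x)| > \lambda \}$. Let $K \subseteq \{ x : |Mf(x)| > \lambda \}$ be compact. For $x \in K$, $Mf(x) > \lambda$, so by definition of the maximal function there exists $r_x > 0$ so that
\[
\frac{1}{|B_{r_x}(x)|}\int_{B_{r_x}(x)} |f(y)|\,dy > \lambda.
\]
Then $K \subseteq \bigcup_{x \in K} B_{r_x}(x)$, and compactness then gives a finite subcover of such balls. By Vitali's covering lemma, there exists a subcollection $\mathcal{S}$ of these balls such that distinct balls in $\mathcal{S}$ are disjoint, and $K \subseteq \bigcup_{B \in \mathcal{S}} 3B$. Then $\omega(K) \le \sum_{B \in \mathcal{S}} \omega(3B)$.

Consider a ball $B_j \in \mathcal{S}$ with radius $r_j$. For $y \in B_j$, note that $B_{4r_j}(y) \supseteq 3B_j$ (draw a picture). Thus,
\[
\omega(3B_j) = \int_{3B_j} \omega(x)\,dx \le \int_{B_{4r_j}(y)} \omega(x)\,dx \le (M\omega)(y)\,|B_{4r_j}(y)| = (M\omega)(y)\,4^d\,|B_j|.
\]
Multiplying by $|f(y)|$ and integrating over $y \in B_j$ gives
\[
\omega(3B_j) \cdot \int_{B_j} |f(y)|\,dy \le 4^d |B_j| \int_{B_j} (M\omega)(y)\,|f(y)|\,dy
\]
so that
\[
\omega(3B_j) \cdot \frac{1}{|B_j|}\int_{B_j} |f(y)|\,dy \le 4^d \int_{B_j} (M\omega)(y)\,|f(y)|\,dy.
\]
But $\frac{1}{|B_j|}\int_{B_j} |f(y)|\,dy > \lambda$, so
\[
\omega(3B_j) \lesssim_d \frac{1}{\lambda}\int_{B_j} (M\omega)(y)\,|f(y)|\,dy.
\]
Since the balls $B_j$ are all disjoint, we then have
\[
\omega(K) \le \sum_{B_j \in \mathcal{S}} \omega(3B_j) \lesssim \frac{1}{\lambda}\int_{\mathbb{R}^d} (M\omega)(y)\,|f(y)|\,dy = \frac{1}{\lambda}\|f\|_{L^1(M(\omega)\,dx)}.
\]
As this holds for all compact $K \subseteq \{ x : |Mf(x)| > \lambda \}$, by the inner regularity of the measure $\omega$, we are done.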

As an application of this theorem, we discuss vector-valued maximal functions.

Definition 6.1. For $f : \mathbb{R}^d \to \ell^2(\mathbb{N})$ given by $f(x) = \{f_n(x)\}_{n \ge 1}$, define
\[
\|f\|_{L^p} := \left\| \|\{f_n(x)\}\|_{\ell^2_n} \right\|_{L^p_x}
\]
and the vector-valued maximal function $M$ by
\[
M(f)(x) := \|\{Mf_n(x)\}\|_{\ell^2_n}.
\]
Observe that $Mf : \mathbb{R}^d \to [0,\infty]$, so that $Mf$ is not itself vector-valued.

We have the following vector-valued version of the classical Hardy-Littlewood maximal inequality:

Theorem 6.3.

1. The operator $M$ is of weak type $(1,1)$, that is, $\|Mf\|^*_{L^{1,\infty}} \lesssim \|f\|_{L^1}$ for $f : \mathbb{R}^d \to \ell^2(\mathbb{N})$;

2. $M$ is of strong type $(p,p)$ for all $1 < p < \infty$, that is, $\|Mf\|_{L^p} \lesssim \|f\|_{L^p}$ for $f : \mathbb{R}^d \to \ell^2(\mathbb{N})$.

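Before proving the theorem, we remark that the bound $L^\infty \to L^\infty$ does not hold, unlike in the scalar case. To see this, consider the following example.

Example 6.1. Let $d = 1$. For $n \ge 1$, let $f_n(x) = \chi_{[2^{n-1},2^n)}(x)$ and define $f : \mathbb{R} \to \ell^2(\mathbb{N})$ by $f(x) = \{f_n(x)\}_{n \ge 1}$. We claim that $f \in L^\infty$, but that $Mf \notin L^\infty$.

First, compute:
\[
\|f(x)\|^2_{\ell^2_n} = \|\{f_n(x)\}_{n \ge 1}\|^2_{\ell^2_n} = \sum_{n \ge 1} \left|\chi_{[2^{n-1},2^n)}(x)\right|^2 = \sum_{n \ge 1} \chi_{[2^{n-1},2^n)}(x) = \chi_{[1,\infty)}(x)
\]
so that $\|f(x)\|_{\ell^2_n} = \chi_{[1,\infty)}(x)$. Thus $\|f\|_{L^\infty} = \|\chi_{[1,\infty)}(x)\|_{L^\infty_x} = 1 < \infty$, so $f \in L^\infty$.

Next, we consider $Mf_n(x)$. Suppose that $|x| \le 2^n$. Then $Mf_n(x) = \sup_{r>0} \frac{1}{2r}\int_{x-r}^{x+r} \chi_{[2^{n-1},2^n)}(y)\,dy$. Taking $r = 2^{n+1}$, so that $[x-r, x+r] \supseteq [2^{n-1}, 2^n)$, then gives
\[
Mf_n(x) \ge \frac{1}{2 \cdot 2^{n+1}}\int_{x-2^{n+1}}^{x+2^{n+1}} \chi_{[2^{n-1},2^n)}(y)\,dy = \frac{1}{2^{n+2}}\int_{\mathbb{R}} \chi_{[2^{n-1},2^n)}(y)\,dy = \frac{1}{2^{n+2}} \cdot 2^{n-1} = \frac{1}{8}.
\]
So for $x \in \mathbb{R}$,
\[
\|Mf(x)\|^2_{\ell^2_n} = \sum_{n \ge 1} |Mf_n(x)|^2 \ge \sum_{n : 2^n \ge |x|} |Mf_n(x)|^2 \ge \sum_{n : 2^n \ge |x|} \frac{1}{64} = \infty.
\]
Therefore, $Mf \notin L^\infty$.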

Before proving Theorem 6.3, we give an outline of the argument. The case p = 2 is a straightforward computation, and the case p > 2 will follow from the weighted maximal inequality. Having established the theorem for p = 2, by the Marcinkiewicz Interpolation theorem, it will suffice to prove the weak type estimate for p = 1 to prove the entire theorem. To prove this case we employ a Calderon-Zygmund decomposition technique.

We adopt the convention that for f : Rd → `2(N), |f(x)| := ‖{fn(x)}‖`2n .

Lemma 6.4 (Calderon-Zygmund Decomposition). Given $f \in L^1(\mathbb{R}^d)$ (possibly vector-valued) and $\lambda > 0$, there is a decomposition $f = g + b$ such that:

1. $|g(x)| \le \lambda$ for almost all $x \in \mathbb{R}^d$;

2. $b = f\chi_{\bigcup Q_k}$, where $\{Q_k\}$ is a collection of cubes whose interiors are disjoint and such that
\[
\lambda < \frac{1}{|Q_k|}\int_{Q_k} |b(y)|\,dy \le 2^d \lambda.
\]

Proof. Decompose $\mathbb{R}^d$ into dyadic cubes of the form
\[
Q = [2^n k_1, 2^n(k_1+1)) \times \cdots \times [2^n k_d, 2^n(k_d+1)), \qquad (k_1,\dots,k_d) \in \mathbb{Z}^d,
\]
where $n$ (and hence the diameter of each cube) is chosen to be large enough so that
\[
\frac{1}{|Q|}\int_Q |f(y)|\,dy \le \lambda.
\]
Because $\int_{\mathbb{R}^d} |f(y)|\,dy < \infty$, this is certainly possible.

Run the following algorithm. Fix a cube $Q$ from the above decomposition. Divide $Q$ into $2^d$ equal sized cubes. Let $Q'$ denote one of these smaller cubes.

If $Q'$ satisfies
\[
\frac{1}{|Q'|}\int_{Q'} |f(y)|\,dy > \lambda,
\]
then stop and add $Q'$ to the collection of cubes which define the support of $b$. Note that, for such a cube,
\[
\lambda < \frac{1}{|Q'|}\int_{Q'} |f(y)|\,dy \le \frac{2^d}{|Q|}\int_Q |f(y)|\,dy \le 2^d \lambda
\]
as required by the definition of $b$.

If $Q'$ satisfies
\[
\frac{1}{|Q'|}\int_{Q'} |f(y)|\,dy \le \lambda,
\]
then subdivide $Q'$ as we did with $Q$ and continue until, if ever, we land in the previous case.

Having run this algorithm, let $b = f\chi_{\bigcup Q_k}$, where $\{Q_k\}$ is the collection of cubes from the above algorithm. Set $g = f - b$. By construction, the cubes $Q_k$ have disjoint interiors.

It only remains to check that $|g| \le \lambda$ almost everywhere. On $\bigcup Q_k$ we have $g = 0$. If $x \notin \bigcup Q_k$, then by our construction there is a sequence of cubes, each containing $x$, with diameters shrinking to 0, such that the average value of $|f|$ on each of these cubes is $\le \lambda$. By the Lebesgue differentiation theorem, $|g| = |f| \le \lambda$ almost everywhere on the complement of $\bigcup Q_k$.

In the Calderon-Zygmund decomposition of an $L^1$ function $f$, $g$ represents the good part of $f$, i.e., the part which is essentially uniformly bounded by $\lambda$, and $b$ represents the bad part. The requirement on $b$ in the decomposition says that the average of $|b|$ on any given cube $Q_k$ is on the order of $\lambda$.
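The stopping-time algorithm from the proof can be illustrated in dimension one for a step function. The sketch below is not from the notes: it assumes $|f|$ is constant on each of $2^J$ equal cells of $[0,1)$, that the starting interval $[0,1)$ already has average at most $\lambda$, and it uses the hypothetical name `cz_decompose`.

```python
def cz_decompose(values, lam):
    """Calderon-Zygmund stopping-time selection for a step function on [0, 1).

    `values[i]` is the value of |f| on the i-th of n = 2**J equal cells.
    Assumes the average of |f| over [0, 1) is at most `lam`.
    Returns the selected dyadic intervals as (start, stop) index ranges.
    """
    n = len(values)
    assert n > 0 and n & (n - 1) == 0, "number of cells must be a power of two"
    assert sum(values) / n <= lam, "starting interval must have average <= lam"
    bad = []
    stack = [(0, n)]                       # intervals whose average is <= lam
    while stack:
        start, stop = stack.pop()
        if stop - start == 1:
            continue                       # finest scale: |f| <= lam on this cell
        mid = (start + stop) // 2
        for a, b in ((start, mid), (mid, stop)):
            avg = sum(values[a:b]) / (b - a)
            if avg > lam:
                bad.append((a, b))         # stop: this child supports the bad part
            else:
                stack.append((a, b))       # keep subdividing
    return sorted(bad)

# Example: one tall spike; the selected interval has average in (lam, 2 * lam].
values = [0, 0, 0, 8, 1, 1, 0, 0]
lam = 2
bad = cz_decompose(values, lam)
```

In this example the single selected interval covers cells 2 and 3, where the average of $|f|$ is $4 = 2^d\lambda$ with $d = 1$, and $|f| \le \lambda$ on every remaining cell.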

Now, we prove Theorem 6.3.

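Proof. First consider the case $p = 2$. For a vector-valued $f = \{f_n\} \in L^2(\mathbb{R}^d)$, we need to show that $\|Mf\|_{L^2} \lesssim \|f\|_{L^2}$. By definition,
\[
\|Mf\|^2_{L^2} = \int_{\mathbb{R}^d} \|\{Mf_n(x)\}\|^2_{\ell^2_n}\,dx = \int_{\mathbb{R}^d} \sum_{n \ge 1} |Mf_n(x)|^2\,dx.
\]
Since all the quantities in question are nonnegative, we can interchange the integral and summation (by Tonelli's theorem) to get
\[
\|Mf\|^2_{L^2} = \sum_{n \ge 1} \int_{\mathbb{R}^d} |Mf_n(x)|^2\,dx.
\]
Previously, we proved that the scalar maximal function is of strong type $(2,2)$. So
\[
\|Mf\|^2_{L^2} = \sum_{n \ge 1} \int_{\mathbb{R}^d} |Mf_n(x)|^2\,dx \lesssim \sum_{n \ge 1} \int_{\mathbb{R}^d} |f_n(x)|^2\,dx = \int_{\mathbb{R}^d} \sum_{n \ge 1} |f_n(x)|^2\,dx = \|f\|^2_{L^2}
\]
as desired.

Next, suppose that $2 < p < \infty$. We wish to show that $\|Mf\|_{L^p} \lesssim \|f\|_{L^p}$. Observe that
\[
\|Mf\|_{L^p} = \left\| \|Mf_n(x)\|_{\ell^2_n} \right\|_{L^p_x} = \left\| \left(\|Mf_n(x)\|_{\ell^2_n}\right)^2 \right\|^{1/2}_{L^{p/2}_x}
\]
so that $\|Mf\|^2_{L^p} = \|(Mf(x))^2\|_{L^{p/2}}$. By the duality characterization of norms,
\begin{align*}
\|Mf\|^2_{L^p} = \|(Mf(x))^2\|_{L^{p/2}}
&= \sup_{\|\omega\|_{L^{(p/2)'}}=1} \int (Mf(x))^2\,\omega(x)\,dx
= \sup_{\|\omega\|_{L^{(p/2)'}}=1} \int \sum_{n \ge 1} |Mf_n(x)|^2\,\omega(x)\,dx \\
&= \sup_{\|\omega\|_{L^{(p/2)'}}=1} \sum_{n \ge 1} \int |Mf_n(x)|^2\,\omega(x)\,dx.
\end{align*}
For $\omega \ge 0$, the weighted maximal inequality that we proved earlier tells us that $M : L^2(M(\omega)\,dx) \to L^2(\omega\,dx)$. Thus, supposing without loss of generality that the supremum in the above expression is taken over $\omega \ge 0$, we have
\begin{align*}
\|Mf\|^2_{L^p}
&= \sup_{\|\omega\|_{L^{(p/2)'}}=1} \sum_{n \ge 1} \int |Mf_n(x)|^2\,\omega(x)\,dx
\lesssim \sup_{\|\omega\|_{L^{(p/2)'}}=1} \sum_{n \ge 1} \int |f_n(x)|^2\,(M\omega)(x)\,dx \\
&= \sup_{\|\omega\|_{L^{(p/2)'}}=1} \int \sum_{n \ge 1} |f_n(x)|^2\,(M\omega)(x)\,dx.
\end{align*}
Applying Hölder's inequality,
\[
\|Mf\|^2_{L^p} \lesssim \sup_{\|\omega\|_{L^{(p/2)'}}=1} \left\| \sum_{n \ge 1} |f_n|^2 \right\|_{L^{p/2}} \|M\omega\|_{L^{(p/2)'}}.
\]
By the standard Hardy-Littlewood maximal inequality, $\|M\omega\|_{L^{(p/2)'}} \lesssim \|\omega\|_{L^{(p/2)'}} = 1$ (note that $1 < (p/2)' < \infty$ because $2 < p < \infty$). Finally, note that
\[
\left\| \sum_{n \ge 1} |f_n|^2 \right\|_{L^{p/2}} = \left\| \|f_n\|^2_{\ell^2_n} \right\|_{L^{p/2}} = \|f\|^2_{L^p}
\]
so that $\|Mf\|_{L^p} \lesssim \|f\|_{L^p}$ as desired.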

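As noted above, it remains to prove the weak type $(1,1)$ claim. Explicitly, given $f = \{f_n\} \in L^1(\mathbb{R}^d)$, we need to show that
\[
\sup_{\lambda>0} \lambda \left|\{ x : (Mf)(x) > \lambda \}\right| \lesssim \|f\|_{L^1}.
\]
Using the Calderon-Zygmund decomposition, write $f = g + b$ with $|g| \le \lambda$ almost everywhere and $b = f\chi_{\bigcup Q_k}$, where the $Q_k$ are cubes with disjoint interiors satisfying
\[
\frac{1}{|Q_k|}\int_{Q_k} |b(y)|\,dy \sim \lambda.
\]
Because $M$ is a sublinear operator,
\[
\{ x : (Mf)(x) > \lambda \} \subseteq \left\{ x : (Mg)(x) > \frac{\lambda}{2} \right\} \cup \left\{ x : (Mb)(x) > \frac{\lambda}{2} \right\}.
\]
Consider the first set. We have already shown that $M$ is of strong type $(2,2)$, so in particular it is of weak type $(2,2)$. Thus, using $|g| \le \lambda$ and $|g| \le |f|$,
\[
\left|\left\{ x : (Mg)(x) > \frac{\lambda}{2} \right\}\right| \lesssim \frac{1}{\lambda^2}\|g\|^2_{L^2} = \frac{1}{\lambda^2}\left\| |g|^2 \right\|_{L^1} \le \frac{1}{\lambda^2}\|\lambda |g|\|_{L^1} = \frac{1}{\lambda}\|g\|_{L^1} \le \frac{1}{\lambda}\|f\|_{L^1}.
\]
Therefore, it remains to prove the same type of estimate for the set $\left\{ x : (Mb)(x) > \frac{\lambda}{2} \right\}$.

Let $2Q_k$ denote the cube with the same center as $Q_k$ and twice the side length. Then, because the interiors of the $Q_k$'s are disjoint, we have:
\[
\left| \bigcup 2Q_k \right| \le 2^d \sum_k |Q_k| \lesssim \sum_k \frac{1}{\lambda}\int_{Q_k} |b(y)|\,dy \lesssim \frac{1}{\lambda}\int_{\mathbb{R}^d} |b(y)|\,dy \le \frac{1}{\lambda}\|f\|_{L^1}.
\]
With this estimate, it remains to show that
\[
\left|\left\{ x \notin \bigcup 2Q_k : (Mb)(x) > \frac{\lambda}{2} \right\}\right| \lesssim \frac{1}{\lambda}\|f\|_{L^1}.
\]
Define $b^{\mathrm{avg}}_n(x) = \sum_k \chi_{Q_k}(x)\,\frac{1}{|Q_k|}\int_{Q_k} |b_n(y)|\,dy$. If $x \in Q_k$, then
\[
\|b^{\mathrm{avg}}_n(x)\|_{\ell^2_n} = \frac{1}{|Q_k|}\left\| \int_{Q_k} |b_n(y)|\,dy \right\|_{\ell^2_n} \le \frac{1}{|Q_k|}\int_{Q_k} \|b_n(y)\|_{\ell^2_n}\,dy = \frac{1}{|Q_k|}\int_{Q_k} |b(y)|\,dy \lesssim \lambda,
\]
invoking Minkowski's integral inequality to move the $\ell^2_n$ norm inside the integral. Thus,
\[
\|b^{\mathrm{avg}}\|_{L^1} = \left\| \|b^{\mathrm{avg}}_n(x)\|_{\ell^2_n} \right\|_{L^1_x} \lesssim \lambda \left| \bigcup Q_k \right| \lesssim \lambda \cdot \frac{1}{\lambda}\|f\|_{L^1} = \|f\|_{L^1}.
\]
Now, fix $x \notin \bigcup 2Q_k$. Then
\[
Mb_n(x) = \sup_{r>0} \frac{1}{|B(x,r)|}\int_{B(x,r)} |b_n(y)|\,dy = \sup_{r>0} \frac{1}{|B(x,r)|}\sum_k \int_{B(x,r) \cap Q_k} |b_n(y)|\,dy.
\]
Suppose that $x \notin \bigcup 2Q_k$ but that $B(x,r) \cap Q_k \neq \emptyset$. Let $l$ be the side length of $Q_k$. Then necessarily $r \ge l/2$.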

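Note that the diameter of $Q_k$ is $\sqrt{d}\,l \le 2r\sqrt{d}$. This implies that $B(x, r(1+2\sqrt{d})) \supseteq Q_k$. Thus, summing only over those $k$ with $B(x,r) \cap Q_k \neq \emptyset$ in the first two lines,
\begin{align*}
Mb_n(x) &\lesssim_d \sup_{r>0} \frac{1}{|B(x, r(1+2\sqrt{d}))|}\sum_k \int_{Q_k} |b_n(y)|\,dy \\
&= \sup_{r>0} \frac{1}{|B(x, r(1+2\sqrt{d}))|}\sum_k \left(\int_{B(x, r(1+2\sqrt{d}))} \chi_{Q_k}(z)\,dz\right)\left(\frac{1}{|Q_k|}\int_{Q_k} |b_n(y)|\,dy\right) \\
&\le \sup_{r>0} \frac{1}{|B(x, r(1+2\sqrt{d}))|}\int_{B(x, r(1+2\sqrt{d}))} \sum_k \chi_{Q_k}(z) \cdot \frac{1}{|Q_k|}\int_{Q_k} |b_n(y)|\,dy\,dz \\
&= Mb^{\mathrm{avg}}_n(x).
\end{align*}
It follows that $Mb \lesssim Mb^{\mathrm{avg}}$ on the complement of $\bigcup 2Q_k$. So we again apply the $(2,2)$ estimate of $M$, together with the pointwise bound $\|b^{\mathrm{avg}}_n(x)\|_{\ell^2_n} \lesssim \lambda$, to get
\begin{align*}
\left|\left\{ x \notin \bigcup 2Q_k : (Mb)(x) > \frac{\lambda}{2} \right\}\right|
&\le \left|\left\{ x \notin \bigcup 2Q_k : (Mb^{\mathrm{avg}})(x) \gtrsim \lambda \right\}\right|
\lesssim \frac{1}{\lambda^2}\|b^{\mathrm{avg}}\|^2_{L^2} \\
&= \frac{1}{\lambda^2}\left\| \|b^{\mathrm{avg}}_n(x)\|_{\ell^2_n} \right\|^2_{L^2_x}
= \frac{1}{\lambda^2}\left\| \|b^{\mathrm{avg}}_n(x)\|^2_{\ell^2_n} \right\|_{L^1_x} \\
&\lesssim \frac{1}{\lambda}\left\| \|b^{\mathrm{avg}}_n(x)\|_{\ell^2_n} \right\|_{L^1_x}
= \frac{1}{\lambda}\|b^{\mathrm{avg}}\|_{L^1}
\lesssim \frac{1}{\lambda}\|f\|_{L^1}.
\end{align*}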

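We defined the maximal function for vector-valued functions taking values in $\ell^2(\mathbb{N})$, but we can generalize our definition to functions taking values in $\ell^q(\mathbb{N})$ for $1 \le q \le \infty$.

Definition 6.2. Let $1 \le q \le \infty$ and suppose that $f : \mathbb{R}^d \to \ell^q(\mathbb{N})$. Define $M_q f(x) := \|\{Mf_n(x)\}\|_{\ell^q_n}$.

Then:

Theorem 6.5. For $1 < q \le \infty$, $M_q$ is of weak type $(1,1)$ and strong type $(p,p)$ for $1 < p < \infty$.

The restriction $q > 1$ cannot be dropped, as the following example with $q = 1$ shows. Let $d = 1$ and fix an integer $N > 1$. Subdivide $[0,1]$ into $N$ equal subintervals $I_1, \dots, I_N$. Define
\[
f := (\chi_{I_1}, \dots, \chi_{I_N}, 0, \dots).
\]
Then $\|f(x)\|_{\ell^1_n} = \chi_{[0,1]}(x)$. Thus, $\|f\|_{L^p} = 1$ for all $1 \le p \le \infty$. But for $x \in [0,1]$,
\[
(M_1 f)(x) = \sum_{n=1}^N Mf_n(x) = \sum_{n=1}^N \sup_{r>0} \frac{1}{2r}\int_{x-r}^{x+r} |f_n(y)|\,dy \gtrsim \sum_{n=1}^N \frac{1}{n} \sim \log N.
\]
This approaches $\infty$ as $N \to \infty$, so no bound $\|M_1 f\|_{L^p} \lesssim \|f\|_{L^p}$ can hold uniformly in $N$.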
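The logarithmic growth in this example can be checked numerically. The sketch below is an illustration only; it assumes the centered maximal function on $\mathbb{R}$ and uses the closed form for the maximal function of an interval's indicator (the function names are hypothetical, not from the notes).

```python
import math

def maximal_of_interval(x, a, b):
    """Centered maximal function of the indicator of [a, b], evaluated at x."""
    if a < x < b:
        return 1.0                         # small radii give averages equal to 1
    # For x outside the open interval, the best radius just reaches the far
    # endpoint, so that the window [x - r, x + r] captures all of [a, b].
    return (b - a) / (2.0 * max(abs(x - a), abs(x - b)))

def m1_of_example(x, n_intervals):
    """l^1-valued maximal function of f = (chi_{I_1}, ..., chi_{I_N}, 0, ...)."""
    n = n_intervals
    return sum(maximal_of_interval(x, k / n, (k + 1) / n) for k in range(n))

# At the midpoint x = 1/2 with N even, the sum is the harmonic number H_{N/2}.
small = m1_of_example(0.5, 4)
large = m1_of_example(0.5, 1024)
```

At $x = 1/2$ the interval at distance $j/N$ contributes $1/(2j)$ from each side, which is where the harmonic sum, and hence the $\log N$ growth, comes from.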


7 2/3: Hardy-Littlewood-Sobolev inequalities

We prove two theorems, the latter a generalization of the former. The proof of the first theorem is due to Hedberg, and the technique is amenable to proving certain inverse inequalities.

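Theorem 7.1 (Hardy-Littlewood-Sobolev I). Let $f \in \mathcal{S}(\mathbb{R}^d)$. Then
\[
\left\| f * \frac{1}{|x|^\alpha} \right\|_{L^r} \lesssim \|f\|_{L^p}
\]
whenever $1 + \frac{1}{r} = \frac{1}{p} + \frac{\alpha}{d}$ for $1 < p < r < \infty$ and $0 < \alpha < d$.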

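Proof. First, we observe that requiring $f \in \mathcal{S}(\mathbb{R}^d)$ ensures that $f \in L^p(\mathbb{R}^d)$ for all necessary $p$.

Decompose the convolution as follows:
\[
\left( f * \frac{1}{|x|^\alpha} \right)(x) = \int \frac{f(y)}{|x-y|^\alpha}\,dy = \int_{|x-y| \le R} \frac{f(y)}{|x-y|^\alpha}\,dy + \int_{|x-y| > R} \frac{f(y)}{|x-y|^\alpha}\,dy
\]
for some $R > 0$. We will estimate each integral and then optimize in $R$.

Consider the first integral.
\begin{align*}
\left| \int_{|x-y| \le R} \frac{f(y)}{|x-y|^\alpha}\,dy \right|
&\le \int_{|x-y| \le R} \frac{|f(y)|}{|x-y|^\alpha}\,dy
\le \sum_{r \in 2^{\mathbb{Z}};\, r \le R} \int_{r < |x-y| < 2r} \frac{|f(y)|}{|x-y|^\alpha}\,dy \\
&\lesssim \sum_{r \in 2^{\mathbb{Z}};\, r \le R} r^{-\alpha}\int_{|x-y| < 2r} |f(y)|\,dy \\
&\lesssim \sum_{r \in 2^{\mathbb{Z}};\, r \le R} \frac{r^{d-\alpha}}{|B(x,2r)|}\int_{|x-y| < 2r} |f(y)|\,dy.
\end{align*}
The quantity $\frac{1}{|B(x,2r)|}\int_{|x-y| < 2r} |f(y)|\,dy$ is bounded by $Mf(x)$, so
\[
\left| \int_{|x-y| \le R} \frac{f(y)}{|x-y|^\alpha}\,dy \right| \lesssim Mf(x) \sum_{r \in 2^{\mathbb{Z}};\, r \le R} r^{d-\alpha}.
\]
Because $d - \alpha > 0$ and because the $r$'s are dyadic numbers, the sum converges and is bounded (up to a constant) by its largest term. Thus,
\[
\left| \int_{|x-y| \le R} \frac{f(y)}{|x-y|^\alpha}\,dy \right| \lesssim R^{d-\alpha}\,Mf(x).
\]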

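Now we consider the second integral. We have
\[
\int_{|x-y| > R} \frac{f(y)}{|x-y|^\alpha}\,dy = \left( f * \left( \frac{1}{|x|^\alpha}\chi_{\{|x| > R\}} \right) \right)(x).
\]
By Young's convolution inequality (or more directly, Hölder's inequality),
\[
\left\| f * \left( \frac{1}{|x|^\alpha}\chi_{\{|x| > R\}} \right) \right\|_{L^\infty} \lesssim \|f\|_{L^p} \left\| \frac{1}{|x|^\alpha}\chi_{\{|x| > R\}} \right\|_{L^{p'}} = \|f\|_{L^p} \left( \int_{|x| > R} \frac{1}{|x|^{\alpha p'}}\,dx \right)^{1/p'}.
\]
If $\alpha p' > d$, then the integral quantity is finite, and in particular $\lesssim R^{\frac{d - \alpha p'}{p'}}$. Since
\[
\frac{\alpha}{d} + \frac{1}{p} > 1 \ \Rightarrow\ \frac{\alpha}{d} > 1 - \frac{1}{p} = \frac{1}{p'},
\]
this is indeed the case. Thus,
\[
\left| \int_{|x-y| > R} \frac{f(y)}{|x-y|^\alpha}\,dy \right| \le \left\| f * \left( \frac{1}{|x|^\alpha}\chi_{\{|x| > R\}} \right) \right\|_{L^\infty} \lesssim \|f\|_{L^p}\,R^{\frac{d}{p'} - \alpha}.
\]
We optimize our choice of $R$ by requiring the two estimates to be comparable, so that $R^{d-\alpha}\,Mf(x) \sim R^{\frac{d}{p'} - \alpha}\,\|f\|_{L^p}$. Since $d - \frac{d}{p'} = \frac{d}{p}$, this means choosing $R$ so that $R \sim \left( \frac{\|f\|_{L^p}}{Mf(x)} \right)^{p/d}$.

With this choice,
\[
\left| f * \frac{1}{|x|^\alpha} \right|(x) \lesssim \left( \left( \frac{\|f\|_{L^p}}{Mf(x)} \right)^{p/d} \right)^{d-\alpha} Mf(x) = \|f\|^{\frac{d-\alpha}{d}p}_{L^p}\,(Mf(x))^{1 - \frac{p}{d}(d-\alpha)}.
\]
Fun with exponents yields:
\[
1 - \frac{p}{d}(d-\alpha) = p\left( \frac{1}{p} - 1 + \frac{\alpha}{d} \right) = \frac{p}{r}.
\]
So
\[
\left| f * \frac{1}{|x|^\alpha} \right|(x) \lesssim \|f\|^{1 - \frac{p}{r}}_{L^p}\,(Mf(x))^{\frac{p}{r}}.
\]
Taking the $L^r$ norm and using the fact that $M$ is of type $(p,p)$ then gives
\[
\left\| f * \frac{1}{|x|^\alpha} \right\|_{L^r} \lesssim \|f\|^{1 - \frac{p}{r}}_{L^p} \left\| (Mf(x))^{\frac{p}{r}} \right\|_{L^r} = \|f\|^{1 - \frac{p}{r}}_{L^p}\,\|Mf\|^{\frac{p}{r}}_{L^p} \lesssim \|f\|^{1 - \frac{p}{r}}_{L^p}\,\|f\|^{\frac{p}{r}}_{L^p} = \|f\|_{L^p}
\]
as desired.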
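The exponent bookkeeping in this proof is easy to get wrong, so here is a small exact-arithmetic check for one admissible triple. It is only a sanity check for sample values, not a proof, and the helper name `hls_exponents` is hypothetical.

```python
from fractions import Fraction

def hls_exponents(d, alpha, p):
    """Return (r, exponent of ||f||_p, exponent of Mf) in the pointwise bound."""
    d, alpha, p = Fraction(d), Fraction(alpha), Fraction(p)
    r = 1 / (1 / p + alpha / d - 1)        # from 1 + 1/r = 1/p + alpha/d
    norm_exp = (d - alpha) * p / d         # exponent of ||f||_{L^p}
    max_exp = 1 - (p / d) * (d - alpha)    # exponent of Mf(x)
    return r, norm_exp, max_exp

# Sample values: d = 3, alpha = 2, p = 3/2, for which r = 3.
p = Fraction(3, 2)
r, norm_exp, max_exp = hls_exponents(d=3, alpha=2, p=p)
```

For these values both exponents equal $1/2$, matching $1 - p/r$ and $p/r$ respectively.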

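Note that the function $\frac{1}{|x|^\alpha}$ lives in $L^{\frac{d}{\alpha},\infty}(\mathbb{R}^d)$. This suggests the following generalization of the previous theorem.

Theorem 7.2 (Hardy-Littlewood-Sobolev II). For $1 < p < r < \infty$ and $1 < q < \infty$, we have
\[
\|f * g\|_{L^r} \lesssim \|f\|_{L^p}\,\|g\|^*_{L^{q,\infty}}
\]
whenever $1 + \frac{1}{r} = \frac{1}{p} + \frac{1}{q}$.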

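Proof. We begin with a few reductions. By rescaling, we can assume without loss of generality that $\|f\|_{L^p} = \|g\|^*_{L^{q,\infty}} = 1$. Furthermore, note that for a fixed such $g$ it suffices to prove that the operator $f \mapsto f * g$ is of strong type $(p,r)$. In fact, by the Marcinkiewicz interpolation theorem, it suffices to prove that this operator is of weak type $(p,r)$. This is because the conditions $1 < p < r < \infty$ and $1 + \frac{1}{r} = \frac{1}{p} + \frac{1}{q}$ (with $q$ fixed) describe an open range of exponents, so the strong type bound at a given pair follows by interpolating between weak type bounds at nearby pairs.

Fix $\lambda > 0$. We need to show that $\|f * g\|^*_{L^{r,\infty}} \lesssim 1$, i.e., $|\{ x : |f * g|(x) > \lambda \}| \lesssim \lambda^{-r}$. As in the previous proof, we decompose $g = g\chi_{|g| \le R} + g\chi_{|g| > R} =: g_1 + g_2$. Then
\[
|\{ x : |f * g|(x) > \lambda \}| \le \left|\left\{ x : |f * g_1|(x) > \frac{\lambda}{2} \right\}\right| + \left|\left\{ x : |f * g_2|(x) > \frac{\lambda}{2} \right\}\right|.
\]
We will show that the first quantity on the right is 0 for an appropriately chosen $R$. Intuitively, since $g_1$ is bounded by $R$, the convolution of $f$ with $g_1$ cannot be too large. Computing:
\[
\|f * g_1\|_{L^\infty} \lesssim \|f\|_{L^p}\,\|g_1\|_{L^{p'}} = \|g_1\|_{L^{p'}} \lesssim \left( \int_0^\infty \alpha^{p'}\,|\{ x : |g_1(x)| > \alpha \}|\,\frac{d\alpha}{\alpha} \right)^{1/p'} = \left( \int_0^R \alpha^{p'}\,|\{ x : |g_1(x)| > \alpha \}|\,\frac{d\alpha}{\alpha} \right)^{1/p'},
\]
where we have recalled an equivalent expression for the $L^{p'}$ norm (see the section on Lorentz spaces) and used the fact that $|g_1| \le R$. Continuing,
\begin{align*}
\|f * g_1\|_{L^\infty}
&\lesssim \left( \int_0^R \sup_{\alpha>0}\left[ \alpha^q\,|\{ x : |g_1(x)| > \alpha \}| \right] \alpha^{p'-q}\,\frac{d\alpha}{\alpha} \right)^{1/p'} \\
&= \left( \sup_{\alpha>0}\left[ \alpha^q\,|\{ x : |g_1(x)| > \alpha \}| \right] \right)^{1/p'} \left( \int_0^R \alpha^{p'-q}\,\frac{d\alpha}{\alpha} \right)^{1/p'}.
\end{align*}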

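The first factor is at most $\left( \|g\|^*_{L^{q,\infty}} \right)^{q/p'}$ (since $|g_1| \le |g|$), which we are assuming is 1. The integral in the second factor is finite if $p' - q > 0$. By assumption,
\[
1 - \frac{1}{p} = \frac{1}{q} - \frac{1}{r} \ \Rightarrow\ \frac{1}{p'} < \frac{1}{q} \ \Rightarrow\ p' > q.
\]
So the integral is finite, and in particular, $\left( \int_0^R \alpha^{p'-q}\,\frac{d\alpha}{\alpha} \right)^{1/p'} \lesssim R^{\frac{p'-q}{p'}}$. Putting this all together gives
\[
\|f * g_1\|_{L^\infty} \lesssim R^{\frac{p'-q}{p'}}.
\]
Recall that we are estimating $\left|\left\{ x : |f * g_1|(x) > \frac{\lambda}{2} \right\}\right|$. Thus, if we choose $R$ small enough depending on $\lambda$, say $R^{\frac{p'-q}{p'}} = c\lambda$ for some sufficiently small constant $c$, then $\|f * g_1\|_{L^\infty} \le \lambda/2$, and so
\[
\left|\left\{ x : |f * g_1|(x) > \frac{\lambda}{2} \right\}\right| = 0.
\]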

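Thus, it remains to estimate $\left|\left\{ x : |f * g_2|(x) > \frac{\lambda}{2} \right\}\right|$. By Chebyshev's inequality and Young's inequality,
\[
\left|\left\{ x : |f * g_2|(x) > \frac{\lambda}{2} \right\}\right| \lesssim \lambda^{-p}\,\|f * g_2\|^p_{L^p} \lesssim \lambda^{-p}\left( \|f\|_{L^p}\,\|g_2\|_{L^1} \right)^p = \lambda^{-p}\left( \int_0^\infty \alpha\,|\{ x : |g_2(x)| > \alpha \}|\,\frac{d\alpha}{\alpha} \right)^p.
\]
We consider the integral separately. For $\alpha < R$ we have $\{ |g_2| > \alpha \} = \{ |g| > R \}$, and for $\alpha \ge R$ we have $\{ |g_2| > \alpha \} = \{ |g| > \alpha \}$, so
\begin{align*}
\int_0^\infty \alpha\,|\{ x : |g_2(x)| > \alpha \}|\,\frac{d\alpha}{\alpha}
&= \int_0^R \alpha\,|\{ x : |g_2(x)| > \alpha \}|\,\frac{d\alpha}{\alpha} + \int_R^\infty \alpha\,|\{ x : |g_2(x)| > \alpha \}|\,\frac{d\alpha}{\alpha} \\
&\le R \cdot |\{ x : |g(x)| > R \}| + \sup_{\alpha>0} \alpha^q\,|\{ x : |g(x)| > \alpha \}| \int_R^\infty \alpha^{1-q}\,\frac{d\alpha}{\alpha}.
\end{align*}
As before, $\sup_{\alpha>0} \alpha^q\,|\{ x : |g(x)| > \alpha \}| = \left( \|g\|^*_{L^{q,\infty}} \right)^q = 1$. Furthermore, since $q > 1$, $\int_R^\infty \alpha^{1-q}\,\frac{d\alpha}{\alpha} \lesssim R^{1-q}$. So
\[
\int_0^\infty \alpha\,|\{ x : |g_2(x)| > \alpha \}|\,\frac{d\alpha}{\alpha} \lesssim R^q \cdot |\{ x : |g(x)| > R \}|\,R^{1-q} + R^{1-q}.
\]
But since $R^q \cdot |\{ x : |g(x)| > R \}| \le \left( \|g\|^*_{L^{q,\infty}} \right)^q = 1$, we then have
\[
\int_0^\infty \alpha\,|\{ x : |g_2(x)| > \alpha \}|\,\frac{d\alpha}{\alpha} \lesssim R^{1-q}.
\]
Plugging this into our original estimate and using $R^{1 - \frac{q}{p'}} = c\lambda$, i.e., $R \sim \lambda^{\frac{p'}{p'-q}}$,
\[
|\{ x : |f * g_2| > \lambda/2 \}| \lesssim \lambda^{-p}\,R^{p(1-q)} \lesssim \lambda^{-p}\,\lambda^{\left( \frac{p'}{p'-q} \right) p(1-q)} = \lambda^{-r},
\]
where the last equality follows from $p(p'-1) = p'$ and $\frac{1}{r} = \frac{1}{q} - \frac{1}{p'}$.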
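The final exponent identity, $-p + \frac{p'}{p'-q}\,p(1-q) = -r$, can be spot-checked with exact rational arithmetic. The following is a sketch for sample values only, with a hypothetical helper name.

```python
from fractions import Fraction

def weak_type_exponent(p, q):
    """Return (r, power of lambda) obtained at the end of the proof."""
    p, q = Fraction(p), Fraction(q)
    p_conj = p / (p - 1)                   # Hoelder conjugate p'
    r = 1 / (1 / p + 1 / q - 1)            # from 1 + 1/r = 1/p + 1/q
    # R^(1 - q/p') = c * lambda, so R is lambda to the power p'/(p' - q).
    R_power = p_conj / (p_conj - q)
    lam_power = -p + R_power * p * (1 - q)
    return r, lam_power

# Sample values: p = 2, q = 3/2, so p' = 2 > q and r = 6.
r, lam_power = weak_type_exponent(2, Fraction(3, 2))
```

Here the power of $\lambda$ comes out to $-6 = -r$, as the proof requires.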


8 2/6: Sobolev Embedding Theorem

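As a consequence of the Hardy-Littlewood-Sobolev inequalities, we prove a Sobolev Embedding Theorem.

First, we perform a computation. Our first goal will be to (formally) take the Fourier transform of something like $\frac{1}{|x|^{d-\alpha}}$. Towards that, suppose that $0 < \alpha < d$. Then, substituting $u = \pi t |x|^2$:
\[
\int_0^\infty e^{-\pi t |x|^2}\,t^{\frac{d-\alpha}{2}}\,\frac{dt}{t} = \int_0^\infty e^{-u}\left( \frac{u}{\pi |x|^2} \right)^{\frac{d-\alpha}{2}}\frac{du}{u} = \pi^{-\frac{d-\alpha}{2}}\,\Gamma\!\left( \frac{d-\alpha}{2} \right)\frac{1}{|x|^{d-\alpha}}.
\]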
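As a sanity check, this identity can be verified numerically. The sketch below is an illustration only: it writes $\beta = \frac{d-\alpha}{2}$, substitutes $t = e^s$ (so $\frac{dt}{t} = ds$), and applies a plain trapezoid rule; the function names and the truncation window are choices made here, not part of the notes.

```python
import math

def radial_integral(x_norm, beta, s_min=-60.0, s_max=6.0, steps=66000):
    """Trapezoid approximation of int_0^inf exp(-pi t |x|^2) t^beta dt/t.

    Uses the substitution t = exp(s), under which dt/t becomes ds.
    """
    h = (s_max - s_min) / steps
    total = 0.0
    for i in range(steps + 1):
        s = s_min + i * h
        value = math.exp(-math.pi * math.exp(s) * x_norm ** 2 + beta * s)
        total += value if 0 < i < steps else 0.5 * value
    return total * h

def closed_form(x_norm, beta):
    """pi^(-beta) * Gamma(beta) * |x|^(-2 beta), with beta = (d - alpha) / 2."""
    return math.pi ** (-beta) * math.gamma(beta) * x_norm ** (-2 * beta)

# d = 3, alpha = 1 gives beta = 1; both sides should be 1 / (4 pi) at |x| = 2.
numeric = radial_integral(2.0, 1.0)
exact = closed_form(2.0, 1.0)
```

The truncation to $s \in [-60, 6]$ is harmless for $\beta \ge 1/2$ and moderate $|x|$, since the integrand decays exponentially as $s \to -\infty$ and doubly exponentially as $s \to \infty$.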

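As noted, we want to take the Fourier transform of this quantity, but $\frac{1}{|x|^{d-\alpha}} \notin L^2(\mathbb{R}^d)$. Thus, we consider the Fourier transform in the sense of distributions. Recall that if $T \in \mathcal{S}'(\mathbb{R}^d)$ is a tempered distribution, we define its Fourier transform by $\hat{T}(f) := T(\hat{f})$ for $f \in \mathcal{S}(\mathbb{R}^d)$. So for $f \in \mathcal{S}(\mathbb{R}^d)$, we have
\begin{align*}
\left( \pi^{-\frac{d-\alpha}{2}}\,\Gamma\!\left( \frac{d-\alpha}{2} \right)\frac{1}{|x|^{d-\alpha}} \right)^{\wedge}(f)
&= \left( \pi^{-\frac{d-\alpha}{2}}\,\Gamma\!\left( \frac{d-\alpha}{2} \right)\frac{1}{|x|^{d-\alpha}} \right)(\hat{f}) \\
&= \int_{\mathbb{R}^d} \pi^{-\frac{d-\alpha}{2}}\,\Gamma\!\left( \frac{d-\alpha}{2} \right)\frac{1}{|x|^{d-\alpha}}\,\hat{f}(x)\,dx.
\end{align*}
Applying our computation from above and the definition of the Fourier transform, we get:
\begin{align*}
&= \int_{\mathbb{R}^d} \left( \int_0^\infty e^{-\pi t |x|^2}\,t^{\frac{d-\alpha}{2}}\,\frac{dt}{t} \right) \left( \int_{\mathbb{R}^d} e^{-2\pi i x \cdot y} f(y)\,dy \right) dx \\
&= \int_{\mathbb{R}^d} \int_0^\infty \left( \int_{\mathbb{R}^d} e^{-\pi t |x|^2}\,e^{-2\pi i x \cdot y}\,dx \right) t^{\frac{d-\alpha}{2}}\,\frac{dt}{t}\,f(y)\,dy.
\end{align*}
The quantity in the parentheses is precisely the Fourier transform of $e^{-\pi t |x|^2}$. Recall that
\[
\left( e^{-x \cdot Ax} \right)^{\wedge}(\xi) = \pi^{\frac{d}{2}}\,(\det A)^{-\frac{1}{2}}\,e^{-\pi^2 \xi \cdot A^{-1}\xi}.
\]
Thus, taking $A = \pi t I$,
\[
\int_{\mathbb{R}^d} e^{-\pi t |x|^2}\,e^{-2\pi i x \cdot y}\,dx = \pi^{\frac{d}{2}}\,(\pi t)^{-\frac{d}{2}}\,e^{-\pi^2 y \cdot \frac{1}{\pi t} y} = t^{-\frac{d}{2}}\,e^{-\frac{\pi |y|^2}{t}}.
\]
Continuing the computation, and substituting $u = \frac{\pi |y|^2}{t}$ in the second line, gives
\begin{align*}
&= \int_{\mathbb{R}^d} \int_0^\infty t^{-\frac{d}{2}}\,e^{-\frac{\pi |y|^2}{t}}\,t^{\frac{d-\alpha}{2}}\,\frac{dt}{t}\,f(y)\,dy
= \int_{\mathbb{R}^d} \int_0^\infty e^{-\frac{\pi |y|^2}{t}}\,t^{-\frac{\alpha}{2}}\,\frac{dt}{t}\,f(y)\,dy \\
&= \int_{\mathbb{R}^d} \int_0^\infty \left( \frac{\pi |y|^2}{u} \right)^{-\frac{\alpha}{2}} e^{-u}\,\frac{du}{u}\,f(y)\,dy \\
&= \int_{\mathbb{R}^d} \int_0^\infty \pi^{-\frac{\alpha}{2}}\,|y|^{-\alpha}\,u^{\frac{\alpha}{2}}\,e^{-u}\,\frac{du}{u}\,f(y)\,dy \\
&= \int_{\mathbb{R}^d} \pi^{-\frac{\alpha}{2}}\,\Gamma\!\left( \frac{\alpha}{2} \right) |y|^{-\alpha}\,f(y)\,dy \\
&= \left( \pi^{-\frac{\alpha}{2}}\,\Gamma\!\left( \frac{\alpha}{2} \right) |y|^{-\alpha} \right)(f).
\end{align*}
This proves that, in the sense of tempered distributions,
\[
\left( \pi^{-\frac{d-\alpha}{2}}\,\Gamma\!\left( \frac{d-\alpha}{2} \right)\frac{1}{|x|^{d-\alpha}} \right)^{\wedge} = \pi^{-\frac{\alpha}{2}}\,\Gamma\!\left( \frac{\alpha}{2} \right)\frac{1}{|x|^{\alpha}}. \tag{4}
\]
We will use this fact to prove a Sobolev embedding inequality.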

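Definition 8.1. Fix $s > -d$. Define the operator $|\nabla|^s$ via its action on the Fourier side: for $f \in \mathcal{S}(\mathbb{R}^d)$, define
\[
(|\nabla|^s f)^{\wedge}(\xi) := (2\pi|\xi|)^s\,\hat{f}(\xi)
\]
where this equality is understood in the sense of distributions.

This operator is meant to generalize the notion of derivative, which is clear by its definition in the Fourier domain. Here $s$ informally represents the number of derivatives being taken. We require $s > -d$ so that the quantity $(2\pi|\xi|)^s$ is locally integrable, and hence makes sense as a distribution on $\mathbb{R}^d$.

We reiterate the fact that $|\nabla|^s f$ is a tempered distribution. So for $\varphi \in \mathcal{S}(\mathbb{R}^d)$,
\[
(|\nabla|^s f)^{\wedge}(\varphi) = (|\nabla|^s f)(\hat{\varphi}) = \int_{\mathbb{R}^d} (|\nabla|^s f)(x)\,\hat{\varphi}(x)\,dx
\]
but on the other hand
\[
(|\nabla|^s f)^{\wedge}(\varphi) = \int_{\mathbb{R}^d} (2\pi|\xi|)^s\,\hat{f}(\xi)\,\varphi(\xi)\,d\xi.
\]
This gives us a representation for $|\nabla|^s f$: for $\psi \in \mathcal{S}(\mathbb{R}^d)$, taking $\varphi = \check{\psi}$,
\[
(|\nabla|^s f)(\psi) = \int_{\mathbb{R}^d} (2\pi|\xi|)^s\,\hat{f}(\xi)\,\check{\psi}(\xi)\,d\xi.
\]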

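Theorem 8.1 (Sobolev Embedding Theorem). Fix $1 < p < q < \infty$ and $s > 0$. Let $f \in \mathcal{S}(\mathbb{R}^d)$ be such that $|\nabla|^s f \in L^p$.¹ Then
\[
\|f\|_{L^q} \lesssim \||\nabla|^s f\|_{L^p}
\]
provided $\frac{1}{p} = \frac{1}{q} + \frac{s}{d}$.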

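Proof. Recall that $\mathcal{S}(\mathbb{R}^d)$ is dense in $L^{q'}$. Thus, by duality of $L^p$ norms and Plancherel's theorem,
\[
\|f\|_{L^q} = \sup_{g \in \mathcal{S}(\mathbb{R}^d) :\, \|g\|_{L^{q'}} = 1} \langle f, g \rangle = \sup_{g \in \mathcal{S} :\, \|g\|_{L^{q'}} = 1} \langle \hat{f}, \hat{g} \rangle = \sup_{g \in \mathcal{S} :\, \|g\|_{L^{q'}} = 1} \left\langle (2\pi|\xi|)^s \hat{f}(\xi),\ (2\pi|\xi|)^{-s} \hat{g}(\xi) \right\rangle.
\]
By our previous comments, $(2\pi|\xi|)^s \hat{f}(\xi) \in \mathcal{S}'(\mathbb{R}^d)$. Thus, we would like for $(2\pi|\xi|)^{-s} \hat{g}(\xi)$ to be a Schwartz function. But if $\hat{g}$ does not vanish near the origin, the singularity at $\xi = 0$ may prevent this from being true. Thus, we need to consider a slightly smaller space of functions.

We claim that the family of functions
\[
\mathcal{F} := \left\{ g \in \mathcal{S}(\mathbb{R}^d) : \hat{g} \text{ vanishes on a neighborhood of } \xi = 0 \right\}
\]
is dense in $L^p$ for $1 < p < \infty$. Since $\mathcal{S}$ is dense in $L^p$, it suffices to show that every $g \in \mathcal{S}$ can be approximated in $L^p$ by elements of $\mathcal{F}$.

Towards showing this claim, fix $g \in \mathcal{S}(\mathbb{R}^d)$. Let $\varphi$ be a smooth bump function on $\mathbb{R}^d$ such that $\varphi(\xi) = 1$ for $|\xi| \le 1$ and $\varphi(\xi) = 0$ for $|\xi| \ge 2$. Define $g_\varepsilon$ by
\[
\widehat{g_\varepsilon}(\xi) := \hat{g}(\xi)\left( 1 - \varphi\!\left( \frac{\xi}{\varepsilon} \right) \right),
\]
so that $g_\varepsilon \in \mathcal{F}$. Note that
\[
\hat{g}(\xi)\left( 1 - \varphi\!\left( \frac{\xi}{\varepsilon} \right) \right) = \hat{g}(\xi) - \hat{g}(\xi)\,\varphi\!\left( \frac{\xi}{\varepsilon} \right) = \left( g - g * \left( \varphi\!\left( \frac{\cdot}{\varepsilon} \right) \right)^{\vee} \right)^{\wedge}(\xi).
\]
So that, by Young's inequality,
\[
\|g - g_\varepsilon\|_{L^p} = \left\| g * \left( \varphi\!\left( \frac{\cdot}{\varepsilon} \right) \right)^{\vee} \right\|_{L^p} = \left\| g * \varepsilon^d \check{\varphi}(\varepsilon x) \right\|_{L^p} \lesssim \varepsilon^d\,\|g\|_{L^1}\,\varepsilon^{-\frac{d}{p}}\,\|\check{\varphi}\|_{L^p}.
\]
Since $g$ and $\varphi$ are fixed and because $p > 1$ (so that $d - \frac{d}{p} > 0$), the above quantity $\to 0$ as $\varepsilon \to 0$.

Note that the above argument fails for $p = 1$. Indeed, $\mathcal{F}$ is not dense in $L^1$: $\hat{g}(0)$ is the total integral of $g$ over $\mathbb{R}^d$, and functions with mean zero are certainly not dense in $L^1$.

But, for $1 < p < \infty$, we now have a new dense subset $\mathcal{F}$. So
\[
\|f\|_{L^q} = \sup_{g \in \mathcal{F} :\, \|g\|_{L^{q'}} = 1} \left\langle (2\pi|\xi|)^s \hat{f}(\xi),\ (2\pi|\xi|)^{-s} \hat{g}(\xi) \right\rangle.
\]

¹ In the sense of distributions, i.e., $|\nabla|^s f$ is given by integration against a function in $L^p$.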

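Now the functions $(2\pi|\xi|)^{-s} \hat{g}(\xi)$ are certainly Schwartz, since $\hat{g}$ is a Schwartz function which vanishes in a neighborhood of 0.

Applying Plancherel again, followed by Hölder's inequality, gives
\[
\|f\|_{L^q} = \sup_{g \in \mathcal{F} :\, \|g\|_{L^{q'}} = 1} \left\langle |\nabla|^s f,\ |\nabla|^{-s} g \right\rangle \le \sup_{g \in \mathcal{F} :\, \|g\|_{L^{q'}} = 1} \||\nabla|^s f\|_{L^p}\,\left\| |\nabla|^{-s} g \right\|_{L^{p'}}.
\]
We then have
\[
\left( |\nabla|^{-s} g \right)(x) = \left( (2\pi|\xi|)^{-s}\,\hat{g}(\xi) \right)^{\vee}(x).
\]
Now, recall that if $T$ is a tempered distribution and $g$ is a Schwartz function, then $(T * g)^{\wedge} = \hat{T} \cdot \hat{g}$. Since $(2\pi|\xi|)^{-s}$ is a tempered distribution and $g$ is Schwartz, it follows that
\[
\left( |\nabla|^{-s} g \right)(x) = \left( \left( (2\pi|\xi|)^{-s} \right)^{\vee} * g \right)(x).
\]
Our computation from (4) tells us that $\left( (2\pi|\xi|)^{-s} \right)^{\vee} \sim_{d,s} |x|^{s-d}$. Therefore,
\[
|\nabla|^{-s} g \sim_{d,s} |x|^{s-d} * g.
\]
By the first Hardy-Littlewood-Sobolev inequality,
\[
\left\| g * \frac{1}{|x|^{d-s}} \right\|_{L^{p'}} \lesssim \|g\|_{L^{q'}}
\]
provided $1 + \frac{1}{p'} = \frac{1}{q'} + \frac{d-s}{d}$. But this is true precisely when
\[
1 + 1 - \frac{1}{p} = 1 - \frac{1}{q} + 1 - \frac{s}{d} \iff -\frac{1}{p} = -\frac{1}{q} - \frac{s}{d} \iff \frac{1}{p} = \frac{1}{q} + \frac{s}{d},
\]
which is the assumption of the theorem.

Therefore,
\[
\|f\|_{L^q} \lesssim \sup_{g \in \mathcal{F} :\, \|g\|_{L^{q'}} = 1} \||\nabla|^s f\|_{L^p}\,\left\| |\nabla|^{-s} g \right\|_{L^{p'}} \lesssim \sup_{g \in \mathcal{F} :\, \|g\|_{L^{q'}} = 1} \||\nabla|^s f\|_{L^p}\,\|g\|_{L^{q'}} = \||\nabla|^s f\|_{L^p}
\]
as desired.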

9 2/8 — 2/13: Calderon-Zygmund Convolution Kernels

Consider the following question. For a Schwartz function $f$, is it true that $\||\nabla| f\|_{L^p} \lesssim \|\nabla f\|_{L^p}$ for $1 < p < \infty$?

The usefulness of this question is apparent from the fact that $|\nabla|^s$ is a nonlocal operator, in that it was defined by multiplication in the Fourier domain, hence by convolution in the physical domain. On the other hand, $\nabla$ is a purely local operator.

Recall that $(|\nabla| f)^{\wedge}(\xi) = 2\pi|\xi|\,\hat{f}(\xi)$. Write
\[
2\pi|\xi| = 2\pi\,\frac{|\xi|^2}{|\xi|} = 2\pi \sum_j \frac{\xi_j^2}{|\xi|} = \sum_j -i\,\frac{\xi_j}{|\xi|} \cdot 2\pi i \xi_j = \sum_j m_j(\xi)\,2\pi i \xi_j
\]
where $m_j(\xi) := -i\,\frac{\xi_j}{|\xi|}$ are called Riesz multipliers. If we can prove that the Riesz multipliers are bounded on $L^p$, that is, that the operators $R_j$ defined via $(R_j f)^{\wedge}(\xi) := m_j(\xi)\,\hat{f}(\xi)$, or equivalently $R_j f = \check{m}_j * f$, are bounded on $L^p$, then the answer to our original question is yes. Indeed, by our above computation,
\[
(|\nabla| f)^{\wedge}(\xi) = 2\pi|\xi|\,\hat{f}(\xi) = \sum_j m_j(\xi)\,2\pi i \xi_j\,\hat{f}(\xi)
\]
so that $|\nabla| f = \sum_j R_j(\partial_j f)$. By the triangle inequality and the equivalence of norms on finite dimensional spaces,
\[
\||\nabla| f\|_{L^p} \le \sum_j \|R_j(\partial_j f)\|_{L^p} \lesssim \sum_j \|\partial_j f\|_{L^p} \lesssim \|\nabla f\|_{L^p}.
\]
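The algebraic identity behind the Riesz multipliers, $\sum_j m_j(\xi)\,2\pi i \xi_j = 2\pi|\xi|$, is simple to check numerically at a sample frequency. This is just an illustration; the function names are hypothetical.

```python
import math

def riesz_multipliers(xi):
    """m_j(xi) = -i * xi_j / |xi| for a nonzero frequency vector xi."""
    norm = math.sqrt(sum(c * c for c in xi))
    return [-1j * c / norm for c in xi]

def reconstructed_symbol(xi):
    """sum_j m_j(xi) * (2 pi i xi_j), which should equal 2 pi |xi|."""
    return sum(m * (2j * math.pi * c) for m, c in zip(riesz_multipliers(xi), xi))

# Sample frequency with |xi| = 5.
xi = (3.0, -4.0)
symbol = reconstructed_symbol(xi)
```

The factor $-i$ in $m_j$ is chosen exactly so that it cancels the $i$ in the symbol $2\pi i \xi_j$ of $\partial_j$, leaving the real, nonnegative symbol of $|\nabla|$.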

To prove that this is indeed the case, we consider a more general class of objects.

Definition 9.1. A function $K : \mathbb{R}^d \setminus \{0\} \to \mathbb{C}$ is a Calderon-Zygmund convolution kernel if it satisfies:

1. $|K(x)| \lesssim \frac{1}{|x|^d}$ uniformly in $x \neq 0$;

2. $\int_{R_1 \le |x| \le R_2} K(x)\,dx = 0$ for all $0 < R_1 < R_2 < \infty$;

3. $\int_{|x| \ge 2|y|} |K(x) - K(x+y)|\,dx \lesssim 1$ uniformly in $y \in \mathbb{R}^d$.

The first condition says that $K$ is allowed to have a singularity at 0, but not too extreme. The second condition says that $K$ satisfies some sort of cancellation property; note that if $K$ is an odd function (in an appropriate sense), then this condition is satisfied. The third condition is a smoothness requirement.

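Also, note that if $|\nabla K|(x) \lesssim \frac{1}{|x|^{1+d}}$, then $K$ satisfies condition 3. Indeed, we have by the fundamental theorem of calculus
\[
K(x) - K(x+y) = -\int_0^1 y \cdot \nabla K(x + \theta y)\,d\theta.
\]
So
\[
\int_{|x| \ge 2|y|} |K(x) - K(x+y)|\,dx \le \int_{|x| \ge 2|y|} \int_0^1 |y| \cdot \frac{1}{|x + \theta y|^{1+d}}\,d\theta\,dx.
\]
Since $|x| \ge 2|y|$, we have $|x + \theta y| \ge |x| - |y| \ge \frac{|x|}{2}$, i.e., $|x + \theta y| \gtrsim |x|$. Thus,
\[
\int_{|x| \ge 2|y|} |K(x) - K(x+y)|\,dx \lesssim \int_{|x| \ge 2|y|} \int_0^1 |y| \cdot \frac{1}{|x|^{1+d}}\,d\theta\,dx = \int_{|x| \ge 2|y|} |y| \cdot \frac{1}{|x|^{1+d}}\,dx \lesssim |y| \cdot \frac{1}{|y|} = 1
\]
and so $K$ satisfies the third condition.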

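We also remark that the kernels of the Riesz transforms (the inverse Fourier transforms of the Riesz multipliers discussed above) are also Calderon-Zygmund convolution kernels. To see this, first note that
\[
K_j := \check{m}_j = \left( -i\,\frac{\xi_j}{|\xi|} \right)^{\vee} = \left( -\frac{1}{2\pi} \cdot \frac{2\pi i \xi_j}{|\xi|} \right)^{\vee}.
\]
In our discussion of the Sobolev Embedding theorem, we discussed the inverse Fourier transform of functions of the form $\frac{1}{|\xi|^s}$. Thus,
\[
K_j(x) \sim_d -\frac{1}{2\pi}\,\partial_j\!\left( \frac{1}{|x|^{d-1}} \right) \sim_d \frac{x_j}{|x|^{d+1}}
\]
since $\partial_j |x| = \frac{x_j}{|x|}$. Then clearly $|K_j(x)| \lesssim_d \frac{1}{|x|^d}$ and $\int_{R_1 \le |x| \le R_2} K_j(x)\,dx = 0$ (since $x \mapsto x_j$ is odd). Finally, we wish to show $|\nabla K_j| \lesssim \frac{1}{|x|^{1+d}}$ to verify the third condition. Note that
\[
\partial_k K_j(x) \sim \partial_k\!\left( x_j |x|^{-(d+1)} \right) = \delta_{jk}|x|^{-(d+1)} - (d+1)\,x_j\,|x|^{-(d+1)-1}\,x_k |x|^{-1} = \frac{\delta_{jk}}{|x|^{d+1}} - (d+1)\frac{x_j x_k}{|x|^{d+3}}
\]
so certainly $|\nabla K_j| \lesssim \frac{1}{|x|^{1+d}}$. Thus, $K_j$ is a Calderon-Zygmund convolution kernel.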

With this in mind, we prove the following theorem.

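Theorem 9.1. Let $K$ be a Calderon-Zygmund convolution kernel. For $\varepsilon > 0$, let $K_\varepsilon := K \cdot \chi_{\varepsilon \le |x| \le 1/\varepsilon}$. Then $\|K_\varepsilon * f\|_{L^2} \lesssim \|f\|_{L^2}$ uniformly in $\varepsilon > 0$. Moreover, the limit $K * f := \lim_{\varepsilon \to 0} K_\varepsilon * f$ exists in $L^2$ for $f \in \mathcal{S}(\mathbb{R}^d)$, and the operator $f \mapsto K * f$ extends from $\mathcal{S}(\mathbb{R}^d)$ to a bounded operator on $L^2(\mathbb{R}^d)$.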

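Proof. First, we will show that $K_\varepsilon$ is a Calderon-Zygmund convolution kernel, with constants independent of $\varepsilon$. Because $K_\varepsilon$ is just a restriction of $K$ to an annulus centered at the origin, conditions 1 and 2 are immediate. We only need to verify that
\[
\int_{|x| \ge 2|y|} |K_\varepsilon(x) - K_\varepsilon(x+y)|\,dx \lesssim 1
\]
uniformly in $y \in \mathbb{R}^d$ and $\varepsilon > 0$.

To do this, consider the following three subregions of $|x| \ge 2|y|$:
\begin{align*}
A &= \{ |x| \ge 2|y|,\ \varepsilon \le |x| \le 1/\varepsilon,\ \varepsilon \le |x+y| \le 1/\varepsilon \}, \\
B &= \{ |x| \ge 2|y|,\ \varepsilon \le |x| \le 1/\varepsilon,\ |x+y| < \varepsilon \text{ or } |x+y| > 1/\varepsilon \}, \\
C &= \{ |x| \ge 2|y|,\ \varepsilon \le |x+y| \le 1/\varepsilon,\ |x| < \varepsilon \text{ or } |x| > 1/\varepsilon \}.
\end{align*}
Then
\[
\int_{|x| \ge 2|y|} |K_\varepsilon(x) - K_\varepsilon(x+y)|\,dx \le \int_A |K(x) - K(x+y)|\,dx + \int_B |K(x)|\,dx + \int_C |K(x+y)|\,dx.
\]
The integral over $A$ is $\lesssim 1$ because $K$ is a Calderon-Zygmund convolution kernel.

Consider the integral over $B$. Assume first that $|x+y| < \varepsilon$. Then $|x| \le |x+y| + |y| < \varepsilon + \frac{|x|}{2}$, hence $|x| < 2\varepsilon$. The contribution of this part of the integral is bounded by
\[
\int_{\varepsilon \le |x| \le 2\varepsilon} |K(x)|\,dx \lesssim \int_{\varepsilon \le |x| \le 2\varepsilon} \frac{1}{|x|^d}\,dx \lesssim 1.
\]
If $|x+y| > 1/\varepsilon$, then $|x| \ge |x+y| - |y| > 1/\varepsilon - |x|/2$, hence $|x| \ge \frac{2}{3\varepsilon}$. The contribution of this part of the integral is bounded by
\[
\int_{\frac{2}{3\varepsilon} \le |x| \le \frac{1}{\varepsilon}} |K(x)|\,dx \lesssim \int_{\frac{2}{3\varepsilon} \le |x| \le \frac{1}{\varepsilon}} \frac{1}{|x|^d}\,dx \lesssim 1.
\]
Thus, $\int_B |K(x)|\,dx \lesssim 1$.

Bounding the integral over $C$ is essentially the same argument, paired with a change of variables. Thus, $\int_C |K(x+y)|\,dx \lesssim 1$. This verifies the third condition, hence $K_\varepsilon$ is a Calderon-Zygmund convolution kernel.

Next, we want to show $\|K_\varepsilon * f\|_{L^2} \lesssim \|f\|_{L^2}$ uniformly in $\varepsilon > 0$. By Plancherel, we have
\[
\|K_\varepsilon * f\|_{L^2} = \left\| \widehat{K_\varepsilon * f} \right\|_{L^2} = \left\| \widehat{K_\varepsilon} \cdot \hat{f} \right\|_{L^2} \le \left\| \widehat{K_\varepsilon} \right\|_{L^\infty} \left\| \hat{f} \right\|_{L^2} = \left\| \widehat{K_\varepsilon} \right\|_{L^\infty} \|f\|_{L^2}.
\]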

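Thus, it suffices to prove that $\left\| \widehat{K_\varepsilon} \right\|_{L^\infty} \lesssim 1$ uniformly in $\varepsilon$.

We will decompose the Fourier transform of $K_\varepsilon$ into an integral without much oscillation and an integral with oscillation. Explicitly,
\[
\widehat{K_\varepsilon}(\xi) = \int e^{-2\pi i x \cdot \xi} K_\varepsilon(x)\,dx = \int_{|x| \le 1/|\xi|} e^{-2\pi i x \cdot \xi} K_\varepsilon(x)\,dx + \int_{|x| > 1/|\xi|} e^{-2\pi i x \cdot \xi} K_\varepsilon(x)\,dx.
\]
In the first integral, $|x \cdot \xi| \le |x||\xi| \le 1$, thus the term $e^{-2\pi i x \cdot \xi}$ is restricted in its oscillation. Using the cancellation property of Calderon-Zygmund convolution kernels, we have $\int_{|x| \le 1/|\xi|} K_\varepsilon(x)\,dx = \int_{\varepsilon \le |x| \le \min(1/|\xi|,\,1/\varepsilon)} K(x)\,dx = 0$. Thus,
\[
\left| \int_{|x| \le 1/|\xi|} e^{-2\pi i x \cdot \xi} K_\varepsilon(x)\,dx \right| = \left| \int_{|x| \le 1/|\xi|} \left( e^{-2\pi i x \cdot \xi} - 1 \right) K_\varepsilon(x)\,dx \right| \le \int_{|x| \le 1/|\xi|} \left| e^{-2\pi i x \cdot \xi} - 1 \right| |K_\varepsilon(x)|\,dx.
\]
Because $K_\varepsilon$ is a Calderon-Zygmund convolution kernel, $|K_\varepsilon(x)| \lesssim |x|^{-d}$. The quantity $|e^{-2\pi i x \cdot \xi} - 1|$ is the distance from the point $e^{-2\pi i x \cdot \xi}$ on the unit circle to 1. This is bounded by the length of the arc of the unit circle joining the two points, which is at most $2\pi |x \cdot \xi| \lesssim |x||\xi|$. Therefore,
\[
\left| \int_{|x| \le 1/|\xi|} e^{-2\pi i x \cdot \xi} K_\varepsilon(x)\,dx \right| \lesssim \int_{\varepsilon \le |x| \le 1/|\xi|} |x||\xi||x|^{-d}\,dx \lesssim |\xi| \cdot \frac{1}{|\xi|} = 1.
\]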

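Next, consider the second integral. Here we invoke the smoothness condition of Calderon-Zygmund convolution kernels. Write
\[
1 = \frac{1 - e^{i\pi}}{2} = \frac{1 - e^{2\pi i \frac{\xi \cdot \xi}{2|\xi|^2}}}{2}.
\]
Then, changing variables $x \mapsto x + \frac{\xi}{2|\xi|^2}$ in the second term,
\begin{align*}
\int_{|x| > 1/|\xi|} e^{-2\pi i x \cdot \xi} K_\varepsilon(x)\,dx
&= \int_{|x| > 1/|\xi|} e^{-2\pi i x \cdot \xi}\,\frac{1 - e^{2\pi i \frac{\xi \cdot \xi}{2|\xi|^2}}}{2}\,K_\varepsilon(x)\,dx \\
&= \frac{1}{2}\int_{|x| > 1/|\xi|} \left( e^{-2\pi i x \cdot \xi} - e^{-2\pi i \xi \cdot \left( x - \frac{\xi}{2|\xi|^2} \right)} \right) K_\varepsilon(x)\,dx \\
&= \frac{1}{2}\int_{|x| > 1/|\xi|} e^{-2\pi i x \cdot \xi} K_\varepsilon(x)\,dx - \frac{1}{2}\int_{\left| x + \frac{\xi}{2|\xi|^2} \right| > 1/|\xi|} e^{-2\pi i x \cdot \xi} K_\varepsilon\!\left( x + \frac{\xi}{2|\xi|^2} \right) dx.
\end{align*}
We want to combine these integrals, but have to account for the shifted balls which define the domains; it helps to draw a picture here. We have:
\begin{align*}
&= \frac{1}{2}\int_{|x| > 1/|\xi|} e^{-2\pi i x \cdot \xi}\left( K_\varepsilon(x) - K_\varepsilon\!\left( x + \frac{\xi}{2|\xi|^2} \right) \right) dx \\
&\quad + \frac{1}{2}\int_A e^{-2\pi i x \cdot \xi} K_\varepsilon\!\left( x + \frac{\xi}{2|\xi|^2} \right) dx - \frac{1}{2}\int_B e^{-2\pi i x \cdot \xi} K_\varepsilon\!\left( x + \frac{\xi}{2|\xi|^2} \right) dx
\end{align*}
where
\[
A = \left\{ x : |x| \ge \frac{1}{|\xi|} \ge \left| x + \frac{\xi}{2|\xi|^2} \right| \right\}, \qquad
B = \left\{ x : |x| \le \frac{1}{|\xi|} \le \left| x + \frac{\xi}{2|\xi|^2} \right| \right\}.
\]
Estimate the first of these three integrals by employing the smoothness condition of the Calderon-Zygmund convolution kernels, noting that $\frac{1}{|\xi|} = 2\left| \frac{\xi}{2|\xi|^2} \right|$:
\[
\left| \frac{1}{2}\int_{|x| > 1/|\xi|} e^{-2\pi i x \cdot \xi}\left( K_\varepsilon(x) - K_\varepsilon\!\left( x + \frac{\xi}{2|\xi|^2} \right) \right) dx \right| \lesssim \int_{|x| > 2\left| \frac{\xi}{2|\xi|^2} \right|} \left| K_\varepsilon(x) - K_\varepsilon\!\left( x + \frac{\xi}{2|\xi|^2} \right) \right| dx \lesssim 1.
\]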

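Next, consider the integral over $A$. By the reverse triangle inequality,
\[
|x| \ge \frac{1}{|\xi|} \ge \Bigl|x+\frac{\xi}{2|\xi|^2}\Bigr| \quad\Longrightarrow\quad \Bigl|x+\frac{\xi}{2|\xi|^2}\Bigr| \ge |x| - \frac{1}{2|\xi|} \ge \frac{1}{2|\xi|}.
\]
Thus,
\[
\left|\frac12\int_{A} e^{-2\pi i x\cdot\xi}K_\varepsilon\Bigl(x+\frac{\xi}{2|\xi|^2}\Bigr)\,dx\right|
\lesssim \int_{\frac{1}{2|\xi|}\le\bigl|x+\frac{\xi}{2|\xi|^2}\bigr|\le\frac{1}{|\xi|}} \Bigl|K_\varepsilon\Bigl(x+\frac{\xi}{2|\xi|^2}\Bigr)\Bigr|\,dx
\lesssim \int_{\frac{1}{2|\xi|}\le|x|\le\frac{1}{|\xi|}} \frac{1}{|x|^d}\,dx \lesssim 1,
\]
where we have made the obvious change of variables and invoked the first property of Calderón-Zygmund convolution kernels. Use a similar trick for bounding the integral over $B$, noting that
\[
|x| \le \frac{1}{|\xi|} \le \Bigl|x+\frac{\xi}{2|\xi|^2}\Bigr| \quad\Longrightarrow\quad \Bigl|x+\frac{\xi}{2|\xi|^2}\Bigr| \le |x| + \frac{1}{2|\xi|} \le \frac{3}{2|\xi|}.
\]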

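Combining all of this gives $\bigl|\int_{|x|>1/|\xi|} e^{-2\pi i x\cdot\xi}K_\varepsilon(x)\,dx\bigr| \lesssim 1$, and hence $|\widehat{K_\varepsilon}(\xi)| \lesssim 1$ uniformly in $\varepsilon$ and $\xi$. Thus $\bigl\|\widehat{K_\varepsilon}\bigr\|_{L^\infty} \lesssim 1$ uniformly in $\varepsilon$, and therefore $\|K_\varepsilon * f\|_{L^2} \lesssim \|f\|_{L^2}$.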

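Next, we wish to show that $K_\varepsilon * f$ converges in $L^2$ as $\varepsilon\to 0$. It suffices to show that $\{K_\varepsilon * f\}$ is Cauchy. So fix $0 < \varepsilon_2 < \varepsilon_1$ and fix $f \in \mathcal{S}(\mathbb{R}^d)$. Then
\begin{align*}
(K_{\varepsilon_1} * f - K_{\varepsilon_2} * f)(x)
&= \int_{\varepsilon_1\le|y|\le 1/\varepsilon_1} K(y)f(x-y)\,dy - \int_{\varepsilon_2\le|y|\le 1/\varepsilon_2} K(y)f(x-y)\,dy \\
&= -\int_{\varepsilon_2\le|y|\le\varepsilon_1} K(y)f(x-y)\,dy - \int_{1/\varepsilon_1\le|y|\le 1/\varepsilon_2} K(y)f(x-y)\,dy.
\end{align*}
Consider the second integral. By Young's convolution inequality,
\[
\left\|\int_{1/\varepsilon_1\le|y|\le 1/\varepsilon_2} K(y)f(x-y)\,dy\right\|_{L^2}
\lesssim \bigl\|K\cdot\chi_{1/\varepsilon_1\le|y|\le 1/\varepsilon_2}\bigr\|_{L^2}\|f\|_{L^1}
\lesssim_f \left(\int_{1/\varepsilon_1\le|y|\le 1/\varepsilon_2} \frac{1}{|y|^{2d}}\,dy\right)^{\frac12}
\lesssim \varepsilon_1^{\frac d2},
\]
which $\to 0$ as $\varepsilon_1\to 0$. For the first integral, the cancellation property of $K$ and the fundamental theorem of calculus give
\begin{align*}
\int_{\varepsilon_2\le|y|\le\varepsilon_1} K(y)f(x-y)\,dy
&= \int_{\varepsilon_2\le|y|\le\varepsilon_1} K(y)\bigl(f(x-y)-f(x)\bigr)\,dy \\
&= -\int_{\varepsilon_2\le|y|\le\varepsilon_1} K(y)\,y\cdot\int_0^1 \nabla f(x-\theta y)\,d\theta\,dy.
\end{align*}
Thus,
\begin{align*}
\left\|\int_{\varepsilon_2\le|y|\le\varepsilon_1} K(y)f(x-y)\,dy\right\|_{L^2_x}
&\lesssim \int_0^1\int_{\varepsilon_2\le|y|\le\varepsilon_1} \frac{1}{|y|^d}\,|y|\,\|\nabla f(x-\theta y)\|_{L^2_x}\,dy\,d\theta \\
&\lesssim_f \int_{\varepsilon_2\le|y|\le\varepsilon_1} \frac{1}{|y|^{d-1}}\,dy \lesssim \varepsilon_1 \to 0
\end{align*}
as $\varepsilon_1\to 0$.

This shows that for a Schwartz function $f$, $\{K_\varepsilon * f\}$ is Cauchy. By density, $\{K_\varepsilon * f\}$ is Cauchy for any $f\in L^2$. To see this argument explicitly, fix $\delta>0$ and let $g\in\mathcal{S}(\mathbb{R}^d)$ be such that $\|f-g\|_{L^2}<\delta$. Then by the usual triangle inequality argument,
\[
\|K_{\varepsilon_1} * f - K_{\varepsilon_2} * f\|_{L^2} \le \|K_{\varepsilon_1} * g - K_{\varepsilon_2} * g\|_{L^2} + \|K_{\varepsilon_1} * (f-g)\|_{L^2} + \|K_{\varepsilon_2} * (f-g)\|_{L^2}.
\]
The first quantity $\to 0$ as $\varepsilon_1,\varepsilon_2\to 0$ since $g$ is Schwartz. We have already seen that the truncated kernels $K_\varepsilon$ are of type $(2,2)$ uniformly in $\varepsilon$, thus $\|K_{\varepsilon_1} * (f-g)\|_{L^2} \lesssim \|f-g\|_{L^2} \le \delta$, and likewise for $K_{\varepsilon_2}$. Therefore, taking $\varepsilon_1,\varepsilon_2\to 0$ and then taking $\delta\to 0$ gives the desired result.

Therefore, we can define $K * f := L^2\text{-}\lim_{\varepsilon\to 0} K_\varepsilon * f$. This operator extends to $L^2$ by uniform boundedness of the $K_\varepsilon$ operators.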

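The theorem we just proved says that Calderón-Zygmund convolution kernels are operators of type $(2,2)$. We will use this along with interpolation to prove a more general result:

Theorem 9.2. Suppose that $K$ is a Calderón-Zygmund convolution kernel. For $\varepsilon>0$, let $K_\varepsilon := K\cdot\chi_{\varepsilon\le|x|\le 1/\varepsilon}$. Then

1. $K_\varepsilon$ is of weak-type $(1,1)$, i.e.,
\[
\bigl|\{x\in\mathbb{R}^d : |K_\varepsilon * f| > \lambda\}\bigr| \lesssim \frac{1}{\lambda}\|f\|_{L^1(\mathbb{R}^d)}
\]
uniformly in $\lambda>0$ and $\varepsilon>0$.

2. $K_\varepsilon$ is of (strong) type $(p,p)$ for $1<p<\infty$, uniformly in $\varepsilon>0$.

Moreover, for $1<p<\infty$, $K * f := L^p\text{-}\lim_{\varepsilon\to 0} K_\varepsilon * f$ extends from $\mathcal{S}(\mathbb{R}^d)$ to a bounded map on $L^p(\mathbb{R}^d)$.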

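Proof. Assume first that we have proven 1, that $K_\varepsilon$ is of weak-type $(1,1)$. Since $K_\varepsilon$ is of type $(2,2)$ by the previous theorem, it follows from the Marcinkiewicz interpolation theorem that $K_\varepsilon$ is of type $(p,p)$ for $1<p<2$. We then use a duality argument to achieve the estimate for $2<p<\infty$. Explicitly, for $p$ in this range,
\begin{align*}
\|K_\varepsilon * f\|_{L^p}
&= \sup_{\|g\|_{L^{p'}}=1} \langle K_\varepsilon * f, g\rangle
= \sup_{\|g\|_{L^{p'}}=1} \int\!\!\int K_\varepsilon(x-y)f(y)\,g(x)\,dy\,dx \\
&= \sup_{\|g\|_{L^{p'}}=1} \int\!\!\int (K_\varepsilon)^R(y-x)g(x)\,dx\,f(y)\,dy
= \sup_{\|g\|_{L^{p'}}=1} \bigl\langle f, (K_\varepsilon)^R * g\bigr\rangle,
\end{align*}
where $(K_\varepsilon)^R(x) = K_\varepsilon(-x)$ is the reflection of $K_\varepsilon$. By Hölder's inequality,
\[
\|K_\varepsilon * f\|_{L^p} \le \sup_{\|g\|_{L^{p'}}=1} \|f\|_{L^p}\bigl\|(K_\varepsilon)^R * g\bigr\|_{L^{p'}}.
\]
Note that $(K_\varepsilon)^R$ is also a Calderón-Zygmund convolution kernel. Since $1<p'<2$, we then have $\bigl\|(K_\varepsilon)^R * g\bigr\|_{L^{p'}} \lesssim \|g\|_{L^{p'}} = 1$, so that $\|K_\varepsilon * f\|_{L^p} \lesssim \|f\|_{L^p}$ as desired.

For the "moreover" statement, we wish to show that $\|K_{\varepsilon_1} * f - K_{\varepsilon_2} * f\|_{L^p}\to 0$ as $\varepsilon_1,\varepsilon_2\to 0$, as in the previous theorem. Though the same proof technique as before will work, we present an alternative argument here. For $1<p<2$, choose $1<q<p$ and compute, using the log-convexity of $L^p$-norms:
\[
\|K_{\varepsilon_1} * f - K_{\varepsilon_2} * f\|_{L^p} \lesssim \|K_{\varepsilon_1} * f - K_{\varepsilon_2} * f\|_{L^q}^{\theta}\,\|K_{\varepsilon_1} * f - K_{\varepsilon_2} * f\|_{L^2}^{1-\theta},
\]
where $\frac1p = \frac{\theta}{q} + \frac{1-\theta}{2}$. We have $\|K_{\varepsilon_1} * f - K_{\varepsilon_2} * f\|_{L^q}^{\theta} \lesssim \|f\|_{L^q}^{\theta}$ and $\|K_{\varepsilon_1} * f - K_{\varepsilon_2} * f\|_{L^2}^{1-\theta}\to 0$ from previous results. For $2<p<\infty$, choose $p<q<\infty$ and argue similarly.

Thus, to prove the entire theorem it only remains to show 1. Let $f\in L^1(\mathbb{R}^d)$. We perform a modified version of the Calderón-Zygmund decomposition introduced earlier. For a fixed $\lambda>0$, write $f = g+b$ where:

- $|g| \lesssim \lambda$ a.e.;

- $b$ is supported on a union of cubes $\{Q_k\}$ whose interiors are disjoint and $\bigl|\bigcup Q_k\bigr| \lesssim \frac1\lambda\int|f(y)|\,dy$;

- $b|_{Q_k} = f|_{Q_k} - \frac{1}{|Q_k|}\int_{Q_k} f(y)\,dy$, so that consequently $\int_{Q_k} b(y)\,dy = 0$;

- $\frac{1}{|Q_k|}\int_{Q_k}|b(y)|\,dy \lesssim \lambda$.

Then
\[
|\{x : |K_\varepsilon * f| > \lambda\}| \le \Bigl|\Bigl\{x : |K_\varepsilon * g| > \frac\lambda2\Bigr\}\Bigr| + \Bigl|\Bigl\{x : |K_\varepsilon * b| > \frac\lambda2\Bigr\}\Bigr|.
\]
We estimate the first quantity on the right using the $(2,2)$ estimate for $K_\varepsilon$. Using Chebychev's inequality and the uniform bound on $g$, we have:
\[
\Bigl|\Bigl\{x : |K_\varepsilon * g| > \frac\lambda2\Bigr\}\Bigr| \lesssim \lambda^{-2}\|K_\varepsilon * g\|_{L^2}^2 \lesssim \lambda^{-2}\|g\|_{L^2}^2 = \lambda^{-2}\bigl\||g|^2\bigr\|_{L^1} \lesssim \lambda^{-1}\|g\|_{L^1} \le \frac1\lambda\|f\|_{L^1}.
\]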

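To estimate the second quantity, we first remove a set of size comparable to $\frac1\lambda\|f\|_{L^1}$. Let $Q_k^*$ be the cube with the same center, call it $x_k$, as $Q_k$ but with side length $2\sqrt d\,\ell(Q_k)$. (To draw a picture, draw $Q_k$ and circumscribe a circle; draw a circle with twice the radius, then circumscribe a cube around this circle.) Then
\[
\Bigl|\bigcup Q_k^*\Bigr| \le \sum|Q_k^*| = (2\sqrt d)^d\sum|Q_k| \lesssim_d \frac1\lambda\|f\|_{L^1}.
\]
Thus, it suffices to estimate the quantity $\bigl|\{x\notin\bigcup Q_k^* : |K_\varepsilon * b| > \frac\lambda2\}\bigr|$. By Chebychev, we have
\[
\Bigl|\Bigl\{x\notin\bigcup Q_k^* : |K_\varepsilon * b| > \frac\lambda2\Bigr\}\Bigr| \lesssim \frac1\lambda\|K_\varepsilon * b\|_{L^1((\bigcup Q_k^*)^C)}.
\]
Writing out the definition of the convolution above gives
\begin{align*}
(K_\varepsilon * b)(x) = \int K_\varepsilon(x-y)b(y)\,dy
&= \sum_k\int_{Q_k} K_\varepsilon(x-y)b(y)\,dy \\
&= \sum_k\int_{Q_k} \bigl(K_\varepsilon(x-y)-K_\varepsilon(x-x_k)\bigr)b(y)\,dy,
\end{align*}
where the last equality follows from the fact that $b$ has mean $0$ on each $Q_k$. So
\begin{align*}
\|K_\varepsilon * b\|_{L^1((\bigcup Q_k^*)^C)}
&= \int_{(\bigcup Q_k^*)^C} |K_\varepsilon * b|(x)\,dx \\
&\le \sum_k\int_{(Q_k^*)^C}\int_{Q_k} |K_\varepsilon(x-y)-K_\varepsilon(x-x_k)|\,|b(y)|\,dy\,dx \\
&= \sum_k\int_{Q_k}|b(y)|\int_{(Q_k^*)^C} |K_\varepsilon(x-y)-K_\varepsilon(x-x_k)|\,dx\,dy.
\end{align*}
Making the change of variables $x-x_k\mapsto x$ yields
\[
\|K_\varepsilon * b\|_{L^1((\bigcup Q_k^*)^C)} \le \sum_k\int_{Q_k}|b(y)|\int_{(Q_k^*)^C-x_k} |K_\varepsilon(x+x_k-y)-K_\varepsilon(x)|\,dx\,dy,
\]
where $(Q_k^*)^C-x_k$ denotes the translation of $(Q_k^*)^C$ by the point $x_k$. Note that if $x\in(Q_k^*)^C-x_k$, then some coordinate of $x$ exceeds $\sqrt d\,\ell(Q_k)$ in absolute value, so $|x| > \sqrt d\,\ell(Q_k)$. Since $\sqrt d\,\ell(Q_k)$ is the diameter of $Q_k$, and $x_k$ is its center, $|y-x_k| \le \frac12\sqrt d\,\ell(Q_k)$ for $y\in Q_k$; this implies that $|x| \ge 2|y-x_k|$. By the smoothness property of Calderón-Zygmund convolution kernels,
\[
\int_{(Q_k^*)^C-x_k} |K_\varepsilon(x+x_k-y)-K_\varepsilon(x)|\,dx \lesssim 1.
\]
Thus,
\[
\|K_\varepsilon * b\|_{L^1((\bigcup Q_k^*)^C)} \lesssim \sum_k\int_{Q_k}|b(y)|\,dy \lesssim \int|f(y)|\,dy
\]
and we are done.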

In the proof of this theorem, having already shown $(2,2)$ boundedness of $K$, the only property of Calderón-Zygmund convolution kernels that we explicitly used was the smoothness condition. Interpolation was enough to get boundedness for other $p$ values. This gives us a general fact:

Proposition 9.1. If $K$ is any convolution kernel (not necessarily Calderón-Zygmund) which is of type $(2,2)$ and satisfies $\int_{|x|\ge 2|y|} |K(x+y)-K(x)|\,dx \lesssim 1$ uniformly in $y$, then $K$ extends to a bounded map on $L^p$ for $1<p<\infty$.

As an application of the previous theorem, we consider the Hilbert transform, the convolution operator corresponding to the kernel $K:\mathbb{R}\setminus\{0\}\to\mathbb{R}$ given by $K(x) = \frac{1}{\pi x}$. Explicitly, the Hilbert transform of a function $f:\mathbb{R}\to\mathbb{R}$ is:
\[
(Hf)(x) := \lim_{\varepsilon\to 0}\int_{\varepsilon\le|y|\le\frac1\varepsilon} \frac{1}{\pi y}f(x-y)\,dy.
\]
We claim that $K$ is a Calderón-Zygmund convolution kernel. The estimate $|K(x)| \lesssim \frac{1}{|x|}$ is immediate from the definition, and the fact that $K$ satisfies the cancellation property over annuli in $\mathbb{R}$ follows from the fact that $\frac1x$ is an odd function. Checking the smoothness condition, we have
\[
\int_{|x|\ge 2|y|} |K(x+y)-K(x)|\,dx = \frac1\pi\int_{|x|\ge 2|y|} \Bigl|\frac{1}{x+y}-\frac1x\Bigr|\,dx = \frac1\pi\int_{|x|\ge 2|y|} \frac{|y|}{|x+y||x|}\,dx.
\]
Since $|x|\ge 2|y|$, $|x+y| \ge |x|-|y| \gtrsim |x|$, so that
\[
\int_{|x|\ge 2|y|} |K(x+y)-K(x)|\,dx \lesssim \int_{|x|\ge|y|} \frac{|y|}{|x|^2}\,dx \lesssim 1.
\]
Thus, $K$ is a Calderón-Zygmund convolution kernel, and consequently the Hilbert transform is a bounded operator on $L^p$ for $1<p<\infty$.

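It is also worth noting that the strong-type estimates do indeed fail for $p=1$ and $p=\infty$. For example, let $f = \chi_{[a,b]}$. Then $f\in L^1(\mathbb{R})$ and $f\in L^\infty(\mathbb{R})$, and
\[
(Hf)(x) = \frac1\pi\lim_{\varepsilon\to 0}\int_{\varepsilon\le|y|\le\frac1\varepsilon} \frac1y\,\chi_{[a,b]}(x-y)\,dy.
\]
Consider the case where $x-a,\,x-b>0$. The integrand is supported where $a\le x-y\le b$, i.e., $x-b\le y\le x-a$, so for $\varepsilon$ sufficiently small,
\[
(Hf)(x) = \frac1\pi\int_{x-b}^{x-a} \frac1y\,dy = \frac1\pi\log\Bigl|\frac{x-a}{x-b}\Bigr|
\]
for almost all $x$, which is neither in $L^1(\mathbb{R})$ nor $L^\infty(\mathbb{R})$. The cases $x-a,\,x-b<0$ and $x-b<0<x-a$ are handled similarly and give the same formula.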
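The failure of both endpoint estimates can be read off the explicit formula. The following is a short worked check, not part of the lecture, using only the elementary inequality $\log(1+t)\ge t/2$ for $0\le t\le 1$.

```latex
% Why (1/pi) log|(x-a)/(x-b)| is in neither L^infty nor L^1.
% Unboundedness: the logarithm blows up at the endpoint b (and at a).
\[
(Hf)(x) = \frac{1}{\pi}\log\frac{x-a}{x-b} \longrightarrow +\infty
\qquad \text{as } x \downarrow b .
\]
% Non-integrability: for x >= 2b - a we have t := (b-a)/(x-b) in (0,1], so
\[
(Hf)(x) = \frac{1}{\pi}\log\Bigl(1+\frac{b-a}{x-b}\Bigr)
\;\ge\; \frac{b-a}{2\pi\,(x-b)},
\qquad
\int_{2b-a}^{\infty}\frac{dx}{x-b} = \infty .
\]
```

In other words, $Hf$ decays only like $1/|x|$ at infinity, which is the typical obstruction to $L^1$ boundedness.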

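10 2/13 — 2/22: Mikhlin Multiplier Theorem, Littlewood-Paley Projections

Let $\varphi\in C^\infty(\mathbb{R}^d)$ be a smooth bump function satisfying
\[
\varphi(x) = \begin{cases} 1 & |x|\le 1 \\ 0 & |x|\ge\frac{11}{10}. \end{cases}
\]
Let $\psi(x) = \varphi(x)-\varphi(2x)$. Then
\[
\psi(x) = \begin{cases} 0 & |x|\le\frac12 \\ 1 & \frac{11}{20}\le|x|\le 1 \\ 0 & |x|\ge\frac{11}{10}. \end{cases}
\]
For $N\in 2^{\mathbb{Z}}$, define $\psi_N(x) = \psi\bigl(\frac xN\bigr)$. These doughnut-like bump functions, which are $\sim 1$ on the scale $N$, will be used in the proof of the following theorem.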

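Theorem 10.1 (Mikhlin multiplier theorem). Let $m:\mathbb{R}^d\setminus\{0\}\to\mathbb{C}$ satisfy
\[
\bigl|D^\alpha_\xi m(\xi)\bigr| \lesssim \frac{1}{|\xi|^{|\alpha|}} \tag{5}
\]
uniformly in $\xi$ for all $0\le|\alpha|\le\bigl\lceil\frac{d+1}{2}\bigr\rceil$. Then $f\mapsto(m\cdot\hat f)^\vee = \check m * f$ is bounded on $L^p$ for $1<p<\infty$.

Proof. For $p=2$, this follows from Plancherel. Indeed,
\[
\|\check m * f\|_{L^2_x} = \bigl\|m\cdot\hat f\bigr\|_{L^2_\xi} \le \|m\|_{L^\infty_\xi}\bigl\|\hat f\bigr\|_{L^2_\xi}.
\]
By choosing $\alpha=0$ in (5), we see that $m$ is uniformly bounded, so $\|m\|_{L^\infty_\xi}<\infty$. Thus, $\|\check m * f\|_{L^2_x} \lesssim \bigl\|\hat f\bigr\|_{L^2_\xi} = \|f\|_{L^2_x}$.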

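With this $(2,2)$ estimate, by Proposition 9.1, to prove boundedness on $L^p$ for $1<p<\infty$ it suffices to prove that $K = \check m$ satisfies the smoothness condition of Calderón-Zygmund convolution kernels. Before doing this with the hypothesis of the theorem, we prove a slightly easier case. In particular, suppose that (5) holds for all $|\alpha|\le d+2$ rather than all $|\alpha|\le\bigl\lceil\frac{d+1}{2}\bigr\rceil$. In this case, we will show that $|\nabla K(x)| \lesssim \frac{1}{|x|^{d+1}}$. By a previous remark (see the definition of Calderón-Zygmund convolution kernels), this implies that $K$ satisfies the smoothness condition, as desired. Towards this goal, note that
\[
\sum_{N\in 2^{\mathbb{Z}}}\psi_N(x) = 1
\]
for almost all $x\in\mathbb{R}^d$. Write
\[
m(\xi) = \sum_{N\in 2^{\mathbb{Z}}} m(\xi)\psi_N(\xi) =: \sum_{N\in 2^{\mathbb{Z}}} m_N(\xi).
\]
Then $|\nabla K(x)| \le \sum_{N\in 2^{\mathbb{Z}}}|\nabla\check m_N(x)|$. By the properties of the Fourier transform,
\[
\|x^\alpha\nabla\check m_N(x)\|_{L^\infty_x} \lesssim \bigl\|D^\alpha_\xi\bigl(\xi m_N(\xi)\bigr)\bigr\|_{L^1_\xi}.
\]
By the product rule,
\[
D^\alpha_\xi\bigl(\xi m_N(\xi)\bigr) = D^\alpha_\xi\bigl(\xi m(\xi)\psi_N(\xi)\bigr) = \sum_{\alpha_1+\alpha_2=\alpha} C_{\alpha_1,\alpha_2}\,D^{\alpha_1}_\xi\bigl(\xi m(\xi)\bigr)\,D^{\alpha_2}_\xi\bigl(\psi_N(\xi)\bigr).
\]
By assumption, $|D^{\alpha_1}_\xi m(\xi)| \lesssim |\xi|^{-|\alpha_1|}$. Therefore, $|D^{\alpha_1}_\xi(\xi m(\xi))| \lesssim |\xi|^{1-|\alpha_1|}$, again by the product rule. Next, by the chain rule,
\[
|D^{\alpha_2}_\xi\psi_N(\xi)| = \Bigl|D^{\alpha_2}_\xi\Bigl(\psi\Bigl(\frac\xi N\Bigr)\Bigr)\Bigr| = N^{-|\alpha_2|}\Bigl|(D^{\alpha_2}_\xi\psi)\Bigl(\frac\xi N\Bigr)\Bigr|.
\]
Thus,
\[
\bigl|D^\alpha_\xi\bigl(\xi m_N(\xi)\bigr)\bigr| \lesssim \sum_{\alpha_1+\alpha_2=\alpha} |\xi|^{1-|\alpha_1|}N^{-|\alpha_2|}\Bigl|(D^{\alpha_2}_\xi\psi)\Bigl(\frac\xi N\Bigr)\Bigr|.
\]
Since $\psi_N$ and all of its derivatives are supported on an annulus of radius comparable to $N$,
\begin{align*}
\|x^\alpha\nabla\check m_N(x)\|_{L^\infty_x} \lesssim \bigl\|D^\alpha_\xi\bigl(\xi m_N(\xi)\bigr)\bigr\|_{L^1_\xi}
&\lesssim \sum_{\alpha_1+\alpha_2=\alpha}\int_{|\xi|\sim N} |\xi|^{1-|\alpha_1|}N^{-|\alpha_2|}\,d\xi \\
&\lesssim \sum_{\alpha_1+\alpha_2=\alpha} N^{1+d-|\alpha_1|}N^{-|\alpha_2|} \lesssim N^{1+d-|\alpha|}.
\end{align*}
By choosing $\alpha=0$ and $|\alpha|=d+2$, it then follows that
\[
|\nabla\check m_N(x)| \lesssim \min\Bigl\{N^{d+1},\ \frac{1}{N|x|^{d+2}}\Bigr\},
\]
so that
\[
|\nabla K(x)| \le \sum_{N\in 2^{\mathbb{Z}}}|\nabla\check m_N(x)| \lesssim \sum_{N\in 2^{\mathbb{Z}}}\min\Bigl\{N^{d+1},\ \frac{1}{N|x|^{d+2}}\Bigr\}.
\]
Splitting this sum over small and large frequencies, chosen appropriately,
\[
|\nabla K(x)| \lesssim \sum_{N\le\frac{1}{|x|}} N^{d+1} + \sum_{N>\frac{1}{|x|}} \frac{1}{N|x|^{d+2}}.
\]
Since both of these sums are over dyadic numbers, each is bounded (up to a constant) by its largest term: the first by the term with the largest $N$, and the second by the term with the smallest $N$. Thus,
\[
|\nabla K(x)| \lesssim \frac{1}{|x|^{d+1}} + \frac{1}{\frac{1}{|x|}\,|x|^{d+2}} \lesssim \frac{1}{|x|^{d+1}}.
\]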
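For readers who want the dyadic sums spelled out, here is a short worked version of the last step. The names $N_0$ and $N_1$ are introduced only for this illustration: $N_0$ is the largest dyadic number with $N_0\le 1/|x|$, and $N_1 = 2N_0$ is the smallest dyadic number with $N_1>1/|x|$.

```latex
% Both sums are geometric; N_0, N_1 are illustrative names (see lead-in).
\begin{align*}
\sum_{N \le 1/|x|} N^{d+1}
  &= \sum_{j \ge 0} \bigl(2^{-j} N_0\bigr)^{d+1}
   = \frac{N_0^{\,d+1}}{1 - 2^{-(d+1)}}
   \le \frac{2}{|x|^{d+1}}, \\
\sum_{N > 1/|x|} \frac{1}{N |x|^{d+2}}
  &= \frac{1}{N_1 |x|^{d+2}} \sum_{j \ge 0} 2^{-j}
   = \frac{2}{N_1 |x|^{d+2}}
   \le \frac{2}{|x|^{d+1}} .
\end{align*}
```

The first bound uses $N_0\le 1/|x|$ and $d\ge 1$; the second uses $1/N_1<|x|$.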

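By our initial remark, we are done.

Next, we return to our original assumption that $|\alpha|\le\bigl\lceil\frac{d+1}{2}\bigr\rceil$. The proof is similar, except that we perform the computation in $L^2(\mathbb{R}^d)$ instead of $L^\infty(\mathbb{R}^d)$. Using Plancherel and then proceeding like before, we have:
\begin{align*}
\|(-2\pi i x)^\alpha\check m_N(x)\|_{L^2_x} = \bigl\|D^\alpha_\xi m_N(\xi)\bigr\|_{L^2_\xi}
&\lesssim \sum_{\alpha_1+\alpha_2=\alpha}\Bigl\|D^{\alpha_1}_\xi m(\xi)\cdot D^{\alpha_2}_\xi\Bigl(\psi\Bigl(\frac\xi N\Bigr)\Bigr)\Bigr\|_{L^2_\xi} \\
&\lesssim \sum_{\alpha_1+\alpha_2=\alpha}\Bigl\|\frac{1}{|\xi|^{|\alpha_1|}}\cdot N^{-|\alpha_2|}\Bigr\|_{L^2_\xi(|\xi|\sim N)} \\
&= \sum_{\alpha_1+\alpha_2=\alpha} N^{-|\alpha_2|}\Bigl(\int_{|\xi|\sim N}|\xi|^{-2|\alpha_1|}\,d\xi\Bigr)^{\frac12} \\
&\lesssim \sum_{\alpha_1+\alpha_2=\alpha} N^{-|\alpha_2|}\bigl(N^{-2|\alpha_1|+d}\bigr)^{\frac12} \lesssim N^{-|\alpha|+\frac d2}.
\end{align*}
In particular, by choosing $\alpha=0$ we have $\|\check m_N(x)\|_{L^2_x} \lesssim N^{\frac d2}$. Thus, for a fixed $A>0$, Hölder's inequality gives
\[
\int_{|x|\le A}|\check m_N(x)|\,dx \lesssim \|\check m_N(x)\|_{L^2}\bigl\|\chi_{|x|\le A}\bigr\|_{L^2} \lesssim N^{\frac d2}A^{\frac d2}
\]
and
\[
\int_{|x|>A}|\check m_N(x)|\,dx \lesssim \bigl\||x|^{|\alpha|}\check m_N(x)\bigr\|_{L^2}\bigl\||x|^{-|\alpha|}\chi_{|x|>A}\bigr\|_{L^2}.
\]
By the above computation, $\bigl\||x|^{|\alpha|}\check m_N(x)\bigr\|_{L^2} \lesssim N^{-|\alpha|+\frac d2}$. Computing the other norm,
\[
\bigl\||x|^{-|\alpha|}\chi_{|x|>A}\bigr\|_{L^2} = \Bigl(\int_{|x|>A}|x|^{-2|\alpha|}\,dx\Bigr)^{\frac12} \lesssim A^{-|\alpha|+\frac d2}
\]
provided $|\alpha|>\frac d2$; this holds for $|\alpha| = \bigl\lceil\frac{d+1}{2}\bigr\rceil$, the largest order allowed by the hypothesis. Combining these two estimates with this choice of $|\alpha|$ gives
\[
\int_{|x|>A}|\check m_N(x)|\,dx \lesssim (NA)^{-\lceil\frac{d+1}{2}\rceil+\frac d2}.
\]
In particular, choosing $A = 1/N$, we have
\[
\int|\check m_N(x)|\,dx \lesssim 1
\]
uniformly in $N\in 2^{\mathbb{Z}}$. Essentially the same computation gives
\[
\int|\nabla\check m_N(x)|\,dx \lesssim N,
\]
where the implicit constant is independent of $N$. The only difference in the calculation is that we begin by estimating $\bigl\|(-2\pi i x)^\alpha\frac{\partial}{\partial x}\check m_N(x)\bigr\|_{L^2_x} \sim \bigl\|D^\alpha_\xi\bigl(\xi m_N(\xi)\bigr)\bigr\|_{L^2_\xi}$ and consequently pick up an extra power of $N$ throughout. Thus,
\begin{align*}
\int_{|x|\ge 2|y|}|K(x+y)-K(x)|\,dx
&\le \sum_{N\in 2^{\mathbb{Z}}}\int_{|x|\ge 2|y|}|\check m_N(x+y)-\check m_N(x)|\,dx \\
&= \sum_{N\le\frac{1}{|y|}}\int_{|x|\ge 2|y|}|\check m_N(x+y)-\check m_N(x)|\,dx
 + \sum_{N>\frac{1}{|y|}}\int_{|x|\ge 2|y|}|\check m_N(x+y)-\check m_N(x)|\,dx.
\end{align*}
In the first sum, over low frequencies, we can use the fundamental theorem of calculus to get
\[
\sum_{N\le\frac{1}{|y|}}\int_{|x|\ge 2|y|}|\check m_N(x+y)-\check m_N(x)|\,dx \le \sum_{N\le\frac{1}{|y|}}\int_{|x|\ge 2|y|}|y|\int_0^1|\nabla\check m_N(x+\theta y)|\,d\theta\,dx.
\]
Using Fubini's theorem and integrating over $\mathbb{R}^d$ instead of $|x|\ge 2|y|$ gives
\[
\sum_{N\le\frac{1}{|y|}}\int_{|x|\ge 2|y|}|\check m_N(x+y)-\check m_N(x)|\,dx \le \sum_{N\le\frac{1}{|y|}}|y|\,\|\nabla\check m_N\|_{L^1} \lesssim \sum_{N\le\frac{1}{|y|}}|y|N \lesssim 1,
\]
where the last inequality follows from the usual dyadic sum argument. For the second sum, we have the crude estimate
\begin{align*}
\sum_{N>\frac{1}{|y|}}\int_{|x|\ge 2|y|}|\check m_N(x+y)-\check m_N(x)|\,dx
&\le \sum_{N>\frac{1}{|y|}}\int_{|x|\ge 2|y|}\bigl(|\check m_N(x+y)|+|\check m_N(x)|\bigr)\,dx \\
&\lesssim \sum_{N>\frac{1}{|y|}}\int_{|x|\ge|y|}|\check m_N(x)|\,dx
\lesssim \sum_{N>\frac{1}{|y|}}(N|y|)^{-\lceil\frac{d+1}{2}\rceil+\frac d2}.
\end{align*}
Here, we have used our previous estimate with $A=|y|$. Since $-\bigl\lceil\frac{d+1}{2}\bigr\rceil+\frac d2<0$ and the sum ranges over dyadic numbers greater than $\frac{1}{|y|}$, we have
\[
\sum_{N>\frac{1}{|y|}}(N|y|)^{-\lceil\frac{d+1}{2}\rceil+\frac d2} \lesssim \Bigl(\frac{1}{|y|}\,|y|\Bigr)^{-\lceil\frac{d+1}{2}\rceil+\frac d2} = 1.
\]
Therefore,
\[
\int_{|x|\ge 2|y|}|K(x+y)-K(x)|\,dx \lesssim 1+1 \lesssim 1
\]
and we are done.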
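To see hypothesis (5) in action, the following worked check (an illustration added here, not taken from the lecture) shows that any multiplier which is smooth away from the origin and homogeneous of degree zero satisfies (5) for every $\alpha$. A standard instance is $m(\xi) = \xi_j/|\xi|$, which up to a constant is the multiplier of the $j$-th Riesz transform.

```latex
% Assumption for this sketch: m is C^infty on R^d \ {0} and m(t xi) = m(xi)
% for all t > 0. Differentiate the identity m(t xi) = m(xi) in xi:
\[
t^{|\alpha|}\,(D^{\alpha} m)(t\xi) = (D^{\alpha} m)(\xi)
\qquad \text{for all } t > 0,\ \xi \neq 0 .
\]
% Choosing t = 1/|xi| moves the evaluation point onto the unit sphere:
\[
|(D^{\alpha} m)(\xi)|
  = |\xi|^{-|\alpha|}\,\Bigl|(D^{\alpha} m)\Bigl(\frac{\xi}{|\xi|}\Bigr)\Bigr|
  \le |\xi|^{-|\alpha|} \sup_{|\omega| = 1} |(D^{\alpha} m)(\omega)|
  \lesssim_{\alpha} |\xi|^{-|\alpha|} .
\]
```

The supremum is finite because $D^\alpha m$ is continuous on the compact unit sphere, so Theorem 10.1 applies to such $m$.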

Next, we define the Littlewood-Paley projections. For convenience, we recall the definitions of the functions $\varphi$, $\psi$, and $\psi_N$ from above:
\[
\varphi(x) = \begin{cases} 1 & |x|\le 1 \\ 0 & |x|\ge\frac{11}{10} \end{cases}
\]
and $\psi(x) = \varphi(x)-\varphi(2x)$. Then
\[
\psi(x) = \begin{cases} 0 & |x|\le\frac12 \\ 1 & \frac{11}{20}\le|x|\le 1 \\ 0 & |x|\ge\frac{11}{10}. \end{cases}
\]
For $N\in 2^{\mathbb{Z}}$, define $\psi_N(x) = \psi\bigl(\frac xN\bigr)$.

Definition 10.1. The Littlewood-Paley projection onto frequencies $|\xi|\sim N$, denoted $P_N$, is defined in the Fourier domain for Schwartz functions $f$ via
\[
\widehat{P_N f}(\xi) = \psi_N(\xi)\hat f(\xi).
\]
Equivalently, for more general $f$,
\[
f_N := P_N f = f * N^d\check\psi(N\cdot).
\]
Note that $P_N$ is not a true projection operator, since $P_N^2\ne P_N$.

Definition 10.2. The Littlewood-Paley projection onto low frequencies, denoted $P_{\le N}$, is defined in the Fourier domain for Schwartz functions $f$ via
\[
\widehat{P_{\le N} f}(\xi) = \varphi\Bigl(\frac\xi N\Bigr)\hat f(\xi).
\]
Equivalently, for more general $f$,
\[
f_{\le N} := P_{\le N} f = f * N^d\check\varphi(N\cdot).
\]
The Littlewood-Paley projection onto high frequencies is $P_{>N} := I-P_{\le N}$, and the Littlewood-Paley projection onto medium frequencies is $P_{M\le\cdot\le N} := \sum_{M\le K\le N} P_K$.

The following basic properties of Littlewood-Paley projections will be used frequently.

Proposition 10.1. Fix $N\in 2^{\mathbb{Z}}$.

1. The operators $P_N$ and $P_{\le N}$ are bounded on $L^p$ for $1\le p\le\infty$, i.e., $\|f_N\|_{L^p}+\|f_{\le N}\|_{L^p} \lesssim \|f\|_{L^p}$.

2. We have the pointwise estimate $|f_N(x)|+|f_{\le N}(x)| \lesssim (Mf)(x)$.

3. For $f\in L^p$ with $1<p<\infty$, $\sum_{N\in 2^{\mathbb{Z}}} P_N f \xrightarrow{L^p} f$.

4. (Bernstein inequality I) $\|f_N\|_{L^q} \lesssim N^{\frac dp-\frac dq}\|f_N\|_{L^p}$ for $1\le p\le q\le\infty$.

5. (Bernstein inequality II) $\||\nabla|^s f_N\|_{L^p} \sim N^s\|f_N\|_{L^p}$ for $1\le p\le\infty$ and $s\in\mathbb{R}$.

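Proof. 1. We compute:
\[
\|f_N\|_{L^p} = \bigl\|f * N^d\check\psi(N\cdot)\bigr\|_{L^p} \lesssim \|f\|_{L^p}\bigl\|N^d\check\psi(N\cdot)\bigr\|_{L^1}.
\]
By construction, $N^d\check\psi(N\cdot)$ is $L^1$-normalized, so that $\bigl\|N^d\check\psi(N\cdot)\bigr\|_{L^1} = \bigl\|\check\psi\bigr\|_{L^1}$. Thus,
\[
\|f_N\|_{L^p} \lesssim \|f\|_{L^p}\bigl\|\check\psi\bigr\|_{L^1}.
\]
The same computation works for $\|f_{\le N}\|_{L^p}$.

2. We show the estimate for $|f_N(x)|$, and as before, the same proof works for $|f_{\le N}(x)|$. We have
\[
f_N(x) = \bigl(f * N^d\check\psi(N\cdot)\bigr)(x) = N^d\int f(x-y)\check\psi(Ny)\,dy.
\]
Because $\psi$ is Schwartz, $\check\psi$ is Schwartz, hence is bounded and decays to any order. Thus,
\[
|f_N(x)| \lesssim N^d\int\frac{|f(x-y)|}{\langle N|y|\rangle^{100d}}\,dy,
\]
where we use the Japanese angle bracket notation $\langle x\rangle := (1+|x|^2)^{\frac12}$. When $|y|$ is small, we exploit the boundedness of the angle bracket, and when $|y|$ is large we exploit its decay. Explicitly,
\[
|f_N(x)| \lesssim N^d\int_{|y|\le\frac1N}|f(x-y)|\,dy + N^d\int_{|y|>\frac1N}\frac{|f(x-y)|}{(N|y|)^{100d}}\,dy.
\]
Because $|B(0,1/N)|\sim N^{-d}$, so that $1/|B(0,1/N)|\sim N^d$, we have by definition of the maximal function
\[
N^d\int_{|y|\le\frac1N}|f(x-y)|\,dy \lesssim (Mf)(x).
\]
To estimate the second integral, we sum over dyadic annuli:
\begin{align*}
N^d\int_{|y|>\frac1N}\frac{|f(x-y)|}{(N|y|)^{100d}}\,dy
&\lesssim N^d\sum_{M\in 2^{\mathbb{Z}};\,M>\frac1N}\int_{M\le|y|\le 2M}\frac{|f(x-y)|}{(NM)^{100d}}\,dy \\
&\lesssim N^d\sum_{M\in 2^{\mathbb{Z}};\,M>\frac1N}(NM)^{-100d}\,\frac{M^d}{|B(0,2M)|}\int_{B(0,2M)}|f(x-y)|\,dy \\
&\lesssim \sum_{M\in 2^{\mathbb{Z}};\,M>\frac1N}(NM)^{-99d}(Mf)(x).
\end{align*}
By the usual dyadic sum argument, the sum is bounded by its largest term, which occurs at $M\sim\frac1N$:
\[
\sum_{M\in 2^{\mathbb{Z}};\,M>\frac1N}(NM)^{-99d}(Mf)(x) \lesssim \Bigl(N\cdot\frac1N\Bigr)^{-99d}(Mf)(x) = (Mf)(x)
\]
and we are done.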

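3. By the density of $\mathcal{S}(\mathbb{R}^d)\subseteq L^p(\mathbb{R}^d)$ and by property (1), it suffices by the usual approximation argument to prove the claim for Schwartz functions. Thus, fix $f\in\mathcal{S}(\mathbb{R}^d)$.

For $p=2$, by Plancherel we have
\[
\|f-f_{N\le\cdot\le M}\|_{L^2} = \Bigl\|\hat f\Bigl(1-\sum_{N\le K\le M}\psi_K\Bigr)\Bigr\|_{L^2} \le \Bigl\|\hat f\sum_{K<N}\psi_K\Bigr\|_{L^2} + \Bigl\|\hat f\sum_{K>M}\psi_K\Bigr\|_{L^2}.
\]
Since $\sum_{K<N}\psi_K\le\chi_{B(0,N)}$ and $\sum_{K>M}\psi_K\le\chi_{|\xi|\ge M}$, by the dominated convergence theorem, the first term on the right tends to $0$ as $N\to 0$, and the second term tends to $0$ as $M\to\infty$.

For $1<p<2$, we use interpolation. Choose $\theta$ so that $\frac1p = \frac\theta1+\frac{1-\theta}{2}$. Then
\[
\|f-f_{N\le\cdot\le M}\|_{L^p} \le \|f-f_{N\le\cdot\le M}\|_{L^1}^{\theta}\,\|f-f_{N\le\cdot\le M}\|_{L^2}^{1-\theta}.
\]
By (1),
\[
\|f-f_{N\le\cdot\le M}\|_{L^1}^{\theta} \le \bigl(\|f\|_{L^1}+\|f_{N\le\cdot\le M}\|_{L^1}\bigr)^{\theta} \lesssim \|f\|_{L^1}^{\theta}.
\]
Our previous computation shows that $\|f-f_{N\le\cdot\le M}\|_{L^2}^{1-\theta}\to 0$ and thus $\|f-f_{N\le\cdot\le M}\|_{L^p}\to 0$.

For $2<p<\infty$, the same trick works, now interpolating between $L^\infty$ and $L^2$:
\[
\|f-f_{N\le\cdot\le M}\|_{L^p} \le \|f-f_{N\le\cdot\le M}\|_{L^\infty}^{1-\frac2p}\,\|f-f_{N\le\cdot\le M}\|_{L^2}^{\frac2p}.
\]
The first factor is bounded by (1), and the second factor $\to 0$ by the previous calculation.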

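4. By Young's convolution inequality, for $r$ satisfying $\frac1p+\frac1r = \frac1q+1$,
\[
\|f_N\|_{L^q} = \bigl\|f * N^d\check\psi(N\cdot)\bigr\|_{L^q} \lesssim \|f\|_{L^p}\bigl\|N^d\check\psi(N\cdot)\bigr\|_{L^r} = \|f\|_{L^p}\,N^{d-\frac dr}\bigl\|\check\psi\bigr\|_{L^r} \lesssim N^{\frac dp-\frac dq}\|f\|_{L^p}.
\]
This is great, but it is not the estimate we need. To recover $f_N$ on the right, we use fattened Littlewood-Paley projections $\tilde P_N := P_{\frac N2\le\cdot\le 2N}$. Note that $\tilde P_N P_N = P_N$. Also, we have
\[
\widehat{\tilde P_N f}(\xi) = \Bigl(\sum_{\frac N2\le K\le 2N}\psi_K\Bigr)(\xi)\,\hat f(\xi),
\]
so that
\[
\tilde P_N f = f * \Bigl(\sum_{\frac N2\le K\le 2N}\psi_K\Bigr)^{\vee} = f * \sum_{\frac N2\le K\le 2N} K^d\check\psi(K\cdot) \sim f * N^d\sum_{\frac N2\le K\le 2N}\check\psi(K\cdot) = f * N^d\check{\tilde\psi}(N\cdot)
\]
for an appropriately defined $\tilde\psi$.$^2$ Thus,
\[
\|f_N\|_{L^q} = \bigl\|\tilde P_N f_N\bigr\|_{L^q} \sim \bigl\|f_N * N^d\check{\tilde\psi}(N\cdot)\bigr\|_{L^q} \lesssim \|f_N\|_{L^p}\bigl\|N^d\check{\tilde\psi}(N\cdot)\bigr\|_{L^r} = \|f_N\|_{L^p}\,N^{d-\frac dr}\bigl\|\check{\tilde\psi}\bigr\|_{L^r} \lesssim N^{\frac dp-\frac dq}\|f_N\|_{L^p}.
\]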

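5. Fix $s\in\mathbb{R}$. By definition, we have:
\[
\widehat{|\nabla|^s f_N}(\xi) \sim |\xi|^s\psi_N(\xi)\hat f(\xi) = N^s\Bigl[\Bigl(\frac{|\xi|}{N}\Bigr)^s\psi\Bigl(\frac\xi N\Bigr)\Bigr]\hat f(\xi).
\]
Since the support of $\psi$ is localized away from the origin, $\rho(\xi) := |\xi|^s\psi(\xi)\in C_c^\infty(\mathbb{R}^d\setminus\{0\})$ for any value of $s$. So $\widehat{|\nabla|^s f_N}(\xi) \sim N^s\rho\bigl(\frac\xi N\bigr)\hat f(\xi)$, hence
\[
|\nabla|^s f_N = N^s\bigl[f * N^d\check\rho(N\cdot)\bigr].
\]
Thus,
\[
\||\nabla|^s f_N\|_{L^p} \lesssim N^s\|f\|_{L^p}\bigl\|N^d\check\rho(N\cdot)\bigr\|_{L^1} = N^s\|f\|_{L^p}\|\check\rho\|_{L^1} \lesssim N^s\|f\|_{L^p}.
\]
Via the same fattened Littlewood-Paley projections technique, we get the estimate $\||\nabla|^s f_N\|_{L^p} \lesssim N^s\|f_N\|_{L^p}$.

To get the reverse inequality, observe that
\[
\hat f_N(\xi) = |\xi|^{-s}|\xi|^s\hat f_N(\xi) \sim \bigl(|\nabla|^{-s}|\nabla|^s f_N\bigr)^{\wedge}(\xi).
\]
Thus, applying the inequality we already have, with $-s$ in place of $s$, to $|\nabla|^s f_N$ gives
\[
\|f_N\|_{L^p} \sim \bigl\||\nabla|^{-s}|\nabla|^s f_N\bigr\|_{L^p} \lesssim N^{-s}\||\nabla|^s f_N\|_{L^p},
\]
so that $N^s\|f_N\|_{L^p} \lesssim \||\nabla|^s f_N\|_{L^p}$ as desired.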
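The power of $N$ in Bernstein inequality I comes entirely from a scaling computation. As a brief worked check (with $g$ standing for any fixed function in $L^r$, such as the kernel $\check\psi$ used in the proof):

```latex
% Scaling of an L^1-normalized dilation in L^r, via the substitution x' = N x.
\[
\bigl\| N^{d} g(N\,\cdot) \bigr\|_{L^{r}}
  = N^{d} \Bigl( \int |g(Nx)|^{r}\,dx \Bigr)^{1/r}
  = N^{d} \Bigl( N^{-d} \int |g(x')|^{r}\,dx' \Bigr)^{1/r}
  = N^{\,d - \frac{d}{r}}\, \|g\|_{L^{r}} .
\]
% Young's exponent relation 1/p + 1/r = 1/q + 1 gives 1 - 1/r = 1/p - 1/q, so
\[
d - \frac{d}{r} = \frac{d}{p} - \frac{d}{q} .
\]
```

Taking $r=1$ (that is, $p=q$) recovers the $L^1$-normalization used in the proof of property (1).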

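We remark that (3) does not hold for $p=1$ or $p=\infty$. Note that $\int f_N(x)\,dx = \hat f_N(0) = 0$ for any $N\in 2^{\mathbb{Z}}$, so $\int f_{N\le\cdot\le M}(x)\,dx = 0$. But $\sum P_N f\xrightarrow{L^1} f$ implies that their means converge. The claim therefore fails for $p=1$ by choosing an $f\in L^1$ with $\int f(x)\,dx\ne 0$. To see why the claim fails for $p=\infty$, note that $f_{N\le\cdot\le M}\in C^\infty$. Since convergence in $L^\infty$ is uniform convergence, $\lim\sum P_N f$ is continuous, so the claim fails for any discontinuous $f\in L^\infty$.

We also remark that the same proof technique in (4) gives the estimate $\|f_{\le N}\|_{L^q} \lesssim N^{\frac dp-\frac dq}\|f_{\le N}\|_{L^p}$.

Though (3) does not hold for $p=1$, we do have the following result.

Proposition 10.2. Let $f\in L^1(\mathbb{R}^d)$ with $\int_{\mathbb{R}^d} f(x)\,dx = 0$. Then $\sum P_N f\xrightarrow{L^1} f$.

$^2$ Note that, since we are summing over dyadic numbers, $\sum_{\frac N2\le K\le 2N}\check\psi(K\cdot) = \check\psi\bigl(\frac N2\,\cdot\bigr)+\check\psi(N\cdot)+\check\psi(2N\cdot)$.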

Proof. First, we claim that $f$ can be approximated by a smooth function with compact support and mean $0$. Indeed, fix $\varepsilon>0$. Then since $f\in L^1$, there exists an $R>0$ such that $\int_{|x|>R}|f(x)|\,dx<\varepsilon$. Then
\[
\int\bigl[f(x)\chi_{B(0,R)}(x)+O(\varepsilon)\chi_{R\le|x|\le 2R}(x)\bigr]\,dx = 0
\]
for some appropriately chosen constant $O(\varepsilon)$. Convolving with a smooth approximation to the identity gives the desired approximation.

Thus, we can assume that $f\in C_c^\infty(\mathbb{R}^d)$ and $\int f(x)\,dx = 0$. By the triangle inequality,
\[
\Bigl\|f-\sum_{N\le K\le M} P_K f\Bigr\|_{L^1} \le \Bigl\|\sum_{K<N} P_K f\Bigr\|_{L^1} + \sum_{K>M}\|P_K f\|_{L^1}.
\]
First, consider the high frequencies. By Bernstein's inequality,
\[
\sum_{K>M}\|P_K f\|_{L^1} \lesssim \sum_{K>M} K^{-1}\||\nabla| f_K\|_{L^1}.
\]
Since $f\in C_c^\infty(\mathbb{R}^d)$ and the sum is a geometric sum over dyadic numbers, $\sum_{K>M}\|P_K f\|_{L^1} \lesssim_f M^{-1}$, which $\to 0$ as $M\to\infty$. Next, consider the low frequencies. We exploit the fact that convolving with a mean $0$ function is like taking a derivative, in the following sense: we have
\[
(P_{\le N} f)(x) = \int f(y)N^d\check\varphi(N(x-y))\,dy = N^d\int f(y)\bigl[\check\varphi(N(x-y))-\check\varphi(Nx)\bigr]\,dy
\]
since $f$ has mean $0$. By the fundamental theorem of calculus,
\[
\check\varphi(N(x-y))-\check\varphi(Nx) = -Ny\cdot\int_0^1\nabla\check\varphi(Nx-\theta Ny)\,d\theta.
\]
Using the fact that the support of $f$ is contained in $B(0,R)$, along with the fact that $\check\varphi$ is Schwartz and hence decays as fast as we need,
\begin{align*}
\|P_{\le N} f\|_{L^1}
&\lesssim N^{d+1}\int\!\!\int|y||f(y)|\int_0^1|\nabla\check\varphi(Nx-\theta Ny)|\,d\theta\,dy\,dx \\
&\lesssim N^{d+1}R\int\!\!\int|f(y)|\,\frac{1}{\langle Nx\rangle^{100d}}\,dy\,dx.
\end{align*}
Here we have also used the fact that $NR\ll 1$ for $N\ll 1$, so we do not need to include $\theta Ny$ in the Japanese angle bracket. So
\[
\|P_{\le N} f\|_{L^1} \lesssim N^{d+1}R\,\|f\|_{L^1}\int\frac{1}{\langle Nx\rangle^{100d}}\,dx \lesssim N^{d+1}R\,\|f\|_{L^1}N^{-d} = NR\,\|f\|_{L^1},
\]
which $\to 0$ as $N\to 0$.

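11 2/22: Littlewood-Paley square function

We begin with an inequality from probability theory.

Lemma 11.1 (Khinchin's Inequality). Let $X_n$ be independent identically distributed random variables such that $X_n = \pm1$ with equal probability. Then for any $0<p<\infty$ and any coefficients $c_n$,
\[
\Bigl(\mathbb{E}\Bigl|\sum c_n X_n\Bigr|^p\Bigr)^{\frac1p} \sim_p \Bigl(\sum|c_n|^2\Bigr)^{\frac12}.
\]

Proof. We begin with the $\lesssim$ direction. By the layer-cake formula,
\[
\mathbb{E}\Bigl|\sum c_n X_n\Bigr|^p = p\int_0^\infty\lambda^p\,\mathbb{P}\Bigl(\Bigl|\sum c_n X_n\Bigr|>\lambda\Bigr)\,\frac{d\lambda}{\lambda}.
\]
We have
\[
\mathbb{P}\Bigl(\Bigl|\sum c_n X_n\Bigr|>\lambda\Bigr) \le \mathbb{P}\Bigl(\sum c_n X_n>\lambda\Bigr)+\mathbb{P}\Bigl(\sum c_n X_n<-\lambda\Bigr) = 2\,\mathbb{P}\Bigl(\sum c_n X_n>\lambda\Bigr)
\]
because the $X_n$ are i.i.d. with $X_n = \pm1$ with equal probability. Next, recall the exponential Chebychev inequality: for a random variable $X$ and $t>0$,
\[
\mathbb{P}(X\ge\lambda) \le e^{-t\lambda}\,\mathbb{E}\bigl(e^{tX}\bigr).
\]
Thus, for $t>0$,
\[
\mathbb{P}\Bigl(\Bigl|\sum c_n X_n\Bigr|>\lambda\Bigr) \le 2e^{-\lambda t}\,\mathbb{E}\bigl(e^{t\sum c_n X_n}\bigr).
\]
Because the $X_n$ are independent, this gives
\[
\mathbb{P}\Bigl(\Bigl|\sum c_n X_n\Bigr|>\lambda\Bigr) \le 2e^{-\lambda t}\prod\mathbb{E}\bigl(e^{tc_n X_n}\bigr) = 2e^{-\lambda t}\prod\Bigl(\frac12e^{tc_n}+\frac12e^{-tc_n}\Bigr) = 2e^{-\lambda t}\prod\cosh(tc_n).
\]
Recall that $\cosh(x)\le e^{\frac{x^2}{2}}$; one quick way to see this is by comparing Taylor series. So
\[
\mathbb{P}\Bigl(\Bigl|\sum c_n X_n\Bigr|>\lambda\Bigr) \le 2e^{-\lambda t}\prod e^{\frac{(tc_n)^2}{2}} = 2e^{-\lambda t}e^{\frac{t^2}{2}\sum c_n^2}.
\]
Choosing $t$ such that $\lambda t = t^2\sum c_n^2$, hence $t = \frac{\lambda}{\sum c_n^2}$, gives
\[
\mathbb{P}\Bigl(\Bigl|\sum c_n X_n\Bigr|>\lambda\Bigr) \le 2e^{-\lambda t}e^{\frac{\lambda t}{2}} = 2e^{-\frac{\lambda t}{2}} = 2e^{-\frac{\lambda^2}{2\sum c_n^2}}.
\]
So
\[
\mathbb{E}\Bigl|\sum c_n X_n\Bigr|^p \lesssim_p \int_0^\infty\lambda^p e^{-\frac{\lambda^2}{2\sum c_n^2}}\,\frac{d\lambda}{\lambda}.
\]
Making the change of variables $z = \frac{\lambda}{\sqrt{\sum c_n^2}}$ yields
\[
\mathbb{E}\Bigl|\sum c_n X_n\Bigr|^p \lesssim \Bigl(\sum c_n^2\Bigr)^{\frac p2}\int_0^\infty z^p e^{-\frac{z^2}{2}}\,\frac{dz}{z} \lesssim \Bigl(\sum c_n^2\Bigr)^{\frac p2}.
\]
This gives the $\lesssim$ direction of the statement.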
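The inequality $\cosh(x)\le e^{x^2/2}$ used above is quoted with only a hint. A short worked version of the Taylor series comparison, added here for completeness:

```latex
% Term-by-term comparison of the two power series (all terms are nonnegative).
\[
\cosh(x) = \sum_{k \ge 0} \frac{x^{2k}}{(2k)!},
\qquad
e^{x^{2}/2} = \sum_{k \ge 0} \frac{x^{2k}}{2^{k}\,k!} .
\]
% It suffices to check (2k)! >= 2^k k! for every k >= 0:
\[
(2k)! = k! \cdot \prod_{j=1}^{k} (k + j) \;\ge\; k! \cdot 2^{k},
\]
% since each factor k + j is at least 2 when k >= 1 (and both sides equal 1 when k = 0).
```

Hence each coefficient of $\cosh$ is dominated by the corresponding coefficient of $e^{x^2/2}$, which gives the claim for all real $x$.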

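Next, we need the $\gtrsim$ direction. We first consider the case $1<p<\infty$. Note that $\sum|c_n|^2 = \mathbb{E}\bigl|\sum c_n X_n\bigr|^2$. This is because the $X_n$ are independent, hence $\mathbb{E}(X_nX_m) = \mathbb{E}(X_n)\mathbb{E}(X_m) = 0$ for $n\ne m$. Then by Hölder, and by the above inequality,
\[
\sum|c_n|^2 \le \Bigl(\mathbb{E}\Bigl|\sum c_n X_n\Bigr|^p\Bigr)^{\frac1p}\Bigl(\mathbb{E}\Bigl|\sum c_n X_n\Bigr|^{p'}\Bigr)^{\frac{1}{p'}} \lesssim \Bigl(\mathbb{E}\Bigl|\sum c_n X_n\Bigr|^p\Bigr)^{\frac1p}\Bigl(\sum c_n^2\Bigr)^{\frac12}.
\]
This gives
\[
\Bigl(\sum c_n^2\Bigr)^{\frac12} \lesssim \Bigl(\mathbb{E}\Bigl|\sum c_n X_n\Bigr|^p\Bigr)^{\frac1p}
\]
as desired.

For $0<p\le 1$, we use Cauchy-Schwarz:
\[
\sum|c_n|^2 = \mathbb{E}\Bigl|\sum c_n X_n\Bigr|^2 = \mathbb{E}\Bigl(\Bigl|\sum c_n X_n\Bigr|^{\frac p2}\Bigl|\sum c_n X_n\Bigr|^{2-\frac p2}\Bigr) \le \Bigl(\mathbb{E}\Bigl|\sum c_n X_n\Bigr|^p\Bigr)^{\frac12}\Bigl(\mathbb{E}\Bigl|\sum c_n X_n\Bigr|^{4-p}\Bigr)^{\frac12}.
\]
Again by the first demonstrated inequality we have
\[
\sum|c_n|^2 \lesssim \Bigl(\mathbb{E}\Bigl|\sum c_n X_n\Bigr|^p\Bigr)^{\frac12}\Bigl(\sum|c_n|^2\Bigr)^{\frac{4-p}{4}} \quad\Longrightarrow\quad \Bigl(\sum|c_n|^2\Bigr)^{\frac p4} \lesssim \Bigl(\mathbb{E}\Bigl|\sum c_n X_n\Bigr|^p\Bigr)^{\frac12}.
\]
Raising both sides to the power $\frac2p$ gives the desired result.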

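Theorem 11.2 (Littlewood-Paley square function). Let $f\in\mathcal{S}(\mathbb{R}^d)$. Define the Littlewood-Paley square function by
\[
(Sf)(x) = \Bigl(\sum_{N\in 2^{\mathbb{Z}}}|f_N(x)|^2\Bigr)^{\frac12}.
\]
Then $\|Sf\|_{L^p}\sim\|f\|_{L^p}$ for $1<p<\infty$.

Proof. We will first show $\|Sf\|_{L^p} \lesssim \|f\|_{L^p}$. As a remark, the proof of this direction does not rely on the specific multiplier $\psi$ defining $P_1$, so this inequality holds in more generality. In particular, $\psi$ can be replaced by any element of $C_c^\infty(\mathbb{R}^d\setminus\{0\})$.

Let $m(\xi) := \sum_{N\in 2^{\mathbb{Z}}}\psi_N(\xi)X_N$, where the $X_N$ are i.i.d. random variables such that $X_N = \pm1$ with equal probability. Then $\|m\|_{L^\infty} \lesssim 1$, because for any given $\xi$, only finitely many of the summands give any nonzero contribution due to the compact supports of the $\psi_N$. Also,
\[
|D^\alpha_\xi m(\xi)| \le \sum_{N\in 2^{\mathbb{Z}}} N^{-|\alpha|}\Bigl|(D^\alpha_\xi\psi)\Bigl(\frac\xi N\Bigr)\Bigr|.
\]
Because $\psi_N$ is smooth and has support $|\xi|\sim N$, and because only finitely many summands contribute,
\[
|D^\alpha_\xi m(\xi)| \lesssim |\xi|^{-|\alpha|}.
\]
As this holds for any $\alpha\in\mathbb{N}^d$, the Mikhlin multiplier theorem gives $\|\check m * f\|_{L^p} \lesssim \|f\|_{L^p}$ for all $1<p<\infty$, with a constant independent of the signs $X_N$. Note that $\check m * f = \sum_{N\in 2^{\mathbb{Z}}} f_N X_N$. So by Khinchin's inequality,
\[
(Sf)(x) = \Bigl(\sum_{N\in 2^{\mathbb{Z}}}|f_N(x)|^2\Bigr)^{\frac12} \sim \Bigl(\mathbb{E}\Bigl|\sum_{N\in 2^{\mathbb{Z}}} f_N(x)X_N\Bigr|^p\Bigr)^{\frac1p} = \bigl(\mathbb{E}\,|\check m * f|^p\bigr)^{\frac1p}.
\]
Using Fubini's theorem,
\[
\|Sf\|_{L^p}^p \sim \int\mathbb{E}\,|\check m * f|^p(x)\,dx = \mathbb{E}\,\|\check m * f\|_{L^p}^p.
\]
Since the expectation of a constant is itself,
\[
\|Sf\|_{L^p}^p \lesssim \mathbb{E}\,\|\check m * f\|_{L^p}^p \lesssim \mathbb{E}\,\|f\|_{L^p}^p = \|f\|_{L^p}^p.
\]
Next, we show $\|f\|_{L^p} \lesssim \|Sf\|_{L^p}$ using duality and the fattened Littlewood-Paley projections. Recall that $\tilde P_N P_N = P_N$. We have
\[
\|f\|_{L^p} = \sup_{\|g\|_{L^{p'}}=1}\langle f,g\rangle = \sup_{\|g\|_{L^{p'}}=1}\Bigl\langle\sum P_N f, g\Bigr\rangle = \sup_{\|g\|_{L^{p'}}=1}\Bigl\langle\sum\tilde P_N P_N f, g\Bigr\rangle = \sup_{\|g\|_{L^{p'}}=1}\sum\bigl\langle P_N f,\tilde P_N g\bigr\rangle
\]
since the operators $\tilde P_N$ are self-adjoint. By Cauchy-Schwarz and then Hölder,
\[
\|f\|_{L^p} \le \sup_{\|g\|_{L^{p'}}=1}\Bigl\langle\Bigl(\sum|P_N f|^2\Bigr)^{\frac12},\Bigl(\sum|\tilde P_N g|^2\Bigr)^{\frac12}\Bigr\rangle \le \sup_{\|g\|_{L^{p'}}=1}\|Sf\|_{L^p}\Bigl\|\Bigl(\sum|\tilde P_N g|^2\Bigr)^{\frac12}\Bigr\|_{L^{p'}}.
\]
By the remark at the beginning of the proof, $\bigl\|\bigl(\sum|\tilde P_N g|^2\bigr)^{\frac12}\bigr\|_{L^{p'}} \lesssim \|g\|_{L^{p'}}$. Thus, $\|f\|_{L^p} \lesssim \|Sf\|_{L^p}$.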

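12 2/24 – 3/3: Applications of the square function to fractional derivatives

Proposition 12.1. Let $1<p<\infty$.

1. For $s\in\mathbb{R}$,
\[
\||\nabla|^s f\|_{L^p} \sim \Bigl\|\Bigl(\sum_{N\in 2^{\mathbb{Z}}}|N^s f_N|^2\Bigr)^{\frac12}\Bigr\|_{L^p}.
\]

2. For $s>0$,
\[
\||\nabla|^s f\|_{L^p} \sim \Bigl\|\Bigl(\sum_{N\in 2^{\mathbb{Z}}}|N^s f_{\ge N}|^2\Bigr)^{\frac12}\Bigr\|_{L^p}.
\]

Proof.

1. First, consider the $\gtrsim$ inequality. We have
\[
\Bigl\|\Bigl(\sum_{N\in 2^{\mathbb{Z}}}|N^s f_N|^2\Bigr)^{\frac12}\Bigr\|_{L^p} = \Bigl\|\Bigl(\sum_{N\in 2^{\mathbb{Z}}}\bigl|N^s|\nabla|^{-s}|\nabla|^s f_N\bigr|^2\Bigr)^{\frac12}\Bigr\|_{L^p}.
\]
Also, $|\nabla|^s f_N = |\nabla|^s P_N f = P_N(|\nabla|^s f)$, as these operators are given by multiplication on the Fourier side. Thus,
\[
\Bigl\|\Bigl(\sum_{N\in 2^{\mathbb{Z}}}|N^s f_N|^2\Bigr)^{\frac12}\Bigr\|_{L^p} = \Bigl\|\Bigl(\sum_{N\in 2^{\mathbb{Z}}}\bigl|N^s|\nabla|^{-s}P_N(|\nabla|^s f)\bigr|^2\Bigr)^{\frac12}\Bigr\|_{L^p}.
\]
Recall that in the proof of the Littlewood-Paley square function estimate, we proved one direction in greater generality; we showed that $\|Sf\|_{L^p} \lesssim \|f\|_{L^p}$ where $S$ is defined via any $\psi\in C_c^\infty(\mathbb{R}^d\setminus\{0\})$. In particular, this holds for $\tilde\psi(\xi) := |\xi|^{-s}\psi(\xi)$, where $\psi$ is the usual Littlewood-Paley $\psi$. Because $\psi\in C_c^\infty(\mathbb{R}^d\setminus\{0\})$, $\tilde\psi\in C_c^\infty(\mathbb{R}^d\setminus\{0\})$. Defining $\tilde\psi_N(\xi) := N^s|\xi|^{-s}\psi(\xi/N)$, the general square function estimate gives
\[
\Bigl\|\Bigl(\sum_{N\in 2^{\mathbb{Z}}}\bigl|N^s|\nabla|^{-s}P_N(|\nabla|^s f)\bigr|^2\Bigr)^{\frac12}\Bigr\|_{L^p} \lesssim \||\nabla|^s f\|_{L^p}.
\]
Next, we show $\lesssim$, using duality and the fattened Littlewood-Paley projections.
\begin{align*}
\||\nabla|^s f\|_{L^p} = \sup_{\|g\|_{L^{p'}}=1}\langle|\nabla|^s f,g\rangle
&= \sup_{\|g\|_{L^{p'}}=1}\Bigl\langle\sum P_N(|\nabla|^s f),g\Bigr\rangle
= \sup_{\|g\|_{L^{p'}}=1}\Bigl\langle\sum|\nabla|^s\tilde P_N P_N f,g\Bigr\rangle \\
&= \sup_{\|g\|_{L^{p'}}=1}\sum\bigl\langle P_N f,|\nabla|^s\tilde P_N g\bigr\rangle
= \sup_{\|g\|_{L^{p'}}=1}\sum\bigl\langle N^sP_N f,N^{-s}|\nabla|^s\tilde P_N g\bigr\rangle.
\end{align*}
Applying Cauchy-Schwarz and then Hölder,
\begin{align*}
\||\nabla|^s f\|_{L^p}
&\le \sup_{\|g\|_{L^{p'}}=1}\Bigl\langle\Bigl(\sum|N^s f_N|^2\Bigr)^{\frac12},\Bigl(\sum\bigl|N^{-s}|\nabla|^s\tilde P_N g\bigr|^2\Bigr)^{\frac12}\Bigr\rangle \\
&\le \sup_{\|g\|_{L^{p'}}=1}\Bigl\|\Bigl(\sum|N^s f_N|^2\Bigr)^{\frac12}\Bigr\|_{L^p}\Bigl\|\Bigl(\sum\bigl|N^{-s}|\nabla|^s\tilde P_N g\bigr|^2\Bigr)^{\frac12}\Bigr\|_{L^{p'}}.
\end{align*}
By the same remark from above,
\[
\Bigl\|\Bigl(\sum\bigl|N^{-s}|\nabla|^s\tilde P_N g\bigr|^2\Bigr)^{\frac12}\Bigr\|_{L^{p'}} \lesssim \|g\|_{L^{p'}}.
\]
Thus, $\||\nabla|^s f\|_{L^p} \lesssim \bigl\|\bigl(\sum|N^s f_N|^2\bigr)^{\frac12}\bigr\|_{L^p}$.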

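2. Note that $P_N = P_{\ge N}-P_{\ge 2N}$. Thus,
\[
\||\nabla|^s f\|_{L^p} \sim \Bigl\|\Bigl(\sum|N^s f_N|^2\Bigr)^{\frac12}\Bigr\|_{L^p} \le \Bigl\|\Bigl(\sum|N^s f_{\ge N}|^2\Bigr)^{\frac12}\Bigr\|_{L^p} + \Bigl\|\Bigl(\sum|N^s f_{\ge 2N}|^2\Bigr)^{\frac12}\Bigr\|_{L^p},
\]
where we have invoked the $\ell^2$ triangle inequality followed by the $L^p$ triangle inequality. But reindexing the sum by $N\mapsto N/2$ costs only a constant factor $2^{-s}$, so
\[
\Bigl\|\Bigl(\sum|N^s f_{\ge 2N}|^2\Bigr)^{\frac12}\Bigr\|_{L^p} \lesssim \Bigl\|\Bigl(\sum|N^s f_{\ge N}|^2\Bigr)^{\frac12}\Bigr\|_{L^p}.
\]
So
\[
\||\nabla|^s f\|_{L^p} \lesssim \Bigl\|\Bigl(\sum|N^s f_{\ge N}|^2\Bigr)^{\frac12}\Bigr\|_{L^p}.
\]
Next, we show $\gtrsim$. We have
\begin{align*}
\sum|N^s f_{\ge N}|^2 = \sum N^{2s}|f_{\ge N}|^2
&= \sum_{N\in 2^{\mathbb{Z}}} N^{2s}\Bigl|\sum_{K\ge N} f_K\Bigr|^2 \\
&\le \sum_{N\in 2^{\mathbb{Z}}} N^{2s}\Bigl(\sum_{N_1\ge N}|f_{N_1}|\Bigr)\Bigl(\sum_{N_2\ge N}|f_{N_2}|\Bigr).
\end{align*}
Multiplying the latter sums out, rearranging, and picking up a factor of $2$ from symmetry gives
\[
\sum|N^s f_{\ge N}|^2 \le 2\sum_{N\in 2^{\mathbb{Z}}} N^{2s}\sum_{N\le N_1\le N_2}|f_{N_1}||f_{N_2}|.
\]
Cleverly inserting constants yields
\begin{align*}
\sum|N^s f_{\ge N}|^2
&\le 2\sum_{N\in 2^{\mathbb{Z}}} N^{2s}\sum_{N\le N_1\le N_2}\frac{1}{N_1^sN_2^s}\,|N_1^s f_{N_1}||N_2^s f_{N_2}| \\
&= 2\sum_{N,N_1,N_2;\ N\le N_1\le N_2}\frac{N^{2s}}{N_1^sN_2^s}\,|N_1^s f_{N_1}||N_2^s f_{N_2}|.
\end{align*}
Freeze $N_1$ and $N_2$ and consider the sum over $N$. This is a dyadic sum, and by the usual argument (using $s>0$) we can bound it by plugging in the largest term, which occurs at $N = N_1$. Thus,
\[
\sum|N^s f_{\ge N}|^2 \lesssim \sum_{N_1\le N_2}\Bigl(\frac{N_1}{N_2}\Bigr)^s|N_1^s f_{N_1}||N_2^s f_{N_2}|.
\]
The kernel $(N_1/N_2)^s$ is summable in both $N_1$ and $N_2$, since $s>0$. By Schur's test, it follows that
\[
\sum|N^s f_{\ge N}|^2 \lesssim \Bigl(\sum_{N_2}|N_2^s f_{N_2}|^2\Bigr)^{\frac12}\Bigl(\sum_{N_1}|N_1^s f_{N_1}|^2\Bigr)^{\frac12} = \sum|N^s f_N|^2.
\]
Taking the square root of both sides and then the $L^p$ norm gives
\[
\Bigl\|\Bigl(\sum|N^s f_{\ge N}|^2\Bigr)^{\frac12}\Bigr\|_{L^p} \lesssim \Bigl\|\Bigl(\sum|N^s f_N|^2\Bigr)^{\frac12}\Bigr\|_{L^p}.
\]
Applying the result from the previous part gives the desired result.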
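The appeal to Schur's test in the proof of part 2 can be made fully explicit with Cauchy-Schwarz. In this worked sketch, $a_N := |N^s f_N(x)|$ is shorthand introduced only for the illustration, and $k$ indexes the dyadic ratio $N_2 = 2^kN_1$.

```latex
% Explicit form of the Schur-test step; a_N := |N^s f_N(x)| is assumed shorthand.
\begin{align*}
\sum_{N_1 \le N_2} \Bigl(\frac{N_1}{N_2}\Bigr)^{s} a_{N_1} a_{N_2}
  &= \sum_{k \ge 0} 2^{-ks} \sum_{N_1 \in 2^{\mathbb{Z}}} a_{N_1}\, a_{2^{k} N_1} \\
  &\le \sum_{k \ge 0} 2^{-ks}
     \Bigl( \sum_{N_1} a_{N_1}^{2} \Bigr)^{1/2}
     \Bigl( \sum_{N_1} a_{2^{k} N_1}^{2} \Bigr)^{1/2} \\
  &= \frac{1}{1 - 2^{-s}} \sum_{N \in 2^{\mathbb{Z}}} a_{N}^{2} .
\end{align*}
```

The geometric series in $k$ converges precisely because $s>0$, which is where the hypothesis of part 2 enters.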

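Next, we use these estimates to prove a product rule and chain rule for the operator $|\nabla|^s$, both results due to Christ and Weinstein. As $|\nabla|^s$ is a nonlocal operator, these rules are given as $L^p$ norm estimates, rather than as pointwise facts. These results are used frequently in the study of partial differential equations.

Theorem 12.1 (Fractional product rule). Let $1<p<\infty$ and $s>0$. Then
\[
\||\nabla|^s(fg)\|_{L^p} \lesssim \||\nabla|^s f\|_{L^{p_1}}\|g\|_{L^{p_2}} + \||\nabla|^s g\|_{L^{q_1}}\|f\|_{L^{q_2}}
\]
where $\frac1p = \frac{1}{p_1}+\frac{1}{p_2} = \frac{1}{q_1}+\frac{1}{q_2}$.

Proof. The previous proposition gives $\||\nabla|^s(fg)\|_{L^p} \sim \bigl\|\bigl(\sum N^{2s}|P_N(fg)|^2\bigr)^{\frac12}\bigr\|_{L^p}$. We perform a paraproduct decomposition on $fg$:
\[
fg = f_{\ge\frac N8}\,g + f_{<\frac N8}\,g = f_{\ge\frac N8}\,g + f_{<\frac N8}\,g_{\ge\frac N8} + f_{<\frac N8}\,g_{<\frac N8}.
\]
Then
\[
P_N(fg) = P_N\bigl(f_{\ge\frac N8}\,g\bigr) + P_N\bigl(f_{<\frac N8}\,g_{\ge\frac N8}\bigr) + P_N\bigl(f_{<\frac N8}\,g_{<\frac N8}\bigr).
\]
Note that $f_{<\frac N8}$ and $g_{<\frac N8}$ have frequency supports bounded by $2\cdot\frac N8 = \frac N4$. Thus, the maximum frequency attained by $f_{<\frac N8}\,g_{<\frac N8}$ is bounded by $\frac N4+\frac N4 = \frac N2$, and so $P_N\bigl(f_{<\frac N8}\,g_{<\frac N8}\bigr) = 0$. It remains to consider the first two terms.

Using the fact that the Littlewood-Paley projections are bounded by the maximal function,
\[
|P_N(fg)| \lesssim M\bigl(f_{\ge\frac N8}\,g\bigr) + M\bigl(f_{<\frac N8}\,g_{\ge\frac N8}\bigr) \lesssim M\bigl(f_{\ge\frac N8}\,g\bigr) + M\bigl[(Mf)\,g_{\ge\frac N8}\bigr].
\]
Therefore, the $\ell^2$ and $L^p$ triangle inequalities yield
\[
\Bigl\|\Bigl(\sum N^{2s}|P_N(fg)|^2\Bigr)^{\frac12}\Bigr\|_{L^p} \lesssim \Bigl\|\Bigl(\sum\bigl|M\bigl(N^s f_{\ge\frac N8}\,g\bigr)\bigr|^2\Bigr)^{\frac12}\Bigr\|_{L^p} + \Bigl\|\Bigl(\sum\bigl|M\bigl[(Mf)\,N^s g_{\ge\frac N8}\bigr]\bigr|^2\Bigr)^{\frac12}\Bigr\|_{L^p}.
\]
Next, apply the vector-valued Hardy-Littlewood maximal inequality to both norms:
\begin{align*}
\Bigl\|\Bigl(\sum N^{2s}|P_N(fg)|^2\Bigr)^{\frac12}\Bigr\|_{L^p}
&\lesssim \Bigl\|\Bigl(\sum\bigl|N^s f_{\ge\frac N8}\,g\bigr|^2\Bigr)^{\frac12}\Bigr\|_{L^p} + \Bigl\|\Bigl(\sum\bigl|(Mf)\,N^s g_{\ge\frac N8}\bigr|^2\Bigr)^{\frac12}\Bigr\|_{L^p} \\
&= \Bigl\||g|\Bigl(\sum\bigl|N^s f_{\ge\frac N8}\bigr|^2\Bigr)^{\frac12}\Bigr\|_{L^p} + \Bigl\||Mf|\Bigl(\sum\bigl|N^s g_{\ge\frac N8}\bigr|^2\Bigr)^{\frac12}\Bigr\|_{L^p}.
\end{align*}
Applying Hölder's inequality,
\[
\Bigl\|\Bigl(\sum N^{2s}|P_N(fg)|^2\Bigr)^{\frac12}\Bigr\|_{L^p} \lesssim \|g\|_{L^{p_2}}\Bigl\|\Bigl(\sum\bigl|N^s f_{\ge\frac N8}\bigr|^2\Bigr)^{\frac12}\Bigr\|_{L^{p_1}} + \|Mf\|_{L^{q_2}}\Bigl\|\Bigl(\sum\bigl|N^s g_{\ge\frac N8}\bigr|^2\Bigr)^{\frac12}\Bigr\|_{L^{q_1}}.
\]
Since the maximal function is of type $(q_2,q_2)$, $\|Mf\|_{L^{q_2}} \lesssim \|f\|_{L^{q_2}}$. Applying part 2 of the previous proposition to the $p_1$ and $q_1$ norms yields
\[
\||\nabla|^s(fg)\|_{L^p} \sim \Bigl\|\Bigl(\sum N^{2s}|P_N(fg)|^2\Bigr)^{\frac12}\Bigr\|_{L^p} \lesssim \|g\|_{L^{p_2}}\||\nabla|^s f\|_{L^{p_1}} + \|f\|_{L^{q_2}}\||\nabla|^s g\|_{L^{q_1}}.
\]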

Theorem 12.2 (Fractional chain rule). Suppose F : C→ C satisfies

|F (u)− F (v)| . |u− v| · |G(u)−G(v)|

for all functions u, v : Rd → C and for some function G : C→ [0,∞). Then for 1 < p <∞ and 0 < s < 1,

‖|∇|sF (u)‖Lp . ‖|∇|su‖Lp1 ‖G(u)‖Lp2

where 1p = 1

p1+ 1

p2and 1 < p2 ≤ ∞.

Proof. As before, the previous proposition yields ‖|∇|sF (u)‖Lp ∼∥∥∥(∑N2s|PNF (u)|2

) 12

∥∥∥Lp

.

Observe that the function ψ is a mean zero function. Indeed,∫Rdψ(x) dx = ˆψ(0) = ψ(0) = 0.

We use this and the fact that convolving with a mean zero function is like differentiation as follows:

PNF (u)(x) = (F (u) ∗Ndψ(N ·))(x) = Nd

∫NyF (u(x− y)) dy

= Nd

∫ψ(Ny) [F (u(x− y))− F (u(x))] dy

and so|PNF (u)(x)| . Nd

∫|ψ(Ny)| |u(x− y)− u(x)| |G(u(x− y))−G(u(x)| dy.

We have|u(x− y)− u(x)| ≤ |u>N (x− y)|+ |u>N (x)|+

∑K≤N

|uK(x− y)− uK(x)|.

We claim that|uK(x− y)− uK(x)| . K|y| |(MuK)(x− y) + (MuK)(x)| .

Indeed, if K|y| & 1, then

|uK(x− y)− uK(x)| ≤ |uK(x− y)|+ |uK(x)| . |(MuK)(x− y) + (MuK)(x)|

. K|y| |(MuK)(x− y) + (MuK)(x)| .

If K|y| << 1, then

uK(x− y)− uK(x) = PK(uK(x− y)− uK(x))

= Kd

∫ˇψ(Ky) [uK(x− y − z)− uK(x− z)] dz

=

∫Kd[

ˇψ(K(z − y))− ˇ

ψ(Kz)]uK(x− z) dz

via the change of variables y + z 7→ z. By the fundamental theorem of calculus,

|uK(x− y)− uK(x)| ≤∫Kd

∫ 1

0

K|y|∣∣∇ψ(Kz − θKy)

∣∣ dθ |uK(x− z)| dz.

Because ˇψ is Schwartz and thus is bounded and decays as quickly as we need, we have

|uK(x− y)− uK(x)| . K|y|∫Kd 1

〈Kz〉100d|uK(x− z)| dz.

59

In previous arguments, by considering z < 1/K and z > 1/K separately, we have seen that∫Kd 1

〈Kz〉100d|uK(x− z)| dz . (MuK)(x).

Our claim then follows from this fact.

Next, we consider the contribution of $|u_{>N}(x-y)|$ to $|P_N F(u)(x)|$. The contribution is bounded by
\[
\int N^d \, |\check\psi(Ny)| \, |u_{>N}(x-y)| \, \big[ G(u)(x-y) + G(u)(x) \big] \, dy.
\]

By the same trick as above, use the quickly decaying and bounded nature of $\check\psi$ to estimate this integral by
\[
M\big(u_{>N} \cdot G(u)\big)(x) + G(u)(x) \cdot M(u_{>N})(x) \lesssim M\big(u_{>N} \cdot G(u)\big)(x) + M(G(u))(x) \cdot M(u_{>N})(x).
\]

Similarly, the contribution of $|u_{>N}(x)|$ to $|P_N F(u)(x)|$ is bounded by
\[
\begin{aligned}
\int N^d \, |\check\psi(Ny)| \, |u_{>N}(x)| \, \big[ G(u)(x-y) + G(u)(x) \big] \, dy
&\lesssim |u_{>N}(x)| \cdot M(G(u))(x) + |u_{>N}(x)| \cdot G(u)(x) \\
&\lesssim M(u_{>N})(x) \cdot M(G(u))(x).
\end{aligned}
\]

Thus, the contribution of $|u_{>N}(x-y)| + |u_{>N}(x)|$ to $|P_N F(u)(x)|$ is bounded by
\[
M\big(u_{>N} \cdot G(u)\big)(x) + M(G(u))(x) \cdot M(u_{>N})(x).
\]

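Estimate the contribution of $M(u_{>N} \cdot G(u))(x)$ to $\| |\nabla|^s F(u) \|_{L^p}$ as follows:
\[
\left\| \Big( \sum_{N \in 2^{\mathbb{Z}}} N^{2s} |M(u_{>N} G(u))|^2 \Big)^{1/2} \right\|_{L^p}
= \left\| \Big( \sum_{N \in 2^{\mathbb{Z}}} |M(N^s u_{>N} G(u))|^2 \Big)^{1/2} \right\|_{L^p}
\lesssim \left\| \Big( \sum_{N \in 2^{\mathbb{Z}}} |N^s u_{>N} G(u)|^2 \Big)^{1/2} \right\|_{L^p}
\]
using the vector-valued Hardy-Littlewood maximal inequality. Next, pull the $G(u)$ out of the sum and apply Holder's inequality followed by our derivative estimate from earlier:
\[
\left\| \Big( \sum_{N \in 2^{\mathbb{Z}}} |N^s u_{>N} G(u)|^2 \Big)^{1/2} \right\|_{L^p}
\leq \|G(u)\|_{L^{p_2}} \left\| \Big( \sum_{N \in 2^{\mathbb{Z}}} |N^s u_{>N}|^2 \Big)^{1/2} \right\|_{L^{p_1}}
\lesssim \|G(u)\|_{L^{p_2}} \, \| |\nabla|^s u \|_{L^{p_1}}.
\]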

The contribution of $M(G(u))(x) \cdot M(u_{>N})(x)$ to $\| |\nabla|^s F(u) \|_{L^p}$ is almost the exact same calculation. Thus, it only remains to bound the contribution of the $\sum_{K \leq N} |u_K(x-y) - u_K(x)|$ term. [This final step remains to be completed.]
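
One standard way to carry out this remaining step is sketched below. It is not taken from the lecture: it assumes the claim $|u_K(x-y) - u_K(x)| \lesssim K|y| \, [(Mu_K)(x-y) + (Mu_K)(x)]$ proved above, the decay of the Schwartz kernel of $P_N$, the square function estimate from the previous proposition, and $1 < p_1 < \infty$. Inserting the claim into the bound for $|P_N F(u)(x)|$ and using that $|y| \cdot N^d \langle Ny \rangle^{-100d} \leq N^{-1} \cdot N^d \langle Ny \rangle^{-99d}$, the same maximal function trick bounds the contribution of the sum by
\[
\sum_{K \leq N} \frac{K}{N} \Big[ M\big( Mu_K \cdot G(u) \big)(x) + M(Mu_K)(x) \cdot M(G(u))(x) \Big].
\]
Multiplying by $N^s$ and writing $N^s \frac{K}{N} = \big(\frac{K}{N}\big)^{1-s} K^s$, the kernel $\big(\frac{K}{N}\big)^{1-s} 1_{K \leq N}$ is summable in each dyadic variable because $s < 1$, so Schur's test gives, pointwise in $x$,
\[
\sum_{N \in 2^{\mathbb{Z}}} \Big| \sum_{K \leq N} \Big(\frac{K}{N}\Big)^{1-s} a_K \Big|^2 \lesssim \sum_{K \in 2^{\mathbb{Z}}} |a_K|^2,
\qquad a_K := M\big( K^s Mu_K \cdot G(u) \big) + M(K^s Mu_K) \cdot M(G(u)).
\]
The vector-valued maximal inequality, Holder's inequality, and the boundedness of $M$ on $L^{p_2}$ then yield
\[
\left\| \Big( \sum_K |a_K|^2 \Big)^{1/2} \right\|_{L^p}
\lesssim \|G(u)\|_{L^{p_2}} \left\| \Big( \sum_K |K^s u_K|^2 \Big)^{1/2} \right\|_{L^{p_1}}
\sim \|G(u)\|_{L^{p_2}} \, \| |\nabla|^s u \|_{L^{p_1}},
\]
which matches the bound obtained for the other two contributions.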

An example of such a function $F$ which appears in applications in partial differential equations is $F(u) = |u|^p u$. Indeed, by the fundamental theorem of calculus,
\[
|F(u) - F(v)| \lesssim_p |u - v| \cdot \big( |u|^p + |v|^p \big),
\]
so the hypothesis of the theorem holds with $G(u) = |u|^p$.

Nonlinearities in equations such as the Schrodinger equation often take this form.

Also, this fractional product rule is useful for the case $s > 1$, as we can typically write derivative operators as an integer part plus a fractional part $0 < s < 1$.
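
To make the fundamental theorem of calculus step for $F(u) = |u|^p u$ explicit, here is a short derivation. It assumes $p > 0$; the notation $w_\theta := v + \theta(u - v)$ and the Wirtinger derivatives $\partial_z$, $\partial_{\bar z}$ are introduced only for this sketch. Writing $F(z) = (z \bar z)^{p/2} z$, one computes
\[
\partial_z F(z) = \Big( \frac{p}{2} + 1 \Big) |z|^p, \qquad \partial_{\bar z} F(z) = \frac{p}{2} |z|^{p-2} z^2,
\]
so that $|\partial_z F(z)| + |\partial_{\bar z} F(z)| = (p+1) |z|^p$. Then
\[
F(u) - F(v) = \int_0^1 \frac{d}{d\theta} F(w_\theta) \, d\theta
= (u - v) \int_0^1 \partial_z F(w_\theta) \, d\theta + \overline{(u - v)} \int_0^1 \partial_{\bar z} F(w_\theta) \, d\theta.
\]
Since $|w_\theta| \leq \max(|u|, |v|)$ for $0 \leq \theta \leq 1$,
\[
|F(u) - F(v)| \leq (p+1) \, |u - v| \, \max(|u|, |v|)^p \leq (p+1) \, |u - v| \, \big( |u|^p + |v|^p \big).
\]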


13 3/6 – 3/8: Oscillatory Integrals

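Definition 13.1. An oscillatory integral of the first kind is of the form
\[
I(\lambda) = \int_{\mathbb{R}^d} e^{i \lambda \phi(x)} \, \psi(x) \, dx
\]
where $\phi : \mathbb{R}^d \to \mathbb{R}$, $\psi : \mathbb{R}^d \to \mathbb{C}$, and $\lambda > 0$.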

Our main goal is to understand the asymptotic behavior of these integrals as $\lambda \to \infty$. This question arises in the study of many PDEs; for example, $\lambda$ can represent time. These kinds of oscillatory integrals will be our only focus, though we make the following definition for completeness:

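Definition 13.2. An oscillatory integral of the second kind is an operator of the form
\[
(T_\lambda f)(x) = \int_{\mathbb{R}^d} e^{i \lambda \phi(x,y)} \, K(x,y) \, f(y) \, dy
\]
where $\phi : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$, $K : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{C}$, $f : \mathbb{R}^d \to \mathbb{C}$, and $\lambda > 0$.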

In this case, it is desirable to understand the asymptotic behavior of $\|T_\lambda\|_{op}$ as $\lambda \to \infty$. A reference for these kinds of oscillatory integrals is Chapter 9 of Stein's harmonic analysis text.

We begin with the one-dimensional case for oscillatory integrals of the first kind. There are three main results which we prove.

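Proposition 13.1 (Nonstationary phase). Let $\phi : \mathbb{R} \to \mathbb{R}$ and $\psi : \mathbb{R} \to \mathbb{C}$ be smooth. Assume that $\psi$ has compact support in $(a, b)$ and assume $\phi'(x) \neq 0$ for all $x \in [a, b]$. Then
\[
\left| \int_a^b e^{i \lambda \phi(x)} \psi(x) \, dx \right| \lesssim_N \lambda^{-N}
\]
for any $N \geq 0$.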

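Proof. Note that
\[
e^{i \lambda \phi(x)} = \frac{1}{i \lambda \phi'(x)} \frac{d}{dx} e^{i \lambda \phi(x)},
\]
since $\phi'(x) \neq 0$. Let $D := \frac{1}{i \lambda \phi'(x)} \frac{d}{dx}$. Then $e^{i \lambda \phi(x)} = D^N \big( e^{i \lambda \phi(x)} \big)$ for any $N \geq 0$. We wish to compute the adjoint of $D$. Integrating by parts, for any smooth $f$, we have
\[
\int_{\mathbb{R}} Df(x) \, \psi(x) \, dx = \int_{\mathbb{R}} \frac{1}{i \lambda \phi'(x)} \frac{df}{dx}(x) \, \psi(x) \, dx = - \int_{\mathbb{R}} f(x) \, \frac{d}{dx} \left( \frac{1}{i \lambda \phi'(x)} \psi(x) \right) dx
\]
with no boundary terms because of the compact support of $\psi$. Thus, the adjoint of $D$ is defined by
\[
{}^t D f(x) := - \frac{d}{dx} \left[ \frac{1}{i \lambda \phi'(x)} f(x) \right].
\]

We have
\[
\begin{aligned}
\int_a^b e^{i \lambda \phi(x)} \psi(x) \, dx &= \int_a^b D^N \big( e^{i \lambda \phi(x)} \big) \psi(x) \, dx = \int_a^b e^{i \lambda \phi(x)} \, ({}^t D)^N \psi(x) \, dx \\
&= \int_a^b e^{i \lambda \phi(x)} \left[ - \frac{d}{dx} \frac{1}{i \lambda \phi'(x)} \right]^N (\psi)(x) \, dx.
\end{aligned}
\]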


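Therefore,
\[
\left| \int_a^b e^{i \lambda \phi(x)} \psi(x) \, dx \right| \leq \lambda^{-N} \int_a^b \left| \left[ \frac{d}{dx} \frac{1}{\phi'(x)} \right]^N (\psi)(x) \right| dx.
\]
Inside this integral, the derivative can hit either the $\phi'(x)$ in the denominator or the $\psi(x)$. So
\[
\int_a^b \left| \left[ \frac{d}{dx} \frac{1}{\phi'(x)} \right]^N (\psi)(x) \right| dx
\]
is some finite constant which depends on $N + 1$ derivatives of $\phi(x)$ and $N$ derivatives of $\psi(x)$. Thus,
\[
\left| \int_a^b e^{i \lambda \phi(x)} \psi(x) \, dx \right| \lesssim_N \lambda^{-N}
\]
for any $N \geq 0$, where the implicit constant depends on $N + 1$ derivatives of $\phi(x)$ and $N$ derivatives of $\psi(x)$.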

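We remark that if $\psi$ does not have compact support, then the best decay we can expect in $\lambda$ is $\lambda^{-1}$. For example, when $\psi(x) = 1$ and $\phi(x) = x$, observe that
\[
\left| \int_a^b e^{i \lambda x} \, dx \right| = \left| \frac{e^{i \lambda b} - e^{i \lambda a}}{i \lambda} \right| \lesssim \lambda^{-1}.
\]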
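
The contrast between the proposition and this remark can be seen numerically. The following is a minimal sketch, not part of the lecture: it takes $\phi(x) = x$ on $[-1, 1]$ and compares a smooth bump amplitude against $\psi = 1$. The bump function, the Simpson quadrature, and all function names are illustrative choices.

```python
import cmath
import math


def simpson(f, a, b, n=4000):
    """Composite Simpson rule for a complex-valued integrand; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3


def bump(x):
    """Smooth amplitude supported in [-1, 1] (an illustrative choice of psi)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def integral_bump(lam):
    """I(lambda) with phase phi(x) = x and amplitude psi = bump."""
    return simpson(lambda x: cmath.exp(1j * lam * x) * bump(x), -1.0, 1.0)


def integral_flat(lam):
    """I(lambda) with phi(x) = x and psi = 1, which is not compactly supported in (-1, 1)."""
    return simpson(lambda x: cmath.exp(1j * lam * x), -1.0, 1.0)


if __name__ == "__main__":
    # Columns: lambda, |I| for the bump, |I| for psi = 1, closed form 2|sin(lambda)|/lambda.
    for lam in (10.0, 20.0, 40.0, 80.0):
        print(lam, abs(integral_bump(lam)), abs(integral_flat(lam)),
              2.0 * abs(math.sin(lam)) / lam)
```

For the bump, one integration by parts already gives $|I(\lambda)| \leq \lambda^{-1} \int |\psi'| = 2e^{-1} \lambda^{-1}$, and the printed values should fall off much faster than that as $\lambda$ grows, while the $\psi = 1$ column only tracks $2|\sin \lambda| / \lambda$.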

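Proposition 13.2 (Van der Corput). Let $\phi : \mathbb{R} \to \mathbb{R}$ be smooth. Fix $k \geq 1$, and assume that $|\phi^{(k)}(x)| \geq 1$ for all $x \in [a, b]$. If $k = 1$ assume further that $\phi'$ is monotone. Then
\[
\left| \int_a^b e^{i \lambda \phi(x)} \, dx \right| \lesssim_k \lambda^{-\frac{1}{k}}
\]
where the implicit constant does not depend on $a$ or $b$.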

Proof. We proceed by induction on $k$. [Proof to be completed.]
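
The base case $k = 1$ of such an induction is the standard integration-by-parts argument, sketched here as an assumption about how the unfinished proof begins; the constant $3$ is simply what this sketch produces. Since $\phi'$ is continuous with $|\phi'| \geq 1$, it has constant sign on $[a, b]$, and since $\phi'$ is monotone, so is $1/\phi'$. Integrating by parts,
\[
\int_a^b e^{i \lambda \phi(x)} \, dx = \int_a^b \frac{1}{i \lambda \phi'(x)} \frac{d}{dx} e^{i \lambda \phi(x)} \, dx
= \left[ \frac{e^{i \lambda \phi(x)}}{i \lambda \phi'(x)} \right]_a^b - \int_a^b e^{i \lambda \phi(x)} \frac{d}{dx} \left( \frac{1}{i \lambda \phi'(x)} \right) dx.
\]
The boundary terms contribute at most $2 \lambda^{-1}$. For the remaining integral, the monotonicity of $1/\phi'$ lets the absolute value move outside the integral:
\[
\left| \int_a^b e^{i \lambda \phi(x)} \frac{d}{dx} \left( \frac{1}{i \lambda \phi'(x)} \right) dx \right|
\leq \frac{1}{\lambda} \int_a^b \left| \frac{d}{dx} \frac{1}{\phi'(x)} \right| dx
= \frac{1}{\lambda} \left| \frac{1}{\phi'(b)} - \frac{1}{\phi'(a)} \right| \leq \frac{1}{\lambda}.
\]
Hence $\left| \int_a^b e^{i \lambda \phi(x)} \, dx \right| \leq 3 \lambda^{-1}$, uniformly in $a$ and $b$.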

[Remark to be added before the proof.]
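
As a numerical illustration of the case $k = 2$, the following sketch (not from the lecture) takes $\phi(x) = x^2/2$, so that $\phi'' = 1$, and checks that $\lambda^{1/2} \left| \int_a^b e^{i \lambda \phi} \, dx \right|$ stays bounded over several intervals. The function names and the quadrature are illustrative; the value $8$ used as a reference bound is the constant the classical proof gives for $k = 2$ in Stein's text, and should be read as an assumption of this sketch.

```python
import cmath
import math


def simpson(f, a, b, n):
    """Composite Simpson rule for a complex-valued integrand; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3


def vdc_integral(lam, a, b, pts_per_unit=20000):
    """Approximate the integral of exp(i*lam*x^2/2) over [a, b]; here phi'' = 1."""
    n = 2 * max(1, int(pts_per_unit * (b - a) / 2))
    return simpson(lambda x: cmath.exp(0.5j * lam * x * x), a, b, n)


if __name__ == "__main__":
    # Print lambda, the interval, and the rescaled size sqrt(lambda) * |integral|.
    for lam in (25.0, 100.0, 400.0):
        for a, b in ((-1.0, 1.0), (0.0, 2.0), (-0.5, 3.0)):
            print(lam, (a, b), abs(vdc_integral(lam, a, b)) * math.sqrt(lam))
```

On $[-1, 1]$ the rescaled value should approach $\sqrt{2\pi}$ as $\lambda$ grows, the size of the full Fresnel integral $\int_{\mathbb{R}} e^{i y^2 / 2} \, dy$.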


14 3/13: The Schrodinger equation

We now consider applications of our results from oscillatory integrals to partial differential equations. The linear Schrodinger equation is given by
\[
\begin{cases}
i u_t = -\frac{\Delta}{2} u \\
u(0) = u_0.
\end{cases}
\tag{6}
\]

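Here, $u : \mathbb{R}_t \times \mathbb{R}^d_x \to \mathbb{C}$ and $u_0 : \mathbb{R}^d_x \to \mathbb{C}$. For now, we will assume $u_0 \in \mathcal{S}(\mathbb{R}^d)$.

To solve (6), we apply the Fourier transform and then separate variables. For this section, we adopt the following normalization convention with the Fourier transform: for a function $f$, define
\[
\hat f(\xi) = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} e^{-i x \cdot \xi} f(x) \, dx.
\]
Then
\[
f(x) = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} e^{i x \cdot \xi} \hat f(\xi) \, d\xi.
\]
We still have $\|f\|_{L^2} = \|\hat f\|_{L^2}$.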

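With this convention, applying the Fourier transform to (6) gives
\[
\begin{cases}
i \hat u_t(\xi) = \frac{|\xi|^2}{2} \hat u(\xi) \\
\hat u(0, \xi) = \hat u_0(\xi)
\end{cases}
\]
since $\Delta$ acts on the Fourier side as multiplication by $(i\xi) \cdot (i\xi) = -|\xi|^2$. This is an ODE with respect to $t$, and can be solved by separation of variables. Therefore, the above equation is solved by
\[
\hat u(t, \xi) = e^{-i \frac{|\xi|^2}{2} t} \, \hat u_0(\xi).
\]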
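
The separation of variables step can be written out as follows, for fixed $\xi$ and wherever $\hat u \neq 0$; the function $c(\xi)$ is a constant of integration introduced for this sketch. The equation reads $\partial_t \hat u = -i \frac{|\xi|^2}{2} \hat u$, so
\[
\frac{\partial_t \hat u(t, \xi)}{\hat u(t, \xi)} = -i \frac{|\xi|^2}{2}
\quad \Longrightarrow \quad
\log \hat u(t, \xi) = -i \frac{|\xi|^2}{2} t + c(\xi)
\quad \Longrightarrow \quad
\hat u(t, \xi) = e^{c(\xi)} e^{-i \frac{|\xi|^2}{2} t},
\]
and the initial condition forces $e^{c(\xi)} = \hat u_0(\xi)$. One can also verify the formula directly:
\[
i \, \partial_t \Big( e^{-i \frac{|\xi|^2}{2} t} \hat u_0(\xi) \Big) = i \cdot \Big( -i \frac{|\xi|^2}{2} \Big) e^{-i \frac{|\xi|^2}{2} t} \hat u_0(\xi) = \frac{|\xi|^2}{2} \, \hat u(t, \xi).
\]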

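Inverting the Fourier transform gives a solution to (6) of the form
\[
\begin{aligned}
u(t, x) &= \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} e^{i x \cdot \xi - i \frac{|\xi|^2}{2} t} \, \hat u_0(\xi) \, d\xi \\
&= \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} e^{i t \left( \frac{x}{t} \cdot \xi - \frac{|\xi|^2}{2} \right)} \, \hat u_0(\xi) \, d\xi.
\end{aligned}
\]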

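This is an oscillatory integral of the first kind. Let $\phi(\xi) := \frac{x}{t} \cdot \xi - \frac{|\xi|^2}{2}$. Then $\nabla \phi(\xi) = \frac{x}{t} - \xi$, hence $\phi$ has a critical point at $\xi_0 := \frac{x}{t}$. Because $D^2 \phi(\xi_0) = -I$, this critical point is nondegenerate. Consequently, by the stationary phase result, we have
\[
\begin{aligned}
u(t, x) &= (2\pi)^{-\frac{d}{2}} \, e^{i t \phi(\xi_0)} \, \hat u_0(\xi_0) \, (2\pi)^{\frac{d}{2}} \, e^{-i \frac{\pi}{4} d} \, t^{-\frac{d}{2}} + O\big( t^{-\frac{d}{2} - 1} \big) \\
&= e^{-i \frac{\pi d}{4}} \, t^{-\frac{d}{2}} \, e^{i \frac{|x|^2}{2t}} \, \hat u_0\Big( \frac{x}{t} \Big) + O\big( t^{-\frac{d}{2} - 1} \big)
\end{aligned}
\]
as $t \to \infty$. This is known as Fraunhofer's formula. In words, it says that the long-time behavior of the physical distribution $u(t, x)$ is given by the momentum distribution $\hat u_0\big( \frac{x}{t} \big)$.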
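
Fraunhofer's formula can be checked against an explicit solution. The sketch below is an illustration rather than part of the lecture: it assumes $d = 1$ and Gaussian data $u_0(x) = e^{-x^2/2}$, for which $\hat u_0(\xi) = e^{-\xi^2/2}$ under the normalization above and the solution has the closed form $u(t, x) = (1 + it)^{-1/2} e^{-x^2 / (2(1 + it))}$. The function names are hypothetical.

```python
import cmath
import math


def exact_solution(t, x):
    """Solution of i u_t = -(1/2) u_xx in d = 1 with u0(x) = exp(-x^2/2)."""
    a = 1.0 + 1j * t
    return cmath.exp(-x * x / (2.0 * a)) / cmath.sqrt(a)


def fraunhofer(t, x):
    """Leading term e^{-i pi/4} t^{-1/2} e^{i x^2/(2t)} u0_hat(x/t), with u0_hat(xi) = exp(-xi^2/2)."""
    xi0 = x / t
    return (cmath.exp(-0.25j * math.pi) * t ** -0.5
            * cmath.exp(1j * x * x / (2.0 * t)) * math.exp(-xi0 * xi0 / 2.0))


def scaled_error(t, v):
    """|exact - leading term| * t^{3/2} along the ray x = v * t."""
    x = v * t
    return abs(exact_solution(t, x) - fraunhofer(t, x)) * t ** 1.5


if __name__ == "__main__":
    # If the error is O(t^{-3/2}), these rescaled values stay bounded as t grows.
    for v in (0.0, 0.5, 2.0):
        print(v, [scaled_error(t, v) for t in (10.0, 100.0, 1000.0)])
```

Along each ray $x = vt$ the rescaled error should remain bounded, consistent with the $O(t^{-d/2 - 1})$ remainder for $d = 1$.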
