FOURIER ANALYSIS
T.K.SUBRAHMONIAN MOOTHATHU
Contents
1. Introduction
2. Convolution and approximate identities
3. Fourier series: preliminaries, and a divergence result
4. Sufficient conditions for pointwise convergence of Fourier series
5. Cesàro summability and Abel summability
6. Weak type boundedness for maximal functions
7. Fourier series: pointwise convergence of Cesàro and Abel sums
8. Pointwise convergence of Fourier series for functions of bounded variation
9. Convolution is a smoothing operation
10. Topologies on the spaces D = C_c^∞(R) and E = C^∞(R)
11. The Schwartz space S
12. Distributions: preliminaries
13. Convolution and distributions
14. Some structure theorems about distributions
15. Fourier transform on R: basics
16. Fourier transform: sufficient conditions for pointwise inversion
17. Fourier transform on S, L2(R), and on distributions
18. Fourier transform of measures
19. Poisson summation formula
20. Two theorems of Wiener
21. Sketch: interpolation and the Lp-theory of Fourier series
1. Introduction
Abstract Harmonic Analysis, the generalization of Fourier Analysis, refers generally to the representation theory of locally compact topological groups, which are not necessarily abelian. Harmonic Analysis done on Euclidean groups such as the torus and R^n (which are abelian) is usually called Fourier Analysis, and this is what we plan to study here. We will discuss the basic aspects of
Fourier Analysis from the perspective of pure Mathematics, making use of tools from Measure
Theory and Functional Analysis (and hence these two subjects are prerequisites for this course). A
little bit of knowledge about Topological Groups will also be needed for the next section.
Fourier series on the circle will be discussed first. Fourier integral on R, distributions, and a
few advanced topics will be discussed afterwards. For a study deeper than what we present here,
the student may refer to, for instance, the following books: (i) Y. Katznelson, An Introduction to
Harmonic Analysis, (ii) L. Grafakos, Classical Fourier Analysis, (iii) M.A. Pinsky, Introduction to
Fourier Analysis and Wavelets, (iv) F.G. Friedlander, Introduction to the Theory of Distributions.
Recall that if X is a Hilbert space with orthonormal basis {e_n : n ∈ N}, then any x ∈ X has the series representation x = ∑_{n=1}^∞ ⟨x, e_n⟩ e_n. The idea of representing functions on the circle as a Fourier series is quite similar. We sketch this briefly.
Let T = R/Z denote the unit circle in the complex plane, which we parametrize either as [0, 1) or as [−1/2, 1/2). We remark that in several textbooks T is parametrized either as [0, 2π) or as [−π, π), and then one gets an extra 1/(2π) factor in certain integral expressions. Irrespective of how we parametrize T, keep in mind that T is a compact metric space. Note that C(T) = {f : T → C : f is continuous} is a Banach space w.r.t. the supremum norm ∥ · ∥_∞ defined as ∥f∥_∞ = sup{|f(t)| : t ∈ T}. We equip T = [0, 1) with the Lebesgue measure, and consider the Hilbert space L2(T) of square integrable complex valued functions on T, where the inner product is ⟨f, g⟩ = ∫_0^1 f(t) ḡ(t) dt. Let e_n(t) = e^{2πint} for n ∈ Z. We see that ⟨e_m, e_n⟩ = ∫_0^1 e_{m−n}(t) dt = 0 if m ≠ n and = 1 if m = n, which means {e_n : n ∈ Z} is an orthonormal set in L2(T).
Since span{e_n : n ∈ Z} is a subalgebra of C(T) separating points, vanishing nowhere, and closed under complex conjugation, it follows by the Stone-Weierstrass theorem (Theorem 7.33 of Rudin, Principles of Mathematical Analysis) that span{e_n : n ∈ Z} is dense in (C(T), ∥ · ∥_∞). Since ∥ · ∥_2 ≤ ∥ · ∥_∞, the topology induced on C(T) by ∥ · ∥_∞ is stronger than the L2-topology. Also, C(T) is dense in L2(T), a fact from Measure Theory. It follows that span{e_n : n ∈ Z} is dense in L2(T). Thus {e_n : n ∈ Z} is an orthonormal basis for the Hilbert space L2(T).
Hence any f ∈ L2(T) has a series representation f = ∑_{n∈Z} a_n e_n, where a_n = ⟨f, e_n⟩ = ∫_0^1 f e_{−n} = ∫_0^1 f(t) e^{−2πint} dt. By the change of variable θ = 2πt (which amounts to parametrizing T as [0, 2π)), we also obtain the classical formula a_n = (1/(2π)) ∫_0^{2π} f(θ) e^{−inθ} dθ.
Observe that the representation f = ∑_{n∈Z} a_n e_n means only that the sequence (∑_{n=−N}^N a_n e_n)_{N=0}^∞ of partial sums converges to f in the L2-norm. Since Lp-convergence (1 ≤ p < ∞) does not imply pointwise convergence, the series ∑_{n∈Z} a_n e_n(t) may fail to converge to f(t) at some points t ∈ T. Therefore, among other things, it is natural to discuss the following: (i) finding sufficient conditions for the pointwise convergence of the Fourier series, (ii) the rate of convergence of the Fourier series, (iii) other types of convergence that may hold even in the absence of pointwise convergence.
2. Convolution and approximate identities
We start with a little abstract theory that is applicable to both T and R^n. Let G be a locally compact second countable abelian group throughout this section. Read the basic theory of such groups from relevant books (for instance, G admits a complete separable metric). The most important fact is that such a group G has a Haar measure µ on it, which means
(i) µ ≠ 0 is a Borel measure on G (i.e., defined on the Borel σ-algebra of G),
(ii) (local finiteness) µ(K) <∞ for every compact set K ⊂ G, and
(iii) (translation invariance) µ(A+ x) = µ(A) for every Borel set A ⊂ G and every x ∈ G.
Moreover any other measure on G satisfying the above properties must be of the form cµ for
some c > 0. A Haar measure µ on G is always regular, which means
µ(A) = sup{µ(K) : K ⊂ A and K is compact} = inf{µ(U) : A ⊂ U ⊂ G and U is open}.
Note that the Lebesgue measure is a Haar measure on T and R^n.
Definition: Let µ be a Haar measure on G. For 1 ≤ p < ∞, let Lp(G) = {f : G → C : ∫_G |f|^p dµ < ∞}, with the usual convention that we identify f and g if they agree µ-almost everywhere. For f ∈ Lp(G), let ∥f∥_p = (∫_G |f|^p dµ)^{1/p}. When p = 2, L2(G) is a Hilbert space with the inner product ⟨f, g⟩ = ∫_G f ḡ dµ. Similarly one defines L∞(G) = {f : G → C : f is measurable, and bounded µ-almost everywhere} with norm defined as ∥f∥_∞ = inf{M > 0 : |f(x)| ≤ M for x outside a µ-null set}. Let C(G) = {f : G → C : f is continuous} and Cc(G) = {f ∈ C(G) : f has compact support}, where the support of f, denoted supp(f), is the closure of the set {x ∈ G : f(x) ≠ 0}. The following is a standard fact.
Exercise-1: Let 1 ≤ p < ∞. Then Cc(G) is dense in Lp(G). In particular, if G is compact, then
C(G) is dense in Lp(G). [Hint : Find the proof from a textbook. The idea of one proof is roughly
as follows. The indicator function 1K of a compact set K ⊂ G can be approximated by members
of Cc(G) using Urysohn’s lemma. Since the measure µ is regular, the indicator function 1A of any
Borel set A ⊂ G also has such approximations. And a general f ∈ Lp(G) can be approximated by
simple functions, i.e., functions of the form ∑_{j=1}^k a_j 1_{A_j}, where the A_j ⊂ G are Borel.]
The translation invariance of the Haar measure has the following important consequence:
Exercise-2: Let f ∈ L1(G), and let f_y(x) = f(x − y) for y ∈ G. Then,
(i) ∫_G f_y dµ = ∫_G f(x − y) dµ(x) = ∫_G f dµ, f_y ∈ L1(G), and ∥f_y∥_1 = ∥f∥_1 for every y ∈ G.
(ii) For each f ∈ L1(G), the map y ↦ f_y from G to L1(G) is continuous.
[Hint: (i) The equality ∫ f_y dµ = ∫ f dµ is clear when f is an indicator function 1_A with µ(A) < ∞. And any f ∈ L1(G) may be approximated by linear combinations of such indicator functions. (ii) Since ∥f_y − g_y∥_1 = ∥(f − g)_y∥_1 = ∥f − g∥_1, and since Cc(G) is dense in L1(G) by Exercise-1, it suffices to consider f ∈ Cc(G) and establish the continuity of y ↦ f_y at y = 0 ∈ G. Let K = supp(f), the (compact) support of f, and let A ⊂ G be a compact symmetric neighborhood of 0 ∈ G. Note that µ(K + A) < ∞ since K + A is compact. As f ∈ Cc(G) is uniformly continuous, given ε > 0, we can find a symmetric neighborhood U ⊂ G of 0 with U ⊂ A such that a − b ∈ U implies |f(a) − f(b)| < ε/µ(K + A). Then, for y ∈ U, we have ∥f − f_y∥_1 = ∫_G |f(x) − f(x − y)| dµ(x) ≤ ∫_{K+A} (ε/µ(K + A)) dµ = ε.]
Remark: As G is locally compact and second countable, G is σ-compact, and consequently the
Haar measure µ on G is σ-finite. Hence Fubini’s theorem holds for µ (this will be used repeatedly).
Certain Banach spaces admit an associative multiplication operation that distributes over addi-
tion, making the Banach space a Banach algebra. For instance, l∞ has the pointwise multiplication
(an)(bn) := (anbn), and C(T) has the pointwise multiplication (fg)(x) := f(x)g(x). The Banach
space L1(G) also admits a multiplication called convolution:
Definition: Let µ be a Haar measure on G. The convolution f ∗ g of f, g ∈ L1(G) is defined as f ∗ g(x) = ∫_G f(y) g(x − y) dµ(y) = ∫_G f(y) g_y(x) dµ(y), which can be roughly thought of as a weighted average of the translates g_y of g, weighted by f(y). Applying Fubini's theorem to Φ(x, y) := |f(y) g_y(x)|, we see that ∫_G ∫_G Φ dµ dµ = ∫_G |f(y)| (∫_G |g_y(x)| dµ(x)) dµ(y) = ∫_G |f(y)| ∥g∥_1 dµ(y) = ∥f∥_1 ∥g∥_1 < ∞. Hence (f ∗ g)(x) is well-defined for µ-almost every x ∈ G, and f ∗ g ∈ L1(G) with ∥f ∗ g∥_1 ≤ ∥f∥_1 ∥g∥_1.
Exercise-3: Let µ be a Haar measure on G and let f, g, h ∈ L1(G). Then,
(i) f ∗ g ∈ L1(G) with ∥f ∗ g∥_1 ≤ ∥f∥_1 ∥g∥_1 (already done above).
(ii) (Commutativity) f ∗ g = g ∗ f. [Hint: Substitute u = x − y in the integral representing f ∗ g.]
(iii) (Associativity) f ∗ (g ∗ h) = (f ∗ g) ∗ h.
(iv) (Linearity in each variable) (af + bg) ∗ h = a(f ∗ h) + b(g ∗ h) for a, b ∈ C.
(v) (Convolution commutes with translation) (f ∗ g)_z = f_z ∗ g = f ∗ g_z for every z ∈ G.
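The algebraic identities in Exercise-3 are easy to spot-check numerically on the finite abelian group Z_n, where counting measure is a Haar measure. The following minimal sketch (test data and helper names are illustrative, not from the notes) verifies commutativity, associativity, and the norm bound ∥f ∗ g∥_1 ≤ ∥f∥_1 ∥g∥_1.

```python
import numpy as np

def conv(f, g):
    # Convolution on the cyclic group Z_n with counting (Haar) measure:
    # (f * g)(x) = sum_y f(y) g(x - y).
    n = len(f)
    return np.array([sum(f[y] * g[(x - y) % n] for y in range(n)) for x in range(n)])

rng = np.random.default_rng(0)
n = 16
f = rng.standard_normal(n) + 1j * rng.standard_normal(n)
g = rng.standard_normal(n) + 1j * rng.standard_normal(n)
h = rng.standard_normal(n) + 1j * rng.standard_normal(n)

l1 = lambda u: np.abs(u).sum()
comm_ok = np.allclose(conv(f, g), conv(g, f))                     # Exercise-3(ii)
assoc_ok = np.allclose(conv(f, conv(g, h)), conv(conv(f, g), h))  # Exercise-3(iii)
norm_ok = l1(conv(f, g)) <= l1(f) * l1(g) + 1e-9                  # Exercise-3(i)
```

Of course, a finite check proves nothing about general G; it only illustrates that the identities are purely algebraic consequences of translation invariance.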
Remark: If G is compact, then Lp(G) ⊂ L1(G) for 1 < p < ∞ since µ(G) < ∞. This inclusion does not hold when G is non-compact, e.g., G = R^n. But still f ∗ g is defined for f, g belonging to certain Lp(G) spaces, and the result below gives information about the location of g ∗ f in such cases.
[101] Let µ be a Haar measure on G.
(i) If g ∈ L1(G) and f ∈ Cc(G), then g ∗ f ∈ L∞(G) and g ∗ f is uniformly continuous.
(ii) (Minkowski's inequality) Let 1 ≤ p ≤ ∞. If g ∈ L1(G) and f ∈ Lp(G), then g ∗ f ∈ Lp(G) with ∥g ∗ f∥_p ≤ ∥f∥_p ∥g∥_1.
(iii) (Young's inequality) Let 1 ≤ p, q, r ≤ ∞ be such that 1/p + 1/r = 1/q + 1. If g ∈ Lr(G) and f ∈ Lp(G), then f ∗ g ∈ Lq(G) and ∥f ∗ g∥_q ≤ ∥g∥_r ∥f∥_p.
Proof. (i) Since f ∈ Cc(G), f is uniformly continuous and bounded. We have |g ∗ f(x)| ≤ ∥g∥_1 ∥f∥_∞. Given ε > 0, choose a symmetric neighborhood U ⊂ G of 0 ∈ G such that |f(x) − f(y)| < ε/∥g∥_1 whenever x − y ∈ U. Then |(g ∗ f)(a) − (g ∗ f)(b)| ≤ ∫_G |g(y)| |f(a − y) − f(b − y)| dµ(y) < (ε/∥g∥_1) ∫_G |g(y)| dµ(y) = ε whenever a − b ∈ U, and hence g ∗ f is uniformly continuous.
(ii) The case p = 1 is already done, and the case p = ∞ is easy. So assume 1 < p < ∞. Let q = p/(p − 1) so that 1/p + 1/q = 1. Let h(y) = f(x − y), and let ν be the Borel measure on G given by ν(A) = ∫_A |g| dµ. Applying Hölder's inequality w.r.t. ν, we have

|g ∗ f(x)| ≤ ∫_G |f(x − y)| |g(y)| dµ(y) = ∫_G 1 · |h| dν ≤ ∥1∥_{Lq(ν)} ∥h∥_{Lp(ν)} = ∥g∥_1^{1/q} (∫_G |f(x − y)|^p |g(y)| dµ(y))^{1/p}.

Since p/q = p − 1, we get ∫_G |g ∗ f(x)|^p dµ(x) ≤ ∥g∥_1^{p−1} C, where, by Fubini and translation invariance,

C = ∫_G ∫_G |f(x − y)|^p |g(y)| dµ(y) dµ(x) = ∫_G (∫_G |f(x − y)|^p dµ(x)) |g(y)| dµ(y) = ∫_G (∫_G |f(x)|^p dµ(x)) |g(y)| dµ(y) = ∫_G ∥f∥_p^p |g(y)| dµ(y) = ∥g∥_1 ∥f∥_p^p.

Thus ∥g ∗ f∥_p^p ≤ ∥g∥_1^p ∥f∥_p^p, i.e., ∥g ∗ f∥_p ≤ ∥g∥_1 ∥f∥_p.
(iii) This is a generalization of (ii) with a similar but more complicated proof, which is left as a reading assignment. See Theorem 1.2.12 of Grafakos, Classical Fourier Analysis. □
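Young's inequality can likewise be sanity-checked on Z_n with counting measure. The exponent triples below all satisfy 1/p + 1/r = 1/q + 1; this is only a numerical illustration under arbitrary test data, not a proof.

```python
import numpy as np

def conv(f, g):
    # convolution on Z_n with counting (Haar) measure
    n = len(f)
    return np.array([sum(f[y] * g[(x - y) % n] for y in range(n)) for x in range(n)])

def lp(u, p):
    # l^p norm w.r.t. counting measure, with p = inf allowed
    if p == float("inf"):
        return float(np.abs(u).max())
    return float((np.abs(u) ** p).sum() ** (1 / p))

rng = np.random.default_rng(1)
f, g = rng.standard_normal(12), rng.standard_normal(12)

# triples (p, r, q) with 1/p + 1/r = 1/q + 1
triples = [(1, 1, 1), (1.5, 1.5, 3), (2, 1, 2), (2, 2, float("inf"))]
young_ok = all(lp(conv(f, g), q) <= lp(f, p) * lp(g, r) + 1e-9 for (p, r, q) in triples)
```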
In general, L1(G) may not have a unit element for convolution, i.e., there may not exist g ∈ L1(G)
with f ∗ g = f for every f ∈ L1(G); see the Remark after [103]. However, L1(G) has what is called
approximate identities: a parametrized family of functions which, in the limiting case, behaves as
a unit element for convolution. First we will define an approximate identity formally; its behavior
justifying the name ‘approximate identity’ will be proved in [102] below.
Definition: Let µ be a Haar measure on G, and let 0 < b < ∞. A parametrized family {K_a : 0 < a < b} in L1(G) is an approximate identity for L1(G) if the following three properties are satisfied:
(A1) (Normalization) ∫_G K_a dµ = 1 for every a.
(A2) (L1-boundedness) sup{∥K_a∥_1 : 0 < a < b} < ∞.
(A3) (L1-concentration at 0 ∈ G) For any neighborhood U ⊂ G of 0 ∈ G, lim_{a→0} ∫_{G\U} |K_a| dµ = 0.
Note that if K_a ≥ 0, then (A2) follows from (A1).
Sometimes the approximate identity satisfies an additional property:
(A4) (L∞-concentration at 0 ∈ G) For any neighborhood U ⊂ G of 0 ∈ G, lim_{a→0} sup{|K_a(x)| : x ∈ G \ U} = 0. (Note that (A4) implies (A3) when µ(G) < ∞, i.e., when G is compact.)
Remark: Soon we will encounter explicit examples of approximate identities on L1(T) in relation with the theory of Fourier series. For the moment, note that if g ∈ L1(R^n) satisfies g ≥ 0 and ∫_{R^n} g dµ = 1, then we obtain an approximate identity {K_a}_{a>0} for L1(R^n) by putting K_a(x) = a^{−n} g(x/a). We have ∫_{R^n} K_a(x) dx = ∫_{R^n} a^{−n} g(x/a) dx = ∫_{R^n} g(y) dy = 1 by the change of variable y = x/a, where we used the fact that a^{−n} is the determinant of the Jacobian of the map x ↦ x/a from R^n to R^n. And ∫_{|x|>δ} |K_a(x)| dx = ∫_{|y|>δ/a} |g(y)| dy → 0 as a → 0.
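For a concrete instance of this remark, take g to be the standard Gaussian density on R (the case n = 1). A discretized check of the normalization (A1) and the concentration (A3) might look as follows; the grid, the cutoff δ, and the tolerances are arbitrary choices made for illustration.

```python
import numpy as np

def g(x):
    # standard Gaussian density: g >= 0 and integral 1
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

def K(a, x):
    # K_a(x) = a^{-1} g(x/a), the n = 1 case of a^{-n} g(x/a)
    return g(x / a) / a

x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
delta = 0.5
scales = (1.0, 0.3, 0.1)
total = {a: K(a, x).sum() * dx for a in scales}                         # stays ~ 1
tail = {a: (K(a, x) * (np.abs(x) > delta)).sum() * dx for a in scales}  # -> 0 as a -> 0
```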
Remark: In the above definition, we parametrized the approximate identity with a ∈ (0, b) and con-
sidered the limit as a→ 0. However, we can define approximate identity using other parametriza-
tions and other limiting processes; for instance, we may parametrize with n ∈ N and consider the
limit as n→ ∞, or parametrize with r ∈ (0, 1) and consider the limit as r → 1.
A general strategy: While estimating integral expressions in Fourier Theory, the following strategy
will be often followed: split the integral into two parts, one part for a neighborhood of 0, and the
other part for the region outside; then estimate each integral separately.
[102] (Justifying the name 'approximate identity') Let µ be a Haar measure on G, and let {K_a : 0 < a < b} be an approximate identity for L1(G).
(i) If f ∈ Cc(G), then lim_{a→0} ∥K_a ∗ f − f∥_∞ = 0.
(ii) If 1 ≤ p < ∞ and f ∈ Lp(G), then lim_{a→0} ∥K_a ∗ f − f∥_p = 0.
(iii) Assume in addition that G is compact and {K_a : 0 < a < b} satisfies the L∞-concentration condition (A4). If g ∈ L1(G) satisfies lim_{x→0} g(x) = c, then lim_{a→0} ∫_G K_a(x) g(x) dµ(x) = c.
Proof. (i) Using condition (A1), we note that K_a ∗ f(x) − f(x) = ∫_G K_a(y)(f(x − y) − f(x)) dµ(y). Let ε > 0 be given, and let M = sup{∥K_a∥_1 : 0 < a < b}. Since f ∈ Cc(G) is uniformly continuous, we may choose a neighborhood U ⊂ G of 0 ∈ G such that |f(x − y) − f(x)| ≤ ε/(2M) for every y ∈ U and x ∈ G. Using condition (A3), choose a_0 ∈ (0, b) such that ∫_{G\U} |K_a| dµ < ε/(4∥f∥_∞) for every a ∈ (0, a_0). Then for 0 < a < a_0, we have

|K_a ∗ f(x) − f(x)| ≤ ∫_U |K_a(y)| |f(x − y) − f(x)| dµ(y) + ∫_{G\U} |K_a(y)| |f(x − y) − f(x)| dµ(y)
≤ (ε/(2M)) ∫_U |K_a(y)| dµ(y) + 2∥f∥_∞ ∫_{G\U} |K_a(y)| dµ(y) ≤ ε/2 + ε/2 = ε

by the choice of M and a_0. As this holds for every x ∈ G, ∥K_a ∗ f − f∥_∞ ≤ ε for every a ∈ (0, a_0).
(ii) If f ∈ Lp(G) and g ∈ Cc(G), we have ∥K_a ∗ f − f∥_p ≤ ∥K_a ∗ (f − g)∥_p + ∥K_a ∗ g − g∥_p + ∥g − f∥_p. Since Cc(G) is dense in Lp(G), we can make ∥g − f∥_p arbitrarily small by choosing g suitably. By [101](ii), ∥K_a ∗ (f − g)∥_p ≤ ∥K_a∥_1 ∥f − g∥_p ≤ M∥f − g∥_p, where M = sup{∥K_a∥_1 : 0 < a < b}. Finally, ∥K_a ∗ g − g∥_p ≤ ∥K_a ∗ g − g∥_∞ → 0 as a → 0 by part (i). From these observations, it follows that ∥K_a ∗ f − f∥_p → 0 as a → 0.
(iii) We have ∫_G K_a(x) g(x) dµ(x) − c = ∫_G K_a(x)(g(x) − c) dµ(x) by (A1). Let M = sup{∥K_a∥_1 : 0 < a < b}. Given ε > 0, choose a neighborhood U ⊂ G of 0 ∈ G such that |g(x) − c| < ε/(2M) for x ∈ U. Then choose a_0 ∈ (0, b) by (A4) such that sup{|K_a(x)| : x ∈ G \ U} < ε/(2(∥g∥_1 + |c|µ(G))) for a ∈ (0, a_0), where the compactness of G ensures that µ(G) < ∞. Then, for 0 < a < a_0, as in the proof of (i),

|∫_G K_a g dµ − c| ≤ (ε/(2M)) ∫_U |K_a| dµ + (ε/(2(∥g∥_1 + |c|µ(G)))) ∫_{G\U} |g(x) − c| dµ ≤ ε/2 + ε/2 = ε. □
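A discretized illustration of [102](ii) on T = [0, 1): convolve a step function with a concentrating (wrapped) Gaussian kernel and watch ∥K_a ∗ f − f∥_1 shrink as a → 0. The grid size, kernel shape, and scales below are illustrative choices, not part of the notes.

```python
import numpy as np

m = 4096
t = np.arange(m) / m
dt = 1.0 / m
f = (t < 0.5).astype(float)                 # a step function in L1(T)

def kernel(a):
    d = np.minimum(t, 1.0 - t)              # distance to 0 on the circle
    k = np.exp(-d**2 / (2 * a**2))
    return k / (k.sum() * dt)               # normalize so the integral is 1 (property (A1))

def conv_T(k, f):
    # circular convolution approximating (k * f)(x) = int_T k(y) f(x - y) dy
    return np.real(np.fft.ifft(np.fft.fft(k) * np.fft.fft(f))) * dt

errs = [np.abs(conv_T(kernel(a), f) - f).sum() * dt for a in (0.1, 0.03, 0.01)]
```

The L1 error is dominated by the two jumps of f, so it decreases roughly linearly in a.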
3. Fourier series: preliminaries, and a divergence result
Recall that we parametrize the circle T = R/Z as [0, 1) (sometimes also as [−1/2, 1/2)), with addition performed modulo 1. Let e_r(t) = e^{2πirt} for r ∈ R, and note that ē_r = e_{−r}, where the bar stands for complex conjugation. Since the Lebesgue measure on T is finite, we have Lp(T) ⊃ Lq(T) for 1 ≤ p ≤ q < ∞, and thus the largest among them is L1(T). Whenever needed, it will be implicitly assumed that any f ∈ L1(T) is extended to the whole of R with period 1.
Definition: The Fourier series of f ∈ L1(T) is formally defined as ∑_{n∈Z} f̂(n) e_n, where f̂(n) = ∫_0^1 f(t) ē_n(t) dt = ∫_0^1 f(t) e_{−n}(t) dt. Here f̂(n) is called the nth Fourier coefficient of f for n ∈ Z.
Remark: Note that f̂(n)e_n + f̂(−n)e_{−n} = (f̂(n) + f̂(−n)) cos 2πnt + i(f̂(n) − f̂(−n)) sin 2πnt. Hence we have the formal equality of Fourier series

∑_{n∈Z} f̂(n) e_n = f̂(0) + ∑_{n=1}^∞ (f̂(−n) e_{−n} + f̂(n) e_n) = A_0 + ∑_{n=1}^∞ (A_n cos 2πnt + B_n sin 2πnt),

where A_0 = f̂(0) = ∫_T f(t) dt, A_n = f̂(n) + f̂(−n) = ∫_T f(t)(e_{−n}(t) + e_n(t)) dt = 2 ∫_T f(t) cos 2πnt dt, and B_n = i(f̂(n) − f̂(−n)) = ∫_T i f(t)(e_{−n}(t) − e_n(t)) dt = 2 ∫_T f(t) sin 2πnt dt. Observe that if f is an even function, then B_n = 0 for all n since sin is an odd function; and if f is an odd function, then A_n = 0 for all n since cos is an even function.
Exercise-4: Restricted to L2(T), the map f ↦ (f̂(n))_{n∈Z} from L2(T) to l2(Z) is an isometric isomorphism of Hilbert spaces. [Hint: See the Introduction, and use Parseval's identity.]
Example: Let T = [0, 1) and f : T → C be f(t) = t. We have f̂(0) = ∫_0^1 t dt = 1/2. For n ∈ Z \ {0}, integration by parts gives f̂(n) = ∫_0^1 t e_{−n}(t) dt = (−2πin)^{−1} t e_{−n}(t)|_0^1 − (−2πin)^{−1} ∫_0^1 e_{−n}(t) dt = (−2πin)^{−1} − 0 = (−2πin)^{−1}. Also ∥f∥_2^2 = ∫_0^1 t^2 dt = 1/3. Now ∥f∥_2^2 = ∑_{n∈Z} |f̂(n)|^2 by Exercise-4. Hence 1/3 = 1/4 + 2 ∑_{n=1}^∞ (4π^2 n^2)^{−1}. Simplification yields π^2/6 = ∑_{n=1}^∞ n^{−2}.
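The computation in this example is easy to replicate numerically: approximate f̂(n) for f(t) = t by a Riemann sum, compare with (−2πin)^{−1}, and watch Parseval reproduce π²/6. The grid size and truncation points below are arbitrary.

```python
import numpy as np

m = 1 << 16
t = np.arange(m) / m
f = t                                   # f(t) = t on T = [0, 1)

def coeff(n):
    # Riemann-sum approximation of the integral of f(t) e^{-2 pi i n t} over [0, 1)
    return (f * np.exp(-2j * np.pi * n * t)).mean()

c1 = coeff(1)
exact1 = 1 / (-2j * np.pi)              # the value (-2 pi i n)^{-1} for n = 1
# Parseval: ||f||_2^2 = 1/3 = 1/4 + 2 sum_{n >= 1} (4 pi^2 n^2)^{-1}
partial = 0.25 + 2 * sum(abs(coeff(n)) ** 2 for n in range(1, 200))
basel = sum(1 / n**2 for n in range(1, 100001))   # should approach pi^2/6
```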
Exercise-5: (Basic properties of the Fourier coefficients) Let f, g ∈ L1(T) and a, b ∈ C. Then, for every n ∈ Z we have:
(i) (Linearity) (af + bg)ˆ(n) = a f̂(n) + b ĝ(n).
(ii) (Conjugation) ( f̄ )ˆ(n) equals the complex conjugate of f̂(−n).
(iii) If f_s ∈ L1(T) is defined as f_s(t) = f(t − s) for s ∈ T, then (f_s)ˆ(n) = f̂(n) e_{−n}(s) = f̂(n) e_n(−s).
(iv) (Another expression for the Fourier coefficient) f̂(n) = (1/2) ∫_0^1 [f(t) − f(t + 1/(2n))] e_{−n}(t) dt.
(v) e_n ∗ f = f ∗ e_n = f̂(n) e_n.
(vi) (The Fourier coefficient of a convolution is the product of the Fourier coefficients) (f ∗ g)ˆ(n) = f̂(n) ĝ(n).
[Hint: (iii) (f_s)ˆ(n) = ∫_T f(t − s) e_{−n}(t) dt = ∫_T f(y) e_{−n}(y + s) dy = f̂(n) e_{−n}(s) by putting t = y + s. (iv) In f̂(n) = ∫_T f(t) e_{−n}(t) dt, substitute t = y + 1/(2n) to get f̂(n) = −∫_T f(y + 1/(2n)) e_{−n}(y) dy, and add this to the first expression for f̂(n). (v) f ∗ e_n(t) = ∫ f(s) e_n(t − s) ds = (∫ f(s) e_{−n}(s) ds) e_n(t) = f̂(n) e_n(t). (vi) Using (v) we have (f ∗ g)ˆ(n) e_n = (f ∗ g) ∗ e_n = f ∗ (g ∗ e_n) = f ∗ (ĝ(n) e_n) = ĝ(n)(f ∗ e_n) = ĝ(n) f̂(n) e_n. And we may cancel e_n ≠ 0 from both ends.]
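Property (vi) has an exact discrete counterpart: on a uniform grid, circular convolution corresponds to pointwise multiplication of discrete Fourier coefficients, so a numerical check passes essentially to machine precision. The test functions below are arbitrary smooth samples on T.

```python
import numpy as np

m = 1 << 12
t = np.arange(m) / m
dt = 1.0 / m
f = np.cos(2 * np.pi * t) + t * (1 - t)      # arbitrary test functions on T
g = np.sin(4 * np.pi * t) + 0.5

def coeff(h, n):
    # Riemann-sum approximation of the nth Fourier coefficient of h
    return (h * np.exp(-2j * np.pi * n * t)).sum() * dt

def conv_T(u, v):
    # circular convolution approximating the convolution integral on T
    return np.real(np.fft.ifft(np.fft.fft(u) * np.fft.fft(v))) * dt

fg = conv_T(f, g)
conv_thm_ok = all(abs(coeff(fg, n) - coeff(f, n) * coeff(g, n)) < 1e-6
                  for n in range(-3, 4))
```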
The smoother the function f, the faster the rate of convergence of (f̂(n)) to 0 as |n| → ∞.
[103] (Rate of decay of Fourier coefficients) Let f ∈ L1(T). Then,
(i) |f̂(n)| ≤ ∥f∥_1 for every n ∈ Z, and hence (f̂(n))_{n∈Z} ∈ l∞(Z) (this is improved below). This has the following consequence by linearity: if (f_k) → f in L1(T), then (f̂_k(n)) → f̂(n) uniformly in n.
(ii) (Riemann-Lebesgue lemma) lim_{|n|→∞} f̂(n) = 0, i.e., (f̂(n))_{n∈Z} ∈ c_0(Z) := {(x_n) : lim_{|n|→∞} x_n = 0}.
(iii) (Generalized Riemann-Lebesgue lemma) lim_{r∈R, |r|→∞} ∫_T f(t) e_r(t) dt = 0, and consequently lim_{r∈R, |r|→∞} ∫_T f(t) cos 2πrt dt = 0 and lim_{r∈R, |r|→∞} ∫_T f(t) sin 2πrt dt = 0.
(iv) If f is k-times differentiable with f^(k) ∈ L1(T), then we have (f^(k))ˆ(n) = (2πin)^k f̂(n), and hence lim_{|n|→∞} |n|^k f̂(n) = 0 (this means (f̂(n)) goes to 0 faster than (|n|^{−k})).

Proof. (i) |f̂(n)| ≤ ∫_T |f e_{−n}| dt = ∫_T |f| dt = ∥f∥_1 since |e_{−n}| = 1.
(ii) Since C(T) is dense in L1(T) by Exercise-1, and since |f̂(n) − ĝ(n)| ≤ ∥f − g∥_1, it suffices to prove the result for f ∈ C(T). And the result in this case follows from Exercise-5(iv), where we noted the expression f̂(n) = (1/2) ∫_T [f(t) − f(t + 1/(2n))] e_{−n}(t) dt.
(iii) To prove the first statement, note as above that it suffices to consider f ∈ C(T), and then note that ∫_T f(t) e_r(t) dt = (1/2) ∫_T [f(t) − f(t − 1/(2r))] e_r(t) dt for r ∈ R \ {0}, as in the hint of Exercise-5(iv). To prove the second statement, assume f is real valued and note that e_r(t) = cos 2πrt + i sin 2πrt.
(iv) Integration by parts gives f̂(n) = ∫_0^1 f e_{−n} dt = (−2πin)^{−1} f(t) e_{−n}(t)|_0^1 + (2πin)^{−1} ∫_0^1 f′ e_{−n} dt = 0 + (2πin)^{−1} ∫_0^1 f′ e_{−n} dt = (2πin)^{−1} (f′)ˆ(n). That is, (f′)ˆ(n) = 2πin f̂(n), and inductively (f^(k))ˆ(n) = (2πin)^k f̂(n). Finally, the assertion lim_{|n|→∞} |n|^k f̂(n) = 0 follows by applying part (ii) to f^(k). □
Remark: We may explain why there is no multiplicative identity for convolution in L1(T). Suppose there is f ∈ L1(T) with f ∗ g = g for every g ∈ L1(T). Consider g = ∑_{n=−∞}^∞ e_n/(1 + n^2), which converges uniformly, so g ∈ C(T) ⊂ L1(T), and note that ĝ(n) = 1/(1 + n^2) ≠ 0 for every n ∈ Z. If g = f ∗ g, then ĝ(n) = (f ∗ g)ˆ(n) = f̂(n) ĝ(n), and hence we must have f̂(n) = 1 for every n ∈ Z, which contradicts the Riemann-Lebesgue lemma.
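The Riemann-Lebesgue lemma and the decay estimate [103](iv) are visible numerically: Fourier coefficients of a function with a jump decay like 1/n, while those of a C^∞ function fall below round-off almost immediately. The particular test functions and thresholds are illustrative.

```python
import numpy as np

m = 1 << 14
t = np.arange(m) / m

def coeff(h, n):
    # Riemann-sum approximation of the nth Fourier coefficient of h
    return (h * np.exp(-2j * np.pi * n * t)).mean()

step = (t < 0.5).astype(float)            # discontinuous: |coefficient| = 1/(pi n) for odd n
smooth = np.exp(np.cos(2 * np.pi * t))    # C-infinity: faster-than-polynomial decay
ns = [2**k + 1 for k in range(3, 10)]     # odd frequencies, so the step coefficients are nonzero
step_decay = [abs(coeff(step, n)) for n in ns]
smooth_decay = [abs(coeff(smooth, n)) for n in ns]
```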
For later use, we note down the following consequences of [103](iii).
[104] Let g, h ∈ L1(T) and let b < c be in T. Then,
(i) lim_{N→∞} ∫_b^c g(t) sin(2N + 1)πt dt = 0.
(ii) If the function t ↦ t^{−1} h(t) is bounded almost everywhere in a neighborhood of 0, then lim_{N→∞} ∫_b^c t^{−1} h(t) sin(2N + 1)πt dt = 0.
Proof. (i) Note that sin(2N + 1)πt = sin 2π(N + 1/2)t, and apply [103](iii) to 1_{(b,c)} g ∈ L1(T).
(ii) Let g(t) = t^{−1} h(t). By hypothesis, there are δ ∈ (0, 1/2) and M > 0 such that |g(t)| < M for almost every t ∈ (−δ, δ). Hence ∫_T |g(t)| dt ≤ ∫_{|t|<δ} M dt + ∫_{|t|>δ} δ^{−1} |h(t)| dt ≤ 2δM + δ^{−1} ∥h∥_1 < ∞, and thus g ∈ L1(T). Apply part (i) to g. □
Question: When does the Fourier series converge pointwise?
Definition: The Nth Dirichlet kernel¹ D_N ∈ C(T) is defined as D_N = ∑_{n=−N}^N e_{−n} = ∑_{n=−N}^N e_n. Note that D_N(t) = 1 + 2 ∑_{n=1}^N cos 2πnt, and therefore D_N is a real valued even function. The Nth partial sum s_N(f) of the Fourier series of f ∈ L1(T) is defined as s_N(f) = ∑_{n=−N}^N f̂(n) e_n.
Exercise-6: (s_N in terms of D_N) For f ∈ L1(T), we have:
(i) s_N(f) = D_N ∗ f, and consequently s_N(f) is real valued whenever f is real valued.
(ii) If T is parametrized as [−1/2, 1/2), then s_N(f, a) = ∫_0^{1/2} D_N(t)[f(a + t) + f(a − t)] dt for all a ∈ T.
[Hint: (i) s_N(f) = f ∗ (∑_{n=−N}^N e_n) = f ∗ D_N = D_N ∗ f by Exercise-5(v).
(ii) s_N(f, a) = D_N ∗ f(a) = (∫_{−1/2}^0 + ∫_0^{1/2}) D_N(t) f(a − t) dt. Now the substitution t = −y converts the first integral into ∫_0^{1/2} D_N(y) f(a + y) dy since D_N is even.]
¹See the end of Section 8 for a picture of the graph of D_N.
Remark: s_N : L1(T) → L1(T) is linear, but is not a positive operator (since D_N is not ≥ 0). For any N ∈ N, we can construct f ∈ L1(T) with f ≥ 0 and s_N(f) not ≥ 0 as follows. Choose b ∈ T with D_N(b) < 0, and then choose ε > 0 and δ > 0 by the continuity of D_N such that D_N(t) < −ε for every t ∈ [b, b + 2δ]. Note that s_N(f, a) = ∫_0^1 D_N(t) f(a − t) dt = ∫_0^1 D_N(y) f(a + y) dy since D_N is even. If we take f = 1_{[b+δ, b+2δ]}, then s_N(f, a) = ∫_{b+δ−a}^{b+2δ−a} D_N(y) dy < −εδ < 0 for every a ∈ (0, δ).
Below we show that even though {D_N} fails to satisfy property (A2) in the definition of an approximate identity, {D_N} satisfies (A1) and a property similar to (A3).
[105] (i) D_N(t) = sin((2N + 1)πt)/sin(πt) for t ∈ T \ Z, and D_N(t) = 2N + 1 for t ∈ Z.
(ii) (Normalization property (A1) holds) ∫_0^1 D_N(t) dt = 1.
(iii) (L1-boundedness (A2) fails) ∥D_N∥_1 → ∞ as N → ∞.
(iv) (A property similar to the L1-concentration (A3) holds) Let 0 < δ < 1/2. Then we have lim_{N→∞} ∫_{−1/2}^{−δ} D_N(t) dt = 0 = lim_{N→∞} ∫_δ^{1/2} D_N(t) dt; more generally, lim_{N→∞} ∫_{−1/2}^{−δ} D_N(t) h(t) dt = 0 = lim_{N→∞} ∫_δ^{1/2} D_N(t) h(t) dt for every h ∈ L1(T).
Proof. (i) 2i sin(πt) D_N(t) = (e_{1/2}(t) − e_{−1/2}(t)) D_N(t) = ∑_{n=−N}^N (e_{n+1/2}(t) − e_{n−1/2}(t)) = e_{N+1/2}(t) − e_{−(N+1/2)}(t) = 2i sin((2N + 1)πt), by telescoping.
(ii) ∫_0^1 D_N(t) dt = ∑_{n=−N}^N ∫_0^1 e_n = 1 since ∫_0^1 e_0 = 1 and ∫_0^1 e_n = 0 for n ≠ 0.
(iii) Since sin x < x for x > 0, we have ∥D_N∥_1 = ∫_0^1 |D_N(t)| dt ≥ ∫_0^1 |sin((2N + 1)πt)| (πt)^{−1} dt. Put y = (2N + 1)πt and note dy/y = dt/t. We see ∥D_N∥_1 ≥ ∫_0^{(2N+1)π} |sin y| (πy)^{−1} dy ≥ ∑_{n=1}^{2N+1} ∫_{(n−1)π}^{nπ} |sin y| (nπ^2)^{−1} dy = ∑_{n=1}^{2N+1} 2(nπ^2)^{−1} → ∞ as N → ∞. We used: ∫_{(n−1)π}^{nπ} |sin y| dy = 2.
(iv) Since ∫_{−1/2}^{−δ} D_N(t) h(t) dt = ∫_δ^{1/2} D_N(t) h(−t) dt (as D_N is even), it suffices to show lim_{N→∞} ∫_δ^{1/2} D_N(t) h(t) dt = 0 for every h ∈ L1(T). Note by (i) that ∫_δ^{1/2} D_N(t) h(t) dt = ∫_T g(t) sin((2N + 1)πt) dt, where the function g defined as g(t) = 1_{(δ,1/2)}(t) h(t)/sin(πt) belongs to L1(T) since sin(πt) > sin(πδ) for t ∈ (δ, 1/2). Therefore ∫_δ^{1/2} D_N(t) h(t) dt = ∫_T g(t) sin((2N + 1)πt) dt → 0 as N → ∞ by [104](i). □
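The divergence in (iii) is slow: the L1 norms ∥D_N∥_1 (the Lebesgue constants) are known to grow like (4/π²) log N. A direct quadrature of ∥D_N∥_1 using the closed form from (i) exhibits this logarithmic growth; the grid size and thresholds below are ad hoc choices.

```python
import numpy as np

def dirichlet_L1(N, m=200001):
    # ||D_N||_1 = integral over [0,1] of |sin((2N+1) pi t) / sin(pi t)|, via a midpoint rule
    t = (np.arange(m) + 0.5) / m          # midpoints avoid the removable point t = 0
    return float(np.abs(np.sin((2 * N + 1) * np.pi * t) / np.sin(np.pi * t)).mean())

norms = {N: dirichlet_L1(N) for N in (4, 16, 64, 256)}
```

Each fourfold increase of N should add roughly (4/π²) log 4 ≈ 0.56 to the norm.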
To be alert to the possible failure of pointwise convergence of the Fourier series even in the case of continuous functions, we establish a negative result at the outset. To use in the
proof, keep in mind the Uniform boundedness theorem: if (Tα)α∈J is a pointwise bounded family of
bounded linear operators from a Banach space to a normed space, then sup{∥Tα∥ : α ∈ J} <∞.
Remark: The family {D_N} as N → ∞ cannot be an approximate identity on L1(T) because of the failure of L1-boundedness proved in [105](iii) above. This is the essential reason behind the failure of pointwise convergence for a general Fourier series.
[106] (Failure of pointwise convergence of Fourier series even for continuous functions) Parametrize T as [−1/2, 1/2). There is f ∈ C(T) with sup{|s_N(f, 0)| : N ∈ N} = ∞. So the sequence (s_N(f, 0)) of Fourier partial sums of f at 0 does not converge to f(0).
Proof. Define linear functionals ϕ_N : (C(T), ∥ · ∥_∞) → C as ϕ_N(f) = s_N(f, 0). They are bounded since |ϕ_N(f)| ≤ ∑_{n=−N}^N |f̂(n)| ≤ (2N + 1)∥f∥_∞. We need to show (ϕ_N) is not pointwise bounded. By the Uniform boundedness theorem, it suffices to show sup{∥ϕ_N∥ : N ∈ N} = ∞. For this purpose, we will show ∥ϕ_N∥ ≥ ∥D_N∥_1 for every N ∈ N. This suffices since ∥D_N∥_1 → ∞ by [105](iii).
Fix N ∈ N and let h ∈ L1(T) be such that hD_N = |D_N|, i.e., we take h(t) = 1 if D_N(t) ≥ 0 and h(t) = −1 if D_N(t) < 0. We may find a sequence (f_k) in C(T) such that ∥f_k∥_∞ ≤ 1 and (f_k) → h pointwise (check). Note that ϕ_N(f_k) = s_N(f_k, 0) = ∫_T f_k(t) D_N(0 − t) dt = ∫_T f_k(t) D_N(t) dt since D_N(−t) = D_N(t). Since the f_k's are dominated by 1 ∈ L1(T), we get by the Lebesgue dominated convergence theorem that lim_{k→∞} ϕ_N(f_k) = lim_{k→∞} ∫_T f_k(t) D_N(t) dt = ∫_T h(t) D_N(t) dt = ∫_T |D_N(t)| dt = ∥D_N∥_1. Since ∥f_k∥_∞ ≤ 1, we conclude ∥ϕ_N∥ ≥ ∥D_N∥_1, and we are done. □
Remark: However, ∥s_N(f)∥_∞ cannot grow very fast: it is known that lim_{N→∞} ∥s_N(f)∥_∞/log N = 0 for f ∈ C(T); see Proposition 1.6.6 of M.A. Pinsky, Introduction to Fourier Analysis and Wavelets.
Remark: We sketch a little history. Kolmogorov gave an example of f ∈ L1(T) whose Fourier series
diverges almost everywhere. In contrast, Carleson showed that the Fourier series of any f ∈ L2(T)
converges pointwise to f for almost every t ∈ T. Hunt extended this result to every f ∈ Lp(T) for
every 1 < p < ∞. These results are beyond our scope, and we will not prove them. However, we will prove pointwise convergence of Fourier series under some smoothness assumptions (we need to prevent f from oscillating too much). A little later, we will also prove the Fejér-Lebesgue theorem, which says that the averages of the partial sums of the Fourier series of any f ∈ L1(T) converge to f pointwise almost everywhere (i.e., outside a Lebesgue null set).
4. Sufficient conditions for pointwise convergence of Fourier series
Philosophy: If f ∈ L1(T) satisfies some smoothness condition that prevents f from oscillating too much, then we may expect (s_N(f)) to converge to f pointwise.
We start with some basic observations.
Exercise-7: Let f ∈ L1(T), w ∈ C and a ∈ T = [−1/2, 1/2).
(i) For any δ ∈ (0, 1/2) we have

s_N(f, a) − w = ∫_T D_N(t)(f(a − t) − w) dt = (∫_{|t|<δ} + ∫_{|t|>δ}) D_N(t)(f(a − t) − w) dt.

(ii) lim_{N→∞} s_N(f, a) = w ⇔ there exists δ ∈ (0, 1/2) such that lim_{N→∞} ∫_{|t|<δ} D_N(t)(f(a − t) − w) dt = 0.
(iii) (Riemann localization principle) If f ≡ 0 in a neighborhood of a, then lim_{N→∞} s_N(f, a) = 0.
[Hint: (i) w = ∫_T D_N(t) w dt since ∫_T D_N(t) dt = 1. Also, s_N(f) = D_N ∗ f. (ii) As N → ∞, the integral over {|t| > δ} in (i) goes to 0 by [105](iv), since the function t ↦ f(a − t) − w belongs to L1(T).]
Improving Exercise-7(ii), we get:
[107] (Pointwise convergence criterion - 1) Let f ∈ L1(T), w ∈ C and a ∈ T = [−1/2, 1/2). Then,
(i) lim_{N→∞} s_N(f, a) = w ⇔ there exists δ ∈ (0, 1/2) with lim_{N→∞} ∫_{|t|<δ} t^{−1}(f(a − t) − w) sin(2N + 1)πt dt = 0.
(ii) (Dini's test) If t ↦ t^{−1}(f(a − t) − w) belongs to L1(T), then lim_{N→∞} s_N(f, a) = w.
(iii) If t ↦ t^{−1}(f(a − t) − w) is bounded a.e. in a neighborhood of 0, then lim_{N→∞} s_N(f, a) = w.

Proof. (i) By Exercise-7(ii), lim_{N→∞} s_N(f, a) = w ⇔ there exists δ ∈ (0, 1/2) such that lim_{N→∞} ∫_{|t|<δ} (f(a − t) − w) D_N(t) dt = 0. Note that the integrand can be written as

(f(a − t) − w) D_N(t) = (1/sin(πt) − 1/(πt) + 1/(πt)) (f(a − t) − w) sin(2N + 1)πt.

Using l'Hôpital's rule, check that lim_{s→0}(1/sin s − 1/s) = 0. Hence g : T → C defined as g(0) = 0 and g(t) = 1/sin(πt) − 1/(πt) for t ≠ 0 satisfies g ∈ C(T) ⊂ L1(T). Therefore t ↦ g(t)(f(a − t) − w) belongs to L1(T), and so by [104](i) we have lim_{N→∞} ∫_{|t|<δ} g(t)(f(a − t) − w) sin(2N + 1)πt dt = 0 for any δ ∈ (0, 1/2). The required result follows.
(ii) Use part (i), and apply [104](i) to g(t) := t^{−1}(f(a − t) − w).
(iii) Use part (i), and apply [104](ii) to h(t) := f(a − t) − w. □
Definition: Let f ∈ L1(T) (assumed to be extended to the whole of R with period 1), and let a ∈ T. We say² f is Lipschitz at a if there exist λ ≥ 1 and a neighborhood U ⊂ T of a such that |f(a) − f(b)| ≤ λ|a − b| for every b ∈ U. Note that if f is differentiable at a, then f is Lipschitz at a (∵ define g(a) = f′(a) and g(t) = (f(t) − f(a))/(t − a) for t ≠ a. Then f(t) − f(a) = g(t)(t − a), and g is continuous at a, so that |g(t)| ≤ λ := |g(a)| + 1 in a neighborhood of a). We say f is Hölder continuous at a if there exist λ ≥ 1, α > 0, and a neighborhood U ⊂ T of a such that |f(a) − f(b)| ≤ λ|a − b|^α for every b ∈ U.
[108] Let f ∈ L1(T) and a ∈ T. Then each of the following implies lim_{N→∞} s_N(f, a) = f(a):
(i) f is differentiable at a.
(ii) f is Lipschitz at a.
2These definitions are to be understood modulo a null set since f ∈ L1(T).
(iii) f is Hölder continuous at a.
Proof. Let g(t) = t^{−1}(f(a − t) − f(a)). For (i) and (ii), note that g is bounded a.e. in a neighborhood of 0, and apply [107](iii). For (iii), it suffices by [107](ii) to show that g ∈ L1(T). By hypothesis, there exist λ ≥ 1, α > 0 and δ ∈ (0, 1/2) such that |f(a − t) − f(a)| ≤ λ|t|^α whenever |t| < δ. Then

∫_T |g(t)| dt ≤ ∫_{|t|<δ} λ|t|^{α−1} dt + ∫_{|t|>δ} δ^{−1} |f(a − t) − f(a)| dt ≤ 2λα^{−1}δ^α + 2δ^{−1}∥f∥_1 < ∞. □
Definition: Let f ∈ L1(T). We say f is piecewise continuous if for every open interval (a, b) ⊂ T,
the map f is continuous on (a, b) except possibly for finitely many jump discontinuities, and if the
limits f(a+), f(b−) exist. We say f is piecewise C1 if both f and f ′ are piecewise continuous. Next
we will prove Dirichlet’s theorem about convergence of Fourier series for piecewise C1 functions.
Exercise-8: Let f ∈ L1(T), a ∈ T, and w ∈ C.
(i) For any δ ∈ (0, 1/2) we have

s_N(f, a) − w = ∫_0^{1/2} D_N(t)[f(a + t) + f(a − t) − 2w] dt = (∫_{0<t<δ} + ∫_{t>δ}) D_N(t)[f(a + t) + f(a − t) − 2w] dt.

(ii) lim_{N→∞} s_N(f, a) = w ⇔ there exists δ ∈ (0, 1/2) with lim_{N→∞} ∫_0^δ D_N(t)[f(a + t) + f(a − t) − 2w] dt = 0.
[Hint: (i) s_N(f, a) = ∫_0^{1/2} D_N(t)[f(a + t) + f(a − t)] dt by Exercise-6. Also, w = 2 ∫_0^{1/2} D_N(t) w dt since ∫_T D_N(t) dt = 1 and D_N is even. (ii) As N → ∞, the integral over {t > δ} in (i) goes to 0 by [105](iv), since the function t ↦ f(a + t) + f(a − t) − 2w belongs to L1(T).]
We use a splitting of the integrand different from the one used in [107] for the next:
Exercise-9: (Pointwise convergence criterion - 2) Let f ∈ L1(T), a ∈ T, and assume the limits f(a+), f(a−) exist. If g_1(t) := (f(a + t) − f(a+))/sin(πt) and g_2(t) := (f(a − t) − f(a−))/sin(πt) are bounded almost everywhere in (0, δ) for some δ ∈ (0, 1/2), then lim_{N→∞} s_N(f, a) = [f(a+) + f(a−)]/2. [Hint: Let g = 1_{(0,δ)}(g_1 + g_2) and w = [f(a+) + f(a−)]/2. Then g ∈ L1(T) and ∫_0^δ D_N(t)[f(a + t) + f(a − t) − 2w] dt = ∫_0^δ g(t) sin(2N + 1)πt dt → 0 as N → ∞ by [104](i). Now apply Exercise-8(ii).]
[109] (Dirichlet’s theorem) If f ∈ L1(T) is a piecewise C1 function, then limN→∞ sN (f, a) =
[f(a+) + f(a−)]/2 for every a ∈ T. (note: [f(a+) + f(a−)]/2 = f(a) if f is continuous at a.)
Proof. Let g1(t) = (f(a + t) − f(a+))/sin πt and g2(t) = (f(a − t) − f(a−))/sin πt. By l'Hopital's rule, we get limt→0+ g1(t) = limt→0+ f′(a + t)/(π cos πt) = π^{−1}f′(a+) and limt→0+ g2(t) = limt→0+ (−f′(a − t))/(π cos πt) = −π^{−1}f′(a−). Therefore g1, g2 are bounded a.e. in (0, δ) for some δ ∈ (0, 1/2). Now the result follows by Exercise-9. □
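The conclusion of Dirichlet's theorem is easy to observe numerically. The following Python sketch (our own illustration, not part of the notes) computes sN(f, a) for the piecewise C1 square wave f = 1 on (0, 1/2), 0 on (1/2, 1), whose Fourier coefficients are f̂(0) = 1/2 and f̂(n) = (1 − (−1)^n)/(2πin) for n ≠ 0:

```python
import numpy as np

def partial_sum(N, a):
    # s_N(f, a) for the square wave f = 1 on (0, 1/2), 0 on (1/2, 1);
    # f is real, so fhat(-n) = conj(fhat(n)).
    s = 0.5  # the n = 0 term, fhat(0) = 1/2
    for n in range(1, N + 1):
        c = (1 - (-1) ** n) / (2j * np.pi * n)  # fhat(n)
        s += c * np.exp(2j * np.pi * n * a) + np.conj(c) * np.exp(-2j * np.pi * n * a)
    return s.real

# At the jump a = 0, every partial sum equals [f(0+) + f(0-)]/2 = 1/2.
print(partial_sum(50, 0.0))
# At a = 1/4, where f is smooth, s_N(f, 1/4) approaches f(1/4) = 1 as N grows.
print(partial_sum(2000, 0.25))
```

Here the coefficients are available in closed form; for a general piecewise C1 function one would approximate them by numerical integration.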
It is also possible to extend (with some effort) the arguments of Exercise-8 along the lines of [107] to prove Jordan's theorem on the convergence of Fourier series for functions of bounded variation. Instead, we will give a different proof of this a little later.
5. Cesaro summability and Abel summability
In spite of the negative result [106], it will be shown a little later that the Fourier series of f ∈ L1(T) is summable to f with respect to certain weaker notions of summability. In this section, which involves no Fourier theory, we briefly discuss two such notions: Cesaro summability and Abel summability.
Motivating observations: (i) The series 1−1+1−1+1−1+ · · · is not convergent in the usual sense
since the sequence of partial sums is (1, 0, 1, 0, 1, 0, . . .). However, the averages of the partial sums
form the sequence (1, 1/2, 2/3, 1/2, 3/5, 1/2, . . .), which converges to 1/2. (ii) Saying that ∑_{n=0}^∞ a_n converges is the same as saying that the power series f(z) = ∑_{n=0}^∞ a_n z^n converges at z = 1. If f has radius of convergence ≥ 1, then even if ∑_{n=0}^∞ a_n z^n does not converge at z = 1, the limit limr→1− ∑_{n=0}^∞ a_n r^n may exist.
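Both observations are easy to test numerically; the following Python sketch (our own illustration) computes the averages of the partial sums and the Abel means of 1 − 1 + 1 − 1 + · · · :

```python
# Partial sums of 1 - 1 + 1 - 1 + ... oscillate, but their averages settle at 1/2.
terms = [(-1) ** n for n in range(2000)]
partial = []
s = 0
for x in terms:
    s += x
    partial.append(s)
# sample the averages sigma_N at a few values of N
sigma = [sum(partial[: N + 1]) / (N + 1) for N in range(0, 2000, 199)]

# Abel means: sum_n (-1)^n r^n = 1/(1 + r), which tends to 1/2 as r -> 1-.
def abel_mean(r, n_terms=100000):
    return sum((-1) ** n * r ** n for n in range(n_terms))
```

The sampled averages approach 1/2 even though the partial sums keep oscillating between 1 and 0, and abel_mean(r) approaches 1/2 as r increases toward 1.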
Definition: Let a_n ∈ C for n ≥ 0, sN = ∑_{n=0}^N a_n, and s ∈ C. (i) We say ∑_{n=0}^∞ a_n is Cesaro summable to s if limN→∞ σN = s, where σN = (N + 1)^{−1} ∑_{n=0}^N s_n. (ii) We say ∑_{n=0}^∞ a_n is Abel summable to s if ∑_{n=0}^∞ a_n r^n converges for every r ∈ (0, 1) and limr→1− ∑_{n=0}^∞ a_n r^n = s.
Exercise-10: Let a_n ∈ C for n ≥ 0, sN = ∑_{n=0}^N a_n, and σN = (N + 1)^{−1} ∑_{n=0}^N s_n. Then,
(i) σN = (N + 1)^{−1} ∑_{n=0}^N (∑_{m=0}^n a_m) = (N + 1)^{−1} ∑_{n=0}^N (N + 1 − n) a_n = ∑_{n=0}^N (1 − n/(N + 1)) a_n.
(ii) ∑_{n=0}^∞ a_n r^n = (1 − r) ∑_{n=0}^∞ s_n r^n = (1 − r)^2 ∑_{n=0}^∞ (n + 1) σ_n r^n for 0 < r < 1.
Further, for p = 1, 2 and n ≥ 0, let a_{p,n} ∈ C, and define a_{3,n} = ∑_{i+j=n} a_{1,i} a_{2,j} for n ≥ 0 (this is called the Cauchy product, and can be thought of as a discrete version of convolution).
(iii) If the series ∑_{n=0}^∞ a_{1,n} and ∑_{n=0}^∞ a_{2,n} converge to a, b ∈ C respectively, then their Cauchy product ∑_{n=0}^∞ a_{3,n} defined above may not converge (see 3.49 of Rudin, Principles of Mathematical Analysis for this), but ∑_{n=0}^∞ a_{3,n} is Cesaro summable to ab.
(iv) Even if the series ∑_{n=0}^∞ a_{1,n} and ∑_{n=0}^∞ a_{2,n} are Cesaro summable to a, b ∈ C respectively, the series ∑_{n=0}^∞ a_{3,n} may not be Cesaro summable. Show that Abel summability does not imply Cesaro summability by considering ∑_{n=0}^∞ a_{1,n} = ∑_{n=0}^∞ a_{2,n} = 1 − 1 + 1 − 1 + 1 − 1 + · · · , and noting that ∑_{n=0}^∞ a_{3,n} = 1 − 2 + 3 − 4 + 5 − 6 + · · · is Abel summable to 1/4 but is not Cesaro summable.
[Hint: (ii) s_n − s_{n−1} = a_n and (n + 1)σ_n − nσ_{n−1} = s_n. (iii) Let s_{p,N} = ∑_{n=0}^N a_{p,n} for p = 1, 2, 3. Check that ∑_{n=0}^N s_{3,n} = ∑_{i+j≤N} (N + 1 − i − j) a_{1,i} a_{2,j} = ∑_{i+j=N} s_{1,i} s_{2,j}. Also the hypothesis says (s_{1,n}) → a and (s_{2,n}) → b. Hence |ab − (N + 1)^{−1} ∑_{n=0}^N s_{3,n}| = |(N + 1)^{−1} ∑_{i+j=N} (ab − s_{1,i} s_{2,j})| ≤ (N + 1)^{−1} ∑_{i+j=N, min{i,j}≤N0} |ab − s_{1,i} s_{2,j}| + (N + 1)^{−1} ∑_{i+j=N, min{i,j}>N0} |ab − s_{1,i} s_{2,j}|, where the last two sums can be made arbitrarily small if N0 is large and N > N0 is very large compared to N0. (iv) For 1 − 2 + 3 − 4 + · · · , the averages of the partial sums form the sequence (1, 0, 2/3, 0, 3/5, 0, 4/7, 0, . . .), which does not converge; but their averages converge to 1/4.]
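Part (iv) can also be checked numerically; in the Python sketch below (our illustration), the Cesaro means of 1 − 2 + 3 − 4 + · · · oscillate, while their second averages and the Abel means both approach 1/4:

```python
# a3_n = sum_{i+j=n} (-1)^i (-1)^j = (-1)^n (n+1): the Cauchy product of the
# Grandi series with itself is 1 - 2 + 3 - 4 + 5 - ...
a3 = [(-1) ** n * (n + 1) for n in range(4001)]
partial, s = [], 0
for x in a3:
    s += x
    partial.append(s)
sigma, t = [], 0
for N, sN in enumerate(partial):
    t += sN
    sigma.append(t / (N + 1))
# sigma = (1, 0, 2/3, 0, 3/5, 0, ...) oscillates: there is no Cesaro sum.
second = sum(sigma) / len(sigma)   # the averages of the averages approach 1/4
abel = sum((-1) ** n * (n + 1) * 0.999 ** n for n in range(200000))  # ~ 1/(1+r)^2
```

The Abel mean follows from the identity ∑ (−1)^n (n+1) r^n = 1/(1+r)^2, which tends to 1/4 as r → 1−.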
[110] (i) If a complex series ∑_{n=0}^∞ a_n converges to s ∈ C, then ∑_{n=0}^∞ a_n is Cesaro summable to s.
(ii) If ∑_{n=0}^∞ a_n is Cesaro summable to s ∈ C, then ∑_{n=0}^∞ a_n is Abel summable to s.
Proof. Let sN = ∑_{n=0}^N a_n and σN = (N + 1)^{−1} ∑_{n=0}^N s_n.
(i) Let ε > 0 be given. Since (s_n − s) → 0 by hypothesis, there is M > 0 with |s_n − s| ≤ M for every n ≥ 0. Choose k ∈ N such that |s_n − s| < ε for every n ≥ k. Then choose m > k such that (k + 1)M/(N + 1) < ε for every N ≥ m. Then for every N ≥ m, we have
|σN − s| = |(N + 1)^{−1} ∑_{n=0}^N (s_n − s)| ≤ |(N + 1)^{−1} ∑_{n=0}^k (s_n − s)| + |(N + 1)^{−1} ∑_{n=k+1}^N (s_n − s)| ≤ (k + 1)M/(N + 1) + ((N − k)/(N + 1))ε < ε + ε.
(ii) Suppose lim σN = s. By Exercise-10(ii), we have ∑_{n=0}^∞ a_n r^n = (1 − r)^2 ∑_{n=0}^∞ (n + 1) σ_n r^n for 0 < r < 1. Since (1 − r)^{−2} = ∑_{n=0}^∞ (n + 1) r^n, we may write s = (1 − r)^2 ∑_{n=0}^∞ (n + 1) s r^n. Then for any N ∈ N, we have
|∑_{n=0}^∞ a_n r^n − s| ≤ (∑_{n=0}^N + ∑_{n=N+1}^∞)(n + 1)(1 − r)^2 r^n |σ_n − s| = J1(N, r) + J2(N, r), say.
Given ε > 0, choose N ∈ N large enough so that |σ_n − s| < ε for n ≥ N. Then J2(N, r) ≤ ε since ∑_{n=N+1}^∞ (n + 1) r^n ≤ ∑_{n=0}^∞ (n + 1) r^n = (1 − r)^{−2}. If M := max{(n + 1)|σ_n − s| : n ≤ N}, then J1(N, r) ≤ M ∑_{n=0}^N (1 − r)^2 → 0 as r → 1−. □
Remark: Tauber showed that if ∑_{n=0}^∞ a_n is Abel summable to s ∈ C and limn→∞ n a_n = 0, then ∑_{n=0}^∞ a_n = s. This generated many other results of the same type. A result in which a weaker notion of summability, together with some additional condition, yields the convergence of the original series is now called a Tauberian theorem3. In [111] below, we present the Hardy-Littlewood Tauberian theorem (which improves Tauber's result), with a simplified proof due to Karamata and H. Wielandt.
Exercise-11: Let f = 1(1/2,1) : (0, 1) → R and ε > 0. Then there exist real polynomials p1 and p2
such that (i) p1 ≤ f ≤ p2,
(ii) p1(0) = 0 = p2(0) and p1(1) = 1 = p2(1) (this implies t(1− t) is a factor of p2 − p1), and
(iii) the polynomial q(t) = (p2(t) − p1(t))/(t(1 − t)) satisfies ∫_0^1 q(t) dt < ε.
[Hint: Let F : (0, 1) → R be F(t) = (f(t) − t)/(t(1 − t)), which is bounded, and continuous except for a jump discontinuity at 1/2. First approximate F by continuous functions, and then apply the Weierstrass approximation theorem to find polynomials h1, h2 such that h1 ≤ F ≤ h2 and ∫_0^1 (h2 − h1) < ε. Note that f(t) = t + t(1 − t)F(t). Let p_j(t) = t + t(1 − t)h_j(t) for j = 1, 2. Then p1(0) = 0 = p2(0), p1(1) = 1 = p2(1), p1 ≤ f ≤ p2, and ∫_0^1 (p2(t) − p1(t))/(t(1 − t)) dt = ∫_0^1 (h2 − h1) < ε.]
3The terminology Tauberian theorem has a more general meaning, which we may see later.
[111] (i) (Hardy's Tauberian theorem) If a complex series ∑_{n=0}^∞ a_n is Cesaro summable to s ∈ C and sup_{n∈N} |n a_n| < ∞, then ∑_{n=0}^∞ a_n = s.
(ii) (Hardy-Littlewood Tauberian theorem) If a complex series ∑_{n=0}^∞ a_n is Abel summable to s ∈ C and sup_{n∈N} |n a_n| < ∞, then ∑_{n=0}^∞ a_n = s.
Proof. As (i) follows from (ii) and [110], it suffices to prove (ii). After a translation of the function ∑_{n=0}^∞ a_n z^n (replace a_0 by a_0 − s), we may assume s = 0. Let C > 0 be such that sup_n |n a_n| ≤ C.
Step-1: Let F = {f : (0, 1) → R : limr→1− ∑_{n=0}^∞ a_n f(r^n) = 0}, which is a real vector space. The hypothesis says limr→1− ∑_{n=0}^∞ a_n r^n = 0, and the substitution r = t^k gives limt→1− ∑_{n=0}^∞ a_n t^{kn} = 0. Thus all the maps t ↦ t^k for k ∈ N belong to F, and hence F contains all polynomials p with p(0) = 0, i.e., all p without the constant term. Let f = 1_(1/2,1) : (0, 1) → R, interpreted so that f takes the value f(1−) = 1 at t = 1. Note that for 1/2 < r < 1, ∑_{n=0}^∞ a_n f(r^n) = ∑_{n=0}^{N(r)} a_n = s_{N(r)}, where N(r) := max{n ≥ 0 : r^n > 1/2} → ∞ as r → 1−; therefore what we need to show is precisely that f ∈ F, i.e., that limr→1− ∑_{n=0}^∞ a_n f(r^n) = 0. We will achieve this by approximating f with polynomials.
Step-2: Given ε > 0, choose polynomials p1 and p2 for f as specified by Exercise-11, such that the polynomial q(t) = (p2(t) − p1(t))/(t(1 − t)) = ∑_{k=0}^m b_k t^k (say) satisfies ∑_{k=0}^m b_k/(k + 1) = ∫_0^1 q(t) dt < ε/C. For our upcoming estimate, observe that (1 − r^n)/(1 − r) ≤ n, i.e., (1 − r^n)/n ≤ 1 − r, for 0 < r < 1, and hence
(p2(r^n) − p1(r^n))/n = r^n (1 − r^n) q(r^n)/n ≤ r^n (1 − r) q(r^n) = (1 − r) ∑_{k=0}^m b_k r^{(k+1)n}.
Step-3: Since |a_n| ≤ C/n and f − p1 ≤ p2 − p1, the last estimate from Step-2 gives
|∑_{n=0}^∞ a_n (f(r^n) − p1(r^n))| ≤ C ∑_{n=1}^∞ (p2(r^n) − p1(r^n))/n ≤ C ∑_{k=0}^m (1 − r) b_k ∑_{n=0}^∞ r^{(k+1)n} = C ∑_{k=0}^m b_k (1 − r)/(1 − r^{k+1}),
which tends to C ∑_{k=0}^m b_k/(k + 1) < ε as r → 1−. Also, limr→1− ∑_{n=0}^∞ a_n p1(r^n) = 0 since p1 ∈ F. Thus we conclude lim sup_{r→1−} |∑_{n=0}^∞ a_n f(r^n)| ≤ ε. By considering ∑_{n=0}^∞ a_n (p2(r^n) − f(r^n)), we can show similarly that lim inf_{r→1−} ∑_{n=0}^∞ a_n f(r^n) ≥ −ε. Since ε > 0 is arbitrary, limr→1− ∑_{n=0}^∞ a_n f(r^n) = 0. □
6. Weak type boundedness for maximal functions
We will touch upon maximal functions and their relation to pointwise convergence. This will
be used in the next section to prove Fejer’s result about the Cesaro summability of Fourier series.
Suppose we have a sequence of operators (Tn) defined on L1(X,µ) for some measure space (X,µ).
Their maximal function T ∗ is defined as the supremum of Tn’s in an appropriate sense. We are
interested in finding out conditions that will ensure that T ∗ has some sort of boundedness behavior.
Definition: Let C̄ = C ∪ {∞}. For a measurable space X, let M(X, C̄) = {g : X → C̄ : g is measurable}, and similarly define M(X, R), M(X, [0,∞]), etc.
Definition: Let (X,µ), (Y, ν) be measure spaces. A map T : L1(X,µ) → M(Y, C̄) (not necessarily linear) is weak (1, 1) if ∃ C > 0 such that for every α > 0 and f ∈ L1(X,µ) we have ν({y ∈ Y : |Tf(y)| > α}) ≤ C∥f∥1/α. Note that if T is weak (1, 1), then ν({y ∈ Y : |Tf(y)| > n}) → 0 as n → ∞, and hence Tf is finite ν-almost everywhere, i.e., ν({y ∈ Y : |Tf(y)| = ∞}) = 0. The inclusion operator I : L1(X,µ) → M(X, C̄) is weak (1, 1) with constant C = 1: if we fix f ∈ L1(X,µ) and put Aα = {x ∈ X : |f(x)| > α}, then µ(Aα) = ∫_{Aα} 1 dµ ≤ ∫_{Aα} |f/α| dµ ≤ ∥f∥1/α for α > 0.
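The computation above for the inclusion operator is just Chebyshev's inequality, and it can be sanity-checked on a grid; the following Python sketch (our illustration, with f(x) = 1/√x on (0, 1) as a sample unbounded L1 function) does this:

```python
import numpy as np

# Grid check of the Chebyshev bound mu({|f| > alpha}) <= ||f||_1 / alpha for the
# inclusion operator; all quantities below are Riemann-sum approximations.
n = 10**6
x = (np.arange(n) + 0.5) / n          # midpoints of a uniform grid on (0, 1)
f = 1 / np.sqrt(x)
dx = 1.0 / n
l1_norm = np.sum(np.abs(f)) * dx      # approximates ||f||_1 = 2

def tail_measure(alpha):
    # approximates mu({x : |f(x)| > alpha}), which is min(1, 1/alpha^2) here
    return np.count_nonzero(np.abs(f) > alpha) * dx
```

For every alpha the tail measure stays below l1_norm/alpha, even though f itself is unbounded.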
Remark: (i) More generally, the condition ν({y ∈ Y : |Tf(y)| > α})^{1/q} ≤ C∥f∥_p/α defines weak (p, q) maps, but we do not need this more general concept. (ii) If the supremum of a sequence of linear maps is weak (1, 1), then there is a useful conclusion, which is stated below.
[112] Let (X,µ) be a measure space, and let Tn : L1(X,µ) → M(X, C̄) be linear maps. If the maximal function T∗ : L1(X,µ) → M(X, [0,∞]) of {Tn : n ∈ N}, defined as T∗f(x) = sup{|Tnf(x)| : n ∈ N}, is weak (1, 1), then the set F := {f ∈ L1(X,µ) : (Tnf) → f pointwise µ-a.e.} is closed in L1(X,µ).
Proof. Let (fk) be a sequence in F converging to f ∈ L1(X,µ) in the ∥·∥1-norm. Note that {x ∈ X : lim sup_n |Tnf(x) − f(x)| > 0} = ∪_{m=1}^∞ Am, where Am := {x ∈ X : lim sup_n |Tnf(x) − f(x)| > 2/m}. Hence it suffices to show µ(Am) = 0 for each m ∈ N. We have
lim sup_{n→∞} |Tnf(x) − f(x)| ≤ lim sup_{n→∞} |Tn(f − fk)(x)| + lim sup_{n→∞} |Tnfk(x) − f(x)| ≤ T∗(f − fk)(x) + |(f − fk)(x)|
for a.e. x ∈ X, since Tn is linear and (Tnfk) → fk pointwise a.e. Therefore µ(Am) ≤
µ({x ∈ X : T∗(f − fk)(x) > 1/m}) + µ({x ∈ X : |(f − fk)(x)| > 1/m}) ≤ Cm∥f − fk∥1 + m∥f − fk∥1,
where C > 0 is given by the weak (1, 1) property of T∗, and in the last term we used the fact that the inclusion operator is weak (1, 1) with constant 1. As the above inequality holds for every fk and ∥f − fk∥1 → 0, we conclude that µ(Am) = 0. □
We will soon see that a particular maximal function that we are going to consider is lower
semicontinuous, which motivates Exercise-12, a small diversion from our main theme.
Exercise-12: Let X be a metric space (it can more generally be a topological space). A function f : X → R (or f : X → [−∞,∞]) is upper semicontinuous if {x ∈ X : f(x) < α} is open in X for every α ∈ R, and is lower semicontinuous if {x ∈ X : f(x) > α} is open in X for every α ∈ R.
For example, 1A is upper semicontinuous if A ⊂ X is closed, and 1U is lower semicontinuous if
U ⊂ X is open. Since union of open sets is open and intersection of closed sets is closed, we have
that the infimum of a family of upper semicontinuous functions is upper semicontinuous, and the
supremum of a family of lower semicontinuous functions is lower semicontinuous. By the same
reasoning, the pointwise limit of a decreasing sequence of upper semicontinuous functions is upper
semicontinuous, and the pointwise limit of an increasing sequence of lower semicontinuous functions
is lower semicontinuous. Now, consider a function f : X → R, where X is a metric space.
(i) f is upper semicontinuous ⇔ for each x ∈ X and ε > 0, there is a neighborhood U ⊂ X of x
such that f(y) < f(x) + ε for every y ∈ U ⇔ lim supk→∞ f(xk) ≤ f(x) whenever (xk) → x in X.
(ii) If f is upper semicontinuous with X compact, f is bounded above and attains its maximum.
(iii) If a sequence (fn) of upper semicontinuous functions from X to R converges uniformly to f ,
then f is upper semicontinuous.
(iv) If f is upper semicontinuous and sup f(X) < ∞, then for each x ∈ X we have that f(x) =
inf{g(x) : g ∈ C(X,R) and f ≤ g}.
(v) Formulate and prove the corresponding statements for lower semicontinuity. For example, a
lower semicontinuous function on a compact space is bounded below and attains its minimum.
[Hint : (ii) Let Un = {x ∈ X : f(x) < n}. Then extracting a finite subcover of the open cover
{Un : n ∈ N} of X, we see f is bounded above. Let M = sup f(X), and let (xk) be a sequence
in X with limk f(xk) = M . By compactness, we may assume (xk) → x ∈ X, and then f(x) ≥
lim sup f(xk) = M by upper semicontinuity. (iii) Let x ∈ X and ε > 0. Choose n large with
∥f −fn∥∞ < ε/3 and then choose a neighborhood U of x with fn(y) < fn(x)+ ε/3 for every y ∈ U .
Then f(y) < fn(y) + ε/3 < fn(x) + 2ε/3 < f(x) + ε for every y ∈ U . (iv) After a translation,
assume f ≤ −1. Fix b ∈ X, and let ε ∈ (0, 1). Since A := {x ∈ X : f(x) ≥ f(b) + ε} is closed,
there is continuous h : X → [0, 1] with h(b) = 1 and h(A) = {0}. Consider g ∈ C(X,R) defined as
g(x) = (f(b) + ε)h(x), which satisfies g(b) = f(b) + ε. If x ∈ A, then f(x) ≤ −1 < 0 = g(x); and if
x ∈ X \A, then f(x) < f(b) + ε ≤ g(x) since f(b) + ε < 0 and 0 ≤ h ≤ 1. Thus f ≤ g.]
We will introduce Hardy-Littlewood maximal function on Rn instead of on R as there is no extra
cost for this in the proofs, but we will use it only in one dimension.
Definition: A measurable function f : Rn → C is said to be locally integrable if f·1K ∈ L1(Rn), i.e., if ∫_K |f| dµ < ∞, for every compact set K ⊂ Rn. Let L1_loc(Rn) be the collection of all locally integrable functions on Rn. Clearly L∞(Rn) ⊂ L1_loc(Rn), and hence L1_loc(Rn) ≠ L1(Rn) (e.g., the constant function 1 is locally integrable but not integrable). We may also see Lp(Rn) ⊂ L1_loc(Rn) for 1 ≤ p < ∞ as follows. Assume 1 < p < ∞ (the case p = 1 is trivial) and 1/p + 1/q = 1. Then for any compact K ⊂ Rn, we have 1K ∈ Lq(Rn), and hence by Holder's inequality we obtain ∫_K |f| dµ = ∫_{Rn} |f|·1K dµ ≤ ∥f∥p ∥1K∥q < ∞.
Remark: If ν is a locally finite Borel measure on Rn absolutely continuous w.r.to the Lebesgue
measure µ, then by Radon-Nikodym theorem there is a measurable function f : Rn → [0,∞)
with ν(A) =∫A fdµ for every Borel set A ⊂ Rn. Evidently, f ∈ L1
loc(Rn); and conversely any
f ∈ L1_loc(Rn) with f ≥ 0 defines a locally finite Borel measure on Rn that is absolutely continuous w.r.to µ.
Definition: The Hardy-Littlewood maximal function M : L1_loc(Rn) → M(Rn, [0,∞]) is defined as Mf(a) = sup_{r>0} (1/µ(B(a, r))) ∫_{B(a,r)} |f| dµ for f ∈ L1_loc(Rn) and a ∈ Rn, where µ is the Lebesgue measure on Rn; the measurability of Mf is ensured by Exercise-13 below. Moreover, Mf is finite almost everywhere, i.e., µ({a ∈ Rn : Mf(a) = ∞}) = 0, as a consequence of [113] below. When n = 1, we have Mf(a) = sup_{r>0} (1/2r) ∫_{a−r}^{a+r} |f(t)| dt for f ∈ L1_loc(R) and a ∈ R. Also note that Mf(a) = sup_{0<r<1/2} (1/2r) ∫_{a−r}^{a+r} |f(t)| dt for f ∈ L1(T) and a ∈ T.
Remark: (i) If f ≡ c ≠ 0, then Mf ≡ |c| ∉ L1(Rn). (ii) If f : R → C is f = 1_(0,∞), then Mf(a) = 1 for a > 0 and Mf(a) = 1/2 for a ≤ 0; here, Mf is not continuous. (iii) M is sublinear: M(f + g) ≤ Mf + Mg (since |f + g| ≤ |f| + |g|) and M(cf) = |c|Mf. Check that M is not linear.
Exercise-13: Fix f ∈ L1_loc(Rn) and let Mf be as defined above. Verify that the set {x ∈ Rn : Mf(x) > α} is open for each α > 0. Thus the function Mf on Rn is lower semicontinuous and hence Borel measurable. Also, Mf(x) ≤ lim inf_{k→∞} Mf(xk) whenever (xk) → x in Rn.
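For the function f = 1_(0,∞) from the Remark above, Mf can be approximated by taking the supremum over a large finite set of radii; the following Python sketch (our illustration, a numerical approximation only) recovers Mf(a) = 1 for a > 0 and Mf(a) = 1/2 for a ≤ 0:

```python
import numpy as np

# Approximate Mf for f = 1_(0,inf) on R: the average of f over (a-r, a+r) is
# the length of (a-r, a+r) ∩ (0, inf) divided by 2r; we maximize over many r.
def Mf(a, radii):
    best = 0.0
    for r in radii:
        overlap = max(0.0, a + r) - max(0.0, a - r)  # length of the intersection
        best = max(best, overlap / (2 * r))
    return best

radii = np.geomspace(1e-4, 1e4, 2000)
# For a > 0, small intervals around a lie inside (0, inf), so the sup is 1;
# for a <= 0, the averages (a + r)/(2r) increase to 1/2 as r grows.
```

The discontinuity of Mf at 0 mentioned in the Remark is visible in the jump from 1/2 to 1.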
The following technical fact from Euclidean Measure Theory is needed in the next proof.
Exercise-14: Let B1, . . . , Bk ⊂ Rn be finitely many balls, and µ be the Lebesgue measure on Rn. Then there is a pairwise disjoint subcollection {Bj : j ∈ F} for some F ⊂ {1, . . . , k} such that µ(∪_{j=1}^k Bj) ≤ 3^n µ(∪_{j∈F} Bj) = 3^n ∑_{j∈F} µ(Bj). [Hint: Assume µ(B1) ≥ · · · ≥ µ(Bk), so that r(1) ≥ · · · ≥ r(k) for the radii. Let j1 = 1. Having chosen j1, . . . , ji, let j_{i+1} be the smallest j > j_i such that Bj is disjoint from B_{j1} ∪ · · · ∪ B_{j_i}. Put F = {j1, j2, . . .}. If m ∈ {1, . . . , k} \ F, then Bm intersects B_{j_i} for some j_i < m. Then r(m) ≤ r(j_i) and Bm ⊂ 3∗B_{j_i}, where 3∗B_{j_i} is the ball concentric with B_{j_i} of radius 3r(j_i). Hence µ(∪_{j=1}^k Bj) ≤ µ(∪_{j∈F} 3∗Bj) ≤ 3^n ∑_{j∈F} µ(Bj) = 3^n µ(∪_{j∈F} Bj).]
[113] The Hardy-Littlewood maximal function M : L1(Rn) → M(Rn, [0,∞]) is weak (1, 1) with
constant C = 3n. Similarly, M : L1(T) → M(T, [0,∞]) is weak (1, 1).
Proof. Fix f ∈ L1(Rn) and let Aα = {x ∈ Rn : Mf(x) > α} for α > 0. We need to show µ(Aα) ≤ 3^n∥f∥1/α. Since µ(Aα) = sup{µ(K) : K ⊂ Aα compact}, it suffices to show µ(K) ≤ 3^n∥f∥1/α for an arbitrary compact set K ⊂ Aα. By the definition of Mf, for each a ∈ K ⊂ Aα there is a ball B centered at a with (1/µ(B)) ∫_B |f| dµ > α, or equivalently µ(B) < (1/α) ∫_B |f| dµ. As K is compact, we may cover K with finitely many such balls B1, . . . , Bk. By Exercise-14, choose a pairwise disjoint subcollection {Bj : j ∈ F} for some F ⊂ {1, . . . , k} with µ(∪_{j=1}^k Bj) ≤ 3^n ∑_{j∈F} µ(Bj). Then,
µ(K) ≤ µ(∪_{j=1}^k Bj) ≤ 3^n ∑_{j∈F} µ(Bj) ≤ (3^n/α) ∑_{j∈F} ∫_{Bj} |f| dµ ≤ (3^n/α) ∫_{Rn} |f| dµ = 3^n∥f∥1/α,
where the last inequality is by the pairwise disjointness of the collection {Bj : j ∈ F}. □
Definition: Parametrize T = [−1/2, 1/2). For r ∈ (0, 1/2), define the Lebesgue kernel Lr := (1/2r) 1_[−r,r] ∈ L1(T), and note ∥Lr∥1 = 1. See that Lr ∗ f(a) = (1/2r) ∫_T f(t) 1_[−r,r](a − t) dt = (1/2r) ∫_{a−r}^{a+r} f(t) dt is a local average of f at a ∈ T, for f ∈ L1(T). The Lebesgue maximal function L∗ : L1(T) → M(T, [0,∞]) is defined as L∗f(a) = sup_{0<r<1/2} |Lr ∗ f(a)| = sup_{0<r<1/2} (1/2r)|∫_{a−r}^{a+r} f(t) dt|. Comparing with the Hardy-Littlewood maximal function in one dimension, we have L∗f ≤ L∗|f| = Mf.
Exercise-15: Let f ∈ L1(T) and 0 < r < 1/2. Then,
(i) |Lr ∗ f| ≤ Lr ∗ |f| ≤ L∗|f| = 1 · L∗|f| = ∥Lr∥1 L∗|f| = ∥Lr∥1 Mf.
(ii) If K = ∑_{j=1}^k c_j L_{r_j} is a convex combination, then |K ∗ f| ≤ K ∗ |f| ≤ ∥K∥1 L∗|f| = ∥K∥1 Mf.
(iii) If K ∈ L1(T) is a nonnegative even function decreasing on [0, 1/2), then K/∥K∥1 can be approximated by convex combinations of Lr's, and hence |K ∗ f| ≤ K ∗ |f| ≤ ∥K∥1 L∗|f| = ∥K∥1 Mf.
7. Fourier series: pointwise convergence of Cesaro and Abel sums
Philosophy: We know {DN} is not an approximate identity. But the averages of DN ’s will form
an approximate identity, and this will lead to the Cesaro summability of the Fourier series.
Definition: Recall the Dirichlet kernel DN = ∑_{n=−N}^N e_n, and define the Nth Fejer kernel
FN := (N + 1)^{−1} ∑_{n=0}^N Dn = (N + 1)^{−1} ∑_{n=0}^N ∑_{k=−n}^n e_k = ∑_{n=−N}^N (1 − |n|/(N + 1)) e_n.
For f ∈ L1(T), recall the Fourier partial sum sN(f) = ∑_{n=−N}^N f̂(n) e_n = DN ∗ f, and define the Nth Fejer mean
σN(f) := (N + 1)^{−1} ∑_{n=0}^N s_n(f) = FN ∗ f = ∑_{n=−N}^N (1 − |n|/(N + 1)) e_n ∗ f = ∑_{n=−N}^N (1 − |n|/(N + 1)) f̂(n) e_n,
where we used the fact e_n ∗ f = f ∗ e_n = f̂(n) e_n from Exercise-5(v).
Remark: σN (f) is real valued when f is; and σN (f) ≥ 0 when f ≥ 0 since FN ≥ 0.
[114] (i) FN(t) = sin²((N + 1)πt)/((N + 1) sin² πt), and in particular FN ≥ 0, for every N ≥ 0.
(ii) If we parametrize T = [−1/2, 1/2), then FN is an even function for every N ≥ 0.
(iii) ∥FN∥1 = ∫_T FN(t) dt = 1 for every N ≥ 0.
(iv) For 0 < δ < t < 1/2, we have FN(t) ≤ 1/((N + 1) sin² πδ) → 0 as N → ∞, uniformly in t ∈ (δ, 1/2).
(v) {FN : N ≥ 0} is a nonnegative approximate identity for L1(T) satisfying also the L∞-concentration condition (A4) from page 5.
(vi) σN(f, a) = ∫_0^{1/2} FN(t)[f(a + t) + f(a − t)] dt for every a ∈ T.
Proof. (i) Recall DN(t) = sin((2N + 1)πt)/sin πt by [105], and 2 sin A sin B = cos(A − B) − cos(A + B). Then we see (N + 1)FN(t) · 2 sin² πt = ∑_{n=0}^N 2Dn(t) sin² πt = ∑_{n=0}^N 2 sin((2n + 1)πt) sin πt = ∑_{n=0}^N [cos 2nπt − cos((2n + 2)πt)] = cos 0 − cos((2N + 2)πt) = 2 sin²((N + 1)πt).
(ii) This follows from (i).
(iii) FN = ∑_{n=−N}^N (1 − |n|/(N + 1)) e_n; and ∫_T e_n(t) dt = 0 for n ≠ 0, and = 1 for n = 0.
Statement (iv) also follows from (i), and (v) is a summary of what is proved above.
(vi) Recall sN(f, a) = ∫_0^{1/2} DN(t)[f(a + t) + f(a − t)] dt by Exercise-6. Now use the facts that σN(f) = (N + 1)^{−1} ∑_{n=0}^N s_n(f) and FN = (N + 1)^{−1} ∑_{n=0}^N Dn. □
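Identities (i) and (iii) of [114] can be verified numerically on a grid; in the Python sketch below (our illustration, with N = 7 and a 4096-point grid as arbitrary choices), the coefficient form of FN, its closed form, and the normalization ∥FN∥1 = 1 all agree:

```python
import numpy as np

N = 7
# grid on T = [-1/2, 1/2) shifted by half a step, so t = 0 is avoided
t = (np.arange(4096) + 0.5) / 4096 - 0.5
# coefficient form: sum of (1 - |n|/(N+1)) e_n(t)
coeff = sum((1 - abs(n) / (N + 1)) * np.exp(2j * np.pi * n * t)
            for n in range(-N, N + 1))
# closed form from [114](i)
closed = np.sin((N + 1) * np.pi * t) ** 2 / ((N + 1) * np.sin(np.pi * t) ** 2)
```

Since FN is a trigonometric polynomial of degree N, the equispaced Riemann sum np.mean(closed) reproduces ∫_T FN = 1 essentially exactly.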
Since {FN : N ≥ 0} is an approximate identity satisfying (A4), we may deduce the following:
[115] (Fejer’s theorem) (i) If f ∈ C(T), then ∥σN (f)− f∥∞ → 0 as N → ∞.
(ii) Let 1 ≤ p <∞. If f ∈ Lp(T), then ∥σN (f)− f∥p → 0 as N → ∞.
(iii) Let f ∈ L1(T) (assumed to be extended to R with period 1). If the limits f(a+) and f(a−) exist at a ∈ T, then limN→∞ σN(f, a) = [f(a+) + f(a−)]/2. In particular, if f is continuous at a ∈ T, then limN→∞ σN(f, a) = f(a).
Proof. (i) and (ii): We know σN(f) = FN ∗ f. Now use [114](v) and [102].
(iii) Let g : T → C be g(t) = [f(a + t) + f(a − t)]/2 for the given a ∈ T. Then g ∈ L1(T) with ∥g∥1 ≤ ∥f∥1. By [114](vi), σN(f, a) = 2∫_0^{1/2} FN(t)g(t) dt = ∫_T FN(t)g(t) dt since FN, g are even, and hence limN→∞ σN(f, a) = limt→0 g(t) = [f(a+) + f(a−)]/2 by [114](v) and [102](iii). □
Seminar topic: Let f0, f1, f2 ∈ C(T) be f0 ≡ 1, f1(t) = sin 2πt and f2(t) = cos 2πt. Korovkin’s
(second) theorem states that if Tn : C(T) → C(T) are positive linear maps for n ∈ N with
limn→∞ ∥Tnfj − fj∥∞ = 0 for j = 0, 1, 2, then limn→∞ ∥Tnf − f∥∞ = 0 for every f ∈ C(T).
Present a proof of Korovkin’s theorem (see for example, M. Uchiyama, Proof of Korovkin’s the-
orems via inequalities, Amer. Math. Monthly, 110, (2003)). Apply Korovkin’s theorem to the
sequence (σN ) of positive linear maps of C(T) to give another proof of [115](i).
Definition: A trigonometric polynomial is a finite linear combination of en's with complex scalars. For example, DN = ∑_{n=−N}^N e_n is a trigonometric polynomial. If g = ∑ c_n e_n is a trigonometric polynomial (where the sum is finite), then for any f ∈ L1(T), g ∗ f = f ∗ g = ∑ c_n (f ∗ e_n) = ∑ c_n f̂(n) e_n, which is again a trigonometric polynomial. In particular, σN(f) = FN ∗ f is a trigonometric polynomial for every f ∈ L1(T); in fact, σN(f) = ∑_{n=−N}^N (1 − |n|/(N + 1)) f̂(n) e_n.
Exercise-16: (i) {g ∈ C(T) : g is a trigonometric polynomial} is dense in both (C(T), ∥ · ∥∞) and Lp(T) for 1 ≤ p < ∞.
(ii) (Uniqueness of Fourier coefficients) If f, g ∈ L1(T) and f̂(n) = ĝ(n) for every n ∈ Z, then f = g.
(iii) (Fourier inversion) If f ∈ L1(T) is such that (f̂(n)) ∈ l1(Z), i.e., if ∑_{n=−∞}^∞ |f̂(n)| < ∞, then g := ∑_{n=−∞}^∞ f̂(n) e_n ∈ C(T) and f = g almost everywhere.
(iv) F : L1(T) → c0(Z) given by f ↦ (f̂(n))_{n∈Z} is linear and injective, but is not surjective.
[Hint: (i) Use [115](i) and [115](ii) respectively, after noting that σN(f) = ∑_{n=−N}^N (1 − |n|/(N + 1)) f̂(n) e_n is a trigonometric polynomial. (ii) By considering f − g, we may assume g = 0. Use [115](ii) and the above expression for σN(f). (iii) Use (ii) after noting ĝ(n) = f̂(n) for every n. (iv) Linearity is clear, and F is injective by (ii). If F were also surjective, then F^{−1} would be a bounded linear operator by the Inverse mapping theorem, since the spaces L1(T) and c0(Z) are Banach. But D̂N(n) = 1 if |n| ≤ N, and = 0 if |n| > N, so that ∥(D̂N(n))∥∞ = 1; and we know ∥DN∥1 → ∞ by [105].]
Exercise-17: (Wiener's density theorem for T) Let f ∈ L1(T). Then {f ∗ g : g ∈ L1(T)} is dense in L1(T) ⇔ f̂(n) ≠ 0 for every n ∈ Z. [Hint: If f̂(n0) = 0, then (f ∗ g)^(n0) = f̂(n0)ĝ(n0) = 0 for every g ∈ L1(T), and consequently {f ∗ g : g ∈ L1(T)} cannot be dense in L1(T). Conversely, suppose f̂(n) ≠ 0 for every n ∈ Z. Given h ∈ L1(T), define gN = ∑_{n=−N}^N (1 − |n|/(N + 1))(ĥ(n)/f̂(n)) e_n. Since f ∗ e_n = f̂(n) e_n, we see f ∗ gN = ∑_{n=−N}^N (1 − |n|/(N + 1)) ĥ(n) e_n = σN(h) → h in L1(T) by [115](ii).]
Now we prepare ourselves to prove Lebesgue’s extension of Fejer’s theorem; and also the Dirichlet-
Jordan theorem about functions of bounded variation. Even though {FN} is an approximate
identity, FN is not decreasing4 on [0, 1/2). Therefore, in order to make use of Exercise-15(iii), we
will define below KN ≥ FN such that KN ’s are even and decreasing on [0, 1/2).
Exercise-18: (i) (Trigonometric facts) sin(N +1)πt ≤ (N +1) sinπt and sinπt ≥ 2t for 0 ≤ t ≤ 1/2.
4See the end of Section 8 for a picture of the graph of FN .
(ii) For 0 ≤ |t| ≤ 1/(2(N + 1)), we have FN(t) ≤ (N + 1)² sin² πt/((N + 1) sin² πt) = N + 1 =: KN(t). And for 1/(2(N + 1)) ≤ |t| < 1/2, we have FN(t) ≤ 1/((N + 1)(2t)²) = 1/(4(N + 1)t²) =: KN(t). The function KN : T → C defined in this manner satisfies the following: KN ≥ FN ≥ 0, KN ∈ C(T), KN is an even function that decreases on [0, 1/2), and ∥KN∥1 = 2∫_0^{1/(2N+2)} (N + 1) dt + (2/(4(N + 1))) ∫_{1/(2N+2)}^{1/2} t^{−2} dt = (1 − 0) + (1 − 1/(N + 1)) ≤ 2.
(iii) |FN ∗ f| ≤ FN ∗ |f| ≤ KN ∗ |f| ≤ 2L∗|f| = 2Mf by parts (i) and (ii) and by Exercise-15(iii), where M is the Hardy-Littlewood maximal function in one dimension.
(iv) The Fejer maximal function F∗ is defined as F∗f(t) = supN |FN ∗ f(t)| for f ∈ L1(T) and t ∈ T. Then F∗f ≤ 2L∗|f| = 2Mf by (iii).
[116] (i) The Fejer maximal function F∗ and the Lebesgue maximal function L∗ are weak (1, 1).
(ii) (Lebesgue's theorem on the pointwise convergence of Fejer means) Let f ∈ L1(T). Then (σN(f)) → f pointwise almost everywhere.
(iii) (Lebesgue differentiation theorem on T) Let f ∈ L1(T). Then limr→0 (1/2r) ∫_{a−r}^{a+r} f(t) dt = f(a) for almost every a ∈ T; this includes every a ∈ T at which f is continuous.
Proof. (i) We know that L∗f ≤ L∗|f| = Mf; and F∗f ≤ 2L∗|f| = 2Mf by Exercise-18(iv). Also M is weak (1, 1) by [113].
(ii) (σN(f)) → f pointwise (in fact, uniformly) for every f ∈ C(T) by [115](i). Also C(T) is dense in L1(T) by Exercise-1. Apply [112] to F∗ after putting TNf = FN ∗ f = σN(f).
(iii) Let f ∈ L1(T) be continuous at a ∈ T, and let ε > 0. Choose δ > 0 such that |f(a) − f(t)| < ε whenever |a − t| < δ. Then for 0 < r < δ we have
|f(a) − Lr ∗ f(a)| = |(1/2r) ∫_{a−r}^{a+r} (f(a) − f(t)) dt| ≤ (1/2r) ∫_{a−r}^{a+r} |f(a) − f(t)| dt ≤ (1/2r) ε · 2r = ε.
Hence limr→0 Lr ∗ f(a) = limr→0 (1/2r) ∫_{a−r}^{a+r} f(t) dt = f(a). In particular, this holds at every a ∈ T when f ∈ C(T). Now use the denseness of C(T) in L1(T) and apply [112] to L∗ after putting Tnf = L_{rn} ∗ f for any decreasing sequence (rn) in (0, 1/2) converging to 0. □
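Part (iii) is easy to visualize for a concrete smooth function; the following Python sketch (our illustration, with f(t) = t² and a = 0.3 as arbitrary choices) evaluates the local averages exactly and watches them shrink to f(a):

```python
# For f(t) = t^2, the local average (1/2r) * integral over (a-r, a+r) can be
# computed exactly from the antiderivative: it equals a^2 + r^2/3.
def local_average(a, r):
    return ((a + r) ** 3 - (a - r) ** 3) / (6 * r)

vals = [local_average(0.3, r) for r in (0.1, 0.01, 0.001)]
# vals decreases toward f(0.3) = 0.09 as r -> 0, as the Lebesgue
# differentiation theorem predicts at a point of continuity.
```

The excess over f(a) is exactly r²/3 here, so the convergence rate is visible as well.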
Remark: A point a ∈ T with limr→0 (1/2r) ∫_{a−r}^{a+r} f(t) dt = f(a) is called a Lebesgue point for f ∈ L1(T).
Definition: For 0 < r < 1, define the Poisson kernel5 Pr = ∑_{n∈Z} r^{|n|} e_n, which belongs to C(T) by the uniform convergence of the series. Keeping in mind the expressions in Exercise-7(iii), for f ∈ L1(T) and 0 < r < 1, define the rth Abel mean of f as
Ar(f) = (1 − r) ∑_{N=0}^∞ r^N sN(f) = (1 − r) ∑_{N=0}^∞ r^N ∑_{n=−N}^N f̂(n) e_n = ∑_{n∈Z} f̂(n) r^{|n|} e_n = Pr ∗ f,
5See the end of Section 8 for a picture of the graph of Pr
where we used the fact f̂(n) e_n = f ∗ e_n = e_n ∗ f from Exercise-5.
[117] (i) (Another expression for the Poisson kernel) Pr(t) = (1 − r²)/(1 + r² − 2r cos 2πt).
(ii) {Pr : 0 < r < 1}, as r → 1−, is an approximate identity on L1(T) satisfying Pr ≥ 0 and also the following L∞-concentration condition (A4): limr→1 sup{|Pr(t)| : |t| > δ} = 0 for every δ ∈ (0, 1/2).
Proof. (i) We have
Pr(t) + 1 = ∑_{n=0}^∞ r^n e_{−n}(t) + ∑_{n=0}^∞ r^n e_n(t) = 1/(1 − re_{−1}(t)) + 1/(1 − re_1(t)) = (2 − 2r cos 2πt)/(1 + r² − 2r cos 2πt),
since e_{−1}(t) + e_1(t) = 2 cos 2πt. Hence Pr(t) = (2 − 2r cos 2πt)/(1 + r² − 2r cos 2πt) − 1 = (1 − r²)/(1 + r² − 2r cos 2πt).
(ii) ∫_T Pr = ∫_T ∑_{n∈Z} r^{|n|} e_n = ∑_{n∈Z} r^{|n|} ∫_T e_n = 1 since ∫_T e_n = 0 for n ≠ 0; here the interchange of summation and integral is justified by the uniform convergence of the series. Thus {Pr} satisfies condition (A1) of an approximate identity. By (i), Pr(t) = (1 − r²)/((1 − r)² + 2r(1 − cos 2πt)) ≥ 0, and hence the L1-boundedness condition (A2) follows from (A1). For |t| > δ with δ ∈ (0, 1/2), we have 1 + r² − 2r cos 2πt = (1 − r)² + 2r(1 − cos 2πt) ≥ 2r(1 − cos 2πδ). Hence limr→1 sup{|Pr(t)| : |t| > δ} ≤ lim supr→1 (1 − r²)/(2r(1 − cos 2πδ)) = 0, establishing (A4); and this implies property (A3) also. □
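As with the Fejer kernel, the identity [117](i) and the normalization ∫_T Pr = 1 can be checked on a grid; the Python sketch below (our illustration, with r = 0.9 and a truncation at |n| = 300 as arbitrary choices) compares a truncation of the series ∑ r^{|n|} e_n with the closed form:

```python
import numpy as np

r = 0.9
t = np.arange(4096) / 4096 - 0.5     # grid on T = [-1/2, 1/2)
# truncated series sum_{|n| <= 300} r^|n| e_n(t); the tail is ~ 2 r^301/(1-r)
series = sum(r ** abs(n) * np.exp(2j * np.pi * n * t) for n in range(-300, 301))
# closed form from [117](i)
closed = (1 - r ** 2) / (1 + r ** 2 - 2 * r * np.cos(2 * np.pi * t))
```

The equispaced Riemann sum np.mean(closed) reproduces ∫_T Pr = 1 up to a tiny aliasing error.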
Remark: It can also be shown that for each r, the Poisson kernel Pr decreases on [0, 1/2).
[118] (i) If f ∈ C(T), then limr→1− ∥Ar(f) − f∥∞ = 0.
(ii) Let 1 ≤ p < ∞. If f ∈ Lp(T), then limr→1− ∥Ar(f) − f∥p = 0.
(iii) If f ∈ L1(T) and the limits f(a+) and f(a−) exist at a point a ∈ T, then limr→1− Ar(f, a) = [f(a+) + f(a−)]/2. In particular, if f ∈ L1(T) is continuous at a ∈ T, then limr→1− Ar(f, a) = f(a).
(iv) If f ∈ L1(T), then Ar(f, a) → f(a) as r → 1− for almost every a ∈ T.
Proof. Statements (i) and (ii) follow by [102] from the fact [117](ii) that {Pr : 0 < r < 1}, as r → 1−, is an approximate identity. Statements (iii) and (iv) follow from [115] since Cesaro summability implies Abel summability by [110] (or imitate the proofs of [115](iii) and [116](ii)). □
Remark: The Poisson kernel appears naturally in the theory of partial differential equations, for example in solving the Laplace equation on the unit disc.
8. Pointwise convergence of Fourier series for functions of bounded variation
Here we will present Jordan’s theorem about pointwise convergence of Fourier series for a function
f ∈ L1(T) of bounded variation.
Definition: For a function f : T → C and [a, b] ⊂ T, let V_a^b(f) = sup{∑_{j=1}^k |f(a_j) − f(a_{j−1})| : k ∈ N and a = a_0 ≤ a_1 ≤ · · · ≤ a_{k−1} ≤ a_k = b} be the total variation of f in [a, b]. We say f is of bounded variation if V_0^1(f) < ∞.
Remark: Let f ∈ L1(T) be of bounded variation. It is known that we may write f = f1 − f2 + i(f3 − f4), where the fj's are real valued and monotone increasing. A monotone function is differentiable almost everywhere and has only jump discontinuities. Thus f is differentiable almost everywhere, and the limits f(a+) and f(a−) exist for every a ∈ T (see my notes Measure Theory for these facts). We aim to show limN→∞ sN(f, a) = [f(a+) + f(a−)]/2 for every a ∈ T.
[119] Let f ∈ L1(T) be a function of bounded variation. Then,
(i) limN→∞ σN(f, a) = [f(a+) + f(a−)]/2 for every a ∈ T.
(ii) |n f̂(n)| ≤ V_0^1(f)/2 for every n ∈ Z.
(iii) (Jordan's theorem) limN→∞ sN(f, a) = [f(a+) + f(a−)]/2 for every a ∈ T.
Proof. (i) By the Remark above, the limits f(a+) and f(a−) exist for every a ∈ T. So the result follows by [115](iii).
(ii) We may assume n ≠ 0, and put r = 1/(2|n|). First note that
f̂(n) = ∑_{k=0}^{2|n|−1} ∫_{kr}^{(k+1)r} f(t) e_{−n}(t) dt = ∑_{k=0}^{2|n|−1} ∫_0^r f(s + kr) e_{−n}(s + kr) ds.
Since e_{−n}(s + kr) = e_{−n}(s) when k is even, and e_{−n}(s + kr) = −e_{−n}(s) when k is odd, we get
f̂(n) = ∫_0^r (∑_{j=0}^{|n|−1} [f(s + 2jr) − f(s + (2j + 1)r)]) e_{−n}(s) ds.
Therefore, |f̂(n)| ≤ ∫_0^r ∑_{j=0}^{|n|−1} |f(s + 2jr) − f(s + (2j + 1)r)| ds ≤ ∫_0^r V_0^1(f) ds = V_0^1(f) r = V_0^1(f)/(2|n|).
(iii) This follows from parts (i) and (ii) above, and Hardy's Tauberian theorem [111](i). □
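The bound in (ii) can be checked for a concrete BV function; the Python sketch below (our illustration) uses the square wave f = 1_(0,1/2), which has total variation V_0^1(f) = 2 and f̂(n) = (1 − (−1)^n)/(2πin) for n ≠ 0, so that every |n f̂(n)| is at most 1/π ≤ V_0^1(f)/2:

```python
import math

# Fourier coefficients of the square wave f = 1_(0,1/2) in closed form:
# fhat(n) = integral_0^{1/2} e^{-2 pi i n t} dt = (1 - (-1)^n)/(2 pi i n), n != 0.
def fhat(n):
    if n == 0:
        return 0.5
    return (1 - (-1) ** n) / (2j * math.pi * n)

V = 2.0  # total variation on T: two jumps of size 1
bounds = [abs(n * fhat(n)) for n in range(1, 100)]
# every entry is 0 (n even) or 1/pi (n odd), comfortably <= V/2 = 1
```

The same kind of check can be run with numerically integrated coefficients for any other function of bounded variation.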
Remark: Let f ∈ L1(T) be of bounded variation. Then as remarked above, f is differentiable a.e.,
and hence continuous a.e.; now, Jordan’s theorem implies limN→∞ sN (f, a) = f(a) for a.e. a ∈ T.
Seminar topic/reading assignment: (i) If (an)n∈Z is a sequence of nonnegative numbers such that a−n = an, an+1 − an ≥ an − an−1, and limn→∞ an = 0, then there is a nonnegative function f ∈ L1(T) with f̂(n) = an for every n ∈ Z. (ii) ∑_{n=2}^∞ (sin 2πnt)/(log n) = ∑_{|n|≥2} sgn(n) e_n(t)/(2i log |n|) is a convergent
trigonometric series that is not the Fourier series of any f ∈ L1(T); see sections 4.1 and 4.2 of
Y. Katznelson, An Introduction to Harmonic Analysis. (iii) Gibbs-Wilbraham phenomenon; see
section 1.2.8 of M.A. Pinsky, Introduction to Fourier Analysis and Wavelets.
Remark: We saw at the beginning of these notes that the Fourier series of f ∈ L2(T) converges to f in L2(T). About general Lp-convergence, the following is known: if 1 < p < ∞, then limN→∞ ∥f − sN(f)∥p = 0 for every f ∈ Lp(T); but there are f ∈ L1(T) for which the sequence (sN(f)) does not converge to f in L1(T).
Figure 1. Rough shape of the graphs of DN , FN , and Pr.
In sections 9-14, we will introduce the theory of distributions, and this will have an interplay
with the theory of Fourier transform that we will discuss afterwards. The theory of distributions
is usually developed on an open subset of Rn. But we will stick to the one dimensional space R in
order to convey the ideas in the simplest way without notational distractions. After grasping the
one dimensional case, for applications the student should read from relevant books6 the multidi-
mensional theory, which is built upon more or less the same ideas. One serious difference that we
observe when we move from T to R is that we have Lp(T) ⊂ L1(T) for 1 ≤ p <∞, but there are no
inclusion relations among Lp(R) for 1 ≤ p ≤ ∞ since the Lebesgue measure of R is infinite; and we
have to pay special attention to the decay of f(x) as |x| → ∞. Special
subspaces defined in terms of various decay properties will play an important role, and we start by
sketching the theory of such subspaces.
9. Convolution is a smoothing operation
In this section, we will establish the following philosophy in various forms: f ∗ g is at least as
smooth as f and g, and often smoother. As a warm-up, we first look at different types of continuous
functions on R.
Definition: Recall that the support of a function f : R → C is supp(f) := the closure of {x ∈ R : f(x) ≠ 0}. Also
recall C(R) = {f : R → C : f is continuous}, and Cc(R) = {f ∈ C(R) : f has compact support}.
Let C0(R) = {f ∈ C(R) : lim|x|→∞ f(x) = 0} (the space of continuous functions vanishing at ∞),
and Cb(R) = {f ∈ C(R) : f is bounded}. Note that all these spaces are complex vector spaces.
Exercise-19: (i) Cc(R) ⊂ C0(R) ⊂ Cb(R) = C(R) ∩ L∞(R).
6Eg: F.G. Friedlander, Introduction to the Theory of Distributions, or L. Grafakos, Classical Fourier Analysis.
(ii) Every f ∈ C0(R) is uniformly continuous, but f ∈ Cb(R) may not be uniformly continuous.
(iii) Cc(R) ⊂ Lp(R) for 1 ≤ p ≤ ∞.
(iv) For 1 ≤ p <∞, C0(R) is not a subset of Lp(R), and Cb(R) ∩ Lp(R) is not a subset of C0(R).
(v) C0(R) is a closed vector subspace of (L∞(R), ∥ · ∥∞), and hence is a Banach space.
(vi) Cc(R) is dense in both (C0(R), ∥ · ∥∞) and (Lp(R), ∥ · ∥p) for 1 ≤ p <∞.
[Hint : (ii) If f ∈ Cb(R) is such that f(n) = 0 and f(n + 1/n) = 1 for n ≥ 2, then f is not uniformly
continuous. (iv) If f ∈ C0(R) is such that f(x) = 1/n for 10^n ≤ x ≤ 10^n + n ∀n ∈ N, then f ∉ L1(R).
(vi) Given f ∈ C0(R) and ε > 0, choose N ∈ N such that |f(x)| < ε for |x| ≥ N . Choose
continuous g : R → [0, 1] with g ≡ 1 on [−N,N ] and g(x) = 0 for |x| ≥ N + 1. Then fg ∈ Cc(R)
and ∥f − fg∥∞ ≤ ε. The denseness of Cc(R) in Lp(R) is already noted in Exercise-1.]
If f, g ∈ L1(R), then we know by Exercise-3(i) that f ∗ g ∈ L1(R) with ∥f ∗ g∥1 ≤ ∥f∥1∥g∥1. To
supplement this, first we observe:
Exercise-20: (i) Fix f ∈ Lp(R), 1 ≤ p <∞. Then a 7→ fa from R to Lp(R) is uniformly continuous.
(ii) If f ∈ L1(R) and g ∈ L∞(R), then f ∗ g ∈ L∞(R) with ∥f ∗ g∥∞ ≤ ∥f∥1∥g∥∞, and moreover
f ∗ g is uniformly continuous.
[Hint : (i) As in Exercise-2(ii), we may assume f ∈ Cc(R). Let N ∈ N be such that supp(f) ⊂ [−(N − 1), N − 1]. Given ε > 0, choose δ ∈ (0, 1) such that |x − y| < δ implies |f(x) − f(y)|^p < ε/(2N). Then for a, b ∈ R with |a − b| < δ, we have ∥f_a − f_b∥_p^p = ∫_R |f(x − a) − f(x − b)|^p dx = ∫_R |f(y) − f(y + a − b)|^p dy = ∫_{−N}^{N} |f(y) − f(y + a − b)|^p dy < (ε/(2N)) · 2N = ε. (ii) |f ∗ g(x)| = |∫_R f(y)g(x − y)dy| ≤ ∫_R |f(y)| ∥g∥∞ dy = ∥f∥1 ∥g∥∞. Define F ∈ L1(R) by F(y) = f(−y). Note that |g ∗ f(a) − g ∗ f(b)| = |∫ g(y)(F_a(y) − F_b(y))dy| ≤ ∥g∥∞ ∥F_a − F_b∥1, and use (i).]
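The uniform continuity in Exercise-20(i) can be seen concretely. Here is a small numerical sketch (my own illustration, not from the notes; it assumes NumPy, and the choices f = 1_[0,1], p = 2, and the grid are arbitrary) of how ∥f_a − f_b∥_p shrinks with |a − b| even for a discontinuous f.

```python
import numpy as np

# Translation continuity in L^p: for f = 1_[0,1] (discontinuous!) the map
# a -> f_a is still uniformly continuous in the L^2 norm, since
# ||f_a - f_b||_2 = sqrt(2|a - b|) for small |a - b|.
h = 1e-4
x = np.arange(-2.0, 4.0, h)

def translate(a):
    return ((x - a >= 0) & (x - a <= 1)).astype(float)   # f_a(x) = f(x - a)

def dist(a, b, p=2):
    return (np.sum(np.abs(translate(a) - translate(b)) ** p) * h) ** (1.0 / p)

gaps = [dist(0.3, 0.3 + t) for t in [0.5, 0.1, 0.01, 0.001]]
assert all(g2 < g1 for g1, g2 in zip(gaps, gaps[1:]))     # shrinks with |a-b|
assert abs(gaps[0] - 1.0) < 1e-2   # sqrt(2 * 0.5) = 1
print("translation distances:", [round(g, 4) for g in gaps])
```

Note that ∥f_a − f_b∥∞ stays equal to 1 for a ≠ b here, which is why the analogous statement fails for p = ∞.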
[120] (i) If f, g ∈ Cc(R), then f ∗ g ∈ Cc(R) and supp(f ∗ g) ⊂ supp(f) + supp(g).
(ii) If f ∈ L1(R) and g ∈ C0(R), then f ∗ g ∈ C0(R).
(iii) Let 1 < p, q < ∞ with 1/p + 1/q = 1. If f ∈ Lp(R) and g ∈ Lq(R), then f ∗ g ∈ C0(R) and ∥f ∗ g∥∞ ≤ ∥f∥p ∥g∥q.
(iv) Let 1 ≤ p <∞. If f ∈ Lp(R) and g ∈ Cc(R), then f ∗ g ∈ C0(R).
Proof. (i) Let h = f ∗ g, K = supp(f) and L = supp(g). Then supp(g_y) = y + L. Since
|h(a) − h(b)| ≤ ∫_K |f(y)| |g(a − y) − g(b − y)| dy ≤ ∥f∥∞ ∫_K |g(a − y) − g(b − y)| dy,
we may deduce that h is continuous by the uniform continuity of g. If x ∉ K + L, then g_y(x) = 0 for every y ∈ K and hence h(x) = ∫_K f(y)g_y(x) dy = 0. This shows supp(h) ⊂ K + L.
(ii) f ∗ g is defined by Exercise-20(ii) since g ∈ C0(R) ⊂ L∞(R). By Exercise-19(vi), choose
sequences (fn) and (gn) in Cc(R) with ∥f − fn∥1 → 0 and ∥g − gn∥∞ → 0. Let M = sup_n ∥gn∥∞ < ∞. We know by (i) that fn ∗ gn ∈ Cc(R) ⊂ C0(R). With the help of Exercise-20(ii), we see
|f∗g(x)−fn∗gn(x)| ≤ |f∗g(x)−f ∗gn(x)|+|f ∗gn(x)−fn∗gn(x)| ≤ ∥f∥1∥g−gn∥∞+M∥f−fn∥1 → 0
as n→ ∞. Since C0(R) is closed w.r.to ∥ · ∥∞ by Exercise-19(v), we conclude that f ∗ g ∈ C0(R).
(iii) Let h(x) = g(−x) so that f ∗ g(x) = ∫_R f(y)h_x(y) dy. Applying Hölder’s inequality and noting ∥h_x∥q = ∥g∥q, we deduce that f ∗ g ∈ L∞(R) with ∥f ∗ g∥∞ ≤ ∥f∥p ∥g∥q. The proof that
f ∗ g ∈ C0(R) is similar to the one given for (ii). Choose sequences (fn) and (gn) in Cc(R)
with ∥f − fn∥p → 0 and ∥g − gn∥q → 0. Let M = sup_n ∥gn∥q < ∞. By what is proved so far, ∥f ∗ g − fn ∗ gn∥∞ ≤ ∥f ∗ (g − gn)∥∞ + ∥(f − fn) ∗ gn∥∞ ≤ ∥f∥p ∥g − gn∥q + M∥f − fn∥p → 0 as n → ∞. Since fn ∗ gn ∈ Cc(R) ⊂ C0(R) and since C0(R) is a Banach space, we get that f ∗ g ∈ C0(R).
(iv) This follows from (ii) and (iii) since Cc(R) ⊂ Lq(R). �
Remark: We mention an application of [120](iii). Claim: If A ⊂ R has positive Lebesgue measure,
then A − A contains a neighborhood of 0. Proof : We may assume µ(A) < ∞. Let f = 1A and
g(x) = f(−x), and note f, g ∈ L2(R). We have 0 < µ(A) = ∫ f = ∫ f² = f ∗ g(0). Since f ∗ g is continuous by [120](iii), there is a neighborhood U ⊂ R of 0 such that f ∗ g(x) > 0 for every x ∈ U. But 0 < f ∗ g(x) = ∫ f(y)g(x − y) dy = ∫ f(y)f(y − x) dy implies x ∈ A − A, and hence U ⊂ A − A.
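A discretized version of this argument (an illustration of mine, not from the notes; it assumes NumPy, and A = [0,1] with the given grid is an arbitrary choice) exhibits f ∗ g as the continuous overlap function |A ∩ (A + x)|:

```python
import numpy as np

# Steinhaus-type claim in action: for A = [0,1], f = 1_A and g(x) = f(-x),
# f*g(x) = |A ∩ (A + x)| = max(0, 1 - |x|).  It is continuous, equals
# mu(A) = 1 at x = 0, and is positive near 0, so A - A contains (-1, 1).
h = 1e-3
x = np.arange(0.0, 1.0 + h, h)
f = np.ones_like(x)                        # samples of 1_[0,1]
conv = np.convolve(f, f[::-1]) * h         # discrete f*g on [-1, 1]
s = np.arange(-1.0, 1.0 + h / 2, h)        # grid where conv lives
assert abs(conv[np.argmin(np.abs(s))] - 1.0) < 1e-2   # f*g(0) ~ mu(A) = 1
assert np.all(conv[np.abs(s) < 0.9] > 0)              # positive near 0
expected = np.maximum(0.0, 1.0 - np.abs(s))
assert np.max(np.abs(conv - expected)) < 1e-2
print("f*g > 0 on a neighborhood of 0, hence A - A contains it")
```

The continuity of f ∗ g (guaranteed by [120](iii)) is what turns the single pointwise fact f ∗ g(0) = µ(A) > 0 into an open neighborhood inside A − A.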
Definition: Let C^m_c(R) = {f ∈ Cc(R) : f is m-times continuously differentiable}, C^m_0(R) = {f ∈ C0(R) : f is m-times continuously differentiable}, and C∞_0(R) = ∩_{m=1}^∞ C^m_0(R). More importantly, for our future discussion of distributions, we define the space E of smooth functions, and the space D of smooth functions with compact support, as follows:
Let E = C∞(R) = {f : R → C : f is infinitely often differentiable},
and D = C∞_c(R) = Cc(R) ∩ E = ∩_{m=1}^∞ C^m_c(R).
Notation: Let Df = f ′, the derivative of f : R → C when f is differentiable.
Fact from Analysis: (Differentiating under the integral; see Theorem 7.40 in Apostol, Mathematical Analysis) If ϕ : [a, b] × [c, d] → C and the partial derivative ∂ϕ/∂x are continuous, then h : [a, b] → C defined as h(x) := ∫_c^d ϕ(x, y) dy is differentiable and Dh(x) = ∫_c^d (∂ϕ/∂x)(x, y) dy.
[121] (i) If f ∈ Cc(R) and g ∈ C¹_c(R), then f ∗ g ∈ C¹_c(R) and D(f ∗ g) = f ∗ Dg.
(ii) Let k, m ≥ 0. If f ∈ C^k_c(R) and g ∈ C^m_c(R), then f ∗ g ∈ C^{k+m}_c(R) and D^{i+j}(f ∗ g) = D^i f ∗ D^j g for 0 ≤ i ≤ k and 0 ≤ j ≤ m. In particular, if f ∈ Cc(R) and g ∈ D, then f ∗ g ∈ D and D^m(f ∗ g) = f ∗ D^m g for every m ∈ N.
(iii) Let 1 ≤ p ≤ ∞. If f ∈ Lp(R) and g ∈ C^m_c(R) for some m ∈ N, then f ∗ g is m-times continuously differentiable with D^j(f ∗ g) = f ∗ D^j g for 1 ≤ j ≤ m; moreover, f ∗ g ∈ C^m_0(R) when p < ∞ (for p = ∞, f ∗ g and its derivatives are bounded and uniformly continuous, but need not vanish at ∞; think of f ≡ 1). In particular, if f ∈ Lp(R) and g ∈ D, then f ∗ g ∈ E with D^m(f ∗ g) = f ∗ D^m g for every m ∈ N, and f ∗ g ∈ C∞_0(R) when p < ∞.
(iv) If f ∈ L1(R) and g ∈ C∞_0(R), then f ∗ g ∈ C∞_0(R) and D^m(f ∗ g) = f ∗ D^m g for every m ∈ N.
Proof. (i) Since g ∈ C¹_c(R), we have Dg ∈ Cc(R) and hence f ∗ Dg ∈ Cc(R) by [120](i). It remains to show D(f ∗ g) = f ∗ Dg. Put ϕ(x, y) := f(y)g(x − y). Then ϕ and ∂ϕ/∂x are continuous, where (∂ϕ/∂x)(x, y) = f(y)Dg(x − y). For a fixed x ∈ R, ϕ(x, y) = 0 for y outside a compact interval since f and g have compact supports. By applying the Fact mentioned above to ϕ, we get D(f ∗ g) = f ∗ Dg.
(ii) This follows by the repeated application of (i) since the convolution is symmetric.
(iii) For p < ∞, we know by [120](iv) that f ∗ g ∈ C0(R); for p = ∞, f ∗ g is bounded and uniformly continuous by Exercise-20(ii). It remains to show f ∗ g is m-times continuously differentiable and D^j(f ∗ g) = f ∗ D^j g for 1 ≤ j ≤ m.
Case-1 : f ∈ L1(R). Fix x ∈ R; we claim that D(f ∗ g)(x) = f ∗ Dg(x). Let (tn) be a sequence of non-zero reals converging to 0. Define hn(y) = tn^{−1}(g(x + tn − y) − g(x − y)). Since tn^{−1}(f ∗ g(x + tn) − f ∗ g(x)) = ∫_R f(y)hn(y) dy, it remains to show lim_{n→∞} ∫_R f(y)hn(y) dy = f ∗ Dg(x).
By the mean value theorem applied to g, we observe that ∥hn∥∞ ≤ ∥Dg∥∞ for every n ∈ N. Also, if we put h(y) = Dg(x − y), then we see that (fhn) → fh pointwise. Since |fhn| is dominated by the integrable function ∥Dg∥∞|f|, we conclude by the Lebesgue dominated convergence theorem that lim_{n→∞} ∫_R f(y)hn(y) dy = ∫_R f(y)h(y) dy = ∫_R f(y)Dg(x − y) dy = f ∗ Dg(x), which proves the claim. Since Dg ∈ Cc(R), we have D(f ∗ g) = f ∗ Dg ∈ C0(R) also by [120](iv), and thus f ∗ g ∈ C¹_0(R). Now repeating the argument with Dg, D²g, etc. in place of g, we get the desired result.
Case-2 : f ∈ Lp(R), where p ∈ (1, ∞). Let K = supp(g). Since f ∗ g(x) = ∫_R 1_{x−K}(y)f(y)g(x − y) dy and since y ↦ 1_{x−K}(y)f(y) belongs to L1(R) by Hölder’s inequality, the result follows from Case-1.
Case-3 : f ∈ L∞(R). Let K = supp(g). Since f ∗ g(x) = ∫_R 1_{x−K}(y)f(y)g(x − y) dy and since y ↦ 1_{x−K}(y)f(y) belongs to L1(R), the result follows again from Case-1.
(iv) Use the argument in case-1 of the proof of (iii), and use [120](ii) to say f ∗Dg ∈ C0(R). �
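The identity D(f ∗ g) = f ∗ Dg from [121] can be sanity-checked numerically. In the sketch below (my own, not from the notes; NumPy assumed, grid and tolerance are ad hoc) f is a non-differentiable indicator and the smooth factor g is a Gaussian, which is not compactly supported but decays fast enough on the truncated window for the illustration.

```python
import numpy as np

# With f = 1_[-1,1] (merely integrable, not differentiable) and the smooth
# g(x) = exp(-x^2), the convolution f*g is smooth and D(f*g) = f*(Dg).
# We compare a finite-difference derivative of f*g with f convolved with g'.
h = 2e-3
x = np.arange(-8.0, 8.0 + h, h)
f = (np.abs(x) <= 1.0).astype(float)
g = np.exp(-x**2)
dg = -2.0 * x * np.exp(-x**2)

fg = np.convolve(f, g, mode="same") * h      # samples of f*g
f_dg = np.convolve(f, dg, mode="same") * h   # samples of f*(Dg)
d_fg = np.gradient(fg, h)                    # centered difference of f*g

# compare away from the window ends (to avoid truncation effects)
mid = slice(1000, len(x) - 1000)
assert np.max(np.abs(d_fg[mid] - f_dg[mid])) < 1e-4
print("D(f*g) agrees with f*Dg on the grid")
```

Here the smoothness of the convolution comes entirely from g: differentiation is moved onto the smooth factor, exactly as in the proof of [121](i).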
We will see below that R has an approximate identity {Hδ : δ > 0} with the additional property
that Hδ ∈ D for every δ > 0. This tool is useful in approximating Lp-functions by members of D.
Exercise-21: (i) Let h : R → R be h(x) = e^{−1/x} 1_{(0,∞)}(x). Then h ≥ 0 and h ∈ E.
(ii) For a < b, let h_{a,b}(x) = h(x − a)h(b − x). Then h_{a,b} ≥ 0, h_{a,b} ∈ D and supp(h_{a,b}) = [a, b].
(iii) Let H(x) = c · h_{−1,1}(x) = c e^{−2/(1−x²)} 1_{(−1,1)}(x), where c > 0 is a constant chosen so that ∫_R H(x) dx = 1. Then H ≥ 0, H ∈ D and supp(H) = [−1, 1].
(iv) For δ > 0, let Hδ(x) = δ^{−1}H(x/δ). Then Hδ ≥ 0, ∫_R Hδ = 1, Hδ ∈ D and supp(Hδ) = [−δ, δ]. Thus {Hδ : δ > 0} as δ → 0 is a nonnegative approximate identity on R satisfying the L1-concentration condition in a strong sense (the family {Hδ : δ > 0} is also called a mollifier).
(v) If f ∈ Cc(R), then ∥f − f ∗Hδ∥∞ → 0 as δ → 0.
(vi) Let 1 ≤ p <∞. If f ∈ Lp(R), then ∥f − f ∗Hδ∥p → 0 as δ → 0.
[Hint : (i) For x > 0, show inductively that h^{(n)}(x) = p_n(1/x)h(x) for some polynomial p_n. (ii)-(iv) are easy consequences of (i). Statements (v) and (vi) follow from part (iv) and [102].]
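A discretized mollifier along the lines of Exercise-21 (a sketch of mine, not from the notes; it assumes NumPy, the exponent −2/(1 − x²) follows the formula for h_{−1,1}, and the grid parameters are arbitrary):

```python
import numpy as np

# Mollifier H_delta built from H(x) = c * exp(-2/(1-x^2)) on (-1,1),
# normalized to integral 1.  Convolving the non-smooth f = 1_[-1,1] with
# H_delta smooths it, and ||f - f*H_delta||_1 -> 0 as delta -> 0.
h = 2e-3
x = np.arange(-3.0, 3.0 + h, h)

def H_delta(delta):
    u = x / delta
    v = np.where(np.abs(u) < 1, np.exp(-2.0 / np.maximum(1 - u**2, 1e-300)), 0.0)
    return v / (np.sum(v) * h)             # normalize: integral H_delta = 1

f = (np.abs(x) <= 1.0).astype(float)
errs = []
for delta in [0.5, 0.25, 0.1, 0.05]:
    smooth = np.convolve(f, H_delta(delta), mode="same") * h
    errs.append(np.sum(np.abs(f - smooth)) * h)    # ||f - f*H_delta||_1
assert all(e2 < e1 for e1, e2 in zip(errs, errs[1:]))  # error decreases
assert errs[-1] < 0.1
print("L1 mollification errors:", [round(e, 4) for e in errs])
```

The error is concentrated in bands of width about 2δ around the two jumps of f, which is why it is proportional to δ; this is the discrete shadow of statement (vi).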
[122] D := C∞c (R) is dense in (Lp(R), ∥ · ∥p) for 1 ≤ p <∞.
Proof. Since Cc(R) is dense in (Lp(R), ∥ · ∥p) by Exercise-1, it suffices to show f ∈ Cc(R) can be
approximated in Lp-norm by members of D. We know ∥f − f ∗Hδ∥p → 0 by Exercise-21(vi). Since
f ∈ Cc(R) and Hδ ∈ D, we also have f ∗Hδ ∈ D by [121](ii), and we are done. �
10. Topologies on the spaces D = C∞c (R) and E = C∞(R)
The theory of distributions (to be introduced soon) is based on the three spaces D ⊂ S ⊂ E, where D := C∞_c(R), E := C∞(R), and S is the Schwartz space (to be defined in the next section). In this
section, we will introduce suitable topologies on D and E , and will mention a few basic properties
of these spaces. First we will review a few selected facts from Functional Analysis that we need.
Definition: Let X be a topological vector space (over C), i.e., X is a vector space having a Hausdorff topology, and the maps (a, x) ↦ ax from C × X to X and (x, y) ↦ x + y from X² to X are continuous.
(i) X is locally convex if 0 ∈ X has a neighborhood base consisting of convex open sets.
(ii) X is a Frechet space if X is locally convex and admits a complete translation invariant metric inducing its topology.
Remark: Many function spaces appearing in Analysis do not admit any natural structure of a
Banach space, but they retain two nice properties of a Banach space - local convexity and the
existence of an admissible complete metric - so that they are Frechet spaces (we will see examples
shortly). One advantage of a Frechet space is that Baire category theorem, and consequently many
classic theorems in Functional Analysis based on Baire category theorem (Open mapping theorem,
Uniform boundedness theorem, etc.) hold good on Frechet spaces - see Rudin, Functional Analysis.
Definition: Let X be a vector space (over C). A function p : X → [0,∞) is a seminorm if
p(ax) = |a|p(x) and p(x + y) ≤ p(x) + p(y) for every a ∈ C and x, y ∈ X. The first property
implies p(x) = 0 if x = 0 (take a = 2). If the converse (that is, p(x) = 0 ⇒ x = 0) also holds, then
p becomes a norm on X. An easy way to produce a seminorm p is: choose any linear functional
ϕ : X → C and put p(x) = |ϕ(x)|. A useful observation about a seminorm p is the following: since p(x) ≤ p(y) + p(x − y) and p(y) ≤ p(x) + p(y − x), we get |p(x) − p(y)| ≤ p(x − y).
Definition: Let X be a vector space (over C), and P be a family of seminorms on X.
(i) P is separating if the only x ∈ X with p(x) = 0 ∀ p ∈ P is x = 0.
(ii) The topology generated by P on X is the smallest topology on X that makes every p ∈ P
continuous. Sets of the form {x ∈ X : pj(x) < ε for 1 ≤ j ≤ k}, where ε > 0 and p1, . . . , pk ∈ P ,
form a base at 0 ∈ X for this topology. Using this, observe that the topology generated by a family
of seminorms is always locally convex.
(iii) P is directed if ∀ p1, p2 ∈ P, ∃ p3 ∈ P with p3 ≥ max{p1, p2}. If P is directed, then the basic neighborhoods of 0 ∈ X have the form {x ∈ X : p(x) < ε} for some ε > 0 and p ∈ P.
[123] (Working knowledge about a topology specified by seminorms) Let X be a topological vector
space, where the topology is generated by a directed family P of seminorms. Then,
(i) A seminorm q on X (not necessarily a member of P ) is continuous iff there exist p ∈ P and
C > 0 such that q(x) ≤ Cp(x) for every x ∈ X. Consequently, a linear functional ϕ : X → C is
continuous iff there exist p ∈ P and C > 0 such that |ϕ(x)| ≤ Cp(x) for every x ∈ X.
(ii) Let P be separating and countable, say P = {pk : k ∈ N}, and let Uk = {x ∈ X : pk(x) < 1/k}.
Then {Uk : k ∈ N} is a local base at 0 for X (we remark that even when X is metrizable, it is more
convenient to use the seminorms than the metric).
(iii) If P = {pk : k ∈ N} is separating, then X is metrizable with a translation invariant metric d(x, y) := ∑_{k=1}^∞ 2^{−k} min{1, pk(x − y)}. Here, translation invariance means d(x + z, y + z) = d(x, y).
(iv) Let P = {pk : k ∈ N} be separating. Then, (xn) → x in X ⇔ for every k ∈ N, there is n0 ∈ N
such that pk(x− xn) < 1/k for every n ≥ n0; and (xn) is Cauchy in X ⇔ for every k ∈ N, there is
n0 ∈ N such that pk(xm − xn) < 1/k for every m,n ≥ n0.
Proof. (i) Suppose q is continuous, and let U ⊂ X be a basic neighborhood of 0 ∈ X with q(U) ⊂ [0, 1). We may assume U = {x ∈ X : p(x) < ε} for some ε > 0 and p ∈ P since P is directed. We claim that C := 2/ε works. Consider x ∈ X. If p(x) = 0, then p(ax) = ap(x) = 0 so that ax ∈ U for every a > 0, which implies aq(x) = q(ax) < 1 for every a > 0; hence q(x) = 0 and trivially q(x) ≤ Cp(x). If p(x) > 0, then for a := ε/(2p(x)) we have ax ∈ U and aq(x) = q(ax) < 1; this implies q(x) < 1/a = Cp(x). Conversely, if the given condition holds, then q is continuous at 0 ∈ X; and the continuity at a general point follows from |q(x) − q(y)| ≤ q(x − y) ≤ Cp(x − y). The second assertion about a linear functional follows by applying what is already proved to q(x) := |ϕ(x)|.
(ii) Consider a basic neighborhood U = {x ∈ X : pj(x) < ε for 1 ≤ j ≤ m} of 0, where pj ∈ P and
ε > 0. Let n ≥ m be such that 1/n < ε, and then choose pk ≥ max{p1, . . . , pn} using the fact that
P is directed. Then clearly 0 ∈ Uk ⊂ U , and this shows {Uk : k ∈ N} is a local base at 0.
(iii) Translation invariance and symmetry of d are clear, and the triangle inequality for d follows from those of the pk's. We have d(x, x) = 0 since pk(0) = 0. If d(x, y) = 0, then pk(x − y) = 0 for every k ∈ N, which implies x − y = 0, i.e., x = y, since P is separating. Thus d is a translation invariant metric. Now we verify, using the sets Uk from (ii), that d induces the same topology. If d(0, x) < 2^{−k}/k, then 2^{−k} min{1, pk(x)} < 2^{−k}/k, so min{1, pk(x)} < 1/k ≤ 1 and hence pk(x) < 1/k; thus Bd(0, 2^{−k}/k) ⊂ Uk. For the other direction, given ε > 0, choose m ∈ N with 1/m < ε/2 and ∑_{j=m+1}^∞ 2^{−j} < ε/2. Then choose k > m with pk ≥ max{p1, . . . , pm} by the directedness of P. For any x ∈ Uk, we have pj(x) ≤ pk(x) < 1/k for 1 ≤ j ≤ m, and hence d(0, x) ≤ ∑_{j=1}^m 2^{−j}/k + ∑_{j=m+1}^∞ 2^{−j} < 1/k + ε/2 < ε, which shows Uk ⊂ Bd(0, ε).
(iv) This follows from (ii). �
We now apply these tools from Functional Analysis to the function spaces of our interest:
Exercise-22: (Examples) (i) Let ρN (f) = max{|f(x)| : |x| ≤ N} for f ∈ C(R) and N ∈ N. Then
P = {ρN : N ∈ N} is a directed separating family of seminorms on C(R), and C(R) is a Frechet
space w.r.to the metric induced by P . Also, the topology on C(R) obtained in this way (the topology
of uniform convergence on compact sets) coincides with the compact-open topology on C(R).
(ii) For f ∈ E := C∞(R), and N ∈ N, let pN (f) = max{|Djf(x)| : 0 ≤ j ≤ N and |x| ≤ N}. Then
P = {pN : N ∈ N} is a directed separating family of seminorms on E , and E is a Frechet space
w.r.to the metric induced by P . If we put UN = {f ∈ E : pN (f) < 1/N}, then {UN : N ∈ N} is a
local base at 0 for E . For fn, f ∈ E , we have (fn) → f in E iff for each N ∈ N, there is n0 ∈ N such
that pN (f − fn) < 1/N for every n ≥ n0. A linear functional ϕ : E → C is continuous iff there exist
C > 0 and N ∈ N such that |ϕ(f)| ≤ CpN (f) for every f ∈ E .
(iii) Dk := {f ∈ E : supp(f) ⊂ [−k, k]} is a closed vector subspace of E (hence a Frechet space), and note ∪_{k=1}^∞ Dk = D. The seminorms pN from part (ii) become norms when restricted to Dk, and {Dk ∩ UN : N ∈ N} is a local base at 0 for Dk, where UN is as in (ii). For f, fn ∈ Dk, we have (fn) → f in Dk iff for each N ∈ N, there is n0 ∈ N such that pN(f − fn) < 1/N for every n ≥ n0.
[Hint : (ii) Completeness of E = C∞(R): if (fn) is Cauchy in E, then (D^j fn)_{n=1}^∞ is Cauchy in the Frechet space C(R) for each j ≥ 0; put g_j := lim_n D^j fn and deduce D^j g_0 = g_j for every j ∈ N using Theorem 7.17 in Rudin, Principles of Mathematical Analysis. (iii) Let ϕx : E → C be the continuous functional ϕx(f) = f(x). Then Dk = ∩_{|x|>k} ker(ϕx), and hence Dk is closed.]
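The seminorm-to-metric recipe of [123](iii), specialized to the seminorms ρ_N of Exercise-22(i), can be played with numerically (an illustration of mine, not from the notes; it assumes NumPy, functions are represented by samples on a finite grid, and the truncation of the series at K = 10 matches the grid window):

```python
import numpy as np

# For sampled continuous functions on R, the seminorms
# rho_N(f) = max{|f(x)| : |x| <= N} give the translation-invariant metric
#   d(f, g) = sum_k 2^{-k} min(1, rho_k(f - g)),
# which metrizes uniform convergence on compact sets.
x = np.linspace(-10.0, 10.0, 20_001)

def rho(f, N):
    return np.max(np.abs(f[np.abs(x) <= N]))

def d(f, g, K=10):
    return sum(2.0**-k * min(1.0, rho(f - g, k)) for k in range(1, K + 1))

f = np.sin(x)
# f_n = sin(x) + x/n converges to f uniformly on every compact set,
# though not uniformly on all of R; the metric d detects this convergence.
dists = [d(f + x / n, f) for n in [1, 10, 100, 1000]]
assert all(b < a for a, b in zip(dists, dists[1:]))   # d -> 0
assert dists[-1] < 0.05
assert d(f, f) == 0.0
print("metric values:", [round(v, 4) for v in dists])
```

Each tail term is capped by 2^{−k}, so large-|x| behavior is discounted: that is exactly how the metric encodes "uniform convergence on compacts" rather than on all of R.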
Remark: By Exercise-22(ii), (fn) → f in E ⇔ for each N ∈ N and j ≥ 0, (D^j fn) → D^j f uniformly on [−N, N] as n → ∞ ⇔ for each j ≥ 0, (D^j fn) → D^j f uniformly on compact subsets of R. In higher dimension, if we consider E(W) := C∞(W, C) for an open set W ⊂ Rm, then (fn) → f in E(W) ⇔ for each multi-index α, (D^α fn) → D^α f uniformly on compact subsets of W.
Exercise-23: (i) (Existence of smooth bump function) Given 0 < a < b, there exists g ∈ D such
that 0 ≤ g ≤ 1, g ≡ 1 on [−a, a], and supp(g) ⊂ [−b, b].
(ii) D is dense in E .
[Hint : (i) By Exercise-21, there is f ∈ E with f(x) = 0 for x ≤ 0 and f(x) > 0 for x > 0. Let h(x) = f(b − x)/[f(b − x) + f(x − a)]. Then h ≡ 1 on (−∞, a], h ≡ 0 on [b, ∞), and 0 < h < 1 on (a, b). Put g(x) = h(|x|). (ii) Let f ∈ E. Pick g ∈ D by (i) with g ≡ 1 on [−1, 1]. Define gn(x) = g(x/n). Then fgn ∈ D, and fgn ≡ f on [−n, n] so that pN(f − fgn) = 0 for n > N, giving (fgn) → f in E.]
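The construction in the hint can be coded directly (a sketch of mine, not from the notes; it assumes NumPy, a = 1 and b = 2 are arbitrary choices, and the tiny constants only guard against division by zero and overflow in floating point):

```python
import numpy as np

# Smooth bump via f(x) = exp(-1/x) for x > 0, f = 0 otherwise (Exercise-21):
# h(x) = f(b-x)/(f(b-x) + f(x-a)) is 1 on (-inf, a], 0 on [b, inf), and
# strictly between 0 and 1 on (a, b); g(x) = h(|x|) is then a bump with
# g = 1 on [-a, a] and supp(g) contained in [-b, b].
def f(x):
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, np.exp(-1.0 / np.maximum(x, 1e-300)), 0.0)

def bump(x, a=1.0, b=2.0):
    y = np.abs(x)
    num = f(b - y)
    den = f(b - y) + f(y - a)       # den > 0 for every y since a < b
    return np.where(den > 0, num / np.maximum(den, 1e-300), 0.0)

xs = np.linspace(-3.0, 3.0, 6001)
g = bump(xs)
assert np.all(g[np.abs(xs) <= 1.0] == 1.0)   # g = 1 on [-a, a]
assert np.all(g[np.abs(xs) >= 2.0] == 0.0)   # supp(g) in [-b, b]
assert np.all((g >= 0.0) & (g <= 1.0))
assert abs(bump(1.5) - 0.5) < 1e-12          # midpoint symmetry: h(1.5) = 1/2
print("bump: g(0) =", g[3000], ", g(1.5) =", round(bump(1.5).item(), 4))
```

The midpoint value 1/2 comes from the symmetry f(b − y) = f(y − a) at y = (a + b)/2; smoothness is inherited from the single function f of Exercise-21.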
Remark: Since the closure of D in E is E while D ≠ E, Exercise-23(ii) shows that D is not closed (hence not complete) in E (another argument for this uses the Baire category theorem, after noting that the proper closed subspaces Dk must be nowhere dense in D). Intuitively, D is not closed in E because if (fn) is a sequence in D converging to f ∈ E, then supp(fn) can get bigger with n, so that in the limit supp(f) may fail to be compact. Since it is desirable to work with complete spaces, we will now put a (sequentially complete) topology Tind, called the inductive limit topology, on D. This topology Tind will be such that if (fn) is Cauchy in (D, Tind), then there is a uniform bound for the supports supp(fn), and this will ensure that the support of f := lim fn is also compact.
As before, let Dk = {f ∈ D : supp(f) ⊂ [−k, k]} and let Tk denote the topology on Dk (subspace
topology induced from E). Keep in mind that (Dk, Tk) is a Frechet space.
Definition: The inductive limit topology Tind on D is defined as the finest (strongest) locally convex
topology on D such that the inclusions Dk ⊂ D become continuous for every k ∈ N.
Remark: By definition, Tind is stronger than the subspace topology on D induced by E . Conse-
quently, Dk is closed in (D, Tind) for each k ∈ N by Exercise-22(iii).
Exercise-24: (i) The collection of all convex balanced sets U ⊂ D such that Dk ∩ U is open in Dk for every k ∈ N forms a local base at 0 for the locally convex space (D, Tind). (Here, U is said to be balanced if cU ⊂ U for all c ∈ C with |c| ≤ 1.)
(ii) The subspace topology induced on Dk from Tind coincides with the original topology Tk of Dk.
Thus U ⊂ D is open in D iff Dk ∩ U is open in Dk for every k ∈ N .
[Hint : We leave this as a reading assignment - see 6.4 in Rudin, Functional Analysis.]
Definition: Let X,Y be topological vector spaces. (i) A ⊂ X is a bounded subset if for every
neighborhood U of 0 ∈ X, there is c > 0 such that cA ⊂ U . (ii) A sequence (xn) in X is Cauchy
if for every neighborhood U ⊂ X of 0, there is n0 ∈ N such that xn − xm ∈ U for every m,n ≥ n0;
and X is (sequentially) complete if every Cauchy sequence in X converges to some element of X.
(iii) A linear map T : X → Y is bounded if T (A) is bounded in Y whenever A ⊂ X is bounded.
[124] (i) A ⊂ D is bounded ⇔ there is k ∈ N such that A ⊂ Dk and A is bounded in Dk.
(ii) (fn) → f in D ⇔ there is k ∈ N such that f, fn ∈ Dk for every n ∈ N and (fn) → f in Dk.
(iii) D is sequentially complete.
(iv) D is not metrizable.
Proof. In (i) and (ii), we will prove only the implication ‘⇒’ since ‘⇐’ is a direct consequence of
the continuity of the inclusion Dk ⊂ D.
(i) Suppose A ⊂ D is bounded. If A is not a subset of Dk for any k, choose fk ∈ A \ Dk for every
k ∈ N. Then there are xk ∈ R with |xk| > k and εk := |fk(xk)/k| > 0 for every k ∈ N. Let
U = {f ∈ D : |f(xk)| < εk for every k ∈ N}. We claim that Dk ∩ U is a neighborhood of 0 in Dk
for every k ∈ N. Given k ∈ N, choose m ∈ N such that |xj| < m and 1/m < εj for 1 ≤ j ≤ k. Then Um := {f ∈ Dk : pm(f) < 1/m} ⊂ U, and this establishes the claim. Therefore by Exercise-24(ii),
U is a neighborhood of 0 in D. Since (fk) is bounded, we must have (fk/k) → 0. But fk/k /∈ U
for any k ∈ N by the definition of U , a contradiction. This contradiction establishes that A ⊂ Dk
for some k ∈ N. To show A is bounded in Dk, consider a neighborhood V of 0 in Dk. Then
V = Dk ∩ U for some neighborhood U of 0 in D by Exercise-24(ii). Let c > 0 be with cA ⊂ U .
Then cA ⊂ cDk ∩ U = Dk ∩ U = V , where cDk = Dk since Dk is a vector subspace.
(ii) This follows from part (i) and Exercise-24(ii) since {f} ∪ {fn : n ∈ N} is bounded.
(iii) This follows from (i) and the completeness of Dk since a Cauchy sequence is bounded.
(iv) If D were metrizable, then it would be a complete metric space by (iii). Now D = ∪_{k=1}^∞ Dk is a union of proper closed vector subspaces (Dk ≠ D by Exercise-21(ii), and each Dk is closed in D). A proper closed vector subspace must be nowhere dense. Hence we arrive at a contradiction with the Baire category theorem. �
Remark: More explicitly, [124](ii) means: (fn) → f in D ⇔ there is k ∈ N such that supp(f) and supp(fn) are subsets of [−k, k] for every n ∈ N and lim_{n→∞} ∥D^j f − D^j fn∥∞ = 0 for every j ≥ 0. In higher dimension, if we consider D(W) := C∞_c(W, C) for an open set W ⊂ Rm, then (fn) → f in D(W) ⇔ there is a compact set K ⊂ W such that supp(f), supp(fn) ⊂ K for every n, and (D^α fn) → D^α f uniformly on K for each multi-index α.
Seminar topic: The spaces E , Dk and D have the Heine-Borel property: closed and bounded subsets
are compact (see 1.46 and 6.7 in Rudin, Functional Analysis).
Even though D is not metrizable, continuity can be characterized using sequences:
Exercise-25: Let Y be a locally convex vector space and T : D → Y be linear. Then TFAE:
(i) T is continuous.
(ii) T|_{Dk} is continuous for each k ∈ N.
(iii) T is bounded.
(iv) (fn) → 0 in D implies (Tfn) → 0 in Y .
[Hint : (i) ⇔ (ii) can be deduced using Exercise-24(ii) (a general property of the inductive limit topology), and (i) ⇔ (iii) is routine Functional Analysis. We have (iv) ⇒ (ii) since Dk is metrizable.]
Remark: Since pN (Djf) ≤ pN+j(f), the map f 7→ Djf is continuous on Dk for every k, j ∈ N, and
hence f 7→ Djf from D to D is continuous for every j ∈ N by Exercise-25.
11. The Schwartz space S
We have Cc(R) ⊂ C0(R) ⊂ C(R), where the members f ∈ C0(R) are defined in terms of the
decay of f at infinity. Similarly, now we will define7 the Schwartz space S of smooth functions with
D ⊂ S ⊂ E , where the members f ∈ S are defined by requiring that f and all its derivatives decay
rapidly at infinity. Later we will see that S is the natural domain for the Fourier transform.
Exercise-26: Write xif for the function x 7→ xif(x). For f ∈ C(R), the following are equivalent:
(i) xif ∈ L∞(R) for i = 0, 1, 2, . . ..
(ii) (1 + |x|)if ∈ L∞(R) for i = 0, 1, 2, . . ..
(iii) xif ∈ C0(R) for i = 0, 1, 2, . . ..
(iv) (1 + |x|)if ∈ C0(R) for i = 0, 1, 2, . . ..
[Hint : Since |x|^i ≤ (1 + |x|)^i = ∑_{j=0}^i c_j |x|^j ≤ C|x|^i for |x| ≥ 1, we get (i) ⇔ (ii) and (iii) ⇔ (iv). If ∥x^{i+1} f∥∞ < ∞, then lim_{|x|→∞} |x|^i |f(x)| = 0, and this shows (i) ⇒ (iii).]
Definition: Define the Schwartz space S as below, and note D ⊂ S ⊂ C∞_0(R) ⊂ E:
S := {f ∈ E : x^i D^j f ∈ L∞(R) ∀ i, j ≥ 0} = {f ∈ E : (1 + |x|)^i D^j f ∈ L∞(R) ∀ i, j ≥ 0}
= {f ∈ E : x^i D^j f ∈ C0(R) ∀ i, j ≥ 0} = {f ∈ E : (1 + |x|)^i D^j f ∈ C0(R) ∀ i, j ≥ 0}.
Example: The Gaussian x ↦ e^{−x²} belongs to S, but x ↦ (1 + x²)^{−1} is not a member of S (for instance, x³(1 + x²)^{−1} is unbounded).
Exercise-27: If we put si,j(f) = ∥xiDjf∥∞, then the family {si,j : i, j ≥ 0} of seminorms defines a
locally convex topology on the vector space S. Moreover:
(i) The family {s_{i,j} : i, j ≥ 0} of seminorms is separating but is not a directed family. However, if we put q_N = ∑_{0≤i,j≤N} s_{i,j}, then {q_N : N ∈ N} is a separating directed family of seminorms on S inducing the same topology on S.
(ii) Let V_N = {f ∈ S : q_N(f) < 1/N} = {f ∈ S : ∑_{0≤i,j≤N} s_{i,j}(f) < 1/N}. Then {V_N : N ∈ N} is a local base at 0 for S.
(iii) A linear functional ϕ : S → C is continuous ⇔ there exist C > 0 and N ∈ N such that |ϕ(f)| ≤ Cq_N(f) = C ∑_{0≤i,j≤N} s_{i,j}(f) for every f ∈ S.
7L. Schwartz, who introduced the theory of distributions, is not the H.A. Schwarz of the Cauchy-Schwarz inequality.
(iv) For f, fn ∈ S, we have (fn) → f in S ⇔ for every N ∈ N, there is n0 ∈ N such that q_N(f − fn) = ∑_{0≤i,j≤N} s_{i,j}(f − fn) < 1/N for every n ≥ n0.
(v) S admits an invariant metric that is complete. Thus S is a Frechet space.
[Hint : After verifying (i), deduce the other statements using [123] and the hint of Exercise-22(ii).]
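The seminorms s_{i,j} can be estimated on a grid to see why a Gaussian lies in S while (1 + x²)^{−1} does not (an illustration of mine, not from the notes; it assumes NumPy, and the supremum over R is only approximated by a max over [−R, R]):

```python
import numpy as np

# Approximate s_{3,0}(f) = ||x^3 f||_inf on growing windows [-R, R]:
# for f(x) = exp(-x^2) the value stabilizes (the sup over R is finite),
# while for f(x) = 1/(1+x^2) the value grows like R, i.e. x^3 f is
# unbounded, so (1+x^2)^{-1} fails to belong to S.
for R in [10.0, 100.0, 1000.0]:
    x = np.linspace(-R, R, 200_001)
    gauss = np.max(np.abs(x**3 * np.exp(-x**2)))
    rational = np.max(np.abs(x**3 / (1 + x**2)))
    assert gauss < 0.6            # true sup is (3/2)^{3/2} e^{-3/2} ~ 0.41
    assert rational > 0.9 * R     # grows with the window: unbounded
print("x^3 e^{-x^2} stays bounded; x^3/(1+x^2) does not")
```

The same experiment with any fixed i, j tells the same story: every polynomial weight loses against Gaussian decay, which is the defining feature of S.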
Exercise-28: (i) D is dense in S.
(ii) The inclusions Dk ⊂ S (hence D ⊂ S), S ⊂ (C∞_0(R), ∥ · ∥∞), and S ⊂ E are all continuous.
(iii) For 1 ≤ p <∞ we have S ⊂ Lp(R) and the inclusion is continuous.
(iv) S is closed under the following linear maps: f 7→ xif for every i ≥ 0 (and hence under f 7→ gf
for any polynomial g), and f 7→ Djf for every j ≥ 0. These maps are continuous on S.
[Hint : (i) Let f ∈ S. By Exercise-23(i), there is g ∈ D, 0 ≤ g ≤ 1, with g ≡ 1 on [−1, 1]. Define gn(x) = g(x/n) so that fgn ∈ D and fgn ≡ f on [−n, n]. Note that D^j(f − fgn) = ∑_{k=0}^j c_k D^k f · D^{j−k}(1 − gn). Let M = max_{0≤k≤j} ∥D^{j−k}(1 − g)∥∞. Then sup_n ∥D^{j−k}(1 − gn)∥∞ ≤ M for 0 ≤ k ≤ j. Therefore, s_{i,j}(f − fgn) ≤ M sup_{|x|>n} ∑_{k=0}^j c_k |x^i D^k f(x)| → 0. (ii) Fix N > k. Let M = max{|x^i| : 0 ≤ i ≤ N, |x| ≤ N}. Then q_N(f) = ∑_{0≤i,j≤N} s_{i,j}(f) ≤ ∑_{0≤i,j≤N} M p_N(f) for f ∈ Dk. Hence the inclusion Dk ⊂ S is continuous. Next, S ⊂ C∞_0(R) and S ⊂ E are continuous since ∥f∥∞ = q_0(f) and p_N(f) ≤ q_N(f). (iii) Let g(x) = 1/(1 + |x|)². Note g ∈ Lp(R). Let C > 0 be with (1 + |x|)² ≤ C|x|² for |x| ≥ 1. For f ∈ S, we have ∫_R |f|^p ≤ ∫_{|x|≤1} |f|^p + ∫_{|x|>1} |Cx²gf|^p ≤ 2 s_{0,0}(f)^p + C^p s_{2,0}(f)^p ∥g∥_p^p. (iv) For continuity, note s_{i1,j}(x^{i2} f) ≤ s_{i1+i2,j}(f) and s_{i,j1}(D^{j2} f) ≤ s_{i,j1+j2}(f).]
[125] If f, g ∈ S, then f ∗ g ∈ S. And the bilinear map (f, g) 7→ f ∗ g from S2 to S is continuous.
Proof. Let f, g ∈ S ⊂ C∞_0(R) ∩ L1(R). The second inclusion ensures that f ∗ g is defined. Fix i, j ≥ 0. We need to show x^i D^j(f ∗ g) ∈ L∞(R). Let h = D^j g, which belongs to S. By [121](iv), f ∗ g ∈ C∞_0(R) and D^j(f ∗ g) = f ∗ h. Writing x^i = (y + (x − y))^i = ∑_{k=0}^i c_k y^k (x − y)^{i−k}, observe that
x^i (f ∗ h)(x) = ∫_R x^i f(y)h(x − y) dy = ∑_{k=0}^i c_k ∫_R y^k f(y) (x − y)^{i−k} h(x − y) dy = ∑_{k=0}^i c_k (x^k f ∗ x^{i−k}h)(x).
Since f, h ∈ S, we have x^k f, x^{i−k}h ∈ S ⊂ C∞_0(R) ∩ L1(R). So by [121](iv) we get x^k f ∗ x^{i−k}h ∈ C∞_0(R). Hence x^i(f ∗ h) ∈ C∞_0(R) ⊂ L∞(R), and this completes the proof that f ∗ g ∈ S.
To establish the continuity of the bilinear map (f, g) ↦ f ∗ g, consider sequences (fn) → 0 and (gn) → 0 in S. In view of Exercise-27(iv), we need to show lim_{n→∞} s_{i,j}(fn ∗ gn) = 0 for every i, j ≥ 0. From the above arguments, s_{i,j}(fn ∗ gn) = ∥x^i D^j(fn ∗ gn)∥∞ ≤ ∑_{k=0}^i c_k ∥x^k fn ∗ x^{i−k}D^j gn∥∞. By Exercise-20, ∥x^k fn ∗ x^{i−k}D^j gn∥∞ ≤ ∥x^k fn∥1 ∥x^{i−k}D^j gn∥∞. The right hand side goes to 0 because (x^k fn) → 0 and (x^{i−k}D^j gn) → 0 in S by Exercise-28(iv), and then ∥x^k fn∥1 → 0 and ∥x^{i−k}D^j gn∥∞ → 0 since the inclusions S ⊂ L1(R) and S ⊂ (C∞_0(R), ∥·∥∞) are continuous by parts (iii) and (ii) of Exercise-28. Hence lim_{n→∞} s_{i,j}(fn ∗ gn) = 0. �
Remark: Similarly, it can be shown that the map (f, g) ↦ f ∗ g is bilinear and continuous from D² to D, from D × C∞_0(R) to C∞_0(R), etc. (try to write the proofs for some of them).
12. Distributions: preliminaries
Motivations for introducing distributions: (i) Two fundamental operations in Calculus are differ-
entiation and integration. People say that the theory of distributions is a completion of the theory
of ordinary differentiation just as the theory of Lebesgue integration is a completion of the theory
of Riemann integration.
(ii) We will show that all distributions (also called generalized functions) are differentiable in a
certain sense8. It will turn out that all locally integrable functions and all locally finite Borel
measures are distributions, and hence we can differentiate them!
(iii) Certain partial differential equations may have meaningful solutions that are differentiable only
almost everywhere. The framework of distributions provides a proper place for such solutions.
(iv) Distributions provide a mathematical framework in which certain ‘functions’, such as the Dirac delta function δ, that are not functions in the ordinary sense (informally, δ(0) = ∞ and δ ≡ 0 on R \ {0}) obtain a proper mathematical status.
(v) The theory of Fourier transform can be elegantly developed playing with the Schwartz class
and tempered distributions (this is the approach in Grafakos, Classical Fourier Analysis).
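Motivation (iv) can be made concrete already with the mollifier Hδ of Exercise-21: pairing Hδ against a test function reproduces evaluation at 0 in the limit δ → 0, which is precisely the action ⟨δ, f⟩ = f(0) of the Dirac distribution. A numerical sketch (mine, not from the notes; NumPy assumed, the exponent −2/(1 − x²) follows h_{−1,1}, and the grid is arbitrary):

```python
import numpy as np

# The delta distribution as a limit of honest functions:
# <H_delta, f> = integral of H_delta * f  ->  f(0)  as delta -> 0.
h = 1e-4
x = np.arange(-1.0, 1.0 + h, h)

def pair_with_H(f_vals, delta):
    u = x / delta
    H = np.where(np.abs(u) < 1, np.exp(-2.0 / np.maximum(1 - u**2, 1e-300)), 0.0)
    H /= np.sum(H) * h                     # normalize: integral H_delta = 1
    return np.sum(H * f_vals) * h          # the pairing <H_delta, f>

f = np.cos(3.0 * x)                        # test function with f(0) = 1
vals = [pair_with_H(f, d) for d in [0.5, 0.1, 0.01]]
errs = [abs(v - 1.0) for v in vals]
assert all(e2 < e1 for e1, e2 in zip(errs, errs[1:]))
assert errs[-1] < 1e-3
print("<H_delta, f> -> f(0):", [round(v, 5) for v in vals])
```

No ordinary function represents the limiting functional, which is the point of enlarging the class of functions to distributions.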
For a topological vector space X, let X ′ = {ϕ : X → C : ϕ is linear and continuous} be its dual.
We will write ⟨ϕ, x⟩ for ϕ(x) when x ∈ X and ϕ is a map from X to C.
Definition: Recall the spaces D ⊂ S ⊂ E , where D = (C∞c (R), Tind), E = C∞(R), and S is the
Schwartz space. Let D′, S ′, and E ′ be their duals respectively. We have D′ ⊃ S ′ ⊃ E ′. The members
of D′ are called distributions, the members of S ′ are called tempered distributions, and the members
of E ′ are called distributions with compact support (this terminology will be clarified shortly). From
the earlier theory, we have the following characterization of members of D′, S ′, and E ′.
[126] Recall that p_N(f) = max{|f^{(k)}(x)| : 0 ≤ k ≤ N, |x| ≤ N}, s_{i,j}(f) = ∥x^i D^j f∥∞, and q_N = ∑_{0≤i,j≤N} s_{i,j}.
(i) A linear functional ϕ : D → C belongs to D′ ⇔ ϕ|_{Dk} is continuous for each k ∈ N ⇔ for each k ∈ N, there exist C > 0 and N ∈ N such that |⟨ϕ, f⟩| ≤ Cp_N(f) for every f ∈ Dk.
8The generalized derivative of functions in the sense of distributions is often called the weak derivative.
(ii) A linear functional ϕ : S → C belongs to S′ ⇔ there exist C > 0 and N ∈ N such that |⟨ϕ, f⟩| ≤ Cq_N(f) = C ∑_{0≤i,j≤N} s_{i,j}(f) for every f ∈ S.
(iii) A linear functional ϕ : E → C belongs to E′ ⇔ there exist C > 0 and N ∈ N such that |⟨ϕ, f⟩| ≤ Cp_N(f) for every f ∈ E. Consequently, for every ϕ ∈ E′, there is N ∈ N with the following property: whenever f ∈ E satisfies D^k f ≡ 0 on [−N, N] for 0 ≤ k ≤ N, then ⟨ϕ, f⟩ = 0 (the smallest such N is sometimes called the order of ϕ; thus every ϕ ∈ E′ is of finite order).
Examples: (i) (Distributions generalize the notion of locally integrable functions) We will show E ⊂ L1_loc(R) ⊂ D′. The first inclusion is clear. Now, any g ∈ L1_loc(R) induces a linear functional ϕ_g : D → C by the expression ⟨ϕ_g, f⟩ = ϕ_g(f) := ∫_R f(x)g(x) dx. Fix k ∈ N. Then ∥f∥∞ ≤ p_k(f) for every f ∈ Dk, and for C := ∫_{−k}^{k} |g| we have |⟨ϕ_g, f⟩| ≤ Cp_k(f) for f ∈ Dk. Thus ϕ_g is continuous by [126](i), and ϕ_g ∈ D′. From now onwards, we will just write ⟨g, f⟩ for ⟨ϕ_g, f⟩ if no confusion can arise.
(ii) (Tempered distributions generalize the notion of Lp-functions and polynomials) Let 1 ≤ p ≤ ∞; we will show Lp(R) ⊂ S′. If g ∈ L∞(R), then by the hint of Exercise-28(iii), there is C > 0 such that ∥f∥1 ≤ C ∑_{0≤i,j≤2} s_{i,j}(f) for every f ∈ S, and hence |⟨g, f⟩| ≤ ∥g∥∞ ∫ |f| ≤ ∥g∥∞ C ∑_{0≤i,j≤2} s_{i,j}(f). Therefore g (more precisely, the functional f ↦ ⟨g, f⟩) belongs to S′. Next suppose 1 ≤ p < ∞, let g ∈ Lp(R), and let 1 < q ≤ ∞ be such that 1/p + 1/q = 1. Let h(x) = (1 + |x|)^{−2}. Since h ∈ Lq(R), we get by Hölder’s inequality that ∫_R |gh| ≤ ∥g∥p ∥h∥q. Let C > 0 be with (1 + |x|)² ≤ C|x|² for |x| ≥ 2. Since ∫_{|x|≤2} (1 + |x|)² dx ≤ ∫_{|x|≤2} 9 dx = 36, we have
|⟨g, f⟩| ≤ (∫_{|x|≤2} + ∫_{|x|>2}) |gh| (1 + |x|)² |f| dx ≤ (36 s_{0,0}(f) + C s_{2,0}(f)) ∥g∥p ∥h∥q,
which shows g ∈ S′ by [126](ii). More generally, if g : R → R satisfies ∫_R |g(x)|(1 + |x|)^{−m} dx < ∞ for some m ≥ 2, then by an argument similar to the one given above, we can show g ∈ S′. In particular, if g : R → R is a polynomial, then g ∈ S′.
(iii) (Distributions with compact support generalize the notion of Lp-functions with bounded support9) If g ∈ Dk ⊂ L1(R), then |⟨g, f⟩| ≤ pk(f) ∫_{−k}^{k} |g| ≤ ∥g∥1 pk(f) for f ∈ E. This shows D ⊂ E′. More generally, if 1 ≤ p ≤ ∞ and g ∈ Lp(R) satisfies supp(g) ⊂ [−N, N] for some N ∈ N, then putting h = 1[−N,N] we see |⟨g, f⟩| ≤ pN(f) ∫_{−N}^{N} |g| = pN(f) ∫R |gh| ≤ ∥g∥p∥h∥q pN(f) for f ∈ E, where 1/p + 1/q = 1. This shows that, for 1 ≤ p ≤ ∞, any g ∈ Lp(R) with bounded support belongs to E′.
(iv) The inclusions E′ ⊂ S′ ⊂ D′ are proper. Let g(x) = e^{x²}1(0,∞)(x). Then g ∈ L1loc(R) ⊂ D′, but g ∉ S′ because for f ∈ S defined as f(x) = e^{−x²}, we have ⟨g, f⟩ = ∫R fg = ∫_0^∞ 1 dx = ∞. Similarly, if g ≡ 1, then g ∈ S′ (being a polynomial) but g ∉ E′ since g ∈ E and ∫R g·g = ∞.
9When we say a general function has bounded/compact support, it means the function is identically zero outside
a bounded/compact set.
(v) (Locally finite measures are distributions) A Borel measure µ on R is locally finite10 if µ(K) <∞
for every compact set K ⊂ R. If µ is a locally finite Borel measure on R, then µ ∈ D′, where
we identify µ with the linear functional f 7→ ⟨µ, f⟩ :=∫R fdµ. This linear functional is indeed
continuous since for each k ∈ N, we have |⟨µ, f⟩| ≤ (2k+1)µ([−k, k])∥f∥∞ ≤ (2k+1)µ([−k, k])pk(f)
for every f ∈ Dk. If in addition µ is compactly supported, then a similar argument gives µ ∈ E ′.
In particular, for each a ∈ R, the Dirac measure δa, defined as δa(Y) = 1 if a ∈ Y and δa(Y) = 0 if a ∉ Y, belongs to E′, and ⟨δa, f⟩ = ∫R f dδa = f(a); thus δa is just the evaluation map at a.
(vi) (Cauchy's principal value PV 1/x) The map x ↦ 1/x is locally integrable on R \ {0}, but not on R, because ∫_0^1 (1/x)dx = lim_{ε→0} ∫_ε^1 (1/x)dx = lim_{ε→0} [log x]_ε^1 = ∞. Still, there is a distribution corresponding to this map, which we introduce now. Let f ∈ D, and write f(x) = f(0) + xg(x) with g ∈ E. Since f(−ε) − f(ε) = −ε(g(−ε) + g(ε)), integration by parts yields
∫_{|x|>ε} x^{−1}f(x)dx = (∫_{−∞}^{−ε} + ∫_ε^{∞}) x^{−1}f(x)dx = −ε(g(−ε) + g(ε)) log ε − (∫_{−∞}^{−ε} + ∫_ε^{∞}) Df(x) log |x| dx.
Since g(0) = Df(0), we have lim_{ε→0} ε(g(−ε) + g(ε)) log ε = 0. Hence lim_{ε→0} ∫_{|x|>ε} x^{−1}f(x)dx = −∫_{−∞}^{∞} Df(x) log |x| dx. The distribution PV 1/x, called the Cauchy principal value, is defined as
⟨PV 1/x, f⟩ = lim_{ε→0} ∫_{|x|>ε} x^{−1}f(x)dx = −∫R Df(x) log |x| dx
for f ∈ D. The function h(x) := log |x| belongs to L1loc(R) because ∫_0^b log x dx = lim_{ε→0} ∫_ε^b log x dx = lim_{ε→0} [x log x − x]_ε^b = b log b − b. Hence h is (induces) a distribution, and ⟨PV 1/x, f⟩ = −⟨h, Df⟩, from which we may deduce that PV 1/x is indeed a distribution.
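The two expressions for ⟨PV 1/x, f⟩ above can be compared numerically. The following sketch is an illustration added here, not part of the notes; the test function x e^{−x²} and the truncation points are arbitrary choices (the test function is Schwartz rather than compactly supported, but the integration-by-parts identity extends to it since all boundary terms vanish):

```python
import numpy as np
from scipy.integrate import quad

# Test function f(x) = x e^{-x^2} and its derivative
f  = lambda x: x * np.exp(-x**2)
df = lambda x: (1 - 2 * x**2) * np.exp(-x**2)

# Symmetric truncation: here f(x)/x = e^{-x^2} is not even singular,
# and the limit of the truncated integrals is sqrt(pi).
eps = 1e-8
pv = (quad(lambda x: f(x) / x, eps, 50)[0]
      + quad(lambda x: f(x) / x, -50, -eps)[0])

# Other expression: -∫ Df(x) log|x| dx; the integrand is even and the
# log singularity at 0 is integrable, so integrate over (0, 50) and double.
rhs = -2 * quad(lambda x: df(x) * np.log(x), 0, 50, limit=200)[0]

print(pv, rhs)   # both close to sqrt(pi) ≈ 1.77245
```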
Idea behind the definition of various operations on distributions: If T : D → D is a linear operator, then correspondingly there is a transpose operator T^t : D′ → D′ given by T^tϕ(f) = ϕ(Tf) for ϕ ∈ D′ and f ∈ D. The defining expression T^tϕ(f) = ϕ(Tf) may be written as ⟨T^tϕ, f⟩ = ⟨ϕ, Tf⟩. When ϕ = ϕg for a genuine function g, the defining expression becomes ∫(T^tg)(x)f(x)dx = ∫g(x)(Tf)(x)dx. This observation tells us how to define various operations on distributions.
Definition: Let ϕ ∈ D′ and f ∈ D.
(i) (Translation) Recall the notation fy(x) = f(x − y). Observe that ∫gy(x)f(x)dx = ∫g(z)f−y(z)dz. Hence we define the translation ϕy of ϕ as ⟨ϕy, f⟩ = ⟨ϕ, f−y⟩ for y ∈ R.
(ii) (Dilation/scaling) Let a ∈ R \ {0}. Since ∫g(ax)f(x)dx = ∫g(z)f(z/a)|a|^{−1}dz, we define the dilation ϕ(a·) of ϕ as ⟨ϕ(a·), f⟩ = ⟨ϕ, |a|^{−1}f(·/a)⟩. Remark: In Rn, ⟨ϕ(a·), f⟩ := ⟨ϕ, |a|^{−n}f(·/a)⟩.
10A locally finite Borel measure on spaces such as Rn is generally called a Radon measure.
(iii) (Reflection) Let f̃(x) = f(−x). Since ∫R g̃f = ∫R gf̃, we define ⟨ϕ̃, f⟩ = ⟨ϕ, f̃⟩.
(iv) (Smooth multiplication) Since ∫(hg)f = ∫g(hf), we define hϕ ∈ D′ as ⟨hϕ, f⟩ = ⟨ϕ, hf⟩ for h ∈ E. If h ∈ D, then hϕ ∈ E′, and hϕ is called a localization of ϕ.
(v) (Differentiation) Integration by parts gives ∫(Dg)f = [gf]_{−∞}^{∞} − ∫gDf = −∫gDf since f has compact support. Hence we define the derivative Dϕ of ϕ as ⟨Dϕ, f⟩ = −⟨ϕ, Df⟩ (do not forget the minus sign!). If ϕ comes from a function g, i.e., if ⟨ϕ, f⟩ = ∫fg, then Dϕ is called the distributional derivative of g, and is also written as Dg. In this sense, every g ∈ L1loc(R) has a distributional derivative.
Remark: The above operations are also defined for ϕ ∈ S ′ and ϕ ∈ E ′ with a modification: we need
to assume h ∈ D in (iv) to ensure that hϕ ∈ S ′ when ϕ ∈ S ′.
Example: (i) Let h = 1(0,∞) : R → R, which is called the Heaviside function. For f ∈ D, the Fundamental Theorem of Calculus gives ⟨Dh, f⟩ = −⟨h, Df⟩ = −∫_0^∞ Df = f(0) = ⟨δ0, f⟩. Hence Dh = δ0, the Dirac measure at 0. Note further that ⟨Dmδa, f⟩ = (−1)^m Dmf(a) for f ∈ E, a ∈ R and m ∈ N. (ii) From the earlier discussion, the distributional derivative of the locally integrable function x ↦ log |x| is PV 1/x, since ⟨PV 1/x, f⟩ = −⟨h, Df⟩ = ⟨Dh, f⟩ for h(x) = log |x|. (iii) Let g : R → C be absolutely continuous (then g ∈ L1loc(R) ⊂ D′), let h ∈ L1loc(R), and assume g′(x) = h(x) for almost every x ∈ R. For any f ∈ D, integration by parts gives ⟨Dg, f⟩ = −⟨g, Df⟩ = −∫gDf = −[gf]_{−∞}^{∞} + ∫fh = 0 + ∫fh = ⟨h, f⟩. Thus Dg = h in the sense of distributions.
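The computation Dh = δ0 in (i) can be sanity-checked numerically; the snippet below is illustrative only (the Gaussian test function is an arbitrary choice), evaluating −⟨h, Df⟩ and comparing it with f(0) = ⟨δ0, f⟩:

```python
import numpy as np
from scipy.integrate import quad

f  = lambda x: np.exp(-x**2)            # test function (Schwartz)
df = lambda x: -2 * x * np.exp(-x**2)   # its derivative Df

# <Dh, f> = -<h, Df> = -∫_0^∞ Df(x) dx for the Heaviside function h = 1_(0,∞)
pairing = -quad(df, 0, np.inf)[0]

print(pairing)   # equals f(0) = 1, i.e. <δ_0, f>
```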
Exercise-29: (Smooth partition of unity) If K ⊂ R is a nonempty compact set and U1, . . . , Uk ⊂ R
are nonempty open sets covering K, then there are g1, . . . , gk ∈ D such that
(i) 0 ≤ gj ≤ 1 and supp(gj) ⊂ Uj for 1 ≤ j ≤ k.
(ii) ∑_{j=1}^{k} gj(x) = 1 for every x ∈ K.
Here, we say {gj : 1 ≤ j ≤ k} is a smooth partition of unity for K subordinate to {Uj : 1 ≤ j ≤ k}.
[Hint : See my notes Introduction to Manifolds.]
Now we will explain why the members of E ′ are called ‘distributions with compact support’.
Definition: We say ϕ ∈ D′ vanishes (or ϕ is 0) in an open set U ⊂ R if ⟨ϕ, f⟩ = 0 for every f ∈ D
with supp(f) ⊂ U . We define the support of ϕ, supp(ϕ), as the complement of the largest open
subset U ⊂ R on which ϕ vanishes. Exercise-30(i) below ensures that this definition is meaningful.
Example: For the Dirac measure δa at a ∈ R, we have supp(δa) = {a}.
Exercise-30: (i) If ϕ ∈ D′ vanishes on open sets U1, . . . , Uk ⊂ R, then ϕ vanishes on ∪_{j=1}^{k} Uj.
(ii) If ϕ ∈ D′ and f ∈ D have disjoint supports, then ⟨ϕ, f⟩ = 0.
(iii) Let ϕ ∈ D′, and f, g ∈ D. If f and g agree on a neighborhood of supp(ϕ), then ⟨Djϕ, f⟩ =
⟨Djϕ, g⟩ for every j ≥ 0.
(iv) If g ∈ D, and ϕg ∈ D′ is given as ⟨ϕg, f⟩ =∫R fg, then supp(ϕg) = supp(g).
[Hint: (i) Let f ∈ D be with K := supp(f) ⊂ ∪_{j=1}^{k} Uj. Let {gj : 1 ≤ j ≤ k} be a smooth partition of unity for K subordinate to {Uj : 1 ≤ j ≤ k} given by Exercise-29. Then f = ∑_{j=1}^{k} gjf, and by hypothesis ⟨ϕ, gjf⟩ = 0 for 1 ≤ j ≤ k. Hence ⟨ϕ, f⟩ = 0. Statement (iii) is a consequence of (ii) and linearity. (iv) Assume g is real valued. Let K = supp(g) and L = supp(ϕg). Clearly, L ⊂ K. If L ≠ K, using the smoothness of g, find a nondegenerate closed interval J ⊂ K \ L such that g > 0 (or g < 0) on J. Let h ∈ D be a bump function such that h ≥ 0, h ≡ 1 on a neighborhood of J where g > 0, and supp(h) ∩ L = ∅. Then 0 < ∫_J g ≤ ∫R hg = ⟨ϕg, h⟩ = 0, a contradiction.]
Remark: In Exercise-30(iii), agreement on supp(ϕ) alone is not sufficient: f(x) = x and g ≡ 0 agree on supp(δ0) = {0}, but ⟨Dδ0, f⟩ = −⟨δ0, Df⟩ = −Df(0) = −1 ≠ 0 = −Dg(0) = ⟨Dδ0, g⟩.
[127] (Justification of a name) Let ϕ ∈ D′. Then ϕ ∈ E ′ ⇔ supp(ϕ) is compact.
Proof. ⇒: If supp(ϕ) is not compact, there is a sequence (fk) in D such that [−k, k] ∩ supp(fk) = ∅ and ak := ⟨ϕ, fk⟩ ≠ 0 for every k ∈ N. Let hk = ak^{−1}fk. Since [−N, N] ∩ supp(Djhk) = ∅ for every k ≥ N and every j ≥ 0, we have pN(hk) = 0 for every k ≥ N, and thus (hk) → 0 in E. On the other hand, ⟨ϕ, hk⟩ = 1 ↛ 0. Hence ϕ ∉ E′. We add the remark that though (hk) → 0 in E, we have (hk) ↛ 0 in D since ∪_{k=1}^{∞} supp(hk) is not a bounded set.
⇐: Assume supp(ϕ) is compact, and let k ∈ N be such that supp(ϕ) ⊂ (−k, k). Since ϕ ∈ D′, by [126](i) there exist C > 0 and N ∈ N such that |⟨ϕ, f⟩| ≤ CpN(f) for every f ∈ Dk. Let h ∈ D, h ≥ 0, be a bump function with h ≡ 1 on a neighborhood of supp(ϕ) and supp(h) ⊂ [−k, k]. Extend ϕ to E by putting ⟨ϕ, f⟩ = ⟨ϕ, hf⟩ for f ∈ E. The extension is linear. For every f ∈ E, we have hf ∈ Dk, and |⟨ϕ, f⟩| = |⟨ϕ, hf⟩| ≤ CpN(hf) ≤ CC′pN(f), where the constant C′ = C′(h) is obtained by applying the product rule to Dj(hf) for 0 ≤ j ≤ N. By [126](iii), the extension ϕ belongs to E′. �
[128] A summary of some important inclusions that we have seen so far is:
(i) D ⊂ S ⊂ E ⊂ L1loc(R) ⊂ D′.
(ii) S ⊂ Lp(R) ⊂ S ′ for 1 ≤ p <∞; and also S ⊂ C∞0 (R) ⊂ L∞(R) ⊂ S ′.
(iii) D ⊂ E ′ ⊂ S ′ ⊂ D′.
Remark: By Exercise-23(ii), we know D is dense in E . A similar argument will show E ′ is dense in
D′ as follows. Given ϕ ∈ D′, choose g ∈ D with g ≡ 1 on [−1, 1], put gn(x) = g(x/n), and check
that gnϕ ∈ E ′. We have ⟨gnϕ, f⟩ = ⟨ϕ, gnf⟩ → ⟨ϕ, f⟩ for every f ∈ D, and this shows (gnϕ) → ϕ in
D′, completing the argument. Later we will see that D is dense in D′, and S is dense in S ′.
13. Convolution and distributions
Recall the notation g̃(y) = g(−y), so that g̃x(y) = g̃(y − x) = g(x − y). For h, g ∈ D observe that h ∗ g(x) = ∫h(y)g(x − y)dy = ∫h(y)g̃x(y)dy = ⟨h, g̃x⟩. This motivates the following definition.
Definition: (Convolution of a distribution by a smooth function) Let ϕ ∈ D′ and g ∈ D; or ϕ ∈ S′ and g ∈ S; or ϕ ∈ E′ and g ∈ E. Define ϕ ∗ g : R → C as ϕ ∗ g(x) = ⟨ϕ, g̃x⟩; below we abbreviate g̃x to gx, so that gx(y) = g(x − y). To prove basic facts about convolution, we require the following technical result:
Exercise-31: Let ϕ ∈ D′ and G ∈ C∞(R²), and assume that for each x ∈ R, there are δ > 0 and k ∈ N with supp(G(x + t, ·)) ⊂ [−k, k] for every t ∈ (−δ, δ). Then h : R → C defined as h(x) = ⟨ϕ, G(x, ·)⟩ belongs to E and Dmh(x) = ⟨ϕ, ∂mG(x, ·)/∂xm⟩ for every m ∈ N. [Hint: Fix x ∈ R, and let δ > 0 and k ∈ N be as given. Note that t^{−1}(h(x + t) − h(x)) = ⟨ϕ, t^{−1}(G(x + t, ·) − G(x, ·))⟩ since ϕ is linear. Check by hypothesis that t^{−1}(G(x + t, ·) − G(x, ·)) → ∂G(x, ·)/∂x in D as t → 0. Hence by the continuity of ϕ we get lim_{t→0} t^{−1}(h(x + t) − h(x)) = ⟨ϕ, lim_{t→0} t^{−1}(G(x + t, ·) − G(x, ·))⟩ = ⟨ϕ, ∂G(x, ·)/∂x⟩. Now repeat.]
Remark: The following is an important result in the theory of distributions. It tells you that even
if a distribution ϕ is given by a very rough function (for example, by a member of L1loc(R)), a
convolution ϕ ∗ g of ϕ by an appropriate smooth function g produces a smooth function.
[129] (i) Let ϕ ∈ D′ and g ∈ D; or ϕ ∈ S ′ and g ∈ S; or ϕ ∈ E ′ and g ∈ E . Then ϕ ∗ g ∈ E .
(ii) If ϕ ∈ E ′ and g ∈ D, then ϕ ∗ g ∈ D.
(iii) If ϕ ∈ E ′ and g ∈ S, then ϕ ∗ g ∈ S.
(iv) In all the above cases, we have Dm(ϕ ∗ g) = ϕ ∗Dmg for every m ∈ N.
Proof. Step-1: First suppose ϕ ∈ D′ and g ∈ D. Define G : R² → C as G(x, y) = gx(y) = g(x − y). Then the hypothesis of Exercise-31 is satisfied. Hence ϕ ∗ g ∈ E. Also ∂mG/∂xm(x, y) = ∂mg(x − y)/∂xm = (Dmg)x(y), and hence Dm(ϕ ∗ g) = ⟨ϕ, (Dmg)x⟩ = ϕ ∗ Dmg for every m ∈ N.
Step-2: Let ϕ ∈ E′ and g ∈ E. Let h ∈ D be a bump function with h ≡ 1 on a neighborhood V of supp(ϕ). Then ⟨ϕ, f⟩ = ⟨ϕ, hf⟩ for every f ∈ E. In particular, ϕ ∗ g(x) = ⟨ϕ, hgx⟩. Letting G(x, y) = h(y)g(x − y) and applying Exercise-31, we deduce that ϕ ∗ g ∈ E. Since h ≡ 1 on the neighborhood V of supp(ϕ), we also have ∂mG/∂xm(x, y) = 1 · ∂mg(x − y)/∂xm = (Dmg)x(y) for y ∈ V, and hence Dm(ϕ ∗ g) = ⟨ϕ, (Dmg)x⟩ = ϕ ∗ Dmg for every m ∈ N.
Step-3 : Let ϕ ∈ E ′ and g ∈ D. Then there is k ∈ N such that ϕ and gx have disjoint supports for
|x| ≥ k, and hence ϕ ∗ g(x) = ⟨ϕ, gx⟩ = 0 for |x| ≥ k, which shows supp(ϕ ∗ g) is also compact.
Step-4: (hint) Let ϕ ∈ S′ and g ∈ S. An argument similar to (but slightly more computational than) that in Step-1 will show ϕ ∗ g ∈ E and Dm(ϕ ∗ g) = ϕ ∗ Dmg; see Exercise 2.3.5(a) in Grafakos, Classical Fourier Analysis.
Step-5: If ϕ ∈ E′ ⊂ S′ and g ∈ S, then ϕ ∗ g ∈ E and Dm(ϕ ∗ g) = ϕ ∗ Dmg as hinted in Step-4. To show ϕ ∗ g ∈ S, we need to show si,j(ϕ ∗ g) < ∞ for every i, j ≥ 0. Since Dj(ϕ ∗ g) = ϕ ∗ Djg and Djg ∈ S, we may assume j = 0. Thus we need to show ∥x^i(ϕ ∗ g)∥∞ < ∞. Since g ∈ S, we have (1 + |x|)^i Dmg ∈ S, and hence for every N ∈ N there is βN > 0 such that ∥(1 + |x|)^i Dmg∥∞ ≤ βN for 0 ≤ m ≤ N. Since ϕ ∈ E′, there exist C > 0 and N ∈ N by [126](iii) such that |⟨ϕ, f⟩| ≤ CpN(f) = C max{|Dmf(y)| : 0 ≤ m ≤ N, |y| ≤ N} for every f ∈ E. Hence |x^i(ϕ ∗ g)(x)| = |x^i⟨ϕ, gx⟩| = |⟨ϕ, x^i gx⟩| ≤ C max{|x^i ∂mg/∂ym(x − y)| : 0 ≤ m ≤ N, |y| ≤ N} ≤ CβN max{|x|^i/(1 + |x − y|)^i : |y| ≤ N}. From this estimate, we conclude that ∥x^i(ϕ ∗ g)∥∞ < ∞ by observing the following: if |x| ≥ 2N, then |x|^i/(1 + |x − y|)^i ≤ |x|^i/(1 + |x|/2)^i ≤ 2^i for |y| ≤ N. �
Example: Let g ∈ E . Then δ0 ∗ g(x) = ⟨δ0, gx⟩ = gx(0) = g(x− 0) = g(x). That is, δ0 ∗ g = g.
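The identity δ0 ∗ g = g, together with the smoothing remark before [129], can be illustrated numerically: convolving with a narrow approximate identity g_t nearly reproduces the function. This is only a sketch added for illustration; the Gaussian profile, the width t, and the sample point are arbitrary choices.

```python
import numpy as np
from scipy.integrate import quad

g = lambda y: np.exp(-y**2 / 2) / np.sqrt(2 * np.pi)  # smooth profile with ∫ g = 1
f = lambda x: np.cos(x)                                # function to convolve

def conv(t, x):
    # (g_t * f)(x) with g_t(y) = t^{-1} g(y/t); g_t -> δ_0 as t -> 0,
    # and g_t * f is smooth for every t > 0.
    return quad(lambda y: g(y / t) / t * f(x - y), -1, 1,
                points=[0], limit=200)[0]

print(conv(0.01, 0.7), f(0.7))   # nearly equal: g_t * f -> δ_0 * f = f
```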
Definition and Remark: For f, g, h ∈ D, note that ⟨h ∗ g, f⟩ = ∫(h ∗ g)(x)f(x)dx = ∫∫h(y)g(x − y)f(x)dxdy = ∫∫h(y)f(x)g̃(y − x)dxdy = ∫h(y)(f ∗ g̃)(y)dy = ⟨h, f ∗ g̃⟩, where g̃(x) = g(−x). This motivates us to define ⟨ϕ ∗ g, f⟩ = ⟨ϕ, f ∗ g̃⟩ for various types of distributions ϕ and appropriate smooth functions g, f. We may check that this is compatible with our earlier definition of ϕ ∗ g as follows: ⟨ϕ ∗ g, f⟩ = ∫(ϕ ∗ g)(x)f(x)dx = ∫⟨ϕ, gx⟩f(x)dx = ∫⟨ϕ, f(x)gx⟩dx = ⟨ϕ, ∫f(x)gx(·)dx⟩ = ⟨ϕ, y ↦ ∫f(x)g(x − y)dx⟩ = ⟨ϕ, f ∗ g̃⟩, where taking the integral sign ∫ inside ⟨·, ·⟩ is justified by considering the limit of the Riemann sums defining the integral and using the linearity and continuity of ϕ.
Exercise-32: (i) Let ϕ ∈ D′. Then T : D → E defined as Tg = ϕ ∗ g is linear, continuous, and
commutes with translations (i.e., ϕ ∗ gy = (ϕ ∗ g)y for every y ∈ R).
(ii) (An important fact about convolution) Conversely, if a continuous linear map T : D → E
commutes with translations, then there is a unique ϕ ∈ D′ such that Tg = ϕ ∗ g for every g ∈ D.
[Hint: (ii) Define ϕ ∈ D′ as ⟨ϕ, g⟩ = T g̃(0) for g ∈ D, where g̃(x) = g(−x). Since T commutes with translations, we have ϕ ∗ g(x) = ⟨ϕ, gx⟩ = T((gx)~)(0) = T(g−x)(0) = (Tg)−x(0) = Tg(x) for every x ∈ R.]
Remark: There are results similar to Exercise-32(ii) in other settings. For instance, Theorem 2.5.2
of Grafakos, Classical Fourier Analysis says in particular (with a more involved proof) that if
1 ≤ p, q ≤ ∞ and T : Lp(R) → Lq(R) is a bounded linear operator commuting with translations,
then there is a unique ϕ ∈ S ′ such that Tg = ϕ ∗ g for every g ∈ Lp(R).
It is also possible to define the convolution of two distributions when at least one of them has
compact support. For this, one has to develop the rather technical theory of tensor product of two
distributions. The main points are stated without proof in [130] below. The student may refer
to Chapter 4 of F.G. Friedlander, Introduction to the Theory of Distributions for the proofs. For
f, g : R → C, let f⊗g : R² → C be (f⊗g)(x, y) = f(x)g(y). Let D(R²) = C∞c(R²). It is known that span{f⊗g : f, g ∈ D} is dense in D(R²).
[130] Let ϕ, ψ ∈ D′, with at least one having compact support. Then there is a unique distribution ϕ⊗ψ ∈ D′(R²), called the tensor product of ϕ and ψ, with the defining property that ⟨ϕ⊗ψ, f⊗g⟩ = ⟨ϕ, f⟩⟨ψ, g⟩ for every f, g ∈ D. Moreover, we have:
(i) ⟨ϕ⊗ψ, F⟩ = ⟨ϕ, x ↦ ⟨ψ, F(x, ·)⟩⟩ = ⟨ψ, y ↦ ⟨ϕ, F(·, y)⟩⟩ for every F ∈ D(R²).
(ii) supp(ϕ⊗ψ) = supp(ϕ) × supp(ψ) ⊂ R².
(iii) (ϕ, ψ) ↦ ϕ⊗ψ is linear and continuous in each variable.
(iv) (∂^{k+m}/∂x^k∂y^m)(ϕ⊗ψ) = Dkϕ ⊗ Dmψ for every k, m ∈ N.
(v) δx ⊗ δy = δ(x,y) for every (x, y) ∈ R².
Remark: The idea of proof is to use the first equality in (i) as the definition of ϕ⊗ψ, and then to
prove that this indeed defines a distribution with the listed properties.
Note that ⟨h ∗ g, f⟩ = ∫(h ∗ g)(z)f(z)dz = ∫∫h(x)g(z − x)f(z)dxdz = ∫∫h(x)g(y)f(x + y)dxdy.
[131] Let ϕ, ψ ∈ D′, with at least one of them having compact support. Then their convolution
ϕ ∗ ψ ∈ D′ is defined as ⟨ϕ ∗ ψ, f⟩ = ⟨ϕ⊗ψ, (x, y) 7→ f(x+ y)⟩ for f ∈ D. We have:
(i) ⟨ϕ ∗ ψ, f⟩ = ⟨ϕ, x 7→ ⟨ψ, f(x+ ·)⟩⟩ = ⟨ψ, y 7→ ⟨ϕ, f(·+ y)⟩⟩, and hence ϕ ∗ ψ = ψ ∗ ϕ.
(ii) supp(ϕ ∗ ψ) ⊂ supp(ϕ) + supp(ψ).
(iii) (ϕ, ψ) 7→ ϕ ∗ ψ is linear and continuous in each variable.
(iv) Dm(ϕ ∗ ψ) = Dmϕ ∗ ψ = ϕ ∗Dmψ for every m ∈ N.
(v) ϕ ∗ δ0 = ϕ.
Proof. All are essentially direct consequences of [130]. We just indicate a proof for (ii). Let f ∈ D be with supp(f) ∩ (supp(ϕ) + supp(ψ)) = ∅. Then the support of the map (x, y) ↦ f(x + y) is disjoint from supp(ϕ) × supp(ψ), and hence by [130](ii) we get ⟨ϕ ∗ ψ, f⟩ = 0. �
[132] (i) D (considered as a subset of E ′) is dense in D′.
(ii) S is dense in S ′.
Proof. (i) By the Remark after [128], E′ is dense in D′. So it suffices to show D is dense in E′. Let ϕ ∈ E′. Pick g ∈ D with g ≥ 0 and ∫g = 1, and put gn(x) = ng(nx). Then (gn) → δ0 in D′ (check). Therefore (ϕ ∗ gn) → ϕ ∗ δ0 = ϕ by [131](iii) and [131](v). Also ϕ ∗ gn ∈ D by [129](ii).
(ii) An argument similar to that in (i) works since D ⊂ S ⊂ S′ ⊂ D′. �
14. Some structure theorems about distributions
[133] (i) Let ϕ ∈ E ′ be with supp(ϕ) = {0}. Then there exists N ∈ N such that for every f ∈ E
with Djf(0) = 0 for 0 ≤ j ≤ N , we have ⟨ϕ, f⟩ = 0.
(ii) (Structure theorem for distribution supported on a point) Let ϕ ∈ E ′ be with supp(ϕ) = {a}.
Then ϕ = ∑_{j=0}^{N} cj Djδa for finitely many constants cj ∈ C, where δa is the Dirac distribution at a.
Proof. (i) Since ϕ ∈ E′, there exist C > 0 and N ∈ N such that |⟨ϕ, f⟩| ≤ CpN(f) for every f ∈ E. Fix f ∈ E with Djf(0) = 0 for 0 ≤ j ≤ N. Let g ∈ E be such that g(x) = 0 for |x| ≤ 1 and g(x) = 1 for |x| ≥ 2. Put gk(x) = g(kx). Then gk(x) = 0 for |x| ≤ 1/k and gk(x) = 1 for |x| ≥ 2/k. Since fgk ≡ 0 in a neighborhood of 0, we have supp(ϕ) ∩ supp(fgk) = ∅, and hence ⟨ϕ, fgk⟩ = 0 by Exercise-30(ii). Therefore, |⟨ϕ, f⟩| = |⟨ϕ, f − fgk⟩| ≤ CpN(f − fgk) for every k ∈ N. Thus to prove ⟨ϕ, f⟩ = 0, it suffices to show lim_{k→∞} pN(f − fgk) = 0. We make four observations:
(a) Since f = fgk for |x| ≥ 2/k, we have pN(f − fgk) = max{|Dj(f − fgk)(x)| : |x| ≤ 2/k, 0 ≤ j ≤ N}.
(b) Dj(f − fgk) = Dj(f(1 − gk)) = ∑_{i=0}^{j} cij Dif Dj−i(1 − gk) by the product rule.
(c) ∥Dj−i(1 − gk)∥∞ ≤ k^{j−i}∥Dj−i(1 − g)∥∞ for 0 ≤ i ≤ j ≤ N and every k ∈ N.
(d) There is a constant M1 > 0 with max{|Dif(x)| : |x| ≤ 2/k} ≤ M1/k^{N+1−i} for 0 ≤ i ≤ N, since Djf(0) = 0 for 0 ≤ j ≤ N (to see this, consider |xi| ≤ 2/k, pick |xi+1| ≤ 2/k with |xi Di+1f(xi+1)| = |Dif(xi) − Dif(0)| = |Dif(xi)| by the Mean Value Theorem, and repeat this till one gets xN+1).
Combining (c) with (d), we get a constant M2 > 0 such that max{|Dif(x)Dj−i(1 − gk)(x)| : |x| ≤ 2/k} ≤ M2/k^{N+1−j} ≤ M2/k for 0 ≤ i ≤ j ≤ N. Combining this with (a) and (b), we get a constant M > 0 such that pN(f − fgk) ≤ M/k → 0 as k → ∞.
(ii) After a translation we may assume a = 0. Let N ∈ N be as given by part (i). Consider f ∈ E. By Taylor's theorem, there is h ∈ E such that f(x) = ∑_{j=0}^{N} Djf(0) x^j/j! + h(x) for x ∈ R. Differentiating this repeatedly and substituting x = 0, we see Djh(0) = 0 for 0 ≤ j ≤ N. Hence ⟨ϕ, h⟩ = 0 by (i), and therefore ⟨ϕ, f⟩ = ∑_{j=0}^{N} Djf(0)⟨ϕ, x^j/j!⟩. Recall that ⟨Djδ0, f⟩ = (−1)^j⟨δ0, Djf⟩ = (−1)^j Djf(0), and put cj = (−1)^j⟨ϕ, x^j/j!⟩. Then ⟨ϕ, f⟩ = ∑_{j=0}^{N} cj⟨Djδ0, f⟩ for f ∈ E, and hence ϕ = ∑_{j=0}^{N} cj Djδ0. �
Exercise-33: Let β ∈ D′ be the distribution induced by the constant function 1 (which is locally integrable), i.e., ⟨β, f⟩ = ∫R f · 1 = ∫R f for f ∈ D. Then ker(β) = {f ∈ D : ∫R f = 0} is a vector subspace of D of codimension one. Let J : ker(β) → D be Jf(x) = ∫_{−∞}^{x} f(y)dy. Then J is well-defined, i.e., Jf ∈ D for f ∈ ker(β), and J is linear and continuous. Moreover, DJf = f for f ∈ ker(β) and JDf = f for f ∈ D. In particular, {f ∈ D : ∫R f = 0} = ker(β) = D(D), the range of the differentiation operator D : D → D.
[Hint: To show J is continuous, show that J : Dk ∩ ker(β) → D is continuous at 0 for each k ∈ N.]
[134] (i) (Every distribution has a primitive) If ϕ ∈ D′, then there is ψ ∈ D′ with Dψ = ϕ.
(ii) (Determined up to a constant) If ϕ, ψ1, ψ2 ∈ D′ are with Dψ1 = ϕ = Dψ2, then ψ1 − ψ2 is a
constant in the sense that there is c ∈ C with ⟨ψ1 − ψ2, f⟩ = ⟨c, f⟩ = c∫f for every f ∈ D.
(iii) If Dψ = 0 for ψ ∈ D′, then ψ is a constant, i.e., ∃ c ∈ C with ⟨ψ, f⟩ = ⟨c, f⟩ = c∫f for f ∈ D.
Proof. (i) Let β and J be as in Exercise-33. Fix h ∈ D with ⟨β, h⟩ = ∫R h = 1. Then h spans a one-dimensional subspace complementary to ker(β). Define the projection P : D → ker(β) as Pf = f − ⟨β, f⟩h (check that ⟨β, Pf⟩ is indeed zero). Then any f ∈ D can be written as f = ⟨β, f⟩h + Pf. We define ψ : D → C as ⟨ψ, f⟩ = −⟨ϕ, JPf⟩, which is obviously linear. If (fn) → 0 in D, then Pfn = fn − ⟨β, fn⟩h → 0 in ker(β) ⊂ D, and hence (JPfn) → 0 in D by the continuity of J. Therefore, ⟨ψ, fn⟩ = −⟨ϕ, JPfn⟩ → 0 in C. This shows that ψ is continuous at 0, and hence ψ ∈ D′. For f ∈ D, we have PDf = Df since Df ∈ ker(β), and also JDf = f by Exercise-33; hence ⟨Dψ, f⟩ = −⟨ψ, Df⟩ = ⟨ϕ, JPDf⟩ = ⟨ϕ, JDf⟩ = ⟨ϕ, f⟩. Thus Dψ = ϕ.
(ii) Write f ∈ D as f = ⟨β, f⟩h + Pf as above. We have ⟨ψ1 − ψ2, Pf⟩ = ⟨ψ1 − ψ2, DJPf⟩ = −⟨D(ψ1 − ψ2), JPf⟩ = −⟨0, JPf⟩ = 0, and therefore ⟨ψ1 − ψ2, f⟩ = ⟨ψ1 − ψ2, ⟨β, f⟩h⟩ = c⟨β, f⟩ = c∫f, where c := ⟨ψ1 − ψ2, h⟩.
(iii) This follows from (ii) since we also have D0 = 0. �
[135] (Local structure theorem for distributions - every distribution is locally a finite order derivative of a continuous function) Let ϕ ∈ D′ and a > 0. Then there exist g ∈ Cc(R) with supp(g) ⊂ [−a, a] and an integer m ≥ 0 such that for every f ∈ D with supp(f) ⊂ (−a, a) (note the open interval), we have ⟨ϕ, f⟩ = ⟨Dmg, f⟩.
Proof. Let Γ = {f ∈ D : supp(f) ⊂ (−a, a)} and let k ≥ a. By [126](i), there are C > 0 and N ≥ k such that |⟨ϕ, f⟩| ≤ CpN(f) for every f ∈ Dk, and in particular for every f ∈ Γ. Fix f ∈ Γ and 0 ≤ j ≤ N, and let x0 ∈ (−a, a) be such that |Djf(x0)| = ∥Djf∥∞. By the Mean Value Theorem, there is y0 ∈ (−a, x0) with Dj+1f(y0) = (Djf(x0) − Djf(−a))/(x0 + a) = (Djf(x0) − 0)/(x0 + a), and hence ∥Djf∥∞ = |Djf(x0)| = |x0 + a||Dj+1f(y0)| ≤ (2a + 1)∥Dj+1f∥∞. Applying this observation repeatedly, we see pN(f) = max{|Djf(x)| : 0 ≤ j ≤ N, |x| ≤ N} ≤ (2a + 1)^N ∥DNf∥∞ for f ∈ Γ. Putting C0 = C(2a + 1)^N, we conclude |⟨ϕ, f⟩| ≤ C0∥DNf∥∞ for every f ∈ Γ.
Further note that DNf ∈ Γ for f ∈ Γ and DNf(x) = ∫_{−a}^{x} DN+1f(y)dy for x ∈ (−a, a), which shows ∥DNf∥∞ ≤ ∥DN+1f∥1 for f ∈ Γ. Hence |⟨ϕ, f⟩| ≤ C0∥DN+1f∥1 for every f ∈ Γ. Let DN+1(Γ) = {DN+1f : f ∈ Γ} and define the linear functional ψ : DN+1(Γ) → C as ⟨ψ, DN+1f⟩ = ⟨ϕ, f⟩. If f1, f2 ∈ Γ and DN+1f1 = DN+1f2, then f1 = f2 since Djf1 and Djf2 have compact
supports for every j ≥ 0, and therefore ψ is well-defined. From the estimate above, |⟨ψ, DN+1f⟩| ≤ C0∥DN+1f∥1, and thus ψ : (DN+1(Γ), ∥ · ∥1) → C is also continuous.
By the Hahn-Banach theorem, ψ has a continuous linear extension ψ : L1(−a, a) → C. Since L∞(−a, a) is the dual of L1(−a, a), there is h ∈ L∞(−a, a) such that ⟨ψ, f⟩ = ∫_{−a}^{a} fh for every f ∈ L1(−a, a). Put h(x) = 0 for |x| ≥ a and note h ∈ D′. For f ∈ Γ we have ⟨ϕ, f⟩ = ⟨ψ, DN+1f⟩ = ∫_{−a}^{a} hDN+1f = ∫R hDN+1f = ⟨h, DN+1f⟩ = (−1)^{N+1}⟨DN+1h, f⟩ = ⟨DN+1h0, f⟩, where h0 = (−1)^{N+1}h. If we define g : R → C as g(x) = ∫_{−∞}^{x} h0(y)dy, then g is continuous with supp(g) ⊂ [−a, a] and Dg = h0. And ⟨ϕ, f⟩ = ⟨DN+1h0, f⟩ = ⟨DN+2g, f⟩ for every f ∈ Γ. �
[136] (Global structure theorem for distributions with compact support - every distribution with
compact support is a finite sum of finite order derivatives of continuous functions) If ϕ ∈ E ′ and b > 0
is with supp(ϕ) ⊂ (−b, b), then there are finitely many functions fj ∈ Cc(R) with supp(fj) ⊂ (−b, b)
such that ϕ = ∑_{j=0}^{m} Djfj.
Proof. Let 0 < a < b be chosen with supp(ϕ) ⊂ (−a, a). By [135], there are g ∈ Cc(R) with supp(g) ⊂ [−a, a] ⊂ (−b, b) and m ≥ 0 such that ⟨ϕ, f⟩ = ⟨Dmg, f⟩ for every f ∈ D with supp(f) ⊂ (−a, a). Let h ∈ D be a bump function satisfying h ≡ 1 on supp(ϕ) and supp(h) ⊂ (−a, a). Then for any f ∈ E, we have supp(fh) ⊂ (−a, a), and
⟨ϕ, f⟩ = ⟨ϕ, fh⟩ = ⟨Dmg, fh⟩ = (−1)^m⟨g, Dm(fh)⟩ = (−1)^m ∑_{j=0}^{m} cj⟨g, Djf Dm−jh⟩.
Putting gj = (−1)^m cj g Dm−jh, which does not depend on f, we see that gj ∈ Cc(R) with supp(gj) ⊂ [−a, a] ⊂ (−b, b) and
⟨ϕ, f⟩ = (−1)^m ∑_{j=0}^{m} cj ∫g Djf Dm−jh = ∑_{j=0}^{m} ∫gj Djf = ∑_{j=0}^{m} ⟨gj, Djf⟩ = ∑_{j=0}^{m} (−1)^j⟨Djgj, f⟩.
Letting fj = (−1)^j gj, we get the required result. �
Definition: We say g ∈ C(R) is of polynomial growth if there are constants C > 0, M > 0 such that
|g(x)| ≤ C(1 + |x|)M for every x ∈ R.
[137] (Structure theorem for tempered distributions - every tempered distribution is a finite order
derivative of a continuous function of polynomial growth) If ϕ ∈ S ′, then there exist m ∈ N and
g ∈ C(R) of polynomial growth such that ⟨ϕ, f⟩ = ⟨Dmg, f⟩ for every f ∈ S.
Proof. Step-1: (Sketch) First assume supp(ϕ) ⊂ (0, ∞). Since ϕ ∈ S′, there exist C > 0 and N ∈ N such that |⟨ϕ, f⟩| ≤ C ∑_{0≤i,j≤N} si,j(f) for every f ∈ S. Choose h0 ∈ E with h0 ≡ 1 in a neighborhood
of supp(ϕ) and supp(h0) ⊂ (0, ∞). Since ⟨ϕ, f⟩ = ⟨ϕ, h0f⟩, we can find C′ = C′(h0) > 0, by applying the product rule of differentiation in si,j(h0f), so that
|⟨ϕ, f⟩| = |⟨ϕ, h0f⟩| ≤ C′ ∑_{0≤i,j≤N} sup{|x^i Djf(x)| : x > 0} for every f ∈ S. (∗)
Let h : R → C be h(x) = x^N/N! for x > 0 and h(x) = 0 for x ≤ 0, so that DNh = 1(0,∞) and DN+1h = δ0. Let g = ϕ ∗ h, which is defined as ϕ ∗ h(x) = ⟨ϕ, y ↦ h(x − y)⟩; see the beginning of the previous section. One then verifies that the continuous function g is of polynomial growth with the help of (∗), and checks that DN+1g = DN+1(ϕ ∗ h) = ϕ ∗ DN+1h = ϕ ∗ δ0 = ϕ; see Section 8.3 of F.G. Friedlander, Introduction to the Theory of Distributions for the details.
Step-2 : In the general case, choose h1, h2 ∈ E with h1, h2 ≥ 0, supp(h1) ⊂ (−∞, 1), supp(h2) ⊂
(0,∞), and supp(h1) ∪ supp(h2) = R. Putting gj = hj/(h1 + h2) for j = 1, 2, we see gj ∈ E ,
0 ≤ g1, g2 ≤ 1, g1+ g2 = 1, supp(g1) ⊂ (−∞, 1) and supp(g2) ⊂ (0,∞); in other words, {g1, g2} is a
smooth partition of unity for R subordinate to the open cover {(−∞, 1), (0,∞)}. Since ϕ = g1ϕ+g2ϕ
with supp(g1ϕ) ⊂ (−∞, 1) and supp(g2ϕ) ⊂ (0,∞), we may apply the argument in step-1 to each
of g1ϕ and g2ϕ to deduce the required result about ϕ. �
15. Fourier transform on R: basics
For r ∈ R, let er : R → C be er(x) = e^{2πirx} for x ∈ R. Note that er ∈ L∞(R) for every r ∈ R. Recall from Exercise-5 that f̂(n) = f̂(n) · 1 = f̂(n)en(0) = f ∗ en(0) for f ∈ L1(T). This motivates the following definition.
Definition: The Fourier transform f̂ : R → C of f ∈ L1(R) is defined as f̂(y) = f ∗ ey(0) = ∫R f(x)e−y(x)dx for y ∈ R. For example, let u ∈ (0, ∞) and f = 1[−u,u]. Then f̂(0) = 2u and f̂(y) = ∫_{−u}^{u} e−y(x)dx = (e−y(−u) − e−y(u))/(2πiy) = sin(2πuy)/(πy) for y ∈ R \ {0}.
Remark: The integral defining f̂(y) is a global integral, over the whole of R. Hence, if we change f on a small interval, the value of f̂(y) may change for every y ∈ R.
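The formula f̂(y) = sin(2πuy)/(πy) for f = 1[−u,u] is easy to verify numerically; the following sketch (with arbitrarily chosen u and y) computes the defining integral directly:

```python
import numpy as np
from scipy.integrate import quad

u, y = 0.75, 1.3

# fhat(y) = ∫_{-u}^{u} e^{-2πixy} dx, split into real and imaginary parts
re = quad(lambda x: np.cos(2 * np.pi * x * y), -u, u)[0]
im = -quad(lambda x: np.sin(2 * np.pi * x * y), -u, u)[0]

print(re, np.sin(2 * np.pi * u * y) / (np.pi * y))  # these agree
print(im)                                           # ≈ 0 (f is real and even)
```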
Exercise-34: (Properties of the Fourier transform - I) Let f, g ∈ L1(R).
(i) f̂ ∈ L∞(R), f̂ is uniformly continuous, and ∥f̂∥∞ ≤ ∥f∥1.
(ii) (Continuity) If (fk) → f in L1(R), then ∥f̂ − f̂k∥∞ → 0 as k → ∞.
(iii) (Reflection and linearity) (f̃)^ = (f̂)~ (where f̃(x) = f(−x)), and (af + bg)^ = af̂ + bĝ for a, b ∈ C.
(iv) If f is an even (odd) function, then so is f̂.
(v) f̂(y) = (1/2)∫R [f(x) − f(x + 1/(2y))]e−y(x)dx for y ∈ R \ {0}.
[Hint: (i) Note ∥ey∥∞ = 1, and use an argument similar to that for Exercise-20. (ii) Use ∥f̂ − f̂k∥∞ ≤ ∥f − fk∥1. (iv) If f is even, then f̂(−y) = ∫f(x)ey(x)dx = ∫f(−x)ey(−x)dx = ∫f(x)e−y(x)dx = f̂(y).
(v) Substitute x = z + 1/(2y) in the integral expression for f̂(y) to get f̂(y) = −∫R f(z + 1/(2y))e−y(z)dz = −∫R f(x + 1/(2y))e−y(x)dx, and add this to the original expression for f̂(y).]
Exercise-35: (Properties of the Fourier transform - II) Let f, g ∈ L1(R), and let fa(x) = f(x − a).
(i) (eaf)^ = (f̂)a and (fa)^ = e−af̂ for every a ∈ R.
(ii) f ∗ ey = f̂(y)ey for every y ∈ R.
(iii) (f ∗ g)^ = f̂ ĝ.
(iv) ∫R f̂(x)g(x)dx = ∫R f(x)ĝ(x)dx.
(v) Let a ≠ 0. If g(x) = f(ax), then ĝ(y) = |a|^{−1}f̂(y/a) (so if g(x) = f(x/a), then ĝ(y) = |a|f̂(ay)).
[Hint: (i) (eaf)^(y) = ∫ea(x)f(x)e−y(x)dx = ∫f(x)e−(y−a)(x)dx = f̂(y − a), and (fa)^(y) = ∫f(x − a)e−y(x)dx = ∫f(z)e−y(z + a)dz = e−y(a)∫f(z)e−y(z)dz = e−a(y)f̂(y). Deduce (iii) from (ii) as follows: (f ∗ g)^(y)ey = f ∗ g ∗ ey = ĝ(y) f ∗ ey = ĝ(y)f̂(y)ey, and cancel ey ≠ 0 from both ends as in Exercise-5. Use Fubini's theorem to prove (iv). For (v), put z = ax in the integral.]
Exercise-36: (Fourier transform and differentiation) Let f ∈ L1(R).
(i) If Df ∈ L1(R), then lim_{|x|→∞} f(x) = 0 and (Df)^(y) = 2πiyf̂(y) for y ∈ R.
(ii) If xf ∈ L1(R), then f̂ is differentiable and Df̂(y) = −2πi(xf)^(y) for y ∈ R.
(iii) More generally, we have: Dmf ∈ L1(R) ⇒ (Dmf)^(y) = (2πiy)^m f̂(y); and x^m f ∈ L1(R) ⇒ f̂ is m-times differentiable with Dmf̂(y) = (−2πi)^m (x^m f)^(y).
[Hint: (i) Since Df ∈ L1(R), ∀ ε > 0 ∃ M > 0 with |f(b) − f(a)| = |∫_a^b Df| ≤ ∫_a^b |Df| < ε for M ≤ a < b. So lim_{x→∞} f(x) exists. This limit must be 0 since f ∈ L1(R). Similarly lim_{x→−∞} f(x) = 0. To derive the expression for (Df)^(y), do integration by parts. (ii) Since |e−t(x) − 1| ≤ |2πtx| and since xf ∈ L1(R), the Lebesgue dominated convergence theorem gives lim_{t→0} (f̂(y + t) − f̂(y))/t = ∫R f(x)e−y(x)[lim_{t→0} (e−t(x) − 1)/t]dx = −2πi∫R xf(x)e−y(x)dx = −2πi(xf)^(y).]
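Part (i) of Exercise-36 can be checked numerically for the Gaussian f(x) = e^{−πx²} (an arbitrary choice, also used in the example below); the helper `ft` is an ad hoc numerical transform, not part of the notes:

```python
import numpy as np
from scipy.integrate import quad

f  = lambda x: np.exp(-np.pi * x**2)
df = lambda x: -2 * np.pi * x * f(x)   # Df

def ft(h, y):
    # numerical Fourier transform ĥ(y) = ∫ h(x) e^{-2πixy} dx
    re = quad(lambda x: h(x) * np.cos(2 * np.pi * x * y), -np.inf, np.inf)[0]
    im = -quad(lambda x: h(x) * np.sin(2 * np.pi * x * y), -np.inf, np.inf)[0]
    return complex(re, im)

y = 0.6
print(ft(df, y), 2j * np.pi * y * ft(f, y))   # (Df)^(y) = 2πiy f̂(y): the sides agree
```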
Remark: From Exercise-35(i) and Exercise-36, and some of the future results, we see the following
are pairs of dual operations for the Fourier transform:
(i) translation and rotation (multiplication by a unimodular scalar).
(ii) Differentiation and multiplying with x.
(iii) Convolution and taking pointwise product.
Example: Let f ∈ L1(R) be f(x) = e^{−πx²} (note f(0) = 1). We will show f̂ = f. We have Df(x) = −2πxf(x), and hence f is the unique solution to Dg(x) + 2πxg(x) = 0 with initial condition g(0) = 1. Thus it suffices to show f̂ also satisfies this equation. Since xf ∈ L1(R), and using −2πxf(x) = Df(x), we get by Exercise-36 that Df̂(y) = ∫(−2πixf(x))e−y(x)dx = ∫iDf(x)e−y(x)dx = i(Df)^(y) = i · 2πiyf̂(y) = −2πyf̂(y). Also f̂(0) = ∫R f(x)dx = 1 by complex integration (consider (∫R f(x)dx)(∫R f(y)dy) = ∫_{R²} e^{−π|z|²}dz and use polar coordinates).
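A numerical confirmation of f̂ = f for f(x) = e^{−πx²} (an illustrative sketch; the sample points are arbitrary):

```python
import numpy as np
from scipy.integrate import quad

f = lambda x: np.exp(-np.pi * x**2)

for y in (0.0, 0.5, 1.0, 2.0):
    # f is real and even, so the sine part of the transform vanishes
    fhat = quad(lambda x: f(x) * np.cos(2 * np.pi * x * y),
                -np.inf, np.inf)[0]
    print(y, fhat, f(y))   # fhat ≈ f(y) at each sample point
```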
Remark: The second part of Exercise-36(iii) says that the faster the decay of f at ∞, the smoother f̂ is. Result [138](ii) below says that the smoother f is, the faster the decay of f̂ at ∞.
[138] (i) (Riemann-Lebesgue lemma) Let f ∈ L1(R). Then lim_{|y|→∞} f̂(y) = 0, and hence f̂ ∈ C0(R). Also, lim_{|y|→∞} ∫R f(x) cos(2πxy)dx = 0 and lim_{|y|→∞} ∫R f(x) sin(2πxy)dx = 0.
(ii) (Smoother functions, faster decay) If Djf ∈ L1(R) for 0 ≤ j ≤ m, then lim_{|y|→∞} |y|^m f̂(y) = 0.
Proof. (i) If f = 1[a,b], then f̂(y) = (e−y(a) − e−y(b))/(2πiy) → 0 as |y| → ∞. By linearity, f̂(y) → 0 as |y| → ∞ for any step function f ∈ L1(R). Also it is known that step functions are dense in L1(R) (see my notes Measure Theory). Another proof: We may assume f ∈ Cc(R) because Cc(R) is dense in L1(R). Now use the expression for f̂(y) from Exercise-34(v) and the uniform continuity of f - as in the proof of [103](ii) - to deduce lim_{|y|→∞} f̂(y) = 0.
(ii) This follows from (i) and Exercise-36(i). �
Remark: By the above results, f ↦ f̂ from L1(R) to (C0(R), ∥ · ∥∞) is linear and continuous.
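The Riemann-Lebesgue decay can be seen concretely for the triangle function f(x) = max(0, 1 − |x|) = (1[−1/2,1/2] ∗ 1[−1/2,1/2])(x), whose transform is (sin(πy)/(πy))²; this is a standard example added here for illustration, not computed in the notes:

```python
import numpy as np
from scipy.integrate import quad

tri = lambda x: max(0.0, 1.0 - abs(x))   # triangle function, supported in [-1, 1]

def fhat(y):
    # tri is real and even, so fhat(y) = ∫ tri(x) cos(2πxy) dx
    return quad(lambda x: tri(x) * np.cos(2 * np.pi * x * y), -1, 1, limit=200)[0]

for y in (0.5, 1.5, 2.5, 10.5):
    print(y, fhat(y), (np.sin(np.pi * y) / (np.pi * y))**2)  # agree; decay to 0
```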
16. Fourier transform: sufficient conditions for pointwise inversion
Definition: For g ∈ L1(R), the inverse Fourier transform g∨ of g is defined as g∨(y) = ĝ(−y) = ∫R g(x)ey(x)dx. By Exercise-34(i) and [138], we see that g∨ ∈ C0(R); in particular, g∨ is bounded and uniformly continuous. Also note that if f ∈ L1(R) is an even function, then f∨(y) = ∫f(x)ey(x)dx = ∫f(−x)ey(x)dx = ∫f(z)e−y(z)dz = f̂(y), i.e., f∨ = f̂ when f is even.
Question and Remark: If f ∈ L1(R), can we expect the equality (f)∨ = f? If we want this equality
to hold everywhere, a necessary condition (by the observation in the above paragraph) is that
f ∈ C0(R). Even if we demand equality only almost everywhere, a necessary condition is that
f ∈ L∞(R). Another point to note is, formally (f)∨(x) =∫∞−∞ f(y)ex(y)dy, but this integral may
not be defined if f is not an L1-function. With our knowledge that f ∈ L∞(R), an integral that we
can always define for f ∈ L1(R) is∫ u−u f(y)ex(y)dy for u ∈ (0,∞); also we may investigate whether
this integral converges to f(x) as u→ ∞. This motivates the following definitions.
Definition: (i) For u ∈ (0,∞), the continuous Dirichlet kernel Du : R → C is defined as Du = (1_{[−u,u]})^, and hence Du ∈ C0(R). From an earlier calculation, Du(y) = (e−y(−u) − e−y(u))/(2πiy) = sin 2πuy/(πy) for y ∈ R \ {0}, and Du(0) = 2u. Thus Du is a real valued function, and also Du(−y) = Du(y).
(ii) For f ∈ L1(R) and u > 0, the u-th Fourier partial integral su(f) : R → C is defined as su(f, a) = ∫_{−u}^{u} f̂(x)ea(x)dx for a ∈ R. We now look for conditions that yield lim_{u→∞} su(f, a) = f(a).
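The closed form Du(y) = sin 2πuy/(πy) from (i) can be sanity-checked numerically (a sketch, not from the notes; the Riemann-sum grid is an ad hoc choice):

```python
import numpy as np

# Check that the transform of the box 1_{[-u,u]} is sin(2*pi*u*y)/(pi*y).
u = 1.5
x = np.linspace(-u, u, 300001)
dx = x[1] - x[0]
err = 0.0
for y in (0.3, 1.0, 2.7):
    num = np.sum(np.exp(-2j * np.pi * x * y)) * dx   # approximates D_u(y)
    err = max(err, abs(num - np.sin(2 * np.pi * u * y) / (np.pi * y)))
print(err)  # small discretization error
```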
Exercise-37: (Various expressions for su(f)) Let f ∈ L1(R), u > 0 and a ∈ R. Then,
(i) su(f, a) = ∫_{−∞}^{∞} f(a + x)·sin 2πux/(πx) dx = ∫_{−∞}^{∞} f(a − x)·sin 2πux/(πx) dx = Du ∗ f(a).
(ii) su(f, a) = ∫_0^∞ [f(a + x) + f(a − x)]·sin 2πux/(πx) dx = ∫_0^∞ [f(a + x) + f(a − x)]Du(x)dx.
[Hint: (i) By Exercise-35(i), ea f̂ = (f−a)^, and for g := 1_{[−u,u]} we have ĝ = Du. Hence su(f, a) = ∫_{−u}^{u} f̂(x)ea(x)dx = ∫_R (f−a)^(x)g(x)dx = ∫_{−∞}^{∞} f−a(x)ĝ(x)dx = ∫_R f(a + x)Du(x)dx by Exercise-35(iv). Replacing x with −x gives su(f, a) = ∫_R f(a − x)Du(x)dx = f ∗ Du(a), since Du is an even function.]
[139] (i) Let g ∈ L1(R) ∩ L∞(R) be such that g ≥ 0 and ∫_R g = 1. Let gt(x) = t^{−1}g(x/t) for t > 0. Then {gt : t > 0} as t → 0 is an approximate identity for L1(R). Moreover, for any f ∈ L1(R) we have lim_{t→0} f ∗ gt(x) = f(x) for a.e. x ∈ R.
(ii) (Fourier inversion theorem) Assume f, f̂ ∈ L1(R), and let f0 = (f̂)∨. Then f0 ∈ C0(R), f = f0 almost everywhere, and also (f∨)^ = f0. Moreover, f̂, f∨ ∈ L1(R) ∩ L∞(R).
Proof. (i) We know {gt : t > 0} as t → 0 is an approximate identity for L1(R). We see lim_{t→0} f ∗ gt(x) = f(x) for a.e. x ∈ R by noting that |f(x) − f ∗ gt(x)| = |∫_R (f(x) − f(x − y))gt(y)dy| ≤ ∫_R |f(x) − fy(x)|t^{−1}g(y/t)dy = ∫_R |f(x) − ftz(x)|g(z)dz ≤ ∥f − ftz∥1∥g∥∞ → 0 as t → 0 by Exercise-20.
(ii) Direct evaluation will not work since the complex exponential function on R does not belong to L1(R). Therefore the proof becomes a little involved: we need to insert a suitable approximate identity into the integral. We know f0 ∈ C0(R). Let K(x) = e^{−πx²} and Kt(x) = t^{−1}K(x/t) = t^{−1}e^{−πx²/t²} for t > 0. Then {Kt : t > 0} as t → 0 is an approximate identity. Since K̂ = K, we have ht(y) := K̂t(y) = K̂(ty) = K(ty) = e^{−πt²y²} by Exercise-35(v). Since K and Kt are even functions, Kt∨ = K̂t, and therefore ĥt = Kt.
Fix z ∈ R. Since ht → 1 pointwise as t → 0, the Lebesgue dominated convergence theorem yields f0(z) = ∫_R f̂(y)ez(y)dy = lim_{t→0} ∫_R f̂(y)ez(y)ht(y)dy. By Exercise-35 and the evenness of Kt, we see ∫_R f̂(y)ez(y)ht(y)dy = ∫_R (f−z)^(y)ht(y)dy = ∫_R f−z(y)ĥt(y)dy = ∫_R f(y + z)Kt(y)dy = ∫_R f(−y + z)Kt(y)dy = f ∗ Kt(z) → f(z) for a.e. z ∈ R as t → 0 by part (i). In the last step, one can also take the limit along a sequence (tn) → 0 after noting that ∥f − f ∗ Kt∥1 → 0 as t → 0, and that L1-convergence implies pointwise convergence a.e. along a subsequence.
Since f∨(y) = f̂(−y), we also have (f∨)^ = f0. These imply that f̂, f∨ ∈ L1(R) ∩ L∞(R). □
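The summability trick in this proof can be watched numerically (a sketch, not from the notes; NumPy, the grid, and the cutoff are ad hoc choices). I use f(x) = e^{−2π|x|}, whose transform 1/(π(1+y²)) is computed in an example later in these notes:

```python
import numpy as np

# Insert the factor e^{-pi t^2 y^2} into the inversion integral; the
# result equals f * K_t(z) and should approach f(z) as t -> 0.
y = np.linspace(-400.0, 400.0, 800001)
dy = y[1] - y[0]
fhat = 1.0 / (np.pi * (1.0 + y**2))   # transform of f(x) = exp(-2*pi*|x|)
z = 0.5
target = np.exp(-2 * np.pi * abs(z))  # f(z)
errs = []
for t in (0.4, 0.1, 0.025):
    val = np.sum(fhat * np.exp(2j * np.pi * z * y)
                 * np.exp(-np.pi * (t * y)**2)).real * dy
    errs.append(abs(val - target))
print(errs)  # shrinking as t -> 0
```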
The following is an important technical fact for Fourier Theory.
[140] (i) Let h(x) = ∫_0^x y^{−1} sin y dy. Then lim_{x→∞} h(x) = π/2.
(ii) ∫_0^∞ Du(x)dx = ∫_0^∞ sin 2πux/(πx) dx = 1/2, and therefore ∫_{−∞}^{∞} Du(x)dx = ∫_{−∞}^{∞} sin 2πux/(πx) dx = 1.
Proof. (i) Note that h is monotone on (nπ, (n + 1)π), and the differences h((n + 1)π) − h(nπ) = ∫_{nπ}^{(n+1)π} y^{−1} sin y dy alternate in sign and decrease to 0 in absolute value. Hence lim_{x→∞} h(x) exists. Thus it suffices to show lim_{n→∞} h(xn) = π/2 for some sequence (xn) → ∞. From the earlier parts of the notes, we know that the discrete Dirichlet kernel¹¹ DN ∈ C(T) satisfies 1/2 = ∫_0^{1/2} DN(t)dt = ∫_0^{1/2} sin((2N + 1)πt)/sin πt dt. Since lim_{N→∞} ∫_0^{1/2} (1/sin πt − 1/(πt)) sin((2N + 1)πt)dt = 0 by [104], we get 1/2 = lim_{N→∞} ∫_0^{1/2} sin((2N + 1)πt)/(πt) dt, and hence π/2 = lim_{N→∞} ∫_0^{1/2} t^{−1} sin((2N + 1)πt)dt = lim_{N→∞} ∫_0^{(N+1/2)π} y^{−1} sin y dy = lim_{N→∞} h((N + 1/2)π) by putting (2N + 1)πt = y. For another proof of (i) using complex integration, see Example 2.7 in Chapter 5 of Conway, Functions of One Complex Variable.
(ii) Putting x = t/(2πu), we see ∫_0^∞ Du(x)dx = (1/π)∫_0^∞ t^{−1} sin t dt. Now apply (i). □
Exercise-38: (Sufficient conditions for pointwise Fourier inversion) Let f ∈ L1(R) and a ∈ R.
(i) lim_{u→∞} su(f, a) = f(a) ⇔ there is δ > 0 such that lim_{u→∞} ∫_{−δ}^{δ} (f(a + x) − f(a))·sin 2πux/(πx) dx = 0.
(ii) (Dini's test) If x ↦ (f(a + x) − f(a))/x is in L1(−δ, δ) for some δ > 0, then lim_{u→∞} su(f, a) = f(a).
(iii) If x ↦ (f(a + x) − f(a))/x is bounded a.e. in a neighborhood of 0, then lim_{u→∞} su(f, a) = f(a).
(iv) If f is differentiable at a, or Lipschitz/Holder continuous at a, then lim_{u→∞} su(f, a) = f(a).
(v) If f is piecewise C¹ on each bounded interval, then lim_{u→∞} su(f, a) = [f(a+) + f(a−)]/2.
[Hint: (i) f(a) = ∫_R f(a)Du(x)dx by [140](ii), and hence su(f, a) − f(a) = (∫_{−δ}^{δ} + ∫_{|x|>δ})(f(x + a) − f(a))Du(x)dx, where the second integral goes to 0 as u → ∞ by [138]. The proofs of the other statements are similar to what we wrote for the pointwise convergence of Fourier series. Refer to G. Bachman, L. Narici and E. Beckenstein, Fourier and Wavelet Analysis for some helpful hints.]
Definition: For v > 0, define the continuous Fejer kernel Fv on R as Fv(x) = (1/v)∫_{u=0}^{v} Du(x)du = [−cos 2πux/(2π²vx²)]_{u=0}^{v} = (1 − cos 2πvx)/(2π²vx²) ≥ 0. Note that Fv is an even function, and by the identity 1 − cos 2θ = 2 sin²θ we have Fv(x) = sin²(πvx)/(π²vx²). Also ∫_R Fv(x)dx = (1/v)∫_{u=0}^{v}(∫_{x∈R} Du(x)dx)du = (1/v)∫_0^v 1 du = 1 by [140](ii) and an interchange of the integrals. Since Fv(x) = vF1(vx), the family {Fv : v > 0} also satisfies the L1-concentration condition (A3). Thus {Fv : v > 0} as v → ∞ is an approximate identity on R. For f ∈ L1(R) and v > 0, the continuous Fejer mean σv(f) of f is defined as σv(f, x) = (1/v)∫_{u=0}^{v} su(f, x)du = (1/v)∫_{u=0}^{v}(∫_{y∈R} Du(y)f(x − y)dy)du = Fv ∗ f(x), where the last equality is by an interchange of the integrals.
¹¹The same notation is used for the discrete and continuous Dirichlet kernels: DN(t) = sin((2N + 1)πt)/sin πt and Du(x) = sin 2πux/(πx).
Exercise-39: (i) For v > 0, let gv : R → R be gv(x) = 1 − |x|/v for −v < x < v, and gv(x) = 0 elsewhere. Then Fv = ĝv = gv∨ ∈ C0(R).
(ii) For f ∈ L1(R), v > 0, and a ∈ R, we have σv(f, a) = Fv ∗ f(a) = ∫_{−v}^{v}(1 − |y|/v)f̂(y)ea(y)dy.
(iii) If f ∈ Cc(R), then ∥f − Fv ∗ f∥∞ → 0 as v → ∞.
(iv) Let 1 ≤ p < ∞. If f ∈ Lp(R), then ∥f − Fv ∗ f∥p → 0 as v → ∞.
(v) Let f ∈ L1(R) and assume f is continuous at a ∈ R. Then lim_{v→∞} σv(f, a) = f(a).
(vi) If f ∈ L1(R), then lim_{v→∞} σv(f, a) = f(a) for a.e. a ∈ R.
(vii) (Uniqueness) If f, g ∈ L1(R) and f̂ = ĝ, then f = g a.e., and hence f = g in L1(R).
[Hint: (i) Replacing x with −x in ∫_{−v}^{0}, and integrating by parts, ĝv(y) = ∫_0^v (1 − x/v)[ey(x) + e−y(x)]dx = ∫_0^v (1 − x/v)·2 cos 2πxy dx = 0 + (1/v)∫_0^v sin 2πxy/(πy) dx = (1/v)∫_0^v Dx(y)dx = Fv(y). And gv∨ = ĝv since gv is even. (ii) Since Fv(a − z) = gv∨(a − z) = ∫gv(y)e_{a−z}(y)dy, we get Fv ∗ f(a) = ∫Fv(a − z)f(z)dz = ∫∫gv(y)ea(y)e−y(z)f(z)dz dy = ∫gv(y)ea(y)f̂(y)dy = ∫_{−v}^{v}(1 − |y|/v)f̂(y)ea(y)dy. (iii) and (iv): They follow from [102] since {Fv : v > 0} as v → ∞ is an approximate identity. (v) Similar to the initial part of the proof of [116](v). Statement (vi) follows from [139](i), and (vii) from (ii) and (vi).]
Remark: The space Cbu(R) := {f ∈ C(R) : f is bounded and uniformly continuous} is closed in
L∞(R). Also Cbu(R) satisfies the following two properties: ∥fy∥∞ = ∥f∥∞, and y 7→ fy from R
to (Cbu(R), ∥ · ∥∞) is continuous for each f ∈ Cbu(R). At the abstract level, these two are the
properties going into the proof of [102](i). Hence, extending Exercise-39(iii), we can also establish
that ∥f − Fv ∗ f∥∞ → 0 as v → ∞ for every f ∈ Cbu(R), and in particular for every f ∈ C0(R).
Now we prove the analogue of Dirichlet-Jordan theorem (result [118]) with a different proof, for
which we will make use of the following fact.
Fact: If g : [a, b] → R is increasing, then g is differentiable almost everywhere, and for any bounded real function h on [a, b] we have ∫_a^b h dg = ∫_a^b h(x)Dg(x)dx, where ∫_a^b h dg is the Riemann-Stieltjes integral w.r.to g (see Theorem 6.17 in Rudin, Principles of Mathematical Analysis). If g : [a, b] → C is of bounded variation, then we can write g = g1 − g2 + i(g3 − g4), where the gj's are monotone increasing; hence we can define the Riemann-Stieltjes integral w.r.to g, and one has ∫_a^b h dg = ∫_a^b h(x)Dg(x)dx for any bounded real function h on [a, b] in this case also.
[141] (Dirichlet-Jordan theorem for R) Let f ∈ L1(R) be of bounded variation on every compact interval [a, b] ⊂ R. Then lim_{u→∞} su(f, a) = [f(a+) + f(a−)]/2 for every a ∈ R; in particular, lim_{u→∞} su(f, a) = f(a) for a.e. a ∈ R (since a function of bounded variation, being a linear combination of monotone functions, is differentiable a.e., and hence continuous a.e.).
Proof. Fix a ∈ R and let g(x) = f(a + x) + f(a − x). Then g ∈ L1(R), and g is of bounded variation on any compact interval. Fix δ > 0. By Exercise-37(ii), su(f, a) = (∫_0^δ + ∫_δ^∞)g(x)Du(x)dx. As a consequence of the Riemann-Lebesgue lemma, lim_{u→∞} ∫_δ^∞ g(x)Du(x)dx = 0 (see [105](iv) for a comparison). Thus it suffices to show lim_{u→∞} ∫_0^δ g(x)Du(x)dx = [f(a+) + f(a−)]/2. Let hu(x) = ∫_0^x Du(y)dy so that Dhu = Du. Integration by parts yields

∫_0^δ g(x)Du(x)dx = ∫_0^δ g(x)Dhu(x)dx = g(x)hu(x)|_0^δ − ∫_0^δ Dg(x)hu(x)dx.   (∗)

Let H(x) = ∫_0^x (sin t)/(πt) dt. By [140] note that H(0) = 0, H(2πux) = hu(x), and lim_{x→∞} H(x) = 1/2. Now g(x)hu(x)|_0^δ = g(δ−)H(2πuδ) → g(δ−)/2 as u → ∞. Also,

lim_{u→∞} ∫_0^δ Dg(x)hu(x)dx = lim_{u→∞} ∫_0^δ hu(x)dg = lim_{u→∞} ∫_0^δ H(2πux)dg = ∫_0^δ (1/2)dg = (g(δ−) − g(0+))/2

by the Fact above and the Lebesgue dominated convergence theorem. Using these in (∗), we get lim_{u→∞} ∫_0^δ g(x)Du(x)dx = g(δ−)/2 − (g(δ−) − g(0+))/2 = g(0+)/2 = [f(a+) + f(a−)]/2. □
Exercise-40: If f ∈ L1(R), then lim_{u→∞} ∫_a^b su(f, x)dx = ∫_a^b f(x)dx for every a < b in R.
[Hint: This is essentially a consequence of the fact that Du is even and hence su behaves self-adjointly: ∫su(f)g = ∫f su(g). Indeed, letting g = 1_{(a,b)}, we get ∫_a^b su(f, x)dx = ∫_R su(f, x)g(x)dx = ∫_R f ∗ Du(x)g(x)dx = ∫_R∫_R f(y)Du(x − y)g(x)dx dy = ∫_R∫_R f(y)Du(y − x)g(x)dx dy = ∫_R f(y)·g ∗ Du(y)dy = ∫_R f(y)su(g, y)dy. Also, since g ∈ L1(R) is of bounded variation, su(g) → g pointwise a.e. as u → ∞, bounded by an integrable function. Hence ∫_a^b su(f) = ∫_R f su(g) → ∫_R fg = ∫_a^b f as u → ∞.]
17. Fourier transform on S, L2(R), and on distributions
Recall that the Schwartz space S is a subset of Lp(R) for 1 ≤ p ≤ ∞. We denote by ⟨·, ·⟩2 the
L2-inner product.
[142] If f ∈ S, then f̂ ∈ S. The Fourier transform map F : S → S given by Ff = f̂ is linear and bijective, and satisfies the unitary condition FF∗ = I = F∗F, where F∗ : S → S is the inverse Fourier transform F∗f = f∨. In particular, ⟨Ff, Fg⟩2 = ⟨f, g⟩2 and ∥Ff∥2 = ∥f∥2 for f, g ∈ S.
Proof. Let f ∈ S and i, j ≥ 0. We need to show ∥y^i D^j f̂∥∞ < ∞. By Exercise-36, y^i D^j f̂(y) = y^i (−2πi)^j (x^j f)^(y) = (−1)^j (2πi)^{j−i} (D^i(x^j f))^(y). And ∥(D^i(x^j f))^∥∞ ≤ ∥D^i(x^j f)∥1 < ∞ since D^i(x^j f) ∈ S ⊂ L1(R) as f ∈ S. For f, g ∈ S, we see by Fubini's theorem that
⟨Ff, g⟩2 = ∫∫ f(x)e−y(x)ḡ(y)dx dy = ∫ f(x)h̄(x)dx = ⟨f, F∗g⟩2,
where h := F∗g (note that ∫ e−y(x)ḡ(y)dy is the conjugate of ∫ g(y)ey(x)dy = h(x)), and hence F∗ is indeed the adjoint of F. Since S ⊂ L1(R), we have F∗F = I and similarly FF∗ = I by [139]. This gives ⟨Ff, Fg⟩2 = ⟨f, F∗Fg⟩2 = ⟨f, g⟩2 for f, g ∈ S. □
Remark: A corollary of [142] is that Γ := {f ∈ L1(R) : supp(f̂) is compact} is dense in L1(R). Proof: Since D ⊂ S ⊂ L1(R) are dense inclusions, and since F∗ : S → S is an isomorphism, the set {g∨ : g ∈ D} is dense in L1(R). And {g∨ : g ∈ D} ⊂ Γ since (g∨)^ = g, which has compact support.
[143] (Plancherel’s theorem - Fourier transform on L2(R)) Let F : S → S be the Fourier transform
Ff = f . Since S is dense in L2(R), we see by [142] that F has a unique extension F : L2(R) →
L2(R) as a unitary operator, i.e., as a bijective linear map satisfying ⟨Ff,Fg⟩2 = ⟨f, g⟩2 for
f, g ∈ L2(R). In particular, ∥Ff∥2 = ∥f∥2 holds for every f ∈ L2(R). The unique extension
F∗ : L2(R) → L2(R) of the inverse Fourier transform F∗ : S → S is the inverse of F on L2(R).
Moreover, the extension F on L2(R) satisfies Ff = f a.e. for every f ∈ L1(R) ∩ L2(R).
Proof. All except the last line are evident. To prove the last line, consider f ∈ L1(R) ∩L2(R). We
claim that there is a sequence (fn) ∈ D such that ∥f − fn∥1 → 0 and ∥f − fn∥2 → 0 as n → ∞.
Since (f1[−n,n]) → f in both L1(R) and L2(R), it suffices to consider the case where f has compact
support. Choose g ∈ D with g ≥ 0 and∫g = 1; let gn(x) = ng(nx); and consider fn = f ∗ gn to
establish the claim. Now, ∥f− fn∥∞ ≤ ∥f−fn∥1 → 0 and ∥Ff−Ffn∥2 = ∥f−fn∥2 → 0 as n→ ∞
by Exercise-34(i) and [142]. But Ffn = fn since fn ∈ D ⊂ S, and therefore Ff = f a.e. �
Example: Let f(x) = e^{−2πa|x|} on R, where a > 0. Then f̂(y) = ∫_{−∞}^{0} e^{2π(a−iy)x}dx + ∫_0^∞ e^{−2π(a+iy)x}dx = 1/(2π(a − iy)) + 1/(2π(a + iy)) = a/(π(a² + y²)). We can now use the Plancherel identity ∥f̂∥2^2 = ∥f∥2^2 to evaluate the following integral: ∫_R 1/(a² + y²)² dy = a^{−2}π²∥f̂∥2^2 = a^{−2}π²∥f∥2^2 = a^{−2}π²(∫_{−∞}^{0} e^{4πax}dx + ∫_0^∞ e^{−4πax}dx) = a^{−2}π²·(1/(4πa) + 1/(4πa)) = π/(2a³).
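The value π/(2a³) can be confirmed by a direct quadrature, independent of the Plancherel route (a sketch, not from the notes; the substitution and step count are my own choices):

```python
import math

# Verify \int_R dy/(a^2+y^2)^2 = pi/(2 a^3) via y = a*tan(u):
# dy = a sec^2(u) du, (a^2+y^2)^2 = a^4 sec^4(u), integrand = cos^2(u)/a^3.
a = 1.7
n = 100000
du = math.pi / n
s = 0.0
for k in range(n):
    u = -math.pi / 2 + (k + 0.5) * du   # midpoint rule on (-pi/2, pi/2)
    s += math.cos(u)**2 / a**3 * du
print(s, math.pi / (2 * a**3))
```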
Three natural classes of approximate identities on L1(R) can be obtained in a unified fashion as
demonstrated in the Exercise below.
Exercise-41: (0) (Facts about integrals) Let K : R → C be any of the following: K(x) = (1 − cos x)/(πx²), or K(x) = 1/(π(1 + x²)), or K(x) = e^{−x²/4}/√(4π). In each case K ≥ 0 and ∫_R K = 1, so that we can manufacture approximate identities from K as stated below.
(i) (Fejer kernel) In the first case, let Ku(x) = 2πuK(2πux) = (1 − cos 2πux)/(2π²ux²). Then {Ku : u > 0} as u → ∞ is an approximate identity for L1(R).
(ii) (Poisson kernel) In the second case, let Kt(x) = t^{−1}K(x/t) = t/(π(t² + x²)). Then {Kt : t > 0} as t → 0 is an approximate identity for L1(R).
(iii) (Gaussian kernel) In the third case, let Kt(x) = t^{−1/2}K(x/√t) = e^{−x²/4t}/√(4πt). Then {Kt : t > 0} as t → 0 is an approximate identity for L1(R).
The mathematical formulation of Heisenberg's uncertainty principle says roughly the following: for every a, b ∈ R, it is impossible for f to be concentrated about a and f̂ to be concentrated about b simultaneously. In other words, if f is concentrated, then f̂ must spread out, and vice versa.
[144] (i) (Uncertainty principle - qualitative form) If f, f̂ ∈ Cc(R), then f ≡ 0.
(ii) (Uncertainty principle - quantitative form) Let f ∈ L2(R) satisfy the following decay conditions: xf, Df ∈ L2(R) and x|f|² vanishes at ∞. Then f, f̂ ∈ L1(R), and for every a, b ∈ R, we have (∫_R |x − a|²|f(x)|²dx)(∫_R |y − b|²|f̂(y)|²dy) ≥ ∥f∥2^4/(16π²), i.e., 16π²∥(x − a)f∥2^2∥(y − b)f̂∥2^2 ≥ ∥f∥2^4.
Proof. (i) Check that g : C → C defined as g(z) = ∫_R f(t)e−z(t)dt is complex analytic by differentiating under the integral sign, etc. We have g(u + i0) = f̂(u) = 0 for u ∈ R \ supp(f̂). Since R \ supp(f̂) contains an interval (and hence is a set with a limit point), we get g ≡ 0. Then f̂(u) = g(u) = 0 for every u ∈ R. Hence f ≡ 0 by Fourier inversion [139] and the continuity of f.
(ii) Since xf ∈ L2(R), we may write f = (1 + x²)^{1/2}f × (1 + x²)^{−1/2} and apply the Cauchy-Schwarz inequality to see f ∈ L1(R). Similarly, Df ∈ L2(R) implies yf̂ ∈ L2(R) by Exercise-36(i) and [143], and this in turn (by writing f̂ = (1 + y²)^{1/2}f̂ × (1 + y²)^{−1/2}) implies f̂ ∈ L1(R).
By Exercise-35(i), a translation of f corresponds to multiplying f̂ by a unimodular function, and a translation of f̂ corresponds to multiplying f by a unimodular function. So, after a translation in both f and f̂, we may assume a = 0 = b. Thus it suffices to show 16π²∥xf∥2^2∥yf̂∥2^2 ≥ ∥f∥2^4. Note that xf̄Df ∈ L1(R) by the Cauchy-Schwarz inequality since xf, Df ∈ L2(R). We claim that 16π²∥xf∥2^2∥yf̂∥2^2 ≥ (2|∫xf̄Df|)² ≥ (2|Re∫xf̄Df|)² = ∥f∥2^4.
We have 4π²∥yf̂∥2^2 = ∥2πiyf̂∥2^2 = ∥(Df)^∥2^2 = ∥Df∥2^2 by Exercise-36 and [143], and therefore 16π²∥xf∥2^2∥yf̂∥2^2 = 4∥xf∥2^2∥Df∥2^2 ≥ (2|∫xf̄Df|)² by Cauchy-Schwarz. Next, note that D|f|² = D(f f̄) = f·Df̄ + Df·f̄ = 2Re(f̄Df), and consequently ∫_{−u}^{u} |f|² = ∫_{−u}^{u} |f|²·1 = x|f|²|_{−u}^{u} − 2Re∫_{−u}^{u} xf̄Df. Letting u → ∞ and using the vanishing of x|f|² at infinity, we conclude ∥f∥2^2 = 0 − 2Re∫xf̄Df, so that ∥f∥2^4 = (2Re∫xf̄Df)². □
Remark: (i) If f ∈ L2(R) with ∥f∥2 = 1, then inf_{a∈R} ∫_R (x − a)²|f(x)|²dx =: σ²(f) is the variance of f in the language of Probability Theory, and hence [144](ii) says σ²(f)σ²(f̂) ≥ 1/(16π²). (ii) Every f ∈ S satisfies the hypothesis of [144](ii).
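A numerical check of the inequality (a sketch, not part of the notes; NumPy and the grid are my own choices). For the Gaussian f(x) = e^{−πx²} we have f̂ = f, and it is a well-known fact that Gaussians attain equality in [144](ii) with a = b = 0:

```python
import numpy as np

# 16 pi^2 ||x f||_2^2 ||y f^||_2^2 versus ||f||_2^4 for the Gaussian,
# where f^ = f, so ||y f^||_2 = ||x f||_2.
x = np.linspace(-10.0, 10.0, 400001)
dx = x[1] - x[0]
f = np.exp(-np.pi * x**2)
norm2_sq = np.sum(f**2) * dx          # ||f||_2^2
xf_sq = np.sum(x**2 * f**2) * dx      # ||x f||_2^2
lhs = 16 * np.pi**2 * xf_sq**2
rhs = norm2_sq**2
print(lhs, rhs)  # equal up to discretization error
```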
Now we wish to define the Fourier transform of distributions. Recall the identity ∫f̂g = ∫fĝ. This suggests the definition ⟨ϕ̂, f⟩ = ⟨ϕ, f̂⟩. However, note that if f ∈ D \ {0}, then f̂ is not in D by the uncertainty principle. Therefore, to have a symmetric situation, we define the Fourier transform only for tempered distributions (and this includes distributions with compact support).
Definition and Example: For ϕ ∈ S′ we define ϕ̂ ∈ S′ by ⟨ϕ̂, f⟩ = ⟨ϕ, f̂⟩. (i) If ϕ ∈ S′ comes from an L1-function g, then ϕ̂ = ĝ since ∫f̂g = ∫fĝ. (ii) Consider the Dirac measure δa ∈ E′ ⊂ S′. We have ⟨δ̂a, f⟩ = ⟨δa, f̂⟩ = f̂(a) = ∫fe−a = ⟨e−a, f⟩, and hence δ̂a is the tempered distribution induced by the function e−a. In particular, δ̂0 = e0 = 1, which means ⟨δ̂0, f⟩ = ∫f for f ∈ S.
Remark: (i) ϕ ↦ ϕ̂ from S′ to itself is a sequentially continuous linear isomorphism due to [142]. (ii) Since Lp(R) ⊂ S′ for 1 ≤ p ≤ ∞ by [128], the Fourier transform of every f ∈ Lp(R) (1 ≤ p ≤ ∞) is now defined in the sense of distributions.
Exercise-42: Let ϕ, ψ ∈ S′. Then, (i) (aϕ + bψ)^ = aϕ̂ + bψ̂, and (ϕ̂)^ = ϕ̃.
(ii) Dϕ̂ = (−2πixϕ)^ and hence (by invertibility) (2πixϕ)^ = −Dϕ̂.
(iii) (Dϕ)^ = 2πiyϕ̂.
(iv) (eaϕ)^ = (ϕ̂)a and (ϕa)^ = e−aϕ̂.
[Hint: (ii) ⟨Dϕ̂, f⟩ = −⟨ϕ̂, Df⟩ = −⟨ϕ, (Df)^⟩ = −⟨ϕ, 2πiyf̂⟩ = ⟨−2πixϕ, f̂⟩ = ⟨(−2πixϕ)^, f⟩. (iii) ⟨(Dϕ)^, f⟩ = ⟨Dϕ, f̂⟩ = −⟨ϕ, Df̂⟩ = −⟨ϕ, (−2πixf)^⟩ = ⟨ϕ̂, 2πixf⟩ = ⟨2πiyϕ̂, f⟩. (iv) ⟨(eaϕ)^, f⟩ = ⟨eaϕ, f̂⟩ = ⟨ϕ, eaf̂⟩ = ⟨ϕ, (f−a)^⟩ = ⟨ϕ̂, f−a⟩ = ⟨(ϕ̂)a, f⟩, and ⟨(ϕa)^, f⟩ = ⟨ϕa, f̂⟩ = ⟨ϕ, (f̂)−a⟩ = ⟨ϕ, (e−af)^⟩ = ⟨ϕ̂, e−af⟩ = ⟨e−aϕ̂, f⟩.]
Exercise-43: (i) f̂ ∗ ĝ = (fg)^ for f, g ∈ S (this is needed for the proof of (ii)).
(ii) Let g ∈ S and ϕ ∈ S′. Then (ϕ ∗ g)^ = ĝϕ̂ and ϕ̂ ∗ ĝ = (gϕ)^.
[Hint: (i) We have (f̂ ∗ ĝ)∨ = (f̂)∨(ĝ)∨ = fg by the product rule (h1 ∗ h2)∨ = h1∨h2∨. Now use invertibility in S. (ii) ⟨(ϕ ∗ g)^, f⟩ = ⟨ϕ ∗ g, f̂⟩ = ⟨ϕ, f̂ ∗ g̃⟩ = ⟨ϕ, (fĝ)^⟩ = ⟨ϕ̂, fĝ⟩ = ⟨ĝϕ̂, f⟩, and ⟨ϕ̂ ∗ ĝ, f⟩ = ⟨ϕ̂, f ∗ (ĝ)~⟩ = ⟨ϕ̂, f ∗ g∨⟩ = ⟨ϕ, (f ∗ g∨)^⟩ = ⟨ϕ, f̂(g∨)^⟩ = ⟨ϕ, f̂g⟩ = ⟨gϕ, f̂⟩ = ⟨(gϕ)^, f⟩.]
Exercise-44: (i) If ϕ ∈ E′ ⊂ S′, then ϕ̂ is (the restriction to R of) a complex analytic function g, i.e., ⟨ϕ̂, f⟩ = ⟨g, f⟩ for f ∈ S. In particular, ϕ̂ ∈ E. Also, D^mϕ̂ has polynomial growth at ∞ for every m ≥ 0.
(ii) (Product rule) If ϕ, ψ ∈ E′, then ϕ ∗ ψ ∈ E′, and (ϕ ∗ ψ)^ = ϕ̂ψ̂, i.e., ⟨(ϕ ∗ ψ)^, f⟩ = ∫f(t)ϕ̂(t)ψ̂(t)dt for f ∈ S.
[Hint: (i) g : C → C given by g(z) = ⟨ϕ, e−z⟩ is complex analytic, and D^m g(z) = ⟨ϕ, (−2πiy)^m e−z⟩. Since ϕ ∈ E′, there exist C > 0 and N ≥ 0 with |D^m g(z)| = |⟨ϕ, (−2πiy)^m e−z⟩| ≤ C pN((−2πiy)^m e−z), from which the polynomial growth property follows¹². Since e−y(x) = e−x(y), and since the integral representing f̂(y) can be approximated by Riemann sums, we get ⟨ϕ̂, f⟩ = ⟨ϕ, f̂⟩ = ϕ(∫f(x)e−x(·)dx) = ∫f(x)ϕ(e−x(·))dx = ∫f(x)g(x)dx = ⟨g, f⟩. (ii) We know ϕ ∗ ψ ∈ E′. By [131], ⟨(ϕ ∗ ψ)^, f⟩ = ⟨ϕ ∗ ψ, f̂⟩; writing f̂(x + y) = ∫f(t)e−t(x)e−t(y)dt and interchanging ϕ and ψ with the integrals as above, this equals ∫f(t)ϕ(e−t)ψ(e−t)dt = ⟨ϕ̂ψ̂, f⟩.]
¹²See p.119 of Grafakos, Classical Fourier Analysis for the computational details.
18. Fourier transform of measures
In this and the remaining sections, we select a few topics related to Fourier Analysis and give a
very brief sketch about them, often with partial or skipped proofs. These sketches are intended as
appetizers for the students to learn more about advanced topics related to Fourier Analysis.
Definition: Let (X, A) be a measurable space. A complex measure on (X, A) is a map µ : A → C satisfying µ(A) = ∑_{k=1}^{∞} µ(Ak) whenever A = ∪_{k=1}^{∞} Ak is a measurable partition (i.e., the Ak ∈ A are pairwise disjoint with union A). For example, if f ∈ L1(R), then µ(A) := ∫_A f(x)dx defines a complex Borel measure on R (hint: use the Lebesgue dominated convergence theorem to get countable additivity). This also shows that the modulus of a complex measure does not satisfy the monotonicity property: A ⊂ B does not imply |µ(A)| ≤ |µ(B)|. For instance, if f = 1_{[0,1]} − 1_{[2,3]} and dµ = f dx, then µ(R) = 0 but µ([0, 1]) = 1.
We will show that the modulus of a complex measure can always be dominated by a finite
(positive) measure in an optimal manner. In the proof, we will use the following fact.
Fact: (see Lemma 6.3 in Rudin, Real and Complex Analysis) If z1, . . . , zN are finitely many complex numbers, then there is F ⊂ {1, . . . , N} with |∑_{k∈F} zk| ≥ π^{−1}∑_{k=1}^{N} |zk|.
[145] (Domination by a finite positive measure) Let µ be a complex measure on a measurable space (X, A). For A ∈ A, let β(A) = sup ∑_{k=1}^{∞} |µ(Ak)|, where the supremum is taken over all measurable partitions A = ∪_{k=1}^{∞} Ak. Then β is a finite positive measure on (X, A) with |µ(X)| ≤ β(X).
Proof. Clearly β(∅) = 0. Also note |µ(A)| ≤ β(A) for A ∈ A since A = A ∪ ∅ ∪ ∅ ∪ · · · is also a measurable partition. To check countable additivity for β, consider A ∈ A and a measurable partition A = ∪_{k=1}^{∞} Ak. For any measurable partition A = ∪_{n=1}^{∞} Bn, we have that Ak = ∪_{n=1}^{∞}(Ak ∩ Bn) and Bn = ∪_{k=1}^{∞}(Ak ∩ Bn) are measurable partitions of Ak and Bn, and hence ∑_{n=1}^{∞} |µ(Bn)| ≤ ∑_{n=1}^{∞} |∑_{k=1}^{∞} µ(Ak ∩ Bn)| ≤ ∑_{k=1}^{∞}∑_{n=1}^{∞} |µ(Ak ∩ Bn)| ≤ ∑_{k=1}^{∞} β(Ak). Taking the supremum over all measurable partitions A = ∪_{n=1}^{∞} Bn, we get β(A) ≤ ∑_{k=1}^{∞} β(Ak). To prove the reverse inequality, consider 0 ≤ ck < β(Ak), and choose a measurable partition Ak = ∪_{n=1}^{∞} Ck,n of Ak with ck < ∑_{n=1}^{∞} |µ(Ck,n)| for each k ∈ N. As A = ∪_{k,n=1}^{∞} Ck,n is a measurable partition, we get ∑_{k=1}^{∞} ck ≤ ∑_{k,n=1}^{∞} |µ(Ck,n)| ≤ β(A), and it follows that ∑_{k=1}^{∞} β(Ak) ≤ β(A) by the choice of the ck's.
If β(X) = ∞, we derive a contradiction as follows. Given M > 0, choose a measurable partition X = ∪_{k=1}^{∞} Ak with ∑_{k=1}^{N} |µ(Ak)| > πM for some N ∈ N. Applying the Fact mentioned above with zk = µ(Ak), find F ⊂ {1, . . . , N} such that for B := ∪_{k∈F} Ak we have |µ(B)| = |∑_{k∈F} µ(Ak)| ≥ π^{−1}∑_{k=1}^{N} |µ(Ak)| > M. As M > 0 is arbitrary, we must have sup{|µ(B)| : B ∈ A} = ∞. Therefore we can find a sequence (Bn) in A such that |µ(B1)| ≥ 1 and |µ(Bn+1)| ≥ 1 + ∑_{j=1}^{n} |µ(Bj)|. Put C1 = B1 and Cn+1 = Bn+1 \ ∪_{j=1}^{n} Bj. Then |µ(Cn)| ≥ 1 for every n ∈ N and hence the series ∑_{n=1}^{∞} µ(Cn) cannot converge to any complex number. On the other hand, we should have µ(∪_{n=1}^{∞} Cn) = ∑_{n=1}^{∞} µ(Cn) since the Cn's are disjoint. This is the required contradiction. □
Remark: (i) In the above, β(X) is called the total variation of µ, and β is called the total variation measure of µ (often β is denoted |µ|; note that |µ(A)| ≤ |µ|(A), but equality may not hold). (ii) Because of [145], the collection of complex measures does not include all positive measures, since a positive measure need not be finite!
Definition: If X is a metric space, let M(X) denote the collection of all complex Borel measures on X. For µ ∈ M(R), its Fourier transform µ̂ : R → C is defined as µ̂(y) = ∫_R e−y(x)dµ(x) for y ∈ R. Similarly, the nth Fourier coefficient of µ ∈ M(T) is defined as µ̂(n) = ∫_T e−n(x)dµ(x) for n ∈ Z. Note that if µ ∈ M(R) is absolutely continuous w.r.to the Lebesgue measure, then by the Radon-Nikodym theorem there is f ∈ L1(R) with dµ = f dt, and hence µ̂(y) = ∫_R e−y(t)f(t)dt = f̂(y).
Remark: The Fourier transform of a measure shares many (but not all) of the properties of the Fourier transform of a function. This is not surprising because we can identify f ∈ L1(R) with f dx ∈ M(R). Also, it can be shown that M(R) is a Banach space w.r.to the norm ∥µ∥ := |µ|(R).
Exercise-45: (Properties - I) Let µ ∈ M(R), and let |µ| = β be as in [145]. Then,
(i) |∫_R f dµ| ≤ ∫_R |f| d|µ|.
(ii) µ̂ ∈ L∞(R) with ∥µ̂∥∞ ≤ |µ|(R).
(iii) µ̂ : R → C is uniformly continuous.
(iv) In general, the Riemann-Lebesgue lemma fails for µ̂, i.e., µ̂ may not vanish at infinity; for example, δ̂0(y) = ∫_R e−y dδ0 = δ0(e−y) = 1 for every y ∈ R. However, if µ is absolutely continuous w.r.to the Lebesgue measure, then lim_{|y|→∞} µ̂(y) = 0 by the Radon-Nikodym theorem.
(v) If µ has compact support, then µ̂ ∈ E with D^mµ̂(y) = (−2πi)^m ∫_R x^m e−y(x)dµ(x) for m ∈ N and y ∈ R.
(vi) If f, f̂ ∈ L1(R) (for instance, if f ∈ S), then ∫_R f dµ = ∫_R f̂(y)µ̂(−y)dy.
[Hint: (ii) |µ̂(y)| ≤ ∫_R |e−y(x)|d|µ|(x) = ∫_R 1 d|µ| = |µ|(R). (iii) |µ̂(y + t) − µ̂(y)| ≤ ∫_R |e−t(x) − 1|d|µ|(x). (v) Let supp(µ) ⊂ (−b, b). We have t^{−1}(µ̂(y + t) − µ̂(y)) = ∫_{−b}^{b} e−y(x)t^{−1}(e−t(x) − 1)dµ(x), |t^{−1}(e−t(x) − 1)| ≤ 2π|x|, and x ↦ 2π|x| is |µ|-integrable on (−b, b). By the dominated convergence theorem, Dµ̂(y) = ∫_{−b}^{b} e−y(x)(−2πix)dµ = −2πiν̂(y), where dν = x dµ; now iterate. (vi) f = (f̂)∨ by Fourier inversion, and hence ∫_R f dµ = ∫_R∫_R f̂(y)ey(x)dy dµ(x) = ∫_R f̂(y)µ̂(−y)dy by Fubini's theorem.]
Exercise-46: (Properties - II) Let µ, ν be complex measures on R. Then,
(i) (Linearity) (aµ + bν)^ = aµ̂ + bν̂ for a, b ∈ C.
(ii) (µ ∗ ν)^ = µ̂ν̂, where µ ∗ ν(A) := ∫_R∫_R 1_A(x + y)dµ(x)dν(y) = ∫_R µ(A − y)dν(y) = ∫_R ν(A − x)dµ(x).
(iii) ∫_R µ̂ dν = ∫_R ν̂ dµ.
(iv) (Uniqueness) If µ̂ = ν̂, then µ = ν.
[Hint: (ii) µ ∗ ν satisfies ∫_R f d(µ ∗ ν) = ∫_R∫_R f(x + y)dµ(x)dν(y) for f ∈ L1(R, µ) ∩ L1(R, ν). Hence (µ ∗ ν)^(z) = ∫_R∫_R e−z(x + y)dµ(x)dν(y) = ∫_R µ̂(z)e−z(y)dν(y) = µ̂(z)ν̂(z). (iii) Use Fubini's theorem. (iv) By Exercise-45(vi), ∫_R f dµ = ∫_R f dν for every f ∈ D ⊂ S. Now 1_{[a,b]} can be approximated by members of D, and hence µ([a, b]) = ν([a, b]) for every a < b.]
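The multiplicativity in (ii) is easy to watch for atomic measures (a sketch, not from the notes; NumPy and the atom data below are arbitrary illustrative choices). For µ = ∑ cj δ_{aj}, the transform is µ̂(y) = ∑ cj e−y(aj), and µ ∗ ν places mass cj·dk at aj + bk:

```python
import numpy as np

# Transform of an atomic complex measure given as a list of (atom, mass).
def hat(atoms, y):
    return sum(c * np.exp(-2j * np.pi * a * y) for a, c in atoms)

mu = [(0.0, 1.0), (1.5, -2.0j)]
nu = [(-0.5, 0.3), (2.0, 1.0 + 1.0j)]
conv = [(a + b, c * d) for a, c in mu for b, d in nu]   # atoms of mu * nu
err = max(abs(hat(conv, y) - hat(mu, y) * hat(nu, y)) for y in (0.1, 0.7, 3.2))
print(err)  # ~ machine precision: (mu * nu)^ = mu^ nu^
```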
Remark: (i) Analogues of Exercise-45 and Exercise-46 hold for complex measures on T. (ii) µ ∈ M(T) is called a Rajchman measure if the Riemann-Lebesgue lemma holds for µ, i.e., if lim_{|n|→∞} µ̂(n) = 0. For example, if µ ∈ M(T) is absolutely continuous w.r.to the Lebesgue measure, then µ̂ = f̂ for some f ∈ L1(T) by the Radon-Nikodym theorem, and therefore µ is a Rajchman measure. By a theorem of Neder, every Rajchman measure µ is continuous in the sense that µ({a}) = 0 for every a ∈ T.
Seminar topic: (Bochner's theorem) Let g : R → C be continuous. Then g = µ̂ for some µ ∈ M(R) with µ ≥ 0 iff g is positive definite in the sense that ∑_{j,k=1}^{n} g(xj − xk)zjz̄k ≥ 0 for all finite collections x1, . . . , xn ∈ R and z1, . . . , zn ∈ C (see Section 2.8 in Katznelson, An Introduction to Harmonic Analysis).
Going back to the theory of Fourier series, we may now supplement [119] as follows:
[146] Let f ∈ C(T) be of bounded variation. Then,
(i) Let µ ∈ M(T) be given by the Riemann-Stieltjes integral w.r.to f, i.e., µ(A) = ∫_T 1_A df for Borel subsets A ⊂ T. Then 2π|nf̂(n)| ≤ |µ̂(n)| ≤ |µ|(T) for every n ∈ Z.
(ii) lim_{N→∞} ∥f − sN(f)∥∞ = 0, i.e., (sN(f)) → f uniformly.
Proof. (i) Recall that a function of bounded variation is differentiable a.e. Now, integration by parts gives 2πinf̂(n) = 2πin∫_0^1 f(t)e−n(t)dt = 0 + ∫_0^1 e−n(t)Df(t)dt = ∫_0^1 e−n(t)df(t) = ∫_0^1 e−n(t)dµ(t) = µ̂(n). Now use (the analogue of) Exercise-45(ii).
(ii) lim_{N→∞} ∥f − σN(f)∥∞ = 0 by [115], and sup_n |nf̂(n)| < ∞ by part (i). Now use Hardy's Tauberian theorem [111] (to be precise, a uniform version of [111]) to deduce lim_{N→∞} ∥f − sN(f)∥∞ = 0. □
Remark: For Fourier series, termwise integration is always allowed, but termwise differentiation is allowed only under extra hypotheses. We explain: (i) Let f ∈ L1(T) and F : T → C be F(t) = ∫_0^t f(s)ds. Then F is absolutely continuous, and hence is a continuous function of bounded variation (see 5.4 of Royden, Real Analysis). By [146], the Fourier series of F converges to F uniformly. Since uniform convergence allows the interchange of integration and summation, the Fourier series of F is obtained by termwise integration of the Fourier series of f. (ii) Since termwise differentiation of ∑_{n∈Z} f̂(n)en brings an additional factor n to the numerator, the resulting series may not converge for a general f ∈ L1(T). However, if we assume some smoothness condition, say that f ∈ L1(T) is piecewise C², then it can be shown that the series obtained by termwise differentiation of ∑_{n∈Z} f̂(n)en(t) converges pointwise to [Df(t+) + Df(t−)]/2 for every t ∈ T.
19. Poisson summation formula
If f ∈ L1(R), we may ask what the series ∑_{n∈Z} f̂(n)en represents. The Poisson summation formula relates this series to the periodization of f defined below.
Definition: For f ∈ L1(R), its periodization fP on T is formally defined as fP(t) = ∑_{k∈Z} f(t + k) for t ∈ T = [0, 1) (we may also view fP as defined on the whole of R with period 1, given by the same series). For example, the periodization of the continuous Dirichlet kernel is the discrete Dirichlet kernel (see p.223 of M.A. Pinsky, Introduction to Fourier Analysis and Wavelets).
[147] Let f ∈ L1(R) and let fP be its periodization on T defined above. Then,
(i) fP(t) ∈ C, i.e., the series ∑_{k∈Z} f(t + k) is convergent, for a.e. t ∈ T. Also, fP ∈ L1(T) with f̂P(n) = f̂(n) for every n ∈ Z.
(ii) (Poisson summation formula) Assume in addition that ∑_{n∈Z} |f̂(n)| < ∞. Then (after modifying fP on a null set), fP ∈ C(T) with ∑_{k∈Z} f(t + k) = fP(t) = ∑_{n∈Z} f̂(n)en(t) for every t ∈ T.
Proof. (i) As f ∈ L1(R), we see ∞ > ∫_R |f(x)|dx = ∑_{k∈Z} ∫_k^{k+1} |f(x)|dx = ∑_{k∈Z} ∫_0^1 |f(x + k)|dx. Since the series is absolutely convergent, we may interchange summation and integration to get ∫_0^1 (∑_{k∈Z} |f(x + k)|)dx < ∞. This shows fP(x) is finite a.e. and fP|_{[0,1)} ∈ L1(T). Similarly, an interchange of series and integration yields f̂P(n) = ∫_0^1 fP(t)e−n(t)dt = ∫_0^1 (∑_{k∈Z} f(t + k))e−n(t)dt = ∑_{k∈Z} ∫_0^1 f(t + k)e−n(t)dt = ∑_{k∈Z} ∫_k^{k+1} f(y)e−n(y)dy = ∫_R f(y)e−n(y)dy = f̂(n).
(ii) By part (i), ∑_{n∈Z} |f̂P(n)| = ∑_{n∈Z} |f̂(n)| < ∞, and hence by Exercise-16(iii) we obtain fP(t) = ∑_{n∈Z} f̂P(n)en(t) = ∑_{n∈Z} f̂(n)en(t) for t ∈ T. □
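The Poisson summation formula in (ii) is easy to verify numerically for the Gaussian f(x) = e^{−πx²}, for which f̂ = f (a sketch, not from the notes; NumPy and the truncation |k| ≤ 30 are ad hoc choices, with both tails negligible):

```python
import numpy as np

# sum_k f(t+k) should equal sum_n f^(n) e_n(t) for f(x) = exp(-pi x^2).
t = 0.3
k = np.arange(-30, 31)
lhs = np.sum(np.exp(-np.pi * (t + k)**2))                  # periodization
rhs = np.sum(np.exp(-np.pi * k**2) * np.exp(2j * np.pi * k * t)).real
print(lhs, rhs)
```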
As an application, we mention below a case of recovering g : R → C from just knowing g|_Z.
Exercise-47: (Sampling formula - simple form) Let f ∈ L1(T) be with ∑_{n∈Z} |f̂(n)| < ∞, and let g : R → C be g(y) = ∫_{−1/2}^{1/2} f(t)ey(t)dt for y ∈ R. Then g(x) = ∑_{n∈Z} g(n)·sin π(x − n)/(π(x − n)) for x ∈ R.
[Hint: Parametrize T = [−1/2, 1/2) and extend f to R by putting f = 0 for |t| > 1/2. Then f ∈ L1(R) and f̂(y) := ∫_R f(x)e−y(x)dx = g(−y) for y ∈ R. The Poisson summation formula for the periodization fP ∈ L1(T) gives fP(t) = ∑_{n∈Z} f̂(n)en(t) = ∑_{n∈Z} g(−n)en(t) = ∑_{n∈Z} g(n)e−n(t) for t ∈ T. Since fP = f on T, we get g(x) = ∫_{−1/2}^{1/2} f(t)ex(t)dt = ∫_{−1/2}^{1/2}(∑_{n∈Z} g(n)e−n(t))ex(t)dt = ∑_{n∈Z} g(n)∫_{−1/2}^{1/2} e_{x−n}(t)dt, where the interchange of integration and series is justified by uniform convergence. Now note that ∫_{−1/2}^{1/2} e_{x−n}(t)dt = e_{x−n}(t)/(2πi(x − n))|_{t=−1/2}^{1/2} = sin π(x − n)/(π(x − n)).]
Remark: In the language of Physics, Exercise-47 is called the sampling of a bandlimited signal, where a bandlimited signal is one whose Fourier transform has compact support; note that if we think of f in Exercise-47 as defined on the whole of R, then part of the hypothesis is supp(f) ⊂ [−1/2, 1/2].
20. Two theorems of Wiener
We will present two theorems of Wiener in Fourier Theory using tools from the theory of Banach algebras¹³. This will also provide an opportunity for the student to see the fruitful interaction among different branches of Mathematics. A few facts about Banach algebras will be briefly mentioned below.
Definition: A Banach algebra is a complex Banach space Γ admitting an associative multiplication
operation ‘·’ that satisfies the following for s, t ∈ Γ:
(i) (submultiplicative property of norm) ∥s · t∥ ≤ ∥s∥∥t∥, and
(ii) (bilinearity of product) (s, t) 7→ s · t is linear w.r.to addition in each variable.
If the multiplication is also commutative, then Γ is called a commutative Banach algebra. If there
is a multiplicative identity, then Γ is called a unital Banach algebra. It may be noted that there is
a simple procedure by which a multiplicative unit (say) u can be added to any non-unital Banach
algebra Γ to convert it into a (slightly larger) unital Banach algebra Γ + Cu.
Some examples of commutative Banach algebras: (i) C with usual multiplication.
(ii) L∞(R) with pointwise product as multiplication.
(iii) C(K) := {all continuous f : K → C} with sup-norm and pointwise multiplication, where K is
a compact Hausdorff space.
(iv) L1(R) with convolution as multiplication (recall that ∥f ∗ g∥1 ≤ ∥f∥1∥g∥1 for f, g ∈ L1(R)).
(v) l1(Z) with discrete convolution x ∗ y(k) := ∑_{n∈Z} x(k − n)y(n) as multiplication, whose multi-
plicative identity is the element (. . . , 0, 0, 1, 0, 0, . . .), where 1 is at the 0th place.
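As a sanity check on example (v), the convolution of finitely supported sequences can be coded in a few lines (a sketch with ad hoc names, representing a sequence as a dict {index: value}):

```python
def conv(x, y):
    # discrete convolution: (x * y)(k) = sum_n x(k - n) y(n)
    out = {}
    for m, xm in x.items():
        for n, yn in y.items():
            out[m + n] = out.get(m + n, 0.0) + xm * yn
    return out

def norm1(x):
    # the l^1 norm of a finitely supported sequence
    return sum(abs(v) for v in x.values())

def v(n):
    # standard basis element v_n
    return {n: 1.0}

x = {-1: 2.0, 0: -1.0, 3: 0.5}
y = {0: 1.0, 2: 4.0}
```

With this one can verify v_i ∗ v_j = v_{i+j}, that v_0 is the multiplicative unit, and the submultiplicative inequality ∥x ∗ y∥_1 ≤ ∥x∥_1 ∥y∥_1.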
Definition: Let Γ be a commutative Banach algebra and Γ∗ = {all continuous linear ϕ : Γ → C}.
We say ϕ ∈ Γ∗ \ {0} is a multiplicative functional if ϕ(st) = ϕ(s)ϕ(t) for s, t ∈ Γ. If K is a compact
Hausdorff space and a ∈ K, then the evaluation map ϕa : C(K) → C given by ϕa(f) = f(a) is a
multiplicative functional on C(K); also, ker(ϕ_a) = {f ∈ C(K) : f(a) = 0} is a maximal ideal in C(K).
We will use the following facts about a commutative unital Banach algebra Γ in the sequel.
Fact-1: Maximal ideals in Γ are precisely the kernels of multiplicative functionals on Γ.
13Originally these theorems were proved using the tools of Fourier Analysis, with rather complicated proofs.
Fact-2: An element t ∈ Γ is invertible w.r.to multiplication iff ϕ(t) ≠ 0 for every multiplicative
functional ϕ on Γ.
Fact-3: Let M(Γ) = {all multiplicative functionals on Γ} ≃ {all maximal ideals in Γ}, which is
called the Gelfand space of Γ. We have M(Γ) ⊂ the unit sphere of the dual Γ∗, and (M(Γ),weak*)
is a compact Hausdorff space. Also, the evaluation map (called the Gelfand map) E : Γ → C(M(Γ))
given by Et(ϕ) = ϕ(t) for t ∈ Γ and ϕ ∈ M(Γ) embeds Γ in C(M(Γ)). If Γ has no multiplicative
unit, then we can only say that M(Γ) is a locally compact Hausdorff space, and ∥ϕ∥ ≤ 1 for
every ϕ ∈ M(Γ) (if ∥ϕ∥ > 1, there is t ∈ Γ with ∥t∥ < 1 < |ϕ(t)|; then ∥t^n∥ ≤ ∥t∥^n → 0, but
|ϕ(t^n)| = |ϕ(t)|^n → ∞, a contradiction to the continuity of ϕ).
[148] (i) ϕ ∈ M(l1(Z)) ⇔ ∃ z ∈ C with |z| = 1 such that ϕ(x) = ∑_{n∈Z} x(n)z^n. Thus M(l1(Z)) = T.
(ii) (Wiener's theorem about invertibility) Let C1(T) = {f ∈ C(T) : (f̂(n))_{n∈Z} ∈ l1(Z)}. If f ∈ C1(T)
is non-vanishing on T, then 1/f ∈ C1(T).
Proof. (i) Here we think of T as T = {z ∈ C : |z| = 1}. For z ∈ T, let ϕ_z : l1(Z) → C be
ϕ_z(x) = ∑_{n∈Z} x(n)z^n, which is linear and continuous, and not identically zero. Now, ϕ_z(x ∗ y) =
∑_{k∈Z} (x ∗ y)(k)z^k = ∑_{k∈Z} ∑_{n∈Z} x(k − n)y(n)z^k = ∑_{m∈Z} ∑_{n∈Z} x(m)y(n)z^{m+n} = ϕ_z(x)ϕ_z(y), and
thus ϕ_z ∈ M(l1(Z)). Conversely, consider ϕ ∈ M(l1(Z)). Let {v_n : n ∈ Z} be the standard
basis of l1(Z), where v_n(k) = 1 for k = n and v_n(k) = 0 for k ≠ n. Verify that v_i ∗ v_j = v_{i+j},
and that v_0 is the multiplicative identity for convolution in l1(Z). Let z = ϕ(v_1) ∈ T. Then ϕ(v_2) =
ϕ(v_1 ∗ v_1) = ϕ(v_1)ϕ(v_1) = z^2, and inductively ϕ(v_n) = z^n for n ∈ N. Moreover, ϕ(v_0) = 1 = z^0 and
ϕ(v_{−n}) = 1/ϕ(v_n) = z^{−n} for n ∈ N since v_n ∗ v_{−n} = v_0. Since span{v_n : n ∈ Z} is dense in l1(Z),
it follows by the linearity and continuity of ϕ that ϕ(x) = ∑_{n∈Z} x(n)z^n for every x ∈ l1(Z). It can
also be shown that the correspondence M(l1(Z)) ↔ T is a homeomorphism.
(ii) Here we parametrize T as T = [0, 1). We may identify C1(T) with l1(Z) by the correspondence
f ↔ (f̂(n))_{n∈Z}. In this correspondence, the convolution in l1(Z) corresponds to pointwise product
in C1(T) by Fourier inversion. If f ∈ C1(T) is non-vanishing, then f(t) = ∑_{n∈Z} f̂(n)e_n(t) =
∑_{n∈Z} f̂(n)(e^{2πit})^n ≠ 0 for every t ∈ T = [0, 1). This means by (i) that ϕ(f) ≠ 0 for every ϕ ∈
M(l1(Z)). Then by Fact-2, f is invertible w.r.to multiplication in C1(T). Hence 1/f ∈ C1(T). □
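Wiener's theorem can also be observed numerically (a sketch, not from the notes; names are ad hoc): for the non-vanishing function f(t) = 2 + cos 2πt, whose only nonzero Fourier coefficients are f̂(0) = 2 and f̂(±1) = 1/2, the Fourier coefficients of 1/f computed by FFT decay geometrically, so (1/f)^(n) is indeed summable:

```python
import numpy as np

M = 1024
t = np.arange(M) / M
f = 2.0 + np.cos(2 * np.pi * t)     # non-vanishing, with (fhat(n)) in l^1(Z)

# approximate Fourier coefficients of 1/f via the FFT
c = np.fft.fft(1.0 / f) / M         # c[n] ~ (1/f)^(n), indices mod M
coef = np.fft.fftshift(c)           # reorder to n = -M/2, ..., M/2 - 1
l1_norm = np.sum(np.abs(coef))      # finite truncation of the l^1 norm
```

Here one can check in closed form that (1/f)^(n) = (√3 − 2)^{|n|}/√3, so the decay is geometric: the coefficients at |n| = 30 are already below 1e-12, and the l1 norm is about 1.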
To prove the second theorem of Wiener, first we identify multiplicative functionals on L1(R).
We start with a little abstract theory that generalizes [148](i).
Definition: Let G be a locally compact second countable abelian group (examples: Z, T, R). Define
its dual group Ĝ = {all continuous group homomorphisms α : G → T}, where the group operation
is pointwise multiplication in T. Any α ∈ Ĝ is called a character of G.
Fact-4: (i) α ∈ R̂ iff there is y ∈ R with α(x) = e_y(x), and hence R̂ = R. (ii) α ∈ Ẑ iff there is
z ∈ T with α(n) = z^n, and hence Ẑ = T. By duality, T̂ = Z.
[149] (i) Let G be a locally compact second countable abelian group equipped with Haar measure.
Then we may identify Ĝ with M(L1(G)), where α ∈ Ĝ corresponds to ϕ_α ∈ M(L1(G)) given by
ϕ_α(f) = ∫_G fα dµ for f ∈ L1(G).
(ii) (Fourier transform gives all multiplicative functionals) ϕ ∈ M(L1(R)) iff there is y ∈ R with
ϕ(f) = f̂(y) for f ∈ L1(R).
Proof. (i) All integrations considered below are w.r.to the Haar measure µ. We know that L1(G)∗ =
L∞(G), and any ϕ ∈ L1(G)∗ is given by ϕ(f) = ∫ fg for some g ∈ L∞(G). Since Ĝ ⊂ L∞(G), it
follows from the definition of ϕ_α that ϕ_α ∈ L∞(G) = L1(G)∗ for α ∈ Ĝ. If K ⊂ G is a compact
set of positive measure, then ϕ_α(ᾱ·1_K) = ∫_K |α|^2 dµ = µ(K) > 0, and hence ϕ_α ≠ 0. Moreover,
ϕ_α(f ∗ g) = ∫ (f ∗ g)α = ∫∫ f(x − y)g(y)α(x)dµ(x)dµ(y) = ∫∫ f(z)g(y)α(y + z)dµ(y)dµ(z) =
ϕ_α(f)ϕ_α(g) by Fubini since α(y + z) = α(y)α(z), and thus ϕ_α ∈ M(L1(G)) for α ∈ Ĝ.
For α, β ∈ Ĝ, if ϕ_α = ϕ_β, then α must coincide with β in L∞(G), which means α = β almost
everywhere. And then α = β everywhere since α, β are continuous. This shows α ↦ ϕ_α is injective.
Now consider ϕ ∈ M(L1(G)). We need to show ϕ = ϕ_α for some α ∈ Ĝ. Let g ∈ L∞(G) be
with ϕ(f) = ∫ fg, and let h ∈ L1(G) be with ϕ(h) = 1. Since ϕ is multiplicative, we observe ϕ(f) =
1·ϕ(f) = ϕ(h)ϕ(f) = ϕ(h ∗ f) = ∫∫ h(x − y)f(y)g(x)dµ(x)dµ(y) = ∫ f(y)(∫ h_y(x)g(x)dµ(x))dµ(y) =
∫ f(y)ϕ(h_y)dµ(y), which suggests that we define α : G → T as α(y) := ϕ(h_y). Since ϕ and
y ↦ h_y are continuous, α is continuous. Since translation commutes with convolution, α(y + z) =
ϕ(h_{y+z}) = ϕ(h)ϕ(h_{y+z}) = ϕ(h ∗ h_{y+z}) = ϕ(h_z ∗ h_y) = ϕ(h_z)ϕ(h_y) = α(z)α(y), which shows
α(y + z) = α(y)α(z). It remains to show |α| = 1. Since ∥ϕ∥ ≤ 1 by Fact-3, |α(y)| = |ϕ(h_y)| ≤ ∥h_y∥_1 = ∥h∥_1
by Exercise-2. For any n ∈ N, we see |α(y)|^n = |α(ny)| ≤ ∥h∥_1, and hence |α(y)| ≤ ∥h∥_1^{1/n} → 1,
so that |α(y)| ≤ 1; similarly |α(−y)| ≤ 1. Since α(y)α(−y) = α(y − y) = α(0) = ϕ(h) = 1, we
must have |α(y)| = 1.
(ii) By Fact-4, any character α of R is of the form α = e_y for some y ∈ R. Since ē_y = e_{−y}, part (i)
applied to G = R yields the required result. □
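The multiplicativity behind [149](ii) is just the identity (f ∗ g)^(y) = f̂(y)ĝ(y). A quick numerical sanity check (a sketch with ad hoc names, using Riemann sums on a truncated grid) with Gaussians:

```python
import numpy as np

x = np.linspace(-20, 20, 4001)        # odd number of points, so that 'same'
dx = x[1] - x[0]                       # mode below aligns with the grid
f = np.exp(-np.pi * x ** 2)            # Gaussian in L^1(R)
g = np.exp(-np.pi * (x - 1) ** 2)      # shifted Gaussian

def ft(h, y):
    # Riemann-sum approximation of the Fourier transform at frequency y
    return np.sum(h * np.exp(-2j * np.pi * y * x)) * dx

fg = np.convolve(f, g, mode="same") * dx   # f * g sampled on the same grid
y0 = 0.7
lhs = ft(fg, y0)
rhs = ft(f, y0) * ft(g, y0)
```

Because Gaussians decay rapidly, truncating R to [−20, 20] costs essentially nothing, and lhs and rhs agree to high accuracy.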
Exercise-48: Let Γ = Cδ_0 + L1(R) be the unital Banach algebra obtained by attaching the multi-
plicative unit δ_0 for convolution to L1(R). Let ϕ_0 : Γ → C be ϕ_0(cδ_0 + f) = c. Then,
(i) M(Γ) = {ϕ_0} ∪ {ψ : ψ(cδ_0 + f) = ϕ_0(cδ_0 + f) + ϕ(f) for some ϕ ∈ M(L1(R))}.
(ii) If ψ ∈ M(Γ) \ {ϕ_0}, then there is y ∈ R such that ψ(cδ_0 + f) = c + f̂(y).
(iii) Let w ∈ D ⊂ S be with 0 ≤ w ≤ 1, and let v = w∨ ∈ S ⊂ L1(R). Let f ∈ L1(R) be such
that f̂ is non-vanishing, and put f_1(x) = \overline{f(−x)}, so that f̂_1 is the complex conjugate of f̂.
Then ϕ_0(δ_0 − v + f ∗ f_1) = 1 ≠ 0. Also, for any
ψ ∈ M(Γ) \ {ϕ_0} there is y ∈ R with ψ(δ_0 − v + f ∗ f_1) = 1 − v̂(y) + f̂(y)f̂_1(y) = 1 − w(y) + |f̂(y)|^2 > 0
since f̂ f̂_1 = |f̂|^2. Consequently, δ_0 − v + f ∗ f_1 is invertible in Γ by Fact-2.
[Hint : For (ii), use (i) and [149](ii). Compute ψ(δ_0 − v + f ∗ f_1) in (iii) using (ii).]
[150] (Wiener's theorem about translates) For f ∈ L1(R), the following are equivalent:
(i) The Fourier transform f̂ is non-vanishing.
(ii) {f ∗ g : g ∈ L1(R)} is dense in L1(R).
(iii) span{f_y : y ∈ R} is dense in L1(R), where f_y(x) = f(x − y).
Proof. (i) ⇒ (ii): Let Λ = {h ∈ L1(R) : supp(ĥ) is compact}, which is dense in L1(R) by the
Remark after [142]. Hence it suffices to show that for every h ∈ Λ, there is g ∈ L1(R) with
h = f ∗ g. So consider h ∈ Λ. Let Γ = Cδ_0 + L1(R), and let w, v, f_1 be as in Exercise-48. We
may assume w ≡ 1 in a neighborhood of supp(ĥ). By Exercise-48, δ_0 − v + f ∗ f_1 is invertible in Γ.
Let u ∈ Γ be with (δ_0 − v + f ∗ f_1) ∗ u = δ_0, and then (δ_0 − v + f ∗ f_1) ∗ u ∗ h = h. The Fourier
transform of (δ_0 − v) ∗ u ∗ h is (1 − w)û ĥ, which is 0 since we assume w ≡ 1 in a neighborhood of
supp(ĥ). By the uniqueness of Fourier transform, we must have (δ_0 − v) ∗ u ∗ h = 0, and hence
f ∗ f_1 ∗ u ∗ h = h. Writing u = cδ_0 + g_0 with g_0 ∈ L1(R), we get f ∗ (f_1 ∗ ch + f_1 ∗ g_0 ∗ h) = h.
Letting g ∈ L1(R) be the bracketed expression, we arrive at the desired conclusion f ∗ g = h.
(ii) ⇒ (iii): (Sketch) From (ii) and the inequality ∥f ∗ g∥_1 ≤ ∥f∥_1∥g∥_1, it follows that {f ∗ g : g ∈
Cc(R)} is also dense in L1(R) since Cc(R) is dense in L1(R). Consider g ∈ Cc(R) and assume supp(g) ⊂ [a, b].
We have f ∗ g(x) = ∫_a^b f(x − y)g(y)dy = ∫_a^b f_y(x)g(y)dy. If a = a_0 ≤ a_1 ≤ · · · ≤ a_{k−1} ≤ a_k = b is a
sufficiently fine partition of [a, b], and h(x) := ∑_{j=1}^k f_{a_j}(x)g(a_j)(a_j − a_{j−1}), then h approximates
f ∗ g in L1(R); and also h ∈ span{f_y : y ∈ R}. Therefore span{f_y : y ∈ R} is dense in L1(R).
(iii) ⇒ (i): Suppose f̂(z) = 0 for some z ∈ R. Then f̂_y(z) = e_y(z)f̂(z) = 0, and hence ĝ(z) = 0 for
every g ∈ Λ := span{f_y : y ∈ R}. Since g ↦ ĝ from L1(R) to C0(R) is continuous, and since there
are g ∈ L1(R) with ĝ(z) ≠ 0 (use Fourier inversion), it follows that Λ cannot be dense in L1(R). □
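The Riemann-sum step in (ii) ⇒ (iii) can be tested numerically (a sketch, not from the notes; names are ad hoc): with a Gaussian f and a continuous bump g supported in [−1, 1], the combination of translates h = ∑_j f_{a_j} g(a_j)(a_j − a_{j−1}) approaches f ∗ g in the L1 norm as the partition is refined:

```python
import numpy as np

dx = 0.005
x = np.arange(-15, 15, dx)
f = lambda s: np.exp(-s ** 2)                  # f in L^1(R)
g = lambda s: np.maximum(0.0, 1 - np.abs(s))   # continuous, supp(g) = [-1, 1]

# reference values of f * g on the grid (fine Riemann sum over supp(g))
s_fine = np.arange(-1, 1, 1e-3)
true = sum(f(x - s) * g(s) for s in s_fine) * 1e-3

def riemann(k):
    # h = sum_j f_{a_j} g(a_j)(a_j - a_{j-1}) over k equal subintervals of [-1, 1]
    a = np.linspace(-1, 1, k + 1)[1:]          # right endpoints a_1, ..., a_k
    return sum(f(x - aj) * g(aj) for aj in a) * (2.0 / k)

def err(k):
    # L^1 distance between f * g and the combination of k translates
    return np.sum(np.abs(true - riemann(k))) * dx
```

Refining the partition from 10 to 200 points shrinks the L1 error by a couple of orders of magnitude, as the proof sketch predicts.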
21. Sketch: interpolation and the Lp-theory of Fourier series
Operators on Lp(T) and Lp(Rn) play a significant role in the modern theory of Fourier Analysis.
While dealing with such operators, two interpolation theorems are of basic importance: the Riesz-
Thorin theorem and the Marcinkiewicz theorem. We will give a brief sketch of the former, which is
needed in our discussion of Lp-convergence of Fourier series.
Exercise-49: Let (X,µ) be a σ-finite measure space and let Lp = Lp(X,µ).
(i) Let 1 ≤ p < q ≤ ∞. If f ∈ Lp and A = {x : |f(x)| ≤ 1}, then f1A ∈ Lq.
(ii) Let 1 ≤ p < q ≤ ∞. If f ∈ Lq and B = {x : |f(x)| > 1}, then f1B ∈ Lp.
(iii) If 1 ≤ p < q < r ≤ ∞, then Lp ∩ Lr ⊂ Lq ⊂ Lp + Lr.
[Hint : (i) Since |f·1_A| ≤ 1, we have |f·1_A|^q ≤ |f·1_A|^p ≤ |f|^p. (ii) Let r = q/p > 1 and s > 1 be
with 1/r + 1/s = 1. We have |f|^p ∈ L^r, and 1_B ∈ L^s since µ(B) < ∞. Hence by Holder's inequality,
∫ |f·1_B|^p = ∫ |f|^p 1_B ≤ ∥|f|^p∥_r ∥1_B∥_s < ∞. (iii) If f ∈ Lp ∩ Lr, then f·1_{{x:|f(x)|≤1}} ∈ Lq by (i) and
f·1_{{x:|f(x)|>1}} ∈ Lq by (ii) so that (their sum) f ∈ Lq. Similar reasoning gives Lq ⊂ Lp + Lr.]
The proof of Riesz-Thorin theorem is based on the following fact from Complex Analysis:
Fact: (Three lines theorem) Consider the vertical strip S = {z ∈ C : 0 < Re(z) < 1}. Let h : S̄ → C
be a bounded continuous function analytic on S. Let M_t = sup{|h(z)| : Re(z) = t} for 0 ≤ t ≤ 1.
Then M_t ≤ M_0^{1−t} M_1^t for 0 ≤ t ≤ 1.
[151] (Riesz-Thorin interpolation theorem) Let (X,µ) and (Y, ν) be σ-finite measure spaces. Let
1 ≤ p_0 ≤ p_1 ≤ ∞, 1 ≤ q_0 ≤ q_1 ≤ ∞, and let T : L^{p_0}(µ) + L^{p_1}(µ) → L^{q_0}(ν) + L^{q_1}(ν) be a linear
operator such that T restricted to L^{p_j}(µ) is a bounded linear operator into L^{q_j}(ν) with operator
norm M_j for j = 0, 1. For 0 < t < 1, define 1 ≤ p_t, q_t ≤ ∞ by 1/p_t = (1−t)/p_0 + t/p_1 and
1/q_t = (1−t)/q_0 + t/q_1. Then T restricted to L^{p_t}(µ) is a bounded linear operator into L^{q_t}(ν)
with operator norm ≤ M_0^{1−t} M_1^t, i.e., ∥Tf∥_{q_t} ≤ M_0^{1−t} M_1^t ∥f∥_{p_t} for every f ∈ L^{p_t}(µ).
Proof. (Sketch) Observe that since L^{p_t}(µ) ⊂ L^{p_0}(µ) + L^{p_1}(µ) by Exercise-49, T is defined on L^{p_t}(µ).
For s = 0, t, 1, let r_s be the conjugate exponent of q_s, i.e., 1/q_s + 1/r_s = 1. Note that the operator norm on L^{p_t}(µ) that we
need to estimate is equal to M_t := sup{|∫_Y (Tf)g dν| : f ∈ L^{p_t}(µ) and g ∈ L^{r_t}(ν) have unit norm}.
By approximation, it is enough to consider simple functions f, g of unit norm in the expression
for M_t. Let f = ∑_{j=1}^N a_j e^{iα_j} 1_{A_j} ∈ L^{p_t}(µ) and g = ∑_{k=1}^N b_k e^{iβ_k} 1_{B_k} ∈ L^{r_t}(ν) be simple functions
of unit norm, where a_j, b_k ≥ 0. Let f_j = e^{iα_j} 1_{A_j} and g_k = e^{iβ_k} 1_{B_k} so that f = ∑_{j=1}^N a_j f_j and
g = ∑_{k=1}^N b_k g_k. Let S = {z ∈ C : 0 < Re(z) < 1}. For z ∈ S, define p_z, r_z by the condition that
1/p_z = (1−z)/p_0 + z/p_1 and 1/r_z = (1−z)/r_0 + z/r_1. Note that since a_j, b_k ≥ 0, we may define the quantities a_j^{p_t/p_z}
and b_k^{r_t/r_z}, and they depend analytically on z for z ∈ S. Let h : S → C be defined as
h(z) = ∫_Y (T(∑_{j=1}^N a_j^{p_t/p_z} f_j))(∑_{k=1}^N b_k^{r_t/r_z} g_k) dν = ∑_{j,k=1}^N a_j^{p_t/p_z} b_k^{r_t/r_z} ∫_Y (Tf_j)g_k dν,
which is clearly analytic on S. The proof is completed by verifying that h is continuous and
bounded on S̄, and then applying the Three lines theorem mentioned above. See the book of
Grafakos or Pinsky for the computational details. □
Question: Let 1 ≤ p < ∞, f ∈ Lp(T) and s_N(f) = ∑_{n=−N}^N f̂(n)e_n. Does (s_N(f)) → f in Lp(T)?
We know the answer to be YES when p = 2, and it is known that the answer is NO when p = 1.
To investigate other cases, we introduce certain operators.
Definition: Let F(T) be the collection of all trigonometric polynomials on T; note that F(T)
is dense in Lp(T) for 1 ≤ p < ∞. We write f ∈ F(T) as f = ∑_{n∈Z} f̂(n)e_n with the understanding
that it is a finite sum, i.e., f̂(n) = 0 except for finitely many n ∈ Z. We define the Hilbert transform
H and the Riesz projection P from Lp(T) to itself (1 ≤ p < ∞) by defining them on the dense
subspace F(T) (see also [152] below):
(i) H(∑_{n∈Z} f̂(n)e_n) = i(∑_{n=−∞}^{−1} − ∑_{n=1}^{∞}) f̂(n)e_n. If we define sgn(0) = 0, sgn(n) = 1 and
sgn(−n) = −1 for n ∈ N, then note that H(∑_{n∈Z} f̂(n)e_n) = −i ∑_{n∈Z} sgn(n)f̂(n)e_n.
(ii) P(∑_{n∈Z} f̂(n)e_n) = ∑_{n=1}^∞ f̂(n)e_n.
[152] Let 1 < p <∞ (note that we have excluded 1). Then the Hilbert transform H and the Riesz
projection P are bounded linear operators on Lp(T).
Proof. (Sketch) Clearly H, P are linear. For f ∈ F(T), note that f + iHf = f̂(0) + 2Pf, and
|f̂(0)| ≤ ∫_0^1 |f| = ∫_0^1 1·|f| ≤ ∥f∥_p by Holder's inequality. Therefore it suffices to show H is
bounded. We outline the structure of the proof.
Step-1 : By Parseval, ∥Hf∥_2^2 ≤ ∑_{n∈Z} |f̂(n)|^2 = ∥f∥_2^2 for f ∈ F(T), and hence H is bounded on L2(T).
Step-2 : One shows by some computation that H is bounded on L^{2k}(T) for every integer k ≥ 2. The
proof is left as a reading assignment; see for instance Lemma 3.3.4 of M.A. Pinsky, Introduction
to Fourier Analysis and Wavelets.
Step-3 : From the above two steps and the Riesz-Thorin interpolation theorem, it follows that H is
bounded on Lp(T) for 2 ≤ p < ∞.
Step-4 : If 1 < p < 2, choose q > 2 with 1/p + 1/q = 1. As the adjoint of H is −H, ∥Hf∥_p =
sup{|∫ fHg| : g ∈ F(T) and ∥g∥_q = 1} ≤ ∥f∥_p ∥Hg∥_q ≤ ∥f∥_p ∥H∥_{q→q} by Holder's inequality. As
∥H∥_{q→q} < ∞ by Step-3, we get ∥H∥_{p→p} < ∞, where ∥H∥_{p→p} is the operator norm on Lp(T). □
[153] Let 1 < p < ∞ (note that we have excluded 1). Then,
(i) sup_{N∈N} ∥s_N∥_{p→p} < ∞, and (ii) lim_{N→∞} ∥f − s_N(f)∥_p = 0 for every f ∈ Lp(T).
Proof. (i) Let P_N be the operator on Lp(T) specified by P_N(∑_{n∈Z} f̂(n)e_n) = ∑_{n=0}^{2N} f̂(n)e_n for
f = ∑_{n∈Z} f̂(n)e_n ∈ F(T). Observe that e_{−N} ∑_{n=0}^{2N} (fe_N)^(n) e_n = ∑_{n=−N}^N f̂(n)e_n, i.e., e_{−N}P_N(fe_N) =
s_N(f) for f ∈ F(T). Since multiplication by e_{±N} does not change the Lp-norm, it suffices to show
sup_{N∈N} ∥P_N∥_{p→p} < ∞. For f ∈ F(T), we have P_N f = (∑_{n=0}^{∞} − ∑_{n=2N+1}^{∞}) f̂(n)e_n = ∑_{n=0}^{∞} f̂(n)e_n −
e_{2N} ∑_{n=1}^{∞} f̂(n + 2N)e_n = f̂(0) + Pf − e_{2N}P(fe_{−2N}), where P is the Riesz projection. As P is
bounded by [152] and |f̂(0)| ≤ ∥f∥_p, we get sup_{N∈N} ∥P_N∥_{p→p} ≤ 1 + 2∥P∥_{p→p} < ∞.
(ii) Let f ∈ Lp(T) and ε > 0. Let M = sup_{N∈N} ∥s_N∥_{p→p}, which is finite by part (i). Choose a
trigonometric polynomial g with ∥f − g∥_p < ε. We have s_N(g) = g for all large N ∈ N. Therefore,
∥f − s_N(f)∥_p ≤ ∥f − g∥_p + ∥s_N(g) − s_N(f)∥_p ≤ ∥f − g∥_p + ∥s_N∥_{p→p}∥f − g∥_p < (1 + M)ε for all
large N ∈ N, which shows lim_{N→∞} ∥f − s_N(f)∥_p = 0. □
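The convergence in [153](ii) can be observed numerically (a sketch, not from the notes; names are ad hoc, and the Fourier coefficients are approximated by FFT): for the continuous function f(t) = |t − 1/2| on T, the error ∥f − s_N(f)∥_p with p = 4 decreases as N grows.

```python
import numpy as np

M, p = 4096, 4.0
t = np.arange(M) / M
f = np.abs(t - 0.5)                 # continuous function on T
c = np.fft.fft(f) / M               # c[n] ~ fhat(n), with c[-n] = c[M - n]

def sN(N):
    # the partial Fourier sum s_N(f) evaluated on the grid
    s = np.full(M, c[0])
    for n in range(1, N + 1):
        s = s + c[n] * np.exp(2j * np.pi * n * t) \
              + c[-n] * np.exp(-2j * np.pi * n * t)
    return s.real

def err(N):
    # the L^p(T) norm of f - s_N(f) (normalized Lebesgue measure on T)
    return np.mean(np.abs(f - sN(N)) ** p) ** (1.0 / p)
```

Here f̂(n) decays like 1/n^2, so the decrease is fast; for p = 1 the operators s_N are not uniformly bounded, which is why the theorem excludes that case.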
Remark: Similarly we can define the Hilbert transform and the Riesz projection on Lp(R) for 1 < p < ∞,
and use their boundedness to prove that lim_{u→∞} ∥f − s_u(f)∥_p = 0 for every f ∈ Lp(R), 1 < p < ∞.
In the proof, instead of F(T), one should use a suitable dense subset of Lp(R), for instance {f ∈
L1(R) ∩ Lp(R) : supp(f̂) is compact}.
Further reading: (i) The Hilbert transform is a prototype of an important class of operators called
multipliers; see section 3.6 in Grafakos, Classical Fourier Analysis. (ii) Modern theory of Fourier
Analysis depends heavily on singular integral operators in whose study a basic result is Calderon-
Zygmund decomposition; see section 4.3 in Grafakos, Classical Fourier Analysis. (iii) Applications
of Fourier Theory to Probability Theory can be found in Chapter 5 of M.A. Pinsky, Introduction
to Fourier Analysis and Wavelets. Fourier theory in higher dimension and many other interesting
topics can also be found in the books of Grafakos and Pinsky. See also J. Duoandikoetxea, Fourier
Analysis. For a more abstract general theory, see Rudin, Fourier Analysis on Groups.
*****