FOURIER ANALYSIS
T.K.SUBRAHMONIAN MOOTHATHU
Contents
1. Introduction
2. Convolution and approximate identities
3. Fourier series: preliminaries, and a divergence result
4. Sufficient conditions for pointwise convergence of Fourier series
5. Cesàro summability and Abel summability
6. Weak type boundedness for maximal functions
7. Fourier series: pointwise convergence of Cesàro and Abel sums
8. Pointwise convergence of Fourier series for functions of bounded variation
9. Convolution is a smoothing operation
10. Topologies on the spaces D = C_c^∞(R) and E = C^∞(R)
11. The Schwartz space S
12. Distributions: preliminaries
13. Convolution and distributions
14. Some structure theorems about distributions
15. Fourier transform on R: basics
16. Fourier transform: sufficient conditions for pointwise inversion
17. Fourier transform on S, L2(R), and on distributions
18. Fourier transform of measures
19. Poisson summation formula
20. Two theorems of Wiener
21. Sketch: interpolation and the Lp-theory of Fourier series
1. Introduction
Abstract Harmonic Analysis, the generalization of Fourier Analysis, refers generally to the representation theory of locally compact topological groups, which are not necessarily abelian. Harmonic Analysis done on Euclidean groups such as the torus and R^n (which are abelian) is usually called Fourier Analysis, and this is what we plan to study here. We will discuss the basic aspects of
Fourier Analysis from the perspective of pure Mathematics, making use of tools from Measure
Theory and Functional Analysis (and hence these two subjects are prerequisites for this course). A
little bit of knowledge about Topological Groups will also be needed for the next section.
Fourier series on the circle will be discussed first. Fourier integral on R, distributions, and a
few advanced topics will be discussed afterwards. For a study deeper than what we present here,
the student may refer to, for instance, the following books: (i) Y. Katznelson, An Introduction to
Harmonic Analysis, (ii) L. Grafakos, Classical Fourier Analysis, (iii) M.A. Pinsky, Introduction to
Fourier Analysis and Wavelets, (iv) F.G. Friedlander, Introduction to the Theory of Distributions.
Recall that if X is a Hilbert space with orthonormal basis {e_n : n ∈ N}, then any x ∈ X has the series representation x = ∑_{n=1}^∞ ⟨x, e_n⟩ e_n. The idea of representing functions on the circle as a Fourier series is quite similar. We sketch this briefly.
Let T = R/Z denote the unit circle in the complex plane, which we parametrize either as [0, 1) or as [−1/2, 1/2). We remark that in several textbooks T is parametrized either as [0, 2π) or as [−π, π), and then one gets an extra 1/(2π) factor in certain integral expressions. Irrespective of how we parametrize T, keep in mind that T is a compact metric space. Note that C(T) = {f : T → C : f is continuous} is a Banach space w.r.t. the supremum norm ∥ · ∥_∞ defined as ∥f∥_∞ = sup{|f(t)| : t ∈ T}. We equip T = [0, 1) with the Lebesgue measure, and consider the Hilbert space L2(T) of square integrable complex valued functions on T, where the inner product is ⟨f, g⟩ = ∫_0^1 f(t) ḡ(t) dt. Let e_n(t) = e^{2πint} for n ∈ Z. We see that ⟨e_m, e_n⟩ = ∫_0^1 e_{m−n}(t) dt = 0 if m ≠ n and = 1 if m = n, which means {e_n : n ∈ Z} is an orthonormal set in L2(T).
Since span{e_n : n ∈ Z} is a subalgebra of C(T) separating points, vanishing nowhere, and closed under complex conjugation, it follows by the Stone-Weierstrass theorem (Theorem 7.33 of Rudin, Principles of Mathematical Analysis) that span{e_n : n ∈ Z} is dense in (C(T), ∥ · ∥_∞). Since ∥ · ∥_2 ≤ ∥ · ∥_∞, the topology induced on C(T) by ∥ · ∥_∞ is stronger than the L2-topology. Also, C(T) is dense in L2(T), a fact from Measure Theory. It follows that span{e_n : n ∈ Z} is dense in L2(T). Thus {e_n : n ∈ Z} is an orthonormal basis for the Hilbert space L2(T).
Hence any f ∈ L2(T) has a series representation f = ∑_{n∈Z} a_n e_n, where a_n = ⟨f, e_n⟩ = ∫_0^1 f e_{−n} = ∫_0^1 f(t) e^{−2πint} dt. By the change of variable θ = 2πt (which amounts to parametrizing T as [0, 2π)), we also obtain the classical formula a_n = (1/(2π)) ∫_0^{2π} f(θ) e^{−inθ} dθ.
Observe that the representation f = ∑_{n∈Z} a_n e_n means only that the sequence (∑_{n=−N}^N a_n e_n)_{N=0}^∞ of partial sums converges to f in the L2-norm. Since Lp-convergence (1 ≤ p < ∞) does not imply pointwise convergence, the series ∑_{n∈Z} a_n e_n(t) may fail to converge to f(t) at some points t ∈ T. Therefore, among other things, it is natural to discuss the following: (i) finding sufficient conditions for the pointwise convergence of the Fourier series, (ii) the rate of convergence of the Fourier series, (iii) other types of convergence that may hold even in the absence of pointwise convergence.
2. Convolution and approximate identities
We start with a little abstract theory that is applicable to both T and R^n. Let G be a locally compact second countable abelian group throughout this section. Read the basic theory of such groups from relevant books (for instance, G admits a complete separable metric). The most important fact is that such a group G has a Haar measure µ on it, which means
(i) µ ≠ 0 is a Borel measure on G (i.e., defined on the Borel σ-algebra of G),
(ii) (local finiteness) µ(K) <∞ for every compact set K ⊂ G, and
(iii) (translation invariance) µ(A+ x) = µ(A) for every Borel set A ⊂ G and every x ∈ G.
Moreover any other measure on G satisfying the above properties must be of the form cµ for
some c > 0. A Haar measure µ on G is always regular, which means
µ(A) = sup{µ(K) : K ⊂ A and K is compact} = inf{µ(U) : A ⊂ U ⊂ G and U is open}.
Note that the Lebesgue measure is a Haar measure on T and R^n.
Definition: Let µ be a Haar measure on G. For 1 ≤ p < ∞, let Lp(G) = {f : G → C : ∫_G |f|^p dµ < ∞}, with the usual convention that we identify f and g if they agree µ-almost everywhere. For f ∈ Lp(G), let ∥f∥_p = (∫_G |f|^p dµ)^{1/p}. When p = 2, L2(G) is a Hilbert space with the inner product ⟨f, g⟩ = ∫_G f ḡ dµ. Similarly one defines L∞(G) = {f : G → C : f is measurable, and bounded µ-almost everywhere} with norm defined as ∥f∥_∞ = inf{M > 0 : |f(x)| ≤ M for x outside a µ-null set}. Let C(G) = {f : G → C : f is continuous} and Cc(G) = {f ∈ C(G) : f has compact support}, where the support of f, denoted supp(f), is the closure of the set {x ∈ G : f(x) ≠ 0}. The following is a standard fact.
Exercise-1: Let 1 ≤ p < ∞. Then Cc(G) is dense in Lp(G). In particular, if G is compact, then
C(G) is dense in Lp(G). [Hint : Find the proof from a textbook. The idea of one proof is roughly
as follows. The indicator function 1K of a compact set K ⊂ G can be approximated by members
of Cc(G) using Urysohn’s lemma. Since the measure µ is regular, the indicator function 1A of any
Borel set A ⊂ G also has such approximations. And a general f ∈ Lp(G) can be approximated by
simple functions, i.e., functions of the form ∑_{j=1}^k a_j 1_{A_j}, where the A_j ⊂ G are Borel.]
The translation invariance of the Haar measure has the following important consequence:
Exercise-2: Let f ∈ L1(G), and let f_y(x) = f(x − y) for y ∈ G. Then,
(i) ∫_G f_y dµ = ∫_G f(x − y) dµ(x) = ∫_G f dµ, f_y ∈ L1(G), and ∥f_y∥_1 = ∥f∥_1 for every y ∈ G.
(ii) For each f ∈ L1(G), the map y ↦ f_y from G to L1(G) is continuous.
[Hint: (i) The equality ∫ f_y dµ = ∫ f dµ is clear when f is an indicator function 1_A with µ(A) < ∞. And any f ∈ L1(G) may be approximated by linear combinations of such indicator functions. (ii) Since ∥f_y − g_y∥_1 = ∥(f − g)_y∥_1 = ∥f − g∥_1, and since Cc(G) is dense in L1(G) by Exercise-1, it suffices to consider f ∈ Cc(G) and establish the continuity of y ↦ f_y at y = 0 ∈ G. Let K = supp(f), the (compact) support of f, and let A ⊂ G be a compact symmetric neighborhood of 0 ∈ G. Note that µ(K + A) < ∞ since K + A is compact. As f ∈ Cc(G) is uniformly continuous, given ε > 0, we can find a symmetric neighborhood U ⊂ G of 0 with U ⊂ A such that a − b ∈ U implies |f(a) − f(b)| < ε/µ(K + A). Then, for y ∈ U, we have ∥f − f_y∥_1 = ∫_G |f(x) − f(x − y)| dµ(x) ≤ ∫_{K+A} (ε/µ(K + A)) dµ = ε.]
Remark: As G is locally compact and second countable, G is σ-compact, and consequently the
Haar measure µ on G is σ-finite. Hence Fubini’s theorem holds for µ (this will be used repeatedly).
Certain Banach spaces admit an associative multiplication operation that distributes over addi-
tion, making the Banach space a Banach algebra. For instance, l∞ has the pointwise multiplication
(an)(bn) := (anbn), and C(T) has the pointwise multiplication (fg)(x) := f(x)g(x). The Banach
space L1(G) also admits a multiplication called convolution:
Definition: Let µ be a Haar measure on G. The convolution f ∗ g of f, g ∈ L1(G) is defined as f ∗ g(x) = ∫_G f(y) g(x − y) dµ(y) = ∫_G f(y) g_y(x) dµ(y), which can be roughly thought of as a weighted average of the translates g_y of g, weighted by f(y). Applying Fubini's theorem to Φ(x, y) := |f(y) g_y(x)|, we see that ∫_G ∫_G Φ dµ dµ = ∫_G |f(y)| (∫_G |g_y(x)| dµ(x)) dµ(y) = ∫_G |f(y)| ∥g∥_1 dµ(y) = ∥f∥_1 ∥g∥_1 < ∞. Hence (f ∗ g)(x) is well-defined for µ-almost every x ∈ G, and f ∗ g ∈ L1(G) with ∥f ∗ g∥_1 ≤ ∥f∥_1 ∥g∥_1.
Exercise-3: Let µ be a Haar measure on G and let f, g, h ∈ L1(G). Then,
(i) f ∗ g ∈ L1(G) with ∥f ∗ g∥_1 ≤ ∥f∥_1 ∥g∥_1 (already done above).
(ii) (Commutativity) f ∗ g = g ∗ f. [Hint: Substitute u = x − y in the integral representing f ∗ g.]
(iii) (Associativity) f ∗ (g ∗ h) = (f ∗ g) ∗ h.
(iv) (Linearity in each variable) (af + bg) ∗ h = a(f ∗ h) + b(g ∗ h) for a, b ∈ C.
(v) (Convolution commutes with translation) (f ∗ g)_z = f_z ∗ g = f ∗ g_z for every z ∈ G.
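The algebraic identities in Exercise-3 are easy to spot-check numerically on the finite abelian group Z_n, where counting measure is a Haar measure. The following minimal sketch (test data and helper names are illustrative, not from the notes) verifies commutativity, associativity, and the norm bound ∥f ∗ g∥_1 ≤ ∥f∥_1 ∥g∥_1.

```python
import numpy as np

def conv(f, g):
    # Convolution on the cyclic group Z_n with counting (Haar) measure:
    # (f * g)(x) = sum_y f(y) g(x - y).
    n = len(f)
    return np.array([sum(f[y] * g[(x - y) % n] for y in range(n)) for x in range(n)])

rng = np.random.default_rng(0)
n = 16
f = rng.standard_normal(n) + 1j * rng.standard_normal(n)
g = rng.standard_normal(n) + 1j * rng.standard_normal(n)
h = rng.standard_normal(n) + 1j * rng.standard_normal(n)

l1 = lambda u: np.abs(u).sum()
comm_ok = np.allclose(conv(f, g), conv(g, f))                     # Exercise-3(ii)
assoc_ok = np.allclose(conv(f, conv(g, h)), conv(conv(f, g), h))  # Exercise-3(iii)
norm_ok = l1(conv(f, g)) <= l1(f) * l1(g) + 1e-9                  # Exercise-3(i)
```

Of course, a finite check proves nothing about general G; it only illustrates that the identities are purely algebraic consequences of translation invariance.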
Remark: If G is compact, then Lp(G) ⊂ L1(G) for 1 < p < ∞ since µ(G) < ∞. This inclusion does not hold when G is non-compact, e.g., G = R^n. But still f ∗ g is defined for f, g belonging to certain Lp(G) spaces, and the result below gives information about the location of g ∗ f in such cases.
[101] Let µ be a Haar measure on G.
(i) If g ∈ L1(G) and f ∈ Cc(G), then g ∗ f ∈ L∞(G) and g ∗ f is uniformly continuous.
(ii) (Minkowski's inequality) Let 1 ≤ p ≤ ∞. If g ∈ L1(G) and f ∈ Lp(G), then g ∗ f ∈ Lp(G) with ∥g ∗ f∥_p ≤ ∥f∥_p ∥g∥_1.
(iii) (Young's inequality) Let 1 ≤ p, q, r ≤ ∞ be such that 1/p + 1/r = 1/q + 1. If g ∈ Lr(G) and f ∈ Lp(G), then f ∗ g ∈ Lq(G) and ∥f ∗ g∥_q ≤ ∥g∥_r ∥f∥_p.
Proof. (i) Since f ∈ Cc(G), f is uniformly continuous and bounded. We have |g ∗ f(x)| ≤ ∥g∥_1 ∥f∥_∞. Given ε > 0, choose a symmetric neighborhood U ⊂ G of 0 ∈ G such that |f(x) − f(y)| < ε/∥g∥_1 whenever x − y ∈ U. Then |(g ∗ f)(a) − (g ∗ f)(b)| ≤ ∫_G |g(y)| |f(a − y) − f(b − y)| dµ(y) < (ε/∥g∥_1) ∫_G |g(y)| dµ(y) = ε whenever a − b ∈ U, and hence g ∗ f is uniformly continuous.
(ii) The case p = 1 is already done, and the case p = ∞ is easy. So assume 1 < p < ∞. Let q = p/(p − 1) so that 1/p + 1/q = 1. Let h(y) = f(x − y), and let ν be the Borel measure on G given by ν(A) = ∫_A |g| dµ. Applying Hölder's inequality w.r.t. ν, we have

|g ∗ f(x)| ≤ ∫_G |f(x − y)| |g(y)| dµ(y) = ∫_G 1 · |h| dν ≤ ∥1∥_{Lq(ν)} ∥h∥_{Lp(ν)} = ∥g∥_1^{1/q} (∫_G |f(x − y)|^p |g(y)| dµ(y))^{1/p}.

Since p/q = p − 1, we get ∫_G |g ∗ f(x)|^p dµ(x) ≤ ∥g∥_1^{p−1} C, where, by Fubini and translation invariance,

C = ∫_G ∫_G |f(x − y)|^p |g(y)| dµ(y) dµ(x) = ∫_G (∫_G |f(x − y)|^p dµ(x)) |g(y)| dµ(y) = ∫_G (∫_G |f(x)|^p dµ(x)) |g(y)| dµ(y) = ∫_G ∥f∥_p^p |g(y)| dµ(y) = ∥g∥_1 ∥f∥_p^p.

Thus ∥g ∗ f∥_p^p ≤ ∥g∥_1^p ∥f∥_p^p, i.e., ∥g ∗ f∥_p ≤ ∥g∥_1 ∥f∥_p.
(iii) This is a generalization of (ii) with a similar but more complicated proof, which is left as a reading assignment. See Theorem 1.2.12 of Grafakos, Classical Fourier Analysis. □
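Young's inequality can likewise be sanity-checked on Z_n with counting measure. The exponent triples below all satisfy 1/p + 1/r = 1/q + 1; this is only a numerical illustration under arbitrary test data, not a proof.

```python
import numpy as np

def conv(f, g):
    # convolution on Z_n with counting (Haar) measure
    n = len(f)
    return np.array([sum(f[y] * g[(x - y) % n] for y in range(n)) for x in range(n)])

def lp(u, p):
    # l^p norm w.r.t. counting measure, with p = inf allowed
    if p == float("inf"):
        return float(np.abs(u).max())
    return float((np.abs(u) ** p).sum() ** (1 / p))

rng = np.random.default_rng(1)
f, g = rng.standard_normal(12), rng.standard_normal(12)

# triples (p, r, q) with 1/p + 1/r = 1/q + 1
triples = [(1, 1, 1), (1.5, 1.5, 3), (2, 1, 2), (2, 2, float("inf"))]
young_ok = all(lp(conv(f, g), q) <= lp(f, p) * lp(g, r) + 1e-9 for (p, r, q) in triples)
```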
In general, L1(G) may not have a unit element for convolution, i.e., there may not exist g ∈ L1(G)
with f ∗ g = f for every f ∈ L1(G); see the Remark after [103]. However, L1(G) has what is called
approximate identities: a parametrized family of functions which, in the limiting case, behaves as
a unit element for convolution. First we will define an approximate identity formally; its behavior
justifying the name ‘approximate identity’ will be proved in [102] below.
Definition: Let µ be a Haar measure on G, and let 0 < b < ∞. A parametrized family {K_a : 0 < a < b} in L1(G) is an approximate identity for L1(G) if the following three properties are satisfied:
(A1) (Normalization) ∫_G K_a dµ = 1 for every a.
(A2) (L1-boundedness) sup{∥K_a∥_1 : 0 < a < b} < ∞.
(A3) (L1-concentration at 0 ∈ G) For any neighborhood U ⊂ G of 0 ∈ G, lim_{a→0} ∫_{G\U} |K_a| dµ = 0.
Note that if K_a ≥ 0, then (A2) follows from (A1).
Sometimes the approximate identity satisfies an additional property:
(A4) (L∞-concentration at 0 ∈ G) For any neighborhood U ⊂ G of 0 ∈ G, lim_{a→0} sup{|K_a(x)| : x ∈ G \ U} = 0. (Note that (A4) implies (A3) when µ(G) < ∞, i.e., when G is compact.)
Remark: Soon we will encounter explicit examples of approximate identities on L1(T) in relation with the theory of Fourier series. For the moment, note that if g ∈ L1(R^n) satisfies g ≥ 0 and ∫_{R^n} g dµ = 1, then we obtain an approximate identity {K_a}_{a>0} for L1(R^n) by putting K_a(x) = a^{−n} g(x/a). We have ∫_{R^n} K_a(x) dx = ∫_{R^n} a^{−n} g(x/a) dx = ∫_{R^n} g(y) dy = 1 by the change of variable y = x/a, where we used the fact that a^{−n} is the determinant of the Jacobian of the map x ↦ x/a from R^n to R^n. And ∫_{|x|>δ} |K_a(x)| dx = ∫_{|y|>δ/a} |g(y)| dy → 0 as a → 0.
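For a concrete instance of this remark, take g to be the standard Gaussian density on R (the case n = 1). A discretized check of the normalization (A1) and the concentration (A3) might look as follows; the grid, the cutoff δ, and the tolerances are arbitrary choices made for illustration.

```python
import numpy as np

def g(x):
    # standard Gaussian density: g >= 0 and integral 1
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

def K(a, x):
    # K_a(x) = a^{-1} g(x/a), the n = 1 case of a^{-n} g(x/a)
    return g(x / a) / a

x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
delta = 0.5
scales = (1.0, 0.3, 0.1)
total = {a: K(a, x).sum() * dx for a in scales}                         # stays ~ 1
tail = {a: (K(a, x) * (np.abs(x) > delta)).sum() * dx for a in scales}  # -> 0 as a -> 0
```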
Remark: In the above definition, we parametrized the approximate identity with a ∈ (0, b) and con-
sidered the limit as a→ 0. However, we can define approximate identity using other parametriza-
tions and other limiting processes; for instance, we may parametrize with n ∈ N and consider the
limit as n→ ∞, or parametrize with r ∈ (0, 1) and consider the limit as r → 1.
A general strategy: While estimating integral expressions in Fourier Theory, the following strategy
will be often followed: split the integral into two parts, one part for a neighborhood of 0, and the
other part for the region outside; then estimate each integral separately.
[102] (Justifying the name 'approximate identity') Let µ be a Haar measure on G, and let {K_a : 0 < a < b} be an approximate identity for L1(G).
(i) If f ∈ Cc(G), then lim_{a→0} ∥K_a ∗ f − f∥_∞ = 0.
(ii) If 1 ≤ p < ∞ and f ∈ Lp(G), then lim_{a→0} ∥K_a ∗ f − f∥_p = 0.
(iii) Assume in addition that G is compact and {K_a : 0 < a < b} satisfies the L∞-concentration condition (A4). If g ∈ L1(G) satisfies lim_{x→0} g(x) = c, then lim_{a→0} ∫_G K_a(x) g(x) dµ(x) = c.
Proof. (i) Using condition (A1), we note that K_a ∗ f(x) − f(x) = ∫_G K_a(y)(f(x − y) − f(x)) dµ(y). Let ε > 0 be given, and let M = sup{∥K_a∥_1 : 0 < a < b}. Since f ∈ Cc(G) is uniformly continuous, we may choose a neighborhood U ⊂ G of 0 ∈ G such that |f(x − y) − f(x)| ≤ ε/(2M) for every y ∈ U and x ∈ G. Using condition (A3), choose a_0 ∈ (0, b) such that ∫_{G\U} |K_a| dµ < ε/(4∥f∥_∞) for every a ∈ (0, a_0). Then for 0 < a < a_0, we have

|K_a ∗ f(x) − f(x)| ≤ ∫_U |K_a(y)| |f(x − y) − f(x)| dµ(y) + ∫_{G\U} |K_a(y)| |f(x − y) − f(x)| dµ(y)
≤ (ε/(2M)) ∫_U |K_a(y)| dµ(y) + 2∥f∥_∞ ∫_{G\U} |K_a(y)| dµ(y) ≤ ε/2 + ε/2 = ε

by the choice of M and a_0. As this holds for every x ∈ G, ∥K_a ∗ f − f∥_∞ ≤ ε for every a ∈ (0, a_0).
(ii) If f ∈ Lp(G) and g ∈ Cc(G), we have ∥K_a ∗ f − f∥_p ≤ ∥K_a ∗ (f − g)∥_p + ∥K_a ∗ g − g∥_p + ∥g − f∥_p. Since Cc(G) is dense in Lp(G), we can make ∥g − f∥_p arbitrarily small by choosing g suitably. By [101](ii), ∥K_a ∗ (f − g)∥_p ≤ ∥K_a∥_1 ∥f − g∥_p ≤ M∥f − g∥_p, where M = sup{∥K_a∥_1 : 0 < a < b}. Finally, ∥K_a ∗ g − g∥_p ≤ ∥K_a ∗ g − g∥_∞ → 0 as a → 0 by part (i). From these observations, it follows that ∥K_a ∗ f − f∥_p → 0 as a → 0.
(iii) We have ∫_G K_a(x) g(x) dµ(x) − c = ∫_G K_a(x)(g(x) − c) dµ(x) by (A1). Let M = sup{∥K_a∥_1 : 0 < a < b}. Given ε > 0, choose a neighborhood U ⊂ G of 0 ∈ G such that |g(x) − c| < ε/(2M) for x ∈ U. Then choose a_0 ∈ (0, b) by (A4) such that sup{|K_a(x)| : x ∈ G \ U} < ε/(2(∥g∥_1 + |c|µ(G))) for a ∈ (0, a_0), where the compactness of G ensures that µ(G) < ∞. Then, for 0 < a < a_0, as in the proof of (i),

|∫_G K_a g dµ − c| ≤ (ε/(2M)) ∫_U |K_a| dµ + (ε/(2(∥g∥_1 + |c|µ(G)))) ∫_{G\U} |g(x) − c| dµ ≤ ε/2 + ε/2 = ε. □
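A discretized illustration of [102](ii) on T = [0, 1): convolve a step function with a concentrating (wrapped) Gaussian kernel and watch ∥K_a ∗ f − f∥_1 shrink as a → 0. The grid size, kernel shape, and scales below are illustrative choices, not part of the notes.

```python
import numpy as np

m = 4096
t = np.arange(m) / m
dt = 1.0 / m
f = (t < 0.5).astype(float)                 # a step function in L1(T)

def kernel(a):
    d = np.minimum(t, 1.0 - t)              # distance to 0 on the circle
    k = np.exp(-d**2 / (2 * a**2))
    return k / (k.sum() * dt)               # normalize so the integral is 1 (property (A1))

def conv_T(k, f):
    # circular convolution approximating (k * f)(x) = int_T k(y) f(x - y) dy
    return np.real(np.fft.ifft(np.fft.fft(k) * np.fft.fft(f))) * dt

errs = [np.abs(conv_T(kernel(a), f) - f).sum() * dt for a in (0.1, 0.03, 0.01)]
```

The L1 error is dominated by the two jumps of f, so it decreases roughly linearly in a.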
3. Fourier series: preliminaries, and a divergence result
Recall that we parametrize the circle T = R/Z as [0, 1) (sometimes also as [−1/2, 1/2)), with addition performed modulo 1. Let e_r(t) = e^{2πirt} for r ∈ R, and note that ē_r = e_{−r}, where the bar stands for complex conjugation. Since the Lebesgue measure on T is finite, we have Lp(T) ⊃ Lq(T) for 1 ≤ p ≤ q < ∞, and thus the largest among them is L1(T). Whenever needed, it will be implicitly assumed that any f ∈ L1(T) is extended to the whole of R with period 1.
Definition: The Fourier series of f ∈ L1(T) is formally defined as ∑_{n∈Z} f̂(n) e_n, where f̂(n) = ∫_0^1 f(t) ē_n(t) dt = ∫_0^1 f(t) e_{−n}(t) dt. Here f̂(n) is called the nth Fourier coefficient of f for n ∈ Z.
Remark: Note that f̂(n)e_n + f̂(−n)e_{−n} = (f̂(n) + f̂(−n)) cos 2πnt + i(f̂(n) − f̂(−n)) sin 2πnt. Hence we have the formal equality of Fourier series

∑_{n∈Z} f̂(n) e_n = f̂(0) + ∑_{n=1}^∞ (f̂(−n) e_{−n} + f̂(n) e_n) = A_0 + ∑_{n=1}^∞ (A_n cos 2πnt + B_n sin 2πnt),

where A_0 = f̂(0) = ∫_T f(t) dt, A_n = f̂(n) + f̂(−n) = ∫_T f(t)(e_{−n}(t) + e_n(t)) dt = 2 ∫_T f(t) cos 2πnt dt, and B_n = i(f̂(n) − f̂(−n)) = ∫_T i f(t)(e_{−n}(t) − e_n(t)) dt = 2 ∫_T f(t) sin 2πnt dt. Observe that if f is an even function, then B_n = 0 for all n since sin is an odd function; and if f is an odd function, then A_n = 0 for all n since cos is an even function.
Exercise-4: Restricted to L2(T), the map f ↦ (f̂(n))_{n∈Z} from L2(T) to l2(Z) is an isometric isomorphism of Hilbert spaces. [Hint: See the Introduction, and use Parseval's identity.]
Example: Let T = [0, 1) and f : T → C be f(t) = t. We have f̂(0) = ∫_0^1 t dt = 1/2. For n ∈ Z \ {0}, integration by parts gives f̂(n) = ∫_0^1 t e_{−n}(t) dt = (−2πin)^{−1} t e_{−n}(t)|_0^1 − (−2πin)^{−1} ∫_0^1 e_{−n}(t) dt = (−2πin)^{−1} − 0 = (−2πin)^{−1}. Also ∥f∥_2^2 = ∫_0^1 t^2 dt = 1/3. Now ∥f∥_2^2 = ∑_{n∈Z} |f̂(n)|^2 by Exercise-4. Hence 1/3 = 1/4 + 2 ∑_{n=1}^∞ (4π^2 n^2)^{−1}. Simplification yields π^2/6 = ∑_{n=1}^∞ n^{−2}.
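The computation in this example is easy to replicate numerically: approximate f̂(n) for f(t) = t by a Riemann sum, compare with (−2πin)^{−1}, and watch Parseval reproduce π²/6. The grid size and truncation points below are arbitrary.

```python
import numpy as np

m = 1 << 16
t = np.arange(m) / m
f = t                                   # f(t) = t on T = [0, 1)

def coeff(n):
    # Riemann-sum approximation of the integral of f(t) e^{-2 pi i n t} over [0, 1)
    return (f * np.exp(-2j * np.pi * n * t)).mean()

c1 = coeff(1)
exact1 = 1 / (-2j * np.pi)              # the value (-2 pi i n)^{-1} for n = 1
# Parseval: ||f||_2^2 = 1/3 = 1/4 + 2 sum_{n >= 1} (4 pi^2 n^2)^{-1}
partial = 0.25 + 2 * sum(abs(coeff(n)) ** 2 for n in range(1, 200))
basel = sum(1 / n**2 for n in range(1, 100001))   # should approach pi^2/6
```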
Exercise-5: (Basic properties of the Fourier coefficients) Let f, g ∈ L1(T) and a, b ∈ C. Then, for every n ∈ Z we have:
(i) (Linearity) (af + bg)ˆ(n) = a f̂(n) + b ĝ(n).
(ii) (Conjugation) ( f̄ )ˆ(n) equals the complex conjugate of f̂(−n).
(iii) If f_s ∈ L1(T) is defined as f_s(t) = f(t − s) for s ∈ T, then (f_s)ˆ(n) = f̂(n) e_{−n}(s) = f̂(n) e_n(−s).
(iv) (Another expression for the Fourier coefficient) f̂(n) = (1/2) ∫_0^1 [f(t) − f(t + 1/(2n))] e_{−n}(t) dt.
(v) e_n ∗ f = f ∗ e_n = f̂(n) e_n.
(vi) (The Fourier coefficient of a convolution is the product of the Fourier coefficients) (f ∗ g)ˆ(n) = f̂(n) ĝ(n).
[Hint: (iii) (f_s)ˆ(n) = ∫_T f(t − s) e_{−n}(t) dt = ∫_T f(y) e_{−n}(y + s) dy = f̂(n) e_{−n}(s) by putting t = y + s. (iv) In f̂(n) = ∫_T f(t) e_{−n}(t) dt, substitute t = y + 1/(2n) to get f̂(n) = −∫_T f(y + 1/(2n)) e_{−n}(y) dy, and add this to the first expression for f̂(n). (v) f ∗ e_n(t) = ∫ f(s) e_n(t − s) ds = (∫ f(s) e_{−n}(s) ds) e_n(t) = f̂(n) e_n(t). (vi) Using (v) we have (f ∗ g)ˆ(n) e_n = (f ∗ g) ∗ e_n = f ∗ (g ∗ e_n) = f ∗ (ĝ(n) e_n) = ĝ(n)(f ∗ e_n) = ĝ(n) f̂(n) e_n. And we may cancel e_n ≠ 0 from both ends.]
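Property (vi) has an exact discrete counterpart: on a uniform grid, circular convolution corresponds to pointwise multiplication of discrete Fourier coefficients, so a numerical check passes essentially to machine precision. The test functions below are arbitrary smooth samples on T.

```python
import numpy as np

m = 1 << 12
t = np.arange(m) / m
dt = 1.0 / m
f = np.cos(2 * np.pi * t) + t * (1 - t)      # arbitrary test functions on T
g = np.sin(4 * np.pi * t) + 0.5

def coeff(h, n):
    # Riemann-sum approximation of the nth Fourier coefficient of h
    return (h * np.exp(-2j * np.pi * n * t)).sum() * dt

def conv_T(u, v):
    # circular convolution approximating the convolution integral on T
    return np.real(np.fft.ifft(np.fft.fft(u) * np.fft.fft(v))) * dt

fg = conv_T(f, g)
conv_thm_ok = all(abs(coeff(fg, n) - coeff(f, n) * coeff(g, n)) < 1e-6
                  for n in range(-3, 4))
```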
The smoother the function f, the faster the rate of convergence of (f̂(n)) to 0 as |n| → ∞.
[103] (Rate of decay of Fourier coefficients) Let f ∈ L1(T). Then,
(i) |f̂(n)| ≤ ∥f∥_1 for every n ∈ Z, and hence (f̂(n))_{n∈Z} ∈ l∞(Z) (this is improved below). This has the following consequence by linearity: if (f_k) → f in L1(T), then (f̂_k(n)) → f̂(n) uniformly in n.
(ii) (Riemann-Lebesgue lemma) lim_{|n|→∞} f̂(n) = 0, i.e., (f̂(n))_{n∈Z} ∈ c_0(Z) := {(x_n) : lim_{|n|→∞} x_n = 0}.
(iii) (Generalized Riemann-Lebesgue lemma) lim_{r∈R, |r|→∞} ∫_T f(t) e_r(t) dt = 0, and consequently lim_{r∈R, |r|→∞} ∫_T f(t) cos 2πrt dt = 0 and lim_{r∈R, |r|→∞} ∫_T f(t) sin 2πrt dt = 0.
(iv) If f is k-times differentiable with f^(k) ∈ L1(T), then we have (f^(k))ˆ(n) = (2πin)^k f̂(n), and hence lim_{|n|→∞} |n|^k f̂(n) = 0 (this means (f̂(n)) goes to 0 faster than (|n|^{−k})).

Proof. (i) |f̂(n)| ≤ ∫_T |f e_{−n}| dt = ∫_T |f| dt = ∥f∥_1 since |e_{−n}| = 1.
(ii) Since C(T) is dense in L1(T) by Exercise-1, and since |f̂(n) − ĝ(n)| ≤ ∥f − g∥_1, it suffices to prove the result for f ∈ C(T). And the result in this case follows from Exercise-5(iv), where we noted the expression f̂(n) = (1/2) ∫_T [f(t) − f(t + 1/(2n))] e_{−n}(t) dt.
(iii) To prove the first statement, note as above that it suffices to consider f ∈ C(T), and then note that ∫_T f(t) e_r(t) dt = (1/2) ∫_T [f(t) − f(t − 1/(2r))] e_r(t) dt for r ∈ R \ {0}, as in the hint of Exercise-5(iv). To prove the second statement, assume f is real valued and note that e_r(t) = cos 2πrt + i sin 2πrt.
(iv) Integration by parts gives f̂(n) = ∫_0^1 f e_{−n} dt = (−2πin)^{−1} f(t) e_{−n}(t)|_0^1 + (2πin)^{−1} ∫_0^1 f′ e_{−n} dt = 0 + (2πin)^{−1} ∫_0^1 f′ e_{−n} dt = (2πin)^{−1} (f′)ˆ(n). That is, (f′)ˆ(n) = 2πin f̂(n), and inductively (f^(k))ˆ(n) = (2πin)^k f̂(n). Finally, the assertion lim_{|n|→∞} |n|^k f̂(n) = 0 follows by applying part (ii) to f^(k). □
Remark: We may explain why there is no multiplicative identity for convolution in L1(T). Suppose there is f ∈ L1(T) with f ∗ g = g for every g ∈ L1(T). Consider g = ∑_{n=−∞}^∞ e_n/(1 + n^2), which converges uniformly, so g ∈ C(T) ⊂ L1(T), and note that ĝ(n) = 1/(1 + n^2) ≠ 0 for every n ∈ Z. If g = f ∗ g, then ĝ(n) = (f ∗ g)ˆ(n) = f̂(n) ĝ(n), and hence we must have f̂(n) = 1 for every n ∈ Z, which contradicts the Riemann-Lebesgue lemma.
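The Riemann-Lebesgue lemma and the decay estimate [103](iv) are visible numerically: Fourier coefficients of a function with a jump decay like 1/n, while those of a C^∞ function fall below round-off almost immediately. The particular test functions and thresholds are illustrative.

```python
import numpy as np

m = 1 << 14
t = np.arange(m) / m

def coeff(h, n):
    # Riemann-sum approximation of the nth Fourier coefficient of h
    return (h * np.exp(-2j * np.pi * n * t)).mean()

step = (t < 0.5).astype(float)            # discontinuous: |coefficient| = 1/(pi n) for odd n
smooth = np.exp(np.cos(2 * np.pi * t))    # C-infinity: faster-than-polynomial decay
ns = [2**k + 1 for k in range(3, 10)]     # odd frequencies, so the step coefficients are nonzero
step_decay = [abs(coeff(step, n)) for n in ns]
smooth_decay = [abs(coeff(smooth, n)) for n in ns]
```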
For later use, we note down the following consequences of [103](iii).
[104] Let g, h ∈ L1(T) and let b < c be in T. Then,
(i) lim_{N→∞} ∫_b^c g(t) sin(2N + 1)πt dt = 0.
(ii) If the function t ↦ t^{−1} h(t) is bounded almost everywhere in a neighborhood of 0, then lim_{N→∞} ∫_b^c t^{−1} h(t) sin(2N + 1)πt dt = 0.
Proof. (i) Note that sin(2N + 1)πt = sin 2π(N + 1/2)t, and apply [103](iii) to 1_{(b,c)} g ∈ L1(T).
(ii) Let g(t) = t^{−1} h(t). By hypothesis, there are δ ∈ (0, 1/2) and M > 0 such that |g(t)| < M for almost every t ∈ (−δ, δ). Hence ∫_T |g(t)| dt ≤ ∫_{|t|<δ} M dt + ∫_{|t|>δ} δ^{−1} |h(t)| dt ≤ 2δM + δ^{−1} ∥h∥_1 < ∞, and thus g ∈ L1(T). Apply part (i) to g. □
Question: When does the Fourier series converge pointwise?
Definition: The Nth Dirichlet kernel¹ D_N ∈ C(T) is defined as D_N = ∑_{n=−N}^N e_{−n} = ∑_{n=−N}^N e_n. Note that D_N(t) = 1 + 2 ∑_{n=1}^N cos 2πnt, and therefore D_N is a real valued even function. The Nth partial sum s_N(f) of the Fourier series of f ∈ L1(T) is defined as s_N(f) = ∑_{n=−N}^N f̂(n) e_n.
Exercise-6: (s_N in terms of D_N) For f ∈ L1(T), we have:
(i) s_N(f) = D_N ∗ f, and consequently s_N(f) is real valued whenever f is real valued.
(ii) If T is parametrized as [−1/2, 1/2), then s_N(f, a) = ∫_0^{1/2} D_N(t)[f(a + t) + f(a − t)] dt for all a ∈ T.
[Hint: (i) s_N(f) = f ∗ (∑_{n=−N}^N e_n) = f ∗ D_N = D_N ∗ f by Exercise-5(v).
(ii) s_N(f, a) = D_N ∗ f(a) = (∫_{−1/2}^0 + ∫_0^{1/2}) D_N(t) f(a − t) dt. Now the substitution t = −y converts the first integral into ∫_0^{1/2} D_N(y) f(a + y) dy since D_N is even.]
¹See the end of Section 8 for a picture of the graph of D_N.
Remark: s_N : L1(T) → L1(T) is linear, but is not a positive operator (since D_N is not ≥ 0). For any N ∈ N, we can construct f ∈ L1(T) with f ≥ 0 and s_N(f) not ≥ 0 as follows. Choose b ∈ T with D_N(b) < 0, and then choose ε > 0 and δ > 0 by the continuity of D_N such that D_N(t) < −ε for every t ∈ [b, b + 2δ]. Note that s_N(f, a) = ∫_0^1 D_N(t) f(a − t) dt = ∫_0^1 D_N(y) f(a + y) dy since D_N is even. If we take f = 1_{[b+δ, b+2δ]}, then s_N(f, a) = ∫_{b+δ−a}^{b+2δ−a} D_N(y) dy < −εδ < 0 for every a ∈ (0, δ).
Below we show that even though {D_N} fails to satisfy property (A2) in the definition of an approximate identity, {D_N} satisfies (A1) and a property similar to (A3).
[105] (i) D_N(t) = sin((2N + 1)πt)/sin(πt) for t ∈ T \ Z, and D_N(t) = 2N + 1 for t ∈ Z.
(ii) (Normalization property (A1) holds) ∫_0^1 D_N(t) dt = 1.
(iii) (L1-boundedness (A2) fails) ∥D_N∥_1 → ∞ as N → ∞.
(iv) (A property similar to the L1-concentration (A3) holds) Let 0 < δ < 1/2. Then we have lim_{N→∞} ∫_{−1/2}^{−δ} D_N(t) dt = 0 = lim_{N→∞} ∫_δ^{1/2} D_N(t) dt; more generally, lim_{N→∞} ∫_{−1/2}^{−δ} D_N(t) h(t) dt = 0 = lim_{N→∞} ∫_δ^{1/2} D_N(t) h(t) dt for every h ∈ L1(T).
Proof. (i) 2i sin(πt) D_N(t) = (e_{1/2}(t) − e_{−1/2}(t)) D_N(t) = ∑_{n=−N}^N (e_{n+1/2}(t) − e_{n−1/2}(t)) = e_{N+1/2}(t) − e_{−(N+1/2)}(t) = 2i sin((2N + 1)πt), by telescoping.
(ii) ∫_0^1 D_N(t) dt = ∑_{n=−N}^N ∫_0^1 e_n = 1 since ∫_0^1 e_0 = 1 and ∫_0^1 e_n = 0 for n ≠ 0.
(iii) Since sin x < x for x > 0, we have ∥D_N∥_1 = ∫_0^1 |D_N(t)| dt ≥ ∫_0^1 |sin((2N + 1)πt)| (πt)^{−1} dt. Put y = (2N + 1)πt and note dy/y = dt/t. We see ∥D_N∥_1 ≥ ∫_0^{(2N+1)π} |sin y| (πy)^{−1} dy ≥ ∑_{n=1}^{2N+1} ∫_{(n−1)π}^{nπ} |sin y| (nπ^2)^{−1} dy = ∑_{n=1}^{2N+1} 2(nπ^2)^{−1} → ∞ as N → ∞. We used: ∫_{(n−1)π}^{nπ} |sin y| dy = 2.
(iv) Since ∫_{−1/2}^{−δ} D_N(t) h(t) dt = ∫_δ^{1/2} D_N(t) h(−t) dt (as D_N is even), it suffices to show lim_{N→∞} ∫_δ^{1/2} D_N(t) h(t) dt = 0 for every h ∈ L1(T). Note by (i) that ∫_δ^{1/2} D_N(t) h(t) dt = ∫_T g(t) sin((2N + 1)πt) dt, where the function g defined as g(t) = 1_{(δ,1/2)}(t) h(t)/sin(πt) belongs to L1(T) since sin(πt) > sin(πδ) for t ∈ (δ, 1/2). Therefore ∫_δ^{1/2} D_N(t) h(t) dt = ∫_T g(t) sin((2N + 1)πt) dt → 0 as N → ∞ by [104](i). □
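The divergence in (iii) is slow: the L1 norms ∥D_N∥_1 (the Lebesgue constants) are known to grow like (4/π²) log N. A direct quadrature of ∥D_N∥_1 using the closed form from (i) exhibits this logarithmic growth; the grid size and thresholds below are ad hoc choices.

```python
import numpy as np

def dirichlet_L1(N, m=200001):
    # ||D_N||_1 = integral over [0,1] of |sin((2N+1) pi t) / sin(pi t)|, via a midpoint rule
    t = (np.arange(m) + 0.5) / m          # midpoints avoid the removable point t = 0
    return float(np.abs(np.sin((2 * N + 1) * np.pi * t) / np.sin(np.pi * t)).mean())

norms = {N: dirichlet_L1(N) for N in (4, 16, 64, 256)}
```

Each fourfold increase of N should add roughly (4/π²) log 4 ≈ 0.56 to the norm.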
To be alert to the possible failure of pointwise convergence of the Fourier series even in the case of continuous functions, we establish a negative result at the outset. To use in the
proof, keep in mind the Uniform boundedness theorem: if (Tα)α∈J is a pointwise bounded family of
bounded linear operators from a Banach space to a normed space, then sup{∥Tα∥ : α ∈ J} <∞.
Remark: The family {D_N} as N → ∞ cannot be an approximate identity on L1(T) because of the failure of L1-boundedness proved in [105](iii) above. This is the essential reason behind the failure of pointwise convergence for a general Fourier series.
[106] (Failure of pointwise convergence of Fourier series even for continuous functions) Parametrize T as [−1/2, 1/2). There is f ∈ C(T) with sup{|s_N(f, 0)| : N ∈ N} = ∞. So the sequence (s_N(f, 0)) of Fourier partial sums of f at 0 does not converge to f(0).
Proof. Define linear functionals ϕ_N : (C(T), ∥ · ∥_∞) → C as ϕ_N(f) = s_N(f, 0). They are bounded since |ϕ_N(f)| ≤ ∑_{n=−N}^N |f̂(n)| ≤ (2N + 1)∥f∥_∞. We need to show (ϕ_N) is not pointwise bounded. By the Uniform boundedness theorem, it suffices to show sup{∥ϕ_N∥ : N ∈ N} = ∞. For this purpose, we will show ∥ϕ_N∥ ≥ ∥D_N∥_1 for every N ∈ N. This suffices since ∥D_N∥_1 → ∞ by [105](iii).
Fix N ∈ N and let h ∈ L1(T) be such that hD_N = |D_N|, i.e., we take h(t) = 1 if D_N(t) ≥ 0 and h(t) = −1 if D_N(t) < 0. We may find a sequence (f_k) in C(T) such that ∥f_k∥_∞ ≤ 1 and (f_k) → h pointwise (check). Note that ϕ_N(f_k) = s_N(f_k, 0) = ∫_T f_k(t) D_N(0 − t) dt = ∫_T f_k(t) D_N(t) dt since D_N(−t) = D_N(t). Since the f_k's are dominated by 1 ∈ L1(T), we get by the Lebesgue dominated convergence theorem that lim_{k→∞} ϕ_N(f_k) = lim_{k→∞} ∫_T f_k(t) D_N(t) dt = ∫_T h(t) D_N(t) dt = ∫_T |D_N(t)| dt = ∥D_N∥_1. Since ∥f_k∥_∞ ≤ 1, we conclude ∥ϕ_N∥ ≥ ∥D_N∥_1, and we are done. □
Remark: However, ∥s_N(f)∥_∞ cannot grow very fast: it is known that lim_{N→∞} ∥s_N(f)∥_∞/log N = 0 for f ∈ C(T); see Proposition 1.6.6 of M.A. Pinsky, Introduction to Fourier Analysis and Wavelets.
Remark: We sketch a little history. Kolmogorov gave an example of f ∈ L1(T) whose Fourier series
diverges almost everywhere. In contrast, Carleson showed that the Fourier series of any f ∈ L2(T)
converges pointwise to f for almost every t ∈ T. Hunt extended this result to every f ∈ Lp(T) for
every 1 < p < ∞. These results are beyond our scope, and we will not prove them. However, we will prove pointwise convergence of Fourier series under some smoothness assumptions (we need to prevent f from oscillating too much). A little later, we will also prove the Fejér-Lebesgue theorem, which says that the averages of the partial sums of the Fourier series of any f ∈ L1(T) converge to f pointwise almost everywhere (i.e., outside a Lebesgue null set).
4. Sufficient conditions for pointwise convergence of Fourier series
Philosophy: If f ∈ L1(T) satisfies some smoothness condition that prevents f from oscillating too much, then we may expect (s_N(f)) to converge to f pointwise.
We start with some basic observations.
Exercise-7: Let f ∈ L1(T), w ∈ C and a ∈ T = [−1/2, 1/2).
(i) For any δ ∈ (0, 1/2) we have

s_N(f, a) − w = ∫_T D_N(t)(f(a − t) − w) dt = (∫_{|t|<δ} + ∫_{|t|>δ}) D_N(t)(f(a − t) − w) dt.

(ii) lim_{N→∞} s_N(f, a) = w ⇔ there exists δ ∈ (0, 1/2) such that lim_{N→∞} ∫_{|t|<δ} D_N(t)(f(a − t) − w) dt = 0.
(iii) (Riemann localization principle) If f ≡ 0 in a neighborhood of a, then lim_{N→∞} s_N(f, a) = 0.
[Hint: (i) w = ∫_T D_N(t) w dt since ∫_T D_N(t) dt = 1. Also, s_N(f) = D_N ∗ f. (ii) As N → ∞, the integral over {|t| > δ} in (i) goes to 0 by [105](iv), since the function t ↦ f(a − t) − w belongs to L1(T).]
Improving Exercise-7(ii), we get:
[107] (Pointwise convergence criterion - 1) Let f ∈ L1(T), w ∈ C and a ∈ T = [−1/2, 1/2). Then,
(i) lim_{N→∞} s_N(f, a) = w ⇔ there exists δ ∈ (0, 1/2) with lim_{N→∞} ∫_{|t|<δ} t^{−1}(f(a − t) − w) sin(2N + 1)πt dt = 0.
(ii) (Dini's test) If t ↦ t^{−1}(f(a − t) − w) belongs to L1(T), then lim_{N→∞} s_N(f, a) = w.
(iii) If t ↦ t^{−1}(f(a − t) − w) is bounded a.e. in a neighborhood of 0, then lim_{N→∞} s_N(f, a) = w.

Proof. (i) By Exercise-7(ii), lim_{N→∞} s_N(f, a) = w ⇔ there exists δ ∈ (0, 1/2) such that lim_{N→∞} ∫_{|t|<δ} (f(a − t) − w) D_N(t) dt = 0. Note that the integrand can be written as

(f(a − t) − w) D_N(t) = (1/sin(πt) − 1/(πt) + 1/(πt)) (f(a − t) − w) sin(2N + 1)πt.

Using l'Hôpital's rule, check that lim_{s→0}(1/sin s − 1/s) = 0. Hence g : T → C defined as g(0) = 0 and g(t) = 1/sin(πt) − 1/(πt) for t ≠ 0 satisfies g ∈ C(T) ⊂ L1(T). Therefore t ↦ g(t)(f(a − t) − w) belongs to L1(T), and so by [104](i) we have lim_{N→∞} ∫_{|t|<δ} g(t)(f(a − t) − w) sin(2N + 1)πt dt = 0 for any δ ∈ (0, 1/2). The required result follows.
(ii) Use part (i), and apply [104](i) to g(t) := t^{−1}(f(a − t) − w).
(iii) Use part (i), and apply [104](ii) to h(t) := f(a − t) − w. □
Definition: Let f ∈ L1(T) (assumed to be extended to the whole of R with period 1), and let a ∈ T. We say² f is Lipschitz at a if there exist λ ≥ 1 and a neighborhood U ⊂ T of a such that |f(a) − f(b)| ≤ λ|a − b| for every b ∈ U. Note that if f is differentiable at a, then f is Lipschitz at a (∵ define g(a) = f′(a) and g(t) = (f(t) − f(a))/(t − a) for t ≠ a. Then f(t) − f(a) = g(t)(t − a), and g is continuous at a, so that |g(t)| ≤ λ := |g(a)| + 1 in a neighborhood of a). We say f is Hölder continuous at a if there exist λ ≥ 1, α > 0, and a neighborhood U ⊂ T of a such that |f(a) − f(b)| ≤ λ|a − b|^α for every b ∈ U.
[108] Let f ∈ L1(T) and a ∈ T. Then each of the following implies lim_{N→∞} s_N(f, a) = f(a):
(i) f is differentiable at a.
(ii) f is Lipschitz at a.
2These definitions are to be understood modulo a null set since f ∈ L1(T).
(iii) f is Hölder continuous at a.
Proof. Let g(t) = t^{−1}(f(a − t) − f(a)). For (i) and (ii), note that g is bounded a.e. in a neighborhood of 0, and apply [107](iii). For (iii), it suffices by [107](ii) to show that g ∈ L1(T). By hypothesis, there exist λ ≥ 1, α > 0 and δ ∈ (0, 1/2) such that |f(a − t) − f(a)| ≤ λ|t|^α whenever |t| < δ. Then

∫_T |g(t)| dt ≤ ∫_{|t|<δ} λ|t|^{α−1} dt + ∫_{|t|>δ} δ^{−1} |f(a − t) − f(a)| dt ≤ 2λα^{−1}δ^α + 2δ^{−1}∥f∥_1 < ∞. □
Definition: Let f ∈ L1(T). We say f is piecewise continuous if for every open interval (a, b) ⊂ T,
the map f is continuous on (a, b) except possibly for finitely many jump discontinuities, and if the
limits f(a+), f(b−) exist. We say f is piecewise C1 if both f and f ′ are piecewise continuous. Next
we will prove Dirichlet’s theorem about convergence of Fourier series for piecewise C1 functions.
Exercise-8: Let f ∈ L1(T), a ∈ T, and w ∈ C.
(i) For any δ ∈ (0, 1/2) we have

s_N(f, a) − w = ∫_0^{1/2} D_N(t)[f(a + t) + f(a − t) − 2w] dt = (∫_{0<t<δ} + ∫_{t>δ}) D_N(t)[f(a + t) + f(a − t) − 2w] dt.

(ii) lim_{N→∞} s_N(f, a) = w ⇔ there exists δ ∈ (0, 1/2) with lim_{N→∞} ∫_0^δ D_N(t)[f(a + t) + f(a − t) − 2w] dt = 0.
[Hint: (i) s_N(f, a) = ∫_0^{1/2} D_N(t)[f(a + t) + f(a − t)] dt by Exercise-6. Also, w = 2 ∫_0^{1/2} D_N(t) w dt since ∫_T D_N(t) dt = 1 and D_N is even. (ii) As N → ∞, the integral over {t > δ} in (i) goes to 0 by [105](iv), since the function t ↦ f(a + t) + f(a − t) − 2w belongs to L1(T).]
We use a splitting of the integrand different from the one used in [107] for the next:
Exercise-9: (Pointwise convergence criterion - 2) Let f ∈ L1(T), a ∈ T, and assume the limits f(a+), f(a−) exist. If g_1(t) := (f(a + t) − f(a+))/sin(πt) and g_2(t) := (f(a − t) − f(a−))/sin(πt) are bounded almost everywhere in (0, δ) for some δ ∈ (0, 1/2), then lim_{N→∞} s_N(f, a) = [f(a+) + f(a−)]/2. [Hint: Let g = 1_{(0,δ)}(g_1 + g_2) and w = [f(a+) + f(a−)]/2. Then g ∈ L1(T) and ∫_0^δ D_N(t)[f(a + t) + f(a − t) − 2w] dt = ∫_0^δ g(t) sin(2N + 1)πt dt → 0 as N → ∞ by [104](i). Now apply Exercise-8(ii).]
[109] (Dirichlet’s theorem) If f ∈ L1(T) is a piecewise C1 function, then limN→∞ sN (f, a) =
[f(a+) + f(a−)]/2 for every a ∈ T. (note: [f(a+) + f(a−)]/2 = f(a) if f is continuous at a.)
Proof. Let g1(t) = (f(a + t) − f(a+))/sin πt and g2(t) = (f(a − t) − f(a−))/sin πt. By l'Hopital's rule, we get limt→0+ g1(t) = limt→0+ f′(a + t)/(π cos πt) = π^{−1}f′(a+) and limt→0+ g2(t) = limt→0+ (−f′(a − t))/(π cos πt) = −π^{−1}f′(a−). Therefore g1, g2 are bounded a.e. in (0, δ) for some δ ∈ (0, 1/2). Now the result follows by Exercise-9. □
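The conclusion of Dirichlet's theorem is easy to observe numerically. The following Python sketch (our own illustration, not part of the notes) computes sN(f, a) for the piecewise C1 square wave f = 1 on (0, 1/2), 0 on (1/2, 1), whose Fourier coefficients are f̂(0) = 1/2 and f̂(n) = (1 − (−1)^n)/(2πin) for n ≠ 0:

```python
import numpy as np

def partial_sum(N, a):
    # s_N(f, a) for the square wave f = 1 on (0, 1/2), 0 on (1/2, 1);
    # f is real, so fhat(-n) = conj(fhat(n)).
    s = 0.5  # the n = 0 term, fhat(0) = 1/2
    for n in range(1, N + 1):
        c = (1 - (-1) ** n) / (2j * np.pi * n)  # fhat(n)
        s += c * np.exp(2j * np.pi * n * a) + np.conj(c) * np.exp(-2j * np.pi * n * a)
    return s.real

# At the jump a = 0, every partial sum equals [f(0+) + f(0-)]/2 = 1/2.
print(partial_sum(50, 0.0))
# At a = 1/4, where f is smooth, s_N(f, 1/4) approaches f(1/4) = 1 as N grows.
print(partial_sum(2000, 0.25))
```

Here the coefficients are available in closed form; for a general piecewise C1 function one would approximate them by numerical integration.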
It is also possible to extend (with some effort) the arguments of Exercise-8 along the lines of [107] to prove Jordan's theorem on the convergence of Fourier series for functions of bounded variation. Instead, we will give a different proof of this a little later.
5. Cesaro summability and Abel summability
In spite of the negative result [106], it will be shown a little later that the Fourier series of f ∈ L1(T) is summable to f with respect to certain weaker notions of summability. In this section, which involves no Fourier theory, we briefly discuss two such notions: Cesaro summability and Abel summability.
Motivating observations: (i) The series 1−1+1−1+1−1+ · · · is not convergent in the usual sense
since the sequence of partial sums is (1, 0, 1, 0, 1, 0, . . .). However, the averages of the partial sums
form the sequence (1, 1/2, 2/3, 1/2, 3/5, 1/2, . . .), which converges to 1/2. (ii) Saying that ∑_{n=0}^∞ a_n converges is the same as saying that the power series f(z) = ∑_{n=0}^∞ a_n z^n converges at z = 1. If f has radius of convergence ≥ 1, then even if ∑_{n=0}^∞ a_n z^n does not converge at z = 1, the limit limr→1− ∑_{n=0}^∞ a_n r^n may exist.
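Both observations are easy to test numerically; the following Python sketch (our own illustration) computes the averages of the partial sums and the Abel means of 1 − 1 + 1 − 1 + · · · :

```python
# Partial sums of 1 - 1 + 1 - 1 + ... oscillate, but their averages settle at 1/2.
terms = [(-1) ** n for n in range(2000)]
partial = []
s = 0
for x in terms:
    s += x
    partial.append(s)
# sample the averages sigma_N at a few values of N
sigma = [sum(partial[: N + 1]) / (N + 1) for N in range(0, 2000, 199)]

# Abel means: sum_n (-1)^n r^n = 1/(1 + r), which tends to 1/2 as r -> 1-.
def abel_mean(r, n_terms=100000):
    return sum((-1) ** n * r ** n for n in range(n_terms))
```

The sampled averages approach 1/2 even though the partial sums keep oscillating between 1 and 0, and abel_mean(r) approaches 1/2 as r increases toward 1.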
Definition: Let a_n ∈ C for n ≥ 0, sN = ∑_{n=0}^N a_n, and s ∈ C. (i) We say ∑_{n=0}^∞ a_n is Cesaro summable to s if limN→∞ σN = s, where σN = (N + 1)^{−1} ∑_{n=0}^N s_n. (ii) We say ∑_{n=0}^∞ a_n is Abel summable to s if ∑_{n=0}^∞ a_n r^n converges for every r ∈ (0, 1) and limr→1− ∑_{n=0}^∞ a_n r^n = s.
Exercise-10: Let a_n ∈ C for n ≥ 0, sN = ∑_{n=0}^N a_n, and σN = (N + 1)^{−1} ∑_{n=0}^N s_n. Then,
(i) σN = (N + 1)^{−1} ∑_{n=0}^N (∑_{m=0}^n a_m) = (N + 1)^{−1} ∑_{n=0}^N (N + 1 − n) a_n = ∑_{n=0}^N (1 − n/(N + 1)) a_n.
(ii) ∑_{n=0}^∞ a_n r^n = (1 − r) ∑_{n=0}^∞ s_n r^n = (1 − r)^2 ∑_{n=0}^∞ (n + 1) σ_n r^n for 0 < r < 1.
Further, for p = 1, 2 and n ≥ 0, let a_{p,n} ∈ C, and define a_{3,n} = ∑_{i+j=n} a_{1,i} a_{2,j} for n ≥ 0 (this is called the Cauchy product, and can be thought of as a discrete version of convolution).
(iii) If the series ∑_{n=0}^∞ a_{1,n} and ∑_{n=0}^∞ a_{2,n} converge to a, b ∈ C respectively, then their Cauchy product ∑_{n=0}^∞ a_{3,n} defined above may not converge (see 3.49 of Rudin, Principles of Mathematical Analysis for this), but ∑_{n=0}^∞ a_{3,n} is Cesaro summable to ab.
(iv) Even if the series ∑_{n=0}^∞ a_{1,n} and ∑_{n=0}^∞ a_{2,n} are Cesaro summable to a, b ∈ C respectively, the series ∑_{n=0}^∞ a_{3,n} may not be Cesaro summable. Show that Abel summability does not imply Cesaro summability by considering ∑_{n=0}^∞ a_{1,n} = ∑_{n=0}^∞ a_{2,n} = 1 − 1 + 1 − 1 + 1 − 1 + · · · , and noting that ∑_{n=0}^∞ a_{3,n} = 1 − 2 + 3 − 4 + 5 − 6 + · · · is Abel summable to 1/4 but is not Cesaro summable.
[Hint: (ii) s_n − s_{n−1} = a_n and (n + 1)σ_n − nσ_{n−1} = s_n. (iii) Let s_{p,N} = ∑_{n=0}^N a_{p,n} for p = 1, 2, 3. Check that ∑_{n=0}^N s_{3,n} = ∑_{i+j≤N} (N + 1 − i − j) a_{1,i} a_{2,j} = ∑_{i+j=N} s_{1,i} s_{2,j}. Also the hypothesis says (s_{1,n}) → a and (s_{2,n}) → b. Hence |ab − (N + 1)^{−1} ∑_{n=0}^N s_{3,n}| = |(N + 1)^{−1} ∑_{i+j=N} (ab − s_{1,i} s_{2,j})| ≤ (N + 1)^{−1} ∑_{i+j=N, min{i,j}≤N0} |ab − s_{1,i} s_{2,j}| + (N + 1)^{−1} ∑_{i+j=N, min{i,j}>N0} |ab − s_{1,i} s_{2,j}|, where the last two sums can be made arbitrarily small if N0 is large and N > N0 is very large compared to N0. (iv) For 1 − 2 + 3 − 4 + · · · , the averages of the partial sums form the sequence (1, 0, 2/3, 0, 3/5, 0, 4/7, 0, . . .), which does not converge; but their averages converge to 1/4.]
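Part (iv) can also be checked numerically; in the Python sketch below (our illustration), the Cesaro means of 1 − 2 + 3 − 4 + · · · oscillate, while their second averages and the Abel means both approach 1/4:

```python
# a3_n = sum_{i+j=n} (-1)^i (-1)^j = (-1)^n (n+1): the Cauchy product of the
# Grandi series with itself is 1 - 2 + 3 - 4 + 5 - ...
a3 = [(-1) ** n * (n + 1) for n in range(4001)]
partial, s = [], 0
for x in a3:
    s += x
    partial.append(s)
sigma, t = [], 0
for N, sN in enumerate(partial):
    t += sN
    sigma.append(t / (N + 1))
# sigma = (1, 0, 2/3, 0, 3/5, 0, ...) oscillates: there is no Cesaro sum.
second = sum(sigma) / len(sigma)   # the averages of the averages approach 1/4
abel = sum((-1) ** n * (n + 1) * 0.999 ** n for n in range(200000))  # ~ 1/(1+r)^2
```

The Abel mean follows from the identity ∑ (−1)^n (n+1) r^n = 1/(1+r)^2, which tends to 1/4 as r → 1−.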
[110] (i) If a complex series ∑_{n=0}^∞ a_n converges to s ∈ C, then ∑_{n=0}^∞ a_n is Cesaro summable to s.
(ii) If ∑_{n=0}^∞ a_n is Cesaro summable to s ∈ C, then ∑_{n=0}^∞ a_n is Abel summable to s.
Proof. Let sN = ∑_{n=0}^N a_n and σN = (N + 1)^{−1} ∑_{n=0}^N s_n.
(i) Let ε > 0 be given. Since (s_n − s) → 0 by hypothesis, there is M > 0 with |s_n − s| ≤ M for every n ≥ 0. Choose k ∈ N such that |s_n − s| < ε for every n ≥ k. Then choose m > k such that (k + 1)M/(N + 1) < ε for every N ≥ m. Then for every N ≥ m, we have
|σN − s| = |(N + 1)^{−1} ∑_{n=0}^N (s_n − s)| ≤ |(N + 1)^{−1} ∑_{n=0}^k (s_n − s)| + |(N + 1)^{−1} ∑_{n=k+1}^N (s_n − s)| ≤ (k + 1)M/(N + 1) + ((N − k)/(N + 1))ε < ε + ε.
(ii) Suppose lim σN = s. By Exercise-10(ii), we have ∑_{n=0}^∞ a_n r^n = (1 − r)^2 ∑_{n=0}^∞ (n + 1) σ_n r^n for 0 < r < 1. Since (1 − r)^{−2} = ∑_{n=0}^∞ (n + 1) r^n, we may write s = (1 − r)^2 ∑_{n=0}^∞ (n + 1) s r^n. Then for any N ∈ N, we have
|∑_{n=0}^∞ a_n r^n − s| ≤ (∑_{n=0}^N + ∑_{n=N+1}^∞)(n + 1)(1 − r)^2 r^n |σ_n − s| = J1(N, r) + J2(N, r), say.
Given ε > 0, choose N ∈ N large enough so that |σ_n − s| < ε for n ≥ N. Then J2(N, r) ≤ ε since ∑_{n=N+1}^∞ (n + 1) r^n ≤ ∑_{n=0}^∞ (n + 1) r^n = (1 − r)^{−2}. If M := max{(n + 1)|σ_n − s| : n ≤ N}, then J1(N, r) ≤ M ∑_{n=0}^N (1 − r)^2 → 0 as r → 1−. □
Remark: Tauber showed that if ∑_{n=0}^∞ a_n is Abel summable to s ∈ C and limn→∞ n a_n = 0, then ∑_{n=0}^∞ a_n = s. This generated many other results of the same type. A result in which a weaker notion of summability, together with some additional condition, yields the convergence of the original series is now called a Tauberian theorem3. In [111] below, we present the Hardy-Littlewood Tauberian theorem (which improves Tauber's result), with a simplified proof due to Karamata and H. Wielandt.
Exercise-11: Let f = 1(1/2,1) : (0, 1) → R and ε > 0. Then there exist real polynomials p1 and p2
such that (i) p1 ≤ f ≤ p2,
(ii) p1(0) = 0 = p2(0) and p1(1) = 1 = p2(1) (this implies t(1− t) is a factor of p2 − p1), and
(iii) the polynomial q(t) = (p2(t) − p1(t))/(t(1 − t)) satisfies ∫_0^1 q(t) dt < ε.
[Hint: Let F : (0, 1) → R be F(t) = (f(t) − t)/(t(1 − t)), which is bounded, and continuous except for a jump discontinuity at 1/2. First approximate F by continuous functions, and then apply the Weierstrass approximation theorem to find polynomials h1, h2 such that h1 ≤ F ≤ h2 and ∫_0^1 (h2 − h1) < ε. Note that f(t) = t + t(1 − t)F(t). Let p_j(t) = t + t(1 − t)h_j(t) for j = 1, 2. Then p1(0) = 0 = p2(0), p1(1) = 1 = p2(1), p1 ≤ f ≤ p2, and ∫_0^1 (p2(t) − p1(t))/(t(1 − t)) dt = ∫_0^1 (h2 − h1) < ε.]
3The terminology Tauberian theorem has a more general meaning, which we may see later.
[111] (i) (Hardy's Tauberian theorem) If a complex series ∑_{n=0}^∞ a_n is Cesaro summable to s ∈ C and sup_{n∈N} |n a_n| < ∞, then ∑_{n=0}^∞ a_n = s.
(ii) (Hardy-Littlewood Tauberian theorem) If a complex series ∑_{n=0}^∞ a_n is Abel summable to s ∈ C and sup_{n∈N} |n a_n| < ∞, then ∑_{n=0}^∞ a_n = s.
Proof. As (i) follows from (ii) and [110], it suffices to prove (ii). After a translation of the function ∑_{n=0}^∞ a_n z^n (replace a_0 by a_0 − s), we may assume s = 0. Let C > 0 be such that sup_n |n a_n| ≤ C.
Step-1: Let F = {f : (0, 1) → R : limr→1− ∑_{n=0}^∞ a_n f(r^n) = 0}, which is a real vector space. The hypothesis says limr→1− ∑_{n=0}^∞ a_n r^n = 0, and the substitution r = t^k gives limt→1− ∑_{n=0}^∞ a_n t^{kn} = 0. Thus all the maps t ↦ t^k for k ∈ N belong to F, and hence F contains all polynomials p with p(0) = 0, i.e., all p without the constant term. Let f = 1_(1/2,1) : (0, 1) → R, interpreted so that f takes the value f(1−) = 1 at t = 1. Note that for 1/2 < r < 1, ∑_{n=0}^∞ a_n f(r^n) = ∑_{n=0}^{N(r)} a_n = s_{N(r)}, where N(r) := max{n ≥ 0 : r^n > 1/2} → ∞ as r → 1−; therefore what we need to show is precisely that f ∈ F, i.e., that limr→1− ∑_{n=0}^∞ a_n f(r^n) = 0. We will achieve this by approximating f with polynomials.
Step-2: Given ε > 0, choose polynomials p1 and p2 for f as specified by Exercise-11, such that the polynomial q(t) = (p2(t) − p1(t))/(t(1 − t)) = ∑_{k=0}^m b_k t^k (say) satisfies ∑_{k=0}^m b_k/(k + 1) = ∫_0^1 q(t) dt < ε/C. For our upcoming estimate, observe that (1 − r^n)/(1 − r) ≤ n, i.e., (1 − r^n)/n ≤ 1 − r, for 0 < r < 1, and hence
(p2(r^n) − p1(r^n))/n = r^n (1 − r^n) q(r^n)/n ≤ r^n (1 − r) q(r^n) = (1 − r) ∑_{k=0}^m b_k r^{(k+1)n}.
Step-3: Since |a_n| ≤ C/n and f − p1 ≤ p2 − p1, the last estimate from Step-2 gives
|∑_{n=0}^∞ a_n (f(r^n) − p1(r^n))| ≤ C ∑_{n=1}^∞ (p2(r^n) − p1(r^n))/n ≤ C ∑_{k=0}^m (1 − r) b_k ∑_{n=0}^∞ r^{(k+1)n} = C ∑_{k=0}^m b_k (1 − r)/(1 − r^{k+1}),
which tends to C ∑_{k=0}^m b_k/(k + 1) < ε as r → 1−. Also, limr→1− ∑_{n=0}^∞ a_n p1(r^n) = 0 since p1 ∈ F. Thus we conclude lim sup_{r→1−} |∑_{n=0}^∞ a_n f(r^n)| ≤ ε. By considering ∑_{n=0}^∞ a_n (p2(r^n) − f(r^n)), we can show similarly that lim inf_{r→1−} ∑_{n=0}^∞ a_n f(r^n) ≥ −ε. Since ε > 0 is arbitrary, limr→1− ∑_{n=0}^∞ a_n f(r^n) = 0. □
6. Weak type boundedness for maximal functions
We will touch upon maximal functions and their relation to pointwise convergence. This will
be used in the next section to prove Fejer’s result about the Cesaro summability of Fourier series.
Suppose we have a sequence of operators (Tn) defined on L1(X,µ) for some measure space (X,µ).
Their maximal function T ∗ is defined as the supremum of Tn’s in an appropriate sense. We are
interested in finding out conditions that will ensure that T ∗ has some sort of boundedness behavior.
Definition: Let C̄ = C ∪ {∞}. For a measurable space X, let M(X, C̄) = {g : X → C̄ : g is measurable}, and similarly define M(X, R), M(X, [0,∞]), etc.
Definition: Let (X,µ), (Y, ν) be measure spaces. A map T : L1(X,µ) → M(Y, C̄) (not necessarily linear) is weak (1, 1) if ∃ C > 0 such that for every α > 0 and f ∈ L1(X,µ) we have ν({y ∈ Y : |Tf(y)| > α}) ≤ C∥f∥1/α. Note that if T is weak (1, 1), then ν({y ∈ Y : |Tf(y)| > n}) → 0 as n → ∞, and hence Tf is finite ν-almost everywhere, i.e., ν({y ∈ Y : |Tf(y)| = ∞}) = 0. The inclusion operator I : L1(X,µ) → M(X, C̄) is weak (1, 1) with constant C = 1: if we fix f ∈ L1(X,µ) and put Aα = {x ∈ X : |f(x)| > α}, then µ(Aα) = ∫_{Aα} 1 dµ ≤ ∫_{Aα} |f/α| dµ ≤ ∥f∥1/α for α > 0.
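The computation above for the inclusion operator is just Chebyshev's inequality, and it can be sanity-checked on a grid; the following Python sketch (our illustration, with f(x) = 1/√x on (0, 1) as a sample unbounded L1 function) does this:

```python
import numpy as np

# Grid check of the Chebyshev bound mu({|f| > alpha}) <= ||f||_1 / alpha for the
# inclusion operator; all quantities below are Riemann-sum approximations.
n = 10**6
x = (np.arange(n) + 0.5) / n          # midpoints of a uniform grid on (0, 1)
f = 1 / np.sqrt(x)
dx = 1.0 / n
l1_norm = np.sum(np.abs(f)) * dx      # approximates ||f||_1 = 2

def tail_measure(alpha):
    # approximates mu({x : |f(x)| > alpha}), which is min(1, 1/alpha^2) here
    return np.count_nonzero(np.abs(f) > alpha) * dx
```

For every alpha the tail measure stays below l1_norm/alpha, even though f itself is unbounded.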
Remark: (i) More generally, the condition ν({y ∈ Y : |Tf(y)| > α})^{1/q} ≤ C∥f∥_p/α defines weak (p, q) maps, but we do not need this more general concept. (ii) If the supremum of a sequence of linear maps is weak (1, 1), then there is a useful conclusion, which is stated below.
[112] Let (X,µ) be a measure space, and let Tn : L1(X,µ) → M(X, C̄) be linear maps. If the maximal function T∗ : L1(X,µ) → M(X, [0,∞]) of {Tn : n ∈ N}, defined as T∗f(x) = sup{|Tnf(x)| : n ∈ N}, is weak (1, 1), then the set F := {f ∈ L1(X,µ) : (Tnf) → f pointwise µ-a.e.} is closed in L1(X,µ).
Proof. Let (fk) be a sequence in F converging to f ∈ L1(X,µ) in the ∥·∥1-norm. Note that {x ∈ X : lim sup_n |Tnf(x) − f(x)| > 0} = ∪_{m=1}^∞ Am, where Am := {x ∈ X : lim sup_n |Tnf(x) − f(x)| > 2/m}. Hence it suffices to show µ(Am) = 0 for each m ∈ N. We have
lim sup_{n→∞} |Tnf(x) − f(x)| ≤ lim sup_{n→∞} |Tn(f − fk)(x)| + lim sup_{n→∞} |Tnfk(x) − f(x)| ≤ T∗(f − fk)(x) + |(f − fk)(x)|
for a.e. x ∈ X, since Tn is linear and (Tnfk) → fk pointwise a.e. Therefore µ(Am) ≤
µ({x ∈ X : T∗(f − fk)(x) > 1/m}) + µ({x ∈ X : |(f − fk)(x)| > 1/m}) ≤ Cm∥f − fk∥1 + m∥f − fk∥1,
where C > 0 is given by the weak (1, 1) property of T∗, and in the last term we used the fact that the inclusion operator is weak (1, 1) with constant 1. As the above inequality holds for every fk and ∥f − fk∥1 → 0, we conclude that µ(Am) = 0. □
We will soon see that a particular maximal function that we are going to consider is lower
semicontinuous, which motivates Exercise-12, a small diversion from our main theme.
Exercise-12: Let X be a metric space (it can more generally be a topological space). A function f : X → R (or f : X → [−∞,∞]) is upper semicontinuous if {x ∈ X : f(x) < α} is open in X for every α ∈ R, and is lower semicontinuous if {x ∈ X : f(x) > α} is open in X for every α ∈ R.
For example, 1A is upper semicontinuous if A ⊂ X is closed, and 1U is lower semicontinuous if
U ⊂ X is open. Since union of open sets is open and intersection of closed sets is closed, we have
that the infimum of a family of upper semicontinuous functions is upper semicontinuous, and the
supremum of a family of lower semicontinuous functions is lower semicontinuous. By the same
reasoning, the pointwise limit of a decreasing sequence of upper semicontinuous functions is upper
semicontinuous, and the pointwise limit of an increasing sequence of lower semicontinuous functions
is lower semicontinuous. Now, consider a function f : X → R, where X is a metric space.
(i) f is upper semicontinuous ⇔ for each x ∈ X and ε > 0, there is a neighborhood U ⊂ X of x
such that f(y) < f(x) + ε for every y ∈ U ⇔ lim supk→∞ f(xk) ≤ f(x) whenever (xk) → x in X.
(ii) If f is upper semicontinuous with X compact, f is bounded above and attains its maximum.
(iii) If a sequence (fn) of upper semicontinuous functions from X to R converges uniformly to f ,
then f is upper semicontinuous.
(iv) If f is upper semicontinuous and sup f(X) < ∞, then for each x ∈ X we have that f(x) =
inf{g(x) : g ∈ C(X,R) and f ≤ g}.
(v) Formulate and prove the corresponding statements for lower semicontinuity. For example, a
lower semicontinuous function on a compact space is bounded below and attains its minimum.
[Hint : (ii) Let Un = {x ∈ X : f(x) < n}. Then extracting a finite subcover of the open cover
{Un : n ∈ N} of X, we see f is bounded above. Let M = sup f(X), and let (xk) be a sequence
in X with limk f(xk) = M . By compactness, we may assume (xk) → x ∈ X, and then f(x) ≥
lim sup f(xk) = M by upper semicontinuity. (iii) Let x ∈ X and ε > 0. Choose n large with
∥f −fn∥∞ < ε/3 and then choose a neighborhood U of x with fn(y) < fn(x)+ ε/3 for every y ∈ U .
Then f(y) < fn(y) + ε/3 < fn(x) + 2ε/3 < f(x) + ε for every y ∈ U . (iv) After a translation,
assume f ≤ −1. Fix b ∈ X, and let ε ∈ (0, 1). Since A := {x ∈ X : f(x) ≥ f(b) + ε} is closed,
there is continuous h : X → [0, 1] with h(b) = 1 and h(A) = {0}. Consider g ∈ C(X,R) defined as
g(x) = (f(b) + ε)h(x), which satisfies g(b) = f(b) + ε. If x ∈ A, then f(x) ≤ −1 < 0 = g(x); and if
x ∈ X \A, then f(x) < f(b) + ε ≤ g(x) since f(b) + ε < 0 and 0 ≤ h ≤ 1. Thus f ≤ g.]
We will introduce Hardy-Littlewood maximal function on Rn instead of on R as there is no extra
cost for this in the proofs, but we will use it only in one dimension.
Definition: A measurable function f : Rn → C is said to be locally integrable if f·1K ∈ L1(Rn), i.e., if ∫_K |f| dµ < ∞, for every compact set K ⊂ Rn. Let L1_loc(Rn) be the collection of all locally integrable functions on Rn. Clearly L∞(Rn) ⊂ L1_loc(Rn), and hence L1_loc(Rn) ≠ L1(Rn) (e.g., the constant function 1 is locally integrable but not integrable). We may also see Lp(Rn) ⊂ L1_loc(Rn) for 1 ≤ p < ∞ as follows. Assume 1 < p < ∞ (the case p = 1 is trivial) and 1/p + 1/q = 1. Then for any compact K ⊂ Rn, we have 1K ∈ Lq(Rn), and hence by Holder's inequality we obtain ∫_K |f| dµ = ∫_{Rn} |f|·1K dµ ≤ ∥f∥p ∥1K∥q < ∞.
Remark: If ν is a locally finite Borel measure on Rn absolutely continuous w.r.to the Lebesgue
measure µ, then by Radon-Nikodym theorem there is a measurable function f : Rn → [0,∞)
with ν(A) =∫A fdµ for every Borel set A ⊂ Rn. Evidently, f ∈ L1
loc(Rn); and conversely any
f ∈ L1_loc(Rn) with f ≥ 0 defines a locally finite Borel measure on Rn that is absolutely continuous w.r.to µ.
Definition: The Hardy-Littlewood maximal function M : L1_loc(Rn) → M(Rn, [0,∞]) is defined as Mf(a) = sup_{r>0} (1/µ(B(a, r))) ∫_{B(a,r)} |f| dµ for f ∈ L1_loc(Rn) and a ∈ Rn, where µ is the Lebesgue measure on Rn; the measurability of Mf is ensured by Exercise-13 below. Moreover, Mf is finite almost everywhere, i.e., µ({a ∈ Rn : Mf(a) = ∞}) = 0, as a consequence of [113] below. When n = 1, we have Mf(a) = sup_{r>0} (1/2r) ∫_{a−r}^{a+r} |f(t)| dt for f ∈ L1_loc(R) and a ∈ R. Also note that Mf(a) = sup_{0<r<1/2} (1/2r) ∫_{a−r}^{a+r} |f(t)| dt for f ∈ L1(T) and a ∈ T.
Remark: (i) If f ≡ c ≠ 0, then Mf ≡ |c| ∉ L1(Rn). (ii) If f : R → C is f = 1_(0,∞), then Mf(a) = 1 for a > 0 and Mf(a) = 1/2 for a ≤ 0; here, Mf is not continuous. (iii) M is sublinear: M(f + g) ≤ Mf + Mg (since |f + g| ≤ |f| + |g|) and M(cf) = |c|Mf. Check that M is not linear.
Exercise-13: Fix f ∈ L1_loc(Rn) and let Mf be as defined above. Verify that the set {x ∈ Rn : Mf(x) > α} is open for each α > 0. Thus the function Mf on Rn is lower semicontinuous and hence Borel measurable. Also, Mf(x) ≤ lim inf_{k→∞} Mf(xk) whenever (xk) → x in Rn.
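For the function f = 1_(0,∞) from the Remark above, Mf can be approximated by taking the supremum over a large finite set of radii; the following Python sketch (our illustration, a numerical approximation only) recovers Mf(a) = 1 for a > 0 and Mf(a) = 1/2 for a ≤ 0:

```python
import numpy as np

# Approximate Mf for f = 1_(0,inf) on R: the average of f over (a-r, a+r) is
# the length of (a-r, a+r) ∩ (0, inf) divided by 2r; we maximize over many r.
def Mf(a, radii):
    best = 0.0
    for r in radii:
        overlap = max(0.0, a + r) - max(0.0, a - r)  # length of the intersection
        best = max(best, overlap / (2 * r))
    return best

radii = np.geomspace(1e-4, 1e4, 2000)
# For a > 0, small intervals around a lie inside (0, inf), so the sup is 1;
# for a <= 0, the averages (a + r)/(2r) increase to 1/2 as r grows.
```

The discontinuity of Mf at 0 mentioned in the Remark is visible in the jump from 1/2 to 1.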
The following technical fact from Euclidean Measure Theory is needed in the next proof.
Exercise-14: Let B1, . . . , Bk ⊂ Rn be finitely many balls, and µ be the Lebesgue measure on Rn. Then there is a pairwise disjoint subcollection {Bj : j ∈ F} for some F ⊂ {1, . . . , k} such that µ(∪_{j=1}^k Bj) ≤ 3^n µ(∪_{j∈F} Bj) = 3^n ∑_{j∈F} µ(Bj). [Hint: Assume µ(B1) ≥ · · · ≥ µ(Bk), so that r(1) ≥ · · · ≥ r(k) for the radii. Let j1 = 1. Having chosen j1, . . . , ji, let j_{i+1} be the smallest j > j_i such that Bj is disjoint from B_{j1} ∪ · · · ∪ B_{j_i}. Put F = {j1, j2, . . .}. If m ∈ {1, . . . , k} \ F, then Bm intersects B_{j_i} for some j_i < m. Then r(m) ≤ r(j_i) and Bm ⊂ 3∗B_{j_i}, where 3∗B_{j_i} is the ball concentric with B_{j_i} of radius 3r(j_i). Hence µ(∪_{j=1}^k Bj) ≤ µ(∪_{j∈F} 3∗Bj) ≤ 3^n ∑_{j∈F} µ(Bj) = 3^n µ(∪_{j∈F} Bj).]
[113] The Hardy-Littlewood maximal function M : L1(Rn) → M(Rn, [0,∞]) is weak (1, 1) with
constant C = 3n. Similarly, M : L1(T) → M(T, [0,∞]) is weak (1, 1).
Proof. Fix f ∈ L1(Rn) and let Aα = {x ∈ Rn : Mf(x) > α} for α > 0. We need to show µ(Aα) ≤ 3^n∥f∥1/α. Since µ(Aα) = sup{µ(K) : K ⊂ Aα compact}, it suffices to show µ(K) ≤ 3^n∥f∥1/α for an arbitrary compact set K ⊂ Aα. By the definition of Mf, for each a ∈ K ⊂ Aα there is a ball B centered at a with (1/µ(B)) ∫_B |f| dµ > α, or equivalently µ(B) < (1/α) ∫_B |f| dµ. As K is compact, we may cover K with finitely many such balls B1, . . . , Bk. By Exercise-14, choose a pairwise disjoint subcollection {Bj : j ∈ F} for some F ⊂ {1, . . . , k} with µ(∪_{j=1}^k Bj) ≤ 3^n ∑_{j∈F} µ(Bj). Then,
µ(K) ≤ µ(∪_{j=1}^k Bj) ≤ 3^n ∑_{j∈F} µ(Bj) ≤ (3^n/α) ∑_{j∈F} ∫_{Bj} |f| dµ ≤ (3^n/α) ∫_{Rn} |f| dµ = 3^n∥f∥1/α,
where the last inequality is by the pairwise disjointness of the collection {Bj : j ∈ F}. □
Definition: Parametrize T = [−1/2, 1/2). For r ∈ (0, 1/2), define the Lebesgue kernel Lr := (1/2r) 1_[−r,r] ∈ L1(T), and note ∥Lr∥1 = 1. See that Lr ∗ f(a) = (1/2r) ∫_T f(t) 1_[−r,r](a − t) dt = (1/2r) ∫_{a−r}^{a+r} f(t) dt is a local average of f at a ∈ T, for f ∈ L1(T). The Lebesgue maximal function L∗ : L1(T) → M(T, [0,∞]) is defined as L∗f(a) = sup_{0<r<1/2} |Lr ∗ f(a)| = sup_{0<r<1/2} (1/2r)|∫_{a−r}^{a+r} f(t) dt|. Comparing with the Hardy-Littlewood maximal function in one dimension, we have L∗f ≤ L∗|f| = Mf.
Exercise-15: Let f ∈ L1(T) and 0 < r < 1/2. Then,
(i) |Lr ∗ f| ≤ Lr ∗ |f| ≤ L∗|f| = 1 · L∗|f| = ∥Lr∥1 L∗|f| = ∥Lr∥1 Mf.
(ii) If K = ∑_{j=1}^k c_j L_{r_j} is a convex combination, then |K ∗ f| ≤ K ∗ |f| ≤ ∥K∥1 L∗|f| = ∥K∥1 Mf.
(iii) If K ∈ L1(T) is a nonnegative even function decreasing on [0, 1/2), then K/∥K∥1 can be approximated by convex combinations of Lr's, and hence |K ∗ f| ≤ K ∗ |f| ≤ ∥K∥1 L∗|f| = ∥K∥1 Mf.
7. Fourier series: pointwise convergence of Cesaro and Abel sums
Philosophy: We know {DN} is not an approximate identity. But the averages of DN ’s will form
an approximate identity, and this will lead to the Cesaro summability of the Fourier series.
Definition: Recall the Dirichlet kernel DN = ∑_{n=−N}^N e_n, and define the Nth Fejer kernel
FN := (N + 1)^{−1} ∑_{n=0}^N Dn = (N + 1)^{−1} ∑_{n=0}^N ∑_{k=−n}^n e_k = ∑_{n=−N}^N (1 − |n|/(N + 1)) e_n.
For f ∈ L1(T), recall the Fourier partial sum sN(f) = ∑_{n=−N}^N f̂(n) e_n = DN ∗ f, and define the Nth Fejer mean
σN(f) := (N + 1)^{−1} ∑_{n=0}^N s_n(f) = FN ∗ f = ∑_{n=−N}^N (1 − |n|/(N + 1)) e_n ∗ f = ∑_{n=−N}^N (1 − |n|/(N + 1)) f̂(n) e_n,
where we used the fact e_n ∗ f = f ∗ e_n = f̂(n) e_n from Exercise-5(v).
Remark: σN (f) is real valued when f is; and σN (f) ≥ 0 when f ≥ 0 since FN ≥ 0.
[114] (i) FN(t) = sin²((N + 1)πt)/((N + 1) sin² πt), and in particular FN ≥ 0, for every N ≥ 0.
(ii) If we parametrize T = [−1/2, 1/2), then FN is an even function for every N ≥ 0.
(iii) ∥FN∥1 = ∫_T FN(t) dt = 1 for every N ≥ 0.
(iv) For 0 < δ < t < 1/2, we have FN(t) ≤ 1/((N + 1) sin² πδ) → 0 as N → ∞, uniformly in t ∈ (δ, 1/2).
(v) {FN : N ≥ 0} is a nonnegative approximate identity for L1(T) satisfying also the L∞-concentration condition (A4) from page 5.
(vi) σN(f, a) = ∫_0^{1/2} FN(t)[f(a + t) + f(a − t)] dt for every a ∈ T.
Proof. (i) Recall DN(t) = sin((2N + 1)πt)/sin πt by [105], and 2 sin A sin B = cos(A − B) − cos(A + B). Then we see (N + 1)FN(t) · 2 sin² πt = ∑_{n=0}^N 2Dn(t) sin² πt = ∑_{n=0}^N 2 sin((2n + 1)πt) sin πt = ∑_{n=0}^N [cos 2nπt − cos((2n + 2)πt)] = cos 0 − cos((2N + 2)πt) = 2 sin²((N + 1)πt).
(ii) This follows from (i).
(iii) FN = ∑_{n=−N}^N (1 − |n|/(N + 1)) e_n; and ∫_T e_n(t) dt = 0 for n ≠ 0, and = 1 for n = 0.
Statement (iv) also follows from (i), and (v) is a summary of what is proved above.
(vi) Recall sN(f, a) = ∫_0^{1/2} DN(t)[f(a + t) + f(a − t)] dt by Exercise-6. Now use the facts that σN(f) = (N + 1)^{−1} ∑_{n=0}^N s_n(f) and FN = (N + 1)^{−1} ∑_{n=0}^N Dn. □
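Identities (i) and (iii) of [114] can be verified numerically on a grid; in the Python sketch below (our illustration, with N = 7 and a 4096-point grid as arbitrary choices), the coefficient form of FN, its closed form, and the normalization ∥FN∥1 = 1 all agree:

```python
import numpy as np

N = 7
# grid on T = [-1/2, 1/2) shifted by half a step, so t = 0 is avoided
t = (np.arange(4096) + 0.5) / 4096 - 0.5
# coefficient form: sum of (1 - |n|/(N+1)) e_n(t)
coeff = sum((1 - abs(n) / (N + 1)) * np.exp(2j * np.pi * n * t)
            for n in range(-N, N + 1))
# closed form from [114](i)
closed = np.sin((N + 1) * np.pi * t) ** 2 / ((N + 1) * np.sin(np.pi * t) ** 2)
```

Since FN is a trigonometric polynomial of degree N, the equispaced Riemann sum np.mean(closed) reproduces ∫_T FN = 1 essentially exactly.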
Since {FN : N ≥ 0} is an approximate identity satisfying (A4), we may deduce the following:
[115] (Fejer’s theorem) (i) If f ∈ C(T), then ∥σN (f)− f∥∞ → 0 as N → ∞.
(ii) Let 1 ≤ p <∞. If f ∈ Lp(T), then ∥σN (f)− f∥p → 0 as N → ∞.
(iii) Let f ∈ L1(T) (assumed to be extended to R with period 1). If the limits f(a+) and f(a−) exist at a ∈ T, then limN→∞ σN(f, a) = [f(a+) + f(a−)]/2. In particular, if f is continuous at a ∈ T, then limN→∞ σN(f, a) = f(a).
Proof. (i) and (ii): We know σN(f) = FN ∗ f. Now use [114](v) and [102].
(iii) Let g : T → C be g(t) = [f(a + t) + f(a − t)]/2 for the given a ∈ T. Then g ∈ L1(T) with ∥g∥1 ≤ ∥f∥1. By [114](vi), σN(f, a) = 2∫_0^{1/2} FN(t)g(t) dt = ∫_T FN(t)g(t) dt since FN, g are even, and hence limN→∞ σN(f, a) = limt→0 g(t) = [f(a+) + f(a−)]/2 by [114](v) and [102](iii). □
Seminar topic: Let f0, f1, f2 ∈ C(T) be f0 ≡ 1, f1(t) = sin 2πt and f2(t) = cos 2πt. Korovkin’s
(second) theorem states that if Tn : C(T) → C(T) are positive linear maps for n ∈ N with
limn→∞ ∥Tnfj − fj∥∞ = 0 for j = 0, 1, 2, then limn→∞ ∥Tnf − f∥∞ = 0 for every f ∈ C(T).
Present a proof of Korovkin’s theorem (see for example, M. Uchiyama, Proof of Korovkin’s the-
orems via inequalities, Amer. Math. Monthly, 110, (2003)). Apply Korovkin’s theorem to the
sequence (σN ) of positive linear maps of C(T) to give another proof of [115](i).
Definition: A trigonometric polynomial is a finite linear combination of en's with complex scalars. For example, DN = ∑_{n=−N}^N e_n is a trigonometric polynomial. If g = ∑ c_n e_n is a trigonometric polynomial (where the sum is finite), then for any f ∈ L1(T), g ∗ f = f ∗ g = ∑ c_n (f ∗ e_n) = ∑ c_n f̂(n) e_n, which is again a trigonometric polynomial. In particular, σN(f) = FN ∗ f is a trigonometric polynomial for every f ∈ L1(T); in fact, σN(f) = ∑_{n=−N}^N (1 − |n|/(N + 1)) f̂(n) e_n.
Exercise-16: (i) {g ∈ C(T) : g is a trigonometric polynomial} is dense in both (C(T), ∥ · ∥∞) and Lp(T) for 1 ≤ p < ∞.
(ii) (Uniqueness of Fourier coefficients) If f, g ∈ L1(T) and f̂(n) = ĝ(n) for every n ∈ Z, then f = g.
(iii) (Fourier inversion) If f ∈ L1(T) is such that (f̂(n)) ∈ l1(Z), i.e., if ∑_{n=−∞}^∞ |f̂(n)| < ∞, then g := ∑_{n=−∞}^∞ f̂(n) e_n ∈ C(T) and f = g almost everywhere.
(iv) F : L1(T) → c0(Z) given by f ↦ (f̂(n))_{n∈Z} is linear and injective, but is not surjective.
[Hint: (i) Use [115](i) and [115](ii) respectively, after noting that σN(f) = ∑_{n=−N}^N (1 − |n|/(N + 1)) f̂(n) e_n is a trigonometric polynomial. (ii) By considering f − g, we may assume g = 0. Use [115](ii) and the above expression for σN(f). (iii) Use (ii) after noting ĝ(n) = f̂(n) for every n. (iv) Linearity is clear, and F is injective by (ii). If F were also surjective, then F^{−1} would be a bounded linear operator by the Inverse mapping theorem, since the spaces L1(T) and c0(Z) are Banach. But D̂N(n) = 1 if |n| ≤ N, and = 0 if |n| > N, so that ∥(D̂N(n))∥∞ = 1; and we know ∥DN∥1 → ∞ by [105].]
Exercise-17: (Wiener's density theorem for T) Let f ∈ L1(T). Then {f ∗ g : g ∈ L1(T)} is dense in L1(T) ⇔ f̂(n) ≠ 0 for every n ∈ Z. [Hint: If f̂(n0) = 0, then (f ∗ g)^(n0) = f̂(n0)ĝ(n0) = 0 for every g ∈ L1(T), and consequently {f ∗ g : g ∈ L1(T)} cannot be dense in L1(T). Conversely, suppose f̂(n) ≠ 0 for every n ∈ Z. Given h ∈ L1(T), define gN = ∑_{n=−N}^N (1 − |n|/(N + 1))(ĥ(n)/f̂(n)) e_n. Since f ∗ e_n = f̂(n) e_n, we see f ∗ gN = ∑_{n=−N}^N (1 − |n|/(N + 1)) ĥ(n) e_n = σN(h) → h in L1(T) by [115](ii).]
Now we prepare ourselves to prove Lebesgue’s extension of Fejer’s theorem; and also the Dirichlet-
Jordan theorem about functions of bounded variation. Even though {FN} is an approximate
identity, FN is not decreasing4 on [0, 1/2). Therefore, in order to make use of Exercise-15(iii), we
will define below KN ≥ FN such that KN ’s are even and decreasing on [0, 1/2).
Exercise-18: (i) (Trigonometric facts) sin(N +1)πt ≤ (N +1) sinπt and sinπt ≥ 2t for 0 ≤ t ≤ 1/2.
4See the end of Section 8 for a picture of the graph of FN .
(ii) For 0 ≤ |t| ≤ 1/(2(N + 1)), we have FN(t) ≤ (N + 1)² sin² πt/((N + 1) sin² πt) = N + 1 =: KN(t). And for 1/(2(N + 1)) ≤ |t| < 1/2, we have FN(t) ≤ 1/((N + 1)(2t)²) = 1/(4(N + 1)t²) =: KN(t). The function KN : T → C defined in this manner satisfies the following: KN ≥ FN ≥ 0, KN ∈ C(T), KN is an even function that decreases on [0, 1/2), and ∥KN∥1 = 2∫_0^{1/(2N+2)} (N + 1) dt + (2/(4(N + 1))) ∫_{1/(2N+2)}^{1/2} t^{−2} dt = (1 − 0) + (1 − 1/(N + 1)) ≤ 2.
(iii) |FN ∗ f| ≤ FN ∗ |f| ≤ KN ∗ |f| ≤ 2L∗|f| = 2Mf by parts (i) and (ii) and by Exercise-15(iii), where M is the Hardy-Littlewood maximal function in one dimension.
(iv) The Fejer maximal function F∗ is defined as F∗f(t) = supN |FN ∗ f(t)| for f ∈ L1(T) and t ∈ T. Then F∗f ≤ 2L∗|f| = 2Mf by (iii).
[116] (i) The Fejer maximal function F∗ and the Lebesgue maximal function L∗ are weak (1, 1).
(ii) (Lebesgue's theorem on the pointwise convergence of Fejer means) Let f ∈ L1(T). Then (σN(f)) → f pointwise almost everywhere.
(iii) (Lebesgue differentiation theorem on T) Let f ∈ L1(T). Then limr→0 (1/2r) ∫_{a−r}^{a+r} f(t) dt = f(a) for almost every a ∈ T; this includes every a ∈ T at which f is continuous.
Proof. (i) We know that L∗f ≤ L∗|f| = Mf; and F∗f ≤ 2L∗|f| = 2Mf by Exercise-18(iv). Also M is weak (1, 1) by [113].
(ii) (σN(f)) → f pointwise (in fact, uniformly) for every f ∈ C(T) by [115](i). Also C(T) is dense in L1(T) by Exercise-1. Apply [112] to F∗ after putting TNf = FN ∗ f = σN(f).
(iii) Let f ∈ L1(T) be continuous at a ∈ T, and let ε > 0. Choose δ > 0 such that |f(a) − f(t)| < ε whenever |a − t| < δ. Then for 0 < r < δ we have
|f(a) − Lr ∗ f(a)| = |(1/2r) ∫_{a−r}^{a+r} (f(a) − f(t)) dt| ≤ (1/2r) ∫_{a−r}^{a+r} |f(a) − f(t)| dt ≤ (1/2r) ε · 2r = ε.
Hence limr→0 Lr ∗ f(a) = limr→0 (1/2r) ∫_{a−r}^{a+r} f(t) dt = f(a). In particular, this holds at every a ∈ T when f ∈ C(T). Now use the denseness of C(T) in L1(T) and apply [112] to L∗ after putting Tnf = L_{rn} ∗ f for any decreasing sequence (rn) in (0, 1/2) converging to 0. □
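Part (iii) is easy to visualize for a concrete smooth function; the following Python sketch (our illustration, with f(t) = t² and a = 0.3 as arbitrary choices) evaluates the local averages exactly and watches them shrink to f(a):

```python
# For f(t) = t^2, the local average (1/2r) * integral over (a-r, a+r) can be
# computed exactly from the antiderivative: it equals a^2 + r^2/3.
def local_average(a, r):
    return ((a + r) ** 3 - (a - r) ** 3) / (6 * r)

vals = [local_average(0.3, r) for r in (0.1, 0.01, 0.001)]
# vals decreases toward f(0.3) = 0.09 as r -> 0, as the Lebesgue
# differentiation theorem predicts at a point of continuity.
```

The excess over f(a) is exactly r²/3 here, so the convergence rate is visible as well.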
Remark: A point a ∈ T with limr→0 (1/2r) ∫_{a−r}^{a+r} f(t) dt = f(a) is called a Lebesgue point for f ∈ L1(T).
Definition: For 0 < r < 1, define the Poisson kernel5 Pr = ∑_{n∈Z} r^{|n|} e_n, which belongs to C(T) by the uniform convergence of the series. Keeping in mind the expressions in Exercise-7(iii), for f ∈ L1(T) and 0 < r < 1, define the rth Abel mean of f as
Ar(f) = (1 − r) ∑_{N=0}^∞ r^N sN(f) = (1 − r) ∑_{N=0}^∞ r^N ∑_{n=−N}^N f̂(n) e_n = ∑_{n∈Z} f̂(n) r^{|n|} e_n = Pr ∗ f,
5See the end of Section 8 for a picture of the graph of Pr
where we used the fact f̂(n) e_n = f ∗ e_n = e_n ∗ f from Exercise-5.
[117] (i) (Another expression for the Poisson kernel) Pr(t) = (1 − r²)/(1 + r² − 2r cos 2πt).
(ii) {Pr : 0 < r < 1}, as r → 1−, is an approximate identity on L1(T) satisfying Pr ≥ 0 and also the following L∞-concentration condition (A4): limr→1 sup{|Pr(t)| : |t| > δ} = 0 for every δ ∈ (0, 1/2).
Proof. (i) We have
Pr(t) + 1 = ∑_{n=0}^∞ r^n e_{−n}(t) + ∑_{n=0}^∞ r^n e_n(t) = 1/(1 − re_{−1}(t)) + 1/(1 − re_1(t)) = (2 − 2r cos 2πt)/(1 + r² − 2r cos 2πt),
since e_{−1}(t) + e_1(t) = 2 cos 2πt. Hence Pr(t) = (2 − 2r cos 2πt)/(1 + r² − 2r cos 2πt) − 1 = (1 − r²)/(1 + r² − 2r cos 2πt).
(ii) ∫_T Pr = ∫_T ∑_{n∈Z} r^{|n|} e_n = ∑_{n∈Z} r^{|n|} ∫_T e_n = 1 since ∫_T e_n = 0 for n ≠ 0; here the interchange of summation and integral is justified by the uniform convergence of the series. Thus {Pr} satisfies condition (A1) of an approximate identity. By (i), Pr(t) = (1 − r²)/((1 − r)² + 2r(1 − cos 2πt)) ≥ 0, and hence the L1-boundedness condition (A2) follows from (A1). For |t| > δ with δ ∈ (0, 1/2), we have 1 + r² − 2r cos 2πt = (1 − r)² + 2r(1 − cos 2πt) ≥ 2r(1 − cos 2πδ). Hence limr→1 sup{|Pr(t)| : |t| > δ} ≤ lim supr→1 (1 − r²)/(2r(1 − cos 2πδ)) = 0, establishing (A4); and this implies property (A3) also. □
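As with the Fejer kernel, the identity [117](i) and the normalization ∫_T Pr = 1 can be checked on a grid; the Python sketch below (our illustration, with r = 0.9 and a truncation at |n| = 300 as arbitrary choices) compares a truncation of the series ∑ r^{|n|} e_n with the closed form:

```python
import numpy as np

r = 0.9
t = np.arange(4096) / 4096 - 0.5     # grid on T = [-1/2, 1/2)
# truncated series sum_{|n| <= 300} r^|n| e_n(t); the tail is ~ 2 r^301/(1-r)
series = sum(r ** abs(n) * np.exp(2j * np.pi * n * t) for n in range(-300, 301))
# closed form from [117](i)
closed = (1 - r ** 2) / (1 + r ** 2 - 2 * r * np.cos(2 * np.pi * t))
```

The equispaced Riemann sum np.mean(closed) reproduces ∫_T Pr = 1 up to a tiny aliasing error.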
Remark: It can also be shown that for each r, the Poisson kernel Pr decreases on [0, 1/2).
[118] (i) If f ∈ C(T), then limr→1− ∥Ar(f) − f∥∞ = 0.
(ii) Let 1 ≤ p < ∞. If f ∈ Lp(T), then limr→1− ∥Ar(f) − f∥p = 0.
(iii) If f ∈ L1(T) and the limits f(a+) and f(a−) exist at a point a ∈ T, then limr→1− Ar(f, a) = [f(a+) + f(a−)]/2. In particular, if f ∈ L1(T) is continuous at a ∈ T, then limr→1− Ar(f, a) = f(a).
(iv) If f ∈ L1(T), then Ar(f, a) → f(a) as r → 1− for almost every a ∈ T.
Proof. Statements (i) and (ii) follow by [102] from the fact [117](ii) that {Pr : 0 < r < 1}, as r → 1−, is an approximate identity. Statements (iii) and (iv) follow from [115] since Cesaro summability implies Abel summability by [110] (or imitate the proofs of [115](iii) and [116](ii)). □
Remark: The Poisson kernel appears naturally in the theory of partial differential equations, for example in solving the Laplace equation on the unit disc.
8. Pointwise convergence of Fourier series for functions of bounded variation
Here we will present Jordan’s theorem about pointwise convergence of Fourier series for a function
f ∈ L1(T) of bounded variation.
Definition: For a function f : T → C and [a, b] ⊂ T, let V_a^b(f) = sup{∑_{j=1}^k |f(a_j) − f(a_{j−1})| : k ∈ N and a = a_0 ≤ a_1 ≤ · · · ≤ a_{k−1} ≤ a_k = b} be the total variation of f in [a, b]. We say f is of bounded variation if V_0^1(f) < ∞.
Remark: Let f ∈ L1(T) be of bounded variation. It is known that we may write f = f1 − f2 + i(f3 − f4), where the fj's are real valued and monotone increasing. A monotone function is differentiable almost everywhere and has only jump discontinuities. Thus f is differentiable almost everywhere, and the limits f(a+) and f(a−) exist for every a ∈ T (see my notes Measure Theory for these facts). We aim to show limN→∞ sN(f, a) = [f(a+) + f(a−)]/2 for every a ∈ T.
[119] Let f ∈ L1(T) be a function of bounded variation. Then,
(i) limN→∞ σN(f, a) = [f(a+) + f(a−)]/2 for every a ∈ T.
(ii) |n f̂(n)| ≤ V_0^1(f)/2 for every n ∈ Z.
(iii) (Jordan's theorem) limN→∞ sN(f, a) = [f(a+) + f(a−)]/2 for every a ∈ T.
Proof. (i) By the Remark above, the limits f(a+) and f(a−) exist for every a ∈ T. So the result follows by [115](iii).
(ii) We may assume n ≠ 0, and put r = 1/(2|n|). First note that
f̂(n) = ∑_{k=0}^{2|n|−1} ∫_{kr}^{(k+1)r} f(t) e_{−n}(t) dt = ∑_{k=0}^{2|n|−1} ∫_0^r f(s + kr) e_{−n}(s + kr) ds.
Since e_{−n}(s + kr) = e_{−n}(s) when k is even, and e_{−n}(s + kr) = −e_{−n}(s) when k is odd, we get
f̂(n) = ∫_0^r (∑_{j=0}^{|n|−1} [f(s + 2jr) − f(s + (2j + 1)r)]) e_{−n}(s) ds.
Therefore, |f̂(n)| ≤ ∫_0^r ∑_{j=0}^{|n|−1} |f(s + 2jr) − f(s + (2j + 1)r)| ds ≤ ∫_0^r V_0^1(f) ds = V_0^1(f) r = V_0^1(f)/(2|n|).
(iii) This follows from parts (i) and (ii) above, and Hardy's Tauberian theorem [111](i). □
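The bound in (ii) can be checked for a concrete BV function; the Python sketch below (our illustration) uses the square wave f = 1_(0,1/2), which has total variation V_0^1(f) = 2 and f̂(n) = (1 − (−1)^n)/(2πin) for n ≠ 0, so that every |n f̂(n)| is at most 1/π ≤ V_0^1(f)/2:

```python
import math

# Fourier coefficients of the square wave f = 1_(0,1/2) in closed form:
# fhat(n) = integral_0^{1/2} e^{-2 pi i n t} dt = (1 - (-1)^n)/(2 pi i n), n != 0.
def fhat(n):
    if n == 0:
        return 0.5
    return (1 - (-1) ** n) / (2j * math.pi * n)

V = 2.0  # total variation on T: two jumps of size 1
bounds = [abs(n * fhat(n)) for n in range(1, 100)]
# every entry is 0 (n even) or 1/pi (n odd), comfortably <= V/2 = 1
```

The same kind of check can be run with numerically integrated coefficients for any other function of bounded variation.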
Remark: Let f ∈ L1(T) be of bounded variation. Then as remarked above, f is differentiable a.e.,
and hence continuous a.e.; now, Jordan’s theorem implies limN→∞ sN (f, a) = f(a) for a.e. a ∈ T.
Seminar topic/reading assignment: (i) If (an)n∈Z is a sequence of nonnegative numbers such that a−n = an, an+1 − an ≥ an − an−1, and limn→∞ an = 0, then there is a nonnegative function f ∈ L1(T) with f̂(n) = an for every n ∈ Z. (ii) ∑_{n=2}^∞ (sin 2πnt)/(log n) = ∑_{|n|≥2} sgn(n) e_n(t)/(2i log |n|) is a convergent
trigonometric series that is not the Fourier series of any f ∈ L1(T); see sections 4.1 and 4.2 of
Y. Katznelson, An Introduction to Harmonic Analysis. (iii) Gibbs-Wilbraham phenomenon; see
section 1.2.8 of M.A. Pinsky, Introduction to Fourier Analysis and Wavelets.
Remark: We saw at the beginning of these notes that the Fourier series of f ∈ L2(T) converges to f in L2(T). About general Lp-convergence, the following is known: if 1 < p < ∞, then limN→∞ ∥f − sN(f)∥p = 0 for every f ∈ Lp(T); but there are f ∈ L1(T) for which the sequence (sN(f)) does not converge to f in L1(T).
Figure 1. Rough shape of the graphs of DN , FN , and Pr.
In sections 9-14, we will introduce the theory of distributions, and this will have an interplay
with the theory of Fourier transform that we will discuss afterwards. The theory of distributions
is usually developed on an open subset of Rn. But we will stick to the one dimensional space R in
order to convey the ideas in the simplest way without notational distractions. After grasping the
one dimensional case, for applications the student should read from relevant books6 the multidi-
mensional theory, which is built upon more or less the same ideas. One serious difference that we
observe when we move from T to R is that we have Lp(T) ⊂ L1(T) for 1 ≤ p <∞, but there are no
inclusion relations among Lp(R) for 1 ≤ p ≤ ∞ since the Lebesgue measure of R is infinite; and we
have to pay special attention to the decay of f(x) as |x| → ∞. Special
subspaces defined in terms of various decay properties will play an important role, and we start by
sketching the theory of such subspaces.
9. Convolution is a smoothing operation
In this section, we will establish the following philosophy in various forms: f ∗ g is at least as
smooth as f and g, and often smoother. As a warm-up, we first look at different types of continuous
functions on R.
Definition: Recall that the support of a function f : R → C is supp(f) := the closure of {x ∈ R : f(x) ≠ 0}. Also
recall C(R) = {f : R → C : f is continuous}, and Cc(R) = {f ∈ C(R) : f has compact support}.
Let C0(R) = {f ∈ C(R) : lim|x|→∞ f(x) = 0} (the space of continuous functions vanishing at ∞),
and Cb(R) = {f ∈ C(R) : f is bounded}. Note that all these spaces are complex vector spaces.
Exercise-19: (i) Cc(R) ⊂ C0(R) ⊂ Cb(R) = C(R) ∩ L∞(R).
6Eg: F.G. Friedlander, Introduction to the Theory of Distributions, or L. Grafakos, Classical Fourier Analysis.
(ii) Every f ∈ C0(R) is uniformly continuous, but f ∈ Cb(R) may not be uniformly continuous.
(iii) Cc(R) ⊂ Lp(R) for 1 ≤ p ≤ ∞.
(iv) For 1 ≤ p <∞, C0(R) is not a subset of Lp(R), and Cb(R) ∩ Lp(R) is not a subset of C0(R).
(v) C0(R) is a closed vector subspace of (L∞(R), ∥ · ∥∞), and hence is a Banach space.
(vi) Cc(R) is dense in both (C0(R), ∥ · ∥∞) and (Lp(R), ∥ · ∥p) for 1 ≤ p <∞.
[Hint : (ii) If f ∈ Cb(R) is such that f(n) = 0 and f(n + 1/n) = 1 for n ≥ 2, then f is not uniformly
continuous. (iv) If f ∈ C0(R) is such that f(x) = 1/n for 10^n ≤ x ≤ 10^n + n ∀n ∈ N, then f ∉ L1(R).
(vi) Given f ∈ C0(R) and ε > 0, choose N ∈ N such that |f(x)| < ε for |x| ≥ N . Choose
continuous g : R → [0, 1] with g ≡ 1 on [−N,N ] and g(x) = 0 for |x| ≥ N + 1. Then fg ∈ Cc(R)
and ∥f − fg∥∞ ≤ ε. The denseness of Cc(R) in Lp(R) is already noted in Exercise-1.]
If f, g ∈ L1(R), then we know by Exercise-3(i) that f ∗ g ∈ L1(R) with ∥f ∗ g∥1 ≤ ∥f∥1∥g∥1. To
supplement this, first we observe:
Exercise-20: (i) Fix f ∈ Lp(R), 1 ≤ p <∞. Then a 7→ fa from R to Lp(R) is uniformly continuous.
(ii) If f ∈ L1(R) and g ∈ L∞(R), then f ∗ g ∈ L∞(R) with ∥f ∗ g∥∞ ≤ ∥f∥1∥g∥∞, and moreover
f ∗ g is uniformly continuous.
[Hint : (i) As in Exercise-2(ii), we may assume f ∈ Cc(R). Let N ∈ N be such that supp(f) ⊂ [−(N − 1), N − 1]. Given ε > 0, choose δ ∈ (0, 1) such that |x − y| < δ implies |f(x) − f(y)|^p < ε/(2N). Then for a, b ∈ R with |a − b| < δ, we have ∥f_a − f_b∥_p^p = ∫_R |f(x − a) − f(x − b)|^p dx = ∫_R |f(y) − f(y + a − b)|^p dy = ∫_{−N}^{N} |f(y) − f(y + a − b)|^p dy < (ε/(2N)) · 2N = ε. (ii) |f ∗ g(x)| = |∫_R f(y)g(x − y)dy| ≤ ∫_R |f(y)| ∥g∥∞ dy = ∥f∥1 ∥g∥∞. Define F ∈ L1(R) by F(y) = f(−y). Note that |g ∗ f(a) − g ∗ f(b)| = |∫ g(y)(F_a(y) − F_b(y))dy| ≤ ∥g∥∞ ∥F_a − F_b∥1, and use (i).]
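The uniform continuity in Exercise-20(i) can be seen concretely. Here is a small numerical sketch (my own illustration, not from the notes; it assumes NumPy, and the choices f = 1_[0,1], p = 2, and the grid are arbitrary) of how ∥f_a − f_b∥_p shrinks with |a − b| even for a discontinuous f.

```python
import numpy as np

# Translation continuity in L^p: for f = 1_[0,1] (discontinuous!) the map
# a -> f_a is still uniformly continuous in the L^2 norm, since
# ||f_a - f_b||_2 = sqrt(2|a - b|) for small |a - b|.
h = 1e-4
x = np.arange(-2.0, 4.0, h)

def translate(a):
    return ((x - a >= 0) & (x - a <= 1)).astype(float)   # f_a(x) = f(x - a)

def dist(a, b, p=2):
    return (np.sum(np.abs(translate(a) - translate(b)) ** p) * h) ** (1.0 / p)

gaps = [dist(0.3, 0.3 + t) for t in [0.5, 0.1, 0.01, 0.001]]
assert all(g2 < g1 for g1, g2 in zip(gaps, gaps[1:]))     # shrinks with |a-b|
assert abs(gaps[0] - 1.0) < 1e-2   # sqrt(2 * 0.5) = 1
print("translation distances:", [round(g, 4) for g in gaps])
```

Note that ∥f_a − f_b∥∞ stays equal to 1 for a ≠ b here, which is why the analogous statement fails for p = ∞.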
[120] (i) If f, g ∈ Cc(R), then f ∗ g ∈ Cc(R) and supp(f ∗ g) ⊂ supp(f) + supp(g).
(ii) If f ∈ L1(R) and g ∈ C0(R), then f ∗ g ∈ C0(R).
(iii) Let 1 < p, q < ∞ with 1/p + 1/q = 1. If f ∈ Lp(R) and g ∈ Lq(R), then f ∗ g ∈ C0(R) and ∥f ∗ g∥∞ ≤ ∥f∥p ∥g∥q.
(iv) Let 1 ≤ p <∞. If f ∈ Lp(R) and g ∈ Cc(R), then f ∗ g ∈ C0(R).
Proof. (i) Let h = f ∗ g, K = supp(f) and L = supp(g). Then supp(g_y) = y + L. Since
|h(a) − h(b)| ≤ ∫_K |f(y)| |g(a − y) − g(b − y)| dy ≤ ∥f∥∞ ∫_K |g(a − y) − g(b − y)| dy,
we may deduce that h is continuous by the uniform continuity of g. If x ∉ K + L, then g_y(x) = 0 for every y ∈ K and hence h(x) = ∫_K f(y)g_y(x) dy = 0. This shows supp(h) ⊂ K + L.
(ii) f ∗ g is defined by Exercise-20(ii) since g ∈ C0(R) ⊂ L∞(R). By Exercise-19(vi), choose
sequences (fn) and (gn) in Cc(R) with ∥f − fn∥1 → 0 and ∥g − gn∥∞ → 0. Let M = sup_n ∥gn∥∞ < ∞. We know by (i) that fn ∗ gn ∈ Cc(R) ⊂ C0(R). With the help of Exercise-20(ii), we see
|f∗g(x)−fn∗gn(x)| ≤ |f∗g(x)−f ∗gn(x)|+|f ∗gn(x)−fn∗gn(x)| ≤ ∥f∥1∥g−gn∥∞+M∥f−fn∥1 → 0
as n→ ∞. Since C0(R) is closed w.r.to ∥ · ∥∞ by Exercise-19(v), we conclude that f ∗ g ∈ C0(R).
(iii) Let h(x) = g(−x) so that f ∗ g(x) = ∫_R f(y)h_x(y) dy. Applying Hölder’s inequality and noting ∥h_x∥q = ∥g∥q, we deduce that f ∗ g ∈ L∞(R) with ∥f ∗ g∥∞ ≤ ∥f∥p ∥g∥q. The proof that
f ∗ g ∈ C0(R) is similar to the one given for (ii). Choose sequences (fn) and (gn) in Cc(R)
with ∥f − fn∥p → 0 and ∥g − gn∥q → 0. Let M = sup_n ∥gn∥q < ∞. By what is proved so far, ∥f ∗ g − fn ∗ gn∥∞ ≤ ∥f ∗ (g − gn)∥∞ + ∥(f − fn) ∗ gn∥∞ ≤ ∥f∥p ∥g − gn∥q + M∥f − fn∥p → 0 as n → ∞. Since fn ∗ gn ∈ Cc(R) ⊂ C0(R) and since C0(R) is a Banach space, we get that f ∗ g ∈ C0(R).
(iv) This follows from (ii) and (iii) since Cc(R) ⊂ Lq(R). �
Remark: We mention an application of [120](iii). Claim: If A ⊂ R has positive Lebesgue measure,
then A − A contains a neighborhood of 0. Proof : We may assume µ(A) < ∞. Let f = 1A and
g(x) = f(−x), and note f, g ∈ L2(R). We have 0 < µ(A) = ∫ f = ∫ f² = f ∗ g(0). Since f ∗ g is continuous by [120](iii), there is a neighborhood U ⊂ R of 0 such that f ∗ g(x) > 0 for every x ∈ U. But 0 < f ∗ g(x) = ∫ f(y)g(x − y) dy = ∫ f(y)f(y − x) dy implies x ∈ A − A, and hence U ⊂ A − A.
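A discretized version of this argument (an illustration of mine, not from the notes; it assumes NumPy, and A = [0,1] with the given grid is an arbitrary choice) exhibits f ∗ g as the continuous overlap function |A ∩ (A + x)|:

```python
import numpy as np

# Steinhaus-type claim in action: for A = [0,1], f = 1_A and g(x) = f(-x),
# f*g(x) = |A ∩ (A + x)| = max(0, 1 - |x|).  It is continuous, equals
# mu(A) = 1 at x = 0, and is positive near 0, so A - A contains (-1, 1).
h = 1e-3
x = np.arange(0.0, 1.0 + h, h)
f = np.ones_like(x)                        # samples of 1_[0,1]
conv = np.convolve(f, f[::-1]) * h         # discrete f*g on [-1, 1]
s = np.arange(-1.0, 1.0 + h / 2, h)        # grid where conv lives
assert abs(conv[np.argmin(np.abs(s))] - 1.0) < 1e-2   # f*g(0) ~ mu(A) = 1
assert np.all(conv[np.abs(s) < 0.9] > 0)              # positive near 0
expected = np.maximum(0.0, 1.0 - np.abs(s))
assert np.max(np.abs(conv - expected)) < 1e-2
print("f*g > 0 on a neighborhood of 0, hence A - A contains it")
```

The continuity of f ∗ g (guaranteed by [120](iii)) is what turns the single pointwise fact f ∗ g(0) = µ(A) > 0 into an open neighborhood inside A − A.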
Definition: Let C^m_c(R) = {f ∈ Cc(R) : f is m-times continuously differentiable}, C^m_0(R) = {f ∈ C0(R) : f is m-times continuously differentiable}, and C∞_0(R) = ∩_{m=1}^∞ C^m_0(R). More importantly, for our future discussion of distributions, we define the space E of smooth functions, and the space D of smooth functions with compact support, as follows:
Let E = C∞(R) = {f : R → C : f is infinitely often differentiable},
and D = C∞_c(R) = Cc(R) ∩ E = ∩_{m=1}^∞ C^m_c(R).
Notation: Let Df = f ′, the derivative of f : R → C when f is differentiable.
Fact from Analysis: (Differentiating under the integral; see Theorem 7.40 in Apostol, Mathematical Analysis) If ϕ : [a, b] × [c, d] → C and the partial derivative ∂ϕ/∂x are continuous, then h : [a, b] → C defined as h(x) := ∫_c^d ϕ(x, y) dy is differentiable and Dh(x) = ∫_c^d (∂ϕ/∂x)(x, y) dy.
[121] (i) If f ∈ Cc(R) and g ∈ C¹_c(R), then f ∗ g ∈ C¹_c(R) and D(f ∗ g) = f ∗ Dg.
(ii) Let k, m ≥ 0. If f ∈ C^k_c(R) and g ∈ C^m_c(R), then f ∗ g ∈ C^{k+m}_c(R) and D^{i+j}(f ∗ g) = D^i f ∗ D^j g for 0 ≤ i ≤ k and 0 ≤ j ≤ m. In particular, if f ∈ Cc(R) and g ∈ D, then f ∗ g ∈ D and D^m(f ∗ g) = f ∗ D^m g for every m ∈ N.
(iii) Let 1 ≤ p ≤ ∞. If f ∈ Lp(R) and g ∈ C^m_c(R) for some m ∈ N, then f ∗ g is m-times continuously differentiable with D^j(f ∗ g) = f ∗ D^j g for 1 ≤ j ≤ m; moreover, f ∗ g ∈ C^m_0(R) when p < ∞ (for p = ∞, f ∗ g and its derivatives are bounded and uniformly continuous, but need not vanish at ∞; think of f ≡ 1). In particular, if f ∈ Lp(R) and g ∈ D, then f ∗ g ∈ E with D^m(f ∗ g) = f ∗ D^m g for every m ∈ N, and f ∗ g ∈ C∞_0(R) when p < ∞.
(iv) If f ∈ L1(R) and g ∈ C∞_0(R), then f ∗ g ∈ C∞_0(R) and D^m(f ∗ g) = f ∗ D^m g for every m ∈ N.
Proof. (i) Since g ∈ C¹_c(R), we have Dg ∈ Cc(R) and hence f ∗ Dg ∈ Cc(R) by [120](i). It remains to show D(f ∗ g) = f ∗ Dg. Put ϕ(x, y) := f(y)g(x − y). Then ϕ and ∂ϕ/∂x are continuous, where (∂ϕ/∂x)(x, y) = f(y)Dg(x − y). For a fixed x ∈ R, ϕ(x, y) = 0 for y outside a compact interval since f and g have compact supports. By applying the Fact mentioned above to ϕ, we get D(f ∗ g) = f ∗ Dg.
(ii) This follows by the repeated application of (i) since the convolution is symmetric.
(iii) For p < ∞, we know by [120](iv) that f ∗ g ∈ C0(R); for p = ∞, f ∗ g is bounded and uniformly continuous by Exercise-20(ii). It remains to show f ∗ g is m-times continuously differentiable and D^j(f ∗ g) = f ∗ D^j g for 1 ≤ j ≤ m.
Case-1 : f ∈ L1(R). Fix x ∈ R; we claim that D(f ∗ g)(x) = f ∗ Dg(x). Let (tn) be a sequence of non-zero reals converging to 0. Define hn(y) = tn^{−1}(g(x + tn − y) − g(x − y)). Since tn^{−1}(f ∗ g(x + tn) − f ∗ g(x)) = ∫_R f(y)hn(y) dy, it remains to show lim_{n→∞} ∫_R f(y)hn(y) dy = f ∗ Dg(x).
By the mean value theorem applied to g, we observe that ∥hn∥∞ ≤ ∥Dg∥∞ for every n ∈ N. Also, if we put h(y) = Dg(x − y), then we see that (fhn) → fh pointwise. Since |fhn| is dominated by the integrable function ∥Dg∥∞|f|, we conclude by the Lebesgue dominated convergence theorem that lim_{n→∞} ∫_R f(y)hn(y) dy = ∫_R f(y)h(y) dy = ∫_R f(y)Dg(x − y) dy = f ∗ Dg(x), which proves the claim. Since Dg ∈ Cc(R), we have D(f ∗ g) = f ∗ Dg ∈ C0(R) also by [120](iv), and thus f ∗ g ∈ C¹_0(R). Now repeating the argument with Dg, D²g, etc. in place of g, we get the desired result.
Case-2 : f ∈ Lp(R), where p ∈ (1, ∞). Let K = supp(g). Since f ∗ g(x) = ∫_R 1_{x−K}(y)f(y)g(x − y) dy and since y ↦ 1_{x−K}(y)f(y) belongs to L1(R) by Hölder’s inequality, the result follows from Case-1.
Case-3 : f ∈ L∞(R). Let K = supp(g). Since f ∗ g(x) = ∫_R 1_{x−K}(y)f(y)g(x − y) dy and since y ↦ 1_{x−K}(y)f(y) belongs to L1(R), the result follows again from Case-1.
(iv) Use the argument in case-1 of the proof of (iii), and use [120](ii) to say f ∗Dg ∈ C0(R). �
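The identity D(f ∗ g) = f ∗ Dg from [121] can be sanity-checked numerically. In the sketch below (my own, not from the notes; NumPy assumed, grid and tolerance are ad hoc) f is a non-differentiable indicator and the smooth factor g is a Gaussian, which is not compactly supported but decays fast enough on the truncated window for the illustration.

```python
import numpy as np

# With f = 1_[-1,1] (merely integrable, not differentiable) and the smooth
# g(x) = exp(-x^2), the convolution f*g is smooth and D(f*g) = f*(Dg).
# We compare a finite-difference derivative of f*g with f convolved with g'.
h = 2e-3
x = np.arange(-8.0, 8.0 + h, h)
f = (np.abs(x) <= 1.0).astype(float)
g = np.exp(-x**2)
dg = -2.0 * x * np.exp(-x**2)

fg = np.convolve(f, g, mode="same") * h      # samples of f*g
f_dg = np.convolve(f, dg, mode="same") * h   # samples of f*(Dg)
d_fg = np.gradient(fg, h)                    # centered difference of f*g

# compare away from the window ends (to avoid truncation effects)
mid = slice(1000, len(x) - 1000)
assert np.max(np.abs(d_fg[mid] - f_dg[mid])) < 1e-4
print("D(f*g) agrees with f*Dg on the grid")
```

Here the smoothness of the convolution comes entirely from g: differentiation is moved onto the smooth factor, exactly as in the proof of [121](i).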
We will see below that R has an approximate identity {Hδ : δ > 0} with the additional property
that Hδ ∈ D for every δ > 0. This tool is useful in approximating Lp-functions by members of D.
Exercise-21: (i) Let h : R → R be h(x) = e^{−1/x} 1_{(0,∞)}(x). Then h ≥ 0 and h ∈ E.
(ii) For a < b, let h_{a,b}(x) = h(x − a)h(b − x). Then h_{a,b} ≥ 0, h_{a,b} ∈ D and supp(h_{a,b}) = [a, b].
(iii) Let H(x) = c · h_{−1,1}(x) = c e^{−2/(1−x²)} 1_{(−1,1)}(x), where c > 0 is a constant chosen so that ∫_R H(x) dx = 1. Then H ≥ 0, H ∈ D and supp(H) = [−1, 1].
(iv) For δ > 0, let Hδ(x) = δ^{−1}H(x/δ). Then Hδ ≥ 0, ∫_R Hδ = 1, Hδ ∈ D and supp(Hδ) = [−δ, δ]. Thus {Hδ : δ > 0} as δ → 0 is a nonnegative approximate identity on R satisfying the L1-concentration condition in a strong sense (the family {Hδ : δ > 0} is also called a mollifier).
(v) If f ∈ Cc(R), then ∥f − f ∗Hδ∥∞ → 0 as δ → 0.
(vi) Let 1 ≤ p <∞. If f ∈ Lp(R), then ∥f − f ∗Hδ∥p → 0 as δ → 0.
[Hint : (i) For x > 0, show inductively that h^{(n)}(x) = p_n(1/x)h(x) for some polynomial p_n. (ii)-(iv) are easy consequences of (i). Statements (v) and (vi) follow from part (iv) and [102].]
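A discretized mollifier along the lines of Exercise-21 (a sketch of mine, not from the notes; it assumes NumPy, the exponent −2/(1 − x²) follows the formula for h_{−1,1}, and the grid parameters are arbitrary):

```python
import numpy as np

# Mollifier H_delta built from H(x) = c * exp(-2/(1-x^2)) on (-1,1),
# normalized to integral 1.  Convolving the non-smooth f = 1_[-1,1] with
# H_delta smooths it, and ||f - f*H_delta||_1 -> 0 as delta -> 0.
h = 2e-3
x = np.arange(-3.0, 3.0 + h, h)

def H_delta(delta):
    u = x / delta
    v = np.where(np.abs(u) < 1, np.exp(-2.0 / np.maximum(1 - u**2, 1e-300)), 0.0)
    return v / (np.sum(v) * h)             # normalize: integral H_delta = 1

f = (np.abs(x) <= 1.0).astype(float)
errs = []
for delta in [0.5, 0.25, 0.1, 0.05]:
    smooth = np.convolve(f, H_delta(delta), mode="same") * h
    errs.append(np.sum(np.abs(f - smooth)) * h)    # ||f - f*H_delta||_1
assert all(e2 < e1 for e1, e2 in zip(errs, errs[1:]))  # error decreases
assert errs[-1] < 0.1
print("L1 mollification errors:", [round(e, 4) for e in errs])
```

The error is concentrated in bands of width about 2δ around the two jumps of f, which is why it is proportional to δ; this is the discrete shadow of statement (vi).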
[122] D := C∞c (R) is dense in (Lp(R), ∥ · ∥p) for 1 ≤ p <∞.
Proof. Since Cc(R) is dense in (Lp(R), ∥ · ∥p) by Exercise-1, it suffices to show f ∈ Cc(R) can be
approximated in Lp-norm by members of D. We know ∥f − f ∗Hδ∥p → 0 by Exercise-21(vi). Since
f ∈ Cc(R) and Hδ ∈ D, we also have f ∗Hδ ∈ D by [121](ii), and we are done. �
10. Topologies on the spaces D = C∞c (R) and E = C∞(R)
The theory of distributions (to be introduced soon) is based on the three spaces D ⊂ S ⊂ E, where D := C∞_c(R), E := C∞(R), and S is the Schwartz space (to be defined in the next section). In this
section, we will introduce suitable topologies on D and E , and will mention a few basic properties
of these spaces. First we will review a few selected facts from Functional Analysis that we need.
Definition: Let X be a topological vector space (over C), i.e., X is a vector space having a Hausdorff topology, and the maps (a, x) ↦ ax from C × X to X and (x, y) ↦ x + y from X² to X are continuous.
(i) X is locally convex if 0 ∈ X has a neighborhood base consisting of convex open sets.
(ii) X is a Frechet space if X is locally convex and admits a complete translation invariant metric inducing its topology.
Remark: Many function spaces appearing in Analysis do not admit any natural structure of a
Banach space, but they retain two nice properties of a Banach space - local convexity and the
existence of an admissible complete metric - so that they are Frechet spaces (we will see examples
shortly). One advantage of a Frechet space is that Baire category theorem, and consequently many
classic theorems in Functional Analysis based on Baire category theorem (Open mapping theorem,
Uniform boundedness theorem, etc.) hold good on Frechet spaces - see Rudin, Functional Analysis.
Definition: Let X be a vector space (over C). A function p : X → [0,∞) is a seminorm if
p(ax) = |a|p(x) and p(x + y) ≤ p(x) + p(y) for every a ∈ C and x, y ∈ X. The first property
implies p(x) = 0 if x = 0 (take a = 2). If the converse (that is, p(x) = 0 ⇒ x = 0) also holds, then
p becomes a norm on X. An easy way to produce a seminorm p is: choose any linear functional
ϕ : X → C and put p(x) = |ϕ(x)|. A useful observation about a seminorm p is the following: since p(x) ≤ p(y) + p(x − y) and p(y) ≤ p(x) + p(y − x), we get |p(x) − p(y)| ≤ p(x − y).
Definition: Let X be a vector space (over C), and P be a family of seminorms on X.
(i) P is separating if the only x ∈ X with p(x) = 0 ∀ p ∈ P is x = 0.
(ii) The topology generated by P on X is the smallest topology on X that makes every p ∈ P
continuous. Sets of the form {x ∈ X : pj(x) < ε for 1 ≤ j ≤ k}, where ε > 0 and p1, . . . , pk ∈ P ,
form a base at 0 ∈ X for this topology. Using this, observe that the topology generated by a family
of seminorms is always locally convex.
(iii) P is directed if ∀ p1, p2 ∈ P, ∃ p3 ∈ P with p3 ≥ max{p1, p2}. If P is directed, then the basic neighborhoods of 0 ∈ X have the form {x ∈ X : p(x) < ε} for some ε > 0 and p ∈ P.
[123] (Working knowledge about a topology specified by seminorms) Let X be a topological vector
space, where the topology is generated by a directed family P of seminorms. Then,
(i) A seminorm q on X (not necessarily a member of P ) is continuous iff there exist p ∈ P and
C > 0 such that q(x) ≤ Cp(x) for every x ∈ X. Consequently, a linear functional ϕ : X → C is
continuous iff there exist p ∈ P and C > 0 such that |ϕ(x)| ≤ Cp(x) for every x ∈ X.
(ii) Let P be separating and countable, say P = {pk : k ∈ N}, and let Uk = {x ∈ X : pk(x) < 1/k}.
Then {Uk : k ∈ N} is a local base at 0 for X (we remark that even when X is metrizable, it is more
convenient to use the seminorms than the metric).
(iii) If P = {pk : k ∈ N} is separating, then X is metrizable with a translation invariant metric d(x, y) := ∑_{k=1}^∞ 2^{−k} min{1, pk(x − y)}. Here, translation invariance means d(x + z, y + z) = d(x, y).
(iv) Let P = {pk : k ∈ N} be separating. Then, (xn) → x in X ⇔ for every k ∈ N, there is n0 ∈ N
such that pk(x− xn) < 1/k for every n ≥ n0; and (xn) is Cauchy in X ⇔ for every k ∈ N, there is
n0 ∈ N such that pk(xm − xn) < 1/k for every m,n ≥ n0.
Proof. (i) Suppose q is continuous, and let U ⊂ X be a basic neighborhood of 0 ∈ X with q(U) ⊂ [0, 1). We may assume U = {x ∈ X : p(x) < ε} for some ε > 0 and p ∈ P since P is directed. We claim that C := 2/ε works. Consider x ∈ X. If p(x) = 0, then p(ax) = ap(x) = 0 so that ax ∈ U for every a > 0, which implies aq(x) = q(ax) < 1 for every a > 0; hence q(x) = 0 and trivially q(x) ≤ Cp(x). If p(x) > 0, then for a := ε/(2p(x)) we have ax ∈ U and aq(x) = q(ax) < 1; this implies q(x) < 1/a = Cp(x). Conversely, if the given condition holds, then q is continuous at 0 ∈ X; and the continuity at a general point follows from |q(x) − q(y)| ≤ q(x − y) ≤ Cp(x − y). The second assertion about a linear functional follows by applying what is already proved to q(x) := |ϕ(x)|.
(ii) Consider a basic neighborhood U = {x ∈ X : pj(x) < ε for 1 ≤ j ≤ m} of 0, where pj ∈ P and
ε > 0. Let n ≥ m be such that 1/n < ε, and then choose pk ≥ max{p1, . . . , pn} using the fact that
P is directed. Then clearly 0 ∈ Uk ⊂ U , and this shows {Uk : k ∈ N} is a local base at 0.
(iii) Translation invariance and symmetry of d are clear, and the triangle inequality for d follows from those of the pk's. We have d(x, x) = 0 since pk(0) = 0. If d(x, y) = 0, then pk(x − y) = 0 for every k ∈ N, which implies x − y = 0, i.e., x = y, since P is separating. Thus d is a translation invariant metric. Now we verify, using the sets Uk from (ii), that d induces the same topology. If d(0, x) < 2^{−k}/k, then 2^{−k} min{1, pk(x)} < 2^{−k}/k, so min{1, pk(x)} < 1/k ≤ 1 and hence pk(x) < 1/k; thus Bd(0, 2^{−k}/k) ⊂ Uk. For the other direction, given ε > 0, choose m ∈ N with 1/m < ε/2 and ∑_{j=m+1}^∞ 2^{−j} < ε/2. Then choose k > m with pk ≥ max{p1, . . . , pm} by the directedness of P. For any x ∈ Uk, we have pj(x) ≤ pk(x) < 1/k for 1 ≤ j ≤ m, and hence d(0, x) ≤ ∑_{j=1}^m 2^{−j}/k + ∑_{j=m+1}^∞ 2^{−j} < 1/k + ε/2 < ε, which shows Uk ⊂ Bd(0, ε).
(iv) This follows from (ii). �
We now apply these tools from Functional Analysis to the function spaces of our interest:
Exercise-22: (Examples) (i) Let ρN (f) = max{|f(x)| : |x| ≤ N} for f ∈ C(R) and N ∈ N. Then
P = {ρN : N ∈ N} is a directed separating family of seminorms on C(R), and C(R) is a Frechet
space w.r.to the metric induced by P . Also, the topology on C(R) obtained in this way (the topology
of uniform convergence on compact sets) coincides with the compact-open topology on C(R).
(ii) For f ∈ E := C∞(R), and N ∈ N, let pN (f) = max{|Djf(x)| : 0 ≤ j ≤ N and |x| ≤ N}. Then
P = {pN : N ∈ N} is a directed separating family of seminorms on E , and E is a Frechet space
w.r.to the metric induced by P . If we put UN = {f ∈ E : pN (f) < 1/N}, then {UN : N ∈ N} is a
local base at 0 for E . For fn, f ∈ E , we have (fn) → f in E iff for each N ∈ N, there is n0 ∈ N such
that pN (f − fn) < 1/N for every n ≥ n0. A linear functional ϕ : E → C is continuous iff there exist
C > 0 and N ∈ N such that |ϕ(f)| ≤ CpN (f) for every f ∈ E .
(iii) Dk := {f ∈ E : supp(f) ⊂ [−k, k]} is a closed vector subspace of E (hence a Frechet space), and note ∪_{k=1}^∞ Dk = D. The seminorms pN from part (ii) become norms when restricted to Dk, and {Dk ∩ UN : N ∈ N} is a local base at 0 for Dk, where UN is as in (ii). For f, fn ∈ Dk, we have (fn) → f in Dk iff for each N ∈ N, there is n0 ∈ N such that pN(f − fn) < 1/N for every n ≥ n0.
[Hint : (ii) Completeness of E = C∞(R): if (fn) is Cauchy in E, then (D^j fn)_{n=1}^∞ is Cauchy in the Frechet space C(R) for each j ≥ 0; put g_j := lim_n D^j fn and deduce D^j g_0 = g_j for every j ∈ N using Theorem 7.17 in Rudin, Principles of Mathematical Analysis. (iii) Let ϕx : E → C be the continuous functional ϕx(f) = f(x). Then Dk = ∩_{|x|>k} ker(ϕx), and hence Dk is closed.]
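The seminorm-to-metric recipe of [123](iii), specialized to the seminorms ρ_N of Exercise-22(i), can be played with numerically (an illustration of mine, not from the notes; it assumes NumPy, functions are represented by samples on a finite grid, and the truncation of the series at K = 10 matches the grid window):

```python
import numpy as np

# For sampled continuous functions on R, the seminorms
# rho_N(f) = max{|f(x)| : |x| <= N} give the translation-invariant metric
#   d(f, g) = sum_k 2^{-k} min(1, rho_k(f - g)),
# which metrizes uniform convergence on compact sets.
x = np.linspace(-10.0, 10.0, 20_001)

def rho(f, N):
    return np.max(np.abs(f[np.abs(x) <= N]))

def d(f, g, K=10):
    return sum(2.0**-k * min(1.0, rho(f - g, k)) for k in range(1, K + 1))

f = np.sin(x)
# f_n = sin(x) + x/n converges to f uniformly on every compact set,
# though not uniformly on all of R; the metric d detects this convergence.
dists = [d(f + x / n, f) for n in [1, 10, 100, 1000]]
assert all(b < a for a, b in zip(dists, dists[1:]))   # d -> 0
assert dists[-1] < 0.05
assert d(f, f) == 0.0
print("metric values:", [round(v, 4) for v in dists])
```

Each tail term is capped by 2^{−k}, so large-|x| behavior is discounted: that is exactly how the metric encodes "uniform convergence on compacts" rather than on all of R.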
Remark: By Exercise-22(ii), (fn) → f in E ⇔ for each N ∈ N and j ≥ 0, (D^j fn) → D^j f uniformly on [−N, N] as n → ∞ ⇔ for each j ≥ 0, (D^j fn) → D^j f uniformly on compact subsets of R. In higher dimension, if we consider E(W) := C∞(W, C) for an open set W ⊂ Rm, then (fn) → f in E(W) ⇔ for each multi-index α, (D^α fn) → D^α f uniformly on compact subsets of W.
Exercise-23: (i) (Existence of smooth bump function) Given 0 < a < b, there exists g ∈ D such
that 0 ≤ g ≤ 1, g ≡ 1 on [−a, a], and supp(g) ⊂ [−b, b].
(ii) D is dense in E .
[Hint : (i) By Exercise-21, there is f ∈ E with f(x) = 0 for x ≤ 0 and f(x) > 0 for x > 0. Let h(x) = f(b − x)/[f(b − x) + f(x − a)]. Then h ≡ 1 on (−∞, a], h ≡ 0 on [b, ∞), and 0 < h < 1 on (a, b). Put g(x) = h(|x|). (ii) Let f ∈ E. Pick g ∈ D by (i) with g ≡ 1 on [−1, 1]. Define gn(x) = g(x/n). Then fgn ∈ D, and fgn ≡ f on [−n, n] so that pN(f − fgn) = 0 for n > N, giving (fgn) → f in E.]
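The construction in the hint can be coded directly (a sketch of mine, not from the notes; it assumes NumPy, a = 1 and b = 2 are arbitrary choices, and the tiny constants only guard against division by zero and overflow in floating point):

```python
import numpy as np

# Smooth bump via f(x) = exp(-1/x) for x > 0, f = 0 otherwise (Exercise-21):
# h(x) = f(b-x)/(f(b-x) + f(x-a)) is 1 on (-inf, a], 0 on [b, inf), and
# strictly between 0 and 1 on (a, b); g(x) = h(|x|) is then a bump with
# g = 1 on [-a, a] and supp(g) contained in [-b, b].
def f(x):
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, np.exp(-1.0 / np.maximum(x, 1e-300)), 0.0)

def bump(x, a=1.0, b=2.0):
    y = np.abs(x)
    num = f(b - y)
    den = f(b - y) + f(y - a)       # den > 0 for every y since a < b
    return np.where(den > 0, num / np.maximum(den, 1e-300), 0.0)

xs = np.linspace(-3.0, 3.0, 6001)
g = bump(xs)
assert np.all(g[np.abs(xs) <= 1.0] == 1.0)   # g = 1 on [-a, a]
assert np.all(g[np.abs(xs) >= 2.0] == 0.0)   # supp(g) in [-b, b]
assert np.all((g >= 0.0) & (g <= 1.0))
assert abs(bump(1.5) - 0.5) < 1e-12          # midpoint symmetry: h(1.5) = 1/2
print("bump: g(0) =", g[3000], ", g(1.5) =", round(bump(1.5).item(), 4))
```

The midpoint value 1/2 comes from the symmetry f(b − y) = f(y − a) at y = (a + b)/2; smoothness is inherited from the single function f of Exercise-21.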
Remark: Since the closure of D in E is E while D ≠ E, Exercise-23(ii) shows that D is not closed (hence not complete) in E (another argument for this uses the Baire category theorem, after noting that the proper closed subspaces Dk must be nowhere dense in D). Intuitively, D is not closed in E because if (fn) is a sequence in D converging to f ∈ E, then supp(fn) can get bigger with n, so that in the limit supp(f) may fail to be compact. Since it is desirable to work with complete spaces, we will now put a (sequentially complete) topology Tind, called the inductive limit topology, on D. This topology Tind will be such that if (fn) is Cauchy in (D, Tind), then there is a uniform bound for the supports supp(fn), and this will ensure that the support of f := lim fn is also compact.
As before, let Dk = {f ∈ D : supp(f) ⊂ [−k, k]} and let Tk denote the topology on Dk (subspace
topology induced from E). Keep in mind that (Dk, Tk) is a Frechet space.
Definition: The inductive limit topology Tind on D is defined as the finest (strongest) locally convex
topology on D such that the inclusions Dk ⊂ D become continuous for every k ∈ N.
Remark: By definition, Tind is stronger than the subspace topology on D induced by E . Conse-
quently, Dk is closed in (D, Tind) for each k ∈ N by Exercise-22(iii).
Exercise-24: (i) The collection of all convex balanced sets U ⊂ D such that Dk ∩ U is open in Dk for every k ∈ N forms a local base at 0 for the locally convex space (D, Tind). (Here, U is said to be balanced if cU ⊂ U for all c ∈ C with |c| ≤ 1.)
(ii) The subspace topology induced on Dk from Tind coincides with the original topology Tk of Dk.
Thus U ⊂ D is open in D iff Dk ∩ U is open in Dk for every k ∈ N .
[Hint : We leave this as a reading assignment - see 6.4 in Rudin, Functional Analysis.]
Definition: Let X,Y be topological vector spaces. (i) A ⊂ X is a bounded subset if for every
neighborhood U of 0 ∈ X, there is c > 0 such that cA ⊂ U . (ii) A sequence (xn) in X is Cauchy
if for every neighborhood U ⊂ X of 0, there is n0 ∈ N such that xn − xm ∈ U for every m,n ≥ n0;
and X is (sequentially) complete if every Cauchy sequence in X converges to some element of X.
(iii) A linear map T : X → Y is bounded if T (A) is bounded in Y whenever A ⊂ X is bounded.
[124] (i) A ⊂ D is bounded ⇔ there is k ∈ N such that A ⊂ Dk and A is bounded in Dk.
(ii) (fn) → f in D ⇔ there is k ∈ N such that f, fn ∈ Dk for every n ∈ N and (fn) → f in Dk.
(iii) D is sequentially complete.
(iv) D is not metrizable.
Proof. In (i) and (ii), we will prove only the implication ‘⇒’ since ‘⇐’ is a direct consequence of
the continuity of the inclusion Dk ⊂ D.
(i) Suppose A ⊂ D is bounded. If A is not a subset of Dk for any k, choose fk ∈ A \ Dk for every
k ∈ N. Then there are xk ∈ R with |xk| > k and εk := |fk(xk)/k| > 0 for every k ∈ N. Let
U = {f ∈ D : |f(xk)| < εk for every k ∈ N}. We claim that Dk ∩ U is a neighborhood of 0 in Dk
for every k ∈ N. Given k ∈ N, choose m ∈ N such that |xj| < m and 1/m < εj for 1 ≤ j ≤ k. Then Um := {f ∈ Dk : pm(f) < 1/m} ⊂ U, and this establishes the claim. Therefore by Exercise-24(ii),
U is a neighborhood of 0 in D. Since (fk) is bounded, we must have (fk/k) → 0. But fk/k /∈ U
for any k ∈ N by the definition of U , a contradiction. This contradiction establishes that A ⊂ Dk
for some k ∈ N. To show A is bounded in Dk, consider a neighborhood V of 0 in Dk. Then
V = Dk ∩ U for some neighborhood U of 0 in D by Exercise-24(ii). Let c > 0 be with cA ⊂ U .
Then cA ⊂ cDk ∩ U = Dk ∩ U = V , where cDk = Dk since Dk is a vector subspace.
(ii) This follows from part (i) and Exercise-24(ii) since {f} ∪ {fn : n ∈ N} is bounded.
(iii) This follows from (i) and the completeness of Dk since a Cauchy sequence is bounded.
(iv) If D were metrizable, then it would be a complete metric space by (iii). Now D = ∪_{k=1}^∞ Dk is a union of proper closed vector subspaces (Dk ≠ D by Exercise-21(ii), and each Dk is closed in D). A proper closed vector subspace must be nowhere dense. Hence we arrive at a contradiction with the Baire category theorem. �
Remark: More explicitly, [124](ii) means: (fn) → f in D ⇔ there is k ∈ N such that supp(f) and supp(fn) are subsets of [−k, k] for every n ∈ N and lim_{n→∞} ∥D^j f − D^j fn∥∞ = 0 for every j ≥ 0. In higher dimension, if we consider D(W) := C∞_c(W, C) for an open set W ⊂ Rm, then (fn) → f in D(W) ⇔ there is a compact set K ⊂ W such that supp(f), supp(fn) ⊂ K for every n, and (D^α fn) → D^α f uniformly on K for each multi-index α.
Seminar topic: The spaces E , Dk and D have the Heine-Borel property: closed and bounded subsets
are compact (see 1.46 and 6.7 in Rudin, Functional Analysis).
Even though D is not metrizable, continuity can be characterized using sequences:
Exercise-25: Let Y be a locally convex vector space and T : D → Y be linear. Then TFAE:
(i) T is continuous.
(ii) T|_{Dk} is continuous for each k ∈ N.
(iii) T is bounded.
(iv) (fn) → 0 in D implies (Tfn) → 0 in Y .
[Hint : (i) ⇔ (ii) can be deduced using Exercise-24(ii) (a general property of the inductive limit topology), and (i) ⇔ (iii) is routine Functional Analysis. We have (iv) ⇒ (ii) since Dk is metrizable.]
Remark: Since pN (Djf) ≤ pN+j(f), the map f 7→ Djf is continuous on Dk for every k, j ∈ N, and
hence f 7→ Djf from D to D is continuous for every j ∈ N by Exercise-25.
11. The Schwartz space S
We have Cc(R) ⊂ C0(R) ⊂ C(R), where the members f ∈ C0(R) are defined in terms of the
decay of f at infinity. Similarly, now we will define7 the Schwartz space S of smooth functions with
D ⊂ S ⊂ E , where the members f ∈ S are defined by requiring that f and all its derivatives decay
rapidly at infinity. Later we will see that S is the natural domain for the Fourier transform.
Exercise-26: Write xif for the function x 7→ xif(x). For f ∈ C(R), the following are equivalent:
(i) xif ∈ L∞(R) for i = 0, 1, 2, . . ..
(ii) (1 + |x|)if ∈ L∞(R) for i = 0, 1, 2, . . ..
(iii) xif ∈ C0(R) for i = 0, 1, 2, . . ..
(iv) (1 + |x|)if ∈ C0(R) for i = 0, 1, 2, . . ..
[Hint : Since |x|^i ≤ (1 + |x|)^i = ∑_{j=0}^i c_j |x|^j ≤ C|x|^i for |x| ≥ 1, we get (i) ⇔ (ii) and (iii) ⇔ (iv). If ∥x^{i+1} f∥∞ < ∞, then lim_{|x|→∞} |x|^i |f(x)| = 0, and this shows (i) ⇒ (iii).]
Definition: Define the Schwartz space S as below, and note D ⊂ S ⊂ C∞_0(R) ⊂ E:
S := {f ∈ E : x^i D^j f ∈ L∞(R) ∀ i, j ≥ 0} = {f ∈ E : (1 + |x|)^i D^j f ∈ L∞(R) ∀ i, j ≥ 0}
= {f ∈ E : x^i D^j f ∈ C0(R) ∀ i, j ≥ 0} = {f ∈ E : (1 + |x|)^i D^j f ∈ C0(R) ∀ i, j ≥ 0}.
Example: The Gaussian x ↦ e^{−x²} belongs to S, but x ↦ (1 + x²)^{−1} is not a member of S (for instance, x³(1 + x²)^{−1} is unbounded).
Exercise-27: If we put si,j(f) = ∥xiDjf∥∞, then the family {si,j : i, j ≥ 0} of seminorms defines a
locally convex topology on the vector space S. Moreover:
(i) The family {s_{i,j} : i, j ≥ 0} of seminorms is separating but is not a directed family. However, if we put q_N = ∑_{0≤i,j≤N} s_{i,j}, then {q_N : N ∈ N} is a separating directed family of seminorms on S inducing the same topology on S.
(ii) Let V_N = {f ∈ S : q_N(f) < 1/N} = {f ∈ S : ∑_{0≤i,j≤N} s_{i,j}(f) < 1/N}. Then {V_N : N ∈ N} is a local base at 0 for S.
(iii) A linear functional ϕ : S → C is continuous ⇔ there exist C > 0 and N ∈ N such that |ϕ(f)| ≤ Cq_N(f) = C ∑_{0≤i,j≤N} s_{i,j}(f) for every f ∈ S.
7L. Schwartz, who introduced the theory of distributions, is not the H.A. Schwarz of the Cauchy-Schwarz inequality.
(iv) For f, fn ∈ S, we have (fn) → f in S ⇔ for every N ∈ N, there is n0 ∈ N such that q_N(f − fn) = ∑_{0≤i,j≤N} s_{i,j}(f − fn) < 1/N for every n ≥ n0.
(v) S admits an invariant metric that is complete. Thus S is a Frechet space.
[Hint : After verifying (i), deduce the other statements using [123] and the hint of Exercise-22(ii).]
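The seminorms s_{i,j} can be estimated on a grid to see why a Gaussian lies in S while (1 + x²)^{−1} does not (an illustration of mine, not from the notes; it assumes NumPy, and the supremum over R is only approximated by a max over [−R, R]):

```python
import numpy as np

# Approximate s_{3,0}(f) = ||x^3 f||_inf on growing windows [-R, R]:
# for f(x) = exp(-x^2) the value stabilizes (the sup over R is finite),
# while for f(x) = 1/(1+x^2) the value grows like R, i.e. x^3 f is
# unbounded, so (1+x^2)^{-1} fails to belong to S.
for R in [10.0, 100.0, 1000.0]:
    x = np.linspace(-R, R, 200_001)
    gauss = np.max(np.abs(x**3 * np.exp(-x**2)))
    rational = np.max(np.abs(x**3 / (1 + x**2)))
    assert gauss < 0.6            # true sup is (3/2)^{3/2} e^{-3/2} ~ 0.41
    assert rational > 0.9 * R     # grows with the window: unbounded
print("x^3 e^{-x^2} stays bounded; x^3/(1+x^2) does not")
```

The same experiment with any fixed i, j tells the same story: every polynomial weight loses against Gaussian decay, which is the defining feature of S.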
Exercise-28: (i) D is dense in S.
(ii) The inclusions Dk ⊂ S (hence D ⊂ S), S ⊂ (C∞_0(R), ∥ · ∥∞), and S ⊂ E are all continuous.
(iii) For 1 ≤ p <∞ we have S ⊂ Lp(R) and the inclusion is continuous.
(iv) S is closed under the following linear maps: f 7→ xif for every i ≥ 0 (and hence under f 7→ gf
for any polynomial g), and f 7→ Djf for every j ≥ 0. These maps are continuous on S.
[Hint : (i) Let f ∈ S. By Exercise-23(i), there is g ∈ D, 0 ≤ g ≤ 1, with g ≡ 1 on [−1, 1]. Define gn(x) = g(x/n) so that fgn ∈ D and fgn ≡ f on [−n, n]. Note that D^j(f − fgn) = ∑_{k=0}^j c_k D^k f · D^{j−k}(1 − gn). Let M = max_{0≤k≤j} ∥D^{j−k}(1 − g)∥∞. Then sup_n ∥D^{j−k}(1 − gn)∥∞ ≤ M for 0 ≤ k ≤ j. Therefore, s_{i,j}(f − fgn) ≤ M sup_{|x|>n} ∑_{k=0}^j c_k |x^i D^k f(x)| → 0. (ii) Fix N > k. Let M = max{|x^i| : 0 ≤ i ≤ N, |x| ≤ N}. Then q_N(f) = ∑_{0≤i,j≤N} s_{i,j}(f) ≤ ∑_{0≤i,j≤N} M p_N(f) for f ∈ Dk. Hence the inclusion Dk ⊂ S is continuous. Next, S ⊂ C∞_0(R) and S ⊂ E are continuous since ∥f∥∞ = q_0(f) and p_N(f) ≤ q_N(f). (iii) Let g(x) = 1/(1 + |x|)². Note g ∈ Lp(R). Let C > 0 be with (1 + |x|)² ≤ C|x|² for |x| ≥ 1. For f ∈ S, we have ∫_R |f|^p ≤ ∫_{|x|≤1} |f|^p + ∫_{|x|>1} |Cx²gf|^p ≤ 2 s_{0,0}(f)^p + C^p s_{2,0}(f)^p ∥g∥_p^p. (iv) For continuity, note s_{i1,j}(x^{i2} f) ≤ s_{i1+i2,j}(f) and s_{i,j1}(D^{j2} f) ≤ s_{i,j1+j2}(f).]
[125] If f, g ∈ S, then f ∗ g ∈ S. And the bilinear map (f, g) 7→ f ∗ g from S2 to S is continuous.
Proof. Let f, g ∈ S ⊂ C∞_0(R) ∩ L1(R). The second inclusion ensures that f ∗ g is defined. Fix i, j ≥ 0. We need to show x^i D^j(f ∗ g) ∈ L∞(R). Let h = D^j g, which belongs to S. By [121](iv), f ∗ g ∈ C∞_0(R) and D^j(f ∗ g) = f ∗ h. Writing x^i = (y + (x − y))^i = ∑_{k=0}^i c_k y^k (x − y)^{i−k}, observe that
x^i (f ∗ h)(x) = ∫_R x^i f(y)h(x − y) dy = ∑_{k=0}^i c_k ∫_R y^k f(y) (x − y)^{i−k} h(x − y) dy = ∑_{k=0}^i c_k (x^k f ∗ x^{i−k}h)(x).
Since f, h ∈ S, we have x^k f, x^{i−k}h ∈ S ⊂ C∞_0(R) ∩ L1(R). So by [121](iv) we get x^k f ∗ x^{i−k}h ∈ C∞_0(R). Hence x^i(f ∗ h) ∈ C∞_0(R) ⊂ L∞(R), and this completes the proof that f ∗ g ∈ S.
To establish the continuity of the bilinear map (f, g) ↦ f ∗ g, consider sequences (fn) → 0 and (gn) → 0 in S. In view of Exercise-27(iv), we need to show lim_{n→∞} s_{i,j}(fn ∗ gn) = 0 for every i, j ≥ 0. From the above arguments, s_{i,j}(fn ∗ gn) = ∥x^i D^j(fn ∗ gn)∥∞ ≤ ∑_{k=0}^i c_k ∥x^k fn ∗ x^{i−k}D^j gn∥∞. By Exercise-20, ∥x^k fn ∗ x^{i−k}D^j gn∥∞ ≤ ∥x^k fn∥1 ∥x^{i−k}D^j gn∥∞. The right hand side goes to 0 because (x^k fn) → 0 and (x^{i−k}D^j gn) → 0 in S by Exercise-28(iv), and then ∥x^k fn∥1 → 0 and ∥x^{i−k}D^j gn∥∞ → 0 since the inclusions S ⊂ L1(R) and S ⊂ (C∞_0(R), ∥·∥∞) are continuous by parts (iii) and (ii) of Exercise-28. Hence lim_{n→∞} s_{i,j}(fn ∗ gn) = 0. �
Remark: Similarly, it can be shown that the map (f, g) ↦ f ∗ g is bilinear and continuous from D² to D, from D × C∞_0(R) to C∞_0(R), etc. (try to write the proofs for some of them).
12. Distributions: preliminaries
Motivations for introducing distributions: (i) Two fundamental operations in Calculus are differ-
entiation and integration. People say that the theory of distributions is a completion of the theory
of ordinary differentiation just as the theory of Lebesgue integration is a completion of the theory
of Riemann integration.
(ii) We will show that all distributions (also called generalized functions) are differentiable in a
certain sense8. It will turn out that all locally integrable functions and all locally finite Borel
measures are distributions, and hence we can differentiate them!
(iii) Certain partial differential equations may have meaningful solutions that are differentiable only
almost everywhere. The framework of distributions provides a proper place for such solutions.
(iv) Distributions provide a mathematical framework in which certain ‘functions’, such as the Dirac delta function δ, that are not functions in the ordinary sense (informally, δ(0) = ∞ and δ ≡ 0 on R \ {0}) obtain a proper mathematical status.
(v) The theory of Fourier transform can be elegantly developed playing with the Schwartz class
and tempered distributions (this is the approach in Grafakos, Classical Fourier Analysis).
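Motivation (iv) can be made concrete already with the mollifier Hδ of Exercise-21: pairing Hδ against a test function reproduces evaluation at 0 in the limit δ → 0, which is precisely the action ⟨δ, f⟩ = f(0) of the Dirac distribution. A numerical sketch (mine, not from the notes; NumPy assumed, the exponent −2/(1 − x²) follows h_{−1,1}, and the grid is arbitrary):

```python
import numpy as np

# The delta distribution as a limit of honest functions:
# <H_delta, f> = integral of H_delta * f  ->  f(0)  as delta -> 0.
h = 1e-4
x = np.arange(-1.0, 1.0 + h, h)

def pair_with_H(f_vals, delta):
    u = x / delta
    H = np.where(np.abs(u) < 1, np.exp(-2.0 / np.maximum(1 - u**2, 1e-300)), 0.0)
    H /= np.sum(H) * h                     # normalize: integral H_delta = 1
    return np.sum(H * f_vals) * h          # the pairing <H_delta, f>

f = np.cos(3.0 * x)                        # test function with f(0) = 1
vals = [pair_with_H(f, d) for d in [0.5, 0.1, 0.01]]
errs = [abs(v - 1.0) for v in vals]
assert all(e2 < e1 for e1, e2 in zip(errs, errs[1:]))
assert errs[-1] < 1e-3
print("<H_delta, f> -> f(0):", [round(v, 5) for v in vals])
```

No ordinary function represents the limiting functional, which is the point of enlarging the class of functions to distributions.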
For a topological vector space X, let X ′ = {ϕ : X → C : ϕ is linear and continuous} be its dual.
We will write ⟨ϕ, x⟩ for ϕ(x) when x ∈ X and ϕ is a map from X to C.
Definition: Recall the spaces D ⊂ S ⊂ E , where D = (C∞c (R), Tind), E = C∞(R), and S is the
Schwartz space. Let D′, S ′, and E ′ be their duals respectively. We have D′ ⊃ S ′ ⊃ E ′. The members
of D′ are called distributions, the members of S ′ are called tempered distributions, and the members
of E ′ are called distributions with compact support (this terminology will be clarified shortly). From
the earlier theory, we have the following characterization of members of D′, S ′, and E ′.
[126] Recall that p_N(f) = max{|f^{(k)}(x)| : 0 ≤ k ≤ N, |x| ≤ N}, s_{i,j}(f) = ∥x^i D^j f∥∞, and q_N = ∑_{0≤i,j≤N} s_{i,j}.
(i) A linear functional ϕ : D → C belongs to D′ ⇔ ϕ|_{Dk} is continuous for each k ∈ N ⇔ for each k ∈ N, there exist C > 0 and N ∈ N such that |⟨ϕ, f⟩| ≤ Cp_N(f) for every f ∈ Dk.
8The generalized derivative of functions in the sense of distributions is often called the weak derivative.
(ii) A linear functional ϕ : S → C belongs to S′ ⇔ there exist C > 0 and N ∈ N such that |⟨ϕ, f⟩| ≤ Cq_N(f) = C ∑_{0≤i,j≤N} s_{i,j}(f) for every f ∈ S.
(iii) A linear functional ϕ : E → C belongs to E′ ⇔ there exist C > 0 and N ∈ N such that |⟨ϕ, f⟩| ≤ Cp_N(f) for every f ∈ E. Consequently, for every ϕ ∈ E′, there is N ∈ N with the following property: whenever f ∈ E satisfies D^k f ≡ 0 on [−N, N] for 0 ≤ k ≤ N, then ⟨ϕ, f⟩ = 0 (the smallest such N is sometimes called the order of ϕ; thus every ϕ ∈ E′ is of finite order).
Examples: (i) (Distributions generalize the notion of locally integrable functions) We will show E ⊂ L1_loc(R) ⊂ D′. The first inclusion is clear. Now, any g ∈ L1_loc(R) induces a linear functional ϕ_g : D → C by the expression ⟨ϕ_g, f⟩ = ϕ_g(f) := ∫_R f(x)g(x) dx. Fix k ∈ N. Then ∥f∥∞ ≤ p_k(f) for every f ∈ Dk, and for C := ∫_{−k}^{k} |g| we have |⟨ϕ_g, f⟩| ≤ Cp_k(f) for f ∈ Dk. Thus ϕ_g is continuous by [126](i), and ϕ_g ∈ D′. From now onwards, we will just write ⟨g, f⟩ for ⟨ϕ_g, f⟩ if no confusion can arise.
(ii) (Tempered distributions generalize the notion of Lp-functions and polynomials) Let 1 ≤ p ≤ ∞; we will show Lp(R) ⊂ S′. If g ∈ L∞(R), then by the hint of Exercise-28(iii), there is C > 0 such that ∥f∥1 ≤ C ∑_{0≤i,j≤2} s_{i,j}(f) for every f ∈ S, and hence |⟨g, f⟩| ≤ ∥g∥∞ ∫ |f| ≤ ∥g∥∞ C ∑_{0≤i,j≤2} s_{i,j}(f). Therefore g (more precisely, the functional f ↦ ⟨g, f⟩) belongs to S′. Next suppose 1 ≤ p < ∞, let g ∈ Lp(R), and let 1 < q ≤ ∞ be such that 1/p + 1/q = 1. Let h(x) = (1 + |x|)^{−2}. Since h ∈ Lq(R), we get by Hölder’s inequality that ∫_R |gh| ≤ ∥g∥p ∥h∥q. Let C > 0 be with (1 + |x|)² ≤ C|x|² for |x| ≥ 2. Since ∫_{|x|≤2} (1 + |x|)² dx ≤ ∫_{|x|≤2} 9 dx = 36, we have
|⟨g, f⟩| ≤ (∫_{|x|≤2} + ∫_{|x|>2}) |gh| (1 + |x|)² |f| dx ≤ (36 s_{0,0}(f) + C s_{2,0}(f)) ∥g∥p ∥h∥q,
which shows g ∈ S′ by [126](ii). More generally, if g : R → R satisfies ∫_R |g(x)|(1 + |x|)^{−m} dx < ∞ for some m ≥ 2, then by an argument similar to the one given above, we can show g ∈ S′. In particular, if g : R → R is a polynomial, then g ∈ S′.
(iii) (Distributions with compact support generalize the notion of Lp-functions with bounded support9) If g ∈ Dk ⊂ L1(R), then |⟨g, f⟩| ≤ pk(f) ∫_{−k}^{k} |g| ≤ ∥g∥1 pk(f) for f ∈ E. This shows D ⊂ E′. More generally, if 1 ≤ p ≤ ∞ and g ∈ Lp(R) satisfies supp(g) ⊂ [−N, N] for some N ∈ N, then putting h = 1[−N,N] we see |⟨g, f⟩| ≤ pN(f) ∫_{−N}^{N} |g| = pN(f) ∫R |gh| ≤ ∥g∥p∥h∥q pN(f) for f ∈ E, where 1/p + 1/q = 1. This shows that, for 1 ≤ p ≤ ∞, any g ∈ Lp(R) with bounded support belongs to E′.
(iv) The inclusions E′ ⊂ S′ ⊂ D′ are proper. Let g(x) = e^{x²}1(0,∞)(x). Then g ∈ L1loc(R) ⊂ D′, but g ∉ S′ because for f ∈ S defined as f(x) = e^{−x²}, we have ⟨g, f⟩ = ∫R fg = ∫_0^∞ 1 dx = ∞. Similarly, if g ≡ 1, then g ∈ S′ (being a polynomial) but g ∉ E′ since g ∈ E and ∫R g·g = ∞.
9When we say a general function has bounded/compact support, it means the function is identically zero outside
a bounded/compact set.
(v) (Locally finite measures are distributions) A Borel measure µ on R is locally finite10 if µ(K) <∞
for every compact set K ⊂ R. If µ is a locally finite Borel measure on R, then µ ∈ D′, where
we identify µ with the linear functional f 7→ ⟨µ, f⟩ :=∫R fdµ. This linear functional is indeed
continuous since for each k ∈ N, we have |⟨µ, f⟩| ≤ (2k+1)µ([−k, k])∥f∥∞ ≤ (2k+1)µ([−k, k])pk(f)
for every f ∈ Dk. If in addition µ is compactly supported, then a similar argument gives µ ∈ E ′.
In particular, for each a ∈ R, the Dirac measure δa, defined as δa(Y) = 1 if a ∈ Y and δa(Y) = 0 if a ∉ Y, belongs to E′, and ⟨δa, f⟩ = ∫R f dδa = f(a); thus δa is just the evaluation map at a.
(vi) (Cauchy's principal value PV 1/x) The map x ↦ 1/x is locally integrable on R \ {0}, but not on R, because ∫_0^1 (1/x)dx = lim_{ε→0} ∫_ε^1 (1/x)dx = lim_{ε→0} [log x]_ε^1 = ∞. Still, there is a distribution corresponding to this map, which we introduce now. Let f ∈ D, and write f(x) = f(0) + xg(x) with g ∈ E. Since f(−ε) − f(ε) = −ε(g(−ε) + g(ε)), integration by parts yields
∫_{|x|>ε} x^{−1}f(x)dx = (∫_{−∞}^{−ε} + ∫_ε^{∞}) x^{−1}f(x)dx = −ε(g(−ε) + g(ε)) log ε − (∫_{−∞}^{−ε} + ∫_ε^{∞}) Df(x) log |x| dx.
Since g(0) = Df(0), we have lim_{ε→0} ε(g(−ε) + g(ε)) log ε = 0. Hence lim_{ε→0} ∫_{|x|>ε} x^{−1}f(x)dx = −∫_{−∞}^{∞} Df(x) log |x| dx. The distribution PV 1/x, called the Cauchy principal value, is defined as
⟨PV 1/x, f⟩ = lim_{ε→0} ∫_{|x|>ε} x^{−1}f(x)dx = −∫R Df(x) log |x| dx
for f ∈ D. The function h(x) := log |x| belongs to L1loc(R) because ∫_0^b log x dx = lim_{ε→0} ∫_ε^b log x dx = lim_{ε→0} [x log x − x]_ε^b = b log b − b. Hence h is (induces) a distribution, and ⟨PV 1/x, f⟩ = −⟨h, Df⟩, from which we may deduce that PV 1/x is indeed a distribution.
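The two expressions for ⟨PV 1/x, f⟩ above can be compared numerically. The following sketch is an illustration added here, not part of the notes; the test function x e^{−x²} and the truncation points are arbitrary choices (the test function is Schwartz rather than compactly supported, but the integration-by-parts identity extends to it since all boundary terms vanish):

```python
import numpy as np
from scipy.integrate import quad

# Test function f(x) = x e^{-x^2} and its derivative
f  = lambda x: x * np.exp(-x**2)
df = lambda x: (1 - 2 * x**2) * np.exp(-x**2)

# Symmetric truncation: here f(x)/x = e^{-x^2} is not even singular,
# and the limit of the truncated integrals is sqrt(pi).
eps = 1e-8
pv = (quad(lambda x: f(x) / x, eps, 50)[0]
      + quad(lambda x: f(x) / x, -50, -eps)[0])

# Other expression: -∫ Df(x) log|x| dx; the integrand is even and the
# log singularity at 0 is integrable, so integrate over (0, 50) and double.
rhs = -2 * quad(lambda x: df(x) * np.log(x), 0, 50, limit=200)[0]

print(pv, rhs)   # both close to sqrt(pi) ≈ 1.77245
```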
Idea behind the definition of various operations on distributions: If T : D → D is a linear operator, then correspondingly there is a transpose operator T^t : D′ → D′ given by T^tϕ(f) = ϕ(Tf) for ϕ ∈ D′ and f ∈ D. The defining expression T^tϕ(f) = ϕ(Tf) may be written as ⟨T^tϕ, f⟩ = ⟨ϕ, Tf⟩. When ϕ = ϕg for a genuine function g, the defining expression becomes ∫(T^tg)(x)f(x)dx = ∫g(x)(Tf)(x)dx. This observation tells us how to define various operations on distributions.
Definition: Let ϕ ∈ D′ and f ∈ D.
(i) (Translation) Recall the notation fy(x) = f(x − y). Observe that ∫gy(x)f(x)dx = ∫g(z)f−y(z)dz. Hence we define the translation ϕy of ϕ as ⟨ϕy, f⟩ = ⟨ϕ, f−y⟩ for y ∈ R.
(ii) (Dilation/scaling) Let a ∈ R \ {0}. Since ∫g(ax)f(x)dx = ∫g(z)f(z/a)|a|^{−1}dz, we define the dilation ϕ(a·) of ϕ as ⟨ϕ(a·), f⟩ = ⟨ϕ, |a|^{−1}f(·/a)⟩. Remark: In Rn, ⟨ϕ(a·), f⟩ := ⟨ϕ, |a|^{−n}f(·/a)⟩.
10A locally finite Borel measure on spaces such as Rn is generally called a Radon measure.
(iii) (Reflection) Let f̃(x) = f(−x). Since ∫R g̃f = ∫R gf̃, we define ⟨ϕ̃, f⟩ = ⟨ϕ, f̃⟩.
(iv) (Smooth multiplication) Since ∫(hg)f = ∫g(hf), we define hϕ ∈ D′ as ⟨hϕ, f⟩ = ⟨ϕ, hf⟩ for h ∈ E. If h ∈ D, then hϕ ∈ E′, and hϕ is called a localization of ϕ.
(v) (Differentiation) Integration by parts gives ∫(Dg)f = [gf]_{−∞}^{∞} − ∫gDf = −∫gDf since f has compact support. Hence we define the derivative Dϕ of ϕ as ⟨Dϕ, f⟩ = −⟨ϕ, Df⟩ (do not forget the minus sign!). If ϕ comes from a function g, i.e., if ⟨ϕ, f⟩ = ∫fg, then Dϕ is called the distributional derivative of g, and is also written as Dg. In this sense, every g ∈ L1loc(R) has a distributional derivative.
Remark: The above operations are also defined for ϕ ∈ S ′ and ϕ ∈ E ′ with a modification: we need
to assume h ∈ D in (iv) to ensure that hϕ ∈ S ′ when ϕ ∈ S ′.
Example: (i) Let h = 1(0,∞) : R → R, which is called the Heaviside function. For f ∈ D, the Fundamental Theorem of Calculus gives ⟨Dh, f⟩ = −⟨h, Df⟩ = −∫_0^∞ Df = f(0) = ⟨δ0, f⟩. Hence Dh = δ0, the Dirac measure at 0. Note further that ⟨Dmδa, f⟩ = (−1)^m Dmf(a) for f ∈ E, a ∈ R and m ∈ N. (ii) From the earlier discussion, the distributional derivative of the locally integrable function x ↦ log |x| is PV 1/x, since ⟨PV 1/x, f⟩ = −⟨h, Df⟩ = ⟨Dh, f⟩ for h(x) = log |x|. (iii) Let g : R → C be absolutely continuous (then g ∈ L1loc(R) ⊂ D′), let h ∈ L1loc(R), and assume g′(x) = h(x) for almost every x ∈ R. For any f ∈ D, integration by parts gives ⟨Dg, f⟩ = −⟨g, Df⟩ = −∫gDf = −[gf]_{−∞}^{∞} + ∫fh = 0 + ∫fh = ⟨h, f⟩. Thus Dg = h in the sense of distributions.
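The computation Dh = δ0 in (i) can be sanity-checked numerically; the snippet below is illustrative only (the Gaussian test function is an arbitrary choice), evaluating −⟨h, Df⟩ and comparing it with f(0) = ⟨δ0, f⟩:

```python
import numpy as np
from scipy.integrate import quad

f  = lambda x: np.exp(-x**2)            # test function (Schwartz)
df = lambda x: -2 * x * np.exp(-x**2)   # its derivative Df

# <Dh, f> = -<h, Df> = -∫_0^∞ Df(x) dx for the Heaviside function h = 1_(0,∞)
pairing = -quad(df, 0, np.inf)[0]

print(pairing)   # equals f(0) = 1, i.e. <δ_0, f>
```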
Exercise-29: (Smooth partition of unity) If K ⊂ R is a nonempty compact set and U1, . . . , Uk ⊂ R
are nonempty open sets covering K, then there are g1, . . . , gk ∈ D such that
(i) 0 ≤ gj ≤ 1 and supp(gj) ⊂ Uj for 1 ≤ j ≤ k.
(ii) ∑_{j=1}^{k} gj(x) = 1 for every x ∈ K.
Here, we say {gj : 1 ≤ j ≤ k} is a smooth partition of unity for K subordinate to {Uj : 1 ≤ j ≤ k}.
[Hint : See my notes Introduction to Manifolds.]
Now we will explain why the members of E ′ are called ‘distributions with compact support’.
Definition: We say ϕ ∈ D′ vanishes (or ϕ is 0) in an open set U ⊂ R if ⟨ϕ, f⟩ = 0 for every f ∈ D
with supp(f) ⊂ U . We define the support of ϕ, supp(ϕ), as the complement of the largest open
subset U ⊂ R on which ϕ vanishes. Exercise-30(i) below ensures that this definition is meaningful.
Example: For the Dirac measure δa at a ∈ R, we have supp(δa) = {a}.
Exercise-30: (i) If ϕ ∈ D′ vanishes on open sets U1, . . . , Uk ⊂ R, then ϕ vanishes on ∪_{j=1}^{k} Uj.
(ii) If ϕ ∈ D′ and f ∈ D have disjoint supports, then ⟨ϕ, f⟩ = 0.
(iii) Let ϕ ∈ D′, and f, g ∈ D. If f and g agree on a neighborhood of supp(ϕ), then ⟨Djϕ, f⟩ =
⟨Djϕ, g⟩ for every j ≥ 0.
(iv) If g ∈ D, and ϕg ∈ D′ is given as ⟨ϕg, f⟩ =∫R fg, then supp(ϕg) = supp(g).
[Hint: (i) Let f ∈ D be with K := supp(f) ⊂ ∪_{j=1}^{k} Uj. Let {gj : 1 ≤ j ≤ k} be a smooth partition of unity for K subordinate to {Uj : 1 ≤ j ≤ k} given by Exercise-29. Then f = ∑_{j=1}^{k} gjf, and by hypothesis ⟨ϕ, gjf⟩ = 0 for 1 ≤ j ≤ k. Hence ⟨ϕ, f⟩ = 0. Statement (iii) is a consequence of (ii) and linearity. (iv) Assume g is real valued. Let K = supp(g) and L = supp(ϕg). Clearly, L ⊂ K. If L ≠ K, using the smoothness of g, find a nondegenerate closed interval J ⊂ K \ L such that g > 0 (or g < 0) on J. Let h ∈ D be a bump function such that h ≥ 0, h ≡ 1 on a neighborhood of J where g > 0, and supp(h) ∩ L = ∅. Then 0 < ∫_J g ≤ ∫R hg = ⟨ϕg, h⟩ = 0, a contradiction.]
Remark: In Exercise-30(iii), agreement on supp(ϕ) alone is not sufficient: f(x) = x and g ≡ 0 agree on supp(δ0) = {0}, but ⟨Dδ0, f⟩ = −⟨δ0, Df⟩ = −Df(0) = −1 ≠ 0 = −Dg(0) = ⟨Dδ0, g⟩.
[127] (Justification of a name) Let ϕ ∈ D′. Then ϕ ∈ E ′ ⇔ supp(ϕ) is compact.
Proof. ⇒: If supp(ϕ) is not compact, there is a sequence (fk) in D such that [−k, k] ∩ supp(fk) = ∅ and ak := ⟨ϕ, fk⟩ ≠ 0 for every k ∈ N. Let hk = ak^{−1}fk. Since [−N, N] ∩ supp(Djhk) = ∅ for every k ≥ N and every j ≥ 0, we have pN(hk) = 0 for every k ≥ N, and thus (hk) → 0 in E. On the other hand, ⟨ϕ, hk⟩ = 1 ↛ 0. Hence ϕ ∉ E′. We add the remark that though (hk) → 0 in E, we have (hk) ↛ 0 in D since ∪_{k=1}^{∞} supp(hk) is not a bounded set.
⇐: Assume supp(ϕ) is compact, and let k ∈ N be such that supp(ϕ) ⊂ (−k, k). Since ϕ ∈ D′, by [126](i) there exist C > 0 and N ∈ N such that |⟨ϕ, f⟩| ≤ CpN(f) for every f ∈ Dk. Let h ∈ D, h ≥ 0, be a bump function with h ≡ 1 on a neighborhood of supp(ϕ) and supp(h) ⊂ [−k, k]. Extend ϕ to E by putting ⟨ϕ, f⟩ = ⟨ϕ, hf⟩ for f ∈ E. The extension is linear. For every f ∈ E, we have hf ∈ Dk, and |⟨ϕ, f⟩| = |⟨ϕ, hf⟩| ≤ CpN(hf) ≤ CC′pN(f), where the constant C′ = C′(h) is obtained by applying the product rule to Dj(hf) for 0 ≤ j ≤ N. By [126](iii), the extension ϕ belongs to E′. �
[128] A summary of some important inclusions that we have seen so far is:
(i) D ⊂ S ⊂ E ⊂ L1loc(R) ⊂ D′.
(ii) S ⊂ Lp(R) ⊂ S ′ for 1 ≤ p <∞; and also S ⊂ C∞0 (R) ⊂ L∞(R) ⊂ S ′.
(iii) D ⊂ E ′ ⊂ S ′ ⊂ D′.
Remark: By Exercise-23(ii), we know D is dense in E . A similar argument will show E ′ is dense in
D′ as follows. Given ϕ ∈ D′, choose g ∈ D with g ≡ 1 on [−1, 1], put gn(x) = g(x/n), and check
that gnϕ ∈ E ′. We have ⟨gnϕ, f⟩ = ⟨ϕ, gnf⟩ → ⟨ϕ, f⟩ for every f ∈ D, and this shows (gnϕ) → ϕ in
D′, completing the argument. Later we will see that D is dense in D′, and S is dense in S ′.
13. Convolution and distributions
Recall the notation g̃(y) = g(−y), so that g̃x(y) = g̃(y − x) = g(x − y). For h, g ∈ D observe that h ∗ g(x) = ∫h(y)g(x − y)dy = ∫h(y)g̃x(y)dy = ⟨h, g̃x⟩. This motivates the following definition.
Definition: (Convolution of a distribution by a smooth function) Let ϕ ∈ D′ and g ∈ D; or ϕ ∈ S′ and g ∈ S; or ϕ ∈ E′ and g ∈ E. Define ϕ ∗ g : R → C as ϕ ∗ g(x) = ⟨ϕ, g̃x⟩; below we abbreviate g̃x to gx, so that gx(y) = g(x − y). To prove basic facts about convolution, we require the following technical result:
Exercise-31: Let ϕ ∈ D′ and G ∈ C∞(R²), and assume that for each x ∈ R, there are δ > 0 and k ∈ N with supp(G(x + t, ·)) ⊂ [−k, k] for every t ∈ (−δ, δ). Then h : R → C defined as h(x) = ⟨ϕ, G(x, ·)⟩ belongs to E and Dmh(x) = ⟨ϕ, ∂mG(x, ·)/∂xm⟩ for every m ∈ N. [Hint: Fix x ∈ R, and let δ > 0 and k ∈ N be as given. Note that t^{−1}(h(x + t) − h(x)) = ⟨ϕ, t^{−1}(G(x + t, ·) − G(x, ·))⟩ since ϕ is linear. Check by hypothesis that t^{−1}(G(x + t, ·) − G(x, ·)) → ∂G(x, ·)/∂x in D as t → 0. Hence by the continuity of ϕ we get lim_{t→0} t^{−1}(h(x + t) − h(x)) = ⟨ϕ, lim_{t→0} t^{−1}(G(x + t, ·) − G(x, ·))⟩ = ⟨ϕ, ∂G(x, ·)/∂x⟩. Now repeat.]
Remark: The following is an important result in the theory of distributions. It tells you that even
if a distribution ϕ is given by a very rough function (for example, by a member of L1loc(R)), a
convolution ϕ ∗ g of ϕ by an appropriate smooth function g produces a smooth function.
[129] (i) Let ϕ ∈ D′ and g ∈ D; or ϕ ∈ S ′ and g ∈ S; or ϕ ∈ E ′ and g ∈ E . Then ϕ ∗ g ∈ E .
(ii) If ϕ ∈ E ′ and g ∈ D, then ϕ ∗ g ∈ D.
(iii) If ϕ ∈ E ′ and g ∈ S, then ϕ ∗ g ∈ S.
(iv) In all the above cases, we have Dm(ϕ ∗ g) = ϕ ∗Dmg for every m ∈ N.
Proof. Step-1: First suppose ϕ ∈ D′ and g ∈ D. Define G : R² → C as G(x, y) = gx(y) = g(x − y). Then the hypothesis of Exercise-31 is satisfied. Hence ϕ ∗ g ∈ E. Also ∂mG/∂xm(x, y) = ∂mg(x − y)/∂xm = (Dmg)x(y), and hence Dm(ϕ ∗ g) = ⟨ϕ, (Dmg)x⟩ = ϕ ∗ Dmg for every m ∈ N.
Step-2: Let ϕ ∈ E′ and g ∈ E. Let h ∈ D be a bump function with h ≡ 1 on a neighborhood V of supp(ϕ). Then ⟨ϕ, f⟩ = ⟨ϕ, hf⟩ for every f ∈ E. In particular, ϕ ∗ g(x) = ⟨ϕ, hgx⟩. Letting G(x, y) = h(y)g(x − y) and applying Exercise-31, we deduce that ϕ ∗ g ∈ E. Since h ≡ 1 on the neighborhood V of supp(ϕ), we also have ∂mG/∂xm(x, y) = 1 · ∂mg(x − y)/∂xm = (Dmg)x(y) for y ∈ V, and hence Dm(ϕ ∗ g) = ⟨ϕ, (Dmg)x⟩ = ϕ ∗ Dmg for every m ∈ N.
Step-3 : Let ϕ ∈ E ′ and g ∈ D. Then there is k ∈ N such that ϕ and gx have disjoint supports for
|x| ≥ k, and hence ϕ ∗ g(x) = ⟨ϕ, gx⟩ = 0 for |x| ≥ k, which shows supp(ϕ ∗ g) is also compact.
Step-4: (hint) Let ϕ ∈ S′ and g ∈ S. An argument similar to (but slightly more computational than) that in Step-1 will show ϕ ∗ g ∈ E and Dm(ϕ ∗ g) = ϕ ∗ Dmg; see Exercise 2.3.5(a) in Grafakos, Classical Fourier Analysis.
Step-5: If ϕ ∈ E′ ⊂ S′ and g ∈ S, then ϕ ∗ g ∈ E and Dm(ϕ ∗ g) = ϕ ∗ Dmg as hinted in Step-4. To show ϕ ∗ g ∈ S, we need to show si,j(ϕ ∗ g) < ∞ for every i, j ≥ 0. Since Dj(ϕ ∗ g) = ϕ ∗ Djg and Djg ∈ S, we may assume j = 0. Thus we need to show ∥x^i(ϕ ∗ g)∥∞ < ∞. Since g ∈ S, we have (1 + |x|)^i Dmg ∈ S, and hence for every N ∈ N there is βN > 0 such that ∥(1 + |x|)^i Dmg∥∞ ≤ βN for 0 ≤ m ≤ N. Since ϕ ∈ E′, there exist C > 0 and N ∈ N by [126](iii) such that |⟨ϕ, f⟩| ≤ CpN(f) = C max{|Dmf(y)| : 0 ≤ m ≤ N, |y| ≤ N} for every f ∈ E. Hence |x^i(ϕ ∗ g)(x)| = |x^i⟨ϕ, gx⟩| = |⟨ϕ, x^i gx⟩| ≤ C max{|x^i ∂mg/∂ym(x − y)| : 0 ≤ m ≤ N, |y| ≤ N} ≤ CβN max{|x|^i/(1 + |x − y|)^i : |y| ≤ N}. From this estimate, we conclude that ∥x^i(ϕ ∗ g)∥∞ < ∞ by observing the following: if |x| ≥ 2N, then |x|^i/(1 + |x − y|)^i ≤ |x|^i/(1 + |x|/2)^i ≤ 2^i for |y| ≤ N. �
Example: Let g ∈ E . Then δ0 ∗ g(x) = ⟨δ0, gx⟩ = gx(0) = g(x− 0) = g(x). That is, δ0 ∗ g = g.
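The identity δ0 ∗ g = g, together with the smoothing remark before [129], can be illustrated numerically: convolving with a narrow approximate identity g_t nearly reproduces the function. This is only a sketch added for illustration; the Gaussian profile, the width t, and the sample point are arbitrary choices.

```python
import numpy as np
from scipy.integrate import quad

g = lambda y: np.exp(-y**2 / 2) / np.sqrt(2 * np.pi)  # smooth profile with ∫ g = 1
f = lambda x: np.cos(x)                                # function to convolve

def conv(t, x):
    # (g_t * f)(x) with g_t(y) = t^{-1} g(y/t); g_t -> δ_0 as t -> 0,
    # and g_t * f is smooth for every t > 0.
    return quad(lambda y: g(y / t) / t * f(x - y), -1, 1,
                points=[0], limit=200)[0]

print(conv(0.01, 0.7), f(0.7))   # nearly equal: g_t * f -> δ_0 * f = f
```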
Definition and Remark: For f, g, h ∈ D, note that ⟨h ∗ g, f⟩ = ∫(h ∗ g)(x)f(x)dx = ∫∫h(y)g(x − y)f(x)dxdy = ∫∫h(y)f(x)g̃(y − x)dxdy = ∫h(y)(f ∗ g̃)(y)dy = ⟨h, f ∗ g̃⟩, where g̃(x) = g(−x). This motivates us to define ⟨ϕ ∗ g, f⟩ = ⟨ϕ, f ∗ g̃⟩ for various types of distributions ϕ and appropriate smooth functions g, f. We may check that this is compatible with our earlier definition of ϕ ∗ g as follows: ⟨ϕ ∗ g, f⟩ = ∫(ϕ ∗ g)(x)f(x)dx = ∫⟨ϕ, gx⟩f(x)dx = ∫⟨ϕ, f(x)gx⟩dx = ⟨ϕ, ∫f(x)gx(·)dx⟩ = ⟨ϕ, y ↦ ∫f(x)g(x − y)dx⟩ = ⟨ϕ, f ∗ g̃⟩, where taking the integral sign ∫ inside ⟨·, ·⟩ is justified by considering the limit of the Riemann sums defining the integral and using the linearity and continuity of ϕ.
Exercise-32: (i) Let ϕ ∈ D′. Then T : D → E defined as Tg = ϕ ∗ g is linear, continuous, and
commutes with translations (i.e., ϕ ∗ gy = (ϕ ∗ g)y for every y ∈ R).
(ii) (An important fact about convolution) Conversely, if a continuous linear map T : D → E
commutes with translations, then there is a unique ϕ ∈ D′ such that Tg = ϕ ∗ g for every g ∈ D.
[Hint: (ii) Define ϕ ∈ D′ as ⟨ϕ, g⟩ = T g̃(0) for g ∈ D, where g̃(x) = g(−x). Since T commutes with translations, we have ϕ ∗ g(x) = ⟨ϕ, gx⟩ = T((gx)~)(0) = T(g−x)(0) = (Tg)−x(0) = Tg(x) for every x ∈ R.]
Remark: There are results similar to Exercise-32(ii) in other settings. For instance, Theorem 2.5.2
of Grafakos, Classical Fourier Analysis says in particular (with a more involved proof) that if
1 ≤ p, q ≤ ∞ and T : Lp(R) → Lq(R) is a bounded linear operator commuting with translations,
then there is a unique ϕ ∈ S ′ such that Tg = ϕ ∗ g for every g ∈ Lp(R).
It is also possible to define the convolution of two distributions when at least one of them has
compact support. For this, one has to develop the rather technical theory of tensor product of two
distributions. The main points are stated without proof in [130] below. The student may refer
to Chapter 4 of F.G. Friedlander, Introduction to the Theory of Distributions for the proofs. For
f, g : R → C, let f⊗g : R² → C be (f⊗g)(x, y) = f(x)g(y). Let D(R²) = C∞c(R²). It is known that span{f⊗g : f, g ∈ D} is dense in D(R²).
[130] Let ϕ, ψ ∈ D′, with at least one having compact support. Then there is a unique distribution ϕ⊗ψ ∈ D′(R²), called the tensor product of ϕ and ψ, with the defining property that ⟨ϕ⊗ψ, f⊗g⟩ = ⟨ϕ, f⟩⟨ψ, g⟩ for every f, g ∈ D. Moreover, we have:
(i) ⟨ϕ⊗ψ, F⟩ = ⟨ϕ, x ↦ ⟨ψ, F(x, ·)⟩⟩ = ⟨ψ, y ↦ ⟨ϕ, F(·, y)⟩⟩ for every F ∈ D(R²).
(ii) supp(ϕ⊗ψ) = supp(ϕ) × supp(ψ) ⊂ R².
(iii) (ϕ, ψ) ↦ ϕ⊗ψ is linear and continuous in each variable.
(iv) (∂^{k+m}/∂x^k∂y^m)(ϕ⊗ψ) = Dkϕ ⊗ Dmψ for every k, m ∈ N.
(v) δx ⊗ δy = δ(x,y) for every (x, y) ∈ R².
Remark: The idea of proof is to use the first equality in (i) as the definition of ϕ⊗ψ, and then to
prove that this indeed defines a distribution with the listed properties.
Note that ⟨h ∗ g, f⟩ = ∫(h ∗ g)(z)f(z)dz = ∫∫h(x)g(z − x)f(z)dxdz = ∫∫h(x)g(y)f(x + y)dxdy.
[131] Let ϕ, ψ ∈ D′, with at least one of them having compact support. Then their convolution
ϕ ∗ ψ ∈ D′ is defined as ⟨ϕ ∗ ψ, f⟩ = ⟨ϕ⊗ψ, (x, y) 7→ f(x+ y)⟩ for f ∈ D. We have:
(i) ⟨ϕ ∗ ψ, f⟩ = ⟨ϕ, x 7→ ⟨ψ, f(x+ ·)⟩⟩ = ⟨ψ, y 7→ ⟨ϕ, f(·+ y)⟩⟩, and hence ϕ ∗ ψ = ψ ∗ ϕ.
(ii) supp(ϕ ∗ ψ) ⊂ supp(ϕ) + supp(ψ).
(iii) (ϕ, ψ) 7→ ϕ ∗ ψ is linear and continuous in each variable.
(iv) Dm(ϕ ∗ ψ) = Dmϕ ∗ ψ = ϕ ∗Dmψ for every m ∈ N.
(v) ϕ ∗ δ0 = ϕ.
Proof. All are essentially direct consequences of [130]. We just indicate a proof for (ii). Let f ∈ D be with supp(f) ∩ (supp(ϕ) + supp(ψ)) = ∅. Then the support of the map (x, y) ↦ f(x + y) is disjoint from supp(ϕ) × supp(ψ), and hence by [130](ii) we get ⟨ϕ ∗ ψ, f⟩ = 0. �
[132] (i) D (considered as a subset of E ′) is dense in D′.
(ii) S is dense in S ′.
Proof. (i) By the Remark after [128], E′ is dense in D′. So it suffices to show D is dense in E′. Let ϕ ∈ E′. Pick g ∈ D with g ≥ 0 and ∫g = 1, and put gn(x) = ng(nx). Then (gn) → δ0 in D′ (check). Therefore (ϕ ∗ gn) → ϕ ∗ δ0 = ϕ by [131](iii) and [131](v). Also ϕ ∗ gn ∈ D by [129](ii).
(ii) An argument similar to that in (i) works since D ⊂ S ⊂ S′ ⊂ D′. �
14. Some structure theorems about distributions
[133] (i) Let ϕ ∈ E ′ be with supp(ϕ) = {0}. Then there exists N ∈ N such that for every f ∈ E
with Djf(0) = 0 for 0 ≤ j ≤ N , we have ⟨ϕ, f⟩ = 0.
(ii) (Structure theorem for distribution supported on a point) Let ϕ ∈ E ′ be with supp(ϕ) = {a}.
Then ϕ = ∑_{j=0}^{N} cj Djδa for finitely many constants cj ∈ C, where δa is the Dirac distribution at a.
Proof. (i) Since ϕ ∈ E′, there exist C > 0 and N ∈ N such that |⟨ϕ, f⟩| ≤ CpN(f) for every f ∈ E. Fix f ∈ E with Djf(0) = 0 for 0 ≤ j ≤ N. Let g ∈ E be such that g(x) = 0 for |x| ≤ 1 and g(x) = 1 for |x| ≥ 2. Put gk(x) = g(kx). Then gk(x) = 0 for |x| ≤ 1/k and gk(x) = 1 for |x| ≥ 2/k. Since fgk ≡ 0 in a neighborhood of 0, we have supp(ϕ) ∩ supp(fgk) = ∅, and hence ⟨ϕ, fgk⟩ = 0 by Exercise-30(ii). Therefore, |⟨ϕ, f⟩| = |⟨ϕ, f − fgk⟩| ≤ CpN(f − fgk) for every k ∈ N. Thus to prove ⟨ϕ, f⟩ = 0, it suffices to show lim_{k→∞} pN(f − fgk) = 0. We make four observations:
(a) Since f = fgk for |x| ≥ 2/k, we have pN(f − fgk) = max{|Dj(f − fgk)(x)| : |x| ≤ 2/k, 0 ≤ j ≤ N}.
(b) Dj(f − fgk) = Dj(f(1 − gk)) = ∑_{i=0}^{j} cij Dif Dj−i(1 − gk) by the product rule.
(c) ∥Dj−i(1 − gk)∥∞ ≤ k^{j−i}∥Dj−i(1 − g)∥∞ for 0 ≤ i ≤ j ≤ N and every k ∈ N.
(d) There is a constant M1 > 0 with max{|Dif(x)| : |x| ≤ 2/k} ≤ M1/k^{N+1−i} for 0 ≤ i ≤ N, since Djf(0) = 0 for 0 ≤ j ≤ N (to see this, consider |xi| ≤ 2/k, pick |xi+1| ≤ 2/k with |xi Di+1f(xi+1)| = |Dif(xi) − Dif(0)| = |Dif(xi)| by the Mean Value Theorem, and repeat this till one gets xN+1).
Combining (c) with (d), we get a constant M2 > 0 such that max{|Dif(x)Dj−i(1 − gk)(x)| : |x| ≤ 2/k} ≤ M2/k^{N+1−j} ≤ M2/k for 0 ≤ i ≤ j ≤ N. Combining this with (a) and (b), we get a constant M > 0 such that pN(f − fgk) ≤ M/k → 0 as k → ∞.
(ii) After a translation we may assume a = 0. Let N ∈ N be as given by part (i). Consider f ∈ E. By Taylor's theorem, there is h ∈ E such that f(x) = ∑_{j=0}^{N} Djf(0) x^j/j! + h(x) for x ∈ R. Differentiating this repeatedly and substituting x = 0, we see Djh(0) = 0 for 0 ≤ j ≤ N. Hence ⟨ϕ, h⟩ = 0 by (i), and therefore ⟨ϕ, f⟩ = ∑_{j=0}^{N} Djf(0)⟨ϕ, x^j/j!⟩. Recall that ⟨Djδ0, f⟩ = (−1)^j⟨δ0, Djf⟩ = (−1)^j Djf(0), and put cj = (−1)^j⟨ϕ, x^j/j!⟩. Then ⟨ϕ, f⟩ = ∑_{j=0}^{N} cj⟨Djδ0, f⟩ for f ∈ E, and hence ϕ = ∑_{j=0}^{N} cj Djδ0. �
Exercise-33: Let β ∈ D′ be the distribution induced by the constant function 1 (which is locally integrable), i.e., ⟨β, f⟩ = ∫R f · 1 = ∫R f for f ∈ D. Then ker(β) = {f ∈ D : ∫R f = 0} is a vector subspace of D of codimension one. Let J : ker(β) → D be Jf(x) = ∫_{−∞}^{x} f(y)dy. Then J is well-defined, i.e., Jf ∈ D for f ∈ ker(β), and J is linear and continuous. Moreover, DJf = f for f ∈ ker(β) and JDf = f for f ∈ D. In particular, {f ∈ D : ∫R f = 0} = ker(β) = D(D), the range of the differentiation operator D : D → D.
[Hint: To show J is continuous, show that J : Dk ∩ ker(β) → D is continuous at 0 for each k ∈ N.]
[134] (i) (Every distribution has a primitive) If ϕ ∈ D′, then there is ψ ∈ D′ with Dψ = ϕ.
(ii) (Determined up to a constant) If ϕ, ψ1, ψ2 ∈ D′ are with Dψ1 = ϕ = Dψ2, then ψ1 − ψ2 is a
constant in the sense that there is c ∈ C with ⟨ψ1 − ψ2, f⟩ = ⟨c, f⟩ = c∫f for every f ∈ D.
(iii) If Dψ = 0 for ψ ∈ D′, then ψ is a constant, i.e., ∃ c ∈ C with ⟨ψ, f⟩ = ⟨c, f⟩ = c∫f for f ∈ D.
Proof. (i) Let β and J be as in Exercise-33. Fix h ∈ D with ⟨β, h⟩ = ∫R h = 1. Then h spans a one-dimensional subspace complementary to ker(β). Define the projection P : D → ker(β) as Pf = f − ⟨β, f⟩h (check that ⟨β, Pf⟩ is indeed zero). Then any f ∈ D can be written as f = ⟨β, f⟩h + Pf. We define ψ : D → C as ⟨ψ, f⟩ = −⟨ϕ, JPf⟩, which is obviously linear. If (fn) → 0 in D, then Pfn = fn − ⟨β, fn⟩h → 0 in ker(β) ⊂ D, and hence (JPfn) → 0 in D by the continuity of J. Therefore, ⟨ψ, fn⟩ = −⟨ϕ, JPfn⟩ → 0 in C. This shows that ψ is continuous at 0, and hence ψ ∈ D′. For f ∈ D, we have PDf = Df since Df ∈ ker(β), and also JDf = f by Exercise-33; hence ⟨Dψ, f⟩ = −⟨ψ, Df⟩ = ⟨ϕ, JPDf⟩ = ⟨ϕ, JDf⟩ = ⟨ϕ, f⟩. Thus Dψ = ϕ.
(ii) Write f ∈ D as f = ⟨β, f⟩h + Pf as above. We have ⟨ψ1 − ψ2, Pf⟩ = ⟨ψ1 − ψ2, DJPf⟩ = −⟨D(ψ1 − ψ2), JPf⟩ = −⟨0, JPf⟩ = 0, and therefore ⟨ψ1 − ψ2, f⟩ = ⟨ψ1 − ψ2, ⟨β, f⟩h⟩ = c⟨β, f⟩ = c∫f, where c := ⟨ψ1 − ψ2, h⟩.
(iii) This follows from (ii) since we also have D0 = 0. �
[135] (Local structure theorem for distributions - every distribution is locally a finite order derivative of a continuous function) Let ϕ ∈ D′ and a > 0. Then there exist g ∈ Cc(R) with supp(g) ⊂ [−a, a] and an integer m ≥ 0 such that for every f ∈ D with supp(f) ⊂ (−a, a) (note the open interval), we have ⟨ϕ, f⟩ = ⟨Dmg, f⟩.
Proof. Let Γ = {f ∈ D : supp(f) ⊂ (−a, a)} and let k ≥ a. By [126](i), there are C > 0 and N ≥ k such that |⟨ϕ, f⟩| ≤ CpN(f) for every f ∈ Dk, and in particular for every f ∈ Γ. Fix f ∈ Γ and 0 ≤ j ≤ N, and let x0 ∈ (−a, a) be such that |Djf(x0)| = ∥Djf∥∞. By the Mean Value Theorem, there is y0 ∈ (−a, x0) with Dj+1f(y0) = (Djf(x0) − Djf(−a))/(x0 + a) = (Djf(x0) − 0)/(x0 + a), and hence ∥Djf∥∞ = |Djf(x0)| = |x0 + a||Dj+1f(y0)| ≤ (2a + 1)∥Dj+1f∥∞. Applying this observation repeatedly, we see pN(f) = max{|Djf(x)| : 0 ≤ j ≤ N, |x| ≤ N} ≤ (2a + 1)^N ∥DNf∥∞ for f ∈ Γ. Putting C0 = C(2a + 1)^N, we conclude |⟨ϕ, f⟩| ≤ C0∥DNf∥∞ for every f ∈ Γ.
Further note that DNf ∈ Γ for f ∈ Γ and DNf(x) = ∫_{−a}^{x} DN+1f(y)dy for x ∈ (−a, a), which shows ∥DNf∥∞ ≤ ∥DN+1f∥1 for f ∈ Γ. Hence |⟨ϕ, f⟩| ≤ C0∥DN+1f∥1 for every f ∈ Γ. Let DN+1(Γ) = {DN+1f : f ∈ Γ} and define the linear functional ψ : DN+1(Γ) → C as ⟨ψ, DN+1f⟩ = ⟨ϕ, f⟩. If f1, f2 ∈ Γ and DN+1f1 = DN+1f2, then f1 = f2 since Djf1 and Djf2 have compact
supports for every j ≥ 0, and therefore ψ is well-defined. From the estimate above, |⟨ψ, DN+1f⟩| ≤ C0∥DN+1f∥1, and thus ψ : (DN+1(Γ), ∥ · ∥1) → C is also continuous.
By the Hahn-Banach theorem, ψ has a continuous linear extension ψ : L1(−a, a) → C. Since L∞(−a, a) is the dual of L1(−a, a), there is h ∈ L∞(−a, a) such that ⟨ψ, f⟩ = ∫_{−a}^{a} fh for every f ∈ L1(−a, a). Put h(x) = 0 for |x| ≥ a and note h ∈ D′. For f ∈ Γ we have ⟨ϕ, f⟩ = ⟨ψ, DN+1f⟩ = ∫_{−a}^{a} hDN+1f = ∫R hDN+1f = ⟨h, DN+1f⟩ = (−1)^{N+1}⟨DN+1h, f⟩ = ⟨DN+1h0, f⟩, where h0 = (−1)^{N+1}h. If we define g : R → C as g(x) = ∫_{−∞}^{x} h0(y)dy, then g is continuous with supp(g) ⊂ [−a, a] and Dg = h0. And ⟨ϕ, f⟩ = ⟨DN+1h0, f⟩ = ⟨DN+2g, f⟩ for every f ∈ Γ. �
[136] (Global structure theorem for distributions with compact support - every distribution with
compact support is a finite sum of finite order derivatives of continuous functions) If ϕ ∈ E ′ and b > 0
is with supp(ϕ) ⊂ (−b, b), then there are finitely many functions fj ∈ Cc(R) with supp(fj) ⊂ (−b, b)
such that ϕ = ∑_{j=0}^{m} Djfj.
Proof. Let 0 < a < b be chosen with supp(ϕ) ⊂ (−a, a). By [135], there are g ∈ Cc(R) with supp(g) ⊂ [−a, a] ⊂ (−b, b) and m ≥ 0 such that ⟨ϕ, f⟩ = ⟨Dmg, f⟩ for every f ∈ D with supp(f) ⊂ (−a, a). Let h ∈ D be a bump function satisfying h ≡ 1 on supp(ϕ) and supp(h) ⊂ (−a, a). Then for any f ∈ E, we have supp(fh) ⊂ (−a, a), and
⟨ϕ, f⟩ = ⟨ϕ, fh⟩ = ⟨Dmg, fh⟩ = (−1)^m⟨g, Dm(fh)⟩ = (−1)^m ∑_{j=0}^{m} cj⟨g, Djf Dm−jh⟩.
Putting gj = (−1)^m cj g Dm−jh, which does not depend on f, we see that gj ∈ Cc(R) with supp(gj) ⊂ [−a, a] ⊂ (−b, b) and
⟨ϕ, f⟩ = (−1)^m ∑_{j=0}^{m} cj ∫g Djf Dm−jh = ∑_{j=0}^{m} ∫gj Djf = ∑_{j=0}^{m} ⟨gj, Djf⟩ = ∑_{j=0}^{m} (−1)^j⟨Djgj, f⟩.
Letting fj = (−1)^j gj, we get the required result. �
Definition: We say g ∈ C(R) is of polynomial growth if there are constants C > 0, M > 0 such that
|g(x)| ≤ C(1 + |x|)M for every x ∈ R.
[137] (Structure theorem for tempered distributions - every tempered distribution is a finite order
derivative of a continuous function of polynomial growth) If ϕ ∈ S ′, then there exist m ∈ N and
g ∈ C(R) of polynomial growth such that ⟨ϕ, f⟩ = ⟨Dmg, f⟩ for every f ∈ S.
Proof. Step-1: (Sketch) First assume supp(ϕ) ⊂ (0, ∞). Since ϕ ∈ S′, there exist C > 0 and N ∈ N such that |⟨ϕ, f⟩| ≤ C ∑_{0≤i,j≤N} si,j(f) for every f ∈ S. Choose h0 ∈ E with h0 ≡ 1 in a neighborhood
of supp(ϕ) and supp(h0) ⊂ (0, ∞). Since ⟨ϕ, f⟩ = ⟨ϕ, h0f⟩, we can find C′ = C′(h0) > 0, by applying the product rule of differentiation in si,j(h0f), so that
|⟨ϕ, f⟩| = |⟨ϕ, h0f⟩| ≤ C′ ∑_{0≤i,j≤N} sup{|x^i Djf(x)| : x > 0} for every f ∈ S. (∗)
Let h : R → C be h(x) = x^N/N! for x > 0 and h(x) = 0 for x ≤ 0, so that DNh = 1(0,∞) and DN+1h = δ0. Let g = ϕ ∗ h, which is defined as ϕ ∗ h(x) = ⟨ϕ, y ↦ h(x − y)⟩; see the beginning of the previous section. One then verifies that the continuous function g is of polynomial growth with the help of (∗), and checks that DN+1g = DN+1(ϕ ∗ h) = ϕ ∗ DN+1h = ϕ ∗ δ0 = ϕ; see Section 8.3 of F.G. Friedlander, Introduction to the Theory of Distributions for the details.
Step-2 : In the general case, choose h1, h2 ∈ E with h1, h2 ≥ 0, supp(h1) ⊂ (−∞, 1), supp(h2) ⊂
(0,∞), and supp(h1) ∪ supp(h2) = R. Putting gj = hj/(h1 + h2) for j = 1, 2, we see gj ∈ E ,
0 ≤ g1, g2 ≤ 1, g1+ g2 = 1, supp(g1) ⊂ (−∞, 1) and supp(g2) ⊂ (0,∞); in other words, {g1, g2} is a
smooth partition of unity for R subordinate to the open cover {(−∞, 1), (0,∞)}. Since ϕ = g1ϕ+g2ϕ
with supp(g1ϕ) ⊂ (−∞, 1) and supp(g2ϕ) ⊂ (0,∞), we may apply the argument in step-1 to each
of g1ϕ and g2ϕ to deduce the required result about ϕ. �
15. Fourier transform on R: basics
For r ∈ R, let er : R → C be er(x) = e^{2πirx} for x ∈ R. Note that er ∈ L∞(R) for every r ∈ R. Recall from Exercise-5 that f̂(n) = f̂(n) · 1 = f̂(n)en(0) = f ∗ en(0) for f ∈ L1(T). This motivates the following definition.
Definition: The Fourier transform f̂ : R → C of f ∈ L1(R) is defined as f̂(y) = f ∗ ey(0) = ∫R f(x)e−y(x)dx for y ∈ R. For example, let u ∈ (0, ∞) and f = 1[−u,u]. Then f̂(0) = 2u and f̂(y) = ∫_{−u}^{u} e−y(x)dx = (e−y(−u) − e−y(u))/(2πiy) = sin(2πuy)/(πy) for y ∈ R \ {0}.
Remark: The integral defining f̂(y) is a global integral, over the whole of R. Hence, if we change f on a small interval, the value of f̂(y) may change for every y ∈ R.
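The formula f̂(y) = sin(2πuy)/(πy) for f = 1[−u,u] is easy to verify numerically; the following sketch (with arbitrarily chosen u and y) computes the defining integral directly:

```python
import numpy as np
from scipy.integrate import quad

u, y = 0.75, 1.3

# fhat(y) = ∫_{-u}^{u} e^{-2πixy} dx, split into real and imaginary parts
re = quad(lambda x: np.cos(2 * np.pi * x * y), -u, u)[0]
im = -quad(lambda x: np.sin(2 * np.pi * x * y), -u, u)[0]

print(re, np.sin(2 * np.pi * u * y) / (np.pi * y))  # these agree
print(im)                                           # ≈ 0 (f is real and even)
```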
Exercise-34: (Properties of the Fourier transform - I) Let f, g ∈ L1(R).
(i) f̂ ∈ L∞(R), f̂ is uniformly continuous, and ∥f̂∥∞ ≤ ∥f∥1.
(ii) (Continuity) If (fk) → f in L1(R), then ∥f̂ − f̂k∥∞ → 0 as k → ∞.
(iii) (Reflection and linearity) (f̃)^ = (f̂)~ (where f̃(x) = f(−x)), and (af + bg)^ = af̂ + bĝ for a, b ∈ C.
(iv) If f is an even (odd) function, then so is f̂.
(v) f̂(y) = (1/2)∫R [f(x) − f(x + 1/(2y))]e−y(x)dx for y ∈ R \ {0}.
[Hint: (i) Note ∥ey∥∞ = 1, and use an argument similar to that for Exercise-20. (ii) Use ∥f̂ − f̂k∥∞ ≤ ∥f − fk∥1. (iv) If f is even, then f̂(−y) = ∫f(x)ey(x)dx = ∫f(−x)ey(−x)dx = ∫f(x)e−y(x)dx = f̂(y).
(v) Substitute x = z + 1/(2y) in the integral expression for f̂(y) to get f̂(y) = −∫R f(z + 1/(2y))e−y(z)dz = −∫R f(x + 1/(2y))e−y(x)dx, and add this to the original expression for f̂(y).]
Exercise-35: (Properties of the Fourier transform - II) Let f, g ∈ L1(R), and let fa(x) = f(x − a).
(i) (eaf)^ = (f̂)a and (fa)^ = e−af̂ for every a ∈ R.
(ii) f ∗ ey = f̂(y)ey for every y ∈ R.
(iii) (f ∗ g)^ = f̂ ĝ.
(iv) ∫R f̂(x)g(x)dx = ∫R f(x)ĝ(x)dx.
(v) Let a ≠ 0. If g(x) = f(ax), then ĝ(y) = |a|^{−1}f̂(y/a) (so if g(x) = f(x/a), then ĝ(y) = |a|f̂(ay)).
[Hint: (i) (eaf)^(y) = ∫ea(x)f(x)e−y(x)dx = ∫f(x)e−(y−a)(x)dx = f̂(y − a), and (fa)^(y) = ∫f(x − a)e−y(x)dx = ∫f(z)e−y(z + a)dz = e−y(a)∫f(z)e−y(z)dz = e−a(y)f̂(y). Deduce (iii) from (ii) as follows: (f ∗ g)^(y)ey = f ∗ g ∗ ey = ĝ(y) f ∗ ey = ĝ(y)f̂(y)ey, and cancel ey ≠ 0 from both ends as in Exercise-5. Use Fubini's theorem to prove (iv). For (v), put z = ax in the integral.]
Exercise-36: (Fourier transform and differentiation) Let f ∈ L1(R).
(i) If Df ∈ L1(R), then lim_{|x|→∞} f(x) = 0 and (Df)^(y) = 2πiyf̂(y) for y ∈ R.
(ii) If xf ∈ L1(R), then f̂ is differentiable and Df̂(y) = −2πi(xf)^(y) for y ∈ R.
(iii) More generally, we have: Dmf ∈ L1(R) ⇒ (Dmf)^(y) = (2πiy)^m f̂(y); and x^m f ∈ L1(R) ⇒ f̂ is m-times differentiable with Dmf̂(y) = (−2πi)^m (x^m f)^(y).
[Hint: (i) Since Df ∈ L1(R), ∀ ε > 0 ∃ M > 0 with |f(b) − f(a)| = |∫_a^b Df| ≤ ∫_a^b |Df| < ε for M ≤ a < b. So lim_{x→∞} f(x) exists. This limit must be 0 since f ∈ L1(R). Similarly lim_{x→−∞} f(x) = 0. To derive the expression for (Df)^(y), do integration by parts. (ii) Since |e−t(x) − 1| ≤ |2πtx| and since xf ∈ L1(R), the Lebesgue dominated convergence theorem gives lim_{t→0} (f̂(y + t) − f̂(y))/t = ∫R f(x)e−y(x)[lim_{t→0} (e−t(x) − 1)/t]dx = −2πi∫R xf(x)e−y(x)dx = −2πi(xf)^(y).]
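Part (i) of Exercise-36 can be checked numerically for the Gaussian f(x) = e^{−πx²} (an arbitrary choice, also used in the example below); the helper `ft` is an ad hoc numerical transform, not part of the notes:

```python
import numpy as np
from scipy.integrate import quad

f  = lambda x: np.exp(-np.pi * x**2)
df = lambda x: -2 * np.pi * x * f(x)   # Df

def ft(h, y):
    # numerical Fourier transform ĥ(y) = ∫ h(x) e^{-2πixy} dx
    re = quad(lambda x: h(x) * np.cos(2 * np.pi * x * y), -np.inf, np.inf)[0]
    im = -quad(lambda x: h(x) * np.sin(2 * np.pi * x * y), -np.inf, np.inf)[0]
    return complex(re, im)

y = 0.6
print(ft(df, y), 2j * np.pi * y * ft(f, y))   # (Df)^(y) = 2πiy f̂(y): the sides agree
```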
Remark: From Exercise-35(i) and Exercise-36, and some of the future results, we see the following
are pairs of dual operations for the Fourier transform:
(i) translation and rotation (multiplication by a unimodular scalar).
(ii) Differentiation and multiplying with x.
(iii) Convolution and taking pointwise product.
Example: Let f ∈ L1(R) be f(x) = e^{−πx²} (note f(0) = 1). We will show f̂ = f. We have Df(x) = −2πxf(x), and hence f is the unique solution to Dg(x) + 2πxg(x) = 0 with initial condition g(0) = 1. Thus it suffices to show f̂ also satisfies this equation. Since xf ∈ L1(R), and using −2πxf(x) = Df(x), we get by Exercise-36 that Df̂(y) = ∫(−2πixf(x))e−y(x)dx = ∫iDf(x)e−y(x)dx = i(Df)^(y) = i · 2πiyf̂(y) = −2πyf̂(y). Also f̂(0) = ∫R f(x)dx = 1 by complex integration (consider (∫R f(x)dx)(∫R f(y)dy) = ∫_{R²} e^{−π|z|²}dz and use polar coordinates).
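A numerical confirmation of f̂ = f for f(x) = e^{−πx²} (an illustrative sketch; the sample points are arbitrary):

```python
import numpy as np
from scipy.integrate import quad

f = lambda x: np.exp(-np.pi * x**2)

for y in (0.0, 0.5, 1.0, 2.0):
    # f is real and even, so the sine part of the transform vanishes
    fhat = quad(lambda x: f(x) * np.cos(2 * np.pi * x * y),
                -np.inf, np.inf)[0]
    print(y, fhat, f(y))   # fhat ≈ f(y) at each sample point
```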
Remark: The second part of Exercise-36(iii) says that the faster the decay of f at ∞, the smoother f̂ is. Result [138](ii) below says that the smoother f is, the faster the decay of f̂ at ∞.
[138] (i) (Riemann-Lebesgue lemma) Let f ∈ L1(R). Then lim_{|y|→∞} f̂(y) = 0, and hence f̂ ∈ C0(R). Also, lim_{|y|→∞} ∫R f(x) cos(2πxy)dx = 0 and lim_{|y|→∞} ∫R f(x) sin(2πxy)dx = 0.
(ii) (Smoother functions, faster decay) If Djf ∈ L1(R) for 0 ≤ j ≤ m, then lim_{|y|→∞} |y|^m f̂(y) = 0.
Proof. (i) If f = 1[a,b], then f̂(y) = (e−y(a) − e−y(b))/(2πiy) → 0 as |y| → ∞. By linearity, f̂(y) → 0 as |y| → ∞ for any step function f ∈ L1(R). Also it is known that step functions are dense in L1(R) (see my notes Measure Theory). Another proof: We may assume f ∈ Cc(R) because Cc(R) is dense in L1(R). Now use the expression for f̂(y) from Exercise-34(v) and the uniform continuity of f - as in the proof of [103](ii) - to deduce lim_{|y|→∞} f̂(y) = 0.
(ii) This follows from (i) and Exercise-36(i). �
Remark: By the above results, f ↦ f̂ from L1(R) to (C0(R), ∥ · ∥∞) is linear and continuous.
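The Riemann-Lebesgue decay can be seen concretely for the triangle function f(x) = max(0, 1 − |x|) = (1[−1/2,1/2] ∗ 1[−1/2,1/2])(x), whose transform is (sin(πy)/(πy))²; this is a standard example added here for illustration, not computed in the notes:

```python
import numpy as np
from scipy.integrate import quad

tri = lambda x: max(0.0, 1.0 - abs(x))   # triangle function, supported in [-1, 1]

def fhat(y):
    # tri is real and even, so fhat(y) = ∫ tri(x) cos(2πxy) dx
    return quad(lambda x: tri(x) * np.cos(2 * np.pi * x * y), -1, 1, limit=200)[0]

for y in (0.5, 1.5, 2.5, 10.5):
    print(y, fhat(y), (np.sin(np.pi * y) / (np.pi * y))**2)  # agree; decay to 0
```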
16. Fourier transform: sufficient conditions for pointwise inversion
Definition: For g ∈ L1(R), the inverse Fourier transform g∨ of g is defined as g∨(y) = ĝ(−y) = ∫R g(x)ey(x)dx. By Exercise-34(i) and [138], we see that g∨ ∈ C0(R); in particular, g∨ is bounded and uniformly continuous. Also note that if f ∈ L1(R) is an even function, then f∨(y) = ∫f(x)ey(x)dx = ∫f(−x)ey(x)dx = ∫f(z)e−y(z)dz = f̂(y), i.e., f∨ = f̂ when f is even.
Question and Remark: If f ∈ L1(R), can we expect the equality (f)∨ = f? If we want this equality
to hold everywhere, a necessary condition (by the observation in the above paragraph) is that
f ∈ C0(R). Even if we demand equality only almost everywhere, a necessary condition is that
f ∈ L∞(R). Another point to note is, formally (f)∨(x) =∫∞−∞ f(y)ex(y)dy, but this integral may
not be defined if f is not an L1-function. With our knowledge that f ∈ L∞(R), an integral that we
can always define for f ∈ L1(R) is∫ u−u f(y)ex(y)dy for u ∈ (0,∞); also we may investigate whether
this integral converges to f(x) as u→ ∞. This motivates the following definitions.
Definition: (i) For u ∈ (0,∞), the continuous Dirichlet kernel Du : R → C is defined as Du = (1_{[−u,u]})^, and hence Du ∈ C0(R). From an earlier calculation, Du(y) = (e−y(−u) − e−y(u))/(2πiy) = sin 2πuy/(πy) for y ∈ R \ {0}, and Du(0) = 2u. Thus Du is a real valued function, and also Du(−y) = Du(y).
(ii) For f ∈ L1(R) and u > 0, the u-th Fourier partial integral su(f) : R → C is defined as su(f, a) = ∫_{−u}^{u} f̂(x)ea(x)dx for a ∈ R. We now look for conditions that yield lim_{u→∞} su(f, a) = f(a).
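The closed form Du(y) = sin 2πuy/(πy) from (i) can be sanity-checked numerically (a sketch, not from the notes; the Riemann-sum grid is an ad hoc choice):

```python
import numpy as np

# Check that the transform of the box 1_{[-u,u]} is sin(2*pi*u*y)/(pi*y).
u = 1.5
x = np.linspace(-u, u, 300001)
dx = x[1] - x[0]
err = 0.0
for y in (0.3, 1.0, 2.7):
    num = np.sum(np.exp(-2j * np.pi * x * y)) * dx   # approximates D_u(y)
    err = max(err, abs(num - np.sin(2 * np.pi * u * y) / (np.pi * y)))
print(err)  # small discretization error
```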
Exercise-37: (Various expressions for su(f)) Let f ∈ L1(R), u > 0 and a ∈ R. Then,
(i) su(f, a) = ∫_{−∞}^{∞} f(a + x)·sin 2πux/(πx) dx = ∫_{−∞}^{∞} f(a − x)·sin 2πux/(πx) dx = Du ∗ f(a).
(ii) su(f, a) = ∫_0^∞ [f(a + x) + f(a − x)]·sin 2πux/(πx) dx = ∫_0^∞ [f(a + x) + f(a − x)]Du(x)dx.
[Hint: (i) By Exercise-35(i), ea f̂ = (f−a)^, and for g := 1_{[−u,u]} we have ĝ = Du. Hence su(f, a) = ∫_{−u}^{u} f̂(x)ea(x)dx = ∫_R (f−a)^(x)g(x)dx = ∫_{−∞}^{∞} f−a(x)ĝ(x)dx = ∫_R f(a + x)Du(x)dx by Exercise-35(iv). Replacing x with −x gives su(f, a) = ∫_R f(a − x)Du(x)dx = f ∗ Du(a), since Du is an even function.]
[139] (i) Let g ∈ L1(R) ∩ L∞(R) be such that g ≥ 0 and ∫_R g = 1. Let gt(x) = t^{−1}g(x/t) for t > 0. Then {gt : t > 0} as t → 0 is an approximate identity for L1(R). Moreover, for any f ∈ L1(R) we have lim_{t→0} f ∗ gt(x) = f(x) for a.e. x ∈ R.
(ii) (Fourier inversion theorem) Assume f, f̂ ∈ L1(R), and let f0 = (f̂)∨. Then f0 ∈ C0(R), f = f0 almost everywhere, and also (f∨)^ = f0. Moreover, f̂, f∨ ∈ L1(R) ∩ L∞(R).
Proof. (i) We know {gt : t > 0} as t → 0 is an approximate identity for L1(R). We see lim_{t→0} f ∗ gt(x) = f(x) for a.e. x ∈ R by noting that |f(x) − f ∗ gt(x)| = |∫_R (f(x) − f(x − y))gt(y)dy| ≤ ∫_R |f(x) − fy(x)|t^{−1}g(y/t)dy = ∫_R |f(x) − ftz(x)|g(z)dz ≤ ∥f − ftz∥1∥g∥∞ → 0 as t → 0 by Exercise-20.
(ii) Direct evaluation will not work since the complex exponential function on R does not belong to L1(R). Therefore the proof becomes a little involved: we need to insert a suitable approximate identity into the integral. We know f0 ∈ C0(R). Let K(x) = e^{−πx²} and Kt(x) = t^{−1}K(x/t) = t^{−1}e^{−πx²/t²} for t > 0. Then {Kt : t > 0} as t → 0 is an approximate identity. Since K̂ = K, we have ht(y) := K̂t(y) = K̂(ty) = K(ty) = e^{−πt²y²} by Exercise-35(v). Since K and Kt are even functions, Kt∨ = K̂t, and therefore ĥt = Kt.
Fix z ∈ R. Since ht → 1 pointwise as t → 0, the Lebesgue dominated convergence theorem yields f0(z) = ∫_R f̂(y)ez(y)dy = lim_{t→0} ∫_R f̂(y)ez(y)ht(y)dy. By Exercise-35 and the evenness of Kt, we see ∫_R f̂(y)ez(y)ht(y)dy = ∫_R (f−z)^(y)ht(y)dy = ∫_R f−z(y)ĥt(y)dy = ∫_R f(y + z)Kt(y)dy = ∫_R f(−y + z)Kt(y)dy = f ∗ Kt(z) → f(z) for a.e. z ∈ R as t → 0 by part (i). In the last step, one can also take the limit along a sequence (tn) → 0 after noting that ∥f − f ∗ Kt∥1 → 0 as t → 0, and that L1-convergence implies pointwise convergence a.e. along a subsequence.
Since f∨(y) = f̂(−y), we also have (f∨)^ = f0. These imply that f̂, f∨ ∈ L1(R) ∩ L∞(R). □
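The summability trick in this proof can be watched numerically (a sketch, not from the notes; NumPy, the grid, and the cutoff are ad hoc choices). I use f(x) = e^{−2π|x|}, whose transform 1/(π(1+y²)) is computed in an example later in these notes:

```python
import numpy as np

# Insert the factor e^{-pi t^2 y^2} into the inversion integral; the
# result equals f * K_t(z) and should approach f(z) as t -> 0.
y = np.linspace(-400.0, 400.0, 800001)
dy = y[1] - y[0]
fhat = 1.0 / (np.pi * (1.0 + y**2))   # transform of f(x) = exp(-2*pi*|x|)
z = 0.5
target = np.exp(-2 * np.pi * abs(z))  # f(z)
errs = []
for t in (0.4, 0.1, 0.025):
    val = np.sum(fhat * np.exp(2j * np.pi * z * y)
                 * np.exp(-np.pi * (t * y)**2)).real * dy
    errs.append(abs(val - target))
print(errs)  # shrinking as t -> 0
```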
The following is an important technical fact for Fourier Theory.
[140] (i) Let h(x) = ∫_0^x y^{−1} sin y dy. Then lim_{x→∞} h(x) = π/2.
(ii) ∫_0^∞ Du(x)dx = ∫_0^∞ sin 2πux/(πx) dx = 1/2, and therefore ∫_{−∞}^{∞} Du(x)dx = ∫_{−∞}^{∞} sin 2πux/(πx) dx = 1.
Proof. (i) Note that h is monotone on (nπ, (n + 1)π), and the differences h((n + 1)π) − h(nπ) = ∫_{nπ}^{(n+1)π} y^{−1} sin y dy alternate in sign and decrease to 0 in absolute value. Hence lim_{x→∞} h(x) exists. Thus it suffices to show lim_{n→∞} h(xn) = π/2 for some sequence (xn) → ∞. From the earlier parts of the notes, we know that the discrete Dirichlet kernel¹¹ DN ∈ C(T) satisfies 1/2 = ∫_0^{1/2} DN(t)dt = ∫_0^{1/2} sin((2N + 1)πt)/sin πt dt. Since lim_{N→∞} ∫_0^{1/2} (1/sin πt − 1/(πt)) sin((2N + 1)πt)dt = 0 by [104], we get 1/2 = lim_{N→∞} ∫_0^{1/2} sin((2N + 1)πt)/(πt) dt, and hence π/2 = lim_{N→∞} ∫_0^{1/2} t^{−1} sin((2N + 1)πt)dt = lim_{N→∞} ∫_0^{(N+1/2)π} y^{−1} sin y dy = lim_{N→∞} h((N + 1/2)π) by putting (2N + 1)πt = y. For another proof of (i) using complex integration, see Example 2.7 in Chapter 5 of Conway, Functions of One Complex Variable.
(ii) Putting x = t/(2πu), we see ∫_0^∞ Du(x)dx = (1/π)∫_0^∞ t^{−1} sin t dt. Now apply (i). □
Exercise-38: (Sufficient conditions for pointwise Fourier inversion) Let f ∈ L1(R) and a ∈ R.
(i) lim_{u→∞} su(f, a) = f(a) ⇔ there is δ > 0 such that lim_{u→∞} ∫_{−δ}^{δ} (f(a + x) − f(a))·sin 2πux/(πx) dx = 0.
(ii) (Dini's test) If x ↦ (f(a + x) − f(a))/x is in L1(−δ, δ) for some δ > 0, then lim_{u→∞} su(f, a) = f(a).
(iii) If x ↦ (f(a + x) − f(a))/x is bounded a.e. in a neighborhood of 0, then lim_{u→∞} su(f, a) = f(a).
(iv) If f is differentiable at a, or Lipschitz/Holder continuous at a, then lim_{u→∞} su(f, a) = f(a).
(v) If f is piecewise C¹ on each bounded interval, then lim_{u→∞} su(f, a) = [f(a+) + f(a−)]/2.
[Hint: (i) f(a) = ∫_R f(a)Du(x)dx by [140](ii), and hence su(f, a) − f(a) = (∫_{−δ}^{δ} + ∫_{|x|>δ})(f(x + a) − f(a))Du(x)dx, where the second integral goes to 0 as u → ∞ by [138]. The proofs of the other statements are similar to what we wrote for the pointwise convergence of Fourier series. Refer to G. Bachman, L. Narici and E. Beckenstein, Fourier and Wavelet Analysis for some helpful hints.]
Definition: For v > 0, define the continuous Fejer kernel Fv on R as Fv(x) = (1/v)∫_{u=0}^{v} Du(x)du = [−cos 2πux/(2π²vx²)]_{u=0}^{v} = (1 − cos 2πvx)/(2π²vx²) ≥ 0. Note that Fv is an even function, and by the identity 1 − cos 2θ = 2 sin²θ we have Fv(x) = sin²(πvx)/(π²vx²). Also ∫_R Fv(x)dx = (1/v)∫_{u=0}^{v}(∫_{x∈R} Du(x)dx)du = (1/v)∫_0^v 1 du = 1 by [140](ii) and an interchange of the integrals. Since Fv(x) = vF1(vx), the family {Fv : v > 0} also satisfies the L1-concentration condition (A3). Thus {Fv : v > 0} as v → ∞ is an approximate identity on R. For f ∈ L1(R) and v > 0, the continuous Fejer mean σv(f) of f is defined as σv(f, x) = (1/v)∫_{u=0}^{v} su(f, x)du = (1/v)∫_{u=0}^{v}(∫_{y∈R} Du(y)f(x − y)dy)du = Fv ∗ f(x), where the last equality is by an interchange of the integrals.
¹¹The same notation is used for the discrete and continuous Dirichlet kernels: DN(t) = sin((2N + 1)πt)/sin πt and Du(x) = sin 2πux/(πx).
Exercise-39: (i) For v > 0, let gv : R → R be gv(x) = 1 − |x|/v for −v < x < v, and gv(x) = 0 elsewhere. Then Fv = ĝv = gv∨ ∈ C0(R).
(ii) For f ∈ L1(R), v > 0, and a ∈ R, we have σv(f, a) = Fv ∗ f(a) = ∫_{−v}^{v}(1 − |y|/v)f̂(y)ea(y)dy.
(iii) If f ∈ Cc(R), then ∥f − Fv ∗ f∥∞ → 0 as v → ∞.
(iv) Let 1 ≤ p < ∞. If f ∈ Lp(R), then ∥f − Fv ∗ f∥p → 0 as v → ∞.
(v) Let f ∈ L1(R) and assume f is continuous at a ∈ R. Then lim_{v→∞} σv(f, a) = f(a).
(vi) If f ∈ L1(R), then lim_{v→∞} σv(f, a) = f(a) for a.e. a ∈ R.
(vii) (Uniqueness) If f, g ∈ L1(R) and f̂ = ĝ, then f = g a.e., and hence f = g in L1(R).
[Hint: (i) Replacing x with −x in ∫_{−v}^{0}, and integrating by parts, ĝv(y) = ∫_0^v (1 − x/v)[ey(x) + e−y(x)]dx = ∫_0^v (1 − x/v)·2 cos 2πxy dx = 0 + (1/v)∫_0^v sin 2πxy/(πy) dx = (1/v)∫_0^v Dx(y)dx = Fv(y). And gv∨ = ĝv since gv is even. (ii) Since Fv(a − z) = gv∨(a − z) = ∫gv(y)e_{a−z}(y)dy, we get Fv ∗ f(a) = ∫Fv(a − z)f(z)dz = ∫∫gv(y)ea(y)e−y(z)f(z)dz dy = ∫gv(y)ea(y)f̂(y)dy = ∫_{−v}^{v}(1 − |y|/v)f̂(y)ea(y)dy. (iii) and (iv): They follow from [102] since {Fv : v > 0} as v → ∞ is an approximate identity. (v) Similar to the initial part of the proof of [116](v). Statement (vi) follows from [139](i), and (vii) from (ii) and (vi).]
Remark: The space Cbu(R) := {f ∈ C(R) : f is bounded and uniformly continuous} is closed in
L∞(R). Also Cbu(R) satisfies the following two properties: ∥fy∥∞ = ∥f∥∞, and y 7→ fy from R
to (Cbu(R), ∥ · ∥∞) is continuous for each f ∈ Cbu(R). At the abstract level, these two are the
properties going into the proof of [102](i). Hence, extending Exercise-39(iii), we can also establish
that ∥f − Fv ∗ f∥∞ → 0 as v → ∞ for every f ∈ Cbu(R), and in particular for every f ∈ C0(R).
Now we prove the analogue of Dirichlet-Jordan theorem (result [118]) with a different proof, for
which we will make use of the following fact.
Fact: If g : [a, b] → R is increasing, then g is differentiable almost everywhere, and for any bounded real function h on [a, b] we have ∫_a^b h dg = ∫_a^b h(x)Dg(x)dx, where ∫_a^b h dg is the Riemann-Stieltjes integral w.r.to g (see Theorem 6.17 in Rudin, Principles of Mathematical Analysis). If g : [a, b] → C is of bounded variation, then we can write g = g1 − g2 + i(g3 − g4), where the gj's are monotone increasing; hence we can define the Riemann-Stieltjes integral w.r.to g, and one has ∫_a^b h dg = ∫_a^b h(x)Dg(x)dx for any bounded real function h on [a, b] in this case also.
[141] (Dirichlet-Jordan theorem for R) Let f ∈ L1(R) be of bounded variation on every compact interval [a, b] ⊂ R. Then lim_{u→∞} su(f, a) = [f(a+) + f(a−)]/2 for every a ∈ R; in particular, lim_{u→∞} su(f, a) = f(a) for a.e. a ∈ R (since a function of bounded variation, being a linear combination of monotone functions, is differentiable a.e., and hence continuous a.e.).
Proof. Fix a ∈ R and let g(x) = f(a + x) + f(a − x). Then g ∈ L1(R), and g is of bounded variation on any compact interval. Fix δ > 0. By Exercise-37(ii), su(f, a) = (∫_0^δ + ∫_δ^∞)g(x)Du(x)dx. As a consequence of the Riemann-Lebesgue lemma, lim_{u→∞} ∫_δ^∞ g(x)Du(x)dx = 0 (see [105](iv) for a comparison). Thus it suffices to show lim_{u→∞} ∫_0^δ g(x)Du(x)dx = [f(a+) + f(a−)]/2. Let hu(x) = ∫_0^x Du(y)dy so that Dhu = Du. Integration by parts yields

∫_0^δ g(x)Du(x)dx = ∫_0^δ g(x)Dhu(x)dx = g(x)hu(x)|_0^δ − ∫_0^δ Dg(x)hu(x)dx.   (∗)

Let H(x) = ∫_0^x (sin t)/(πt) dt. By [140] note that H(0) = 0, H(2πux) = hu(x), and lim_{x→∞} H(x) = 1/2. Now g(x)hu(x)|_0^δ = g(δ−)H(2πuδ) → g(δ−)/2 as u → ∞. Also,

lim_{u→∞} ∫_0^δ Dg(x)hu(x)dx = lim_{u→∞} ∫_0^δ hu(x)dg = lim_{u→∞} ∫_0^δ H(2πux)dg = ∫_0^δ (1/2)dg = (g(δ−) − g(0+))/2

by the Fact above and the Lebesgue dominated convergence theorem. Using these in (∗), we get lim_{u→∞} ∫_0^δ g(x)Du(x)dx = g(δ−)/2 − (g(δ−) − g(0+))/2 = g(0+)/2 = [f(a+) + f(a−)]/2. □
Exercise-40: If f ∈ L1(R), then lim_{u→∞} ∫_a^b su(f, x)dx = ∫_a^b f(x)dx for every a < b in R.
[Hint: This is essentially a consequence of the fact that Du is even and hence su behaves self-adjointly: ∫su(f)g = ∫f su(g). Indeed, letting g = 1_{(a,b)}, we get ∫_a^b su(f, x)dx = ∫_R su(f, x)g(x)dx = ∫_R f ∗ Du(x)g(x)dx = ∫_R∫_R f(y)Du(x − y)g(x)dx dy = ∫_R∫_R f(y)Du(y − x)g(x)dx dy = ∫_R f(y)·g ∗ Du(y)dy = ∫_R f(y)su(g, y)dy. Also, since g ∈ L1(R) is of bounded variation, su(g) → g pointwise a.e. as u → ∞, bounded by an integrable function. Hence ∫_a^b su(f) = ∫_R f su(g) → ∫_R fg = ∫_a^b f as u → ∞.]
17. Fourier transform on S, L2(R), and on distributions
Recall that the Schwartz space S is a subset of Lp(R) for 1 ≤ p ≤ ∞. We denote by ⟨·, ·⟩2 the
L2-inner product.
[142] If f ∈ S, then f̂ ∈ S. The Fourier transform map F : S → S given by Ff = f̂ is linear and bijective, and satisfies the unitary condition FF∗ = I = F∗F, where F∗ : S → S is the inverse Fourier transform F∗f = f∨. In particular, ⟨Ff, Fg⟩2 = ⟨f, g⟩2 and ∥Ff∥2 = ∥f∥2 for f, g ∈ S.
Proof. Let f ∈ S and i, j ≥ 0. We need to show ∥y^i D^j f̂∥∞ < ∞. By Exercise-36, y^i D^j f̂(y) = y^i (−2πi)^j (x^j f)^(y) = (−1)^j (2πi)^{j−i} (D^i(x^j f))^(y). And ∥(D^i(x^j f))^∥∞ ≤ ∥D^i(x^j f)∥1 < ∞ since D^i(x^j f) ∈ S ⊂ L1(R) as f ∈ S. For f, g ∈ S, we see by Fubini's theorem that
⟨Ff, g⟩2 = ∫∫ f(x)e−y(x)ḡ(y)dx dy = ∫ f(x)h̄(x)dx = ⟨f, F∗g⟩2,
where h := F∗g (note that ∫ e−y(x)ḡ(y)dy is the conjugate of ∫ g(y)ey(x)dy = h(x)), and hence F∗ is indeed the adjoint of F. Since S ⊂ L1(R), we have F∗F = I and similarly FF∗ = I by [139]. This gives ⟨Ff, Fg⟩2 = ⟨f, F∗Fg⟩2 = ⟨f, g⟩2 for f, g ∈ S. □
Remark: A corollary of [142] is that Γ := {f ∈ L1(R) : supp(f̂) is compact} is dense in L1(R). Proof: Since D ⊂ S ⊂ L1(R) are dense inclusions, and since F∗ : S → S is an isomorphism, the set {g∨ : g ∈ D} is dense in L1(R). And {g∨ : g ∈ D} ⊂ Γ since (g∨)^ = g, which has compact support.
[143] (Plancherel’s theorem - Fourier transform on L2(R)) Let F : S → S be the Fourier transform
Ff = f . Since S is dense in L2(R), we see by [142] that F has a unique extension F : L2(R) →
L2(R) as a unitary operator, i.e., as a bijective linear map satisfying ⟨Ff,Fg⟩2 = ⟨f, g⟩2 for
f, g ∈ L2(R). In particular, ∥Ff∥2 = ∥f∥2 holds for every f ∈ L2(R). The unique extension
F∗ : L2(R) → L2(R) of the inverse Fourier transform F∗ : S → S is the inverse of F on L2(R).
Moreover, the extension F on L2(R) satisfies Ff = f a.e. for every f ∈ L1(R) ∩ L2(R).
Proof. All except the last line are evident. To prove the last line, consider f ∈ L1(R) ∩L2(R). We
claim that there is a sequence (fn) ∈ D such that ∥f − fn∥1 → 0 and ∥f − fn∥2 → 0 as n → ∞.
Since (f1[−n,n]) → f in both L1(R) and L2(R), it suffices to consider the case where f has compact
support. Choose g ∈ D with g ≥ 0 and∫g = 1; let gn(x) = ng(nx); and consider fn = f ∗ gn to
establish the claim. Now, ∥f− fn∥∞ ≤ ∥f−fn∥1 → 0 and ∥Ff−Ffn∥2 = ∥f−fn∥2 → 0 as n→ ∞
by Exercise-34(i) and [142]. But Ffn = fn since fn ∈ D ⊂ S, and therefore Ff = f a.e. �
Example: Let f(x) = e^{−2πa|x|} on R, where a > 0. Then f̂(y) = ∫_{−∞}^{0} e^{2π(a−iy)x}dx + ∫_0^∞ e^{−2π(a+iy)x}dx = 1/(2π(a − iy)) + 1/(2π(a + iy)) = a/(π(a² + y²)). We can now use the Plancherel identity ∥f̂∥2^2 = ∥f∥2^2 to evaluate the following integral: ∫_R 1/(a² + y²)² dy = a^{−2}π²∥f̂∥2^2 = a^{−2}π²∥f∥2^2 = a^{−2}π²(∫_{−∞}^{0} e^{4πax}dx + ∫_0^∞ e^{−4πax}dx) = a^{−2}π²·(1/(4πa) + 1/(4πa)) = π/(2a³).
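The value π/(2a³) can be confirmed by a direct quadrature, independent of the Plancherel route (a sketch, not from the notes; the substitution and step count are my own choices):

```python
import math

# Verify \int_R dy/(a^2+y^2)^2 = pi/(2 a^3) via y = a*tan(u):
# dy = a sec^2(u) du, (a^2+y^2)^2 = a^4 sec^4(u), integrand = cos^2(u)/a^3.
a = 1.7
n = 100000
du = math.pi / n
s = 0.0
for k in range(n):
    u = -math.pi / 2 + (k + 0.5) * du   # midpoint rule on (-pi/2, pi/2)
    s += math.cos(u)**2 / a**3 * du
print(s, math.pi / (2 * a**3))
```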
Three natural classes of approximate identities on L1(R) can be obtained in a unified fashion as
demonstrated in the Exercise below.
Exercise-41: (0) (Facts about integrals) Let K : R → C be any of the following: K(x) = (1 − cos x)/(πx²), or K(x) = 1/(π(1 + x²)), or K(x) = e^{−x²/4}/√(4π). In each case K ≥ 0 and ∫_R K = 1, so that we can manufacture approximate identities from K as stated below.
(i) (Fejer kernel) In the first case, let Ku(x) = 2πuK(2πux) = (1 − cos 2πux)/(2π²ux²). Then {Ku : u > 0} as u → ∞ is an approximate identity for L1(R).
(ii) (Poisson kernel) In the second case, let Kt(x) = t^{−1}K(x/t) = t/(π(t² + x²)). Then {Kt : t > 0} as t → 0 is an approximate identity for L1(R).
(iii) (Gaussian kernel) In the third case, let Kt(x) = t^{−1/2}K(x/√t) = e^{−x²/4t}/√(4πt). Then {Kt : t > 0} as t → 0 is an approximate identity for L1(R).
The mathematical formulation of Heisenberg's uncertainty principle says roughly the following: for every a, b ∈ R, it is impossible for f to be concentrated about a and f̂ to be concentrated about b simultaneously. In other words, if f is concentrated, then f̂ must spread out, and vice versa.
[144] (i) (Uncertainty principle - qualitative form) If f, f̂ ∈ Cc(R), then f ≡ 0.
(ii) (Uncertainty principle - quantitative form) Let f ∈ L2(R) satisfy the following decay conditions: xf, Df ∈ L2(R) and x|f|² vanishes at ∞. Then f, f̂ ∈ L1(R), and for every a, b ∈ R, we have (∫_R |x − a|²|f(x)|²dx)(∫_R |y − b|²|f̂(y)|²dy) ≥ ∥f∥2^4/(16π²), i.e., 16π²∥(x − a)f∥2^2∥(y − b)f̂∥2^2 ≥ ∥f∥2^4.
Proof. (i) Check that g : C → C defined as g(z) = ∫_R f(t)e−z(t)dt is complex analytic by differentiating under the integral sign, etc. We have g(u + i0) = f̂(u) = 0 for u ∈ R \ supp(f̂). Since R \ supp(f̂) contains an interval (and hence is a set with a limit point), we get g ≡ 0. Then f̂(u) = g(u) = 0 for every u ∈ R. Hence f ≡ 0 by Fourier inversion [139] and the continuity of f.
(ii) Since xf ∈ L2(R), we may write f = (1 + x²)^{1/2}f × (1 + x²)^{−1/2} and apply the Cauchy-Schwarz inequality to see f ∈ L1(R). Similarly, Df ∈ L2(R) implies yf̂ ∈ L2(R) by Exercise-36(i) and [143], and this in turn (by writing f̂ = (1 + y²)^{1/2}f̂ × (1 + y²)^{−1/2}) implies f̂ ∈ L1(R).
By Exercise-35(i), a translation of f corresponds to multiplying f̂ by a unimodular function, and a translation of f̂ corresponds to multiplying f by a unimodular function. So, after a translation in both f and f̂, we may assume a = 0 = b. Thus it suffices to show 16π²∥xf∥2^2∥yf̂∥2^2 ≥ ∥f∥2^4. Note that xf̄Df ∈ L1(R) by the Cauchy-Schwarz inequality since xf, Df ∈ L2(R). We claim that 16π²∥xf∥2^2∥yf̂∥2^2 ≥ (2|∫xf̄Df|)² ≥ (2|Re∫xf̄Df|)² = ∥f∥2^4.
We have 4π²∥yf̂∥2^2 = ∥2πiyf̂∥2^2 = ∥(Df)^∥2^2 = ∥Df∥2^2 by Exercise-36 and [143], and therefore 16π²∥xf∥2^2∥yf̂∥2^2 = 4∥xf∥2^2∥Df∥2^2 ≥ (2|∫xf̄Df|)² by Cauchy-Schwarz. Next, note that D|f|² = D(f f̄) = f·Df̄ + Df·f̄ = 2Re(f̄Df), and consequently ∫_{−u}^{u} |f|² = ∫_{−u}^{u} |f|²·1 = x|f|²|_{−u}^{u} − 2Re∫_{−u}^{u} xf̄Df. Letting u → ∞ and using the vanishing of x|f|² at infinity, we conclude ∥f∥2^2 = 0 − 2Re∫xf̄Df, so that ∥f∥2^4 = (2Re∫xf̄Df)². □
Remark: (i) If f ∈ L2(R) with ∥f∥2 = 1, then inf_{a∈R} ∫_R (x − a)²|f(x)|²dx =: σ²(f) is the variance of f in the language of Probability Theory, and hence [144](ii) says σ²(f)σ²(f̂) ≥ 1/(16π²). (ii) Every f ∈ S satisfies the hypothesis of [144](ii).
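A numerical check of the inequality (a sketch, not part of the notes; NumPy and the grid are my own choices). For the Gaussian f(x) = e^{−πx²} we have f̂ = f, and it is a well-known fact that Gaussians attain equality in [144](ii) with a = b = 0:

```python
import numpy as np

# 16 pi^2 ||x f||_2^2 ||y f^||_2^2 versus ||f||_2^4 for the Gaussian,
# where f^ = f, so ||y f^||_2 = ||x f||_2.
x = np.linspace(-10.0, 10.0, 400001)
dx = x[1] - x[0]
f = np.exp(-np.pi * x**2)
norm2_sq = np.sum(f**2) * dx          # ||f||_2^2
xf_sq = np.sum(x**2 * f**2) * dx      # ||x f||_2^2
lhs = 16 * np.pi**2 * xf_sq**2
rhs = norm2_sq**2
print(lhs, rhs)  # equal up to discretization error
```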
Now we wish to define the Fourier transform of distributions. Recall the identity ∫f̂g = ∫fĝ. This suggests the definition ⟨ϕ̂, f⟩ = ⟨ϕ, f̂⟩. However, note that if f ∈ D \ {0}, then f̂ is not in D by the uncertainty principle. Therefore, to have a symmetric situation, we define the Fourier transform only for tempered distributions (and this includes distributions with compact support).
Definition and Example: For ϕ ∈ S′ we define ϕ̂ ∈ S′ by ⟨ϕ̂, f⟩ = ⟨ϕ, f̂⟩. (i) If ϕ ∈ S′ comes from an L1-function g, then ϕ̂ = ĝ since ∫f̂g = ∫fĝ. (ii) Consider the Dirac measure δa ∈ E′ ⊂ S′. We have ⟨δ̂a, f⟩ = ⟨δa, f̂⟩ = f̂(a) = ∫fe−a = ⟨e−a, f⟩, and hence δ̂a is the tempered distribution induced by the function e−a. In particular, δ̂0 = e0 = 1, which means ⟨δ̂0, f⟩ = ∫f for f ∈ S.
Remark: (i) ϕ ↦ ϕ̂ from S′ to itself is a sequentially continuous linear isomorphism due to [142]. (ii) Since Lp(R) ⊂ S′ for 1 ≤ p ≤ ∞ by [128], the Fourier transform of every f ∈ Lp(R) (1 ≤ p ≤ ∞) is now defined in the sense of distributions.
Exercise-42: Let ϕ, ψ ∈ S′. Then, (i) (aϕ + bψ)^ = aϕ̂ + bψ̂, and (ϕ̂)^ = ϕ̃.
(ii) Dϕ̂ = (−2πixϕ)^ and hence (by invertibility) (2πixϕ)^ = −Dϕ̂.
(iii) (Dϕ)^ = 2πiyϕ̂.
(iv) (eaϕ)^ = (ϕ̂)a and (ϕa)^ = e−aϕ̂.
[Hint: (ii) ⟨Dϕ̂, f⟩ = −⟨ϕ̂, Df⟩ = −⟨ϕ, (Df)^⟩ = −⟨ϕ, 2πiyf̂⟩ = ⟨−2πixϕ, f̂⟩ = ⟨(−2πixϕ)^, f⟩. (iii) ⟨(Dϕ)^, f⟩ = ⟨Dϕ, f̂⟩ = −⟨ϕ, Df̂⟩ = −⟨ϕ, (−2πixf)^⟩ = ⟨ϕ̂, 2πixf⟩ = ⟨2πiyϕ̂, f⟩. (iv) ⟨(eaϕ)^, f⟩ = ⟨eaϕ, f̂⟩ = ⟨ϕ, eaf̂⟩ = ⟨ϕ, (f−a)^⟩ = ⟨ϕ̂, f−a⟩ = ⟨(ϕ̂)a, f⟩, and ⟨(ϕa)^, f⟩ = ⟨ϕa, f̂⟩ = ⟨ϕ, (f̂)−a⟩ = ⟨ϕ, (e−af)^⟩ = ⟨ϕ̂, e−af⟩ = ⟨e−aϕ̂, f⟩.]
Exercise-43: (i) f̂ ∗ ĝ = (fg)^ for f, g ∈ S (this is needed for the proof of (ii)).
(ii) Let g ∈ S and ϕ ∈ S′. Then (ϕ ∗ g)^ = ĝϕ̂ and ϕ̂ ∗ ĝ = (gϕ)^.
[Hint: (i) We have (f̂ ∗ ĝ)∨ = (f̂)∨(ĝ)∨ = fg by the product rule (h1 ∗ h2)∨ = h1∨h2∨. Now use invertibility in S. (ii) ⟨(ϕ ∗ g)^, f⟩ = ⟨ϕ ∗ g, f̂⟩ = ⟨ϕ, f̂ ∗ g̃⟩ = ⟨ϕ, (fĝ)^⟩ = ⟨ϕ̂, fĝ⟩ = ⟨ĝϕ̂, f⟩, and ⟨ϕ̂ ∗ ĝ, f⟩ = ⟨ϕ̂, f ∗ (ĝ)~⟩ = ⟨ϕ̂, f ∗ g∨⟩ = ⟨ϕ, (f ∗ g∨)^⟩ = ⟨ϕ, f̂(g∨)^⟩ = ⟨ϕ, f̂g⟩ = ⟨gϕ, f̂⟩ = ⟨(gϕ)^, f⟩.]
Exercise-44: (i) If ϕ ∈ E′ ⊂ S′, then ϕ̂ is (the restriction to R of) a complex analytic function g, i.e., ⟨ϕ̂, f⟩ = ⟨g, f⟩ for f ∈ S. In particular, ϕ̂ ∈ E. Also, D^mϕ̂ has polynomial growth at ∞ for every m ≥ 0.
(ii) (Product rule) If ϕ, ψ ∈ E′, then ϕ ∗ ψ ∈ E′, and (ϕ ∗ ψ)^ = ϕ̂ψ̂, i.e., ⟨(ϕ ∗ ψ)^, f⟩ = ∫f(t)ϕ̂(t)ψ̂(t)dt for f ∈ S.
[Hint: (i) g : C → C given by g(z) = ⟨ϕ, e−z⟩ is complex analytic, and D^m g(z) = ⟨ϕ, (−2πiy)^m e−z⟩. Since ϕ ∈ E′, there exist C > 0 and N ≥ 0 with |D^m g(z)| = |⟨ϕ, (−2πiy)^m e−z⟩| ≤ C pN((−2πiy)^m e−z), from which the polynomial growth property follows¹². Since e−y(x) = e−x(y), and since the integral representing f̂(y) can be approximated by Riemann sums, we get ⟨ϕ̂, f⟩ = ⟨ϕ, f̂⟩ = ϕ(∫f(x)e−x(·)dx) = ∫f(x)ϕ(e−x(·))dx = ∫f(x)g(x)dx = ⟨g, f⟩. (ii) We know ϕ ∗ ψ ∈ E′. By [131], ⟨(ϕ ∗ ψ)^, f⟩ = ⟨ϕ ∗ ψ, f̂⟩; writing f̂(x + y) = ∫f(t)e−t(x)e−t(y)dt and interchanging ϕ and ψ with the integrals as above, this equals ∫f(t)ϕ(e−t)ψ(e−t)dt = ⟨ϕ̂ψ̂, f⟩.]
¹²See p.119 of Grafakos, Classical Fourier Analysis for the computational details.
18. Fourier transform of measures
In this and the remaining sections, we select a few topics related to Fourier Analysis and give a
very brief sketch about them, often with partial or skipped proofs. These sketches are intended as
appetizers for the students to learn more about advanced topics related to Fourier Analysis.
Definition: Let (X, A) be a measurable space. A complex measure on (X, A) is a map µ : A → C satisfying µ(A) = ∑_{k=1}^{∞} µ(Ak) whenever A = ∪_{k=1}^{∞} Ak is a measurable partition (i.e., the Ak ∈ A are pairwise disjoint with union A). For example, if f ∈ L1(R), then µ(A) := ∫_A f(x)dx defines a complex Borel measure on R (hint: use the Lebesgue dominated convergence theorem to get countable additivity). This also shows that the modulus of a complex measure does not satisfy the monotonicity property: A ⊂ B does not imply |µ(A)| ≤ |µ(B)|. For instance, if f = 1_{[0,1]} − 1_{[2,3]} and dµ = f dx, then µ(R) = 0 but µ([0, 1]) = 1.
We will show that the modulus of a complex measure can always be dominated by a finite
(positive) measure in an optimal manner. In the proof, we will use the following fact.
Fact: (see Lemma 6.3 in Rudin, Real and Complex Analysis) If z1, . . . , zN are finitely many complex numbers, then there is F ⊂ {1, . . . , N} with |∑_{k∈F} zk| ≥ π^{−1}∑_{k=1}^{N} |zk|.
[145] (Domination by a finite positive measure) Let µ be a complex measure on a measurable space (X, A). For A ∈ A, let β(A) = sup ∑_{k=1}^{∞} |µ(Ak)|, where the supremum is taken over all measurable partitions A = ∪_{k=1}^{∞} Ak. Then β is a finite positive measure on (X, A) with |µ(X)| ≤ β(X).
Proof. Clearly β(∅) = 0. Also note |µ(A)| ≤ β(A) for A ∈ A since A = A ∪ ∅ ∪ ∅ ∪ · · · is also a measurable partition. To check countable additivity for β, consider A ∈ A and a measurable partition A = ∪_{k=1}^{∞} Ak. For any measurable partition A = ∪_{n=1}^{∞} Bn, we have that Ak = ∪_{n=1}^{∞}(Ak ∩ Bn) and Bn = ∪_{k=1}^{∞}(Ak ∩ Bn) are measurable partitions of Ak and Bn, and hence ∑_{n=1}^{∞} |µ(Bn)| ≤ ∑_{n=1}^{∞} |∑_{k=1}^{∞} µ(Ak ∩ Bn)| ≤ ∑_{k=1}^{∞}∑_{n=1}^{∞} |µ(Ak ∩ Bn)| ≤ ∑_{k=1}^{∞} β(Ak). Taking the supremum over all measurable partitions A = ∪_{n=1}^{∞} Bn, we get β(A) ≤ ∑_{k=1}^{∞} β(Ak). To prove the reverse inequality, consider 0 ≤ ck < β(Ak), and choose a measurable partition Ak = ∪_{n=1}^{∞} Ck,n of Ak with ck < ∑_{n=1}^{∞} |µ(Ck,n)| for each k ∈ N. As A = ∪_{k,n=1}^{∞} Ck,n is a measurable partition, we get ∑_{k=1}^{∞} ck ≤ ∑_{k,n=1}^{∞} |µ(Ck,n)| ≤ β(A), and it follows that ∑_{k=1}^{∞} β(Ak) ≤ β(A) by the choice of the ck's.
If β(X) = ∞, we derive a contradiction as follows. Given M > 0, choose a measurable partition X = ∪_{k=1}^{∞} Ak with ∑_{k=1}^{N} |µ(Ak)| > πM for some N ∈ N. Applying the Fact mentioned above with zk = µ(Ak), find F ⊂ {1, . . . , N} such that for B := ∪_{k∈F} Ak we have |µ(B)| = |∑_{k∈F} µ(Ak)| ≥ π^{−1}∑_{k=1}^{N} |µ(Ak)| > M. As M > 0 is arbitrary, we must have sup{|µ(B)| : B ∈ A} = ∞. Therefore we can find a sequence (Bn) in A such that |µ(B1)| ≥ 1 and |µ(Bn+1)| ≥ 1 + ∑_{j=1}^{n} |µ(Bj)|. Put C1 = B1 and Cn+1 = Bn+1 \ ∪_{j=1}^{n} Bj. Then |µ(Cn)| ≥ 1 for every n ∈ N and hence the series ∑_{n=1}^{∞} µ(Cn) cannot converge to any complex number. On the other hand, we should have µ(∪_{n=1}^{∞} Cn) = ∑_{n=1}^{∞} µ(Cn) since the Cn's are disjoint. This is the required contradiction. □
Remark: (i) In the above, β(X) is called the total variation of µ, and β is called the total variation measure of µ (often β is denoted |µ|; note that |µ(A)| ≤ |µ|(A), but equality may not hold). (ii) Because of [145], the collection of complex measures does not include all positive measures, since a positive measure need not be finite!
Definition: If X is a metric space, let M(X) denote the collection of all complex Borel measures on X. For µ ∈ M(R), its Fourier transform µ̂ : R → C is defined as µ̂(y) = ∫_R e−y(x)dµ(x) for y ∈ R. Similarly, the nth Fourier coefficient of µ ∈ M(T) is defined as µ̂(n) = ∫_T e−n(x)dµ(x) for n ∈ Z. Note that if µ ∈ M(R) is absolutely continuous w.r.to the Lebesgue measure, then by the Radon-Nikodym theorem there is f ∈ L1(R) with dµ = f dt, and hence µ̂(y) = ∫_R e−y(t)f(t)dt = f̂(y).
Remark: The Fourier transform of a measure shares many (but not all) of the properties of the Fourier transform of a function. This is not surprising because we can identify f ∈ L1(R) with f dx ∈ M(R). Also, it can be shown that M(R) is a Banach space w.r.to the norm ∥µ∥ := |µ|(R).
Exercise-45: (Properties - I) Let µ ∈ M(R), and let |µ| = β be as in [145]. Then,
(i) |∫_R f dµ| ≤ ∫_R |f| d|µ|.
(ii) µ̂ ∈ L∞(R) with ∥µ̂∥∞ ≤ |µ|(R).
(iii) µ̂ : R → C is uniformly continuous.
(iv) In general, the Riemann-Lebesgue lemma fails for µ̂, i.e., µ̂ may not vanish at infinity; for example, δ̂0(y) = ∫_R e−y dδ0 = δ0(e−y) = 1 for every y ∈ R. However, if µ is absolutely continuous w.r.to the Lebesgue measure, then lim_{|y|→∞} µ̂(y) = 0 by the Radon-Nikodym theorem.
(v) If µ has compact support, then µ̂ ∈ E with D^mµ̂(y) = (−2πi)^m ∫_R x^m e−y(x)dµ(x) for m ∈ N and y ∈ R.
(vi) If f, f̂ ∈ L1(R) (for instance, if f ∈ S), then ∫_R f dµ = ∫_R f̂(y)µ̂(−y)dy.
[Hint: (ii) |µ̂(y)| ≤ ∫_R |e−y(x)|d|µ|(x) = ∫_R 1 d|µ| = |µ|(R). (iii) |µ̂(y + t) − µ̂(y)| ≤ ∫_R |e−t(x) − 1|d|µ|(x). (v) Let supp(µ) ⊂ (−b, b). We have t^{−1}(µ̂(y + t) − µ̂(y)) = ∫_{−b}^{b} e−y(x)t^{−1}(e−t(x) − 1)dµ(x), |t^{−1}(e−t(x) − 1)| ≤ 2π|x|, and x ↦ 2π|x| is |µ|-integrable on (−b, b). By the dominated convergence theorem, Dµ̂(y) = ∫_{−b}^{b} e−y(x)(−2πix)dµ = −2πiν̂(y), where dν = x dµ; now iterate. (vi) f = (f̂)∨ by Fourier inversion, and hence ∫_R f dµ = ∫_R∫_R f̂(y)ey(x)dy dµ(x) = ∫_R f̂(y)µ̂(−y)dy by Fubini's theorem.]
Exercise-46: (Properties - II) Let µ, ν be complex measures on R. Then,
(i) (Linearity) (aµ + bν)^ = aµ̂ + bν̂ for a, b ∈ C.
(ii) (µ ∗ ν)^ = µ̂ν̂, where µ ∗ ν(A) := ∫_R∫_R 1_A(x + y)dµ(x)dν(y) = ∫_R µ(A − y)dν(y) = ∫_R ν(A − x)dµ(x).
(iii) ∫_R µ̂ dν = ∫_R ν̂ dµ.
(iv) (Uniqueness) If µ̂ = ν̂, then µ = ν.
[Hint: (ii) µ ∗ ν satisfies ∫_R f d(µ ∗ ν) = ∫_R∫_R f(x + y)dµ(x)dν(y) for f ∈ L1(R, µ) ∩ L1(R, ν). Hence (µ ∗ ν)^(z) = ∫_R∫_R e−z(x + y)dµ(x)dν(y) = ∫_R µ̂(z)e−z(y)dν(y) = µ̂(z)ν̂(z). (iii) Use Fubini's theorem. (iv) By Exercise-45(vi), ∫_R f dµ = ∫_R f dν for every f ∈ D ⊂ S. Now 1_{[a,b]} can be approximated by members of D, and hence µ([a, b]) = ν([a, b]) for every a < b.]
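The multiplicativity in (ii) is easy to watch for atomic measures (a sketch, not from the notes; NumPy and the atom data below are arbitrary illustrative choices). For µ = ∑ cj δ_{aj}, the transform is µ̂(y) = ∑ cj e−y(aj), and µ ∗ ν places mass cj·dk at aj + bk:

```python
import numpy as np

# Transform of an atomic complex measure given as a list of (atom, mass).
def hat(atoms, y):
    return sum(c * np.exp(-2j * np.pi * a * y) for a, c in atoms)

mu = [(0.0, 1.0), (1.5, -2.0j)]
nu = [(-0.5, 0.3), (2.0, 1.0 + 1.0j)]
conv = [(a + b, c * d) for a, c in mu for b, d in nu]   # atoms of mu * nu
err = max(abs(hat(conv, y) - hat(mu, y) * hat(nu, y)) for y in (0.1, 0.7, 3.2))
print(err)  # ~ machine precision: (mu * nu)^ = mu^ nu^
```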
Remark: (i) Analogues of Exercise-45 and Exercise-46 hold for complex measures on T. (ii) µ ∈ M(T) is called a Rajchman measure if the Riemann-Lebesgue lemma holds for µ, i.e., if lim_{|n|→∞} µ̂(n) = 0. For example, if µ ∈ M(T) is absolutely continuous w.r.to the Lebesgue measure, then µ̂ = f̂ for some f ∈ L1(T) by the Radon-Nikodym theorem, and therefore µ is a Rajchman measure. By a theorem of Neder, every Rajchman measure µ is continuous in the sense that µ({a}) = 0 for every a ∈ T.
Seminar topic: (Bochner's theorem) Let g : R → C be continuous. Then g = µ̂ for some µ ∈ M(R) with µ ≥ 0 iff g is positive definite in the sense that ∑_{j,k=1}^{n} g(xj − xk)zjz̄k ≥ 0 for all finite collections x1, . . . , xn ∈ R and z1, . . . , zn ∈ C (see Section 2.8 in Katznelson, An Introduction to Harmonic Analysis).
Going back to the theory of Fourier series, we may now supplement [119] as follows:
[146] Let f ∈ C(T) be of bounded variation. Then,
(i) Let µ ∈ M(T) be given by the Riemann-Stieltjes integral w.r.to f, i.e., µ(A) = ∫_T 1_A df for Borel subsets A ⊂ T. Then 2π|nf̂(n)| ≤ |µ̂(n)| ≤ |µ|(T) for every n ∈ Z.
(ii) lim_{N→∞} ∥f − sN(f)∥∞ = 0, i.e., (sN(f)) → f uniformly.
Proof. (i) Recall that a function of bounded variation is differentiable a.e. Now, integration by parts gives 2πinf̂(n) = 2πin∫_0^1 f(t)e−n(t)dt = 0 + ∫_0^1 e−n(t)Df(t)dt = ∫_0^1 e−n(t)df(t) = ∫_0^1 e−n(t)dµ(t) = µ̂(n). Now use (the analogue of) Exercise-45(ii).
(ii) lim_{N→∞} ∥f − σN(f)∥∞ = 0 by [115], and sup_n |nf̂(n)| < ∞ by part (i). Now use Hardy's Tauberian theorem [111] (to be precise, a uniform version of [111]) to deduce lim_{N→∞} ∥f − sN(f)∥∞ = 0. □
Remark: For Fourier series, termwise integration is always allowed, but termwise differentiation is allowed only under extra hypotheses. We explain: (i) Let f ∈ L1(T) and F : T → C be F(t) = ∫_0^t f(s)ds. Then F is absolutely continuous, and hence is a continuous function of bounded variation (see 5.4 of Royden, Real Analysis). By [146], the Fourier series of F converges to F uniformly. Since uniform convergence allows the interchange of integration and summation, the Fourier series of F is obtained by termwise integration of the Fourier series of f. (ii) Since termwise differentiation of ∑_{n∈Z} f̂(n)en brings an additional factor n to the numerator, the resulting series may not converge for a general f ∈ L1(T). However, if we assume some smoothness condition, say that f ∈ L1(T) is piecewise C², then it can be shown that the series obtained by termwise differentiation of ∑_{n∈Z} f̂(n)en(t) converges pointwise to [Df(t+) + Df(t−)]/2 for every t ∈ T.
19. Poisson summation formula
If f ∈ L1(R), we may ask what the series ∑_{n∈Z} f̂(n)en represents. The Poisson summation formula relates this series to the periodization of f defined below.
Definition: For f ∈ L1(R), its periodization fP on T is formally defined as fP(t) = ∑_{k∈Z} f(t + k) for t ∈ T = [0, 1) (we may also view fP as defined on the whole of R with period 1, given by the same series). For example, the periodization of the continuous Dirichlet kernel is the discrete Dirichlet kernel (see p.223 of M.A. Pinsky, Introduction to Fourier Analysis and Wavelets).
[147] Let f ∈ L1(R) and let fP be its periodization on T defined above. Then,
(i) fP(t) ∈ C, i.e., the series ∑_{k∈Z} f(t + k) is convergent, for a.e. t ∈ T. Also, fP ∈ L1(T) with f̂P(n) = f̂(n) for every n ∈ Z.
(ii) (Poisson summation formula) Assume in addition that ∑_{n∈Z} |f̂(n)| < ∞. Then (after modifying fP on a null set), fP ∈ C(T) with ∑_{k∈Z} f(t + k) = fP(t) = ∑_{n∈Z} f̂(n)en(t) for every t ∈ T.
Proof. (i) As f ∈ L1(R), we see ∞ > ∫_R |f(x)|dx = ∑_{k∈Z} ∫_k^{k+1} |f(x)|dx = ∑_{k∈Z} ∫_0^1 |f(x + k)|dx. Since the series is absolutely convergent, we may interchange summation and integration to get ∫_0^1 (∑_{k∈Z} |f(x + k)|)dx < ∞. This shows fP(x) is finite a.e. and fP|_{[0,1)} ∈ L1(T). Similarly, an interchange of series and integration yields f̂P(n) = ∫_0^1 fP(t)e−n(t)dt = ∫_0^1 (∑_{k∈Z} f(t + k))e−n(t)dt = ∑_{k∈Z} ∫_0^1 f(t + k)e−n(t)dt = ∑_{k∈Z} ∫_k^{k+1} f(y)e−n(y)dy = ∫_R f(y)e−n(y)dy = f̂(n).
(ii) By part (i), ∑_{n∈Z} |f̂P(n)| = ∑_{n∈Z} |f̂(n)| < ∞, and hence by Exercise-16(iii) we obtain fP(t) = ∑_{n∈Z} f̂P(n)en(t) = ∑_{n∈Z} f̂(n)en(t) for t ∈ T. □
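The Poisson summation formula in (ii) is easy to verify numerically for the Gaussian f(x) = e^{−πx²}, for which f̂ = f (a sketch, not from the notes; NumPy and the truncation |k| ≤ 30 are ad hoc choices, with both tails negligible):

```python
import numpy as np

# sum_k f(t+k) should equal sum_n f^(n) e_n(t) for f(x) = exp(-pi x^2).
t = 0.3
k = np.arange(-30, 31)
lhs = np.sum(np.exp(-np.pi * (t + k)**2))                  # periodization
rhs = np.sum(np.exp(-np.pi * k**2) * np.exp(2j * np.pi * k * t)).real
print(lhs, rhs)
```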
As an application, we mention below a case of recovering g : R → C from just knowing g|_Z.
Exercise-47: (Sampling formula - simple form) Let f ∈ L1(T) be with ∑_{n∈Z} |f̂(n)| < ∞, and let g : R → C be g(y) = ∫_{−1/2}^{1/2} f(t)ey(t)dt for y ∈ R. Then g(x) = ∑_{n∈Z} g(n)·sin π(x − n)/(π(x − n)) for x ∈ R.
[Hint: Parametrize T = [−1/2, 1/2) and extend f to R by putting f = 0 for |t| > 1/2. Then f ∈ L1(R) and f̂(y) := ∫_R f(x)e−y(x)dx = g(−y) for y ∈ R. The Poisson summation formula for the periodization fP ∈ L1(T) gives fP(t) = ∑_{n∈Z} f̂(n)en(t) = ∑_{n∈Z} g(−n)en(t) = ∑_{n∈Z} g(n)e−n(t) for t ∈ T. Since fP = f on T, we get g(x) = ∫_{−1/2}^{1/2} f(t)ex(t)dt = ∫_{−1/2}^{1/2}(∑_{n∈Z} g(n)e−n(t))ex(t)dt = ∑_{n∈Z} g(n)∫_{−1/2}^{1/2} e_{x−n}(t)dt, where the interchange of integration and series is justified by uniform convergence. Now note that ∫_{−1/2}^{1/2} e_{x−n}(t)dt = e_{x−n}(t)/(2πi(x − n))|_{t=−1/2}^{1/2} = sin π(x − n)/(π(x − n)).]
Remark: In the language of Physics, Exercise-47 is called the sampling of a bandlimited signal, where a bandlimited signal is one whose Fourier transform has compact support; note that if we think of f in Exercise-47 as defined on the whole of R, then part of the hypothesis is supp(f) ⊂ [−1/2, 1/2].
20. Two theorems of Wiener
We will present two theorems of Wiener in Fourier Theory using tools from the theory of Banach algebras¹³. This will also provide an opportunity for the student to see the fruitful interaction among different branches of Mathematics. A few facts about Banach algebras will be briefly mentioned below.
Definition: A Banach algebra is a complex Banach space Γ admitting an associative multiplication
operation ‘·’ that satisfies the following for s, t ∈ Γ:
(i) (submultiplicative property of norm) ∥s · t∥ ≤ ∥s∥∥t∥, and
(ii) (bilinearity of product) (s, t) 7→ s · t is linear w.r.to addition in each variable.
If the multiplication is also commutative, then Γ is called a commutative Banach algebra. If there
is a multiplicative identity, then Γ is called a unital Banach algebra. It may be noted that there is
a simple procedure by which a multiplicative unit (say) u can be added to any non-unital Banach
algebra Γ to convert it into a (slightly larger) unital Banach algebra Γ + Cu.
Some examples of commutative Banach algebras: (i) C with usual multiplication.
(ii) L∞(R) with pointwise product as multiplication.
(iii) C(K) := {all continuous f : K → C} with sup-norm and pointwise multiplication, where K is
a compact Hausdorff space.
(iv) L1(R) with convolution as multiplication (recall that ∥f ∗ g∥1 ≤ ∥f∥1∥g∥1 for f, g ∈ L1(R)).
(v) l1(Z) with discrete convolution x ∗ y(k) := ∑_{n∈Z} x(k − n)y(n) as multiplication, whose multi-
plicative identity is the element (. . . , 0, 0, 1, 0, 0, . . .), where 1 is at the 0th place.
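As a sanity check on example (v), the convolution of finitely supported sequences can be coded in a few lines (a sketch with ad hoc names, representing a sequence as a dict {index: value}):

```python
def conv(x, y):
    # discrete convolution: (x * y)(k) = sum_n x(k - n) y(n)
    out = {}
    for m, xm in x.items():
        for n, yn in y.items():
            out[m + n] = out.get(m + n, 0.0) + xm * yn
    return out

def norm1(x):
    # the l^1 norm of a finitely supported sequence
    return sum(abs(v) for v in x.values())

def v(n):
    # standard basis element v_n
    return {n: 1.0}

x = {-1: 2.0, 0: -1.0, 3: 0.5}
y = {0: 1.0, 2: 4.0}
```

With this one can verify v_i ∗ v_j = v_{i+j}, that v_0 is the multiplicative unit, and the submultiplicative inequality ∥x ∗ y∥_1 ≤ ∥x∥_1 ∥y∥_1.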
Definition: Let Γ be a commutative Banach algebra and Γ∗ = {all continuous linear ϕ : Γ → C}.
We say ϕ ∈ Γ∗ \ {0} is a multiplicative functional if ϕ(st) = ϕ(s)ϕ(t) for s, t ∈ Γ. If K is a compact
Hausdorff space and a ∈ K, then the evaluation map ϕa : C(K) → C given by ϕa(f) = f(a) is a
multiplicative functional on C(K); also, ker(ϕ_a) = {f ∈ C(K) : f(a) = 0} is a maximal ideal in C(K).
We will use the following facts about a commutative unital Banach algebra Γ in the sequel.
Fact-1: Maximal ideals in Γ are precisely the kernels of multiplicative functionals on Γ.
13Originally these theorems were proved using the tools of Fourier Analysis, with rather complicated proofs.
Fact-2: An element t ∈ Γ is invertible w.r.to multiplication iff ϕ(t) ≠ 0 for every multiplicative
functional ϕ on Γ.
Fact-3: Let M(Γ) = {all multiplicative functionals on Γ} ≃ {all maximal ideals in Γ}, which is
called the Gelfand space of Γ. We have M(Γ) ⊂ the unit sphere of the dual Γ∗, and (M(Γ),weak*)
is a compact Hausdorff space. Also, the evaluation map (called the Gelfand map) E : Γ → C(M(Γ))
given by Et(ϕ) = ϕ(t) for t ∈ Γ and ϕ ∈ M(Γ) embeds Γ in C(M(Γ)). If Γ has no multiplicative
unit, then we can only say that M(Γ) is a locally compact Hausdorff space, and ∥ϕ∥ ≤ 1 for
every ϕ ∈ M(Γ) (if ∥ϕ∥ > 1, there is t ∈ Γ with ∥t∥ < 1 < |ϕ(t)|; then ∥t^n∥ ≤ ∥t∥^n → 0, but
|ϕ(t^n)| = |ϕ(t)|^n → ∞, a contradiction to the continuity of ϕ).
[148] (i) ϕ ∈ M(l1(Z)) ⇔ ∃ z ∈ C with |z| = 1 such that ϕ(x) = ∑_{n∈Z} x(n)z^n. Thus M(l1(Z)) = T.
(ii) (Wiener's theorem about invertibility) Let C1(T) = {f ∈ C(T) : (f̂(n))_{n∈Z} ∈ l1(Z)}. If f ∈ C1(T)
is non-vanishing on T, then 1/f ∈ C1(T).
Proof. (i) Here we think of T as T = {z ∈ C : |z| = 1}. For z ∈ T, let ϕ_z : l1(Z) → C be
ϕ_z(x) = ∑_{n∈Z} x(n)z^n, which is linear and continuous, and not identically zero. Now, ϕ_z(x ∗ y) =
∑_{k∈Z} (x ∗ y)(k)z^k = ∑_{k∈Z} ∑_{n∈Z} x(k − n)y(n)z^k = ∑_{m∈Z} ∑_{n∈Z} x(m)y(n)z^{m+n} = ϕ_z(x)ϕ_z(y), and
thus ϕ_z ∈ M(l1(Z)). Conversely, consider ϕ ∈ M(l1(Z)). Let {v_n : n ∈ Z} be the standard
basis of l1(Z), where v_n(k) = 1 for k = n and v_n(k) = 0 for k ≠ n. Verify that v_i ∗ v_j = v_{i+j},
and that v_0 is the multiplicative identity for convolution in l1(Z). Let z = ϕ(v_1) ∈ T. Then ϕ(v_2) =
ϕ(v_1 ∗ v_1) = ϕ(v_1)ϕ(v_1) = z^2, and inductively ϕ(v_n) = z^n for n ∈ N. Moreover, ϕ(v_0) = 1 = z^0 and
ϕ(v_{−n}) = 1/ϕ(v_n) = z^{−n} for n ∈ N since v_n ∗ v_{−n} = v_0. Since span{v_n : n ∈ Z} is dense in l1(Z),
it follows by the linearity and continuity of ϕ that ϕ(x) = ∑_{n∈Z} x(n)z^n for every x ∈ l1(Z). It can
also be shown that the correspondence M(l1(Z)) ↔ T is a homeomorphism.
(ii) Here we parametrize T as T = [0, 1). We may identify C1(T) with l1(Z) by the correspondence
f ↔ (f̂(n))_{n∈Z}. In this correspondence, the convolution in l1(Z) corresponds to pointwise product
in C1(T) by Fourier inversion. If f ∈ C1(T) is non-vanishing, then f(t) = ∑_{n∈Z} f̂(n)e_n(t) =
∑_{n∈Z} f̂(n)(e^{2πit})^n ≠ 0 for every t ∈ T = [0, 1). This means by (i) that ϕ(f) ≠ 0 for every ϕ ∈
M(l1(Z)). Then by Fact-2, f is invertible w.r.to multiplication in C1(T). Hence 1/f ∈ C1(T). □
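Wiener's theorem can also be observed numerically (a sketch, not from the notes; names are ad hoc): for the non-vanishing function f(t) = 2 + cos 2πt, whose only nonzero Fourier coefficients are f̂(0) = 2 and f̂(±1) = 1/2, the Fourier coefficients of 1/f computed by FFT decay geometrically, so (1/f)^(n) is indeed summable:

```python
import numpy as np

M = 1024
t = np.arange(M) / M
f = 2.0 + np.cos(2 * np.pi * t)     # non-vanishing, with (fhat(n)) in l^1(Z)

# approximate Fourier coefficients of 1/f via the FFT
c = np.fft.fft(1.0 / f) / M         # c[n] ~ (1/f)^(n), indices mod M
coef = np.fft.fftshift(c)           # reorder to n = -M/2, ..., M/2 - 1
l1_norm = np.sum(np.abs(coef))      # finite truncation of the l^1 norm
```

Here one can check in closed form that (1/f)^(n) = (√3 − 2)^{|n|}/√3, so the decay is geometric: the coefficients at |n| = 30 are already below 1e-12, and the l1 norm is about 1.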
To prove the second theorem of Wiener, first we identify multiplicative functionals on L1(R).
We start with a little abstract theory that generalizes [148](i).
Definition: Let G be a locally compact second countable abelian group (examples: Z, T, R). Define
its dual group Ĝ = {all continuous group homomorphisms α : G → T}, where the group operation
is pointwise multiplication in T. Any α ∈ Ĝ is called a character of G.
Fact-4: (i) α ∈ R̂ iff there is y ∈ R with α(x) = e_y(x), and hence R̂ = R. (ii) α ∈ Ẑ iff there is
z ∈ T with α(n) = z^n, and hence Ẑ = T. By duality, T̂ = Z.
[149] (i) Let G be a locally compact second countable abelian group equipped with Haar measure.
Then we may identify Ĝ with M(L1(G)), where α ∈ Ĝ corresponds to ϕ_α ∈ M(L1(G)) given by
ϕ_α(f) = ∫_G fα dµ for f ∈ L1(G).
(ii) (Fourier transform gives all multiplicative functionals) ϕ ∈ M(L1(R)) iff there is y ∈ R with
ϕ(f) = f̂(y) for f ∈ L1(R).
Proof. (i) All integrations considered below are w.r.to the Haar measure µ. We know that L1(G)∗ =
L∞(G), and any ϕ ∈ L1(G)∗ is given by ϕ(f) = ∫ fg for some g ∈ L∞(G). Since Ĝ ⊂ L∞(G), it
follows from the definition of ϕ_α that ϕ_α ∈ L∞(G) = L1(G)∗ for α ∈ Ĝ. If K ⊂ G is a compact
set of positive measure, then ϕ_α(ᾱ·1_K) = ∫_K |α|^2 dµ = µ(K) > 0, and hence ϕ_α ≠ 0. Moreover,
ϕ_α(f ∗ g) = ∫ (f ∗ g)α = ∫∫ f(x − y)g(y)α(x)dµ(x)dµ(y) = ∫∫ f(z)g(y)α(y + z)dµ(y)dµ(z) =
ϕ_α(f)ϕ_α(g) by Fubini since α(y + z) = α(y)α(z), and thus ϕ_α ∈ M(L1(G)) for α ∈ Ĝ.
For α, β ∈ Ĝ, if ϕ_α = ϕ_β, then α must coincide with β in L∞(G), which means α = β almost
everywhere. And then α = β everywhere since α, β are continuous. This shows α ↦ ϕ_α is injective.
Now consider ϕ ∈ M(L1(G)). We need to show ϕ = ϕ_α for some α ∈ Ĝ. Let g ∈ L∞(G) be
with ϕ(f) = ∫ fg, and let h ∈ L1(G) be with ϕ(h) = 1. Since ϕ is multiplicative, we observe ϕ(f) =
1·ϕ(f) = ϕ(h)ϕ(f) = ϕ(h ∗ f) = ∫∫ h(x − y)f(y)g(x)dµ(x)dµ(y) = ∫ f(y)(∫ h_y(x)g(x)dµ(x))dµ(y) =
∫ f(y)ϕ(h_y)dµ(y), which suggests that we define α : G → T as α(y) := ϕ(h_y). Since ϕ and
y ↦ h_y are continuous, α is continuous. Since translation commutes with convolution, α(y + z) =
ϕ(h_{y+z}) = ϕ(h)ϕ(h_{y+z}) = ϕ(h ∗ h_{y+z}) = ϕ(h_z ∗ h_y) = ϕ(h_z)ϕ(h_y) = α(z)α(y), which shows
α(y + z) = α(y)α(z). It remains to show |α| = 1. Since ∥ϕ∥ ≤ 1 by Fact-3, |α(y)| = |ϕ(h_y)| ≤ ∥h_y∥_1 = ∥h∥_1
by Exercise-2. For any n ∈ N, we see |α(y)|^n = |α(ny)| ≤ ∥h∥_1, and hence |α(y)| ≤ ∥h∥_1^{1/n} → 1,
so that |α(y)| ≤ 1; similarly |α(−y)| ≤ 1. Since α(y)α(−y) = α(y − y) = α(0) = ϕ(h) = 1, we
must have |α(y)| = 1.
(ii) By Fact-4, any character α of R is of the form α = e_y for some y ∈ R. Since ē_y = e_{−y}, part (i)
applied to G = R yields the required result. □
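The multiplicativity behind [149](ii) is just the identity (f ∗ g)^(y) = f̂(y)ĝ(y). A quick numerical sanity check (a sketch with ad hoc names, using Riemann sums on a truncated grid) with Gaussians:

```python
import numpy as np

x = np.linspace(-20, 20, 4001)        # odd number of points, so that 'same'
dx = x[1] - x[0]                       # mode below aligns with the grid
f = np.exp(-np.pi * x ** 2)            # Gaussian in L^1(R)
g = np.exp(-np.pi * (x - 1) ** 2)      # shifted Gaussian

def ft(h, y):
    # Riemann-sum approximation of the Fourier transform at frequency y
    return np.sum(h * np.exp(-2j * np.pi * y * x)) * dx

fg = np.convolve(f, g, mode="same") * dx   # f * g sampled on the same grid
y0 = 0.7
lhs = ft(fg, y0)
rhs = ft(f, y0) * ft(g, y0)
```

Because Gaussians decay rapidly, truncating R to [−20, 20] costs essentially nothing, and lhs and rhs agree to high accuracy.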
Exercise-48: Let Γ = Cδ_0 + L1(R) be the unital Banach algebra obtained by attaching the multi-
plicative unit δ_0 for convolution to L1(R). Let ϕ_0 : Γ → C be ϕ_0(cδ_0 + f) = c. Then,
(i) M(Γ) = {ϕ_0} ∪ {ψ : ψ(cδ_0 + f) = ϕ_0(cδ_0 + f) + ϕ(f) for some ϕ ∈ M(L1(R))}.
(ii) If ψ ∈ M(Γ) \ {ϕ_0}, then there is y ∈ R such that ψ(cδ_0 + f) = c + f̂(y).
(iii) Let w ∈ D ⊂ S be with 0 ≤ w ≤ 1, and let v = w∨ ∈ S ⊂ L1(R). Let f ∈ L1(R) be such
that f̂ is non-vanishing, and put f_1(x) = \overline{f(−x)}, so that f̂_1 is the complex conjugate of f̂.
Then ϕ_0(δ_0 − v + f ∗ f_1) = 1 ≠ 0. Also, for any
ψ ∈ M(Γ) \ {ϕ_0} there is y ∈ R with ψ(δ_0 − v + f ∗ f_1) = 1 − v̂(y) + f̂(y)f̂_1(y) = 1 − w(y) + |f̂(y)|^2 > 0
since f̂ f̂_1 = |f̂|^2. Consequently, δ_0 − v + f ∗ f_1 is invertible in Γ by Fact-2.
[Hint : For (ii), use (i) and [149](ii). Compute ψ(δ_0 − v + f ∗ f_1) in (iii) using (ii).]
[150] (Wiener's theorem about translates) For f ∈ L1(R), the following are equivalent:
(i) The Fourier transform f̂ is non-vanishing.
(ii) {f ∗ g : g ∈ L1(R)} is dense in L1(R).
(iii) span{f_y : y ∈ R} is dense in L1(R), where f_y(x) = f(x − y).
Proof. (i) ⇒ (ii): Let Λ = {h ∈ L1(R) : supp(ĥ) is compact}, which is dense in L1(R) by the
Remark after [142]. Hence it suffices to show that for every h ∈ Λ, there is g ∈ L1(R) with
h = f ∗ g. So consider h ∈ Λ. Let Γ = Cδ_0 + L1(R), and let w, v, f_1 be as in Exercise-48. We
may assume w ≡ 1 in a neighborhood of supp(ĥ). By Exercise-48, δ_0 − v + f ∗ f_1 is invertible in Γ.
Let u ∈ Γ be with (δ_0 − v + f ∗ f_1) ∗ u = δ_0, and then (δ_0 − v + f ∗ f_1) ∗ u ∗ h = h. The Fourier
transform of (δ_0 − v) ∗ u ∗ h is (1 − w)û ĥ, which is 0 since we assume w ≡ 1 in a neighborhood of
supp(ĥ). By the uniqueness of Fourier transform, we must have (δ_0 − v) ∗ u ∗ h = 0, and hence
f ∗ f_1 ∗ u ∗ h = h. Writing u = cδ_0 + g_0 with g_0 ∈ L1(R), we get f ∗ (f_1 ∗ ch + f_1 ∗ g_0 ∗ h) = h.
Letting g ∈ L1(R) be the bracketed expression, we arrive at the desired conclusion f ∗ g = h.
(ii) ⇒ (iii): (Sketch) From (ii) and the inequality ∥f ∗ g∥_1 ≤ ∥f∥_1∥g∥_1, it follows that {f ∗ g : g ∈
Cc(R)} is also dense in L1(R) since Cc(R) is dense in L1(R). Consider g ∈ Cc(R) and assume supp(g) ⊂ [a, b].
We have f ∗ g(x) = ∫_a^b f(x − y)g(y)dy = ∫_a^b f_y(x)g(y)dy. If a = a_0 ≤ a_1 ≤ · · · ≤ a_{k−1} ≤ a_k = b is a
sufficiently fine partition of [a, b], and h(x) := ∑_{j=1}^k f_{a_j}(x)g(a_j)(a_j − a_{j−1}), then h approximates
f ∗ g in L1(R); and also h ∈ span{f_y : y ∈ R}. Therefore span{f_y : y ∈ R} is dense in L1(R).
(iii) ⇒ (i): Suppose f̂(z) = 0 for some z ∈ R. Then f̂_y(z) = e_y(z)f̂(z) = 0, and hence ĝ(z) = 0 for
every g ∈ Λ := span{f_y : y ∈ R}. Since g ↦ ĝ from L1(R) to C0(R) is continuous, and since there
are g ∈ L1(R) with ĝ(z) ≠ 0 (use Fourier inversion), it follows that Λ cannot be dense in L1(R). □
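The Riemann-sum step in (ii) ⇒ (iii) can be tested numerically (a sketch, not from the notes; names are ad hoc): with a Gaussian f and a continuous bump g supported in [−1, 1], the combination of translates h = ∑_j f_{a_j} g(a_j)(a_j − a_{j−1}) approaches f ∗ g in the L1 norm as the partition is refined:

```python
import numpy as np

dx = 0.005
x = np.arange(-15, 15, dx)
f = lambda s: np.exp(-s ** 2)                  # f in L^1(R)
g = lambda s: np.maximum(0.0, 1 - np.abs(s))   # continuous, supp(g) = [-1, 1]

# reference values of f * g on the grid (fine Riemann sum over supp(g))
s_fine = np.arange(-1, 1, 1e-3)
true = sum(f(x - s) * g(s) for s in s_fine) * 1e-3

def riemann(k):
    # h = sum_j f_{a_j} g(a_j)(a_j - a_{j-1}) over k equal subintervals of [-1, 1]
    a = np.linspace(-1, 1, k + 1)[1:]          # right endpoints a_1, ..., a_k
    return sum(f(x - aj) * g(aj) for aj in a) * (2.0 / k)

def err(k):
    # L^1 distance between f * g and the combination of k translates
    return np.sum(np.abs(true - riemann(k))) * dx
```

Refining the partition from 10 to 200 points shrinks the L1 error by a couple of orders of magnitude, as the proof sketch predicts.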
21. Sketch: interpolation and the Lp-theory of Fourier series
Operators on Lp(T) and Lp(Rn) play a significant role in the modern theory of Fourier Analysis.
While dealing with such operators, two interpolation theorems are of basic importance: the Riesz-
Thorin theorem and the Marcinkiewicz theorem. We will give a brief sketch of the former, which is
needed in our discussion of Lp-convergence of Fourier series.
Exercise-49: Let (X,µ) be a σ-finite measure space and let Lp = Lp(X,µ).
(i) Let 1 ≤ p < q ≤ ∞. If f ∈ Lp and A = {x : |f(x)| ≤ 1}, then f1A ∈ Lq.
(ii) Let 1 ≤ p < q ≤ ∞. If f ∈ Lq and B = {x : |f(x)| > 1}, then f1B ∈ Lp.
(iii) If 1 ≤ p < q < r ≤ ∞, then Lp ∩ Lr ⊂ Lq ⊂ Lp + Lr.
[Hint : (i) Since |f·1_A| ≤ 1, we have |f·1_A|^q ≤ |f·1_A|^p ≤ |f|^p. (ii) Let r = q/p > 1 and s > 1 be
with 1/r + 1/s = 1. We have |f|^p ∈ L^r, and 1_B ∈ L^s since µ(B) < ∞. Hence by Holder's inequality,
∫ |f·1_B|^p = ∫ |f|^p 1_B ≤ ∥|f|^p∥_r ∥1_B∥_s < ∞. (iii) If f ∈ Lp ∩ Lr, then f·1_{{x:|f(x)|≤1}} ∈ Lq by (i) and
f·1_{{x:|f(x)|>1}} ∈ Lq by (ii) so that (their sum) f ∈ Lq. Similar reasoning gives Lq ⊂ Lp + Lr.]
The proof of Riesz-Thorin theorem is based on the following fact from Complex Analysis:
Fact: (Three lines theorem) Consider the vertical strip S = {z ∈ C : 0 < Re(z) < 1}. Let h : S̄ → C
be a bounded continuous function analytic on S. Let M_t = sup{|h(z)| : Re(z) = t} for 0 ≤ t ≤ 1.
Then M_t ≤ M_0^{1−t} M_1^t for 0 ≤ t ≤ 1.
[151] (Riesz-Thorin interpolation theorem) Let (X,µ) and (Y, ν) be σ-finite measure spaces. Let
1 ≤ p_0 ≤ p_1 ≤ ∞, 1 ≤ q_0 ≤ q_1 ≤ ∞, and let T : L^{p_0}(µ) + L^{p_1}(µ) → L^{q_0}(ν) + L^{q_1}(ν) be a linear
operator such that T restricted to L^{p_j}(µ) is a bounded linear operator into L^{q_j}(ν) with operator
norm M_j for j = 0, 1. For 0 < t < 1, define 1 ≤ p_t, q_t ≤ ∞ by 1/p_t = (1−t)/p_0 + t/p_1 and
1/q_t = (1−t)/q_0 + t/q_1. Then T restricted to L^{p_t}(µ) is a bounded linear operator into L^{q_t}(ν)
with operator norm ≤ M_0^{1−t} M_1^t, i.e., ∥Tf∥_{q_t} ≤ M_0^{1−t} M_1^t ∥f∥_{p_t} for every f ∈ L^{p_t}(µ).
Proof. (Sketch) Observe that since L^{p_t}(µ) ⊂ L^{p_0}(µ) + L^{p_1}(µ) by Exercise-49, T is defined on L^{p_t}(µ).
For s = 0, t, 1, let r_s be the conjugate exponent of q_s, i.e., 1/q_s + 1/r_s = 1. Note that the operator norm on L^{p_t}(µ) that we
need to estimate is equal to M_t := sup{|∫_Y (Tf)g dν| : f ∈ L^{p_t}(µ) and g ∈ L^{r_t}(ν) have unit norm}.
By approximation, it is enough to consider simple functions f, g of unit norm in the expression
for M_t. Let f = ∑_{j=1}^N a_j e^{iα_j} 1_{A_j} ∈ L^{p_t}(µ) and g = ∑_{k=1}^N b_k e^{iβ_k} 1_{B_k} ∈ L^{r_t}(ν) be simple functions
of unit norm, where a_j, b_k ≥ 0. Let f_j = e^{iα_j} 1_{A_j} and g_k = e^{iβ_k} 1_{B_k} so that f = ∑_{j=1}^N a_j f_j and
g = ∑_{k=1}^N b_k g_k. Let S = {z ∈ C : 0 < Re(z) < 1}. For z ∈ S, define p_z, r_z by the condition that
1/p_z = (1−z)/p_0 + z/p_1 and 1/r_z = (1−z)/r_0 + z/r_1. Note that since a_j, b_k ≥ 0, we may define the quantities a_j^{p_t/p_z}
and b_k^{r_t/r_z}, and they depend analytically on z for z ∈ S. Let h : S → C be defined as
h(z) = ∫_Y (T(∑_{j=1}^N a_j^{p_t/p_z} f_j))(∑_{k=1}^N b_k^{r_t/r_z} g_k) dν = ∑_{j,k=1}^N a_j^{p_t/p_z} b_k^{r_t/r_z} ∫_Y (Tf_j)g_k dν,
which is clearly analytic on S. The proof is completed by verifying that h is continuous and
bounded on S̄, and then applying the Three lines theorem mentioned above. See the book of
Grafakos or Pinsky for the computational details. □
Question: Let 1 ≤ p < ∞, f ∈ Lp(T) and s_N(f) = ∑_{n=−N}^N f̂(n)e_n. Does (s_N(f)) → f in Lp(T)?
We know the answer to be YES when p = 2, and it is known that the answer is NO when p = 1.
To investigate other cases, we introduce certain operators.
Definition: Let F(T) be the collection of all trigonometric polynomials on T; note that F(T)
is dense in Lp(T) for 1 ≤ p < ∞. We write f ∈ F(T) as f = ∑_{n∈Z} f̂(n)e_n with the understanding
that it is a finite sum, i.e., f̂(n) = 0 except for finitely many n ∈ Z. We define the Hilbert transform
H and the Riesz projection P from Lp(T) to itself (1 ≤ p < ∞) by defining them on the dense
subspace F(T) (see also [152] below):
(i) H(∑_{n∈Z} f̂(n)e_n) = i(∑_{n=−∞}^{−1} − ∑_{n=1}^{∞}) f̂(n)e_n. If we define sgn(0) = 0, sgn(n) = 1 and
sgn(−n) = −1 for n ∈ N, then note that H(∑_{n∈Z} f̂(n)e_n) = −i ∑_{n∈Z} sgn(n)f̂(n)e_n.
(ii) P(∑_{n∈Z} f̂(n)e_n) = ∑_{n=1}^∞ f̂(n)e_n.
[152] Let 1 < p <∞ (note that we have excluded 1). Then the Hilbert transform H and the Riesz
projection P are bounded linear operators on Lp(T).
Proof. (Sketch) Clearly H, P are linear. For f ∈ F(T), note that f + iHf = f̂(0) + 2Pf, and
|f̂(0)| ≤ ∫_0^1 |f| = ∫_0^1 1·|f| ≤ ∥f∥_p by Holder's inequality. Therefore it suffices to show H is
bounded. We outline the structure of the proof.
Step-1 : By Parseval, ∥Hf∥_2^2 ≤ ∑_{n∈Z} |f̂(n)|^2 = ∥f∥_2^2 for f ∈ F(T), and hence H is bounded on L2(T).
Step-2 : One shows by some computation that H is bounded on L^{2k}(T) for every integer k ≥ 2. The
proof is left as a reading assignment; see for instance Lemma 3.3.4 of M.A. Pinsky, Introduction
to Fourier Analysis and Wavelets.
Step-3 : From the above two steps and the Riesz-Thorin interpolation theorem, it follows that H is
bounded on Lp(T) for 2 ≤ p < ∞.
Step-4 : If 1 < p < 2, choose q > 2 with 1/p + 1/q = 1. As the adjoint of H is −H, ∥Hf∥_p =
sup{|∫ fHg| : g ∈ F(T) and ∥g∥_q = 1} ≤ ∥f∥_p ∥Hg∥_q ≤ ∥f∥_p ∥H∥_{q→q} by Holder's inequality. As
∥H∥_{q→q} < ∞ by Step-3, we get ∥H∥_{p→p} < ∞, where ∥H∥_{p→p} is the operator norm on Lp(T). □
[153] Let 1 < p < ∞ (note that we have excluded 1). Then,
(i) sup_{N∈N} ∥s_N∥_{p→p} < ∞, and (ii) lim_{N→∞} ∥f − s_N(f)∥_p = 0 for every f ∈ Lp(T).
Proof. (i) Let P_N be the operator on Lp(T) specified by P_N(∑_{n∈Z} f̂(n)e_n) = ∑_{n=0}^{2N} f̂(n)e_n for
f = ∑_{n∈Z} f̂(n)e_n ∈ F(T). Observe that e_{−N} ∑_{n=0}^{2N} (fe_N)^(n) e_n = ∑_{n=−N}^N f̂(n)e_n, i.e., e_{−N}P_N(fe_N) =
s_N(f) for f ∈ F(T). Since multiplication by e_{±N} does not change the Lp-norm, it suffices to show
sup_{N∈N} ∥P_N∥_{p→p} < ∞. For f ∈ F(T), we have P_N f = (∑_{n=0}^{∞} − ∑_{n=2N+1}^{∞}) f̂(n)e_n = ∑_{n=0}^{∞} f̂(n)e_n −
e_{2N} ∑_{n=1}^{∞} f̂(n + 2N)e_n = f̂(0) + Pf − e_{2N}P(fe_{−2N}), where P is the Riesz projection. As P is
bounded by [152] and |f̂(0)| ≤ ∥f∥_p, we get sup_{N∈N} ∥P_N∥_{p→p} ≤ 1 + 2∥P∥_{p→p} < ∞.
(ii) Let f ∈ Lp(T) and ε > 0. Let M = sup_{N∈N} ∥s_N∥_{p→p}, which is finite by part (i). Choose a
trigonometric polynomial g with ∥f − g∥_p < ε. We have s_N(g) = g for all large N ∈ N. Therefore,
∥f − s_N(f)∥_p ≤ ∥f − g∥_p + ∥s_N(g) − s_N(f)∥_p ≤ ∥f − g∥_p + ∥s_N∥_{p→p}∥f − g∥_p < (1 + M)ε for all
large N ∈ N, which shows lim_{N→∞} ∥f − s_N(f)∥_p = 0. □
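The convergence in [153](ii) can be observed numerically (a sketch, not from the notes; names are ad hoc, and the Fourier coefficients are approximated by FFT): for the continuous function f(t) = |t − 1/2| on T, the error ∥f − s_N(f)∥_p with p = 4 decreases as N grows.

```python
import numpy as np

M, p = 4096, 4.0
t = np.arange(M) / M
f = np.abs(t - 0.5)                 # continuous function on T
c = np.fft.fft(f) / M               # c[n] ~ fhat(n), with c[-n] = c[M - n]

def sN(N):
    # the partial Fourier sum s_N(f) evaluated on the grid
    s = np.full(M, c[0])
    for n in range(1, N + 1):
        s = s + c[n] * np.exp(2j * np.pi * n * t) \
              + c[-n] * np.exp(-2j * np.pi * n * t)
    return s.real

def err(N):
    # the L^p(T) norm of f - s_N(f) (normalized Lebesgue measure on T)
    return np.mean(np.abs(f - sN(N)) ** p) ** (1.0 / p)
```

Here f̂(n) decays like 1/n^2, so the decrease is fast; for p = 1 the operators s_N are not uniformly bounded, which is why the theorem excludes that case.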
Remark: Similarly we can define the Hilbert transform and the Riesz projection on Lp(R) for 1 < p < ∞,
and use their boundedness to prove that lim_{u→∞} ∥f − s_u(f)∥_p = 0 for every f ∈ Lp(R), 1 < p < ∞.
In the proof, instead of F(T), one should use a suitable dense subset of Lp(R), for instance {f ∈
L1(R) ∩ Lp(R) : supp(f̂) is compact}.
Further reading: (i) The Hilbert transform is a prototype of an important class of operators called
multipliers; see section 3.6 in Grafakos, Classical Fourier Analysis. (ii) Modern theory of Fourier
Analysis depends heavily on singular integral operators in whose study a basic result is Calderon-
Zygmund decomposition; see section 4.3 in Grafakos, Classical Fourier Analysis. (iii) Applications
of Fourier Theory to Probability Theory can be found in Chapter 5 of M.A. Pinsky, Introduction
to Fourier Analysis and Wavelets. Fourier theory in higher dimension and many other interesting
topics can also be found in the books of Grafakos and Pinsky. See also J. Duoandikoetxea, Fourier
Analysis. For a more abstract general theory, see Rudin, Fourier Analysis on Groups.
*****