Shape constrained estimation:
a brief introduction
Jon A. Wellner
University of Washington, Seattle
Cowles Foundation, Yale
November 7, 2016
Econometrics lunch seminar, Yale
Based on joint work with:
• Fadoua Balabdaoui
• Piet Groeneboom
• Geurt Jongbloed
• Kaspar Rufibach
• Arseni Seregin
• Charles Doss
Outline
• 1. Some shape constrained models:
  ▷ A: Monotone (and unimodal).
  ▷ B: Convex (or concave).
  ▷ C: k-monotone, completely monotone.
  ▷ D: log-concave; log-convex; s-concave; bi-log-concave.
  ▷ E: hyperbolically monotone, hyperbolically completely monotone.
  ▷ F: higher-dimensional monotone, convex, log-concave.
• 2. Estimation methods:
  ▷ A. maximum likelihood
  ▷ B. least squares
  ▷ C. minimum Renyi entropy
  ▷ D. penalized likelihoods
Econometrics Lunch Seminar, Yale, November 7, 2016 1.2
• 3. Types of results?
  ▷ A. Pointwise consistency; global consistency?
  ▷ B. Global consistency; rates of global consistency?
  ▷ C. Global rates of convergence: Hellinger / L1?
  ▷ D. Local distribution theory / limiting distributions?
  ▷ E. Risk bounds: lower; upper; adaptation to boundary?
• 4. Testing and confidence intervals
• 5. Algorithms
  ▷ A. active set
  ▷ B. interior point
1. Some shape constrained models:
A. Monotone (density):
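$$\mathcal{F}_{mon} \equiv \{ f \text{ a density on } \mathbb{R}^+ : f(x) \le f(y) \text{ if } x \ge y \}.$$
f on R+ = [0,∞) is decreasing (i.e. non-increasing) if and only
if it is a scale mixture of uniform densities: for x > 0,
$$f(x) = \int_0^\infty \frac{1}{y} 1_{[0,y)}(x)\, dG(y) = \int_{\{y > x\}} \frac{1}{y}\, dG(y), \qquad (1)$$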
for some distribution function G on (0,∞). [Schoenberg, 1941;
Williamson, 1956; Feller, 1971].
Thus if X ∼ f ∈ Fmon, there is a random variable Z ∼ G on R+
such that
$$X \stackrel{d}{=} UZ$$
where U ∼ Uniform[0,1].
The relationship (1) implies that the corresponding distribution
function F is given by
$$F(x) = \int_0^\infty \frac{x}{y} 1_{[0,y)}(x)\, dG(y) + \int_0^\infty 1_{[y,\infty)}(x)\, dG(y) = x f(x) + G(x),$$
and this can be “inverted” to yield
$$G(x) = F(x) - x f(x). \qquad (2)$$
From Figure 1 we see that the function on the right side of (2) is
non-negative and non-decreasing: the shaded area gives exactly
the difference F (x)− xf(x).
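As a sanity check on (2), the inversion can be worked out for one concrete monotone density. The sketch below assumes f is the Exp(1) density (an illustrative choice, not taken from the slides); then G(x) = F(x) − x f(x) = 1 − e^{−x} − x e^{−x}, and plugging this G back into (1) should return e^{−x}. All function names are hypothetical, and the integral in (1) is approximated by a plain midpoint rule.

```python
import math

def f_exp(x):
    """Exp(1) density, used here as an assumed example of a monotone density."""
    return math.exp(-x)

def F_exp(x):
    """Exp(1) distribution function."""
    return 1.0 - math.exp(-x)

def G_from_inversion(x):
    """Mixing distribution function from (2): G(x) = F(x) - x f(x)."""
    return F_exp(x) - x * f_exp(x)

def g_density(y):
    """Derivative of G for the Exp(1) example: G'(y) = y e^{-y}."""
    return y * math.exp(-y)

def f_from_mixture(x, upper=60.0, steps=20_000):
    """Midpoint-rule approximation of (1): integral over y > x of (1/y) dG(y)."""
    h = (upper - x) / steps
    total = 0.0
    for i in range(steps):
        y = x + (i + 0.5) * h
        total += g_density(y) / y * h
    return total

if __name__ == "__main__":
    for x in (0.2, 0.7, 2.0):
        print(x, f_from_mixture(x), f_exp(x))
```

In this example the recovered G is non-negative and non-decreasing, as the figure argument above suggests it should be.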
[Figure 1: Inverting the monotone density mixture representation.]
Alternatively, if G is a distribution function with finite mean
$\mu_G \equiv \int_0^\infty (1 - G(y))\, dy$, then
$$f(x; G) \equiv \frac{1}{\mu_G} (1 - G(x)) \qquad (3)$$
is a monotone decreasing density on R+. This mixture representation
yields the sub-class of monotone densities with f(0+) < ∞. This is
the way that monotone densities often arise in applications, since
the right side of (3) is the density of the limiting distribution of
both the forward and backward recurrence times (the “excess life” γ_t
and “current life” δ_t) in renewal theory; see e.g. Karlin-Taylor (1975).
If we change the uniform density to a triangular density, the mixture representation yields convex decreasing densities $f \in \mathcal{F}_{conv}$:
$$f(x) = \int_0^\infty 2 y^{-1} \left(1 - \frac{x}{y}\right)_+ dG(y).$$
C. k-monotone; completely monotone
Since the triangular density $2(1 - x) 1_{[0,1]}(x)$ is just the Beta(1,2) density, we easily find:
Proposition. (Williamson, 1956). A density f is a k-monotone (completely monotone) density if and only if it can be represented as a scale mixture of Beta(1, k) (exponential) densities; with $x_+ \equiv x 1\{x \ge 0\}$,
$$f(x) = \begin{cases} \int_0^\infty y^{-1} \left(1 - \frac{x}{ky}\right)_+^{k-1} dG(y), & k \in \{1, 2, \ldots\}, \\ \int_0^\infty y^{-1} \exp(-x/y)\, dG(y), & k = \infty, \end{cases} \qquad (4)$$
for some distribution function G on (0,∞).
The inversion formulas corresponding to these mixture representations
are given in the following proposition.
Proposition 2. Suppose that f is a k-monotone density
with distribution function F (so $F(x) = \int_0^x f(t)\, dt$). Then the
distribution function $G = G_k$ of (4) is given at continuity points
of $G_k$ by
$$G_k(t) = \sum_{j=0}^{k} \frac{(-1)^j}{j!} (kt)^j F^{(j)}(kt), \qquad (5)$$
$$G_\infty(t) = \lim_{k \to \infty} G_k(t). \qquad (6)$$
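As a quick consistency check (a worked special case, not part of the original slides), taking k = 1 in (5) leaves only the terms j = 0 and j = 1. Since F' = f, this recovers the monotone-density inversion formula (2):

```latex
\begin{aligned}
G_1(t) &= \frac{(-1)^0}{0!}\, t^0\, F^{(0)}(t) + \frac{(-1)^1}{1!}\, t^1\, F^{(1)}(t) \\
       &= F(t) - t\, f(t).
\end{aligned}
```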
[Figure: Graphical view of the inversion formula, convex decreasing density.]
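Note that for 3 ≤ k < ∞,
$$\mathcal{F}_\infty \subset \mathcal{F}_k \subset \mathcal{F}_2 \subset \mathcal{F}_{mon}.$$
D. Log-concave; log-convex; s-concave; ...
• p is log-concave on $\mathbb{R}^d$ if $\log p \equiv \varphi$ is concave (on $\mathbb{R}^d$).
• p is log-convex on $\mathbb{R}^d$ if $\log p \equiv \varphi$ is convex (on $\mathbb{R}^d$).
• for s > 0, p is s-concave if $p = \varphi_+^{1/s}$ for some ϕ concave;
for s < 0, p is s-concave if $p = \varphi_+^{1/s}$ for some ϕ convex.
Then, for −∞ < s < 0 < r < ∞,
$$\mathcal{P}_{-\infty} \supset \mathcal{P}_s \supset \mathcal{P}_0 \supset \mathcal{P}_r \supset \mathcal{P}_\infty.$$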
E. Hyperbolically k−monotone and completely monotone
A non-negative function f(x) on (0,∞) is said to be
hyperbolically monotone of order k (or $HM_k$) if for each fixed u
the function
$$H(w) \equiv f(uv) f(u/v), \qquad w \equiv 2^{-1}(v + 1/v),$$
satisfies $(-1)^j H^{(j)}(w) \ge 0$, j = 0, 1, ..., k − 1, and $(-1)^{k-1} H^{(k-1)}(w)$
is right-continuous and decreasing. If f is $HM_k$ for every k, then
f is hyperbolically completely monotone, or $HM_\infty$.
• Any uniform density is $HM_1$.
• The half-normal densities f(x) = 2φ(x/σ)/σ for x ≥ 0 are $HM_1$,
but not $HM_2$.
• The log-normal distribution is $HM_\infty$.
Proposition 1. (Bondesson, 1991) $X \sim HM_1$ if and only if log X
has a log-concave density on R.
Proposition 2. (Bondesson, 1991) If $X \sim HM_k$ and $Y \sim HM_k$
are independent then $XY \sim HM_k$ and $X/Y \sim HM_k$.
Thus, since the $HM_k$ conditions accumulate as k grows:
$$\text{log-concave} = \log HM_1 \supset \log HM_k \supset \log HM_\infty.$$
2. Estimation methods:
• Maximum likelihood:
• Suppose that $X_1, \ldots, X_n$ are i.i.d. $p_0 \in \mathcal{P}$.
• Then with $\mathbb{P}_n \equiv n^{-1} \sum_{i=1}^n \delta_{X_i}$, the log-likelihood is defined by
$$L_n(p) \equiv \sum_{i=1}^n \log p(X_i) \equiv n \mathbb{P}_n(\log p).$$
• If $\hat p_n$ satisfies
$$L_n(\hat p_n) = \max_{p \in \mathcal{P}} L_n(p),$$
then $\hat p_n$ is the Maximum Likelihood Estimator of $p_0 \in \mathcal{P}$.
• The “adjusted log-likelihood” $\Psi_n$ is defined by
$$\Psi_n(p) \equiv L_n(p) - \int_{\mathbb{R}^d} p(x)\, dx.$$
• Least squares:
• X1, . . . , Xn i.i.d. p0 ∈ P.
• The least squares contrast function is defined by
$$\Psi_n(p) = \frac{1}{2} \int p^2(x)\, dx - \int p(x)\, d\mathbb{P}_n(x) \approx \frac{1}{2} \int (p(x) - p_n(x))^2\, dx - \frac{1}{2} \int p_n^2(x)\, dx$$
if $\mathbb{P}_n$ had density $p_n$; but the neglected term depends only on
the data, not on p.
• If $\hat p_n$ satisfies
$$\Psi_n(\hat p_n) = \min_{p \in \mathcal{P}} \Psi_n(p),$$
then $\hat p_n$ is the least squares estimator of $p_0 \in \mathcal{P}$.
• Other methods:
• Minimum Renyi entropy (or divergence).
• Penalized versions of maximum likelihood, LS, ...
• Least squares for regression models ...
• L1 for regression models?
Example 1. Monotone densities.
Grenander (1956) showed that the maximum likelihood estimator
$\hat f_n$ of f is the (left-) derivative of the least concave majorant of
the empirical distribution function $\mathbb{F}_n$:
$$\hat f_n = \text{left derivative of the least concave majorant of } \mathbb{F}_n,$$
where $\mathbb{F}_n$ is the empirical distribution function of $X_1, \ldots, X_n$ i.i.d. F.
For the class of monotone densities on [0,∞),
the MLE and LSE are identical.
This is not true in general.
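The construction just described can be sketched in a few lines. The code below is a minimal illustration, assuming strictly positive observations; it builds the least concave majorant of the empirical distribution function as the upper convex hull of the points (0, 0), (X_(i), i/n) and returns the hull slopes as the step heights of the Grenander estimator. The names `grenander` and `evaluate` are hypothetical, and this is not the implementation of any particular package.

```python
def _cross(o, a, b):
    """z-component of (a - o) x (b - o); positive means a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def grenander(data):
    """Return (knots, heights) of the Grenander estimator.

    The estimator equals heights[i] on the interval (knots[i], knots[i+1]].
    Assumes all observations are strictly positive.
    """
    xs = sorted(data)
    n = len(xs)
    # Vertices of the empirical distribution function, starting at the origin.
    points = [(0.0, 0.0)] + [(x, (i + 1) / n) for i, x in enumerate(xs)]
    hull = []
    for p in points:
        # Drop the last vertex while it lies on or below the chord to p,
        # so that only vertices of the least concave majorant remain.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    knots = [p[0] for p in hull]
    heights = [
        (hull[i + 1][1] - hull[i][1]) / (hull[i + 1][0] - hull[i][0])
        for i in range(len(hull) - 1)
    ]
    return knots, heights

def evaluate(knots, heights, x):
    """Value of the left-continuous step function at x (0 outside the support)."""
    for i, h in enumerate(heights):
        if knots[i] < x <= knots[i + 1]:
            return h
    return 0.0

if __name__ == "__main__":
    knots, heights = grenander([3.0, 1.0, 4.0])
    print(knots, heights)
```

For the toy sample {1, 3, 4} the point (3, 2/3) lies below the chord from (1, 1/3) to (4, 1), so it is not a knot; the resulting step heights are non-increasing and integrate to one.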
[Figure: Least concave majorant and empirical distribution function, n = 10.]
[Figure: Grenander estimator and Exp(1) density, n = 10.]
[Figure: Least concave majorant and empirical distribution function, n = 40.]
[Figure: Grenander estimator and Exp(1) density, n = 40.]
Properties of Grenander’s estimator:
What do we know about $\hat f_n$?
• Consistency:
  ▷ $\hat f_n(x) \to_{a.s.} f_0(x)$ for each x > 0.
  ▷ If $f_0$ is continuous on (0,∞), $\sup_{x \ge c} |\hat f_n(x) - f_0(x)| \to_{a.s.} 0$.
  ▷ If $f_0(0) < \infty$, then $\hat f_n(0) \to_d f_0(0) U^{-1}$ where U ∼ Uniform(0,1).
  ▷ (Groeneboom, Birge, van de Geer) Let
    $$H^2(f, g) \equiv 2^{-1} \int \{\sqrt{f(x)} - \sqrt{g(x)}\}^2\, dx,$$
    so that H(f, g) is the Hellinger distance between f and g.
    If $\int_{[f_0 < 1]} f_0^{1-\alpha}(x)\, dx < \infty$ and $\int_{[f_0 > 1]} f_0^{1+\alpha}(x)\, dx < \infty$, then
    $$H(\hat f_n, f_0) = O_p(n^{-1/3}).$$
• Limit distribution theory:
  ▷ If $f_0'(x) < 0$, $f_0(x) > 0$, and $f_0'$ is continuous in a
    neighborhood of x, then with $C \equiv (2^{-1} f_0(x) |f_0'(x)|)^{1/3}$,
    $$n^{1/3} (\hat f_n(x) - f_0(x)) \to_d C\, S(0) \stackrel{d}{=} C \cdot 2Z$$
    where
    S(0) ≡ slope at zero of the least concave majorant of $W(h) - h^2$,
    $Z \equiv \mathrm{argmax}_h \{W(h) - h^2\}$;
    here W is two-sided Brownian motion started at 0.
  ▷ The distribution of Z is called Chernoff’s distribution.
  ▷ From Groeneboom (1989), much is known about the density $f_Z$ of Z.
[Figure: Chernoff’s density $f_Z(x) = (1/2) g(x) g(-x)$.]
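• Marshall’s lemma: $\|\hat F_n - F_0\|_\infty \le \|\mathbb{F}_n - F_0\|_\infty$, where $\hat F_n$ is the least concave majorant of the empirical distribution function $\mathbb{F}_n$.
• Kiefer-Wolfowitz theorem:
  ▷ If $\alpha_1(F_0) \equiv \inf\{t : F_0(t) = 1\} < \infty$,
    $\beta_1(F_0) \equiv \inf_{t < \alpha_1} (-f_0'(t) / f_0^2(t)) > 0$,
    and $\gamma_1(F_0) \equiv \sup_{t < \alpha_1} (-f_0'(t)) / \inf_{t < \alpha_1} f_0^2(t) < \infty$,
  ▷ ... then
    $$\|\hat F_n - \mathbb{F}_n\|_\infty = O((n^{-1} \log n)^{2/3}) \quad \text{a.s.}$$
    and hence
    $$\sqrt{n}\, \|\hat F_n - \mathbb{F}_n\|_\infty = O(n^{-1/6} (\log n)^{2/3}) \quad \text{a.s.}$$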
• Model misspecification: If f is not monotone (or even if F
does not have a density), then
$$\int |\hat f_n(x) - f_0^*(x)|\, dx \to_{a.s.} 0$$
where
$$K(f, f_0^*) = \inf\{K(f, g) : g \in \mathcal{F}_{mon}\}.$$
(Pfanzagl (1990), Patilea (2001), Jankowski (2014), Dumbgen,
Samworth, and Schumacher (2011)).
A. Log-concave densities on $\mathbb{R}^1$
Suppose that
$$f(x) \equiv f_\varphi(x) = \exp(\varphi(x)) = \exp(-(-\varphi(x)))$$
where ϕ is concave (and −ϕ is convex). The class of all densities
f on R of this form is called the class of log-concave densities,
$\mathcal{P}_{\text{log-concave}} \equiv \mathcal{P}_0$.
Properties of log-concave densities:
• A density f on R is log-concave if and only if its convolution
with any unimodal density is again unimodal (Ibragimov,
1956).
• Every log-concave density f is unimodal (but need not be
symmetric).
• P0 is closed under convolution.
• Many parametric families are log-concave, for example:
  ▷ Normal(µ, σ²); Uniform(a, b);
  ▷ Gamma(r, λ) for r ≥ 1; Beta(a, b) for a, b ≥ 1.
• $t_r$ densities with r > 0 are not log-concave.
• Tails of log-concave densities are necessarily sub-exponential.
• $\mathcal{P}_{\text{log-concave}}$ = the class of “Polya frequency functions of
order 2”, $PF_2$, in the terminology of Schoenberg (1951) and
Karlin (1968). See Marshall and Olkin (1979), chapter 18,
and Dharmadhikari and Joag-Dev (1988), page 150, for nice
introductions.
• Thus:
log-concave = $PF_2$ = strongly unimodal
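The contrast between the normal and t families listed above can be checked numerically. The following sketch tests concavity of a log-density through second differences on a grid; the grid range, the tolerance, and all function names are assumptions made for illustration, and a grid check can only suggest (not prove) log-concavity.

```python
import math

def log_normal_pdf(x):
    """Log-density of the standard normal distribution."""
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)

def log_t_pdf(x, r):
    """Log-density of Student's t distribution with r degrees of freedom."""
    const = math.lgamma((r + 1) / 2) - math.lgamma(r / 2) - 0.5 * math.log(r * math.pi)
    return const - 0.5 * (r + 1) * math.log1p(x * x / r)

def is_log_concave_on_grid(logpdf, lo, hi, m=2001, tol=1e-12):
    """True if all second differences of logpdf on an m-point grid are <= tol."""
    h = (hi - lo) / (m - 1)
    vals = [logpdf(lo + i * h) for i in range(m)]
    return all(
        vals[i - 1] - 2.0 * vals[i] + vals[i + 1] <= tol
        for i in range(1, m - 1)
    )

if __name__ == "__main__":
    print(is_log_concave_on_grid(log_normal_pdf, -10.0, 10.0))
    print(is_log_concave_on_grid(lambda x: log_t_pdf(x, 3.0), -10.0, 10.0))
```

The t log-density is concave near the origin but convex in the tails (for |x| larger than the square root of r), which is where the grid check fails.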
B. Nonparametric estimation, log-concave on R
• The (nonparametric) MLE fn exists (Rufibach, Dumbgen
and Rufibach).
• fn can be computed: R-package “logcondens” (Dumbgen
and Rufibach)
• In contrast, the (nonparametric) MLE for the class of
unimodal densities on R1 does not exist. Birge (1997) and
Bickel and Fan (1996) consider alternatives to maximum
likelihood for the class of unimodal densities.
• Consistency and rates of convergence for fn:
Dumbgen and Rufibach, (2009); Pal, Woodroofe and Meyer
(2007).
• Pointwise limit theory? Yes! Balabdaoui, Rufibach, and W
(2009).
MLE of f and ϕ: Let C denote the class of all concave functions
ϕ : R → [−∞,∞). The estimator $\hat\varphi_n$ based on $X_1, \ldots, X_n$ i.i.d. as
$f_0$ is the maximizer of the “adjusted criterion function”
$$\ell_n(\varphi) = \int \log f_\varphi(x)\, d\mathbb{F}_n(x) - \int f_\varphi(x)\, dx = \int \varphi(x)\, d\mathbb{F}_n(x) - \int e^{\varphi(x)}\, dx$$
over ϕ ∈ C.
Properties of fn, ϕn: (Dumbgen & Rufibach, 2009)
• ϕn is piecewise linear.
• ϕn = −∞ on R \ [X(1), X(n)].
• The knots (or kinks) of ϕn occur at a subset of the order
statistics X(1) < X(2) < · · · < X(n).
• Characterized by: $\hat\varphi_n$ is the MLE of $\log f_0 = \varphi_0$ if and only if
$$\hat H_n(x) \begin{cases} \le \mathbb{H}_n(x), & \text{for all } x > X_{(1)}, \\ = \mathbb{H}_n(x), & \text{if } x \text{ is a knot}, \end{cases}$$
where
$$\hat F_n(x) = \int_{X_{(1)}}^x \hat f_n(y)\, dy, \qquad \hat H_n(x) = \int_{X_{(1)}}^x \hat F_n(y)\, dy, \qquad \mathbb{H}_n(x) = \int_{-\infty}^x \mathbb{F}_n(y)\, dy.$$
Furthermore, for every function ∆ such that $\hat\varphi_n + t\Delta$ is concave
for t small enough,
$$\int_{\mathbb{R}} \Delta(x)\, d\mathbb{F}_n(x) \le \int_{\mathbb{R}} \Delta(x)\, d\hat F_n(x).$$
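Because the maximizer is piecewise linear with knots at order statistics and equals −∞ outside [X(1), X(n)], the adjusted criterion can be evaluated in closed form for any candidate of that shape. The sketch below is an illustration under those assumptions (distinct, sorted observations; the candidate ϕ is linear between consecutive observations); `J` and `adjusted_loglik` are hypothetical names, and the code only evaluates the criterion, it does not maximize it.

```python
import math

def J(a, b):
    """Integral over t in [0, 1] of exp((1 - t) * a + t * b)."""
    if abs(b - a) < 1e-12:
        return math.exp(0.5 * (a + b))
    return (math.exp(b) - math.exp(a)) / (b - a)

def adjusted_loglik(xs, phis):
    """Adjusted criterion for a piecewise-linear candidate log-density.

    xs   : sorted, distinct observations X_(1) < ... < X_(n)
    phis : candidate values phi(X_(i)); phi is linear in between and
           minus infinity outside [X_(1), X_(n)]
    """
    n = len(xs)
    # Integral of phi against the empirical distribution function.
    mean_phi = sum(phis) / n
    # Exact integral of exp(phi) over [X_(1), X_(n)], segment by segment.
    integral = sum(
        (xs[i + 1] - xs[i]) * J(phis[i], phis[i + 1]) for i in range(n - 1)
    )
    return mean_phi - integral

if __name__ == "__main__":
    # Uniform density on [0, 2] as a candidate: phi = -log 2 at every point.
    print(adjusted_loglik([0.0, 1.0, 2.0], [-math.log(2.0)] * 3))
```

A candidate that integrates to one, such as the uniform example in the usage lines, has criterion value equal to its average log-density minus one.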
Consistency of $\hat f_n$ and $\hat\varphi_n$:
• (Pal, Woodroofe, & Meyer, 2007):
If $f_0 \in \mathcal{P}_0$, then $H(\hat f_n, f_0) \to_{a.s.} 0$.
• (Dumbgen & Rufibach, 2009):
If $f_0 \in \mathcal{P}_0$ and $\varphi_0 \in \mathcal{H}^{\beta,L}(T)$ for some compact $T = [A, B] \subset \{x : f_0(x) > 0\}^\circ$, L < ∞, and 1 ≤ β ≤ 2, then
$$\sup_{t \in T} (\hat\varphi_n(t) - \varphi_0(t)) = O_p\left( \left( \frac{\log n}{n} \right)^{\beta/(2\beta+1)} \right), \quad \text{and}$$
$$\sup_{t \in T_n} (\varphi_0(t) - \hat\varphi_n(t)) = O_p\left( \left( \frac{\log n}{n} \right)^{\beta/(2\beta+1)} \right),$$
where $T_n \equiv [A + (\log n / n)^{\beta/(2\beta+1)},\ B - (\log n / n)^{\beta/(2\beta+1)}]$ and
β/(2β + 1) ∈ [1/3, 2/5] for 1 ≤ β ≤ 2.
• The same remains true if $\hat\varphi_n$, $\varphi_0$ are replaced by $\hat f_n$, $f_0$.
• If $\varphi_0 \in \mathcal{H}^{\beta,L}(T)$ as above and, with $\varphi_0' = \varphi_0'(\cdot-)$ or $\varphi_0'(\cdot+)$,
$\varphi_0'(x) - \varphi_0'(y) \ge C(y - x)$ for some C > 0 and all A ≤ x < y ≤ B,
then
$$\sup_{t \in T_n} |\hat F_n(t) - \mathbb{F}_n(t)| = O_p\left( \left( \frac{\log n}{n} \right)^{3\beta/(4\beta+2)} \right),$$
where 3β/(4β + 2) ∈ [1/2, 3/5] = [.5, .6] for 1 ≤ β ≤ 2.
• If β > 1, this implies $\sup_{t \in T_n} |\hat F_n(t) - \mathbb{F}_n(t)| = o_p(n^{-1/2})$.
[Figure: two rows of four panels each, showing density functions, log-density functions, CDFs, and hazard functions.]
[Figure. Top panel: estimated log-densities. Bottom panel: empirical processes.]
C: Limit theory at a fixed point in R
Assumptions:
• $f_0$ is log-concave, $f_0(x_0) > 0$.
• If $\varphi_0''(x_0) \ne 0$, then k = 2;
otherwise, k is the smallest integer such that
$\varphi_0^{(j)}(x_0) = 0$, j = 2, ..., k − 1, and $\varphi_0^{(k)}(x_0) \ne 0$.
• $\varphi_0^{(k)}$ is continuous in a neighborhood of $x_0$.
Example: $f_0(x) = C \exp(-x^4)$ with $C = \sqrt{2}\, \Gamma(3/4) / \pi$: k = 4.
Driving process: $Y_k(t) = \int_0^t W(s)\, ds - t^{k+2}$, W standard 2-sided
Brownian motion.
Invelope process: $H_k$ determined by limit Fenchel relations:
• $H_k(t) \le Y_k(t)$ for all t ∈ R.
• $\int_{\mathbb{R}} (H_k(t) - Y_k(t))\, dH_k^{(3)}(t) = 0$.
• $H_k^{(2)}$ is concave.
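The normalizing constant in the example above can be verified numerically. The sketch below is only a check, with hypothetical function names and a plain midpoint rule on an assumed truncation interval [−4, 4]; it confirms that C exp(−x⁴) integrates to one with C = √2 Γ(3/4)/π. For this density ϕ0(x) = log C − x⁴, so the second and third derivatives vanish at x0 = 0 (taking x0 = 0 is an assumption here) while the fourth derivative is −24, which is consistent with k = 4.

```python
import math

def normalizing_constant():
    """C = sqrt(2) * Gamma(3/4) / pi, as stated in the example."""
    return math.sqrt(2.0) * math.gamma(0.75) / math.pi

def integral_exp_minus_x4(upper=4.0, steps=40_000):
    """Midpoint-rule approximation of the integral of exp(-x^4) over [-upper, upper]."""
    h = 2.0 * upper / steps
    total = 0.0
    for i in range(steps):
        x = -upper + (i + 0.5) * h
        total += math.exp(-x ** 4)
    return total * h

if __name__ == "__main__":
    print(normalizing_constant() * integral_exp_minus_x4())
```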
Theorem. (Balabdaoui, Rufibach, & W, 2009)
• Pointwise limit theorem for $\hat f_n(x_0)$:
$$\begin{pmatrix} n^{k/(2k+1)} (\hat f_n(x_0) - f_0(x_0)) \\ n^{(k-1)/(2k+1)} (\hat f_n'(x_0) - f_0'(x_0)) \end{pmatrix} \to_d \begin{pmatrix} c_k H_k^{(2)}(0) \\ d_k H_k^{(3)}(0) \end{pmatrix}$$
where
$$c_k \equiv \left( \frac{f_0(x_0)^{k+1} |\varphi_0^{(k)}(x_0)|}{(k+2)!} \right)^{1/(2k+1)}, \qquad d_k \equiv \left( \frac{f_0(x_0)^{k+2} |\varphi_0^{(k)}(x_0)|^3}{[(k+2)!]^3} \right)^{1/(2k+1)}.$$
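• Pointwise limit theorem for $\hat\varphi_n(x_0)$:
$$\begin{pmatrix} n^{k/(2k+1)} (\hat\varphi_n(x_0) - \varphi_0(x_0)) \\ n^{(k-1)/(2k+1)} (\hat\varphi_n'(x_0) - \varphi_0'(x_0)) \end{pmatrix} \to_d \begin{pmatrix} C_k H_k^{(2)}(0) \\ D_k H_k^{(3)}(0) \end{pmatrix}$$
where
$$C_k \equiv \left( \frac{|\varphi_0^{(k)}(x_0)|}{f_0(x_0)^{k} (k+2)!} \right)^{1/(2k+1)}, \qquad D_k \equiv \left( \frac{|\varphi_0^{(k)}(x_0)|^3}{f_0(x_0)^{k-1} [(k+2)!]^3} \right)^{1/(2k+1)}.$$
• Proof: Use the same perturbation as for the convex-decreasing
density proof, with the perturbation version of the characterization.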
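The four constants in the two limit theorems are easy to tabulate, and doing so exposes a useful relation between them. The sketch below is an illustration with a hypothetical function name; the standard normal example at x0 = 0 (so k = 2 and the second derivative of ϕ0 equals −1) is an assumption chosen for convenience. It computes c_k, d_k, C_k, D_k and checks that c_k = f0(x0) C_k and d_k = f0(x0) D_k, as one would expect from the delta method applied to f = exp(ϕ).

```python
import math

def limit_constants(f0_x0, phi_k_x0, k):
    """Return (c_k, d_k, C_k, D_k) from the pointwise limit theorems.

    f0_x0    : value f0(x0) > 0
    phi_k_x0 : k-th derivative of phi0 = log f0 at x0 (nonzero)
    k        : order of the first nonvanishing derivative (k >= 2)
    """
    fact = math.factorial(k + 2)
    power = 1.0 / (2 * k + 1)
    a = abs(phi_k_x0)
    c_k = (f0_x0 ** (k + 1) * a / fact) ** power
    d_k = (f0_x0 ** (k + 2) * a ** 3 / fact ** 3) ** power
    C_k = (a / (f0_x0 ** k * fact)) ** power
    D_k = (a ** 3 / (f0_x0 ** (k - 1) * fact ** 3)) ** power
    return c_k, d_k, C_k, D_k

if __name__ == "__main__":
    f0 = 1.0 / math.sqrt(2.0 * math.pi)   # standard normal density at 0
    c, d, C, D = limit_constants(f0, -1.0, 2)
    print("rate exponent k/(2k+1):", 2 / 5)
    print(c, d, C, D)
```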
D: Mode estimation, log-concave density on R
Let $x_0 = M(f_0)$ be the mode of the log-concave density $f_0$,
recalling that $\mathcal{P}_0 \subset \mathcal{P}_{unimodal}$. Lower bound calculations using
Jongbloed’s perturbation $\varphi_\varepsilon$ of $\varphi_0$ yield:
Proposition. If $f_0 \in \mathcal{P}_0$ satisfies $f_0(x_0) > 0$, $f_0''(x_0) < 0$, and $f_0''$ is continuous in a neighborhood of $x_0$, and $T_n$ is any estimator
of the mode $x_0 \equiv M(f_0)$, then with $f_n \equiv \exp(\varphi_{\varepsilon_n})$, $\varepsilon_n \equiv \nu n^{-1/5}$,
and $\nu \equiv 2 f_0''(x_0)^2 / (5 f_0(x_0))$,
$$\liminf_{n \to \infty} n^{1/5} \inf_{T_n} \max\{E_n |T_n - M(f_n)|,\ E_0 |T_n - M(f_0)|\} \ge \frac{1}{4} \left( \frac{5/2}{10e} \right)^{1/5} \left( \frac{f_0(x_0)}{f_0''(x_0)^2} \right)^{1/5}.$$
Does the MLE $M(\hat f_n)$ achieve this?
Proposition. (Balabdaoui, Rufibach, & W, 2009)
Suppose that $f_0 \in \mathcal{P}_0$ satisfies:
• $\varphi_0^{(j)}(x_0) = 0$, j = 2, ..., k − 1,
• $\varphi_0^{(k)}(x_0) \ne 0$, and
• $\varphi_0^{(k)}$ is continuous in a neighborhood of $x_0$.
Then $\hat M_n \equiv M(\hat f_n) \equiv \min\{u : \hat f_n(u) = \sup_t \hat f_n(t)\}$ satisfies
$$n^{1/(2k+1)} (\hat M_n - M(f_0)) \to_d \left( \frac{((k+2)!)^2 f_0(x_0)}{f_0^{(k)}(x_0)^2} \right)^{1/(2k+1)} M(H_k^{(2)})$$
where $M(H_k^{(2)}) = \mathrm{argmax}(H_k^{(2)})$.
Note that when k = 2 this agrees with the lower bound
calculation, at least up to absolute constants.
[Figures (captions not recovered): five plots on [−1, 1] with legends f0, f1, f2; Y, H1, H2; X, F1, F2; f0, f1, f2; f′0, f′1, f′2.]
E: Generalizations of log-concave to R and Rd:
Three generalizations:
• log-concave densities on $\mathbb{R}^d$
(Cule, Samworth, and Stewart, 2010)
• s-concave and h-transformed convex densities on $\mathbb{R}^d$
(Seregin, 2010)
• Hyperbolically k-monotone and completely monotone
densities on R (Bondesson, 1981, 1992)
Log-concave densities on Rd:
• A density f on Rd is log-concave if f(x) = exp(ϕ(x)) with ϕ
concave.
• Some properties:
  ▷ Any log-concave f is unimodal.
  ▷ The level sets of f are closed convex sets.
  ▷ Convolutions of log-concave distributions are log-concave.
  ▷ Marginals of log-concave distributions are log-concave.
MLE of $f \in \mathcal{P}_0(\mathbb{R}^d)$: (Cule, Samworth, Stewart, 2010)
• The MLE $\hat f_n = \mathrm{argmax}_{f \in \mathcal{P}_0(\mathbb{R}^d)} \mathbb{P}_n \log f$ exists and is unique if
n ≥ d + 1.
• The estimator $\hat\varphi_n$ of $\varphi_0$ is a “taut tent” stretched over “tent
poles” of certain heights at a subset of the observations.
• Computable via non-differentiable convex optimization methods:
Shor’s (1985) r-algorithm; R package LogConcDEAD
(Cule, Samworth, Stewart, 2008).
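• If $f_0$ is any density on $\mathbb{R}^d$ with $\int_{\mathbb{R}^d} \|x\| f_0(x)\, dx < \infty$,
$\int_{\mathbb{R}^d} f_0(x) \log f_0(x)\, dx < \infty$, and $\{x \in \mathbb{R}^d : f_0(x) > 0\}^\circ = \mathrm{int}(\mathrm{supp}(f_0)) \ne \emptyset$, then $\hat f_n$ satisfies:
$$\int_{\mathbb{R}^d} |\hat f_n(x) - f^*(x)|\, dx \to_{a.s.} 0$$
where, for the Kullback-Leibler divergence
$$K(f_0, f) = \int f_0 \log(f_0 / f)\, d\mu,$$
$$f^* = \mathrm{argmin}_{f \in \mathcal{P}_0(\mathbb{R}^d)} K(f_0, f)$$
is the “pseudo-true” density in $\mathcal{P}_0(\mathbb{R}^d)$ corresponding to $f_0$.
In fact:
$$\int_{\mathbb{R}^d} e^{a \|x\|} |\hat f_n(x) - f^*(x)|\, dx \to_{a.s.} 0$$
for any $a < a_0$ where $f^*(x) \le \exp(-a_0 \|x\| + b_0)$.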
Questions and Problems:
Questions:
• Rates of convergence? Limiting distribution(s) for d > 1?
($n^r$ with r = 2/(4 + d)?)
• MLE (rate-) inefficient for d ≥ 4? How to penalize to get
efficient rates?
• Multivariate classes with nice preservation/closure properties
and smoother than log-concave?
• Can we treat $\hat f_n \in \mathcal{P}_h$ with misspecification: $f_0 \notin \mathcal{P}_h$?
• Algorithms for computing $\hat f_n \in \mathcal{P}_h$?
• Related results for convex regression on $\mathbb{R}^d$: Seijo and Sen,
Ann. Statist. (2011).
Further information:
• Other Jon W talks :
www.stat.washington.edu/jaw/RESEARCH/TALKS/talks.html
• Jon W Probability Seminar:
www.stat.washington.edu/jaw/RESEARCH/TALKS/UW-Prob-
October2016.pdf
• Jon W Frejus lectures:
  ▷ jaw/RESEARCH/TALKS/FrejusDay1-small.pdf
  ▷ jaw/RESEARCH/TALKS/FrejusDay2-small.pdf
  ▷ jaw/RESEARCH/TALKS/FrejusDay3-small.pdf
  ▷ jaw/RESEARCH/TALKS/FrejusDay4-small.pdf
  ▷ jaw/RESEARCH/TALKS/FrejusDay5-small.pdf
• Groeneboom & Jongbloed (2014).
Nonparametric Estimation under Shape Constraints. Cambridge U. Press.
• Kim, A. K., Guntuboyina, A., and Samworth (2016).
Adaptation in log-concave density estimation. arXiv:1609.00861.
• Chatterjee, S., Guntuboyina, A., and Sen, B. (2015). On
risk bounds in isotonic and other shape restricted regression
problems. Ann. Statist. 43, 1774-1800.
• Guntuboyina, A. and Sen, B. (2015). Global risk bounds and
adaptation in univariate convex regression. Prob. Theor.
Rel. Fields 163, 379-411.
Thanks!