the persistent homology of distance functions under random projection
TRANSCRIPT
The Persistent Homology of Distance Functions under Random Projection
Don Sheehy, University of Connecticut
Unions of Balls
Finite Point Set: topologically uninteresting.
Union of Balls: potentially interesting.
Idea: Fill in the gaps in the ambient space.
Examples: Molecules and Manifolds.
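Unions of balls are sublevels of the distance.
Input: $P \subset \mathbb{R}^d$
$P^\alpha = \bigcup_{p \in P} \mathrm{ball}(p, \alpha) = \{x \in \mathbb{R}^d \mid d(x, P) \le \alpha\}$
Persistent Homology was invented to track changes in the homology of $P^\alpha$ as $\alpha$ ranges from 0 to $\infty$.
$\mathrm{Pers}(\{P^\alpha\})$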
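The identity between the union of balls and the sublevel set of the distance function can be checked directly on a toy input. The following is a minimal sketch; the function names and the sample points are illustrative assumptions, not part of the talk.

```python
import math

def dist_to_set(x, points):
    """d(x, P): distance from x to the nearest point of P."""
    return min(math.dist(x, p) for p in points)

def in_union_of_balls(x, points, alpha):
    """True if x lies in a closed ball of radius alpha around some point of P."""
    return any(math.dist(x, p) <= alpha for p in points)

# Hypothetical toy input: three points in the plane and one query point.
P = [(0.0, 0.0), (2.0, 0.0), (1.0, 2.0)]
x = (1.0, 0.0)

for alpha in (0.5, 1.0, 1.5):
    # Membership in the union of balls agrees with the sublevel-set test.
    assert in_union_of_balls(x, P, alpha) == (dist_to_set(x, P) <= alpha)
    print(alpha, in_union_of_balls(x, P, alpha))
```

The query point is at distance 1 from its nearest input point, so it enters the union exactly at scale 1.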
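Filtered Simplicial Complexes
For $\sigma \subseteq P$,
$\mathrm{rad}(\sigma)$ = radius of the min. encl. ball of $\sigma$.
$\mathrm{diam}(\sigma) = \max_{p,q \in \sigma} \|p - q\|_2$.
Čech Complex: $C_P(\alpha) = \{\sigma \subseteq P \mid \mathrm{rad}(\sigma) \le \alpha\}$
Rips Complex: $R_P(\alpha) = \{\sigma \subseteq P \mid \mathrm{diam}(\sigma) \le 2\alpha\}$
Čech Filtration: $\{C_P(\alpha)\}_{\alpha \ge 0}$
Rips Filtration: $\{R_P(\alpha)\}_{\alpha \ge 0}$
$C_P(\alpha) \subseteq R_P(\alpha) \subseteq C_P(\sqrt{2}\,\alpha)$
$\mathrm{Pers}(\{R_P(\alpha)\})$ is a $\sqrt{2}$-approximation to $\mathrm{Pers}(\{C_P(\alpha)\})$.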
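The interleaving of the Čech and Rips complexes can be observed on a small planar example. This sketch assumes the convention that the Čech complex at scale α uses rad(σ) ≤ α and the Rips complex uses diam(σ) ≤ 2α. It only builds simplices with up to three vertices, and it uses a closed-form enclosing-ball radius that is valid for at most three points in the plane. All names and the sample points are hypothetical.

```python
import itertools
import math

def diam(pts):
    """Largest pairwise distance (0 for a single point)."""
    return max((math.dist(p, q) for p, q in itertools.combinations(pts, 2)),
               default=0.0)

def rad(pts):
    """Minimum enclosing ball radius for 1, 2, or 3 points in the plane."""
    if len(pts) == 1:
        return 0.0
    if len(pts) == 2:
        return math.dist(pts[0], pts[1]) / 2
    a, b, c = sorted(math.dist(p, q) for p, q in itertools.combinations(pts, 2))
    if a * a + b * b <= c * c:
        return c / 2  # right, obtuse, or degenerate: longest edge is a diameter
    s = (a + b + c) / 2
    area = math.sqrt(s * (s - a) * (s - b) * (s - c))  # Heron's formula
    return a * b * c / (4 * area)  # circumradius of an acute triangle

def simplices(P, max_size=3):
    for k in range(1, max_size + 1):
        yield from itertools.combinations(range(len(P)), k)

def cech(P, alpha):
    return {s for s in simplices(P) if rad([P[i] for i in s]) <= alpha}

def rips(P, alpha):
    return {s for s in simplices(P) if diam([P[i] for i in s]) <= 2 * alpha}

P = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.9), (2.0, 0.3)]
for alpha in (0.3, 0.55, 0.8, 1.2):
    # Cech(alpha) is contained in Rips(alpha), which is contained in Cech(sqrt(2) * alpha).
    assert cech(P, alpha) <= rips(P, alpha) <= cech(P, math.sqrt(2) * alpha)
    print(alpha, len(cech(P, alpha)), len(rips(P, alpha)))
```

At scale 0.55 the triangle on the first three points is already in the Rips complex (all edges are short enough) but not yet in the Čech complex, which is the kind of gap the $\sqrt{2}$ factor absorbs.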
Representing sublevels of distances
Čech Complex: size $O(n^{d+1})$.
$\alpha$-complex (a.k.a. Delaunay Filtration): size $O(n^{\lceil d/2 \rceil})$.
Quality Meshes: size $2^{O(d^2)} n$. (Sparse Čech Complex: $2^{O(d^2)} n$.)*
Key Point: Ambient Dimension Matters!
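To get a feel for how quickly these bounds grow with the ambient dimension, one can evaluate the leading terms for a fixed n. This is only a rough illustration: the functions below (hypothetical names) drop the constants hidden in the O-notation, so the values are growth rates rather than actual complex sizes.

```python
import math

def cech_bound(n, d):
    """Leading term n^(d+1) of the Cech complex size bound."""
    return n ** (d + 1)

def delaunay_bound(n, d):
    """Leading term n^ceil(d/2) of the alpha-complex size bound."""
    return n ** math.ceil(d / 2)

n = 100
for d in (2, 3, 10, 50):
    print(f"d={d}: Cech ~ {cech_bound(n, d):.1e}, alpha-complex ~ {delaunay_bound(n, d):.1e}")
```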
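Johnson Lindenstrauss Projection
Idea: Project to lower dimensions. Preserve pairwise distances.
Let $f : \mathbb{R}^D \to \mathbb{R}^d$ be a linear map where $d = O(\log n / \varepsilon^2)$ such that, for all $a, b, c \in P$:
1. $(1 - \varepsilon)\|a - b\|^2 \le \|f(a) - f(b)\|^2 \le (1 + \varepsilon)\|a - b\|^2$. Squared distances preserved up to multiplicative factor.
2. $|(b - a)^\top (c - a) - (f(b) - f(a))^\top (f(c) - f(a))| \le \varepsilon \|b - a\| \|c - a\|$. Inner products preserved up to additive factor.
[Figure: points a, b, c and their images f(a), f(b), f(c).]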
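One common way to obtain such a map is a random Gaussian matrix; the talk does not say which construction it has in mind, so treat that choice as an assumption. The sketch below draws a d × D matrix with i.i.d. N(0, 1/d) entries and measures the worst relative error in squared pairwise distances for a random point set. The dimensions, the seed, and all names are illustrative.

```python
import itertools
import math
import random

def gaussian_projection(D, d, rng):
    """Return a d x D matrix with i.i.d. N(0, 1/d) entries (one standard JL choice)."""
    scale = 1.0 / math.sqrt(d)
    return [[rng.gauss(0.0, scale) for _ in range(D)] for _ in range(d)]

def project(A, x):
    """Apply the linear map x -> A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def max_sq_distortion(P, Q):
    """Largest relative error in squared pairwise distances between P and its image Q."""
    return max(abs(sq_dist(Q[i], Q[j]) / sq_dist(P[i], P[j]) - 1.0)
               for i, j in itertools.combinations(range(len(P)), 2))

rng = random.Random(0)
D, d, n = 300, 200, 15
P = [[rng.uniform(-1.0, 1.0) for _ in range(D)] for _ in range(n)]
A = gaussian_projection(D, d, rng)
Q = [project(A, p) for p in P]
eps = max_sq_distortion(P, Q)
print(f"empirical epsilon for squared distances: {eps:.3f}")
```

This measures condition 1 only; checking condition 2 would need a similar loop over triples.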
Can we use JL for P.H. of distances?
Yes, for Rips filtrations, but not a tight approximation.
Distance function itself is not preserved.
Pairwise distances in sublevels are not preserved.
Is topology preserved? Maybe yes, maybe no.
Is persistent homology preserved? YES.
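Čech Filtration, MEBs, and Approximation
Čech Complex: $C_P(\alpha) = \{\sigma \subseteq P \mid \mathrm{rad}(\sigma) \le \alpha\}$
Čech Filtration: $\{C_P(\alpha)\}_{\alpha \ge 0}$
Let $P \subset \mathbb{R}^D$ and let $f$ be any map from $\mathbb{R}^D$ to $\mathbb{R}^d$.
Idea: If $f$ "preserves M.E.B. radii", then it preserves the persistent homology of the distance function.
For $S \subseteq P$, $(1 - 4\varepsilon)\,\mathrm{rad}(S)^2 \le \mathrm{rad}(f(S))^2 \le (1 + 4\varepsilon)\,\mathrm{rad}(S)^2$.
For all $\alpha \ge 0$, $C_P(\sqrt{1 - 4\varepsilon}\,\alpha) \subseteq C_{f(P)}(\alpha) \subseteq C_P(\alpha / \sqrt{1 - 4\varepsilon})$.
So, $\mathrm{Pers}(d(\cdot, f(P)))$ is a $(1 + O(\varepsilon))$-approximation to $\mathrm{Pers}(d(\cdot, P))$.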
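The step from the radius bounds to the interleaving of the two Čech filtrations can be written out as follows. This is a sketch under three assumptions: $\varepsilon < 1/4$, the convention $C_P(\alpha) = \{\sigma \subseteq P \mid \mathrm{rad}(\sigma) \le \alpha\}$, and identifying each simplex $\sigma$ with its image $f(\sigma)$.

```latex
\begin{align*}
\sigma \in C_P\!\left(\sqrt{1-4\varepsilon}\,\alpha\right)
  &\implies \mathrm{rad}(f(\sigma))^2
     \le (1+4\varepsilon)\,\mathrm{rad}(\sigma)^2
     \le (1+4\varepsilon)(1-4\varepsilon)\,\alpha^2
     \le \alpha^2
  \implies f(\sigma) \in C_{f(P)}(\alpha), \\
f(\sigma) \in C_{f(P)}(\alpha)
  &\implies \mathrm{rad}(\sigma)^2
     \le \frac{\mathrm{rad}(f(\sigma))^2}{1-4\varepsilon}
     \le \frac{\alpha^2}{1-4\varepsilon}
  \implies \sigma \in C_P\!\left(\frac{\alpha}{\sqrt{1-4\varepsilon}}\right).
\end{align*}
```

Both inclusions shift the scale by a factor of at most $1/\sqrt{1-4\varepsilon} = 1 + O(\varepsilon)$, which is where the $(1 + O(\varepsilon))$-approximation comes from.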
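MEBs under JL projection
Let $S = \{p_1, \ldots, p_r\}$ and let $x \in \mathrm{conv}(S)$.
$x = \sum_{i=1}^{r} \lambda_i p_i$, where $\lambda_i \ge 0$ and $\sum_{i=1}^{r} \lambda_i = 1$.
For any $p \in S$,
$\|x - p\|^2 = \left\| \sum_{i=1}^{r} \lambda_i (p_i - p) \right\|^2 = \sum_{i=1}^{r} \sum_{j=1}^{r} \lambda_i \lambda_j (p_i - p)^\top (p_j - p)$.
Since $f$ is linear, the same expansion holds for $\|f(x) - f(p)\|^2$, so
$\left| \|p - x\|^2 - \|f(p) - f(x)\|^2 \right| \le \sum_{i=1}^{r} \sum_{j=1}^{r} \lambda_i \lambda_j \left| (p_i - p)^\top (p_j - p) - (f(p_i) - f(p))^\top (f(p_j) - f(p)) \right|$
$\le \sum_{i=1}^{r} \sum_{j=1}^{r} \lambda_i \lambda_j\, \varepsilon\, \|p_i - p\| \|p_j - p\|$
$\le \sum_{i=1}^{r} \sum_{j=1}^{r} \lambda_i \lambda_j\, 4\varepsilon\, \mathrm{rad}(S)^2$ (each $\|p_i - p\| \le 2\,\mathrm{rad}(S)$)
$= 4\varepsilon\, \mathrm{rad}(S)^2$.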
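MEBs under JL projection
Theorem: Let $P$ be a set of points in $\mathbb{R}^D$ and let $f : \mathbb{R}^D \to \mathbb{R}^d$ be an $\varepsilon$-JL projection for $P$. For every subset $S$ of $P$,
$(1 - 4\varepsilon)\,\mathrm{rad}(S)^2 \le \mathrm{rad}(f(S))^2 \le (1 + 4\varepsilon)\,\mathrm{rad}(S)^2$.
Let $x = \mathrm{center}(S)$.
Upper Bound:
$\mathrm{rad}(f(S))^2 \le \max_{p \in S} \|f(x) - f(p)\|^2$
$\le \max_{p \in S} \left( \|x - p\|^2 + 4\varepsilon\, \mathrm{rad}(S)^2 \right)$
$\le \max_{p \in S} \left( (1 + 4\varepsilon)\, \mathrm{rad}(S)^2 \right)$
$= (1 + 4\varepsilon)\, \mathrm{rad}(S)^2$.
Lower Bound:
Let $q \in S$ be such that $\|q - x\| = \mathrm{rad}(S)$ and $\|f(q) - \mathrm{center}(f(S))\| \ge \|f(q) - f(x)\|$.
$\mathrm{rad}(f(S))^2 \ge \|f(q) - \mathrm{center}(f(S))\|^2$
$\ge \|f(q) - f(x)\|^2$
$\ge \|q - x\|^2 - 4\varepsilon\, \mathrm{rad}(S)^2$
$= (1 - 4\varepsilon)\, \mathrm{rad}(S)^2$.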
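A quick numerical experiment can compare enclosing-ball radii before and after projection. The sketch below makes several assumptions that are not from the talk: the projection is a random Gaussian matrix, the point set is a handful of standard basis vectors (whose exact radius is known), and the radius is estimated with a simple iterative scheme attributed to Bădoiu and Clarkson, which repeatedly moves the center toward the farthest point. The reported value is the radius of an enclosing ball around the final center, so it is an upper bound on the true radius.

```python
import math
import random

def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def meb_radius(points, iterations=400):
    """Approximate MEB radius: step toward the farthest point with step size 1/(i+1)."""
    c = list(points[0])
    for i in range(1, iterations + 1):
        far = max(points, key=lambda p: sq_dist(c, p))
        c = [ci + (fi - ci) / (i + 1) for ci, fi in zip(c, far)]
    # Radius of the ball centered at c that encloses all points.
    return math.sqrt(max(sq_dist(c, p) for p in points))

def gaussian_projection(D, d, rng):
    """d x D matrix with i.i.d. N(0, 1/d) entries (an assumed JL construction)."""
    scale = 1.0 / math.sqrt(d)
    return [[rng.gauss(0.0, scale) for _ in range(D)] for _ in range(d)]

def project(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

rng = random.Random(1)
D, d, r = 300, 200, 6
# S: the first r standard basis vectors of R^D; exact rad(S)^2 = 1 - 1/r.
S = [[1.0 if j == i else 0.0 for j in range(D)] for i in range(r)]
A = gaussian_projection(D, d, rng)
fS = [project(A, p) for p in S]

r_orig = meb_radius(S)
r_proj = meb_radius(fS)
print(f"rad(S) ~ {r_orig:.4f} (exact {math.sqrt(1 - 1 / r):.4f}), rad(f(S)) ~ {r_proj:.4f}")
print(f"ratio of squared radii ~ {(r_proj / r_orig) ** 2:.3f}")
```

The theorem predicts a ratio of squared radii within $1 \pm 4\varepsilon$ for the $\varepsilon$ achieved by the map; this sketch only prints the ratio and does not estimate $\varepsilon$ itself.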
Extension to k-NN distances.
$d_P^k(x)$ = distance from $x$ to its $k$-th nearest point of $P$.
Corollary: If $f$ is an $\varepsilon$-JL projection then for all $k$, $\mathrm{Pers}(d_{f(P)}^k)$ is a $(1 + O(\varepsilon))$-approximation to $\mathrm{Pers}(d_P^k)$.
Bonus: Also works for weighted points.
Going forward…
• Eliminate inner product condition.
• Eliminate constant factor (4).
• Eliminate linearity condition.
• Extend to distances to measures.
Thank you.