“What’s topology got to do with data analysis?”

MATH 4910 & 5910 Topological Data Analysis

Instructor: Mehmet Aktas

January 9, 2018

1 / 26 Instructor: Mehmet Aktas MATH 4910 & 5910 Topological Data Analysis

Topology is “Pure” math!

http://xkcd.com/435/

Credit

This presentation is inspired by the following:

Robert Ghrist, Barcodes: The Persistent Topology of Data, 2008

Lecture Notes of Sara Kalisnik Verovsek

Outline

1 What is Topology?

2 What is data?

3 Topological Data Analysis

What is Topology?

Topology

A pure branch of mathematics that dates back to the 1700s.

Euler in Konigsberg

Konigsberg was a city in Prussia situated on the Pregel river (modern-day Kaliningrad, a major industrial center of western Russia). Seven bridges spanned the various branches of the river, as depicted in the picture.

Is it possible to cross all seven bridges exactly once and return to the starting point in a single stroll?
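
The bridge question can be checked mechanically with Euler's criterion: a closed walk that uses every bridge exactly once exists only if the graph is connected and every land mass touches an even number of bridges. The following is a minimal sketch in Python. The land-mass labels A to D and the bridge list are this sketch's own encoding of the classical layout (two river banks, a central island, and the land between the river branches), not something read off the slide's picture.

```python
from collections import Counter

# Hypothetical encoding of the seven bridges: A and B are the river banks,
# C is the central island, D is the land between the river branches.
BRIDGES = [("A", "C"), ("A", "C"), ("B", "C"), ("B", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]

def degrees(edges):
    """Count how many bridge ends touch each land mass."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def is_connected(edges):
    """Check that every land mass can be reached from any other."""
    nodes = {n for e in edges for n in e}
    seen, stack = set(), [next(iter(nodes))]
    while stack:
        n = stack.pop()
        if n in seen:
            continue
        seen.add(n)
        # Push the other endpoint of every bridge that touches n.
        stack.extend(v if u == n else u for u, v in edges if n in (u, v))
    return seen == nodes

def has_closed_euler_walk(edges):
    """Euler's criterion: connected and every degree even."""
    return is_connected(edges) and all(d % 2 == 0 for d in degrees(edges).values())

print(dict(degrees(BRIDGES)))
print(has_closed_euler_walk(BRIDGES))
```

In this encoding every land mass has an odd number of bridges (5, 3, 3, 3), so the check returns False: no such stroll exists. Only the connection pattern matters, not distances or the exact shape of the river, which is the topological point of the example.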

Why Topology?

Three key ideas:

1 Invariance under deformation

2 Coordinate freeness

3 Compressed representations

What is data?

Data is Big!

Big Data

Data is everywhere.

Data Growth

An article by Forbes states that

Data is growing faster than ever before

By the year 2020, about 1.7 megabytes of new information will be created every second for every human being on the planet.

What do data scientists do?

Make discoveries while swimming in data

The statistics reflect the significant and growing demand for data scientists:

Data mining tops LinkedIn's list of the “hottest skills of 2014”

Best Job in USA for 2016

3,433: Number of Job Openings in 2016

#16 Highest Paying Job in Demand in 2016

Average Base Salary: $105,395

What is Data science/mining?

Non-trivial extraction of implicit, previously unknown, and potentially useful information from data

Storing, organizing and integrating huge amounts of unstructured data

a.k.a. KDD (knowledge discovery in databases)

Types of Data

Time-series data,

Sequence data,

Graphs, social networks,

Multimedia, WWW data,

Text data.

Application of Data Science

Internet search

Recommender systems

Biological Classification

...

Data Mining Tasks

1 Prediction Methods (Supervised Learning): Predict unknown or future values of the data using other known data

Classification: Is this A or B?

Anomaly detection: Is this weird?

Regression: How much? How many?

Data Mining Tasks - Continued

2 Description Methods (Unsupervised Learning): Find human-interpretable (previously unknown) patterns that describe the (unlabeled) data

Clustering: How is data organized?

Association rule mining: Are these related?

What is data?

Collection of data objects and their attributes

Simple case: n × d matrix

n objects with d dimensions each,

the d columns are called variables, features or attributes of the objects

Topological Data Analysis

Data

In recent years, data has become complex

It is “Big Data”.

It also has very rich features.

Usually both!

The problem in both cases is that there is not a single story happening in your data

Topological Data Analysis

Topological Data Analysis (TDA) is a tool that filters out the irrelevant stories to get at something interesting.

TDA has applications in

Biology

Medical Sciences

Science of Voting

Music

Sports

...

Basic Idea in TDA

Data has “shape”,

shape has “meaning”,

meaning drives “values”.

Data has “shape”

Convert the data into a graph (more generally, a simplicial complex)

Use the data points as the vertices of the graph

To locate edges:

Choose a radius r and draw circles centered at the data points

Place an edge between a pair of points when their circles intersect
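
The edge rule can be sketched in a few lines of Python. This is an illustrative sketch, not the course's implementation: the function name `neighborhood_graph` and the sample points are assumptions made here. It relies on the fact that two circles of the same radius r intersect exactly when their centers are at most 2r apart.

```python
import math
from itertools import combinations

def neighborhood_graph(points, r):
    """Return the edges (i, j) whose radius-r circles intersect.

    Two circles of radius r centered at p and q intersect exactly when
    the distance between p and q is at most 2 * r.
    """
    edges = []
    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= 2 * r:
            edges.append((i, j))
    return edges

# Hypothetical toy data: four points in the plane.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
print(neighborhood_graph(points, r=0.6))  # [(0, 1), (1, 2)]
```

Here points 0 and 1 are joined, as are points 1 and 2, while the far-away point 3 stays isolated. Adding triangles and higher-dimensional pieces on top of such a graph is one common way to obtain a simplicial complex (for instance the Vietoris-Rips construction), which this sketch does not attempt.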

Which radius?

How many groups are there in this data?
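
The answer to this question depends on the radius, which a small experiment makes concrete. The sketch below is only an illustration under assumptions made here: the function name `count_groups` and the toy points are hypothetical, and "groups" is interpreted as connected components of the graph built from intersecting circles of radius r.

```python
import math
from itertools import combinations

def count_groups(points, r):
    """Number of connected components when circles of radius r are linked."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the representative of i's component.
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= 2 * r:
            parent[find(i)] = find(j)  # merge the two components
    return len({find(i) for i in range(len(points))})

# Hypothetical toy data: two tight pairs far apart on a line.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
for r in (0.25, 1.0, 6.0):
    print(r, count_groups(points, r))
```

For this toy data the count is 4 at r = 0.25, 2 at r = 1.0, and 1 at r = 6.0. No single radius is obviously the right one, which motivates looking at all scales at once.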

Shape has “meaning” - Persistent Homology

Persistent homology tracks the evolution of the topological features of data across scales

Betti numbers present these topological features of the data quantitatively: β0 is the number of connected components, β1 is the number of holes.
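
For a graph (a complex with vertices and edges only, and no filled-in triangles) both numbers can be computed directly: β0 is the number of connected components, and β1 = E − V + β0, where V and E are the numbers of vertices and edges. The following Python sketch illustrates this special case only. The function name `betti_numbers` and the example graph are assumptions made here, and general simplicial complexes need linear algebra that is not shown.

```python
def betti_numbers(num_vertices, edges):
    """Return (beta0, beta1) of a graph given as a list of vertex pairs."""
    parent = list(range(num_vertices))

    def find(i):
        # Follow parent links to the representative of i's component.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for u, v in edges:
        parent[find(u)] = find(v)
    beta0 = len({find(i) for i in range(num_vertices)})
    # Each edge beyond a spanning forest closes one independent loop.
    beta1 = len(edges) - num_vertices + beta0
    return beta0, beta1

# Hypothetical example: a square (one loop) plus an isolated vertex.
square_plus_point = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(betti_numbers(5, square_plus_point))  # (2, 1)
```

The square contributes one component and one hole, and the isolated vertex contributes a second component, giving β0 = 2 and β1 = 1.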

24 / 26 Instructor: Mehmet Aktas MATH 4910 & 5910 Topological Data Analysis

Page 70: MATH 4910 & 5910 Topological Data Analysis · 1 Prediction Methods (Supervised Learning): Predict unknown or future values of the data using other known data Classi cation: Is this

Topological Data Analysis

Persistence Barcodes

Barcodes offer an optimum balance between encoding rich shape information and not being computationally intensive.
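As an illustration of what a barcode contains, the sketch below computes the degree-0 barcode (connected components) of a point cloud. It is an assumption-laden example rather than the method used in the lecture: it assumes a Vietoris-Rips filtration in which an edge appears at a scale equal to the distance between its endpoints, and it uses a union-find structure so that each merge of two components ends one bar. The function name `h0_barcode` is hypothetical.

```python
from itertools import combinations
from math import dist, inf


def h0_barcode(points):
    """Degree-0 persistence barcode as a list of (birth, death) pairs.

    Every point is born at scale 0. A bar dies at the length of the edge
    that merges its component into another one; one bar never dies.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process all pairwise edges from shortest to longest.
    edges = sorted((dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(len(points)), 2))
    bars = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j       # two components merge here
            bars.append((0.0, length))    # so one bar ends at this scale
    if points:
        bars.append((0.0, inf))           # the component that survives
    return bars


# Three points on a line at positions 0, 1 and 3.
print(h0_barcode([(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]))
```

For the three points above, the bars are (0, 1), (0, 2), and one infinite bar: two merges at scales 1 and 2, after which a single component remains. Long bars correspond to features that persist over a wide range of scales, short bars to features that vanish quickly.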

Meaning drives “values” - Bottleneck Distance

We have a persistence barcode for each data set.

How can we compare two persistence barcodes?

A robust metric between two barcodes: bottleneck distance
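The comparison can be sketched for very small barcodes as follows. This is a brute-force illustration under stated assumptions, not an efficient or official implementation: bars are finite (birth, death) pairs, any bar may be matched to the diagonal (that is, treated as noise) at a cost of half its length, and two matched bars cost the larger of their birth and death differences. The bottleneck distance is then the smallest possible value of the worst cost over all matchings. The function name `bottleneck_distance` is hypothetical, and the running time grows factorially, so it is only suitable for a handful of bars.

```python
from itertools import permutations
from math import inf


def bottleneck_distance(bars_x, bars_y):
    """Brute-force bottleneck distance between two small finite barcodes."""
    n, m = len(bars_x), len(bars_y)
    size = n + m  # each side is padded with diagonal slots for the other side

    def cost(i, j):
        x = bars_x[i] if i < n else None  # None means a diagonal slot
        y = bars_y[j] if j < m else None
        if x is None and y is None:
            return 0.0                    # diagonal matched to diagonal
        if y is None:
            return (x[1] - x[0]) / 2      # bar of X matched to the diagonal
        if x is None:
            return (y[1] - y[0]) / 2      # bar of Y matched to the diagonal
        # Two real bars: sup-norm distance between (birth, death) points.
        return max(abs(x[0] - y[0]), abs(x[1] - y[1]))

    best = inf
    for perm in permutations(range(size)):
        worst = max((cost(i, perm[i]) for i in range(size)), default=0.0)
        best = min(best, worst)
    return best


# A long bar whose death time shifts by 1.
print(bottleneck_distance([(0.0, 4.0)], [(0.0, 5.0)]))
# The same long bar plus a short, noise-like bar of length 0.5.
print(bottleneck_distance([(0.0, 4.0), (1.0, 1.5)], [(0.0, 4.0)]))
```

The first call gives 1.0, the shift in the death time of the long bar. The second gives 0.25: the extra short bar is matched to the diagonal at half its length, so a small, noise-like feature changes the distance only slightly. This is the sense in which the metric is robust.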
