An Integrated Machine Learning Approach to Stroke Prediction



Page 1: An Integrated Machine Learning Approach to Stroke Prediction

Intelligent Database Systems Lab

N.Y.U.S.T.

I. M.

An Integrated Machine Learning Approach to Stroke Prediction

Presenter: Tsai Tzung Ruei Authors: Aditya Khosla, Yu Cao, Cliff Chiung-Yu Lin, Hsu-Kuang Chiu, Junling Hu, Honglak Lee

SIGKDD 2010

國立雲林科技大學 National Yunlin University of Science and Technology

Page 2: An Integrated Machine Learning Approach to Stroke Prediction

Outline

• Motivation
• Objective
• Methodology
• Experiments
• Conclusion
• Comments

Page 3: An Integrated Machine Learning Approach to Stroke Prediction

Motivation

Most previous prediction models have adopted features (risk factors) that are verified by clinical trials or selected manually by medical experts.

Earlier prediction models did not explore high-performance machine learning algorithms such as SVM and logistic regression.

Page 4: An Integrated Machine Learning Approach to Stroke Prediction

Objective

To propose a novel automatic feature selection algorithm that selects robust features based on our proposed heuristic: conservative mean.

To present a margin-based censored regression algorithm that combines the concept of margin-based classifiers with censored regression to achieve a better concordance index than the Cox model.
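
The concordance index named in the objective measures how often a model ranks two subjects in the right order. A minimal Python sketch follows, assuming right-censored data and a risk score where higher means the event is expected sooner. The function name, argument names, and the choice to skip pairs with tied times are illustrative assumptions, not taken from the slides.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell-style concordance index for right-censored data (sketch).

    times[i]       : observed time (event or censoring) for subject i
    events[i]      : True if the event was observed, False if censored
    risk_scores[i] : model output; higher means the event is expected sooner
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject a has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Assumption: a pair is comparable only when the two times differ
        # and the earlier time is an observed event, not a censoring time.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event received the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means every comparable pair is ordered correctly, and 0.5 is what identical scores for all subjects would give.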


Page 5: An Integrated Machine Learning Approach to Stroke Prediction

Methodology

Missing Data Imputation
• Column mean
• Column median
• Imputation through linear regression
• Regularized Expectation Maximization (EM)

Feature Selection
• Forward feature selection
• L1 regularized logistic regression
• Conservative mean feature selection

Learning Algorithms for Prediction
• Margin-based Censored Regression
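
The two simplest imputation options listed on this slide, column mean and column median, can be sketched in a few lines of Python. This is only an illustration: the function name, the use of `None` to mark a missing value, and the list-of-rows layout are assumptions, and the regression-based and EM-based options are not shown.

```python
from statistics import mean, median


def impute_columns(rows, strategy="mean"):
    """Fill None entries with the column mean or median of the observed values."""
    if strategy not in ("mean", "median"):
        raise ValueError("strategy must be 'mean' or 'median'")
    fill = mean if strategy == "mean" else median
    n_cols = len(rows[0])
    # One fill value per column, computed from non-missing entries only.
    # A column with no observed values raises statistics.StatisticsError.
    fill_values = []
    for c in range(n_cols):
        observed = [row[c] for row in rows if row[c] is not None]
        fill_values.append(fill(observed))
    # Return a new table; the input rows are left unchanged.
    return [
        [fill_values[c] if row[c] is None else row[c] for c in range(n_cols)]
        for row in rows
    ]


# Made-up toy table, not data from the study.
table = [[1.0, None], [3.0, 10.0], [None, 20.0], [8.0, 60.0]]
filled_mean = impute_columns(table, "mean")      # missing entries become 4.0 and 30.0
filled_median = impute_columns(table, "median")  # missing entries become 3.0 and 20.0
```

The median variant is less sensitive to extreme values in a column, which is visible in the second column of the toy table.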

Page 6: An Integrated Machine Learning Approach to Stroke Prediction

Methodology

Conservative mean feature selection

• Considers the variance of the prediction performance across different folds along with its average.

• Evaluates the performance of each feature individually.
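
A short Python sketch of how such a heuristic could rank features. The slide does not give the formula, so the sketch assumes the conservative mean is the mean of the per-fold scores minus one sample standard deviation. That definition, the function names, and the toy scores are all assumptions made for illustration.

```python
from statistics import mean, stdev


def conservative_mean(fold_scores):
    """Assumed definition: fold average penalized by fold-to-fold spread."""
    return mean(fold_scores) - stdev(fold_scores)


def rank_features(scores_by_feature, top_k):
    """Rank features by conservative mean of their individual fold scores."""
    ranked = sorted(
        scores_by_feature,
        key=lambda name: conservative_mean(scores_by_feature[name]),
        reverse=True,
    )
    return ranked[:top_k]


# Made-up per-fold scores for three hypothetical features, each evaluated alone.
scores = {
    "stable": [0.70, 0.72, 0.71],
    "noisy": [0.95, 0.50, 0.70],
    "modest": [0.65, 0.66, 0.64],
}
selected = rank_features(scores, top_k=2)  # ['stable', 'modest']
```

In the toy data "noisy" has the highest plain average, but its large spread across folds pushes it below the two consistent features, which is the kind of robustness the heuristic is aiming for.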

[Figure: individually evaluated features: Age, Calculated hypertension status, Left ventricular mass]

Page 7: An Integrated Machine Learning Approach to Stroke Prediction

Methodology

Conservative mean feature selection

[Figure: VECTOR of Age, Left ventricular mass, Calculated hypertension status]

Page 8: An Integrated Machine Learning Approach to Stroke Prediction

Methodology

Learning Algorithms for Prediction
• Margin-based Censored Regression

[Figure: diagram labels: SVM, True, False]
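
The slide names Margin-based Censored Regression without stating its objective function. The Python sketch below shows one plausible way to combine a margin-based loss with censoring, and it should be read as an assumption rather than the authors' formulation: a linear model predicts time to event, observed events are penalized for errors in both directions beyond a margin `epsilon`, and censored subjects are penalized only when the prediction falls before the censoring time. All names and parameters are hypothetical.

```python
def mcr_loss(w, X, times, events, epsilon=0.0, reg=1.0):
    """Assumed margin-based loss for censored regression with a linear model.

    w      : weight vector
    X      : list of feature vectors
    times  : observed time (event or censoring) per subject
    events : True if the event was observed, False if censored
    """
    # L2 regularizer, as in other margin-based methods such as SVM.
    total = reg * 0.5 * sum(wi * wi for wi in w)
    for x, t, observed in zip(X, times, events):
        pred = sum(wi * xi for wi, xi in zip(w, x))
        under = max(0.0, (t - pred) - epsilon)  # predicted earlier than t
        over = max(0.0, (pred - t) - epsilon)   # predicted later than t
        # A censored subject is only known to be event-free up to time t,
        # so predicting a later time is not treated as an error.
        total += under + (over if observed else 0.0)
    return total


# Made-up toy inputs: one observed and one censored subject, same features.
loss_late = mcr_loss([1.0], [[5.0], [5.0]], [3.0, 3.0], [True, False], reg=0.0)   # 2.0
loss_early = mcr_loss([1.0], [[5.0], [5.0]], [7.0, 7.0], [True, False], reg=0.0)  # 4.0
```

The asymmetry for censored subjects is the part borrowed from censored regression, while the margin and the regularizer come from margin-based classifiers.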

Page 9: An Integrated Machine Learning Approach to Stroke Prediction

Experiments

Data Imputation

Feature Selection

[Figure: pipeline diagram with stages Missing Data Imputation, Feature Selection, Learning Algorithms for Prediction]

Page 10: An Integrated Machine Learning Approach to Stroke Prediction

Experiments

Stroke Prediction

[Figure: pipeline diagram with stages Missing Data Imputation, Feature Selection, Learning Algorithms for Prediction]

Page 11: An Integrated Machine Learning Approach to Stroke Prediction

Experiments

Identifying risk factors


Page 12: An Integrated Machine Learning Approach to Stroke Prediction

Conclusion

Contribution

• An extensive evaluation of the problems of data imputation, feature selection and prediction in medical data, with comparisons against the Cox proportional hazards model.

• A novel feature selection algorithm, Conservative Mean feature selection, that outperforms both the L1 regularized Cox model and L1 regularized logistic regression on the CHS dataset.

• A novel risk prediction algorithm, Margin-based Censored Regression, that outperforms the Cox model given the same set of features.

Page 13: An Integrated Machine Learning Approach to Stroke Prediction

Comments

Advantage
• The structure of this paper is very clear.

Drawback
• ……

Application
• Classification