

Page 1: Introduction to Neural Networks in Medical Diagnosis Włodzisław Duch Dept. of Informatics, Nicholas Copernicus University, Toruń, Poland


What is it about?

• Data is precious! But also overwhelming ...
• Statistical methods are important, but new techniques may frequently be more accurate and give more insight into the data.
• Data analysis requires intelligence.
• Inspirations come from many sources, including biology: artificial neural networks, evolutionary computing, immune systems ...

Computational Intelligence

[Diagram labels: Computational Intelligence (Data + Knowledge); Artificial Intelligence; expert systems; fuzzy logic; pattern recognition; machine learning; probabilistic methods; multivariate statistics; visualization; evolutionary algorithms; neural networks.]

What do these methods do?

• Provide non-parametric models of data.
• Allow classification of new data into pre-defined categories, supporting diagnosis & prognosis.
• Allow discovery of new categories.
• Allow understanding of the data by creating fuzzy or crisp logical rules.
• Help to visualize multi-dimensional relationships among data samples.
• Help to model real neural networks!

GhostMiner Philosophy

GhostMiner, data mining tools from our lab.

• There is no free lunch: provide different types of tools for knowledge discovery. Decision tree, neural, neurofuzzy, similarity-based, committees.
• Provide tools for visualization of data.
• Support the process of knowledge discovery/model building and evaluating, organizing it into projects.
• Separate the process of model building and knowledge discovery from model use => GhostMiner Developer & GhostMiner Analyzer

Neural networks

• Inspired by neurobiology: simple elements cooperate, changing internal parameters.
• Large field, dozens of different models, over 500 papers on NN in medicine each year.
• Supervised networks: heteroassociative mapping X => Y, symptoms => diseases, universal approximators.
• Unsupervised networks: clustering, competitive learning, autoassociation.
• Reinforcement learning: modeling behavior, playing games, sequential data.

Real and artificial neurons

[Figure: a biological neuron (labels: synapses, axon, dendrites) next to an artificial network (labels: synapses (weights), nodes (artificial neurons), signals).]

Neural network for MI diagnosis

[Figure: network with inputs Sex, Age, Smoking, ECG: ST Elevation, Pain Intensity, Pain Duration. Input weights feed the hidden nodes and output weights feed a single output, Myocardial Infarction ~ p(MI|X), shown with the example value 0.7.]

MI network function

Training: setting the values of weights and thresholds; efficient algorithms exist.

Effect: non-linear regression function

$$F_{MI}(X) = \sigma\left( \sum_{i=1}^{5} W^{o}_{i} \, \sigma\left( \sum_{k=1}^{6} W_{ik} X_k \right) \right)$$

Such networks are universal approximators: they may learn any mapping X => Y
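A minimal sketch of how such a network function is evaluated, assuming 6 inputs, 5 hidden units, one output, and logistic units throughout. The names `sigmoid` and `mi_network`, the weights, the thresholds, and the input scaling are all hypothetical and chosen only to make the example runnable; they are not the trained MI network.

```python
import math

def sigmoid(a):
    """Logistic function, squashing any real activation into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))

def mi_network(x, w_in, b_in, w_out, b_out):
    """Forward pass of a one-hidden-layer network with a single output.

    x     : 6 input values (assumed already scaled)
    w_in  : 5 rows of 6 input weights, one row per hidden unit
    b_in  : 5 hidden-unit thresholds (biases)
    w_out : 5 output weights
    b_out : output threshold (bias)
    """
    hidden = [
        sigmoid(sum(w * xk for w, xk in zip(row, x)) + b)
        for row, b in zip(w_in, b_in)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)

# Hypothetical, untrained parameters (assumption, for illustration only).
W_IN = [[0.1 * (i + 1) * (-1) ** k for k in range(6)] for i in range(5)]
B_IN = [0.0] * 5
W_OUT = [0.5, -0.3, 0.8, -0.2, 0.4]
B_OUT = 0.0

# Hypothetical scaled inputs in the order:
# sex, age, smoking, ST elevation, pain intensity, pain duration.
x = [1.0, 0.65, 1.0, 1.0, 0.5, 0.3]
p_mi = mi_network(x, W_IN, B_IN, W_OUT, B_OUT)
print(f"p(MI|X) estimate: {p_mi:.3f}")
```

Training would adjust `W_IN`, `B_IN`, `W_OUT` and `B_OUT` so that the output approximates p(MI|X) on known cases; the sketch covers only the evaluation step.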

Learning dynamics

Decision regions shown every 200 training epochs in x3, x4 coordinates; borders are optimally placed with wide margins.

Neurofuzzy systems

Feature Space Mapping (FSM) neurofuzzy system. Neural adaptation, estimation of the probability density function (PDF) using a single hidden layer network (RBF-like) with nodes realizing separable functions:

$$G(X; P) = \prod_{i=1}^{N} G_i(X_i; P_i)$$

Fuzzy: μ(x) = 0, 1 (no/yes) is replaced by a degree μ(x) ∈ [0, 1]. Triangular, trapezoidal, Gaussian ... membership functions (MF).

[Figure: membership functions in many dimensions.]
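To make the idea of separable node functions and graded membership concrete, here is a small sketch. The function names and the particular Gaussian parameterization (center and width per feature) are assumptions made for illustration, not the FSM implementation.

```python
import math

def gaussian_mf(x, center, width):
    """Gaussian membership function; returns a degree in (0, 1]."""
    return math.exp(-((x - center) / width) ** 2)

def triangular_mf(x, left, peak, right):
    """Triangular membership function: 0 outside (left, right), 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def separable_node(x, centers, widths):
    """Multi-dimensional node built as a product of one-dimensional factors."""
    value = 1.0
    for xi, ci, wi in zip(x, centers, widths):
        value *= gaussian_mf(xi, ci, wi)
    return value

# Hypothetical two-feature example.
print(separable_node([1.0, 2.5], centers=[1.0, 2.0], widths=[0.5, 1.0]))
print(triangular_mf(0.5, left=0.0, peak=1.0, right=2.0))
```

Because the node is a product of one-dimensional factors, each factor can be read as a fuzzy condition on a single feature, which is what makes rule extraction from such nodes straightforward.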

Knowledge from networks

Simplify networks: force most weights to 0, quantize remaining parameters, be constructive!

• Regularization: mathematical technique improving predictive abilities of the network.
• Result: MLP2LN neural networks that are equivalent to logical rules.
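The slides do not state which penalty is used. As a hedged sketch, one penalty of this kind adds a weight-decay term that pushes weights toward 0 and a second term that vanishes only at the values 0, +1 and -1, which quantizes the surviving weights. The function name and the coefficients `lam1` and `lam2` are assumptions.

```python
def weight_penalty(weights, lam1, lam2):
    """Assumed regularization term added to the training error.

    lam1 scales a decay term that drives weights toward 0;
    lam2 scales a term that is zero only for weights equal to 0, +1 or -1.
    """
    decay = sum(w * w for w in weights)
    quantize = sum((w * (w - 1.0) * (w + 1.0)) ** 2 for w in weights)
    return 0.5 * lam1 * decay + 0.5 * lam2 * quantize

# Already-quantized weights incur no quantization cost; others do.
print(weight_penalty([0.0, 1.0, -1.0], lam1=0.0, lam2=1.0))
print(weight_penalty([0.5, 0.2, -0.7], lam1=0.0, lam2=1.0))
```

A network whose weights are all 0 or ±1 computes threshold logic over its inputs, which is why such a network can be read off as a set of logical rules.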

MLP2LN

Converts MLP neural networks into a network performing logical operations (LN).

[Diagram labels:]
• Input layer
• Aggregation: better features
• Linguistic units: windows, filters
• Rule units: threshold logic
• Output: one node per class.

Recurrence of breast cancer

Data from: Institute of Oncology, University Medical Center, Ljubljana, Yugoslavia.

286 cases: 201 no-recurrence (70.3%), 85 recurrence (29.7%).

Example record: no-recurrence-events, 40-49, premeno, 25-29, 0-2, ?, 2, left, right_low, yes

9 nominal features: age (9 bins), menopause, tumor-size (12 bins), nodes involved (13 bins), node-caps, degree-malignant (1,2,3), breast, breast quad, radiation.

Recurrence of breast cancer

Data from: Institute of Oncology, University Medical Center, Ljubljana, Yugoslavia.

Many systems used, 65-78% accuracy reported.

Single rule: IF (nodes-involved ∉ [0,2] ∧ degree-malignant = 3) THEN recurrence, ELSE no-recurrence

76.2% accuracy, only trivial knowledge in the data: highly malignant breast cancer involving many nodes is likely to strike back.
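The single rule is simple enough to write out directly. The sketch below assumes the number of involved nodes arrives as a bin label such as `"3-5"` (as in the example record quoted for this data set) and takes the lower bound of the bin; the helper names are hypothetical. It also computes the majority-class baseline from the quoted class counts, which is the figure the 76.2% should be compared against.

```python
def bin_lower_bound(bin_label):
    """Return the lower bound of a binned value, e.g. '3-5' -> 3."""
    return int(bin_label.split("-")[0])

def predict_recurrence(nodes_involved_bin, degree_malignant):
    """Apply the single rule: many nodes involved AND malignancy degree 3."""
    many_nodes = bin_lower_bound(nodes_involved_bin) > 2  # outside [0, 2]
    if many_nodes and degree_malignant == 3:
        return "recurrence"
    return "no-recurrence"

# Record quoted on the slide: nodes involved '0-2', degree-malignant 2.
print(predict_recurrence("0-2", 2))

# Always predicting the larger class (201 of 286 cases) as a baseline.
majority_baseline = 100.0 * 201 / 286
print(f"majority-class baseline: {majority_baseline:.1f}%")
```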

Recurrence - comparison.

Method                     10xCV accuracy (%)
MLP2LN, 1 rule             76.2
SSV DT, stable rules       75.7±1.0
k-NN, k=10, Canberra       74.1±1.2
MLP+backprop.              73.5±9.4 (Zarndt)
CART DT                    71.4±5.0 (Zarndt)
FSM, Gaussian nodes        71.7±6.8
Naive Bayes                69.3±10.0 (Zarndt)
Other decision trees       < 70.0

Breast cancer diagnosis.

Data from University of Wisconsin Hospital, Madison, collected by Dr. W.H. Wolberg.

699 cases, 9 features quantized from 1 to 10: clump thickness, uniformity of cell size, uniformity of cell shape, marginal adhesion, single epithelial cell size, bare nuclei, bland chromatin, normal nucleoli, mitoses.

Task: distinguish benign from malignant cases.

Breast cancer rules.

Data from University of Wisconsin Hospital, Madison, collected by Dr. W.H. Wolberg.

Simplest rule from MLP2LN, large regularization:

If uniformity of cell size […] 3 Then benign, Else malignant
Sensitivity = 0.97, Specificity = 0.85

More complex NN solutions, from 10CV estimate:
Sensitivity = 0.98, Specificity = 0.94
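For readers less familiar with the two figures of merit quoted here, the following sketch computes them from true and predicted labels. Treating "malignant" as the positive class is an assumption (the slides do not say which class is positive), and the labels in the example are made up.

```python
def sensitivity_specificity(y_true, y_pred, positive="malignant"):
    """Return (sensitivity, specificity) for a two-class problem.

    sensitivity = TP / (TP + FN): fraction of positive cases detected
    specificity = TN / (TN + FP): fraction of negative cases left alone
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical labels: 4 malignant cases (3 detected), 6 benign (5 correct).
y_true = ["malignant"] * 4 + ["benign"] * 6
y_pred = ["malignant"] * 3 + ["benign"] + ["benign"] * 5 + ["malignant"]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```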

Breast cancer comparison.

Method                       10xCV accuracy (%)
k-NN, k=3, Manhattan         97.0±2.1 (GM)
FSM, neurofuzzy              96.9±1.4 (GM)
Fisher LDA                   96.8
MLP+backprop.                96.7 (Ster, Dobnikar)
LVQ                          96.6 (Ster, Dobnikar)
IncNet (neural)              96.4±2.1 (GM)
Naive Bayes                  96.4
SSV DT, 3 crisp rules        96.0±2.9 (GM)
LDA (linear discriminant)    96.0
Various decision trees       93.5-95.6
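The k-NN entries in these comparisons name two distance functions, Manhattan and Canberra. A minimal sketch of k-NN classification with either distance follows; the training vectors are hypothetical, and ties in the vote are resolved by whichever label `Counter.most_common` lists first.

```python
from collections import Counter

def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def canberra(a, b):
    """Sum of |x - y| / (|x| + |y|); terms with a zero denominator are skipped."""
    total = 0.0
    for x, y in zip(a, b):
        denom = abs(x) + abs(y)
        if denom > 0:
            total += abs(x - y) / denom
    return total

def knn_predict(train_x, train_y, query, k, distance):
    """Majority vote among the k training cases closest to the query."""
    neighbours = sorted(
        zip(train_x, train_y), key=lambda pair: distance(pair[0], query)
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical cases with three features quantized from 1 to 10.
train_x = [[1, 1, 1], [2, 1, 1], [1, 2, 1], [8, 9, 7], [9, 8, 8], [10, 10, 9]]
train_y = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]
print(knn_predict(train_x, train_y, [2, 2, 1], k=3, distance=manhattan))
print(knn_predict(train_x, train_y, [9, 9, 9], k=3, distance=canberra))
```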

Melanoma skin cancer

Collected in the Outpatient Center of Dermatology in Rzeszów, Poland.

Four types of Melanoma: benign, blue, suspicious, or malignant.

250 cases, with almost equal class distribution.

Each record in the database has 13 attributes: asymmetry, border, color (6), diversity (5).

TDS (Total Dermatoscopy Score) - single index

Goal: hardware scanner for preliminary diagnosis.

Melanoma results

Method                       Rules   Training %   Test %
MLP2LN, crisp rules          4       98.0 all     100
SSV Tree, crisp rules        4       97.5±0.3     100
FSM, rectangular f.          7       95.5±1.0     100
knn + prototype selection    13      97.5±0.0     100
FSM, Gaussian f.             15      93.7±1.0     95±3.6
knn k=1, Manh, 2 features    --      97.4±0.3     100
LERS, rough rules            21      --           96.2

Antibiotic activity of pyrimidine compounds.

Pyrimidines: which compound has stronger antibiotic activity?

Common template, substitutions added at 3 positions, R3, R4 and R5.

27 features taken into account: polarity, size, hydrogen-bond donor or acceptor, pi-donor or acceptor, polarizability, sigma effect.

Pairs of chemicals (54 features) are compared: which one has higher activity?

2788 cases, 5-fold crossvalidation tests.
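One plausible way to set up such a pairwise task is to concatenate the two 27-feature descriptions into a single 54-feature case and label it by which compound is more active. The sketch below assumes exactly that; the compound data and the name `make_pair_case` are hypothetical, and how the original 2788 cases were actually assembled is not stated in the slides.

```python
from itertools import permutations

N_FEATURES = 27  # features describing one compound

def make_pair_case(features_a, features_b, activity_a, activity_b):
    """Build one comparison case: 54 features, label 1 if A is more active."""
    if len(features_a) != N_FEATURES or len(features_b) != N_FEATURES:
        raise ValueError("each compound needs exactly 27 features")
    return features_a + features_b, int(activity_a > activity_b)

# Hypothetical compounds: (feature vector, measured activity).
compounds = [
    ([0.1] * N_FEATURES, 0.9),
    ([0.5] * N_FEATURES, 0.4),
    ([0.9] * N_FEATURES, 0.7),
]

# Every ordered pair of distinct compounds becomes one case.
cases = [
    make_pair_case(fa, fb, aa, ab)
    for (fa, aa), (fb, ab) in permutations(compounds, 2)
]
print(len(cases), "cases with", len(cases[0][0]), "features each")
```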

Antibiotic activity - results.

Pyrimidines: which compound has stronger antibiotic activity?

Mean Spearman's rank correlation coefficient used: rs

Method                    Rank correlation
FSM, 41 Gaussian rules    0.77±0.03
Golem (ILP)               0.68
Linear regression         0.65
CART (decision tree)      0.50
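Spearman's rank correlation can be computed as the Pearson correlation of the ranks. The sketch below does this with average ranks for ties; the example values are made up and the function names are illustrative.

```python
import math

def ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(a, b):
    """Spearman's rs: Pearson correlation between the ranks of a and b."""
    ra, rb = ranks(a), ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical true and predicted activity orderings for five compounds.
true_activity = [1, 2, 3, 4, 5]
predicted_activity = [2, 1, 4, 3, 5]
print(spearman(true_activity, predicted_activity))
```

A value of 1 means the predicted ordering matches the true ordering exactly, and 0 means no monotonic relationship.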

Thyroid screening.

Garavan Institute, Sydney, Australia

Features: 15 binary, 6 continuous

Training: 93+191+3488 cases. Validation: 73+177+3178 cases.

• Determine important clinical factors
• Calculate prob. of each diagnosis.

[Diagram: network with clinical findings as inputs (age, sex, ..., TSH, T4U, T3, TT4, TBG), a layer of hidden units, and final diagnoses as outputs (Normal, Hyperthyroid, Hypothyroid).]

Thyroid: some results.

Accuracy of diagnoses obtained with different systems.

Method                     Rules/Features   Training %   Test %
MLP2LN optimized           4/6              99.9         99.36
CART/SSV Decision Trees    3/5              99.8         99.33
Best Backprop MLP          -/21             100          98.5
Naïve Bayes                -/-              97.0         96.1
k-nearest neighbors        -/-              -            93.8
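When reading these accuracies it helps to know how far a trivial classifier would get on such unbalanced data. The short calculation below uses the class counts quoted for the thyroid training and validation sets and assumes nothing beyond them; the function name is illustrative.

```python
def majority_share(class_counts):
    """Accuracy (%) of always predicting the largest class."""
    return 100.0 * max(class_counts) / sum(class_counts)

train_counts = [93, 191, 3488]   # class sizes quoted for the training set
test_counts = [73, 177, 3178]    # class sizes quoted for the validation set

print(f"training cases: {sum(train_counts)}, baseline {majority_share(train_counts):.1f}%")
print(f"validation cases: {sum(test_counts)}, baseline {majority_share(test_counts):.1f}%")
```

Any method scoring near the largest-class share has learned little, so the useful comparison is the margin above that baseline rather than the raw accuracy.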

Psychometry

MMPI (Minnesota Multiphasic Personality Inventory) psychometric test.

Printed forms are scanned, or a computerized version of the test is used.

• Raw data: 550 questions, e.g.: I am getting tired quickly: Yes - Don’t know - No

• Results are combined into 10 clinical scales and 4 validity scales using fixed coefficients.

• Each scale measures tendencies towards hypochondria, schizophrenia, psychopathic deviations, depression, hysteria, paranoia etc.

Psychometry

• There is no simple correlation between single values and final diagnosis.

• Results are displayed in the form of a histogram, called ‘a psychogram’. Interpretation depends on the experience and skill of an expert and takes into account correlations between peaks.

Goal: an expert system providing evaluation and interpretation of MMPI tests at an expert level.

Problem: experts agree only 70% of the time; alternative diagnoses and personality changes over time are important.

Psychometric data

1600 cases for women, same number for men.

27 classes: norm, psychopathic, schizophrenia, paranoia, neurosis, mania, simulation, alcoholism, drug addiction, criminal tendencies, abnormal behavior due to ...

Extraction of logical rules: 14 scales = features.

Define linguistic variables and use FSM, MLP2LN, SSV - giving about 2-3 rules/class.

Psychometric data

10-CV for FSM is 82-85%, for C4.5 is 79-84%. Input uncertainty +G_x around 1.5% (best ROC) improves FSM results to 90-92%.

Method   Data   N. rules   Accuracy   +G_x %
C 4.5    ♀      55         93.0       93.7
         ♂      61         92.5       93.1
FSM      ♀      69         95.4       97.6
         ♂      98         95.9       96.9

Psychometric Expert

Probabilities for different classes. For greater uncertainties more classes are predicted.

Fitting the rules to the conditions: typically 3-5 conditions per rule; Gaussian distributions around measured values that fall into the rule interval are shown in green.

Verbal interpretation of each case, rule and scale dependent.
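The idea of placing a Gaussian distribution around each measured value and asking how much of it falls into a rule interval can be sketched as follows. The scale names, the interval bounds, the value of `sigma`, and the assumption that uncertainties on different scales are independent (so that per-condition probabilities multiply) are all illustrative; this is not the system's actual code.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def interval_probability(x, sigma, low, high):
    """Mass of a Gaussian centred on the measured value x that lies in [low, high]."""
    return normal_cdf((high - x) / sigma) - normal_cdf((low - x) / sigma)

def rule_probability(measured, sigma, conditions):
    """Probability that all interval conditions of one rule hold.

    measured   : dict mapping scale name -> measured value
    conditions : dict mapping scale name -> (low, high) rule interval
    Assumes independent uncertainties, so the probabilities multiply.
    """
    p = 1.0
    for scale, (low, high) in conditions.items():
        p *= interval_probability(measured[scale], sigma, low, high)
    return p

# Hypothetical case with two scales and a two-condition rule.
measured = {"scale_1": 62.0, "scale_2": 48.0}
rule = {"scale_1": (60.0, 80.0), "scale_2": (40.0, 55.0)}
print(rule_probability(measured, sigma=1.5, conditions=rule))
```

With larger `sigma` a value near an interval border contributes probability to several rules at once, which is consistent with more classes being predicted for greater uncertainties.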

Visualization

Probability of classes versus input uncertainty.

Detailed input probabilities around the measured values vs. change in a single scale; changes over time define the ‘patient’s trajectory’.

Interactive multidimensional scaling: zooming in on the new case to inspect its similarity to other cases.

Summary

Neural networks and other computational intelligence methods are useful additions to the multivariate statistical tools.

They support diagnosis, predictions, and data understanding: extracting rules, prototypes.

FDA has approved many devices that use ANNs:

Oxford Instruments Ltd EEG analyzer,

Cardionetics (UK) ECG analyzer,

PAPNET (NSI), analysis of Pap smears

...

Challenges

• Discovery of theories rather than data models
• Integration with image/signal analysis
• Integration with reasoning in complex domains
• Combining expert systems with neural networks

...

Fully automatic universal data analysis systems: press the button and wait for the truth ...

We are slowly getting there.

More & more computational intelligence tools (including our own) are available.