2010 Spring, Bioinformatics II Presentation
Prof. Yu Zhang
Presented by Dannise, Jangyoung, Marcus, Bongsoo
Breast Cancer Diagnostics
Outline
Introduction
SVM
Logistic Regression
Conclusion & Discussion
Introduction
Epidemiology
Worldwide: the second most common cancer (1.3 million cases)
Most common type of cancer in women
US (2009): approximately 40,170 women were expected to die from breast cancer
Most common in well-developed countries
Strongly related to age
Source: World Health Organization, American Cancer Society
Introduction
Cancer that forms in tissues of the breast, usually the ducts and lobules
Diagnosis: mammogram, fine needle aspiration (FNA), or surgical biopsy to identify the nature of the mass
Figures: normal breast vs. breast cancer; fine needle aspiration; surgical biopsy
Introduction
Benign and malignant tumors
Benign: cyst or other disease
Malignant: cancer
Goal: reduce the number of predictors used to classify tumors, in order to simplify diagnosis
Data characteristics: radius, texture, perimeter, area, smoothness, compactness, concavity, concave points, symmetry, fractal dimension
Mangasarian et al. (1994)
Wolberg et al. (1994)
SVM (Support Vector Machine)
Breast Cancer Wisconsin data set (569 rows × 32 columns)
Linearly separable (benign & malignant)
SVM
Means-only model (columns 3-12)
Benign 99.43%
Malignant 97.63%
SVM
Cross-validation of the model: fit a model with 80% of the rows, then check whether it can predict the type of the other 20% of rows
Benign 94.80%
Malignant 91.66%
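The 80/20 check described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the presenters' code: the data are synthetic two-feature points, the function names (`holdout_split`, `train_linear_svm`, `per_class_accuracy`) are hypothetical, and the linear SVM is trained with plain subgradient descent on the hinge loss rather than with whatever SVM package was used for the slides.

```python
import random

def holdout_split(n_rows, train_fraction=0.8, seed=0):
    """Shuffle row indices and split them into train and test parts."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    cut = int(round(train_fraction * n_rows))
    return idx[:cut], idx[cut:]

def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=100, seed=0):
    """Linear SVM via stochastic subgradient descent on the
    L2-regularised hinge loss. Labels must be +1 / -1."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            if y[i] * score < 1.0:
                # Inside the margin or misclassified: hinge term is active.
                w = [wj + lr * (y[i] * xj - lam * wj) for wj, xj in zip(w, X[i])]
                b += lr * y[i]
            else:
                # Only the regularisation term contributes.
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def predict(w, b, row):
    return 1 if sum(wj * xj for wj, xj in zip(w, row)) + b >= 0 else -1

def per_class_accuracy(y_true, y_pred):
    """Percentage of correctly predicted rows, separately for each class."""
    acc = {}
    for label in sorted(set(y_true)):
        preds = [p for t, p in zip(y_true, y_pred) if t == label]
        acc[label] = 100.0 * sum(p == label for p in preds) / len(preds)
    return acc

# Synthetic stand-in data (assumption): -1 = benign, +1 = malignant.
rng = random.Random(42)
X, y = [], []
for _ in range(100):
    X.append([rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])
    y.append(-1)
    X.append([rng.gauss(4.0, 1.0), rng.gauss(4.0, 1.0)])
    y.append(1)

train_idx, test_idx = holdout_split(len(X))
w, b = train_linear_svm([X[i] for i in train_idx], [y[i] for i in train_idx])
y_test = [y[i] for i in test_idx]
y_pred = [predict(w, b, X[i]) for i in test_idx]
accuracy = per_class_accuracy(y_test, y_pred)
print(accuracy)
```

Reporting accuracy per class, as the slides do, keeps a strong benign rate from hiding a weaker malignant rate.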
Logistic Regression
Reduce the number of predictors
Simplify the diagnosis
Fewer measurements, less time, lower cost
Logistic Regression
Estimate Std. Error z value Pr(>|z|)
(Intercept) 7.35952 12.85259 0.573 0.5669
radius 2.04930 3.71588 0.551 0.5813
texture -0.38473 0.06454 -5.961 2.5e-09 ***
perimeter 0.07151 0.50516 0.142 0.8874
area -0.03980 0.01674 -2.377 0.0174 *
smoothness -76.43227 31.95492 -2.392 0.0168 *
compactness 1.46242 20.34249 0.072 0.9427
concavity -8.46870 8.12003 -1.043 0.2970
concave_points -66.82176 28.52910 -2.342 0.0192 *
symmetry -16.27824 10.63059 -1.531 0.1257
fractal_dimension 68.33703 85.55666 0.799 0.4244
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
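The z value and Pr(>|z|) columns in the table above follow from the first two columns: z is the estimate divided by its standard error, and the p-value is the two-sided normal tail probability. A minimal sketch that recomputes them from the printed estimates and keeps the predictors with p < 0.05 is given below. The 0.05 cutoff and the helper name `wald_test` are assumptions for illustration; the slides do not state the selection rule explicitly.

```python
import math

# (estimate, std. error) pairs copied from the coefficient table;
# the intercept is left out because it is not a candidate predictor.
coefficients = {
    "radius": (2.04930, 3.71588),
    "texture": (-0.38473, 0.06454),
    "perimeter": (0.07151, 0.50516),
    "area": (-0.03980, 0.01674),
    "smoothness": (-76.43227, 31.95492),
    "compactness": (1.46242, 20.34249),
    "concavity": (-8.46870, 8.12003),
    "concave_points": (-66.82176, 28.52910),
    "symmetry": (-16.27824, 10.63059),
    "fractal_dimension": (68.33703, 85.55666),
}

def wald_test(estimate, std_error):
    """Return (z, two-sided p-value) for a single coefficient."""
    z = estimate / std_error
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Keep predictors whose p-value is below the assumed 0.05 threshold.
significant = [
    name for name, (est, se) in coefficients.items()
    if wald_test(est, se)[1] < 0.05
]

for name, (est, se) in coefficients.items():
    z, p = wald_test(est, se)
    print(f"{name:18s} z = {z:7.3f}  p = {p:.4g}")
print("significant:", significant)
```

With this cutoff the surviving predictors are texture, area, smoothness, and concave points.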
SVM
Perform SVM again
Predictors used: texture, area, smoothness, and concave points
To check the validity of the model, we fit it to 80% of the data and make predictions about the remaining 20%
SVM Results
Full dataset
Type         Benign  Malignant
Correct (%)  96.92   90.57
Bootstrap
Type         Benign  Malignant
Correct (%)  96.63   89.85
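The slides do not say how the bootstrap figures were obtained. One common reading is sketched here as an assumption: resample the rows with replacement, fit on the resample, score the rows that were left out (out-of-bag), and pool the per-class hit rates over many rounds. To keep the example short, a nearest-centroid classifier stands in for the SVM; the data and all function names are hypothetical.

```python
import random

def nearest_centroid_fit(X, y):
    """Stand-in classifier (not an SVM): store the mean vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        s = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            s[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in s] for label, s in sums.items()}

def nearest_centroid_predict(centroids, row):
    """Predict the class whose centroid is closest in squared distance."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(row, centroids[label])),
    )

def bootstrap_accuracy(X, y, n_rounds=100, seed=0):
    """Pooled out-of-bag accuracy (%) per class over n_rounds resamples."""
    rng = random.Random(seed)
    n = len(X)
    hits, totals = {}, {}
    for _ in range(n_rounds):
        sample = [rng.randrange(n) for _ in range(n)]  # draw with replacement
        in_bag = set(sample)
        out_of_bag = [i for i in range(n) if i not in in_bag]
        model = nearest_centroid_fit([X[i] for i in sample], [y[i] for i in sample])
        for i in out_of_bag:
            correct = nearest_centroid_predict(model, X[i]) == y[i]
            totals[y[i]] = totals.get(y[i], 0) + 1
            hits[y[i]] = hits.get(y[i], 0) + int(correct)
    return {label: 100.0 * hits[label] / totals[label] for label in totals}

# Synthetic stand-in data (assumption): two features per tumor.
rng = random.Random(7)
X, y = [], []
for _ in range(100):
    X.append([rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])
    y.append("benign")
    X.append([rng.gauss(3.0, 1.0), rng.gauss(3.0, 1.0)])
    y.append("malignant")

result = bootstrap_accuracy(X, y)
print(result)
```

Scoring only out-of-bag rows keeps rows used for fitting out of the accuracy estimate within each round.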
Conclusion & Discussion
Summary table (correct, %)
Type                              Benign  Malignant
Mean model                        99.43   97.63
Cross Validation (80%)            94.80   91.66
The reduced model (Full Dataset)  96.62   90.57
The reduced model (Bootstrap)     96.63   89.85
Conclusion & Discussion
The characteristics of cells are key to diagnosing malignancy in breast cancer
SVM worked well for validating the diagnostic model
The reduced model is quite accurate, and it can help doctors reduce the cost and effort of diagnostics
Conclusion & Discussion
Cell line  Origin of cell  Estrogen receptors  Progesterone receptors  ERBB2 amplification
BT-20      Primary         No                  No                      No
BT-474     Primary         Yes                 Yes                     Yes
MCF-7      Metastasis      Yes                 Yes                     No
SK-BR-3    Metastasis      No                  No                      Yes
Treatment is based on the diagnostics of cell lines (examples of invasive ductal carcinoma)
Lasfargues, E.Y. et al. 1958. Cultivation of human breast carcinomas.
Borras, M. et al. 1997. Estrogen receptor-negative/progesterone receptor-positive Evsa-T mammary tumor cells: a model for assessing the biological property of this peculiar phenotype of breast cancers.
Conclusion & Discussion
Current breast cancer research and diagnostics using 3D pictures
(Dr. Mina J. Bissell, Lawrence Berkeley National Laboratory)
Weigelt, B. and Bissell, M.J. 2008. Unraveling the microenvironmental influences on the normal mammary gland and breast cancer. Seminars in Cancer Biology 18: 311-321.
Thank you very much! Any questions?