![Page 1: Mario Lauria](https://reader036.vdocuments.site/reader036/viewer/2022081408/62970f2d1381f80338232596/html5/thumbnails/1.jpg)
Rank-based Diagnostic Biomarkers
Mario Lauria, The Microsoft Research – University of Trento Centre for Computational and Systems Biology (COSBI), Rovereto, Italy
![Page 2: Mario Lauria](https://reader036.vdocuments.site/reader036/viewer/2022081408/62970f2d1381f80338232596/html5/thumbnails/2.jpg)
PART 1
Overview of the COSBI method
![Page 3: Mario Lauria](https://reader036.vdocuments.site/reader036/viewer/2022081408/62970f2d1381f80338232596/html5/thumbnails/3.jpg)
THE PROBLEM WE ARE TRYING TO SOLVE
- The goal: to devise a computational method to classify clinical samples based on transcriptomics data.
- many diseases/conditions have a recognizable signature at the transcriptional level
- Motivation: to define a new way of diagnosing several diseases.
- high accuracy,
- early detection,
- first diagnostic test for several conditions
- Current issues:
- lack of repeatable results;
- a high profile scandal has seriously undermined trust in new results
![Page 4: Mario Lauria](https://reader036.vdocuments.site/reader036/viewer/2022081408/62970f2d1381f80338232596/html5/thumbnails/4.jpg)
THE CANCER BIOMARKER SCANDAL AT DUKE
![Page 5: Mario Lauria](https://reader036.vdocuments.site/reader036/viewer/2022081408/62970f2d1381f80338232596/html5/thumbnails/5.jpg)
THE TECHNICAL ISSUES IN SIGNATURE IDENTIFICATION
Observed problems
- low degree of overlap between comparable studies
- lack of consensus on the size of a signature
Causes
- differences in lab protocols and normalization procedures, and the confounding role of the “batch effect” (noise issue)
- misguided insistence on identifying the smallest number of “champions” (size issue)
Our solution
- use of a large, composite signature (hundreds of genes)
- signatures based on gene ranks, not expression values
- a similarity map as output, illustrating signature-to-signature distance
![Page 6: Mario Lauria](https://reader036.vdocuments.site/reader036/viewer/2022081408/62970f2d1381f80338232596/html5/thumbnails/6.jpg)
SKETCH OF OUR APPROACH
• Our signatures are based on rank => lower data quality requirements
• The output of our method is a similarity map of profiles => neighborhood inspection adds a measure of robustness
[Figure labels: expression profile of patient A; ranked list of genes; composite signature; full network; network with threshold applied to distances]
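The first stage of the sketch, turning one expression profile into a rank-based composite signature, might look roughly like the following Python. This is only an illustration under stated assumptions: the function name `rank_signature`, the toy gene names and the tiny signature sizes are hypothetical, and the slides do not spell out the exact selection rule used by the COSBI method.

```python
def rank_signature(profile, n_up, n_down):
    """Return (up, down) gene sets taken from one expression profile.

    profile: dict mapping gene name -> expression value (hypothetical input).
    Only gene identities are kept; the expression values are discarded.
    """
    # Sort genes from highest to lowest expression.
    ranked = sorted(profile, key=profile.get, reverse=True)
    up = set(ranked[:n_up])                            # top of the ranked list
    down = set(ranked[-n_down:]) if n_down else set()  # bottom of the ranked list
    return up, down


# Toy profile for "patient A" (made-up values, for illustration only).
patient_a = {"G1": 9.1, "G2": 2.3, "G3": 7.4, "G4": 0.8, "G5": 5.0, "G6": 3.3}
up, down = rank_signature(patient_a, n_up=2, n_down=2)
print(sorted(up), sorted(down))
```

Because the signature stores only which genes sit at the top and bottom of the ranking, two profiles can be compared without agreeing on an absolute expression scale.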
![Page 7: Mario Lauria](https://reader036.vdocuments.site/reader036/viewer/2022081408/62970f2d1381f80338232596/html5/thumbnails/7.jpg)
ROBUSTNESS TO BATCH EFFECTS
• Signature analysis of miRNA expression data for a subset of immune cell types (data set GSE28489, from Allantaz et al., PLoS ONE 2012)
• Conventional clustering (below) shows separation by processing date
[Figure panels: clustering performed using PCA (from Allantaz et al., PLoS ONE 2012); clustering performed using the rank-based signature method]
- Our algorithm has no problem correctly clustering by cell types regardless of the date (right)
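One intuition for this robustness is that ranks do not change under any strictly increasing distortion of the measured intensities. The short Python sketch below illustrates that property only; the distortion used here is a hypothetical stand-in, and real batch effects are not guaranteed to be monotone.

```python
import math


def ranks(values):
    """Rank position of each value (0 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order):
        result[index] = position
    return result


sample = [5.2, 1.1, 8.7, 3.3, 6.0]
# Hypothetical batch effect: a strictly increasing distortion of intensities.
shifted = [1.8 * math.log2(v + 1.0) + 0.5 for v in sample]

# The raw values differ, but the rank vectors are identical.
print(ranks(sample) == ranks(shifted))
```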
![Page 8: Mario Lauria](https://reader036.vdocuments.site/reader036/viewer/2022081408/62970f2d1381f80338232596/html5/thumbnails/8.jpg)
EXAMPLE OF MULTI-SET ANALYSIS
• Signature analysis of miRNA expression data computed on two sets from different labs
- data sets GSE28489 from HUG and GSE28487 from Roche [Allantaz et al. PLoS ONE 2012]
• Result: no obvious separation between HUG and Roche samples (see …_repn and …_REPn labels, respectively)
![Page 9: Mario Lauria](https://reader036.vdocuments.site/reader036/viewer/2022081408/62970f2d1381f80338232596/html5/thumbnails/9.jpg)
3RD PARTY VERIFICATION OF PERFORMANCE
• The SBV IMPROVER challenge is a crowdsourcing approach to the problem of defining an effective diagnostic signature
• It is a joint initiative of IBM Research and Philip Morris International
• Challenge participants were asked to establish predictive signatures on unlabeled gene expression data sets for four diseases:
• Psoriasis
• Multiple Sclerosis
• Chronic Obstructive Pulmonary Disease
• Lung Cancer
• The submitted predictions and signatures were subsequently scored by an Independent Scoring Committee against the Gold Standard
![Page 10: Mario Lauria](https://reader036.vdocuments.site/reader036/viewer/2022081408/62970f2d1381f80338232596/html5/thumbnails/10.jpg)
SBV IMPROVER CHALLENGE RESULTS
- Our purpose in entering the competition was to find out how well a method based on ranks and zero previous knowledge would do
- result: we placed 1st in the Multiple Sclerosis sub-challenge, and 2nd overall out of 52 teams
![Page 11: Mario Lauria](https://reader036.vdocuments.site/reader036/viewer/2022081408/62970f2d1381f80338232596/html5/thumbnails/11.jpg)
DETAILS OF RESULTS
• Respectable performance across the board
• low performance on COPD probably points to a weakness of our method with respect to differences in genetic background
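| | rank | acc | auroc | aupr |
|---|---|---|---|---|
| COPD | 24 | 0.5625 | 0.5820 | 0.6636 |
| | | | 0.5820 | 0.4588 |
| Lung Cancer | 7 | 0.4800 | 0.7280 | 0.4524 |
| | | | 0.6366 | 0.3753 |
| | | | 0.6592 | 0.4436 |
| | | | 0.6327 | 0.4389 |
| MS Diag | 1 | 0.8833 | 0.8973 | 0.8439 |
| | | | 0.8973 | 0.9047 |
| Psoriasis | 11 | 0.9839 | 0.9857 | 0.9643 |
| | | | 0.9857 | 0.9938 |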
![Page 12: Mario Lauria](https://reader036.vdocuments.site/reader036/viewer/2022081408/62970f2d1381f80338232596/html5/thumbnails/12.jpg)
PART 2
The MS Diagnostic sub-challenge
![Page 13: Mario Lauria](https://reader036.vdocuments.site/reader036/viewer/2022081408/62970f2d1381f80338232596/html5/thumbnails/13.jpg)
THE MS DIAGNOSTIC SUBCHALLENGE
• OBJECTIVE: predicting relapsing-remitting multiple sclerosis (RRMS) or Control patients, based on the Peripheral Blood Mononuclear Cells (PBMC) transcriptome
• METHOD: two-step procedure:
• we used the full E-MTAB-69 public dataset as training set,
• we then built a combined map of the SBV IMPROVER samples plus a subset of E-MTAB-69 samples
• RESULT: the combined map (see next slide) produced two clusters of IMPROVER samples, which were later identified as MS/control by inspecting the differential expression of well-known MS-associated genes
• COSBI algorithm parameters:
• signature of size 200/300 (up/down)
• top 20% distances used for the map
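To make the two parameters concrete, here is a minimal Python sketch of building a thresholded map from up/down signatures. Several things are assumptions rather than the COSBI implementation: the distance is a simple set-overlap stand-in (the slides do not give the actual signature-to-signature distance), "top 20%" is read as "the 20% shortest distances", and the names `signature_distance` and `thresholded_map` and the toy signatures are hypothetical.

```python
from itertools import combinations


def signature_distance(sig_a, sig_b):
    """Stand-in distance: 1 minus the fraction of shared up/down genes."""
    up_a, down_a = sig_a
    up_b, down_b = sig_b
    shared = len(up_a & up_b) + len(down_a & down_b)
    total = len(up_a | up_b) + len(down_a | down_b)
    return 1.0 - shared / total


def thresholded_map(signatures, keep_fraction=0.2):
    """Keep only the closest keep_fraction of all sample pairs as edges."""
    pairs = sorted(
        (signature_distance(signatures[a], signatures[b]), a, b)
        for a, b in combinations(sorted(signatures), 2)
    )
    n_keep = max(1, round(keep_fraction * len(pairs)))
    return [(a, b, d) for d, a, b in pairs[:n_keep]]


# Toy signatures as (up set, down set); real ones would hold 200/300 genes.
signatures = {
    "s1": ({"A", "B"}, {"X", "Y"}),
    "s2": ({"A", "B"}, {"X", "Z"}),
    "s3": ({"C", "D"}, {"W", "V"}),
    "s4": ({"C", "D"}, {"W", "V"}),
    "s5": ({"A", "C"}, {"Y", "W"}),
}
edges = thresholded_map(signatures, keep_fraction=0.2)
for a, b, d in edges:
    print(a, b, round(d, 3))
```

With five samples there are ten pairs, so a 20% threshold keeps the two closest pairs; everything else is dropped from the map.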
![Page 14: Mario Lauria](https://reader036.vdocuments.site/reader036/viewer/2022081408/62970f2d1381f80338232596/html5/thumbnails/14.jpg)
MS DIAGNOSIS SUBCHALLENGE: THE SAMPLES MAP
• Map of samples from two datasets:
• E-MTAB-69 (red, green, blue)
• SBV Improver datasets (pink nodes)
• Clustering performed using GLay algorithm in Cytoscape
[Figure panels: before clustering; after clustering]
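As a rough illustration of how groups can be read off a thresholded map, the Python sketch below extracts connected components from a list of edges. This is a deliberately simpler stand-in and not the GLay community-detection algorithm used in Cytoscape; the node names and edges are hypothetical.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Group nodes that are linked, directly or indirectly, by an edge."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first walk from an unvisited node collects one component.
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(neighbours[node] - seen)
        components.append(group)
    return components


# Hypothetical map: reference samples (ref_*) and unlabeled samples (new_*).
nodes = ["ref_1", "ref_2", "new_1", "ref_3", "new_2", "new_3"]
edges = [("ref_1", "ref_2"), ("ref_2", "new_1"), ("ref_3", "new_2")]
groups = connected_components(nodes, edges)
print(groups)
```

An unlabeled sample that ends up isolated, like `new_3` here, is a visible signal that the map does not support a confident assignment for it.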
![Page 15: Mario Lauria](https://reader036.vdocuments.site/reader036/viewer/2022081408/62970f2d1381f80338232596/html5/thumbnails/15.jpg)
SENSITIVITY OF ALGORITHM PARAMETERS
• Low sensitivity to both signature length and distance threshold
Nup = 200 / Ndown = 300, top 20% of edges
Nup = 200 / Ndown = 300, top 10% of edges
Nup = 250 / Ndown = 250, top 20% of edges
![Page 16: Mario Lauria](https://reader036.vdocuments.site/reader036/viewer/2022081408/62970f2d1381f80338232596/html5/thumbnails/16.jpg)
RELATED WORK
• Rank-based methods are not new
• k-TSP is a signature definition method based on small (size k) signatures [Tan et al 2005]
• The combined use of rank + maps was proposed by Iorio [Iorio et al 2010] for analyzing Connectivity Map (cMAP) datasets
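For contrast with the large composite signatures used here, the following Python sketch shows the core idea behind top-scoring pairs: a gene pair is scored by how differently the ordering "gene i below gene j" occurs in the two classes. It is a simplified illustration for k = 1 with no tie-breaking rule, not a full k-TSP implementation, and the function names and toy data are hypothetical.

```python
from itertools import combinations


def pair_score(samples, labels, i, j):
    """|P(x_i < x_j | class 0) - P(x_i < x_j | class 1)| for genes i and j."""
    freq = []
    for cls in (0, 1):
        members = [s for s, y in zip(samples, labels) if y == cls]
        freq.append(sum(s[i] < s[j] for s in members) / len(members))
    return abs(freq[0] - freq[1])


def top_scoring_pair(samples, labels):
    """Return the gene index pair with the highest score (first one on ties)."""
    n_genes = len(samples[0])
    return max(
        combinations(range(n_genes), 2),
        key=lambda pair: pair_score(samples, labels, *pair),
    )


# Toy data: 4 samples x 3 genes, two samples per class (made-up values).
samples = [[1, 5, 3], [2, 6, 9], [7, 3, 8], [8, 2, 1]]
labels = [0, 0, 1, 1]
best = top_scoring_pair(samples, labels)
print(best, pair_score(samples, labels, *best))
```

Like the signatures in this talk, the pair rule depends only on relative ordering within a sample, but it concentrates the decision on very few genes.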
![Page 17: Mario Lauria](https://reader036.vdocuments.site/reader036/viewer/2022081408/62970f2d1381f80338232596/html5/thumbnails/17.jpg)
CONCLUSIONS AND FUTURE APPLICATIONS
• Our method is intuitive, quite general and completely oblivious of the underlying biology of the disease
• method in its simplest form already performs well
• It can be applied to any large dimensional data, therefore many applications are conceivable beyond gene expression signatures
• Current/future work:
• algorithm improvements: selection of signature size, other methods of gene selection
• new applications: very encouraging results on profiles of circulating miRNA as diagnostic biomarkers
![Page 18: Mario Lauria](https://reader036.vdocuments.site/reader036/viewer/2022081408/62970f2d1381f80338232596/html5/thumbnails/18.jpg)
ACKNOWLEDGMENTS
• Francesco Iorio (formerly TIGEM, now EBI) for the insightful discussions
• PMI for the funding
• Corrado Priami and others at COSBI for insightful discussions and support
![Page 19: Mario Lauria](https://reader036.vdocuments.site/reader036/viewer/2022081408/62970f2d1381f80338232596/html5/thumbnails/19.jpg)
THANK YOU
![Page 20: Mario Lauria](https://reader036.vdocuments.site/reader036/viewer/2022081408/62970f2d1381f80338232596/html5/thumbnails/20.jpg)
LARGE MS MAP