Abstract
Lung cancer is the leading cause of cancer death in the U.S. Non-small cell lung cancer (NSCLC), which accounts for the vast majority of lung cancers, has three main subtypes: adenocarcinoma, large cell carcinoma, and squamous cell carcinoma. These subtypes are difficult to distinguish histologically but demand distinct courses of treatment, and applying the wrong treatment can lead to serious adverse consequences. We have developed a gene expression-based classifier that distinguishes squamous from non-squamous NSCLC using the quantitative Nuclease Protection Assay (qNPA) [i]. The chemistry does not require RNA extraction, reverse transcription, or amplification and is fully automated on the HTG Edge System.
Results
Conclusions
We have developed a robust classifier that accurately distinguishes squamous from non-squamous NSCLC. Its performance was demonstrated on independent clinical validation samples.
Methods
We first identified a set of candidate biomarker genes through microarray analysis of matched fresh frozen and formalin-fixed paraffin-embedded (FFPE) samples from 134 NSCLC patients, plus in silico analysis of six microarray data sets from GEO. We then refined the set of marker genes and developed a Support Vector Machine (SVM) classifier on an independent cohort of 161 FFPE samples. These 161 samples were drawn from a variety of sources to model the heterogeneity of samples encountered in both academic and community settings. Finally, we validated the performance of the classifier on another independent set of 97 FFPE samples.
Table 1. Clinical data sets used to develop and validate the classifier
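The SVM training step described under Methods can be illustrated with a minimal sketch. The poster does not state the kernel, the marker genes, or the hyperparameters, so everything below is an assumption: a linear SVM trained by stochastic subgradient descent on the regularized hinge loss, two hypothetical marker-gene features, and invented toy expression values with squamous encoded as +1 and non-squamous as -1.

```python
# Minimal linear SVM sketch (hypothetical data and hyperparameters; not the
# poster's actual model). Trained by stochastic subgradient descent on the
# L2-regularized hinge loss.

def decision_value(w, b, x):
    """Signed distance-like score: positive favors squamous (+1)."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=500):
    """Return weights and bias for labels y in {+1, -1}."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * decision_value(w, b, xi) < 1:
                # Sample violates the margin: hinge-loss subgradient step.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Margin satisfied: only the regularizer shrinks the weights.
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def predict(w, b, x):
    return 1 if decision_value(w, b, x) >= 0 else -1

# Invented expression values for two hypothetical marker genes.
X_train = [[2.0, 0.5], [1.8, 0.3], [2.2, 0.7],   # squamous-like
           [0.4, 1.9], [0.6, 2.1], [0.3, 1.7]]   # non-squamous-like
y_train = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X_train, y_train)
train_calls = [predict(w, b, x) for x in X_train]
```

In practice a library implementation with cross-validated hyperparameters would be the more usual choice; the hand-written loop is used here only to keep the sketch dependency-free.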
1877 Non-small cell lung cancer histological sub-typing by gene expression analysis from FFPE tissue
Eva Wang¹*, Zhenqiang Lu¹, Krishna Maddula¹, Mark Schwartz¹, Chris Roberts¹, Mary-Beth Joshi², David Harpole, Jr.², Vijay Modur¹
¹HTG Molecular Dx, Tucson, AZ 85706; ²Duke University Medical Center, Durham, NC
*Corresponding author: [email protected]
HTG Edge System & Chemistry
[Figure: HTG Edge System workflow. Panel labels: Sample Prep (Manual); Chemistry; Edge Processor and Edge Reader (Automation).]
The performance of the classifier was measured by the concordance of its calls with the consensus of three pathologists’ IHC panel reads. On the 97 independent FFPE samples, the classifier distinguished squamous from non-squamous NSCLC with an AUC of 0.98 (Fig 1) and an accuracy of 94% (Fig 2). Two of the six discordant samples tested positive in our ALK screening assay; ALK fusions are generally limited to adenocarcinomas. Combined with an ALK screening assay, the classifier can provide increased accuracy for NSCLC subtyping.
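As a rough check on the figures above, the sketch below recomputes accuracy from the reported counts, assuming the six discordant samples are the only misclassifications among the 97 validation samples. It also shows one common way to compute AUC, the rank-based (Mann-Whitney) form. The per-sample scores are not given on the poster, so `toy_scores` and `toy_labels` are invented purely for illustration.

```python
def accuracy(n_total, n_discordant):
    """Fraction of calls concordant with the reference reads."""
    return (n_total - n_discordant) / n_total

def auc(scores, labels):
    """Rank-based AUC: probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count half)."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Counts taken from the text: 97 validation samples, 6 discordant calls.
validation_accuracy = accuracy(97, 6)   # 91/97, about 0.938

# Hypothetical scores (1 = squamous, 0 = non-squamous); not poster data.
toy_scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_auc = auc(toy_scores, toy_labels)   # 8 of 9 positive/negative pairs ordered correctly
```

The 91/97 ratio rounds to the 94% quoted in the text.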
The classifier’s robustness was further demonstrated by diluting the tumor content with normal adjacent tissue, leaving as little as 20% of the original tumor content in the final sample. All diluted samples were predicted correctly (Fig 3). In addition, the estimated class probabilities did not vary significantly with dilution ratio, indicating that the classifications are robust to low tumor content.
When adenocarcinoma and squamous lysates were titrated together, the prediction scores varied roughly linearly with the adenocarcinoma or squamous cell concentration, reflecting the biological changes in the sample mixture (Fig 3).
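The roughly linear titration behavior can be expressed as a simple interpolation between the scores of the two pure lysates. The sketch below assumes exact linearity, which the poster reports only approximately, and uses invented endpoint scores; `mixture_score` and `fit_line` are hypothetical helper names, not part of any HTG software.

```python
def mixture_score(frac_squamous, score_adeno, score_squamous):
    """Expected score of a mixture under an idealized linear model."""
    if not 0.0 <= frac_squamous <= 1.0:
        raise ValueError("frac_squamous must lie in [0, 1]")
    return (1.0 - frac_squamous) * score_adeno + frac_squamous * score_squamous

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical pure-lysate scores: 0.1 for adenocarcinoma, 0.9 for squamous.
fractions = [0.0, 0.25, 0.5, 0.75, 1.0]
scores = [mixture_score(f, 0.1, 0.9) for f in fractions]
slope, intercept = fit_line(fractions, scores)
```

With measured titration data, fitting a line in this way and inspecting the residuals would be one way to judge how closely the scores follow the mixing proportion.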
Table 2. In silico data sets used to refine the classifier genes.
Author          No. Adeno   No. Large Cell   No. Squamous   Total   Array Type     Data Source
Bild et al      58          -                53             112     U133 Plus2.0   GEO: GSE3141
Bittner et al   59          34               39             132     U133 Plus2.0   GEO: GSE2109
Hou et al       45          19               27             91      U133 Plus2.0   GEO: GSE19118
Kim et al       62          -                76             138     U133 Plus2.0   GEO: GSE8894
Su et al        14          -                14             28      U95A           http://www.gnf.org/cancer/epican
Zhu et al       28          10               52             90      U133A          GEO: GSE14814
Fig 1. Final classifier performance on validation data, shown as an ROC curve. The area under the curve (AUC) typically ranges from 0.5 (chance-level discrimination) to 1 (perfect discrimination).
Fig 2. Final classifier performance on validation data. Samples on the x-axis are sorted by prediction probability. The colors are based on the final adjudicated reads from three pathologists.
Fig 3. The prediction probabilities of mixture samples: 1) adenocarcinoma or squamous samples diluted with normal adjacent tissue (NAT) and 2) adenocarcinoma and squamous samples mixed together in different proportions.
Data Sets
References
i. Martel et al. Multiplexed screening assay for mRNA combining nuclease protection with luminescent array detection. Assay and Drug Development Technologies (2002), 1(1), 61-71.
Presented at AACR Annual Meeting | San Diego, CA | April 2014
                 Discovery Set   Discovery Set   Training Set   Validation Set
Platform         Affy            qNPA            qNPA           qNPA
Sample Type      Fresh frozen*   FFPE*           FFPE           FFPE
Sample size      134             134             161            97
Adenocarcinoma   70              70              74             50
Large cell       -               -               22             -
Squamous         64              64              65             47
* Samples from the two discovery sets are matched fresh frozen and FFPE tissues.
HTG Molecular Diagnostics | 3430 E Global Loop | Tucson, AZ 85706 | (877) 289-2615 | htgmolecular.com