Supplementary Data
Supplementary Methods
RNA-Seq workflow
Initial quality control analysis of RNA-Seq data for each sample was performed using FastQC
(version 0.11.2). Each sample was aligned using the TopHat aligner (version 2.0.13) (24). Samtools
software (version 1.0_BCFTools_HTSLib) was used to sort and index the BAM files (25). Cuffquant
(Cufflinks version 2.2.1) was used to generate transcript abundance files (options used include
multi-read-correct, max-bundle-frags <10000000>, and mask-file (for genes <200 bp)). Once
all samples within each species were mapped and abundance estimate files were completed,
Cuffnorm (Cufflinks version 2.2.1) was used to generate a table of Fragments Per Kilobase Of
Exon Per Million Fragments Mapped (FPKM) values for genes within each sample (26). Cuffquant
was set to mask genes <200 bp. To minimize the effects of dividing FPKM values by numbers close
to 0 and stochastic noise, 0.1 was added to each FPKM value (27,28). The FPKM files are available
at GEO GSE87686. Mapping statistics are summarized in Supplemental Table 3.
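The effect of the 0.1 offset on FPKM ratios can be illustrated with a minimal Python sketch. The function names and the example values are hypothetical and are not taken from the study's code; only the constant 0.1 comes from the text.

```python
PSEUDOCOUNT = 0.1  # offset stated in the text; everything else here is illustrative


def add_pseudocount(fpkm_values, pseudocount=PSEUDOCOUNT):
    """Return a new list with the offset added to every FPKM value."""
    return [value + pseudocount for value in fpkm_values]


def fold_ratio(fpkm_a, fpkm_b, pseudocount=PSEUDOCOUNT):
    """Ratio of two FPKM values after adding the offset to both.

    Without the offset, a denominator of 0 is undefined and a
    denominator near 0 yields an arbitrarily large ratio.
    """
    return (fpkm_a + pseudocount) / (fpkm_b + pseudocount)


# Hypothetical FPKM values for one gene in three samples.
shifted = add_pseudocount([0.0, 2.4, 150.0])
print(shifted)

# A gene that is undetected in the second sample still gives a finite ratio.
print(fold_ratio(3.1, 0.0))
```

The offset barely changes ratios between well-expressed genes while bounding ratios that involve values at or near zero.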
Reference genomes
As part of the RNA-Seq workflow, genes were defined by the genome transcript files for
each species. The University of California, Santa Cruz (UCSC) Genome Browser version hg19
(GRCh37 assembly) of the human reference genome and Ensembl Genes v70 were used for
mapping human sequence. Version canFam3 (Broad Institute v3.1) was used as the dog
reference genome in this study. The Broad Institute provided .bed files (personal
communication) and from these a .gtf file was created. UCSC version mm10 (GRCm38, Genome
Reference Consortium Mouse Build 38 (GCA_000001635.2)) was used as the mouse reference
genome.
Correlation analyses
Pairwise Pearson correlation coefficients were calculated using 12,062 genes common
between human OS (HOS1, HOS2, HOS3), dog OS (DOS), mouse OS (MOS), and TCGA data for
other human cancers. Twenty-five samples from RNA-Seq expression data were randomly
selected for TCGA primary tumors from Cervical Squamous Cell Carcinoma and Endocervical
Adenocarcinoma (CESC), Colon Adenocarcinoma (COAD), Glioblastoma (GBM), Acute Myeloid
Leukemia (LAML), Prostate (PRAD), and Thyroid (THCA) cancer datasets.
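A minimal sketch of the correlation step, assuming each sample is held as a mapping from gene name to expression value. The data layout, function names, sample names, and random seed are assumptions made for illustration; the study's actual implementation is not described in the text.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)  # raises ZeroDivisionError for constant input


def pairwise_pearson(profiles):
    """Correlate every pair of samples over the genes present in all samples.

    profiles: dict mapping sample name -> dict mapping gene -> expression.
    """
    common = sorted(set.intersection(*(set(p) for p in profiles.values())))
    names = sorted(profiles)
    result = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            result[(a, b)] = pearson([profiles[a][g] for g in common],
                                     [profiles[b][g] for g in common])
    return result


def pick_samples(sample_ids, k=25, seed=0):
    """Randomly select k sample IDs (seed fixed only to make the sketch repeatable)."""
    return random.Random(seed).sample(sorted(sample_ids), k)


# Hypothetical toy profiles; only genes G1-G3 are shared by all three samples.
toy = {
    "human_a": {"G1": 1.0, "G2": 2.0, "G3": 3.0, "G4": 9.0},
    "dog_a": {"G1": 2.0, "G2": 4.0, "G3": 6.0},
    "mouse_a": {"G1": 3.0, "G2": 2.0, "G3": 1.0, "G5": 4.0},
}
for pair, r in pairwise_pearson(toy).items():
    print(pair, round(r, 3))
```

Restricting to the intersection of gene sets mirrors the use of genes common to all datasets, so that every coefficient is computed over the same features.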
Transcriptome analyses
OptiType, a precision HLA typing tool (29), was used to identify human OS samples derived
from the same patient. In the 44 human OS samples, five pairs of matching human patient
samples were identified using OptiType. For 4 of the 5 pairs, the provided age, sex, and race of
the patients were also identical (the 5th pair did not have complete metadata information). These
duplicate samples were all derived from the UMN BioNET repository, and no other matching pairs
were observed across all OS human sequencing data used in this study.
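One way to flag samples that may come from the same patient is to group samples by identical HLA genotype. The sketch below assumes each sample's typing result is available as a list of six class I allele names; the input format, function name, sample IDs, and allele values are hypothetical, and the matching rule actually applied in the study is not stated.

```python
from collections import defaultdict
from itertools import combinations


def matching_pairs(hla_calls):
    """Return sorted pairs of samples whose HLA allele sets are identical.

    hla_calls: dict mapping sample ID -> list of allele names.
    Allele order within a sample is ignored.
    """
    groups = defaultdict(list)
    for sample, alleles in hla_calls.items():
        groups[tuple(sorted(alleles))].append(sample)
    pairs = []
    for members in groups.values():
        pairs.extend(combinations(sorted(members), 2))
    return sorted(pairs)


# Hypothetical calls: S1 and S2 carry the same genotype, S3 differs at one allele.
calls = {
    "S1": ["A*02:01", "A*24:02", "B*07:02", "B*35:01", "C*04:01", "C*07:02"],
    "S2": ["A*24:02", "A*02:01", "B*35:01", "B*07:02", "C*07:02", "C*04:01"],
    "S3": ["A*02:01", "A*03:01", "B*07:02", "B*35:01", "C*04:01", "C*07:02"],
}
print(matching_pairs(calls))
```

Candidate pairs found this way would still need to be checked against metadata such as age, sex, and race, as described above.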
Histopathology and Immunohistochemistry
For hematoxylin and eosin staining, FFPE sections (4 µm) were de-paraffinized and
rehydrated using standard methods. Slides were placed into Harris hematoxylin for 5 minutes.
Slides were rinsed, dipped twice in acid alcohol, rinsed and placed into ammonia water (bluing)
for 2 minutes followed by a 15 minute tap water rinse. After the tap water rinse, slides were
placed in 80% ethanol for 2 minutes and into Eosin Y for 1 minute. After the eosin, slides were run
through graded alcohols, dehydrated and cover-slipped. For immunohistochemistry, sections (4
µm) were de-paraffinized and rehydrated using standard methods. Antigen retrieval was done
by placing slides in a steamer, in 6.0 pH buffer (Reveal Decloaking reagent, Biocare Medical,
Concord, CA) for 30 min at 95-98°C, followed by a 20 min cool down period for all antibodies
except Calprotectin (MAC387), where antigen retrieval was not needed. Slides were then
incubated with Proteinase K for 5 minutes at 25°C, rinsed and placed into Tris-buffered saline with
Tween-20 (TBST). Subsequent steps were automated using an immunohistochemical staining
platform (Nemesis, Biocare). Endogenous peroxidase activity was quenched by slide immersion
in 3% hydrogen peroxide solution (Peroxidazed, Biocare) for 10 min followed by TBST rinse. A
serum-free blocking solution (Background Punisher, Biocare Medical, Concord, CA) was placed
on sections for 10 min. Blocking solution was removed and slides were incubated in primary
antibody diluted in 10% blocking solution/90% TBST. Mouse monoclonal anti-Vimentin (clone
V9) (Zymed; 1:400), rabbit monoclonal anti-CD3 (clone SP7) (Thermo Scientific, Kalamazoo, MI;
1:400), and rabbit monoclonal anti-Calprotectin (clone MAC387) (Invitrogen, Rockford, IL;
1:100 & 1:200) were incubated for 60 min at room temperature
followed by TBST rinse and detection with Novocastra Novolink Polymer Kit (Leica Microsystems
Inc., Buffalo Grove, IL) using the manufacturer’s specifications. All slides then proceeded with
TBST rinse and detection with diaminobenzidine (DAB) (Covance, Dedham, MA). Slides were
incubated for 5 min followed by TBS rinse then counterstained with CAT Hematoxylin (Biocare,
Concord, CA) for 5 min. Slides were then dehydrated and cover-slipped.
Statistics
Statistical significance was calculated using the log-rank test or by Fisher’s exact test
depending on analysis and a p<0.05 was considered significant. Kaplan-Meier (KM) survival plots
were generated using the ‘survival’ package in R (Version 0.98.1103) (30,31). The GCESS values
were used to rank the tumors into quartile groups and the quartile groups were systematically
tested for association with outcome. Number of tumors with outcome information, the number
of outcome events used in all statistical analyses and quartile cut-off values for every cluster
analyzed in this paper are provided as Supplemental Table 2. Sample size was based on available
data.
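The quartile-based testing can be sketched in Python as follows. This is a standard-library illustration of a two-group log-rank test applied to the three splits that label the supplemental KM panels (Q1 vs Q234, Q12 vs Q34, Q123 vs Q4); it is not the R code used in the study, and the function names, tie handling, and example data are assumptions.

```python
import math


def quartile_groups(scores):
    """Assign each tumor to quartile 1-4 by the rank of its score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    groups = [0] * len(scores)
    for rank, i in enumerate(order):
        groups[i] = 1 + (4 * rank) // len(scores)
    return groups


def logrank_p(times, events, in_group_a):
    """Two-group log-rank p-value; events[i] is 1 for an event, 0 for censored."""
    observed = expected = variance = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n_a = sum(1 for i in at_risk if in_group_a[i])
        d = sum(1 for i in at_risk if times[i] == t and events[i])
        d_a = sum(1 for i in at_risk
                  if times[i] == t and events[i] and in_group_a[i])
        observed += d_a
        expected += d * n_a / n
        if n > 1:  # hypergeometric variance of events in group A at time t
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = (observed - expected) ** 2 / variance
    return math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square, 1 df


def quartile_split_tests(scores, times, events):
    """Log-rank p-values for the three quartile splits."""
    q = quartile_groups(scores)
    labels = {1: "Q1 vs Q234", 2: "Q12 vs Q34", 3: "Q123 vs Q4"}
    return {label: logrank_p(times, events, [g <= cut for g in q])
            for cut, label in labels.items()}


# Hypothetical data: 8 tumors, low scores paired with early deaths.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
times = [100, 150, 200, 250, 900, 1000, 1100, 1200]
events = [1, 1, 1, 1, 0, 0, 1, 0]
for label, p in quartile_split_tests(scores, times, events).items():
    print(label, "significant" if p < 0.05 else "not significant", round(p, 4))
```

Testing all three splits rather than a single median cut allows an association to be detected when only the lowest or highest quartile differs from the rest.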
Outcome Definitions and Measurement
To identify associations between transcriptional variation and outcomes, several different
types of outcome data were utilized. For the canine dataset, survival time post tumor resection
was available for 19 tumors (average time until death 137 days). For the Human HOS2 dataset,
follow up data was available for 35 tumors (average follow up time 733 days) and during this
time 17 deaths occurred. Additionally, the observation of presence of metastases at diagnosis
was also available for 17 patients. For GSE21257, follow up data was available for 53 patients
with an average follow up time of 2056 days. 23 death events were observed and metastatic
disease was observed in 34 of these patients.
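Follow-up times and event indicators of this kind are the inputs to a Kaplan-Meier estimate. The sketch below is a simplified standard-library illustration of how a median survival time is read off the curve and why it can be undefined (reported as NA in the KM panels) when the estimated survival never drops to 50%. It is not the R 'survival' implementation, and the example data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if the outcome was observed at times[i], 0 if censored.
    """
    curve = []
    survival = 1.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = sum(1 for ti in times if ti >= t)
        died = sum(1 for ti, e in zip(times, events) if ti == t and e)
        survival *= 1 - died / at_risk
        curve.append((t, survival))
    return curve


def median_survival(times, events):
    """First event time at which estimated survival is <= 0.5, else None.

    Simplification: no averaging is done when survival equals exactly 0.5.
    """
    for t, s in kaplan_meier(times, events):
        if s <= 0.5:
            return t
    return None


# Hypothetical cohort where most subjects reach the event: median is defined.
print(median_survival([100, 200, 300, 400, 500], [1, 1, 1, 0, 0]))

# Hypothetical cohort with mostly censored follow-up: median is undefined.
print(median_survival([100, 200, 300, 400, 500], [1, 0, 0, 0, 0]))
```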
Supplemental Figure 1. Real human data clustered
Supplemental Figure 2. Random human data clustered
Supplemental Figure 3. Permuted human data clustered
Supplemental Figure 4. Real mouse data clustered
Supplemental Figure 5. Random mouse data clustered
Supplemental Figure 6. Permuted mouse data clustered
Supplemental Figure 7. Real dog data clustered
Supplemental Figure 8. Random dog data clustered
Supplemental Figure 9. Permuted dog data clustered
Supplemental Figure 10. Examples of 4 immunohistochemistry groups
Supplemental Figure 11. KM significance in real, random, and permuted human and dog
datasets.
Supplemental Figure 12. KM human cluster-1
Supplemental Figure 13. KM human cluster-4
Supplemental Figure 14. KM human cluster-8
Supplemental Figure 15. KM dog cluster-3
Supplemental Figure 16. GSE21257 Array cluster-3 association with Outcome
Supplemental Figure 17. GSE21257 Array cluster-5 association with Outcome
Supplemental Figure 18. GSE21257 Array cluster-3 association with Metastasis
Supplemental Figure 19. GSE21257 Array cluster-5 association with Metastasis
Supplemental Figure 20. GSE21257 Array cluster-7 association with Metastasis
Supplemental Figure 21. Cell Cycle GCESS plotted against Immune-1 and Immune-2 GCESS
Supplemental Figure 22. Model depicting how immune cell components may be preventing the
occurrence of metastasis in OS patients.
Supplemental Figure 23. Flowchart of OS samples utilized in this work
Supplemental Table 1. Sample metadata table.
Tab 1 Human
Tab 2 Dog
Tab 3 Mouse
Tab 4 Human outcome - survival time, presence of metastasis at diagnosis
Tab 5 Dog outcome - survival time
Tab 6 GSE21257 outcome - survival time, metastasis time
Supplemental Table 2. Tumor Counts, number of Outcome events and GCESS Outcome
threshold values
Supplemental Table 3. Read Counts, mapping percentage and expressed gene count for all RNAseq samples.
Tab 1 Human
Tab 2 Dog
Tab 3 Mouse
Supplemental Table 4. Gene lists identified in the clustering.
Tab 1 Genes in Human clusters (n=7), mouse clusters (n=11), dog clusters (n=5), GSE21257 (n=14)
Tab 2 Overlap counts of RNASEQ cluster lists.
Tab 3 Overlap of human cluster 1 and human cluster 8 with LM22 signature
Supplemental Table 5. IPA enrichment analyses of recurrent human gene clusters.
Supplemental Table 6. Gene Cluster Expression Summary Scores.
Tab 1 Human GCESS values
Tab 2 Mouse GCESS values
Tab 3 Dog GCESS values
Tab 4 GSE21257 GCESS values
Supplemental Table 7. Immunohistochemistry validation of Immune 1 and Immune 2 GCESS
Supplemental Figure 1 Real Human data
Supplemental Figure 2 Random Human data
Supplemental Figure 3 Permuted Human Data
Supplemental Figure 4 Real Mouse Data
Supplemental Figure 5 Random Mouse data
Supplemental Figure 6 Permuted Mouse data
Supplemental Figure 7 Real Dog data
Supplemental Figure 8 Random Dog Data
Supplemental Figure 9 Permuted Dog Data
Supplemental Figure 10 Examples of 4 Immunohistochemistry Groups
Supplemental Figure 11A.
[Kaplan-Meier panels; x-axis: Time (days), 0 to 3500; y-axis: Survival, 0.0 to 1.0]
Q1 vs Q234: median survival = 540 and NA; p = 0.00588
Q12 vs Q34: median survival = 690 and 1260; p = 0.15
Q123 vs Q4: median survival = 930 and NA; p = 0.886
Supplemental Figure 12 Human Cluster 1 “Immune-1” association with Outcome
Supplemental Figure 13 Human Cluster 4 “Cell Cycle” association with Outcome
[Kaplan-Meier panels; x-axis: Time (days), 0 to 3500; y-axis: Survival, 0.0 to 1.0]
Q1 vs Q234: median survival = 930 and 1170; p = 0.861
Q12 vs Q34: median survival = 1260 and 900; p = 0.751
Q123 vs Q4: median survival = 1260 and 540; p = 0.0577
Supplemental Figure 14 Human Cluster 8 “Immune-2” association with Outcome
[Kaplan-Meier panels; x-axis: Time (days), 0 to 3500; y-axis: Survival, 0.0 to 1.0]
Q1 vs Q234: median survival = 690 and 1260; p = 0.14
Q12 vs Q34: median survival = 690 and NA; p = 0.261
Q123 vs Q4: median survival = 690 and NA; p = 0.0927
Supplemental Figure 15 Canine Cluster 3 “Cell Cycle” association with Outcome
[Kaplan-Meier panels; x-axis: Time (days), 0 to 300; y-axis: Survival, 0.0 to 1.0]
Q1 vs Q234: median survival = 165.6986301 and 134.630137; p = 0.685
Q12 vs Q34: median survival = 196.2739726 and 49.31506848; p = 0.00313
Q123 vs Q4: median survival = 174.57534245 and 42.4109589; p = 0.00328
Supplemental Figure 16 GSE21257 Array Cluster 3 “Immune-1” association with Outcome
[Kaplan-Meier panels; x-axis: Time (days), 0 to 3500; y-axis: Survival, 0.0 to 1.0]
Q1 vs Q234: median survival = 5670 and NA; p = 0.25
Q12 vs Q34: median survival = 3300 and NA; p = 0.523
Q123 vs Q4: median survival = 3300 and NA; p = 0.0335
Supplemental Figure 17 GSE21257 Array Cluster 5 “Immune-2” association with Outcome
[Kaplan-Meier panels; x-axis: Time (days), 0 to 3500; y-axis: Survival, 0.0 to 1.0]
Q1 vs Q234: median survival = 5670 and NA; p = 0.233
Q12 vs Q34: median survival = 1050 and NA; p = 0.0121
Q123 vs Q4: median survival = 3300 and NA; p = 0.159
Supplemental Figure 18 GSE21257 Array Cluster 3 “Immune-1” association with Metastasis
[Kaplan-Meier panels; x-axis: Time (days), 0 to 3500; y-axis: Survival, 0.0 to 1.0]
Q1 vs Q234: median survival = 300 and 1320; p = 0.00114
Q12 vs Q34: median survival = 300 and 810; p = 0.0558
Q123 vs Q4: median survival = 300 and NA; p = 0.000742
Supplemental Figure 19 GSE21257 Array Cluster 5 “Immune-2” association with Metastasis
[Kaplan-Meier panels; x-axis: Time (days), 0 to 3500; y-axis: Survival, 0.0 to 1.0]
Q1 vs Q234: median survival = 285 and 1320; p = 0.000679
Q12 vs Q34: median survival = 300 and NA; p = 0.000131
Q123 vs Q4: median survival = 300 and NA; p = 0.00613
Supplemental Figure 20 GSE21257 Array Cluster 7 “Cell Cycle” association with Metastasis
[Kaplan-Meier panels; x-axis: Time (days), 0 to 3500; y-axis: Survival, 0.0 to 1.0]
Q1 vs Q234: median survival = 1080 and 480; p = 0.2
Q12 vs Q34: median survival = 1080 and 285; p = 0.0226
Q123 vs Q4: median survival = 765 and 300; p = 0.0583
Supplemental Figure 21 Human Cell Cycle GCESS plotted against Immune-1 and Immune-2 GCESS
[Figure labels: Worse Outcome, Better Outcome]
Supplemental Figure 22
Supplemental Figure 23 Flowchart of OS samples utilized in this work
[Flowchart columns: Samples, Cohorts, Species, Follow up data, Outcome Analysis, Figure]
Human
  HOS1: Tissue (n = 46), Bone Tissue (n = 3), Cells (n = 5). Figure: Fig 1, Correlation; Fig 2, Clustering; Fig 3, GCESS.
  HOS2 (note 2): Tissue (n = 35). Follow up data: Time to Death, Metastases at Diagnosis. Outcome Analysis: Fig 3, KM analyses. Figure: Fig 1, Correlation; Fig 2, Clustering; Fig 3, GCESS.
  HOS3 (note 3): Tissue (n = 25). Figure: Fig 1, Correlation; Fig 2, Clustering; Fig 3, GCESS.
  GSE21257 (note 4): Tissue (n = 53). Follow up data: Time to Death, Time to Metastases. Outcome Analysis: Fig 4, KM analyses. Figure: Fig 4, Clustering.
Mouse
  MOS1: Tissue (n = 92), Cells (n = 11). Figure: Fig 1, Correlation; Fig 2, Clustering; Fig 3, GCESS.
Dog
  DOS1: Tissue (n = 31), Cells (n = 2), Osteoblast Cells (n = 1). Follow up data: Time to Death (Tissue). Outcome Analysis: Fig 3, KM analyses. Figure: Fig 1, Correlation; Fig 2, Clustering; Fig 3, GCESS.
Notes:
1. RNA-Seq data produced for this study.
2. RNA-Seq data downloaded as described in the methods section.
3. RNA-Seq data previously generated as described.
4. Array data downloaded as described in the methods section.