single-cell applications of next-generation sequencing

26
Single-Cell Applications of Next-Generation Sequencing Naishitha Anaparthy, 1 Yu-Jui Ho, 2 Luciano Martelotto, 3 Molly Hammell, 2 and James Hicks 4 1 Department of Molecular and Cellular Biology, Stony Brook University, Stony Brook, New York 11794 2 Watson School of Biological Science, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724 3 University of Melbourne, Centre for Cancer Research, Victoria Comprehensive Cancer Centre, 3000 Victoria, Australia 4 Michelson Center for Convergent Biosciences, Dornsife College, University of Southern California, Los Angeles, California 90089 Correspondence: [email protected] The single cell is considered the basic unit of biology, and the pursuit of understanding how heterogeneous populations of cells can functionally coexist in tissues, organisms, microbial ecosystems, and even cancer, makes them the subject of intense study. Next-generation sequencing (NGS) of RNA and DNA has opened a new frontier of (single)-cell biology. Hundreds to millions of cells now can be assayed in parallel, providing the molecular profile of each cell in its milieu inexpensively and in a manner that can be analyzed math- ematically. The goal of this article is to provide a high-level overview of single-cell sequenc- ing for the nonexpert and show how its applications are influencing both basic and applied clinical studies in embryology, developmental genetics, and cancer. A s recently as 2 years ago, a review article about single-cell sequencing technologies would have mainly focused on methods for ef- cient extraction, types of nucleic acid amplica- tion, coverage, and even the price/unit (i.e., cell or nuclei). Today, although these aspects are re- garded as basic contributors for successful sin- gle-cell experimentation, researchers are con- centrating more on applying the right method and the right mathematical analysis to the prob- lem at hand rather than in the artof getting a single-cell genomics laboratory set up. This is largely the result of the advent of high-through- put droplet encapsulation and emulsion-based amplication of nucleic acids and, more recent- ly, the implementation of commercial droplet- based (Drop-Seq) instruments in nearly every sequencing core. Indeed, when large numbers of cells are available, as from whole tumors or embryos, commercial microuidic bead emul- sion devices such as the 10X Chromium (10X Genomics) and BD Rhapsody can amplify tens of thousands of single-cell transcriptomes at once and have them ready for sequencing in little more than 1 day. This technology is being exploited to prole the expression states and lin- eages of cells across species, from atworms to vertebrates. In this article, we will attempt to provide a snapshot of how next-generation sequenc- Editors: W. Richard McCombie, Elaine R. Mardis, James A. Knowles, and John D. McPherson Additional Perspectives on Next-Generation Sequencing in Medicine available at www.perspectivesinmedicine.org Copyright © 2019 Cold Spring Harbor Laboratory Press; all rights reserved Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898 1 www.perspectivesinmedicine.org on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/ Downloaded from

Upload: others

Post on 03-Apr-2022

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Single-Cell Applications of Next-Generation Sequencing

Single-Cell Applications of Next-GenerationSequencing

Naishitha Anaparthy,1 Yu-Jui Ho,2 Luciano Martelotto,3 Molly Hammell,2 and James Hicks4

1Department of Molecular and Cellular Biology, Stony Brook University, Stony Brook, New York 117942Watson School of Biological Science, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 117243University of Melbourne, Centre for Cancer Research, Victoria Comprehensive Cancer Centre,3000 Victoria, Australia

4Michelson Center for Convergent Biosciences, Dornsife College, University of Southern California,Los Angeles, California 90089

Correspondence: [email protected]

The single cell is considered the basic unit of biology, and the pursuit of understanding howheterogeneous populations of cells can functionally coexist in tissues, organisms, microbialecosystems, and even cancer, makes them the subject of intense study. Next-generationsequencing (NGS) of RNA and DNA has opened a new frontier of (single)-cell biology.Hundreds to millions of cells now can be assayed in parallel, providing the molecularprofile of each cell in its milieu inexpensively and in a manner that can be analyzed math-ematically. The goal of this article is to provide a high-level overview of single-cell sequenc-ing for the nonexpert and show how its applications are influencing both basic and appliedclinical studies in embryology, developmental genetics, and cancer.

As recently as 2 years ago, a review articleabout single-cell sequencing technologies

would have mainly focused on methods for effi-cient extraction, types of nucleic acid amplifica-tion, coverage, and even the price/unit (i.e., cellor nuclei). Today, although these aspects are re-garded as basic contributors for successful sin-gle-cell experimentation, researchers are con-centrating more on applying the right methodand the right mathematical analysis to the prob-lem at hand rather than in the “art” of getting asingle-cell genomics laboratory set up. This islargely the result of the advent of high-through-put droplet encapsulation and emulsion-basedamplification of nucleic acids and, more recent-

ly, the implementation of commercial droplet-based (Drop-Seq) instruments in nearly everysequencing core. Indeed, when large numbersof cells are available, as from whole tumors orembryos, commercial microfluidic bead emul-sion devices such as the 10X Chromium (10XGenomics) and BD Rhapsody can amplify tensof thousands of single-cell transcriptomes atonce and have them ready for sequencing inlittle more than 1 day. This technology is beingexploited to profile the expression states and lin-eages of cells across species, from flatworms tovertebrates.

In this article, we will attempt to providea snapshot of how next-generation sequenc-

Editors: W. Richard McCombie, Elaine R. Mardis, James A. Knowles, and John D. McPhersonAdditional Perspectives on Next-Generation Sequencing in Medicine available at www.perspectivesinmedicine.org

Copyright © 2019 Cold Spring Harbor Laboratory Press; all rights reservedAdvanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898

1

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 2: Single-Cell Applications of Next-Generation Sequencing

ing (NGS) is bringing subjects as complex ascancer, neurobiology, microbial ecology, andembryology to a new level of resolution. Foradditional detail, the overall topic has been thor-oughly reviewed in several recent publicationsfor cancer (Baslan and Hicks 2017; Tsoucasand Yuan 2017), neurobiology (Poulin et al.2016), and microbiology (Woyke et al. 2017),and a series of key reviews on the topic hasbeen assembled in Nature Reviews (see www.nature.com/collections/sxnwgntqsk).

Each single cell is unique in both time (e.g.,cell cycle) and space (e.g., microenvironment,nearest neighbors). Single-cell sequencing hasbeendeveloped to gain insight into that “unique-ness” and to unmask the distinctive molecularproperties of cells, not by themselves but as partof the functional coherent populations that col-onize tissues, organs, tumors, andmicrobial eco-systems. Traditionally, scientists named celltypes according to their appearance or locationin a tissue or organism. Single-cell sequencinghas not only redefined the molecular classi-fication of cells but also uncovered new and un-suspected cell types and opened the window tofurther understanding the spatiotemporal orga-nization of cells fromembryonic development totumor progression to aging itself. Furthermore,the very nature of genomic and transcriptomicdata and the constant development of very clevercomputational tools makes it possible to evi-dence the most incredibly nuanced distinctionsamong cells, their types and states, aswell as theirfate (e.g., differentiation, death, etc.) withincomplex tissues such as the brain or developingembryo.

SINGLE-CELL SEQUENCING APPLICATIONS

Entire areas of investigation are being invigorat-ed by single-cell sequencing. The first is cancerresearch, where the long known, but seldom ac-knowledged in practice, heterogeneity of tumortissue can be tracked at its most basic level, thesingle cell. A second area is sequencing complexmicrobial populations, either from an organism,say human gut or mouth, or from our physicalenvironment. Recent advances in microbial cellgenomics and transcriptomics have enabled the

assignment of functional roles tomembers of thehumanmicrobiome for which there are not suc-cessful methods available for culturing them.These approaches paved the way to deeper un-derstanding of the phenotypic variation that ex-ists among genetically related strains, openingnew opportunities for studies of immunogenicmicroorganisms in disease. Another importantbut often overlooked area is in vitro fertilization(IVF), in which various levels of genomic se-quence from a single cell isolated by hand froma human embryo can provide evidence of triso-my 21 or a suspected genetic disease, and at thesame time enhance the chances for successfulconception (Xu et al. 2016). Currently, however,the greatest scientific breakthroughs are likely tocome in neurobiology and developmental biol-ogy. Indeed, the ability to perform transcriptionstudies on thousands to millions of cells in a fewhours or days using microfluidic emulsion am-plifications has revolutionized the study of celllineage and the detailed structure of complex cellpopulations. The explosion of new data in theseareas was foreseen by Linnarsson and colleaguesin a prescient review published in 2013 (Shapiroet al. 2013). At that time, they predicted that itwould be possible to obtain molecular data fromthousands of cells and, further, that methods tocombine genomics, transcriptomics, epigenom-ics, and proteomics would be common in theensuing decade. It has, in fact, taken less than 5years for these predictions to be realized.

SINGLE-CELL SEQUENCINGMETHODOLOGY

Methods for the genome-wide amplification ofRNA and DNA were in development from theearly 1990s (Van Gelder et al. 1990; Teleniuset al. 1992; Dean et al. 2002). However, it wasnot until both a relatively accurate reference hu-man genomewas drafted andmade readily avail-able (Lander et al. 2001; Kent et al. 2002) andmassively parallel short-read sequencing wasdeveloped that the concept of genome-wide, sin-gle-cell sequencing began to be explored.

The first report of whole-transcriptomesequencing (Tang et al. 2009) used the ABISOLID sequencing platform on complementary

N. Anaparthy et al.

2 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 3: Single-Cell Applications of Next-Generation Sequencing

DNA (cDNA) prepared from a small number ofmouse oocytes and blastomeres demonstratingclear regulatory effects of mutations in themicroRNA (miRNA) silencing pathways. Thefirst genome-wide copy number profiling of sin-gle tumor cells (Navin 2011) used flow sorting of200 individual nuclei (then a lot) isolated frommultiple sectors of two breast tumors and se-quencing using the Illumina Genome Analyzerinstrument. The results revealed the power ofgenomic copy number profiling to trace tumorlineage and to distinguish tumor cells from ge-nomically normal (euploid) stroma. It furtherrevealed the stability of clonal structure withina tumor as measured by copy number “break-points” while at the same time demonstratingstepwise evolution of the tumor genome throughboth increase in ploidy and the systematic in-crease in copy number alterations (CNAs) with-in the clonal structure of the lineage.

In those initial reports, the genomic DNA(gDNA) of single cells/nuclei were amplified,and the sequencing “libraries” (collection ofDNA fragments prepared for sequencing) wereconstructed virtually one by one for each cell. Inthewake of those initial reports, awide variety ofmethods were described for both RNA andDNA sequencing, implementing important im-provements in efficiency and reproducibility andcost. By 2013, single-cell sequencing was named“Method of the Year” by the journal NatureMethods.

SINGLE-CELL SEQUENCING PROTOCOLS

There aremore than 300 publications describingsingle-cell protocols and∼50different data anal-ysis methods listed in a review on the Illuminawebsite (seewww.illumina.com). In reality, thesevarious “single-cell sequencing protocols” arenot sequencing protocols at all. In the main,they are simply the means to isolate differentsubsets of RNA or DNA in a manner that theycan be converted into standard short-read NGSlibraries.An Illumina sequencing library ismadeup of 300–500 bp fragments of the target se-quence flanked by hybrid specialized adap-tor sequences that the instrument uses to primethe sequencing reactions. Libraries created from

bulk tissue (DNA or RNA derived from thou-sands to millions of cells) usually contain verylarge numbers of “unique” fragments (a mea-surement of these number of fragments is “li-brary complexity”). On the other hand, librariesmade from single cells have a much lower num-ber of unique fragments and therefore theirlibrary complexity is limited. A single mamma-lian cellmay contain as fewas300,000messengerRNA (mRNA)molecules and∼10–30 pg of totalmRNA (and ∼7 pg DNA). For single-cell se-quencing, the massive output of a single unit,or “lane” of the sequence is much more thanis needed to fully sequence all those uniquefragments many times over. Thus, several sin-gle-cell libraries can be run together or “multi-plexed” on a single lane as long as each library isconstructed with adaptors that carry uniqueoligonucleotide sequences or “barcodes,” ulti-mately enabling each output sequence to beassigned to a specific cell during subsequent in-formatic analysis. In addition, in current prac-tice,most sequencingadaptors also contain 8–10nucleotides of completely random sequence,called unique molecular identifiers (UMIs) thatcan be used to informatically identify and (op-tionally) remove PCR duplicates, namely, repli-cates of the original unique molecules, elimi-nate errors introduced during the amplificationof nuclei acids and sequencing stages, and totag individual original transcripts allowing to“count” their abundance (i.e., expression level).The UMI concept was envisioned in severalpublications (Casbon et al. 2011; Fu et al.2011; Kivioja et al. 2012; Shiroguchi et al. 2012)before it being used successfully by Islam et al.(2014) for removing duplicates when countingmRNA molecules in a mouse model and is nowstandard practice in most sequencing protocols,whether for validating point mutations in high-depth sequencing of bulk DNA, or in molecularcounting applications such as transcriptomeprofiling.

SINGLE-CELL REVERSETRANSCRIPTION (RT)

The central goal of single-cell transcriptomeprofiling is to accurately quantify the levels of

Single-Cell Applications of Next-Generation Sequencing

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898 3

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 4: Single-Cell Applications of Next-Generation Sequencing

specific messages in each cell and compare indi-vidual cells to a population to generate molecu-lar phenotypes. NGS provides a mechanism forachieving that goal because each sequence readcan be used as a tag for its template molecule.Huge numbers of short reads permit count-ing for millions of individual templates. Thewhole-transcriptome amplification method thatis the basis for most transcription profiling wasworked out in the Sandberg and LinnarssonLaboratories at the Karolinska Institute Stock-holm and described in three important papers(Islam et al. 2011, 2014; Ramsköld et al. 2012). Inthe first paper used, the goal of the Ramsköldet al. paper was to generate full-length cDNAusing an oligo(dT) primer carrying a PCRprimer using Maloney murine leukemia virus(MMLV) reverse transcriptase, which adds asmall stretch of cytosine residues to the 50 endof the first strand cDNA. Strand switching isaccomplished by the addition of a reverse primersequence carrying a sample barcode alongwith a30dG sequence that hybridizes to the cytosineoverhang and initiates second-strand synthesisgoing through to the end of the first strand. Theynamed themethod Smart-Seq. In the Islam et al.(2014) variation, the first- and second-strandprimers carry a random UMI along with thesample barcode. After PCR amplification inthe Islam et al. protocol, the ends of the PCRproducts were biotinylated before a fragmenta-tion, and the fragments representing the 50 and30 ends of the original transcript were capturedon streptavidin beads and sequenced. Strand oforigin was preserved and identifiable throughthe sequence and location of the barcode andthe UMI was used to filter out duplicates, there-by providing a very accurate counting of theoriginal mRNA molecules.

The strand-switching method using mousemammary tumor virus (MMTV)RT requires noligation, captures strand of origin, and is highlyadaptable as the first- and second-strand primersequences can be altered at will. Many variationson the strand-switching method are commer-cially available as SMART-Seq kits from TakaraBio USA (formerly ClonTech), including 30 endcapture and full-length cDNA as well as kits forcapturing RNA from formalin-fixed, paraffin-

embedded (FFPE) samples. Sensitivity for singlecells seems to be at the level of 5–10 molecules/cell. The RNA amplification process is shownschematically in Figure 1.

It is important to note that it is not necessaryto sequence the entire message to generate sin-gle-cell transcriptomic profiles. This is essential-ly a “counting” exercise more than a sequencingstrategy in the traditional sense. For most mo-lecular phenotyping studies, the sequencingreads are simply tags for counting up the num-ber of each RNA molecule present in the cell.Filtering for duplicates using theUMI sequencesensure that each initial reverse transcript is onlycounted once, adding further stringency to thecounting process. Although some Smart-Seq ap-plications use random primers to initiate cDNAsynthesis and thus can cover an entire messagesequence, it would not yield complete coverageandmost of the effortwould be spent on only thehighest frequency messages. For coverage at thebase-pair level to identify mutations, a targetedprotocol rather than awhole-transcriptome pro-tocol would be favored.

DROPLET CAPTURE

Once itbecameclear thatgenomicand transcrip-tomic profiling of single cells was achievable inindividual test tubes, the quest for high-through-put methods was underway. The C-1 amplifica-tion instrument, marketed by Fluidigm in 2011was a step forward, but its maximum capacityof 100 cells did not present a significant im-provement over what a bench scientist could domanually using individual tubes. However, thecopublication of two breakthrough microfluidicapplications by Evan Macosko and Allon Kleinquickly changed the landscape for indexingmRNA from large populations of cells. Drop-Seq (Macosko et al. 2015) and InDrops (Kleinet al. 2015) from the McCarroll and Kirschnerlaboratories at HarvardMedical School not onlydemonstrated reliable indexing of mRNA fromthousands of single cells in a single run but alsomade theprotocols and reagents accessible forallresearchers. Both methods were based on earlierwork fromDavidWeitz’s laboratorydemonstrat-ing encapsulation of reagents into droplets that

N. Anaparthy et al.

4 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 5: Single-Cell Applications of Next-Generation Sequencing

were stablewhenmixed inemulsions.Thismeth-odmakes use of amicrofluidic device to combinea single cell lysed in a nanoliter droplet with ahydrogel containing primers and reagents or asolid bead with primers synthesized on the sur-face. The InDrops (“indexing droplets”) method

relies on lysing individual cells and capturingmRNA in nanoliter aqueous droplets that aremixed one by one in the microfluidic chamberwith hydrogel beads containing barcoded DNAprimers. After hybridization, the droplets arebroken and the first strand cDNA reaction pro-

Cell lysis

RT

Switch oligo

PCR primer

Poly(A) + RNA

rGrGrGC C C

5′

cDNA amplification

Clean-up

Read 2Poly(dT)VN

PCR primer

Bulk

GEMs

UMIP5 Read 1 10XBarcode

Poly(dT)VNUMIRead 1 10XBarcode

Sampleindex

S7

Enzymatic fragmentation,End repair, A-tail. Ligation, sample index PCR

3′ AAAAAAAAAAAAAAAAAAAAA

3′ AAAAAAAAAAAAAAAAAAAAA

Oil

Collect RTPool

remove oil

Cellsenzyme

10X barcodedgel beads

10X barcodedcDNA

10X barcodedcDNA

Single-cellGEMs

Figure 1. Schematic diagrams of single-cell RNA amplification using the strand-switchingmethodology, showingthemolecular steps in generating double-stranded cDNAwith sample barcodes and uniquemolecular identifiers(UMIs) (top); and the application of the basic method in the 10X high-throughput droplet/bead capture micro-fluidic mode (bottom). (Figures courtesy of 10X Genomics.)

Single-Cell Applications of Next-Generation Sequencing

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898 5

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 6: Single-Cell Applications of Next-Generation Sequencing

ceeds in bulk, with indexed sequences resolvedafter sequencing. In the Drop-Seq method, theDNAprimersaresynthesizedonthebeadsurfaceusing a “split and pool” method so that the se-quence on each bead has a constant PCR handle,a unique 12-base barcode, and each primer con-tains a UMI plus the oligo(dT) to capture themRNA through its poly(A) sequence and initi-ates cDNA synthesis during the RT reaction. Af-ter breaking the droplets, the captured mRNA isreverse-transcribed using template switchingand the double-stranded RNA is amplified andsequenced, thus labeling the RNA from each cellwith both a sample barcode and UMI.

In a similar fashion, the Klein group usedtheir newly developed method, InDrop, to studythe lineages of embryological development intwo familiar vertebrate models, the zebrafish(Farrell et al. 2018;Wagner et al. 2018) and Xen-opus, thewestern clawed toad (Briggs et al. 2018).In both of these efforts, the first steps were tocreate a pseudotemporal sequence of events byprofiling single cells isolated from sequentialstages of embryo development. Analyzing thetranscriptome profiles of 60,000 to 139,000 sin-gle cells, they were able to visualize lineage treesleading from pluripotent stem cells to maturetissues and to track the temporal-spatial dynam-ics of the maturing populations.

These methods have recently becomewidelyaccessible through automation and commercial-ization by BD Bioscience as the BD Rhapsodyinstrument and by 10X Genomics as the Chro-mium instrument. A schematic of the 10XChro-mium method for single-cell RNA (scRNA)sample barcoding and RNA amplification withUMI is shown in Figure 1 (bottom panel).

ANALYTICAL TOOLS FOR SINGLE-CELLTRANSCRIPTOMICS

Once single-cell transcriptome libraries havebeen created, much of thework to glean relevantinformation relies on the proper analysis andinterpretation of the resulting data. Early meth-ods for analysis relied on simple application oftools created for bulk datasets, with a few extraquality control (QC) steps implemented to re-move poorly amplified or degraded libraries.

Newermethods includemore sophisticated stat-istical tools for QC and for the subsequent stepsto measure transcript abundance, smooth orimpute the expression profiles, and cluster thecells into similar groups of related and/or novelcell types.

The extra QC steps needed for single-cellRNA-Seq (scRNA-Seq) must include an aware-ness of how single-cell libraries differ from tra-ditional bulk libraries. Populations of cells withvery similar gross cellular phenotypes mightshow large differences in their transcriptomesfor biological reasons such as stochastic tran-scription or unsynchronized cell-cycle stages.Unfortunately, these biologically driven differ-ences can look very similar to technical artifactscaused by insufficient capture or shallow se-quencing depth (Grün and van Oudenaarden2015). Several calibration schemes have beenimplemented to try to mitigate these challenges,such as using spike-in controls to model tech-nical noise and as a function of transcript abun-dance (Brennecke et al. 2013; Ding et al. 2015).The use of transcript UMIs, as discussed above,can help mitigate the effects of PCR amplifica-tion artifacts. Even with the use of UMIs andspike-ins, transcript dropout events are notori-ously difficult to remove, such that many groupshave resorted to imputing the expression profilesof genes with anomalously low counts by inter-polating to match the expression profiles fromsimilar cells (Li and Li 2018; vanDijk et al. 2018).

Following proper normalization and qualitycontrol, the hope is that novel clusters of cellswill emerge as independent groups on a clustermap. Clusteringmethodswork best when the setof input genes are preselected for those mostlikely to mark different cell types, for example,those that are well expressed and also show largedifferences between different cells. Selectionmethods include calculating the most variablegenes (Klein et al. 2015; Macosko et al. 2015),determining the genes that are significantlydifferent between known cell types (Shalek etal. 2013), or analyzing genes that contributestrongly to the first few principal components(Satija et al. 2015). Once a gene list has beenselected, these are used as input for clusteringalgorithms that attempt to identify both how

N. Anaparthy et al.

6 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 7: Single-Cell Applications of Next-Generation Sequencing

many clusters are present as well as who belongsto each cluster. Popular methods include matrixfactorization approaches, k-means clusteringapproaches, as well as standard hierarchicalclustering. Choosing the “best” clustering algo-rithm can depend on the underlying biology ofthe datasets. Although many scRNA-Seq analy-sis tools are optimized for mixed populations ofdistinct cells (Kharchenko et al. 2014; Grün andvan Oudenaarden 2015; Haghverdi et al. 2015;Satija et al. 2015; Xu and Su 2015; Zeisel et al.2015), their methods differ strongly from thosedeveloped to analyze time-series datasets thatassume a smooth distribution between celltypes, such as those involving development tra-jectories (Bendall et al. 2014; Marco et al. 2014;Trapnell et al. 2014; Setty et al. 2016). In practice,datasets often include a mixture of cells fromdistinct cell types as well as related subclusterswith significant overlap. Therefore, trial analysiswith both types of methods might be necessarybefore identifying what is best for a given exper-iment (Haghverdi et al. 2015).

To address the challenges outlined above,Ho et al. (2018) designed the scRNA-Seq anal-ysis and klustering evaluation (SAKE). SAKEprovides several modules that include data pre-processing for quality control, sample cluster-ing, t-distributed stochastic neighbor embed-ding (t-SNE) visualization of clusters, differen-tial expression between clusters, and functionalenrichment analysis. Comparing the perfor-mance of several published single-cell datasets,they showed that all of the scRNA-Seq analysistools perform similarly for a wide range of sam-ple types, despite each being algorithmically in-dependent. However, SAKE performed best forvery complex mixtures of cells with extensivesubstructure. Importantly, SAKE also includesquantitative statistics to evaluate the clusteringresults, a feature missing frommost other meth-ods (Deng et al. 2014; Ting et al. 2014; Zeiselet al. 2015), and to evaluate its performance bythe ability to correctly identify clusters reportedin these studies. Figure 2 presents an heuristicexample of the manner in which these tech-niques can be used to rationalize scRNA-Seqdata and to identify genes of interest by func-tional genomics (Box 1).

t-SNE VISUALIZATION OF CLUSTERS

t-SNE (van der Maaten and Hinton 2008) hasemerged as the most widely used visualizationtool for single-cell transcriptomic studies, andis often used to represent data clustered and nor-malized by other methods, including Seurat,SAKE(Hoet al. 2018), and10XGenomicsLoupeCellRanger described (see support.10xgenomics.com/single-cell-vdj/software/visualization/latest/what-is-loupe-vdj-browser).

The goal of t-SNE plots is to represent themost relevant differences among points (e.g.,single-cell genomic profiles) in high-dimen-sional space and represent them in a low-di-mensional 2D or 3D scatterplot, such that, forexample, clusters on a t-SNE plot might repre-sent different cellular types as is shown in Fig-ure 2. t-SNE implementations for gene expres-sion datasets begin by reducing the very highdimensional expression matrix (m genes × nsamples) into a lower dimensional set of prin-cipal components (PCs) that encode most of thevariance in the dataset, typically 20–30 PCs.The t-SNE algorithm then calculates the dis-tance between samples along each of thesePCs and attempts to nonlinearly embed thePCs into a 2D or 3D representation that reca-pitulates relatedness of samples. This tends towork well when a filtered list of a few thousandgenes that carry the most relevant informationof cellular identity is used as the input to the t-SNE algorithms. A major distinction versusprincipal component analysis (PCA) is the ad-dition of a user determined tunable parameter,“perplexity,” into the calculations. When per-plexity is well chosen (typically 5–50, but wellbelow the number of points) and the algorithmis allowed to iterate until saturation (typically∼1000), t-SNE maps can be excellent tools forvisualization of single-cell datasets. In addition,it has been demonstrated that the results from t-SNE may vary widely from different runs thatdo not saturate the perplexity of the embeddeddimensions, such that cluster membership maynot be reproducible from run to run (Watten-berg et al. 2016). A brief explanation of t-SNEfor new users is presented elsewhere (see distill.pub/2016/misread-tsne).

Single-Cell Applications of Next-Generation Sequencing

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898 7

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 8: Single-Cell Applications of Next-Generation Sequencing

Component 1

451Lu_Parental

451Lu_Parental

451Lu_BR

451Lu_BR

ExpGenes

BRAFV600E inhibitor resistance

7500

–4

Color key

SilhouetteSAKE

cell type

40

10,00012,500

A375_ParentalA375_BR

451Lu_Par451Lu_BR

A375_ParA375_BR

–1

Color key

1× Vemurafenib0.5× Vemurafenib

Vemurafenib concentration (μm)

Cel

l via

bilit

y (%

)

0.0

20

40

60

80

100

120

00.2 0.4 0.6 0.8 1.0

1× Vemurafenib

0.25×Vemurafenib

Vemurafenibresistant

Parental

10

451Lu_Par451Lu_BR

451Lu-Par

451Lu-BRA375_ParA375_BR

10X.BR

y = –0.267+0.362x

DCT

AXL

r 2 = 0.546

10X.ParC1.BR C1.Par

Com

pone

nt 2

Com

pone

nt 2

–0.25–0.25

0.00

0.00

0.25

0.25

0.50

0.50

log2FC (BR/Par) C1 data

log 2

FC

(B

R/P

ar)

10X

dat

a

Par

_Sco

re

BR_Score0.75

0.75

1.00 –10

–10

10

10

5

–5

20 300

0

–20

Component 1

–10 10 20

20

0

0

–20

–20

–30

1.00

B

A

D

F G H

C

E

Figure 2. Examples of a single-cell RNA (scRNA) sequence analysis to detect the molecular basis of drugresistance in melanoma. (A) Scheme for generating drug resistant derivatives. (B) Testing resistant cells.(C) Using signature melanoma signature genes to cluster bulk RNA-Seq data from cell lines. (D) t-SNE mapwas used to display the expression profiles of 400 single cells isolated using the Fluidigm C-1 instrument.(E) Heat map showing expression of highly expressed and highly variable genes. (F) t-SNE plot of scRNA-Seq of 6545 cells performed on a 10X Chromium instrument. (G) Direct comparison of Fluidigm cells andChromium cells showing similar differences between resistant and parental but distinct patterns for the twomethods. (H ) Differential expression analysis identifies genes significantly altered in the resistant cell linederivative (for details, see Ho et al. 2018).

N. Anaparthy et al.

8 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 9: Single-Cell Applications of Next-Generation Sequencing

HIGH-VOLUME SINGLE-CELLTRANSCRIPTOME STUDIES, THE“MEGACELL” EXPERIMENTS

Perhaps the most revolutionary application ofscRNA-Seq has occurred only recently in thefields of developmental and organism biology,and of course, cancer. Using various forms ofdroplet capture and bead-based amplification, itis now possible to obtain transcriptome profilesfrom hundreds of thousands to millions of cellsfrom embryos, whole organs, or tumors in a few

experimental runs. The sheer amount and com-plexity of the data has spurred the developmentof new analytic methods that can create multi-dimensional arrays for mRNA profiles to reflectlineage relationships among cell types, fromplu-ripotent stem cells to fully differentiated somaticcells. Of course, in dissociating the cells for high-volume sequencing, the original tissue structureis lost. The elegance of these “megacell” experi-ments is in using orthogonal methods to relatethe transcriptomic profiles to cell type and celllocation in the original specimen.

BOX 1. ISOLATION AND AMPLIFICATION OF SINGLE CELLS

The range of methods that have been used to capture single cells for genomic analysis is as varied andnuanced as the range of experiments being pursued, but roughly they separate into two generalclasses according to whether the target cells are abundant, as from disaggregated tissue or celllines, or rare, as circulating tumor cells (CTCs) in blood. Cells in suspension can be easily be separatedby characteristic markers and dispensed into reaction tubes or multiwell plates by fluorescence-activated, cell-sorting FACS (Navin 2011; Baslan et al. 2015; Satija et al. 2015; Alexander et al.2018) or distributed randomly by flowing across microwells (Yuan and Sims 2016; Gierahn et al.2017). As of this writing, however, the high-throughput and reproducibility of microfluidic bead-capture methods for automated capture and amplification of single cells are rapidly becoming thestandard for abundant cells in suspension.

By disaggregating cells from tissue, the spatial relationships are lost. In cases in which the exactsource of cells is important, methods for coupling laser capture microscopy (LCM) with single-cellsequencing library preparation are available to capture cells from tissue preparations on slides(Nichterwitz et al. 2016). Another elegant approach for transcriptome mapping live neurons in situwas developed by Eberwine et al. (1992), in which a combined chemical and fluorescent probeinjected into a cell can be photoactivated to “tag”mRNA molecules with biotin for later purificationusing streptavidin beads and sequenced (Lovatt et al. 2014).

Isolating very rare cells, as exemplified by CTCs from blood or bone marrow aspirates (typicallyone in a million nucleated cells) is much more difficult, and a wide array of commercial and non-commercial methods have been developed for this purpose. Among the commercially availablemethods, the CellSearch (Silicon Biosystems) system is a clinical application using magneticcapture of cells using ferrofluid nanoparticles conjugated to EpCAM antibodies to capture epithelialcells. Captured cell populations can be removed and individual cells isolated by micromanipulationor by using a coupled DEP Array system (Silicon Biosystems) to electrophoretically maneuver cellsinto reaction tubes for amplification. Other forms of automated fluorescence capture include theCellCelector (Automated Lab Systems, Gmbh) and theCyteFinder/Accucyte system (Rarecyte), whichcombines automated identification, cell picking, and amplification of blood samples. Alternatives toEpCAM as an enrichment method include separation by size using the ISET system (RareCells) andshape (Biofluidica).

Yet another nonselective method that has yielded an operational clinical assay for the androgenreceptor variant AR-V7 (Scher et al. 2017) is the high-definition, single-cell analysis assay (HD-SCA)(Marrinucci et al. 2009) offered commercially by Epic Sciences (La Jolla, CA). This method arrays allnucleated cells on glass slides and uses bothmorphology and fluorescencemarkers to identify cells ofinterest. Cells can later be picked for amplification by micromanipulation and DNA copy numberanalysis (Dago et al. 2014; Greene et al. 2016).

Single-Cell Applications of Next-Generation Sequencing

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898 9

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 10: Single-Cell Applications of Next-Generation Sequencing

EMBRYOLOGY AND DEVELOPMENT:FLATWORMS, TOADS, FISH, AND REPTILES

In 2018, two groups synchronously publishedcell type atlases of the freshwater planarian (flat-worm), Schmidtea mediterranea (Fincher et al.2018; Plass et al. 2018). Planarians are multicellorganisms with an organized body structure anddistributed pluripotent stem cells. They are wellknown for their ability to regenerate completeanimals from isolated parts and as such they arefunctionally immortal. The goal of these studieswas to create an atlas of transcriptomic cell typeskeyed back to the known cell and tissue types inthe animal and shed light on the lineage map ofcells during development and regeneration. Us-ing Drop-Seq to create libraries of tens of thou-sands of single cells followed by NGS, bothgroups obtained highly detailed and virtuallycomplete transcriptome atlases of all the knowncell types in the animal. Both groups used theSeurat R toolkit for single-cell genomics (Satijaet al. 2015; Butler et al. 2018) to cluster themRNA expression data and the clustered datawas visualization with t-SNE (van der Maaten2014). They then used a combination of knownmarker genes and in situ hybridization (ISH) onwhole animals to classify the cell types cluster bycluster, identifying more than 20 known celltypes and discovering several previously un-known cell types. This process also created alineage map of planarian development.

Three simultaneous papers, also in 2018,took the atlas and lineage concept a step furtherinto spatiotemporal analysis of vertebrate devel-opment by scRNA-Seq of tens of thousands ofcells from staged embryos of zebrafish (Farrell etal. 2018; Wagner et al. 2018) and Xenopus (Wag-ner et al. 2018). Clustering the transcriptomicprofiles by principle components over the embry-onic time-points defines progressively branchingtree-like maps of development in which the ul-timate fates (e.g., organs, brain regions) can beidentified by well-known markers. This worknot only identifies previously unknown celltypes and nodal branch points in the develop-ment process, but also provides a framework forcomparative analysis of developmental stepsacross species and evolutionary time.

A final example reveals new insights intoboth neurobiology, evolutionary biology, anddevelopment. The origins and lineages of struc-tures in the brain across the evolutionary spec-trum has been an area of intense study formany decades. Tosches et al. (2018) built atranscriptomic atlas of cells comprising the ba-sic brain component (pallium) where develop-ment varies widely from reptiles to birds tomammals. The transcriptomic profiles coupledwith ISH enabled tracking the lineages withhigh resolution and present new opportunitiesfor interpreting structure in terms of synapticorganization.

IMMUNE REPERTOIRE PROFILING

Application of single-cell sequencing for analyz-ing the transcriptomics of immune cells startedwith acquiring cells of known immune typesthrough use of flow cytometry and antibodiesagainst known markers. Innovations such asmass cytometry (CyTOF) increased the numberof proteins that could be simultaneously used tostudy cells and made it possible to unravel morecell types quantitatively and qualitatively. How-ever, the field has now progressed to unbiasedanalysis in which cells are probed without priorknowledge of their type. MARS-Seq (Jaitin et al.2014) is one of the early papers that demonstrat-ed that scRNA-Seq is a great tool for identifica-tion of different populations in heterogeneoustissues. Working with the spleen exposed to li-popolysaccharide (LPS), theywere able to distin-guish different types of immune cells, includingB cells, macrophages, natural killer (NK) cells,dendritic cells, andwere able to study the expres-sion changes in untreated and treated samples.Soon, a plethora of papers leveraging scRNA-Seq and its ease tomergewith other technologieshighlighted how scRNA-Seq can be used to pro-file immune cells in cancer. Immune cell profil-ing has been extensively reviewed in Rosati et al.(2017), Seah et al. (2018), andReece et al. (2016).T- and B-cell repertoire profiling, despite theseadvances, presents a challenge because of theirinherent diversity. VDJ recombinations can re-sult inmillions of differentT-cell receptor (TCR)chains that are generated in response to specific

N. Anaparthy et al.

10 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 11: Single-Cell Applications of Next-Generation Sequencing

stimuli. Similarly, B cells also have variable re-gions, essential for building an adaptive im-munity repertoire. This diversity can be studiedusing either gDNA or cDNA. There are severalcommercial options available from multiplecompanies such as BGI, Adaptive Biotechnolo-gies, and iRepertoire, which probe differentchains and regions from both starting materials.Gierahn et al. (2017) designed a Seq-Well devicethat is particularly useful for low numbers ofcells. On the other hand, droplet-based sequenc-ing methods have made it possible to analyzethousands of these cells at the same time. Aziziet al. (2018), using InDrop, have demonstratedthe strength of paired gene expression and TCRsequencing and McDaniel et al. (2016) havereached the multimillion cell level in ultrahigh-throughput VDJ sequencing. Both BD Biosci-ences and 10X Genomics have several commer-cially available kits for either one of scRNA-Seqand TCR sequencing or both for use on the BDRhapsody and the 10X Genomics Chromiuminstruments. These are great alternatives to set-ting up complex droplet microfluidic systemsand also offer data analysis solutions. In thesekits, the beads have the partial Read1, UMI, and10XGenomics Barcodewith the template switcholigo. This allows for full-length transcription ofthe mRNA and incorporates the barcode andUMI on the 50 end, which is key for the dualapplication. The cDNA is amplified and dividedfor both applications. One half is fragmentedand processed to prepare libraries for transcrip-tomics analysis and the other for T/B-cell reper-toire investigation. Primers designed against theouter and inner constant region of variablechains are used for targeted PCR enrichment.These enriched fragments of different lengthsare processed once again to make sequencinglibraries. The libraries from both applicationsare also sequenced differently. The gene expres-sion libraries are sequenced 26 cycles in read 1 toget the cell and UMI identities and 98 bases inread 2 to get mRNA insert sequence. The VDJrepertoire libraries are sequenced 150 bases inboth reads to get the all the pertinent sequencedata. The sequencing results can be easily ana-lyzed in the software suite designed by 10X Ge-nomics, which allows for clustering of cells,

check enrichment of T/B cell clonotypes, andcharacterize the repertoire, making it an inge-nious tool for immune profiling.

NOVEL APPLICATIONS OF MICROFLUIDICDROPLET-BASED SEQUENCING

The power of genomics and transcriptomics atthe single-cell level has spawned a startling num-ber of variations on the single-cell sequencingparadigm with a dizzying array of cleverly acro-nymic sobriquets, including REAP-Seq, SLaAP-Seq, ATAC-Seq, G&T-Seq, Hi-C, and others.The available menu as of early 2017 was re-viewed carefully by Tsoucas and Yuan (2017).As with the original scDNA and scRNA meth-ods, some of these started out as “one cell pertube” preparations that were adapted to Drop-Seq, but most recent methods are developed di-rectly for automated bead emulsion or micro-well-based formats facilities (see Box 2).

REAP-Seq (RNA expression and protein se-quencing) (Peterson et al. 2017) and CITE-Seq(Stoeckius et al. 2017), both introduced in 2017,are of some particular note among combinationmethods because of their novel use of the directDNA polymerase activity of reverse transcrip-tase for identifyingantibodies orothermoleculesthat bind to specific cells and the broad applica-bility of the methods. The rationale for REAP-Seq andCITE-Seq is thatmRNAexpressiondoesnot necessarily predict protein expression, there-foremeasuring both simultaneously can providea new level of information about cell states andsubtypes. The mRNA transcriptomic compo-nent of REAP-Seq follows the general rulesfor droplet-based mRNA transcriptomics, butthe protein component is assessed by conjugat-ing specific oligonucleotide tags to antibodiesagainst cell-surface proteins. The antibody tagis constructed with a poly(dA) sequence, fol-lowed by a sequence specific to each antibodyand a PCR handle. After antibody binding to acellmixture, the individual cells are encapsulatedwithin a Drop-Seq bead using a microfluidic ap-paratus. After cell lysis, both mRNA and anti-body tags hybridize to the bead primers (con-taining the cell barcode and UMI) and areextended by reverse transcriptase. After emul-

Single-Cell Applications of Next-Generation Sequencing

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898 11

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 12: Single-Cell Applications of Next-Generation Sequencing

sion breakage, the tags and cDNA are amplifiedand sequenced.

In the initial REAP-Seq publication, 82 an-tibodies were used to characterize more than7000 immunocytes by principal componentanalysis and t-SNE and identified a small subsetof cells with the population of naïve CD8+ Tcells. This outlier group was present in all do-nors indicating the presence a new population

enriched in an intermediate step in megakaryo-cyte development. As of this writing, a variationon these methods using DNA aptamers (Apt-Seq) rather than antibodies (Delley et al. 2018)has also been published.

Yet, another variation exploiting the powerof droplet-based single-cell methods in dissect-ing cellular circuitry is Perturb-Seq (Adamsonet al. 2016; Dixit et al. 2016). In this method,

BOX 2. THE 10X CHROMIUM SYSTEM

The 10X Chromium system, built on their proprietary GemCode technology, has pushed the single-cell sequencing field into a new era. The high-throughput function, in which thousands of cells arebarcoded in little time combined with the ability to be used for multiple applications, makes it avaluable asset in various areas of research.

Most of the droplet-based methods require the coencapsulation of a single cell and a barcodedbead/hydrogel microsphere. This is a double Poisson distribution loading resulting in low captureefficiency (7% InDrop, 12.8% Drop-Seq). However, 10X Chromium controller ensures that all drop-lets formed have a gel bead making the process a single Poisson distribution, greatly increasing thecapture efficiency to 50%. This encapsulation is the core of different single-cell applications of 10XChromium and is performed on a chip specific to each application. Samples, reagents, and partition-ing oil are loaded onto the chip in designatedwells. Gel beads in emulsion (GEMs), that is, the sampleand barcoded gel bead in a single droplet, are collected from the recovery well and processed in PCRtubes to generate libraries for sequencing.

10X single-cell gene expression solution (scRNA-Seq) chip has the capacity to process up to eightsamples at a time. The cell suspension volume and concentration are critical for obtaining the desiredcell recovery and keeping the number of doublets low. The barcoded gel beads have a partial Illuminaread 1, 16 nucleotide 10X barcode, 10 nucleotide unique molecular identifiers (UMIs), and 30nucleotide poly(dT) primer sequence, in that order. This combination gives 10X technology a poolof∼750,000 barcodes to separately index a large number of cells. A single such bead and one cell areencapsulated in a GEM with the reagents for reverse transcription (RT). These GEMs are transferredfrom chip into PCR strip in which the cells are lysed and mRNA is reverse transcribed into full-lengthcomplementary DNA (cDNA) (∼7000–9000 bp) with a cell barcode and UMI. Next, GEMs aredissolved to release cDNA, which is cleaned, size selected using SPRI beads, and quantified.Paired end sequencing libraries are constructed by adding Illumina compatible adapters to cDNA,which is fragmented to 300–700 bp. Also, these fragments are all from the 30 end of cDNA becausethis is necessary to identify the origin of cDNA from the sample pool.

Single-cell copy number variation (CNV) uses the same principles but has cell bead, gel bead(CBGB) droplets instead of GEMs. The chip for this application is designed to handle up to foursamples in one run. Cells are first encapsulated with cell matrix to generate cell beads (CBs). InCBs, the cell is lysed and nuclear proteins are degraded although genomic DNA remains trappedin the cell bead. These CBs undergo a further encapsulation step with barcoded gel beads andreagents to generate CBGBs. CBGBs are then harvested from the chip for amplification of DNA togenerate barcoded sequencing libraries. All DNA from a single cell shares a common barcode. Thisenables separation of sequencing reads and assigning them to individual cells to generate CNVs andthis processing of data can be performed easily using software tools from 10X. In addition to thesesingle-cell applications, 10X Chromium can also be used for other applications such as single-cellimmune profiling, genome, and exome sequencing. The barcodes on gel beads are different fromoneanother and specific to each application.

N. Anaparthy et al.

12 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 13: Single-Cell Applications of Next-Generation Sequencing

the cells of interest are transduced with lentiviralvectors carrying multiple CRISPR-based per-turbations and then subjected to an experimen-tal stimulation. In the Broad example, CRISPRperturbations targeted transcription factors im-plicated in dendritic cell responses. After im-mune stimulation, the cells were processed bydroplet-based transcriptomic analysis. The re-sulting sequences reveal both the specific per-turbation(s) and the overall transcriptomic pro-file for each cell, yielding complex cause andeffect networks that go far beyond the informa-tion gained by perturbing one target gene ata time.

SINGLE-CELL EPIGENETICS

Single-cell epigenetic analysis has been a long-standing goal that is now being approached, al-beit with some difficulty. The basic issue is thatepigenetic events cannot be directly read out byamplification with DNA polymerase, and themost commonmethod for detecting CpGmeth-ylation, bisulfite treatment to convert unmethy-latedC residues toU, is very harsh and requires alarge pool of starting DNA to be reliable. Severalmethods, including Hi-C (Nagano et al. 2013),pBAT (Miura et al. 2012), and scRRBS (single-cell reduced representation bisulfite sequencing)(Guo et al. 2013), have been published but theirinitial impact has been limited because of limit-ed coverage of the methylated C residues in anysingle cell. In addition, because of the technicalsensitivity of bisulfite treatment, these methodsmay be difficult to introduce into high-through-put droplet-based or microwell-based pipelines.

A different epigenetic methodology basedon detecting open versus closed chromatin rath-er than CpG methylation relies on marking thechromatin regions accessible to a sequence con-serving modifier. One such approach, ATAC-Seq (Buenrostro et al. 2015), makes use of theTn5 transposase tagmentation process, oftenused to generate fragments for bulk NGS librar-ies, coupled with droplet-based amplification toidentify genomic regions preferentially accessi-ble to the transposase enzyme. The reportedcorrespondence between the single-cell profilesof open chromatin near gene promoters and the

data from bulk populations is remarkable. Veryquickly, single-cell epigenetics has joined therealm of multi-omics, with a plethora of meth-ods and acronymic names. Recently, 10X Geno-mics has come up with their own solution toprobe the regulatory landscape of chromatin inhundreds to thousands of cells in a single sam-ple. The recently launched Chromium SingleCell ATAC Solution allows use in bulk andin nuclei transposition reaction before using amicrofluidic chip to partition nuclei into nano-liter-scale gel bead in emulsion (GEM) samples.A pool of ∼750,000 10X Genomics barcodes(GemCode technology) allows separately anduniquely indexing the transposed DNA of eachindividual cell. Libraries are generated and se-quenced, and 10X Genomics barcodes are usedto associate individual reads back to the in-dividual partitions, and thereby, to each individ-ual cell.

Single-cell epigenome data has limitedutility without a functional readout, preferablyin the form of transcription data. Combiningsingle-cell transcriptomics with methylationdata (scM&T-Seq) was demonstrated by Anger-mueller et al. (2016), using a modification of anearlier method for combined genomics andtranscriptomics called scG&T-Seq (Macaulayet al. 2015). Hou et al. (2016) extended the mul-tiplicity to combine genomic copy number pro-filing and bisulfite sequencing for detectingmethylation with transcriptomics by gently lys-ing cells in individual tubes and separating thesoluble RNA from the nucleus by centrifuga-tion. Processing the nuclear DNA for bisulfitesequencing and amplification yields both geno-mic copy number variations (CNVs) data andmethylationmapping data, although the RNA isprepared for a Smart-Seq-based transcriptomicanalysis (Hou et al. 2016). Combining genomicanalysis with methylation and a transcriptionalreadout will be extremely important for a newlevel of genomic exploration in cancer. Furtheradvances in multi-omics undoubtedly await,likely involving some of the novel methods de-scribed above, and exemplified most recently bya report from Clark et al. (2018) on the combi-nation of chromatin accessibility, DNA methyl-ation, and transcription as scNMT-Seq.

Single-Cell Applications of Next-Generation Sequencing

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898 13

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 14: Single-Cell Applications of Next-Generation Sequencing

APPLICATION IN NEUROBIOLOGY

Characterizing the cellular components of thecomplex ecosystem that is the brain is a primeapplication for single-cell genomic analysis,starting with the detection of specific gene ex-pression in live rat neurons using antisense RNAISH (Eberwine et al. 1992), to transcriptome invivo analysis ([TIVA]; Lovatt et al. 2014) selec-tion of RNA molecules for NGS, and now high-throughput RNA transcriptomics to both clas-sify all the neuroanatomical cell types and tracethe lineages back to stem cells (Dulken et al.2017; also see Kester and van Oudenaarden2018 for a detailed review on the general topicof transcriptomics and lineage).

The obvious applicability of scRNA-Seqto brain mapping and neuroanatomy has ledto a literal “tsunami” of publications involvingscRNA-Seq and there are far too many to ad-dress individually in this review. Furthermore,public interest in understanding the brain hasled to the establishment and funding, by boththe National Institutes of Health (NIH) and pri-vate foundations, of a large collection of braintranscriptome databases. Descriptions of thevarious databases have been collected in “TheBrain Transcriptome Database: A User’s Guide”by Kenneth Kwan and colleagues (Keil et al.2018). This publication also provides detailsconcerning access to the databases and an in-sightful review of the field.

As reviewed by Poulin et al. (2016), cellularidentity is determined by the sum of all geneexpression. Therefore, cellular classification mustinclude expression profiling of all genes at thesingle-cell level. Further, to be maximally use-ful, the results must be rationalized withinthe framework of existing cell taxonomy andneural structure. Perhaps the most importantprivately funded effort to satisfy those needsand to combine scRNA-Seq data with neuro-anatomy is focused on expanding the AllenBrain Atlas, an effort begun in 2001 and ded-icated to characterize the cellular diversity ofthe brain at the molecular level (see brain-map.org). Efforts to complete the taxonomyof both human and mouse adult and develop-ing brains are numerous, and more than 50

published and ongoing studies are listed on theBroad Institute Single Cell Portal along withinstructions for accessing published data (seeportals.broadinstitute.org/single_cell). As de-scribed in the Poulin et al. (2016) review, theresulting molecular atlas will not only identifycell types with high resolution, it will empowerefforts aimed at mapping connectivity betweenspecific neuronal cell types, determining theneuronal types contributing to behavior, andunderstanding selective degradation of neuronsin age and disease.

The neuroscience literature also yieldsanother example of the flexibility of the high-throughput scRNA-Seq methodology. High-throughput “megacell” experiments require dis-aggregation of tissue to obtain individual cells ornuclei and it is, of course, desirable that the cellsbe as close to the native state as possible. It hasbeen a long-standing issue that artificial alter-ations in cell state would occur as a result of theprocess of disrupting living tissue and maskingor contaminating the native transcription pro-file. In an attempt to reliably identify the expres-sion of immediate-early genes (IEGs) after neu-ron activation in single cells without havingthem artificially activated by the disaggregationprocess at the single-cell level, Wu et al. (2017)adapted an old trick for analyzing transcriptionto high-throughput scRNA-Seq by freezingtranscription with actinomycin-D before cellisolation (Keil et al. 2018). Dubbed Act-Seq,the method permitted identification of an arrayof IEGs and activation states in the amygdalaand opens the way for linking stimulation withtranscriptional state in a manner not previouslypossible.

MAPPING AND ANALYSIS FOR CNVPROFILING IN CANCER

Cancer is a genetic disease caused by mutationsand rearrangements of the genome. It is also adisease of cell state. Alterations in the genomelead to alterations in the transcriptome and al-terations in cell phenotype to a diseased state.Therefore, both transcriptomic and genomic in-formation is required to fully characterize can-cer at the single-cell level.

N. Anaparthy et al.

14 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 15: Single-Cell Applications of Next-Generation Sequencing

SINGLE-CELL DNA PROFILING OF TUMORS

Perhaps, the most extensive use of whole-ge-nome single-cell DNA sequencing in cancerhas been in studying tumor heterogeneity andusing copy number profiling to identify subclo-nal complexity and tumor lineage. In this spe-cialized application, the density of reads acrossthe genome directly reflects CNVs resultingfrom gross genomic rearrangements and chro-mosome sorting mistakes during mitosis thatlead to aberrant gains and losses that divergefrom the diploid condition of the normal ge-nome of somatic cells. Genomic rearrangementsare a common feature of both hematologic andsolid tumors, but the mitotic errors and sequen-tial errors in recombination repair that lead toaberrant gains and losses of whole chromosomearms and the focal amplification of oncogenesare a hallmark of nearly all solid tumor tissue.We note that copy number variants are oftencalled CNAs in cancerous tissue, but we willuse the more common term CNV here.

Although most sequencing methods aremeasured in read depth, or x-fold coverage ofthe genome, for the majority of single-cellCNV profiling the actual genome coverage re-quired is actually minimal. Generally, there is noattempt to cover every base-pair as in deep se-quencing of bulk tissue. There are limits to thecoverage that can be obtained from just two setsof chromosomes using the amplification meth-ods available, although studies with near whole-genome coverage have been reported using rep-licating premitotic cells (Wang et al. 2014), andimproved high-coverage, single-cell amplifica-tion have been reported (Chen et al. 2017). Asingle cell contains up to thousands of copiesof some mRNAmolecules, but the nucleus con-tains only two copies (four strands) of eachDNA molecule and the overall target sequenceis 100 times larger, so collecting and amplifyinga complete genome’s worth of sequence infor-mation from single-cell DNA is much more dif-ficult. However, as with the transcriptome im-portant information can be gained from usingNGS “reads” as “tags” for individual sequences.In the case of transcriptomics, the tags are usedto count individual mRNAmolecules to create a

transcriptomic profile of the cell. For the ge-nome, the most frequent approach is to use in-dividual reads as tags representing segments ofthe genome. The density of tags across the ge-nome is a very accurate measure of the relativecopy number of each segment of the genome.This method was used by Navin (2011) to de-scribe spatiotemporal evolution of a single tu-mor and by Alexander et al. (2018) to followtumor evolution in prostate cancer biopsies.The segments are defined informatically by theuser and the numberof reads required to define acopy number profile is very low compared withbase-pair level sequencing. Using sample bar-codes, severalhundrednuclei canbeprofiledonasingle Illumina sequencing lane using ∼250,000mapped reads per cell (Baslan et al. 2015). De-tails of themethod and informatics analysis maybe found in Baslan et al. (2012) and Kendall andKrasnitz (2014). Examples of typical results areshown in Figure 3.

A key aspect of cancer cell copy number pro-filing as shown in Figure 3 is the establishment ofdirect genetic lineage using the complexity ofCNV gains and losses as “genomic barcoding.”Even as the complexity of the CNV landscapeincreases, the specific breakpoints of the CNVsremain stable and can be used as if they werenovel SNPs to follow the history of the tumoror cell population. The Navin laboratory hasused this method extensively in resolving tumorevolution and response to chemotherapy inbreast cancer (Wang et al. 2014; Gao et al.2016; Casasent et al. 2018). It should be notedthat applications listed abovewere performed onpurified nuclei or cells disaggregated from freshor fresh-frozen tissue. Most pathology speci-mens, especially valuable archived specimensfrom clinical trials are preserved as FFPE tissue.With that in mind, Martelotto et al. (2017) de-veloped a method for isolating and sequencingsingle nuclei from FFPE sections.

Once again, 10X Genomics has developedthe Chromium single-cell CNV solution to de-termine genomic heterogeneity and map clonalevolution by profiling hundreds to thousands ofcells in a single sample. This method is signifi-cantly more laborious than others developed by10X Genomics. The method requires encapsu-

Single-Cell Applications of Next-Generation Sequencing

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898 15

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 16: Single-Cell Applications of Next-Generation Sequencing

Tis

sue

prep

arat

ion

Libr

ary

prep

arat

ion

Seq

uenc

ing

Seg

men

t bin

s an

d de

rive

copy

num

ber

stat

e

Dis

play

sin

gle-

cell

copy

num

ber

prof

iles

Der

ive

clon

al s

truc

ture

and

phy

loge

ny

Cou

nt n

umbe

r of

rea

ds p

er b

inan

d no

rmal

ize

by G

C c

onte

ntM

ap r

eads

/rem

ove

PC

R d

uplic

ates

Dat

a in

tegr

atio

nan

d vi

sual

izat

ion

Dat

a pr

oces

sing

Sin

gle-

nucl

eus

sequ

enci

ng

Flo

w s

ortin

gnu

clei

WG

A

A

Figu

re3.

The

core

compo

nentsto

theflow

ofsampleanddata

processing

bySN

S.(A

)Single-nucleus

sequ

encing

(top

panel),throughsequ

ence

read

mapping

andderivation

ofsingle-cellcopy

numberprofi

les(m

iddlepanel),to

clon

alstructureanalysisandvisualization(bottom

panel).(Legend

continueson

followingpage.)

N. Anaparthy et al.

16 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 17: Single-Cell Applications of Next-Generation Sequencing

NY

U00

7.G

S7.

2.2

NY

U00

7.G

S7.

2.2.

1

Clo

ne124936231 1 2 3 4 5 6 7 8 9 10 11 12

Sec

tor

1 B

enig

n LL

B2

GL6

(3+

3) L

LM3

HG

PIN

LLA

4 B

enig

n LM

B5

Ben

ign

LMM

6 B

enig

n LM

A7

Ben

ign

RM

B8

Infla

m R

MM

9 B

enig

n R

MA

10 B

enig

n R

LB11

Ben

ign

RLM

12 B

enig

n R

LA13

Ben

ign

LPL

Cop

y #

0 med

- 2

+m

ed -

1m

edia

nm

ed +

1m

ed +

2+

2C 2C R

x>

2C

Plo

idy

13 14 15 16 17 18 19 20 21 22x y

Sec

tor

Plo

ody

Mul

tiplie

r

Err

or

CJA3850CJA3988CJA3911CJA3932CJA3918CJA4044CJA4097CJA4091CJA4093CJA4081CJA4090CJA4085CJA4089CJA4092CJA4019CJA3915CJA4023CJA3940CJA4117CJA4123CJA3849CJA4126CJA3967CJA4069CJA4052CJA4084CJA3952CJA3957CJA3946CJA3947CJA3935CJA3863CJA3857CJA3944CJA3870CJA4101CJA3872CJA4109CJA4110CJA3861CJA3865CJA3874CJA4102CJA3867CJA4099CJA3869CJA3854CJA3855CJA4104CJA3873CJA3875CJA4103CJA4107CJA3890CJA3876CJA3886CJA3860CJA3858CJA3859CJA3887CJA3868CJA3862CJA4105CJA3913CJA3959CJA3889CJA4079CJA4080

0

NY

U00

7.G

S7.

2.1.

2N

YU

007.

GS

7.2.

1.1

NY

U00

7.G

S7.

2.1

NY

U00

7.G

S7.

2.2

NY

U00

7.G

S7.

2.2.

1

NY

U00

7.G

S7.

2.1.

2

NY

U00

7.G

S7.

2.1.

1

NY

U00

7.G

S7.

2.1

Anc

estr

al m

utan

t cel

l

B

Figu

re3.

(Continu

ed)(B)Magnified

view

ofclon

alstructureof

aprim

aryprostate

cancer

case

contrastingthedegree

ofgeno

miccomplexitybetweenthetwoclon

es.T

here

aretwocolortracks

belowdend

rogram

forclon

esandsubclonesand

they

aredesignated

asfollows:NYU007.GS7.2.1

(red)withitstwosubclones,NYU007.GS7.2.1.1

(green)andNYU007.

GS7.2.1.2

(orange)

andNYU007.GS7.2.2

(yellow)withitssinglesubclone,N

YU007.GS7.2.2.1

(pink).B

show

sinferred

clon

alprogressionfortheclon

esusingthesamecolorschemeforclonesandsubclonesas

describedinA.(From

Alexand

eret

al.2018;reprintedcourtesy

oftheAmerican

Association

forCancerResearch.)(N

otetheancestralmutantcellis

designated

byablue

circle.)

Single-Cell Applications of Next-Generation Sequencing

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898 17

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 18: Single-Cell Applications of Next-Generation Sequencing

lation of individual nuclei (nuclei suspensionsare prepared beforehand) in a hydrogelmatrix togenerate cell beads on a microfluidic chip. Thecell bead is treated to lyse the encapsulated celland denature the gDNA. Then, a second micro-fluidic chip is used to separately index the gDNAof each individual cell in away akin to other 10XGenomics protocols. The caveat with this meth-od is that the low recovery rate of∼15%–17% (of∼3000 inputted, only ∼500 are recovered). Dataresolution is relatively good (∼1–2 Mb) and canbe improved by increasing the amount of se-quencing/cell and, consequently, costs.

ESTIMATING COPY NUMBER FROMTRANSCRIPTOME DATA

In single-cell transcriptomic cancer studies us-ing disaggregated cells, it is of great importanceto distinguish actual malignant cells from tu-mor-associated stromal cells and other sur-rounding normal tissue cells in the sample. Inthe absence of specific protein markers, it is be-coming common practice to use expression datafrom a population of single cells as a basis forinferring clonal genomic CNVs that identify themalignant aneuploid cells. Several methods forparallel genomic and transcriptomic analysis onsingle cells have been published (Dey et al. 2015;Macaulay et al. 2015); however, they requiremanual separation of DNA and RNA and arenot amenable to high-throughput methods. In astudy to identify a developmental hierarchy indendroglioma, Tirosh et al. (2016a) used thismethod to separate the malignant populationsfrom three tumors and then further analyzedtheir transcriptomic profiles to show that differ-ent development states existed within each CNVdefined subclone, indicating a developmentalhierarchy in the cancer independent of the ge-nome. They used a sliding window of 100 genesacross the genome to compare genome-wide ex-pression levels among the cells and were able todefine two populations; in one, the expressionpatternsmatched knownCNVs from bulk DNAsequencing. These were designated the malig-nant cells for the rest of the study. Althoughthis method is based on inference and doesnot provide an accurate copy number profile

of each cell, it can work well when enough sup-porting evidence (e.g., bulk CNV analysis) isavailable. A similar, but distinct, approach wassuccessfully used by Müller et al. (2016) instudying glioblastoma multiforme (GBM) (Ti-rosh et al. 2016a), and software programs areavailable online at the Broad Institute website(see github.com/broadinstitute/inferCNV).

SINGLE-CELL TRANSCRIPTION STUDIESIN CANCER

Numerous groups have applied various scRNA-Seqmethods to assess heterogeneity, tumor evo-lution, and cell-of-origin studies in tumors withrelatively low throughput methods (Patel et al.2014; Tirosh et al. 2016b; Li et al. 2017). Usinghigh-throughputDrop-Seqmethods, thisfield isbeing revolutionized. Targeted sequencing ofthousands of single cells has enabled the identi-fication of pathogenic mutations in patients inremission revealing complex clonal evolution inacute myeloid leukemia (AML) (Pellegrino etal. 2018). Using whole-transcriptome methods,other groups have cataloged the array of stromalcells in the lung cancer microenvironment(Lambrechts et al. 2018) and identified the cellof origin of kidney cancer (Young et al. 2018).

One study in particular demonstrates thepower of multiplex single-cell analysis going be-yond sequencing. A recent report from the lab-oratories of Derrick Lin, Aviv Regev, and BradBernstein, used single-cell transcription analy-sis, coupled with histology and cell biology torevise the subtypes of head and neck squamouscell cancer (HNSCC) and reveal the presenceand location and function of novel tumor cellphenotypes related to metastasis in HNSCC(Puram et al. 2017).

Traditional molecular analyses of disaggre-gated tumor specimens are complicated by thevariety of nonmalignant stromal and immuno-cytes that along with the tumor cells comprisethe tumor tissue. However, rather than describ-ing the average genomic signal from the bulkanalysis of a homogenized tumor, the investiga-tors performed scRNA sequencing on morethan 6000 individual cells from 18 head andneck cancer patients to explore the diversity of

N. Anaparthy et al.

18 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 19: Single-Cell Applications of Next-Generation Sequencing

cells within the all three populations, literallycreating a molecular and cellular atlas of thedisease. A schematic of the method taken fromtheir paper is shown in Figure 4. Key to thissingle-cell study was the use of complementarygenomic techniques, including inferred copynumber profiling from scRNA-Seq data andexpression phenotyping, with the aim of distin-guishing individual malignant cells with aneu-ploid genomes from the copy number normal(euploid) but phenotypically distinct stromalpopulations, including cancer-associated fibro-blasts and immunocytes. Once identified, anal-ysis of the expression profiles by t-SNE revealedthat the malignant (aneuploid) cells fell intoseparate clusters according to cancer subtype,whereas the stromal cells from all cases clusteredtogether around identifiable cell phenotypes.Furthermore, the malignant cells defined threesubtypes, rather than the four thatwere clinicallyrecognized. The fourth category being identicalto one of the others, but with an overabundanceof fibroblasts that made it appear as a distinct

subtype in bulk cell analysis. The project furtherbecame a tour de force when they used the RNAexpression profiles to infer differing protein ex-pression patterns among the tumor cells, and infact defined a new epithelial mesenchymal tran-sition (EMT)-like phenotype, which they call p(partial) EMT, which was inversely expressedrelative to the epithelial profile of the tumorbody. They then returned to the tumor itselfby using antibodies made to those proteins todetermine the positions of each cell type onmounted sections of primary tissue, showingthat the pEMT cells created a layer on the grow-ing front on the tumor facing the stroma and aconcentration of cancer-associated fibroblasts(CAFs). This result lead to an examination ofthe ligands and receptors expressed at the tu-mor–stroma boundary, which revealed a cor-responding overexpression of receptor–ligandpairs, indicating the activity of a pEMT induc-tion pathway between the CAFs and boundarytumor cells. Thus, the study took disaggregatedsingle-cell transcriptomics frommolecular phe-

Patients withoral cavity HNSCC

Single-cellRNA-Seq

Immuno-histochemistry

t-SNE

T cells

Epithelialmalignant

cells

pEMTmalignant

cells

CAFsFibroblasts

In situ

5 matchingLNs

18 primarytumors

Figure 4 . Schematic of multiplex analysis of head and neck tumors. HNSCC, head and neck squamous cellcancer; LNs, lymph nodes; pEMT, partial epithelial mesenchymal transition; CAFs, cancer-associated fibroblasts.(From Puram et al. 2017; reprinted, with permission, from Elsevier © 2017.)

Single-Cell Applications of Next-Generation Sequencing

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898 19

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 20: Single-Cell Applications of Next-Generation Sequencing

notyping, to histopathology to functional geno-mics and clinical subtyping. Although this studyon a small number of HNSCC tumors is notdefinitive, it demonstrates the power that canbe realized using single-cell omics coupledwith functional genomics and classical clinicalpathology.

SINGLE-CELL SEQUENCING ON RAREAND ULTRA-RARE CELLS

As a polar opposite to the “megacell” experi-ments described above is the profiling of ultra-rare cells from liquid (blood) biopsies, in whichinformation from a very few cells could providenear real-time information on the progress ofcancer treatment. The killing force in cancer ismetastasis, occurring often years after the pri-mary tumor is discovered and excised. Yet, can-cer treatment is most often based on biomarkersin the primary tumor alone. Biopsies of meta-static sites are infrequent, and necessarily in-complete because it is impossible to find,let alone biopsy, all of the metastatic sites in aprogressing patient. One avenue to the real-timestate of the tumor is the blood, either throughcirculating cell-free DNA or RNA, or throughindividual cells shed into the blood from thevarious metastatic sites. The key is identifyingand then isolating cells that in most cases repre-sent perhaps 10–100 nucleated cells in a stan-dard blood draw containing up to 30 millionnucleated white blood cells (WBCs).

Capture and detailed analysis of circulatingtumor cells (CTCs) has been a goal for two de-cades, and although at least 35 different technol-ogies are in development, only one method,CellSearch, has gained Food and Drug Admin-istration (FDA) approval for identifying cancerthat is starting to progress. Single-cell DNA se-quencing has presented an opportunity to ad-vance the information gained from liquid biop-sies and our understanding of metastasis itself.CellSearch is a well-developed method that cap-tures putative tumor cells on the basis of theepithelial marker EpCam, but has been mainlyused for simply enumerating EpCam-positivecells. Other parameters that have been used forcapture include cell size or plasticity among oth-

ers. Nonselective methods in which putativecells are identified and characterized in situ byimmunofluorescent antibody tags have shownpromise for both enumeration and molecularanalysis at the single-cell level. In particular,the high-density, single-cell analysis or “no cellleft behind assay” originally developed by Kuhnand colleagues (Marrinucci et al. 2009) andcommercially developed by Epic Sciences (Wer-ner et al. 2015; Greene et al. 2016) plates a fullblood draw on slides and then identifies non-WBC with immunofluorohistochemistry. Thismethod has been used to identify clonal rela-tionships based onCNVprofiles, follow changesin cell populations during therapy in castrate-resistant prostate cancer (Dago et al. 2014; Ma-lihi et al. 2018), confirm vascular mimicry bytumor cells in small-cell lung cancer (William-son et al. 2016), and confirm the identity of cir-culating cells inmelanoma (Ruiz et al. 2015). Anexample of the utility of this method is shown inFigure 5, in which Malihi and colleagues usedsingle-cell CNV profiling to track the lineage ofcirculating cells in a metastatic prostate cancerpatient from the primary lesion to the bonemar-row (Malihi et al. 2018). As with the prostatecancer biopsies shown in Figure 3, CNV profil-ing is key to the identification of actual tumorcells from genomically normal cells in the anal-ysis of mixed populations, and is especially crit-ical in identifying ultra-rare (one in a million)tumor cells from liquid biopsies. An automatedversion of this slide-based method based on thesame principle has been developed by RareCyte(Seattle, WA).

CONCLUSIONS

As we experienced in the 2000s with NextGenSequencing itself, automation has democratizedsingle-cell sequencing. Single-cell sequencing inall of its variations has passed from the develop-ment stage in individual laboratories to a pointwhere it can be applied by every laboratory andacross every area of biology. Technical methodsfor single-cell genomic copy number and tran-scriptome profiling as well as targeted sequenc-ing for mutation analysis are highly evolved anddata can be presented in accepted formats. It has

N. Anaparthy et al.

20 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 21: Single-Cell Applications of Next-Generation Sequencing

indeed progressed from the technology of thefuture to the technology of the present, and stillmore novel approaches are continually beingrealized and combined to create more insightfulobservations. The versatility in the cell-isolationmethods has removed many previous barriers.We can analyze large populations, small popu-lations, and even ultra-rare cells, without limits.Cell “atlases” of the type created by “megacell”projects described above will take advantage ofall of these novel methods and will have an im-

portant role in understanding multicellular bi-ology at increasingly detailed levels. A particu-larly important example is the effort sponsoredby the Chan-Zuckerberg Initiative to create acomplete human cell atlas based on multiplexmolecular single-cell profiling of every cell typein the body (see www.chanzuckerberg.com/human-cell-atlas).

Such projects will undoubtedly lead to newinsights into cell–cell and cell–organism dy-namics in all life forms and may well be the

Clones

Samples

1

2

Samples:1 2 3

Legend:AmpNeutralDeletion

PTENdeletion

TP53deletion

MYC andNCOA2

amplification

MTCPTTPBMTP

3

4

5

6

7

8

Chr

omos

ome

9

10

11

12

131415161718

21

X

Individual tumor cells

Y

PT

TP

1P

TT

P2

PT

TP

3P

TT

P4

PT

TP

5P

TT

P6

PT

TP

7P

TT

P8

PT

TP

9P

TT

P10

PT

TP

11P

TT

P12

PT

TP

13P

TT

P14

PT

TP

15P

TT

P16

PT

TP

17P

TT

P18

PT

TP

19P

TT

P20

PT

TP

21P

TT

P22

PT

TP

23P

TT

P24

PT

TP

25

MT

C1

MT

C2

MT

C3

MT

C4

MT

C5

MT

C6

MT

C7

MT

C8

MT

C9

MT

C10

MT

C11

MT

C12

MT

C13

MT

C14

MT

C15

MT

C16

MT

C17

MT

C18

MT

C19

MT

C20

BM

TP

1B

MT

P2

BM

TP

3B

MT

P4

BM

TP

5B

MT

P6

BM

TP

7B

MT

P8

BM

TP

9

BM

TP

10B

MT

P11

BM

TP

12B

MT

P13

BM

TP

14

MT

C21

MT

C22

Figure 5. Tumor cell identification and clonal lineage by single-cell copy number variants (CNVs) profiling.Heat map and phylogenic tree of CNVs across the entire population of cells from metastatic circulation (MTC),bone marrow biopsy (BMTP), and primary resection (PTP). Sample type and clones are identified using colorkey. Three clones were identified: clone 1 consisting of prostate cells with hallmark alterations, clone 2 with fewcopy number alterations (CNAs), and clone 3 showing lineage relationship of bone marrow, primary, andcirculation. Key genes such as MYC, NCOA2, PTEN, and TP53 have been highlighted by chromosome locationacross clone and sample type. (From Malihi et al. 2018; reprinted courtesy of the American Association forCancer Research.)

Single-Cell Applications of Next-Generation Sequencing

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898 21

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 22: Single-Cell Applications of Next-Generation Sequencing

keys to profound new understanding of the hu-man condition and indeed life itself in bothhealth and disease. We await these develop-ments with great anticipation.

ACKNOWLEDGMENTS

The authors acknowledge the continuing sup-port of single-cell sequencing from the BreastCancer Research Foundation, Prostate CancerFoundation, and Susan G. Komen for the Cure(J.H. and N.A.), support of the National CancerInstitute (NCI) and Leidos Biomedical Research(Contract No. HHSN261200800001E, contractagreements 12XS527 and 15X003), and of theNCI’s USC Norris Comprehensive Cancer Cen-ter (CORE) Support (GrantNo. 5P30CA014089-40) to Prof. Peter Kuhn, USC. We also thankJude Kendall, Alex Krasnitz, and Joan Alexan-der (Cold Spring Harbor Laboratory), Jens Dur-ruthy (10X Genomics) for figures and analyticalinsights, and Peter Kuhn (USC) for support andenlightening discussions.

REFERENCES

Adamson B, Norman TM, Jost M, ChoMY, Nuñez JK, ChenY, Villalta JE, Gilbert LA, Horlbeck MA, Hein MY, et al.2016. A multiplexed single-cell CRISPR screening plat-form enables systematic dissection of the unfolded pro-tein response. Cell 167: 1867–1882.e21. doi:10.1016/j.cell.2016.11.048

Alexander J, Kendall J, McIndoo J, Rodgers L, Aboukhalil R,Levy D, Stepansky A, Sun G, Chobardjiev L, Riggs M, etal. 2018. Utility of single-cell genomics in diagnostic eval-uation of prostate cancer. Cancer Res 78: 348–358.doi:10.1158/0008-5472.CAN-17-1138

Angermueller C, Clark SJ, LeeHJ,Macaulay IC, TengMJ,HuTX, Krueger F, Smallwood S, Ponting CP, Voet T, et al.2016. Parallel single-cell sequencing links transcriptionaland epigenetic heterogeneity. Nat Methods 13: 229–232.doi:10.1038/nmeth.3728

Azizi E, Carr AJ, Plitas G, Cornish AE, Konopacki C, Prab-hakaran S, Nainys J, Wu K, Kiseliovas V, Setty M, et al.2018. Single-cell map of diverse immune phenotypes inthe breast tumor microenvironment. Cell 174: 1293–1308.e36. doi:10.1016/j.cell.2018.05.060

Baslan T, Hicks J. 2017. Unravelling biology and shiftingparadigms in cancer with single-cell sequencing. NatRev Cancer 17: 557–569.

BaslanT, Kendall J, Rodgers L, CoxH, RiggsM, StepanskyA,Troge J, Ravi K, Esposito D, Lakshmi B, et al. 2012. Ge-nome-wide copy number analysis of single cells.Nat Pro-toc 7: 1024–1041. doi:10.1038/nprot.2012.039

Baslan T, Kendall J, Ward B, Cox H, Leotta A, Rodgers L,Riggs M, D’Italia S, Sun G, Yong M, et al. 2015. Optimiz-ing sparse sequencing of single cells for highly multi-plex copy number profiling. Genome Res 25: 714–724.doi:10.1101/gr.188060.114

Bendall SC, Davis KL, Amir el-AD, Tadmor MD, SimondsEF, ChenTJ, ShenfeldDK,NolaGP, Pe’er D. 2014. Single-cell trajectory detection uncovers progression and regu-latory coordination in human B cell development. Cell157: 714–725. doi:10.1016/j.cell.2014.04.005

Brennecke P, Anders S, Kim JK, Kolodziejczyk AA, ZhangX,Proserpio V, Baying B, Benes V, Teichmann SA, MarioniJC, et al. 2013. Accounting for technical noise in single-cell RNA-seq experiments. Nat Methods 10: 1093–1095.doi:10.1038/nmeth.2645

Briggs JA, Weinreb C, Wagner DE, Megason S, Peshkin L,Kirschner MW, Klein AM. 2018. The dynamics of geneexpression in vertebrate embryogenesis at single-cell res-olution. Science 360: eaar5780. doi:10.1126/science.aar5780

Buenrostro JD, Wu B, Litzenburger UM, Ruff D, GonzalesML, Snyder MP, Chang HY, Greenleaf WJ. 2015. Sin-gle-cell chromatin accessibility reveals principles ofregulatory variation. Nature 523: 486–490. doi:10.1038/nature14590

Butler A, Hoffman P, Smibert P, Papalexi E, Satija R. 2018.Integrating single-cell transcriptomic data across differ-ent conditions, technologies, and species. Nat Biotechnol36: 411–420. doi:10.1038/nbt.4096

Casasent AK, Schalck A, Gao R, Sei E, Long A, PangburnW,Casasent T, Meric-Bernstam F, Edgerton ME, Navin NE.2018. Multiclonal invasion in breast tumors identified bytopographic single cell sequencing. Cell 172: 205–217.e12. doi:10.1016/j.cell.2017.12.007

Casbon JA, Osborne RJ, Brenner S, Lichtenstein CP. 2011. Amethod for counting PCR template molecules with appli-cation to next-generation sequencing. Nucleic Acids Res39: e81. doi:10.1093/nar/gkr217

ChenC, XingD, Tan L, LiH, ZhouG,Huang L, Xie XS. 2017.Single-cell whole-genome analyses by Linear Amplifica-tion via Transposon Insertion (LIANTI). Science 356:189–194. doi:10.1126/science.aak9787

Clark SJ, Argelaguet R, Kapourani CA, Stubbs TM, Lee HJ,Alda-Catalinas C, Krueger F, Sanguinetti G, KelseyG, Marioni JC, et al. 2018. scNMT-seq enables jointprofiling of chromatin accessibility DNA methylationand transcription in single cells. Nat Commun 9: 781.doi:10.1038/s41467-018-03149-4

Dago AE, Stepansky A, Carlsson A, Luttgen M, Kendall J,Baslan T, Kolatkar A,WiglerM, Bethel K, GrossME, et al.2014. Rapid phenotypic and genomic change in responseto therapeutic pressure in prostate cancer inferred by highcontent analysis of single circulating tumor cells. PLoSONE 9: e101777. doi:10.1371/journal.pone.0101777

Dean FB, Hosono S, Fang L,Wu X, Faruqi AF, Bray-Ward P,Sun Z, Zong Q, Du Y et al. 2002. Comprehensive humangenome amplification using multiple displacementamplification. Proc Natl Acad Sci 99: 5261–5266. doi:10.1073/pnas.082089499

Delley CL, Liu L, Sarhan MF, Abate AR. 2018. Combinedaptamer and transcriptome sequencing of single cells. SciRep 8: 2919. doi:10.1038/s41598-018-21153-y

N. Anaparthy et al.

22 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 23: Single-Cell Applications of Next-Generation Sequencing

Deng Q, Ramsköld D, Reinius B, Sandberg R. 2014. Single-cell RNA-seq reveals dynamic, random monoallelic geneexpression in mammalian cells. Science 343: 193–196.doi:10.1126/science.1245316

Dey SS, Kester L, Spanjaard B, Bienko M, van OudenaardenA. 2015. Integrated genome and transcriptome se-quencing of the same cell. Nat Biotechnol 33: 285–289.doi:10.1038/nbt.3129

Ding B, Zheng L, Zhu Y, Li N, Jia H, Ai R,Wildberg A,WangW. 2015. Normalization and noise reduction for singlecell RNA-seq experiments. Bioinformatics 31: 2225–2227. doi:10.1093/bioinformatics/btv122

Dixit A, Pamas O, Li B, Chen J, Fulco CP, Jerby-AmonL, Marjanovic ND, Dionne D, Burks T, Raychowd-hury R, et al. 2016. Perturb-Seq: Dissecting molecularcircuits with scalable single-cell RNA profiling of pooledgenetic screens. Cell 167: 1853–1866.e17. doi:10.1016/j.cell.2016.11.038

Dulken BW, Leeman DS, Boutet SC, Hebestreit K, Brunet A.2017. Single-cell transcriptomic analysis defines hetero-geneity and transcriptional dynamics in the adult neuralstem cell lineage. Cell Rep 18: 777–790. doi:10.1016/j.celrep.2016.12.060

Eberwine J, Yeh H, Miyashiro K, Cao Y, Nair S, Finnell R,ZettelM, Coleman P. 1992. Analysis of gene expression insingle live neurons. Proc Natl Acad Sci 89: 3010–3014.doi:10.1073/pnas.89.7.3010

Farrell JA,WangY, Riesenfeld SJ, ShekharK, RegevA, SchierAF. 2018. Single-cell reconstruction of developmental tra-jectories during zebrafish embryogenesis. Science 360:eaar3131. doi:10.1126/science.aar3131

Fincher CT, Wurtzel O, de Hoog T, Kravarik KM, ReddienPW. 2018. Cell type transcriptome atlas for the planarianSchmidtea mediterranea. Science 360: eaaq1736. doi:10.1126/science.aaq1736

FuGK,Hu J,Wang PH, Fodor SP. 2011. Counting individualDNA molecules by the stochastic attachment of diverselabels. Proc Natl Acad Sci 108: 9026–9031. doi:10.1073/pnas.1017621108

Gao R, Davis A, McDonald TO, Sei E, Shi X, Wang Y, TsaiPC, Casasent A, Waters J, Zhang H, et al. 2016. Punc-tuated copy number evolution and clonal stasis in tri-ple-negative breast cancer. Nat Genet 48: 1119–1130.doi:10.1038/ng.3641

Gierahn TM, Wadsworth MH II, Hughes TK, Bryson BD,Butler A, Satija R, Fortune S, Love JC, Shalek AK. 2017.Seq-Well: Portable, low-cost RNA sequencing of singlecells at high throughput. Nat Methods 14: 395–398.doi:10.1038/nmeth.4179

Greene SB, Dago AE, Leitz LJ, Wang Y, Lee J, Werner SL,Gendreau S, Patel P, Jia S, Zhang L, et al. 2016. Chromo-somal instability estimation based on next generation se-quencing and single cell genome wide copy number var-iation analysis. PLoS ONE 11: e0165089.

Grün D, van Oudenaarden A. 2015. Design and analysis ofsingle-cell sequencing experiments. Cell 163: 799–810.doi:10.1016/j.cell.2015.10.039

Guo H, Zhu P, Wu X, Li X, Wen L, Tang F. 2013. Single-cellmethylome landscapes of mouse embryonic stem cellsand early embryos analyzed using reduced representationbisulfite sequencing. Genome Res 23: 2126–2135.

Haghverdi L, Buettner F, Theis FJ. 2015. Diffusion mapsfor high-dimensional single-cell analysis of differentia-tion data. Bioinformatics 31: 2989–2998. doi:10.1093/bioinformatics/btv325

Ho YJ, Anaparthy N,Molik D, MathewG, Aicher T, Patel A,Hicks J, Hammell MG. 2018. Single-cell RNA-seq analy-sis identifies markers of resistance to targeted BRAF in-hibitors in melanoma cell populations. Genome Res 28:1353–1363. doi:10.1101/gr.234062.117

HouY, GuoH, CaoC, Li X,HuB, Zhu P,WuX,Wen L, TangF,HuangY, et al. 2016. Single-cell triple omics sequencingreveals genetic, epigenetic, and transcriptomic heteroge-neity in hepatocellular carcinomas. Cell Res 26: 304–319.doi:10.1038/cr.2016.23

Islam S, Kjällquist U,Moliner A, Zajac P, Fan JB, LönnerbergP, Linnarsson S. 2011. Characterization of the single-celltranscriptional landscape by highly multiplex RNA-seq.Genome Res 21: 1160–1167. doi:10.1101/gr.110882.110

Islam S, Zeisel A, Joost S, La Manno G, Zajac P, Kasper M,Lönnerberg P, Linnarsson S. 2014. Quantitative single-cell RNA-seq with unique molecular identifiers. NatMethods 11: 163–166. doi:10.1038/nmeth.2772

Jaitin DA, Kenigsberg E, Keren-Shaul H, Elefant N, Paul F,Zaretsky I, Mildner A, Cohen N, Jung S, Tanay A, et al.2014. Massively parallel single-cell RNA-seq for marker-free decomposition of tissues into cell types. Science 343:776–779. doi:10.1126/science.1247651

Keil JM, Qalieh A, Kwan KY. 2018. Brain transcriptomedatabases: A user’s guide. J Neurosci 38: 2399–2412.

Kendall J, Krasnitz A. 2014. Computational methods forDNA copy-number analysis of tumors. Methods MolBiol 1176: 243–259. doi:10.1007/978-1-4939-0992-6_20

Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH,Zahler AM, Haussler D. 2002. The human genomebrowser at UCSC. Genome Res 12: 996–1006. doi:10.1101/gr.229102

Kester L, van Oudenaarden A. 2018. Single-cell transcrip-tomics meets lineage tracing. Cell Stem Cell 23: 166–179.doi:10.1016/j.stem.2018.04.014

Kharchenko PV, Silberstein L, Scadden DT. 2014. Bayesianapproach to single-cell differential expression analysis.Nat Methods 11: 740–742. doi:10.1038/nmeth.2967

Kivioja T, Vähärautio A, Karlsson K, Bonke M, Enge M,Linnarsson S, Taipale J. 2012. Counting absolute numbersof molecules using unique molecular identifiers. NatMethods 9: 72–74. doi:10.1038/nmeth.1778

Klein AM, Mazutis L, Akartuna I, Tallapragada N, Veres A,Li V, Peshkin L,Weitz DA, KirschnerMW. 2015. Dropletbarcoding for single-cell transcriptomics applied to em-bryonic stem cells. Cell 161: 1187–1201. doi:10.1016/j.cell.2015.04.044

Lambrechts D, Wauters E, Boeckx B, Aibar S, Nittner D,Burton O, Bassez A, Decaluwé H, Pircher A, Van denEynde K, et al. 2018. Phenotype molding of stromal cellsin the lung tumormicroenvironment.NatMed 24: 1277–1289. doi:10.1038/s41591-018-0096-5

Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC,Baldwin J, Devon K, Dewer K, Doyle M, FitzHugh W,et al. 2001. Initial sequencing and analysis of the humangenome. Nature 409: 860–921.

Single-Cell Applications of Next-Generation Sequencing

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898 23

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 24: Single-Cell Applications of Next-Generation Sequencing

LiWV, Li JJ. 2018. An accurate and robust imputationmeth-od scImpute for single-cell RNA-seq data. Nat Commun9: 997.

Li H, Courtois ET, Sengupta D, Tan Y, Chen KH, Goh JJL,Kong SL, Chua C, Hon LK, TanWS, et al. 2017. Referencecomponent analysis of single-cell transcriptomes eluci-dates cellular heterogeneity in human colorectal tumors.Nat Genet 49: 708–718. doi:10.1038/ng.3818

Lovatt D, Ruble BK, Lee J, Dueck H, Kim TK, Fisher S,Francis C, Spaethling JM, Wolf JA, Grady MS, et al.2014. Transcriptome in vivo analysis (TIVA) of spatiallydefined single cells in live tissue. Nat Methods 11: 190–196. doi:10.1038/nmeth.2804

Macaulay IC, Haerty W, Kumar P, Li YI, Hu TX, Teng MJ,GoolamM, Saurat N, Coupland P, Shirley LM, et al. 2015.G&T-seq: Parallel sequencing of single-cell genomes andtranscriptomes. Nat Methods 12: 519–522. doi:10.1038/nmeth.3370

Macosko EZ, Basu A, Satija R, Nemesh J, Shekhar K, Gold-manM, Tirosh I, Bialas AR, KamitakiN,Martersteck EM,et al. 2015. Highly parallel genome-wide expression pro-filing of individual cells using nanoliter droplets.Cell 161:1202–1214. doi:10.1016/j.cell.2015.05.002

Malihi P, Morikado M, Welter L, Liu ST, Miller ET, Cada-neanu RM, Knudsen BS, Lewis MS, Carlsson A, RuizVelasco C, et al. 2018. Clonal diversity revealed by mor-phoproteomic and copy number profiles of single pros-tate cancer cells at diagnosis. Converg Sci Phys Oncol 4:015003. doi:10.1088/2057-1739/aaa00b

Marco E, Karp RL, Guo G, Robson P, Hart AH, Trippa L,Guo-Cheng Y. 2014. Bifurcation analysis of single-cellgene expression data reveals epigenetic landscape. ProcNatl Acad Sci 111: E5643–E5650.

Marrinucci D, Bethel K, Luttgen M, Bruce RH, Nieva J,Kuhn P. 2009. Circulating tumor cells fromwell-differen-tiated lung adenocarcinoma retain cytomorphologic fea-tures of primary tumor type. Arch Pathol Lab Med 133:1468–1471.

Martelotto LG, Baslan T, Kendall J, Geyer FC, Burke KA,Spraggon L, Piscuoglio S, Chadalavada K, Nanjangud G,Ng CK, et al. 2017. Whole-genome single-cell copy num-ber profiling from formalin-fixed paraffin-embeddedsamples. Nat Med 23: 376–385.

McDaniel JR, DeKosky BJ, Tanno H, Ellington AD, Geor-giou G. 2016. Ultra-high-throughput sequencing of theimmune receptor repertoire from millions of lympho-cytes. Nat Protoc 11: 429–442. doi:10.1038/nprot. 2016.024

Miura F, Enomoto Y, Dairiki R, Ito T. 2012. Amplification-free whole-genome bisulfite sequencing by post-bisulfiteadaptor tagging. Nucleic Acids Res 40: e136. doi:10.1093/nar/gks454

Müller S, Liu SJ, Di Lullo E, Malatesta M, Pollen AA, Nowa-kowski TJ, Kohanbash G, Aghi M, Kriegstein AR, LimDA, et al. 2016. Single-cell sequencing maps gene ex-pression to mutational phylogenies in PDGF- and EGF-driven gliomas. Mol Syst Biol 12: 889. doi:10.15252/msb.20166969

Nagano T, Lubling Y, Stevens TJ, Schoenfelder S, Yaffe E,DeanW, Laue ED, TanayA, Fraser P. 2013. Single-cell Hi-C reveals cell-to-cell variability in chromosome structure.Nature 502: 59–64.

Navin NN. 2011. Tumour evolution inferred by single-cellsequencing.Nature 472: 90–94. doi:10.1038/nature09807

Nichterwitz S, Chen G, Aguila Benitez J, Yilmaz M, StorvallH, Cao M, Sandberg R, Deng Q, Hedlund E. 2016. Lasercapture microscopy coupled with Smart-seq2 for precisespatial transcriptomic profiling. Nat Commun 7: 12139.doi:10.1038/ncomms12139

Patel AP, Tirosh I, Trombetta JJ, Shalek AK, Gillespie SM,Wakimoto H, Cahill DP, Nahed BV, Curry WT, NahedBV, et al. 2014. Single-cell RNA-seq highlights intratu-moral heterogeneity in primary glioblastomas. Science344: 1396–1401. doi:10.1126/science.1254257

Pellegrino M, Sciambi A, Treusch S, Durruthy-Durruthy R,Gokhale K, Jacob J, Chen TX, Geis JA, Oldham W,Matthews J, et al. 2018. High-throughput single-cellDNA sequencing of acute myeloid leukemia tumorswith droplet microfluidics. Genome Res 28: 1345–1352.doi:10.1101/gr.232272.117

Peterson VM, Zhang KX, Kumar N, Wong J, Li L, WilsonDC,Moore R,McClanahanTK, Sadekova S, KlappenbachJA. 2017.Multiplexed quantification of proteins and tran-scripts in single cells. Nat Biotechnol 35: 936–939.doi:10.1038/nbt.3973

Plass M, Solana J, Wolf FA, Ayoub S, Misios A, Glažar P,Obermayer B, Theis FJ, Kocks C, Rajewsky N. 2018. Celltype atlas and lineage tree of a whole complex animalby single-cell transcriptomics. Science 360: eaaq1723.doi:10.1126/science.aaq1723

Poulin JF, Tasic B, Hjerling-Leffler J, Trimarchi JM, Awatra-mani R. 2016. Disentangling neural cell diversity usingsingle-cell transcriptomics. Nat Neurosci 19: 1131–1141.

Puram SV, Tirosh I, Parikh AS, Patel AP, Yizhak K, GillespieS, Rodman C, Luo CL, Mroz EA, Emerick KS, et al. 2017.Single-cell transcriptomic analysis of primary and meta-static tumor ecosystems in head and neck cancer.Cell 171:1611–1624.e24. doi:10.1016/j.cell.2017.10.044

Ramsköld D, Luo S, Wang YC, Li R, Deng Q, Faridani OR,Daniels GA, Khrebtukova I, Loring JF, Laurent LC, et al.2012. Full-length mRNA-Seq from single-cell levels ofRNA and individual circulating tumor cells. Nat Biotech-nol 30: 777–782. doi:10.1038/nbt.2282

Reece A, Xia B, Jiang Z, Noren B, McBride R, Oakey J. 2016.Microfluidic techniques for high throughput single cellanalysis. Curr Opin Biotechnol 40: 90–96.

Rosati E, Dowds CM, Liaskou E, Henriksen EKK, KarlsenTH, Franke A. 2017. Overview of methodologies for T-cell receptor repertoire analysis. BMC Biotechnol 17: 61.doi:10.1186/s12896-017-0379-9

Ruiz C, Li J, Luttgen MS, Kolatkar A, Kendall JT, Flores E,Topp Z, Samlowski WE, McClay E, Bethel K, et al. 2015.Limited genomic heterogeneity of circulating melanomacells in advanced stage patients. Phys Biol 9: 016008.

Satija R, Farrell JA, Gennert D, Schier AF, Regev A. 2015.Spatial reconstruction of single-cell gene expression data.Nat Biotechnol 33: 495–502. doi:10.1038/nbt.3192

ScherHI, Graf RP, Schreiber NA,McLaughlin B, LuD, LouwJ, Danila DC, Dugan L, Johnson A, Heller G, et al. 2017.Nuclear-specific AR-V7 protein localization is necessaryto guide treatment selection in metastatic castration-re-sistant prostate cancer. Eur Urol 71: 874–882.

N. Anaparthy et al.

24 Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 25: Single-Cell Applications of Next-Generation Sequencing

Seah YFS, Hu H, Merten CA. 2018. Microfluidic single-celltechnology in immunology and antibody screening. MolAspects Med 59: 47–61. doi:10.1016/j.mam.2017.09.004

SettyM, TadmorMD, Reich-Zeliger S, Angel O, Salame TM,Kathail P, Choi K, Bendall S, Friedman N, Pe’er D. 2016.Wishbone identifies bifurcating developmental trajecto-ries from single-cell data. Nat Biotechnol 34: 637–645.

ShalekAK, Satija R, Adiconis X, Gertner RS, Gaublomme JT,Raychowdhury R, Schwartz S, Yosef N,Malboeuf C, LuD,et al. 2013. Single-cell transcriptomics reveals bimodalityin expression and splicing in immune cells. Nature 498:236–240. doi:10.1038/nature12172

Shapiro E, Biezuner T, Linnarsson S. 2013. Single-cell se-quencing-based technologies will revolutionizewhole-or-ganism science. Nat Rev Genet 14: 618–630.

Shiroguchi K, Jia TZ, Sims PA, Xie XS. 2012. Digital RNAsequencing minimizes sequence-dependent bias and am-plification noise with optimized single-molecule bar-codes. Proc Natl Acad Sci 109: 1347–1352. doi:10.1073/pnas.1118018109

Stoeckius M, Hafemeister C, StephensonW, Houck-LoomisB, Chattopadhyay PK, Swerdlow H, Satija R, Smibert P.2017. Simultaneous epitope and transcriptome mea-surement in single cells. Nat Methods 14: 865–868.doi:10.1038/nmeth.4380

Tang F, Barbacioru C, Wang Y, Nordman E, Lee C, Xu N,Wang X, Bodeau J, Tuch BB, Siddiqui A, et al. 2009.mRNA-Seq whole-transcriptome analysis of a singlecell. Nat Methods 6: 377–382. doi:10.1038/nmeth.1315

Telenius H, Carter NP, Bebb CE, Nordenskjöld M, PonderBA, Tunnacliffe A. 1992. Degenerate oligonucleotide-primed PCR: General amplification of target DNA bya single degenerate primer. Genomics 13: 718–725.doi:10.1016/0888-7543(92)90147-K

Ting DT, Wittner BS, Ligorio M, Vincent Jordan N, ShahAM, Miyamoto DT, Aceto N, Bersani F, Brannigan BW,Xega K, et al. 2014. Single-cell RNA sequencing identifiesextracellular matrix gene expression by pancreatic circu-lating tumor cells. Cell Rep 8: 1905–1918. doi:10.1016/j.celrep.2014.08.029

Tirosh I, Venteicher AS, Hebert C, Escalante LE, Patel AP,Yizhak K, Fisher JM, Rodman C, Mount C, Filbin MG, etal. 2016a. Single-cell RNA-seq supports a developmentalhierarchy in human oligodendroglioma. Nature 539:309–313. doi:10.1038/nature20123

Tirosh I, Izar B, Prakadan SM,WadsworthMH II, Treacy D,Trombetta JJ, RotemA, Rodman C, Lian C, Murphy G, etal. 2016b. Dissecting the multicellular ecosystem of met-astatic melanoma by single-cell RNA-seq. Science 352:189–196. doi:10.1126/science.aad0501

Tosches MA, Yamawaki TM, Naumann RK, Jacobi AA,Tushev G, Laurent G. 2018. Evolution of pallium, hippo-campus, and cortical cell types revealed by single-cell transcriptomics in reptiles. Science 360: 881–888.doi:10.1126/science.aar4237

Trapnell C, Cacchiarelli D, Grimsby J, Pokharel P, Li S,Morse M, Lennon NJ, Livak KJ, Mikkelsen TS, Rinn JL.2014. The dynamics and regulators of cell fate decisionsare revealed by pseudotemporal ordering of single cells.Nat Biotechnol 32: 381–386. doi:10.1038/nbt.2859

Tsoucas D, Yuan GC. 2017. Recent progress in single-cellcancer genomics. Curr Opin Genet Dev 42: 22–32.

van der Maaten LJP. 2014. Accelerating t-SNE using tree-based algorithms. J Mach Learn 15: 3221–3245.

van der Maaten LJP, Hinton GE. 2008. Visualizing high-dimensional data using t-SNE. J Mach Learn 9: 2579–2605.

van Dijk D, Sharma R, Nainys J, Yim K, Kathail P, Carr AJ,Burdziak C, Moon KR, Chaffer CL, Pattabiraman D, et al.2018. Recovering gene interactions from single-cell datausing data diffusion. Cell 174: 716–729.e27.

Van Gelder RN, von Zastrow ME, Yool A, Dement WC,Barchas JD, Eberwine JH. 1990. Amplified RNA synthe-sized from limited quantities of heterogeneous cDNA.Proc Natl Acad Sci 87: 1663–1667.

Wagner DE, Weinreb C, Collins ZM, Briggs JA, MegasonSG, Klein AM. 2018. Single-cell mapping of gene expres-sion landscapes and lineage in the zebrafish embryo. Sci-ence 360: 981–987. doi:10.1126/science.aar4362

Wang Y,Waters J, LeungML, Unruh A, RohW, Shi X, ChenK, Scheet P, Vattathil S, Liang H, et al. 2014. Clonalevolution in breast cancer revealed by single nucleus ge-nome sequencing. Nature 512: 155–160. doi:10.1038/nature13600

WattenbergM, Viegas F, Johnson I. 2016. How to use t-SNEeffectively. Distill doi:10.23915/distill.00002

Werner SL, Graf RP, Landers M, Valenta DT, Schroeder M,Greene SB, Bales N, Dittamore R, Marrinucci D.2015. Analytical validation and capabilities of the epicCTC platform: Enrichment-free circulating tumour celldetection and characterization. J Circ Biomark 4: 3.doi:10.5772/60725

Williamson SC, Metcalf RL, Trapani F, Mohan S, AntonelloJ, Abbott B, Leong HS, Chester CP, Simms N, Polanski R,et al. 2016. Vasculogenic mimicry in small cell lung can-cer. Nat Commun 7: 13322. doi:10.1038/ncomms13322

Woyke T, Doud DFR, Schulz F. 2017. The trajectory of mi-crobial single-cell sequencing. Nat Methods 14: 1045–1054. doi:10.1038/nmeth.4469

Wu YE, Pan L, Zuo Y, Li X, Hong W. 2017. Detecting acti-vated cell populations using single-cell RNA-Seq.Neuron96: 313–329.e6. doi:10.1016/j.neuron.2017.09.026

Xu C, Su Z. 2015. Identification of cell types from single-celltranscriptomes using a novel clusteringmethod. Bioinfor-matics 31: 1974–1980.

Xu J, Fang R, Chen L, Chen D, Xiao JP, Yang W, Wang H,Song X,Ma T, Bo S, et al. 2016. Noninvasive chromosomescreening of human embryos by genome sequencing ofembryo culture medium for in vitro fertilization. ProcNatl Acad Sci 113: 11907–11912.

Young MD, Mitchell TJ, Vieira Braga FA, Tran MGB, Stew-art BJ, Ferdinand JR, CollordG, BottingRA, PopescuDM,Loudon KW, et al. 2018. Single-cell transcriptomes fromhumankidneys reveal the cellular identityof renal tumors.Science 361: 594–599. doi:10.1126/ science.aat1699

Yuan J, Sims PA. 2016. An automated microwell platformfor large-scale single cell RNA-Seq. Sci Rep 6: 33883.doi:10.1038/srep33883

Zeisel A, Muñoz-Manchado AB, Codeluppi S, Lönnerberg P,LaManno G, Juréus A,Marques S, MungubaH, He L, Bet-sholtzC, et al. 2015. Brain structure. Cell types in themousecortex and hippocampus revealed by single-cell RNA-seq. Science 347: 1138–1142. doi:10.1126/science.aaa1934

Single-Cell Applications of Next-Generation Sequencing

Advanced Online Article. Cite this article as Cold Spring Harb Perspect Med doi: 10.1101/cshperspect.a026898 25

ww

w.p

ersp

ecti

vesi

nm

edic

ine.

org

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from

Page 26: Single-Cell Applications of Next-Generation Sequencing

published online January 7, 2019Cold Spring Harb Perspect Med  Naishitha Anaparthy, Yu-Jui Ho, Luciano Martelotto, Molly Hammell and James Hicks Single-Cell Applications of Next-Generation Sequencing

Subject Collection Next-Generation Sequencing in Medicine

Next-Generation Sequencing Technologies

Elaine R. MardisW. Richard McCombie, John D. McPherson and Sequencing

Single-Cell Applications of Next-Generation

Martelotto, et al.Naishitha Anaparthy, Yu-Jui Ho, Luciano

Cancer Genomics: From Discovery to ClinicThe Impact of Next-Generation Sequencing on

Elaine R. MardisNext-Generation SequencingFuture Promises and Concerns of Ubiquitous

W. Richard McCombie and John D. McPherson

DisorderNext-Generation Sequencing in Autism Spectrum

Stephan J. Sanders

Next-Generation Sequencing StrategiesShawn E. Levy and Braden E. Boone

Worldview of the EpigenomeSequencing in High Definition Drives a Changing

Emily Hodges

Characterizing the Cancer Genome in BloodSarah-Jane Dawson

Whole-Genome Sequencing in CancerEric Y. Zhao, Martin Jones and Steven J.M. Jones Pharmacogenetics and Pharmacogenomics

The Role of Next-Generation Sequencing in

Ute I. Schwarz, Markus Gulilat and Richard B. Kim

Disease RiskHigh Throughput Sequencing and Assessing

Shannon M. Rego and Michael P. Snyder DisabilityResearch and Diagnostics for Intellectual The Use of Next-Generation Sequencing for

et al.Ricardo Harripaul, Abdul Noor, Muhammad Ayub,

Clinical Versus Research SequencingYuriy Shevchenko and Sherri Bale Results

Next-Generation Sequencing and the Return of

Sénécal, et al.Bartha Maria Knoppers, Minh Thu Nguyen, Karine

http://perspectivesinmedicine.cshlp.org/cgi/collection/ For additional articles in this collection, see

Copyright © 2019 Cold Spring Harbor Laboratory Press; all rights reserved

on April 3, 2022 - Published by Cold Spring Harbor Laboratory Press http://perspectivesinmedicine.cshlp.org/Downloaded from