
Post on 26-Sep-2020






SUPPLEMENTARY INFORMATION (includes supplementary methods, clinical summaries, and supplementary Figure e-1)

Supplementary Methods

Tissue biopsy processing and sequencing. Tissue for the study was collected under a research protocol approved by the Johns Hopkins University School of Medicine Institutional Review Board (IRB NA_00003551). Fresh frozen tissues were obtained from eight cases, and paraffin-processed tissues from two. Except for the paraffin-processed tissues, the small amounts of biopsy material were consumed completely by the library preparation process, leaving no material for additional studies. DNA or RNA isolation, library preparation, and sequencing on the Illumina MiSeq platform were all performed in the Johns Hopkins Deep Sequencing and Microarray Core Facility. Biopsy tissue was snap frozen for DNA isolation, or submerged in RNALater immediately after biopsy for RNA isolation. All samples were first treated with lyticase to break down fungal cell walls before proceeding to either DNA or RNA isolation, to ensure capture of any potential fungal sequences in the sample. Both types of tissue were homogenized in 180 µl of Y1 solution (1 M sorbitol with 0.1 M EDTA and β-mercaptoethanol) and treated with 20 µl of 1 U/µl lyticase at 30°C for 30 minutes with rotation. DNA was then isolated following the QiaAmp DNA mini-protocol, and RNA was isolated following the Qiagen miRNeasy protocol for total RNA. Library preparation was performed using either the Illumina DNA Nano sample preparation kit or the Illumina TruSeq stranded total RNA kit for DNA samples (depending on the available quantity of DNA) and the Nugen RNAseq library kit for RNA samples.

For each sample, one run (no multiplexing) of an Illumina MiSeq instrument was used for sequencing, generating up to 30 million reads with read lengths of 150–300 bp (Supplementary Table 1). The number of reads varied based on the quality of the DNA and RNA.

Computational analysis. All of the bacterial and viral genomes are complete (the chromosomes are free of gaps), although the eukaryotes are of necessity draft genomes, because no eukaryotic pathogen has yet been completely sequenced. The non-human genomes were masked to remove low-complexity sequences using the NCBI dustmasker program [11]. To identify common vector contaminants, the database also includes the genome of phiX174 and the UniVec and EMVec databases of vector and adapter sequences. (A more sensitive alignment program such as Bowtie2 could have been used instead of Kraken. Bowtie2 and similar aligners report the best alignment (or the best k alignments for some small value of k) for each read rather than the species. They do not report a taxonomic assignment for each read; thus a read matching two distinct species would yield two matches, and further software would have to be developed to determine the best taxonomic assignment, which might be at the genus level or above. Kraken does this automatically. Also, because Kraken does not do a full alignment of each read, it is significantly faster than Bowtie2. Finally, we note that the number of unclassified reads in each sample was typically much less than 1% of the total, indicating that Kraken's sensitivity was very high.)
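The remark above, that a read matching two distinct species may be best assigned at the genus level or above, can be illustrated with a lowest-common-ancestor lookup. The following is a minimal sketch only, not Kraken's actual procedure (Kraken operates on k-mers rather than whole-read matches). The toy taxonomy and the function names are assumptions made for illustration.

```python
# Toy child -> parent taxonomy. Illustrative only; not taken from the study database.
PARENT = {
    "Mycobacterium tuberculosis": "Mycobacterium",
    "Mycobacterium bovis": "Mycobacterium",
    "Mycobacterium": "Bacteria",
    "Escherichia coli": "Escherichia",
    "Escherichia": "Bacteria",
    "Bacteria": "root",
}


def lineage(taxon):
    """Return the path from a taxon up to the root, starting with the taxon itself."""
    path = [taxon]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def lowest_common_ancestor(taxa):
    """Return the deepest taxon shared by the lineages of all given taxa."""
    taxa = list(taxa)
    if not taxa:
        raise ValueError("at least one taxon is required")
    shared = set(lineage(taxa[0]))
    for taxon in taxa[1:]:
        shared &= set(lineage(taxon))
    # Walk upward from the first taxon; the first shared node is the deepest one.
    for node in lineage(taxa[0]):
        if node in shared:
            return node
    return "root"  # Fallback when the taxa share no recorded ancestor.


# A read matching two species of the same genus is assigned to that genus.
print(lowest_common_ancestor(["Mycobacterium tuberculosis", "Mycobacterium bovis"]))
```

In this sketch, a read with matches in two different genera would be assigned higher still (here, "Bacteria"), which is the behavior an alignment-only pipeline would need extra software to reproduce.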

To enrich the Kraken report for the presence of infectious agents, several filtering steps were implemented. First, reads matching known contaminants such as phage phiX174, a standard spike-in control for Illumina sequencing instruments, were removed. Next, the post-processing removed reads from the common human commensal bacteria E. coli and P. acnes and the potential laboratory contaminant S. cerevisiae [12]. After these corrections, the total percentage of reads from each remaining species was recomputed to produce the heatmap in Figure 1. Appendix 1 (Table e-1) shows the read length, total read count, and number of human reads per sample. Tables e-2 and e-3 contain detailed read counts and percentages for all species for each sample. Appendix 1 includes files containing all reads found in all samples, after removal of human reads and vector contaminants. Appendix 3 contains a file showing the raw Kraken output for patient PT-5.
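The filtering and renormalization described above can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the study's actual post-processing code: the species labels in the exclusion set, the function name, and the example counts are all hypothetical.

```python
# Species removed before renormalization. The exact labels are assumptions;
# a real Kraken report may name these taxa differently.
EXCLUDED = {
    "Enterobacteria phage phiX174",  # Illumina spike-in control
    "Escherichia coli",              # common human commensal
    "Propionibacterium acnes",       # common human commensal
    "Saccharomyces cerevisiae",      # potential laboratory contaminant
}


def recompute_percentages(read_counts, excluded=EXCLUDED):
    """Drop excluded species, then express each remaining species as a
    percentage of the remaining reads."""
    kept = {sp: n for sp, n in read_counts.items() if sp not in excluded}
    total = sum(kept.values())
    if total == 0:
        return {}
    return {sp: 100.0 * n / total for sp, n in kept.items()}


# Hypothetical counts for one sample (not data from the study).
example_counts = {
    "Species A": 30,
    "Species B": 10,
    "Escherichia coli": 50,
    "Enterobacteria phage phiX174": 10,
}
print(recompute_percentages(example_counts))
```

Because the percentages are taken over the retained reads only, a species with few absolute reads can still account for a large share of a sample once the contaminants are removed.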

As complementary approaches, we analyzed the non-human reads (as classified by Kraken) with MetaPhlAn [13] and DIAMOND [14]. MetaPhlAn uses a marker-gene approach and has a database built from ~17,000 microbial and eukaryotic reference genomes. We employed version 2.2.0 with the options --bt2_ps sensitive-local --min_alignment_len 100. DIAMOND is a fast alternative to BLASTX, which we used to search translated protein sequences from reads that Kraken failed to classify. In no case did these other classifiers produce results inconsistent with those from Kraken.

CLINICAL SUMMARIES

Cases with a high degree of diagnostic confidence and positive pathogen identification

Patient PT-8: A patient with osteomyelitis and multiple nodular lung lesions who presented with multifocal brain and spinal lesions. Pathogen identified: Mycobacterium tuberculosis.

A 67-year-old woman presented to another hospital with a two-month history of subacute back pain, fevers and exertional dyspnea. She was transferred to our institution for evaluation of an “epidural abscess.” Diagnostic evaluation revealed numerous intracranial and spinal cord lesions, multiple lung nodules, lumbar osteomyelitis and epidural abscess at the site of two previous steroid injections for back pain, and bilateral psoas muscle abscesses. Extensive microbiological studies that included samples from vertebral bone biopsy, serum, sputum and bronchoalveolar lavage (BAL) fluid, and bronchial and lung biopsies were negative for bacteria, fungi or mycobacteria. Two BAL fluid samples were also negative for mycobacteria when tested with the Xpert MTB/RIF® assay [1]. QuantiFERON-TB Gold® interferon-gamma release assays (IGRA) obtained on four separate occasions during the pre-biopsy clinical course were indeterminate. Two cerebrospinal fluid (CSF) samples showed mild pleocytosis (6 leukocytes/mm³ in sample 1 and 15 leukocytes/mm³ in sample 2) and elevated protein (72 mg/dL in sample 1 and 103 mg/dL in sample 2). Most of the CSF microbiological studies, which included stains for AFB, mycobacterial and fungal cultures, and other assays for fungal or bacterial species, were negative. The patient experienced a rapid worsening of her neurological condition and became unresponsive, ultimately requiring prolonged intubation. She had been on long-standing, broad-spectrum antibiotic treatment, but in the context of her decompensation an anti-tuberculous drug regimen was empirically added. She improved transiently without a clear diagnosis and was reluctant to proceed with any additional biopsies or tissue sampling. A research assay of CSF using the PLEX-ID platform [2] was positive for Nocardia spp.

Because Nocardia was felt to be a plausible explanation for her multiple brain and pulmonary nodules, and given the absence of other potential pathogens in her workup and her improvement on antibiotics that were active against Nocardia, she was taken off anti-tuberculous therapy and transitioned to trimethoprim/sulfamethoxazole and meropenem. Steroid treatment that had been instituted for management of brain and spinal cord edema was tapered. She was discharged to a rehabilitation facility but was later re-admitted with a worsening neurological condition and encephalopathy. Brain MRI showed progression of multiple nodular enhancing lesions throughout supratentorial and infratentorial brain structures compared with imaging obtained 4 weeks earlier (Figure 2). The patient agreed to pursue a brain biopsy for diagnosis. Biopsies from the perilesional brain tissue (S1) and a nodular lesion (S2) were obtained for conventional pathology, microbiology, and NGS studies.

Two DNA sequencing runs yielded 15M reads from sample S1 and 14M reads from sample S2. These runs yielded the fewest microbial reads of any of the patients in our study: 18 and 22 bacterial reads, and only one to six viral and fungal reads, respectively, for samples S1 and S2. Nonetheless, a clear finding emerged for sample S2: 15 reads from Mycobacterium tuberculosis. Despite the small absolute number of reads, this species accounted for 68% of the bacterial reads detected (15 of 22). We manually confirmed the sequence assignments using BLAST [3] to align them against the NCBI nt database. We then re-aligned all reads against one specific genome, M. tuberculosis 7199-99 (accession NC_020089.1), using Bowtie2 with sensitive local alignment settings [4]. This procedure yielded 34 reads that were randomly distributed along the M. tuberculosis genome. As additional support for this diagnosis, we note that M. tuberculosis was not observed as a contaminant in any other sample in this case series, and detection in a brain biopsy due to quiescent infection is unexpected.

Histopathological studies of the corresponding S2 sample showed necrotizing granulomas, although extensive studies with AFB, GMS and other special stains failed to identify any microorganism (Figure 2). Because the clinical symptoms of the patient were consistent with tuberculosis, necrotizing granulomas were present in the biopsy, and M. tuberculosis was identified by sequencing, treatment for tuberculosis was re-initiated the same day that sequencing was completed. The patient responded rapidly over the next few days and was discharged to continue her anti-tuberculous treatment at home. The patient has exhibited nearly complete cognitive and neurologic recovery, although at 5 months after the diagnosis was established she continued to have residual back pain resulting from her spine pathology.

Patient PT-5: A patient with a f

