in silico reconstruction of viral genomes from small rnas improves virus-derived small interfering...
TRANSCRIPT
Raunak Shrestha
10th Nov. 2011
Source: Vodovar N, Goic B, Blanc H, Saleh MC. In silico reconstruction of viral genomes from small RNAs improves virus-derived small interfering RNA profiling. J Virol. 2011 Nov;85(21):11016-21
RNA interference (RNAi) pathway
Source: http://upload.wikimedia.org/wikipedia/commons/thumb/e/e4/ShRNA_Lentivirus.svg/473px-ShRNA_Lentivirus.svg.png
2
• 21-nucleotide-long virus-derived small interfering RNAs (vsiRNAs) are produced by cleavage of viral double-stranded RNA (dsRNA)
• vsiRNAs contain the viral genomic imprint
• This property can be used to reconstruct the viral genome from vsiRNAs
3
Method
• Model organism for study: Drosophila melanogaster
4
(Workflow figure: naïve D. melanogaster S2 cells infected with supernatant from infected S2R+ cells → total RNA extracted → deep sequencing of RNA (Illumina) → RNA reads → genome assembly → contigs → BLAST against the NCBI nucleotide database; RNA sequencing (Sanger method))
BLAST results
• Viral contigs aligned to different regions of these viruses:
1. Flock house virus (FHV) variant American nodavirus (ANV)
2. Drosophila C virus (DCV)
3. Drosophila X virus (DXV)
• But the obtained contigs and the NCBI sequences had significant differences
• Possibility: viruses present in these supernatants were different from those previously published
• These “new” viral sequences were labeled as
1. FHVS2R+
2. DCVS2R+
3. DXVS2R+
5
• The paper introduces a Perl script, “Paparazzi”
• It accurately reconstitutes viral genomes through an iterative alignment/consensus-call procedure, using an initial reference sequence as a scaffold
• The resulting full-length consensus sequence is then reused to profile vsiRNAs
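The iterative alignment/consensus-call idea can be sketched in a few lines of Python. This is a toy illustration only, not the Paparazzi script itself: the function names (`align`, `call_consensus`, `reconstruct`), the brute-force ungapped alignment, the `min_cov` criterion, and the choice to keep the scaffold base at unresolved positions are all assumptions made for the example.

```python
from collections import Counter


def align(read, ref, max_mismatches):
    """Return the best ungapped start position of read on ref, or None."""
    best_pos, best_mm = None, max_mismatches + 1
    for pos in range(len(ref) - len(read) + 1):
        window = ref[pos:pos + len(read)]
        mm = sum(1 for a, b in zip(read, window) if a != b)
        if mm < best_mm:
            best_pos, best_mm = pos, mm
    return best_pos


def call_consensus(reads, ref, max_mismatches=1, min_cov=1):
    """Pile reads up on ref and call the majority base at each position."""
    piles = [Counter() for _ in ref]
    for read in reads:
        pos = align(read, ref, max_mismatches)
        if pos is None:
            continue  # read does not align within the mismatch budget
        for offset, base in enumerate(read):
            piles[pos + offset][base] += 1
    consensus = []
    for scaffold_base, pile in zip(ref, piles):
        if sum(pile.values()) >= min_cov:
            consensus.append(pile.most_common(1)[0][0])
        else:
            # Assumption: keep the scaffold base where criteria are not met.
            consensus.append(scaffold_base)
    return "".join(consensus)


def reconstruct(reads, scaffold, max_mismatches=1, min_cov=1, max_iter=10):
    """Repeat alignment and consensus calling until the sequence is stable."""
    ref = scaffold
    for _ in range(max_iter):
        new_ref = call_consensus(reads, ref, max_mismatches, min_cov)
        if new_ref == ref:
            break
        ref = new_ref
    return ref


# Toy data: the scaffold differs from the "true" genome at two positions.
true_genome = "ATGGCGTACCTTAGGCATCGAATC"
scaffold = "ATGACGTACCTTAGGCATCGCATC"
reads = [true_genome[i:i + 12] for i in range(13)]
print(reconstruct(reads, scaffold) == true_genome)  # True
```

In this toy case every read aligns to the scaffold with at most one mismatch, so one round of consensus calling corrects both positions and the second round confirms that the sequence no longer changes.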
6
How Paparazzi works
7
(Figure: Paparazzi workflow diagram; tools labeled in the figure: Bowtie, SSAKE)
• Paparazzi reconstructed the 3,107-nucleotide-long genome sequence of FHVS2R+ RNA1
• But the first four nucleotides were unresolved (i.e., the criteria to call the consensus were not met)
• Paparazzi also reconstructed the same FHVS2R+ RNA1 sequence when FHVNCBI was taken as the reference sequence
• The reconstructed genome differed from ANVNCBI by 9.14%
• So Paparazzi can also be used when a distantly related sequence is taken as the reference
8
• The Paparazzi-reconstructed genome was compared with the Sanger sequencing-determined sequence of FHVS2R+ RNA1
• Difference of only a single nucleotide (0.03%)
• In contrast, the sequences of ANVNCBI and FHVS2R+ RNA1 differ by 2.83% over 3,107 nucleotides
• Paparazzi can be used as a substitute for Sanger sequencing in this context
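The percentages on this slide follow from simple arithmetic over the 3,107-nucleotide RNA1 sequence. The sketch below checks them; `percent_difference` is a hypothetical helper for two equal-length, already aligned sequences, and the figure of roughly 88 positions is derived here from the 2.83% on the slide rather than stated in the source.

```python
def percent_difference(seq_a, seq_b):
    """Percentage of positions that differ between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 100.0 * diffs / len(seq_a)


GENOME_LENGTH = 3107  # length of FHVS2R+ RNA1, from the slide

# One differing nucleotide between Paparazzi and Sanger sequences.
print(f"{100.0 * 1 / GENOME_LENGTH:.2f}%")  # 0.03%

# 2.83% divergence corresponds to roughly this many positions.
print(round(2.83 / 100 * GENOME_LENGTH))  # 88

# Toy usage of the helper on two short aligned sequences.
print(percent_difference("ACGT", "ACGA"))  # 25.0
```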
9
Genome breakpoint identification
• FHVS2R+ RNA2
• 64 differences compared with the Sanger-determined sequence
• The differences introduced premature stop codons
• They also introduced two artifactual internal duplications (absent in the Sanger sequences)
• The reconstructed sequence did not represent the actual functional genome
• The RNA2 segment of FHV is prone to internal deletions
• These deletions give rise to defective interfering particles (DIs)
• Existence of DIs confirmed by PCR and Sanger sequence data
10
Genome breakpoint identification
• Comparison of the Sanger sequence obtained for DIs with that of ANVNCBI RNA2
• Identification of three breakpoints
• All differences observed between the Paparazzi-reconstructed and Sanger sequencing-determined FHVS2R+ RNA2 sequences resided in the regions covered by DIs or around DI breakpoints
11
Genome breakpoint identification
• Paparazzi was instructed to
• identify breakpoints in viral genomes
• eliminate reads aligning against these breakpoints
12
• At each cycle, the contigs obtained were aligned against the virus consensus sequence using BLAT
• This allowed the identification of the breakpoints
• Paparazzi successfully reconstructed genomes of other detected viruses - DCVS2R+ and DXVS2R+
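One way to picture the breakpoint step is with alignment blocks of a contig against the consensus, in the spirit of the block lists that BLAT reports. The sketch below is an assumption-laden illustration, not Paparazzi's actual logic: the tuple layout, the function names, and the rule "adjacent in the contig but separated in the consensus" are all hypothetical, as is the reading that junction-spanning reads are the ones to eliminate.

```python
def find_breakpoints(blocks, min_gap=1):
    """Find deletion junctions from ordered alignment blocks.

    blocks: list of (contig_start, consensus_start, length), sorted by
    contig_start, with 0-based coordinates.
    """
    breakpoints = []
    for (q1, t1, n1), (q2, t2, _n2) in zip(blocks, blocks[1:]):
        contig_gap = q2 - (q1 + n1)
        consensus_gap = t2 - (t1 + n1)
        # Blocks touch in the contig but are far apart in the consensus:
        # the contig skips a stretch of the genome (as a DI would).
        if contig_gap == 0 and consensus_gap >= min_gap:
            breakpoints.append({"contig_pos": q2, "deleted": (t1 + n1, t2)})
    return breakpoints


def drop_junction_reads(read_hits, breakpoints):
    """Keep only reads that do not span a junction on the contig.

    read_hits: list of (read_id, contig_start, read_length).
    """
    junctions = [bp["contig_pos"] for bp in breakpoints]
    kept = []
    for read_id, start, length in read_hits:
        end = start + length  # half-open interval [start, end)
        if any(start < j < end for j in junctions):
            continue  # read covers bases on both sides of a junction
        kept.append((read_id, start, length))
    return kept


# Toy contig with two internal deletions relative to the consensus.
blocks = [(0, 0, 100), (100, 350, 80), (180, 900, 50)]
bps = find_breakpoints(blocks)
print([bp["deleted"] for bp in bps])  # [(100, 350), (430, 900)]

hits = [("r1", 10, 21), ("r2", 90, 21), ("r3", 100, 21), ("r4", 170, 21)]
print([h[0] for h in drop_junction_reads(hits, bps)])  # ['r1', 'r3']
```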
13
(Graph generated from Supplementary Table 1)
Figure: Number of vsiRNA (21 nt) reads obtained from naïve S2 cells infected by the supernatant of S2R+ cells that align against NCBI sequences and those reconstructed by Paparazzi
• Y axis: No. of reads aligned (0 to 800,000)
• X axis: ANV RNA1, ANV RNA2, DCV, DXV segment A, DXV segment B
• Series: reads aligned on NCBI sequences with 0 mismatches; reads aligned on NCBI sequences with 1 mismatch; reads aligned on reconstructed sequences with 0 mismatches
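The comparison in the graph, reads aligning with 0 or 1 mismatch against a published versus a reconstructed sequence, can be mimicked on toy data. The sketch below is a hypothetical illustration with 6-nt reads instead of 21-nt vsiRNAs. It uses brute-force ungapped matching on both strands, and it tallies each read by its best mismatch count, which is one possible interpretation of the series labels.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA-alphabet sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def best_mismatches(read, ref):
    """Smallest ungapped mismatch count of read on either strand of ref."""
    best = len(read)
    for query in (read, revcomp(read)):
        for pos in range(len(ref) - len(query) + 1):
            window = ref[pos:pos + len(query)]
            mm = sum(1 for a, b in zip(query, window) if a != b)
            best = min(best, mm)
    return best


def tally(reads, ref, max_mismatches=1):
    """Count reads by best mismatch count, ignoring reads above the cutoff."""
    counts = Counter()
    for read in reads:
        mm = best_mismatches(read, ref)
        if mm <= max_mismatches:
            counts[mm] += 1
    return counts


# Toy sequences: the reconstructed one differs from the published one
# by a single substitution.
published = "ACGTACGGTCATTGCA"
reconstructed = "ACGTACGATCATTGCA"
reads = ["ACGTAC", "CGATCA", "GCAATG"]  # the last read is antisense

print(dict(tally(reads, published)))      # {0: 2, 1: 1}
print(dict(tally(reads, reconstructed)))  # {0: 3}
```

As in the slide's graph, reads that need a mismatch to align on the published sequence align perfectly on the reconstructed one.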
Conclusion
• Paparazzi provides an effective tool for
• viral genome reconstruction
• accurate vsiRNA profiling
• studying RNAi processing
• Powerful tool to polish the results obtained from virus discovery pipelines.
14
Critique
• This approach cannot work well when some regions of the genome are less covered than others (cold spots)
• Observed in the profile of the Semliki Forest virus (SFV)
• Insensitive to Dicer processing
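Cold spots of this kind can be located from per-position read depth before attempting a reconstruction. The sketch below is a minimal illustration under stated assumptions: read alignments are given as hypothetical (start, length) pairs in 0-based coordinates, and a cold spot is defined simply as a run of positions with depth below `min_cov`.

```python
def coverage(read_hits, genome_length):
    """Per-position read depth from (start, length) alignments."""
    depth = [0] * genome_length
    for start, length in read_hits:
        for pos in range(start, min(start + length, genome_length)):
            depth[pos] += 1
    return depth


def cold_spots(depth, min_cov=1, min_len=1):
    """Half-open (start, end) intervals where depth stays below min_cov."""
    spots, run_start = [], None
    for pos, d in enumerate(depth):
        if d < min_cov:
            if run_start is None:
                run_start = pos
        elif run_start is not None:
            if pos - run_start >= min_len:
                spots.append((run_start, pos))
            run_start = None
    # Close a run that extends to the end of the genome.
    if run_start is not None and len(depth) - run_start >= min_len:
        spots.append((run_start, len(depth)))
    return spots


# Toy genome of 30 nt with reads clustered at both ends.
hits = [(0, 10), (5, 10), (22, 8)]
depth = coverage(hits, 30)
print(cold_spots(depth))             # [(15, 22)]
print(cold_spots(depth, min_cov=2))  # [(0, 5), (10, 30)]
```

Positions inside a reported interval have too few reads for a consensus call, so a reconstruction there would fall back on the scaffold or stay unresolved.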
15