In silico reconstruction of viral genomes from small RNAs improves virus-derived small interfering...

Post on 10-May-2015


Raunak Shrestha

10th Nov. 2011

Source: Vodovar N, Goic B, Blanc H, Saleh MC. In silico reconstruction of viral genomes from small RNAs improves virus-derived small interfering RNA profiling. J Virol. 2011 Nov;85(21):11016-21

RNA interference (RNAi) pathway

Source: http://upload.wikimedia.org/wikipedia/commons/thumb/e/e4/ShRNA_Lentivirus.svg/473px-ShRNA_Lentivirus.svg.png

• 21-nucleotide-long virus-derived small interfering RNAs (vsiRNAs) are derived from cleavage of viral double-stranded RNA (dsRNA)

• vsiRNAs contain the viral genomic imprint

• This property can be used to reconstruct the viral genome from vsiRNAs


Method

• Model organism for study: Drosophila melanogaster


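Workflow (from the slide diagram):

• Naïve D. melanogaster S2 cells (tissue culture) infected with supernatant of infected S2R+ cells

• Total RNA extracted

• Deep sequencing of RNA (Illumina)

• RNA reads → genome assembly → contigs

• BLAST of contigs against the NCBI nucleotide database

• RNA sequencing (Sanger method)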

BLAST results

• Viral contigs aligned to different regions of these viruses:

1. Flock house virus (FHV) variant American nodavirus (ANV)

2. Drosophila C virus (DCV)

3. Drosophila X virus (DXV)

• However, the contigs obtained differed significantly from the NCBI sequences

• Possibility: the viruses present in these supernatants were different from those previously published

• These “new” viral sequences were labeled as:

1. FHVS2R+

2. DCVS2R+

3. DXVS2R+


• The paper introduces a Perl script, “Paparazzi”

• It accurately reconstitutes viral genomes through an iterative alignment/consensus call procedure, using an initial reference sequence as a scaffold

• The resulting full-length consensus sequence is then reused to profile vsiRNAs
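The iterative alignment/consensus call idea above can be illustrated with a toy sketch. This is not the Paparazzi script itself (which is written in Perl and relies on external tools); the brute-force ungapped matcher, the function names, and the thresholds `max_mismatches`, `min_cov` and `min_frac` are all assumptions made for illustration.

```python
from collections import Counter


def best_hit(read, ref, max_mismatches):
    """Return the ungapped offset with the fewest mismatches, or None.

    Brute-force stand-in for a real short-read aligner (assumption).
    """
    best = None
    for off in range(len(ref) - len(read) + 1):
        mm = sum(a != b for a, b in zip(read, ref[off:off + len(read)]))
        if mm <= max_mismatches and (best is None or mm < best[0]):
            best = (mm, off)
    return None if best is None else best[1]


def call_consensus(reads, ref, max_mismatches=1, min_cov=2, min_frac=0.7):
    """Pile reads onto ref and call a majority base at each position.

    Positions that fail the (hypothetical) coverage or agreement
    criteria keep the reference base and are reported as unresolved.
    """
    piles = [Counter() for _ in ref]
    for read in reads:
        off = best_hit(read, ref, max_mismatches)
        if off is None:
            continue
        for i, base in enumerate(read):
            piles[off + i][base] += 1

    consensus, unresolved = [], []
    for pos, pile in enumerate(piles):
        cov = sum(pile.values())
        if cov >= min_cov:
            base, count = pile.most_common(1)[0]
            if count / cov >= min_frac:
                consensus.append(base)
                continue
        consensus.append(ref[pos])
        unresolved.append(pos)
    return "".join(consensus), unresolved


def reconstruct(reads, scaffold, max_iter=10, **thresholds):
    """Repeat align + consensus until the sequence stops changing."""
    ref = scaffold
    unresolved = list(range(len(scaffold)))
    for _ in range(max_iter):
        new_ref, unresolved = call_consensus(reads, ref, **thresholds)
        if new_ref == ref:
            break
        ref = new_ref
    return ref, unresolved


if __name__ == "__main__":
    true_genome = "ACGTTAGCCATG"   # toy "real" genome
    scaffold = "ACGTTCGCCATG"      # toy reference with one difference
    reads = [true_genome[i:i + 6] for i in range(7)]
    print(reconstruct(reads * 2, scaffold))
```

Keeping the previous base at unresolved positions is a design choice of this sketch; it lets the next cycle still align reads across those positions while the caller can see which positions never met the consensus criteria.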


How Paparazzi works

(Workflow figure not reproduced; tools labeled in the figure: Bowtie and SSAKE)

• Paparazzi reconstructed the 3,107-nucleotide-long genome sequence of FHVS2R+ RNA1

• But the first four nucleotides were unresolved (i.e., the criteria to call the consensus were not met)

• Also, Paparazzi reconstructed the same sequence for FHVS2R+ RNA1 when FHVNCBI was taken as the reference sequence

• The reconstructed genome differed from FHVNCBI by 9.14%

• So Paparazzi can also be used when a distantly related sequence is taken as the reference


• Compared the Paparazzi-reconstructed genome with the Sanger sequencing-determined sequence of FHVS2R+ RNA1

• Difference of only a single nucleotide (0.03%)

• In contrast, the sequences of ANVNCBI and FHVS2R+ RNA1 differ by 2.83% over 3,107 nucleotides

• Paparazzi can be used as a substitute for Sanger sequencing in this context
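The percentages quoted above follow from simple position-by-position counting. As a rough sketch (assuming the two sequences are already aligned and of equal length, with no insertions or deletions; the function name is hypothetical), one mismatch over 3,107 nucleotides gives about 0.03%, and 2.83% of 3,107 corresponds to roughly 88 positions.

```python
def percent_difference(seq_a, seq_b):
    """Count differing positions between two pre-aligned, equal-length sequences.

    Returns (number_of_differences, percent_of_positions_that_differ).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    diffs = sum(a != b for a, b in zip(seq_a, seq_b))
    return diffs, 100.0 * diffs / len(seq_a)


if __name__ == "__main__":
    genome_length = 3107
    # One differing nucleotide out of 3,107, as a percentage.
    print(round(100.0 * 1 / genome_length, 2))
    # Approximate number of positions implied by a 2.83% difference.
    print(round(0.0283 * genome_length))
    # Toy example on short sequences.
    print(percent_difference("ACGT", "ACGA"))
```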


Genome breakpoint identification

• FHVS2R+ RNA2

• 64 differences compared with the Sanger-determined sequence

• introduced premature stop codons

• introduced two artifactual internal duplications (absent in Sanger sequences)

• the reconstructed sequence did not represent the actual functional genome

• The RNA2 segment of FHV is prone to internal deletions

• These give rise to defective interfering particles (DIs)

• Existence of DIs confirmed by PCR and Sanger sequence data


Genome breakpoint identification

• Comparison of the Sanger sequence obtained for DIs with that of ANVNCBI RNA2

• Identification of three breakpoints

• All differences observed between the Paparazzi-reconstructed and Sanger sequencing-determined FHVS2R+ RNA2 sequences resided in the regions covered by DIs or around DI breakpoints


Genome breakpoint identification

• Paparazzi was instructed to

• identify breakpoints in viral genomes

• eliminate reads aligning against these breakpoints


• At each cycle, the contigs obtained were aligned against the generated virus consensus sequence using BLAT

• This allowed the identification of the breakpoints

• Paparazzi successfully reconstructed the genomes of the other detected viruses, DCVS2R+ and DXVS2R+
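The breakpoint step described above (align contigs to the consensus, find where a contig jumps across the reference, then discard reads at those junctions) might look roughly like the sketch below. It does not run BLAT; the `Block` tuples are a hypothetical stand-in for parsed alignment blocks, and `min_jump`, the 21-nt read length default, and the read-filtering rule are assumptions, not details confirmed by the source.

```python
from typing import NamedTuple


class Block(NamedTuple):
    """One ungapped aligned block of a contig (hypothetical parsed form)."""
    query_start: int  # start on the contig
    ref_start: int    # start on the consensus/reference
    length: int


def find_breakpoints(blocks, min_jump=10):
    """Return (left, right) reference coordinates of internal deletions.

    A breakpoint is reported where two consecutive blocks of the same
    contig skip at least min_jump reference bases.
    """
    ordered = sorted(blocks, key=lambda b: b.query_start)
    breakpoints = []
    for left, right in zip(ordered, ordered[1:]):
        ref_end = left.ref_start + left.length
        if right.ref_start - ref_end >= min_jump:
            breakpoints.append((ref_end, right.ref_start))
    return breakpoints


def drop_reads_at_breakpoints(read_starts, breakpoints, read_len=21):
    """Keep reads whose reference span does not cross a breakpoint coordinate."""
    coords = [c for bp in breakpoints for c in bp]
    return [
        s for s in read_starts
        if not any(s < c < s + read_len for c in coords)
    ]


if __name__ == "__main__":
    contig_blocks = [Block(0, 100, 50), Block(50, 400, 60)]
    bps = find_breakpoints(contig_blocks)
    print(bps)
    print(drop_reads_at_breakpoints([100, 140, 150, 390, 400], bps))
```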


(Graph generated from Supplementary Table 1)

Number of vsiRNA (21 nt) reads obtained from naïve S2 cells infected by the supernatant of S2R+ cells that align against NCBI sequences and against those reconstructed by Paparazzi

Series: reads aligned on NCBI sequences with 0 mismatches; reads aligned on NCBI sequences with 1 mismatch; reads aligned on reconstructed sequences with 0 mismatches

X axis: ANV RNA1, ANV RNA2, DCV, DXV segment A, DXV segment B

Y axis: number of reads aligned (0 to 800,000)
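The comparison in this graph comes down to counting how many reads match a given reference with at most 0 or 1 mismatches. A minimal sketch of that counting follows, assuming ungapped matching on both strands and brute-force scanning instead of a real aligner; the function names and the toy sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq[::-1].translate(_COMPLEMENT)


def min_mismatches(read, ref):
    """Fewest mismatches of read against ref over all offsets and both strands.

    Returns None when the read is longer than the reference.
    """
    best = None
    for strand in (ref, revcomp(ref)):
        for off in range(len(strand) - len(read) + 1):
            window = strand[off:off + len(read)]
            mm = sum(a != b for a, b in zip(read, window))
            if best is None or mm < best:
                best = mm
    return best


def count_aligned(reads, ref, max_mismatches):
    """Number of reads aligning to ref with at most max_mismatches."""
    count = 0
    for read in reads:
        mm = min_mismatches(read, ref)
        if mm is not None and mm <= max_mismatches:
            count += 1
    return count


if __name__ == "__main__":
    reference = "ACGTTAGCCATG"
    toy_reads = ["GTTAGC", "GTTCGC", "CATGGC", "TTTTTT"]
    for allowed in (0, 1):
        print(allowed, count_aligned(toy_reads, reference, allowed))
```

Running the same count against a published reference and against a reconstructed sequence gives the kind of per-virus comparison summarized in the graph.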

Conclusion

• Paparazzi provides an effective tool for

• viral genome reconstruction

• accurate vsiRNA profiling

• studying RNAi processing

• Powerful tool to polish the results obtained from virus discovery pipelines.


Critique

• This approach cannot work well when some regions of the genome are less covered than others (cold spots)

• observed in the profile of the Semliki forest virus (SFV)

• insensitive to Dicer processing
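Cold spots of the kind mentioned in this critique can be located by computing per-position read coverage and reporting stretches that fall below a threshold. The sketch below assumes reads are given as 0-based start positions on the genome; the function names and the `min_cov` threshold are hypothetical.

```python
def coverage(read_starts, read_len, genome_len):
    """Per-position read depth for reads of fixed length."""
    depth = [0] * genome_len
    for start in read_starts:
        for pos in range(max(start, 0), min(start + read_len, genome_len)):
            depth[pos] += 1
    return depth


def cold_spots(depth, min_cov=1):
    """Half-open (start, end) intervals where depth is below min_cov."""
    spots, start = [], None
    for pos, d in enumerate(depth):
        if d < min_cov:
            if start is None:
                start = pos
        elif start is not None:
            spots.append((start, pos))
            start = None
    if start is not None:
        spots.append((start, len(depth)))
    return spots


if __name__ == "__main__":
    depth = coverage([0, 2, 10], read_len=3, genome_len=15)
    print(depth)
    print(cold_spots(depth))
```

Positions inside such intervals have too few reads to support a consensus call, which is why a reconstruction based on vsiRNAs alone would leave them unresolved.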
