Detecting Copy Number Variation With Short Paired Reads
Department of Computer Science, University of Toronto. Genome Informatics 2009. Paul Medvedev, Marc Fiume, Misko Dzamba, Tim Smith, Adrian Dalca, Mike Brudno.
![Page 1: Detecting Copy Number Variation With Short Paired Reads](https://reader033.vdocuments.site/reader033/viewer/2022051218/56815828550346895dc58deb/html5/thumbnails/1.jpg)
Detecting Copy Number Variation With
Short Paired Reads
Department of Computer Science University of Toronto
Genome Informatics 2009
Paul Medvedev, Marc Fiume, Misko Dzamba,
Tim Smith, Adrian Dalca, Mike Brudno
![Page 2: Detecting Copy Number Variation With Short Paired Reads](https://reader033.vdocuments.site/reader033/viewer/2022051218/56815828550346895dc58deb/html5/thumbnails/2.jpg)
Copy Number Variants (CNVs)
• Large regions that appear a different number of times in different individuals
• CNVs are associated with a number of diseases
• Input
  – reference human genome
  – sequenced donor genome
• Output
  – CNV annotations in the reference
![Page 3: Detecting Copy Number Variation With Short Paired Reads](https://reader033.vdocuments.site/reader033/viewer/2022051218/56815828550346895dc58deb/html5/thumbnails/3.jpg)
Previous Approach
Using depth of coverage (Campbell et al. 2008; Chiang et al. 2009; Yoon et al. 2009):
[Figure: reference segments annotated with Ref: 1 1 1 1 1, coverage values 0.8 2.3 0.5 0.5 1.7, and DOC: 1 2 1 0 1; CNV regions are marked.]
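The depth-of-coverage idea can be illustrated with a minimal sketch. It assumes fixed windows, a known expected read count per window for a single copy, and simple nearest-integer rounding; the function names and the example numbers are hypothetical and are not taken from the cited methods, which use more careful statistics.

```python
import math

def normalized_doc(read_counts, expected_per_copy):
    """Observed reads per window divided by the reads expected for one copy."""
    return [count / expected_per_copy for count in read_counts]

def call_copy_counts(doc_ratios):
    """Round each ratio to the nearest non-negative integer copy count."""
    return [max(0, math.floor(ratio + 0.5)) for ratio in doc_ratios]

def label_windows(copy_counts, ref_copies=1):
    """Label each window relative to the reference copy count (assumed 1)."""
    labels = []
    for copies in copy_counts:
        if copies > ref_copies:
            labels.append("gain")
        elif copies < ref_copies:
            labels.append("loss")
        else:
            labels.append("normal")
    return labels

# Hypothetical read counts for four windows, 100 reads expected per copy.
ratios = normalized_doc([90, 210, 10, 120], expected_per_copy=100)
copies = call_copy_counts(ratios)
print(copies, label_windows(copies))
```

Each window is called independently here, which is why a purely depth-based method needs large windows to be reliable.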
Our Approach:
• Capture adjacency information about the donor genome in a graph.
• Use these adjacencies together with DOC
![Page 4: Detecting Copy Number Variation With Short Paired Reads](https://reader033.vdocuments.site/reader033/viewer/2022051218/56815828550346895dc58deb/html5/thumbnails/4.jpg)
Donor Graph
Step 1: represent reference adjacencies
![Page 5: Detecting Copy Number Variation With Short Paired Reads](https://reader033.vdocuments.site/reader033/viewer/2022051218/56815828550346895dc58deb/html5/thumbnails/5.jpg)
Donor Graph
Step 1: represent reference adjacencies
![Page 6: Detecting Copy Number Variation With Short Paired Reads](https://reader033.vdocuments.site/reader033/viewer/2022051218/56815828550346895dc58deb/html5/thumbnails/6.jpg)
Donor Graph
Ref
Donor
Step 2: represent donor adjacencies
![Page 7: Detecting Copy Number Variation With Short Paired Reads](https://reader033.vdocuments.site/reader033/viewer/2022051218/56815828550346895dc58deb/html5/thumbnails/7.jpg)
Donor Graph
Ref
Donor
Step 2: represent donor adjacencies
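The two construction steps can be sketched as a small directed graph over reference segments. This is a simplified illustration, not the authors' data structure: it assumes the reference is already cut into numbered segments, ignores strand and orientation, and takes donor adjacencies as a ready-made list of (source, destination) segment pairs, whereas in practice they would be inferred from discordant paired reads. All names are hypothetical.

```python
from collections import defaultdict

def build_donor_graph(num_segments, donor_adjacencies):
    """Map each segment to the set of (next_segment, edge_kind) it can reach."""
    graph = defaultdict(set)
    # Step 1: reference adjacencies connect consecutive segments.
    for i in range(num_segments - 1):
        graph[i].add((i + 1, "ref"))
    # Step 2: donor adjacencies add edges that are absent from the reference.
    for src, dst in donor_adjacencies:
        graph[src].add((dst, "donor"))
    return graph

def is_walk(graph, walk):
    """True if every consecutive pair of segments in the walk is an edge."""
    return all(
        any(dst == b for dst, _ in graph[a])
        for a, b in zip(walk, walk[1:])
    )

def traversal_counts(walk, num_segments):
    """How many times the walk visits each segment (its copy count)."""
    counts = [0] * num_segments
    for segment in walk:
        counts[segment] += 1
    return counts

# Four segments; one donor adjacency from the end of segment 1 back to its
# start, which is how a tandem duplication of segment 1 would appear.
graph = build_donor_graph(4, [(1, 1)])
print(is_walk(graph, [0, 1, 1, 2, 3]), traversal_counts([0, 1, 1, 2, 3], 4))
```

A walk through this graph spells out one candidate donor genome, and its traversal counts are the copy numbers that walk implies.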
![Page 8: Detecting Copy Number Variation With Short Paired Reads](https://reader033.vdocuments.site/reader033/viewer/2022051218/56815828550346895dc58deb/html5/thumbnails/8.jpg)
Which walk is the donor?
Use depth-of-coverage:
• We find a path that is “most faithful” to the DOC
  – using a probabilistic model to score “faithfulness”
  – using network flow to find the traversal counts of the walk with the maximum score
[Figure: donor graph annotated with DOC values 0.8 2.3 2.6 0.5 1.4 1.7 1.1, Ref: 1 2 1 1 1 1 1, and Path: 1 1 1 1 2 2 1; the CNV is marked.]
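One way to picture the "faithfulness" score is the sketch below. It makes two assumptions that the slides do not state: read counts per segment are modeled as Poisson with a mean proportional to the traversal count, and the best candidate is chosen by plain enumeration instead of the network flow formulation used in the talk. The function names, the `error_rate` parameter, and the example numbers are hypothetical.

```python
import math

def poisson_log_prob(k, lam):
    """Log probability of observing k events under a Poisson with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def score_counts(counts, read_counts, reads_per_copy, error_rate=0.01):
    """Sum of per-segment log-likelihoods for a vector of traversal counts."""
    total = 0.0
    for copies, reads in zip(counts, read_counts):
        # A small floor keeps the mean positive when a segment is deleted.
        lam = max(copies, error_rate) * reads_per_copy
        total += poisson_log_prob(reads, lam)
    return total

def most_faithful(candidates, read_counts, reads_per_copy):
    """Pick the candidate traversal counts with the highest score."""
    return max(
        candidates,
        key=lambda counts: score_counts(counts, read_counts, reads_per_copy),
    )

# Hypothetical read counts for four segments, about 100 reads per copy.
reads = [98, 205, 101, 4]
candidates = [[1, 1, 1, 1], [1, 2, 1, 1], [1, 2, 1, 0]]
print(most_faithful(candidates, reads, reads_per_copy=100))
```

Enumeration only works for a handful of candidates. A flow formulation searches over all traversal counts consistent with the graph at once, which is what makes the approach a global optimization.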
![Page 9: Detecting Copy Number Variation With Short Paired Reads](https://reader033.vdocuments.site/reader033/viewer/2022051218/56815828550346895dc58deb/html5/thumbnails/9.jpg)
Preliminary Results
• NA18507 individual sampled with Illumina, hg18 reference
• Total of 3730 CNV calls
• 2165 losses, 1565 gains
[Figure: Size Distribution]
![Page 10: Detecting Copy Number Variation With Short Paired Reads](https://reader033.vdocuments.site/reader033/viewer/2022051218/56815828550346895dc58deb/html5/thumbnails/10.jpg)
Preliminary Results
Sensitivity: Kidd et al.’s (2008) LOSS calls (141 calls)
[Two pie charts: “Percentage of Kidd’s calls that overlap one of ours” and “After randomly shuffling our calls”. Legend: Just Loss, Both, Just Gain, None. Slice labels: 58%, 6%, 1%, 35% and 88%, 6%, 0%, 6%.]
Specificity: Database of Genomic Variants (DGV)
[Two pie charts: “Percent of our calls that overlap with DGV” and “After randomly shuffling our calls”. Legend: DGV Loss, DGV Both, DGV Gain, None. Slice labels: 11%, 68%, 9%, 12% and 11%, 7%, 12%, 70%.]
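The overlap comparison and its shuffled baseline can be sketched as follows. This is an illustrative simplification: intervals are half-open `(start, end)` pairs on a single chromosome of known length, any shared base counts as an overlap, and gain/loss types are ignored. The function names and coordinates are hypothetical, and the overlap criterion actually used for the reported percentages is not given in the slides.

```python
import random

def overlaps(a, b):
    """True if half-open intervals a and b share at least one base."""
    return a[0] < b[1] and b[0] < a[1]

def fraction_overlapping(queries, targets):
    """Fraction of query intervals that overlap at least one target."""
    if not queries:
        return 0.0
    hits = sum(1 for q in queries if any(overlaps(q, t) for t in targets))
    return hits / len(queries)

def shuffle_calls(calls, genome_length, rng):
    """Move each call to a uniformly random position, keeping its length."""
    shuffled = []
    for start, end in calls:
        length = end - start
        new_start = rng.randrange(0, genome_length - length + 1)
        shuffled.append((new_start, new_start + length))
    return shuffled

our_calls = [(100, 200), (500, 600), (900, 950)]
known_calls = [(150, 250), (580, 700)]
rng = random.Random(0)  # fixed seed so the baseline is repeatable

print(fraction_overlapping(known_calls, our_calls))  # sensitivity-style
print(fraction_overlapping(our_calls, known_calls))  # specificity-style
print(fraction_overlapping(shuffle_calls(our_calls, 10_000, rng), known_calls))
```

Comparing the real overlap with the shuffled overlap indicates how much agreement would be expected by chance alone.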
![Page 11: Detecting Copy Number Variation With Short Paired Reads](https://reader033.vdocuments.site/reader033/viewer/2022051218/56815828550346895dc58deb/html5/thumbnails/11.jpg)
Conclusion
• Presented a method for detecting CNVs
• Combines
  – depth-of-coverage
  – paired-end mapping
• Improves
  – compared to paired-end mapping: increased sensitivity in repeating regions (segmental duplications)
  – compared to depth-of-coverage methods: better resolution (1Kb vs. 30Kb)
• Global optimization approach
![Page 12: Detecting Copy Number Variation With Short Paired Reads](https://reader033.vdocuments.site/reader033/viewer/2022051218/56815828550346895dc58deb/html5/thumbnails/12.jpg)
Detecting Copy Number Variation
Paul Medvedev
Marc Fiume
Misko Dzamba
Tim Smith
Adrian Dalca
Mike Brudno
Genome Informatics 2009