Sequencing Technologies and Human Genetic Variation

By Alfonso Farrugio, Hieu Nguyen, and Antony Vydrin

Post on 22-Feb-2016

Page 1: Sequencing Technologies and Human Genetic Variation

By Alfonso Farrugio, Hieu Nguyen, and Antony Vydrin

Sequencing Technologies and Human Genetic Variation

Page 2: Sequencing Technologies and Human Genetic Variation

Overview

Introduction

Simulating genomic variation and sequencing

Analyzing and comparing different sequencing technologies

Algorithms for detecting human genetic variation

Page 3: Sequencing Technologies and Human Genetic Variation

Introduction

Different people have different mutations in their genomes

A recent study (Nature 453, 56-64, 1 May 2008) compared 8 human genomes and found 1,695 structural variants

Page 4: Sequencing Technologies and Human Genetic Variation

Whole-genome shotgun sequencing allows for fast and relatively cheap sequencing of human genomes

New technologies are being developed to allow for accurate detection of human genomic variation

Most of these technologies use short paired reads.

How long should the reads be to optimize the detection of human genomic variation?

What algorithms can be used to detect variations in a new individual’s genome?

Page 5: Sequencing Technologies and Human Genetic Variation

Simulating Genomic Variation

A program takes a human genome and adds randomly distributed inversions, insertions, deletions, and SNPs

The number of mutations (and their mean lengths) can be controlled by the user

To simplify, no two mutations can overlap each other (the SNPs are an exception)
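The placement rule above can be illustrated with a minimal Python sketch. It is not the authors' program: the function name `mutate`, the exponential length distribution, the rejection-sampling retry cap, and the choice to model an inversion as a reverse complement are all assumptions made for illustration. SNPs are left out here because they are exempt from the no-overlap rule.

```python
import random

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def mutate(genome, counts, mean_len, seed=0, max_tries=1000):
    """Add non-overlapping inversions, insertions, and deletions.

    counts and mean_len are dicts keyed by mutation kind (hypothetical API).
    Returns the mutated genome and the sorted list of events.
    """
    rng = random.Random(seed)
    events = []  # (start, end, kind, length), half-open on the original genome
    for kind in ("inversion", "insertion", "deletion"):
        for _ in range(counts.get(kind, 0)):
            # Assumption: lengths are exponentially distributed around the mean.
            length = max(1, round(rng.expovariate(1.0 / mean_len[kind])))
            # An insertion only occupies its one-base anchor on the original.
            footprint = 1 if kind == "insertion" else min(length, len(genome))
            if kind != "insertion":
                length = footprint
            for _attempt in range(max_tries):
                start = rng.randrange(len(genome) - footprint + 1)
                end = start + footprint
                # Accept the position only if it overlaps no earlier event.
                if all(end <= s or start >= e for s, e, _k, _n in events):
                    events.append((start, end, kind, length))
                    break
            else:
                raise RuntimeError("no free position found for " + kind)

    out = genome
    # Apply from right to left so earlier coordinates stay valid.
    for start, end, kind, length in sorted(events, reverse=True):
        segment = out[start:end]
        if kind == "inversion":
            new = segment.translate(COMPLEMENT)[::-1]
        elif kind == "deletion":
            new = ""
        else:  # insertion: keep the anchor base, append random bases after it
            new = segment + "".join(rng.choice(BASES) for _ in range(length))
        out = out[:start] + new + out[end:]
    return out, sorted(events)
```

Rejection sampling keeps the sketch short; for a dense set of mutations on a real chromosome, a placement scheme that tracks free intervals would likely be preferable.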

Page 6: Sequencing Technologies and Human Genetic Variation

[Diagram: original genome -> inversions, insertions, deletions -> “intermediate” mutated genome]

Page 7: Sequencing Technologies and Human Genetic Variation

[Diagram: “intermediate” mutated genome -> subtract deletions -> “intermediate” mutated genome]

Page 8: Sequencing Technologies and Human Genetic Variation

[Diagram: “intermediate” mutated genome -> SNPs -> output mutated genome]
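The final SNP step can be sketched as follows. This is a minimal illustration, not the authors' code: the name `add_snps` and the uniform choice of a replacement base are assumptions. Because SNPs are exempt from the no-overlap rule, positions are sampled from the whole intermediate genome without checking earlier mutations.

```python
import random


def add_snps(genome, num_snps, seed=0):
    """Substitute num_snps distinct positions with a different base."""
    rng = random.Random(seed)
    seq = list(genome)
    # Distinct positions, chosen uniformly over the whole sequence.
    positions = rng.sample(range(len(seq)), num_snps)
    for pos in positions:
        # Assumption: each alternative base is equally likely.
        alternatives = [b for b in "ACGT" if b != seq[pos]]
        seq[pos] = rng.choice(alternatives)
    return "".join(seq), sorted(positions)
```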

Page 9: Sequencing Technologies and Human Genetic Variation

Simulating Genomic Sequencing

A program takes a human genome and creates paired reads, writing the read pairs to a file

All reads have the same length; the separation between the two reads in a pair is drawn at random from a normal distribution

The program can simulate sequencing errors when creating the paired reads

Page 10: Sequencing Technologies and Human Genetic Variation

Simulating Genomic Sequencing

The user can control the total number of reads, the read length, the mean read separation, and the sequencing error rate
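A minimal Python sketch of such a paired-read simulator is given below. It is an illustration under stated assumptions, not the authors' implementation: the function name `simulate_read_pairs`, the standard-deviation parameter `sd_sep` (the slides name only the mean), the `+`/`-` strand encoding, and the use of a reverse complement for a reversed read are all hypothetical. Sequencing errors are not applied here.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def simulate_read_pairs(genome, num_pairs, read_len, mean_sep, sd_sep, seed=0):
    """Return a list of (start, sep, read1, strand1, read2, strand2)."""
    if 2 * read_len + mean_sep > len(genome):
        raise ValueError("genome too short for the requested pair geometry")
    rng = random.Random(seed)
    pairs = []
    while len(pairs) < num_pairs:
        # Separation d is normally distributed; read length L is constant.
        sep = max(0, round(rng.gauss(mean_sep, sd_sep)))
        span = 2 * read_len + sep
        if span > len(genome):
            continue  # redraw a separation that fits inside the genome
        # Uniformly distributed start location for the pair.
        start = rng.randrange(len(genome) - span + 1)
        reads = [
            genome[start:start + read_len],
            genome[start + read_len + sep:start + span],
        ]
        strands = []
        for i in range(2):
            # Each read gets its own random direction.
            if rng.random() < 0.5:
                reads[i] = reverse_complement(reads[i])
                strands.append("-")
            else:
                strands.append("+")
        pairs.append((start, sep, reads[0], strands[0], reads[1], strands[1]))
    return pairs
```

Each tuple could then be written as one line of the output file; the file format is not described in the slides, so it is left out of the sketch.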

Page 11: Sequencing Technologies and Human Genetic Variation

Genome to be sequenced

Choose uniformly distributed random locations

Page 12: Sequencing Technologies and Human Genetic Variation

Genome to be sequenced

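Create a read pair at each location. Choose a random direction for each read

[Diagram: a read pair made of two reads of length L separated by a distance d1, with the read direction marked on each read]

L is a constant while d is random (normally distributed)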

Page 13: Sequencing Technologies and Human Genetic Variation

[Diagram: two more read pairs, each made of two reads of length L with marked read directions, separated by distances d2 and d3]

Page 14: Sequencing Technologies and Human Genetic Variation

Resulting paired reads

[Diagram: three read pairs, each with reads of length L, separated by d1, d2, and d3]

Page 15: Sequencing Technologies and Human Genetic Variation

Paired reads with simulated sequencing errors

[Diagram: the same three read pairs (separations d1, d2, d3) with sequencing errors added]
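One simple way to simulate sequencing errors on a read is sketched below. The slides do not describe the error model, so this assumes independent substitution errors at a fixed per-base rate; the function name `add_sequencing_errors` is hypothetical, and insertion or deletion errors are not modeled.

```python
import random


def add_sequencing_errors(read, error_rate, rng):
    """Replace each base with a different one with probability error_rate."""
    out = []
    for base in read:
        if rng.random() < error_rate:
            # Substitute with one of the other bases, chosen uniformly.
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)
```

Passing a shared `random.Random` instance keeps a whole simulation run reproducible from a single seed.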

Page 16: Sequencing Technologies and Human Genetic Variation

[Figure: program runtime ~ window size]

Page 19: Sequencing Technologies and Human Genetic Variation

[Figure: panels for 50, 100, and 500 insertions]
