CHALLENGE ANALYSIS THE EFFECT OF NEXT- GENERATION SEQUENCING TECHNOLOGY ON COMPLEX TRAIT RESEARCH Ladang Auxane Mombaerts Laurent Uyttendaele Vincent Presented by December 10, 2013 University of Liège GBIO0009-1 : Krystel Van Steen, Kyrill Bessonov 1


CHALLENGE ANALYSIS

THE EFFECT OF NEXT-GENERATION SEQUENCING TECHNOLOGY ON COMPLEX TRAIT RESEARCH

Presented by

Ladang Auxane, Mombaerts Laurent, Uyttendaele Vincent

December 10, 2013

University of Liège GBIO0009-1 : Krystel Van Steen, Kyrill Bessonov


TABLE OF CONTENTS

1. Introduction
2. Challenge analysis
   1. Optimizing parameters for study design
   2. Storing and handling data
   3. Mapping and aligning
   4. Variant calling
   5. Analyzing low frequency and rare variants
3. Applications
4. Discussion
5. Conclusion


INTRODUCTION

What is Next-Generation Sequencing (NGS)?
• High throughput
• Low cost

Applications

From the 1970s (F. Sanger) until now

CHALLENGE ANALYSIS

1. Optimizing parameters for study design

Three main parameters:

• High cost-to-data ratio. Sequence only parts of the genome?

• Power depends on depth of coverage.

• Sample selection.
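The link between sequencing effort and depth of coverage can be sketched with the standard expected-coverage relation C = N × L / G (N reads of length L on a genome of size G). The block below is a minimal sketch, not the method of the study under review: the Poisson model for per-base depth and the numbers in the example design are assumptions chosen for illustration.

```python
import math


def mean_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Expected depth of coverage C = N * L / G."""
    return n_reads * read_length / genome_size


def prob_covered_at_least(k: int, coverage: float) -> float:
    """P(a base is covered by at least k reads), assuming Poisson-distributed depth."""
    p_less = sum(
        math.exp(-coverage) * coverage**i / math.factorial(i) for i in range(k)
    )
    return 1.0 - p_less


# Hypothetical design: 600 million reads of 100 bp on a 3 Gb genome.
c = mean_coverage(600_000_000, 100, 3_000_000_000)
print(f"mean coverage: {c:.1f}x")
print(f"P(depth >= 10): {prob_covered_at_least(10, c):.3f}")
```

Halving the number of reads halves the mean coverage, which is why sequencing only part of the genome, or fewer samples at higher depth, is a design trade-off.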


2. Storing and handling data

Two years ago

• The concept of NGS was still theoretical.

Today

• Devices are operational and affordable → raw data are available.


Production of raw data

• Using a fluorescent-dye DNA sequencer

• Labeling of DNA strands with 4 fluorescent dyes

• Separation of fragments by electrophoresis

• Read-out as a chromatogram

Storage of raw data

• One run can produce up to 4 Tb of data

• → Requires a huge storage capacity

Handling of raw data

• Algorithms will be applied for mapping

• → Requires powerful computing tools
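To give a feeling for the storage requirement, the sketch below estimates the size of an uncompressed read file. It assumes the reads are stored in FASTQ format (four text lines per read), which the slides do not state; the header length and the number of reads are hypothetical values.

```python
def fastq_bytes(n_reads: int, read_length: int, header_length: int = 40) -> int:
    """Approximate uncompressed FASTQ size in bytes.

    Each read takes four lines, each ending with a newline byte:
    the header line (header_length characters, '@' included), the bases,
    a '+' separator, and one quality character per base.
    """
    per_read = (header_length + 1) + (read_length + 1) + 2 + (read_length + 1)
    return n_reads * per_read


# Hypothetical run: one billion reads of 100 bp.
n_bytes = fastq_bytes(n_reads=1_000_000_000, read_length=100)
print(f"{n_bytes / 1e9:.0f} GB uncompressed")
```

The estimate covers the reads only; intermediate alignment files and images produced by the instrument would come on top of it.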


3. Mapping and aligning algorithms

De novo assembly
• Sequencing a genome without the use of a reference genome.
• Reads are assembled by an overlapping method.

Mapping
• Building a sequence that is similar to a reference genome.
• Reads are aligned on the backbone.
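The difference between the two strategies can be illustrated on a toy example. The block below is only a sketch: the greedy overlap merge and the exact-match search are deliberately naive stand-ins for real assemblers and aligners, and the reference and reads are made up.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> str:
    """De novo toy: repeatedly merge the two reads with the largest overlap."""
    contigs = list(reads)
    while len(contigs) > 1:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # no overlap left; only the first contig is returned
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)] + [merged]
    return contigs[0]


def map_read(read: str, reference: str) -> list[int]:
    """Mapping toy: every exact-match position of the read on the reference."""
    positions, start = [], reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)
    return positions


reference = "ACGTTGCAAGGCTA"          # made-up backbone
reads = ["ACGTTGC", "TTGCAAGG", "AAGGCTA"]
print(greedy_assemble(reads))                      # de novo: no reference used
print([map_read(r, reference) for r in reads])     # mapping: reads placed on the backbone
```

De novo assembly compares reads with each other, so its cost grows with the square of the number of reads in this naive form, whereas mapping compares each read with the reference only.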


Improving speed and efficiency of algorithms to deal with large throughput

Detecting non-unique mapping (reads corresponding to different sequences of the reference genome)

Taking into consideration different base qualities (degrees of certainty)

Using a more accurate reference genome (including individual sequences)
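Two of the points above, base qualities and non-unique mapping, can be made concrete with a short sketch. It assumes Phred-scaled qualities (Q = -10 log10 P) encoded as ASCII characters with an offset of 33; the slides do not name an encoding, and the sequences are made up.

```python
def phred_to_error_prob(q: int) -> float:
    """Error probability for a Phred quality score: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def decode_qualities(quality_string: str, offset: int = 33) -> list[int]:
    """Turn an ASCII-encoded quality string into integer Phred scores."""
    return [ord(ch) - offset for ch in quality_string]


def mapping_positions(read: str, reference: str) -> list[int]:
    """All exact-match positions of the read (toy stand-in for an aligner)."""
    positions, start = [], reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)
    return positions


def is_uniquely_mapped(read: str, reference: str) -> bool:
    """True when the read matches exactly one place on the reference."""
    return len(mapping_positions(read, reference)) == 1


print(phred_to_error_prob(30))                   # one error expected in 1000 base calls
print(decode_qualities("II5!"))                  # high, high, medium, lowest quality
print(mapping_positions("ACGT", "ACGTACGTTT"))   # repeated motif: two candidate positions
print(is_uniquely_mapped("GTTT", "ACGTACGTTT"))
```

A read that maps to several positions carries no information about which copy it came from, so many pipelines flag or down-weight such reads rather than pick one position arbitrarily.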


4. Variant calling

Distinguish true variants from sequencing or mapping errors → decrease the number of false positive SNP calls

Detecting misalignment around indels

• Indel in the middle of a read: perfect match on either side → the algorithm opens a gap.

• Indel at one extremity of the read: the indel is hard to recognize → misalignment of the read → false positive SNP call
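Why the position of an indel matters can be sketched with a toy scoring scheme. The match, mismatch and gap-opening scores below are hypothetical, the reference is made up, and real aligners use more elaborate models; the sketch only compares "align without a gap" against "open one gap".

```python
MATCH, MISMATCH, GAP_OPEN = 1, -2, -6   # hypothetical scoring scheme


def ungapped_score(read: str, ref_window: str) -> int:
    """Score of aligning the read base by base, without opening any gap."""
    return sum(MATCH if r == g else MISMATCH for r, g in zip(read, ref_window))


def gapped_score(read: str) -> int:
    """Score when one gap is opened and every read base then matches."""
    return len(read) * MATCH + GAP_OPEN


def delete_base(ref_window: str, pos: int) -> str:
    """Simulate a read carrying a one-base deletion at position pos."""
    return ref_window[:pos] + ref_window[pos + 1:]


ref = "ACGTTGCAAGGCTAGCATCG"
for pos in (10, 18):   # deletion in the middle, then near the end of the read
    read = delete_base(ref, pos)
    print(pos, "ungapped:", ungapped_score(read, ref), "gapped:", gapped_score(read))
```

With the deletion in the middle, the ungapped alignment collects many mismatches and the gap wins. With the deletion near the end, a single mismatch is cheaper than opening a gap, so the read is placed without a gap and the mismatch can be reported as a SNP.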


Considering different error rates depending on the base location

• Nucleobases at the extremities of a read have a higher error rate.
• If misaligned: confident false positive SNP call.
• SOLUTION: algorithms that use a recalibrated “base quality score” and select only the central portion of a read.

Decreasing the number of errors introduced by PCR artefacts

• PCR → non-uniform coverage of the reference genome → over-represented reads
• If misaligned: confident false positive SNP call.
• SOLUTION: paired-end sequencing libraries to discard clonal reads
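Discarding clonal reads with paired-end data can be sketched as follows. The idea assumed here is that two read pairs whose mates start at exactly the same positions are likely PCR copies of one fragment, so only the best-quality pair is kept. The `ReadPair` record and its fields are hypothetical, and real tools also take strand and library into account.

```python
from typing import NamedTuple


class ReadPair(NamedTuple):
    name: str
    chrom: str
    start: int            # leftmost mapped position of the first mate
    mate_start: int       # leftmost mapped position of the second mate
    mean_quality: float


def remove_clonal_pairs(pairs: list[ReadPair]) -> list[ReadPair]:
    """Keep one pair per (chrom, start, mate_start): the one with the best quality."""
    best: dict[tuple[str, int, int], ReadPair] = {}
    for p in pairs:
        key = (p.chrom, p.start, p.mate_start)
        if key not in best or p.mean_quality > best[key].mean_quality:
            best[key] = p
    return list(best.values())


pairs = [
    ReadPair("pair1", "chr1", 100, 350, 31.0),
    ReadPair("pair2", "chr1", 100, 350, 36.5),   # same coordinates as pair1: clonal
    ReadPair("pair3", "chr1", 100, 420, 30.0),   # same start, different mate: kept
]
print([p.name for p in remove_clonal_pairs(pairs)])
```

With single-end reads only the start of one mate is known, so independent fragments that happen to start at the same base would be discarded as well; the second coordinate of a pair makes the test more specific.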


5. Analyzing low frequency and rare variants

May be a painful step!

Single-point analysis

• Low power
• Would require hundreds of thousands of individuals

Across sample sets (composite analysis)

• Somewhat lighter in terms of computing time and data volume
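The low power of single-point tests follows from simple arithmetic: very few individuals carry any given rare allele. The sketch below assumes Hardy-Weinberg proportions and independent variants. Reading "composite analysis" as pooling several rare variants of a region into one test is an interpretation made for this illustration, not something the slides spell out, and the sample size and allele frequency are hypothetical.

```python
def carrier_probability(maf: float, n_variants: int = 1) -> float:
    """P(an individual carries at least one minor allele at any of n_variants sites).

    Assumes Hardy-Weinberg proportions, independent sites and the same
    minor allele frequency (maf) at every site.
    """
    return 1.0 - (1.0 - maf) ** (2 * n_variants)


def expected_carriers(n_individuals: int, maf: float, n_variants: int = 1) -> float:
    """Expected number of carriers in a sample of n_individuals."""
    return n_individuals * carrier_probability(maf, n_variants)


n, maf = 1000, 0.001   # hypothetical sample size and allele frequency
print(f"single variant:       {expected_carriers(n, maf):.1f} carriers")
print(f"20 variants combined: {expected_carriers(n, maf, 20):.1f} carriers")
```

With about two carriers per thousand individuals, a test on one variant has almost nothing to compare between cases and controls, which is why very large samples or combined analyses are needed.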


APPLICATIONS

The number of scientific publications has exploded!


DISCUSSION

Development of new study designs

Development of more effective methods to distinguish errors from low frequency & rare variants

Development of the most appropriate strategy to identify a given disease


Cost-benefit analysis

Whole genome sequencing is unlikely to be cost-effective as it still presents huge challenges.

→ Couple a reduction in cost with an increase in efficiency and accuracy.

→ Make NGS platforms marketable, competitive and usable for clinical applications.


Validation analysis

Standards for NGS clinical genomics are required, for instance to validate the test accuracy.

→ Important downstream impact on patient diagnosis and management.


Current knowledge and research

Lack of knowledge
─ of what a SNP implies
─ of how to detect interactions between genes
─ of the influence of gene expression
─ of how to interpret genomic variation …

The more tests we run, the more knowledge we gain, and the more associations between phenotype and genome we can make.

CONCLUSION

Multiple issues

• Study design
• Error handling
• Data interpretation

Enable a wide variety of applications


REFERENCES


A. G. Day-Williams, E. Zeggini, The effect of Next-Generation Sequencing technology on complex trait research, Eur J Clin Invest 2011, Vol 41: 561-567.

http://videos.rennes.inria.fr/genopole/GenOuest-2010/ [online on 7th December 2013]

http://www.qiagen.com/products/applications/next-generation-sequencing/#Dataanalysis [online on 9th December 2013]

Figures
http://www.labtimes.org/labtimes/method/methods/2010_01.lasso
http://nextgenseek.com/2012/01/illumina-launches-a-new-faster-sequencer-hiseq-2500/
http://www.nucleics.com/DNA_sequencing_support/DNA-sequencing-dye-blobs.html
http://videos.rennes.inria.fr/genopole/GenOuest-2010/peterlongo_plateforme_2010_10_26.pdf
http://www.genomicslawreport.com/index.php/tag/illumina/
http://www.cancer.gov/cancertopics/understandingcancer/geneticvariation/page40