Source: rssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf
5/3/2015 Yannick Boursin
NGS, Cancer and Bioinformatics
NGS and Clinical Oncology
• NGS in hereditary cancer genome testing
  • BRCA1/2 (breast/ovarian cancer)
  • XPC (melanoma)
  • ERCC1 (colorectal cancer)
• NGS for personalized cancer treatment
  • Clinical trials: MOSCATO (GR), SAFIR (GR), SHIVA (Curie), …
  • Ipilimumab (anti-CTLA4), Nivolumab (anti-PD1), Trastuzumab (anti-HER2), Cetuximab (anti-EGFR)
• Detection of chimeric transcripts
  • Chronic Myeloid Leukemia: Philadelphia chromosome (BCR/ABL)
  • Non-Small-Cell Lung Cancer: EML4-ALK
NGS and Oncology
NGS is now widely used as:
• A research tool to screen large numbers of cancer samples
• A clinical/diagnostic tool in daily practice
These projects require a dedicated bioinformatics integration effort to access and analyze this huge amount of data.
Why do we need computers for NGS
Figure: sequencing data size evolution
Needs to address:
• Store petabytes of data (1 PB = 1000 TB)
• Share data around the world through networks
• Analyze huge amounts of data with complex algorithms
Bioinformatics and Oncology
• Problem: finding, extracting, and presenting relevant information.
• Partial solution: designing workflows to ease data analysis.
Interdisciplinary collaboration
Bioinformatics acts as a hub between the different fields. Trust between partners is needed, and so is training, so that partners understand each other efficiently.
Standard Workflow for NGS Analysis
Figure: a typical NGS workflow
Step 1: Quality Check and improvements
NGS data: what do they look like?
A raw data file (.fastq, .sff, .fa, .csfasta/.qual) contains millions of short reads, either all of the same size (SOLiD, HiSeq) or of different sizes (Ion PGM/Proton).
Figure: enhanced view of the reads in a FASTQ file
FASTQ format
• 1 sequence = 1 read = 4 lines in the file
• First line = sequence identifier
• Fourth line = Quality
• ASCII encoded (reduces the file size)
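As a minimal sketch of the four-line layout described above, the Python snippet below groups lines into FASTQ records. In the standard format the second line holds the bases and the third line starts with "+". The names `FastqRecord` and `parse_fastq` and the two example reads are illustrative assumptions, not part of the original slides, and the sketch assumes each record occupies exactly four lines (no wrapped sequences).

```python
from typing import Iterable, Iterator, NamedTuple


class FastqRecord(NamedTuple):
    identifier: str  # line 1, without the leading "@"
    sequence: str    # line 2, the base calls
    quality: str     # line 4, one ASCII character per base


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield one record per group of four lines (assumes unwrapped FASTQ)."""
    cleaned = [line.rstrip("\r\n") for line in lines if line.strip()]
    if len(cleaned) % 4 != 0:
        raise ValueError("FASTQ input must contain a multiple of 4 lines")
    for i in range(0, len(cleaned), 4):
        header, sequence, separator, quality = cleaned[i:i + 4]
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed record number {i // 4 + 1}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header}")
        yield FastqRecord(header[1:], sequence, quality)


# Hypothetical two-read example, made up for illustration.
example_lines = [
    "@read1", "ACGTACGT", "+", "IIIIHHGG",
    "@read2", "TTGCA", "+", "#5:AI",
]
records = list(parse_fastq(example_lines))
```

Real files are usually gzip-compressed and far too large to hold in a list, so a production parser would stream the file instead of loading it whole.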
Sequence quality encoding
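The quality line stores one Phred score per base as a single ASCII character. The sketch below decodes such a string, assuming the Phred+33 offset used by Sanger and recent Illumina files (older Illumina pipelines used an offset of 64, which can be passed as `offset`). The function names are illustrative assumptions.

```python
from typing import List


def decode_phred(quality: str, offset: int = 33) -> List[int]:
    """Convert an ASCII quality string to Phred scores (Q = ord(char) - offset)."""
    scores = [ord(char) - offset for char in quality]
    if any(q < 0 for q in scores):
        raise ValueError("negative score: the offset is probably wrong")
    return scores


def error_probability(q: int) -> float:
    """Phred definition: Q = -10 * log10(P), hence P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


# 'I' is ASCII 73, '5' is ASCII 53, '#' is ASCII 35.
scores = decode_phred("I5#")
```

With this offset, a score of 20 corresponds to a 1% chance that the base call is wrong, and a score of 40 to 0.01%.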
Why look at sequencing quality?
• Data quality is very important for various downstream analyses:
  • Sequence assembly or mapping
  • Variant detection
  • Gene expression studies
  • ...
• When data quality is poor:
  • Try to find a reason
  • Can we correct/improve the quality?
  • Poor quality may lead to erroneous conclusions
Quality controls on raw reads: which metrics to check?
Mainly:
• Quality score per base and over the reads
But also:
• Read length distribution
• Sequence content per base and % of GC
• K-mer content
• Overrepresented sequences
• Duplicated reads
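A few of the metrics listed above can be computed in a handful of lines. The following is a rough sketch, not a replacement for a dedicated QC tool: it takes reads as (sequence, quality string) pairs, assumes Phred+33 encoding, and counts duplicates by exact sequence identity. All function names and the three example reads are made up for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple


def gc_percent(sequence: str) -> float:
    """Percentage of G and C bases in one read."""
    if not sequence:
        return 0.0
    gc = sum(1 for base in sequence.upper() if base in "GC")
    return 100.0 * gc / len(sequence)


def mean_quality_per_position(qualities: List[str], offset: int = 33) -> List[float]:
    """Mean Phred score at each read position (reads may differ in length)."""
    totals: List[int] = []
    counts: List[int] = []
    for quality in qualities:
        for position, char in enumerate(quality):
            if position == len(totals):
                totals.append(0)
                counts.append(0)
            totals[position] += ord(char) - offset
            counts[position] += 1
    return [total / count for total, count in zip(totals, counts)]


def qc_summary(reads: List[Tuple[str, str]]) -> Dict[str, object]:
    """Summarize read lengths, GC content, per-position quality and duplicates."""
    sequences = [seq for seq, _ in reads]
    return {
        "length_distribution": Counter(len(seq) for seq in sequences),
        "mean_gc_percent": sum(gc_percent(s) for s in sequences) / len(sequences),
        "mean_quality_per_position": mean_quality_per_position([q for _, q in reads]),
        # A sequence seen n times contributes n - 1 duplicated reads.
        "duplicated_reads": sum(n - 1 for n in Counter(sequences).values() if n > 1),
    }


# Hypothetical reads as (sequence, quality string) pairs.
reads = [("ACGT", "IIII"), ("ACGT", "II5#"), ("GGCCA", "IIIII")]
summary = qc_summary(reads)
```

The per-position means are the raw numbers behind a per-base quality plot; a box-and-whisker view would additionally need the quartiles at each position.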
Quality scores
• Per base (box-and-whisker plot) -> to see whether base calls fall into low quality (commonly towards the end of a read)
• Per sequence (mean quality distribution) -> to see whether a subset of your sequences has universally low quality values
Reads cleaning: removing bad quality bases
• After QC, low-quality bases need to be removed.
• This is often done by scanning reads with a sliding-window algorithm.
Figure: read-end trimming by a quality-trimming algorithm. In red: bad-quality bases. In blue: good-quality bases.
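The sliding-window idea can be sketched as follows. This is one simplified variant, not the exact algorithm of any particular tool: the read is scanned from its 5' end and cut at the start of the first window whose mean quality falls below a threshold. The window size of 4 and the threshold of 20 are arbitrary illustrative defaults.

```python
from typing import List, Tuple


def sliding_window_trim(sequence: str, scores: List[int], window: int = 4,
                        min_mean_quality: float = 20.0) -> Tuple[str, List[int]]:
    """Cut the read at the start of the first window whose mean quality is too low."""
    if len(sequence) != len(scores):
        raise ValueError("sequence and scores must have the same length")
    if not scores:
        return "", []
    cut = len(sequence)
    # Reads shorter than the window are evaluated as a single window.
    for start in range(0, max(len(scores) - window + 1, 1)):
        window_scores = scores[start:start + window]
        if sum(window_scores) / len(window_scores) < min_mean_quality:
            cut = start
            break
    return sequence[:cut], scores[:cut]


# Hypothetical read whose quality collapses towards the 3' end.
trimmed_seq, trimmed_scores = sliding_window_trim(
    "ACGTACGTAC", [38, 38, 37, 36, 35, 30, 12, 8, 5, 2]
)
```

Cutting at the window start can discard a good base or two next to the bad stretch; that trade-off is controlled by the window size.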
Read cleaning: adapter removal
• An adapter is a short piece of known DNA located at the end of the reads
• Adapter roles:
  • Attach the read to the sequencer flowcell
  • Allow specific PCR enrichment of reads carrying an adapter
  • Enable multiplex sequencing (mixed samples)
• Available tools to trim adapters:
  • Cutadapt
  • Trimmomatic
  • RmAdapter
Figure: in blue, adapters; in orange, the informative part of the read.
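To illustrate what adapter trimming does, here is a deliberately naive sketch that removes a 3' adapter by exact string matching, either as a full occurrence or as an adapter prefix at the very end of the read. The tools listed above also tolerate mismatches and use quality information, which this sketch does not. The function name, the `min_overlap` default and the example reads are assumptions; the adapter string is the start of a commonly used Illumina adapter, used here only as an example.

```python
def trim_3prime_adapter(read: str, adapter: str, min_overlap: int = 3) -> str:
    """Remove a 3' adapter: a full occurrence, or an adapter prefix at the read end."""
    full = read.find(adapter)
    if full != -1:
        return read[:full]
    # Partial adapter: the read ends inside the adapter sequence.
    longest = min(len(adapter) - 1, len(read))
    for overlap in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:len(read) - overlap]
    return read


ADAPTER = "AGATCGGAAGAGC"  # example adapter prefix, for illustration only

full_case = trim_3prime_adapter("ACGTTGCAAGATCGGAAGAGCACAC", ADAPTER)
partial_case = trim_3prime_adapter("ACGTTGCAAGATC", ADAPTER)
untouched = trim_3prime_adapter("ACGTTGCA", ADAPTER)
```

The `min_overlap` parameter limits false positives: a one- or two-base match at the read end happens often by chance and is left alone.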
Step 2: Short Reads Alignment
Reads alignment - Vocabulary
Reference genome: a known sequence, assumed to be as close as possible to the input genome, used as an anchor to organize the information from individual reads.
Alignment (mapping): read alignment turns the information from individual reads into an organized and reduced set of information by giving each read a genomic position.
Mismatch: a disagreement between two aligned nucleotides.
Gap: a break within the read alignment (i.e. a small insertion/deletion).
Indel: an insertion or deletion relative to the reference genome.
Mappability: uniqueness of a region (repeated region = low mappability, unique region = good mappability).
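Mappability can be made concrete with a toy computation: for a given read length k, a position is uniquely mappable if the k-mer starting there occurs only once in the reference. The sketch below is a simplification (exact matches only, forward strand only), and the short reference string is invented for illustration.

```python
from collections import Counter
from typing import List


def kmer_uniqueness(reference: str, k: int) -> List[bool]:
    """For each k-mer start position, True if that k-mer occurs once in the reference."""
    kmers = [reference[i:i + k] for i in range(len(reference) - k + 1)]
    counts = Counter(kmers)
    return [counts[kmer] == 1 for kmer in kmers]


# Toy reference in which "ACGT" occurs twice (positions 0 and 4).
unique_flags = kmer_uniqueness("ACGTACGTTTGCA", k=4)
fraction_unique = sum(unique_flags) / len(unique_flags)
```

Longer reads raise the fraction of unique positions, which is one reason read length matters in repeated regions.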
Reads alignment – Two strategies
Read alignment aims at transforming the information from individual reads into an organized and reduced set of information.
Two strategies can be applied:
- De novo read assembly: used when no reference genome is available. It aims at reconstructing long scaffolds from the individual reads.
- Alignment on a reference genome: the reads are directly compared to a known reference genome.
Alignment on a reference genome
The reference genome is a known sequence, assumed to be as close as possible to the input genome, which is used as an anchor to organize the information from individual reads.
Figure: alignment of reads against the reference genome
Alignment on a reference genome - Challenges
New alignment algorithms must address the requirements and characteristics of NGS reads:
– Millions of reads per run (30x genome coverage)
– Reads of different sizes (35 bp - 200 bp)
– Different types of reads (single-end, paired-end, mate-pair, etc.)
– Base-calling quality factors
– Sequencing errors (~1%)
– Repetitive regions
– Sequenced organism vs. reference genome
– Must adjust to evolving sequencing technologies and data formats
Alignment on a reference genome – Bioinformatics tools
Finding the best alignment - Rationale
Given a reference and a set of reads, report at least one “good” local alignment for each read, if one exists.
What is “good”? For now, we concentrate on:
– Fewer mismatches is better
– Failing to align a low-quality base is better than failing to align a high-quality base
This is based on a scoring system, e.g. score for a match (1), mismatch (MM) penalty (3), gap open penalty (5), gap extension penalty (2). The best alignment is the one with the highest score.
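The scoring system above can be turned into a small function that scores an already-built alignment, given as two gapped strings of equal length. The slide does not say whether the gap open penalty already covers the first gap base; this sketch assumes a gap of length k costs open + k × extension, which is one common convention. The function name and the example alignments are illustrative.

```python
def alignment_score(read_aln: str, ref_aln: str, match: int = 1, mismatch: int = 3,
                    gap_open: int = 5, gap_extend: int = 2) -> int:
    """Score a gapped pairwise alignment; '-' marks a gap in either row."""
    if len(read_aln) != len(ref_aln):
        raise ValueError("aligned strings must have the same length")
    score = 0
    previous_gap = None  # row that carried the gap in the previous column, if any
    for read_base, ref_base in zip(read_aln, ref_aln):
        if read_base == "-" or ref_base == "-":
            current_gap = "read" if read_base == "-" else "ref"
            if current_gap != previous_gap:
                score -= gap_open  # a new gap starts here
            score -= gap_extend    # assumed: every gap base pays the extension
            previous_gap = current_gap
        else:
            score += match if read_base == ref_base else -mismatch
            previous_gap = None
    return score


perfect = alignment_score("ACGTACGT", "ACGTACGT")
one_mismatch = alignment_score("ACGTTCGT", "ACGTACGT")
two_base_gap = alignment_score("ACG--CGT", "ACGTACGT")
```

With these values a single mismatch costs less than opening a gap, so an aligner using them will usually prefer a mismatch over a one-base indel when both explain the read.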
Alignment key parameters - Repeats
Approximately 50% of the human genome is composed of repeats.
Treangen T.J. and Salzberg S.L. 2012. Nature Reviews Genetics 13, 36-46
Repeats lie in close proximity to genes, at both intergenic and intragenic positions.
Figure: BRCA2, a mosaic of repeated regions
Alignment key parameters – Repeats – 3 strategies
-1- Report only unique alignments
-2- Report best alignments and randomly assign reads across equally good loci
-3- Report all (best) alignments
Treangen T.J. and Salzberg S.L. 2012. Nature Reviews Genetics 13, 36-46
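The three reporting strategies can be sketched on a list of candidate hits for one read, each hit being a (locus, score) pair. Here "unique" is taken to mean a single best-scoring locus, which is an interpretation rather than something the slide states. Function names and example hits are hypothetical.

```python
import random
from typing import List, Tuple

Hit = Tuple[str, int]  # (locus name, alignment score)


def best_hits(hits: List[Hit]) -> List[Hit]:
    """All hits sharing the highest score."""
    if not hits:
        return []
    top = max(score for _, score in hits)
    return [hit for hit in hits if hit[1] == top]


def report_unique(hits: List[Hit]) -> List[Hit]:
    """Strategy 1: report the read only if its best locus is unambiguous."""
    best = best_hits(hits)
    return best if len(best) == 1 else []


def report_random_best(hits: List[Hit], rng: random.Random) -> List[Hit]:
    """Strategy 2: pick one of the equally good loci at random."""
    best = best_hits(hits)
    return [rng.choice(best)] if best else []


def report_all_best(hits: List[Hit]) -> List[Hit]:
    """Strategy 3: report every best-scoring locus."""
    return best_hits(hits)


# A read that fits repeat copies A and B equally well, and C less well.
hits = [("A", 50), ("B", 50), ("C", 42)]
chosen = report_random_best(hits, random.Random(0))
```

Strategy 1 leaves repeats uncovered, strategy 2 spreads coverage evenly but arbitrarily, and strategy 3 counts the same read several times; which one is acceptable depends on the downstream analysis.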
Alignment on a reference genome
Key points
• The alignment is a crucial step of the NGS analysis.
• The reference genome has to be carefully chosen.
• The mappability of the region of interest has to be taken into account (primer design).
• The scoring method has to be chosen according to the sequencing error rate and the quality of the raw reads.
• The alignment parameters have to be set properly.
Limitations of Alignment Tools
Even though good tools now exist to align reads on a reference genome, several issues remain important:
- Homopolymer mapping
- Efficient alignment of small indels
- Alignment on several genomes
- Alignment on repeated sequences
- ...
Alignment formats
• Many formats exist:
  • SAM
  • BAM
  • ELAND (Illumina specific)
  • MAQ map
  • …
SAM and BAM are now the standard for aligned data.
SAM format
• SAM stands for Sequence Alignment/Map
• Tab-delimited text file
• 1 line per read
• Each line is composed of at least 11 fields
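The 11 mandatory fields are, in order, QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, SEQ and QUAL; any further columns are optional tags. A minimal parsing sketch follows; the `SamRecord` type, the function name and the example line are invented for illustration, and header lines starting with "@" are not handled.

```python
from typing import NamedTuple, Tuple


class SamRecord(NamedTuple):
    qname: str   # read name
    flag: int    # bitwise flag
    rname: str   # reference sequence name
    pos: int     # 1-based leftmost mapping position
    mapq: int    # mapping quality
    cigar: str   # alignment description
    rnext: str   # reference name of the mate
    pnext: int   # position of the mate
    tlen: int    # observed template length
    seq: str     # read sequence
    qual: str    # ASCII-encoded base qualities
    tags: Tuple[str, ...]  # optional fields, kept as raw strings


def parse_sam_line(line: str) -> SamRecord:
    """Split one alignment line into its 11 mandatory fields plus optional tags."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line needs at least 11 fields")
    return SamRecord(
        fields[0], int(fields[1]), fields[2], int(fields[3]), int(fields[4]),
        fields[5], fields[6], int(fields[7]), int(fields[8]),
        fields[9], fields[10], tuple(fields[11:]),
    )


# Made-up alignment line for illustration.
record = parse_sam_line(
    "read1\t99\tchr1\t1000\t60\t8M\t=\t1200\t208\tACGTACGT\tIIIIHHGG\tNM:i:0"
)
```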
• The second field (FLAG) can be used to quickly filter the file
  • With Samtools (command line) and the -f and -F options
• Useful webpage:
  • http://broadinstitute.github.io/picard/explain-flags.html
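The FLAG field is a sum of bits, each with a fixed meaning in the SAM specification. The sketch below decodes a flag and mimics the logic of the `-f` (all listed bits must be set) and `-F` (none of the listed bits may be set) filters; the dictionary labels and function names are illustrative choices.

```python
from typing import Dict, List

SAM_FLAGS: Dict[int, str] = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def explain_flag(flag: int) -> List[str]:
    """Names of all bits set in the flag, from lowest to highest bit."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


def passes_filter(flag: int, require: int = 0, exclude: int = 0) -> bool:
    """True if all `require` bits are set and no `exclude` bit is set."""
    return (flag & require) == require and (flag & exclude) == 0


meaning_of_99 = explain_flag(99)  # 99 = 1 + 2 + 32 + 64
```

For example, `passes_filter(flag, require=0x2, exclude=0x4)` keeps properly paired, mapped reads, which is the same selection as combining `-f 2` with `-F 4`.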
BAM format
• BAM stands for Binary Alignment/Map
• Corresponds to the SAM format compressed with BGZF
• Reduces the size of the alignment file by a factor of 5
• Not directly readable, unlike the SAM format
  • Requires Samtools
• Best format for sharing alignment files
• Coupled with an index file (BAI)
  • Avoids a sequential read of the complete file
QC 3: which metrics to check?
In practice, how do I validate my alignment?
Be aware of the mapping strategy used.
Look at simple descriptive statistics:
– Number of aligned reads
– Coverage/depth
– Mapping quality
– Number of normal/abnormal pairs for paired-end data
– ...
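Coverage and depth can be estimated before sequencing and measured after alignment. The sketch below uses the usual approximation depth = number of reads × read length / genome size, and computes mean depth and breadth of coverage from aligned intervals on a small region. The function names are assumptions, and the 3-billion-base genome size is a round approximation for the human genome.

```python
import math
from typing import List, Tuple


def expected_depth(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average depth expected if reads were spread evenly over the genome."""
    return n_reads * read_length / genome_size


def reads_needed(depth: float, read_length: int, genome_size: int) -> int:
    """Number of reads required to reach a target average depth."""
    return math.ceil(depth * genome_size / read_length)


def depth_and_breadth(alignments: List[Tuple[int, int]],
                      region_length: int) -> Tuple[float, float]:
    """Mean depth and fraction of covered positions for one region.

    Each alignment is a (0-based start, aligned length) pair.
    """
    depth = [0] * region_length
    for start, length in alignments:
        for pos in range(max(start, 0), min(start + length, region_length)):
            depth[pos] += 1
    mean_depth = sum(depth) / region_length
    breadth = sum(1 for d in depth if d > 0) / region_length
    return mean_depth, breadth


reads_for_30x = reads_needed(30, 100, 3_000_000_000)
mean_depth, breadth = depth_and_breadth([(0, 5), (3, 5), (3, 5)], region_length=10)
```

A good mean depth with poor breadth points to uneven coverage, so both numbers are worth checking together.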
NGS analysis: how can I work with my NGS data?
• Difficult on a personal computer (lack of resources)
• 1 alignment = 4 processors + 15 GB RAM (multiplied by the number of samples)
• Impossible to open the files in software such as a text editor
• Needs a very large storage capacity
• Data backup administration
• Application server connected to a computing cluster and a storage array:
  • Commercial solutions (CLC Bio, NextGene, ...)
  • Galaxy server: https://galaxy.gustaveroussy.fr/galaxyprod
After sequencing: data analysis
Main challenges:
• The rapid evolution of the high-throughput technologies
• The rapid evolution of the bioinformatics solutions
• The rapid evolution of the biological/medical knowledge
Data analysis
• Chimeric transcript search
• Alternative transcript study
• Differential expression study
• Methylation study
• Detection of genomic variants
• Detection of copy-number variation
Chimeric transcripts
Do the tumor cells express any chimeric transcripts?
Figure: history of the BCR-ABL fusion
Alternative transcripts
Differential expression
Are there genes that are strongly expressed in one kind of tumor but not in the other?
Can we group tumors according to their expression profiles?
Figure: clustering of differential expression in breast tumours
Methylome
Is there any difference between DNA methylation in tumors and in normal cells?
How does methylation promote cancer?
Detection of copy-number variations
Are there any copy-number alterations (gain or loss of chromosomal regions, amplifications, …) that could explain tumorigenesis?
Figure: copy-number variations in cancer. MYC and KRAS are amplified.
Detection of genomic variants
Are there mutational events that are specific to the tumor genome? Could they explain the tumorigenesis? Is there any drug targeting those mutations?
Figure: pancreatic adenocarcinoma, from normal cells to tumor cells
Limitations: Detection of genomic variants
Between 1.4% and 8.9% of the variants are technology-specific.
Figure: common genomic variants between different variant callers
Conclusion
• Nowadays, NGS is widely used in cancer centers in order to categorize cancers and link patients with personalized treatments (Precision Medicine)
• NGS is also used in cancer research to discover new oncogenetic mechanisms, to understand how a treatment works, to link biological and genetic characteristics, …
• Because of its limitations, NGS might not answer your questions. It is important to know that the technique is limited:
  • A) by the question you asked at first. If a cancer cannot be explained by mutational events, it might be explained by other mechanisms. Still, sometimes there is nothing to be found in the data.
  • B) by technical issues. Sequencers and software are prone to errors. Statistically, some errors should be expected in any analysis. You can often limit the effects of these limitations by making biological and technical replicates.
Paired-end mapping
• Insert-size checking
• % of "All Good" = both reads in the pair have aligned
  • "The pair is properly aligned" means that they mapped within a proper distance from each other
• % of "All Bad" = neither the read nor its mate mapped
• % of "Only one read maps" = only one read in the pair is mapped
58
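The categories above can be derived from the FLAG field of each SAM record. The following is a minimal sketch, assuming standard SAM flag bits; the function names, the category labels and the example flag values are illustrative and not taken from the slides.

```python
from collections import Counter

# Standard SAM flag bits (SAM specification).
PAIRED = 0x1
PROPER_PAIR = 0x2
UNMAPPED = 0x4
MATE_UNMAPPED = 0x8
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def classify_pair(flag):
    """Return a mapping category for one read, based on its SAM flag."""
    if not flag & PAIRED:
        return "not paired"
    read_mapped = not flag & UNMAPPED
    mate_mapped = not flag & MATE_UNMAPPED
    if read_mapped and mate_mapped:
        if flag & PROPER_PAIR:
            return "all good (proper pair)"
        return "all good (not proper)"
    if not read_mapped and not mate_mapped:
        return "all bad"
    return "only one read maps"


def summarize(flags):
    """Percentage of primary records falling in each category."""
    primary = [f for f in flags if not f & (SECONDARY | SUPPLEMENTARY)]
    if not primary:
        return {}
    counts = Counter(classify_pair(f) for f in primary)
    return {k: 100.0 * v / len(primary) for k, v in counts.items()}


# Hypothetical flag values, two records per category.
example_flags = [99, 147, 77, 141, 73, 133, 65, 129]
for category, percent in sorted(summarize(example_flags).items()):
    print(f"{category}: {percent:.1f}%")
```

In practice these numbers are usually read from the output of `samtools flagstat` rather than computed by hand; the sketch only shows where they come from.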
![Page 59: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/59.jpg)
Alignment key parameters – Using single or paired-end reads?
The type of sequencing (i.e. single or paired-end reads) is often driven by the application.
Example: finding large indels, genomic rearrangements, ...
However, in most cases, the pair information can improve the mapping specificity:
- Single-end alignment – repeated sequence
- Paired-end alignment – unique sequence
Alignment of reads against reference genome
![Page 60: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/60.jpg)
NGS Toolkit: SAMtools
http://samtools.sourceforge.net/samtools.shtml
Interacting with SAM/BAM format
SAMtools provides the following commands:
• view: transform and filter SAM or BAM data
• sort: sort a BAM file by genomic location or by name
• index: create an index file that allows fast look-up of data in a (sorted) SAM or BAM
• mpileup: SNVs/indel detection
• rmdup: remove duplicated reads
• flagstat: compute statistics on the SAM/BAM file
...
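A typical chain of these commands can be sketched as follows. This is only a dry run that prints the command lines; it assumes the samtools 1.x syntax (where `sort` takes `-o` for the output file), which differs from the older 0.1.x releases documented at the URL above. The file names and the helper `samtools_pipeline` are hypothetical.

```python
import shlex


def samtools_pipeline(sam_path, prefix):
    """Build the command lines to convert, sort, index and summarize a SAM file."""
    bam = f"{prefix}.bam"
    sorted_bam = f"{prefix}.sorted.bam"
    return [
        ["samtools", "view", "-b", "-o", bam, sam_path],  # SAM -> BAM
        ["samtools", "sort", "-o", sorted_bam, bam],      # sort by genomic location
        ["samtools", "index", sorted_bam],                # index for fast look-up
        ["samtools", "flagstat", sorted_bam],             # mapping statistics
    ]


# Print the commands instead of running them (dry run).
for cmd in samtools_pipeline("sample.sam", "sample"):
    print(shlex.join(cmd))
```

To actually execute the steps, each list could be passed to `subprocess.run(cmd, check=True)` on a machine where samtools is installed.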
![Page 61: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/61.jpg)
NGS Toolkit: BEDTools
http://code.google.com/p/bedtools/
• Addresses common genomics tasks such as finding feature overlaps and computing coverage
• Can manage BED, GFF/GTF, VCF and SAM/BAM
• Unix-like commands
• Fast
• All intersection or annotation tasks can be done with BEDTools
Quinlan AR and Hall IM, 2010. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 26, 6, pp. 841–842.
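To make "feature overlaps" and "coverage" concrete, here is a minimal pure-Python sketch of both ideas. It is not how BEDTools is implemented; it assumes BED-style coordinates (0-based, half-open) and uses hypothetical function names and example intervals.

```python
def overlap(a, b):
    """Return the overlapping part of two (chrom, start, end) features, or None.

    Coordinates are 0-based and half-open, as in BED files.
    """
    if a[0] != b[0]:
        return None
    start, end = max(a[1], b[1]), min(a[2], b[2])
    return (a[0], start, end) if start < end else None


def intersect(features_a, features_b):
    """All pairwise overlaps (naive scan, fine for small inputs)."""
    hits = []
    for a in features_a:
        for b in features_b:
            hit = overlap(a, b)
            if hit is not None:
                hits.append(hit)
    return hits


def covered_fraction(target, reads):
    """Fraction of the target's bases covered by at least one read."""
    _, start, end = target
    covered = [False] * (end - start)
    for read in reads:
        hit = overlap(target, read)
        if hit is not None:
            for pos in range(hit[1], hit[2]):
                covered[pos - start] = True
    return sum(covered) / len(covered)


exons = [("chr1", 100, 200), ("chr1", 300, 400)]
reads = [("chr1", 150, 250), ("chr1", 390, 450), ("chr2", 100, 200)]
print(intersect(exons, reads))
print(covered_fraction(exons[0], reads))
```

On real data the same questions are answered much faster by the BEDTools commands themselves, which work on sorted files and avoid the pairwise scan used here.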
![Page 62: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/62.jpg)
How to visualise data?
IGV: Integrative Genomics Viewer
http://www.broadinstitute.org/igv/
• Java application (local version)
• Annotation available on the Broad server
• Batch command line
• Supports a lot of different file formats (variant visualization)
• Easy to use
• Limited in terms of annotations
Screencast: How to use IGV (French) https://www.youtube.com/watch?v=Wx3zHYK0cNg
![Page 63: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/63.jpg)
How to visualise data?
UCSC Genome Browser
http://genome.ucsc.edu/
• Hundreds of annotation data sets
• Hundreds of public (ENCODE) profiles
• Table functions
• Fully online (sessions)
• Can be difficult to upload big data files (new formats: bigBed, bigWig, etc.)
Screencast: How to use the UCSC genome browser (French) https://www.youtube.com/watch?v=VPeoeJebdFM
![Page 64: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/64.jpg)
Sequence length distribution
• Sequencers generate:
• either sequence fragments of uniform length
• or reads of wildly varying lengths.
The length distribution helps to identify and remove reads with abnormal length.
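As an illustration, the length distribution can be computed directly from a FASTQ file. The sketch below assumes the simple four-lines-per-record FASTQ layout; the in-memory example data, the function names and the length thresholds are hypothetical.

```python
from collections import Counter
import io


def read_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ handle."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual


def length_distribution(records):
    """Count how many reads have each length."""
    return Counter(len(seq) for _, seq, _ in records)


def filter_by_length(records, min_len, max_len):
    """Keep only reads whose length lies within [min_len, max_len]."""
    return [r for r in records if min_len <= len(r[1]) <= max_len]


# Tiny in-memory FASTQ used instead of a real file.
FASTQ = (
    "@r1\nACGTACGT\n+\nIIIIIIII\n"
    "@r2\nACGT\n+\nIIII\n"
    "@r3\nACGTACGT\n+\nIIIIIIII\n"
)
records = list(read_fastq(io.StringIO(FASTQ)))
print(sorted(length_distribution(records).items()))
print([name for name, _, _ in filter_by_length(records, 6, 10)])
```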
![Page 65: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/65.jpg)
Sequence length distribution
![Page 66: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/66.jpg)
Sequence content
• Proportion of each base position for which each of A, C, G, T has been called
• GC content of each base position -> in random libraries, there is little to no difference between the different bases
• N content per base -> an N is reported if the sequencer is unable to make a base call with sufficient confidence
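The three measures above can all be read from one per-position table of base fractions. A minimal sketch, assuming upper-case reads given as plain strings; the function names and the four example reads are illustrative only.

```python
from collections import Counter


def per_position_content(reads):
    """For each position, the fraction of reads calling A, C, G, T or N.

    Reads shorter than the current position are ignored at that position.
    """
    longest = max(len(r) for r in reads)
    table = []
    for pos in range(longest):
        column = Counter(r[pos] for r in reads if pos < len(r))
        depth = sum(column.values())
        table.append({base: column.get(base, 0) / depth for base in "ACGTN"})
    return table


def gc_per_position(table):
    """GC fraction at each position."""
    return [row["G"] + row["C"] for row in table]


reads = ["ACGT", "ACGN", "TCGA", "AGGT"]
table = per_position_content(reads)
for pos, row in enumerate(table, start=1):
    print(pos, row)
print(gc_per_position(table))
```

In a random library the rows should look nearly identical from one position to the next; a position with a high N fraction points to a cycle where base calling was unreliable.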
![Page 67: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/67.jpg)
Sequence content
![Page 68: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/68.jpg)
Over-represented sequences
• The sequences that are highly duplicated in your library, as well as any primer and/or adapter dimers that were present in the original library.
• Run A:
• Sequence: GACTCGGCAGCATCTCCATCCAAACTTTTCATTTCTGCTTTTAAAGGAAA
• Count: 37
• Percentage: 0.1%
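A report like the one above (sequence, count, percentage) can be produced by counting identical reads. This is a minimal sketch; the reporting threshold `min_percent` is an assumed parameter, and the reads in the example are made up.

```python
from collections import Counter


def overrepresented(reads, min_percent=0.1):
    """Return (sequence, count, percentage) for sequences at or above a threshold.

    min_percent is an assumed cut-off expressed as a percentage of all reads.
    """
    counts = Counter(reads)
    total = len(reads)
    hits = []
    for seq, count in counts.most_common():
        percent = 100.0 * count / total
        if percent >= min_percent:
            hits.append((seq, count, percent))
    return hits


# Hypothetical library of 10 short reads.
reads = ["AAAA"] * 3 + ["CCCC"] + ["GGGG"] * 6
for seq, count, percent in overrepresented(reads, min_percent=30):
    print(f"Sequence: {seq}  Count: {count}  Percentage: {percent:.1f}%")
```

Sequences reported this way are then usually compared against known adapter and primer sequences to decide whether they are contaminants.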
![Page 69: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/69.jpg)
Duplicate reads
• Different reads which have the same sequence
• A duplicate could be a PCR effect, come from reading the same fragment twice, or come from enrichment
• Reads which align to the identical location on the reference
• Remove duplicates? It depends on the application.
• Example: for targeted sequencing, you do not want duplicates to be removed
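The "same location on the reference" definition can be sketched as follows. This is a simplified illustration, not the algorithm of any particular tool: duplicate-removal tools typically also take the mate position and read quality into account. The record layout and names are hypothetical.

```python
def mark_duplicates(alignments):
    """Split reads into (kept, duplicates) by alignment position.

    Two reads are treated as duplicates when they share chromosome,
    start position and strand; the first one seen is kept.
    """
    seen = set()
    kept, duplicates = [], []
    for aln in alignments:
        key = (aln["chrom"], aln["pos"], aln["strand"])
        if key in seen:
            duplicates.append(aln["name"])
        else:
            seen.add(key)
            kept.append(aln["name"])
    return kept, duplicates


alignments = [
    {"name": "r1", "chrom": "chr1", "pos": 100, "strand": "+"},
    {"name": "r2", "chrom": "chr1", "pos": 100, "strand": "+"},
    {"name": "r3", "chrom": "chr1", "pos": 100, "strand": "-"},
    {"name": "r4", "chrom": "chr1", "pos": 250, "strand": "+"},
]
kept, duplicates = mark_duplicates(alignments)
print("kept:", kept)
print("duplicates:", duplicates)
```

Returning the duplicates separately, instead of discarding them, leaves the decision to remove them to the application, in line with the caveat above.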
![Page 70: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/70.jpg)
Duplicate reads
![Page 71: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/71.jpg)
Data analysis
• Motif search
• Chimeric transcript search
• Microbiota study
• Alternate transcript search
• Differential expression study
• Methylation study
• Detection of genomic variants
• Detection of copy-number variation
![Page 72: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/72.jpg)
Motif search
How does my protein interact with DNA?
![Page 73: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/73.jpg)
Chimeric transcripts
Are there any chimeric transcripts?
![Page 74: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/74.jpg)
Microbiota
What kinds of species grow in the human gut? Could those species be associated with tumorigenesis?
![Page 75: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/75.jpg)
Alternative transcripts
Are there any differences between normal cells and tumor cells regarding splicing events?
![Page 76: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/76.jpg)
Differential expression
Are there genes that are strongly expressed in one kind of tumor but not in the other kind?
Can we group tumors according to their expression profiles?
![Page 77: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/77.jpg)
Methylome
Is there any difference between DNA methylation in tumors and in normal cells?
![Page 78: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/78.jpg)
Detection of copy-number variations
Are there any copy-number alterations (gain or loss of chromosomal regions, amplifications …) that could explain tumorigenesis?
![Page 79: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/79.jpg)
Detection of genomic variants
Are there mutational events that are specific to the tumor genome? Could tumorigenesis be attributed to them?
![Page 80: NGS, Cancer and Bioinformaticsrssf.i2bc.paris-saclay.fr/transfert/IFSBM/IFSBM_intro_NGS-YB.pdf · NGS and Oncology 5/3/2015 Yannick Boursin NGS is now widely used as: • A research](https://reader034.vdocuments.site/reader034/viewer/2022042509/5f7b67281b72c715ac75ab22/html5/thumbnails/80.jpg)
Quality controls on raw reads: let's start after sequencing
A raw read is characterized by three parameters:
• Its length
• Its sequence
• Per-base-in-sequence quality
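The three parameters can be read off a single FASTQ record, as in the sketch below. It assumes the Phred+33 quality encoding (the `offset=33` default), which is common but not universal; the record itself and the function names are made up for illustration.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into per-base Phred scores.

    offset=33 is an assumption (Phred+33 encoding); older data may use 64.
    """
    return [ord(ch) - offset for ch in quality_string]


def error_probability(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


# One hypothetical FASTQ record: header, sequence, separator, qualities.
record = ["@read1", "ACGTN", "+", "IIII#"]
name, sequence, quality = record[0][1:], record[1], record[3]
scores = phred_scores(quality)

print("length:", len(sequence))
print("sequence:", sequence)
print("per-base quality:", scores)
print("error probability of last base:", round(error_probability(scores[-1]), 3))
```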
Raw reads