IGV: Integrated Genome Viewer - Institut Pasteur

Amel Ghouila, Claudia Chica, Emna Achouri & Fatma Guerfali
C3BI Hands-on NGS course – IPP – 23rd Nov 2016



IGV: Integrated Genome Viewer


IGV


GFF and BED formats


GFF3 Format

The two most widely used formats for representing genome features are the BED and GFF formats. GFF (Generic Feature Format) is a standard file format for storing genomic features in a text file.

Careful: GFF has several versions; the most recent is GFF3. GFF features start at a 1-based position and end at a 1-based position (both inclusive).

► 1 line for 1 feature
► Tab-delimited file (tab-separated columns)
► 9 columns + optional additional information

http://gmod.org/wiki/GFF3
http://www.ensembl.org/

[Figure: example GFF3 record with its columns labelled SEQ-ID, SOURCE, TYPE, START-END, SCORE, STRAND, PHASE, ATTRIBUTES]
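The nine-column layout above can be read with a few lines of Python. This is a minimal sketch, not a full GFF3 parser: the names `GffFeature`, `parse_gff3_line` and `feature_length` are hypothetical, the example record is invented for illustration, and the code assumes a well-formed line with exactly 9 tab-separated columns and `key=value` attributes separated by `;`.

```python
from typing import Dict, NamedTuple, Optional
from urllib.parse import unquote


class GffFeature(NamedTuple):
    seqid: str
    source: str
    type: str
    start: int                 # 1-based, inclusive
    end: int                   # 1-based, inclusive
    score: Optional[float]     # "." in the file means "no score"
    strand: str
    phase: Optional[int]       # "." in the file means "not applicable"
    attributes: Dict[str, str]


def parse_gff3_line(line: str) -> GffFeature:
    """Split one GFF3 feature line into its 9 tab-separated columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    seqid, source, ftype, start, end, score, strand, phase, attrs = fields

    # Column 9 holds ';'-separated key=value pairs (URL-escaped in GFF3).
    attributes: Dict[str, str] = {}
    for pair in attrs.split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[unquote(key)] = unquote(value)

    return GffFeature(
        seqid, source, ftype, int(start), int(end),
        None if score == "." else float(score),
        strand,
        None if phase == "." else int(phase),
        attributes,
    )


def feature_length(feature: GffFeature) -> int:
    """Both coordinates are 1-based and inclusive, hence the +1."""
    return feature.end - feature.start + 1


# Invented example record, for illustration only.
example = "chr1\texample\tgene\t1000\t2000\t.\t+\t.\tID=gene0001;Name=demo"
gene = parse_gff3_line(example)
print(gene.type, gene.start, gene.end, feature_length(gene))
```

Comment lines (starting with `#`) and the optional FASTA section of a GFF3 file are not handled here and would need to be filtered out before calling `parse_gff3_line`.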


BED Format

► Other useful file formats

BED = Browser Extensible Data (http://genome.ucsc.edu/FAQ/FAQformat)

BED starts are zero-based and BED ends are one-based
► Tab-delimited files
► First 3 fields required, others optional

File format (BEDPE): describes disjoint genome features, such as structural variations or paired-end sequence alignments.

[Figure: example BED record with its columns labelled CHR, START-END, FEATURE NAME, SCORE*, STRAND]

*Any string could be used (p-value, enrichment score…)
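Because BED starts are zero-based while GFF starts are 1-based, converting between the two only shifts the start coordinate. The sketch below illustrates this; the function names `parse_bed_line`, `bed_to_one_based` and `one_based_to_bed` are hypothetical, the example record is invented, and only the first six columns are read.

```python
def parse_bed_line(line):
    """Read the 3 required BED fields plus name, score and strand if present."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("a BED line needs at least chrom, start and end")
    record = {
        "chrom": fields[0],
        "start": int(fields[1]),   # zero-based
        "end": int(fields[2]),     # one-based (i.e. the end is excluded)
    }
    # Optional columns 4-6 are kept as strings; the score is left untouched.
    for key, value in zip(("name", "score", "strand"), fields[3:6]):
        record[key] = value
    return record


def bed_to_one_based(start, end):
    """BED (0-based start, 1-based end) -> 1-based inclusive, as in GFF."""
    return start + 1, end


def one_based_to_bed(start, end):
    """1-based inclusive (GFF-style) -> BED coordinates."""
    return start - 1, end


# Invented example: the same 1001 bp feature as GFF positions 1000..2000.
bed = parse_bed_line("chr1\t999\t2000\tdemo\t0\t+")
print(bed_to_one_based(bed["start"], bed["end"]))   # (1000, 2000)
print(bed["end"] - bed["start"])                    # feature length: 1001
```

A practical consequence of this convention is that a BED feature length is simply `end - start`, with no `+1`.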

7. thickStart - The starting position at which the feature is drawn thickly.
8. thickEnd - The ending position at which the feature is drawn thickly.
9. itemRgb - An RGB value of the form R,G,B (e.g. 255,0,0).
10. blockCount - The number of blocks (exons) in the BED line.
11. blockSizes - A comma-separated list of the block sizes.
12. blockStarts - A comma-separated list of block starts.
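The three block fields together describe where each exon lies. As a rough sketch, the hypothetical helper `bed12_blocks` below turns them into genomic intervals; it relies on the UCSC convention that block starts are given relative to the feature start (column 2), and it tolerates the trailing comma that UCSC files often carry.

```python
def bed12_blocks(chrom_start, block_count, block_sizes, block_starts):
    """Return one (start, end) pair per block, in BED coordinates.

    chrom_start  -- column 2 of the BED line (zero-based feature start)
    block_count  -- column 10
    block_sizes  -- column 11, e.g. "567,488,"
    block_starts -- column 12, e.g. "0,3512" (offsets from chrom_start)
    """
    sizes = [int(x) for x in block_sizes.split(",") if x]
    starts = [int(x) for x in block_starts.split(",") if x]
    if len(sizes) != block_count or len(starts) != block_count:
        raise ValueError("blockCount does not match blockSizes/blockStarts")
    return [
        (chrom_start + offset, chrom_start + offset + size)
        for offset, size in zip(starts, sizes)
    ]


# Feature starting at 1000 with two blocks of 567 and 488 bases.
print(bed12_blocks(1000, 2, "567,488,", "0,3512"))
# [(1000, 1567), (4512, 5000)]
```

The gaps between consecutive blocks (here 1567 to 4512) correspond to introns when the feature is a transcript.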


Different BED Formats


BED Format

► BEDTools supports a wide range of operations for interrogating and manipulating genomic features.

intersectBed: Returns overlapping features between two BED/GFF/VCF files.

genomeCoverageBed: Histogram or a "per base" report of genome coverage.
…
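To make the idea of "overlapping features" concrete, here is a naive Python sketch of an interval intersection. It is an illustration only and not how BEDTools is implemented: the names `overlaps` and `intersect` are hypothetical, features are plain `(chrom, start, end)` tuples in BED coordinates, and every pair is compared, which is far too slow for real data sets.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) features share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def intersect(features_a, features_b):
    """Report the shared part of every overlapping pair of features."""
    hits = []
    for a in features_a:
        for b in features_b:
            if overlaps(a, b):
                # The overlap runs from the later start to the earlier end.
                hits.append((a[0], max(a[1], b[1]), min(a[2], b[2])))
    return hits


# Invented example intervals.
genes = [("chr1", 100, 200), ("chr1", 500, 600)]
peaks = [("chr1", 150, 250), ("chr2", 100, 200)]
print(intersect(genes, peaks))   # [('chr1', 150, 200)]
```

With BED coordinates, two features that merely touch (one ends at 200, the next starts at 200) do not overlap, which is why the comparisons are strict.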

► Coverage

The number of times each nucleotide is "read" → fold coverage (number X)
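The definition above can be worked through on a toy example. The sketch below counts, for each position of a small region, how many reads cover it, then averages those counts to get the fold coverage. The names `per_base_coverage` and `mean_fold_coverage` are hypothetical, reads are `(start, end)` pairs in BED coordinates relative to the region, and the reads themselves are invented.

```python
def per_base_coverage(reads, region_length):
    """Depth at each position of a region of the given length."""
    # Mark +1 where a read starts and -1 just past where it ends.
    diff = [0] * (region_length + 1)
    for start, end in reads:
        start = max(start, 0)
        end = min(end, region_length)
        if start < end:
            diff[start] += 1
            diff[end] -= 1

    # A running sum of the marks gives the depth at each position.
    depth = []
    running = 0
    for delta in diff[:region_length]:
        running += delta
        depth.append(running)
    return depth


def mean_fold_coverage(depth):
    """Average depth over the region, i.e. the 'X' in '30X coverage'."""
    return sum(depth) / len(depth) if depth else 0.0


reads = [(0, 4), (2, 6), (2, 8)]          # three reads on a 10-base region
depth = per_base_coverage(reads, 10)
print(depth)                               # [1, 1, 3, 3, 2, 2, 1, 1, 0, 0]
print(mean_fold_coverage(depth))           # 1.4
```

The same number can be checked by hand: the reads contain 4 + 4 + 6 = 14 bases spread over 10 positions, giving 1.4X.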


http://wiki.bits.vib.be/index.php/NGS-formats

VCF Format: coordinates are 1-based

Variant calling and VCF format
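Since VCF positions are 1-based and BED starts are zero-based, displaying a variant as a BED-style interval requires shifting the position by one. The sketch below assumes the standard VCF column order (CHROM, POS, ID, REF, ALT, ...) and spans the bases of the REF allele; the name `vcf_record_to_bed` is hypothetical, the records are invented, and symbolic alleles or an `END` value in the INFO column are not handled.

```python
def vcf_record_to_bed(line):
    """Map one VCF data line to (chrom, start, end) in BED coordinates."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 5:
        raise ValueError("expected at least CHROM, POS, ID, REF and ALT")
    chrom = fields[0]
    pos = int(fields[1])       # 1-based position of the first REF base
    ref = fields[3]
    start = pos - 1            # shift to a zero-based start
    end = start + len(ref)     # the interval covers every REF base
    return chrom, start, end


# Invented records: a SNV and a 2-base deletion.
print(vcf_record_to_bed("chr1\t1000\t.\tA\tG\t50\tPASS\t."))     # ('chr1', 999, 1000)
print(vcf_record_to_bed("chr1\t1000\t.\tACG\tA\t50\tPASS\t."))   # ('chr1', 999, 1002)
```

Header lines (starting with `#`) must be skipped before calling this function.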