Genome browsers: Discovering biology through genomics. BaRC Hot Topics, April 2013. George Bell, Ph.D. http://jura.wi.mit.edu/bio/education/hot_topics/

Post on 23-Mar-2021


Page 1: Discovering biology through genomics, barc.wi.mit.edu/...2013/...Apr_2013.color_slides.pdf

Genome browsers:

Discovering biology through genomics

BaRC Hot Topics – April 2013

George Bell, Ph.D.
http://jura.wi.mit.edu/bio/education/hot_topics/

Page 2:

Today's outline

• Genome browser introduction
• Popular types of genome browsers
  – UCSC Genome Browser
  – Integrative Genomics Viewer (IGV)
  – Ensembl
  – GBrowse (SGD, FlyBase, WormBase, TAIR, ZFIN, HapMap, Planarian at Whitehead)
• Browser file formats for custom data tracks
• Throughout the talk: Mining the genome

Page 3:

Resources

• Genome browser tutorial materials
  – http://genome.ucsc.edu/training.html
  – http://www.broadinstitute.org/igv/UserGuide
  – http://www.ensembl.org/info/website/tutorials/index.html
  – http://gmod.org/wiki/GBrowse
• Browser file formats: http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html#CustomTracks
• Previous Hot Topics (http://jura.wi.mit.edu/bio/education/hot_topics/)
• OpenHelix training materials (some free)
  – http://www.openhelix.com/cgi/freeTutorials.cgi
• BaRC scientists

Page 4:

Genome browser components

• Genome sequence (partially or fully assembled)
• Graphics + data browsing/searching system
• Collection of data (qualitative and quantitative) linked to
  – genome coordinates
  – genome features linked to genome coordinates
• System to view custom data
• Algorithm to align sequences to genome

Page 5:

Practical hints

• Take careful notes of genome assembly for
  – All coordinates
  – All custom browser files
• Genome is updated infrequently
• Data in genome browser can be updated as often as daily
• Data displayed in genome browser is often generated by others
• Try out different genome browsers

Page 6:

UCSC Genome Browser

Page 7:

UCSC: Demo and exercise 1

• Does the RefSeq gene catalog contain the correct isoforms of your favorite human gene?
• Provide evidence from primary sequence
• Examples: WASH2P, BMP4

Page 8:

UCSC: Demo and exercise 2

• Get the promoter of your favorite gene (defined as 2 kb upstream to 2 kb downstream of the transcription start site)
• Examples: BMP4, SERPIND1
• According to ENCODE, do any transcription factors bind this promoter?
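The promoter definition in this exercise (2 kb upstream to 2 kb downstream of the transcription start site) depends on strand, which is easy to get backwards. A minimal Python sketch of the arithmetic follows. The function names, the gene name, and the example TSS position are hypothetical and not taken from the slides, and it assumes the TSS is given as a 1-based position.

```python
def promoter_window(tss, strand, upstream=2000, downstream=2000):
    """Return (start, end) of the promoter in 1-based, end-inclusive coordinates.

    tss is the 1-based position of the transcription start site.
    "Upstream" means lower coordinates on the + strand and higher
    coordinates on the - strand.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    # Do not run off the beginning of the chromosome.
    return max(1, start), end


def to_bed_line(chrom, start, end, name, strand):
    """Format a 1-based inclusive interval as a 6-column BED line.

    BED starts are 0-based and BED ends are exclusive, so only the
    start shifts by one.
    """
    return "\t".join([chrom, str(start - 1), str(end), name, "0", strand])


# Hypothetical gene with its TSS at position 10,000 on the minus strand.
start, end = promoter_window(10000, "-")
print(to_bed_line("chr1", start, end, "example_promoter", "-"))
```

A BED line like this could be loaded as a custom track, or the interval could be typed into the browser position box as chr1:8000-12000.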

Page 9:

Integrative Genomics Viewer (IGV)

Page 10:

IGV: Demo and exercise 3

• Using the Illumina Body Map RNA-Seq data on IGV,
  – Is GATA4 really expressed at a higher level in heart than in skeletal muscle?
  – Why isn't this comparison of mapped reads quantitative?
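One likely part of the answer to the last question is that raw read pileups depend on how deeply each sample was sequenced, so the same gene can look taller in one track simply because that library is larger. A common normalization is RPKM (reads per kilobase per million mapped reads). The sketch below uses made-up numbers, not values from the Body Map data, and is only meant to show the arithmetic.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene length per million mapped reads."""
    per_kilobase = read_count / (gene_length_bp / 1_000)
    return per_kilobase / (total_mapped_reads / 1_000_000)


# Hypothetical numbers for one gene in two libraries of different depth.
heart = rpkm(read_count=400, gene_length_bp=2_000, total_mapped_reads=80_000_000)
muscle = rpkm(read_count=200, gene_length_bp=2_000, total_mapped_reads=20_000_000)

# The heart sample has twice the raw reads but half the normalized signal.
print(f"heart RPKM = {heart}, muscle RPKM = {muscle}")
```

With these invented counts the sample with more raw reads ends up with the lower normalized value, which is the kind of reversal a visual comparison of coverage tracks cannot reveal.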

Page 11:

IGV: Demo and exercise 4

• Using the Illumina Body Map RNA-Seq data on IGV,
  – Does the heart subject have any variants in GATA4? Where?
  – Center the variant(s) in the display, zoom in all the way, and save that view as a session.
  – Beyond IGV: Is this variant a known SNP?

Page 12:

Ensembl: more than a browser

• An automated genome annotation pipeline
• Includes thorough homology analysis via Compara
• Hosts hand-curated gene annotation projects (Vega; Havana)
• All data can be downloaded in a variety of ways
• BioMart is a powerful web interface to the Ensembl databases

Page 13:

Ensembl gene pages

Page 14:

Ensembl: Demo and exercise 5

• Go to the Ensembl page for mouse Uox (urate oxidase)
• Download Uox homologs (in FASTA format) from as many species as possible
• Is this gene missing in any primates?

Page 15:

Ensembl: Demo and exercise 6

• Use BioMart to get a list of all human genes on chromosome 1 and corresponding mouse homologs
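The same BioMart request can also be written down as an XML query, which makes the choice of dataset, filter, and attributes explicit. The sketch below only builds the query text and does not contact any server. The dataset, filter, and attribute names are the ones I believe Ensembl BioMart uses for this task, but treat them as assumptions and check them against the BioMart web interface. The helper function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filters, attributes):
    """Build a BioMart-style XML query string (no network access)."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",       # tab-separated output
        "header": "1",            # include a header row
        "uniqueRows": "1",        # drop duplicate rows
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# Assumed names: human gene dataset, chromosome filter, mouse homolog attribute.
xml_query = build_biomart_query(
    dataset="hsapiens_gene_ensembl",
    filters={"chromosome_name": "1"},
    attributes=["ensembl_gene_id", "mmusculus_homolog_ensembl_gene"],
)
print(xml_query)
```

Keeping the query as text is a convenient way to record exactly what was asked, alongside the Ensembl release and genome assembly used.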

Page 16:

GBrowse (many MODs)

Page 17:

GBrowse: Demo and exercise 7

• Go to TAIR (The Arabidopsis Information Resource)
• Find GBrowse (under Tools)
• Find gene AT2G19420
• What non-coding gene overlaps it?
• Download a GFF file of these genes and view it in Excel.
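A GFF file opens cleanly in Excel because each feature is one line of nine tab-separated columns. As an alternative to a spreadsheet, the sketch below reads GFF lines in Python and checks whether two features overlap. The two records are made up for illustration and are not the real TAIR annotation around AT2G19420.

```python
import io

# The nine standard GFF columns, in order.
GFF_COLUMNS = ["seqid", "source", "type", "start", "end",
               "score", "strand", "phase", "attributes"]


def read_gff(handle):
    """Parse GFF lines into a list of dicts, skipping comments and blank lines."""
    rows = []
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            raise ValueError(f"expected 9 columns, got {len(fields)}: {line!r}")
        row = dict(zip(GFF_COLUMNS, fields))
        row["start"] = int(row["start"])  # GFF coordinates are 1-based, inclusive
        row["end"] = int(row["end"])
        rows.append(row)
    return rows


def overlaps(a, b):
    """True if two features on the same sequence share at least one base."""
    return a["seqid"] == b["seqid"] and a["start"] <= b["end"] and b["start"] <= a["end"]


# Made-up example records.
example = io.StringIO(
    "##gff-version 3\n"
    "Chr2\texample\tgene\t1000\t2500\t.\t+\t.\tID=geneA\n"
    "Chr2\texample\tgene\t2000\t3000\t.\t-\t.\tID=geneB\n"
)
genes = read_gff(example)
print(len(genes), overlaps(genes[0], genes[1]))
```

For a downloaded file, `open("some_file.gff")` can be passed to `read_gff` in place of the in-memory example.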

Page 18:

Viewing custom data

• Almost any data can be viewed in a genome browser as long as it is
  – Linked to genome coordinates
  – Organized in a standard format that is
    • qualitative (ex: bed, bam), or
    • quantitative (ex: wig, bedgraph)
• Different formats use different counting schemes (starting at 0 or 1), so off-by-one bugs are easy to make
• BAM files need to be sorted and indexed first
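To make the counting-scheme point concrete: BED intervals use 0-based starts with end-exclusive ends, while GFF intervals are 1-based and end-inclusive. A small sketch of the conversion follows (the function names are my own). Only the start coordinate changes, which is why the bug is easy to miss.

```python
def bed_to_one_based(start, end):
    """BED (0-based start, exclusive end) -> 1-based, inclusive interval."""
    return start + 1, end


def one_based_to_bed(start, end):
    """1-based, inclusive interval (as in GFF) -> BED coordinates."""
    return start - 1, end


def bed_length(start, end):
    # With an exclusive end, the length is a plain difference.
    return end - start


def one_based_length(start, end):
    # With an inclusive end, both endpoints count.
    return end - start + 1


# The first 100 bases of a chromosome in both conventions.
bed = (0, 100)
gff = bed_to_one_based(*bed)
print(bed, gff, bed_length(*bed), one_based_length(*gff))
```

Both conventions describe the same 100 bases here, so comparing interval lengths before and after conversion is a quick sanity check.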

Page 19:

Demo and exercise 8

• Go to UCSC (http://membrane.wi.mit.edu/ - WI only) or IGV
• Locate track files in \\BaRC_Public\Hot_Topics\Genome_browsers_Apr_2013
• Add the 4 tracks to the browser (mm9)
  – TargetScanMouse6_mm9.chr3.bed
  – TargetScanMouse6_mm9.chr3.bedgraph
  – CGH.mm9.chr3-4.wig
  – track type=bam name="Heart BAM" bigDataUrl=http://tak.wi.mit.edu/barc_ucsc/Hot_Topics/Genome_browsers_Apr_2013/HeartCellRNASeq.bam
• Look at some chr3 genes (ex: Pfn2, Serp1, Ssr3, Hdgf)
• Optimize the display modes of your custom tracks
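The BAM entry in this exercise is not a file to upload but a one-line "track line" that points the browser at a remote file. Track lines are space-separated key=value settings, with quotes around values that contain spaces. The sketch below assembles such lines; the helper name is hypothetical, and the bedGraph name and data rows are invented for illustration.

```python
def track_line(name, track_type=None, **settings):
    """Assemble a UCSC-style custom track line from key=value settings."""
    parts = ["track"]
    if track_type:
        parts.append(f"type={track_type}")
    parts.append(f'name="{name}"')
    for key, value in settings.items():
        value = str(value)
        if " " in value:
            # Values containing spaces must be quoted.
            value = f'"{value}"'
        parts.append(f"{key}={value}")
    return " ".join(parts)


# The BAM track from the exercise, pointing at the remote file.
bam = track_line(
    "Heart BAM", "bam",
    bigDataUrl="http://tak.wi.mit.edu/barc_ucsc/Hot_Topics/"
               "Genome_browsers_Apr_2013/HeartCellRNASeq.bam",
)
print(bam)

# A tiny bedGraph custom track with invented values (chrom, start, end, value).
rows = [("chr3", 1000, 1100, 0.5), ("chr3", 1100, 1200, 1.25)]
lines = [track_line("Example scores", "bedGraph", description="made-up values")]
lines += ["\t".join(str(field) for field in row) for row in rows]
print("\n".join(lines))
```

The text printed by the second example could be pasted into a browser's custom track box or saved to a file with a .bedgraph extension.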

Page 20:

Other notable browsers

• JBrowse
• Golden Helix GenomeBrowse
• WashU Epigenome Browser
• UCSC Cancer Genome Browser
• 1000 Genomes Browser

Page 21:

Summary

• Genome browser introduction
• Popular types of genome browsers
  – UCSC Genome Browser
  – Integrative Genomics Viewer (IGV)
  – Ensembl
  – GBrowse (SGD, FlyBase, WormBase, TAIR, ZFIN, Planarian at Whitehead)
• Browser file formats for custom data tracks

Page 22:

Browser locations

• UCSC Genome Browser:
  – http://genome.ucsc.edu/
  – http://membrane.wi.mit.edu/ (inside Whitehead network)
• IGV: http://www.broadinstitute.org/software/igv/download
• Ensembl: http://www.ensembl.org/
• GBrowse: http://gmod.org/wiki/GBrowse