Post on 13-Jan-2016

UCSC Genome Browser

Dror Hollander

Gil Ast Lab

Sackler Medical School

Understanding the Genome

DNA

RNA

protein

miRNA

gene expression

non-coding RNA

secondary structure

alternative splicing

exon-intron structure

histone modifications

genes

GC content

promoters

repetitive elements

conservation

SNPs

nucleosome occupancy

How can you examine a genomic segment while taking all of these factors into account?

Lecture Overview

UCSC Genome Browser

Interface & selected tracks

Detecting alternative splicing events

Chromatin organization & epigenetics

BLAT

PCR

Galaxy

UCSC Genome Browser

Basic design: “the Genome Browser stacks annotation tracks beneath genome coordinate positions, allowing rapid visual correlation of different types of information”

genome

track

track

(64 eukaryote genomes)

Genome Browsing…

Basic Genome Browser Interface

UTR

CDS

intron

gene direction (> / <)

chromosomal position

genomic coordinates

zoom

RefSeq Genes track

UCSC Genes track

mark and drag here to zoom in

• Black - feature has a corresponding entry in the Protein Data Bank (PDB)

• Dark blue - transcript has been reviewed or validated by either the RefSeq, SwissProt or CCDS staff

• Medium blue - other RefSeq transcripts

• Light blue - non-RefSeq transcripts

start codons in green

stop codons in red
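The chromosomal position shown in the browser is written as chromosome, start and end, and it is 1-based and inclusive, whereas BED-style files are 0-based and half-open. The sketch below shows one way to parse such a position string and convert between the two conventions. The function names and the example coordinates are made up for illustration and are not part of any UCSC tool.

```python
import re


def parse_position(position):
    """Parse a browser-style position such as 'chr1:1,000-2,000'.

    Returns (chrom, start, end) in 1-based, inclusive coordinates.
    """
    match = re.fullmatch(r"(\w+):([\d,]+)-([\d,]+)", position.strip())
    if match is None:
        raise ValueError(f"unrecognized position: {position!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinate range: {position!r}")
    return chrom, start, end


def to_bed(chrom, start, end):
    """Convert 1-based inclusive coordinates to 0-based half-open (BED style)."""
    return chrom, start - 1, end


def window_length(start, end):
    """Number of bases in a 1-based inclusive window."""
    return end - start + 1


# Hypothetical window, chosen only to show the arithmetic.
chrom, start, end = parse_position("chr1:1,000-2,000")
print(to_bed(chrom, start, end))
print(window_length(start, end))
```

Keeping the two conventions in separate, named functions makes off-by-one errors easier to spot when moving between the browser display and downloaded tables.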

Basic Genome Browser Interface

Configure track visualization:

Basic Genome Browser Interface

“RefSeq track shows known human protein-coding and non-protein-coding genes taken from the NCBI RNA reference sequences collection (RefSeq). The data are updated daily”

“The UCSC track shows gene predictions based on data from RefSeq, Genbank, CCDS and UniProt… includes both protein-coding and putative non-coding transcripts… Compared to RefSeq, this gene set has generally about 10% more protein-coding genes, approximately five times as many putative non-coding genes, and about twice as many splice variants”

Let’s examine a few examples online…

Basic Genome Browser Interface

A few more tracks worth mentioning:

miRNA (Genes and Gene Prediction Tracks -> sno/miRNA)

conservation (Comparative Genomics -> Conservation)

Expression tracks

Regulation tracks (chromatin structure and modifications, DNA methylation, etc.; includes ENCODE data)

RNA secondary structure (Genes and Gene Prediction Tracks -> EvoFold)

SNPs (Variation and Repeats -> SNPs)

Basic Genome Browser Interface

Convert browser window to an image file:

Get genomic DNA for the viewed coordinates:

Convert sequence to a different genome assembly or genome:

Detecting Alternative Splicing Events

Via Human mRNAs & Spliced ESTs tracks (mRNA and EST Tracks)

gene DNA

mRNA

“The mRNA track shows alignments between human mRNAs in GenBank and the genome”

“…alignments between human expressed sequence tags (ESTs) in GenBank and the genome… ESTs are single-read sequences, typically about 500 bases in length”

Detecting Alternative Splicing Events

Via Alt Events track (Genes and Gene Prediction Tracks) – based on UCSC genes

gene DNA

>cassetteExon

Detecting Alternative Splicing Events

Via Burge RNA-seq track (Expression)

Click on the track name to choose tissues

gene DNA

Burge RNA-seq

Different Alternative Splicing Types

Exon skipping

Alternative splice site (3’)

Intron retention
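The three event types above can also be recognized by comparing the exon coordinates of two transcripts of the same gene. The following is a minimal sketch, assuming plus-strand transcripts given as sorted lists of 0-based, half-open (start, end) exons. The function names and all coordinates are hypothetical, and this is not how the Alt Events track itself is computed.

```python
def introns(exons):
    """Return the introns lying between consecutive exons of one transcript."""
    return [(left_end, right_start)
            for (_, left_end), (right_start, _) in zip(exons, exons[1:])]


def compare_transcripts(ref, alt):
    """List splicing differences of `alt` relative to `ref` (plus strand)."""
    events = []
    ref_introns = introns(ref)
    alt_introns = introns(alt)

    # Exon skipping: an internal ref exon falls entirely inside an alt intron.
    for start, end in ref[1:-1]:
        if any(i_start <= start and end <= i_end for i_start, i_end in alt_introns):
            events.append(("exon skipping", start, end))

    # Intron retention: a ref intron falls entirely inside a single alt exon.
    for i_start, i_end in ref_introns:
        if any(start < i_start and i_end < end for start, end in alt):
            events.append(("intron retention", i_start, i_end))

    # Alternative 3' splice site: same exon end, different acceptor position,
    # with the alt acceptor downstream of the previous ref exon.
    for k in range(1, len(ref)):
        prev_end = ref[k - 1][1]
        r_start, r_end = ref[k]
        for a_start, a_end in alt[1:]:
            if a_end == r_end and a_start != r_start and prev_end < a_start:
                events.append(("alternative 3' splice site", a_start, a_end))

    return events


# Hypothetical three-exon reference transcript and three variants.
reference = [(0, 100), (200, 300), (400, 500)]
print(compare_transcripts(reference, [(0, 100), (400, 500)]))
print(compare_transcripts(reference, [(0, 100), (200, 500)]))
print(compare_transcripts(reference, [(0, 100), (220, 300), (400, 500)]))
```

For minus-strand genes the roles of exon starts and ends swap, so the 3' splice site test would need to compare exon ends instead.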

Histone Modifications

Transcription Factor Binding

DNA Methylation

BLAT

BLAT = BLAST-Like Alignment Tool

BLAT is designed to find sequences of 95% or greater similarity on DNA (25 bases or longer) and 80% or greater similarity on protein (20 amino acids or longer)
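BLAT reports its hits in PSL format, a 21-column tab-separated table whose first three columns are matches, misMatches and repMatches. A rough percent identity can be estimated from those columns, as sketched below. This is a simplification that ignores gaps and is not the score the browser itself reports. The function name and the example alignment line are invented for illustration.

```python
def psl_identity(psl_line):
    """Return a simplified percent identity for one PSL alignment line.

    Identity here is (matches + repMatches) over all aligned bases;
    insertions and deletions are ignored.
    """
    columns = psl_line.rstrip("\n").split("\t")
    if len(columns) != 21:
        raise ValueError(f"expected 21 PSL columns, got {len(columns)}")
    matches, mismatches, rep_matches = (int(value) for value in columns[:3])
    aligned = matches + mismatches + rep_matches
    if aligned == 0:
        raise ValueError("alignment has no aligned bases")
    return 100.0 * (matches + rep_matches) / aligned


# Invented PSL record: 96 matching and 4 mismatching bases in one block.
example = "\t".join([
    "96", "4", "0", "0",        # matches, misMatches, repMatches, nCount
    "0", "0", "0", "0",         # query and target insert counts and bases
    "+", "query1", "100", "0", "100",      # strand, qName, qSize, qStart, qEnd
    "chr1", "1000000", "5000", "5100",     # tName, tSize, tStart, tEnd
    "1", "100,", "0,", "5000,",            # blockCount, blockSizes, qStarts, tStarts
])

identity = psl_identity(example)
print(identity, identity >= 95.0)
```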

BLAT query

BLAT query

PCR

temperatures

amplicon in fasta format

coordinates

strand

primers

amplicon
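The idea behind an in-silico PCR search can be sketched in a few lines: locate the forward primer, locate the reverse complement of the reverse primer downstream of it, and report the sequence in between as the amplicon. The sketch below assumes exact primer matches on the plus strand of a short template and uses the Wallace rule as a crude melting temperature estimate. The function names, the `max_size` parameter and the toy sequences are assumptions for illustration and do not reproduce the UCSC tool's algorithm.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def wallace_tm(primer):
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))


def in_silico_pcr(template, forward, reverse, max_size=4000):
    """Find exact-match amplicons on the plus strand of `template`.

    Returns a list of (start, end, amplicon) with 0-based half-open coordinates.
    """
    template = template.upper()
    forward = forward.upper()
    reverse_site = reverse_complement(reverse.upper())
    hits = []
    f_pos = template.find(forward)
    while f_pos != -1:
        r_pos = template.find(reverse_site, f_pos + len(forward))
        while r_pos != -1:
            end = r_pos + len(reverse_site)
            if end - f_pos > max_size:
                break  # later reverse sites only give longer products
            hits.append((f_pos, end, template[f_pos:end]))
            r_pos = template.find(reverse_site, r_pos + 1)
        f_pos = template.find(forward, f_pos + 1)
    return hits


# Toy template and primers, far shorter than real primers.
template = "TTTTACGTACGGGGCCCCTTGACAAAAA"
for start, end, amplicon in in_silico_pcr(template, "ACGTAC", "TGTCAA"):
    print(f">toy:{start}-{end} {end - start}bp")
    print(amplicon)
print(wallace_tm("ACGTAC"))
```

The printed header and sequence mimic the coordinates-plus-FASTA layout of the output; a minus-strand search would run the same function with the two primers swapped.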

Galaxy

“Galaxy allows you to do analyses you cannot do anywhere else without the need to install or download anything. You can analyze multiple alignments, compare genomic annotations, profile metagenomic samples and much much more...”

Galaxy – What Is It Good for?

Getting the best out of UCSC

Operating on UCSC data

Supports operations both at the interval level, and at the sequence level

Designed for biologists!

Galaxy – Typical Workflow

Extract sets of coordinates

either upload from computer or from UCSC table browser

Operate on different sets of coordinates (intersect, subtract etc.)

Fetch genomic sequences of coordinates
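The interval-level operations in this workflow can be illustrated without Galaxy. The sketch below is a minimal version of intersect and subtract, assuming each coordinate set is a list of (chrom, start, end) tuples in 0-based, half-open coordinates. The function names and the example intervals are hypothetical, and Galaxy's own tools offer more options (for example, returning whole overlapping intervals rather than overlapping pieces).

```python
def intersect(a, b):
    """Return the overlapping pieces between interval sets a and b."""
    pieces = []
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a != chrom_b:
                continue
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:
                pieces.append((chrom_a, start, end))
    return sorted(pieces)


def subtract(a, b):
    """Return the parts of intervals in a that no interval in b covers."""
    result = []
    for chrom, start, end in a:
        remaining = [(start, end)]
        for chrom_b, start_b, end_b in b:
            if chrom_b != chrom:
                continue
            trimmed = []
            for s, e in remaining:
                if end_b <= s or e <= start_b:
                    trimmed.append((s, e))  # no overlap, keep as is
                    continue
                if s < start_b:
                    trimmed.append((s, start_b))  # piece left of b
                if end_b < e:
                    trimmed.append((end_b, e))  # piece right of b
            remaining = trimmed
        result.extend((chrom, s, e) for s, e in remaining)
    return sorted(result)


# Hypothetical coordinate sets, e.g. exons and regions from another track.
exons = [("chr1", 100, 200), ("chr1", 300, 400)]
regions = [("chr1", 150, 350), ("chr2", 100, 200)]
print(intersect(exons, regions))
print(subtract(exons, regions))
```

The nested loops are quadratic in the number of intervals, which is fine for a handful of regions; genome-scale sets call for sorted sweeps or interval trees.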