browsing genes and genomes with ensembl · 2016-06-08 · variation data in ensembl and the ensembl...

Emily Perry Ensembl Outreach Project Leader EMBL-EBI Browsing Genes and Genomes with Ensembl

Page 1: Browsing Genes and Genomes with Ensembl · 2016-06-08 · Variation data in Ensembl and the Ensembl VEP Denise Carvalho-Silva 21st April Comparing genes and genomes with Ensembl Compara

Emily Perry

Ensembl Outreach Project Leader

EMBL-EBI

Browsing Genes and Genomes with Ensembl

Objectives

• What is Ensembl?

• What type of data can you get in Ensembl?

• How to navigate the Ensembl browser website.

• Where to go for help and documentation.

This webinar course

Date         Webinar topic                                                                 Instructor
24th March   Introduction to Ensembl                                                       Emily Perry
31st March   Ensembl genes                                                                 Denise Carvalho-Silva
7th April    Data export with BioMart                                                      Helen Sparrow
14th April   Variation data in Ensembl and the Ensembl VEP                                 Denise Carvalho-Silva
21st April   Comparing genes and genomes with Ensembl Compara                              Helen Sparrow
28th April   Finding features that regulate genes – the Ensembl Regulatory Build           Emily Perry
5th May      Uploading your data to Ensembl and advanced ways to access Ensembl data       Ben Moore

Structure

Presentation:
• What the Ensembl Regulatory Build is
• How we produce/process the data

Demo: Getting regulatory data

Exercises: On the Train online course

Questions?

• We’ve muted all the mics

• Ask questions in the Chat box in the webinar interface

• My Ensembl colleagues will respond during the talk

• There’s no threading so please respond with @username

Helen Sparrow Ben Moore Denise Carvalho-Silva

Course exercises

http://www.ebi.ac.uk/training/online/course/ensembl-browser-webinar-series-2016

This text will be replaced by a YouTube (link to YouKu too) video of the webinar and a pdf of the slides.

The “next page” will be the exercises

A link to exercises and their solutions will appear in the page hierarchy

Get help with the exercises

• Use the exercise solutions in the online course

• Join our Facebook group and discuss the exercises with everybody (see the online course for the link)

• Email us [email protected]

Quick polls

• Poll 1: Did you attend the previous webinars?

• Poll 2: Have you done the previous exercises?

EBI is an Outstation of the European Molecular Biology Laboratory.

Module 6: Ensembl Regulation

Overview

• Epigenetics in gene regulation

• Methods in epigenetics

• Ensembl Regulation

• Data sources

• Our build

Epigenetics

The study of heritable changes in gene activity that occur without changes in the DNA sequence.

Epigenetic changes are known to regulate gene expression.

Epigenetic change -> cell differentiation

• Cells carry out different functions

• Cells are morphologically different

• Cells express different genes

• Cells have different epigenomes

Epigenetic change -> cell differentiation

Stem cell

Differentiated cell

Promoters, enhancers etc.

• TF-binding at promoters and enhancers is necessary for transcription

• Combinations of epigenetic marks affect the ability and probability of TF-binding at these sites

Methods of gene regulation

Method of regulation            Technique
Histone modifications           ChIP-seq
Transcription factor binding    ChIP-seq
Open/closed chromatin           DNase sensitivity
DNA methylation                 Bisulfite sequencing

Histone modifications

We describe histone modifications using the form Subunit, Amino acid, Position, Modification, e.g. H3K36me3 (histone H3, lysine K, position 36, trimethylation).
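The naming scheme above can be sketched as a small parser. This is only an illustration: the `HistoneMark` type and `parse_histone_mark` function are hypothetical names, and the pattern covers only the subunits and modifications that appear on these slides (me1, me2, me3, ac), not the full range used in the literature.

```python
import re
from typing import NamedTuple


class HistoneMark(NamedTuple):
    subunit: str       # e.g. "H3" or "H2B"
    amino_acid: str    # one-letter amino acid code, e.g. "K" for lysine
    position: int      # residue position in the histone tail
    modification: str  # e.g. "me3" (trimethylation) or "ac" (acetylation)


# Subunit, amino acid, position, modification, in that order.
_PATTERN = re.compile(r"^(H(?:1|2A|2B|3|4))([A-Z])(\d+)(me[1-3]|ac)$")


def parse_histone_mark(name: str) -> HistoneMark:
    """Split a name such as 'H3K36me3' into its four parts."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognised histone modification name: {name!r}")
    subunit, amino_acid, position, modification = match.groups()
    return HistoneMark(subunit, amino_acid, int(position), modification)


if __name__ == "__main__":
    print(parse_histone_mark("H3K36me3"))
    print(parse_histone_mark("H2BK5ac"))
```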

Histone code

[Table: modifications (me1, me2, me3, ac) against histones (H3K4, H3K9, H3K14, H3K27, H3K79, H4K20, H2BK5)]

ChIP-seq for histone mods & TF-binding

• Crosslink the DNA-binding protein to the DNA (covalent bond)

• Shear the genome

• Pull down the protein with an antibody

• Remove crosslinks and wash

• Sequence fragments

ACGCTGACTAGAATCAATGGCTTCTCTTCGCATATGGCTGACTA

TF motifs vs ChIP-seq peaks

• CTCF binding motif – 19 bases

• ChIP-seq sonication fragment – 200 bases

• Read length – 50 bases, e.g. ATTTAGTTCCCTAGATCTGATCTAATCATCGGATCTATAGCCGATCGTAG

[Figure: reads pile up into a peak of about 400 bp over a 19 bp motif on the DNA]
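The gap between a 19-base motif and a roughly 400 bp peak follows from simple arithmetic on the fragment length. The sketch below is a back-of-envelope model, not how peaks are actually called: it assumes every fragment is exactly 200 bases long and asks how wide a region is covered by all fragments that overlap the motif by at least a given number of bases. The function name `peak_span` is hypothetical.

```python
def peak_span(motif_len: int, fragment_len: int, min_overlap: int) -> int:
    """Width (bp) of the region covered by all fragments of a fixed length
    that overlap the motif by at least `min_overlap` bases.

    With the motif at positions 0..motif_len-1, the leftmost qualifying
    fragment starts at min_overlap - fragment_len and the rightmost one
    ends at motif_len - min_overlap + fragment_len - 1.
    """
    if not 1 <= min_overlap <= motif_len <= fragment_len:
        raise ValueError("expected 1 <= min_overlap <= motif_len <= fragment_len")
    leftmost_start = min_overlap - fragment_len
    rightmost_end = motif_len - min_overlap + fragment_len - 1
    return rightmost_end - leftmost_start + 1


if __name__ == "__main__":
    # Fragments must contain the whole 19-base motif.
    print(peak_span(motif_len=19, fragment_len=200, min_overlap=19))
    # Fragments only need to touch the motif by one base.
    print(peak_span(motif_len=19, fragment_len=200, min_overlap=1))
```

Under these assumptions the covered region is 381 bp if fragments must contain the whole motif and 417 bp if a single overlapping base is enough, both close to the 400 bp peak shown on the slide.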

Open/closed chromatin

Open chromatin is transcriptionally active.

Closed chromatin is inactive.

DNase hypersensitivity

• DNase treatment

• Purify

• Sequence and compare to reference

DNA methylation -> inactive

[Figure: double-stranded DNA with CH3 groups attached to cytosines on both strands]

GGCGGGATTGCGCGTTAGATCGCGCGCTTATGCTAGCCGCGCTGATAGC

GCTATCAGCGCGGCTAGCATAAGCGCGCGATCTAACGCGCAATCCCGCC

Bisulfite sequencing

GGCGGGATTGCGCGTTAGATCGCGCGCTTATGCTAGCCGCGCTGATAGC

[Figure: CH3 groups mark the methylated cytosines]

Bisulfite treatment

GGUGGGATTGUGUGTTAGATCGCGCGUTTATGUTAGUCGCGUTGATAGC

Sequence and compare to reference
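The "compare to reference" step can be sketched in a few lines. Bisulfite converts unmethylated cytosine to uracil (read as T after amplification) and leaves methylated cytosine unchanged, so any C that survives in the treated sequence is inferred to be methylated. The function names below are hypothetical, and the sketch assumes the treated sequence is already aligned base-for-base to the reference on the same strand, which real pipelines cannot take for granted.

```python
def bisulfite_convert(sequence: str, methylated: set) -> str:
    """Simulate bisulfite treatment: unmethylated C becomes U,
    methylated C (0-based positions in `methylated`) is protected."""
    return "".join(
        "U" if base == "C" and i not in methylated else base
        for i, base in enumerate(sequence)
    )


def call_methylation(reference: str, converted: str):
    """Compare a treated sequence to the reference.

    Returns two lists of 0-based positions of reference cytosines:
    those still read as C (methylated) and those read as U or T
    (unmethylated).
    """
    if len(reference) != len(converted):
        raise ValueError("sequences must be the same length and pre-aligned")
    methylated, unmethylated = [], []
    for i, (ref_base, obs_base) in enumerate(zip(reference, converted)):
        if ref_base != "C":
            continue
        if obs_base == "C":
            methylated.append(i)
        elif obs_base in "UT":
            unmethylated.append(i)
    return methylated, unmethylated


if __name__ == "__main__":
    reference = "GGCGGGATTGCGCGTTAGATCGCGCGCTTATGCTAGCCGCGCTGATAGC"
    treated = "GGUGGGATTGUGUGTTAGATCGCGCGUTTATGUTAGUCGCGUTGATAGC"
    methylated, unmethylated = call_methylation(reference, treated)
    print("methylated C positions:  ", methylated)
    print("unmethylated C positions:", unmethylated)
```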

Ensembl Regulation

The goal of the Ensembl Regulation team is to annotate the genome with features that may play a role in the transcriptional regulation of genes.

• Predicted open/closed chromatin

    • DNase I sensitivity

    • FAIRE

• Transcription factor binding sites

• Epigenetic marks

    • Histone modifications

    • DNA methylation

• RNA Pol binding

Current data

The future

A subset of cell types

• Only a subset of available data is displayed in Ensembl.

• We display cell types that have, at a minimum:

    • CTCF binding (not Blueprint)

    • DNase or FAIRE data (not Blueprint)

    • H3K4me3, H3K27me3, H3K36me3 data

• We display all TFBS and histone modification data known in these cell types.

• We process these data to predict activity.

• Further data can be added using track hubs.
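The minimum-data rule above can be written as a simple check. This is a sketch, not Ensembl's own selection code: the function name, the assay labels and the `is_blueprint` flag are hypothetical, and reading "(not Blueprint)" as "this requirement is waived for Blueprint samples" is an assumption about what the slide means.

```python
REQUIRED_HISTONE_MARKS = {"H3K4me3", "H3K27me3", "H3K36me3"}
OPEN_CHROMATIN_ASSAYS = {"DNase", "FAIRE"}


def meets_minimum_data(assays, is_blueprint=False):
    """Return True if a cell type has the minimum data set listed on the slide.

    `assays` is an iterable of assay or mark names available for the cell type.
    ASSUMPTION: for Blueprint samples the CTCF and DNase/FAIRE requirements
    are waived and only the three histone marks are required.
    """
    available = set(assays)
    has_histone_marks = REQUIRED_HISTONE_MARKS <= available
    if is_blueprint:
        return has_histone_marks
    has_ctcf = "CTCF" in available
    has_open_chromatin = bool(OPEN_CHROMATIN_ASSAYS & available)
    return has_ctcf and has_open_chromatin and has_histone_marks


if __name__ == "__main__":
    full = ["CTCF", "DNase", "H3K4me3", "H3K27me3", "H3K36me3"]
    print(meets_minimum_data(full))
    print(meets_minimum_data(["CTCF", "H3K4me3", "H3K27me3", "H3K36me3"]))
    print(meets_minimum_data(["H3K4me3", "H3K27me3", "H3K36me3"], is_blueprint=True))
```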

Processing the data

• The raw data is taken from the various sources.

• This is processed to predict the positions of regulatory features, such as promoters, enhancers and insulators.

• The activity of these features is predicted in the different cell types.

• All of this can be viewed in the genome browser.

Raw data

[Figure: signal tracks for Transcription Factor A, Transcription Factor B, Transcription Factor C, Histone mod 1, Histone mod 2 and Histone mod 3]

Searching for patterns

[Figure labels: known promoter (three examples)]

Segmentation

[Figure: signal tracks for Transcription Factor A, Transcription Factor B, Transcription Factor C, Histone mod 1, Histone mod 2 and Histone mod 3]

Segmentation is blind

• Our algorithm has no idea what the patterns it is looking at mean

• e.g. it doesn’t know if a histone modification is activating or repressing

• Later analysis reveals that activating modifications are found at activating segments and vice versa

• i.e. we think it’s working!
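What "blind" means here can be shown with a toy example. The sketch below is not the segmentation algorithm used in the Ensembl Regulatory Build; it simply gives each genomic bin an arbitrary state number based on which marks are present, merges adjacent bins with the same state, and only afterwards reports which marks each state contains. The function names and the binary input format are assumptions made for illustration.

```python
def blind_states(signals):
    """Assign an arbitrary state ID to each bin from its combination of marks.

    `signals` maps a mark name to a list of 0/1 values, one per genomic bin.
    The state IDs carry no biological meaning: they are numbered in order
    of first appearance.
    """
    marks = sorted(signals)
    n_bins = len(signals[marks[0]])
    state_ids = {}
    states = []
    for i in range(n_bins):
        pattern = tuple(signals[mark][i] for mark in marks)
        if pattern not in state_ids:
            state_ids[pattern] = len(state_ids)
        states.append(state_ids[pattern])
    # Post hoc legend: which marks are present in each state.
    legend = {
        state: [mark for mark, present in zip(marks, pattern) if present]
        for pattern, state in state_ids.items()
    }
    return states, legend


def merge_runs(states):
    """Collapse consecutive bins with the same state into (start, end, state)
    segments, with `end` exclusive."""
    segments = []
    for i, state in enumerate(states):
        if segments and segments[-1][2] == state:
            start, _, _ = segments[-1]
            segments[-1] = (start, i + 1, state)
        else:
            segments.append((i, i + 1, state))
    return segments


if __name__ == "__main__":
    signals = {
        "H3K4me3": [0, 1, 1, 0, 0, 0],
        "H3K27me3": [0, 0, 0, 0, 1, 1],
    }
    states, legend = blind_states(signals)
    print(merge_runs(states))
    print(legend)
```

Only the legend, inspected after the fact, tells a reader that one state carries H3K4me3 and another H3K27me3; the state assignment itself never used that knowledge.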

MultiCell features

Cell type 1

Cell type 2

Cell type 4

Cell type 3

Cell-specific features

Cell type 1

Cell type 2

Cell type 4

Cell type 3

MultiCell

Coverage

Label               Count      Mean length (bp)   Max length (bp)   Total length (Mbp)
TSS                 40,249     973.2              11,400            39.2
Proximal Reg.       101,206    1005.5             15,000            101.8
Distal Reg.         209,081    526.1              8,400             110.0
CTCF                108,284    550.1              5,200             59.6
Unannotated TFBS    163,528    155.8              1,630             25.5
Union                                                               299.2
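The coverage table can be sanity-checked with a few lines of arithmetic: each total should be roughly count × mean length, and the per-class totals add up to more than the union, which means some bases are annotated in more than one class. The numbers are copied from the table; the function names are hypothetical.

```python
# (label, count, mean length in bp, total length in Mbp) as given in the table
ROWS = [
    ("TSS", 40_249, 973.2, 39.2),
    ("Proximal Reg.", 101_206, 1005.5, 101.8),
    ("Distal Reg.", 209_081, 526.1, 110.0),
    ("CTCF", 108_284, 550.1, 59.6),
    ("Unannotated TFBS", 163_528, 155.8, 25.5),
]
UNION_MBP = 299.2


def implied_total_mbp(count, mean_length_bp):
    """Total length implied by count x mean length, in Mbp."""
    return count * mean_length_bp / 1e6


def multiply_counted_mbp(rows, union_mbp):
    """Sum of the per-class totals minus the union: the amount of sequence
    counted more than once across classes."""
    return sum(total for _, _, _, total in rows) - union_mbp


if __name__ == "__main__":
    for label, count, mean_length, total in ROWS:
        implied = implied_total_mbp(count, mean_length)
        print(f"{label:18s} implied {implied:6.1f} Mbp, table {total:6.1f} Mbp")
    print(f"Counted more than once: {multiply_counted_mbp(ROWS, UNION_MBP):.1f} Mbp")
```

The per-class totals sum to 336.1 Mbp against a union of 299.2 Mbp, so about 36.9 Mbp is counted in more than one class.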

We do not…

• …link promoters/enhancers/insulators or any other regulatory features to genes. We allow you to see what is where and make your own inferences.

• …link regulatory features to gene expression. We have cell-line specific regulation data and tissue specific expression data – make of it what you will.

Regulatory data is incredibly complex and the field is still in its relative infancy. There is no comprehensive database of regulation data.

Real promoters etc?

• Our predictions are based on real biological data

• We have strong evidence to suggest that they are doing what we think they are

• Most of them have not been experimentally validated (i.e. they have not been cloned alongside a gene and tested)

• More data will further refine and improve our pipeline

Regulation BioMart

Dataset: Motifs, Reg-feats, Evidence

Filters: Location, Cell types, Class

Attributes: Reg-feat IDs, locations, activity

Results: table

Hands on

• We’re going to look at the region of the gene LIMD2 to find regulatory features, explore which cell types they are active in and see what evidence there is to show this.

Next webinar – Advanced access

As well as exploring genomic data through the web interface, you are also able to upload your own data to view within the browser.

The first part of this final webinar will show you how you can view custom data, such as BED or BAM files, in the Ensembl browser.

We will then introduce some of the more advanced methods of accessing Ensembl data, such as the REST API, the Perl API, the FTP site and MySQL queries.
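As a taste of programmatic access, the sketch below uses the public Ensembl REST API to find the regulatory features that overlap a gene. It relies on two endpoints, `lookup/symbol` and `overlap/region` with `feature=regulatory`; the helper names are hypothetical, the server address and response fields are assumed to match the current REST documentation, and nothing is fetched until `regulatory_features_for_gene` is called.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

SERVER = "https://rest.ensembl.org"  # public REST server (assumed current)


def lookup_url(symbol, species="homo_sapiens"):
    """URL that looks up a gene by its symbol."""
    return f"{SERVER}/lookup/symbol/{quote(species)}/{quote(symbol)}"


def regulatory_overlap_url(chrom, start, end, species="homo_sapiens"):
    """URL that lists regulatory features overlapping a region."""
    region = f"{chrom}:{start}-{end}"
    return f"{SERVER}/overlap/region/{quote(species)}/{region}?feature=regulatory"


def get_json(url):
    """Fetch a URL from the REST server and decode the JSON response."""
    request = Request(url, headers={"Content-Type": "application/json"})
    with urlopen(request, timeout=30) as response:
        return json.load(response)


def regulatory_features_for_gene(symbol, species="homo_sapiens"):
    """Look up a gene, then fetch the regulatory features in its region.

    Assumes the lookup response carries 'seq_region_name', 'start' and 'end'.
    """
    gene = get_json(lookup_url(symbol, species))
    url = regulatory_overlap_url(
        gene["seq_region_name"], gene["start"], gene["end"], species
    )
    return get_json(url)
```

Calling `regulatory_features_for_gene("LIMD2")` would make two network requests and return a list of feature records for the gene used in the hands-on demo.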

Questions?

• You can continue to use the chat box

• I will read out loud any further questions and answer on the screen

• You can also try hands-up, and I will unmute your mic

Helen Sparrow Ben Moore Denise Carvalho-Silva

Help and documentation

Course online: http://www.ebi.ac.uk/training/online/subjects/11

Tutorials: www.ensembl.org/info/website/tutorials

Flash animations: www.youtube.com/user/EnsemblHelpdesk, http://u.youku.com/Ensemblhelpdesk

Email us: [email protected]

Ensembl public mailing lists: [email protected], [email protected]

Follow us

www.facebook.com/Ensembl.org

@Ensembl

www.ensembl.info

Publications

Yates, A. et al. Ensembl 2016. Nucleic Acids Research. http://europepmc.org/articles/4702834

Xosé M. Fernández-Suárez and Michael K. Schuster. Using the Ensembl Genome Server to Browse Genomic Sequence Data. Current Protocols in Bioinformatics 1.15.1-1.15.48 (2010). www.ncbi.nlm.nih.gov/pubmed/20521244

Giulietta M Spudich and Xosé M Fernández-Suárez. Touring Ensembl: A practical guide to genome browsing. BMC Genomics 11:295 (2010). www.biomedcentral.com/1471-2164/11/295

http://www.ensembl.org/info/about/publications.html

Ensembl 2015

Acknowledgements

The Entire Ensembl Team

Funding

Co-funded by the European Union