Summarizing Differential Expression Using Mann-Whitney U-tests

Upload: willis-evans

Post on 05-Jan-2016


Page 1: Summarizing Differential Expression Using Mann-Whitney U-tests

Summarizing Differential Expression Using Mann-Whitney U-tests

Page 2: Summarizing Differential Expression Using Mann-Whitney U-tests

RNA-Seq… at Its Most Basic Form

• Samples from two conditions

• Isolate RNA

• Generate cDNA

• Create sequencing library by fragmenting, size selection and adding adaptors

• Run sequencer

• Generate short reads

• Identify differentially expressed genes

• Profound biological discovery

Page 3: Summarizing Differential Expression Using Mann-Whitney U-tests

Heat stress experiment analyzed with tag-based RNA-seq

[Figure: axis label "individual"; groups "stress" and "control"]

Page 4: Summarizing Differential Expression Using Mann-Whitney U-tests

Gene Ontology enrichment analysis (classic)

Input:
- list of significant genes (“our list”)
- all GO annotations for all genes in a genome (or transcriptome)

Enrichment test: whether “our list” contains more representatives of a certain GO category than expected by chance (Fisher’s exact, hypergeometric, or similar test)
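The classic enrichment test can be sketched as a one-sided hypergeometric tail probability. This is a minimal illustration, not the code of any particular tool; the function name and its four parameters are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(total_genes, term_genes, list_size, overlap):
    """P(X >= overlap): chance of seeing at least `overlap` genes of a GO
    category in a list of `list_size` genes drawn at random, without
    replacement, from `total_genes` genes of which `term_genes` carry the
    category. All four argument names are illustrative."""
    upper = min(term_genes, list_size)
    denom = comb(total_genes, list_size)
    tail = sum(
        comb(term_genes, k) * comb(total_genes - term_genes, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / denom

# Hypothetical numbers: 1000 genes, 50 in the category,
# 100 significant genes, 12 of them in the category.
p = hypergeom_enrichment_p(1000, 50, 100, 12)
print(p)
```

Note that the result depends on where the significance cutoff for “our list” is drawn, since that cutoff fixes both the list size and the overlap.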

Page 5: Summarizing Differential Expression Using Mann-Whitney U-tests
Page 6: Summarizing Differential Expression Using Mann-Whitney U-tests

Mann-Whitney U-test

• Use ranks to test if distributions of group X and group Y are different

• Robust to outliers and does not require normally distributed data
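A minimal sketch of the test in pure Python follows. It assumes the normal approximation for the two-sided p-value, without tie or continuity correction, so it is only a rough guide for small samples; the function names are hypothetical.

```python
from math import erfc, sqrt

def midranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """Return (U for group x, two-sided p) using the normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu) / sigma
    return u1, erfc(abs(z) / sqrt(2))

# Only the ordering matters, so an extreme value does not change the result.
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))
print(mann_whitney_u([1, 2, 3], [4, 5, 1000]))
```

Both calls give the same U and p-value, which illustrates the robustness to outliers: replacing 6 with 1000 leaves every rank unchanged.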

Page 7: Summarizing Differential Expression Using Mann-Whitney U-tests

Gene Ontology enrichment analysis (rank-based)

Input:
- list of significant genes with measures to rank them
- GO annotations for all genes in a genome (or transcriptome)

Enrichment test: whether a GO category is significantly enriched with either top- or bottom-ranking genes (two-sided Mann-Whitney U test, or permutations)

Advantages:
- no need to choose a “significance cutoff”
- can keep track of direction of change

[Figure: ranked gene list, labeled control and stress, with genes annotated with the GO term marked]

MWU test determines whether genes annotated with the GO term in question (stripes on the white box to the left) are significantly “bunched up” either on top or at the bottom of the ranked list.

“delta rank” : mean rank of GO-term genes minus mean rank of all other genes (how much shift in ranks there is).
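The delta rank definition can be sketched as below. The ranking direction is an assumption: genes are ranked in ascending order of the measure, so a positive delta rank means the GO-term genes sit toward the high end. Ties are not handled, and the gene names and values are hypothetical.

```python
def delta_rank(measures, term_genes):
    """Mean rank of GO-term genes minus mean rank of all other genes.

    measures: dict of gene -> ranking measure (assumed free of ties)
    term_genes: set of genes annotated with the GO term
    """
    ordered = sorted(measures, key=lambda g: measures[g])
    rank = {gene: i + 1 for i, gene in enumerate(ordered)}  # ascending ranks
    inside = [rank[g] for g in ordered if g in term_genes]
    outside = [rank[g] for g in ordered if g not in term_genes]
    return sum(inside) / len(inside) - sum(outside) / len(outside)

# Hypothetical data: eight genes, two of them annotated with the term.
measures = {"a": 4, "b": 3, "c": 2, "d": 1, "e": -1, "f": -2, "g": -3, "h": -4}
print(delta_rank(measures, {"a", "b"}))  # 4.0
```

With ranks assigned in descending order instead, the magnitude stays the same and the sign flips.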

Page 8: Summarizing Differential Expression Using Mann-Whitney U-tests

control treatment

Differential Expression Analysis (DESeq, edgeR)

Name   pvalue  -log(p)  Rank
gene1  0.0001   4       1
gene2  0.001    3       2
gene3  0.01     2       3
gene4  0.1      1       4
gene5  0.1     -1       5
gene6  0.01    -2       6
gene7  0.001   -3       7
gene8  0.0001  -4       8

delta rank
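The signed -log(p) values in the table are consistent with a base-10 logarithm (0.0001 gives 4). A minimal sketch of how such a measure could be derived, assuming the sign is taken from the direction of change (for example the sign of a log fold change, which the table does not show):

```python
from math import log10

def signed_log_p(pvalue, direction):
    """-log10(p), made negative when `direction` is negative.

    `direction` stands in for any signed effect estimate such as a
    log fold change; it is an assumption, not a column from the table.
    """
    magnitude = -log10(pvalue)
    return magnitude if direction >= 0 else -magnitude

print(round(signed_log_p(0.0001, +1.0), 6))  # 4.0
print(round(signed_log_p(0.001, -0.5), 6))   # -3.0
```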

Page 9: Summarizing Differential Expression Using Mann-Whitney U-tests

Some GO categories in your data might share the same genes (and some may overlap completely)

- Clustering GO categories by the proportion of shared genes brings similar biological processes together.

- Merge identical or very similar categories to reduce redundancy.
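One way to quantify the overlap is the number of shared genes divided by the size of the smaller category. That choice of measure, the helper names, and the toy gene sets below are assumptions for illustration, not necessarily what GO_MWU.R does.

```python
def shared_fraction(genes_a, genes_b):
    """Shared genes as a fraction of the smaller category (1.0 = nested)."""
    return len(genes_a & genes_b) / min(len(genes_a), len(genes_b))

def merge_identical(categories):
    """Group category names whose gene sets are exactly the same."""
    groups = {}
    for name, genes in categories.items():
        groups.setdefault(frozenset(genes), []).append(name)
    return sorted(sorted(names) for names in groups.values())

# Hypothetical categories and gene sets.
categories = {
    "termA": {"g1", "g2", "g3"},
    "termB": {"g1", "g2", "g3"},
    "termC": {"g3", "g4", "g5", "g6"},
}
print(merge_identical(categories))  # [['termA', 'termB'], ['termC']]
print(shared_fraction(categories["termA"], categories["termC"]))
```

A distance of `1 - shared_fraction(a, b)` between every pair of categories could then feed a hierarchical clustering.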

Page 10: Summarizing Differential Expression Using Mann-Whitney U-tests

Run R Script GO_MWU.R

• go to ~/Desktop/Mann-Whitney_U-tests/MWU_go

• open the file GO_MWU.R

• execute commands by highlighting and pressing control + enter

Page 11: Summarizing Differential Expression Using Mann-Whitney U-tests
Page 12: Summarizing Differential Expression Using Mann-Whitney U-tests

heats.csv (differential expression dataset)

gene,logP
isogroup0,0.6
isogroup1,3.5
isogroup10,6.8
isogroup100,6.4
isogroup1000,1.7
isogroup10000,0.1
isogroup10001,-0.2
isogroup10002,0.6
isogroup10003,-0.4

amil_defog_iso2go.tab (links genes with their GO terms)

V1 V2
isogroup15359 GO:0001614;GO:0004931;GO:0009719;
isogroup0 GO:0004687
isogroup100 GO:0003779;GO:0008091
isogroup10001 GO:0003993
isogroup10002 GO:0005524;GO:0016887;GO:0000166;GO:0017111
isogroup10003 GO:0006605;GO:0006886;GO:0016020;GO:0015031;
isogroup10004 GO:0004197;GO:0006508;GO:0008234;GO:0004217
isogroup10006 GO:0001733
isogroup10007 GO:0003824;GO:0008152;GO:0000247;GO:0008416;GO:0016863;GO:0018842;GO:0004165

go.obo (links GO terms with names, namespaces, and definitions)

id: GO:0000002
name: mitochondrial genome maintenance
namespace: biological_process
def: "The maintenance of the structure and integrity of the mitochondrial genome; includes replication and segregation of the mitochondrial chromosome." [GOC:ai, GOC:vw]
is_a: GO:0007005 ! mitochondrion organization

[Term]
id: GO:0000003
name: reproduction
namespace: biological_process
alt_id: GO:0019952
alt_id: GO:0050876
def: "The production of new individuals that contain some portion of genetic material inherited from one or more parent organisms." [GOC:go_curators, GOC:isa_complete, GOC:jl, ISBN:0198506732]
subset: goslim_generic
subset: goslim_pir
subset: goslim_plant
subset: gosubset_prok
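A minimal sketch of reading the first two inputs, using a few rows copied from the excerpts above. It assumes the gene-to-GO table is tab-separated with GO terms joined by semicolons (suggested by the .tab extension and the excerpt, but not confirmed); the function names are hypothetical.

```python
import csv
import io

# Sample rows taken from the excerpts; the tab separator is an assumption.
HEATS_CSV = "gene,logP\nisogroup0,0.6\nisogroup10001,-0.2\nisogroup10003,-0.4\n"
ISO2GO_TAB = (
    "isogroup0\tGO:0004687\n"
    "isogroup10001\tGO:0003993\n"
    "isogroup10003\tGO:0006605;GO:0006886;GO:0016020;GO:0015031;\n"
)

def read_measures(text):
    """Map each gene to its measure from a gene,logP table."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["gene"]: float(row["logP"]) for row in reader}

def read_annotations(text):
    """Map each gene to its GO terms, ignoring empty trailing fields."""
    annotations = {}
    for line in text.splitlines():
        gene, terms = line.split("\t")
        annotations[gene] = [t for t in terms.split(";") if t]
    return annotations

def genes_by_term(annotations):
    """Invert gene -> terms into term -> set of genes."""
    by_term = {}
    for gene, terms in annotations.items():
        for term in terms:
            by_term.setdefault(term, set()).add(gene)
    return by_term

measures = read_measures(HEATS_CSV)
annotations = read_annotations(ISO2GO_TAB)
print(measures["isogroup10001"])                   # -0.2
print(genes_by_term(annotations)["GO:0006886"])    # {'isogroup10003'}
```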

Page 13: Summarizing Differential Expression Using Mann-Whitney U-tests

GO_MWU: response of adult corals to 3 days of heat stress

[Figure panels: Molecular function; Cellular component. Legend: FDR-adjusted p-values]

Dendrograms: sharing of genes between categories.

Fractions: genes with an unadjusted p<0.05 / total number of genes within the category.

https://github.com/z0on/GO_MWU
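The figure reports FDR-adjusted p-values without naming the procedure. A common choice is the Benjamini-Hochberg adjustment, sketched here as an assumption rather than as the method the script uses; the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over all equal or larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-category p-values.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```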

Page 14: Summarizing Differential Expression Using Mann-Whitney U-tests

Run R Script GO_MWU.R

• go to ~/Desktop/Mann-Whitney_U-tests/MWU_go

• open the file GO_MWU.R

• execute commands by highlighting and pressing control + enter

Page 15: Summarizing Differential Expression Using Mann-Whitney U-tests

KOG-MWU: same idea as GO_MWU (“KOGMWU” package in R)

Non-hierarchical and [mostly] non-overlapping nature of KOG class annotations allows for quantitative comparisons of diverse datasets based on KOG delta-ranks.

“categories enriched with either up- or down-regulated genes”

Page 16: Summarizing Differential Expression Using Mann-Whitney U-tests

Questions