RNA-Seq Visualization: cummeRbund in Atmosphere
Jason Williams iPlant / Cold Spring Harbor Laboratory
*Graphics taken from these publications
The Tuxedo Protocol
*TopHat and Cufflinks require a sequenced genome
TopHat
TopHat outputs in IGV
Using cummeRbund in Atmosphere
• Visualize and mine Cuffdiff results
• Output files from Cuffdiff are reorganized into a local database
Choose the right image
We will be using “RNA-Seq Visualization” Rmi-BE9C2D12
Any image with R can work; you could also search for an image with cummeRbund installed
Installing cummeRbund in R
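The install commands shown on this slide did not survive in the transcript. A minimal sketch, assuming cummeRbund is installed from Bioconductor through the `BiocManager` package (an assumption; the image used in the session may have relied on a different installer or had the package preinstalled):

```r
# Assumption: cummeRbund is installed from Bioconductor via BiocManager.
# Nothing is installed if the package is already available.
if (!requireNamespace("cummeRbund", quietly = TRUE)) {
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install("cummeRbund")
}

# Load the package for the current session.
library(cummeRbund)
```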
Reading the data in
> cuff <- readCufflinks()
> cuff
CuffSet instance with:
	 2 samples
	 33714 genes
	 43481 isoforms
	 35113 TSS
	 32924 CDS
	 33621 promoters
	 35113 splicing
	 27350 relCDS
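With no arguments, readCufflinks() looks for Cuffdiff output in the current working directory. A hedged sketch of pointing it at an explicit directory instead; the path used here is the small example data set bundled with the package (an assumption made so the snippet runs anywhere), so its sample and gene counts will differ from the output above:

```r
library(cummeRbund)

# Assumption: the example Cuffdiff output shipped with cummeRbund stands in
# for your own Cuffdiff output directory. Replace this path with yours.
cuffdiff_dir <- system.file("extdata", package = "cummeRbund")

# readCufflinks() uses the cuffData.db database in `dir`, building it from
# the Cuffdiff files if it is missing. Pass rebuild = TRUE to force a rebuild.
cuff <- readCufflinks(dir = cuffdiff_dir)

# Printing the object gives the same kind of summary as shown above.
cuff
```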
Visualizing dispersion
> disp <- dispersionPlot(genes(cuff))
> disp
• Counts vs. dispersion
• Overdispersion: greater variability in a data set than would be expected under a given model (in our case, extra-Poisson variation)
• If you use a Poisson model on overdispersed counts, you will overestimate differential expression
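The idea of extra-Poisson variation can be illustrated with a small base-R simulation. This is a sketch on simulated counts, not on Cuffdiff output; the mean and the negative binomial `size` parameter are arbitrary values chosen for illustration:

```r
set.seed(42)   # reproducible simulation
n    <- 10000  # number of simulated counts per model
mu   <- 100    # common mean count (arbitrary)
size <- 5      # negative binomial size; variance = mu + mu^2 / size

pois_counts <- rpois(n, lambda = mu)
nb_counts   <- rnbinom(n, mu = mu, size = size)

# Variance-to-mean ratio: about 1 under Poisson, well above 1 when
# the counts are overdispersed.
vmr <- c(poisson = var(pois_counts) / mean(pois_counts),
         negbin  = var(nb_counts)   / mean(nb_counts))
print(round(vmr, 2))
```

Both sets of counts share the same mean, but the negative binomial counts vary far more. A test that assumes Poisson variance would treat that extra replicate-to-replicate spread as evidence of a real expression difference.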
Visualizing dispersion
http://www.fgcz.ch/education/StatMethodsExpression/03_Count_data_analysis.pdf
Poisson adequately describes technical variation
Visualizing dispersion
Squared Coefficient of Variation (SCV)
> genes.scv <- fpkmSCVPlot(genes(cuff))
> genes.scv
• Normalized measure of cross-replicate variability
• Represents the squared ratio of the standard deviation to the mean
• Differences in SCV between conditions can result in lower numbers of differentially expressed genes, due to a higher degree of variability between replicate FPKM estimates
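As a worked illustration of the definition, the SCV can be computed by hand in base R. The FPKM values below are hypothetical, and this is only the per-gene arithmetic, not a reimplementation of fpkmSCVPlot():

```r
# Hypothetical FPKM values: rows are genes, columns are replicates
# of a single condition.
fpkm <- rbind(
  geneA = c(10, 12, 11),
  geneB = c(10, 30, 2),
  geneC = c(500, 520, 480)
)

# SCV = variance / mean^2 = (sd / mean)^2, per gene across replicates.
scv <- apply(fpkm, 1, var) / rowMeans(fpkm)^2
print(round(scv, 4))
```

geneA and geneB have similar mean expression, but geneB's replicates disagree strongly, so its SCV is much larger. geneC is highly expressed with tight replicates and gets the smallest SCV, which shows that the measure is normalized for expression level.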
Distributions of FPKM scores across samples
> dens <- csDensity(genes(cuff))
> dens
> densRep <- csDensity(genes(cuff), replicates=TRUE)
> densRep
Non-parametric estimate of pdf
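A non-parametric density estimate can be sketched in base R with density(), which fits a kernel density estimate without assuming a distributional form. The FPKM values here are simulated, and plotting on a log10 scale is an assumption made for readability:

```r
set.seed(1)
# Hypothetical FPKM values for one sample, drawn from a log-normal distribution.
fpkm <- rlnorm(5000, meanlog = 2, sdlog = 1.5)

# Gaussian kernel density estimate of log10(FPKM).
dens_est <- density(log10(fpkm))

# Numerically integrate the estimate on its evenly spaced grid.
# A probability density should integrate to roughly 1.
area <- sum(dens_est$y) * diff(dens_est$x[1:2])
print(area)

plot(dens_est, main = "Density of log10(FPKM)", xlab = "log10(FPKM)")
```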
FPKM Pairwise Scatter Plots
> csScatter(genes(cuff), 'WT', 'hy5', smooth=TRUE)
Saving your Plots
1. Open a graphics device for the plot type (e.g. jpeg, png, pdf), giving the file path and file name
2. Call the plot function
3. Close the device with dev.off()
> png('csScatter.png') # will save in the working directory
> csScatter(genes(cuff), 'WT', 'hy5', smooth=TRUE)
> dev.off()
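The same open, plot, close pattern can be tried without any RNA-Seq data. In this sketch a temporary file and a base-graphics plot stand in for your own file name and for a cummeRbund plot (both are placeholders), and pdf() is used only because that device is available in every R installation:

```r
# Placeholder output path; use your own file path and name in practice.
out_file <- file.path(tempdir(), "example_plot.pdf")

pdf(out_file)                                # 1. open the graphics device
plot(1:10, (1:10)^2, main = "Example plot")  # 2. draw the plot
invisible(dev.off())                         # 3. close the device (note the parentheses)

# The file is only complete once the device has been closed.
file.exists(out_file)
```

Typing dev.off without parentheses only prints the function definition and leaves the device open, so the file is never finalized. cummeRbund plots are ggplot2 objects, so when saving from inside a function or loop you will likely need to wrap the plot call in print().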
Selecting and Filtering Gene Sets
Using the ‘getSig’ function # Enables you to get the genes that are significant at a given alpha
> sig <- getSig(cuff, alpha=0.05, level='genes') # genes significant at alpha = 0.05
> length(sig) # returns the number of genes in the sig object
> sig <- getSig(cuff, alpha=0, level='genes')
> tail(sig, 100) # displays the last 100 genes in the sig object you just made
Selecting and Filtering Gene Sets
Using the ‘getGenes’ function # Get the gene information
> sigGenes <- getGenes(cuff, sig)
Plot this in another scatter plot
> csScatter(sigGenes, 'WT', 'hy5')
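The select-then-plot workflow can be rehearsed end to end on the example data bundled with cummeRbund. This is a sketch under assumptions: the bundled samples 'hESC' and 'Fibroblasts' stand in for the 'WT' and 'hy5' samples used here, and because the bundled data has more than two samples the pair is named explicitly through the `x` and `y` arguments:

```r
library(cummeRbund)

# Assumption: bundled example data replaces your own Cuffdiff output.
cuff <- readCufflinks(dir = system.file("extdata", package = "cummeRbund"))

# IDs of genes called significant between the two samples at alpha = 0.05.
sig <- getSig(cuff, x = "hESC", y = "Fibroblasts", alpha = 0.05, level = "genes")
length(sig)

# Pull the full records for those IDs into a gene set, then plot only them.
sigGenes <- getGenes(cuff, sig)
csScatter(sigGenes, "hESC", "Fibroblasts")
```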
Heat mapping Similar Expression Values
> sigGenes <- getGenes(cuff, tail(sig, 50)) # last 50 genes in the sig list we created
> csHeatmap(sigGenes, cluster='both')
Heat mapping Similar Expression Values
> csHeatmap(sigGenes, cluster='both', replicates=TRUE)
Expression Plots by Genes
> myGeneId <- "AT5G41471"
> myGene <- getGene(cuff, myGeneId)
> myGene
Expression Plots by Genes
> expressionPlot(myGene, replicates=TRUE)