bioinformatic data analysis – comparison from three human studies using different affymetrix...
DESCRIPTION
Talk presented in Vancouver 2007 at the Biomarker Conference.
TRANSCRIPT
Dr. Agnieszka Lichanska, UQ
Bioinformatic data analysis – comparison from three human studies
using different Affymetrix platforms
Agnieszka M. Lichanska1, Sheryl Maher2,3, Nguyen Pham1, Timothy Pan1,2, and Saso Ivanovski4
1School of Dentistry, 2Institute for Molecular Biosciences, 3Australian Biosecurity CRC for Emerging Infectious Diseases, The University of Queensland, and 4School of Dentistry, Griffith University, Brisbane, Australia
Overview
• Studies and starting hypothesis
• Analysis tools
• Results from bioinformatics
• Validation
• Future studies
• How well do the new exon arrays characterize gene expression?
Studies
1. Comparison of gene expression between periodontal ligament cells and gingival cells
2. Functions of nuclear IGFBP5
3. Identification of biological processes induced in osteoblasts by LPS
Array analysis
• Affymetrix platform
– Hu133A arrays, using Ambion MessageAmp and Enzo IVT kit
– Human ST1.0 exon arrays, using new GeneChip WT cDNA amplification kit
• Analysis
– MAS, DMT, Spotfire
– Partek
– GO Browser (Affymetrix), Pathway Miner, DAVID, Onto-Tools, Clover, PAINT, MSCAN, CpGPro
• Data validation
– Real time PCR, cell-based assays, other methods
Study 1 - periodontal ligament cells and gingival cells
• The objective was to identify the markers for periodontal ligament cells which can be used for development of periodontitis therapies.
• Limited knowledge about the regulation of gene expression in those tissues.
• Extensive functional knowledge about the role of the different cells in periodontium.
• Hu133A arrays
Questions
1. What regulates the differential gene expression in those tissues?
2. Does differential methylation play a role in regulating expression?
3. Can we identify markers for each of the tissues?
Identification of differentially expressed genes
Total differentially expressed genes - 292
Genes with CpG islands – 121
Up in Ligament – 112 genes
Down in Ligament – 180 genes
Genes with CpG islands – 70
Identification of differentially expressed genes:
• MAS 5.0 – presence/absence calls
• DMT – number of concordant changes
• Spotfire – ANOVA analysis
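The slide only names the selection criteria, so the following is a minimal sketch of how presence/absence calls and concordant change calls could be combined, not the workflow actually used in the study. The call codes ("P"/"A" for detection, "I"/"D"/"NC" for change) follow the usual MAS 5.0 conventions; the gene names, the 75% concordance threshold and both function names are assumptions made for illustration.

```python
from collections import Counter

def concordant_direction(calls, min_fraction=0.75):
    """Return 'up', 'down' or None for a list of per-comparison change calls.

    min_fraction is an illustrative threshold, not a value from the talk.
    """
    n = len(calls)
    if n == 0:
        return None
    counts = Counter(calls)
    if counts["I"] / n >= min_fraction:
        return "up"
    if counts["D"] / n >= min_fraction:
        return "down"
    return None

def select_genes(change_calls, detection_calls, min_fraction=0.75):
    """Keep genes that are Present in at least one sample and change concordantly."""
    selected = {}
    for gene, calls in change_calls.items():
        if "P" not in detection_calls.get(gene, []):
            continue  # never detected, so a change call is not trusted
        direction = concordant_direction(calls, min_fraction)
        if direction is not None:
            selected[gene] = direction
    return selected

# Hypothetical data: one change call per pairwise array comparison.
change_calls = {
    "GENE_A": ["I", "I", "I", "NC"],
    "GENE_B": ["D", "D", "D", "D"],
    "GENE_C": ["I", "D", "NC", "NC"],
    "GENE_D": ["I", "I", "I", "I"],
}
detection_calls = {
    "GENE_A": ["P", "P", "A", "P"],
    "GENE_B": ["P", "P", "P", "P"],
    "GENE_C": ["P", "A", "P", "P"],
    "GENE_D": ["A", "A", "A", "A"],
}

result = select_genes(change_calls, detection_calls)
print(result)
```

In this toy input GENE_C is dropped for inconsistent change calls and GENE_D for being absent in every sample; an ANOVA filter would be applied as a separate step.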
Biological processes
Figure: biological processes for genes up in ligament and down in ligament (DAVID functional annotation tool)
Elk-1
Gene Name Predicted Elk-1 Cluster
PAINT MSCAN
CYP51A1
EGR1
HSPE1
KPNB1
MAGOH
MET
PAWR
PLCB4
PPP1CB
RNF5
SNRPD1
SNRPG
TAF11
TDG
GLG1
SIP1
FUBP3
ADAMTS1
KIAA0152
COX17
CDC42EP3
PDLIM5
PAPOLA
EBNA1BP2
U2AF2
DHRS7B
C14orf109
LSM3
TPRKB
C14orf111
MRPL35
LSM8
ENAH
C13orf10
YRDC
ZNF587
Prediction of Elk-1 Transcription factor binding site clusters in gene by both PAINT and MSCAN
Prediction of Elk-1 Transcription factor binding site clusters in gene by PAINT only
Prediction of Elk-1 Transcription factor binding site clusters in gene by MSCAN only
PAINT analysis
MSCAN analysis
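The grouping into "both", "PAINT only" and "MSCAN only" amounts to a set comparison of the two tools' gene lists. A minimal sketch follows, assuming each tool's predictions have already been exported as a plain list of gene symbols; the gene names are placeholders, because the transcript does not preserve which gene belongs to which group.

```python
def compare_predictions(paint_hits, mscan_hits):
    """Split genes into those predicted by both tools or by one tool only."""
    paint, mscan = set(paint_hits), set(mscan_hits)
    return {
        "both": sorted(paint & mscan),
        "paint_only": sorted(paint - mscan),
        "mscan_only": sorted(mscan - paint),
    }

# Placeholder gene symbols, not the assignments from the slide.
paint_hits = ["GENE_1", "GENE_2", "GENE_3"]
mscan_hits = ["GENE_2", "GENE_3", "GENE_4"]

groups = compare_predictions(paint_hits, mscan_hits)
for label, genes in groups.items():
    print(label, genes)
```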
Clover analysis, LSM3 gene
Figure: motif positions along the sequence (axis 1 to 2000) for Broad-complex, Elk-1, HMG-IV, bZIP911, FREAC7, SRY, PAX4, DOF3 and AP2 alpha
Transcription factor    P-value
Broad complex           0
SRY                     0.001
AP2 alpha               0.001
FREAC-7                 0.002
ELK-1                   0.002
DOF-3                   0.003
UBX                     0.006
bZIP911                 0.006
PAX4                    0.008
HMG-IV                  0.01
HFH-1                   0.01
Conclusions
• A lot of additional information can be mined from the array datasets, such as what may regulate differential gene expression
• Promoter analysis can be particularly useful when little is known about the system
• As with all other analyses, multiple tools have to be used because no single tool provides all the information
• The output formats can be difficult to manipulate
• The hypotheses obviously have to be validated in vitro
Exon arrays
Study 2 - Functions of nuclear IGFBP5 in osteoblasts
• The objective was to identify the genes regulated by nuclear translocation of IGFBP5
• IGFBP5 functions:
– It is the main IGFBP in the bone
– It induces proliferation of osteoblasts in vitro
– It can act through IGF-dependent or IGF-independent mechanisms
– It is also associated with breast cancer progression
– It is known to interact with FHL2 and RAR/RXR in the nucleus
– The target genes regulated by IGFBP5 are not known
• Array platform: Human ST1.0 exon arrays
Model cells
osteosarcoma Primary osteoblasts
Time course of IGFBP5
Panels: A. 2 hours; B. 4 hours; C. 8 hours; D. 24 hours; E. 48 hours; F. no treatment
Concentration of IGFBP-5: 625 ng/mL
α-nucleolin, isotype control, α-IGFBP5
Confocal Z-sections
Affymetrix Human Exon 1.0 ST Array
• Exon-level detection: differentiates differentially spliced transcripts of each gene
• Gene-level detection: all probesets are summarised into a single expression value for all transcripts from the same gene
• Each exon is covered by one probeset, which contains 4 probes
• Each gene is covered by around 40 probes (www.affymetrix.com)
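To make the exon-level versus gene-level distinction concrete, here is a simplified sketch that summarises the probes of each probeset with a median and then summarises the probesets of a gene the same way. The median is a stand-in chosen for illustration; the summarisation algorithms in the analysis packages are more involved. The probeset names and log2 intensities are invented.

```python
from statistics import median

def summarise_probeset(probe_intensities):
    """Exon-level value: median of the (roughly 4) probes in one probeset."""
    return median(probe_intensities)

def summarise_gene(probesets):
    """Return per-exon values and one gene-level value for a single gene.

    probesets maps a probeset ID to its list of log2 probe intensities.
    """
    exon_level = {ps_id: summarise_probeset(p) for ps_id, p in probesets.items()}
    gene_level = median(exon_level.values())
    return exon_level, gene_level

# Hypothetical gene with three exons; exon3 is much lower than the others,
# the kind of pattern an exon-level analysis would flag as possible skipping.
probesets = {
    "exon1": [8.0, 8.2, 7.9, 8.1],
    "exon2": [8.3, 8.1, 8.2, 8.4],
    "exon3": [5.0, 5.2, 4.9, 5.1],
}

exon_level, gene_level = summarise_gene(probesets)
print(exon_level)
print(gene_level)
```

The gene-level value hides the low third exon, which is why the two levels of detection are reported separately.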
Sample preparation of 3’UTR vs exon arrays (www.affymetrix.com)
Figure labels: rRNA reduction step; 2nd cycle cDNA synthesis
Analysis of the new Affymetrix exon arrays
• Experimental QC
– rRNA reduction
– IVT yield
– cDNA yield
– Fragmentation of cDNA
• Analytical QC
– Box plot (actually best done in Expression Console, Affymetrix)
– Histogram analysis
– PCA analysis
• Analysis
– Exon alternative splicing (visualized with gene model)
– Gene level analysis (visualized with a bar chart)
• Output
– Splicing: gives Transcript cluster ID
– Gene level: gives Probeset IDs
– Entrez Gene ID has to be retrieved from Affymetrix to use in functional analysis
• Functional analysis
– GO analysis
– Pathway mapping
– Promoter analysis
– CpG islands
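The ID retrieval step in the output stage can be sketched as a lookup against an annotation table. This is only an illustration: the column names (`transcript_cluster_id`, `gene_symbol`, `entrez_gene_id`), the IDs and the inline CSV are hypothetical, and a real Affymetrix annotation file has a different layout that would need its own parsing.

```python
import csv
import io

# Hypothetical annotation table; a real file would be opened from disk.
ANNOTATION_CSV = """transcript_cluster_id,gene_symbol,entrez_gene_id
TC0001,GENE_A,1001
TC0002,GENE_B,1002
TC0003,,
"""

def load_entrez_map(handle):
    """Build {transcript cluster ID: Entrez Gene ID}, skipping unannotated rows."""
    mapping = {}
    for row in csv.DictReader(handle):
        entrez = row["entrez_gene_id"].strip()
        if entrez:
            mapping[row["transcript_cluster_id"]] = entrez
    return mapping

def to_entrez(ids, mapping):
    """Return (mapped Entrez IDs, IDs that could not be mapped)."""
    mapped = [mapping[i] for i in ids if i in mapping]
    unmapped = [i for i in ids if i not in mapping]
    return mapped, unmapped

entrez_map = load_entrez_map(io.StringIO(ANNOTATION_CSV))
mapped, unmapped = to_entrez(["TC0001", "TC0003", "TC0002"], entrez_map)
print(mapped)
print(unmapped)
```

Reporting the unmapped IDs separately makes it visible how much of a gene list is lost before GO or pathway analysis.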
Preliminary analysis of the dataset
PCA
PCA colored by p-value
Up-regulated Genes
Down-regulated Genes
Functional analysis (up, down)
Study 3 - Identification of biological processes induced in osteoblasts by LPS
• The objective was to determine how LPS modulates the function of osteoblasts
• Osteoblasts express Toll-like receptors 2, 3, 4, 5 and 9, with TLR4 being the main receptor for bacterial LPS
• In periodontitis, tissue loss includes bone loss, but the changes induced by bacteria remain unclear
• LPS is used in this study as a model for infection
Questions
• What transcriptional events are induced by LPS?
• What is the mechanism of induction of apoptosis in osteoblasts in response to LPS?
PCA analysis of the LPS experiment
QC analysis using PCA
PCA analysis is not really useful for separating the dataset into groups after the ANOVA analysis
Volcano plot analysis for gene selection
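A volcano plot places each gene by fold change on one axis and significance on the other, and genes are picked from the outer upper corners. The sketch below shows that selection rule in its simplest form. The cutoffs (absolute log2 fold change of at least 1, p-value of at most 0.05), the gene names and the values are assumptions for illustration, not the thresholds used in the study.

```python
import math

def volcano_coordinates(log2_fc, p_value):
    """Return the (x, y) position of a gene on a volcano plot."""
    return log2_fc, -math.log10(p_value)

def volcano_select(results, min_abs_log2_fc=1.0, max_p=0.05):
    """Split genes passing both cutoffs into up- and down-regulated lists.

    results maps gene name to (log2 fold change, p-value).
    """
    up, down = [], []
    for gene, (log2_fc, p_value) in results.items():
        if p_value > max_p or abs(log2_fc) < min_abs_log2_fc:
            continue  # fails significance or effect-size cutoff
        (up if log2_fc > 0 else down).append(gene)
    return sorted(up), sorted(down)

# Hypothetical treated-versus-control results.
results = {
    "GENE_A": (2.1, 0.001),   # large, significant increase
    "GENE_B": (-1.6, 0.01),   # large, significant decrease
    "GENE_C": (0.3, 0.0005),  # significant but small change
    "GENE_D": (1.8, 0.2),     # large but not significant
}

up, down = volcano_select(results)
print(up, down)
```

Requiring both cutoffs is what distinguishes this from a p-value filter alone: GENE_C is significant but excluded because its change is small.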
Biological Processes regulated in LPS treated cells
• Up-regulated genes
– Actin cytoskeleton
• Down-regulated genes
– Mitosis
– Cell cycle
– Regulation of cell processes
– Cellular physiological processes
Conclusions
• The IGFBP5 study has identified biological processes
– Expected to be regulated by IGFBP5 treatment: cell cycle, proliferation
– Also some unexpected ones: RNA splicing and transcriptional regulators
• The LPS study has provided us with clues as to the mechanisms by which LPS regulates osteoblast function
– There is upregulation of osteoclast-stimulating factors, e.g. CSF-1
– There is downregulation of genes involved in proliferation
– There is also upregulation of apoptotic genes
– This suggests that a number of mechanisms can potentially be involved in the apoptosis known to occur in osteoblasts in response to LPS
How do the new exon arrays compare with old 3’ UTR arrays?
3’UTR arrays
These arrays let us analyze only gene-level expression for each gene, and because the probe sets were selected mainly in the 3’UTR region, they give us limited information about gene expression.
Splicing data: differences between a strong inducer (LPS) and a weak inducer (IGFBP5)
IGFBP5
LPS
Final conclusions
• Exon arrays provide us with much more information than 3’UTR ones
• The analysis of new whole genome arrays and exon arrays can be combined by using the same hybridization cocktail
• The new probe synthesis method has eliminated the need for Test arrays, which were required when using cRNA on the arrays
• The data analysis can use the entire dataset, or can focus on alternative splicing or gene levels
• The output at the moment is difficult to manipulate
• Gene Ontology and pathway mapping of exon arrays can be done through the same tool, the DAVID functional annotation tool
• Not all public tools cater for exon arrays yet
Acknowledgements
QBI: Virginia Nink, Paul Beatus
IMB: Sheryl Maher, Elisabetta d’Aniello
School of Dentistry: Thor Friis, Timothy Pan, Nguyen Pham
Griffith University: Dr Saso Ivanovski
Millenium Science: Robert Henke, Jeremy Preston
Spotfire: Andrew Khoo
Partek: Michael Venezia
This work was supported by the UQ ECR grant, ADRF and Eli Lilly Foundation grant
ComBio 22-26th September, Sydney
MGED/AMATA - 3-5th September, Brisbane