nonlinear stochastic determination for gene expressions via cdna microarrays

1
Poster abstracts 41 Der, Sandy Characterization of interferon-regulated pathways using oligonucleotide microarrays Sandy D. Der, Mike J. de Veer, Aimin Zhou, Robert H. Silverman & Bryan R.G. Williams Department of Cancer Biology, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, USA Type I (α, β) and type II (γ) interferons (IFN) possess pleiotropic functions, including antiviral, antiproliferative and immunomodulatory activities, which are likely mediated by the induction of unique subsets of IFN-stimulated genes (ISG) that are differentially regulated by IFN subtypes and in different cell types. Oligonucleotide arrays (Affymetrix) were used to investigate differential gene expression in the human fibrosarcoma cell line, HT1080, or primary mouse embryonic fibroblasts in response to IFNs or virus infection. In all samples, the detection of known IFN-regulated genes helped to validate the experimental design and conditions. Combined analysis of these data sets, each with several hundreds of differentially expressed genes, has led to the idenfication of groups of IFN-regulated genes with similar functions, previously unknown IFN-regulated pathways and potential to crosstalk with other signalling pathways. For example, ISGs with similar functions included apoptosis regulators, chemokines, genes involved with antigen processing and presentation, GTPases, heat shock proteins and members of the ubiquitin system. Signalling molecules identified as poten- tially novel ISGs included known kinases, phosphatases and signalling adapter molecules. The functional significance of each gene relating to IFN functions requires further study on an individual basis, but the continuous refinement of a global view for IFN-regulated pathways may identify additional yet unknown areas of IFN action and enable better design of therapeutic strategies based on IFN-regulated mechanisms. Diehn, Max Large-scale identification of secreted and membrane-associated gene products using DNA microarrays Max Diehn Medical Scientist Training Program & Department of Biochemistry, Stanford University School of Medicine, Stanford, California 94305, USA We describe a rapid and efficient method for identifying secreted and transmem- brane gene products. From yeast and human cell cultures, mRNA species bound to rough endoplasmic reticulum (rER)-associated polysomes were separated from free and cytoplasmic polysome-associated mRNAs by sedimentation equilibrium or sedimentation velocity. The representation of mRNA species in each popula- tion was quantified by hybridization to high-density DNA microarrays. Transcripts known to encode transmembrane or secreted proteins were signifi- cantly enriched in the rER-associated fractions, whereas those known to encode cytoplasmic proteins were enriched in the fractions containing free and cytoplas- mic polysome-associated mRNAs. Based on our data, we are able to predict with reasonable confidence that over 275 human and 300 yeast genes encode previ- ously unrecognized transmembrane, ER/golgi-resident or secreted proteins. The combination of fractionation of nucleic acids on the basis of physical properties and analysis on DNA microarrays is a general method that can be applied to a diverse set of biological questions. Dougherty, Edward Nonlinear stochastic determination for gene expressions via cDNA microarrays Edward R. Dougherty 1 , Seungchan Kim 1 , Michael L. Bittner 2 , Yidong Chen 2 , Krishnamoorthy Sivakumar 3 , Paul Meltzer 2 & Jeffrey M. Trent 2 1 Department of Electrical Engineering, Texas A&M University, College Station, Texas, USA 2 Cancer Genetics Laboratory, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, USA 3 Department of Electrical Engineering, Washington State University, Pullman, Washington, USA A basic problem in multivariate systems is finding predictive relations between the measurable variables in the system. The extent to which a target variable can be predicted via measurement of a set of predictor variables reveals the level of interaction between the variables. We want to quantify the degree to which the tar- get variable is statistically determined by the predictor variables, where statistical determination does not imply physical causality. In signal processing systems, stochastic determination provides a window into the overall control structure. In the simple case of two variables possessing a bivariate normal distribution, the best predictive relation is linear and the correlation coefficient provides the desired coefficient of determination. The problem can still be framed in terms of correlation for multivariate linear systems (albeit, not in such a straightforward form), but not for nonlinear systems. We examine quantification of nonlinear multivariate stochastic determination among gene expression levels using cDNA microarrays. The method allows incorporation of knowledge of other relevant conditions, such as the application of particular stimuli or the presence of inactivating gene mutations, as predictive elements affecting the target expression level. The approach is general and can be applied to any class of nonlinear predictor functions. For our talk, prediction is based on a ternary perceptron, to which is input one of three values for each pre- dictor gene: +1 [up-regulated], -1 [down-regulated] or 0 [invariant]. External con- ditions are quantified as +1 [present] or 0 [not present]. Our reason for choosing a perceptron is twofold: first, it is intuitive; second, for n predictor variables, it has only n + 1 parameters to estimate and therefore requires much less data than more general nonlinear predictors. Owing to the large number of genes and a severe limitation on experimental replication when using microarrays, any individual predictor derived from the data lacks statistical significance relative to the population. Since our interest is measuring degrees of determination, we can derive the best predictors from the sample data and then estimate the determination coefficients for various predictor sets from the sample data. These estimates will be biased high, but can be used as sample coefficients to relatively quantify the degrees to which various expression- level sets stochastically determine a target expression level. A more accurate, more computationally costly way to proceed is to employ a bootstrap method to estimate the coefficients of determination. The bias will on average be low, but we can again obtain relative quantification. The entire procedure is supported by software that has a number of facilities to simplify data analysis, and which provides graphics for visualising experimen- tal data and sequential increases in determination as additional expression levels are used for prediction. © 1999 Nature America Inc. • http://genetics.nature.com © 1999 Nature America Inc. • http://genetics.nature.com

Upload: jeffrey-m

Post on 24-Jul-2016

213 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Nonlinear stochastic determination for gene expressions via cDNA microarrays

Poster abstracts

41

Der, Sandy

Characterization of interferon-regulatedpathways using oligonucleotide

microarrays

Sandy D. Der, Mike J. de Veer, Aimin Zhou, Robert H. Silverman& Bryan R.G. Williams

Department of Cancer Biology, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, USA

Type I (α, β) and type II (γ) interferons (IFN) possess pleiotropic functions,including antiviral, antiproliferative and immunomodulatory activities, which arelikely mediated by the induction of unique subsets of IFN-stimulated genes (ISG)that are differentially regulated by IFN subtypes and in different cell types.Oligonucleotide arrays (Affymetrix) were used to investigate differential geneexpression in the human fibrosarcoma cell line, HT1080, or primary mouseembryonic fibroblasts in response to IFNs or virus infection. In all samples, thedetection of known IFN-regulated genes helped to validate the experimentaldesign and conditions. Combined analysis of these data sets, each with severalhundreds of differentially expressed genes, has led to the idenfication of groups ofIFN-regulated genes with similar functions, previously unknown IFN-regulatedpathways and potential to crosstalk with other signalling pathways. For example,ISGs with similar functions included apoptosis regulators, chemokines, genesinvolved with antigen processing and presentation, GTPases, heat shock proteinsand members of the ubiquitin system. Signalling molecules identified as poten-tially novel ISGs included known kinases, phosphatases and signalling adaptermolecules. The functional significance of each gene relating to IFN functionsrequires further study on an individual basis, but the continuous refinement of aglobal view for IFN-regulated pathways may identify additional yet unknownareas of IFN action and enable better design of therapeutic strategies based onIFN-regulated mechanisms.

Diehn, Max

Large-scale identification of secreted andmembrane-associated gene products

using DNA microarrays

Max Diehn

Medical Scientist Training Program & Department of Biochemistry, StanfordUniversity School of Medicine, Stanford, California 94305, USA

We describe a rapid and efficient method for identifying secreted and transmem-brane gene products. From yeast and human cell cultures, mRNA species boundto rough endoplasmic reticulum (rER)-associated polysomes were separated fromfree and cytoplasmic polysome-associated mRNAs by sedimentation equilibriumor sedimentation velocity. The representation of mRNA species in each popula-tion was quantified by hybridization to high-density DNA microarrays.Transcripts known to encode transmembrane or secreted proteins were signifi-cantly enriched in the rER-associated fractions, whereas those known to encodecytoplasmic proteins were enriched in the fractions containing free and cytoplas-mic polysome-associated mRNAs. Based on our data, we are able to predict withreasonable confidence that over 275 human and 300 yeast genes encode previ-ously unrecognized transmembrane, ER/golgi-resident or secreted proteins. Thecombination of fractionation of nucleic acids on the basis of physical propertiesand analysis on DNA microarrays is a general method that can be applied to adiverse set of biological questions.

Dougherty, Edward

Nonlinear stochastic determination forgene expressions via cDNA microarrays

Edward R. Dougherty1, Seungchan Kim1, Michael L. Bittner2,Yidong Chen2, Krishnamoorthy Sivakumar3, Paul Meltzer2

& Jeffrey M. Trent2

1Department of Electrical Engineering, Texas A&M University, College Station, Texas, USA

2Cancer Genetics Laboratory, National Human Genome Research Institute,National Institutes of Health, Bethesda, Maryland, USA

3Department of Electrical Engineering, Washington State University, Pullman,Washington, USA

A basic problem in multivariate systems is finding predictive relations betweenthe measurable variables in the system. The extent to which a target variable canbe predicted via measurement of a set of predictor variables reveals the level ofinteraction between the variables. We want to quantify the degree to which the tar-get variable is statistically determined by the predictor variables, where statisticaldetermination does not imply physical causality. In signal processing systems,stochastic determination provides a window into the overall control structure. Inthe simple case of two variables possessing a bivariate normal distribution, thebest predictive relation is linear and the correlation coefficient provides thedesired coefficient of determination. The problem can still be framed in terms ofcorrelation for multivariate linear systems (albeit, not in such a straightforwardform), but not for nonlinear systems.

We examine quantification of nonlinear multivariate stochastic determinationamong gene expression levels using cDNA microarrays. The method allowsincorporation of knowledge of other relevant conditions, such as the applicationof particular stimuli or the presence of inactivating gene mutations, as predictiveelements affecting the target expression level. The approach is general and can beapplied to any class of nonlinear predictor functions. For our talk, prediction isbased on a ternary perceptron, to which is input one of three values for each pre-dictor gene: +1 [up-regulated], -1 [down-regulated] or 0 [invariant]. External con-ditions are quantified as +1 [present] or 0 [not present]. Our reason for choosinga perceptron is twofold: first, it is intuitive; second, for n predictor variables, it hasonly n + 1 parameters to estimate and therefore requires much less data than moregeneral nonlinear predictors.

Owing to the large number of genes and a severe limitation on experimentalreplication when using microarrays, any individual predictor derived from thedata lacks statistical significance relative to the population. Since our interest ismeasuring degrees of determination, we can derive the best predictors from thesample data and then estimate the determination coefficients for various predictorsets from the sample data. These estimates will be biased high, but can be used assample coefficients to relatively quantify the degrees to which various expression-level sets stochastically determine a target expression level. A more accurate,more computationally costly way to proceed is to employ a bootstrap method toestimate the coefficients of determination. The bias will on average be low, but wecan again obtain relative quantification.

The entire procedure is supported by software that has a number of facilitiesto simplify data analysis, and which provides graphics for visualising experimen-tal data and sequential increases in determination as additional expression levelsare used for prediction.

© 1999 Nature America Inc. • http://genetics.nature.com©

199

9 N

atu

re A

mer

ica

Inc.

• h

ttp

://g

enet

ics.

nat

ure

.co

m