the international journal of …...enriched pathways for major depressive disorder identified from...

12
The International Journal of Neuropsychopharmacology http://journals.cambridge.org/PNP Additional services for The International Journal of Neuropsychopharmacology: Email alerts: Click here Subscriptions: Click here Commercial reprints: Click here Terms of use : Click here Enriched pathways for major depressive disorder identified from a genomewide association study ChungFeng Kao, Peilin Jia, Zhongming Zhao and PoHsiu Kuo The International Journal of Neuropsychopharmacology / Volume 15 / Issue 10 / November 2012, pp 1401 1411 DOI: 10.1017/S1461145711001891, Published online: 16 January 2012 Link to this article: http://journals.cambridge.org/abstract_S1461145711001891 How to cite this article: ChungFeng Kao, Peilin Jia, Zhongming Zhao and PoHsiu Kuo (2012). Enriched pathways for major depressive disorder identified from a genomewide association study. The International Journal of Neuropsychopharmacology, 15, pp 14011411 doi:10.1017/S1461145711001891 Request Permissions : Click here Downloaded from http://journals.cambridge.org/PNP, IP address: 129.59.115.14 on 08 Oct 2012

Upload: others

Post on 26-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The International Journal of …...Enriched pathways for major depressive disorder identified from a genome-wide association study Chung-Feng Kao 1*, Peilin Jia2,3*, Zhongming Zhao2,3

The International Journal of Neuropsychopharmacologyhttp://journals.cambridge.org/PNP

Additional services for The International Journal of Neuropsychopharmacology:

Email alerts: Click hereSubscriptions: Click hereCommercial reprints: Click hereTerms of use : Click here

Enriched pathways for major depressive disorder identified from a genome­wide association study

Chung­Feng Kao, Peilin Jia, Zhongming Zhao and Po­Hsiu Kuo

The International Journal of Neuropsychopharmacology / Volume 15 / Issue 10 / November 2012, pp 1401 ­ 1411DOI: 10.1017/S1461145711001891, Published online: 16 January 2012

Link to this article: http://journals.cambridge.org/abstract_S1461145711001891

How to cite this article:Chung­Feng Kao, Peilin Jia, Zhongming Zhao and Po­Hsiu Kuo (2012). Enriched pathways for major depressive disorder identified from a genome­wide association study. The International Journal of Neuropsychopharmacology, 15, pp 1401­1411 doi:10.1017/S1461145711001891

Request Permissions : Click here

Downloaded from http://journals.cambridge.org/PNP, IP address: 129.59.115.14 on 08 Oct 2012

Page 2: The International Journal of …...Enriched pathways for major depressive disorder identified from a genome-wide association study Chung-Feng Kao 1*, Peilin Jia2,3*, Zhongming Zhao2,3

Enriched pathways for major depressivedisorder identified from a genome-wideassociation study

Chung-Feng Kao1*, Peilin Jia2,3*, Zhongming Zhao2,3,4# and Po-Hsiu Kuo1,5#1 Department of Public Health and Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan

University, Taipei, Taiwan2 Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, USA3 Department of Psychiatry, Vanderbilt University School of Medicine, Nashville, TN, USA4 Department of Cancer Biology, Vanderbilt University School of Medicine, Nashville, TN, USA5 Research Center for Genes, Environment and Human Health, National Taiwan University, Taipei, Taiwan

Abstract

Major depressive disorder (MDD) has caused a substantial burden of disease worldwide with moderate

heritability. Despite efforts through conducting numerous association studies and now, genome-wide

association (GWA) studies, the success of identifying susceptibility loci for MDD has been limited, which

is partially attributed to the complex nature of depression pathogenesis. A pathway-based analytic strat-

egy to investigate the joint effects of various genes within specific biological pathways has emerged as a

powerful tool for complex traits. The present study aimed to identify enriched pathways for depression

using a GWA dataset for MDD. For each gene, we estimated its gene-wise p value using combined and

minimum p value, separately. Canonical pathways from the Kyoto Encyclopedia of Genes and Genomes

(KEGG) and BioCarta were used. We employed four pathway-based analytic approaches (gene set

enrichment analysis, hypergeometric test, sum-square statistic, sum-statistic). We adjusted for multiple

testing using Benjamini & Hochberg’s method to report significant pathways. We found 17 significantly

enriched pathways for depression, which presented low-to-intermediate crosstalk. The top four pathways

were long-term depression (pf1r10x5), calcium signalling (pf6r10x5), arrhythmogenic right ven-

tricular cardiomyopathy (pf1.6r10x4) and cell adhesion molecules (pf2.2r10x4). In conclusion, our

comprehensive pathway analyses identified promising pathways for depression that are related to neuro-

transmitter and neuronal systems, immune system and inflammatory response, which may be involved

in the pathophysiological mechanisms underlying depression. We demonstrated that pathway enrich-

ment analysis is promising to facilitate our understanding of complex traits through a deeper interpret-

ation of GWA data. Application of this comprehensive analytic strategy in upcoming GWA data for

depression could validate the findings reported in this study.

Received 7 September 2011 ; Reviewed 17 October 2011 ; Revised 2 November 2011 ; Accepted 26 November 2011 ;

First published online 16 January 2012

Key words : Depression, gene level, genome-wide association, pathway analysis.

Introduction

Major depressive disorder (MDD) is a complex and

multifactorial disorder that causes a substantial bur-

den of disease worldwide. Benefit from advances in

genotyping technology, genome-wide association

(GWA) studies have been frequently conducted to

search for susceptibility genes for traits of interest

during the past few years. Typical GWA studies

examine half or a few million markers in hundreds or

thousands of subjects ; thus, such studies are expected

to provide greater power in detecting common genetic

variants for complex traits. Specifically in MDD,

Sullivan et al. (2009) conducted a GWA study in>3000

subjects and found signals in the piccolo (PCLO) gene

region (minimal p=7.7r10x7). However, in indepen-

dent replication samples, association of this gene did

Address for correspondence : P.-H. Kuo, Ph.D., Department of Public

Health and Institute of Epidemiology and Preventive Medicine,

College of Public Health, National Taiwan University, Rm 521, No. 17,

Xuzhou Road, Taipei 100, Taiwan.

Tel. : +886-2-33668015 Fax : +886-2-2351-1955

Email : [email protected] or [email protected]

* These authors contributed equally to this work.

# These authors contributed to this work as joint senior authors.

International Journal of Neuropsychopharmacology (2012), 15, 1401–1411. f CINP 2012doi:10.1017/S1461145711001891

ARTICLE

Page 3: The International Journal of …...Enriched pathways for major depressive disorder identified from a genome-wide association study Chung-Feng Kao 1*, Peilin Jia2,3*, Zhongming Zhao2,3

not reach a genome-wide significance level at

p<5r10x8. Furthermore, a more narrowly defined

phenotype, such as recurrent MDD or recurrent early

onset MDD, was thought to be more aetiologically

homogeneous and might have a higher chance of re-

vealing disease-associated loci. A GWA study for re-

current early onset MDD using sequenced treatment

alternatives to relieve depression samples reported

significant findings for the GRM7 gene (Shyn et al.

2011). However, other GWA studies using the same

recurrent early onset MDD phenotype did not find any

significant loci reaching genome-wide significance

when using >1000 individuals in each study (Muglia

et al. 2010; Shi et al. 2011; Wray et al. 2010).

Additionally, among the four studies, each of which

combined several GWA datasets to perform meta-

analyses, only one reported significant associations

for MDD in ATP6V1B2 (p=6.8r10x7), SP4 (p=7.7r10x7) andGRM7 (p=1.1r10x6) and for recurrent

early onset MDD in GRM7 (p=5.6r10x6 ; Muglia et al.

2010; Shi et al. 2011 ; Shyn et al. 2011 ; Wray et al. 2010).

In addition to the results obtained from GWA stud-

ies, genetic data for a large number of candidate genes

have accumulated during the past decade for their

associations with MDD (Kao et al. 2011). Bosker et al.

(2010) identified 57 depression candidate genes (in

total, 92 markers) from literature, which showed sig-

nificant associations with MDD. They attempted to

replicate these previously reported associations using

a GWA dataset of MDD. Only two markers,

rs12520799 (C5orf20, p=0.038) and rs16139 (NPY,

p=0.034), and only one gene, TNF (p=0.0034), were

replicated with weak association evidence. Such poor

replication indicated the predicament of genetic as-

sociation studies at the single gene or single marker

level. There are many possible reasons for poor repli-

cation related to MDD, including the heterogeneous

nature of depression patient recruitment and ascer-

tainment bias in different studies, potential publi-

cation bias that favours the presentation of positive

results, false-positive findings in previous studies and

variations in the contributions of genetic and en-

vironmental risk factors for depression (Bosker et al.

2010; Wang et al. 2010). Most importantly, polygenes

with a small effect size are a significant issue in con-

ducting genetic association studies for complex dis-

orders, which resulted in concerns of under-power in

most of the previous studies (Wray et al. 2010). These

issues potentially impede success in uncovering the

underlying biological mechanisms of depression.

Investigating the join effects of multiple functionally

related markers or genes for complex traits has

recently been employed. In principle, a set-based

analysis at the gene-level is utilized to extract infor-

mation of all single nucleotide polymorphisms (SNPs)

in a gene region, which can be done two ways. A

simple approach used in many studies is to select the

most significant SNP with the smallest p value to rep-

resent the gene (denoted as ‘pmin’). However, this

method may underestimate the importance of genes

with several moderately associated markers or over-

estimate the significance of genes (e.g. if the very sig-

nificant signal results from genotyping errors). Hence,

combining p values of all SNPs in a gene to represent

the gene (denoted as ‘pcomb’) is a useful alternative to

maximize the utilization of SNP association infor-

mation. Haplotype analysis is another way to consider

joint effects of multiple SNPs within genes. Haplotype

blocks can be viewed as subsets of a gene, while the

pcomb method treated the whole gene as one block. To

pre-define haplotype blocks for each gene based on its

real linkage disequilibrium structure in a genome-

wide level is not very straightforward. Thus, in the

current study, we focused on the pmin and the pcomb

methods to collect gene-level information.

Furthermore, complex traits may result from de-

fects in a number of genes that disrupt one or more

biological pathways. Hence, a pathway-based analysis

that incorporates prior biological knowledge of gene

functions is necessary to provide a more comprehen-

sive view than a single gene-level analysis (Wang et al.

2010, 2011) so as to aggregately analyse enriched gene

sets using biological pathway information (Kanehisa

et al. 2010) or functional categories (Consortium, 2007).

Examples of pathway-based analysis methods in-

clude, but are not limited to : (i) the competitive

method that compares association evidence of genes in

a pathway to those not in a pathway, such as gene set

enrichment analysis (GSEA) (Wang et al. 2007) and a

hypergeometric test (Jia et al. 2010a) ; (ii) the self-con-

tained method that accounts for genes within the

pathway only, such as sum-statistic and sum-square-

statistic (Tintle et al. 2009). The competitive method

may be less powerful when many genes outside of a

specific pathway are also associated with the disease

trait. A detailed review of the pathway-based analysis

methods is available in Wang et al. (2010). There have

been several successful examples of pathway-based

analysis as applied to reveal possible biological me-

chanisms for the risk of developing schizophrenia

(Jia et al. 2010a), Parkinson’s disease (Wang et al. 2007)

and heart disease (Tintle et al. 2009). It is thus our in-

tention to examine enriched pathways for MDD using

a large-scale GWA dataset. We performed pathway-

based analyses using four competitive and self-

contained methods with two approaches to gather

1402 C.-F. Kao et al.

Page 4: The International Journal of …...Enriched pathways for major depressive disorder identified from a genome-wide association study Chung-Feng Kao 1*, Peilin Jia2,3*, Zhongming Zhao2,3

SNP information at the gene level. Of significant im-

portance, the strategies used to conduct pathway-

based analyses in the current study for depression

potentially boosted our power to obtain meaningful

results in studying the molecular mechanisms of de-

pression.

Method and materials

GWA dataset

The MDD GWA dataset was accessed through dbGaP

(http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/

study.cgi?study_id=phs000020.v2.p1). A total of 1821

depression cases and 1822 controls were included in

this dataset (Boomsma et al. 2008). We followed the

quality control processes performed by dbGaP to re-

move subjects that have a missing phenotype, low call

rate (<0.95), gender prediction ambiguity or outliers,

as well as to remove markers via criteria such as minor

allele frequency <0.05, Mendelian errors o3 and the

Hardy–Weinberg Equilibrium test p value f1ex5 in

the downloaded GWA dataset. In total, we included

genotyping data of 424 861 markers in 3394 individuals

(1673 cases and 1721 controls) to perform pathway

analysis. A basic allelic test was used to calculate the

genomic inflation factor for this GWA dataset, which

was 1.028. The quantile–quantile plot for all analysed

SNPs can be found in Supplementary Fig. S1, indicat-

ing good quality in this GWA dataset.

Computing gene-wise statistic values

To obtain gene-level significance, we first mapped

SNPs to a gene (using NCBI build 36) if SNPs were

located within the gene region or 20 kb upstream or

downstream of the gene, which was suggested as a

good gene boundary (Jia et al. 2010b). We used two

approaches to extract information for SNPs at the gene

level. A commonly adopted method is to select the

most significant SNP of a gene region using the smal-

lest p value (pmin) in association tests to represent the

significance level of a gene. Alternatively, we com-

bined all information of SNPs in a gene region to rep-

resent a gene (pcomb) using the inverse c method

(Zaykin et al. 2007). While applying the inverse c

method, we specified a shape parameter (a) of 0.1 for c

distribution, as suggested in Biernacka et al. (unpub-

lished observations) for the best performance in their

study of the alcohol dependence trait. Note that

alcohol dependence might exhibit complexity and

heterogeneity that is similar to MDD. In the current

study, both pmin and pcomb gene level statistics were

calculated for gene-level association signals.

Pathway annotations

To map genes into biological pathways, we used the

Kyoto Encyclopedia of Genes and Genomes (KEGG;

ftp://ftp.genome.jp/pub/kegg/pathway/; December

2010 version) and the BioCarta (ftp://ftp.ncbi.nih.

gov/gene/DATA/GENE_INFO/Mammalia/Homo_

sapiens.gene_info.gz ; February 2011 version) annota-

tions. Pathways with extreme numbers of genes (i.e.

2.5th percentile of pathway size distribution, <4 or

>250 genes) were removed from analysis to avoid

overly limited information or excessively large path-

ways. This procedure resulted in 207 pathways in the

KEGG and 300 pathways in the BioCarta datasets for

the following analysis. There were 5443 human pro-

tein coding genes mapped to 507 pathways in this

pathway annotation procedure and the average gene

number per pathway was 36.

Statistical methods for pathway enrichment analysis

We applied four statistical methods to evaluate the

enrichment of significant pathways for MDD; these

methods are GSEA, hypergeometric test, sum-square-

statistic and sum-statistic. The first two are competi-

tive methods and the latter two are self-contained

methods, according to two recent review articles

(Wang et al. 2010, 2011). The GSEAwas first developed

to analyse microarray gene expression data (Subra-

manian et al. 2005) and was later modified to expand to

GWA studies (Wang et al. 2007). In brief, a set of genes

was first ordered according to gene-wise statistic va-

lues so that genes with higher significance (or small p

values) are ranked on the top. The gene-wise statistic

values (tj) in pmin gene level were defined as the x2

statistic of the corresponding most significant SNP and

in pcomb gene level were defined as the inverse c stat-

istic of the combined information of all SNPs in a gene

region. For each examined pathway, an enrichment

score (ES) was calculated, based on p values of a set of

genes in each pathway. The ES was used to evaluate

association signals for each annotated pathway. At the

second step, for each pathway, the ES was normalized

to compute NES by subtracting the mean of the ES in

the permutated datasets, ES (Sperm) and divided by the

standard deviation of ES (Sperm). We calculated em-

pirical p values for all pathways using 5000 permu-

tations to compare the original ES score from the GWA

dataset and the permutation datasets (denoted as

Sperm) by computing the fraction of the numbers of

[ES (Sperm)>ES (S)] divided by the total number of

permutations.

In the hypergeometric test, we used a cut-off p

value of 0.05 to define significant genes using their

Enriched pathways for MDD in a GWA study 1403

Page 5: The International Journal of …...Enriched pathways for major depressive disorder identified from a genome-wide association study Chung-Feng Kao 1*, Peilin Jia2,3*, Zhongming Zhao2,3

gene-wise p values. A p value based on hypergeo-

metric distribution for each pathway was computed.

Further details are described in Jia et al. (2010a). We

performed the hypergeometric test for all annotated

pathways using the GWA dataset for MDD.

Both the sum-square-statistic and sum-statistic

(Tintle et al. 2009) are self-contained methods for gene-

set analysis, which only take genes in a specific path-

way into account. Let ti (i=1, …, S) be the x2 test

statistics for each of the S genes in a pathway. The

sum-square-statistic method utilizes the sum of the

squares of all gene-wise statistic values (ti2) over the set

of genes (gi=1S ti

2), and the sum-statistic method calcu-

lates the sum of the gene-wise statistic values (ti) over

the set of genes (gi=1S ti).

To explore the crosstalk among pathways, we used

an overlap coefficient (OC) to calculate the fraction of

genes overlapped across pathways ; that is, the num-

ber of genes overlapped among two pathways divided

by the minimal gene number among two pathways.

Large OCs represent high similarity in gene infor-

mation between two pathways. We defined the degree

of overlap as follows : complete overlap (OC=1) ; high

overlap (OCo50%) ; moderate overlap (20%fOC

<50%) ; low overlap (OC<20%) ; no overlap (OC=0).

To account for multiple testing problems in the path-

way-based analyses, we used the method proposed by

Benjamini & Hochberg (1995) to control for the false

discovery rate (FDR). We ordered all the p values of

pathways and compared each p value p(i) with a

threshold of (i/m)q*, where m represents the total

number of pathways and q* represents the significance

level. Thus, the procedure controls the FDR at q*=0.05

level in the current study, assuming p values are in-

dependently distributed under null hypotheses. Our

analyses were based on two gene-level indexes and

four pathway-based methods. The resulting signifi-

cant pathways may not be the same under different

statistical scenarios. We adjusted the FDR on the level

of individual statistical method and pathways with

p<0.05 after Benjamini and Hochberg’s adjustment

was reported separately by using pmin or pcomb stat-

istic, as shown in the tables.

Results

A total of 249 270 SNPs were mapped to 16 758 pro-

tein-coding genes in the GWA dataset of MDD, which

were then mapped to 507 annotated pathways (207

from KEGG and 300 from BioCarta) in the gene-

pathway mapping process.

Using the competitive methods, we initially found

>60 pathways with nominal p’s=<0.05 via the

hypergeometric test using either the pmin or pcomb

gene-level statistic. Among them, 13 pathways

reached statistical significance after controlling the

FDR at the 0.05 level. No significant pathways were

found after FDR correction using the GSEA method.

Significant pathways over-represented in the GWA

dataset are listed in Table 1. Using the self-contained

methods, five out of 56 significant pathways passed

the FDR correction (q value<0.05) by the sum-statistic

method under the pmin gene-level statistic and three

out of 27 significant pathways were similarly ident-

ified by the sum-square-statistic method under the

pcomb gene level statistic (see Supplementary Table S1).

Of note, there were four overlapping pathways ident-

ified simultaneously by the hypergeometric test and

the sum-statistic method under the pmin gene level

statistic (see Table 1), including long-term depression

(LTD), calcium signalling, arrhythmogenic right ven-

tricular cardiomyopathy (ARVC) and cell adhesion

molecule (CAM) pathways.

In total, we identified 17 significant pathways

for their biological relevance in MDD that were over-

represented in the GWA dataset. The biological func-

tions of 17 enriched pathways (13 from KEGG

and four from BioCarta) are listed in both Table 1

(pmin statistic) and Table 2 (pcomb statistic). Those

pathways were structurally mapped to organismal

systems (i.e. LTD, axon guidance, vascular smooth

muscle contraction and protein kinase C-catalysed

phosphorylation of inhibitory pathways), environ-

mental info-rmation processing (i.e. calcium signal-

ling, CAMs, phosphatidylinositol signalling system

and the extracellular matrix-receptor interaction

pathways), human disease-related (i.e. ARVC, type I

diabetes mellitus and graft-versus-host disease path-

ways), cellular processes (i.e. focal adhesion and

regulation of actin cytoskeleton pathway) and im-

mune-related system (O-Glycan biosynthesis, T cell

receptor signalling, nuclear factor kB (NF-kB) acti-

vation by non-typable Haemophilus influenzae and cell

cycle pathway).

Among the 17 enriched pathways, we evaluated

their crosstalk by checking whether there was a high

degree of overlap for significant genes (p<0.05) in

these pathways. The resulting number and proportion

of overlapping genes are shown in Supplementary

Table S2. The magnitude of overlap across pathways

ranged from a low to intermediate level, which in-

dicated some crosstalk for molecules in enriched

pathways. Among all possible pair-wise pathway

comparisons, 56.6% did not have any significant

genes overlapping, 27.2% had a low degree of overlap

(<20%), 13.2% had a moderate degree of overlap

1404 C.-F. Kao et al.

Page 6: The International Journal of …...Enriched pathways for major depressive disorder identified from a genome-wide association study Chung-Feng Kao 1*, Peilin Jia2,3*, Zhongming Zhao2,3

(20–50%) and only 3.0% pathways had a high degree

of overlap (>50%). The fact that only a few genes

were commonly identified in significant pathways for

MDD further reflects the difficulty that we faced

in identifying ‘the genes’ for complex diseases.

Rather, those disease traits are usually caused by the

Table 1. Significantly enriched pathways in the genome-wide association data for major depressive disorder using the

‘minimal p ’ gene statistic

Annotated pathway

No. of

genesa

No. of

genes

on chip

Hypergeometric test Sum-statistic

No. of genes

of interestbObserved

p value FDRBHd

Sum-

statistic

Empirical

p valuec FDRBHd

Long-term depression (KEGG) 70 58 32 1.0r10x5 0.005* 196.10 0 0*

Calcium signalling pathway (KEGG) 177 164 70 6.0r10x5 0.013* 485.04 0 0*

Arrhythmogenic right ventricular

cardiomyopathy (KEGG)

74 73 35 1.6r10x4 0.023* 229.99 0 0*

Cell adhesion molecules (KEGG) 133 120 52 2.2r10x4 0.023* 350.95 0 0*

Focal adhesion (KEGG) 199 183 75 2.4r10x4 0.023*

Type I diabetes mellitus (KEGG) 43 38 20 5.7r10x4 0.045*

Regulation of actin cytoskeleton (KEGG) 213 192 76 7.5r10x4 0.046*

Graft-versus-host disease (KEGG) 41 35 18 8.6r10x4 0.046*

Phosphatidylinositol signalling system

(KEGG)

78 69 33 9.4r10x4 0.046*

Axon guidance (KEGG) 129 116 48 1.0r10x3 0.046*

ECMrreceptor interaction (KEGG) 84 79 35 1.1r10x3 0.046*

PKC-catalysed phosphorylation of

inhibitory phosphoprotein of myosin

phosphatase (BioCarta)

32 25 14 1.2r10x3 0.046*

Vascular smooth muscle contraction

(KEGG)

116 101 43 1.4r10x3 0.049*

O-Glycan biosynthesis (KEGG) 30 29 99.83 2.0r10x4 0.020*

KEGG, Kyoto Encyclopedia of Genes and Genomes ; ECM, extracellular matrix ; PKC, protein kinase C.a The number of genes in each pathway.b The number of genes having a significant p value <0.05 in each pathway.c The empirical p values were calculated from 5000 permutations.d The false discovery rate (FDR) results were based on Benjamini & Hochberg’s (1995) multiple testing correction.

* Indicates a false discovery rate <0.05 for reported enriched pathways.

Table 2. Significantly enriched pathways in the genome-wide association data for major depressive disorder using the

‘pcombined’ gene statistic

Annotated pathwayaNo. of

genesb

No. of

genes

on chip

Sum-square

statistic

Empirical

p valuec FDRBHd

T cell receptor signalling pathway (BioCarta) 48 38 99.85 0 0*

Cell cycle_G1/S check point (BioCarta) 29 25 99.90 2.0r10x4 0.034*

NF-kB activation by non-typable Haemophilus influenzae (BioCarta) 25 23 99.93 2.0r10x4 0.034*

NF-kB, Nuclear factor kB.a Results were based on the sum-square-statistic method.b The number of genes in each pathway.c The empirical p values were calculated from 5000 permutations.d The false discovery rate (FDR) results were based on Benjamini & Hochberg’s (1995) multiple testing correction.

* Indicates a false discovery rate <0.05 for reported enriched pathways.

Enriched pathways for MDD in a GWA study 1405

Page 7: The International Journal of …...Enriched pathways for major depressive disorder identified from a genome-wide association study Chung-Feng Kao 1*, Peilin Jia2,3*, Zhongming Zhao2,3

dysfunction of several susceptible gene loci with small

main and interaction effects.

We further examined genes that, in the 17 signifi-

cantly enriched pathways, were involved in at least

three pathways and had at least one SNP with a p

value <0.05. These genes are listed in Table 3. They

are mainly related to cell transformation and cell ad-

hesion (e.g. PRKCA, ITGA, IL-1B), immune system and

inflammation response (e.g. IL-1B, HLA genes,

MAPK1, PLCB1), signal transduction (e.g. PRKCG,

GNA) and calcium-dependent processes (e.g. CACNA

genes).

Discussion

A wealth of large-scale GWA data have been pro-

duced during the past few years for complex traits,

such as depression. The pathway-based analytic

strategy provides the opportunity to uncover enriched

pathways that are involved in the aetiology of de-

pression, based on prior knowledge of gene functions

and molecular mechanisms. In this study, we reported

17 over-represented pathways using a major GWA

dataset of MDD, where a group of genes in the same

pathway jointly showed associations with depression.

Some genes were involved in multiple pathways to

increase the risk of depression. It is worth noting that

many of these genes did not reach genome-wide sig-

nificance for their associations with depression in

the analysis of single-gene level but reveal their

potential roles in pathway-based analyses. In the

original analysis of this GWA MDD dataset, only one

associated loci, PCLO (rs2715148, p=7.7r10x7), was

reported, although not reaching genome-wide sig-

nificance level at 5r10x8 (Sullivan et al. 2009). Bosker

et al. (2010) attempted to replicate prior associations

of candidate genes for MDD using this same GWA

dataset ; only one gene, TNF, exhibited significant

associations (p=0.0034) after multiple testing correc-

tion. Concordantly, this TNF gene was included in

three significant pathways that we identified (i.e. type

I diabetes mellitus, graft-vs.-host disease and NF-kB

activation by non-typable Haemophilus influenza).

We found 14 enriched pathways using the pmin

gene-level statistic (Table 1) and three pathways using

the pcomb gene-level statistic (Table 2). None of the

pathways overlapped across the two gene-level

calculation methods. Note that we performed multiple

testing corrections on the level of individual statistical

method, but not strictly applied across multiple path-

way-based statistical methods. Nevertheless, four

pathways were consistently identified by both the hy-

pergeometric test and the sum-statistic method using

the pmin. Even when a more stringent multiple testing

adjustment was applied, the four pathways remained

significant (none of the 5000 permutation had a stat-

istic score greater than the original score obtained for

these pathways, p=0). Similarly, under the pcomb ap-

proach, one pathway has empirical significance level

at p=0.

The pmin is a commonly used approach to assess

association evidence at gene level in previous studies

that employed pathway-based analysis (Jia et al.

2010a ; Tintle et al. 2009; Torkamani et al. 2008; Wang

et al. 2007). However, there are limitations of using the

pmin statistic to represent the significance of a gene. For

instance, if there is a set of moderately associated

markers in genes, the importance of such genes may

be down-weighted by not having ‘one’ particular sig-

nificant signal. Highly significant results are prone to

the risk of being affected by genotyping errors. In ad-

dition, larger genes may be more likely to have smaller

p values. In the GWA data for depression, we noticed

that larger genes (e.g. >10 000 kb) were positively

related to smaller p values (Kao et al. 2011; Sup-

plementary Fig. S2). Nevertheless, there is no differ-

ence between the proportion of larger gene size in

the 17 enriched gene sets compared to genes not in

these significant pathways (odds ratio 0.86, p=0.62).

Thus, our findings were unlikely affected by such bias.

In our analyses, we further combined information of

all SNPs in a gene region to represent a gene. Benefits

from the pcomb method include: (1) the overall evi-

dence of gene-set association for depression can be

assessed; (2) SNPs with moderate effects can be in-

cluded. We found that the different strategies of de-

fining gene-level significance seem to have substantial

influences on results and completed different path-

ways were identified from these two statistics. We

noticed that pathways related to neurotransmitters,

neuronal activity, the immune system and inflam-

mation response tended to be selected using the pmin

approach; while immune-related pathways tended to

be identified using the pcomb approach.

Using the pmin based approach, we identified four

significant pathways by both the hypergeometric test

and sum-statistics methods, including LTD, calcium

signalling, ARVC and CAMs. These are important

pathways related to neuronal activity, the immune

system and inflammation response. More specifically,

the LTD pathway is responsible for the mechanism of

long-term, activity-dependent changes in the efficacy

of neuronal synapses in some brain areas (e.g. hippo-

campus and cerebellum) with influence on the central

nervous system to release various neurotransmitters

during the developmental process (du Lac et al.

1406 C.-F. Kao et al.

Page 8: The International Journal of …...Enriched pathways for major depressive disorder identified from a genome-wide association study Chung-Feng Kao 1*, Peilin Jia2,3*, Zhongming Zhao2,3

Table 3. List of enriched genes in 17 significant pathways

Gene

No. of

pathways Gene size (bp)

Minimum

p value

Combined

p value

Proportion of

significant SNPs

PRKCA 7 507 937 0.0278 0.9997 2/143

ITGA4 5 80 850 0.0427 0.9115 1/25

ITGA6 5 78 868 0.0283 0.9392 1/33

ITGA8 5 202 683 0.0108 0.1337 3/39

ITGA9 5 367 469 0.0193 0.7182 4/82

ITGB8 5 84 658 0.0360 0.9366 1/40

MAP2K1 5 104 672 0.0493 0.2902 1/9

MAPK1 5 108 024 0.0109 0.0030 4/8

MYLK 5 272 007 0.0270 0.2940 2/17

PLCB1 5 752 252 0.0012 0.0147 17/212

PRKCG 5 25 435 0.0235 0.1347 1/6

ROCK1 5 162 110 0.0324 0.0224 2/4

BRAF 4 190 752 0.0499 0.5393 1/11

ITGA1 4 165 350 0.0368 0.9953 1/71

ITGA11 4 130 451 0.0280 0.4604 1/49

ITGA2 4 105 454 0.0145 0.1224 3/38

ITGA3 4 34 510 0.0102 0.1902 1/11

ITPR1 4 354 253 0.0030 0.0161 13/145

ITPR2 4 497 847 0.0146 0.9940 3/147

ITPR3 4 75 188 0.0339 0.4672 4/47

PIK3CG 4 41 669 0.0134 0.0174 3/12

ACTN1 3 105 244 0.0095 0.3227 2/42

ACTN2 3 77 789 0.0095 0.0381 5/26

ACTN4 3 82 844 0.0287 0.3740 1/11

CACNA1C 3 644 700 0.0135 0.2984 10/139

CACNA1D 3 317 462 0.0150 0.4747 4/87

CACNA1S 3 73 055 0.0397 0.9452 1/39

CALML3 3 1302 0.0159 0.1549 1/13

CALML4 3 851 0.0481 0.6194 1/14

CD28 3 31 360 0.0136 0.3558 1/13

EGFR 3 188 307 0.0198 0.8704 2/73

FN1 3 75 613 0.0486 0.8377 1/31

FYN 3 212 143 0.0191 0.9551 1/49

GNA11 3 27 047 0.0100 0.0054 3/9

GNA12 3 116 219 0.0125 0.0044 7/33

GNA13 3 45 922 0.0499 0.1043 1/2

GNAS 3 71 456 0.0112 0.0904 2/17

GSK3B 3 266 968 0.0090 0.0446 2/14

HLA-A 3 3324 0.0181 0.1184 2/15

HLA-B 3 3341 0.0025 0.0017 5/30

HLA-C 3 3327 0.0155 0.0628 5/39

HLA-DMA 3 4459 0.0166 0.1398 3/26

HLA-DMB 3 6403 0.0235 0.4233 2/26

HLA-DOA 3 5430 0.0114 0.3166 1/28

HLA-DOB 3 4286 0.0122 0.4824 3/43

HLA-DPA1 3 8585 0.0163 0.5993 1/33

HLA-DPB1 3 11 217 0.0163 0.4723 1/29

HLA-DQA1 3 6247 0.0213 0.0458 1/2

HLA-DRA 3 5178 0.0037 0.00002 10/44

HLA-F 3 3957 0.0032 0.0062 7/33

HLA-G 3 4144 0.0044 0.0171 5/24

IL1B 3 7020 0.0090 0.0118 3/10

[continued overleaf

Enriched pathways for MDD in a GWA study 1407

Page 9: The International Journal of …...Enriched pathways for major depressive disorder identified from a genome-wide association study Chung-Feng Kao 1*, Peilin Jia2,3*, Zhongming Zhao2,3

1995; Massey & Bashir, 2007). Evidence shows that

cerebellar LTD affects motor learning, and hippo-

campal LTD may contribute to memory traces

(Malleret et al. 2010; Nicholls et al. 2008). Thus, the

dysregulation of the stress response due to hippo-

campal dysfunction was suggested to be linked to

depression (Pittenger & Duman, 2008).

The network of the calcium signalling pathway is

complex and sophisticated and it is involved in neu-

ronal activity and the immune system (Dodd et al.

2010; Feske, 2007 ; Ghosh & Greenberg, 1995 ; Popoli et

al. 2002; Vig & Kine, 2009). Genes in this pathway

regulate brain function, especially in the hippocam-

pus, to cope with stressful events, which links to an-

xiety and depression as well as the effects of

antidepressants (D’Sa & Duman, 2002; Popoli et al.

2002). The ARVC pathway is named by a disease of the

heart muscle that involves fibro-fatty replacement of

the right ventricle, which is related to immune-medi-

ated damage and inflammation response. CAMs play

critical roles in regulating cell–cell and cell–matrix

adhesion, which are involved with immune response,

inflammation and development of neuronal tissue

(Elangbam et al. 1997). Changes in CAM expression

were suggested to link with depression. For example,

a reduction in glia in both the dorsolateral prefrontal

cortex and anterior cingulated cortex were indicated to

increase neuronal vulnerability in young depressed

subjects (Greenwald et al. 1998; Parker & Auld, 2004;

Thomas et al. 2002). Prior evidence also suggested

that ischaemia-induced inflammatory response in the

prefrontal cortex may cause neuronal damage and

subsequent glial malfunction in elderly depressed

subjects (Thomas et al. 2002). Multiple lines of

evidence have supported the implication that

development of depression is related to complex bio-

logical pathways and networks with the involvement

of neuronal activity, the immune system and inflam-

mation response.

Furthermore, the four aforementioned pathways

share several genes, reflecting the combinatory nature

of cellular components and functional molecules and

their potential involvement in several biological path-

ways. Some example genes in these pathways are

GRM1, GRM5 and CDH2. The metabotropic glutamate

group 1 receptors (e.g. GRM1, GRM5) are mediated by

a G-protein that activates a phosphatidylinositol-cal-

cium second messenger system, which regulates brain

function through neurotransmitter system. N-cadher-

in (e.g. CDH2) is a calcium-dependent CAM essential

for normal neuronal development in synaptic contact.

It affects synapse formation by interacting with po-

tential signalling pathways and/or molecules such as

Wnt signalling (Lamora & Voigt, 2009) and p38 mito-

gen-activated protein kinase signalling (Ando et al.

2011), which is important in terms of the neuronal

recognition mechanism. Evidence from these prior

findings supports the critical roles of our identified

pathways and genes in the pathophysiology of de-

pression. Of note, we previously published a DEP

genes set with 151 prioritized candidate genes for de-

pression (Kao et al. 2011). Seventeen genes in these

four pathways were previously discussed and, in

total, 41 genes in all 17 enriched pathways appeared in

our top list of DEP genes with high scores

(Supplementary Table S3), consistently suggesting the

importance of these enriched pathways in relation to

depression.

Table 3 (cont.)

Gene

No. of

pathways Gene size (bp)

Minimum

p value

Combined

p value

Proportion of

significant SNPs

LAMA2 3 633 425 0.0106 0.4179 5/92

MYL7 3 2454 0.0268 0.1089 1/6

PAK4 3 53 627 0.0338 0.2832 1/7

PAK7 3 301 651 0.0004 1.02E-08 19/103

PIK3R5 3 32 422 0.0451 0.3647 1/13

PLCG1 3 38 197 0.0419 0.1633 1/5

SOS1 3 138 915 0.0377 0.2633 1/11

TNF 3 2763 0.0399 0.4487 1/10

VAV1 3 84 650 0.0002 0.0009 3/25

Genes that were involved in at least three pathways with >1 single nucleotide polymorphism (SNP) having p value <0.05

were listed. Proportion of significant SNPs were defined as the number of significant SNPs (p<0.05) divided by total

number of SNPs in that gene. The counts were provided for better information in this table.

1408 C.-F. Kao et al.

Page 10: The International Journal of …...Enriched pathways for major depressive disorder identified from a genome-wide association study Chung-Feng Kao 1*, Peilin Jia2,3*, Zhongming Zhao2,3

There is increasing evidence showing that mental

disorders may result from abnormality in the immune

and inflammation systems (Dantzer et al. 2008; Pardo

et al. 2005). Our results support this proposition,

especially for the four significantly identified immune-

related pathways. Two pathways, T cell receptor sig-

nalling and NF-kB activation, are relevant biological

processes in immune response and in regulating the

inflammatory system (Burbach et al. 2007; Shuto et al.

2001). In addition, NF-kB activation and cell cycle

pathways were reported to be involved in the pro-

cesses of synaptic plasticity, memory and neuronal

differentiation (Merlo et al. 2002; Nagy et al. 1998).

Another significant immune-related pathway is

O-glycans synthesis (Brockhausen, 1999 ; Piller et al.

1988; Tsuboi & Fukuda, 1997). Although the causal

mechanisms of inflammation response in depression

are still unknown, our findings further suggest the

potential to investigate roles of neuronal chronic in-

flammation for depression.

There are some limitations in the current study.

First, our pathway analysis relied on the accuracy and

completeness of pathway annotation databases and

we only used two commonly adopted ones, KEGG

and BioCarta. Some genes that may have potential

impacts on depression but are not annotated in path-

way databases were excluded from our analyses.

Other datasets, such as Gene Ontology and Reactome,

may be helpful and can be considered in future

analysis, although their annotations need to be care-

fully selected. Second, our findings have shown that

different pathways were identified by combining all

markers’ information in defining gene-level signifi-

cance. We used the inverse c method to extract SNPs’

information for a gene in the pcomb statistic. Other

gene-level statistics, such as the random effects model

or Bayesian statistical methods, may provide different

gene-level evidence (Stephens & Balding, 2009) in

pathway analysis. Third, the current study focused on

the signals in genetic association tests, while other

genomic information (such as gene expression, gene

regulation, etc.) has not yet been used. Concerning

other useful genomic datasets, a possible utilization

approach is to first integrate gene information from

different platforms or data sources to construct a

combined score for each gene, followed by typical

pathway analysis to obtain more value-added path-

way results using all existing genomic evidence and

knowledge for depression. Fourth, in conducting

pathway analysis for depression, we only used one

major GWA dataset. There are several large-scale

GWA studies for depression, newly completed while

this analysis was in the final stage, and those datasets

are not yet easily accessible to the general public.

The association results from meta-analysis (or

mega-analysis) can be used in the near future to in-

crease our power to uncover the underlying biological

mechanisms for depression.

In conclusion, our results suggest several enriched

pathways for their biological functions to be involved

in molecular mechanisms of depression, including

pathways related to neurotransmitter and neuronal

systems, the immune system and inflammatory re-

sponse. A number of novel genes that did not show

significant associations with depression in the original

single marker/gene analysis of the GWA dataset were

found to participate in several pathways, which,

jointly with other genes, play roles in the pathogenesis

of depression. Although it remains largely unclear

how the defect of pathways is specifically linked to the

development of depression, our identified pathways

provide important biological insights into the in-

terpretation of GWA data for depression. These find-

ings are anticipated to facilitate future follow-up and

functional studies for depression.

Note

Supplementary material accompanies this paper on

the Journal’s website (http://journals.cambridge.org/

pnp).

Acknowledgements

This research was supported by National Science

Council (NSC 97-2314-B-002-184-MY2, NSC 99-2314-

B-002-140-MY3) and National Health Research

Institute (NHRI-EX99-9918NC) grants (to P-H.K.) and

partially supported by NIH grants, 2009 NARSAD

Maltz Investigator Award (to Z.Z.) and 2010 NARSAD

Young Investigator Award (to P.J.). We thank P.C.

Hsiao for his IT assistance. The GWA dataset was

accessed through the Genetic Association Inform-

ation Network (GAIN), database of Genotypes and

Phenotypes (dbGaP) accession number phs000020.

v1.p1 (http://www.ncbi.nlm.nih.gov/projects/gap/

cgi-bin/study.cgi?study_id=phs000020.v2.p1).

Statement of Interest

None.

References

Ando K, Uemura K, Kuzuya A, Maesako M, et al. (2011).

N-cadherin regulates p38 MAPK signaling via association

Enriched pathways for MDD in a GWA study 1409

Page 11: The International Journal of …...Enriched pathways for major depressive disorder identified from a genome-wide association study Chung-Feng Kao 1*, Peilin Jia2,3*, Zhongming Zhao2,3

with JNK-associated leucine zipper protein : implications

for neurodegeneration in Alzheimer disease. Journal of

Biological Chemistry 286, 7619–7628.

Benjamini Y, Hochberg Y (1995). Controlling the false

discovery rate : a practical and powerful approach to

multiple testing. Journal of Royal Statistical Society. Series B

57, 289–300.

Boomsma DI, Willemsen G, Sullivan PF, Heutink P, et al.

(2008). Genome-wide association of major depression :

description of samples for the GAIN Major Depressive

Disorder Study: NTR and NESDA biobank projects.

European Journal of Human Genetics 16, 335–342.

Bosker F, Hartman C, Nolte I, Prins B, et al. (2010). Poor

replication of candidate genes for major depressive

disorder using genome-wide association data. Molecular

Psychiatry 16, 516–532.

Brockhausen I (1999). Pathways of O-glycan biosynthesis in

cancer cells. Biochimica et Biophysica Acta 1473, 67–95.

Burbach BJ, Medeiros RB, Mueller KL, Shimizu Y (2007).

T-cell receptor signaling to integrins. Immunological Reviews

218, 65–81.

D’Sa C, Duman RS (2002). Antidepressants and

neuroplasticity. Bipolar Disorders 4, 183–194.

Dantzer R, O’Connor JC, Freund GG, Johnson RW, et al.

(2008). From inflammation to sickness and depression:

when the immune system subjugates the brain.

Neuroscience 9, 46–57.

Dodd AN, Kudla J, Sanders D (2010). The language of

calcium signaling. Annual Review of Plant Biology 61,

593–620.

du Lac S, Raymond JL, Sejnowski TJ, Lisberger SG (1995).

Learning and memory in the vestibulo-ocular reflex.

Annual Review of Neurosciences 18, 409–441.

Elangbam CS, Qualls CWJ, Dahlgren RR (1997). Cell

adhesion molecules – update. Veterinary Pathology 34,

61–73.

Feske S (2007). Calcium signalling in lymphocyte

activation and disease. Nature Reviews Immunology 7,

690–702.

Gene Ontology Consortium (2007). The gene ontology

project in 2008. Nucleic Acids Research 36, D440–D444.

Ghosh A, Greenberg ME (1995). Calcium signaling in

neurons : molecular mechanisms and cellular

consequences. Science 268, 239–247.

Greenwald BS, Kramer-Ginsberg E, Krishnan KRR,

Ashtari M, et al. (1998). Neuroanatomic localization of

magnetic resonance imaging signal hyperintensities in

geriatric depression. Stroke 29, 613–617.

Jia P, Wang L, Meltzer HY, Zhao Z (2010a). Common

variants conferring risk of schizophrenia : a

pathway analysis of GWAS data. Schizophrenia Research

122, 38–42.

Jia P, Wang L, Meltzer HY, Zhao Z (2010b). Pathway-based

analysis of GWAS datasets : effective but caution required.

International Journal of Neuropsychopharmacology 14,

567–572.

Kanehisa M, Goto S, Furumichi M, Tanabe M, et al. (2010).

KEGG for representation and analysis of molecular

networks involving diseases and drugs. Nucleic Acids

Research 38, D355–D360.

Kao C-F, Fang Y-S, Zhao Z, Kuo P-H (2011). Prioritization

and evaluation of depression candidate genes by

combining multidimensional data resources. PLoS ONE 6,

1–9.

Lamora A, Voigt MM (2009). Cranial sensory ganglia

neurons require intrinsic N-cadherin function for guidance

of afferent fibers to their final targets. Neuroscience 159,

1175–1184.

Malleret G, Alarcon JM, Martel G, Takizawa S, et al.

(2010). Bidirectional regulation of hippocampal long-

term synaptic plasticity and its influence on

opposing forms of memory. Journal of Neuroscience 30,

3813–3825.

Massey PV, Bashir ZI (2007). Long-term depression :

multiple forms and implications for brain function. Trends

in Neurosciences 30, 176–184.

Merlo E, Freudenthal R, Romano A (2002). The IkB kinase

inhibitor sulfasalazine impairs long-term memory in the

crab Chasmagnathus. Neuroscience 112, 161–172.

Muglia P, Tozzi F, Galwey N, Francks C, et al. (2010).

Genome-wide association study of recurrent major

depressive disorder in two European case-control cohorts.

Molecular Psychiatry 15, 589–601.

Nagy ZS, Esiri MM, Smith AD (1998). The cell division cycle

and the pathophysiology of Alzheimer’s disease.

Neuroscience 87, 731–739.

Nicholls RE, Alarcon JM, Malleret GL, Carroll RC, et al.

(2008). Transgenic mice lacking NMDAR-dependent

LTD exhibit deficits in behavioral flexibility. Neuron 58,

104–117.

Pardo CA, Vargas DL, Zimmerman AW (2005). Immunity,

neuroglia and neuroinflammation in autism. International

Review of Psychiatry 17, 485–495.

Parker RJ, Auld VJ (2004). Signaling in glial development :

differentiation migration and axon guidance. Biochemistry

and Cell Biology 82, 694–707.

Piller F, Piller V, Fox RI, Fukud M (1988). Human

T-lymphocyte activation is associated with changes in

O-glycan biosynthesis. Journal of Biological Chemistry 263,

15 146–15 150.

Pittenger C, Duman RS (2008). Stress, depression, and

neuroplasticity : a convergence of mechanisms.

Neuropsychopharmacology 33, 88–109.

Popoli M, Gennarelli M, Racagni G (2002). Modulation of

synaptic plasticity by stress and antidepressants. Bipolar

Disorders 4, 166–182.

Shi J, Potash J, Knowles J, Weissman M, et al. (2011).

Genome-wide association study of recurrent early-onset

major depressive disorder. Molecular Psychiatry 16,

193–201.

Shuto T, Xu H, Wang B, Han J, et al. (2001). Activation of

NF-kB by nontypeable Hemophilus influenzae is mediated

by toll-like receptor 2-TAK1-dependent NIK-IKKa/B-IkBa

and MKK3/6-p38 MAP kinase signaling pathways in

epithelial cells. Proceedings of the National Academy of

Sciences USA 98, 8774–8779.

1410 C.-F. Kao et al.

Page 12: The International Journal of …...Enriched pathways for major depressive disorder identified from a genome-wide association study Chung-Feng Kao 1*, Peilin Jia2,3*, Zhongming Zhao2,3

Shyn S, Shi J, Kraft J, Potash J, et al. (2011). Novel loci for

major depression identified by genome-wide association

study of Sequenced Treatment Alternatives to Relieve

Depression and meta-analysis of three studies. Molecular

Psychiatry 16, 202–215.

Stephens M, Balding DJ (2009). Bayesian statistical methods

for genetic association studies. Nature Reviews Genetics 10,

681–690.

Subramanian A, Tamayo P, Mootha VK, Mukherjeed S,

et al. (2005). Gene set enrichment analysis : a knowledge-

based approach for interpreting genome-wide expression

profiles. Proceedings of the National Academy of Sciences USA

102, 15 545–15 550.

Sullivan PF, de Geus EJC, Willemsen G, James MR,

et al. (2009). Genome-wide association for major

depressive disorder : a possible role for the

presynaptic protein piccolo. Molecular Psychiatry 14,

359–375.

Thomas AJ, Ferrier IN, Kalaria RN, Davis S, et al. (2002).

Cell adhesion molecule expression in the dorsolateral

prefrontal cortex and anterior cingulate cortex in major

depression in the elderly. British Journal of Psychiatry 181,

129–134.

Tintle NL, Borchers B, Brown M, Bekmetjev A (2009).

Comparing gene set analysis methods on single-nucleotide

polymorphism data from Genetic Analysis Workshop 16.

BMC Proceedings 3, 1–5.

Torkamani A, Topol EJ, Schork NJ (2008). Pathway analysis

of seven common diseases assessed by genome-wide

association. Genomics 92, 265–272.

Tsuboi S, Fukuda M (1997). Branched O-linked

oligosaccharides ectopically expressed in transgenic mice

reduce primary Tcell immune responses. EMBO Journal 16,

6364–6373.

Vig M, Kine J-P (2009). Calcium signaling in immune cells.

Nature Immunology 10, 21–27.

Wang K, Li M, Bucan M (2007). Pathway-based approaches

for analysis of genome-wide association studies. American

Journal of Human Genetics 81, 1278–1283.

Wang K, Li M, Hakonarson H (2010). Analysing biological

pathways in genome-wide association studies. Nature

Reviews Genetics 11, 843–854.

Wang L, Jia P, Wolfingerd RD, Chene X, et al. (2011).

Gene set analysis of genome-wide association studies :

methodological issues and perspectives. Genomics 98,

1–8.

Wray N, Pergadia M, Blackwood D, Penninx B, et al. (2010).

Genome-wide association study of major depressive

disorder : new results, meta-analysis, and lessons learned.

Molecular Psychiatry. Published online : 2 November 2010.

doi :10.1038/mp.2010.109.

Zaykin DV, Zhivotovsky LA, Czika W, Shao S, et al. (2007).

Combining p values in large scale genomics experiments.

Pharmaceutical Statistics 6, 217–226.

Enriched pathways for MDD in a GWA study 1411