A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\ example

Upload: william-blevins

Post on 31-Dec-2015


Page 1: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

A General Modeling Framework for Studying Candidate Genes

Copy files from f:\edwin\example

Page 2: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Why a general modeling framework?

• Candidate genes for quantitative traits are usually studied for a “main effect” on the mean.

• Genetic advantages of a more extensive modeling framework
  – Some candidate genes may be more likely to be detected
    • One reason is power, e.g. pleiotropic genes are easier to detect in a multivariate study
    • Some genes may not work in a simple “main effect” fashion, e.g. they exert their effects in severely deprived environments only, or influence the sensitivity to environmental fluctuations (variance)

• Correct tests? E.g. different genotypic variances in selected samples

Page 3: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

• Substantive advantages of a general modeling framework
  – More extensive picture of genetic effects
  – Shed new light on traditional research questions
    • Continuity, change, and heterotypy
    • Comorbidity/pleiotropy
    • Complex traits: causal mechanisms involving multiple factors
  – New issues: the interplay between genotypes and environment
    • Vulnerability, resilience, and protective factors
    • Risk behavior and the construction of favorable environments
    • Sensitivity to environmental fluctuations
  – Instrumental function due to unique properties

Page 4: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Requirements modeling framework

• Genetic effects on the means, variances, and relations between variables

• Stratification effects on all these components

• Nuclear families of various sizes

• Interpretable parameterization

• Di- and multi-allelic loci, marker haplotypes, multiple loci simultaneously, and parental genotypes

• Easy to fit in existing (Mx) software

Page 5: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

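LISREL based model

$\eta(s) = \alpha_{jk(s)} + B_{jk(s)}\,\eta(s) + \Gamma_{jk(s)}\,\xi + \zeta_{jk(s)}$

$y(s) = \nu^{y}_{jk(s)} + \Lambda^{y}_{jk(s)}\,\eta(s) + \varepsilon^{y}_{jk(s)}$

$x = \nu^{x}_{k} + \Lambda^{x}_{k}\,\xi + \varepsilon^{x}_{k}$

y: subject variables

x: family variables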

Page 6: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Names, Symbols and Function of Model Matrices

Subject (=y) variables, structural part

Name     Symbol             Function
Alpha    $\alpha_{jk}$      Means
Beta     $B_{jk}$           Causal effects of subject variables on each other
Gamma    $\Gamma_{jk}$      Causal effects of family variables on subject variables
Psi      $\Psi^{y}_{jk}$    Diagonal: residual variances; off-diagonal: residual covariances

Page 7: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Names, Symbols and Function of Model Matrices

Subject (=y) variables, measurement part

Name     Symbol               Function
Nu       $\nu^{y}_{jk}$       Intercepts or means of indicators
Lambda   $\Lambda^{y}_{jk}$   Factor loadings of indicators
Theta    $\Theta^{y}_{jk}$    Diagonal: variances of errors of measurement; off-diagonal: covariances between errors of measurement
C        $C_{k}$              Covariances between y variables of subjects from same family

Page 8: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Names, Symbols and Function of Model Matrices

Family (=x) variables

Name     Symbol              Function
Psi      $\Psi^{x}_{k}$      Diagonal: variances; off-diagonal: covariances
Nu       $\nu^{x}_{k}$       Intercepts or means of indicators
Lambda   $\Lambda^{x}_{k}$   Factor loadings of indicators
Theta    $\Theta^{x}_{k}$    Diagonal: variances of errors of measurement; off-diagonal: covariances between errors of measurement

Page 9: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

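Alternative Models

• Conditional model

$\eta(s) = \alpha_{jk(s)} + B_{jk(s)}\,\eta(s) + \Gamma_{jk(s)}\,x_s + \zeta_{jk(s)}$

$y(s) = \nu_{jk(s)} + \Lambda_{jk(s)}\,\eta(s) + K_{jk(s)}\,x_s + \varepsilon_{jk(s)}$

(The symbol of the coefficient matrix multiplying $x_s$ in the second equation did not survive in the transcript; $K$ is assumed here.)

  – The x-variables are the independent variables: subject plus family variables
  – Relaxes the assumption of full multivariate normality
  – Allows curvilinear or non-linear effects of the x-variables
  – Disadvantages:
    - Optimization
    - Measurement model for the x-variables

• Other modeling frameworks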

Page 10: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Partitioning parameter matrices

• Most matrices are partitioned into:

– a) general matrices, not subscripted, that represent the overall model in all genotype groups and population strata

– b) genetic matrices, subscripted j, that represent deviations from the general model caused by locus effects

– c) stratification matrices, subscripted k, that represent deviations from the general model caused by population stratification

Page 11: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

How?

• Example matrix Beta: Causal effects of subject variables on each other

$B_{jk(s)} = B + B_{j}(g_s \otimes I) + B_{k}(f \otimes I)$

• Main effects are in $B$, which has dimension $n \times n$.

Page 12: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

• Genetic effects are in the term $B_{j}(g_s \otimes I)$

– The $n_g \times 1$ vector $g_s$ contains the $n_g$ dummy variables coding the genotype (haplotype) of subject s

  • the maximum number of deviations from $B$ is thus #genotypes - 1

  • sets of dummy variables can be used to study multiple loci simultaneously or effects of parental genotypes

– $B_{j} = [B_{1} | B_{2} | \dots | B_{n_g}]$ has dimension $n \times (n_g n)$,

  • where $B_{1}$ is the $n \times n$ submatrix containing the effects of the first dummy variable, etc.

Page 13: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Example

Coding of the genotype dummy variables:

        A1A1  A1A2  A2A2
G1        1     0    -1
G2        0     1     0

$B = \begin{pmatrix} 0 & 0 \\ \beta_{21} & 0 \end{pmatrix}$

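Page 14: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

A1A1 subjects

$g_s \otimes I = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}$

$B_{j}(g_s \otimes I) = \begin{pmatrix} 0 & 0 & 0 & 0 \\ \beta_{21(1)} & 0 & \beta_{21(2)} & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ \beta_{21(1)} & 0 \end{pmatrix}$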

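Page 15: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

A1A2 subjects

$g_s \otimes I = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{pmatrix}$

$B_{j}(g_s \otimes I) = \begin{pmatrix} 0 & 0 & 0 & 0 \\ \beta_{21(1)} & 0 & \beta_{21(2)} & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ \beta_{21(2)} & 0 \end{pmatrix}$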

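Page 16: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

A2A2 subjects

$g_s \otimes I = \begin{pmatrix} -1 \\ 0 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}$

$B_{j}(g_s \otimes I) = \begin{pmatrix} 0 & 0 & 0 & 0 \\ \beta_{21(1)} & 0 & \beta_{21(2)} & 0 \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & -1 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ -\beta_{21(1)} & 0 \end{pmatrix}$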
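The selection performed by the Kronecker product with I in the worked example for A1A1, A1A2 and A2A2 subjects can be checked numerically. What follows is a minimal sketch in plain Python, not part of the original slides and not Mx code. The helper names (`kron`, `matmul`, `beta_for_subject`) and the numeric values chosen for the Beta elements 21, 21(1) and 21(2) are illustrative assumptions.

```python
# Illustrative sketch: matrices are plain lists of rows.

def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [x * y for x in row_a for y in row_b]
        for row_a in a
        for row_b in b
    ]


def matmul(a, b):
    """Ordinary matrix product a @ b."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def beta_for_subject(B, B_j, g):
    """Return B + B_j (g kron I) for one subject with dummy vector g."""
    n = len(B)
    identity = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    g_col = [[value] for value in g]                 # ng x 1 column vector
    deviation = matmul(B_j, kron(g_col, identity))   # n x n
    return [[B[r][c] + deviation[r][c] for c in range(n)] for r in range(n)]


# Dummy coding from the slides: (G1, G2) per genotype.
CODING = {"A1A1": (1, 0), "A1A2": (0, 1), "A2A2": (-1, 0)}

# Assumed values: beta21 = 0.5, beta21(1) = 0.25, beta21(2) = -0.125.
B = [[0.0, 0.0], [0.5, 0.0]]
B_j = [[0.0, 0.0, 0.0, 0.0], [0.25, 0.0, -0.125, 0.0]]  # [B_1 | B_2]

for genotype, g in CODING.items():
    print(genotype, beta_for_subject(B, B_j, g))
```

With this coding the effect of y1 on y2 is the general value plus the first deviation for A1A1, plus the second deviation for A1A2, and minus the first deviation for A2A2.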

Page 17: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Stratification effects are in the term $B_{k}(f \otimes I)$

• The $n_f \times 1$ vector $f$ contains the $n_f$ dummy variables used to code family types
  – the maximum number of deviations is thus #family types - 1

• $B_{k} = [B_{1} | B_{2} | \dots | B_{n_f}]$ has dimension $n \times (n_f n)$,
  – where $B_{1}$ is the $n \times n$ submatrix containing the effects of the first dummy variable, etc.

• $\otimes$ and $I$ select the proper matrix for each dummy variable

Page 18: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Sibling pairs

                     SubjectA  SubjectB    F1  F2  F3  F4  F5
Not informative         2         2         1   0   0   0   0
of stratification       1         1         0   1   0   0   0
                        0         0         0   0   1   0   0

Informative             2         1         0   0   0   1   0
of stratification       2         0         0   0   0   0   1
                        1         0         0   0   0   0   0
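The sibling-pair table can be read as a lookup from the two siblings' genotypes to the family-type dummy vector f. A small sketch under stated assumptions: the genotype values are taken to be counts of one allele (as in the footnote of the later family-type table), and the names `SIB_PAIR_TYPES`, `family_dummies` and `informative` are hypothetical, not part of MxScript.

```python
# Family-type dummies (F1..F5) for sibling pairs, keyed by the pair of
# allele counts with the larger count first. The pair (1, 0) is the
# reference category and gets all zeros.
SIB_PAIR_TYPES = {
    (2, 2): (1, 0, 0, 0, 0),
    (1, 1): (0, 1, 0, 0, 0),
    (0, 0): (0, 0, 1, 0, 0),
    (2, 1): (0, 0, 0, 1, 0),
    (2, 0): (0, 0, 0, 0, 1),
    (1, 0): (0, 0, 0, 0, 0),
}


def family_dummies(count_a, count_b):
    """Return the dummy vector f for a sibling pair, ignoring sibling order."""
    key = (max(count_a, count_b), min(count_a, count_b))
    return SIB_PAIR_TYPES[key]


def informative(count_a, count_b):
    """In the table, the informative family types are the pairs whose genotypes differ."""
    return count_a != count_b


print(family_dummies(1, 2), informative(1, 2))
```

Six family types need only five dummy variables, which matches the statement that the maximum number of deviations is the number of family types minus one.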

Page 19: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Two parents, one “child”

                     ParentA  ParentB  SubjectA    F1  F2  F3  F4  F5
Not informative         2        2        2         1   0   0   0   0
of stratification       2        0        1         0   1   0   0   0
                        0        0        0         0   0   1   0   0

Informative             2        1        2         0   0   0   1   0
of stratification                         1         0   0   0   1   0
                        1        1        2         0   0   0   0   1
                                          1         0   0   0   0   1
                                          0         0   0   0   0   1
                        1        0        1         0   0   0   0   0
                                          0         0   0   0   0   0

Page 20: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Family Types in a Sample of Singletons and Pairs of Siblings With or Without Genotyped Parents (a)

S = subject, P = parent. A blank parent cell repeats the parental genotypes of the row above.

                     Parents not genotyped      Parents genotyped
                     One subject  Two subjects  One subject   Two subjects
                     S1           S1  S2        P1  P2  S1    P1  P2  S1  S2
Not informative      2            2   2         2   2   2     2   2   2   2
of stratification    1            1   1         2   0   1     2   0   1   1
                     0            0   0         0   0   0     0   0   0   0

Informative                       2   1         2   1   2     2   1   2   2
of stratification                 2   0                 1             2   1
                                  1   0         1   1   2             1   1
                                                        1     1   1   2   2
                                                        0             2   1
                                                1   0   1             2   0
                                                        0             1   1
                                                                      1   0
                                                                      0   0
                                                              1   0   1   1
                                                                      0   1
                                                                      0   0

(a) The cells list the number of A1 alleles.

Page 21: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

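Subject (=y) variables, structural part

$\alpha_{jk(s)} = \alpha + \alpha_{j}\,g_s + \alpha_{k}\,f$

with dimension $\alpha$ is $n \times 1$, $\alpha_{j}$ is $n \times n_g$, $\alpha_{k}$ is $n \times n_f$

$B_{jk(s)} = B + B_{j}(g_s \otimes I) + B_{k}(f \otimes I)$

with dimension $B$ is $n \times n$, $B_{j}$ is $n \times (n_g n)$, $B_{k}$ is $n \times (n_f n)$

$\Gamma_{jk(s)} = \Gamma + \Gamma_{j}(g_s \otimes I) + \Gamma_{k}(f \otimes I)$

with dimension $\Gamma$ is $n \times n$, $\Gamma_{j}$ is $n \times (n_g n)$, $\Gamma_{k}$ is $n \times (n_f n)$

$\Psi^{y}_{jk(s)} = \Psi^{y} + \Psi^{y}_{j}(g_s \otimes I) + \Psi^{y}_{k}(f \otimes I)$

with dimension $\Psi^{y}$ is $n \times n$, $\Psi^{y}_{j}$ is $n \times (n_g n)$, $\Psi^{y}_{k}$ is $n \times (n_f n)$

Other matrices are partitioned in the same way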
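For the means, the partition is a plain sum of a general part, a genetic deviation and a stratification deviation. The sketch below illustrates this for two subject variables, two genotype dummies and five family-type dummies. All numeric values and the function names `mat_vec` and `subject_alpha` are assumptions made for illustration.

```python
def mat_vec(matrix, vector):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def subject_alpha(alpha, alpha_j, alpha_k, g, f):
    """General means plus genetic and stratification deviations."""
    genetic = mat_vec(alpha_j, g)          # n x ng times ng x 1
    stratification = mat_vec(alpha_k, f)   # n x nf times nf x 1
    return [a + dg + ds for a, dg, ds in zip(alpha, genetic, stratification)]


# Assumed example values (n = 2, ng = 2, nf = 5).
alpha = [1.0, 2.0]
alpha_j = [[0.5, 0.25],
           [0.0, 0.0]]
alpha_k = [[0.125, 0.0, 0.0, 0.0, 0.0],
           [0.0, 0.25, 0.0, 0.0, 0.0]]

# A subject coded (1, 0) on the genotype dummies in family type F1.
print(subject_alpha(alpha, alpha_j, alpha_k, (1, 0), (1, 0, 0, 0, 0)))
```

Setting every element of the genetic or the stratification matrix to zero gives the nested models that are later compared in the significance tests.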

Page 22: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

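Subject (=y) variables, measurement part

$\nu^{y}_{jk(s)} = \nu^{y} + \nu^{y}_{j}\,g_s + \nu^{y}_{k}\,f$

with dimension $\nu^{y} = n_y \times 1$, $\nu^{y}_{j} = n_y \times n_g$, $\nu^{y}_{k} = n_y \times n_f$

$\Lambda^{y}_{jk(s)} = \Lambda^{y} + \Lambda^{y}_{j}(g_s \otimes I) + \Lambda^{y}_{k}(f \otimes I)$

with dimension $\Lambda^{y} = n_y \times n$, $\Lambda^{y}_{j} = n_y \times (n_g n)$, $\Lambda^{y}_{k} = n_y \times (n_f n)$

$\Theta^{y}_{jk(s)} = \Theta^{y} + \Theta^{y}_{j}(g_s \otimes I_{y}) + \Theta^{y}_{k}(f \otimes I_{y})$

with dimension $\Theta^{y} = n_y \times n_y$, $\Theta^{y}_{j} = n_y \times (n_g n_y)$, $\Theta^{y}_{k} = n_y \times (n_f n_y)$

Covariance between subjects from same family:

$\Sigma_{k(s=A,s=B)} = C + C_{k}(f \otimes I_{y})$

with dimension $C = n_y \times n_y$, $C_{k} = n_y \times (n_f n_y)$.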

Page 23: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

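Family (=x) variables:

In each equation the left-hand side is the full matrix for a family type and the subscripted matrix on the right-hand side holds the stratification deviations.

$\Psi^{x}_{k} = \Psi^{x} + \Psi^{x}_{k}(f \otimes I)$

with dimension $\Psi^{x}$ is $n \times n$, $\Psi^{x}_{k}$ is $n \times (n_f n)$

$\nu^{x}_{k} = \nu^{x} + \nu^{x}_{k}\,f$

with dimension $\nu^{x} = n_x \times 1$, $\nu^{x}_{k} = n_x \times n_f$

$\Lambda^{x}_{k} = \Lambda^{x} + \Lambda^{x}_{k}(f \otimes I)$

with dimension $\Lambda^{x} = n_x \times n$, $\Lambda^{x}_{k} = n_x \times (n_f n)$

$\Theta^{x}_{k} = \Theta^{x} + \Theta^{x}_{k}(f \otimes I_{x})$

with dimension $\Theta^{x} = n_x \times n_x$, $\Theta^{x}_{k} = n_x \times (n_f n_x)$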

Page 24: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

General interpretation

• Genetic effects on:

– means are “main” effects

– relations between variables are interaction effects

– residuals are variance effects

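Page 25: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Simple example

[Path diagram: Genotype, coded by the dummy variables G1 and G2, affects y1 and y2 through $\alpha_{1(1)}$, $\alpha_{1(2)}$, $\alpha_{2(1)}$ and $\alpha_{2(2)}$; y1 and y2 affect each other through $\beta_{12}$ and $\beta_{21}$; the residual variances and covariance are $\psi_{11}$, $\psi_{22}$ and $\psi_{21}$.]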

Page 26: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

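$y = \alpha + \alpha_{j}\,g + B\,y + \zeta$

$E(\zeta \zeta^{t}) = \Psi^{y}$

or,

$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} + \begin{pmatrix} \alpha_{1(1)} & \alpha_{1(2)} \\ \alpha_{2(1)} & \alpha_{2(2)} \end{pmatrix} \begin{pmatrix} G_1 \\ G_2 \end{pmatrix} + \begin{pmatrix} 0 & \beta_{12} \\ \beta_{21} & 0 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} + \begin{pmatrix} \zeta_1 \\ \zeta_2 \end{pmatrix}$

or,

$y_1 = \alpha_1 + \alpha_{1(1)} G_1 + \alpha_{1(2)} G_2 + \beta_{12}\,y_2 + \zeta_1$

$y_2 = \alpha_2 + \alpha_{2(1)} G_1 + \alpha_{2(2)} G_2 + \beta_{21}\,y_1 + \zeta_2$

$\Psi^{y} = \begin{pmatrix} \psi_{11} & \\ \psi_{21} & \psi_{22} \end{pmatrix}$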
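Because y1 and y2 appear on both sides of the two equations, the expected means per genotype follow from solving them jointly, which for two variables has a simple closed form. The sketch below assumes residuals with mean zero and uses made-up parameter values. The function name `expected_means` is hypothetical.

```python
def expected_means(alpha, alpha_j, beta12, beta21, g):
    """Solve the two simultaneous equations for the expected y1 and y2."""
    # Genotype-specific intercepts: alpha + alpha_j g
    a1 = alpha[0] + alpha_j[0][0] * g[0] + alpha_j[0][1] * g[1]
    a2 = alpha[1] + alpha_j[1][0] * g[0] + alpha_j[1][1] * g[1]
    # I - B = [[1, -beta12], [-beta21, 1]] has determinant 1 - beta12 * beta21
    det = 1.0 - beta12 * beta21
    if det == 0.0:
        raise ValueError("I - B is singular; the means are not defined")
    return ((a1 + beta12 * a2) / det, (beta21 * a1 + a2) / det)


# Assumed values: the gene acts on y1 only, and y1 affects y2.
alpha = (1.0, 2.0)
alpha_j = [[0.5, 0.25],
           [0.0, 0.0]]
CODING = {"A1A1": (1, 0), "A1A2": (0, 1), "A2A2": (-1, 0)}

for genotype, g in CODING.items():
    print(genotype, expected_means(alpha, alpha_j, 0.0, 0.5, g))
```

In this parameter choice the genotype means of y2 differ even though the gene has no direct effect on y2, because the effect is passed on through y1.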

Page 27: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

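Genetic effects on y1

[Figure: mean of y1 by genotype, ordered A2A2, A1A2, A1A1, with deviations $-\alpha_{1(1)}$, $\alpha_{1(2)}$ and $\alpha_{1(1)}$ from $\alpha_{1}$.]

        A1A1  A1A2  A2A2
G1        1     0    -1
G2        0     1     0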

Page 28: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Additive model

[Path diagram: Genotype (G1, G2) affects y2 through $\alpha_{2(1)}$ and $\alpha_{2(2)}$; y1 affects y2 through $\beta_{21}$.]

Page 29: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Mediator model

[Path diagram: Genotype (G1, G2) affects y1 through $\alpha_{1(1)}$ and $\alpha_{1(2)}$; y1 affects y2 through $\beta_{21}$.]

Page 30: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Reversed effect model

[Path diagram: Genotype (G1, G2) affects y2 through $\alpha_{2(1)}$ and $\alpha_{2(2)}$; y2 affects y1 through $\beta_{12}$.]

Page 31: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Common gene model

[Path diagram: Genotype (G1, G2) affects both y1 and y2, through $\alpha_{1(1)}$, $\alpha_{1(2)}$, $\alpha_{2(1)}$ and $\alpha_{2(2)}$.]

Page 32: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

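Interactions

$y = \alpha + \alpha_{j}\,g + B\,y + B_{j}(g \otimes I)\,y + \zeta$

$B_{j} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ \beta_{21(1)} & 0 & \beta_{21(2)} & 0 \end{pmatrix}$

Applied to additive model:

$y_1 = \alpha_1 + \zeta_1$

$y_2 = \alpha_2 + \alpha_{2(1)} G_1 + \alpha_{2(2)} G_2 + \beta_{21}\,y_1 + \beta_{21(1)}\,y_1 G_1 + \beta_{21(2)}\,y_1 G_2 + \zeta_2$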
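Collecting the terms that multiply y1 in the equation for y2 gives a genotype-specific slope: the general effect plus the two interaction parameters weighted by G1 and G2. A short sketch with assumed values follows. The function name `slope_y2_on_y1` is hypothetical.

```python
# Dummy coding (G1, G2) per genotype, as in the example coding table.
CODING = {"A1A1": (1, 0), "A1A2": (0, 1), "A2A2": (-1, 0)}


def slope_y2_on_y1(beta21, beta21_g1, beta21_g2, genotype):
    """Effect of y1 on y2 for one genotype: beta21 + beta21(1) G1 + beta21(2) G2."""
    g1, g2 = CODING[genotype]
    return beta21 + beta21_g1 * g1 + beta21_g2 * g2


# Assumed case: first interaction parameter positive, second one zero.
for genotype in ("A1A1", "A1A2", "A2A2"):
    print(genotype, slope_y2_on_y1(0.5, 0.25, 0.0, genotype))
```

With the second interaction parameter at zero the heterozygote slope equals the general effect and the two homozygote slopes lie at equal distances on either side of it.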

Page 33: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

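[Figure: lines for genotypes A1A1, A1A2 and A2A2 (from top to bottom), both axes labeled y, for the case $\beta_{21(1)} > 0$ and $\beta_{21(2)} = 0$.]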

Page 34: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

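[Figure: lines for genotypes A2A2, A1A2 and A1A1 (from top to bottom), both axes labeled y, for the case $\beta_{21(1)}$ and $\beta_{21(2)} > 0$.]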

Page 35: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Estimation and specification in Mx

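Page 36: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Expected means and covariances, single subject

$E(y(s)) = \mu^{y}_{jk(s)} = \nu^{y}_{jk(s)} + \Lambda^{y}_{jk(s)} (I - B_{jk(s)})^{-1} \alpha_{jk(s)}$

$E(x) = \mu^{x}_{k} = \nu^{x}_{k}$

$E((y(s) - \mu^{y}_{jk(s)})(y(s) - \mu^{y}_{jk(s)})^{t}) = \Sigma^{y}_{jk(s)} = \Lambda^{y}_{jk(s)} (I - B_{jk(s)})^{-1} (\Gamma_{jk(s)} \Psi^{x}_{k} \Gamma^{t}_{jk(s)} + \Psi^{y}_{jk(s)}) ((I - B_{jk(s)})^{-1})^{t} (\Lambda^{y}_{jk(s)})^{t} + \Theta^{y}_{jk(s)}$

$E((x - \mu^{x}_{k})(x - \mu^{x}_{k})^{t}) = \Sigma^{x}_{k} = \Lambda^{x}_{k} \Psi^{x}_{k} (\Lambda^{x}_{k})^{t} + \Theta^{x}_{k}$

$E((y(s) - \mu^{y}_{jk(s)})(x - \mu^{x}_{k})^{t}) = \Sigma^{yx}_{jk(s)} = \Lambda^{y}_{jk(s)} (I - B_{jk(s)})^{-1} \Gamma_{jk(s)} \Psi^{x}_{k} (\Lambda^{x}_{k})^{t}$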

Page 37: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

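Expected means and covariances, whole family

Complete data vector $z^{t} = (x^{t}, y^{t})$:

$E(z^{t}) = \mu^{t}_{jk} = [(\mu^{x}_{k})^{t}, (\mu^{y}_{jk(s=1)})^{t}, \dots, (\mu^{y}_{jk(s=n_s)})^{t}]$

$E((z - \mu_{jk})(z - \mu_{jk})^{t}) = \Sigma_{jk} = \begin{pmatrix} \Sigma^{x}_{k} & \Sigma^{xy}_{jk(s=1)} & \cdots & \Sigma^{xy}_{jk(s=n_s)} \\ \Sigma^{yx}_{jk(s=1)} & \Sigma^{y}_{jk(s=1)} & \cdots & \Sigma_{k(s=A,s=B)} \\ \vdots & \vdots & \ddots & \vdots \\ \Sigma^{yx}_{jk(s=n_s)} & \Sigma_{k(s=A,s=B)} & \cdots & \Sigma^{y}_{jk(s=n_s)} \end{pmatrix}$

$\Sigma_{k(s=A,s=B)}$: covariances between subjects from same family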
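As an illustration of how the family-level covariance matrix is assembled from its blocks, the sketch below stacks the blocks for a sibling pair. It makes simplifying assumptions: no family (x) variables, and the within-subject and between-subject blocks are given directly as numbers. The function name `family_covariance` is hypothetical.

```python
def family_covariance(sigma_a, sigma_b, cross):
    """Stack the blocks [[sigma_a, cross], [cross transposed, sigma_b]]."""
    cross_t = [list(column) for column in zip(*cross)]
    top = [row_a + row_c for row_a, row_c in zip(sigma_a, cross)]
    bottom = [row_ct + row_b for row_ct, row_b in zip(cross_t, sigma_b)]
    return top + bottom


# Assumed values: two y variables per sibling.
sigma_y = [[1.0, 0.5],
           [0.5, 2.0]]
between_sibs = [[0.25, 0.125],
                [0.125, 0.5]]

for row in family_covariance(sigma_y, sigma_y, between_sibs):
    print(row)
```

For larger families the same stacking is repeated for every pair of subjects, and families with x variables gain an extra leading block row and column.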

Page 38: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Maximize the log-likelihood function given the observed data by raw maximum likelihood:

$\ln L(\theta; z) = \sum_{i=1}^{N} \ln L_i$

where the individual log-likelihoods equal

$\ln L_i = -\tfrac{1}{2}\{ n_{z_i} \log(2\pi) + \log|\Sigma_{jk}| + (z_i - \mu_{jk})^{t}\,\Sigma_{jk}^{-1}\,(z_i - \mu_{jk}) \}$

Minus two times the difference between the log-likelihoods of two nested models is asymptotically chi-square distributed, with the difference in the number of estimated parameters as the degrees of freedom.
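The raw maximum likelihood function can be sketched in a few lines. This is a plain-Python illustration under the assumption of multivariate normality with the usual factor of minus one half. It is not the Mx implementation, and the names `cholesky`, `family_loglik`, `sample_loglik` and `likelihood_ratio` are hypothetical.

```python
import math


def cholesky(sigma):
    """Lower-triangular Cholesky factor of a positive definite matrix."""
    n = len(sigma)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                diag = sigma[i][i] - partial
                if diag <= 0.0:
                    raise ValueError("covariance matrix must be positive definite")
                lower[i][j] = math.sqrt(diag)
            else:
                lower[i][j] = (sigma[i][j] - partial) / lower[j][j]
    return lower


def family_loglik(z, mu, sigma):
    """Log-likelihood of one family's data vector z given mu and sigma."""
    n = len(z)
    lower = cholesky(sigma)
    # log|Sigma| equals twice the sum of the logs of the Cholesky diagonal.
    log_det = 2.0 * sum(math.log(lower[i][i]) for i in range(n))
    # Forward substitution solves lower * w = (z - mu); w'w is the quadratic form.
    resid = [zi - mi for zi, mi in zip(z, mu)]
    w = []
    for i in range(n):
        w.append((resid[i] - sum(lower[i][k] * w[k] for k in range(i))) / lower[i][i])
    quad = sum(value * value for value in w)
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)


def sample_loglik(families):
    """Sum of family log-likelihoods; each family is a (z, mu, sigma) triple."""
    return sum(family_loglik(z, mu, sigma) for z, mu, sigma in families)


def likelihood_ratio(loglik_restricted, loglik_full, n_par_restricted, n_par_full):
    """Test statistic and degrees of freedom for two nested models."""
    statistic = -2.0 * (loglik_restricted - loglik_full)
    return statistic, n_par_full - n_par_restricted
```

Because the length of each data vector is taken per family, families of different sizes can contribute to the same sum, which is what makes the raw-data approach suitable for nuclear families of various sizes.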

Page 39: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Specification

– In most instances, specification amounts to selecting matrices

– Working out the dimensions of the matrices is boring and error-prone

– Get started

Therefore a simple program
– Batch or questions

Page 40: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

MxScript

• Data structure
  – Number of (latent) subject variables?
  – Number of subjects in largest family?
  – Number of dummy variables for genotypes?

• Matrices to be used
  – Do the subject variables have causal effects on each other? BETA?
  – GENETIC: causal relations between subject variables? BETA?
  – STRATIFICATION: means of subject variables? ALPHA?

• File names
  – Name of file with your data? (DOS name)
  – Name of the file for the Mx script? (DOS name)

Page 41: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Structure of the Mx script

• In most instances four groups

Group  Function                 Free parameters  Starting values
1      General part             yes              yes
2      Genetic effects          yes
3      Stratification effects   yes
4      Fit model to data

Type from DOS-prompt: MxScript <ENTER>

Type from DOS-prompt: MxScript input.dat <ENTER>

Page 42: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Example

• Name data file: example.dat

• Sibling pairs, no parents

• Three genotype groups

• Family variables in data file (indicate that you want to specify admixture effects)

• Starting values: sample drawn from multivariate distribution with means 0 and variances 1.5

Page 43: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

General part

[Path diagram: latent variables BMD and exercise, with indicators Hip, Arm, Spine, Duration and Intensity.]

Page 44: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

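Identification of the measurement model:

[Matrix $\Lambda^{y}$: entries fixed at 0 or 1 for identification; free loadings $\lambda_{42}$ and $\lambda_{52}$.]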

Page 45: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Genetic + Stratification effects

[Path diagram: latent variables BMD and exercise, with indicators Hip, Arm, Spine, Duration and Intensity.]

Common pathway?

Independent pathway?

Page 46: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Tests

Common pathway: Estimate a model with genetic and stratification effects on the means of the second latent variable and test for significance of:

1. Genetic effects

2. Stratification effects

3. Genetic + stratification effect

Independent pathway: Estimate a model with genetic and stratification effects on the means of the indicators of the second latent variable and test for significance of:

1. Genetic effects

2. Stratification effects

3. Genetic + stratification effect

Page 47: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Free elements

a Full 2 1 Free [Matrices-End matrices section]

Free a 1 1 a 2 1 [After End matrices - free elements]

Free a 1 1 to a 2 1 [After End matrices - free range]

Page 48: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Names, Symbols and Function of Model Matrices

Subject (=y) variables, structural part

Name     Symbol             Function
Alpha    $\alpha_{jk}$      Means
Beta     $B_{jk}$           Causal effects of subject variables on each other
Gamma    $\Gamma_{jk}$      Causal effects of family variables on subject variables
Psi      $\Psi^{y}_{jk}$    Diagonal: residual variances; off-diagonal: residual covariances

Page 49: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Names, Symbols and Function of Model Matrices

Subject (=y) variables, measurement part

Name     Symbol               Function
Nu       $\nu^{y}_{jk}$       Intercepts or means of indicators
Lambda   $\Lambda^{y}_{jk}$   Factor loadings of indicators
Theta    $\Theta^{y}_{jk}$    Diagonal: variances of errors of measurement; off-diagonal: covariances between errors of measurement
C        $C_{k}$              Covariances between y variables of subjects from same family

Page 50: A General Modeling Framework for Studying Candidate Genes Copy files from f:\edwin\example

Solution

Copy files from f:\edwin\solution