ASCA: analysis of multivariate data from an experimental design,
Biosystems Data Analysis group
Universiteit van Amsterdam
ANOVA
• different design factors contribute to the variation
For two treatments A and B the total sum of squares can be split into several contributions:
$SS_{total} = SS_A + SS_B + SS_{AB} + SS_{within}$
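The split of the total sum of squares can be checked numerically. The sketch below is only an illustration on simulated data: it assumes a balanced design (equal replicates per A-by-B cell), and the function name `two_way_ss` and the array layout are hypothetical, not taken from the slides.

```python
import numpy as np

def two_way_ss(y):
    """Sum-of-squares split for a balanced two-way design.

    y has shape (a, b, n): levels of A, levels of B, replicates per cell.
    """
    a, b, n = y.shape
    grand = y.mean()
    mean_a = y.mean(axis=(1, 2))   # mean per level of A
    mean_b = y.mean(axis=(0, 2))   # mean per level of B
    mean_ab = y.mean(axis=2)       # mean per A-by-B cell

    ss_total = ((y - grand) ** 2).sum()
    ss_a = b * n * ((mean_a - grand) ** 2).sum()
    ss_b = a * n * ((mean_b - grand) ** 2).sum()
    ss_ab = n * ((mean_ab - mean_a[:, None] - mean_b[None, :] + grand) ** 2).sum()
    ss_within = ((y - mean_ab[:, :, None]) ** 2).sum()
    return ss_total, ss_a, ss_b, ss_ab, ss_within

# Simulated data (assumption): 3 levels of A, 4 levels of B, 5 replicates.
rng = np.random.default_rng(0)
y = rng.normal(size=(3, 4, 5))
ss_total, ss_a, ss_b, ss_ab, ss_within = two_way_ss(y)
print(ss_total, ss_a + ss_b + ss_ab + ss_within)
```

For a balanced design the two printed numbers agree up to rounding; with unequal cell sizes the contributions are no longer orthogonal and this simple split does not hold.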
Example
Experiment:
Rats are given bromobenzene (BB), which affects the liver
Experimental Design:
Time: 6, 24 and 48 hours
Groups: 3 doses of BB, vehicle group, control group
Animals: 3 rats per dose per time point
[Figure: design schematic with rats labeled 111 to 313 at 6 hours, 24 hours and 48 hours]
[Figure: NMR spectrum; x-axis: chemical shift (ppm), 0 to 10; annotated peaks at 2.93, 2.7175, 2.075, 3.7525, 3.675, 3.0475, 5.38, 3.285, 2.055 and 3.0275 ppm]
Measurements: NMR spectroscopy of urine
NMR Spectroscopy
- Each type of H-atom has a specific Chemical shift
- The peak height reflects the number of H-atoms at this chemical shift, and therefore the metabolite concentration
- NMR measures ‘concentrations’ of different types of H-atoms
[Figure: NMR spectrum; x-axis: chemical shift (ppm), 0 to 10; annotated peaks at 2.93, 2.7175, 2.075, 3.7525, 3.675, 3.0475, 5.38, 3.285, 2.055 and 3.0275 ppm]
Different contributions
[Figure: metabolite concentration against time; panels: Trajectories, Time, Dose, Animal, Experimental Design]
The Method I: ANOVA
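Data:

Symbol   Meaning
k        Time
h        Dose group
i_h      Individual

$x_{hki_h} = \mu + \alpha_k + (\alpha\beta)_{hk} + (\alpha\beta\gamma)_{hki_h}$

Constraints:
$\sum_k \alpha_k = 0$, $\sum_h (\alpha\beta)_{hk} = 0$, $\sum_{i_h} (\alpha\beta\gamma)_{hki_h} = 0$

Estimates of these factors:
$x_{hki_h} = \bar{x}_{...} + (\bar{x}_{.k.} - \bar{x}_{...}) + (\bar{x}_{hk.} - \bar{x}_{.k.}) + (x_{hki_h} - \bar{x}_{hk.})$

[Figure: three panels plotted against time]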
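The estimates of the time, dose-by-time and individual contributions are plain differences of averages. A minimal sketch for one variable, assuming a balanced design stored as an array indexed (dose group h, time k, individual i); the simulated values and the group count of 5 (control, vehicle and three doses) are assumptions made for illustration:

```python
import numpy as np

# Simulated measurements of ONE variable (assumption): 5 groups, 3 times, 3 rats.
rng = np.random.default_rng(1)
H, K, I = 5, 3, 3
x = rng.normal(size=(H, K, I))

mu = x.mean()                                    # overall mean
mean_k = x.mean(axis=(0, 2))                     # mean per time point
mean_hk = x.mean(axis=2)                         # mean per dose group and time

alpha = mean_k - mu                              # time effect, shape (K,)
alphabeta = mean_hk - mean_k[None, :]            # dose-by-time effect, shape (H, K)
alphabetagamma = x - mean_hk[:, :, None]         # individual effect, shape (H, K, I)

# Adding the contributions back together recovers the data exactly.
recon = mu + alpha[None, :, None] + alphabeta[:, :, None] + alphabetagamma
print(np.allclose(recon, x))
```

In this balanced layout the effects sum to zero over k, over h and over i respectively, which is what the constraints require.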
The Method II
ANOVA is a univariate technique
x → X
$x_{hki_h} = \mu + \alpha_k + (\alpha\beta)_{hk} + (\alpha\beta\gamma)_{hki_h}$
MATRICES:
$X = 1m^T + X_\alpha + X_{\alpha\beta} + X_{\alpha\beta\gamma}$
$\|X\|^2 = \|1m^T\|^2 + \|X_\alpha\|^2 + \|X_{\alpha\beta}\|^2 + \|X_{\alpha\beta\gamma}\|^2$
For all values in the ANOVA equation, e.g.: $\alpha_k \to X_\alpha$
Structured!
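The same split can be written for a whole data matrix: every row of an effect matrix holds the averages that belong to that row's time point or dose-time cell. The sketch below uses simulated data and assumes a balanced design; the names `X_a`, `X_ab`, `X_abg` and the label arrays `dose` and `time` are illustrative choices, not the presenters' code.

```python
import numpy as np

rng = np.random.default_rng(2)
H, K, I, J = 5, 3, 3, 8                         # groups, times, rats, variables (assumed sizes)
dose = np.repeat(np.arange(H), K * I)           # dose-group label per row
time = np.tile(np.repeat(np.arange(K), I), H)   # time label per row
X = rng.normal(size=(H * K * I, J))             # simulated data matrix

m = X.mean(axis=0)                              # column means
Xc = X - m
X_a = np.zeros_like(Xc)                         # time effect
X_ab = np.zeros_like(Xc)                        # dose-by-time effect
for k in range(K):
    rows_k = time == k
    mean_k = Xc[rows_k].mean(axis=0)
    X_a[rows_k] = mean_k
    for h in range(H):
        rows_hk = rows_k & (dose == h)
        X_ab[rows_hk] = Xc[rows_hk].mean(axis=0) - mean_k
X_abg = Xc - X_a - X_ab                         # individual effect
ones_mT = np.tile(m, (X.shape[0], 1))           # the matrix 1 m^T

def ss(A):
    """Sum of squares of all elements of A."""
    return float((A ** 2).sum())

print(ss(X), ss(ones_mT) + ss(X_a) + ss(X_ab) + ss(X_abg))
```

Because the design is balanced, the four matrices are mutually orthogonal and their sums of squares add up to that of X.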
Multivariate Data
NMR Spectroscopy
[Figure: NMR spectrum; x-axis: chemical shift (ppm), 0 to 10; annotated peaks at 2.93, 2.7175, 2.075, 3.7525, 3.675, 3.0475, 5.38, 3.285, 2.055 and 3.0275 ppm]
Covariance between the variables
[Figure: scatter plot; x-axis: 2.05 ppm, y-axis: 6.01 ppm]
Or: relationship between the columns of X
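The covariance between two variables is the covariance between two columns of X. A small sketch on simulated data; the two columns merely stand in for two spectral positions, and the numbers are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 45                                          # number of samples (assumed)
x1 = rng.normal(size=n)                         # stand-in for one spectral variable
x2 = 0.1 * x1 + 0.005 * rng.normal(size=n)      # a second variable that co-varies with x1
X = np.column_stack([x1, x2])

# Covariance matrix of the columns, by hand and with NumPy.
Xc = X - X.mean(axis=0)
C_manual = Xc.T @ Xc / (n - 1)
C = np.cov(X, rowvar=False)
print(C)
```

The off-diagonal element of `C` is the covariance between the two columns; a univariate analysis of each column separately ignores it.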
The Method III: Principal Component Analysis
3D → 2D … Imagine! 350D → 2D !!!
[Figure: data points in three dimensions (axes x1, x2, x3) with the directions Loading PC 1 and Loading PC 2]
$X = 1m^T + TP^T + E$
T: scores, P: loadings, E: residuals
[Figure: Scores plot; x-axis: PC 1, y-axis: PC 2]
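A PCA model with scores, loadings and residuals can be computed with a singular value decomposition of the column-centered data. This is a minimal sketch on simulated data, with the sizes (45 samples, 350 variables) chosen only to echo the 350D to 2D example; the function name `pca` is a local helper, not a library call.

```python
import numpy as np

def pca(X, n_components):
    """Return column means m, scores T, loadings P and residuals E."""
    m = X.mean(axis=0)
    Xc = X - m
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    T = U[:, :n_components] * s[:n_components]   # scores
    P = Vt[:n_components].T                      # loadings (orthonormal columns)
    E = Xc - T @ P.T                             # part not explained by the model
    return m, T, P, E

rng = np.random.default_rng(3)
X = rng.normal(size=(45, 350))                   # simulated data (assumption)
m, T, P, E = pca(X, n_components=2)
print(T.shape, P.shape)
```

Plotting the first column of `T` against the second gives a scores plot; the columns of `P` show which variables drive each component.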
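The Method IV: ANOVA and PCA → ASCA
$X = 1m^T + X_\alpha + X_{\alpha\beta} + X_{\alpha\beta\gamma}$
Column spaces are orthogonal
$X = 1m^T + T_\alpha P_\alpha^T + T_{\alpha\beta} P_{\alpha\beta}^T + T_{\alpha\beta\gamma} P_{\alpha\beta\gamma}^T + E$
E: parts of the data not explained by the component models
[Figure: X split into submodels with scores $T_\alpha$, $T_{\alpha\beta}$, $T_{\alpha\beta\gamma}$ and loadings $P_\alpha$, $P_{\alpha\beta}$, $P_{\alpha\beta\gamma}$]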
In Words:
• ASCA models the different contributions to the variation in the data
• ASCA takes the covariance between the variables into account
• ASCA gives a solution for the problem at hand.
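Putting the two steps together, the data are first split into effect matrices as in ANOVA, and each effect matrix then gets its own PCA. The sketch below is one possible reading of that idea on simulated data, assuming a balanced design; the function names, the dictionary layout and the simulated dose effect are all illustrative assumptions.

```python
import numpy as np

def submodel_pca(Xf, n_components):
    """PCA of one effect matrix (already centered by construction)."""
    U, s, Vt = np.linalg.svd(Xf, full_matrices=False)
    return U[:, :n_components] * s[:n_components], Vt[:n_components].T

def asca(X, dose, time, n_components=1):
    m = X.mean(axis=0)
    Xc = X - m
    X_a, X_ab = np.zeros_like(Xc), np.zeros_like(Xc)
    for k in np.unique(time):
        rows_k = time == k
        mean_k = Xc[rows_k].mean(axis=0)
        X_a[rows_k] = mean_k                              # time effect
        for h in np.unique(dose):
            rows_hk = rows_k & (dose == h)
            X_ab[rows_hk] = Xc[rows_hk].mean(axis=0) - mean_k  # dose-by-time effect
    parts = {"alpha": X_a, "alphabeta": X_ab, "alphabetagamma": Xc - X_a - X_ab}
    model = {}
    for name, Xf in parts.items():
        T, P = submodel_pca(Xf, n_components)
        model[name] = {"X": Xf, "T": T, "P": P}
    return m, model

# Simulated experiment (assumption): 5 groups, 3 times, 3 rats, 20 variables.
rng = np.random.default_rng(5)
H, K, I, J = 5, 3, 3, 20
dose = np.repeat(np.arange(H), K * I)
time = np.tile(np.repeat(np.arange(K), I), H)
X = rng.normal(size=(H * K * I, J))
X[(dose == 4) & (time == 1), 0] += 3.0           # invented effect in one group at one time

m, model = asca(X, dose, time)
Xc = X - m
total = (Xc ** 2).sum()
percent = {name: 100 * (sub["X"] ** 2).sum() / total for name, sub in model.items()}
E = Xc - sum(sub["T"] @ sub["P"].T for sub in model.values())
print(percent)
```

All rats in the same dose-time cell share one score in the alphabeta submodel, so those scores can be plotted as one line per dose group against time.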
Results I
40 %
[Figure: the submodels $X_\alpha$, $X_{\alpha\beta}$ and $X_{\alpha\beta\gamma}$]
[Figure: αβ-scores; x-axis: Time (Hours) at 6, 24 and 48; y-axis: Scores; groups: control, vehicle, low, medium, high]
Results II
• Quantitative effect!
• No effect of vehicle
• Scores are in agreement with visual inspection
[Figure: scores; x-axis: Time (Hours) at 6, 24 and 48; y-axis: Scores; groups: control, vehicle, low, medium, high]
Results III: biomarkers
[Figure: three panels labeled α, αβ and αβγ; x-axis: chemical shift (ppm), 0 to 10; annotated peaks at 2.055, 2.075, 2.5425, 2.5825, 2.6975, 2.735, 2.91, 2.93, 3.0275, 3.0475, 3.2625, 3.285, 3.675, 3.73, 3.7525, 3.8875, 3.9675 and 5.38 ppm]
Differences between submodels
Interesting for Biology
Interesting for Diagnostics
Unique to the α submodel
Conclusions
• Metabolomics (and other –omics) techniques give multivariate datasets with an underlying experimental design
• For this type of data, ASCA can be used
• The results observed for this experiment are in accordance with clinical observations
• The metabolites that are responsible for this variation can be found using ASCA → BIOMARKERS
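One simple way to look for the variables behind a submodel is to rank them by the absolute value of their loading. The slides do not state how the biomarkers were selected, so the sketch below is only an assumption; the ppm axis and the loading values are hypothetical.

```python
import numpy as np

def top_loading_variables(p, ppm, n_top=3):
    """Return (ppm, loading) pairs for the n_top largest absolute loadings."""
    order = np.argsort(np.abs(p))[::-1][:n_top]
    return [(float(ppm[j]), float(p[j])) for j in order]

# Hypothetical chemical-shift axis and loadings of one submodel component.
ppm = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
p = np.array([0.1, -0.7, 0.05, 0.6, 0.2])
top = top_loading_variables(p, ppm, n_top=2)
print(top)
```

Comparing such rankings between the submodels shows which variables are shared and which are unique to one source of variation.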