Chemometric Methods for GC x GC. LCDR Gregory J. Hall and Glenn S. Frysinger, Department of Science, U.S. Coast Guard Academy, New London, Connecticut.

Post on 25-Feb-2016

Page 1: Department of Science U.S. Coast Guard Academy New London, Connecticut gregory.hall@uscga.edu

Department of Science
U.S. Coast Guard Academy
New London, Connecticut
gregory.hall@uscga.edu

LCDR Gregory J. Hall
Glenn S. Frysinger

Chemometric Methods for GC x GC

Page 2:

LCDR Gregory J. Hall

1995 B.S. Marine Science – U.S. Coast Guard Academy

1995 – 1997 Operations Officer, USCGC SPAR

1997-1998 M.S. Chemistry, Tufts University

1998-2000 Rotating Military Faculty, USCGA

2000 – Appointed to the PCTS

2002 – 2004 Ph.D. sabbatical, Tufts University

2006 – Ph.D. Chemistry, Tufts University: “Chemometric Characterization and Classification of Estuarine Water through Multidimensional Fluorescence”

Page 3:

Permanent Commissioned Teaching Staff (PCTS)

About 23 officers ranked from LT to CAPT

Provide the “interpreters” between the military and civilian faculty and leadership for the college

Teaching, Service, and Scholarship expected

Ph.D. required

Page 4:

LCDR Gregory J. Hall

Page 5:

What IS Chemometrics?

Chemometrics is the chemical discipline that uses mathematical, statistical and other methods employing formal logic to design or select optimal measurement procedures and experiments, and to provide maximum relevant chemical information by analyzing chemical data. (D. L. Massart, Chemometrics, Elsevier, NY, 1988)

Page 6:

Chemometrics already covered and to come

1. Difference Chromatograms

2. Property Modeling

3. Clustering

4. Chromatograph Prediction

5. Mass Spec searching

6. Template Construction

7. XICs

8. Retention Indices

You are all already chemometricians!

Page 7:

Today

1. Data Structures – How I view GC x GC data

2. Variance - PCA

3. Classification – SIMCA, PCR-DA

4. Regression – PLS

5. Peak Resolution - PARAFAC

6. Preprocessing – Alignment

7. The way forward, humble opinions

Page 8:

Data – GC x GC - FID

[Figure: a single chromatogram is a “two way” grid of intensity values (first dimension by second dimension), 3 dimensions once intensity is counted. A chromatogram stack X, with axes sample, first dimension, and second dimension (sizes labeled I, J, K), is a “three way” dataset / data object, 4 dimensions.]
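The data layout described on this slide can be sketched in NumPy. This is only an illustration: the sizes `I`, `J`, `K` and the random intensities are made up, and the sample-wise unfolding shown at the end is one common way (assumed here, not taken from the slide) of turning the stack into a matrix for the two-way methods that follow.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: I samples, J first-dimension points, K second-dimension points.
I, J, K = 4, 6, 5

# One GC x GC - FID chromatogram is a "two way" array of intensity values.
chromatogram = rng.random((J, K))

# Stacking chromatograms along a sample axis gives the "three way" array X.
X = np.stack([rng.random((J, K)) for _ in range(I)], axis=0)

# Sample-wise unfolding: each row is one whole chromatogram laid out as a vector.
X_unfolded = X.reshape(I, J * K)

# The unfolding loses nothing: reshaping back recovers the stack.
X_refolded = X_unfolded.reshape(I, J, K)

print(chromatogram.shape, X.shape, X_unfolded.shape)
```

Adding an m/z axis (GC x GC - TOF) would simply add one more array axis to `X`.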

Page 9:

Data – GC x GC -TOF

[Figure: dataset X with axes sample (date?), first dimension, second dimension, and m/z: a “four way” array, 5 dimensions!]

Page 10:

Principal Components Analysis (PCA)

[Figure: samples plotted against variable 1, variable 2, and variable 3, with the PC 1 and PC 2 axes and the T² and Q distances marked.]

Page 11:

Principal Components Analysis (PCA)

$X = T P^{T} + E$

[Figure: the data matrix X (rows = samples) equals the “model” T P^T, built from “components”, plus the residuals E.]

Goal - Variance capture
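As a rough illustration of the decomposition above, the following sketch computes scores T, loadings P, and residuals E with an SVD of the mean-centered data, then the per-sample Q and T² statistics. The toy data, the function name `pca_fit`, and the choice of mean-centering as the only preprocessing are assumptions made for the example, not details from the talk.

```python
import numpy as np

def pca_fit(X, n_components):
    """Mean-center X (samples x variables) and return a small PCA model."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Thin SVD: Xc = U S Vt, so scores T = U S and loadings P = V.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    T = U[:, :n_components] * S[:n_components]
    P = Vt[:n_components].T
    E = Xc - T @ P.T                      # residuals left out of the model
    explained = S**2 / np.sum(S**2)       # fraction of variance per component
    return {"mean": mean, "T": T, "P": P, "E": E,
            "explained": explained[:n_components]}

rng = np.random.default_rng(1)
# Toy data: 15 samples, 200 variables, dominated by two latent sources.
latent = rng.normal(size=(15, 2))
profiles = rng.normal(size=(2, 200))
X = latent @ profiles + 0.01 * rng.normal(size=(15, 200))

model = pca_fit(X, n_components=2)
T, P, E = model["T"], model["P"], model["E"]

# Per-sample fit statistics commonly reported alongside a PCA model.
Q = np.sum(E**2, axis=1)                  # squared residual per sample
score_var = T.var(axis=0, ddof=1)         # variance of each score column
T2 = np.sum(T**2 / score_var, axis=1)     # Hotelling's T^2 per sample

print("variance captured:", model["explained"].sum())
```

For unfolded GC x GC data with far more variables than samples, the thin SVD keeps the computation tied to the number of samples.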

Page 12:

Multi-way Principal Components Analysis (MPCA)

Wise, B. M.; Gallagher, N. B.; Bro, R.; Shaver, J. M.; Windig, W.; Koch, R. S. PLS Toolbox 4.0; Eigenvector Research, Inc.: Wenatchee, WA, 2006.

Our data: 15 x 410,000

Page 13:

GC × GC/MS TIC of Fire Debris

[Figure: GC × GC/MS total ion chromatogram; horizontal axis Time (min), vertical axis Time (s).]

6 clean carpet samples
5 gasoline samples
6 “doped” carpet samples

Page 14:

PCA Model Specifics

1. Only two carpet classes included

2. 4 PCs = 98% variance

3. Two random samples per class left out, all gasoline samples left out of “training set”

4. Left out samples “projected” onto the model later.

Page 15:

PC 1 - Loadings

[Figure: PC 1 loadings shown as a GC × GC image; horizontal axis Time (min), vertical axis Time (s).]

Red = positive loadings, correlated
Blue = negative loadings, anti-correlated

Page 16:

PC 2 - Loadings

[Figure: PC 2 loadings shown as a GC × GC image; horizontal axis Time (min), vertical axis Time (s).]

Chemically interpretable results!
Next step - classification

Page 17:

Principal Components Regression Discriminant Analysis (PCR-DA)

[Figure: the chromatogram stack X (each chromatogram plotted as Time (min) vs. Time (s)) is modeled with PC 1 and PC 2 (T² and Q marked) and regressed onto the class matrix Y, giving the Regression Vector.]

Y:

w/ accelerant | wo/ accelerant
0 | 1
0 | 1
0 | 1
0 | 1
1 | 0
1 | 0
1 | 0
1 | 0
1 | 0

Regression Vector
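A minimal sketch of PCR-DA, assuming mean-centering and a plain least-squares regression of the dummy class matrix Y on the PCA scores. The synthetic "carpet" and "accelerant" signals, the function names, and the 0.5 decision threshold are all hypothetical choices made for illustration.

```python
import numpy as np

def pcr_da_fit(X, Y, n_components):
    """PCR on a dummy class matrix Y; returns centering terms and regression matrix B."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T               # loadings
    T = Xc @ P                            # scores
    # Least squares of Y on the scores, mapped back to the original variables.
    C, *_ = np.linalg.lstsq(T, Yc, rcond=None)
    B = P @ C                             # one regression vector per class column
    return x_mean, y_mean, B

def pcr_da_predict(X_new, x_mean, y_mean, B):
    return (X_new - x_mean) @ B + y_mean

rng = np.random.default_rng(2)
n_vars = 300
base = rng.random(n_vars)                 # hypothetical carpet background
accelerant = np.zeros(n_vars)
accelerant[50:80] = 1.0                   # hypothetical accelerant signature

def sample(with_accelerant):
    return base + with_accelerant * accelerant + 0.05 * rng.normal(size=n_vars)

X_train = np.array([sample(0) for _ in range(4)] + [sample(1) for _ in range(5)])
# Columns of Y: [w/ accelerant, wo/ accelerant], laid out as on the slide.
Y_train = np.array([[0, 1]] * 4 + [[1, 0]] * 5, dtype=float)

x_mean, y_mean, B = pcr_da_fit(X_train, Y_train, n_components=2)
X_test = np.array([sample(0), sample(1)])
Y_hat = pcr_da_predict(X_test, x_mean, y_mean, B)
predicted_with_accelerant = Y_hat[:, 0] > 0.5
print(predicted_with_accelerant)
```

Each column of `B` can be refolded to the chromatogram shape and displayed as an image, which is how a regression vector can be inspected for chemically meaningful regions.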

Page 18:

Regression Vector

[Figure: regression vector shown as a GC × GC image; horizontal axis Time (min), vertical axis Time (s).]

Red = positive loadings
Blue = negative loadings

Page 19:

Regression Vector Zoom

[Figure: zoomed region of the regression vector image.]

Page 20:

Principal Components Regression Predictions

[Figure: Sample Scores on the Regression Vector, plotted by sample number (1 to 17) on a scale from -0.2 to 1.8; groups: Unaltered Carpet, Arson Debris, Gasoline.]

Discriminant Analysis: 1 = Member of Arson Class

Page 21:

Classification – Soft Independent Model of Class Analogy (SIMCA)

[Figure: class models drawn in the space of variable 1, variable 2, and variable 3, with PC 1, PC 2, T², and Q marked.]

$d_{xk} = \sqrt{Q_{r,k}^{2} + \left(T_{r,k}^{2}\right)^{2}}$

$Q_{r,k} = \dfrac{Q_{k}}{Q_{0.95,k}}, \qquad T_{r,k}^{2} = \dfrac{T_{k}^{2}}{T_{0.95,k}^{2}}$

Page 22:

SIMCA Model Specifics

1. PCA modeled for 2 classes – Arson, not Arson

2. Each model had 2 PCs with 99% variance captured

3. One random sample per class left out, all gasoline samples left out of “training set”

4. Left out samples “projected” onto each model later.
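The SIMCA procedure outlined above (one PCA model per class, left-out samples projected onto each model) might be sketched as follows. Two loud assumptions: the 95% limits are approximated by empirical 95th percentiles of the training statistics rather than by proper confidence limits, and the "carpet" and "doped" data are synthetic. The distance combines the reduced Q and T² statistics of each class model.

```python
import numpy as np

def fit_statistics(X, model):
    """Q (squared residual) and T^2 of each row of X against one class model."""
    Xc = X - model["mean"]
    T = Xc @ model["P"]
    E = Xc - T @ model["P"].T
    return np.sum(E**2, axis=1), np.sum(T**2 / model["score_var"], axis=1)

def fit_class_model(X, n_components):
    mean = X.mean(axis=0)
    _, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    model = {"mean": mean, "P": Vt[:n_components].T,
             "score_var": S[:n_components] ** 2 / (X.shape[0] - 1)}
    Q, T2 = fit_statistics(X, model)
    # Assumption: empirical 95th percentiles stand in for the 95% confidence limits.
    model["Q_lim"], model["T2_lim"] = np.percentile(Q, 95), np.percentile(T2, 95)
    return model

def reduced_distance(X, model):
    Q, T2 = fit_statistics(X, model)
    return np.sqrt((Q / model["Q_lim"]) ** 2 + (T2 / model["T2_lim"]) ** 2)

rng = np.random.default_rng(3)
n_vars = 100
carpet_mean = rng.random(n_vars)
doped_mean = carpet_mean.copy()
doped_mean[30:50] += 1.0                  # hypothetical accelerant contribution
means = {"carpet": carpet_mean, "doped": doped_mean}
profiles = {name: rng.normal(size=(2, n_vars)) for name in means}

def draw(name, n):
    scores = 0.3 * rng.normal(size=(n, 2))
    noise = 0.02 * rng.normal(size=(n, n_vars))
    return means[name] + scores @ profiles[name] + noise

models = {name: fit_class_model(draw(name, 8), n_components=2) for name in means}

# Left-out samples are projected onto every class model and assigned to the nearest.
X_test = np.vstack([draw("carpet", 1), draw("doped", 1)])
distances = np.column_stack([reduced_distance(X_test, models[name]) for name in means])
nearest = [list(means)[j] for j in distances.argmin(axis=1)]
print(nearest)
```

A sample whose reduced distance is large for every model would be reported as belonging to no class, which is the behavior one would hope for with the gasoline samples.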

Page 23:

Arson “Case” SIMCA Results

[Figure: four panels across the Carpet, Doped, and Gasoline samples: In Carpet Class (0/1), In Doped Class (0/1), Nearest Class (1/2), Not in any Class (0/1). Legend: Carpet Samples, Carpet Test, Doped Samples, Doped Test, Gasoline Test.]

Page 24:

Arson “Case” SIMCA Fit Statistics

[Figure: plots with axes labeled Q Residuals and T^2 Residuals, two for “Fit Statistics for Doped Carpet Class” and two for “Fit Statistics for Carpet Class”. Legend: Carpet Samples, Carpet Test, Doped Samples, Doped Test, Gasoline Test.]

Page 25:

Parallel Factor Analysis (PARAFAC)

$x_{ijk} = \sum_{r=1}^{R} a_{ir} b_{jr} c_{kr} + e_{ijk}$

[Figure: the I × J × K array X written as loading matrices A (I × R), B (J × R), and C (K × R) around a core G, plus a residual array E of the same size; equivalently, X is the sum of rank-one terms built from (a1, b1, c1), (a2, b2, c2), and (a3, b3, c3), plus E.]
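The trilinear model above can be fitted by alternating least squares (ALS). The sketch below is a bare-bones version for illustration, assuming noise-free synthetic data built from two Gaussian peaks; the function names, random initialization, and fixed iteration count are choices made here, and production code would add convergence checks and constraints such as non-negativity.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product: row (u, v) of the result is U[u] * V[v]."""
    R = U.shape[1]
    return (U[:, None, :] * V[None, :, :]).reshape(-1, R)

def parafac_als(X, rank, n_iter=500, seed=0):
    """Fit x_ijk ~ sum_r a_ir b_jr c_kr by alternating least squares."""
    I, J, K = X.shape
    rng = np.random.default_rng(seed)
    A, B, C = (rng.random((n, rank)) for n in (I, J, K))
    # Mode-wise unfoldings, ordered to match khatri_rao above.
    X1 = X.reshape(I, J * K)                        # columns indexed by (j, k)
    X2 = X.transpose(1, 0, 2).reshape(J, I * K)     # columns indexed by (i, k)
    X3 = X.transpose(2, 0, 1).reshape(K, I * J)     # columns indexed by (i, j)
    for _ in range(n_iter):
        # Update one loading matrix at a time, holding the other two fixed.
        A = X1 @ np.linalg.pinv(khatri_rao(B, C)).T
        B = X2 @ np.linalg.pinv(khatri_rao(A, C)).T
        C = X3 @ np.linalg.pinv(khatri_rao(A, B)).T
    return A, B, C

# Synthetic stack: 5 samples of two overlapping peaks (hypothetical positions and widths).
j = np.arange(30)[:, None]
k = np.arange(20)[:, None]
B_true = np.exp(-0.5 * ((j - np.array([10, 18])) / 3.0) ** 2)   # first-dimension profiles
C_true = np.exp(-0.5 * ((k - np.array([7, 12])) / 2.0) ** 2)    # second-dimension profiles
A_true = np.random.default_rng(4).random((5, 2)) + 0.1          # per-sample amounts
X = np.einsum("ir,jr,kr->ijk", A_true, B_true, C_true)

A, B, C = parafac_als(X, rank=2)
X_hat = np.einsum("ir,jr,kr->ijk", A, B, C)
rel_error = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
print("relative reconstruction error:", rel_error)
```

The recovered columns of `B` and `C` are determined only up to scaling and ordering of the factors, so they are compared with pure-component profiles by shape rather than by absolute height.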

Page 26:

Parallel Factor Analysis (PARAFAC)

[Figure: PARAFAC applied to the GC x GC - FID chromatogram stack X (I × J × K). Each factor (Factor 1: a1, b1, c1; Factor 2: a2, b2, c2) gives a score profile along the sample axis and loading profiles along the first dimension and the second dimension.]

Page 27:

Parallel Factor Analysis (PARAFAC)

GC x GC - TOF

Sinha, A. E.; Fraga, C. G.; Prazen, B. J.; Synovec, R. E. Journal of Chromatography A 2004, 1027, 269-277.

Page 28:

Parallel Factor Analysis (PARAFAC)

[Figure: PARAFAC applied to a GC x GC - TOF sample X (I × J × K). Each factor (Factor 1: a1, b1, c1; Factor 2: a2, b2, c2) gives a score profile along the m/z axis and loading profiles along the first dimension and the second dimension.]

Page 29:

Parallel Factor Analysis (PARAFAC)

GC x GC - TOF

“Complex Environmental Sample”

Sinha, A. E.; Fraga, C. G.; Prazen, B. J.; Synovec, R. E. Journal of Chromatography A 2004, 1027, 269-277.

Page 30:

PARAFAC Results

Sinha, A. E.; Fraga, C. G.; Prazen, B. J.; Synovec, R. E. Journal of Chromatography A 2004, 1027, 269-277.

Page 31:

PARAFAC Results

Sinha, A. E.; Fraga, C. G.; Prazen, B. J.; Synovec, R. E. Journal of Chromatography A 2004, 1027, 269-277.

Page 32:

GCImage screen capture

NIJ0221 100 µg 75% Wx gasoline / nylon carpet matrix

GC × GC/MS Peak Deconvolution PARAFAC?

Page 33:

Partial Least Squares (PLS)

$X = T P^{T} + E$

$Y = T Q^{T} + F$

[Figure: the data matrix X (samples × variables) equals the “model” T P^T, built from “latent variables”, plus residuals E; the property matrix Y (samples × properties) equals T Q^T plus residuals F.]
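To make the PLS model concrete, here is a sketch of the NIPALS algorithm for a single property (PLS1), where Y reduces to a vector y. The simulated "analyte" and "interferent" profiles, the noise level, and the choice of two latent variables are assumptions for the example only.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """NIPALS PLS1: returns centering terms and the regression vector b."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f                       # weight: direction of max covariance with y
        w /= np.linalg.norm(w)
        t = E @ w                         # scores for this latent variable
        tt = t @ t
        p = E.T @ t / tt                  # X loading
        qa = f @ t / tt                   # y loading
        E = E - np.outer(t, p)            # deflate X
        f = f - qa * t                    # deflate y
        W.append(w); P.append(p); q.append(qa)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)   # regression vector in original variables
    return x_mean, y_mean, b

rng = np.random.default_rng(5)
x = np.arange(150)
analyte = np.exp(-0.5 * ((x - 60) / 5.0) ** 2)        # hypothetical analyte signal
interferent = np.exp(-0.5 * ((x - 75) / 5.0) ** 2)    # hypothetical overlapping signal

def simulate(n):
    c, d = rng.random(n), rng.random(n)
    noise = 0.01 * rng.normal(size=(n, x.size))
    return np.outer(c, analyte) + np.outer(d, interferent) + noise, c

X_train, y_train = simulate(12)
X_test, y_test = simulate(3)
x_mean, y_mean, b = pls1_fit(X_train, y_train, n_components=2)
y_pred = (X_test - x_mean) @ b + y_mean
max_error = np.max(np.abs(y_pred - y_test))
print("largest prediction error:", max_error)
```

In practice the number of latent variables would be chosen by cross-validation rather than fixed in advance.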

Page 34:

PLS Results Naphthalenes in Jet Fuel

Johnson, K. J.; Prazen, B. J.; Young, D. C.; Synovec, R. E. Journal of Separation Science 2004, 27, 410-416.

Page 35:

Alignment Strategy 1: Experimental Design

Alignment Strategy 2: Templates / Peak Tables

Alignment Strategy 3: Retention Index

Page 36:

Alignment Strategy 4: Piecewise Correlation Maximization

Pierce, K. M.; Wood, L. F.; Wright, B. W.; Synovec, R. E. Analytical Chemistry 2005, 77, 7735-7743.
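The general idea of piecewise alignment can be sketched on a one-dimensional signal: cut the target into segments and, for each segment, pick the integer shift of the sample that maximizes the correlation with the target. This is a simplified illustration, not the published algorithm; the segment length, shift limit, and test signal are hypothetical, and no interpolation between segment shifts is attempted.

```python
import numpy as np

def best_shift(target_seg, sample, start, stop, max_shift):
    """Integer shift (within +/- max_shift) maximizing correlation with target_seg."""
    best, best_r = 0, -np.inf
    for s in range(-max_shift, max_shift + 1):
        lo, hi = start + s, stop + s
        if lo < 0 or hi > sample.size:
            continue                      # shifted window would leave the signal
        seg = sample[lo:hi]
        if seg.std() == 0 or target_seg.std() == 0:
            continue                      # correlation undefined for flat segments
        r = np.corrcoef(target_seg, seg)[0, 1]
        if r > best_r:
            best, best_r = s, r
    return best

def piecewise_align(target, sample, segment_length, max_shift):
    """Align sample to target one segment at a time; signals must be equal length."""
    aligned = np.empty_like(sample)
    shifts = []
    for start in range(0, target.size, segment_length):
        stop = min(start + segment_length, target.size)
        s = best_shift(target[start:stop], sample, start, stop, max_shift)
        aligned[start:stop] = sample[start + s:stop + s]
        shifts.append(s)
    return aligned, shifts

def peak(x, center, width=3.0):
    return np.exp(-0.5 * ((x - center) / width) ** 2)

x = np.arange(200, dtype=float)
target = peak(x, 30) + peak(x, 130)
sample = peak(x, 33) + peak(x, 128)       # first peak late by 3, second early by 2

aligned, shifts = piecewise_align(target, sample, segment_length=100, max_shift=5)
print(shifts)
```

For GC x GC data the same idea could be applied along each retention axis in turn, with the segment length tuned to the expected retention drift.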

Page 37:

Alignment Strategy 5

“Warping”

Kaczmarek, K.; Walczak, B.; de Jong, S.; Vandeginste, B. G. M. Journal of Chemical Information and Computer Sciences 2003, 43, 978-986.

Page 38:

Alignment Strategy Proposal #1: Anchor Warping

Page 39:

Alignment Strategy Proposal #1: Anchor Warping

Page 40:

Alignment Strategy Proposal #2: DTW – Piecewise Hybrid

1st Dimension DTW Alkanes?

2nd Dimension Piecewise
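For the DTW half of this proposal, a textbook dynamic time warping routine on two one-dimensional profiles looks roughly like the sketch below. It covers only the classic unconstrained algorithm with an absolute-difference cost; how it would be combined with piecewise alignment in the second dimension, or anchored on alkanes, is not specified on the slide and is not attempted here.

```python
import numpy as np

def dtw_path(a, b):
    """Classic DTW: returns the total cost and the warping path as index pairs."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)   # accumulated cost with a padded border
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    # Backtrack from the end, always stepping to the cheapest predecessor.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        steps = [(D[i - 1, j - 1], i - 1, j - 1),
                 (D[i - 1, j], i - 1, j),
                 (D[i, j - 1], i, j - 1)]
        _, i, j = min(steps)
        path.append((i - 1, j - 1))
    return D[n, m], path[::-1]

# Two hypothetical first-dimension profiles: same peak, different timing.
a = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0])
b = np.array([0.0, 1.0, 2.0, 2.0, 1.0, 0.0, 0.0])

distance, path = dtw_path(a, b)
print(distance, path)
```

The path pairs each point of `a` with one or more points of `b`; a warped version of `b` on the time axis of `a` can then be built by averaging the points matched to each index of `a`.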

Page 41:

Humble Opinions

1. GC x GC is tremendously interesting data

2. Tremendous amounts of work possible, even with data that presently exists. Good alignment will open up even more possibilities

3. Include the Chemist in the analysis

4. Include the Chemometrician in the experimental design

Page 42:

Future?

1. More PCA, PCR, PLS, PARAFAC

2. Regression certainty calculations

3. NPLS, NPLS-DA

4. Holistic, automatic alignment strategies: 2D COW or DTW? PARAFAC 2?

5. User driven alignment strategies: anchor warping

6. Inclusion of the m/z axis: purity, CODA?

Page 43:

Acknowledgements

U.S. Coast Guard Academy Alexander Trust

You all!