

TO DOWNLOAD A COPY OF THIS POSTER, VISIT WWW.WATERS.COM/POSTERS ©2017 Waters Corporation

INTRODUCTION

The potential of mass spectrometry (MS) in foodomics is well recognised [1]. With its ability to rapidly measure thousands of features of an analyte with high sensitivity, MS is a natural fit for applications such as food authenticity, classification, and quality for market [2, 3].

The proliferation of ambient ionisation methods has extended this potential by allowing real-time analysis of food samples at atmospheric pressure with minimal preparation. Key among these methods is REIMS (Rapid Evaporative Ionization Mass Spectrometry), one of Waters’ Direct Sample Analysis technologies. This approach combines high-performance mass spectrometry with the innovative iKnife sampling tool [4, 5] to allow rapid analysis of small molecules derived from a sample, without extensive sample preparation or chromatography (Figure 1).

Authors: Nathaniel G Martin¹, Dave Jackson¹, Chris Lawther¹, Sara Stead¹, Olivier P Chevallier², Connor Black² and Christopher T Elliott²

Affiliations: 1. Waters Corporation, Wilmslow, UK; 2. Institute for Global Food Security, Queen’s University Belfast, UK.

METHODS

LiveID Workflow

LiveID reads in pre-acquired MassLynx RAW files and uses that data to build statistical models and perform recognition on unknown samples. The complete workflow is shown in Figure 2.

Workflow Highlights

Raw data visualisation before and after processing.

Each burn (or region of interest) is combined into a single spectrum.

Optional, user-configurable lockmass correction and background subtraction.

PCA (Principal Components Analysis), LDA (Linear Discriminant Analysis) or combined PCA-LDA model types.

Model cross-validation and analysis capability to iteratively refine the model.

Recognition from a MassLynx RAW file that is being actively acquired.

Clear, instant results at point-of-use.
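As a rough illustration of two of the processing steps listed above, the sketch below combines the scans of one burn into a single spectrum and then subtracts a background spectrum. It is not LiveID's implementation: the function names, the use of a simple mean to combine scans, the clipping of negative values to zero, and the example intensities are all assumptions made for illustration.

```python
from typing import List

Spectrum = List[float]  # intensities on a shared m/z grid (assumed already binned)


def combine_burn(scans: List[Spectrum]) -> Spectrum:
    """Average the scans of one burn into a single spectrum (assumed rule)."""
    if not scans:
        raise ValueError("a burn needs at least one scan")
    n_bins = len(scans[0])
    if any(len(s) != n_bins for s in scans):
        raise ValueError("all scans must share the same m/z grid")
    return [sum(s[i] for s in scans) / len(scans) for i in range(n_bins)]


def subtract_background(spectrum: Spectrum, background: Spectrum) -> Spectrum:
    """Subtract a background spectrum bin by bin, clipping negatives to zero."""
    if len(spectrum) != len(background):
        raise ValueError("spectrum and background must share the same m/z grid")
    return [max(s - b, 0.0) for s, b in zip(spectrum, background)]


# Hypothetical three-scan burn on a four-bin grid.
burn_scans = [
    [10.0, 0.0, 4.0, 2.0],
    [14.0, 2.0, 6.0, 2.0],
    [12.0, 1.0, 5.0, 2.0],
]
background = [1.0, 1.0, 1.0, 3.0]

burn_spectrum = combine_burn(burn_scans)
cleaned = subtract_background(burn_spectrum, background)
```

Whatever the combination rule, the outcome is one feature vector per burn, which is the unit that the model is trained on and that recognition classifies.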

Experimental Methods: White Fish Study

A study of five white fish varieties (Cod, Coley, Pollock, Haddock and Whiting) was undertaken at Queen’s University Belfast (QUB), and the resulting data were analysed in LiveID. The samples were stored at -80°C and thawed to room temperature before analysis.

REIMS data were collected using the iKnife sampling device on a Xevo G2-XS QTof operating in negative ion and sensitivity mode. Leucine enkephalin (Leu-Enk, 2 ng/µL in isopropanol (IPA)) was used as the lockmass agent (m/z 554.2615) for accurate mass correction; it was infused at a continuous flow rate of 0.1 mL/min using a Waters Acquity UPLC I-Class system (Waters Corporation, Milford, MA, USA).
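The role of the lockmass can be illustrated with a small worked example. The sketch below computes the mass error of an observed Leu-Enk peak in parts per million and rescales the m/z axis so that the peak lands on the reference value of 554.2615. The single multiplicative correction factor, the function names, and the observed peak position are assumptions for illustration; the correction actually applied by the instrument software may differ.

```python
LOCKMASS_MZ = 554.2615  # Leu-Enk reference m/z quoted in the text


def ppm_error(observed_mz: float, reference_mz: float = LOCKMASS_MZ) -> float:
    """Mass error of the observed lockmass peak in parts per million."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def lockmass_correct(mz_values, observed_lockmass_mz, reference_mz=LOCKMASS_MZ):
    """Rescale every m/z so the observed lockmass peak lands on the reference.

    A single multiplicative factor is an assumption made for illustration.
    """
    factor = reference_mz / observed_lockmass_mz
    return [mz * factor for mz in mz_values]


observed = 554.2670  # hypothetical measured position of the Leu-Enk peak
error = ppm_error(observed)
corrected = lockmass_correct([observed, 700.5000, 885.5500], observed)
```

In this hypothetical case the lockmass peak is measured slightly high, so every m/z value in the spectrum is shifted down by the same relative amount.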

References

1. Herrero M., Simó C., García-Cañas V. et al. (2012). Foodomics: MS-Based Strategies in Modern Food Science and Nutrition. Mass Spectrometry Reviews 31: 49-69.

2. Balog J., Perenyi D., Guallar-Hoyas C. et al. (2016). Identification of the Species of Origin for Meat Products by Rapid Evaporative Ionization Mass Spectrometry. J Agric Food Chem 64(23): 4793-800.

3. Verplanken K., Stead S., Jandova R. et al. (2017). Rapid evaporative ionization mass spectrometry for high-throughput screening in food analysis: The case of boar taint. Talanta 169: 30-36.

4. Schäfer K.C., Dénes J., Albrecht K. et al. (2009). In vivo, in situ tissue analysis using rapid evaporative ionization mass spectrometry. Angew Chem Int Ed Engl. 48(44): 8240-2.

5. REIMS Research System with iKnife Sampling brochure, part number 720005418en (http://www.waters.com/waters/library.htm?lid=134846772).

6. Warner K., Mustain P., Lowell B. et al. (2016). Deceptive Dishes: Seafood Swaps Found Worldwide. Oceana report (http://usa.oceana.org/sites/default/files/global_fraud_report_final_low-res.pdf).

Model Building: White Fish Study

CONCLUSION

Highly intuitive software for the analysis of REIMS data.

Customizable and rapid model building.

Instant classification of test samples, with clearly presented results.

Enables real-time decision making, saving time.

Supports simple Yes/No answers.

RESULTS

Model Visualisations: White Fish Study

Figure 1: Schematic of REIMS iKnife-based Direct Sample Analysis

This transformative technology, providing real-time analysis of food samples, requires suitable analytical software capable of multivariate model-building and instant classification of test samples with clear feedback to the user.

We introduce LiveID™ software for this purpose. LiveID is a new web-based application platform that has been developed for exactly such a use within a direct analysis workflow, enabling:

Training, visualization and validation of statistical models using pre-acquired characterised samples of known origin.

Immediate classification results for novel test samples upon sampling with the iKnife.

LiveID provides an intuitive, guided workflow that unlocks the potential of direct sample analysis. We demonstrate the effectiveness of this software by classifying five white fish varieties; white fish production and commerce is of considerable interest in foodomics because of the high level of food fraud known to occur in this area [6].

LiveID™: A NEW SOFTWARE APPROACH FOR STATISTICAL MODELLING AND REAL-TIME RECOGNITION FOR USE IN DIRECT ANALYSIS WORKFLOWS

Cross Validation: White Fish Study

The training model was cross-validated using a stratified 5-fold method: in each fold, 20% of the data was left out and its classes were predicted by a model rebuilt from the remaining 80%. The results are shown in Table 1.

As Table 1 shows, the cross-validation scores are excellent, indicating that the model should be suitable for distinguishing between the five species. Only three burn regions were incorrectly classified, and no burns were classified as outliers.
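The stratified 5-fold scheme described above can be sketched as follows. The code deals the burns of each species evenly across five folds, so that each held-out 20% keeps roughly the same species mix as the remaining 80%. The function names, the round-robin assignment, and the ten-burns-per-species labels are hypothetical; how LiveID assigns burns to folds is not stated in the text.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign each sample index to a fold so every fold keeps the class mix."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class out round-robin, like cards, across the folds.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


def train_test_splits(labels, n_folds=5, seed=0):
    """Yield (train_indices, test_indices), holding out one fold at a time."""
    folds = stratified_folds(labels, n_folds, seed)
    for held_out in range(n_folds):
        test = sorted(folds[held_out])
        train = sorted(i for f, fold in enumerate(folds) if f != held_out for i in fold)
        yield train, test


# Hypothetical labels: 10 burns for each of the five species in the study.
species = ["Cod", "Coley", "Pollock", "Haddock", "Whiting"]
labels = [name for name in species for _ in range(10)]

splits = list(train_test_splits(labels))
```

Each burn is held out exactly once across the five splits, so every burn receives one cross-validated prediction that can be tallied as correct, incorrect or outlier.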

Figure 2: LiveID Workflow. Select Raw Data → Visualize Raw Data → Import Spectra (Background Subtracted, Lockmass Applied, Spectra Combined) → Assign Spectra to Model → Build Statistical Model → Visualize Model → Cross-Validate Model (optional refinement of model) → Recognize

Figure 3: 3D scores plot after PCA

Figure 4: 3D scores plot after PCA followed by LDA

Model Visualisations: White Fish Study

Figure 3 shows a view of the 3D scores plot of the model data after dimensionality reduction using PCA. The first three principal components are plotted; these explain 78% of the total variance. The species are tightly clustered but still largely overlapping.
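For readers unfamiliar with the "explained variance" figure quoted above, the sketch below shows one common way to compute it: the squared singular values of the mean-centred data matrix, normalised to sum to one. The data are synthetic stand-ins rather than study data, and the function name is an assumption; this is not a description of how LiveID computes the value.

```python
import numpy as np


def pca_explained_variance(X: np.ndarray) -> np.ndarray:
    """Fraction of total variance explained by each principal component."""
    centered = X - X.mean(axis=0)
    # Squared singular values of the centred data are proportional
    # to the variance captured by each component.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


# Synthetic stand-in for a burns-by-bins intensity matrix (not study data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10)) * np.array([5.0, 3.0, 2.0] + [0.5] * 7)

ratios = pca_explained_variance(X)
first_three = ratios[:3].sum()  # share of variance shown in a 3D scores plot
```

The same cumulative sum over the first three components is what a statement such as "the first three principal components explain 78% of the total variance" reports.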

Figure 4 shows a view of the 3D scores plot of the model data after dimensionality reduction using PCA followed by LDA. The first three linear discriminants are plotted. The supervised algorithm has clearly separated the species into distinct clusters with little overlap.
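A minimal sketch of the supervised step is given below, assuming the textbook Fisher formulation of multi-class LDA (directions that maximise between-class scatter relative to within-class scatter). It is run on synthetic scores standing in for PCA components; the function name and data are hypothetical, and LiveID's own implementation is not described in the text. One point the sketch makes concrete is that five classes yield at most four discriminants.

```python
import numpy as np


def lda_directions(X: np.ndarray, y: np.ndarray) -> tuple:
    """Return (eigenvalues, directions) of Fisher's multi-class LDA."""
    overall_mean = X.mean(axis=0)
    n_features = X.shape[1]
    within = np.zeros((n_features, n_features))
    between = np.zeros((n_features, n_features))
    for label in np.unique(y):
        group = X[y == label]
        group_mean = group.mean(axis=0)
        centered = group - group_mean
        within += centered.T @ centered
        offset = (group_mean - overall_mean)[:, None]
        between += len(group) * (offset @ offset.T)
    # Directions maximise between-class scatter relative to within-class scatter.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(within) @ between)
    order = np.argsort(eigvals.real)[::-1]
    return eigvals.real[order], eigvecs.real[:, order]


# Synthetic scores: 5 classes, 8 features standing in for PCA components.
rng = np.random.default_rng(1)
class_means = rng.normal(scale=4.0, size=(5, 8))
y = np.repeat(np.arange(5), 40)
X = class_means[y] + rng.normal(size=(200, 8))

eigvals, directions = lda_directions(X, y)
# Keep only directions with a non-negligible eigenvalue (at most classes - 1).
n_useful = int(np.sum(eigvals > 1e-8 * eigvals[0]))
scores = X @ directions[:, :n_useful]
```

Because the class labels are used to choose the projection, the resulting scores separate the groups far more cleanly than the unsupervised PCA scores do.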

2871 burn regions from 222 samples.

Spectral data binned (resampled) at 0.5 Da.

Model built over a mass range of 600-900 m/z (Glycerophospholipids).

PCA/LDA model (PCA followed by LDA).

80 PCA components.

4 LDA linear discriminants.

15 SD Outlier Distance (Mahalanobis).

Model builds in under 3 minutes.
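To make the binning setting above concrete: a 0.5 Da bin width over the m/z 600-900 range gives (900 - 600) / 0.5 = 600 feature bins per burn spectrum. The sketch below bins a list of peaks accordingly. Summing intensities within a bin, treating the upper edge as exclusive, and the example peaks are assumptions for illustration, not a description of LiveID's resampling.

```python
import math

MZ_MIN, MZ_MAX, BIN_WIDTH = 600.0, 900.0, 0.5  # values quoted in the text


def bin_spectrum(peaks, mz_min=MZ_MIN, mz_max=MZ_MAX, width=BIN_WIDTH):
    """Sum peak intensities into fixed-width m/z bins; drop out-of-range peaks."""
    n_bins = int(round((mz_max - mz_min) / width))
    binned = [0.0] * n_bins
    for mz, intensity in peaks:
        if mz_min <= mz < mz_max:
            binned[int(math.floor((mz - mz_min) / width))] += intensity
    return binned


# Hypothetical (m/z, intensity) peaks from one burn spectrum.
peaks = [
    (599.9, 50.0),   # below the range, ignored
    (600.1, 10.0),
    (600.4, 5.0),    # falls in the same bin as 600.1
    (742.54, 120.0),
    (899.99, 7.0),
    (900.0, 3.0),    # on the exclusive upper edge, ignored
]
features = bin_spectrum(peaks)
```

Each burn therefore enters the PCA step as a 600-element vector, which the 80 PCA components then reduce before LDA is applied.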

DISCUSSION

We have demonstrated highly intuitive and effective software that fully enables the benefits of REIMS technology. As well as highly accurate classification, LiveID demonstrated rapid performance, making it a suitable candidate for point-of-use testing.

Models can be built offline ready for high-throughput applications, with a simple and clear decision returned for each test sample immediately upon burn completion with the iKnife probe. The model building process is customisable and can be iterated, while clear advice and cross-validation reports keep it straightforward, providing both versatility and simplicity.

The applications of this technology in foodomics are exciting and varied. The technology platform has already shown impressive performance in the detection of food admixtures [2] and in the characterisation of boar taint [3], for example.

Figure 7: iKnife probe in action

Figure 6: Recognition Page

Recognition Page

Scrolling TIC chart of incoming data.

Immediate and clear recognition decisions.

Probability of decision indicated by degree of fill of circle.

History of recognition results automatically recorded.

Selecting historical result in table highlights corresponding burn on TIC chart.

Test Samples: White Fish Study

The optimised and validated model was challenged with 1479 spectra from 110 test samples held out from the model build. The results are shown in Table 2.

LiveID again showed highly successful classification: the correct species was identified in 96.82% of all recognition events (burn regions of only one scan were excluded from this analysis).

Figure 5: Loadings plot for PC1

1D Loadings plots generated for all PCA components

2D loadings plots for two PCs simultaneously (not shown)

Explained variance displayed for all components

Allows easy identification of feature bins that are important in separating classes

Optionally, this information can be used to direct further analysis of potential biomarkers in Waters’ Progenesis® QI software

Table 2: Results from classification of 110 test samples

Table 1: Cross-validation results