m2ia: a web server for microbiome and metabolome ... · associated with obesity from both...

20
M 2 IA: a Web Server for Microbiome and Metabolome Integrative Analysis Yan Ni 1,# , Gang Yu 1 , Huan Chen 2 , Yongqiong Deng 3 , Philippa M Wells 4 , Claire J Steves 4,5 , Feng Ju 6,7 , Junfen Fu 1,# 1. National Clinical Research Center for Child Health, The Children’s Hospital, Zhejiang University School of Medicine, Hangzhou 310029, P.R. China. 2. Key laboratory of Microbial technology and Bioinformatics of Zhejiang Province, NMPA Key laboratory for Testing and Risk Warning of Pharmaceutical Microbiology, Zhejiang Institute of Microbiology, Hangzhou 310012, P.R. China. 3. Department of Dermatology and STD, the Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan, P.R. China. 4. The Department of Twin Research, Kings College London, South Wing, St Thomas' Hospital, Westminster Bridge Road, London SE1 7EH, UK 5. Department of Ageing and Health, St Thomas' Hospital, 9th floor, North Wing, Westminster Bridge Road, London SE1 7EH, UK 6. School of Engineering, Westlake University, 18 Shilongshan Road, Hangzhou 310024, P.R. China. 7. Institute of Advanced Technology, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou 310024, P.R. China. Correspondence should be addressed to Dr. Yan Ni ([email protected]) and Dr. Junfen Fu ([email protected]). . CC-BY-NC-ND 4.0 International license certified by peer review) is the author/funder. It is made available under a The copyright holder for this preprint (which was not this version posted October 30, 2019. . https://doi.org/10.1101/678813 doi: bioRxiv preprint

Upload: others

Post on 19-Jun-2020

86 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: M2IA: a Web Server for Microbiome and Metabolome ... · associated with obesity from both metabolome and microbiome. Conclusions: M2IA is a public available web server ... correlations

M2IA: a Web Server for Microbiome and Metabolome Integrative Analysis

Yan Ni1,#, Gang Yu1, Huan Chen2, Yongqiong Deng3, Philippa M Wells4, ClaireJSteves4,5,Feng Ju6,7, Junfen Fu1,#

1. National Clinical Research Center for Child Health, The Children’s Hospital, Zhejiang

University School of Medicine, Hangzhou 310029, P.R. China.

2. Key laboratory of Microbial technology and Bioinformatics of Zhejiang Province,

NMPA Key laboratory for Testing and Risk Warning of Pharmaceutical Microbiology,

Zhejiang Institute of Microbiology, Hangzhou 310012, P.R. China.

3. Department of Dermatology and STD, the Affiliated Hospital of Southwest Medical

University, Luzhou, Sichuan, P.R. China.

4. The Department of Twin Research, Kings College London, South Wing, St Thomas'

Hospital, Westminster Bridge Road, London SE1 7EH, UK

5. Department of Ageing and Health, St Thomas' Hospital, 9th floor, North Wing,

Westminster Bridge Road, London SE1 7EH, UK

6. School of Engineering, Westlake University, 18 Shilongshan Road, Hangzhou 310024,

P.R. China.

7. Institute of Advanced Technology, Westlake Institute for Advanced Study, 18

Shilongshan Road, Hangzhou 310024, P.R. China.

Correspondence should be addressed to Dr. Yan Ni ([email protected]) and Dr.

Junfen Fu ([email protected]).

.CC-BY-NC-ND 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 30, 2019. . https://doi.org/10.1101/678813doi: bioRxiv preprint

Page 2: M2IA: a Web Server for Microbiome and Metabolome ... · associated with obesity from both metabolome and microbiome. Conclusions: M2IA is a public available web server ... correlations

Abstract

Background: In the last decade, integrative studies of microbiome and metabolome have

experienced exponential growth in understanding their impact on human health and

diseases. However, analyzing the resulting multi-omics data and their correlations

remains a significant challenge in current studies due to the lack of a comprehensive

computational tool to facilitate data integration and interpretation. In this study, we have

developed a microbiome and metabolome integrative analysis pipeline (M2IA) to meet

the urgent needs for tools that effectively integrate microbiome and metabolome data to

derive biological insights.

Results: M2IA streamlines the integrative data analysis between metabolome and

microbiome, including both univariate analysis and multivariate modeling methods.

KEGG-based functional analysis of microbiome and metabolome are provided to explore

their biological interactions within specific metabolic pathways. The functionality of

M2IA was demonstrated using TwinsUK cohort data with 1116 fecal metabolites and 16s

rRNA microbiome from 786 individuals. As a result, we identified two important

pathways, i.e., the benzoate degradation and phosphotransferase system, were closely

associated with obesity from both metabolome and microbiome.

Conclusions: M2IA is a public available web server (http://m2ia.met-bioinformatics.cn)

with the capability to investigate the complex interplay between microbiome and

metabolome, and help users to develop mechanistic hypothesis for nutritional and

personalized therapies of diseases.

Keywords: M2IA, microbiome and metabolome correlation, data integration, network

analysis, metabolic function

.CC-BY-NC-ND 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 30, 2019. . https://doi.org/10.1101/678813doi: bioRxiv preprint

Page 3: M2IA: a Web Server for Microbiome and Metabolome ... · associated with obesity from both metabolome and microbiome. Conclusions: M2IA is a public available web server ... correlations

Background

The study of microbiome in human health has experienced exponential growth over the

last decade with the advent of new sequencing technologies for interrogating complex

microbial communities[1]. Meanwhile, metabolomics has been an important tool for

understanding microbial community functions and their links to health and diseases

through the quantitation of dozens to hundreds of small molecules[2]. The gut microbiota

is considered a metabolic ‘organ’ to protect the host against pathogenic microbes,

modulate immunity, and regulate metabolic processes, including short chain fatty acid

production and bile acid biotransformation[3]. Conversely, these metabolites can modulate

gut microbial compositions and functions both directly and indirectly[4]. Thus, the

microbiota-metabolites interactions are important to maintain the host health and well-

being. Integrative data analysis of gut microbiome and metabolome can offer deep

insights on the impact of lifestyle and dietary factors on chronic and acute diseases (e.g.,

autoimmune diseases, inflammatory bowel disease[5], cancers[6], type 2 diabetes and

obesity[7], cardiovascular disorders, and non-alcoholic fatty liver disease[8]), and provide

potential diagnostic and therapeutic targets[9].

In the last decade, metabolomics studies in microbiota-related research have increased in

a wide range of research areas, such as gastroenterology, biochemistry, endocrinology,

microbiology, genetics, to nutrition, food science and pharmacology (Figure S1).

However, both metagenomics/16s rRNA-based high-throughput sequencing technologies

and mass spectrometry/nuclear magnetic resonance-based metabolomics platforms can

produce large and high-dimensional data, posing a major challenge for subsequent data

integration[10, 11]. Current integration analysis methods mainly focus on statistical

correlations between microbiome and metabolome, such as the spearman correlations and

partial least squares discriminant analysis (Table S1). Pedersen et al. recently

summarized a step-by-step computational protocol from their previous study that applied

WGCNA, a dimension reduction method, to measure the correlations among the host

phenotype, gut metagenome, and fasting serum metabolome. However, separate analyses

of -omics data through one or two statistical methods only provide fragmented

information and do not capture the holistic view of disease mechanisms. Moreover,

.CC-BY-NC-ND 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 30, 2019. . https://doi.org/10.1101/678813doi: bioRxiv preprint

Page 4: M2IA: a Web Server for Microbiome and Metabolome ... · associated with obesity from both metabolome and microbiome. Conclusions: M2IA is a public available web server ... correlations

although much bioinformatics work has been done to process and analyze the individual

omics data, to date, it still lacks a comprehensive strategy or a computational tool to

analyze the correlations between microbiome and metabolome[12]. To rapidly advance

microbiome and metabolome data integration and understand their roles in diverse

diseases, advanced computational methods for multi -omics data integration and

interpretation need to be developed[2].

In this study, we have developed a comprehensive computational tool, a standardized

workflow for microbiome and metabolome integrative analysis (M2IA). M2IA combines

a variety of univariate and multivariate methods for correlation analysis and data

integration. In addition, functional network analysis implemented with KEGG metabolic

pathway database can provide deep insights of their biological correlations, i.e., the

possible participation of identified microbes in a specific metabolic reaction or metabolic

pathway. M2IA accepts the microbiome data from 16S rRNA gene or shotgun

metagenomic sequencing technologies, and metabolomics data from mass spectrometry

or NMR spectroscopy platforms. Other than current bioinformatics tools, M2IA is an

automated data analysis and reporting pipeline that can be applied by users with little

training in bioinformatics. M2IA is a web-based server with user-friendly interface, and

public available to academic users through http://m2ia.met-bioinformatics.cn.

Implementation

M2IA is developed in Java web system, with html, Javascript technology implemented for

interactive interface and JFinal and Mysql databases for data management in the backend.

All the statistical analyses and visualization were implemented in R programming

language. The entire system is deployed on a cloud server with 16 GB of RAM and four

virtual CPUs with 2.6 GHz each. Users can register an academic account to manage their

own research projects. M2IA has been tested with major modern browsers such as Google

Chrome, Safari, Mozilla Firefox and Microsoft Internet.

Result

Workflow

.CC-BY-NC-ND 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 30, 2019. . https://doi.org/10.1101/678813doi: bioRxiv preprint

Page 5: M2IA: a Web Server for Microbiome and Metabolome ... · associated with obesity from both metabolome and microbiome. Conclusions: M2IA is a public available web server ... correlations

A standard data analysis workflow in M2IA contains six major steps: (1) data upload and

processing, (2) global similarity analysis between microbiome and metabolome data

matrix, (3) pairwise metabolite-microbe correlation analysis, (4) multivariate regression-

based data integration analysis, (5) functional network analysis, and (6) an auto-generated

html report for overview. Figure 1 summarizes the overall design and the flowchart of

M2IA. Each step offers multiple methods helping users to explore complex correlations

between microbiome and metabolome. In this section, each step will be described in

detail.

Data upload and processing

Data upload. The first step is to upload three different types of data: (1) microbiome

datasets: for 16s rRNA gene sequencing data, an OTU abundance table with taxonomic

annotations (.txt format) and a corresponding reference sequence file (.fna format) are

required. Similarly, a unigene table with taxonomic annotations and a table with KEGG

KO function annotations are required for metagenomic shotgun sequencing study. (2)

metabolome dataset: a metabolite abundance table (.csv format) with compound name,

HMDB database ID, and chemical class information, and (3) a metadata table containing

sample IDs and group information. Sample IDs from microbiome and metabolome

datasets may not be exactly the same, but should be paired correctly within the metadata

table. Once uploaded successfully, M2IA system examines the consistency of sample IDs

from two datasets and notifies users whether there exist unmatched samples. After that,

the metadata table is presented and can be modified by users interactively. For example,

instead of uploading a new table, when users want to change grouping information or

remove certain samples, they can modify group IDs or uncheck certain samples directly

on the table for downstream data analysis.

Missing value processing. This step includes data filtering and missing value imputation.

First, unqualified variables can be removed according to the criteria of sample prevalence

and variance across samples. For microbiome data, the minimal count number, the

percentage of samples with non-missing values, and the relative standard deviation (RSD)

are used for screening. As suggested by MicrobiomeAnalyst[10], features with very low

.CC-BY-NC-ND 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 30, 2019. . https://doi.org/10.1101/678813doi: bioRxiv preprint

Page 6: M2IA: a Web Server for Microbiome and Metabolome ... · associated with obesity from both metabolome and microbiome. Conclusions: M2IA is a public available web server ... correlations

count (e.g., <2) in a few samples (e.g., <20%), or very stable with limited variations

across all the samples (e.g., RSD < 30%), are considered difficult to interpret their

biological significance in the community and can be removed directly. Similarly, our

previous work summarized that those metabolic features with missing values in more

than 80% of samples or RSD values smaller than 30% can be filtered at the beginning.

The sample prevalence is calculated based on all the samples or samples within each

group[13].

Second, both microbiome and metabolome have the characteristics of sparsity seen as the

absence of many taxa or metabolites across samples due to biological and/or technical

reasons[14]. Such missing values may pose general numerical challenges for traditional

statistical analysis, thus M2IA provides a variety of methods for missing value imputation,

aiming to remove zeros or NAs in the data matrix and facilitate subsequent statistical

analysis. There are two different approaches to handle missing values: one is to simply

replace missing values with a certain value (called pseudo count in microbiome),

including half of the minimum or the median value; another is to apply regression models

to impute missing values, such as random forest, k-nearest neighbors, Probabilistic PCA,

Bayesian PCA, singular value decomposition, and the quantile regression imputation of

left-censored data (QRILC)[13]. Users may choose an appropriate method for missing

value imputation.

Data normalization. After missing value processing, users can perform normalization

method in order to make more meaningful comparisons. M2IA provides the total sum

scaling that calculates the relative percentage of features, or the log transformation when

data does not follow normal distribution. To note, scaling or transformation methods are

also provided for specific statistical analysis, such as principal component analysis (PCA)

and PLS-DA analysis.

Global similarity between two datasets

After data processing, the first step is to apply Coinertia analysis (CIA) and Procrustes

analysis (PA) to evaluate the global similarity between metabolome and microbiome

.CC-BY-NC-ND 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 30, 2019. . https://doi.org/10.1101/678813doi: bioRxiv preprint

Page 7: M2IA: a Web Server for Microbiome and Metabolome ... · associated with obesity from both metabolome and microbiome. Conclusions: M2IA is a public available web server ... correlations

dataset (Figure 1)[15]. CIA can be considered as a PCA model of the joint covariances of

two datasets. Similarly, PA measures the congruence of two-dimensional data

distributions from superimposition and scaling of PCA models of two datasets (Figure

2A). The correlation coefficient R is between 0 and 1, and the closer it is to 1 indicates

the greater similarity between two datasets.

Pairwise metabolite-microbe correlation analysis

Correlation analysis methods. M2IA provides five different types of correlation analysis,

including Pearson, Spearman, SparCC, CCLasso, and Maximal Information Coefficient

(MIC) analysis. Although each method has it pros and cons in different situations, users

can choose an appropriate one. Since Spearman correlation analysis outperforms other

methods due to their overall performances[16], it has been set as a default one in M2IA.

Microbes can be analyzed according to their different taxonomic annotations (i.e.,

phylum, genus, and species) and metabolites at different chemical classifications (e.g.,

amino acids, sugars, free fatty acids etc.). In addition, M2IA allows users to define the

specific criteria of significance between microbe-metabolite pairs for subsequent analysis,

e.g., absolute correlation coefficients > 0.3 and p value <0.05.

Visualization. Three different ways of visualization are provided in order to explore

complex correlations between microbes and metabolites. The first one is to apply circos

plot that can help users to quickly identify those microbes having close correlations with

different classes of metabolites (e.g., amino acids) (Figure 2C). The second one is to

apply a heat map to illustrate their relative positive/negative correlations between each

microbe and metabolite. Meanwhile, hierarchical clustering is used to analyze the

similarities among metabolites or microbes, in which closely correlated

metabolites/microbes are usually clustered together. The third one is to provide a

microbe-metabolite interaction network using cytoscape technique, which can help users

to explore complex relationship between microbes and metabolites (Figure 2D).

Multivariate regression–based data integration

.CC-BY-NC-ND 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 30, 2019. . https://doi.org/10.1101/678813doi: bioRxiv preprint

Page 8: M2IA: a Web Server for Microbiome and Metabolome ... · associated with obesity from both metabolome and microbiome. Conclusions: M2IA is a public available web server ... correlations

Unsupervised multivariate analysis. M2IA provides canonical correlation analysis

(CCA)[17, 18] and O2PLS[19] in order to evaluate the inherent correlations between two

datasets without considering phenotype information. CCA aims to find two new bases

(canonical variate) in which the correlation between original parameters of two datasets is

maximized. O2PLS is capable of modeling both prediction and systematic variation, and

the joint score plot indicates their relationship between two data matrix.

Metabolites/microbes with the large canonical coefficients from CCA model or loading

values from O2PLS model are considered essential ones for their similarities (Figure 2B).

Supervised multivariate analysis. In comparison, supervised multivariate analysis

methods integrate two data matrix initially and identify differential variables (microbes &

metabolites) that significantly contribute to the discrimination of different groups. Here,

M2IA offers PCA score-based differential analysis, partial least squares discriminant

analysis (PLS-DA), orthogonal partial least squares discriminant analysis (OPLS-DA),

and random forest (RF). PCA is originally an unsupervised data mining method; here we

examine the differences of the first principal component score values from PCA model,

and their correlations with the phenotype information. PLS-DA and OPLS-DA have been

commonly applied in the field of metabolomics for data dimension reduction and feature

selection. OPLS-DA seeks to maximize the explained variance between groups in a

single dimension or the first latent variable (LV), and separate the within group variance

(orthogonal to group difference) into orthogonal LVs (Figure 3A). The variable loadings

and/or coefficient weights from a validated PLS-DA and OPLS-DA model are used to

rank variables with respect to their performance for discriminating between groups

(Figure 3B). Boruta algorithm-based RF classifier can identify important features by

shuffling samples and adding extra randomness to the system (Figure 3C).

Network analysis

WGCNA-based network analysis. Weighted gene co-expression network analysis

(WGCNA) was recently raised to integrate high-throughput ‘-omics’ datasets for

identifying the potential mechanistic links[11]. In M2IA, WGCNA algorithm is used to

collapse co-abundant metabolites into different clusters, and thus metabolites within a

.CC-BY-NC-ND 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 30, 2019. . https://doi.org/10.1101/678813doi: bioRxiv preprint

Page 9: M2IA: a Web Server for Microbiome and Metabolome ... · associated with obesity from both metabolome and microbiome. Conclusions: M2IA is a public available web server ... correlations

cluster are highly correlated. One attractive feature is that both identified and unknown

metabolite features can be considered. KEGG microbiome functional modules are

annotated using Tax4fun2[20, 21] for 16s RNA sequencing data or HUMAnN pipeline[22]

for shotgun sequencing data, respectively. Finally, the correlations between metabolite

clusters and microbial functional modules with phenotype information are examined

using Spearman correlation analysis and further visualized using heat map and network.

Metabolic function analysis. The final step aims to analyze or predict the biological

function profiles using metabolome and microbiome data, respectively. The metabolic

pathway activity is analyzed using PAPi package in R software [23]. In parallel, function

profiles of microbial communities are predicted using Tax4Fun tool for 16s rRNA data

and HUMAnN for shotgun sequencing data. Linear discriminant analysis is further used

to identify differential KO identifiers from microbiome and metabolome (Figure 4A-C).

Their relationships are further investigated using a microbe-metabolite interaction

network by connecting a specific metabolite, its reactions and related enzymes, and the

genomes of organism (both host and microbial)(Figure 4D-E).

Automated html report

Once users upload the data and optimize parameters of data analysis, M2IA provides a

simplified user experience through a one-click button to submit a job. It may take a few

minutes for M2IA to perform data analysis and generate interpretable results. The exact

processing time depends on the sample size, number of variables, and number of pair-

wise group comparisons. Once the analysis is completed, M2IA will summarize key

results and produce an html report for users. In addition, a zip-compressed package

containing all the supporting data files (tables and figures) can be downloaded for future

use.

Application example

To validate the performance of M2IA, we analyzed the fecal metabolome and

microbiome of 786 individuals from TwinsUK cohort[24]. 16S rRNA was extracted from

fecal samples, PCR amplified, barcoded per sample and sequenced with the Illumina

.CC-BY-NC-ND 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 30, 2019. . https://doi.org/10.1101/678813doi: bioRxiv preprint

Page 10: M2IA: a Web Server for Microbiome and Metabolome ... · associated with obesity from both metabolome and microbiome. Conclusions: M2IA is a public available web server ... correlations

MiSeq platform, as previously described[25]. A total of 1,116 different metabolites were

measured using UPLC-MS/MS platforms. First, the Procrustes Analysis revealed

moderate similarities between metabolome and microbiome with statistical significance

(Figure 2A, r=0.31, p<0.05). Since most of individuals at the metabolic level surrounded

those at the microbiome level, the variations of metabolic profiles among individuals

were larger in metabolome. The O2PLS analysis indicated microbial and metabolic data

matrixes were highly correlated with correlation coefficient 0.76 (p<0.05, Figure 2B),

and those variables that contributed to their similarities could be identified according to

their weights. Then, Spearman correlation analysis was performed to explore specific

microbe-metabolite correlations. The circos plot showed seven different classes of

metabolites (i.e., nucleotides, carbohydrates, peptides, cofactors and vitamins, amino

acids, xenobiotics, and lipids) were correlated with microbes mainly belonging to

Firmicutes, Actinobacteria, Bacteroidetes, and Protebacteria (Figure 2C). Among which,

four microbes at genus level belonging to the Firmicutes had the most significant

correlations with metabolites, and their specific correlations were further visualized in a

network (Figure 2D).

According to BMI, participants were divided into obese (OB, BMI >28) and healthy

control group (HC, BMI <24). The multivariate OPLS-DA model based on the integrated

data matrix of metabolome and microbiome indicated the obvious difference between OB

and HC group (Figure 3A). Differential metabolites and microbes contributing to their

separations were identified using both VIP values (VIP>1) and correlation coefficients

(p<0.05), as indicated by the V-plot (Figure 3B). Alternative multivariate models e.g., RF

model, could be further used to validate the relative contributions of microbes and

metabolites to the discrimination between two groups (Figure 3C). For example, arginine

was identified as a significant marker from both OPLS-DA and RF model. Then, we

performed functional network analysis to further interpret the biological significance of

their correlations. A total of 39 and 16 significant biological functions (KOs) were

identified in microbiome and metabolome between OB and HC group, respectively (p <

0.05, Figure 4A,B). Two important pathways (i.e., the benzoate degradation and

phosphotransferase system) were overlapped (Figure 4C), among which, the benzoate

.CC-BY-NC-ND 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 30, 2019. . https://doi.org/10.1101/678813doi: bioRxiv preprint

Page 11: M2IA: a Web Server for Microbiome and Metabolome ... · associated with obesity from both metabolome and microbiome. Conclusions: M2IA is a public available web server ... correlations

degradation was the most significant metabolic pathway in both microbial and metabolic

function analysis. Then, each differential metabolite and relevant microbes that may

participate in its metabolic reactions were further identified through searching the KEGG

database. For example, 3,4-dihydroxybezoate (also called protocatechuic acid) in the

benzoate degradation was a major metabolite of antioxidant polyphenols formed by gut

microbiota, and might have beneficial effects against inflammation and insulin resistance

in obesity[26]. Our results indicated that 3,4-dihydroxybezoate was significantly reduced

in obesity, and eight different gut microbes might participate in its metabolism (Figure

4D). Similarly, a total of 28 different microbes identified in this study were involved in

the glycerol metabolism of phosphotransferase system (Figure 4E).

Discussion

We have implemented M2IA pipeline as a user-friendly web server that can provide

microbiome and metabolome data integration analysis to understand the important roles

of microbial metabolism in diverse disease contexts. This is a bioinformatics workflow

that integrates a wide range of univariate and multivariate methods, including PA/CIA-

based data matrix similarity analysis, univariate-based correlation analysis, multivariate

regression-based analysis, and knowledge-based function analysis. The advantage of

applying these methods simultaneously is to understand the inherent characteristics of the

high-dimensional omics data in different ways, and obtain reliable results from multiple

analyses. M2IA also implements KEGG database in order to link orthologous gene

groups to metabolic reactions and annotated compounds. So that researchers will have

better understanding of their origins of metabolites, from host or microbial, and what

microbes participate in specific metabolic reactions. To make it more convenient, M2IA

automatically produces a comprehensive report that summarizes and interprets the key

results.

Users can apply M2IA to their own microbiome and metabolome data. In terms of

analytical platforms, M2IA accepts the microbiome data from 16S rRNA gene sequencing

or shotgun metagenomic sequencing technologies, and metabolomics data from mass

spectrometry or NMR analytical platforms. For sample types, many human studies on gut

.CC-BY-NC-ND 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 30, 2019. . https://doi.org/10.1101/678813doi: bioRxiv preprint

Page 12: M2IA: a Web Server for Microbiome and Metabolome ... · associated with obesity from both metabolome and microbiome. Conclusions: M2IA is a public available web server ... correlations

microbiota collect fecal samples, due to its non-invasive characteristics. The fecal

metabolome provides a functional readout of the gut microbiome, and the integrative

analysis can provide better understanding of correlations between gut microbiome and

metabolome. We recommend applying the same fecal samples of human subjects or

fecal/colon contents of animals for analysis. However, samples types can be from any site,

i.e., saliva, buccal mucosa, and colon tissue samples. Simultaneously, metabolomics

studies can analyze feces, plasma/serum, urine, saliva, exhaled breaths, cerebrospinal

fluid, and tissues of target organs. Thus, sample types may differ, but they need to require

originating from a same subject for both microbiome and metabolome analysis.

Finally, the limitations and future directions of this study disserve to be mentioned. M2IA

accepts annotated microbioa and metabolites as input data matrix. However, the raw data

preprocessing steps are not included in this pipeline, e.g., quality control and microbial

annotation for microbiome data, and peak detection and metabolite annotation for

metabolomics data. Currently, many sophisticated software tools or pipelines have been

developed for original data preprocessing, such as Qimme2[27, 28] for 16s RNA

sequencing, and XCMS for MS-based metabolic profiling[29]. M2IA focuses on providing

a standardized pipeline and a comprehensive workflow for the integrative analysis of

metabolome and microbiome data, not only exploring the statistically significant

correlations but also investigating the biological significance of their interaction network.

Moreover, KEGG database based functional network analysis can help to explain their

biological correlations and develop mechanistic hypothesis potentially applicable to the

development of nutritional and personalized therapies. In the future, more advanced

integration methods will be introduced in M2IA and more independent knowledge

databases (e.g., enzyme database and drug metabolism database) will be integrated within

M2IA for deeper biological interpretation.

Conclusions

In summary, M2IA is an effective and efficient computational tool for experimental

biologists to comprehensively analyze and interpret the important interactions between

microbiome and metabolome in the big data era.

.CC-BY-NC-ND 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 30, 2019. . https://doi.org/10.1101/678813doi: bioRxiv preprint

Page 13: M2IA: a Web Server for Microbiome and Metabolome ... · associated with obesity from both metabolome and microbiome. Conclusions: M2IA is a public available web server ... correlations

Authors’ contributions

YN contributed to develop the methodology, algorithms, and write the manuscript. YG

contributed to the software development and implementation. YD provided valuable

testing datasets during method development of our pipeline. PW and CS contributed to

TwinsUK data preparation and processing for pipeline validation. HC and FJ provided

valuable suggestions on software development and manuscript modification. FF

supported this project and contributed to the review and modification of the manuscript.

Availability of data and materials

Data used in this study can be accessed at http://m2ia.met-bioinformatics.cn.

Funding

This work was fundedby National Natural Science Foundation of China (81900510), the

National Key Research and Development Program of China (No. 2016YFC1305301),

and the startup funding from the Children’s Hospital, Zhejiang University School of

Medicine.

Acknowledgements

Not applicable

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Competing interests

The authors declare that they have no competing interests.

Reference

.CC-BY-NC-ND 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 30, 2019. . https://doi.org/10.1101/678813doi: bioRxiv preprint

Page 14: M2IA: a Web Server for Microbiome and Metabolome ... · associated with obesity from both metabolome and microbiome. Conclusions: M2IA is a public available web server ... correlations

1. Cho I, BlaserMJ. APPLICATIONSOFNEXT-GENERATION SEQUENCINGThehuman microbiome: at the interface of health and disease. Nat Rev Genet2012,13(4):260-270.

2. ShafferM,ArmstrongAJS,PhelanVV,ReisdorphN,LozuponeCA.Microbiomeandmetabolome data integration provides insight into health and disease.TranslRes2017,189:51-64.

3. Wang WL, Xu SY, Ren ZG, Tao L, Jiang JW, Zheng SS. Application ofmetagenomics in the human gut microbiome. World J Gastroentero 2015,21(3):803-814.

4. Wahlstrom A, Sayin SI, Marschall HU, Backhed F. Intestinal CrosstalkbetweenBileAcidsandMicrobiotaandIts ImpactonHostMetabolism.CellMetab2016,24(1):41-50.

5. FranzosaEA,Sirota-MadiA,Avila-PachecoJ,FornelosN,HaiserH,ReinkerS,Vatanen T, Hall AB, Mallick H, Mclver LJ, Sauk JS,Wilson RG, Stevens BW,ScottJM,PierceK,DeikAA,BullockK,ImhannF,PorterJA,ZhernakovaA,FuJY,WeersmaRK,WijmengaC,ClishCB,VlamakisH,HuttenhowerC,XavierRJ.Gut microbiome structure and metabolic activity in inflammatory boweldisease.NatMicrobiol2019,4(2):293-305.

6. Louis P, Hold GL, Flint HJ. The gut microbiota, bacterial metabolites andcolorectalcancer.NatRevMicrobiol2014,12(10):661-672.

7. GuYY,WangXK,LiJH,ZhangYF,ZhongHZ,LiuRX,ZhangDY,FengQ,XieXY,HongJ,RenHH,LiuW,MaJ,SuQ,ZhangHM,YangJL,WangXL,ZhaoXJ,GuWQ,BiYF,PengYD,XuXQ,XiaHH,LiF,XuX,YangHM,XuGW,MadsenL,KristiansenK,NingG,WangWQ.Analysesofgutmicrobiotaandplasmabileacids enable stratification of patients for antidiabetic treatment. NatCommun2017,8.

8. CaussyC,HsuC,LoMT,LiuA,BettencourtR,AjmeraVH,BassirianS,HookerJ, Sy E, Richards L, Schork N, Schnabl B, Brenner DA, Sirlin CB, Chen CH,Loomba R, Consortium GNT. Link between gut-microbiome derivedmetabolite and shared gene-effects with hepatic steatosis and fibrosis inNAFLD.Hepatology2018,68(3):918-932.

9. Vernocchi P, Del Chierico F, Putignani L. Gut Microbiota Profiling:Metabolomics Based Approach to Unravel Compounds Affecting HumanHealth.FrontMicrobiol2016,7:1144.

10. DhariwalA,ChongJ,HabibS,KingIL,AgellonLB,XiaJ.MicrobiomeAnalyst:aweb-based tool for comprehensive statistical, visual and meta-analysis ofmicrobiomedata.NucleicAcidsRes2017,45(W1):W180-W188.

11. Pedersen HK, Forslund SK, Gudmundsdottir V, Petersen AO, Hildebrand F,HyotylainenT,NielsenT,HansenT,BorkP,EhrlichSD,BrunakS,OresicM,Pedersen O, Nielsen HB. A computational framework to integrate high-throughput '-omics' datasets for the identification of potential mechanisticlinks.NatProtoc2018,13(12):2781-2800.

12. Chong J, Xia JG. Computational Approaches for Integrative Analysis of theMetabolomeandMicrobiome.Metabolites2017,7(4).

.CC-BY-NC-ND 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 30, 2019. . https://doi.org/10.1101/678813doi: bioRxiv preprint

Page 15: M2IA: a Web Server for Microbiome and Metabolome ... · associated with obesity from both metabolome and microbiome. Conclusions: M2IA is a public available web server ... correlations

13. WeiR,Wang J, SuM, JiaE, ChenS, ChenT,NiY.MissingValue ImputationApproach for Mass Spectrometry-based Metabolomics Data. Scientificreports2018,8(1):663.

14. TsilimigrasMC, Fodor AA. Compositional data analysis of themicrobiome:fundamentals,tools,andchallenges.AnnEpidemiol2016,26(5):330-335.

15. McHardy IH, Goudarzi M, Tong M, Ruegger PM, Schwager E, Weger JR,Graeber TG, Sonnenburg JL, Horvath S, Huttenhower C, McGovern DP,FornaceAJ, Jr.,Borneman J,Braun J. Integrativeanalysisof themicrobiomeandmetabolomeof the human intestinalmucosal surface reveals exquisiteinter-relationships.Microbiome2013,1(1):17.

16. YouYJ,LiangDD,WeiRM,LiMC,LiYT,WangJY,WangXY,ZhengXJ, JiaW,Chen TL. Evaluation of metabolite-microbe correlation detection methods.AnalBiochem2019,567:106-111.

17. Smolinska A, Tedjo DI, Blanchet L, Bodelier A, Pierik MJ, Masclee AAM,Dallinga J, Savelkoul PHM, Jonkers D, Penders J, van Schooten FJ. VolatilemetabolitesinbreathstronglycorrelatewithgutmicrobiomeinCDpatients.AnalChimActa2018,1025:1-11.

18. KosticAD,GeversD,SiljanderH,VatanenT,HyotylainenT,HamalainenAM,PeetA,TillmannV,PohoP,MattilaI,LahdesmakiH,FranzosaEA,VaaralaO,de Goffau M, Harmsen H, Ilonen J, Virtanen SM, Clish CB, Oresic M,Huttenhower C, KnipM, Group DS, Xavier RJ. The dynamics of the humaninfant gut microbiome in development and in progression toward type 1diabetes.CellHostMicrobe2015,17(2):260-273.

19. BylesjoM,ErikssonD,KusanoM,MoritzT,TryggJ.Dataintegrationinplantbiology: the O2PLS method for combined modeling of transcript andmetabolitedata.PlantJ2007,52(6):1181-1191.

20. Tax4Fun2.https://sourceforgenet/projects/tax4fun2/.21. Asshauer KP, Wemheuer B, Daniel R, Meinicke P. Tax4Fun: predicting

functional profiles frommetagenomic16S rRNAdata.Bioinformatics 2015,31(17):2882-2884.

22. AbubuckerS,SegataN,Goll J,SchubertAM, Izard J,CantarelBL,Rodriguez-MuellerB,ZuckerJ,ThiagarajanM,HenrissatB,WhiteO,KelleyST,MetheB,Schloss PD, GeversD,MitrevaM,Huttenhower C.Metabolic reconstructionfor metagenomic data and its application to the humanmicrobiome. PLoSComputBiol2012,8(6):e1002358.

23. KEGGdatabase.https://www.kegg.jp.24. ZiererJ,JacksonMA,KastenmüllerG,ManginoM,LongT,TelentiA,Mohney

RP, Small KS, Bell JT, Steves CJ, ValdesAM, Spector TD,Menni C. The fecalmetabolomeasafunctionalreadoutofthegutmicrobiome.2018,-50(-6):-795.

25. Goodrich JK, Davenport ER, Beaumont M, Jackson MA, Knight R, Ober C,Spector TD, Bell JT, Clark AG, Ley RE. Genetic Determinants of the GutMicrobiomeinUKTwins.CellHostMicrobe2016,19(5):731-743.

26. OrmazabalP,ScazzocchioB,VariR,SantangeloC,D'ArchivioM,SilecchiaG,IacovelliA,GiovanniniC,MasellaR.Effectofprotocatechuicacidon insulinresponsiveness and inflammation in visceral adipose tissue from obese

.CC-BY-NC-ND 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 30, 2019. . https://doi.org/10.1101/678813doi: bioRxiv preprint

Page 16: M2IA: a Web Server for Microbiome and Metabolome ... · associated with obesity from both metabolome and microbiome. Conclusions: M2IA is a public available web server ... correlations

individuals:possiblerole forPTP1B. Int JObes(Lond)2018,42(12):2012-2021.

27. Qimme2.https://qiime2.org.28. CaporasoJG,KuczynskiJ,StombaughJ,BittingerK,BushmanFD,CostelloEK,

FiererN,PenaAG,Goodrich JK,GordonJI,HuttleyGA,KelleyST,KnightsD,KoenigJE,LeyRE,LozuponeCA,McDonaldD,MueggeBD,PirrungM,ReederJ, Sevinsky JR, Turnbaugh PJ, Walters WA, Widmann J, Yatsunenko T,Zaneveld J, KnightR.QIIME allows analysis of high-throughput communitysequencingdata.NatMethods2010,7(5):335-336.

29. ForsbergEM,HuanT,RinehartD,BentonHP,WarthB,HilmersB,SiuzdakG.Data processing, multi-omic pathway mapping, and metabolite activityanalysisusingXCMSOnline.NatProtoc2018,13(4):633-651.

Legends

Figure 1. The flowchart of M2IA.

Figure 2. Illustration of similarity analysis and spearman correlation analysis results. (A)

PA. The length of lines connecting two points indicates the agreement of samples

between two datasets. (B) O2PLS model. (C) Circos plot of spearman correlations

between microbes and metabolites. (D) Network of spearman correlations between

differential metabolites and microbes belonging to the Firmicutes.

Figure 3. (A) Score plot of an OPLS-DA model between OB and HC group. (B) Variable

importance plot (V-plot) of an OPLS-DA model between OB and HC group. (C) Feature

selection of Random Forest modeling.

Figure 4. (A-B) Bar plot of differential KO functions from metabolome and microbiome

data. (C) Venn diagram of differential KOs. (D-E) Interaction network of each metabolite

(3,4-dihydroxybenzoate and glycerol) with microbes that may participate its metabolism.

.CC-BY-NC-ND 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 30, 2019. . https://doi.org/10.1101/678813doi: bioRxiv preprint

Page 17: M2IA: a Web Server for Microbiome and Metabolome ... · associated with obesity from both metabolome and microbiome. Conclusions: M2IA is a public available web server ... correlations

Figure 1

.CC-BY-NC-ND 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 30, 2019. . https://doi.org/10.1101/678813doi: bioRxiv preprint

Page 18: M2IA: a Web Server for Microbiome and Metabolome ... · associated with obesity from both metabolome and microbiome. Conclusions: M2IA is a public available web server ... correlations

Figure 2

-20

-10

0

10

20

-5.0 -2.5 0.0 2.5Genus Joint PC1

Met

abol

ites J

oint

PC1

L

H

R= 0.76 ,P= 0e+00

-0.05

0.00

0.05

0.10

-0.10 -0.05 0.00 0.05 0.10PC1

PC2 Metabolites

Genus

R= 0.31 ,P= 1e-04A

B

C

D

.CC-BY-NC-ND 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 30, 2019. . https://doi.org/10.1101/678813doi: bioRxiv preprint

Page 19: M2IA: a Web Server for Microbiome and Metabolome ... · associated with obesity from both metabolome and microbiome. Conclusions: M2IA is a public available web server ... correlations

Figure 3

-20

-10

0

10

20

-5 0 5P1(1.51%)

O1(9.11%)

L H

P=0.

1

P=0.

05

P=0.

1

P=0.

05

+

+

+

+

+

++++

+

++

++

+

++

++

+

+

+

+

+

+

+++

+

+

+

+

+

+

+

+

++

++

+

+

+

+

+

++

+++

+

+

+

++

+

++

+

+

+++

+++

+

+

+

+

+

+

+

+

+

+

+

+

++

+

+

+

++

+

+

+

+

+

++

+

+

+

++

+

+

+

++

+ ++

+

+

+

+++

+

+

+

+

+

++

+

+++

+

++

++

+

++

+

++

+ +

+

+

+

+

+

+

++

+

++

+

+++

+

++

+

+

+

+

++

+

+

++++

+

+

+

++

+++

+

+

++

+

+

+

+

++

+

+++

+

++

++

+++

+

+++

++

+

++

+

+

+

+++

+

+

+

++

+

++

++ +

++

+

+

+

+

+

++

+

+

+

++

++

+

+

++

+

+

+++

+ +

+ +

+++

++

++

+

+

+

+

++

++

+

+++

+

+

+

+

+

++

++

+

++

+

+

+

++

++

+ ++

+

+

+

+

+

++

++

+

+

+

+

+ ++

+ ++++

+

+

+

+ +

+

+

+

+++

+++

+

+

++

+

+

+++

++

+

+

++

+++

+ +

+

+

+

+

+

++

++

+

+ ++

+

+

++

+

+

+

+

+

+

+

++

+

+

++

++++

+

+ ++

++

+

+

++

+

+ ++

+

++

+

+++

+

+

+

+

+

+

++

+

+

++

++

++

+

++

+

+++

+

+++

++

+

++

+

+

+

+

++

+++

++

+

+

+

+ +

+

+

+

+

+

+

+

++

+

++

+

++

+++

+

++

++

+++

+

++

+

+

++

++

++

++

+

++

+

+

+

+

+

+

+ +++ ++

+

++

+

+

+

+

+

++

+

+

+

+

+

++

+

++

+

+

+++

+

+

+

+

++

+

+

+

++

+

+

++

+

+

++

++++

++

++

++

++

+

+

+

+

+

++

++

+

+

+

+ ++

++

++ +

++

+

++

++

+++

++++

++

+

+

++

+

++

+

++

+

+++

+++

+

+

+

++

+

+

++

+

+

+

+

+

+

+

+

+

+

++

+

+

+

+

+

+

+

+

+

+

+++

+

++

+++

+

+ +

+

++

+

+

+

+

+

+

+

++

++

+++

+

++

+

+

+

+

+++

+ ++

++

+

+

+

++

+

+

+

++

+

++

++

+

+

+

+

++

+++

+

+

+

+

++

+

+

+

+

+ +++

+

+

+

++

+

+++

+

++

+

+ +

+

++

+

++

+

+

+

++

++

+

+

++++

+

+

+

+

+

+

+

+

+++

+

+

+

+

+ ++

+ +

+

+

+

+

+

++

+

+

+

+

+

+

++

+

+

++

+

+++

+++

+

++

+++

+

++

++

++

+

+

+

+

++

+

+

+

+

+

+

+

+

++

+

+

+

+

+ ++

+

+

+

++

+

+

+++

+++

+++

+

+

+

++++++

+

+++

++

+

+

+

+

+

+++

+

++

+

+

++

+++ ++

+

++

+

+ +

++

+

+++

+

++

++

+

+

+

++

+

+

+

+

++

+

+

++

++

++

+

+

+

+

+

+

+

++

++

++

+

+

++

++

++

+

+

+

+

+

+

+

++

+

+

+

+

+

+

++

+

+

+

+

+

+

+

+ +

+

Acidaminococcus

Actinomyces

Adlercreutzia

Akkermansia

Anaerofustis

Anaerostipes AnaerotruncusAtopobiumBacteroides

Bifidobacterium

BilophilaBlautia

Bulleidia

Butyricimonas

Campylobacter

Catenibacterium

Christensenella

Clostridium

Collinsella

Coprobacillus

Coprococcus

Dehalobacterium

Desulfovibrio

Dialister

Dorea

EggerthellaEnterococcus

Faecalibacterium

Fusobacterium

Haemophilus

Holdemania

Lachnobacterium

Lachnospira

Lactobacillus

Lactococcus

Megasphaera

MogibacteriumOdoribacter

Oscillospira

Others

Oxalobacter

Parabacteroides

Paraprevotella

Parvimonas

Phascolarctobacterium

Porphyromonas

Prevotella

RalstoniaRoseburia

Ruminococcus

Serratia

Slackia

Streptococcus

Sutterella

Turicibacter

Veillonella

WAL_1855D

[Eubacterium]

[Prevotella]

[Ruminococcus]

cc_115rc4-4

(R)-salsolinol

1-methylguanidine 1-methylhistamine

1-methylhistidine

1-methylimidazoleacetate

2,3-dimethylsuccinate

2-(4-hydroxyphenyl)propionate

2-aminoadipate

2-aminobutyrate

2-aminophenol

2-hydroxy-3-methylvalerate

2-hydroxybutyrate/2-hydroxyisobutyrate

2-methylserine

3-(3-hydroxyphenyl)propionate

3-(4-hydroxyphenyl)lactate (HPLA)

3-(4-hydroxyphenyl)propionate

3-hydroxyphenylacetate

3-methyl-2-oxobutyrate

3-methyl-2-oxovalerate

3-methylglutaconate

3-methylhistidine

3-phenylpropionate (hydrocinnamate)

3-sulfo-L-alanine

4-acetamidobutanoate

4-guanidinobutanoate

4-hydroxycinnamate

4-hydroxyglutamate

4-hydroxyphenylacetate

4-hydroxyphenylpyruvate

4-imidazoleacetate

4-methyl-2-oxopentanoate

4-methylthio-2-oxobutanoate

5-aminovalerate

5-hydroxyindoleacetate

5-hydroxylysine

5-oxoproline

6-oxopiperidine-2-carboxylic acid

acisoga

agmatine

alanine

allo-threonine

alpha-hydroxyisocaproate

alpha-hydroxyisovalerate

anthranilate

arginine

argininosuccinate

asparagine

aspartatebeta-hydroxyisovalerate

betaine

cadaverine

carboxyethyl-GABA

cis-4-hydroxycyclohexylacetic acid

cis-urocanate

citrulline

creatine

creatinine

cystathionine

cysteinecysteine s-sulfate

cysteine sulfinic acid

cysteinylglycine

cystine

dihydoxyphenylalanine (L-DOPA)

dihydrocaffeate

dimethylarginine (ADMA + SDMA)

dimethylglycine

ethylmalonate

formiminoglutamate

gamma-aminobutyrate (GABA)

gentisate

glutamate

glutamate, gamma-methyl ester

glutamine

glutarate (pentanedioate)

glycine

guanidinoacetate

guanidinosuccinate

histamine

histidine

homocitrulline

hydantoin-5-propionic acid

imidazole lactateimidazole propionate

indole

indole-3-carboxylic acid

indoleacetate

indolelactate

indolepropionate

isobutyrylglycine (C4)isoleucine

isovalerate (C5)

isovalerylglycine

kynurenate

leucine

lysine

methionine

methionine sulfone

methionine sulfoxide

methylsuccinate

N-acetyl-1-methylhistidine*

N-acetyl-3-methylhistidine*N-acetyl-aspartyl-glutamate (NAAG)

N-acetyl-cadaverine

N-acetylalanine

N-acetylarginine

N-acetylasparagine

N-acetylaspartate (NAA)

N-acetylcitrulline

N-acetylcysteineN-acetylglutamate

N-acetylglutamine

N-acetylglycine

N-acetylhistamineN-acetylhistidine

N-acetylisoleucine

N-acetylkynurenine (2)

N-acetylleucine

N-acetylmethionine

N-acetylmethionine sulfoxide

N-acetylphenylalanine

N-acetylproline

N-acetylputrescine

N-acetylserine

N-acetylserotonin

N-acetyltaurine

N-acetylthreonine

N-acetyltryptophan

N-acetyltyrosineN-acetylvaline

N-alpha-acetylornithine

N-carbamoylalanine

N-carbamoylsarcosine

N-delta-acetylornithine

N-formylmethionineN-formylphenylalanine

N-methylalanine

N-methylglutamate

N-methylhydantoin

N-methylphenylalanine

N-methylproline

N-methyltaurine

N-methyltryptophan

N-monomethylarginine

N-propionylalanine

N1,N12-diacetylspermine

N2,N5-diacetylornithine

N2,N6-diacetyllysine

N2-acetyllysine

N6,N6,N6-trimethyllysine

N6-acetyllysine

N6-carboxyethyllysine

N6-formyllysine

O-acetylhomoserine

o-Tyrosineornithine

p-cresolp-cresol sulfate

phenethylamine

phenylacetate

phenylacetylglutamate

phenylacetylglutamine

phenylalanine

phenyllactate (PLA)

phenylpropionylglycine

phenylpyruvate

picolinate

pipecolate

proline

putrescine

pyroglutamine*

S-1-pyrroline-5-carboxylate

S-methylcysteine

S-methylmethioninesaccharopine

sarcosine

serine

serotoninskatol

spermidine

taurine

thioproline

threoninehydroxyproline

trans-urocanate

tryptamine

tryptophan tyramine

tyrosine

valerylphenylalanine

valine

xanthurenate

2-deoxyribose

arabinose

arabitol/xylitol

arabonate/xylonate

erythronate*

fructose

fucose

galactonateglucose

glucuronate

glycerate

isomaltose

lactatemaltose

maltotriose

mannitol/sorbitol

mannose

N-acetyl-beta-glucosaminylamine

N-acetylglucosamine/N-acetylgalactosamine

N-acetylglucosaminylasparagine

N-acetylmuramateN-acetylneuraminate

N6-carboxymethyllysine

pyruvate

ribitol

ribonate (ribonolactone)ribose

ribulose/xylulose

sedoheptulose

sucrose

xylose

1-methylnicotinamide

5-(2-Hydroxyethyl)-4-methylthiazole

6-hydroxynicotinate

alpha-tocopherolalpha-tocopherol acetate

alpha-tocotrienol

bilirubin

biliverdin

biocytin

biotin

D-urobilin

delta-tocopherol

FAD

FMN

gamma-CEHC

gamma-tocopherol/beta-tocopherol

gamma-tocotrienol

I-urobilinogen

L-urobilin

maleamate

nicotinamide

nicotinamide riboside

nicotinate

nicotinate ribonucleosidepantothenate (Vitamin B5)

pterin

pyridoxal

pyridoxamine

pyridoxate

pyridoxine (Vitamin B6)

quinolinate

retinol (Vitamin A)

riboflavin (Vitamin B2)

thiamin (Vitamin B1)

threonate

trigonelline (N'-methylnicotinate)

2-methylcitrate/homocitratealpha-ketoglutarate

citrate

fumaratemalate

phosphate

succinate

succinylcarnitine (C4)

tricarballylate

1,11-undecanedicarboxylate

1,2-dilinolenoyl-digalactosylglycerol (18:3/18:3)

1,2-dilinolenoyl-galactosylglycerol (18:3/18:3)*

1,2-dilinoleoyl-digalactosylglycerol (18:2/18:2)*1,2-dilinoleoyl-galactosylglycerol (18:2/18:2)*

1,2-dilinoleoyl-GPC (18:2/18:2)

1,2-dipalmitoyl-GPC (16:0/16:0)

1,2-dipalmitoyl-GPG (16:0/16:0)

1-(1-enyl-oleoyl)-GPE (P-18:1)*

1-(1-enyl-palmitoyl)-GPE (P-16:0)*

1-(1-enyl-stearoyl)-GPE (P-18:0)*

1-linolenoylglycerol (18:3)

1-linoleoyl-2-linolenoyl-digalactosylglycerol (18;2/18:3)*1-linoleoyl-2-linolenoyl-galactosylglycerol (18:2/18:3)*

1-linoleoyl-2-linolenoyl-GPC (18:2/18:3)* 1-linoleoyl-GPC (18:2)

1-linoleoyl-GPE (18:2)*

1-linoleoylglycerol (18:2)

1-margaroylglycerol (17:0)

1-myristoylglycerol (14:0)

1-oleoyl-2-linoleoyl-glycerol (18:1/18:2)

1-oleoyl-2-linoleoyl-GPC (18:1/18:2)*

1-oleoyl-3-linoleoyl-glycerol (18:1/18:2)

1-oleoyl-GPC (18:1)

1-oleoyl-GPE (18:1)

1-oleoyl-GPG (18:1)*

1-oleoylglycerol (18:1)

1-palmitoyl-2-arachidonoyl-GPC (16:0/20:4)

1-palmitoyl-2-linolenoyl-digalactosylglycerol (16:0/18:3)

1-palmitoyl-2-linoleoyl-digalactosylglycerol (16:0/18:2)*

1-palmitoyl-2-linoleoyl-galactosylglycerol (16:0/18:2)*

1-palmitoyl-2-linoleoyl-glycerol (16:0/18:2)*

1-palmitoyl-2-linoleoyl-GPC (16:0/18:2)

1-palmitoyl-2-linoleoyl-GPE (16:0/18:2)

1-palmitoyl-2-linoleoyl-GPI (16:0/18:2)

1-palmitoyl-2-oleoyl-GPC (16:0/18:1)

1-palmitoyl-2-oleoyl-GPE (16:0/18:1)

1-palmitoyl-3-linoleoyl-glycerol (16:0/18:2)*

1-palmitoyl-GPC (16:0)

1-palmitoyl-GPE (16:0)

1-palmitoyl-GPG (16:0)*

1-palmitoyl-GPI* (16:0)*

1-palmitoylglycerol (16:0)

1-pentadecanoylglycerol (15:0)

1-stearoyl-GPC (18:0)

1-stearoyl-GPE (18:0)

1-stearoyl-GPG (18:0)1-stearoyl-GPI (18:0)

1-stearoyl-GPS (18:0)*10-heptadecenoate (17:1n7)

10-hydroxystearate

10-nonadecenoate (19:1n9)

11-ketoetiocholanolone sulfate

12,13-DiHOME

12-dehydrocholate12-hydroxystearate

13-HODE + 9-HODE

13-HOTrE

13-methylmyristate

methylpalmitate (15 or 2)

17-methylstearate

2-aminoheptanoate

2-hydroxydecanoate

2-hydroxyglutarate

2-hydroxymyristate

2-hydroxypalmitate

2-hydroxystearate

2-linoleoylglycerol (18:2)

2-methylmalonyl carnitine

2-oleoylglycerol (18:1)3-carboxy-4-methyl-5-propyl-2-furanpropanoate (CMPF)

3-carboxyadipate

3-dehydrocholate

3-hydroxy-3-methylglutarate

3-hydroxybutyrate (BHBA)

3-hydroxyhexanoate

3-hydroxylaurate

3-hydroxymyristate

3-hydroxypropanoate

3-hydroxysebacate

3-hydroxystearate

3-ketosphinganine

3-methyladipate

3-methylglutarate/2-methylglutarate

3b-hydroxy-5-cholenoic acid

4-androsten-3alpha,17alpha-diol monosulfate (2)4-androsten-3beta,17beta-diol disulfate (1)

4-androsten-3beta,17beta-diol disulfate (2)

4-androsten-3beta,17beta-diol monosulfate (1)

4-androsten-3beta,17beta-diol monosulfate (2)

4-cholesten-3-one

5-dodecenoate (12:1n7)

5-hydroxyhexanoate5alpha-androstan-3beta,17alpha-diol monosulfate (1)

5alpha-androstan-3beta,17beta-diol monosulfate (1)

5alpha-pregnan-3beta,20alpha-diol monosulfate (1)

6-oxolithocholate7-ketodeoxycholate

7-ketolithocholate

9,10-DiHOME

acetylcarnitine (C2)

adipateadrenate (22:4n6)

arachidate (20:0)

arachidonate (20:4n6)

azelate (nonanedioate; C9)

beta-sitosterol

butyrylglycine (C4)

campesterol

caprate (10:0)

caproate (6:0)

caprylate (8:0)

carnitine

chenodeoxycholate

cholate

cholesterol

choline

phosphocholine

coprostanol

cortolone

dehydroisoandrosterone sulfate (DHEA-S)

dehydrolithocholate

deoxycarnitine

deoxycholate

deoxycholic acid sulfate

dihomolinoleate (20:2n6)

dihomolinolenate (20:3n3 or 3n6)

dimethylmalonic acid

docosadienoate (22:2n6)

docosahexaenoate (DHA; 22:6n3)

docosapentaenoate (DPA; 22:5n3)

dodecanedioate (C12)

eicosapentaenoate (EPA; 20:5n3)eicosenoate (20:1n9 or 1n11)

epiandrosterone sulfate

ergosterolerucate (22:1n9)

glycerol

glycerol 3-phosphate

glycerophosphoethanolamine

glycerophosphoglycerol

glycerophosphoinositol*

glycerophosphorylcholine (GPC)glycochenodeoxycholate

glycocholate

glycocholenate sulfate*

glycodeoxycholate

glycodeoxycholate sulfate

glycolithocholate

glycolithocholate sulfate*

glycosyl-N-palmitoyl-sphingosineglycosyl-N-stearoyl-sphingosine

glycoursodeoxycholate

heptanoate (7:0)

hexadecanedioate (C16)hexanoylglycine (C6)

hyocholate

isocaproate

lactosyl-N-palmitoyl-sphingosine

lanosterol

laurate (12:0)

linoleamide (18:2n6)

linoleate (18:2n6)

linolenate (18:3n3 or 3n6)

linoleoyl ethanolamide

lithocholate

maleate

malonate margarate (17:0)methylmalonate (MMA)

mevalonate mevalonolactone

myo-inositol

myristate (14:0)

myristoleate (14:1n5)

N-acetylsphingosine

N-linoleoylglycine

N-palmitoyl-sphinganine (d18:0/16:0)

N-palmitoyl-sphingosine (d18:1/16:0)

N-palmitoylglycine

nonadecanoate (19:0)

octadecanedioate (C18)

oleate/vaccenate (18:1)

oleoyl ethanolamide

oleoylcarnitine (C18)

palmitate (16:0)

palmitoleate (16:1n7)

palmitoyl dihydrosphingomyelin (d18:0/16:0)*

palmitoyl ethanolamidepalmitoyl sphingomyelin (d18:1/16:0)

palmitoylcarnitine (C16)

pentadecanoate (15:0)

phytosphingosine

pimelate (heptanedioate)

pregn steroid monosulfate*

pregnen-diol disulfate*

pristanatepropionylglycine (C3)

sebacate (decanedioate)

sphinganine

sphingosine

stearate (18:0)

stearidonate (18:4n3)

stearoyl ethanolamide

stearoylcarnitine (C18)

stigmasterol

suberate (octanedioate)

taurochenodeoxycholate

taurocholate

taurocholenate sulfate

taurodeoxycholate

taurolithocholate 3-sulfate

tetradecanedioate (C14)

trimethylamine N-oxide

undecanedioate

ursodeoxycholateursodeoxycholate sulfate (1)

valerate (5:0)

1-methyladenine

2'-deoxyadenosine2'-deoxycytidine

dCMP

2'-deoxyguanosine

2'-deoxyinosine

2'-deoxyuridine

3-aminoisobutyrate

3-ureidopropionate

4-ureidobutyrate5,6-dihydrothymine

5-methyluridine (ribothymidine)

7-methylguanine

8-hydroxyguanine

adenine

adenosine

AMPallantoin

beta-alanine

cytidine

cytidine 2',3'-cyclic monophosphate

3'-CMP

CMP

cytosinedihydroorotate

guanine

guanosine

guanosine-2',3'-cyclic monophosphate

hypoxanthineinosine

methylphosphate

N-acetyl-beta-alanineN-carbamoylaspartate

1-methyladenosine

N6,N6-dimethyladenosine

N6-carbamoylthreonyladenosine

N6-dimethylallyladenine

orotate

pseudouridinethymidinethymine

uracil

urateuridine

xanthine

xanthosine

alanyl-glutamyl-meso-diaminopimelate

alanylleucine

anserine

carnosine

gamma-glutamyl-epsilon-lysine

gamma-glutamylalanine

gamma-glutamylglutamate

gamma-glutamylglycine

gamma-glutamylhistidine

gamma-glutamylisoleucine*

gamma-glutamylleucine

gamma-glutamylmethionine

gamma-glutamylphenylalaninegamma-glutamyltyrosinegamma-glutamylvaline

glutaminylleucine

glycylisoleucine

glycylleucine

glycylvaline

histidylalanine

isoleucylglycine

leucylalanine

leucylglutamine*

leucylglycine

lysylleucine

phenylalanylalanine

phenylalanylglycine

prolylglycine

threonylphenylalanine

tryptophylglycine

valylglutamine

valylglycine

valylleucine

X - 02249

X - 09789

X - 10457

X - 11381

X - 11429

X - 11440

X - 11538

X - 11612

X - 11640

X - 12026

X - 12096

X - 12097

X - 12100

X - 12101

X - 12117

X - 12125

X - 12199

X - 12339

X - 12450

X - 12456

X - 12680X - 12681

X - 12688

X - 12879X - 13007

X - 13507

X - 13696

X - 13835

X - 14056

X - 14095

X - 14141

X - 14196

X - 14237

X - 14254

X - 14263

X - 14264

X - 14302

X - 14314

X - 14337

X - 14364

X - 14383X - 14384

X - 14392

X - 14416

X - 14454

X - 14502

X - 14697X - 14838

X - 14900

X - 14904

X - 15136

X - 15150

X - 15245

X - 15461

X - 15497

X - 15608X - 15666

X - 15806

X - 15843

X - 15853

X - 15854

X - 15856

X - 16071

X - 16343

X - 16391

X - 16654

X - 16946

X - 17009

X - 17145

X - 17162

X - 17299

X - 17303

X - 17348

X - 17349X - 17438

X - 17469

X - 17653

X - 17739

X - 17749

X - 17807

X - 17852

X - 17877

X - 17910

X - 17919

X - 17960

X - 17969

X - 17978

X - 17983

X - 18162

X - 18165

X - 18307

X - 18410

X - 18889

X - 19220

X - 19232 X - 19299

X - 19497

X - 19751

X - 19763

X - 19931

X - 19932

X - 20100X - 20172

X - 20185

X - 20197

X - 20747

X - 21283

X - 21327

X - 21353

X - 21365

X - 21410

X - 21442

X - 21448

X - 21470

X - 21666

X - 21729

X - 21788

X - 21796

X - 21959

X - 22030

X - 22062

X - 22142

X - 22145

X - 22146

X - 22835

X - 22836

X - 23115

X - 23159

X - 23160

X - 23173

X - 23193X - 23196

X - 23277

X - 23295

X - 23308

X - 23369

X - 23438

X - 23451

X - 23456

X - 23469

X - 23470X - 23475

X - 23482

X - 23507

X - 23581

X - 23583

X - 23584

X - 23585

X - 23587

X - 23600

X - 23637 X - 23639

X - 23652

X - 23655

X - 23657

X - 23662

X - 23665

X - 23710X - 23728

X - 23729

X - 23732

X - 23733

X - 23734

X - 23736

X - 23737

X - 23738

X - 23747

X - 23748

X - 23749

X - 23752X - 23753

X - 23765

X - 23767

X - 23776

X - 23780

X - 23782X - 23925

X - 23974

X - 24023

X - 24027

X - 24077

X - 24177

X - 24204X - 24207

X - 24209

X - 24211X - 24212

X - 24213

X - 24215

X - 24216

X - 24220

X - 24230

X - 24231

X - 24234

X - 24238

X - 24240

X - 24243

X - 24246

X - 24260

X - 24278

X - 24408

X - 24410

X - 24412

X - 24413

X - 24425

X - 24431

X - 24449

X - 24455

X - 24456

X - 24465

X - 24474

X - 24512

X - 24529

X - 24546

X - 24550

X - 24558

X - 24608

X - 24658

X - 24678

X - 24682

X - 24683

X - 24686

X - 24693

X - 24697

X - 24699X - 24727X - 24729

X - 24736X - 24738

X - 24739

X - 24743

X - 24753

X - 24763

X - 24766

X - 24767X - 24803X - 24804X - 24806

X - 24809

X - 24810

X - 24811X - 24813

X - 24829

X - 24831X - 24832

X - 24853

X - 24854

X - 24855

1,3,7-trimethylurate

1,3-dimethylurate

1,3-propanediol

1,7-dimethylurate

1-(3-aminopropyl)-2-pyrrolidone

1-methyl-beta-carboline-3-carboxylic acid

1-methylurate

1-methylxanthine

1H-quinolin-2-one

2,3-dihydroxyisovalerate

2,4,6-trihydroxybenzoate

2,4-dihydroxyhydrocinnamate

2,8-quinolinediol 2-acetamidobutanoate

2-isopropylmalate 2-keto-3-deoxy-gluconate

2-oxindole-3-acetate

2-oxo-1-pyrrolidinepropionate

2-piperidinone

2-pyrrolidinone

3,4-dihydroxybenzoate

3,5-dihydroxybenzoic acid3,7-dimethylurate

3-(2-hydroxyphenyl)propionate

3-hydroxybenzoate

3-hydroxypyridine

3-methylurate*3-methylxanthine

4-acetamidophenol

4-hydroxybenzoate

5-acetylamino-6-amino-3-methyluracil

5-acetylamino-6-formylamino-3-methyluracil

7-methylurate7-methylxanthine

acesulfame

apigenin

astaxanthin

benzoate

beta-carotene

beta-guanidinopropanoate

bufotenine

caffeate

caffeine

ciliatine (2-aminoethylphosphonate)

cinnamate

daidzein

diaminopimelate

diethanolamine

digalacturonic acid

diglycerol

dihydroferulic acid

dipicolinate

ectoine

enterodiol

enterolactone

epicatechin

erythrose

ferulate

fucitol

galacturonate

gluconateglutamyl-meso-diaminopimelate

harmane

hesperetin

hesperidin

indolin-2-one

levulinate (4-oxovalerate)

luteolinidin

methyl glucopyranoside (alpha + beta)

methyl indole-3-acetate

N-carbamylglutamate

N-methylpipecolate

N-propionylmethionine

naringenin

nicotianamine

O-sulfo-L-tyrosine

p-hydroxybenzaldehyde

paraxanthine

pheophorbide A

phytanate

piperidine

piperine

pyrraline

quinate

saccharin

salicylate

secoisolariciresinol

sitostanol

solanidine

stachydrinesuccinimide

sucralose

sulfate*

syringic acid

tartarate

theanine

theobromine

theophylline

tribuloside tyrosol

vanillate

0

1

2

3

-0.4 -0.2 0.0 0.2 0.4Corr.Coeffs.

Varia

ble

Impo

rtanc

e on

Pro

ject

ion

(VIP

)

A B

C

.CC-BY-NC-ND 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 30, 2019. . https://doi.org/10.1101/678813doi: bioRxiv preprint

Page 20: M2IA: a Web Server for Microbiome and Metabolome ... · associated with obesity from both metabolome and microbiome. Conclusions: M2IA is a public available web server ... correlations

Figure 4

Benzoate degradation:ko00362

Degradation of aromatic compounds:ko01220

Phosphotransferase system (PTS):ko02060

ABC transporters:ko02010

Streptomycin biosynthesis:ko00521

Various types of N-glycan biosynthesis:ko00513

Glycosphingolipid biosynthesis - ganglio series:ko00604

Oxidative phosphorylation:ko00190

Glycosphingolipid biosynthesis - globo and isoglobo series:ko00603

Carbon metabolism:ko01200

Glycosaminoglycan degradation:ko00531

Lysosome:ko04142

Ribosome:ko03010

Biosynthesis of antibiotics:ko01130

Biosynthesis of secondary metabolites:ko01110

Metabolic pathways:ko01100

0 1 2 3

LDA SCORE(log 10)

L H

Fc epsilon RI signaling pathway:ko04664

Apoptosis:ko04210

Necroptosis:ko04217

Ovarian steroidogenesis:ko04913

Sphingolipid metabolism:ko00600

Caprolactam degradation:ko00930

Steroid hormone biosynthesis:ko00140

Porphyrin and chlorophyll metabolism:ko00860

Indole alkaloid biosynthesis:ko00901

Cholinergic synapse:ko04725

mTOR signaling pathway:ko04150

D-Arginine and D-ornithine metabolism:ko00472

Arginine biosynthesis:ko00220

Aminoacyl-tRNA biosynthesis:ko00970

Thermogenesis:ko04714

Monobactam biosynthesis:ko00261

Toluene degradation:ko00623

Salmonella infection:ko05132

Amoebiasis:ko05146

Chagas disease (American trypanosomiasis):ko05142

Benzoate degradation:ko00362

Tyrosine metabolism:ko00350

Glycerolipid metabolism:ko00561

Biotin metabolism:ko00780

Amyotrophic lateral sclerosis (ALS):ko05014

Clavulanic acid biosynthesis:ko00331

Galactose metabolism:ko00052

Carotenoid biosynthesis:ko00906

Polycyclic aromatic hydrocarbon degradation:ko00624

Aminobenzoate degradation:ko00627

Flavone and flavonol biosynthesis:ko00944

Biosynthesis of secondary metabolites - unclassified:ko00999

Arginine and proline metabolism:ko00330

Phenylpropanoid biosynthesis:ko00940

Phosphotransferase system (PTS):ko02060

Phenylalanine metabolism:ko00360

Ubiquinone and other terpenoid-quinone biosynthesis:ko00130

Starch and sucrose metabolism:ko00500

Biosynthesis of secondary metabolites - other antibiotics:ko00998

0 1 2 3 4LDA SCORE(log 10)

L H

Metabolite 37

Microbiome 14

2

A B C

D E

.CC-BY-NC-ND 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 30, 2019. . https://doi.org/10.1101/678813doi: bioRxiv preprint